simcop2387 / AutoGPTQ
auto_gptq / modeling

Latest commit: 3f90a22632 ("fix bug") by qwopqwop200, 2023-04-28 08:26:58 +09:00
File          Last commit date             Last commit message
__init__.py   2023-04-27 02:24:08 +08:00   support dispatch layers to different devices when loading pretrained model before quantization
_base.py      2023-04-28 08:26:58 +09:00   fix bug
_const.py     2023-04-27 02:24:08 +08:00   support dispatch layers to different devices when loading pretrained model before quantization
_utils.py     2023-04-27 19:33:25 +08:00   big fix
auto.py       2023-04-28 00:53:57 +08:00   add support to cpu offloading and multi gpus inference on quantized model
bloom.py      2023-04-27 02:24:08 +08:00   support dispatch layers to different devices when loading pretrained model before quantization
gpt_neox.py   2023-04-27 02:24:08 +08:00   support dispatch layers to different devices when loading pretrained model before quantization
gptj.py       2023-04-27 02:24:08 +08:00   support dispatch layers to different devices when loading pretrained model before quantization
llama.py      2023-04-27 02:24:08 +08:00   support dispatch layers to different devices when loading pretrained model before quantization
moss.py       2023-04-27 02:24:08 +08:00   support dispatch layers to different devices when loading pretrained model before quantization
opt.py        2023-04-27 02:24:08 +08:00   support dispatch layers to different devices when loading pretrained model before quantization