AutoGPTQ/auto_gptq/modeling
__init__.py: support dispatching layers to different devices when loading a pretrained model before quantization (2023-04-27 02:24:08 +08:00)
_base.py: add CPU offload when doing quantization (2023-04-27 21:25:24 +08:00)
_const.py: support dispatching layers to different devices when loading a pretrained model before quantization (2023-04-27 02:24:08 +08:00)
_utils.py: bug fix (2023-04-27 19:33:25 +08:00)
auto.py: support multi-GPU quantization (2023-04-27 18:48:43 +08:00)
bloom.py: support dispatching layers to different devices when loading a pretrained model before quantization (2023-04-27 02:24:08 +08:00)
gpt_neox.py: support dispatching layers to different devices when loading a pretrained model before quantization (2023-04-27 02:24:08 +08:00)
gptj.py: support dispatching layers to different devices when loading a pretrained model before quantization (2023-04-27 02:24:08 +08:00)
llama.py: support dispatching layers to different devices when loading a pretrained model before quantization (2023-04-27 02:24:08 +08:00)
moss.py: support dispatching layers to different devices when loading a pretrained model before quantization (2023-04-27 02:24:08 +08:00)
opt.py: support dispatching layers to different devices when loading a pretrained model before quantization (2023-04-27 02:24:08 +08:00)
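
The commit messages above point at three related features: dispatching a model's layers across several devices at load time, offloading layers to CPU during quantization, and quantizing with multiple GPUs. A minimal sketch of how these might be exercised through the package's top-level API (the AutoGPTQForCausalLM class exposed alongside auto.py) follows; the model name, memory budget, and calibration sentence are illustrative assumptions, not values taken from this listing.

```python
# Minimal sketch, assuming the AutoGPTQForCausalLM / BaseQuantizeConfig API.
# The model name, max_memory budget, and calibration text are illustrative.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "facebook/opt-125m"   # assumed example model
quantized_model_dir = "opt-125m-4bit"

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)

# One calibration example; a real run would pass many tokenized samples.
examples = [tokenizer("auto_gptq is an easy-to-use model quantization library.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128)

# max_memory caps per-device usage (assumed budget shown) so that layers are
# dispatched across GPUs and, if needed, offloaded to CPU while the pretrained
# model is loaded and then quantized.
model = AutoGPTQForCausalLM.from_pretrained(
    pretrained_model_dir,
    quantize_config,
    max_memory={0: "4GIB", 1: "4GIB", "cpu": "10GIB"},
)

model.quantize(examples)
model.save_quantized(quantized_model_dir)
```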