AutoGPTQ

qwopqwop200 435eebee4b	support conv1d,conv2d	2023-04-28 09:13:00 +09:00
__init__.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
_base.py	fix bug	2023-04-28 08:26:58 +09:00
_const.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
_utils.py	support conv1d,conv2d	2023-04-28 09:13:00 +09:00
auto.py	add support to cpu offloading and multi gpus inference on quantized model	2023-04-28 00:53:57 +08:00
bloom.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
gpt2.py	add gpt2	2023-04-28 09:11:50 +09:00
gpt_neox.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
gptj.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
llama.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
moss.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
opt.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00