AutoGPTQ/auto_gptq/modeling
Latest commit: 3f90a22632 "fix bug" by qwopqwop200, 2023-04-28 08:26:58 +09:00
File          Last commit message                                                                              Date
__init__.py   support dispatch layers to different devices when loading pretrained model before quantization   2023-04-27 02:24:08 +08:00
_base.py      fix bug                                                                                          2023-04-28 08:26:58 +09:00
_const.py     support dispatch layers to different devices when loading pretrained model before quantization   2023-04-27 02:24:08 +08:00
_utils.py     big fix                                                                                          2023-04-27 19:33:25 +08:00
auto.py       add support to cpu offloading and multi gpus inference on quantized model                        2023-04-28 00:53:57 +08:00
bloom.py      support dispatch layers to different devices when loading pretrained model before quantization   2023-04-27 02:24:08 +08:00
gpt_neox.py   support dispatch layers to different devices when loading pretrained model before quantization   2023-04-27 02:24:08 +08:00
gptj.py       support dispatch layers to different devices when loading pretrained model before quantization   2023-04-27 02:24:08 +08:00
llama.py      support dispatch layers to different devices when loading pretrained model before quantization   2023-04-27 02:24:08 +08:00
moss.py       support dispatch layers to different devices when loading pretrained model before quantization   2023-04-27 02:24:08 +08:00
opt.py        support dispatch layers to different devices when loading pretrained model before quantization   2023-04-27 02:24:08 +08:00