AutoGPTQ

LaaZa 63247a0669 Add support for GPTBigCode		2023-05-08 12:28:29 +03:00
__init__.py	Add support for GPTBigCode	2023-05-08 12:28:29 +03:00
_base.py	Merge pull request #38 from PanQiWei/faster-cuda-no-actorder	2023-05-04 21:47:19 +08:00
_const.py	Add support for GPTBigCode	2023-05-08 12:28:29 +03:00
_utils.py	fix incorrect pack while using cuda, desc_act and grouping	2023-05-07 20:44:47 +08:00
auto.py	Add support for GPTBigCode	2023-05-08 12:28:29 +03:00
bloom.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
gpt2.py	fix device mismatch when directly using model to inference after quantization	2023-04-28 16:41:46 +08:00
gpt_bigcode.py	Add support for GPTBigCode	2023-05-08 12:28:29 +03:00
gpt_neox.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
gptj.py	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
llama.py	remove override of _resize_attention_mask for llama and opt	2023-04-28 23:08:42 +08:00
moss.py	remove non-parameters module from MOSSGPTQForCausalLM.outside_layer_modules	2023-04-29 10:58:29 +08:00
opt.py	remove override of _resize_attention_mask for llama and opt	2023-04-28 23:08:42 +08:00