AutoGPTQ/auto_gptq/modeling
| File | Last commit message | Last commit date |
| --- | --- | --- |
| `__init__.py` | Merge remote-tracking branches 'laaza/Mistral' and 'laaza/MPT' | 2023-10-22 07:53:59 -04:00 |
| `_base.py` | Revert "fix bug(breaking change) remove (zeors -= 1)" | 2023-09-27 10:37:31 +08:00 |
| `_const.py` | Merge remote-tracking branches 'laaza/Mistral' and 'laaza/MPT' | 2023-10-22 07:53:59 -04:00 |
| `_utils.py` | import exllama QuantLinear instead of exllamav2's | 2023-09-27 11:05:13 +08:00 |
| `auto.py` | Merge remote-tracking branches 'laaza/Mistral' and 'laaza/MPT' | 2023-10-22 07:53:59 -04:00 |
| `baichuan.py` | Rename the class to match reference capitalisation | 2023-06-18 21:01:07 +03:00 |
| `bloom.py` | support dispatch layers to different devices when loading pretrained model before quantization | 2023-04-27 02:24:08 +08:00 |
| `codegen.py` | Add support for CodeGen/2 | 2023-05-08 17:34:00 +03:00 |
| `gpt2.py` | fix device mismatch when directly using model to inference after quantization | 2023-04-28 16:41:46 +08:00 |
| `gpt_bigcode.py` | Add support for GPTBigCode | 2023-05-08 12:28:29 +03:00 |
| `gpt_neox.py` | support dispatch layers to different devices when loading pretrained model before quantization | 2023-04-27 02:24:08 +08:00 |
| `gptj.py` | add GPTJ fused attention module | 2023-05-14 16:17:21 +08:00 |
| `internlm.py` | Add support for InternLM | 2023-07-07 09:25:40 -07:00 |
| `llama.py` | make compatible with older transformers version | 2023-05-15 13:26:18 +08:00 |
| `mistral.py` | Add support for Mistral models. | 2023-10-04 01:07:55 +03:00 |
| `moss.py` | remove non-parameters module from MOSSGPTQForCausalLM.outside_layer_modules | 2023-04-29 10:58:29 +08:00 |
| `mpt.py` | Add initial support for MPT | 2023-05-12 14:46:52 +03:00 |
| `opt.py` | remove override of _resize_attention_mask for llama and opt | 2023-04-28 23:08:42 +08:00 |
| `qwen.py` | Update qwen.py for Qwen-VL | 2023-08-30 16:29:55 +08:00 |
| `rw.py` | support falcon | 2023-05-27 07:53:39 +09:00 |
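Judging by the file names, this directory holds one class per supported architecture (`llama.py`, `opt.py`, `mpt.py`, and so on), with `_base.py` providing the shared base class and `auto.py` providing the entry point that picks the right one for a checkpoint. Below is a minimal sketch of the workflow these files serve, following the usage shown in the AutoGPTQ README around the time of these commits; the model id and output directory are illustrative placeholders, not taken from this listing.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "facebook/opt-125m"  # placeholder: any supported architecture
quantized_model_dir = "opt-125m-4bit-128g"  # placeholder output path

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)

# GPTQ needs a handful of tokenized calibration examples.
examples = [
    tokenizer(
        "auto-gptq is an easy-to-use model quantization library with "
        "user-friendly apis, based on GPTQ algorithm."
    )
]

quantize_config = BaseQuantizeConfig(
    bits=4,          # quantize weights to 4-bit
    group_size=128,  # one set of quantization parameters per 128 weight columns
    desc_act=False,  # skip activation-order reordering for faster inference
)

# from_pretrained selects the model class matching the checkpoint's
# architecture (e.g. the one defined in opt.py) and loads fp16 weights.
model = AutoGPTQForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)

# Run GPTQ layer by layer over the calibration examples, then save.
model.quantize(examples)
model.save_quantized(quantized_model_dir)

# Reload the quantized checkpoint for inference.
model = AutoGPTQForCausalLM.from_quantized(quantized_model_dir, device="cuda:0")
inputs = tokenizer("auto_gptq is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```

The per-architecture classes mostly differ in which module names they declare (commit messages above reference attributes such as `outside_layer_modules`), which is what lets the base class run the same quantization loop over otherwise different model layouts.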