AutoGPTQ

Author	SHA1	Message	Date
潘其威(William)	bdb08c16fc	Merge branch 'main' into Codegen	2023-05-14 13:10:52 +08:00
PanQiWei	f159aeabb6	refactor .from_quantized api and improve model loading strategy	2023-05-12 18:09:50 +08:00
LaaZa	b8187ff05a	Add support for CodeGen/2	2023-05-08 17:34:00 +03:00
qwopqwop200	d49281bc5d	support faster and model load strict	2023-05-04 09:07:34 +09:00
qwopqwop200	24251d1397	check kwargs	2023-05-02 22:32:54 +09:00
qwopqwop200	ccd87e5800	add Auto model parameter	2023-05-02 22:15:56 +09:00
ZXED	24a371d14a	use the same Optional style as in other params	2023-04-29 09:52:11 +03:00
ZXED	c22770188d	allow user to set trust_remote_code flag manually	2023-04-29 09:52:11 +03:00
ZXED	b3f19a7ba7	support custom model name when loading the model	2023-04-29 09:52:11 +03:00
ZXED	ea8ab73343	support custom quantize_config when loading the model	2023-04-29 09:51:50 +03:00
qwopqwop200	ac41f68532	add gpt2	2023-04-28 09:14:05 +09:00
PanQiWei	51d2e53130	add support to cpu offloading and multi gpus inference on quantized model	2023-04-28 00:53:57 +08:00
PanQiWei	498de923f2	support multi gpus quantization	2023-04-27 18:48:43 +08:00
PanQiWei	c9bb427546	align 'from_pretrained' api	2023-04-27 02:29:32 +08:00
PanQiWei	f2359f56cb	add support to use push_to_hub to upload and share quantized model	2023-04-26 16:55:01 +08:00
PanQiWei	832dc4a7a1	refactor file structure	2023-04-25 18:58:20 +08:00
PanQiWei	a259fb06bb	add support to MOSS model	2023-04-25 11:54:29 +08:00
PanQiWei	7ba0edffe0	refactor file structure of modeling module	2023-04-23 17:33:09 +08:00