AutoGPTQ

Author	SHA1	Message	Date
TheBloke	1b3329b399	Fix 'groupsize' -> 'group_size' in all other .py files. I haven't touched any CUDA kernels in case there's any complexity there I don't understand	2023-05-05 14:44:16 +01:00
qwopqwop200	208d660920	fix bug	2023-05-04 10:04:00 +09:00
qwopqwop200	f51a92ed79	support faster and model load strict	2023-05-04 09:53:28 +09:00
qwopqwop200	d8707f92a9	support fused_attn	2023-05-02 21:54:15 +09:00
qwopqwop200	f47322f073	fix bug	2023-05-02 21:14:27 +09:00
qwopqwop200	a6d4f5c091	fix bug	2023-05-02 19:19:04 +09:00
qwopqwop200	1388acac94	fix bug	2023-05-02 19:13:13 +09:00
qwopqwop200	f51f763fde	fused attn, fused mlp apply	2023-05-02 18:51:04 +09:00
PanQiWei	b490ab004e	remove override of _resize_attention_mask for llama and opt	2023-04-28 23:08:42 +08:00
PanQiWei	a2abff983e	support dispatch layers to different devices when loading pretrained model before quantization	2023-04-27 02:24:08 +08:00
PanQiWei	a830a62bc3	fix bugs for attention_mask and position_ids	2023-04-20 18:32:21 +08:00
PanQiWei	229b61e20e	first init	2023-04-14 01:09:40 +08:00