Commit graph

13 commits

Author    SHA1        Date                        Message
PanQiWei  7c2ec905a6  2023-08-13 16:13:44 +08:00  extrac rope logic into a single method for better override in child class
PanQiWei  fdb8c4500a  2023-08-11 14:52:26 +08:00  extend to support qlinear_exllama's fusion
PanQiWei  efe47aafe5  2023-08-10 15:36:54 +08:00  prevent potential import error
PanQiWei  26dc6852fe  2023-08-07 18:59:04 +08:00  support inherit one of the three fused attention class and customize attn_bias building logic
PanQiWei  e5f874e5af  2023-08-07 13:45:37 +08:00  add fused attention injection logic to llama
PanQiWei  1f9717af7f  2023-08-06 18:24:23 +08:00  change classes default values
PanQiWei  7a70bcf6d8  2023-08-06 17:23:57 +08:00  doing 'memory_efficient_fusion' in __init__
PanQiWei  677409e2fe  2023-08-06 16:23:19 +08:00  fix using wrong attribute
PanQiWei  9155ef3038  2023-08-06 15:37:11 +08:00  fix using wrong attribute
PanQiWei  f67b512cee  2023-08-06 14:54:34 +08:00  add 'training' argument
PanQiWei  7d0909160c  2023-08-04 20:03:16 +08:00  add fused MLPs
PanQiWei  8b19122775  2023-08-04 19:11:43 +08:00  add fused attentions
PanQiWei  cd8a674002  2023-08-04 19:10:32 +08:00  add FusedGeneralQuantLinear