26dc6852fe  PanQiWei  2023-08-07 18:59:04 +08:00  support inherit one of the three fused attention class and customize attn_bias building logic
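The commit above describes letting users subclass a fused attention class and override how the attention bias is built. Below is a minimal, hypothetical sketch of that pattern; the class names (`BaseFusedAttention`, `AlibiFusedAttention`) and the `build_attn_bias` hook are illustrative assumptions, not AutoGPTQ's actual API.

```python
import torch

class BaseFusedAttention(torch.nn.Module):
    """Stand-in for a fused attention module exposing an overridable
    hook for building the attention bias (hypothetical API)."""

    def build_attn_bias(self, seq_len: int) -> torch.Tensor:
        # Default: a standard causal mask (future positions get -inf).
        mask = torch.full((seq_len, seq_len), float("-inf"))
        return torch.triu(mask, diagonal=1)

class AlibiFusedAttention(BaseFusedAttention):
    """Subclass customizing the bias-building logic: here an
    ALiBi-style linear distance penalty on top of the causal mask."""

    def __init__(self, slope: float = 0.5):
        super().__init__()
        self.slope = slope

    def build_attn_bias(self, seq_len: int) -> torch.Tensor:
        causal = super().build_attn_bias(seq_len)
        positions = torch.arange(seq_len)
        # distance[i, j] = j - i, clamped so only past positions are penalized.
        distance = (positions[None, :] - positions[:, None]).clamp(max=0)
        return causal + self.slope * distance

bias = AlibiFusedAttention().build_attn_bias(4)
```

The base class keeps the fused kernel logic fixed while the subclass only swaps out how the bias tensor is constructed, which matches the inheritance-based customization the commit message hints at.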
2092a80b81  PanQiWei  2023-08-06 18:38:25 +08:00  keep attn_op as what it is when passed in
9155ef3038  PanQiWei  2023-08-06 15:37:11 +08:00  fix using wrong attribute
ab6faa6496  PanQiWei  2023-08-06 14:55:06 +08:00  implement gptj attention and mlp fused ops injection logic
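"Fused ops injection" generally means walking a loaded model and swapping matching submodules for fused replacements in place. The sketch below shows the pattern on a toy MLP with no nonlinearity, so two linear layers can legally be folded into one matmul; `FusedMLP`, `ToyBlock`, and `inject_fused_mlp` are invented names for illustration, not AutoGPTQ's real GPT-J modules.

```python
import torch

class ToyBlock(torch.nn.Module):
    """Toy two-projection block standing in for a transformer MLP."""

    def __init__(self):
        super().__init__()
        self.fc_in = torch.nn.Linear(4, 8)
        self.fc_out = torch.nn.Linear(8, 4)

    def forward(self, x):
        return self.fc_out(self.fc_in(x))

class FusedMLP(torch.nn.Module):
    """Fuses the two linears: y = W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2).
    Valid only because the toy block has no activation between them."""

    def __init__(self, fc_in: torch.nn.Linear, fc_out: torch.nn.Linear):
        super().__init__()
        weight = fc_out.weight @ fc_in.weight
        bias = fc_out.weight @ fc_in.bias + fc_out.bias
        self.linear = torch.nn.Linear(weight.shape[1], weight.shape[0])
        with torch.no_grad():
            self.linear.weight.copy_(weight)
            self.linear.bias.copy_(bias)

    def forward(self, x):
        return self.linear(x)

def inject_fused_mlp(model: torch.nn.Module) -> None:
    # Recursively replace every ToyBlock child with its fused equivalent.
    for name, child in model.named_children():
        if isinstance(child, ToyBlock):
            setattr(model, name, FusedMLP(child.fc_in, child.fc_out))
        else:
            inject_fused_mlp(child)

model = torch.nn.Sequential(ToyBlock())
x = torch.randn(2, 4)
before = model(x)
inject_fused_mlp(model)
after = model(x)
```

The injection pass mutates the module tree via `setattr`, which is the usual way such replacements are done in PyTorch without rebuilding the model.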
d5429441ef  PanQiWei  2023-05-14 16:17:21 +08:00  add GPTJ fused attention module
a2abff983e  PanQiWei  2023-04-27 02:24:08 +08:00  support dispatch layers to different devices when loading pretrained model before quantization
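Dispatching layers to different devices at load time is usually driven by a `device_map` from module names to devices, in the spirit of Hugging Face accelerate's loading utilities. The `dispatch_layers` helper below is an illustrative assumption, not AutoGPTQ's actual loading API; it maps both layers to `"cpu"` so the sketch runs anywhere.

```python
import torch

def dispatch_layers(model: torch.nn.Module, device_map: dict) -> torch.nn.Module:
    """Move each named submodule to the device named in device_map.
    Keys are dotted module names as reported by model.named_modules()."""
    for name, device in device_map.items():
        model.get_submodule(name).to(device)
    return model

model = torch.nn.Sequential(
    torch.nn.Linear(8, 8),  # submodule name "0"
    torch.nn.Linear(8, 8),  # submodule name "1"
)
# Both entries map to "cpu" here so the example is portable; in practice
# entries would map different layers to e.g. "cuda:0" and "cuda:1".
dispatch_layers(model, {"0": "cpu", "1": "cpu"})
```

Resolving modules by name with `get_submodule` keeps the dispatch logic independent of the model architecture, which is what makes a per-layer device map practical before quantization.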
229b61e20e  PanQiWei  2023-04-14 01:09:40 +08:00  first init