Commit graph

9 commits

Author    SHA1        Message                                       Date
PanQiWei  e5f874e5af  add fused attention injection logic to llama  2023-08-07 13:45:37 +08:00
PanQiWei  1f9717af7f  change classes default values                 2023-08-06 18:24:23 +08:00
PanQiWei  7a70bcf6d8  doing 'memory_efficient_fusion' in __init__   2023-08-06 17:23:57 +08:00
PanQiWei  677409e2fe  fix using wrong attribute                     2023-08-06 16:23:19 +08:00
PanQiWei  9155ef3038  fix using wrong attribute                     2023-08-06 15:37:11 +08:00
PanQiWei  f67b512cee  add 'training' argument                       2023-08-06 14:54:34 +08:00
PanQiWei  7d0909160c  add fused MLPs                                2023-08-04 20:03:16 +08:00
PanQiWei  8b19122775  add fused attentions                          2023-08-04 19:11:43 +08:00
PanQiWei  cd8a674002  add FusedGeneralQuantLinear                   2023-08-04 19:10:32 +08:00