| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| PanQiWei | 7c2ec905a6 | extract rope logic into a single method for better override in child classes | 2023-08-13 16:13:44 +08:00 |
| PanQiWei | fdb8c4500a | extend to support qlinear_exllama's fusion | 2023-08-11 14:52:26 +08:00 |
| PanQiWei | efe47aafe5 | prevent potential import error | 2023-08-10 15:36:54 +08:00 |
| PanQiWei | 26dc6852fe | support inheriting one of the three fused attention classes and customizing attn_bias building logic | 2023-08-07 18:59:04 +08:00 |
| PanQiWei | e5f874e5af | add fused attention injection logic to llama | 2023-08-07 13:45:37 +08:00 |
| PanQiWei | 1f9717af7f | change classes' default values | 2023-08-06 18:24:23 +08:00 |
| PanQiWei | 7a70bcf6d8 | do 'memory_efficient_fusion' in __init__ | 2023-08-06 17:23:57 +08:00 |
| PanQiWei | 677409e2fe | fix using wrong attribute | 2023-08-06 16:23:19 +08:00 |
| PanQiWei | 9155ef3038 | fix using wrong attribute | 2023-08-06 15:37:11 +08:00 |
| PanQiWei | f67b512cee | add 'training' argument | 2023-08-06 14:54:34 +08:00 |
| PanQiWei | 7d0909160c | add fused MLPs | 2023-08-04 20:03:16 +08:00 |
| PanQiWei | 8b19122775 | add fused attentions | 2023-08-04 19:11:43 +08:00 |
| PanQiWei | cd8a674002 | add FusedGeneralQuantLinear | 2023-08-04 19:10:32 +08:00 |
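The top entry (7c2ec905a6) describes pulling the rope (rotary position embedding) application out into a single method so a child class can override just that step, and 26dc6852fe describes the same pattern for attn_bias construction. A minimal sketch of that override pattern follows; the class and method names (`FusedBaseAttention`, `_apply_rope`) are illustrative assumptions, not the repository's actual identifiers.

```python
import torch

class FusedBaseAttention(torch.nn.Module):
    # Hypothetical base class: rope application lives in one method so
    # subclasses can customize it without re-implementing forward().
    def _apply_rope(self, q, k, position_ids):
        # Default hook: identity (a real implementation would rotate q/k).
        return q, k

    def forward(self, q, k, position_ids=None):
        q, k = self._apply_rope(q, k, position_ids)
        # ... attention computation would follow here ...
        return q, k


class CustomRopeAttention(FusedBaseAttention):
    # Child class overrides only the rope step, per the commit's intent.
    def _apply_rope(self, q, k, position_ids):
        # Placeholder custom behavior for illustration only.
        return q * 0.5, k


q = torch.randn(1, 4, 8)
k = torch.randn(1, 4, 8)
out_q, out_k = CustomRopeAttention()(q, k)
```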