26dc6852fe  PanQiWei  2023-08-07 18:59:04 +08:00  support inherit one of the three fused attention class and customize attn_bias building logic
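The commit above describes letting users subclass a fused attention class and override how the attention bias is built. Below is a minimal, hypothetical sketch of that pattern; the class names (`BaseFusedAttention`, `AlibiFusedAttention`) and the `build_attn_bias` hook are illustrative assumptions, not AutoGPTQ's actual API.

```python
import torch

class BaseFusedAttention(torch.nn.Module):
    """Stand-in for a fused attention module exposing an overridable
    hook for building the attention bias (hypothetical API)."""

    def build_attn_bias(self, seq_len: int) -> torch.Tensor:
        # Default: a standard causal mask (future positions get -inf).
        mask = torch.full((seq_len, seq_len), float("-inf"))
        return torch.triu(mask, diagonal=1)

class AlibiFusedAttention(BaseFusedAttention):
    """Subclass customizing the bias-building logic: here an
    ALiBi-style linear distance penalty on top of the causal mask."""

    def __init__(self, slope: float = 0.5):
        super().__init__()
        self.slope = slope

    def build_attn_bias(self, seq_len: int) -> torch.Tensor:
        causal = super().build_attn_bias(seq_len)
        positions = torch.arange(seq_len)
        # distance[i, j] = j - i, clamped so only past positions are penalized.
        distance = (positions[None, :] - positions[:, None]).clamp(max=0)
        return causal + self.slope * distance

bias = AlibiFusedAttention().build_attn_bias(4)
```

The base class keeps the fused kernel logic fixed while the subclass only swaps out how the bias tensor is constructed, which matches the inheritance-based customization the commit message hints at.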
2092a80b81  PanQiWei  2023-08-06 18:38:25 +08:00  keep attn_op as what it is when passed in
9155ef3038  PanQiWei  2023-08-06 15:37:11 +08:00  fix using wrong attribute
ab6faa6496  PanQiWei  2023-08-06 14:55:06 +08:00  implement gptj attention and mlp fused ops injection logic
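"Fused ops injection" generally means walking a loaded model and swapping matching submodules for fused replacements in place. The sketch below shows the pattern on a toy MLP with no nonlinearity, so two linear layers can legally be folded into one matmul; `FusedMLP`, `ToyBlock`, and `inject_fused_mlp` are invented names for illustration, not AutoGPTQ's real GPT-J modules.

```python
import torch

class ToyBlock(torch.nn.Module):
    """Toy two-projection block standing in for a transformer MLP."""

    def __init__(self):
        super().__init__()
        self.fc_in = torch.nn.Linear(4, 8)
        self.fc_out = torch.nn.Linear(8, 4)

    def forward(self, x):
        return self.fc_out(self.fc_in(x))

class FusedMLP(torch.nn.Module):
    """Fuses the two linears: y = W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2).
    Valid only because the toy block has no activation between them."""

    def __init__(self, fc_in: torch.nn.Linear, fc_out: torch.nn.Linear):
        super().__init__()
        weight = fc_out.weight @ fc_in.weight
        bias = fc_out.weight @ fc_in.bias + fc_out.bias
        self.linear = torch.nn.Linear(weight.shape[1], weight.shape[0])
        with torch.no_grad():
            self.linear.weight.copy_(weight)
            self.linear.bias.copy_(bias)

    def forward(self, x):
        return self.linear(x)

def inject_fused_mlp(model: torch.nn.Module) -> None:
    # Recursively replace every ToyBlock child with its fused equivalent.
    for name, child in model.named_children():
        if isinstance(child, ToyBlock):
            setattr(model, name, FusedMLP(child.fc_in, child.fc_out))
        else:
            inject_fused_mlp(child)

model = torch.nn.Sequential(ToyBlock())
x = torch.randn(2, 4)
before = model(x)
inject_fused_mlp(model)
after = model(x)
```

The injection pass mutates the module tree via `setattr`, which is the usual way such replacements are done in PyTorch without rebuilding the model.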
d5429441ef  PanQiWei  2023-05-14 16:17:21 +08:00  add GPTJ fused attention module
a2abff983e  PanQiWei  2023-04-27 02:24:08 +08:00  support dispatch layers to different devices when loading pretrained model before quantization
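Dispatching layers to different devices at load time is usually driven by a `device_map` from module names to devices, in the spirit of Hugging Face accelerate's loading utilities. The `dispatch_layers` helper below is an illustrative assumption, not AutoGPTQ's actual loading API; it maps both layers to `"cpu"` so the sketch runs anywhere.

```python
import torch

def dispatch_layers(model: torch.nn.Module, device_map: dict) -> torch.nn.Module:
    """Move each named submodule to the device named in device_map.
    Keys are dotted module names as reported by model.named_modules()."""
    for name, device in device_map.items():
        model.get_submodule(name).to(device)
    return model

model = torch.nn.Sequential(
    torch.nn.Linear(8, 8),  # submodule name "0"
    torch.nn.Linear(8, 8),  # submodule name "1"
)
# Both entries map to "cpu" here so the example is portable; in practice
# entries would map different layers to e.g. "cuda:0" and "cuda:1".
dispatch_layers(model, {"0": "cpu", "1": "cpu"})
```

Resolving modules by name with `get_submodule` keeps the dispatch logic independent of the model architecture, which is what makes a per-layer device map practical before quantization.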
229b61e20e  PanQiWei  2023-04-14 01:09:40 +08:00  first init