Author | Commit | Message | Date
TheBloke | 1b3329b399 | Fix 'groupsize' -> 'group_size' in all other .py files. I haven't touched any CUDA kernels in case there's any complexity there I don't understand | 2023-05-05 14:44:16 +01:00
qwopqwop200 | 208d660920 | fix bug | 2023-05-04 10:04:00 +09:00
qwopqwop200 | f51a92ed79 | support faster and model load strict | 2023-05-04 09:53:28 +09:00
qwopqwop200 | d8707f92a9 | support fused_attn | 2023-05-02 21:54:15 +09:00
qwopqwop200 | f47322f073 | fix bug | 2023-05-02 21:14:27 +09:00
qwopqwop200 | a6d4f5c091 | fix bug | 2023-05-02 19:19:04 +09:00
qwopqwop200 | 1388acac94 | fix bug | 2023-05-02 19:13:13 +09:00
qwopqwop200 | f51f763fde | fused attn ,fused mlp apply | 2023-05-02 18:51:04 +09:00
PanQiWei | b490ab004e | remove override of _resize_attention_mask for llama and opt | 2023-04-28 23:08:42 +08:00
PanQiWei | a2abff983e | support dispatch layers to different devices when loading pretrained model before quantization | 2023-04-27 02:24:08 +08:00
PanQiWei | a830a62bc3 | fix bugs for attention_mask and position_ids | 2023-04-20 18:32:21 +08:00
PanQiWei | 229b61e20e | first init | 2023-04-14 01:09:40 +08:00