qwopqwop200
|
5d5b687ca8
|
qigen formatting qlinear
|
2023-08-17 15:19:01 +09:00 |
|
qwopqwop200
|
a807e038bb
|
remove many contiguous and change arguments name
|
2023-08-11 16:09:42 +09:00 |
|
qwopqwop200
|
870be83bea
|
Merge branch 'PanQiWei:main' into main
|
2023-08-10 22:48:30 +09:00 |
|
qwopqwop200
|
7ba78af3ae
|
support cpu
|
2023-08-10 22:48:04 +09:00 |
|
Felix Marty
|
4af7ea619d
|
patch for transformers compatibility
|
2023-08-09 14:23:59 +00:00 |
|
PanQiWei
|
44c7a1a184
|
make exllama_kernels compilation as optional
|
2023-08-09 17:42:22 +08:00 |
|
fxmarty
|
71f23268eb
|
Merge pull request #1 from qwopqwop200/exllama-q4-kernel
Exllama q4 kernel
|
2023-08-05 00:15:22 +09:00 |
|
Félix Marty
|
4fb3e20c5e
|
Merge branch 'main' into exllama-q4-kernel
|
2023-08-04 15:13:34 +02:00 |
|
leiwang1999
|
a0de5c2c51
|
register buffer of general quant linear
|
2023-08-03 05:15:09 +00:00 |
|
qwopqwop200
|
3fc097dcd8
|
change pack func support only 4 bit
|
2023-08-01 20:01:45 +09:00 |
|
qwopqwop200
|
a60c9a8552
|
add pack fun
|
2023-08-01 12:22:41 +09:00 |
|
Felix Marty
|
339c57a902
|
fix
|
2023-07-31 15:57:44 +00:00 |
|
Felix Marty
|
129fa4b67e
|
act-order now works fine
|
2023-07-31 15:36:58 +00:00 |
|
Felix Marty
|
38447262c0
|
fix fused attn
|
2023-07-31 13:46:32 +00:00 |
|
Felix Marty
|
760667dccc
|
cleaning
|
2023-07-31 11:58:10 +00:00 |
|
Felix Marty
|
179776bd1d
|
exllama kernel
|
2023-07-31 11:50:45 +00:00 |
|
PanQiWei
|
5883b45d73
|
fix error raised when cuda kernels are not installed
|
2023-07-26 13:59:28 +08:00 |
|
qwopqwop200
|
9578c59d31
|
fix cuda bug
|
2023-07-25 16:50:05 +09:00 |
|
潘其威(William)
|
046c031139
|
Merge pull request #141 from AngainorDev/patch-1
Fix error message
|
2023-06-19 10:11:10 +08:00 |
|
Angainor Development
|
e75611e1b7
|
Fix error message
|
2023-06-05 22:19:09 +02:00 |
|
lunar
|
618a5f50ee
|
Add transpose operator when replace Conv1d with qlinear_cuda_old
|
2023-06-05 23:11:18 +08:00 |
|
qwopqwop200
|
f4820f2988
|
change qlinear cuda support 64dim
|
2023-06-03 07:30:34 +09:00 |
|
qwopqwop200
|
2df7d7105d
|
support 64 cuda dim
|
2023-06-02 19:54:37 +09:00 |
|
qwopqwop200
|
b03f53294f
|
support 64dim cuda
|
2023-06-02 19:53:50 +09:00 |
|
qwopqwop200
|
0891ea4036
|
support 32dim triton
|
2023-06-02 19:05:55 +09:00 |
|
qwopqwop200
|
0f2841cb13
|
remove log
|
2023-05-30 23:51:55 +09:00 |
|
qwopqwop200
|
33809a8e59
|
remove log
|
2023-05-30 23:51:39 +09:00 |
|
qwopqwop200
|
dfd9dc0e6b
|
change if trainable backend pytorch
|
2023-05-30 23:43:55 +09:00 |
|
qwopqwop200
|
5274313067
|
change if trainable backend pytorch
|
2023-05-30 23:40:58 +09:00 |
|
PanQiWei
|
2b532f9453
|
add trainable mode
|
2023-05-26 13:11:30 +08:00 |
|
PanQiWei
|
69609c4bc7
|
support faster vecquant4matmul cuda kernel
|
2023-05-26 08:55:05 +08:00 |
|
PanQiWei
|
cfd27e8caa
|
refactor file structure of qlinears
|
2023-05-26 07:18:16 +08:00 |
|