Commit graph

11 commits

Author SHA1 Message Date
PanQiWei
4373d6b29c Merge branch 'main' into improve_cpu_offload 2023-05-23 23:47:33 +08:00
PanQiWei
db63c0876a half out 2023-05-23 16:08:28 +08:00
Lex Song
f2ab4fab46 Fix CUDA out of memory error in qlinear_old.py 2023-05-20 21:10:11 +08:00
    Add a missing line from qlinear.py to qlinear_old.py to convert the output tensor.
    This resolves a CUDA out of memory error that occurred without this line.
TheBloke
898f1ef62d Rename 'quant_cuda' to 'autogptq_cuda' to avoid conflicts with existing GPTQ-for-LLaMa installations. 2023-05-20 09:33:51 +01:00
PanQiWei
4bb10fda49 groupsize -> group_size 2023-05-12 13:37:52 +08:00
qwopqwop200
3ff6ab18cb Merge branch 'main' into faster-llama 2023-05-06 00:20:29 +09:00
TheBloke
1b3329b399 Fix 'groupsize' -> 'group_size' in all other .py files. I haven't touched any CUDA kernels in case there's any complexity there I don't understand 2023-05-05 14:44:16 +01:00
PanQiWei
6cba6e7123 reformat code 2023-05-04 22:16:08 +08:00
qwopqwop200
c359f672a8 support faster and model load strict 2023-05-04 09:04:07 +09:00
qwopqwop200
f0f37c1fe7 fix bug 2023-05-01 18:09:39 +09:00
qwopqwop200
9dfcac8e26 add qlinear_old 2023-05-01 13:03:57 +09:00
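
Note: commits 4bb10fda49 and 1b3329b399 rename the quantization group-size keyword from 'groupsize' to 'group_size' across the Python files. As a minimal sketch of the resulting spelling (not taken from this branch's diff; the model path and values below are placeholders), a quantization config would be built like this:

    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

    # Placeholder values; this only illustrates the renamed keyword argument.
    quantize_config = BaseQuantizeConfig(
        bits=4,          # quantization bit-width
        group_size=128,  # spelled `groupsize` before the rename commits
    )

    model = AutoGPTQForCausalLM.from_pretrained("facebook/opt-125m", quantize_config)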