AutoGPTQ

PanQiWei	fef1a4fe4b	make code clean and extendable	2023-05-12 20:11:55 +08:00
triton_utils	add triton support	2023-04-25 20:05:22 +08:00
__init__.py	refactor file structure	2023-04-25 18:58:20 +08:00
_fused_base.py	add _fused_base.py	2023-05-12 18:09:23 +08:00
fused_llama_attn.py	make code clean and extendable	2023-05-12 20:11:55 +08:00
fused_llama_mlp.py	make code clean and extendable	2023-05-12 20:11:55 +08:00
qlinear.py	Merge branch 'main' into faster-llama	2023-05-06 00:20:29 +09:00
qlinear_old.py	groupsize -> group_size	2023-05-12 13:37:52 +08:00
qlinear_triton.py	Fix 'groupsize' -> 'group_size' in all other .py files. I haven't touched any CUDA kernels in case there's any complexity there I don't understand	2023-05-05 14:44:16 +01:00