AutoGPTQ/auto_gptq/nn_modules — latest commit 2023-05-20 09:33:51 +01:00
triton_utils          fix bugs                                                      2023-05-14 13:07:18 +08:00
__init__.py           refactor file structure                                       2023-04-25 18:58:20 +08:00
_fused_base.py        refactor file structure for triton kernels                    2023-05-14 11:49:10 +08:00
fused_gptj_attn.py    add GPTJ fused attention module                               2023-05-14 16:17:21 +08:00
fused_llama_attn.py   compatible with older pytorch version                         2023-05-14 16:17:03 +08:00
fused_llama_mlp.py    fix bugs                                                      2023-05-14 13:07:18 +08:00
qlinear.py            Rename 'quant_cuda' to 'autogptq_cuda' to avoid conflicts
                      with existing GPTQ-for-LLaMa installations.                   2023-05-20 09:33:51 +01:00
qlinear_old.py        Rename 'quant_cuda' to 'autogptq_cuda' to avoid conflicts
                      with existing GPTQ-for-LLaMa installations.                   2023-05-20 09:33:51 +01:00
qlinear_triton.py     refactor file structure for triton kernels                    2023-05-14 11:49:10 +08:00