AutoGPTQ/auto_gptq/nn_modules — latest commit 2023-05-20 09:33:51 +01:00
triton_utils          fix bugs                                                      2023-05-14 13:07:18 +08:00
__init__.py           refactor file structure                                       2023-04-25 18:58:20 +08:00
_fused_base.py        refactor file structure for triton kernels                    2023-05-14 11:49:10 +08:00
fused_gptj_attn.py    add GPTJ fused attention module                               2023-05-14 16:17:21 +08:00
fused_llama_attn.py   compatible with older pytorch version                         2023-05-14 16:17:03 +08:00
fused_llama_mlp.py    fix bugs                                                      2023-05-14 13:07:18 +08:00
qlinear.py            Rename 'quant_cuda' to 'autogptq_cuda' to avoid conflicts
                      with existing GPTQ-for-LLaMa installations.                   2023-05-20 09:33:51 +01:00
qlinear_old.py        Rename 'quant_cuda' to 'autogptq_cuda' to avoid conflicts
                      with existing GPTQ-for-LLaMa installations.                   2023-05-20 09:33:51 +01:00
qlinear_triton.py     refactor file structure for triton kernels                    2023-05-14 11:49:10 +08:00