AutoGPTQ

PanQiWei 0fcfddda90 rename 'inject_to_model' to 'convert_to_torch_linear'		2023-08-06 12:09:16 +08:00
fused_modules	add fused MLPs	2023-08-04 20:03:16 +08:00
qlinear	rename 'inject_to_model' to 'convert_to_torch_linear'	2023-08-06 12:09:16 +08:00
triton_utils	support 32dim triton kernel	2023-06-02 19:04:12 +09:00
__init__.py	refactor file structure	2023-04-25 18:58:20 +08:00
_fused_base.py	add trainable mode	2023-05-26 13:11:30 +08:00
fused_gptj_attn.py	add trainable mode	2023-05-26 13:11:30 +08:00
fused_llama_attn.py	add trainable mode	2023-05-26 13:11:30 +08:00
fused_llama_mlp.py	update FusedLlamaMLPForQuantizedModel for general usage purpose	2023-05-27 07:47:20 +08:00