AutoGPTQ/auto_gptq/nn_modules/fused_modules
2023-08-13 16:13:44 +08:00
__init__.py add FusedGeneralQuantLinear 2023-08-04 19:10:32 +08:00
attention.py extract rope logic into a single method for better override in child class 2023-08-13 16:13:44 +08:00
linear.py extend to support qlinear_exllama's fusion 2023-08-11 14:52:26 +08:00
mlp.py doing 'memory_efficient_fusion' in __init__ 2023-08-06 17:23:57 +08:00