AutoGPTQ/auto_gptq/nn_modules/fused_modules
Last commit: 2023-08-07 13:45:37 +08:00
__init__.py add FusedGeneralQuantLinear 2023-08-04 19:10:32 +08:00
attention.py add fused attention injection logic to llama 2023-08-07 13:45:37 +08:00
linear.py fix using wrong attribute 2023-08-06 16:23:19 +08:00
mlp.py doing 'memory_efficient_fusion' in __init__ 2023-08-06 17:23:57 +08:00
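
The file names and commit messages above point at the common fusion pattern behind these modules: several sibling projections (e.g. q/k/v in attention, or gate/up in an MLP) are merged into a single linear layer by concatenating their weight matrices, so one matrix multiply replaces several. A minimal sketch of that idea in plain Python, with purely illustrative names (`matvec`, `fuse_weights`) that are not AutoGPTQ's actual API:

```python
# Sketch of weight fusion: several projections collapsed into one.
# All names here are illustrative; this is not AutoGPTQ's real code.

def matvec(weight, x):
    """Multiply a weight matrix (list of rows) by a vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def fuse_weights(*weights):
    """Stack several [out_dim x in_dim] weight matrices row-wise so a
    single matvec computes all projections in one pass."""
    fused = []
    for w in weights:
        fused.extend(w)
    return fused

# Three tiny 2x2 "projections" (think q, k, v in attention).
w_q = [[1.0, 0.0], [0.0, 1.0]]
w_k = [[2.0, 0.0], [0.0, 2.0]]
w_v = [[1.0, 1.0], [1.0, -1.0]]

x = [3.0, 4.0]

# Separate projections vs. one fused projection over stacked weights.
separate = matvec(w_q, x) + matvec(w_k, x) + matvec(w_v, x)
fused = matvec(fuse_weights(w_q, w_k, w_v), x)
assert fused == separate  # identical outputs, one matmul instead of three
```

With quantized layers the same trick applies to the packed weight and zero/scale tensors, which is presumably what `FusedGeneralQuantLinear` handles for the kernels used here.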