AutoGPTQ/auto_gptq/nn_modules/fused_modules
Last updated: 2023-08-07 18:59:04 +08:00
__init__.py    "add FusedGeneralQuantLinear"                                                                       2023-08-04 19:10:32 +08:00
attention.py   "support inherit one of the three fused attention class and customize attn_bias building logic"    2023-08-07 18:59:04 +08:00
linear.py      "fix using wrong attribute"                                                                        2023-08-06 16:23:19 +08:00
mlp.py         "doing 'memory_efficient_fusion' in __init__"                                                      2023-08-06 17:23:57 +08:00