AutoGPTQ

Author	SHA1	Message	Date
Ryan Voots	ced04e1dff	disable the error exit here, see if the pregen code works	2023-10-26 12:43:07 -04:00
Ryan Voots	07021b9a1c	Generated files so that when they fail to work in pipeline then it still continues with what should be some ok defaults	2023-10-26 10:26:42 -04:00
Ryan Voots	3011e13009	Built locally for temp setup, not sure what its doing but it is doing weird stuff on build server, like it never determines something	2023-10-26 10:26:13 -04:00
Ryan Voots	153c085a32	Make this fail early when the actual problem happens	2023-10-26 09:38:59 -04:00
Automation Pipeline	9fb99f61e7	Merge remote-tracking branches 'laaza/Mistral' and 'laaza/MPT'	2023-10-22 07:53:59 -04:00
Vivek Khandelwal	e4b2493733	Modify qlinear_cuda for tracing the GPTQ model (#367) Changes: -- The change to torch.bitwise_and is made because, during tracing of this model, the current usage of torch.bitwise_and results in an in-place variant of this op, which causes an issue in the downstream lowering pipeline of the traced model via Torch-MLIR and IREE-SHARK. That's why the op usage is changed so that it does not result in an in-place variant. -- The change to the torch.matmul call in the forward function is made because it currently assumes that the weights will always be of fp16 type, so executing the model with float32 weights results in an error. That's why the change casts the LHS of the matmul to the same type as the RHS. Neither of the above changes affects the model in any way. Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2023-10-21 01:06:01 +09:00
LaaZa	4b7389ddb7	Merge branch 'main' into MPT # Conflicts: # auto_gptq/modeling/__init__.py # auto_gptq/modeling/_const.py # auto_gptq/modeling/auto.py	2023-10-04 20:21:49 +03:00
LaaZa	99acbead42	Add support for Mistral models.	2023-10-04 01:07:55 +03:00
潘其威(William)	51c043c6be	Merge pull request #355 from PanQiWei/fix_pack_model_use_exllamav2 import exllama QuantLinear instead of exllamav2's in `pack_model`	2023-09-27 11:06:35 +08:00
student686	c1a3013c45	import exllama QuantLinear instead of exllamav2's	2023-09-27 11:05:13 +08:00
潘其威(William)	3b81fb5ea0	Merge pull request #354 from PanQiWei/revert-325-main Reverts #325 for it may breaks exllama kernels	2023-09-27 10:39:00 +08:00
潘其威(William)	3de7fbb0d5	Revert "fix bug(breaking change) remove (zeors -= 1)"	2023-09-27 10:37:31 +08:00
潘其威(William)	ac23d6b819	Merge pull request #325 from qwopqwop200/main remove an unnecessary line (zeors -= 1) to make disable 'sym' feature truely possible	2023-09-26 14:20:39 +08:00
潘其威(William)	62fd0371ac	Merge branch 'main' into main	2023-09-26 14:09:04 +08:00
潘其威(William)	b461b6fa13	Merge pull request #335 from z80maniac/ignore-extra-args Ignore unknown parameters in quantize_config.json	2023-09-26 14:00:38 +08:00
潘其威(William)	04db761eed	Merge pull request #347 from alex4321/peft-model-use-adapter-name Use `adapter_name` for `get_gptq_peft_model` with `train_mode=True`	2023-09-26 13:55:06 +08:00
潘其威(William)	50d2e86890	Merge pull request #349 from SunMarc/exllamav2_integration exllamav2 integration	2023-09-26 13:49:59 +08:00
Marc Sun	c912bf361a	exllamav2 integration	2023-09-25 16:51:18 +00:00
student686	645bd15a96	update README	2023-09-25 18:55:34 +08:00
student686	d2844437fd	update README	2023-09-25 18:53:03 +08:00
student686	da84da846b	update README	2023-09-25 18:51:03 +08:00
student686	50da063f65	update README	2023-09-25 18:47:40 +08:00
Alexander Pozharskii	0185095402	Use `adapter_name` for `get_gptq_peft_model` with `train_mode=True`	2023-09-24 17:11:19 +04:00
潘其威(William)	06e071e68e	Merge pull request #326 from TheBloke/TB_Latest_Falcon Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B	2023-09-14 22:49:25 +08:00
PanQiWei	7a75176224	update README	2023-09-11 11:15:08 +08:00
ZXED	121dbd15a5	Ignore unknown parameters in quantize_config.json	2023-09-10 18:39:40 +03:00
qwopqwop200	94de4ef185	GPTQ backward compatibility support	2023-09-08 10:16:29 +09:00
qwopqwop200	9e0682a63e	Optimize q4_matmul https://github.com/turboderp/exllama/pull/275	2023-09-07 12:54:46 +09:00
TheBloke	034f6730ed	Removed unexpected file that shouldn't have been added, sorry	2023-09-06 18:08:30 +01:00
TheBloke	02a87dce76	Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B	2023-09-06 18:03:33 +01:00
qwopqwop200	6b1ceb1897	if exllama auto disable fused attention	2023-09-06 18:14:04 +09:00
qwopqwop200	ad5b0d72ee	fix bug	2023-09-06 16:41:41 +09:00
qwopqwop200	f752336cda	fix bug	2023-09-06 16:39:22 +09:00
潘其威(William)	1793227283	Merge pull request #311 from SunMarc/fix_max_input_length fix typo in max_input_length	2023-09-01 10:21:54 +08:00
潘其威(William)	782bb603d9	Merge pull request #303 from JustinLin610/patch-1 Update qwen.py for Qwen-VL	2023-09-01 10:20:24 +08:00
Marc Sun	04b321da89	fix type	2023-08-31 14:07:16 -04:00
潘其威(William)	1e938e6bad	Merge pull request #310 from PanQiWei/fix_to()_metod_bug fix model type changed after calling .to() method	2023-08-31 19:04:02 +08:00
潘其威(William)	1339db3045	Merge pull request #309 from PanQiWei/install-skip-qigen(windows) skip qigen installation on windows	2023-08-31 19:03:43 +08:00
PanQiWei	c7021f0f44	fix model type changed after calling .to() method	2023-08-31 18:39:03 +08:00
qwopqwop200	f97b77a64e	fix install bug	2023-08-31 15:00:38 +09:00
qwopqwop200	45a1ee4d84	install check qigen	2023-08-31 14:37:39 +09:00
qwopqwop200	71d56c76d0	skip install qigen(windows)	2023-08-31 14:35:04 +09:00
Junyang Lin	7c39a3a315	Update qwen.py for Qwen-VL add transformer.visual as outside layer for the adaptation to Qwen-VL	2023-08-30 16:29:55 +08:00
PanQiWei	604c96144f	temporarily set the version of main branch to 0.5.0.dev0	2023-08-25 17:36:23 +08:00
潘其威(William)	6bbf70373f	Merge pull request #288 from PanQiWei/revert-287-v0.4.2-release Revert "V0.4.2 release"	2023-08-25 17:34:27 +08:00
潘其威(William)	e5050a5650	Revert "V0.4.2 release"	2023-08-25 17:26:55 +08:00
潘其威(William)	1049fd014a	Merge pull request #287 from PanQiWei/v0.4.2-release V0.4.2 release	2023-08-25 17:26:41 +08:00
qwopqwop200	6a9d80eddc	Merge remote-tracking branch 'qwopqwop200/main' into main	2023-08-25 18:06:03 +09:00
qwopqwop200	dafdd6189a	duplicate code remove	2023-08-25 14:59:13 +09:00
fxmarty	144302f58f	Update install instructions (#286)	2023-08-25 04:17:25 +09:00

663 commits