Commit graph

333 commits

Author SHA1 Message Date
Automation Pipeline
9fb99f61e7 Merge remote-tracking branches 'laaza/Mistral' and 'laaza/MPT' 2023-10-22 07:53:59 -04:00
Vivek Khandelwal
e4b2493733
Modify qlinear_cuda for tracing the GPTQ model (#367)
Changes:
-- The change to torch.bitwise_and is made because, during
   tracing, the current usage of torch.bitwise_and results in
   an in-place variant of this op, causing an issue in the
   downstream lowering pipeline of the traced model via
   Torch-MLIR and IREE-SHARK. The op usage is therefore changed
   so that it does not result in an in-place variant.

-- The change to the torch.matmul call in the forward function
   is made because it currently assumes the weights will always
   be of fp16 type, so executing the model with float32 weights
   results in an error. The change therefore casts the LHS of
   the matmul to the same dtype as the RHS.

Neither of the above changes affects the model in any way.

Signed-off-by: Vivek Khandelwal <vivek@nod-labs.com>
2023-10-21 01:06:01 +09:00
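The two fixes described in the commit above can be sketched in plain PyTorch. This is an illustrative sketch, not the exact AutoGPTQ diff: `unpack_mask` and `safe_matmul` are hypothetical helper names, but the techniques (the functional, out-of-place form of `bitwise_and`, and casting the matmul LHS to the RHS dtype) are the ones the commit message describes.

```python
import torch

def unpack_mask(packed: torch.Tensor, mask: int) -> torch.Tensor:
    # Functional form returns a new tensor instead of mutating `packed`,
    # keeping the op out-of-place in the traced graph so it lowers
    # cleanly through Torch-MLIR / IREE-SHARK.
    return torch.bitwise_and(packed, mask)

def safe_matmul(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # Cast the LHS to the RHS dtype instead of assuming fp16 weights,
    # so float32 weights no longer raise a dtype-mismatch error.
    return torch.matmul(x.to(weight.dtype), weight)
```

With float16 activations and float32 weights, `safe_matmul` promotes the activations and runs the matmul in float32 rather than erroring out.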
LaaZa
4b7389ddb7 Merge branch 'main' into MPT
# Conflicts:
#	auto_gptq/modeling/__init__.py
#	auto_gptq/modeling/_const.py
#	auto_gptq/modeling/auto.py
2023-10-04 20:21:49 +03:00
LaaZa
99acbead42 Add support for Mistral models. 2023-10-04 01:07:55 +03:00
student686
c1a3013c45 import exllama QuantLinear instead of exllamav2's 2023-09-27 11:05:13 +08:00
潘其威(William)
3de7fbb0d5
Revert "fix bug(breaking change) remove (zeors -= 1)" 2023-09-27 10:37:31 +08:00
潘其威(William)
62fd0371ac
Merge branch 'main' into main 2023-09-26 14:09:04 +08:00
潘其威(William)
b461b6fa13
Merge pull request #335 from z80maniac/ignore-extra-args
Ignore unknown parameters in quantize_config.json
2023-09-26 14:00:38 +08:00
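The PR merged above makes loading tolerant of unrecognized keys in quantize_config.json. A minimal sketch of that idea, assuming a hypothetical `QuantizeConfig` dataclass (the real `BaseQuantizeConfig` in AutoGPTQ has more fields):

```python
from dataclasses import dataclass, fields

@dataclass
class QuantizeConfig:
    bits: int = 4
    group_size: int = 128

def config_from_dict(raw: dict) -> QuantizeConfig:
    # Keep only the keys the dataclass declares and silently drop the
    # rest, so a config written by a newer version still loads.
    known = {f.name for f in fields(QuantizeConfig)}
    return QuantizeConfig(**{k: v for k, v in raw.items() if k in known})
```

Without the filter, an extra key such as `"new_param"` would make the constructor raise `TypeError: unexpected keyword argument`.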
潘其威(William)
04db761eed
Merge pull request #347 from alex4321/peft-model-use-adapter-name
Use `adapter_name` for `get_gptq_peft_model` with `train_mode=True`
2023-09-26 13:55:06 +08:00
Marc Sun
c912bf361a exllamav2 integration 2023-09-25 16:51:18 +00:00
Alexander Pozharskii
0185095402 Use adapter_name for get_gptq_peft_model with train_mode=True 2023-09-24 17:11:19 +04:00
ZXED
121dbd15a5
Ignore unknown parameters in quantize_config.json 2023-09-10 18:39:40 +03:00
qwopqwop200
94de4ef185
GPTQ backward compatibility support 2023-09-08 10:16:29 +09:00
TheBloke
02a87dce76 Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B 2023-09-06 18:03:33 +01:00
qwopqwop200
6b1ceb1897
if exllama, auto disable fused attention 2023-09-06 18:14:04 +09:00
qwopqwop200
ad5b0d72ee
fix bug 2023-09-06 16:41:41 +09:00
潘其威(William)
1793227283
Merge pull request #311 from SunMarc/fix_max_input_length
fix typo in max_input_length
2023-09-01 10:21:54 +08:00
潘其威(William)
782bb603d9
Merge pull request #303 from JustinLin610/patch-1
Update qwen.py for Qwen-VL
2023-09-01 10:20:24 +08:00
Marc Sun
04b321da89
fix typo 2023-08-31 14:07:16 -04:00
潘其威(William)
1e938e6bad
Merge pull request #310 from PanQiWei/fix_to()_metod_bug
fix model type changed after calling .to() method
2023-08-31 19:04:02 +08:00
PanQiWei
c7021f0f44 fix model type changed after calling .to() method 2023-08-31 18:39:03 +08:00
qwopqwop200
45a1ee4d84
install check qigen 2023-08-31 14:37:39 +09:00
Junyang Lin
7c39a3a315
Update qwen.py for Qwen-VL
add transformer.visual as outside layer for the adaptation to Qwen-VL
2023-08-30 16:29:55 +08:00
PanQiWei
604c96144f temporarily set the version of main branch to 0.5.0.dev0 2023-08-25 17:36:23 +08:00
潘其威(William)
e5050a5650
Revert "V0.4.2 release" 2023-08-25 17:26:55 +08:00
潘其威(William)
1049fd014a
Merge pull request #287 from PanQiWei/v0.4.2-release
V0.4.2 release
2023-08-25 17:26:41 +08:00
qwopqwop200
6a9d80eddc Merge remote-tracking branch 'qwopqwop200/main' into main 2023-08-25 18:06:03 +09:00
qwopqwop200
dafdd6189a
remove duplicate code 2023-08-25 14:59:13 +09:00
Félix Marty
8254da4f15 update version 2023-08-24 17:47:14 +02:00
Felix Marty
04730ac66c expose api to set exllama max length 2023-08-24 11:22:15 +00:00
qwopqwop200
f23a06f911
Merge branch 'PanQiWei:main' into main 2023-08-17 15:22:43 +09:00
qwopqwop200
b8a42911a6
qigen refactoring 2023-08-17 15:22:16 +09:00
qwopqwop200
5d5b687ca8
qigen formatting qlinear 2023-08-17 15:19:01 +09:00
qwopqwop200
084c9d8860
name change 2023-08-17 15:17:09 +09:00
PanQiWei
893fc5d7a3 release 0.4.1 2023-08-13 16:35:59 +08:00
PanQiWei
34b4ba451c fix typo 2023-08-13 16:26:02 +08:00
qwopqwop200
051f3facc7
change argument names 2023-08-11 16:10:32 +09:00
qwopqwop200
a807e038bb
remove many contiguous calls and change argument names 2023-08-11 16:09:42 +09:00
qwopqwop200
c591d6a1e1
change name make_quant_cpu to make_quant_qigen 2023-08-11 15:12:33 +09:00
qwopqwop200
2c1afc2ad9
change name make_quant_cpu to make_quant_qigen 2023-08-11 15:04:58 +09:00
qwopqwop200
aa5528cb10
use_cpu name change and default dtype change 2023-08-11 09:51:36 +09:00
qwopqwop200
870be83bea
Merge branch 'PanQiWei:main' into main 2023-08-10 22:48:30 +09:00
qwopqwop200
7ba78af3ae support cpu 2023-08-10 22:48:04 +09:00
Felix Marty
4af7ea619d patch for transformers compatiblity 2023-08-09 14:23:59 +00:00
PanQiWei
44c7a1a184 make exllama_kernels compilation as optional 2023-08-09 17:42:22 +08:00
PanQiWei
172deae049 expose disable_exllama argument 2023-08-09 12:03:31 +08:00
PanQiWei
86a3d4a094 release 0.4.0 2023-08-09 11:54:31 +08:00
qwopqwop200
fe244503e0
add "," 2023-08-08 19:57:23 +09:00
qwopqwop200
d22f89c524
support qwen 2023-08-08 19:27:43 +09:00
qwopqwop200
dc5541e78a
static groups default value change 2023-08-08 14:11:39 +09:00