AutoGPTQ

Author	SHA1	Message	Date
PanQiWei	801610367d	Merge branch 'main' into xformers_integration	2023-08-05 18:02:00 +08:00
PanQiWei	7d0909160c	add fused MLPs	2023-08-04 20:03:16 +08:00
PanQiWei	8b19122775	add fused attentions	2023-08-04 19:11:43 +08:00
PanQiWei	cd8a674002	add FusedGeneralQuantLinear	2023-08-04 19:10:32 +08:00
PanQiWei	116d8267d7	update requirements	2023-08-04 19:10:05 +08:00
潘其威(William)	5d8fa85029	Merge pull request #226 from LeiWang1999/fix/general_attr Register quant params in GeneralQuantLinear for friendly post process.	2023-08-04 18:42:54 +08:00
潘其威(William)	45152b7add	Merge pull request #220 from fxmarty/fix-revison-loading Fix revision used to load the quantization config	2023-08-04 18:25:22 +08:00
潘其威(William)	63da0cd00a	Merge pull request #214 from fxmarty/rocm-support Add RoCm support	2023-08-04 18:24:15 +08:00
潘其威(William)	e6790ba2cb	Update README.md	2023-08-04 18:16:55 +08:00
leiwang1999	a0de5c2c51	regist buffer of general quant linear	2023-08-03 05:15:09 +00:00
Felix Marty	1f99b94ae2	fix revision	2023-07-31 15:03:33 +00:00
Felix Marty	23eb519e68	typo	2023-07-28 17:45:34 +00:00
Felix Marty	caf6625b68	warning about triton	2023-07-28 17:42:37 +00:00
Felix Marty	d20540173c	change repo owner back to PanQiWei	2023-07-28 16:43:58 +00:00
Felix Marty	121399f8e5	fix workflows	2023-07-28 16:42:39 +00:00
Felix Marty	677d23be2d	style	2023-07-28 15:14:46 +00:00
Felix Marty	8112239848	add workflow, edit readme & add tests	2023-07-28 15:10:39 +00:00
Felix Marty	2cb191e114	fix bugs	2023-07-28 14:10:44 +00:00
Felix Marty	547fb198d1	fix	2023-07-27 12:36:25 +00:00
Félix Marty	0b8a1f922d	is it as simple as that?	2023-07-27 12:14:33 +02:00
PanQiWei	a7167b108c	simplify setup.py	2023-07-26 19:18:05 +08:00
PanQiWei	6395e4b301	update setup.py	2023-07-26 18:58:49 +08:00
PanQiWei	d6b6ec83ef	Merge remote-tracking branch 'origin/main'	2023-07-26 18:41:01 +08:00
PanQiWei	1138240385	update version to 0.3.2	2023-07-26 18:40:44 +08:00
潘其威(William)	b0889e4dab	Merge pull request #212 from casperbh96/main Fix build on non-CUDA machines after #206	2023-07-26 18:35:53 +08:00
Casper	c68b4492f6	Fix build on non-CUDA machines after #206	2023-07-26 12:21:58 +02:00
PanQiWei	ff1f100ded	remove argument 'save_dir' in method `from_quantized`	2023-07-26 17:58:04 +08:00
PanQiWei	722a621aaa	simplified code	2023-07-26 17:53:47 +08:00
PanQiWei	5d6862ee8d	update README	2023-07-26 14:18:26 +08:00
潘其威(William)	22748dd2b7	Merge pull request #209 from PanQiWei/fix_no_cuda_kernel Fix error raised when CUDA kernels are not installed	2023-07-26 14:07:30 +08:00
潘其威(William)	fd24e84eb2	Merge pull request #166 from casperbh96/main [FEATURE] Implement perplexity metric to compare against llama.cpp	2023-07-26 14:04:51 +08:00
PanQiWei	5883b45d73	fix error raised when cuda kernels are not installed	2023-07-26 13:59:28 +08:00
潘其威(William)	bbc4a7c455	Merge pull request #208 from TheBloke/TB_Add_SafeTensors_Metadata Add Safetensors metadata saving, with some values saved to each .safetensor file	2023-07-26 11:54:47 +08:00
潘其威(William)	228867a753	Merge pull request #207 from TheBloke/TB_version Add a central version number	2023-07-26 11:27:23 +08:00
潘其威(William)	cbc319b4c8	Merge pull request #206 from TheBloke/TB_InstallScript Change the install script so it attempts to build the CUDA extension in all cases	2023-07-26 11:20:53 +08:00
潘其威(William)	2456f71125	Merge pull request #205 from TheBloke/TB_fix_revision Fix `revision` and other huggingface_hub kwargs in .from_quantized()	2023-07-26 10:34:43 +08:00
潘其威(William)	df4c4312ff	Merge pull request #202 from PanQiWei/fix-cuda-bug Fix cuda bug that causes group_size and desc_act can't be used together	2023-07-26 10:32:18 +08:00
TheBloke	2647c92743	safetensors_metadata: add conversion to str() for input metadata to avoid errors from save_safe. Warn if this results in keys being overwritten.	2023-07-25 21:14:21 +00:00
TheBloke	ee7d80945b	Add version to metadata using new value	2023-07-25 14:25:24 +00:00
TheBloke	3817d154af	Merge branch 'TB_version' into TB_Add_SafeTensors_Metadata	2023-07-25 14:09:29 +00:00
TheBloke	7575eae6ab	Added to __init__.py to show a central version number. Also slightly adjust way version is stored in setup.py to make it easier to edit on version update. Bump version to 0.3.1 in both	2023-07-25 14:06:51 +00:00
TheBloke	eeaf5ebc53	Extend huggingface_hub features to AutoGPTQForCausalLM.from_pretrained() so models can be quantised from the hub including using a private token and revision/branch etc	2023-07-25 13:26:37 +00:00
TheBloke	593d32cb45	Typo in version joining	2023-07-25 13:18:52 +00:00
TheBloke	c9124e3fc7	Fix revision and other huggingface_hub args for .from_quantized(), which were not being passed through	2023-07-25 12:48:33 +00:00
TheBloke	6fc69c5b83	Fix check for Torch CUDA version	2023-07-25 12:45:27 +00:00
TheBloke	29da6c239f	setup.py now builds CUDA ext unless BUILD_CUDA_EXT=0. Also add a check of CUDA_VERSION from Torch, if available. GITHUB_ACTIONS=true is no longer needed.	2023-07-25 11:44:43 +00:00
TheBloke	3f359fc778	Add support for Safetensors metadata	2023-07-25 11:30:39 +00:00
qwopqwop200	9578c59d31	fix cuda bug	2023-07-25 16:50:05 +09:00
qwopqwop200	ed2aa9368e	fix cuda buf	2023-07-25 16:46:32 +09:00
PanQiWei	45576f0933	0.3.0 release	2023-07-16 15:24:06 +08:00

532 commits