Felix Marty
23eb519e68
typo
2023-07-28 17:45:34 +00:00
Felix Marty
caf6625b68
warning about triton
2023-07-28 17:42:37 +00:00
PanQiWei
1138240385
update version to 0.3.2
2023-07-26 18:40:44 +08:00
PanQiWei
ff1f100ded
remove argument 'save_dir' in method from_quantized
2023-07-26 17:58:04 +08:00
PanQiWei
722a621aaa
simplified code
2023-07-26 17:53:47 +08:00
潘其威(William)
22748dd2b7
Merge pull request #209 from PanQiWei/fix_no_cuda_kernel
...
Fix error raised when CUDA kernels are not installed
2023-07-26 14:07:30 +08:00
潘其威(William)
fd24e84eb2
Merge pull request #166 from casperbh96/main
...
[FEATURE] Implement perplexity metric to compare against llama.cpp
2023-07-26 14:04:51 +08:00
PanQiWei
5883b45d73
fix error raised when cuda kernels are not installed
2023-07-26 13:59:28 +08:00
潘其威(William)
bbc4a7c455
Merge pull request #208 from TheBloke/TB_Add_SafeTensors_Metadata
...
Add Safetensors metadata saving, with some values saved to each .safetensor file
2023-07-26 11:54:47 +08:00
潘其威(William)
228867a753
Merge pull request #207 from TheBloke/TB_version
...
Add a central version number
2023-07-26 11:27:23 +08:00
潘其威(William)
2456f71125
Merge pull request #205 from TheBloke/TB_fix_revision
...
Fix `revision` and other huggingface_hub kwargs in .from_quantized()
2023-07-26 10:34:43 +08:00
TheBloke
2647c92743
safetensors_metadata: add conversion to str() for input metadata to avoid errors from save_safe. Warn if this results in keys being overwritten.
2023-07-25 21:14:21 +00:00
TheBloke
ee7d80945b
Add version to metadata using new value
2023-07-25 14:25:24 +00:00
TheBloke
3817d154af
Merge branch 'TB_version' into TB_Add_SafeTensors_Metadata
2023-07-25 14:09:29 +00:00
TheBloke
7575eae6ab
Added to __init__.py to show a central version number. Also slightly adjust the way the version is stored in setup.py to make it easier to edit on version updates. Bump version to 0.3.1 in both
2023-07-25 14:06:51 +00:00
TheBloke
eeaf5ebc53
Extend huggingface_hub features to AutoGPTQForCausalLM.from_pretrained() so models can be quantised from the hub including using a private token and revision/branch etc
2023-07-25 13:26:37 +00:00
TheBloke
c9124e3fc7
Fix revision and other huggingface_hub args for .from_quantized(), which were not being passed through
2023-07-25 12:48:33 +00:00
TheBloke
3f359fc778
Add support for Safetensors metadata
2023-07-25 11:30:39 +00:00
qwopqwop200
9578c59d31
fix cuda bug
2023-07-25 16:50:05 +09:00
tc
e28e8ee809
Add support for InternLM
2023-07-07 09:25:40 -07:00
Casper
992a0ab102
Reference Perplexity class
2023-06-19 20:03:32 +02:00
Casper
b351c8c547
Add perplexity calculation class
2023-06-19 20:03:22 +02:00
潘其威(William)
046c031139
Merge pull request #141 from AngainorDev/patch-1
...
Fix error message
2023-06-19 10:11:10 +08:00
LaaZa
03577a7698
Rename the class to match reference capitalisation
2023-06-18 21:01:07 +03:00
LaaZa
9fd558f2ba
Add support for Baichuan
2023-06-18 20:13:29 +03:00
Angainor Development
e75611e1b7
Fix error message
2023-06-05 22:19:09 +02:00
lunar
618a5f50ee
Add transpose operator when replacing Conv1d with qlinear_cuda_old
2023-06-05 23:11:18 +08:00
潘其威(William)
023bb1c593
Merge pull request #125 from PanQiWei/support-32dim
...
Support 32dim
2023-06-03 19:08:29 +08:00
qwopqwop200
f4820f2988
change qlinear cuda to support 64dim
2023-06-03 07:30:34 +09:00
潘其威(William)
b4fdd8d264
Merge branch 'main' into peft_integration
2023-06-02 19:11:59 +08:00
qwopqwop200
2df7d7105d
support 64dim cuda
2023-06-02 19:54:37 +09:00
qwopqwop200
b03f53294f
support 64dim cuda
2023-06-02 19:53:50 +09:00
qwopqwop200
0891ea4036
support 32dim triton
2023-06-02 19:05:55 +09:00
qwopqwop200
b3654a68c3
support 32dim triton kernel
2023-06-02 19:04:12 +09:00
PanQiWei
ec6603d0ab
support older Python versions
2023-05-31 22:11:16 +08:00
qwopqwop200
b1a8cc28e8
remove raise
2023-05-31 00:03:51 +09:00
qwopqwop200
c381958a5f
add warning
2023-05-30 23:53:33 +09:00
qwopqwop200
0f2841cb13
remove log
2023-05-30 23:51:55 +09:00
qwopqwop200
33809a8e59
remove log
2023-05-30 23:51:39 +09:00
qwopqwop200
dfd9dc0e6b
change to pytorch backend if trainable
2023-05-30 23:43:55 +09:00
qwopqwop200
5274313067
change to pytorch backend if trainable
2023-05-30 23:40:58 +09:00
潘其威(William)
defc96ff04
Merge pull request #91 from TheBloke/TheBloke_support-HF-download
...
Add support for HF Hub download, and `push_to_hub`
2023-05-30 07:37:15 +08:00
潘其威(William)
2245fad095
Update auto.py
...
fix NoneType error
2023-05-30 07:35:15 +08:00
潘其威(William)
15db2cdc44
Update _base.py
...
fix problem of recursively adding file extension to model_base_name
2023-05-30 07:26:42 +08:00
潘其威(William)
cfa7271617
Update _base.py
...
fix variable not found error
2023-05-30 07:22:10 +08:00
潘其威(William)
e5771fb206
Update _base.py
...
fix key mismatch
2023-05-30 06:44:45 +08:00
潘其威(William)
61a4ea035f
Update auto.py
...
add back save_dir for backward compatibility
2023-05-30 06:43:00 +08:00
潘其威(William)
ea74e15199
Update _base.py
...
add model_name_or_path and model_file_base_name to BaseQuantizeConfig for better model file management; add back save_dir to .from_quantized() for backward compatibility
2023-05-30 06:40:31 +08:00
PanQiWei
6c64b0b361
raise NotImplementedError when a model with fused attention injected tries to use the ADAPTION_PROMPT peft type
2023-05-28 22:35:34 +08:00
PanQiWei
def084bf0e
reset value of AdaptionPromptConfig.adapter_layers to the number of the model's hidden layers when it exceeds that count
2023-05-28 22:11:02 +08:00