Commit graph

211 commits

Author SHA1 Message Date
Automation Pipeline
9fb99f61e7 Merge remote-tracking branches 'laaza/Mistral' and 'laaza/MPT' 2023-10-22 07:53:59 -04:00
LaaZa
4b7389ddb7 Merge branch 'main' into MPT
# Conflicts:
#	auto_gptq/modeling/__init__.py
#	auto_gptq/modeling/_const.py
#	auto_gptq/modeling/auto.py
2023-10-04 20:21:49 +03:00
LaaZa
99acbead42 Add support for Mistral models. 2023-10-04 01:07:55 +03:00
student686
c1a3013c45 import exllama QuantLinear instead of exllamav2's 2023-09-27 11:05:13 +08:00
潘其威(William)
3de7fbb0d5 Revert "fix bug(breaking change) remove (zeors -= 1)" 2023-09-27 10:37:31 +08:00
潘其威(William)
62fd0371ac Merge branch 'main' into main 2023-09-26 14:09:04 +08:00
潘其威(William)
b461b6fa13 Merge pull request #335 from z80maniac/ignore-extra-args
Ignore unknown parameters in quantize_config.json
2023-09-26 14:00:38 +08:00
Marc Sun
c912bf361a exllamav2 integration 2023-09-25 16:51:18 +00:00
ZXED
121dbd15a5 Ignore unknown parameters in quantize_config.json 2023-09-10 18:39:40 +03:00
qwopqwop200
94de4ef185 GPTQ backward compatibility support 2023-09-08 10:16:29 +09:00
TheBloke
02a87dce76 Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B 2023-09-06 18:03:33 +01:00
qwopqwop200
ad5b0d72ee fix bug 2023-09-06 16:41:41 +09:00
潘其威(William)
1793227283 Merge pull request #311 from SunMarc/fix_max_input_length
fix typo in max_input_length
2023-09-01 10:21:54 +08:00
潘其威(William)
782bb603d9 Merge pull request #303 from JustinLin610/patch-1
Update qwen.py for Qwen-VL
2023-09-01 10:20:24 +08:00
Marc Sun
04b321da89 fix type 2023-08-31 14:07:16 -04:00
潘其威(William)
1e938e6bad Merge pull request #310 from PanQiWei/fix_to()_metod_bug
fix model type changed after calling .to() method
2023-08-31 19:04:02 +08:00
PanQiWei
c7021f0f44 fix model type changed after calling .to() method 2023-08-31 18:39:03 +08:00
qwopqwop200
45a1ee4d84 install check qigen 2023-08-31 14:37:39 +09:00
Junyang Lin
7c39a3a315 Update qwen.py for Qwen-VL
add transformer.visual as outside layer for the adaptation to Qwen-VL
2023-08-30 16:29:55 +08:00
qwopqwop200
6a9d80eddc Merge remote-tracking branch 'qwopqwop200/main' into main 2023-08-25 18:06:03 +09:00
qwopqwop200
dafdd6189a duplicate code remove 2023-08-25 14:59:13 +09:00
Felix Marty
04730ac66c expose api to set exllama max length 2023-08-24 11:22:15 +00:00
qwopqwop200
b8a42911a6 qigen refactoring 2023-08-17 15:22:16 +09:00
qwopqwop200
051f3facc7 change arguments name 2023-08-11 16:10:32 +09:00
qwopqwop200
c591d6a1e1 change name make_quant_cpu to make_quant_qigen 2023-08-11 15:12:33 +09:00
qwopqwop200
2c1afc2ad9 chang name make_quant_cpu to make_quant_qigen 2023-08-11 15:04:58 +09:00
qwopqwop200
aa5528cb10 use_cpu name change and default dtype change 2023-08-11 09:51:36 +09:00
qwopqwop200
870be83bea Merge branch 'PanQiWei:main' into main 2023-08-10 22:48:30 +09:00
qwopqwop200
7ba78af3ae support cpu 2023-08-10 22:48:04 +09:00
Felix Marty
4af7ea619d patch for transformers compatiblity 2023-08-09 14:23:59 +00:00
PanQiWei
44c7a1a184 make exllama_kernels compilation as optional 2023-08-09 17:42:22 +08:00
PanQiWei
172deae049 expose disable_exllama argument 2023-08-09 12:03:31 +08:00
qwopqwop200
fe244503e0 add "," 2023-08-08 19:57:23 +09:00
qwopqwop200
d22f89c524 support qwen 2023-08-08 19:27:43 +09:00
qwopqwop200
dc5541e78a static groups default value change 2023-08-08 14:11:39 +09:00
qwopqwop200
25972d65bf support static_groups and fix bug 2023-08-07 16:27:48 +09:00
qwopqwop200
a1fd81c72d if training disable exllama 2023-08-01 12:29:58 +09:00
Felix Marty
5660b22f28 fix bug quantization config loading 2023-07-31 14:28:37 +00:00
Felix Marty
38447262c0 fix fused attn 2023-07-31 13:46:32 +00:00
Felix Marty
760667dccc cleaning 2023-07-31 11:58:10 +00:00
Felix Marty
179776bd1d exllama kernel 2023-07-31 11:50:45 +00:00
LaaZa
6ff6bc8dfc Merge branch 'main' into MPT
# Conflicts:
#	auto_gptq/modeling/__init__.py
#	auto_gptq/modeling/_const.py
#	auto_gptq/modeling/auto.py
2023-07-26 20:41:19 +03:00
PanQiWei
ff1f100ded remove argument 'save_dir' in method from_quantized 2023-07-26 17:58:04 +08:00
潘其威(William)
bbc4a7c455 Merge pull request #208 from TheBloke/TB_Add_SafeTensors_Metadata
Add Safetensors metadata saving, with some values saved to each .safetensor file
2023-07-26 11:54:47 +08:00
TheBloke
2647c92743 safetensors_metadata: add conversion to str() for input metadata to avoid errors from save_safe. Warn if this results in keys being overwritten. 2023-07-25 21:14:21 +00:00
TheBloke
ee7d80945b Add version to metadata using new value 2023-07-25 14:25:24 +00:00
TheBloke
eeaf5ebc53 Extend huggingface_hub features to AutoGPTQForCausalLM.from_pretrained() so models can be quantised from the hub including using a private token and revision/branch etc 2023-07-25 13:26:37 +00:00
TheBloke
c9124e3fc7 Fix revision and other huggingface_hub args for .from_quantized(), which were not being passed through 2023-07-25 12:48:33 +00:00
TheBloke
3f359fc778 Add support for Safetensors metadata 2023-07-25 11:30:39 +00:00
tc
e28e8ee809 Add support for InternLM 2023-07-07 09:25:40 -07:00