Felix Marty
23eb519e68
typo
2023-07-28 17:45:34 +00:00
Felix Marty
caf6625b68
warning about triton
2023-07-28 17:42:37 +00:00
PanQiWei
1138240385
update version to 0.3.2
2023-07-26 18:40:44 +08:00
PanQiWei
ff1f100ded
remove argument 'save_dir' in method from_quantized
2023-07-26 17:58:04 +08:00
PanQiWei
722a621aaa
simplified code
2023-07-26 17:53:47 +08:00
潘其威(William)
22748dd2b7
Merge pull request #209 from PanQiWei/fix_no_cuda_kernel
...
Fix error raised when CUDA kernels are not installed
2023-07-26 14:07:30 +08:00
潘其威(William)
fd24e84eb2
Merge pull request #166 from casperbh96/main
...
[FEATURE] Implement perplexity metric to compare against llama.cpp
2023-07-26 14:04:51 +08:00
PanQiWei
5883b45d73
fix error raised when cuda kernels are not installed
2023-07-26 13:59:28 +08:00
潘其威(William)
bbc4a7c455
Merge pull request #208 from TheBloke/TB_Add_SafeTensors_Metadata
...
Add Safetensors metadata saving, with some values saved to each .safetensor file
2023-07-26 11:54:47 +08:00
潘其威(William)
228867a753
Merge pull request #207 from TheBloke/TB_version
...
Add a central version number
2023-07-26 11:27:23 +08:00
潘其威(William)
2456f71125
Merge pull request #205 from TheBloke/TB_fix_revision
...
Fix `revision` and other huggingface_hub kwargs in .from_quantized()
2023-07-26 10:34:43 +08:00
TheBloke
2647c92743
safetensors_metadata: add conversion to str() for input metadata to avoid errors from save_safe. Warn if this results in keys being overwritten.
2023-07-25 21:14:21 +00:00
TheBloke
ee7d80945b
Add version to metadata using new value
2023-07-25 14:25:24 +00:00
TheBloke
3817d154af
Merge branch 'TB_version' into TB_Add_SafeTensors_Metadata
2023-07-25 14:09:29 +00:00
TheBloke
7575eae6ab
Added to __init__.py to show a central version number. Also slightly adjust the way the version is stored in setup.py to make it easier to edit on version updates. Bump version to 0.3.1 in both
2023-07-25 14:06:51 +00:00
TheBloke
eeaf5ebc53
Extend huggingface_hub features to AutoGPTQForCausalLM.from_pretrained() so models can be quantised from the hub including using a private token and revision/branch etc
2023-07-25 13:26:37 +00:00
TheBloke
c9124e3fc7
Fix revision and other huggingface_hub args for .from_quantized(), which were not being passed through
2023-07-25 12:48:33 +00:00
TheBloke
3f359fc778
Add support for Safetensors metadata
2023-07-25 11:30:39 +00:00
qwopqwop200
9578c59d31
fix cuda bug
2023-07-25 16:50:05 +09:00
tc
e28e8ee809
Add support for InternLM
2023-07-07 09:25:40 -07:00
Casper
992a0ab102
Reference Perplexity class
2023-06-19 20:03:32 +02:00
Casper
b351c8c547
Add perplexity calculation class
2023-06-19 20:03:22 +02:00
潘其威(William)
046c031139
Merge pull request #141 from AngainorDev/patch-1
...
Fix error message
2023-06-19 10:11:10 +08:00
LaaZa
03577a7698
Rename the class to match reference capitalisation
2023-06-18 21:01:07 +03:00
LaaZa
9fd558f2ba
Add support for Baichuan
2023-06-18 20:13:29 +03:00
Angainor Development
e75611e1b7
Fix error message
2023-06-05 22:19:09 +02:00
lunar
618a5f50ee
Add transpose operator when replacing Conv1d with qlinear_cuda_old
2023-06-05 23:11:18 +08:00
潘其威(William)
023bb1c593
Merge pull request #125 from PanQiWei/support-32dim
...
Support 32dim
2023-06-03 19:08:29 +08:00
qwopqwop200
f4820f2988
change qlinear cuda to support 64dim
2023-06-03 07:30:34 +09:00
潘其威(William)
b4fdd8d264
Merge branch 'main' into peft_integration
2023-06-02 19:11:59 +08:00
qwopqwop200
2df7d7105d
support 64dim cuda
2023-06-02 19:54:37 +09:00
qwopqwop200
b03f53294f
support 64dim cuda
2023-06-02 19:53:50 +09:00
qwopqwop200
0891ea4036
support 32dim triton
2023-06-02 19:05:55 +09:00
qwopqwop200
b3654a68c3
support 32dim triton kernel
2023-06-02 19:04:12 +09:00
PanQiWei
ec6603d0ab
support older Python versions
2023-05-31 22:11:16 +08:00
qwopqwop200
b1a8cc28e8
remove raise
2023-05-31 00:03:51 +09:00
qwopqwop200
c381958a5f
add warning
2023-05-30 23:53:33 +09:00
qwopqwop200
0f2841cb13
remove log
2023-05-30 23:51:55 +09:00
qwopqwop200
33809a8e59
remove log
2023-05-30 23:51:39 +09:00
qwopqwop200
dfd9dc0e6b
change to pytorch backend if trainable
2023-05-30 23:43:55 +09:00
qwopqwop200
5274313067
change to pytorch backend if trainable
2023-05-30 23:40:58 +09:00
潘其威(William)
defc96ff04
Merge pull request #91 from TheBloke/TheBloke_support-HF-download
...
Add support for HF Hub download, and `push_to_hub`
2023-05-30 07:37:15 +08:00
潘其威(William)
2245fad095
Update auto.py
...
fix NoneType error
2023-05-30 07:35:15 +08:00
潘其威(William)
15db2cdc44
Update _base.py
...
fix problem of recursively adding file extension to model_base_name
2023-05-30 07:26:42 +08:00
潘其威(William)
cfa7271617
Update _base.py
...
fix variable not found error
2023-05-30 07:22:10 +08:00
潘其威(William)
e5771fb206
Update _base.py
...
fix key mismatch
2023-05-30 06:44:45 +08:00
潘其威(William)
61a4ea035f
Update auto.py
...
add back save_dir for backward compatibility
2023-05-30 06:43:00 +08:00
潘其威(William)
ea74e15199
Update _base.py
...
add model_name_or_path and model_file_base_name to BaseQuantizeConfig for better model file management; add back save_dir to .from_quantized() for backward compatibility
2023-05-30 06:40:31 +08:00
PanQiWei
6c64b0b361
raise NotImplementedError when a model with fused attention injected tries to use the ADAPTION_PROMPT peft type
2023-05-28 22:35:34 +08:00
PanQiWei
def084bf0e
reset value of AdaptionPromptConfig.adapter_layers to the number of the model's hidden layers when it exceeds that count
2023-05-28 22:11:02 +08:00