PanQiWei
|
801610367d
|
Merge branch 'main' into xformers_integration
|
2023-08-05 18:02:00 +08:00 |
|
PanQiWei
|
7d0909160c
|
add fused MLPs
|
2023-08-04 20:03:16 +08:00 |
|
PanQiWei
|
8b19122775
|
add fused attentions
|
2023-08-04 19:11:43 +08:00 |
|
PanQiWei
|
cd8a674002
|
add FusedGeneralQuantLinear
|
2023-08-04 19:10:32 +08:00 |
|
潘其威(William)
|
5d8fa85029
|
Merge pull request #226 from LeiWang1999/fix/general_attr
Register quant params in GeneralQuantLinear for friendly post process.
|
2023-08-04 18:42:54 +08:00 |
|
潘其威(William)
|
45152b7add
|
Merge pull request #220 from fxmarty/fix-revison-loading
Fix revision used to load the quantization config
|
2023-08-04 18:25:22 +08:00 |
|
leiwang1999
|
a0de5c2c51
|
register buffer of general quant linear
|
2023-08-03 05:15:09 +00:00 |
|
Felix Marty
|
1f99b94ae2
|
fix revision
|
2023-07-31 15:03:33 +00:00 |
|
Felix Marty
|
23eb519e68
|
typo
|
2023-07-28 17:45:34 +00:00 |
|
Felix Marty
|
caf6625b68
|
warning about triton
|
2023-07-28 17:42:37 +00:00 |
|
PanQiWei
|
1138240385
|
update version to 0.3.2
|
2023-07-26 18:40:44 +08:00 |
|
PanQiWei
|
ff1f100ded
|
remove argument 'save_dir' in method from_quantized
|
2023-07-26 17:58:04 +08:00 |
|
PanQiWei
|
722a621aaa
|
simplified code
|
2023-07-26 17:53:47 +08:00 |
|
潘其威(William)
|
22748dd2b7
|
Merge pull request #209 from PanQiWei/fix_no_cuda_kernel
Fix error raised when CUDA kernels are not installed
|
2023-07-26 14:07:30 +08:00 |
|
潘其威(William)
|
fd24e84eb2
|
Merge pull request #166 from casperbh96/main
[FEATURE] Implement perplexity metric to compare against llama.cpp
|
2023-07-26 14:04:51 +08:00 |
|
PanQiWei
|
5883b45d73
|
fix error raised when cuda kernels are not installed
|
2023-07-26 13:59:28 +08:00 |
|
潘其威(William)
|
bbc4a7c455
|
Merge pull request #208 from TheBloke/TB_Add_SafeTensors_Metadata
Add Safetensors metadata saving, with some values saved to each .safetensor file
|
2023-07-26 11:54:47 +08:00 |
|
潘其威(William)
|
228867a753
|
Merge pull request #207 from TheBloke/TB_version
Add a central version number
|
2023-07-26 11:27:23 +08:00 |
|
潘其威(William)
|
2456f71125
|
Merge pull request #205 from TheBloke/TB_fix_revision
Fix `revision` and other huggingface_hub kwargs in .from_quantized()
|
2023-07-26 10:34:43 +08:00 |
|
TheBloke
|
2647c92743
|
safetensors_metadata: add conversion to str() for input metadata to avoid errors from save_safe. Warn if this results in keys being overwritten.
|
2023-07-25 21:14:21 +00:00 |
|
TheBloke
|
ee7d80945b
|
Add version to metadata using new value
|
2023-07-25 14:25:24 +00:00 |
|
TheBloke
|
3817d154af
|
Merge branch 'TB_version' into TB_Add_SafeTensors_Metadata
|
2023-07-25 14:09:29 +00:00 |
|
TheBloke
|
7575eae6ab
|
Added to __init__.py to show a central version number. Also slightly adjust way version is stored in setup.py to make it easier to edit on version update. Bump version to 0.3.1 in both
|
2023-07-25 14:06:51 +00:00 |
|
TheBloke
|
eeaf5ebc53
|
Extend huggingface_hub features to AutoGPTQForCausalLM.from_pretrained() so models can be quantised from the hub including using a private token and revision/branch etc
|
2023-07-25 13:26:37 +00:00 |
|
TheBloke
|
c9124e3fc7
|
Fix revision and other huggingface_hub args for .from_quantized(), which were not being passed through
|
2023-07-25 12:48:33 +00:00 |
|
TheBloke
|
3f359fc778
|
Add support for Safetensors metadata
|
2023-07-25 11:30:39 +00:00 |
|
qwopqwop200
|
9578c59d31
|
fix cuda bug
|
2023-07-25 16:50:05 +09:00 |
|
tc
|
e28e8ee809
|
Add support for InternLM
|
2023-07-07 09:25:40 -07:00 |
|
Casper
|
992a0ab102
|
Reference Perplexity class
|
2023-06-19 20:03:32 +02:00 |
|
Casper
|
b351c8c547
|
Add perplexity calculation class
|
2023-06-19 20:03:22 +02:00 |
|
潘其威(William)
|
046c031139
|
Merge pull request #141 from AngainorDev/patch-1
Fix error message
|
2023-06-19 10:11:10 +08:00 |
|
LaaZa
|
03577a7698
|
Rename the class to match reference capitalisation
|
2023-06-18 21:01:07 +03:00 |
|
LaaZa
|
9fd558f2ba
|
Add support for Baichuan
|
2023-06-18 20:13:29 +03:00 |
|
Angainor Development
|
e75611e1b7
|
Fix error message
|
2023-06-05 22:19:09 +02:00 |
|
lunar
|
618a5f50ee
|
Add transpose operator when replacing Conv1d with qlinear_cuda_old
|
2023-06-05 23:11:18 +08:00 |
|
潘其威(William)
|
023bb1c593
|
Merge pull request #125 from PanQiWei/support-32dim
Support 32dim
|
2023-06-03 19:08:29 +08:00 |
|
qwopqwop200
|
f4820f2988
|
change qlinear cuda support 64dim
|
2023-06-03 07:30:34 +09:00 |
|
潘其威(William)
|
b4fdd8d264
|
Merge branch 'main' into peft_integration
|
2023-06-02 19:11:59 +08:00 |
|
qwopqwop200
|
2df7d7105d
|
support 64 cuda dim
|
2023-06-02 19:54:37 +09:00 |
|
qwopqwop200
|
b03f53294f
|
support 64dim cuda
|
2023-06-02 19:53:50 +09:00 |
|
qwopqwop200
|
0891ea4036
|
support 32dim triton
|
2023-06-02 19:05:55 +09:00 |
|
qwopqwop200
|
b3654a68c3
|
support 32dim triton kernel
|
2023-06-02 19:04:12 +09:00 |
|
PanQiWei
|
ec6603d0ab
|
support older version python
|
2023-05-31 22:11:16 +08:00 |
|
qwopqwop200
|
b1a8cc28e8
|
remove raise
|
2023-05-31 00:03:51 +09:00 |
|
qwopqwop200
|
c381958a5f
|
add warning
|
2023-05-30 23:53:33 +09:00 |
|
qwopqwop200
|
0f2841cb13
|
remove log
|
2023-05-30 23:51:55 +09:00 |
|
qwopqwop200
|
33809a8e59
|
remove log
|
2023-05-30 23:51:39 +09:00 |
|
qwopqwop200
|
dfd9dc0e6b
|
change if trainable backend pytorch
|
2023-05-30 23:43:55 +09:00 |
|
qwopqwop200
|
5274313067
|
change if trainable backend pytorch
|
2023-05-30 23:40:58 +09:00 |
|
潘其威(William)
|
defc96ff04
|
Merge pull request #91 from TheBloke/TheBloke_support-HF-download
Add support for HF Hub download, and `push_to_hub`
|
2023-05-30 07:37:15 +08:00 |
|