Commit graph

532 commits

Author SHA1 Message Date
PanQiWei
801610367d Merge branch 'main' into xformers_integration 2023-08-05 18:02:00 +08:00
PanQiWei
7d0909160c add fused MLPs 2023-08-04 20:03:16 +08:00
PanQiWei
8b19122775 add fused attentions 2023-08-04 19:11:43 +08:00
PanQiWei
cd8a674002 add FusedGeneralQuantLinear 2023-08-04 19:10:32 +08:00
PanQiWei
116d8267d7 update requirements 2023-08-04 19:10:05 +08:00
潘其威(William)
5d8fa85029
Merge pull request #226 from LeiWang1999/fix/general_attr
Register quant params in GeneralQuantLinear for friendly post process.
2023-08-04 18:42:54 +08:00
潘其威(William)
45152b7add
Merge pull request #220 from fxmarty/fix-revison-loading
Fix revision used to load the quantization config
2023-08-04 18:25:22 +08:00
潘其威(William)
63da0cd00a
Merge pull request #214 from fxmarty/rocm-support
Add RoCm support
2023-08-04 18:24:15 +08:00
潘其威(William)
e6790ba2cb
Update README.md 2023-08-04 18:16:55 +08:00
leiwang1999
a0de5c2c51 regist buffer of general quant linear 2023-08-03 05:15:09 +00:00
Felix Marty
1f99b94ae2 fix revision 2023-07-31 15:03:33 +00:00
Felix Marty
23eb519e68 typo 2023-07-28 17:45:34 +00:00
Felix Marty
caf6625b68 warning about triton 2023-07-28 17:42:37 +00:00
Felix Marty
d20540173c change repo owner back to PanQiWei 2023-07-28 16:43:58 +00:00
Felix Marty
121399f8e5 fix workflows 2023-07-28 16:42:39 +00:00
Felix Marty
677d23be2d style 2023-07-28 15:14:46 +00:00
Felix Marty
8112239848 add workflow, edit readme & add tests 2023-07-28 15:10:39 +00:00
Felix Marty
2cb191e114 fix bugs 2023-07-28 14:10:44 +00:00
Felix Marty
547fb198d1 fix 2023-07-27 12:36:25 +00:00
Félix Marty
0b8a1f922d is it as simple as that? 2023-07-27 12:14:33 +02:00
PanQiWei
a7167b108c simplify setup.py 2023-07-26 19:18:05 +08:00
PanQiWei
6395e4b301 update setup.py 2023-07-26 18:58:49 +08:00
PanQiWei
d6b6ec83ef Merge remote-tracking branch 'origin/main' 2023-07-26 18:41:01 +08:00
PanQiWei
1138240385 update version to 0.3.2 2023-07-26 18:40:44 +08:00
潘其威(William)
b0889e4dab
Merge pull request #212 from casperbh96/main
Fix build on non-CUDA machines after #206
2023-07-26 18:35:53 +08:00
Casper
c68b4492f6 Fix build on non-CUDA machines after #206 2023-07-26 12:21:58 +02:00
PanQiWei
ff1f100ded remove argument 'save_dir' in method from_quantized 2023-07-26 17:58:04 +08:00
PanQiWei
722a621aaa simplified code 2023-07-26 17:53:47 +08:00
PanQiWei
5d6862ee8d update README 2023-07-26 14:18:26 +08:00
潘其威(William)
22748dd2b7
Merge pull request #209 from PanQiWei/fix_no_cuda_kernel
Fix error raised when CUDA kernels are not installed
2023-07-26 14:07:30 +08:00
潘其威(William)
fd24e84eb2
Merge pull request #166 from casperbh96/main
[FEATURE] Implement perplexity metric to compare against llama.cpp
2023-07-26 14:04:51 +08:00
PanQiWei
5883b45d73 fix error raised when cuda kernels are not installed 2023-07-26 13:59:28 +08:00
潘其威(William)
bbc4a7c455
Merge pull request #208 from TheBloke/TB_Add_SafeTensors_Metadata
Add Safetensors metadata saving, with some values saved to each .safetensor file
2023-07-26 11:54:47 +08:00
潘其威(William)
228867a753
Merge pull request #207 from TheBloke/TB_version
Add a central version number
2023-07-26 11:27:23 +08:00
潘其威(William)
cbc319b4c8
Merge pull request #206 from TheBloke/TB_InstallScript
Change the install script so it attempts to build the CUDA extension in all cases
2023-07-26 11:20:53 +08:00
潘其威(William)
2456f71125
Merge pull request #205 from TheBloke/TB_fix_revision
Fix `revision` and other huggingface_hub kwargs in .from_quantized()
2023-07-26 10:34:43 +08:00
潘其威(William)
df4c4312ff
Merge pull request #202 from PanQiWei/fix-cuda-bug
Fix cuda bug that causes group_size and desc_act can't be used together
2023-07-26 10:32:18 +08:00
TheBloke
2647c92743 safetensors_metadata: add conversion to str() for input metadata to avoid errors from save_safe. Warn if this results in keys being overwritten. 2023-07-25 21:14:21 +00:00
TheBloke
ee7d80945b Add version to metadata using new value 2023-07-25 14:25:24 +00:00
TheBloke
3817d154af Merge branch 'TB_version' into TB_Add_SafeTensors_Metadata 2023-07-25 14:09:29 +00:00
TheBloke
7575eae6ab Added to __init__.py to show a central version number. Also slightly adjust way version is stored in setup.py to make it easier to edit on version update. Bump version to 0.3.1 in both 2023-07-25 14:06:51 +00:00
TheBloke
eeaf5ebc53 Extend huggingface_hub features to AutoGPTQForCausalLM.from_pretrained() so models can be quantised from the hub including using a private token and revision/branch etc 2023-07-25 13:26:37 +00:00
TheBloke
593d32cb45 Typo in version joining 2023-07-25 13:18:52 +00:00
TheBloke
c9124e3fc7 Fix revision and other huggingface_hub args for .from_quantized(), which were not being passed through 2023-07-25 12:48:33 +00:00
TheBloke
6fc69c5b83 Fix check for Torch CUDA version 2023-07-25 12:45:27 +00:00
TheBloke
29da6c239f setup.py now builds CUDA ext unless BUILD_CUDA_EXT=0. Also add a check of CUDA_VERSION from Torch, if available. GITHUB_ACTIONS=true is no longer needed. 2023-07-25 11:44:43 +00:00
TheBloke
3f359fc778 Add support for Safetensors metadata 2023-07-25 11:30:39 +00:00
qwopqwop200
9578c59d31 fix cuda bug 2023-07-25 16:50:05 +09:00
qwopqwop200
ed2aa9368e fix cuda buf 2023-07-25 16:46:32 +09:00
PanQiWei
45576f0933 0.3.0 release 2023-07-16 15:24:06 +08:00