| PanQiWei | 801610367d | Merge branch 'main' into xformers_integration | 2023-08-05 18:02:00 +08:00 |
| PanQiWei | 7d0909160c | add fused MLPs | 2023-08-04 20:03:16 +08:00 |
| PanQiWei | 8b19122775 | add fused attentions | 2023-08-04 19:11:43 +08:00 |
| PanQiWei | cd8a674002 | add FusedGeneralQuantLinear | 2023-08-04 19:10:32 +08:00 |
| PanQiWei | 116d8267d7 | update requirements | 2023-08-04 19:10:05 +08:00 |
| 潘其威(William) | 5d8fa85029 | Merge pull request #226 from LeiWang1999/fix/general_attr: Register quant params in GeneralQuantLinear for friendly post process. | 2023-08-04 18:42:54 +08:00 |
| 潘其威(William) | 45152b7add | Merge pull request #220 from fxmarty/fix-revison-loading: Fix revision used to load the quantization config | 2023-08-04 18:25:22 +08:00 |
| 潘其威(William) | 63da0cd00a | Merge pull request #214 from fxmarty/rocm-support: Add RoCm support | 2023-08-04 18:24:15 +08:00 |
| 潘其威(William) | e6790ba2cb | Update README.md | 2023-08-04 18:16:55 +08:00 |
| leiwang1999 | a0de5c2c51 | regist buffer of general quant linear | 2023-08-03 05:15:09 +00:00 |
| Felix Marty | 1f99b94ae2 | fix revision | 2023-07-31 15:03:33 +00:00 |
| Felix Marty | 23eb519e68 | typo | 2023-07-28 17:45:34 +00:00 |
| Felix Marty | caf6625b68 | warning about triton | 2023-07-28 17:42:37 +00:00 |
| Felix Marty | d20540173c | change repo owner back to PanQiWei | 2023-07-28 16:43:58 +00:00 |
| Felix Marty | 121399f8e5 | fix workflows | 2023-07-28 16:42:39 +00:00 |
| Felix Marty | 677d23be2d | style | 2023-07-28 15:14:46 +00:00 |
| Felix Marty | 8112239848 | add workflow, edit readme & add tests | 2023-07-28 15:10:39 +00:00 |
| Felix Marty | 2cb191e114 | fix bugs | 2023-07-28 14:10:44 +00:00 |
| Felix Marty | 547fb198d1 | fix | 2023-07-27 12:36:25 +00:00 |
| Félix Marty | 0b8a1f922d | is it as simple as that? | 2023-07-27 12:14:33 +02:00 |
| PanQiWei | a7167b108c | simplify setup.py | 2023-07-26 19:18:05 +08:00 |
| PanQiWei | 6395e4b301 | update setup.py | 2023-07-26 18:58:49 +08:00 |
| PanQiWei | d6b6ec83ef | Merge remote-tracking branch 'origin/main' | 2023-07-26 18:41:01 +08:00 |
| PanQiWei | 1138240385 | update version to 0.3.2 | 2023-07-26 18:40:44 +08:00 |
| 潘其威(William) | b0889e4dab | Merge pull request #212 from casperbh96/main: Fix build on non-CUDA machines after #206 | 2023-07-26 18:35:53 +08:00 |
| Casper | c68b4492f6 | Fix build on non-CUDA machines after #206 | 2023-07-26 12:21:58 +02:00 |
| PanQiWei | ff1f100ded | remove argument 'save_dir' in method from_quantized | 2023-07-26 17:58:04 +08:00 |
| PanQiWei | 722a621aaa | simplified code | 2023-07-26 17:53:47 +08:00 |
| PanQiWei | 5d6862ee8d | update README | 2023-07-26 14:18:26 +08:00 |
| 潘其威(William) | 22748dd2b7 | Merge pull request #209 from PanQiWei/fix_no_cuda_kernel: Fix error raised when CUDA kernels are not installed | 2023-07-26 14:07:30 +08:00 |
| 潘其威(William) | fd24e84eb2 | Merge pull request #166 from casperbh96/main: [FEATURE] Implement perplexity metric to compare against llama.cpp | 2023-07-26 14:04:51 +08:00 |
| PanQiWei | 5883b45d73 | fix error raised when cuda kernels are not installed | 2023-07-26 13:59:28 +08:00 |
| 潘其威(William) | bbc4a7c455 | Merge pull request #208 from TheBloke/TB_Add_SafeTensors_Metadata: Add Safetensors metadata saving, with some values saved to each .safetensor file | 2023-07-26 11:54:47 +08:00 |
| 潘其威(William) | 228867a753 | Merge pull request #207 from TheBloke/TB_version: Add a central version number | 2023-07-26 11:27:23 +08:00 |
| 潘其威(William) | cbc319b4c8 | Merge pull request #206 from TheBloke/TB_InstallScript: Change the install script so it attempts to build the CUDA extension in all cases | 2023-07-26 11:20:53 +08:00 |
| 潘其威(William) | 2456f71125 | Merge pull request #205 from TheBloke/TB_fix_revision: Fix `revision` and other huggingface_hub kwargs in .from_quantized() | 2023-07-26 10:34:43 +08:00 |
| 潘其威(William) | df4c4312ff | Merge pull request #202 from PanQiWei/fix-cuda-bug: Fix cuda bug that causes group_size and desc_act can't be used together | 2023-07-26 10:32:18 +08:00 |
| TheBloke | 2647c92743 | safetensors_metadata: add conversion to str() for input metadata to avoid errors from save_safe. Warn if this results in keys being overwritten. | 2023-07-25 21:14:21 +00:00 |
| TheBloke | ee7d80945b | Add version to metadata using new value | 2023-07-25 14:25:24 +00:00 |
| TheBloke | 3817d154af | Merge branch 'TB_version' into TB_Add_SafeTensors_Metadata | 2023-07-25 14:09:29 +00:00 |
| TheBloke | 7575eae6ab | Added to __init__.py to show a central version number. Also slightly adjust way version is stored in setup.py to make it easier to edit on version update. Bump version to 0.3.1 in both | 2023-07-25 14:06:51 +00:00 |
| TheBloke | eeaf5ebc53 | Extend huggingface_hub features to AutoGPTQForCausalLM.from_pretrained() so models can be quantised from the hub including using a private token and revision/branch etc | 2023-07-25 13:26:37 +00:00 |
| TheBloke | 593d32cb45 | Typo in version joining | 2023-07-25 13:18:52 +00:00 |
| TheBloke | c9124e3fc7 | Fix revision and other huggingface_hub args for .from_quantized(), which were not being passed through | 2023-07-25 12:48:33 +00:00 |
| TheBloke | 6fc69c5b83 | Fix check for Torch CUDA version | 2023-07-25 12:45:27 +00:00 |
| TheBloke | 29da6c239f | setup.py now builds CUDA ext unless BUILD_CUDA_EXT=0. Also add a check of CUDA_VERSION from Torch, if available. GITHUB_ACTIONS=true is no longer needed. | 2023-07-25 11:44:43 +00:00 |
| TheBloke | 3f359fc778 | Add support for Safetensors metadata | 2023-07-25 11:30:39 +00:00 |
| qwopqwop200 | 9578c59d31 | fix cuda bug | 2023-07-25 16:50:05 +09:00 |
| qwopqwop200 | ed2aa9368e | fix cuda buf | 2023-07-25 16:46:32 +09:00 |
| PanQiWei | 45576f0933 | 0.3.0 release | 2023-07-16 15:24:06 +08:00 |