| Author | Commit | Message | Date |
|---|---|---|---|
| Marc Sun | c912bf361a | exllamav2 integration | 2023-09-25 16:51:18 +00:00 |
| TheBloke | 02a87dce76 | Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B | 2023-09-06 18:03:33 +01:00 |
| PanQiWei | 172deae049 | expose disable_exllama argument | 2023-08-09 12:03:31 +08:00 |
| qwopqwop200 | fe244503e0 | add "," | 2023-08-08 19:57:23 +09:00 |
| qwopqwop200 | d22f89c524 | support qwen | 2023-08-08 19:27:43 +09:00 |
| PanQiWei | ff1f100ded | remove argument 'save_dir' in method from_quantized | 2023-07-26 17:58:04 +08:00 |
| TheBloke | c9124e3fc7 | Fix revision and other huggingface_hub args for .from_quantized(), which were not being passed through | 2023-07-25 12:48:33 +00:00 |
| tc | e28e8ee809 | Add support for InternLM | 2023-07-07 09:25:40 -07:00 |
| LaaZa | 03577a7698 | Rename the class to match reference capitalisation | 2023-06-18 21:01:07 +03:00 |
| LaaZa | 9fd558f2ba | Add support for Baichuan | 2023-06-18 20:13:29 +03:00 |
| 潘其威(William) | b4fdd8d264 | Merge branch 'main' into peft_integration | 2023-06-02 19:11:59 +08:00 |
| 潘其威(William) | defc96ff04 | Merge pull request #91 from TheBloke/TheBloke_support-HF-download: Add support for HF Hub download, and `push_to_hub` | 2023-05-30 07:37:15 +08:00 |
| 潘其威(William) | 2245fad095 | Update auto.py: fix None type error | 2023-05-30 07:35:15 +08:00 |
| 潘其威(William) | 61a4ea035f | Update auto.py: add back save_dir for backward compatible | 2023-05-30 06:43:00 +08:00 |
| PanQiWei | 86f060c74b | Merge branch 'main' into peft_integration | 2023-05-28 16:23:38 +08:00 |
| PanQiWei | 491da62402 | fix signature at import time | 2023-05-27 17:49:58 +08:00 |
| 潘其威(William) | 23998345f5 | Merge branch 'main' into falcon | 2023-05-27 16:23:16 +08:00 |
| Bill Cai | 0729760234 | Update auto.py | 2023-05-27 11:16:43 +08:00 |
| 潘其威(William) | 269ef7335c | Merge branch 'main' into falcon | 2023-05-27 08:15:52 +08:00 |
| 潘其威(William) | 3c3b0e1e79 | Merge branch 'main' into GPTBigCode | 2023-05-27 08:03:03 +08:00 |
| 潘其威(William) | eab728b263 | Merge branch 'main' into Codegen | 2023-05-27 08:00:19 +08:00 |
| 潘其威(William) | f6fd314d5a | Merge branch 'main' into GPTBigCode | 2023-05-27 07:57:25 +08:00 |
| qwopqwop200 | 277809381b | fix bug | 2023-05-27 08:53:47 +09:00 |
| qwopqwop200 | bcb345fb35 | support falcon | 2023-05-27 07:53:39 +09:00 |
| PanQiWei | 2b532f9453 | add trainable mode | 2023-05-26 13:11:30 +08:00 |
| Tom Jobbins | 492255b400 | Merge branch 'main' into TheBloke_support-HF-download | 2023-05-25 07:02:13 +01:00 |
| PanQiWei | 10347fdd7b | remove full_cpu_offload argument and unify model dispatch strategy | 2023-05-24 17:41:04 +08:00 |
| PanQiWei | 6476ee4235 | add options: 'low_cpu_mem_usage' and 'full_cpu_offload' | 2023-05-23 22:51:00 +08:00 |
| TheBloke | 908b338436 | Initial support for model loading from HF hub | 2023-05-19 15:57:05 +01:00 |
| PanQiWei | 759d6953d4 | support user customize device_map | 2023-05-15 13:26:38 +08:00 |
| 潘其威(William) | bdb08c16fc | Merge branch 'main' into Codegen | 2023-05-14 13:10:52 +08:00 |
| 潘其威(William) | e24c5122db | Merge branch 'main' into GPTBigCode | 2023-05-14 13:10:10 +08:00 |
| PanQiWei | f159aeabb6 | refactor .from_quantized api and improve model loading strategy | 2023-05-12 18:09:50 +08:00 |
| LaaZa | b8187ff05a | Add support for CodeGen/2 | 2023-05-08 17:34:00 +03:00 |
| LaaZa | 63247a0669 | Add support for GPTBigCode | 2023-05-08 12:28:29 +03:00 |
| qwopqwop200 | d49281bc5d | support faster and model load strict | 2023-05-04 09:07:34 +09:00 |
| qwopqwop200 | 24251d1397 | check kwargs | 2023-05-02 22:32:54 +09:00 |
| qwopqwop200 | ccd87e5800 | add Auto model parameter | 2023-05-02 22:15:56 +09:00 |
| ZXED | 24a371d14a | use the same Optional style as in other params | 2023-04-29 09:52:11 +03:00 |
| ZXED | c22770188d | allow user to set trust_remote_code flag manually | 2023-04-29 09:52:11 +03:00 |
| ZXED | b3f19a7ba7 | support custom model name when loading the model | 2023-04-29 09:52:11 +03:00 |
| ZXED | ea8ab73343 | support custom quantize_config when loading the model | 2023-04-29 09:51:50 +03:00 |
| qwopqwop200 | ac41f68532 | add gpt2 | 2023-04-28 09:14:05 +09:00 |
| PanQiWei | 51d2e53130 | add support to cpu offloading and multi gpus inference on quantized model | 2023-04-28 00:53:57 +08:00 |
| PanQiWei | 498de923f2 | support multi gpus quantization | 2023-04-27 18:48:43 +08:00 |
| PanQiWei | c9bb427546 | align 'from_pretrained' api | 2023-04-27 02:29:32 +08:00 |
| PanQiWei | f2359f56cb | add support to use push_to_hub to upload and share quantized model | 2023-04-26 16:55:01 +08:00 |
| PanQiWei | 832dc4a7a1 | refactor file structure | 2023-04-25 18:58:20 +08:00 |
| PanQiWei | a259fb06bb | add support to MOSS model | 2023-04-25 11:54:29 +08:00 |
| PanQiWei | 7ba0edffe0 | refactor file structure of modeling module | 2023-04-23 17:33:09 +08:00 |