| PanQiWei | ad10c13d40 | support AdaLora | 2023-05-28 21:30:45 +08:00 |
| PanQiWei | 3ee2daa73c | make GPTQLoraModel to inherit from LoraModel to simplify code | 2023-05-28 17:36:18 +08:00 |
| PanQiWei | 22d1d8dcaa | add 'auto_find_all_linears' argument to get_gptq_peft_model function | 2023-05-28 17:04:38 +08:00 |
| PanQiWei | 83132a663a | add warning to guide users interact with lora properly | 2023-05-28 16:57:31 +08:00 |
| PanQiWei | 86f060c74b | Merge branch 'main' into peft_integration | 2023-05-28 16:23:38 +08:00 |
| PanQiWei | 491da62402 | fix signature at import time | 2023-05-27 17:49:58 +08:00 |
| 潘其威(William) | 23998345f5 | Merge branch 'main' into falcon | 2023-05-27 16:23:16 +08:00 |
| Bill Cai | 0729760234 | Update auto.py | 2023-05-27 11:16:43 +08:00 |
| 潘其威(William) | 269ef7335c | Merge branch 'main' into falcon | 2023-05-27 08:15:52 +08:00 |
| 潘其威(William) | 3c3b0e1e79 | Merge branch 'main' into GPTBigCode | 2023-05-27 08:03:03 +08:00 |
| 潘其威(William) | eab728b263 | Merge branch 'main' into Codegen | 2023-05-27 08:00:19 +08:00 |
| 潘其威(William) | f6fd314d5a | Merge branch 'main' into GPTBigCode | 2023-05-27 07:57:25 +08:00 |
| qwopqwop200 | 277809381b | fix bug | 2023-05-27 08:53:47 +09:00 |
| PanQiWei | 5bc5325920 | add find_all_linear_names help function, make customized lora module more general | 2023-05-27 07:49:17 +08:00 |
| PanQiWei | eb9c0b140f | update FusedLlamaMLPForQuantizedModel for general usage purpose | 2023-05-27 07:47:20 +08:00 |
| qwopqwop200 | bcb345fb35 | support falcon | 2023-05-27 07:53:39 +09:00 |
| qwopqwop200 | 4d5b4fa5c6 | add dtype | 2023-05-27 07:49:28 +09:00 |
| qwopqwop200 | c14b4c1567 | change find layer algorithm | 2023-05-27 07:48:50 +09:00 |
| PanQiWei | f7e705848a | move peft compatible model injection to the last step | 2023-05-26 14:29:33 +08:00 |
| PanQiWei | 8bf21a7e4c | set xavier_uniform_ as lora_A's init function | 2023-05-26 14:06:53 +08:00 |
| PanQiWei | 2b532f9453 | add trainable mode | 2023-05-26 13:11:30 +08:00 |
| PanQiWei | fe5f5d12ed | Merge branch 'main' into peft_integration | 2023-05-26 09:48:06 +08:00 |
| PanQiWei | 69609c4bc7 | support faster vecquant4matmul cuda kernel | 2023-05-26 08:55:05 +08:00 |
| PanQiWei | cfd27e8caa | refactor file structure of qlinears | 2023-05-26 07:18:16 +08:00 |
| qwopqwop200 | 503f85255d | Update kernels.py | 2023-05-25 23:15:33 +09:00 |
| PanQiWei | f6a34137e9 | lora compatibility | 2023-05-25 19:44:53 +08:00 |
| PanQiWei | d293bf3a04 | first upload peft_utils.py | 2023-05-25 15:11:11 +08:00 |
| PanQiWei | 4d157a3b64 | add hack of __getattr__ | 2023-05-25 15:10:33 +08:00 |
| TheBloke | b7bb50b4d5 | Fix bug added after merge | 2023-05-25 07:05:51 +01:00 |
| Tom Jobbins | 492255b400 | Merge branch 'main' into TheBloke_support-HF-download | 2023-05-25 07:02:13 +01:00 |
| PanQiWei | 096749fe9d | generalize QuantLinear | 2023-05-25 13:33:09 +08:00 |
| PanQiWei | 94ef4d5ada | update basic usage example code | 2023-05-24 17:56:46 +08:00 |
| PanQiWei | c89bb6450c | correct typo of function name | 2023-05-24 17:43:38 +08:00 |
| PanQiWei | 10347fdd7b | remove full_cpu_offload argument and unify model dispatch strategy | 2023-05-24 17:41:04 +08:00 |
| PanQiWei | 379f24c2a5 | remove add_align_logits_hook_to_model | 2023-05-24 17:01:57 +08:00 |
| PanQiWei | 749dba1a7e | disable add_align_logits_hook_to_model for now | 2023-05-24 13:42:06 +08:00 |
| PanQiWei | 58c1b509f0 | support add_align_logits_hook_to_model | 2023-05-24 12:50:30 +08:00 |
| PanQiWei | 21ab7c435a | make comments more readable | 2023-05-24 11:38:29 +08:00 |
| PanQiWei | c31b370228 | make_sure_not_tensor_in_meta_device before load checkpoint | 2023-05-24 11:32:45 +08:00 |
| PanQiWei | 63f1b4e073 | remove comment | 2023-05-24 11:23:07 +08:00 |
| PanQiWei | 057c39e3f2 | fix meta device bug when use low_cpu_mem_usage | 2023-05-24 11:19:59 +08:00 |
| PanQiWei | e2e7809a1f | always to enable QuantLinear bias to make compatible with model quantized from other frameworks | 2023-05-24 10:56:31 +08:00 |
| PanQiWei | 8e034b28bc | remove duplicate code | 2023-05-23 23:48:15 +08:00 |
| PanQiWei | 4373d6b29c | Merge branch 'main' into improve_cpu_offload | 2023-05-23 23:47:33 +08:00 |
| PanQiWei | 191da8141e | fix device mismatch | 2023-05-23 23:22:52 +08:00 |
| PanQiWei | e4e90e8b0a | add warmup_triton method | 2023-05-23 23:18:46 +08:00 |
| PanQiWei | ed14d3a786 | fix save quantized model failed when load pretrained model using CPU offload | 2023-05-23 23:17:11 +08:00 |
| PanQiWei | 6476ee4235 | add options: 'low_cpu_mem_usage' and 'full_cpu_offload' | 2023-05-23 22:51:00 +08:00 |
| PanQiWei | 1b2159bd4c | add more help functions | 2023-05-23 19:30:28 +08:00 |
| PanQiWei | db63c0876a | half out | 2023-05-23 16:08:28 +08:00 |