潘其威(William) | 23998345f5 | Merge branch 'main' into falcon | 2023-05-27 16:23:16 +08:00
潘其威(William) | 3108985a55 | Merge pull request #112 from billcai/patch-2: Minor syntax fix for auto.py | 2023-05-27 16:20:54 +08:00
Bill Cai | 0729760234 | Update auto.py | 2023-05-27 11:16:43 +08:00
潘其威(William) | 269ef7335c | Merge branch 'main' into falcon | 2023-05-27 08:15:52 +08:00
潘其威(William) | 9ebc1d1ec0 | Merge pull request #63 from LaaZa/GPTBigCode: Add support for GPTBigCode(starcoder) | 2023-05-27 08:03:10 +08:00
潘其威(William) | 3c3b0e1e79 | Merge branch 'main' into GPTBigCode | 2023-05-27 08:03:03 +08:00
潘其威(William) | 358ff80d09 | Merge pull request #65 from LaaZa/Codegen: Add support for CodeGen/2 | 2023-05-27 08:01:53 +08:00
潘其威(William) | eab728b263 | Merge branch 'main' into Codegen | 2023-05-27 08:00:19 +08:00
潘其威(William) | f6fd314d5a | Merge branch 'main' into GPTBigCode | 2023-05-27 07:57:25 +08:00
qwopqwop200 | 277809381b | fix bug | 2023-05-27 08:53:47 +09:00
PanQiWei | 5bc5325920 | add find_all_linear_names help function, make customized lora module more general | 2023-05-27 07:49:17 +08:00
PanQiWei | eb9c0b140f | update FusedLlamaMLPForQuantizedModel for general usage purpose | 2023-05-27 07:47:20 +08:00
qwopqwop200 | bcb345fb35 | support falcon | 2023-05-27 07:53:39 +09:00
qwopqwop200 | 4d5b4fa5c6 | add dtype | 2023-05-27 07:49:28 +09:00
qwopqwop200 | c14b4c1567 | change find layer algorithm | 2023-05-27 07:48:50 +09:00
qwopqwop200 | 874c9fd0ef | fix bug | 2023-05-27 07:47:17 +09:00
PanQiWei | f7e705848a | move peft compatible model injection to the last step | 2023-05-26 14:29:33 +08:00
PanQiWei | 8bf21a7e4c | set xavier_uniform_ as lora_A's init function | 2023-05-26 14:06:53 +08:00
PanQiWei | 2b532f9453 | add trainable mode | 2023-05-26 13:11:30 +08:00
PanQiWei | fe5f5d12ed | Merge branch 'main' into peft_integration | 2023-05-26 09:48:06 +08:00
PanQiWei | 69609c4bc7 | support faster vecquant4matmul cuda kernel | 2023-05-26 08:55:05 +08:00
PanQiWei | cfd27e8caa | refactor file structure of qlinears | 2023-05-26 07:18:16 +08:00
潘其威(William) | b4eda619d0 | Merge pull request #104 from PanQiWei/triton-float32: triton float32 support | 2023-05-25 22:56:00 +08:00
qwopqwop200 | 503f85255d | Update kernels.py | 2023-05-25 23:15:33 +09:00
PanQiWei | f6a34137e9 | lora compatibility | 2023-05-25 19:44:53 +08:00
PanQiWei | d293bf3a04 | first upload peft_utils.py | 2023-05-25 15:11:11 +08:00
PanQiWei | 4d157a3b64 | add hack of __getattr__ | 2023-05-25 15:10:33 +08:00
TheBloke | b7bb50b4d5 | Fix bug added after merge | 2023-05-25 07:05:51 +01:00
Tom Jobbins | 492255b400 | Merge branch 'main' into TheBloke_support-HF-download | 2023-05-25 07:02:13 +01:00
PanQiWei | 096749fe9d | generalize QuantLinear | 2023-05-25 13:33:09 +08:00
PanQiWei | 49d1f0da1b | update README | 2023-05-25 13:06:17 +08:00
PanQiWei | 6426b41f94 | update setup.py | 2023-05-25 13:06:10 +08:00
潘其威(William) | 18c7ce5875 | Merge pull request #100 from PanQiWei/improve_cpu_offload: Improve CPU offload | 2023-05-24 18:48:37 +08:00
PanQiWei | c341a6df2f | update tutorial | 2023-05-24 18:48:19 +08:00
PanQiWei | ac14180946 | update tutorial | 2023-05-24 18:31:59 +08:00
PanQiWei | 065fd1de35 | update README | 2023-05-24 18:26:47 +08:00
PanQiWei | e6ba062c08 | update basic usage example code | 2023-05-24 17:58:01 +08:00
PanQiWei | 94ef4d5ada | update basic usage example code | 2023-05-24 17:56:46 +08:00
PanQiWei | c89bb6450c | correct typo of function name | 2023-05-24 17:43:38 +08:00
PanQiWei | 10347fdd7b | remove full_cpu_offload argument and unify model dispatch strategy | 2023-05-24 17:41:04 +08:00
PanQiWei | 379f24c2a5 | remove add_align_logits_hook_to_model | 2023-05-24 17:01:57 +08:00
PanQiWei | 749dba1a7e | disable add_align_logits_hook_to_model for now | 2023-05-24 13:42:06 +08:00
PanQiWei | 58c1b509f0 | support add_align_logits_hook_to_model | 2023-05-24 12:50:30 +08:00
PanQiWei | 21ab7c435a | make comments more readable | 2023-05-24 11:38:29 +08:00
PanQiWei | c31b370228 | make_sure_not_tensor_in_meta_device before load checkpoint | 2023-05-24 11:32:45 +08:00
PanQiWei | 63f1b4e073 | remove comment | 2023-05-24 11:23:07 +08:00
PanQiWei | 057c39e3f2 | fix meta device bug when use low_cpu_mem_usage | 2023-05-24 11:19:59 +08:00
PanQiWei | e2e7809a1f | always to enable QuantLinear bias to make compatible with model quantized from other frameworks | 2023-05-24 10:56:31 +08:00
PanQiWei | 8e034b28bc | remove duplicate code | 2023-05-23 23:48:15 +08:00
PanQiWei | 4373d6b29c | Merge branch 'main' into improve_cpu_offload | 2023-05-23 23:47:33 +08:00