Automation Pipeline
9fb99f61e7
Merge remote-tracking branches 'laaza/Mistral' and 'laaza/MPT'
2023-10-22 07:53:59 -04:00
Vivek Khandelwal
e4b2493733
Modify qlinear_cuda for tracing the GPTQ model (#367)
...
Changes:
-- The change to torch.bitwise_and is made because, during tracing,
the current usage of torch.bitwise_and results in an in-place variant
of the op, which causes an issue in the downstream lowering pipeline
of the traced model via Torch-MLIR and IREE-SHARK. The op usage is
therefore changed so that it no longer produces an in-place variant.
-- The change to the torch.matmul call in the forward function is
made because it currently assumes the weights will always be of
fp16 type, so executing the model with float32 weights results in
an error. The change therefore casts the LHS of the matmul to the
same type as the RHS.
Neither change affects the model's behavior in any way.
Signed-off-by: Vivek Khandelwal <vivek@nod-labs.com>
2023-10-21 01:06:01 +09:00
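The two fixes described in the commit above can be sketched as follows (a minimal illustration assuming a PyTorch environment, not the actual AutoGPTQ code; `unpack_mask` and `safe_matmul` are hypothetical names):

```python
import torch

def unpack_mask(packed, mask):
    # Out-of-place bitwise_and: `torch.bitwise_and(a, b)` allocates a new
    # tensor, whereas an in-place variant (e.g. `a.bitwise_and_(b)`) mutates
    # its input and can break downstream lowering of a traced graph.
    return torch.bitwise_and(packed, mask)

def safe_matmul(x, weight):
    # Cast the LHS to the RHS dtype so the matmul works for both fp16 and
    # fp32 weights, instead of assuming the weights are always fp16.
    return torch.matmul(x.to(weight.dtype), weight)
```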
LaaZa
4b7389ddb7
Merge branch 'main' into MPT
...
# Conflicts:
# auto_gptq/modeling/__init__.py
# auto_gptq/modeling/_const.py
# auto_gptq/modeling/auto.py
2023-10-04 20:21:49 +03:00
LaaZa
99acbead42
Add support for Mistral models.
2023-10-04 01:07:55 +03:00
student686
c1a3013c45
import exllama QuantLinear instead of exllamav2's
2023-09-27 11:05:13 +08:00
潘其威(William)
3de7fbb0d5
Revert "fix bug(breaking change) remove (zeors -= 1)"
2023-09-27 10:37:31 +08:00
潘其威(William)
62fd0371ac
Merge branch 'main' into main
2023-09-26 14:09:04 +08:00
潘其威(William)
b461b6fa13
Merge pull request #335 from z80maniac/ignore-extra-args
...
Ignore unknown parameters in quantize_config.json
2023-09-26 14:00:38 +08:00
潘其威(William)
04db761eed
Merge pull request #347 from alex4321/peft-model-use-adapter-name
...
Use `adapter_name` for `get_gptq_peft_model` with `train_mode=True`
2023-09-26 13:55:06 +08:00
Marc Sun
c912bf361a
exllamav2 integration
2023-09-25 16:51:18 +00:00
Alexander Pozharskii
0185095402
Use adapter_name for get_gptq_peft_model with train_mode=True
2023-09-24 17:11:19 +04:00
ZXED
121dbd15a5
Ignore unknown parameters in quantize_config.json
2023-09-10 18:39:40 +03:00
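The "ignore unknown parameters" change above follows a common pattern: filter the loaded JSON down to the fields the config class declares before constructing it, so configs written by newer versions still load. A minimal sketch (hypothetical `QuantizeConfig` stand-in, not the actual AutoGPTQ class):

```python
import json
from dataclasses import dataclass, fields

@dataclass
class QuantizeConfig:
    # Hypothetical minimal stand-in for the real config class.
    bits: int = 4
    group_size: int = 128

def load_config(text):
    raw = json.loads(text)
    known = {f.name for f in fields(QuantizeConfig)}
    # Drop keys the config class does not declare instead of raising.
    return QuantizeConfig(**{k: v for k, v in raw.items() if k in known})
```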
qwopqwop200
94de4ef185
GPTQ backward compatibility support
2023-09-08 10:16:29 +09:00
TheBloke
02a87dce76
Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B
2023-09-06 18:03:33 +01:00
qwopqwop200
6b1ceb1897
auto-disable fused attention if exllama is used
2023-09-06 18:14:04 +09:00
qwopqwop200
ad5b0d72ee
fix bug
2023-09-06 16:41:41 +09:00
潘其威(William)
1793227283
Merge pull request #311 from SunMarc/fix_max_input_length
...
fix typo in max_input_length
2023-09-01 10:21:54 +08:00
潘其威(William)
782bb603d9
Merge pull request #303 from JustinLin610/patch-1
...
Update qwen.py for Qwen-VL
2023-09-01 10:20:24 +08:00
Marc Sun
04b321da89
fix typo
2023-08-31 14:07:16 -04:00
潘其威(William)
1e938e6bad
Merge pull request #310 from PanQiWei/fix_to()_metod_bug
...
fix model type changed after calling .to() method
2023-08-31 19:04:02 +08:00
PanQiWei
c7021f0f44
fix model type changed after calling .to() method
2023-08-31 18:39:03 +08:00
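The `.to()` bug fixed above is a classic wrapper pitfall (hypothetical reconstruction; the actual AutoGPTQ fix may differ): if a wrapper delegates `.to()` to its inner model and returns the inner model's result, the call site silently receives the raw model type instead of the wrapper. Returning the wrapper itself preserves the type:

```python
class FakeModel:
    """Stand-in for an underlying model whose .to() returns the model."""
    def to(self, device):
        self.device = device
        return self

class QuantizedWrapper:
    """Hypothetical wrapper class illustrating the fix."""
    def __init__(self, model):
        self.model = model

    def to(self, device):
        self.model = self.model.to(device)
        # Return `self`, not `self.model.to(device)`, so the object's
        # type is unchanged after calling .to().
        return self
```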
qwopqwop200
45a1ee4d84
add install check for qigen
2023-08-31 14:37:39 +09:00
Junyang Lin
7c39a3a315
Update qwen.py for Qwen-VL
...
add transformer.visual as outside layer for the adaptation to Qwen-VL
2023-08-30 16:29:55 +08:00
PanQiWei
604c96144f
temporarily set the version of main branch to 0.5.0.dev0
2023-08-25 17:36:23 +08:00
潘其威(William)
e5050a5650
Revert "V0.4.2 release"
2023-08-25 17:26:55 +08:00
潘其威(William)
1049fd014a
Merge pull request #287 from PanQiWei/v0.4.2-release
...
V0.4.2 release
2023-08-25 17:26:41 +08:00
qwopqwop200
6a9d80eddc
Merge remote-tracking branch 'qwopqwop200/main' into main
2023-08-25 18:06:03 +09:00
qwopqwop200
dafdd6189a
remove duplicate code
2023-08-25 14:59:13 +09:00
Félix Marty
8254da4f15
update version
2023-08-24 17:47:14 +02:00
Felix Marty
04730ac66c
expose api to set exllama max length
2023-08-24 11:22:15 +00:00
qwopqwop200
f23a06f911
Merge branch 'PanQiWei:main' into main
2023-08-17 15:22:43 +09:00
qwopqwop200
b8a42911a6
qigen refactoring
2023-08-17 15:22:16 +09:00
qwopqwop200
5d5b687ca8
qigen formatting qlinear
2023-08-17 15:19:01 +09:00
qwopqwop200
084c9d8860
name change
2023-08-17 15:17:09 +09:00
PanQiWei
893fc5d7a3
release 0.4.1
2023-08-13 16:35:59 +08:00
PanQiWei
34b4ba451c
fix typo
2023-08-13 16:26:02 +08:00
qwopqwop200
051f3facc7
change argument names
2023-08-11 16:10:32 +09:00
qwopqwop200
a807e038bb
remove many contiguous calls and change argument names
2023-08-11 16:09:42 +09:00
qwopqwop200
c591d6a1e1
change name make_quant_cpu to make_quant_qigen
2023-08-11 15:12:33 +09:00
qwopqwop200
2c1afc2ad9
change name make_quant_cpu to make_quant_qigen
2023-08-11 15:04:58 +09:00
qwopqwop200
aa5528cb10
use_cpu name change and default dtype change
2023-08-11 09:51:36 +09:00
qwopqwop200
870be83bea
Merge branch 'PanQiWei:main' into main
2023-08-10 22:48:30 +09:00
qwopqwop200
7ba78af3ae
support cpu
2023-08-10 22:48:04 +09:00
Felix Marty
4af7ea619d
patch for transformers compatibility
2023-08-09 14:23:59 +00:00
PanQiWei
44c7a1a184
make exllama_kernels compilation as optional
2023-08-09 17:42:22 +08:00
PanQiWei
172deae049
expose disable_exllama argument
2023-08-09 12:03:31 +08:00
PanQiWei
86a3d4a094
release 0.4.0
2023-08-09 11:54:31 +08:00
qwopqwop200
fe244503e0
add ","
2023-08-08 19:57:23 +09:00
qwopqwop200
d22f89c524
support qwen
2023-08-08 19:27:43 +09:00
qwopqwop200
dc5541e78a
static groups default value change
2023-08-08 14:11:39 +09:00