ced04e1dff
disable the error exit here, see if the pregen code works
2023-10-26 12:43:07 -04:00
07021b9a1c
Generated files so that when generation fails in the pipeline, it still continues with what should be some OK defaults
2023-10-26 10:26:42 -04:00
3011e13009
Built locally for temp setup; not sure what it's doing, but it is doing weird stuff on the build server, like it never determines something
2023-10-26 10:26:13 -04:00
153c085a32
Make this fail early when the actual problem happens
2023-10-26 09:38:59 -04:00
Automation Pipeline
9fb99f61e7
Merge remote-tracking branches 'laaza/Mistral' and 'laaza/MPT'
2023-10-22 07:53:59 -04:00
Vivek Khandelwal
e4b2493733
Modify qlinear_cuda for tracing the GPTQ model ( #367 )
...
Changes:
-- The torch.bitwise_and call is changed because, during tracing
of this model, the previous usage of torch.bitwise_and resulted in
an in-place variant of the op, which caused an issue in the
downstream lowering pipeline of the traced model via Torch-MLIR
and IREE-SHARK. The op usage is therefore changed so that it does
not result in an in-place variant.
-- The torch.matmul call in the forward function is changed
because it previously assumed that the weights are always of fp16
type, so executing the model with float32 weights resulted in an
error. The change casts the LHS of the matmul to the same type as
the RHS.
Neither of the above changes affects the model in any way.
Signed-off-by: Vivek Khandelwal <vivek@nod-labs.com>
2023-10-21 01:06:01 +09:00
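The two changes described in the commit above can be illustrated with a minimal sketch. This is not the actual qlinear_cuda code: the function names `unpack_out_of_place` and `matmul_matching_dtype`, the packing layout (consecutive `bits`-wide values in each int32 word), and the tensor shapes are assumptions made for illustration only.

```python
import torch


def unpack_out_of_place(packed: torch.Tensor, bits: int) -> torch.Tensor:
    """Unpack `bits`-wide values from int32 words without any in-place op.

    Hypothetical helper; assumes values are packed lowest bits first.
    """
    # One shift amount per value stored in a 32-bit word.
    shifts = torch.arange(0, 32, bits, dtype=torch.int32)
    # Broadcast (..., 1) against (32 // bits,) to get (..., 32 // bits).
    shifted = torch.bitwise_right_shift(packed.unsqueeze(-1), shifts)
    # Functional call: returns a new tensor instead of writing into
    # `shifted` through an `out=` argument, so no in-place op is recorded.
    return torch.bitwise_and(shifted, (1 << bits) - 1)


def matmul_matching_dtype(x: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """Cast the LHS to the dtype of the RHS instead of assuming fp16."""
    return torch.matmul(x.to(weights.dtype), weights)


if __name__ == "__main__":
    packed = torch.tensor([0x76543210], dtype=torch.int32)
    print(unpack_out_of_place(packed, bits=4))

    x = torch.ones(2, 3, dtype=torch.float16)
    w = torch.ones(3, 4, dtype=torch.float32)
    print(matmul_matching_dtype(x, w).dtype)
```

Casting the activations to the weight dtype, rather than hard-coding a half-precision cast, lets the same forward path serve both fp16 and float32 weights.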
LaaZa
4b7389ddb7
Merge branch 'main' into MPT
...
# Conflicts:
# auto_gptq/modeling/__init__.py
# auto_gptq/modeling/_const.py
# auto_gptq/modeling/auto.py
2023-10-04 20:21:49 +03:00
LaaZa
99acbead42
Add support for Mistral models.
2023-10-04 01:07:55 +03:00
潘其威(William)
51c043c6be
Merge pull request #355 from PanQiWei/fix_pack_model_use_exllamav2
...
import exllama QuantLinear instead of exllamav2's in `pack_model`
2023-09-27 11:06:35 +08:00
student686
c1a3013c45
import exllama QuantLinear instead of exllamav2's
2023-09-27 11:05:13 +08:00
潘其威(William)
3b81fb5ea0
Merge pull request #354 from PanQiWei/revert-325-main
...
Reverts #325 because it may break exllama kernels
2023-09-27 10:39:00 +08:00
潘其威(William)
3de7fbb0d5
Revert "fix bug(breaking change) remove (zeors -= 1)"
2023-09-27 10:37:31 +08:00
潘其威(William)
ac23d6b819
Merge pull request #325 from qwopqwop200/main
...
remove an unnecessary line (zeors -= 1) to make disabling the 'sym' feature truly possible
2023-09-26 14:20:39 +08:00
潘其威(William)
62fd0371ac
Merge branch 'main' into main
2023-09-26 14:09:04 +08:00
潘其威(William)
b461b6fa13
Merge pull request #335 from z80maniac/ignore-extra-args
...
Ignore unknown parameters in quantize_config.json
2023-09-26 14:00:38 +08:00
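One way to ignore unknown parameters when loading a config file such as quantize_config.json is to filter the parsed keys against the fields the config class declares. The sketch below is an assumption about the general technique, not the code merged in #335: `QuantizeConfigSketch`, its fields, and `from_json_ignoring_unknown` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass
class QuantizeConfigSketch:
    """Hypothetical stand-in for a quantization config class."""
    bits: int = 4
    group_size: int = -1
    desc_act: bool = False


def from_json_ignoring_unknown(text: str) -> Tuple[QuantizeConfigSketch, List[str]]:
    """Build the config from JSON text, dropping keys it does not declare.

    Returns the config and the sorted list of ignored keys so the caller
    can log a warning instead of failing.
    """
    raw = json.loads(text)
    known = {f.name for f in fields(QuantizeConfigSketch)}
    ignored = sorted(key for key in raw if key not in known)
    kept = {key: value for key, value in raw.items() if key in known}
    return QuantizeConfigSketch(**kept), ignored


if __name__ == "__main__":
    cfg, ignored = from_json_ignoring_unknown(
        '{"bits": 8, "group_size": 128, "some_future_option": true}'
    )
    print(cfg, ignored)
```

Returning the ignored keys, rather than dropping them silently, keeps a misspelled parameter name visible to the user.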
潘其威(William)
04db761eed
Merge pull request #347 from alex4321/peft-model-use-adapter-name
...
Use `adapter_name` for `get_gptq_peft_model` with `train_mode=True`
2023-09-26 13:55:06 +08:00
潘其威(William)
50d2e86890
Merge pull request #349 from SunMarc/exllamav2_integration
...
exllamav2 integration
2023-09-26 13:49:59 +08:00
Marc Sun
c912bf361a
exllamav2 integration
2023-09-25 16:51:18 +00:00
student686
645bd15a96
update README
2023-09-25 18:55:34 +08:00
student686
d2844437fd
update README
2023-09-25 18:53:03 +08:00
student686
da84da846b
update README
2023-09-25 18:51:03 +08:00
student686
50da063f65
update README
2023-09-25 18:47:40 +08:00
Alexander Pozharskii
0185095402
Use adapter_name for get_gptq_peft_model with train_mode=True
2023-09-24 17:11:19 +04:00
潘其威(William)
06e071e68e
Merge pull request #326 from TheBloke/TB_Latest_Falcon
...
Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B
2023-09-14 22:49:25 +08:00
PanQiWei
7a75176224
update README
2023-09-11 11:15:08 +08:00
ZXED
121dbd15a5
Ignore unknown parameters in quantize_config.json
2023-09-10 18:39:40 +03:00
qwopqwop200
94de4ef185
GPTQ backward compatibility support
2023-09-08 10:16:29 +09:00
qwopqwop200
9e0682a63e
Optimize q4_matmul
...
https://github.com/turboderp/exllama/pull/275
2023-09-07 12:54:46 +09:00
TheBloke
034f6730ed
Removed unexpected file that shouldn't have been added, sorry
2023-09-06 18:08:30 +01:00
TheBloke
02a87dce76
Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B
2023-09-06 18:03:33 +01:00
qwopqwop200
6b1ceb1897
if exllama, auto disable fused attention
2023-09-06 18:14:04 +09:00
qwopqwop200
ad5b0d72ee
fix bug
2023-09-06 16:41:41 +09:00
qwopqwop200
f752336cda
fix bug
2023-09-06 16:39:22 +09:00
潘其威(William)
1793227283
Merge pull request #311 from SunMarc/fix_max_input_length
...
fix typo in max_input_length
2023-09-01 10:21:54 +08:00
潘其威(William)
782bb603d9
Merge pull request #303 from JustinLin610/patch-1
...
Update qwen.py for Qwen-VL
2023-09-01 10:20:24 +08:00
Marc Sun
04b321da89
fix type
2023-08-31 14:07:16 -04:00
潘其威(William)
1e938e6bad
Merge pull request #310 from PanQiWei/fix_to()_metod_bug
...
fix model type changed after calling .to() method
2023-08-31 19:04:02 +08:00
潘其威(William)
1339db3045
Merge pull request #309 from PanQiWei/install-skip-qigen(windows)
...
skip qigen installation on windows
2023-08-31 19:03:43 +08:00
PanQiWei
c7021f0f44
fix model type changed after calling .to() method
2023-08-31 18:39:03 +08:00
qwopqwop200
f97b77a64e
fix install bug
2023-08-31 15:00:38 +09:00
qwopqwop200
45a1ee4d84
install check qigen
2023-08-31 14:37:39 +09:00
qwopqwop200
71d56c76d0
skip install qigen(windows)
2023-08-31 14:35:04 +09:00
Junyang Lin
7c39a3a315
Update qwen.py for Qwen-VL
...
add transformer.visual as outside layer for the adaptation to Qwen-VL
2023-08-30 16:29:55 +08:00
PanQiWei
604c96144f
temporarily set the version of main branch to 0.5.0.dev0
2023-08-25 17:36:23 +08:00
潘其威(William)
6bbf70373f
Merge pull request #288 from PanQiWei/revert-287-v0.4.2-release
...
Revert "V0.4.2 release"
2023-08-25 17:34:27 +08:00
潘其威(William)
e5050a5650
Revert "V0.4.2 release"
2023-08-25 17:26:55 +08:00
潘其威(William)
1049fd014a
Merge pull request #287 from PanQiWei/v0.4.2-release
...
V0.4.2 release
2023-08-25 17:26:41 +08:00
qwopqwop200
6a9d80eddc
Merge remote-tracking branch 'qwopqwop200/main' into main
2023-08-25 18:06:03 +09:00
qwopqwop200
dafdd6189a
remove duplicate code
2023-08-25 14:59:13 +09:00
fxmarty
144302f58f
Update install instructions ( #286 )
2023-08-25 04:17:25 +09:00