oobabooga
|
126b3a768f
|
Revert "Dynamic Chat Message UI Update Speed (#6952)" (for now)
This reverts commit 8137eb8ef4.
|
2025-05-18 12:38:36 -07:00 |
|
oobabooga
|
2826c60044
|
Use logger for "Output generated in ..." messages
|
2025-05-13 14:45:46 -07:00 |
|
oobabooga
|
8984e95c67
|
UI: More friendly message when no model is loaded
|
2025-05-09 07:21:05 -07:00 |
|
mamei16
|
8137eb8ef4
|
Dynamic Chat Message UI Update Speed (#6952)
|
2025-05-05 18:05:23 -03:00 |
|
oobabooga
|
3f26b0408b
|
Fix after 9e3867dc83
|
2025-05-02 16:17:22 -07:00 |
|
oobabooga
|
9e3867dc83
|
llama.cpp: Fix manual random seeds
|
2025-05-02 09:36:15 -07:00 |
|
oobabooga
|
cd5c32dc19
|
UI: Fix max_updates_second not working
|
2025-04-30 14:54:05 -07:00 |
|
oobabooga
|
f1b64df8dd
|
EXL2: add another torch.cuda.synchronize() call to prevent errors
|
2025-04-24 09:03:49 -07:00 |
|
oobabooga
|
ff1c00bdd9
|
llama.cpp: set the random seed manually
|
2025-04-20 19:08:44 -07:00 |
|
oobabooga
|
ae02ffc605
|
Refactor the transformers loader (#6859)
|
2025-04-20 13:33:47 -03:00 |
|
oobabooga
|
6ba0164c70
|
Lint
|
2025-04-19 17:45:21 -07:00 |
|
oobabooga
|
5ab069786b
|
llama.cpp: add back the two encode calls (they are harmless now)
|
2025-04-19 17:38:36 -07:00 |
|
oobabooga
|
ba976d1390
|
llama.cpp: avoid two 'encode' calls
|
2025-04-19 16:35:01 -07:00 |
|
oobabooga
|
ae54d8faaa
|
New llama.cpp loader (#6846)
|
2025-04-18 09:59:37 -03:00 |
|
oobabooga
|
5c2f8d828e
|
Fix exllamav2 generating eos randomly after previous fix
|
2025-04-18 05:42:38 -07:00 |
|
oobabooga
|
ce9e2d94b1
|
Revert "Attempt at solving the ExLlamaV2 issue"
This reverts commit c9b3c9dfbf.
|
2025-04-17 22:03:21 -07:00 |
|
oobabooga
|
5dfab7d363
|
New attempt at solving the exl2 issue
|
2025-04-17 22:03:11 -07:00 |
|
oobabooga
|
c9b3c9dfbf
|
Attempt at solving the ExLlamaV2 issue
|
2025-04-17 21:45:15 -07:00 |
|
oobabooga
|
5bcd2d7ad0
|
Add the top N-sigma sampler (#6796)
|
2025-03-14 16:45:11 -03:00 |
|
oobabooga
|
83c426e96b
|
Organize internals (#6646)
|
2025-01-10 18:04:32 -03:00 |
|
oobabooga
|
11af199aff
|
Add a "Static KV cache" option for transformers
|
2025-01-04 17:52:57 -08:00 |
|
Petr Korolev
|
13c033c745
|
Fix CUDA error on MPS backend during API request (#6572)
---------
Co-authored-by: oobabooga <oobabooga4@gmail.com>
|
2025-01-02 00:06:11 -03:00 |
|
oobabooga
|
7b88724711
|
Make responses start faster by removing unnecessary cleanup calls (#6625)
|
2025-01-01 18:33:38 -03:00 |
|
oobabooga
|
cca9d6e22d
|
Lint
|
2024-10-01 10:21:06 -07:00 |
|
Philipp Emanuel Weidmann
|
301375834e
|
Exclude Top Choices (XTC): A sampler that boosts creativity, breaks writing clichés, and inhibits non-verbatim repetition (#6335)
|
2024-09-27 22:50:12 -03:00 |
|
GralchemOz
|
4c74c7a116
|
Fix UnicodeDecodeError for BPE-based Models (especially GLM-4) (#6357)
|
2024-09-02 23:00:59 -03:00 |
|
oobabooga
|
9dcff21da9
|
Remove unnecessary shared.previous_model_name variable
|
2024-07-28 18:35:11 -07:00 |
|
oobabooga
|
577a8cd3ee
|
Add TensorRT-LLM support (#5715)
|
2024-06-24 02:30:03 -03:00 |
|
Belladore
|
46174a2d33
|
Fix error when bos_token_id is None. (#6061)
|
2024-06-12 22:52:27 -03:00 |
|
Belladore
|
a363cdfca1
|
Fix missing bos token for some models (including Llama-3) (#6050)
|
2024-05-27 09:21:30 -03:00 |
|
Philipp Emanuel Weidmann
|
852c943769
|
DRY: A modern repetition penalty that reliably prevents looping (#5677)
|
2024-05-19 23:53:47 -03:00 |
|
oobabooga
|
9f77ed1b98
|
--idle-timeout flag to unload the model if unused for N minutes (#6026)
|
2024-05-19 23:29:39 -03:00 |
|
oobabooga
|
a4611232b7
|
Make --verbose output less spammy
|
2024-05-18 09:57:00 -07:00 |
|
oobabooga
|
70845c76fb
|
Add back the max_updates_second parameter (#5937)
|
2024-04-26 10:14:51 -03:00 |
|
oobabooga
|
6761b5e7c6
|
Improved instruct style (with syntax highlighting & LaTeX rendering) (#5936)
|
2024-04-26 10:13:11 -03:00 |
|
wangshuai09
|
fd4e46bce2
|
Add Ascend NPU support (basic) (#5541)
|
2024-04-11 18:42:20 -03:00 |
|
oobabooga
|
d423021a48
|
Remove CTransformers support (#5807)
|
2024-04-04 20:23:58 -03:00 |
|
oobabooga
|
13fe38eb27
|
Remove specialized code for gpt-4chan
|
2024-04-04 16:11:47 -07:00 |
|
oobabooga
|
35da6b989d
|
Organize the parameters tab (#5767)
|
2024-03-28 16:45:03 -03:00 |
|
oobabooga
|
2a92a842ce
|
Bump gradio to 4.23 (#5758)
|
2024-03-26 16:32:20 -03:00 |
|
oobabooga
|
cf0697936a
|
Optimize StreamingLLM by over 10x
|
2024-03-08 21:48:28 -08:00 |
|
oobabooga
|
afb51bd5d6
|
Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669)
|
2024-03-09 00:25:33 -03:00 |
|
oobabooga
|
2174958362
|
Revert gradio to 3.50.2 (#5640)
|
2024-03-06 11:52:46 -03:00 |
|
oobabooga
|
63a1d4afc8
|
Bump gradio to 4.19 (#5522)
|
2024-03-05 07:32:28 -03:00 |
|
kalomaze
|
cfb25c9b3f
|
Cubic sampling w/ curve param (#5551)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2024-03-03 13:22:21 -03:00 |
|
oobabooga
|
080f7132c0
|
Revert gradio to 3.50.2 (#5513)
|
2024-02-15 20:40:23 -03:00 |
|
oobabooga
|
7123ac3f77
|
Remove "Maximum UI updates/second" parameter (#5507)
|
2024-02-14 23:34:30 -03:00 |
|
oobabooga
|
494cc3c5b0
|
Handle empty sampler priority field, use default values
|
2024-02-06 07:05:32 -08:00 |
|
oobabooga
|
2a1063eff5
|
Revert "Remove non-HF ExLlamaV2 loader (#5431)"
This reverts commit cde000d478.
|
2024-02-06 06:21:36 -08:00 |
|
oobabooga
|
8c35fefb3b
|
Add custom sampler order support (#5443)
|
2024-02-06 11:20:10 -03:00 |
|