Author | Commit | Message | Date
djholtby | 73bfc936a0 | Close response generator when stopping API generation (#7014) | 2025-05-26 22:39:03 -03:00
oobabooga | 83bfd5c64b | Fix API issues | 2025-05-18 12:45:01 -07:00
oobabooga | 076aa67963 | Fix API issues | 2025-05-17 22:22:18 -07:00
oobabooga | 470c822f44 | API: Hide the uvicorn access logs from the terminal | 2025-05-16 12:54:39 -07:00
oobabooga | fd61297933 | Lint | 2025-05-15 21:19:19 -07:00
oobabooga | c375b69413 | API: Fix llama.cpp generating after disconnect, improve disconnect detection, fix deadlock on simultaneous requests | 2025-05-13 11:23:33 -07:00
oobabooga | 0c5fa3728e | Revert "Fix API failing to cancel streams (attempt), closes #6966" (reverts commit 006a866079) | 2025-05-10 19:12:40 -07:00
oobabooga | 006a866079 | Fix API failing to cancel streams (attempt), closes #6966 | 2025-05-10 17:55:48 -07:00
Jonas | fa960496d5 | Tools support for OpenAI compatible API (#6827) | 2025-05-08 12:30:27 -03:00
oobabooga | f82667f0b4 | Remove more multimodal extension references | 2025-05-05 14:17:00 -07:00
oobabooga | 85bf2e15b9 | API: Remove obsolete multimodal extension handling (multimodal support will be added back once it's implemented in llama-server) | 2025-05-05 14:14:48 -07:00
oobabooga | d10bded7f8 | UI: Add an enable_thinking option to enable/disable Qwen3 thinking | 2025-04-28 22:37:01 -07:00
oobabooga | bbcaec75b4 | API: Find a new port if the default one is taken (closes #6918) | 2025-04-27 21:13:16 -07:00
oobabooga | 35717a088c | API: Add an /v1/internal/health endpoint | 2025-04-26 15:42:27 -07:00
oobabooga | bc55feaf3e | Improve host header validation in local mode | 2025-04-26 15:42:17 -07:00
oobabooga | d9de14d1f7 | Restructure the repository (#6904) | 2025-04-26 08:56:54 -03:00
oobabooga | d5e1bccef9 | Remove the SpeechRecognition requirement | 2025-04-20 11:47:28 -07:00
oobabooga | ae02ffc605 | Refactor the transformers loader (#6859) | 2025-04-20 13:33:47 -03:00
oobabooga | ae54d8faaa | New llama.cpp loader (#6846) | 2025-04-18 09:59:37 -03:00
oobabooga | 5bcd2d7ad0 | Add the top N-sigma sampler (#6796) | 2025-03-14 16:45:11 -03:00
oobabooga | edbe0af647 | Minor fixes after 0360f54ae8 | 2025-02-02 17:04:56 -08:00
oobabooga | 0360f54ae8 | UI: add a "Show after" parameter (to use with DeepSeek </think>) | 2025-02-02 15:30:09 -08:00
oobabooga | f01cc079b9 | Lint | 2025-01-29 14:00:59 -08:00
FP HAM | 4bd260c60d | Give SillyTavern a bit of leaway the way the do OpenAI (#6685) | 2025-01-22 12:01:44 -03:00
oobabooga | 83c426e96b | Organize internals (#6646) | 2025-01-10 18:04:32 -03:00
BPplays | 619265b32c | add ipv6 support to the API (#6559) | 2025-01-09 10:23:44 -03:00
oobabooga | 11af199aff | Add a "Static KV cache" option for transformers | 2025-01-04 17:52:57 -08:00
hronoas | 9b3a3d8f12 | openai extension fix: Handle Multiple Content Items in Messages (#6528) | 2024-11-18 11:59:52 -03:00
Philipp Emanuel Weidmann | 301375834e | Exclude Top Choices (XTC): A sampler that boosts creativity, breaks writing clichés, and inhibits non-verbatim repetition (#6335) | 2024-09-27 22:50:12 -03:00
Jean-Sylvain Boige | 4924ee2901 | typo in OpenAI response format (#6365) | 2024-09-05 21:42:23 -03:00
Stefan Merettig | 9a150c3368 | API: Relax multimodal format, fixes HuggingFace Chat UI (#6353) | 2024-09-02 23:03:15 -03:00
oobabooga | addcb52c56 | Make --idle-timeout work for API requests | 2024-07-28 18:31:40 -07:00
Philipp Emanuel Weidmann | 852c943769 | DRY: A modern repetition penalty that reliably prevents looping (#5677) | 2024-05-19 23:53:47 -03:00
oobabooga | f27e1ba302 | Add a /v1/internal/chat-prompt endpoint (#5879) | 2024-04-19 00:24:46 -03:00
oobabooga | c37f792afa | Better way to handle user_bio default in the API (alternative to bdcf31035f) | 2024-03-29 10:54:01 -07:00
oobabooga | 35da6b989d | Organize the parameters tab (#5767) | 2024-03-28 16:45:03 -03:00
Yiximail | bdcf31035f | Set a default empty string for user_bio to fix #5717 issue (#5722) | 2024-03-26 16:34:03 -03:00
oobabooga | 28076928ac | UI: Add a new "User description" field for user personality/biography (#5691) | 2024-03-11 23:41:57 -03:00
oobabooga | abcdd0ad5b | API: don't use settings.yaml for default values | 2024-03-10 16:15:52 -07:00
oobabooga | 527ba98105 | Do not install extensions requirements by default (#5621) | 2024-03-04 04:46:39 -03:00
kalomaze | cfb25c9b3f | Cubic sampling w/ curve param (#5551) (Co-authored-by: oobabooga) | 2024-03-03 13:22:21 -03:00
Kevin Pham | 10df23efb7 | Remove message.content from openai streaming API (#5503) | 2024-02-19 18:50:27 -03:00
oobabooga | 8c35fefb3b | Add custom sampler order support (#5443) | 2024-02-06 11:20:10 -03:00
kalomaze | b6077b02e4 | Quadratic sampling (#5403) (Co-authored-by: oobabooga) | 2024-02-04 00:20:02 -03:00
Forkoz | 528318b700 | API: Remove tiktoken from logit bias (#5391) | 2024-01-28 21:42:03 -03:00
oobabooga | aa575119e6 | API: minor fix | 2024-01-22 04:38:43 -08:00
oobabooga | 821dd65fb3 | API: add a comment | 2024-01-22 04:15:51 -08:00
oobabooga | 6247eafcc5 | API: better handle temperature = 0 | 2024-01-22 04:12:23 -08:00
oobabooga | 817866c9cf | Lint | 2024-01-22 04:07:25 -08:00
oobabooga | aad73667af | Lint | 2024-01-22 03:25:55 -08:00