oobabooga | 9ec46b8c44 | Remove the HQQ loader (HQQ models can be loaded through Transformers) | 2025-05-19 09:23:24 -07:00
oobabooga | 126b3a768f | Revert "Dynamic Chat Message UI Update Speed (#6952)" (for now). This reverts commit 8137eb8ef4. | 2025-05-18 12:38:36 -07:00
oobabooga | 47d4758509 | Fix #6970 | 2025-05-10 17:46:00 -07:00
oobabooga | b28fa86db6 | Default --gpu-layers to 256 | 2025-05-06 17:51:55 -07:00
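Note: --gpu-layers controls how many model layers the llama.cpp loader offloads to the GPU, and 256 is higher than the layer count of typical models, so the new default effectively offloads everything. A minimal sketch of overriding it, assuming the usual server.py entry point and a hypothetical model file name:

    # Hypothetical: offload only 20 layers to the GPU and keep the rest on the CPU
    python server.py --model my-model.Q4_K_M.gguf --gpu-layers 20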
Downtown-Case | 5ef564a22e | Fix model config loading in shared.py for Python 3.13 (#6961) | 2025-05-06 17:03:33 -03:00
mamei16 | 8137eb8ef4 | Dynamic Chat Message UI Update Speed (#6952) | 2025-05-05 18:05:23 -03:00
oobabooga | df7bb0db1f | Rename --n-gpu-layers to --gpu-layers | 2025-05-04 20:03:55 -07:00
oobabooga | 4cea720da8 | UI: Remove the "Autoload the model" feature | 2025-05-02 16:38:28 -07:00
oobabooga | 905afced1c | Add a --portable flag to hide things in portable mode | 2025-05-02 16:34:29 -07:00
oobabooga | b46ca01340 | UI: Set max_updates_second to 12 by default. When the tokens/second is at ~50 and the model is a thinking model, the markdown rendering for the streaming message becomes a CPU bottleneck. | 2025-04-30 14:53:15 -07:00
oobabooga | d10bded7f8 | UI: Add an enable_thinking option to enable/disable Qwen3 thinking | 2025-04-28 22:37:01 -07:00
oobabooga | 7b80acd524 | Fix parsing --extra-flags | 2025-04-26 18:40:03 -07:00
oobabooga | 0fe3b033d0 | Fix parsing of --n_ctx and --max_seq_len (2nd attempt) | 2025-04-26 17:52:21 -07:00
oobabooga | c4afc0421d | Fix parsing of --n_ctx and --max_seq_len | 2025-04-26 17:43:53 -07:00
oobabooga | 4ff91b6588 | Better default settings for Speculative Decoding | 2025-04-26 17:24:40 -07:00
oobabooga | 3a207e7a57 | Improve the --help formatting a bit | 2025-04-26 07:31:04 -07:00
oobabooga | cbd4d967cc | Update a --help message | 2025-04-26 05:09:52 -07:00
oobabooga | d9de14d1f7 | Restructure the repository (#6904) | 2025-04-26 08:56:54 -03:00
oobabooga | d4017fbb6d | ExLlamaV3: Add kv cache quantization (#6903) | 2025-04-25 21:32:00 -03:00
oobabooga | d4b1e31c49 | Use --ctx-size to specify the context size for all loaders. Old flags are still recognized as alternatives. | 2025-04-25 16:59:03 -07:00
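Note: with the change above, --ctx-size sets the context length for every loader, while the older per-loader flags (e.g. --n_ctx, --max_seq_len) remain accepted as aliases. A minimal sketch, again assuming the server.py entry point and a hypothetical GGUF file:

    # Hypothetical: load a GGUF model with a 16384-token context window
    python server.py --model my-model.Q4_K_M.gguf --ctx-size 16384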
oobabooga | 877cf44c08 | llama.cpp: Add StreamingLLM (--streaming-llm) | 2025-04-25 16:21:41 -07:00
oobabooga | d35818f4e1 | UI: Add a collapsible thinking block to messages with <think> steps (#6902) | 2025-04-25 18:02:02 -03:00
oobabooga | 98f4c694b9 | llama.cpp: Add --extra-flags parameter for passing additional flags to llama-server | 2025-04-25 07:32:51 -07:00
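Note: --extra-flags forwards options directly to the underlying llama-server process. A minimal sketch, assuming a comma-separated flag or flag=value format (confirm the exact syntax with --help) and options that plain llama-server accepts:

    # Hypothetical: pass two additional options through to llama-server
    python server.py --model my-model.Q4_K_M.gguf --extra-flags "no-mmap,rope-freq-base=1000000"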
Matthew Jenkins | 8f2493cc60 | Prevent llamacpp defaults from locking up consumer hardware (#6870) | 2025-04-24 23:38:57 -03:00
oobabooga | 93fd4ad25d | llama.cpp: Document the --device-draft syntax | 2025-04-24 09:20:11 -07:00
oobabooga | c71a2af5ab | Handle CMD_FLAGS.txt in the main code (closes #6896) | 2025-04-24 08:21:06 -07:00
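Note: CMD_FLAGS.txt is a plain-text file in the repository root where launch flags can be stored so they apply on every start; handling it in the main code presumably means it is honored no matter how the server is launched. A minimal sketch of its contents (the specific flags are only illustrative):

    # CMD_FLAGS.txt: comment lines like this one are assumed to be ignored
    --listen
    --api
    --ctx-size 8192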
oobabooga | bfbde73409 | Make 'instruct' the default chat mode | 2025-04-24 07:08:49 -07:00
oobabooga | e99c20bcb0 | llama.cpp: Add speculative decoding (#6891) | 2025-04-23 20:10:16 -03:00
oobabooga | 8cfd7f976b | Revert "Remove the old --model-menu flag". This reverts commit 109de34e3b. | 2025-04-20 13:35:42 -07:00
oobabooga | ae02ffc605 | Refactor the transformers loader (#6859) | 2025-04-20 13:33:47 -03:00
oobabooga | d68f0fbdf7 | Remove obsolete references to llamacpp_HF | 2025-04-18 07:46:04 -07:00
oobabooga | c6901aba9f | Remove deprecation warning code | 2025-04-18 06:05:47 -07:00
oobabooga | 8144e1031e | Remove deprecated command-line flags | 2025-04-18 06:02:28 -07:00
oobabooga | ae54d8faaa | New llama.cpp loader (#6846) | 2025-04-18 09:59:37 -03:00
oobabooga | 4ed0da74a8 | Remove the obsolete 'multimodal' extension | 2025-04-09 20:09:48 -07:00
oobabooga | 8b8d39ec4e | Add ExLlamaV3 support (#6832) | 2025-04-09 00:07:08 -03:00
oobabooga | a5855c345c | Set context lengths to at most 8192 by default (to prevent out of memory errors) (#6835) | 2025-04-07 21:42:33 -03:00
oobabooga | 109de34e3b | Remove the old --model-menu flag | 2025-03-31 09:24:03 -07:00
oobabooga | 0360f54ae8 | UI: add a "Show after" parameter (to use with DeepSeek </think>) | 2025-02-02 15:30:09 -08:00
oobabooga | c832953ff7 | UI: Activate auto_max_new_tokens by default | 2025-01-14 05:59:55 -08:00
oobabooga | d2f6c0f65f | Update README | 2025-01-10 13:25:40 -08:00
oobabooga | c393f7650d | Update settings-template.yaml, organize modules/shared.py | 2025-01-10 13:22:18 -08:00
oobabooga | 83c426e96b | Organize internals (#6646) | 2025-01-10 18:04:32 -03:00
oobabooga | 7fe46764fb | Improve the --help message about --tensorcores as well | 2025-01-10 07:07:41 -08:00
oobabooga | da6d868f58 | Remove old deprecated flags (~6 months or more) | 2025-01-09 16:11:46 -08:00
BPplays | 619265b32c | add ipv6 support to the API (#6559) | 2025-01-09 10:23:44 -03:00
oobabooga | 91a8a87887 | Remove obsolete code | 2025-01-08 15:07:21 -08:00
oobabooga | 7157257c3f | Remove the AutoGPTQ loader (#6641) | 2025-01-08 19:28:56 -03:00
oobabooga | c0f600c887 | Add a --torch-compile flag for transformers | 2025-01-05 05:47:00 -08:00
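Note: for the Transformers loader, --torch-compile enables torch.compile on the loaded model, which can speed up generation after an initial warm-up compilation. A minimal sketch, assuming the server.py entry point and a hypothetical model folder under models/:

    # Hypothetical: load a Transformers-format model with torch.compile enabled
    python server.py --model Llama-3.1-8B-Instruct --torch-compile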
oobabooga | 11af199aff | Add a "Static KV cache" option for transformers | 2025-01-04 17:52:57 -08:00