Commit graph (1706 commits)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Underscore | 63234b9b6f | UI: Fix impersonate (#7025) | 2025-05-29 08:22:03 -03:00 |
| oobabooga | 75d6cfd14d | Download fetched web search results in parallel | 2025-05-28 20:36:24 -07:00 |
| oobabooga | 7080a02252 | Reduce the timeout for downloading web pages | 2025-05-28 18:15:21 -07:00 |
| oobabooga | 3eb0b77427 | Improve the web search query generation | 2025-05-28 18:14:51 -07:00 |
| oobabooga | 27641ac182 | UI: Make message editing work the same for user and assistant messages | 2025-05-28 17:23:46 -07:00 |
| oobabooga | 6c3590ba9a | Make web search attachments clickable | 2025-05-28 05:28:15 -07:00 |
| oobabooga | 077bbc6b10 | Add web search support (#7023) | 2025-05-28 04:27:28 -03:00 |
| oobabooga | 1b0e2d8750 | UI: Add a token counter to the chat tab (counts input + history) | 2025-05-27 22:36:24 -07:00 |
| oobabooga | f6ca0ee072 | Fix regenerate sometimes not creating a new message version | 2025-05-27 21:20:51 -07:00 |
| Underscore | 5028480eba | UI: Add footer buttons for editing messages (#7019). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com> | 2025-05-28 00:55:27 -03:00 |
| Underscore | 355b5f6c8b | UI: Add message version navigation (#6947). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com> | 2025-05-27 22:54:18 -03:00 |
| Underscore | 8531100109 | Fix textbox text usage in methods (#7009) | 2025-05-26 22:40:09 -03:00 |
| oobabooga | bae1aa34aa | Fix loading Llama-3_3-Nemotron-Super-49B-v1 and similar models (closes #7012) | 2025-05-25 17:19:26 -07:00 |
| oobabooga | 8620d6ffe7 | Make it possible to upload multiple text files/pdfs at once | 2025-05-20 21:34:07 -07:00 |
| oobabooga | cc8a4fdcb1 | Minor improvement to attachments prompt format | 2025-05-20 21:31:18 -07:00 |
| oobabooga | 409a48d6bd | Add attachments support (text files, PDF documents) (#7005) | 2025-05-21 00:36:20 -03:00 |
| oobabooga | 5d00574a56 | Minor UI fixes | 2025-05-20 16:20:49 -07:00 |
| oobabooga | 616ea6966d | Store previous reply versions on regenerate (#7004) | 2025-05-20 12:51:28 -03:00 |
| Daniel Dengler | c25a381540 | Add a "Branch here" footer button to chat messages (#6967) | 2025-05-20 11:07:40 -03:00 |
| oobabooga | 8e10f9894a | Add a metadata field to the chat history & add date/time to chat messages (#7003) | 2025-05-20 10:48:46 -03:00 |
| oobabooga | 9ec46b8c44 | Remove the HQQ loader (HQQ models can be loaded through Transformers) | 2025-05-19 09:23:24 -07:00 |
| oobabooga | 126b3a768f | Revert "Dynamic Chat Message UI Update Speed (#6952)" (for now). This reverts commit 8137eb8ef4. | 2025-05-18 12:38:36 -07:00 |
| oobabooga | 2faaf18f1f | Add back the "Common values" to the ctx-size slider | 2025-05-18 09:06:20 -07:00 |
| oobabooga | f1ec6c8662 | Minor label changes | 2025-05-18 09:04:51 -07:00 |
| oobabooga | 61276f6a37 | Merge remote-tracking branch 'refs/remotes/origin/dev' into dev | 2025-05-17 07:22:51 -07:00 |
| oobabooga | 4800d1d522 | More robust VRAM calculation | 2025-05-17 07:20:38 -07:00 |
| mamei16 | 052c82b664 | Fix KeyError: 'gpu_layers' when loading existing model settings (#6991) | 2025-05-17 11:19:13 -03:00 |
| oobabooga | 0f77ff9670 | UI: Use total VRAM (not free) for layers calculation when a model is loaded | 2025-05-16 19:19:22 -07:00 |
| oobabooga | c0e295dd1d | Remove the 'None' option from the model menu | 2025-05-16 17:53:20 -07:00 |
| oobabooga | e3bba510d4 | UI: Only add a blank space to streaming messages in instruct mode | 2025-05-16 17:49:17 -07:00 |
| oobabooga | 71fa046c17 | Minor changes after 1c549d176b | 2025-05-16 17:38:08 -07:00 |
| oobabooga | d99fb0a22a | Add backward compatibility with saved n_gpu_layers values | 2025-05-16 17:29:18 -07:00 |
| oobabooga | 1c549d176b | Fix GPU layers slider: honor saved settings and show true maximum | 2025-05-16 17:26:13 -07:00 |
| oobabooga | e4d3f4449d | API: Fix a regression | 2025-05-16 13:02:27 -07:00 |
| oobabooga | adb975a380 | Prevent fractional gpu-layers in the UI | 2025-05-16 12:52:43 -07:00 |
| oobabooga | fc483650b5 | Set the maximum gpu_layers value automatically when the model is loaded with --model | 2025-05-16 11:58:17 -07:00 |
| oobabooga | 38c50087fe | Prevent a crash on systems without an NVIDIA GPU | 2025-05-16 11:55:30 -07:00 |
| oobabooga | 253e85a519 | Only compute VRAM/GPU layers for llama.cpp models | 2025-05-16 10:02:30 -07:00 |
| oobabooga | 9ec9b1bf83 | Auto-adjust GPU layers after model unload to utilize freed VRAM | 2025-05-16 09:56:23 -07:00 |
| oobabooga | ee7b3028ac | Always cache GGUF metadata calls | 2025-05-16 09:12:36 -07:00 |
| oobabooga | 4925c307cf | Auto-adjust GPU layers on context size and cache type changes + many fixes | 2025-05-16 09:07:38 -07:00 |
| oobabooga | 93e1850a2c | Only show the VRAM info for llama.cpp | 2025-05-15 21:42:15 -07:00 |
| oobabooga | cbf4daf1c8 | Hide the LoRA menu in portable mode | 2025-05-15 21:21:54 -07:00 |
| oobabooga | fd61297933 | Lint | 2025-05-15 21:19:19 -07:00 |
| oobabooga | 5534d01da0 | Estimate the VRAM for GGUF models + autoset gpu-layers (#6980) | 2025-05-16 00:07:37 -03:00 |
| oobabooga | c4a715fd1e | UI: Move the LoRA menu under "Other options" | 2025-05-13 20:14:09 -07:00 |
| oobabooga | 035cd3e2a9 | UI: Hide the extension install menu in portable builds | 2025-05-13 20:09:22 -07:00 |
| oobabooga | 2826c60044 | Use logger for "Output generated in ..." messages | 2025-05-13 14:45:46 -07:00 |
| oobabooga | 3fa1a899ae | UI: Fix gpu-layers being ignored (closes #6973) | 2025-05-13 12:07:59 -07:00 |
| oobabooga | 62c774bf24 | Revert "New attempt". This reverts commit e7ac06c169. | 2025-05-13 06:42:25 -07:00 |