oobabooga | 39cbb5fee0 | Lint | 2025-04-22 08:03:25 -07:00
oobabooga | da1919baae | Update the README | 2025-04-22 08:03:22 -07:00
oobabooga | a3031795a3 | Update the zip filename | 2025-04-22 08:03:16 -07:00
oobabooga | 008c6dd682 | Lint | 2025-04-22 08:02:37 -07:00
oobabooga | ee09e44c85 | Portable version (#6868) | 2025-04-22 09:25:57 -03:00
oobabooga | 78aeabca89 | Fix the transformers loader | 2025-04-21 18:33:14 -07:00
oobabooga | 8320190184 | Fix the exllamav2_HF and exllamav3_HF loaders | 2025-04-21 18:32:23 -07:00
oobabooga | 15989c2ed8 | Make llama.cpp the default loader | 2025-04-21 16:36:35 -07:00
oobabooga | 86c3ed3218 | Small change to the unload_model() function | 2025-04-20 20:00:56 -07:00
oobabooga | c178ea02fe | Revert "Move the requirements*.txt to a requirements folder" (reverts 6117ef7d64) | 2025-04-20 19:27:38 -07:00
oobabooga | 6117ef7d64 | Move the requirements*.txt to a requirements folder | 2025-04-20 19:12:04 -07:00
oobabooga | fe8e80e04a | Merge remote-tracking branch 'refs/remotes/origin/dev' into dev | 2025-04-20 19:09:27 -07:00
oobabooga | ff1c00bdd9 | llama.cpp: set the random seed manually | 2025-04-20 19:08:44 -07:00
Matthew Jenkins | d3e7c655e5 | Add support for llama-cpp builds from https://github.com/ggml-org/llama.cpp (#6862) | 2025-04-20 23:06:24 -03:00
oobabooga | 99588be576 | Organize one_click.py | 2025-04-20 18:57:26 -07:00
oobabooga | e243424ba1 | Fix an import | 2025-04-20 17:51:28 -07:00
oobabooga | 8cfd7f976b | Revert "Remove the old --model-menu flag" (reverts 109de34e3b) | 2025-04-20 13:35:42 -07:00
oobabooga | d5e1bccef9 | Remove the SpeechRecognition requirement | 2025-04-20 11:47:28 -07:00
oobabooga | b3bf7a885d | Fix ExLlamaV2_HF and ExLlamaV3_HF after ae02ffc605 | 2025-04-20 11:32:48 -07:00
oobabooga | 9c59acf820 | Remove the numba requirement (it's no longer used) | 2025-04-20 10:02:40 -07:00
oobabooga | ae02ffc605 | Refactor the transformers loader (#6859) | 2025-04-20 13:33:47 -03:00
oobabooga | 6ba0164c70 | Lint | 2025-04-19 17:45:21 -07:00
oobabooga | 5ab069786b | llama.cpp: add back the two encode calls (they are harmless now) | 2025-04-19 17:38:36 -07:00
oobabooga | b9da5c7e3a | Use 127.0.0.1 instead of localhost for faster llama.cpp on Windows | 2025-04-19 17:36:04 -07:00
oobabooga | 9c9df2063f | llama.cpp: fix unicode decoding (closes #6856) | 2025-04-19 16:38:15 -07:00
oobabooga | ba976d1390 | llama.cpp: avoid two 'encode' calls | 2025-04-19 16:35:01 -07:00
oobabooga | ed42154c78 | Revert "llama.cpp: close the connection immediately on 'Stop'" (reverts 5fdebc554b) | 2025-04-19 05:32:36 -07:00
oobabooga | 5fdebc554b | llama.cpp: close the connection immediately on 'Stop' | 2025-04-19 04:59:24 -07:00
oobabooga | 6589ebeca8 | Revert "llama.cpp: new optimization attempt" (reverts e2e73ed22f) | 2025-04-18 21:16:21 -07:00
oobabooga | e2e73ed22f | llama.cpp: new optimization attempt | 2025-04-18 21:05:08 -07:00
oobabooga | e2e90af6cd | llama.cpp: don't include --rope-freq-base in the launch command if null | 2025-04-18 20:51:18 -07:00
oobabooga | 9f07a1f5d7 | llama.cpp: new attempt at optimizing the llama-server connection | 2025-04-18 19:30:53 -07:00
oobabooga | f727b4a2cc | llama.cpp: close the connection properly when generation is cancelled | 2025-04-18 19:01:39 -07:00
oobabooga | b3342b8dd8 | llama.cpp: optimize the llama-server connection | 2025-04-18 18:46:36 -07:00
oobabooga | 2002590536 | Revert "Attempt at making the llama-server streaming more efficient." (reverts 5ad080ff25) | 2025-04-18 18:13:54 -07:00
oobabooga | 71ae05e0a4 | llama.cpp: Fix the sampler priority handling | 2025-04-18 18:06:36 -07:00
oobabooga | 5ad080ff25 | Attempt at making the llama-server streaming more efficient. | 2025-04-18 18:04:49 -07:00
oobabooga | 4fabd729c9 | Fix the API without streaming or without 'sampler_priority' (closes #6851) | 2025-04-18 17:25:22 -07:00
oobabooga | 5135523429 | Fix the new llama.cpp loader failing to unload models | 2025-04-18 17:10:26 -07:00
oobabooga | 8d481ef9d5 | Update README | 2025-04-18 11:31:22 -07:00
oobabooga | caa6afc88b | Only show 'GENERATE_PARAMS=...' in the logits endpoint if use_logits is True | 2025-04-18 09:57:57 -07:00
oobabooga | e52f62d3ff | Update README | 2025-04-18 09:29:57 -07:00
oobabooga | 85c4486d4a | Update the colab notebook | 2025-04-18 08:53:44 -07:00
oobabooga | d00d713ace | Rename get_max_context_length to get_vocabulary_size in the new llama.cpp loader | 2025-04-18 08:14:15 -07:00
oobabooga | c1cc65e82e | Lint | 2025-04-18 08:06:51 -07:00
oobabooga | d68f0fbdf7 | Remove obsolete references to llamacpp_HF | 2025-04-18 07:46:04 -07:00
oobabooga | a0abf93425 | Connect --rope-freq-base to the new llama.cpp loader | 2025-04-18 06:53:51 -07:00
oobabooga | ef9910c767 | Fix a bug after c6901aba9f | 2025-04-18 06:51:28 -07:00
oobabooga | 1c4a2c9a71 | Make exllamav3 safer as well | 2025-04-18 06:17:58 -07:00
oobabooga | 03544d4fb6 | Bump llama.cpp and exllamav3 to the latest commits | 2025-04-18 06:14:13 -07:00