update NEWS_OR_UPDATE

PanQiWei 2023-05-27 16:37:51 +08:00
parent 0a40581270
commit ceacd59e4b
3 changed files with 3 additions and 4 deletions


@@ -16,10 +16,9 @@
 </h4>
 ## News or Update
-- 2023-05-12 - (In Progress) - `peft` + `auto-gptq` + multi-modal data = easily fine-tune LLMs to gain multi-modal instruction following ability with low resources, stay tuned!
+- 2023-05-27 - (Update) - Support quantization and inference for `gpt_bigcode`, `codegen` and `RefineWeb/RefineWebModel`(falcon) model types.
 - 2023-05-04 - (Update) - Support using faster cuda kernel when `not desc_act or group_size == -1`.
 - 2023-04-29 - (Update) - Support loading quantized model from arbitrary quantize_config and model_basename.
-- 2023-04-28 - (Update) - Support CPU offload and quantize/inference on multiple devices, support `gpt2` type models.
 *For more histories please turn to [here](docs/NEWS_OR_UPDATE.md)*


@@ -16,10 +16,9 @@
 </h4>
 ## News or Update
-- 2023-05-12 - (In Progress) - `peft` + `auto-gptq` + multi-modal data = easily fine-tune LLMs with low resources to gain multi-modal instruction-following ability, stay tuned!
+- 2023-05-27 - (Update) - Support quantization and inference for the following model types: `gpt_bigcode`, `codegen` and `RefineWeb/RefineWebModel` (falcon).
 - 2023-05-04 - (Update) - Support using a faster cuda kernel when `not desc_act or group_size == -1`.
 - 2023-04-29 - (Update) - Support loading quantized models from a specified model weights file name or quantization config (quantize_config).
-- 2023-04-28 - (Update) - Support CPU offload of weights and quantization/inference on multiple devices, support `gpt2` type models.
 *For more histories please turn to [here](docs/NEWS_OR_UPDATE.md)*


@@ -1,4 +1,5 @@
 ## <center>News or Update</center>
+- 2023-05-27 - (Update) - Support quantization and inference for `gpt_bigcode`, `codegen` and `RefineWeb/RefineWebModel`(falcon) model types.
 - 2023-05-04 - (Update) - Support using faster cuda kernel when `not desc_act or group_size == -1`
 - 2023-04-29 - (Update) - Support loading quantized model from arbitrary quantize_config and model_basename.
 - 2023-04-28 - (Update) - Support CPU offload and quantize/inference on multiple devices, support `gpt2` type models.