diff --git a/README.md b/README.md
index e6815c7..01a526c 100644
--- a/README.md
+++ b/README.md
@@ -16,10 +16,9 @@
 ## News or Update

-- 2023-05-12 - (In Progress) - `peft` + `auto-gptq` + multi-modal data = easily fine tune LLMs to gain multi-modal instruction following ability with low resources, stay tune!
+- 2023-05-27 - (Update) - Support quantization and inference for `gpt_bigcode`, `codegen` and `RefineWeb/RefineWebModel`(falcon) model types.
 - 2023-05-04 - (Update) - Support using faster cuda kernel when `not desc_act or group_size == -1`.
 - 2023-04-29 - (Update) - Support loading quantized model from arbitrary quantize_config and model_basename.
-- 2023-04-28 - (Update) - Support CPU offload and quantize/inference on multiple devices, support `gpt2` type models.

 *For more histories please turn to [here](docs/NEWS_OR_UPDATE.md)*
diff --git a/README_zh.md b/README_zh.md
index 58f2760..b3b353f 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -16,10 +16,9 @@
 ## 新闻或更新

-- 2023-05-12 - (进行中) - `peft` + `auto-gptq` + 多模态数据 = 低资源条件下轻松微调大语言模型以获得多模态指令遵循能力,敬请关注!
+- 2023-05-27 - (更新) - 支持以下模型的量化和推理: `gpt_bigcode`, `codegen` 以及 `RefineWeb/RefineWebModel`(falcon)。
 - 2023-05-04 - (更新) - 支持在 `not desc_act or group_size == -1` 的情况下使用更快的 cuda 算子。
 - 2023-04-29 - (更新) - 支持从指定的模型权重文件名或量化配置(quantize_config)加载量化过的模型。
-- 2023-04-28 - (更新) - 支持 CPU 分载权重和在多设备上执行模型量化或推理, 支持 `gpt2` 类型的模型。

 *获取更多的历史信息,请转至[这里](docs/NEWS_OR_UPDATE.md)*
diff --git a/docs/NEWS_OR_UPDATE.md b/docs/NEWS_OR_UPDATE.md
index 98ffac8..655cfb6 100644
--- a/docs/NEWS_OR_UPDATE.md
+++ b/docs/NEWS_OR_UPDATE.md
@@ -1,4 +1,5 @@
 ##