[Doc] update evaluate result (#178)

2024-04-10 10:14:52 +08:00 · e3b8f2fbcd
commit e3b8f2fbcd
parent 14b8b9cb15 0e2ae67e7e
3 changed files with 721 additions and 708 deletions
--- a/README.md
+++ b/README.md
@@ -48,7 +48,7 @@
 | :-------------------: | :------: | :---: |
 |   InternLM2_7B_chat   |  QLORA   |       |
 |   InternLM2_7B_chat   | full fine-tuning |       |
-|   InternLM2_7B_base   |  QLORA   |       |
+|   InternLM2_7B_base   |  QLORA   | [internlm2_7b_base_qlora_e10_M_1e4_32_64.py](./xtuner_config/internlm2_7b_base_qlora_e10_M_1e4_32_64.py) |
 |  InternLM2_1_8B_chat  | full fine-tuning |       |
 |  InternLM2_20B_chat   |   LORA   |       |
 |     Qwen_7b_chat      |  QLORA   |       |
@@ -105,9 +105,10 @@
 </table>
 ### 🎇Recent Updates
 - 【2024.4.2】 Uploaded [Mother-like Therapist](https://huggingface.co/brycewang2018/EmoLLM-mother/tree/main) to Huggingface
 - 【2024.3.25】 Released [Daddy-like Boy-Friend](https://aistudio.baidu.com/community/app/68787) on the Baidu PaddlePaddle platform
- 【2024.3.24】 Released the InternLM2-Base-7B QLoRA fine-tuned model on the OpenXLab and ModelScope platforms. For details, see [InternLM2-Base-7B QLoRA](./xtuner_config/README_internlm2_7b_base_qlora.md)
+- 【2024.3.24】 Released the **InternLM2-Base-7B QLoRA fine-tuned model** on the **OpenXLab** and **ModelScope** platforms. For details, see [**InternLM2-Base-7B QLoRA**](./xtuner_config/README_internlm2_7b_base_qlora.md)
 - 【2024.3.12】 Released [aiwei](https://aistudio.baidu.com/community/app/63335) on the Baidu PaddlePaddle platform
 - 【2024.3.11】 **EmoLLM V2.0 improves on EmoLLM V1.0 across the board and now surpasses Role-playing ChatGPT on psychological counseling tasks!** [Click to try EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0); updated [dataset statistics and details](./datasets/) and the [roadmap](./assets/Roadmap_ZH.png)
 - 【2024.3.9】 Added concurrency to speed up [QA pair generation](./scripts/qa_generation/) and the [RAG pipeline](./rag/)
@@ -174,11 +175,11 @@
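  - [Contents](#目录)
          - [Pre-development Configuration Requirements](#开发前的配置要求)
          - [**User Guide**](#使用指南)
-    - [Quick Start](#快速体验)
+    - [🍪Quick Start](#快速体验)
-    - [Data Construction](#数据构建)
+    - [📌Data Construction](#数据构建)
-    - [Fine-tuning Guide](#微调指南)
+    - [🎨Fine-tuning Guide](#微调指南)
-    - [Deployment Guide](#部署指南)
+    - [🔧Deployment Guide](#部署指南)
-    - [RAG (Retrieval Augmented Generation) Pipeline](#rag检索增强生成pipeline)
+    - [⚙RAG (Retrieval Augmented Generation) Pipeline](#rag检索增强生成pipeline)
    - [Frameworks Used](#使用到的框架)
      - [How to participate in this project](#如何参与本项目)
    - [Authors (in no particular order)](#作者排名不分先后)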
--- a/README_EN.md
+++ b/README_EN.md
@@ -50,7 +50,7 @@
 | :-------------------: | :--------------: | :---: |
 |   InternLM2_7B_chat   |      QLORA       |       |
 |   InternLM2_7B_chat   | full fine-tuning |       |
-|   InternLM2_7B_base   |      QLORA       |       |
+|   InternLM2_7B_base   |      QLORA       |[internlm2_7b_base_qlora_e10_M_1e4_32_64.py](./xtuner_config/internlm2_7b_base_qlora_e10_M_1e4_32_64.py)|
 |  InternLM2_1_8B_chat  | full fine-tuning |       |
 |  InternLM2_20B_chat   |       LORA       |       |
 |     Qwen_7b_chat      |      QLORA       |       |
@@ -109,7 +109,7 @@ The Model aims to fully understand and promote the mental health of individuals,
 ### Recent Updates
 - 【2024.3.25】 [Mother-like Therapist] is released on Huggingface (https://huggingface.co/brycewang2018/EmoLLM-mother/tree/main)
 - 【2024.3.25】 [Daddy-like Boy-Friend] is released on Baidu Paddle-Paddle AI Studio Platform (https://aistudio.baidu.com/community/app/68787)
- 【2024.3.24】 The InternLM2-Base-7B QLoRA fine-tuned model has been released on the OpenXLab and ModelScope platforms. For more details, please refer to [InternLM2-Base-7B QLoRA](./xtuner_config/README_internlm2_7b_base_qlora.md).
+- 【2024.3.24】 The **InternLM2-Base-7B QLoRA fine-tuned model** has been released on the **OpenXLab** and **ModelScope** platforms. For more details, please refer to [**InternLM2-Base-7B QLoRA**](./xtuner_config/README_internlm2_7b_base_qlora.md).
 - 【2024.3.12】 [aiwei] is released on Baidu Paddle-Paddle AI Studio Platform (https://aistudio.baidu.com/community/app/63335)
 - 【2024.3.11】 **EmoLLM V2.0 is greatly improved in all scores compared to EmoLLM V1.0. Surpasses the performance of Role-playing ChatGPT on counseling tasks!** [Click to experience EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0), update [dataset statistics and details](./datasets/), [Roadmap](./assets/Roadmap_ZH.png)
 - 【2024.3.9】 Add concurrency acceleration [QA pair generation](./scripts/qa_generation/), [RAG pipeline](./rag/)
@@ -145,7 +145,7 @@ The Model aims to fully understand and promote the mental health of individuals,
 ### Honors
- The project won ***the Innovation and Creativity Award*** in the **2024 Puyuan Large Model Series Challenge Spring Competition held by the Shanghai Artificial Intelligence Laboratory**
+- The project won the ***Innovation and Creativity Award*** in the **2024 Puyuan Large Model Series Challenge Spring Competition held by the Shanghai Artificial Intelligence Laboratory**
 <p align="center">
   <a href="https://github.com/SmartFlowAI/EmoLLM/">
@@ -171,11 +171,11 @@ The Model aims to fully understand and promote the mental health of individuals,
  - [Contents](#contents)
          - [Pre-development Configuration Requirements.](#pre-development-configuration-requirements)
          - [**User Guide**](#user-guide)
-    - [File Directory Explanation](#file-directory-explanation)
+    - [🍪Quick start](#quick-start)
-    - [Data Construction](#data-construction)
+    - [📌Data Construction](#data-construction)
-    - [Fine-tuning Guide](#fine-tuning-guide)
+    - [🎨Fine-tuning Guide](#fine-tuning-guide)
-    - [Deployment Guide](#deployment-guide)
+    - [🔧Deployment Guide](#deployment-guide)
-    - [RAG (Retrieval Augmented Generation) Pipeline](#rag-retrieval-augmented-generation-pipeline)
+    - [⚙RAG (Retrieval Augmented Generation) Pipeline](#rag-retrieval-augmented-generation-pipeline)
    - [Frameworks Used](#frameworks-used)
      - [How to participate in this project](#how-to-participate-in-this-project)
    - [Version control](#version-control)
--- a/xtuner_config/README_internlm2_7b_base_qlora.md
+++ b/xtuner_config/README_internlm2_7b_base_qlora.md
@@ -2,25 +2,37 @@
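 ## Base Model and Configuration Files
- This project builds on the fine-tuning [guide](./README.md) for the [**internlm2_7b_chat_qlora_e3** model](./internlm2_7b_chat_qlora_e3.py) and adds fine-tuning of the [**internlm2_7b_base_qlora_e3 (configuration file)**](./internlm2_7b_base_qlora_e10_M_1e4_32_64.py) **model**.
+- Building on the [**internlm2_7b_chat_qlora_e3** model configuration file](./internlm2_7b_chat_qlora_e3.py) provided by the XTuner project and on the [EmoLLM fine-tuning guide](./README.md), this project adds QLoRA fine-tuning of the **InternLM2_7B_base model** on the [EmoLLM general dataset](../datasets/README.md). For the configuration file, see [**internlm2_7b_base_qlora_e10_M_1e4_32_64.py**](./internlm2_7b_base_qlora_e10_M_1e4_32_64.py).
 - So that users can reproduce the results and fine-tune on their own hardware, EmoLLM also provides other configuration files for different hardware setups.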
  - [internlm2_7b_base_qlora_e10_b8_16_32.py](./internlm2_7b_base_qlora_e10_b8_16_32.py)
  - [internlm2_7b_base_qlora_e3_M_1e4_32_64.py](./internlm2_7b_base_qlora_e3_M_1e4_32_64.py)
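 ## Model Release and Number of Training Epochs
- Because a merged dataset is used, we trained the selected internlm2_7b_base model for **10 epochs**. Readers can stop training and select a checkpoint based on the training output and the change in loss, or evaluate the models with more rigorous evaluation methods.
+- Because a merged dataset is used, we trained the selected InternLM2_7B_base model for **10 epochs**. Readers can stop training and select a checkpoint based on the training output and the change in loss, or evaluate the models with more rigorous evaluation methods.
- When releasing the internlm2_7b_base_qlora fine-tuned model, we also provided two different weight versions on OpenXLab and ModelScope for users to use and test. More professional evaluation results will be added soon.
+- When releasing the InternLM2_7B_base QLoRA fine-tuned model, we also provided two different weight versions on OpenXLab and ModelScope for users to use and test. More professional evaluation results will be added soon.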
- **OpenXLab**:
+  - **OpenXLab**:
-  - [5-epoch model](https://openxlab.org.cn/models/detail/chg0901/EmoLLM-InternLM7B-base)
+    - [5-epoch model](https://openxlab.org.cn/models/detail/chg0901/EmoLLM-InternLM7B-base)
-  - [10-epoch model](https://openxlab.org.cn/models/detail/chg0901/EmoLLM-InternLM7B-base-10e)
+    - [10-epoch model](https://openxlab.org.cn/models/detail/chg0901/EmoLLM-InternLM7B-base-10e)
  - **ModelScope**:
    - [5-epoch model](https://www.modelscope.cn/models/chg0901/EmoLLM-InternLM7B-base/files)
    - [10-epoch model](https://www.modelscope.cn/models/chg0901/EmoLLM-InternLM7B-base-10e/files)
- **ModelScope**:
+- The EmoLLM team has evaluated the QLoRA fine-tuned InternLM2_7B_base models (both the 5-epoch and the 10-epoch model) on **general metrics**; the results are shown in the table below. The 10-epoch QLoRA fine-tuned InternLM2_7B_base model scores higher than the other models on these general metrics. Evaluation results on professional psychological counseling metrics will be added soon. For more evaluation details, see the [general evaluation results page (General_evaluation.md)](../evaluate/General_evaluation.md) and the [evaluation directory README](../evaluate/README.md).
-  - [5-epoch model](https://www.modelscope.cn/models/chg0901/EmoLLM-InternLM7B-base/files)
+
-  - [10-epoch model](https://www.modelscope.cn/models/chg0901/EmoLLM-InternLM7B-base-10e/files)
+| Model    | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1  | BLEU-2  | BLEU-3  | BLEU-4  |
 |----------|---------|---------|---------|---------|---------|---------|---------|
 | Qwen1_5-0_5B-chat | 27.23%  | 8.55%   | 17.05%  | 26.65%  | 13.11%  | 7.19%   | 4.05%   |
 | InternLM2_7B_chat_qlora | 37.86%  | 15.23%   | 24.34%  | 39.71%  | 22.66%  | 14.26%   | 9.21%   |
 | InternLM2_7B_chat_full  | 32.45%  | 10.82%   | 20.17%  | 30.48%  | 15.67%  | 8.84%   | 5.02%   |
 | InternLM2_7B_base_qlora_5epoch  | 41.94%  | 20.21%   | 29.67%  | 42.98%  | 27.07%  | 19.33%   | 14.62%   |
 | **InternLM2_7B_base_qlora_10epoch** | **43.47%** | **22.06%**   | **31.4%**  | **44.81%**  | **29.15%**  | **21.44%**   | **16.72%**   |
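
For readers unfamiliar with the ROUGE and BLEU columns in the table above, the following is a minimal sketch of how n-gram overlap scores of this kind are commonly computed. It is not the EmoLLM evaluation script: the function names are hypothetical, tokenization is assumed to be done beforehand, and whether the reported BLEU-n values are individual or cumulative (or include a brevity penalty) is not stated here, so only the plain modified n-gram precision is shown.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def overlap(candidate, reference, n):
    """Clipped n-gram overlap count between candidate and reference."""
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    return sum((cand & ref).values())


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1: harmonic mean of n-gram precision and recall."""
    cand_total = len(ngrams(candidate, n))
    ref_total = len(ngrams(reference, n))
    hits = overlap(candidate, reference, n)
    if cand_total == 0 or ref_total == 0 or hits == 0:
        return 0.0
    precision, recall = hits / cand_total, hits / ref_total
    return 2 * precision * recall / (precision + recall)


def bleu_n_precision(candidate, reference, n=1):
    """Modified n-gram precision used inside BLEU (no brevity penalty)."""
    cand_total = len(ngrams(candidate, n))
    return overlap(candidate, reference, n) / cand_total if cand_total else 0.0


# Hypothetical, pre-tokenized example sentences (not from the EmoLLM dataset).
candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
print(f"ROUGE-1 F1: {rouge_n_f1(candidate, reference, 1):.2%}")
print(f"BLEU-2 precision: {bleu_n_precision(candidate, reference, 2):.2%}")
```

Higher values mean the generated reply shares more n-grams with the reference reply; the scores say nothing about counseling quality by themselves, which is why domain-specific evaluation is treated separately.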
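 ### Hyperparameter Settings
-For details of the training config, see [**internlm2_7b_base_qlora_e3 (configuration file)**](./internlm2_7b_base_qlora_e10_M_1e4_32_64.py). Only the key hyperparameters and the ones we adjusted are listed here.
+For details of the training config, see [**`internlm2_7b_base_qlora_e10_M_1e4_32_64.py` (configuration file)**](./internlm2_7b_base_qlora_e10_M_1e4_32_64.py). Only the key hyperparameters and the ones we adjusted are listed here.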
 ```python
 prompt_template = PROMPT_TEMPLATE.internlm2_chat