Merge pull request #61 from MING-ZCH/main

[DOC] Update EmoLLM V2.0’s evaluation details and fix some bugs in docs
2024-03-11 19:16:29 +08:00
commit 77e3531169
parent 10d4166165 50dac8b8f6
5 changed files with 45 additions and 37 deletions
--- a/README.md
+++ b/README.md
@@ -64,8 +64,9 @@
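 - Assessment and diagnostic tools: Effectively promoting mental health requires scientific tools to assess an individual's psychological state and diagnose potential psychological issues.

 ### Recent Updates
+- 【2024.3.11】 **EmoLLM V2.0 outscores EmoLLM V1.0 on Comprehensiveness, Professionalism, and Authenticity, with Safety unchanged at 1.00, and surpasses Role-playing ChatGPT on psychological counseling tasks!**
 - 【2024.3.9】 Added a concurrency feature to speed up QA pair generation
-- 【2024.3.3】 [Full fine-tuned version based on InternLM2-7B-chat open-sourced](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full), requires two A100*80G GPUs; updated professional evaluation, see [evaluate](./evaluate/); updated the PaddleOCR-based PDF-to-txt tool script, see [scripts](./scripts/)
+- 【2024.3.3】 [EmoLLM V2.0, the full fine-tuned version based on InternLM2-7B-chat, open-sourced](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full), requires two A100*80G GPUs; updated professional evaluation, see [evaluate](./evaluate/); updated the PaddleOCR-based PDF-to-txt tool script, see [scripts](./scripts/)
 - 【2024.2.29】 Updated objective evaluation calculations, see [evaluate](./evaluate/); updated a series of datasets, see [datasets](./datasets/).
 - 【2024.2.27】 Updated the English README and a series of datasets (licking dogs and single-turn dialogue)
 - 【2024.2.23】 Launched `Gentle Lady Psychologist Ai Wei`, based on InternLM2_7B_chat_qlora: [click here to obtain the model weights](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_aiwei), [configuration file](xtuner_config/aiwei-internlm2_chat_7b_qlora.py), [online experience link](https://openxlab.org.cn/apps/detail/ajupyter/EmoLLM-aiwei)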
@@ -89,7 +90,7 @@

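 - 【2024.2.3】 [Project promotional video](https://www.bilibili.com/video/BV1N7421N76X/) completed 😊
 - 【2024.1.27】 Improved the data construction documentation, fine-tuning guide, deployment guide, Readme, and other related documents 👏
-- 【2024.1.25】 Completed the first version of EmoLLM and deployed it online https://openxlab.org.cn/apps/detail/jujimeizuo/EmoLLM 😀
+- 【2024.1.25】 EmoLLM V1.0 has been deployed online https://openxlab.org.cn/apps/detail/jujimeizuo/EmoLLM 😀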

 </details>

--- a/README_EN.md
+++ b/README_EN.md
@@ -1,4 +1,4 @@
-# EmoLLM - Large Languge Model for Mental Health
+# EmoLLM - Large Language Model for Mental Health

 <!-- PROJECT SHIELDS -->
 [![Contributors][contributors-shield]][contributors-url]
@@ -35,15 +35,15 @@
 <!-- 本篇README.md面向开发者 -->


-**EmoLLM** is a series of large language models designed to understand, support and help customers in mental health counseling. It is fine-tuned from the LLM instructions. We really appreciate it if you can give it a star~⭐⭐. The open-sourced configuration is as follows:
+**EmoLLM** is a series of large language models designed to understand, support and help customers in mental health counseling. The models are obtained by instruction fine-tuning existing LLMs. We would really appreciate it if you could give the project a star~⭐⭐. The open-sourced configurations are as follows:

 |         model          |   type   |
 | :-------------------: | :------: |
 |   InternLM2_7B_chat   |  qlora   |
-|  InternLM2_7B_chat  | full finetuning |
-|  InternLM2_1_8B_chat  | full finetuning |
+|    InternLM2_7B_chat  | full fine-tuning |
+|  InternLM2_1_8B_chat  | full fine-tuning |
 |     Qwen_7b_chat      |  qlora   |
-|   Qwen1_5-0_5B-Chat   | full finetuning |
+|   Qwen1_5-0_5B-Chat   | full fine-tuning |
 |  Baichuan2_13B_chat   |  qlora   |
 |      ChatGLM3_6B      |   lora   |
 | DeepSeek MoE_16B_chat |  qlora   |
@@ -52,7 +52,7 @@
 Everyone is welcome to contribute to this project ~
 ---

-The Model is aimed at fully understanding and promoting the mental health of individuals, groups, and society. This model typically includes the following key components:
+The Model aims to fully understand and promote the mental health of individuals, groups, and society. This model typically includes the following key components:

 -  Cognitive factors: Involving an individual's thought patterns, belief systems, cognitive biases, and problem-solving abilities. Cognitive factors significantly impact mental health as they affect how individuals interpret and respond to life events.
 - Emotional factors: Including emotion regulation, emotional expression, and emotional experiences. Emotional health is a crucial part of mental health, involving how individuals manage and express their emotions and how they recover from negative emotions.
@@ -63,8 +63,9 @@ The Model is aimed at fully understanding and promoting the mental health of ind
 - Prevention and intervention measures: The Mental Health Grand Model also includes strategies for preventing psychological issues and promoting mental health, such as psychological education, counseling, therapy, and social support systems.
 - Assessment and diagnostic tools: Effective promotion of mental health requires scientific tools to assess individuals' psychological states and diagnose potential psychological issues.
 ### Recent Updates
-- 【2024.3.9】New concurrency feature speeds up QA pair generation
-- 【2024.3.3】 [Based on InternLM2-7B-chat full amount of fine-tuned version of open source](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full), need two A100*80G, update professional evaluation, see [evaluate](./evaluate/), update PaddleOCR-based PDF to txt tool scripts, see [scripts](./scripts/).
+- 【2024.3.11】 **EmoLLM V2.0 outscores EmoLLM V1.0 on Comprehensiveness, Professionalism, and Authenticity, with Safety unchanged at 1.00, and surpasses Role-playing ChatGPT on counseling tasks!**
+- 【2024.3.9】 New concurrency feature speeds up QA pair generation
+- 【2024.3.3】 [EmoLLM V2.0, the full fine-tuned version based on InternLM2-7B-chat, is open-sourced](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full); it needs two A100*80G GPUs. Updated the professional evaluation, see [evaluate](./evaluate/), and the PaddleOCR-based PDF-to-txt tool scripts, see [scripts](./scripts/).
 - 【2024.2.29】 Updated objective assessment calculations, see [evaluate](./evaluate/) for details. A series of datasets have also been updated, see [datasets](./datasets/) for details.
 - 【2024.2.27】 Updated English README and a series of datasets (licking dogs and one-round dialogue)
 - 【2024.2.23】The "Gentle Lady Psychologist Ai Wei" based on InternLM2_7B_chat_qlora was launched. [Click here to obtain the model weights](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_aiwei), [configuration file](xtuner_config/aiwei-internlm2_chat_7b_qlora.py), [online experience link](https://openxlab.org.cn/apps/detail/ajupyter/EmoLLM-aiwei)
@@ -91,7 +92,7 @@ The Model is aimed at fully understanding and promoting the mental health of ind

 - 【2024.2.3】 [Project Video](https://www.bilibili.com/video/BV1N7421N76X/) on bilibili 😊
 - 【2024.1.27】 Complete data construction documentation, fine-tuning guide, deployment guide, Readme, and other related documents 👏
-- 【2024.1.25】 Complete the first version of EmoLLM and deploy it online https://openxlab.org.cn/apps/detail/jujimeizuo/EmoLLM 😀
+- 【2024.1.25】 EmoLLM V1.0 has been deployed online https://openxlab.org.cn/apps/detail/jujimeizuo/EmoLLM 😀

 </details>

@@ -104,7 +105,7 @@ The Model is aimed at fully understanding and promoting the mental health of ind

 ## Contents

-- [EmoLLM - Large Languge Model for Mental Health](#emollm---large-languge-model-for-mental-health)
+- [EmoLLM - Large Language Model for Mental Health](#emollm---large-language-model-for-mental-health)
  - [Everyone is welcome to contribute to this project ~](#everyone-is-welcome-to-contribute-to-this-project-)
    - [Recent Updates](#recent-updates)
  - [Contents](#contents)
@@ -147,12 +148,12 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
 ### File Directory Explanation

 ```
-├─assets：Image Resources
-├─datasets：Dataset
-├─demo：demo scripts
-├─generate_data：Data Generation Guide
+├─assets: Image Resources
+├─datasets: Dataset
+├─demo: demo scripts
+├─generate_data: Data Generation Guide
 │  └─xinghuo
-├─scripts：Some Available Tools
+├─scripts: Some Available Tools
 └─xtuner_config: Fine-tuning Guide
    └─images
 ```
@@ -193,7 +194,7 @@ Contributions make the open-source community an excellent place for learning, in

 ### Version control

-This project uses Git for version control. You can see the current available versions in the repository.
+This project uses Git for version control. You can see the currently available versions in the repository.

 </details>

@@ -209,7 +210,7 @@ This project uses Git for version control. You can see the current available ver

 [ZhouXinAo](https://github.com/zxazys)@Master's student at Nankai University

-[MING_X](https://github.com/MING-ZCH) @Undergraduate at Huazhong University of Science and Technology
+[MING_X](https://github.com/MING-ZCH) @Undergraduate student at Huazhong University of Science and Technology

 [Z_L](https://github.com/JasonLLLLLLLLLLL)@swufe

--- a/evaluate/Professional_evaluation.md
+++ b/evaluate/Professional_evaluation.md
@@ -14,19 +14,21 @@

 ## Evaluation Results

-* Evaluated model: [EmoLLM V1.0](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model)(InternLM2_7B_chat_qlora)
+* Evaluated models:
+  * [EmoLLM V1.0](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model) (InternLM2_7B_chat_qlora)
+  * [EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0) (InternLM2_7B_chat_full)
+   
 * Scores:

-|       Metric      |    Value   |
-|-------------------|------------|
-| Comprehensiveness | 1.32       |
-| Professionalism   | 2.20       |
-| Authenticity      | 2.10       |
-| Safety            | 1.00       |
+|       Model       |    Comprehensiveness  |   Professionalism  |  Authenticity   | Safety  |
+|-------------------|-----------------------|-------------------|-----------------|---------|
+| InternLM2_7B_chat_qlora |      1.32       |        2.20       |      2.10       | 1.00    |
+| InternLM2_7B_chat_full  |      1.40       |        2.45       |      2.24       | 1.00    |

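 ## Comparison

-* [EmoLLM V1.0](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model) improves considerably on InternLM2_7B_Chat; its performance on psychological counseling tasks is close to that of Role-playing ChatGPT
+* EmoLLM V2.0 outscores EmoLLM V1.0 on Comprehensiveness, Professionalism, and Authenticity, with Safety unchanged at 1.00, and surpasses Role-playing ChatGPT on psychological counseling tasks!
+* EmoLLM V1.0 improves considerably on InternLM2_7B_Chat; its performance on psychological counseling tasks is close to that of Role-playing ChatGPT

 * The comparison figure comes from the paper 《CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling》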
 ![image](https://github.com/MING-ZCH/EmoLLM/assets/119648793/abc9f626-11bc-4ec8-84a4-427c4600a720)
--- a/evaluate/Professional_evaluation_EN.md
+++ b/evaluate/Professional_evaluation_EN.md
@@ -14,19 +14,21 @@ The evaluation method, metric, and dataset from the paper《CPsyCoun: A Report-b

 ## Result

-* Model: [EmoLLM V1.0](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model)(InternLM2_7B_chat_qlora)
+* Model:
+  * [EmoLLM V1.0](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model) (InternLM2_7B_chat_qlora)
+  * [EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0) (InternLM2_7B_chat_full)
+ 
 * Score:
-
-|       Metric      |    Value   |
-|-------------------|------------|
-| Comprehensiveness | 1.32       |
-| Professionalism   | 2.20       |
-| Authenticity      | 2.10       |
-| Safety            | 1.00       |
+  
+|       Model       |    Comprehensiveness  |  Professionalism  |  Authenticity   | Safety  |
+|-------------------|-----------------------|-------------------|-----------------|---------|
+| InternLM2_7B_chat_qlora |      1.32       |        2.20       |      2.10       | 1.00    |
+| InternLM2_7B_chat_full  |      1.40       |        2.45       |      2.24       | 1.00    |

 ## Comparison

-* [EmoLLM V1.0](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model) is greatly improved on InternLM2_7B_Chat; Performance on the counseling task was similar compared to ChatGPT(Role-playing)
+* EmoLLM V2.0 outscores EmoLLM V1.0 on Comprehensiveness, Professionalism, and Authenticity, with Safety unchanged at 1.00, and surpasses the performance of Role-playing ChatGPT on counseling tasks!
+* EmoLLM V1.0 improves greatly on InternLM2_7B_Chat; its performance on counseling tasks is similar to that of Role-playing ChatGPT

 * The comparison results are from the paper《CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling》
 ![image](https://github.com/MING-ZCH/EmoLLM/assets/119648793/abc9f626-11bc-4ec8-84a4-427c4600a720)
--- a/evaluate/README.md
+++ b/evaluate/README.md
@@ -9,11 +9,13 @@
 | Qwen1_5-0_5B-chat | 27.23%  | 8.55%   | 17.05%  | 26.65%  | 13.11%  | 7.19%   | 4.05%   |
 | InternLM2_7B_chat_qlora  | 37.86%  | 15.23%   | 24.34%  | 39.71%  | 22.66%  | 14.26%   | 9.21%   |
 | InternLM2_7B_chat_full  | 32.45%  | 10.82%   | 20.17%  | 30.48%  | 15.67%  | 8.84%   | 5.02%   |
+
 ## Professional Metric Evaluation

 * For the specific evaluation metrics and methods, see [Professional_evaluation.md](./Professional_evaluation.md)

-|       Model       |    Comprehensiveness  |   rofessionalism  |  Authenticity   | Safety  |
+|       Model       |    Comprehensiveness  |   Professionalism  |  Authenticity   | Safety  |
 |-------------------|-----------------------|-------------------|-----------------|---------|
 | InternLM2_7B_chat_qlora |      1.32       |        2.20       |      2.10       | 1.00    |
+| InternLM2_7B_chat_full  |      1.40       |        2.45       |      2.24       | 1.00    |
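
The comparison between the qlora and full fine-tuned models can be checked directly against the professional-evaluation rows in this diff. The following is a minimal sketch in Python, assuming only the four scores shown in the table above; the `SCORES` dictionary and the `score_deltas` helper are hypothetical names chosen for illustration and are not part of the EmoLLM repository.

```python
# Scores copied from the professional-evaluation table in this diff.
# The structure and names below are illustrative assumptions, not EmoLLM code.
SCORES = {
    "InternLM2_7B_chat_qlora": {
        "Comprehensiveness": 1.32,
        "Professionalism": 2.20,
        "Authenticity": 2.10,
        "Safety": 1.00,
    },
    "InternLM2_7B_chat_full": {
        "Comprehensiveness": 1.40,
        "Professionalism": 2.45,
        "Authenticity": 2.24,
        "Safety": 1.00,
    },
}


def score_deltas(baseline: dict, candidate: dict) -> dict:
    """Return candidate minus baseline for every metric, rounded to 2 decimals."""
    return {
        metric: round(candidate[metric] - baseline[metric], 2)
        for metric in baseline
    }


deltas = score_deltas(
    SCORES["InternLM2_7B_chat_qlora"],
    SCORES["InternLM2_7B_chat_full"],
)

# Metrics on which the full fine-tuned model scores strictly higher.
improved_metrics = [metric for metric, delta in deltas.items() if delta > 0]

for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.2f}")
```

Run as a script, this prints one signed difference per metric. With the table's values, three metrics increase and Safety stays at 1.00 for both models, so its difference is zero.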