[DOC] update readme

aJupyter 2024-03-12 23:55:03 +08:00
parent 77e3531169
commit ec374cc990
2 changed files with 137 additions and 127 deletions

148
README.md

@ -1,4 +1,15 @@
# EmoLLM - Large Language Model for Mental Health
<div align="center">
# EmoLLM - Large Language Model for Mental Health
</div>
<p align="center">
<a href="https://github.com/aJupyter/EmoLLM/">
<img src="assets/logo.jpeg" alt="Logo" width="30%">
</a>
<div align="center">
<!-- PROJECT SHIELDS -->
[![Contributors][contributors-shield]][contributors-url]
@ -6,13 +17,8 @@
[![Issues][issues-shield]][issues-url]
[![MIT License][license-shield]][license-url]
[![Stargazers][stars-shield]][stars-url]
<br />
<!-- PROJECT LOGO -->
<p align="center">
<a href="https://github.com/aJupyter/EmoLLM/">
<img src="assets/logo.jpeg" alt="Logo" width="30%">
</a>
</div>
<h3 align="center">EmoLLM</h3>
@ -34,20 +40,26 @@
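<!-- This README.md is intended for developers -->
**EmoLLM** is a series of mental health large language models that support the counseling chain of **understanding users, supporting users, and helping users**, obtained by instruction fine-tuning of `LLM`s. Stars are welcome~⭐⭐. The `LLM`fine-tuning configurations open-sourced so far are as follows:
**EmoLLM** is a series of mental health large language models that support the counseling chain of **understanding users, supporting users, and helping users**, obtained by instruction fine-tuning of `LLM`s. Stars are welcome~⭐⭐. The `LLM` fine-tuning configurations open-sourced so far are as follows: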
<div align="center">
| Model | Type |
| :-------------------: | :------: |
| InternLM2_7B_chat | qlora |
| InternLM2_7B_chat | full fine-tuning |
| InternLM2_1_8B_chat | full fine-tuning |
| Qwen_7b_chat | qlora |
| Qwen1_5-0_5B-Chat | full fine-tuning |
| Baichuan2_13B_chat | qlora |
| ChatGLM3_6B | lora |
| DeepSeek MoE_16B_chat | qlora |
| Mixtral 8x7B_instruct | qlora |
| InternLM2_7B_chat | QLORA |
| InternLM2_7B_chat | full fine-tuning |
| InternLM2_1_8B_chat | full fine-tuning |
| InternLM2_20B_chat | LORA |
| Qwen_7b_chat | QLORA |
| Qwen1_5-0_5B-Chat | full fine-tuning |
| Baichuan2_13B_chat | QLORA |
| ChatGLM3_6B | LORA |
| DeepSeek MoE_16B_chat | QLORA |
| Mixtral 8x7B_instruct | QLORA |
| …… | …… |
</div>
Everyone is welcome to contribute to this project ~
---
@ -63,11 +75,12 @@
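- Prevention and intervention measures: The mental health large model also includes strategies for preventing psychological issues and promoting mental health, such as psychological education, counseling, therapy, and social support systems.
- Assessment and diagnostic tools: Effective promotion of mental health requires scientific tools to assess individuals' psychological states and diagnose potential psychological issues.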
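### Recent Updates
- 【2024.3.11】 **EmoLLM V2.0 improves on EmoLLM V1.0 across the board and surpasses Role-playing ChatGPT on counseling tasks!**
- 【2024.3.9】 Added concurrency to speed up QA pair generation
### 🎇Recent Updates
- 【2024.3.12】 Released [Aiwei](https://aistudio.baidu.com/community/app/63335) on the Baidu PaddlePaddle (飞桨) platform
- 【2024.3.11】 **EmoLLM V2.0 improves on EmoLLM V1.0 across the board and surpasses Role-playing ChatGPT on counseling tasks!** [Try EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0); updated [dataset statistics and details](./datasets/) and the [roadmap](./assets/Roadmap_ZH.png)
- 【2024.3.9】 Added concurrency to speed up [QA pair generation](./scripts/qa_generation/), [RAG pipeline](./rag/)
- 【2024.3.3】 [Open-sourced EmoLLM V2.0, a full fine-tune of InternLM2-7B-chat](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full) (requires two A100*80G); updated professional evaluation, see [evaluate](./evaluate/); updated the PaddleOCR-based PDF-to-txt tool script, see [scripts](./scripts/)
- 【2024.2.29】 Updated objective evaluation calculations, see [evaluate](./evaluate/); updated a series of datasets, see [datasets](./datasets/)
- 【2024.2.29】 Updated objective evaluation calculations, see [evaluate](./evaluate/); updated a series of datasets, see [datasets](./datasets/)
- 【2024.2.27】 Updated the English README and a series of datasets (simp-style (舔狗) dialogue and single-turn dialogue)
- 【2024.2.23】 Released `Gentle Lady Psychologist Aiwei` (温柔御姐心理医生艾薇), based on InternLM2_7B_chat_qlora: [model weights](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_aiwei), [config file](xtuner_config/aiwei-internlm2_chat_7b_qlora.py), [online demo](https://openxlab.org.cn/apps/detail/ajupyter/EmoLLM-aiwei)
- 【2024.2.23】 Updated [several fine-tuning configs](/xtuner_config/); added [data_pro.json](/datasets/data_pro.json) (larger, covering more scenarios, richer) and [aiwei.json](/datasets/aiwei.json) (dedicated to the gentle-lady role-play, with emoji); `Gentle Lady Psychologist Aiwei` coming soon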
@ -94,7 +107,7 @@
</details>
### Roadmap
### 🎯Roadmap
<p align="center">
<a href="https://github.com/aJupyter/EmoLLM/">
@ -106,14 +119,14 @@
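- [EmoLLM - Large Language Model for Mental Health](#emollm-心理健康大模型)
- [Recent Updates](#最近更新)
- [Contents](#目录)
- [Pre-development Configuration Requirements](#开发前的配置要求)
- [**User Guide**](#使用指南)
- [File Directory Explanation](#文件目录说明)
- [Pre-development Configuration Requirements](#开发前的配置要求)
- [**User Guide**](#使用指南)
- [Data Construction](#数据构建)
- [Fine-tuning Guide](#微调指南)
- [Deployment Guide](#部署指南)
- [RAG](#rag检索增强生成pipeline)
- [Frameworks Used](#使用到的框架)
- [How to participate in this project](#如何参与本项目)
- [How to participate in this project](#如何参与本项目)
- [Version control](#版本控制)
- [Authors (in no particular order)](#作者排名不分先后)
- [Copyright Notice](#版权说明)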
@ -134,31 +147,19 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
```
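2. Read the following sections in order, or pick the ones that interest you:
- [File Directory Explanation](#文件目录说明)
- [Data Construction](#数据构建)
- [Fine-tuning Guide](#微调指南)
- [Deployment Guide](#部署指南)
- [RAG](#rag检索增强生成pipeline)
- See more details
### File Directory Explanation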
```
├─assets: image resources
├─datasets: datasets
├─demo: demo scripts
├─generate_data: data generation guide
│ └─xinghuo
├─scripts: some available tools
└─xtuner_config: fine-tuning guide
└─images
```
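### Data Construction
Please read the [Data Construction Guide](generate_data/tutorial.md) for reference.
- Please read the [Data Construction Guide](generate_data/tutorial.md) for reference.
The dataset used for this fine-tuning can be found at [datasets](datasets/data.json)
- The dataset used for fine-tuning can be found at [datasets](datasets/data.json)
### Fine-tuning Guide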
@ -166,16 +167,23 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
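### Deployment Guide
For details, see the [deployment guide](demo/README.md)
- Demo deployment: see the [deployment guide](demo/README.md)
- Quantized deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy.md)
### RAG (Retrieval-Augmented Generation) Pipeline
- See [RAG](./rag/)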
<details>
<summary>More details</summary>
### Frameworks Used
- [Xtuner](https://github.com/InternLM/xtuner)
- [Xtuner](https://github.com/InternLM/xtuner): for fine-tuning
- [Transformers](https://github.com/huggingface/transformers)
- [Pytorch](https://pytorch.org/)
- [LMDeploy](https://github.com/InternLM/lmdeploy/): for quantized deployment
- [Streamlit](https://streamlit.io/): for building the demo
- [DeepSpeed](https://github.com/microsoft/DeepSpeed): for parallel training
- …
#### How to participate in this project
@ -188,45 +196,31 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
### Version control
This project uses Git for version control. You can see the currently available versions in the repository.
</details>
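### Authors (in no particular order)
[aJupyter](https://github.com/aJupyter)@Datawhale member, Master's student at Nankai University
[jujimeizuo](https://github.com/jujimeizuo)@Master's student at Jiangnan University
[Smiling&amp;Weeping](https://github.com/Smiling-Weeping-zhr)@Undergraduate student at Harbin Institute of Technology (Weihai)
[Farewell](https://github.com/8baby8)@PaddlePaddle Pilot Team Regional Director, Wenxin Large Model core developer
[ZhouXinAo](https://github.com/zxazys)@Master's student at Nankai University
[MING_X](https://github.com/MING-ZCH)@Undergraduate student at Huazhong University of Science and Technology
[Z_L](https://github.com/JasonLLLLLLLLLLL)@swufe
[MrCatAI](https://github.com/MrCatAI)@AI Mover
[ZeyuBa](https://github.com/ZeyuBa)@Master's student at Institute of Automation
[aiyinyuedejustin](https://github.com/aiyinyuedejustin)@Master's student at University of Pennsylvania
[Nobody-ML](https://github.com/Nobody-ML)@Undergraduate student at China University of Petroleum (East China)
[chg0901](https://github.com/chg0901)@PhD student at Kwangwoon University, South Korea
[Mxoder](https://github.com/Mxoder)@Undergraduate student at Beihang University
[Anooyman](https://github.com/Anooyman) @Master's, Nanjing University of Science and Technology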
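| Username | School/Organization | Remarks | Contributions |
| :----------: | :--------------------: | :-------------------: | :----------: |
| [aJupyter](https://github.com/aJupyter) | Nankai University, Master's student | DataWhale member | Project initiator |
| [jujimeizuo](https://github.com/jujimeizuo) | Jiangnan University, Master's student | | |
| [Smiling-Weeping-zhr](https://github.com/Smiling-Weeping-zhr) | Harbin Institute of Technology (Weihai), Undergraduate student | | |
| [8baby8](https://github.com/8baby8) | PaddlePaddle Pilot Team Regional Director | Wenxin Large Model core developer | |
| [zxazys](https://github.com/zxazys) | Nankai University, Master's student | | |
| [MING-ZCH](https://github.com/MING-ZCH) | Huazhong University of Science and Technology, Undergraduate student | | |
| [JasonLLLLLLLLLLL](https://github.com/JasonLLLLLLLLLLL) | swufe | | |
| [MrCatAI](https://github.com/MrCatAI) | AI Mover | | |
| [ZeyuBa](https://github.com/ZeyuBa) | Institute of Automation, Master's student | | |
| [aiyinyuedejustin](https://github.com/aiyinyuedejustin) | University of Pennsylvania, Master's student | | |
| [Nobody-ML](https://github.com/Nobody-ML) | China University of Petroleum (East China), Undergraduate student | | |
| [chg0901](https://github.com/chg0901) | Kwangwoon University (South Korea), Doctoral student | | |
| [Mxoder](https://github.com/Mxoder) | Beihang University, Undergraduate student | | |
| [Anooyman](https://github.com/Anooyman) | Nanjing University of Science and Technology, Master's | | |
| [Vicky-3021](https://github.com/Vicky-3021) | Xidian University, incoming Master's student | | |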
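### Copyright Notice
The project is licensed under the MIT License. For details, see [LICENSE](https://github.com/aJupyter/EmoLLM/blob/master/LICENSE)
The project is licensed under the MIT License. For details, see [LICENSE](https://github.com/SmartFlowAI/EmoLLM/blob/main/LICENSE)
### Special Thanks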
@ -234,6 +228,8 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
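- [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/)
- [Wenxing (assistant)](https://github.com/vansin)
- [Bloom up (WeChat Official Account promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A)
- Abu (M.A. in Psychology, Peking University)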
<!-- links -->
@ -260,7 +256,7 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
[issues-shield]: https://img.shields.io/github/issues/SmartflowAI/EmoLLM.svg?style=flat-square
[issues-url]: https://github.com/SmartflowAI/EmoLLM/issues
[license-shield]: https://img.shields.io/github/license/SmartflowAI/EmoLLM.svg?style=flat-square
[license-url]: https://github.com/SmartflowAI/EmoLLM/blob/main/LICENSE
[license-url]: https://github.com/SmartFlowAI/EmoLLM/blob/main/LICENSE
## Discussion Group


@ -1,18 +1,24 @@
<div align="center">
# EmoLLM - Large Language Model for Mental Health
</div>
<p align="center">
<a href="https://github.com/aJupyter/EmoLLM/">
<img src="assets/logo.jpeg" alt="Logo" width="30%">
</a>
<div align="center">
<!-- PROJECT SHIELDS -->
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Issues][issues-shield]][issues-url]
[![MIT License][license-shield]][license-url]
[![Stargazers][stars-shield]][stars-url]
<br />
<!-- PROJECT LOGO -->
<p align="center">
<a href="https://github.com/aJupyter/EmoLLM/">
<img src="assets/logo.jpeg" alt="Logo" width="30%">
</a>
</div>
<h3 align="center">EmoLLM</h3>
@ -37,19 +43,26 @@
**EmoLLM** is a series of large language models designed to understand, support, and help users in mental health counseling. The models are obtained by instruction fine-tuning of `LLM`s. We would really appreciate a star~⭐⭐. The open-sourced fine-tuning configurations are as follows:
| model | type |
<div align="center">
| Model | Type |
| :-------------------: | :------: |
| InternLM2_7B_chat | qlora |
| InternLM2_7B_chat | full fine-tuning |
| InternLM2_1_8B_chat | full fine-tuning |
| Qwen_7b_chat | qlora |
| Qwen1_5-0_5B-Chat | full fine-tuning |
| Baichuan2_13B_chat | qlora |
| ChatGLM3_6B | lora |
| DeepSeek MoE_16B_chat | qlora |
| Mixtral 8x7B_instruct | qlora |
| InternLM2_7B_chat | QLORA |
| InternLM2_7B_chat | full fine-tuning |
| InternLM2_1_8B_chat | full fine-tuning |
| InternLM2_20B_chat | LORA |
| Qwen_7b_chat | QLORA |
| Qwen1_5-0_5B-Chat | full fine-tuning |
| Baichuan2_13B_chat | QLORA |
| ChatGLM3_6B | LORA |
| DeepSeek MoE_16B_chat | QLORA |
| Mixtral 8x7B_instruct | QLORA |
| …… | …… |
</div>
Everyone is welcome to contribute to this project ~
---
The Model aims to fully understand and promote the mental health of individuals, groups, and society. This model typically includes the following key components:
@ -63,8 +76,9 @@ The Model aims to fully understand and promote the mental health of individuals,
- Prevention and intervention measures: The mental health large model also includes strategies for preventing psychological issues and promoting mental health, such as psychological education, counseling, therapy, and social support systems.
- Assessment and diagnostic tools: Effective promotion of mental health requires scientific tools to assess individuals' psychological states and diagnose potential psychological issues.
### Recent Updates
- 【2024.3.11】 **EmoLLM V2.0 is greatly improved in all scores compared to EmoLLM V1.0. Surpasses the performance of Role-playing ChatGPT on counseling tasks!**
- 【2024.3.9】 New concurrency feature speeds up QA pair generation
- 【2024.3.12】 Released [aiwei](https://aistudio.baidu.com/community/app/63335) on the Baidu PaddlePaddle platform
- 【2024.3.11】 **EmoLLM V2.0 is greatly improved in all scores compared to EmoLLM V1.0. It surpasses the performance of Role-playing ChatGPT on counseling tasks!** [Click to experience EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0); updated [dataset statistics and details](./datasets/) and the [Roadmap](./assets/Roadmap_ZH.png)
- 【2024.3.9】 Added concurrency to speed up [QA pair generation](./scripts/qa_generation/), [RAG pipeline](./rag/)
- 【2024.3.3】 [EmoLLM V2.0, a full fine-tune of InternLM2-7B-chat, is open-sourced](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full) (requires two A100*80G); updated professional evaluation, see [evaluate](./evaluate/); updated the PaddleOCR-based PDF-to-txt tool scripts, see [scripts](./scripts/).
- 【2024.2.29】 Updated objective assessment calculations, see [evaluate](./evaluate/) for details. A series of datasets have also been updated, see [datasets](./datasets/) for details.
- 【2024.2.27】 Updated English README and a series of datasets (licking dogs and one-round dialogue)
@ -109,14 +123,14 @@ The Model aims to fully understand and promote the mental health of individuals,
- [Everyone is welcome to contribute to this project ~](#everyone-is-welcome-to-contribute-to-this-project-)
- [Recent Updates](#recent-updates)
- [Contents](#contents)
- [Pre-development Configuration Requirements.](#pre-development-configuration-requirements)
- [**User Guide**](#user-guide)
- [File Directory Explanation](#file-directory-explanation)
- [Pre-development Configuration Requirements.](#pre-development-configuration-requirements)
- [**User Guide**](#user-guide)
- [Data Construction](#data-construction)
- [Fine-tuning Guide](#fine-tuning-guide)
- [Deployment Guide](#deployment-guide)
- [RAG](#rag-retrieval-augmented-generation-pipeline)
- [Frameworks Used](#frameworks-used)
- [How to participate in this project](#how-to-participate-in-this-project)
- [How to participate in this project](#how-to-participate-in-this-project)
- [Version control](#version-control)
- [Authors (in no particular order)](#authors-in-no-particular-order)
- [Copyright Notice](#copyright-notice)
@ -160,9 +174,9 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
### Data Construction
Please read the [Data Construction Guide](generate_data/tutorial.md) for reference.
- Please read the [Data Construction Guide](generate_data/tutorial.md) for reference.
The dataset used for this fine-tuning can be found at [datasets](datasets/data.json)
- The dataset used for this fine-tuning can be found at [datasets](datasets/data.json)
### Fine-tuning Guide
@ -170,7 +184,12 @@ For details, see the [fine-tuning guide](xtuner_config/README.md)
### Deployment Guide
For details, see the [deployment guide](demo/README.md)
- Demo deployment: see [deployment guide](./demo/README.md) for details.
- Quantized deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy.md)
### RAG (Retrieval Augmented Generation) Pipeline
- See [RAG](./rag/)
<details>
<summary>Additional Details</summary>
@ -180,6 +199,9 @@ For details, see the [deployment guide](demo/README.md)
- [Xtuner](https://github.com/InternLM/xtuner)
- [Transformers](https://github.com/huggingface/transformers)
- [Pytorch](https://pytorch.org/)
- [LMDeploy](https://github.com/InternLM/lmdeploy/): for quantized deployment
- [Streamlit](https://streamlit.io/): for building demos
- [DeepSpeed](https://github.com/microsoft/DeepSpeed): for parallel training
- …
#### How to participate in this project
@ -200,33 +222,24 @@ This project uses Git for version control. You can see the currently available v
### Authors (in no particular order)
[aJupyter](https://github.com/aJupyter)@Datawhale member, Master's student at Nankai University
| Username | School/Organization | Remarks | Contributions |
| :-------: | :-------------------: | :------------------: | :--------: |
| [aJupyter](https://github.com/aJupyter) | Nankai University, Master's student | DataWhale member | Project initiator |
| [jujimeizuo](https://github.com/jujimeizuo) | Jiangnan University, Master's student | | |
| [Smiling-Weeping-zhr](https://github.com/Smiling-Weeping-zhr) | Harbin Institute of Technology (Weihai), Undergraduate student | | |
| [8baby8](https://github.com/8baby8) | PaddlePaddle Pilot Team Regional Director | Wenxin Large Model core developer | |
| [zxazys](https://github.com/zxazys) | Nankai University, Master's student | | |
| [MING-ZCH](https://github.com/MING-ZCH) | Huazhong University of Science and Technology, Undergraduate student | | |
| [JasonLLLLLLLLLLL](https://github.com/JasonLLLLLLLLLLL) | SWUFE (Southwestern University of Finance and Economics) | | |
| [MrCatAI](https://github.com/MrCatAI) | AI Mover | | |
| [ZeyuBa](https://github.com/ZeyuBa) | Institute of Automation, Master's student | | |
| [aiyinyuedejustin](https://github.com/aiyinyuedejustin) | University of Pennsylvania, Master's student | | |
| [Nobody-ML](https://github.com/Nobody-ML) | China University of Petroleum (East China), Undergraduate student | | |
| [chg0901](https://github.com/chg0901) | Kwangwoon University (South Korea), Doctoral student | | |
| [Mxoder](https://github.com/Mxoder) | Beihang University, Undergraduate student | | |
| [Anooyman](https://github.com/Anooyman) | Nanjing University of Science and Technology, Master's student | | |
| [Vicky-3021](https://github.com/Vicky-3021) | Xidian University, incoming Master's student | | |
[jujimeizuo](https://github.com/jujimeizuo)@Master's student at Jiangnan University
[Smiling&amp;Weeping](https://github.com/Smiling-Weeping-zhr)@Undergraduate student at Harbin Institute of Technology (Weihai)
[Farewell](https://github.com/8baby8)@
[ZhouXinAo](https://github.com/zxazys)@Master's student at Nankai University
[MING_X](https://github.com/MING-ZCH) @Undergraduate student at Huazhong University of Science and Technology
[Z_L](https://github.com/JasonLLLLLLLLLLL)@swufe
[MrCatAI](https://github.com/MrCatAI)@AI Removal of Labour
[ZeyuBa](https://github.com/ZeyuBa)@Master's student at Institute of Automation
[aiyinyuedejustin](https://github.com/aiyinyuedejustin)@Master's student at University of Pennsylvania
[Nobody-ML](https://github.com/Nobody-ML)@Undergraduate at China University of Petroleum (East China)
[chg0901](https://github.com/chg0901)@PhD Candidate at Kwangwoon University
[Mxoder](https://github.com/Mxoder)@Undergraduate at Beihang University
[Anooyman](https://github.com/Anooyman) @Master of Nanjing University of Science and Technology
### Copyright Notice
@ -239,6 +252,7 @@ The project is licensed under the MIT License. Please refer to the details
- [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/)
- [Vansin](https://github.com/vansin)
- [Bloom up (WeChat Official Account Promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A)
- Abu (M.A. in Psychology, Peking University)
<!-- links -->