diff --git a/README.md b/README.md
index fcfde5f..8987e81 100644
--- a/README.md
+++ b/README.md
@@ -1,354 +1,355 @@
-
- -# EmoLLM-心理健康大模型 - -
- -

- - Logo - - -

- - -[![Contributors][contributors-shield]][contributors-url] -[![Forks][forks-shield]][forks-url] -[![Issues][issues-shield]][issues-url] -[![OpenXLab_App][OpenXLab_App-image]][OpenXLab_App-url] -[![OpenXLab_Model][OpenXLab_Model-image]][OpenXLab_Model-url] -[![MIT License][license-shield]][license-url] -[![Stargazers][stars-shield]][stars-url] - -
- -

EmoLLM

- -
- 简体中文| English -
-
- 探索本项目的文档 » -
-
- 体验EmoLLM 2.0 - · - 报告Bug - · - 提出新特性 -
- - - -**EmoLLM** 是一系列能够支持 **理解用户-支持用户-帮助用户** 心理健康辅导链路的心理健康大模型,由 `LLM`指令微调而来,欢迎大家star~⭐⭐。目前已经开源的 `LLM` 微调配置如下: - -
- -| 模型 | 类型 | 链接 | -| :-------------------: | :------: | :---: | -| InternLM2_7B_chat | QLORA | | -| InternLM2_7B_chat | 全量微调 | | -| InternLM2_7B_base | QLORA | | -| InternLM2_1_8B_chat | 全量微调 | | -| InternLM2_20B_chat | LORA | | -| Qwen_7b_chat | QLORA | | -| Qwen1_5-0_5B-Chat | 全量微调 | | -| Baichuan2_13B_chat | QLORA | | -| ChatGLM3_6B | LORA | | -| DeepSeek MoE_16B_chat | QLORA | | -| Mixtral 8x7B_instruct | QLORA | | -| …… | …… | …… | - -
- -欢迎大家为本项目做出贡献~ - ---- - -心理健康大模型(Mental Health Grand Model)是一个综合性的概念,它旨在全面理解和促进个体、群体乃至整个社会的心理健康状态。这个模型通常包含以下几个关键组成部分: - -- 认知因素:涉及个体的思维模式、信念系统、认知偏差以及解决问题的能力。认知因素对心理健康有重要影响,因为它们影响个体如何解释和应对生活中的事件。 -- 情感因素:包括情绪调节、情感表达和情感体验。情感健康是心理健康的重要组成部分,涉及个体如何管理和表达自己的情感,以及如何从负面情绪中恢复。 -- 行为因素:涉及个体的行为模式、习惯和应对策略。这包括应对压力的技巧、社交技能以及自我效能感,即个体对自己能力的信心。 -- 社会环境:包括家庭、工作、社区和文化背景等外部因素,这些因素对个体的心理健康有着直接和间接的影响。 -- 生理健康:身体健康与心理健康紧密相关。良好的身体健康可以促进心理健康,反之亦然。 -- 心理韧性:指个体在面对逆境时的恢复力和适应能力。心理韧性强的人更能够从挑战中恢复,并从中学习和成长。 -- 预防和干预措施:心理健康大模型还包括预防心理问题和促进心理健康的策略,如心理教育、心理咨询、心理治疗和社会支持系统。 -- 评估和诊断工具:为了有效促进心理健康,需要有科学的工具来评估个体的心理状态,以及诊断可能存在的心理问题。 - - - - - - - - - - - -
- - placeholder-image - - - - placeholder-image - -
- - placeholder-image - - - - placeholder-image - -
- -### 🎇最近更新 -- 【2024.4.2】在 Huggingface 上传[老母亲心理咨询师](https://huggingface.co/brycewang2018/EmoLLM-mother/tree/main) -- 【2024.3.25】在百度飞桨平台发布[爹系男友心理咨询师](https://aistudio.baidu.com/community/app/68787) -- 【2024.3.24】在OpenXLab和ModelScope平台发布InternLM2-Base-7B QLoRA微调模型, 具体请查看[InternLM2-Base-7B QLoRA](./xtuner_config/README_internlm2_7b_base_qlora.md) -- 【2024.3.12】在百度飞桨平台发布[艾薇](https://aistudio.baidu.com/community/app/63335) -- 【2024.3.11】 **EmoLLM V2.0 相比 EmoLLM V1.0 全面提升,已超越 Role-playing ChatGPT 在心理咨询任务上的能力!**[点击体验EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0),更新[数据集统计及详细信息](./datasets/)、[路线图](./assets/Roadmap_ZH.png) -- 【2024.3.9】 新增并发功能加速 [QA 对生成](./scripts/qa_generation/)、[RAG pipeline](./rag/) -- 【2024.3.3】 [基于InternLM2-7B-chat全量微调版本EmoLLM V2.0开源](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full),需要两块A100*80G,更新专业评估,详见[evaluate](./evaluate/),更新基于PaddleOCR的PDF转txt工具脚本,详见[scripts](./scripts/) -- 【2024.2.29】更新客观评估计算,详见[evaluate](./evaluate/),更新一系列数据集,详见[datasets](./datasets/) -- 【2024.2.27】更新英文readme和一系列数据集(舔狗和单轮对话) -- 【2024.2.23】推出基于InternLM2_7B_chat_qlora的 `温柔御姐心理医生艾薇`,[点击获取模型权重](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_aiwei),[配置文件](xtuner_config/aiwei-internlm2_chat_7b_qlora.py),[在线体验链接](https://openxlab.org.cn/apps/detail/ajupyter/EmoLLM-aiwei) -- 【2024.2.23】更新[若干微调配置](/xtuner_config/),新增 [data_pro.json](/datasets/data_pro.json)(数量更多、场景更全、更丰富)和 [aiwei.json](/datasets/aiwei.json)(温柔御姐角色扮演专用,带有Emoji表情),即将推出 `温柔御姐心理医生艾薇` -- 【2024.2.18】 [基于Qwen1_5-0_5B-Chat全量微调版本开源](https://www.modelscope.cn/models/aJupyter/EmoLLM_Qwen1_5-0_5B-Chat_full_sft/summary),算力有限的道友可以玩起来~ - -
-查看更多 - -- 【2024.2.6】 EmoLLM在[**Openxlab** ](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model) 平台下载量高达18.7k,欢迎大家体验! - -

- 模型下载量 -

- -- 【2024.2.5】 项目荣获公众号**NLP工程化**推文宣传[推文链接](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A),为博主推广一波,欢迎大家关注!!🥳🥳 - -

- 公众号二维码 -

- -- 【2024.2.3】 [项目宣传视频](https://www.bilibili.com/video/BV1N7421N76X/)完成 😊 -- 【2024.1.27】 完善数据构建文档、微调指南、部署指南、Readme等相关文档 👏 -- 【2024.1.25】 EmoLLM V1.0 已部署上线 https://openxlab.org.cn/apps/detail/jujimeizuo/EmoLLM 😀 - -
- -### 🏆荣誉栏 - -- 项目荣获上海人工智能实验室举办的**2024浦源大模型系列挑战赛春季赛*****创新创意奖*** - -

- - 浦语挑战赛创新创意奖 -

- -- 项目荣获公众号**NLP工程化**[推文宣传](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) - -### 🎯路线图 - -

- - Roadmap_ZH - - -### 🔗框架图 - -

- - Framework_ZH - - -## 目录 - -- [EmoLLM-心理健康大模型](#emollm-心理健康大模型) - - [🎇最近更新](#最近更新) - - [🏆荣誉栏](#荣誉栏) - - [🎯路线图](#路线图) - - [🔗框架图](#框架图) - - [目录](#目录) - - [开发前的配置要求](#开发前的配置要求) - - [**使用指南**](#使用指南) - - [快速体验](#快速体验) - - [数据构建](#数据构建) - - [微调指南](#微调指南) - - [部署指南](#部署指南) - - [RAG(检索增强生成)Pipeline](#rag检索增强生成pipeline) - - [使用到的框架](#使用到的框架) - - [如何参与本项目](#如何参与本项目) - - [作者(排名不分先后)](#作者排名不分先后) - - [版权说明](#版权说明) - - [引用](#引用) - - [特别鸣谢](#特别鸣谢) - - [Star History](#star-history) - - [🌟 Contributors](#-contributors) - - [交流群](#交流群) - -###### 开发前的配置要求 - -- 硬件:A100 40G(仅针对InternLM2_7B_chat+qlora微调+deepspeed zero2优化) - -###### **使用指南** - -1. Clone the repo - -```sh -git clone https://github.com/SmartFlowAI/EmoLLM.git -``` - -2. 依次阅读或者选择感兴趣的部分阅读: - - [快速体验](#快速体验) - - [数据构建](#数据构建) - - [微调指南](#微调指南) - - [部署指南](#部署指南) - - [RAG](#rag检索增强生成pipeline) - - 查看更多详情 - - -### 🍪快速体验 - -- 请阅读[快速体验](docs/quick_start.md)查阅 - - -### 📌数据构建 - -- 请阅读[数据构建指南](generate_data/tutorial.md)查阅 - -- 微调用到的数据集见[datasets](datasets/data.json) - -### 🎨微调指南 - -详见[微调指南](xtuner_config/README.md) - -### 🔧部署指南 - -- Demo部署:详见[部署指南](demo/README.md) -- 基于[LMDeploy](https://github.com/InternLM/lmdeploy/)的量化部署:详见[deploy](./deploy/lmdeploy.md) - -### ⚙RAG(检索增强生成)Pipeline - -- 详见[RAG](./rag/) - -

-更多详情 - -### 使用到的框架 - -- [Xtuner](https://github.com/InternLM/xtuner):用于微调 -- [Transformers](https://github.com/huggingface/transformers) -- [Pytorch](https://pytorch.org/) -- [LMDeploy](https://github.com/InternLM/lmdeploy/):用于量化部署 -- [Stremlit](https://streamlit.io/):用于构建Demo -- [DeepSpeed](https://github.com/microsoft/DeepSpeed):并行训练 -- … - -#### 如何参与本项目 - -贡献使开源社区成为一个学习、激励和创造的绝佳场所。你所作的任何贡献都是**非常感谢**的。 - -1. Fork the Project -2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`) -3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`) -4. Push to the Branch (`git push origin feature/AmazingFeature`) -5. Open a Pull Request - -
- -### 作者(排名不分先后) - -| 用户名 | 学校/组织 | 备注 | 贡献 | -| :-----------------------------------------------------------: | :------------------------------------------------: | :------------------------------------------------------------------: | :-------------------------------------------: | -| [aJupyter](https://github.com/aJupyter) | 南开大学在读硕士 | DataWhale成员 | 项目发起人 | -| [MING-ZCH](https://github.com/MING-ZCH) | 华中科技大学在读本科生 | LLM x Psychology 研究者 | 项目联合负责人 | -| [jujimeizuo](https://github.com/jujimeizuo) | 江南大学在读硕士 | | | -| [Smiling-Weeping-zhr](https://github.com/Smiling-Weeping-zhr) | 哈尔滨工业大学(威海)在读本科生 | | | -| [8baby8](https://github.com/8baby8) | 飞桨领航团区域主管 | 文心大模型核心开发者 | | -| [zxazys](https://github.com/zxazys) | 南开大学在读硕士 | | | -| [JasonLLLLLLLLLLL](https://github.com/JasonLLLLLLLLLLL) | swufe | | | -| [MrCatAI](https://github.com/MrCatAI) | AI搬用工 | | | -| [ZeyuBa](https://github.com/ZeyuBa) | 自动化所在读硕士 | | | -| [aiyinyuedejustin](https://github.com/aiyinyuedejustin) | 宾夕法尼亚大学在读硕士 | | | -| [Nobody-ML](https://github.com/Nobody-ML) | 中国石油大学(华东)在读本科生 | | | -| [chg0901](https://github.com/chg0901) | [MiniSora](https://github.com/mini-sora/minisora/) | [MiniSora](https://github.com/mini-sora/minisora/)主要维护者,管理员 | LLM预训练和微调、模型上传、数据清洗、文档翻译 | -| [Mxoder](https://github.com/Mxoder) | 北京航空航天大学在读本科生 | | | -| [Anooyman](https://github.com/Anooyman) | 南京理工大学硕士 | | | -| [Vicky-3021](https://github.com/Vicky-3021) | 西安电子科技大学硕士(研0) | | | -| [SantiagoTOP](https://github.com/santiagoTOP) | 太原理工大学在读硕士 | | | -| [zealot52099](https://github.com/zealot52099) | 个人开发者 | | 清洗数据、LLM微调、RAG | -| [wwwyfff](https://github.com/wwwyfff) | 复旦大学在读硕士 | | | -| [jkhumor](https://github.com/jkhumor) | 南开大学在读硕士 | | RAG | -| [lll997150986](https://github.com/lll997150986) | 南开大学在读硕士 | | 微调 | -| [nln-maker](https://github.com/nln-maker) | 南开大学在读硕士 | | 前后端开发 | -| [dream00001](https://github.com/dream00001) | 南开大学在读硕士 | | 前后端开发 | -| [王几行XING](https://zhihu.com/people/brycewang1898) | 北京大学硕士毕业 | | 
清洗数据、LLM微调、前后端开发 | -| [思在] | 北京大学硕士毕业(微软美国) | | LLM微调、前后端开发 | - -### 版权说明 - -该项目签署了 MIT 授权许可,详情请参阅 [LICENSE](https://github.com/SmartFlowAI/EmoLLM/blob/main/LICENSE) - -### 引用 - -如果本项目对您的工作有所帮助,请使用以下格式引用: - -```bibtex -@misc{EmoLLM, - title={EmoLLM}, - author={EmoLLM}, - url={https://github.com/SmartFlowAI/EmoLLM/}, - year={2024} -} -``` - -### 特别鸣谢 - -- [Sanbu](https://github.com/sanbuphy) -- [上海人工智能实验室](https://www.shlab.org.cn/) -- [闻星大佬(小助手)](https://github.com/vansin) -- [扫地升(公众号宣传)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) -- 阿布(北大心理学硕士) -- [HatBoy](https://github.com/hatboy) - - - - - - - -## Star History - -[![Star History Chart](https://api.star-history.com/svg?repos=SmartFlowAI/EmoLLM&type=Date)](https://star-history.com/#SmartFlowAI/EmoLLM&Date) - -## 🌟 Contributors - -[![EmoLLM contributors](https://contrib.rocks/image?repo=SmartFlowAI/EmoLLM&max=50)](https://github.com/SmartFlowAI/EmoLLM/graphs/contributors) - -[your-project-path]: SmartflowAI/EmoLLM -[contributors-shield]: https://img.shields.io/github/contributors/SmartflowAI/EmoLLM.svg?style=flat-square -[contributors-url]: https://github.com/SmartflowAI/EmoLLM/graphs/contributors -[forks-shield]: https://img.shields.io/github/forks/SmartflowAI/EmoLLM.svg?style=flat-square -[forks-url]: https://github.com/SmartflowAI/EmoLLM/network/members -[stars-shield]: https://img.shields.io/github/stars/SmartflowAI/EmoLLM.svg?style=flat-square -[stars-url]: https://github.com/SmartflowAI/EmoLLM/stargazers -[issues-shield]: https://img.shields.io/github/issues/SmartflowAI/EmoLLM.svg?style=flat-square -[issues-url]: https://img.shields.io/github/issues/SmartflowAI/EmoLLM.svg -[license-shield]: https://img.shields.io/github/license/SmartflowAI/EmoLLM.svg?style=flat-square -[license-url]: https://github.com/SmartFlowAI/EmoLLM/blob/main/LICENSE - -[OpenXLab_App-image]: https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg -[OpenXLab_Model-image]: 
https://cdn-static.openxlab.org.cn/header/openxlab_models.svg -[OpenXLab_App-url]: https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0 -[OpenXLab_Model-url]: https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full - -## 交流群 - -- 如果失效,请移步Issue区 - -

- EmoLLM官方交流群 -

+
+ +# EmoLLM-心理健康大模型 + +
+ +

+ + Logo + + +

+ + +[![Contributors][contributors-shield]][contributors-url] +[![Forks][forks-shield]][forks-url] +[![Issues][issues-shield]][issues-url] +[![OpenXLab_App][OpenXLab_App-image]][OpenXLab_App-url] +[![OpenXLab_Model][OpenXLab_Model-image]][OpenXLab_Model-url] +[![MIT License][license-shield]][license-url] +[![Stargazers][stars-shield]][stars-url] + +
+ +

EmoLLM

+ +
+ 简体中文| English +
+
+ 探索本项目的文档 » +
+
+ 体验EmoLLM 2.0 + · + 报告Bug + · + 提出新特性 +
+ + + +**EmoLLM** 是一系列能够支持 **理解用户-支持用户-帮助用户** 心理健康辅导链路的心理健康大模型,由 `LLM`指令微调而来,欢迎大家star~⭐⭐。目前已经开源的 `LLM` 微调配置如下: + +
+ +| 模型 | 类型 | 链接 | +| :-------------------: | :------: | :---: | +| InternLM2_7B_chat | QLORA | | +| InternLM2_7B_chat | 全量微调 | | +| InternLM2_7B_base | QLORA | [internlm2_7b_base_qlora_e10_M_1e4_32_64.py](./xtuner_config/internlm2_7b_base_qlora_e10_M_1e4_32_64.py) | +| InternLM2_1_8B_chat | 全量微调 | | +| InternLM2_20B_chat | LORA | | +| Qwen_7b_chat | QLORA | | +| Qwen1_5-0_5B-Chat | 全量微调 | | +| Baichuan2_13B_chat | QLORA | | +| ChatGLM3_6B | LORA | | +| DeepSeek MoE_16B_chat | QLORA | | +| Mixtral 8x7B_instruct | QLORA | | +| …… | …… | …… | + +
+ +欢迎大家为本项目做出贡献~ + +--- + +心理健康大模型(Mental Health Grand Model)是一个综合性的概念,它旨在全面理解和促进个体、群体乃至整个社会的心理健康状态。这个模型通常包含以下几个关键组成部分: + +- 认知因素:涉及个体的思维模式、信念系统、认知偏差以及解决问题的能力。认知因素对心理健康有重要影响,因为它们影响个体如何解释和应对生活中的事件。 +- 情感因素:包括情绪调节、情感表达和情感体验。情感健康是心理健康的重要组成部分,涉及个体如何管理和表达自己的情感,以及如何从负面情绪中恢复。 +- 行为因素:涉及个体的行为模式、习惯和应对策略。这包括应对压力的技巧、社交技能以及自我效能感,即个体对自己能力的信心。 +- 社会环境:包括家庭、工作、社区和文化背景等外部因素,这些因素对个体的心理健康有着直接和间接的影响。 +- 生理健康:身体健康与心理健康紧密相关。良好的身体健康可以促进心理健康,反之亦然。 +- 心理韧性:指个体在面对逆境时的恢复力和适应能力。心理韧性强的人更能够从挑战中恢复,并从中学习和成长。 +- 预防和干预措施:心理健康大模型还包括预防心理问题和促进心理健康的策略,如心理教育、心理咨询、心理治疗和社会支持系统。 +- 评估和诊断工具:为了有效促进心理健康,需要有科学的工具来评估个体的心理状态,以及诊断可能存在的心理问题。 + + + + + + + + + + + +
+ + placeholder-image + + + + placeholder-image + +
+ + placeholder-image + + + + placeholder-image + +
+ +### 🎇最近更新 + +- 【2024.4.2】在 Huggingface 上传[老母亲心理咨询师](https://huggingface.co/brycewang2018/EmoLLM-mother/tree/main) +- 【2024.3.25】在百度飞桨平台发布[爹系男友心理咨询师](https://aistudio.baidu.com/community/app/68787) +- 【2024.3.24】在**OpenXLab**和**ModelScope**平台发布**InternLM2-Base-7B QLoRA微调模型**, 具体请查看[**InternLM2-Base-7B QLoRA**](./xtuner_config/README_internlm2_7b_base_qlora.md) +- 【2024.3.12】在百度飞桨平台发布[艾薇](https://aistudio.baidu.com/community/app/63335) +- 【2024.3.11】 **EmoLLM V2.0 相比 EmoLLM V1.0 全面提升,已超越 Role-playing ChatGPT 在心理咨询任务上的能力!**[点击体验EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0),更新[数据集统计及详细信息](./datasets/)、[路线图](./assets/Roadmap_ZH.png) +- 【2024.3.9】 新增并发功能加速 [QA 对生成](./scripts/qa_generation/)、[RAG pipeline](./rag/) +- 【2024.3.3】 [基于InternLM2-7B-chat全量微调版本EmoLLM V2.0开源](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full),需要两块A100*80G,更新专业评估,详见[evaluate](./evaluate/),更新基于PaddleOCR的PDF转txt工具脚本,详见[scripts](./scripts/) +- 【2024.2.29】更新客观评估计算,详见[evaluate](./evaluate/),更新一系列数据集,详见[datasets](./datasets/) +- 【2024.2.27】更新英文readme和一系列数据集(舔狗和单轮对话) +- 【2024.2.23】推出基于InternLM2_7B_chat_qlora的 `温柔御姐心理医生艾薇`,[点击获取模型权重](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_aiwei),[配置文件](xtuner_config/aiwei-internlm2_chat_7b_qlora.py),[在线体验链接](https://openxlab.org.cn/apps/detail/ajupyter/EmoLLM-aiwei) +- 【2024.2.23】更新[若干微调配置](/xtuner_config/),新增 [data_pro.json](/datasets/data_pro.json)(数量更多、场景更全、更丰富)和 [aiwei.json](/datasets/aiwei.json)(温柔御姐角色扮演专用,带有Emoji表情),即将推出 `温柔御姐心理医生艾薇` +- 【2024.2.18】 [基于Qwen1_5-0_5B-Chat全量微调版本开源](https://www.modelscope.cn/models/aJupyter/EmoLLM_Qwen1_5-0_5B-Chat_full_sft/summary),算力有限的道友可以玩起来~ + +
+查看更多 + +- 【2024.2.6】 EmoLLM在[**Openxlab** ](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model) 平台下载量高达18.7k,欢迎大家体验! + +

+ 模型下载量 +

+ +- 【2024.2.5】 项目荣获公众号**NLP工程化**推文宣传[推文链接](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A),为博主推广一波,欢迎大家关注!!🥳🥳 + +

+ 公众号二维码 +

+ +- 【2024.2.3】 [项目宣传视频](https://www.bilibili.com/video/BV1N7421N76X/)完成 😊 +- 【2024.1.27】 完善数据构建文档、微调指南、部署指南、Readme等相关文档 👏 +- 【2024.1.25】 EmoLLM V1.0 已部署上线 https://openxlab.org.cn/apps/detail/jujimeizuo/EmoLLM 😀 + +
+ +### 🏆荣誉栏 + +- 项目荣获上海人工智能实验室举办的**2024浦源大模型系列挑战赛春季赛*****创新创意奖*** + +

+ + 浦语挑战赛创新创意奖 +

+ +- 项目荣获公众号**NLP工程化**[推文宣传](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) + +### 🎯路线图 + +

+ + Roadmap_ZH + + +### 🔗框架图 + +

+ + Framework_ZH + + +## 目录 + +- [EmoLLM-心理健康大模型](#emollm-心理健康大模型) + - [🎇最近更新](#最近更新) + - [🏆荣誉栏](#荣誉栏) + - [🎯路线图](#路线图) + - [🔗框架图](#框架图) + - [目录](#目录) + - [开发前的配置要求](#开发前的配置要求) + - [**使用指南**](#使用指南) + - [🍪快速体验](#快速体验) + - [📌数据构建](#数据构建) + - [🎨微调指南](#微调指南) + - [🔧部署指南](#部署指南) + - [⚙RAG(检索增强生成)Pipeline](#rag检索增强生成pipeline) + - [使用到的框架](#使用到的框架) + - [如何参与本项目](#如何参与本项目) + - [作者(排名不分先后)](#作者排名不分先后) + - [版权说明](#版权说明) + - [引用](#引用) + - [特别鸣谢](#特别鸣谢) + - [Star History](#star-history) + - [🌟 Contributors](#-contributors) + - [交流群](#交流群) + +###### 开发前的配置要求 + +- 硬件:A100 40G(仅针对InternLM2_7B_chat+qlora微调+deepspeed zero2优化) + +###### **使用指南** + +1. Clone the repo + +```sh +git clone https://github.com/SmartFlowAI/EmoLLM.git +``` + +2. 依次阅读或者选择感兴趣的部分阅读: + - [快速体验](#快速体验) + - [数据构建](#数据构建) + - [微调指南](#微调指南) + - [部署指南](#部署指南) + - [RAG](#rag检索增强生成pipeline) + - 查看更多详情 + + +### 🍪快速体验 + +- 请阅读[快速体验](docs/quick_start.md)查阅 + + +### 📌数据构建 + +- 请阅读[数据构建指南](generate_data/tutorial.md)查阅 + +- 微调用到的数据集见[datasets](datasets/data.json) + +### 🎨微调指南 + +详见[微调指南](xtuner_config/README.md) + +### 🔧部署指南 + +- Demo部署:详见[部署指南](demo/README.md) +- 基于[LMDeploy](https://github.com/InternLM/lmdeploy/)的量化部署:详见[deploy](./deploy/lmdeploy.md) + +### ⚙RAG(检索增强生成)Pipeline + +- 详见[RAG](./rag/) + +

+更多详情
+
+### 使用到的框架
+
+- [Xtuner](https://github.com/InternLM/xtuner):用于微调
+- [Transformers](https://github.com/huggingface/transformers)
+- [Pytorch](https://pytorch.org/)
+- [LMDeploy](https://github.com/InternLM/lmdeploy/):用于量化部署
+- [Streamlit](https://streamlit.io/):用于构建Demo
+- [DeepSpeed](https://github.com/microsoft/DeepSpeed):并行训练
+- …
+
+#### 如何参与本项目
+
+贡献使开源社区成为一个学习、激励和创造的绝佳场所。你所作的任何贡献都是**非常感谢**的。
+
+1. Fork the Project
+2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
+3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
+4. Push to the Branch (`git push origin feature/AmazingFeature`)
+5. Open a Pull Request
+
+
+ +### 作者(排名不分先后) + +| 用户名 | 学校/组织 | 备注 | 贡献 | +| :-----------------------------------------------------------: | :------------------------------------------------: | :------------------------------------------------------------------: | :-------------------------------------------: | +| [aJupyter](https://github.com/aJupyter) | 南开大学在读硕士 | DataWhale成员 | 项目发起人 | +| [MING-ZCH](https://github.com/MING-ZCH) | 华中科技大学在读本科生 | LLM x Psychology 研究者 | 项目联合负责人 | +| [jujimeizuo](https://github.com/jujimeizuo) | 江南大学在读硕士 | | | +| [Smiling-Weeping-zhr](https://github.com/Smiling-Weeping-zhr) | 哈尔滨工业大学(威海)在读本科生 | | | +| [8baby8](https://github.com/8baby8) | 飞桨领航团区域主管 | 文心大模型核心开发者 | | +| [zxazys](https://github.com/zxazys) | 南开大学在读硕士 | | | +| [JasonLLLLLLLLLLL](https://github.com/JasonLLLLLLLLLLL) | swufe | | | +| [MrCatAI](https://github.com/MrCatAI) | AI搬用工 | | | +| [ZeyuBa](https://github.com/ZeyuBa) | 自动化所在读硕士 | | | +| [aiyinyuedejustin](https://github.com/aiyinyuedejustin) | 宾夕法尼亚大学在读硕士 | | | +| [Nobody-ML](https://github.com/Nobody-ML) | 中国石油大学(华东)在读本科生 | | | +| [chg0901](https://github.com/chg0901) | [MiniSora](https://github.com/mini-sora/minisora/) | [MiniSora](https://github.com/mini-sora/minisora/)主要维护者,管理员 | LLM预训练和微调、模型上传、数据清洗、文档翻译 | +| [Mxoder](https://github.com/Mxoder) | 北京航空航天大学在读本科生 | | | +| [Anooyman](https://github.com/Anooyman) | 南京理工大学硕士 | | | +| [Vicky-3021](https://github.com/Vicky-3021) | 西安电子科技大学硕士(研0) | | | +| [SantiagoTOP](https://github.com/santiagoTOP) | 太原理工大学在读硕士 | | | +| [zealot52099](https://github.com/zealot52099) | 个人开发者 | | 清洗数据、LLM微调、RAG | +| [wwwyfff](https://github.com/wwwyfff) | 复旦大学在读硕士 | | | +| [jkhumor](https://github.com/jkhumor) | 南开大学在读硕士 | | RAG | +| [lll997150986](https://github.com/lll997150986) | 南开大学在读硕士 | | 微调 | +| [nln-maker](https://github.com/nln-maker) | 南开大学在读硕士 | | 前后端开发 | +| [dream00001](https://github.com/dream00001) | 南开大学在读硕士 | | 前后端开发 | +| [王几行XING](https://zhihu.com/people/brycewang1898) | 北京大学硕士毕业 | | 
清洗数据、LLM微调、前后端开发 | +| [思在] | 北京大学硕士毕业(微软美国) | | LLM微调、前后端开发 | + +### 版权说明 + +该项目签署了 MIT 授权许可,详情请参阅 [LICENSE](https://github.com/SmartFlowAI/EmoLLM/blob/main/LICENSE) + +### 引用 + +如果本项目对您的工作有所帮助,请使用以下格式引用: + +```bibtex +@misc{EmoLLM, + title={EmoLLM}, + author={EmoLLM}, + url={https://github.com/SmartFlowAI/EmoLLM/}, + year={2024} +} +``` + +### 特别鸣谢 + +- [Sanbu](https://github.com/sanbuphy) +- [上海人工智能实验室](https://www.shlab.org.cn/) +- [闻星大佬(小助手)](https://github.com/vansin) +- [扫地升(公众号宣传)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) +- 阿布(北大心理学硕士) +- [HatBoy](https://github.com/hatboy) + + + + + + + +## Star History + +[![Star History Chart](https://api.star-history.com/svg?repos=SmartFlowAI/EmoLLM&type=Date)](https://star-history.com/#SmartFlowAI/EmoLLM&Date) + +## 🌟 Contributors + +[![EmoLLM contributors](https://contrib.rocks/image?repo=SmartFlowAI/EmoLLM&max=50)](https://github.com/SmartFlowAI/EmoLLM/graphs/contributors) + +[your-project-path]: SmartflowAI/EmoLLM +[contributors-shield]: https://img.shields.io/github/contributors/SmartflowAI/EmoLLM.svg?style=flat-square +[contributors-url]: https://github.com/SmartflowAI/EmoLLM/graphs/contributors +[forks-shield]: https://img.shields.io/github/forks/SmartflowAI/EmoLLM.svg?style=flat-square +[forks-url]: https://github.com/SmartflowAI/EmoLLM/network/members +[stars-shield]: https://img.shields.io/github/stars/SmartflowAI/EmoLLM.svg?style=flat-square +[stars-url]: https://github.com/SmartflowAI/EmoLLM/stargazers +[issues-shield]: https://img.shields.io/github/issues/SmartflowAI/EmoLLM.svg?style=flat-square +[issues-url]: https://img.shields.io/github/issues/SmartflowAI/EmoLLM.svg +[license-shield]: https://img.shields.io/github/license/SmartflowAI/EmoLLM.svg?style=flat-square +[license-url]: https://github.com/SmartFlowAI/EmoLLM/blob/main/LICENSE + +[OpenXLab_App-image]: https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg +[OpenXLab_Model-image]: 
https://cdn-static.openxlab.org.cn/header/openxlab_models.svg +[OpenXLab_App-url]: https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0 +[OpenXLab_Model-url]: https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full + +## 交流群 + +- 如果失效,请移步Issue区 + +

+ EmoLLM官方交流群 +

+
\ No newline at end of file
diff --git a/README_EN.md b/README_EN.md
index c8d3cf3..bb9d34a 100644
--- a/README_EN.md
+++ b/README_EN.md
@@ -1,343 +1,343 @@
-
- -# EmoLLM - Large Language Model for Mental Health - -
- -

- - Logo - - -

- - -[![Contributors][contributors-shield]][contributors-url] -[![Forks][forks-shield]][forks-url] -[![Issues][issues-shield]][issues-url] -[![OpenXLab_App][OpenXLab_App-image]][OpenXLab_App-url] -[![OpenXLab_Model][OpenXLab_Model-image]][OpenXLab_Model-url] -[![MIT License][license-shield]][license-url] -[![Stargazers][stars-shield]][stars-url] - -
- -

EmoLLM

- -

- 简体中文 | English -
-
- Explore the documentation of this project » -
-
- EmoLLM 2.0 Demo - · - Report a Bug - · - Propose a New Feature -

- -

- - - -**EmoLLM** is a series of large language models designed to understand, support and help customers in mental health counseling. It is fine-tuned from the LLM instructions. We really appreciate it if you could give it a star~⭐⭐. The open-sourced configuration is as follows: - -
- -| Model | Type | link | -| :-------------------: | :--------------: | :---: | -| InternLM2_7B_chat | QLORA | | -| InternLM2_7B_chat | full fine-tuning | | -| InternLM2_7B_base | QLORA | | -| InternLM2_1_8B_chat | full fine-tuning | | -| InternLM2_20B_chat | LORA | | -| Qwen_7b_chat | QLORA | | -| Qwen1_5-0_5B-Chat | full fine-tuning | | -| Baichuan2_13B_chat | QLORA | | -| ChatGLM3_6B | LORA | | -| DeepSeek MoE_16B_chat | QLORA | | -| Mixtral 8x7B_instruct | QLORA | | -| …… | …… | …… | - -
- -Everyone is welcome to contribute to this project ~ - ---- - -The Model aims to fully understand and promote the mental health of individuals, groups, and society. This model typically includes the following key components: - -- Cognitive factors: Involving an individual's thought patterns, belief systems, cognitive biases, and problem-solving abilities. Cognitive factors significantly impact mental health as they affect how individuals interpret and respond to life events. -- Emotional factors: Including emotion regulation, emotional expression, and emotional experiences. Emotional health is a crucial part of mental health, involving how individuals manage and express their emotions and how they recover from negative emotions. -- Behavioral factors: Concerning an individual's behavior patterns, habits, and coping strategies. This includes stress management skills, social skills, and self-efficacy, which is the confidence in one's abilities. -- Social environment: Comprising external factors such as family, work, community, and cultural background, which have direct and indirect impacts on an individual's mental health. -- Physical health: There is a close relationship between physical and mental health. Good physical health can promote mental health and vice versa. -- Psychological resilience: Refers to an individual's ability to recover from adversity and adapt. Those with strong psychological resilience can bounce back from challenges and learn and grow from them. -- Prevention and intervention measures: The Mental Health Grand Model also includes strategies for preventing psychological issues and promoting mental health, such as psychological education, counseling, therapy, and social support systems. -- Assessment and diagnostic tools: Effective promotion of mental health requires scientific tools to assess individuals' psychological states and diagnose potential psychological issues. - - - - - - - - - - - -
- - placeholder-image - - - - placeholder-image - -
- - placeholder-image - - - - placeholder-image - -
- -### Recent Updates -- 【2024.3.25】 [Mother-like Therapist] is released on Huggingface (https://huggingface.co/brycewang2018/EmoLLM-mother/tree/main) -- 【2024.3.25】 [Daddy-like Boy-Friend] is released on Baidu Paddle-Paddle AI Studio Platform (https://aistudio.baidu.com/community/app/68787) -- 【2024.3.24】 The InternLM2-Base-7B QLoRA fine-tuned model has been released on the OpenXLab and ModelScope platforms. For more details, please refer to [InternLM2-Base-7B QLoRA](./xtuner_config/README_internlm2_7b_base_qlora.md). -- 【2024.3.12】 [aiwei] is released on Baidu Paddle-Paddle AI Studio Platform (https://aistudio.baidu.com/community/app/63335) -- 【2024.3.11】 **EmoLLM V2.0 is greatly improved in all scores compared to EmoLLM V1.0. Surpasses the performance of Role-playing ChatGPT on counseling tasks!** [Click to experience EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0), update [dataset statistics and details](./datasets/), [Roadmap](./assets/Roadmap_ZH.png) -- 【2024.3.9】 Add concurrency acceleration [QA pair generation](./scripts/qa_generation/), [RAG pipeline](./rag/) -- 【2024.3.3】 [Based on InternLM2-7B-chat full fine-tuned version EmoLLM V2.0 open sourced](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full), need two A100*80G, update professional evaluation, see [evaluate](./evaluate/), update PaddleOCR-based PDF to txt tool scripts, see [scripts](./scripts/). -- 【2024.2.29】 Updated objective assessment calculations, see [evaluate](./evaluate/) for details. A series of datasets have also been updated, see [datasets](./datasets/) for details. -- 【2024.2.27】 Updated English README and a series of datasets (licking dogs and one-round dialogue) -- 【2024.2.23】The "Gentle Lady Psychologist Ai Wei" based on InternLM2_7B_chat_qlora was launched. 
[Click here to obtain the model weights](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_aiwei), [configuration file](xtuner_config/aiwei-internlm2_chat_7b_qlora.py), [online experience link](https://openxlab.org.cn/apps/detail/ajupyter/EmoLLM-aiwei) - -- 【2024.2.23】Updated [several fine-tuning configurations](/xtuner_config/), added [data_pro.json](/datasets/data_pro.json) (more quantity, more comprehensive scenarios, richer content) and [aiwei.json](/datasets/aiwei.json) (dedicated to the gentle lady role-play, featuring Emoji expressions), the "Gentle Lady Psychologist Ai Wei" is coming soon. - -- 【2024.2.18】 The full fine-tuned version based on Qwen1_5-0_5B-Chat has been [open-sourced](https://www.modelscope.cn/models/aJupyter/EmoLLM_Qwen1_5-0_5B-Chat_full_sft/summary). Friends with limited computational resources can now dive in and explore it. - -
-View More - -- 【2024.2.6】 [Open-sourced based on the Qwen1_5-0_5B-Chat full-scale fine-tuned version](https://www.modelscope.cn/models/aJupyter/EmoLLM_Qwen1_5-0_5B-Chat_full_sft/summary), friends with limited computing power can start experimenting~ - -

- 模型下载量 -

- -- 【2024.2.5】 The project has been promoted by the official WeChat account NLP Engineering. Here's the [link](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) to the article. Welcome everyone to follow!! 🥳🥳 - -

- 公众号二维码 -

- -- 【2024.2.3】 [Project Vedio](https://www.bilibili.com/video/BV1N7421N76X/) at bilibili 😊 -- 【2024.1.27】 Complete data construction documentation, fine-tuning guide, deployment guide, Readme, and other related documents 👏 -- 【2024.1.25】 EmoLLM V1.0 has deployed online https://openxlab.org.cn/apps/detail/jujimeizuo/EmoLLM 😀 - -
- -### Honors - -- The project won the ***the Innovation and Creativity Award*** in the **2024 Puyuan Large Model Series Challenge Spring Competition held by the Shanghai Artificial Intelligence Laboratory** - -

- - Challenge Innovation and Creativity Award -

- - -- The project has been promoted by the official WeChat account **NLP Engineering**. Here's the [link](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A). - -### Roadmap - -

- - Roadmap_EN - - -## Contents - -- [EmoLLM - Large Language Model for Mental Health](#emollm---large-language-model-for-mental-health) - - [Recent Updates](#recent-updates) - - [Honors](#honors) - - [Roadmap](#roadmap) - - [Contents](#contents) - - [Pre-development Configuration Requirements.](#pre-development-configuration-requirements) - - [**User Guide**](#user-guide) - - [File Directory Explanation](#file-directory-explanation) - - [Data Construction](#data-construction) - - [Fine-tuning Guide](#fine-tuning-guide) - - [Deployment Guide](#deployment-guide) - - [RAG (Retrieval Augmented Generation) Pipeline](#rag-retrieval-augmented-generation-pipeline) - - [Frameworks Used](#frameworks-used) - - [How to participate in this project](#how-to-participate-in-this-project) - - [Version control](#version-control) - - [Authors (in no particular order)](#authors-in-no-particular-order) - - [Copyright Notice](#copyright-notice) - - [Acknowledgments](#acknowledgments) - - [Star History](#star-history) - - [🌟 Contributors](#-contributors) - - [Communication group](#communication-group) - -###### Pre-development Configuration Requirements. - -- A100 40G (specifically for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization) - -###### **User Guide** - -1. Clone the repo - -```sh -git clone https://github.com/SmartFlowAI/EmoLLM.git -``` - -1. Read in sequence or read sections you're interested in: - - [Quick Start](#quick-start) - - [Data Construction](#data-construction) - - [Fine-tuning Guide](#fine-tuning-guide) - - [Deployment Guide](#deployment-guide) - - [RAG](#rag-retrieval-augmented-generation-pipeline) - - View More Details - - -### 🍪Quick start -- Please read [Quick Start](docs/quick_start_EN.md) to see. - -### 📌Data Construction - -- Please read the [Data Construction Guide ](generate_data/tutorial_EN.md)for reference. 
- -- The dataset used for this fine-tuning can be found at [datasets](datasets/data.json) - -### 🎨Fine-tuning Guide - -For details, see the [fine-tuning guide](xtuner_config/README_EN.md) - -### 🔧Deployment Guide - -- Demo deployment: see [deployment guide](./demo/README_EN.md) for details. -- Quantitative deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy_EN.md) - -### ⚙RAG (Retrieval Augmented Generation) Pipeline - -- See [RAG](./rag/) - -

-Additional Details - -### Frameworks Used - -- [Xtuner](https://github.com/InternLM/xtuner) -- [Transformers](https://github.com/huggingface/transformers) -- [Pytorch](https://pytorch.org/) -- [LMDeploy](https://github.com/InternLM/lmdeploy/): for quantitative deployment -- [Stremlit](https://streamlit.io/): for building demos -- [DeepSpeed](https://github.com/microsoft/DeepSpeed): for parallel training -- … - -#### How to participate in this project - -Contributions make the open-source community an excellent place for learning, inspiration, and creation. Any contribution you make is greatly appreciated. - -1. Fork the Project -2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`) -3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`) -4. Push to the Branch (`git push origin feature/AmazingFeature`) -5. Open a Pull Request - -### Version control - -This project uses Git for version control. You can see the currently available versions in the repository. - -
- -### Authors (in no particular order) - -| Username | School/Organization | Remarks | Contributions | -| :-----------------------------------------------------------: | :------------------------------------------------------------------: | :-----------------------------------------------------------------------: | :-----------------------------------------------------------------------------------: | -| [aJupyter](https://github.com/aJupyter) | Nankai University, Master's student | DataWhale member | Project initiator | -| [MING-ZCH](https://github.com/MING-ZCH) | Huazhong University of Science and Technology, Undergraduate student | LLM X Psychology researcher | Project co-leader | -| [jujimeizuo](https://github.com/jujimeizuo) | Jiangnan University, Master's student | | | -| [Smiling-Weeping-zhr](https://github.com/Smiling-Weeping-zhr) | Harbin Institute of Technology (Weihai), Undergraduate student | | | -| [8baby8](https://github.com/8baby8) | PaddlePaddle Pilot Team Regional Director | Wenxin Large Model core developer | | -| [zxazys](https://github.com/zxazys) | Nankai University, Master's student | | | -| [JasonLLLLLLLLLLL](https://github.com/JasonLLLLLLLLLLL) | SWUFE (Southwestern University of Finance and Economics) | | | -| [MrCatAI](https://github.com/MrCatAI) | AI Mover | | | -| [ZeyuBa](https://github.com/ZeyuBa) | Institute of Automation, Master's student | | | -| [aiyinyuedejustin](https://github.com/aiyinyuedejustin) | University of Pennsylvania, Master's student | | | -| [Nobody-ML](https://github.com/Nobody-ML) | China University of Petroleum (East China), Undergraduate student | | | -| [chg0901](https://github.com/chg0901) | [MiniSora](https://github.com/mini-sora/minisora) | Maintainer and Admin of [MiniSora](https://github.com/mini-sora/minisora) | LLM Pre-Training and Fine-Tuning, Model Uploading, Data Cleaning and Docs Translation | -| [Mxoder](https://github.com/Mxoder) | Beihang University, Undergraduate student | | | -| 
[Anooyman](https://github.com/Anooyman) | Nanjing University of Science and Technology, Master's student | | | -| [Vicky-3021](https://github.com/Vicky-3021) | Xidian University, Master's student (Research Year 0) | | | -| [SantiagoTOP](https://github.com/santiagoTOP) | Taiyuan University of Technology, Master's student | | | -| [zealot52099](https://github.com/zealot52099) | Individual developer | | Data Processing, LLM finetuning and RAG | -| [wwwyfff](https://github.com/wwwyfff) | FuDan University, Master's student | | | -| [jkhumor](https://github.com/jkhumor) | Nankai University, Master's student | | RAG | -| [lll997150986](https://github.com/lll997150986) | Nankai University, Master's student | | Fine Tuning | -| [nln-maker](https://github.com/nln-maker) | Nankai University, Master's student | | Front-end and back-end development | -| [dream00001](https://github.com/dream00001) | Nankai University, Master's student | | Front-end and back-end development | -| [王几行XING](zhihu.com/people/brycewang1898) | Peking University, Master's graduate | | Data Processing, LLM finetuning, Front-end and back-end development | -| [思在] | Peking University, Master's graduate (Microsoft) | | LLM finetuning, Front-end and back-end development | - -### Copyright Notice - -The project is licensed under the MIT License. Please refer to the details - [LICENSE](https://github.com/SmartFlowAI/EmoLLM/blob/master/LICENSE) - -### Acknowledgments - -- [Sanbu](https://github.com/sanbuphy) -- [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/) -- [Vanin](https://github.com/vansin) -- [Bloom up (WeChat Official Account Promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) -- Abu (M.A. 
in Psychology, Peking University) -- [HatBoy](https://github.com/hatboy) - - - - - - - - - -## Star History - -[![Star History Chart](https://api.star-history.com/svg?repos=SmartFlowAI/EmoLLM&type=Date)](https://star-history.com/#SmartFlowAI/EmoLLM&Date) - -## 🌟 Contributors - -[![EmoLLM contributors](https://contrib.rocks/image?repo=SmartFlowAI/EmoLLM&max=50)](https://github.com/SmartFlowAI/EmoLLM/graphs/contributors) - -[your-project-path]: SmartflowAI/EmoLLM -[contributors-shield]: https://img.shields.io/github/contributors/SmartflowAI/EmoLLM.svg?style=flat-square -[contributors-url]: https://github.com/SmartflowAI/EmoLLM/graphs/contributors -[forks-shield]: https://img.shields.io/github/forks/SmartflowAI/EmoLLM.svg?style=flat-square -[forks-url]: https://github.com/SmartflowAI/EmoLLM/network/members -[stars-shield]: https://img.shields.io/github/stars/SmartflowAI/EmoLLM.svg?style=flat-square -[stars-url]: https://github.com/SmartflowAI/EmoLLM/stargazers -[issues-shield]: https://img.shields.io/github/issues/SmartflowAI/EmoLLM.svg?style=flat-square -[issues-url]: https://img.shields.io/github/issues/SmartflowAI/EmoLLM.svg -[license-shield]: https://img.shields.io/github/license/SmartflowAI/EmoLLM.svg?style=flat-square -[license-url]: https://github.com/SmartflowAI/EmoLLM/blob/main/LICENSE - -[OpenXLab_App-image]: https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg -[OpenXLab_Model-image]: https://cdn-static.openxlab.org.cn/header/openxlab_models.svg -[OpenXLab_App-url]: https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0 -[OpenXLab_Model-url]: https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full - -## Communication group - -- If it fails, go to the Issue section. - -

- EmoLLM official communication group -

+
+ +# EmoLLM - Large Language Model for Mental Health + +
+ +

+ + Logo + + +

+ + +[![Contributors][contributors-shield]][contributors-url] +[![Forks][forks-shield]][forks-url] +[![Issues][issues-shield]][issues-url] +[![OpenXLab_App][OpenXLab_App-image]][OpenXLab_App-url] +[![OpenXLab_Model][OpenXLab_Model-image]][OpenXLab_Model-url] +[![MIT License][license-shield]][license-url] +[![Stargazers][stars-shield]][stars-url] + +
+ +

EmoLLM

+ +

+ 简体中文 | English +
+
+ Explore the documentation of this project » +
+
+ EmoLLM 2.0 Demo + · + Report a Bug + · + Propose a New Feature +

+ +

+ + + +**EmoLLM** is a series of large language models for mental health, built to support the full counseling chain of **understanding users, supporting users, and helping users**. The models are obtained by instruction fine-tuning of existing `LLM`s. If you find the project useful, please give it a star~⭐⭐. The open-sourced `LLM` fine-tuning configurations are as follows: + +
+
+| Model | Type | Link |
+| :-------------------: | :--------------: | :---: |
+| InternLM2_7B_chat | QLORA | |
+| InternLM2_7B_chat | full fine-tuning | |
+| InternLM2_7B_base | QLORA | [internlm2_7b_base_qlora_e10_M_1e4_32_64.py](./xtuner_config/internlm2_7b_base_qlora_e10_M_1e4_32_64.py) |
+| InternLM2_1_8B_chat | full fine-tuning | |
+| InternLM2_20B_chat | LORA | |
+| Qwen_7b_chat | QLORA | |
+| Qwen1_5-0_5B-Chat | full fine-tuning | |
+| Baichuan2_13B_chat | QLORA | |
+| ChatGLM3_6B | LORA | |
+| DeepSeek MoE_16B_chat | QLORA | |
+| Mixtral 8x7B_instruct | QLORA | |
+| …… | …… | …… |
+
+
+ +Everyone is welcome to contribute to this project ~ + +--- + +The Mental Health Grand Model is a comprehensive concept that aims to fully understand and promote the mental health of individuals, groups, and society as a whole. This model typically includes the following key components: + +- Cognitive factors: Involving an individual's thought patterns, belief systems, cognitive biases, and problem-solving abilities. Cognitive factors significantly impact mental health as they affect how individuals interpret and respond to life events. +- Emotional factors: Including emotion regulation, emotional expression, and emotional experiences. Emotional health is a crucial part of mental health, involving how individuals manage and express their emotions and how they recover from negative emotions. +- Behavioral factors: Concerning an individual's behavior patterns, habits, and coping strategies. This includes stress management skills, social skills, and self-efficacy, which is the confidence in one's abilities. +- Social environment: Comprising external factors such as family, work, community, and cultural background, which have direct and indirect impacts on an individual's mental health. +- Physical health: There is a close relationship between physical and mental health. Good physical health can promote mental health and vice versa. +- Psychological resilience: Refers to an individual's ability to recover from adversity and adapt. Those with strong psychological resilience can bounce back from challenges and learn and grow from them. +- Prevention and intervention measures: The Mental Health Grand Model also includes strategies for preventing psychological issues and promoting mental health, such as psychological education, counseling, therapy, and social support systems. +- Assessment and diagnostic tools: Effective promotion of mental health requires scientific tools to assess individuals' psychological states and diagnose potential psychological issues. + + + + + + + + + + + +
+ + placeholder-image + + + + placeholder-image + +
+ + placeholder-image + + + + placeholder-image + +
+ +### Recent Updates +- 【2024.4.2】 [Mother-like Therapist](https://huggingface.co/brycewang2018/EmoLLM-mother/tree/main) released on Hugging Face +- 【2024.3.25】 [Daddy-like Boyfriend](https://aistudio.baidu.com/community/app/68787) released on the Baidu PaddlePaddle AI Studio platform +- 【2024.3.24】 The **InternLM2-Base-7B QLoRA fine-tuned model** has been released on the **OpenXLab** and **ModelScope** platforms. For more details, please refer to [**InternLM2-Base-7B QLoRA**](./xtuner_config/README_internlm2_7b_base_qlora.md). +- 【2024.3.12】 [aiwei](https://aistudio.baidu.com/community/app/63335) released on the Baidu PaddlePaddle AI Studio platform +- 【2024.3.11】 **EmoLLM V2.0 improves on EmoLLM V1.0 across all scores and surpasses Role-playing ChatGPT on counseling tasks!** [Click to try EmoLLM V2.0](https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0). Also updated [dataset statistics and details](./datasets/) and the [Roadmap](./assets/Roadmap_ZH.png). +- 【2024.3.9】 Added concurrency to speed up [QA pair generation](./scripts/qa_generation/) and the [RAG pipeline](./rag/) +- 【2024.3.3】 [EmoLLM V2.0, a full fine-tune of InternLM2-7B-chat, is open-sourced](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full); it requires two A100*80G GPUs. Updated the professional evaluation, see [evaluate](./evaluate/), and the PaddleOCR-based PDF-to-txt tool scripts, see [scripts](./scripts/). +- 【2024.2.29】 Updated objective evaluation calculations, see [evaluate](./evaluate/) for details. A series of datasets has also been updated, see [datasets](./datasets/) for details. +- 【2024.2.27】 Updated the English README and a series of datasets (licking dogs and single-turn dialogue) +- 【2024.2.23】The "Gentle Lady Psychologist Ai Wei" based on InternLM2_7B_chat_qlora was launched. 
[Click here to obtain the model weights](https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_aiwei), [configuration file](xtuner_config/aiwei-internlm2_chat_7b_qlora.py), [online experience link](https://openxlab.org.cn/apps/detail/ajupyter/EmoLLM-aiwei) + +- 【2024.2.23】Updated [several fine-tuning configurations](/xtuner_config/), added [data_pro.json](/datasets/data_pro.json) (more quantity, more comprehensive scenarios, richer content) and [aiwei.json](/datasets/aiwei.json) (dedicated to the gentle lady role-play, featuring Emoji expressions), the "Gentle Lady Psychologist Ai Wei" is coming soon. + +- 【2024.2.18】 The full fine-tuned version based on Qwen1_5-0_5B-Chat has been [open-sourced](https://www.modelscope.cn/models/aJupyter/EmoLLM_Qwen1_5-0_5B-Chat_full_sft/summary). Friends with limited computational resources can now dive in and explore it. + +
+View More + +- 【2024.2.6】 [Open-sourced based on the Qwen1_5-0_5B-Chat full-scale fine-tuned version](https://www.modelscope.cn/models/aJupyter/EmoLLM_Qwen1_5-0_5B-Chat_full_sft/summary), friends with limited computing power can start experimenting~ + +

+ Model download count +

+ +- 【2024.2.5】 The project has been promoted by the official WeChat account NLP Engineering. Here's the [link](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) to the article. Welcome everyone to follow!! 🥳🥳 + +

+ WeChat official account QR code +

+ +- 【2024.2.3】 [Project Video](https://www.bilibili.com/video/BV1N7421N76X/) on bilibili 😊 +- 【2024.1.27】 Completed the data construction documentation, fine-tuning guide, deployment guide, README, and other related documents 👏 +- 【2024.1.25】 EmoLLM V1.0 has been deployed online at https://openxlab.org.cn/apps/detail/jujimeizuo/EmoLLM 😀 + +
+ +### Honors + +- The project won the ***Innovation and Creativity Award*** in the **2024 Puyuan Large Model Series Challenge Spring Competition held by the Shanghai Artificial Intelligence Laboratory** + +

+ + Challenge Innovation and Creativity Award +

+ + +- The project has been promoted by the official WeChat account **NLP Engineering**. Here's the [link](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A). + +### Roadmap + +

+ + Roadmap_EN + + +## Contents + +- [EmoLLM - Large Language Model for Mental Health](#emollm---large-language-model-for-mental-health) + - [Recent Updates](#recent-updates) + - [Honors](#honors) + - [Roadmap](#roadmap) + - [Contents](#contents) + - [Pre-development Configuration Requirements.](#pre-development-configuration-requirements) + - [**User Guide**](#user-guide) + - [🍪Quick start](#quick-start) + - [📌Data Construction](#data-construction) + - [🎨Fine-tuning Guide](#fine-tuning-guide) + - [🔧Deployment Guide](#deployment-guide) + - [⚙RAG (Retrieval Augmented Generation) Pipeline](#rag-retrieval-augmented-generation-pipeline) + - [Frameworks Used](#frameworks-used) + - [How to participate in this project](#how-to-participate-in-this-project) + - [Version control](#version-control) + - [Authors (in no particular order)](#authors-in-no-particular-order) + - [Copyright Notice](#copyright-notice) + - [Acknowledgments](#acknowledgments) + - [Star History](#star-history) + - [🌟 Contributors](#-contributors) + - [Communication group](#communication-group) + +###### Pre-development Configuration Requirements. + +- A100 40G (required specifically for InternLM2_7B_chat + QLoRA fine-tuning + DeepSpeed ZeRO-2 optimization) + +###### **User Guide** + +1. Clone the repo + +```sh +git clone https://github.com/SmartFlowAI/EmoLLM.git +``` + +2. Read the sections in order, or jump to the ones you are interested in: + - [Quick Start](#quick-start) + - [Data Construction](#data-construction) + - [Fine-tuning Guide](#fine-tuning-guide) + - [Deployment Guide](#deployment-guide) + - [RAG](#rag-retrieval-augmented-generation-pipeline) + - View More Details + + +### 🍪Quick start +- Please read [Quick Start](docs/quick_start_EN.md) to get started. + +### 📌Data Construction + +- Please read the [Data Construction Guide](generate_data/tutorial_EN.md) for reference. 
+ +- The dataset used for this fine-tuning can be found at [datasets](datasets/data.json) + +### 🎨Fine-tuning Guide + +For details, see the [fine-tuning guide](xtuner_config/README_EN.md) + +### 🔧Deployment Guide + +- Demo deployment: see [deployment guide](./demo/README_EN.md) for details. +- Quantized deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy_EN.md) + +### ⚙RAG (Retrieval Augmented Generation) Pipeline + +- See [RAG](./rag/) + +

+Additional Details + +### Frameworks Used + +- [XTuner](https://github.com/InternLM/xtuner) +- [Transformers](https://github.com/huggingface/transformers) +- [PyTorch](https://pytorch.org/) +- [LMDeploy](https://github.com/InternLM/lmdeploy/): for quantized deployment +- [Streamlit](https://streamlit.io/): for building demos +- [DeepSpeed](https://github.com/microsoft/DeepSpeed): for parallel training +- … + +#### How to participate in this project + +Contributions make the open-source community an excellent place for learning, inspiration, and creation. Any contribution you make is greatly appreciated. + +1. Fork the Project +2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`) +3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`) +4. Push to the Branch (`git push origin feature/AmazingFeature`) +5. Open a Pull Request + +### Version control + +This project uses Git for version control. You can see the currently available versions in the repository. + +
+ +### Authors (in no particular order) + +| Username | School/Organization | Remarks | Contributions | +| :-----------------------------------------------------------: | :------------------------------------------------------------------: | :-----------------------------------------------------------------------: | :-----------------------------------------------------------------------------------: | +| [aJupyter](https://github.com/aJupyter) | Nankai University, Master's student | DataWhale member | Project initiator | +| [MING-ZCH](https://github.com/MING-ZCH) | Huazhong University of Science and Technology, Undergraduate student | LLM X Psychology researcher | Project co-leader | +| [jujimeizuo](https://github.com/jujimeizuo) | Jiangnan University, Master's student | | | +| [Smiling-Weeping-zhr](https://github.com/Smiling-Weeping-zhr) | Harbin Institute of Technology (Weihai), Undergraduate student | | | +| [8baby8](https://github.com/8baby8) | PaddlePaddle Pilot Team Regional Director | Wenxin Large Model core developer | | +| [zxazys](https://github.com/zxazys) | Nankai University, Master's student | | | +| [JasonLLLLLLLLLLL](https://github.com/JasonLLLLLLLLLLL) | SWUFE (Southwestern University of Finance and Economics) | | | +| [MrCatAI](https://github.com/MrCatAI) | AI Mover | | | +| [ZeyuBa](https://github.com/ZeyuBa) | Institute of Automation, Master's student | | | +| [aiyinyuedejustin](https://github.com/aiyinyuedejustin) | University of Pennsylvania, Master's student | | | +| [Nobody-ML](https://github.com/Nobody-ML) | China University of Petroleum (East China), Undergraduate student | | | +| [chg0901](https://github.com/chg0901) | [MiniSora](https://github.com/mini-sora/minisora) | Maintainer and Admin of [MiniSora](https://github.com/mini-sora/minisora) | LLM Pre-Training and Fine-Tuning, Model Uploading, Data Cleaning and Docs Translation | +| [Mxoder](https://github.com/Mxoder) | Beihang University, Undergraduate student | | | +| 
[Anooyman](https://github.com/Anooyman) | Nanjing University of Science and Technology, Master's student | | | +| [Vicky-3021](https://github.com/Vicky-3021) | Xidian University, incoming Master's student | | | +| [SantiagoTOP](https://github.com/santiagoTOP) | Taiyuan University of Technology, Master's student | | | +| [zealot52099](https://github.com/zealot52099) | Individual developer | | Data Processing, LLM finetuning and RAG | +| [wwwyfff](https://github.com/wwwyfff) | Fudan University, Master's student | | | +| [jkhumor](https://github.com/jkhumor) | Nankai University, Master's student | | RAG | +| [lll997150986](https://github.com/lll997150986) | Nankai University, Master's student | | Fine Tuning | +| [nln-maker](https://github.com/nln-maker) | Nankai University, Master's student | | Front-end and back-end development | +| [dream00001](https://github.com/dream00001) | Nankai University, Master's student | | Front-end and back-end development | +| [王几行XING](https://zhihu.com/people/brycewang1898) | Peking University, Master's graduate | | Data Processing, LLM finetuning, Front-end and back-end development | +| [思在] | Peking University, Master's graduate (Microsoft) | | LLM finetuning, Front-end and back-end development | + +### Copyright Notice + +The project is licensed under the MIT License. For details, see + [LICENSE](https://github.com/SmartFlowAI/EmoLLM/blob/master/LICENSE) + +### Acknowledgments + +- [Sanbu](https://github.com/sanbuphy) +- [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/) +- [Vanin](https://github.com/vansin) +- [Bloom up (WeChat Official Account Promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) +- Abu (M.A. 
in Psychology, Peking University) +- [HatBoy](https://github.com/hatboy) + + + + + + + + + +## Star History + +[![Star History Chart](https://api.star-history.com/svg?repos=SmartFlowAI/EmoLLM&type=Date)](https://star-history.com/#SmartFlowAI/EmoLLM&Date) + +## 🌟 Contributors + +[![EmoLLM contributors](https://contrib.rocks/image?repo=SmartFlowAI/EmoLLM&max=50)](https://github.com/SmartFlowAI/EmoLLM/graphs/contributors) + +[your-project-path]: SmartflowAI/EmoLLM +[contributors-shield]: https://img.shields.io/github/contributors/SmartflowAI/EmoLLM.svg?style=flat-square +[contributors-url]: https://github.com/SmartflowAI/EmoLLM/graphs/contributors +[forks-shield]: https://img.shields.io/github/forks/SmartflowAI/EmoLLM.svg?style=flat-square +[forks-url]: https://github.com/SmartflowAI/EmoLLM/network/members +[stars-shield]: https://img.shields.io/github/stars/SmartflowAI/EmoLLM.svg?style=flat-square +[stars-url]: https://github.com/SmartflowAI/EmoLLM/stargazers +[issues-shield]: https://img.shields.io/github/issues/SmartflowAI/EmoLLM.svg?style=flat-square +[issues-url]: https://github.com/SmartflowAI/EmoLLM/issues +[license-shield]: https://img.shields.io/github/license/SmartflowAI/EmoLLM.svg?style=flat-square +[license-url]: https://github.com/SmartflowAI/EmoLLM/blob/main/LICENSE + +[OpenXLab_App-image]: https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg +[OpenXLab_Model-image]: https://cdn-static.openxlab.org.cn/header/openxlab_models.svg +[OpenXLab_App-url]: https://openxlab.org.cn/apps/detail/Farewell1/EmoLLMV2.0 +[OpenXLab_Model-url]: https://openxlab.org.cn/models/detail/ajupyter/EmoLLM_internlm2_7b_full + +## Communication group + +- If the group invitation below no longer works, please go to the Issues section. + +

+ EmoLLM official communication group +

\ No newline at end of file
diff --git a/xtuner_config/README_internlm2_7b_base_qlora.md b/xtuner_config/README_internlm2_7b_base_qlora.md
index 13b8e80..f583b42 100644
--- a/xtuner_config/README_internlm2_7b_base_qlora.md
+++ b/xtuner_config/README_internlm2_7b_base_qlora.md
@@ -2,25 +2,37 @@
 
 ## Base Model and Configuration Files
 
-- Building on the fine-tuning [guide](./README.md) for the [**internlm2_7b_chat_qlora_e3** model](./internlm2_7b_chat_qlora_e3.py), this project adds fine-tuning of the [**internlm2_7b_base_qlora_e3 (configuration file)**](./internlm2_7b_base_qlora_e10_M_1e4_32_64.py) **model**.
+- Building on the [**internlm2_7b_chat_qlora_e3** model configuration file](./internlm2_7b_chat_qlora_e3.py) provided by the XTuner project and on the [EmoLLM fine-tuning guide](./README.md), this project adds QLoRA fine-tuning of the **InternLM2_7B_base model** on the [EmoLLM general dataset](../datasets/README.md). For the configuration file, see [**internlm2_7b_base_qlora_e10_M_1e4_32_64.py**](./internlm2_7b_base_qlora_e10_M_1e4_32_64.py).
+- So that users can reproduce the results and fine-tune on their own hardware, EmoLLM also provides other configuration files for different hardware setups.
+  - [internlm2_7b_base_qlora_e10_b8_16_32.py](./internlm2_7b_base_qlora_e10_b8_16_32.py)
+  - [internlm2_7b_base_qlora_e3_M_1e4_32_64.py](./internlm2_7b_base_qlora_e3_M_1e4_32_64.py)
 
 ## Model Release and Number of Training Epochs
 
-- Because a merged dataset is used, we trained the selected internlm2_7b_base model for **10 epochs**. Readers can stop training and pick a checkpoint based on the training output and the loss curve, or evaluate the model with more rigorous methods.
+- Because a merged dataset is used, we trained the selected InternLM2_7B_base model for **10 epochs**. Readers can stop training and pick a checkpoint based on the training output and the loss curve, or evaluate the model with more rigorous methods.
 
-- Along with the release of the internlm2_7b_base_qlora fine-tuned model, we provide two different weight versions on both OpenXLab and ModelScope for users to use and test. More professional evaluation results will be added soon, so stay tuned.
+- Along with the release of the InternLM2_7B_base QLoRA fine-tuned model, we provide two different weight versions on both OpenXLab and ModelScope for users to use and test. More professional evaluation results will be added soon, so stay tuned.
 
-- **OpenXLab**:
-  - [5 epoch model](https://openxlab.org.cn/models/detail/chg0901/EmoLLM-InternLM7B-base)
-  - [10 epoch model](https://openxlab.org.cn/models/detail/chg0901/EmoLLM-InternLM7B-base-10e)
-
-- **ModelScope**:
-  - [5 epoch model](https://www.modelscope.cn/models/chg0901/EmoLLM-InternLM7B-base/files)
-  - [10 epoch model](https://www.modelscope.cn/models/chg0901/EmoLLM-InternLM7B-base-10e/files)
+  - **OpenXLab**:
+    - [5 epoch model](https://openxlab.org.cn/models/detail/chg0901/EmoLLM-InternLM7B-base)
+    - [10 epoch model](https://openxlab.org.cn/models/detail/chg0901/EmoLLM-InternLM7B-base-10e)
+  - **ModelScope**:
+    - [5 epoch model](https://www.modelscope.cn/models/chg0901/EmoLLM-InternLM7B-base/files)
+    - [10 epoch model](https://www.modelscope.cn/models/chg0901/EmoLLM-InternLM7B-base-10e/files)
+
+- The EmoLLM team has evaluated the QLoRA fine-tuned InternLM2_7B_base models (both the 5 epoch and the 10 epoch model) on **general metrics**; the results are shown in the table below. The 10 epoch QLoRA fine-tuned InternLM2_7B_base model outperforms the other models on these general metrics. Results on professional counseling metrics will be added soon. For more evaluation details, see the [general evaluation results page (General_evaluation.md)](../evaluate/General_evaluation.md) and the [evaluation directory README](../evaluate/README.md).
+
+| Model | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 |
+|----------|---------|---------|---------|---------|---------|---------|---------|
+| Qwen1_5-0_5B-chat | 27.23% | 8.55% | 17.05% | 26.65% | 13.11% | 7.19% | 4.05% |
+| InternLM2_7B_chat_qlora | 37.86% | 15.23% | 24.34% | 39.71% | 22.66% | 14.26% | 9.21% |
+| InternLM2_7B_chat_full | 32.45% | 10.82% | 20.17% | 30.48% | 15.67% | 8.84% | 5.02% |
+| InternLM2_7B_base_qlora_5epoch | 41.94% | 20.21% | 29.67% | 42.98% | 27.07% | 19.33% | 14.62% |
+| **InternLM2_7B_base_qlora_10epoch** | **43.47%** | **22.06%** | **31.4%** | **44.81%** | **29.15%** | **21.44%** | **16.72%** |
 
 ### Hyperparameter Settings
 
-For the full training config, see [**internlm2_7b_base_qlora_e3 (configuration file)**](./internlm2_7b_base_qlora_e10_M_1e4_32_64.py); here we list only the key hyperparameters and those we adjusted.
+For the full training config, see [**`internlm2_7b_base_qlora_e10_M_1e4_32_64.py` (configuration file)**](./internlm2_7b_base_qlora_e10_M_1e4_32_64.py); here we list only the key hyperparameters and those we adjusted.
 
 ```python
 prompt_template = PROMPT_TEMPLATE.internlm2_chat