Dev (#239)
This commit is contained in:
		
						commit
						ad3a1ce58b
					
				
							
								
								
									
										18
									
								
								README.md
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							| @ -172,12 +172,12 @@ | |||||||
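|   - [🔗Framework diagram](#框架图) |   - [🔗Framework diagram](#框架图) | ||||||
|   - [Contents](#目录) |   - [Contents](#目录) | ||||||
|           - [Pre-development configuration requirements](#开发前的配置要求) |           - [Pre-development configuration requirements](#开发前的配置要求) | ||||||
|           - [**User guide**](#使用指南) |           - [User guide](#使用指南) | ||||||
|     - [🍪Quick start](#快速体验) |     - [🍪Quick start](#快速体验) | ||||||
|     - [📌Data construction](#数据构建) |     - [📌Data construction](#数据构建) | ||||||
|     - [🎨Fine-tuning guide](#微调指南) |     - [🎨Fine-tuning guide](#微调指南) | ||||||
|     - [🔧Deployment guide](#部署指南) |     - [🔧Deployment guide](#部署指南) | ||||||
|     - [⚙RAG (Retrieval Augmented Generation) Pipeline](#rag检索增强生成pipeline) |     - [⚙RAG (Retrieval Augmented Generation)](#rag检索增强生成) | ||||||
|     - [Frameworks used](#使用到的框架) |     - [Frameworks used](#使用到的框架) | ||||||
|       - [How to participate in this project](#如何参与本项目) |       - [How to participate in this project](#如何参与本项目) | ||||||
|     - [Authors (in no particular order)](#作者排名不分先后) |     - [Authors (in no particular order)](#作者排名不分先后) | ||||||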
| @ -192,7 +192,7 @@ | |||||||
| 
 | 
 | ||||||
| - Hardware: A100 40G (only for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization) | - Hardware: A100 40G (only for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization) | ||||||
| 
 | 
 | ||||||
| ###### **User guide** | ###### User guide | ||||||
| 
 | 
 | ||||||
| 1. Clone the repo | 1. Clone the repo | ||||||
| 
 | 
 | ||||||
| @ -211,7 +211,8 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git | |||||||
| 
 | 
 | ||||||
| ### 🍪Quick start | ### 🍪Quick start | ||||||
| 
 | 
 | ||||||
| - See [Quick start](docs/quick_start.md) | - See [Quick start](quick_start/quick_start.md) | ||||||
|  | - Hands-on quick start: [Baby EmoLLM](quick_start/Baby_EmoLLM.ipynb) | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| ### 📌Data construction | ### 📌Data construction | ||||||
| @ -229,9 +230,9 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git | |||||||
| - Demo deployment: see the [deployment guide](demo/README.md) | - Demo deployment: see the [deployment guide](demo/README.md) | ||||||
| - Quantized deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy.md) | - Quantized deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy.md) | ||||||
| 
 | 
 | ||||||
| ### ⚙RAG (Retrieval Augmented Generation) Pipeline | ### ⚙RAG (Retrieval Augmented Generation) | ||||||
| 
 | 
 | ||||||
| - See [RAG](./rag/) | - See [RAG](rag/README.md) | ||||||
| 
 | 
 | ||||||
| <details> | <details> | ||||||
| <summary>More details</summary> | <summary>More details</summary> | ||||||
| @ -307,11 +308,10 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git | |||||||
| 
 | 
 | ||||||
| ### Special thanks | ### Special thanks | ||||||
| 
 | 
 | ||||||
| - [Sanbu](https://github.com/sanbuphy) |  | ||||||
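| - [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/) | - [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/) | ||||||
| - [闻星, expert (assistant)](https://github.com/vansin) | - [闻星 (Puyu assistant)](https://github.com/vansin) | ||||||
| - [扫地升 (WeChat official account promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) |  | ||||||
| - 阿布 (M.A. in Psychology, Peking University) | - 阿布 (M.A. in Psychology, Peking University) | ||||||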
|  | - [Sanbu](https://github.com/sanbuphy) | ||||||
| - [HatBoy](https://github.com/hatboy) | - [HatBoy](https://github.com/hatboy) | ||||||
| 
 | 
 | ||||||
| <!-- links --> | <!-- links --> | ||||||
|  | |||||||
							
								
								
									
										20
									
								
								README_EN.md
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							| @ -173,12 +173,12 @@ The Model aims to fully understand and promote the mental health of individuals, | |||||||
|   - [Roadmap](#roadmap) |   - [Roadmap](#roadmap) | ||||||
|   - [Contents](#contents) |   - [Contents](#contents) | ||||||
|           - [Pre-development Configuration Requirements.](#pre-development-configuration-requirements) |           - [Pre-development Configuration Requirements.](#pre-development-configuration-requirements) | ||||||
|           - [**User Guide**](#user-guide) |           - [User Guide](#user-guide) | ||||||
|     - [🍪Quick start](#quick-start) |     - [🍪Quick start](#quick-start) | ||||||
|     - [📌Data Construction](#data-construction) |     - [📌Data Construction](#data-construction) | ||||||
|     - [🎨Fine-tuning Guide](#fine-tuning-guide) |     - [🎨Fine-tuning Guide](#fine-tuning-guide) | ||||||
|     - [🔧Deployment Guide](#deployment-guide) |     - [🔧Deployment Guide](#deployment-guide) | ||||||
|     - [⚙RAG (Retrieval Augmented Generation) Pipeline](#rag-retrieval-augmented-generation-pipeline) |     - [⚙RAG (Retrieval Augmented Generation)](#rag-retrieval-augmented-generation) | ||||||
|     - [Frameworks Used](#frameworks-used) |     - [Frameworks Used](#frameworks-used) | ||||||
|       - [How to participate in this project](#how-to-participate-in-this-project) |       - [How to participate in this project](#how-to-participate-in-this-project) | ||||||
|     - [Version control](#version-control) |     - [Version control](#version-control) | ||||||
| @ -193,7 +193,7 @@ The Model aims to fully understand and promote the mental health of individuals, | |||||||
| 
 | 
 | ||||||
| - A100 40G (specifically for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization) | - A100 40G (specifically for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization) | ||||||
| 
 | 
 | ||||||
| ###### **User Guide** | ###### User Guide | ||||||
| 
 | 
 | ||||||
| 1. Clone the repo | 1. Clone the repo | ||||||
| 
 | 
 | ||||||
| @ -211,7 +211,8 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git | |||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| ### 🍪Quick start | ### 🍪Quick start | ||||||
| - Please read [Quick Start](docs/quick_start_EN.md) to see. | - Please read [Quick Start](quick_start/quick_start_EN.md) to see. | ||||||
|  | - Quick coding: [Baby EmoLLM](quick_start/Baby_EmoLLM.ipynb) | ||||||
| 
 | 
 | ||||||
| ### 📌Data Construction | ### 📌Data Construction | ||||||
| 
 | 
 | ||||||
| @ -228,9 +229,9 @@ For details, see the [fine-tuning guide](xtuner_config/README_EN.md) | |||||||
| - Demo deployment: see [deployment guide](./demo/README_EN.md) for details. | - Demo deployment: see [deployment guide](./demo/README_EN.md) for details. | ||||||
| - Quantitative deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy_EN.md) | - Quantitative deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy_EN.md) | ||||||
| 
 | 
 | ||||||
| ### ⚙RAG (Retrieval Augmented Generation) Pipeline | ### ⚙RAG (Retrieval Augmented Generation) | ||||||
| 
 | 
 | ||||||
| - See [RAG](./rag/) | - See [RAG](rag/README_EN.md) | ||||||
| 
 | 
 | ||||||
| <details> | <details> | ||||||
| <summary>Additional Details</summary> | <summary>Additional Details</summary> | ||||||
| @ -297,11 +298,10 @@ The project is licensed under the MIT License. Please refer to the details | |||||||
| 
 | 
 | ||||||
| ### Acknowledgments | ### Acknowledgments | ||||||
| 
 | 
 | ||||||
| - [Sanbu](https://github.com/sanbuphy) |  | ||||||
| - [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/) | - [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/) | ||||||
| - [Vanin](https://github.com/vansin) | - [Vansin](https://github.com/vansin) | ||||||
| - [Bloom up (WeChat Official Account Promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A) | - A.bu (M.A. in Psychology, Peking University) | ||||||
| - Abu (M.A. in Psychology, Peking University) | - [Sanbuphy](https://github.com/sanbuphy) | ||||||
| - [HatBoy](https://github.com/hatboy) | - [HatBoy](https://github.com/hatboy) | ||||||
| 
 | 
 | ||||||
| <!-- links --> | <!-- links --> | ||||||
|  | |||||||
| @ -1,21 +0,0 @@ | |||||||
| MIT License |  | ||||||
| 
 |  | ||||||
| Copyright (c) 2024 SmartFlowAI |  | ||||||
| 
 |  | ||||||
| Permission is hereby granted, free of charge, to any person obtaining a copy |  | ||||||
| of this software and associated documentation files (the "Software"), to deal |  | ||||||
| in the Software without restriction, including without limitation the rights |  | ||||||
| to use, copy, modify, merge, publish, distribute, sublicense, and/or sell |  | ||||||
| copies of the Software, and to permit persons to whom the Software is |  | ||||||
| furnished to do so, subject to the following conditions: |  | ||||||
| 
 |  | ||||||
| The above copyright notice and this permission notice shall be included in all |  | ||||||
| copies or substantial portions of the Software. |  | ||||||
| 
 |  | ||||||
| THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR |  | ||||||
| IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, |  | ||||||
| FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE |  | ||||||
| AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER |  | ||||||
| LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, |  | ||||||
| OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE |  | ||||||
| SOFTWARE. |  | ||||||
| @ -2,7 +2,7 @@ | |||||||
| 
 | 
 | ||||||
| * Datasets fall into two categories by purpose: **General** and **Role-play** | * Datasets fall into two categories by purpose: **General** and **Role-play** | ||||||
| * Data comes in two formats: **QA** and **Conversation** | * Data comes in two formats: **QA** and **Conversation** | ||||||
| * Summary: General (**6 datasets**); Role-play (**5 datasets**) | * Summary: General (**8 datasets**); Role-play (**5 datasets**) | ||||||
| 
 | 
 | ||||||
| ## Dataset types | ## Dataset types | ||||||
| 
 | 
 | ||||||
|  | |||||||
| @ -2,7 +2,7 @@ | |||||||
| 
 | 
 | ||||||
| * Category of dataset: **General** and **Role-play** | * Category of dataset: **General** and **Role-play** | ||||||
| * Type of data: **QA** and **Conversation** | * Type of data: **QA** and **Conversation** | ||||||
| * Summary: General(**6 datasets**), Role-play(**5 datasets**) | * Summary: General(**8 datasets**), Role-play(**5 datasets**) | ||||||
| 
 | 
 | ||||||
|  ## Category |  ## Category | ||||||
| * **General**: generic dataset, including psychological Knowledge, counseling technology, etc. | * **General**: generic dataset, including psychological Knowledge, counseling technology, etc. | ||||||
|  | |||||||
| @ -1,8 +1,15 @@ | |||||||
| ## There are two .py files: Book_QA_process_Step_1.py and Book_QA_process_Step_2.py | # Book_QA_process | ||||||
|  | 
 | ||||||
|  | There are two Python files: Book_QA_process_Step_1.py and Book_QA_process_Step_2.py | ||||||
|  | 
 | ||||||
| ### Book_QA_process_Step_1.py | ### Book_QA_process_Step_1.py | ||||||
|     This script converts the QA-pair JSONL data we generated into JSON format | 
 | ||||||
|  | * This script converts the QA-pair JSONL data we generated into JSON format | ||||||
|  | 
 | ||||||
| ### Book_QA_process_Step_2.py | ### Book_QA_process_Step_2.py | ||||||
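|     This script converts the JSON data produced in step 1 into a format usable for instruction fine-tuning and adds a system field, i.e.: | * This script converts the JSON data produced in step 1 into a format usable for instruction fine-tuning and adds a system field, i.e.: | ||||||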
|  | 
 | ||||||
|  |   ```json | ||||||
|     { |     { | ||||||
|         "conversation": [ |         "conversation": [ | ||||||
|             { |             { | ||||||
| @ -12,3 +19,4 @@ | |||||||
|             } |             } | ||||||
|         ] |         ] | ||||||
|     } |     } | ||||||
|  | ``` | ||||||
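The two Book_QA_process steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual scripts: the function names `jsonl_to_json` and `to_conversations`, the record keys `question` and `answer`, and the `system`/`input`/`output` fields inside `conversation` are all assumptions, because the diff hunk elides the inner fields of the target format.

```python
import json
from pathlib import Path


def jsonl_to_json(jsonl_path, json_path):
    """Step 1 (sketch): read one JSON object per line, write a single JSON array."""
    records = []
    with Path(jsonl_path).open("r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if line:  # skip blank lines between records
                records.append(json.loads(line))
    with Path(json_path).open("w", encoding="utf-8") as dst:
        # ensure_ascii=False keeps Chinese text readable in the output file
        json.dump(records, dst, ensure_ascii=False, indent=4)
    return records


def to_conversations(records, system_prompt):
    """Step 2 (sketch): wrap each QA pair in a single-turn conversation.

    The keys "question"/"answer" and "system"/"input"/"output" are assumed
    for illustration; adjust them to match the real data.
    """
    return [
        {
            "conversation": [
                {
                    "system": system_prompt,
                    "input": rec["question"],
                    "output": rec["answer"],
                }
            ]
        }
        for rec in records
    ]
```

Keeping the two steps as separate functions mirrors the two-script layout: the intermediate JSON file can be inspected before the system prompt is attached.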
| @ -2,7 +2,7 @@ import json | |||||||
| 
 | 
 | ||||||
| # Open the JSON file and read its contents | # Open the JSON file and read its contents | ||||||
| 
 | 
 | ||||||
| file_name = 'ruozhiba_raw.jsonl'  | file_name = '../ruozhiba_raw.jsonl'  | ||||||
| 
 | 
 | ||||||
| # with open(f'data/{file_name}', 'r', encoding='utf-8') as file: | # with open(f'data/{file_name}', 'r', encoding='utf-8') as file: | ||||||
| #     data = json.load(file) | #     data = json.load(file) | ||||||