Update README.md
parent 6c870d350d
commit bee025ca0f

README.md (18 lines changed)
@@ -172,12 +172,12 @@
- [🔗Framework diagram](#框架图)
- [Contents](#目录)
- [Pre-development configuration requirements](#开发前的配置要求)
- [**User guide**](#使用指南)
- [User guide](#使用指南)
- [🍪Quick start](#快速体验)
- [📌Data construction](#数据构建)
- [🎨Fine-tuning guide](#微调指南)
- [🔧Deployment guide](#部署指南)
- [⚙RAG (retrieval-augmented generation) pipeline](#rag检索增强生成pipeline)
- [⚙RAG (retrieval-augmented generation)](#rag检索增强生成)
- [Frameworks used](#使用到的框架)
- [How to participate in this project](#如何参与本项目)
- [Authors (in no particular order)](#作者排名不分先后)
@@ -192,7 +192,7 @@

- Hardware: A100 40G (only for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization)

###### **User guide**
###### User guide

1. Clone the repo
@@ -211,7 +211,8 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git

### 🍪Quick start

- Please read [Quick start](docs/quick_start.md) for details
- Please read [Quick start](quick_start/quick_start.md) for details
- Getting started: [Baby EmoLLM](quick_start/Baby_EmoLLM.ipynb)


### 📌Data construction
@@ -229,9 +230,9 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
- Demo deployment: see the [deployment guide](demo/README.md)
- Quantized deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy.md)

### ⚙RAG (retrieval-augmented generation) pipeline
### ⚙RAG (retrieval-augmented generation)

- See [RAG](./rag/)
- See [RAG](rag/README.md)

<details>
<summary>More details</summary>
@@ -307,11 +308,10 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git

### Special thanks

- [Sanbu](https://github.com/sanbuphy)
- [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/)
- [闻星 (assistant)](https://github.com/vansin)
- [扫地升 (WeChat official account promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A)
- [闻星 (浦语 assistant)](https://github.com/vansin)
- 阿布 (M.A. in Psychology, Peking University)
- [Sanbu](https://github.com/sanbuphy)
- [HatBoy](https://github.com/hatboy)

<!-- links -->

README_EN.md (20 lines changed)
@@ -173,12 +173,12 @@ The Model aims to fully understand and promote the mental health of individuals,
- [Roadmap](#roadmap)
- [Contents](#contents)
- [Pre-development Configuration Requirements.](#pre-development-configuration-requirements)
- [**User Guide**](#user-guide)
- [User Guide](#user-guide)
- [🍪Quick start](#quick-start)
- [📌Data Construction](#data-construction)
- [🎨Fine-tuning Guide](#fine-tuning-guide)
- [🔧Deployment Guide](#deployment-guide)
- [⚙RAG (Retrieval Augmented Generation) Pipeline](#rag-retrieval-augmented-generation-pipeline)
- [⚙RAG (Retrieval Augmented Generation)](#rag-retrieval-augmented-generation)
- [Frameworks Used](#frameworks-used)
- [How to participate in this project](#how-to-participate-in-this-project)
- [Version control](#version-control)
@@ -193,7 +193,7 @@ The Model aims to fully understand and promote the mental health of individuals,

- A100 40G (specifically for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization)

###### **User Guide**
###### User Guide

1. Clone the repo
@@ -211,7 +211,8 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git


### 🍪Quick start
- Please read [Quick Start](docs/quick_start_EN.md) to see.
- Please read [Quick Start](quick_start/quick_start_EN.md) to see.
- Quick coding: [Baby EmoLLM](quick_start/Baby_EmoLLM.ipynb)

### 📌Data Construction
@@ -228,9 +229,9 @@ For details, see the [fine-tuning guide](xtuner_config/README_EN.md)
- Demo deployment: see [deployment guide](./demo/README_EN.md) for details.
- Quantitative deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy_EN.md)

### ⚙RAG (Retrieval Augmented Generation) Pipeline
### ⚙RAG (Retrieval Augmented Generation)

- See [RAG](./rag/)
- See [RAG](rag/README_EN.md)

<details>
<summary>Additional Details</summary>
@@ -297,11 +298,10 @@ The project is licensed under the MIT License. Please refer to the details

### Acknowledgments

- [Sanbu](https://github.com/sanbuphy)
- [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/)
- [Vanin](https://github.com/vansin)
- [Bloom up (WeChat Official Account Promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A)
- Abu (M.A. in Psychology, Peking University)
- [Vansin](https://github.com/vansin)
- A.bu (M.A. in Psychology, Peking University)
- [Sanbuphy](https://github.com/sanbuphy)
- [HatBoy](https://github.com/hatboy)

<!-- links -->
@@ -1,21 +0,0 @@
MIT License

Copyright (c) 2024 SmartFlowAI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -2,7 +2,7 @@

* Datasets fall into two types by purpose: **General** and **Role-play**
* Data falls into two types by format: **QA** and **Conversation**
* Summary: General (**6 datasets**); Role-play (**5 datasets**)
* Summary: General (**8 datasets**); Role-play (**5 datasets**)

## Dataset types
@@ -2,7 +2,7 @@

* Category of dataset: **General** and **Role-play**
* Type of data: **QA** and **Conversation**
* Summary: General(**6 datasets**), Role-play(**5 datasets**)
* Summary: General(**8 datasets**), Role-play(**5 datasets**)

## Category
* **General**: generic dataset, including psychological Knowledge, counseling technology, etc.
@@ -1,8 +1,15 @@
## There are two .py files: Book_QA_process_Step_1.py and Book_QA_process_Step_2.py
# Book_QA_process

There are two Python files: Book_QA_process_Step_1.py and Book_QA_process_Step_2.py

### Book_QA_process_Step_1.py
This script converts the generated QA-pair jsonl data to json format

* This script converts the generated QA-pair jsonl data to json format

### Book_QA_process_Step_2.py
This script converts the json data produced in step 1 into a format usable for instruction fine-tuning and adds a system field, i.e.:
* This script converts the json data produced in step 1 into a format usable for instruction fine-tuning and adds a system field, i.e.:

```json
{
    "conversation": [
        {
@@ -11,4 +18,5 @@
            "output": "Answer"
        }
    ]
}
}
```
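
The two steps described in this README can be sketched roughly as follows. This is a minimal illustration and not the project's actual scripts: the source field names `question` and `answer`, the target keys `system` and `input`, and the `SYSTEM_PROMPT` text are all assumptions, since the diff only shows the `conversation` wrapper and the `output` key.

```python
import json

# Hypothetical system prompt; the one used by the project is not shown in this diff.
SYSTEM_PROMPT = "You are a helpful counseling assistant."


def jsonl_to_records(jsonl_text):
    """Step 1 (sketch): parse JSONL text, one JSON object per line, into a list."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def to_conversation(record, system=SYSTEM_PROMPT):
    """Step 2 (sketch): wrap one QA pair in a single-turn conversation entry."""
    return {
        "conversation": [
            {
                "system": system,             # assumed key name
                "input": record["question"],  # assumed source and target key names
                "output": record["answer"],   # "output" appears in the README example
            }
        ]
    }


if __name__ == "__main__":
    raw = '{"question": "Q1", "answer": "A1"}\n{"question": "Q2", "answer": "A2"}\n'
    dataset = [to_conversation(r) for r in jsonl_to_records(raw)]
    print(json.dumps(dataset, ensure_ascii=False, indent=4))
```

Passing `ensure_ascii=False` keeps Chinese text readable in the written JSON instead of escaping it.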
@@ -2,7 +2,7 @@ import json

# Open the JSON file and read its contents

file_name = 'ruozhiba_raw.jsonl'
file_name = '../ruozhiba_raw.jsonl'

# with open(f'data/{file_name}', 'r', encoding='utf-8') as file:
# data = json.load(file)
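
A relative path such as `'../ruozhiba_raw.jsonl'` is resolved against the current working directory, so whether it is found depends on where the script is launched from. The sketch below shows one possible way to anchor the path to a known directory and to read the file line by line, since `json.load` expects a single JSON document rather than one object per line. The helper names `resolve_data_path` and `read_jsonl` are hypothetical and do not come from the repository.

```python
import json
from pathlib import Path


def resolve_data_path(file_name, base_dir):
    """Resolve file_name against base_dir instead of the current working directory."""
    return (Path(base_dir) / file_name).resolve()


def read_jsonl(path):
    """Read a JSONL file: one JSON object per non-empty line."""
    records = []
    with Path(path).open("r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

In a script, `base_dir` could be `Path(__file__).resolve().parent`, so that `resolve_data_path('../ruozhiba_raw.jsonl', base_dir)` points one level above the script's own folder regardless of the launch directory.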