Merge pull request #41 from SmartFlowAI/dev

update readme
xzw 2024-03-03 19:31:22 +08:00 committed by GitHub
commit 0d9e6429cb
12 changed files with 428 additions and 21 deletions


@@ -205,6 +205,8 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
[Nobody-ML](https://github.com/Nobody-ML)@Undergraduate at China University of Petroleum (East China)
[chg0901](https://github.com/chg0901)@PhD student at Kwangwoon University, South Korea
### Copyright Notice
This project is licensed under the MIT License. For details, please see [LICENSE](https://github.com/aJupyter/EmoLLM/blob/master/LICENSE)


@@ -206,7 +206,9 @@ This project uses Git for version control. You can see the current available versions
[aiyinyuedejustin](https://github.com/aiyinyuedejustin)@Master's student at University of Pennsylvania
[Nobody-ML](https://github.com/Nobody-ML)@Undergraduate at China University of Petroleum (East China)
[chg0901](https://github.com/chg0901)@PhD student at Kwangwoon University, South Korea
### Copyright Notice


@@ -15,17 +15,16 @@ pip install -r requirements.txt
```
- Download the model
- Model weights: https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model
- Download via openxlab.model.download; see [cli_internlm2](./cli_internlm2.py) for details
```python
from openxlab.model import download
download(model_repo='jujimeizuo/EmoLLM_Model', output='model')
```
- You can also download the weights manually, place them in the `./model` directory, and then delete the code above.
- cli_demo

demo/README_EN.md (new file)

@@ -0,0 +1,59 @@
# Deploying Guide for EmoLLM
## Local Deployment
- Clone repo
```bash
git clone https://github.com/aJupyter/EmoLLM.git
```
- Install dependencies
```bash
pip install -r requirements.txt
```
- Download the model
- Model weights: https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model
- Download via openxlab.model.download, see [cli_internlm2](./cli_internlm2.py) for details
```python
from openxlab.model import download
download(model_repo='jujimeizuo/EmoLLM_Model', output='model')
```
- You can also download manually and place it in the `./model` directory, then delete the above code.
- cli_demo
```bash
python ./demo/cli_internlm2.py
```
- web_demo
```bash
python ./app.py
```
If deploying on a server, you need to configure local port mapping.
## Deploy on OpenXLab
- Log in to OpenXLab and create a Gradio application
![Login OpenXLab](../assets/deploy_1.png)
- Select configurations and create the project
![config](../assets/deploy_2.png)
- Wait for the build and startup
![wait a few minutes](../assets/deploy_3.png)
- Try your own project
![enjoy](../assets/deploy_4.png)


@@ -4,16 +4,15 @@
This document explains how to use the `eval.py` and `metric.py` scripts, which evaluate the generated results of EmoLLM, the mental health large model.
## Installation
- Python 3.x
- PyTorch
- Transformers
- Datasets
- NLTK
- Rouge
- Jieba
They can be installed with the following command:
@@ -24,6 +23,7 @@ pip install torch transformers datasets nltk rouge jieba
## Usage
### convert.py
Converts the original multi-turn dialogue data into single-turn data for evaluation.
### eval.py
@@ -39,8 +39,6 @@ pip install torch transformers datasets nltk rouge jieba
The `metric.py` script contains the functions that compute the evaluation metrics; it can be set to evaluate at the character level or the word level, and currently includes BLEU and ROUGE scores.
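For illustration, here is a minimal scoring sketch built on the same libraries listed under Installation (jieba, nltk, rouge); the function names are illustrative, not `metric.py`'s actual API:
```python
import jieba
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge import Rouge

def tokenize(text, char_level=True):
    # Character level splits into individual characters; word level uses jieba.
    return list(text.replace(" ", "")) if char_level else list(jieba.cut(text))

def bleu_rouge(reference, hypothesis, char_level=True):
    ref, hyp = tokenize(reference, char_level), tokenize(hypothesis, char_level)
    bleu1 = sentence_bleu([ref], hyp, weights=(1, 0, 0, 0),
                          smoothing_function=SmoothingFunction().method1)
    rouge_l = Rouge().get_scores(" ".join(hyp), " ".join(ref))[0]["rouge-l"]["f"]
    return bleu1, rouge_l
```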
## Test Results
The results of testing on the data in data.json are as follows:


@@ -7,6 +7,7 @@
## Evaluation Method
This evaluation adopts the metrics and method proposed in the paper "CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling".
* Metrics: Comprehensiveness, Professionalism, Authenticity, Safety
* Method: Turn-Based Dialogue Evaluation
* Dataset: CPsyCounE
@@ -22,6 +23,7 @@
| Safety | 1.00 |
## Comparison
* [EmoLLM](https://openxlab.org.cn/models/detail/jujimeizuo/EmoLLM_Model) improves substantially over its InternLM2-7B-Chat base, and performs comparably to role-playing ChatGPT on psychological counseling tasks
* The comparison figure comes from the paper "CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling"


@@ -2,7 +2,7 @@
## General Metrics Evaluation
* For specific metrics and methods, see [General_evaluation.md](./General_evaluation.md)
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 |
|----------|---------|---------|---------|---------|---------|---------|---------|
@@ -11,7 +11,7 @@
## Professional Metrics Evaluation
* For specific metrics and methods, see [Professional_evaluation.md](./Professional_evaluation.md)
| Metric | Value |
|-------------------|------------|

evaluate/README_EN.md (new file)

@@ -0,0 +1,26 @@
# EmoLLM Evaluation
## General Metrics Evaluation
* For specific metrics and methods, see [General_evaluation.md](./General_evaluation_EN.md)
| Metric | Value |
|---------|----------------------|
| ROUGE-1 | 27.23% |
| ROUGE-2 | 8.55% |
| ROUGE-L | 17.05% |
| BLEU-1 | 26.65% |
| BLEU-2 | 13.11% |
| BLEU-3 | 7.19% |
| BLEU-4 | 4.05% |
## Professional Metrics Evaluation
* For specific metrics and methods, see [Professional_evaluation_EN.md](./Professional_evaluation_EN.md)
| Metric | Value |
|-------------------|------------|
| Comprehensiveness | 1.32 |
| Professionalism | 2.20 |
| Authenticity | 2.10 |
| Safety | 1.00 |


@@ -0,0 +1,231 @@
# ChatGLM3-6B
## Environment Preparation
In practice, we have two platforms available for selection.
* Rent a machine with an RTX 3090 GPU (24 GB) on the [autodl](https://www.autodl.com/) platform. Select the image as shown: `PyTorch` --> `2.0.0` --> `3.8(ubuntu20.04)` --> `11.8`.
![autodl](Images/autodl.png)
* On the [InternStudio](https://studio.intern-ai.org.cn/) platform, choose the A100 (1/4) configuration. Select the image as shown: `Cuda11.7-conda`.
![internstudio](Images/internstudio.png)
In the Terminal, update pip and install dependencies.
```shell
# Upgrade pip
python -m pip install --upgrade pip
pip install modelscope==1.9.5
pip install transformers==4.35.2
pip install streamlit==1.24.0
pip install sentencepiece==0.1.99
pip install accelerate==0.24.1
pip install peft==0.4.0
pip install datasets==2.10.1
```
## Download Models
Use the `modelscope` function `snapshot_download` to download the model. The first parameter is the model name, and the parameter `cache_dir` is the download path of the model.
Create a `download.py` file in the `/root/autodl-tmp` directory and enter the following content. After pasting the code, remember to save the file as shown in the figure. Run `python /root/autodl-tmp/download.py` to execute the download. The model is about 14 GB, and the download takes roughly 10 to 20 minutes.
```python
from modelscope import snapshot_download

# Download ChatGLM3-6B (about 14 GB) into /root/autodl-tmp
model_dir = snapshot_download('ZhipuAI/chatglm3-6b', cache_dir='/root/autodl-tmp', revision='master')
```
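As a quick optional sanity check (assuming the download completed), the returned `model_dir` can be used to load the tokenizer directly:
```python
from transformers import AutoTokenizer

# model_dir is returned by snapshot_download above; ChatGLM3's custom
# tokenizer requires trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
print(tokenizer("Hello"))
```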
For more information about ChatGLM3, please refer to [self-llm](https://github.com/datawhalechina/self-llm)
## Constructing the Instruction Set
Fine-tuning an LLM generally means instruction tuning. Instruction tuning means that the fine-tuning data we use takes the form:
```json
{
"system":"Answer the following user question and only output the answer.",
"input":"1+1 equals what?",
"output":"2"
}
```
Here, `system` is the instruction, informing the model of the task it needs to complete; `input` is the user's input, which is required to carry out that instruction; `output` is the answer the model should give.
That is, our core training goal is to give the model the ability to understand and follow user instructions. When constructing the instruction set, we should therefore build task-specific instructions for our own objective. Since our goal is a personalized LLM that can simulate a psychological counselor solving psychological problems, we construct instructions like this:
```json
{
"system": "Now you are a psychological expert, I have some psychological problems, please help me solve them with professional knowledge.",
"input":"I feel like I always procrastinate and have trouble concentrating on completing tasks.",
"output":"Procrastination is a common problem, but it can be overcome. First, you can try to set yourself a clear goal and break it down into smaller tasks. This way, you will find the task easier to manage. In addition, maintaining a tidy and orderly work environment also helps improve work efficiency. Finally, when you complete a task, don't forget to give yourself some rewards to motivate yourself to keep moving forward."
}
```
## Data Formatting
Data for `LoRA` training needs to be formatted and encoded before being fed to the model. As anyone familiar with training `PyTorch` models knows, we usually encode the input text as `input_ids` and the output text as `labels`; the encoded results are multi-dimensional vectors. We first define a preprocessing function that encodes the input and output text of each sample and returns an encoded dictionary:
```python
def process_func(example):
    MAX_LENGTH = 512
    # Encode the prompt: system tag, fixed system prompt, user tag, then the
    # sample's own system + input, closed by the assistant tag.
    instruction = tokenizer.encode(
        text="\n".join([
            "<|system|>",
            "Now you are a psychological expert, I have some psychological problems, please help me solve them with your professional knowledge.",
            "<|user|>",
            example["system"] + example["input"] + "<|assistant|>"
        ]).strip() + "\n",
        add_special_tokens=True, truncation=True, max_length=MAX_LENGTH)
    response = tokenizer.encode(text=example["output"], add_special_tokens=False,
                                truncation=True, max_length=MAX_LENGTH)
    input_ids = instruction + response + [tokenizer.eos_token_id]
    # Mask the prompt portion with pad tokens so that, after the -100 swap
    # below, loss is only computed on the response.
    labels = [tokenizer.pad_token_id] * len(instruction) + response + [tokenizer.eos_token_id]
    # Pad to a fixed length (and truncate if instruction + response overflow MAX_LENGTH).
    pad_len = MAX_LENGTH - len(input_ids)
    input_ids = (input_ids + [tokenizer.pad_token_id] * pad_len)[:MAX_LENGTH]
    labels = (labels + [tokenizer.pad_token_id] * pad_len)[:MAX_LENGTH]
    labels = [(l if l != tokenizer.pad_token_id else -100) for l in labels]
    return {
        "input_ids": input_ids,
        "labels": labels
    }
```
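With `process_func` defined, the whole dataset can be tokenized ahead of training. A minimal sketch, assuming the instruction data sits in a local JSON file (the path is illustrative) with the `system`/`input`/`output` fields shown above:
```python
import pandas as pd
from datasets import Dataset

# Load the instruction data and apply process_func to every sample;
# remove_columns drops the raw text fields so only input_ids/labels remain.
df = pd.read_json('./data/data.json')
ds = Dataset.from_pandas(df)
tokenized_id = ds.map(process_func, remove_columns=ds.column_names)
```
The resulting `tokenized_id` is what the `Trainer` below receives as `train_dataset`.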
After formatting, each piece of data sent into the model is a dictionary containing two key-value pairs: `input_ids` and `labels`. `input_ids` is the encoding of the input text, and `labels` is the encoding of the output text. The decoded result should appear as follows:
```text
[gMASK]sop <|system|>
Now you are a psychological expert, I have some psychological problems, please help me solve them with your professional knowledge.
<|user|>
My team atmosphere is great, and all my colleagues are very friendly. Moreover, we often go out together to play, feeling like a big family.\n <|assistant|>
This is a great working environment, and having good interpersonal relationships and teamwork can indeed bring a lot of happiness. However, I also understand that you may encounter some challenges in your work, such as task pressure or conflicts with colleagues. Have you ever thought about how to deal with these issues?
```
Why this form? Good question! Different models have different formatting requirements for their inputs, so we need to check the base model's training source code for the exact format. Since LoRA fine-tuning on top of the original instruction format should give the best results, we keep the original model's input format. Here is the link to the source code for those who want to explore it on their own:
[hugging face ChatGLM3 repository](https://github.com/THUDM/ChatGLM3/blob/main/finetune_chatmodel_demo/preprocess_utils.py): The `InputOutputDataset` class can be found here.
Additionally, you can refer to this repository for data processing of ChatGLM [LLaMA-Factory](https://github.com/KMnO4-zx/LLaMA-Factory/blob/main/src/llmtuner/data/template.py).
## Loading the tokenizer and half-precision model
The model is loaded in half precision. If you have a newer graphics card, you can load it with `torch.bfloat16`. For custom models, always set the `trust_remote_code` parameter to `True`.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('./model/chatglm3-6b', use_fast=False, trust_remote_code=True)
# Load in half precision; on a relatively new GPU you can pass torch_dtype=torch.bfloat16 instead.
model = AutoModelForCausalLM.from_pretrained('./model/chatglm3-6b', trust_remote_code=True, torch_dtype=torch.half, device_map="auto")
```
## Defining LoraConfig
The `LoraConfig` class lets you set many parameters, but only a few are central. I'll explain them briefly; those interested can read the source code directly.
- `task_type`: the task type (causal language modeling here).
- `target_modules`: the names of the model layers to train, mainly the layers in the `attention` part. The names differ between models, and can be passed as an array, a string, or a regular expression.
- `r`: the rank of `LoRA`; see the LoRA paper for the principle.
- `lora_alpha`: the `LoRA alpha`; its exact role is also covered in the LoRA paper.
- `modules_to_save`: modules that get no LoRA adapter but should still be fully trained and saved.
What is this `LoRA` scaling about? It is not `r` (the rank): the scaling factor is `lora_alpha / r`, so in this `LoraConfig` the scaling is 4. Scaling does not change the number of LoRA parameters; it simply rescales the adapter's values linearly.
```python
from peft import LoraConfig, TaskType

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["query_key_value"],
    inference_mode=False,  # training mode
    r=8,                   # LoRA rank
    lora_alpha=32,         # LoRA alpha; see the LoRA paper for details
    lora_dropout=0.1       # dropout ratio
)
```
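To actually attach the adapters, wrap the model with this config via `peft` (a short sketch; `get_peft_model` and `print_trainable_parameters` are standard `peft` calls):
```python
from peft import get_peft_model

# Wrap the base model; only the LoRA adapter weights remain trainable.
model = get_peft_model(model, config)
model.print_trainable_parameters()  # shows how small the trainable fraction is
```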
## Customizing TrainingArguments Parameters
The source code of the `TrainingArguments` class documents what each parameter does; you are encouraged to explore it yourself. Here are a few commonly used ones.
- `output_dir`: the output path for the model.
- `per_device_train_batch_size`: as the name suggests, the `batch_size`.
- `gradient_accumulation_steps`: gradient accumulation. If GPU memory is tight, set a smaller `batch_size` and increase the gradient accumulation steps.
- `logging_steps`: output a `log` every this many steps.
- `num_train_epochs`: as the name suggests, the number of `epoch`s.
- `gradient_checkpointing`: gradient checkpointing. Once enabled, the model must call `model.enable_input_require_grads()`, as shown in the snippet after this list; the principle behind it is left for you to explore.
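As mentioned in the last item, here is a minimal sketch of that call, run once after loading the model:
```python
# Required when gradient_checkpointing=True: make the inputs require
# gradients so the LoRA branches receive a gradient signal.
model.enable_input_require_grads()
```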
```python
from transformers import DataCollatorForSeq2Seq, TrainingArguments

# The GLM source repository restructured its own data_collator, and we keep the same
# behavior here: label_pad_token_id=-100 keeps padded positions out of the loss.
data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    model=model,
    label_pad_token_id=-100,
    pad_to_multiple_of=None,
    padding=False
)

args = TrainingArguments(
    output_dir="./output/ChatGLM",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    logging_steps=10,
    num_train_epochs=3,
    gradient_checkpointing=True,
    save_steps=100,
    learning_rate=1e-4,
)
```
### Training with Trainer
Pass in the model, the arguments set above, and the dataset. OK! Start training!
```python
from transformers import Trainer

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_id,
    data_collator=data_collator,
)
trainer.train()
```
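When training finishes, the adapter can also be saved explicitly (a sketch; the path is illustrative, and `Trainer` already writes periodic checkpoints such as the `checkpoint-1000` used in the Reloading section, according to `save_steps` above):
```python
# Save the LoRA adapter weights and the tokenizer for later reuse.
model.save_pretrained("./output/ChatGLM/adapter")
tokenizer.save_pretrained("./output/ChatGLM/adapter")
```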
## Inference
You can use this fairly classic approach for inference.
```python
model = model.cuda()  # move the model to the GPU once, before the loop
while True:
    # inference, using the same system prompt as during training
    input_text = input("User >>>")
    ipt = tokenizer("<|system|>\nNow you are a psychological expert, I have some psychological problems, please help me solve them with your professional knowledge.\n<|user|>\n {}\n{}".format(input_text, "").strip() + "<|assistant|>\n", return_tensors="pt").to(model.device)
    print(tokenizer.decode(model.generate(**ipt, max_length=128, do_sample=True)[0], skip_special_tokens=True))
```
## Reloading
Models fine-tuned with PEFT can be reloaded for inference as follows:
- Load the source model and tokenizer;
- Use `PeftModel` to attach the PEFT fine-tuned parameters on top of the source model.
```python
from peft import PeftModel
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("./model/chatglm3-6b", trust_remote_code=True, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained("./model/chatglm3-6b", use_fast=False, trust_remote_code=True)
# Load the LoRA weights obtained from training on top of the base model.
p_model = PeftModel.from_pretrained(model, model_id="./output/ChatGLM/checkpoint-1000/")

p_model = p_model.cuda()  # generate with p_model so the LoRA weights are actually used
while True:
    # inference, using the same system prompt as during training
    input_text = input("User >>>")
    ipt = tokenizer("<|system|>\nNow you are a psychological expert, I have some psychological problems, please help me solve them with your professional knowledge.\n<|user|>\n {}\n{}".format(input_text, "").strip() + "<|assistant|>\n", return_tensors="pt").to(p_model.device)
    print(tokenizer.decode(p_model.generate(**ipt, max_length=128, do_sample=True)[0], skip_special_tokens=True))
```
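If you would rather serve the fine-tuned model without `peft` at runtime, the adapter can be folded into the base weights (a sketch; `merge_and_unload` is `peft`'s standard merge call, and the output path is illustrative):
```python
# Merge the LoRA weights into the base model and save a standalone copy.
merged_model = p_model.merge_and_unload()
merged_model.save_pretrained("./output/ChatGLM/merged")
tokenizer.save_pretrained("./output/ChatGLM/merged")
```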


@@ -0,0 +1,87 @@
# Fine-Tuning Guide
- This project has been fine-tuned not only on mental health datasets but also on self-awareness data; the detailed fine-tuning guide follows.
## I. Fine-Tuning Based on Xtuner 🎉🎉🎉🎉🎉
### Environment Setup
```text
datasets==2.16.1
deepspeed==0.13.1
einops==0.7.0
flash_attn==2.5.0
mmengine==0.10.2
openxlab==0.0.34
peft==0.7.1
sentencepiece==0.1.99
torch==2.1.2
transformers==4.36.2
xtuner==0.1.11
```
You can also install them all at once by running:
```bash
cd xtuner_config/
pip3 install -r requirements.txt
```
---
### Fine-Tuning
```bash
cd xtuner_config/
xtuner train internlm2_7b_chat_qlora_e3.py --deepspeed deepspeed_zero2
```
---
### Convert the Obtained PTH Model to a HuggingFace Model
**That is: Generate the Adapter folder**
```bash
cd xtuner_config/
mkdir hf
export MKL_SERVICE_FORCE_INTEL=1
xtuner convert pth_to_hf internlm2_7b_chat_qlora_e3.py ./work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy/epoch_3.pth ./hf
```
---
### Merge the HuggingFace Adapter with the Large Language Model
```bash
xtuner convert merge ./internlm2-chat-7b ./hf ./merged --max-shard-size 2GB
# xtuner convert merge \
# ${NAME_OR_PATH_TO_LLM} \
# ${NAME_OR_PATH_TO_ADAPTER} \
# ${SAVE_PATH} \
# --max-shard-size 2GB
```
---
### Testing
```bash
cd demo/
python cli_internlm2.py
```
---
## II. Fine-Tuning Based on Transformers🎉🎉🎉🎉🎉
- Please refer to the [ChatGLM3-6b lora fine-tuning guide](ChatGLM3-6b-ft.md).
---
## Other
Feel free to give [xtuner](https://github.com/InternLM/xtuner) and [EmoLLM](https://github.com/aJupyter/EmoLLM) a star~
🎉🎉🎉🎉🎉


@@ -1 +1 @@
This folder contains all related files and images.


@@ -0,0 +1 @@
This folder contains all related files and images.