diff --git a/deploy/Imdeploy_EN.md b/deploy/Imdeploy_EN.md new file mode 100644 index 0000000..cb6049d --- /dev/null +++ b/deploy/Imdeploy_EN.md @@ -0,0 +1,227 @@ +![](../assets/emoxlmdeploy.png) +# Local deployment of LMDeploy +## 1.Environment configuration +
+ +Specific deployment environment + +Package Version + +------------------------- ----------- + +accelerate 0.27.2 +addict 2.4.0 +aiofiles 23.2.1 +aiohttp 3.9.3 +aiosignal 1.3.1 +aliyun-python-sdk-core 2.14.0 +aliyun-python-sdk-kms 2.16.2 +altair 5.2.0 +annotated-types 0.6.0 +anyio 4.2.0 +async-timeout 4.0.3 +attrs 23.2.0 +blinker 1.7.0 +Brotli 1.0.9 +cachetools 5.3.3 +certifi 2023.11.17 +cffi 1.16.0 +charset-normalizer 2.0.4 +click 8.1.7 +contourpy 1.2.0 +crcmod 1.7 +cryptography 41.0.3 +cycler 0.12.1 +datasets 2.17.0 +dill 0.3.8 +einops 0.7.0 +exceptiongroup 1.2.0 +fastapi 0.109.2 +ffmpy 0.3.2 +filelock 3.13.1 +fire 0.5.0 +flash-attn 2.4.2 +fonttools 4.49.0 +frozenlist 1.4.1 +fsspec 2023.10.0 +fuzzywuzzy 0.18.0 +gitdb 4.0.11 +GitPython 3.1.42 +gmpy2 2.1.2 +gradio 3.50.2 +gradio_client 0.6.1 +h11 0.14.0 +httpcore 1.0.3 +httpx 0.26.0 +huggingface-hub 0.20.3 +idna 3.4 +importlib-metadata 6.11.0 +importlib-resources 6.1.1 +Jinja2 3.1.2 +jmespath 0.10.0 +jsonschema 4.21.1 +jsonschema-specifications 2023.12.1 +kiwisolver 1.4.5 +lmdeploy 0.2.4 +markdown-it-py 3.0.0 +MarkupSafe 2.1.1 +matplotlib 3.8.3 +mdurl 0.1.2 +mkl-fft 1.3.8 +mkl-random 1.2.4 +mkl-service 2.4.0 +mmengine-lite 0.10.3 +mpmath 1.3.0 +multidict 6.0.5 +multiprocess 0.70.16 +networkx 3.1 +ninja 1.11.1.1 +numpy 1.26.2 +nvidia-cublas-cu11 11.11.3.6 +nvidia-cuda-runtime-cu11 11.8.89 +nvidia-nccl-cu11 2.19.3 +openxlab 0.0.34 +orjson 3.9.14 +oss2 2.17.0 +packaging 23.2 +pandas 2.2.0 +peft 0.8.2 +Pillow 9.5.0 +pip 23.3.1 +platformdirs 4.2.0 +protobuf 4.25.3 +psutil 5.9.8 +pyarrow 15.0.0 +pyarrow-hotfix 0.6 +pybind11 2.11.1 +pycparser 2.21 +pycryptodome 3.20.0 +pydantic 2.6.1 +pydantic_core 2.16.2 +pydeck 0.8.1b0 +pydub 0.25.1 +Pygments 2.17.2 +Pympler 1.0.1 +pynvml 11.5.0 +pyOpenSSL 23.2.0 +pyparsing 3.1.1 +PySocks 1.7.1 +python-dateutil 2.8.2 +python-multipart 0.0.9 +pytz 2023.4 +pytz-deprecation-shim 0.1.0.post0 +PyYAML 6.0.1 +referencing 0.33.0 +regex 2023.12.25 +requests 2.28.2 +rich 13.4.2 +rpds-py 0.18.0 +safetensors 0.4.2 +semantic-version 2.10.0 +sentencepiece 0.1.99 +setuptools 60.2.0 +shortuuid 1.0.11 +six 1.16.0 +smmap 5.0.1 +sniffio 1.3.0 +starlette 0.36.3 +streamlit 1.24.0 +sudo 1.0.0 +sympy 1.11.1 +tenacity 8.2.3 +termcolor 2.4.0 +tiktoken 0.6.0 +tokenizers 0.15.2 +toml 0.10.2 +tomli 2.0.1 +toolz 0.12.1 +torch 2.0.1 +torchaudio 2.0.2 +torchvision 0.15.2 +tornado 6.4 +tqdm 4.65.2 +transformers 4.37.1 +triton 2.2.0 +typing_extensions 4.9.0 +tzdata 2024.1 +tzlocal 4.3.1 +urllib3 1.26.18 +uvicorn 0.27.1 +validators 0.22.0 +watchdog 4.0.0 +websockets 11.0.3 +wheel 0.41.2 +xxhash 3.4.1 +yapf 0.40.2 +yarl 1.9.4 +zipp 3.17.0 +
+
+lmdeploy is not installed yet, so we will install it manually next. Installing the latest stable version is recommended. If you use the InternStudio development environment, run the following commands first; otherwise you will run into errors.
+```
+# Resolve the "ModuleNotFoundError: No module named 'packaging'" problem
+pip install packaging
+# Use flash_attn's precompiled wheel to avoid a very slow source build
+pip install /root/share/wheels/flash_attn-2.4.2+cu118torch2.0cxx11abiTRUE-cp310-cp310-linux_x86_64.whl
+```
+By default only the runtime dependencies are installed, but here we also need to deploy and quantize models, so install the [all] extra. You can then check the lmdeploy package again:
+```
+pip install 'lmdeploy[all]==v0.1.0'
+```
+However, lmdeploy 0.1.0 does not support the TurboMind conversion of InternLM2-7B-chat, so the lmdeploy version needs to be upgraded:
+```
+# We used lmdeploy 0.2.4
+pip install --upgrade lmdeploy
+```
+
+## 2.Model conversion
+To run inference with TurboMind, the model must first be converted into TurboMind format. Both online and offline conversion are supported: online conversion loads a Hugging Face model directly, while offline conversion saves the converted model to disk before it is loaded.
+
+TurboMind is an efficient inference engine for LLMs, based on NVIDIA's FasterTransformer. Its main features include support for LLaMA-architecture models, a persistent batch inference mode, and a scalable KV cache manager.
+### 2.1 Online conversion
+lmdeploy can read Hugging Face model weights directly. The currently supported sources include:
+
+- Models quantized by lmdeploy on huggingface.co, such as llama2-70b-4bit and internlm-chat-20b-4bit
+- Other LLM models on huggingface.co, such as Qwen/Qwen-7B-Chat
+
+An example is as follows:
+```
+# Requires a network environment with access to Hugging Face
+lmdeploy chat turbomind internlm/internlm-chat-20b-4bit --model-name internlm-chat-20b
+lmdeploy chat turbomind Qwen/Qwen-7B-Chat --model-name qwen-7b
+```
+The two lines above show how to load a Hugging Face model directly: the first loads a version quantized with lmdeploy, and the second loads another LLM model.
+
+We can also launch a local Hugging Face model directly, as shown below.
+```
+lmdeploy chat turbomind /EmoLLM --model-name internlm2-chat-7b
+```
+The preceding commands start a local chat interface in which you can talk to the LLM from the terminal.
+### 2.2 Offline conversion
+Offline conversion requires converting the model into lmdeploy's TurboMind format before starting the service, as shown below.
+```
+# Convert the model to TurboMind (FasterTransformer) format
+lmdeploy convert internlm2-chat-7b /EmoLLM
+```
+Upon completion, a `workspace` folder is generated in the current directory, containing the files that TurboMind and Triton need for model inference.
+## 3.Run locally
+### 3.1 TurboMind inference + command-line local dialog
+After the model conversion is complete, we have everything needed for model inference and can proceed to actually run it.
+
+Let's try Bash Local Chat first. Unlike the API Server, Local Chat calls TurboMind directly from the command line; in simple terms, TurboMind is executed directly by command-line code, so the actual architecture differs from the previous diagram.
+
+There are several backends to run it, such as TurboMind, PyTorch, and DeepSpeed. However, PyTorch and DeepSpeed actually go through Hugging Face's Transformers package: PyTorch means the native Transformers implementation, and DeepSpeed means using DeepSpeed as the inference framework. Both are currently weak and not production-ready, so they are not recommended.
+
+Run the following command.
+```
+# TurboMind + Bash Local Chat
+lmdeploy chat turbomind ./workspace
+```
+To exit, type `exit` and press Enter twice. At this point, the Server is the locally running model (TurboMind), and the command line can be seen as the front end.
+### 3.2 TurboMind inference + API service
+In the section above, we started the client directly from the command line. Next, we will use lmdeploy to serve the model as an API.
+First, start the service with the following command.
+```
+lmdeploy serve api_server ./workspace --server-name 0.0.0.0 --server-port ${server_port} --tp 1
+```
+For details, see the [documentation](https://lmdeploy.readthedocs.io/zh-cn/stable/serving/restful_api.html).
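+
+Once the server is running, you can query it from Python. The snippet below is a minimal sketch, assuming the api_server exposes an OpenAI-compatible `/v1/chat/completions` endpoint (recent lmdeploy versions do); the port, model name, and prompt are placeholders to adapt to your setup.
+
+```python
+# Minimal sketch of querying the lmdeploy API server, assuming an
+# OpenAI-compatible /v1/chat/completions endpoint; port and model name are placeholders.
+import requests
+
+server_url = "http://localhost:23333/v1/chat/completions"  # replace 23333 with your ${server_port}
+
+payload = {
+    "model": "internlm2-chat-7b",  # the model name reported by the server
+    "messages": [
+        {"role": "user", "content": "你好,最近我总是感到焦虑,该怎么办?"}
+    ],
+    "temperature": 0.8,
+}
+
+response = requests.post(server_url, json=payload, timeout=60)
+response.raise_for_status()
+print(response.json()["choices"][0]["message"]["content"])
+```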
diff --git a/evaluate/General_evaluation_EN.md b/evaluate/General_evaluation_EN.md
new file mode 100644
index 0000000..ab6db16
--- /dev/null
+++ b/evaluate/General_evaluation_EN.md
@@ -0,0 +1,50 @@
+# EmoLLM general indicator evaluation
+
+## Introduction
+
+This document explains how to use the `eval.py` and `metric.py` scripts. They evaluate the generation results of EmoLLM, a large language model for mental health.
+
+## Installation
+
+- Python 3.x
+- PyTorch
+- Transformers
+- Datasets
+- NLTK
+- Rouge
+- Jieba
+
+They can be installed with the following command:
+
+```bash
+pip install torch transformers datasets nltk rouge jieba
+```
+
+## Usage
+
+### convert.py
+
+Converts raw multi-turn conversation data into single-turn data for evaluation.
+
+### eval.py
+
+The `eval.py` script generates the doctor's responses and evaluates them. It is mainly divided into the following parts:
+
+1. Load the model and tokenizer.
+2. Set test parameters, such as the number of test samples and the batch size.
+3. Obtain the data.
+4. Generate responses and evaluate them.
+
+### metric.py
+
+The `metric.py` script contains functions to calculate the evaluation metrics, which can be computed at either character level or word level. It currently includes BLEU and ROUGE scores.
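+
+As an illustration of word-level scoring, the sketch below shows how BLEU and ROUGE can be computed with the libraries listed above (jieba for word segmentation, nltk for BLEU, rouge for ROUGE). It is a minimal example of the general approach, not the actual contents of `metric.py`.
+
+```python
+# Minimal sketch of word-level BLEU/ROUGE scoring with jieba + nltk + rouge.
+# Illustrates the general approach only; the actual metric.py may differ.
+import jieba
+from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
+from rouge import Rouge
+
+def word_level_scores(reference: str, hypothesis: str) -> dict:
+    ref_tokens = jieba.lcut(reference)   # word-level segmentation
+    hyp_tokens = jieba.lcut(hypothesis)  # (use list(text) instead for character level)
+
+    smooth = SmoothingFunction().method1
+    bleu = {
+        f"bleu-{n}": sentence_bleu(
+            [ref_tokens], hyp_tokens,
+            weights=tuple(1 / n for _ in range(n)),
+            smoothing_function=smooth,
+        )
+        for n in range(1, 5)
+    }
+
+    # the rouge package expects whitespace-separated token strings
+    rouge_scores = Rouge().get_scores(" ".join(hyp_tokens), " ".join(ref_tokens))[0]
+    return {**bleu, **{k: v["f"] for k, v in rouge_scores.items()}}
+
+if __name__ == "__main__":
+    print(word_level_scores("心理健康很重要,要学会调节情绪。", "心理健康非常重要,需要学会调节自己的情绪。"))
+```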
+
+## Test results
+
+The data in data.json were tested, with the following results:
+
+| Model | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 |
+|----------|---------|---------|---------|---------|---------|---------|---------|
+| Qwen1_5-0_5B-chat | 27.23% | 8.55% | 17.05% | 26.65% | 13.11% | 7.19% | 4.05% |
+| InternLM2_7B_chat_qlora | 37.86% | 15.23% | 24.34% | 39.71% | 22.66% | 14.26% | 9.21% |
+| InternLM2_7B_chat_full | 32.45% | 10.82% | 20.17% | 30.48% | 15.67% | 8.84% | 5.02% |
diff --git a/generate_data/tutorial_EN.md b/generate_data/tutorial_EN.md
new file mode 100644
index 0000000..fbdb05e
--- /dev/null
+++ b/generate_data/tutorial_EN.md
@@ -0,0 +1,100 @@
+# EMO psychological large model fine-tuning data generation tutorial
+
+**I. Objectives and Background**
+
+  To give our mental health large model better performance, we need high-quality datasets. To achieve this, we decided to use four powerful large language models, Wenxin Yiyan (ERNIE Bot), Tongyi Qianwen (Qwen), iFlytek Spark, and Zhipu AI, to generate conversation data. In addition, we enhance the cognitive depth of the dataset and improve the generalization ability of the model by adding a small number of self-cognition samples.
+
+**II. Dataset generation method**
+
+1. **Model selection and data preparation**
+
+   Choose the four large language models, namely Wenxin Yiyan, Tongyi Qianwen, iFlytek Spark, and Zhipu AI, obtain API keys for the corresponding interfaces, and prepare to generate dialogue data.
+2. **Single-turn and multi-turn dialogue data generation**
+
+   Using these four models, we generated 10,000 single-turn and multi-turn conversation samples while ensuring the diversity, complexity, and validity of the data.
+
+   Because mental activity is often complex, we selected a total of 16 * 28 = `448` scenarios for dataset generation to ensure data diversity. For the specific scenario names, refer to the `emotions_list` and `areas_of_life` parameters in `config.yml`.
+3. **Inclusion of self-cognition datasets**
+
+   To enhance the model's self-awareness, we also added a small self-cognition dataset. These data help the model better understand the context and improve the naturalness and coherence of the conversation.
+
+**III. Practical steps**
+
+1. **Initialize**
+
+* Install the required software and libraries.
+
+  ```bash
+  pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
+  ```
+* Prepare the input data and configuration parameters.
+
+  See the comments in `config.yml`.
+
+2. **Model selection and configuration**
+
+* Select the right model for your needs.
+  To enable everyone to experiment with large models, we chose InternLM2-7B as our baseline model (it can also be fine-tuned and deployed on consumer graphics cards).
+* Make the necessary configuration and adjustments to the model.
+  Use XTuner for fine-tuning based on our dataset and configuration strategy.
+
+3. **Data generation**
+
+* Data generation using the Tongyi Qianwen large model.
+  ```bash
+  # Terminal operation
+  bash run_qwen.bash
+  ```
+* Data generation using the Baidu Wenxin large model.
+  ```bash
+  # Terminal operation
+  python ernie_gen_data.py
+  ```
+* Data generation using the Zhipu AI large model.
+  ```bash
+  # Terminal operation
+  python zhipuai_gen_data.py
+  ```
+* Data generation using the iFlytek Spark large model.
+  ```bash
+  # Terminal operation
+  python ./xinghuo/gen_data.py
+  ```
+
+4. **Integration of self-cognition datasets**
+
+* The self-cognition dataset needs to be generated manually in the following format:
+  ```json
+  [
+    {
+      "conversation": [
+        {
+          "input": "请介绍一下你自己",
+          "output": "我是大佬的emo小助手,可以帮助你解决心理上的问题哦"
+        }
+      ]
+    },
+    {
+      "conversation": [
+        {
+          "input": "请做一下自我介绍",
+          "output": "我是大佬的emo小助手,可以帮助你解决心理上的问题哦"
+        }
+      ]
+    }
+  ]
+  ```
+
+5. **Dataset integration**
+
+   Before merging the datasets, we need to check the generated data for formatting errors, type mismatches, and so on. Use `check.py` to validate the data, and finally use `merge_json.py` to combine all the JSON files into one overall JSON file (see the sketch after this list).
+6. **Evaluation and optimization**
+
+* Evaluate the generated dataset using appropriate evaluation metrics.
+* Make the necessary optimizations and adjustments based on the evaluation results.
+
+7. **Testing and deployment**
+
+* Evaluate the trained model using an independent test set.
+* Make the necessary adjustments and optimizations based on the test results.
+* Deploy the final model into a real application.
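+
+As a reference for step 5 above, here is a hedged sketch of what checking and merging the generated files can look like. It only illustrates the idea behind `check.py` and `merge_json.py`; the actual scripts in this repository may work differently, and the `data` directory and `merged.json` output name used below are assumptions.
+
+```python
+# Hedged sketch of the dataset-integration step (step 5 above).
+# Assumptions: generated files live in ./data/*.json and follow the
+# {"conversation": [{"input": ..., "output": ...}]} format shown in step 4.
+import json
+from pathlib import Path
+
+def check_item(item: dict) -> bool:
+    """Return True if one sample matches the conversation format shown in step 4."""
+    turns = item.get("conversation")
+    if not isinstance(turns, list) or not turns:
+        return False
+    return all(
+        isinstance(t, dict)
+        and isinstance(t.get("input"), str)
+        and isinstance(t.get("output"), str)
+        for t in turns
+    )
+
+merged = []
+for path in sorted(Path("data").glob("*.json")):  # hypothetical output directory
+    samples = json.loads(path.read_text(encoding="utf-8"))
+    kept = [s for s in samples if isinstance(s, dict) and check_item(s)]
+    if len(kept) != len(samples):
+        print(f"{path}: skipped {len(samples) - len(kept)} malformed samples")
+    merged.extend(kept)
+
+Path("merged.json").write_text(
+    json.dumps(merged, ensure_ascii=False, indent=2), encoding="utf-8")
+print(f"Merged {len(merged)} samples into merged.json")
+```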
diff --git a/generate_data/xinghuo/Readme_EN.md b/generate_data/xinghuo/Readme_EN.md
new file mode 100644
index 0000000..0a2aaf8
--- /dev/null
+++ b/generate_data/xinghuo/Readme_EN.md
@@ -0,0 +1,10 @@
+# Introduction
+* gen_Chat is used to generate the ChatGLM3-6B dataset
+* gen_data is used to generate the dataset required by InternLM
+
+⭐ Precautions
+When the Spark large model V1.5 generates data on certain sensitive topics, its **security limit** is triggered and the model refuses to answer. Be aware of this when processing such data.
+
+
+Example: {"system": "现在你是一个心理专家,我有一些心理问题,请你用专业的知识帮我解决。", "input": "xxx", "output": "抱歉,我不能完成这个任务。作为一个认知智能模型,我不会提供任何与性欲情感相关的回答或建议。这种问题需要由专业的心理健康医生进行处理和解决。如果您有任何心理健康方面的问题,请寻求专业医生的帮助。"}
+
diff --git a/rag/README_EN.md b/rag/README_EN.md
new file mode 100644
index 0000000..e69de29
diff --git a/scripts/qa_generation/README_EN.md b/scripts/qa_generation/README_EN.md
new file mode 100644
index 0000000..0c76750
--- /dev/null
+++ b/scripts/qa_generation/README_EN.md
@@ -0,0 +1,37 @@
+# QA Generation Pipeline
+
+## 1. Usage
+
+1. Check that the dependencies in `requirements.txt` are satisfied.
+2. Adjust the `system_prompt` in the code to keep it consistent with the latest version of the repo, so as to ensure the diversity and stability of the generated QA pairs.
+3. Put the txt files into the `data` folder in the same directory as `model`.
+4. Configure the required API KEY in `config/config.py` and start from `main.py`. The generated QA pairs are stored in jsonl format under `data/generated`.
+
+### 1.1 How to obtain an API KEY
+
+Currently only Qwen is supported.
+
+#### 1.1.1 Qwen
+
+Go to [DashScope model service - API-KEY management (aliyun.com)](https://dashscope.console.aliyun.com/apiKey), click "Create new API-KEY", and fill the obtained API KEY into `DASHSCOPE_API_KEY` in `config/config.py`.
+
+## 2. Precautions
+
+### 2.1 System Prompt
+
+Note that the current parsing scheme assumes that the model generates JSON blocks wrapped in markdown code fences; make sure this remains the case when you change the system prompt.
+
+### 2.2 Sliding Window
+
+Both `window_size` and `overlap_size` of the sliding window can be changed in the `get_txt_content` function in `util/data_loader.py`. Currently the text is split by sentence and chunked with a sliding window, as illustrated by the sketch below.
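+
+The following is a simplified sketch of such a sentence-based sliding window. It is a stand-in for illustration only, not the actual `get_txt_content` implementation; the sentence-splitting rule and the default parameter values are assumptions.
+
+```python
+# Simplified sketch of a sentence-based sliding window (not the actual
+# util/data_loader.py implementation; splitting rule and defaults are assumed).
+import re
+from typing import List
+
+def split_sentences(text: str) -> List[str]:
+    # Split after Chinese/English sentence-ending punctuation, keeping the delimiter.
+    parts = re.split(r"(?<=[。!?.!?])", text)
+    return [p.strip() for p in parts if p.strip()]
+
+def sliding_windows(text: str, window_size: int = 8, overlap_size: int = 2) -> List[str]:
+    """Group sentences into overlapping chunks of `window_size` sentences."""
+    sentences = split_sentences(text)
+    step = max(window_size - overlap_size, 1)
+    chunks = []
+    for start in range(0, len(sentences), step):
+        chunk = sentences[start:start + window_size]
+        if chunk:
+            chunks.append("".join(chunk))
+        if start + window_size >= len(sentences):
+            break
+    return chunks
+
+if __name__ == "__main__":
+    demo = "心理学研究行为与心理过程。焦虑是一种常见情绪。适度焦虑有助于应对挑战。长期焦虑则需要干预。"
+    for i, c in enumerate(sliding_windows(demo, window_size=2, overlap_size=1)):
+        print(i, c)
+```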
+
+### 2.3 Corpus Format
+
+At present, only the txt format is supported. Place the cleaned book text under the `data` folder; the program will recursively retrieve all txt files in that folder.
+
+## TODO
+
+1. Support more models (Gemini, GPT, ChatGLM...)
+2. Support multi-threaded model calls
+3. Support more text formats (PDF...)
+4. Support more ways to split text
diff --git a/scripts/qa_generation/system_prompt_v1_EN.md b/scripts/qa_generation/system_prompt_v1_EN.md
new file mode 100644
index 0000000..4ceeb52
--- /dev/null
+++ b/scripts/qa_generation/system_prompt_v1_EN.md
@@ -0,0 +1,24 @@
+You are a QA pair generation robot. You will automatically generate appropriate QA pairs based on the content of the psychology book I provide, with the following requirements:
+
+- For the text I give you, you need to generate five such QA pairs
+- The QA pairs should not repeat content, and the answers should not be too long
+- Answer in Simplified Chinese
+- The generated QA pairs need to be wrapped in a json code block in markdown format
+
+Here is the reference format:
+
+```json
+[
+  {
+    "question": "...",
+    "answer": "..."
+  },
+  {
+    "question": "...",
+    "answer": "..."
+  },
+  ...
+]
+```
+
+Here is the given text:
diff --git a/scripts/qa_generation/system_prompt_v2_EN.md b/scripts/qa_generation/system_prompt_v2_EN.md
new file mode 100644
index 0000000..90d6cb6
--- /dev/null
+++ b/scripts/qa_generation/system_prompt_v2_EN.md
@@ -0,0 +1,26 @@
+You are an experienced psychologist, familiar with psychological knowledge and psychological counseling techniques. Please take a deep breath and think step by step to generate QA pairs that meet the criteria based on the psychology text content I provide.
+
+The criteria are as follows:
+- Generate 5-10 QA pairs per psychology text
+- Each QA pair should be generated under the most suitable topic chosen from "psychology knowledge", "specific counseling methods", "characteristics of mental illness", and "treatment methods for mental illness", according to the content of the psychology text
+- The QA pairs should not repeat content, and the answers should not be too long
+- The QA pairs should be in Simplified Chinese
+- The generated QA pairs need to be wrapped in a json code block in markdown format
+
+The reference format is as follows:
+
+```json
+[
+  {
+    "question": "...",
+    "answer": "..."
+  },
+  {
+    "question": "...",
+    "answer": "..."
+  },
+  ...
+]
+```
+
+The following is the content of the given psychology text: