# EmoLLM RAG

## Module purpose
Based on the user's question, relevant information is retrieved to make EmoLLM's answers more professional and reliable. The retrieved content includes, but is not limited to:

- Psychology-related theories
- Psychology methodology
- Classic cases
- Client background knowledge
## Datasets
- Cleaned QA pairs: each QA pair is embedded as a single sample
- Filtered TXT texts
  - Embed the TXT text directly (segmented by token length; see the sketch after this list)
  - Filter out irrelevant information such as tables of contents, then embed the TXT text (segmented by token length)
  - After filtering out irrelevant information such as tables of contents, segment the TXT semantically and embed the resulting chunks
  - Split the TXT according to its table-of-contents structure and generate embeddings per level of that hierarchy
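A minimal sketch of the token-length segmentation mentioned above, using LangChain's tiktoken-backed splitter. The file name, chunk size, and overlap are illustrative assumptions, not the project's actual settings:

```python
# Sketch: split a cleaned TXT file into chunks measured in tokens.
from langchain.text_splitter import RecursiveCharacterTextSplitter

with open("psychology_book.txt", encoding="utf-8") as f:  # hypothetical file
    raw_text = f.read()

# from_tiktoken_encoder measures chunk length in tokens rather than characters
splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=512,    # max tokens per chunk (assumed value)
    chunk_overlap=64,  # token overlap between adjacent chunks (assumed value)
)
chunks = splitter.split_text(raw_text)
```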
For details on data collection and construction, please refer to qa_generation_README.
## Components

### BCEmbedding
- bce-embedding-base_v1: embedding model, used to build the vector DB
- bce-reranker-base_v1: rerank model, used to rerank retrieved documents (a usage sketch follows)
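A minimal sketch of how the two models are typically used through the BCEmbedding package's public API; the query and passages are made up, and the exact call shapes should be checked against the installed BCEmbedding version:

```python
from BCEmbedding import EmbeddingModel, RerankerModel

# Embedding model: encode text into vectors for the vector DB
embedder = EmbeddingModel(model_name_or_path="maidalun1020/bce-embedding-base_v1")
vectors = embedder.encode(["What is cognitive behavioral therapy?"])

# Rerank model: rescore retrieved passages against the query
reranker = RerankerModel(model_name_or_path="maidalun1020/bce-reranker-base_v1")
results = reranker.rerank(
    query="What is cognitive behavioral therapy?",
    passages=["CBT is a structured form of psychotherapy ...",
              "FAISS is a similarity search library ..."],
)
```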
### LangChain

LangChain is an open-source framework for building applications based on large language models (LLMs). It provides a variety of tools and abstractions to improve the customization, accuracy, and relevance of the information your models generate.
### FAISS

FAISS is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that can search sets of vectors of any size. Since LangChain has already integrated FAISS, this project builds on LangChain's wrapper rather than on the native FAISS API. See FAISS in LangChain.
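A minimal sketch of building and querying a FAISS vector DB through LangChain's wrapper. The save path is hypothetical, and wrapping the embedding model via HuggingFaceEmbeddings is an assumption (the project may instead wrap BCEmbedding directly):

```python
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Wrap the embedding model so LangChain can call it
embeddings = HuggingFaceEmbeddings(model_name="maidalun1020/bce-embedding-base_v1")

# Build the index from pre-split text chunks and persist it locally
db = FAISS.from_texts(chunks, embeddings)  # `chunks` from the splitting sketch
db.save_local("vector_db")                 # hypothetical path

# Later: reload the index and retrieve the top-k most similar chunks
db = FAISS.load_local("vector_db", embeddings)
docs = db.similarity_search("How to comfort an anxious client?", k=5)
```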
### RAGAS

RAGAS is a classic evaluation framework for RAG. It evaluates a pipeline along the following three aspects:
- Faithfulness: the generated answer should be grounded in the given context.
- Answer Relevance: the generated answer should address the actual question asked.
- Context Relevance: the retrieved information should be highly focused and contain as little irrelevant information as possible.
Later versions added more evaluation metrics, such as context recall.
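A minimal sketch of scoring a RAG run with RAGAS along these three aspects. The sample values are made up, and metric names vary across RAGAS versions (e.g. `context_relevancy` has since been superseded in newer releases):

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_relevancy

# One evaluation sample: question, retrieved contexts, and generated answer
data = Dataset.from_dict({
    "question": ["How does CBT treat anxiety?"],
    "contexts": [["CBT treats anxiety by restructuring negative thoughts ..."]],
    "answer": ["CBT helps clients identify and reframe anxious thoughts ..."],
})

scores = evaluate(data, metrics=[faithfulness, answer_relevancy, context_relevancy])
print(scores)
```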
## Details

### RAG pipeline
- Build the vector DB from the datasets
- Embed the question entered by the user
- Retrieve from the vector DB using the question embedding
- Rerank the recalled documents
- Generate the final answer from the user's question and the recalled documents
Note: the above process runs only when the user chooses to use RAG; a sketch of the full pipeline follows.
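Putting the steps above together, a minimal sketch of the conditional pipeline. The `llm` callable and the prompt template are placeholders; `db` and `reranker` come from the earlier sketches, and the rerank return shape follows BCEmbedding's documented output (verify against the installed version):

```python
def answer(question: str, use_rag: bool, llm, db, reranker, k: int = 5) -> str:
    """Generate an answer, optionally grounding it with RAG."""
    if not use_rag:  # RAG runs only when the user opts in
        return llm(question)

    # Embed the question and retrieve candidate chunks from the vector DB
    candidates = [d.page_content for d in db.similarity_search(question, k=k)]

    # Rerank the recalled chunks against the question
    ranked = reranker.rerank(query=question, passages=candidates)
    top_passages = ranked["rerank_passages"][:3]  # keep the best few

    # Generate the final answer from the question plus recalled context
    prompt = "Context:\n" + "\n".join(top_passages) + f"\n\nQuestion: {question}"
    return llm(prompt)
```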
### Follow-up actions
- Add RAGAS evaluation results to the generation process, e.g. regenerate when the generated answer fails to address the user's question
- Add web retrieval to handle cases where the relevant information cannot be found in the vector DB
- Add multi-channel retrieval to increase the recall rate, i.e. generate multiple similar queries from the user input and retrieve with each of them (see the sketch below)
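For the multi-channel retrieval item, LangChain already ships a MultiQueryRetriever that uses an LLM to generate several paraphrases of the user query and merges the retrieved documents. A minimal sketch, assuming `db` and a LangChain-compatible `llm` from the earlier steps (one possible approach, not necessarily this project's final design):

```python
from langchain.retrievers.multi_query import MultiQueryRetriever

# `llm` must be a LangChain-compatible LLM object; `db` is the FAISS store
retriever = MultiQueryRetriever.from_llm(
    retriever=db.as_retriever(search_kwargs={"k": 5}),
    llm=llm,
)

# Internally: generate several rephrasings of the question, retrieve for each,
# and return the deduplicated union of documents (higher recall)
docs = retriever.get_relevant_documents("How to comfort an anxious client?")
```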