# EmoLLM RAG

## Module purpose
Based on the user's question, relevant information is retrieved to make EmoLLM's answers more professional and reliable. The retrieved content includes, but is not limited to:

- Psychology-related theories
- Psychology methodology
- Classic cases
- Client background knowledge
## Datasets
- Cleaned QA pairs: each QA pair is embedded as a single sample
- Filtered TXT texts
  - Embed the TXT text directly (segmented by token length; see the sketch after this list)
  - Filter out irrelevant information such as tables of contents, then embed the TXT text (segmented by token length)
  - After filtering out irrelevant information such as tables of contents, segment the TXT semantically and embed the resulting chunks
  - Split the TXT according to its table-of-contents structure and generate embeddings per level of that hierarchy
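A minimal sketch of the token-length segmentation mentioned above, using LangChain's tiktoken-backed splitter. The file name, chunk size, and overlap are illustrative assumptions, not the project's actual settings:

```python
# Sketch: split a cleaned TXT file into chunks measured in tokens.
from langchain.text_splitter import RecursiveCharacterTextSplitter

with open("psychology_book.txt", encoding="utf-8") as f:  # hypothetical file
    raw_text = f.read()

# from_tiktoken_encoder measures chunk length in tokens rather than characters
splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=512,    # max tokens per chunk (assumed value)
    chunk_overlap=64,  # token overlap between adjacent chunks (assumed value)
)
chunks = splitter.split_text(raw_text)
```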
For details on data collection and construction, please refer to qa_generation_README.
## Components

### BCEmbedding
- bce-embedding-base_v1: embedding model, used to build the vector DB
- bce-reranker-base_v1: rerank model, used to rerank retrieved documents (a usage sketch follows)
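A minimal sketch of how the two models are typically used through the BCEmbedding package's public API; the query and passages are made up, and the exact call shapes should be checked against the installed BCEmbedding version:

```python
from BCEmbedding import EmbeddingModel, RerankerModel

# Embedding model: encode text into vectors for the vector DB
embedder = EmbeddingModel(model_name_or_path="maidalun1020/bce-embedding-base_v1")
vectors = embedder.encode(["What is cognitive behavioral therapy?"])

# Rerank model: rescore retrieved passages against the query
reranker = RerankerModel(model_name_or_path="maidalun1020/bce-reranker-base_v1")
results = reranker.rerank(
    query="What is cognitive behavioral therapy?",
    passages=["CBT is a structured form of psychotherapy ...",
              "FAISS is a similarity search library ..."],
)
```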
### LangChain

LangChain is an open-source framework for building applications based on large language models (LLMs). It provides a variety of tools and abstractions to improve the customization, accuracy, and relevance of the information your models generate.
### FAISS

FAISS is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that can search sets of vectors of any size. Since LangChain has already integrated FAISS, this project builds on LangChain's wrapper rather than on the native FAISS API. See FAISS in LangChain.
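A minimal sketch of building and querying a FAISS vector DB through LangChain's wrapper. The save path is hypothetical, and wrapping the embedding model via HuggingFaceEmbeddings is an assumption (the project may instead wrap BCEmbedding directly):

```python
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Wrap the embedding model so LangChain can call it
embeddings = HuggingFaceEmbeddings(model_name="maidalun1020/bce-embedding-base_v1")

# Build the index from pre-split text chunks and persist it locally
db = FAISS.from_texts(chunks, embeddings)  # `chunks` from the splitting sketch
db.save_local("vector_db")                 # hypothetical path

# Later: reload the index and retrieve the top-k most similar chunks
db = FAISS.load_local("vector_db", embeddings)
docs = db.similarity_search("How to comfort an anxious client?", k=5)
```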
### RAGAS

RAGAS is a classic evaluation framework for RAG. It evaluates a pipeline along the following three aspects:
- Faithfulness: the generated answer should be grounded in the given context.
- Answer Relevance: the generated answer should address the actual question asked.
- Context Relevance: the retrieved information should be highly focused and contain as little irrelevant information as possible.
Later versions added more evaluation metrics, such as context recall.
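A minimal sketch of scoring a RAG run with RAGAS along these three aspects. The sample values are made up, and metric names vary across RAGAS versions (e.g. `context_relevancy` has since been superseded in newer releases):

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_relevancy

# One evaluation sample: question, retrieved contexts, and generated answer
data = Dataset.from_dict({
    "question": ["How does CBT treat anxiety?"],
    "contexts": [["CBT treats anxiety by restructuring negative thoughts ..."]],
    "answer": ["CBT helps clients identify and reframe anxious thoughts ..."],
})

scores = evaluate(data, metrics=[faithfulness, answer_relevancy, context_relevancy])
print(scores)
```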
## Details

### RAG pipeline
- Build the vector DB from the datasets
- Embed the question entered by the user
- Retrieve from the vector DB using the question embedding
- Rerank the recalled documents
- Generate the final answer from the user's question and the recalled documents
Note: the above process runs only when the user chooses to use RAG; a sketch of the full pipeline follows.
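Putting the steps above together, a minimal sketch of the conditional pipeline. The `llm` callable and the prompt template are placeholders; `db` and `reranker` come from the earlier sketches, and the rerank return shape follows BCEmbedding's documented output (verify against the installed version):

```python
def answer(question: str, use_rag: bool, llm, db, reranker, k: int = 5) -> str:
    """Generate an answer, optionally grounding it with RAG."""
    if not use_rag:  # RAG runs only when the user opts in
        return llm(question)

    # Embed the question and retrieve candidate chunks from the vector DB
    candidates = [d.page_content for d in db.similarity_search(question, k=k)]

    # Rerank the recalled chunks against the question
    ranked = reranker.rerank(query=question, passages=candidates)
    top_passages = ranked["rerank_passages"][:3]  # keep the best few

    # Generate the final answer from the question plus recalled context
    prompt = "Context:\n" + "\n".join(top_passages) + f"\n\nQuestion: {question}"
    return llm(prompt)
```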
### Follow-up actions
- Add RAGAS evaluation results to the generation process, e.g. regenerate when the generated answer fails to address the user's question
- Add web retrieval to handle cases where the relevant information cannot be found in the vector DB
- Add multi-channel retrieval to increase the recall rate, i.e. generate multiple similar queries from the user input and retrieve with each of them (see the sketch below)
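For the multi-channel retrieval item, LangChain already ships a MultiQueryRetriever that uses an LLM to generate several paraphrases of the user query and merges the retrieved documents. A minimal sketch, assuming `db` and a LangChain-compatible `llm` from the earlier steps (one possible approach, not necessarily this project's final design):

```python
from langchain.retrievers.multi_query import MultiQueryRetriever

# `llm` must be a LangChain-compatible LLM object; `db` is the FAISS store
retriever = MultiQueryRetriever.from_llm(
    retriever=db.as_retriever(search_kwargs={"k": 5}),
    llm=llm,
)

# Internally: generate several rephrasings of the question, retrieve for each,
# and return the deduplicated union of documents (higher recall)
docs = retriever.get_relevant_documents("How to comfort an anxious client?")
```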