EmoLLM RAG

Module purpose

This module retrieves information relevant to the customer's question and supplies it to EmoLLM, making its answers more professional and reliable. Retrieved content includes, but is not limited to, the following:

  • Psychology related theories
  • Psychology methodology
  • Classic cases
  • Customer background knowledge

Datasets

  • Cleaned QA pairs: each QA pair is embedded as a single sample
  • Filtered TXT texts
    • Generate embeddings directly from the TXT text (segmented by token length)
    • Filter out irrelevant information such as tables of contents, then generate embeddings from the TXT text (segmented by token length)
    • Filter out irrelevant information such as tables of contents, then segment the TXT semantically and generate embeddings
    • Split the TXT according to its table-of-contents structure and generate embeddings following that hierarchy
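
As a rough illustration of the "segmented by token length" option above, here is a minimal sketch. It assumes whitespace-separated words as tokens, which is only a stand-in for the tokenizer of the embedding model actually used; the function name and the `max_tokens` and `overlap` parameters are hypothetical and do not come from this project.

```python
def split_by_token_length(text, max_tokens=200, overlap=20):
    """Split text into chunks of at most max_tokens whitespace tokens.

    Consecutive chunks share `overlap` tokens so that content near a chunk
    boundary appears in both neighbouring chunks.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")

    tokens = text.split()  # assumption: a real tokenizer would be used here
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the text
    return chunks
```

Each returned chunk would then be embedded as one sample. The overlap is a common choice for fixed-length chunking, not something this README prescribes.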

For details on dataset construction, please refer to qa_generation_README.

Components

There are two sets of embedding and rerank solutions, BGE and BCE. We recommend the more powerful BGE.

BGE Github

BCEmbedding

LangChain

LangChain is an open source framework for building large language model (LLM) based applications. LangChain provides a variety of tools and abstractions to increase the customization, accuracy, and relevance of the information generated by your models.

FAISS

FAISS is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that can search sets of vectors of any size. Because LangChain already integrates FAISS, this project uses FAISS through LangChain rather than developing against the native FAISS documentation. See: FAISS in LangChain

RAGAS

RAGAS is a classic evaluation framework for RAG. It evaluates a system on the following three aspects:

  • Faithfulness: The answers given should be generated based on the given context.
  • Answer Relevance: The generated answer should solve the actual question asked.
  • Context Relevance: The retrieved information should be highly concentrated and contain as little irrelevant information as possible.

More evaluation metrics, such as context recall, were added later.

Details

RAG pipeline

  • Build a vector DB from the dataset
  • Embed the question entered by the customer
  • Search the vector DB using the question embedding
  • Rerank the recalled data
  • Generate the final result from the user's question and the recalled data
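
The steps above can be sketched end to end as follows. This is an illustration only, not the code in `src`: the bag-of-words `embed` function and the term-overlap `rerank` function are toy stand-ins for the BGE/BCE embedding and rerank models, the in-memory list stands in for the FAISS index, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a BGE/BCE model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_vector_db(documents):
    """Step 1: embed every document once, ahead of query time."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(db, question, top_k=3):
    """Steps 2-3: embed the question, return the top_k most similar documents."""
    q = embed(question)
    scored = sorted(db, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in scored[:top_k]]


def rerank(question, candidates, top_n=2):
    """Step 4: reorder candidates with a second scorer (toy: term-overlap ratio)."""
    q_terms = set(question.lower().split())

    def score(doc):
        d_terms = set(doc.lower().split())
        return len(q_terms & d_terms) / len(d_terms) if d_terms else 0.0

    return sorted(candidates, key=score, reverse=True)[:top_n]


def build_prompt(question, context_docs):
    """Step 5: combine the question and reranked context into the LLM prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}"


def prepare_llm_input(db, question, use_rag):
    """Run retrieval only when the user has chosen to use RAG."""
    if not use_rag:
        return question
    candidates = retrieve(db, question)
    return build_prompt(question, rerank(question, candidates))
```

Retrieval deliberately returns more candidates than the reranker keeps: the first stage is cheap and recall-oriented, while the second stage is more precise and only has to score a handful of documents.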

Note: The above process is only carried out when the user chooses to use RAG.

Follow-up actions

  • Add RAGAS evaluation results to the generation process. For example, when the generated result does not solve the user's problem, it is regenerated.
  • Add web retrieval to handle cases where the relevant information cannot be found in the vector DB.
  • Add multi-channel retrieval to increase the recall rate: multiple similar queries are generated from the user input and used for retrieval.
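
The multi-channel retrieval idea could look roughly like the sketch below. Since this is a planned feature, everything here is an assumption: `search` is a hypothetical callable that returns `(document, score)` pairs for one query, the query variants are assumed to be generated elsewhere (for example by the LLM), and keeping the best score per document is just one simple way to merge the result lists.

```python
def multi_query_retrieve(queries, search, top_k=3):
    """Run `search` for every query variant and merge the hits.

    `search(query)` is a hypothetical callable returning (doc, score) pairs,
    where a higher score means more relevant. A document found by several
    queries is kept once, with its best score.
    """
    best = {}
    for query in queries:
        for doc, score in search(query):
            if doc not in best or score > best[doc]:
                best[doc] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```

Because each variant can surface documents the original wording missed, the merged candidate set is at least as large as that of any single query, which is where the recall gain would come from.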