diff --git a/rag/README_EN.md b/rag/README_EN.md
index df4fe43..a3d3386 100644
--- a/rag/README_EN.md
+++ b/rag/README_EN.md
@@ -23,6 +23,13 @@ For details on data collection construction, please refer to [qa_generation_READ
 
 ## **Components**
 
+There are two sets of embedding and rerank solutions, i.e., the BGE and BCE, we recommend to use the more powerful **BGE** !
+
+### [BGE Github](https://github.com/FlagOpen/FlagEmbedding)
+
+- [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5): embedding model, used to build vector DB
+- [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large): rerank model, used to rerank retrieved documents
+
 ### [BCEmbedding](https://github.com/netease-youdao/BCEmbedding?tab=readme-ov-file)
 
 - [bce-embedding-base_v1](https://hf-mirror.com/maidalun1020/bce-embedding-base_v1): embedding model, used to build vector DB
@@ -63,4 +70,4 @@ Later, more evaluation indicators were added, such as: context recall, etc.
 
 - Add RAGAS evaluation results to the generation process. For example, when the generated results cannot solve the user's problem, it needs to be regenerated.
 - Add web retrieval to deal with the problem that the corresponding information cannot be retrieved in vector DB
-- Add multi-channel retrieval to increase recall rate. That is, multiple similar queries are generated based on user input for retrieval.
\ No newline at end of file
+- Add multi-channel retrieval to increase recall rate. That is, multiple similar queries are generated based on user input for retrieval.