From 64a80258a5fd43d5a5218fa128041cebe25c898e Mon Sep 17 00:00:00 2001
From: MING_X <119648793+MING-ZCH@users.noreply.github.com>
Date: Tue, 9 Apr 2024 19:06:11 +0800
Subject: [PATCH 1/4] Add InternLM2_20B_chat_lora results to evaluate/README.md

---
 evaluate/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/evaluate/README.md b/evaluate/README.md
index d42a3c9..f4758fd 100644
--- a/evaluate/README.md
+++ b/evaluate/README.md
@@ -20,4 +20,4 @@
 |-------------------|-----------------------|-------------------|-----------------|---------|
 | InternLM2_7B_chat_qlora |      1.32       |        2.20       |      2.10       | 1.00    |
 | InternLM2_7B_chat_full  |      1.40       |        2.45       |      2.24       | 1.00    |
-
+| InternLM2_20B_chat_lora |      1.42       |        2.39       |      2.22       | 1.00    |

From b6e81c8b10b1b1842ceb60d236b2570ca137224b Mon Sep 17 00:00:00 2001
From: MING_X <119648793+MING-ZCH@users.noreply.github.com>
Date: Tue, 9 Apr 2024 19:07:14 +0800
Subject: [PATCH 2/4] Add InternLM2_7B_chat_full and InternLM2_20B_chat_lora results to evaluate/README_EN.md

---
 evaluate/README_EN.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/evaluate/README_EN.md b/evaluate/README_EN.md
index b46b0ce..402cde8 100644
--- a/evaluate/README_EN.md
+++ b/evaluate/README_EN.md
@@ -19,3 +19,5 @@
-|       Model       |    Comprehensiveness  |   rofessionalism  |  Authenticity   | Safety  |
+|       Model       |    Comprehensiveness  |  Professionalism  |  Authenticity   | Safety  |
 |-------------------|-----------------------|-------------------|-----------------|---------|
 | InternLM2_7B_chat_qlora |      1.32       |        2.20       |      2.10       | 1.00    |
+| InternLM2_7B_chat_full  |      1.40       |        2.45       |      2.24       | 1.00    |
+| InternLM2_20B_chat_lora |      1.42       |        2.39       |      2.22       | 1.00    |

From 360dc212a5a9695063f6766f0312d1ee2a8b0d09 Mon Sep 17 00:00:00 2001
From: MING_X <119648793+MING-ZCH@users.noreply.github.com>
Date: Tue, 9 Apr 2024 19:31:14 +0800
Subject: [PATCH 3/4] Add mother and scientist role-play datasets to datasets/README.md

---
 datasets/README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/datasets/README.md b/datasets/README.md
index bd0fcf9..94c2aad 100644
--- a/datasets/README.md
+++ b/datasets/README.md
@@ -2,7 +2,7 @@
 
 * 数据集按用处分为两种类型：**General** 和 **Role-play**
 * 数据按格式分为两种类型：**QA** 和 **Conversation**
-* 数据汇总：General（**6个数据集**）；Role-play（**3个数据集**）
+* 数据汇总：General（**6个数据集**）；Role-play（**5个数据集**）
 
 ## 数据集类型
 
@@ -27,6 +27,8 @@
 | *Role-play* |         aiwei         | Conversation |  4000+  |
 | *Role-play* |       SoulStar        |      QA      | 11200+  |
 | *Role-play* |        tiangou        | Conversation |  3900+  |
+| *Role-play* |        mother         | Conversation | 24500+  |
+| *Role-play* |       scientist       | Conversation | 28400+  |
 |     ……      |          ……           |      ……      |   ……    |
 
 ## 数据集来源
@@ -45,6 +47,8 @@
 * 数据集 aiwei 来自本项目
 * 数据集 tiangou 来自本项目
 * 数据集 SoulStar 来源 [SoulStar](https://github.com/Nobody-ML/SoulStar)
+* 数据集 mother 来自本项目
+* 数据集 scientist 来自本项目
 
 ## 数据集去重
 

From 700edfb9e8316c008def9ba548e3933817ab929a Mon Sep 17 00:00:00 2001
From: MING_X <119648793+MING-ZCH@users.noreply.github.com>
Date: Tue, 9 Apr 2024 20:53:17 +0800
Subject: [PATCH 4/4] Add mother and scientist role-play datasets to datasets/README_EN.md

---
 datasets/README_EN.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/datasets/README_EN.md b/datasets/README_EN.md
index 835de61..d77741f 100644
--- a/datasets/README_EN.md
+++ b/datasets/README_EN.md
@@ -2,7 +2,7 @@
 
 * Category of dataset: **General** and **Role-play**
 * Type of data: **QA** and **Conversation**
-* Summary: General(**6 datasets**), Role-play(**3 datasets**)
+* Summary: General(**6 datasets**), Role-play(**5 datasets**)
 
  ## Category
 * **General**: generic dataset, including psychological Knowledge, counseling technology, etc.
@@ -25,6 +25,8 @@
 | *Role-play* |         aiwei         | Conversation |  4000+  |
 | *Role-play* |       SoulStar        |      QA      | 11200+  |
 | *Role-play* |        tiangou        | Conversation |  3900+  |
+| *Role-play* |        mother         | Conversation | 24500+  |
+| *Role-play* |       scientist       | Conversation | 28400+  |
 |     ……      |          ……           |      ……      |   ……    |
 
 
@@ -41,8 +43,10 @@
 * dataset `aiwei` from this repo
 * dataset `tiangou` from this repo
 * dataset `SoulStar` from [SoulStar](https://github.com/Nobody-ML/SoulStar)
+* dataset `mother` from this repo
+* dataset `scientist` from this repo
 
 **Dataset Deduplication**：
 Combine absolute matching with fuzzy matching (Simhash) algorithms to deduplicate the dataset, thereby enhancing the effectiveness of the fine-tuning model. While ensuring the high quality of the dataset, the risk of losing important data due to incorrect matches can be reduced via adjusting the threshold.
 
-https://algonotes.readthedocs.io/en/latest/Simhash.html
\ No newline at end of file
+https://algonotes.readthedocs.io/en/latest/Simhash.html