xzw
a12a7ef107
add base model qlora fine-tuning config file and optimize deduplicate.py ( #128 )
2024-03-23 19:20:17 +08:00
zealot52099
53d69301ff
[DOC] EmoLLM_Scientist fine-tuning guide
2024-03-23 19:19:07 +08:00
xzw
c4187b6e9e
Add files via upload ( #127 )
2024-03-23 19:16:37 +08:00
zealot52099
8f63772651
[DOC] EmoLLM_Scientist fine-tuning guide
2024-03-23 19:14:40 +08:00
HongCheng
383789e869
Create README_internlm2_7b_base_qlora.md
...
InternLM2 7B Base QLoRA fine-tuning guide
2024-03-23 19:52:52 +09:00
zealot52099
0c72c31b4f
add README.md for Emollm_Scientist
2024-03-23 15:59:02 +08:00
HongCheng
6e0042a54d
add base model and update personal contributions
2024-03-23 16:05:05 +09:00
HongCheng
affd90b177
create upload_modelscope.py
2024-03-23 15:45:11 +09:00
HongCheng
a22ec59be5
update cli_internlm2.py
...
three methods to load the model:
1. download the model from openxlab
2. download the model from modelscope
3. use an offline model
2024-03-23 15:43:01 +09:00
HongCheng
0124001926
other 2 configs for base model
2024-03-23 15:26:46 +09:00
HongCheng
df81a99f53
add full finetune code from internlm2
2024-03-23 15:26:01 +09:00
HongCheng
252adc7eef
add base model qlora fine-tuning config file: internlm2_7b_base_qlora_e10_M_1e4_32_64.py
2024-03-23 15:25:37 +09:00
HongCheng
950cab0262
optimize deduplicate.py
...
Add time print information
save duplicate dataset as well
remove print(content)
2024-03-23 15:24:45 +09:00
Bryce Wang
dd7b6c4cc1
Add files via upload
...
After merging, cleaning, and deduplicating the two "mother" multi-turn dialogue datasets, 2439 multi-turn dialogues remain (6-8 turns each).
2024-03-22 15:13:30 -07:00
zealot52099
66b7617f04
test push dev
...
test push dev
2024-03-22 20:45:13 +08:00
xzw
a8cfdb87d4
update rag/src/data_processing.py & rag/src/config/config.py ( #123 )
2024-03-22 20:25:20 +08:00
zealot52099
b7da9a697f
Add files via upload
...
embedding_path = os.path.join(model_dir, 'embedding_model')
rerank_path = os.path.join(model_dir, 'rerank_model')
2024-03-22 20:17:19 +08:00
zealot52099
0aa58372bb
Add files via upload
...
allow user to load embedding & rerank models from cache
2024-03-22 20:15:37 +08:00
xzw
ad7329d113
[Code] update rag ( #122 )
2024-03-22 10:06:12 +08:00
xzw
382d338ab3
update rag/src/data_processing.py ( #121 )
2024-03-22 10:04:35 +08:00
xzw
ee6b365588
Update RAG pipeline ( #120 )
2024-03-22 10:02:11 +08:00
zealot52099
b5af7793d6
update rag/src/data_processing.py
2024-03-22 07:39:44 +08:00
Anooyman
2d3bd4a8f5
Update RAG pipeline
2024-03-21 22:43:09 +08:00
Anooyman
6c2c7496ba
Merge pull request #1 from SmartFlowAI/main
...
Update main
2024-03-21 20:03:19 +08:00
xzw
66fa15da5d
Dev ( #117 ) ( #118 ) ( #119 )
2024-03-21 17:02:41 +08:00
xzw
1c5a9c081c
Dev ( #117 ) ( #118 )
2024-03-21 16:14:08 +08:00
xzw
412c80aef3
Dev ( #117 )
2024-03-21 15:58:09 +08:00
xzw
8a1e0df9d3
[DOC] update datasets/README.md ( #115 )
2024-03-21 15:50:20 +08:00
xzw
e0ec624943
[Update] modify and add files related to fine-tuning with internlm2_7b_base ( #116 )
2024-03-21 15:47:35 +08:00
HongCheng
4ff7910368
Update process_merge.py
2024-03-21 16:07:18 +09:00
HongCheng
d25a304c4d
Update process_single_turn_conversation_construction.py
2024-03-21 16:06:41 +09:00
HongCheng
085a01eafa
add dataset processing codes
...
1. update process.py for multi_turn_dataset(1 and 2) and data.json, data_pro.json
2. add datasets\processed\process_single_turn_conversation_construction.py for single-turn dataset (1 and 2)
3. add datasets\processed\process_merge.py for these 6 updated datasets in datasets\processed\
2024-03-21 16:01:54 +09:00
HongCheng
ce2cb5156c
update data.json (delete 4 empty data)
...
4 empty lines in data.json: 425, 483, 742, 1120
2024-03-21 15:56:54 +09:00
HongCheng
d42f378eaa
add internlm2_7b_base_qlora_e3.py and modify requirements.txt
2024-03-21 15:55:50 +09:00
zealot52099
e2025cc8ea
[DOC] update datasets/README.md
2024-03-21 08:24:15 +08:00
zealot52099
3b21f79c3c
Merge branch 'dev' of https://github.com/SmartFlowAI/EmoLLM into dev
2024-03-21 07:59:16 +08:00
zealot52099
c354ffd7e0
[DOC] update datasets/README.md
2024-03-21 07:58:13 +08:00
xzw
fcd7ea5327
Merge pull request #114 from SmartFlowAI/dev
...
Dev
2024-03-20 23:45:28 +08:00
xzw
f5eb0ddc93
Merge pull request #113 from lll997150986/main
...
scientist.json
2024-03-20 23:44:46 +08:00
jeky
dbdd731565
1111
2024-03-20 23:25:07 +08:00
xzw
817c25b349
Merge pull request #112 from zealot52099/dev
...
update deduplicate.py
2024-03-20 23:20:14 +08:00
zealot52099
927659148c
Merge branch 'dev' of https://github.com/SmartFlowAI/EmoLLM into dev
2024-03-20 23:14:20 +08:00
zealot52099
77ff2d079c
update deduplicate.py
2024-03-20 23:08:36 +08:00
xzw
4e8271a540
Merge pull request #110 from SmartFlowAI/dev
...
Dev
2024-03-20 19:51:33 +08:00
MING_X
9e86195458
Merge pull request #109 from MING-ZCH/main
...
[DOC] Update README.md
2024-03-20 18:01:15 +08:00
xzw
585facad06
Merge pull request #108 from zealot52099/dev
...
update rag/src/data_processing.py & main.py
2024-03-20 17:58:12 +08:00
MING_X
c35f75ff1d
Update README_EN.md
2024-03-20 17:54:26 +08:00
zealot52099
41744ed604
[DOC] update datasets/README_EN.md
2024-03-20 17:52:23 +08:00
zealot52099
9b4e58f732
[DOC]update datasets/README.md
2024-03-20 17:40:31 +08:00
MING_X
12959e6d1a
Update README.md
2024-03-20 17:40:07 +08:00