MING_X 54a8ad2081 Update README.md

* Update news
* Update paper link - [《CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling》](https://arxiv.org/abs/2405.16433)

2024-05-28 17:04:36 +08:00


EmoLLM's professional evaluation

Introduction

This document describes a professional evaluation method and provides EmoLLM's scores on professional metrics.

Evaluation

The evaluation method, metrics, and dataset are taken from the paper [《CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling》](https://arxiv.org/abs/2405.16433).

- Metrics: Comprehensiveness, Professionalism, Authenticity, Safety
- Method: Turn-Based Dialogue Evaluation
- Dataset: CPsyCounE
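
As a rough illustration of how turn-based scores could be combined, the sketch below averages per-turn scores for each of the four metrics across one dialogue. It assumes each turn is scored separately and the results are then averaged; the `TurnScore` structure, the `average_turn_scores` function, and the sample values are hypothetical and come from neither the paper nor the EmoLLM codebase. Refer to the paper for the actual scoring procedure.

```python
from dataclasses import dataclass
from statistics import mean

# The four metrics named in this document.
METRICS = ("comprehensiveness", "professionalism", "authenticity", "safety")


@dataclass
class TurnScore:
    """Hypothetical container for the scores given to one dialogue turn."""
    comprehensiveness: float
    professionalism: float
    authenticity: float
    safety: float


def average_turn_scores(turns):
    """Average each metric over all scored turns of a single dialogue."""
    if not turns:
        raise ValueError("at least one scored turn is required")
    return {metric: mean(getattr(turn, metric) for turn in turns) for metric in METRICS}


# Made-up scores for a two-turn dialogue, for illustration only.
example_turns = [TurnScore(1, 2, 2, 1), TurnScore(2, 3, 2, 1)]
dialogue_scores = average_turn_scores(example_turns)
```

Averaging per dialogue first and then across dialogues is only one possible aggregation; a different weighting would change the final numbers.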

Result

Model:
- EmoLLM V1.0 (InternLM2_7B_chat_qlora)
- EmoLLM V2.0 (InternLM2_7B_chat_full)

Score:

| Model | Comprehensiveness | Professionalism | Authenticity | Safety |
| --- | --- | --- | --- | --- |
| InternLM2_7B_chat_qlora | 1.32 | 2.20 | 2.10 | 1.00 |
| InternLM2_7B_chat_full | 1.40 | 2.45 | 2.24 | 1.00 |
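
To make the gap between the two models explicit, the following sketch computes the per-metric difference from the scores reported above. The scores are copied from the results; the `score_deltas` helper is a hypothetical name introduced only for this example.

```python
# Scores copied from the results reported in this document.
scores = {
    "InternLM2_7B_chat_qlora": {
        "Comprehensiveness": 1.32,
        "Professionalism": 2.20,
        "Authenticity": 2.10,
        "Safety": 1.00,
    },
    "InternLM2_7B_chat_full": {
        "Comprehensiveness": 1.40,
        "Professionalism": 2.45,
        "Authenticity": 2.24,
        "Safety": 1.00,
    },
}


def score_deltas(baseline, candidate):
    """Return candidate minus baseline for each metric, rounded to two decimals."""
    return {metric: round(candidate[metric] - baseline[metric], 2) for metric in baseline}


deltas = score_deltas(scores["InternLM2_7B_chat_qlora"], scores["InternLM2_7B_chat_full"])
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.2f}")
```

Rounding to two decimals matches the precision of the reported scores and avoids floating-point noise in the differences.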

Comparison

EmoLLM V2.0 improves on EmoLLM V1.0 in Comprehensiveness (1.32 to 1.40), Professionalism (2.20 to 2.45), and Authenticity (2.10 to 2.24), while Safety is unchanged at 1.00. It surpasses role-playing ChatGPT on counseling tasks.
EmoLLM V1.0 improves substantially on the base model InternLM2_7B_Chat; its performance on the counseling task is similar to that of ChatGPT (role-playing).
The comparison results are taken from the paper 《CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling》.
