MING_X 54a8ad2081 Update README.md

* Update news
* Update paper link - [《CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling》](https://arxiv.org/abs/2405.16433)

2024-05-28 17:04:36 +08:00


EmoLLM's professional evaluation

Introduction

This document describes a professional evaluation method and provides EmoLLM's scores on professional metrics.

Evaluation

The evaluation method, metrics, and dataset are taken from the paper [CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling](https://arxiv.org/abs/2405.16433).

Metrics: Comprehensiveness, Professionalism, Authenticity, Safety
Method: Turn-Based Dialogue Evaluation
Dataset: CPsyCounE
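
To make "turn-based" concrete, here is a minimal sketch of one plausible reading: each client turn is evaluated on its own, with the preceding turns as context, and the per-turn scores are averaged for each metric. This is an illustration, not the paper's implementation. The function names, the `(role, text)` dialogue format, and the `generate_reply` and `score_reply` callables are all hypothetical and must be supplied by the caller.

```python
from statistics import mean

# Metric names as listed in this document.
METRICS = ("Comprehensiveness", "Professionalism", "Authenticity", "Safety")


def split_into_turns(dialogue):
    """Return a list of (history, client_utterance) pairs.

    `dialogue` is a list of (role, text) tuples with role "client" or
    "counselor" (an assumed format). `history` is everything said before
    the client utterance.
    """
    turns = []
    for i, (role, text) in enumerate(dialogue):
        if role == "client":
            turns.append((dialogue[:i], text))
    return turns


def evaluate_dialogue(dialogue, generate_reply, score_reply):
    """Average per-turn metric scores over one dialogue.

    generate_reply(history, utterance) -> str   (hypothetical: the model under test)
    score_reply(history, utterance, reply) -> dict mapping each metric to a number
                                                (hypothetical: the judge)
    Raises statistics.StatisticsError if the dialogue has no client turn.
    """
    per_turn = []
    for history, utterance in split_into_turns(dialogue):
        reply = generate_reply(history, utterance)
        per_turn.append(score_reply(history, utterance, reply))
    return {m: mean(scores[m] for scores in per_turn) for m in METRICS}
```

Passing the model and the judge in as callables keeps the sketch independent of any particular inference or scoring backend.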

Result

Model:
- EmoLLM V1.0 (InternLM2_7B_chat_qlora)
- EmoLLM V2.0 (InternLM2_7B_chat_full)
Score:

| Model | Comprehensiveness | Professionalism | Authenticity | Safety |
| --- | --- | --- | --- | --- |
| InternLM2_7B_chat_qlora | 1.32 | 2.20 | 2.10 | 1.00 |
| InternLM2_7B_chat_full | 1.40 | 2.45 | 2.24 | 1.00 |
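
The per-metric difference between the two rows above is simple arithmetic. The short sketch below computes it from the reported scores; the values are copied from the table, while the variable and function names are illustrative only.

```python
METRICS = ["Comprehensiveness", "Professionalism", "Authenticity", "Safety"]

# Scores copied from the table above, in METRICS order.
scores = {
    "InternLM2_7B_chat_qlora": [1.32, 2.20, 2.10, 1.00],  # EmoLLM V1.0
    "InternLM2_7B_chat_full": [1.40, 2.45, 2.24, 1.00],   # EmoLLM V2.0
}


def score_deltas(baseline, candidate):
    """Return {metric: candidate - baseline}, rounded to two decimals."""
    return {m: round(c - b, 2) for m, b, c in zip(METRICS, baseline, candidate)}


deltas = score_deltas(
    scores["InternLM2_7B_chat_qlora"],
    scores["InternLM2_7B_chat_full"],
)
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.2f}")
```

Rounding to two decimals matches the precision of the reported scores and avoids floating-point noise in the differences.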

Comparison

EmoLLM V2.0 scores higher than EmoLLM V1.0 on Comprehensiveness (1.40 vs. 1.32), Professionalism (2.45 vs. 2.20), and Authenticity (2.24 vs. 2.10); Safety is 1.00 for both. It surpasses role-playing ChatGPT on counseling tasks.
EmoLLM V1.0 improves substantially on InternLM2_7B_Chat, and its performance on the counseling task is similar to that of role-playing ChatGPT.
The comparison results are from the paper [CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling](https://arxiv.org/abs/2405.16433).
