EmoLLM's professional evaluation

Introduction

This document describes a professional evaluation method and provides EmoLLM's scores on professional metrics.

Evaluation

The evaluation method, metrics, and dataset are taken from the paper "CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling".

Metrics: Comprehensiveness, Professionalism, Authenticity, Safety
Method: Turn-Based Dialogue Evaluation
Dataset: CPsyCounE
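
As a rough illustration of what a turn-based evaluation loop can look like, the sketch below splits a multi-turn dialogue into one evaluation item per counselor turn and then averages per-turn scores for each of the four metrics. It is not the CPsyCoun implementation: the role names, the data layout, and the function names `split_into_turns` and `average_scores` are assumptions made for this example, and the scoring step itself (a judge assigning a value per metric to each reply) is left out.

```python
from statistics import mean

# The four metrics listed above.
METRICS = ("Comprehensiveness", "Professionalism", "Authenticity", "Safety")


def split_into_turns(dialogue):
    """Yield (history, reference_reply) pairs, one per counselor turn.

    `dialogue` is a list of (role, text) tuples. The role names "client"
    and "counselor" are hypothetical and chosen only for this sketch.
    """
    for i, (role, text) in enumerate(dialogue):
        if role == "counselor":
            # Everything said before this turn is the context the model sees.
            yield dialogue[:i], text


def average_scores(turn_scores):
    """Average per-turn score dicts into one value per metric."""
    return {m: round(mean(s[m] for s in turn_scores), 2) for m in METRICS}


if __name__ == "__main__":
    dialogue = [
        ("client", "I have trouble sleeping before exams."),
        ("counselor", "That sounds stressful. When did it start?"),
        ("client", "About a month ago."),
        ("counselor", "What usually goes through your mind at night?"),
    ]
    for history, reference in split_into_turns(dialogue):
        print(len(history), "context turns ->", reference)

    # Made-up per-turn scores, only to show the aggregation step.
    made_up = [
        {"Comprehensiveness": 1, "Professionalism": 2, "Authenticity": 2, "Safety": 1},
        {"Comprehensiveness": 2, "Professionalism": 3, "Authenticity": 2, "Safety": 1},
    ]
    print(average_scores(made_up))
```

In this sketch a model under test would generate a reply for each `history`, a judge would score that reply on the four metrics, and `average_scores` would reduce the per-turn scores to a single table of one value per metric.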

Result

Model: EmoLLM V1.0 (InternLM2_7B_chat_qlora)
Scores:

Metric	Value
Comprehensiveness	1.32
Professionalism	2.20
Authenticity	2.10
Safety	1.00

Comparison

EmoLLM V1.0 improves substantially over InternLM2_7B_Chat, and its performance on the counseling task is similar to that of ChatGPT (role-playing).
The comparison results are taken from the paper "CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling".
