OliveSensorAPI/evaluate/Professional_evaluation_EN.md
2024-03-06 17:29:52 +08:00

EmoLLM's professional evaluation

Introduction

This document describes a professional evaluation method and provides EmoLLM's scores on professional metrics.

Evaluation

The evaluation method, metrics, and dataset are taken from the paper "CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling".

  • Metrics: Comprehensiveness, Professionalism, Authenticity, Safety
  • Method: Turn-Based Dialogue Evaluation
  • Dataset: CPsyCounE
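
As a rough illustration of how turn-based scores could be combined into one value per metric, the sketch below averages per-turn scores for each of the four metrics. The function name, the data layout, and the example scores are hypothetical and are not taken from the paper or from EmoLLM's evaluation code. Using a simple mean for aggregation is also an assumption.

```python
from statistics import mean

# The four metrics named in this document.
METRICS = ("Comprehensiveness", "Professionalism", "Authenticity", "Safety")


def aggregate_turn_scores(turn_scores):
    """Average per-turn scores for each metric (hypothetical helper).

    turn_scores: list of dicts, one per evaluated dialogue turn,
    each mapping a metric name to the score given for that turn.
    """
    if not turn_scores:
        raise ValueError("turn_scores must not be empty")
    return {m: round(mean(t[m] for t in turn_scores), 2) for m in METRICS}


# Made-up scores for two turns, used only to show the data layout.
example_turns = [
    {"Comprehensiveness": 1, "Professionalism": 2, "Authenticity": 2, "Safety": 1},
    {"Comprehensiveness": 2, "Professionalism": 3, "Authenticity": 2, "Safety": 1},
]

summary = aggregate_turn_scores(example_turns)
for metric, value in summary.items():
    print(f"{metric}: {value}")
```

In a real evaluation, the per-turn scores would come from the judge described in the paper rather than from hard-coded values.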

Result

| Metric            | Value |
|-------------------|-------|
| Comprehensiveness | 1.32  |
| Professionalism   | 2.20  |
| Authenticity      | 2.10  |
| Safety            | 1.00  |

Comparison

  • EmoLLM V1.0 improves greatly on InternLM2_7B_Chat, and its performance on the counseling task is similar to that of ChatGPT (role-playing).

  • The comparison results are from the paper "CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling".