EmoLLM's professional evaluation

Introduction

This document describes a professional evaluation method and provides EmoLLM's scores on professional metrics.

Evaluation

The evaluation method, metrics, and dataset are taken from the paper [《CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling》](https://arxiv.org/abs/2405.16433).

  • Metrics: Comprehensiveness, Professionalism, Authenticity, Safety
  • Method: Turn-Based Dialogue Evaluation
  • Dataset: CPsyCounE
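As a rough illustration of how per-turn scores could be aggregated into per-metric averages like those reported in this document, here is a minimal sketch. It assumes that each evaluated dialogue turn has already received a numeric score on each of the four metrics. The function name `average_scores` and the sample numbers are hypothetical; they are not taken from the paper or from the EmoLLM code.

```python
from statistics import mean

# The four metrics named in this document.
METRICS = ("Comprehensiveness", "Professionalism", "Authenticity", "Safety")


def average_scores(turn_scores):
    """Average per-turn scores for each metric, rounded to two decimals.

    `turn_scores` is a list of dicts, one per evaluated turn, mapping
    each metric name to the score that turn received.
    """
    if not turn_scores:
        raise ValueError("no scored turns to average")
    return {m: round(mean(t[m] for t in turn_scores), 2) for m in METRICS}


# Hypothetical scores for three turns (made-up numbers, for illustration only).
sample = [
    {"Comprehensiveness": 1, "Professionalism": 2, "Authenticity": 2, "Safety": 1},
    {"Comprehensiveness": 2, "Professionalism": 3, "Authenticity": 2, "Safety": 1},
    {"Comprehensiveness": 1, "Professionalism": 2, "Authenticity": 3, "Safety": 1},
]

print(average_scores(sample))
```

How the per-turn scores are actually produced and aggregated is defined in the paper; this sketch only shows the averaging step under the assumption stated above.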

Result

| Model | Comprehensiveness | Professionalism | Authenticity | Safety |
| --- | --- | --- | --- | --- |
| InternLM2_7B_chat_qlora | 1.32 | 2.20 | 2.10 | 1.00 |
| InternLM2_7B_chat_full | 1.40 | 2.45 | 2.24 | 1.00 |
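
To make the gap between the two reported models easier to read, the per-metric differences can be computed directly from the scores above. This is a small sketch: the dictionary layout and the helper name `score_deltas` are illustrative choices, and only the numbers come from the results in this document.

```python
# Scores copied from the results reported in this document.
results = {
    "InternLM2_7B_chat_qlora": {
        "Comprehensiveness": 1.32,
        "Professionalism": 2.20,
        "Authenticity": 2.10,
        "Safety": 1.00,
    },
    "InternLM2_7B_chat_full": {
        "Comprehensiveness": 1.40,
        "Professionalism": 2.45,
        "Authenticity": 2.24,
        "Safety": 1.00,
    },
}


def score_deltas(base, other):
    """Return other - base for every metric, rounded to two decimals."""
    return {metric: round(other[metric] - base[metric], 2) for metric in base}


deltas = score_deltas(
    results["InternLM2_7B_chat_qlora"],
    results["InternLM2_7B_chat_full"],
)
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.2f}")
```

Under these numbers, the full fine-tune scores higher than the QLoRA fine-tune by 0.08 on Comprehensiveness, 0.25 on Professionalism, and 0.14 on Authenticity, while Safety is identical for both.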

Comparison

  • EmoLLM V2.0 improves substantially on EmoLLM V1.0 in all scores and surpasses role-playing ChatGPT on counseling tasks.

  • EmoLLM V1.0 improves substantially on InternLM2_7B_Chat; its performance on the counseling task is similar to that of role-playing ChatGPT.

  • The comparison results are taken from the paper 《CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling》.