This document explains how to use the `eval.py` and `metric.py` scripts, which evaluate the generation results of EmoLLM, a large language model for mental health.
Before evaluation, convert the raw multi-turn conversation data into single-turn data.
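
The conversion can be sketched roughly as follows. This is a minimal illustration, not the project's actual preprocessing code: the record layout (a `conversation` list whose items carry `input` and `output` fields), the function name `to_single_turn`, and the `keep_history` option are all assumptions made for the example.

```python
from typing import Dict, List


def to_single_turn(dialogues: List[Dict], keep_history: bool = False) -> List[Dict[str, str]]:
    """Flatten multi-turn dialogues into independent single-turn samples.

    Assumed input layout (hypothetical):
        {"conversation": [{"input": "...", "output": "..."}, ...]}
    """
    samples = []
    for dialogue in dialogues:
        history: List[str] = []
        for turn in dialogue["conversation"]:
            if keep_history:
                # Prepend all earlier utterances so the sample keeps its context.
                prompt = "\n".join(history + [turn["input"]])
            else:
                prompt = turn["input"]
            samples.append({"input": prompt, "output": turn["output"]})
            history.extend([turn["input"], turn["output"]])
    return samples
```

With `keep_history=False` each turn becomes a standalone sample; with `keep_history=True` the earlier turns are folded into the prompt, which may be closer to what the model saw during the conversation.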
### eval.py
The `eval.py` script generates the doctor's responses and evaluates them. It consists of the following steps:
1. Load the model and tokenizer.
2. Set test parameters, such as the number of test data and batch size.
3. Load the test data.
4. Generate responses and evaluate them.
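
The overall flow of these steps might look like the sketch below. It is not the content of `eval.py`: the names `run_eval`, `generate_fn`, `score_fn`, `test_num`, and `batch_size` are hypothetical, and `generate_fn` stands in for the loaded model and tokenizer so that the example stays self-contained.

```python
from typing import Callable, Dict, Iterator, List, Optional


def batched(items: List, batch_size: int) -> Iterator[List]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_eval(
    samples: List[Dict[str, str]],
    generate_fn: Callable[[List[str]], List[str]],   # stand-in for model + tokenizer
    score_fn: Callable[[List[str], List[str]], float],
    test_num: Optional[int] = None,                  # how many samples to evaluate
    batch_size: int = 8,
) -> float:
    """Generate a response for each sample in batches, then score all of them."""
    subset = samples if test_num is None else samples[:test_num]
    predictions: List[str] = []
    references: List[str] = []
    for batch in batched(subset, batch_size):
        prompts = [sample["input"] for sample in batch]
        predictions.extend(generate_fn(prompts))
        references.extend(sample["output"] for sample in batch)
    return score_fn(predictions, references)
```

Passing the generator and the scorer in as callables keeps the batching logic independent of any particular model or metric.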
### metric.py
The `metric.py` script contains functions that compute the evaluation metrics, currently BLEU and ROUGE. Scores can be computed at either the character level or the word level.
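
To illustrate the difference between the two levels, here is a simplified ROUGE-1 F1 written with only the standard library. It is a sketch, not the implementation in `metric.py`: the function names and the `level` parameter are assumptions, and word-level splitting on whitespace is a placeholder (Chinese text would typically need a word segmenter instead).

```python
from collections import Counter
from typing import List


def tokenize(text: str, level: str = "char") -> List[str]:
    """Split text into characters (ignoring whitespace) or whitespace-separated words."""
    if level == "char":
        return [ch for ch in text if not ch.isspace()]
    if level == "word":
        return text.split()
    raise ValueError(f"unknown level: {level!r}")


def rouge1_f1(prediction: str, reference: str, level: str = "char") -> float:
    """Simplified ROUGE-1: F1 over the clipped unigram overlap."""
    pred_counts = Counter(tokenize(prediction, level))
    ref_counts = Counter(tokenize(reference, level))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

The same pair of sentences can score differently at each level, because a word that differs from the reference may still share most of its characters with it.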