Evaluation Method

ai-lab-projects edited this page Apr 29, 2025 · 1 revision

This page describes how the trained agents (buyer and seller) are evaluated.

Evaluation Metrics

1. Total Reward

The sum of all rewards (profits and losses) achieved during the evaluation period.
Higher is better.

2. Win Rate

The percentage of profitable trades among all completed trades.
Formula: (Number of winning trades) / (Total number of trades) × 100

3. Average Return per Trade

The mean profit or loss per trade.
Important for understanding trade quality beyond just win rate.
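
As a minimal sketch, the first three metrics can be computed from a list of per-trade profits and losses. The function name `summarize_trades` and the input layout are assumptions for illustration, not the project's actual code. The win rate is returned as a fraction in [0, 1]; multiply by 100 for a percentage.

```python
from statistics import mean


def summarize_trades(profits):
    """Return total reward, win rate, and average return per trade.

    `profits` is a hypothetical list with one profit/loss value per
    completed trade (positive = winning trade).
    """
    if not profits:
        # No completed trades: report zeros instead of dividing by zero.
        return {"total_reward": 0.0, "win_rate": 0.0, "avg_return": 0.0}
    wins = sum(1 for p in profits if p > 0)
    return {
        "total_reward": sum(profits),
        "win_rate": wins / len(profits),
        "avg_return": mean(profits),
    }


summary = summarize_trades([5.0, -2.0, 3.0, -1.0])
print(summary)
```

The example shows why the average return matters alongside the win rate: two strategies with the same 50% win rate can differ widely in total reward depending on the size of their wins and losses.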

4. Average Holding Days

Average number of days the asset was held before selling.
Helps assess the strategy's turnover speed.

5. Holding Rate

Proportion of the evaluation period spent holding a position, as opposed to waiting without one.
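
The two holding metrics can be sketched as follows, assuming each trade is recorded as a hypothetical `(buy_day, sell_day)` pair of day indices, that at most one position is open at a time, and that the holding rate is measured against the total number of days in the evaluation period. The project may define these details differently.

```python
def holding_stats(trades, total_days):
    """Return (average holding days, holding rate).

    `trades` is a list of (buy_day, sell_day) index pairs that do not
    overlap; `total_days` is the length of the evaluation period.
    """
    held = [sell_day - buy_day for buy_day, sell_day in trades]
    avg_holding_days = sum(held) / len(held) if held else 0.0
    holding_rate = sum(held) / total_days if total_days else 0.0
    return avg_holding_days, holding_rate


# Three trades held for 3, 5, and 2 days over a 20-day period.
avg_days, rate = holding_stats([(0, 3), (5, 10), (12, 14)], total_days=20)
print(avg_days, rate)
```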

6. p-Value

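Statistical evaluation of strategy performance against a random-trading baseline.
Compares the agent's total reward to the distribution of total rewards obtained from random trading simulations.
The p-value estimates the probability that random trading achieves a total reward at least as high as the agent's, so a lower p-value means the result is less likely to be explained by chance alone.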
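
A one-sided empirical p-value of this kind can be sketched as below. The function name is hypothetical, and the `+ 1` terms are a common smoothing convention for simulation-based p-values (it avoids reporting exactly zero); the project may instead use the plain fraction.

```python
def empirical_p_value(agent_reward, random_rewards):
    """Estimate the share of random runs that do at least as well as the agent.

    `random_rewards` holds the total reward of each random simulation.
    The +1 in numerator and denominator is an assumed smoothing choice.
    """
    at_least_as_good = sum(1 for r in random_rewards if r >= agent_reward)
    return (at_least_as_good + 1) / (len(random_rewards) + 1)


# One of four random runs (12.0) matches or beats the agent's 10.0.
p = empirical_p_value(10.0, [1.0, 2.0, 3.0, 12.0])
print(p)
```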

7. Total Reward over Mean Random Reward

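The ratio of the agent's total reward to the mean total reward of the random strategies.
Values above 1 indicate that the agent outperformed the average random strategy, provided the mean random reward is positive.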
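
A minimal sketch of this comparison, read as a simple ratio. The function name and the choice to return NaN for a zero baseline are assumptions. The ratio is hard to interpret when the mean random reward is zero or negative, so it is best read together with the p-value.

```python
from statistics import mean


def reward_over_random_mean(agent_reward, random_rewards):
    """Divide the agent's total reward by the mean random total reward."""
    baseline = mean(random_rewards)
    if baseline == 0:
        # The ratio is undefined for a zero baseline; NaN is an assumed choice.
        return float("nan")
    return agent_reward / baseline


ratio = reward_over_random_mean(30.0, [5.0, 10.0, 15.0])
print(ratio)
```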

Evaluation Process

Evaluation on Training Data
- To check how well the model fits the training set.
Evaluation on Validation Data
- To assess the model’s ability to generalize to unseen data.
Random Simulations for p-Value
- Conduct random buy/sell simulations to establish a null distribution.
- Calculate the p-value by comparing the agent's result to random performance.
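
The random-simulation step can be sketched as follows. Everything here is an illustrative assumption rather than the project's implementation: the random trader acts with a fixed probability per day, holds at most one position, and closes any open position at the last price. The price series and the agent's reward are made-up values.

```python
import random


def random_trading_reward(prices, rng, trade_prob=0.1):
    """Simulate one random buy/sell run and return its total profit/loss."""
    total = 0.0
    entry = None  # entry price of the open position, if any
    for price in prices:
        if rng.random() < trade_prob:
            if entry is None:
                entry = price           # random buy
            else:
                total += price - entry  # random sell
                entry = None
    if entry is not None:
        total += prices[-1] - entry     # close the open position at the end
    return total


def null_distribution(prices, n_sims=1000, seed=0):
    """Total rewards of `n_sims` random runs (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [random_trading_reward(prices, rng) for _ in range(n_sims)]


prices = [100.0, 101.0, 99.5, 102.0, 103.5, 101.0, 104.0]  # made-up data
null_rewards = null_distribution(prices, n_sims=500)
agent_reward = 6.0  # hypothetical total reward of the trained agent

# Share of random runs doing at least as well, with +1 smoothing (assumed).
count = sum(1 for r in null_rewards if r >= agent_reward)
p_value = (count + 1) / (len(null_rewards) + 1)
print(p_value)
```

Seeding the generator keeps the null distribution identical across evaluations, so p-values from different checkpoints are compared against the same baseline.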

Notes

Evaluation is conducted without exploration (i.e., agents use deterministic policies without random actions).
Early stopping is performed based on validation results to avoid overfitting.
A model is saved whenever it achieves the best (lowest) validation p-value seen so far during training.
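
The early-stopping and model-saving notes can be sketched together as a generic training loop. The names `train_step`, `validate`, and `patience`, and the use of an in-memory copy instead of a file on disk, are assumptions for illustration; the toy model and p-values are made up.

```python
import copy


def train_with_early_stopping(train_step, validate, model,
                              max_epochs=100, patience=10):
    """Keep the model with the lowest validation p-value; stop when stalled.

    `train_step(model)` trains for one epoch in place and `validate(model)`
    returns the validation p-value. Both are hypothetical callables.
    """
    best_p = float("inf")
    best_model = None
    stalled = 0  # consecutive epochs without improvement
    for _ in range(max_epochs):
        train_step(model)
        p = validate(model)
        if p < best_p:
            best_p = p
            best_model = copy.deepcopy(model)  # stands in for saving to disk
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                break
    return best_model, best_p


# Toy usage: the "model" only counts epochs; p-values follow a fixed script.
scripted_p = [0.40, 0.25, 0.10, 0.12, 0.30, 0.20]
model = {"epoch": 0}


def toy_train_step(m):
    m["epoch"] += 1


def toy_validate(m):
    return scripted_p[m["epoch"] - 1]


best_model, best_p = train_with_early_stopping(
    toy_train_step, toy_validate, model, max_epochs=6, patience=2
)
print(best_model, best_p)
```

In this toy run, training stops after the fifth epoch because two consecutive epochs failed to improve on the p-value reached at epoch 3, and the epoch-3 copy is the one returned.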