Reinforcement Learning for Trading [link]

Performance functions:

- Profit or wealth
- Sharpe ratio
- Differential Sharpe ratio

The differential Sharpe ratio is a new value function for risk-adjusted return that enables learning to be done online.

Consider performance functions for systems that trade a single-asset portfolio (price series z_t).

See Moody et al. (1998) for detailed discussion of multiple asset portfolios.

Trader's actions:

- take a long position (F_t = 1)
- take a neutral position (F_t = 0)
- take a short position (F_t = -1)

The positions F_t \in {-1, 0, 1} are of constant magnitude.

The position F_t is established at the end of time interval t and is reassessed at the end of interval t+1.

The return R_t is realized at the end of the time interval (t-1, t]; it is the profit or loss resulting from the position F_{t-1} held during that interval.

Additive profit:

- where r_t = z_t - z_{t-1}.

The wealth is W_T = W_0 + P_T, where W_0 is the initial wealth and P_T is the cumulative additive profit up to time T.
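
As an illustration of the additive case, here is a minimal Python sketch. It rests on a simplifying assumption that these notes do not state: the per-period return is taken as R_t = mu * F_{t-1} * r_t, with a hypothetical fixed trade size `mu`, and with transaction costs and the risk-free rate ignored. The function name `additive_profit` and the sample numbers are made up for the example.

```python
from itertools import accumulate


def additive_profit(prices, positions, mu=1.0):
    """Cumulative additive profit P_1..P_T (simplified, assumption-laden form).

    prices:    z_0 .. z_T (length T + 1)
    positions: F_0 .. F_{T-1} (length T), each in {-1, 0, 1}
    mu:        hypothetical fixed trade size (an assumption of this sketch)
    """
    if len(positions) != len(prices) - 1:
        raise ValueError("need exactly one position per price change")
    # r_t = z_t - z_{t-1}
    r = [z1 - z0 for z0, z1 in zip(prices, prices[1:])]
    # R_t is earned by the position F_{t-1} held over the interval (t-1, t]
    R = [mu * f * rt for f, rt in zip(positions, r)]
    # P_t is the running sum of R_1..R_t
    return list(accumulate(R))


prices = [100.0, 102.0, 101.0, 104.0]
positions = [1, -1, 0]           # long, then short, then neutral
P = additive_profit(prices, positions)
print(P)                         # [2.0, 3.0, 3.0]
print(1000.0 + P[-1])            # W_T = W_0 + P_T = 1003.0
```

In this toy run the short position profits from the price drop in the second interval, and the neutral position earns nothing in the third.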

Multiplicative profits are appropriate when a fixed fraction \nu > 0 of accumulated wealth is invested in each long or short trade. The wealth at time T:

- where r_t = z_t/z_{t-1} - 1.
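
The multiplicative case can be sketched the same way. This assumes the compounding form W_T = W_0 \prod_{t=1}^{T} (1 + R_t) with the simplified return R_t = nu * F_{t-1} * r_t; transaction costs and the risk-free rate are left out, so this is not necessarily the exact expression the notes intended. The function name `multiplicative_wealth` is hypothetical.

```python
from itertools import accumulate
import operator


def multiplicative_wealth(prices, positions, w0=1.0, nu=1.0):
    """Wealth path W_1..W_T under compounding (simplified sketch).

    prices:    z_0 .. z_T (length T + 1), all strictly positive
    positions: F_0 .. F_{T-1} (length T), each in {-1, 0, 1}
    w0:        initial wealth W_0
    nu:        fraction of wealth invested in each trade (nu > 0)
    """
    if len(positions) != len(prices) - 1:
        raise ValueError("need exactly one position per price change")
    # r_t = z_t / z_{t-1} - 1
    r = [z1 / z0 - 1.0 for z0, z1 in zip(prices, prices[1:])]
    # Growth factor 1 + R_t, with R_t = nu * F_{t-1} * r_t (assumed form)
    growth = [1.0 + nu * f * rt for f, rt in zip(positions, r)]
    # W_t = W_0 * product of growth factors up to t
    return [w0 * g for g in accumulate(growth, operator.mul)]


W = multiplicative_wealth([100.0, 110.0, 99.0, 99.0], [1, -1, 0])
print(W)
```

Here a 10% rise while long and a 10% fall while short each grow wealth by about 10%, so the gains compound instead of adding.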

~~Maximizing profits~~ Maximizing risk-adjusted return (according to Modern Portfolio Theory).

The measure is the Sharpe ratio:
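
A minimal sketch of the Sharpe ratio over a finished trading period, assuming the common simplified definition S_T = mean(R_t) / std(R_t) with the risk-free rate ignored. The choice of the population standard deviation and the NaN return value for a constant return series are assumptions of this example, and `sharpe_ratio` is a hypothetical name.

```python
import statistics


def sharpe_ratio(returns):
    """Mean of the returns divided by their population standard deviation."""
    returns = list(returns)
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    std = statistics.pstdev(returns)
    if std == 0.0:
        # Ratio is undefined when the returns do not vary (sketch's choice)
        return float("nan")
    return statistics.fmean(returns) / std


print(sharpe_ratio([0.01, 0.03, -0.02, 0.02]))
```

Because the whole return series is needed before the ratio can be computed, this form suits batch evaluation more than online learning.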

Differential Sharpe ratio for online optimization of trading system performance:
