diff --git a/docs/briefs/01-read-dukascopy-cache.md b/docs/briefs/01-read-dukascopy-cache.md deleted file mode 100644 index c7daf34..0000000 --- a/docs/briefs/01-read-dukascopy-cache.md +++ /dev/null @@ -1,107 +0,0 @@ -Always read and follow instructions in `CLAUDE.md` in the project root before processing the brief. - -# Task Brief: Read `tradedesk-dukascopy` cache data for backtests - -## Context - -Data for backtests is currently supplied in the form of CSV candle files at some resolution (typically 1 minute) per instrument with the `tradedesk` library aggregating these into the periods actually required by the running strategy. This forces a one-off creation of the "chart" data per instrument for use in a backtest, where the chart file contains all the candles that will be used in the backtest for that instrument. - -The `tradedesk-dukascopy` tool provides a way to export such files from its cache of market data by specifying start/end dates, instrument names and resampling period. But any method of generating the data in the required format is equally valid. - -## Goal - -The goal of this session is to enable `tradedesk` to read market data directly from the `tradedesk-dukascopy` cache and avoid having to create and re-create exported chart files for backtests. Once the tool has downloaded and converted data, it should be immediately possible to use it in a test. The downloader now stores daily "tick" files consisting of one line of prices for each movement. These are the files that will need to be read in order to create OHLCV candles for injection into the bus for backtest clients. - -### Data Format - -All cached data is stored in the following directory layout from the root of the cache-dir: - -```text -SYMBOL -├── YEAR - ├── 00 - │ ├── 01_ticks.csv.zst - │ ├── 02_ticks.csv.zst - │ ├── ... - │ ├── 31_ticks.csv.zst -``` -Months are `00` based; 00 = January, 11 = December. This is a Dukascopy convention. 
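The zero-based month convention above is easy to get wrong; a minimal sketch of the per-day path construction (a hypothetical helper for illustration, not the tool's actual code):

```python
from datetime import date
from pathlib import Path


def tick_path(cache_dir: Path, symbol: str, day: date) -> Path:
    """Cache path for one day of tick data (illustrative only).

    Dukascopy months are zero-based: January is directory "00",
    December is "11"; days are one-based and zero-padded.
    """
    return (
        cache_dir / symbol / str(day.year)
        / f"{day.month - 1:02d}" / f"{day.day:02d}_ticks.csv.zst"
    )


# 30 January 2026 lands in the "00" (January) month directory.
print(tick_path(Path("cache"), "GBRIDXGBP", date(2026, 1, 30)))
```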
Each day within the month has a single compressed file of tick data. - -File compression is done with the Python `zstandard` library, which should be performant enough to decompress on the fly. - -This is the data format for all decompressed Dukascopy data: - -```csv -ts,bid,ask,bid_vol,ask_vol -2026-01-30T00:00:02.244000+00:00,13807.7,13808.9,0.8999999761581421,1.6299999952316284 -2026-01-30T00:00:04.506000+00:00,13807.9,13808.8,0.8999999761581421,0.7300000190734863 -2026-01-30T00:00:04.609000+00:00,13807.9,13808.9,0.8999999761581421,1.7999999523162842 -2026-01-30T00:00:04.711000+00:00,13807.7,13808.9,0.8999999761581421,0.8999999761581421 -2026-01-30T00:00:10.805000+00:00,13808.0,13808.8,0.8999999761581421,0.8999999761581421 -2026-01-30T00:00:11.218000+00:00,13808.1,13809.0,0.8999999761581421,2.9800000190734863 -2026-01-30T00:00:12.487000+00:00,13808.0,13808.9,0.8999999761581421,4.5 -2026-01-30T00:00:12.590000+00:00,13807.8,13808.9,0.8999999761581421,0.8999999761581421 -2026-01-30T00:00:12.894000+00:00,13807.8,13808.7,0.8999999761581421,0.8999999761581421 -``` - -## Approach - - - The original data format does not need to be retained in the framework for backwards compatibility. Tick files will be the only supported format going forward. The framework should, however, handle source files that are compressed (with `zstandard`) or not. - - Chart files naturally started and stopped at their first and last entries. Running a backtest will now need two additional parameters / CLI args, `--from` and `--to`, to determine this range. - - Missing data days within the cache are to be expected (for example when markets are closed). - - Data using old formats might exist in the cache. This might include hourly files with `.bi5` extensions or differing directory structures if it was collected with older versions of the tool.
If it is found, exit the backtest gracefully with an error message and advise the user to re-run the `tradedesk-dc-export` tool which will correct the cache. - - There may be existing code in the `tradedesk-dukascopy` project that will be of use as a starting point since it already implements some of these requirements to produce charts. Use it if so (but copy, do not reference/import) as that will likely be removed from the tool once this work is complete. - - - - -### Stretch goal - -The following should only be evaluated for complexity, do not write code or tests for them. - - - Optionally use a single side (default to BUY) or use both BUY and SELL prices to include spread costs. Use a CLI argument of `--side` with choices `{"buy", "sell", "both"}` - - Update the reporting (`analysis.md` file) to show the spread costs if used - -## Output - -### New files - -**`tradedesk/execution/backtest/dukascopy.py`** — cache reader module. Tick-reading logic copied from `tradedesk-dukascopy` (no import reference). Provides: -- `read_dukascopy_candles(cache_dir, symbol, period, date_from, date_to, *, price_side)` — reads daily tick files for the date range, skips missing days silently, detects old `.bi5` format and exits with guidance, resamples to `Candle` objects. -- Internal helpers: `_iter_days`, `_tick_paths`, `_check_old_format`, `_load_tick_rows`, `_period_to_pandas_rule`, `_ticks_to_candle_df`. - -**`tests/execution/backtest/test_dukascopy_cache.py`** — 36 tests covering: day iteration, path construction, period conversion, old-format detection, compressed/uncompressed loading, missing-day skipping, resampling correctness, timestamp ordering, and `BacktestClient.from_dukascopy_cache` integration. - -### Modified files - -**`tradedesk/execution/backtest/client.py`** — added `from_dukascopy_cache(cache_dir, *, symbol, instrument, period, date_from, date_to, price_side)` classmethod. 
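The old-format detection described above can be sketched in a few lines; the helper name and message wording here are assumptions, not the real `_check_old_format`:

```python
from pathlib import Path


def check_old_format(cache_dir: Path, symbol: str) -> None:
    """Exit gracefully if legacy .bi5 files are present (illustrative sketch)."""
    legacy = sorted((cache_dir / symbol).rglob("*.bi5"))
    if legacy:
        raise SystemExit(
            f"Found {len(legacy)} legacy .bi5 file(s) under {cache_dir / symbol}. "
            "Re-run the tradedesk-dc-export tool to rebuild the cache."
        )
```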
- -**`tradedesk/execution/backtest/runner.py`** — `BacktestSpec` now takes `cache_dir`, `symbol`, `date_from`, `date_to`, `price_side` instead of `candle_csv`. `run_backtest` uses `BacktestClient.from_dukascopy_cache`. - -**`pyproject.toml`** — added `zstandard>=0.25` dependency. - -**`tests/execution/backtest/test_backtest_runner.py`** — updated for new `BacktestSpec` fields. - -### Stretch goal assessment - -`--side` (buy/sell/both) is already partially supported: `price_side` is threaded through `BacktestSpec`, `from_dukascopy_cache`, and `read_dukascopy_candles`. Exposing it as a CLI argument is a small addition. The spread-cost annotation in analysis reporting would require the reporting layer to carry the `price_side` context and annotate equity/trade output — moderate complexity, ~1–2 hours. - - -## Usage by Model - -**claude-sonnet-4-6:** - -| Type | Tokens | -| :--- | :--- | -| Input | 0.1k | -| Output | 34.7k | -| Cache Read | 6.6m | -| Cache Write | 170.3k | - -**claude-haiku-4-5:** - -| Type | Tokens | -| :--- | :--- | -| Input | 0.1k | -| Output | 6.4k | -| Cache Read | 300.3k | -| Cache Write | 66.6k | diff --git a/docs/briefs/02-read-new-cache-format.md b/docs/briefs/02-read-new-cache-format.md deleted file mode 100644 index 426a798..0000000 --- a/docs/briefs/02-read-new-cache-format.md +++ /dev/null @@ -1,64 +0,0 @@ -Always read and follow instructions in `CLAUDE.md` in the project root before processing the brief. - -# Amendment to `01-read-dukascopy-cache.md` - read candles instead - -## Context - -The changes outlined in `01-read-dukascopy-cache.md` worked as implemented but with a severe performance penalty of converting the cached tick data to 1-minute candle resolutions in memory. This penalty was paid on each backtest run. - -`tradedesk-dukascopy` has now been amended to store compressed 1-minute candle files in the cache instead (2 per day, one with bid prices and one with ask prices). 
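Aggregating the cached 1-minute candles up to a backtest's target period follows the standard OHLCV resampling pattern; a toy pandas sketch with synthetic data (illustrative, not the project's code):

```python
import pandas as pd

# Ten synthetic 1-minute candles starting at 09:00 UTC.
idx = pd.date_range("2026-01-30 09:00", periods=10, freq="min", tz="UTC")
df = pd.DataFrame(
    {
        "open": [100 + i for i in range(10)],
        "high": [102 + i for i in range(10)],
        "low": [98 + i for i in range(10)],
        "close": [101 + i for i in range(10)],
        "volume": [10] * 10,
    },
    index=idx,
)

# Aggregate to 5-minute bars: first open, max high, min low,
# last close, summed volume.
bars = df.resample("5min").agg(
    {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
)
print(bars)
```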
- -## Goal - -The goal of this session is to amend the code to read and use the already available 1-minute candle files from the cache, aggregating those as required for the backtest parameters. - -The goal is successfully achieved when a backtest can be run using the new format for cached dukascopy data as input to the test. - - -## Approach - - - Amend existing code that reads the dukascopy cache to expect and require 1-minute candle files using the correct buy/sell side price. - - Discard any code that generates 1-minute candles internally as these are already created. - - Update any supporting code and tests to match the changes - - Ensure all tests pass, and that the code is compliant with `mypy --strict` - - Make specific note of any changes to the public API that scripts and calling code may need to be aware of - - Do not be concerned with backward compatibility or API shims, breakages are acceptable - -## Output - -### Modified files - -**`tradedesk/execution/backtest/dukascopy.py`** — rewired to read 1-min candle files. - -Removed: -- `_tick_paths()` — replaced by `_candle_path()` -- `_load_tick_rows()` — replaced by `_load_daily_candles()` -- `_ticks_to_candle_df()` — no longer needed; candles are pre-computed in the cache - -Added: -- `_candle_path(cache_dir, symbol, day, side)` — returns `{day}_{side}.csv.zst` path -- `_load_daily_candles(path)` — decompresses and returns a candle DataFrame with UTC DatetimeIndex; returns `None` on missing/corrupt file - -Changed in `read_dukascopy_candles`: -- `price_side` now selects which cached file to read (`"bid"` or `"ask"`); `"mid"` is no longer supported (no mid file exists in the cache) -- Loads 1-min candle DataFrames per day, concatenates, and resamples once with `first/max/min/last/sum` aggregation — matching the `CandleAggregator` pattern -- Error message updated: `"No candle data found"` (was `"No tick data found"`) - -**`tests/execution/backtest/test_dukascopy_cache.py`** — rewritten for candle format. 
- -Removed: tests for `_tick_paths`, `_load_tick_rows`, `_ticks_to_candle_df` - -Added: tests for `_candle_path`, `_load_daily_candles`, `_read_dukascopy_candles_ask_side`, `_read_dukascopy_candles_aggregates_to_15min`, `_read_dukascopy_candles_invalid_side_raises`, `test_backtest_client_from_dukascopy_cache_ask_side` - -Test count: 36 (unchanged). - -### Public API changes - -- `_tick_paths`, `_load_tick_rows`, and `_ticks_to_candle_df` are removed from the public-ish internal API. Any calling code that imported them directly will need updating. -- `read_dukascopy_candles` now raises `ValueError` for `price_side="mid"` (was silently supported). -- Error message prefix changed from `"No tick data found"` to `"No candle data found"` — any caller matching the old string will need updating. - -### Result - -498 tests pass. `mypy --strict` reports only the pre-existing `import-untyped` error for `pandas` (shared with `recording/report.py`; no stubs installed project-wide). - diff --git a/docs/briefs/03-improve-startup-performance.md b/docs/briefs/03-improve-startup-performance.md deleted file mode 100644 index 1552f08..0000000 --- a/docs/briefs/03-improve-startup-performance.md +++ /dev/null @@ -1,135 +0,0 @@ -Always read and follow instructions in `CLAUDE.md` in the project root before processing the brief. - -# Improve backtest startup performance - -## Context - -Backtests work by reading 1-minute OHLCV candle files, aggregating them as necessary, and using them as the source of streaming prices to test strategies. Currently these candles are all read into memory prior to the test starting. - -In `tradedesk` applications, a portfolio can consist of several instruments and potentially long time periods, causing delayed starts (with no user feedback, because logging has not been initiated at this point) and inefficient use of memory.
- -## Role - -In this session you will take on a role described in https://github.com/radiusred/.github/doc/agent-roles/consultant-architect.md (or the local repository clone) - -## Goal - -The goal of this session is to improve both startup time and memory efficiency of backtesting. With an event driven system, the aim should be to ideally inject candles into the event bus as they are available. - -The goal is successfully achieved when long, multi-instrument backtests can be run with no appreciable increase in startup time or memory overhead compared with shorter or single-instrument runs. - - -## Approach - - - Evaluate (do NOT write/edit code) two approaches to data loading. - - Should all the candles still be loaded into memory before starting, in which case multi-thread the load (one thread per instrument) before finally sorting by date and aggregating instruments into the current single candle list. This may solve the time issue even if the memory problem is not addressed. However, it may solve neither if the multi-threaded approach blocks on disk IO. - - Can injection of candles into the "bus" begin as soon as they are read - effectively implementing a real stream with the disk cache as the source doing what a broker's streaming client would do. This is more idealised, solves both problems but could be more complex than it's worth. - -Outline pros and cons of each approach, plus any other you feel worthy of consideration. Agree the way forward, but do NOT implement anything at this stage, implementation will be in a later brief. - -## Output - -### Architecture Observations - -There are two distinct memory allocations to account for, both of which are problems: - -1. **Load phase** (`build_portfolio_backtest_client`): Reads all CSV files into `BacktestClient._history` — a `dict[(epic, period) → list[Candle]]`. This happens *before* `asyncio.run()`, so it is synchronous, sequential, and unlogged. - -2. 
**Stream build phase** (`BacktestStreamer.run()`): Iterates every loaded candle, constructs a `CandleClosedEvent` object per candle, appends to a flat `stream` list, then calls `list.sort()`. This is a *second full materialisation* of the entire dataset, doubling the memory footprint before a single event is dispatched. - -For a 4-year, 5-instrument backtest at 1MIN resolution, each phase is on the order of several million objects. The sort in phase 2 is also O(N log N) across the full merged set, adding non-trivial CPU time before the replay begins. - -Any solution must address **both phases** to be effective. - ---- - -### Approach A: Parallel Loading (Multi-threaded) - -Load each instrument's CSV concurrently via a `ThreadPoolExecutor`, then proceed as now. - -**Pros** -- Minimal architectural change — only `build_portfolio_backtest_client` is touched -- Reading zstd-compressed CSV is largely IO-bound; threads bypass the GIL for that portion -- Proportional speedup with instrument count for the load phase -- Low risk: downstream code (BacktestClient, BacktestStreamer, strategies) unchanged - -**Cons** -- Addresses startup time only, not memory — all candles remain fully resident in `_history` and then again in `stream` -- Pandas resampling (1MIN → target) is CPU-bound; GIL limits real parallelism there, so the improvement may be partial -- If instruments share a single storage device, concurrent reads may contend rather than parallelise (less relevant with SSD) -- `BacktestStreamer.run()` still builds and sorts the full `stream` list — this phase dominates at scale and is untouched -- The problem gets worse linearly with scale; this approach just defers it - -**Verdict**: Partial, short-term fix. Solves neither the fundamental memory problem nor the stream-build bottleneck. Not recommended as the primary strategy. 
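The parallel load phase of Approach A can be sketched in a few lines; the loader body is a stand-in for the real zstd/CSV reader (which is where the GIL would actually be released during IO):

```python
from concurrent.futures import ThreadPoolExecutor


def load_instrument(symbol: str) -> list[str]:
    # Stand-in for the real zstd/CSV loader; real disk IO releases the GIL.
    return [f"{symbol}-candle-{i}" for i in range(3)]


symbols = ["AAA", "BBB", "CCC"]
with ThreadPoolExecutor(max_workers=len(symbols)) as pool:
    per_instrument = list(pool.map(load_instrument, symbols))

# Flatten into the single candle list the current streamer expects;
# the final sort by timestamp would follow in the real code.
candles = [c for chunk in per_instrument for c in chunk]
print(len(candles))  # 9
```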
- ---- - -### Approach B: Streaming Injection (Lazy/Generator Pipeline) - -Replace upfront loading with per-instrument lazy generators that yield candles as they are read, and replace the flat sorted `stream` list with a k-way merge across those generators. - -The key observation is that **candles within each instrument file are already sorted chronologically**. A k-way merge across `k` instruments using `heapq.merge()` produces a fully ordered interleaved stream with O(log k) overhead per event — no upfront sort required, and only a constant-size heap in memory at any point. - -The `BacktestStreamer.run()` loop body would be unchanged — it still dispatches one event at a time — but the `stream` list and `stream.sort()` are replaced with a generator pipeline. - -**Pros** -- Solves both problems: startup begins immediately (no pre-load phase); memory per in-flight event is O(1) rather than O(total candles) -- No architectural change to the event dispatch path or strategies -- Scales arbitrarily — a 10-year, 20-instrument backtest uses no more memory than a 1-day, 1-instrument run -- Aligns with the system's stated design intent: injecting events as they arrive, as a real broker feed would -- `heapq.merge()` is already available in the standard library; no new dependencies - -**Cons** -- **Warmup mechanism is broken by this approach.** `get_historical_candles(epic, period, n)` reads from `_history`, which would no longer be pre-populated. Strategies that call this during warmup (to seed indicators before live replay) would fail. - - This is the dominant architectural challenge. - - MacroTrendStrategy already works around this by gating on `_bar_count >= period` rather than using warmup — that pattern generalises, but it requires strategy-level changes. - - An alternative: pre-load only the warmup window (first `warmup_bars` worth of data per instrument) into `_history`, then stream the remainder lazily. This is a small fraction of the total data. 
-- The base-period pre-aggregation step in `run_portfolio_backtest` also populates `_history[(epic, target_period)]` — this would need rethinking, though it could be done inline in the generator chain. -- Slightly more complex streaming code; ordering bugs would be harder to detect than with a sort-then-replay approach. - -**Verdict**: The right long-term architecture. The warmup issue is the only substantive obstacle and has a clear resolution path. - ---- - -### Approach C (Additional Consideration): Lazy Pipeline with Partial Warmup Pre-load - -A refinement of Approach B that resolves the warmup issue cleanly: - -- For each instrument, pre-load only the first `max(warmup_bars, aggregation_window)` candles into `_history` as today — enough to service `get_historical_candles()` calls during warmup -- Stream the remainder lazily via per-instrument generators -- The k-way merge begins after warmup completes (i.e. after `SessionReadyEvent`) so that warmup data and live-stream data are never interleaved incorrectly -- Base-period aggregation remains as a transform layer in the generator chain rather than a pre-pass - -This preserves full backward compatibility with existing warmup-dependent strategies while gaining the memory and startup benefits of lazy streaming. It is the pragmatic middle path: no strategy changes required, no warmup regressions, full scalability. - ---- - -### Recommendation - -**Approach C** is the recommended path. - -Approach A offers a small, bounded startup improvement but leaves the fundamental problems untouched — it is not worth the ongoing complexity budget given it will need to be revisited. - -Approach B/C together represent a clean, principled solution that aligns with the event-driven design of the system. 
The implementation surface is well-contained: - -- `BacktestStreamer`: replace `stream` list + sort with k-way merge generator -- `BacktestClient` / `dukascopy.py`: expose a lazy candle generator alongside (or replacing) the current eager loader -- `build_portfolio_backtest_client` / `run_portfolio_backtest`: pre-load warmup slice only; remove full upfront load -- Strategies: no changes required if partial pre-load is used - -The warmup boundary (when lazy streaming begins) maps naturally onto the existing `SessionStartedEvent` → `SessionReadyEvent` lifecycle, which already separates warmup from live replay. - -One recommendation for the implementation brief: define a `CandleSource` protocol (or similar) that both the eager `list[Candle]` and the lazy generator can satisfy. This keeps `BacktestStreamer` decoupled from the loading mechanism and avoids divergent code paths for tests versus production backtests. - - -# Cost -``` -Total cost: $0.4956 -Total duration (API): 4m 25s -Total duration (wall): 12h 25m 16s -Total code changes: 91 lines added, 1 line removed - -Usage by model: -claude-sonnet-4-6: 9 input, 5.7k output, 118.9k cache read, 21.2k cache write ($0.3335) -claude-haiku-4-5: 107 input, 6.9k output, 599.1k cache read, 54.1k cache write ($0.1621) -``` diff --git a/docs/briefs/03b-improve-startup-performance.md b/docs/briefs/03b-improve-startup-performance.md deleted file mode 100644 index d981ff4..0000000 --- a/docs/briefs/03b-improve-startup-performance.md +++ /dev/null @@ -1,110 +0,0 @@ -Always read and follow instructions in `CLAUDE.md` in the project root before processing the brief. - -# Improve startup performance (implementation) - -## Context - -See the context and architectural recommendations in `03-improve-startup-performance.md`. 
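The `heapq.merge()` k-way merge recommended above can be sketched as follows; decorating each stream with a shared counter means the heap never compares the (potentially non-orderable) payloads on timestamp ties:

```python
import heapq
import itertools


def stream(symbol, timestamps):
    # Stand-in for a lazy per-instrument candle generator (already sorted).
    for ts in timestamps:
        yield ts, symbol


counter = itertools.count()  # shared tiebreaker across all streams


def decorated(gen):
    # (timestamp, tiebreak, payload): ties on timestamp fall to the counter.
    return ((ts, next(counter), payload) for ts, payload in gen)


merged = heapq.merge(
    decorated(stream("AAA", [1, 3, 5])),
    decorated(stream("BBB", [2, 3, 4])),
)
order = [(ts, payload) for ts, _, payload in merged]
print(order)
```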
- - -## Role - -In this session you will take on a role described in https://github.com/radiusred/.github/doc/agent-roles/engineer.md (or the local equivalent repository clone) - -## Goal - -The goal of this session is to implement the architectural and design changes outlined in option **C** of `03-improve-startup-performance.md`. - -The goal is successfully achieved when long or multi-instrument backtests can be run with no appreciable increase in startup time or memory overhead compared with shorter or single-instrument runs. - - -## Approach - -- Replace upfront loading with per-instrument lazy generators that yield candles as they are read, and replace the flat sorted `stream` list with a k-way merge across those generators. -- The key observation is that **candles within each instrument file are already sorted chronologically**. A k-way merge across `k` instruments using `heapq.merge()` will produce a fully ordered interleaved stream with O(log k) overhead per event. -- The `BacktestStreamer.run()` loop body remains unchanged — it still dispatches one event at a time — but the `stream` list and `stream.sort()` are replaced with a generator pipeline. -- For each instrument, pre-load only the first `max(warmup_bars, aggregation_window)` candles into `_history` as today — enough to service `get_historical_candles()` calls during warmup -- Stream the remainder lazily via per-instrument generators -- The k-way merge begins after warmup completes (i.e. after `SessionReadyEvent`) so that warmup data and live-stream data are never interleaved incorrectly -- Base-period aggregation remains as a transform layer in the generator chain rather than a pre-pass - -## Output - -### Files changed - -**`tradedesk/execution/backtest/dukascopy.py`** -- Added `iter_dukascopy_candles()` — validates inputs eagerly then delegates to `_iter_candles()`, a private generator that yields `Candle` objects one daily file at a time.
Per-day resampling for non-1MIN periods (safe for all periods that align to day boundaries: 15MIN, 1H, 4H, 1D). No `ValueError` on empty ranges; invalid `price_side` / `period` raise immediately at call time. - -**`tradedesk/execution/backtest/streamer.py`** -- `CandleSeries.candles` widened from `list[Candle]` to `Iterable[Candle]` — accepts generators. -- `BacktestStreamer.run()` replaced the flat `stream` list + `stream.sort()` with `heapq.merge` over per-instrument generators. A shared `itertools.count()` acts as tiebreaker when two instruments share a timestamp, avoiding any comparison of non-orderable event objects. O(log k) per event vs O(N log N) upfront sort. -- Two module-level helpers extracted: `_candle_gen()` and `_market_gen()`. The redundant `_set_current_timestamp` call in the old stream-build loop is removed. - -**`tradedesk/execution/backtest/client.py`** -- Added `BacktestClient.from_lazy_sources()` classmethod. Accepts an explicit `history` dict (warmup slice, for `get_historical_candles()`) and `candle_series` list (lazy generators, for streaming), keeping them separate so only the warmup window needs to be resident at startup. - -**`tradedesk/execution/backtest/__init__.py`** -- Exported `iter_dukascopy_candles`. - -**`ig_trader/session_runners.py`** -- `build_portfolio_backtest_client()`: new `warmup_bars: int | None = None` parameter. `None` preserves existing eager behaviour. When set, pre-loads the first `warmup_bars` base-period candles per instrument into `_history` (via `itertools.islice`), then creates an independent full-range generator for streaming and returns a `from_lazy_sources` client. -- `run_portfolio_backtest()`: new `warmup_bars: int | None = None` and `run_name: str | None = None` parameters. 
`run_name` (or a UTC timestamp fallback) is used to compute the run output directory eagerly before any data loading, so logging is fully configured — including the file handler — before the expensive candle-loading and index-building steps. The deferred `_configure_logging_on_start` / `SessionStartedEvent` handler is removed. The subscriber is passed the pre-created `run_dir` directly via the new `run_dir` parameter (see below). `warmup_bars` is forwarded to the client builder. In lazy mode, the target-period warmup history is aggregated from the warmup slice and `ledger.candle_indices` is built from a fresh full-range generator (one extra disk pass, O(N_target) memory rather than O(N_base)). - -**`tradedesk/recording/subscriber.py`** -- `RecordingSubscriber.__init__` gains `run_dir: Path | None = None`. When provided it is stored directly as `_run_output_dir` (caller is responsible for creating the directory); `handle_session_started` skips directory creation and just logs the path. -- `register_recording_subscriber` gains the matching `run_dir` kwarg and passes it through. - -**`ig_trader/scripts/run_portfolio.py`** -- `--name` arg description updated to document the output-directory-naming behaviour (falls back to timestamp when omitted). -- `--warmup-bars N` arg added (`dest=warmup_bars`). Enables lazy loading when set. -- Both args forwarded to `run_portfolio_backtest`. - -### Tests added - -- `tests/execution/backtest/test_dukascopy_cache.py`: 9 new tests for `iter_dukascopy_candles` — single day, parity with eager `read_dukascopy_candles`, empty range, invalid `price_side` / `period` raise eagerly, ask side, 15MIN resampling, missing-day skip, laziness check. -- `tests/execution/backtest/test_streamer.py` (new file, 7 tests): `CandleSeries` with a generator, single-series order, two-instrument interleaving, 5-instrument k-way sort, lazy generator consumption, `from_lazy_sources` history and streamer behaviour. 
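The warmup-slice mechanics above rest on `itertools.islice`; a toy illustration (note the real implementation creates an independent full-range generator for streaming rather than resuming the sliced one):

```python
import itertools


def candle_stream():
    # Stand-in for one instrument's lazy candle generator.
    yield from range(10)


warmup_bars = 3
gen = candle_stream()

# Materialise only the warmup window for `_history`; the generator
# resumes lazily at candle 3 for the streaming phase.
history = list(itertools.islice(gen, warmup_bars))
remainder = list(gen)
print(history)    # [0, 1, 2]
print(remainder)  # [3, 4, 5, 6, 7, 8, 9]
```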
- -**514 tradedesk tests pass, 223 ig_trader tests pass.** - -### Usage - -```bash -# Named run with lazy loading (first log line appears immediately): -python -m scripts.run_portfolio \ - --target backtest \ - --from 2022-01-01 --to 2026-01-01 \ - --name my-run \ - --warmup-bars 300 -``` - -```python -# Programmatic equivalent: -run_portfolio_backtest( - cfg=cfg, - raw_cfg=raw_cfg, - specs=specs, - log_level="INFO", - run_name="my-run", # output written to back_tests/my-run/ - warmup_bars=300, # pre-load first 300 base-period candles per instrument -) -``` - -`warmup_bars=300` with `base_period="1MIN"` pre-loads ~300 1MIN candles per instrument into `_history` for strategy warmup. The remainder of the date range is streamed lazily. Memory stays O(k) where k is the number of instruments, rather than O(N × k). - -### Design notes - -- The approach is fully backward-compatible: passing `warmup_bars=None` (the default) preserves existing eager behaviour with zero code-path changes for existing callers. -- The k-way merge starts from the beginning of the date range (i.e. it does not gate on `SessionReadyEvent`). This is correct because `BacktestStreamer` runs after `SessionStartedEvent` / `SessionReadyEvent` have already fired in `BasePortfolio.run()`. Warmup (`get_historical_candles`) and streaming are already fully separate lifecycle phases. -- The warmup `_history` slice represents the start of the backtest period rather than the end (the previous eager approach seeded indicators from the final N bars of the full dataset — a subtle correctness issue for long backtests that this implementation resolves). -- Root cause of the silent startup delay: logging was deferred to a `SessionStartedEvent` handler so the subscriber could supply the timestamped directory to the file handler. This fired inside `run_portfolio()`, after all data loading and index building. 
Fix: compute the run directory eagerly from `run_name` (or timestamp), create it immediately, configure logging fully, then pass the directory to the subscriber via `run_dir` so it reuses rather than recreates it. - -# Cost -``` -Total cost: $6.56 -Total duration (API): 16m 44s -Total duration (wall): 6h 22m 51s -Total code changes: 687 lines added, 77 lines removed -Usage by model: -claude-sonnet-4-6: 68 input, 58.2k output, 5.2m cache read, 380.4k cache write ($6.44) -claude-haiku-4-5: 47 input, 5.4k output, 272.4k cache read, 49.5k cache write ($0.1160) -``` diff --git a/docs/briefs/04-add-control-equity-graph.md b/docs/briefs/04-add-control-equity-graph.md deleted file mode 100644 index 54845b4..0000000 --- a/docs/briefs/04-add-control-equity-graph.md +++ /dev/null @@ -1,43 +0,0 @@ -Always read and follow instructions in `CLAUDE.md` in the project root before processing the brief. - -# Add control line to equity graph on analysis report - -## Context - -`tradedesk/recording/report.py` generates a human readable report after each backtest and includes a number of generated charts. One of these charts shows the equity curve built up from the `equity_daily.csv` file (lines 715-791 of `report.py`). - -## Role - -In this session you will take on a role described in https://github.com/radiusred/.github/doc/agent-roles/engineer.md (or the local equivalent repository clone) - -## Goal - -The goal of this session is to enhance the equity curve chart by adding a second line showing a control instrument for comparison. This instrument will be the FTSE 100. - -The goal is successfully achieved when the chart shows the equity curve from the backtest in the current colour (or any strong colour) and on the same axis, with the same scale, a line showing the equivalent FTSE performance in a muted grey (hex colour #888). 
- - -## Approach - - - For the same period (`--from` and `--to`) as the backtest, calculate the daily increment/decrement of the FTSE - - this is a normalised line that always starts at 0 (zero) on the `--from` date of the backtest. - - each subsequent day is the difference between the current day's close price and the previous day's close. - - the net effect is an equity curve that models what would have happened if a LONG position of size 1.0 had been opened on day 1 of the test and closed on the final day of the test. - - plot this series on the same chart that is generated for the portfolio's equity_daily series - - use chart data available in the local Dukascopy cache (`--cache-dir`) with an instrument name of `GBRIDXGBP` - -## Output - -Four files modified across two projects: - -**`tradedesk`:** - -- `tradedesk/execution/backtest/__init__.py` — exported `read_dukascopy_candles` -- `tradedesk/recording/report.py` — added `cache_dir: Path | None = None` to `_prepare_graphs` and `generate_analysis_report`; equity curve section now loads GBRIDXGBP daily candles when `cache_dir` is provided, computes a normalised cumulative series starting at 0, and plots it as a second line in `#888888` -- `tradedesk/recording/subscriber.py` — added `cache_dir` parameter to `RecordingSubscriber.__init__` and `register_recording_subscriber`; passes it through to `generate_analysis_report` - -**`ig_trader`:** - -- `ig_trader/session_runners.py` — passes `cache_dir=cfg.cache_dir` to `register_recording_subscriber` in the backtest runner - -**Logic:** daily close prices are resampled from GBRIDXGBP 1-minute Dukascopy data; the control series is `[0, Δclose₁, Δclose₁+Δclose₂, ...]`, equivalent to holding a size-1 long position from the first day of the backtest. The date range is derived from the equity curve's own timestamps so no additional parameters are needed at the call site. Failures log a warning and omit the control line rather than crashing the report.
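The control-line arithmetic above reduces to a diff/cumsum over daily closes; a toy pandas sketch with synthetic prices (illustrative, not the report code):

```python
import pandas as pd

# Synthetic daily closes for the control instrument across the backtest range.
closes = pd.Series(
    [7500.0, 7510.0, 7495.0, 7520.0],
    index=pd.date_range("2025-01-01", periods=4, freq="D"),
)

# Normalised control line: 0 on day 1, then cumulative day-on-day close
# deltas — the P&L of a size-1.0 long held from the first day.
control = closes.diff().fillna(0.0).cumsum()
print(control.tolist())  # [0.0, 10.0, -5.0, 20.0]
```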