From 78a01d9fc54adc7a3be4802004db7a7ce9eaeda6 Mon Sep 17 00:00:00 2001
From: Bu5hm4nn
Date: Tue, 24 Mar 2026 11:23:12 +0100
Subject: [PATCH] docs: define strategy template and backtesting MVP

---
 docs/EXEC-001A_BT-001_MVP_ARCHITECTURE.md | 968 ++++++++++++++++++++++
 docs/ROADMAP.md                           |  73 +-
 2 files changed, 1036 insertions(+), 5 deletions(-)
 create mode 100644 docs/EXEC-001A_BT-001_MVP_ARCHITECTURE.md

diff --git a/docs/EXEC-001A_BT-001_MVP_ARCHITECTURE.md b/docs/EXEC-001A_BT-001_MVP_ARCHITECTURE.md
new file mode 100644
index 0000000..ddebe2f
--- /dev/null
+++ b/docs/EXEC-001A_BT-001_MVP_ARCHITECTURE.md
@@ -0,0 +1,968 @@
+# EXEC-001A / BT-001 MVP Architecture
+
+## Scope
+
+This document defines the MVP design for four related roadmap items:
+
+- **EXEC-001A** — Named Strategy Templates
+- **BT-001** — Synthetic Historical Backtesting
+- **BT-002** — Historical Daily Options Snapshot Provider
+- **BT-003** — Selloff Event Comparison Report
+
+The goal is to give implementation agents a concrete architecture without requiring a database or a full UI rewrite. The MVP should fit the current codebase shape:
+
+- domain models in `app/models/`
+- IO and orchestration in `app/services/`
+- strategy math in `app/strategies/` or a new `app/backtesting/` package
+- lightweight docs under `docs/`
+
+## Design goals
+
+1. **Keep current live quote/options flows working.** Do not overload `app/services/data_service.py` with historical backtest state.
+2. **Make templates reusable and named.** A strategy definition should be saved once and referenced by many backtests.
+3. **Support synthetic-first backtests.** BT-001 must work before BT-002 exists.
+4. **Prevent lookahead bias by design.** Providers and the run engine must expose only data available at each `as_of_date`.
+5. **Preserve a migration path to real daily options snapshots.** Synthetic pricing and snapshot-based pricing must share the same provider contract.
+6. **Stay file-backed for MVP persistence.** Repositories may use JSON files under `data/` first, behind interfaces.
+
+## Terminology decision
+
+The current code uses `LombardPortfolio.gold_ounces`, but the strategy engine effectively treats that field as generic underlying units. For historical backtesting, implementation agents should **not** extend that ambiguity.
+
+### Recommendation
+
+- Keep `LombardPortfolio` unchanged for existing live pages.
+- Introduce backtesting-specific portfolio state using the neutral term **`underlying_units`**.
+- Treat `symbol` + `underlying_units` as the canonical tradable exposure.
+
+This avoids mixing physical ounces, GLD shares, and synthetic units in the backtest engine.
+
+---
+
+## MVP architecture summary
+
+### Main decision
+
+Create a new isolated subsystem:
+
+- `app/models/strategy_template.py`
+- `app/models/backtest.py`
+- `app/models/event_preset.py`
+- `app/services/historical/`
+- `app/services/backtesting/`
+- optional thin adapters in `app/strategies/` for reusing existing payoff logic
+
+### Why isolate it
+
+The current `DataService` is a live/synthetic read service with cache-oriented payload shaping. Historical backtesting needs:
+
+- versioned saved definitions
+- run lifecycle state
+- daily path simulation
+- historical provider abstraction
+- reproducible result storage
+
+Those concerns should not be mixed into the current request-time quote service.
+
+---
+
+## Domain model proposals
+
+## 1. Strategy templates (EXEC-001A)
+
+A strategy template is a **named, versioned, reusable hedge definition**. It is not a run result and it is not a specific dated option contract.
+
+### `StrategyTemplate`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `template_id` | `str` | Stable UUID/string key |
+| `slug` | `str` | Human-readable unique name, e.g. `protective-put-atm-12m` |
+| `display_name` | `str` | UI/report label |
+| `description` | `str` | Short rationale |
+| `template_kind` | enum | `protective_put`, `laddered_put`, `collar` (future-safe) |
+| `status` | enum | `draft`, `active`, `archived` |
+| `version` | `int` | Increment on material rule changes |
+| `underlying_symbol` | `str` | MVP may allow one symbol per template |
+| `contract_mode` | enum | `continuous_units` for synthetic MVP, `listed_contracts` for BT-002+ |
+| `legs` | `list[TemplateLeg]` | One or more parametric legs |
+| `roll_policy` | `RollPolicy` | How/when to replace expiring hedges |
+| `entry_policy` | `EntryPolicy` | When the initial hedge is entered |
+| `tags` | `list[str]` | e.g. `conservative`, `income-safe` |
+| `created_at` | `datetime` | Audit |
+| `updated_at` | `datetime` | Audit |
+
+### `TemplateLeg`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `leg_id` | `str` | Stable within template version |
+| `side` | enum | `long` or `short`; MVP uses `long` only for puts |
+| `option_type` | enum | `put` or `call` |
+| `allocation_weight` | `float` | Must sum to `1.0` across active hedge legs in MVP |
+| `strike_rule` | `StrikeRule` | MVP: `spot_pct` only |
+| `target_expiry_days` | `int` | e.g. `365`, `180`, `90` |
+| `quantity_rule` | enum | MVP: `target_coverage_pct` |
+| `target_coverage_pct` | `float` | Usually `1.0` for full hedge, but supports partial hedges later |
+
+### `StrikeRule`
+
+MVP shape:
+
+| Field | Type | Notes |
+|---|---|---|
+| `rule_type` | enum | `spot_pct` |
+| `value` | `float` | e.g. `1.00`, `0.95`, `0.90` |
+
+Future-safe, but not in MVP:
+
+- `delta_target`
+- `fixed_strike`
+- `moneyness_bucket`
+
+### `RollPolicy`
+
+Recommended MVP fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `policy_type` | enum | `hold_to_expiry`, `roll_n_days_before_expiry` |
+| `days_before_expiry` | `int` | Required for rolling mode |
+| `rebalance_on_new_deposit` | `bool` | Default `false` in MVP |
+
+### `EntryPolicy`
+
+Recommended MVP fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `entry_timing` | enum | `scenario_start_close` |
+| `stagger_days` | `int \| None` | Not used in MVP, keep nullable |
+
+### MVP template invariants
+
+Implementation agents should enforce:
+
+- `slug` unique among active templates
+- template versions immutable once referenced by a completed run
+- weights sum to `1.0` for `protective_put`/`laddered_put` templates
+- all legs use the same `target_expiry_days` in MVP unless explicitly marked as a ladder with shared roll policy
+- `underlying_symbol` on the template must either match the scenario symbol or be `*`/generic if generic templates are later supported
+
+### Template examples
+
+- `protective-put-atm-12m`
+- `protective-put-95pct-12m`
+- `ladder-50-50-atm-95pct-12m`
+- `ladder-33-33-33-atm-95pct-90pct-12m`
+
+These map cleanly onto the existing strategy set in `app/strategies/engine.py`.
+
+---
+
+## 2. Backtest scenarios
+
+A backtest scenario is the **saved experiment definition**. It says what portfolio, time window, templates, provider, and execution rules are used.
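
Editorial sketch (not part of the patch): one way the "saved experiment definition" idea might look in code, as a frozen dataclass that pins template versions and rejects invalid inputs at construction time. Field names follow the tables in this document and the checks mirror the scenario invariants it lists. The trimmed field set, the flat `provider_id` string standing in for `ProviderRef`, and validation via `__post_init__` are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemplateRef:
    template_id: str
    version: int  # explicit version keeps runs reproducible


@dataclass(frozen=True)
class BacktestPortfolioState:
    underlying_units: float
    entry_spot: float
    loan_amount: float
    margin_call_ltv: float


@dataclass(frozen=True)
class BacktestScenario:
    scenario_id: str
    symbol: str
    start_date: date
    end_date: date
    initial_portfolio: BacktestPortfolioState
    template_refs: tuple[TemplateRef, ...]
    provider_id: str  # assumption: simplified stand-in for ProviderRef

    def __post_init__(self) -> None:
        # Scenario invariants as listed in this document.
        if self.start_date > self.end_date:
            raise ValueError("start_date must be <= end_date")
        if not self.template_refs:
            raise ValueError("at least one template_ref is required")
        p = self.initial_portfolio
        if p.loan_amount >= p.underlying_units * p.entry_spot:
            raise ValueError("loan_amount must be below initial collateral value")
        if not self.provider_id:
            raise ValueError("provider must be declared explicitly")
```

Freezing the dataclass means a scenario cannot be mutated after a run has referenced it, which is in the same spirit as the immutability the document asks for on template versions.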
+
+### `BacktestScenario`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `scenario_id` | `str` | Stable UUID/string key |
+| `slug` | `str` | Human-readable name |
+| `display_name` | `str` | Report label |
+| `description` | `str` | Optional scenario intent |
+| `symbol` | `str` | Underlying being hedged |
+| `start_date` | `date` | Inclusive |
+| `end_date` | `date` | Inclusive |
+| `initial_portfolio` | `BacktestPortfolioState` | Portfolio at day 0 |
+| `template_refs` | `list[TemplateRef]` | One or more template versions to compare |
+| `provider_ref` | `ProviderRef` | Which historical provider to use |
+| `execution_model` | `ExecutionModel` | Daily close-to-close for MVP |
+| `valuation_frequency` | enum | `daily` in MVP |
+| `benchmark_mode` | enum | `unhedged_only` in MVP |
+| `event_preset_id` | `str \| None` | Optional link for BT-003 |
+| `notes` | `list[str]` | Optional warnings/assumptions |
+| `created_at` | `datetime` | Audit |
+
+### `BacktestPortfolioState`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `currency` | `str` | `USD` in MVP |
+| `underlying_units` | `float` | Canonical exposure size |
+| `entry_spot` | `float` | Starting spot reference |
+| `loan_amount` | `float` | Outstanding loan |
+| `margin_call_ltv` | `float` | Stress threshold |
+| `cash_balance` | `float` | Usually `0.0` in MVP |
+| `financing_rate` | `float` | Optional, default `0.0` in MVP |
+
+### `TemplateRef`
+
+Use a small immutable reference object:
+
+| Field | Type | Notes |
+|---|---|---|
+| `template_id` | `str` | Stable template key |
+| `version` | `int` | Required for reproducibility |
+| `display_name_override` | `str \| None` | Optional report label |
+
+### `ProviderRef`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `provider_id` | `str` | e.g. `synthetic_v1`, `daily_snapshots_v1` |
+| `config_key` | `str` | Named config/profile used by the run |
+| `pricing_mode` | enum | `synthetic_bs_mid` or `snapshot_mid` |
+
+### `ExecutionModel`
+
+MVP decision:
+
+- **Daily close-to-close engine**
+- Positions are evaluated once per trading day
+- If a template rule triggers on date `T`, entry/roll is executed using provider data **as of date `T` close**
+- Mark-to-market for date `T` uses the same `T` snapshot
+
+This is a simplification, but it is deterministic and compatible with BT-002 daily snapshots.
+
+### Scenario invariants
+
+- `start_date <= end_date`
+- at least one `template_ref`
+- all referenced template versions must exist before run submission
+- `initial_portfolio.loan_amount < initial_portfolio.underlying_units * entry_spot`
+- scenario must declare the provider explicitly; no hidden global default inside the engine
+
+---
+
+## 3. Backtest runs and results
+
+A run is the **execution record** of one scenario against one or more templates under one provider.
+
+### `BacktestRun`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `run_id` | `str` | Stable UUID |
+| `scenario_id` | `str` | Source scenario |
+| `status` | enum | `queued`, `running`, `completed`, `failed`, `cancelled` |
+| `provider_snapshot` | `ProviderSnapshot` | Frozen provider config used at run time |
+| `submitted_at` | `datetime` | Audit |
+| `started_at` | `datetime \| None` | Audit |
+| `completed_at` | `datetime \| None` | Audit |
+| `engine_version` | `str` | Git SHA or app version |
+| `rules_version` | `str` | Semantic rules hash for reproducibility |
+| `warnings` | `list[str]` | Missing data fallback, skipped dates, etc. |
+| `error` | `str \| None` | Failure detail |
+
+### `ProviderSnapshot`
+
+Freeze the provider state used by a run:
+
+| Field | Type | Notes |
+|---|---|---|
+| `provider_id` | `str` | Resolved provider implementation |
+| `config` | `dict[str, Any]` | Frozen provider config used for the run |
+| `source_version` | `str \| None` | Optional data snapshot/build hash |
+
+### `BacktestRunResult`
+
+Top-level recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `run_id` | `str` | Foreign key |
+| `scenario_snapshot` | `BacktestScenario` or frozen subset | Freeze used inputs |
+| `template_results` | `list[TemplateBacktestResult]` | One per template |
+| `comparison_summary` | `RunComparisonSummary` | Ranked table |
+| `generated_at` | `datetime` | Audit |
+
+### `TemplateBacktestResult`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `template_id` | `str` | Identity |
+| `template_version` | `int` | Reproducibility |
+| `template_name` | `str` | Display |
+| `summary_metrics` | `BacktestSummaryMetrics` | Compact ranking metrics |
+| `daily_path` | `list[BacktestDailyPoint]` | Daily timeseries |
+| `position_log` | `list[BacktestPositionRecord]` | Open/roll/expire events |
+| `trade_log` | `list[BacktestTradeRecord]` | Cashflow events |
+| `validation_notes` | `list[str]` | e.g. synthetic IV fallback used |
+
+### `BacktestSummaryMetrics`
+
+Recommended MVP metrics:
+
+| Field | Type | Notes |
+|---|---|---|
+| `start_value` | `float` | Initial collateral value |
+| `end_value_unhedged` | `float` | Baseline terminal collateral |
+| `end_value_hedged_net` | `float` | After hedge P&L and premiums |
+| `total_hedge_cost` | `float` | Sum of paid premiums |
+| `total_option_payoff_realized` | `float` | Expiry/close realized payoff |
+| `max_ltv_unhedged` | `float` | Path max |
+| `max_ltv_hedged` | `float` | Path max |
+| `margin_call_days_unhedged` | `int` | Count |
+| `margin_call_days_hedged` | `int` | Count |
+| `worst_drawdown_unhedged` | `float` | Optional but useful |
+| `worst_drawdown_hedged` | `float` | Optional but useful |
+| `days_protected_below_threshold` | `int` | Optional convenience metric |
+| `roll_count` | `int` | Operational complexity |
+
+### `BacktestDailyPoint`
+
+Recommended daily path fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `date` | `date` | Trading date |
+| `spot_close` | `float` | Underlying close |
+| `underlying_value` | `float` | `underlying_units * spot_close` |
+| `option_market_value` | `float` | Mark-to-market of open hedge |
+| `premium_cashflow` | `float` | Negative on entry/roll |
+| `realized_option_cashflow` | `float` | Expiry/sale value |
+| `net_portfolio_value` | `float` | Underlying + option MTM + cash |
+| `loan_amount` | `float` | Constant in MVP |
+| `ltv_unhedged` | `float` | Baseline |
+| `ltv_hedged` | `float` | Hedge-aware |
+| `margin_call_unhedged` | `bool` | Baseline |
+| `margin_call_hedged` | `bool` | Hedge-aware |
+| `active_position_ids` | `list[str]` | Traceability |
+
+### `BacktestTradeRecord`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `trade_id` | `str` | Stable key |
+| `date` | `date` | Execution date |
+| `action` | enum | `buy_open`, `sell_close`, `expire`, `roll` |
+| `leg_id` | `str` | Template leg link |
+| `instrument_key` | `HistoricalInstrumentKey` | Strike/expiry/type |
+| `quantity` | `float` | Continuous or discrete |
+| `price` | `float` | Fill price |
+| `cashflow` | `float` | Signed |
+| `reason` | enum | `initial_entry`, `scheduled_roll`, `expiry`, `scenario_end` |
+
+### Run/result invariants
+
+- runs are append-only after completion
+- results must freeze template versions and scenario inputs used at execution time
+- failed runs may omit `template_results` but must preserve `warnings`/`error`
+- ranking should never rely on a metric that can be absent without a fallback rule
+
+---
+
+## 4. Event presets (BT-003)
+
+An event preset is a **named reusable market window** used to compare strategy behavior across selloffs.
+
+### `EventPreset`
+
+Recommended fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `event_preset_id` | `str` | Stable key |
+| `slug` | `str` | e.g. `covid-crash-2020` |
+| `display_name` | `str` | Report label |
+| `symbol` | `str` | Underlying symbol |
+| `window_start` | `date` | Inclusive |
+| `window_end` | `date` | Inclusive |
+| `anchor_date` | `date \| None` | Optional focal date |
+| `event_type` | enum | `selloff`, `recovery`, `stress_test` |
+| `tags` | `list[str]` | e.g. `macro`, `liquidity`, `vol-spike` |
+| `description` | `str` | Why this event exists |
+| `scenario_overrides` | `EventScenarioOverrides` | Optional defaults |
+| `created_at` | `datetime` | Audit |
+
+### `EventScenarioOverrides`
+
+MVP fields:
+
+| Field | Type | Notes |
+|---|---|---|
+| `lookback_days` | `int \| None` | Optional pre-window warmup |
+| `recovery_days` | `int \| None` | Optional post-event tail |
+| `default_template_slugs` | `list[str]` | Suggested comparison set |
+| `normalize_start_value` | `bool` | Default `true` for event comparison charts |
+
+### BT-003 usage pattern
+
+- a report selects one or more `EventPreset`s
+- each preset materializes a `BacktestScenario`
+- the same template set is run across all events
+- report compares normalized daily paths and summary metrics
+
+### MVP event decision
+
+Use **manual date windows only**. Do not attempt automatic peak/trough detection in the first slice.
+
+---
+
+## Historical provider abstraction
+
+## Core interface
+
+Create a provider contract that exposes only **point-in-time historical data**.
+
+### `HistoricalMarketDataProvider`
+
+Recommended methods:
+
+```python
+class HistoricalMarketDataProvider(Protocol):
+    provider_id: str
+
+    def get_trading_days(self, symbol: str, start_date: date, end_date: date) -> list[date]: ...
+
+    def get_underlying_bars(
+        self, symbol: str, start_date: date, end_date: date
+    ) -> list[UnderlyingBar]: ...
+
+    def get_option_snapshot(
+        self, query: OptionSnapshotQuery
+    ) -> OptionSnapshot: ...
+
+    def price_open_position(
+        self, position: HistoricalOptionPosition, as_of_date: date
+    ) -> HistoricalOptionMark: ...
+```
+
+### Why this interface
+
+It cleanly supports both provider types:
+
+- **BT-001 synthetic provider** — generate option values from deterministic assumptions
+- **BT-002 snapshot provider** — read real daily option quotes/surfaces from stored snapshots
+
+It also makes lookahead control explicit: every method is asked for data **as of** a specific date.
+
+## Supporting provider models
+
+### `UnderlyingBar`
+
+| Field | Type | Notes |
+|---|---|---|
+| `date` | `date` | Trading day |
+| `open` | `float` | Optional for future use |
+| `high` | `float` | Optional |
+| `low` | `float` | Optional |
+| `close` | `float` | Required |
+| `volume` | `float \| None` | Optional |
+| `source` | `str` | Provider/source tag |
+
+### `OptionSnapshotQuery`
+
+| Field | Type | Notes |
+|---|---|---|
+| `symbol` | `str` | Underlying |
+| `as_of_date` | `date` | Point-in-time date |
+| `option_type` | enum | `put`/`call` |
+| `target_expiry_days` | `int` | Desired tenor |
+| `strike_rule` | `StrikeRule` | Resolved against current spot |
+| `pricing_side` | enum | `mid` in MVP |
+
+### `OptionSnapshot`
+
+| Field | Type | Notes |
+|---|---|---|
+| `as_of_date` | `date` | Snapshot date |
+| `symbol` | `str` | Underlying |
+| `underlying_close` | `float` | Spot used for selection/pricing |
+| `selected_contract` | `HistoricalOptionQuote` | Resolved contract |
+| `selection_notes` | `list[str]` | e.g. nearest expiry/nearest strike |
+| `source` | `str` | Provider ID |
+
+### `HistoricalOptionQuote`
+
+| Field | Type | Notes |
+|---|---|---|
+| `instrument_key` | `HistoricalInstrumentKey` | Canonical contract identity |
+| `bid` | `float` | Optional for snapshot provider |
+| `ask` | `float` | Optional |
+| `mid` | `float` | Required for MVP valuation |
+| `implied_volatility` | `float \| None` | Required for BT-002, synthetic-derived for BT-001 |
+| `delta` | `float \| None` | Optional now, useful later |
+| `open_interest` | `int \| None` | Optional now |
+| `volume` | `int \| None` | Optional now |
+| `source` | `str` | Provider/source tag |
+
+### `HistoricalInstrumentKey`
+
+| Field | Type | Notes |
+|---|---|---|
+| `symbol` | `str` | Underlying |
+| `option_type` | enum | `put`/`call` |
+| `expiry` | `date` | Contract expiry |
+| `strike` | `float` | Contract strike |
+
+---
+
+## Provider implementations
+
+## A. `SyntheticHistoricalProvider` (BT-001 first)
+
+Purpose:
+
+- generate deterministic historical backtests without requiring stored historical options chains
+- use historical underlying closes plus a synthetic volatility/rates regime
+- resolve template legs into synthetic option quotes on each rebalance date
+- reprice open positions daily using the same model family
+
+### Recommended behavior
+
+Inputs:
+
+- underlying close series (from yfinance file cache, CSV fixture, or another deterministic source)
+- configured implied volatility regime, e.g. fixed `0.16` or dated step regime
+- configured risk-free rate regime
+- optional stress spread for transaction cost realism
+
+Entry and valuation:
+
+- on a rebalance date, compute strike from `spot_pct * spot_close`
+- set expiry by nearest trading day to `as_of_date + target_expiry_days`
+- price using Black-Scholes with the current day's spot, configured IV, remaining time, and option type
+- on later dates, reprice the same contract using current spot and remaining time only
+
+### MVP synthetic assumptions
+
+- constant or schedule-based implied volatility; no future realized volatility leakage
+- no stochastic volatility process in first slice
+- no early exercise modeling
+- no assignment modeling
+- `mid` price only
+- deterministic rounding/selection rules
+
+### Why synthetic-first is acceptable
+
+It validates:
+
+- template persistence
+- run lifecycle
+- path valuation
+- daily result rendering
+- anti-lookahead contract boundaries
+
+before adding BT-002 data ingestion complexity.
+
+## B. `DailyOptionsSnapshotProvider` (BT-002)
+
+Purpose:
+
+- load historical option quotes for each trading day
+- resolve actual listed contracts closest to template rules
+- mark open positions to historical daily mids thereafter
+
+### Recommended behavior
+
+- selection on entry day uses nearest eligible expiry and nearest eligible strike from that day's chain only
+- mark-to-market later uses the exact same contract key if a quote exists on later dates
+- if the contract is missing on a later date, provider returns a missing-data result and the engine applies a documented fallback policy
+
+### MVP fallback policy for missing marks
+
+Implementation agents should choose one explicit fallback and test it. Recommended order:
+
+1. exact contract from same-day snapshot
+2. if unavailable, previous available mark from same contract with warning
+3. if unavailable and contract is expired, intrinsic value at expiry or zero afterward
+4. otherwise fail the run or mark the template result incomplete
+
+Do **not** silently substitute a different strike/expiry for an already-open position.
+
+---
+
+## Backtest engine flow
+
+Create a dedicated engine under `app/backtesting/engine.py`. Keep orchestration and repository wiring in `app/services/backtesting/`.
+
+### High-level loop
+
+For each template in the scenario:
+
+1. load trading days from provider
+2. create baseline unhedged path
+3. resolve initial hedge on `start_date`
+4. for each trading day:
+   - read underlying close for day `T`
+   - mark open option positions as of `T`
+   - compute unhedged and hedged portfolio value
+   - compute LTV and margin-call flags
+   - check roll/expiry rules using only `T` data
+   - if a roll is due, close/expire old position and open replacement using `T` snapshot
+5. liquidate remaining position at scenario end if still open
+6. calculate summary metrics
+7. rank templates inside `comparison_summary`
+
+### Position model recommendation
+
+Use a separate open-position model rather than reusing `OptionContract` directly.
+
+Recommended `HistoricalOptionPosition` fields:
+
+- `position_id`
+- `instrument_key`
+- `opened_at`
+- `expiry`
+- `quantity`
+- `entry_price`
+- `current_mark`
+- `template_leg_id`
+- `source_snapshot_date`
+
+Reason: backtests need lifecycle state and audit fields that the current `OptionContract` model does not carry.
+
+### Ranking recommendation
+
+For MVP comparison views, rank templates by:
+
+1. fewer `margin_call_days_hedged`
+2. lower `max_ltv_hedged`
+3. lower `total_hedge_cost`
+4. higher `end_value_hedged_net`
+
+This is easier to explain than a single opaque score.
+
+---
+
+## Data realism constraints
+
+Implementation agents should treat the following as mandatory MVP rules.
+
+## 1. Point-in-time only
+
+On day `T`, the engine may use only:
+
+- underlying bar for `T`
+- option snapshot for `T`
+- provider configuration known before the run starts
+- open positions created earlier or on `T`
+
+It may **not** use:
+
+- future closes
+- future implied vols
+- terminal event windows beyond `T` for trading decisions
+- any provider helper that precomputes the whole path and leaks future state into contract selection
+
+## 2. Stable contract identity after entry
+
+Once a contract is opened, daily valuation must use that exact contract identity:
+
+- same symbol
+- same expiry
+- same strike
+- same option type
+
+No rolling relabeling of a live position to a “nearest” contract.
+
+## 3. Explicit selection rules
+
+Template rules must resolve to contracts with deterministic tiebreakers:
+
+- nearest expiry at or beyond target DTE
+- nearest strike to rule target
+- if tied, prefer more conservative strike for puts (higher strike) and earliest expiry
+
+Tiebreakers must be documented and unit-tested.
+
+## 4. Execution timing must be fixed
+
+MVP should use **same-day close execution** consistently.
+
+Do not mix:
+
+- signal at close / fill next open
+- signal at close / fill same close
+- signal intraday / mark at close
+
+If this changes later, it must be a scenario-level parameter.
+
+## 5. Continuous-vs-listed quantity must be explicit
+
+MVP synthetic runs may use `continuous_units`.
+BT-002 listed snapshot runs should support `listed_contracts` with contract-size rounding.
+
+Do not hide rounding rules inside providers.
+They belong in the position sizing logic and must be recorded in the result.
+
+## 6. Costs must be recorded as cashflows
+
+Premiums and close/expiry proceeds must be stored as dated cashflows.
+Do not collapse the entire hedge economics into end-of-period payoff only.
+
+## 7. Missing data cannot be silent
+
+Any missing snapshot/mark fallback must add:
+
+- a run warning
+- a template validation note
+- a deterministic result status if the template becomes incomplete
+
+---
+
+## Anti-lookahead rules
+
+These should be copied into tests and implementation notes verbatim.
+
+1. **Contract selection rule**: select options using only the entry-day snapshot.
+2. **Daily MTM rule**: mark open positions using only same-day data for the same contract.
+3. **Expiry rule**: once `as_of_date >= expiry`, option value becomes intrinsic-at-expiry or zero after expiry according to the provider contract; it is not repriced with negative time-to-expiry.
+4. **Event preset rule**: event presets may define scenario dates, but the strategy engine may not inspect future event endpoints when deciding to roll or exit.
+5. **Synthetic vol rule**: synthetic providers may use fixed or date-indexed IV schedules, but never realized future path statistics from dates after `as_of_date`.
+6. **Metric rule**: comparison metrics may summarize the whole run only after the run completes; they may not feed back into trading decisions during the run.
+
+---
+
+## Phased implementation plan with TDD slices
+
+Each slice should leave behind tests and a minimal implementation path.
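
Editorial sketch (not part of the patch): as one example of a small, testable unit, the expiry rule and the synthetic repricing described above could be exercised with a function like the one below. It is a standalone illustration that assumes a European put priced with Black-Scholes, an ACT/365 day count, and intrinsic value on the expiry date; the name `synthetic_put_mark` is hypothetical and is not the helper in `app/core/pricing/`.

```python
import math
from datetime import date


def _norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def synthetic_put_mark(
    spot: float,
    strike: float,
    expiry: date,
    as_of_date: date,
    iv: float,
    rate: float = 0.0,
) -> float:
    """Mark a put using only same-day spot, configured IV, and remaining time."""
    days_left = (expiry - as_of_date).days
    if days_left < 0:
        return 0.0  # after expiry the contract carries no value
    if days_left == 0:
        return max(strike - spot, 0.0)  # intrinsic at expiry, never negative time
    t = days_left / 365.0  # assumption: ACT/365 day count
    vol_sqrt_t = iv * math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * iv * iv) * t) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return strike * math.exp(-rate * t) * _norm_cdf(-d2) - spot * _norm_cdf(-d1)
```

Because the function takes `as_of_date` and the same-day spot as its only market inputs, a test can assert that no future bar is needed to price day `T`, which is the property the Slice 2 tests below ask for.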
+
+## Slice 0 — Red tests for model invariants
+
+Target:
+
+- create tests for `StrategyTemplate`, `BacktestScenario`, `BacktestRun`, `EventPreset`
+- validate weights, dates, versioned references, and uniqueness assumptions
+
+Suggested tests:
+
+- invalid ladder weights rejected
+- scenario with end before start rejected
+- template ref requires explicit version
+- loan amount cannot exceed initial collateral value
+
+## Slice 1 — Named template repository (EXEC-001A core)
+
+Target:
+
+- file-backed `StrategyTemplateRepository`
+- save/load/list active templates
+- version bump on immutable update
+
+Suggested tests:
+
+- saving template round-trips cleanly
+- updating active template creates version 2, not in-place mutation
+- archived template stays loadable for historical runs
+
+## Slice 2 — Synthetic provider contract (BT-001 foundation)
+
+Target:
+
+- `HistoricalMarketDataProvider` protocol
+- `SyntheticHistoricalProvider`
+- deterministic underlying fixture input + synthetic option pricing
+
+Suggested tests:
+
+- provider returns stable trading day list
+- spot-pct strike resolution uses same-day spot only
+- repricing uses decreasing time to expiry
+- no future bar access required for day-`T` pricing
+
+## Slice 3 — Single-template backtest engine
+
+Target:
+
+- run one protective-put template across a short scenario
+- output daily path + summary metrics
+
+Suggested tests:
+
+- hedge premium paid on entry day
+- option MTM increases when spot falls materially below strike
+- hedged max LTV is <= unhedged max LTV in a monotonic selloff fixture
+- completed run freezes scenario and template version snapshots
+
+## Slice 4 — Multi-template comparison runs
+
+Target:
+
+- compare ATM, 95% put, 50/50 ladder on same scenario
+- produce ranked `comparison_summary`
+
+Suggested tests:
+
+- all template results share same scenario snapshot
+- ranking uses documented metric order
+- equal primary metric falls back to next metric deterministically
+
+## Slice 5 — Roll logic and expiry behavior
+
+Target:
+
+- support `roll_n_days_before_expiry`
+- support expiry settlement and position replacement
+
+Suggested tests:
+
+- roll occurs exactly on configured trading-day offset
+- expired contracts stop carrying time value
+- no contract identity mutation between entry and close
+
+## Slice 6 — Event presets and BT-003 scenario materialization
+
+Target:
+
+- repository for `EventPreset`
+- materialize preset -> scenario
+- run comparison over multiple named events
+
+Suggested tests:
+
+- preset dates map cleanly into scenario dates
+- scenario overrides are applied explicitly
+- normalized event series start from common baseline
+
+## Slice 7 — Daily snapshot provider (BT-002)
+
+Target:
+
+- add `DailyOptionsSnapshotProvider` behind same contract
+- reuse existing engine with provider swap only
+
+Suggested tests:
+
+- entry picks nearest valid listed contract from snapshot
+- later MTM uses same contract key
+- missing mark generates warning and applies documented fallback
+- synthetic and snapshot providers both satisfy same provider test suite
+
+## Slice 8 — Thin API/UI integration after engine is proven
+
+Not part of this doc’s implementation scope, but the natural next step is:
+
+- `/api/templates`
+- `/api/backtests`
+- `/api/backtests/{run_id}`
+- later a NiceGUI page for listing templates and runs
+
+Per project rules, do not claim this feature is live until the UI consumes real run data.
+
+---
+
+## Recommended file/module layout
+
+Recommended minimal layout for this codebase:
+
+```text
+app/
+  backtesting/
+    __init__.py
+    engine.py                 # run loop, ranking, metric aggregation
+    position_sizer.py         # continuous vs listed quantity rules
+    result_metrics.py         # path -> summary metrics
+    scenario_materializer.py  # event preset -> scenario
+    selection.py              # strike/expiry resolution helpers
+  models/
+    strategy_template.py      # StrategyTemplate, TemplateLeg, RollPolicy, EntryPolicy
+    backtest.py               # BacktestScenario, BacktestRun, results, daily points
+    event_preset.py           # EventPreset, overrides
+    historical_data.py        # UnderlyingBar, OptionSnapshot, InstrumentKey, marks
+  services/
+    backtesting/
+      __init__.py
+      orchestrator.py         # submit/load/list runs
+      repositories.py         # file-backed run repository helpers
+    historical/
+      __init__.py
+      base.py                 # HistoricalMarketDataProvider protocol
+      synthetic.py            # BT-001 provider
+      snapshots.py            # BT-002 provider
+    templates/
+      __init__.py
+      repository.py           # save/load/list/version templates
+    events/
+      __init__.py
+      repository.py           # save/load/list presets
+```
+
+### Persistence recommendation for MVP
+
+Use file-backed repositories first:
+
+```text
+data/
+  strategy_templates.json
+  event_presets.json
+  backtests/
+    .json
+```
+
+Reason:
+
+- aligns with current `PortfolioRepository` style
+- keeps the MVP small
+- allows deterministic fixtures in tests
+- can later move behind the same repository interfaces
+
+---
+
+## Code reuse guidance
+
+Implementation agents should reuse existing code selectively.
+
+### Safe to reuse
+
+- pricing helpers in `app/core/pricing/`
+- payoff logic concepts from `app/models/option.py`
+- existing strategy presets from `app/strategies/engine.py` as seed templates
+
+### Do not reuse directly without adaptation
+
+- `StrategySelectionEngine` as the backtest engine
+- `DataService` as a historical run orchestrator
+- `LombardPortfolio.gold_ounces` as the canonical backtest exposure field
+
+Reason: these current types are optimized for present-time research payloads, not dated position lifecycle state.
+
+---
+
+## Open implementation decisions to settle before coding
+
+1. **Underlying source for synthetic BT-001**: use yfinance historical closes directly, local fixture CSVs, or both?
+2. **Quantity mode in first runnable slice**: support only `continuous_units` first, or implement listed contract rounding immediately?
+3. **Scenario end behavior**: liquidate remaining option at final close, or leave terminal MTM only?
+4. **Missing snapshot policy**: hard-fail vs warn-and-carry-forward?
+5. **Provider metadata freezing**: store config only, or config + source data hash?
+
+Recommended answers for MVP:
+
+- yfinance historical closes with deterministic test fixtures for unit tests
+- `continuous_units` first
+- liquidate at final close for clearer realized P&L
+- warn-and-carry-forward only for same-contract marks, otherwise fail
+- freeze provider config plus app/git version
+
+---
+
+## Implementation-ready recommendations
+
+1. **Build BT-001 around a new provider interface, not around `DataService`.**
+2. **Treat templates as immutable versioned definitions.** Runs must reference template versions, not mutable slugs only.
+3. **Use a daily close-to-close engine for MVP and document it everywhere.**
+4. **Record every hedge premium and payoff as dated cashflows.**
+5. **Keep synthetic provider and daily snapshot provider behind the same contract.**
+6. **Introduce `underlying_units` in backtesting models to avoid `gold_ounces` ambiguity.**
+7. **Make missing data warnings explicit and persistent in run results.**
+
diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
index 58c6d42..e3d8ee9 100644
--- a/docs/ROADMAP.md
+++ b/docs/ROADMAP.md
@@ -292,10 +292,73 @@ DATA-001 (Price Feed)
 **Dependencies:** PORT-001
+### EXEC-001A: Named Strategy Templates [P1, M] **[depends: DATA-003]**
+**As a** risk manager, **I want** protection strategies to be first-class named templates **so that** I can compare, reuse, and later edit hedge definitions.
+
+**Acceptance Criteria:**
+- Persist strategy templates with a stable id and display name
+- Support at least initial system-defined templates (future user-editable names)
+- Store template parameters separately from backtest scenarios
+- Strategy templates are reusable by both live recommendation flows and backtests
+
+**Technical Notes:**
+- Add strategy template model/repository
+- Separate template definition from strategy execution state
+- Keep room for future user-editable naming and rule parameters
+
+**Dependencies:** DATA-003
+
+### BT-001: Synthetic Historical Backtesting [P1, L] **[depends: EXEC-001A, PORT-001]**
+**As a** portfolio manager, **I want** to backtest hedge strategies against historical selloffs **so that** I can see which approach would have survived without a margin call.
+ +**Acceptance Criteria:** +- Define a backtest scenario with start date, end date, collateral basis, and initial LTV +- Simulate at least one named hedge strategy over historical GLD prices +- Report max LTV, final equity, hedge cost, and whether a margin call would have occurred +- Compare protected vs unprotected outcomes for the same scenario +- Support known event replay such as the 2026 gold selloff window + +**Technical Notes:** +- Start with synthetic/model-priced historical options rather than requiring point-in-time full historical chains +- Use historical underlying prices plus Black-Scholes/volatility assumptions +- Output both time series and summary metrics + +**Dependencies:** EXEC-001A, PORT-001 + +### BT-002: Historical Daily Options Snapshot Provider [P2, L] **[depends: BT-001]** +**As a** quant user, **I want** real historical daily options snapshots **so that** backtests use observed premiums instead of only modeled prices. + +**Acceptance Criteria:** +- Historical data provider abstraction supports point-in-time daily option chain snapshots +- Backtest engine can swap synthetic pricing for provider-backed historical daily premiums +- Contract selection avoids lookahead bias +- Provider choice and data quality limits are documented clearly + +**Technical Notes:** +- Add provider interface for underlying history and option snapshot history +- Prefer daily snapshots first; intraday/tick fidelity is a later upgrade +- Candidate providers: Databento, Massive/Polygon, ThetaData, EODHD + +**Dependencies:** BT-001 + +### BT-003: Selloff Event Comparison Report [P2, M] **[depends: BT-001]** +**As a** portfolio manager, **I want** event-based backtest reports **so that** I can answer questions like “which strategy got me through the Jan 2026 selloff?” + +**Acceptance Criteria:** +- Event presets can define named historical windows +- Report ranks strategies by survival, max LTV, cost, and final equity +- Report highlights breach date if a strategy fails 
+- UI can show the unhedged path beside hedged paths + +**Dependencies:** BT-001 + ## Implementation Priority Queue -1. **PORT-001A** - Add collateral entry basis and derived weight/value handling in settings -2. **PORT-002** - Risk management safety -3. **EXEC-001** - Core user workflow -4. **EXEC-002** - Execution capability -5. Remaining features +1. **EXEC-001A** - Define named strategy templates as the foundation for backtesting +2. **BT-001** - Ship synthetic historical backtesting over GLD history +3. **PORT-003** - Historical LTV visibility and export groundwork +4. **BT-002** - Upgrade backtests to real daily options snapshots +5. **BT-003** - Event comparison reporting +6. **EXEC-001** - Core user workflow +7. **EXEC-002** - Execution capability +8. Remaining features
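
To make the BT-001 acceptance criteria and the BT-003 ranking concrete, here is a minimal sketch of a daily close-to-close run that prices one put synthetically with Black-Scholes and reports max LTV, final equity, hedge cost, and margin-call status. It is an illustration under stated assumptions, not the planned engine. The names `bs_put`, `run_path`, `rank`, and `PathSummary` are hypothetical. It also assumes flat volatility, a zero interest rate, one calendar day per close, a premium paid in cash at entry, and an option mark that counts toward collateral value.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(spot: float, strike: float, t_years: float, vol: float, rate: float = 0.0) -> float:
    """Black-Scholes European put; intrinsic value once expiry is reached."""
    if t_years <= 0.0:
        return max(strike - spot, 0.0)
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t_years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return strike * math.exp(-rate * t_years) * norm_cdf(-d2) - spot * norm_cdf(-d1)


@dataclass(frozen=True)
class PathSummary:
    name: str
    max_ltv: float
    final_equity: float
    hedge_cost: float
    margin_called: bool


def run_path(
    name: str,
    closes: Sequence[float],
    underlying_units: float,
    loan: float,
    margin_ltv: float,
    strike_pct: Optional[float] = None,  # None means the unhedged path
    vol: float = 0.20,                   # assumed flat volatility
    days_to_expiry: int = 90,            # assumed tenor of the single put
) -> PathSummary:
    strike = closes[0] * strike_pct if strike_pct is not None else None

    def put_mark(day: int, close: float) -> float:
        # Uses only the close for `day`, so no later data leaks into the mark.
        if strike is None:
            return 0.0
        t_years = max(days_to_expiry - day, 0) / 365.0
        return underlying_units * bs_put(close, strike, t_years, vol)

    hedge_cost = put_mark(0, closes[0])  # premium paid at entry
    max_ltv = 0.0
    for day, close in enumerate(closes):
        collateral = underlying_units * close + put_mark(day, close)
        max_ltv = max(max_ltv, loan / collateral)

    last = len(closes) - 1
    # Assumption: the remaining option is liquidated at the final close.
    final_equity = underlying_units * closes[last] + put_mark(last, closes[last]) - loan - hedge_cost
    return PathSummary(name, max_ltv, final_equity, hedge_cost, max_ltv >= margin_ltv)


def rank(summaries: Sequence[PathSummary]) -> list:
    """Survivors first, then lower max LTV, lower cost, higher final equity."""
    return sorted(summaries, key=lambda s: (s.margin_called, s.max_ltv, s.hedge_cost, -s.final_equity))
```

Calling `run_path` once with `strike_pct=None` and once per template on the same `closes` gives the protected versus unprotected comparison, and `rank` orders the results the way the BT-003 report describes. A real engine would also need roll policies, dated cashflows, and explicit missing-data warnings, which this sketch leaves out.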