docs: define strategy template and backtesting MVP

Bu5hm4nn
2026-03-24 11:23:12 +01:00
parent d0b1304b71
commit 78a01d9fc5
2 changed files with 1036 additions and 5 deletions


@@ -0,0 +1,968 @@
# EXEC-001A / BT-001 MVP Architecture
## Scope
This document defines the MVP design for four related roadmap items:
- **EXEC-001A** — Named Strategy Templates
- **BT-001** — Synthetic Historical Backtesting
- **BT-002** — Historical Daily Options Snapshot Provider
- **BT-003** — Selloff Event Comparison Report
The goal is to give implementation agents a concrete architecture without requiring a database or a full UI rewrite. The MVP should fit the current codebase shape:
- domain models in `app/models/`
- IO and orchestration in `app/services/`
- strategy math in `app/strategies/` or a new `app/backtesting/` package
- lightweight docs under `docs/`
## Design goals
1. **Keep current live quote/options flows working.** Do not overload `app/services/data_service.py` with historical backtest state.
2. **Make templates reusable and named.** A strategy definition should be saved once and referenced by many backtests.
3. **Support synthetic-first backtests.** BT-001 must work before BT-002 exists.
4. **Prevent lookahead bias by design.** Providers and the run engine must expose only data available at each `as_of_date`.
5. **Preserve a migration path to real daily options snapshots.** Synthetic pricing and snapshot-based pricing must share the same provider contract.
6. **Stay file-backed for MVP persistence.** Repositories may use JSON files under `data/` first, behind interfaces.
## Terminology decision
The current code uses `LombardPortfolio.gold_ounces`, but the strategy engine effectively treats that field as generic underlying units. For historical backtesting, implementation agents should **not** extend that ambiguity.
### Recommendation
- Keep `LombardPortfolio` unchanged for existing live pages.
- Introduce backtesting-specific portfolio state using the neutral term **`underlying_units`**.
- Treat `symbol` + `underlying_units` as the canonical tradable exposure.
This avoids mixing physical ounces, GLD shares, and synthetic units in the backtest engine.
---
## MVP architecture summary
### Main decision
Create a new isolated subsystem:
- `app/models/strategy_template.py`
- `app/models/backtest.py`
- `app/models/event_preset.py`
- `app/services/historical/`
- `app/services/backtesting/`
- optional thin adapters in `app/strategies/` for reusing existing payoff logic
### Why isolate it
The current `DataService` is a live/synthetic read service with cache-oriented payload shaping. Historical backtesting needs:
- versioned saved definitions
- run lifecycle state
- daily path simulation
- historical provider abstraction
- reproducible result storage
Those concerns should not be mixed into the current request-time quote service.
---
## Domain model proposals
## 1. Strategy templates (EXEC-001A)
A strategy template is a **named, versioned, reusable hedge definition**. It is not a run result and it is not a specific dated option contract.
### `StrategyTemplate`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `template_id` | `str` | Stable UUID/string key |
| `slug` | `str` | Human-readable unique name, e.g. `protective-put-atm-12m` |
| `display_name` | `str` | UI/report label |
| `description` | `str` | Short rationale |
| `template_kind` | enum | `protective_put`, `laddered_put`, `collar` (future-safe) |
| `status` | enum | `draft`, `active`, `archived` |
| `version` | `int` | Increment on material rule changes |
| `underlying_symbol` | `str` | MVP may allow one symbol per template |
| `contract_mode` | enum | `continuous_units` for synthetic MVP, `listed_contracts` for BT-002+ |
| `legs` | `list[TemplateLeg]` | One or more parametric legs |
| `roll_policy` | `RollPolicy` | How/when to replace expiring hedges |
| `entry_policy` | `EntryPolicy` | When the initial hedge is entered |
| `tags` | `list[str]` | e.g. `conservative`, `income-safe` |
| `created_at` | `datetime` | Audit |
| `updated_at` | `datetime` | Audit |
### `TemplateLeg`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `leg_id` | `str` | Stable within template version |
| `side` | enum | `long` or `short`; MVP uses `long` only for puts |
| `option_type` | enum | `put` or `call` |
| `allocation_weight` | `float` | Must sum to `1.0` across active hedge legs in MVP |
| `strike_rule` | `StrikeRule` | MVP: `spot_pct` only |
| `target_expiry_days` | `int` | e.g. `365`, `180`, `90` |
| `quantity_rule` | enum | MVP: `target_coverage_pct` |
| `target_coverage_pct` | `float` | Usually `1.0` for full hedge, but supports partial hedges later |
### `StrikeRule`
MVP shape:
| Field | Type | Notes |
|---|---|---|
| `rule_type` | enum | `spot_pct` |
| `value` | `float` | e.g. `1.00`, `0.95`, `0.90` |
Future-safe, but not in MVP:
- `delta_target`
- `fixed_strike`
- `moneyness_bucket`
### `RollPolicy`
Recommended MVP fields:
| Field | Type | Notes |
|---|---|---|
| `policy_type` | enum | `hold_to_expiry`, `roll_n_days_before_expiry` |
| `days_before_expiry` | `int` | Required for rolling mode |
| `rebalance_on_new_deposit` | `bool` | Default `false` in MVP |
### `EntryPolicy`
Recommended MVP fields:
| Field | Type | Notes |
|---|---|---|
| `entry_timing` | enum | `scenario_start_close` |
| `stagger_days` | `int \| None` | Not used in MVP, keep nullable |
### MVP template invariants
Implementation agents should enforce:
- `slug` unique among active templates
- template versions immutable once referenced by a completed run
- weights sum to `1.0` for `protective_put`/`laddered_put` templates
- all legs use the same `target_expiry_days` in MVP unless explicitly marked as a ladder with shared roll policy
- `underlying_symbol` on the template must either match the scenario symbol or be `*`/generic if generic templates are later supported
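The weight invariant above could be checked along these lines. This is a minimal sketch: `LegSketch` and `validate_leg_weights` are hypothetical names, and the positive-weight check and the tolerance are assumptions that the invariants do not state.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LegSketch:
    """Hypothetical stand-in for TemplateLeg; only the weight is modeled."""

    leg_id: str
    allocation_weight: float


def validate_leg_weights(legs: list[LegSketch], tolerance: float = 1e-9) -> None:
    """Raise ValueError unless the leg weights sum to 1.0."""
    if not legs:
        raise ValueError("template needs at least one leg")
    # Assumption: a zero or negative weight is treated as a definition error.
    if any(leg.allocation_weight <= 0 for leg in legs):
        raise ValueError("allocation_weight must be positive")
    total = math.fsum(leg.allocation_weight for leg in legs)
    # Compare with a tolerance so that 1/3 + 1/3 + 1/3 style ladders pass.
    if not math.isclose(total, 1.0, abs_tol=tolerance):
        raise ValueError(f"weights sum to {total}, expected 1.0")
```

Raising on construction (rather than at run time) keeps invalid ladders out of the repository entirely.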
### Template examples
- `protective-put-atm-12m`
- `protective-put-95pct-12m`
- `ladder-50-50-atm-95pct-12m`
- `ladder-33-33-33-atm-95pct-90pct-12m`
These map cleanly onto the existing strategy set in `app/strategies/engine.py`.
---
## 2. Backtest scenarios
A backtest scenario is the **saved experiment definition**. It says what portfolio, time window, templates, provider, and execution rules are used.
### `BacktestScenario`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `scenario_id` | `str` | Stable UUID/string key |
| `slug` | `str` | Human-readable name |
| `display_name` | `str` | Report label |
| `description` | `str` | Optional scenario intent |
| `symbol` | `str` | Underlying being hedged |
| `start_date` | `date` | Inclusive |
| `end_date` | `date` | Inclusive |
| `initial_portfolio` | `BacktestPortfolioState` | Portfolio at day 0 |
| `template_refs` | `list[TemplateRef]` | One or more template versions to compare |
| `provider_ref` | `ProviderRef` | Which historical provider to use |
| `execution_model` | `ExecutionModel` | Daily close-to-close for MVP |
| `valuation_frequency` | enum | `daily` in MVP |
| `benchmark_mode` | enum | `unhedged_only` in MVP |
| `event_preset_id` | `str \| None` | Optional link for BT-003 |
| `notes` | `list[str]` | Optional warnings/assumptions |
| `created_at` | `datetime` | Audit |
### `BacktestPortfolioState`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `currency` | `str` | `USD` in MVP |
| `underlying_units` | `float` | Canonical exposure size |
| `entry_spot` | `float` | Starting spot reference |
| `loan_amount` | `float` | Outstanding loan |
| `margin_call_ltv` | `float` | Stress threshold |
| `cash_balance` | `float` | Usually `0.0` in MVP |
| `financing_rate` | `float` | Optional, default `0.0` in MVP |
### `TemplateRef`
Use a small immutable reference object:
| Field | Type | Notes |
|---|---|---|
| `template_id` | `str` | Stable template key |
| `version` | `int` | Required for reproducibility |
| `display_name_override` | `str \| None` | Optional report label |
### `ProviderRef`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `provider_id` | `str` | e.g. `synthetic_v1`, `daily_snapshots_v1` |
| `config_key` | `str` | Named config/profile used by the run |
| `pricing_mode` | enum | `synthetic_bs_mid` or `snapshot_mid` |
### `ExecutionModel`
MVP decision:
- **Daily close-to-close engine**
- Positions are evaluated once per trading day
- If a template rule triggers on date `T`, entry/roll is executed using provider data **as of date `T` close**
- Mark-to-market for date `T` uses the same `T` snapshot
This is a simplification, but it is deterministic and compatible with BT-002 daily snapshots.
### Scenario invariants
- `start_date <= end_date`
- at least one `template_ref`
- all referenced template versions must exist before run submission
- `initial_portfolio.loan_amount < initial_portfolio.underlying_units * initial_portfolio.entry_spot`
- scenario must declare the provider explicitly; no hidden global default inside the engine
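As an illustration, the scenario invariants could be collected into a single validation pass that reports every violation at once. `PortfolioSketch` and `scenario_errors` are hypothetical names used only for this sketch, and the check that referenced template versions exist is left out because it needs a repository lookup.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PortfolioSketch:
    """Hypothetical subset of BacktestPortfolioState."""

    underlying_units: float
    entry_spot: float
    loan_amount: float


def scenario_errors(
    start_date: date,
    end_date: date,
    template_ref_count: int,
    portfolio: PortfolioSketch,
    provider_id: str,
) -> list[str]:
    """Return human-readable invariant violations; an empty list means valid."""
    errors: list[str] = []
    if start_date > end_date:
        errors.append("start_date must be on or before end_date")
    if template_ref_count < 1:
        errors.append("at least one template_ref is required")
    collateral_value = portfolio.underlying_units * portfolio.entry_spot
    if portfolio.loan_amount >= collateral_value:
        errors.append("loan_amount must be below initial collateral value")
    if not provider_id:
        errors.append("provider must be declared explicitly")
    return errors
```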
---
## 3. Backtest runs and results
A run is the **execution record** of one scenario against one or more templates under one provider.
### `BacktestRun`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `run_id` | `str` | Stable UUID |
| `scenario_id` | `str` | Source scenario |
| `status` | enum | `queued`, `running`, `completed`, `failed`, `cancelled` |
| `provider_snapshot` | `ProviderSnapshot` | Frozen provider config used at run time |
| `submitted_at` | `datetime` | Audit |
| `started_at` | `datetime \| None` | Audit |
| `completed_at` | `datetime \| None` | Audit |
| `engine_version` | `str` | Git SHA or app version |
| `rules_version` | `str` | Semantic rules hash for reproducibility |
| `warnings` | `list[str]` | Missing data fallback, skipped dates, etc. |
| `error` | `str \| None` | Failure detail |
### `ProviderSnapshot`
Freeze the provider state used by a run:
| Field | Type | Notes |
|---|---|---|
| `provider_id` | `str` | Resolved provider implementation |
| `config` | `dict[str, Any]` | Frozen provider config used for the run |
| `source_version` | `str \| None` | Optional data snapshot/build hash |
### `BacktestRunResult`
Top-level recommended fields:
| Field | Type | Notes |
|---|---|---|
| `run_id` | `str` | Foreign key |
| `scenario_snapshot` | `BacktestScenario` or frozen subset | Freeze used inputs |
| `template_results` | `list[TemplateBacktestResult]` | One per template |
| `comparison_summary` | `RunComparisonSummary` | Ranked table |
| `generated_at` | `datetime` | Audit |
### `TemplateBacktestResult`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `template_id` | `str` | Identity |
| `template_version` | `int` | Reproducibility |
| `template_name` | `str` | Display |
| `summary_metrics` | `BacktestSummaryMetrics` | Compact ranking metrics |
| `daily_path` | `list[BacktestDailyPoint]` | Daily timeseries |
| `position_log` | `list[BacktestPositionRecord]` | Open/roll/expire events |
| `trade_log` | `list[BacktestTradeRecord]` | Cashflow events |
| `validation_notes` | `list[str]` | e.g. synthetic IV fallback used |
### `BacktestSummaryMetrics`
Recommended MVP metrics:
| Field | Type | Notes |
|---|---|---|
| `start_value` | `float` | Initial collateral value |
| `end_value_unhedged` | `float` | Baseline terminal collateral |
| `end_value_hedged_net` | `float` | After hedge P&L and premiums |
| `total_hedge_cost` | `float` | Sum of paid premiums |
| `total_option_payoff_realized` | `float` | Expiry/close realized payoff |
| `max_ltv_unhedged` | `float` | Path max |
| `max_ltv_hedged` | `float` | Path max |
| `margin_call_days_unhedged` | `int` | Count |
| `margin_call_days_hedged` | `int` | Count |
| `worst_drawdown_unhedged` | `float` | Optional but useful |
| `worst_drawdown_hedged` | `float` | Optional but useful |
| `days_protected_below_threshold` | `int` | Optional convenience metric |
| `roll_count` | `int` | Operational complexity |
### `BacktestDailyPoint`
Recommended daily path fields:
| Field | Type | Notes |
|---|---|---|
| `date` | `date` | Trading date |
| `spot_close` | `float` | Underlying close |
| `underlying_value` | `float` | `underlying_units * spot_close` |
| `option_market_value` | `float` | Mark-to-market of open hedge |
| `premium_cashflow` | `float` | Negative on entry/roll |
| `realized_option_cashflow` | `float` | Expiry/sale value |
| `net_portfolio_value` | `float` | Underlying + option MTM + cash |
| `loan_amount` | `float` | Constant in MVP |
| `ltv_unhedged` | `float` | Baseline |
| `ltv_hedged` | `float` | Hedge-aware |
| `margin_call_unhedged` | `bool` | Baseline |
| `margin_call_hedged` | `bool` | Hedge-aware |
| `active_position_ids` | `list[str]` | Traceability |
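The daily valuation fields can be illustrated with a small worked computation. The table does not pin down the LTV formulas, so this sketch assumes `ltv_unhedged = loan_amount / underlying_value`, `ltv_hedged = loan_amount / net_portfolio_value`, and a margin call when LTV is at or above `margin_call_ltv`; `DailyLtvSketch` and `daily_ltv` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyLtvSketch:
    underlying_value: float
    net_portfolio_value: float
    ltv_unhedged: float
    ltv_hedged: float
    margin_call_unhedged: bool
    margin_call_hedged: bool


def daily_ltv(
    underlying_units: float,
    spot_close: float,
    option_market_value: float,
    cash_balance: float,
    loan_amount: float,
    margin_call_ltv: float,
) -> DailyLtvSketch:
    """Compute one day's unhedged and hedge-aware LTV (assumed formulas)."""
    underlying_value = underlying_units * spot_close
    net_value = underlying_value + option_market_value + cash_balance
    if underlying_value <= 0 or net_value <= 0:
        raise ValueError("collateral value must be positive")
    ltv_unhedged = loan_amount / underlying_value
    ltv_hedged = loan_amount / net_value
    return DailyLtvSketch(
        underlying_value=underlying_value,
        net_portfolio_value=net_value,
        ltv_unhedged=ltv_unhedged,
        ltv_hedged=ltv_hedged,
        # Assumption: the threshold itself counts as a breach.
        margin_call_unhedged=ltv_unhedged >= margin_call_ltv,
        margin_call_hedged=ltv_hedged >= margin_call_ltv,
    )
```

For example, 1,000 units at a close of 150 with a 120,000 loan give an unhedged LTV of 0.80; adding 20,000 of option mark-to-market lowers the hedged LTV to roughly 0.71, which stays under a 0.75 threshold.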
### `BacktestTradeRecord`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `trade_id` | `str` | Stable key |
| `date` | `date` | Execution date |
| `action` | enum | `buy_open`, `sell_close`, `expire`, `roll` |
| `leg_id` | `str` | Template leg link |
| `instrument_key` | `HistoricalInstrumentKey` | Strike/expiry/type |
| `quantity` | `float` | Continuous or discrete |
| `price` | `float` | Fill price |
| `cashflow` | `float` | Signed |
| `reason` | enum | `initial_entry`, `scheduled_roll`, `expiry`, `scenario_end` |
### Run/result invariants
- runs are append-only after completion
- results must freeze template versions and scenario inputs used at execution time
- failed runs may omit `template_results` but must preserve `warnings`/`error`
- ranking should never rely on a metric that can be absent without a fallback rule
---
## 4. Event presets (BT-003)
An event preset is a **named reusable market window** used to compare strategy behavior across selloffs.
### `EventPreset`
Recommended fields:
| Field | Type | Notes |
|---|---|---|
| `event_preset_id` | `str` | Stable key |
| `slug` | `str` | e.g. `covid-crash-2020` |
| `display_name` | `str` | Report label |
| `symbol` | `str` | Underlying symbol |
| `window_start` | `date` | Inclusive |
| `window_end` | `date` | Inclusive |
| `anchor_date` | `date \| None` | Optional focal date |
| `event_type` | enum | `selloff`, `recovery`, `stress_test` |
| `tags` | `list[str]` | e.g. `macro`, `liquidity`, `vol-spike` |
| `description` | `str` | Why this event exists |
| `scenario_overrides` | `EventScenarioOverrides` | Optional defaults |
| `created_at` | `datetime` | Audit |
### `EventScenarioOverrides`
MVP fields:
| Field | Type | Notes |
|---|---|---|
| `lookback_days` | `int \| None` | Optional pre-window warmup |
| `recovery_days` | `int \| None` | Optional post-event tail |
| `default_template_slugs` | `list[str]` | Suggested comparison set |
| `normalize_start_value` | `bool` | Default `true` for event comparison charts |
### BT-003 usage pattern
- a report selects one or more `EventPreset`s
- each preset materializes a `BacktestScenario`
- the same template set is run across all events
- report compares normalized daily paths and summary metrics
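Normalizing daily paths so that different events share a common starting point might look like the following sketch. The base of `100.0` and the helper name `normalize_path` are assumptions, not part of the roadmap items.

```python
def normalize_path(values: list[float], base: float = 100.0) -> list[float]:
    """Rescale a daily value path so its first point equals `base`."""
    if not values:
        return []
    first = values[0]
    if first <= 0:
        raise ValueError("first value must be positive to normalize")
    return [base * value / first for value in values]
```

With this rescaling, a path of 200, 180, 220 becomes 100, 90, 110, so selloffs of different absolute size can be overlaid on one chart.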
### MVP event decision
Use **manual date windows only**. Do not attempt automatic peak/trough detection in the first slice.
---
## Historical provider abstraction
## Core interface
Create a provider contract that exposes only **point-in-time historical data**.
### `HistoricalMarketDataProvider`
Recommended methods:
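```python
from __future__ import annotations

from datetime import date
from typing import Protocol

# UnderlyingBar, OptionSnapshotQuery, OptionSnapshot, HistoricalOptionPosition and
# HistoricalOptionMark are the supporting models; postponed annotations keep this importable.


class HistoricalMarketDataProvider(Protocol):
    provider_id: str

    def get_trading_days(self, symbol: str, start_date: date, end_date: date) -> list[date]: ...

    def get_underlying_bars(
        self, symbol: str, start_date: date, end_date: date
    ) -> list[UnderlyingBar]: ...

    def get_option_snapshot(
        self, query: OptionSnapshotQuery
    ) -> OptionSnapshot: ...

    def price_open_position(
        self, position: HistoricalOptionPosition, as_of_date: date
    ) -> HistoricalOptionMark: ...
```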
### Why this interface
It cleanly supports both provider types:
- **BT-001 synthetic provider** — generate option values from deterministic assumptions
- **BT-002 snapshot provider** — read real daily option quotes/surfaces from stored snapshots
It also makes lookahead control explicit: option selection and pricing are always requested **as of** a specific date, and the range methods return only the dates the caller asks for.
## Supporting provider models
### `UnderlyingBar`
| Field | Type | Notes |
|---|---|---|
| `date` | `date` | Trading day |
| `open` | `float` | Optional for future use |
| `high` | `float` | Optional |
| `low` | `float` | Optional |
| `close` | `float` | Required |
| `volume` | `float \| None` | Optional |
| `source` | `str` | Provider/source tag |
### `OptionSnapshotQuery`
| Field | Type | Notes |
|---|---|---|
| `symbol` | `str` | Underlying |
| `as_of_date` | `date` | Point-in-time date |
| `option_type` | enum | `put`/`call` |
| `target_expiry_days` | `int` | Desired tenor |
| `strike_rule` | `StrikeRule` | Resolved against current spot |
| `pricing_side` | enum | `mid` in MVP |
### `OptionSnapshot`
| Field | Type | Notes |
|---|---|---|
| `as_of_date` | `date` | Snapshot date |
| `symbol` | `str` | Underlying |
| `underlying_close` | `float` | Spot used for selection/pricing |
| `selected_contract` | `HistoricalOptionQuote` | Resolved contract |
| `selection_notes` | `list[str]` | e.g. nearest expiry/nearest strike |
| `source` | `str` | Provider ID |
### `HistoricalOptionQuote`
| Field | Type | Notes |
|---|---|---|
| `instrument_key` | `HistoricalInstrumentKey` | Canonical contract identity |
| `bid` | `float` | Optional for snapshot provider |
| `ask` | `float` | Optional |
| `mid` | `float` | Required for MVP valuation |
| `implied_volatility` | `float \| None` | Required for BT-002, synthetic-derived for BT-001 |
| `delta` | `float \| None` | Optional now, useful later |
| `open_interest` | `int \| None` | Optional now |
| `volume` | `int \| None` | Optional now |
| `source` | `str` | Provider/source tag |
### `HistoricalInstrumentKey`
| Field | Type | Notes |
|---|---|---|
| `symbol` | `str` | Underlying |
| `option_type` | enum | `put`/`call` |
| `expiry` | `date` | Contract expiry |
| `strike` | `float` | Contract strike |
---
## Provider implementations
## A. `SyntheticHistoricalProvider` (BT-001 first)
Purpose:
- generate deterministic historical backtests without requiring stored historical options chains
- use historical underlying closes plus a synthetic volatility/rates regime
- resolve template legs into synthetic option quotes on each rebalance date
- reprice open positions daily using the same model family
### Recommended behavior
Inputs:
- underlying close series (from yfinance file cache, CSV fixture, or another deterministic source)
- configured implied volatility regime, e.g. fixed `0.16` or dated step regime
- configured risk-free rate regime
- optional stress spread for transaction cost realism
Entry and valuation:
- on a rebalance date, compute strike from `spot_pct * spot_close`
- set expiry by nearest trading day to `as_of_date + target_expiry_days`
- price using Black-Scholes with the current day's spot, configured IV, remaining time, and option type
- on later dates, reprice the same contract with the same model, updating only the current spot and remaining time (plus the IV/rate configured for that date when a schedule is used)
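The entry and valuation steps above can be sketched with a plain Black-Scholes put. This is illustrative only: `synthetic_put_mid` is a hypothetical helper, the 365-day year convention is an assumption, and the existing helpers in `app/core/pricing/` may already cover this.

```python
import math
from datetime import date


def _norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def synthetic_put_mid(
    spot: float,
    strike: float,
    as_of_date: date,
    expiry: date,
    iv: float,
    rate: float = 0.0,
) -> float:
    """European put value using only inputs known on `as_of_date`."""
    days_left = (expiry - as_of_date).days
    if days_left == 0:
        return max(strike - spot, 0.0)  # intrinsic value on the expiry date
    if days_left < 0:
        return 0.0  # assumption: an expired contract carries no value afterward
    t = days_left / 365.0  # assumption: calendar-day year fraction
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * iv * iv) * t) / (iv * sqrt_t)
    d2 = d1 - iv * sqrt_t
    return strike * math.exp(-rate * t) * _norm_cdf(-d2) - spot * _norm_cdf(-d1)
```

The strike for a `spot_pct` rule would be computed once at entry as `value * spot_close` and then held fixed; only `spot`, `as_of_date`, and the configured `iv`/`rate` change on later repricing calls.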
### MVP synthetic assumptions
- constant or schedule-based implied volatility; no future realized volatility leakage
- no stochastic volatility process in first slice
- no early exercise modeling
- no assignment modeling
- `mid` price only
- deterministic rounding/selection rules
### Why synthetic-first is acceptable
It validates:
- template persistence
- run lifecycle
- path valuation
- daily result rendering
- anti-lookahead contract boundaries
before adding BT-002 data ingestion complexity.
## B. `DailyOptionsSnapshotProvider` (BT-002)
Purpose:
- load historical option quotes for each trading day
- resolve actual listed contracts closest to template rules
- mark open positions to historical daily mids thereafter
### Recommended behavior
- selection on entry day uses nearest eligible expiry and nearest eligible strike from that day's chain only
- mark-to-market later uses the exact same contract key if a quote exists on later dates
- if the contract is missing on a later date, provider returns a missing-data result and the engine applies a documented fallback policy
### MVP fallback policy for missing marks
Implementation agents should choose one explicit fallback and test it. Recommended order:
1. exact contract from same-day snapshot
2. if unavailable, previous available mark from same contract with warning
3. if unavailable and contract is expired, intrinsic value at expiry or zero afterward
4. otherwise fail the run or mark the template result incomplete
Do **not** silently substitute a different strike/expiry for an already-open position.
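One way to encode the recommended fallback order is a small pure function that the engine calls when marking a position. `MarkDecision`, `MissingMarkError`, and `resolve_mark` are hypothetical names, and the inputs are assumed to be pre-fetched for the exact same contract key.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


class MissingMarkError(LookupError):
    """Raised when no documented fallback applies (step 4)."""


@dataclass(frozen=True)
class MarkDecision:
    value: float
    warning: Optional[str]  # None only when the same-day quote was used


def resolve_mark(
    as_of_date: date,
    expiry: date,
    same_day_mid: Optional[float],
    previous_mid: Optional[float],
    intrinsic_at_expiry: float,
) -> MarkDecision:
    """Apply the fallback order: same-day, carry-forward, expiry value, fail."""
    if same_day_mid is not None:
        return MarkDecision(same_day_mid, None)
    if previous_mid is not None:
        return MarkDecision(previous_mid, f"{as_of_date}: carried forward previous mark")
    if as_of_date >= expiry:
        value = intrinsic_at_expiry if as_of_date == expiry else 0.0
        return MarkDecision(value, f"{as_of_date}: expired contract valued at {value}")
    raise MissingMarkError(f"{as_of_date}: no usable mark for open position")
```

Every fallback path returns a warning string, so the engine can append it to both the run warnings and the template validation notes.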
---
## Backtest engine flow
Create a dedicated engine under `app/backtesting/engine.py`. Keep orchestration and repository wiring in `app/services/backtesting/`.
### High-level loop
For each template in the scenario:
1. load trading days from provider
2. create baseline unhedged path
3. resolve initial hedge on `start_date`
4. for each trading day:
- read underlying close for day `T`
- mark open option positions as of `T`
- compute unhedged and hedged portfolio value
- compute LTV and margin-call flags
- check roll/expiry rules using only `T` data
- if a roll is due, close/expire old position and open replacement using `T` snapshot
5. liquidate remaining position at scenario end if still open
6. calculate summary metrics
7. rank templates inside `comparison_summary`
### Position model recommendation
Use a separate open-position model rather than reusing `OptionContract` directly.
Recommended `HistoricalOptionPosition` fields:
- `position_id`
- `instrument_key`
- `opened_at`
- `expiry`
- `quantity`
- `entry_price`
- `current_mark`
- `template_leg_id`
- `source_snapshot_date`
Reason: backtests need lifecycle state and audit fields that the current `OptionContract` model does not carry.
### Ranking recommendation
For MVP comparison views, rank templates by:
1. fewer `margin_call_days_hedged`
2. lower `max_ltv_hedged`
3. lower `total_hedge_cost`
4. higher `end_value_hedged_net`
This is easier to explain than a single opaque score.
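The ranking order maps directly onto a tuple sort key. In this sketch `RankRow` is a hypothetical flattened view of the summary metrics, and the final tie-break on `template_name` is an added assumption to keep the order deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankRow:
    template_name: str
    margin_call_days_hedged: int
    max_ltv_hedged: float
    total_hedge_cost: float
    end_value_hedged_net: float


def rank_templates(rows: list[RankRow]) -> list[RankRow]:
    """Sort best-first using the documented metric order."""
    return sorted(
        rows,
        key=lambda r: (
            r.margin_call_days_hedged,  # fewer is better
            r.max_ltv_hedged,           # lower is better
            r.total_hedge_cost,         # lower is better
            -r.end_value_hedged_net,    # higher is better, so negate
            r.template_name,            # assumption: stable final tie-break
        ),
    )
```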
---
## Data realism constraints
Implementation agents should treat the following as mandatory MVP rules.
## 1. Point-in-time only
On day `T`, the engine may use only:
- underlying bar for `T`
- option snapshot for `T`
- provider configuration known before the run starts
- open positions created earlier or on `T`
It may **not** use:
- future closes
- future implied vols
- terminal event windows beyond `T` for trading decisions
- any provider helper that precomputes the whole path and leaks future state into contract selection
## 2. Stable contract identity after entry
Once a contract is opened, daily valuation must use that exact contract identity:
- same symbol
- same expiry
- same strike
- same option type
No rolling relabeling of a live position to a “nearest” contract.
## 3. Explicit selection rules
Template rules must resolve to contracts with deterministic tiebreakers:
- nearest expiry at or beyond target DTE
- nearest strike to rule target
- if tied, prefer more conservative strike for puts (higher strike) and earliest expiry
Tiebreakers must be documented and unit-tested.
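A sketch of these selection rules for puts follows, assuming the candidates come only from the entry-day chain. `ListedPutSketch` and `select_put` are hypothetical names for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ListedPutSketch:
    expiry: date
    strike: float


def select_put(
    candidates: list[ListedPutSketch],
    target_expiry: date,
    target_strike: float,
) -> ListedPutSketch:
    """Pick the earliest expiry at or beyond target, then the nearest strike."""
    eligible = [c for c in candidates if c.expiry >= target_expiry]
    if not eligible:
        raise LookupError("no expiry at or beyond the target")
    chosen_expiry = min(c.expiry for c in eligible)
    same_expiry = [c for c in eligible if c.expiry == chosen_expiry]
    # Sort key: distance to target first; on a tie the higher strike wins.
    return min(same_expiry, key=lambda c: (abs(c.strike - target_strike), -c.strike))
```

With strikes 95 and 105 equally far from a target of 100, the function returns the 105 put, which matches the "more conservative strike for puts" tiebreaker.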
## 4. Execution timing must be fixed
MVP should use **same-day close execution** consistently.
Do not mix:
- signal at close / fill next open
- signal at close / fill same close
- signal intraday / mark at close
If this changes later, it must be a scenario-level parameter.
## 5. Continuous-vs-listed quantity must be explicit
MVP synthetic runs may use `continuous_units`.
BT-002 listed snapshot runs should support `listed_contracts` with contract-size rounding.
Do not hide rounding rules inside providers.
They belong in the position sizing logic and must be recorded in the result.
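For illustration, position sizing could return both the quantity and the rounding rule that produced it, so the result can record it. The function name `size_leg`, the floor rounding direction, and the default contract size of 100 are assumptions for this sketch.

```python
import math


def size_leg(
    underlying_units: float,
    target_coverage_pct: float,
    allocation_weight: float,
    contract_mode: str,
    contract_size: int = 100,
) -> dict[str, object]:
    """Return quantity plus the rounding rule so it can be stored in the result."""
    raw_units = underlying_units * target_coverage_pct * allocation_weight
    if contract_mode == "continuous_units":
        return {"quantity": raw_units, "covered_units": raw_units, "rounding": "none"}
    if contract_mode == "listed_contracts":
        # Assumption: round down so the hedge never exceeds the exposure.
        contracts = math.floor(raw_units / contract_size)
        return {
            "quantity": float(contracts),
            "covered_units": float(contracts * contract_size),
            "rounding": "floor",
        }
    raise ValueError(f"unknown contract_mode: {contract_mode}")
```

For 1,050 units, full coverage, and a 50% leg weight, the continuous mode hedges 525 units while the listed mode buys 5 contracts covering 500 units.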
## 6. Costs must be recorded as cashflows
Premiums and close/expiry proceeds must be stored as dated cashflows.
Do not collapse the entire hedge economics into end-of-period payoff only.
## 7. Missing data cannot be silent
Any missing snapshot/mark fallback must add:
- a run warning
- a template validation note
- a deterministic result status if the template becomes incomplete
---
## Anti-lookahead rules
These should be copied into tests and implementation notes verbatim.
1. **Contract selection rule**: select options using only the entry-day snapshot.
2. **Daily MTM rule**: mark open positions using only same-day data for the same contract.
3. **Expiry rule**: once `as_of_date >= expiry`, option value becomes intrinsic-at-expiry or zero after expiry according to the provider contract; it is not repriced with negative time-to-expiry.
4. **Event preset rule**: event presets may define scenario dates, but the strategy engine may not inspect future event endpoints when deciding to roll or exit.
5. **Synthetic vol rule**: synthetic providers may use fixed or date-indexed IV schedules, but never realized future path statistics from dates after `as_of_date`.
6. **Metric rule**: comparison metrics may summarize the whole run only after the run completes; they may not feed back into trading decisions during the run.
---
## Phased implementation plan with TDD slices
Each slice should leave behind tests and a minimal implementation path.
## Slice 0 — Red tests for model invariants
Target:
- create tests for `StrategyTemplate`, `BacktestScenario`, `BacktestRun`, `EventPreset`
- validate weights, dates, versioned references, and uniqueness assumptions
Suggested tests:
- invalid ladder weights rejected
- scenario with end before start rejected
- template ref requires explicit version
- loan amount must be below initial collateral value
## Slice 1 — Named template repository (EXEC-001A core)
Target:
- file-backed `StrategyTemplateRepository`
- save/load/list active templates
- version bump on immutable update
Suggested tests:
- saving template round-trips cleanly
- updating active template creates version 2, not in-place mutation
- archived template stays loadable for historical runs
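A file-backed repository with append-only versioning might start out like the sketch below. `TemplateRepositorySketch` is a hypothetical class, records are plain dicts rather than the final models, and the single-file JSON list layout is an assumption.

```python
import json
from pathlib import Path


class TemplateRepositorySketch:
    """Stores every template version as a separate record in one JSON file."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def _load(self) -> list[dict]:
        if not self._path.exists():
            return []
        return json.loads(self._path.read_text(encoding="utf-8"))

    def save_new_version(self, template_id: str, payload: dict) -> int:
        """Append a new version instead of mutating an existing record."""
        records = self._load()
        versions = [r["version"] for r in records if r["template_id"] == template_id]
        version = max(versions, default=0) + 1
        records.append({**payload, "template_id": template_id, "version": version})
        self._path.parent.mkdir(parents=True, exist_ok=True)
        self._path.write_text(json.dumps(records, indent=2), encoding="utf-8")
        return version

    def get(self, template_id: str, version: int) -> dict:
        """Load one exact version; archived versions remain loadable."""
        for record in self._load():
            if record["template_id"] == template_id and record["version"] == version:
                return record
        raise KeyError((template_id, version))
```

Because updates only append, a completed run that references version 1 keeps resolving to the same definition after version 2 is saved.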
## Slice 2 — Synthetic provider contract (BT-001 foundation)
Target:
- `HistoricalMarketDataProvider` protocol
- `SyntheticHistoricalProvider`
- deterministic underlying fixture input + synthetic option pricing
Suggested tests:
- provider returns stable trading day list
- spot-pct strike resolution uses same-day spot only
- repricing uses decreasing time to expiry
- no future bar access required for day-`T` pricing
## Slice 3 — Single-template backtest engine
Target:
- run one protective-put template across a short scenario
- output daily path + summary metrics
Suggested tests:
- hedge premium paid on entry day
- option MTM increases when spot falls materially below strike
- hedged max LTV is <= unhedged max LTV in a monotonic selloff fixture
- completed run freezes scenario and template version snapshots
## Slice 4 — Multi-template comparison runs
Target:
- compare ATM, 95% put, 50/50 ladder on same scenario
- produce ranked `comparison_summary`
Suggested tests:
- all template results share same scenario snapshot
- ranking uses documented metric order
- equal primary metric falls back to next metric deterministically
## Slice 5 — Roll logic and expiry behavior
Target:
- support `roll_n_days_before_expiry`
- support expiry settlement and position replacement
Suggested tests:
- roll occurs exactly on configured trading-day offset
- expired contracts stop carrying time value
- no contract identity mutation between entry and close
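The roll-date test could be driven by a helper like the one sketched here. It assumes `days_before_expiry` counts trading sessions and that the trading calendar is known up front; the roll policy table does not say whether calendar days are meant instead, and `scheduled_roll_date` is a hypothetical name.

```python
from bisect import bisect_right
from datetime import date


def scheduled_roll_date(
    trading_days: list[date],
    expiry: date,
    days_before_expiry: int,
) -> date:
    """Return the session that falls N sessions before the last one at or before expiry.

    `trading_days` must be sorted ascending.
    """
    if days_before_expiry < 0:
        raise ValueError("days_before_expiry must be non-negative")
    # Index of the last trading day on or before the expiry date.
    last_idx = bisect_right(trading_days, expiry) - 1
    roll_idx = last_idx - days_before_expiry
    if last_idx < 0 or roll_idx < 0:
        raise LookupError("calendar does not cover the roll date")
    return trading_days[roll_idx]
```

The engine would then compare each day `T` against this date, which uses only the calendar and the contract's own expiry rather than any future prices.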
## Slice 6 — Event presets and BT-003 scenario materialization
Target:
- repository for `EventPreset`
- materialize preset -> scenario
- run comparison over multiple named events
Suggested tests:
- preset dates map cleanly into scenario dates
- scenario overrides are applied explicitly
- normalized event series start from common baseline
## Slice 7 — Daily snapshot provider (BT-002)
Target:
- add `DailyOptionsSnapshotProvider` behind same contract
- reuse existing engine with provider swap only
Suggested tests:
- entry picks nearest valid listed contract from snapshot
- later MTM uses same contract key
- missing mark generates warning and applies documented fallback
- synthetic and snapshot providers both satisfy same provider test suite
## Slice 8 — Thin API/UI integration after engine is proven
Not part of this document's implementation scope, but the natural next step is:
- `/api/templates`
- `/api/backtests`
- `/api/backtests/{run_id}`
- later a NiceGUI page for listing templates and runs
Per project rules, do not claim this feature is live until the UI consumes real run data.
---
## Recommended file/module layout
Recommended minimal layout for this codebase:
```text
app/
  backtesting/
    __init__.py
    engine.py                 # run loop, ranking, metric aggregation
    position_sizer.py         # continuous vs listed quantity rules
    result_metrics.py         # path -> summary metrics
    scenario_materializer.py  # event preset -> scenario
    selection.py              # strike/expiry resolution helpers
  models/
    strategy_template.py      # StrategyTemplate, TemplateLeg, RollPolicy, EntryPolicy
    backtest.py               # BacktestScenario, BacktestRun, results, daily points
    event_preset.py           # EventPreset, overrides
    historical_data.py        # UnderlyingBar, OptionSnapshot, InstrumentKey, marks
  services/
    backtesting/
      __init__.py
      orchestrator.py         # submit/load/list runs
      repositories.py         # file-backed run repository helpers
    historical/
      __init__.py
      base.py                 # HistoricalMarketDataProvider protocol
      synthetic.py            # BT-001 provider
      snapshots.py            # BT-002 provider
    templates/
      __init__.py
      repository.py           # save/load/list/version templates
    events/
      __init__.py
      repository.py           # save/load/list presets
```
### Persistence recommendation for MVP
Use file-backed repositories first:
```text
data/
  strategy_templates.json
  event_presets.json
  backtests/
    <run_id>.json
```
Reason:
- aligns with current `PortfolioRepository` style
- keeps the MVP small
- allows deterministic fixtures in tests
- can later be swapped for a database behind the same repository interfaces
---
## Code reuse guidance
Implementation agents should reuse existing code selectively.
### Safe to reuse
- pricing helpers in `app/core/pricing/`
- payoff logic concepts from `app/models/option.py`
- existing strategy presets from `app/strategies/engine.py` as seed templates
### Do not reuse directly without adaptation
- `StrategySelectionEngine` as the backtest engine
- `DataService` as a historical run orchestrator
- `LombardPortfolio.gold_ounces` as the canonical backtest exposure field
Reason: these current types are optimized for present-time research payloads, not dated position lifecycle state.
---
## Open implementation decisions to settle before coding
1. **Underlying source for synthetic BT-001**: use yfinance historical closes directly, local fixture CSVs, or both?
2. **Quantity mode in first runnable slice**: support only `continuous_units` first, or implement listed contract rounding immediately?
3. **Scenario end behavior**: liquidate remaining option at final close, or leave terminal MTM only?
4. **Missing snapshot policy**: hard-fail vs warn-and-carry-forward?
5. **Provider metadata freezing**: store config only, or config + source data hash?
Recommended answers for MVP:
- yfinance historical closes with deterministic test fixtures for unit tests
- `continuous_units` first
- liquidate at final close for clearer realized P&L
- warn-and-carry-forward only for same-contract marks, otherwise fail
- freeze provider config plus app/git version
---
## Implementation-ready recommendations
1. **Build BT-001 around a new provider interface, not around `DataService`.**
2. **Treat templates as immutable versioned definitions.** Runs must reference template versions, not mutable slugs only.
3. **Use a daily close-to-close engine for MVP and document it everywhere.**
4. **Record every hedge premium and payoff as dated cashflows.**
5. **Keep synthetic provider and daily snapshot provider behind the same contract.**
6. **Introduce `underlying_units` in backtesting models to avoid `gold_ounces` ambiguity.**
7. **Make missing data warnings explicit and persistent in run results.**


@@ -292,10 +292,73 @@ DATA-001 (Price Feed)
**Dependencies:** PORT-001
### EXEC-001A: Named Strategy Templates [P1, M] **[depends: DATA-003]**
**As a** risk manager, **I want** protection strategies to be first-class named templates **so that** I can compare, reuse, and later edit hedge definitions.
**Acceptance Criteria:**
- Persist strategy templates with a stable id and display name
- Support at least initial system-defined templates (future user-editable names)
- Store template parameters separately from backtest scenarios
- Strategy templates are reusable by both live recommendation flows and backtests
**Technical Notes:**
- Add strategy template model/repository
- Separate template definition from strategy execution state
- Keep room for future user-editable naming and rule parameters
**Dependencies:** DATA-003
### BT-001: Synthetic Historical Backtesting [P1, L] **[depends: EXEC-001A, PORT-001]**
**As a** portfolio manager, **I want** to backtest hedge strategies against historical selloffs **so that** I can see which approach would have survived without a margin call.
**Acceptance Criteria:**
- Define a backtest scenario with start date, end date, collateral basis, and initial LTV
- Simulate at least one named hedge strategy over historical GLD prices
- Report max LTV, final equity, hedge cost, and whether a margin call would have occurred
- Compare protected vs unprotected outcomes for the same scenario
- Support known event replay such as the 2026 gold selloff window
**Technical Notes:**
- Start with synthetic/model-priced historical options rather than requiring point-in-time full historical chains
- Use historical underlying prices plus Black-Scholes/volatility assumptions
- Output both time series and summary metrics
**Dependencies:** EXEC-001A, PORT-001
### BT-002: Historical Daily Options Snapshot Provider [P2, L] **[depends: BT-001]**
**As a** quant user, **I want** real historical daily options snapshots **so that** backtests use observed premiums instead of only modeled prices.
**Acceptance Criteria:**
- Historical data provider abstraction supports point-in-time daily option chain snapshots
- Backtest engine can swap synthetic pricing for provider-backed historical daily premiums
- Contract selection avoids lookahead bias
- Provider choice and data quality limits are documented clearly
**Technical Notes:**
- Add provider interface for underlying history and option snapshot history
- Prefer daily snapshots first; intraday/tick fidelity is a later upgrade
- Candidate providers: Databento, Massive/Polygon, ThetaData, EODHD
**Dependencies:** BT-001
### BT-003: Selloff Event Comparison Report [P2, M] **[depends: BT-001]**
**As a** portfolio manager, **I want** event-based backtest reports **so that** I can answer questions like “which strategy got me through the Jan 2026 selloff?”
**Acceptance Criteria:**
- Event presets can define named historical windows
- Report ranks strategies by survival, max LTV, cost, and final equity
- Report highlights breach date if a strategy fails
- UI can show the unhedged path beside hedged paths
**Dependencies:** BT-001
## Implementation Priority Queue
1. **EXEC-001A** - Define named strategy templates as the foundation for backtesting
2. **BT-001** - Ship synthetic historical backtesting over GLD history
3. **PORT-003** - Historical LTV visibility and export groundwork
4. **BT-002** - Upgrade backtests to real daily options snapshots
5. **BT-003** - Event comparison reporting
6. **EXEC-001** - Core user workflow
7. **EXEC-002** - Execution capability
8. Remaining features