docs: add Databento integration plan and roadmap items
@@ -0,0 +1,34 @@
id: DATA-DB-001
title: Databento Historical Price Source
status: backlog
priority: high
dependencies: []
estimated_effort: 2-3 days
created: 2026-03-28
updated: 2026-03-28

description: |
  Integrate Databento historical API as a data source for backtesting and scenario
  comparison pages. This replaces yfinance for historical data on backtest pages
  and provides reliable, high-quality market data.

acceptance_criteria:
  - DatabentoHistoricalPriceSource implements HistoricalPriceSource protocol
  - Cache layer prevents redundant downloads when parameters unchanged
  - Environment variable DATABENTO_API_KEY used for authentication
  - Cost estimation available before data fetch
  - GLD symbol resolved to XNAS.BASIC dataset
  - GC=F symbol resolved to GLBX.MDP3 dataset
  - Unit tests with mocked Databento responses pass

implementation_notes: |
  Key files:
  - app/services/backtesting/databento_source.py (new)
  - tests/test_databento_source.py (new)

  Uses ohlcv-1d schema for daily bars. The cache key includes dataset, symbol,
  schema, start_date, and end_date. Cache files are Parquet format for fast
  loading. Metadata includes download_date for age validation.

dependencies_detail:
  - None - this is the foundation for Databento integration
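
Note on DATA-DB-001: the cache key described in the implementation notes could be sketched roughly as follows. This is a minimal illustration, not the planned implementation. The class name DatabentoCacheKey is taken from the DATA-DB-003 dependency note and the `dbn_{hash}.parquet` pattern from its cache layout; the SHA-256 digest, the 16-character truncation, and the helper `resolve_dataset` are assumptions made for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import date

# Symbol-to-dataset mapping from the acceptance criteria.
DATASET_BY_SYMBOL = {"GLD": "XNAS.BASIC", "GC=F": "GLBX.MDP3"}


def resolve_dataset(symbol: str) -> str:
    """Hypothetical helper: look up the Databento dataset for a symbol."""
    try:
        return DATASET_BY_SYMBOL[symbol]
    except KeyError:
        raise ValueError(f"no dataset configured for symbol {symbol!r}")


@dataclass(frozen=True)
class DatabentoCacheKey:
    dataset: str
    symbol: str
    schema: str
    start_date: date
    end_date: date

    def digest(self) -> str:
        # Serialize fields in a stable order so equal keys hash identically.
        payload = json.dumps(
            {name: str(value) for name, value in asdict(self).items()},
            sort_keys=True,
        )
        # Assumption: a truncated SHA-256 hex digest is unique enough here.
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

    def data_filename(self) -> str:
        return f"dbn_{self.digest()}.parquet"


key = DatabentoCacheKey(
    dataset=resolve_dataset("GLD"),
    symbol="GLD",
    schema="ohlcv-1d",
    start_date=date(2020, 3, 1),
    end_date=date(2020, 3, 31),
)
print(key.data_filename())
```

Because every request parameter feeds the digest, changing any one of them yields a different file name, which is one simple way to get the "no redundant downloads when parameters unchanged" behaviour.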
@@ -0,0 +1,39 @@
id: DATA-DB-002
title: Backtest Settings Model
status: backlog
priority: high
dependencies:
  - DATA-DB-001
estimated_effort: 1 day
created: 2026-03-28
updated: 2026-03-28

description: |
  Create BacktestSettings model that captures user-configurable backtest parameters
  independent of portfolio settings. This allows running scenarios with custom start
  prices and position sizes without modifying the main portfolio.

acceptance_criteria:
  - BacktestSettings dataclass defined with all necessary fields
  - start_price can be 0 (auto-derive) or explicit value
  - underlying_units independent of portfolio.gold_ounces
  - loan_amount and margin_call_ltv for LTV analysis
  - data_source field supports "databento" and "yfinance"
  - Repository persists settings per workspace
  - Default settings created for new workspaces

implementation_notes: |
  Key fields:
  - settings_id: UUID for tracking
  - data_source: "databento" | "yfinance" | "synthetic"
  - dataset: "XNAS.BASIC" | "GLBX.MDP3"
  - underlying_symbol: "GLD" | "GC" | "XAU"
  - start_date, end_date: date range
  - start_price: 0 for auto-derive, or explicit
  - underlying_units: position size for scenario
  - loan_amount: debt level for LTV analysis

  Settings are stored in .workspaces/{workspace_id}/backtest_settings.json

dependencies_detail:
  - DATA-DB-001: Need data source configuration fields
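
Note on DATA-DB-002: a rough sketch of how the BacktestSettings dataclass and its JSON round trip might look. Field names come from the key fields and acceptance criteria above. All default values, the method names, and the rule that an auto-derived start price equals the first close in the window are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date
from uuid import uuid4


@dataclass
class BacktestSettings:
    settings_id: str = field(default_factory=lambda: str(uuid4()))
    data_source: str = "databento"       # or "yfinance" / "synthetic"
    dataset: str = "XNAS.BASIC"
    underlying_symbol: str = "GLD"
    start_date: date = date(2020, 1, 1)  # placeholder default
    end_date: date = date(2020, 12, 31)  # placeholder default
    start_price: float = 0.0             # 0 means auto-derive
    underlying_units: float = 1000.0     # placeholder default
    loan_amount: float = 0.0
    margin_call_ltv: float = 0.75        # placeholder default

    def effective_start_price(self, first_close: float) -> float:
        # Assumption: auto-derive falls back to the first close in the window.
        return first_close if self.start_price == 0 else self.start_price

    def to_json(self) -> str:
        data = asdict(self)
        # Dates are not JSON-serializable, so store them as ISO strings.
        data["start_date"] = self.start_date.isoformat()
        data["end_date"] = self.end_date.isoformat()
        return json.dumps(data, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "BacktestSettings":
        data = json.loads(text)
        data["start_date"] = date.fromisoformat(data["start_date"])
        data["end_date"] = date.fromisoformat(data["end_date"])
        return cls(**data)


settings = BacktestSettings(underlying_units=500.0, loan_amount=50_000.0)
restored = BacktestSettings.from_json(settings.to_json())
print(restored == settings)
```

A repository layer could write the `to_json()` output to the per-workspace file named in the notes; that part is left out here.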
@@ -0,0 +1,40 @@
id: DATA-DB-003
title: Databento Cache Management
status: backlog
priority: medium
dependencies:
  - DATA-DB-001
estimated_effort: 1 day
created: 2026-03-28
updated: 2026-03-28

description: |
  Implement cache lifecycle management for Databento data. Cache files should be
  invalidated after a configurable age (default 30 days) and when request parameters
  change. Provide a CLI tool for cache inspection and cleanup.

acceptance_criteria:
  - DatabentoCacheManager lists all cached entries
  - Entries invalidated after max_age_days
  - Parameter change detection triggers re-download
  - Cache size tracking available
  - CLI command to clear all cache
  - CLI command to show cache statistics

implementation_notes: |
  Cache files stored in .cache/databento/:
  - dbn_{hash}.parquet: Data file
  - dbn_{hash}_meta.json: Metadata (download_date, params, rows)

  Cache invalidation rules:
  1. Age > max_age_days (default 30): re-download
  2. Parameters changed: re-download
  3. File corruption: re-download

  CLI commands:
  - vault-dash cache list
  - vault-dash cache clear
  - vault-dash cache stats

dependencies_detail:
  - DATA-DB-001: Needs DatabentoCacheKey structure
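
Note on DATA-DB-003: the three invalidation rules can be expressed as a single check over the metadata file. The sketch below is illustrative only. The function name `should_redownload` is hypothetical, the metadata keys follow the `dbn_{hash}_meta.json` description, and "file corruption" is modelled narrowly as missing or unparseable metadata; validating the Parquet file itself is out of scope here.

```python
from datetime import date
from typing import Optional


def should_redownload(
    meta: Optional[dict],
    requested_params: dict,
    today: date,
    max_age_days: int = 30,
) -> bool:
    """Return True when the cached entry must be fetched again."""
    # Rule 3 (simplified): missing or malformed metadata counts as corruption.
    if meta is None:
        return True
    try:
        downloaded = date.fromisoformat(meta["download_date"])
        cached_params = meta["params"]
    except (KeyError, TypeError, ValueError):
        return True

    # Rule 1: entry is older than the configured maximum age.
    if (today - downloaded).days > max_age_days:
        return True

    # Rule 2: the request no longer matches what was cached.
    return cached_params != requested_params


params = {
    "dataset": "XNAS.BASIC",
    "symbol": "GLD",
    "schema": "ohlcv-1d",
    "start_date": "2020-03-01",
    "end_date": "2020-03-31",
}
meta = {"download_date": "2026-03-01", "params": params, "rows": 22}
print(should_redownload(meta, params, today=date(2026, 3, 28)))
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the age rule easy to unit test.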
@@ -0,0 +1,50 @@
id: DATA-DB-004
title: Backtest Page UI Updates
status: backlog
priority: high
dependencies:
  - DATA-DB-001
  - DATA-DB-002
estimated_effort: 2 days
created: 2026-03-28
updated: 2026-03-28

description: |
  Update backtest and event comparison pages to support Databento data source
  and independent scenario configuration. Show estimated data cost and cache
  status in the UI.

acceptance_criteria:
  - Data source selector shows Databento and yFinance options
  - Databento config shows dataset and resolution dropdowns
  - Dataset selection updates cost estimate display
  - Cache status shows age of cached data
  - Independent start price input (0 = auto-derive)
  - Independent underlying units and loan amount
  - Event comparison page uses same data source config
  - Settings persist across sessions

implementation_notes: |
  Page changes:

  Backtests page:
  - Add "Data Source" section with Databento/yFinance toggle
  - Add dataset selector (XNAS.BASIC for GLD, GLBX.MDP3 for GC=F)
  - Add resolution selector (ohlcv-1d, ohlcv-1h)
  - Show estimated cost with refresh button
  - Show cache status (age, size)
  - "Configure Scenario" section with independent start price/units

  Event comparison page:
  - Same data source configuration
  - Preset scenarios show if data cached
  - Cost estimate for missing data

  State management:
  - Use workspace-level BacktestSettings
  - Load on page mount, save on change
  - Invalidate cache when params change

dependencies_detail:
  - DATA-DB-001: Need DatabentoHistoricalPriceSource
  - DATA-DB-002: Need BacktestSettings model
48
docs/roadmap/backlog/DATA-DB-005-scenario-pre-seeding.yaml
Normal file
@@ -0,0 +1,48 @@
id: DATA-DB-005
title: Scenario Pre-Seeding from Bulk Downloads
status: backlog
priority: medium
dependencies:
  - DATA-DB-001
estimated_effort: 1-2 days
created: 2026-03-28
updated: 2026-03-28

description: |
  Create pre-configured scenario presets for gold hedging research and implement
  bulk download capability to pre-seed event comparison pages. This allows quick
  testing against historical events without per-event data fetching.

acceptance_criteria:
  - Default presets include COVID crash, rate hike cycle, gold rally events
  - Bulk download script fetches all preset data
  - Presets stored in config file (JSON/YAML)
  - Event comparison page shows preset data availability
  - One-click "Download All Presets" button
  - Progress indicator during bulk download

implementation_notes: |
  Default presets:
  - GLD March 2020 COVID Crash (extreme volatility)
  - GLD 2022 Rate Hike Cycle (full year)
  - GC=F 2024 Gold Rally (futures data)

  Bulk download flow:
  1. Create batch job for each preset
  2. Show progress per preset
  3. Store in cache directory
  4. Update preset availability status

  Preset format:
  - preset_id: unique identifier
  - display_name: human-readable name
  - symbol: GLD, GC, etc.
  - dataset: Databento dataset
  - window_start/end: date range
  - default_start_price: first close
  - default_templates: hedging strategies
  - event_type: crash, rally, rate_cycle
  - tags: for filtering

dependencies_detail:
  - DATA-DB-001: Needs cache infrastructure
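
Note on DATA-DB-005: one possible shape for the preset format, with a small filter over event type and tags. Field names follow the preset format list above. The class name ScenarioPreset, the helper `filter_presets`, the preset_id values, the exact window dates, and the tag values are all hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ScenarioPreset:
    preset_id: str
    display_name: str
    symbol: str
    dataset: str
    window_start: date
    window_end: date
    event_type: str                                # crash, rally, rate_cycle
    default_start_price: Optional[float] = None    # first close, once known
    default_templates: Tuple[str, ...] = ()
    tags: Tuple[str, ...] = ()


def filter_presets(
    presets: Iterable[ScenarioPreset],
    event_type: Optional[str] = None,
    tag: Optional[str] = None,
) -> List[ScenarioPreset]:
    """Keep presets matching the given event type and/or tag."""
    selected = []
    for preset in presets:
        if event_type is not None and preset.event_type != event_type:
            continue
        if tag is not None and tag not in preset.tags:
            continue
        selected.append(preset)
    return selected


# Illustrative entries only; identifiers and date windows are assumptions.
PRESETS = [
    ScenarioPreset(
        preset_id="gld-covid-crash-2020",
        display_name="GLD March 2020 COVID Crash",
        symbol="GLD",
        dataset="XNAS.BASIC",
        window_start=date(2020, 3, 1),
        window_end=date(2020, 3, 31),
        event_type="crash",
        tags=("gld", "volatility"),
    ),
    ScenarioPreset(
        preset_id="gld-rate-hikes-2022",
        display_name="GLD 2022 Rate Hike Cycle",
        symbol="GLD",
        dataset="XNAS.BASIC",
        window_start=date(2022, 1, 1),
        window_end=date(2022, 12, 31),
        event_type="rate_cycle",
        tags=("gld",),
    ),
]

print([p.preset_id for p in filter_presets(PRESETS, event_type="crash")])
```

A bulk download script could iterate over the same list, build one cache key per preset, and record availability after each fetch.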
@@ -0,0 +1,46 @@
id: DATA-DB-006
title: Databento Options Data Source
status: backlog
priority: low
dependencies:
  - DATA-DB-001
estimated_effort: 3-5 days
created: 2026-03-28
updated: 2026-03-28

description: |
  Implement historical options data source using Databento's OPRA.PILLAR dataset.
  This enables historical options chain lookups for accurate backtesting with
  real options prices, replacing synthetic Black-Scholes pricing.

acceptance_criteria:
  - DatabentoOptionSnapshotSource implements OptionSnapshotSource protocol
  - OPRA.PILLAR dataset used for GLD/SPY options
  - Option chain lookup by snapshot_date and symbol
  - Strike and expiry filtering supported
  - Cached per-date for efficiency
  - Fallback to synthetic pricing when data unavailable

implementation_notes: |
  OPRA.PILLAR provides consolidated options data from all US options exchanges.

  Key challenges:
  1. OPRA data volume is large - need efficient caching
  2. Option symbology differs from regular symbols
  3. Need strike/expiry resolution in symbology

  Implementation approach:
  - Use 'definition' schema to get instrument metadata
  - Use 'trades' or 'ohlcv-1d' for price history
  - Cache per (symbol, expiration, strike, option_type, date)
  - Use continuous contracts for futures options (GC=F)

  Symbology:
  - GLD options: Use underlying symbol "GLD" with OPRA
  - GC options: Use parent symbology "GC" for continuous contracts

  This is a future enhancement - not required for initial backtesting
  which uses synthetic Black-Scholes pricing.

dependencies_detail:
  - DATA-DB-001: Needs base cache infrastructure
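
Note on DATA-DB-006: the strike/expiry resolution challenge can be illustrated with the 21-character OCC (OSI) option symbol layout used for US listed equity options: a 6-character padded root, YYMMDD expiry, C or P, and the strike in thousandths as 8 digits. The helper names below are hypothetical, and whether the raw symbols delivered in OPRA.PILLAR match this layout exactly is an assumption that should be checked against Databento's documentation.

```python
from datetime import date, datetime
from typing import Tuple


def format_osi_symbol(
    root: str, expiration: date, option_type: str, strike: float
) -> str:
    """Build a 21-character OSI-style option symbol."""
    if option_type not in ("C", "P"):
        raise ValueError("option_type must be 'C' or 'P'")
    if not 1 <= len(root) <= 6:
        raise ValueError("root must be 1 to 6 characters")
    # Strike is stored in thousandths of a dollar, zero-padded to 8 digits.
    strike_thousandths = int(round(strike * 1000))
    return f"{root:<6}{expiration:%y%m%d}{option_type}{strike_thousandths:08d}"


def parse_osi_symbol(symbol: str) -> Tuple[str, date, str, float]:
    """Split an OSI-style symbol into (root, expiration, type, strike)."""
    if len(symbol) != 21:
        raise ValueError("OSI symbols are exactly 21 characters")
    root = symbol[:6].rstrip()
    expiration = datetime.strptime(symbol[6:12], "%y%m%d").date()
    option_type = symbol[12]
    if option_type not in ("C", "P"):
        raise ValueError("option type must be 'C' or 'P'")
    strike = int(symbol[13:]) / 1000
    return root, expiration, option_type, strike


symbol = format_osi_symbol("GLD", date(2024, 1, 19), "C", 190.0)
print(repr(symbol))
print(parse_osi_symbol(symbol))
```

The parsed tuple maps directly onto the (symbol, expiration, strike, option_type) part of the per-contract cache key listed in the implementation approach.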