TASK-PLAN — QU100 pattern audit + config system (WS1)

Rendered from docs/TASK-PLAN-qu100-pattern-audit-ws1-e7db.md — 2026-06-15. The .md is the source of truth; this file is the local render.

Goal

Build the measurement + config foundation for pattern tuning: a pattern-layer forward-return audit over one year of prices, plus a single versioned champion.yaml, without changing any live weight or detector behavior.

Workstream summary

WS | Goal | Effort | Risk | Decision
A  | Replay the live pattern-layer ranking over 1yr stock_prices; full-composite replay deferred to WS3 (needs as-of money-flow selector) | M | Med | YES
B  | Pattern forward-return audit → corpus (Parquet) + report | M | Low | YES
C  | champion.yaml config system + deep-merge loader + history/registry (byte-identical seed) | M | Low | YES

Problem

We tune pattern matching blind: the per-pattern confidence weights and the 3-layer composite (money-flow 0.25 / sector 0.10 / pattern 0.65) were never checked against outcomes. Live data is ~2 weeks — too thin (n<50). One year of daily OHLC sits in Postgres stock_prices; we never replayed the detector over it to learn which patterns work.

What's replayable over 1 year, and what isn't. The pattern layer is fully derivable from stock_prices OHLC (run the detector as-of each day). Money-flow history exists (money_flow_snapshots covers 2020→present: ~1,400 trading days, 276k rows; the 2026-06-04 backfill stamped 1,373), but the live _screen_money_flow is latest-only: there is no as-of selector, and the backfill stamped every day with one shared captured_at, so point-in-time money-flow rank needs a new as-of query (latest captured_at within data_date ≤ t). Building that selector is its own work. So:

stock_prices (1yr OHLC) ──► replay detector as-of t ──► pattern emissions + fwd returns
                                                         └─ per-pattern table (1yr)  ◄── headline
money_flow history exists (~1,400 days) BUT no as-of selector ──► full-composite replay = WS3 later
WS1 composite parity ──► recent dates only (live latest-snapshot path)
champion.yaml (weight sets + thresholds, seeded byte-identical)  ◄── the unit to tune later
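For orientation, the as-of selector deferred to WS3 could look roughly like the sketch below. It is not the planned implementation. It reads "latest captured_at within data_date ≤ t" as: per symbol, take the most recent data_date at or before t, then the latest captured_at within that date. The Snapshot shape, the rank field, and the function name as_of_money_flow are hypothetical; only data_date and captured_at come from the plan.

```python
from datetime import date, datetime
from typing import Dict, Iterable, NamedTuple


class Snapshot(NamedTuple):
    # Hypothetical row shape; only data_date and captured_at are named in the plan.
    symbol: str
    data_date: date
    captured_at: datetime
    rank: int


def as_of_money_flow(rows: Iterable[Snapshot], t: date) -> Dict[str, Snapshot]:
    """Per symbol, pick the row with the greatest (data_date, captured_at)
    among rows whose data_date is on or before t (assumed interpretation)."""
    best: Dict[str, Snapshot] = {}
    for row in rows:
        if row.data_date > t:
            continue  # never look past the as-of date
        current = best.get(row.symbol)
        if current is None or (row.data_date, row.captured_at) > (
            current.data_date,
            current.captured_at,
        ):
            best[row.symbol] = row
    return best
```

Keying on data_date first matters here because the backfill shares one captured_at across all days, so captured_at alone cannot order the backfilled rows.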

Success criteria

  1. One command produces a per-pattern table (n, win-rate, mean/median fwd return at 5/10/20d, by regime incl. an unknown cell) over the 1yr window, reproducible.
  2. The pattern-layer replay matches live consumption: the same detect_patterns → _filter_actionable → best_pattern chain over the live ~6-month lookback. A parity test asserts the full ranking (ordering + best_pattern + composite score, not just symbol set) against screen_stocks. Because the composite needs Layer-1 money-flow, that score assertion runs on recent dates only (live latest-snapshot path); the pattern-layer mechanism itself is exercised over the 1yr corpus.
  3. champion.yaml seeded from today's effective config produces a byte-identical ranking on a fixture (behavior-preserving), proven by test.
  4. The audit report names ≥1 concrete tuning hypothesis per later axis (WS2/WS3/WS4).
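As a rough illustration of the per-pattern table in criterion 1, the sketch below aggregates forward returns per (pattern, regime, horizon) cell. It makes several assumptions the plan does not state: entry is at the close of the emission day, a "win" is a strictly positive forward return, and emissions arrive as (symbol, day_index, pattern, regime) tuples. The function names and the pattern name used in the usage note are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

Emission = Tuple[str, int, str, Optional[str]]  # (symbol, day_index, pattern, regime)


def forward_return(closes: Sequence[float], i: int, horizon: int) -> Optional[float]:
    """Close-to-close return from day i to day i + horizon, or None if out of range."""
    j = i + horizon
    if j >= len(closes):
        return None
    return closes[j] / closes[i] - 1.0


def audit(
    emissions: Iterable[Emission],
    closes_by_symbol: Dict[str, Sequence[float]],
    horizons: Sequence[int] = (5, 10, 20),
) -> Dict[Tuple[str, str, int], Dict[str, float]]:
    """Aggregate n, win-rate, mean and median forward return per cell."""
    buckets: Dict[Tuple[str, str, int], List[float]] = defaultdict(list)
    for symbol, i, pattern, regime in emissions:
        closes = closes_by_symbol[symbol]
        for h in horizons:
            r = forward_return(closes, i, h)
            if r is not None:
                # Emissions with no regime label land in the "unknown" cell.
                buckets[(pattern, regime or "unknown", h)].append(r)
    return {
        key: {
            "n": len(vals),
            "win_rate": sum(v > 0 for v in vals) / len(vals),
            "mean": mean(vals),
            "median": median(vals),
        }
        for key, vals in buckets.items()
    }
```

For example, audit([("AAA", 0, "hammer", None)], {"AAA": closes}) would fill the ("hammer", "unknown", h) cells for each horizon that fits inside closes. Emissions too close to the end of the window are dropped for that horizon rather than padded.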

Deliverables

Execution order

  1. WS C first — land the champion.yaml deep-merge loader seeded byte-identical (behavior-preserving). Lowest risk, unblocks the rest.
  2. WS A — the pattern-layer replay (reads stock_prices; uses the config object).
  3. WS B — audit + report on top of A's corpus.
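The deep-merge step in WS C can be sketched as follows, assuming the YAML has already been parsed into nested dicts (parsing is omitted to keep the example stdlib-only). The composite weights and the ~6-month lookback come from this plan; the key names (composite_weights, lookback_months) and the function name deep_merge are hypothetical, and the real loader may treat lists or nulls differently.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict: override wins on leaves, nested dicts merge recursively.
    Neither input is mutated."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults mirroring today's effective config as described in the plan.
DEFAULTS: Dict[str, Any] = {
    "composite_weights": {"money_flow": 0.25, "sector": 0.10, "pattern": 0.65},
    "lookback_months": 6,
}

# An empty override must reproduce the defaults exactly (behavior-preserving seed).
seeded = deep_merge(DEFAULTS, {})

# A partial override changes one leaf and leaves its siblings untouched.
candidate = deep_merge(DEFAULTS, {"composite_weights": {"pattern": 0.60}})
```

Merging into a deep copy keeps the defaults immutable, which makes the "empty override equals defaults" property easy to assert in the byte-identical seed test.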

Work breakdown

Dependencies

Blockers (STOP and escalate, do not push)

Acceptance criteria

Risks

Non-goals

Validation


Implementation Notes (for engineers)