Commit a3be745

chore: update odds-collect and pre-release workflows
Changed the checkout action version in the odds-collect workflow from v6 to v4 for compatibility. Updated the pre-release workflow to install dependencies, generate documentation, and run a documentation smoke check, strengthening verification before releases.
1 parent 4406c7a

151 files changed: 40,734 additions & 3 deletions


.github/workflows/odds-collect.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -25,7 +25,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Check out repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
 
       - name: Install uv
         run: |
```

.github/workflows/pre-release.yml

Lines changed: 10 additions & 1 deletion
```diff
@@ -10,7 +10,7 @@ on:
 
 jobs:
   pre-release:
-    name: 'Verify MCP server schema unchanged'
+    name: 'Pre-release checks'
     runs-on: ubuntu-latest
     steps:
       - name: Check out repository
@@ -24,5 +24,14 @@ jobs:
           cache: npm
           node-version-file: '.nvmrc'
 
+      - name: Install dependencies
+        run: npm ci
+
+      - name: Generate documents
+        run: npm run docs
+
+      - name: Run docs smoke check
+        run: npm run docs:lint
+
       - name: Verify server.json
         run: npm run verify-server-json-version
```

docs/CALIBRATION_EXPERT_GUIDE.md

Lines changed: 73 additions & 0 deletions
# Model Calibration Expert Guide

This document explains the methodology behind the context-aware calibration used in
`scripts/prediction/apply_context_aware_calibration.py`.

> Status: **draft** – calibration constants and procedures are expected to evolve as
> more validation data becomes available.

## Purpose

The goal of context-aware calibration is to correct systematic biases in the model's
totals predictions by conditioning adjustments on:

- **Scoring range** (low/mid/high)
- **Pace** (slow/moderate/fast)
- **Team quality** (elite defenses, large efficiency mismatches)

Rather than adding the same offset to every game, the script computes a
per-game adjustment based on these contextual signals.

See the docstring and comments in `scripts/prediction/apply_context_aware_calibration.py`
for the exact thresholds and constants currently in use.

## Implementation Overview

The calibration pipeline implemented in `apply_context_aware_calibration.py`
(a simplified sketch follows this list):

1. **Stores raw predictions**:
   - `predicted_total_raw`
   - `predicted_home_score_raw`
   - `predicted_away_score_raw`
2. **Computes a context adjustment** via `calculate_context_adjustment`:
   - Scoring band bias (low / mid / high totals)
   - Pace bias (if `avg_tempo` is present)
   - Elite defense bias (if `home_adj_d` / `away_adj_d` are present)
   - Mismatch bias (if `home_adj_em` / `away_adj_em` are present)
3. **Applies the adjustment**:
   - New `predicted_total = predicted_total_raw + calibration_adjustment`
   - Home/away scores are shifted in tandem while keeping the margin intact
4. **Logs aggregate behavior**:
   - Average adjustment
   - Adjustment range
   - Count of low/mid/high-scoring games

The script also records a human-readable `calibration_reasons` string that
summarizes which contextual rules fired for each game.
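As a rough illustration of steps 2 and 3, the sketch below shows how a per-game adjustment and a margin-preserving total shift could fit together. Every threshold and bias constant here is a hypothetical placeholder rather than a value taken from `apply_context_aware_calibration.py`, and only two of the four contextual signals are shown; see the script itself for the real rules.

```python
# Hypothetical sketch only: all band edges, cutoffs, and offsets below are
# placeholders, not values from apply_context_aware_calibration.py.

def calculate_context_adjustment(row: dict) -> tuple[float, list[str]]:
    """Return (adjustment, reasons) for one game from contextual signals."""
    adjustment, reasons = 0.0, []

    # Scoring band bias (placeholder band edges and offsets).
    raw_total = row["predicted_total_raw"]
    if raw_total < 130.0:
        adjustment += 2.0
        reasons.append("low-total band")
    elif raw_total > 155.0:
        adjustment -= 1.0
        reasons.append("high-total band")

    # Pace bias, applied only when tempo data is present.
    tempo = row.get("avg_tempo")
    if tempo is not None and tempo > 70.0:
        adjustment += 1.5
        reasons.append("fast pace")

    return adjustment, reasons


def apply_adjustment(row: dict) -> dict:
    """Shift the predicted total by the context adjustment, keeping the margin."""
    adjustment, reasons = calculate_context_adjustment(row)
    margin = row["predicted_home_score_raw"] - row["predicted_away_score_raw"]
    new_total = row["predicted_total_raw"] + adjustment

    row["predicted_total"] = new_total
    # Re-deriving both scores from the new total and the unchanged margin is
    # what keeps predicted_home_score - predicted_away_score exactly intact.
    row["predicted_home_score"] = (new_total + margin) / 2
    row["predicted_away_score"] = (new_total - margin) / 2
    row["calibration_adjustment"] = adjustment
    row["calibration_reasons"] = "; ".join(reasons) or "no adjustment"
    return row
```

The `(new_total ± margin) / 2` derivation is one straightforward way to honor the margin-intact property described in step 3.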
## When to Use This Script

Use `apply_context_aware_calibration.py` **after** you have generated base model
predictions and want to:

- Reduce systematic over/underprediction in certain game types.
- Preserve the underlying model ordering while nudging totals into
  better-calibrated ranges.

Example:

```bash
uv run python scripts/prediction/apply_context_aware_calibration.py \
  --input predictions/2026-02-08_fresh.csv \
  --output predictions/2026-02-08_context_calibrated.csv
```

## Future Work

- Periodically recompute calibration constants from out-of-sample data.
- Add automated reports that compare:
  - Raw vs calibrated Brier score / log-loss.
  - Calibration curves by scoring band and tempo band.
- Consider integrating calibration into the main training pipeline once its
  behavior is stable.

docs/MODEL_CALIBRATION_FINDINGS.md

Lines changed: 65 additions & 0 deletions
# Model Calibration Findings - 2026-02-07

This document summarizes the bias analysis that motivated the one-time
calibration fix implemented in `scripts/prediction/apply_calibration_fix.py`.

> Status: **snapshot** – describes the specific 2026-02-07 underprediction issue.
> Future calibration work should extend this document or be captured in a new
> dated section.

## Background

On 2026-02-07, evaluation of the totals model against actual game results
identified a **systematic underprediction** of final scores:

- The average miss vs final totals was approximately **+4.5 points** (actuals ran higher).
- The bias was consistent across a wide range of matchups and totals bands.

To avoid retraining the model mid-slate, a lightweight correction layer was
introduced that:

- Preserves the model's relative ordering between games.
- Applies a uniform upward shift to totals predictions.

## Fix Implemented

The script `scripts/prediction/apply_calibration_fix.py` (sketched after this list):

1. Stores the original predictions:
   - `predicted_total_raw`
   - `predicted_home_score_raw`
   - `predicted_away_score_raw`
2. Adds a **+4.5 point bias correction** to totals:
   - `predicted_total = predicted_total_raw + 4.5`
3. Redistributes the 4.5 points across the home and away scores so that:
   - The **margin** (`predicted_home_score - predicted_away_score`) is unchanged.
4. Optionally (when `--validate` is used) computes warning flags for games that
   diverge too far from:
   - KenPom-derived totals (`kenpom_total`)
   - Market totals (`market_total`)
   - Recent scoring averages (`recent_avg_total`)
Command-line usage:

```bash
uv run python scripts/prediction/apply_calibration_fix.py \
  --input predictions/2026-02-07_raw.csv \
  --output predictions/2026-02-07_calibrated.csv
```

## Interpretation

- This correction should be treated as a **temporary hotfix**, not a substitute
  for retraining with more data.
- The value **+4.5** is specific to the 2026-02-07 analysis window; future
  evaluation may justify a different value or a more nuanced, context-aware
  approach (see `docs/CALIBRATION_EXPERT_GUIDE.md`).

## Next Steps

- Regularly recompute the calibration bias on rolling windows.
- Prefer context-aware calibration (`apply_context_aware_calibration.py`) once
  its methodology is fully validated.
- Integrate calibration metrics (e.g., Brier score, calibration curves) into the
  standard model evaluation pipeline.
