Commit a3be745

chore: update odds-collect and pre-release workflows
Changed the checkout action version in the odds-collect workflow from v6 to v4 for compatibility. Updated the pre-release workflow to install dependencies, generate documentation, and run a documentation smoke check, strengthening verification before releases.
1 parent 4406c7a

151 files changed: 40,734 additions & 3 deletions


.github/workflows/odds-collect.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -25,7 +25,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Check out repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
 
       - name: Install uv
         run: |
```

.github/workflows/pre-release.yml

Lines changed: 10 additions & 1 deletion
```diff
@@ -10,7 +10,7 @@ on:
 
 jobs:
   pre-release:
-    name: 'Verify MCP server schema unchanged'
+    name: 'Pre-release checks'
     runs-on: ubuntu-latest
     steps:
       - name: Check out repository
@@ -24,5 +24,14 @@ jobs:
           cache: npm
           node-version-file: '.nvmrc'
 
+      - name: Install dependencies
+        run: npm ci
+
+      - name: Generate documents
+        run: npm run docs
+
+      - name: Run docs smoke check
+        run: npm run docs:lint
+
       - name: Verify server.json
         run: npm run verify-server-json-version
```

docs/CALIBRATION_EXPERT_GUIDE.md

Lines changed: 73 additions & 0 deletions
# Model Calibration Expert Guide

This document explains the methodology behind the context-aware calibration used in
`scripts/prediction/apply_context_aware_calibration.py`.

> Status: **draft** – calibration constants and procedures are expected to evolve as
> more validation data becomes available.

## Purpose

The goal of context-aware calibration is to correct systematic biases in the model's
totals predictions by conditioning adjustments on:

- **Scoring range** (low/mid/high)
- **Pace** (slow/moderate/fast)
- **Team quality** (elite defenses, large efficiency mismatches)

Rather than adding the same offset to every game, the script computes a
per-game adjustment based on these contextual signals.

See the docstring and comments in `scripts/prediction/apply_context_aware_calibration.py`
for the exact thresholds and constants currently in use.

## Implementation Overview

The calibration pipeline implemented in `apply_context_aware_calibration.py`
(a simplified sketch follows this list):

1. **Stores raw predictions**:
   - `predicted_total_raw`
   - `predicted_home_score_raw`
   - `predicted_away_score_raw`
2. **Computes a context adjustment** via `calculate_context_adjustment`:
   - Scoring band bias (low / mid / high totals)
   - Pace bias (if `avg_tempo` is present)
   - Elite defense bias (if `home_adj_d` / `away_adj_d` are present)
   - Mismatch bias (if `home_adj_em` / `away_adj_em` are present)
3. **Applies the adjustment**:
   - New `predicted_total = predicted_total_raw + calibration_adjustment`
   - Home/away scores are shifted in tandem while keeping the margin intact
4. **Logs aggregate behavior**:
   - Average adjustment
   - Adjustment range
   - Count of low/mid/high-scoring games

The script also records a human-readable `calibration_reasons` string that
summarizes which contextual rules fired for each game.
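As a rough illustration of steps 2 and 3, the sketch below shows how a per-game adjustment and a margin-preserving total shift could fit together. Every threshold and bias constant here is a hypothetical placeholder rather than a value taken from `apply_context_aware_calibration.py`, and only two of the four contextual signals are shown; see the script itself for the real rules.

```python
# Hypothetical sketch only: all band edges, cutoffs, and offsets below are
# placeholders, not values from apply_context_aware_calibration.py.

def calculate_context_adjustment(row: dict) -> tuple[float, list[str]]:
    """Return (adjustment, reasons) for one game from contextual signals."""
    adjustment, reasons = 0.0, []

    # Scoring band bias (placeholder band edges and offsets).
    raw_total = row["predicted_total_raw"]
    if raw_total < 130.0:
        adjustment += 2.0
        reasons.append("low-total band")
    elif raw_total > 155.0:
        adjustment -= 1.0
        reasons.append("high-total band")

    # Pace bias, applied only when tempo data is present.
    tempo = row.get("avg_tempo")
    if tempo is not None and tempo > 70.0:
        adjustment += 1.5
        reasons.append("fast pace")

    return adjustment, reasons


def apply_adjustment(row: dict) -> dict:
    """Shift the predicted total by the context adjustment, keeping the margin."""
    adjustment, reasons = calculate_context_adjustment(row)
    margin = row["predicted_home_score_raw"] - row["predicted_away_score_raw"]
    new_total = row["predicted_total_raw"] + adjustment

    row["predicted_total"] = new_total
    # Re-deriving both scores from the new total and the unchanged margin is
    # what keeps predicted_home_score - predicted_away_score exactly intact.
    row["predicted_home_score"] = (new_total + margin) / 2
    row["predicted_away_score"] = (new_total - margin) / 2
    row["calibration_adjustment"] = adjustment
    row["calibration_reasons"] = "; ".join(reasons) or "no adjustment"
    return row
```

The `(new_total ± margin) / 2` derivation is one straightforward way to honor the margin-intact property described in step 3.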
## When to Use This Script

Use `apply_context_aware_calibration.py` **after** you have generated base model
predictions and want to:

- Reduce systematic over/underprediction in certain game types.
- Preserve the underlying model ordering while nudging totals into
  better-calibrated ranges.

Example:

```bash
uv run python scripts/prediction/apply_context_aware_calibration.py \
  --input predictions/2026-02-08_fresh.csv \
  --output predictions/2026-02-08_context_calibrated.csv
```

## Future Work

- Periodically recompute calibration constants from out-of-sample data.
- Add automated reports that compare:
  - Raw vs calibrated Brier score / log-loss.
  - Calibration curves by scoring band and tempo band.
- Consider integrating calibration into the main training pipeline once its
  behavior is stable.

docs/MODEL_CALIBRATION_FINDINGS.md

Lines changed: 65 additions & 0 deletions
# Model Calibration Findings - 2026-02-07

This document summarizes the bias analysis that motivated the one-time
calibration fix implemented in `scripts/prediction/apply_calibration_fix.py`.

> Status: **snapshot** – describes the specific 2026-02-07 underprediction issue.
> Future calibration work should extend this document or be captured in a new
> dated section.

## Background

On 2026-02-07, evaluation of the totals model against actual game results
identified a **systematic underprediction** of final scores:

- The average miss vs final totals was approximately **+4.5 points** (actuals ran higher).
- The bias was consistent across a wide range of matchups and totals bands.

To avoid retraining the model mid-slate, a lightweight correction layer was
introduced that:

- Preserves the model's relative ordering between games.
- Applies a uniform upward shift to totals predictions.

## Fix Implemented

The script `scripts/prediction/apply_calibration_fix.py` (sketched after this list):

1. Stores the original predictions:
   - `predicted_total_raw`
   - `predicted_home_score_raw`
   - `predicted_away_score_raw`
2. Adds a **+4.5 point bias correction** to totals:
   - `predicted_total = predicted_total_raw + 4.5`
3. Redistributes the 4.5 points across the home and away scores so that:
   - The **margin** (`predicted_home_score - predicted_away_score`) is unchanged.
4. Optionally (when `--validate` is used) computes warning flags for games that
   diverge too far from:
   - KenPom-derived totals (`kenpom_total`)
   - Market totals (`market_total`)
   - Recent scoring averages (`recent_avg_total`)
Command-line usage:

```bash
uv run python scripts/prediction/apply_calibration_fix.py \
  --input predictions/2026-02-07_raw.csv \
  --output predictions/2026-02-07_calibrated.csv
```

## Interpretation

- This correction should be treated as a **temporary hotfix**, not a substitute
  for retraining with more data.
- The value **+4.5** is specific to the 2026-02-07 analysis window; future
  evaluation may justify a different value or a more nuanced, context-aware
  approach (see `docs/CALIBRATION_EXPERT_GUIDE.md`).

## Next Steps

- Regularly recompute the calibration bias on rolling windows.
- Prefer context-aware calibration (`apply_context_aware_calibration.py`) once
  its methodology is fully validated.
- Integrate calibration metrics (e.g., Brier score, calibration curves) into the
  standard model evaluation pipeline.
