Commit 38d515b

chore(deps): update puppeteer to version 24.37.2
Bumps the puppeteer dependency from version 24.36.1 to 24.37.2, incorporating various fixes and improvements, including enhanced ConsoleMessage text() results and documentation updates for the signal option in Page waitFor methods.
1 parent a23d625 commit 38d515b

35 files changed

Lines changed: 3147 additions & 0 deletions

.claude/CLAUDE.md

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# Odds ML Orchestration (Claude Code Team)

This repository is used to maintain an orchestrated pipeline for **sports odds statistical research and ML predictions** (spreads, moneylines, totals) with a strict **rolling 5-day freshness window**.

## Non-negotiables

- **Freshness SLO**: all datasets used for training and prediction must be derived from the last **5 days** of collected data. If inputs are stale, the pipeline must fail fast and trigger backfill.
- **Canonical markets**:
  - **Spreads**: store **one canonical value per game** (favorite perspective OR `spread_magnitude` + `favorite_team`). Never average ±spread rows.
  - **Totals**: store **one canonical total** per game (not separate over/under rows).
  - **Moneylines**: convert American odds to **implied probability** for aggregation and ML.

See the project rules in `./.claude/rules/` for details.

## Repeatable Agent Team Template (copy/paste prompt)

Create an agent team for odds pipeline maintenance with these teammates and responsibilities. Require plan approval before implementing any schema or workflow changes. Put the lead into delegate mode after spawning.

- **TeamLead (delegate mode)**: coordination only; creates tasks, assigns owners, synthesizes results.
- **CollectorEngineer**: web scraping + API collectors, rate limits, idempotency, retries/backfills.
- **NormalizationSteward**: canonicalization of spreads/totals/moneylines; dedupe; invariants/tests.
- **DataFreshnessSRE**: rolling 5-day window enforcement; staleness detection; alerting/escalation.
- **MLTrainerEngineer**: feature views; training; evaluation; prediction artifacts.
- **CostQuotaAnalyst (optional)**: API credit/usage budgeting; schedule optimization.

Approval criteria for TeamLead:

- Reject any plan that changes market sign conventions.
- Reject any plan that allows stale inputs to silently pass.
- Reject any plan that introduces non-idempotent collectors.

## Operational contract (GitHub Actions)

GitHub Actions runs the scheduled pipeline. The code must support:

- **Collect → Normalize/Validate → Train → Predict/Report → Freshness Guard**
- Bounded backfill on staleness (5-day lookback).

.claude/hooks/task_gate.py

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
from __future__ import annotations

import json
import os
import subprocess
import sys


def _run(cmd: list[str]) -> subprocess.CompletedProcess:
    return subprocess.run(cmd, capture_output=True, text=True)


def main() -> int:
    # Read hook JSON input (best-effort; TaskCompleted/TeammateIdle always fire).
    try:
        _ = json.loads(sys.stdin.read() or "{}")
    except Exception:
        pass

    # Only run gates if the odds pipeline package is importable in this environment.
    probe = _run([sys.executable, "-c", "import odds_pipeline"])
    if probe.returncode != 0:
        print("odds_pipeline not installed; skipping odds pipeline gates", file=sys.stderr)
        return 0

    v = _run([sys.executable, "-m", "odds_pipeline", "validate"])
    if v.returncode != 0:
        print("odds pipeline validation failed", file=sys.stderr)
        print(v.stdout, file=sys.stderr)
        print(v.stderr, file=sys.stderr)
        return 2

    # Only enforce freshness if DATABASE_URL is present (so local doc work isn't blocked).
    if os.getenv("DATABASE_URL"):
        f = _run([sys.executable, "-m", "odds_pipeline", "freshness-guard", "--window-days", "5"])
        if f.returncode != 0:
            print("freshness guard failed", file=sys.stderr)
            print(f.stdout, file=sys.stderr)
            print(f.stderr, file=sys.stderr)
            return 2

    return 0


if __name__ == "__main__":
    raise SystemExit(main())
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
---
paths:
  - "odds/**"
  - ".github/workflows/odds-*.yml"
---

# Data freshness + rolling window rules (5 days)

This project’s ML outputs are only valid if inputs are **fresh** and the training/inference data is bounded to a **rolling 5-day window**.

## Freshness SLO

Fail the pipeline if any required input stream is stale.

Recommended defaults (tune per sport/market cadence):

- **Odds snapshots**: stale if `max(collected_at)` is older than **180 minutes**
- **Scores/finals**: stale if `max(collected_at)` is older than **24 hours**

## Rolling 5-day window

All downstream datasets (features, training rows, prediction features) must be computable from the last **5 days** of canonical + raw inputs.

Implementation requirements:

- Every compute job must accept `--window-days 5` (default 5).
- Normalization must support backfill with an explicit `--lookback-days 5`.
- Any retention/pruning job must never delete within the active 5-day window.

## Backfill on staleness

If freshness checks fail:

- run a bounded backfill (lookback 5 days)
- re-run normalization + validation
- re-check freshness before training/predicting
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
---
paths:
  - "odds/**"
---

# ML training + prediction contracts

This project is designed so that models can be trained and evaluated deterministically from canonical data in the last 5 days.

## Requirements

- Training jobs must:
  - log the dataset time window used
  - record the training timestamp and a model version identifier
  - output evaluation metrics (at minimum: calibration/accuracy proxies appropriate to the target)
- Prediction jobs must:
  - refuse to run if freshness checks fail
  - attach the model version + data window to every prediction artifact

## Targets

- **Spreads**: predict cover probability from the team perspective (requires consistent sign conventions).
- **Moneylines**: predict win probability (compare to implied probabilities for edge).
- **Totals**: predict over probability relative to the canonical total.
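The "attach model version + data window" requirement can be sketched as a provenance-carrying record. All field names here are hypothetical; the repo's actual artifact schema is not shown in this commit.

```python
from dataclasses import dataclass, asdict
from datetime import date

# Hypothetical artifact wrapper: every prediction carries its provenance,
# so a stale or mis-windowed model can be traced from the output alone.
@dataclass(frozen=True)
class PredictionArtifact:
    event_id: str
    market: str            # "spreads" | "h2h" | "totals"
    probability: float     # model output for the canonical side
    model_version: str
    window_start: date     # first day of the 5-day training window
    window_end: date       # last day of the 5-day training window

art = PredictionArtifact(
    event_id="evt-123", market="totals", probability=0.57,
    model_version="totals-2024-01-10T06:00Z",
    window_start=date(2024, 1, 5), window_end=date(2024, 1, 10),
)
print(asdict(art)["model_version"])  # totals-2024-01-10T06:00Z
```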
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
---
paths:
  - "odds/**"
---

# Odds normalization rules (canonical markets)

These rules prevent sign-convention errors and ensure math is consistent across collectors, analytics, and ML training.

## Key distinction

**Favorite/underdog is determined by the spread sign; home/away is venue and independent.** Do not conflate them.

## Spreads (one canonical value per game)

Sportsbooks/APIs often return **two outcomes per event** with opposite signs (e.g., -7 and +7). Those represent the *same* market.

Store exactly **one canonical record per event/book/collected_at** using either:

- **Option A (allowed)**: store the **favorite spread** (always negative or 0).
- **Option B (preferred)**: store `spread_magnitude` (always positive) and explicit `favorite_team`/`underdog_team`.

Never average raw `point` values that include both + and -.
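Option B can be sketched as a small canonicalizer that collapses the two mirrored outcome rows into one record. The outcome-row shape (`team`/`point` keys) is an assumption for illustration, not the repo's actual schema.

```python
# Hypothetical canonicalizer: collapse the two ±spread outcome rows a book
# returns into one record (Option B: magnitude + explicit favorite).
def canonical_spread(outcomes: list[dict]) -> dict:
    if len(outcomes) != 2:
        raise ValueError("expected exactly two spread outcomes")
    fav = min(outcomes, key=lambda o: o["point"])   # negative point = favorite
    dog = max(outcomes, key=lambda o: o["point"])
    if fav["point"] != -dog["point"]:
        raise ValueError("spread outcomes are not mirrored")
    return {
        "spread_magnitude": abs(fav["point"]),
        "favorite_team": fav["team"],
        "underdog_team": dog["team"],
    }

rows = [{"team": "Duke", "point": -7.0}, {"team": "UNC", "point": +7.0}]
print(canonical_spread(rows))
# {'spread_magnitude': 7.0, 'favorite_team': 'Duke', 'underdog_team': 'UNC'}
```

Because the record stores a positive magnitude plus an explicit favorite, naive averaging across books can never mix +7 and -7 rows into a meaningless 0.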
## Totals (one canonical value per game)

Over/Under are two prices on the same number. Store one `total` value plus `over_price`/`under_price`.

## Moneylines (use implied probability for math)

American odds must be converted to implied probability before any aggregation or modeling.

For American odds \(o\):

- If \(o < 0\): \(p = |o| / (|o| + 100)\)
- If \(o > 0\): \(p = 100 / (o + 100)\)

Never average American odds directly.

## Line movement convention (favorite perspective)

If tracking spread movement, compute deltas from the **favorite’s spread** (negative). This avoids mixing perspectives.
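The two conversion branches above translate directly to code. This is a sketch (the repo's actual helper is not shown), and it deliberately does no vig removal:

```python
# Convert American odds to implied probability (no vig removal).
def implied_probability(odds: int) -> float:
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    if odds > 0:
        return 100 / (odds + 100)
    raise ValueError("American odds of 0 are undefined")

print(round(implied_probability(-150), 3))  # 0.6   (favorite)
print(round(implied_probability(+130), 3))  # 0.435 (underdog)
```

Any cross-book aggregation should average these probabilities, never the raw -150/+130 values, per the rule above.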

.claude/rules/storage-schema.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
---
paths:
  - "odds/**"
---

# Storage and schema contract

GitHub Actions runners are ephemeral. **All pipeline state must live in persistent storage.**

## Required environment variables

- `DATABASE_URL`: Postgres connection string for the persistent store.
- `ODDS_API_KEY`: The Odds API key (or equivalent) for collectors.

## Schema principles

- **Raw tables**: append-only snapshots; never mutated in place.
- **Canonical tables**: derived from raw via normalization; can be re-derived deterministically.
- **Idempotency**: collectors must not create duplicates for the same `(source, event_id, market, bookmaker, collected_at)` tuple.
- **Time zone**: store timestamps in UTC and only convert for presentation.

## Market canonicalization

Canonical tables must follow the rules in `odds-normalization.md`.
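The idempotency principle is typically enforced with a unique index over the dedupe tuple plus a conflict-ignoring insert. A sketch using SQLite for portability (the production store is Postgres, where `ON CONFLICT DO NOTHING` behaves the same way); table and column names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE raw_odds (
        source TEXT, event_id TEXT, market TEXT,
        bookmaker TEXT, collected_at TEXT, payload TEXT,
        UNIQUE (source, event_id, market, bookmaker, collected_at)
    )
""")

row = ("the-odds-api", "evt-123", "spreads", "draftkings",
       "2024-01-10T12:00:00Z", "{}")

# Re-running a collector replays the same snapshot; the conflict clause
# turns the second insert into a no-op instead of a duplicate row.
for _ in range(2):
    conn.execute(
        "INSERT INTO raw_odds VALUES (?, ?, ?, ?, ?, ?) "
        "ON CONFLICT DO NOTHING", row)

count = conn.execute("SELECT COUNT(*) FROM raw_odds").fetchone()[0]
print(count)  # 1
```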

.claude/settings.json

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  },
  "teammateMode": "in-process",
  "hooks": {
    "TaskCompleted": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python \"$CLAUDE_PROJECT_DIR/.claude/hooks/task_gate.py\""
          }
        ]
      }
    ],
    "TeammateIdle": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python \"$CLAUDE_PROJECT_DIR/.claude/hooks/task_gate.py\""
          }
        ]
      }
    ]
  }
}
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
---
name: betting-data-normalizing
description: Mandatory normalization rules for spreads, totals, and moneylines. Use for ANY sports betting analytics or ML work in this repo.
---

# Betting Data Normalizing (repo standard)

## Spreads

- APIs/books often return two outcomes per game with opposite signs (e.g., -7 and +7). They represent the **same** spread.
- Store **one canonical record per game**.
- Recommended representation:
  - `spread_magnitude`: always positive
  - `favorite_team` and `underdog_team`
  - prices for each side

Never average raw point values that mix negative and positive spreads.

## Totals

- Store **one total** per game plus `over_price` and `under_price`.
- Do not store separate Over/Under rows as separate totals.

## Moneylines

- Convert American odds to implied probability before doing math.

If `odds < 0`:
`p = abs(odds) / (abs(odds) + 100)`

If `odds > 0`:
`p = 100 / (odds + 100)`

Never average American odds directly.

## Movement convention

Track spread movement from the **favorite’s perspective** (negative spread). This avoids mixing perspectives between teams.
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
---
name: odds-collecting
description: Collect scores and odds in a rolling window with retries, deduplication, and freshness guarantees.
---

# Odds Collecting (repo standard)

## Goals

- Keep data fresh for a rolling **5-day** ML window.
- Make collectors **idempotent** and safe to rerun.
- Track costs/quotas and avoid redundant polling.

## Collector requirements

- Always accept explicit arguments:
  - `--lookback-days` (default 5)
  - `--sport` (e.g., `basketball_ncaab`)
  - `--regions` and `--markets` when applicable
- Always write timestamps in UTC (`collected_at`).
- Use `event_id` (or equivalent) as the primary dedupe key.
- Handle rate limits with exponential backoff.

## Freshness

- Provide a `freshness_guard` command that fails when data is stale.
- On staleness, run bounded backfill (lookback 5 days), then re-normalize.
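The exponential-backoff requirement can be sketched as a retry wrapper around the API call. The jitterless doubling, the cap, and the `RuntimeError` stand-in for an HTTP 429 are illustrative defaults, not repo-mandated values:

```python
import time

# Hypothetical retry loop: exponential backoff for rate-limited calls.
def with_backoff(fn, max_attempts: int = 5, base_delay: float = 1.0,
                 max_delay: float = 60.0, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return fn()
        except RuntimeError:                    # stand-in for an HTTP 429
            if attempt == max_attempts - 1:
                raise
            sleep(min(base_delay * 2 ** attempt, max_delay))

# Simulate two 429s followed by success, capturing the sleep schedule
# instead of actually sleeping.
calls, delays = {"n": 0}, []
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = with_backoff(flaky, sleep=delays.append)
print(result, delays)  # ok [1.0, 2.0]
```

Injecting `sleep` keeps the wrapper testable; production code would also add jitter so many runners don't retry in lockstep.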

.github/workflows/odds-collect.yml

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
name: Odds pipeline - collect odds + normalize

permissions: read-all

on:
  workflow_dispatch:
    inputs:
      sport:
        description: Sport key (e.g. basketball_ncaab)
        required: false
        default: basketball_ncaab
      regions:
        description: Regions (comma-separated)
        required: false
        default: us
      markets:
        description: Markets (comma-separated)
        required: false
        default: h2h,spreads,totals
  schedule:
    - cron: "*/15 * * * *"

jobs:
  collect-odds:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Set up uv
        uses: astral-sh/setup-uv@v3

      - name: Init schema (idempotent)
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        working-directory: odds
        run: uv run python -m odds_pipeline.schema

      - name: Collect odds snapshots
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
          ODDS_API_KEY: ${{ secrets.ODDS_API_KEY }}
        working-directory: odds
        run: >
          uv run python -m odds_pipeline collect-odds
          --sport "${{ inputs.sport || 'basketball_ncaab' }}"
          --regions "${{ inputs.regions || 'us' }}"
          --markets "${{ inputs.markets || 'h2h,spreads,totals' }}"

      - name: Normalize (rolling window)
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        working-directory: odds
        run: uv run python -m odds_pipeline normalize --window-days 5

      - name: Validate invariants (fast)
        working-directory: odds
        run: uv run python -m odds_pipeline validate
