QuickCall instruments AI coding agents on real dev machines — every prompt, every tool call, every correction. It spots the patterns: where agents go wrong, what conventions they miss, which mistakes keep happening across sessions.
Blackbox is the engine that does the heavy lifting. Drop in session traces from Claude Code, Codex CLI, or pi.dev, and it runs them through a multi-stage LLM pipeline to pull out root causes, recurring failures, and anti-patterns. The output feeds back into QuickCall so future agent sessions start smarter.
flowchart TD
subgraph Upload
A["POST /analyze<br/>upload JSONL files"] --> B["Detect source<br/>_detect_source()"]
B --> C["Normalize<br/>_normalize_file()"]
end
C --> D["Return 202 Accepted<br/>run_id → background task"]
subgraph Pipeline
P0["P0 Normalize<br/>count + index messages"]
P1["P1 Classify<br/>LLM label each user turn<br/>batches run concurrently"]
P2["P2 Context<br/>build windows around triggers"]
P3["P3 Root-Cause<br/>LLM per trigger window"]
P4a["P4a Behavior<br/>rule type + confidence"]
P4b["P4b Cluster<br/>group recurring patterns"]
P4c["P4c Convention<br/>dont_do / do_instead"]
P5["P5 Aggregate<br/>deduplicate + score severity"]
P6["P6 Scope<br/>map to repos + devs"]
end
subgraph Client
POLL["GET /runs/:id<br/>poll status"]
OUT["GET /runs/:id/findings<br/>recurring findings JSON"]
end
P0 --> P1
P1 --> P2
P2 --> P3
P3 --> P4a
P3 --> P4b
P3 --> P4c
P4a --> P5
P4b --> P5
P4c --> P5
P5 --> P6
P6 --> POLL
POLL --> OUT
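The stage ordering in the flowchart can be read as a dependency graph. The following is a minimal sketch, not Blackbox's actual scheduler: it copies the edges from the diagram into a hypothetical `STAGE_DEPS` mapping and uses Python's standard `graphlib` to group stages into waves whose members do not depend on each other. Whether Blackbox really runs the three P4 stages in parallel is an assumption; the diagram only shows that all three follow P3 and feed P5.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping: each stage -> the stages it waits on.
# The edges are copied from the flowchart; the dict itself is illustrative.
STAGE_DEPS = {
    "p0_normalize": set(),
    "p1_classify": {"p0_normalize"},
    "p2_context": {"p1_classify"},
    "p3_rca": {"p2_context"},
    "p4a_behavior": {"p3_rca"},
    "p4b_cluster": {"p3_rca"},
    "p4c_convention": {"p3_rca"},
    "p5_aggregate": {"p4a_behavior", "p4b_cluster", "p4c_convention"},
    "p6_scope": {"p5_aggregate"},
}


def execution_waves(deps):
    """Group stages into waves; stages within a wave are mutually independent."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)                 # mark the whole wave as finished
    return waves


WAVES = execution_waves(STAGE_DEPS)
```

Under these assumptions the graph resolves into seven waves, with the three P4 stages sharing one wave between `p3_rca` and `p5_aggregate`.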
uv run uvicorn src.main:app --host 0.0.0.0 --port 8000

All responses are JSON.
POST /analyze accepts a multipart form upload of one or more JSONL files. It returns immediately with a run_id, and the analysis runs in the background.
Request:
curl -X POST http://localhost:8000/analyze \
-F "file=@session1.jsonl" \
-F "file=@session2.jsonl"

Response (202):
{
"run_id": "run_a3f7e2d1",
"status": "pending",
"message": "Analysis started"
}

The source is auto-detected from file content. To override detection, pass ?source=claude_code, ?source=codex_cli, or ?source=pi.
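To make the auto-detection concrete, here is a rough sketch based only on the markers listed in the Source / Detection table of this README. It is not the real `_detect_source()`: the function name `detect_source_sketch`, the choice to inspect only the first JSONL record, the requirement that all three Claude Code keys be present, and the substring check for a "rollout" filename are all assumptions made for illustration.

```python
import json
from typing import Optional


def detect_source_sketch(first_line: str, filename: str = "") -> Optional[str]:
    """Guess the trace source from the first JSONL record and the filename.

    Returns "claude_code", "codex_cli", "pi", or None when nothing matches.
    """
    try:
        record = json.loads(first_line)
    except json.JSONDecodeError:
        record = None
    if not isinstance(record, dict):
        record = {}
    record_type = record.get("type")

    # Claude Code: a "user" record that also carries uuid, sessionId and version.
    claude_keys = ("uuid", "sessionId", "version")
    if record_type == "user" and all(key in record for key in claude_keys):
        return "claude_code"

    # Codex CLI: a rollout-style filename (assumed substring match)
    # or a "conversation" record.
    if "rollout" in filename or record_type == "conversation":
        return "codex_cli"

    # pi.dev: "session" or "message" records.
    if record_type in ("session", "message"):
        return "pi"

    return None
```

When a heuristic like this returns nothing, the `?source=` override is the documented fallback.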
Poll GET /runs/{run_id} until status is "done".
Response:
{
"run_id": "run_a3f7e2d1",
"status": "done",
"created_at": "2026-06-02T18:05:31.106757",
"completed_at": "2026-06-02T18:05:37.927401",
"stages": {
"p0_normalize": {"status": "done", ...},
"p1_classify": {"status": "done", ...},
"p2_context": {"status": "done", ...},
"p3_rca": {"status": "done", ...},
"p4a_behavior": {"status": "done", ...},
"p4b_cluster": {"status": "done", ...},
"p4c_convention":{"status": "done", ...},
"p5_aggregate": {"status": "done", ...},
"p6_scope": {"status": "done", ...}
}
}

Each stage moves from pending to running and ends in either done or error.
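A client may want to condense the per-stage statuses into something easier to display or log. The helper below is an illustrative sketch: `summarize_stages` is a hypothetical name, it relies only on the `status` and `stages` keys shown in the response above, and it assumes a stage with no `status` field should count as pending.

```python
def summarize_stages(run: dict) -> dict:
    """Condense a run-status payload into counts and lists per stage state."""
    stages = run.get("stages", {})
    by_status = {"pending": [], "running": [], "done": [], "error": []}
    for name, info in stages.items():
        # Assumption: a missing status is treated as "pending".
        status = info.get("status", "pending")
        by_status.setdefault(status, []).append(name)
    return {
        "finished": run.get("status") == "done",
        "failed_stages": by_status["error"],
        "running_stages": by_status["running"],
        "done_count": len(by_status["done"]),
        "total": len(stages),
    }
```

For example, a payload where `p1_classify` is still running would report it under `running_stages`, and any stage that ended in `error` would appear under `failed_stages`.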
GET /runs/{run_id}/findings returns the findings that recur across two or more sessions.
curl http://localhost:8000/runs/run_a3f7e2d1/findings

Response:
[
{
"session_id": "sess_abc123",
"agents_md_rule": "Use specific error handling...",
"category": "missing_context",
"severity": 3,
"is_recurring": true,
"pattern_label": "error_handling",
...
}
]

Same structure as above, plus total_findings, severity_distribution, category_distribution, and filtered_findings (the recurring subset).
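As a rough idea of what those summary fields could contain, the sketch below derives them from a list of findings that use the `severity`, `category`, and `is_recurring` keys shown in the sample response. The exact shapes Blackbox returns are not documented here, so the count-per-value dictionaries and the hypothetical `summarize_findings` helper are assumptions.

```python
from collections import Counter


def summarize_findings(findings: list) -> dict:
    """Derive summary fields from a list of finding dicts (assumed shapes)."""
    # Assumption: the recurring subset is simply the findings flagged is_recurring.
    recurring = [f for f in findings if f.get("is_recurring")]
    return {
        "total_findings": len(findings),
        # Assumed shape: a mapping from severity value to number of findings.
        "severity_distribution": dict(Counter(f["severity"] for f in findings)),
        # Assumed shape: a mapping from category name to number of findings.
        "category_distribution": dict(Counter(f["category"] for f in findings)),
        "filtered_findings": recurring,
    }
```

A client could run the same computation locally on the output of the findings endpoint, for instance to chart severity over time.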
Access any pipeline stage directly:
- p0_normalize: normalized sessions with message counts
- p1_classify: per-session message classifications
- p5_aggregate: full findings + metadata
{"status": "ok", "model": "kimi-k2.6"}

| Stage | What it does |
|---|---|
| p0_normalize | Parse uploaded JSONL → unified message format |
| p1_classify | Label each user message (question, new_task, correction, etc.) |
| p2_context | Build context windows around trigger turns |
| p3_rca | LLM root-cause analysis on triggers |
| p4a_behavior | Classify findings by rule type |
| p4b_cluster | Group recurring findings into patterns |
| p4c_convention | Identify wrong_approach conventions |
| p5_aggregate | Deduplicate, score severity, filter recurring |
| p6_scope | Map findings to repos / developers |

| Source | Detection |
|---|---|
| Claude Code | JSONL with "type":"user", "uuid", "sessionId", "version" |
| Codex CLI | JSONL with rollout filename or "type":"conversation" |
| pi.dev | JSONL with "type":"session", "type":"message" |

OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.moonshot.ai/v1
MODEL=kimi-k2.6
CONCURRENCY=30

uv run pytest

- POST /analyze → run_id (immediate)
- Poll GET /runs/{run_id} until status: "done"
- Fetch GET /runs/{run_id}/findings for actionable recurring issues
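The three steps above can be sketched as a small client loop. This is an illustration under stated assumptions, not an official client: `run_analysis` is a hypothetical helper, the three callables stand in for whatever HTTP library you use to call the endpoints, and treating a run-level `"error"` status as fatal is a guess, since this README only shows `"pending"` and `"done"` at the run level.

```python
import time


def run_analysis(start, get_run, get_findings,
                 interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Start a run, poll until it is done, then return its findings.

    start()          -> dict from POST /analyze (must contain "run_id")
    get_run(run_id)  -> dict from GET /runs/{run_id}
    get_findings(id) -> parsed body of GET /runs/{run_id}/findings
    """
    run_id = start()["run_id"]
    deadline = clock() + timeout
    while True:
        run = get_run(run_id)
        if run["status"] == "done":
            return get_findings(run_id)
        # Assumption: a run-level "error" status means the run will not recover.
        if run["status"] == "error":
            raise RuntimeError(f"run {run_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} not done after {timeout} seconds")
        sleep(interval)  # wait before the next poll
```

Passing the transport in as callables keeps the loop independent of any particular HTTP library and makes it easy to exercise with fakes in tests.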
Apache 2.0 — see LICENSE.