fix: make Copilot capture attribution consistent by codeprakhar25 · Pull Request #16 · codeprakhar25/agentdiff

codeprakhar25 · 2026-05-07T12:33:43Z

Summary

Fixes #12. Copilot capture attribution was inconsistent due to three separate bugs:

Unstable session ID: vscode-${Date.now()} was evaluated on every captureFile() call, so each event appeared to come from a different session, breaking per-session grouping.
False positives from heuristic: MIN_COPILOT_CHANGE_LEN = 50 fires on any large document change — including edits from Claude Code, Cursor, Codex, or human copy-paste. These false positives silently polluted session.jsonl with non-Copilot events.
No confidence signal: There was no way for downstream tooling (or users) to distinguish reliable captures (the agentdiff.captureNow command) from heuristic guesses.
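
The first bug comes down to when the ID expression is evaluated. The sketch below is a Python analogue of the pattern, not the actual `extension.js` code: `now_ms`, `capture_file_unstable`, `capture_file_stable`, and the event shape are hypothetical names used only for illustration.

```python
import itertools

# Deterministic stand-in for Date.now(): advances on every read.
# (Assumption: purely illustrative, so the example does not depend on wall-clock time.)
_clock = itertools.count(1_000)


def now_ms() -> int:
    return next(_clock)


def capture_file_unstable(path: str) -> dict:
    # Bug pattern: the ID is recomputed on every call,
    # so each event looks like a new session.
    return {"file_path": path, "session_id": f"vscode-{now_ms()}"}


# Fix pattern: evaluate once at module load (the analogue of "per activation").
WINDOW_SESSION_ID = f"vscode-{now_ms()}"


def capture_file_stable(path: str) -> dict:
    # Every capture in this process shares the same session ID.
    return {"file_path": path, "session_id": WINDOW_SESSION_ID}
```

With the stable variant, grouping events by `session_id` yields one group per window rather than one group per capture.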

Changes

File	What changed
`extension.js`	Stable `WINDOW_SESSION_ID` per activation; `confidence` (`"high"`/`"low"`) and `capture_mode` fields on every event; doc comment listing reliable vs. unsupported modes
`capture-copilot.py`	Passes `confidence` + `capture_mode` through to session.jsonl; module docstring documents mode reliability
`prepare-ledger.py`	Reads copilot events (still excluded from attribution) and builds a `copilot_context` summary in the pending payload
`finalize-ledger.py`	Propagates `copilot_context` into trace metadata
`src/data.rs`	Adds `copilot_context: Option<serde_json::Value>` to `AgentdiffMetadata`
`src/commands/list.rs`	Shows `~cpl` in the TRUST column when low-confidence events exist
`src/commands/report.rs`	Emits a warning in the Review Context markdown section when heuristic capture is detected
`scripts/tests/test_capture_copilot.py` (new)	10 tests: all capture modes, tool mappings, path resolution, silent-exit behaviour
`scripts/tests/test_extension.js`	3 new tests: confidence fields, session ID stability

Test plan

python3 -m pytest scripts/tests/ -v — 30/30 pass
node --test scripts/tests/test_extension.js — 14/14 pass
cargo test — 35/35 pass

🤖 Generated with Claude Code

The VS Code extension was using Date.now() as a session ID (so every capture looked like a different session), and the heuristic threshold fired on any large edit, including edits from Claude Code, Cursor, and human copy-paste, producing silent false positives in session.jsonl.

Changes:
- extension.js: stable WINDOW_SESSION_ID per activation; confidence and capture_mode fields on every event ("high"/manual vs "low"/heuristic); doc comment explaining which capture modes are reliable vs. unsupported
- capture-copilot.py: passes confidence + capture_mode through to session.jsonl; documents mode reliability in module docstring
- prepare-ledger.py: reads copilot events (kept excluded from attribution) and builds a copilot_context summary (event count, low_confidence flag, files, lines) for the pending payload
- finalize-ledger.py: propagates copilot_context into trace metadata
- data.rs: adds copilot_context field to AgentdiffMetadata
- list.rs: shows "~cpl" in TRUST column when low-confidence events exist
- report.rs: emits a warning note in Review Context markdown section when low-confidence heuristic capture is detected
- test_capture_copilot.py (new): 10 tests covering all capture modes, tool mappings, path resolution, and silent-exit behaviour
- test_extension.js: 3 new tests for confidence fields and session ID stability

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
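
As a rough illustration of the `copilot_context` summary described for prepare-ledger.py, the sketch below aggregates Copilot events into an event count, a low-confidence flag, a file list, and a line count. The helper name `summarize_copilot_events`, the event keys `agent` and `timestamp`, and every summary key other than `low_confidence` are assumptions; the PR does not show the real schema, and "lines" is interpreted here as a total line count.

```python
from typing import Any


def summarize_copilot_events(events: list[dict[str, Any]], since: float) -> dict[str, Any]:
    """Summarise Copilot events newer than `since` (hypothetical helper)."""
    # Keep only Copilot events recorded after the cutoff.
    recent = [
        e for e in events
        if e.get("agent") == "copilot" and e.get("timestamp", 0) > since
    ]
    if not recent:
        # An empty dict lets the caller skip writing the field entirely.
        return {}
    return {
        "event_count": len(recent),
        # One heuristic capture is enough to flag the whole summary.
        "low_confidence": any(e.get("confidence") == "low" for e in recent),
        "files": sorted({e["file_path"] for e in recent if "file_path" in e}),
        "lines": sum(len(e.get("lines", [])) for e in recent),
    }
```

Returning an empty dict when nothing matches fits the guard described for finalize-ledger.py, which propagates the context only when it is a non-empty dict.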

greptile-apps · 2026-05-07T12:37:34Z

Greptile Summary

This PR fixes three bugs that made Copilot capture attribution unreliable: a per-call Date.now() session ID that fragmented events, a heuristic that silently fired on any large VS Code edit regardless of source, and no signal distinguishing reliable from heuristic captures.

Stable session ID: WINDOW_SESSION_ID is now a module-level constant evaluated once per activation, so all captures in a window share one session.
Confidence + capture mode fields: every event now carries confidence ("high"/"low") and capture_mode through the full pipeline: extension → capture-copilot.py → session.jsonl → pending_ledger.json → trace metadata.
Downstream surfacing: agentdiff list shows ~cpl in the TRUST column and agentdiff report emits a Review Context warning when low-confidence heuristic events are present at commit time.
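
The pass-through step in that pipeline might look roughly like the sketch below. The function name `build_session_event` and the fallback values are assumptions: the review only says that capture-copilot.py applies "safe defaults for older extension versions" without stating what they are. The `manual` and `inline_heuristic` mode names come from the test payload quoted later in this thread.

```python
import json


def build_session_event(payload: dict) -> dict:
    """Copy provenance fields from an extension payload into a session record."""
    return {
        "file_path": payload.get("file_path"),
        "session_id": payload.get("session_id"),
        # Older extension builds do not send these two fields. The fallbacks
        # below are assumptions, chosen so that a missing field is never
        # mistaken for a reliable capture.
        "confidence": payload.get("confidence", "low"),
        "capture_mode": payload.get("capture_mode", "unknown"),
    }


def to_jsonl_line(payload: dict) -> str:
    # One JSON object per line, as the session.jsonl name suggests.
    return json.dumps(build_session_event(payload), sort_keys=True)
```

Defaulting to the cautious value means a payload from an old extension is surfaced downstream as unverified rather than silently trusted.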

Confidence Score: 4/5

Safe to merge; one test has a logic gap that could cause a false failure in environments where AGENTDIFF_SESSION_LOG is set.

The new test_get_session_log_returns_none_when_not_initialized builds an env dict with AGENTDIFF_SESSION_LOG="" but never applies it to os.environ. Any CI runner that exports AGENTDIFF_SESSION_LOG with a non-empty value will see a non-None return from get_session_log and trip the assertIsNone assertion. All production paths look correct.

scripts/tests/test_capture_copilot.py — unused env variable in the first test

Important Files Changed

Filename	Overview
scripts/vscode-extension/extension.js	Introduces stable `WINDOW_SESSION_ID` (evaluated once per activation), adds `confidence`/`capture_mode` fields to all capture paths, and exports `_WINDOW_SESSION_ID` for test inspection.
scripts/capture-copilot.py	Passes `confidence`/`capture_mode` from payload through to session.jsonl with safe defaults for older extension versions; logic is correct.
scripts/prepare-ledger.py	Adds `read_copilot_context` which correctly filters by agent and timestamp; minor O(n²) deduplication in `files_seen`; overall logic is sound.
scripts/finalize-ledger.py	Propagates `copilot_context` into trace metadata only when it is a non-empty dict; guard is correct.
src/data.rs	Adds `copilot_context: Option<serde_json::Value>` with proper `skip_serializing_if`; straightforward schema extension.
src/commands/list.rs	Appends `~cpl` suffix to the TRUST column when `copilot_context.low_confidence` is true; column width accommodates all trust values.
src/commands/report.rs	Correctly scopes the Copilot warning to the current intent group by filtering on `group.trace_ids` before the `.any()` check.
scripts/tests/test_capture_copilot.py	Good coverage of capture modes and path resolution; first test constructs an `env` dict that is never applied to `os.environ`, leaving it vulnerable to a live `AGENTDIFF_SESSION_LOG` value.
scripts/tests/test_extension.js	Three new tests cover confidence fields on heuristic vs. manual captures and session ID stability; stubs are well-contained with proper cleanup.
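
The O(n²) remark on `files_seen` presumably refers to testing membership against a list while building it. A minimal sketch of the usual alternative, assuming insertion order should be preserved; `unique_in_order` is a hypothetical helper, not code from the PR:

```python
def unique_in_order(paths):
    """Return paths with duplicates removed, keeping first-seen order."""
    seen = set()   # O(1) average membership checks
    ordered = []   # preserves the order in which files first appeared
    for path in paths:
        if path not in seen:
            seen.add(path)
            ordered.append(path)
    return ordered
```

For the handful of files touched in a typical commit the difference is negligible, which is consistent with the reviewer calling it minor.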

Reviews (2): Last reviewed commit: "fix: scope Copilot warning to current in..."

greptile-apps · 2026-05-07T12:37:41Z

+// These limitations exist because VS Code does not expose a stable public API
+// that identifies the source of a document edit as Copilot vs. human vs. other
+// agent.  The VS Code team is tracking this at:
+//   https://github.com/microsoft/vscode/issues/XXXXX  (placeholder)


Placeholder GitHub issue URL

The URL https://github.com/microsoft/vscode/issues/XXXXX is a placeholder and should be replaced with the real tracking issue number before this ships, or removed if no canonical issue exists.

has_cpl_warning was iterating over the full traces slice, causing the warning to appear in every intent group when any single trace had low-confidence Copilot events. Filter to only traces belonging to the current group via group.trace_ids.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
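
The scoping fix can be illustrated with a small Python analogue of the Rust logic. The `Trace` and `Group` types and their fields are simplified assumptions for this sketch; only the names `has_cpl_warning` and `trace_ids` come from the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    id: str
    low_confidence: bool = False  # stands in for copilot_context.low_confidence


@dataclass
class Group:
    trace_ids: set = field(default_factory=set)


def has_cpl_warning_unscoped(traces: list) -> bool:
    # Bug pattern: any low-confidence trace anywhere flags every group.
    return any(t.low_confidence for t in traces)


def has_cpl_warning(group: Group, traces: list) -> bool:
    # Fix pattern: consider only traces that belong to this group.
    return any(t.low_confidence for t in traces if t.id in group.trace_ids)
```

Filtering before the `any()` check keeps the warning attached to the intent group that actually contains heuristic Copilot captures.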

github-actions · 2026-05-12T12:46:05Z

AgentDiff Report

Summary

Agent	Lines	%
Prakhar Khatri	739	100%

Review Context

Intent: unspecified (739 lines, 9 files)
- Agent/model: Prakhar Khatri / unknown

Files To Review First

File	Lines	Dominant Agent	Intent	Context
scripts/tests/test_capture_copilot.py	330	Prakhar Khatri	unspecified	trace 9a99913a
scripts/tests/test_extension.js	179	Prakhar Khatri	unspecified	trace 9a99913a
scripts/prepare-ledger.py	72	Prakhar Khatri	unspecified	trace 9a99913a
scripts/vscode-extension/extension.js	65	Prakhar Khatri	unspecified	trace 9a99913a
scripts/capture-copilot.py	38	Prakhar Khatri	unspecified	trace 9a99913a
src/commands/report.rs	30	Prakhar Khatri	unspecified	trace 9a99913a
src/commands/list.rs	15	Prakhar Khatri	unspecified	trace 9a99913a
src/data.rs	7	Prakhar Khatri	unspecified	trace 9a99913a
scripts/finalize-ledger.py	3	Prakhar Khatri	unspecified	trace 9a99913a

Trace details

Trace	Agent	Intent	Files	Lines
9a99913a	Prakhar Khatri	unspecified	scripts/capture-copilot.py, scripts/finalize-ledger.py, scripts/prepare-ledger.py, scripts/tests/test_capture_copilot.py, scripts/tests/test_extension.js, +4 more	727
6650bd62	Prakhar Khatri	unspecified	src/commands/report.rs	12

greptile-apps · 2026-05-12T12:50:12Z

+                    "event": event_name,
+                    "cwd": str(repo),
+                    "file_path": str(edited),
+                    "model": "copilot",
+                    "session_id": "s",
+                    "lines": [1],
+                    "confidence": "high" if event_name == "manual" else "low",
+                    "capture_mode": event_name if event_name == "manual" else "inline_heuristic",
+                }
+


Unused env dict — test is not actually isolated from environment

The env dict built on line 229 is never used; self.mod.get_session_log(str(repo)) reads os.environ directly. If AGENTDIFF_SESSION_LOG is set to a non-empty value in the process environment (e.g. by a CI runner), get_session_log follows the override path and returns a non-None path, causing the assertIsNone assertion to fail. Compare test_get_session_log_returns_path_when_initialized, which explicitly pops the variable before calling the function. The same guard is needed here.
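
One way to apply the suggested guard is to patch `os.environ` itself for the duration of the test. In the sketch below, `get_session_log` is a hypothetical stand-in that only models the override path described above; the real function in capture-copilot.py is not shown in this thread and certainly does more.

```python
import os
import unittest
from unittest import mock


def get_session_log(repo: str):
    """Hypothetical stand-in: honour AGENTDIFF_SESSION_LOG, else return None."""
    override = os.environ.get("AGENTDIFF_SESSION_LOG")
    if override:
        return override
    # Assumption: an uninitialised repo yields None, as the test name implies.
    return None


class GetSessionLogTest(unittest.TestCase):
    def test_returns_none_when_not_initialized(self):
        # patch.dict snapshots os.environ and restores it on exit, so the
        # pop below cannot leak into other tests or the parent process.
        with mock.patch.dict(os.environ):
            os.environ.pop("AGENTDIFF_SESSION_LOG", None)
            self.assertIsNone(get_session_log("/tmp/repo"))
```

Because the variable is removed inside the patched context, the test passes regardless of what the CI runner exports, and the runner's value is back in place afterwards.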

codeprakhar25 wants to merge 2 commits into main from feat/issue-12
