[feat] Restructure the trace/telemetry interface by semantic role#4910

Merged
mmabrouk merged 1 commit into big-agents from feat/trace-telemetry-restructure
Jun 28, 2026
Conversation

@mmabrouk

Member

Context

The /run request carries a trace block on every agent run. It mixed four different roles under one bucket — the classic interface-design failure the design-interfaces skill is built around:

"trace": {
  "traceparent": "...",     // per-call propagation CONTEXT
  "baggage": null,          // per-call propagation CONTEXT
  "endpoint": "...",        // exporter CONFIG (the destination)
  "authorization": "...",   // exporter CREDENTIAL
  "captureContent": true    // capture POLICY
}

Four roles, one feature label. This is the deferred "Fork A" from #4892 — the proposed shape was posted on that thread and Mahmoud greenlit it ("yes please create another pr to target this"). This PR is shape-only: behavior is identical.

What changed

Split the block by semantic role into context (per-call protocol context) and telemetry (operator-owned config + policy + credential):

"context": {
  "propagation": { "traceparent": "...", "baggage": null }
},
"telemetry": {
  "capture":   { "content": { "enabled": true } },
  "exporters": { "otlp": { "endpoint": "...", "headers": { "authorization": "..." } } }
}

Both keys come from one service-side capture (trace_context()); both are null when the run has no trace context (the prior single trace: null). The keys are updated in lockstep across the wire and both OTLP readers:

  • Wire (TS): protocol.ts (TraceContext replaced by Propagation / RequestContext / Telemetry / OtlpExporter; AgentRunRequest.trace → context + telemetry).
  • Wire (Python): dtos.py (TraceContext.context_to_wire() / telemetry_to_wire()), utils/wire.py (request_to_wire), wire_models.py (WirePropagation / WireRequestContext / WireCapture / WireOtlpExporter / WireExporters / WireTelemetry).
  • The two live OTLP readers: engines/sandbox_agent.ts (the tracer init) and engines/sandbox_agent/pi-assets.ts (the Pi extension OTLP env).
  • Golden + contract tests: both golden fixtures, both wire-contract tests, and the DTO unit test, with explicit role-shape assertions.
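
A minimal Python sketch of the role-split serialization described above. The method names context_to_wire() and telemetry_to_wire() come from this PR; the dataclass fields, their defaults, and the trace_fields_to_wire() helper are assumptions made for illustration, not the SDK's actual code.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TraceContext:
    """Hypothetical stand-in for the SDK DTO; field names are assumed."""

    traceparent: str
    baggage: Optional[str] = None
    endpoint: Optional[str] = None
    authorization: Optional[str] = None
    capture_content: bool = True

    def context_to_wire(self) -> Dict[str, Any]:
        # Per-call propagation context: W3C names kept verbatim.
        return {
            "propagation": {
                "traceparent": self.traceparent,
                "baggage": self.baggage,
            }
        }

    def telemetry_to_wire(self) -> Dict[str, Any]:
        # Operator-owned policy, exporter config, and credential.
        return {
            "capture": {"content": {"enabled": self.capture_content}},
            "exporters": {
                "otlp": {
                    "endpoint": self.endpoint,
                    "headers": {"authorization": self.authorization},
                }
            },
        }


def trace_fields_to_wire(trace: Optional[TraceContext]) -> Dict[str, Any]:
    """Hypothetical helper: emit both keys from one object, or both as null."""
    if trace is None:
        return {"context": None, "telemetry": None}
    return {
        "context": trace.context_to_wire(),
        "telemetry": trace.telemetry_to_wire(),
    }
```

One object still feeds both keys, which is what keeps them null or populated together.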

The internal tracer (tracing/otel.ts) and its captureContent field are not touched — that is downstream of the readers and not part of the wire.

CodeRabbit follow-up (from #4892)

run_context() in services/oss/src/agent/tracing.py treated workflow and trace capture as one failure domain — a failure reading the workflow references would drop an otherwise-valid runContext.trace. They are now captured as independent failure domains, so a trace-only run still ships runContext.trace. Added test_tracing.py covering both directions.
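
The independent-failure-domain pattern can be sketched as below. This is not the code in services/oss/src/agent/tracing.py: the RunContext fields, the injected reader callables, and returning None when both captures fail are assumptions made for illustration.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, Optional

log = logging.getLogger(__name__)


@dataclass
class RunContext:
    """Hypothetical shape: one slot per identity."""

    workflow: Optional[Any] = None
    trace: Optional[Any] = None


def _capture(label: str, read: Callable[[], Any]) -> Optional[Any]:
    # Each capture gets its own try/except, so one failure cannot
    # discard the result of the other.
    try:
        return read()
    except Exception:
        log.warning("run_context: could not capture %s", label, exc_info=True)
        return None


def run_context(
    read_workflow: Callable[[], Any],
    read_trace: Callable[[], Any],
) -> Optional[RunContext]:
    workflow = _capture("workflow", read_workflow)
    trace = _capture("trace", read_trace)
    if workflow is None and trace is None:
        return None  # assumption: nothing to ship when both captures fail
    return RunContext(workflow=workflow, trace=trace)
```

With this structure a reader that raises on the workflow side still yields a RunContext whose trace slot is populated, and vice versa.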

Interface reference (old → new)

| Role | Old (trace.*) | New |
|---|---|---|
| per-call propagation context | traceparent, baggage | context.propagation.{traceparent,baggage} |
| capture policy | captureContent | telemetry.capture.content.enabled |
| exporter config (destination) | endpoint | telemetry.exporters.otlp.endpoint |
| exporter credential | authorization | telemetry.exporters.otlp.headers.authorization |
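
The old → new mapping above can be read as a pure function over the legacy block. The PR does not ship such a shim; split_legacy_trace() is a hypothetical helper, and normalizing captureContent to a boolean at this point is an assumption that follows the default-on (!== false) semantics.

```python
from typing import Any, Dict, Optional


def split_legacy_trace(trace: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: move each legacy trace.* field to its new parent."""
    if trace is None:
        # The prior single `trace: null` becomes two nulls.
        return {"context": None, "telemetry": None}
    return {
        "context": {
            "propagation": {
                "traceparent": trace.get("traceparent"),
                "baggage": trace.get("baggage"),
            }
        },
        "telemetry": {
            # Assumption: only an explicit False disables content capture.
            "capture": {
                "content": {"enabled": trace.get("captureContent") is not False}
            },
            "exporters": {
                "otlp": {
                    "endpoint": trace.get("endpoint"),
                    "headers": {"authorization": trace.get("authorization")},
                }
            },
        },
    }
```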

design-interfaces rationale

Each field now lives under a parent that answers the same question it does: standard W3C names (traceparent/baggage) kept verbatim under context.propagation; the credential nested under the exporter's headers (under the thing it authenticates) rather than free-floating; the capture policy expressed as an allowlist-style enabled flag, not a sibling of the data; exporters plural so a second exporter can be added without reshaping. context (telemetry propagation, read by the runner's OTLP exporter) is intentionally kept distinct from runContext (the run's own resource identity, read by tool call.context binding) — different roles, different consumers.

Scope / risk

  • Shape-only. No behavior change. The captureContent-default-on semantics (!== false) and the OTLP env wiring are preserved exactly. Standalone runs still serialize context: null, telemetry: null.
  • Single-DTO threading kept. The Python plumbing still threads one TraceContext object through SessionConfig → Backend.create_session → request_to_wire; only the serialization (the wire) is role-split. This keeps the change to the contract, not the call graph.
  • Deferred (out of scope): projects/runner-interface/README.md shows the old shape but is already stale (it still has the removed backend field and an old golden filename); a DTO rename (TraceContext is now slightly misnamed) is left as optional cleanup. Flagged, not done, to keep the diff a clean shape change.
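
To make "shape-only" concrete on the reader side, here is a Python sketch of how a consumer could derive the capture flag and the OTLP environment from the new paths. The real readers are TypeScript (engines/sandbox_agent.ts and pi-assets.ts), so the function names here are hypothetical. The key=value form of OTEL_EXPORTER_OTLP_HEADERS follows the OpenTelemetry convention, and any value encoding the actual reader applies is not shown.

```python
from typing import Any, Dict, Optional

Wire = Optional[Dict[str, Any]]


def capture_enabled(telemetry: Wire) -> bool:
    # Default-on: mirrors a `!== false` check, so a missing flag counts as enabled.
    content = ((telemetry or {}).get("capture") or {}).get("content") or {}
    return content.get("enabled") is not False


def otlp_env(context: Wire, telemetry: Wire) -> Dict[str, str]:
    """Hypothetical reader: build the exporter environment from the split shape."""
    env: Dict[str, str] = {}

    propagation = (context or {}).get("propagation") or {}
    if propagation.get("traceparent"):
        env["TRACEPARENT"] = propagation["traceparent"]

    otlp = ((telemetry or {}).get("exporters") or {}).get("otlp") or {}
    if otlp.get("endpoint"):
        env["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = otlp["endpoint"]

    authorization = (otlp.get("headers") or {}).get("authorization")
    if authorization:
        # Assumption: single header, passed through without extra encoding.
        env["OTEL_EXPORTER_OTLP_HEADERS"] = f"authorization={authorization}"

    return env
```

A standalone run (context: null, telemetry: null) yields an empty environment in this sketch, so nothing is exported.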

How to QA

# Python wire contract + models + DTO + the run_context fix
cd sdks/python && uv run ruff check agenta/sdk/agents/ \
  && uv run --no-sync python -m pytest oss/tests/pytest/unit/agents/ -n0
cd ../services && uv run --no-sync python -m pytest oss/tests/pytest/unit/agent/ -n0

# TS typecheck + the OTLP/telemetry + wire-contract tests
cd ../services/agent && pnpm run typecheck && pnpm test

Expected (verified locally): SDK agents 441 passed; service agent 45 passed (3 new in test_tracing.py); TS typecheck clean, 309 vitest passed (incl. wire-contract.test.ts, sandbox-agent-pi-assets.test.ts, tool-direct.test.ts). The contract tests pin both sides of the new shape and that the legacy trace key is gone; the pi-assets test confirms telemetry still flows to the OTLP env (TRACEPARENT, OTEL_EXPORTER_OTLP_TRACES_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS).

Do not merge.

https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

The pre-existing `trace` block on the `/run` request mixed four different roles
under one bucket: per-call propagation context (traceparent/baggage), exporter
config (endpoint), exporter credential (authorization), and capture policy
(captureContent). Split it by role into `context.propagation` (W3C trace-context)
and `telemetry` (capture policy + OTLP exporter with the credential under the
exporter's standard `authorization` header). Behavior is identical; this is a
shape change to the wire contract.

Updated in lockstep: protocol.ts, wire.py, wire_models.py, dtos.py, both golden
fixtures, both wire-contract tests, the DTO unit test, and the two live OTLP
readers (engines/sandbox_agent.ts, engines/sandbox_agent/pi-assets.ts). The
internal tracer (tracing/otel.ts) and its `captureContent` field are unchanged.

Also addresses a CodeRabbit follow-up on #4892: `run_context()` now captures the
workflow identity and the trace identity as independent failure domains, so a
workflow-extraction failure no longer drops an otherwise-valid `runContext.trace`.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 28, 2026
@vercel

vercel Bot commented Jun 28, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| agenta-documentation | Ready | Preview, Comment | Jun 28, 2026 12:43pm |

Request Review

@dosubot dosubot Bot added the refactoring A code change that neither fixes a bug nor adds a feature label Jun 28, 2026
@coderabbitai

coderabbitai Bot commented Jun 28, 2026


Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Tracing-related run requests now use separate context and telemetry sections, making propagation and exporter settings easier to configure.
    • Added support for nested OTLP export settings and content-capture controls.
  • Bug Fixes

    • Improved trace handling so workflow and trace data can be preserved independently when one part fails to load.
    • Updated runtime behavior to keep tracing and export settings aligned across services.
  • Documentation

    • Refreshed interface docs and examples to match the new request shape.

Walkthrough

The /run wire contract's single trace field is split into two role-separated fields: context (W3C propagation headers) and telemetry (OTLP exporter config and capture policy). This change is applied across the TypeScript protocol, Python SDK models, service engines, golden fixtures, tests, and design docs. Separately, run_context() is refactored to independently capture workflow and trace references with separate failure domains.

Changes

trace → context + telemetry wire contract split

| Layer / File(s) | Summary |
|---|---|
| TypeScript and Python wire model contracts: services/agent/src/protocol.ts, sdks/python/agenta/sdk/agents/wire_models.py | TraceContext/WireTraceContext removed; new split models (Propagation, RequestContext, OtlpExporter, Telemetry / WirePropagation, WireRequestContext, WireTelemetry) defined. AgentRunRequest/WireRunRequest replace trace with context and telemetry fields. |
| Python TraceContext serialization: sdks/python/agenta/sdk/agents/dtos.py, sdks/python/agenta/sdk/agents/utils/wire.py | to_wire() removed from TraceContext; context_to_wire() and telemetry_to_wire() added. request_to_wire() emits context and telemetry fields instead of trace. |
| Service engine consumption: services/agent/src/engines/sandbox_agent.ts, services/agent/src/engines/sandbox_agent/pi-assets.ts | createSandboxAgentOtel and buildPiExtensionEnv updated to read from request.context?.propagation and request.telemetry instead of request.trace. |
| Tests and golden fixtures: sdks/python/oss/tests/pytest/unit/agents/golden/*, sdks/python/oss/tests/pytest/unit/agents/test_*.py, services/agent/tests/unit/* | All golden JSON fixtures, DTO serialization tests, and wire-contract tests updated to the context/telemetry structure. TypeScript wire-contract test gains a compile-time drift guard. |
| Design doc updates: docs/design/agent-workflows/interfaces/cross-service/service-and-runner-trace-export.md, docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md | Contract examples and "watch for when changing" checklists updated to reflect new field paths. |

run_context() independent failure domains

| Layer / File(s) | Summary |
|---|---|
| run_context() resilience refactor: services/oss/src/agent/tracing.py, services/oss/tests/pytest/unit/agent/test_tracing.py | run_context() uses two independent try/except blocks for workflow and trace capture; returns partial RunContext when only one succeeds. New tests cover each single-failure and total-failure case. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Agenta-AI/agenta#4830: Modifies the same /run wire-contract schema files (wire_models.py, request key definitions), as part of the same contract evolution this PR continues.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.50% which is insufficient. The required threshold is 60.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: splitting the trace/telemetry interface by semantic role. |
| Description check | ✅ Passed | The description is directly about the /run trace block split and related wire-contract updates. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@mmabrouk

Member Author

🤖 The AI agent says:

This is the deferred "Fork A" from #4892 (the trace/telemetry restructure you greenlit with "yes please create another pr to target this"), plus the open CodeRabbit follow-up on that PR.

What I need reviewed (a design + contract check, not a UX check):

  1. The role split itself: does context.propagation (per-call W3C trace-context) vs telemetry (capture policy + OTLP exporter, credential under the exporter's headers.authorization) match what you proposed on #4892? The shape applied is byte-for-byte your posted shape, refined per design-interfaces.
  2. context vs runContext coexisting at top level: I kept them separate (different roles, different consumers: context is read by the runner's OTLP exporter, runContext by tool call.context binding) rather than nesting one under the other. Confirm that boundary, or say if you'd rather unify them (that would re-open the freshly-merged #4892 contract, so I left it).
  3. The two live OTLP readers stay correct: engines/sandbox_agent.ts and pi-assets.ts now read the new paths; the pi-assets test asserts TRACEPARENT / OTEL_EXPORTER_OTLP_TRACES_ENDPOINT / OTEL_EXPORTER_OTLP_HEADERS are still set, so telemetry still exports. Behavior is identical; please sanity-check that claim.
  4. The CodeRabbit fix: run_context() now captures workflow and trace as independent failure domains (test_tracing.py covers both). Confirm that's the resolution you want.

Deliberately deferred (flagged, not done): the stale projects/runner-interface/README.md snapshot (already drifted — still has the removed backend field) and an optional TraceContext DTO rename. Both noted in the PR body under Scope/risk.

Gates green locally: SDK agents 441, service agent 45 (+3 new), TS typecheck + 309 vitest. Not merging.

@mmabrouk

Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 28, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@mmabrouk mmabrouk added the needs-review Agent updated; awaiting Mahmoud's review label Jun 28, 2026

@mmabrouk mmabrouk left a comment

lgtm

@mmabrouk mmabrouk added lgtm This PR has been approved by a maintainer and removed needs-review Agent updated; awaiting Mahmoud's review labels Jun 28, 2026
@mmabrouk mmabrouk merged commit bcc8d3b into big-agents Jun 28, 2026
18 of 19 checks passed
@mmabrouk mmabrouk mentioned this pull request Jun 28, 2026
12 tasks
Labels

lgtm This PR has been approved by a maintainer refactoring A code change that neither fixes a bug nor adds a feature size:L This PR changes 100-499 lines, ignoring generated files.

1 participant