
fix(rack): fiber-safe context detachment in EventHandler #2130

Open

rsamoilov wants to merge 1 commit into open-telemetry:main from rage-rb:fix/rack-fiber-safe-context-detachment

Conversation

@rsamoilov

Summary

Fixes context detachment errors in fiber-based environments.

Problem: When using Rack::Events, the on_finish callback can be invoked from a different fiber than on_start (e.g., when streaming response bodies). This causes OpenTelemetry::Context.detach(token) to fail with:

ERROR -- : OpenTelemetry error: calls to detach should match corresponding calls to attach

Solution: Store the fiber reference alongside the token and span. On detach_context:

  • Always finish the span (trace data is never lost)
  • Only detach context if running in the same fiber
  • Log a debug message when detachment is skipped

Note on fiber pooling: Skipping detach could leave stale context in the original fiber. However, servers that reuse fibers across requests would be fundamentally incompatible with any Ruby feature relying on fiber-locals (e.g., ActiveSupport::CurrentAttributes), so this isn't something we should solve at the instrumentation level.
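The detach logic described above can be sketched in plain Ruby. This is a minimal illustration, not the PR's actual code: `FakeSpan` and `FakeContext` are stand-ins for the real OpenTelemetry span and `OpenTelemetry::Context` API, and the `detach_context` signature is assumed.

```ruby
require 'logger'

# Stand-in for an OTel span; only #finish matters for this sketch.
FakeSpan = Struct.new(:finished) do
  def finish
    self.finished = true
  end
end

# Stand-in for OpenTelemetry::Context: attach returns a token, and
# detach raises when tokens come back out of order, like the real API.
module FakeContext
  @stack = []
  def self.attach(ctx)
    @stack.push(ctx)
    @stack.size
  end

  def self.detach(token)
    raise 'calls to detach should match corresponding calls to attach' unless token == @stack.size
    @stack.pop
    true
  end
end

LOGGER = Logger.new($stdout)

# The fix: always finish the span, but only detach the context when
# still running in the fiber that attached it.
def detach_context(context_info)
  original_fiber, token, span = context_info
  span.finish # trace data is never lost
  if Fiber.current.equal?(original_fiber)
    FakeContext.detach(token)
  else
    LOGGER.debug('skipping context detach: on_finish ran in a different fiber')
    false
  end
end

# Cross-fiber scenario: context attached inside a child fiber,
# detach_context invoked later from the parent fiber.
span = FakeSpan.new(false)
info = nil
Fiber.new { info = [Fiber.current, FakeContext.attach(:request_ctx), span] }.resume
detached = detach_context(info)
```

Here `detached` is false while `span.finished` is true: the span is still recorded, and only the mismatched detach (which would raise in the real API) is skipped.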

Changes

  • Changed env key from otel.rack.token_and_span to otel.context_info
  • Changed storage format from [token, span] to [fiber, token, span]
  • Updated detach_context to check fiber identity before detaching
  • Applied changes to all three convention variants (stable, old, dup)
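On the attach side, the new storage format amounts to something like the following hedged sketch. Only the env key and the tuple layout come from the PR description; `OtelStub` and `record_context_info` are hypothetical stand-ins for the real instrumentation code.

```ruby
# Minimal stand-in for OpenTelemetry::Context.attach, which returns
# an opaque token used for later detachment.
module OtelStub
  def self.attach_context(_ctx)
    Object.new # opaque token
  end
end

OTEL_CONTEXT_INFO = 'otel.context_info'

# Sketch of the on_start side: the current fiber is stored in the Rack
# env next to the token and span, so detach_context can later check
# whether it is still running in the same fiber.
def record_context_info(env, span)
  token = OtelStub.attach_context(:request_context)
  env[OTEL_CONTEXT_INFO] = [Fiber.current, token, span]
end

env = {}
record_context_info(env, :span)
fiber, token, span = env[OTEL_CONTEXT_INFO]
```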

Test plan

  • Added unit tests for same-fiber detachment
  • Added unit tests for cross-fiber detachment (verifies span finishes without error)
  • All existing tests pass across all 12 configurations (4 Rack versions × 3 conventions)

@linux-foundation-easycla

linux-foundation-easycla bot commented Mar 26, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: rsamoilov / name: Roman Samoilov (1129d8f)

Store fiber reference with context token to safely skip detachment
when on_finish is called from a different fiber than on_start.
Spans are always finished regardless of fiber.
@rsamoilov force-pushed the fix/rack-fiber-safe-context-detachment branch from 16aa912 to 1129d8f on April 13, 2026 at 08:29
    span.finish
    OpenTelemetry::Context.detach(token)

    if Fiber.current.equal?(original_fiber)
Contributor


Could you share more details about how this could happen?

What application server are you using that invokes the rack events in different fibers?

Contributor

@arielvalentin left a comment


I am concerned about passing around references to fibers and want to better understand if we are leaking context objects or detaching them out of order.

Could you share more information or a small script that demonstrates the problem?

@rsamoilov
Author

I'm using Rage, which runs each request in a separate fiber. The issue isn't specific to Rage, though: it affects any setup where the response body is closed in a different fiber than the one that handled the request.

Here's a minimal reproduction that demonstrates the problem using just Puma + a simple middleware: https://gist.github.com/rsamoilov/c01b3923af0a35dd75b34056062abf0a

WrapInFiber simulates the fiber boundary. Rack::Events calls on_start inside the fiber (where context is attached), but calls on_finish when the body is closed - back in the original fiber. Since OTel context is fiber-local, the detach fails.
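For illustration, a middleware along these lines reproduces the boundary. This is a hedged sketch, not the gist's exact code; the class name `WrapInFiber` matches the description above, but everything else is an assumption.

```ruby
# Runs the downstream app (and therefore Rack::Events' on_start, which
# attaches the OTel context) inside a child fiber. The server later
# closes the response body in the parent fiber, where Rack::Events
# fires on_finish; Context.detach then runs in the wrong fiber.
class WrapInFiber
  def initialize(app)
    @app = app
  end

  def call(env)
    response = nil
    Fiber.new { response = @app.call(env) }.resume
    response
  end
end

app = ->(env) { [200, { 'content-type' => 'text/plain' }, ['ok']] }
status, headers, body = WrapInFiber.new(app).call({})
```

With Rack::Events layered around the app, on_start runs inside the child fiber while on_finish runs in the parent, which is exactly the mismatch the PR guards against.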

The span always finishes regardless of which fiber we're in. The only thing skipped is the Context.detach call when it would error anyway.
