fix(rack): fiber-safe context detachment in EventHandler#2130
fix(rack): fiber-safe context detachment in EventHandler#2130rsamoilov wants to merge 1 commit intoopen-telemetry:mainfrom
EventHandler#2130Conversation
|
|
Store fiber reference with context token to safely skip detachment when on_finish is called from a different fiber than on_start. Spans are always finished regardless of fiber.
16aa912 to
1129d8f
Compare
| span.finish | ||
| OpenTelemetry::Context.detach(token) | ||
|
|
||
| if Fiber.current.equal?(original_fiber) |
There was a problem hiding this comment.
Could you share more details about how this could happen?
What application server are you using that invokes the rack events in different fibers?
arielvalentin
left a comment
There was a problem hiding this comment.
I am concerned about passing around references to fibers and want to better understand if we are leaking context objects or detaching them out of order.
Could you share more information or a small script that demonstrates the problem?
|
I'm using Rage, which runs each request in a separate fiber. But the issue isn't specific to Rage - it affects any setup where the response body is closed in a different fiber than the one that handled the request. Here's a minimal reproduction that demonstrates the problem using just Puma + a simple middleware: https://gist.github.com/rsamoilov/c01b3923af0a35dd75b34056062abf0a
The span always finishes regardless of which fiber we're in. The only thing skipped is the |
Summary
Fixes context detachment errors in fiber-based environments.
Problem: When using
Rack::Events, theon_finishcallback can be invoked from a different fiber thanon_start(e.g., when streaming response bodies). This causesOpenTelemetry::Context.detach(token)to fail with:Solution: Store the fiber reference alongside the token and span. On
detach_context:Note on fiber pooling: Skipping
detachcould leave stale context in the original fiber. However, servers that reuse fibers across requests would be fundamentally incompatible with any Ruby feature relying on fiber-locals (e.g.,ActiveSupport::CurrentAttributes), so this isn't something we should solve at the instrumentation level.Changes
otel.rack.token_and_spantootel.context_info[token, span]to[fiber, token, span]detach_contextto check fiber identity before detachingTest plan