Skip to content

[ROBO-5747] fix: don't log failed-accept Error on server teardown#126

Open
eduard-dumitru wants to merge 2 commits into
masterfrom
fix/suppress_teardown_logging_noise
Open

[ROBO-5747] fix: don't log failed-accept Error on server teardown#126
eduard-dumitru wants to merge 2 commits into
masterfrom
fix/suppress_teardown_logging_noise

Conversation

@eduard-dumitru

@eduard-dumitru eduard-dumitru commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

Disposing an IpcServer cancels the accept loop; on Linux a parked slot's pipe can surface as a non-OCE IOException ("Pipe is broken") rather than an OperationCanceledException, which was logged as "Failed to accept new connection". Suppress it when cancellation is already requested (genuine errors still log). Includes a real-pipe, Linux red->green regression test.

Test Results (on CI-RobotIntegration-Test)

PR #126 teardown-noise verification — coreipc IpcServer accept loop

Artifact: RobotLogs (per build). Search string in Robot.log: Failed to accept new connection
Full line (before fix): Failed to accept new connection. Ex: System.IO.IOException: Pipe is broken.

Build Fix? Leg File in RobotLogs/ Matches First 3 line #s
12439546 ✅ with fix Linux: Build Robot.log 0 — (no matches)
12439546 ✅ with fix Docker: Linux Robot-*.log 0 — (no matches)
12439546 ✅ with fix Mac: Build stage still in progress — no logs published n/a
12439546 ✅ with fix Windows ×3 / Serverless Robot.log 0
12436764 ❌ no fix Linux: Build Robot.log 2675 2466, 2472, 2478
12436764 ❌ no fix Docker: Linux Robot-7f2da42b-c12b-4e07-8da5-247649cc0c57.log 2035 231, 237, 243
12436764 ❌ no fix Mac: Build Robot.log 3125 2513, 2519, 2525
12436764 ❌ no fix Windows ×3 / Serverless Robot.log 0

How to verify: open the file in the leg's RobotLogs zip, Find Failed to accept new connection
before-fix legs hit it thousands of times (jump to the line #s above); fixed Linux/Docker legs return 0 matches.
Noise is Unix-only (Windows = 0 in both), matching the root cause PR #126 guards.

Test Results (on dev machine) — 20 runs per cell (80 total)

Linux (WSL Debian, net6.0) Windows (net6.0)
with fix ✅ 20 / 20 pass ✅ 20 / 20 pass
without fix ❌ 0 / 20 fail (expected) ✅ 20 / 20 pass

Findings

  • Deterministic on Linux. Without the fix the test fails on every run (20/20), reproducing
    the exact production stack trace (IOException: Pipe is broken at
    NamedPipeServerTransport.AwaitConnection → IpcServer.Accepter.TryAccept). With the fix it
    passes on every run. No flakiness in either direction.
  • Windows never manifests it (20/20 pass with and without the fix). There the accept
    cancellation is a clean OperationCanceledException (already suppressed); the non-OCE
    IOException only leaks on the Linux dispose-based cancellation path. This matches the
    original report, which was Linux.

How the Linux runs were executed

wsl -d Debian                       # default distro (docker-desktop) has no shell
cd /mnt/d/Code/coreipc
DOTNET_ROLL_FORWARD=Major \         # Debian has .NET 8/10 runtimes, not net6.0
  dotnet test src/UiPath.CoreIpc.Tests/UiPath.CoreIpc.Tests.csproj -f net6.0 \
  --filter "FullyQualifiedName~DisposingServerDoesNotLogFailedAcceptOnTeardown"

Follow-up — addressed Copilot review (commit 8c52ee4)

Three test-only nits from the Copilot review (the fix itself was unchanged):

  • Serialized the whole test suite[assembly: CollectionBehavior(DisableTestParallelization = true)]. The teardown test attaches a process-wide System.Diagnostics.Trace listener, which is unsafe under xUnit's default cross-collection parallelism; serializing removes the hazard entirely.
  • using on the raw NamedPipeClientStream — exception-safe (no leak if ConnectAsync throws); the scope-end dispose is still an abrupt OS handle close.
  • Dropped the fixed Task.Delay sleepsTimeouts.Short for the slot re-park, and removed the post-dispose settle delay (DisposeAsync already awaits the accept-loop teardown, so any OnError has fired by then).

Re-verified with the new timings (Linux WSL/Debian, net6.0): 10/10 fail without the fix, 10/10 pass with it; Windows still passes.

Disposing an IpcServer cancels the accept loop; on Linux a parked slot's
pipe can surface as a non-OCE IOException ("Pipe is broken") rather than an
OperationCanceledException, which was logged as "Failed to accept new
connection". Suppress it when cancellation is already requested (genuine
errors still log). Includes a real-pipe, Linux red->green regression test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces teardown noise by preventing IpcServer’s accept loop from reporting “Failed to accept new connection” when the server is already shutting down (cancellation requested), and adds a regression test to validate the behavior using real named pipes.

Changes:

  • Suppress _newConnection.OnError(ex) during accept-loop teardown when cancellation is already requested.
  • Add a named-pipe regression test that captures System.Diagnostics.Trace error output and asserts teardown does not emit “Failed to accept new connection”.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/UiPath.CoreIpc/Config/IpcServer.cs Avoid reporting accept-loop exceptions to observers when cancellation is already requested (teardown race).
src/UiPath.CoreIpc.Tests/NamedPipeSmokeTests.Teardown.cs Adds a real named-pipe regression test to ensure server disposal doesn’t log failed-accept errors.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/UiPath.CoreIpc.Tests/NamedPipeSmokeTests.Teardown.cs
Comment thread src/UiPath.CoreIpc.Tests/NamedPipeSmokeTests.Teardown.cs
Comment thread src/UiPath.CoreIpc.Tests/NamedPipeSmokeTests.Teardown.cs Outdated
@eduard-dumitru eduard-dumitru changed the title fix: don't log failed-accept Error on server teardown [ROBO-5747] fix: don't log failed-accept Error on server teardown Jun 19, 2026
- Serialize the whole suite ([assembly: CollectionBehavior(
  DisableTestParallelization = true)]): the teardown test attaches a
  process-wide Trace listener, unsafe under xUnit's default parallelism.
- Wrap the raw NamedPipeClientStream in `using` (exception-safe; the scope-end
  dispose is still an abrupt OS handle close).
- Drop the fixed Task.Delay sleeps: use Timeouts.Short for the re-park, and
  remove the post-dispose settle (DisposeAsync already awaits the accept-loop
  teardown, so any OnError has fired by then).

Re-verified on Linux (WSL/Debian, net6.0): 10/10 fail without the fix, 10/10
pass with it; passes on Windows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/UiPath.CoreIpc.Tests/AssemblyInfo.cs
@eduard-dumitru

Copy link
Copy Markdown
Collaborator Author

/azp run CI

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@danutboanta

Copy link
Copy Markdown
Collaborator

Please fix the node.js on Windows/Ubuntu jobs pipeline

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants