Skip to content

Bug: Extend error handling in backend_write_file()#983

Merged
jan-janssen merged 2 commits intomainfrom
bug_serial_error
May 8, 2026
Merged

Bug: Extend error handling in backend_write_file()#983
jan-janssen merged 2 commits intomainfrom
bug_serial_error

Conversation

@jan-janssen
Copy link
Copy Markdown
Member

@jan-janssen jan-janssen commented May 8, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced error handling for file write operations to prevent jobs from hanging when serialization failures occur. Error details and runtime information are now captured and persisted.
  • Tests

    • Added unit tests covering file write operations, including successful output writing and error handling for unserializable payloads.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

Review Change Stack

Warning

Rate limit exceeded

@jan-janssen has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 54 minutes and 26 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 264df92e-6c45-4575-994c-daaf5c4cee36

📥 Commits

Reviewing files that changed from the base of the PR and between 2713148 and dc3c71f.

📒 Files selected for processing (1)
  • tests/unit/task_scheduler/file/test_backend.py
📝 Walkthrough

Walkthrough

This PR adds fault-tolerant error handling to backend_write_file by wrapping HDF5 serialization operations in try/except blocks. When serialization fails, the function now writes the error and runtime to the output file instead of crashing. Two unit tests verify both the success path and the error-handling path.

Changes

File Write Error Handling

Layer / File(s) Summary
Error Handling Logic
src/executorlib/task_scheduler/file/backend.py
backend_write_file wraps HDF5 write in try/except; on serialization failure, writes error payload with runtime instead of propagating exception.
Test Coverage
tests/unit/task_scheduler/file/test_backend.py
Imports backend_write_file and adds tests for successful output write and graceful handling of unserializable payloads.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit writes files with care,
When HDF5 stumbles in despair,
No crash, no fright—just graceful fall,
Error saved; the job won't stall. 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: extending error handling in backend_write_file() to wrap HDF5 serialization in try/except and write errors to file, preventing job hangs.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bug_serial_error

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/executorlib/task_scheduler/file/backend.py`:
- Around line 59-64: The fallback writes the caught exception object
(serialize_error) directly, which may itself be non-serializable; change the
recovery payload to store a serializable string (e.g., traceback.format_exc() or
repr(serialize_error)) instead: in the except block where dump(...) is called
(references: serialize_error, file_name_out, runtime, dump), replace the "error"
value with a string representation of the exception (and optionally include a
short tag/context) so the dump always receives a serializable payload.

In `@tests/unit/task_scheduler/file/test_backend.py`:
- Around line 87-88: The test currently uses a broad with
self.assertRaises(Exception) around future_file_obj.result(), which can hide
other failures; change this to assert the specific exception the serialization
layer throws (e.g., the concrete exception class used when futures from
future_file_obj.result() fail — replace Exception with that type, or use the
stable project-specific base like SerializationError/ValueError/RuntimeError
used elsewhere), and optionally assert the error message contents to make the
expectation explicit; update the test in
tests/unit/task_scheduler/file/test_backend.py to call
self.assertRaises(ExpectedException) around future_file_obj.result() (and add an
assert on str(e) if a message check is needed).
- Around line 82-84: The test calls backend_write_file which first renames the
input "<task>_i.h5" and therefore must create that HDF5 input file before
invoking backend_write_file; update the test setup to create the expected input
HDF file (using the same cache_directory/name pattern used for file_name) so
backend_write_file can find and rename it, then proceed to call
backend_write_file and construct FutureItem as before (reference
backend_write_file and FutureItem to locate the change).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: de89cb44-b544-46dc-9409-dd4771f602f2

📥 Commits

Reviewing files that changed from the base of the PR and between 3c5d8b1 and 2713148.

📒 Files selected for processing (2)
  • src/executorlib/task_scheduler/file/backend.py
  • tests/unit/task_scheduler/file/test_backend.py

Comment on lines +59 to 64
except Exception as serialize_error:
# Serialization failed — store the error so the job is not stuck
dump(
file_name=file_name_out + "_r.h5",
data_dict={"error": output["error"], "runtime": runtime},
data_dict={"error": serialize_error, "runtime": runtime},
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make fallback error payload always serializable.

At Line 63, writing serialize_error directly can fail again for non-serializable exception objects, which breaks the new recovery path. Persist a string payload (repr/traceback) instead.

Proposed fix
+import traceback
@@
-    except Exception as serialize_error:
+    except Exception as serialize_error:
         # Serialization failed — store the error so the job is not stuck
         dump(
             file_name=file_name_out + "_r.h5",
-            data_dict={"error": serialize_error, "runtime": runtime},
+            data_dict={
+                "error": f"{type(serialize_error).__name__}: {serialize_error}",
+                "error_traceback": traceback.format_exc(),
+                "runtime": runtime,
+            },
         )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except Exception as serialize_error:
# Serialization failed — store the error so the job is not stuck
dump(
file_name=file_name_out + "_r.h5",
data_dict={"error": output["error"], "runtime": runtime},
data_dict={"error": serialize_error, "runtime": runtime},
)
except Exception as serialize_error:
# Serialization failed — store the error so the job is not stuck
dump(
file_name=file_name_out + "_r.h5",
data_dict={
"error": f"{type(serialize_error).__name__}: {serialize_error}",
"error_traceback": traceback.format_exc(),
"runtime": runtime,
},
)
🧰 Tools
🪛 Ruff (0.15.12)

[warning] 59-59: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/executorlib/task_scheduler/file/backend.py` around lines 59 - 64, The
fallback writes the caught exception object (serialize_error) directly, which
may itself be non-serializable; change the recovery payload to store a
serializable string (e.g., traceback.format_exc() or repr(serialize_error))
instead: in the except block where dump(...) is called (references:
serialize_error, file_name_out, runtime, dump), replace the "error" value with a
string representation of the exception (and optionally include a short
tag/context) so the dump always receives a serializable payload.

Comment on lines +82 to +84
file_name = os.path.join(cache_directory, "test_file_i.h5")
backend_write_file(file_name=file_name, output={"result": Future()}, runtime=0.1)
future_file_obj = FutureItem(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

This test does not set up the input file required by backend_write_file.

At Line 83, backend_write_file renames <task>_i.h5 first, so this test can fail with FileNotFoundError before hitting the serialization-error branch. Create the input HDF file in setup for this test.

Proposed fix
     def test_backend_write_file_serialization_error(self):
         cache_directory = os.path.abspath("executorlib_cache")
         os.makedirs(cache_directory, exist_ok=True)
         file_name = os.path.join(cache_directory, "test_file_i.h5")
+        dump(file_name=file_name, data_dict={"fn": my_funct, "args": [1], "kwargs": {"b": 2}})
         backend_write_file(file_name=file_name, output={"result": Future()}, runtime=0.1)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/task_scheduler/file/test_backend.py` around lines 82 - 84, The
test calls backend_write_file which first renames the input "<task>_i.h5" and
therefore must create that HDF5 input file before invoking backend_write_file;
update the test setup to create the expected input HDF file (using the same
cache_directory/name pattern used for file_name) so backend_write_file can find
and rename it, then proceed to call backend_write_file and construct FutureItem
as before (reference backend_write_file and FutureItem to locate the change).

Comment on lines +87 to +88
with self.assertRaises(Exception):
future_file_obj.result()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid blind exception assertion in test.

At Line 87, assertRaises(Exception) is too broad and can mask unrelated regressions. Assert a specific expected exception type (or a stable base type used by your serialization layer).

🧰 Tools
🪛 Ruff (0.15.12)

[warning] 87-87: Do not assert blind exception: Exception

(B017)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/task_scheduler/file/test_backend.py` around lines 87 - 88, The
test currently uses a broad with self.assertRaises(Exception) around
future_file_obj.result(), which can hide other failures; change this to assert
the specific exception the serialization layer throws (e.g., the concrete
exception class used when futures from future_file_obj.result() fail — replace
Exception with that type, or use the stable project-specific base like
SerializationError/ValueError/RuntimeError used elsewhere), and optionally
assert the error message contents to make the expectation explicit; update the
test in tests/unit/task_scheduler/file/test_backend.py to call
self.assertRaises(ExpectedException) around future_file_obj.result() (and add an
assert on str(e) if a message check is needed).

@codecov
Copy link
Copy Markdown

codecov Bot commented May 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.16%. Comparing base (3d876bb) to head (dc3c71f).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #983   +/-   ##
=======================================
  Coverage   94.15%   94.16%           
=======================================
  Files          39       39           
  Lines        2089     2092    +3     
=======================================
+ Hits         1967     1970    +3     
  Misses        122      122           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fantastic, thanks @jan-janssen !

Please let me know when a new release with this fix is out

@jan-janssen jan-janssen merged commit 4a63cb1 into main May 8, 2026
36 checks passed
@jan-janssen jan-janssen deleted the bug_serial_error branch May 8, 2026 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug] backend_write_file should catch serialization errors and write them to the output file

2 participants