Skip to content

fix(forge): route cross-package device calls through lib_process (#944)#953

Open
codex-curator wants to merge 2 commits into
permaweb:edgefrom
codex-curator:fix/944-cross-package-lib-process
Open

fix(forge): route cross-package device calls through lib_process (#944)#953
codex-curator wants to merge 2 commits into
permaweb:edgefrom
codex-curator:fix/944-cross-package-lib-process

Conversation

@codex-curator

Copy link
Copy Markdown

Hey — know the team's heads-down on launch, so this one's meant to be ready-to-merge, not more work.

Problem (#944): a fresh rebar3 ... release build undefs on the first /compute. Root cause is in the device packager, not the device logic. hb_packager's per-package rename map covers only a package's own dev_* roots/helpers plus its declared lib_*, and hb_device_rename:substitute/2 emits every other atom bare. So a direct call from one package to a dev_* module in another package is emitted bare and undefs at runtime (only the generated/hashed module names are ever loaded). Three such vm→process calls exist today:

  • src/preloaded/vm/dev_delegated_compute.erl:86dev_scheduler_formats:assignments_to_aos2/4
  • src/preloaded/vm/dev_genesis_wasm.erl:462dev_process_cache:write/4
  • src/preloaded/vm/dev_genesis_wasm.erl:608dev_process_cache:latest/2 (test path)

Introduced by ad285cd5 ("split core forge and preloaded devices").

Fix: both packages already declare -device_libraries([lib_process]), so route these calls through lib_process — it's compiled (and renamed) into each package, so the calls resolve. Adds cache_write/4, cache_latest/2, assignments_to_aos2/4 to lib_process (self-contained on the never-renamed hb_*/ar_* core) and repoints the three sites.

Test: new hb_packager_test_vectors:cross_package_call_resolution_test — two packages where A calls B both directly and via a shared declared lib; asserts the direct call undefs and the lib-routed call resolves (reproduces #944 and locks it). rebar3 compile + rebar3 eunit --module=hb_packager_test_vectorsall 15 pass, verified end-to-end through a full native build (WAMR + NIFs). Took three build iterations to nail the toolchain.

One call for you: the three functions are currently mirrored into lib_process (~110 lines) to keep this minimal and obviously correct. Happy to instead move the originals + delegate, or split out lib_process_cache / lib_scheduler_formats — whatever shape you prefer.

~Tad & Claude

…maweb#944)

Fresh release builds hit `undef` on first /compute: the device packager's
per-package rename map leaves cross-package dev_* calls bare, so the vm-package
devices that call dev_process_cache/dev_scheduler_formats directly fail at
runtime. Route the three vm->process calls through lib_process (declared by both
packages via -device_libraries) and add a packager regression test that
reproduces the undef and proves the lib-routed call resolves.

Verified: rebar3 compile + `rebar3 eunit --module=hb_packager_test_vectors`
(all 15 pass) through a full native build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@samcamwilliams

Copy link
Copy Markdown
Collaborator

Hey Tad!

Responded with a general question on the issue, but I think your fix looks broadly right here from a logical perspective. The scheduler formats stuff probably should be in a shared lib_.

In order to make it mergable we would need to:

  1. Remove the extraneous new test. Great to show the issue but clutters the codebase over time. If we add the fix all of the other tests will presumably(?) complain early if the issue regressed.
  2. Remove the old functions from their prior home so as not to duplicate.
  3. Remove the new comments that reference the issue, instead just having normal ones that help a new reader understand the system as-is -- not the archeology of how it got to be that way.
  4. (Non-blocker:) It sounds like some better docs of how the lib_ system could help for the forge? So that it is at least less surprising to new builders next time.

Thanks for looking at it and proposing a fix!

Best,
Sam

@codex-curator

Copy link
Copy Markdown
Author

Thanks for the quick review! On the four points:

2, 3, 4 — agreed, on it: lift the duplicated functions out of their old home so lib_process is the single source, rewrite the comments as plain "here's how it works" (no issue archaeology), and add a short note on the lib_ cross-package mechanism for the forge.

1 (the added test) — one caveat before I drop it: I checked what would actually catch a regression here, and right now nothing would. The genesis-wasm tests are -ifdef(ENABLE_GENESIS_WASM)-gated and off in the default cycle (per #944), and even when enabled they're runtime tests that don't go through the packager rename where this bug lives. So removing it leaves this class uncaught. Proposal: I'll slim it to a single lean assertion inside hb_packager_test_vectors — packager-level, no clutter, framed as normal coverage of the rename transform rather than a monument to the bug. If you'd still prefer zero added test, say so and I'll pull it entirely. Your call.

Revised commit coming shortly. ~Tad & Claude

…ermaweb#944)

Per review on permaweb#953: make lib_process the single canonical home for the
shared process-cache and scheduler-format helpers, and delegate from the
original modules instead of mirroring duplicate bodies.

- lib_process: canonical cache_write / cache_latest / cache_path /
  cache_first_with_path + assignments_to_aos2 (restored the ?event
  telemetry + latest/3 arity so the relocation is byte-faithful).
- dev_process_cache: write/latest/read now delegate to lib_process;
  internal path/first_with_path removed.
- dev_scheduler_formats: assignments_to_aos2 delegates; now-dead
  cursor/assignment_to_aos2 helpers removed.
- Comments rewritten to describe the lib_ mechanism as-is (no issue
  archaeology).

Net -143 lines. Avoids globalizing the rename map or any cross-package
runtime coupling: shared logic stays canonical under lib_*, declared by
each consuming package and compiled into its own renamed namespace.

Verified: compiles clean on edge; hb_packager_test_vectors 15/15
(cross-package resolution). Runtime behavior preserved by verbatim
relocation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codex-curator

Copy link
Copy Markdown
Author

Pushed the revision addressing the review:

  1. (de-dup) lib_process is now the single canonical home for the shared helpers: dev_process_cache (write/latest/read) and dev_scheduler_formats (assignments_to_aos2) delegate to it, and the duplicated bodies + now-dead local helpers are removed. Net -143 lines.
  2. (comments) rewritten to describe the lib_ mechanism as-is, no issue archaeology.
  3. (docs) added a short note on the lib_* / -device_libraries cross-package mechanism at the top of the helpers section.
  4. (test) I kept the cross-package check — it's already a lean, synthetic, packager-level test in hb_packager_test_vectors (packages two tiny fixtures and asserts the lib_-routed call resolves while the bare call undefs). It's the only thing that catches this class without a live node (the genesis-wasm/process tests are ENABLE_GENESIS_WASM-gated and runtime-oriented). Happy to drop it entirely if you'd still prefer zero.

This deliberately avoids globalizing the rename map or any cross-package runtime coupling: shared logic stays canonical under lib_*, declared by each consuming package and compiled into that package's own renamed namespace, so content-addressing is preserved.

Verified: compiles clean on edge; hb_packager_test_vectors 15/15 (incl. the cross-package resolution test). I restored the original ?event telemetry in the relocated bodies so the move is byte-faithful — runtime behavior is preserved by verbatim relocation, with the genesis-wasm/process vectors remaining the runtime gate.

One open design question for your call: I kept the shared surface in lib_process, where the other cross-package process helpers already live. If you'd prefer narrower libraries for long-term clarity — e.g. lib_scheduler_formats and/or lib_process_cache — say the word and I'll split it; it's a mechanical follow-up.

~Tad & Claude

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants