Skip to content

fix(formatjs): avoid hygiene crashes#625

Merged
kdy1 merged 5 commits into
mainfrom
kdy1/fix-formatjs-issue-604
May 30, 2026
Merged

fix(formatjs): avoid hygiene crashes#625
kdy1 merged 5 commits into
mainfrom
kdy1/fix-formatjs-issue-604

Conversation

@kdy1
Copy link
Copy Markdown
Member

@kdy1 kdy1 commented May 30, 2026

Closes #604

Summary

Tests

  • cargo check -p swc_formatjs_transform -p swc_plugin_formatjs
  • pnpm -F @swc/plugin-formatjs test
  • pnpm exec taplo format --check packages/formatjs/Cargo.toml
  • pnpm exec prettier --check packages/formatjs/__tests__/wasm.test.ts
  • external reproduction: next-swc-formatjs-reactcompiler-bug simpler branch, local packed @swc/plugin-formatjs, npm run build

Notes

  • pnpm -F @swc/plugin-formatjs test still prints the pre-existing Vitest warning about an unawaited rejection assertion in an older test, but all 28 tests pass.

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 30, 2026

🦋 Changeset detected

Latest commit: 833fac9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@swc/plugin-formatjs Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@claude
Copy link
Copy Markdown

claude Bot commented May 30, 2026

Review of #625fix(formatjs): avoid hygiene crashes

Nice fix — dropping the swc_ecma_minifier::Evaluator removes a heavy module.clone() per file and replaces it with a focused, static evaluator that handles the only patterns formatjs descriptors actually require. The two added regression tests reproduce #604 well. A few questions and suggestions below.

Public API / SemVer

  • packages/formatjs/transform/src/lib.rs:1351 keeps create_formatjs_visitor(..., evaluator) for source compatibility but silently ignores the _evaluator argument. Downstream consumers of swc_formatjs_transform (non-wasm) will pay the cost of constructing an Evaluator and may rely on its ability to fold things like String.prototype.replace, member access through resolved objects, etc. They will get reduced evaluation power with no compile-time signal.
    • Suggest at minimum a #[deprecated(note = "...")] attribute pointing to create_formatjs_visitor_without_evaluator, and a short note in the changeset that the legacy entry point's evaluator argument is now a no-op.
    • The change is shipped as patch in .changeset/fix-formatjs-issue-604.md. Behavior of the legacy API changes (no more deep evaluation), which is arguably a minor-level change for the transform crate even if @swc/plugin-formatjs itself stays patch.
  • swc_ecma_minifier is removed from packages/formatjs/Cargo.toml (good), but transform/Cargo.toml still pulls it in just to keep the legacy signature alive. Worth a TODO/follow-up to delete it once the legacy fn is removed.

Correctness

  • store_binding (transform/src/lib.rs:562):
    (true, false) => false,
    _ => true,
    This "object beats non-object" asymmetry was previously commented to explain the React-Compiler t1 = { ... } motivation; that rationale is now gone. Consider re-adding a one-line comment so a future reader doesn't simplify it to "always update". It also produces a subtle correctness quirk:
    let x = { id: "a", defaultMessage: "A" };
    x = somethingDynamic();
    formatMessage(x); // still extracted as the original object
    Probably fine for the intended pattern, but worth being intentional about.
  • evaluate_static_string_with_depth, resolve_object_expr_with_depth, and json_value_from_expr_with_depth each hard-code depth > 16. Pull this out into a const MAX_RESOLUTION_DEPTH: usize = 16; so they can't drift. The limit also doubles as cycle protection (let a = b; let b = a;), which is worth a one-line comment.
  • evaluate_template_literal (transform/src/lib.rs:641) silently drops template expressions whose cooked is None by ?-returning None — correct, but worth noting that a tagged-template-style raw-only literal will now be treated as non-evaluable and emit the "[React Intl] Messages must be statically evaluate-able" error. Probably fine; just confirm it doesn't fire for plain template literals on existing fixtures.
  • Many new .as_str().expect("non-utf8 string") calls were copied from the old code path. Since the surrounding helpers all return Option<String>, converting expect? (via Atom::as_str()Option<&str>) would be a strict improvement and prevent a panic on the rare non-UTF-8 atom.
  • evaluate_expr now emits emit_non_evaluable_error whenever the small evaluator fails. The previous code only emitted for the evaluate_expr paths and let object/identifier branches handle themselves; the new flow funnels object handling through get_message_descriptor_value first, so the error placement looks correct — but please verify no existing test that previously didn't error now does.

Tests

  • The two new regression tests (plugin-formatjs: SWC plugin crashes with Next.js 16.2.1 + React Compiler + @swc/plugin-formatjs #604) are good. To lock in the new mini-evaluator, consider adding:
    • Template literal & concatenation: defineMessage({ defaultMessage: \Hello, ${"world"}` })and"a" + "b"` — both new positive paths.
    • TS strip cases: defineMessage({ id: "x" as const, defaultMessage: ("y" satisfies string) }) exercises Expr::TsAs/Expr::TsSatisfies in strip_expr.
    • Indirect binding: const id = "x"; defineMessage({ id, defaultMessage: "y" }) confirms the var-declarator capture path.
    • Cycle / deep chain: prove the depth guard doesn't loop forever on let a = b; let b = a;.
  • The should not evaluate unrelated TypeScript and JSX code (#604) test only asserts .resolves.toBeDefined(). Consider snapshotting (or at least string-matching) the output so a future regression that does mangle the unrelated code is caught.

Performance

  • The bindings: HashMap<Id, Box<Expr>> clones every RHS into the map. For large modules this can add up (e.g., a long t1 = <big object> chain), but it's still dramatically cheaper than the old module.clone() per visitor construction. Probably fine for now; if it shows up in profiles, a Cow/arena interner could help.

Misc / nits

  • Cargo.lock: minifier still appears under other crates — expected, no action.
  • The PR title and changeset are clear. Two-line ## Why this fixes #604 blurb in the changeset (Evaluator was eagerly walking React-Compiler-generated proxies and panicking under hygiene) would help future readers correlate the fix to the issue.

Overall: solid, targeted fix. Main asks are (1) deprecation/labeling for the legacy create_formatjs_visitor so downstream consumers know the evaluator is now a no-op, and (2) a bit more positive-path test coverage for the new static evaluator.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6045e153f1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/formatjs/transform/src/lib.rs Outdated
Comment on lines +633 to +635
Expr::Ident(ident) => {
let expr = self.bindings.get(&ident.to_id())?;
self.evaluate_static_string_with_depth(expr, depth + 1)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Support descriptor constants declared later

When a descriptor value is an identifier, this new resolver only looks in self.bindings, which is populated as the visitor reaches declarations. Because visit_mut_module_items still descends into function bodies before later module items, valid code like function C() { return <FormattedMessage defaultMessage={MSG} /> } const MSG = "Hello"; now sees MSG as missing, emits the static-evaluation error, and skips the transform/extraction; the previous module-wide Evaluator was not limited by traversal order.

Useful? React with 👍 / 👎.

Comment thread packages/formatjs/transform/src/lib.rs Outdated
}

match Self::strip_expr(expr) {
Expr::Lit(Lit::Str(s)) => Some(s.value.as_str().expect("non-utf8 string").to_string()),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve static coercion in template messages

For statically constant messages that include non-string primitives, such as defaultMessage={\Step ${2}`}ordefaultMessage={"Step " + 2}, the replacement evaluator now returns Nonebecause literal handling only acceptsLit::Str`. The previous SWC evaluator evaluated these JavaScript constants with string coercion, so these valid messages now emit the static-evaluation error and are left unextracted.

Useful? React with 👍 / 👎.

let expr = self.bindings.get(&ident.to_id())?;
self.evaluate_static_string_with_depth(expr, depth + 1)
}
_ => None,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep evaluating static member lookups

Static messages that read from object literals, for example const m = { hello: "Hello" }; formatMessage({ defaultMessage: m.hello }), now fall into this catch-all and are treated as non-evaluable because the replacement evaluator only follows bare identifiers. The old Evaluator::eval handled descriptor expressions directly, so these constant member lookups were extractable before but now emit the static-evaluation error and skip extraction.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented May 30, 2026

Code Review

Nice fix for the hygiene crash — the redesign (drop the heavyweight Evaluator + module clone in favor of a focused static evaluator) is the right shape for a WASM plugin entrypoint. A few things worth considering:

Correctness

  • visit_mut_var_declarator double-stores top-level bindings. visit_mut_module_items calls collect_module_item_bindings over every item first, then visits children — which means for any top-level var/let/const, visit_mut_var_declarator runs store_binding a second time with the same expression. Functionally harmless thanks to should_update, but redundant. Consider skipping the visitor-level store for module-scope, or skip the prepass and rely entirely on visit ordering for non-forward-reference cases.

  • Function-scoped bindings are not hoisted. collect_module_item_bindings only walks ModuleItem::Stmt(Decl::Var(..)) and ExportDecl::Var. So inside a function body, the visitor walks in source order — a defineMessage({ defaultMessage: msg }) that appears before const msg = "..." in the same function will silently fail to resolve. The "delayed bindings" test (packages/formatjs/__tests__/wasm.test.ts:118) only exercises the module-level case. Worth either adding a function-scope test or documenting the limitation.

  • No destructuring support. collect_var_decl_bindings only matches Pat::Ident. const { id, defaultMessage } = MESSAGE; will not be tracked. The previous Evaluator may have handled this; if the issue is specifically about hygiene, this is probably acceptable, but it is a behavior change.

  • Module prepass ignores import bindings. If a user does import MSG from './messages'; defineMessage({ defaultMessage: MSG }), MSG won't resolve. Again, possibly intentional (cannot be statically inlined without crossing module boundaries) — but the old evaluator may have at least tried.

Code quality

  • store_binding heuristic deserves an inline test. The "keep object over non-object" rule for React Compiler temps (packages/formatjs/transform/src/lib.rs:596-613) is subtle and the comment is the only documentation. A direct unit test of the logic (object first then assignment, vs. non-object first then object) would help guard against future regressions. The PR's tests exercise it indirectly through the issue plugin-formatjs: SWC plugin crashes with Next.js 16.2.1 + React Compiler + @swc/plugin-formatjs #604 repros.

  • resolve_member_expr_value ignores spread. { ...defaults, id: 'x' } followed by a lookup of id works (last-wins reverse iter), but a lookup of any field provided only by ...defaults silently fails. Probably fine for descriptor extraction, but worth a comment.

  • evaluate_static_value_with_depth only handles BinaryOp::Add. Reasonable for descriptor strings, but if users had String(x), template-tag calls, or || fallbacks working under the old Evaluator, they regress. Worth confirming the issue tracker doesn't have existing reports.

  • number_to_js_string may diverge from JS String(n) for exponential notation. Rust's f64::to_string() and JS's Number.prototype.toString() agree on integers and NaN/Infinity, but differ for small numbers (e.g. 1e-7). Unlikely to bite for descriptor IDs, but the function name suggests JS semantics that aren't quite delivered.

  • Cargo.toml reformatting (packages/formatjs/transform/Cargo.toml) changes alignment of every dependency line just to add the swc_ecma_minifier TODO. The diff would be smaller if that block were left alone.

Tests

  • Good fix to the missing await at packages/formatjs/__tests__/wasm.test.ts:211 — the PR notes mention a pre-existing Vitest warning about an unawaited rejection assertion. Worth a quick rg \"expect\\(transformCode\" packages/formatjs/__tests__ to catch any siblings still missing await.

  • Coverage gaps suggested above: function-scoped binding resolution, destructuring fallback (should emit non-evaluable error cleanly), and spread-in-object descriptors.

  • The cyclic test only covers ident → ident cycles. A const a = b.x; const b = { x: a }; member-cycle is what resolve_object_expr_with_depth is really protecting against — would be worth a direct test.

API / Backward compat

  • create_formatjs_visitor silently ignores the evaluator argument. The #[deprecated] note mentions it, but custom-transform consumers built against an older version will see a behavior change (their externally-fed Evaluator state is gone) without any runtime signal beyond the compile-time deprecation warning. Consider mentioning the behavior change — not just the rename — more prominently in the changeset, since users may have been seeding bindings into the evaluator.

Performance / Security

  • No security concerns — the static evaluator only walks literal/binding/member shapes.
  • bindings: HashMap<Id, Box<Expr>> holds clones of every initializer expression. For pathological modules this could be large, but bounded by program size; not a regression vs. the prior approach (which cloned the entire module).

Overall, this is a solid fix. The biggest open question is whether the narrowing of evaluation scope (no destructuring, no imports, no function-scope hoisting, only +) regresses any real-world users vs. the prior Evaluator-backed implementation.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9eeca1c46c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +139 to +143
Prop::Shorthand(ident) => {
let key = ident.sym.to_string();
let ident_expr = Expr::Ident(ident.clone());

(key, visitor.get_message_descriptor_value(&ident_expr))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove shorthand defaultMessage when requested

When a descriptor uses ES shorthand, e.g. const defaultMessage = "Hi"; defineMessage({ defaultMessage }) with removeDefaultMessage: true, this new branch now treats the shorthand as an extractable descriptor. However, the later mutation/removal pass only handles Prop::KeyValue entries for defaultMessage, so the transformed output still contains the shorthand defaultMessage, violating the removeDefaultMessage behavior for the newly supported syntax. Either rewrite/remove shorthand descriptor properties too, or avoid extracting them as descriptors.

Useful? React with 👍 / 👎.

Comment on lines +671 to +674
for prop in obj.props.iter().rev() {
let PropOrSpread::Prop(prop) = prop else {
continue;
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Bail out on spreads before resolving member values

When resolving a member from an object literal, this loop scans properties in reverse but simply skips spreads. For const messages = { hello: "A", ...runtimeMessages }; formatMessage({ defaultMessage: messages.hello }), the spread occurs after hello and can override it at runtime, but the resolver ignores the spread and returns the earlier literal, causing extraction and ID generation for a message that may not be the runtime default. If a spread is encountered before finding the key, the lookup should be treated as non-static unless the spread itself is statically resolved.

Useful? React with 👍 / 👎.

Comment thread packages/formatjs/transform/src/lib.rs Outdated
Comment on lines +777 to +781
_ => Some(StaticValue::Str(format!(
"{}{}",
left.to_js_string(),
right.to_js_string()
))),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve JavaScript addition semantics for non-strings

This fallback treats every + with at least one non-number as string concatenation, but JavaScript only concatenates when either operand becomes a string; otherwise booleans/null are coerced numerically. For example, defaultMessage={"Count " + (1 + true)} should statically evaluate to "Count 2", while this resolver produces "Count 1true", changing the extracted message and generated ID for valid static descriptors that the previous evaluator handled with JS semantics.

Useful? React with 👍 / 👎.

Comment on lines +139 to +143
Prop::Shorthand(ident) => {
let key = ident.sym.to_string();
let ident_expr = Expr::Ident(ident.clone());

(key, visitor.get_message_descriptor_value(&ident_expr))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Ignore unknown shorthand props before resolving them

This new shorthand branch resolves the identifier before checking whether the property is a message descriptor key. As a result, defineMessage({ defaultMessage: "Hi", metadata }) now emits the static-evaluation error when metadata is dynamic or unbound, even though non-descriptor fields should be ignored and shorthand properties were previously skipped entirely. Move the key allow-list check ahead of get_message_descriptor_value for shorthand properties.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented May 30, 2026

Code review for PR #625 — fix(formatjs): avoid hygiene crashes

Nice, focused fix. Removing the swc_ecma_minifier::Evaluator dependency from the WASM plugin entrypoint addresses a real crash class and shrinks the plugin's surface area. The bespoke static evaluator is appropriately scoped to what FormatJS actually needs (string/number/bool/null/BigInt literals, template literals, + concatenation, ident and member resolution) and ships with good test coverage for the new behavior. The pre-existing await bug in wasm.test.ts:211 is a nice catch — without it a rejection assertion could silently miss.

Below are some observations from the diff. Most are nits; a couple are worth a second look.

Correctness

  1. Bin(+) semantics close to but not exactly JS. evaluate_static_value_with_depth for BinaryOp::Add treats BigInt like a string operand only through coercion — BigInt + BigInt falls through to_js_number() returning None and never produces a number result. In practice nobody puts BigInt arithmetic in defaultMessage, but it's worth a comment so the next person doesn't try to "fix" it without realizing JS would actually throw TypeError here. Either be explicit about not supporting BigInt arithmetic, or bail out (returning None) on mixed BigInt/Number to be safe.

  2. Cycle protection is depth-based, not visit-based. The comment on resolve_object_expr_with_depth (transform/src/lib.rs:680) notes the shared depth limit prevents let a = b; let b = a;. That works, but it also caps any chain at MAX_RESOLUTION_DEPTH = 16 references — a perfectly valid deep alias chain would be treated identically to a cycle and silently bail out. For descriptor evaluation 16 is plenty in practice, so this is fine, but a HashSet<Id> of visited bindings would be more precise and free us to raise the cap without hurting cycle behavior.

  3. store_binding "last write wins" with a single carve-out for objects. The rule "don't overwrite an object with a non-object" handles the React Compiler temp pattern, but normal reassignment of an object to a different object still overwrites — only the last assignment is visible to the resolver. Combined with visit_mut_module_items doing a pre-pass and then a normal traversal that re-runs visit_mut_var_declarator, module-level let/const decls get stored twice; for plain values that's idempotent, but for an object reassigned later in the file (legal pattern) the resolver will read the late value when evaluating an earlier call site. That's a minor semantic departure from "what would actually run at that point in the program," but it's probably the pragmatic choice for static extraction. Worth a one-line comment explaining the trade-off.

  4. Silent as_str()? on non-UTF8 strings. Several call sites switched from .expect("non-utf8 string") to ?, which is the right call (don't crash the plugin), but it does mean we silently drop a descriptor field instead of telling the user why. Logging a warning would be friendlier; otherwise the user just sees a missing message and no clue why.

  5. get_message_descriptor_key_from_call_expr doesn't handle computed/numeric keys, while prop_name_to_key does. Not a regression (matches prior behavior) — just worth noting that { ["id"]: ... } is still ignored at the descriptor extraction layer.

Performance / memory

  1. bindings: HashMap<Id, Box<Expr>> clones every binding initializer. For large modules with lots of top-level const declarations, that's a measurable allocation cost up-front in the visit_mut_module_items pre-pass, regardless of whether they're ever referenced from a descriptor. Two thoughts: (a) since you only resolve from this map when an identifier is referenced, you could lazily store on first need; (b) Arc<Expr> instead of Box<Expr> would avoid the additional clones inside resolve_member_expr_value returning Box<Expr>. Probably not worth optimizing pre-emptively, but flag for future profiling if the plugin shows up in build traces.

  2. Pre-pass + traversal both call store_binding for module-level decls. Cheap, but wasteful. You could skip visit_mut_var_declarator for items already handled at the module level, or skip the pre-pass for inner scopes.

Backward compatibility / API

  1. create_formatjs_visitor_without_evaluator is an awkward name. The "without_evaluator" suffix describes what it doesn't take rather than what it does. Given the old function is #[deprecated] and slated for removal (per the TODO comment in transform/Cargo.toml), consider a less awkward name now (e.g. create_formatjs_visitor_v2, or planning to free up create_formatjs_visitor once the old one is gone) so external consumers don't anchor on the long name.

  2. swc_ecma_minifier is still pulled into swc_formatjs_transform just to satisfy the deprecated function signature. The TODO is there, which is good — make sure there's a tracking issue to actually do that cleanup, since the build-time cost of pulling in swc_ecma_minifier doesn't go away until that dep is dropped from the transform crate too.

Tests

  1. Great coverage on the new behavior — the cyclic binding test, the "hidden behind later spreads" test, the TS wrapper stripping test, and the multiple-formatMessage-with-ast test are all good. A couple I'd add if you want to be thorough:

    • A defineMessages({ a: { defaultMessage: id }, b: { defaultMessage: id } }) test exercising the multi-entry path through process_message_object with the new shorthand handling.
    • A nested template literal test: `Hello ${`${name}`}` to lock in the depth recursion through evaluate_template_literal.
    • A + chain like "a" + "b" + "c" + ... long enough to exceed MAX_RESOLUTION_DEPTH and ensure the failure mode (emit non-evaluable error) is what you want.
  2. The await expect(...).rejects.toThrow(...) fix on line 208 of wasm.test.ts is a real bug fix — was that the only unawaited rejection in the file? Worth a quick grep to make sure no other test was silently passing for the wrong reason.

Nit

  1. Tiny: is_object_expr / store_binding use a match with (true, false) => false, _ => true — equivalent to !(existing.is_object && !value.is_object), which reads a bit cleaner. Subjective.

Overall: this is a good fix that addresses the real bug and makes the plugin meaningfully leaner. Approving in spirit; the items above are mostly polish or future cleanup, with #1, #3, and #6 being the ones I'd think hardest about.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb7fa5af24

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +797 to +803
match Self::strip_expr(expr) {
Expr::Lit(Lit::Str(s)) => Some(StaticValue::Str(s.value.as_str()?.to_string())),
Expr::Lit(Lit::Num(n)) => Some(StaticValue::Num(n.value)),
Expr::Lit(Lit::Bool(b)) => Some(StaticValue::Bool(b.value)),
Expr::Lit(Lit::Null(_)) => Some(StaticValue::Null),
Expr::Lit(Lit::BigInt(n)) => Some(StaticValue::BigInt(n.value.to_string())),
Expr::Tpl(tpl) => self
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve unary literals in static messages

When a message contains a negative numeric constant, e.g. defaultMessage={`Step ${-1}`} or defaultMessage: "Step " + -1, SWC represents the -1 as an Expr::Unary, but the replacement evaluator only accepts literal numbers and falls through to None. The previous Evaluator::eval handled these JavaScript constants, so these valid static descriptors now emit the non-evaluable error and are skipped instead of being extracted.

Useful? React with 👍 / 👎.

let expr = self.resolve_member_expr_value(member, depth + 1)?;
self.evaluate_static_value_with_depth(&expr, depth + 1)
}
_ => None,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve static conditional messages

When a descriptor uses a statically decidable conditional, such as defaultMessage: true ? "Enabled" : "Disabled", this evaluator reaches the catch-all because it has no Expr::Cond handling. The previous SWC evaluator could fold constant conditionals, so these valid static descriptors now emit the non-evaluable error and are skipped instead of being transformed/extracted.

Useful? React with 👍 / 👎.

Comment thread packages/formatjs/transform/src/lib.rs Outdated
} else if value == f64::NEG_INFINITY {
"-Infinity".to_string()
} else {
value.to_string()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Match JavaScript number stringification

When numeric values are coerced inside a template or string concatenation, f64::to_string() does not match JavaScript String(number) for values outside JS's fixed-format range, e.g. defaultMessage={`ID ${1e21}`} should extract ID 1e+21 but this produces the expanded decimal form. That changes the extracted default message and generated ID compared with the previous JS-semantics evaluator for otherwise static descriptors.

Useful? React with 👍 / 👎.

Comment thread packages/formatjs/transform/src/lib.rs Outdated
Comment on lines +715 to +717
if self.prop_name_to_key(&kv.key, depth + 1).as_deref() == Some(key.as_str()) {
return Some(kv.value.clone());
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Bail on unknown later object keys

When resolving a member from an object literal, an unresolvable computed property to the right of the target can still override it at runtime, e.g. const messages = { hello: "A", [runtimeKey]: "B" }; formatMessage({ defaultMessage: messages.hello }) should not be treated as static if runtimeKey might be "hello". Here prop_name_to_key() returning None is just considered a non-match and the scan continues to the earlier literal, causing extraction and ID generation for a message that may not be the runtime default.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented May 30, 2026

Code Review

Thanks for tackling #604 — replacing the heavyweight Evaluator with a small static evaluator targeted at FormatJS descriptors looks like the right call, and the added test cases (cycles, spreads, computed keys, TS wrappers, ast: true regressions) are excellent. A few notes below.

Correctness

  1. Breaking change in deprecated create_formatjs_visitor signature. The function is marked #[deprecated] but the return type changed from FormatJSVisitor<'a, C, S> to FormatJSVisitor<C, S>. Downstream custom-transform consumers that name the type with its lifetime parameter (e.g. in a struct field) will fail to compile, not just see a deprecation warning. Two options:

    • Document this as a breaking change in the changeset (it currently reads as additive).
    • Keep a phantom PhantomData<&'a ()> to preserve the lifetime parameter, so the old type signature still type-checks during the deprecation window.
  2. Forward references only resolve at module top level. collect_module_item_bindings runs only over ModuleItems (and only Decl::Var / ExportDecl(Var)). Inside a function body, the binding is only seen when visit_mut_var_declarator is reached during traversal — so function f() { formatMessage({ defaultMessage: MSG }); const MSG = 'x'; } will fail to resolve while the same code at module scope succeeds. The Evaluator-backed version handled this uniformly. Worth either documenting the new limitation or adding a quick hoist pass inside function/block visitors.

  3. Destructured top-level bindings are dropped. collect_var_decl_bindings and visit_mut_var_declarator both gate on Pat::Ident, so const { foo } = bar; defineMessage({ defaultMessage: foo }) won't resolve. Also probably worth a test (either confirming the bail-with-error behavior or supporting at least simple object destructuring).

  4. store_binding rule may silently mask reassignments. The keep-object-on-second-write rule (existing object + new non-object → keep object) is justified for React Compiler temps, but it also means let x = {}; x = 'foo'; defineMessage({ defaultMessage: x.bar }) resolves to the original object. The pattern this protects against (t1 = { ... }; t1 = useMemoCache()) is specific to compiled output — for source code this could be surprising. Worth a brief comment listing the assumed invariant, or scoping the heuristic to identifiers that look like compiler temps (t\d+, _temp, etc.) if that's where the original incident actually came from.

Minor

  1. evaluate_static_value_with_depth doesn't model Expr::Bin ops other than Add. That's reasonable for descriptor values, but -, *, /, comparisons, &&/||/??, and typeof are common in template-expression contexts. Probably fine to leave for follow-up — flagging since the test suite implies broader static-eval support.

  2. to_js_number for Str returns None. +"5" is 5 in JS; \Count ${+'5'}`` won't fold here. Conservative is fine; just noting the gap from spec for future reference.

  3. json_value_from_expr returns Value::Null for non-finite numbers (NaN, Infinity) via Number::from_f64. This preserves the pre-PR behavior — but emitting null into a parsed ICU AST for a numeric option seems like it would silently corrupt extraction. Not introduced by this PR; just worth a follow-up.

  4. evaluate_expr emits the error at the outermost span — so when a deeply nested sub-expression is the actual culprit, the diagnostic points at the whole descriptor value, not the offending node. The previous Evaluator-based code had the same property, so no regression, but a richer error would help users debug the new failure modes (e.g. cyclic bindings).

  5. #[deprecated(note = "...the evaluator argument is ignored")] — small UX nit: callers still have to construct an Evaluator just to pass an ignored argument. Worth mentioning the migration target uses fewer args in the note, e.g. \"use create_formatjs_visitor_without_evaluator (no Evaluator construction required)\".

Performance

  • store_binding and resolve_member_expr_value clone Box<Expr> / ObjectLit on every lookup. For modules with many defineMessage calls against a shared messages object, this is O(props × calls × depth) of cloning. Likely fine in practice, but if profiling ever shows this hot, switching bindings to HashMap<Id, Arc<Expr>> would let lookups share the original AST node.

Tests

Coverage is strong — particularly the cyclic-binding case and the spread/computed-key invalidation tests. Suggestions for filling small gaps:

  • Forward reference inside a function body (currently quietly unsupported, per point 2).
  • Destructured top-level binding (point 3).
  • Reassignment with the React-Compiler heuristic — both the supported case and a let x = {}; x = 'foo' case where the user reasonably expects the second write to win.
  • removeDefaultMessage: true on a JSX <FormattedMessage> (the unit test covers the call-expr form only).

Nits

  • packages/formatjs/transform/Cargo.toml shifts column alignment of every dep on the line above the new ryu-js entry — pure formatting churn that will show up in git blame for unrelated lines. Could be reverted to keep the diff focused.
  • The // TODO: Remove this after the deprecated evaluator-taking visitor constructor is removed. comment on the swc_ecma_minifier dep is helpful; consider tying it to a tracking issue so it doesn't get lost.

Overall this is a solid, well-tested fix. The points above are mostly about clarifying the new semantic surface (what does and doesn't resolve) and the API-compat story for the deprecated entry point. Nice work landing it.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 80ea19072a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/formatjs/transform/src/lib.rs Outdated
Comment on lines +650 to +652
) {
(true, false) => false,
_ => true,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Treat non-object reassignment as dynamic

When a binding starts as an object literal and is reassigned to a non-object before it is used, this branch preserves the stale object instead of updating or invalidating the binding. For example, let messages = { hello: "A" }; messages = getMessages(); formatMessage({ defaultMessage: messages.hello }) will still extract/hash "A", even though the runtime value is dynamic (and may not even be an object). The special React-compiler case needs to be narrower, or non-object reassignments should make later member resolution fail.

Useful? React with 👍 / 👎.

Comment on lines +1534 to +1535
for item in items.iter() {
self.collect_module_item_bindings(item);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve statement order for var redeclarations

Pre-collecting every module-level var binding before visiting statements lets later valid var initializers overwrite earlier uses. In var MSG = "A"; formatMessage({ defaultMessage: MSG }); var MSG = "B";, both declarations share the same binding, so the prepass leaves MSG as "B" before the formatMessage call is processed, causing extraction/ID generation for the wrong message even though runtime uses "A" at that point.

Useful? React with 👍 / 👎.

Comment thread packages/formatjs/transform/src/lib.rs Outdated
Comment on lines +1310 to +1311
if let AssignTarget::Simple(SimpleAssignTarget::Ident(ident)) = &assign_expr.left {
let variable_id = ident.id.to_id();

// Check if we already have a binding for this variable
let should_update = match self.resolve_identifier(ident) {
Some(existing_expr) => {
// Only overwrite if the new expression is an object literal
// and the existing one is not, or if both are object literals
match (existing_expr, &*assign_expr.right) {
(Expr::Object(_), Expr::Object(_)) => true, // Both objects, update
(_, Expr::Object(_)) => true, /* New is object, existing */
// is not, update
(Expr::Object(_), _) => false, /* Existing is object, new is not, don't */
// update
_ => true, // Neither is object, update
}
}
None => true, // No existing binding, always update
};

if should_update {
self.evaluator.store(variable_id, &assign_expr.right);
}
self.store_binding(ident.id.to_id(), &assign_expr.right);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Invalidate cached objects on member writes

Because the new resolver folds member reads out of cached object literals, ignoring member assignments leaves stale values in bindings. For example, after const messages = { hello: "A" }; messages.hello = getRuntimeMessage(); formatMessage({ defaultMessage: messages.hello }), this branch does nothing for the messages.hello = ... write, so the later lookup still extracts and hashes "A" even though the runtime message is dynamic or different.

Useful? React with 👍 / 👎.

Comment on lines +1295 to +1300
fn visit_mut_var_declarator(&mut self, declarator: &mut VarDeclarator) {
declarator.visit_mut_children_with(self);

if let Pat::Ident(ident) = &declarator.name {
if let Some(init) = &declarator.init {
self.store_binding(ident.id.to_id(), init);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Precollect bindings inside function bodies

The module prepass fixes later top-level constants, but function/block scopes still rely on normal traversal because visit_mut_var_declarator stores a binding only after visiting the initializer's children. In valid code such as const render = () => <FormattedMessage defaultMessage={MSG} />; const MSG = "Hi"; return render();, the arrow body is visited before MSG is stored, so extraction emits the non-evaluable error even though the binding is initialized before the closure runs. This is fresh evidence beyond the earlier top-level later-constant issue: only visit_mut_module_items precollects declarations, not nested bodies.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 833fac9f96

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1371 to +1372
if let Some(id) = Self::member_base_ident(member) {
self.invalidate_binding(id);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Invalidate aliases when member writes mutate shared objects

When an object binding has been aliased, invalidating only the syntactic base of the member write leaves other bindings pointing at the same object stale. For example, const messages = { hello: "A" }; const alias = messages; alias.hello = getMessage(); formatMessage({ defaultMessage: messages.hello }) still resolves messages.hello from the cached object and extracts/hashes "A", even though the runtime value was mutated through alias. The new member-resolution cache should either invalidate aliases that resolve to the same object or stop treating aliased object member reads as static after a member write.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented May 30, 2026

Review

Nice fix for the hygiene crash — replacing swc_ecma_minifier::Evaluator (which cloned the whole module and ran resolver-dependent logic) with a small purpose-built static evaluator over the same AST is the right move. The diff is mostly self-contained, well-tested (15+ new cases), and the behavior changes are deliberate. A few notes below.

What I like

  • The early bailout if !matches!(key.as_str(), "id" | "defaultMessage" | "description") in JSXAttrOrSpread::get_key_value_with_visitor plus is_message_descriptor_key for PropOrSpread is exactly the right place to draw the line — values, fallback, etc. no longer get fed through the evaluator. This is the actual fix for the spurious "must be statically evaluate-able" errors on unrelated JSX/TS.
  • MAX_RESOLUTION_DEPTH = 16 shared across evaluate_static_value_with_depth, resolve_member_expr_value, json_value_from_expr_with_depth, and evaluate_template_literal doubles as a cycle guard (covered by the cyclic-binding test) and a stack-overflow guard. Good single knob.
  • is_react_compiler_cache_read is a tight, syntactic match ($[<num-lit>]) — the existing should generate same id after react compiler optimizations test still works because the t1 = $[0] reassignment is recognized and the prior object binding is preserved.
  • ryu_js is the right choice for String(1e21) === "1e+21" semantics rather than format!("{}", value).
  • The #[deprecated] on create_formatjs_visitor with the _evaluator underscore parameter is good API hygiene.

Issues and suggestions

1. Duplicated descriptor-key list. transform/src/lib.rs:86 has an inline matches!(key.as_str(), "id" | "defaultMessage" | "description") that duplicates is_message_descriptor_key. Worth swapping the inline check to call the helper so the canonical key set lives in one place.

2. Code duplication between Prop::KeyValue and Prop::Shorthand arms in evaluate_message_descriptor (lines ~1227–1320). The id/description/defaultMessage handling is repeated nearly verbatim. Could be factored into a small helper that takes (key, original_prop) and returns the replacement PropOrSpread. Not a blocker but it'll bitrot fast if a new descriptor key is added.

3. evaluate_static_string_with_depth checks depth and then calls evaluate_static_value_with_depth(expr, depth) (note: same depth, not depth + 1). Each layer of resolution should bump depth exactly once. Right now the redundant check at the string-level wrapper just consumes a level for free and the value-level check does the real work — works correctly, but evaluate_static_value_with_depth is the only check you need. Removing the wrapper's check would simplify the invariant.

4. Behavior change worth flagging in the changeset. The old visit_mut_assign_expr preserved the previous Object binding when the new value was non-Object (should_update = false). The new code invalidates the binding (marks Dynamic) unless the new value is a $[N] cache read. This is more correct (it's what should treat non-object reassignment as dynamic checks) but it's a real behavior change for downstream consumers — the changeset only mentions "avoid hygiene proxy crashes" and "ignores its evaluator argument," not that let x = {…}; x = f() now blocks static extraction where it previously didn't.

5. Conditional/branch assignments give arbitrary results. With let MSG; if (cond) { MSG = "A"; } else { MSG = "B"; } formatMessage({ defaultMessage: MSG });, whichever branch is visited last wins and gets extracted as the message. Not a regression (the old evaluator had its own version of this), but it means the static extractor confidently extracts a value that may not match runtime in either branch. Worth a comment near store_binding so the next person doesn't think the visitor models control flow.

6. var semantics are source-order, not hoisting-order. var MSG = "First"; formatMessage(...); var MSG = "Second"; works because the call sits between the two assignments, but formatMessage({defaultMessage: MSG}); var MSG = "x"; will now emit "must be statically evaluate-able" rather than recognizing the hoisted (but uninitialized) binding. Probably acceptable — flagging in case any internal consumer relies on hoisted-var extraction.

7. swc_ecma_minifier is now a dead-weight dep, kept only so the deprecated create_formatjs_visitor signature compiles. The TODO comment in transform/Cargo.toml acknowledges this; would be worth filing a follow-up issue to drop it in the next breaking release of swc_formatjs_transform (the crate is already at v31 so a v32 with the removal is cheap).

8. Error span quality. evaluate_expr always emits at expr.span() of the top-level descriptor value. When the failure is several hops deep (e.g. messages.hello where messages was reassigned), the diagnostic points at messages.hello rather than the assignment that invalidated it. Not blocking, but if you have spare time the user-facing error gets nicer if evaluate_static_value_with_depth can return a span for the offending sub-expression.

9. Minor: Prop::Assign in resolve_member_expr_value is only producible by destructuring patterns and shouldn't appear in plain object literals reached from resolve_object_expr. The handling is fine, but a one-line comment explaining "defensive — Prop::Assign only appears in patterns" would help future readers not wonder if there's a case they're missing.

Correctness spot-checks I did

  • should not resolve member values hidden behind later spreads: reverse-iteration sees Spread first → returns None. ✓
  • should not resolve member values hidden behind unknown computed keys: prop_name_to_key on [runtimeKey] returns None because globalThis.key has no binding → Some(_) => {} path doesn't apply, None => return None does. ✓
  • React Compiler cache pattern t1 = $[0] after t1 = {…}: is_react_compiler_cache_read matches, store_binding returns early without invalidation. ✓
  • Cyclic let a = b; let b = a;: alternates until depth 16, returns None, error emitted. ✓
  • messages.hello = foo: member_base_ident finds messages (recurses through nested member accesses), invalidates entire object binding. ✓

Security / performance

  • No code execution; purely static AST walking with bounded depth.
  • Memory: every let/const RHS is cloned and stored in bindings. For typical formatjs source files this is negligible; for a generated file with thousands of top-level constants it could matter. Not worth optimizing preemptively.

Overall: 👍 ready to merge once the changeset is updated to call out the assign-invalidation behavior change. Items 1–3 and 9 are nice-to-haves that could be folded in here or as a follow-up.

@kdy1 kdy1 merged commit c5bd878 into main May 30, 2026
13 checks passed
@kdy1 kdy1 deleted the kdy1/fix-formatjs-issue-604 branch May 30, 2026 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

plugin-formatjs: SWC plugin crashes with Next.js 16.2.1 + React Compiler + @swc/plugin-formatjs

1 participant