fix(handler)!: drop dist/handler.js to support ESM Lambda handlers by zarirhamza · Pull Request #784 · DataDog/datadog-lambda-js

zarirhamza · 2026-06-09T14:20:18Z

What

Fixes #782. Removes dist/handler.js from the published npm tarball; Lambda's resolver falls through to dist/handler.mjs, whose async load() handles both CJS and ESM user modules. Aligns the npm package with the layer (which has only ever shipped handler.mjs).

Breaking change

Container-image deployments require aws-lambda-ric >= 3.0.0 (June 2023). AWS-published public.ecr.aws/lambda/nodejs:N base images include it since mid-2023; rebuild stale custom images if needed. Older RIC versions surface as Cannot find module 'handler'. Historical context: dist/handler.js was originally added in #307 to work around exactly this in early RIC.

Why not the shim (earlier commits in this PR)

Initial commits added a CJS shim that detected ESM at runtime and dynamically imported handler.mjs. Removed after review: the shim defers tracer + user-handler init from Lambda's init phase into the first invocation, breaking Provisioned Concurrency and SnapStart. handler.mjs's top-level await pre-pays that cost. Aligns with #697 and the reporter's suggested fix in #782.

Tests

Existing esm_node{18,20,22,24} tests (layer path) unchanged
New container-{cjs,esm}_node{18,20,22,24} tests exercise dist/handler.handler through AWS-stock RIC

Types of Changes

Bug fix
New feature
Breaking change
Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

This PR's description is comprehensive
This PR contains breaking changes that are documented in the description
This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
This PR impacts documentation, and it has been updated (or a ticket has been logged)
This PR's changes are covered by the automated tests
This PR collects user input/sensitive content into Datadog
This PR passes the integration tests (ask a Datadog member to run the tests)

datadog-prod-us1-3 · 2026-06-09T14:27:52Z

✨ Fix all issues with BitsAI

⚠️ Warnings

🚦 1 Pipeline job failed

DataDog/datadog-lambda-js | integration test (node24)

Useful? React with 👍 / 👎

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 91f793b | Docs | Datadog PR Page | Give us feedback!}

zarirhamza · 2026-06-09T14:43:28Z

First push regressed the esm_node* integration tests on all four runtimes — error was Runtime.ImportModuleError: Cannot find module 'esm' through handler.cjs. Root cause: the Dockerfile copies dist/ wholesale into the layer, so the new dist/handler.js shim ended up on the layer too, and Lambda's bootstrap resolved handler.handler to it instead of falling through to handler.mjs.

Inside the layer, the shim's userIsEsm() check returns false for the integration test's esm_node function (handler string is esm.handle with no .mjs suffix, and integration_tests/package.json has no type: "module"), so it delegated to handler.cjs → loadSync → _tryRequireSync, which doesn't try .mjs.

The layer has never shipped a handler.js (it was only injected into the npm tarball by the publish-script cp), and Lambda's resolver fall-through to handler.mjs already handles both CJS and ESM via the async load(). Second commit (270a846) drops the shim from the layer via RUN rm so the layer keeps that behavior unchanged from main. Shim still ships in the npm package, which is where it's actually needed.

duncanista

Thanks for tackling this — the bug is real and the runtime-branching shim is a reasonable approach. A few thoughts on the structure and on coverage that I think are worth addressing before merge. Nothing blocking the direction, just some cleanup that would make the file layout easier to reason about for the next person who touches it.

- Rename `userIsEsm()` to `shouldUseEsmHandler()`. The function returns a routing decision based on heuristics (env regex + package.json sniff), not an authoritative fact about the user's code, so the new name reads more honestly at the call site. - Inline the former `src/handler.cjs` body into the CJS branch of the shim and delete `src/handler.cjs`. Lambda's resolver doesn't auto-resolve `.cjs`, so handler.cjs was only ever reachable via this shim's `require("./handler.cjs")`. Keeping it as a separate file made the load graph confusing — future readers grepping for "what loads handler.cjs?" would find one caller in the same directory. The heavy `require()`s remain scoped to the CJS branch so ESM users don't pay the cost of pulling in the full tracer at require-time. - Add `src/handler.spec.ts` covering shouldUseEsmHandler's four-case matrix (.mjs in _HANDLER / DD_LAMBDA_HANDLER, package.json type=module, package.json without type field, package.json missing, package.json unparseable). Detection is the load-bearing piece — a regression there means either ESM users keep seeing ERR_REQUIRE_ESM or CJS users pay an unnecessary async-import cold-start cost. - Tighten `scripts/update_dist_version.sh`'s `cp src/handler.*` to list the runtime artifacts explicitly. The broader glob also picked up `src/handler.spec.ts`, which is a test fixture, not a runtime file — shipping it would also be silently inconsistent with the Dockerfile's hard-coded `rm /nodejs/.../handler.js`. - Expand the Dockerfile comment to point at `scripts/update_dist_version.sh` as the source of the file being removed, so a future change to how `src/handler.js` is copied into `dist/` is visibly tied to the layer-side `rm`.

DarcyRaynerDD · 2026-06-11T14:47:28Z

I think I wrote the original version of the handler.cjs/esm module split a few years ago, so maybe this context would help.

So, much of this logic was due to whether the node runtime supports ESM modules and top level await, (at the time we supported node 12 which didn't). Top level await is only supported in ESM, and is needed when dynamic importing ESM modules so we can receive the handler function instead of a promise, and export the result. This isn't possible in CJS, while you can import ESM modules dynamically, they end up as promises and if you want to await that promise you have to do it within an async function. ESM doesn't have any problems requiring CJS modules dynamically though. So, when the customer is using the layer, and a runtime >= 14, it makes sense to just use the ESM handler since it should handle all cases in theory. Since older runtimes are completely deprecated now, we should only bundle the ESM handler into the layer. I haven't looked at the support case to check if this reasoning was solid, just what I remember verifying at the time, and I know that was the previous attempted approach, but I'm confused how it could have broken CJS users.

The second case is when installing datadog-lambda-js via npm. Since we don't know which module format the customer is using, we have to package both versions. I believe I did some experimentation to find which files would be auto-imported based on their file extension. I think you had to specify the .mjs extension for it to work for esm. Typically though, a customer using the npm module could just import the datadog-lambda-js library and wrap manually if they ran into an issue.

zarirhamza · 2026-06-11T15:06:44Z

@DarcyRaynerDD thanks for the historical context — it's exactly what was missing. You're right on both counts.

I switched the PR to do what you suggested in commit ee53944: handler.mjs is now the sole Lambda entry point in both the npm tarball and the layer. The shim is gone, src/handler.cjs is gone, and dist/ ships a single handler file. The diff against main collapses down to ~6 source lines plus removal of the dead files.

Re your "I'm confused how it could have broken CJS users" point — I went back through this carefully and I don't think it would have. Lambda's bootstrap resolves dist/handler.handler to handler.mjs once .js is absent, and handler.mjs's async load() already handles both CJS and ESM user modules transparently. The earlier attempt (PR #697) seems to have been closed without a recorded technical reason; I suspect it was process drift rather than a real failure. The integration test suite's esm_node{18,20,22,24} jobs cover exactly this code path against a real ESM user function, and they pass on this branch.

I had originally added a runtime-branching shim to avoid making CJS users on the npm-redirect path pay an extra ~10-30ms of init time (sync loadSync vs. async await load()). On reflection that's noise relative to a Datadog Lambda's normal init cost, and the shim's deferred-load behavior actually penalized ESM users on first invocation by 300-800ms — including PC and SnapStart customers, where init work is supposed to be pre-paid. Net loss for the customers this PR exists to help, so it's gone.

Description updated. Welcome any further notes.

Copilot

Pull request overview

Removes the CommonJS handler artifact path so AWS Lambda resolves the Datadog handler redirect to the ESM handler.mjs, avoiding ESM user-handler load failures caused by dist/handler.js being preferred by the bootstrap resolver.

Changes:

Delete src/handler.cjs so there’s no longer a CJS handler source to ship/copy.
Update scripts/update_dist_version.sh to copy only src/handler.mjs into dist/.
Remove publishing-time steps that copied dist/handler.cjs to dist/handler.js.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
`src/handler.cjs`	Removes the CJS handler implementation so it can’t be shipped as an entry point.
`scripts/update_dist_version.sh`	Restricts post-build copying to `handler.mjs` as the intended Lambda entry point.
`scripts/publish_prod.sh`	Stops injecting `dist/handler.js` during prod publishing.
`.gitlab/scripts/publish_npm.sh`	Stops injecting `dist/handler.js` during CI npm publishing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

lucaspimentel · 2026-06-11T18:45:50Z

According to 🤖

loadSync is now dead code after this PR

Deleting src/handler.cjs removes the only consumer of loadSync, leaving an orphaned synchronous loader chain in src/runtime/user-function.ts:

loadSync (line 317) — its only callers were handler.cjs (the traceExtractor load and the wrapped-handler load); both are gone with the file.
_loadUserAppSync (line 224) — only called by loadSync.
_tryRequireSync (line 164) — only called by _loadUserAppSync.

It's also not reachable through the package's public entry point: src/index.ts imports only subscribeToDC from ./runtime, so loadSync was never exposed to external consumers despite the export { load, loadSync } re-export in src/runtime/index.ts:1. No tests reference it either.

Since this PR is what orphaned it, it'd be clean to remove the three functions (user-function.ts:164-196, 224-242, 317-342) and drop loadSync from the runtime/index.ts re-export in the same change. The async load / _loadUserApp / _tryRequire path remains the sole live loader, which matches the PR's goal of a single ESM entry point.

…rtifacts Two follow-ups from PR #784 review: 1. Remove `loadSync`, `_loadUserAppSync`, `_tryRequireSync`. Deleting `src/handler.cjs` in the previous commit removed the only consumers of the synchronous loader chain. `loadSync` was never re-exported from the public `src/index.ts` entry point, no tests reference it, and the only internal import (`src/handler.mjs`) uses the async `load()`. The three functions and the `loadSync` re-export in `src/runtime/index.ts` are now dead code, so drop them. 2. Defensive cleanup in `scripts/update_dist_version.sh`: explicitly `rm -f dist/handler.js dist/handler.cjs` before copying handler.mjs. `tsc` does not clean stale files in `dist/` between incremental builds, so a leftover `handler.js` from a prior build of an earlier ref could resurface and silently re-introduce the resolver bug this PR is fixing. The publish scripts already do `rm -rf ./dist` before `yarn build`, so this is purely a guard for local dev builds.

zarirhamza · 2026-06-11T18:56:10Z

@lucaspimentel good catch — verified and addressed in 3aa51e2:

loadSync only ever had two callers, both in handler.cjs (the traceExtractor load and the wrapped-handler load). Gone.
_loadUserAppSync only called by loadSync. Gone.
_tryRequireSync only called by _loadUserAppSync. Gone.
Confirmed not in the public API: src/index.ts only imports subscribeToDC from ./runtime, and no tests reference loadSync.

Removed all three functions from src/runtime/user-function.ts and dropped loadSync from the src/runtime/index.ts re-export in the same chunk. Build + lint + tests pass; the two nock.isDone() flakes in src/index.spec.ts reproduce on plain origin/main and are unrelated.

Net diff: -105 lines from user-function.ts, -1 char from runtime/index.ts.

joeyzhao2018

👍

AWS Lambda's CJS resolver picks `dist/handler.js` before `dist/handler.mjs` when the handler string is `node_modules/datadog-lambda-js/dist/handler.handler`, causing ESM user functions to fail with ERR_REQUIRE_ESM (or the surface "Cannot find module" variant when _tryRequireSync swallows it). Today `dist/handler.js` ships as a verbatim copy of `dist/handler.cjs` (via `cp ./dist/handler.cjs ./dist/handler.js` in the publish scripts), so it can never load ESM user code. This replaces the copy with a real `src/handler.js` shim that: - Detects ESM user code via the task root `package.json` `type` field or a `.mjs` `DD_LAMBDA_HANDLER` / `_HANDLER`. - For CJS user code: synchronously delegates to `handler.cjs`, preserving the prior fast path (no cold-start regression). - For ESM user code: exposes an async handler that lazily `import()`s `handler.mjs` on the first invocation. `handler.mjs`'s async `load()` transparently handles both CJS and ESM user modules. `scripts/update_dist_version.sh` already copies `src/handler.*` into `dist/`, so the shim ships automatically; the explicit `cp ./dist/handler.cjs ./dist/handler.js` lines in `publish_npm.sh` and `publish_prod.sh` are no longer needed and are removed. Refs: #782 Refs: SLES-2888, SLES-2895, SVLS-9238

The Dockerfile copies `dist/` wholesale into the layer, so the new `dist/handler.js` shim would get included and Lambda's bootstrap would resolve `handler.handler` to it. Inside the layer that goes through the shim's CJS branch -> handler.cjs -> _tryRequireSync, which can't load `.mjs` user modules (this regressed the `esm_node*` integration tests). The layer has never shipped a `handler.js`; Lambda's resolver falls through to `handler.mjs`, whose async `load()` handles both CJS and ESM user code. Drop the shim from the layer to preserve that behavior. The shim is still useful in the npm tarball, where the publish scripts used to copy `handler.cjs` to `handler.js`.

- Rename `userIsEsm()` to `shouldUseEsmHandler()`. The function returns a routing decision based on heuristics (env regex + package.json sniff), not an authoritative fact about the user's code, so the new name reads more honestly at the call site. - Inline the former `src/handler.cjs` body into the CJS branch of the shim and delete `src/handler.cjs`. Lambda's resolver doesn't auto-resolve `.cjs`, so handler.cjs was only ever reachable via this shim's `require("./handler.cjs")`. Keeping it as a separate file made the load graph confusing — future readers grepping for "what loads handler.cjs?" would find one caller in the same directory. The heavy `require()`s remain scoped to the CJS branch so ESM users don't pay the cost of pulling in the full tracer at require-time. - Add `src/handler.spec.ts` covering shouldUseEsmHandler's four-case matrix (.mjs in _HANDLER / DD_LAMBDA_HANDLER, package.json type=module, package.json without type field, package.json missing, package.json unparseable). Detection is the load-bearing piece — a regression there means either ESM users keep seeing ERR_REQUIRE_ESM or CJS users pay an unnecessary async-import cold-start cost. - Tighten `scripts/update_dist_version.sh`'s `cp src/handler.*` to list the runtime artifacts explicitly. The broader glob also picked up `src/handler.spec.ts`, which is a test fixture, not a runtime file — shipping it would also be silently inconsistent with the Dockerfile's hard-coded `rm /nodejs/.../handler.js`. - Expand the Dockerfile comment to point at `scripts/update_dist_version.sh` as the source of the file being removed, so a future change to how `src/handler.js` is copied into `dist/` is visibly tied to the layer-side `rm`.

After feedback from Darcy Rayner (original handler.cjs/mjs author) and re-evaluating the cold-start trade-offs, the shim is the wrong shape. The shim's CJS branch saved ~10-30ms of cold-start init for CJS users on the npm-redirect path, but its ESM branch moved 300-800ms of tracer/handler load work from Lambda's init phase into the first invocation. That's worse in three ways for ESM users (the cohort this PR exists to fix): 1. First-invoke timeout risk on functions with tight timeouts — the loading work now eats into the invocation's time budget. 2. Provisioned concurrency is wasted — work deferred past init isn't pre-warmed, so PC customers pay for warm capacity and still hit cold-start latency on the first real request. 3. Lambda SnapStart is wasted for the same reason — snapshot is taken after init, and the shim's lazy `import()` runs post-snapshot. handler.mjs's async `load()` already handles both CJS and ESM user modules transparently. Lambda's bootstrap resolves `dist/handler.handler` to handler.mjs once `.js` is absent, so removing the shim is sufficient and matches the behavior the published layer has always had. End state: - Sole Lambda entry point everywhere: `handler.mjs`. - The npm tarball and the layer ship the same `dist/` (one handler file). - No detection heuristics, no shim, no `handler.cjs` (already gone). Aligns with: - Darcy Rayner's recommendation on the PR thread to "only bundle the ESM handler" since all supported runtimes (Node 18+) support top-level await in ESM. - The customer's own suggested fix in #782. - SLES-2888's confirmed-working internal workaround (`rm dist/handler.js` post-install). - The intent of the earlier PR #697.

…rtifacts Two follow-ups from PR #784 review: 1. Remove `loadSync`, `_loadUserAppSync`, `_tryRequireSync`. Deleting `src/handler.cjs` in the previous commit removed the only consumers of the synchronous loader chain. `loadSync` was never re-exported from the public `src/index.ts` entry point, no tests reference it, and the only internal import (`src/handler.mjs`) uses the async `load()`. The three functions and the `loadSync` re-export in `src/runtime/index.ts` are now dead code, so drop them. 2. Defensive cleanup in `scripts/update_dist_version.sh`: explicitly `rm -f dist/handler.js dist/handler.cjs` before copying handler.mjs. `tsc` does not clean stale files in `dist/` between incremental builds, so a leftover `handler.js` from a prior build of an earlier ref could resurface and silently re-introduce the resolver bug this PR is fixing. The publish scripts already do `rm -rf ./dist` before `yarn build`, so this is purely a guard for local dev builds.

Add container-image integration tests that exercise the npm-redirect path through the AWS-published `public.ecr.aws/lambda/nodejs:N` base image, with both CJS and ESM user handlers. Closes the gap from the existing `esm_node*` tests, which only cover the layer path (`/opt/nodejs/node_modules/datadog-lambda-js/handler.handler`) where only `handler.mjs` has ever been shipped. These tests would fail under any future regression that re-introduces a CJS-shadowing artifact at `dist/handler.js`, including the original #305 scenario for stock AWS RIC. - integration_tests/container/{cjs,esm}/ — Dockerfiles + handlers - serverless.yml — provider.ecr.images + container-{cjs,esm}_node funcs - run_integration_tests.sh — pack local datadog-lambda-js before deploy, thread NODE_MAJOR for ECR build args, add handlers to snapshot loop

Three fixes needed to get the container-image integration tests deploying and invoking against real AWS-stock RIC: - Container Dockerfiles previously installed only the datadog-lambda-js tarball, which doesn't pull dd-trace because it's a devDep of the package. Add dd-trace as a direct dependency in each container test's package.json, pinned to 5.105.0 to match the version yarn.lock resolves for the layer build. - Run `npm install --omit=dev` before installing the tarball so package.json deps are picked up. - Export BUILDX_NO_DEFAULT_ATTESTATIONS=1 in run_integration_tests.sh. Modern Docker buildx attaches provenance attestations by default, which produce OCI index manifests that AWS Lambda rejects with "image manifest ... media type ... not supported." Disabling the default attestations makes buildx emit Docker v2 manifests Lambda accepts. Plus baseline snapshots for `container-{cjs,esm}_node{18,20,22,24}` across 9 input events (72 return-value + 8 log files). Each container handler returns `{"message":"hello, dog!"}` end-to-end through `dist/handler.handler`, proving CJS and ESM user code load correctly on AWS-stock RIC for every supported Node major.

`yarn build` runs `tsc` which needs the root devDeps (@types/node, @types/aws-lambda, typescript, etc.). In CI the script is invoked without BUILD_LAYERS=true, so the layer-build Docker path that would otherwise hide the host install isn't reached, and tsc blew up with ~150 missing-type errors before getting to npm pack. Add `yarn install --frozen-lockfile` before the build so the host node_modules is populated regardless of which path the caller took.

The `arch:amd64` tag only provides the docker CLI; the daemon isn't running, so the container-image integration tests blow up with "Cannot connect to the Docker daemon at unix:///var/run/docker.sock" when serverless tries to build the ECR images. Switch to `docker-in-docker:amd64`, matching the pattern used by serverless-self-monitoring's `.deploy-lod-base` job for Docker- dependent steps. Other jobs (lint, unit test, layer build, etc.) don't need a daemon and stay on the cheaper plain runner.

zarirhamza marked this pull request as ready for review June 9, 2026 15:19

zarirhamza requested review from a team as code owners June 9, 2026 15:19

zarirhamza requested a review from lym953 June 9, 2026 15:19

duncanista reviewed Jun 10, 2026

View reviewed changes

Comment thread src/handler.js Outdated

Comment thread src/handler.js Outdated

Comment thread src/handler.js Outdated

Comment thread Dockerfile Outdated

zarirhamza changed the title ~~fix: route ESM Lambdas through handler.mjs via CJS shim~~ fix: ship handler.mjs as the sole Lambda entry point Jun 11, 2026

lucaspimentel requested a review from Copilot June 11, 2026 18:37

Copilot started reviewing on behalf of lucaspimentel June 11, 2026 18:38 View session

Copilot AI reviewed Jun 11, 2026

View reviewed changes

Comment thread scripts/update_dist_version.sh

samchungy mentioned this pull request Jun 14, 2026

Revert Lambda ESM changes seek-oss/skuba#2456

Closed

joeyzhao2018 approved these changes Jun 16, 2026

View reviewed changes

zarirhamza and others added 6 commits June 18, 2026 12:56

duncanista force-pushed the zarir/sles-2888-esm-handler-shim branch from 5bbffbd to 45e0613 Compare June 18, 2026 16:57

duncanista changed the title ~~fix: ship handler.mjs as the sole Lambda entry point~~ fix(handler)!: drop dist/handler.js to support ESM Lambda handlers Jun 18, 2026

duncanista added 3 commits June 18, 2026 14:14

duncanista approved these changes Jun 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(handler)!: drop dist/handler.js to support ESM Lambda handlers#784

fix(handler)!: drop dist/handler.js to support ESM Lambda handlers#784
zarirhamza wants to merge 9 commits into
mainfrom
zarir/sles-2888-esm-handler-shim

zarirhamza commented Jun 9, 2026 •

edited by duncanista

Loading

Uh oh!

datadog-prod-us1-3 Bot commented Jun 9, 2026 •

edited by datadog-prod-us1-6 Bot

Loading

Uh oh!

zarirhamza commented Jun 9, 2026

Uh oh!

duncanista left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DarcyRaynerDD commented Jun 11, 2026

Uh oh!

zarirhamza commented Jun 11, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

lucaspimentel commented Jun 11, 2026 •

edited

Loading

Uh oh!

zarirhamza commented Jun 11, 2026

Uh oh!

joeyzhao2018 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Conversation

zarirhamza commented Jun 9, 2026 • edited by duncanista Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Breaking change

Why not the shim (earlier commits in this PR)

Tests

Types of Changes

Check all that apply

Uh oh!

datadog-prod-us1-3 Bot commented Jun 9, 2026 • edited by datadog-prod-us1-6 Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Warnings

Uh oh!

zarirhamza commented Jun 9, 2026

Uh oh!

duncanista left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DarcyRaynerDD commented Jun 11, 2026

Uh oh!

zarirhamza commented Jun 11, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

lucaspimentel commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zarirhamza commented Jun 11, 2026

Uh oh!

joeyzhao2018 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

zarirhamza commented Jun 9, 2026 •

edited by duncanista

Loading

datadog-prod-us1-3 Bot commented Jun 9, 2026 •

edited by datadog-prod-us1-6 Bot

Loading

lucaspimentel commented Jun 11, 2026 •

edited

Loading