fix(handler)!: drop dist/handler.js to support ESM Lambda handlers#784
fix(handler)!: drop dist/handler.js to support ESM Lambda handlers#784zarirhamza wants to merge 9 commits into
Conversation
|
|
First push regressed the Inside the layer, the shim's The layer has never shipped a |
duncanista
left a comment
There was a problem hiding this comment.
Thanks for tackling this — the bug is real and the runtime-branching shim is a reasonable approach. A few thoughts on the structure and on coverage that I think are worth addressing before merge. Nothing blocking the direction, just some cleanup that would make the file layout easier to reason about for the next person who touches it.
- Rename `userIsEsm()` to `shouldUseEsmHandler()`. The function returns a
routing decision based on heuristics (env regex + package.json sniff),
not an authoritative fact about the user's code, so the new name reads
more honestly at the call site.
- Inline the former `src/handler.cjs` body into the CJS branch of the
shim and delete `src/handler.cjs`. Lambda's resolver doesn't auto-resolve
`.cjs`, so handler.cjs was only ever reachable via this shim's
`require("./handler.cjs")`. Keeping it as a separate file made the load
graph confusing — future readers grepping for "what loads handler.cjs?"
would find one caller in the same directory. The heavy `require()`s
remain scoped to the CJS branch so ESM users don't pay the cost of
pulling in the full tracer at require-time.
- Add `src/handler.spec.ts` covering shouldUseEsmHandler's four-case
matrix (.mjs in _HANDLER / DD_LAMBDA_HANDLER, package.json type=module,
package.json without type field, package.json missing, package.json
unparseable). Detection is the load-bearing piece — a regression there
means either ESM users keep seeing ERR_REQUIRE_ESM or CJS users pay an
unnecessary async-import cold-start cost.
- Tighten `scripts/update_dist_version.sh`'s `cp src/handler.*` to list
the runtime artifacts explicitly. The broader glob also picked up
`src/handler.spec.ts`, which is a test fixture, not a runtime file —
shipping it would also be silently inconsistent with the Dockerfile's
hard-coded `rm /nodejs/.../handler.js`.
- Expand the Dockerfile comment to point at
`scripts/update_dist_version.sh` as the source of the file being
removed, so a future change to how `src/handler.js` is copied into
`dist/` is visibly tied to the layer-side `rm`.
|
I think I wrote the original version of the handler.cjs/esm module split a few years ago, so maybe this context would help. So, much of this logic was due to whether the node runtime supports ESM modules and top level await, (at the time we supported node 12 which didn't). Top level await is only supported in ESM, and is needed when dynamic importing ESM modules so we can receive the handler function instead of a promise, and export the result. This isn't possible in CJS, while you can import ESM modules dynamically, they end up as promises and if you want to await that promise you have to do it within an async function. ESM doesn't have any problems requiring CJS modules dynamically though. So, when the customer is using the layer, and a runtime >= 14, it makes sense to just use the ESM handler since it should handle all cases in theory. Since older runtimes are completely deprecated now, we should only bundle the ESM handler into the layer. I haven't looked at the support case to check if this reasoning was solid, just what I remember verifying at the time, and I know that was the previous attempted approach, but I'm confused how it could have broken CJS users. The second case is when installing datadog-lambda-js via npm. Since we don't know which module format the customer is using, we have to package both versions. I believe I did some experimentation to find which files would be auto-imported based on their file extension. I think you had to specify the .mjs extension for it to work for esm. Typically though, a customer using the npm module could just import the datadog-lambda-js library and wrap manually if they ran into an issue. |
|
@DarcyRaynerDD thanks for the historical context — it's exactly what was missing. You're right on both counts. I switched the PR to do what you suggested in commit Re your "I'm confused how it could have broken CJS users" point — I went back through this carefully and I don't think it would have. Lambda's bootstrap resolves I had originally added a runtime-branching shim to avoid making CJS users on the npm-redirect path pay an extra ~10-30ms of init time (sync Description updated. Welcome any further notes. |
There was a problem hiding this comment.
Pull request overview
Removes the CommonJS handler artifact path so AWS Lambda resolves the Datadog handler redirect to the ESM handler.mjs, avoiding ESM user-handler load failures caused by dist/handler.js being preferred by the bootstrap resolver.
Changes:
- Delete
src/handler.cjsso there’s no longer a CJS handler source to ship/copy. - Update
scripts/update_dist_version.shto copy onlysrc/handler.mjsintodist/. - Remove publishing-time steps that copied
dist/handler.cjstodist/handler.js.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/handler.cjs |
Removes the CJS handler implementation so it can’t be shipped as an entry point. |
scripts/update_dist_version.sh |
Restricts post-build copying to handler.mjs as the intended Lambda entry point. |
scripts/publish_prod.sh |
Stops injecting dist/handler.js during prod publishing. |
.gitlab/scripts/publish_npm.sh |
Stops injecting dist/handler.js during CI npm publishing. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
According to 🤖
Deleting
It's also not reachable through the package's public entry point: Since this PR is what orphaned it, it'd be clean to remove the three functions ( |
…rtifacts Two follow-ups from PR #784 review: 1. Remove `loadSync`, `_loadUserAppSync`, `_tryRequireSync`. Deleting `src/handler.cjs` in the previous commit removed the only consumers of the synchronous loader chain. `loadSync` was never re-exported from the public `src/index.ts` entry point, no tests reference it, and the only internal import (`src/handler.mjs`) uses the async `load()`. The three functions and the `loadSync` re-export in `src/runtime/index.ts` are now dead code, so drop them. 2. Defensive cleanup in `scripts/update_dist_version.sh`: explicitly `rm -f dist/handler.js dist/handler.cjs` before copying handler.mjs. `tsc` does not clean stale files in `dist/` between incremental builds, so a leftover `handler.js` from a prior build of an earlier ref could resurface and silently re-introduce the resolver bug this PR is fixing. The publish scripts already do `rm -rf ./dist` before `yarn build`, so this is purely a guard for local dev builds.
|
@lucaspimentel good catch — verified and addressed in
Removed all three functions from Net diff: -105 lines from |
AWS Lambda's CJS resolver picks `dist/handler.js` before `dist/handler.mjs` when the handler string is `node_modules/datadog-lambda-js/dist/handler.handler`, causing ESM user functions to fail with ERR_REQUIRE_ESM (or the surface "Cannot find module" variant when _tryRequireSync swallows it). Today `dist/handler.js` ships as a verbatim copy of `dist/handler.cjs` (via `cp ./dist/handler.cjs ./dist/handler.js` in the publish scripts), so it can never load ESM user code. This replaces the copy with a real `src/handler.js` shim that: - Detects ESM user code via the task root `package.json` `type` field or a `.mjs` `DD_LAMBDA_HANDLER` / `_HANDLER`. - For CJS user code: synchronously delegates to `handler.cjs`, preserving the prior fast path (no cold-start regression). - For ESM user code: exposes an async handler that lazily `import()`s `handler.mjs` on the first invocation. `handler.mjs`'s async `load()` transparently handles both CJS and ESM user modules. `scripts/update_dist_version.sh` already copies `src/handler.*` into `dist/`, so the shim ships automatically; the explicit `cp ./dist/handler.cjs ./dist/handler.js` lines in `publish_npm.sh` and `publish_prod.sh` are no longer needed and are removed. Refs: #782 Refs: SLES-2888, SLES-2895, SVLS-9238
The Dockerfile copies `dist/` wholesale into the layer, so the new `dist/handler.js` shim would get included and Lambda's bootstrap would resolve `handler.handler` to it. Inside the layer that goes through the shim's CJS branch -> handler.cjs -> _tryRequireSync, which can't load `.mjs` user modules (this regressed the `esm_node*` integration tests). The layer has never shipped a `handler.js`; Lambda's resolver falls through to `handler.mjs`, whose async `load()` handles both CJS and ESM user code. Drop the shim from the layer to preserve that behavior. The shim is still useful in the npm tarball, where the publish scripts used to copy `handler.cjs` to `handler.js`.
- Rename `userIsEsm()` to `shouldUseEsmHandler()`. The function returns a
routing decision based on heuristics (env regex + package.json sniff),
not an authoritative fact about the user's code, so the new name reads
more honestly at the call site.
- Inline the former `src/handler.cjs` body into the CJS branch of the
shim and delete `src/handler.cjs`. Lambda's resolver doesn't auto-resolve
`.cjs`, so handler.cjs was only ever reachable via this shim's
`require("./handler.cjs")`. Keeping it as a separate file made the load
graph confusing — future readers grepping for "what loads handler.cjs?"
would find one caller in the same directory. The heavy `require()`s
remain scoped to the CJS branch so ESM users don't pay the cost of
pulling in the full tracer at require-time.
- Add `src/handler.spec.ts` covering shouldUseEsmHandler's four-case
matrix (.mjs in _HANDLER / DD_LAMBDA_HANDLER, package.json type=module,
package.json without type field, package.json missing, package.json
unparseable). Detection is the load-bearing piece — a regression there
means either ESM users keep seeing ERR_REQUIRE_ESM or CJS users pay an
unnecessary async-import cold-start cost.
- Tighten `scripts/update_dist_version.sh`'s `cp src/handler.*` to list
the runtime artifacts explicitly. The broader glob also picked up
`src/handler.spec.ts`, which is a test fixture, not a runtime file —
shipping it would also be silently inconsistent with the Dockerfile's
hard-coded `rm /nodejs/.../handler.js`.
- Expand the Dockerfile comment to point at
`scripts/update_dist_version.sh` as the source of the file being
removed, so a future change to how `src/handler.js` is copied into
`dist/` is visibly tied to the layer-side `rm`.
After feedback from Darcy Rayner (original handler.cjs/mjs author) and
re-evaluating the cold-start trade-offs, the shim is the wrong shape.
The shim's CJS branch saved ~10-30ms of cold-start init for CJS users
on the npm-redirect path, but its ESM branch moved 300-800ms of
tracer/handler load work from Lambda's init phase into the first
invocation. That's worse in three ways for ESM users (the cohort this
PR exists to fix):
1. First-invoke timeout risk on functions with tight timeouts —
the loading work now eats into the invocation's time budget.
2. Provisioned concurrency is wasted — work deferred past init isn't
pre-warmed, so PC customers pay for warm capacity and still hit
cold-start latency on the first real request.
3. Lambda SnapStart is wasted for the same reason — snapshot is
taken after init, and the shim's lazy `import()` runs post-snapshot.
handler.mjs's async `load()` already handles both CJS and ESM user
modules transparently. Lambda's bootstrap resolves `dist/handler.handler`
to handler.mjs once `.js` is absent, so removing the shim is sufficient
and matches the behavior the published layer has always had.
End state:
- Sole Lambda entry point everywhere: `handler.mjs`.
- The npm tarball and the layer ship the same `dist/` (one handler file).
- No detection heuristics, no shim, no `handler.cjs` (already gone).
Aligns with:
- Darcy Rayner's recommendation on the PR thread to "only bundle the
ESM handler" since all supported runtimes (Node 18+) support top-level
await in ESM.
- The customer's own suggested fix in
#782.
- SLES-2888's confirmed-working internal workaround
(`rm dist/handler.js` post-install).
- The intent of the earlier PR #697.
…rtifacts Two follow-ups from PR #784 review: 1. Remove `loadSync`, `_loadUserAppSync`, `_tryRequireSync`. Deleting `src/handler.cjs` in the previous commit removed the only consumers of the synchronous loader chain. `loadSync` was never re-exported from the public `src/index.ts` entry point, no tests reference it, and the only internal import (`src/handler.mjs`) uses the async `load()`. The three functions and the `loadSync` re-export in `src/runtime/index.ts` are now dead code, so drop them. 2. Defensive cleanup in `scripts/update_dist_version.sh`: explicitly `rm -f dist/handler.js dist/handler.cjs` before copying handler.mjs. `tsc` does not clean stale files in `dist/` between incremental builds, so a leftover `handler.js` from a prior build of an earlier ref could resurface and silently re-introduce the resolver bug this PR is fixing. The publish scripts already do `rm -rf ./dist` before `yarn build`, so this is purely a guard for local dev builds.
Add container-image integration tests that exercise the npm-redirect path through the AWS-published `public.ecr.aws/lambda/nodejs:N` base image, with both CJS and ESM user handlers. Closes the gap from the existing `esm_node*` tests, which only cover the layer path (`/opt/nodejs/node_modules/datadog-lambda-js/handler.handler`) where only `handler.mjs` has ever been shipped. These tests would fail under any future regression that re-introduces a CJS-shadowing artifact at `dist/handler.js`, including the original #305 scenario for stock AWS RIC. - integration_tests/container/{cjs,esm}/ — Dockerfiles + handlers - serverless.yml — provider.ecr.images + container-{cjs,esm}_node funcs - run_integration_tests.sh — pack local datadog-lambda-js before deploy, thread NODE_MAJOR for ECR build args, add handlers to snapshot loop
5bbffbd to
45e0613
Compare
Three fixes needed to get the container-image integration tests
deploying and invoking against real AWS-stock RIC:
- Container Dockerfiles previously installed only the
datadog-lambda-js tarball, which doesn't pull dd-trace because
it's a devDep of the package. Add dd-trace as a direct dependency
in each container test's package.json, pinned to 5.105.0 to
match the version yarn.lock resolves for the layer build.
- Run `npm install --omit=dev` before installing the tarball so
package.json deps are picked up.
- Export BUILDX_NO_DEFAULT_ATTESTATIONS=1 in run_integration_tests.sh.
Modern Docker buildx attaches provenance attestations by default,
which produce OCI index manifests that AWS Lambda rejects with
"image manifest ... media type ... not supported." Disabling the
default attestations makes buildx emit Docker v2 manifests Lambda
accepts.
Plus baseline snapshots for `container-{cjs,esm}_node{18,20,22,24}`
across 9 input events (72 return-value + 8 log files). Each
container handler returns `{"message":"hello, dog!"}` end-to-end
through `dist/handler.handler`, proving CJS and ESM user code load
correctly on AWS-stock RIC for every supported Node major.
`yarn build` runs `tsc` which needs the root devDeps (@types/node, @types/aws-lambda, typescript, etc.). In CI the script is invoked without BUILD_LAYERS=true, so the layer-build Docker path that would otherwise hide the host install isn't reached, and tsc blew up with ~150 missing-type errors before getting to npm pack. Add `yarn install --frozen-lockfile` before the build so the host node_modules is populated regardless of which path the caller took.
The `arch:amd64` tag only provides the docker CLI; the daemon isn't running, so the container-image integration tests blow up with "Cannot connect to the Docker daemon at unix:///var/run/docker.sock" when serverless tries to build the ECR images. Switch to `docker-in-docker:amd64`, matching the pattern used by serverless-self-monitoring's `.deploy-lod-base` job for Docker- dependent steps. Other jobs (lint, unit test, layer build, etc.) don't need a daemon and stay on the cheaper plain runner.
What
Fixes #782. Removes
dist/handler.jsfrom the published npm tarball; Lambda's resolver falls through todist/handler.mjs, whose asyncload()handles both CJS and ESM user modules. Aligns the npm package with the layer (which has only ever shippedhandler.mjs).Breaking change
Container-image deployments require
aws-lambda-ric >= 3.0.0(June 2023). AWS-publishedpublic.ecr.aws/lambda/nodejs:Nbase images include it since mid-2023; rebuild stale custom images if needed. Older RIC versions surface asCannot find module 'handler'. Historical context:dist/handler.jswas originally added in #307 to work around exactly this in early RIC.Why not the shim (earlier commits in this PR)
Initial commits added a CJS shim that detected ESM at runtime and dynamically imported
handler.mjs. Removed after review: the shim defers tracer + user-handler init from Lambda's init phase into the first invocation, breaking Provisioned Concurrency and SnapStart.handler.mjs's top-levelawaitpre-pays that cost. Aligns with #697 and the reporter's suggested fix in #782.Tests
esm_node{18,20,22,24}tests (layer path) unchangedcontainer-{cjs,esm}_node{18,20,22,24}tests exercisedist/handler.handlerthrough AWS-stock RICTypes of Changes
Check all that apply