From e4eecdddd9234e167d33e2a27d050af59b5849ab Mon Sep 17 00:00:00 2001 From: Christian Bewernitz Date: Sat, 18 Apr 2026 14:04:44 +0200 Subject: [PATCH] Improve GHSA-wh4c-j3r5-mjhp --- .../2026/04/GHSA-wh4c-j3r5-mjhp/GHSA-wh4c-j3r5-mjhp.json | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/advisories/github-reviewed/2026/04/GHSA-wh4c-j3r5-mjhp/GHSA-wh4c-j3r5-mjhp.json b/advisories/github-reviewed/2026/04/GHSA-wh4c-j3r5-mjhp/GHSA-wh4c-j3r5-mjhp.json index 72bef1ab7f941..59270cc733de0 100644 --- a/advisories/github-reviewed/2026/04/GHSA-wh4c-j3r5-mjhp/GHSA-wh4c-j3r5-mjhp.json +++ b/advisories/github-reviewed/2026/04/GHSA-wh4c-j3r5-mjhp/GHSA-wh4c-j3r5-mjhp.json @@ -1,13 +1,13 @@ { "schema_version": "1.4.0", "id": "GHSA-wh4c-j3r5-mjhp", - "modified": "2026-04-06T17:24:48Z", + "modified": "2026-04-06T17:24:49Z", "published": "2026-04-01T00:19:06Z", "aliases": [ "CVE-2026-34601" ], "summary": "xmldom: XML injection via unsafe CDATA serialization allows attacker-controlled markup insertion", - "details": "## Summary\n\n`@xmldom/xmldom` allows attacker-controlled strings containing the CDATA terminator `]]>` to be inserted into a `CDATASection` node. During serialization, `XMLSerializer` emitted the CDATA content verbatim without rejecting or safely splitting the terminator. As a result, data intended to remain text-only became **active XML markup** in the serialized output, enabling XML structure\ninjection and downstream business-logic manipulation.\n\nThe sequence `]]>` is not allowed inside CDATA content and must be rejected or safely handled during serialization. ([MDN Web Docs](https://developer.mozilla.org/))\n\n### Attack surface\n\n`Document.createCDATASection(data)` is the most direct entry point, but it is not the only one. The WHATWG DOM spec intentionally does not validate `]]>` in mutation methods — only `createCDATASection` carries that guard. The following paths therefore also allow `]]>` to enter a CDATASection node and reach the serializer:\n\n- `CharacterData.appendData()`\n- `CharacterData.replaceData()`\n- `CharacterData.insertData()`\n- Direct assignment to `.data`\n- Direct assignment to `.textContent`\n\n(Note: assigning to `.nodeValue` does **not** update `.data` in this implementation — the serializer reads `.data` directly — so `.nodeValue` is not an exploitable path.)\n\n### Parse path\n\nParsing XML that contains a CDATA section is **not** affected. The SAX parser's non-greedy `CDSect` regex stops at the first `]]>`, so parsed CDATA data never contains the terminator.\n\n---\n\n## Impact\n\nIf an application uses `xmldom` to generate \"trusted\" XML documents that embed **untrusted user input** inside CDATA (a common pattern in exports, feeds, SOAP/XML integrations, etc.), an attacker can inject additional XML elements/attributes into the generated document.\n\nThis can lead to:\n\n- Integrity violation of generated XML documents.\n- Business-logic injection in downstream consumers (e.g., injecting `true`, `admin`, workflow flags, or other security-relevant elements).\n- Unexpected privilege/workflow decisions if downstream logic assumes injected nodes cannot appear.\n\nThis issue does **not** require malformed parsers or browser behavior; it is caused by serialization producing attacker-influenced XML markup.\n\n---\n\n## Root Cause (with file + line numbers)\n\n**File:** `lib/dom.js`\n\n### 1. No validation in `createCDATASection`\n\n`createCDATASection: function (data)` accepts any string and appends it directly.\n\n- **Lines 2216–2221** (0.9.8)\n\n### 2. Unsafe CDATA serialization\n\nSerializer prints CDATA sections as:\n\n```\n\n```\n\nwithout handling `]]>` in the data.\n\n- **Lines 2919–2920** (0.9.8)\n\nBecause CDATA content is emitted verbatim, an embedded `]]>` closes the CDATA section early and the remainder of the attacker-controlled payload is interpreted as markup in the serialized XML.\n\n---\n\n## Proof of Concept — Fix A: `createCDATASection` now throws\n\nOn patched versions, passing `]]>` directly to `createCDATASection` throws `InvalidCharacterError` instead of silently accepting the payload:\n\n```js\nconst { DOMImplementation } = require('./lib');\n\nconst doc = new DOMImplementation().createDocument(null, 'root', null);\ntry {\n doc.createCDATASection('SAFE]]>');\n console.log('VULNERABLE — no error thrown');\n} catch (e) {\n console.log('FIXED — threw:', e.name); // InvalidCharacterError\n}\n```\n\nExpected output on patched versions:\n\n```\nFIXED — threw: InvalidCharacterError\n```\n\n---\n\n## Proof of Concept — Fix B: mutation vector now safe\n\nOn patched versions, injecting `]]>` via a mutation method (`appendData`, `replaceData`, `.data =`, `.textContent =`) no longer produces injectable output. The serializer splits the terminator so the result round-trips as safe text:\n\n```js\nconst { DOMImplementation, XMLSerializer } = require('./lib');\nconst { DOMParser } = require('./lib');\n\nconst doc = new DOMImplementation().createDocument(null, 'root', null);\n\n// Start with safe data, then mutate to include the terminator\nconst cdata = doc.createCDATASection('safe');\ndoc.documentElement.appendChild(cdata);\ncdata.appendData(']]>TEXT 0;\nconsole.log('Injected element found in reparsed doc:', injected);\n// VULNERABLE: true | FIXED: false\n```\n\nExpected output on patched versions:\n\n```\nSerialized: TEXT\nInjected element found in reparsed doc: false\n```\n\n---\n\n## Fix Applied\n\nBoth mitigations were implemented:\n\n### Option A — Strict/spec-aligned: reject `]]>` in `createCDATASection()`\n\n`Document.createCDATASection(data)` now throws `InvalidCharacterError` (per the [WHATWG DOM spec](https://dom.spec.whatwg.org/#dom-document-createcdatasection)) when `data` contains `]]>`. This closes the direct entry point.\n\nCode that previously passed a string containing `]]>` to `createCDATASection` and relied on the silent/unsafe behaviour will now receive `InvalidCharacterError`. Use a mutation method such as `appendData` if you intentionally need `]]>` in a CDATASection node's data (the serializer split in Option B will keep the output safe).\n\n### Option B — Defensive serialization: split the terminator during serialization\n\n`XMLSerializer` now replaces every occurrence of `]]>` in CDATA section data with the split sequence `]]]]>` before emitting. This closes all mutation-vector paths that Option A alone cannot guard, and means the serialized output is always well-formed XML regardless of how `]]>` entered the node.", + "details": "## Summary\n\n`@xmldom/xmldom` allows attacker-controlled strings containing the CDATA terminator `]]>` to be inserted into a `CDATASection` node. During serialization, `XMLSerializer` emitted the CDATA content verbatim without rejecting or safely splitting the terminator. As a result, data intended to remain text-only became **active XML markup** in the serialized output, enabling XML structure\ninjection and downstream business-logic manipulation.\n\nThe sequence `]]>` is not allowed inside CDATA content and must be rejected or safely handled during serialization. ([MDN Web Docs](https://developer.mozilla.org/))\n\n### Attack surface\n\n`Document.createCDATASection(data)` is the most direct entry point, but it is not the only one. The WHATWG DOM spec intentionally does not validate `]]>` in mutation methods — only `createCDATASection` carries that guard. The following paths therefore also allow `]]>` to enter a CDATASection node and reach the serializer:\n\n- `CharacterData.appendData()`\n- `CharacterData.replaceData()`\n- `CharacterData.insertData()`\n- Direct assignment to `.data`\n- Direct assignment to `.textContent`\n\n(Note: assigning to `.nodeValue` does **not** update `.data` in this implementation — the serializer reads `.data` directly — so `.nodeValue` is not an exploitable path.)\n\n### Parse path\n\nParsing XML that contains a CDATA section is **not** affected. The SAX parser's non-greedy `CDSect` regex stops at the first `]]>`, so parsed CDATA data never contains the terminator.\n\n---\n\n## Impact\n\nIf an application uses `xmldom` to generate \"trusted\" XML documents that embed **untrusted user input** inside CDATA (a common pattern in exports, feeds, SOAP/XML integrations, etc.), an attacker can inject additional XML elements/attributes into the generated document.\n\nThis can lead to:\n\n- Integrity violation of generated XML documents.\n- Business-logic injection in downstream consumers (e.g., injecting `true`, `admin`, workflow flags, or other security-relevant elements).\n- Unexpected privilege/workflow decisions if downstream logic assumes injected nodes cannot appear.\n\nThis issue does **not** require malformed parsers or browser behavior; it is caused by serialization producing attacker-influenced XML markup.\n\n---\n\n## Root Cause (with file + line numbers)\n\n**File:** `lib/dom.js`\n\n### 1. No validation in `createCDATASection`\n\n`createCDATASection: function (data)` accepts any string and appends it directly.\n\n- **Lines 2216–2221** (0.9.8)\n\n### 2. Unsafe CDATA serialization\n\nSerializer prints CDATA sections as:\n\n```\n\n```\n\nwithout handling `]]>` in the data.\n\n- **Lines 2919–2920** (0.9.8)\n\nBecause CDATA content is emitted verbatim, an embedded `]]>` closes the CDATA section early and the remainder of the attacker-controlled payload is interpreted as markup in the serialized XML.\n\n---\n\n## Proof of Concept — Fix A: `createCDATASection` now throws\n\nOn patched versions, passing `]]>` directly to `createCDATASection` throws `InvalidCharacterError` instead of silently accepting the payload:\n\n```js\nconst { DOMImplementation } = require('./lib');\n\nconst doc = new DOMImplementation().createDocument(null, 'root', null);\ntry {\n doc.createCDATASection('SAFE]]>');\n console.log('VULNERABLE — no error thrown');\n} catch (e) {\n console.log('FIXED — threw:', e.name); // InvalidCharacterError\n}\n```\n\nExpected output on patched versions:\n\n```\nFIXED — threw: InvalidCharacterError\n```\n\n---\n\n## Proof of Concept — Fix B: mutation vector now safe\n\nOn patched versions, injecting `]]>` via a mutation method (`appendData`, `replaceData`, `.data =`, `.textContent =`) no longer produces injectable output. The serializer splits the terminator so the result round-trips as safe text:\n\n```js\nconst { DOMImplementation, XMLSerializer } = require('./lib');\nconst { DOMParser } = require('./lib');\n\nconst doc = new DOMImplementation().createDocument(null, 'root', null);\n\n// Start with safe data, then mutate to include the terminator\nconst cdata = doc.createCDATASection('safe');\ndoc.documentElement.appendChild(cdata);\ncdata.appendData(']]>TEXT 0;\nconsole.log('Injected element found in reparsed doc:', injected);\n// VULNERABLE: true | FIXED: false\n```\n\nExpected output on patched versions:\n\n```\nSerialized: TEXT\nInjected element found in reparsed doc: false\n```\n\n---\n\n## Fix Applied\n\nBoth mitigations were implemented:\n\n### Option A — Strict/spec-aligned: reject `]]>` in `createCDATASection()`\n\n`Document.createCDATASection(data)` now throws `InvalidCharacterError` (per the [WHATWG DOM spec](https://dom.spec.whatwg.org/#dom-document-createcdatasection)) when `data` contains `]]>`. This closes the direct entry point.\n\nCode that previously passed a string containing `]]>` to `createCDATASection` and relied on the silent/unsafe behaviour will now receive `InvalidCharacterError`. Use a mutation method such as `appendData` if you intentionally need `]]>` in a CDATASection node's data (the serializer split in Option B will keep the output safe).\n\n### Option B — Defensive serialization: split the terminator during serialization\n\n`XMLSerializer` now replaces every occurrence of `]]>` in CDATA section data with the split sequence `]]]]>` before emitting. This closes all mutation-vector paths that Option A alone cannot guard, and means the serialized output is always well-formed XML regardless of how `]]>` entered the node.\n\n## Update — 2026-04-xx (0.9.10 / 0.8.13)\n\n### `splitCDATASections` is deprecated\n\nThe CDATA split behavior introduced as Option B of this fix (replacing `]]>` with\n`]]]]>` during serialization) is **deprecated** as of 0.9.10 / 0.8.13.\n\nThis release introduces a `requireWellFormed` option on `XMLSerializer.serializeToString()`.\nWhen `{ requireWellFormed: true }` is passed as the second argument, the serializer throws\n`InvalidStateError` if CDATA section data contains `]]>` — this is the spec-aligned behavior\n(W3C DOM Parsing and Serialization, `require well-formed` flag) and the recommended migration\npath going forward.\n\nThe split behavior is now controlled by an explicit `splitCDATASections` option (default\n`true`, preserving the current behavior). The three serialization behaviors are:\n\n| `requireWellFormed` | `splitCDATASections` | Behavior |\n|---|---|---|\n| `false` (default) | `true` (default) | Split `]]>` → `]]]]>` (current behavior, deprecated) |\n| `true` | — (ignored) | Throw `InvalidStateError` — spec-aligned, recommended |\n| `false` | `false` | Emit verbatim — same as pre-0.9.9 behavior |\n\n`requireWellFormed: true` takes precedence: the split path is unreachable when it is set.\n\n### Migration\n\nReplace any reliance on the default split behavior with an explicit opt-in:\n\n```js\n// Before (implicit split, deprecated):\nconst xml = new XMLSerializer().serializeToString(doc);\n\n// After (explicit guard, spec-aligned):\nconst xml = new XMLSerializer().serializeToString(doc, { requireWellFormed: true });\n// Throws InvalidStateError if any CDATASection contains ']]>'\n```\n\n### Removal timeline\n\nBoth the `splitCDATASections` option and the underlying `]]>` → `]]]]>` split\nmechanics will be removed in the next breaking (`0.10.0`) release. After removal, the only\nbehaviors will be verbatim (default) and `requireWellFormed: true` (throws).\n\nRemoval is tracked in [xmldom/xmldom#999](https://github.com/xmldom/xmldom/issues/999).\n", "severity": [ { "type": "CVSS_V3", @@ -82,6 +82,10 @@ "type": "ADVISORY", "url": "https://nvd.nist.gov/vuln/detail/CVE-2026-34601" }, + { + "type": "WEB", + "url": "https://github.com/xmldom/xmldom/issues/999" + }, { "type": "WEB", "url": "https://github.com/xmldom/xmldom/commit/2b852e836ab86dbbd6cbaf0537f584dd0b5ac184"