@@ -1,13 +1,13 @@
{
"schema_version": "1.4.0",
"id": "GHSA-wh4c-j3r5-mjhp",
"modified": "2026-04-06T17:24:48Z",
"modified": "2026-04-06T17:24:49Z",
"published": "2026-04-01T00:19:06Z",
"aliases": [
"CVE-2026-34601"
],
"summary": "xmldom: XML injection via unsafe CDATA serialization allows attacker-controlled markup insertion",
"details": "## Summary\n\n`@xmldom/xmldom` allows attacker-controlled strings containing the CDATA terminator `]]>` to be inserted into a `CDATASection` node. During serialization, `XMLSerializer` emitted the CDATA content verbatim without rejecting or safely splitting the terminator. As a result, data intended to remain text-only became **active XML markup** in the serialized output, enabling XML structure\ninjection and downstream business-logic manipulation.\n\nThe sequence `]]>` is not allowed inside CDATA content and must be rejected or safely handled during serialization. ([MDN Web Docs](https://developer.mozilla.org/))\n\n### Attack surface\n\n`Document.createCDATASection(data)` is the most direct entry point, but it is not the only one. The WHATWG DOM spec intentionally does not validate `]]>` in mutation methods — only `createCDATASection` carries that guard. The following paths therefore also allow `]]>` to enter a CDATASection node and reach the serializer:\n\n- `CharacterData.appendData()`\n- `CharacterData.replaceData()`\n- `CharacterData.insertData()`\n- Direct assignment to `.data`\n- Direct assignment to `.textContent`\n\n(Note: assigning to `.nodeValue` does **not** update `.data` in this implementation — the serializer reads `.data` directly — so `.nodeValue` is not an exploitable path.)\n\n### Parse path\n\nParsing XML that contains a CDATA section is **not** affected. 
The SAX parser's non-greedy `CDSect` regex stops at the first `]]>`, so parsed CDATA data never contains the terminator.\n\n---\n\n## Impact\n\nIf an application uses `xmldom` to generate \"trusted\" XML documents that embed **untrusted user input** inside CDATA (a common pattern in exports, feeds, SOAP/XML integrations, etc.), an attacker can inject additional XML elements/attributes into the generated document.\n\nThis can lead to:\n\n- Integrity violation of generated XML documents.\n- Business-logic injection in downstream consumers (e.g., injecting `<approved>true</approved>`, `<role>admin</role>`, workflow flags, or other security-relevant elements).\n- Unexpected privilege/workflow decisions if downstream logic assumes injected nodes cannot appear.\n\nThis issue does **not** require malformed parsers or browser behavior; it is caused by serialization producing attacker-influenced XML markup.\n\n---\n\n## Root Cause (with file + line numbers)\n\n**File:** `lib/dom.js`\n\n### 1. No validation in `createCDATASection`\n\n`createCDATASection: function (data)` accepts any string and appends it directly.\n\n- **Lines 2216–2221** (0.9.8)\n\n### 2. 
Unsafe CDATA serialization\n\nSerializer prints CDATA sections as:\n\n```\n<![CDATA[ + node.data + ]]>\n```\n\nwithout handling `]]>` in the data.\n\n- **Lines 2919–2920** (0.9.8)\n\nBecause CDATA content is emitted verbatim, an embedded `]]>` closes the CDATA section early and the remainder of the attacker-controlled payload is interpreted as markup in the serialized XML.\n\n---\n\n## Proof of Concept — Fix A: `createCDATASection` now throws\n\nOn patched versions, passing `]]>` directly to `createCDATASection` throws `InvalidCharacterError` instead of silently accepting the payload:\n\n```js\nconst { DOMImplementation } = require('./lib');\n\nconst doc = new DOMImplementation().createDocument(null, 'root', null);\ntry {\n doc.createCDATASection('SAFE]]><injected attr=\"pwn\"/>');\n console.log('VULNERABLE — no error thrown');\n} catch (e) {\n console.log('FIXED — threw:', e.name); // InvalidCharacterError\n}\n```\n\nExpected output on patched versions:\n\n```\nFIXED — threw: InvalidCharacterError\n```\n\n---\n\n## Proof of Concept — Fix B: mutation vector now safe\n\nOn patched versions, injecting `]]>` via a mutation method (`appendData`, `replaceData`, `.data =`, `.textContent =`) no longer produces injectable output. 
The serializer splits the terminator so the result round-trips as safe text:\n\n```js\nconst { DOMImplementation, XMLSerializer } = require('./lib');\nconst { DOMParser } = require('./lib');\n\nconst doc = new DOMImplementation().createDocument(null, 'root', null);\n\n// Start with safe data, then mutate to include the terminator\nconst cdata = doc.createCDATASection('safe');\ndoc.documentElement.appendChild(cdata);\ncdata.appendData(']]><injected attr=\"pwn\"/><more>TEXT</more><![CDATA[');\n\nconst out = new XMLSerializer().serializeToString(doc);\nconsole.log('Serialized:', out);\n\nconst reparsed = new DOMParser().parseFromString(out, 'text/xml');\nconst injected = reparsed.getElementsByTagName('injected').length > 0;\nconsole.log('Injected element found in reparsed doc:', injected);\n// VULNERABLE: true | FIXED: false\n```\n\nExpected output on patched versions:\n\n```\nSerialized: <root><![CDATA[safe]]]]><![CDATA[><injected attr=\"pwn\"/><more>TEXT</more><![CDATA[]]></root>\nInjected element found in reparsed doc: false\n```\n\n---\n\n## Fix Applied\n\nBoth mitigations were implemented:\n\n### Option A — Strict/spec-aligned: reject `]]>` in `createCDATASection()`\n\n`Document.createCDATASection(data)` now throws `InvalidCharacterError` (per the [WHATWG DOM spec](https://dom.spec.whatwg.org/#dom-document-createcdatasection)) when `data` contains `]]>`. This closes the direct entry point.\n\nCode that previously passed a string containing `]]>` to `createCDATASection` and relied on the silent/unsafe behaviour will now receive `InvalidCharacterError`. Use a mutation method such as `appendData` if you intentionally need `]]>` in a CDATASection node's data (the serializer split in Option B will keep the output safe).\n\n### Option B — Defensive serialization: split the terminator during serialization\n\n`XMLSerializer` now replaces every occurrence of `]]>` in CDATA section data with the split sequence `]]]]><![CDATA[>` before emitting. 
This closes all mutation-vector paths that Option A alone cannot guard, and means the serialized output is always well-formed XML regardless of how `]]>` entered the node.",
"details": "## Summary\n\n`@xmldom/xmldom` allows attacker-controlled strings containing the CDATA terminator `]]>` to be inserted into a `CDATASection` node. During serialization, `XMLSerializer` emitted the CDATA content verbatim without rejecting or safely splitting the terminator. As a result, data intended to remain text-only became **active XML markup** in the serialized output, enabling XML structure\ninjection and downstream business-logic manipulation.\n\nThe sequence `]]>` is not allowed inside CDATA content and must be rejected or safely handled during serialization. ([MDN Web Docs](https://developer.mozilla.org/))\n\n### Attack surface\n\n`Document.createCDATASection(data)` is the most direct entry point, but it is not the only one. The WHATWG DOM spec intentionally does not validate `]]>` in mutation methods — only `createCDATASection` carries that guard. The following paths therefore also allow `]]>` to enter a CDATASection node and reach the serializer:\n\n- `CharacterData.appendData()`\n- `CharacterData.replaceData()`\n- `CharacterData.insertData()`\n- Direct assignment to `.data`\n- Direct assignment to `.textContent`\n\n(Note: assigning to `.nodeValue` does **not** update `.data` in this implementation — the serializer reads `.data` directly — so `.nodeValue` is not an exploitable path.)\n\n### Parse path\n\nParsing XML that contains a CDATA section is **not** affected. 
The SAX parser's non-greedy `CDSect` regex stops at the first `]]>`, so parsed CDATA data never contains the terminator.\n\n---\n\n## Impact\n\nIf an application uses `xmldom` to generate \"trusted\" XML documents that embed **untrusted user input** inside CDATA (a common pattern in exports, feeds, SOAP/XML integrations, etc.), an attacker can inject additional XML elements/attributes into the generated document.\n\nThis can lead to:\n\n- Integrity violation of generated XML documents.\n- Business-logic injection in downstream consumers (e.g., injecting `<approved>true</approved>`, `<role>admin</role>`, workflow flags, or other security-relevant elements).\n- Unexpected privilege/workflow decisions if downstream logic assumes injected nodes cannot appear.\n\nThis issue does **not** require malformed parsers or browser behavior; it is caused by serialization producing attacker-influenced XML markup.\n\n---\n\n## Root Cause (with file + line numbers)\n\n**File:** `lib/dom.js`\n\n### 1. No validation in `createCDATASection`\n\n`createCDATASection: function (data)` accepts any string and appends it directly.\n\n- **Lines 2216–2221** (0.9.8)\n\n### 2. 
Unsafe CDATA serialization\n\nSerializer prints CDATA sections as:\n\n```\n<![CDATA[ + node.data + ]]>\n```\n\nwithout handling `]]>` in the data.\n\n- **Lines 2919–2920** (0.9.8)\n\nBecause CDATA content is emitted verbatim, an embedded `]]>` closes the CDATA section early and the remainder of the attacker-controlled payload is interpreted as markup in the serialized XML.\n\n---\n\n## Proof of Concept — Fix A: `createCDATASection` now throws\n\nOn patched versions, passing `]]>` directly to `createCDATASection` throws `InvalidCharacterError` instead of silently accepting the payload:\n\n```js\nconst { DOMImplementation } = require('./lib');\n\nconst doc = new DOMImplementation().createDocument(null, 'root', null);\ntry {\n doc.createCDATASection('SAFE]]><injected attr=\"pwn\"/>');\n console.log('VULNERABLE — no error thrown');\n} catch (e) {\n console.log('FIXED — threw:', e.name); // InvalidCharacterError\n}\n```\n\nExpected output on patched versions:\n\n```\nFIXED — threw: InvalidCharacterError\n```\n\n---\n\n## Proof of Concept — Fix B: mutation vector now safe\n\nOn patched versions, injecting `]]>` via a mutation method (`appendData`, `replaceData`, `.data =`, `.textContent =`) no longer produces injectable output. 
The serializer splits the terminator so the result round-trips as safe text:\n\n```js\nconst { DOMImplementation, XMLSerializer } = require('./lib');\nconst { DOMParser } = require('./lib');\n\nconst doc = new DOMImplementation().createDocument(null, 'root', null);\n\n// Start with safe data, then mutate to include the terminator\nconst cdata = doc.createCDATASection('safe');\ndoc.documentElement.appendChild(cdata);\ncdata.appendData(']]><injected attr=\"pwn\"/><more>TEXT</more><![CDATA[');\n\nconst out = new XMLSerializer().serializeToString(doc);\nconsole.log('Serialized:', out);\n\nconst reparsed = new DOMParser().parseFromString(out, 'text/xml');\nconst injected = reparsed.getElementsByTagName('injected').length > 0;\nconsole.log('Injected element found in reparsed doc:', injected);\n// VULNERABLE: true | FIXED: false\n```\n\nExpected output on patched versions:\n\n```\nSerialized: <root><![CDATA[safe]]]]><![CDATA[><injected attr=\"pwn\"/><more>TEXT</more><![CDATA[]]></root>\nInjected element found in reparsed doc: false\n```\n\n---\n\n## Fix Applied\n\nBoth mitigations were implemented:\n\n### Option A — Strict/spec-aligned: reject `]]>` in `createCDATASection()`\n\n`Document.createCDATASection(data)` now throws `InvalidCharacterError` (per the [WHATWG DOM spec](https://dom.spec.whatwg.org/#dom-document-createcdatasection)) when `data` contains `]]>`. This closes the direct entry point.\n\nCode that previously passed a string containing `]]>` to `createCDATASection` and relied on the silent/unsafe behaviour will now receive `InvalidCharacterError`. Use a mutation method such as `appendData` if you intentionally need `]]>` in a CDATASection node's data (the serializer split in Option B will keep the output safe).\n\n### Option B — Defensive serialization: split the terminator during serialization\n\n`XMLSerializer` now replaces every occurrence of `]]>` in CDATA section data with the split sequence `]]]]><![CDATA[>` before emitting. 
This closes all mutation-vector paths that Option A alone cannot guard, and means the serialized output is always well-formed XML regardless of how `]]>` entered the node.\n\n## Update — 2026-04-xx (0.9.10 / 0.8.13)\n\n### `splitCDATASections` is deprecated\n\nThe CDATA split behavior introduced as Option B of this fix (replacing `]]>` with\n`]]]]><![CDATA[>` during serialization) is **deprecated** as of 0.9.10 / 0.8.13.\n\nThis release introduces a `requireWellFormed` option on `XMLSerializer.serializeToString()`.\nWhen `{ requireWellFormed: true }` is passed as the second argument, the serializer throws\n`InvalidStateError` if CDATA section data contains `]]>` — this is the spec-aligned behavior\n(W3C DOM Parsing and Serialization, `require well-formed` flag) and the recommended migration\npath going forward.\n\nThe split behavior is now controlled by an explicit `splitCDATASections` option (default\n`true`, preserving the current behavior). The three serialization behaviors are:\n\n| `requireWellFormed` | `splitCDATASections` | Behavior |\n|---|---|---|\n| `false` (default) | `true` (default) | Split `]]>` → `]]]]><![CDATA[>` (current behavior, deprecated) |\n| `true` | — (ignored) | Throw `InvalidStateError` — spec-aligned, recommended |\n| `false` | `false` | Emit verbatim — same as pre-0.9.9 behavior |\n\n`requireWellFormed: true` takes precedence: the split path is unreachable when it is set.\n\n### Migration\n\nReplace any reliance on the default split behavior with an explicit opt-in:\n\n```js\n// Before (implicit split, deprecated):\nconst xml = new XMLSerializer().serializeToString(doc);\n\n// After (explicit guard, spec-aligned):\nconst xml = new XMLSerializer().serializeToString(doc, { requireWellFormed: true });\n// Throws InvalidStateError if any CDATASection contains ']]>'\n```\n\n### Removal timeline\n\nBoth the `splitCDATASections` option and the underlying `]]>` → `]]]]><![CDATA[>` split\nmechanics will be removed in the next breaking (`0.10.0`) 
release. After removal, the only\nbehaviors will be verbatim (default) and `requireWellFormed: true` (throws).\n\nRemoval is tracked in [xmldom/xmldom#999](https://github.com/xmldom/xmldom/issues/999).\n",
Review comment from the PR author: sorry, `2026-04-xx` needs to be replaced with `2026-04-18`.
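For anyone reviewing the behavior table added in this hunk, the three serialization modes can be sketched as a standalone function. This is only an illustration under stated assumptions, not xmldom's implementation: `serializeCdata` is a hypothetical helper, and the only names taken from the advisory text are the `requireWellFormed` and `splitCDATASections` options and the `InvalidStateError` name.

```javascript
// Hypothetical helper, NOT part of xmldom: models the advisory's behavior table
// for a single CDATA section's data string.
function serializeCdata(data, options = {}) {
  // Defaults mirror the table: requireWellFormed false, splitCDATASections true.
  const { requireWellFormed = false, splitCDATASections = true } = options;

  // requireWellFormed takes precedence: the split path is unreachable when set.
  if (requireWellFormed && data.includes(']]>')) {
    const err = new Error('CDATA section data contains "]]>"');
    err.name = 'InvalidStateError';
    throw err;
  }

  // Split every terminator so each piece stays inside its own CDATA section;
  // with splitCDATASections false the data is emitted verbatim (unsafe).
  const body = splitCDATASections
    ? data.split(']]>').join(']]]]><![CDATA[>')
    : data;

  return '<![CDATA[' + body + ']]>';
}

// Default mode: the terminator is split, so the payload stays character data.
console.log(serializeCdata('safe]]><injected/>'));
```

With the defaults, the payload is emitted as two adjacent CDATA sections, which matches the shape of the "Expected output on patched versions" in the Fix B proof of concept. Passing `{ requireWellFormed: true }` makes the same input throw, and `{ splitCDATASections: false }` reproduces the verbatim, injectable output.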

"severity": [
{
"type": "CVSS_V3",
@@ -82,6 +82,10 @@
"type": "ADVISORY",
"url": "https://nvd.nist.gov/vuln/detail/CVE-2026-34601"
},
{
"type": "WEB",
"url": "https://github.com/xmldom/xmldom/issues/999"
},
{
"type": "WEB",
"url": "https://github.com/xmldom/xmldom/commit/2b852e836ab86dbbd6cbaf0537f584dd0b5ac184"