Restrict pickle deserialization to safe types (CVE-2025-69872) #363

Closed
a2811057970 wants to merge 1 commit into grantjenks:master from a2811057970:fix/cve-2025-69872-safe-unpickler

Conversation

@a2811057970

Summary

Mitigates CVE-2025-69872 by restricting pickle deserialization to a hardcoded allowlist of safe built-in types. Arbitrary objects can no longer be deserialized from cache, preventing code execution via crafted pickle payloads.

Approach

This uses a SafeUnpickler with an allowlist-based find_class override, a fundamentally different approach from #361 (HMAC envelope). The key difference:

  • HMAC (#361, "Mitigate CVE-2025-69872: HMAC-verified pickle envelope"): still deserializes arbitrary types once the signature is valid. An attacker with read+write access to the cache directory can read the auto-generated key file and forge payloads.
  • Allowlist (this PR): blocks dangerous types regardless of filesystem access. No key management, no env vars, no file creation.

What's allowed

int, float, str, bytes, bytearray, list, dict, tuple, set, frozenset, complex, range, slice, object, bool, None, collections.OrderedDict, collections.defaultdict, collections.deque, datetime.*, decimal.Decimal, fractions.Fraction, uuid.UUID

Everything else raises UnpicklingError on read.
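
The mechanism described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: the table contents, the `safe_loads` helper, and the error message are assumptions, and the allowlist shown is abbreviated. Scalars and core containers such as int, str, list, and dict are encoded with dedicated pickle opcodes, so they never reach `find_class`; only types pickled by reference to a global need an allowlist entry.

```python
import io
import pickle

# Hypothetical, abbreviated allowlist: module name -> frozenset of permitted names.
# The PR's real SAFE_PICKLE_CLASSES table is not shown in this description.
SAFE_PICKLE_CLASSES = {
    "builtins": frozenset(
        {"bytearray", "complex", "frozenset", "object", "range", "set", "slice"}
    ),
    "collections": frozenset({"OrderedDict", "defaultdict", "deque"}),
    "datetime": frozenset({"date", "datetime", "time", "timedelta", "timezone"}),
    "decimal": frozenset({"Decimal"}),
    "fractions": frozenset({"Fraction"}),
    "uuid": frozenset({"UUID"}),
}


class UnpicklingError(pickle.UnpicklingError):
    """Raised when a pickle references a global outside the allowlist."""


class SafeUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream asks to import.
        if name in SAFE_PICKLE_CLASSES.get(module, frozenset()):
            return super().find_class(module, name)
        raise UnpicklingError(f"global '{module}.{name}' is not allowed")


def safe_loads(data):
    """Deserialize bytes, rejecting any global that is not allowlisted."""
    return SafeUnpickler(io.BytesIO(data)).load()
```

Note that `builtins` is not allowed wholesale: a payload that references `builtins.eval` is rejected in the same way as one that references `os.system`.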

Breaking change

This is intentionally a breaking change (version bumped to 6.0.0). Users caching custom types have two migration paths:

  1. Use JSONDisk for JSON-serializable data
  2. Subclass Disk and override get()/fetch() with a custom serializer

There is no opt-out by design — an escape hatch would just be cargo-culted by every downstream that hits an error.

Downstream compatibility

UnpicklingError inherits from pickle.UnpicklingError, which is itself a subclass of pickle.PickleError, so libraries that catch pickle.PickleError (e.g. dvc-data's translate_pickle_error decorator) handle it gracefully.
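
A short sketch of why the inheritance matters for downstream code. The `read_cached_value` and `load_or_default` functions are hypothetical stand-ins for a cache read and a downstream handler; they are not taken from diskcache or dvc-data.

```python
import pickle


class UnpicklingError(pickle.UnpicklingError):
    """Stand-in for the exception type this PR describes."""


def read_cached_value():
    # Simulates a cache read that hits a type outside the allowlist.
    raise UnpicklingError("global 'os.system' is not allowed")


def load_or_default(default=None):
    # Downstream code written against the stdlib hierarchy needs no changes:
    # pickle.PickleError is a base class of pickle.UnpicklingError.
    try:
        return read_cached_value()
    except pickle.PickleError:
        return default
```

Because the handler catches the base class, the new exception is treated like any other corrupt or unreadable pickle.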

Tests

  • All 179 existing tests pass unchanged
  • 26 new security tests covering: safe type round-trips, exploit blocking (os.system, eval, subprocess, __reduce__), cache integration, no-escape-hatch verification, FanoutCache/Deque/Index
  • isort and blue pass
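
For illustration, an exploit-blocking check in the spirit of the `__reduce__` tests listed above might look like the sketch below. It is not one of the PR's tests: `DenyAllGlobals`, `MkdirPayload`, and `payload_is_blocked` are hypothetical names, and a harmless `os.mkdir` call stands in for a real attack so that the side effect can be observed.

```python
import io
import os
import pickle
import tempfile


class DenyAllGlobals(pickle.Unpickler):
    """Hypothetical strict unpickler that refuses every global lookup."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


class MkdirPayload:
    """Stand-in for a malicious object: unpickling it calls os.mkdir(path)."""

    def __init__(self, path):
        self.path = path

    def __reduce__(self):
        return (os.mkdir, (self.path,))


def payload_is_blocked(path):
    """Return True if the payload is rejected and its side effect never happens."""
    data = pickle.dumps(MkdirPayload(path))
    try:
        DenyAllGlobals(io.BytesIO(data)).load()
    except pickle.UnpicklingError:
        return not os.path.exists(path)
    return False


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "pwned")
    assert payload_is_blocked(target)
    # For contrast, the stock loader runs the callable while loading.
    pickle.loads(pickle.dumps(MkdirPayload(target)))
    assert os.path.isdir(target)
```

The rejection happens when the stream asks for the global, before the callable is ever invoked, which is why the directory does not exist after the blocked load.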

Fixes CVE-2025-69872
Closes #357, #360, #362

BREAKING CHANGE: Pickle deserialization now only permits safe built-in
types (builtins, collections, datetime, decimal, fractions, uuid).
Arbitrary objects can no longer be deserialized from cache, preventing
code execution via crafted pickle payloads.

Users caching custom types should migrate to JSONDisk or a custom Disk
subclass. There is no opt-out mechanism by design.

- Add SafeUnpickler with allowlist-based find_class override
- Add UnpicklingError (inherits pickle.UnpicklingError) for downstream
  compatibility with libraries catching pickle.PickleError
- Support pickle protocols 0-5 via __builtin__, copy_reg, and _codecs
  allowlist entries
- Use frozenset values in SAFE_PICKLE_CLASSES to prevent runtime bypass
- Bump version to 6.0.0 (breaking change per semver)

This takes a different approach to PR #361 (HMAC envelope). The HMAC
approach still allows arbitrary deserialization once the signature is
verified, meaning an attacker with read+write access to the cache
directory can read the auto-generated key file and forge valid
payloads. The allowlist approach blocks dangerous types regardless of
filesystem access.

Fixes: CVE-2025-69872
Closes: #357, #360, #362
a2811057970 force-pushed the fix/cve-2025-69872-safe-unpickler branch from 74ffa4a to 50dc1fb on July 3, 2026 11:01
a2811057970 closed this by deleting the head repository on Jul 3, 2026
Development

Successfully merging this pull request may close these issues.

Vulnerability CVE-2025-69872