@@ -67,3 +67,73 @@ whatever information they need in the sdist to build the project.
67 67 The tarball should use the modern POSIX.1-2001 pax tar format, which specifies
68 68 UTF-8 based file names. In particular, source distribution files must be readable
69 69 using the standard library tarfile module with the open flag 'r:gz'.
70+
71+
72+ .. _sdist-archive-features:
73+
74+ Source distribution archive features
75+ ====================================
76+
77+ Because extracting tar files as-is is dangerous, and the results are
78+ platform-specific, archive features of source distributions are limited.
79+
80+ Unpacking with the data filter
81+ ------------------------------
82+
83+ When extracting a source distribution, tools MUST either use
84+ ``tarfile.data_filter`` (e.g. ``TarFile.extractall(..., filter='data')``), OR
85+ follow the *Unpacking without the data filter* section below.
86+
87+ As an exception, on Python interpreters without ``hasattr(tarfile, 'data_filter')``
88+ (:pep:`706`), tools that normally use that filter (directly or indirectly)
89+ MAY warn the user and ignore this specification.
90+ The trade-off between usability (e.g. fully trusting the archive) and
91+ security (e.g. refusing to unpack) is left up to the tool in this case.
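As a minimal sketch of the rule and its exception, the helper below prefers the ``data`` filter and falls back with a warning on interpreters that lack it. The function name ``extract_sdist`` and the choice to warn rather than refuse are illustrative assumptions; a tool may equally decide to abort in the fallback case.

```python
import tarfile
import warnings


def extract_sdist(archive_path, dest_dir):
    """Extract an sdist, using the ``data`` filter when it is available.

    ``extract_sdist`` is a hypothetical helper, not part of any tool's API.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # PEP 706 is available: let tarfile enforce the restrictions.
            tar.extractall(dest_dir, filter="data")
        else:
            # Exception case: this sketch chooses usability over security
            # and fully trusts the archive, after warning the user.
            warnings.warn(
                "tarfile.data_filter is unavailable; "
                "extracting the archive without filtering",
                RuntimeWarning,
            )
            tar.extractall(dest_dir)
```

A stricter tool could raise an error in the ``else`` branch instead of extracting.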
92+
93+
94+ Unpacking without the data filter
95+ ---------------------------------
96+
97+ Tools that do not use the ``data`` filter directly (e.g. for backwards
98+ compatibility, allowing additional features, or not using Python) MUST follow
99+ this section.
100+ (At the time of this writing, the ``data`` filter also follows this section,
101+ but it may get out of sync in the future.)
102+
103+ The following files are invalid in an ``sdist`` archive.
104+ Upon encountering such an entry, tools SHOULD notify the user,
105+ MUST NOT unpack the entry, and MAY abort with a failure:
106+
107+ - Files that would be placed outside the destination directory.
108+ - Links (symbolic or hard) pointing outside the destination directory.
109+ - Device files (including pipes).
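The three checks above could be sketched roughly as follows for a tool that inspects ``tarfile.TarInfo`` members itself. The function name ``invalid_reason`` is hypothetical, and the sketch assumes a POSIX-style filesystem; it is an illustration of the rules, not a replacement for the ``data`` filter.

```python
import os
import tarfile


def invalid_reason(member, dest_dir):
    """Return why ``member`` must not be unpacked, or None if it passes.

    Hypothetical helper; assumes POSIX path semantics.
    """
    dest = os.path.realpath(dest_dir)

    def inside(path):
        # True if ``path`` resolves to ``dest`` or something below it.
        return os.path.commonpath([dest, os.path.realpath(path)]) == dest

    # Leading slashes are dropped before the location is computed.
    target = os.path.join(dest, member.name.lstrip("/"))
    if not inside(target):
        return "file would be placed outside the destination directory"

    if member.issym():
        # Symbolic link targets are relative to the link's own directory.
        link = os.path.join(os.path.dirname(target), member.linkname)
        if not inside(link):
            return "symbolic link points outside the destination directory"
    elif member.islnk():
        # Hard link targets are relative to the archive root.
        link = os.path.join(dest, member.linkname)
        if not inside(link):
            return "hard link points outside the destination directory"

    # isdev() covers character devices, block devices and FIFOs (pipes).
    if member.isdev():
        return "device file or pipe"
    return None
```

A caller would skip any member for which a reason is returned, notify the user, and optionally abort.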
110+
111+ The following are also invalid. Tools MAY treat them as above,
112+ but are NOT REQUIRED to do so:
113+
114+ - Files with a ``..`` component in the filename or link target.
115+ - Links pointing to a file that is not part of the archive.
116+
117+ Tools MAY unpack links (symbolic or hard) as regular files,
118+ using content from the archive.
119+
120+ When extracting ``sdist`` archives:
121+
122+ - Leading slashes in file names MUST be dropped.
123+   (This is nowadays standard behaviour for ``tar`` unpacking.)
124+ - For each ``mode`` (Unix permission) bit, tools MUST either:
125+
126+   - use the platform's default for a new file/directory (respectively),
127+   - set the bit according to the archive, or
128+   - use the bit from ``rw-r--r--`` (``0o644``) for non-executable files or
129+     ``rwxr-xr-x`` (``0o755``) for executable files and directories.
130+
131+ - High ``mode`` bits (setuid, setgid, sticky) MUST be cleared.
132+ - It is RECOMMENDED to preserve the user *executable* bit.
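One way to satisfy the ``mode`` rules is to take the third option for every bit, which clears the high bits as a side effect. The sketch below assumes that a file counts as "executable" when its user executable bit is set in the archive; the function name ``normalized_mode`` is hypothetical.

```python
import stat


def normalized_mode(archive_mode, is_dir):
    """Map an archive mode to 0o755 or 0o644.

    Hypothetical helper. Assumption: a file is treated as executable
    when its user executable bit (S_IXUSR) is set in the archive.
    """
    if is_dir or archive_mode & stat.S_IXUSR:
        # Directories and executable files: rwxr-xr-x.
        return 0o755
    # Non-executable files: rw-r--r--.
    return 0o644
```

Because the result is always one of two fixed values, setuid, setgid and sticky bits never survive, and the user executable bit is preserved as recommended.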
133+
134+
135+ Further hints
136+ -------------
137+
138+ Tool authors are encouraged to consider how the *hints for further
139+ verification* in the ``tarfile`` documentation apply to their tool.