gh-149578: Fix tarfile.open failing on PAX archives with only global headers #149647

Open
ShadiBahaa wants to merge 1 commit into python:main from
ShadiBahaa:fix-tarfile-pax-global-header-eof

@ShadiBahaa

Summary

Fix tarfile.open() raising ReadError when opening a PAX format tar archive that contains only global headers and no regular file members.

Root cause: In TarInfo._proc_pax(), after processing a global header (XGLTYPE), the code tries to read the next header. If the archive has no file entries after the global header, this read encounters the end-of-archive marker and raises EOFHeaderError. This exception was being caught by the generic except HeaderError clause and converted to SubsequentHeaderError, which both TarFile.next() and the append-mode initialization loop treat as a fatal error.

Fix: Catch EOFHeaderError separately before the generic HeaderError handler. For global headers (XGLTYPE), let the EOFHeaderError propagate so callers handle end-of-archive normally. For extended headers (XHDTYPE), a following file entry is mandatory, so the error is still converted to SubsequentHeaderError.
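The handler ordering described above can be sketched in isolation. This is a simplified model, not the patch itself: the exception classes and type constants are local stand-ins that mirror the names used in `tarfile`, and `fetch_following_header` and its `read_next` callable are hypothetical names introduced only for illustration.

```python
class HeaderError(Exception):
    """Stand-in for tarfile.HeaderError (base class for header problems)."""


class EOFHeaderError(HeaderError):
    """Stand-in: the end-of-archive marker was reached."""


class SubsequentHeaderError(HeaderError):
    """Stand-in: a header that must follow another is missing or invalid."""


XHDTYPE = b"x"  # per-member extended header
XGLTYPE = b"g"  # global header


def fetch_following_header(header_type, read_next):
    """Read the header that follows a PAX header of the given type.

    `read_next` is a hypothetical callable that returns the next header
    or raises a HeaderError subclass.
    """
    try:
        return read_next()
    except EOFHeaderError as exc:
        # This clause must come before the generic HeaderError clause,
        # otherwise the subclass would never be handled separately.
        if header_type == XGLTYPE:
            # Nothing is required after a global header, so let the
            # caller treat this as a normal end of archive.
            raise
        # An extended header must be followed by a file entry.
        raise SubsequentHeaderError(str(exc)) from None
    except HeaderError as exc:
        raise SubsequentHeaderError(str(exc)) from None
```

In this model, only the combination of a global header and an end-of-archive marker escapes as `EOFHeaderError`; every other header failure is still reported as `SubsequentHeaderError`.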

Test plan

  • Added test_pax_global_header_empty_archive in PaxWriteTest that:
    • Creates a PAX archive with global headers but no file entries
    • Verifies the archive can be opened for reading and global headers are preserved
    • Verifies the archive can be opened in append mode
    • Verifies appending a file works and global headers are preserved afterward
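A rough reproduction of the first two steps of the test plan might look like the following. It is a sketch rather than the test added in this PR: the helper names `build_global_only_pax_archive` and `read_global_headers` are hypothetical, and it covers only the read path, not append mode.

```python
import io
import tarfile


def build_global_only_pax_archive(pax_headers):
    """Return the bytes of a PAX archive holding only a global header."""
    buf = io.BytesIO()
    # Opening for writing with pax_headers emits the global header;
    # closing without adding members writes the end-of-archive marker.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                      pax_headers=pax_headers):
        pass
    return buf.getvalue()


def read_global_headers(data):
    """Return the archive's global PAX headers, or None on ReadError."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tar:
            return dict(tar.pax_headers)
    except tarfile.ReadError:
        # Expected on interpreters that do not include this fix.
        return None


archive = build_global_only_pax_archive({"comment": "global only"})
print(read_global_headers(archive))
```

On an interpreter without the fix, `read_global_headers` is expected to return `None`, which corresponds to the `ReadError` described in the summary. With the fix applied, it should return the global headers that were written.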

…lobal headers

When a PAX format tar archive contains only global headers and no
regular members, tarfile.open() raised ReadError because the
EOFHeaderError from reaching end-of-archive after the global header
was being caught and converted to SubsequentHeaderError in
TarInfo._proc_pax(). This prevented the caller from handling
end-of-archive normally.

Fix by letting EOFHeaderError propagate when processing a global
header (XGLTYPE), while still treating it as a SubsequentHeaderError
for extended headers (XHDTYPE) where a following file entry is
mandatory.
@ShadiBahaa ShadiBahaa requested a review from ethanfurman as a code owner May 10, 2026 18:20
@python-cla-bot

python-cla-bot Bot commented May 10, 2026

All commit authors signed the Contributor License Agreement.