fix: reindent idempotence broken by accumulating blank lines between statements #857

Open
gaoflow wants to merge 1 commit into andialbrecht:master from gaoflow:fix/reindent-multi-stmt-idempotence

gaoflow commented Jun 24, 2026

Problem

sqlparse.format(sql, reindent=True) is not idempotent when formatting multiple statements separated by semicolons. Re-formatting already formatted output adds an extra blank line between statements: the first pass leaves 1 blank line, and the second pass leaves 2, where the output stabilizes.

Minimal reproducer:

import sqlparse

sql = 'SELECT a FROM b; SELECT c FROM d'
once = sqlparse.format(sql, reindent=True)    # 1 blank line between stmts
twice = sqlparse.format(once, reindent=True)  # 2 blank lines -- bug!
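
The reproducer generalizes to a small idempotence check. The sketch below is only an illustration: `format_passes`, `is_idempotent`, `stable_fmt` and `leaky_fmt` are hypothetical names, and the two toy formatters stand in for a real formatter so that the block runs without sqlparse installed.

```python
from typing import Callable, List


def format_passes(fmt: Callable[[str], str], text: str, passes: int = 3) -> List[str]:
    """Apply fmt repeatedly and return the output of every pass."""
    outputs = []
    current = text
    for _ in range(passes):
        current = fmt(current)
        outputs.append(current)
    return outputs


def is_idempotent(fmt: Callable[[str], str], text: str, passes: int = 3) -> bool:
    """True if every pass after the first reproduces the first pass's output."""
    outputs = format_passes(fmt, text, passes)
    return all(out == outputs[0] for out in outputs)


# Toy formatters, for illustration only (not sqlparse behaviour).
def stable_fmt(text: str) -> str:
    # Collapses every run of whitespace to one space, so f(f(x)) == f(x).
    return " ".join(text.split())


def leaky_fmt(text: str) -> str:
    # Adds one more newline after each semicolon on every pass,
    # mimicking the kind of growth described in this report.
    return text.replace(";", ";\n")
```

To run the same check against the real formatter, one could pass `lambda s: sqlparse.format(s, reindent=True)` as `fmt`.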

Root cause

In _split_statements, after deleting the leading whitespace before a DML/DDL keyword, the condition if prev_: still evaluated true because prev_ held the deleted whitespace token object. This caused a spurious newline insertion even when the keyword was at the very start of the statement (where process() already handles inter-statement separation). On repeated formatting, the blank lines added by process() were then treated as meaningful content and another newline was layered on top.
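
The stale-reference pattern can be shown with a toy model. This is a simplified sketch, not sqlparse's code: `Tok` and `breaks_before` are hypothetical stand-ins, and the token list is a plain Python list.

```python
from typing import List


class Tok:
    """Hypothetical stand-in for a token; not sqlparse's Token class."""

    def __init__(self, value: str) -> None:
        self.value = value

    @property
    def is_whitespace(self) -> bool:
        return self.value.strip() == ""


def breaks_before(tokens: List[Tok], idx: int) -> bool:
    """Return True if a newline would be inserted before tokens[idx]."""
    prev = tokens[idx - 1] if idx > 0 else None
    if prev is not None and prev.is_whitespace:
        # The whitespace token is removed from the list...
        del tokens[idx - 1]
    # ...but `prev` still refers to the removed object, which is truthy,
    # so the check passes even if the keyword is now the first token.
    return bool(prev)


# A statement that starts with blank lines left over from an earlier pass.
stmt = [Tok("\n\n"), Tok("SELECT"), Tok(" "), Tok("c")]
result = breaks_before(stmt, 1)
```

Here `result` is true although `SELECT` has become the first token of `stmt`, which mirrors the spurious insertion described above.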

Fix

After deleting leading whitespace, check whether any non-whitespace token exists before the DML/DDL keyword (via token_prev(tidx, skip_ws=True)). Only insert the separator newline when there is actual content preceding the keyword -- not when all preceding tokens are whitespace from a prior format pass.
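
A minimal sketch of that check on a toy token list, assuming tokens are plain strings. `prev_non_ws` is a hypothetical helper that only imitates what `token_prev(tidx, skip_ws=True)` is used for here, and `needs_break` is likewise an illustrative name.

```python
from typing import List, Optional, Tuple


def prev_non_ws(tokens: List[str], idx: int) -> Tuple[Optional[int], Optional[str]]:
    """Return (index, token) of the nearest non-whitespace token before idx."""
    for i in range(idx - 1, -1, -1):
        if not tokens[i].isspace():
            return i, tokens[i]
    return None, None


def needs_break(tokens: List[str], idx: int) -> bool:
    """Decide whether a separator newline belongs before tokens[idx]."""
    if idx > 0 and tokens[idx - 1].isspace():
        # Drop the whitespace directly before the keyword, as before.
        del tokens[idx - 1]
        idx -= 1
    # Look at what is left in the list rather than at a stale reference.
    _, before = prev_non_ws(tokens, idx)
    return before is not None


leading_ws = ["\n\n", "SELECT", " ", "c"]   # keyword preceded only by whitespace
mid_stmt = ["x", " ", "SELECT", " ", "c"]   # keyword preceded by real content
```

With these inputs, `needs_break(leading_ws, 1)` is false and `needs_break(mid_stmt, 2)` is true, so a separator is only added when real content precedes the keyword.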

Changes

  • sqlparse/filters/reindent.py: +8/-2 in _split_statements
  • tests/test_format.py: +31 lines (2 new regression tests)

All 489 existing + new tests pass.

…statements

_split_statements treated inter-statement blank lines (added by
process()) as meaningful content on the next format pass, inserting
an extra newline each time.  After the leading whitespace is deleted,
check whether any non-whitespace token precedes the DML/DDL keyword
before inserting a separator line.
Copilot AI review requested due to automatic review settings June 24, 2026 09:12

Copilot AI left a comment

Pull request overview

Fixes a formatting idempotence bug in ReindentFilter: repeated sqlparse.format(..., reindent=True) calls could accumulate extra blank lines between semicolon-separated statements. The fix avoids inserting an extra separator when the “previous token” is only whitespace.

Changes:

  • Adjust _split_statements to insert a newline only when there is actual non-whitespace content before a DML/DDL keyword.
  • Add regression tests asserting formatting idempotence for multi-statement input and for UNION within a single statement.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

  • sqlparse/filters/reindent.py: Prevents inserting an extra separator newline when a statement begins with only whitespace from prior formatting.
  • tests/test_format.py: Adds regression tests ensuring reindent=True formatting is idempotent across multiple passes.

Comment on lines 94 to +103
              # only break if it's not the first token
              if prev_:
    -             tlist.insert_before(tidx, self.nl())
    -             tidx += 1
    +             # Check that there is actual non-whitespace content before
    +             # this keyword. If all preceding tokens are whitespace
    +             # (e.g. inter-statement blank lines that process() already
    +             # handles), skip -- otherwise we double-count the separator.
    +             _, before = tlist.token_prev(tidx, skip_ws=True)
    +             if before is not None:
    +                 tlist.insert_before(tidx, self.nl())
    +                 tidx += 1

2 participants