Implement SQL grouping extensions: ROLLUP, CUBE, GROUPING SETS, GROUPING(), GROUP BY DISTINCT #9029

Open: livius2 wants to merge 1 commit into FirebirdSQL:master from livius2:grouping_sets__flattened

livius2 commented May 17, 2026

Hi

A few years ago I made an earlier attempt at implementing ROLLUP / CUBE support in Firebird.
That prototype was based on joining the input stream with a procedure producing a sequence of numbers/masks.
It worked for simple cases, but the approach turned out to be hard to generalize, difficult to integrate cleanly with the optimizer/executor, and not a good long-term fit for Firebird internals.

After several years of following Firebird development and repository changes, I decided to revisit this feature with a cleaner implementation strategy.

This pull request implements SQL extended grouping support, including:

  • GROUP BY ROLLUP (...)
  • GROUP BY CUBE (...)
  • GROUP BY GROUPING SETS (...)
  • mixed grouping elements, for example GROUP BY a, ROLLUP(b, c)
  • empty grouping set ()
  • GROUPING() with multiple arguments, SQL feature T433
  • GROUP BY DISTINCT, SQL feature T434
  • GROUPING_ID as a compatibility extension
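As an illustration only (this is not code from the PR), the way the grouping elements listed above expand into plain grouping sets can be sketched in Python. The helper names `rollup`, `cube`, and `group_by` are hypothetical. The sketch assumes that mixed elements combine as a cross product and that GROUP BY DISTINCT compares grouping sets without regard to column order; the PR may handle either point differently.

```python
from itertools import combinations, product

def rollup(*cols):
    # ROLLUP(a, b, c) -> (a, b, c), (a, b), (a), ()
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

def cube(*cols):
    # CUBE(a, b) -> every subset of the columns, largest subsets first
    return [subset
            for n in range(len(cols), -1, -1)
            for subset in combinations(cols, n)]

def group_by(*elements, distinct=False):
    # Each element is a list of grouping sets; a plain column a is [("a",)].
    # Mixed elements combine as a cross product of their grouping sets.
    sets = []
    for parts in product(*elements):
        # Concatenate the parts, dropping columns repeated inside one set.
        merged = tuple(dict.fromkeys(col for part in parts for col in part))
        sets.append(merged)
    if distinct:
        # Assumption of this sketch: two sets are duplicates when they
        # contain the same columns, regardless of order.
        seen, unique = set(), []
        for s in sets:
            key = frozenset(s)
            if key not in seen:
                seen.add(key)
                unique.append(s)
        sets = unique
    return sets

# GROUP BY a, ROLLUP(b, c)
mixed = group_by([("a",)], rollup("b", "c"))
# -> [("a", "b", "c"), ("a", "b"), ("a",)]
```

With these helpers, `group_by(rollup("a", "b"), rollup("a", "c"))` yields nine grouping sets, several of them repeated, while passing `distinct=True` reduces the result to five.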

The implementation lowers advanced grouping to the existing aggregate and UNION ALL machinery, without adding new BLR opcodes or requiring ODS changes.
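To make the lowering idea concrete, here is a rough Python model over in-memory rows, not the actual DSQL/BLR transformation. It runs one aggregate pass per grouping set and concatenates the results, which plays the role of UNION ALL. The function names, the dictionary row format, and the choice of SUM as the only aggregate are assumptions made for the example. The `grouping` helper follows the usual SQL convention: one bit per argument, leftmost argument most significant, and a bit is 1 when that column is not part of the current grouping set.

```python
def grouping(set_cols, *args):
    # GROUPING(a, b, ...) as a bitmask for the current grouping set.
    bits = 0
    for col in args:
        bits = (bits << 1) | (0 if col in set_cols else 1)
    return bits

def aggregate_one_set(rows, all_cols, set_cols, value_col):
    # One ordinary GROUP BY over set_cols; other columns are output as NULL (None).
    groups = {}
    for row in rows:
        key = tuple(row[c] if c in set_cols else None for c in all_cols)
        groups[key] = groups.get(key, 0) + row[value_col]
    if not set_cols and not groups:
        # The empty grouping set () yields one row even for empty input;
        # SUM over no rows is NULL.
        groups[(None,) * len(all_cols)] = None
    mask = grouping(set_cols, *all_cols)
    return [key + (total, mask) for key, total in groups.items()]

def lowered_group_by(rows, all_cols, grouping_sets, value_col):
    result = []
    for set_cols in grouping_sets:
        # One aggregate branch per grouping set, appended like UNION ALL.
        result.extend(aggregate_one_set(rows, all_cols, set_cols, value_col))
    return result

sales = [
    {"region": "EU", "city": "Rome", "amount": 10},
    {"region": "EU", "city": "Oslo", "amount": 5},
    {"region": "US", "city": "Reno", "amount": 7},
]
# Equivalent of GROUP BY ROLLUP(region, city) with SUM(amount), GROUPING(region, city)
out = lowered_group_by(sales, ("region", "city"),
                       [("region", "city"), ("region",), ()], "amount")
# last two rows: ("US", None, 7, 1) and (None, None, 22, 3)
```

The mask column is what lets a caller tell a subtotal row apart from a detail row whose column genuinely contains NULL. The model also shows the cost raised later in this thread: the input is scanned once per grouping set.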

This is a flattened commit from my local repository, which contained too many intermediate commits and experiments to submit as they were.
Please note that all English text was produced with a translator.
I will add tests to Firebird-qa soon.
Tests added now:
FirebirdSQL/firebird-qa#38

  GROUP BY ROLLUP(...)
  GROUP BY CUBE(...)
  GROUP BY GROUPING SETS (...)
  GROUP BY ()
  GROUPING(...)
  GROUPING_ID(...)
  GROUP BY DISTINCT

sim1984 (Contributor) commented May 18, 2026

> The implementation lowers advanced grouping to existing aggregate and UNION ALL machinery, without adding new BLR opcodes, ODS changes.

This is unfortunate. At the very least, aggregates in queries that use only ROLLUP (WITH ROLLUP) can be computed in a single pass, without re-executing the query. That is unlikely to be possible with CUBE and GROUPING SETS, but there are options there too, such as repeating the groupings over records stored in the "Record Buffer".

livius2 (Author) commented May 18, 2026

I agree. This implementation intentionally uses lowering to the existing aggregate + UNION ALL infrastructure as a first correctness-focused step, avoiding new BLR, ODS, record source, or executor changes.

This is not meant to be the final optimal execution strategy. ROLLUP is a good candidate for a future single-pass implementation, as its grouping levels are hierarchical and can be produced from one ordered/grouped stream. CUBE and arbitrary GROUPING SETS are less straightforward, but there are still possible optimizations, for example sharing/materializing the base stream or reusing buffered records instead of re-executing the full input for every grouping set.
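The single-pass idea for ROLLUP can be sketched as follows. This is a Python illustration under stated assumptions, not a proposal for the actual executor: the input is assumed to be already sorted by the ROLLUP columns, SUM is the only aggregate, and the function name `rollup_single_pass` is hypothetical. One running total is kept per hierarchy level, and a level is emitted and reset whenever a column at or above that level changes.

```python
def rollup_single_pass(sorted_rows, cols, value_col):
    n = len(cols)
    # totals[k] accumulates the group formed by the first k columns;
    # totals[0] is the grand total.
    totals = [0] * (n + 1)
    prev = None
    for row in sorted_rows:
        key = tuple(row[c] for c in cols)
        if prev is not None and key != prev:
            # First column position where the key differs from the previous row.
            changed = next(i for i in range(n) if key[i] != prev[i])
            # Close every level deeper than the change, most detailed first.
            for level in range(n, changed, -1):
                yield prev[:level] + (None,) * (n - level) + (totals[level],)
                totals[level] = 0
        for level in range(n + 1):
            totals[level] += row[value_col]
        prev = key
    if prev is None:
        # Empty input: only the grand-total row, with a NULL sum.
        yield (None,) * n + (None,)
        return
    # End of stream: close all open levels, finishing with the grand total.
    for level in range(n, -1, -1):
        yield prev[:level] + (None,) * (n - level) + (totals[level],)

rows = [
    {"region": "EU", "city": "Oslo", "amount": 5},
    {"region": "EU", "city": "Rome", "amount": 10},
    {"region": "US", "city": "Reno", "amount": 7},
]
result = list(rollup_single_pass(rows, ("region", "city"), "amount"))
# first three rows: ("EU", "Oslo", 5), ("EU", "Rome", 10), ("EU", None, 15)
```

The same approach does not carry over directly to CUBE or arbitrary GROUPING SETS, because their grouping sets are not all prefixes of a single sort order.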

For this PR, the main goal is SQL semantics and integration with existing DSQL/BLR paths. Native execution and optimizer improvements can be developed as a follow-up without changing the SQL surface introduced here.

sim1984 (Contributor) commented May 18, 2026

I don't see any reason to change the ODS. Introducing new BLR verbs may be needed to make the implementation more efficient, but it is not strictly required.

The main question: do you plan to implement support in the optimizer yourself or wait for someone else to do it?

livius2 (Author) commented May 18, 2026

Yes. If I understand correctly, by optimizer support you mean a future native execution strategy, for example single-pass execution for pure ROLLUP cases.

I do plan to work on that, but I would prefer to do it after this PR is accepted and merged. This PR has already required a significant amount of work, and I would like to avoid investing in a larger implementation before knowing whether this approach is acceptable.
