Skip to content

POC: CTE materialization for multi-referenced CTEs#22551

Draft
nathanb9 wants to merge 14 commits into
apache:mainfrom
nathanb9:cte-materialization-poc
Draft

POC: CTE materialization for multi-referenced CTEs#22551
nathanb9 wants to merge 14 commits into
apache:mainfrom
nathanb9:cte-materialization-poc

Conversation

@nathanb9
Copy link
Copy Markdown
Contributor

@nathanb9 nathanb9 commented May 27, 2026

Materialize CTEs and duplicate subplans — compute once, reuse many.


How It Works

  1. SQL Planner — Detects multi-referenced CTEs and wraps them in MaterializedCteProducer/Reader nodes. Respects MATERIALIZED / NOT MATERIALIZED hints. Skips cheap non-volatile CTEs.

  2. Optimizer (CommonSubplanEliminate) — Detects structurally identical subplans anywhere in the plan tree (not just CTEs) via subtree hashing. Wraps duplicates in the same Producer/Reader nodes. Catches repeated views, self-joins, and generated SQL patterns.

  3. Optimizer (InlineCte) — Runs after filter pushdown. Removes materialization where it's not beneficial:

    • force_materialized = true → keep (explicit hint or CommonSubplan-detected)
    • ref_count <= 1 → inline (single-ref or dead)
    • Volatile functions → keep (evaluate-once semantics)
    • Cheap CTE (empty/literal) → inline
    • Ends in Aggregate/Distinct/Window → keep (unless disjoint filters → inline)
    • base_table_refs > 2 AND refs * count > 10 → keep (expensive)
    • Continuation has LIMIT → inline (early termination)
  4. Extension Planner (MaterializedCtePlanner) — Converts logical nodes to physical. Creates one shared cache per materialized subplan and wires all scans to it.

  5. Execution (MaterializeExec) — First partition triggers collect_partitioned via OnceAsync. All partitions await the shared future, then execute the continuation. Scans serve cached batches via MemoryStream.

  6. Optimizer (CteFilterPusher) (follow-up) — For CTEs that survive inlining, OR-combines filters from all readers and pushes the combined predicate into the CTE body. Community feedback welcome on scoping.


Structs

Logical Nodes (datafusion-expr)

pub struct MaterializedCteProducer {
    pub name: String,
    pub cte_plan: Arc<LogicalPlan>,
    pub continuation: Arc<LogicalPlan>,
    pub schema: DFSchemaRef,
    pub force_materialized: bool,
}

pub struct MaterializedCteReader {
    pub name: String,
    pub schema: DFSchemaRef,
}

Physical Operators (datafusion-physical-plan)

General-purpose — reusable for CTEs, views, and duplicate subplans.

pub struct MaterializedCache {
    label: String,
    once: OnceAsync<Vec<Vec<RecordBatch>>>,
}

pub struct MaterializeExec {
    label: String,
    source: Arc<dyn ExecutionPlan>,
    continuation: Arc<dyn ExecutionPlan>,
    cache: Arc<MaterializedCache>,
    metrics: ExecutionPlanMetricsSet,
    properties: Arc<PlanProperties>,
}

pub struct MaterializedScanExec {
    label: String,
    schema: SchemaRef,
    cache: Arc<MaterializedCache>,
    metrics: ExecutionPlanMetricsSet,
    statistics: Arc<Statistics>,
    properties: Arc<PlanProperties>,
}

Optimizer Rules (datafusion-optimizer)

pub struct InlineCte {}              // inlines materialized CTEs where not beneficial
pub struct CteFilterPusher {}        // pushes OR-combined filters into CTE body (follow-up)
pub struct CommonSubplanEliminate {} // detects duplicate subplans via hashing

Extension Planner (datafusion-core)

pub struct MaterializedCtePlanner {
    caches: Mutex<HashMap<String, Arc<MaterializedCache>>>,
    partition_counts: Mutex<HashMap<String, usize>>,
}

Syntax / User Configuration

Session config:

SET datafusion.execution.enable_materialized_ctes = true;  -- default in POC
SET datafusion.execution.enable_materialized_ctes = false; -- disable

Per-CTE SQL hints:

WITH t AS MATERIALIZED (SELECT ...)           -- always materialize, optimizer cannot inline
WITH t AS NOT MATERIALIZED (SELECT ...)       -- never materialize, always inline
WITH t AS (SELECT ...)                        -- automatic: materialize if multi-ref, optimizer decides

CommonSubplanEliminate (new)

Detects duplicate subplans via structural hashing of LogicalPlan subtrees. Any subtree appearing 2+ times with sufficient cost (contains TableScan/Aggregate/Join/Window, >= 3 nodes) is wrapped in Producer/Reader for compute-once semantics.

This benefits:

  • Views referenced multiple times (inlined as identical plan clones)
  • Generated SQL from ORMs/BI tools with repeated subqueries
  • Self-joins on identical filtered scans
  • Any pattern where the user didn't think to use a CTE

Currently does not help TPC-DS queries with:

  • Same table, different filters (not structurally identical)
  • Same filter, different projections/aggregates
  • Same dimension joins, different fact tables

Benchmark Results

TPC-DS SF1, 10 iterations, vs main:

Overall: 1.041x (+4.1%)

Query      Main (ms)   Branch (ms)   Speedup
------     ---------   -----------   -------
Q47          158.6         64.7       2.45x
Q57          118.4         48.7       2.43x
Q59           87.1         50.3       1.73x
Q75          111.0         65.2       1.70x
Q74          109.8         65.1       1.69x
Q2            56.9         34.5       1.65x
Q64          289.6        182.9       1.58x
Q4           312.6        212.6       1.47x
Q11          191.0        154.8       1.23x
Q39           77.0         63.4       1.22x

19 queries faster (>1.1x), 61 neutral, 19 slower (<0.9x noise from branch age)
0 failures

All gains are from CTE materialization via the SQL planner. CommonSubplanEliminate does not trigger on TPC-DS because repeated-table patterns there have differing filters/projections (not structurally identical subtrees).

Add support for materializing Common Table Expressions (CTEs) that are
referenced more than once in a query. When a CTE ends in an expensive
operation (Aggregate, Distinct, Window, or Union), the CTE is computed
once and its results are cached in memory for reuse by multiple consumers.

This implements a DuckDB-inspired heuristic: only materialize CTEs that
end in expensive operations, avoiding regressions where predicate pushdown
through the CTE would be more beneficial.

The implementation uses Extension nodes (UserDefinedLogicalNode) to avoid
modifying the core LogicalPlan enum, and introduces:
- MaterializedCteProducer/Reader logical nodes
- MaterializedCteExec/ReaderExec physical operators
- MaterializedCtePlanner extension planner
- Dependency-ordered execution for nested materialized CTEs

Benchmarked on TPC-DS SF1 (10 iterations):
- Q47: 2.85x speedup (401ms → 141ms)
- Q57: 2.67x speedup (112ms → 42ms)
- Q2:  1.58x speedup (101ms → 64ms)
- Q74: 1.90x speedup (311ms → 164ms)

Relates to: apache#17737
@github-actions github-actions Bot added sql SQL Planner logical-expr Logical plan and expressions core Core DataFusion crate common Related to common crate physical-plan Changes to the physical-plan crate labels May 27, 2026
@nathanb9 nathanb9 changed the title feat: support CTE materialization for multi-referenced CTEs POC: CTE materialization for multi-referenced CTEs May 27, 2026
@github-actions github-actions Bot added documentation Improvements or additions to documentation sqllogictest SQL Logic Tests (.slt) labels May 27, 2026
@neilconway
Copy link
Copy Markdown
Contributor

@nathanb9 Cool! Can you at-me when this is ready for review?

nathanb9 and others added 4 commits May 27, 2026 15:02
… filters

When a CTE ending in aggregate/distinct/window is referenced multiple
times but each reference filters on a different literal value of the
same column (e.g., d_moy=4 vs d_moy=5), inlining is better because
the optimizer can push each filter through the aggregate, specializing
each copy to process only a subset of the data.

This fixes the TPC-DS Q39 regression (was 1.78x slower with
materialization, now 1.01x — within noise). The detection handles:
- Column-qualified filters above joins (inv1.d_moy=4, inv2.d_moy=5)
- Simple constant arithmetic expressions (4+1 → 5)
- Aliased group-by columns (d_year → syear)

Also fixes a clippy warning: pass `statistics` by reference in
`replace_materialized_cte_readers`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The disjoint filter detection previously only looked at Filter nodes.
When using JOIN ... ON syntax (vs comma-join with WHERE), the equality
conditions like `a.d_moy = 1 AND b.d_moy = 2` live inside the Join
node's filter field, not as a separate Filter node.

This fixes a 4x regression on queries using JOIN ON with disjoint
group-key predicates (e.g. benchmark Q4: inventory comparison across
months).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the optimizer Optimizer rules label May 29, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 29, 2026

Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details
     Cloning apache/main
    Building datafusion v53.1.0 (current)
       Built [  91.286s] (current)
     Parsing datafusion v53.1.0 (current)
      Parsed [   0.031s] (current)
    Building datafusion v53.1.0 (baseline)
       Built [  90.236s] (baseline)
     Parsing datafusion v53.1.0 (baseline)
      Parsed [   0.032s] (baseline)
    Checking datafusion v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.683s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 184.501s] datafusion
    Building datafusion-common v53.1.0 (current)
       Built [  30.909s] (current)
     Parsing datafusion-common v53.1.0 (current)
      Parsed [   0.054s] (current)
    Building datafusion-common v53.1.0 (baseline)
       Built [  30.999s] (baseline)
     Parsing datafusion-common v53.1.0 (baseline)
      Parsed [   0.054s] (baseline)
    Checking datafusion-common v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.759s] 222 checks: 221 pass, 1 fail, 0 warn, 30 skip

--- failure constructible_struct_adds_field: externally-constructible struct adds field ---

Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
        ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.47.0/src/lints/constructible_struct_adds_field.ron

Failed in:
  field ExecutionOptions.enable_materialized_ctes in /home/runner/work/datafusion/datafusion/datafusion/common/src/config.rs:469

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  64.088s] datafusion-common
    Building datafusion-expr v53.1.0 (current)
       Built [  24.958s] (current)
     Parsing datafusion-expr v53.1.0 (current)
      Parsed [   0.070s] (current)
    Building datafusion-expr v53.1.0 (baseline)
       Built [  24.426s] (baseline)
     Parsing datafusion-expr v53.1.0 (baseline)
      Parsed [   0.068s] (baseline)
    Checking datafusion-expr v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   1.304s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  52.191s] datafusion-expr
    Building datafusion-optimizer v53.1.0 (current)
       Built [  25.348s] (current)
     Parsing datafusion-optimizer v53.1.0 (current)
      Parsed [   0.028s] (current)
    Building datafusion-optimizer v53.1.0 (baseline)
       Built [  25.493s] (baseline)
     Parsing datafusion-optimizer v53.1.0 (baseline)
      Parsed [   0.028s] (baseline)
    Checking datafusion-optimizer v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.183s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  52.156s] datafusion-optimizer
    Building datafusion-physical-plan v53.1.0 (current)
       Built [  33.976s] (current)
     Parsing datafusion-physical-plan v53.1.0 (current)
      Parsed [   0.120s] (current)
    Building datafusion-physical-plan v53.1.0 (baseline)
       Built [  33.909s] (baseline)
     Parsing datafusion-physical-plan v53.1.0 (baseline)
      Parsed [   0.119s] (baseline)
    Checking datafusion-physical-plan v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.700s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  70.436s] datafusion-physical-plan
    Building datafusion-sql v53.1.0 (current)
       Built [  38.753s] (current)
     Parsing datafusion-sql v53.1.0 (current)
      Parsed [   0.029s] (current)
    Building datafusion-sql v53.1.0 (baseline)
       Built [  38.556s] (baseline)
     Parsing datafusion-sql v53.1.0 (baseline)
      Parsed [   0.029s] (baseline)
    Checking datafusion-sql v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.267s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  78.948s] datafusion-sql
    Building datafusion-sqllogictest v53.1.0 (current)
       Built [ 156.026s] (current)
     Parsing datafusion-sqllogictest v53.1.0 (current)
      Parsed [   0.021s] (current)
    Building datafusion-sqllogictest v53.1.0 (baseline)
       Built [ 158.140s] (baseline)
     Parsing datafusion-sqllogictest v53.1.0 (baseline)
      Parsed [   0.022s] (baseline)
    Checking datafusion-sqllogictest v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.102s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 317.809s] datafusion-sqllogictest

@github-actions github-actions Bot added the auto detected api change Auto detected API change label May 29, 2026
@nathanb9
Copy link
Copy Markdown
Contributor Author

run benchmarks tpcds

@adriangbot
Copy link
Copy Markdown

Hi @nathanb9, thanks for the request (#22551 (comment)). Only whitelisted users can trigger benchmarks. Allowed users: Dandandan, Fokko, Jefffrey, Omega359, adriangb, alamb, asubiotto, brunal, buraksenn, cetra3, codephage2020, coderfender, comphead, erenavsarogullari, etseidl, friendlymatthew, gabotechs, geoffreyclaude, grtlr, haohuaijin, jonathanc-n, kevinjqliu, klion26, kosiew, kumarUjjawal, kunalsinghdadhwal, liamzwbao, mbutrovich, mkleen, mzabaluev, neilconway, rluvaton, sdf-jkl, timsaucer, xudong963, zhuqi-lucas.


File an issue against this benchmark runner

@nathanb9
Copy link
Copy Markdown
Contributor Author

@neilconway Okay broke down into PR.
Please review when you can :#22675

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto detected api change Auto detected API change common Related to common crate core Core DataFusion crate documentation Improvements or additions to documentation logical-expr Logical plan and expressions optimizer Optimizer rules physical-plan Changes to the physical-plan crate sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants