Add a .stream() terminal with server-side parameter binding to Layer 3 #59

Merged
wudidapaopao merged 1 commit into chdb-io:main from ShawnChen-Sirius:feat/layer3-stream-terminal on Jun 26, 2026

@ShawnChen-Sirius

Contributor

What

Adds a .stream() terminal to the Layer 3 fluent SELECT builder so large result
sets arrive lazily, one row at a time, through Layer 1's streaming cursor instead
of buffering the whole result.

for await (const row of db.selectFrom('events').selectAll().where('ts', '>', cutoff).stream()) {
  // O(chunk) memory, not O(result)
}

Why

The fluent read path buffered the entire result (parseRows over the full text),
so a forgotten .limit() on a big OLAP scan peaked at ~3–4× the result size and
risked OOM. Layer 1 already had a streaming cursor, but the fluent builder never
used it, and its param-less streaming C entry could not carry the compiler's bound
values.
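
The memory difference described above can be illustrated with a small sketch that is independent of chDB. It assumes the result arrives as text chunks of newline-delimited JSON (one object per line). `parseRowsBuffered`, `parseRowsStreaming`, and `fakeChunks` are hypothetical names for illustration, not the library's parseRows or its Layer 1 cursor.

```typescript
type Row = Record<string, unknown>;

// Buffered path: the whole result text must exist in memory before any row is parsed.
function parseRowsBuffered(fullText: string): Row[] {
  return fullText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as Row);
}

// Streaming path: only the current chunk plus one partial line is held at a time.
async function* parseRowsStreaming(chunks: AsyncIterable<string>): AsyncGenerator<Row> {
  let tail = ""; // incomplete line carried over from the previous chunk
  for await (const chunk of chunks) {
    const lines = (tail + chunk).split("\n");
    tail = lines.pop() ?? ""; // the last piece may be cut mid-row
    for (const line of lines) {
      if (line.trim() !== "") yield JSON.parse(line) as Row;
    }
  }
  if (tail.trim() !== "") yield JSON.parse(tail) as Row;
}

// Hypothetical chunk source standing in for a streaming cursor.
// Note that the second row is split across two chunks.
async function* fakeChunks(): AsyncGenerator<string> {
  yield '{"id":1}\n{"i';
  yield 'd":2}\n';
  yield '{"id":3}';
}
```

Both functions yield the same rows for the same text; the streaming variant simply never needs the full text at once, which is the property the fluent builder was missing.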

libchdb already ships chdb_stream_query_with_params_n (streaming + server-side
binding), so no upstream change is needed.

How

  • Native (lib/chdb_node.cpp): StreamQuery now takes an optional
    pre-formatted params object and routes to chdb_stream_query_with_params_n,
    mirroring the one-shot QueryWithParams path; the param-less path is unchanged.
  • Layer 1 (index.js / index.d.ts): new Session.queryStreamBind;
    queryStream refactored to share a single #startStream. Values are bound
    exactly like queryBindAsync.
  • Layer 3 (builder/select.ts, execute/terminal.ts, runtime.ts):
    SelectQueryBuilder.stream() returns an AsyncIterableIterator<O> via
    executeSelectStream. Values stay bound server-side, so a streamed read is as
    injection-safe as .execute(). It requires a bound session (the default
    connection has no streaming cursor) and honors an AbortSignal.
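
The Layer 3 behaviour listed above (bound-session requirement, server-side params, AbortSignal) can be sketched roughly as follows. This is not the PR's executeSelectStream: `FakeSession`, `CompiledQuery`, `streamRows`, and `demoSession` are hypothetical stand-ins, the `{cutoff:Int64}` placeholder is an assumption about what the compiler emits, and the in-memory session only imitates a server that applies the bound value.

```typescript
type StreamedRow = Record<string, unknown>;

// Hypothetical shapes; the real runtime types in the PR may differ.
interface FakeSession {
  queryStreamBind(sql: string, params: Record<string, unknown>): AsyncIterable<StreamedRow>;
}

interface CompiledQuery {
  sql: string;
  params: Record<string, unknown>;
}

// Yields rows lazily. Errors surface on the first iteration step,
// because a generator body does not run until it is first pulled.
async function* streamRows(
  session: FakeSession | undefined,
  query: CompiledQuery,
  signal?: AbortSignal,
): AsyncGenerator<StreamedRow> {
  if (!session) throw new Error(".stream() requires a bound session");
  if (signal?.aborted) throw new Error("aborted");
  // Values travel separately from the SQL text; nothing is interpolated here.
  for await (const row of session.queryStreamBind(query.sql, query.params)) {
    if (signal?.aborted) throw new Error("aborted");
    yield row;
  }
}

// In-memory stand-in for a session: filters a fixed data set by the bound value.
const demoSession: FakeSession = {
  async *queryStreamBind(_sql, params) {
    const cutoff = params.cutoff as number;
    for (const ts of [1, 5, 10, 20]) {
      if (ts > cutoff) yield { ts };
    }
  },
};
```

Checking the signal between rows means an abort takes effect at the next row boundary rather than mid-row; the actual implementation may cancel earlier.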

Note on 64-bit precision

chDB builds a streaming query's output format from the connection's session
settings at open time, not from a query's trailing SETTINGS clause (which is how
executeSelect passes the setting, inside the SQL). So the row view SETs
output_format_json_quote_64bit_integers = 1 before opening the stream. Without
it, 64-bit ints would stream as lossy JS numbers and streamed rows would differ
from executed rows.
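
The precision loss is a property of JavaScript numbers rather than of chDB, and can be reproduced with plain JSON.parse. The sketch below uses hand-written JSON strings as stand-ins for the unquoted and quoted output; the variable names are illustrative only.

```typescript
// 2^53 + 1: the smallest positive integer a JS number cannot represent exactly.
const big = "9007199254740993";

// Unquoted 64-bit integer: JSON.parse produces a double and rounds silently.
const unquoted = JSON.parse(`{"id":${big}}`) as { id: number };

// Quoted 64-bit integer (the shape the quoting setting asks for): digits survive as a string.
const quoted = JSON.parse(`{"id":"${big}"}`) as { id: string };

const lossy = String(unquoted.id) !== big; // the number no longer prints the original digits
const exact = quoted.id === big;           // the string still carries every digit

console.log({ unquoted: unquoted.id, quoted: quoted.id, lossy, exact });
```

A caller that needs arithmetic on the quoted value can convert it with BigInt, at the cost of leaving the plain number type.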

Tests

  • 5 new .stream() cases: parity with .execute(), bound filter, 64-bit
    precision, no-session throw, abort.
  • Full v3 suite green (373 passed / 11 skipped), including the existing Layer 1
    streaming tests.

🤖 Generated with Claude Code

ShawnChen-Sirius force-pushed the feat/layer3-stream-terminal branch from be68c63 to fb55d05 on June 26, 2026 02:40
ShawnChen-Sirius requested a review from Copilot on June 26, 2026 03:58
@ShawnChen-Sirius

Contributor Author

@chibugai, review it

Copilot AI left a comment

Pull request overview

Adds a Layer 3 .stream() terminal for fluent SELECT queries so large result sets can be consumed lazily (row-by-row) via Layer 1’s streaming cursor, while preserving server-side parameter binding for injection safety.

Changes:

  • Add SelectQueryBuilder.stream() returning AsyncIterableIterator<Row> via a new executeSelectStream terminal.
  • Add Layer 1 Session.queryStreamBind() and route native streaming through chdb_stream_query_with_params_n when params are provided.
  • Add Layer 3 streaming tests covering parity, server-side binding, 64-bit precision, no-session error, and abort.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| test/v3/layer3/stream.test.ts | New Layer 3 .stream() behavioral tests (parity, binding, precision, abort). |
| src/layer3/runtime.ts | Extend runtime session typing with queryStreamBind and RuntimeRowStream. |
| src/layer3/execute/terminal.ts | Implement executeSelectStream / streaming row generator; enforce “bound session” requirement. |
| src/layer3/builder/select.ts | Add .stream() terminal to SelectQueryBuilder. |
| lib/chdb_node.cpp | Extend native StreamQuery wrapper to optionally bind params server-side. |
| index.js | Add Session.queryStreamBind and refactor stream startup through a shared #startStream. |
| index.d.ts | Add TypeScript declaration for Session.queryStreamBind. |

Comment thread src/layer3/execute/terminal.ts
Comment thread src/layer3/execute/terminal.ts
Comment thread src/layer3/execute/terminal.ts
Comment thread src/layer3/builder/select.ts Outdated
@chibugai

Reviewed — OCR deep pass (opus-4-8, 6 files) plus a manual read of the native and Layer 3 paths. No correctness or safety issues found; looks good to me.

Things I specifically checked and was happy with:

  • lib/chdb_node.cpp — the cnames/cvalues pointers stay valid through the chdb_stream_query_with_params_n call (they point into the live names/values vectors), and the empty-params case is guarded with n ? ... : nullptr. Mirrors the one-shot QueryWithParams path cleanly — no use-after-free.
  • 64-bit ints — this is the subtle part and you handled it right: opening the stream SETs output_format_json_quote_64bit_integers = 1 on the session first, since chDB builds the streaming output format from session settings at open time rather than the trailing SETTINGS clause. That keeps streamed rows identical to executed rows.
  • Server-side binding is preserved through queryStreamBind, so .stream() is as injection-safe as .execute().

One minor, non-blocking note: that SET persists on the session and isn't restored, so a caller who had deliberately set output_format_json_quote_64bit_integers = 0 would see it flipped for the rest of the session. The code comment already calls this out as intentional — just flagging it for awareness.
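
For a caller who does depend on the unquoted setting, one possible caller-side workaround is to re-apply the preferred value once streaming is finished. The sketch below is only an illustration of that idea, not something the PR does: `FakeSettingsSession` and `withRestoredSetting` are hypothetical, and the fake session merely records SET statements instead of talking to chDB.

```typescript
// Hypothetical minimal session: records SET statements in a map.
class FakeSettingsSession {
  readonly settings = new Map<string, string>();

  query(sql: string): void {
    const m = /^SET\s+(\w+)\s*=\s*(\S+)$/i.exec(sql.trim());
    if (m && m[1] !== undefined && m[2] !== undefined) {
      this.settings.set(m[1], m[2]);
    }
  }
}

const SETTING = "output_format_json_quote_64bit_integers";

// Runs `body`, then re-applies the value the caller wants for the rest of the session,
// even if `body` throws.
async function withRestoredSetting<T>(
  session: FakeSettingsSession,
  desired: string,
  body: () => Promise<T>,
): Promise<T> {
  try {
    return await body();
  } finally {
    session.query(`SET ${SETTING} = ${desired}`);
  }
}
```

The caller has to pass the desired value explicitly here because the sketch makes no assumption about how a real session would report its current settings.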

The fluent read path buffered the whole result (parseRows over the full
text), so a forgotten .limit() on a big OLAP scan peaked at ~3-4x the
result size and risked OOM. Layer 1 already had a streaming cursor but
the fluent builder never used it, and its param-less C entry could not
carry the compiler's bound values.

libchdb already ships chdb_stream_query_with_params_n (a streaming +
server-side binding entry), so no upstream change is needed:

- Native StreamQuery now takes an optional pre-formatted params object and
  routes to chdb_stream_query_with_params_n, mirroring the one-shot
  QueryWithParams path; the param-less path is unchanged.
- Layer 1 gains Session.queryStreamBind (queryStream refactored to share a
  single #startStream), binding values exactly like queryBindAsync.
- Layer 3 SelectQueryBuilder.stream() returns an AsyncIterableIterator of
  rows via executeSelectStream. Values stay bound server-side, so a
  streamed read is as injection-safe as .execute(). It requires a bound
  session (the default connection has no streaming cursor) and honors an
  AbortSignal.

chDB builds a streaming query's output format from the connection's
session settings at open time, not from a query's trailing SETTINGS
clause, so the row view SETs output_format_json_quote_64bit_integers
before opening the stream — without it, 64-bit ints would stream as lossy
JS numbers and streamed rows would differ from executed rows.

Tests: 5 new .stream() cases (parity with execute, bound filter, 64-bit
precision, no-session throw, abort); full v3 suite green (373 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ShawnChen-Sirius force-pushed the feat/layer3-stream-terminal branch from fb55d05 to f6457ee on June 26, 2026 04:34
wudidapaopao merged commit 2760217 into chdb-io:main on Jun 26, 2026
17 checks passed