Add Vercel AI SDK and Mastra tool adapters (chdb/ai-sdk, chdb/mastra) #61

Open
ShawnChen-Sirius wants to merge 1 commit into chdb-io:main from ShawnChen-Sirius:feat/agent-tools

Conversation

@ShawnChen-Sirius

Contributor

What

Lets an agent run ClickHouse SQL with chDB as a first-class tool in the two main
TypeScript agent frameworks, over one shared executor so behavior is identical.

  • integrations/chdb-tool-core.mjs — framework-agnostic core: a shared description
    (the chDB engine plus the table functions it can reach: file/s3/url/postgresql/
    mysql/mongodb/remoteSecure) and runChdbQuery(), which runs SQL via a bound
    Session (or the default connection), returns rows as JSON capped at maxRows
    (with a truncated flag), and returns the engine error to the model instead of
    throwing.
  • chdb/ai-sdk (integrations/ai-sdk.mjs) — the core in the Vercel AI SDK
    tool() shape (description + zod inputSchema + execute).
  • chdb/mastra (integrations/mastra.mjs) — the core in Mastra createTool()
    (id, description, input/output zod schemas, execute; tolerant of the .context
    input shape).
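
The executor contract described above (input guard, row cap with a truncated flag, errors returned as values) can be sketched roughly as follows. This is an illustrative approximation, not the PR's code: `runCappedQuery` and the injected `runQuery` are hypothetical names, `runQuery` stands in for the real chDB call so the sketch runs without chDB installed, and the shape of the error string on engine failure is an assumption.

```javascript
// Hypothetical sketch of the shared executor's contract.
// `runQuery` is an injected stand-in for the chDB call; it is assumed to
// resolve to the parsed ClickHouse JSON payload ({ meta, data, rows, ... }).
async function runCappedQuery(sql, { runQuery, maxRows = 1000 } = {}) {
  // Reject bad input before touching the engine.
  if (typeof sql !== 'string' || sql.trim() === '') {
    return { rows: [], rowCount: 0, truncated: false, error: 'sql must be a non-empty string' }
  }
  try {
    const parsed = await runQuery(sql)
    const data = Array.isArray(parsed?.data) ? parsed.data : []
    const truncated = data.length > maxRows
    return {
      // Cap what the model sees, but report the full row count.
      rows: truncated ? data.slice(0, maxRows) : data,
      rowCount: typeof parsed?.rows === 'number' ? parsed.rows : data.length,
      truncated,
    }
  } catch (err) {
    // Return the failure as a value so the model can read it and retry.
    return { rows: [], rowCount: 0, truncated: false, error: String(err?.message ?? err) }
  }
}
```

Returning the error in the result object, rather than throwing, gives the model a readable message it can use to correct the query on the next step.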

Authored as ESM (.mjs + .d.mts) because both frameworks are ESM-only, which
also avoids a CJS-requiring-ESM break; exposed via the chdb/ai-sdk and
chdb/mastra subpath exports. ai, @mastra/core, and zod are optional peer
dependencies.

Tests

AI SDK adapter verified end to end (builds a valid tool; execute runs SQL and
returns rows; a bad query returns a typed error string). Full v3 suite green.

Not included

ChDBStore (a Mastra storage adapter) is intentionally deferred — MastraStorage
is a large, version-sensitive surface that deserves its own pass.

🤖 Generated with Claude Code

Lets an agent run ClickHouse SQL with chDB as a first-class tool in the two main
TS agent frameworks, with one shared executor so behavior is identical across them.

- integrations/chdb-tool-core.mjs: framework-agnostic core — a shared description
  (chDB engine + the table functions it can reach: file/s3/url/postgresql/mysql/
  mongodb/remoteSecure), and runChdbQuery() which runs SQL via a bound Session (or
  the default connection), returns rows as JSON capped at maxRows (with a truncated
  flag), and returns the engine error to the model instead of throwing.
- chdb/ai-sdk (integrations/ai-sdk.mjs): wraps the core in the Vercel AI SDK
  tool() shape (description + zod inputSchema + execute).
- chdb/mastra (integrations/mastra.mjs): wraps it in Mastra createTool() (id,
  description, input/output zod schemas, execute; tolerant of the .context input shape).

Authored as ESM (.mjs + .d.mts) because both frameworks are ESM-only, which also
avoids a CJS-requiring-ESM break; exposed as the chdb/ai-sdk and chdb/mastra subpaths.
ai, @mastra/core, and zod are optional peer dependencies.

Tests: AI SDK adapter verified end to end (builds a valid tool; execute runs SQL
and returns rows; a bad query returns a typed error string). Full v3 suite green.

ChDBStore (a Mastra storage adapter) is intentionally deferred — MastraStorage is a
large, version-sensitive surface that deserves its own pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ShawnChen-Sirius ShawnChen-Sirius requested a review from Copilot June 26, 2026 04:08
@ShawnChen-Sirius

Contributor Author

@chibugai, review it

Copilot AI left a comment

Pull request overview

Adds first-class “agent tool” integrations for running ClickHouse SQL via chDB in two ESM-only TypeScript agent ecosystems (Vercel AI SDK and Mastra), sharing a single framework-agnostic executor to keep behavior consistent.

Changes:

  • Introduces a shared runChdbQuery() core (description + execution semantics) and two thin adapters (chdb/ai-sdk, chdb/mastra).
  • Exposes the adapters via new subpath exports and adds optional peer deps (ai, @mastra/core, zod).
  • Adds a Vitest integration test covering the AI SDK adapter end-to-end.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
| --- | --- |
| test/v3/integrations/ai-sdk.test.ts | Adds an end-to-end test for the Vercel AI SDK tool wrapper. |
| integrations/chdb-tool-core.mjs | Introduces shared tool description and runChdbQuery() executor used by both adapters. |
| integrations/ai-sdk.mjs | Adds the Vercel AI SDK tool() adapter wrapper. |
| integrations/ai-sdk.d.mts | Adds type declarations for the AI SDK adapter subpath. |
| integrations/mastra.mjs | Adds the Mastra createTool() adapter wrapper. |
| integrations/mastra.d.mts | Adds type declarations for the Mastra adapter subpath. |
| package.json | Adds ./ai-sdk and ./mastra exports, files whitelist entry, and optional peer/dev deps. |
| package-lock.json | Updates lockfile for new deps and version metadata. |

Comment on lines +33 to +36
const { session, maxRows = 1000 } = opts
if (typeof sql !== 'string' || sql.trim() === '') {
  return { rows: [], rowCount: 0, truncated: false, error: 'sql must be a non-empty string' }
}
Comment on lines +38 to +47
const res = await (session
  ? session.queryAsync(sql, { format: 'JSON' })
  : queryAsync(sql, { format: 'JSON' }))
// ClickHouse 'JSON' format: { meta, data, rows, statistics }.
const parsed = res.json()
const data = Array.isArray(parsed?.data) ? parsed.data : []
const truncated = data.length > maxRows
return {
  rows: truncated ? data.slice(0, maxRows) : data,
  rowCount: typeof parsed?.rows === 'number' ? parsed.rows : data.length,
Comment on lines +3 to +4
// @ts-expect-error - .mjs adapter has a sibling .d.mts; vitest resolves the runtime file
import { chdbQueryTool } from '../../../integrations/ai-sdk.mjs'
Comment thread integrations/ai-sdk.d.mts
Comment on lines +18 to +20
/** Build a Vercel AI SDK tool that runs ClickHouse SQL with chDB. */
export function chdbQueryTool(opts?: ChdbToolOptions): unknown
export default chdbQueryTool
Comment thread integrations/mastra.d.mts
Comment on lines +18 to +20
/** Build a Mastra tool that runs ClickHouse SQL with chDB. */
export function chdbQueryTool(opts?: ChdbToolOptions): unknown
export default chdbQueryTool
Comment thread package.json
Comment on lines 65 to 75
"devDependencies": {
"@types/node": "^20.14.0",
"ai": "^7.0.2",
"apache-arrow": "^21.1.0",
"chai": "^4.5.0",
"mocha": "^10.7.3",
"node-gyp": "^9.3.1",
"typescript": "^5.5.0",
"vitest": "^2.0.0"
"vitest": "^2.0.0",
"zod": "^4.4.3"
},
@chibugai


Reviewed — OCR deep pass (opus-4-8, 5 source files) plus a manual read of the core and both adapters. No correctness or safety issues found; looks good to me.

What I specifically checked and was happy with:

  • chdb-tool-core.mjs, runChdbQuery: the input guard rejects non-string or empty SQL before touching the engine; truncation is correct (data.slice(0, maxRows), with the truncated flag reporting that the cap was hit); and engine errors are returned as a value ({ error }) rather than thrown, which is the right call for agent loops, since a thrown tool error is opaque to most of them. The Array.isArray(parsed?.data) guard also keeps a malformed or empty result from blowing up.
  • Shared core — both adapters are thin wrappers over runChdbQuery, so the description, input contract, and execution semantics stay identical across the Vercel AI SDK and Mastra. Good for consistency.
  • mastra.mjs — the dual input-shape handling (input.context.sql ?? input.sql) covers both current and older Mastra versions, and if neither is present sql is undefined and falls cleanly through to the empty-string guard. outputSchema matches the actual return shape (elapsed/error correctly .optional()).
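
For readers unfamiliar with the dual input shape, the normalization mentioned above might look something like this. It is a sketch only: `extractSql` is a hypothetical helper name, and the optional chaining on `context` is an assumption about how the adapter avoids throwing when `context` is absent.

```javascript
// Hypothetical helper: accept either { context: { sql } } or { sql }.
// Optional chaining keeps a missing `context` from throwing.
function extractSql(input) {
  return input?.context?.sql ?? input?.sql
}

// Both shapes resolve to the same SQL string; a missing field yields
// undefined, which the core's non-empty-string guard then rejects.
const fromContext = extractSql({ context: { sql: 'SELECT 1' } })
const fromFlat = extractSql({ sql: 'SELECT 1' })
const missing = extractSql({})
```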

One non-blocking design note, just for awareness: the tool deliberately hands the model the full SQL surface, including table functions that reach external hosts and embed credentials (s3(), url(), postgresql(), remoteSecure(), …). That's by design, and the description already says "embed only safe literal values" — but since the SQL is model-generated it's worth keeping in mind for prompt-injection / data-exfiltration exposure. A future readonly / table-function allowlist option could be a nice hardening knob. Not a blocker for this PR.
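
To make the allowlist idea concrete, here is one rough shape such a knob could take. Everything in it is hypothetical and not part of this PR: the function name, the option shape, and the regex approach are illustrative only. A regex does not parse SQL (it will also match names inside string literals or comments), so this would be a pre-flight hint at best, not a security boundary.

```javascript
// Hypothetical pre-flight check, not part of the PR.
// Names taken from the table functions listed in the tool description.
const EXTERNAL_TABLE_FUNCTIONS = ['file', 's3', 'url', 'postgresql', 'mysql', 'mongodb', 'remoteSecure']

// Returns the external table functions that appear in `sql` but are not
// in `allowed`. Matching is case-insensitive and purely textual.
function findBlockedTableFunctions(sql, allowed = []) {
  const allow = new Set(allowed.map((name) => name.toLowerCase()))
  const pattern = new RegExp(`\\b(${EXTERNAL_TABLE_FUNCTIONS.join('|')})\\s*\\(`, 'gi')
  const blocked = new Set()
  for (const match of sql.matchAll(pattern)) {
    const name = match[1].toLowerCase()
    if (!allow.has(name)) blocked.add(name)
  }
  return [...blocked]
}
```

A caller could return a non-empty result to the model as an `{ error }` value, in keeping with the tool's existing behavior, instead of running the query.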
