
feat: add generate_schema_docs tool for database documentation #278

Open
hasithasandunlakshan wants to merge 2 commits into supabase:main from hasithasandunlakshan:feat/db-documentation-generator

Conversation

@hasithasandunlakshan

Contributor

What kind of change does this PR introduce?

Feature

What is the current behavior?

Currently, to understand the full structure of a database (tables, columns, RLS policies, triggers, and functions), an AI agent or developer must call multiple tools (e.g., list_tables with verbose: true, execute_sql, etc.) or perform multiple manual queries. There is no single, consolidated way to get a documentation-ready overview of the database schema.

Relevant Issue: #277

What is the new behavior?

This PR introduces a new tool, generate_schema_docs, which provides a comprehensive, documentation-ready summary of the database schema in a single call.

Key Features:

  • Consolidated View: Fetches tables, columns, foreign keys, RLS policies, triggers, and user-defined functions in one operation.
  • Dual Output Formats: Supports markdown (optimized for humans/AI context windows), json (optimized for programmatic use), or both.
  • Smart Filtering: Includes options to filter by specific schemas and toggle the inclusion of internal/system functions.
  • RLS Awareness: Highlights whether RLS is enabled and lists policies directly under their respective tables in the markdown output.
  • Architectural Alignment: Correctly integrated into the database feature group and wired into the MCP server runtime.
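
To make the "RLS Awareness" point concrete, here is a minimal sketch of grouping policies under their tables before rendering markdown. The `Table` and `Policy` shapes, the `renderRlsSection` helper, and the heading layout are all hypothetical, chosen for illustration; they are not the PR's actual types or output format.

```typescript
// Hypothetical shapes for illustration only; the PR's real types may differ.
interface Table {
  fullName: string; // e.g. "public.child"
  rlsEnabled: boolean;
}

interface Policy {
  table: string; // full name of the table the policy belongs to
  name: string;
  command: string; // e.g. "SELECT", "INSERT", "ALL"
}

// Group policies by table once, then emit each table with its RLS status
// followed by the policies that apply to it.
function renderRlsSection(tables: Table[], policies: Policy[]): string {
  const byTable = new Map<string, Policy[]>();
  for (const policy of policies) {
    const list = byTable.get(policy.table) ?? [];
    list.push(policy);
    byTable.set(policy.table, list);
  }

  const lines: string[] = [];
  for (const table of tables) {
    lines.push(`### ${table.fullName}`);
    lines.push(`RLS: ${table.rlsEnabled ? 'enabled' : 'disabled'}`);
    for (const policy of byTable.get(table.fullName) ?? []) {
      lines.push(`- ${policy.name} (${policy.command})`);
    }
  }
  return lines.join('\n');
}

const markdown = renderRlsSection(
  [
    { fullName: 'public.parent', rlsEnabled: false },
    { fullName: 'public.child', rlsEnabled: true },
  ],
  [{ table: 'public.child', name: 'child_read', command: 'SELECT' }],
);
console.log(markdown);
```

Grouping up front keeps each table's policies adjacent in the output, so a reader or model does not have to cross-reference a separate policy list.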

Additional context

  • Test Coverage: Added comprehensive unit tests in server.test.ts covering various output formats and edge cases. All 172 unit tests are passing.
  • Zod Integration: Fully typed input and output schemas, allowing for type-safe usage via the AI SDK.
  • Performance: Uses parameterized SQL queries and grouping logic to minimize token bloat while maintaining high information density.

@anp0429

anp0429 commented Jul 1, 2026


Nice tool, been trying it out. Ran it against a few schema shapes and hit something on the foreign key output that seemed worth flagging.

For a composite (multi-column) foreign key, the FK list looks like it's giving the cartesian product of the source and target columns instead of the actual column pairing.

Quick repro with one composite FK, (a, b) references parent (a, b):

create table public.parent (a int not null, b int not null, primary key (a, b));
create table public.child (
  a int not null, b int not null,
  constraint child_parent_fk foreign key (a, b) references public.parent (a, b)
);

generate_schema_docs comes back with 4 edges for child_parent_fk:

public.child.a -> public.parent.a
public.child.a -> public.parent.b   (this pairing doesn't exist)
public.child.b -> public.parent.a   (this one doesn't either)
public.child.b -> public.parent.b

Only a->a and b->b are real. The a->b and b->a rows are relationships that aren't actually in the schema. Since the tool's meant for AI reasoning and security auditing, having it report FKs that don't exist felt worth raising.

Tracked it down to the FK subquery in pg-meta/tables.sql — the source and target columns are joined independently:

... on sa.attrelid = c.conrelid  and sa.attnum = any (c.conkey)
... on ta.attrelid = c.confrelid and ta.attnum = any (c.confkey)

any(conkey) against any(confkey) cross-joins every source column with every target column, so an N-column FK gives N² rows. Pairing needs to be positional (conkey[i] with confkey[i]) — unnesting the two arrays together with ordinality keeps them aligned:

join lateral unnest(c.conkey, c.confkey) with ordinality as cols(conkey, confkey, ord) on true
join pg_attribute sa on sa.attrelid = c.conrelid  and sa.attnum = cols.conkey
join pg_attribute ta on ta.attrelid = c.confrelid and ta.attnum = cols.confkey
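
The difference between the two join shapes can be modeled outside SQL. The sketch below is a plain TypeScript analogy, not code from the PR: `FkConstraint`, `cartesianEdges`, and `positionalEdges` are hypothetical names, and the attribute numbers are assumed to match the repro tables above.

```typescript
// Hypothetical model of one pg_constraint row. conkey and confkey hold
// attribute numbers; position i in conkey pairs with position i in confkey.
interface FkConstraint {
  name: string;
  conkey: number[];
  confkey: number[];
}

interface Edge {
  source: number;
  target: number;
}

// Mirrors the two independent `= any (...)` joins: every source column is
// matched with every target column, so N columns yield N * N edges.
function cartesianEdges(fk: FkConstraint): Edge[] {
  const edges: Edge[] = [];
  for (const source of fk.conkey) {
    for (const target of fk.confkey) {
      edges.push({ source, target });
    }
  }
  return edges;
}

// Mirrors unnesting both arrays together: walk them in lockstep so only
// conkey[i] -> confkey[i] is emitted, giving N edges.
function positionalEdges(fk: FkConstraint): Edge[] {
  const edges: Edge[] = [];
  const n = Math.min(fk.conkey.length, fk.confkey.length);
  for (let i = 0; i < n; i++) {
    edges.push({ source: fk.conkey[i], target: fk.confkey[i] });
  }
  return edges;
}

const childParentFk: FkConstraint = {
  name: 'child_parent_fk',
  conkey: [1, 2], // assumed attnums for child.a, child.b
  confkey: [1, 2], // assumed attnums for parent.a, parent.b
};

console.log(cartesianEdges(childParentFk).length); // 4
console.log(positionalEdges(childParentFk).length); // 2
```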

Same query also backs list_tables (verbose), so this likely affects that path too — this PR just made it visible.

Here's a failing test if it helps, drops in next to the existing schema docs test:

test('composite foreign key is not a cartesian product', async () => {
  // ...same setup, plus the two tables above...
  const child = result.data.tables.find((t) => t.full_name === 'public.child');
  const pairs = child.foreign_key_constraints.map((f) => `${f.source}=>${f.target}`);
  expect(pairs).not.toContain('public.child.a=>public.parent.b');
  expect(pairs).not.toContain('public.child.b=>public.parent.a');
  expect(child.foreign_key_constraints).toHaveLength(2);
});

Also noticed the FK list is the only collection that isn't sorted (tables, policies, triggers, functions all are), so FK order might shift between runs.
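
One possible way to make the order stable, sketched in TypeScript: sort the FK entries by constraint name, then source, then target. The `ForeignKey` shape is an assumption based on the `source` and `target` fields used in the test above, and `name` is a hypothetical field; an `order by` in the SQL query itself may well be the better place for this.

```typescript
// Hypothetical shape; `source` and `target` follow the test above, `name` is assumed.
interface ForeignKey {
  name: string;
  source: string;
  target: string;
}

// Code-unit comparison, independent of the runtime's locale settings.
function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Returns a sorted copy so the caller's array is left untouched.
function sortForeignKeys(fks: ForeignKey[]): ForeignKey[] {
  return [...fks].sort(
    (x, y) =>
      compareStrings(x.name, y.name) ||
      compareStrings(x.source, y.source) ||
      compareStrings(x.target, y.target),
  );
}

const unsorted: ForeignKey[] = [
  { name: 'child_parent_fk', source: 'public.child.b', target: 'public.parent.b' },
  { name: 'child_parent_fk', source: 'public.child.a', target: 'public.parent.a' },
];

console.log(sortForeignKeys(unsorted).map((f) => `${f.source}=>${f.target}`));
```

With a total order on (name, source, target), two runs that return the same rows in different order produce identical output.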



2 participants