Commit 0b0ff5a
fix(migrations): robustly strip psql meta commands without breaking SQL
Replace naive PostgreSQL schema preprocessing with a single-pass state machine that correctly distinguishes top-level psql meta-commands from backslashes in valid SQL code, literals, identifiers, comments, and dollar-quoted bodies.
The previous implementation would either leave pg_dump-style backslash directives in place for some schema-loading paths or strip too aggressively, breaking valid SQL containing:
- Backslashes in string literals, including `E'...'` escapes and simple `standard_conforming_strings` variants
- Meta-command text in comments or documentation
- Dollar-quoted function bodies, including Unicode-tagged bodies
- Double-quoted identifiers and identifiers containing `$`
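To illustrate the failure mode listed above, here is a minimal sketch of a line-based filter that strips too aggressively. `naiveStrip` is a hypothetical stand-in, not the previous sqlc code; it assumes only that the old filter decided per line, without tracking SQL context.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveStrip is a hypothetical line-based filter: it drops every line whose
// first non-space character is a backslash, with no awareness of whether the
// line sits inside a string literal, comment, or dollar-quoted body.
func naiveStrip(src string) string {
	var kept []string
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), `\`) {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

func main() {
	// The second line of the literal starts with a backslash but is plain
	// string content, not a psql meta-command.
	src := "\\connect app\n" +
		"SELECT 'first line\n" +
		"\\not a meta-command' AS note;\n"
	fmt.Printf("%q\n", naiveStrip(src))
}
```

The filter removes `\connect app` as intended, but it also removes the continuation line of the string literal, so the remaining statement is left with an unterminated quote.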
Changes:
- Add engine-aware preprocessing helpers: rollback removal always applies, and PostgreSQL preprocessing then strips psql meta-commands while preserving server-side SQL, including PL/pgSQL bodies and extension/language DDL.
- Replace the line-based PostgreSQL filter with a single-pass state machine that tracks single quotes, double quotes, dollar quotes, line comments, and nested block comments.
- Handle escape-string prefixes, simple `standard_conforming_strings` value variants, Unicode dollar-quote tags, identifier-boundary checks, the documented `\\` separator command, and broader top-level unknown backslash directives.
- Preserve SQL that follows a valid whitespace-delimited inline `\\` separator on the same psql meta-command line, matching tested `psql 17.9` behavior.
- Strip semantic psql commands such as `\connect`, `\copy`, `\gexec`, `\i`, and `\ir` with warnings in parse/codegen paths, but reject them in schema-application paths where sqlc cannot reproduce their effects safely.
- Reject psql conditionals (`\if`, `\elif`, `\else`, `\endif`) during preprocessing instead of flattening branches and changing SQL semantics.
- Treat `standard_conforming_strings` and transaction-scoped script behavior as best-effort parsing aids rather than full psql emulation, and surface an explicit warning when preprocessing has to approximate those semantics.
- Apply shared schema preprocessing and warning propagation in compiler parsing, `createdb`, `verify`, managed `vet`, and PostgreSQL sqltest seeding paths, including stable hashing of preprocessed read-only PostgreSQL fixtures.
- Re-enable the `pg_dump` end-to-end fixture now that its schema parses and seeds successfully.
- Add targeted regression tests covering:
* All documented psql meta-commands plus broader top-level unknown backslash directives
* String literals with backslashes, escape-string prefixes, and simple `standard_conforming_strings` variants
* Unicode dollar-quote tags, identifier-boundary cases, inline `\\` separator behavior, and rejected conditional directives
* Semantic-command warnings and apply-mode rejections, `\copy ... from stdin`, line comments, nested block comments, quoted identifiers, helper edge cases, and schema preprocessing rollout
* Bare CR normalization, CRLF preservation, and managed/PostgreSQL preprocessing behavior
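The single-pass approach described in the changes above can be sketched roughly as follows. This is a simplified illustration, not the code in this commit: `stripMetaCommands`, `dollarDelimiter`, and `isIdentByte` are hypothetical names, and the sketch assumes `standard_conforming_strings = on`, ignores `E'...'` escape strings, the inline `\\` separator, conditionals, and apply-mode rejections, and simply drops everything from a top-level backslash to the end of its line.

```go
package main

import (
	"fmt"
	"strings"
)

// isIdentByte reports whether c may appear in an unquoted identifier or
// dollar-quote tag. Bytes >= 0x80 are accepted so UTF-8 tags pass through.
func isIdentByte(c byte) bool {
	return c == '_' || c >= 0x80 || (c >= 'a' && c <= 'z') ||
		(c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
}

// dollarDelimiter returns the opening delimiter ("$$", "$tag$") at the start
// of s, or "" if s does not start a dollar-quoted string (for example "$1").
func dollarDelimiter(s string) string {
	for j := 1; j < len(s); j++ {
		c := s[j]
		if c == '$' {
			return s[:j+1]
		}
		if !isIdentByte(c) || (j == 1 && c >= '0' && c <= '9') {
			return ""
		}
	}
	return ""
}

// stripMetaCommands removes top-level backslash directives in one pass.
func stripMetaCommands(src string) string {
	const (
		top = iota
		singleQuote
		doubleQuote
		dollarQuote
		lineComment
		blockComment
	)
	var out strings.Builder
	out.Grow(len(src)) // output is never longer than the input
	state, depth, tag := top, 0, ""

	for i := 0; i < len(src); {
		rest := src[i:]
		switch state {
		case top:
			switch {
			case rest[0] == '\\':
				// Meta-command: skip to the newline, which is kept.
				end := strings.IndexByte(rest, '\n')
				if end < 0 {
					return out.String()
				}
				i += end
				continue
			case rest[0] == '\'':
				state = singleQuote
			case rest[0] == '"':
				state = doubleQuote
			case strings.HasPrefix(rest, "--"):
				state = lineComment
			case strings.HasPrefix(rest, "/*"):
				state, depth = blockComment, 1
				out.WriteString("/*")
				i += 2
				continue
			case rest[0] == '$' && (i == 0 || !isIdentByte(src[i-1])):
				// Identifier-boundary check: "a$b$" is an identifier.
				if d := dollarDelimiter(rest); d != "" {
					state, tag = dollarQuote, d
					out.WriteString(d)
					i += len(d)
					continue
				}
			}
		case singleQuote:
			if rest[0] == '\'' { // a doubled '' closes and reopens
				state = top
			}
		case doubleQuote:
			if rest[0] == '"' {
				state = top
			}
		case lineComment:
			if rest[0] == '\n' {
				state = top
			}
		case blockComment:
			if strings.HasPrefix(rest, "/*") || strings.HasPrefix(rest, "*/") {
				if rest[0] == '/' {
					depth++
				} else if depth--; depth == 0 {
					state = top
				}
				out.WriteString(rest[:2])
				i += 2
				continue
			}
		case dollarQuote:
			if strings.HasPrefix(rest, tag) {
				state = top
				out.WriteString(tag)
				i += len(tag)
				continue
			}
		}
		out.WriteByte(src[i])
		i++
	}
	return out.String()
}

func main() {
	src := "\\connect app\nSELECT '\\d' AS a; -- \\i x\n/* \\x /* \\y */ \\z */\n" +
		"CREATE FUNCTION f() RETURNS text AS $fn$ SELECT '\\n' $fn$ LANGUAGE sql;\n\\unrestrict abc\n"
	fmt.Print(stripMetaCommands(src))
}
```

In this input only the `\connect` and `\unrestrict` lines are dropped; the backslashes inside the string literal, both comment styles, and the dollar-quoted body pass through unchanged, because the scanner is never in the top-level state when it reaches them.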
Performance improvements:
- Pre-allocate output buffers with `strings.Builder.Grow()`
- Keep parsing single-pass rather than rescanning line slices
- Consolidate schema preprocessing into engine-aware helpers reused across schema-loading paths
1 parent 394bdc7 commit 0b0ff5a
File tree
14 files changed, +1856 -37 lines changed
- internal
- cmd
- compiler
- endtoend/testdata/pg_dump
- migrations
- sqltest
- local