Commit 35c6d37

fix(duckdb): wrap OCTET_LENGTH string arguments with ENCODE (#7507)
* fix(duckdb): wrap OCTET_LENGTH string arguments with ENCODE [CLAUDE]

  - DuckDB OCTET_LENGTH only accepts BLOB or BIT types, not VARCHAR
  - Snowflake OCTET_LENGTH accepts both VARCHAR and BINARY
  - Added bytelength_sql() method to wrap VARCHAR arguments with ENCODE()
  - Uses _is_binary() check to detect BINARY types via type annotation
  - Binary data (BINARY, VARBINARY, BLOB) is passed through without ENCODE
  - Fixes transpilation from Snowflake to DuckDB for string arguments

* chore: update integration tests submodule pointer

* feat(snowflake)!: Transpilation support for REDUCE - implemented review comments

* Sync w/ integration tests

---------

Co-authored-by: George Sittas <giwrgos.sittas@gmail.com>
1 parent d0407ed commit 35c6d37
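
For readers unfamiliar with byte-length semantics, here is a rough Python analogy of what the commit message describes: a string is first encoded to bytes and then measured, while binary input is measured directly. The helper name `octet_length` is hypothetical, and UTF-8 is assumed as the encoding; this is not code from the commit.

```python
def octet_length(value):
    """Return the number of bytes in a str (UTF-8 encoded first) or in bytes as-is."""
    if isinstance(value, str):
        # Analogous to wrapping a VARCHAR argument with ENCODE() before measuring.
        value = value.encode("utf-8")
    # Binary input is measured directly, with no encoding step.
    return len(value)
```

For ASCII text the byte count equals the character count; for a string such as "é" the two differ, which is why the string-to-bytes conversion has to happen somewhere.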

2 files changed

Lines changed: 11 additions & 2 deletions

sqlglot-integration-tests

sqlglot/generators/duckdb.py

Lines changed: 10 additions & 1 deletion
@@ -1536,7 +1536,6 @@ class DuckDBGenerator(generator.Generator):
         exp.BitwiseOrAgg: _bitwise_agg_sql,
         exp.BitwiseRightShift: _bitshift_sql,
         exp.BitwiseXorAgg: _bitwise_agg_sql,
-        exp.ByteLength: lambda self, e: self.func("OCTET_LENGTH", e.this),
         exp.CommentColumnConstraint: no_comment_column_constraint_sql,
         exp.Corr: lambda self, e: self._corr_sql(e),
         exp.CosineDistance: rename_func("LIST_COSINE_DISTANCE"),
@@ -3522,6 +3521,16 @@ def rand_sql(self, expression: exp.Rand) -> str:
         # Default DuckDB behavior - just return RANDOM() as float
         return "RANDOM()"
 
+    def bytelength_sql(self, expression: exp.ByteLength) -> str:
+        arg = expression.this
+
+        # Check if it's a text type (handles both literals and annotated expressions)
+        if arg.is_type(*exp.DataType.TEXT_TYPES):
+            return self.func("OCTET_LENGTH", exp.Encode(this=arg))
+
+        # Default: pass through as-is (conservative for DuckDB, handles binary and unannotated)
+        return self.func("OCTET_LENGTH", arg)
+
     def base64encode_sql(self, expression: exp.Base64Encode) -> str:
         # DuckDB TO_BASE64 requires BLOB input
         # Snowflake BASE64_ENCODE accepts both VARCHAR and BINARY - for VARCHAR it implicitly
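
The branching in `bytelength_sql` can be sketched without sqlglot as follows. This is a simplified stand-in, not the library's API: the `Arg` dataclass, the `TEXT_TYPES` set, and the string-based SQL rendering are all hypothetical names introduced here only to illustrate the three cases (text, binary, unannotated).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the set of text types; not sqlglot's actual constant.
TEXT_TYPES = {"VARCHAR", "TEXT", "CHAR"}


@dataclass
class Arg:
    sql: str                    # SQL text of the argument
    type: Optional[str] = None  # annotated type name, or None when unannotated


def bytelength_sql(arg: Arg) -> str:
    # Text arguments are wrapped in ENCODE so that OCTET_LENGTH receives a BLOB.
    if arg.type in TEXT_TYPES:
        return f"OCTET_LENGTH(ENCODE({arg.sql}))"
    # Binary and unannotated arguments are passed through unchanged.
    return f"OCTET_LENGTH({arg.sql})"
```

Under these assumptions, `bytelength_sql(Arg("'abc'", "VARCHAR"))` yields `OCTET_LENGTH(ENCODE('abc'))`, while a `BINARY` or unannotated argument is emitted as a plain `OCTET_LENGTH(...)` call. Leaving unannotated arguments untouched is the conservative choice the diff comment describes: wrapping is applied only when the type is known to be text.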
