fix: decline CreateArray with struct-nullability-divergent children#4533
fix: decline CreateArray with struct-nullability-divergent children#4533schenksj wants to merge 3 commits into
Conversation
DataFusion's make_array asserts strict element-type equality in MutableArrayData and panics on a mismatch. Spark's CreateArray coerces element types with `sameType`, which ignores nullability, so children that share a surface type but differ only in a nested struct field's nullability get no unifying cast (e.g. array(struct(a not null), struct(a nullable))). Native execution then panics: "Arrays with inconsistent types passed to MutableArrayData". DataFusion tolerates container nullability differences (ArrayType.containsNull / MapType.valueContainsNull are coerced), so decline only the cases that actually panic: children that still differ after normalizing container nullability while keeping struct field nullability significant. Those fall back to Spark's evaluator. Closes apache#4528 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
FYI... I filed a PR upstream to fix the root cause as well. If that goes in, this gating becomes unnecessary: https://github.com/apache/datafusion/pull/22658/changes |
andygrove
left a comment
There was a problem hiding this comment.
Just adding a note here that my AI review tool said that this PR was good to merge. I will do a manual review in the next couple days to confirm.
| struct(col("_1").as("id"), lit("a").as("ct")), | ||
| // ct is NULLABLE (when without otherwise) -- same type, different nullability | ||
| struct(col("_1").as("id"), when(col("_1") === 0, lit("b")).as("ct"))).as("arr")) | ||
| checkSparkAnswer(df) |
There was a problem hiding this comment.
nit: would be better to use checkSparkAnswerAndFallbackReason
There was a problem hiding this comment.
Switched the existing test to checkSparkAnswerAndFallbackReason so it asserts the specific decline reason (CreateArray children have mismatched data types) instead of only matching the answer.
| val df = spark | ||
| .table("tbl") | ||
| .select( | ||
| array( |
There was a problem hiding this comment.
This test covers arrays and structs, but does not cover maps. Could you add a map test as well?
There was a problem hiding this comment.
Added a map test, array of maps with nullability-divergent struct values, which wraps the same nested-nullability divergence in a MapType value. That covers normalizeContainerNullability's
MapType branch. The struct field nullability difference survives container-nullability normalization, so CreateArray still declines, mirroring the struct case. Both tests pass under Spark 4.1.
…line
Address review feedback on the nullability-divergent CreateArray test:
- use checkSparkAnswerAndFallbackReason to assert the specific decline reason
("CreateArray children have mismatched data types") rather than only
checkSparkAnswer
- add a map-typed analogue ("array of maps with nullability-divergent struct
values") so the suite covers MapType, exercising
normalizeContainerNullability's MapType branch (the struct field nullability
divergence survives container-nullability normalization, so CreateArray still
declines)
Both tests pass (CometArrayExpressionSuite, Spark 4.1).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Which issue does this PR close?
Closes #4528.
Rationale for this change
DataFusion's
make_arrayasserts strict element-type equality inMutableArrayData::with_capacitiesand panics on a mismatch. Spark'sCreateArraycoerces element types withsameType, which ignores nullability, so children that share a surface type but differ only in a nested struct field's nullability get no unifying cast. For examplearray(struct(a not null), struct(a nullable))reaches native execution with two different struct types and panics:This is a standalone fix; it was surfaced while working on the Delta Lake contrib integration (Delta's CDC write path builds
array(struct(...), struct(...))plans with one struct per change type, leaving a_change_typefield's nullability divergent across arms), so prioritizing it helps that effort, but it applies to any such plan.What changes are included in this PR?
CometCreateArraynow declines (falls back to Spark) when its children's types differ in a waymake_arraycannot handle. DataFusion tolerates container nullability differences (ArrayType.containsNull/MapType.valueContainsNullare coerced) but not a struct field's nullability, so the check normalizes container nullability before comparing and keeps struct field nullability significant — declining only the cases that actually panic. This avoids over-declining legitimate arrays of arrays/maps that differ only incontainsNull.This tracks upstream apache/datafusion#22366; the caller-side decline can be removed once that fix lands.
How are these changes tested?
New test in
CometArrayExpressionSuitebuildsarray(struct(id, ct not null), struct(id, ct nullable))and asserts correct results. The test fails onmainwith the nativeMutableArrayDatapanic and passes with this change. The fullCometArrayExpressionSuite(40/40) passes, includingarrays_overlap - nested array null handlingwhich exercises arrays differing only incontainsNulland must still run natively.