[core] Skip stats from schema in DataEvolution when column type has changed #7803

Open
ArnavBalyan wants to merge 1 commit into apache:master from ArnavBalyan:arnavb/fix-stats
@ArnavBalyan
Member

Purpose

  • Stats merging currently ignores type changes made during schema evolution. After an ALTER TABLE type change (e.g. Int to String), stats from older files written under the prior schema are merged in under the new type.
  • When the types mismatch, garbage stats accumulate, which causes wrong pruning and silent data drops.
  • Detect this mismatch and skip the stats in such cases.
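
The idea in the bullets above can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `StatsMergeSketch`, `FileColumnStats`, `Range`, and `merge` are hypothetical names, and types are modelled as plain strings. The sketch assumes that when any contributing file was written under a different column type, the merged stats for that column are reported as unknown, so no pruning decision is based on them.

```java
import java.util.List;
import java.util.Optional;

public class StatsMergeSketch {

    /** Hypothetical per-file stats for one column, tagged with the type the file was written under. */
    public record FileColumnStats(String writeType, Comparable<?> min, Comparable<?> max) {}

    /** Hypothetical merged min/max range for one column. */
    public record Range(Comparable<?> min, Comparable<?> max) {}

    /**
     * Merges min/max across files for a single column.
     * Returns Optional.empty() (stats unknown) if any file's write-time type
     * differs from the current type, or if there are no files.
     */
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static Optional<Range> merge(String currentType, List<FileColumnStats> files) {
        Comparable min = null;
        Comparable max = null;
        for (FileColumnStats f : files) {
            if (!currentType.equals(f.writeType())) {
                // Stats were computed under the prior type; merging them under
                // the new type would be meaningless, so skip stats entirely.
                return Optional.empty();
            }
            Comparable fMin = f.min();
            Comparable fMax = f.max();
            if (min == null || fMin.compareTo(min) < 0) {
                min = fMin;
            }
            if (max == null || fMax.compareTo(max) > 0) {
                max = fMax;
            }
        }
        return min == null ? Optional.empty() : Optional.of(new Range(min, max));
    }
}
```

In this sketch, two files written as `STRING` merge normally, while a mix of an `INT` file and a `STRING` file yields an empty result. Returning "unknown" rather than a partial range is a conservative design choice: a reader that sees no stats has to scan the file, which avoids dropping rows on a bad range.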

Tests

  • Unit tests
