Fix LT-21598: Find and Fix repairs LexReferences with duplicate Targets #381

Merged
mark-sil merged 1 commit into master from LT-21598-FindAndFix
Jun 26, 2026

@mark-sil mark-sil commented Jun 24, 2026

Summary

A LexReference (lexical relation) could list the same Target object more than
once. This invalid state can lead to crashes: the incoming-reference index
deduplicates, so the relation's duplicated Targets and the index's record of
its references get out of sync, leaving stale references to objects when they
are later processed.
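
To make the failure mode concrete, here is a toy sketch in Python of one way a deduplicating index can drift from a Targets list that holds duplicates. It is an illustration under assumptions, not the actual model code: `ToyIndex`, `simulate_stale_reference`, and the string ids are all hypothetical, and the real index may be organized differently.

```python
# Toy model only: these names are hypothetical and are not the real model API.
class ToyIndex:
    """Incoming-reference index that stores each (source, target) pair once."""

    def __init__(self):
        self.incoming = {}  # target id -> set of source ids referring to it

    def add(self, source, target):
        self.incoming.setdefault(target, set()).add(source)

    def remove(self, source, target):
        self.incoming.get(target, set()).discard(source)


def simulate_stale_reference():
    index = ToyIndex()
    targets = ["sense-A", "sense-B", "sense-A"]  # sense-A is listed twice
    for target in targets:
        index.add("relation-1", target)  # the set collapses the duplicate

    # Remove a single occurrence of sense-A from the relation.
    targets.remove("sense-A")
    index.remove("relation-1", "sense-A")

    still_listed = "sense-A" in targets
    still_indexed = "relation-1" in index.incoming["sense-A"]
    return still_listed, still_indexed
```

In this toy model the relation still lists sense-A while the index no longer records that reference, so a later deletion of sense-A would not know to clean up the relation.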

This is the data-repair companion to the runtime fix in #380: that PR makes the
model robust against the corrupt data, and this PR teaches Find and Fix to
repair it.

Change (SequenceFixer)

Find and Fix now counts distinct Targets of a LexReference:

  • If duplicates exist but two or more distinct Targets remain, the duplicate
    Targets are removed (the first occurrence of each is kept).
  • If fewer than two distinct Targets remain, the whole LexReference is
    removed, along with the owning LexRefType's reference to it. This
    generalizes the existing "too few Targets" rule, which previously counted
    raw entries and so missed cases such as a relation listing one sense twice.
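
The rule above can be sketched in a few lines of Python. This is a minimal illustration of the decision logic only, not the SequenceFixer code: the function name `repair_targets`, the action labels, and the use of plain strings for Target ids are assumptions made for the example.

```python
def repair_targets(targets):
    """Decide how to repair one relation's Targets list.

    Hypothetical helper: returns (action, new_targets), where action is one of
    "remove_relation", "remove_duplicates", or "unchanged".
    """
    # dict.fromkeys keeps insertion order, so the first occurrence of each
    # Target survives and later repeats are dropped.
    distinct = list(dict.fromkeys(targets))

    if len(distinct) < 2:
        # Too few distinct Targets: the whole relation is invalid.
        return "remove_relation", []
    if len(distinct) < len(targets):
        # Duplicates exist, but the relation is still meaningful without them.
        return "remove_duplicates", distinct
    return "unchanged", distinct
```

Counting distinct Targets first is what lets a relation listing one sense twice fall into the removal case, where counting raw entries would have seen two Targets and left it alone.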

Testing

  • New unit test DuplicateLexReferenceTargets covering the three cases:
    duplicate removed (relation kept), all-duplicate (relation removed), and a
    valid relation left untouched. Existing fixer tests still pass.
  • Validated end-to-end on the project from the bug report: the actual
    FwDataFixer made 42 repairs (38 duplicate Targets removed, 4 now-invalid
    relations removed), leaving 0 LexReferences with duplicate Targets or with
    fewer than two distinct Targets.

🤖 Generated with Claude Code


A LexReference could list the same Target object more than once, leading
to a crash. Find and Fix now counts distinct Targets. It removes duplicate
Targets from a LexReference (keeping the first occurrence), and removes
the LexReference altogether when fewer than two distinct Targets remain.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

LCM Tests

    16 files  ±0      16 suites  ±0   2m 48s ⏱️ -27s
 2 861 tests +1   2 841 ✅ +1   20 💤 ±0  0 ❌ ±0 
11 392 runs  +4  11 224 ✅ +4  168 💤 ±0  0 ❌ ±0 

Results for commit 4fada45. ± Comparison against base commit c4d88bb.

@jasonleenaylor jasonleenaylor left a comment

:lgtm:

@jasonleenaylor reviewed 6 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on mark-sil).

@mark-sil mark-sil merged commit 7c0aeef into master Jun 26, 2026
5 checks passed
@mark-sil mark-sil deleted the LT-21598-FindAndFix branch June 26, 2026 18:42