RemoveUnicodeCharacters: expand built-in replacement defaults #281

Open
niksedk wants to merge 1 commit into main from expand-unicode-defaults

niksedk (Member) commented May 26, 2026

Summary

  • Grow the built-in replacement defaults for the RemoveUnicodeCharacters plugin from 2 entries (♪ and ♫, both mapped to #) to 41.
  • Extract the dictionary into its own UnicodeDefaults.cs so MainWindow.axaml.cs stays focused on UI/apply logic.
  • Behavior is unchanged for any character the user has already persisted a replacement for: the persisted value still wins. The new defaults only seed the "Replace with" textbox for characters with no setting yet.
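
The precedence described in the last bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's code: `ReplacementResolver`, `Resolve`, and the string-keyed dictionaries are hypothetical names, and this PR does not show how persisted settings are actually stored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; names and types are assumptions made for illustration.
public static class ReplacementResolver
{
    // Returns the text to pre-fill in the "Replace with" textbox for one character.
    public static string Resolve(
        string character,
        IReadOnlyDictionary<string, string> persisted,
        IReadOnlyDictionary<string, string> defaults)
    {
        // A value the user has already saved always wins, even if it is empty.
        if (persisted.TryGetValue(character, out var saved))
        {
            return saved;
        }

        // Otherwise fall back to the built-in default, if one exists.
        if (defaults.TryGetValue(character, out var seeded))
        {
            return seeded;
        }

        // No setting and no default: leave the textbox blank for the user.
        return string.Empty;
    }
}
```

Using `TryGetValue` rather than checking for a non-empty string keeps a deliberately saved empty replacement (meaning "remove this character") from being overwritten by a new default.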

Categories added

| Group | Count | Mapping |
| --- | --- | --- |
| Smart double quotes (“ ” „ ‟) | 4 | " |
| Smart single quotes (‘ ’ ‛) + low-9 (‚) | 4 | ' / , |
| Dashes (‐ ‑ ‒ – — ― −) | 7 | - |
| Ellipsis variants (․ ‥ …) | 3 | . / .. / ... |
| Music note heads (♩ ♪ ♫ ♬) | 4 | # |
| Wide/narrow space variants | 14 | (regular space) |
| Zero-width / joiners / BOM | 5 | empty string (removed) |

Deliberately not defaulted (debatable, left blank so the user picks per row): bullets • ‣, arrows → ← ↑ ↓, math ≤ ≥ ≠, trademarks ™ ℗ ℠, music accidentals ♭ ♮ ♯.
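
For orientation, a seed map along these lines might look like the sketch below. It is a hedged illustration, not the contents of UnicodeDefaults.cs: the class name `UnicodeDefaultsSketch`, the string key type, and the choice of entries are assumptions, and only a handful of the 41 entries are shown. Escapes are used so that invisible characters stay readable in source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; the real UnicodeDefaults.cs is not shown in this PR.
public static class UnicodeDefaultsSketch
{
    public static readonly IReadOnlyDictionary<string, string> Defaults =
        new Dictionary<string, string>
        {
            ["\u201C"] = "\"",   // left double quotation mark
            ["\u201D"] = "\"",   // right double quotation mark
            ["\u2018"] = "'",    // left single quotation mark
            ["\u201A"] = ",",    // single low-9 quotation mark (assumed to map to a comma)
            ["\u2013"] = "-",    // en dash
            ["\u2014"] = "-",    // em dash
            ["\u2026"] = "...",  // horizontal ellipsis
            ["\u266A"] = "#",    // eighth note
            ["\u00A0"] = " ",    // no-break space
            ["\u200B"] = "",     // zero width space (assumed member of the zero-width group)
            ["\uFEFF"] = "",     // byte order mark
            // Bullets, arrows, math symbols, trademarks, and accidentals are
            // intentionally absent so the user decides per row.
        };
}
```

With a map shaped like this, a lookup for a debatable character such as • (U+2022) simply finds no entry, which is what leaves its "Replace with" textbox blank.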

Test plan

  • CI build passes for all six RIDs.
  • Run on an SRT containing smart quotes, em dashes, ellipses, NBSP — confirm the proposed replacement is pre-filled for each.
  • Run again — confirm previously persisted user edits still override the new defaults.
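
The manual check in the second bullet could be approximated in code roughly as below. This is only a sketch under assumptions: `SeedCheckSketch` and its methods are hypothetical, the small `Defaults` map stands in for the real 41-entry one, and the plugin's own scanning logic may differ. It needs .NET Core 3.0 or later for `string.EnumerateRunes`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for checking which characters would be pre-filled.
public static class SeedCheckSketch
{
    // Stand-in subset of the built-in defaults (assumption, not the real map).
    private static readonly Dictionary<string, string> Defaults =
        new Dictionary<string, string>
        {
            ["\u201C"] = "\"",   // left double quotation mark
            ["\u201D"] = "\"",   // right double quotation mark
            ["\u2014"] = "-",    // em dash
            ["\u2026"] = "...",  // horizontal ellipsis
            ["\u00A0"] = " ",    // no-break space
        };

    // Distinct non-ASCII characters found in the text.
    public static List<string> NonAsciiCharacters(string text) =>
        text.EnumerateRunes()
            .Where(rune => rune.Value > 0x7F)
            .Select(rune => rune.ToString())
            .Distinct()
            .ToList();

    // Characters from the text that would get no pre-filled replacement.
    public static List<string> WithoutDefault(string text) =>
        NonAsciiCharacters(text)
            .Where(character => !Defaults.ContainsKey(character))
            .ToList();
}
```

Feeding it a subtitle line that contains only smart quotes, an em dash, an ellipsis, and an NBSP should yield an empty list, while a line containing an arrow should report the arrow as having no default.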

🤖 Generated with Claude Code

Grow the seed map from the original 2 entries (music ♪♫ -> #) to 41
covering smart quotes, the full dash family, ellipsis variants, music
note heads, all wide/narrow space variants, and zero-width / BOM
characters. These are cases where the ASCII mapping is essentially
lossless or matches universal ANSI-export practice.

Debatable cases (bullets, arrows, math, trademark, music accidentals)
are deliberately left blank so the user picks per row in the UI.

Extract the dictionary into its own UnicodeDefaults.cs to keep
MainWindow.axaml.cs readable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>