Collapse nested character markup to single level (BL-16387) #8019
StephenMcConnel wants to merge 1 commit into

| Filename | Overview |
|---|---|
| src/BloomExe/Book/Book.cs | Adds two-pass <b>/<i> re-conversion to catch nested legacy tags, plus a deduplication regex that collapses double-wrapped semantic tags; also adds IgnoreCase to the empty-markup cleanup regex. |
| src/BloomTests/Book/BookTests.cs | Adds UpdateCharacterStyleMarkup_ReducesNestingOfSameTags covering all four tag types; inner style attributes exercise the existing attribute-stripping path. |
Reviews (4): Last reviewed commit: "Collapse nested character markup to sing..."
b857247 to 6c5d409
JohnThomson
left a comment
A couple of ideas to consider. If you don't think they're worth it, you can merge as-is.
@JohnThomson reviewed 2 files and all commit messages, and made 3 comments.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on StephenMcConnel).
src/BloomExe/Book/Book.cs line 2631 at r2 (raw file):
/// <summary>
/// Convert old <b> and <i> to <strong> and <em> respectively.
/// Also remove instances like </b><b> altogether since such markup is redundant.
Comment is rather specific, so needs update.
src/BloomExe/Book/Book.cs line 2695 at r2 (raw file):
inner = Regex.Replace(
    inner,
    @"<(strong|em|sup|u)>(<\1>.*?</\1>)</\1>",
This also only handles things that are doubled right at the boundary. Maybe that's enough, but I'm wondering if we should handle the case that there is text inside the outer element but outside the inner one.
For example, <em><em>...</em></em> goes to <em>...</em>.
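
To make the boundary concern concrete, here is a minimal sketch that applies the quoted pattern on its own. The class and method names are hypothetical, and the replacement string `"$2"` is an assumption, since the quoted call is cut off before its replacement argument.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not code from the PR.
public static class BoundaryCollapseSketch
{
    // Pattern as quoted in the review of Book.cs.
    private const string DoubledTag = @"<(strong|em|sup|u)>(<\1>.*?</\1>)</\1>";

    // Assumption: the replacement keeps the inner element ($2)
    // and drops the outer pair of tags.
    public static string CollapseOnce(string inner)
    {
        return Regex.Replace(inner, DoubledTag, "$2");
    }
}
```

Under these assumptions, `BoundaryCollapseSketch.CollapseOnce("<em><em>x</em></em>")` returns `<em>x</em>`, while `"<em>a <em>b</em> c</em>"` comes back unchanged, because the pattern requires the inner opening tag to follow the outer one immediately.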
6c5d409 to a3f63e4
StephenMcConnel
left a comment
@StephenMcConnel made 2 comments.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on JohnThomson).
src/BloomExe/Book/Book.cs line 2631 at r2 (raw file):
Previously, JohnThomson (John Thomson) wrote…
Comment is rather specific, so needs update.
Done. I added to the comment.
src/BloomExe/Book/Book.cs line 2695 at r2 (raw file):
Previously, JohnThomson (John Thomson) wrote…
This also only handles things that are doubled right at the boundary. Maybe that's enough, but I'm wondering if we should handle the case that there is text inside the outer element but outside the inner one.
Done.
For example, ... goes to ....
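
The thread does not show how the text-between case was handled, so the following is only one possible approach, not the change made in this PR. It is a sketch that tracks nesting depth per tag name and emits only the outermost pair. All names are hypothetical, and it assumes the tags carry no attributes (the overview says those are stripped elsewhere).

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical alternative for illustration only; not code from the PR.
public static class DepthCollapseSketch
{
    // Matches attribute-free opening or closing tags of the four semantic types.
    private static readonly Regex Tag = new Regex(@"<(/?)(strong|em|sup|u)>");

    public static string Collapse(string inner)
    {
        var depth = new Dictionary<string, int>();
        var output = new StringBuilder();
        int last = 0;
        foreach (Match m in Tag.Matches(inner))
        {
            // Copy the text between the previous tag and this one unchanged.
            output.Append(inner, last, m.Index - last);
            last = m.Index + m.Length;

            string name = m.Groups[2].Value;
            bool closing = m.Groups[1].Value == "/";
            depth.TryGetValue(name, out int d); // d is 0 when the tag is not open

            if (!closing)
            {
                // Emit an opening tag only when it is not already open.
                if (d == 0) output.Append(m.Value);
                depth[name] = d + 1;
            }
            else
            {
                // Emit a closing tag only for the outermost level;
                // a stray closing tag (d == 0) is passed through untouched.
                if (d <= 1) output.Append(m.Value);
                depth[name] = d > 0 ? d - 1 : 0;
            }
        }
        output.Append(inner, last, inner.Length - last);
        return output.ToString();
    }
}
```

With this sketch, `DepthCollapseSketch.Collapse("<em>a <em>b</em> c</em>")` returns `<em>a b c</em>`, and markup that nests different tags, such as `<strong><em>x</em></strong>`, is left as it is.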