Speed up HTML escaping by copying safe spans in bulk #502
Open
deliro wants to merge 1 commit into
Escape one contiguous run of safe bytes per `push_str` instead of pushing each byte individually, and drop the `unsafe` `as_mut_vec()` write in the process. Behaviour is unchanged; the byte-set and output are identical, now covered by additional edge-case tests (empty input, adjacent specials, multi-byte UTF-8). Neutral on templates that only splice short strings (static markup is escaped at compile time and never hits this path); noticeably faster when long dynamic text is escaped at runtime.
Why
`escape_to_string` is on the hot path for every dynamic value rendered at runtime. It currently pushes the input one byte at a time (via `unsafe { output.as_mut_vec().push(b) }` for the common, non-special case). This rewrites it to scan for the next character that needs escaping and copy the whole preceding run of safe bytes in a single `push_str`.

Side benefits:
- Removes the `unsafe` block: the new code is entirely safe.
- The escaped characters are all ASCII (`< 0x80`), so they never fall inside a multi-byte UTF-8 sequence; slicing at their indices is always on a character boundary.

Behaviour is unchanged (same byte-set, same output). Both copies of the function (`maud/src/escape.rs` and `maud_macros/src/escape.rs`) are kept in sync as the header comment requires, and I added edge-case tests: empty input, no specials, all specials, adjacent/boundary specials, multi-byte UTF-8, and appending to a non-empty buffer.

Benchmarks
`cargo +nightly bench -p maud`, Apple M-series, 3 runs each, median:
- `render_long_text` (long user prose through the escaper)
- `render_template` (short splices)
- `render_complicated_template`

The win shows up on text-heavy output (article bodies, comments, descriptions). Templates that only splice short strings are unaffected, since static markup is escaped at compile time and never reaches this function at runtime. A new `render_long_text` benchmark is included to cover this workload.

Note: replacing the `for` loop with iterator combinators does not enable autovectorization here: the side-effectful body (`push_str`) and the loop-carried offset block LLVM's loop vectorizer (verified in disassembly: identical scalar codegen). True SIMD would need an explicit byte-set search (e.g. `memchr` / `jetscii` / `core::arch`), which conflicts with the crate's `no_std` + stable + minimal-deps constraints, so it's intentionally left out.
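
For readers skimming without the diff open, the scan-and-copy idea described under "Why" can be sketched roughly as follows. This is an illustration, not the code in this PR: the function name `escape_to_string_sketch` is hypothetical, and the escaped byte set (`&`, `<`, `>`, `"`) and its replacements are assumptions made for the example.

```rust
/// Illustrative sketch only: appends an HTML-escaped copy of `input` to `output`.
/// The byte set below is an assumption for this example.
pub fn escape_to_string_sketch(input: &str, output: &mut String) {
    // Byte offset where the current run of safe bytes begins.
    let mut start = 0;
    for (i, b) in input.bytes().enumerate() {
        let replacement = match b {
            b'&' => "&amp;",
            b'<' => "&lt;",
            b'>' => "&gt;",
            b'"' => "&quot;",
            // Safe byte: keep extending the current run.
            _ => continue,
        };
        // `i` is the index of an ASCII byte, so both `i` and `i + 1` are
        // character boundaries and the slices below cannot panic.
        output.push_str(&input[start..i]);
        output.push_str(replacement);
        start = i + 1;
    }
    // Copy whatever safe tail remains after the last special byte.
    output.push_str(&input[start..]);
}
```

Calling it with `"a<b"` on a buffer that already holds `"x="` leaves the buffer as `"x=a&lt;b"`; existing buffer contents are only appended to, never overwritten.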