Speed up HTML escaping by copying safe spans in bulk by deliro · Pull Request #502 · lambda-fairy/maud

deliro · 2026-06-19T20:56:52Z

Why

`escape_to_string` is on the hot path for every dynamic value rendered at runtime. It currently pushes the input one byte at a time (via `unsafe { output.as_mut_vec().push(b) }` for the common, non-special case). This PR rewrites it to scan for the next character that needs escaping and copy the whole preceding run of safe bytes in a single `push_str`.

Side benefits:

- Drops the `unsafe` block; the new code is entirely safe.
- All four escaped characters are single-byte ASCII (< 0x80), so they never fall inside a multi-byte UTF-8 sequence; slicing at their indices is always on a character boundary.
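For readers who want to see the shape of the change, here is a minimal sketch of the span-copying approach described above. It is not the exact diff from this PR: the escaped set (`&`, `<`, `>`, `"`) and the replacement strings are assumptions, since the description only says there are four single-byte ASCII specials.

```rust
/// Sketch only: appends `input` to `output`, escaping HTML-special bytes.
/// ASSUMPTION: the four escaped characters are `&`, `<`, `>` and `"`.
pub fn escape_to_string(input: &str, output: &mut String) {
    // Byte offset where the current run of safe bytes begins.
    let mut start = 0;
    for (i, b) in input.bytes().enumerate() {
        let replacement = match b {
            b'&' => "&amp;",
            b'<' => "&lt;",
            b'>' => "&gt;",
            b'"' => "&quot;",
            // Safe byte: keep extending the current run.
            _ => continue,
        };
        // `i` indexes an ASCII byte and `start` is 0 or one past an ASCII
        // byte, so both ends of the slice are UTF-8 character boundaries.
        output.push_str(&input[start..i]);
        output.push_str(replacement);
        start = i + 1;
    }
    // Copy the trailing safe run (possibly empty) in one go.
    output.push_str(&input[start..]);
}
```

Calling it with an existing buffer appends rather than overwrites, so `escape_to_string("a < b", &mut out)` on an `out` that already holds `<p>` leaves `<p>a &lt; b`. Input without any specials takes a single `push_str` for the whole string.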

Behaviour is unchanged (same byte-set, same output). Both copies of the function (maud/src/escape.rs and maud_macros/src/escape.rs) are kept in sync as the header comment requires, and I added edge-case tests: empty input, no specials, all specials, adjacent/boundary specials, multi-byte UTF-8, and appending to a non-empty buffer.

Benchmarks

cargo +nightly bench -p maud, Apple M-series, 3 runs each, median:

Benchmark	before	after	change
`render_long_text` (long user prose through the escaper)	~325 ns	~235 ns	−28%
`render_template` (short splices)	~102 ns	~104 ns	within noise
`render_complicated_template`	~677 ns	~688 ns	within noise

The win shows up on text-heavy output (article bodies, comments, descriptions). Templates that only splice short strings are unaffected, since static markup is escaped at compile time and never reaches this function at runtime. A new render_long_text benchmark is included to cover this workload.

Note: replacing the for loop with iterator combinators does not enable autovectorization here — the side-effectful body (push_str) and the loop-carried offset block LLVM's loop vectorizer (verified in disassembly: identical scalar codegen). True SIMD would need an explicit byte-set search (e.g. memchr/jetscii/core::arch), which conflicts with the crate's no_std + stable + minimal-deps constraints, so it's intentionally left out.
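To make the note concrete, an iterator-combinator formulation might look like the sketch below. This is an illustration, not code from the PR: the name `escape_with_position` is hypothetical, and the escaped set (`&`, `<`, `>`, `"`) is the same assumption as elsewhere. It still calls `push_str` inside the loop and still carries the remaining slice from one iteration to the next, which are the two properties the note identifies as blocking the loop vectorizer.

```rust
/// Hypothetical combinator-style variant, for illustration only.
/// ASSUMPTION: the four escaped characters are `&`, `<`, `>` and `"`.
pub fn escape_with_position(input: &str, output: &mut String) {
    let mut rest = input;
    // Find the next special byte in what is left of the input.
    while let Some(i) = rest
        .bytes()
        .position(|b| matches!(b, b'&' | b'<' | b'>' | b'"'))
    {
        // Side effect inside the loop: copy the safe run before the special.
        output.push_str(&rest[..i]);
        output.push_str(match rest.as_bytes()[i] {
            b'&' => "&amp;",
            b'<' => "&lt;",
            b'>' => "&gt;",
            _ => "&quot;", // only `"` remains, given the predicate above
        });
        // Loop-carried state: the next search starts after this special.
        rest = &rest[i + 1..];
    }
    output.push_str(rest);
}
```

The output is the same as the plain `for` loop version; only the way the scan is expressed differs.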

Copy each contiguous run of safe bytes with a single `push_str` instead of pushing each byte individually, and drop the `unsafe` `as_mut_vec()` write in the process. Behaviour is unchanged; the byte-set and output are identical, now covered by additional edge-case tests (empty input, adjacent specials, multi-byte UTF-8). Neutral on templates that only splice short strings (static markup is escaped at compile time and never hits this path); noticeably faster when long dynamic text is escaped at runtime.

deliro wants to merge 1 commit into lambda-fairy:main from deliro:optimize-html-escaping
