Commit 62cf274

wasm-encoder: add method to allow fetching raw Function bytes. (#1630)
When generating function bodies with `wasm-encoder`, it is sometimes useful to take the function bytecode, cache it, then reuse it later. (For my specific use-case, in weval, I would like to cache weval-specialized function bodies and reuse them when creating new modules.)

Unfortunately the existing API of `wasm-encoder` makes this *almost* but *not quite* easy: `Function` implements `Encode::encode`, but this will produce a function body *with* the length prefixed. However, there's no way to use this unmodified with the next level up the entity hierarchy: `CodeSection::raw` wants the function bytes *without* the length prefixed.

This is a fairly annoying if small API gap, and otherwise requires manually stripping the length prefix, a leb128-encoded integer. It's also a small footgun: I naively did not realize this mismatch and tried to do the above, only to get perplexing type errors with locals.

This PR adds one method on `wasm_encoder::Function` to return the inner bytes directly, and the doc-comment contains an example showing the intended use-case.
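For context, the manual workaround mentioned above (stripping the leb128 length prefix by hand) might look roughly like the sketch below. It uses only the standard library; `read_u32_leb128` and `strip_length_prefix` are hypothetical helper names chosen for illustration and are not part of `wasm-encoder`.

```rust
/// Hypothetical helper (not part of wasm-encoder): decode an unsigned
/// LEB128 `u32` from the front of `bytes`. Returns the decoded value and
/// the number of bytes consumed, or `None` if the input is truncated or
/// does not fit in 32 bits.
fn read_u32_leb128(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    let mut shift = 0;
    for (i, &byte) in bytes.iter().enumerate() {
        // A u32 needs at most five LEB128 bytes (shifts 0, 7, 14, 21, 28).
        if shift >= 32 {
            return None;
        }
        // The fifth byte may only carry the top four bits of the value.
        if shift == 28 && byte & 0x70 != 0 {
            return None;
        }
        value |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
        shift += 7;
    }
    None
}

/// Hypothetical helper: given a length-prefixed function body (the shape
/// the commit message says `Function::encode` produces), return the body
/// without the prefix (the shape `CodeSection::raw` expects).
fn strip_length_prefix(encoded: &[u8]) -> Option<&[u8]> {
    let (len, consumed) = read_u32_leb128(encoded)?;
    let body = &encoded[consumed..];
    // Reject input whose declared length disagrees with what follows.
    if body.len() == len as usize {
        Some(body)
    } else {
        None
    }
}
```

For an empty function (zero local groups, then `end`), the length-prefixed form is `[0x02, 0x00, 0x0b]` and the stripped body is `[0x00, 0x0b]`. The method added in this commit makes a helper like this unnecessary.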
1 parent 3e38098 commit 62cf274

1 file changed

Lines changed: 58 additions & 0 deletions

crates/wasm-encoder/src/core/code.rs

@@ -274,6 +274,45 @@ impl Function {
         self.bytes.len()
     }
 
+    /// Unwraps and returns the raw byte encoding of this function.
+    ///
+    /// This encoding doesn't include the variable-width size field
+    /// that `encode` will write before the added bytes. As such, its
+    /// length will match the return value of [`byte_len`].
+    ///
+    /// # Use Case
+    ///
+    /// This raw byte form is suitable for later using with
+    /// [`CodeSection::raw`]. Note that it *differs* from what results
+    /// from [`Function::encode`] precisely due to the *lack* of the
+    /// length prefix; [`CodeSection::raw`] will use this. Using
+    /// [`Function::encode`] instead produces bytes that cannot be fed
+    /// into other wasm-encoder types without stripping off the length
+    /// prefix, which is awkward and error-prone.
+    ///
+    /// This method combined with [`CodeSection::raw`] may be useful
+    /// together if one wants to save the result of function encoding
+    /// and use it later: for example, caching the result of some code
+    /// generation process.
+    ///
+    /// For example:
+    ///
+    /// ```
+    /// # use wasm_encoder::{CodeSection, Function, Instruction};
+    /// let mut f = Function::new([]);
+    /// f.instruction(&Instruction::End);
+    /// let bytes = f.into_raw_body();
+    /// // (save `bytes` somewhere for later use)
+    /// let mut code = CodeSection::new();
+    /// code.raw(&bytes[..]);
+    ///
+    /// assert_eq!(2, bytes.len()); // Locals count, then `end`
+    /// assert_eq!(3, code.byte_len()); // Function length byte, function body
+    /// ```
+    pub fn into_raw_body(self) -> Vec<u8> {
+        self.bytes
+    }
+
     /// Parses a single instruction from `reader` and adds it to `self`.
     #[cfg(feature = "wasmparser")]
     pub fn parse(
@@ -3753,4 +3792,23 @@ mod tests {
 
         assert_eq!(f1.bytes, f2.bytes)
     }
+
+    #[test]
+    fn func_raw_bytes() {
+        use super::*;
+
+        let mut f = Function::new([(1, ValType::I32), (1, ValType::F32)]);
+        f.instruction(&Instruction::End);
+        let mut code_from_func = CodeSection::new();
+        code_from_func.function(&f);
+        let bytes = f.into_raw_body();
+        let mut code_from_raw = CodeSection::new();
+        code_from_raw.raw(&bytes[..]);
+
+        let mut c1 = vec![];
+        code_from_func.encode(&mut c1);
+        let mut c2 = vec![];
+        code_from_raw.encode(&mut c2);
+        assert_eq!(c1, c2);
+    }
 }
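
The caching use case named in the doc-comment can be illustrated with a small stdlib-only model. This is a sketch under stated assumptions, not the `wasm-encoder` implementation: `CodeSectionModel` and `BodyCache` are hypothetical stand-ins, and the model assumes that appending a raw body writes a leb128 length followed by the bytes, which is consistent with the `3 == code.byte_len()` assertion in the doc-comment. With the real crate, the cached bytes would come from `Function::into_raw_body` and be passed to `CodeSection::raw`.

```rust
use std::collections::HashMap;

/// Append `value` to `out` as an unsigned LEB128 integer.
fn write_u32_leb128(out: &mut Vec<u8>, mut value: u32) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        // More bytes follow, so set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Hypothetical stand-in for a code section: `raw` is assumed to write
/// the length prefix itself, so callers pass bodies without one.
struct CodeSectionModel {
    bytes: Vec<u8>,
    num_added: u32,
}

impl CodeSectionModel {
    fn new() -> Self {
        Self { bytes: Vec::new(), num_added: 0 }
    }

    fn raw(&mut self, body: &[u8]) -> &mut Self {
        write_u32_leb128(&mut self.bytes, body.len() as u32);
        self.bytes.extend_from_slice(body);
        self.num_added += 1;
        self
    }

    fn byte_len(&self) -> usize {
        self.bytes.len()
    }
}

/// Hypothetical cache of raw (unprefixed) function bodies, keyed by name.
struct BodyCache {
    bodies: HashMap<String, Vec<u8>>,
}

impl BodyCache {
    fn new() -> Self {
        Self { bodies: HashMap::new() }
    }

    /// Return the cached body for `key`, running `generate` only on a miss.
    fn get_or_generate(&mut self, key: &str, generate: impl FnOnce() -> Vec<u8>) -> &[u8] {
        self.bodies
            .entry(key.to_string())
            .or_insert_with(generate)
            .as_slice()
    }
}
```

Storing the unprefixed body keeps the cache independent of where the body is later placed: the section adds the length prefix when the body is appended, so a cached entry can be reused across modules without re-encoding.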
