Commit 443fc54

Store a full WasmFeatures inside of BinaryReader (#1548)
* Move `WasmFeatures` to a standalone module in `wasmparser`

This makes the type available even without the `validate` feature since it doesn't have any intrinsic dependencies on validation.

* Extend `WasmFeatures` with methods for testing features

A bit nicer than `.contains(WasmFeatures::FEATURE)` and abstracts the implementation for possible future iterations such as forcing features to always be on.

* Store a full `WasmFeatures` inside of `BinaryReader`

Historically wasmparser has had a pretty clean split between validation and parsing, with the idea that the parser produces a bunch of stuff and validation weeds it all out depending on activated features. Parsing has historically not had access to the set of active features, so that it works the same no matter what.

Over time, though, this has become quite the burden. The integration with the spec test repo in this repository tries to make sure that any invalid wasm binary matches the same error message, with a degree of flexibility. Not being able to take wasm features into account when parsing makes this quite difficult. For example, WebAssembly binaries that are invalid before a feature is implemented might be valid after the feature is implemented. They might also have an entirely different class of error after a feature is implemented. Effectively WebAssembly does not guarantee stability of the error message for invalid wasms as it evolves, only that valid wasms all remain interpreted the same way.

Wasmparser has historically had a number of hacks around this. The `roundtrip.rs` test suite has a very large function trying to match error messages between wasmparser and the spec interpreter. Lots of special cases happen here for wasmparser's feature-agnostic parser when run against the spec test suite, where each proposal has its own copy of the spec interpreter, possibly with subtly different binary decoders. Wasmparser additionally has hacks present in the public API such as `BinaryReader::allow_memarg64` and the `table_byte` field of the `CallIndirect` instruction. These have historically been around exclusively to get the spec tests passing but otherwise serve no purpose.

Overall, there's been a lot of pain historically from not being able to understand active wasm features when parsing wasm. It's quite easy to write a code path along the lines of "if this feature is active parse this way, otherwise parse that way", but the features have never been available to test this. Additionally, `BinaryReader` instances were created in quite a few locations throughout `wasmparser`, which would make threading the features around quite difficult.

This commit changes all of this. This commit replaces the `allow_memarg64: bool` field of `BinaryReader` with `features: WasmFeatures`. This means that all instances of `BinaryReader` have access to all activated features and will be able to parse differently when a feature is disabled or enabled. This should make it much easier to pass the various stages of spec tests depending on what happens to be merged into the spec interpreter.

To assist with threading this value around and to ensure it's accurate, the `new_with_offset` constructor of `BinaryReader` was removed and the only remaining `new` constructor now requires that features be specified. Much of `wasmparser` has now additionally been refactored to take a `BinaryReader` in its constructors rather than a slice-and-offset. This centralizes the construction of `BinaryReader` and cuts down on constant conversions in-and-out of a `BinaryReader`. For example, much of wasmparser now forwards along `BinaryReader` instances as-is.

This is a large-ish API change, too, for anyone constructing these manually (it shouldn't affect those exclusively using the validator). The intention is that it's generally ok to pass `WasmFeatures::all()` to the construction of a `BinaryReader`. For example, the `wasmparser::Parser` type defaults to having all features active. This means that errors will be different (or not present) for newer wasm features, but they'll all still get filtered out by the validator. The features in the `BinaryReader` are not intended to be a strict gating procedure such that it's an exact wasm MVP parser, for example; they are only available as necessary for spec tests and other relevant edge cases.

* Fix a rebase conflict

* Add missing calls to `shrink` for linking/reloc parsers

* Fix fuzzer compilation
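
As a rough illustration of the idea in the message above (a reader that carries the active feature set and parses differently depending on it), here is a minimal, self-contained sketch. The `Features` and `Reader` types, the `memory64()` predicate, and `read_memarg_offset` are hypothetical stand-ins written for this example; they are not the `wasmparser` API, and the real crate may gate different things in different ways.

```rust
// Toy model only: `Features` and `Reader` are hypothetical stand-ins,
// not the real `wasmparser::WasmFeatures` / `wasmparser::BinaryReader`.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Features(u32);

impl Features {
    pub const MEMORY64: Features = Features(1 << 0);
    pub const MULTI_MEMORY: Features = Features(1 << 1);

    pub const fn empty() -> Features {
        Features(0)
    }

    pub const fn all() -> Features {
        Features(Self::MEMORY64.0 | Self::MULTI_MEMORY.0)
    }

    pub const fn contains(self, other: Features) -> bool {
        self.0 & other.0 == other.0
    }

    // Predicate method: reads better at call sites than
    // `.contains(Features::MEMORY64)`.
    pub const fn memory64(self) -> bool {
        self.contains(Features::MEMORY64)
    }
}

pub struct Reader<'a> {
    data: &'a [u8],
    pos: usize,
    original_offset: usize,
    features: Features,
}

impl<'a> Reader<'a> {
    // The only constructor requires the feature set, so every reader has one.
    pub fn new(data: &'a [u8], original_offset: usize, features: Features) -> Reader<'a> {
        Reader { data, pos: 0, original_offset, features }
    }

    // Decodes an unsigned LEB128 integer of up to 64 bits.
    fn read_var_u64(&mut self) -> Result<u64, String> {
        let mut result = 0u64;
        let mut shift = 0u32;
        loop {
            let at = self.original_offset + self.pos;
            let byte = match self.data.get(self.pos) {
                Some(b) => *b,
                None => return Err(format!("unexpected end of input at offset {}", at)),
            };
            self.pos += 1;
            // The tenth byte may only contribute the single remaining bit.
            if shift == 63 && byte > 1 {
                return Err(format!("integer too large at offset {}", at));
            }
            result |= u64::from(byte & 0x7f) << shift;
            if byte & 0x80 == 0 {
                return Ok(result);
            }
            shift += 7;
        }
    }

    // Feature-dependent parsing: offsets above `u32::MAX` are accepted only
    // when the (hypothetical) memory64 flag is active.
    pub fn read_memarg_offset(&mut self) -> Result<u64, String> {
        let start = self.original_offset + self.pos;
        let value = self.read_var_u64()?;
        if !self.features.memory64() && value > u64::from(u32::MAX) {
            return Err(format!("offset out of range at offset {}", start));
        }
        Ok(value)
    }
}
```

In this sketch the same five bytes `[0x80, 0x80, 0x80, 0x80, 0x10]` (LEB128 for 2^32) parse successfully with `Features::all()` and are rejected with `Features::empty()`, which mirrors the "invalid before a feature, valid after" situation the message describes.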
1 parent 143aacb commit 443fc54

51 files changed

Lines changed: 598 additions & 544 deletions

crates/wasm-encoder/src/core/code.rs

Lines changed: 3 additions & 1 deletion
@@ -83,11 +83,13 @@ impl CodeSection {
     /// into a new code section encoder:
     ///
     /// ```
+    /// # use wasmparser::{BinaryReader, WasmFeatures, CodeSectionReader};
     /// // id, size, # entries, entry
     /// let code_section = [10, 6, 1, 4, 0, 65, 0, 11];
     ///
     /// // Parse the code section.
-    /// let reader = wasmparser::CodeSectionReader::new(&code_section, 0).unwrap();
+    /// let reader = BinaryReader::new(&code_section, 0, WasmFeatures::all());
+    /// let reader = CodeSectionReader::new(reader).unwrap();
     /// let body = reader.into_iter().next().unwrap().unwrap();
     /// let body_range = body.range();
     ///

crates/wasm-metadata/src/lib.rs

Lines changed: 8 additions & 5 deletions
@@ -9,8 +9,8 @@ use std::mem;
 use std::ops::Range;
 use wasm_encoder::{ComponentSection as _, ComponentSectionId, Encode, Section};
 use wasmparser::{
-    ComponentNameSectionReader, KnownCustom, NameSectionReader, Parser, Payload::*,
-    ProducersSectionReader,
+    BinaryReader, ComponentNameSectionReader, KnownCustom, NameSectionReader, Parser, Payload::*,
+    ProducersSectionReader, WasmFeatures,
 };

 /// A representation of a WebAssembly producers section.
@@ -64,7 +64,8 @@ impl Producers {
     }
     /// Read the producers section from a Wasm binary.
     pub fn from_bytes(bytes: &[u8], offset: usize) -> Result<Self> {
-        let section = ProducersSectionReader::new(bytes, offset)?;
+        let reader = BinaryReader::new(bytes, offset, WasmFeatures::all());
+        let section = ProducersSectionReader::new(reader)?;
         let mut fields = IndexMap::new();
         for field in section.into_iter() {
             let field = field?;
@@ -603,7 +604,8 @@ impl<'a> ModuleNames<'a> {
     /// Read a name section from a WebAssembly binary. Records the module name, and all other
     /// contents of name section, for later serialization.
     pub fn from_bytes(bytes: &'a [u8], offset: usize) -> Result<ModuleNames<'a>> {
-        let section = NameSectionReader::new(bytes, offset);
+        let reader = BinaryReader::new(bytes, offset, WasmFeatures::all());
+        let section = NameSectionReader::new(reader);
         let mut s = Self::empty();
         for name in section.into_iter() {
             let name = name?;
@@ -688,7 +690,8 @@ impl<'a> ComponentNames<'a> {
     /// Read a component-name section from a WebAssembly binary. Records the component name, as
     /// well as all other component name fields for later serialization.
     pub fn from_bytes(bytes: &'a [u8], offset: usize) -> Result<ComponentNames<'a>> {
-        let section = ComponentNameSectionReader::new(bytes, offset);
+        let reader = BinaryReader::new(bytes, offset, WasmFeatures::all());
+        let section = ComponentNameSectionReader::new(reader);
         let mut s = Self::empty();
         for name in section.into_iter() {
             let name = name?;

crates/wasm-mutate/src/info.rs

Lines changed: 5 additions & 18 deletions
@@ -5,7 +5,7 @@ use crate::{
 use std::collections::HashSet;
 use std::ops::Range;
 use wasm_encoder::{RawSection, SectionId};
-use wasmparser::{Chunk, Parser, Payload};
+use wasmparser::{BinaryReader, Chunk, Parser, Payload, WasmFeatures};

 /// Provides module information for future usage during mutation
 /// an instance of ModuleInfo could be user to determine which mutation could be applied
@@ -220,7 +220,8 @@ impl<'a> ModuleInfo<'a> {
     pub fn has_nonempty_code(&self) -> bool {
         if let Some(section) = self.code {
             let section_data = self.raw_sections[section].data;
-            wasmparser::CodeSectionReader::new(section_data, 0)
+            let reader = BinaryReader::new(section_data, 0, WasmFeatures::all());
+            wasmparser::CodeSectionReader::new(reader)
                 .map(|r| r.count() != 0)
                 .unwrap_or(false)
         } else {
@@ -247,21 +248,12 @@ impl<'a> ModuleInfo<'a> {
         });
     }

-    pub fn get_type_section(&self) -> Option<RawSection<'a>> {
-        let idx = self.types?;
-        Some(self.raw_sections[idx])
-    }
-
     pub fn get_code_section(&self) -> RawSection<'a> {
         self.raw_sections[self.code.unwrap()]
     }

-    pub fn get_exports_section(&self) -> RawSection<'a> {
-        self.raw_sections[self.exports.unwrap()]
-    }
-
-    pub fn get_data_section(&self) -> RawSection<'a> {
-        self.raw_sections[self.data.unwrap()]
+    pub fn get_binary_reader(&self, i: usize) -> wasmparser::BinaryReader<'a> {
+        BinaryReader::new(self.raw_sections[i].data, 0, WasmFeatures::all())
     }

     pub fn has_exports(&self) -> bool {
@@ -281,11 +273,6 @@ impl<'a> ModuleInfo<'a> {
         self.global_types.len()
     }

-    /// Returns the global section bytes as a `RawSection` instance
-    pub fn get_global_section(&self) -> RawSection {
-        self.raw_sections[self.globals.unwrap()]
-    }
-
     /// Insert a new section as the `i`th section in the Wasm module.
     pub fn insert_section(
         &self,

crates/wasm-mutate/src/mutators/add_function.rs

Lines changed: 5 additions & 6 deletions
@@ -21,8 +21,8 @@ impl Mutator for AddFunctionMutator {
         // (Re)encode the function section and add this new entry.
         let mut func_sec_enc = wasm_encoder::FunctionSection::new();
         if let Some(func_sec_idx) = config.info().functions {
-            let raw_func_sec = config.info().raw_sections[func_sec_idx];
-            let reader = wasmparser::FunctionSectionReader::new(raw_func_sec.data, 0)?;
+            let reader = config.info().get_binary_reader(func_sec_idx);
+            let reader = wasmparser::FunctionSectionReader::new(reader)?;
             for x in reader {
                 func_sec_enc.function(x?);
             }
@@ -33,12 +33,11 @@ impl Mutator for AddFunctionMutator {
         // this function.
         let mut code_sec_enc = wasm_encoder::CodeSection::new();
         if let Some(code_sec_idx) = config.info().code {
-            let raw_code_sec = config.info().raw_sections[code_sec_idx];
-            let reader = wasmparser::CodeSectionReader::new(raw_code_sec.data, 0)?;
+            let reader = config.info().get_binary_reader(code_sec_idx);
+            let reader = wasmparser::CodeSectionReader::new(reader)?;
             for body in reader {
                 let body = body?;
-                let range = body.range();
-                code_sec_enc.raw(&raw_code_sec.data[range.start..range.end]);
+                code_sec_enc.raw(body.as_bytes());
             }
         }
         let func_ty = match &config.info().types_map[usize::try_from(ty_idx).unwrap()] {
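
The second hunk above swaps "slice the section by the body's range" for "ask the body for its bytes". A minimal, self-contained sketch of why the two can be interchangeable is below. `Body`, `bodies`, and both copy functions are hypothetical toy code, not `wasmparser` types; the sketch assumes single-byte length prefixes and that a body's range covers its contents only, excluding the length prefix.

```rust
use std::ops::Range;

// Hypothetical stand-in for a parsed function body; not the wasmparser type.
pub struct Body<'a> {
    bytes: &'a [u8],
    // Position of `bytes[0]` within the enclosing section payload.
    offset: usize,
}

impl<'a> Body<'a> {
    pub fn range(&self) -> Range<usize> {
        self.offset..self.offset + self.bytes.len()
    }

    pub fn as_bytes(&self) -> &'a [u8] {
        self.bytes
    }
}

// Toy section format: one count byte, then `count` entries, each a
// single length byte followed by that many body bytes.
// Parsing stops quietly at the first truncated entry.
pub fn bodies(section: &[u8]) -> Vec<Body<'_>> {
    let mut out = Vec::new();
    let count = match section.first() {
        Some(c) => *c,
        None => return out,
    };
    let mut pos = 1;
    for _ in 0..count {
        let len = match section.get(pos) {
            Some(l) => *l as usize,
            None => break,
        };
        let start = pos + 1;
        let end = start + len;
        match section.get(start..end) {
            Some(bytes) => out.push(Body { bytes, offset: start }),
            None => break,
        }
        pos = end;
    }
    out
}

// Old style: the caller must keep the original section slice around
// and index into it with the body's range.
pub fn copy_via_range(section: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for body in bodies(section) {
        let range = body.range();
        out.extend_from_slice(&section[range.start..range.end]);
    }
    out
}

// New style: the body hands out its own bytes, so the caller no longer
// needs access to the raw section data.
pub fn copy_via_as_bytes(section: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for body in bodies(section) {
        out.extend_from_slice(body.as_bytes());
    }
    out
}
```

Under these assumptions both functions produce identical output; the practical difference is that the second no longer needs the raw section slice, which fits with the mutators now obtaining readers through `get_binary_reader` instead of holding a `RawSection`.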

crates/wasm-mutate/src/mutators/add_type.rs

Lines changed: 3 additions & 2 deletions
@@ -52,9 +52,10 @@ impl Mutator for AddTypeMutator {
         }

         let mut types = wasm_encoder::TypeSection::new();
-        if let Some(old_types) = config.info().get_type_section() {
+        if let Some(old_types) = config.info().types {
             // Copy the existing types section over into the encoder.
-            let reader = wasmparser::TypeSectionReader::new(old_types.data, 0)?;
+            let reader = config.info().get_binary_reader(old_types);
+            let reader = wasmparser::TypeSectionReader::new(reader)?;
             for ty in reader.into_iter_err_on_gc_types() {
                 let ty = ty?;
                 let params = ty

crates/wasm-mutate/src/mutators/codemotion.rs

Lines changed: 8 additions & 9 deletions
@@ -62,9 +62,9 @@ impl CodemotionMutator {
         config: &mut WasmMutate,
         mutators: &[Box<dyn AstMutator>],
     ) -> crate::Result<(Function, u32)> {
-        let original_code_section = config.info().get_code_section();
-
-        let sectionreader = CodeSectionReader::new(original_code_section.data, 0)?;
+        let original_code_section = config.info().code.unwrap();
+        let reader = config.info().get_binary_reader(original_code_section);
+        let sectionreader = CodeSectionReader::new(reader)?;
         let function_count = sectionreader.count();
         let function_to_mutate = config.rng().gen_range(0..function_count);

@@ -75,8 +75,7 @@ impl CodemotionMutator {
         for fidx in (function_to_mutate..function_count).chain(0..function_to_mutate) {
             config.consume_fuel(1)?;
             let reader = all_readers[fidx as usize].clone();
-            let mut operatorreader = reader.get_operators_reader()?;
-            operatorreader.allow_memarg64(true);
+            let operatorreader = reader.get_operators_reader()?;

             let operators = operatorreader
                 .into_iter_with_offsets()
@@ -101,7 +100,7 @@ impl CodemotionMutator {
                     &ast,
                     &self.copy_locals(reader)?,
                     &operators,
-                    original_code_section.data,
+                    config.info().raw_sections[original_code_section].data,
                 )?;
                 return Ok((newfunc, fidx));
             }
@@ -143,16 +142,16 @@ impl Mutator for CodemotionMutator {
         let (newfunc, function_to_mutate) = self.random_mutate(config, &mutators)?;

         let mut codes = CodeSection::new();
-        let code_section = config.info().get_code_section();
-        let sectionreader = CodeSectionReader::new(code_section.data, 0)?;
+        let code_section = config.info().get_binary_reader(config.info().code.unwrap());
+        let sectionreader = CodeSectionReader::new(code_section)?;

         for (fidx, reader) in sectionreader.into_iter().enumerate() {
             let reader = reader?;
             if fidx as u32 == function_to_mutate {
                 log::trace!("Mutating function {}", fidx);
                 codes.function(&newfunc);
             } else {
-                codes.raw(&code_section.data[reader.range().start..reader.range().end]);
+                codes.raw(reader.as_bytes());
             }
         }
         let module = config

crates/wasm-mutate/src/mutators/custom.rs

Lines changed: 2 additions & 3 deletions
@@ -28,9 +28,8 @@ impl Mutator for CustomSectionMutator {
         assert!(!custom_section_indices.is_empty());

         let custom_section_index = *custom_section_indices.choose(config.rng()).unwrap();
-        let old_custom_section = &config.info().raw_sections[custom_section_index];
-        let old_custom_section =
-            wasmparser::CustomSectionReader::new(old_custom_section.data, 0).unwrap();
+        let reader = config.info().get_binary_reader(custom_section_index);
+        let old_custom_section = wasmparser::CustomSectionReader::new(reader).unwrap();

         let name_string;
         let data_vec;

crates/wasm-mutate/src/mutators/function_body_unreachable.rs

Lines changed: 4 additions & 3 deletions
@@ -19,8 +19,9 @@ impl Mutator for FunctionBodyUnreachable {
     ) -> Result<Box<dyn Iterator<Item = Result<Module>> + 'a>> {
         let mut codes = CodeSection::new();

-        let code_section = config.info().get_code_section();
-        let reader = CodeSectionReader::new(code_section.data, 0)?;
+        let code_section = config.info().code.unwrap();
+        let reader = config.info().get_binary_reader(code_section);
+        let reader = CodeSectionReader::new(reader)?;

         let count = reader.count();
         let function_to_mutate = config.rng().gen_range(0..count);
@@ -38,7 +39,7 @@ impl Mutator for FunctionBodyUnreachable {

                 codes.function(&f);
             } else {
-                codes.raw(&code_section.data[f.range().start..f.range().end]);
+                codes.raw(f.as_bytes());
             }
         }

crates/wasm-mutate/src/mutators/modify_const_exprs.rs

Lines changed: 4 additions & 3 deletions
@@ -155,7 +155,8 @@ impl Mutator for ConstExpressionMutator {
                 let mutate_idx = config.rng().gen_range(0..num_total);
                 let section = config.info().globals.ok_or(skip_err)?;
                 let mut new_section = GlobalSection::new();
-                let reader = GlobalSectionReader::new(config.info().raw_sections[section].data, 0)?;
+                let reader = config.info().get_binary_reader(section);
+                let reader = GlobalSectionReader::new(reader)?;
                 let mut translator = InitTranslator {
                     config,
                     skip_inits: 0,
@@ -179,8 +180,8 @@ impl Mutator for ConstExpressionMutator {
                 let mutate_idx = config.rng().gen_range(0..num_total);
                 let section = config.info().elements.ok_or(skip_err)?;
                 let mut new_section = ElementSection::new();
-                let reader =
-                    ElementSectionReader::new(config.info().raw_sections[section].data, 0)?;
+                let reader = config.info().get_binary_reader(section);
+                let reader = ElementSectionReader::new(reader)?;
                 let mut translator = InitTranslator {
                     config,
                     skip_inits: 0,

crates/wasm-mutate/src/mutators/modify_data.rs

Lines changed: 3 additions & 1 deletion
@@ -18,7 +18,9 @@ impl Mutator for ModifyDataMutator {
         config: &'a mut WasmMutate,
     ) -> Result<Box<dyn Iterator<Item = Result<Module>> + 'a>> {
         let mut new_section = DataSection::new();
-        let reader = DataSectionReader::new(config.info().get_data_section().data, 0)?;
+        let section_idx = config.info().data.unwrap();
+        let reader = config.info().get_binary_reader(section_idx);
+        let reader = DataSectionReader::new(reader)?;

         // Select an arbitrary data segment to modify.
         let data_to_modify = config.rng().gen_range(0..reader.count());
