wasip3: Fix fallback code when buffering writes#788
wasip3: Fix fallback code when buffering writes#788alexcrichton merged 4 commits intoWebAssembly:mainfrom
Conversation
This commit fixes mistakes from WebAssembly#782 where the fallback code for buffering writes wasn't handled properly. Specifically when a completion event came in through `poll` it didn't fully update the internal state of the stream, such as kicking off further writes or cleaning up the internal buffer. This is fixed by refactoring the logic already present during a normal `write` to get executed in this event. Testing this is unfortunately not trivial. There aren't really all that many streams we have access to natively in Wasmtime, and TCP is "well behaved" and doesn't execute any of these fallbacks anyway. The closest way to test this is filesystem-based streams, but Wasmtime's default behavior is to lie and just do blocking operations on these streams regardless. Through changes in Wasmtime, and flags to Wasmtime, however, it's possible to Wasmtime in a mode where filesystem streams are "ill behaved" where zero-length reads don't signal readiness, triggering these fallback paths. With those changes the added test here, and minor additionally support for nonblocking reads/writes, exercise these paths and triggers the previous bug.
|
The companion Wasmtime PR for this is bytecodealliance/wasmtime#13163 which exercises these tests more thoroughly. |
| read->offset = &file->offset; | ||
| read->timeout = 0; | ||
| read->blocking = true; | ||
| read->blocking = (file->oflag & O_NONBLOCK) == 0; |
There was a problem hiding this comment.
Does this mean read/write can start returning EWOULDBLOCK for regular files?
There was a problem hiding this comment.
It does, yeah. I'm not sure how useful that is in practice, but it's the only way I could think of to test this PR
There was a problem hiding this comment.
I suppose if an application requested O_NONBLOCK, then getting EWOULDBLOCK is exactly what it asked for. Nonetheless, I wonder how many programs are actually prepared to handle that for regular files, given ~40 years of assumptions that the flag is ignored for those.
There was a problem hiding this comment.
Er sorry meant to think about this a bit more before landing. Do you feel that this should get backed out? I don't know how to test anything related to "ill behaving streams" otherwise since files are basically the only host-defined stream that doesn't handle zero-length reads/writes like TCP does.
There was a problem hiding this comment.
The Linux man page mentions:
Note that this flag has no effect for regular files and block devices; that is, I/O operations will (briefly) block when device activity is required, regardless of whether
O_NONBLOCKis set.
And also:
Since
O_NONBLOCKsemantics might eventually be implemented, applications should not depend upon blocking behavior when specifying this flag for regular files and block devices.
So yeah,. there's that.. :)
I feel this leaks testing concerns into user-visible behavior so I'm leaning toward keeping the fix but backing out the nonblocking behavior for regular files. I don't have a super strong well-funded opinion for it though.
That would indeed leave some of the new code untested. As long as the test suite uses off-the-shelf, unmodified Wasm engines, we're unlikely to hit all code paths anyway. A longer-term solution might be some kind of host-side "chaos mode" that deterministically generates all kinds of odd-but-technically-correct stream behaviors. That could also be more generally useful for other languages and toolchains targeting the component model directly. I realize that is completely out of scope for this PR.
There was a problem hiding this comment.
ah right, this could indeed be a nice use case for component composition
This commit refactors how `offset` is updated during the internal read/write operations on streams. This isn't applicable for TCP but matters for files, for example. There's not a great way to test the paths that were previously missing the update but the code is structured differently now to make it much harder to forget to do an update. The test modified here is a light test which doesn't reproduce the original issue but at least ensures that nonblocking paths update the offset as well. > **Note**: this is temporarily built on #788
Concerns brought up in #788 rightfully point out that this doesn't work on other platforms and it's been such a core assumption for so long we run more of a risk of breaking code by supporting it than we do continuing to not support it. This means that some recent changes no longer have test coverage, but that seems best addressed by #766 so it's deferred to that.
This commit fixes mistakes from #782 where the fallback code for buffering writes wasn't handled properly. Specifically when a completion event came in through
pollit didn't fully update the internal state of the stream, such as kicking off further writes or cleaning up the internal buffer. This is fixed by refactoring the logic already present during a normalwriteto get executed in this event.Testing this is unfortunately not trivial. There aren't really all that many streams we have access to natively in Wasmtime, and TCP is "well behaved" and doesn't execute any of these fallbacks anyway. The closest way to test this is filesystem-based streams, but Wasmtime's default behavior is to lie and just do blocking operations on these streams regardless. Through changes in Wasmtime, and flags to Wasmtime, however, it's possible to Wasmtime in a mode where filesystem streams are "ill behaved" where zero-length reads don't signal readiness, triggering these fallback paths. With those changes the added test here, and minor additionally support for nonblocking reads/writes, exercise these paths and triggers the previous bug.