refactor(shm)!: Extract one_way_shared_memory to IPC and prepare libdd-remote-config for python#2121
Conversation
🎉 All green!🧪 All tests passed 🎯 Code Coverage (details) 🔗 Commit SHA: c06b8a8 | Docs | Datadog PR Page | Give us feedback! |
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f999625063
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| pub mod example_interface; | ||
| pub mod handles; | ||
|
|
||
| pub mod one_way_shared_memory; |
There was a problem hiding this comment.
Restore the sidecar shared-memory export
Removing the public datadog_sidecar::one_way_shared_memory module breaks the FFI crate that still imports datadog_sidecar::one_way_shared_memory::{OneWayShmReader, ReaderOpener} in datadog-sidecar-ffi/src/lib.rs, so workspace builds that include datadog-sidecar-ffi fail after this move. Please either keep a compatibility re-export from datadog-sidecar or update the FFI crate to use the new datadog_ipc::one_way_shared_memory API and drop the removed ReaderOpener bound.
Useful? React with 👍 / 👎.
| #[repr(C)] | ||
| #[derive(Debug, Copy, Clone, Hash, Eq, PartialEq, Serialize, Deserialize)] | ||
| #[serde(rename_all = "SCREAMING_SNAKE_CASE")] | ||
| #[cfg_attr(feature = "pyo3", pyo3::pyclass(eq, eq_int, from_py_object))] |
There was a problem hiding this comment.
In this case I'd leave the the pyo3 mirroring strictly to the dd-trace-py. That's the model we use for other crates which, IMO, is more flexible and maintanable for both teams.
There was a problem hiding this comment.
done, made enum inspection available via strum to achieve the same goal.
| /// The handle can be mapped by readers in the same process or inherited by | ||
| /// forked children (an anonymous mapping survives `fork`), letting them build a | ||
| /// [`OneWayShmReader`] over what this writer publishes. The segment starts at one | ||
| /// page and grows on demand as larger buffers are written. Use |
There was a problem hiding this comment.
The segment starts at one page and grows on demand as larger buffers are written
How does that work when the page is already shared? Is the existing data still left in place, and there's some linked list of pages in ShmHandle? Or is it copied over to some larger location? How does the reader notice that?
| /// `handle` is the live mapping to read from — typically an anonymous segment | ||
| /// inherited across a `fork`. Passing `None` leaves the reader without a | ||
| /// mapping; since this constructor installs no opener, such a reader stays | ||
| /// inert (empty reads, `wait_for_change` sleeps) until a handle is supplied — |
There was a problem hiding this comment.
Naive question: what is the use case for allowing a reader to have no underlying handle?
There was a problem hiding this comment.
It makes no sense without opener, that's correct.
There was a problem hiding this comment.
Should new take a MappedMem<T> instead of an Option<_>? (whether you keep the option internally so that you can take it later is a different question)
There was a problem hiding this comment.
Yes, that's what I did :-D
| let copied_data: &RawData = new_mem.as_slice().into(); | ||
|
|
||
| // Ensure a new write hasn't started yet | ||
| fence(Ordering::Acquire); // prevent loads before from being reordered with gen load after |
There was a problem hiding this comment.
I don't think an acquire fence provide that. Usually you can move things past it. What you can't do is move things before it.
To be fair, this kind of passive reader scheme is just impossible to express safely in current Rust. The seqlock crate does use an Acquire fence as well, but I'm not sure what kind of guarantees it provides. On paper a Release fence is more adapted but it's not clear how it interacts with the load below since load can't be Release. Or maybe use a SeqCst fence just to be more constraining? That's what we do for the Otel process-level context, although this is more costly (in the OTel part we don't care, it's not read very often)
There was a problem hiding this comment.
We must not move the generation load before the read of the data from shm. Which is what the acquire guarantees.
Release fences are centered around writes, not reads.
That's why the comment says "prevent loads (from shm) from being reordered...".
The read/write ordering within a same thread is guaranteed to be consistent by the rust memory model (i.e. this doesn't need to be fenced).
There was a problem hiding this comment.
Ordering requirements when reading are always towards external writers, i.e. reads from something which might be written to by some external writer.
I.e. the compiler is not allowed to assume that source_data points to the same data than the source used to copy from into source_data after the fence - i.e. not allowed to alias source and target - it must thus create a data dependency on the target data. Which it would ordinarily be allowed to, as copied data is "obviously" identical to the source.
There was a problem hiding this comment.
I'm not sure to understand exactly what you mean, sorry. Maybe I don't really understand the original comment.
While re-ordering is a useful mental model, it's not how the Rust memory model is defined anyway. What acquire gives you is that it synchronize the load on this thread with the corresponding store that put the value there on some other thread, creating a connection: that the store happens-before the load. So all observable memory accesses (shared but potentially non-atomic) done on the writer before writing the value to the atomic are visible on the reader side after we load. Since the load happens before whatever is next done on the reader side (sequenced-before), all memory accesses (shared but potentially non-atomic) occurring after the load will also not be re-ordered before the load (or acquire fence in our case).
However, it doesn't tell anything about accesses that are sequenced before the atomic load/ the fence on the reader thread. The following is, I believe, a totally legal re-ordering, if the compiler or the hardware think it can't be observed by the current thread (but this is always the case for a bunch of reads):
read address A (non-atomic)
read address B (non-atomic)
atomic load relaxed C
acquire fence
...rest
to
atomic load relaxed C
acquire fence
read address A (non-atomic)
read address B (non-atomic)
..rest
Whether it can happen in this specific case because of the data dependency is another question 🤷 , but I think the acquire fence doesn't help preventing it, or at least is not guaranteed to help.
There was a problem hiding this comment.
(and by the way, an acquire fence should be placed after the corresponding load(s), see https://doc.rust-lang.org/std/sync/atomic/fn.fence.html#atomic---fence -- I'm not sure what is the semantics when put before)
There was a problem hiding this comment.
Our acquire fence is after the load by new_mem.extend_from_slice(unsafe { reinterpret_u8_as_u64_slice(&handle.as_slice()[0..size]) });. The problem is, as you say, that I cannot make this an atomic memcpy, because rust doesn't support it currently. But it's that one which the fence is supposed to synchronize with.
I think you might have a point that the subsequent Relaxed load also needs a release fence before, requiring AcqRel.
There was a problem hiding this comment.
Ah, I see, you're trying to protect this access. Indeed the seqlock-like readers like this just can't be implemented properly (at least in theory) in Rust, and the semantics of fences is a bit crazy (the whole "atomics are required" part is really unclear). I don't think there's a canonical and satisfying solution. I would say if perf is not critical here SeqCst might be safer (I know, I am the one usually ranting against SeqCst by default)
However, the seqlock crate does as you do, so I suppose it's also somehow reasonable. Acquire fences should be compiled to dmb ishld on arm, which provides sufficient guarantees on the hardware side in practice (all reads are completed before proceeding).
People have used volatile_read as well in addition to atomics, as this has lower risk of being re-ordered past atomics. Not sure if it's necessary.
There was a problem hiding this comment.
Exactly, and the dmb ishld on ARM is what we need, in practice. (x86 with its total store ordering is unproblematic).
d58b844 to
806bad3
Compare
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
…for python Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
|
/merge |
|
View all feedbacks in Devflow UI.
The expected merge time in
|
Adds pyo3 to the enums to expose them directly, fixes the acknowledgment status and adds futex_wait to shm.