fix(node): carry full owner DID on ref-update wire event (#144)#145
fix(node): carry full owner DID on ref-update wire event (#144)#145Gravirei wants to merge 15 commits into
Conversation
Gitlawb#126) The IPFS visibility gate used withheld_blob_oids (a deny-set enumerating only reachable blobs), so a dangling/unreachable blob was absent from the set and served in cleartext to anonymous callers. Flip to an allowed-set (allowed_blob_set_for_caller) that enumerates reachable blobs the caller may read: a dangling blob has no path, is never in the set, and 404s.
Move store::read_object before the allowed_blob_set_for_caller spawn_blocking call so random-CID spray against repos with path-scoped rules cannot trigger full-history git walks on repos that don't carry the object.
…tion ✓ P3 blocker fixed: cargo fmt applied — the format gate will pass. ✓ P3 cleanup resolved: withheld_blob_oids is still used by replication code in repos.rs, so it stays. • P2 follow-up: Tree/commit disclosure tracked in Gitlawb#135 — out of scope here.
|
Thanks for the contribution. A couple of things will help us review this faster:
See CONTRIBUTING.md. Update the PR and these notes will clear automatically. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughAdds an optional ChangesOwner DID propagation
Estimated code review effort: 3 (Moderate) | ~25 minutes Possibly related issues
Possibly related PRs
Suggested labels: Suggested reviewers: 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
crates/gitlawb-node/src/test_support.rs (1)
1969-2043: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueConsider a shared builder for
ReceivedRefUpdatetest fixtures.Both tests construct near-identical
ReceivedRefUpdateliterals inline. A shared helper (similar to theupdate(...)fixture already used in db/mod.rs tests) would reduce duplication and make future field additions (likeowner_did) less error-prone to keep in sync across test files.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/test_support.rs` around lines 1969 - 2043, The `events_returns_inserted_ref_updates` and `events_limit_respects_limit_param` tests duplicate the same `crate::db::ReceivedRefUpdate` setup inline, so add a shared test fixture/builder for `ReceivedRefUpdate` in `test_support.rs` and use it in both cases. Model it after the existing `update(...)` helper used in the db tests, and make sure the helper accepts overrides for fields like `repo`, `owner_did`, `new_sha`, and timestamps so future schema changes stay consistent across tests.crates/gitlawb-node/src/db/mod.rs (1)
3919-4077: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAssert
owner_didin the filtered-by-repo test too.
list_ref_updates_filtered_by_repo(lines 4027-4058) inserts records with distinctowner_didvalues but only assertsid/count, notowner_did, unlike the siblinglist_repo_ref_updates_filters_by_repotest which does check it. Adding that assertion would close a small gap in coverage for the exact field this PR introduces.♻️ Proposed test strengthening
let filtered = db .list_ref_updates_filtered(Some("ownerA/proj"), 100) .await .unwrap(); assert_eq!(filtered.len(), 1); assert_eq!(filtered[0].id, "u5"); + assert_eq!(filtered[0].owner_did.as_deref(), Some("did:key:zA"));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/db/mod.rs` around lines 3919 - 4077, The list_ref_updates_filtered_by_repo test is missing coverage for the new owner_did behavior. Update ref_update_db_tests::list_ref_updates_filtered_by_repo to assert the returned row’s owner_did matches the inserted value, similar to list_repo_ref_updates_filters_by_repo, so the filtered path verifies both repo filtering and owner_did preservation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gitlawb-node/src/test_support.rs`:
- Around line 1969-2007: The ref-update events payload returned by the events
API is still missing owner_did, so the current test only verifies persistence in
Postgres. Update the ref-update feed serialization in the events response path
to include owner_did, using the relevant event model/handler that backs
events_router and GET /api/v1/events/ref-updates, and then extend
events_returns_inserted_ref_updates to assert events[0]["owner_did"] matches the
inserted value.
---
Nitpick comments:
In `@crates/gitlawb-node/src/db/mod.rs`:
- Around line 3919-4077: The list_ref_updates_filtered_by_repo test is missing
coverage for the new owner_did behavior. Update
ref_update_db_tests::list_ref_updates_filtered_by_repo to assert the returned
row’s owner_did matches the inserted value, similar to
list_repo_ref_updates_filters_by_repo, so the filtered path verifies both repo
filtering and owner_did preservation.
In `@crates/gitlawb-node/src/test_support.rs`:
- Around line 1969-2043: The `events_returns_inserted_ref_updates` and
`events_limit_respects_limit_param` tests duplicate the same
`crate::db::ReceivedRefUpdate` setup inline, so add a shared test
fixture/builder for `ReceivedRefUpdate` in `test_support.rs` and use it in both
cases. Model it after the existing `update(...)` helper used in the db tests,
and make sure the helper accepts overrides for fields like `repo`, `owner_did`,
`new_sha`, and timestamps so future schema changes stay consistent across tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6d0fa77a-014d-4a04-af46-9ef99e20fe27
📒 Files selected for processing (5)
crates/gitlawb-node/src/api/peers.rscrates/gitlawb-node/src/api/repos.rscrates/gitlawb-node/src/db/mod.rscrates/gitlawb-node/src/p2p/mod.rscrates/gitlawb-node/src/test_support.rs
…ef-update event handling with owner DID
beardthelion
left a comment
There was a problem hiding this comment.
The write-side plumbing is right: the full DID comes from the local repo record, it's threaded through both the gossip and HTTP-notify paths, and the #[serde(default)] / nullable column keeps older peers compatible. CodeRabbit's feed-serialization ask is genuinely addressed too. Two things block merge, and the first is a hard one: the schema change breaks every existing node on upgrade, and the PR closes #144 without fixing #144's symptom.
Findings
-
[P1] Ship
owner_didas a new migration version, not appended to v1
crates/gitlawb-node/src/db/mod.rs:542
TheALTER TABLE ... ADD COLUMN owner_didandidx_ref_updates_ownerwere added insideMigration { version: 1 }, butrun_migrationsskips any version already recorded inschema_migrations, and the catalogue comment right above the array says new schema must be a new version (v2, v3, ...). Every existing node has v1 recorded, so on upgrade the column is never created;insert_ref_update(binds$12) and all threelist_*SELECTs then reference a missing column and error, which stops ref-update ingestion on both the gossip and notify paths and 500s the events feed. Fresh databases pass because they run v1 from scratch, which is why CI is green. I reproduced it against the real migration runner: an existing node (v1..v9 recorded, noowner_did) upgraded to this build leaves the column absent and the insert/select fail; moving both statements into a newMigration { version: 10 }makes the same upgrade round-trip cleanly. -
[P2] Don't close #144 on this PR; nothing consumes
owner_didyet
crates/gitlawb-node/src/api/events.rs:38
#144 has two halves: carry the full owner DID (done here) and make the feed gate use it to stop the over-drop. Nothing in this diff readsowner_didfor a visibility decision; it's stored and echoed in the feed JSON only, andidx_ref_updates_owneris an index no query uses. The gate that over-drops isn't in this branch, so merging this auto-closes #144 while the reported symptom persists. Keep the plumbing, but change "Closes #144" to "Refs #144" and land the gate consumption (the did:key-aware match #144 describes) in the follow-up that sits on top of the gate. -
[P3] Populate
owner_didon the other read surfaces, or hold the JSON change
crates/gitlawb-node/src/graphql/query.rs:61,crates/gitlawb-node/src/api/events.rs:117
The field lands on the global REST feed but the GraphQLref_updatesresolver drops it (RefUpdateTypehas no such field) and the repo-scopedlist_repo_eventsgossip block omits it. No consumer reads it yet so there's no impact today, but the asymmetry is the same drift the duplicated four-copy SELECT column list invites, and it becomes a trap once something depends on the field. Either populate all three surfaces or defer the JSON exposure until the gate needs it. While here, add a migration upgrade-path test (seedschema_migrationsat v1, run migrations, assert the column round-trips) so the v1-append class can't recur.
The direction is right and the receive/publish wiring is sound; once the migration moves to its own version this is close.
P1 migration break resolved on 72ccb28; re-reviewed the new head, posting a fresh review.
beardthelion
left a comment
There was a problem hiding this comment.
Re-reviewed on 72ccb28d. The P1 migration break is resolved and I verified it by execution: the owner_did DDL now lives in a new Migration{version:10}, and on a simulated existing node (schema_migrations at v1..v9, received_ref_updates without the column) migrate() adds the column and index and records v10. Putting the DDL back inside the already-applied v1 makes that same check fail (the column is never re-added), so the guard is load-bearing. Wire compat (#[serde(default)] Option<String>) and the DB plumbing are sound, and "Refs #144" is correct. Four items to fold in.
Findings
-
[P2] Drop the premature
idx_ref_updates_ownerindex, or defer it to the gate PR
crates/gitlawb-node/src/db/mod.rs:833
Nothing readsreceived_ref_updatesbyowner_didyet: every reference is a projection, no WHERE/JOIN/ORDER. Migrations run inside a transaction soCREATE INDEX CONCURRENTLYisn't available here (mod.rs:426), which means this index builds under a write-blocking lock at startup on exactly the populated nodes this PR targets, for no current benefit. Ship the column now and add the index in #143 next to the query that reads it. -
[P2] Expose owner_did on the repo-scoped events feed
crates/gitlawb-node/src/api/events.rs:116
list_ref_updatesreturnsowner_didnow, but thegossip_eventsblock inlist_repo_eventsomits it, soGET /api/v1/repos/{owner}/{repo}/eventssilently drops the field. Add"owner_did": u.owner_didto thatjson!block to match line 38. -
[P2] Carry owner_did through the GraphQL query and subscription
crates/gitlawb-node/src/graphql/types.rs:52
RefUpdateTypeandRefUpdateBroadcast(state.rs:12) have noowner_did, so theref_updatesquery (query.rs:61) and the subscription drop it while REST exposes it. Addowner_did: Option<String>to both structs and populate it at query.rs, subscription.rs, and the broadcast send in repos.rs. -
[P2] Add an existing-node upgrade test for v10
crates/gitlawb-node/src/db/mod.rs:3200
migration_v10_creates_owner_did_columnruns only a fresh v1..v10 chain, which can't catch DDL appended to an already-applied version. Seedschema_migrationswith v1..v9 and areceived_ref_updateswithout the column, thenmigrate()and assert the column, index, and v10 row appear. I ran exactly this against the head and it passes; the same body fails when the DDL sits back in v1, so it is the missing regression guard, not a live bug.
One forward note for #143, not a blocker here: the wire owner_did is attacker-controlled (the peer signature authenticates the relaying node, not this field) and is stored verbatim. Every authorization path today reads the trusted record.owner_did from the local repo row, so nothing consumes the wire value yet, but the gate must cross-check it against a trusted source and must not fall back to the lossy last-segment / did_matches matcher the full DID exists to avoid.
@Gravirei migration fix verified, thanks. The above are plumbing-consistency and deploy-window items.
- Drop idx_ref_updates_owner from Migration v10; no query reads owner_did yet and CREATE INDEX (non-CONCURRENT) inside a transaction takes a write-blocking lock on populated nodes for zero benefit. Defer the index to Gitlawb#143 alongside the query that uses it. - Expose owner_did on repo-scoped events feed: add the field to the gossip_events json! block in list_repo_events so GET /api/v1/repos/{owner}/{repo}/events is consistent with GET /api/v1/events/ref-updates. - Carry owner_did through GraphQL: add owner_did: Option<String> to RefUpdateBroadcast (state.rs) and RefUpdateType (types.rs), populate it from the DB row in query.rs, from the broadcast in subscription.rs, and from the trusted local record.owner_did in the broadcast send in repos.rs. - Add migration_v10_existing_node_upgrade test: seeds a v1..v9 state, strips the column and version row, re-runs migrate(), and asserts the column and v10 row appear. This is the regression guard: it fails if the DDL is accidentally moved back into v1. Refs Gitlawb#144
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/gitlawb-node/src/db/mod.rs (2)
3207-3218: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick winAssert the column is actually nullable.
The tests fetch
is_nullablebut only assert the name/type. Since older peers can omitowner_did, please assert"YES"so a futureNOT NULLchange is caught.Proposed fix
assert_eq!(col.0, "owner_did"); assert_eq!(col.1, "text"); + assert_eq!(col.2, "YES");Apply the same assertion in both migration tests.
Also applies to: 3274-3284
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/db/mod.rs` around lines 3207 - 3218, The migration tests for received_ref_updates currently verify owner_did exists and is text, but they ignore the fetched is_nullable value. Update both migration test blocks that query information_schema.columns in db::mod to assert the third tuple field from sqlx::query_as is "YES" alongside the existing name/type checks, so changes to nullable behavior are caught.
2103-2107: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winPreserve later
owner_didenrichment on duplicate ref-update IDs.With
ON CONFLICT(id) DO NOTHING, a mixed-version network can store the olderNonepayload first and ignore a later duplicate carrying the full owner DID, leaving the feed unable to disambiguate that event.Proposed fix
(id, node_did, pusher_did, repo, ref_name, old_sha, new_sha, timestamp, cert_id, received_at, from_peer, owner_did) VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12) - ON CONFLICT(id) DO NOTHING", + ON CONFLICT(id) DO UPDATE + SET owner_did = COALESCE(received_ref_updates.owner_did, EXCLUDED.owner_did) + WHERE received_ref_updates.owner_did IS NULL + AND EXCLUDED.owner_did IS NOT NULL",🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/db/mod.rs` around lines 2103 - 2107, The INSERT into received_ref_updates currently uses ON CONFLICT(id) DO NOTHING, which prevents a later duplicate from enriching an earlier row with owner_did. Update the upsert logic in the received ref-update insert path so duplicates with the same id can fill in a missing owner_did while still avoiding overwriting existing non-null data. Refer to the received_ref_updates insert statement and the surrounding db write function that persists ref updates.
🧹 Nitpick comments (1)
crates/gitlawb-node/src/db/mod.rs (1)
4096-4129: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd a same-slug owner collision regression.
This test uses different
repovalues, so it does not cover the PR’s core case: two updates with the same trailing repo slug but different fullowner_didvalues. Add a row pair likealice/myrepo+did:web:host:aliceandalice/myrepo+did:gitlawb:alice.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/db/mod.rs` around lines 4096 - 4129, The current `list_repo_ref_updates_filters_by_repo` test only verifies filtering by full repo name, not the same-slug owner collision case. Update this test in `db` to insert two ref updates with the same trailing repo slug but different full owner identifiers, using the existing `insert_ref_update`, `update`, and `list_repo_ref_updates` helpers, and assert the query returns the correct row for each full repo owner path. Keep the original repo-filter assertions if helpful, but add coverage for the `owner_did` collision scenario the PR is targeting.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gitlawb-node/src/api/events.rs`:
- Line 130: The event payload in list_repo_events is only populating owner_did
for gossipsub events, so local cert events in the same feed remain inconsistent.
Update the local event mapping in events.rs to also set owner_did from
record.owner_did, keeping the repo-scoped payload shape consistent with the
existing gossipsub path and preserving consumers that dedup or gate on the full
owner DID.
---
Outside diff comments:
In `@crates/gitlawb-node/src/db/mod.rs`:
- Around line 3207-3218: The migration tests for received_ref_updates currently
verify owner_did exists and is text, but they ignore the fetched is_nullable
value. Update both migration test blocks that query information_schema.columns
in db::mod to assert the third tuple field from sqlx::query_as is "YES"
alongside the existing name/type checks, so changes to nullable behavior are
caught.
- Around line 2103-2107: The INSERT into received_ref_updates currently uses ON
CONFLICT(id) DO NOTHING, which prevents a later duplicate from enriching an
earlier row with owner_did. Update the upsert logic in the received ref-update
insert path so duplicates with the same id can fill in a missing owner_did while
still avoiding overwriting existing non-null data. Refer to the
received_ref_updates insert statement and the surrounding db write function that
persists ref updates.
---
Nitpick comments:
In `@crates/gitlawb-node/src/db/mod.rs`:
- Around line 4096-4129: The current `list_repo_ref_updates_filters_by_repo`
test only verifies filtering by full repo name, not the same-slug owner
collision case. Update this test in `db` to insert two ref updates with the same
trailing repo slug but different full owner identifiers, using the existing
`insert_ref_update`, `update`, and `list_repo_ref_updates` helpers, and assert
the query returns the correct row for each full repo owner path. Keep the
original repo-filter assertions if helpful, but add coverage for the `owner_did`
collision scenario the PR is targeting.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9f70fb07-7129-4c96-9e77-7b2943106a56
📒 Files selected for processing (7)
crates/gitlawb-node/src/api/events.rscrates/gitlawb-node/src/api/repos.rscrates/gitlawb-node/src/db/mod.rscrates/gitlawb-node/src/graphql/query.rscrates/gitlawb-node/src/graphql/subscription.rscrates/gitlawb-node/src/graphql/types.rscrates/gitlawb-node/src/state.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/gitlawb-node/src/api/repos.rs
Local cert events in the repo-scoped feed now include owner_did sourced from the trusted local record, matching the field already present on gossipsub events. Keeps the payload consistent for consumers that gate or dedup by full owner DID. Refs Gitlawb#144
beardthelion
left a comment
There was a problem hiding this comment.
Re-reviewed on 63ab6015. All four items from my last pass are in, CodeRabbit's local-cert-events fix landed, and I verified the security- and deploy-relevant paths by execution:
-
Migration, both ways: on a simulated existing node (v1..v9 recorded, no
owner_didcolumn),migrate()adds the column and records v10; moving that DDL back into the already-applied v1 makesmigration_v10_existing_node_upgradefail (RowNotFound) while the fresh-DB test still passes, so the upgrade guard is load-bearing. A pre-upgrade row (owner_didNULL, since the ALTER has no default) round-trips through the feed as JSON null, no 500. The index is correctly deferred to #143. -
Feed provenance, executed: the repo-scoped feed's local block carries the trusted
record.owner_did, and the gossip block echoes the stored wire value verbatim, which I confirmed with a spoofedowner_didsurfacing unchanged (display-only, nothing filters on it). All fourreceived_ref_updatesSELECTs projectowner_did, so the shared row mapper never hits a missing column. The global feed and GraphQL (query, subscription, broadcast) carry the field through as a passthrough. -
[P3] The stored
received_ref_updates.owner_didis attacker-controlled (the peer signature authenticates the relaying node, not this field, andnotify_syncaccepts it unsigned by default) and is echoed verbatim, as the spoof test shows. It is display-only and nothing gates on it, so not a break. A short comment at the two ingest sites (p2p/mod.rs,api/peers.rs:398) marking the column untrusted would keep #143's DID-aware feed gate from consuming it as trusted; that gate must reconcile it against the repo slug or a verified cert.
@Gravirei the P2s and the migration all check out, thanks. @kevincodex1 this is ready.
Summary
Carry the full owner DID on the ref-update wire event so the feed gate can distinguish different DID methods that share the same trailing segment (e.g.
did:web:host:alicevsdid:gitlawb:alice).Motivation & context
Refs #144
The ref-update wire slug was previously constructed from the last colon-segmented component of the owner DID (
{owner_did.split(":").last}/{name}), which is lossy. Two owners with different DID methods but the same trailing segment would produce identical slugs, causing the feed gate to over-drop remote rows for anonymous callers.This PR is the plumbing half of #144: it carries the full owner DID on the wire and stores it in the database so a follow-up can teach the feed gate to use
did_matchesfor proper disambiguation. The gate itself (the second half of #144) will land in a separate change on top of this.Kind of change
What changed
crates/gitlawb-node (gitlawb-node):
p2p/mod.rs: Addedowner_did: Option<String>toRefUpdateEventwith#[serde(default)]for backward compat with older peers that do not include the fielddb/mod.rs: Addedowner_did: Option<String>toReceivedRefUpdate, added migration v10 to addowner_did TEXTcolumn andidx_ref_updates_ownerindex to thereceived_ref_updatestable, updated all SELECT/INSERT queries to include the columnapi/repos.rs: Setowner_did: Some(record.owner_did)when publishing ref-update events over gossip; addedowner_didto the HTTP sync notify bodyapi/peers.rs: Addedowner_didtoNotifyRequestand wired it through toReceivedRefUpdateHow a reviewer can verify
DATABASE_URL=postgresql://gitlawb:changeme@172.19.0.2:5432/gitlawb cargo test -p gitlawb-node cargo fmt --check cargo clippy --workspace --all-targets -- -D warningsBefore you request review
cargo test --workspacepasses locally (327/327)cargo fmt --allandcargo clippy --workspace --all-targets -- -D warningsare cleanfix(...),test(...),style(...)).env.exampleupdated if behavior or config changed (or N/A)Protocol & signing impact
This is a backward-compatible wire-format addition: the new
owner_didfield is#[serde(default)], so events from older peers deserialize without it (owner_did: None). Therepofield format is unchanged. Migration v10 adds the column to existing databases;ADD COLUMN IF NOT EXISTSis idempotent for re-runs.did:key, Ed25519 / RFC 9421 signatures, UCAN, ref certs, or P2P wire formatsSummary by CodeRabbit
New Features
owner_didwhen available.owner_didas optional (omittable).Bug Fixes
owner_didis missing or null in incoming notifications.