Summary
list_ref_certificates (crates/gitlawb-node/src/db/mod.rs:1846) runs SELECT ... FROM ref_certificates WHERE repo_id = $1 ORDER BY issued_at DESC with fetch_all and no LIMIT, loading every certificate row for a repo into memory. The table grows one row per ref per push (api/repos.rs:919 loops issue_ref_certificate over every advanced ref; cert.rs:44 mints a fresh UUID per call), with no upsert/dedup and no prune or retention anywhere, so it accumulates permanently. An anonymous caller reading a public repo turns a single cheap GET into an unbounded fetch, allocation, and response body.
This is an availability/cost problem, separate from #120 (which adds the missing visibility gate to these handlers) and #114 (which bounded only the gossip half of the events feed). #120's fix does not help here: for a public repo authorize_repo_read allows an anonymous caller straight through, so the load stays fully permissionless after that gate lands.
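To make the growth claim in this summary concrete, here is a minimal in-memory sketch comparing an append-only table with one keyed on (repo_id, ref_name). `CertRow`, `append_only`, and `upsert_latest` are hypothetical names for illustration only; the real schema and insert path live in the crate and are not reproduced here.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for one row of `ref_certificates`.
#[derive(Debug, Clone, PartialEq)]
pub struct CertRow {
    pub repo_id: u64,
    pub ref_name: String,
    /// Stands in for the fresh UUID minted per certificate.
    pub cert_id: u64,
}

/// Append-only model: every advanced ref in every push adds one row.
pub fn append_only(pushes: &[Vec<&str>], repo_id: u64) -> Vec<CertRow> {
    let mut rows = Vec::new();
    let mut next_id = 0u64;
    for push in pushes {
        for ref_name in push {
            rows.push(CertRow {
                repo_id,
                ref_name: (*ref_name).to_string(),
                cert_id: next_id,
            });
            next_id += 1;
        }
    }
    rows
}

/// Upsert model (an assumption, not current behavior): a later certificate
/// for the same (repo_id, ref_name) replaces the earlier one.
pub fn upsert_latest(pushes: &[Vec<&str>], repo_id: u64) -> HashMap<(u64, String), CertRow> {
    let mut rows = HashMap::new();
    let mut next_id = 0u64;
    for push in pushes {
        for ref_name in push {
            let row = CertRow {
                repo_id,
                ref_name: (*ref_name).to_string(),
                cert_id: next_id,
            };
            rows.insert((repo_id, row.ref_name.clone()), row);
            next_id += 1;
        }
    }
    rows
}
```

With 1,000 pushes that each advance `refs/heads/main`, `append_only` holds 1,000 rows while `upsert_latest` holds one. The append-only shape is what makes the later unbounded read expensive.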
Where
Two consumers of the unbounded fetch, both on the anonymous read group (optional_signature):
crates/gitlawb-node/src/api/certs.rs:20 — list_certs (GET /api/v1/repos/{owner}/{repo}/certs, routed at server.rs:317). No limit/cursor param and no truncate: it serializes the entire cert set into the response.
crates/gitlawb-node/src/api/events.rs:181 — list_repo_events caps the response with all_events.truncate(limit), but only after the full fetch, so the DB read and the intermediate Vec allocation stay unbounded regardless of ?limit.
Growing the table needs an authenticated pusher (git-receive-pack sits behind require_signature), but any registered agent can push (enforce_owner_push defaults to false) and git_write_routes carries no rate limit, so a single writer can inflate one repo's cert count without bound. Every read afterward is permissionless and amplified.
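The difference between truncating after the fetch and bounding the fetch itself can be sketched without a database. In this illustration a slice of `u64` stands in for the table, and both function names are hypothetical. Each function returns how many rows were materialised alongside the page it hands back.

```rust
/// Models the current events path: materialise every row, then truncate.
/// Returns (rows materialised, page returned to the caller).
pub fn fetch_all_then_truncate(table: &[u64], limit: usize) -> (usize, Vec<u64>) {
    // Like `fetch_all`: the whole result set is collected first.
    let mut all: Vec<u64> = table.iter().copied().collect();
    let materialised = all.len();
    // The cap is applied only after the allocation has already happened.
    all.truncate(limit);
    (materialised, all)
}

/// Models a bounded fetch: the cap is applied before collecting,
/// which is the role a SQL `LIMIT` would play.
pub fn fetch_bounded(table: &[u64], limit: usize) -> (usize, Vec<u64>) {
    let page: Vec<u64> = table.iter().copied().take(limit).collect();
    (page.len(), page)
}
```

Both functions return the same page for a given `limit`. Only the first one's intermediate allocation scales with the table size instead of with `limit`.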
Impact
Measured against a live DB (200k synthetic cert rows for one repo, then reverted): ~60 MB of column data; the no-LIMIT query returns all rows in ~120 ms; list_certs would serialize ~82 MB per request. The node holds the Vec<RefCertificate> from fetch_all, the re-mapped Vec<serde_json::Value>, and the response body concurrently, roughly 200+ MB transient heap per in-flight /certs request from one unauthenticated GET. Reads have no per-caller rate limit, so N concurrent anonymous GETs multiply toward OOM. Same availability/amplification class as #82.
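The figures above imply a rough per-row cost, worked through below. This is back-of-the-envelope arithmetic only: it assumes 1 MB = 10^6 bytes, treats the "200+ MB" heap figure as a flat 200 MB, and uses made-up constant and function names.

```rust
/// Row count used in the measurement.
pub const ROWS: u64 = 200_000;
/// ~60 MB of column data (assumes 1 MB = 10^6 bytes).
pub const COLUMN_BYTES: u64 = 60_000_000;
/// ~82 MB serialized response body.
pub const BODY_BYTES: u64 = 82_000_000;
/// Lower bound of the "200+ MB" transient heap per in-flight request.
pub const HEAP_PER_REQUEST: u64 = 200_000_000;

/// Average bytes per row for a given total.
pub fn bytes_per_row(total_bytes: u64, rows: u64) -> u64 {
    total_bytes / rows
}

/// Transient heap if `concurrent` unbounded requests are in flight at once.
pub fn transient_heap(concurrent: u64) -> u64 {
    concurrent * HEAP_PER_REQUEST
}

/// Serialized size of a page capped at `page_rows`, using the measured average.
pub fn bounded_body_bytes(page_rows: u64) -> u64 {
    page_rows * bytes_per_row(BODY_BYTES, ROWS)
}
```

Under these assumptions a row averages about 300 bytes in the table and about 410 bytes serialized. Sixteen concurrent unbounded requests would hold roughly 3.2 GB, whereas a page capped at 200 rows would serialize to about 82 KB.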
Fix
Bound the fetch at the DB layer instead of in memory:
Add a LIMIT (plus cursor/offset) to list_ref_certificates and a paged /certs handler, mirroring the list_ref_updates_page that #114 built for the gossip table.
Push the list_repo_events bound into the query so the cert half is capped before fetch_all, not by the post-fetch truncate.
Consider a (repo_id, ref_name) upsert so the table cannot grow without limit per ref.
Found as a follow-up to the PR #143 events-feed work. Verified by execution: list_ref_certificates returns all 250 seeded rows past the 200-row feed ceiling, and the figures above are from a live-DB measurement.
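One possible shape for the bounded read is keyset (cursor) pagination, sketched here in memory rather than against the real database. Everything in it is an assumption made for illustration: the `id` tie-breaker column, the `Cursor` and `Cert` types, the `list_page` name, and reusing 200 as the page ceiling. The SQL string assumes PostgreSQL row-value comparison and is not taken from the crate.

```rust
/// Hypothetical query shape. Columns other than `repo_id` and `issued_at`
/// are assumptions; the select list is left elided as in the issue text.
pub const PAGED_SQL: &str = "SELECT ... FROM ref_certificates \
    WHERE repo_id = $1 AND (issued_at, id) < ($2, $3) \
    ORDER BY issued_at DESC, id DESC LIMIT $4";

/// Assumed page ceiling, borrowed from the 200-row feed ceiling.
pub const MAX_PAGE: usize = 200;

/// Position of the last row already returned to the caller.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cursor {
    pub issued_at: i64,
    pub id: u64,
}

/// Simplified stand-in for a certificate row.
#[derive(Debug, Clone, PartialEq)]
pub struct Cert {
    pub issued_at: i64,
    pub id: u64,
    pub ref_name: String,
}

/// In-memory model of the paged query: newest first, at most `limit` rows
/// strictly after the cursor. Returns the page and the next cursor, if any.
pub fn list_page(table: &[Cert], after: Option<Cursor>, limit: usize) -> (Vec<Cert>, Option<Cursor>) {
    // Clamp caller input so a huge ?limit cannot defeat the bound.
    let limit = limit.clamp(1, MAX_PAGE);

    // ORDER BY issued_at DESC, id DESC. A database would use an index here;
    // the full sort exists only because this is an in-memory simulation.
    let mut sorted: Vec<&Cert> = table.iter().collect();
    sorted.sort_by(|a, b| (b.issued_at, b.id).cmp(&(a.issued_at, a.id)));

    let page: Vec<Cert> = sorted
        .into_iter()
        .filter(|c| match after {
            // Row-value comparison: (issued_at, id) < (cursor.issued_at, cursor.id)
            Some(cur) => (c.issued_at, c.id) < (cur.issued_at, cur.id),
            None => true,
        })
        .take(limit)
        .cloned()
        .collect();

    // A full page may have more rows behind it, so hand back a cursor.
    let next = if page.len() == limit {
        page.last().map(|c| Cursor { issued_at: c.issued_at, id: c.id })
    } else {
        None
    };
    (page, next)
}
```

With 250 seeded rows, a first call returns 200 rows plus a cursor, and a second call with that cursor returns the remaining 50 and no cursor. A cursor stays stable when new certificates are inserted between requests, which an OFFSET does not, though either would bound the allocation.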