
list_pins / list_anchors serve stale metadata for repos made private after push (no index reconciliation on visibility downgrade) #136

Description

@beardthelion

Summary

GET /api/v1/ipfs/pins and GET /api/v1/arweave/anchors serve metadata for repos that were public when pushed but later made private. The visibility gate on the write side is evaluated once at push time and is never reconciled when a repo's visibility is tightened, so the index rows persist and the read endpoints return them with no current-visibility filter.

This is the real mechanism behind #121 (and what PR #134 only partially addresses). The leak is not that private repos are indexed; they are not, because the write path is correctly gated. The leak is that visibility is mutable after indexing and nothing purges or re-filters the index.

Mechanism (verified)

Write side is gated at push, point-in-time:

  • pinned_cids is written only via pin_new_objects (crates/gitlawb-node/src/ipfs_pin.rs:133, crates/gitlawb-node/src/pinata.rs:114), called only inside the withheld.is_some() blocks of the push handler (crates/gitlawb-node/src/api/repos.rs:1055, :1167).
  • arweave_anchors is written only via record_arweave_anchor (crates/gitlawb-node/src/api/repos.rs:1249), inside if announce && !irys_url.is_empty() (repos.rs:1231).
  • announce/withheld come from replication_withheld_set = listable_at_root(rules, is_public, owner_did, None) (repos.rs:49), the anonymous root-read decision evaluated during the push async task.

Nothing reconciles the index on visibility change:

  • There is no DELETE FROM pinned_cids or DELETE FROM arweave_anchors anywhere in the tree.
  • set_visibility (crates/gitlawb-node/src/api/visibility.rs:82) adds/updates rules and touches neither table.
  • The read queries are unfiltered: list_pinned_cids is SELECT ... FROM pinned_cids ORDER BY pinned_at DESC (db/mod.rs:2038), and the global list_arweave_anchors(None,..) is a plain SELECT ... FROM arweave_anchors with no visibility predicate (db/mod.rs:2344).

Repro sequence

  1. Create a public repo (is_public=true, no rules).
  2. Push to it: announce=true, so arweave_anchors and pinned_cids rows are written (with IPFS/Irys configured).
  3. Tighten visibility: PUT /api/v1/repos/{owner}/{repo}/visibility adding a / deny rule (mode A, or mode B excluding the public). listable_at_root(..., None) now denies; the repo is effectively private.
  4. GET /api/v1/arweave/anchors (no ?repo=) and GET /api/v1/ipfs/pins still return the now-private repo's anchor rows (slug, owner DID, ref_name, old_sha, new_sha, cid, arweave_url) and pinned object CIDs.
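The sequence above can be replayed against a toy in-memory model. This is only a sketch of the mechanism, not the node's code: `Node`, `AnchorRow`, `replay_repro`, the single boolean standing in for the `listable_at_root(..., None)` decision, and the sample slug and SHA are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced stand-in for one `arweave_anchors` row.
#[derive(Debug, Clone, PartialEq)]
struct AnchorRow {
    slug: String,
    ref_name: String,
    new_sha: String,
}

/// Toy node: a per-repo "anonymously listable at root" flag plus the index.
#[derive(Default)]
struct Node {
    listable_at_root: HashMap<String, bool>,
    arweave_anchors: Vec<AnchorRow>,
}

impl Node {
    fn create_repo(&mut self, slug: &str, is_public: bool) {
        self.listable_at_root.insert(slug.to_string(), is_public);
    }

    /// Push: the announce decision is evaluated once, at push time.
    fn push(&mut self, slug: &str, ref_name: &str, new_sha: &str) {
        let announce = self.listable_at_root.get(slug).copied().unwrap_or(false);
        if announce {
            self.arweave_anchors.push(AnchorRow {
                slug: slug.to_string(),
                ref_name: ref_name.to_string(),
                new_sha: new_sha.to_string(),
            });
        }
    }

    /// Visibility change: updates the rule only and never touches the index.
    fn set_visibility(&mut self, slug: &str, listable: bool) {
        self.listable_at_root.insert(slug.to_string(), listable);
    }

    /// Global listing: returns every row, with no current-visibility check.
    fn list_anchors(&self) -> Vec<AnchorRow> {
        self.arweave_anchors.clone()
    }
}

/// Replays steps 1-4 and returns what the listing serves afterwards.
fn replay_repro() -> Vec<AnchorRow> {
    let mut node = Node::default();
    node.create_repo("alice/secret", true); // step 1: public repo
    node.push("alice/secret", "refs/heads/main", "abc123"); // step 2: row written
    node.set_visibility("alice/secret", false); // step 3: tightened
    node.list_anchors() // step 4: stale row still listed
}
```

In this model `replay_repro()` returns one row for `alice/secret`, because nothing between step 3 and step 4 consults the updated flag.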

Object content stays gated by the per-caller check in GET /ipfs/{cid} (#110/#133), so this is metadata disclosure (branch names, commit SHAs, ownership, object CIDs), not content. The anchor rows are also permanently on public Arweave from when the repo was public; that part is inherent to permanent storage and not fixable post-hoc, but the node's own listing should not keep serving it.

Why PR #134's auth-only gate is only a partial mitigation

#134 requires authentication for the global listings. That closes anonymous scraping, but identities are permissionless on this node (optional_signature verifies a self-produced signature; register is open), so any throwaway DID still reads the stale rows. Authentication is not authorization here (same class as INV-1).

Fix direction

Filter the listings by current visibility rather than (or in addition to) requiring auth:

  • For list_anchors/list_pins, resolve each row's repo and apply authorize_repo_read / listable_at_root against current rules before returning it, or restrict the global listing to a node-admin capability.
  • And/or reconcile the index on visibility downgrade: have set_visibility purge or mark rows for repos that are no longer announceable.
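A minimal sketch of the read-time filter, assuming each index row can be resolved to a repo slug. `RepoAcl`, `can_read_now`, and `list_anchors_filtered` are hypothetical names; `can_read_now` stands in for whatever `authorize_repo_read` / `listable_at_root` decide against the current rules, whose real signatures are not reproduced here.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, reduced index row.
struct AnchorRow {
    slug: String,
    cid: String,
}

/// Hypothetical current-visibility state for one repo.
struct RepoAcl {
    public_at_root: bool,
    /// DIDs allowed to read when the repo is not public.
    readers: HashSet<String>,
}

/// Stand-in for the per-caller read decision, evaluated at request time.
fn can_read_now(acl: &RepoAcl, caller: Option<&str>) -> bool {
    acl.public_at_root || caller.map_or(false, |did| acl.readers.contains(did))
}

/// Returns only rows whose repo the caller may read right now.
/// Rows whose repo cannot be resolved are dropped (fail closed).
fn list_anchors_filtered<'a>(
    rows: &'a [AnchorRow],
    acls: &HashMap<String, RepoAcl>,
    caller: Option<&str>,
) -> Vec<&'a AnchorRow> {
    rows.iter()
        .filter(|row| {
            acls.get(&row.slug)
                .map_or(false, |acl| can_read_now(acl, caller))
        })
        .collect()
}
```

With this shape an authenticated throwaway DID sees the same rows as an anonymous caller, since being authenticated does not by itself satisfy `can_read_now`.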

A regression test should push-while-public, downgrade, then assert the listing excludes the repo.
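The shape of such a test might look like the following sketch, written against a toy model that applies the purge-on-downgrade option. Everything here is hypothetical (`Node`, `slugs_after_downgrade`, the sample slugs, and the assumption that pin rows carry a repo slug); a real test would drive the push handler, `set_visibility`, and the two listing endpoints instead.

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy node whose visibility setter reconciles the index.
#[derive(Default)]
struct Node {
    announceable: HashMap<String, bool>,
    anchors: Vec<(String, String)>, // (slug, new_sha)
    pins: Vec<(String, String)>,    // (slug, cid); slug column is an assumption
}

impl Node {
    fn create_repo(&mut self, slug: &str, is_public: bool) {
        self.announceable.insert(slug.to_string(), is_public);
    }

    /// Push: index rows are written only while the repo is announceable.
    fn push(&mut self, slug: &str, new_sha: &str, cid: &str) {
        if self.announceable.get(slug).copied().unwrap_or(false) {
            self.anchors.push((slug.to_string(), new_sha.to_string()));
            self.pins.push((slug.to_string(), cid.to_string()));
        }
    }

    /// Candidate fix: purge index rows when the repo stops being announceable.
    fn set_visibility(&mut self, slug: &str, announceable: bool) {
        self.announceable.insert(slug.to_string(), announceable);
        if !announceable {
            self.anchors.retain(|(s, _)| s.as_str() != slug);
            self.pins.retain(|(s, _)| s.as_str() != slug);
        }
    }

    /// Slugs visible through either global listing.
    fn listed_slugs(&self) -> BTreeSet<String> {
        self.anchors
            .iter()
            .chain(self.pins.iter())
            .map(|(slug, _)| slug.clone())
            .collect()
    }
}

/// Push while public, downgrade one repo, return what the listings expose.
fn slugs_after_downgrade() -> BTreeSet<String> {
    let mut node = Node::default();
    node.create_repo("alice/secret", true);
    node.create_repo("alice/open", true); // control repo that stays public
    node.push("alice/secret", "abc123", "cid-secret");
    node.push("alice/open", "def456", "cid-open");
    node.set_visibility("alice/secret", false);
    node.listed_slugs()
}

#[test]
fn listing_excludes_repo_after_downgrade() {
    let slugs = slugs_after_downgrade();
    assert!(!slugs.contains("alice/secret"));
    assert!(slugs.contains("alice/open"));
}
```

The control repo guards against a fix that simply empties the listing; the same assertions would apply unchanged if the fix filters at read time rather than purging.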

Labels

  • crate:node (gitlawb-node, the serving node and REST API)
  • kind:security (Vulnerability fix or hardening)
  • sev:medium (Degraded but workaround exists)
  • subsystem:api (Node REST API request/response surface)
  • subsystem:visibility (Path-scoped visibility and content withholding)
