Skip to content

feat(agents): write agent ownership to the authorization graph on create/delete#270

Open
asherfink wants to merge 1 commit into
mainfrom
asher.fink/agx1-242-agent-dual-write
Open

feat(agents): write agent ownership to the authorization graph on create/delete#270
asherfink wants to merge 1 commit into
mainfrom
asher.fink/agx1-242-agent-dual-write

Conversation

@asherfink
Copy link
Copy Markdown
Contributor

@asherfink asherfink commented Jun 3, 2026

What

Record an agent's owner in the authorization graph when it's created, and remove it when the agent is deleted — so agents are access-controlled the same way tasks already are. This is the ownership-write half of agent access control; route-level enforcement lands in the stacked follow-up (#271).

Changes

  • On a genuine agent create (both register_agent and register_build), register the agent as a resource and grant the caller ownership before persisting the row, so a failure aborts the create with no orphaned row.
  • Only the genuine-create path writes ownership. The update and name-already-exists paths don't, so updating an agent can never rewrite ownership to whoever made the update.
  • Skip the ownership write (with a warning) when the caller has no resolvable user or service account, since there'd be no owner to attribute.
  • Agents are registered with no parent edge — their tenant scope is derived server-side from the account, so there's nothing to cascade from in this layer.
  • Compensating cleanup: if the persist fails, or loses a race to a duplicate, the just-written ownership is removed before re-raising the original error (or adopting the existing row).
  • On delete, ownership is removed best-effort after the soft-delete, so a cleanup failure never fails a delete that already succeeded.
  • Ownership writes use the request principal from middleware and live in the use case; the create/delete routes just gate on an authorization check. Also drops an unused principal_context request-body field from the register schemas (nothing read it) and regenerates the OpenAPI spec.

Tests

  • Ownership-write tests: write-before-persist ordering, no-parent registration, compensating cleanup on persist failure and on duplicate races, update path leaves ownership untouched, no-resolvable-creator skips the write, and cleanup after delete.
  • Updated the existing route auth test to reflect that the routes now only check (the ownership writes moved into the use case).

Ticket: AGX1-242

Auth backend

The auth service already supports agents end to end: the agent resource type and the register/deregister and grant/revoke operations exist, and operations route per-account via the fgac-agentex-auth-spark flag. Nothing new is needed there for this PR.

Why agents don't need their own dual-write flag. Tasks gate ownership writes behind fgac-tasks-dual-write because per-task ownership is net-new, so writing those edges for not-yet-migrated accounts is a staged behavior change. Agents are different: owner-on-create and remove-on-delete predate the per-resource work, and on the legacy backend these calls resolve to the same owner-row writes that were already happening. The change is behavior-preserving on the legacy backend and additive on the new one, so there's nothing to stage.

Backfill before flipping. Agents created before this wiring still need their ownership backfilled into the new backend before fgac-agentex-auth-spark is enabled for an account, otherwise pre-existing agents collapse to 404.

Stacking

Follow-up #271 (route 404 collapse) is stacked on this one.

Greptile Summary

Writes agent ownership into the authorization graph on create and removes it on delete, moving those side-effects from route handlers into AgentsUseCase. The ownership write now happens before the Postgres row is persisted so a failure aborts cleanly with no orphaned row, and compensating cleanup (_safe_deregister) undoes both the register_resource and grant calls on persist failure or a duplicate-create race.

  • _register_in_auth (register + grant, no parent) is called only on the genuine-create branches of both register_agent and register_build; update paths are skipped to prevent ownership rewriting. _safe_deregister / _safe_revoke are called best-effort post-delete.
  • Routes drop the inline authorization.grant/revoke calls and the now-unused principal_context request field; auth gating is handled by DAuthorizedId/DAuthorizedName dependencies at the parameter level.
  • New test_agent_authz_dual_write.py covers write-before-persist ordering, compensating cleanup on failure/duplicate, update no-op, and no-creator guard — all via register_agent; register_build dual-write paths are not yet covered by dedicated tests.

Confidence Score: 4/5

Safe to merge with the understanding that the orphaned-registration gap on register_resource-succeeds/grant-fails (tracked in the existing review thread) is accepted as a known follow-up, and that register_build dual-write paths have no dedicated test coverage yet.

The core ownership write logic is well-structured: pre-persist ordering is correct, compensating cleanup is properly scoped, and the update paths correctly skip registration. The new test suite covers the main register_agent flows thoroughly. Two concerns hold the score below the ceiling: the grant-failure orphan gap in _register_in_auth (already raised in the prior review thread, accepted for now), and the absence of equivalent tests for register_build's duplicate-race and persist-failure compensation paths.

agentex/src/domain/use_cases/agents_use_case.py — specifically _register_in_auth, which has no compensation if grant raises after register_resource succeeds. agentex/tests/integration/use_cases/test_agent_authz_dual_write.py — register_build dual-write paths are untested.

Important Files Changed

Filename Overview
agentex/src/domain/use_cases/agents_use_case.py Core logic for ownership dual-write: _register_in_auth, _safe_deregister, _safe_revoke added with proper pre-persist ordering and compensation; the orphaned-registration gap on register_resource-succeeds/grant-fails is a known tracked issue (#270 thread).
agentex/src/api/routes/agents.py Moves ownership writes (grant/revoke) out of routes and into use case; drops principal_context param passthrough; removes redundant inline auth checks that were superseded by DAuthorizedId/DAuthorizedName dependencies.
agentex/tests/integration/use_cases/test_agent_authz_dual_write.py New integration test suite covering write-before-persist ordering, no-parent registration, compensating cleanup on failure/duplicate, update-path no-op, no-creator guard, and best-effort post-delete cleanup; all test cases exercise register_agent only — register_build paths not covered.
agentex/tests/integration/api/agents/test_agents_auth_api.py Updated to assert only route-level authz checks (create/delete gates); ownership write assertions correctly moved to test_agent_authz_dual_write.py.
agentex/src/api/schemas/agents.py Drops unused principal_context field from RegisterAgentRequest and RegisterBuildRequest; no functional change since nothing read it.

Sequence Diagram

sequenceDiagram
    participant R as Route Handler
    participant UC as AgentsUseCase
    participant AS as AuthorizationService
    participant DB as AgentRepo

    R->>AS: "check(agent:*, create)"
    R->>UC: register_agent() / register_build()
    UC->>DB: get(name) — probe for existing
    DB-->>UC: ItemDoesNotExist
    UC->>AS: "register_resource(agent:{new-id})"
    UC->>AS: "grant(agent:{new-id})"
    alt register/grant succeeds
        UC->>DB: create(agent)
        alt create succeeds
            DB-->>UC: AgentEntity
        else DuplicateItemError
            UC->>AS: _safe_deregister(new-id)
            UC->>DB: get(name) — adopt existing row
        else Other Exception
            UC->>AS: _safe_deregister(new-id)
            UC-->>R: re-raise
        end
    else register_resource or grant raises
        UC-->>R: propagates (no compensation — known gap)
    end

    Note over R,DB: Delete path
    R->>UC: delete(id/name)
    UC->>DB: "update(status=DELETED)"
    UC->>AS: _safe_deregister(agent.id)
    UC->>AS: _safe_revoke(agent.id)
    UC-->>R: AgentEntity (soft-deleted)
Loading

Fix All in Cursor Fix All in Claude Code Fix All in Codex

Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
agentex/tests/integration/use_cases/test_agent_authz_dual_write.py:519
**`register_build` dual-write paths not exercised**

Every test in `TestAgentDualWriteCreate` calls `use_case.register_agent`, but `register_build` has the same `_register_in_auth``create` → compensation pattern. The persist-failure, duplicate-race, and no-creator-skips paths for `register_build` are not covered, so a regression specific to that path (e.g. a wrong agent-ID being passed to `_safe_deregister` after a duplicate race) would go undetected. Adding a mirrored test class for `register_build` would close the gap.

Reviews (2): Last reviewed commit: "feat(agents): write agent ownership to t..." | Re-trigger Greptile

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 3, 2026

✱ Stainless preview builds

This PR will update the agentex-sdk SDKs with the following commit messages.

openapi

feat(api): remove principal_context field from agent models

python

feat(api): remove principal_context parameter from agents.register_build

typescript

fix(api): remove principal_context parameter from agent register build

Edit this comment to update them. They will appear in their respective SDK's changelogs.

agentex-sdk-openapi studio · code · diff

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅

agentex-sdk-typescript studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️build ✅lint ✅test ✅

npm install https://pkg.stainless.com/s/agentex-sdk-typescript/4acbfa7b11b2f3fe920286198f8c7be53c24937c/dist.tar.gz
agentex-sdk-python studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️build ✅lint ❗test ✅

pip install https://pkg.stainless.com/s/agentex-sdk-python/524158ecaea77a433171e2156561b3ac147a220d/agentex_sdk-0.12.0-py3-none-any.whl

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-06-05 17:32:21 UTC

@asherfink asherfink force-pushed the asher.fink/agx1-242-agent-dual-write branch from 685126a to 0d99f31 Compare June 3, 2026 15:31
@asherfink asherfink changed the title feat(agents): register agents in the authorization graph on create/delete feat(agents): register/deregister agents in the authorization graph on create/delete Jun 3, 2026
@asherfink asherfink force-pushed the asher.fink/agx1-242-agent-dual-write branch from 0d99f31 to 14796e9 Compare June 3, 2026 15:35
"""Get an agent by its unique name."""
agent_entity = await agents_use_case.get(name=agent_name)

await authorization.check(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep the authorization.check/revoke/grant. Otherwise, how can we maintain backward compatibility?

1, removing authorization.check will remove authz altogether.
2, removing grant/revoke will break the old authz even if the user opts out of the new FGAC

Even if a user turns FGAC on, we may still want to keep grant/revoke for permission data dual-write, so that the old authz still works in case the user turns FGAC off.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

went and traced this through agentex-auth to make sure we're solid, and I don't think we need to keep grant/revoke here

tl;dr: on the legacy (SGP) backend, register_resource is grant and deregister_resource is revoke. so dropping the explicit grant/revoke and going through register/deregister is behavior-preserving for old authz, not a regression.

1. removing authorization.check removes authz altogether

it doesn't. the only check I removed is the explicit one in get_agent_by_name, but that route still has agent_name: DAuthorizedName(agent, read) in its signature, and that dependency calls authorization.check(agent, read) itself before the handler body runs (authorization_shortcuts.py). so the line I deleted was a redundant second check of the same tuple. read authz is still enforced. on create I kept the check(agent("*"), create) too, just dropped the unused principal_context= kwarg.

2 + 3. removing grant/revoke breaks old authz when FGAC is off

two things:

  • agentex-auth routes every call (grant/revoke/check/register/deregister) to exactly one provider per account based on FGAC_AGENTEX_AUTH_SPARK (_resolve_provider). flag off → legacy SGP, flag on → Spark. there's no write-to-both happening, it's a routing switch.
  • on the SGP gateway, register_resource delegates straight to grant and deregister_resource delegates to revoke

so for an FGAC-off account, register_resource(agent) → SGP → grant(owner), which is exactly what the old route did. deregister_resourcerevoke. legacy authz is fully preserved. for FGAC-on it does the proper Spark resource+owner write (tenant scoping comes from account_id automatically). either way it's correct, and keeping grant/revoke on top would just be a redundant duplicate write.

@asherfink asherfink force-pushed the asher.fink/agx1-242-agent-dual-write branch from 14796e9 to 883921e Compare June 5, 2026 17:07
Comment thread agentex/src/domain/use_cases/agents_use_case.py
…ate/delete

Record an agent's owner when it is created and remove it when the agent is
deleted, so agents are access-controlled the same way tasks are. This is the
ownership-write half of agent access control; route-level enforcement is the
stacked follow-up.

Create (both register_agent and register_build, genuine-create path only):
register the agent as a resource and grant the caller ownership before the row
is persisted, so a failure aborts the create with no orphaned row. The update
and name-already-exists paths leave ownership untouched, so updating an agent
never rewrites its owner. The write is skipped when the caller has no resolvable
user or service account, since there is no owner to attribute. If the persist
fails or loses a duplicate race, a best-effort cleanup removes the just-written
ownership before the original error is re-raised or the existing row is adopted.

Delete: after the soft-delete, best-effort remove the agent's ownership. Cleanup
failures are swallowed so they never fail a delete that already succeeded.

The ownership writes use the request principal and live in the use case; the
routes just gate create and delete on an authorization check. Also drops an
unused principal_context request-body field from the register schemas and
regenerates the OpenAPI spec.

Ticket: AGX1-242
@asherfink asherfink changed the title feat(agents): register/deregister agents in the authorization graph on create/delete feat(agents): write agent ownership to the authorization graph on create/delete Jun 5, 2026
@asherfink asherfink force-pushed the asher.fink/agx1-242-agent-dual-write branch from 883921e to 6de8f39 Compare June 5, 2026 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants