feat(agents): write agent ownership to the authorization graph on create/delete#270
feat(agents): write agent ownership to the authorization graph on create/delete#270asherfink wants to merge 1 commit into
Conversation
✱ Stainless preview buildsThis PR will update the openapi python typescript Edit this comment to update them. They will appear in their respective SDK's changelogs. ✅ agentex-sdk-openapi studio · code · diff
✅ agentex-sdk-typescript studio · code · diff
✅ agentex-sdk-python studio · code · diff
This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push. |
685126a to
0d99f31
Compare
0d99f31 to
14796e9
Compare
| """Get an agent by its unique name.""" | ||
| agent_entity = await agents_use_case.get(name=agent_name) | ||
|
|
||
| await authorization.check( |
There was a problem hiding this comment.
I think we should keep the authorization.check/revoke/grant. Otherwise, how can we maintain backward compatibility?
1, removing authorization.check will remove authz altogether.
2, removing grant/revoke will break the old authz even if the user opts out of the new FGAC
Even if a user turns FGAC on, we may still want to keep grant/revoke for permission data dual-write, so that the old authz still works in case the user turns FGAC off.
There was a problem hiding this comment.
went and traced this through agentex-auth to make sure we're solid, and I don't think we need to keep grant/revoke here
tl;dr: on the legacy (SGP) backend, register_resource is grant and deregister_resource is revoke. so dropping the explicit grant/revoke and going through register/deregister is behavior-preserving for old authz, not a regression.
1. removing authorization.check removes authz altogether
it doesn't. the only check I removed is the explicit one in get_agent_by_name, but that route still has agent_name: DAuthorizedName(agent, read) in its signature, and that dependency calls authorization.check(agent, read) itself before the handler body runs (authorization_shortcuts.py). so the line I deleted was a redundant second check of the same tuple. read authz is still enforced. on create I kept the check(agent("*"), create) too, just dropped the unused principal_context= kwarg.
2 + 3. removing grant/revoke breaks old authz when FGAC is off
two things:
agentex-authroutes every call (grant/revoke/check/register/deregister) to exactly one provider per account based onFGAC_AGENTEX_AUTH_SPARK(_resolve_provider). flag off → legacy SGP, flag on → Spark. there's no write-to-both happening, it's a routing switch.- on the SGP gateway,
register_resourcedelegates straight tograntandderegister_resourcedelegates torevoke
so for an FGAC-off account, register_resource(agent) → SGP → grant(owner), which is exactly what the old route did. deregister_resource → revoke. legacy authz is fully preserved. for FGAC-on it does the proper Spark resource+owner write (tenant scoping comes from account_id automatically). either way it's correct, and keeping grant/revoke on top would just be a redundant duplicate write.
14796e9 to
883921e
Compare
…ate/delete Record an agent's owner when it is created and remove it when the agent is deleted, so agents are access-controlled the same way tasks are. This is the ownership-write half of agent access control; route-level enforcement is the stacked follow-up. Create (both register_agent and register_build, genuine-create path only): register the agent as a resource and grant the caller ownership before the row is persisted, so a failure aborts the create with no orphaned row. The update and name-already-exists paths leave ownership untouched, so updating an agent never rewrites its owner. The write is skipped when the caller has no resolvable user or service account, since there is no owner to attribute. If the persist fails or loses a duplicate race, a best-effort cleanup removes the just-written ownership before the original error is re-raised or the existing row is adopted. Delete: after the soft-delete, best-effort remove the agent's ownership. Cleanup failures are swallowed so they never fail a delete that already succeeded. The ownership writes use the request principal and live in the use case; the routes just gate create and delete on an authorization check. Also drops an unused principal_context request-body field from the register schemas and regenerates the OpenAPI spec. Ticket: AGX1-242
883921e to
6de8f39
Compare
What
Record an agent's owner in the authorization graph when it's created, and remove it when the agent is deleted — so agents are access-controlled the same way tasks already are. This is the ownership-write half of agent access control; route-level enforcement lands in the stacked follow-up (#271).
Changes
register_agentandregister_build), register the agent as a resource and grant the caller ownership before persisting the row, so a failure aborts the create with no orphaned row.principal_contextrequest-body field from the register schemas (nothing read it) and regenerates the OpenAPI spec.Tests
Ticket: AGX1-242
Auth backend
The auth service already supports agents end to end: the
agentresource type and the register/deregister and grant/revoke operations exist, and operations route per-account via thefgac-agentex-auth-sparkflag. Nothing new is needed there for this PR.Why agents don't need their own dual-write flag. Tasks gate ownership writes behind
fgac-tasks-dual-writebecause per-task ownership is net-new, so writing those edges for not-yet-migrated accounts is a staged behavior change. Agents are different: owner-on-create and remove-on-delete predate the per-resource work, and on the legacy backend these calls resolve to the same owner-row writes that were already happening. The change is behavior-preserving on the legacy backend and additive on the new one, so there's nothing to stage.Backfill before flipping. Agents created before this wiring still need their ownership backfilled into the new backend before
fgac-agentex-auth-sparkis enabled for an account, otherwise pre-existing agents collapse to 404.Stacking
Follow-up #271 (route 404 collapse) is stacked on this one.
Greptile Summary
Writes agent ownership into the authorization graph on create and removes it on delete, moving those side-effects from route handlers into
AgentsUseCase. The ownership write now happens before the Postgres row is persisted so a failure aborts cleanly with no orphaned row, and compensating cleanup (_safe_deregister) undoes both theregister_resourceandgrantcalls on persist failure or a duplicate-create race._register_in_auth(register + grant, no parent) is called only on the genuine-create branches of bothregister_agentandregister_build; update paths are skipped to prevent ownership rewriting._safe_deregister/_safe_revokeare called best-effort post-delete.authorization.grant/revokecalls and the now-unusedprincipal_contextrequest field; auth gating is handled byDAuthorizedId/DAuthorizedNamedependencies at the parameter level.test_agent_authz_dual_write.pycovers write-before-persist ordering, compensating cleanup on failure/duplicate, update no-op, and no-creator guard — all viaregister_agent;register_builddual-write paths are not yet covered by dedicated tests.Confidence Score: 4/5
Safe to merge with the understanding that the orphaned-registration gap on register_resource-succeeds/grant-fails (tracked in the existing review thread) is accepted as a known follow-up, and that register_build dual-write paths have no dedicated test coverage yet.
The core ownership write logic is well-structured: pre-persist ordering is correct, compensating cleanup is properly scoped, and the update paths correctly skip registration. The new test suite covers the main register_agent flows thoroughly. Two concerns hold the score below the ceiling: the grant-failure orphan gap in _register_in_auth (already raised in the prior review thread, accepted for now), and the absence of equivalent tests for register_build's duplicate-race and persist-failure compensation paths.
agentex/src/domain/use_cases/agents_use_case.py — specifically _register_in_auth, which has no compensation if grant raises after register_resource succeeds. agentex/tests/integration/use_cases/test_agent_authz_dual_write.py — register_build dual-write paths are untested.
Important Files Changed
Sequence Diagram
sequenceDiagram participant R as Route Handler participant UC as AgentsUseCase participant AS as AuthorizationService participant DB as AgentRepo R->>AS: "check(agent:*, create)" R->>UC: register_agent() / register_build() UC->>DB: get(name) — probe for existing DB-->>UC: ItemDoesNotExist UC->>AS: "register_resource(agent:{new-id})" UC->>AS: "grant(agent:{new-id})" alt register/grant succeeds UC->>DB: create(agent) alt create succeeds DB-->>UC: AgentEntity else DuplicateItemError UC->>AS: _safe_deregister(new-id) UC->>DB: get(name) — adopt existing row else Other Exception UC->>AS: _safe_deregister(new-id) UC-->>R: re-raise end else register_resource or grant raises UC-->>R: propagates (no compensation — known gap) end Note over R,DB: Delete path R->>UC: delete(id/name) UC->>DB: "update(status=DELETED)" UC->>AS: _safe_deregister(agent.id) UC->>AS: _safe_revoke(agent.id) UC-->>R: AgentEntity (soft-deleted)Prompt To Fix All With AI
Reviews (2): Last reviewed commit: "feat(agents): write agent ownership to t..." | Re-trigger Greptile