Bug Description
When a SandboxClaim expires with ShutdownPolicy=Retain, the reconcileExpired method attempts to find and delete the associated Sandbox resource. However, it uses client.ObjectKeyFromObject(claim) to look up the Sandbox, which assumes the Sandbox has the same name as the Claim.
https://github.com/kubernetes-sigs/agent-sandbox/blob/main/extensions/controllers/sandboxclaim_controller.go#L277
func (r *SandboxClaimReconciler) reconcileExpired(ctx context.Context, claim *extensionsv1alpha1.SandboxClaim) (*v1alpha1.Sandbox, error) {
	sandbox := &v1alpha1.Sandbox{}
	if err := r.Get(ctx, client.ObjectKeyFromObject(claim), sandbox); err != nil {
		...
	}
}
Why This Is a Problem
When a Sandbox is adopted from a WarmPool (via adoptSandboxFromCandidates), it retains its original name assigned by the WarmPool controller, which differs from the Claim name. In this case, reconcileExpired will get a NotFound error and return nil, nil (the "Sandbox is gone, life is good" path at line 279), without actually deleting the Sandbox.
This results in orphaned Sandbox resources that persist after their Claim has expired.
Steps to Reproduce
- Create a SandboxWarmPool to pre-provision sandboxes.
- Create a SandboxClaim with warmPool policy set to adopt from the pool.
- Set a short shutdownTime on the Claim so it expires quickly.
- Wait for the Claim to expire.
- Observe that the adopted Sandbox is not deleted.
Expected Behavior
reconcileExpired should be able to locate and delete the Sandbox regardless of whether it was created directly (same name as Claim) or adopted from a WarmPool (different name).
Suggested Fix
The lookup logic in reconcileExpired should mirror the approach used in getOrCreateSandbox:
- First check claim.Status.SandboxStatus.Name; this field records the actual Sandbox name after adoption.
- Fall back to listing Sandboxes in the namespace and filtering by ownership (metav1.IsControlledBy).
Alternatively, the existing Sandbox lookup logic in getOrCreateSandbox could be extracted into a shared helper and reused by both methods.
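The two-step lookup described above could be sketched roughly as follows. This is only an illustration, not the controller's actual code: the Sandbox and SandboxClaim structs here are hypothetical, trimmed-down stand-ins for the real API types, findSandboxForClaim is a made-up helper name, and the slice argument stands in for the result of listing Sandboxes in the Claim's namespace. In the real controller the first step would presumably be an r.Get by the recorded name and the ownership comparison would be metav1.IsControlledBy.

```go
package main

import "errors"

// Sandbox is a hypothetical stand-in for v1alpha1.Sandbox.
// Only the fields the lookup needs are modeled.
type Sandbox struct {
	Name          string
	ControllerUID string // UID of the controlling owner reference, "" if none
}

// SandboxClaim is a hypothetical stand-in for extensionsv1alpha1.SandboxClaim.
type SandboxClaim struct {
	Name              string
	UID               string
	StatusSandboxName string // stands in for claim.Status.SandboxStatus.Name
}

var errSandboxNotFound = errors.New("no sandbox found for claim")

// findSandboxForClaim (hypothetical helper) resolves the Sandbox that belongs
// to a claim without assuming the Sandbox shares the Claim's name:
//  1. use the name recorded in the claim status, if it is set and present;
//  2. otherwise fall back to any Sandbox in the namespace controlled by the claim.
func findSandboxForClaim(claim SandboxClaim, inNamespace []Sandbox) (*Sandbox, error) {
	// Step 1: the status field records the actual name after adoption.
	if name := claim.StatusSandboxName; name != "" {
		for i := range inNamespace {
			if inNamespace[i].Name == name {
				return &inNamespace[i], nil
			}
		}
	}
	// Step 2: filter by ownership (the real code would use metav1.IsControlledBy).
	if claim.UID != "" {
		for i := range inNamespace {
			if inNamespace[i].ControllerUID == claim.UID {
				return &inNamespace[i], nil
			}
		}
	}
	return nil, errSandboxNotFound
}
```

With a helper of this shape, reconcileExpired would treat the Sandbox as gone only when both steps come up empty, instead of whenever a Get by the Claim's own name returns NotFound.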
Environment
- agent-sandbox version: main branch (commit 591c34a)