Skip to content

feat: add dataSource PVC mount and per-proposal timeoutMinutes#17

Open
sakshiep1 wants to merge 3 commits into
openshift:mainfrom
sakshiep1:must-gather-version
Open

feat: add dataSource PVC mount and per-proposal timeoutMinutes#17
sakshiep1 wants to merge 3 commits into
openshift:mainfrom
sakshiep1:must-gather-version

Conversation

@sakshiep1

@sakshiep1 sakshiep1 commented May 28, 2026

Copy link
Copy Markdown

Summary

Enable IntelliAide-style deep RCA by letting Proposals reference pre-loaded
diagnostic data and run long analysis steps in the sandbox.

  • PVC data source (spec.dataSource) — optional, immutable reference to a
    pre-existing PVC in the Proposal namespace. The operator mounts it read-only
    at /data/input in the sandbox pod (via sandbox template patching).
  • Configurable sandbox timeout (spec.timeoutMinutes) — optional 1–60
    minutes per step (default 5). Used for analysis, execution, verification,
    and escalation so long-running RCA can override the default.
  • Analysis prompt updatesanalysis_query.tmpl requires skill discovery
    from /app/skills before generic analysis; when dataSource is set, tells
    the agent diagnostic data is available at /data/input.
  • RBAC — regenerate ClusterRole: PVC get/list/watch; result CR
    */status sub-resources.
  • Examplesexamples/setup/09-intelliaide-proposals.yaml with
    dataSource, timeoutMinutes: 30, and analysis-only IntelliAide workflows.

Companion PR: agentic-skills adds the intelliaide/ skill pipeline that
reads from /data/input.

API Changes

Field Type Validation Notes
spec.dataSource.claimName string required, DNS subdomain, immutable PVC name in same namespace
spec.timeoutMinutes int32 min=1, max=60, mutable Per-step sandbox timeout

Test Plan

  • make test passes (unit tests updated for new signatures)
  • make manifests produces clean CRD with both new fields
  • Deploy on dev cluster: create PVC + Proposal with dataSource → verify volume mounted at /data/input
  • Verify timeout override: set timeoutMinutes: 30 → operator waits 30m
  • IntelliAide routing: submit RCA request → analysis agent reads SKILL.md and invokes pipeline

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 28, 2026
@openshift-ci

openshift-ci Bot commented May 28, 2026

Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR introduces per-step sandbox timeout support and optional PVC mounting for input data across the proposal workflow. It extends the API types with ProposalStep.timeoutMinutes and a new DataSource type, updates the AgentCaller interface to accept timeouts, propagates timeouts through proposal resolution and workflow handlers, implements timeout handling in the HTTP client and sandbox agent, and adds template patching to mount DataSource PVCs into sandbox containers. New tests validate timeout propagation and DataSource mounting behavior.

Changes

Per-Step Timeout and DataSource Implementation

Layer / File(s) Summary
API Types - Timeout and DataSource Contracts
api/v1alpha1/proposal_types.go, api/v1alpha1/tools_types.go
ProposalStep.timeoutMinutes (optional, 1-60 min), DataSource type with validated claimName, and ToolsSpec.DataSource field added with updated IsZero() logic. Documentation describes per-step and default PVC mounting behavior.
AgentCaller Interface - Timeout Parameter
controller/proposal/agent.go
AgentCaller interface methods (Analyze, Execute, Verify, Escalate) and StubAgentCaller implementations extended with trailing timeout time.Duration parameter.
HTTP Client - Timeout Serialization
controller/proposal/client.go, controller/proposal/client_test.go
NewAgentHTTPClient accepts explicit timeout, defaults non-positive to defaultSandboxTimeout, and sets http.Client.Timeout. Run serializes timeout to TimeoutMs in request. Tests updated and new timeout serialization tests added.
Proposal Resolver - Timeout Propagation
controller/proposal/resolve.go
resolvedStep extended with TimeoutMinutes field; resolveProposal now assigns timeouts from proposal.Spec.*.TimeoutMinutes to resolved steps.
SandboxAgentCaller - Timeout Implementation
controller/proposal/sandbox_agent.go
ClientFactory signature extended to accept timeout. Analyze, Execute, Verify, Escalate methods accept and forward timeout to callWithSandbox. callWithSandbox defaults timeout, applies to HTTP client creation, and keeps WaitReady capped at defaultSandboxTimeout. New stepTimeout() helper extracts per-step timeout with fallback.
Workflow Handlers - Apply Per-Step Timeouts
controller/proposal/handlers.go
handleAnalysis, handleRevision, handleExecution, handleVerification, and handleEscalation compute stepTimeout() and pass to respective agent invocations.
SandboxAgentCaller Tests - Timeout Behavior
controller/proposal/sandbox_agent_test.go
mockSandboxProvider tracks lastWaitReadyTimeout. Test helpers updated for timeout-aware ClientFactory. All method calls updated to pass defaultSandboxTimeout. New TestSandboxAgentCaller_TimeoutPropagation verifies timeout flow: WaitReady receives defaultSandboxTimeout, ClientFactory receives caller-supplied timeout.
Reconciler Tests - Timeout Integration
controller/proposal/reconciler_test.go, controller/proposal/reconciler.go
testAgentCaller tracks per-method timeouts; mockSandboxProvider.ClientFactory updated to accept timeout. RBAC annotation added for persistentvolumeclaims read permissions. New TestReconcile_PropagatesStepTimeout validates minutes-to-duration conversion and forwarding to Analyze.
Sandbox Templates - DataSource Mounting
controller/proposal/sandbox_templates.go
Constants added for dataSourceMountPath and dataSourceVolumeName. templateHashInput extended with DataSource field. computeTemplateHash accepts and incorporates dataSource into hash. EnsureAgentTemplate extended with dataSource parameter, extracts from tools.DataSource, and conditionally patches template with PVC volume and read-only mount via new addPVCVolume and patchDataSource helpers.
Sandbox Template Tests - DataSource Validation
controller/proposal/sandbox_templates_test.go
Hash tests updated to new computeTemplateHash signature. New TestComputeTemplateHash_DataSource asserts hash changes for nil vs non-nil DataSource and different ClaimName values. New TestPatchDataSource_MountsReadOnly verifies PVC volume upsert and read-only mount at expected path.
Examples and Documentation
examples/setup/09-intelliaide-proposals.yaml, .ai/spec/how/reconciler.md
Five IntelliAide RCA Proposal examples added (etcd, update-failure, router-latency, general degraded, must-gather-mode) with analysis.agent: smart and timeoutMinutes: 30. Must-gather proposal includes tools.dataSource.claimName reference. Reconciler.md updated to document dataSource extraction and optional PVC mount.
Repository Configuration
.gitignore
Added /oc-agentic ignore pattern at repository root.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 24.07% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title accurately summarizes the main changes: adding dataSource PVC mount support and per-proposal timeoutMinutes configuration.
Description check ✅ Passed Description is directly related to the changeset, detailing the dataSource feature, timeout configuration, API changes, and test plan.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 28, 2026
@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from 2a89d44 to f71f525 Compare May 29, 2026 08:40
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 29, 2026
@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from f71f525 to 31261be Compare May 29, 2026 09:09
@sakshiep1 sakshiep1 marked this pull request as ready for review May 29, 2026 09:25
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 29, 2026
@openshift-ci openshift-ci Bot requested review from harche and joshuawilson May 29, 2026 09:26

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
controller/proposal/sandbox_agent.go (1)

177-210: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use one shared deadline for the whole step.

timeout is spent once in WaitReady and then a second time in the HTTP client, so timeoutMinutes: 5 can block for roughly 10 minutes before failing. If this field is meant to cap a step end-to-end, the later phases need to use the remaining budget, not the original budget again.

⏱️ Possible fix
 func (s *SandboxAgentCaller) callWithSandbox(
 	ctx context.Context,
 	proposal *agenticv1alpha1.Proposal,
 	stepName string,
@@
 ) (json.RawMessage, error) {
 	if timeout <= 0 {
 		timeout = defaultSandboxTimeout
 	}
+	deadline := time.Now().Add(timeout)
 
 	templateName, err := EnsureAgentTemplate(ctx, s.K8sClient, defaultBaseTemplateName, s.Namespace, stepName, step.Agent, step.LLM, step.Tools, proposal.Spec.DataSource)
 	if err != nil {
@@
-	endpoint, err := s.Sandbox.WaitReady(ctx, claimName, timeout)
+	endpoint, err := s.Sandbox.WaitReady(ctx, claimName, time.Until(deadline))
 	if err != nil {
 		return nil, fmt.Errorf("wait for sandbox: %w", err)
 	}
@@
-	client := s.ClientFactory(agentURL, timeout)
+	remaining := time.Until(deadline)
+	if remaining <= 0 {
+		return nil, fmt.Errorf("step timeout exceeded before agent call")
+	}
+	client := s.ClientFactory(agentURL, remaining)
 	resp, err := client.Run(ctx, "", query, schema, agentCtx)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/sandbox_agent.go` around lines 177 - 210, The step
currently reuses the full original timeout for WaitReady and then again for the
HTTP client, doubling the effective wait; fix by creating a single
deadline-based context and deriving the remaining budget for the client: wrap
the incoming ctx with context.WithTimeout/context.WithDeadline using the
provided timeout (store the returned cancel and deadline), pass that ctx to
s.Sandbox.WaitReady, then compute remaining := time.Until(deadline) (clamp to a
minimum and fallback if non-positive) and pass remaining to
s.ClientFactory(agentURL, remaining) (and ensure to call cancel()). Reference
EnsureAgentTemplate, s.Sandbox.Claim, s.Sandbox.WaitReady,
s.ClientFactory(...).Run and the timeout variable when applying this change.
🧹 Nitpick comments (4)
controller/proposal/reconciler_test.go (1)

39-64: ⚡ Quick win

Capture the timeout passed by the reconciler.

testAgentCaller drops the new argument everywhere, so nothing in this file proves that proposalTimeout(proposal) is actually passed into Analyze/Execute/Verify/Escalate. A small field like lastAnalyzeTimeout plus one case with Spec.TimeoutMinutes = 30 would cover the new handler plumbing.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/reconciler_test.go` around lines 39 - 64, The
testAgentCaller currently ignores the timeout parameter so tests don't verify
that proposalTimeout(proposal) is propagated; add fields on testAgentCaller
(e.g., lastAnalyzeTimeout, lastExecuteTimeout, lastVerifyTimeout,
lastEscalateTimeout) and assign the incoming timeout argument in the respective
methods Analyze, Execute, Verify, Escalate; then add/update a test that sets
Spec.TimeoutMinutes = 30 on a Proposal, calls the reconciler path, and asserts
each last*Timeout equals the expected duration returned by
proposalTimeout(proposal) to prove the plumbing forwards the timeout.
controller/proposal/sandbox_agent_test.go (1)

57-82: ⚡ Quick win

Record and assert the propagated timeout in these helpers.

The new time.Duration parameter is ignored here, so the tests only prove the signatures compile. Capturing the value passed into ClientFactory (and similarly into WaitReady) would let this suite verify that non-default proposal timeouts actually reach the sandbox layer.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/sandbox_agent_test.go` around lines 57 - 82, Update
newTestSandboxAgentCaller and newTestSandboxAgentCallerWithProposal so the
ClientFactory closure captures and records the time.Duration it receives and the
mockSandboxProvider's WaitReady call records its timeout; specifically, modify
the ClientFactory func(_ string, d time.Duration) to store d onto the provided
mockHTTPClient (e.g., a LastTimeout field) before returning it, and ensure the
mockSandboxProvider exposes a WaitReady(timeout time.Duration) recorder that
tests can assert against; this will let tests verify that non-default proposal
timeouts are propagated to both ClientFactory and WaitReady.
controller/proposal/client_test.go (1)

38-172: ⚡ Quick win

Add assertions for the new timeout contract.

These updates only make the tests compile against the new constructor. There’s still no coverage that a custom timeout is serialized into timeout_ms, or that timeout <= 0 falls back to defaultSandboxTimeout, so the new behavior can regress silently.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/client_test.go` around lines 38 - 172, Tests for
NewAgentHTTPClient lack assertions that the serialized request contains the
timeout_ms field and that timeout <= 0 falls back to defaultSandboxTimeout;
update relevant tests that decode agentRunRequest (e.g., inside the handlers in
TestAgentHTTPClient_RunWithExecutionResult,
TestAgentHTTPClient_RunWithoutExecutionResult,
TestAgentHTTPClient_RunWithContext and the basic Run test) to assert
req.TimeoutMs equals the provided custom timeout when
NewAgentHTTPClient(server.URL, X) is called with X>0, and assert req.TimeoutMs
equals defaultSandboxTimeout when NewAgentHTTPClient(..., 0) or a negative
timeout is used; reference the agentRunRequest struct and NewAgentHTTPClient
constructor to locate where to add these checks.
controller/proposal/sandbox_templates_test.go (1)

68-70: ⚡ Quick win

Add coverage for non-nil dataSource here.

All updated hash calls still pass nil for dataSource, so this file never asserts that changing claimName changes the derived template name. A focused test for that — plus one for the read-only /data/input PVC mount — would cover the new behavior directly.

Also applies to: 490-524

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/sandbox_templates_test.go` around lines 68 - 70, Update
tests to exercise a non-nil dataSource when calling computeTemplateHash: modify
mustHash (and the test cases around it) to pass a populated
agenticv1alpha1.DataSource with a distinct ClaimName and verify that changing
ClaimName alters the returned hash; additionally add an assertion that the
computed template includes the expected PVC mount mode (readOnly) for
/data/input by constructing a DataSource whose MountPath is /data/input and
verifying the template metadata or generated mount flags reflect read-only.
Target the helper mustHash and calls to computeTemplateHash so the new
assertions cover the branch where dataSource is non-nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/v1alpha1/proposal_types.go`:
- Line 297: The XValidation rule for the dataSource field currently allows
nil->non-nil updates; replace the rule so only two cases are permitted: both old
and new lack dataSource, or both have it and it is equal. Update the kubebuilder
annotation on the dataSource validation (the comment containing XValidation) to
use an expression such as "(has(oldSelf.dataSource) && has(self.dataSource) &&
self.dataSource == oldSelf.dataSource) || (!has(oldSelf.dataSource) &&
!has(self.dataSource))" so adding a dataSource after creation is disallowed.

In `@controller/proposal/sandbox_templates.go`:
- Around line 523-551: The current patchDataSource unconditionally uses the name
"data-source" and addPVCVolume will overwrite any existing volume with that
name; modify addPVCVolume and patchDataSource so they do not clobber an
unrelated pre-existing volume: in addPVCVolume, when finding an existing volume
with the same name check whether it already has a persistentVolumeClaim with the
same claimName and only replace if it matches; otherwise return a sentinel (or
error) to signal a name conflict. In patchDataSource, detect that conflict and
instead pick a non-colliding name (e.g., "data-source-<uniqueSuffix>") and use
that name for both addPVCVolume and addVolumeMount, or surface the error to the
caller; reference the functions addPVCVolume and patchDataSource and ensure the
chosen name is threaded through to addVolumeMount so mounts reference the
correct volume.

In `@controller/proposal/templates/analysis_query.tmpl`:
- Around line 3-27: The template fails to tell the agent that PVC-backed input
may already be mounted at /data/input when spec.dataSource is set, so update the
contract and template: modify buildAnalysisQuery in
controller/proposal/helpers.go to accept a flag/field (e.g., hasDataSource or
includeMountedInput) derived from the Request/spec.dataSource (so callers can
indicate PVC-backed input is present), and change
controller/proposal/templates/analysis_query.tmpl to conditionally include a
short directive like "Input data is mounted read-only at /data/input; prefer
analyzing that bundle before collecting live data" when that flag is true;
ensure the helpers.go change sets the flag when spec.dataSource is present so
PVC-backed RCA is handled as bundle-first analysis.
- Around line 19-25: The skill-loading branch is inconsistent: "cat
/app/skills/intelliaide/SKILL.md" cannot "return one or more paths." Update the
logic around the atomic command (the literal string "cat
/app/skills/intelliaide/SKILL.md" and references to SKILL.md) so it is
internally consistent — either change the command to a listing/find command that
can return paths (e.g., use a glob/find/ls conceptual step) or change the prose
to state that if the cat command fails because the file does not exist you must
stop and return a JSON error, and otherwise read that SKILL.md and follow its
workflow exactly; ensure the branch no longer claims cat can return multiple
paths.

In `@examples/setup/09-intelliaide-proposals.yaml`:
- Around line 21-31: Add one concrete example of the PVC-backed input to the
IntelliAide proposals by updating a Proposal's spec to include a dataSource
block using the exact shape spec.dataSource.claimName and a placeholder PVC name
(e.g., <pre-existing-pvc>); locate any Proposal objects in this file that
currently run in live mode and add a dataSource: claimName: <pre-existing-pvc>
under their spec so the new /data/input workflow and PVC-backed input are
discoverable (ensure the YAML key is spec.dataSource.claimName and that only the
example uses a placeholder to avoid changing runtime behavior).
- Around line 33-36: The docs instruct building/pushing
quay.io/<org>/intelliaide-skills:latest but all Proposal manifests reference
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest,
so examples fail; either (A) update each Proposal image field to
quay.io/<org>/intelliaide-skills:latest (search for the string
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
in the Proposal resources and replace) or (B) add the missing retag/push steps
after the build: retag quay image to the in-cluster registry name and push it
(i.e., docker/podman tag quay.io/<org>/intelliaide-skills:latest
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
and push), and apply this fix for all occurrences noted (the other blocks at the
same pattern: lines ~70-72, 101-103, 147-149, 183-185).

---

Outside diff comments:
In `@controller/proposal/sandbox_agent.go`:
- Around line 177-210: The step currently reuses the full original timeout for
WaitReady and then again for the HTTP client, doubling the effective wait; fix
by creating a single deadline-based context and deriving the remaining budget
for the client: wrap the incoming ctx with
context.WithTimeout/context.WithDeadline using the provided timeout (store the
returned cancel and deadline), pass that ctx to s.Sandbox.WaitReady, then
compute remaining := time.Until(deadline) (clamp to a minimum and fallback if
non-positive) and pass remaining to s.ClientFactory(agentURL, remaining) (and
ensure to call cancel()). Reference EnsureAgentTemplate, s.Sandbox.Claim,
s.Sandbox.WaitReady, s.ClientFactory(...).Run and the timeout variable when
applying this change.

---

Nitpick comments:
In `@controller/proposal/client_test.go`:
- Around line 38-172: Tests for NewAgentHTTPClient lack assertions that the
serialized request contains the timeout_ms field and that timeout <= 0 falls
back to defaultSandboxTimeout; update relevant tests that decode agentRunRequest
(e.g., inside the handlers in TestAgentHTTPClient_RunWithExecutionResult,
TestAgentHTTPClient_RunWithoutExecutionResult,
TestAgentHTTPClient_RunWithContext and the basic Run test) to assert
req.TimeoutMs equals the provided custom timeout when
NewAgentHTTPClient(server.URL, X) is called with X>0, and assert req.TimeoutMs
equals defaultSandboxTimeout when NewAgentHTTPClient(..., 0) or a negative
timeout is used; reference the agentRunRequest struct and NewAgentHTTPClient
constructor to locate where to add these checks.

In `@controller/proposal/reconciler_test.go`:
- Around line 39-64: The testAgentCaller currently ignores the timeout parameter
so tests don't verify that proposalTimeout(proposal) is propagated; add fields
on testAgentCaller (e.g., lastAnalyzeTimeout, lastExecuteTimeout,
lastVerifyTimeout, lastEscalateTimeout) and assign the incoming timeout argument
in the respective methods Analyze, Execute, Verify, Escalate; then add/update a
test that sets Spec.TimeoutMinutes = 30 on a Proposal, calls the reconciler
path, and asserts each last*Timeout equals the expected duration returned by
proposalTimeout(proposal) to prove the plumbing forwards the timeout.

In `@controller/proposal/sandbox_agent_test.go`:
- Around line 57-82: Update newTestSandboxAgentCaller and
newTestSandboxAgentCallerWithProposal so the ClientFactory closure captures and
records the time.Duration it receives and the mockSandboxProvider's WaitReady
call records its timeout; specifically, modify the ClientFactory func(_ string,
d time.Duration) to store d onto the provided mockHTTPClient (e.g., a
LastTimeout field) before returning it, and ensure the mockSandboxProvider
exposes a WaitReady(timeout time.Duration) recorder that tests can assert
against; this will let tests verify that non-default proposal timeouts are
propagated to both ClientFactory and WaitReady.

In `@controller/proposal/sandbox_templates_test.go`:
- Around line 68-70: Update tests to exercise a non-nil dataSource when calling
computeTemplateHash: modify mustHash (and the test cases around it) to pass a
populated agenticv1alpha1.DataSource with a distinct ClaimName and verify that
changing ClaimName alters the returned hash; additionally add an assertion that
the computed template includes the expected PVC mount mode (readOnly) for
/data/input by constructing a DataSource whose MountPath is /data/input and
verifying the template metadata or generated mount flags reflect read-only.
Target the helper mustHash and calls to computeTemplateHash so the new
assertions cover the branch where dataSource is non-nil.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f4659329-5901-4895-8a4c-9eb649291e97

📥 Commits

Reviewing files that changed from the base of the PR and between 44bd61b and 31261be.

⛔ Files ignored due to path filters (3)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/agentic.openshift.io_proposals.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
📒 Files selected for processing (14)
  • .gitignore
  • api/v1alpha1/proposal_types.go
  • controller/proposal/agent.go
  • controller/proposal/client.go
  • controller/proposal/client_test.go
  • controller/proposal/handlers.go
  • controller/proposal/reconciler.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/sandbox_agent.go
  • controller/proposal/sandbox_agent_test.go
  • controller/proposal/sandbox_templates.go
  • controller/proposal/sandbox_templates_test.go
  • controller/proposal/templates/analysis_query.tmpl
  • examples/setup/09-intelliaide-proposals.yaml

Comment thread api/v1alpha1/proposal_types.go Outdated
Comment thread controller/proposal/sandbox_templates.go
Comment thread controller/proposal/templates/analysis_query.tmpl Outdated
Comment thread controller/proposal/templates/analysis_query.tmpl Outdated
Comment thread examples/setup/09-intelliaide-proposals.yaml
Comment thread examples/setup/09-intelliaide-proposals.yaml
Comment thread api/v1alpha1/proposal_types.go Outdated
// +optional
// +kubebuilder:validation:Minimum=1
// +kubebuilder:validation:Maximum=60
TimeoutMinutes *int32 `json:"timeoutMinutes,omitempty"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be on ProposalStep, not on ProposalSpec. Right now all four steps (analysis, execution, verification, escalation) get the same timeout — but IntelliAide RCA analysis might need 30 minutes while a verification step that runs kubectl get pods needs 2. A single proposal-level knob forces users to over-provision the timeout for every step just to accommodate the slowest one.

Would you consider moving it to ProposalStep? That would also align with how agent and tools are already scoped — per-step, not per-proposal.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed - timeoutMinutes is moved to ProposalStep.

@@ -1,5 +1,31 @@
You are an analysis agent. Your job is to diagnose the problem, determine the root cause, and propose one or more remediation options. Do NOT attempt to fix, patch, or execute any changes — only analyze and propose.

## Skills

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operator is skill-agnostic by design — it orchestrates Proposal lifecycles without knowing which skills are mounted. Hardcoding intelliaide references and keyword-routing here breaks that separation, and making skill absence fatal would break any Proposal that doesn't mount skills.

No changes to the analysis template are needed. spec.request already carries the user's intent to the agent — the adapter creating the Proposal can embed whatever context it needs there (e.g., "perform deep RCA on diagnostic data at /data/input"). The agent sees the request, sees the skills in its filesystem, and decides what to use. The operator shouldn't be involved in that decision.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this makes sense - so if we are triggering the proposal manually, this is not needed and if we want it to be automatic we need an adapter. either way this is not necessary.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed - the 'Skills' section is dropped.

// pre-populated with data before the Proposal is created. The operator
// mounts it read-only at a well-known path (/data/input) accessible to
// all skills in the sandbox pod.
type DataSource struct {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need PVC support here? The sandbox pods are ephemeral — if the agent needs must-gather data, it can collect it during its run (e.g., oc adm must-gather -o /tmp/must-gather) and work with it locally. The skill or spec.request can instruct the agent to do that.

This avoids adding new API surface (DataSource type, immutability rules, PVC RBAC, volume mount patching in sandbox templates) for something the agent can already do with its existing tools and a writable /tmp.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harche while that can be done, it restricts the ability to bring a must-gather from outside and analyze it, i.e customer cases.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 30, 2026
Comment thread api/v1alpha1/proposal_types.go Outdated
//
// Immutable: input data source is fixed at creation.
// +optional
DataSource *DataSource `json:"dataSource,omitzero"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set add the DataSource here then all the steps (i.e. analyse, execution etc) would need mount that data source, but that may not be necessary in all use cases. Consider moving this to per-step, so data source is attached only to the steps that need it.

@harche harche Jun 1, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe consider making it part of ToolsSpec above, that way it will be available at the proposal as well as step level.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed - DataSource is moved into ToolsSpec

@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from 31261be to 89cd2a4 Compare June 3, 2026 11:59
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 3, 2026
@openshift-ci

openshift-ci Bot commented Jun 3, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign xrajesh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
examples/setup/09-intelliaide-proposals.yaml (1)

37-40: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align the documented image push target with the image referenced in Proposal specs.

Line 37–40 tells users to publish quay.io/<org>/intelliaide-skills:latest, but Lines 73, 104, 149, 184, and 220 reference a different in-cluster image path. Following this file verbatim can leave the referenced Proposal image unavailable.

Proposed minimal fix (make Proposal images match docs)
@@
-      - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
+      - image: quay.io/<org>/intelliaide-skills:latest
@@
-      - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
+      - image: quay.io/<org>/intelliaide-skills:latest
@@
-      - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
+      - image: quay.io/<org>/intelliaide-skills:latest
@@
-      - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
+      - image: quay.io/<org>/intelliaide-skills:latest
@@
-      - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
+      - image: quay.io/<org>/intelliaide-skills:latest

Also applies to: 73-74, 104-105, 149-150, 184-185, 220-220

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/setup/09-intelliaide-proposals.yaml` around lines 37 - 40, The
documented push target quay.io/<org>/intelliaide-skills:latest is inconsistent
with the image path used in the Proposal specs; update either the push command
or the Proposal YAMLs so they match. Specifically, change the podman/push target
in the build instructions (the quay.io/<org>/intelliaide-skills:latest line) to
the exact in-cluster image path referenced by the Proposal entries (the image
string used in the Proposal specs at lines referencing the Proposal images) or
vice versa so all references use the same image name and tag across the file.
🧹 Nitpick comments (1)
controller/proposal/sandbox_templates.go (1)

282-287: ⚡ Quick win

Use constant error identifiers for the newly added wrapped errors.

Several new wraps (for example Line 282 and Line 573) use inline string literals in fmt.Errorf. Please move these messages to const error identifiers and keep %w wrapping.

Proposed pattern
+const (
+    ErrSetLightspeedProvider   = "set LIGHTSPEED_PROVIDER"
+    ErrAddDataSourcePVCVolume  = "add data source PVC volume"
+)
...
- return fmt.Errorf("set LIGHTSPEED_PROVIDER: %w", err)
+ return fmt.Errorf("%s: %w", ErrSetLightspeedProvider, err)
...
- return fmt.Errorf("add data source PVC volume: %w", err)
+ return fmt.Errorf("%s: %w", ErrAddDataSourcePVCVolume, err)

As per coding guidelines, **/*.go: Define errors as constants using const ErrFoo = "…" and wrap them with fmt.Errorf("%s: %w", …) for context.

Also applies to: 301-303, 307-315, 317-319, 323-325, 333-339, 343-349, 573-575

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/sandbox_templates.go` around lines 282 - 287, Replace
inline error message strings in fmt.Errorf wraps with file-level const error
identifiers and preserve %w wrapping; for each occurrence (e.g., the calls
wrapping errors from setEnvVar in the block using LIGHTSPEED_PROVIDER and
LIGHTSPEED_MODEL and the other listed lines) define a const like
ErrSetLightspeedProvider = "set LIGHTSPEED_PROVIDER" and ErrSetLightspeedModel =
"set LIGHTSPEED_MODEL" (and similar names for the other cases), then change the
wraps to fmt.Errorf("%s: %w", ErrSetLightspeedProvider, err) etc., keeping the
same wrapped original error and use the existing symbols (setEnvVar, tmpl,
model, providerTypeString, llm.Spec.Type) to locate the sites to update.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/v1alpha1/proposal_types.go`:
- Around line 262-272: The step-level immutability check currently compares
entire objects (spec.analysis, spec.execution, spec.verification) so changing
TimeoutMinutes (*TimeoutMinutes) gets rejected; update the immutability
validation to exclude timeoutMinutes from the equality check (i.e., allow
changes where only spec.*.timeoutMinutes hasChanged). Modify the webhook/CEL
rule or validation logic that enforces immutability so it compares individual
fields (or uses hasChanged(...) to whitelist spec.analysis.timeoutMinutes,
spec.execution.timeoutMinutes, spec.verification.timeoutMinutes) instead of
comparing the whole object, referencing the TimeoutMinutes field in
proposal_types.go and the spec.analysis/spec.execution/spec.verification
immutability checks.

In `@controller/proposal/handlers.go`:
- Around line 510-512: The escalation path loses the configured timeout because
the temporary resolvedStep used for override agents omits TimeoutMinutes;
compute the timeout once from stepTimeout(step) (or copy step.TimeoutMinutes)
before building resolvedStep and either set resolvedStep.TimeoutMinutes =
step.TimeoutMinutes or pass the precomputed timeout into r.Agent.Escalate so the
override path uses the same timeout value; update the logic around
stepTimeout(step), resolvedStep, and the r.Agent.Escalate call to preserve the
original timeout.

In `@controller/proposal/sandbox_agent.go`:
- Around line 197-201: Replace the hard-coded defaultSandboxTimeout passed to
s.Sandbox.WaitReady so sandbox readiness respects the step's configured timeout:
compute the timeout for WaitReady from the step's timeout (e.g., the step's
TimeoutMinutes or the remaining time on ctx's deadline), then pass that derived
duration instead of defaultSandboxTimeout to s.Sandbox.WaitReady (use the same
claimName and ctx). Ensure you handle zero/negative remaining time and fall back
to a sensible minimum if needed.

---

Duplicate comments:
In `@examples/setup/09-intelliaide-proposals.yaml`:
- Around line 37-40: The documented push target
quay.io/<org>/intelliaide-skills:latest is inconsistent with the image path used
in the Proposal specs; update either the push command or the Proposal YAMLs so
they match. Specifically, change the podman/push target in the build
instructions (the quay.io/<org>/intelliaide-skills:latest line) to the exact
in-cluster image path referenced by the Proposal entries (the image string used
in the Proposal specs at lines referencing the Proposal images) or vice versa so
all references use the same image name and tag across the file.

---

Nitpick comments:
In `@controller/proposal/sandbox_templates.go`:
- Around line 282-287: Replace inline error message strings in fmt.Errorf wraps
with file-level const error identifiers and preserve %w wrapping; for each
occurrence (e.g., the calls wrapping errors from setEnvVar in the block using
LIGHTSPEED_PROVIDER and LIGHTSPEED_MODEL and the other listed lines) define a
const like ErrSetLightspeedProvider = "set LIGHTSPEED_PROVIDER" and
ErrSetLightspeedModel = "set LIGHTSPEED_MODEL" (and similar names for the other
cases), then change the wraps to fmt.Errorf("%s: %w", ErrSetLightspeedProvider,
err) etc., keeping the same wrapped original error and use the existing symbols
(setEnvVar, tmpl, model, providerTypeString, llm.Spec.Type) to locate the sites
to update.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ea29bc9e-57fb-4b1c-869c-c5572b79c808

📥 Commits

Reviewing files that changed from the base of the PR and between 31261be and 89cd2a4.

⛔ Files ignored due to path filters (4)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/agentic.openshift.io_proposals.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
  • go.sum is excluded by !**/*.sum, !go.sum
📒 Files selected for processing (16)
  • .ai/spec/how/reconciler.md
  • .gitignore
  • api/v1alpha1/proposal_types.go
  • api/v1alpha1/tools_types.go
  • controller/proposal/agent.go
  • controller/proposal/client.go
  • controller/proposal/client_test.go
  • controller/proposal/handlers.go
  • controller/proposal/reconciler.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/resolve.go
  • controller/proposal/sandbox_agent.go
  • controller/proposal/sandbox_agent_test.go
  • controller/proposal/sandbox_templates.go
  • controller/proposal/sandbox_templates_test.go
  • examples/setup/09-intelliaide-proposals.yaml
✅ Files skipped from review due to trivial changes (2)
  • .gitignore
  • .ai/spec/how/reconciler.md

Comment thread api/v1alpha1/proposal_types.go Outdated
Comment thread controller/proposal/handlers.go
Comment on lines +197 to +201
// Pod startup is an infrastructure concern unrelated to the agent's work
// budget. Using a fixed ceiling here ensures that the full user-configured
// timeout is available for the agent call itself, and avoids the effective
// wall-clock time being 2× the configured value.
endpoint, err := s.Sandbox.WaitReady(ctx, claimName, defaultSandboxTimeout)

@coderabbitai coderabbitai Bot Jun 3, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Step timeout is not applied to sandbox readiness.

Line 201 hard-caps WaitReady at 5 minutes, so a proposal with timeoutMinutes: 30 can still fail during pod startup at 5 minutes before the agent call begins.

💡 Suggested fix
- endpoint, err := s.Sandbox.WaitReady(ctx, claimName, defaultSandboxTimeout)
+ endpoint, err := s.Sandbox.WaitReady(ctx, claimName, timeout)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Pod startup is an infrastructure concern unrelated to the agent's work
// budget. Using a fixed ceiling here ensures that the full user-configured
// timeout is available for the agent call itself, and avoids the effective
// wall-clock time being 2× the configured value.
endpoint, err := s.Sandbox.WaitReady(ctx, claimName, defaultSandboxTimeout)
// Pod startup is an infrastructure concern unrelated to the agent's work
// budget. Using a fixed ceiling here ensures that the full user-configured
// timeout is available for the agent call itself, and avoids the effective
// wall-clock time being 2× the configured value.
endpoint, err := s.Sandbox.WaitReady(ctx, claimName, timeout)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/sandbox_agent.go` around lines 197 - 201, Replace the
hard-coded defaultSandboxTimeout passed to s.Sandbox.WaitReady so sandbox
readiness respects the step's configured timeout: compute the timeout for
WaitReady from the step's timeout (e.g., the step's TimeoutMinutes or the
remaining time on ctx's deadline), then pass that derived duration instead of
defaultSandboxTimeout to s.Sandbox.WaitReady (use the same claimName and ctx).
Ensure you handle zero/negative remaining time and fall back to a sensible
minimum if needed.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional and is documented in the code comments (lines 195–198). Pod startup is an infrastructure-related task and is separate from the agent's execution time. We use a fixed defaultSandboxTimeout for WaitReady so that the entire user-configured timeoutMinutes remains available for the agent call itself. If we pass timeout to WaitReady, a step configured with timeoutMinutes: 30 could end up taking close to 60 minutes in total, which is exactly what the comment is trying to avoid.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sakshipatels98-byte Understood — the comment makes the intent clear. Keeping WaitReady capped at defaultSandboxTimeout so the full user-configured budget is preserved for the agent call is a deliberate design choice, not a bug. Withdrawing the concern.

@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from 89cd2a4 to 4ba26ce Compare June 4, 2026 09:57

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@controller/proposal/sandbox_agent_test.go`:
- Line 751: The test currently discards the error returned by
fc.Create(context.Background(), fakeBaseTemplate()); update the
sandbox_agent_test.go setup to handle that error: capture the returned error and
fail the test if non-nil (e.g., call t.Fatalf / t.Fatalf-like helper or
require.NoError) so broken fake client setup doesn't produce false-positive
results; locate the fc.Create call and replace the discard with proper error
checking using the test's *testing.T or test helper functions.

In `@controller/proposal/sandbox_templates.go`:
- Around line 197-205: Replace inline fmt.Errorf string literals with the repo
const-error wrapping pattern: define consts (e.g., const ErrPatchDataSource =
"patch data source" and const ErrPatchProbes = "patch probes") and wrap
underlying errors using fmt.Errorf("%s: %w", ErrPatchDataSource, err) and
fmt.Errorf("%s: %w", ErrPatchProbes, err) where patchDataSource and patchProbes
are called; apply the same pattern for the other new error paths in this file
referenced by your review (the other similar fmt.Errorf usages around the later
blocks) so all error returns use const error identifiers and %w wrapping.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f2416450-e6b0-4181-83cb-deaa76e0f8a3

📥 Commits

Reviewing files that changed from the base of the PR and between 89cd2a4 and 4ba26ce.

⛔ Files ignored due to path filters (3)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/agentic.openshift.io_proposals.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
📒 Files selected for processing (16)
  • .ai/spec/how/reconciler.md
  • .gitignore
  • api/v1alpha1/proposal_types.go
  • api/v1alpha1/tools_types.go
  • controller/proposal/agent.go
  • controller/proposal/client.go
  • controller/proposal/client_test.go
  • controller/proposal/handlers.go
  • controller/proposal/reconciler.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/resolve.go
  • controller/proposal/sandbox_agent.go
  • controller/proposal/sandbox_agent_test.go
  • controller/proposal/sandbox_templates.go
  • controller/proposal/sandbox_templates_test.go
  • examples/setup/09-intelliaide-proposals.yaml
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (12)
  • .ai/spec/how/reconciler.md
  • api/v1alpha1/proposal_types.go
  • controller/proposal/resolve.go
  • controller/proposal/reconciler.go
  • controller/proposal/sandbox_templates_test.go
  • controller/proposal/handlers.go
  • controller/proposal/agent.go
  • api/v1alpha1/tools_types.go
  • controller/proposal/client_test.go
  • examples/setup/09-intelliaide-proposals.yaml
  • controller/proposal/sandbox_agent.go
  • controller/proposal/reconciler_test.go


var lastFactoryTimeout time.Duration
fc := fake.NewClientBuilder().WithScheme(testScheme()).Build()
_ = fc.Create(context.Background(), fakeBaseTemplate())

@coderabbitai coderabbitai Bot Jun 4, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle the fake client setup error instead of discarding it.

Line 751 ignores fc.Create(...) errors, which can mask broken test setup and create false-positive test outcomes.

Suggested fix
- _ = fc.Create(context.Background(), fakeBaseTemplate())
+ if err := fc.Create(context.Background(), fakeBaseTemplate()); err != nil {
+     t.Fatalf("create base template: %v", err)
+ }

As per coding guidelines **/*.go: Go security (prodsec-skills) — Never ignore error returns.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
_ = fc.Create(context.Background(), fakeBaseTemplate())
if err := fc.Create(context.Background(), fakeBaseTemplate()); err != nil {
t.Fatalf("create base template: %v", err)
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/sandbox_agent_test.go` at line 751, The test currently
discards the error returned by fc.Create(context.Background(),
fakeBaseTemplate()); update the sandbox_agent_test.go setup to handle that
error: capture the returned error and fail the test if non-nil (e.g., call
t.Fatalf / t.Fatalf-like helper or require.NoError) so broken fake client setup
doesn't produce false-positive results; locate the fc.Create call and replace
the discard with proper error checking using the test's *testing.T or test
helper functions.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This follows the same pattern already used in the existing helper functions (newTestSandboxAgentCaller and newTestSandboxAgentCallerWithProposal). The new test is consistent with the existing code style, and changing only this instance would make the file inconsistent. I don't think any changes are needed here.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sakshipatels98-byte, fair point — the pattern is consistent with the two existing helpers. If this is addressed, it should be done uniformly across all three sites (newTestSandboxAgentCaller, newTestSandboxAgentCallerWithProposal, and the new test). Leaving it as-is for now is reasonable given the scope of this PR.


✏️ Learnings added
Learnt from: sakshipatels98-byte
Repo: openshift/lightspeed-agentic-operator PR: 17
File: controller/proposal/sandbox_agent_test.go:751-751
Timestamp: 2026-06-09T17:36:48.182Z
Learning: In `controller/proposal/sandbox_agent_test.go`, the `_ = fc.Create(context.Background(), fakeBaseTemplate())` pattern (ignoring the error from fake client Create) is used consistently across all test helpers (`newTestSandboxAgentCaller`, `newTestSandboxAgentCallerWithProposal`) and inline test setup. Do not flag individual instances of this pattern in isolation; if it is to be fixed, it must be done uniformly across all occurrences in the file.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: sakshipatels98-byte
Repo: openshift/lightspeed-agentic-operator PR: 17
File: controller/proposal/sandbox_templates.go:197-205
Timestamp: 2026-06-09T17:19:11.943Z
Learning: In this repository’s Go code, error wrapping may use inline `fmt.Errorf("...: %w", err)` (with literal message strings) without requiring the project’s non-use of a `const ErrFoo = "..."` + `fmt.Errorf("%s: %w", ErrFoo, err)` style. During code reviews, do not flag inline `fmt.Errorf` message strings as violations of any const-error-message guideline for this repo.

Comment thread controller/proposal/sandbox_templates.go
@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from 4ba26ce to f69db4c Compare June 9, 2026 17:51
@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 9, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
.ai/spec/how/reconciler.md (1)

179-195: 💤 Low value

Add language specification to fenced code block.

The code block is missing a language identifier. Add ```text or similar to fix the markdown linting warning.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.ai/spec/how/reconciler.md around lines 179 - 195, The fenced code block in
.ai/spec/how/reconciler.md that contains the controller/ProposalReconciler
diagram lacks a language identifier; update the opening fence from ``` to a
labeled fence such as ```text (or ```mermaid/```diagram if preferred) so the
block is recognized by markdown linters/renderers—look for the triple-backtick
block around the diagram lines referencing ProposalReconciler.Reconcile,
SandboxAgentCaller, SandboxProvider, BarePodManager, PodSpecBuilder and change
the fence to include the language spec.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In @.ai/spec/how/reconciler.md:
- Around line 179-195: The fenced code block in .ai/spec/how/reconciler.md that
contains the controller/ProposalReconciler diagram lacks a language identifier;
update the opening fence from ``` to a labeled fence such as ```text (or
```mermaid/```diagram if preferred) so the block is recognized by markdown
linters/renderers—look for the triple-backtick block around the diagram lines
referencing ProposalReconciler.Reconcile, SandboxAgentCaller, SandboxProvider,
BarePodManager, PodSpecBuilder and change the fence to include the language
spec.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 12564e27-d55e-481e-a21e-7b1e94f34388

📥 Commits

Reviewing files that changed from the base of the PR and between 4ba26ce and f69db4c.

⛔ Files ignored due to path filters (3)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/agentic.openshift.io_proposals.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
📒 Files selected for processing (16)
  • .ai/spec/how/reconciler.md
  • .gitignore
  • api/v1alpha1/proposal_types.go
  • api/v1alpha1/tools_types.go
  • controller/proposal/agent.go
  • controller/proposal/client.go
  • controller/proposal/client_test.go
  • controller/proposal/handlers.go
  • controller/proposal/reconciler.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/resolve.go
  • controller/proposal/sandbox_agent.go
  • controller/proposal/sandbox_agent_test.go
  • controller/proposal/sandbox_templates.go
  • controller/proposal/sandbox_templates_test.go
  • examples/setup/09-intelliaide-proposals.yaml
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (11)
  • controller/proposal/reconciler.go
  • controller/proposal/resolve.go
  • controller/proposal/handlers.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/client_test.go
  • controller/proposal/sandbox_templates_test.go
  • controller/proposal/agent.go
  • controller/proposal/sandbox_agent.go
  • examples/setup/09-intelliaide-proposals.yaml
  • controller/proposal/sandbox_templates.go
  • controller/proposal/sandbox_agent_test.go

@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from f69db4c to 499ee65 Compare June 9, 2026 18:58
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 9, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@controller/proposal/sandbox_templates_test.go`:
- Line 519: Replace the nil context arguments in the test calls to
EnsureAgentTemplate with context.TODO(); locate the calls to EnsureAgentTemplate
(e.g., the invocation with parameters "...", "base", "ns", "analysis", nil,
testLLMProvider(...), ...) and change the first nil to context.TODO() (do the
same for the second occurrence mentioned in the comment) so the tests use an
idiomatic non-nil context.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 09f90786-5deb-45ab-8d14-a0122b846a97

📥 Commits

Reviewing files that changed from the base of the PR and between f69db4c and 499ee65.

⛔ Files ignored due to path filters (3)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/agentic.openshift.io_proposals.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
📒 Files selected for processing (16)
  • .ai/spec/how/reconciler.md
  • .gitignore
  • api/v1alpha1/proposal_types.go
  • api/v1alpha1/tools_types.go
  • controller/proposal/agent.go
  • controller/proposal/client.go
  • controller/proposal/client_test.go
  • controller/proposal/handlers.go
  • controller/proposal/reconciler.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/resolve.go
  • controller/proposal/sandbox_agent.go
  • controller/proposal/sandbox_agent_test.go
  • controller/proposal/sandbox_templates.go
  • controller/proposal/sandbox_templates_test.go
  • examples/setup/09-intelliaide-proposals.yaml
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift/lightspeed-agentic-sandbox (manual)
✅ Files skipped from review due to trivial changes (2)
  • .gitignore
  • .ai/spec/how/reconciler.md
🚧 Files skipped from review as they are similar to previous changes (12)
  • controller/proposal/resolve.go
  • controller/proposal/reconciler.go
  • controller/proposal/client.go
  • controller/proposal/handlers.go
  • controller/proposal/sandbox_templates.go
  • examples/setup/09-intelliaide-proposals.yaml
  • controller/proposal/agent.go
  • api/v1alpha1/tools_types.go
  • api/v1alpha1/proposal_types.go
  • controller/proposal/sandbox_agent.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/sandbox_agent_test.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@controller/proposal/sandbox_templates_test.go`:
- Line 519: Replace the nil context arguments in the test calls to
EnsureAgentTemplate with context.TODO(); locate the calls to EnsureAgentTemplate
(e.g., the invocation with parameters "...", "base", "ns", "analysis", nil,
testLLMProvider(...), ...) and change the first nil to context.TODO() (do the
same for the second occurrence mentioned in the comment) so the tests use an
idiomatic non-nil context.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 09f90786-5deb-45ab-8d14-a0122b846a97

📥 Commits

Reviewing files that changed from the base of the PR and between f69db4c and 499ee65.

⛔ Files ignored due to path filters (3)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/agentic.openshift.io_proposals.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
📒 Files selected for processing (16)
  • .ai/spec/how/reconciler.md
  • .gitignore
  • api/v1alpha1/proposal_types.go
  • api/v1alpha1/tools_types.go
  • controller/proposal/agent.go
  • controller/proposal/client.go
  • controller/proposal/client_test.go
  • controller/proposal/handlers.go
  • controller/proposal/reconciler.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/resolve.go
  • controller/proposal/sandbox_agent.go
  • controller/proposal/sandbox_agent_test.go
  • controller/proposal/sandbox_templates.go
  • controller/proposal/sandbox_templates_test.go
  • examples/setup/09-intelliaide-proposals.yaml
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift/lightspeed-agentic-sandbox (manual)
✅ Files skipped from review due to trivial changes (2)
  • .gitignore
  • .ai/spec/how/reconciler.md
🚧 Files skipped from review as they are similar to previous changes (12)
  • controller/proposal/resolve.go
  • controller/proposal/reconciler.go
  • controller/proposal/client.go
  • controller/proposal/handlers.go
  • controller/proposal/sandbox_templates.go
  • examples/setup/09-intelliaide-proposals.yaml
  • controller/proposal/agent.go
  • api/v1alpha1/tools_types.go
  • api/v1alpha1/proposal_types.go
  • controller/proposal/sandbox_agent.go
  • controller/proposal/reconciler_test.go
  • controller/proposal/sandbox_agent_test.go
🛑 Comments failed to post (1)
controller/proposal/sandbox_templates_test.go (1)

519-519: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace nil context with context.TODO().

These negative tests pass nil as the context argument. While the function likely errors out before using the context, context.TODO() is more idiomatic and satisfies the linter.

🔧 Proposed fix
-	_, err := EnsureAgentTemplate(nil, nil, "base", "ns", "analysis", nil, testLLMProvider(agenticv1alpha1.LLMProviderGoogleCloudVertex), nil, "lightspeed-agent")
+	_, err := EnsureAgentTemplate(context.TODO(), nil, "base", "ns", "analysis", nil, testLLMProvider(agenticv1alpha1.LLMProviderGoogleCloudVertex), nil, "lightspeed-agent")
 	if err == nil {
 		t.Error("expected error for nil agent")
 	}
-	_, err := EnsureAgentTemplate(nil, nil, "base", "ns", "analysis", testDefaultAgent(), nil, nil, "lightspeed-agent")
+	_, err := EnsureAgentTemplate(context.TODO(), nil, "base", "ns", "analysis", testDefaultAgent(), nil, nil, "lightspeed-agent")
 	if err == nil {
 		t.Error("expected error for nil LLM")
 	}

Also applies to: 526-526

🧰 Tools
🪛 golangci-lint (2.12.2)

[error] 519-519: SA1012: do not pass a nil Context, even if a function permits it; pass context.TODO if you are unsure about which Context to use

(staticcheck)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller/proposal/sandbox_templates_test.go` at line 519, Replace the nil
context arguments in the test calls to EnsureAgentTemplate with context.TODO();
locate the calls to EnsureAgentTemplate (e.g., the invocation with parameters
"...", "base", "ns", "analysis", nil, testLLMProvider(...), ...) and change the
first nil to context.TODO() (do the same for the second occurrence mentioned in
the comment) so the tests use an idiomatic non-nil context.

Source: Linters/SAST tools

@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from 499ee65 to af4eef0 Compare June 12, 2026 12:15
@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 12, 2026
@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from af4eef0 to b1431b0 Compare June 12, 2026 12:34
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 12, 2026
@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from b1431b0 to 070a1e9 Compare June 12, 2026 12:49
@sakshiep1

Copy link
Copy Markdown
Author

/retest-required

1 similar comment
@sakshiep1

Copy link
Copy Markdown
Author

/retest-required

@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from 070a1e9 to c2e6ec6 Compare June 13, 2026 15:56
@sakshiep1 sakshiep1 force-pushed the must-gather-version branch from c2e6ec6 to 682b7fa Compare June 13, 2026 16:23
@openshift-ci

openshift-ci Bot commented Jun 13, 2026

Copy link
Copy Markdown

@sakshiep1: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants