fix: skip CountWorkflow in batch operations when --yes is set#1012
fix: skip CountWorkflow in batch operations when --yes is set#1012bitalizer wants to merge 1 commit into
Conversation
The visibility CountWorkflowExecutions request was issued unconditionally before every batch terminate / signal / cancel / reset. The count is only used to populate the "Start batch against approximately N workflow(s)?" confirmation prompt. When --yes bypasses the prompt entirely, the count result is never read. In clusters whose visibility API is overloaded (e.g. Postgres-backed clusters with many workflows), this CountWorkflow call can time out and prevent batch jobs from being started at all, even though the batch operation itself uses the same query and would succeed. Skipping the count when --yes is set lets these batch jobs proceed unconditionally. Both batch entry points are updated: - commands.workflow.go (terminate / signal / cancel) - commands.workflow_reset.go (reset) When the count is skipped, the prompt text shown by --yes changes from "Start batch against approximately N workflow(s)? y/N" to "Start batch against workflows matching query "<query>"? y/N" so the output remains informative. Adds TestWorkflow_Terminate_BatchWorkflow_SkipsCountWhenYes which uses a gRPC unary interceptor to assert that CountWorkflowExecutionsRequest is *not* sent when --yes is passed, while StartBatchOperationRequest still is. The existing without-yes tests are unaffected. Closes temporalio#838
|
Overall: LGTM The Test The test is thorough — interceptor pattern, gRPC call counting, Missing coverage for The test only covers |
Closes #838.
Implements the design @cretz proposed in the issue thread: when
--yesbypasses the confirmation prompt, skip theCountWorkflowExecutionsrequest entirely.What was changed
SingleWorkflowOrBatchOptions.workflowExecOrBatch:CountWorkflowonly runs in the!s.Yesbranch.TemporalWorkflowResetCommand's batch path.When
--yesskips the count, the prompt text changes fromStart batch against approximately N workflow(s)? y/NtoStart batch against workflows matching query "<query>"? y/N, so the output (whichpromptYesstill prints in the auto-confirm path) stays informative.The non-
--yesinteractive flow is untouched: it still counts, still prompts, still prints the sameapproximately N workflow(s)message.Why
workflowExecOrBatch(terminate / signal / cancel) and the reset command both unconditionally callCountWorkflowExecutionsbefore starting a batch. The result is only used to fill in theapproximately N workflow(s)confirmation. When--yesis set, the prompt is skipped — but the count call still runs, and on clusters where the visibility API is timing out it fails the entire batch start. The original report is from a Postgres-backed cluster where the batch query itself works but the count times out; users can't start batch jobs they otherwise have permission to run.How was this tested
Added
TestWorkflow_Terminate_BatchWorkflow_SkipsCountWhenYesthat:CountWorkflowExecutionsRequestandStartBatchOperationRequestcalls.workflow terminate --query ... --yes.CountWorkflowcalls, 1StartBatchOperationcall, prompt text containsmatching queryand notapproximately.The interceptor pattern matches the existing
testTerminateBatchWorkflowhelper.Ran the new test plus a sample of the existing batch tests locally; they pass. (
TestWorkflow_Terminate_BatchWorkflowSuccessflakes locally on the unrelatedCompletedassertion both with and without my changes — pre-existing timing issue, not caused by this PR.)