Commit 40feda1
fix: retry all SQS messages on unhandled errors instead of silently dropping them
When the scale-up lambda encountered an unhandled error (e.g. an SSM
ThrottlingException during registration token creation), the catch block
returned an empty batchItemFailures array. With ReportBatchItemFailures
enabled, this tells SQS that all messages were processed successfully,
so SQS permanently deletes them from the queue.
This causes queued GitHub Actions jobs to be silently lost: they never
get a runner and remain stuck in the 'queued' state indefinitely.
The fix returns all message IDs as batch item failures on unhandled
errors, so SQS retries them after the visibility timeout.

1 parent 1d57199 commit 40feda1
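
The behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `SqsRecord`, `allMessagesFailed`, `handleBatch`, and `processBatch` are hypothetical names, and the local types only mirror the `batchItemFailures` / `itemIdentifier` response shape that Lambda expects when ReportBatchItemFailures is enabled.

```typescript
// Minimal local types (assumed for illustration) mirroring the SQS
// partial-batch response shape used with ReportBatchItemFailures.
interface SqsRecord {
  messageId: string;
  body: string;
}

interface SqsBatchResponse {
  batchItemFailures: { itemIdentifier: string }[];
}

// Hypothetical stand-in for the lambda's real per-batch work.
type ProcessBatch = (records: SqsRecord[]) => Promise<SqsBatchResponse>;

// Report every message in the batch as failed so SQS makes all of them
// visible again after the visibility timeout.
function allMessagesFailed(records: SqsRecord[]): SqsBatchResponse {
  return {
    batchItemFailures: records.map((r) => ({ itemIdentifier: r.messageId })),
  };
}

async function handleBatch(
  records: SqsRecord[],
  processBatch: ProcessBatch,
): Promise<SqsBatchResponse> {
  try {
    return await processBatch(records);
  } catch (e) {
    // Returning { batchItemFailures: [] } here would tell SQS that the
    // whole batch succeeded and the messages would be deleted.
    console.error('Unhandled error, marking the whole batch for retry', e);
    return allMessagesFailed(records);
  }
}
```

Retrying the whole batch is the conservative choice when the error cannot be attributed to a single message; a redrive policy with a dead-letter queue would typically bound the number of retries.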
1 file changed, +5 -3 lines changed

Diff hunk (line contents not preserved): original lines 55-63, new lines 55-65; original lines 58-60 removed, new lines 58-62 added.