fix(esp32): prevent ESP32-S3 serial reset deadlock and boot-mode lockup #532

@zackees

Context

During FastLED AutoResearch testing on an attached ESP32-S3, the board could be left temporarily unresponsive (apparently bricked) after the upload/monitor/RPC handoff path. From the host side, the failure looked like a deadlock combined with a board that had stopped responding to RPC.

Observed on Windows with the ESP32-S3 USB serial/JTAG port:

  • Upload and reset of the S3 over COM22 succeeded.
  • After monitor/RPC handoff attempts, the board could stop responding to JSON-RPC even though the serial device still existed.
  • A bad serial-line state could put the S3 into ROM download mode:
    • boot:0x23 DOWNLOAD(USB/UART0)
    • waiting for download
  • Recovery required an explicit DTR/RTS reset sequence. This is easy to miss if fbuild or a follow-on pyserial client is holding the port or has left DTR/RTS asserted in a bootloader-selecting state.
  • This is adjacent to, but distinct from, #531 ("fix(monitor): release Windows COM port after monitor exits"), where fbuild monitor --timeout ... can leave the Windows COM port owned by the daemon until fbuild daemon stop is run.

The practical result is that a test runner can look deadlocked while the S3 is effectively stuck in the boot ROM or otherwise unavailable for RPC.
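
The two ROM lines quoted above are enough for a simple text check. Below is a minimal sketch, assuming the captured serial output is available as a string. The function name `classify_boot_output` and the `boot:` regex are hypothetical and are not part of fbuild.

```python
import re

# Markers copied from the boot log quoted in this issue.
DOWNLOAD_MARKERS = ("waiting for download", "DOWNLOAD(USB/UART0)")

# Assumption: the strap value is printed as "boot:0x<hex>".
BOOT_VALUE_RE = re.compile(r"boot:(0x[0-9a-fA-F]+)")


def classify_boot_output(text):
    """Report whether captured serial text looks like ROM download mode."""
    match = BOOT_VALUE_RE.search(text)
    return {
        "download_mode": any(marker in text for marker in DOWNLOAD_MARKERS),
        "boot_value": match.group(1) if match else None,
    }


if __name__ == "__main__":
    sample = "boot:0x23 DOWNLOAD(USB/UART0)\nwaiting for download\n"
    print(classify_boot_output(sample))
```

A runner could call this on the first output captured after a reset and report a boot-mode problem instead of waiting on RPC.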

Proposal

Harden the ESP32 serial reset/monitor/RPC handoff path so fbuild never leaves ESP32-S3 boards in an unrecoverable or confusing state:

  • Centralize ESP32 DTR/RTS handling for reset and boot mode selection.
  • Ensure normal monitor/RPC handoff leaves DTR/RTS in the run-firmware state, not the download-mode state.
  • Detect common ESP ROM download-mode output such as waiting for download after a reset/monitor operation and either recover automatically or fail with a targeted diagnostic.
  • Add watchdog-style timeout handling around monitor/serial operations so a board that stopped responding does not leave fbuild appearing deadlocked.
  • Always release the COM port before handing it to a pyserial/RPC client, including failure paths.
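
The handoff rules in this list could be sketched roughly as follows. This is not fbuild's implementation. It assumes a pyserial-style port object exposing `dtr`, `rts`, `read()` and `close()`, and the conventional auto-reset wiring in which RTS holds the chip in reset and DTR selects download mode. The polarity and timing on the S3 USB serial/JTAG port are assumptions that would need checking on hardware, and all helper names are hypothetical.

```python
import time
from contextlib import contextmanager


def park_lines_for_run(port):
    """Leave DTR and RTS de-asserted (assumed run-firmware state)."""
    port.dtr = False
    port.rts = False


def reset_to_app(port, pulse_s=0.1):
    """Pulse reset with boot-select released (assumed polarity)."""
    port.dtr = False   # assumption: DTR asserted would select download mode
    port.rts = True    # assumption: RTS asserted holds the chip in reset
    time.sleep(pulse_s)
    port.rts = False   # release reset so the application boots


def wait_for_marker(port, marker, deadline_s):
    """Read until marker appears or the deadline passes.

    Relies on the port having a short read timeout so read() returns.
    """
    end = time.monotonic() + deadline_s
    seen = b""
    while time.monotonic() < end:
        seen += port.read(256)
        if marker in seen:
            return True
    return False


@contextmanager
def released_on_exit(port):
    """Yield the port, then always park the lines and close it."""
    try:
        yield port
    finally:
        try:
            park_lines_for_run(port)
        finally:
            port.close()   # runs on success, timeout, and exception paths
```

Wrapping every monitor or RPC session in `released_on_exit` keeps the close and the line parking in one place, so a timeout inside the block still releases the COM port.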

Acceptance criteria

  • Repeated deploy -> monitor --timeout -> pyserial/RPC open cycles on an ESP32-S3 do not leave the board in ROM download mode.
  • If the board emits waiting for download, fbuild reports that exact boot-mode problem or resets it back to application mode.
  • A failed or timed-out monitor/RPC operation does not require fbuild daemon stop before another process can recover the board.
  • The fbuild logs show enough DTR/RTS/reset context to diagnose future S3 boot-mode lockups.
  • The behavior is covered by at least one regression test or a scripted repro path for the ESP32-S3 serial reset state machine.
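
One possible shape for such a regression test, without hardware, is a fake port that records every DTR/RTS transition so the sequence can be checked afterwards. This is a sketch only. `RecordingPort` and both checker functions are hypothetical, and the rule "releasing RTS while DTR is asserted selects download mode" is an assumption based on the conventional auto-reset wiring.

```python
class RecordingPort:
    """Stand-in for a serial port that logs (dtr, rts) after every change."""

    def __init__(self):
        self._dtr = False
        self._rts = False
        self.log = [(False, False)]

    @property
    def dtr(self):
        return self._dtr

    @dtr.setter
    def dtr(self, value):
        self._dtr = bool(value)
        self.log.append((self._dtr, self._rts))

    @property
    def rts(self):
        return self._rts

    @rts.setter
    def rts(self, value):
        self._rts = bool(value)
        self.log.append((self._dtr, self._rts))


def releases_reset_into_download(log):
    """True if RTS drops (reset released) while DTR is asserted."""
    for (_, prev_rts), (dtr, rts) in zip(log, log[1:]):
        if prev_rts and not rts and dtr:
            return True
    return False


def ends_in_run_state(log):
    """True if the last recorded state has both lines de-asserted."""
    return log[-1] == (False, False)


if __name__ == "__main__":
    port = RecordingPort()
    for _ in range(3):          # repeated reset cycles
        port.dtr = False
        port.rts = True
        port.rts = False
    assert ends_in_run_state(port.log)
    assert not releases_reset_into_download(port.log)
```

The same recorded log could also be written to the fbuild logs to give the DTR/RTS context the previous criterion asks for.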
