Narrated walkthrough: in-page play mode + opt-in MP4 export #2

Open
russ wants to merge 2 commits into main from feat/narrated-walkthrough-video

russ (Owner) commented Jun 22, 2026

What & why

Turns a PatchStory walkthrough into a narrated screencast that breaks down the code being reviewed — inspired by this AI-video-generation pipeline, but built to fit PatchStory's local-first, zero-dependency model. The walkthrough's chapters already are the script (intent, risk, the referenced diff hunks), so this mostly adds a player and an exporter over data that already exists.

Two layers:

1. In-page "play" mode (the default — stays zero-dependency)

A ▶ Play button turns the static .html into a self-playing screencast: each chapter becomes a scene that pans the actual diff and spotlights the lines it references, while the browser's built-in speech synthesis reads a short narration (captions included, so it works muted). No ffmpeg, no API key, no network — the same single .html, just playing itself.

  • New optional Chapter.narration field (falls back to intent, then summary, so existing walkthroughs still play).
  • Player overlay: requestAnimationFrame scene clock, speech + auto-advance, transport controls, diff spotlight/dim, keyboard (space/arrows/m/Esc), header Play button and p shortcut.
  • Themed via the existing light/dark CSS vars; prefers-reduced-motion fallback.
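
The narration fallback in the first bullet could look roughly like the sketch below. Only the field names narration, intent, and summary come from this PR; the Chapter shape, the narrationFor helper, and the blank-string handling are assumptions for illustration.

```typescript
// Hypothetical chapter shape: only these three field names appear in the PR text.
interface Chapter {
  narration?: string;
  intent?: string;
  summary?: string;
}

// Return the first non-blank text in the order narration -> intent -> summary.
// Treating whitespace-only strings as missing is an assumption of this sketch.
function narrationFor(chapter: Chapter): string {
  const candidates = [chapter.narration, chapter.intent, chapter.summary];
  for (const text of candidates) {
    if (text !== undefined && text.trim() !== "") {
      return text.trim();
    }
  }
  return ""; // nothing to read; the scene would rely on captions or stay silent
}
```

With this shape, a walkthrough authored before the narration field existed still yields text for every chapter that has an intent or a summary.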

2. patchstory video (opt-in MP4 export)

For when you need a real shareable file. Same scene model, rendered to an .mp4 using system tools (no npm runtime deps; only touched when you ask for a video):

patchstory video pr-walkthrough.json --diff pr.diff --redact -o walkthrough.mp4

Per scene: deterministic scene HTML (fixed header/caption bands) → headless-Chromium screenshot → TTS → ffmpeg slices the screenshot into a fixed header, a vertically-panning code region, and a fixed caption, then muxes audio. Clips are concatenated to the final MP4.
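
One way to picture the "fixed header, panning code region, fixed caption" split is as a crop offset that moves through the code band over the scene's duration. The sketch below is only an illustration of that idea: the Layout fields, the linear pan, and the example pixel values are assumptions, not the renderer's actual geometry or ffmpeg filter graph.

```typescript
// Hypothetical layout description; all field names are invented for this sketch.
interface Layout {
  frameHeight: number;      // output frame height in pixels
  headerHeight: number;     // fixed band at the top of the frame
  captionHeight: number;    // fixed band at the bottom of the frame
  screenshotHeight: number; // height of the full scene screenshot
}

// Height of the window through which the code region is visible.
function codeWindowHeight(layout: Layout): number {
  return layout.frameHeight - layout.headerHeight - layout.captionHeight;
}

// Vertical crop offset into the code region at time tSeconds, assuming a linear pan
// from the top of the code region to its bottom over the whole scene.
function panOffset(layout: Layout, tSeconds: number, durationSeconds: number): number {
  const codeHeight = layout.screenshotHeight - layout.headerHeight - layout.captionHeight;
  const maxOffset = Math.max(0, codeHeight - codeWindowHeight(layout));
  if (durationSeconds <= 0) return 0;
  const progress = Math.min(1, Math.max(0, tSeconds / durationSeconds));
  return Math.round(maxOffset * progress);
}
```

A screenshot whose code region already fits the window gets a maximum offset of 0, so short scenes stay still instead of panning.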

  • ffmpeg/ffprobe resolved by actually running candidates (PATH → /usr/bin → PATCHSTORY_FFMPEG/FFPROBE), so a broken/shadowing PATH entry is skipped.
  • Chromium via --chrome, PATCHSTORY_CHROME, or a flatpak Chromium.
  • TTS via --tts: elevenlabs (ELEVENLABS_API_KEY), or local espeak-ng/flite/say, or none (silent; captions still shown).
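
The "resolve by actually running candidates" idea from the first bullet can be sketched as a loop over candidates with an injected probe. The candidate order and the PATCHSTORY_FFMPEG variable come from this PR; resolveTool, ffmpegCandidates, and the Probe type are hypothetical names. A real probe might spawn the binary (for example with `-version`) and report whether it exited cleanly.

```typescript
// A probe reports whether a candidate binary actually runs. It is injected here so
// the sketch stays free of process spawning; the real check is not shown in the PR.
type Probe = (candidate: string) => boolean;

// Candidate order as described in the PR: bare name (PATH lookup), /usr/bin, then env override.
function ffmpegCandidates(env: Record<string, string | undefined>): Array<string | undefined> {
  return ["ffmpeg", "/usr/bin/ffmpeg", env["PATCHSTORY_FFMPEG"]];
}

// Return the first candidate the probe accepts, skipping unset entries and
// treating a throwing probe the same as a failed one.
function resolveTool(candidates: Array<string | undefined>, probe: Probe): string | undefined {
  for (const candidate of candidates) {
    if (!candidate) continue;
    try {
      if (probe(candidate)) return candidate;
    } catch {
      // A candidate that cannot even be launched is simply skipped.
    }
  }
  return undefined;
}
```

Because the check is "does it run" rather than "does the file exist", a broken wrapper earlier in the list does not mask a working binary later in it.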

Verification

  • typecheck clean · build clean · 23/23 tests pass.
  • Play mode driven in headless Chromium (overlay mounts, scenes render/advance, end card). That run caught and fixed a real bug: a TTS onerror was instantly skipping scenes instead of falling back to the timed clock.
  • patchstory video produces a valid 1920×1080 H.264 + AAC, ~133s MP4 from a 5-chapter walkthrough; sampled frames confirm the title card, panning syntax-highlighted diff with spotlit lines, and burned-in captions.
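
The TTS bug noted above (a speech error skipping the scene instead of falling back to the timed clock) suggests an advance rule along the lines of the sketch below. The state names, the 160 words-per-minute pace, and the 3-second minimum are assumptions for illustration, not values taken from the player.

```typescript
// Hypothetical speech states for one scene.
type SpeechState = "speaking" | "ended" | "error" | "unavailable";

// Assumed reading pace and minimum scene length for the timed clock.
const WORDS_PER_MINUTE = 160;
const MIN_SCENE_MS = 3000;

// Estimate how long a scene should stay on screen when no speech is driving it.
function timedDurationMs(narration: string): number {
  const words = narration.trim().split(/\s+/).filter((w) => w.length > 0).length;
  return Math.max(MIN_SCENE_MS, Math.round((words / WORDS_PER_MINUTE) * 60000));
}

// Decide whether to move to the next scene. A speech error does not advance by
// itself; it hands control to the timed clock instead.
function shouldAdvance(speech: SpeechState, elapsedMs: number, durationMs: number): boolean {
  switch (speech) {
    case "speaking":
      return false;
    case "ended":
      return true;
    case "error":
    case "unavailable":
      return elapsedMs >= durationMs;
  }
}
```

Under this rule an error at elapsed time 0 keeps the scene on screen until the estimated duration has passed, which is the behavior the fix restores.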

Notes

  • Web Speech API voice quality varies by OS/browser — the tradeoff for keeping play mode zero-dependency.
  • The MP4 path is heavier/slower by design; the HTML play mode is the local-first default.

🤖 Generated with Claude Code

russ and others added 2 commits June 22, 2026 15:34

Turn the static walkthrough into a self-playing, narrated screencast: each
chapter becomes a scene that pans the actual diff and spotlights the lines
it references, while the browser's built-in speech synthesis reads a short
narration (captions included, so it works muted). No new dependencies, no
API key, no network -- the same single .html, just playing itself.

- core: add optional Chapter.narration and allow it in the JSON Schema.
  Falls back to intent then summary, so existing walkthroughs still play.
- web: player overlay -- requestAnimationFrame scene clock, speech +
  auto-advance, transport controls, diff spotlight/dim, keyboard
  (space/arrows/m/Esc), header Play button and `p` shortcut.
- styles: full-screen player themed via the existing light/dark CSS vars,
  with a prefers-reduced-motion fallback.
- skill + README: author a per-chapter `narration`; document play mode.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW

An opt-in counterpart to the in-page play mode that produces a real,
shareable .mp4 -- a title card plus one scene per chapter, each panning the
actual diff (spotlighting referenced lines) over a narration track. Reuses
the same scene model, but uses *system* tools, so it adds no npm runtime
dependencies and only touches them when a video is requested.

Pipeline per scene: build a deterministic scene HTML (fixed header/caption
bands) -> headless-Chromium screenshot -> TTS narration -> ffmpeg slices the
PNG into a fixed header, a vertically-panning code region, and a fixed
caption, then muxes the audio. Per-scene clips are concatenated to the MP4.

- renderer: packages/renderer/src/video/{scene-html,index}.ts; export
  renderVideo + types.
- cli: `patchstory video <walkthrough.json>` with
  --tts/--voice/--chrome/--fps/--keep (plus --diff/--redact via render path).
- tools: resolve ffmpeg/ffprobe by actually running candidates (PATH, then
  /usr/bin, then PATCHSTORY_FFMPEG/FFPROBE), so a broken or shadowing PATH
  entry is skipped; Chrome resolution also picks up a flatpak Chromium.
- tts: elevenlabs (ELEVENLABS_API_KEY) | espeak-ng | flite | say | none.
- docs: README "Narrated video" section; skill note.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW