feat(cli): capture-video on-demand fetcher + capture pipeline robustness#1447
Conversation
e4e8b44 to
b56b7c5
Compare
| capture: () => import("./commands/capture.js").then((m) => m.default), | ||
| "capture-video": () => import("./commands/capture-video.js").then((m) => m.default), |
There was a problem hiding this comment.
I wonder if capture-video could be done via a subcommand of capture?
capture video
or capture --video
There was a problem hiding this comment.
oh yeah actually capture video should work
There was a problem hiding this comment.
let me do that real quick
b56b7c5 to
8404517
Compare
51c9c6c to
343d5ac
Compare
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Verified at HEAD 343d5ac — layering on @via's review at b56b7c5 and James's inline ask.
Verdict: approve-with-concerns. James's "make it a capture subcommand" landed cleanly; 2 of @via's 6 items resolved, 2 partial, 2 open. None blocking, but the buffering-vs-streaming gap is worth a follow-up since it shifts from "cosmetic duplication" to "actual OOM surface" on a hostile CDN.
Status table
James's ask — capture-video → capture video subcommand
| ✅ | subCommands: { video: ... } wired in capture.ts; "capture-video" entry removed from cli.ts registry; file moved to commands/capture/video.ts; example updated to hyperframes capture video ./linear-video --index 0. Clean follow-through. |
@via's 6 findings at HEAD
| # | Finding | Status | Notes |
|---|---|---|---|
| 1 | Duplicate download primitive (buffers full 250 MB via arrayBuffer() vs cloud/download.ts:downloadToFile's streaming) |
❌ open | Still Buffer.from(await r.arrayBuffer()) at video.ts:55. See concern below — this is now a real safety gap, not just a duplication. |
| 2 | run() untested (manifest discovery, --list, EEXIST short-circuit, exit codes) |
❌ open | video.test.ts still covers 4 helpers; integration surface uncovered. Acceptable for an author-time CLI but worth a follow-up. |
| 3 | maskNonScannableRanges masking has no regression test |
✅ resolved | New case at lintProject.test.ts: "does not flag <img>/<video> tokens inside <!-- -->, <style>, or <script>". Good. |
| 4 | SVG polarity light/dark fill regex asymmetric | Light glyph regex broadened to #fff(fff)?|white|#[ef][ef][ef]|#[ef]{6} — now catches #eee, #efe, etc. Still misses CSS named colors (gainsboro, silver), rgb()/hsl(), and class-driven fills. Light/dark sides still asymmetric in shape, but the heygen-on-#eee failure mode @via flagged is now caught. |
|
| 5 | Sharp failure swallows to empty caption silently | Now increments svgsSkipped and emits progress("design", "skipped rasterizing N SVG(s) — fell back to label-derived") after the loop. User learns something failed. Open: bare catch {} still drops the actual error, so when one user reports "Vision says X" you can't diagnose which SVG broke or why. Suggest logging err.message per skipped file at progress("design", ...) verbosity. |
|
| 6 | CodeQL network data written to file needs suppression w/ rationale |
✅ resolved | video.ts:223 now reads // lgtm[js/http-to-file-access] — manifest-vetted URL, content-type whitelist, 250MB cap, SSRF-safe fetch. Rationale explicit. |
My additional findings (independent read)
Concerns
• video.ts:55 — Buffering 250 MB before checking byteLength is a real OOM surface, not just cosmetic. MAX_VIDEO_BYTES content-length pre-check (line ~38) is advisory: a hostile or misconfigured CDN can send content-length: 100 and then stream 2 GB. The post-arrayBuffer() byteLength check fires AFTER node has buffered the whole response — process memory has already spiked. Streaming with a running byte cap (the shape cloud/download.ts:downloadToFile uses) would refuse mid-stream. This strengthens @via's #1 from "duplicate primitive" to "shared streaming primitive would actually be safer." Not blocking for v1 against trusted CDNs, but flag for a follow-up before this is used against arbitrary user-controlled manifests.
• video.ts:217 — safeFilename is non-injective; collisions silently EEXIST. decodeURIComponent + [^A-Za-z0-9._-]+ → _ collapse means "hero 1.mp4" and "hero_1.mp4" both produce "hero_1.mp4". If a manifest has two such entries, the second writeFileSync({ flag: "wx" }) throws EEXIST and the code logs "already downloaded: ... (skipping)" — misleading because the bytes on disk are entry A's, not entry B's. The PR's own test covers "a b___c" → "a_b___c", which proves the collapse is lossy. Suggest prefixing the filename with entry.index (e.g. 0-hero_1.mp4) so the manifest's own gap-tolerant key disambiguates on disk too.
• commands/capture.ts:56 — Manifest schema is duck-typed. JSON.parse then directly indexed via manifest.map((e) => e.index) etc. If capture's manifest shape ever changes (extra fields fine, but renamed index → entryIndex, or wrapped in { entries: [...] }), capture-video silently breaks or misreports. No schemaVersion on the manifest, no shape check on read. Author-time tool, so a runtime crash is acceptable — but a one-line "expected manifest version N" check on read would future-proof this cheaply.
Nits
• commands/capture.ts:56 — if (url === "video") return; early-return relies on citty's "parent run fires after subcommand routing" behavior. Comment is helpful but consider: if a user ever captures https://video.example.com and the URL is normalized somewhere upstream, this could short-circuit incorrectly. Worth a if (args._?.[0] === "video") return;-style guard against subCommand name only (not URL value). (nit)
• contentExtractor.ts:285 — bare catch {} (no (err)) means sharp's real failure cause is unrecoverable for diagnosis. Combined with the new svgsSkipped count, you get "I skipped 3 SVGs" but no breadcrumb to which SVGs or why. Pair with @via's #5 fix: at minimum catch (err) and emit a progress("design", ...) line per file. (nit, builds on Via #5)
• video.ts:223 — synchronous writeFileSync of a 250 MB Buffer blocks the event loop for the duration of the disk write. Fine for a CLI, but if this primitive ever gets reused server-side, prefer the fs/promises writeFile async version. (nit, forward-looking)
Questions
• video.ts:38 — application/octet-stream in the accept-list is broad. A lot of static-asset CDNs do send application/octet-stream for .mp4, so this is pragmatic — but it also means any 500-page HTML error served with that mimetype passes the check. Do you have a fallback "first 4 bytes are an mp4/webm/mov magic header" sniff, or is the URL trust model strong enough that magic-byte verification isn't worth the complexity?
• Cross-package coupling — @via flagged the download primitive duplication. Are there other ones emerging? sharp is now used in contentExtractor.ts for SVG rasterization; is it used anywhere else in the cli/core packages? If so, a packages/core/src/imageOps.ts-shaped landing zone might be worth carving before the third caller appears.
What I didn't verify
• I didn't run the new test suite locally — assumed the green CI claim in the PR body is accurate.
• I didn't manually exercise capture video against a real captured project; trusted the manual test plan in the PR body and the helper-level unit coverage.
• I didn't trace the new beats CLI surface that landed alongside this PR (visible in the b56b7c5..343d5ac compare diff) — it's adjacent feature work, not the capture-video review scope. Worth a separate look if it's in this PR's surface.
Review by Rames D Jusso
343d5ac to
eada860
Compare
| `video exceeded ${Math.round(MAX_VIDEO_BYTES / 1024 / 1024)}MB cap mid-stream for ${url}`, | ||
| ); | ||
| } | ||
| if (!file.write(chunk)) { |
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Verified at HEAD eada860 — R3 layering on @via's R1 (b56b7c5) and my R2 (343d5ac).
Verdict: approve. The latest push directly addressed the two strongest follow-ups from R2 (OOM-via-arrayBuffer and silent collision misreport). The remaining gaps are minor and non-blocking.
R2 follow-up status at HEAD
| # | R2 finding | Status | Notes |
|---|---|---|---|
| 1 | Buffers full 250 MB via arrayBuffer() — hostile-CDN OOM surface |
✅ resolved | video.ts:25-100 — full rewrite to streamToFile with createWriteStream({ flags: "wx" }), mid-stream bytes > MAX_VIDEO_BYTES abort, partial-file unlink on non-EEXIST error, backpressure via drain, single shared streamErrored promise to avoid MaxListeners. Solid implementation. Reuses safeFetch from assetDownloader.js. Code-org duplication with cloud/download.ts:downloadToFile is still technically present, but the safety goal is met — calling this resolved, not "resolved-differently," because the threat model is satisfied. A future de-dup pass can extract a streamHttpToFile(url, dest, {maxBytes, contentTypeRe, safeFetch}) shared helper. |
| 2 | run() integration untested (manifest discovery, --list, EEXIST short-circuit, exit codes) |
❌ still open | video.test.ts still covers 4 pure helpers + the new findFilenameCollision. No integration coverage for the run-body branches (W2H-vs-standalone layout discovery, malformed-JSON exit code, collision-refusal exit code, list output). Acceptable for an author-time CLI; flag for a follow-up. |
| 4 | SVG polarity light-fill regex still misses named colors / rgb() |
contentExtractor.ts:262 unchanged. Heygen-on-#eee failure mode caught; named-color / rgb() / class-driven fills still missed. Same severity as R2. |
|
| 5 | catch {} swallows sharp error context |
contentExtractor.ts:273 still bare catch {}. svgsSkipped counter still emits the aggregate count via progress(...). User learns something failed; per-file diagnostic still missing. Same severity as R2. |
|
| M2 | safeFilename collisions silently misreport via EEXIST |
✅ resolved | New findFilenameCollision helper at video.ts:139-149 + check at video.ts:264-275 before download. Fails loudly with all colliding indices listed instead of silently EEXIST-skipping. Test coverage at video.test.ts:140-158 (collision detection + self-exclusion). Clean fix — better than my suggested entry.index-prefix because it surfaces the manifest authoring bug rather than papering over it. |
| M3 | Manifest schema duck-typed, no version field | ❌ still open | commands/capture/video.ts:228-232 still does raw JSON.parse + direct field access. Author-time tool, runtime crash on schema drift is acceptable; flag for a follow-up if this surface grows. |
New observations at HEAD
Nits
• video.ts:80-87 — file.once("drain", ...) is registered inside a backpressure-loop iteration, which means each slow chunk attaches a fresh drain listener. Node clears once listeners on fire, so the leak is bounded, but on a heavily-throttled CDN you can briefly see N parked listeners. Consider hoisting to a shared waitForDrain() helper that re-uses one resolver. Non-blocking. (nit)
• video.ts:53 — AbortSignal.timeout(120_000) is total request timeout, but streaming a 250 MB video at slow 3G could legitimately exceed 2 minutes. Consider replacing with an idle-timeout pattern (reset timer on each chunk) if real users report aborts. Forward-looking. (nit)
• video.ts:48-51 — content-length > MAX_VIDEO_BYTES pre-check now serves only as an early-exit optimization (mid-stream cap is the real safety). Fine to keep, but the comment at line 48 could note "early-exit only; mid-stream cap below is authoritative" so a future reader doesn't try to harden it. (nit)
Questions
• video.test.ts — net +0 lines vs R2 (additions: 0 in the R2→HEAD compare). The new collision helper added a test class, but the run()-body branches Via flagged are still uncovered. Was there a deliberate scope decision to defer integration tests, or is that a follow-up PR you'd take?
• The streamToFile implementation is good enough to be the shared primitive. Would you take an immediate follow-up that extracts both this and cloud/download.ts:downloadToFile into a single helper, or is the duplication acceptable until a third caller appears?
What I didn't verify
• Didn't re-run the new test suite locally — assumed CI green.
• Didn't trace whether file.destroy() interacts cleanly with the partial-file unlinkSync race on Windows (the EEXIST-vs-other-error branch at lines 96-103). Linux/macOS should be fine.
• Didn't audit the contentExtractor SVG-polarity changes against new failure cases beyond R2 — same code path.
Review by Rames D Jusso
jrusso1020
left a comment
There was a problem hiding this comment.
Approved on eada860. Clean R1→R2→R3 with the OOM-via-arrayBuffer + filename-collision concerns both resolved.
eada860 to
dda77c3
Compare
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Verified delta at HEAD dda77c3, R3 baseline was eada860.
R4 — narrow review of the 3 new structural logo signals in assetCataloger.ts (lines 28–33, 71–83, 119–122, 348–351) and their consumer in assetDownloader.ts:144–155.
Verdict: approve-with-concerns. Signals are well-scoped, deterministic, side-effect-free, and the blast radius is intentionally tiny — they OR into isLogo which only flips the SVG filename slug from svg-<hash> to logo-<hash>. Nothing rasterization-path or polarity-detection adjacent; this does NOT subsume Via's regex-based light-glyph detection (orthogonal axis: "is logo at all" vs. "light-on-dark polarity").
Signal-by-signal semantics:
inBanner(closest('header, nav, [role="banner"]')) — clean. FP: nav-bar avatars / hamburger icons. FN: legacy SPAs that don't use semantic header/nav. Acceptable for candidate semantics.inHomeLink— covers/,#,./, and origin-only via/^https?:\\/\\/[^/]+\\/?$/. Regex is correctly double-escaped for the template-string →page.evaluateboundary (verified —\\/in the .ts source becomes\/post-template-literal-processing, which is a valid escaped/in the eval'd regex literal). FN worth knowing: React-router<Link to="/">that renders without anhrefattribute, and<button onclick>-based home navigation.matchesTitleBrand— asymmetric assumption:(document.title || '').split(/[-|—]/)[0]takes the FIRST segment as the brand. Works for"HeyGen | Product"(brand-first, the convention). Fails for the equally-common"Page Title | HeyGen"(brand-last). On a brand-last site, this signal silently flips to false-positive territory — alt-text containing the page-title phrase wrongly counts as brand-match. Length guard (> 1 && < 30) bounds the damage but doesn't eliminate it. Not a blocker since it only affects a filename slug, but worth a one-line "we're betting on brand-first titles" comment so future-you knows what to revisit if logo classification ever leaks into more consequential code paths.
Concerns (non-blocking):
- No test coverage on the new signals.
assetCataloger.test.tsexists but adds nothing forinBanner/inHomeLink/matchesTitleBrand. These are DOM-evaluated inside apage.evaluate(\...`)` template string, so unit testing means a jsdom harness or Playwright fixture — non-trivial. At minimum, one positive + one negative for each signal would lock the semantics. Given the contained blast radius (slug naming only) this is a concern, not a blocker. - Merge-step boolean OR is correct but worth flagging: across both
add()(line 119) anddeduplicateSrcsetVariants()(line 348), any positive sample sets the field. Means if the SAME asset URL appears once in a banner and once in a footer, it's flagged as logo from the banner instance. That's the intended behavior and the comment at line 114 documents it — good.
Determinism / side-effects / types: all clean. No Date.now/Math.random/fetch in the cataloger path. Fields are ?: boolean optional, backwards-compat with existing manifest consumers preserved.
Green-light from me for Rames Jusso's stamp.
Review by Rames D Jusso
jrusso1020
left a comment
There was a problem hiding this comment.
Re-approved on dda77c3 after R4 delta verification.
For the hyperframes.dev website-to-video flow. Real-AI-test runs against
heygen.com, huly.io, and heygen-showcase surfaced two gaps: (1) capture's
logo / asset-captioning signals missed modern React/Tailwind builds; and
(2) there was no CLI surface to pull the videos the manifest references.
New command:
• `hyperframes capture-video <project>` — on-demand downloader for
entries in capture/extracted/video-manifest.json. Capture writes the
manifest + preview PNGs but skips the mp4s; this pulls one entry by
`--index N` (matched against the entry's `index` field, NOT array
offset — gaps are possible when a preview screenshot fails). SSRF-safe
via safeFetch, 250 MB cap, content-type whitelist, race-free
exclusive-create write. Layout-aware (handles both standalone capture
and W2H project layouts).
Capture pipeline fixes:
• Structural logo signals (assetCataloger + tokenExtractor): inBanner /
inHomeLink / matchesTitleBrand. Class-substring alone caught 0/32 SVGs
on heygen.com — modern builds don't put 'logo' / 'brand' in any
className.
• Content-hash SVG slugs (assetDownloader): `svg-<8char-sha1>.svg` —
label-derived slugs mis-attributed partner-logo carousels
(heygen-logo.svg actually contained Google, hubspot-logo.svg contained
Trivago, etc.). Content-hash names are invariant by construction.
• SVG → PNG rasterization before Gemini Vision (contentExtractor): the
raw-SVG-as-text path was hallucinating wordmarks (VIVIENNE for HubSpot,
'wrestling' for Workday). Adds polarity detection so a white-glyph SVG
flattened to a blank PNG gets inverted before captioning. LOGO tag in
asset-descriptions.md when structural signals fire (independent of
Gemini key presence).
• Double-escape \/ inside the page.evaluate template literal in
assetCataloger + tokenExtractor: the original `/^https?:\/\/.../`
collapsed to `/` mid-template and threw `Unexpected token ^`. Capture
was 100% blocked on this until the escape was fixed.
• `asset-descriptions.md` header branches on Gemini-key presence with
an explicit 'Vision OFF — catalog-derived descriptions' warning.
New lint rule:
• `lintMissingLocalAsset` (cli/utils/lintProject): scans <video> / <img>
/ <source> src for local files that don't exist in the project.
Empirically the most common sub-agent mistake across multi-URL runs
(~5+ per run). Uses `resolveExistingLocalAsset` so the existence check
matches the bundler's notion of 'resolves'. Masks comment / style /
script ranges before scanning so a literal `<img src=missing.png>`
inside a tutorial comment isn't reported.
Tests: 17 new for capture-video (safeFilename decoding/sanitization,
VIDEO_CONTENT_TYPE_RE accept/reject, pickManifestEntry index-field lookup
with gaps, URL-mismatch + bad-index rejection, --index over --url
priority); 70 cases under lintProject.test.ts covering the new rule and
existing rules.
Sibling PRs in this stack:
• #PR_A1 — fix(producer): __dirname ESM banner shim
• #PR_A2 — fix(core/lint): findRootTag masks comment/style/script
dda77c3 to
6a024a3
Compare
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
R5 delta review — verified at HEAD 6a024a3, R4 baseline was dda77c3.
Verdict: approve-delta with one minor caveat
R4 concern #1 — brand-first title heuristic — ✅ fixed
packages/cli/src/capture/assetCataloger.ts:79-90 — split widened to /[-|—:]/ and iterates all titleParts instead of [0]. Brand-last ("Ideas - HeyGen") and colon-separated ("Vercel: Build") titles now match. Length guard 1 < part.length < 30 filters out empty/runaway segments. Good fix.
Minor caveat (not a blocker): the desc.indexOf(part) substring check is still noisy — a non-brand segment like "Ideas" will match an alt containing "ideas". Since this is a boolean signal OR-merged across samples and consumed by a downstream classifier (not a hard gate), false positives are bounded. Worth a one-line comment that this is a recall-biased signal, not a precision one — or leave as-is and re-tune if logo-detection precision suffers in the wild.
R4 concern #2 — tests for new logo signals — ⚠️ not added
New tests landed (commands/capture/video.test.ts, utils/lintProject.test.ts) but cover the capture-video fetcher + lintProject, not the new inBanner / inHomeLink / matchesTitleBrand signals in assetCataloger.ts. The signal logic runs inside page.evaluate(), so unit-testing it requires a DOM harness that may not exist in this package — acceptable trade-off for a heuristic-tuning PR, but worth tracking as a follow-up (extract the signal computation into a pure helper testable against a fixture DOM).
New code from this push — no concerns
assetCataloger.ts:122-124and:357-360— boolean-signals merge with "any positive wins" semantics is correct (text fields keep first-occurrence, booleans union across srcset variants).- New
capture/video.ts+ tests are scope-additive to the PR's main capture-video feature, not the logo-detection workstream.
Green-light to stamp.
Review by Rames D Jusso
jrusso1020
left a comment
There was a problem hiding this comment.
Re-approved on 6a024a3 after R5 delta verification.
What
hyperframes capture video <project>subcommand — on-demand mp4 fetch from the capture manifest.lintMissingLocalAssetrule — flags<video>/<img>/<source>srcpointing at missing local files.Why
heygen-logo.svgon main today contains Google's wordmark — label-derived slugs picked the wrong DOM context. The newwebsite-to-videoskill inherits this and would ship off-brand videos. Same root for: class-substringisLogocatching 0/32 SVGs on heygen.com, Vision hallucinating wordmarks from raw SVG path text, and capture being 100% blocked by a\/regex collapse inpage.evaluate.How
capture video: nested undercaptureper the team'sauth/cloud/lambdasubcommand pattern.safeFetch(SSRF + redirect-hop revalidation), 250 MB cap, content-type whitelist,flag: "wx"exclusive-create write, lookup byentry.index(not array offset — manifest can have gaps).svg-<8-char-sha1>.svg) so filename can never drift from content.sharprasterizes SVGs to PNG before Vision; polarity detection flips white-glyph SVGs to a dark background.lintMissingLocalAssetmasks<!-- -->/<style>/<script>ranges before scanning so commented-out examples aren't false positives.Test plan
bun run --filter @hyperframes/cli typecheck+testpass (17capture/videocases + 51lintProjectcases)hyperframes capture video <dir> --list+--index 0downloads mp4 from heygen + notion + webm from airbnb — all valid perfile(1)