Regenerate a --no-cache directory when a source .jsonl grows (#254)#256
Conversation
The single-file source-mtime freshness check (#221) was scoped to `input_path.is_file()`, so a directory run WITHOUT a cache (e.g. `--no-cache`, where `cache_manager` is None) fell through to a version-only skip: with no DB tracking per-source mtimes and no file mtime to compare, `should_regenerate` rode `is_outdated()` alone, so a session that grew between runs served stale `combined_transcripts.html`. (The cached directory path is unaffected — it tracks source mtimes via `is_html_stale`.) Extend the no-cache fallback's freshness check to directory sources: regenerate when the newest source `.jsonl` in the directory is newer than the output — the directory analogue of the single-file `input.mtime > output.mtime`, persistence-free (the output's own mtime is the basis). Non-recursive glob, matching how directory mode discovers sessions; shares the cache's mtime granularity limit (a `<stem>/subagents/` transcript growing without its top-level parent being touched isn't caught — rare, noted in the comment). Flows into the combined regeneration decision only, so the RegenerationReport signals stay accurate ("Successfully combined" prints because it truly did). Test: test_grown_session_regenerates_without_cache — grow a session in a dir, run `--no-cache` twice, assert the 2nd regenerates (coarse-FS-safe via os.utime). RED before (served stale), GREEN after. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe fallback freshness check in convert_jsonl_to was updated to support directory inputs by comparing output mtime against the newest top-level ChangesDirectory Staleness Fix
Estimated code review effort: 2 (Simple) | ~10 minutes Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant Converter as convert_jsonl_to
participant FS as Filesystem
User->>CLI: run with --no-cache on directory
CLI->>Converter: convert_jsonl_to(directory)
Converter->>FS: get output mtime
Converter->>FS: get max mtime of top-level *.jsonl files
Converter->>Converter: compute source_is_newer
Converter->>Converter: compute should_regenerate
alt should_regenerate is True
Converter->>FS: regenerate combined_transcripts.html
else
Converter-->>CLI: skip regeneration
end
Related Issues: Suggested labels: bug, converter Suggested reviewers: daaain 🐰 A session grew, the cache stayed blind, 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Fixes #254.
Problem
The single-file source-mtime freshness check added in #221 was scoped to
input_path.is_file(). A directory run without a cache (e.g.--no-cache, wherecache_managerisNone) falls through to the sameno-cache fallback branch, but with no file mtime to compare and no DB tracking
per-source mtimes,
should_regeneraterodeis_outdated()(embedded toolversion) alone. So a session
.jsonlthat grew between runs with an unchangedtool version was wrongly reported "current" → stale
combined_transcripts.html.The cached directory path is unaffected — it tracks source mtimes via
is_html_stale.Fix
Extend the no-cache fallback's freshness check to directory sources: regenerate
when the newest source
.jsonlin the directory is newer than the output —the directory analogue of the single-file
input.mtime > output.mtime,persistence-free (the output's own mtime is the basis).
glob("*.jsonl"), matching how directory mode discoverssessions.
<stem>/subagents/transcript growing without its top-level parent beingtouched isn't caught (rare — the parent session records the spawning
tool_useand usually grows too). Noted in the code comment.RegenerationReportsignals stay accurate ("Successfully combined" prints because it truly did).
Test
test_grown_session_regenerates_without_cache— grow a session in a directory,run
--no-cachetwice, assert the 2nd run regenerates (new content present),coarse-FS-safe via
os.utime. Verified RED before (served stale) andGREEN after; also confirmed with a real CLI repro.
🤖 Generated with Claude Code
Summary by CodeRabbit