perf: push --limit and --since filtering into the storage load path #25

@codeprakhar25

Description

What

load_all_traces() currently loads every trace into memory on each invocation. The --limit N flag on agentdiff list only truncates the displayed output after that full load. Push the --limit and --since filters into the storage layer so that only the traces actually needed are read.

Why

At 100k+ traces (enterprise scale, 1+ year of data), agentdiff list and agentdiff stats can be expected to take seconds and consume hundreds of MB of memory. Every subcommand goes through this load path, so the cost is paid on every run.

How

With per-UUID file storage:

  • --limit N: list only the N most recent files in the trace tree (sorted by the timestamp prefix of each UUID)
  • --since <date>: read only the shards whose date prefix falls after the given date
  • Change the load_all_traces() signature to accept optional limit and since parameters, and short-circuit as soon as they are satisfied
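
A rough sketch of what the filtered load path could look like, written in Python purely for illustration (the issue does not say which language the storage layer uses). Everything here is an assumption: the layout `<root>/<YYYY-MM-DD>/<uuid>.json`, the names `iter_trace_paths` and `load_traces`, and the premise that UUIDs are time-ordered so a reverse lexicographic sort gives newest-first order. It also treats `since` as inclusive of the given date, which may not match the intended --since semantics.

```python
import json
from datetime import date
from pathlib import Path
from typing import Iterator, Optional


def iter_trace_paths(root: Path, since: Optional[date] = None) -> Iterator[Path]:
    """Yield trace files newest first, skipping shards older than `since`.

    Assumed (hypothetical) layout: <root>/<YYYY-MM-DD>/<uuid>.json, where
    ISO dates and time-ordered UUIDs both sort lexicographically.
    """
    shards = sorted(
        (p for p in root.iterdir() if p.is_dir()),
        key=lambda p: p.name,
        reverse=True,
    )
    for shard in shards:
        if since is not None and shard.name < since.isoformat():
            break  # every remaining shard is older, so stop listing
        yield from sorted(shard.glob("*.json"), key=lambda p: p.name, reverse=True)


def load_traces(
    root: Path,
    limit: Optional[int] = None,
    since: Optional[date] = None,
) -> list[dict]:
    """Read at most `limit` traces, newest first, from shards on or after `since`."""
    traces: list[dict] = []
    for path in iter_trace_paths(root, since):
        if limit is not None and len(traces) >= limit:
            break  # short-circuit: no further files are opened
        traces.append(json.loads(path.read_text()))
    return traces
```

With this shape, `load_traces(root, limit=20)` opens only 20 files regardless of how many traces exist, although it still lists the directory entries of the shards it visits.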

Effort

~45 min. Depends on the per-UUID file storage refactor; if that is not done yet, the same effect can be approximated with an early exit on sorted JSONL.
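
One possible reading of the JSONL fallback, again as a Python sketch under stated assumptions: the file is append-only with the oldest trace first, and each line is a JSON object with a `timestamp` field holding an ISO 8601 string that compares correctly as text. The field name and the helpers `iter_lines_reversed` and `load_recent` are hypothetical. The file is read backwards in chunks so the loop can stop as soon as the limit is reached or a trace older than `since` is seen.

```python
import json
import os
from typing import Iterator, Optional


def iter_lines_reversed(path: str, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the non-empty lines of a file from last to first."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        tail = b""  # start of a line whose beginning lies in an earlier chunk
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            lines = (f.read(step) + tail).split(b"\n")
            tail = lines[0]  # possibly partial; completed by the next chunk
            for line in reversed(lines[1:]):
                if line.strip():
                    yield line
        if tail.strip():
            yield tail


def load_recent(
    path: str,
    limit: Optional[int] = None,
    since: Optional[str] = None,
) -> list[dict]:
    """Return traces newest first, stopping early on `limit` or `since`."""
    traces: list[dict] = []
    for raw in iter_lines_reversed(path):
        if limit is not None and len(traces) >= limit:
            break
        trace = json.loads(raw)
        # Assumed field: "timestamp" as an ISO 8601 string.
        if since is not None and trace["timestamp"] < since:
            break  # file is sorted, so everything before this line is older
        traces.append(trace)
    return traces
```

The early exit only holds if the file really is sorted by timestamp; with out-of-order appends, the `since` check would have to skip the line rather than stop.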
