agentmemory import-jsonl CLI silently caps at 200 files; no flag to override; no warning when truncated #203

@bloodcarter

Description

Summary

agentmemory import-jsonl (the CLI users actually run, per the README's "Already have older Claude Code JSONL transcripts you want to bring in?" section) walks the target directory with a hard-coded default limit = 200. The CLI doesn't expose a --max-files flag, the help text doesn't mention any cap, and the post-import summary doesn't warn when files were skipped. On any non-trivial ~/.claude/projects tree the user gets a silent ~10% import.

The underlying API endpoint already accepts a maxFiles field — the cap is purely a CLI/UX limitation.

Repro

# Tree with 2167 JSONL files
find ~/.claude/projects -name '*.jsonl' | wc -l
# → 2167

agentmemory import-jsonl
# CLI shows "scanning files…" then "View at http://localhost:3113 → Replay tab"
# No file count, no warning.

curl -s localhost:3111/agentmemory/sessions?limit=10000 | jq '.sessions | length'
# → 199    (≈ the 200 cap, after dedupe)

Workaround (only works if you know the API exists):

curl -s -X POST http://localhost:3111/agentmemory/replay/import-jsonl \
  -H "Content-Type: application/json" \
  -d '{"path": "~/.claude/projects", "maxFiles": 5000}'
# → success: true, but the iii function timed out at 30s on a large tree

So the practical override path for any large tree is "iterate per project subdirectory and call the API directly" — not something a normal CLI user will discover.
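For reference, that per-subdirectory iteration can be sketched as a small Node script. This is hypothetical: it assumes only the endpoint and payload shape shown in the curl workaround above, plus Node 18+ for the global `fetch`.

```typescript
import { readdirSync } from "node:fs";
import { join } from "node:path";

// List immediate subdirectories of a root (pure helper, easy to test).
function subdirsOf(root: string): string[] {
  return readdirSync(root, { withFileTypes: true })
    .filter((e) => e.isDirectory())
    .map((e) => join(root, e.name));
}

// POST one import request per project subdirectory, so no single request
// runs into the 30s function timeout. Endpoint/payload as in the curl above.
async function importAll(root: string): Promise<void> {
  for (const dir of subdirsOf(root)) {
    const res = await fetch("http://localhost:3111/agentmemory/replay/import-jsonl", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ path: dir, maxFiles: 5000 }),
    });
    console.log(dir, res.ok ? "imported" : `failed (${res.status})`);
  }
}

// Network call is opt-in, so the helpers can be imported without side effects.
if (process.env.RUN_IMPORT) {
  importAll(join(process.env.HOME ?? "", ".claude", "projects"));
}
```

Run it with something like `RUN_IMPORT=1 npx tsx import-all.ts`; keeping the loop sequential is deliberate, since parallel imports would only make the timeout more likely.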

Root cause

src/functions/replay.ts:

async function findJsonlFiles(root: string, limit = 200): Promise<string[]> {
  // ...
  if (out.length >= limit) return;
  // ...
}
// ...
files = await findJsonlFiles(abs, data.maxFiles || 200);

The 200 default is an arbitrary safety cap — sensible for an HTTP endpoint with a 30s function timeout, but the CLI inherits it without ever exposing it or warning when it bites.
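One way to remove the "silent" part is to have the walker keep counting past the cap, so callers know how many files were skipped. This sync sketch is hypothetical (the real findJsonlFiles in src/functions/replay.ts is async and structured differently); it only illustrates returning a found total alongside the capped list:

```typescript
import { readdirSync } from "node:fs";
import { join } from "node:path";

// Hypothetical rewrite: count every .jsonl file seen, even past the cap,
// so the caller can report how many were skipped.
function findJsonlFiles(root: string, limit = 200): { files: string[]; found: number } {
  const files: string[] = [];
  let found = 0;
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const p = join(dir, entry.name);
      if (entry.isDirectory()) walk(p);
      else if (entry.name.endsWith(".jsonl")) {
        found++;                                  // count every match…
        if (files.length < limit) files.push(p);  // …but cap what we return
      }
    }
  };
  walk(root);
  return { files, found };
}
```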

Suggested fix

Three small changes that compound:

  1. Expose the flag in the CLI. Add --max-files <N> to agentmemory import-jsonl (default e.g. 5000 to match real-world ~/.claude/projects sizes). Update the help text in src/cli.ts.
  2. Warn when the cap is hit. In the import summary, if files.length === limit, print: ⚠ Hit the ${limit}-file cap; ${found - imported} files were skipped. Re-run with --max-files=… or batch by subdirectory.
  3. Document the per-subdir batching pattern. A 200-file cap does make sense given the iii-engine 30s function timeout — under load a single 2000-file pass times out. The README's import section could mention "for very large trees, iterate by subdirectory" with a one-liner. Or, even better, have the CLI auto-batch when it detects a tree larger than --max-files.
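Changes 1 and 2 can be sketched with two small helpers. parseMaxFiles and capWarning are hypothetical names; the real wiring in src/cli.ts will differ.

```typescript
// Change 1 (sketch): parse --max-files <N> from argv, defaulting to 5000.
function parseMaxFiles(argv: string[], fallback = 5000): number {
  const i = argv.indexOf("--max-files");
  const n = i >= 0 ? Number(argv[i + 1]) : NaN;
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

// Change 2 (sketch): warn only when the cap actually bit.
// `found` = total .jsonl files seen; `imported` = files actually taken.
function capWarning(found: number, imported: number, limit: number): string | null {
  if (imported < limit || found <= imported) return null; // cap never bit
  return (
    `⚠ Hit the ${limit}-file cap; ${found - imported} files were skipped. ` +
    `Re-run with --max-files=… or batch by subdirectory.`
  );
}
```

The `found <= imported` guard keeps the warning quiet when the tree fits exactly at the cap, so it only fires when files really were dropped.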

Sibling bug

This compounds badly with the parser-sessionId bug (also filed today). Each re-run with a different --max-files creates new phantom sessions instead of merging, so users who hit the silent cap and re-run end up with wildly inflated session counts. Together they turn import-jsonl from a one-shot operation into "1/10th imported, the rest invisible, and any retry doubles the noise."
