Context
PR #3 ships streaming lookup over the 1GB padrón TXT (~1s per RUC; a batch is resolved in a single scan). That is fine for ad-hoc use.
If batches regularly exceed 1000 RUCs, sub-millisecond lookups via sqlite would matter. Bun ships a built-in sqlite module (`bun:sqlite`), so no extra dependency is needed.
Scope
- `sunat padron index` — one-time index build after `padron sync`
- Schema: `(ruc TEXT PRIMARY KEY, razon_social, estado, condicion, ubigeo, ...)`
- `lookupRuc` and `lookupRucBatch` use sqlite when index exists, fall back to streaming when not
- `padron status` reports index presence + size
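The fallback behaviour in the scope list could be sketched roughly as below. This is a minimal sketch, not the actual implementation: `PadronRecord`, `PadronBackend`, `makeLookups`, and `openIndex` are hypothetical names introduced for illustration, and the real signatures of `lookupRuc` / `lookupRucBatch` in `padron-local.ts` may differ.

```typescript
// Hypothetical record shape, mirroring the proposed schema columns.
interface PadronRecord {
  ruc: string;
  razonSocial: string;
  estado: string;
  condicion: string;
  ubigeo: string;
}

// Hypothetical interface that both the sqlite index and the
// streaming scan would implement.
interface PadronBackend {
  lookupBatch(rucs: string[]): Promise<Map<string, PadronRecord>>;
}

// openIndex returns a backend when the index file exists, else null.
// It is called on every lookup so an index built later is picked up.
function makeLookups(
  openIndex: () => PadronBackend | null,
  streaming: PadronBackend,
) {
  async function lookupRucBatch(
    rucs: string[],
  ): Promise<Map<string, PadronRecord>> {
    // Prefer the index; fall back to the streaming scan when absent.
    const backend = openIndex() ?? streaming;
    return backend.lookupBatch(rucs);
  }

  async function lookupRuc(ruc: string): Promise<PadronRecord | null> {
    const found = await lookupRucBatch([ruc]);
    return found.get(ruc) ?? null;
  }

  return { lookupRuc, lookupRucBatch };
}
```

Keeping both backends behind one interface means callers never need to know whether the index exists, and the streaming path stays usable as-is.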
Trade-offs
- Disk: ~1GB extra, since the index sits alongside the TXT. Alternative: drop the TXT after indexing and keep only the sqlite file.
- Index build: ~30s one-time cost after each `padron sync`, then sub-ms lookups
- Schema migrations are needed whenever SUNAT adds columns to the padrón
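One way to soften the schema-migration cost is to map columns by header name instead of by position, so a column SUNAT adds is detected rather than silently shifting the data. The sketch below assumes the TXT is delimiter-separated with a header row, which this issue does not confirm. The header names and the helpers `planColumns` and `parseRow` are hypothetical placeholders.

```typescript
// Columns the index knows about (from the proposed schema).
const KNOWN_COLUMNS = ["ruc", "razon_social", "estado", "condicion", "ubigeo"];

interface ColumnPlan {
  positions: Record<string, number>; // known column -> cell position in a row
  unknown: string[]; // header names the index has no column for
  missing: string[]; // known columns absent from the header
}

// Build a plan from the header line. A non-empty `unknown` list signals
// that the source gained columns and a migration may be due.
function planColumns(headerLine: string, delimiter = "|"): ColumnPlan {
  const names = headerLine.split(delimiter).map((n) => n.trim().toLowerCase());
  const positions: Record<string, number> = {};
  const unknown: string[] = [];
  names.forEach((name, i) => {
    if (name === "") return; // e.g. the cell after a trailing delimiter
    if (KNOWN_COLUMNS.includes(name)) positions[name] = i;
    else unknown.push(name);
  });
  const missing = KNOWN_COLUMNS.filter((c) => !(c in positions));
  return { positions, unknown, missing };
}

// Extract only the known columns from a data line, using the plan.
function parseRow(
  line: string,
  plan: ColumnPlan,
  delimiter = "|",
): Record<string, string> {
  const cells = line.split(delimiter);
  const row: Record<string, string> = {};
  for (const [col, i] of Object.entries(plan.positions)) {
    row[col] = (cells[i] ?? "").trim();
  }
  return row;
}
```

With this shape, `padron index` could warn when `unknown` is non-empty and refuse to build when `missing` is non-empty, instead of writing misaligned rows.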
Success criteria
- 1000 RUC batch goes from ~1s to <50ms
- LIMITATIONS.md → padrón section reflects new option
Why P2
Streaming at ~1s per lookup is good enough for the current single-user UX. The index only matters when an agent loops over thousands of RUCs one at a time (`for ruc in 5000_rucs`).
References
- src/sunat-rest/padron-local.ts → current streaming impl