
perf(geotiff)!: block-aligned LRU header cache; lazy tile metadata#529

Merged
kylebarron merged 12 commits into main from kyle/cog-block-cache
May 12, 2026
@kylebarron (Member) commented May 12, 2026

This provides a huge latency improvement for rendering large COGs.

Before:

61 MB of header fetching!

Screen.Recording.2026-05-12.at.3.27.01.PM.mov

After:

Roughly 3 seconds from a cold cache, with much less metadata fetched. (4x more tiles are being fetched in this view than in the Before recording, because #513 increased tile count by 4x.)

Screen.Recording.2026-05-12.at.3.22.28.PM.mov

Change list

  • Don't prefetch big tags for tile byte counts and tile offsets
  • Use block-aligned cache only for header metadata, split from tile requests
  • Add a debug? param to GeoTIFF.open that logs debug info about the requests made

Closes #528, Closes #501, Closes #294


Summary

Reworks COG header loading after live testing showed PR #509's approach (eager bulk-prefetch of TileOffsets/TileByteCounts at open) downloaded tens of MB before any tile could render on huge COGs.

This branch goes in the opposite direction — lazy everything, with a fixed-block LRU cache to amortize per-entry reads. Matches geotiff.js's BlockedSource architecture.

  • Drop eager TileOffsets/TileByteCounts prefetch. cogeotiff already supports lazy per-entry reads; the block cache (below) makes them cheap.
  • Disable cogeotiff's GDAL leader-bytes path in GeoTIFF.open (tiff.options = undefined). That optimization assumes tiles fit in one chunk and is harmful when they don't — it pollutes the header cache with image-data bytes.
  • GeoTIFF.fromUrl uses 64 KiB SourceChunk + 8 MiB SourceCache (matches geotiff.js's defaults). LRU-ish eviction keeps memory bounded.
  • Breaking: fromUrl drops the prefetch option (still on GeoTIFF.open for direct callers).

For the Vermont 200 GB COG with 61 MB header: previous PR downloaded ~60 MB upfront; this design opens with ~3 small reads and lazy-loads per-IFD metadata as tiles are actually requested.
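To make the block-aligned idea concrete, here is a minimal sketch of how a small header read maps onto fixed 64 KiB blocks. `alignToBlocks` and `BlockRange` are hypothetical names for illustration, not the API of chunkd or cogeotiff; only the 64 KiB block size comes from this PR. Clamping the last block to the end of the file is left out.

```typescript
// Hypothetical helper for illustration; not part of chunkd or cogeotiff.
const BLOCK_SIZE = 64 * 1024; // 64 KiB, the block size used in this PR

interface BlockRange {
  firstBlock: number; // index of the first block touched
  lastBlock: number; // index of the last block touched
  offset: number; // byte offset of the aligned request
  length: number; // byte length of the aligned request
}

// Expand an arbitrary byte range to the whole blocks that contain it.
function alignToBlocks(
  offset: number,
  length: number,
  blockSize: number = BLOCK_SIZE,
): BlockRange {
  if (offset < 0 || length <= 0) {
    throw new RangeError("offset must be >= 0 and length must be > 0");
  }
  const firstBlock = Math.floor(offset / blockSize);
  const lastBlock = Math.floor((offset + length - 1) / blockSize);
  return {
    firstBlock,
    lastBlock,
    offset: firstBlock * blockSize,
    length: (lastBlock - firstBlock + 1) * blockSize,
  };
}
```

Under these assumptions an 8-byte read at offset 70,000 becomes one request for block 1, and a read that straddles a block boundary becomes a request for two whole blocks.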

Closes #500. Supersedes #509 (which should be closed).

Spec: dev-docs/specs/2026-05-12-cog-block-cache-design.md

Test plan

  • pnpm --filter @developmentseed/geotiff typecheck clean
  • Full vitest suite passes (69/69, excluding pre-existing integration-rasterio fixture failures unrelated to this branch)
  • New block-cache.test.ts asserts: every underlying fetch is 64 KiB-aligned; tiff.options is undefined post-open; getTileSize(0) triggers at most 2 chunk fetches
  • Existing fromurl.test.ts (regression test for issue #524, "SourceError: Request outside of bounds") still passes
  • Manual: open the Vermont 200 GB COG in the example, confirm browser network panel shows small initial reads (~few hundred KB total) instead of tens of MB
  • Manual: zoom around the Vermont COG, confirm tile rendering works and per-tile metadata reads are served from cached blocks after warmup
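The alignment assertion in the test plan could be sketched roughly as follows. This is not the content of block-cache.test.ts; `FetchFn`, `recordFetches`, and `isBlockAligned` are hypothetical names, and the byte source is reduced to a plain function so the sketch stays self-contained.

```typescript
// Hypothetical stand-in for a byte source: fetch `length` bytes at `offset`.
type FetchFn = (offset: number, length: number) => Promise<ArrayBuffer>;

interface RecordedFetch {
  offset: number;
  length: number;
}

// Wrap a fetch function so every underlying request is recorded.
function recordFetches(inner: FetchFn): { fetch: FetchFn; calls: RecordedFetch[] } {
  const calls: RecordedFetch[] = [];
  const fetch: FetchFn = (offset, length) => {
    calls.push({ offset, length });
    return inner(offset, length);
  };
  return { fetch, calls };
}

// A request is block-aligned if it starts on a block boundary and ends
// either on a block boundary or at the end of the file (last partial block).
function isBlockAligned(call: RecordedFetch, blockSize: number, fileSize: number): boolean {
  const end = call.offset + call.length;
  const startsOnBlock = call.offset % blockSize === 0;
  const endsOnBlockOrEof = end % blockSize === 0 || end === fileSize;
  return startsOnBlock && endsOnBlockOrEof;
}
```

A test would place the recorder beneath the chunking layer, open the file, and then check `calls.every((c) => isBlockAligned(c, 64 * 1024, fileSize))`.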

🤖 Generated with Claude Code

kylebarron and others added 5 commits May 12, 2026 13:23
Supersedes the earlier read-ahead cache design. Uses chunkd's existing
SourceChunk + SourceCache (64 KiB blocks, 8 MiB LRU) instead of a
custom sequential cache; drops the eager TileOffsets/TileByteCounts
prefetch in favor of cogeotiff's lazy per-entry reads through the
block cache; disables cogeotiff's GDAL leader-bytes path so the
header cache stays free of image-data bytes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cogeotiff lazily fetches individual entries from these arrays via the
header source on first access. With a block-aligned header cache (next
commit), adjacent per-tile lookups hit the same 64 KiB block. The eager
bulk fetch downloaded tens of MB on huge COGs (e.g. Vermont) before any
tile could render, all of which was wasted work when the initial view
was at an overview level that didn't use the primary image's arrays.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
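A rough illustration of why adjacent per-tile lookups tend to share a block: TileOffsets entries are 4 bytes (LONG) in classic TIFF and 8 bytes (LONG8) in BigTIFF, so one 64 KiB block holds thousands of them. The helper names and the example array offset below are hypothetical.

```typescript
const BLOCK_SIZE = 64 * 1024; // 64 KiB, as used in this PR

// How many array entries fit in a single block.
function entriesPerBlock(bytesPerEntry: number, blockSize: number = BLOCK_SIZE): number {
  return Math.floor(blockSize / bytesPerEntry);
}

// Index of the block holding entry `index` of an array that starts at
// byte `arrayOffset` in the file.
function blockOfEntry(
  arrayOffset: number,
  index: number,
  bytesPerEntry: number,
  blockSize: number = BLOCK_SIZE,
): number {
  return Math.floor((arrayOffset + index * bytesPerEntry) / blockSize);
}

// Sixteen consecutive 8-byte entries of an array assumed to start at byte 1,000,000.
const blocksTouched = new Set<number>();
for (let i = 0; i < 16; i++) {
  blocksTouched.add(blockOfEntry(1_000_000, i, 8));
}
// blocksTouched now holds a single block index, so one fetch serves all 16 lookups.
```

With 8-byte entries a block covers 8,192 tiles' worth of offsets; with 4-byte entries, 16,384.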
…ays metadata-only

cogeotiff auto-detects GDAL ghost option BLOCK_LEADER=SIZE_AS_UINT4 at
Tiff.create() time. When set, TiffImage.getTileSize fetches 4 bytes
near the tile data instead of reading TileByteCounts. The intent is
that the fetch's chunk also contains the tile, but tiles are often
larger than the chunk size, so the optimization pollutes the header
cache with image-data chunks and evicts metadata. cogeotiff core only
reads tiff.options here; nulling it after creation is safe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
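The condition under which the leader-bytes read helps can be sketched as a simple check: the block fetched for the 4-byte size prefix is only useful if the tile also ends inside that block. `leaderReadCoversTile` is a hypothetical function written for this note, not cogeotiff code, and the offsets in the usage comment are made up.

```typescript
const BLOCK_SIZE = 64 * 1024; // 64 KiB header-cache block size from this PR
const LEADER_BYTES = 4; // BLOCK_LEADER=SIZE_AS_UINT4 stores a 4-byte size just before each tile

// Returns true when the block-aligned fetch triggered by reading the
// leader also contains the entire tile payload.
function leaderReadCoversTile(
  tileOffset: number,
  tileByteCount: number,
  blockSize: number = BLOCK_SIZE,
): boolean {
  if (tileOffset < LEADER_BYTES) {
    throw new RangeError("tile offset leaves no room for a leader");
  }
  // Block containing the last leader byte, i.e. the byte just before the tile.
  const block = Math.floor((tileOffset - 1) / blockSize);
  const blockEnd = (block + 1) * blockSize;
  return tileOffset + tileByteCount <= blockEnd;
}

// A 10,000-byte tile at offset 1,000,000 fits inside the leader's block,
// but a 256*256*3 = 196,608-byte tile at the same offset does not, so the
// cached block would hold a partial tile that no header lookup needs.
```

In the second case the block still enters the cache, which is the pollution the commit message describes.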
Replaces the per-call prefetch tuning with a fixed-block cache matching
geotiff.js's BlockedSource. cogeotiff's lazy per-entry reads now hit a
shared 64 KiB block when adjacent (the typical case for tile-offset
arrays). LRU eviction keeps memory bounded at 8 MiB by default.

Breaking: drops the `prefetch` option on `GeoTIFF.fromUrl`. `prefetch`
remains available on `GeoTIFF.open` for direct callers that need to
control cogeotiff's defaultReadSize.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
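For readers unfamiliar with the pattern, a byte-budgeted LRU over fixed blocks can be sketched in a few lines. This is an illustrative `BlockLru` class written for this note; it is not chunkd's SourceCache, whose eviction policy may differ in detail.

```typescript
// Illustrative only: a least-recently-used cache of blocks with a byte budget.
class BlockLru {
  private readonly blocks = new Map<number, Uint8Array>();
  private readonly maxBytes: number;
  private bytes = 0;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  has(blockIndex: number): boolean {
    return this.blocks.has(blockIndex);
  }

  get(blockIndex: number): Uint8Array | undefined {
    const block = this.blocks.get(blockIndex);
    if (block === undefined) return undefined;
    // Map iterates in insertion order, so re-inserting marks the block
    // as most recently used.
    this.blocks.delete(blockIndex);
    this.blocks.set(blockIndex, block);
    return block;
  }

  set(blockIndex: number, block: Uint8Array): void {
    const existing = this.blocks.get(blockIndex);
    if (existing !== undefined) {
      this.bytes -= existing.byteLength;
      this.blocks.delete(blockIndex);
    }
    this.blocks.set(blockIndex, block);
    this.bytes += block.byteLength;
    // Evict from the least recently used end until back under budget.
    while (this.bytes > this.maxBytes && this.blocks.size > 0) {
      const oldest = this.blocks.keys().next().value as number;
      const evicted = this.blocks.get(oldest) as Uint8Array;
      this.blocks.delete(oldest);
      this.bytes -= evicted.byteLength;
    }
  }
}
```

With an 8 MiB budget and 64 KiB blocks, such a cache would hold at most 128 blocks at a time.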
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The original design (the sequential exponential read-ahead cache, then a frozen-after-open variant) tried to optimize *steady-state* tile rendering by bulk-loading `TileOffsets` / `TileByteCounts` arrays for each IFD. That moved the cost to *open time*. On a real 200 GB Vermont COG, that's tens of MB downloaded before any tile renders — even though the initial view is at an overview level whose primary-image arrays are never used.

geotiff.js takes the opposite approach. Each `fromUrl` call fetches just 1024 bytes (header + first IFD pointer). `getImage(i)` reads only that IFD's entries; tile-array values are wrapped in a `DeferredArray` that holds only their file offset + count. Per-tile reads fetch a single 4–8 byte entry through `BlockedSource`, a fixed-block LRU that coalesces adjacent entries into one block. The block cache lives inside the source layer; cogeotiff's lazy per-entry reads benefit from it automatically.
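The deferred per-entry read described above can be sketched as follows. This is not geotiff.js's `DeferredArray` or cogeotiff's implementation; `LazyArrayRef`, `entryByteOffset`, `decodeEntry`, and `readEntry` are hypothetical names, and the byte source is reduced to a plain fetch function that is assumed to sit on top of a block cache.

```typescript
// Hypothetical byte source: fetch `length` bytes at `offset`.
type FetchFn = (offset: number, length: number) => Promise<ArrayBuffer>;

// What a deferred array keeps in memory: where the values live, not the values.
interface LazyArrayRef {
  arrayOffset: number; // file offset of the first entry
  count: number; // number of entries
  bytesPerEntry: 4 | 8; // 4 for classic TIFF LONG, 8 for BigTIFF LONG8
  littleEndian: boolean;
}

// File offset of a single entry, with bounds checking.
function entryByteOffset(ref: LazyArrayRef, index: number): number {
  if (!Number.isInteger(index) || index < 0 || index >= ref.count) {
    throw new RangeError(`entry index ${index} out of range`);
  }
  return ref.arrayOffset + index * ref.bytesPerEntry;
}

// Decode one fetched entry into a number.
function decodeEntry(bytes: ArrayBuffer, ref: LazyArrayRef): number {
  const view = new DataView(bytes);
  return ref.bytesPerEntry === 8
    ? Number(view.getBigUint64(0, ref.littleEndian))
    : view.getUint32(0, ref.littleEndian);
}

// Read exactly one entry; a block cache below `fetch` turns neighbouring
// reads into cache hits.
async function readEntry(fetch: FetchFn, ref: LazyArrayRef, index: number): Promise<number> {
  const offset = entryByteOffset(ref, index);
  return decodeEntry(await fetch(offset, ref.bytesPerEntry), ref);
}
```

The design point is that opening the file only stores `arrayOffset` and `count`; the cost of reading values is paid per tile, when a tile is actually requested.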
Review comment from @kylebarron (Member, Author):

We should fetch an initial prefetch of a full block size, not 1024 bytes


### Why disable cogeotiff's leader-bytes path?

cogeotiff auto-detects the GDAL ghost option `BLOCK_LEADER=SIZE_AS_UINT4` at `Tiff.create()` time. If present, `TiffImage.getTileSize()` skips the `TileByteCounts` lookup and instead fetches 4 bytes just before the tile data. The comment in cogeotiff explains the intent: *"This fetch will generally load in the bytes needed for the image too provided the image size is less than the size of a chunk."* But that assumption breaks for tiles larger than the block size (very common — many COG tiles are 256×256×3 bytes ≈ 200 KB, well above 64 KiB). When it breaks, the result is:
Review comment from @kylebarron (Member, Author):

How does it know where the tile data is though? Does it fetch the offset separately?

Comment thread packages/geotiff/src/geotiff.ts
Comment thread packages/geotiff/src/geotiff.ts
…plicit one-block prefetch

Addresses PR review feedback on #529:

- Rename the `source` parameter on the vendored getTile/getBytes
  helpers to `dataSource`. Functionally unchanged — every caller
  already passes `self.dataSource` — but the new name makes it
  impossible to confuse with the header source that cogeotiff uses
  internally for the TileOffsets/TileByteCounts lookups.

- Expand the doc comments on those helpers to explain the
  header-vs-data split explicitly.

- Pass `prefetch: chunkSize` from `GeoTIFF.fromUrl` to
  `GeoTIFF.open`, so the very first cogeotiff read is exactly one
  block. SourceChunk would pad it anyway, but being explicit keeps
  the intent local.

- Update the spec to clarify how per-tile offset/bytecount lookups
  work and note that we explicitly fetch a full block on the first
  read (cogeotiff's DefaultReadSize is 16 KiB).

- Add a TODO referencing the upstream issue (to be filed) tracking
  a cleaner opt-out for cogeotiff's GDAL leader-bytes path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kylebarron kylebarron changed the title feat(geotiff)!: block-aligned LRU header cache; lazy tile metadata perf(geotiff)!: block-aligned LRU header cache; lazy tile metadata May 12, 2026
@github-actions github-actions Bot added perf and removed feat labels May 12, 2026
Adds `debug?: boolean` (off by default) to `GeoTIFF.open` and
`GeoTIFF.fromUrl`. When enabled, the tile-fetch path logs each
`dataSource.fetch` call to the console with a `data`/`mask` label,
offset, and length. Useful for diagnosing per-request behavior against
the browser network panel — e.g. surfacing the tiny mask-tile requests
that motivated the option in the first place. Threaded through
`HasTiffReference` so both `GeoTIFF` and `Overview` paths log.

A future change can coalesce adjacent (data, mask) tile pairs into a
single range request; until then, this option is the easiest way to
observe the current behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
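A minimal sketch of this kind of opt-in request logging, assuming the data source can be reduced to a plain fetch function. `withDebugLog`, `TileKind`, and the log line format are hypothetical; the actual message format in the package may differ.

```typescript
// Hypothetical byte source: fetch `length` bytes at `offset`.
type FetchFn = (offset: number, length: number) => Promise<ArrayBuffer>;
type TileKind = "data" | "mask";

// Wrap a fetch function so each call is logged with a label, offset, and
// length. When `debug` is false the original function is returned untouched,
// so the option costs nothing when it is off.
function withDebugLog(
  fetch: FetchFn,
  kind: TileKind,
  debug: boolean,
  log: (message: string) => void = console.log,
): FetchFn {
  if (!debug) return fetch;
  return (offset, length) => {
    log(`[geotiff] ${kind} fetch offset=${offset} length=${length}`);
    return fetch(offset, length);
  };
}
```

Each logged line can then be matched against a range request in the browser network panel.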
Comment thread packages/geotiff/src/overview.ts Outdated
In the typical fromUrl path, prefetch was just coupled to chunkSize via
SourceChunk padding — small requests get padded up to one block, large
ones fetch multiple blocks. The option added no behavior over what the
chunking middleware already provides, and exposed a knob that callers
almost never need to tune.

Direct GeoTIFF.open callers who want a specific initial fetch size can
compose a SourceChunk of the desired block size into their headerSource;
that composition is the right tool for the job, not a separate dial on open.

cogeotiff's default DefaultReadSize (16 KiB) is now used for the very
first read; SourceChunk pads it to chunkSize transparently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/geotiff/src/geotiff.ts Outdated
kylebarron and others added 2 commits May 12, 2026 15:36
Overview doesn't expose debug — only the primary GeoTIFF does. Drop the
required-property constraint so Overview satisfies the interface without
its own getter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ew accessor

Addresses PR review:
- Class field debug -> _debug, marked @internal. The user-facing option
  on GeoTIFF.open/fromUrl stays named 'debug'.
- Remove Overview.debug getter — overview tile fetches don't need to
  log; the primary GeoTIFF's _debug is the only opt-in surface.
- HasTiffReference._debug? is optional so Overview (without it) still
  satisfies the interface.
- Drop the explicit defaultReadSize parameter by switching from
  Tiff.create to 'new Tiff(...).init({ signal })', letting the
  constructor default to Tiff.DefaultReadSize.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kylebarron kylebarron enabled auto-merge (squash) May 12, 2026 19:42
@kylebarron kylebarron merged commit 22a7ddb into main May 12, 2026
3 checks passed
@kylebarron kylebarron deleted the kyle/cog-block-cache branch May 12, 2026 19:43
@kylebarron kylebarron self-assigned this May 26, 2026

1 participant