Use numeric basename as concat index for bare-wildcard rasters (fixes #1465) #1480

Open
gaoflow wants to merge 2 commits into Deltares:main from gaoflow:fix/1465-open-mfraster-bare-numeric-wildcard

gaoflow commented Jun 1, 2026

Contributor

Problem

When reading externally provided rasters named with bare numeric stems (e.g. modis_lai stored as 1.tif .. 12.tif and referenced in a catalog with uri: "*.tif"), open_mfraster(..., concat=True) produced the wrong concatenation index and a non-deterministic order (#1465):

  • the concat dim was [0, 1, ..., 11] (file-order fallback) instead of the numeric stems [1, 2, ..., 12];
  • the order depended on the filesystem glob() order;
  • the variable name (and hence the order-dependent result) was non-deterministic.

The index was only derived from a numeric basename when a non-empty prefix had been parsed from the pattern:

elif prefix != "" and bname.split(".")[0].strip(prefix).isdigit():
    index = int(bname.split(".")[0].strip(prefix))
else:
    index = i   # <- bare "*.tif" with numeric names lands here

For a bare wildcard *.tif the prefix is empty, so the numeric stems were ignored.
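
To make the fallback concrete, here is a minimal stand-alone sketch of the branch quoted above. `legacy_index` is a hypothetical helper written for illustration only, not a hydromt function, and it omits the earlier branches that precede the `elif` in the real code.

```python
def legacy_index(bname: str, prefix: str, i: int) -> int:
    """Mimic the quoted branch: the numeric stem is used only with a prefix."""
    stem = bname.split(".")[0]
    # str.strip(prefix) removes any of the prefix's characters from both ends.
    if prefix != "" and stem.strip(prefix).isdigit():
        return int(stem.strip(prefix))
    # A bare "*.tif" pattern has an empty prefix, so it always ends up here.
    return i


bare_names = ["1.tif", "2.tif", "12.tif"]
prefixed_names = ["lai_1.tif", "lai_2.tif", "lai_12.tif"]

# Empty prefix: the numeric stems are ignored and the position is used.
bare = [legacy_index(n, "", i) for i, n in enumerate(bare_names)]
# Non-empty prefix: the numeric part of the stem is used.
prefixed = [legacy_index(n, "lai_", i) for i, n in enumerate(prefixed_names)]

print(bare)      # [0, 1, 2]
print(prefixed)  # [1, 2, 12]
```

With an empty prefix the first condition short-circuits, so the index is whatever position the file happened to have in the glob result.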

(This complements #1459, which fixed the write side so hydromt-written mapstacks use <name>_<index>.tif; #1465 is about reading third-party bare-numeric files.)

Fix

  • Add an index branch for purely numeric basenames when the pattern has no prefix or postfix (the bare-wildcard case), using the numeric stem as the index.
  • Sort the matched files so the concat order and the variable name are deterministic and independent of the filesystem glob order.

Existing prefix / _-separated / PCRaster-style index parsing is untouched.
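
The two changes can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the diff: `index_files` is a hypothetical helper, and ordering the result by the derived index is one possible way to get a deterministic order (the PR only states that the matched files are sorted).

```python
from pathlib import Path


def index_files(paths, prefix="", postfix=""):
    """Return (index, path) pairs in a deterministic order.

    Hypothetical helper: for a bare pattern (no prefix, no postfix) a purely
    numeric stem becomes the index; otherwise the sorted position is used.
    """
    pairs = []
    # Sort first so the positional fallback does not depend on glob order.
    for i, path in enumerate(sorted(paths)):
        stem = Path(path).name.split(".")[0]
        if prefix == "" and postfix == "" and stem.isdigit():
            index = int(stem)
        else:
            index = i
        pairs.append((index, path))
    # Order by index so 2.tif comes before 10.tif.
    return sorted(pairs)


# Simulate an arbitrary filesystem order for 1.tif .. 12.tif.
scrambled = [f"{m}.tif" for m in (10, 2, 7, 1, 12, 5, 3, 11, 4, 9, 6, 8)]
pairs = index_files(scrambled)
print([index for index, _ in pairs])  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

Note that a plain lexicographic sort of the file names alone would place 10.tif before 2.tif, which is why the sketch orders the final result by the integer index.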

Reproduction (from #1465)

import tempfile
from pathlib import Path

import numpy as np
import rioxarray  # noqa: F401  (registers the .rio accessor used below)
import xarray as xr
from hydromt.readers import open_mfraster

tmp = Path(tempfile.mkdtemp())
for month in range(1, 13):
    da = xr.DataArray(np.zeros((1, 10, 15), "float32"), dims=("band", "y", "x"),
                      coords={"x": np.linspace(0, 7, 15), "y": np.linspace(5, 0, 10), "band": [1]})
    da.rio.write_crs(4326, inplace=True)
    da.rio.to_raster(str(tmp / f"{month}.tif"))

ds = open_mfraster(str(tmp / "*.tif"), concat=True)
print(list(ds["dim0"].values))
# before: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  (file order, non-deterministic)
# after : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]  (numeric stems, sorted)

Verification

  • New regression test TestOpenMFRaster::test_open_mfraster_bare_wildcard_numeric_names asserts the concat dim equals [1..12]. It fails on main and passes with this change.
  • Full tests/data_catalog/drivers/raster/ suite passes (56 passed, 5 skipped).
  • ruff check and ruff format --check clean.

gaoflow and others added 2 commits June 1, 2026 15:23
open_mfraster determined the concatenation index from a numeric basename
only when a non-empty prefix was parsed from the path pattern. For a bare
wildcard such as *.tif matching files named 1.tif .. 12.tif
(e.g. modis_lai monthly data referenced in a catalog with uri *.tif),
the prefix is empty, so the numeric stems were ignored and the index fell
back to the file-order position (0 .. 11). The concat order also depended on
the filesystem-dependent glob order.

Use the numeric basename as the index for bare (prefix- and postfix-less)
wildcard patterns, and sort the matched files so the concat order and the
variable name are deterministic.

Fixes Deltares#1465