Publish Zarr + Virtual Zarr datasets to STAC


> Just an idea, but what if we were to have a small STAC-GeoParquet catalog for these datasets? I think that could provide a few nice benefits:
> 1. showing how to visualize the data from a STAC catalog matches typical use cases in the VEDA ecosystem better than from a URL as far as I can tell
> 2. it would serve as an example of how to use STAC well with Zarr/Icechunk, following on from the guidance that Julia put together last year
> 3. we could use the catalog for other, non-visualization use-cases
> 
> I guess the downsides are yet another STAC catalog and a bit more work relative to just testing URLs 

 _Originally posted by @maxrjones in [#384](https://github.com/NASA-IMPACT/veda-odd/issues/384#issuecomment-4519824063)_

We now have a set of virtual datasets that should be published to STAC, both so that they are easy to discover for demonstrations and so that we can prototype how future VEDA instances and services would use them.

Here is the guidance Julia wrote about how to publish Zarr datasets to STAC: https://guide.cloudnativegeo.org/cookbooks/zarr-stac-report/data-producers. @maxrjones, I am curious whether you think we should publish all of the underlying files as STAC items (ref: https://guide.cloudnativegeo.org/cookbooks/zarr-stac-report/data-producers/#virtual-dataset-in-an-external-file). At least for collections already published in CMR, we could instead point to a CMR query that returns all of the associated underlying files.
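To make the CMR-query option concrete, here is a rough sketch of a STAC link that points at a CMR granule search rather than listing every underlying file as an item. The `short_name` value, the `via` relation, and the `cmr_granules_link` helper are placeholders of mine, not something we have agreed on; only the CMR granule search endpoint and its `short_name`, `version`, and `page_size` parameters come from CMR itself.

```python
from urllib.parse import urlencode

# Public CMR granule search endpoint (JSON response format).
CMR_GRANULE_SEARCH = "https://cmr.earthdata.nasa.gov/search/granules.json"


def cmr_granules_link(short_name, version=None, page_size=100):
    """Build a STAC link object pointing at a CMR granule query.

    Hypothetical helper: the choice of rel="via" is an assumption,
    not an established convention for virtual Zarr collections.
    """
    params = {"short_name": short_name, "page_size": page_size}
    if version is not None:
        params["version"] = version
    return {
        "rel": "via",
        "href": f"{CMR_GRANULE_SEARCH}?{urlencode(params)}",
        "type": "application/json",
        "title": f"CMR granules for {short_name}",
    }


# EXAMPLE_SHORT_NAME is a placeholder, not a real CMR collection.
link = cmr_granules_link("EXAMPLE_SHORT_NAME", version="1")
print(link["href"])
```

The resulting dict could be appended to a collection's `links` list, which keeps the catalog small while still giving users a path to the source granules.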

To facilitate access, I assume we would want to use the datacube extension and the `providers` field. I was also curious whether there is an extension for storing information about how to authenticate and authorize for access. A quick Google search led me to https://github.com/stac-extensions/authentication, whose most recent contributor is our very own @alukach 🎉
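As a starting point for discussion, a collection combining those pieces might look roughly like the sketch below. Everything dataset-specific here is made up (the id, the `s3://example-bucket/...` href, dimension names, extents, provider, and the `s3` auth scheme), and the extension schema versions and field names (`cube:dimensions`, `cube:variables`, `auth:schemes`, `auth:refs`) should be checked against the extension READMEs before we rely on them.

```python
import json

# Extension schema URLs; the pinned versions are assumptions to verify.
DATACUBE_EXT = "https://stac-extensions.github.io/datacube/v2.2.0/schema.json"
AUTH_EXT = "https://stac-extensions.github.io/authentication/v1.1.0/schema.json"


def build_collection(collection_id, store_href):
    """Return a minimal STAC Collection dict for one virtual Zarr store.

    Hypothetical helper with placeholder metadata throughout.
    """
    return {
        "type": "Collection",
        "stac_version": "1.0.0",
        "stac_extensions": [DATACUBE_EXT, AUTH_EXT],
        "id": collection_id,
        "description": "Placeholder description of a virtual Zarr dataset.",
        "license": "proprietary",  # placeholder, replace with the real license
        "extent": {
            "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
            "temporal": {"interval": [["2000-01-01T00:00:00Z", None]]},
        },
        # Core STAC field describing who produces and hosts the data.
        "providers": [{"name": "Example Provider", "roles": ["producer"]}],
        # Datacube extension: dimensions and the variables that use them.
        "cube:dimensions": {
            "time": {"type": "temporal", "extent": ["2000-01-01T00:00:00Z", None]},
            "lat": {"type": "spatial", "axis": "y", "extent": [-90.0, 90.0]},
            "lon": {"type": "spatial", "axis": "x", "extent": [-180.0, 180.0]},
        },
        "cube:variables": {
            "example_var": {"type": "data", "dimensions": ["time", "lat", "lon"]},
        },
        # Authentication extension: named schemes, referenced from assets.
        "auth:schemes": {"store_access": {"type": "s3"}},
        "links": [],
        "assets": {
            "store": {
                "href": store_href,
                "roles": ["data"],
                "auth:refs": ["store_access"],
            },
        },
    }


collection = build_collection(
    "example-virtual-zarr", "s3://example-bucket/example-store/"
)
print(json.dumps(collection, indent=2))
```

One collection-level asset per store seems like the lightest option for a small catalog. Whether the asset should live on the collection or on a single item is one of the things the guidance linked above discusses.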

Datasets:
- [NLDAS](https://github.com/virtual-zarr/nldas-icechunk/)
- [RASI](https://github.com/virtual-zarr/rasi-icechunk)
- [MUR SST p1 + p2](https://github.com/developmentseed/mursst-icechunk-updater), see also [concatenate_stores.ipynb](https://github.com/abarciauskas-bgse/icechunk-nasa/blob/main/notebooks/mur-sst/concatenate_stores.ipynb)
- Others from MAAP? Like the GEOS-CF Climatologies dataset?
- In progress
    - [GEOS-CF](https://github.com/NASA-IMPACT/geos-cf-virtualizarr-data-pipelines)
    - [GPM IMERG HH](https://github.com/virtual-zarr/gpmimerghh-virtualizarr-data-pipeline)

Any others I am missing? @siddharth0248 

Publish Zarr + Virtual Zarr datasets to STAC #392