Exploring S1-NRB data cubes

Introduction

This example notebook gives a short demonstration of how S1-NRB products can be explored as on-the-fly data cubes with little effort by using the STAC metadata provided with each product. It does not demonstrate how to process S1-NRB products in the first place; for that, please refer to the usage instructions.

A lightning talk on this topic was given at the Cloud-Native Geospatial Outreach Event 2022; the talk can be found here.


Sentinel-1 Normalised Radar Backscatter

Sentinel-1 Normalised Radar Backscatter (S1-NRB) is a newly developed Analysis Ready Data (ARD) product for the European Space Agency that offers high-quality, radiometrically terrain corrected (RTC) Synthetic Aperture Radar (SAR) backscatter and is designed to be compliant with the CEOS ARD for Land (CARD4L) NRB specification. You can find more detailed information about the S1-NRB product here.

SpatioTemporal Asset Catalog (STAC)

All S1-NRB products include metadata in JSON format compliant with the SpatioTemporal Asset Catalog (STAC) specification. STAC uses several sub-specifications (Item, Collection & Catalog) to create a hierarchical structure that enables efficient querying and access of large volumes of geospatial data.
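To make the hierarchy more concrete, the following is a minimal sketch using plain Python dictionaries instead of real files. All IDs, hrefs and the `iter_item_ids` helper are hypothetical, and the objects are heavily trimmed compared to real S1-NRB metadata. For simplicity, hrefs are treated as keys relative to the catalog root rather than being resolved like real relative links.

```python
# Hypothetical, heavily trimmed STAC objects keyed by their (simplified) href.
documents = {
    "catalog.json": {
        "type": "Catalog",
        "id": "example-catalog",
        "description": "Root of the hierarchy",
        "links": [{"rel": "child", "href": "tile-a/collection.json"}],
    },
    "tile-a/collection.json": {
        "type": "Collection",
        "id": "tile-a",
        "description": "All scenes belonging to one tile",
        "links": [
            {"rel": "parent", "href": "catalog.json"},
            {"rel": "item", "href": "tile-a/scene-1.json"},
            {"rel": "item", "href": "tile-a/scene-2.json"},
        ],
    },
    "tile-a/scene-1.json": {
        "type": "Feature",
        "id": "scene-1",
        "properties": {"datetime": "2020-01-03T00:00:00Z"},
        "links": [{"rel": "collection", "href": "tile-a/collection.json"}],
    },
    "tile-a/scene-2.json": {
        "type": "Feature",
        "id": "scene-2",
        "properties": {"datetime": "2020-01-09T00:00:00Z"},
        "links": [{"rel": "collection", "href": "tile-a/collection.json"}],
    },
}


def iter_item_ids(href, documents):
    """Walk from a Catalog or Collection down to its Items and yield their IDs."""
    doc = documents[href]
    if doc["type"] == "Feature":  # STAC Items are GeoJSON Features
        yield doc["id"]
        return
    for link in doc["links"]:
        # Only follow downward links; "parent" and "collection" point back up.
        if link["rel"] in ("child", "item"):
            yield from iter_item_ids(link["href"], documents)


item_ids = list(iter_item_ids("catalog.json", documents))
```

Starting at the Catalog, a client only needs to follow `child` and `item` links to reach every scene, which is what lets tools discover a whole collection from a single entry point.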

Getting started

After following the installation instructions you need to install an additional package into the activated conda environment:

conda activate nrb_env
conda install stackstac

Let’s assume you have a collection of S1-NRB scenes located on your local disk, a fileserver or somewhere in the cloud. As mentioned in the Introduction, each S1-NRB scene includes metadata as a STAC Item, describing the scene’s temporal, spatial and product-specific properties.

The only step necessary to get started with analysing your collection of scenes is the creation of STAC Collection and Catalog files, which connect the individual STAC Items and thereby create a hierarchy of STAC objects. S1_NRB includes the utility function make_catalog, which will create these files for you. Please note that make_catalog expects a directory structure based on MGRS tile IDs, which allows for efficient data querying and access. If this directory structure doesn’t exist yet, make_catalog will reorganize your S1-NRB scenes accordingly after asking for confirmation.

[3]:
import numpy as np
import stackstac
from S1_NRB.metadata.stac import make_catalog

nrb_catalog = make_catalog(directory='./NRB_thuringia', silent=True)
WARNING:
./NRB_thuringia
and the NRB products it contains will be reorganized into subdirectories based on unique MGRS tile IDs if this directory structure does not yet exist.
Do you wish to continue? [yes|no]  yes

#### New STAC endpoint created: ./NRB_thuringia/catalog.json
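The tile-based directory layout mentioned above can be pictured with the following sketch. It is not the implementation of make_catalog, only an illustration of grouping scenes by MGRS tile ID. The scene names, the regular expression and the helpers `group_by_tile` and `target_path` are assumptions made for this example.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# MGRS tile ID: UTM zone (01-60), latitude band (C-X without I and O) and a
# two-letter 100 km square (column A-Z, row A-V, both without I and O).
MGRS_TILE = re.compile(
    r"(?<![0-9A-Z])(0[1-9]|[1-5][0-9]|60)[C-HJ-NP-X][A-HJ-NP-Z][A-HJ-NP-V](?![0-9A-Z])"
)


def tile_id(scene_name):
    """Extract the MGRS tile ID from a scene name or raise ValueError."""
    match = MGRS_TILE.search(scene_name)
    if match is None:
        raise ValueError(f"no MGRS tile ID found in {scene_name!r}")
    return match.group(0)


def group_by_tile(scene_names):
    """Map each tile ID to the scene names that belong to it."""
    groups = defaultdict(list)
    for name in scene_names:
        groups[tile_id(name)].append(name)
    return dict(groups)


def target_path(root, scene_name):
    """Return the tile-based location <root>/<tile ID>/<scene name>."""
    return PurePosixPath(root) / tile_id(scene_name) / scene_name


# Hypothetical scene names, not the actual S1-NRB naming convention.
scenes = ["scene_20200103_32UPB", "scene_20200109_32UPB", "scene_20200103_32UQB"]
groups = group_by_tile(scenes)
```

Here `groups` maps `32UPB` to two scenes and `32UQB` to one, and `target_path("NRB_thuringia", scenes[0])` points to `NRB_thuringia/32UPB/scene_20200103_32UPB`.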

The STAC Catalog can then be used with libraries such as stackstac, which “turns a STAC Collection into a lazy xarray.DataArray, backed by dask”.

The term lazy describes a method of execution that only computes results when actually needed and thereby enables computations on larger-than-memory datasets. xarray is a Python library for working with labeled multi-dimensional arrays of data, while the Python library dask facilitates parallel computing in a flexible way.
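The idea of lazy execution can be illustrated with a small, self-contained sketch. This is not how dask is implemented; the `Lazy` class and the `load_chunk` and `chunk_sum` functions are hypothetical and only show that describing a computation and running it are two separate steps.

```python
class Lazy:
    """Record a function call now and run it only when compute() is called."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def compute(self):
        # Resolve lazy arguments first, then apply the recorded function.
        resolved = [a.compute() if isinstance(a, Lazy) else a for a in self.args]
        return self.func(*resolved)


calls = []  # keeps track of which chunks were actually loaded


def load_chunk(index):
    """Stand-in for an expensive read of one chunk of data."""
    calls.append(index)
    return [index] * 4


def chunk_sum(chunk):
    return sum(chunk)


# Describe the computation: one partial sum per chunk, then a grand total.
partial_sums = [Lazy(chunk_sum, Lazy(load_chunk, i)) for i in range(3)]
total = Lazy(lambda *parts: sum(parts), *partial_sums)

calls_before_compute = list(calls)  # still empty, nothing has been loaded yet
result = total.compute()            # chunks are loaded and summed only now
```

Because each chunk is loaded only while its partial sum is computed, a scheduler built on this principle never needs the whole dataset in memory at once.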

Compatibility with odc-stac, a very similar library to stackstac, has also been implemented.

[4]:
aoi = (10.638066, 50.708415, 11.686751, 50.975775)
ds = stackstac.stack(items=nrb_catalog, bounds_latlon=aoi,
                     dtype=np.dtype('float32'), chunksize=(-1, 1, 1024, 1024))
ds
[4]:
<xarray.DataArray 'stackstac-f9b5b2607432a2a973be5982262095c8' (time: 121,
                                                                band: 10,
                                                                y: 3189, x: 7471)>
dask.array<fetch_raster_window, shape=(121, 10, 3189, 7471), dtype=float32, chunksize=(121, 1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/55)
  * time                                   (time) datetime64[ns] 2020-01-03T1...
    id                                     (time) <U57 'S1A_IW_NRB__1SDV_2020...
  * band                                   (band) <U25 'noise-power-vh' ... '...
  * x                                      (x) float64 6.15e+05 ... 6.897e+05
  * y                                      (y) float64 5.651e+06 ... 5.619e+06
    constellation                          <U10 'sentinel-1'
    ...                                     ...
    processing:facility                    <U3 'FSU'
    raster:bands                           (band) object [{'unit': 'dB', 'nod...
    title                                  (band) <U30 'Noise Power' ... 'Acq...
    file:header_size                       (band) int64 6794 6794 ... 7642 6914
    file:byte_order                        <U13 'little-endian'
    epsg                                   int64 32632
Attributes:
    spec:        RasterSpec(epsg=32632, bounds=(614990.0, 5618680.0, 689700.0...
    crs:         epsg:32632
    transform:   | 10.00, 0.00, 614990.00|\n| 0.00,-10.00, 5650570.00|\n| 0.0...
    resolution:  10.0

As the output above shows, the collection of S1-NRB scenes was successfully loaded as an xarray.DataArray. The metadata attributes included in all STAC Items are now available as coordinate arrays (see here for clarification of Xarray’s terminology) and can be used during analysis.
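A quick back-of-the-envelope calculation shows why lazy loading matters for this cube. The sketch below only uses the shape, chunk size and dtype printed in the output above; the variable names are chosen for this example.

```python
import math

shape = (121, 10, 3189, 7471)     # (time, band, y, x) from the output above
chunksize = (121, 1, 1024, 1024)  # chunk size reported by dask
itemsize = 4                      # bytes per float32 value

# Number of chunks along each dimension; edge chunks may be smaller.
chunks_per_dim = tuple(math.ceil(n / c) for n, c in zip(shape, chunksize))
n_chunks = math.prod(chunks_per_dim)

total_bytes = math.prod(shape) * itemsize
full_chunk_bytes = math.prod(chunksize) * itemsize

print(f"chunks per dimension: {chunks_per_dim} ({n_chunks} in total)")
print(f"whole cube: {total_bytes / 2**30:.1f} GiB")
print(f"one full chunk: {full_chunk_bytes / 2**20:.0f} MiB")
```

The whole cube would occupy roughly 107 GiB if it were loaded at once, whereas a single full chunk is 484 MiB. Selecting a band, a time range or a smaller spatial window before computing reduces the amount of data that actually has to be read.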

It is now possible to explore and analyse the S1-NRB data cube. The most important tools in this regard are the already mentioned xarray and dask. Both are widely used and a lot of tutorials and videos can be found online, e.g. in the xarray Docs (1, 2) or the Pangeo Tutorial Gallery.