import pystac
import xarray as xr
import hvplot.xarray # noqa
Visualize zarr
Reading Zarr with xarray and visualizing the data using hvplot
Run this notebook
You can launch this notebook in VEDA JupyterHub by clicking the link below.
Launch in VEDA JupyterHub (requires access)
Learn more
Inside the Hub
This notebook was written on the VEDA JupyterHub and is designed to run on a JupyterHub associated with an AWS IAM role that has been granted access to the VEDA data store through its bucket policy. The instance used provided 16 GB of RAM.
See [VEDA Analytics JupyterHub Access](https://nasa-impact.github.io/veda-docs/veda-jh-access.html) for information about how to gain access.
Outside the Hub
The data is in a protected bucket. Please request access by emailing aimee@developmentseed.org or alexandra@developmentseed.org, providing your affiliation, your interest in or expected use of the dataset, and an AWS IAM role or user Amazon Resource Name (ARN). The team will help you configure the Cognito client.
You should then run:
%run -i 'cognito_login.py'
Approach
- Use pystac to open a STAC collection
- Use xarray and dask to lazily read in the data
- Plot the data using hvplot
About the data
This is the Gridded Daily OCO-2 Carbon Dioxide assimilated dataset. More information can be found at: OCO-2 GEOS Level 3 daily, 0.5x0.625 assimilated CO2 V10r (OCO2_GEOS_L3CO2_DAY)
The data has been converted to zarr format and published to the development version of the VEDA STAC Catalog.
Declare your collection of interest
You can discover available collections in the following ways:
- Programmatically: see the example in the list-collections.ipynb notebook
- JSON API: https://openveda.cloud/api/stac/collections
- STAC Browser: https://openveda.cloud
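As a rough illustration of the JSON API route, the sketch below pulls collection ids out of a STAC API `/collections` response. The helper names `collection_ids` and `fetch_collection_ids` are hypothetical (they are not taken from list-collections.ipynb), the sample payload is made up for demonstration, and the network call is defined but not executed here.

```python
import json
from urllib.request import urlopen

STAC_API_URL = "https://openveda.cloud/api/stac"


def collection_ids(payload: dict) -> list[str]:
    """Return the ids listed in a STAC API /collections response body."""
    return [c["id"] for c in payload.get("collections", [])]


def fetch_collection_ids(api_url: str = STAC_API_URL) -> list[str]:
    """Fetch /collections and return the ids found in that response.

    Assumption: a single request is enough. If the API paginates,
    this only sees the first page.
    """
    with urlopen(f"{api_url}/collections") as resp:
        return collection_ids(json.load(resp))


# Made-up payload with the same top-level shape as a /collections response.
sample = {"collections": [{"id": "oco2-geos-l3-daily"}, {"id": "example-collection"}]}
print(collection_ids(sample))
```

Calling `fetch_collection_ids()` would query the live API; the id printed for the real dataset should match the `collection_id` used in the next cell.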
STAC_API_URL = "https://openveda.cloud/api/stac"
collection_id = "oco2-geos-l3-daily"
Get STAC collection
Use pystac to access the STAC collection.
collection = pystac.Collection.from_file(f"{STAC_API_URL}/collections/{collection_id}")
collection
- type "Collection"
- id "oco2-geos-l3-daily"
- stac_version "1.1.0"
- description "The OCO-2 mission provides the highest quality space-based XCO2 retrievals to date. However, the instrument data are characterized by large gaps in coverage due to OCO-2’s narrow 10-km ground track and an inability to see through clouds and thick aerosols. This global gridded dataset is produced using a data assimilation technique commonly referred to as state estimation within the geophysical literature. Data assimilation synthesizes simulations and observations, adjusting the state of atmospheric constituents like CO2 to reflect observed values, thus gap-filling observations when and where they are unavailable based on previous observations and short transport simulations by GEOS. Compared to other methods, data assimilation has the advantage that it makes estimates based on our collective scientific understanding, notably of the Earth's carbon cycle and atmospheric transport. OCO-2 GEOS (Goddard Earth Observing System) Level 3 data are produced by ingesting OCO-2 L2 retrievals every 6 hours with GEOS CoDAS, a modeling and data assimilation system maintained by NASA's Global Modeling and Assimilation Office (GMAO). GEOS CoDAS uses a high-performance computing implementation of the Gridpoint Statistical Interpolation approach for solving the state estimation problem. GSI finds the analyzed state that minimizes the three-dimensional variational (3D-Var) cost function formulation of the state estimation problem."
links[] 6 items
0
- rel "self"
- href "https://openveda.cloud/api/stac/collections/oco2-geos-l3-daily"
- type "application/json"
1
- rel "items"
- href "https://openveda.cloud/api/stac/collections/oco2-geos-l3-daily/items"
- type "application/geo+json"
2
- rel "parent"
- href "https://openveda.cloud/api/stac/"
- type "application/json"
3
- rel "root"
- href "https://openveda.cloud/api/stac/"
- type "application/json"
4
- rel "external"
- href "https://catalog.data.gov/dataset/oco-2-geos-level-3-daily-0-5x0-625-assimilated-co2-v10r-oco2-geos-l3co2-day-at-ges-disc-72b15"
- type "text/html"
- title "OCO-2 GEOS Level 3 daily, 0.5x0.625 assimilated CO2 V10r (OCO2_GEOS_L3CO2_DAY) at GES DISC"
- label:assets None
5
- rel "http://www.opengis.net/def/rel/ogc/1.0/queryables"
- href "https://openveda.cloud/api/stac/collections/oco2-geos-l3-daily/queryables"
- type "application/schema+json"
- title "Queryables"
stac_extensions[] 1 items
- 0 "https://stac-extensions.github.io/datacube/v2.2.0/schema.json"
cube:variables
XCO2
- type "data"
- unit "mol CO2/mol dry"
attrs
- units "mol CO2/mol dry"
- long_name "Assimilated dry-air column average CO2 daily mean"
shape[] 3 items
- 0 2500
- 1 361
- 2 576
chunks[] 3 items
- 0 100
- 1 100
- 2 100
dimensions[] 3 items
- 0 "time"
- 1 "lat"
- 2 "lon"
- description "Assimilated dry-air column average CO2 daily mean"
XCO2PREC
- type "data"
- unit "mol CO2/mol dry"
attrs
- units "mol CO2/mol dry"
- long_name "Precision of dry-air column average CO2 daily mean from Desroziers et al. (2005) diagnostic"
shape[] 3 items
- 0 2500
- 1 361
- 2 576
chunks[] 3 items
- 0 100
- 1 100
- 2 100
dimensions[] 3 items
- 0 "time"
- 1 "lat"
- 2 "lon"
- description "Precision of dry-air column average CO2 daily mean from Desroziers et al. (2005) diagnostic"
cube:dimensions
lat
- axis "y"
- type "spatial"
extent[] 2 items
- 0 -90.0
- 1 90.0
- description "latitude"
- reference_system 4326
lon
- axis "x"
- type "spatial"
extent[] 2 items
- 0 -180.0
- 1 179.375
- description "longitude"
- reference_system 4326
time
- step "P1DT0H0M0S"
- type "temporal"
extent[] 2 items
- 0 "2015-01-01T12:00:00Z"
- 1 "2021-11-04T12:00:00Z"
- description "time"
- dashboard:is_periodic True
- dashboard:time_density "day"
- title "Gridded Daily OCO-2 Carbon Dioxide assimilated dataset"
extent
spatial
bbox[] 1 items
0[] 4 items
- 0 -180.0
- 1 -90.0
- 2 180.0
- 3 90.0
temporal
interval[] 1 items
0[] 2 items
- 0 None
- 1 None
- license "CC0-1.0"
providers[] 1 items
0
- name "NASA VEDA"
roles[] 1 items
- 0 "host"
- url "https://www.earthdata.nasa.gov/dashboard/"
assets
zarr
- href "s3://veda-data-store/oco2-geos-l3-daily/OCO2_GEOS_L3CO2_day.zarr"
- type "application/vnd+zarr"
- title "zarr"
roles[] 1 items
- 0 "data"
We can see that there is one zarr asset:
collection.get_assets(media_type="application/vnd+zarr")
{'zarr': <Asset href=s3://veda-data-store/oco2-geos-l3-daily/OCO2_GEOS_L3CO2_day.zarr>}
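The datacube metadata in the collection output above is enough for a quick sanity check before opening the store. The following sketch copies the shape, chunk, and extent values from that output by hand and derives the chunk count, the uncompressed size of one variable, the number of daily time steps, and the grid spacing. The 8 bytes per value is an assumption (float64, which is the dtype xarray reports once the store is opened).

```python
import math
from datetime import datetime, timedelta

# Values copied from the cube:variables and cube:dimensions output above.
shape = (2500, 361, 576)   # (time, lat, lon)
chunks = (100, 100, 100)   # chunk shape advertised in the STAC metadata
bytes_per_value = 8        # assumption: float64

# Chunks along each dimension, rounding up for partial chunks at the edges.
chunks_per_dim = [math.ceil(n / c) for n, c in zip(shape, chunks)]
total_chunks = math.prod(chunks_per_dim)

# Uncompressed size of a single variable, in bytes.
n_bytes = math.prod(shape) * bytes_per_value

# Daily steps between the two ends of the temporal extent, inclusive.
start = datetime(2015, 1, 1, 12)
end = datetime(2021, 11, 4, 12)
n_steps = (end - start) // timedelta(days=1) + 1

# Grid spacing implied by the spatial extents and the number of points.
lat_step = (90.0 - (-90.0)) / (shape[1] - 1)
lon_step = (179.375 - (-180.0)) / (shape[2] - 1)

print(chunks_per_dim, total_chunks)
print(f"{n_bytes / 1e9:.2f} GB per variable")
print(n_steps, lat_step, lon_step)
```

Under these assumptions the extent and step imply 2500 daily time steps, which agrees with the first entry of the shape, and the spacing works out to 0.5 degrees in latitude and 0.625 degrees in longitude.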
Read from zarr to xarray
With the URL pointing to the Zarr store, you can create an xarray dataset backed by dask arrays.
url = collection.assets["zarr"].href
ds = xr.open_dataset(url, engine="zarr", chunks="auto")
ds
<xarray.Dataset> Size: 8GB
Dimensions:  (time: 2500, lat: 361, lon: 576)
Coordinates:
  * lat      (lat) float64 3kB -90.0 -89.5 -89.0 -88.5 ... 88.5 89.0 89.5 90.0
  * lon      (lon) float64 5kB -180.0 -179.4 -178.8 ... 178.1 178.8 179.4
  * time     (time) datetime64[ns] 20kB 2015-01-01T12:00:00 ... 2021-11-04T1...
Data variables:
    XCO2      (time, lat, lon) float64 4GB dask.array<chunksize=(200, 200, 200), meta=np.ndarray>
    XCO2PREC  (time, lat, lon) float64 4GB dask.array<chunksize=(200, 200, 200), meta=np.ndarray>
Attributes: (12/25)
    BuildId:                  B10.2.06
    Contact:                  Brad Weir (brad.weir@nasa.gov)
    Conventions:              CF-1
    DataResolution:           0.5x0.625
    EastBoundingCoordinate:   179.375
    Format:                   NetCDF-4/HDF-5
    ...                       ...
    ShortName:                OCO2_GEOS_L3CO2_DAY_10r
    SouthBoundingCoordinate:  -90.0
    SpatialCoverage:          global
    Title:                    OCO-2 GEOS Level 3 daily, 0.5x0.625 assim...
    VersionID:                V10r
    WestBoundingCoordinate:   -180.0
In xarray you can inspect just one data variable using dot notation:
ds.XCO2
<xarray.DataArray 'XCO2' (time: 2500, lat: 361, lon: 576)> Size: 4GB
dask.array<open_dataset-XCO2, shape=(2500, 361, 576), dtype=float64, chunksize=(200, 200, 200), chunktype=numpy.ndarray>
Coordinates:
  * lat      (lat) float64 3kB -90.0 -89.5 -89.0 -88.5 ... 88.5 89.0 89.5 90.0
  * lon      (lon) float64 5kB -180.0 -179.4 -178.8 -178.1 ... 178.1 178.8 179.4
  * time     (time) datetime64[ns] 20kB 2015-01-01T12:00:00 ... 2021-11-04T12...
Attributes:
    long_name:  Assimilated dry-air column average CO2 daily mean
    units:      mol CO2/mol dry
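Once a single variable is in hand, label-based selection and simple arithmetic work on it directly. The sketch below illustrates this on a small synthetic stand-in for `ds.XCO2` (made-up values on a tiny grid, so it runs without access to the protected bucket): it selects one time step and the northern hemisphere, then converts the mole fraction to parts per million by multiplying by 1e6. The names `xco2`, `subset`, and `xco2_ppm` are illustrative only.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for ds.XCO2: three daily steps on a 4 x 4 grid.
# The values are made up; only the dimension names and units mirror the dataset.
time = pd.date_range("2015-01-01T12:00:00", periods=3, freq="D")
lat = np.array([-90.0, -30.0, 30.0, 90.0])
lon = np.array([-180.0, -90.0, 0.0, 90.0])
xco2 = xr.DataArray(
    np.full((3, 4, 4), 400e-6),
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="XCO2",
    attrs={"units": "mol CO2/mol dry"},
)

# Select by label: the second time step and latitudes from 0 to 90.
subset = xco2.sel(time=time[1], lat=slice(0, 90))

# Convert mole fraction (mol/mol) to parts per million and record the new
# units explicitly so the metadata matches the converted values.
xco2_ppm = subset * 1e6
xco2_ppm.attrs["units"] = "ppm"

print(xco2_ppm.dims, xco2_ppm.shape)
print(float(xco2_ppm.mean()))
```

On the real dask-backed `ds.XCO2`, the same `.sel` call should stay lazy, so only the chunks covering the selection are read when the result is computed or plotted.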
Plot data
We can plot the XCO2 variable as an interactive map (with a date slider) using hvplot.
ds.XCO2.hvplot(
    x="lon",
    y="lat",
    groupby="time",
    coastline=True,
    rasterize=True,
    aggregator="mean",
    widget_location="bottom",
    frame_width=600,
)
The time slider only works when the notebook is running. When the notebook is rendered on a static website, the slider has no effect.