import intake
import hvplot.xarray # noqa
Visualize zarr
Reading zarr data with intake and visualizing it using hvplot
Run this notebook
You can launch this notebook in VEDA JupyterHub by clicking the link below.
Launch in VEDA JupyterHub (requires access)
Inside the Hub
This notebook was written on the VEDA JupyterHub and is designed to run on a JupyterHub associated with an AWS IAM role that has been granted access to the VEDA data store through its bucket policy. The instance used provided 16 GB of RAM.
See [VEDA Analytics JupyterHub Access](https://nasa-impact.github.io/veda-docs/veda-jh-access.html) for information about how to gain access.
Outside the Hub
The data is in a protected bucket. Please request access by emailing aimee@developmentseed.org or alexandra@developmentseed.org and providing your affiliation, your interest in or expected use of the dataset, and an AWS IAM role or user Amazon Resource Name (ARN). The team will help you configure the Cognito client.
You should then run:
%run -i 'cognito_login.py'
Approach
- Use intake to open a STAC collection with xarray and dask
- Plot the data using hvplot
About the data
This is the Gridded Daily OCO-2 Carbon Dioxide assimilated dataset. More information can be found at: OCO-2 GEOS Level 3 daily, 0.5x0.625 assimilated CO2 V10r (OCO2_GEOS_L3CO2_DAY)
The data has been converted to zarr format and published to the development version of the VEDA STAC Catalog.
Declare your collection of interest
You can discover available collections in the following ways:
- Programmatically: see the example in the list-collections.ipynb notebook
- JSON API: https://staging-stac.delta-backend.com/collections
- STAC Browser: http://veda-staging-stac-browser.s3-website-us-west-2.amazonaws.com
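As a rough illustration of the programmatic route, the sketch below lists collection ids from a STAC API `/collections` response using only the standard library. It is not taken from list-collections.ipynb: the helper names `collection_ids` and `fetch_collection_ids` are hypothetical, and the sample payload is hand-written to mimic the shape of a STAC API response (a JSON object with a `collections` array).

```python
import json
from urllib.request import urlopen


def collection_ids(payload):
    """Return the sorted ids found in a STAC API `/collections` response body."""
    return sorted(c["id"] for c in payload.get("collections", []))


def fetch_collection_ids(stac_api_url, timeout=30):
    """Hypothetical helper: GET {stac_api_url}/collections and list the ids.

    Only the first page is read; a paginated API may expose further pages
    through a "next" link, which this sketch does not follow.
    """
    with urlopen(f"{stac_api_url}/collections", timeout=timeout) as response:
        return collection_ids(json.load(response))


# Offline example with a made-up payload shaped like a STAC API response.
sample = {"collections": [{"id": "oco2-geos-l3-daily"}, {"id": "example-collection"}]}
print(collection_ids(sample))
```

With network access, calling `fetch_collection_ids` with the STAC API root URL should return the ids the API advertises on its first page.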
STAC_API_URL = "https://openveda.cloud/api/stac"
collection_id = "oco2-geos-l3-daily"
Get STAC collection
Use intake to get the entire STAC collection.
collection = intake.open_stac_collection(f"{STAC_API_URL}/collections/{collection_id}")
collection
oco2-geos-l3-daily:
  args:
    stac_obj: https://openveda.cloud/api/stac/collections/oco2-geos-l3-daily
  description: ''
  driver: intake_stac.catalog.StacCollection
  metadata:
    assets:
      zarr:
        href: s3://veda-data-store/oco2-geos-l3-daily/OCO2_GEOS_L3CO2_day.zarr
        roles:
        - data
        title: zarr
        type: application/vnd+zarr
    cube:dimensions:
      lat:
        axis: y
        description: latitude
        extent:
        - -90.0
        - 90.0
        reference_system: 4326
        type: spatial
      lon:
        axis: x
        description: longitude
        extent:
        - -180.0
        - 179.375
        reference_system: 4326
        type: spatial
      time:
        description: time
        extent:
        - '2015-01-01T12:00:00Z'
        - '2021-11-04T12:00:00Z'
        step: P1DT0H0M0S
        type: temporal
    cube:variables:
      XCO2:
        attrs:
          long_name: Assimilated dry-air column average CO2 daily mean
          units: mol CO2/mol dry
        chunks:
        - 100
        - 100
        - 100
        description: Assimilated dry-air column average CO2 daily mean
        dimensions:
        - time
        - lat
        - lon
        shape:
        - 2500
        - 361
        - 576
        type: data
        unit: mol CO2/mol dry
      XCO2PREC:
        attrs:
          long_name: Precision of dry-air column average CO2 daily mean from Desroziers
            et al. (2005) diagnostic
          units: mol CO2/mol dry
        chunks:
        - 100
        - 100
        - 100
        description: Precision of dry-air column average CO2 daily mean from Desroziers
          et al. (2005) diagnostic
        dimensions:
        - time
        - lat
        - lon
        shape:
        - 2500
        - 361
        - 576
        type: data
        unit: mol CO2/mol dry
    dashboard:is_periodic: true
    dashboard:time_density: day
    description: "The OCO-2 mission provides the highest quality space-based XCO2\
      \ retrievals to date. However, the instrument data are characterized by large\
      \ gaps in coverage due to OCO-2\u2019s narrow 10-km ground track and an inability\
      \ to see through clouds and thick aerosols. This global gridded dataset is produced\
      \ using a data assimilation technique commonly referred to as state estimation\
      \ within the geophysical literature. Data assimilation synthesizes simulations\
      \ and observations, adjusting the state of atmospheric constituents like CO2\
      \ to reflect observed values, thus gap-filling observations when and where they\
      \ are unavailable based on previous observations and short transport simulations\
      \ by GEOS. Compared to other methods, data assimilation has the advantage that\
      \ it makes estimates based on our collective scientific understanding, notably\
      \ of the Earth's carbon cycle and atmospheric transport. OCO-2 GEOS (Goddard\
      \ Earth Observing System) Level 3 data are produced by ingesting OCO-2 L2 retrievals\
      \ every 6 hours with GEOS CoDAS, a modeling and data assimilation system maintained\
      \ by NASA's Global Modeling and Assimilation Office (GMAO). GEOS CoDAS uses\
      \ a high-performance computing implementation of the Gridpoint Statistical Interpolation\
      \ approach for solving the state estimation problem. GSI finds the analyzed\
      \ state that minimizes the three-dimensional variational (3D-Var) cost function\
      \ formulation of the state estimation problem."
    extent:
      spatial:
        bbox:
        - - -180.0
          - -90.0
          - 180.0
          - 90.0
      temporal:
        interval:
        - - null
          - null
    id: oco2-geos-l3-daily
    license: CC0-1.0
    providers:
    - name: NASA VEDA
      roles:
      - host
      url: https://www.earthdata.nasa.gov/dashboard/
    stac_extensions:
    - https://stac-extensions.github.io/datacube/v2.2.0/schema.json
    stac_version: 1.0.0
    title: Gridded Daily OCO-2 Carbon Dioxide assimilated dataset
    type: Collection
Read from zarr to xarray
Intake lets you go straight from the asset to an xarray dataset backed by a dask array.
source = collection.get_asset("zarr")
ds = source.to_dask()
ds
/srv/conda/envs/notebook/lib/python3.11/site-packages/intake_xarray/base.py:21: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
'dims': dict(self._ds.dims),
<xarray.Dataset> Size: 8GB
Dimensions:   (time: 2500, lat: 361, lon: 576)
Coordinates:
  * lat       (lat) float64 3kB -90.0 -89.5 -89.0 -88.5 ... 88.5 89.0 89.5 90.0
  * lon       (lon) float64 5kB -180.0 -179.4 -178.8 ... 178.1 178.8 179.4
  * time      (time) datetime64[ns] 20kB 2015-01-01T12:00:00 ... 2021-11-04T1...
Data variables:
    XCO2      (time, lat, lon) float64 4GB dask.array<chunksize=(100, 100, 100), meta=np.ndarray>
    XCO2PREC  (time, lat, lon) float64 4GB dask.array<chunksize=(100, 100, 100), meta=np.ndarray>
Attributes: (12/25)
    BuildId:                  B10.2.06
    Contact:                  Brad Weir (brad.weir@nasa.gov)
    Conventions:              CF-1
    DataResolution:           0.5x0.625
    EastBoundingCoordinate:   179.375
    Format:                   NetCDF-4/HDF-5
    ...                       ...
    ShortName:                OCO2_GEOS_L3CO2_DAY_10r
    SouthBoundingCoordinate:  -90.0
    SpatialCoverage:          global
    Title:                    OCO-2 GEOS Level 3 daily, 0.5x0.625 assim...
    VersionID:                V10r
    WestBoundingCoordinate:   -180.0
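The sizes reported above can be cross-checked by hand from the collection metadata. The following sketch is only arithmetic on numbers copied from the outputs above (shape, chunk sizes, first timestamp, daily step) and assumes 8 bytes per value because xarray reports float64; it reads nothing from the store, and the variable names are illustrative.

```python
import math
from datetime import datetime, timedelta

# Values copied from the cube:variables and cube:dimensions metadata above.
shape = (2500, 361, 576)   # (time, lat, lon)
chunks = (100, 100, 100)
bytes_per_value = 8        # assumption: float64, as shown in the xarray repr

# Uncompressed in-memory size of one variable.
variable_bytes = math.prod(shape) * bytes_per_value

# Size of one full chunk and the number of chunks along each axis
# (edge chunks along lat and lon are smaller than a full chunk).
chunk_bytes = math.prod(chunks) * bytes_per_value
chunks_per_axis = [math.ceil(s / c) for s, c in zip(shape, chunks)]
n_chunks = math.prod(chunks_per_axis)

print(f"uncompressed size per variable: {variable_bytes / 1e9:.2f} GB")
print(f"full chunk size: {chunk_bytes / 1e6:.0f} MB")
print(f"chunks per variable: {n_chunks} {chunks_per_axis}")

# With a daily step, 2500 time steps starting at the first timestamp
# should end on the last timestamp listed in the time extent.
start = datetime(2015, 1, 1, 12)
last = start + timedelta(days=shape[0] - 1)
print(f"last time step: {last.isoformat()}")
```

Each variable comes to about 4.16 GB uncompressed, which is consistent with the 4GB per variable and 8GB total in the repr, and with 16 GB of RAM it is safer to rely on dask's lazy loading than to load both variables eagerly.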
In xarray you can inspect just one data variable using dot notation:
ds.XCO2
<xarray.DataArray 'XCO2' (time: 2500, lat: 361, lon: 576)> Size: 4GB
dask.array<open_dataset-XCO2, shape=(2500, 361, 576), dtype=float64, chunksize=(100, 100, 100), chunktype=numpy.ndarray>
Coordinates:
  * lat      (lat) float64 3kB -90.0 -89.5 -89.0 -88.5 ... 88.5 89.0 89.5 90.0
  * lon      (lon) float64 5kB -180.0 -179.4 -178.8 -178.1 ... 178.1 178.8 179.4
  * time     (time) datetime64[ns] 20kB 2015-01-01T12:00:00 ... 2021-11-04T12...
Attributes:
    long_name:  Assimilated dry-air column average CO2 daily mean
    units:      mol CO2/mol dry
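Dot notation is a shortcut for bracket indexing, and bracket indexing is the safer choice when a variable name is not a valid Python identifier or collides with a Dataset attribute. The sketch below illustrates this, plus label-based selection, on a tiny synthetic dataset; the `toy` dataset and its values are made up for illustration and only borrow the variable and coordinate names from the real data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small synthetic stand-in for the real dataset (values are invented).
time = pd.date_range("2015-01-01T12:00:00", periods=3, freq="D")
lat = np.array([-0.5, 0.0, 0.5])
lon = np.array([-0.625, 0.0, 0.625])
toy = xr.Dataset(
    {"XCO2": (("time", "lat", "lon"), np.full((3, 3, 3), 4.0e-4))},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Dot and bracket notation return the same DataArray.
same = toy.XCO2.identical(toy["XCO2"])
print(same)

# Label-based selection: one time step and the grid cell nearest to a point.
point = toy["XCO2"].sel(
    time="2015-01-02T12:00:00", lat=0.1, lon=0.0, method="nearest"
)

# The units are mol CO2 per mol dry air, so multiplying by 1e6 gives ppm.
point_ppm = float(point) * 1e6
print(point_ppm)
```

The same `.sel` call applies to the dask-backed `ds.XCO2`; there the selection stays lazy until a value is actually requested, for example by calling `.compute()`.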
Plot data
We can plot the XCO2 variable as an interactive map (with a date slider) using hvplot.
ds.XCO2.hvplot(
    x="lon",
    y="lat",
    groupby="time",
    coastline=True,
    rasterize=True,
    aggregator="mean",
    widget_location="bottom",
    frame_width=600,
)