Complex Time Series Indexes#
[1]:
import pyearthtools.data
import warnings
with warnings.catch_warnings(action="ignore"):
    import site_archive_nci
Variables#
[2]:
var = 'tcwv'
Complex Data Retrievals#
pyearthtools.data.DataIndex provides the .series method, which retrieves data between any two timesteps at any time interval. The method also accepts transforms, which are explained in later notebooks.
.series attempts to use xarray.open_mfdataset for its performance benefits, and falls back to .single if the data is not on disk.
[3]:
era5 = pyearthtools.data.archive.ERA5(var, level = 'single')
Examples#
Let’s get data between 2021-01-12T03 and 2021-01-14T05 at 4-hourly intervals.
[4]:
era5.series('2021-01-12T03', '2021-01-14T05', interval = (4, 'hour'))
[4]:
<xarray.Dataset> Size: 108MB
Dimensions: (longitude: 1440, latitude: 721, time: 13)
Coordinates:
* longitude (longitude) float32 6kB -180.0 -179.8 -179.5 ... 179.5 179.8
* latitude (latitude) float32 3kB 90.0 89.75 89.5 ... -89.5 -89.75 -90.0
* time (time) datetime64[ns] 104B 2021-01-12T03:00:00 ... 2021-01-14T...
Data variables:
tcwv (time, latitude, longitude) float64 108MB dask.array<chunksize=(13, 182, 360), meta=np.ndarray>
Attributes:
Conventions: CF-1.6
history: 2021-07-04 08:07:33 UTC+1000 by era5_replication_tools-1.9....
license: Licence to use Copernicus Products: https://apps.ecmwf.int/...
summary: ERA5 is the fifth generation ECMWF atmospheric reanalysis o...
title: ERA5 single-levels reanalysis total_column_water_vapour 202...
Behaviour with partially defined times#
pyearthtools.data includes a custom datetime object (pyearthtoolsDatetime) that assists data retrieval when a time is not fully defined.
This time object keeps track of the resolution at which the user defined the date. When adding to or subtracting from it, all time components are taken into account, even those that fall outside that resolution.
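As a rough analogy only (this is numpy, not the pyearthtools implementation), numpy.datetime64 also remembers the unit a date was written at, and arithmetic with a finer unit is still honoured. The sketch below assumes nothing about pyearthtools internals; the variable names are illustrative.

```python
import numpy as np

# A date given only to the month: numpy records "M" (month) as its unit.
partial = np.datetime64("2020-03")
unit, _ = np.datetime_data(partial.dtype)

# Adding hours is still meaningful: the month expands to its first instant
# and the result is expressed at the finer (hour) unit.
shifted = partial + np.timedelta64(4, "h")

print(partial, "-", unit)
print(shifted)
```

The custom time object described above plays a similar role for data retrieval: the recorded resolution tells the index how much of the date the user actually specified.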
[5]:
time = pyearthtools.data.time.Petdt('2020-03')
print(time, '-', time.resolution)
2020-03 - month
This affects .series in the following ways:
If the start time is given at a coarser resolution than the data, the first timestep of the data within that period is used.
If the interval is finer than the data resolution, the interval is rounded up to that resolution.
If the data is stored at irregular or inconsistent time values, a subset tolerance equal to the data resolution is applied automatically; it can be overridden manually with tolerance.
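The first two rules can be sketched with plain numpy to predict which timesteps a request should cover. This is a minimal illustration, not the pyearthtools implementation: the helper sketch_series_times and the UNIT_MINUTES table are hypothetical names, the data is assumed to be hourly, the end time is assumed to be exclusive, and "rounded up" is assumed to mean rounding up to the next multiple of the data resolution. The tolerance rule is not covered.

```python
import math
import numpy as np

# Hypothetical lookup used only in this sketch.
UNIT_MINUTES = {"minute": 1, "hour": 60, "day": 1440}

def sketch_series_times(start, end, interval, data_resolution=(1, "hour")):
    """Predict the timesteps of a series request (illustrative only)."""
    # Rule 1: a partially defined start such as '2021-01' expands to the
    # first instant of that period (2021-01-01T00:00).
    first = np.datetime64(start, "m")
    last = np.datetime64(end, "m")

    # Rule 2: an interval finer than the data resolution is rounded up
    # to a whole multiple of that resolution (assumption of this sketch).
    step_min = interval[0] * UNIT_MINUTES[interval[1]]
    res_min = data_resolution[0] * UNIT_MINUTES[data_resolution[1]]
    step_min = math.ceil(step_min / res_min) * res_min

    # The end time is treated as exclusive here (assumption of this sketch).
    return np.arange(first, last, np.timedelta64(step_min, "m"))

coarse_start = sketch_series_times("2021-01", "2021-01-04T05", (4, "hour"))
fine_interval = sketch_series_times("2021-01-01", "2021-01-12T05", (30, "minute"))
print(len(coarse_start), coarse_start[0], coarse_start[-1])
print(len(fine_interval), fine_interval[0], fine_interval[-1])
```

Under these assumptions the predicted counts can be compared with the time dimension reported by the two .series calls that follow.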
[6]:
era5.series('2021-01', '2021-01-04T05', interval = (4, 'hour'))
[6]:
<xarray.Dataset> Size: 166MB
Dimensions: (longitude: 1440, latitude: 721, time: 20)
Coordinates:
* longitude (longitude) float32 6kB -180.0 -179.8 -179.5 ... 179.5 179.8
* latitude (latitude) float32 3kB 90.0 89.75 89.5 ... -89.5 -89.75 -90.0
* time (time) datetime64[ns] 160B 2021-01-01 ... 2021-01-04T04:00:00
Data variables:
tcwv (time, latitude, longitude) float64 166MB dask.array<chunksize=(20, 182, 360), meta=np.ndarray>
Attributes:
Conventions: CF-1.6
history: 2021-07-04 08:07:33 UTC+1000 by era5_replication_tools-1.9....
license: Licence to use Copernicus Products: https://apps.ecmwf.int/...
summary: ERA5 is the fifth generation ECMWF atmospheric reanalysis o...
title: ERA5 single-levels reanalysis total_column_water_vapour 202...
[7]:
era5.series('2021-01-01', '2021-01-12T05', interval = (30, 'minute'))
[7]:
<xarray.Dataset> Size: 2GB
Dimensions: (longitude: 1440, latitude: 721, time: 269)
Coordinates:
* longitude (longitude) float32 6kB -180.0 -179.8 -179.5 ... 179.5 179.8
* latitude (latitude) float32 3kB 90.0 89.75 89.5 ... -89.5 -89.75 -90.0
* time (time) datetime64[ns] 2kB 2021-01-01 ... 2021-01-12T04:00:00
Data variables:
tcwv (time, latitude, longitude) float64 2GB dask.array<chunksize=(186, 182, 360), meta=np.ndarray>
Attributes:
Conventions: CF-1.6
history: 2021-07-04 08:07:33 UTC+1000 by era5_replication_tools-1.9....
license: Licence to use Copernicus Products: https://apps.ecmwf.int/...
summary: ERA5 is the fifth generation ECMWF atmospheric reanalysis o...
title: ERA5 single-levels reanalysis total_column_water_vapour 202...
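The sizes reported in the outputs above follow directly from the dimensions: each timestep holds 721 x 1440 float64 values. The short check below reproduces the 108MB, 166MB and 2GB figures. It ignores the few kilobytes taken by the coordinates, and the helper name dataset_bytes is illustrative. Because the variable is backed by a dask array, this is the memory needed only once the data is actually loaded.

```python
import numpy as np

N_LAT, N_LON = 721, 1440                         # grid size shown in the reprs
BYTES_PER_VALUE = np.dtype("float64").itemsize   # 8 bytes per value

def dataset_bytes(n_times):
    """Bytes needed for tcwv with the given number of timesteps."""
    return n_times * N_LAT * N_LON * BYTES_PER_VALUE

for n_times in (13, 20, 269):
    print(n_times, "timesteps:", f"{dataset_bytes(n_times) / 1e6:.0f} MB")
```

A check like this is useful before calling .compute() or .load() on a long series, since a fine interval over a long period grows linearly with the number of timesteps.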