干练的皮带 · 雪村曾网络发歌一炮走红 膨胀过后拍电影-中国新闻网· 2 月前 · |
深情的火腿肠 · Studio - 关于发布自动化项目· 3 月前 · |
谦虚好学的企鹅 · Costa Maria Vino ...· 4 月前 · |
天涯 · 如何使用8G以下显卡训练Stable ...· 1 年前 · |
一身肌肉的烤土司
1 年前 |
In this lesson, we cover some more advanced aspects of xarray.
import numpy as np
import xarray as xr
from matplotlib import pyplot as plt
%xmode Minimal
Interpolation#
In the previous lesson on Xarray, we learned how to select data based on its dimension coordinates and align data with dimension different coordinates.
But what if we want to estimate the value of the data variables at different coordinates.
This is where interpolation comes in.
# we write it out explicitly so we can see each point.
x_data = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
f = xr.DataArray(x_data**2, dims=['x'], coords={'x': x_data})
<xarray.DataArray (x: 11)>
array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100])
Coordinates:
* x (x) int64 0 1 2 3 4 5 6 7 8 9 10
xarray.DataArray
- x: 11
- 0 1 4 9 16 25 36 49 64 81 100
array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100])
- x(x)int640 1 2 3 4 5 6 7 8 9 10
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
f.plot(marker='o')
We only have data on the integer points in x.
But what if we wanted to estimate the value at, say, 4.5?
f.sel(x=4.5)
array(20.5)
Coordinates:
x float64 4.5
xarray.DataArray- 20.5
array(20.5)
- x()float644.5
array(4.5)
Interpolation uses scipy.interpolate under the hood.
There are different modes of interpolation.
# default
f.interp(x=4.5, method='linear').values
x_new = x_data + 0.5
f_interp_linear = f.interp(x=x_new, method='linear')
f_interp_cubic = f.interp(x=x_new, method='cubic')
f.plot(marker='o', label='original')
f_interp_linear.plot(marker='o', label='linear')
f_interp_cubic.plot(marker='o', label='cubic')
plt.legend()
You can apply interpolation to any dimension, and even to multiple dimensions at a time.
(Multidimensional interpolation only supports mode='nearest'
and mode='linear'
.)
But keep in mind that Xarray has no built-in understanding of geography.
If you use interp
on lat / lon coordinates, it will just perform naive interpolation of the lat / lon values.
More sophisticated treatment of spherical geometry requires another package such as xesmf.
Groupby#
Xarray copies Pandas’ very useful groupby functionality, enabling the “split / apply / combine” workflow on xarray DataArrays and Datasets. In the first part of the lesson, we will learn to use groupby by analyzing sea-surface temperature data.
First we load a dataset. We will use the NOAA Extended Reconstructed Sea Surface Temperature (ERSST) v5 product, a widely used and trusted gridded compilation of of historical data going back to 1854.
Since the data is provided via an OPeNDAP server, we can load it directly without downloading anything:
url = 'http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/noaa.ersst.v5/sst.mnmean.nc'
ds = xr.open_dataset(url, drop_variables=['time_bnds'])
ds = ds.sel(time=slice('1960', '2018')).load()
Dimensions: (lat: 89, lon: 180, time: 708)
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2018-12-01
Data variables:
sst (time, lat, lon) float32 -1.8 -1.8 -1.8 -1.8 ... nan nan nan nan
Attributes: (12/38)
climatology: Climatology is based on 1971-2000 SST, X...
description: In situ data: ICOADS2.5 before 2007 and ...
keywords_vocabulary: NASA Global Change Master Directory (GCM...
keywords: Earth Science > Oceans > Ocean Temperatu...
instrument: Conventional thermometers
source_comment: SSTs were observed by conventional therm...
... ...
license: No constraints on data access or use
comment: SSTs were observed by conventional therm...
summary: ERSST.v5 is developed based on v4 after ...
dataset_title: NOAA Extended Reconstructed SST V5
data_modified: 2021-11-07
DODS_EXTRA.Unlimited_Dimension: time
xarray.Dataset- lat: 89
- lon: 180
- time: 708
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- time(time)datetime64[ns]1960-01-01 ... 2018-12-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
- sst(time, lat, lon)float32-1.8 -1.8 -1.8 -1.8 ... nan nan nan
- long_name :
- Monthly Means of Sea Surface Temperature
- units :
- degC
- var_desc :
- Sea Surface Temperature
- level_desc :
- Surface
- statistic :
- Mean
- dataset :
- NOAA Extended Reconstructed SST V5
- parent_stat :
- Individual Values
- actual_range :
- [-1.8 42.32636]
- valid_range :
- [-1.8 45. ]
- _ChunkSizes :
- [ 1 89 180]
array([[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]]], dtype=float32)
- climatology :
- Climatology is based on 1971-2000 SST, Xue, Y., T. M. Smith, and R. W. Reynolds, 2003: Interdecadal changes of 30-yr SST normals during 1871.2000. Journal of Climate, 16, 1601-1612.
- description :
- In situ data: ICOADS2.5 before 2007 and NCEP in situ data from 2008 to present. Ice data: HadISST ice before 2010 and NCEP ice after 2010.
- keywords_vocabulary :
- NASA Global Change Master Directory (GCMD) Science Keywords
- keywords :
- Earth Science > Oceans > Ocean Temperature > Sea Surface Temperature >
- instrument :
- Conventional thermometers
- source_comment :
- SSTs were observed by conventional thermometers in Buckets (insulated or un-insulated canvas and wooded buckets) or Engine Room Intaker
- geospatial_lon_min :
- -1.0
- geospatial_lon_max :
- 359.0
- geospatial_laty_max :
- 89.0
- geospatial_laty_min :
- -89.0
- geospatial_lat_max :
- 89.0
- geospatial_lat_min :
- -89.0
- geospatial_lat_units :
- degrees_north
- geospatial_lon_units :
- degrees_east
- cdm_data_type :
- Grid
- project :
- NOAA Extended Reconstructed Sea Surface Temperature (ERSST)
- original_publisher_url :
- http://www.ncdc.noaa.gov
- References :
- https://www.ncdc.noaa.gov/data-access/marineocean-data/extended-reconstructed-sea-surface-temperature-ersst-v5 at NCEI and http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.v5.html
- source :
- In situ data: ICOADS R3.0 before 2015, NCEP in situ GTS from 2016 to present, and Argo SST from 1999 to present. Ice data: HadISST2 ice before 2015, and NCEP ice after 2015
- title :
- NOAA ERSSTv5 (in situ only)
- history :
- created 07/2017 by PSD data using NCEI's ERSST V5 NetCDF values
- institution :
- This version written at NOAA/ESRL PSD: obtained from NOAA/NESDIS/National Centers for Environmental Information and time aggregated. Original Full Source: NOAA/NESDIS/NCEI/CCOG
- citation :
- Huang et al, 2017: Extended Reconstructed Sea Surface Temperatures Version 5 (ERSSTv5): Upgrades, Validations, and Intercomparisons. Journal of Climate, https://doi.org/10.1175/JCLI-D-16-0836.1
- platform :
- Ship and Buoy SSTs from ICOADS R3.0 and NCEP GTS
- standard_name_vocabulary :
- CF Standard Name Table (v40, 25 January 2017)
- processing_level :
- NOAA Level 4
- Conventions :
- CF-1.6, ACDD-1.3
- metadata_link :
- :metadata_link = https://doi.org/10.7289/V5T72FNM (original format)
- creator_name :
- Boyin Huang (original)
- date_created :
- 2017-06-30T12:18:00Z (original)
- product_version :
- Version 5
- creator_url_original :
- https://www.ncei.noaa.gov
- license :
- No constraints on data access or use
- comment :
- SSTs were observed by conventional thermometers in Buckets (insulated or un-insulated canvas and wooded buckets), Engine Room Intakers, or floats and drifters
- summary :
- ERSST.v5 is developed based on v4 after revisions of 8 parameters using updated data sets and advanced knowledge of ERSST analysis
- dataset_title :
- NOAA Extended Reconstructed SST V5
- data_modified :
- 2021-11-07
- DODS_EXTRA.Unlimited_Dimension :
- time
Let’s do some basic visualizations of the data, just to make sure it looks reasonable.
ds.sst[0].plot(vmin=-2, vmax=30)
Note that xarray correctly parsed the time index, resulting in a Pandas datetime index on the time dimension.
ds.time
<xarray.DataArray 'time' (time: 708)>
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2018-12-01
Attributes:
long_name: Time
delta_t: 0000-01-00 00:00:00
avg_period: 0000-01-00 00:00:00
prev_avg_period: 0000-00-07 00:00:00
standard_name: time
axis: T
actual_range: [19723. 80992.]
_ChunkSizes: 1
xarray.DataArray'time'- time: 708
- 1960-01-01 1960-02-01 1960-03-01 ... 2018-10-01 2018-11-01 2018-12-01
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
- time(time)datetime64[ns]1960-01-01 ... 2018-12-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
ds.sst.sel(lon=300, lat=50).plot()
As we can see from the plot, the timeseries at any one point is totally dominated by the seasonal cycle. We would like to remove this seasonal cycle (called the “climatology”) in order to better see the long-term variaitions in temperature. We will accomplish this using groupby.
The syntax of Xarray’s groupby is almost identical to Pandas.
We will first apply groupby to a single DataArray.
ds.sst.groupby?
----------
group : str, DataArray or IndexVariable
Array whose unique values should be used to group this array. If a
string, must be the name of a variable contained in this dataset.
squeeze : bool, optional
If "group" is a dimension of any arrays in this dataset, `squeeze`
controls whether the subarrays have a dimension of length 1 along
that dimension or if the dimension is squeezed out.
restore_coord_dims : bool, optional
If True, also restore the dimension order of multi-dimensional
coordinates.
Returns
-------
grouped
A `GroupBy` object patterned after `pandas.GroupBy` that can be
iterated over in the form of `(unique_value, grouped_array)` pairs.
Examples
--------
Calculate daily anomalies for daily data:
>>> da = xr.DataArray(
... np.linspace(0, 1826, num=1827),
... coords=[pd.date_range("1/1/2000", "31/12/2004", freq="D")],
... dims="time",
... )
<xarray.DataArray (time: 1827)>
array([0.000e+00, 1.000e+00, 2.000e+00, ..., 1.824e+03, 1.825e+03,
1.826e+03])
Coordinates:
* time (time) datetime64[ns] 2000-01-01 2000-01-02 ... 2004-12-31
>>> da.groupby("time.dayofyear") - da.groupby("time.dayofyear").mean("time")
<xarray.DataArray (time: 1827)>
array([-730.8, -730.8, -730.8, ..., 730.2, 730.2, 730.5])
Coordinates:
* time (time) datetime64[ns] 2000-01-01 2000-01-02 ... 2004-12-31
dayofyear (time) int64 1 2 3 4 5 6 7 8 ... 359 360 361 362 363 364 365 366
See Also
--------
core.groupby.DataArrayGroupBy
core.groupby.DatasetGroupBy
File: /srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/common.py
Type: method
Split Step#
The most important argument is group
: this defines the unique values we will us to “split” the data for grouped analysis. We can pass either a DataArray or a name of a variable in the dataset. Lets first use a DataArray. Just like with Pandas, we can use the time indexe to extract specific components of dates and times. Xarray uses a special syntax for this .dt
, called the DatetimeAccessor
.
ds.time.dt
<xarray.DataArray 'month' (time: 708)>
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
Coordinates:
* time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2018-12-01
xarray.DataArray
'month'- time: 708
- 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 ... 11 12 1 2 3 4 5 6 7 8 9 10 11 12
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
- time(time)datetime64[ns]1960-01-01 ... 2018-12-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
ds.time.dt.year
We can use these arrays in a groupby operation:
gb = ds.sst.groupby(ds.time.dt.month)
Xarray also offers a more concise syntax when the variable you’re grouping on is already present in the dataset. This is identical to the previous line:
gb = ds.sst.groupby('time.month')
Now that the data are split, we can manually iterate over the group. The iterator returns the key (group name) and the value (the actual dataset corresponding to that group) for each group.
for group_name, group_da in gb:
# stop iterating after the first loop
break
print(group_name)
group_da
<xarray.DataArray 'sst' (time: 59, lat: 89, lon: 180)>
array([[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]]], dtype=float32)
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* time (time) datetime64[ns] 1960-01-01 1961-01-01 ... 2018-01-01
Attributes:
long_name: Monthly Means of Sea Surface Temperature
units: degC
var_desc: Sea Surface Temperature
level_desc: Surface
statistic: Mean
dataset: NOAA Extended Reconstructed SST V5
parent_stat: Individual Values
actual_range: [-1.8 42.32636]
valid_range: [-1.8 45. ]
_ChunkSizes: [ 1 89 180]
xarray.DataArray'sst'- time: 59
- lat: 89
- lon: 180
- -1.8 -1.8 -1.8 -1.8 -1.8 -1.8 -1.8 ... nan nan nan nan nan nan nan
array([[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]],
[[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[-1.8, -1.8, -1.8, ..., -1.8, -1.8, -1.8],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]]], dtype=float32)
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- time(time)datetime64[ns]1960-01-01 ... 2018-01-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1961-01-01T00:00:00.000000000',
'1962-01-01T00:00:00.000000000', '1963-01-01T00:00:00.000000000',
'1964-01-01T00:00:00.000000000', '1965-01-01T00:00:00.000000000',
'1966-01-01T00:00:00.000000000', '1967-01-01T00:00:00.000000000',
'1968-01-01T00:00:00.000000000', '1969-01-01T00:00:00.000000000',
'1970-01-01T00:00:00.000000000', '1971-01-01T00:00:00.000000000',
'1972-01-01T00:00:00.000000000', '1973-01-01T00:00:00.000000000',
'1974-01-01T00:00:00.000000000', '1975-01-01T00:00:00.000000000',
'1976-01-01T00:00:00.000000000', '1977-01-01T00:00:00.000000000',
'1978-01-01T00:00:00.000000000', '1979-01-01T00:00:00.000000000',
'1980-01-01T00:00:00.000000000', '1981-01-01T00:00:00.000000000',
'1982-01-01T00:00:00.000000000', '1983-01-01T00:00:00.000000000',
'1984-01-01T00:00:00.000000000', '1985-01-01T00:00:00.000000000',
'1986-01-01T00:00:00.000000000', '1987-01-01T00:00:00.000000000',
'1988-01-01T00:00:00.000000000', '1989-01-01T00:00:00.000000000',
'1990-01-01T00:00:00.000000000', '1991-01-01T00:00:00.000000000',
'1992-01-01T00:00:00.000000000', '1993-01-01T00:00:00.000000000',
'1994-01-01T00:00:00.000000000', '1995-01-01T00:00:00.000000000',
'1996-01-01T00:00:00.000000000', '1997-01-01T00:00:00.000000000',
'1998-01-01T00:00:00.000000000', '1999-01-01T00:00:00.000000000',
'2000-01-01T00:00:00.000000000', '2001-01-01T00:00:00.000000000',
'2002-01-01T00:00:00.000000000', '2003-01-01T00:00:00.000000000',
'2004-01-01T00:00:00.000000000', '2005-01-01T00:00:00.000000000',
'2006-01-01T00:00:00.000000000', '2007-01-01T00:00:00.000000000',
'2008-01-01T00:00:00.000000000', '2009-01-01T00:00:00.000000000',
'2010-01-01T00:00:00.000000000', '2011-01-01T00:00:00.000000000',
'2012-01-01T00:00:00.000000000', '2013-01-01T00:00:00.000000000',
'2014-01-01T00:00:00.000000000', '2015-01-01T00:00:00.000000000',
'2016-01-01T00:00:00.000000000', '2017-01-01T00:00:00.000000000',
'2018-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
- long_name :
- Monthly Means of Sea Surface Temperature
- units :
- degC
- var_desc :
- Sea Surface Temperature
- level_desc :
- Surface
- statistic :
- Mean
- dataset :
- NOAA Extended Reconstructed SST V5
- parent_stat :
- Individual Values
- actual_range :
- [-1.8 42.32636]
- valid_range :
- [-1.8 45. ]
- _ChunkSizes :
- [ 1 89 180]
Map & Combine#
Now that we have groups defined, it’s time to “apply” a calculation to the group. Like in Pandas, these calculations can either be:
aggregation: reduces the size of the group
transformation: preserves the group’s full size
At then end of the apply step, xarray will automatically combine the aggregated / transformed groups back into a single object.
Warning
Xarray calls the “apply” step map
. This is different from Pandas!
The most fundamental way to apply is with the .map
method.
gb.map?
Signature: gb.map(func, shortcut=False, args=(), **kwargs)
Docstring:
Apply a function to each array in the group and concatenate them
together into a new array.
`func` is called like `func(ar, *args, **kwargs)` for each array `ar`
in this group.
Apply uses heuristics (like `pandas.GroupBy.apply`) to figure out how
to stack together the array. The rule is:
1. If the dimension along which the group coordinate is defined is
still in the first grouped array after applying `func`, then stack
over this dimension.
2. Otherwise, stack over the new dimension given by name of this
grouping (the argument to the `groupby` function).
Parameters
----------
func : callable
Callable to apply to each array.
shortcut : bool, optional
Whether or not to shortcut evaluation under the assumptions that:
(1) The action of `func` does not depend on any of the array
metadata (attributes or coordinates) but only on the data and
dimensions.
(2) The action of `func` creates arrays with homogeneous metadata,
that is, with the same dimensions and attributes.
If these conditions are satisfied `shortcut` provides significant
speedup. This should be the case for many common groupby operations
(e.g., applying numpy ufuncs).
*args : tuple, optional
Positional arguments passed to `func`.
**kwargs
Used to call `func(ar, **kwargs)` for each array `ar`.
Returns
-------
applied : DataArray or DataArray
The result of splitting, applying and combining this array.
File: /srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/groupby.py
Type: method
Aggregations#
.apply
accepts as its argument a function. We can pass an existing function:
gb.map(np.mean)
<xarray.DataArray 'sst' (month: 12)>
array([13.659641, 13.768647, 13.76488 , 13.684034, 13.642146, 13.713043,
13.921847, 14.093956, 13.982147, 13.691116, 13.506494, 13.529454],
dtype=float32)
Coordinates:
* month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
xarray.DataArray'sst'- month: 12
- 13.66 13.77 13.76 13.68 13.64 13.71 ... 14.09 13.98 13.69 13.51 13.53
array([13.659641, 13.768647, 13.76488 , 13.684034, 13.642146, 13.713043,
13.921847, 14.093956, 13.982147, 13.691116, 13.506494, 13.529454],
dtype=float32)
- month(month)int641 2 3 4 5 6 7 8 9 10 11 12
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
Because we specified no extra arguments (like axis
) the function was applied over all space and time dimensions. This is not what we wanted. Instead, we could define a custom function. This function takes a single argument–the group dataset–and returns a new dataset to be combined:
def time_mean(a):
return a.mean(dim='time')
gb.apply(time_mean)
<xarray.DataArray 'sst' (month: 12, lat: 89, lon: 180)>
array([[[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.7995025, -1.7995973, -1.7998415, ..., -1.7997988,
-1.7996519, -1.7995045],
[-1.7995876, -1.7997634, -1.8000009, ..., -1.8000009,
-1.7998358, -1.7996247],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]]], dtype=float32)
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
xarray.DataArray'sst'- month: 12
- lat: 89
- lon: 180
- -1.8 -1.8 -1.8 -1.8 -1.8 -1.8 -1.8 ... nan nan nan nan nan nan nan
array([[[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.7995025, -1.7995973, -1.7998415, ..., -1.7997988,
-1.7996519, -1.7995045],
[-1.7995876, -1.7997634, -1.8000009, ..., -1.8000009,
-1.7998358, -1.7996247],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]]], dtype=float32)
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- month(month)int641 2 3 4 5 6 7 8 9 10 11 12
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
Like Pandas, xarray’s groupby object has many built-in aggregation operations (e.g. mean
, min
, max
, std
, etc):
# this does the same thing as the previous cell
sst_mm = gb.mean(dim='time')
sst_mm
<xarray.DataArray 'sst' (month: 12, lat: 89, lon: 180)>
array([[[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.7995025, -1.7995973, -1.7998415, ..., -1.7997988,
-1.7996519, -1.7995045],
[-1.7995876, -1.7997634, -1.8000009, ..., -1.8000009,
-1.7998358, -1.7996247],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]]], dtype=float32)
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
xarray.DataArray'sst'- month: 12
- lat: 89
- lon: 180
- -1.8 -1.8 -1.8 -1.8 -1.8 -1.8 -1.8 ... nan nan nan nan nan nan nan
array([[[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.7995025, -1.7995973, -1.7998415, ..., -1.7997988,
-1.7996519, -1.7995045],
[-1.7995876, -1.7997634, -1.8000009, ..., -1.8000009,
-1.7998358, -1.7996247],
[-1.8000009, -1.8000009, -1.8000009, ..., -1.8000009,
-1.8000009, -1.8000009],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]]], dtype=float32)
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- month(month)int641 2 3 4 5 6 7 8 9 10 11 12
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
So we did what we wanted to do: calculate the climatology at every point in the dataset. Let’s look at the data a bit.
Climatlogy at a specific point in the North Atlantic
sst_mm.sel(lon=300, lat=50).plot()
Transformations#
Now we want to remove this climatology from the dataset, to examine the residual, called the anomaly, which is the interesting part from a climate perspective.
Removing the seasonal climatology is a perfect example of a transformation: it operates over a group, but doesn’t change the size of the dataset. Here is one way to code it.
def remove_time_mean(x):
return x - x.mean(dim='time')
ds_anom = ds.groupby('time.month').apply(remove_time_mean)
ds_anom
Dimensions: (lat: 89, lon: 180, time: 708)
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2018-12-01
Data variables:
sst (time, lat, lon) float32 9.537e-07 9.537e-07 9.537e-07 ... nan nan
xarray.Dataset
- lat: 89
- lon: 180
- time: 708
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- time(time)datetime64[ns]1960-01-01 ... 2018-12-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
- sst(time, lat, lon)float329.537e-07 9.537e-07 ... nan nan
array([[[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]],
[[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]],
[[-4.9746037e-04, -4.0268898e-04, -1.5842915e-04, ...,
-2.0110607e-04, -3.4809113e-04, -4.9543381e-04],
[-4.1234493e-04, -2.3651123e-04, 9.5367432e-07, ...,
9.5367432e-07, -1.6415119e-04, -3.7527084e-04],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]]], dtype=float32)
In the above example, we applied groupby
to a Dataset
instead of a DataArray
.
Xarray makes these sorts of transformations easy by supporting groupby arithmetic.
This concept is easiest explained with an example:
gb = ds.groupby('time.month')
ds_anom = gb - gb.mean(dim='time')
ds_anom
Dimensions: (lat: 89, lon: 180, time: 708)
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2018-12-01
month (time) int64 1 2 3 4 5 6 7 8 9 10 11 ... 2 3 4 5 6 7 8 9 10 11 12
Data variables:
sst (time, lat, lon) float32 9.537e-07 9.537e-07 9.537e-07 ... nan nan
xarray.Dataset- lat: 89
- lon: 180
- time: 708
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- time(time)datetime64[ns]1960-01-01 ... 2018-12-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
- month(time)int641 2 3 4 5 6 7 ... 6 7 8 9 10 11 12
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
- sst(time, lat, lon)float329.537e-07 9.537e-07 ... nan nan
array([[[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]],
[[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]],
[[-4.9746037e-04, -4.0268898e-04, -1.5842915e-04, ...,
-2.0110607e-04, -3.4809113e-04, -4.9543381e-04],
[-4.1234493e-04, -2.3651123e-04, 9.5367432e-07, ...,
9.5367432e-07, -1.6415119e-04, -3.7527084e-04],
[ 9.5367432e-07, 9.5367432e-07, 9.5367432e-07, ...,
9.5367432e-07, 9.5367432e-07, 9.5367432e-07],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]]], dtype=float32)
Now we can view the climate signal without the overwhelming influence of the seasonal cycle.
Timeseries at a single point in the North Atlantic
ds_anom.sst.sel(lon=300, lat=50).plot()
Resample#
Resample in xarray is nearly identical to Pandas.
It can be applied only to time-index dimensions. Here we compute the five-year mean.
It is effectively a group-by operation, and uses the same basic syntax.
Note that resampling changes the length of the the output arrays.
ds_anom_resample = ds_anom.resample(time='5Y').mean(dim='time')
ds_anom_resample
Dimensions: (time: 13, lat: 89, lon: 180)
Coordinates:
* time (time) datetime64[ns] 1960-12-31 1965-12-31 ... 2020-12-31
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
Data variables:
sst (time, lat, lon) float32 -0.0005707 -0.0005493 ... nan nan
xarray.Dataset- time: 13
- lat: 89
- lon: 180
- time(time)datetime64[ns]1960-12-31 ... 2020-12-31
array(['1960-12-31T00:00:00.000000000', '1965-12-31T00:00:00.000000000',
'1970-12-31T00:00:00.000000000', '1975-12-31T00:00:00.000000000',
'1980-12-31T00:00:00.000000000', '1985-12-31T00:00:00.000000000',
'1990-12-31T00:00:00.000000000', '1995-12-31T00:00:00.000000000',
'2000-12-31T00:00:00.000000000', '2005-12-31T00:00:00.000000000',
'2010-12-31T00:00:00.000000000', '2015-12-31T00:00:00.000000000',
'2020-12-31T00:00:00.000000000'], dtype='datetime64[ns]')
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- sst(time, lat, lon)
float32-0.0005707 -0.0005493 ... nan nan
array([[[-5.70684671e-04, -5.49256802e-04, -5.38418710e-04, ...,
-4.15881485e-04, -5.20139933e-04, -5.83072484e-04],
[-5.15530526e-04, -5.89827716e-04, -6.42200292e-04, ...,
-1.39196709e-04, -3.24239343e-04, -4.26093728e-04],
[-1.40964985e-05, -1.79181501e-04, -2.83718109e-04, ...,
-1.24106809e-04, -7.90158883e-05, -3.50972005e-05],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]],
[[-5.70684671e-04, -5.49256802e-04, -5.38418710e-04, ...,
-4.15881485e-04, -5.20139933e-04, -5.83072484e-04],
[-5.15530526e-04, -5.89827716e-04, -6.42200292e-04, ...,
-1.39196709e-04, -3.24239343e-04, -4.26093728e-04],
[-1.40964985e-05, -1.79181501e-04, -2.83718109e-04, ...,
-1.24106809e-04, -7.90158883e-05, -3.50972005e-05],
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]],
[[ 1.06710363e-02, 1.02709997e-02, 1.00688171e-02, ...,
7.78120104e-03, 9.72754415e-03, 1.09023089e-02],
[ 9.64192208e-03, 1.10286539e-02, 1.20063825e-02, ...,
2.61675986e-03, 6.07070001e-03, 7.97217432e-03],
[ 2.81731278e-04, 3.36324517e-03, 5.31469798e-03, ...,
2.31114682e-03, 1.49351358e-03, 6.73671544e-04],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan]]],
dtype=float32)
ds_anom.sst.sel(lon=300, lat=50).plot()
ds_anom_resample.sst.sel(lon=300, lat=50).plot(marker='o')
Rolling is also similar to pandas.
It does not change the length of the arrays.
Instead, it allows a moving window to be applied to the data at each point.
ds_anom_rolling = ds_anom.rolling(time=12, center=True).mean()
ds_anom_rolling
Dimensions: (lat: 89, lon: 180, time: 708)
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2018-12-01
month (time) int64 1 2 3 4 5 6 7 8 9 10 11 ... 2 3 4 5 6 7 8 9 10 11 12
Data variables:
sst (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
xarray.Dataset- lat: 89
- lon: 180
- time: 708
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- time(time)datetime64[ns]1960-01-01 ... 2018-12-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
- month(time)int641 2 3 4 5 6 7 ... 6 7 8 9 10 11 12
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
- sst(time, lat, lon)float32nan nan nan nan ... nan nan nan nan
array([[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]]], dtype=float32)
ds_anom.sst.sel(lon=300, lat=50).plot(label='monthly anom')
ds_anom_resample.sst.sel(lon=300, lat=50).plot(marker='o', label='5 year resample')
ds_anom_rolling.sst.sel(lon=300, lat=50).plot(label='12 month rolling mean', color='k')
plt.legend()
Coarsen#
coarsen
is a simple way to reduce the size of your data along one or more axes.
It is very similar to resample
when operating on time dimensions; the key differnce is that coarsen
only operates on fixed blocks of data, irrespective of the coordinate values, while resample
actually looks at the coordinates to figure out, e.g. what month a particular data point is in.
For regularly-spaced monthly data beginning in January, the following should be equivalent to annual resampling.
However, results would different for irregularly-spaced data.
ds.coarsen(time=12).mean()
Dimensions: (time: 59, lat: 89, lon: 180)
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* time (time) datetime64[ns] 1960-06-16T08:00:00 ... 2018-06-16T12:00:00
Data variables:
sst (time, lat, lon) float32 -1.8 -1.8 -1.8 -1.8 ... nan nan nan nan
Attributes: (12/38)
climatology: Climatology is based on 1971-2000 SST, X...
description: In situ data: ICOADS2.5 before 2007 and ...
keywords_vocabulary: NASA Global Change Master Directory (GCM...
keywords: Earth Science > Oceans > Ocean Temperatu...
instrument: Conventional thermometers
source_comment: SSTs were observed by conventional therm...
... ...
license: No constraints on data access or use
comment: SSTs were observed by conventional therm...
summary: ERSST.v5 is developed based on v4 after ...
dataset_title: NOAA Extended Reconstructed SST V5
data_modified: 2021-11-07
DODS_EXTRA.Unlimited_Dimension: time
xarray.Dataset- time: 59
- lat: 89
- lon: 180
- lat(lat)float3288.0 86.0 84.0 ... -86.0 -88.0
- units :
- degrees_north
- long_name :
- Latitude
- actual_range :
- [ 88. -88.]
- standard_name :
- latitude
- axis :
- Y
- coordinate_defines :
- center
array([ 88., 86., 84., 82., 80., 78., 76., 74., 72., 70., 68., 66.,
64., 62., 60., 58., 56., 54., 52., 50., 48., 46., 44., 42.,
40., 38., 36., 34., 32., 30., 28., 26., 24., 22., 20., 18.,
16., 14., 12., 10., 8., 6., 4., 2., 0., -2., -4., -6.,
-8., -10., -12., -14., -16., -18., -20., -22., -24., -26., -28., -30.,
-32., -34., -36., -38., -40., -42., -44., -46., -48., -50., -52., -54.,
-56., -58., -60., -62., -64., -66., -68., -70., -72., -74., -76., -78.,
-80., -82., -84., -86., -88.], dtype=float32)
- lon(lon)float320.0 2.0 4.0 ... 354.0 356.0 358.0
- units :
- degrees_east
- long_name :
- Longitude
- actual_range :
- [ 0. 358.]
- standard_name :
- longitude
- axis :
- X
- coordinate_defines :
- center
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22.,
24., 26., 28., 30., 32., 34., 36., 38., 40., 42., 44., 46.,
48., 50., 52., 54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88., 90., 92., 94.,
96., 98., 100., 102., 104., 106., 108., 110., 112., 114., 116., 118.,
120., 122., 124., 126., 128., 130., 132., 134., 136., 138., 140., 142.,
144., 146., 148., 150., 152., 154., 156., 158., 160., 162., 164., 166.,
168., 170., 172., 174., 176., 178., 180., 182., 184., 186., 188., 190.,
192., 194., 196., 198., 200., 202., 204., 206., 208., 210., 212., 214.,
216., 218., 220., 222., 224., 226., 228., 230., 232., 234., 236., 238.,
240., 242., 244., 246., 248., 250., 252., 254., 256., 258., 260., 262.,
264., 266., 268., 270., 272., 274., 276., 278., 280., 282., 284., 286.,
288., 290., 292., 294., 296., 298., 300., 302., 304., 306., 308., 310.,
312., 314., 316., 318., 320., 322., 324., 326., 328., 330., 332., 334.,
336., 338., 340., 342., 344., 346., 348., 350., 352., 354., 356., 358.],
dtype=float32)
- time(time)datetime64[ns]1960-06-16T08:00:00 ... 2018-06-...
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-06-16T08:00:00.000000000', '1961-06-16T12:00:00.000000000',
'1962-06-16T12:00:00.000000000', '1963-06-16T12:00:00.000000000',
'1964-06-16T08:00:00.000000000', '1965-06-16T12:00:00.000000000',
'1966-06-16T12:00:00.000000000', '1967-06-16T12:00:00.000000000',
'1968-06-16T08:00:00.000000000', '1969-06-16T12:00:00.000000000',
'1970-06-16T12:00:00.000000000', '1971-06-16T12:00:00.000000000',
'1972-06-16T08:00:00.000000000', '1973-06-16T12:00:00.000000000',
'1974-06-16T12:00:00.000000000', '1975-06-16T12:00:00.000000000',
'1976-06-16T08:00:00.000000000', '1977-06-16T12:00:00.000000000',
'1978-06-16T12:00:00.000000000', '1979-06-16T12:00:00.000000000',
'1980-06-16T08:00:00.000000000', '1981-06-16T12:00:00.000000000',
'1982-06-16T12:00:00.000000000', '1983-06-16T12:00:00.000000000',
'1984-06-16T08:00:00.000000000', '1985-06-16T12:00:00.000000000',
'1986-06-16T12:00:00.000000000', '1987-06-16T12:00:00.000000000',
'1988-06-16T08:00:00.000000000', '1989-06-16T12:00:00.000000000',
'1990-06-16T12:00:00.000000000', '1991-06-16T12:00:00.000000000',
'1992-06-16T08:00:00.000000000', '1993-06-16T12:00:00.000000000',
'1994-06-16T12:00:00.000000000', '1995-06-16T12:00:00.000000000',
'1996-06-16T08:00:00.000000000', '1997-06-16T12:00:00.000000000',
'1998-06-16T12:00:00.000000000', '1999-06-16T12:00:00.000000000',
'2000-06-16T08:00:00.000000000', '2001-06-16T12:00:00.000000000',
'2002-06-16T12:00:00.000000000', '2003-06-16T12:00:00.000000000',
'2004-06-16T08:00:00.000000000', '2005-06-16T12:00:00.000000000',
'2006-06-16T12:00:00.000000000', '2007-06-16T12:00:00.000000000',
'2008-06-16T08:00:00.000000000', '2009-06-16T12:00:00.000000000',
'2010-06-16T12:00:00.000000000', '2011-06-16T12:00:00.000000000',
'2012-06-16T08:00:00.000000000', '2013-06-16T12:00:00.000000000',
'2014-06-16T12:00:00.000000000', '2015-06-16T12:00:00.000000000',
'2016-06-16T08:00:00.000000000', '2017-06-16T12:00:00.000000000',
'2018-06-16T12:00:00.000000000'], dtype='datetime64[ns]')
- sst(time, lat, lon)float32-1.8 -1.8 -1.8 -1.8 ... nan nan nan
- long_name :
- Monthly Means of Sea Surface Temperature
- units :
- degC
- var_desc :
- Sea Surface Temperature
- level_desc :
- Surface
- statistic :
- Mean
- dataset :
- NOAA Extended Reconstructed SST V5
- parent_stat :
- Individual Values
- actual_range :
- [-1.8 42.32636]
- valid_range :
- [-1.8 45. ]
- _ChunkSizes :
- [ 1 89 180]
array([[[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]],
[[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[-1.7999998, -1.7999998, -1.7999998, ..., -1.7999998,
-1.7999998, -1.7999998],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan]]], dtype=float32)
- climatology :
- Climatology is based on 1971-2000 SST, Xue, Y., T. M. Smith, and R. W. Reynolds, 2003: Interdecadal changes of 30-yr SST normals during 1871.2000. Journal of Climate, 16, 1601-1612.
- description :
- In situ data: ICOADS2.5 before 2007 and NCEP in situ data from 2008 to present. Ice data: HadISST ice before 2010 and NCEP ice after 2010.
- keywords_vocabulary :
- NASA Global Change Master Directory (GCMD) Science Keywords
- keywords :
- Earth Science > Oceans > Ocean Temperature > Sea Surface Temperature >
- instrument :
- Conventional thermometers
- source_comment :
- SSTs were observed by conventional thermometers in Buckets (insulated or un-insulated canvas and wooded buckets) or Engine Room Intaker
- geospatial_lon_min :
- -1.0
- geospatial_lon_max :
- 359.0
- geospatial_laty_max :
- 89.0
- geospatial_laty_min :
- -89.0
- geospatial_lat_max :
- 89.0
- geospatial_lat_min :
- -89.0
- geospatial_lat_units :
- degrees_north
- geospatial_lon_units :
- degrees_east
- cdm_data_type :
- Grid
- project :
- NOAA Extended Reconstructed Sea Surface Temperature (ERSST)
- original_publisher_url :
- http://www.ncdc.noaa.gov
- References :
- https://www.ncdc.noaa.gov/data-access/marineocean-data/extended-reconstructed-sea-surface-temperature-ersst-v5 at NCEI and http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.v5.html
- source :
- In situ data: ICOADS R3.0 before 2015, NCEP in situ GTS from 2016 to present, and Argo SST from 1999 to present. Ice data: HadISST2 ice before 2015, and NCEP ice after 2015
- title :
- NOAA ERSSTv5 (in situ only)
- history :
- created 07/2017 by PSD data using NCEI's ERSST V5 NetCDF values
- institution :
- This version written at NOAA/ESRL PSD: obtained from NOAA/NESDIS/National Centers for Environmental Information and time aggregated. Original Full Source: NOAA/NESDIS/NCEI/CCOG
- citation :
- Huang et al, 2017: Extended Reconstructed Sea Surface Temperatures Version 5 (ERSSTv5): Upgrades, Validations, and Intercomparisons. Journal of Climate, https://doi.org/10.1175/JCLI-D-16-0836.1
- platform :
- Ship and Buoy SSTs from ICOADS R3.0 and NCEP GTS
- standard_name_vocabulary :
- CF Standard Name Table (v40, 25 January 2017)
- processing_level :
- NOAA Level 4
- Conventions :
- CF-1.6, ACDD-1.3
- metadata_link :
- :metadata_link = https://doi.org/10.7289/V5T72FNM (original format)
- creator_name :
- Boyin Huang (original)
- date_created :
- 2017-06-30T12:18:00Z (original)
- product_version :
- Version 5
- creator_url_original :
- https://www.ncei.noaa.gov
- license :
- No constraints on data access or use
- comment :
- SSTs were observed by conventional thermometers in Buckets (insulated or un-insulated canvas and wooded buckets), Engine Room Intakers, or floats and drifters
- summary :
- ERSST.v5 is developed based on v4 after revisions of 8 parameters using updated data sets and advanced knowledge of ERSST analysis
- dataset_title :
- NOAA Extended Reconstructed SST V5
- data_modified :
- 2021-11-07
- DODS_EXTRA.Unlimited_Dimension :
- time
Coarsen also works on spatial coordinates (or any coordiantes).
ds_coarse = ds.coarsen(lon=4, lat=4, boundary='pad').mean()
ds_coarse.sst.isel(time=0).plot(vmin=2, vmax=30, figsize=(12, 5), edgecolor='k')
An Advanced Example#
In this example we will show a realistic workflow with Xarray.
We will
Load a “basin mask” dataset
Interpolate the basins to our SST dataset coordinates
Group the SST by basin
Convert to Pandas Dataframe and plot mean SST by basin
basin = xr.open_dataset('http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NODC/.WOA09/.Masks/.basin/dods')
basin
Dimensions: (Z: 33, X: 360, Y: 180)
Coordinates:
* Z (Z) float32 0.0 10.0 20.0 30.0 50.0 ... 4e+03 4.5e+03 5e+03 5.5e+03
* X (X) float32 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5
* Y (Y) float32 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5
Data variables:
basin (Z, Y, X) float32 ...
Attributes:
Conventions: IRIDL
xarray.Dataset- Z: 33
- X: 360
- Y: 180
- Z(Z)float320.0 10.0 20.0 ... 5e+03 5.5e+03
- gridtype :
- 0
- units :
- m
array([ 0., 10., 20., 30., 50., 75., 100., 125., 150., 200.,
250., 300., 400., 500., 600., 700., 800., 900., 1000., 1100.,
1200., 1300., 1400., 1500., 1750., 2000., 2500., 3000., 3500., 4000.,
4500., 5000., 5500.], dtype=float32)
- X(X)float320.5 1.5 2.5 ... 357.5 358.5 359.5
- standard_name :
- longitude
- pointwidth :
- 1.0
- gridtype :
- 1
- units :
- degree_east
array([ 0.5, 1.5, 2.5, ..., 357.5, 358.5, 359.5], dtype=float32)
- Y(Y)float32-89.5 -88.5 -87.5 ... 88.5 89.5
- standard_name :
- latitude
- pointwidth :
- 1.0
- gridtype :
- 0
- units :
- degree_north
array([-89.5, -88.5, -87.5, -86.5, -85.5, -84.5, -83.5, -82.5, -81.5, -80.5,
-79.5, -78.5, -77.5, -76.5, -75.5, -74.5, -73.5, -72.5, -71.5, -70.5,
-69.5, -68.5, -67.5, -66.5, -65.5, -64.5, -63.5, -62.5, -61.5, -60.5,
-59.5, -58.5, -57.5, -56.5, -55.5, -54.5, -53.5, -52.5, -51.5, -50.5,
-49.5, -48.5, -47.5, -46.5, -45.5, -44.5, -43.5, -42.5, -41.5, -40.5,
-39.5, -38.5, -37.5, -36.5, -35.5, -34.5, -33.5, -32.5, -31.5, -30.5,
-29.5, -28.5, -27.5, -26.5, -25.5, -24.5, -23.5, -22.5, -21.5, -20.5,
-19.5, -18.5, -17.5, -16.5, -15.5, -14.5, -13.5, -12.5, -11.5, -10.5,
-9.5, -8.5, -7.5, -6.5, -5.5, -4.5, -3.5, -2.5, -1.5, -0.5,
0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5,
10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5, 19.5,
20.5, 21.5, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 29.5,
30.5, 31.5, 32.5, 33.5, 34.5, 35.5, 36.5, 37.5, 38.5, 39.5,
40.5, 41.5, 42.5, 43.5, 44.5, 45.5, 46.5, 47.5, 48.5, 49.5,
50.5, 51.5, 52.5, 53.5, 54.5, 55.5, 56.5, 57.5, 58.5, 59.5,
60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5,
70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.5, 77.5, 78.5, 79.5,
80.5, 81.5, 82.5, 83.5, 84.5, 85.5, 86.5, 87.5, 88.5, 89.5],
dtype=float32)
- basin(Z, Y, X)float32...
- long_name :
- basin code
- CLIST :
- Atlantic Ocean
Pacific Ocean
Indian Ocean
Mediterranean Sea
Baltic Sea
Black Sea
Red Sea
Persian Gulf
Hudson Bay
Southern Ocean
Arctic Ocean
Sea of Japan
Kara Sea
Sulu Sea
Baffin Bay
East Mediterranean
West Mediterranean
Sea of Okhotsk
Banda Sea
Caribbean Sea
Andaman Basin
North Caribbean
Gulf of Mexico
Beaufort Sea
South China Sea
Barents Sea
Celebes Sea
Aleutian Basin
Fiji Basin
North American Basin
West European Basin
Southeast Indian Basin
Coral Sea
East Indian Basin
Central Indian Basin
Southwest Atlantic Basin
Southeast Atlantic Basin
Southeast Pacific Basin
Guatemala Basin
East Caroline Basin
Marianas Basin
Philippine Sea
Arabian Sea
Chile Basin
Somali Basin
Mascarene Basin
Crozet Basin
Guinea Basin
Brazil Basin
Argentine Basin
Tasman Sea
Atlantic Indian Basin
Caspian Sea
Sulu Sea II
Venezuela Basin
Bay of Bengal
Java Sea
East Indian Atlantic Basin
- valid_min :
- 1
- valid_max :
- 58
- scale_min :
- 1
- units :
- ids
- scale_max :
- 58
[2138400 values with dtype=float32]
- Conventions :
- IRIDL
basin = basin.rename({'X': 'lon', 'Y': 'lat'})
basin
Dimensions: (Z: 33, lon: 360, lat: 180)
Coordinates:
* Z (Z) float32 0.0 10.0 20.0 30.0 50.0 ... 4e+03 4.5e+03 5e+03 5.5e+03
* lon (lon) float32 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5
* lat (lat) float32 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5
Data variables:
basin (Z, lat, lon) float32 ...
Attributes:
Conventions: IRIDL
xarray.Dataset- Z: 33
- lon: 360
- lat: 180
- Z(Z)float320.0 10.0 20.0 ... 5e+03 5.5e+03
- gridtype :
- 0
- units :
- m
array([ 0., 10., 20., 30., 50., 75., 100., 125., 150., 200.,
250., 300., 400., 500., 600., 700., 800., 900., 1000., 1100.,
1200., 1300., 1400., 1500., 1750., 2000., 2500., 3000., 3500., 4000.,
4500., 5000., 5500.], dtype=float32)
- lon(lon)float320.5 1.5 2.5 ... 357.5 358.5 359.5
- standard_name :
- longitude
- pointwidth :
- 1.0
- gridtype :
- 1
- units :
- degree_east
array([ 0.5, 1.5, 2.5, ..., 357.5, 358.5, 359.5], dtype=float32)
- lat(lat)float32-89.5 -88.5 -87.5 ... 88.5 89.5
- standard_name :
- latitude
- pointwidth :
- 1.0
- gridtype :
- 0
- units :
- degree_north
array([-89.5, -88.5, -87.5, -86.5, -85.5, -84.5, -83.5, -82.5, -81.5, -80.5,
-79.5, -78.5, -77.5, -76.5, -75.5, -74.5, -73.5, -72.5, -71.5, -70.5,
-69.5, -68.5, -67.5, -66.5, -65.5, -64.5, -63.5, -62.5, -61.5, -60.5,
-59.5, -58.5, -57.5, -56.5, -55.5, -54.5, -53.5, -52.5, -51.5, -50.5,
-49.5, -48.5, -47.5, -46.5, -45.5, -44.5, -43.5, -42.5, -41.5, -40.5,
-39.5, -38.5, -37.5, -36.5, -35.5, -34.5, -33.5, -32.5, -31.5, -30.5,
-29.5, -28.5, -27.5, -26.5, -25.5, -24.5, -23.5, -22.5, -21.5, -20.5,
-19.5, -18.5, -17.5, -16.5, -15.5, -14.5, -13.5, -12.5, -11.5, -10.5,
-9.5, -8.5, -7.5, -6.5, -5.5, -4.5, -3.5, -2.5, -1.5, -0.5,
0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5,
10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5, 19.5,
20.5, 21.5, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 29.5,
30.5, 31.5, 32.5, 33.5, 34.5, 35.5, 36.5, 37.5, 38.5, 39.5,
40.5, 41.5, 42.5, 43.5, 44.5, 45.5, 46.5, 47.5, 48.5, 49.5,
50.5, 51.5, 52.5, 53.5, 54.5, 55.5, 56.5, 57.5, 58.5, 59.5,
60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5,
70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.5, 77.5, 78.5, 79.5,
80.5, 81.5, 82.5, 83.5, 84.5, 85.5, 86.5, 87.5, 88.5, 89.5],
dtype=float32)
- basin(Z, lat, lon)float32...
- long_name :
- basin code
- CLIST :
- Atlantic Ocean
Pacific Ocean
Indian Ocean
Mediterranean Sea
Baltic Sea
Black Sea
Red Sea
Persian Gulf
Hudson Bay
Southern Ocean
Arctic Ocean
Sea of Japan
Kara Sea
Sulu Sea
Baffin Bay
East Mediterranean
West Mediterranean
Sea of Okhotsk
Banda Sea
Caribbean Sea
Andaman Basin
North Caribbean
Gulf of Mexico
Beaufort Sea
South China Sea
Barents Sea
Celebes Sea
Aleutian Basin
Fiji Basin
North American Basin
West European Basin
Southeast Indian Basin
Coral Sea
East Indian Basin
Central Indian Basin
Southwest Atlantic Basin
Southeast Atlantic Basin
Southeast Pacific Basin
Guatemala Basin
East Caroline Basin
Marianas Basin
Philippine Sea
Arabian Sea
Chile Basin
Somali Basin
Mascarene Basin
Crozet Basin
Guinea Basin
Brazil Basin
Argentine Basin
Tasman Sea
Atlantic Indian Basin
Caspian Sea
Sulu Sea II
Venezuela Basin
Bay of Bengal
Java Sea
East Indian Atlantic Basin
- valid_min :
- 1
- valid_max :
- 58
- scale_min :
- 1
- units :
- ids
- scale_max :
- 58
[2138400 values with dtype=float32]
- Conventions :
- IRIDL
basin_surf = basin.basin[0]
basin_surf
<xarray.DataArray 'basin' (lat: 180, lon: 360)>
array([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[11., 11., 11., ..., 11., 11., 11.],
[11., 11., 11., ..., 11., 11., 11.],
[11., 11., 11., ..., 11., 11., 11.]], dtype=float32)
Coordinates:
Z float32 0.0
* lon (lon) float32 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5
* lat (lat) float32 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5
Attributes:
long_name: basin code
CLIST: Atlantic Ocean\nPacific Ocean \nIndian Ocean\nMediterranean S...
valid_min: 1
valid_max: 58
scale_min: 1
units: ids
scale_max: 58
xarray.DataArray'basin'- lat: 180
- lon: 360
- nan nan nan nan nan nan nan nan ... 11.0 11.0 11.0 11.0 11.0 11.0 11.0
array([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[11., 11., 11., ..., 11., 11., 11.],
[11., 11., 11., ..., 11., 11., 11.],
[11., 11., 11., ..., 11., 11., 11.]], dtype=float32)
- Z()float320.0
- gridtype :
- 0
- units :
- m
array(0., dtype=float32)
- lon(lon)float320.5 1.5 2.5 ... 357.5 358.5 359.5
- standard_name :
- longitude
- pointwidth :
- 1.0
- gridtype :
- 1
- units :
- degree_east
array([ 0.5, 1.5, 2.5, ..., 357.5, 358.5, 359.5], dtype=float32)
- lat(lat)float32-89.5 -88.5 -87.5 ... 88.5 89.5
- standard_name :
- latitude
- pointwidth :
- 1.0
- gridtype :
- 0
- units :
- degree_north
array([-89.5, -88.5, -87.5, -86.5, -85.5, -84.5, -83.5, -82.5, -81.5, -80.5,
-79.5, -78.5, -77.5, -76.5, -75.5, -74.5, -73.5, -72.5, -71.5, -70.5,
-69.5, -68.5, -67.5, -66.5, -65.5, -64.5, -63.5, -62.5, -61.5, -60.5,
-59.5, -58.5, -57.5, -56.5, -55.5, -54.5, -53.5, -52.5, -51.5, -50.5,
-49.5, -48.5, -47.5, -46.5, -45.5, -44.5, -43.5, -42.5, -41.5, -40.5,
-39.5, -38.5, -37.5, -36.5, -35.5, -34.5, -33.5, -32.5, -31.5, -30.5,
-29.5, -28.5, -27.5, -26.5, -25.5, -24.5, -23.5, -22.5, -21.5, -20.5,
-19.5, -18.5, -17.5, -16.5, -15.5, -14.5, -13.5, -12.5, -11.5, -10.5,
-9.5, -8.5, -7.5, -6.5, -5.5, -4.5, -3.5, -2.5, -1.5, -0.5,
0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5,
10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5, 19.5,
20.5, 21.5, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 29.5,
30.5, 31.5, 32.5, 33.5, 34.5, 35.5, 36.5, 37.5, 38.5, 39.5,
40.5, 41.5, 42.5, 43.5, 44.5, 45.5, 46.5, 47.5, 48.5, 49.5,
50.5, 51.5, 52.5, 53.5, 54.5, 55.5, 56.5, 57.5, 58.5, 59.5,
60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5,
70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.5, 77.5, 78.5, 79.5,
80.5, 81.5, 82.5, 83.5, 84.5, 85.5, 86.5, 87.5, 88.5, 89.5],
dtype=float32)
- long_name :
- basin code
- CLIST :
- Atlantic Ocean
Pacific Ocean
Indian Ocean
Mediterranean Sea
Baltic Sea
Black Sea
Red Sea
Persian Gulf
Hudson Bay
Southern Ocean
Arctic Ocean
Sea of Japan
Kara Sea
Sulu Sea
Baffin Bay
East Mediterranean
West Mediterranean
Sea of Okhotsk
Banda Sea
Caribbean Sea
Andaman Basin
North Caribbean
Gulf of Mexico
Beaufort Sea
South China Sea
Barents Sea
Celebes Sea
Aleutian Basin
Fiji Basin
North American Basin
West European Basin
Southeast Indian Basin
Coral Sea
East Indian Basin
Central Indian Basin
Southwest Atlantic Basin
Southeast Atlantic Basin
Southeast Pacific Basin
Guatemala Basin
East Caroline Basin
Marianas Basin
Philippine Sea
Arabian Sea
Chile Basin
Somali Basin
Mascarene Basin
Crozet Basin
Guinea Basin
Brazil Basin
Argentine Basin
Tasman Sea
Atlantic Indian Basin
Caspian Sea
Sulu Sea II
Venezuela Basin
Bay of Bengal
Java Sea
East Indian Atlantic Basin
- valid_min :
- 1
- valid_max :
- 58
- scale_min :
- 1
- units :
- ids
- scale_max :
- 58
basin_surf.plot(vmax=10)
<xarray.DataArray 'sst' (time: 708, basin: 14)>
array([[-1.8 , -1.8 , 23.455315 , ..., -1.8 ,
3.3971915 , 24.182198 ],
[-1.8 , -1.8 , 23.722523 , ..., -1.8 ,
0.03573781, 24.59657 ],
[-1.8 , -1.8 , 24.601315 , ..., -1.8 ,
-0.26487017, 26.234186 ],
[ 0.6758132 , 6.504184 , 29.279463 , ..., 10.920228 ,
15.955025 , 29.41976 ],
[-0.7937442 , 3.0715032 , 27.608435 , ..., 5.4078875 ,
10.673693 , 27.7558 ],
[-1.8 , -0.06063586, 25.881481 , ..., 0.5253569 ,
7.267694 , 26.163145 ]], dtype=float32)
Coordinates:
* time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2018-12-01
Z float32 0.0
* basin (basin) float64 1.0 2.0 3.0 4.0 5.0 ... 10.0 11.0 12.0 53.0 56.0
Attributes:
long_name: Monthly Means of Sea Surface Temperature
units: degC
var_desc: Sea Surface Temperature
level_desc: Surface
statistic: Mean
dataset: NOAA Extended Reconstructed SST V5
parent_stat: Individual Values
actual_range: [-1.8 42.32636]
valid_range: [-1.8 45. ]
_ChunkSizes: [ 1 89 180]
xarray.DataArray'sst'- time: 708
- basin: 14
- -1.8 -1.8 23.46 13.2 1.804 7.341 ... 3.645 -1.8 0.5254 7.268 26.16
array([[-1.8 , -1.8 , 23.455315 , ..., -1.8 ,
3.3971915 , 24.182198 ],
[-1.8 , -1.8 , 23.722523 , ..., -1.8 ,
0.03573781, 24.59657 ],
[-1.8 , -1.8 , 24.601315 , ..., -1.8 ,
-0.26487017, 26.234186 ],
[ 0.6758132 , 6.504184 , 29.279463 , ..., 10.920228 ,
15.955025 , 29.41976 ],
[-0.7937442 , 3.0715032 , 27.608435 , ..., 5.4078875 ,
10.673693 , 27.7558 ],
[-1.8 , -0.06063586, 25.881481 , ..., 0.5253569 ,
7.267694 , 26.163145 ]], dtype=float32)
- time(time)datetime64[ns]1960-01-01 ... 2018-12-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
- Z()float320.0
- gridtype :
- 0
- units :
- m
array(0., dtype=float32)
- basin(basin)float641.0 2.0 3.0 4.0 ... 12.0 53.0 56.0
array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 53., 56.])
- long_name :
- Monthly Means of Sea Surface Temperature
- units :
- degC
- var_desc :
- Sea Surface Temperature
- level_desc :
- Surface
- statistic :
- Mean
- dataset :
- NOAA Extended Reconstructed SST V5
- parent_stat :
- Individual Values
- actual_range :
- [-1.8 42.32636]
- valid_range :
- [-1.8 45. ]
- _ChunkSizes :
- [ 1 89 180]
basin_mean_sst = ds.sst.groupby(basin_surf_interp).mean()
basin_mean_sst
<xarray.DataArray 'sst' (time: 708, basin: 14)>
array([[18.585493 , 20.757555 , 21.572067 , ..., 6.238062 , 6.889794 ,
26.49982 ],
[18.705065 , 20.81674 , 21.902279 , ..., 4.8877654, 5.44638 ,
26.577093 ],
[18.845842 , 20.865038 , 22.031416 , ..., 4.686406 , 5.5322194,
27.908558 ],
[19.84992 , 21.960493 , 20.389412 , ..., 17.571943 , 18.184528 ,
29.336565 ],
[19.424026 , 21.722925 , 21.061403 , ..., 13.461868 , 13.863244 ,
28.755905 ],
[19.265354 , 21.512274 , 21.814356 , ..., 9.417906 , 10.607256 ,
27.905243 ]], dtype=float32)
Coordinates:
* time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2018-12-01
Z float32 0.0
* basin (basin) float64 1.0 2.0 3.0 4.0 5.0 ... 10.0 11.0 12.0 53.0 56.0
xarray.DataArray'sst'- time: 708
- basin: 14
- 18.59 20.76 21.57 15.46 2.553 ... 1.762 -0.8986 9.418 10.61 27.91
array([[18.585493 , 20.757555 , 21.572067 , ..., 6.238062 , 6.889794 ,
26.49982 ],
[18.705065 , 20.81674 , 21.902279 , ..., 4.8877654, 5.44638 ,
26.577093 ],
[18.845842 , 20.865038 , 22.031416 , ..., 4.686406 , 5.5322194,
27.908558 ],
[19.84992 , 21.960493 , 20.389412 , ..., 17.571943 , 18.184528 ,
29.336565 ],
[19.424026 , 21.722925 , 21.061403 , ..., 13.461868 , 13.863244 ,
28.755905 ],
[19.265354 , 21.512274 , 21.814356 , ..., 9.417906 , 10.607256 ,
27.905243 ]], dtype=float32)
- time(time)datetime64[ns]1960-01-01 ... 2018-12-01
- long_name :
- Time
- delta_t :
- 0000-01-00 00:00:00
- avg_period :
- 0000-01-00 00:00:00
- prev_avg_period :
- 0000-00-07 00:00:00
- standard_name :
- time
- axis :
- T
- actual_range :
- [19723. 80992.]
- _ChunkSizes :
- 1
array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',
'1960-03-01T00:00:00.000000000', ..., '2018-10-01T00:00:00.000000000',
'2018-11-01T00:00:00.000000000', '2018-12-01T00:00:00.000000000'],
dtype='datetime64[ns]')
- Z()float320.0
- gridtype :
- 0
- units :
- m
array(0., dtype=float32)
- basin(basin)float641.0 2.0 3.0 4.0 ... 12.0 53.0 56.0
array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 53., 56.])
df = basin_mean_sst.mean('time').to_dataframe()
import pandas as pd
basin_names = basin_surf.attrs['CLIST'].split('\n')
basin_df = pd.Series(basin_names, index=np.arange(1, len(basin_names)+1))
basin_df
1 Atlantic Ocean
2 Pacific Ocean
3 Indian Ocean
4 Mediterranean Sea
5 Baltic Sea
6 Black Sea
7 Red Sea
8 Persian Gulf
9 Hudson Bay
10 Southern Ocean
11 Arctic Ocean
12 Sea of Japan
13 Kara Sea
14 Sulu Sea
15 Baffin Bay
16 East Mediterranean
17 West Mediterranean
18 Sea of Okhotsk
19 Banda Sea
20 Caribbean Sea
21 Andaman Basin
22 North Caribbean
23 Gulf of Mexico
24 Beaufort Sea
25 South China Sea
26 Barents Sea
27 Celebes Sea
28 Aleutian Basin
29 Fiji Basin
30 North American Basin
31 West European Basin
32 Southeast Indian Basin
33 Coral Sea
34 East Indian Basin
35 Central Indian Basin
36 Southwest Atlantic Basin
37 Southeast Atlantic Basin
38 Southeast Pacific Basin
39 Guatemala Basin
40 East Caroline Basin
41 Marianas Basin
42 Philippine Sea
43 Arabian Sea
44 Chile Basin
45 Somali Basin
46 Mascarene Basin
47 Crozet Basin
48 Guinea Basin
49 Brazil Basin
50 Argentine Basin
51 Tasman Sea
52 Atlantic Indian Basin
53 Caspian Sea
54 Sulu Sea II
55 Venezuela Basin
56 Bay of Bengal
57 Java Sea
58 East Indian Atlantic Basin
dtype: object