Subtract two xarrays while keeping all dimensions

The quick answer is that you're doing it right, but your dimensions are not aligned. xarray IS designed to subtract entire arrays, but the coordinate labels must be aligned exactly. You likely have a disagreement between elements of your plev coordinate, which you can check with xr.align:

xr.align(wind1_array, wind2_array, join='exact')

See the xarray docs on computation: automatic alignment for more information.

Detailed example

The biggest difference between xarray and numpy (assuming you're familiar with math using numpy) is that xarray relies on the coordinate labels along each dimension to align arrays before any broadcasting operations, not just the shape.

As an example, let's consider two very simple arrays - one counting up from 0 to 19 and the other a block of ones, both reshaped to (4, 5). Subtracting these from each other in numpy is straightforward because they're the same shape:

In [15]: arr1 = np.arange(20).reshape((4, 5))

In [16]: arr2 = np.ones(shape=(4, 5))

In [17]: arr1 - arr2
Out[17]:
array([[-1.,  0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12., 13.],
       [14., 15., 16., 17., 18.]])

The xarray equivalent is also straightforward, but we must introduce dimension names and coordinates. Let's assume your pressure levels are in increments of 10 hPa decreasing toward STP, and latitudes are also in increments of 10 from 20 to 60:

In [18]: pressures = np.array([71.325, 81.325, 91.325, 101.325])

In [19]: lats = np.array([20, 30, 40, 50, 60])

In [20]: da1 = xr.DataArray(arr1, dims=['plev', 'lat'], coords=[pressures, lats])

In [21]: da2 = xr.DataArray(arr2, dims=['plev', 'lat'], coords=[pressures, lats])

In [22]: da2
Out[22]:
<xarray.DataArray (plev: 4, lat: 5)>
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
Coordinates:
  * plev     (plev) float64 71.33 81.33 91.33 101.3
  * lat      (lat) int64 20 30 40 50 60

In [23]: da1
Out[23]:
<xarray.DataArray (plev: 4, lat: 5)>
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
Coordinates:
  * plev     (plev) float64 71.33 81.33 91.33 101.3
  * lat      (lat) int64 20 30 40 50 60

These arrays are aligned, so subtracting them is straightforward:

In [24]: da1 - da2
Out[24]:
<xarray.DataArray (plev: 4, lat: 5)>
array([[-1.,  0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12., 13.],
       [14., 15., 16., 17., 18.]])
Coordinates:
  * plev     (plev) float64 71.33 81.33 91.33 101.3
  * lat      (lat) int64 20 30 40 50 60

But because xarray relies on these coordinates being aligned exactly, relying on float coordinates can be tricky. If we introduce even a small error to the pressure level dimension, the arrays are not aligned and we see a result similar to yours:

In [25]: da2 = xr.DataArray(arr2, dims=['plev', 'lat'], coords=[pressures + 1e-8, lats])

In [26]: da1 - da2
Out[26]:
<xarray.DataArray (plev: 0, lat: 5)>
array([], shape=(0, 5), dtype=float64)
Coordinates:
  * plev     (plev) float64
  * lat      (lat) int64 20 30 40 50 60

This type of mis-alignment can happen for all sorts of reasons, including round-tripping data through storage, where changes in encoding can cause tiny numerical errors which show up as mis-aligned data.

You can check whether this is the problem with xr.align using the join='exact' argument:

In [27]: xr.align(da1, da2, join='exact')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-612460e52308> in <module>
----> 1 xr.align(da1, da2, join='exact')

~/miniconda3/envs/myenv/lib/python3.9/site-packages/xarray/core/alignment.py in align(join, copy, indexes, exclude, fill_value, *objects)
    320             ):
    321                 if join == "exact":
--> 322                     raise ValueError(f"indexes along dimension {dim!r} are not equal")
    323                 joiner = _get_joiner(join, type(matching_indexes[0]))
    324                 index = joiner(matching_indexes)

ValueError: indexes along dimension 'plev' are not equal

To get around this problem, you might try rounding your coordinates to a known tolerance of the coordinate:

In [32]: da2['plev'] = np.round(da2['plev'], 3)

In [33]: da1 - da2
Out[33]:
<xarray.DataArray (plev: 4, lat: 5)>
array([[-1.,  0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12., 13.],
       [14., 15., 16., 17., 18.]])
Coordinates:
  * plev     (plev) float64 71.33 81.33 91.33 101.3
  * lat      (lat) int64 20 30 40 50 60

Alternatively, you could set a positional/integer coordinate, with the actual pressure level as a non-indexing coordinate:

In [42]: da1
Out[42]:
<xarray.DataArray (plev_ind: 4, lat: 5)>
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
Coordinates:
    plev      (plev_ind) float64 71.33 81.33 91.33 101.3
  * lat       (lat) int64 20 30 40 50 60
  * plev_ind  (plev_ind) int64 71325 81325 91325 101325

In [43]: da2
Out[43]:
<xarray.DataArray (plev_ind: 4, lat: 5)>
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
Coordinates:
    plev      (plev_ind) float64 71.33 81.33 91.33 101.3
  * lat       (lat) int64 20 30 40 50 60
  * plev_ind  (plev_ind) int64 71325 81325 91325 101325

In [44]: da1 - da2
Out[44]:
<xarray.DataArray (plev_ind: 4, lat: 5)>
array([[-1.,  0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12., 13.],
       [14., 15., 16., 17., 18.]])
Coordinates:
  * lat       (lat) int64 20 30 40 50 60
  * plev_ind  (plev_ind) int64 71325 81325 91325 101325

Detailed example

Recommended topics

Hot tags