dask.array.gufunc.apply_gufunc(func, signature, *args, axes=None, axis=None, keepdims=False, output_dtypes=None, output_sizes=None, vectorize=None, allow_rechunk=False, meta=None, **kwargs)
Apply a generalized ufunc or similar Python function to arrays.
signature determines if the function consumes or produces core dimensions. The remaining dimensions in the given input arrays (*args) are considered loop dimensions and are required to broadcast naturally against each other.
In other words, this function is like np.vectorize, but for the blocks of dask arrays. If func itself also needs to be vectorized, pass vectorize=True for convenience.
Parameters
func: callable
Function to call like func(*args, **kwargs) on input arrays (*args) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions [1] (if this is not the case, set vectorize=True). If this function returns multiple outputs, signature has to list the core dimensions of each output, as in "(i)->(),()".
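As a minimal sketch of the broadcasting requirement on func (the array shapes and the helper name outer_product are illustrative assumptions), the function below uses an ellipsis in np.einsum so that it broadcasts over loop dimensions on its own, which means vectorize=True is not needed:

```python
import numpy as np
import dask.array as da


def outer_product(x, y):
    # "..." lets einsum broadcast over any leading (loop) dimensions,
    # so this hypothetical helper already behaves like a gufunc.
    return np.einsum("...i,...j->...ij", x, y)


# Loop dimensions (4,) and (2, 1) broadcast to (2, 4);
# the core dimensions i=3 and j=5 each sit in a single chunk.
a = da.ones((4, 3), chunks=(2, 3))
b = da.ones((2, 1, 5), chunks=(1, 1, 5))

c = da.apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, output_dtypes=float)
result = c.compute()
print(result.shape)
```

The output carries the broadcast loop dimensions first, followed by the core dimensions named on the right-hand side of the signature.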
signature: string
Specifies what core dimensions are consumed and produced by func, according to the specification of the NumPy generalized ufunc signature [2].
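To illustrate how a signature reads, here is a small sketch (the helper demean and the input values are assumptions made for this example) in which the core dimension i is both consumed and produced, so the output keeps the input shape:

```python
import numpy as np
import dask.array as da


def demean(x):
    # Hypothetical helper: subtract the mean taken along the
    # core dimension, which apply_gufunc places on the last axis.
    return x - x.mean(axis=-1, keepdims=True)


x = da.from_array(np.arange(12.0).reshape(3, 4), chunks=(1, 4))

# "(i)->(i)": one input with core dimension i, one output with core dimension i.
centered = da.apply_gufunc(demean, "(i)->(i)", x, output_dtypes=float)
result = centered.compute()
print(result)
```

A signature such as "(i)->()" would instead consume i and return one scalar per loop element.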
*args: numeric
Input arrays or scalars to the callable function.
axes: list of tuples, optional, keyword only
A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.
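The following sketch shows one way the axes keyword might be used when the matrices are stored in the first two axes instead of the last two. The shapes are assumptions chosen for illustration, and np.matmul stands in for any function that multiplies matrices held in its last two axes:

```python
import numpy as np
import dask.array as da

# Matrices live in axes 0 and 1; the last axis (length 4) is the loop dimension.
x = da.ones((3, 5, 4), chunks=(3, 5, 2))
y = da.ones((5, 2, 4), chunks=(5, 2, 2))

z = da.apply_gufunc(
    np.matmul,
    "(i,j),(j,k)->(i,k)",
    x,
    y,
    axes=[(0, 1), (0, 1), (0, 1)],  # core axes of x, of y, and of the output
    output_dtypes=float,
)
result = z.compute()
print(result.shape)
```

The last tuple in axes places the output core dimensions back in axes 0 and 1, so the loop dimension stays last in the result.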
axis: int, optional, keyword only
A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].
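A short sketch of the axis shortcut, assuming a hypothetical inner-product helper and small illustrative arrays whose shared core dimension is stored along axis 0:

```python
import numpy as np
import dask.array as da


def inner(x, y):
    # Hypothetical helper: inner product along the core dimension,
    # which apply_gufunc moves to the last axis before calling func.
    return np.sum(x * y, axis=-1)


# The core dimension (length 6) is axis 0 and sits in a single chunk.
a = da.ones((6, 4), chunks=(6, 2))
b = da.full((6, 4), 2.0, chunks=(6, 2))

r = da.apply_gufunc(inner, "(i),(i)->()", a, b, axis=0, output_dtypes=float)
result = r.compute()
print(result)
```

Here axis=0 plays the role of axes=[(0,), (0,), ()] for this signature.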
keepdims: bool, optional, keyword only
If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.
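As an illustration of why keepdims can be convenient, this sketch (the helper norm and the array contents are assumptions) keeps the reduced axis as a length-one dimension so the result divides the input directly:

```python
import numpy as np
import dask.array as da


def norm(x):
    # Hypothetical helper: Euclidean norm along the core dimension.
    return np.sqrt(np.sum(x * x, axis=-1))


a = da.ones((4, 9), chunks=(2, 9))

# With keepdims=True the reduced core dimension stays as a size-one axis.
n = da.apply_gufunc(norm, "(i)->()", a, keepdims=True, output_dtypes=float)

# The (4, 1) result broadcasts against the (4, 9) input.
unit = (a / n).compute()
print(n.shape, unit.shape)
```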
output_dtypes: dtype or list of dtypes, optional, keyword only
Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.
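A minimal sketch of declaring the output dtype up front, which avoids the trial call of func described above. The helper count_positive and the sample data are assumptions for illustration:

```python
import numpy as np
import dask.array as da


def count_positive(x):
    # Hypothetical helper: number of positive entries along the core dimension.
    return np.sum(x > 0, axis=-1).astype(np.int64)


data = np.array([[1.0, -2.0, 3.0], [-1.0, -1.0, 5.0]])
a = da.from_array(data, chunks=(1, 3))

# The declared dtype is used for the resulting dask array.
counts = da.apply_gufunc(count_positive, "(i)->()", a, output_dtypes=np.int64)
result = counts.compute()
print(counts.dtype, result)
```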
output_sizes: dict, optional, keyword only
Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.
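The sketch below shows a case where output_sizes is needed: the output has a core dimension k that no input carries, so its length cannot be read from the inputs. The helper min_max and the data are illustrative assumptions:

```python
import numpy as np
import dask.array as da


def min_max(x):
    # Hypothetical helper: reduce the core dimension i to a new
    # core dimension k of length 2 holding (minimum, maximum).
    return np.stack([x.min(axis=-1), x.max(axis=-1)], axis=-1)


a = da.from_array(np.arange(12.0).reshape(3, 4), chunks=(1, 4))

r = da.apply_gufunc(
    min_max,
    "(i)->(k)",
    a,
    output_dtypes=float,
    output_sizes={"k": 2},  # size of the new core dimension k
)
result = r.compute()
print(result)
```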
vectorize: bool, keyword only
If set to True, np.vectorize is applied to func for convenience. Defaults to False.
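A small sketch of when vectorize=True helps, assuming a hypothetical helper span that is only correct for a single 1-D vector and therefore cannot be applied directly to a block containing several rows:

```python
import numpy as np
import dask.array as da


def span(v):
    # Hypothetical helper written for one 1-D vector only; applied to a
    # whole 2-D block it would return a single global value.
    return float(v.max() - v.min())


data = np.array([[1.0, 5.0, 2.0], [0.0, 0.0, 3.0]])
a = da.from_array(data, chunks=(2, 3))  # one block holding both rows

# vectorize=True wraps span with np.vectorize using the same signature,
# so it is called once per row inside each block.
r = da.apply_gufunc(span, "(i)->()", a, vectorize=True, output_dtypes=float)
result = r.compute()
print(result)
```

Since np.vectorize loops in Python, a function that broadcasts natively is usually the faster choice when one is available.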
allow_rechunk: bool, optional, keyword only
Allows rechunking; otherwise chunk sizes need to match and each core dimension must consist of a single chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.
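The following sketch contrasts two ways of handling a core dimension that is split across chunks: passing allow_rechunk=True, or rechunking explicitly before the call. The helper row_sum and the chunk layout are assumptions for illustration:

```python
import numpy as np
import dask.array as da


def row_sum(x):
    # Hypothetical helper: sum along the core dimension.
    return x.sum(axis=-1)


# The core dimension (length 10) is split into two chunks of 5.
a = da.ones((4, 10), chunks=(2, 5))

# Option 1: let apply_gufunc cope with the split core dimension.
s1 = da.apply_gufunc(row_sum, "(i)->()", a, output_dtypes=float, allow_rechunk=True)

# Option 2: rechunk explicitly so the core dimension is a single chunk.
a_single = a.rechunk({1: -1})
s2 = da.apply_gufunc(row_sum, "(i)->()", a_single, output_dtypes=float)

r1, r2 = s1.compute(), s2.compute()
print(r1, r2)
```

Rechunking explicitly keeps the memory cost visible at the call site, which may be preferable for large core dimensions.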
meta: tuple, optional, keyword only
Tuple of empty ndarrays describing the shape and dtype of the output of the gufunc. Defaults to None.
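As a rough sketch of the meta keyword, the example below passes one empty NumPy array per output to declare the output dtypes in place of output_dtypes. The helper stats and the chosen dtypes are assumptions, and the exact form of meta that is accepted may vary between dask versions:

```python
import numpy as np
import dask.array as da


def stats(x):
    # Hypothetical helper with two outputs of different dtypes.
    return np.mean(x, axis=-1), np.std(x, axis=-1, dtype=np.float32)


a = da.ones((4, 6), chunks=(2, 6))

# One empty array per output; only type and dtype information is used.
meta = (np.ones(0, dtype=np.float64), np.ones(0, dtype=np.float32))

mean, std = da.apply_gufunc(stats, "(i)->(),()", a, meta=meta)
print(mean.dtype, std.dtype)
```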
**kwargs: dict
Extra keyword arguments to pass to func.
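A brief sketch of forwarding an extra keyword argument to func. The helper scaled_sum and its scale parameter are hypothetical names introduced for this example:

```python
import numpy as np
import dask.array as da


def scaled_sum(x, scale=1.0):
    # Hypothetical helper: scale is not known to apply_gufunc
    # and is simply passed through to every call of func.
    return scale * np.sum(x, axis=-1)


a = da.ones((4, 3), chunks=(2, 3))

r = da.apply_gufunc(scaled_sum, "(i)->()", a, output_dtypes=float, scale=2.0)
result = r.compute()
print(result)
```

Keyword names that apply_gufunc defines itself, such as axis or keepdims, are consumed by apply_gufunc and do not reach func this way.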
Returns
Single dask.array.Array or tuple of dask.array.Array
References
[1] https://docs.scipy.org/doc/numpy/reference/ufuncs.html
[2] https://docs.scipy.org/doc/numpy/reference/c-api/generalized-ufuncs.html
Examples
>>> import dask.array as da
>>> import numpy as np
>>> # Two scalar outputs per vector along the core dimension i
>>> def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> a = da.random.normal(size=(10, 20, 30), chunks=(5, 10, 30))
>>> mean, std = da.apply_gufunc(stats, "(i)->(),()", a)
>>> mean.compute().shape
(10, 20)
>>> # outer_product only handles 1-D inputs, so vectorize=True is needed
>>> def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> a = da.random.normal(size=(20, 30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1, 40), chunks=(5, 1, 40))
>>> c = da.apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, vectorize=True)
>>> c.compute().shape
(10, 20, 30, 40)