conda install dask-geopandas -c conda-forge
pip install dask-geopandas
For more details, see the installation instructions.
Example
As with dask.dataframe
and pandas
, the API of dask_geopandas
mirrors the one of geopandas
.
import geopandas
import dask_geopandas
df = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
dask_df = dask_geopandas.from_geopandas(df, npartitions=4)
dask_df.geometry.area.compute()
When should I use Dask-GeoPandas?
Dask-GeoPandas is useful when dealing with large GeoDataFrames that either do not comfortably fit in memory or require expensive computation that can be easily parallelised. Note that using Dask-GeoPandas is not always faster than using GeoPandas as there is an unavoidable overhead in task scheduling and transfer of data between threads and processes, but in other cases, your performance gains can be almost linear with more threads.
Useful links
Source Repository (GitHub) | Issues & Ideas | Gitter (chat)