Using contextily to map OpenStreetMap & US Census bureau data #

In this notebook, we’ll discuss how to access open data sources, such as OpenStreetMap or the United States Census, and make static maps showing urban demographics and streets with basemaps provided by contextily. The three main packages covered in this notebook are explained below:

OSMNX #

OSMNX (styled osmnx, pronounced oh-ess-em-en-echs) is a widely used package for examining OpenStreetMap data from Python. A good overview of the core concepts & ideas comes from @gboeing, the lead author and maintainer of the package. Here, we’ll use it to extract the street network of Austin, TX.

Contextily #

Contextily (pronounced context-a-lee) is a Python package that works with online map tile servers to provide basemaps for matplotlib plots. A ton of information on the package is available at https://contextily.readthedocs.io/en/latest/.

Cenpy #

CenPy (pronounced sen-pie) is a Python package for interacting with the US Census Bureau’s data products, hosted at https://api.census.gov. The Census exposes a ton of data products for people to use. Cenpy itself provides two “levels” of access.

Census products #

Most users simply want to get into the census, retrieve data, and then map, plot, analyze, or model that data. For this, cenpy wraps the main “products” that users may want to access: the American Community Survey & the 2010 Decennial Census. These are designed to interface directly with the US Census Bureau’s data APIs, get both the geographies & data from the US Census, and return them to the user, ready to plot. We’ll cover this API here.

Building Blocks of cenpy.products #

For those interested, cenpy also has a lower-level interface designed to interact directly with US Census data products through their two constituent parts: the data product, from https://api.census.gov, and the geography product, from the US Census’s ESRI MapServer. This is intended for developers who want to build new products or to interface directly with the API as they wish. It is pretty straightforward to use, but requires a bit more technical knowledge to get working, so if you simply need US Census or ACS data, focus on the product API.

Using the Packages #

To use packages in Python, you must first import them. Below, we import four packages:

  • cenpy

  • osmnx

  • contextily

  • matplotlib.pyplot

    import cenpy
    import osmnx
    import contextily
    import matplotlib.pyplot as plt
    %matplotlib inline
    /opt/anaconda3/envs/analysis/lib/python3.8/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
      warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
    

    osmnx, contextily, and cenpy.products all use a place-oriented API. This means that users specify a place name, like Columbus, OH or Kansas City, MO-KS, or California, and the package parses this name and grabs the relevant data. osmnx uses the OpenStreetMap service, cenpy uses the US Census Bureau’s service, and contextily has its own distinctive set of providers, so they can sometimes disagree slightly, especially when considering older census products.

    Regardless, to grab the US census data using cenpy, you pass the place name and the columns of the Census product you wish to extract. Below, we’ll grab two columns from the American Community Survey: Total population (B02001_001E) and count of African American persons (B02001_003E). We’ll grab this from Austin, TX:

    aus_data = cenpy.products.ACS().from_place('Austin, TX',
                                               variables=['B02001_001E', 'B02001_003E'])
    /opt/anaconda3/envs/analysis/lib/python3.8/site-packages/cenpy/geoparser.py:214: UserWarning: Shape is invalid:
    Ring Self-intersection[-10884881.1468 3554135.7868]
      tell_user('Shape is invalid: \n{}'.format(vexplain))
    /opt/anaconda3/envs/analysis/lib/python3.8/site-packages/pyproj/crs/crs.py:55: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
      return _prepare_from_string(" ".join(pjargs))
    /opt/anaconda3/envs/analysis/lib/python3.8/site-packages/pyproj/crs/crs.py:55: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
      return _prepare_from_string(" ".join(pjargs))
    

    When this runs, cenpy does a few things:

    1. It asks the Census for all the relevant US Census tracts that fall within Austin, TX.
    2. It parses the shapes of the Census tracts to make sure they’re valid.
    3. It parses the data from the Census to ensure it’s valid.

    Above, you may see a warning that the Austin, TX shape is invalid! This is cenpy running validation on the data. This problem can be fixed, but does not immediately affect analyses.
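For a rough sense of the kind of request that sits underneath a call like from_place, the sketch below builds (but does not send) a query URL for the Census data API at https://api.census.gov. It is only an illustration, not cenpy's actual implementation: the dataset path (the 2017 ACS 5-year estimates), the helper name build_tract_query, and the choice of Travis County (state FIPS 48, county FIPS 453, which contains most of Austin) are all assumptions made for this example.

```python
from urllib.parse import urlencode

# Assumptions for illustration: the 2017 ACS 5-year dataset and Travis County
# (state FIPS 48, county FIPS 453), which contains most of Austin.
BASE_URL = "https://api.census.gov/data/2017/acs/acs5"


def build_tract_query(variables, state_fips, county_fips):
    """Build (but do not send) a Census data API URL for every tract in a county."""
    params = {
        "get": ",".join(["NAME", *variables]),
        "for": "tract:*",
        "in": f"state:{state_fips} county:{county_fips}",
    }
    # Keep ':', '*', and ',' readable; the space in "in" is encoded as '+'.
    return f"{BASE_URL}?{urlencode(params, safe=':*,')}"


url = build_tract_query(["B02001_001E", "B02001_003E"], "48", "453")
print(url)
```

A place-oriented call saves you from looking up these codes by hand, and it also fetches the matching tract geometries, which a plain data query like this one does not return.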

    Likewise, OSMNX has a place-oriented API. To grab the street network from Austin, we can run a similar query:

    aus_graph = osmnx.graph_from_place('Austin, TX')
    

    However, the two packages’ default representations are quite different. osmnx builds on the networkx package for its core representation (hence, osm for OpenStreetMap and nx for NetworkX):

    aus_graph
    

    In contrast, cenpy uses pandas (and, specifically, geopandas) to express the demographics and geography of US Census data. These packages provide dataframes, like spreadsheets, which can be used to analyze data in Python. Below, each row contains the shape of one US Census tract (the geometry used by default by cenpy), and the columns provide descriptive information about the tract.

    aus_data.head()
    

    Fortunately, you can convert the networkx objects that osmnx focuses on into pandas dataframes, so that cenpy and osmnx match in their representation. This makes it very easy to work with OSM data alongside census data.

    To convert the OSM data into a pandas dataframe, we must do two things.

    First, we need to use osmnx.graph_to_gdfs to convert the graph to GeoDataFrames, which are like a standard pandas.DataFrame, but with additional geographic information on the shape of each road. graph_to_gdfs actually produces two dataframes: one full of intersections (the graph’s nodes) and one full of roads (the graph’s edges), in that order. We’ll separate the two below:

    aus_nodes, aus_streets = osmnx.graph_to_gdfs(aus_graph)
    

    Now, the aus_streets dataframe looks like the aus_data dataframe, where each row is a street, and columns contain some information about the street:

    aus_streets.head()
    

    The last bit of data processing needed to make the two datasets fully comport with one another is to set their coordinate reference systems, so that they align and can be plotted over a web tile basemap. The US Census provides geographical data in the Web Mercator projection (likely because it serves many web mapping applications in the US Government), whereas the OpenStreetMap project serves data in raw latitude/longitude by default. For contextily, we’ll need everything in a Web Mercator projection.

    To convert data between coordinate reference systems, we can use the to_crs method of GeoDataFrames, which returns a copy of the dataframe in the new coordinate reference system. To convert one dataframe into the coordinate reference system of another, it’s often enough to pass the crs of the target dataframe to to_crs:

    aus_streets = aus_streets.to_crs(aus_data.crs)
    
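To make the reprojection less of a black box, here is a minimal sketch of the spherical Web Mercator (EPSG:3857) forward formula applied to a single longitude/latitude point. In practice to_crs delegates this work to pyproj for every vertex of every geometry; the function name lonlat_to_web_mercator and the sample coordinates for central Austin are assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by Web Mercator


def lonlat_to_web_mercator(lon, lat):
    """Project longitude/latitude in degrees to Web Mercator (EPSG:3857) metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# Approximate coordinates of central Austin, TX (an illustrative assumption).
x, y = lonlat_to_web_mercator(-97.7431, 30.2672)
print(round(x), round(y))
```

The result is in metres and has the same order of magnitude as the coordinates printed in the shape warning earlier, which is a quick way to confirm that a dataset is in Web Mercator rather than in degrees.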

    Now, the two dataframes have the same coordinate reference system:

    aus_data.crs
    - name: World - 85°S to 85°N
    - bounds: (-180.0, -85.06, 180.0, 85.06)
    Coordinate Operation:
    - name: Popular Visualisation Pseudo-Mercator
    - method: Popular Visualisation Pseudo Mercator
    Datum: World Geodetic System 1984
    - Ellipsoid: WGS 84
    - Prime Meridian: Greenwich
    
    aus_streets.crs
    - name: World - 85°S to 85°N
    - bounds: (-180.0, -85.06, 180.0, 85.06)
    Coordinate Operation:
    - name: Popular Visualisation Pseudo-Mercator
    - method: Popular Visualisation Pseudo Mercator
    Datum: World Geodetic System 1984
    - Ellipsoid: WGS 84
    - Prime Meridian: Greenwich
    

    Now, we can make maps using the data, or can conduct analyses using the streets & demographics of Austin, TX. Using contextily.add_basemap, we can also ensure that a nice basemap is added:

    f,ax = plt.subplots(1,1, figsize=(15,15))
    aus_streets.plot(linewidth=.25, ax=ax, color='k')
    aus_data.eval('pct_afam = B02001_003E / B02001_001E')\
            .plot('pct_afam', cmap='plasma', alpha=.7, ax=ax, linewidth=.25, edgecolor='k')
    contextily.add_basemap(ax=ax, source=contextily.providers.CartoDB.Positron)
    #ax.axis(aus_streets.total_bounds[[0,2,1,3]])
    ax.set_title('Austin, TX\nAfrican American %')
    #ax.set_facecolor('k')
    
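To give a sense of what a basemap request involves, the sketch below computes which map tile contains a given point under the common XYZ ("slippy map") tiling scheme used by most web tile providers. It is a standalone illustration of that scheme, not contextily's own code; the function name lonlat_to_tile, the zoom level, and the sample coordinates are assumptions made for this example.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


# Approximate coordinates of central Austin, TX (an illustrative assumption).
print(lonlat_to_tile(-97.7431, 30.2672, zoom=12))
```

Each extra zoom level doubles the number of tiles along each axis, so a city-wide map at a high zoom level can require a large number of tile downloads.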

    This means that urban data science in Python has never been easier! So much data is at your fingertips, just a from_place call away. All three packages can be installed from conda-forge, the community-driven package repository for Anaconda, the scientific Python distribution. Check out other examples of using cenpy (https://cenpy-devs.github.io/cenpy), contextily (https://contextily.readthedocs.io/en/stable), and osmnx (https://osmnx.readthedocs.io/en/stable/) on their respective websites. And, most importantly, happy hacking!