Accessing NDVI data on Tribal Subdivisions

Download data for the vegetation coding challenge

id = 'stars'
site_name = 'Gila River Indian Community'
data_dir = 'gila-river'

STEP 1: Site map

We’ll need some Python libraries to complete this workflow.

Try It: Import necessary libraries

In the cell below, making sure to keep the packages in order, add packages for:

  • Working with DataFrames
  • Working with GeoDataFrames
  • Making interactive plots of tabular and vector data

Reflect and Respond

What are we using the rest of these packages for? See if you can figure it out as you complete the notebook.

import json
import os
import pathlib
from glob import glob

import earthpy.api.appeears as eaapp
import earthpy
import hvplot.xarray
import rioxarray as rxr
import xarray as xr

See our solution!

import json
import os
import pathlib
from glob import glob

import earthpy.api.appeears as eaapp
import earthpy
import geopandas as gpd
import hvplot.pandas
import hvplot.xarray
import pandas as pd
import rioxarray as rxr
import xarray as xr

We have one more setup task. We’re not going to be able to load all our data directly from the web to Python this time. That means we need to set up a place for it.

GOTCHA ALERT!

A lot of times in Python we say “directory” to mean a “folder” on your computer. The two words mean the same thing in this context.

Try It
  1. Check that the project directory name is descriptive. The cell below takes it from data_dir, which was set to 'gila-river' at the top of this notebook.
  2. Run the cell to display your project directory.
  3. Can you find the directory, either in a terminal or through your operating system’s file browser/explorer/finder?

# Create a project directory in the system data folder
project = earthpy.project.Project(project_dirname=data_dir)
project.project_dir

See our solution!

# Create a project directory in the system data folder
project = earthpy.project.Project(project_dirname=data_dir)
project.project_dir
PosixPath('/home/runner/.local/share/earth-analytics/gila-river')
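
To make the idea of a project directory concrete, here is a minimal sketch using only the standard library. It is not how earthpy works internally; the helper name make_project_dir is hypothetical, and a temporary folder stands in for the system data folder so that nothing permanent is created.

```python
import pathlib
import tempfile


def make_project_dir(base_dir, dirname):
    """Create base_dir/dirname if it does not exist and return its path."""
    project_dir = pathlib.Path(base_dir) / dirname
    # parents=True creates missing parent folders;
    # exist_ok=True makes the call safe to repeat
    project_dir.mkdir(parents=True, exist_ok=True)
    return project_dir


# A temporary folder stands in for the system data folder (assumption)
base_dir = tempfile.mkdtemp()
demo_dir = make_project_dir(base_dir, 'gila-river')

print(demo_dir)
print(demo_dir.is_dir())
```

Running the cell twice does not raise an error, which is handy in a notebook where cells are often re-run.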

Study Area: Gila River Indian Community

Earth Data Science data formats

In Earth Data Science, we get data in three main formats:

Data type | Description | Common file formats | Python type
Time Series | The same data points (e.g. streamflow) collected multiple times over time | Tabular formats (e.g. .csv or .xlsx) | pandas DataFrame
Vector | Points, lines, and areas (with coordinates) | Shapefile (often an archive like a .zip file, because a Shapefile is actually a collection of at least 3 files) | geopandas GeoDataFrame
Raster | Evenly spaced spatial grid (with coordinates) | GeoTIFF (.tif), NetCDF (.nc), HDF (.hdf) | xarray DataArray (read with rioxarray)

Read More

Check out the sections about vector data and raster data in the textbook.
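
As a rough illustration of how file formats map onto Python tools, the sketch below looks up a likely reader function from a file extension. The lookup table and the suggest_reader helper are hypothetical and simplified: a .zip archive, for example, is only a vector file if it actually contains a Shapefile, and HDF support depends on how your raster libraries were installed.

```python
import pathlib

# Simplified, hypothetical lookup: file extension -> a reader that often works
READERS = {
    '.csv': 'pandas.read_csv',
    '.xlsx': 'pandas.read_excel',
    '.zip': 'geopandas.read_file',  # assumes a zipped Shapefile
    '.shp': 'geopandas.read_file',
    '.tif': 'rioxarray.open_rasterio',
    '.nc': 'xarray.open_dataset',
    '.hdf': 'rioxarray.open_rasterio',
}


def suggest_reader(path):
    """Return the name of a likely reader function for a file path."""
    extension = pathlib.Path(path).suffix.lower()
    return READERS.get(extension, 'unknown format')


print(suggest_reader('tl_2020_us_aitsn.zip'))
print(suggest_reader('streamflow.csv'))
```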

Reflect and Respond

For this coding challenge, we are interested in the boundary of the Gila River Indian Community. In the cell below, answer the following question: What data type do you think the boundary will be?

Try It
  • Search data.gov, the US open data website, for “American Indian Tribal Subdivisions”.
  • Find the tl_2020_us_aitsn.zip file. Click on Download. Put the downloaded files into your data folder for this project.
  • Modify the code below to use descriptive variable names and to point at the path where you put your data. Feel free to refer back to previous challenges for similar code!

Take a look at your sample code for this task below. You might notice that there is some extra code:

# Create a session with a valid User-Agent header
fs = fsspec.filesystem("http", headers={"User-Agent": "Mozilla/5.0"})

# Open the file using fsspec and pass to geopandas
with fs.open(url) as f:
    gdf = gpd.read_file(f)

We need this code because the data.gov site tries to block bots and scripts from downloading its data. What we’re doing won’t cause any harm, but computers can be programmed to hit URLs repeatedly until they overload an online data service. This is sometimes called a denial-of-service attack. The two extra lines of code send a User-Agent header that makes your request look like it comes from a web browser, so the data download normally.
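
To see what the header does without downloading anything, here is a small sketch that uses the standard library's urllib in place of fsspec. The URL is a made-up placeholder, and no request is actually sent; the code only builds two request objects and inspects their headers.

```python
import urllib.request

# Hypothetical URL, used only for illustration; no request is sent
example_url = 'https://example.com/data/boundaries.zip'

# A plain request: no User-Agent is set yet, so urllib would add its
# own "Python-urllib/<version>" value when the request is sent
default_request = urllib.request.Request(example_url)

# The same request with a browser-style User-Agent header
browser_request = urllib.request.Request(
    example_url, headers={'User-Agent': 'Mozilla/5.0'})

# urllib stores header names with only the first letter capitalized
print(default_request.get_header('User-agent'))
print(browser_request.get_header('User-agent'))
```

A server that filters on this header sees the second request as coming from a browser, which is the same effect the fsspec headers argument has in the sample code.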

# Download the site boundary
import fsspec  # needed for fsspec.filesystem(); not in the imports above

# Create a session with a valid User-Agent header
fs = fsspec.filesystem("http", headers={"User-Agent": "Mozilla/5.0"})

# Open the file using fsspec and pass to geopandas
# (url is not defined yet: set it to the location of tl_2020_us_aitsn.zip)
with fs.open(url) as f:
    gdf = gpd.read_file(f)

# Check that the data were downloaded correctly

Try It
  • Select all the subdivisions of the Gila River Indian Community from the resulting GeoDataFrame.
  • Spatially merge (dissolve) the results to get a single boundary for the Gila River Indian Community.
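
The two steps above can be sketched on toy data. Everything here is made up for illustration: the squares stand in for subdivisions, and the column names community and subdivision are hypothetical, so check the real column names in your GeoDataFrame before adapting this.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy data: three made-up square "subdivisions" in two communities
toy_gdf = gpd.GeoDataFrame(
    {'community': ['A', 'A', 'B'],
     'subdivision': ['District 1', 'District 2', 'District 1']},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 5, 6, 6)],
    crs='EPSG:4326')

# Step 1: select the rows that belong to one community
community_a_gdf = toy_gdf[toy_gdf['community'] == 'A']

# Step 2: spatially merge the selected rows into a single boundary
boundary_gdf = community_a_gdf.dissolve(by='community')

print(len(boundary_gdf))
print(boundary_gdf.geometry.iloc[0].bounds)
```

The two adjacent squares become one polygon, so the result has a single row whose geometry covers both subdivisions.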

Site Map

We usually want to create a site map as a first step when working with geospatial data. This helps us see that we’re looking at the right location, and learn something about the context of the analysis.

Try It
  1. Plot your Gila River Indian Community shapefile on an interactive map
  2. Make sure to add a title
  3. Add ESRI World Imagery as the basemap/background using the tiles=... parameter. To get the tiles to match up with your data, you will also need to add the geo=True parameter.

# Plot the site boundary

See our solution!

# Plot the site boundary
gdf.hvplot(
    geo=True,
    title=site_name,
    tiles='EsriImagery')