8. Advanced spatial R and mapmaking: I

From 1,000 point-clicks to 1 script…

Learning objectives


The ability to work in one place or with one program from start to finish is powerful and more efficient than splitting your workflow across multiple tools. By sticking with one single framework or set of tools, we can reduce the mental workload necessary when switch between programs, staying organized in each, and dealing with import/export across multiple programs. Although different tools such as ESRI (or ArcPy extensions) are powerful, they require a paid license and typically use point-click user interfaces.

The advantage R has over these tools is that it is freely available, easily integrates with vast statistical/modeling toolboxes, has access to many spatial analysis and mapmaking tools, and allows us to work in a single place.

If we use a functional programming approach (described in the iteration module ) for spatial problems, R can be a very robust and powerful tool for analysis and spatial visualization of data! Furthermore, once analyses have been completed, we can re-use the scripts and functions for common spatial tasks (like making maps or exporting specific spatial files).

Common Geospatial Tasks

Common tasks in a GUI-based approach will always require the same number of point and clicks. With a script-based approach, it’s much easier to recycle previously written code, or to just change a variable and re-run the code. This efficiency is magnified immensely when it can be automated or iterated over the same task through time, or multiple data sets.

For example, some common tasks may include:

Script-based analyses with {sf}

The {sf} package truly makes working with vector-based spatial data easy. We can use a pipeline that includes:

There are many more options that can be added or subtracted from these pieces, but at the core, we can use this very functional approach to provide data, make maps, conduct analysis, and so much more.

A Groundwater/Surfacewater Hydrology Example

Let’s use an example where we read in some groundwater station data, spatially find the closest surface water stations, download some river data, and visualize!

The Packages

First we’ll need a few packages we’ve not used yet. Please install these with install.packages() if you don’t have them.

library(tidyverse) # data wrangling & viz
library(purrr) # iteration
library(janitor) # name cleaning
library(glue) # pasting stuff together

library(sf) # analysis tools
library(mapview)  # interactive maps!
mapviewOptions(fgb = FALSE) # to save interactive maps

library(tigris) # county & state boundaries
library(units) # for convert spatial and other units
library(dataRetrieval) # download USGS river data
library(tmap) # mapping
library(tmaptools) # mapping tools

Importing Spatial Data

We’ll leverage the ability to pull in many different data and stitch them all together through joins (spatial or common attributes). Each data component may be comprised of one or more “layers”, which ultimately we can use on a map.

Get State & County Data

First we need state and county boundaries. The {tigris} package is excellent for this.

# get {sf} CA boundary
ca <- states(cb=TRUE, progress_bar = FALSE) %>% 
  dplyr::filter(STUSPS == "CA")

# get counties for CA as {sf}
ca_co <- counties("CA", cb = TRUE, progress_bar = FALSE)

Let’s also pull in a shapefile that’s just Sacramento County. We’ll use this to crop/trim things down as we move forward. Note, we’ll need to check the coordinate reference system and projection here, and make sure we are matching our spatial data.

# get just sacramento: here we read in a shapefile:
sac_co <- st_read("data/shp/sac/sac_county.shp")
Reading layer `sac_county' from data source 
  using driver `ESRI Shapefile'
Simple feature collection with 1 feature and 9 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -13565710 ymin: 4582007 xmax: -13472670 ymax: 4683976
Projected CRS: WGS 84 / Pseudo-Mercator
# check CRS
[1] 3857
# match with other data
sac_co <- st_transform(sac_co, st_crs(ca_co))
st_crs(sac_co) == st_crs(ca_co) # should be TRUE!
[1] TRUE
# make a box around sacramento county
# (a grid with an n=1) for inset
sac_box <- st_make_grid(sac_co, n = 1)

And let’s quickly visualize these pieces! We’ll use the base plot() functions here.

# make sure we have all the pieces with a quick test plot
plot(ca$geometry, col = alpha("gray", 0.5), border = "black", lwd=2)
plot(ca_co$geometry, add = T, border = "purple", col = NA)
plot(sac_co$geometry, add = T, border = "cyan4", col = "skyblue",alpha=0.4, lwd = 2)
plot(sac_box, add = T, border = "orange", col = NA, lwd = 1.4)

Iterate: Get Groundwater Stations

Let’s practice our iteration skills. We’ll read in groundwater stations for 3 counties (El Dorado, Placer, and Sacramento), convert to {sf} objects, plot them, and then crop/select a subset of stations using spatial intersection.

Iteration…remember {purrr}? Let’s use it here!

# read the stations
gw_files <- list.files(path = "data/gwl/county",
                       full.names = TRUE, pattern = "*.csv")

# read all files into dataframes and combine with purrr
gw_df <- map_df(gw_files, ~read.csv(.x))

# the readr package will also do this same thing by default
# when passed a list of files with the same data types
gw_df <- read_csv(gw_files)

# now make "spatial" as sf objects
gw_df <- st_as_sf(gw_df, coords=c("LONGITUDE","LATITUDE"), 
                  crs=4326, remove=FALSE) %>% 
  # and transform!
  st_transform(., st_crs(ca_co))

# preview!
mapview(gw_df, zcol="COUNTY_NAME", layer.name="GW Stations") +
  mapview(sac_co, legend=FALSE)