R4WRDS Introductory Course

Unleashing the power of reproducible workflows with “R for Water Resources Data Science” (R 4 WRDS).


Who is this course for?

This course is for people who work with data, from analysts and program staff to engineers and scientists. It introduces the power and possibility of a reproducible programming language (R) by demonstrating how to import, explore, visualize, analyze, and communicate different types of data. Using water resources examples, it guides participants through basic data science skills and strategies for continued learning and use of R.


Why R?

R is a language for statistical computing and a general purpose programming language. It is one of the primary languages used for data science, modeling, and visualization.

This workshop gives attendees a starting point for continued learning and use of R. We will cover a variety of file types commonly used in analysis (e.g., .csv, .xlsx, .shp) and provide resources for additional learning.
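
As a small taste of what importing data looks like, here is a minimal sketch that uses only base R: it writes a tiny table to a .csv file and reads it back in. The table, its column names (`site_id`, `depth_to_water_ft`), and its values are made up for illustration and are not part of the course data. Other formats such as .xlsx and .shp are usually read with add-on packages (for example {readxl} and {sf}), which are not shown here.

```r
# Minimal sketch: write a small table to a .csv file, then import it again.
# The site IDs and depth values below are made up for illustration only.
gwl_example <- data.frame(
  site_id = c("A01", "A02", "A03"),
  depth_to_water_ft = c(12.5, 30.1, 8.7)
)

# Use a temporary folder so the example does not touch your project files.
csv_path <- file.path(tempdir(), "gwl_example.csv")
write.csv(gwl_example, csv_path, row.names = FALSE)

# Import the file and take a first look at its structure and values.
gwl_imported <- read.csv(csv_path)
str(gwl_imported)
summary(gwl_imported$depth_to_water_ft)
```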


What will you learn?

In this course, we start from first principles and assume no prior experience with R. Although each module in this course can serve as a “stand-alone” lesson, we recommend completing modules in order from start to finish.

In this course you will gain practice in:

Figure 1: Artwork by @allison_horst

Course Modules

  1. Install R and RStudio
  2. Get oriented in RStudio
  3. Practice data and file management: understand RProjects and file paths
  4. Import and export various water resources data
  5. Visualize data with {ggplot2}
  6. Understand and identify different data structures (e.g., vectors, dataframes, lists)
  7. Transform data with {dplyr}
  8. Discuss spreadsheets and pivots
  9. Learn how to write custom functions
  10. Join different datasets together
  11. Use spatial data to create static and interactive maps
  12. Explore strategies for Exploratory Data Analysis (EDA)
  13. Practice data presentation and communication with {RMarkdown}
  14. Explore strategies for troubleshooting (reading documentation, intro to reprex)


Data

All data used in this course is expected to live in a /data subfolder in the project directory. You can get it in one of two ways:

  1. Download and unzip it from OSF
  2. Clone it from the r4wrds-data GitHub repository

Your project directory structure should look like this (note the position of the /data subfolder):

.
├── code
│   ├── module_01.R
│   ├── module_02.R
│   └── ...
├── data
│   ├── gwl.csv
│   ├── polygon.shp
│   └── ...
└── intro.Rproj

To complete code exercises and follow along in the course, you will create a /code subfolder to store .R scripts, which we recommend naming by module.

You will also need to create an intro.Rproj file, which is covered in the introductory project management module. Alternatively, on the command line, run touch intro.Rproj (macOS/Linux) or echo > intro.Rproj (Windows) in the root project directory.
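
The same project skeleton can also be set up from within R. The sketch below is one possible way to do it with base R functions only. `project_dir` is a hypothetical location (a temporary folder is used so the example does not touch your real files); point it at your own project directory instead.

```r
# Minimal sketch: build the course project skeleton from within R.
# `project_dir` is a hypothetical path chosen for this example.
project_dir <- file.path(tempdir(), "intro")

# Create the project root plus the /code and /data subfolders.
dir.create(file.path(project_dir, "code"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Create an empty intro.Rproj file in the project root (similar to `touch`).
rproj_path <- file.path(project_dir, "intro.Rproj")
if (!file.exists(rproj_path)) file.create(rproj_path)

# Create an empty script for the first module, named by module.
script_path <- file.path(project_dir, "code", "module_01.R")
if (!file.exists(script_path)) file.create(script_path)

# List what was created, relative to the project root.
created <- sort(list.files(project_dir, recursive = TRUE, include.dirs = TRUE))
print(created)
```

The course data still needs to be downloaded separately and placed in the /data subfolder.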


Workshop Overview

We will follow the SFS Code of Conduct throughout our workshop.


Source content

All source materials for this website can be accessed at the r4wrds GitHub repository.


Attribution

Content in these lessons has been modified and/or adapted from Data Carpentry: R for data analysis and visualization of Ecological Data, the USGS-R training curriculum, the NCEAS Open Science for Synthesis workshop, Mapping in R, and the wonderful text R for Data Science.


site last updated: 2021-07-29 10:10

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/r4wrds/r4wrds, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".