
Adding and reading local data files in R Markdown posts


1 Problem

Problem: You want to read in a data file in an R code chunk in an R Markdown post. But:

  • Where should you save the data file?
  • What file path will work to run the code chunks in the console?
  • What file path will work when you serve the site?

Solution: Read on.

This post shows you how to add local data files to your blogdown site, and which file paths to use to read those files in an R code chunk. (If you came here looking for how to add static images and use file paths, please see this post.)

2 TL;DR

Let’s say you have a data file called "mazes.csv", and you want to read in that CSV file in an R chunk. The table below summarizes where the file should live in your blogdown site directory, and the file paths to use.

| File location | File path in R chunk | File path using here package |
|---|---|---|
| /content/post/mazes.csv | "mazes.csv" | no need! |
| /static/mazes.csv | "../../static/mazes.csv" | here("static", "mazes.csv") |
| /static/data/mazes.csv | "../../static/data/mazes.csv" | here("static", "data", "mazes.csv") |
| raw GitHub url | "raw_url/mazes.csv" | no need! |

Each of these scenarios is covered in more detail below. We’ll use readr::read_csv(), so you’ll need to install and load the readr package:

# install.packages("readr")
library(readr)

2.1 Place file in your post/ folder

What to do:

  • File goes in /content/post/mazes.csv
  • R file path is "mazes.csv"

Example:

mazes <- read_csv("mazes.csv")
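
If the bare file name fails, the usual cause is that the working directory is not the folder that holds the post. Here is a minimal sketch of a check you could run first. The helper name check_data_path() is hypothetical, and the temporary folder only stands in for a real content/post/ directory:

```r
# Hypothetical helper: fail with a readable message when a relative path
# does not resolve from the current working directory.
check_data_path <- function(path) {
  if (!file.exists(path)) {
    stop("Cannot find '", path, "' from working directory: ", getwd())
  }
  invisible(path)
}

# A temporary folder stands in for content/post/ (assumption for the demo)
post_dir <- file.path(tempdir(), "content", "post")
dir.create(post_dir, recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(post_dir, "mazes.csv"), row.names = FALSE)

# With the working directory set to the post folder, the bare name resolves
old_wd <- setwd(post_dir)
check_data_path("mazes.csv")
setwd(old_wd)
```

The error message prints getwd(), which tells you where R was looking when the file was not found.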

2.2 Place file in your static/ folder

What to do:

  • File goes in /static/mazes.csv
  • R file path is "../../static/mazes.csv"
  • R file path with here is: here("static", "mazes.csv")

Example:

mazes <- read_csv("../../static/mazes.csv")
# or
library(here)
mazes <- read_csv(here("static", "mazes.csv"))

2.3 Place file in your static/data/ folder

What to do:

  • File goes in /static/data/mazes.csv
  • R file path is "../../static/data/mazes.csv"
  • R file path with here is: here("static", "data", "mazes.csv")

Example:

mazes <- read_csv("../../static/data/mazes.csv")
# or
library(here)
mazes <- read_csv(here("static", "data", "mazes.csv"))
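
To see why two ".." steps are needed, it can help to rebuild the folder layout in a scratch directory. The sketch below uses only base R and assumes the working directory during knitting is the post's own folder (content/post/). The folder name blogdown-demo and the three-row data frame are placeholders:

```r
# Build a throwaway copy of the relevant blogdown folders
site <- file.path(tempdir(), "blogdown-demo")
dir.create(file.path(site, "content", "post"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(site, "static", "data"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(x = 1:3),
          file.path(site, "static", "data", "mazes.csv"),
          row.names = FALSE)

# Pretend we are knitting a post: the working directory is content/post/
old_wd <- setwd(file.path(site, "content", "post"))

# Each ".." climbs one folder: content/post -> content -> site root
resolved <- normalizePath("../../static/data/mazes.csv")
mazes_demo <- read.csv("../../static/data/mazes.csv")

# Restore the original working directory
setwd(old_wd)
```

Printing resolved shows the absolute path that the relative path expanded to. If your posts live in a deeper folder, you would need one more ".." per extra level.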

2.4 Place "mazes.csv" online

What to do:

  • File goes online: options include a GitHub gist or pushing the file to your site’s repo
  • Use the raw GitHub or gist URL

Example:

mazes_gist <- "https://gist.githubusercontent.com/kylebgorman/77ce12c9167554ade560af9d34565c11/raw/c5d653fb146821ecd96a9aa085263c3f17480dd5/McFarlaneEtAl_MazeData-Deidentified.csv"
mazes <- read_csv(mazes_gist)
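
If the file lives in your site's repo rather than a gist, the raw URL for a public GitHub repo generally follows the pattern https://raw.githubusercontent.com/user/repo/branch/path. A small sketch of composing it is below. The helper raw_github_url() is hypothetical, and every value passed to it is a placeholder rather than a real repository:

```r
# Hypothetical helper: compose the raw URL for a file in a public GitHub repo
raw_github_url <- function(user, repo, branch, path) {
  paste("https://raw.githubusercontent.com", user, repo, branch, path, sep = "/")
}

# All four values are placeholders; swap in your own before reading the file
mazes_url <- raw_github_url("your-username", "your-site-repo", "main",
                            "static/data/mazes.csv")
mazes_url

# mazes <- read_csv(mazes_url)  # uncomment once the placeholders are real
```

You can also copy the raw URL straight from GitHub's "Raw" button on the file page, which avoids typing the pattern by hand.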

3 HERE

How did here work?

# use here to build the file path
library(here)
## here() starts at /Users/hillali/Documents/Projects/r-projects/blogdown-demo
# read in the file
mazes <- read_csv(here("static", "data", "mazes.csv"))
## Parsed with column specification:
## cols(
##   Study.ID = col_character(),
##   CA = col_double(),
##   VIQ = col_double(),
##   DX = col_character(),
##   Activity = col_character(),
##   Content = col_double(),
##   Filler = col_double(),
##   REP = col_double(),
##   REV = col_double(),
##   FS = col_double(),
##   Cued = col_double(),
##   Not.Cued = col_double()
## )
mazes
## # A tibble: 381 x 12
##    Study.ID    CA   VIQ DX    Activity    Content Filler   REP   REV    FS
##    <chr>    <dbl> <dbl> <chr> <chr>         <dbl>  <dbl> <dbl> <dbl> <dbl>
##  1 CSLU-001  5.67   124 TD    Conversati…   24.0   31.0   2.00  5.00 17.0 
##  2 CSLU-001  5.67   124 TD    Picture De…    1.00   2.00  0     0     1.00
##  3 CSLU-001  5.67   124 TD    Play          21.0    6.00  3.00  8.00 10.0 
##  4 CSLU-001  5.67   124 TD    Wordless P…    8.00   2.00  0     4.00  4.00
##  5 CSLU-002  6.50   124 TD    Conversati…    3.00  10.0   3.00  0     0   
##  6 CSLU-002  6.50   124 TD    Picture De…    5.00   3.00  2.00  1.00  2.00
##  7 CSLU-002  6.50   124 TD    Play           8.00   8.00  3.00  2.00  3.00
##  8 CSLU-002  6.50   124 TD    Wordless P…    2.00   2.00  0     0     2.00
##  9 CSLU-007  7.50   108 TD    Conversati…   25.0   21.0   4.00  4.00 17.0 
## 10 CSLU-007  7.50   108 TD    Picture De…   10.0   13.0   0     2.00  8.00
## # ... with 371 more rows, and 2 more variables: Cued <dbl>, Not.Cued <dbl>

Now that is some serious black magic. Let’s break down what the here package did.

# where are we?
here() 
## [1] "/Users/hillali/Documents/Projects/r-projects/blogdown-demo"
# if mazes.csv were in static, this is the file path
here("static", "mazes.csv")
## [1] "/Users/hillali/Documents/Projects/r-projects/blogdown-demo/static/mazes.csv"
# but it is not! it is in /static/data
here("static", "data", "mazes.csv")
## [1] "/Users/hillali/Documents/Projects/r-projects/blogdown-demo/static/data/mazes.csv"
# you can read in directly
mazes1 <- read_csv(here("static", "data", "mazes.csv"))
## Parsed with column specification:
## cols(
##   Study.ID = col_character(),
##   CA = col_double(),
##   VIQ = col_double(),
##   DX = col_character(),
##   Activity = col_character(),
##   Content = col_double(),
##   Filler = col_double(),
##   REP = col_double(),
##   REV = col_double(),
##   FS = col_double(),
##   Cued = col_double(),
##   Not.Cued = col_double()
## )
# you can save the file path 
mazes_file <- here("static", "data", "mazes.csv")

# then read that in
mazes2 <- read_csv(mazes_file)
## Parsed with column specification:
## cols(
##   Study.ID = col_character(),
##   CA = col_double(),
##   VIQ = col_double(),
##   DX = col_character(),
##   Activity = col_character(),
##   Content = col_double(),
##   Filler = col_double(),
##   REP = col_double(),
##   REV = col_double(),
##   FS = col_double(),
##   Cued = col_double(),
##   Not.Cued = col_double()
## )
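
How does here() know where the project starts? Roughly, it looks upward from the working directory for a marker of a project root, such as an .Rproj file, a .here file, or a .git folder. The base R sketch below is a simplified illustration of that idea, not the package's actual implementation. The function find_root() and the folder root-demo are made up for the demo:

```r
# Simplified illustration only; the here package's real logic is more thorough.
find_root <- function(start = getwd(), markers = c(".here", ".git")) {
  dir <- normalizePath(start)
  repeat {
    # A folder counts as the root if it holds a marker or any .Rproj file
    has_marker <- any(file.exists(file.path(dir, markers))) ||
      length(list.files(dir, pattern = "\\.Rproj$")) > 0
    if (has_marker) return(dir)
    parent <- dirname(dir)
    # Stop once we reach the top of the file system
    if (identical(parent, dir)) stop("No project root found above ", start)
    dir <- parent
  }
}

# Demo: a scratch project with an .Rproj file at its top level
site <- file.path(tempdir(), "root-demo")
dir.create(file.path(site, "content", "post"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(site, "root-demo.Rproj"))

# Starting two levels down, the search climbs back up to the project folder
root <- find_root(file.path(site, "content", "post"))
file.path(root, "static", "data", "mazes.csv")
```

Because the path is always built from the root, the same call works whether you run it from the console or from a post several folders deep.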

For more of a breakdown, see "here, here" by Jenny Bryan.