
This sets up a typical (opinionated) project directory structure that we use for many projects at DataHaven. It will write directories at the specified path, but it will NOT overwrite any directories that already exist. You'll have the option to cancel before anything is written.
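
The no-overwrite behavior can be pictured with a minimal base R sketch. This is not the function's actual implementation: `plan_dirs` is a hypothetical helper that only separates requested directories that already exist from those that would be new.

```r
# Hypothetical helper, not part of the package: split requested directories
# into ones that already exist (left alone) and ones that would be created.
plan_dirs <- function(root, wanted) {
  paths <- file.path(root, wanted)
  found <- dir.exists(paths)
  list(skip = wanted[found], create = wanted[!found])
}

# Work in a throwaway location with one directory already in place
root <- tempfile("scaffold_demo")
dir.create(file.path(root, "analysis"), recursive = TRUE)

plan <- plan_dirs(root, c("analysis", "input_data", "output_data"))
plan$skip    # "analysis" already exists, so it is left untouched
plan$create  # "input_data" and "output_data" would be written
```

Checking for existing paths up front, before anything is written, is also what makes it possible to show the plan and offer a chance to cancel.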

Usage

scaffold_project(
  dir = ".",
  input_data = TRUE,
  output_data = TRUE,
  fetch_data = TRUE,
  analysis = TRUE,
  prep_scripts = FALSE,
  plots = FALSE,
  format_tables = FALSE,
  drafts = FALSE,
  utils = TRUE,
  addl = NULL,
  gitblank = TRUE,
  dryrun = FALSE
)

Arguments

dir

String: path to directory in which new files will be written. Default: '.'

input_data

Create a directory input_data. Default: TRUE.

Standard use: data from some outside source to be analyzed in this project.

output_data

Create a directory output_data. Default: TRUE.

Standard use: data written after analysis done in this project, generally in formats that can still be used for analysis and visualization (csv, rds) rather than formats for distribution (I usually add a folder to_distro) or to pass on to a client (xlsx). Nice spreadsheet outputs should go in format_tables or some other distribution-centered folder.

fetch_data

Create a directory fetch_data. Default: TRUE.

Standard use: a place to dump data as it comes in from API calls, queries, batch file downloads, etc.

analysis

Create a directory analysis. Default: TRUE.

Standard use: main analysis scripts, both notebooks and .R scripts.

prep_scripts

Create a directory prep_scripts. Default: FALSE.

Standard use: scripts used to prep or reshape data or documents, e.g. creating formatted spreadsheets for a client, making metadata, prepping to post to data.world, formatting for a website, bulk rendering parameterized Rmarkdown documents.

plots

Create a directory plots. Default: FALSE.

Standard use: plots, either for in-house use or outside distribution.

format_tables

Create a directory format_tables. Default: FALSE.

Standard use: spreadsheets--probably written by a script in prep_scripts--to be shared with clients or collaborators. Think of these as being files appropriate for presentation or addenda to a report, not for doing further analysis.

drafts

Create a directory drafts. Default: FALSE.

Standard use: separating the more EDA-centered notebooks from notebooks used for drafting writing. Also a good place to keep files that have been edited in outside software (.docx, etc.).

utils

Create a directory utils, written to disk as _utils (see the output under Examples). Default: TRUE.

Standard use: utility scripts and miscellaneous files, e.g. logo images, snippets of data, lists of colors to use.

addl

A string vector of any additional directories to create. Default: NULL.
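
As a rough illustration of how the logical arguments and addl could combine into one list of directory names, here is a base R sketch. The variable names and the to_distro folder are illustrative assumptions, and the package may assemble its list differently.

```r
# Assumption: a simplified stand-in for the function's logical arguments
flags <- c(input_data = TRUE, output_data = TRUE, plots = FALSE)

# Extra, project-specific folders (hypothetical example name)
addl <- c("to_distro")

# Keep the names whose flag is TRUE, then append the additional folders
dirs <- unique(c(names(flags)[flags], addl))
dirs
```

Here plots is dropped because its flag is FALSE, while to_distro is added alongside the two default data folders.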

gitblank

Logical: whether to write a blank placeholder file in each new directory to force git tracking, even without yet having folder contents. Default: TRUE. If FALSE, empty directories will not be tracked by git.
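
Git tracks files rather than directories, which is why an empty folder needs a placeholder before it shows up in a repository. A minimal base R sketch of the idea follows; the file name `.gitblank` is an assumption made for illustration, and the file this function actually writes may be named differently.

```r
# Assumption: ".gitblank" is an illustrative placeholder name only
root <- tempfile("gitblank_demo")
new_dirs <- file.path(root, c("analysis", "plots"))

for (d in new_dirs) {
  dir.create(d, recursive = TRUE)
  # An empty file is enough for git to pick up the directory
  file.create(file.path(d, ".gitblank"))
}

# all.files = TRUE is needed to list files whose names start with a dot
list.files(root, recursive = TRUE, all.files = TRUE)
```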

dryrun

Logical: whether to just do a dry run without actually writing any directories or files. Default: FALSE.
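
The dry-run pattern itself is simple: print the same messages either way, and only touch the file system when the flag is off. The sketch below shows that pattern in base R; `write_dirs` is a hypothetical helper, not the package's code.

```r
# Hypothetical helper, not the package's code: report each directory and
# only create it when dryrun is FALSE.
write_dirs <- function(paths, dryrun = FALSE) {
  for (p in paths) {
    message("Writing ", p)
    if (!dryrun) dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

root <- tempfile("dryrun_demo")
target <- file.path(root, "analysis")

write_dirs(target, dryrun = TRUE)
dir.exists(target)  # FALSE: the message printed, but nothing was written

write_dirs(target)
dir.exists(target)  # TRUE: the directory now exists
```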

Value

Returns nothing, but prints paths to newly created directories.

Examples

# create default folders--good for small analysis projects
scaffold_project(dryrun = TRUE)
#> Note that this is just a dry run. You'll see the normal printouts but no files will actually be written.
#> The following new directories will be created:
#> _utils
#> analysis
#> fetch_data
#> input_data
#> output_data
#> ────────────────────────────────────────────────────────────────────────────────
#> → Writing ./_utils
#> → Writing ./analysis
#> → Writing ./fetch_data
#> → Writing ./input_data
#> → Writing ./output_data

# create all available folders--good for larger print projects
scaffold_project(prep_scripts = TRUE,
                 plots = TRUE,
                 format_tables = TRUE,
                 drafts = TRUE,
                 dryrun = TRUE)
#> Note that this is just a dry run. You'll see the normal printouts but no files will actually be written.
#> The following new directories will be created:
#> _utils
#> analysis
#> drafts
#> fetch_data
#> format_tables
#> input_data
#> output_data
#> plots
#> prep_scripts
#> ────────────────────────────────────────────────────────────────────────────────
#> → Writing ./_utils
#> → Writing ./analysis
#> → Writing ./drafts
#> → Writing ./fetch_data
#> → Writing ./format_tables
#> → Writing ./input_data
#> → Writing ./output_data
#> → Writing ./plots
#> → Writing ./prep_scripts