This function combines several common ways of formatting titles and names. It is meant to be simple yet flexible.
Usage
clean_titles(
  x,
  cap_all = FALSE,
  split_case = TRUE,
  keep_running_caps = TRUE,
  space = "_",
  remove = NULL
)
Arguments
- x
 A character vector
- cap_all
 Logical: if TRUE, first letter of each word after splitting will be capitalized. If FALSE, only the first character of the string will be capitalized. Note that in order to balance this with respecting consecutive capital letters, such as from acronyms,
- split_case
 Logical: if TRUE, consecutive lowercase-uppercase pairs will be treated as two words to be separated.
- keep_running_caps
 Logical: if TRUE, consecutive uppercase letters will be kept uppercase.
- space
 Character vector of characters and/or regex patterns that should be replaced with a space to separate words.
- remove
 Character vector of characters and/or regex patterns that will be removed before any other operations; if NULL, nothing is removed.
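To make the roles of remove, space, and split_case more concrete, here is a minimal base R sketch of how those three steps might be approximated. The helper name sketch_split is hypothetical and this is not the package's implementation; it only mirrors the documented order (removal first) and ignores capitalization entirely.

```r
# Hypothetical helper for illustration only; not how clean_titles is implemented.
sketch_split <- function(x, space = "_", remove = NULL) {
  # Drop unwanted patterns first, matching the documented order for `remove`
  if (!is.null(remove)) {
    for (p in remove) x <- gsub(p, "", x)
  }
  # Replace each `space` pattern with a literal space
  for (p in space) x <- gsub(p, " ", x)
  # Treat a lowercase letter followed by an uppercase letter as a word break
  x <- gsub("([a-z])([A-Z])", "\\1 \\2", x, perl = TRUE)
  # Collapse repeated whitespace and trim the ends
  trimws(gsub("\\s+", " ", x))
}

sketch_split(c("TownName", "town_name", "RegionABC"))
#> [1] "Town Name"  "town name"  "Region ABC"
```

Because only lowercase-uppercase pairs are split, a run of capitals such as "ABC" stays together, which is roughly the behavior described for acronyms.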
Details
Examples of possible common operations include:
"TownName" -> "Town Name"
"town_name" -> "Town Name"
"town_name" -> "Town name"
"RegionABC" -> "Region ABC"
"TOWN_NAME" -> "Town Name"
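The difference between the two "town_name" results above comes down to cap_all. A rough base R sketch of that distinction, assuming the words are already separated by spaces, might look like the following. The helper sketch_cap is hypothetical, and it lowercases everything first, so it behaves as if running capitals were not being kept.

```r
# Hypothetical helper for illustration only; not the package's implementation.
sketch_cap <- function(x, cap_all = FALSE) {
  # Start from all lowercase (this discards acronyms such as "ABC")
  x <- tolower(x)
  if (cap_all) {
    # Uppercase the first letter of every word
    gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
  } else {
    # Uppercase only the first character of the string
    sub("^([a-z])", "\\U\\1", x, perl = TRUE)
  }
}

sketch_cap("town name", cap_all = TRUE)
#> [1] "Town Name"
sketch_cap("town name")
#> [1] "Town name"
```

The "\\U" replacement escape is only available with perl = TRUE in base R's sub() and gsub().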
Examples
t1 <- c("GreaterNewHaven", "greater_new_haven", "GREATER_NEW_HAVEN")
clean_titles(t1, cap_all = TRUE, keep_running_caps = FALSE)
#> [1] "Greater New Haven" "Greater new haven" "Greater New Haven"
t2 <- c("Male!CollegeGraduates", "Male CollegeGraduates")
clean_titles(t2, space = c("_", "!"))
#> [1] "Male college graduates" "Male college graduates"
t3 <- c("Greater BPT Men", "Greater BPT Men HBP", "GreaterBPT_men", "greaterBPT")
clean_titles(t3, cap_all = FALSE)
#> [1] "Greater BPT men"     "Greater BPT men HBP" "Greater BPT men"    
#> [4] "Greater BPT"        
t4 <- c(
    "New Haven town, New Haven County, Connecticut",
    "Newtown town, Fairfield County, Connecticut"
)
clean_titles(t4, cap_all = TRUE, remove = " town,.+")
#> [1] "New Haven" "Newtown"