
A set of helper functions for formatting numbers when you also need to top-code values above a threshold, bottom-code values below one, or both. Additional arguments switch the output to plain English text or to escaped HTML characters for printing in tables.

Usage

number_thresh(
  x,
  thresh,
  less_than = TRUE,
  accuracy = 1,
  txt = FALSE,
  html = FALSE,
  ...
)

percent_thresh(
  x,
  thresh,
  less_than = TRUE,
  accuracy = 1,
  txt = FALSE,
  html = FALSE,
  ...
)

dollar_thresh(
  x,
  thresh,
  less_than = TRUE,
  accuracy = 1,
  txt = FALSE,
  html = FALSE,
  ...
)

Arguments

x

Numeric vector

thresh

Numeric: the threshold above or below which numbers will be capped. Give two values (bottom and top endpoints) to cap at both ends.

less_than

Boolean: if TRUE, values less than the threshold will be lumped together. Otherwise, values greater than the threshold will be lumped. Ignored if both bottom and top endpoints are given in thresh. Default: TRUE
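As a rough illustration of the capping logic that thresh and less_than describe, here is a minimal base R sketch. The helper cap_label() is hypothetical and is not the package's implementation; the strict inequalities and the use of formatC() for formatting are assumptions made for the example.

```r
# Hypothetical helper, for illustration only (not part of the package).
# Assumes values strictly beyond the threshold are lumped together.
cap_label <- function(x, thresh, less_than = TRUE) {
  # Format with no decimals and a comma as thousands separator
  fmt <- function(v) formatC(v, format = "f", digits = 0, big.mark = ",")
  out <- fmt(x)
  if (length(thresh) == 2) {
    # Two endpoints: cap at both ends; less_than is ignored
    lo <- min(thresh)
    hi <- max(thresh)
    out[x < lo] <- paste0("<", fmt(lo))
    out[x > hi] <- paste0(">", fmt(hi))
  } else if (less_than) {
    out[x < thresh] <- paste0("<", fmt(thresh))
  } else {
    out[x > thresh] <- paste0(">", fmt(thresh))
  }
  out
}

both_ends <- cap_label(c(200, 99, 400, 1005, 999), thresh = c(100, 1000))
top_only <- cap_label(c(90, 95, 99.1), thresh = 99, less_than = FALSE)
```

With a single threshold, less_than picks which side is lumped; with two endpoints, both sides are capped regardless of less_than.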

accuracy

Number: accuracy of formatted numbers, passed to scales::label_number and related functions. Defaults to 1, meaning no decimal places are returned.

txt

Boolean: if TRUE, plain English is used (e.g. "less than") instead of symbols (e.g. "<"). For percent_thresh, this also means using " percent" instead of "%". Default: FALSE

html

Boolean: if TRUE, HTML-appropriate symbols are used (e.g. "&lt;") instead of more readable ones (e.g. "<"). Has no effect if txt = TRUE. Default: FALSE
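To show what HTML-appropriate symbols means in practice, the sketch below escapes the comparison symbols in labels that have already been formatted. The helper escape_angle() is hypothetical and only approximates what html = TRUE might produce; how the package does this internally is not shown in this page.

```r
# Hypothetical helper, for illustration only (not part of the package).
# Replaces literal "<" and ">" with their HTML entities.
escape_angle <- function(s) {
  s <- gsub("<", "&lt;", s, fixed = TRUE)
  gsub(">", "&gt;", s, fixed = TRUE)
}

escaped <- escape_angle(c("<$100", "$400", ">$1,000"))
```

Labels without a comparison symbol pass through unchanged, so the escaping is safe to apply to a whole column.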

...

Arguments passed on to scales::label_number

scale

A scaling factor: x will be multiplied by scale before formatting. This is useful if the underlying data is very small or very large.

prefix

Additional text to display before the number. The prefix is applied to the absolute value before style_positive and style_negative are processed, so that prefix = "$" will yield (e.g.) -$1 and ($1).

suffix

Additional text to display after the number.

big.mark

Character used between every 3 digits to separate thousands.

decimal.mark

The character to be used to indicate the numeric decimal point.

style_positive

A string that determines the style of positive numbers:

  • "none" (the default): no change, e.g. 1.

  • "plus": preceded by +, e.g. +1.

  • "space": preceded by a Unicode "figure space", i.e., a space equally as wide as a number or +. Compared to "none", adding a figure space can ensure numbers remain properly aligned when they are left- or right-justified.

style_negative

A string that determines the style of negative numbers:

  • "hyphen" (the default): preceded by a standard hyphen -, e.g. -1.

  • "minus": uses a proper Unicode minus symbol. This is a typographical nicety that ensures - aligns with the horizontal bar of +.

  • "parens", wrapped in parentheses, e.g. (1).

scale_cut

Named numeric vector that allows you to rescale large (or small) numbers and add a suffix. Built-in helpers include:

  • cut_short_scale(): [10^3, 10^6) = K, [10^6, 10^9) = M, [10^9, 10^12) = B, [10^12, Inf) = T.

  • cut_long_scale(): [10^3, 10^6) = K, [10^6, 10^12) = M, [10^12, 10^18) = B, [10^18, Inf) = T.

  • cut_si(unit): uses standard SI units.

If you supply a vector c(a = 100, b = 1000), absolute values in the range [0, 100) will not be rescaled, absolute values in the range [100, 1000) will be divided by 100 and given the suffix "a", and absolute values in the range [1000, Inf) will be divided by 1000 and given the suffix "b". If the division creates an irrational value (or one with many digits), the cut value below will be tried to see if it improves the look of the final label.

trim

Logical, if FALSE, values are right-justified to a common width (see base::format()).
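Because the arguments above are passed on to scales::label_number, one way to preview their effect is to call that function directly. The sketch below assumes the scales package (version 1.2.0 or later) is installed; the names label_dollars and label_short are only illustrative.

```r
library(scales)

# Build a labeller once, then apply it to a numeric vector.
# The prefix is attached first, then negatives are wrapped in parentheses.
label_dollars <- label_number(accuracy = 1, prefix = "$", style_negative = "parens")
neg_labels <- label_dollars(c(-1, 1))

# cut_short_scale() rescales thousands, millions, and so on,
# appending K, M, B, or T to the rescaled value.
label_short <- label_number(scale_cut = cut_short_scale())
short_labels <- label_short(c(950, 1500, 2500000))
```

The same arguments can then be supplied through ... in number_thresh(), percent_thresh(), or dollar_thresh().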

Value

A character vector of the same length as x

Examples

rate <- c(0.9, 0.95, 0.92, 0.991)
percent_thresh(rate, thresh = 0.99, less_than = FALSE)
#> [1] "90%"  "95%"  "92%"  ">99%"
percent_thresh(rate, thresh = 0.99, less_than = FALSE, txt = TRUE)
#> [1] "90 percent"           "95 percent"           "92 percent"          
#> [4] "more than 99 percent"

# censor amounts under 100 dollars or above 1000 dollars
money <- c(200, 99, 400, 1005, 999)
dollar_thresh(money, thresh = c(100, 1000))
#> [1] "$200"    "<$100"   "$400"    ">$1,000" "$999"