Last data update: 2014.03.03

R: Character classes
ClassGroups    R Documentation

Character classes

Description

Match character classes.

Usage

alnum(lo, hi, char_class = TRUE)

alpha(lo, hi, char_class = TRUE)

blank(lo, hi, char_class = TRUE)

cntrl(lo, hi, char_class = TRUE)

digit(lo, hi, char_class = TRUE)

graph(lo, hi, char_class = TRUE)

lower(lo, hi, char_class = TRUE)

printable(lo, hi, char_class = TRUE)

punct(lo, hi, char_class = TRUE)

space(lo, hi, char_class = TRUE)

upper(lo, hi, char_class = TRUE)

hex_digit(lo, hi, char_class = TRUE)

any_char(lo, hi)

dgt(lo, hi, char_class = TRUE)

wrd(lo, hi, char_class = TRUE)

spc(lo, hi, char_class = TRUE)

not_dgt(lo, hi, char_class = TRUE)

not_wrd(lo, hi, char_class = TRUE)

not_spc(lo, hi, char_class = TRUE)

ascii_digit(lo, hi, char_class = TRUE)

ascii_lower(lo, hi, char_class = TRUE)

ascii_upper(lo, hi, char_class = TRUE)

ascii_alpha(lo, hi, char_class = TRUE)

ascii_alnum(lo, hi, char_class = TRUE)

char_range(lo, hi, char_class = lo < hi)

Arguments

lo

A non-negative integer. Minimum number of repeats, when grouped.

hi

A positive integer. Maximum number of repeats, when grouped.

char_class

A logical value. Should the result be wrapped in a character class? If NA, the function guesses whether that's a good idea.
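As a rough illustration of what lo and hi control, the sketch below uses only base R with hand-written patterns. The quantifier forms {3} and {3,5} are an assumption about what calls such as digit(3) and digit(3, 5) correspond to; this page does not show the generated strings. The variable names are hypothetical.

```r
# Hand-written patterns, assumed equivalents of digit(3) and digit(3, 5).
# The ^ and $ anchors are added here so the whole string must match.
exactly_three <- "^[[:digit:]]{3}$"
three_to_five <- "^[[:digit:]]{3,5}$"

x <- c("12", "123", "12345", "123456")

# lo only: exactly three digits
is_three <- grepl(exactly_three, x)

# lo and hi: between three and five digits
is_three_to_five <- grepl(three_to_five, x)

# An unwrapped class such as [:digit:] only makes sense inside brackets,
# which is what lets it be combined with other characters.
digits_or_underscore <- grepl("^[[:digit:]_]+$", c("1_2", "1-2"))

print(data.frame(x, is_three, is_three_to_five))
print(digits_or_underscore)
```

The last pattern hints at why char_class = FALSE can be useful: the bare class can be placed inside a larger bracket expression alongside other characters.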

Value

A character vector representing part or all of a regular expression.

Note

R has many built-in locale-dependent character classes, like [:alnum:] (representing lower or upper case letters or numbers). There are also some generic character classes like \w (representing lower or upper case letters or numbers or underscores). Finally, there are ASCII-only ways of specifying letters like a-zA-Z. Which version you want depends upon how you want to deal with international characters, and the vagaries of the underlying regular expression engine. I suggest reading the regex help page and doing lots of testing.
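The three styles described in this note can be compared with a small base R sketch. The patterns are written by hand rather than generated by the functions on this page, and the variable names are hypothetical. Only ASCII input is used, so the result should not depend on the locale.

```r
x <- c("a", "Z", "7", "_", "-")

# Locale-dependent POSIX class: letters and digits
posix_alnum <- grepl("^[[:alnum:]]$", x)

# Generic shorthand class: letters, digits and underscore
generic_word <- grepl("^\\w$", x)

# ASCII-only range: unaccented letters only
ascii_alpha <- grepl("^[a-zA-Z]$", x)

print(data.frame(x, posix_alnum, generic_word, ascii_alpha))
```

On this input the underscore is accepted only by \w, and the digit is rejected only by the a-zA-Z range. Non-ASCII letters such as "é" are where the styles can diverge further, and that behavior depends on the locale and the regular expression engine, so it is worth testing on your own system.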

References

http://www.regular-expressions.info/shorthand.html and http://www.rexegg.com/regex-quickstart.html#posix

See Also

regex, Unicode

Examples

# R character classes
alnum()
alpha()
blank()
cntrl()
digit()
graph()
lower()
printable()
punct()
space()
upper()
hex_digit()

# Generic classes
any_char()
dgt()
wrd()
spc()

# Generic negated classes
not_dgt()
not_wrd()
not_spc()

# Non-locale-specific classes
ascii_digit()
ascii_lower()
ascii_upper()

# Don't provide a class wrapper
digit(char_class = FALSE) # same as DIGIT

# Match repeated values
digit(3)
digit(3, 5)
digit(0)
digit(1)
digit(0, 1)

# Ranges of characters
char_range(0, 7) # octal number

# Usage
(rx <- digit(3))
stringi::stri_detect_regex(c("123", "one23"), rx)
