Last data update: 2014.03.03

parse_datetime    R Documentation

Parse a character vector of dates or date times.

Description

Parse a character vector of dates or date times.

Usage

parse_datetime(x, format = "", locale = default_locale())

col_datetime(format = "")

parse_date(x, format = "%Y-%m-%d", locale = default_locale())

col_date(format = NULL)

parse_time(x, format = "", locale = default_locale())

col_time(format = "")

Arguments

x

A character vector of dates to parse.

format

A format specification, as described below. If omitted, parses dates according to the ISO8601 specification (with caveats, as described below). Times are parsed like ISO8601 times, but also accept an optional am/pm specification.

Unlike strptime, the format specification must match the complete string.

locale

The locale controls defaults that vary from place to place. The default locale is US-centric (like R), but you can use locale() to create your own locale that controls things like the default time zone, encoding, decimal mark, big mark, and day/month names.
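
As a brief illustration of the format argument's default behaviour for times, the sketch below parses a 24-hour time and a 12-hour time with an am/pm suffix. It assumes the readr package is installed and attached; the input strings are made up for illustration.

```r
library(readr)

# With the default format, times are read like ISO8601 times.
t24 <- parse_time("22:20")

# An optional am/pm suffix is also accepted.
t12 <- parse_time("10:20 pm")

# Both values represent the same time of day.
t24
t12
```

parse_time() returns a time-of-day value rather than a POSIXct, so no date or time zone is attached to the result.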

Value

A POSIXct vector with tzone attribute set to tz. Elements that could not be parsed (or did not generate valid dates) will be set to NA, and a warning message will inform you of the total number of failures.
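
A minimal sketch of how failures might be inspected, assuming the readr package is installed and attached. The input vector is illustrative; problems() is readr's accessor for the parse failures recorded on the result.

```r
library(readr)

# The second and third strings are not valid dates under this format.
x <- c("01/02/2010", "01/ab/2010", "32/01/2010")

# suppressWarnings() only hides the summary warning; the failure
# details stay attached to the result.
parsed <- suppressWarnings(parse_datetime(x, "%d/%m/%Y"))

# Elements that could not be parsed are NA.
is.na(parsed)

# The time zone is stored in the tzone attribute.
attr(parsed, "tzone")

# One row per failed element.
problems(parsed)
```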

Format specification

readr uses a format specification similar to strptime. There are three types of element:

  1. Date components are specified with "%" followed by a letter. For example "%Y" matches a 4 digit year, "%m" matches a 2 digit month and "%d" matches a 2 digit day.

  2. Whitespace is any sequence of zero or more whitespace characters.

  3. Any other character is matched exactly.

parse_datetime recognises the following format specifications:

  • Year: "%Y" (4 digits), "%y" (2 digits); 00-69 -> 2000-2069, 70-99 -> 1970-1999.

  • Month: "%m" (2 digits), "%b" (abbreviated name in current locale), "%B" (full name in current locale).

  • Day: "%d" (2 digits), "%e" (optional leading space)

  • Hour: "%H"

  • Minutes: "%M"

  • Seconds: "%S" (integer seconds), "%OS" (partial seconds)

  • Time zone: "%Z" (as name, e.g. "America/Chicago"), "%z" (as offset from UTC, e.g. "+0800")

  • AM/PM indicator: "%p".

  • Non-digits: "%." skips one non-digit character, "%+" skips one or more non-digit characters, "%*" skips any number of non-digit characters.

  • Shortcuts: "%D" = "%m/%d/%y", "%F" = "%Y-%m-%d", "%R" = "%H:%M", "%T" = "%H:%M:%S", "%x" = "%y/%m/%d".
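
The specifications above can be combined freely. The following sketch exercises a few of them, assuming the readr package is installed and attached; the input strings and variable names are made up for illustration.

```r
library(readr)

# "%y" expands two-digit years: low values land in the 2000s,
# high values in the 1900s.
y_low  <- parse_date("01/02/15", "%d/%m/%y")
y_high <- parse_date("01/02/85", "%d/%m/%y")

# "%b" matches an abbreviated month name (English in the default locale).
by_name <- parse_date("02 Jan 2010", "%d %b %Y")

# "%." skips exactly one non-digit, so mixed separators are accepted.
mixed <- parse_date("2010-01.02", "%Y%.%m%.%d")

# "%z" reads an offset from UTC; the result is expressed in the
# locale's default time zone (UTC unless changed).
with_offset <- parse_datetime("2010-01-02 12:00 +0800", "%Y-%m-%d %H:%M %z")

y_low
y_high
by_name
mixed
with_offset
```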

ISO8601 support

Currently, readr does not support all of ISO8601. Missing features:

  • Week & weekday specifications, e.g. "2013-W05", "2013-W05-10"

  • Ordinal dates, e.g. "2013-095".

  • Using commas instead of a period for decimal separator

The parser is also a little laxer than ISO8601:

  • Dates and times can be separated with a space, not just T.

  • Mostly correct specifications like "2009-05-19 14:" and "200912-01" work.
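
A short sketch of this behaviour, assuming the readr package is installed and attached. The week-date string is taken from the list of missing features above; it is wrapped in suppressWarnings() because it is expected to fail.

```r
library(readr)

# A space is accepted between the date and the time.
spaced <- parse_datetime("2009-05-19 14:30")

# The equivalent strict ISO8601 form with "T".
strict <- parse_datetime("2009-05-19T14:30")

# Week specifications are not supported, so this yields NA.
week <- suppressWarnings(parse_datetime("2013-W05"))

spaced
strict
week
```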

Examples

# Format strings --------------------------------------------------------
parse_datetime("01/02/2010", "%d/%m/%Y")
parse_datetime("01/02/2010", "%m/%d/%Y")
# Handle any separator
parse_datetime("01/02/2010", "%m%.%d%.%Y")

# Dates look the same, but internally they use the number of days since
# 1970-01-01 instead of the number of seconds. This avoids a whole lot
# of troubles related to time zones, so use dates if you can.
parse_date("01/02/2010", "%d/%m/%Y")
parse_date("01/02/2010", "%m/%d/%Y")

# You can parse timezones from strings (as listed in OlsonNames())
parse_datetime("2010/01/01 12:00 US/Central", "%Y/%m/%d %H:%M %Z")
# Or from offsets
parse_datetime("2010/01/01 12:00 -0600", "%Y/%m/%d %H:%M %z")

# Use the locale parameter to control the default time zone
# (but note UTC is considerably faster than other options)
parse_datetime("2010/01/01 12:00", "%Y/%m/%d %H:%M",
  locale = locale(tz = "US/Central"))
parse_datetime("2010/01/01 12:00", "%Y/%m/%d %H:%M",
  locale = locale(tz = "US/Eastern"))

# Unlike strptime, the format specification must match the complete
# string (ignoring leading and trailing whitespace). This avoids common
# errors:
strptime("01/02/2010", "%d/%m/%y")
parse_datetime("01/02/2010", "%d/%m/%y")

# Failures -------------------------------------------------------------
parse_datetime("01/01/2010", "%d/%m/%Y")
parse_datetime(c("01/ab/2010", "32/01/2010"), "%d/%m/%Y")

# Locales --------------------------------------------------------------
# By default, readr expects English date/times, but that's easy to change
parse_datetime("1 janvier 2015", "%d %B %Y", locale = locale("fr"))
parse_datetime("1 enero 2015", "%d %B %Y", locale = locale("es"))

# ISO8601 --------------------------------------------------------------
# With separators
parse_datetime("1979-10-14")
parse_datetime("1979-10-14T10")
parse_datetime("1979-10-14T10:11")
parse_datetime("1979-10-14T10:11:12")
parse_datetime("1979-10-14T10:11:12.12345")

# Without separators
parse_datetime("19791014")
parse_datetime("19791014T101112")

# Time zones
us_central <- locale(tz = "US/Central")
parse_datetime("1979-10-14T1010", locale = us_central)
parse_datetime("1979-10-14T1010-0500", locale = us_central)
parse_datetime("1979-10-14T1010Z", locale = us_central)
# Your current time zone
parse_datetime("1979-10-14T1010", locale = locale(tz = ""))
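
The Usage section lists col_datetime(), which none of the examples above exercise. The sketch below shows one way it might be used as a column specification when reading delimited data. It assumes the readr package is installed and attached; the inline CSV text and its column names are made up for illustration.

```r
library(readr)

# Hypothetical inline data; I() marks the string as literal data
# rather than a file path.
csv <- "when,value\n01/02/2010 12:30,1\n03/04/2011 08:15,2\n"

df <- read_csv(
  I(csv),
  col_types = cols(
    when  = col_datetime(format = "%d/%m/%Y %H:%M"),
    value = col_integer()
  )
)

# The "when" column is parsed with the same rules as parse_datetime().
df$when
```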
