How do I translate an ISO 8601 datetime string into a Python datetime object?
I prefer the dateutil library for its timezone handling and generally solid date parsing. If you were to get an ISO 8601
string like: 2010-05-08T23:41:54.000Z
you'd have a fun time parsing that with strptime, especially if you didn't know up front whether or not the timezone was included. pyiso8601
has a couple of issues (check their tracker) that I ran into during my usage, and it hasn't been updated in a few years. dateutil, by contrast, has been active and worked for me:
from dateutil import parser
yourdate = parser.parse(datestring)
As of Python 3.7, with no external libraries, you can use the strptime function from the datetime module:
import datetime
datetime.datetime.strptime("2019-01-04T16:41:24+0200", "%Y-%m-%dT%H:%M:%S%z")
For more formatting options, see the "strftime() and strptime() Format Codes" section of the datetime module documentation.
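As a small illustration of the strptime approach, the sketch below parses the same string and normalizes it to UTC. The helper name parse_offset_timestamp is made up for this example, and the code assumes Python 3.7 or later, as stated above.

```python
from datetime import datetime, timezone

def parse_offset_timestamp(text):
    """Hypothetical helper: parse 'YYYY-MM-DDTHH:MM:SS' followed by a UTC offset."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")

local_dt = parse_offset_timestamp("2019-01-04T16:41:24+0200")
# The result is timezone-aware, so it can be converted to UTC directly.
utc_dt = local_dt.astimezone(timezone.utc)
print(local_dt.utcoffset())  # 2:00:00
print(utc_dt.isoformat())    # 2019-01-04T14:41:24+00:00
```

Because %z attaches the offset to the result, no manual arithmetic is needed to get the UTC value.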
Python 2 doesn't support the %z
format specifier, so it's best to explicitly use Zulu time everywhere if possible:
datetime.datetime.strptime("2007-03-04T21:08:12Z", "%Y-%m-%dT%H:%M:%SZ")
ISO 8601 allows many variations in which the optional colons and dashes may or may not be present, basically CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm].
If you want to use strptime, you need to strip out those variations first.
The goal is to generate a UTC datetime object.
If you just want the basic case that works for UTC with the Z suffix, like 2016-06-29T19:36:29.3453Z:
# Python 2 form of str.translate: deletes every ':' and '-' before parsing
datetime.datetime.strptime(timestamp.translate(None, ':-'), "%Y%m%dT%H%M%S.%fZ")
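The translate(None, ...) call above uses the Python 2 signature of str.translate. On Python 3 the same stripping can be done with a deletion table from str.maketrans; the sketch below is one way to write it, with parse_zulu being a made-up helper name rather than anything from the answer.

```python
from datetime import datetime

def parse_zulu(timestamp):
    """Hypothetical helper: parse a 'Z'-suffixed timestamp with fractional seconds."""
    # str.maketrans("", "", ":-") builds a table that deletes ':' and '-'.
    stripped = timestamp.translate(str.maketrans("", "", ":-"))
    return datetime.strptime(stripped, "%Y%m%dT%H%M%S.%fZ")

print(parse_zulu("2016-06-29T19:36:29.3453Z"))  # 2016-06-29 19:36:29.345300
```

The result is a naive datetime: the literal Z in the format string is matched as text and does not attach any tzinfo.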
If you want to handle timezone offsets like 2016-06-29T19:36:29.3453-0400
or 2008-09-03T20:56:35.450686+05:00,
use the following. It converts all variations into a form without variable delimiters, like 20080903T205635.450686+0500,
making it more consistent and easier to parse.
import re
import datetime
# This regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z")
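As a rough illustration of the normalization step, the sketch below wraps the regex in a helper and parses both offset styles mentioned above. The names conform_timestamp and parse_with_offset are made up for this example, and the code assumes a Python 3 interpreter where strptime supports %z.

```python
import re
from datetime import datetime, timezone

# Same pattern as above: match every colon, and every dash that is not
# the sign of a trailing UTC offset.
_DELIMITERS = re.compile(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))")

def conform_timestamp(timestamp):
    """Hypothetical helper: drop colons and date dashes, keep the offset sign."""
    return _DELIMITERS.sub("", timestamp)

def parse_with_offset(timestamp):
    """Hypothetical helper: parse fractional seconds plus a numeric UTC offset."""
    return datetime.strptime(conform_timestamp(timestamp), "%Y%m%dT%H%M%S.%f%z")

for raw in ("2016-06-29T19:36:29.3453-0400", "2008-09-03T20:56:35.450686+05:00"):
    parsed = parse_with_offset(raw)
    print(conform_timestamp(raw), "->", parsed.astimezone(timezone.utc).isoformat())
```

Both inputs end up in the same delimiter-free shape, so a single format string covers them.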
If your system does not support the %z
strptime directive (you see something like ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z'),
then you need to manually offset the time from Z
(UTC). Note that %z
may not work on your system in Python versions < 3, as it depends on C library support, which varies across systems and Python build types (e.g., Jython, Cython, etc.).
import re
import datetime
# This regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
# Split on the offset to remove it. Use a capture group to keep the delimiter
split_timestamp = re.split(r"([+-])", conformed_timestamp)
# Drop a trailing Z, if present, so it is not duplicated below
main_timestamp = split_timestamp[0].rstrip("Z")
if len(split_timestamp) == 3:
    sign = split_timestamp[1]
    offset = split_timestamp[2]
else:
    sign = None
    offset = None
# Generate the datetime object without the offset at UTC time
output_datetime = datetime.datetime.strptime(main_timestamp + "Z", "%Y%m%dT%H%M%S.%fZ")
if offset:
    # Create timedelta based on offset
    offset_delta = datetime.timedelta(hours=int(sign + offset[:-2]), minutes=int(sign + offset[-2:]))
    # Local time = UTC + offset, so subtract the offset to get UTC
    output_datetime = output_datetime - offset_delta
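For reuse, the manual fallback can be wrapped in a function. The following is only a sketch: parse_to_naive_utc is a hypothetical name, the input is assumed to always carry fractional seconds, and the offset is subtracted because a local time equals UTC plus its offset.

```python
import re
from datetime import datetime, timedelta

def parse_to_naive_utc(timestamp):
    """Hypothetical helper: parse without %z and return a naive datetime in UTC."""
    conformed = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", "", timestamp)
    # The capture group keeps the sign as its own list element.
    parts = re.split(r"([+-])", conformed)
    parsed = datetime.strptime(parts[0].rstrip("Z"), "%Y%m%dT%H%M%S.%f")
    if len(parts) == 3:
        sign = -1 if parts[1] == "-" else 1
        offset = timedelta(hours=int(parts[2][:2]), minutes=int(parts[2][2:]))
        # Local time = UTC + offset, so remove the signed offset to reach UTC.
        parsed -= sign * offset
    return parsed

print(parse_to_naive_utc("2008-09-03T20:56:35.450686+05:00"))  # 2008-09-03 15:56:35.450686
```

Returning a naive value keeps the function usable on interpreters without timezone support, at the cost of the caller having to remember that the result is in UTC.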