How do I translate an ISO 8601 datetime string into a Python datetime object?

I prefer using the dateutil library for its timezone handling and generally solid date parsing. If you were to get an ISO 8601 string like 2010-05-08T23:41:54.000Z, you'd have a fun time parsing that with strptime, especially if you didn't know up front whether or not the timezone was included. pyiso8601 has a couple of issues (check their tracker) that I ran into during my usage, and it hasn't been updated in a few years. dateutil, by contrast, has been active and worked for me:

from dateutil import parser
yourdate = parser.parse(datestring)

Since Python 3.7, you can do this without external libraries by using the strptime function from the datetime module:

import datetime
datetime.datetime.strptime('2019-01-04T16:41:24+0200', '%Y-%m-%dT%H:%M:%S%z')

For more formatting options, see the strftime() and strptime() format codes in the datetime module documentation.
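When you don't know up front whether fractional seconds are present, one option is to try more than one format string. The following is only a sketch: parse_iso8601 and _FORMATS are hypothetical names, and it assumes Python 3.7 or later, where %z also accepts a trailing Z and offsets written with a colon such as +05:00.

```python
import datetime

# Hypothetical helper, not part of any library.
# Assumes Python 3.7+, where %z accepts "Z", "+0200" and "+02:00".
_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%f%z",  # with fractional seconds
    "%Y-%m-%dT%H:%M:%S%z",     # without fractional seconds
)

def parse_iso8601(text):
    """Try each known format in turn and return an aware datetime."""
    for fmt in _FORMATS:
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unsupported timestamp: %r" % text)
```

This only covers the extended form with a T separator and an explicit offset; strings without any timezone would need additional format entries.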

Python 2 doesn't support the %z format specifier, so it's best to explicitly use Zulu time everywhere if possible:

datetime.datetime.strptime('2007-03-04T21:08:12Z', '%Y-%m-%dT%H:%M:%SZ')

ISO 8601 allows many variations in which the optional colons and dashes may or may not be present, basically CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]. If you want to use strptime, you need to strip out those variations first.

The goal is to generate a UTC datetime object.

If you just want a basic case that works for UTC with the Z suffix, like 2016-06-29T19:36:29.3453Z:

# Python 2 form of str.translate: deletes every ':' and '-' before parsing
datetime.datetime.strptime(timestamp.translate(None, ':-'), '%Y%m%dT%H%M%S.%fZ')
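The two-argument translate(None, ...) call is the Python 2 form of str.translate. On Python 3 the same idea could be sketched roughly as follows; parse_zulu is a hypothetical name, and attaching timezone.utc at the end is an addition made here so the result is timezone-aware rather than naive.

```python
import datetime

def parse_zulu(timestamp):
    """Parse a Z-suffixed timestamp with fractional seconds (Python 3 sketch)."""
    # str.maketrans("", "", ":-") builds a table that deletes ':' and '-'
    stripped = timestamp.translate(str.maketrans("", "", ":-"))
    naive = datetime.datetime.strptime(stripped, "%Y%m%dT%H%M%S.%fZ")
    # The Z suffix means UTC, so mark the result accordingly
    return naive.replace(tzinfo=datetime.timezone.utc)
```

Note that %f right-pads short fractions, so .3453 is read as 345300 microseconds.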

If you want to handle timezone offsets like 2016-06-29T19:36:29.3453-0400 or 2008-09-03T20:56:35.450686+05:00, use the following. It converts all variations into a form without variable delimiters, such as 20080903T205635.450686+0500, which is more consistent and easier to parse.

import re
import datetime

# This regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", "", timestamp)
datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z")

If your system does not support the %z strptime directive (you see something like ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z'), then you need to manually offset the time to get UTC. Note that %z may not work on your system in Python versions < 3, because it depends on C library support, which varies across systems and Python build types (e.g., Jython, CPython).

import re
import datetime

# This regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", "", timestamp)

# Split on the offset to remove it. Use a capture group to keep the delimiter
split_timestamp = re.split(r"([+-])", conformed_timestamp)
main_timestamp = split_timestamp[0]
if len(split_timestamp) == 3:
    sign = split_timestamp[1]
    offset = split_timestamp[2]
else:
    sign = None
    offset = None

# Generate the datetime object from the main part, ignoring the offset for now
output_datetime = datetime.datetime.strptime(main_timestamp + "Z", "%Y%m%dT%H%M%S.%fZ")
if offset:
    # Create timedelta based on offset
    offset_delta = datetime.timedelta(hours=int(sign+offset[:-2]), minutes=int(sign+offset[-2:]))

    # Subtract the offset to get UTC (UTC = local time - offset)
    output_datetime = output_datetime - offset_delta
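
The direction of the adjustment is easy to get backwards, so it can help to wrap the manual approach in a function and check it against known values. The version below is a sketch that avoids %z entirely; parse_to_naive_utc is a hypothetical name, and stripping a trailing Z is an addition made here so that Z-suffixed input is also accepted.

```python
import datetime
import re

def parse_to_naive_utc(timestamp):
    """Return a naive datetime in UTC without relying on %z."""
    # Remove colons and all dashes except the one starting a trailing offset
    conformed = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", "", timestamp)
    # Split on the offset sign; the capture group keeps the sign itself
    parts = re.split(r"([+-])", conformed)
    parsed = datetime.datetime.strptime(parts[0].rstrip("Z"), "%Y%m%dT%H%M%S.%f")
    if len(parts) == 3:
        sign = -1 if parts[1] == "-" else 1
        offset = datetime.timedelta(hours=int(parts[2][:2]), minutes=int(parts[2][2:]))
        # Local time = UTC + offset, so UTC = local time - offset
        parsed -= sign * offset
    return parsed
```

A timestamp ending in -0400 therefore moves four hours forward to reach UTC, and one ending in +05:00 moves five hours back.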
