How do I parse an ISO 8601-formatted date and time?
D

28

928

I need to parse RFC 3339 strings like "2008-09-03T20:56:35.450686Z" into Python's datetime type.

I have found strptime in the Python standard library, but it is not very convenient.

What is the best way to do this?

Dynamite answered 24/9, 2008 at 15:17 Comment(2)
related: Convert timestamps with offset to datetime obj using strptimeSurprint
To be clear: ISO 8601 is the main standard. RFC 3339 is a self-proclaimed “profile” of ISO 8601 that makes some unwise overrides of ISO 8601 rules.Disclose
E
682

isoparse function from python-dateutil

The python-dateutil package has dateutil.parser.isoparse to parse not only RFC 3339 datetime strings like the one in the question, but also other ISO 8601 date and time strings that don't comply with RFC 3339 (such as ones with no UTC offset, or ones that represent only a date).

>>> import dateutil.parser
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686Z') # RFC 3339 format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686') # ISO 8601 extended format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903T205635.450686') # ISO 8601 basic format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903') # ISO 8601 basic format, date only
datetime.datetime(2008, 9, 3, 0, 0)

The python-dateutil package also has dateutil.parser.parse. Compared with isoparse, it is presumably less strict, but both of them are quite forgiving and will attempt to interpret the string that you pass in. If you want to eliminate the possibility of any misreads, you need to use something stricter than either of these functions.

Comparison with Python 3.7+’s built-in datetime.datetime.fromisoformat

dateutil.parser.isoparse is a full ISO 8601 parser, but in Python ≤ 3.10 fromisoformat is deliberately not. In Python 3.11 and later, fromisoformat supports almost all valid ISO 8601 strings. See fromisoformat's docs for the remaining caveats. (See this answer.)

Edessa answered 5/3, 2013 at 15:44 Comment(14)
For the lazy, it's installed via python-dateutil not dateutil, so: pip install python-dateutil.Stemson
Be warned that the dateutil.parser is intentionally hacky: it tries to guess the format and makes inevitable assumptions (customizable by hand only) in ambiguous cases. So ONLY use it if you need to parse input of unknown format and are okay to tolerate occasional misreads.Turpentine
Agreed. An example is passing a "date" of 9999. This will return the same as datetime(9999, current month, current day). Not a valid date in my view.Ingenerate
@Turpentine what package would you recommend for non-guessing parsing?Butterfish
@Butterfish iso8601 as another answer suggests.Turpentine
@Turpentine but that's for iso8601, not rfc3339. Although the question is kind of confusing and seems to treat both as the same, I thought we were talking only about rfc3339.Butterfish
@Butterfish RFC 3339, right in the abstract: "This document defines a date and time format for use in Internet protocols that is a profile of the ISO 8601 standard for representation of dates and times using the Gregorian calendar."Turpentine
@Turpentine I stand corrected then, thanks. I took a look at the doc, but did not understand that a profile of ISO 8601 means a strict subset of ISO 8601 (I'm not a native speaker). BTW, there seems to be a minor incompatibility between the two regarding the TZ -00:00, but I don't think that can cause any trouble in my case.Butterfish
In Python 3, the parser always uses the tzlocal time zone, regardless of Z appearing at the end of the time string, on systems that are configured to use UTC as their default time zone. Numeric offsets produce a tzoffset tzinfo object.Reasoned
For a shorter way to write it down you can do: from dateutil.parser import parse as parsedate and then use parsedate() instead of dateutil.parser.parse()Athamas
@Turpentine there's an update to the module that reads iso8601 dates: dateutil.readthedocs.io/en/stable/…Villa
It is a pity, you have to install a third party library for a very common use of a date format, i mean the notation ending with Z.Rendering
This should be fixed in Python 3.11Adlay
@noamcohen97 Thanks. I edited this answer and the other answer for Python 3.11Edessa
N
479

Since Python 3.11, the standard library’s datetime.fromisoformat supports most valid ISO 8601 input (the exceptions are listed in the docs excerpt below). In earlier versions it only parses a specific subset; see the cautionary note in the docs. If you are using Python 3.10 or earlier on strings that don't fall into that subset (like the one in the question), see the other answers for functions from outside the standard library. The docs:

classmethod datetime.fromisoformat(date_string):

Return a datetime corresponding to a date_string in any valid ISO 8601 format, with the following exceptions:

  1. Time zone offsets may have fractional seconds.
  2. The T separator may be replaced by any single unicode character.
  3. Ordinal dates are not currently supported.
  4. Fractional hours and minutes are not supported.

Examples:

>>> from datetime import datetime
>>> datetime.fromisoformat('2011-11-04')
datetime.datetime(2011, 11, 4, 0, 0)
>>> datetime.fromisoformat('20111104')
datetime.datetime(2011, 11, 4, 0, 0)
>>> datetime.fromisoformat('2011-11-04T00:05:23')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-11-04T00:05:23Z')
datetime.datetime(2011, 11, 4, 0, 5, 23, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('20111104T000523')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-W01-2T00:05:23.283')
datetime.datetime(2011, 1, 4, 0, 5, 23, 283000)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283+00:00')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('2011-11-04T00:05:23+04:00')   
datetime.datetime(2011, 11, 4, 0, 5, 23, tzinfo=datetime.timezone(datetime.timedelta(seconds=14400)))

New in version 3.7.

Changed in version 3.11: Previously, this method only supported formats that could be emitted by date.isoformat() or datetime.isoformat().
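
For code that has to run on both sides of the 3.11 change, a small wrapper can smooth over the two gaps mentioned in the comments below (the trailing Z and the fractional-digit count). This is only a sketch: parse_iso_compat is a made-up name, and it assumes extended-format input in which the only "." is the seconds fraction.

```python
import re
from datetime import datetime

# Matches the fractional-seconds part, e.g. ".52" or ".123456789"
_FRACTION = re.compile(r"\.(\d+)")


def parse_iso_compat(text: str) -> datetime:
    """Parse an extended-format ISO 8601 / RFC 3339 datetime on Python 3.7+."""
    # Older fromisoformat versions reject a trailing "Z", so spell it as an offset.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    # Older versions also insist on exactly 3 or 6 fractional digits. Pad or
    # truncate to 6 (truncation silently drops sub-microsecond precision).
    text = _FRACTION.sub(
        lambda m: "." + m.group(1)[:6].ljust(6, "0"), text, count=1
    )
    return datetime.fromisoformat(text)


print(parse_iso_compat("2008-09-03T20:56:35.450686Z"))
print(parse_iso_compat("1985-04-12T23:20:50.52Z"))
```

On 3.11+ the normalisation is mostly redundant, but it is harmless, so the same function can be used everywhere.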

Nerissanerita answered 11/4, 2018 at 20:32 Comment(9)
That's weird. Because a datetime may contain a tzinfo, and thus output a timezone, but datetime.fromisoformat() doesn't parse the tzinfo ? seems like a bug ..Derzon
Don't miss that note in the documentation, this doesn't accept all valid ISO 8601 strings, only ones generated by isoformat. It doesn't accept the example in the question "2008-09-03T20:56:35.450686Z" because of the trailing Z, but it does accept "2008-09-03T20:56:35.450686".Edessa
To properly support the Z the input script can be modified with date_string.replace("Z", "+00:00").Prosaic
Note that for seconds it only handles either exactly 0, 3 or 6 decimal places. If the input data has 1, 2, 4, 5, 7 or more decimal places, parsing will fail!Frannie
As noted, this method will only successfully parse the output of isoformat, and is not fully ISO-8601 compliant, but very few languages are fully compliant given how large and arcane that standard is. Yes Java will accept timezones and date offsets, but anything further than that will fall over as wellAurea
@jox: Do you mean "+0000" instead of "+00:00"? I am looking at the docs for datetime.strptime() and %z here: docs.python.org/3/library/…Contamination
@Contamination no, datetime.fromisoformat seems to expect another format. I just tested both versions and while it works fine with +00:00, I get "ValueError: Invalid isoformat string" with +0000.Prosaic
@Prosaic Great feedback. So datetime.fromisoformat is even more insane than I thought! How can Python be such a great language and ecosystem, but have such horrible date/time handling? My Python date/time code is usually littered with "gotcha" comments and links to SO.com answers / comments!Contamination
fromisoformat accepts almost all ISO 8601 date strings in Python 3.11 now, so a lot of these comments are out of date.Edessa
S
235

Note that in Python 2.6+ and Python 3, the %f directive matches microseconds.

>>> import datetime
>>> datetime.datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%fZ")

See issue here
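
As the comments below point out, the literal Z in that format string is matched but not interpreted, so the result is naive, and input without a fraction is rejected. A short sketch to make both points concrete (the variable names are just for illustration, and attaching UTC assumes you already know the input is in UTC):

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"

naive = datetime.strptime("2008-09-03T20:56:35.450686Z", FMT)
print(naive.tzinfo)  # None: the "Z" was matched as a literal, not as an offset

# If the input is known to be UTC, attach the zone explicitly.
aware = naive.replace(tzinfo=timezone.utc)
print(aware.isoformat())  # 2008-09-03T20:56:35.450686+00:00

# Input without fractional seconds does not match this format at all.
try:
    datetime.strptime("2008-09-03T20:56:35Z", FMT)
    no_fraction_ok = True
except ValueError:
    no_fraction_ok = False
print(no_fraction_ok)  # False
```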

Shaina answered 24/9, 2008 at 15:45 Comment(7)
Note - if using Naive datetimes - I think you get no TZ at all - Z may not match anything.Temekatemerity
in my case %f caught microseconds rather than Z, datetime.datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%S.%f') so this did the trickBoulanger
Does Py3K mean Python 3000?!?Indelicate
Fails if no ms or tz.Indelicate
@Indelicate IIRC, "Python 3000" is an old name for what is now known as Python 3.Reasoned
This answer (in its current, edited form) relies upon hard-coding a particular UTC offset (namely "Z", which means +00:00) into the format string. This is a bad idea because it will fail to parse any datetime with a different UTC offset and raise an exception. Also, even if you use this to parse a datetime with an offset of Z, you'll get back a "naive" datetime object with no timezone, instead of "timezone-aware" one with UTC as the timezone, which would be more correct.Doble
fail for this string: 2008-09-03T20:56:35ZGroundhog
D
196

As of Python 3.7, you can basically (caveats below) get away with using datetime.datetime.strptime to parse RFC 3339 datetimes, like this:

from datetime import datetime

def parse_rfc3339(datetime_str: str) -> datetime:
    try:
        return datetime.strptime(datetime_str, "%Y-%m-%dT%H:%M:%S.%f%z")
    except ValueError:
        # Perhaps the datetime has a whole number of seconds with no decimal
        # point. In that case, this will work:
        return datetime.strptime(datetime_str, "%Y-%m-%dT%H:%M:%S%z")

It's a little awkward, since we need to try two different format strings in order to support both datetimes with a fractional number of seconds (like 2022-01-01T12:12:12.123Z) and those without (like 2022-01-01T12:12:12Z), both of which are valid under RFC 3339. But as long as we do that single fiddly bit of logic, this works.

Some caveats to note about this approach:

  • It technically doesn't fully support RFC 3339, since RFC 3339 bizarrely lets you use a space instead of a T to separate the date from the time, even though RFC 3339 purports to be a profile of ISO 8601 and ISO 8601 does not allow this. If you want to support this silly quirk of RFC 3339, you could add datetime_str = datetime_str.replace(' ', 'T') to the start of the function.
  • My implementation above is slightly more permissive than a strict RFC 3339 parser should be, since it will allow timezone offsets like +0500 without a colon, which RFC 3339 does not support. If you don't merely want to parse known-to-be-RFC-3339 datetimes but also want to rigorously validate that the datetime you're getting is RFC 3339, use another approach or add in your own logic to validate the timezone offset format.
  • This function definitely doesn't support all of ISO 8601, which includes a much wider array of formats than RFC 3339. (e.g. 2009-W01-1 is a valid ISO 8601 date.)
  • It does not work in Python 3.6 or earlier, since in those old versions the %z specifier only matches timezone offsets like +0500 or -0430 or +0000, not RFC 3339 timezone offsets like +05:00 or -04:30 or Z.
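
If you do want the stricter behaviour described in the first two caveats, one option is to check the shape of the string with a regular expression before handing it to strptime. The following is a rough sketch, not a complete RFC 3339 validator: the pattern and the name parse_rfc3339_strict are my own, it does not range-check fields beyond what strptime does, and it still rejects fractions longer than six digits because %f does.

```python
import re
from datetime import datetime

# Hypothetical shape check: date, then "T"/"t"/space, then time, an optional
# fraction, and finally either "Z"/"z" or an offset that MUST contain a colon.
_RFC3339_SHAPE = re.compile(
    r"\d{4}-\d{2}-\d{2}[Tt ]\d{2}:\d{2}:\d{2}(\.\d+)?([Zz]|[+-]\d{2}:\d{2})",
    re.ASCII,
)


def parse_rfc3339_strict(datetime_str: str) -> datetime:
    match = _RFC3339_SHAPE.fullmatch(datetime_str)
    if match is None:
        raise ValueError(f"not an RFC 3339 datetime: {datetime_str!r}")
    # Normalise the optional variants so a single pair of formats suffices.
    normalized = datetime_str.upper().replace(" ", "T")
    has_fraction = match.group(1) is not None
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z" if has_fraction else "%Y-%m-%dT%H:%M:%S%z"
    return datetime.strptime(normalized, fmt)


print(parse_rfc3339_strict("2022-01-01T12:12:12.123Z"))
print(parse_rfc3339_strict("2022-01-01 12:12:12+05:30"))
```

Because the regex already tells us whether a fraction is present, the try/except fallback is no longer needed.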
Doble answered 7/6, 2015 at 17:53 Comment(0)
N
85

Try the iso8601 module; it does exactly this.

There are several other options mentioned on the WorkingWithTime page on the python.org wiki.

Napoleon answered 24/9, 2008 at 15:38 Comment(6)
Simple as iso8601.parse_date("2008-09-03T20:56:35.450686Z")Olympie
The question wasn't "how do I parse ISO 8601 dates", it was "how do I parse this exact date format."Napoleon
@tiktak The OP asked "I need to parse strings like X" and my reply to that, having tried both libraries, is to use another one, because iso8601 has important issues still open. My involvement or lack thereof in such a project is completely unrelated to the answer.Imperium
iso8601, a.k.a. pyiso8601, has been updated as recently as Feb 2014. The latest version supports a much broader set of ISO 8601 strings. I've been using to good effect in some of my projects.Cashandcarry
Sadly that lib called "iso8601" on pypi is trivially incomplete. It clearly states it doesn't handle dates based on week numbers just to pick one example.Cacka
@Tobia: iso8601 seems to be getting updates again.Livelihood
V
75

Python >= 3.11

fromisoformat now parses Z directly:

from datetime import datetime

s = "2008-09-03T20:56:35.450686Z"

datetime.fromisoformat(s)
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=datetime.timezone.utc)

Python 3.7 to 3.10

A simple option from one of the comments: replace 'Z' with '+00:00' - and use fromisoformat:

from datetime import datetime

s = "2008-09-03T20:56:35.450686Z"

datetime.fromisoformat(s.replace('Z', '+00:00'))
# datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=datetime.timezone.utc)

Why prefer fromisoformat?

Although strptime's %z can parse the 'Z' character to UTC, fromisoformat is roughly 40 times faster (or even about 60 times faster on Python 3.11):

from datetime import datetime
from dateutil import parser

s = "2008-09-03T20:56:35.450686Z"

# Python 3.11+
%timeit datetime.fromisoformat(s)
85.1 ns ± 0.473 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

# Python 3.7 to 3.10
%timeit datetime.fromisoformat(s.replace('Z', '+00:00'))
134 ns ± 0.522 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

%timeit parser.isoparse(s)
4.09 µs ± 5.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

%timeit datetime.strptime(s, '%Y-%m-%dT%H:%M:%S.%f%z')
5 µs ± 9.26 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

%timeit parser.parse(s)
28.5 µs ± 99.2 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

(Python 3.11.3 x64 on GNU/Linux)
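
The %timeit lines above are IPython magics. Outside IPython, a rough equivalent using the standard timeit module might look like the sketch below. The helper names and the loop count are arbitrary choices of mine, and the absolute numbers will differ from machine to machine, so none are shown here.

```python
import timeit
from datetime import datetime

s = "2008-09-03T20:56:35.450686Z"


def with_fromisoformat():
    # The replace() keeps this working on Python 3.7 to 3.10 as well.
    return datetime.fromisoformat(s.replace("Z", "+00:00"))


def with_strptime():
    return datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%f%z")


# Both approaches should agree on the instant before we compare their speed.
assert with_fromisoformat() == with_strptime()

number = 10_000
t_iso = timeit.timeit(with_fromisoformat, number=number)
t_strp = timeit.timeit(with_strptime, number=number)
print(f"fromisoformat: {t_iso / number * 1e9:.0f} ns per call")
print(f"strptime:      {t_strp / number * 1e9:.0f} ns per call")
```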

See also: A faster strptime

Venosity answered 7/7, 2020 at 6:34 Comment(2)
@mikerodent: the point is that fromisoformat parses +00:00 but not Z to aware datetime with tzinfo being UTC. If your input e.g. ends with Z+00:00, you can just remove the Z before feeding it into fromisoformat. Other UTC offsets like e.g. +05:30 will then be parsed to a static UTC offset (not an actual time zone).Venosity
Totally different point now (I understand a bit more about "awareness"). But I've noted in the docs "Changed in version 3.11: Previously, this method only supported formats that could be emitted by date.isoformat() or datetime.isoformat()." and "corresponding to a date_string in any valid ISO 8601 format". That might actually not be what one wants. The fromisoformat of datetime.date is more explicit: "Return a date corresponding to a date_string given in any valid ISO 8601 format... " ... and it gives some surprising examples of strings which work which are not simple YYYY-MM-DD.Nordine
P
49

Starting from Python 3.7, strptime supports colon delimiters in UTC offsets (source). So you can then use:

import datetime

def parse_date_string(date_string: str) -> datetime.datetime:
    try:
        return datetime.datetime.strptime(date_string, '%Y-%m-%dT%H:%M:%S.%f%z')
    except ValueError:
        # No fractional seconds in the input, so retry without %f
        return datetime.datetime.strptime(date_string, '%Y-%m-%dT%H:%M:%S%z')

EDIT:

As pointed out by Martijn, if you created the datetime object using isoformat(), you can simply use datetime.fromisoformat().

EDIT 2:

As pointed out by Mark Amery, I added a try..except block to account for missing fractional seconds.

Pedalfer answered 31/1, 2018 at 9:52 Comment(8)
But in 3.7, you also have datetime.fromisoformat() which handles strings like your input automatically: datetime.datetime.fromisoformat('2018-01-31T09:24:31.488670+00:00').Reliquary
Good point. I agree, I recommend to use datetime.fromisoformat() and datetime.isoformat()Pedalfer
This is the only answer that actually meets the question criteria. If you have to use strptime this is the correct answerAdur
Your example fails on Python 3.6 with: ValueError: time data '2018-01-31T09:24:31.488670+00:00' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'. That's due to %z not matching +00:00. However, +0000 matches %z; see the Python docs: docs.python.org/3.6/library/…Diorite
@Diorite Yes, this answer only works in Python 3.7 or newer.Edessa
Alas, neither your strptime incantation nor fromisoformat() as @MartijnPieters suggests are sufficient to parse even all valid RFC 3339 datetimes (let alone ISO 8601, of course). Your strptime incantation chokes if given input without a fractional number of seconds (e.g. '2018-01-31T09:24:31+00:00', while fromisoformat can't handle a timezone offset of Z (like used in the example in the question, or output from JavaScript's Date.toISOString() method).Doble
@MarkAmery: You're right about the strptime limitation. I adjusted the answer for your edge case, thanks. Regarding fromisoformat: I would only use it if you created the date string with isoformat() also (as stated in the answer).Pedalfer
@MarkAmery: Python 3.11 has further improved fromisoformat() and it can handle the Z timezone now: datetime.fromisoformat('2018-01-31T09:24:31Z') produces datetime.datetime(2018, 1, 31, 9, 24, 31, tzinfo=datetime.timezone.utc).Reliquary
A
39

What is the exact error you get? Is it like the following?

>>> datetime.datetime.strptime("2008-08-12T12:20:30.656234Z", "%Y-%m-%dT%H:%M:%S.Z")
ValueError: time data did not match format:  data=2008-08-12T12:20:30.656234Z  fmt=%Y-%m-%dT%H:%M:%S.Z

If yes, you can split your input string on ".", and then add the microseconds to the datetime you got.

Try this:

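>>> def gt(dt_str):
        dt, _, us = dt_str.partition(".")
        dt = datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
        # Pad the fraction to 6 digits so that ".52" means 520000 microseconds, not 52
        us = int(us.rstrip("Z").ljust(6, "0")[:6], 10)
        return dt + datetime.timedelta(microseconds=us)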

>>> gt("2008-08-12T12:20:30.656234Z")
datetime.datetime(2008, 8, 12, 12, 20, 30, 656234)
Anjanette answered 24/9, 2008 at 15:19 Comment(7)
You can't just strip .Z because it means timezone and can be different. I need to convert date to the UTC timezone.Dynamite
A plain datetime object has no concept of timezone. If all your times are ending in "Z", all the datetimes you get are UTC (Zulu time).Anjanette
If the timezone is anything other than "" or "Z", then it must be an offset in hours/minutes, which can be directly added to/subtracted from the datetime object. You could create a tzinfo subclass to handle it, but that's probably not recommended.Sardis
Additionally, "%f" is the microsecond specifier, so a (timezone-naive) strptime string looks like: "%Y-%m-%dT%H:%M:%S.%f" .Astrophotography
This will raise an exception if the given datetime string has a UTC offset other than "Z". It does not support the entire RFC 3339 format and is an inferior answer to others that handle UTC offsets properly.Doble
Why not use %f? I don't get it. I just saw this post because it was used as a duplicate target for #69953576, but this seems complicated compared to just using "%Y-%m-%dT%H:%M:%S.%fZ".Adaurd
Python 3.11 has a much improved datetime.fromisoformat which will handle most iso8601 and rfc3339 formats. docs.python.org/3.11/library/…Lo
M
28
import re
import datetime
s = "2008-09-03T20:56:35.450686Z"
d = datetime.datetime(*map(int, re.split(r'[^\d]', s)[:-1]))
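
Since the one-liner above comes without explanation: it splits the string on every non-digit character, drops the empty piece left by the trailing Z, and passes the remaining numbers positionally to the datetime constructor. Spelled out, with the two issues raised in the comments below addressed (the lost UTC marker and fractions that are not exactly six digits), it could look something like this. The name parse_utc_digits is made up, and the function deliberately only accepts strings ending in Z.

```python
import re
from datetime import datetime, timezone


def parse_utc_digits(s: str) -> datetime:
    # This trick ignores offsets entirely, so refuse anything that is not "Z".
    if not s.endswith("Z"):
        raise ValueError(f"expected a timestamp ending in 'Z': {s!r}")
    # Year, month, day, hour, minute, second and (optionally) the fraction.
    fields = re.findall(r"\d+", s)
    if len(fields) == 7:
        # The 7th constructor argument is microseconds, so ".52" must become
        # 520000 rather than 52. Extra digits beyond 6 are truncated.
        fields[6] = fields[6][:6].ljust(6, "0")
    return datetime(*map(int, fields), tzinfo=timezone.utc)


print(parse_utc_digits("2008-09-03T20:56:35.450686Z"))
print(parse_utc_digits("1985-04-12T23:20:50.52Z"))
```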
Myrmecology answered 24/9, 2008 at 15:27 Comment(10)
I disagree, this is practically unreadable and as far as I can tell does not take into account the Zulu (Z) which makes this datetime naive even though time zone data was provided.Fidge
I find it quite readable. In fact, it's probably the easiest and most performing way to do the conversion without installing additional packages.Imperium
This is equivalent of d=datetime.datetime(*map(int, re.split('\D', s)[:-1])) i suppose.Steno
def from_utc(date_str): """ Convert UTC time data string to time.struct_time """ UTC_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ" return time.strptime(date_str, UTC_FORMAT)Truck
a variation: datetime.datetime(*map(int, re.findall(r'\d+', s)))Surprint
This results in a naive datetime object without timezone, right? So the UTC bit gets lost in translation?Sin
@w00t: aware_d = d.replace(tzinfo=timezone.utc)Surprint
This has the benefit of working with incomplete iso strings including dates and second-less datetimesDiorite
Not all formats of RFC3339 work with this code sample, only if the second fraction part has 6 digits! So the first example on page 9 section 5.8 of RFC 3339 version July 2002 would not work: 1985-04-12T23:20:50.52Z --> false: 1985-04-12T23:20:50.**0000**52 I mention this, because the question seems related to RFC3339 and only provides an 6 digit second fraction number as a 'like' example not telling that all date times contain always 6 digits, or always trailing zeros in the second fraction part (...59.999000Z or ...59.999Z ?).Beauteous
@Beauteous You saved my sanity! I was trying all the combinations in this thread and you had the magic answer: You can only have 6 decimals after the decimal point!!! The timestamps I was working with had 9 decimals! Who would think that should make a difference between conversion and invalid format? I ended up using fromisoformat(timestamp[:-4]) to keep it simple, and that worked fine!Interrelated
W
22

These days, Arrow can also be used as a third-party solution:

>>> import arrow
>>> date = arrow.get("2008-09-03T20:56:35.450686Z")
>>> date.datetime
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())
Waldner answered 15/2, 2015 at 16:47 Comment(1)
Just use python-dateutil - arrow requires python-dateutil.Nobie
A
21

Just use the python-dateutil module:

>>> import dateutil.parser as dp
>>> t = '1984-06-02T19:05:00.000Z'
>>> parsed_t = dp.parse(t)
>>> parsed_t
datetime.datetime(1984, 6, 2, 19, 5, tzinfo=tzutc())

Documentation

Anglophobe answered 28/2, 2017 at 18:14 Comment(1)
dateutil.parser.parse will accept formats that are definitely not ISO 8601, like "Sat Oct 11 17:13:46 UTC 2003". If you specifically want ISO 8601 parsing, you would probably rather use dateutil.parser.isoparse instead, as Flimms's answer recommends.Doble
W
16

I have found ciso8601 to be the fastest way to parse ISO 8601 timestamps.

It also has full support for RFC 3339, and a dedicated function for strictly parsing RFC 3339 timestamps.

Example usage:

>>> import ciso8601
>>> ciso8601.parse_datetime('2014-01-09T21')
datetime.datetime(2014, 1, 9, 21, 0)
>>> ciso8601.parse_datetime('2014-01-09T21:48:00.921000+05:30')
datetime.datetime(2014, 1, 9, 21, 48, 0, 921000, tzinfo=datetime.timezone(datetime.timedelta(seconds=19800)))
>>> ciso8601.parse_rfc3339('2014-01-09T21:48:00.921000+05:30')
datetime.datetime(2014, 1, 9, 21, 48, 0, 921000, tzinfo=datetime.timezone(datetime.timedelta(seconds=19800)))

The GitHub Repo README shows their speedup versus all of the other libraries listed in the other answers.

My personal project involved a lot of ISO 8601 parsing. It was nice to be able to just switch the call and go faster. :)

Edit: I have since become a maintainer of ciso8601. It's now faster than ever!

Winze answered 27/3, 2017 at 18:41 Comment(4)
This looks like a great library! For those wanting to optimize ISO8601 parsing on Google App Engine, sadly, we can't use it since it's a C library, but your benchmarks were insightful to show that native datetime.strptime() is the next fastest solution. Thanks for putting all that info together!Krissy
@hamx0r, be aware that datetime.strptime() is not a full ISO 8601 parsing library. If you are on Python 3.7, you can use the datetime.fromisoformat() method, which is a little more flexible. You might be interested in this more complete list of parsers which should be merged into the ciso8601 README soon.Winze
ciso8601 works quite nicely, but one has to first do "pip install pytz", because one cannot parse a timestamp with time zone information without the pytz dependency. An example would look like: dob = ciso8601.parse_datetime(result['dob']['date'])Deuce
@Dirk, only in Python 2. But even that should be removed in the next release.Winze
M
13

If you are working with Django, it provides the dateparse module that accepts a bunch of formats similar to ISO format, including the time zone.

If you are not using Django and you don't want to use one of the other libraries mentioned here, you could probably adapt the Django source code for dateparse to your project.

Mainspring answered 30/9, 2015 at 21:42 Comment(1)
Django's DateTimeField uses this when you set a string value.Zoba
T
12

If you don't want to use dateutil, you can try this function:

import datetime

def from_utc(utcTime, fmt="%Y-%m-%dT%H:%M:%S.%fZ"):
    """
    Convert a UTC time string ending in "Z" to a naive datetime.datetime
    """
    # The literal "Z" in fmt is matched but discarded, so the result has no tzinfo
    return datetime.datetime.strptime(utcTime, fmt)

Test:

from_utc("2007-03-04T21:08:12.123Z")

Result:

datetime.datetime(2007, 3, 4, 21, 8, 12, 123000)
Truck answered 27/3, 2014 at 22:50 Comment(3)
This answer relies upon hard-coding a particular UTC offset (namely "Z", which means +00:00) into the format string passed to strptime. This is a bad idea because it will fail to parse any datetime with a different UTC offset and raise an exception. See my answer that describes how parsing RFC 3339 with strptime is in fact impossible.Doble
It's hard-coded, but it's sufficient for the case when you need to parse Zulu only.Veery
@alexander yes - which may be the case if, for instance, you know that your date string was generated with JavaScript's toISOString method. But there's no mention of the limitation to Zulu time dates in this answer, nor did the question indicate that that's all that's needed, and just using dateutil is usually equally convenient and less narrow in what it can parse.Doble
C
9

I've coded up a parser for the ISO 8601 standard and put it on GitHub: https://github.com/boxed/iso8601. This implementation supports everything in the specification except for durations, intervals, periodic intervals, and dates outside the supported date range of Python's datetime module.

Tests are included! :P

Cacka answered 2/3, 2013 at 13:31 Comment(1)
Generally, links to a tool or library should be accompanied by usage notes, a specific explanation of how the linked resource is applicable to the problem, or some sample code, or if possible all of the above.Accommodative
E
8

This works for stdlib on Python 3.2 onwards (assuming all the timestamps are UTC):

from datetime import datetime, timezone, timedelta
datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
    tzinfo=timezone(timedelta(0)))

For example,

>>> datetime.utcnow().replace(tzinfo=timezone(timedelta(0)))
datetime.datetime(2015, 3, 11, 6, 2, 47, 879129, tzinfo=datetime.timezone.utc)
Evaporite answered 11/3, 2015 at 6:3 Comment(5)
This answer relies upon hard-coding a particular UTC offset (namely "Z", which means +00:00) into the format string passed to strptime. This is a bad idea because it will fail to parse any datetime with a different UTC offset and raise an exception. See my answer that describes how parsing RFC 3339 with strptime is in fact impossible.Doble
In theory, yes, this fails. In practice, I've never encountered an ISO 8601-formatted date that wasn't in Zulu time. For my very-occasional need, this works great and isn't reliant on some external library.Evaporite
you could use timezone.utc instead of timezone(timedelta(0)). Also, the code works in Python 2.6+ (at least) if you supply utc tzinfo objectSurprint
Doesn't matter if you've encountered it, it doesn't match the spec.Oshiro
You can use the %Z for timezone in the most recent versions of Python.Brietta
B
8

I'm the author of iso8601utils. It can be found on GitHub or on PyPI. Here's how you can parse your example:

>>> from iso8601utils import parsers
>>> parsers.datetime('2008-09-03T20:56:35.450686Z')
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
Berglund answered 26/10, 2016 at 5:20 Comment(0)
C
7

One straightforward way to convert an ISO 8601-like date string to a UNIX timestamp or datetime.datetime object in all supported Python versions without installing third-party modules is to use the date parser of SQLite.

#!/usr/bin/env python
from __future__ import with_statement, division, print_function
import sqlite3
import datetime

testtimes = [
    "2016-08-25T16:01:26.123456Z",
    "2016-08-25T16:01:29",
]
db = sqlite3.connect(":memory:")
c = db.cursor()
for timestring in testtimes:
    c.execute("SELECT strftime('%s', ?)", (timestring,))
    converted = c.fetchone()[0]
    print("%s is %s after epoch" % (timestring, converted))
    dt = datetime.datetime.fromtimestamp(int(converted))
    print("datetime is %s" % dt)

Output:

2016-08-25T16:01:26.123456Z is 1472140886 after epoch
datetime is 2016-08-25 12:01:26
2016-08-25T16:01:29 is 1472140889 after epoch
datetime is 2016-08-25 12:01:29
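
Note that the output above shows 12:01 rather than 16:01 because fromtimestamp without a tz argument converts to the machine's local time zone. If you want an aware UTC datetime instead, something like the following sketch should do it. The function name sqlite_parse_utc is mine, and fractional seconds are still lost because strftime('%s', ...) yields whole seconds.

```python
import sqlite3
from datetime import datetime, timezone


def sqlite_parse_utc(timestring: str) -> datetime:
    db = sqlite3.connect(":memory:")
    try:
        row = db.execute("SELECT strftime('%s', ?)", (timestring,)).fetchone()
    finally:
        db.close()
    # SQLite returns NULL (None here) for strings it cannot interpret.
    if row[0] is None:
        raise ValueError(f"SQLite could not parse {timestring!r}")
    # Passing tz avoids the implicit conversion to local time.
    return datetime.fromtimestamp(int(row[0]), tz=timezone.utc)


print(sqlite_parse_utc("2016-08-25T16:01:26.123456Z"))
print(sqlite_parse_utc("2016-08-25T16:01:29"))
```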
Cumin answered 25/8, 2016 at 16:16 Comment(4)
Thanks. This is disgusting. I love it.Damiendamietta
What an incredible, awesome, beautiful hack! Thanks!Representational
Welcome to the Bad and the Ugly section.Roguery
Note that SQLite's date & time parsing is both more permissive than RFC 3339 and not permissive enough to handle all of ISO 8601, so it's not a perfect approach to parsing either format. Also, this is a hideous hack. But I suppose the fact that it avoids the need to install third party libraries is a virtue of sorts!Doble
G
7

Django's parse_datetime() function supports dates with UTC offsets:

parse_datetime('2016-08-09T15:12:03.65478Z') =
datetime.datetime(2016, 8, 9, 15, 12, 3, 654780, tzinfo=<UTC>)

So it could be used for parsing ISO 8601 dates in fields across an entire project:

from django.utils import formats
from django.forms.fields import DateTimeField
from django.utils.dateparse import parse_datetime

class DateTimeFieldFixed(DateTimeField):
    def strptime(self, value, format):
        if format == 'iso-8601':
            return parse_datetime(value)
        return super().strptime(value, format)

DateTimeField.strptime = DateTimeFieldFixed.strptime
formats.ISO_INPUT_FORMATS['DATETIME_INPUT_FORMATS'].insert(0, 'iso-8601')
Girish answered 8/9, 2016 at 9:42 Comment(0)
U
7

Another way is to use a specialized ISO 8601 parser: the isoparse function from dateutil's parser module:

from dateutil import parser

date = parser.isoparse("2008-09-03T20:56:35.450686+01:00")
print(date)

Output:

2008-09-03 20:56:35.450686+01:00

This function is also mentioned in the documentation for the standard Python function datetime.fromisoformat:

A more full-featured ISO 8601 parser, dateutil.parser.isoparse is available in the third-party package dateutil.

Uniplanar answered 24/9, 2019 at 12:32 Comment(0)
E
5

If you are using pandas anyway, I can recommend its Timestamp class, which parses such strings directly:

import pandas as pd

ts_1 = pd.Timestamp('2020-02-18T04:27:58.000Z')
ts_2 = pd.Timestamp('2020-02-18T04:27:58.000')

Rant: It is just unbelievable that we still need to worry about things like date string parsing in 2021.

Equiprobable answered 28/7, 2021 at 14:22 Comment(4)
pandas is strongly discouraged for this simple case: It depends on pytz, which violates the python standard, and pd.Timestamp is subtly not a compatible datetime object.Roguery
Thanks for your comment. Do you have some pointers for me? I was not able to find pytz: github.com/pandas-dev/pandas/blob/… and I’m not sure what Python standard and its violation you are referring to.Equiprobable
See the rant by Paul Ganssle. As for incompatibility, execute both datetime.fromisoformat('2021-01-01T00:00:00+01:00').tzinfo.utc and pandas.Timestamp('2021-01-01T00:00:00+01:00').tzinfo.utc : Not the same at all.Roguery
Thank you for pointers to this ongoing work. I didn’t know about that issue, but I really hope they fix it soon! But again: I can’t believe that time parsing is still an issue. :-)Equiprobable
O
3

ISO 8601 allows many variations in which the colons and dashes are optional; the general shape is CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]. If you want to use strptime, you need to strip out those variations first.

The goal is to generate a UTC datetime object.


If you just want a basic case that works for UTC with the Z suffix, like 2016-06-29T19:36:29.3453Z:
# Python 3; on Python 2 use timestamp.translate(None, ':-') instead
datetime.datetime.strptime(timestamp.translate(str.maketrans('', '', ':-')), "%Y%m%dT%H%M%S.%fZ")


If you want to handle timezone offsets like 2016-06-29T19:36:29.3453-0400 or 2008-09-03T20:56:35.450686+05:00, use the following. It converts all variations into a form without variable delimiters, such as 20080903T205635.450686+0500, which is more consistent and easier to parse.
import re
# this regex removes all colons and all 
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z" )


If your system does not support the %z strptime directive (you see something like ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z'), then you need to apply the offset manually to get UTC. Note that %z may not work in Python versions < 3, where it depended on C library support, which varies across systems and Python builds (e.g. Jython, Cython, etc.).
import re
import datetime

# this regex removes all colons and all 
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)

# split on the offset to remove it. use a capture group to keep the delimiter
split_timestamp = re.split(r"([+-])", conformed_timestamp)
# drop a trailing Z so the same format string works with and without an offset
main_timestamp = split_timestamp[0].rstrip("Z")
if len(split_timestamp) == 3:
    sign = split_timestamp[1]
    offset = split_timestamp[2]
else:
    sign = None
    offset = None

# parse the local time first, without the offset
output_datetime = datetime.datetime.strptime(main_timestamp, "%Y%m%dT%H%M%S.%f")
if offset:
    # create timedelta based on offset
    offset_delta = datetime.timedelta(hours=int(sign + offset[:-2]), minutes=int(sign + offset[-2:]))
    # UTC = local time - offset, so subtract the offset to get UTC
    output_datetime = output_datetime - offset_delta
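
On Python 3.7 and later, where strptime's %z directive also accepts a colon-separated offset and a literal Z, the same goal can be sketched without any regex. This is only a minimal sketch and not part of the code above: parse_to_utc is a hypothetical helper name, and it assumes the extended format (with dashes and colons) and an explicit offset or Z.

```python
from datetime import datetime, timezone


def parse_to_utc(timestamp):
    """Hypothetical helper: parse an extended-format timestamp and convert it to UTC."""
    # Fractional seconds are optional, so choose the matching format.
    if "." in timestamp:
        fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    else:
        fmt = "%Y-%m-%dT%H:%M:%S%z"
    # Since Python 3.7, %z accepts "Z", "+0500" and "+05:00".
    return datetime.strptime(timestamp, fmt).astimezone(timezone.utc)


print(parse_to_utc("2008-09-03T20:56:35.450686+05:00"))
print(parse_to_utc("2016-06-29T19:36:29Z"))
```

The result is an aware datetime in UTC; call .replace(tzinfo=None) on it if a naive UTC value is needed.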
Oshiro answered 28/6, 2016 at 19:54 Comment(1)
This is broken; some quick experimentation shows it raises an exception if timestamp is '2016-06-29T19:36:29.123Z' or '2016-06-29T19:36:29+00:00', both of which are valid RFC 3339 and ISO 8601 datetimes.Doble
C
3

Nowadays there's Maya: Datetimes for Humans™, from the author of the popular Requests: HTTP for Humans™ package:

>>> import maya
>>> str = '2008-09-03T20:56:35.450686Z'
>>> maya.MayaDT.from_rfc3339(str).datetime()
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=<UTC>)
Calhoun answered 24/9, 2018 at 18:21 Comment(0)
G
3

datetime.fromisoformat() is improved in Python 3.11 to parse most ISO 8601 formats

datetime.fromisoformat() can now be used to parse most ISO 8601 formats, barring only those that support fractional hours and minutes. Previously, this method only supported formats that could be emitted by datetime.isoformat().

>>> from datetime import datetime
>>> datetime.fromisoformat('2011-11-04T00:05:23Z')
datetime.datetime(2011, 11, 4, 0, 5, 23, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('20111104T000523')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-W01-2T00:05:23.283')
datetime.datetime(2011, 1, 4, 0, 5, 23, 283000)
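
For interpreters older than 3.11, a commonly used workaround for the Z suffix is to rewrite it as +00:00 before calling fromisoformat. A minimal sketch, assuming Python 3.7+ and input that otherwise matches what datetime.isoformat() emits; parse_rfc3339_compat is a hypothetical name, not a standard library function:

```python
from datetime import datetime, timezone


def parse_rfc3339_compat(s):
    """Hypothetical helper: accept a trailing Z on Python versions before 3.11."""
    # Before 3.11, fromisoformat rejects "Z", so spell it as a numeric offset.
    if s.endswith(("Z", "z")):
        s = s[:-1] + "+00:00"
    return datetime.fromisoformat(s)


dt = parse_rfc3339_compat("2008-09-03T20:56:35.450686Z")
print(dt.tzinfo == timezone.utc)
```

Other variations, such as the basic format without separators, still need 3.11 or a third-party parser.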
Giffin answered 9/11, 2022 at 4:43 Comment(1)
An excellent improvement to the previous situation. This should be most people's first choice.Foxtail
H
2

python-dateutil will raise an exception when parsing an invalid date string, so you may want to catch it.

from dateutil import parser

ds = '2012-60-31'
try:
    dt = parser.parse(ds)
except ValueError:
    print('"%s" is an invalid date' % ds)
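
If swallowing the error is a concern, one option is to re-raise it with the offending input attached instead of only printing. A minimal sketch using only the standard library: datetime.fromisoformat stands in for dateutil here so that the example runs without third-party packages, and parse_or_raise is a hypothetical helper name.

```python
from datetime import datetime


def parse_or_raise(ds):
    """Hypothetical helper: parse ds, or raise a ValueError that names the bad input."""
    try:
        return datetime.fromisoformat(ds)
    except ValueError as e:
        # Chain the original exception so the root cause stays visible.
        raise ValueError('"%s" is an invalid date' % ds) from e


print(parse_or_raise("2012-12-31"))
```

The caller can then decide whether to log, retry, or abort, rather than continuing with an undefined dt.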
Hibbert answered 9/8, 2013 at 15:53 Comment(2)
I think it will throw an exception sometimes, it is not guaranteed to throw an exception if it can make a best effort at guessing what the datetime is.Edessa
Error hiding is in top three of anti-patterns: Don't.Roguery
S
1

For something that works with the 2.X standard library try:

calendar.timegm(time.strptime(date.split(".")[0]+"UTC", "%Y-%m-%dT%H:%M:%S%Z"))

calendar.timegm is the UTC counterpart of time.mktime: it converts a struct_time expressed in UTC, rather than in local time, to seconds since the epoch.
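
The difference between the two functions can be checked with a short Python 3 sketch. It assumes the parsed fields are meant as UTC, and it cross-checks the result against an aware datetime built from the same fields:

```python
import calendar
import time
from datetime import datetime, timezone

# Parse the wall-clock fields; this sketch assumes they are in UTC.
st = time.strptime("2008-09-03T20:56:35", "%Y-%m-%dT%H:%M:%S")

# timegm reads the struct_time as UTC; time.mktime would read it as local time.
seconds = calendar.timegm(st)

# Cross-check against an aware datetime built from the same fields.
expected = datetime(2008, 9, 3, 20, 56, 35, tzinfo=timezone.utc).timestamp()
print(seconds == expected)
```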

Singapore answered 21/7, 2011 at 6:47 Comment(2)
This just ignores the timezone '2013-01-28T14:01:01.335612-08:00' --> parsed as UTC, not PDTMadi
Besides ignoring the timezone as @Madi notes, this also raises an exception if you give it input which has a timezone but not a fractional number of seconds (and thus no . character), like 2022-10-09T15:49:22-07:00. Such a value is a valid RFC 3339 and ISO 8601 date time string, so a parser shouldn't choke on it.Doble
A
1

Thanks to Mark Amery's great answer, I devised a function that handles ISO datetime strings both with and without a UTC offset:

import re
from datetime import datetime, timedelta, tzinfo

class FixedOffset(tzinfo):
    """Fixed offset in minutes: `time = utc_time + utc_offset`."""
    def __init__(self, offset):
        self.__offset = timedelta(minutes=offset)
        hours, minutes = divmod(offset, 60)
        #NOTE: the last part is to remind about deprecated POSIX GMT+h timezones
        #  that have the opposite sign in the name;
        #  the corresponding numeric value is not used e.g., no minutes
        self.__name = '<%+03d%02d>%+d' % (hours, minutes, -hours)
    def utcoffset(self, dt=None):
        return self.__offset
    def tzname(self, dt=None):
        return self.__name
    def dst(self, dt=None):
        return timedelta(0)
    def __repr__(self):
        return 'FixedOffset(%d)' % (self.utcoffset().total_seconds() / 60)
    def __getinitargs__(self):
        return (self.__offset.total_seconds()/60,)

def parse_isoformat_datetime(isodatetime):
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S.%f')
    except ValueError:
        pass
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S')
    except ValueError:
        pass
    pat = r'(.*?[+-]\d{2}):(\d{2})'
    temp = re.sub(pat, r'\1\2', isodatetime)
    naive_date_str = temp[:-5]
    offset_str = temp[-5:]
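    # fractional seconds are optional, so pick the format that matches
    fmt = '%Y-%m-%dT%H:%M:%S.%f' if '.' in naive_date_str else '%Y-%m-%dT%H:%M:%S'
    naive_dt = datetime.strptime(naive_date_str, fmt)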
    offset = int(offset_str[-4:-2])*60 + int(offset_str[-2:])
    if offset_str[0] == "-":
        offset = -offset
    return naive_dt.replace(tzinfo=FixedOffset(offset))
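
On Python 3, the standard library's datetime.timezone can stand in for a hand-written class like FixedOffset. A minimal sketch under that assumption; attach_offset is a hypothetical helper, and the offset is given in minutes, as in FixedOffset:

```python
from datetime import datetime, timedelta, timezone


def attach_offset(naive_dt, offset_minutes):
    """Hypothetical helper: attach a fixed UTC offset (in minutes) to a naive datetime."""
    return naive_dt.replace(tzinfo=timezone(timedelta(minutes=offset_minutes)))


dt = attach_offset(datetime(2013, 1, 28, 14, 1, 1, 335612), -480)
print(dt.isoformat())  # 2013-01-28T14:01:01.335612-08:00
```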
Aidaaidan answered 14/3, 2016 at 15:5 Comment(0)
S
-1

Initially I tried with:

from operator import neg, pos
from time import strptime, mktime
from datetime import datetime, tzinfo, timedelta

class MyUTCOffsetTimezone(tzinfo):
    @staticmethod
    def with_offset(offset_no_signal, signal):  # type: (str, str) -> MyUTCOffsetTimezone
        return MyUTCOffsetTimezone((pos if signal == '+' else neg)(
            (datetime.strptime(offset_no_signal, '%H:%M') - datetime(1900, 1, 1))
          .total_seconds()))

    def __init__(self, offset, name=None):
        self.offset = timedelta(seconds=offset)
        self.name = name or self.__class__.__name__

    def utcoffset(self, dt):
        return self.offset

    def tzname(self, dt):
        return self.name

    def dst(self, dt):
        return timedelta(0)


def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        dt, sign, offset = strptime(dt[:-6], fmt), dt[-6], dt[-5:]
        return datetime.fromtimestamp(mktime(dt),
                                      tz=MyUTCOffsetTimezone.with_offset(offset, sign))
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)

But that didn't work on negative timezones. This however I got working fine, in Python 3.7.3:

from datetime import datetime


def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        return datetime.strptime(dt, fmt + '%z')
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)

Some tests; note that the output differs from the input only in the precision of the fractional seconds. I got 6 digits of precision on my machine, but YMMV:

for dt_in, dt_out in (
        ('2019-03-11T08:00:00.000Z', '2019-03-11T08:00:00'),
        ('2019-03-11T08:00:00.000+11:00', '2019-03-11T08:00:00+11:00'),
        ('2019-03-11T08:00:00.000-11:00', '2019-03-11T08:00:00-11:00')
    ):
    isoformat = to_datetime_tz(dt_in).isoformat()
    assert isoformat == dt_out, '{} != {}'.format(isoformat, dt_out)
Sam answered 15/5, 2019 at 1:32 Comment(5)
May I ask why did you do frozenset(('+', '-'))? Shouldn't a normal tuple like ('+', '-') be able to accomplish the same thing?Trod
Sure, but isn't that a linear scan rather than a perfectly hashed lookup?Sam
At least a couple of bugs in your to_datetime_tz function: 1. datetime strings without a decimal point in the seconds (like 2019-03-11T08:00:00+11:00) trigger exceptions despite being valid ISO 8601 and RFC 3339 datetimes, and 2. timezone offset Z is treated differently from +00:00 even though they are supposed to mean the same thing.Doble
As for @PrahladYeri's point about the frozenset, Prahlad is quite right. There's no way the frozenset lookup is gonna be faster with only two items, especially when you're actually having to construct and iterate over an equivalent 2-item tuple anyway as part of the construction of the frozenset. And even if it were faster, the cost of doing a lookup in a 2-item collection is never gonna matter.Doble
I suppose you can do length checks on the input string to determine what's in it. You're welcome to edit this answer from > 3 years ago.Sam
