Python dateutil parser fails
S

2

3

I am attempting to parse the following date strings obtained from email headers:

from dateutil import parser
d1 = parser.parse('Tue, 28 Jun 2011 01:46:52 +0200')
d2 = parser.parse('Mon, 11 Jul 2011 10:01:56 +0200 (CEST)')
d3 = parser.parse('Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)')

The third one fails; am I missing something obvious?

Shumate answered 18/7, 2011 at 7:39 Comment(3)
Have you tried parser.parse('...', fuzzy=True)? - Maus
phimuemue, add that as an answer and I will accept it! - Shumate
eryksun, that is a good suggestion. - Shumate
M
4

Have you tried parser.parse('...', fuzzy=True)? (I suppose it works :))

Maus answered 18/7, 2011 at 19:3 Comment(1)
Yes, it works. The problem is the extra "+00:00" after "GMT", as pointed out below. The "fuzzy" option ignores this. - Shumate
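If the trailing parenthesised comment is the only thing tripping the parser, one alternative to fuzzy=True is to strip it before parsing. The following is a minimal sketch, not part of dateutil: the helper name strip_trailing_comment and the regular expression are assumptions made for illustration, and it only handles a single, non-nested comment at the end of the string.

```python
import re

# Matches one trailing parenthesised comment such as " (CEST)" or " (GMT+00:00)".
# Assumption: the comment is not nested and sits at the very end of the value.
_TRAILING_COMMENT = re.compile(r"\s*\([^()]*\)\s*$")


def strip_trailing_comment(date_header: str) -> str:
    """Return the header value without a trailing "(...)" comment (hypothetical helper)."""
    return _TRAILING_COMMENT.sub("", date_header)


cleaned = strip_trailing_comment("Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)")
print(cleaned)  # Wed, 13 Jul 2011 02:00:01 +0000
```

The cleaned string has the same shape as the first date in the question, so it can be passed to parser.parse in the same way.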
J
2

Give the parsedatetime library a try.

In [16]: import parsedatetime.parsedatetime as pdt

In [17]: p = pdt.Calendar()

In [18]: p.parse("Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)")
Out[18]: ((2011, 7, 20, 0, 0, 0, 2, 201, -1), 3)
Jaleesa answered 18/7, 2011 at 9:14 Comment(3)
But is it correct? I have difficulty interpreting the tuple. Where is the "13", for example? - Shumate
It seems that this parser is confused and thinks the "Wed" refers to tomorrow July 20, which is the closest Wednesday. - Shumate
Looks like parsedatetime always takes future dates. It has a comment in the source code: "# if that day and month have already passed in this year, then increment the year by 1" - Engram
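Since the strings come from email headers, another option that may be worth trying is the standard library's email.utils.parsedate_to_datetime (Python 3.3+), which is written for this header format. The sketch below assumes the three strings from the question are representative; other malformed headers may still need separate handling.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# The three Date header values from the question.
headers = [
    "Tue, 28 Jun 2011 01:46:52 +0200",
    "Mon, 11 Jul 2011 10:01:56 +0200 (CEST)",
    "Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)",
]

# Each value becomes a timezone-aware datetime; the trailing comment is not used.
parsed = [parsedate_to_datetime(h) for h in headers]

for dt in parsed:
    print(dt.isoformat())
```

Here the day of the month stays at 13 for the third header, and the UTC offset comes from the numeric "+0000" field rather than from the parenthesised comment.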
