Timezone offset sign reversed by dateutil?

Does anyone know why Python's dateutil reverses the sign of the GMT offset when it parses a datetime string?

Apparently this is known behavior, not only in dateutil but in other parsing functions too. Still, it produces an incorrect datetime unless a pre-processing hack is applied:

from dateutil import parser

# Date string as produced by JavaScript: 8 hours behind GMT
jsDT = 'Fri Jan 02 2015 03:04:05.678910 GMT-0800'
python_datetime = parser.parse(jsDT)
print(python_datetime)
# Output: 2015-01-02 03:04:05.678910+08:00

# Workaround: swap the sign of the offset before parsing
jsDT = 'Fri Jan 02 2015 03:04:05.678910 GMT-0800'
if '-' in jsDT:
    jsDT = jsDT.replace('-', '+')
elif '+' in jsDT:
    jsDT = jsDT.replace('+', '-')
python_datetime = parser.parse(jsDT)
print(python_datetime)
# Output: 2015-01-02 03:04:05.678910-08:00
Radiative answered 26/6, 2015 at 17:18 Comment(1)
github.com/dateutil/dateutil/issues/70 - Anatolia

It seems dateutil uses POSIX-style signs here. This is not specific to Python; other software does it too. From the tz database:

# We use POSIX-style signs in the Zone names and the output abbreviations,
# even though this is the opposite of what many people expect.
# POSIX has positive signs west of Greenwich, but many people expect
# positive signs east of Greenwich.  For example, TZ='Etc/GMT+4' uses
# the abbreviation "GMT+4" and corresponds to 4 hours behind UT
# (i.e. west of Greenwich) even though many people would expect it to
# mean 4 hours ahead of UT (i.e. east of Greenwich).

The tz database is used almost everywhere.

Example:

$ TZ=Etc/GMT-8 date +%z
+0800

You probably expect a different timezone:

>>> from datetime import datetime
>>> import pytz
>>> pytz.timezone('America/Los_Angeles').localize(datetime(2015, 1, 2, 3, 4, 5, 678910), is_dst=None).strftime('%Y-%m-%d %H:%M:%S.%f %Z%z')
'2015-01-02 03:04:05.678910 PST-0800'

Note: PST, not GMT.

Though dateutil uses POSIX-style signs even for the PST timezone abbreviation:

>>> from dateutil.parser import parse
>>> str(parse('2015-01-02 03:04:05.678910 PST-0800'))
'2015-01-02 03:04:05.678910+08:00'

datetime.strptime() in Python 3 interprets it "correctly":

$ TZ=America/Los_Angeles python3                                               
...
>>> from datetime import datetime
>>> str(datetime.strptime('2015-01-02 03:04:05.678910 PST-0800', '%Y-%m-%d %H:%M:%S.%f %Z%z'))
'2015-01-02 03:04:05.678910-08:00'

Notice the sign.
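If the input always has the JavaScript-like shape shown in the question, one option is to parse it with datetime.strptime() and treat "GMT" as literal text, so that %z reads the offset with the usual sign. A minimal sketch, assuming Python 3, an English (C) locale for %a and %b, and a fixed input format; the helper name parse_js_date is hypothetical:

```python
from datetime import datetime

# "GMT" is matched literally; %z then parses the numeric offset (-0800).
JS_FORMAT = '%a %b %d %Y %H:%M:%S.%f GMT%z'

def parse_js_date(text):
    """Parse a string such as 'Fri Jan 02 2015 03:04:05.678910 GMT-0800'."""
    return datetime.strptime(text, JS_FORMAT)

dt = parse_js_date('Fri Jan 02 2015 03:04:05.678910 GMT-0800')
print(dt)  # 2015-01-02 03:04:05.678910-08:00
```

This only works when the format is known in advance; it is not a general replacement for dateutil's free-form parsing.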

Despite the confusion caused by POSIX-style signs, dateutil's behavior is unlikely to change. See the dateutil bug "GMT+1" is parsed as "GMT-1" and @Lennart Regebro's reply:

Parsing GMT+1 this way is actually a part of the POSIX specification. This is therefore a feature, and not a bug.

See how the TZ environment variable is defined in the POSIX specification; glibc uses a similar definition.
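The effect of a POSIX TZ string can be observed from Python on a Unix system. The sketch below is an illustration under assumptions: time.tzset() is Unix-only, the zone name "XYZ" is an arbitrary placeholder, and the helper name posix_tz_seconds_west is hypothetical.

```python
import os
import time

def posix_tz_seconds_west(tz_string):
    """Return time.timezone (seconds west of UTC) for a POSIX TZ string."""
    saved = os.environ.get('TZ')
    os.environ['TZ'] = tz_string
    time.tzset()  # re-read TZ; available on Unix only
    try:
        return time.timezone
    finally:
        # Restore the previous TZ setting
        if saved is None:
            del os.environ['TZ']
        else:
            os.environ['TZ'] = saved
        time.tzset()

print(posix_tz_seconds_west('XYZ+4'))  # 14400: four hours west of (behind) UTC
print(posix_tz_seconds_west('XYZ-8'))  # -28800: eight hours east of (ahead of) UTC
```

A positive number in the TZ string therefore means "west of Greenwich", which is the opposite of the +hh:mm notation used in ISO 8601 timestamps.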

It is not clear why dateutil uses POSIX TZ-like syntax to interpret the timezone info in a time string. The syntax is not exactly the same: for example, POSIX syntax requires colons between the fields of the UTC offset, hh[:mm[:ss]], and these are not present in your input.

Anatolia answered 26/6, 2015 at 20:24 Comment(2)
Thanks for investigating this. If I understand correctly, the problem comes from a conflict between the timezone conventions used by Python's datetime and by POSIX. In Python's datetime, the time values conceptually represent the result of a GMT adjustment: GMT - Adj = local datetime. In POSIX, the timezone adjustment tells you how to obtain GMT: local datetime + Adj = GMT. This is strange, since epoch timestamps in POSIX mean the local 'human' time is always derived from the UTC float, and the default JavaScript new Date() output appears to agree with Python's datetime. - Radiative
Python, like everybody else, uses the <utc time> + <utc offset> = <local time> definition. POSIX behavior (GMT+h, no minutes, and the reversed sign) is supported, but as a rule it is considered deprecated. - Anatolia

The source code for dateutil.parser.parse explains this.

Check for something like GMT+3, or BRST+3. Notice that it doesn't mean "I am 3 hours after GMT", but "my time +3 is GMT". If found, we reverse the logic so that timezone parsing code will get it right.

And a further comment:

With something like GMT+3, the timezone is not GMT.
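If the pre-processing approach from the question is kept, the sign swap can be limited to the offset that directly follows the GMT or UTC label, so hyphens elsewhere in the string (for example in an ISO date) are left alone. A rough sketch using only the standard library; the helper name flip_gmt_offset_sign and the pattern are assumptions for illustration and are not part of dateutil:

```python
import re

# A GMT/UTC label immediately followed by a sign and a digit
_OFFSET = re.compile(r'\b(GMT|UTC)([+-])(?=\d)')

def flip_gmt_offset_sign(text):
    """Swap '+' and '-' only in an offset attached to GMT or UTC."""
    def swap(match):
        sign = '+' if match.group(2) == '-' else '-'
        return match.group(1) + sign
    return _OFFSET.sub(swap, text)

print(flip_gmt_offset_sign('Fri Jan 02 2015 03:04:05.678910 GMT-0800'))
# Fri Jan 02 2015 03:04:05.678910 GMT+0800
print(flip_gmt_offset_sign('2015-01-02 03:04:05 GMT+0800'))
# 2015-01-02 03:04:05 GMT-0800
```

The returned string would then be handed to parser.parse() in the same way as in the question.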

Hautesalpes answered 26/6, 2015 at 17:32 Comment(1)
Thanks for the explanation. It seems rather counter-productive that it produces a Python datetime object that cannot be used by subsequent datetime methods (or even by itself) to manipulate timezone reporting without creating the wrong datetime. But it does seem intentional (for reasons unknown). So I guess I'll have to keep my hack in place for now. Thanks again. - Radiative
