datetime from string in Python, best-guessing string format
P

5

110

The function to get a datetime from a string, datetime.strptime(date_string, format), requires a format string as its second argument. Is there a way to build a datetime from a string without knowing the exact format, and have Python best-guess it?

Plot answered 29/2, 2012 at 22:27 Comment(5)
Possible duplicate of "Is there any python library for parsing dates and times from a natural language?" - Lamebrain
Differentiating between mm/dd/yyyy vs. dd/mm/yyyy is an interesting problem, with disastrous results if you get it wrong. - Lauds
It depends how inexact you mean to be when you say, "without the exact format." Could you give examples of the types of inputs you want to be able to handle? Or, could you potentially have partial info about the format (such as whether the year is 2 or 4 digits, or whether the month precedes the day or vice versa)? Without at least some basic info, even a person can't do what you ask. Is 01/02/12 Feb 1st 2012, Jan 2nd 2012, Feb 12th 2001, Dec 2nd 2001, or something else? - Hoehne
github.com/jeffreystarr/dateinfer - Sandblind
@denfromufa I get the following error while importing dateinfer on Python3: from infer import infer ModuleNotFoundError: No module named 'infer' - Denysedenzil
P
173

Use the dateutil library.

I was already using dateutil as an indispensable lib for handling timezones
(See Convert UTC datetime string to local datetime and How do I convert local time to UTC in Python?)

And I've just realized it has date parsing support:

import dateutil.parser
yourdate = dateutil.parser.parse(datestring)

(See also How do I translate a ISO 8601 datetime string into a Python datetime object?)
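A short sketch of how the parser behaves on an ambiguous input, assuming python-dateutil is installed. dayfirst is a documented keyword of dateutil.parser.parse; the helper name guess_datetime is made up for illustration. By default the parser reads 01/02/2012 month-first, so the best guess can be wrong for day-first data.

```python
from dateutil import parser


def guess_datetime(text, dayfirst=False):
    """Best-guess parse; returns None when dateutil cannot interpret text.

    guess_datetime is a hypothetical helper, not part of dateutil.
    """
    try:
        return parser.parse(text, dayfirst=dayfirst)
    except (ValueError, OverflowError):
        # Unparseable input raises ValueError (ParserError in recent
        # releases is a subclass); very large numbers raise OverflowError.
        return None


print(guess_datetime("2012-02-29 22:27"))           # ISO-like input
print(guess_datetime("01/02/2012"))                 # read as January 2
print(guess_datetime("01/02/2012", dayfirst=True))  # read as February 1
print(guess_datetime("hello"))                      # not a date: None
```

Returning None keeps the call site simple; re-raising may be the better choice when a bad date should stop processing.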

Plot answered 1/3, 2012 at 13:42 Comment(5)
Great suggestion. It can parse any formatted date/time from a string. - Unfamiliar
I know it's old, but it doesn't handle the date string "Thursday, 21 May 2020 07:05:00 GMT", because the day name is written out in full. Any suggestions for that one? - Heliotrope
Good approach for a single string but not great for an array - Bonner
@YoëlZerbib I just tested your string. Seems like it has been fixed. I am on Python 3.10.2 - Nystatin
Can't believe this actually exists, the amount of hassle datetime objects have caused me over the years, sigh. Thank you. - Oxygen
T
33

You can get away with a simple function if you only need to check against a few known date formats.

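import datetime

def get_date(s_date):
    """Return a date parsed with the first matching pattern, or None."""
    date_patterns = ["%d-%m-%Y", "%Y-%m-%d"]

    for pattern in date_patterns:
        try:
            return datetime.datetime.strptime(s_date, pattern).date()
        except ValueError:
            # strptime raises ValueError when the string does not fit the pattern
            pass

    print("Date is not in expected format: {}".format(s_date))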
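A related sketch, using only the standard library: instead of returning on the first match, collect every pattern that parses the string, so the caller can see which format matched and whether the input is ambiguous. The names matching_formats, is_ambiguous and CANDIDATE_FORMATS are assumptions for illustration and are not part of the function above.

```python
from datetime import datetime

# Hypothetical candidate list; extend it with the formats you expect to see.
CANDIDATE_FORMATS = ["%d/%m/%Y", "%m/%d/%Y", "%Y-%m-%d"]


def matching_formats(s_date, formats=CANDIDATE_FORMATS):
    """Return (format, date) pairs for every format that parses s_date."""
    matches = []
    for fmt in formats:
        try:
            matches.append((fmt, datetime.strptime(s_date, fmt).date()))
        except ValueError:
            continue  # this format does not fit; try the next one
    return matches


def is_ambiguous(s_date, formats=CANDIDATE_FORMATS):
    """True when the formats that fit disagree about the resulting date."""
    return len({d for _, d in matching_formats(s_date, formats)}) > 1


print(matching_formats("25/12/2018"))  # only the day-first pattern fits
print(is_ambiguous("01/02/2000"))      # day-first and month-first disagree
```

An input that fits several patterns but yields the same date under all of them, such as 01/01/2000, is not reported as ambiguous.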
Telepathy answered 1/8, 2014 at 12:48 Comment(2)
Much quicker than using dateutil provided your date format is covered. - Herschel
I think this enumerative approach with silent fails on all attempted bad formats can be best used to handle edge cases in an error handler after the usual (standard, expected) date format conversion has already failed. - Greenling
A
9

Back before I was a Python guy, I was a Perl guy. One thing I've always missed, and haven't seen anything close to, is Date::Manip. That module can extract a good timestamp from a smattering of nibbles. I almost suspect that its author struck a deal with the Devil.

I've run across a few things that take stabs at it in Python:

If you find anything better I'd love to hear about it though.

Attorn answered 1/3, 2012 at 4:50 Comment(4)
Thanks for the recommendations. See my answer though; I think I found my own answer with the dateutil library. - Plot
What the heck is a smattering of nibbles? - Allamerican
@Allamerican A nibble is a half-byte, a smattering is a small, scattered amount. - Attorn
hilarious, sounds like something out of hairy potter - Allamerican
S
9

You can use datefinder. It detects dates written in many natural-language styles.

import datefinder  # module used to find dates and times written in different styles

string_value = " created 01/15/2005 by ACME inc.and associates.January 4th,2017 at 8pm"
matches = datefinder.find_dates(string_value)
for match in matches:
    print("match found ", match)

Output

match found  2005-01-15 00:00:00
match found  2017-01-04 20:00:00
Stu answered 24/2, 2019 at 19:36 Comment(3)
Unlike dateutil, datefinder can't parse a bare month, e.g. "July" (without either a day or a year). This is kind of a major limitation that would seem to be a trivial fix. - Clupeid
Can't find the date in "02-08-2021 - 10_789_0107987_1_165" - Bienne
Ah shame, wish it could output the string format to parse dates with - Allamerican
S
0

If pandas is already imported, it has a function that fits the bill: pd.to_datetime. In my experience, it works with a wide range of date formats.

Be careful with ambiguity about day/month first: is 01/02/2000 the first of February, or the 2nd of January?

Demo:

import pandas as pd

dts = ['2018-09-30',
       '2020-9-8',
       '25-12-2018',
       '2018-12-25 23:50:55',
       '10:15:35.889 AM',
       '10:15:35.889 PM',
       '2018-12-25 23:50:55.999',
       '2018-12-25 23:50:55.999 +0530',
       ]

# Parse each string on its own so every row gets an independent best guess
pd.DataFrame([{'string': dt, 'datetime': pd.to_datetime(dt)} for dt in dts])


Note that the third value in the list triggers the warning "UserWarning: Parsing dates in %d-%m-%Y format when dayfirst=False (the default) was specified", because here the value is clearly day-first. If it were not clear, pandas would assume month-first and potentially return the wrong datetime.
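To avoid relying on the guess, the day/month order can be stated explicitly. A minimal sketch, assuming pandas is installed: dayfirst and format are both documented parameters of pd.to_datetime, and the sample string is the ambiguous one mentioned earlier in this answer.

```python
import pandas as pd

ambiguous = "01/02/2000"

# Default behaviour: an ambiguous string is read month-first.
month_first = pd.to_datetime(ambiguous)

# dayfirst=True asks pandas to prefer day/month/year when both readings fit.
day_first = pd.to_datetime(ambiguous, dayfirst=True)

# An explicit format removes the guessing altogether.
explicit = pd.to_datetime(ambiguous, format="%d/%m/%Y")

print(month_first.date(), day_first.date(), explicit.date())
```

When the format of a whole column is known, passing format is usually the safer option, since nothing is left to inference.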

Signora answered 22/2 at 22:25 Comment(0)
