The function to get a datetime from a string, datetime.strptime(date_string, format),
requires a format string as the second argument. Is there a way to build a datetime from a string without knowing the exact format, and have Python best-guess it?
Use the dateutil library.
I was already using dateutil as an indispensable library for handling time zones
(see Convert UTC datetime string to local datetime and How do I convert local time to UTC in Python?),
and I've just realized it also has date parsing support:
import dateutil.parser
yourdate = dateutil.parser.parse(datestring)
(See also How do I translate an ISO 8601 datetime string into a Python datetime object?)
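For the ISO 8601 case mentioned above, the standard library may be enough on its own. The following is a minimal sketch, assuming Python 3.7 or later, where datetime.fromisoformat is available; the helper name parse_iso is hypothetical, not part of any library. It only accepts ISO-style input, so it is not a replacement for dateutil's best-guess parsing.

```python
from datetime import datetime
from typing import Optional


def parse_iso(text: str) -> Optional[datetime]:
    """Return a datetime for an ISO 8601-style string, or None if it does not parse.

    parse_iso is a hypothetical helper name used only for this sketch.
    """
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        # fromisoformat raises ValueError for anything it does not recognize
        return None


print(parse_iso("2018-09-30"))                     # date only, time defaults to midnight
print(parse_iso("2018-12-25 23:50:55"))            # space separator is accepted
print(parse_iso("2018-12-25T23:50:55.999+05:30"))  # fractional seconds and UTC offset
print(parse_iso("25-12-2018"))                     # not ISO 8601, so None
```

Anything that comes back as None can then be handed to dateutil.parser.parse for a best guess.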
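You can get away with a simple function if you only need to check against a few known date formats.
import datetime

def get_date(s_date):
    date_patterns = ["%d-%m-%Y", "%Y-%m-%d"]
    for pattern in date_patterns:
        try:
            # Return the first pattern that parses successfully
            return datetime.datetime.strptime(s_date, pattern).date()
        except ValueError:
            # strptime raises ValueError on a mismatch; try the next pattern
            pass
    # No pattern matched; the function falls through and returns None
    print("Date is not in expected format: %s" % s_date)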
Back before I was a Python guy, I was a Perl guy. One of the things I've always missed, and haven't seen anything close to, is Date::Manip. That module can extract a good timestamp from a smattering of nibbles. I almost suspect that its author struck a deal with the Devil.
I've run across a few things that take stabs at it in Python:
- normaldate
- mxDateTime
- roundup's date module has some fans
If you find anything better I'd love to hear about it though.
You can use datefinder. It detects many natural-language styles of dates.
import datefinder  # Module used to find different styles of dates and times
string_value = " created 01/15/2005 by ACME inc.and associates.January 4th,2017 at 8pm"
matches = datefinder.find_dates(string_value)
for match in matches:
    print("match found ", match)
Output
match found 2005-01-15 00:00:00
match found 2017-01-04 20:00:00
If pandas is already imported, it has a function that fits the bill: pd.to_datetime. In my experience this works with a wide range of date formats.
Be careful with ambiguity about day/month first: is 01/02/2000 the first of February, or the 2nd of January?
Demo:
import pandas as pd

dts = ['2018-09-30',
       '2020-9-8',
       '25-12-2018',
       '2018-12-25 23:50:55',
       '10:15:35.889 AM',
       '10:15:35.889 PM',
       '2018-12-25 23:50:55.999',
       '2018-12-25 23:50:55.999 +0530'
       ]
pd.DataFrame([{'string': dt, 'datetime': pd.to_datetime(dt)} for dt in dts])
Note that the third value in the list triggers the warning UserWarning: Parsing dates in %d-%m-%Y format when dayfirst=False (the default) was specified,
because here it is clearly day first. If it were not clear, month first would be assumed, potentially giving the wrong datetime. Passing dayfirst=True to pd.to_datetime makes the day-first reading explicit.
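The day/month ambiguity does not depend on pandas, and a small standard-library sketch can make it visible before you trust any best-guess parser. The candidate format list and the function name interpretations below are assumptions for illustration only; adjust the formats to whatever your data actually contains.

```python
from datetime import date, datetime
from typing import Dict

# Hypothetical candidate list; extend it with the formats your data uses.
CANDIDATE_FORMATS = ["%d/%m/%Y", "%m/%d/%Y", "%Y-%m-%d"]


def interpretations(text: str) -> Dict[str, date]:
    """Return {format: date} for every candidate format that parses text."""
    found = {}
    for fmt in CANDIDATE_FORMATS:
        try:
            found[fmt] = datetime.strptime(text, fmt).date()
        except ValueError:
            # This format does not fit the string; try the next one
            continue
    return found


print(interpretations("01/02/2000"))  # two formats match, giving two different dates
print(interpretations("25/12/2018"))  # only the day-first format matches
```

If more than one format matches and the resulting dates differ, the string is genuinely ambiguous, and the safest option is usually to decide on day-first or month-first explicitly rather than let the parser guess.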