Python - Getting the date format [duplicate]
B

2

5

I'm getting a date as a string and parsing it into a datetime object. Is there any way to check what the date format of the object is?

Let's say that this is the object that I'm creating:

modified_date = parser.parse("2015-09-01T12:34:15.601+03:00")

How can I print or get the exact date format of this object? I need this to verify that it's in the correct format, so that I'll be able to compute the difference between today's date and the given date.

Bisset answered 3/12, 2015 at 18:27 Comment(7)
This question doesn't make sense. If you're parsing it to a datetime object, then it's a datetime object, and doesn't have a format. What exactly do you think you need to compare?Chambers
I'm getting a string and I want to compute the difference in days between today's date and the given string, but to do that I have to make sure that both objects have the same format, otherwise I'll get an exception.Bisset
But you said you were converting it to a datetime object. What does parser.parse do?Chambers
Do you want to verify that the string is in ISO format or what? I don't get itForeordination
The following code is not working:
modified_date = parser.parse("2015-09-01T12:34:15.601+03:00")
today = datetime.today()
diff_test = modified_date - today
I'm getting an exception: TypeError: can't subtract offset-naive and offset-aware datetimes
It's probably related to the timezone, but I'm not sure.Bisset
@DanielRoseman I believe that is python-dateutil parserKurt
TypeError is a different question (the answer: use timezone-aware datetime object for the current time e.g., datetime.now(utc))Molest
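The fix suggested in the last comment can be sketched as follows. This is a minimal illustration, not the asker's code: it uses the standard library's `strptime` as a stand-in for `parser.parse` (an assumption, chosen so the snippet has no third-party dependency), and `days_since` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Stand-in for parser.parse(...): the standard library can parse this
# particular string too (the "+03:00" offset form needs Python 3.7+).
modified_date = datetime.strptime("2015-09-01T12:34:15.601+03:00",
                                  "%Y-%m-%dT%H:%M:%S.%f%z")


def days_since(aware_dt, now=None):
    """Whole days from an aware datetime until now (hypothetical helper)."""
    if aware_dt.tzinfo is None or aware_dt.utcoffset() is None:
        raise ValueError("expected a timezone-aware datetime")
    if now is None:
        # An aware "now" avoids the naive/aware TypeError from the question.
        now = datetime.now(timezone.utc)
    return (now - aware_dt).days


print(days_since(modified_date))
```

Both operands carry an offset, so the subtraction is well defined even though one is at +03:00 and the other is in UTC.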
K
6

I had a look in the source code and, unfortunately, python-dateutil doesn't expose the format. In fact it doesn't even generate a guess for the format at all, it just goes ahead and parses - the code is like a big nested spaghetti of conditionals.

You could have a look at dateinfer which looks to be what you're searching for, but these are unrelated libraries so there is no guarantee at all that python-dateutil will parse with the same format that dateinfer suggests.

>>> from dateinfer import infer
>>> s = "2015-09-01T12:34:15.601+03:00"
>>> infer([s])
'%Y-%d-%mT%I:%M:%S.601+%m:%d'

Look at that .601. Close, but no cigar. I think it has probably also mixed up the month and the day. You might get better results by giving it more than one date string to base the guess upon.

Kurt answered 3/12, 2015 at 18:58 Comment(1)
It seems the format doesn't follow the datetime conventions. I've already created a method that returns the requested format:
def get_formatted_date(date_format, date_to_reformat):
    """
    Reformatting a date object to a specific format
    :param date_format: String the desired format
    :param date_to_reformat: datetime The actual date
    :return: datetime The actual date
    """
    date_str = date_to_reformat.strftime(date_format)
    return parser.parse(date_str)
Bisset
M
4

i need this in order to verify that it's in the correct format

If you know the expected time format (or a set of valid time formats) then you could just parse the input using it: if it succeeds then the time format is valid (the usual EAFP approach in Python):

from datetime import datetime

def parse_date(date_string, valid_date_formats):
    """Return (datetime, matching format) or raise ValueError."""
    for date_format in valid_date_formats:
        try:
            return datetime.strptime(date_string, date_format), date_format
        except ValueError:  # wrong date format
            pass  # try the next format
    raise ValueError("{date_string} is not in the correct format. "
                     "valid formats: {valid_date_formats}".format(**vars()))

Here's a complete code example (in Russian -- ignore the text, look at the code).

If there are many valid date formats then to improve time performance you might want to combine them into a single regular expression or convert the regex to a deterministic or non-deterministic finite-state automaton (DFA or NFA).
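One way the single-regular-expression idea might look is sketched below. The format set, the hand-written patterns, and the names `FORMAT_PATTERNS`, `COMBINED` and `parse_with_regex` are all assumptions made for illustration; whether this is actually faster than trying each format in turn depends on the input and should be measured.

```python
import re
from datetime import datetime

# Hypothetical valid formats, each paired with a hand-written regex
# that matches only the *shape* of the string, not its calendar validity.
FORMAT_PATTERNS = {
    "iso": ("%Y-%m-%d", r"\d{4}-\d{2}-\d{2}"),
    "slash": ("%d/%m/%Y", r"\d{2}/\d{2}/\d{4}"),
    "compact": ("%Y%m%d", r"\d{8}"),
}

# One alternation of named groups; the group that matched names the format.
COMBINED = re.compile("|".join(
    "(?P<{}>{})".format(name, pattern)
    for name, (_, pattern) in FORMAT_PATTERNS.items()
))


def parse_with_regex(date_string):
    """Pick the format with a single regex match, then parse once."""
    match = COMBINED.fullmatch(date_string)
    if match is None:
        raise ValueError(
            "{!r} matches none of the valid formats".format(date_string))
    date_format = FORMAT_PATTERNS[match.lastgroup][0]
    # strptime can still raise ValueError, e.g. for "2015-13-45".
    return datetime.strptime(date_string, date_format), date_format
```

The regex only selects a candidate format; `strptime` remains the final check, so a string with the right shape but an impossible date is still rejected.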

In general, if you need to extract dates from a larger text that is too varied for hand-written parsing rules, consider machine learning solutions, e.g. a NER system such as webstruct (for HTML input).

Molest answered 4/12, 2015 at 13:37 Comment(4)
"if it succeeds then the time format is valid" <-- valid, ok, but not necessarily correct!Kurt
@wim: what is the difference between "valid" and "correct" in this case? Could you provide an example of a date_string that is valid but not correct?Molest
Yes. If you have "01-02-2017" then both "%d-%m-%Y" and "%m-%d-%Y" are valid, but only one is correct. More context is needed, e.g. locale information, or using multiple data points.Kurt
That makes sense. My answer assumes that the formats are sufficiently different, e.g. the formats in the link: date_formats = '%B %d, %Y', '%b %d, %Y', '%Y-%B-%d' (valid and correct are identical here).Molest
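The "valid but not necessarily correct" point from this thread can be checked directly. The sketch below uses a hypothetical helper, `matching_formats`, that returns every candidate format that parses a string; more than one match means the string alone cannot settle which format was intended.

```python
from datetime import datetime


def matching_formats(date_string, candidate_formats):
    """Return every candidate format that parses date_string (hypothetical helper)."""
    matches = []
    for fmt in candidate_formats:
        try:
            datetime.strptime(date_string, fmt)
        except ValueError:
            continue  # this format does not fit the string
        matches.append(fmt)
    return matches


candidates = ["%d-%m-%Y", "%m-%d-%Y"]
ambiguous = matching_formats("01-02-2017", candidates)    # both formats fit
unambiguous = matching_formats("25-02-2017", candidates)  # 25 cannot be a month
print(ambiguous, unambiguous)
```

Running several sample strings through such a check is one way to use "multiple data points": a format that survives for every sample is a better guess than one that fits a single string.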
