Automatic Date/Time parser without specifying format [closed]
U

7

7

I am searching for a Java library that can parse a string into a POJO without specifying the format. I have researched POJava. Is there any other library that does something similar?

DateTime dateTime = DateTimeParser.parse("21/02/13");

// If unclear, use the cultural information passed
DateTime dateTime = DateTimeParser.parse("01/02/13", Locale.US);

//Should also work with time zones
DateTime dateTime = DateTimeParser.parse("2011/12/13T14:15:16+01:00");

I found a related question with the same problem, Intelligent date / time parser for Java, but its answers were not very useful. Neither Joda-Time nor JChronic does what I want. Correct me if I am wrong.

Update:

The reason I say Joda-Time does not serve my purpose is that it expects the string to be in ISO 8601 format or in a format you specify, such as "yyyyMMdd". I cannot hardcode the format because I need to handle several formats.

I have a workaround for eliminating the ambiguity between American and European date formats, i.e. mm/dd/yy or dd/mm/yy. Assuming I have access to the time zone of the date, can I determine whether it is in American or European format? Can someone tell me a way to do this? I Googled but found nothing.
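A time zone alone does not say how a date was written, but a locale does carry a conventional field order. The sketch below is one possible way to ask the JDK whether a locale's short date style puts the day before the month. The class name DateOrder and the method isDayFirst are hypothetical names chosen for illustration, and the check is a simplification that just compares letter positions in the localized pattern.

```java
import java.time.chrono.IsoChronology;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.FormatStyle;
import java.util.Locale;

// Hypothetical helper, not part of any library mentioned in this thread.
final class DateOrder {
    private DateOrder() { }

    /** Returns true if the locale's short date pattern places the day before the month. */
    static boolean isDayFirst(Locale locale) {
        // Ask the JDK for the locale's short date pattern, e.g. something like "M/d/yy".
        String pattern = DateTimeFormatterBuilder.getLocalizedDateTimePattern(
                FormatStyle.SHORT, null, IsoChronology.INSTANCE, locale);
        int day = pattern.indexOf('d');
        int month = pattern.indexOf('M');
        if (month < 0) {
            // Some locales use the stand-alone month letter instead.
            month = pattern.indexOf('L');
        }
        // Simplification: assumes no quoted literal in the pattern contains 'd' or 'M'.
        return day >= 0 && month >= 0 && day < month;
    }
}
```

With this, DateOrder.isDayFirst(Locale.UK) and DateOrder.isDayFirst(Locale.US) give different answers, which can then be used to choose between dd/mm/yy and mm/dd/yy when the input is ambiguous.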

Ultrasound answered 21/2, 2013 at 19:0 Comment(8)
What would you expect the first version to do with 06/05/2013? Whatever you pick, you'll be wrong for a considerable amount of the planet. – Whaley
Did you look at Joda Time? – Sputter
If there is an ambiguity with the first line, use the default cultural information unless one is specified, as in the second code line. – Ultrasound
https://mcmap.net/q/1059353/-intelligent-date-time-parser-for-java-closed – Executrix
possible duplicate of Using Joda Date & Time API to parse multiple formats – Ayacucho
Well you are not wrong, but you are not right either. Can't be done, unless you restrict input to non-ambiguous formats, or guess. The former will make you unpopular, the latter incompetent. :) – Cumine
possible duplicate of Parse any date in Java – Peirsen
check this one github.com/zoho/hawking – Toothed
U
3

I found the answer to my problem: I used the POJava library. This page explains how you can parse a date+time string without specifying any format. However, for the library to work properly, you have to specify the date ordering, i.e. day followed by month or month followed by day.

Ultrasound answered 28/2, 2013 at 16:26 Comment(0)
A
10

The problem is that there are some formats that cannot be guessed right.

A simple example is 01/02/2013. Is this February 1st or January 2nd? Or even worse: 01/02/09?

Both formats exist. (Thank you, UK and US!)

So any format guesser will have to rely on luck for these formats, or fail deliberately for these.

The Python module dateutil.parser can serve as an example of a best-effort parser. Sorry, I don't know of a Java equivalent, but you might want to look at Joda-Time.

http://labix.org/python-dateutil#head-b95ce2094d189a89f80f5ae52a05b4ab7b41af47

It actually has the parameters dayfirst and yearfirst.

Then there is a Perl module:

https://metacpan.org/pod/Time::ParseDate

You might be able to use the precedence list from that module. It's not very fast to blindly try a number of patterns (an optimized lexer will be way faster), but it may be good enough for you, unless you are guessing the format of millions of records.
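To illustrate the idea of a precedence list combined with a dayfirst-style hint, here is a minimal java.time sketch. The class BestEffortDateParser, its parse method and the chosen patterns are assumptions made for this example; they are not taken from dateutil, Time::ParseDate or any Java library. The preferred order is tried first, and the other order is only used as a fallback when the preferred one cannot be valid.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical best-effort parser; the pattern list is only an example.
final class BestEffortDateParser {
    private BestEffortDateParser() { }

    static LocalDate parse(String text, boolean dayFirst) {
        // Precedence list: unambiguous ISO form first, then the preferred
        // day/month order, then the other order as a fallback.
        List<String> patterns = dayFirst
                ? Arrays.asList("uuuu-MM-dd", "dd/MM/uuuu", "dd/MM/uu", "MM/dd/uuuu", "MM/dd/uu")
                : Arrays.asList("uuuu-MM-dd", "MM/dd/uuuu", "MM/dd/uu", "dd/MM/uuuu", "dd/MM/uu");
        for (String pattern : patterns) {
            // STRICT rejects impossible dates such as month 21 or February 31.
            DateTimeFormatter formatter = DateTimeFormatter
                    .ofPattern(pattern, Locale.ROOT)
                    .withResolverStyle(ResolverStyle.STRICT);
            try {
                return LocalDate.parse(text, formatter);
            } catch (DateTimeParseException e) {
                // Not this pattern; try the next one in the list.
            }
        }
        throw new IllegalArgumentException("Unrecognized date: " + text);
    }
}
```

With this sketch, "01/02/13" resolves according to the dayFirst flag, while "21/02/13" can only be read as day-first, so the fallback order is used even when dayFirst is false. As the answer notes, blindly trying patterns like this is not fast.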

Ayacucho answered 21/2, 2013 at 19:7 Comment(1)
If there is an ambiguity with the first line (please refer to the code in the original question), use the default cultural information unless one is specified, as in the second code line. Can I get this kind of intelligence from any library? – Ultrasound
D
1

There is no magic solution to this. Remember date/time formats can also depend on your locale.

Realistically the best you can do is define a list of formats, and "try" them one by one until you find one (or none) that fits.

private static final String FORMAT_1 = "MM/dd/yyyy'T'HH:mm:ss.SSS";
private static final String FORMAT_2 = "MM/dd/yyyy'T'HH:mm:ss";
private static final String FORMAT_3 = "MM/dd/yyyy";

Remember to think about thread safety when working with date/time objects in Java. I have a class doing this sort of stuff named "ThreadSafeDateTimeFormatter".
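One way to sidestep the thread-safety concern is java.time, whose DateTimeFormatter is immutable and thread-safe, unlike SimpleDateFormat. The following is only a sketch using the three patterns above; the class name SharedFormatters and the method tryParse are hypothetical and are not the ThreadSafeDateTimeFormatter class mentioned in this answer.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: formatters are created once and shared between threads.
final class SharedFormatters {
    private SharedFormatters() { }

    // DateTimeFormatter instances are immutable, so static sharing is safe.
    private static final List<DateTimeFormatter> DATE_TIME_FORMATS = Arrays.asList(
            DateTimeFormatter.ofPattern("MM/dd/yyyy'T'HH:mm:ss.SSS", Locale.ROOT),
            DateTimeFormatter.ofPattern("MM/dd/yyyy'T'HH:mm:ss", Locale.ROOT));
    private static final DateTimeFormatter DATE_ONLY =
            DateTimeFormatter.ofPattern("MM/dd/yyyy", Locale.ROOT);

    /** Tries each known format in turn; returns empty if none fits. */
    static Optional<LocalDateTime> tryParse(String text) {
        for (DateTimeFormatter formatter : DATE_TIME_FORMATS) {
            try {
                return Optional.of(LocalDateTime.parse(text, formatter));
            } catch (DateTimeParseException e) {
                // Fall through to the next format.
            }
        }
        try {
            // A date without a time is interpreted as midnight.
            return Optional.of(LocalDate.parse(text, DATE_ONLY).atStartOfDay());
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}
```

Returning an Optional rather than throwing is a design choice for this sketch: "none of the formats fit" is an expected outcome when guessing formats.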

Good luck!

Disfeature answered 21/2, 2013 at 19:11 Comment(0)
I
1

Since I didn't find a handy solution for my situation, I wrote a simple static utility method that helped me. Wrapping the formats in a collection and iterating over it could make things easier if many more formats are added.

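// Requires java.text.ParseException, java.text.SimpleDateFormat and java.util.Date.
public static Date returnDateFromDateString(String propValue) throws Exception {

    // SimpleDateFormat is not thread-safe, so new instances are created on every call.
    SimpleDateFormat sdfFormat1 = new SimpleDateFormat(IDateFConstants.DATE_STRING_FORMAT_1);
    SimpleDateFormat sdfFormat2 = new SimpleDateFormat(IDateFConstants.DATE_STRING_FORMAT_2);
    SimpleDateFormat sdfISO8601 = new SimpleDateFormat(IDateFConstants.DATE_STRING_ISO_8601);

    // Try the date-time pattern before the date-only one: parse(String) ignores
    // trailing text, so "dd.MM.yyyy" would also accept "21.02.2013 14:15:16"
    // and silently drop the time.
    try {
        return sdfFormat2.parse(propValue);
    } catch (ParseException e) {
        // Not this format; try the next one.
    }

    try {
        return sdfFormat1.parse(propValue);
    } catch (ParseException e) {
        // Not this format; try the next one.
    }

    try {
        return sdfISO8601.parse(propValue);
    } catch (ParseException e) {
        // No known format matched; report the error below.
    }

    throw new Exception(IDateFConstants.DATE_FORMAT_ERROR);
}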

where IDateFConstants looks like

public interface IDateFConstants {

    public static final String DATE_STRING_ISO_8601 = "yyyy-MM-dd'T'HH:mm:ss";
    public static final String DATE_STRING_FORMAT_1 = "dd.MM.yyyy";
    public static final String DATE_STRING_FORMAT_2 = "dd.MM.yyyy HH:mm:ss";

    public static final String DATE_FORMAT_ERROR = "Date string wasn't "
                                                + "formatted in known formats";

}
Itinerancy answered 4/10, 2016 at 14:46 Comment(0)
G
0

You need to at least have an ordered list of pattern candidates. Once you have that, DateUtils from Apache Commons Lang has a parseDate(String dateString, String[] patterns) method that lets you easily try out a list of patterns on your date string and parse it with the first one that matches:

public static Date parseDate(String str,
                         String[] parsePatterns)
                  throws ParseException
Parses a string representing a date by trying a variety of different parsers.

The parse will try each parse pattern in turn. A parse is only deemed successful if it parses the whole of the input string. If no parse patterns match, a ParseException is thrown.

The parser will be lenient toward the parsed date.
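The "whole of the input string" rule quoted above can be illustrated with the JDK alone, using SimpleDateFormat together with ParsePosition. This is only a sketch of the idea and not the Commons Lang implementation; the class WholeInputParser and the method parseWhole are hypothetical names, and unlike the quoted documentation this version parses non-leniently.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper showing the "must consume the whole input" check.
final class WholeInputParser {
    private WholeInputParser() { }

    /** Returns the parsed date, or null if no pattern consumes the entire input. */
    static Date parseWhole(String text, String... patterns) {
        for (String pattern : patterns) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
            format.setLenient(false); // assumption of this sketch: reject dates like 32.01.2013
            ParsePosition position = new ParsePosition(0);
            Date date = format.parse(text, position);
            // Accept only if parsing succeeded and nothing is left over.
            if (date != null && position.getIndex() == text.length()) {
                return date;
            }
        }
        return null;
    }
}
```

For example, parseWhole("21.02.2013 14:15:16", "dd.MM.yyyy", "dd.MM.yyyy HH:mm:ss") skips the first pattern, because it would leave the time unparsed, and succeeds with the second one.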

Glarum answered 18/6, 2018 at 14:41 Comment(0)
A
0
// Requires java.text.SimpleDateFormat and java.util.Locale.
public static String detectDateFormat(String inputDate, String requiredFormat) {
    // Strip separators so that only digits and month letters remain.
    String tempDate = inputDate.replace("/", "").replace("-", "").replace(" ", "");

    // Building blocks: month 01-12, day 01-31, four-digit year, three-letter month name.
    // (A character class such as [0-12] only means "0, 1 or 2", not the range 0 to 12.)
    String mm = "(0[1-9]|1[0-2])";
    String dd = "(0[1-9]|[12][0-9]|3[01])";
    String yyyy = "([0-9]{4})";
    String mon = "([A-Za-z]{3})";
    String dateFormat;

    // Order matters: input such as 01022013 fits both MMddyyyy and ddMMyyyy,
    // and the first matching branch wins.
    if (tempDate.matches(mm + dd + yyyy)) {
        dateFormat = "MMddyyyy";
    } else if (tempDate.matches(dd + mm + yyyy)) {
        dateFormat = "ddMMyyyy";
    } else if (tempDate.matches(yyyy + mm + dd)) {
        dateFormat = "yyyyMMdd";
    } else if (tempDate.matches(yyyy + dd + mm)) {
        dateFormat = "yyyyddMM";
    } else if (tempDate.matches(dd + mon + yyyy)) {
        dateFormat = "ddMMMyyyy";
    } else if (tempDate.matches(mon + dd + yyyy)) {
        dateFormat = "MMMddyyyy";
    } else if (tempDate.matches(yyyy + mon + dd)) {
        dateFormat = "yyyyMMMdd";
    } else if (tempDate.matches(yyyy + dd + mon)) {
        dateFormat = "yyyyddMMM";
    } else {
        // Add your required regex here.
        return "";
    }

    try {
        // Parse English month names and reject impossible dates such as February 31.
        SimpleDateFormat parser = new SimpleDateFormat(dateFormat, Locale.ENGLISH);
        parser.setLenient(false);
        return new SimpleDateFormat(requiredFormat, Locale.ENGLISH).format(parser.parse(tempDate));
    } catch (Exception e) {
        // Unparseable input or an invalid output pattern.
        return "";
    }
}
Agate answered 4/7, 2019 at 12:54 Comment(0)
T
0

This date/time parser supports 20+ date formats, and the user can set the input date format through configuration. Check out the complete documentation; it does more than other date-time libraries.

GitHub link: https://github.com/zoho/hawking. Developed by the ZOHO ZIA team.

Hawking Parser is a Java-based NLP parser for parsing date and time information. The most popular parsers out there, like HeidelTime, SUTime, and the Natty date-time parser, are distinctly rule-based. As such, they often tend to struggle with parsing date/time information where more complex factors like context, tense, multiple values, and more need to be considered.

With this in mind, Hawking Parser is designed to address a lot of these challenges and has many distinct advantages over other available date/time parsers.

It's an open-source library under GPL v3 and the best one. To know why it's the best, check out this blog post that explains it in detail: https://www.zoho.com/blog/general/zias-nlp-based-hawking-date-time-parser-is-now-open-source.html

P.S: I'm one of the developers of this project

Toothed answered 1/4, 2021 at 11:14 Comment(2)
I noticed that you've written several answers linking to your website. So far, they all appear to be relevant to the questions, and there are only a few answers, so it's fine. But I should warn you that even if you disclose affiliation to the linked resources, continuing to promote your content repeatedly and/or needlessly could be considered to be spam. Please see How to not be a spammer for further information on this. – Slant
Thanks for letting me know. Will keep this in mind before posting answers. – Toothed

© 2022 - 2024 — McMap. All rights reserved.