Parsing a date’s ordinal indicator ( st, nd, rd, th ) in a date-time string

I checked the SimpleDateFormat javadoc, but I am not able to find a way to parse the ordinal indicator in a date format like this:

 Feb 13th 2015 9:00AM

I tried "MMM dd yyyy hh:mma", but it seems the day has to be a plain number for the parse to succeed.

Is it possible to parse the "13th" date using a SimpleDateFormat without having to truncate the string?

Amerce answered 14/2, 2015 at 9:48 Comment(7)
Does the date also contain 1st, 2nd, 3rd...?Forkey
yes like a normal calendarAmerce
Try #4011575Goosefish
@Goosefish I think he needs the reverse operationInferior
not exactly the same, I want to parse the ordinal numbers to just numbers... the post wants it the other way round....Amerce
@Amerce Well, then keep that in mind if you want to go in this "direction" :D. Try dateString.replaceAll("st|nd|rd|th", "") for your case. This will cut off the unparsable words (as in Bohemian's answer).Goosefish
The solution via String preprocessing as in accepted answer is fine as workaround, however formatting/printing would be more difficult. A comprehensive API-example for my library Time4J is used in this demo.Feudal

Java's SimpleDateFormat doesn't support an ordinal suffix, but the ordinal suffix is just eye candy - it is redundant and can easily be removed to allow a straightforward parse:

Date date = new SimpleDateFormat("MMM dd yyyy hh:mma")
    .parse(str.replaceAll("(?<=\\d)(st|nd|rd|th)", ""));

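The regex can be this simple because the lookbehind (?<=\d) only matches a suffix that directly follows a digit, so the same letters elsewhere in the date (such as the "st" in "August") are left alone.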
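As a self-contained illustration of the same idea, here is a minimal sketch. The class and method names (OrdinalDateParser, stripOrdinalSuffix, parse) are made up for this example, and passing Locale.ENGLISH is an assumption added so that the month name and AM/PM marker do not depend on the JVM's default locale.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

final class OrdinalDateParser {

    // Removes an English ordinal suffix only when it directly follows a digit,
    // so the "st" in "August" is not touched.
    static String stripOrdinalSuffix(String text) {
        return text.replaceAll("(?<=\\d)(st|nd|rd|th)", "");
    }

    // Assumption: English month names and AM/PM markers, hence Locale.ENGLISH.
    static Date parse(String text) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat("MMM dd yyyy hh:mma", Locale.ENGLISH);
        return format.parse(stripOrdinalSuffix(text));
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(stripOrdinalSuffix("Feb 13th 2015 9:00AM"));   // Feb 13 2015 9:00AM
        System.out.println(stripOrdinalSuffix("August 1st 2015 9:00AM")); // August 1 2015 9:00AM
        // The printed form of a Date depends on the JVM's default time zone.
        System.out.println(parse("Feb 13th 2015 9:00AM"));
    }
}
```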


To handle languages that append an ordinal indicator of any length as a suffix:

Date date = new SimpleDateFormat("MMM dd yyyy hh:mma")
    .parse(str.replaceAll("(?<=\\d)(?=\\D* \\d+ )\\p{L}+", ""));

Some languages, e.g. Mandarin, prepend their ordinal indicator; that could be handled too with an alternation, which is left as an exercise for the reader :)
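To see what the more general regex does, the following sketch applies it to two inputs. The class name SuffixStripper and the Dutch-style sample string are illustrative assumptions. The lookahead requires the suffix to be followed by a space, a number and another space, which is what keeps the trailing "AM" from being removed.

```java
final class SuffixStripper {

    // Removes any run of letters that directly follows a digit and is itself
    // followed (after optional non-digits) by " <number> ", i.e. the year.
    static String stripSuffix(String text) {
        return text.replaceAll("(?<=\\d)(?=\\D* \\d+ )\\p{L}+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripSuffix("Feb 13th 2015 9:00AM")); // Feb 13 2015 9:00AM
        System.out.println(stripSuffix("feb 13e 2015 9:00"));    // feb 13 2015 9:00
    }
}
```

Note that this only strips the suffix; the remaining string still has to be parsed with a pattern and locale that match its language.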

Lashawnda answered 14/2, 2015 at 10:6 Comment(7)
Thanks, this seems like the easiest solution. I do really hope that the simpledateformat can handle ordinal numbers one day...Amerce
@Amerce I don't think that SimpleDateFormat will be improved in the future. More likely "they" work on the new Time/Date classes of Java 8: oracle.com/technetwork/articles/java/….Goosefish
Note that this works only for English. So I suggest specifying the Locale to make this assumption explicit. Other languages use other values for their ordinal indicator. And other languages may use any of those four English strings in their month name. Perhaps this regex would be revamped to remove any non-digits appended to the first number.Mullens
Hm, I don't think that the regular expression given above can take into account all languages of the world, leaving aside the detail about suffix versus prefix. Some languages require different digit symbols, for example Arabic (in most countries) or in Indian languages (Tamil, Bengali etc). So let's better say, the regexp-solution is only guaranteed to work for English and most? other European languages.Feudal
'st' will actually match in August. You can add a check to ensure there are one or more digits or a space before the st, e.g. (\d+[ ])st, to accept 1st and 1 stCartwright
This creates a problem when the month is August.Cheat
@Ozuf thanks. I've corrected the regex to prevent August being a problemLashawnda

Java 8 answer (it also works on Java 6 and 7 with a backport), because when this question was asked in 2015 the replacement for SimpleDateFormat was already out:

    DateTimeFormatter parseFormatter = DateTimeFormatter
            .ofPattern("MMM d['st']['nd']['rd']['th'] uuuu h:mma", Locale.ENGLISH);
    LocalDateTime dateTime = LocalDateTime.parse(dateTimeString, parseFormatter);

With the sample date from the question this yields:

2015-02-13T09:00

In the format pattern [] denotes optional parts and '' denotes literal parts. So the pattern says that the number may be followed by st, nd, rd or th.
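A complete, runnable version of the snippet above might look like the following sketch. The class name OptionalSuffixParser and the extra sample strings are assumptions for illustration; the pattern and locale are the ones from this answer.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class OptionalSuffixParser {

    // Each ['xx'] is an optional literal, so the day number may be followed
    // by st, nd, rd or th, or by no suffix at all.
    private static final DateTimeFormatter PARSER = DateTimeFormatter
            .ofPattern("MMM d['st']['nd']['rd']['th'] uuuu h:mma", Locale.ENGLISH);

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, PARSER);
    }

    public static void main(String[] args) {
        System.out.println(parse("Feb 13th 2015 9:00AM")); // 2015-02-13T09:00
        System.out.println(parse("Aug 1st 2015 4:30PM"));  // 2015-08-01T16:30
        System.out.println(parse("Feb 13 2015 9:00AM"));   // 2015-02-13T09:00
    }
}
```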

To use this in Java 6 or 7 you need ThreeTen Backport, or ThreeTenABP on Android.

Since those suffixes are specific to English, and other languages/locales have completely different conventions for writing dates and times (many also don't use AM/PM), I believe that unless you have other requirements, you should implement this for English dates and times only. You should also specify an English-speaking locale explicitly so the code works independently of the locale setting of your computer or JVM.

This combines the best parts of the answers that Hugo and I gave to a duplicate question. Under that duplicate question there are still more Java 8 answers. One limitation of the above code is that its validation is not very strict: you will get away with Feb 13rd and even Feb 13stndrdth.
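If stricter validation matters, one possible approach (not part of the original answer) is to parse leniently first and then check that the text contains the suffix that actually belongs to the parsed day. The sketch below is an assumption-laden illustration: the names SuffixCheck, expectedSuffix and hasCorrectSuffix are hypothetical, and it expects the day to be written without a leading zero and surrounded by single spaces.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class SuffixCheck {

    private static final DateTimeFormatter PARSER = DateTimeFormatter
            .ofPattern("MMM d['st']['nd']['rd']['th'] uuuu h:mma", Locale.ENGLISH);

    // English suffix for a day of month; 11, 12 and 13 are the irregular cases.
    static String expectedSuffix(int day) {
        if (day >= 11 && day <= 13) {
            return "th";
        }
        switch (day % 10) {
            case 1:  return "st";
            case 2:  return "nd";
            case 3:  return "rd";
            default: return "th";
        }
    }

    // True only if the text parses and the day carries its matching suffix.
    // Throws DateTimeParseException if the text cannot be parsed at all.
    static boolean hasCorrectSuffix(String text) {
        LocalDateTime parsed = LocalDateTime.parse(text, PARSER);
        int day = parsed.getDayOfMonth();
        return text.contains(" " + day + expectedSuffix(day) + " ");
    }

    public static void main(String[] args) {
        System.out.println(hasCorrectSuffix("Feb 13th 2015 9:00AM"));       // true
        System.out.println(hasCorrectSuffix("Feb 13rd 2015 9:00AM"));       // false
        System.out.println(hasCorrectSuffix("Feb 13stndrdth 2015 9:00AM")); // false
    }
}
```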

Edit: My own favourite among my answers on ordinal indicators is this one. It’s about formatting, but the formatter I present there works fine for parsing too.

Gesso answered 26/5, 2017 at 9:24 Comment(0)

In case someone finds it useful, here is a DateTimeFormatter builder. The resulting formatter can both format and parse UK dates with ordinal suffixes (e.g. "1st January 2017"):

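import static java.time.temporal.ChronoField.DAY_OF_MONTH;
import static java.time.temporal.ChronoField.MONTH_OF_YEAR;
import static java.time.temporal.ChronoField.YEAR;

import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

import com.google.common.base.Preconditions; // Guava

public class UkDateFormatterBuilder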
{
    /**
     * The UK date formatter that formats a date without an offset, such as '14th September 2020' or '1st January 2017'.
     * @return an immutable formatter which uses the {@link ResolverStyle#SMART SMART} resolver style. It has no override chronology or zone.
     */
    public DateTimeFormatter build()
    {
        return new DateTimeFormatterBuilder()
                .parseCaseInsensitive()
                .parseLenient()
                .appendText(DAY_OF_MONTH, dayOfMonthMapping())
                .appendLiteral(' ')
                .appendText(MONTH_OF_YEAR, monthOfYearMapping())
                .appendLiteral(' ')
                .appendValue(YEAR, 4)
                .toFormatter(Locale.UK);
    }

    private Map<Long, String> monthOfYearMapping()
    {
        Map<Long, String> monthOfYearMapping = new HashMap<>();
        monthOfYearMapping.put(1L, "January");
        monthOfYearMapping.put(2L, "February");
        monthOfYearMapping.put(3L, "March");
        monthOfYearMapping.put(4L, "April");
        monthOfYearMapping.put(5L, "May");
        monthOfYearMapping.put(6L, "June");
        monthOfYearMapping.put(7L, "July");
        monthOfYearMapping.put(8L, "August");
        monthOfYearMapping.put(9L, "September");
        monthOfYearMapping.put(10L, "October");
        monthOfYearMapping.put(11L, "November");
        monthOfYearMapping.put(12L, "December");
        return monthOfYearMapping;
    }

    private Map<Long, String> dayOfMonthMapping()
    {
        Map<Long, String> suffixes = new HashMap<>();
        for (int day=1; day<=31; day++)
        {
            suffixes.put((long) day, day + dayOfMonthSuffix(day));
        }
        return suffixes;
    }

    private String dayOfMonthSuffix(final int day)
    {
        Preconditions.checkArgument(day >= 1 && day <= 31, "Illegal day of month: " + day);
        if (day >= 11 && day <= 13)
        {
            return "th";
        }
        switch (day % 10)
        {
            case 1:  return "st";
            case 2:  return "nd";
            case 3:  return "rd";
            default: return "th";
        }
    }
}

Plus a fragment of the test class:

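import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

import org.junit.Assert;
import org.junit.Test;

public class UkDateFormatterBuilderTest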
{
    DateTimeFormatter formatter = new UkDateFormatterBuilder().build();

    @Test
    public void shouldFormat1stJanuaryDate()
    {
        final LocalDate date = LocalDate.of(2017, 1, 1);

        final String formattedDate = date.format(formatter);

        Assert.assertEquals("1st January 2017", formattedDate);
    }

    @Test
    public void shouldParse1stJanuaryDate()
    {
        final String formattedDate = "1st January 2017";

        final LocalDate parsedDate = LocalDate.parse(formattedDate, formatter);

        Assert.assertEquals(LocalDate.of(2017, 1, 1), parsedDate);
    }
}

PS. I used Greg Mattes' solution for ordinal suffixes from here: How do you format the day of the month to say "11th", "21st" or "23rd" in Java? (ordinal indicator)

Pickar answered 1/9, 2017 at 10:24 Comment(0)

Well, no need to replace the text. The DateTimeFormatterBuilder is able to parse this as well.

First, we need to create a Map which maps each day-of-month to its text with the ordinal suffix. That is necessary because, as far as I know, the JDK has no built-in support for this.

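// Requires Java 14+ for the switch expression and Java 10+ for Map.copyOf.
// Imports needed: java.util.HashMap, java.util.Map.
static final Map<Long, String> ORDINAL_SUFFIX_MAP;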
static {
    Map<Long, String> map = new HashMap<>();
    for (int i = 1; i <= 31; i++) {
        String suffix = switch (i) {
            case 1, 21, 31 -> "st";
            case 2, 22     -> "nd";
            case 3, 23     -> "rd";
            default        -> "th";
        };
        map.put((long) i, i + suffix);
    }
    ORDINAL_SUFFIX_MAP = Map.copyOf(map);
}

Then we can utilize the DateTimeFormatterBuilder as follows:

DateTimeFormatter formatter = new DateTimeFormatterBuilder()
    .appendPattern(firstPartOfYourPattern)
    .appendText(ChronoField.DAY_OF_MONTH, ORDINAL_SUFFIX_MAP)
    .appendPattern(lastPartOfYourPattern)
    .toFormatter(Locale.ROOT);
LocalDateTime result = LocalDateTime.parse(str, formatter);
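For the date format in the question, the two pattern parts could be filled in as in the sketch below. The class name OrdinalMapParser is hypothetical, Locale.ENGLISH is used here as an assumption so that the month abbreviation and AM/PM marker are English, and the suffix logic is written without switch expressions so it also compiles on Java 8.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class OrdinalMapParser {

    // Maps 1..31 to "1st", "2nd", ..., "31st".
    private static final Map<Long, String> ORDINALS = buildOrdinals();

    private static Map<Long, String> buildOrdinals() {
        Map<Long, String> map = new HashMap<>();
        for (int day = 1; day <= 31; day++) {
            String suffix;
            if (day >= 11 && day <= 13) {
                suffix = "th";
            } else if (day % 10 == 1) {
                suffix = "st";
            } else if (day % 10 == 2) {
                suffix = "nd";
            } else if (day % 10 == 3) {
                suffix = "rd";
            } else {
                suffix = "th";
            }
            map.put((long) day, day + suffix);
        }
        return map;
    }

    // "MMM " and " uuuu h:mma" are the parts before and after the day.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("MMM ")
            .appendText(ChronoField.DAY_OF_MONTH, ORDINALS)
            .appendPattern(" uuuu h:mma")
            .toFormatter(Locale.ENGLISH);

    public static void main(String[] args) {
        LocalDateTime parsed = LocalDateTime.parse("Feb 13th 2015 9:00AM", FORMATTER);
        System.out.println(parsed);                   // 2015-02-13T09:00
        System.out.println(parsed.format(FORMATTER)); // Feb 13th 2015 9:00AM
    }
}
```

Because the day text has to match one of the map entries, a wrong suffix such as "Feb 13rd" is rejected with a DateTimeParseException rather than silently accepted.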
Nankeen answered 9/6, 2022 at 9:10 Comment(3)
This seems like it should work, but it seems not to - see an example with the details fleshed out. Any thoughts on why? Do you have a definitely working version of this code?Frazier
Of course! OP had time components in their pattern string, that's why I used LocalDateTime::parse, because I assumed those components to be available. However, in the example you posted, you are not using any time component, yet you are using LocalDateTime. To fix this, you should be using LocalDate instead.Nankeen
No worries. We've all been there once :-)Nankeen

You should be using RuleBasedNumberFormat from ICU4J (it is not part of the JDK). It supports ordinal formatting and respects the Locale.

Firman answered 18/10, 2017 at 15:9 Comment(0)
