Java 8 Stream function to group a List of anagrams into a Map of Lists
Asked Answered
U

2

10

Java 8 is about to be released... While learning about Streams, I got into a scenario about grouping anagrams using one of the new ways. The problem I'm facing is that I can't find a way to group Strings objects using the map/reduce functions. Instead, I had to create a similar way as documented at Aggregate Operations - Reduction.

Based on the documentation, we can simply use:

LIST<T>.stream().collect(Collectors.groupingBy(POJO::GET_METHOD))

So that Collectors.groupingBy() will aggregate the keys of the map based on the method used. However, this approach is too seems to be cumbersome to wrap a simple String presentation.

public class AnagramsGrouping {
    static class Word {
        public String original;

        public Word(String word) {
            original = word;
        }

        public String getKey() {
            char[] characters = input.toCharArray();
            Arrays.sort(characters);
            return new String(characters);
        }

        public String toString() {
            return original;
        }
    }

    public static void main(String[] args) {
        List<Word> words = Arrays.asList(new Word("pool"), new Word("loop"),
                new Word("stream"), new Word("arc"), new Word("odor"),
                new Word("car"), new Word("rood"), new Word("meats"),
                new Word("fires"), new Word("fries"), new Word("night"),
                new Word("thing"), new Word("mates"), new Word("teams"));

        Map<String, List<Word>> anagrams = words.stream().collect(
                Collectors.groupingBy(Word::getKey));

        System.out.println(anagrams);
    }
}

This prints the following:

{door=[odor, rood], acr=[arc, car], ghint=[night, thing],
 aemrst=[stream], efirs=[fires, fries], loop=[pool, loop],
 aemst=[meats, mates, teams]}

Instead, I'm looking for a simpler and more direct solution that uses the new map/reduce functions to accumulate the results into the similar interface Map<String, List<String>. Based on How to convert List to Map, I have the following:

List<String> words2 = Arrays.asList("pool", "loop", "stream", "arc",
        "odor", "car", "rood", "meats", "fires", "fries",
        "night", "thing", "mates", "teams");

words2.stream().collect(Collectors.toMap(w -> sortChars(w), w -> w));

But this code generates a key collision as it is a Map of 1-1.

Exception in thread "main" java.lang.IllegalStateException: Duplicate key pool

which makes sense... Is there a way to group them into the similar output as the first solution with groupingBy, but without using a POJO wrapping the values?

Ulibarri answered 24/2, 2014 at 4:1 Comment(0)
F
19

The single-argument groupingBy collector does exactly what you want to do. It classifies its input, which you've already done using sortChars (or getKey in the earlier example). Each stream value that's classified under the same key gets put into a list which is the map's value. Thus we have:

Map<String, List<String>> anagrams =
    words2.stream().collect(Collectors.groupingBy(w -> sortChars(w)));

giving the output

{door=[odor, rood], acr=[arc, car], ghint=[night, thing], aemrst=[stream],
efirs=[fires, fries], loop=[pool, loop], aemst=[meats, mates, teams]}

You could also use a method reference:

Map<String, List<String>> anagrams =
    words2.stream().collect(Collectors.groupingBy(GroupingAnagrams::sortChars));

If you want to do something with the values other than building up a list, use a multi-arg overload of groupingBy and a "downstream" collector. For example, to count the words instead of building up a list, do this:

Map<String, Long> anagrams =
    words2.stream().collect(
        Collectors.groupingBy(GroupingAnagrams::sortChars, Collectors.counting()));

This results in:

{door=2, acr=2, ghint=2, aemrst=1, efirs=2, loop=2, aemst=3}

EDIT:

In case it wasn't clear, sortChars is simply a static function that performs a similar function to what getKey did in the first example, but from string to string:

public static String sortChars(String input) {
    char[] characters = input.toCharArray();
    Arrays.sort(characters);
    return new String(characters);
}
Feaze answered 24/2, 2014 at 4:40 Comment(0)
M
0

You can use toMap method with four parameters and specify separately: the key type, the value type, the merge function for values with the same key, and the particular implementation of the Map into which the results will be inserted.

In this case, you can choose:

  • key - int[] - a sorted array of character code points of the word;
  • value - List<String> - a list of anagrams;
  • merge function - two lists into one;
  • map - TreeMap with a comparator that compares two int[] arrays.
List<String> words = List.of("pool", "loop", "stream", "arc", "odor", "car",
        "rood", "meats", "fires", "fries", "night", "thing", "mates", "teams");
Map<int[], List<String>> anagrams = words.stream()
        .collect(Collectors.toMap(
                // key - a sorted array of character code points
                word -> word.codePoints().sorted().toArray(),
                // value - a list of anagrams
                word -> new ArrayList<>(List.of(word)),
                // merge elements of two lists
                (list1, list2) -> {
                    list1.addAll(list2);
                    return list1;
                },
                // comparator that compares two int[] arrays
                () -> new TreeMap<>(Arrays::compare)));
// output
anagrams.forEach((k, v) -> System.out.println(v.get(0) + "=" + v));

Output:

arc=[arc, car]
stream=[stream]
meats=[meats, mates, teams]
odor=[odor, rood]
fires=[fires, fries]
night=[night, thing]
pool=[pool, loop]

See also: How do you check if a word has an anagram that is a palindrome?

Meatball answered 14/4, 2021 at 22:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.