Collectors.groupingBy doesn't accept null keys
Asked Answered
R

8

70

In Java 8, this works:

Stream<Class> stream = Stream.of(ArrayList.class);
HashMap<Class, List<Class>> map = (HashMap)stream.collect(Collectors.groupingBy(Class::getSuperclass));

But this doesn't:

Stream<Class> stream = Stream.of(List.class);
HashMap<Class, List<Class>> map = (HashMap)stream.collect(Collectors.groupingBy(Class::getSuperclass));

Maps allows a null key, and List.class.getSuperclass() returns null. But Collectors.groupingBy emits a NPE, at Collectors.java, line 907:

K key = Objects.requireNonNull(classifier.apply(t), "element cannot be mapped to a null key"); 

It works if I create my own collector, with this line changed to:

K key = classifier.apply(t);  

My questions are:

1) The Javadoc of Collectors.groupingBy doesn't say it shouldn't map a null key. Is this behavior necessary for some reason?

2) Is there another, easier way, to accept a null key, without having to create my own collector?

Raised answered 25/3, 2014 at 3:45 Comment(1)
related: #24631463Greenheart
R
17

For the first question, I agree with skiwi that it shouldn't be throwing a NPE. I hope they will change that (or else at least add it to the javadoc). Meanwhile, to answer the second question I decided to use Collectors.toMap instead of Collectors.groupingBy:

Stream<Class<?>> stream = Stream.of(ArrayList.class);

Map<Class<?>, List<Class<?>>> map = stream.collect(
    Collectors.toMap(
        Class::getSuperclass,
        Collections::singletonList,
        (List<Class<?>> oldList, List<Class<?>> newEl) -> {
        List<Class<?>> newList = new ArrayList<>(oldList.size() + 1);
        newList.addAll(oldList);
        newList.addAll(newEl);
        return newList;
        }));

Or, encapsulating it:

/** Like Collectors.groupingBy, but accepts null keys. */
public static <T, A> Collector<T, ?, Map<A, List<T>>>
groupingBy_WithNullKeys(Function<? super T, ? extends A> classifier) {
    return Collectors.toMap(
        classifier,
        Collections::singletonList,
        (List<T> oldList, List<T> newEl) -> {
            List<T> newList = new ArrayList<>(oldList.size() + 1);
            newList.addAll(oldList);
            newList.addAll(newEl);
            return newList;
            });
    }

And use it like this:

Stream<Class<?>> stream = Stream.of(ArrayList.class);
Map<Class<?>, List<Class<?>>> map = stream.collect(groupingBy_WithNullKeys(Class::getSuperclass));

Please note rolfl gave another, more complicated answer, which allows you to specify your own Map and List supplier. I haven't tested it.

Raised answered 26/3, 2014 at 2:14 Comment(1)
Thanks, it helped me for a different need: groupBy and Count, here is my wrapper: public static <T,R> Map<R,Long> countGroupBy(Collection<T> list, Function<T,R> groupBy){ Collector<T, ?, Map<R, Long>> mapCollectors = Collectors.toMap( groupBy, l->1L, Long::sum); return list.stream().collect(mapCollectors); } Spider
C
83

I had the same kind of problem. This failed, because groupingBy performs Objects.requireNonNull on the value returned from the classifier:

    Map<Long, List<ClaimEvent>> map = events.stream()
      .filter(event -> eventTypeIds.contains(event.getClaimEventTypeId()))
      .collect(groupingBy(ClaimEvent::getSubprocessId));

Using Optional, this works:

    Map<Optional<Long>, List<ClaimEvent>> map = events.stream()
      .filter(event -> eventTypeIds.contains(event.getClaimEventTypeId()))
      .collect(groupingBy(event -> Optional.ofNullable(event.getSubprocessId())));
Capwell answered 26/9, 2014 at 12:13 Comment(0)
R
17

For the first question, I agree with skiwi that it shouldn't be throwing a NPE. I hope they will change that (or else at least add it to the javadoc). Meanwhile, to answer the second question I decided to use Collectors.toMap instead of Collectors.groupingBy:

Stream<Class<?>> stream = Stream.of(ArrayList.class);

Map<Class<?>, List<Class<?>>> map = stream.collect(
    Collectors.toMap(
        Class::getSuperclass,
        Collections::singletonList,
        (List<Class<?>> oldList, List<Class<?>> newEl) -> {
        List<Class<?>> newList = new ArrayList<>(oldList.size() + 1);
        newList.addAll(oldList);
        newList.addAll(newEl);
        return newList;
        }));

Or, encapsulating it:

/** Like Collectors.groupingBy, but accepts null keys. */
public static <T, A> Collector<T, ?, Map<A, List<T>>>
groupingBy_WithNullKeys(Function<? super T, ? extends A> classifier) {
    return Collectors.toMap(
        classifier,
        Collections::singletonList,
        (List<T> oldList, List<T> newEl) -> {
            List<T> newList = new ArrayList<>(oldList.size() + 1);
            newList.addAll(oldList);
            newList.addAll(newEl);
            return newList;
            });
    }

And use it like this:

Stream<Class<?>> stream = Stream.of(ArrayList.class);
Map<Class<?>, List<Class<?>>> map = stream.collect(groupingBy_WithNullKeys(Class::getSuperclass));

Please note rolfl gave another, more complicated answer, which allows you to specify your own Map and List supplier. I haven't tested it.

Raised answered 26/3, 2014 at 2:14 Comment(1)
Thanks, it helped me for a different need: groupBy and Count, here is my wrapper: public static <T,R> Map<R,Long> countGroupBy(Collection<T> list, Function<T,R> groupBy){ Collector<T, ?, Map<R, Long>> mapCollectors = Collectors.toMap( groupBy, l->1L, Long::sum); return list.stream().collect(mapCollectors); } Spider
L
8

Use filter before groupingBy##

Filter out the null instances before groupingBy.

Here is an example
MyObjectlist.stream()
  .filter(p -> p.getSomeInstance() != null)
  .collect(Collectors.groupingBy(MyObject::getSomeInstance));
Leatherjacket answered 25/8, 2015 at 15:8 Comment(4)
The second question from the OPS implicitly states that they want to accept null keys so this doesn't answer his question.Eschalot
This can be a part of a correct answer. The other part being that you can then explicitly put null items in the result: result.put(null, myList.stream.filter(p -> p.get() == null).collect(Collectors.toList());Kind of ugly, but may still be more convenient than creating a collector.Muzz
The problem: the groupingBy lambda function has to be called twice. this could be expensive, depending on the functionKilkenny
The other problem is that groupingBy result is not guaranteed to be mutable. But you can pass mapFactory parameter, supplying a mutable map (like HashMap::new)Tropism
T
3

I figured I would take a moment and try to digest this issue you have. I put together a SSCE for what I would expect if I did it manually, and what the groupingBy implementation actually does.

I don't think this is an answer, but it is a 'wonder why it is a problem' thing. Also, if you want, feel free to hack this code to have a null-friendly collector.

Edit: A generic-friendly implementation:

/** groupingByNF - NullFriendly - allows you to specify your own Map and List supplier. */
private static final <T,K> Collector<T,?,Map<K,List<T>>> groupingByNF (
        final Supplier<Map<K,List<T>>> mapsupplier,
        final Supplier<List<T>> listsupplier,
        final Function<? super T,? extends K> classifier) {

    BiConsumer<Map<K,List<T>>, T> combiner = (m, v) -> {
        K key = classifier.apply(v);
        List<T> store = m.get(key);
        if (store == null) {
            store = listsupplier.get();
            m.put(key, store);
        }
        store.add(v);
    };

    BinaryOperator<Map<K, List<T>>> finalizer = (left, right) -> {
        for (Map.Entry<K, List<T>> me : right.entrySet()) {
            List<T> target = left.get(me.getKey());
            if (target == null) {
                left.put(me.getKey(), me.getValue());
            } else {
                target.addAll(me.getValue());
            }
        }
        return left;
    };

    return Collector.of(mapsupplier, combiner, finalizer);

}

/** groupingByNF - NullFriendly - otherwise similar to Java8 Collections.groupingBy */
private static final <T,K> Collector<T,?,Map<K,List<T>>> groupingByNF (Function<? super T,? extends K> classifier) {
    return groupingByNF(HashMap::new, ArrayList::new, classifier);
}

Consider this code (the code groups String values based on the String.length(), (or null if the input String is null)):

public static void main(String[] args) {

    String[] input = {"a", "a", "", null, "b", "ab"};

    // How we group the Strings
    final Function<String, Integer> classifier = (a) -> {return a != null ? Integer.valueOf(a.length()) : null;};

    // Manual implementation of a combiner that accumulates a string value based on the classifier.
    // no special handling of null key values.
    BiConsumer<Map<Integer,List<String>>, String> combiner = (m, v) -> {
        Integer key = classifier.apply(v);
        List<String> store = m.get(key);
        if (store == null) {
            store = new ArrayList<String>();
            m.put(key, store);
        }
        store.add(v);
    };

    // The finalizer merges two maps together (right into left)
    // no special handling of null key values.
    BinaryOperator<Map<Integer, List<String>>> finalizer = (left, right) -> {
        for (Map.Entry<Integer, List<String>> me : right.entrySet()) {
            List<String> target = left.get(me.getKey());
            if (target == null) {
                left.put(me.getKey(), me.getValue());
            } else {
                target.addAll(me.getValue());
            }
        }
        return left;
    };

    // Using a manual collector
    Map<Integer, List<String>> manual = Arrays.stream(input).collect(Collector.of(HashMap::new, combiner, finalizer));

    System.out.println(manual);

    // using the groupingBy collector.        
    Collector<String, ?, Map<Integer, List<String>>> collector = Collectors.groupingBy(classifier);

    Map<Integer, List<String>> result = Arrays.stream(input).collect(collector);

    System.out.println(result);
}

The above code produces the output:

{0=[], null=[null], 1=[a, a, b], 2=[ab]}
Exception in thread "main" java.lang.NullPointerException: element cannot be mapped to a null key
  at java.util.Objects.requireNonNull(Objects.java:228)
  at java.util.stream.Collectors.lambda$groupingBy$135(Collectors.java:907)
  at java.util.stream.Collectors$$Lambda$10/258952499.accept(Unknown Source)
  at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
  at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:512)
  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
  at CollectGroupByNull.main(CollectGroupByNull.java:49)
Toulouse answered 25/3, 2014 at 14:35 Comment(0)
V
2

To your 1st question, from the docs:

There are no guarantees on the type, mutability, serializability, or thread-safety of the Map or List objects returned.

Because not all Map implementations allow null keys they probably added this to reduce to the most common allowable definition of a map to get maximum flexibility when choosing a type.

To your 2nd question, you just need a supplier, wouldn't a lambda work? I'm still getting acquainted with Java 8, maybe a smarter person can add a better answer.

Volkman answered 25/3, 2014 at 4:16 Comment(3)
You can specify a map application that does accept null keys, so yes it is quite a big issue that it will never accept null keys.Babe
Jason, the Map interface allows it to accept or reject nulls. So if they want maximum flexibility they should accept them. Anyway, if someone uses a map implementation that doesn't accept nulls, it can throw its own exceptions.Raised
The interface does but the implementations do not all support null keys. By preventing nulls in the generic collector they have the flexibility to choose any map implementation. I agree that this feels off but likely they weighed the possibility of a mysterious null pointer exception based on performance tuning (or however it chooses an implementation) verses the widest (as in works with all implementations of the Map interface) definition of a key.Volkman
B
2

First of all, you are using lots of raw objects. This is not a good idea at all, first convert the following:

  • Class to Class<?>, ie. instead of a raw type, a parametrized type with an unknown class.
  • Instead of forcefully casting to a HashMap, you should supply a HashMap to the collector.

First the correctly typed code, without caring about a NPE yet:

Stream<Class<?>> stream = Stream.of(ArrayList.class);
HashMap<Class<?>, List<Class<?>>> hashMap = (HashMap<Class<?>, List<Class<?>>>)stream
        .collect(Collectors.groupingBy(Class::getSuperclass));

Now we get rid of the forceful cast there, and instead do it correctly:

Stream<Class<?>> stream = Stream.of(ArrayList.class);
HashMap<Class<?>, List<Class<?>>> hashMap = stream
        .collect(Collectors.groupingBy(
                Class::getSuperclass,
                HashMap::new,
                Collectors.toList()
        ));

Here we replace the groupingBy which just takes a classifier, to one that takes a classifier, a supplier and a collector. Essentially this is the same as what there was before, but now it is correctly typed.

You are indeed correct that in the javadoc it is not stated that it will throw a NPE, and I do not think it should be throwing one, as I am allowed to supply whatever map I want, and if my map allows null keys, then it should be allowed.

I do not see any other way to do it simpler as of now, I'll try to look more into it.

Babe answered 25/3, 2014 at 8:9 Comment(3)
Skiwi, I typed a bad example, since I don't really need a HashMap, and the simpler groupingBy is not guaranteed to return one. My example should have been just Map, like this: Stream<Class<?>> stream = Stream.of(ArrayList.class); Map<Class<?>, List<Class<?>>> map = stream.collect(Collectors.groupingBy(Class::getSuperclass));Raised
A warning for IntellijIDEA13 users: the IDE issues a Cyclic inference error on Class::getSuperclass. If you remove all the <?> this error is not shown. But now I realized it compiles and runs even with this "error", so it seems not to be a real error, but an IntellijIDEA bug.Raised
this answer doesn't work. All groupingBy methods in Collectors end up using internally the same implementation that rejects null keys. Can't believe nobody has bothered to check this before upvoting...Eschalot
H
0

You can use Stream#collect(Supplier<R> supplier, BiConsumer<R,? super T> accumulator, BiConsumer<R,R> combiner) instead.

https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#collect-java.util.function.Supplier-java.util.function.BiConsumer-java.util.function.BiConsumer-


When you have a list of objects of a self-defined POJO type:

package code;

import static java.util.Arrays.asList;
import static java.util.stream.Collectors.toList;
import static lombok.AccessLevel.PRIVATE;

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

import lombok.Data;
import lombok.experimental.Accessors;
import lombok.experimental.FieldDefaults;

public class MainGroupListIntoMap {

    public static void main(String[] args) throws Exception {

        final List<Item> items = Arrays.asList(
            new Item().setName("One").setType("1"),
            new Item().setName("Two").setType("1"),
            new Item().setName("Three").setType("1"),
            new Item().setName("Four").setType("2"),
            new Item().setName("Same").setType(null),
            new Item().setName("Same").setType(null),
            new Item().setName(null).setType(null)
        );

        final Map<String, List<Item>> grouped = items
                .stream()
                .collect(HashMap::new,
                         (m, v) -> m.merge(v.getType(),
                                           asList(v),
                                           (oldList, newList) -> Stream.concat(oldList.stream(),
                                                                               newList.stream())
                                                                       .collect(toList())),
                         HashMap::putAll);

        grouped.entrySet().forEach(System.out::println);
    }
}

@Data
@Accessors(chain = true)
@FieldDefaults(level = PRIVATE)
class Item {
    String name;
    String type;
}

Output:

null=[Item(name=Same, type=null), Item(name=Same, type=null), Item(name=null, type=null)]
1=[Item(name=One, type=1), Item(name=Two, type=1), Item(name=Three, type=1)]
2=[Item(name=Four, type=2)]

In your case:

package code;

import static java.util.Arrays.asList;
import static java.util.stream.Collectors.toList;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

public class MainGroupListIntoMap2 {

    public static void main(String[] args) throws Exception {

        group(asList(ArrayList.class, List.class))
            .entrySet()
            .forEach(System.out::println);
    }

    private static Map<Class<?>, List<Class<?>>> group(List<Class<?>> classes) {
        final Map<Class<?>, List<Class<?>>> grouped = classes
                .stream()
                .collect(HashMap::new,
                         (m, v) -> m.merge(v.getSuperclass(),
                                           asList(v),
                                           (oldList, newList) -> Stream.concat(oldList.stream(),
                                                                               newList.stream())
                                                                       .collect(toList())),
                         HashMap::putAll);

        return grouped;
    }
}

Output:

null=[interface java.util.List]
class java.util.AbstractList=[class java.util.ArrayList]
Hum answered 2/9, 2021 at 8:29 Comment(0)
E
0

This is what I settled on:

import org.junit.jupiter.api.Test;

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;

import static org.junit.jupiter.api.Assertions.assertEquals;

class CollectorTest {

    record Pair(Integer left, Integer right) {}
    List<Pair> pairs = List.of(new Pair(null, 1), new Pair(null, 2),
                               new Pair(1, 10),   new Pair(1, 11),
                               new Pair(2, 20),   new Pair(2, 21));

    @Test
    void testToMapOfSets() {
        Map<Integer, Set<Integer>> map = pairs.stream().collect(CollectorTest.toMapOfSets(Pair::left, Pair::right));
        assertEquals(Set.of(1, 2),   map.get(null));
        assertEquals(Set.of(10, 11), map.get(1));
        assertEquals(Set.of(20, 21), map.get(2));
    }

    @Test
    void testMapOfLists() {
        Map<Integer, List<Integer>> map = pairs.stream().collect(CollectorTest.toMapOfLists(Pair::left, Pair::right));
        assertEquals(List.of(1,2),   map.get(null));
        assertEquals(List.of(10,11), map.get(1));
        assertEquals(List.of(20,21), map.get(2));
    }

    public static <T, K, U> Collector<T, ?, Map<K, Set<U>>> toMapOfSets(Function<? super T, ? extends K> keyMapper,
                                                                        Function<? super T, ? extends U> valueMapper) {
        return toMapOfCollections(keyMapper, valueMapper, HashSet::new);
    }

    public static <T, K, U> Collector<T, ?, Map<K,List<U>>> toMapOfLists(Function<? super T, ? extends K> keyMapper,
                                                                         Function<? super T, ? extends U> valueMapper) {
        return toMapOfCollections(keyMapper, valueMapper, ArrayList::new);
    }

    private static <T, K, U, C extends Collection<U>>
    Collector<T, ?, Map<K,C>> toMapOfCollections(Function<? super T, ? extends K> keyMapper,
                                                 Function<? super T, ? extends U> valueMapper,
                                                 Supplier<C> collectionSupplier) {
        return Collectors.toMap(keyMapper, valueMapper.andThen(value -> {
            C collection = collectionSupplier.get();
            collection.add(value);
            return collection;
        }), (collection1, collection2) -> {
            collection1.addAll(collection2);
            return collection1;
        }, HashMap::new);
    }
}
Erechtheum answered 20/7, 2023 at 11:58 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.