A colleague of mine presented me with an interesting problem, and I was unable to find a neat and pretty Java 8 solution. The problem is to stream through a list of POJOs and then collect them in a map based on multiple properties, so that the mapping causes each POJO to occur in the map multiple times.
Imagine the following POJO:
private static class Customer {
    public String first;
    public String last;

    public Customer(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public String toString() {
        return "Customer(" + first + " " + last + ")";
    }
}
Set it up as a List<Customer>:
// The list of customers
List<Customer> customers = Arrays.asList(
        new Customer("Johnny", "Puma"),
        new Customer("Super", "Mac"));
Alternative 1: Use a Map outside of the stream (or rather, outside the forEach).
// Alt 1: not pretty since the resulting map is "outside" of
// the stream. If parallel streams are used it must be a
// ConcurrentHashMap
Map<String, Customer> res1 = new HashMap<>();
customers.stream().forEach(c -> {
    res1.put(c.first, c);
    res1.put(c.last, c);
});
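For completeness, the code comment above says the target map must be a ConcurrentHashMap if parallel streams are used; a minimal sketch of that parallel-safe variant (the class and method names here are my own, not from the original code):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Alt1Parallel {
    static class Customer {
        final String first;
        final String last;
        Customer(String first, String last) { this.first = first; this.last = last; }
        @Override public String toString() { return "Customer(" + first + " " + last + ")"; }
    }

    // Same idea as Alt 1, but the side-effecting forEach writes into a
    // concurrent map, so a parallel stream cannot corrupt it.
    static Map<String, Customer> index(List<Customer> customers) {
        Map<String, Customer> res = new ConcurrentHashMap<>();
        customers.parallelStream().forEach(c -> {
            res.put(c.first, c);
            res.put(c.last, c);
        });
        return res;
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("Johnny", "Puma"),
                new Customer("Super", "Mac"));
        System.out.println(index(customers));
    }
}
```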
Alternative 2: Create map entries for each Customer, stream them, and then flatMap them into a single stream. IMO it is a bit too verbose and not so easy to read.
// Alt 2: A bit verbose, and "new AbstractMap.SimpleEntry" feels like
// a "hard" dependency on AbstractMap
Map<String, Customer> res2 = customers.stream()
        .map(p -> {
            Map.Entry<String, Customer> firstEntry = new AbstractMap.SimpleEntry<>(p.first, p);
            Map.Entry<String, Customer> lastEntry = new AbstractMap.SimpleEntry<>(p.last, p);
            return Stream.of(firstEntry, lastEntry);
        })
        .flatMap(Function.identity())
        .collect(Collectors.toMap(
                Map.Entry::getKey, Map.Entry::getValue));
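For what it's worth, the map(...) followed by flatMap(Function.identity()) pair in Alternative 2 can be collapsed into a single flatMap call, which trims the verbosity somewhat (a sketch; the class and method names here are my own):

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Alt2Condensed {
    static class Customer {
        final String first;
        final String last;
        Customer(String first, String last) { this.first = first; this.last = last; }
        @Override public String toString() { return "Customer(" + first + " " + last + ")"; }
    }

    // map(...) + flatMap(Function.identity()) collapsed into one flatMap:
    // each Customer expands directly into two entries.
    static Map<String, Customer> index(List<Customer> customers) {
        return customers.stream()
                .flatMap(c -> Stream.of(
                        new AbstractMap.SimpleEntry<>(c.first, c),
                        new AbstractMap.SimpleEntry<>(c.last, c)))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("Johnny", "Puma"),
                new Customer("Super", "Mac"));
        System.out.println(index(customers));
    }
}
```

The AbstractMap.SimpleEntry dependency is still there, but the pipeline is one stage shorter.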
Alternative 3: This is another one I came up with; it is the "prettiest" code so far, but it uses the three-arg version of reduce, and the third argument is a bit dodgy, as discussed in this question: Purpose of third argument to 'reduce' function in Java 8 functional programming. Furthermore, reduce does not seem like a good fit for this problem since it mutates its accumulator, and parallel streams may not work with the approach below.
// Alt 3: using reduce. Not so pretty
Map<String, Customer> res3 = customers.stream().reduce(
        new HashMap<>(),
        (m, p) -> {
            m.put(p.first, p);
            m.put(p.last, p);
            return m;
        },
        (m1, m2) -> m2 /* <- NOT USED UNLESS PARALLEL */);
If the resulting maps are printed like this:
System.out.println(res1);
System.out.println(res2);
System.out.println(res3);
The result would be:
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
So, now to my question: how should I, in idiomatic Java 8, stream through the List<Customer> and collect it as a Map<String, Customer> where each Customer is entered under two keys (first AND last), i.e. the Customer is mapped twice? I do not want to use any 3rd-party libraries, and I do not want to use a map outside of the stream as in Alternative 1. Are there any other nice alternatives?
The full code can be found on hastebin for simple copy-paste to get the whole thing running.
collect() is "the preferred way"; using reduce() for this is simply wrong. Reduce is for values; the contortions that Alternative 3 goes through violate the specification of reduce(). And if you later try to run that stream in parallel, you'll just get the wrong answer. On the other hand, using collect() as Misha shows is designed to handle this case and is safe either sequentially or in parallel. – Monoicous
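The collect()-based approach the comment alludes to is the mutable-reduction counterpart of Alternative 3: collect() itself supplies the container, the accumulator mutates it, and the combiner merges partial maps, so it is correct both sequentially and in parallel. A minimal sketch using the standard three-arg Stream.collect (the class and method names are my own; this is not a reproduction of Misha's exact answer):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CollectIndex {
    static class Customer {
        final String first;
        final String last;
        Customer(String first, String last) { this.first = first; this.last = last; }
        @Override public String toString() { return "Customer(" + first + " " + last + ")"; }
    }

    // Mutable reduction: the supplier creates a fresh HashMap (one per
    // thread in a parallel run), the accumulator adds both keys for each
    // Customer, and the combiner merges partial maps with putAll.
    static Map<String, Customer> index(List<Customer> customers) {
        return customers.stream().collect(
                HashMap::new,
                (m, c) -> {
                    m.put(c.first, c);
                    m.put(c.last, c);
                },
                Map::putAll);
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("Johnny", "Puma"),
                new Customer("Super", "Mac"));
        System.out.println(index(customers));
    }
}
```

Unlike the reduce() version, the combiner here actually merges, so switching .stream() to .parallelStream() does not change the result.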