Why does mapMulti need type information in comparison to flatMap

I want to use mapMulti instead of flatMap and refactored the following code:

// using flatMap (version 1) => returns Set<Item>
var items = users.stream()
                 .flatMap(u -> u.getItems().stream())
                 .collect(Collectors.toSet());

into this (version 2):

// using mapMulti (version 2) => returns Set<Item>
var items = users.stream()
                 .<Item>mapMulti((u, consumer) -> u.getItems().forEach(consumer))
                 .collect(Collectors.toSet());

Both return the same elements. However, I doubt whether I should really replace all my flatMap calls with the more verbose mapMulti code. Why do I need to add the type information before mapMulti (.<Item>mapMulti)? If I don't include the type information, it returns a Set<Object>. (How) can I simplify mapMulti?

Lukash answered 5/2, 2022 at 14:11 Comment(4)
"I want to use mapMulti instead of flatMap and refactored the following code:" Why? I have quite a bit of code that works fine and doesn't even use streams. Is there going to be a significant benefit to refactoring?Reverberate
@Reverberate It's only about performance. I use flatMap in very many places. According to mapMulti's Javadoc ("When replacing each stream element with a small (possibly zero) number of elements."), it can provide better performance. Indeed, I only replace each stream element with a small (possibly zero) number of elements, so mapMulti would be a good fit for me.Lukash
@Lukash No doubt performance might improve. Would it be noticeable? It might be worthwhile doing some benchmark testing using something like the Java Microbenchmark Harness (JMH). As long as you want to make improvements, you might also find other areas unrelated to flatMap that could use some tweaking.Reverberate
There is another benefit to mapMulti over flatMap, which is that it is more amenable to "imperative" generation of the replacement. With flatMap, you'd have to generate them into a List, and then ask the List for a Stream; with mapMulti, you can generate them directly into the next stage of the pipeline.Sackbut
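
The point in the last comment can be illustrated with a minimal sketch. The class and method names here are hypothetical, chosen only for illustration: each even number is replaced by two copies of itself and odd numbers are dropped. The flatMap version has to build a temporary List and turn it into a Stream, while the mapMulti version pushes elements directly to the downstream consumer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class, not part of the original question.
class ImperativeGenerationSketch {

    // flatMap: the replacement elements are first gathered into a List,
    // which is then turned into a Stream for the pipeline to flatten.
    static List<Integer> evenTwiceWithFlatMap(List<Integer> numbers) {
        return numbers.stream()
                .flatMap(n -> {
                    List<Integer> out = new ArrayList<>();
                    if (n % 2 == 0) {
                        out.add(n);
                        out.add(n);
                    }
                    return out.stream();
                })
                .collect(Collectors.toList());
    }

    // mapMulti: the same logic hands each element straight to the next stage.
    static List<Integer> evenTwiceWithMapMulti(List<Integer> numbers) {
        return numbers.stream()
                .<Integer>mapMulti((n, consumer) -> {
                    if (n % 2 == 0) {
                        consumer.accept(n);
                        consumer.accept(n);
                    }
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(1, 2, 3, 4);
        System.out.println(evenTwiceWithFlatMap(input));  // [2, 2, 4, 4]
        System.out.println(evenTwiceWithMapMulti(input)); // [2, 2, 4, 4]
    }
}
```

Whether this makes any measurable difference in a given application is a separate question that only a benchmark can answer.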

Notice that the kind of type inference required to deduce the resulting stream type when you use flatMap is very different from the kind required when you use mapMulti.

When you use flatMap, the element type of the resulting stream comes from the return type of the lambda body: if the lambda returns a Stream<Item>, the result is a Stream<Item>. Inferring type variables from a lambda's return type is something the compiler has been specifically designed to do (i.e. the compiler "knows about" it).
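
A minimal sketch of the flatMap case, assuming hypothetical Item and User records as stand-ins for the question's classes. The method only compiles because the stream's element type is inferred as Item from the lambda's return type, since a Set<Object> could not be returned as a Set<Item>.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical example class; Item and User are stand-ins for the question's types.
class FlatMapInferenceSketch {
    record Item(String name) {}

    record User(List<Item> items) {
        List<Item> getItems() { return items; }
    }

    static Set<Item> collectItems(List<User> users) {
        // The lambda body returns Stream<Item>, so R in
        // <R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>>)
        // is inferred as Item without an explicit type argument.
        return users.stream()
                .flatMap(u -> u.getItems().stream())
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Item a = new Item("a");
        Item b = new Item("b");
        Item c = new Item("c");
        List<User> users = List.of(new User(List.of(a, b)), new User(List.of(b, c)));
        System.out.println(collectItems(users).size()); // 3
    }
}
```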

However, in the case of mapMulti, the type of the resulting stream that you presumably want can only be inferred from the things you do to the consumer lambda parameter. Hypothetically, the compiler could be designed so that if you have written consumer.accept(1), it would look at what you have passed to accept and see that you want a Stream<Integer>. In the case of getItems().forEach(consumer), the only place the type Item could come from is the return type of getItems, so it would need to look at that instead.

You are basically asking the compiler to infer the parameter types of a lambda, based on the types of arbitrary expressions inside it. The compiler simply has not been designed to do this.
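
To make the difference concrete, here is a small sketch, again assuming hypothetical Item and User records. Without a type argument nothing at the call site determines the result type, so mapMulti yields a Stream<Object> and the collected set is a Set<Object>. With the type argument, the same lambda yields a Set<Item>.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical example class; Item and User are stand-ins for the question's types.
class MapMultiInferenceSketch {
    record Item(String name) {}

    record User(List<Item> items) {
        List<Item> getItems() { return items; }
    }

    // No type argument: the element type falls back to Object,
    // so the consumer parameter is a Consumer<Object>.
    static Set<Object> withoutTypeArgument(List<User> users) {
        return users.stream()
                .mapMulti((u, consumer) -> u.getItems().forEach(consumer))
                .collect(Collectors.toSet());
    }

    // Explicit type argument: the consumer parameter is a Consumer<Item>.
    static Set<Item> withTypeArgument(List<User> users) {
        return users.stream()
                .<Item>mapMulti((u, consumer) -> u.getItems().forEach(consumer))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Item a = new Item("a");
        Item b = new Item("b");
        List<User> users = List.of(new User(List.of(a, b)), new User(List.of(b)));
        // Same elements at run time; only the static types differ.
        System.out.println(withoutTypeArgument(users).equals(withTypeArgument(users))); // true
    }
}
```

Changing the return type of withoutTypeArgument to Set<Item> should make it fail to compile, which is the behaviour described in the question.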

Other than adding the <Item> prefix, there are other (longer) ways to let it infer a Stream<Item> as the return type of mapMulti:

Make the lambda explicitly typed:

var items = users.stream()
             .mapMulti((User u, Consumer<Item> consumer) -> u.getItems().forEach(consumer))
             .collect(Collectors.toSet());

Add a temporary stream variable:

// By looking at the type of itemStream, the compiler can figure out that mapMulti should return a Stream<Item>
Stream<Item> itemStream = users.stream()
             .mapMulti((u, consumer) -> u.getItems().forEach(consumer));
var items = itemStream.collect(Collectors.toSet());

I don't know if this is more "simplified", but I think it is neater if you use method references:

var items = users.stream()
             .map(User::getItems)
             .<Item>mapMulti(Iterable::forEach)
             .collect(Collectors.toSet());
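
For reference, a self-contained sketch that puts the variants above side by side and checks that they agree. The Item and User records and the method names are hypothetical stand-ins, not the question's actual classes.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; Item and User are stand-ins for the question's types.
class MapMultiVariantsSketch {
    record Item(String name) {}

    record User(List<Item> items) {
        List<Item> getItems() { return items; }
    }

    // Explicit type argument on the method call.
    static Set<Item> typeArgument(List<User> users) {
        return users.stream()
                .<Item>mapMulti((u, consumer) -> u.getItems().forEach(consumer))
                .collect(Collectors.toSet());
    }

    // Explicitly typed lambda parameters.
    static Set<Item> explicitLambda(List<User> users) {
        return users.stream()
                .mapMulti((User u, Consumer<Item> consumer) -> u.getItems().forEach(consumer))
                .collect(Collectors.toSet());
    }

    // Target typing through a temporary Stream<Item> variable.
    static Set<Item> targetTyped(List<User> users) {
        Stream<Item> itemStream = users.stream()
                .mapMulti((u, consumer) -> u.getItems().forEach(consumer));
        return itemStream.collect(Collectors.toSet());
    }

    // Extra map step followed by a method reference.
    static Set<Item> methodReference(List<User> users) {
        return users.stream()
                .map(User::getItems)
                .<Item>mapMulti(Iterable::forEach)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Item a = new Item("a");
        Item b = new Item("b");
        Item c = new Item("c");
        List<User> users = List.of(new User(List.of(a, b)), new User(List.of(b, c)));
        Set<Item> expected = Set.of(a, b, c);
        boolean allAgree = typeArgument(users).equals(expected)
                && explicitLambda(users).equals(expected)
                && targetTyped(users).equals(expected)
                && methodReference(users).equals(expected);
        System.out.println(allAgree); // true
    }
}
```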
Jabin answered 5/2, 2022 at 14:58 Comment(7)
Sweeper: one question about your "simplified" solution: by using map(User::getItems), you introduce an "intermediate stream", right? If so, does the use of map(User::getItems) eliminate the advantages of mapMulti compared to flatMap?Lukash
@Lukash I did not profile that. You can try profiling it if you want to know the answer. Performance depends on a lot of things, after all. As I said, I prefer the extra map simply because it is more visually appealing to me. Also, as WJS suggested, there is a high chance that none of this is going to matter.Jabin
@Lukash see also: ericlippert.com/2012/12/17/performance-rant Jabin
Yes, I can do profiling, but the main advantage of mapMulti is that it eliminates the need for an "intermediate stream". As you use map(), which again introduces an intermediate stream, I suspect that the benefits are lost.Lukash
I am not talking about premature optimization. I am talking about the difference between mapMulti and flatMap and the benefit of mapMulti compared to flatMap as described in the official API documentation: I see in your "simplified" example that you add an intermediate stream (via map(User::getItems)), and this eliminates the benefits of mapMulti.Lukash
@Jabin You keep saying "the compiler", as if the compiler implementation has any latitude here. What you're really talking about is the language, and how things like type inference work is clearly spelled out in the language specification (JLS Ch. 18). The compiler has no latitude to change the language which it accepts; that's solely the province of the language definition, as laid out in the specification.Sackbut
@BrianGoetz I totally agree. It is the language specification that causes the compiler to be this way. I was being a bit loose with my wording there.Jabin
