Ignore duplicates when producing map using streams
Map<String, String> phoneBook = people.stream()
                                      .collect(toMap(Person::getName,
                                                     Person::getAddress));

I get a java.lang.IllegalStateException: Duplicate key when a duplicate element is found.

Is it possible to ignore such an exception when adding values to the map?

When there is a duplicate, it should simply continue, ignoring the duplicate key.

Oscillogram answered 31/8, 2015 at 13:49 Comment(2)
If you can use it, HashSet will ignore the key if it already exists.Hejira
@captain-aryabhatta: is it possible to have key-value pairs in a HashSet?Oscillogram

This is possible using the mergeFunction parameter of Collectors.toMap(keyMapper, valueMapper, mergeFunction):

Map<String, String> phoneBook = 
    people.stream()
          .collect(Collectors.toMap(
             Person::getName,
             Person::getAddress,
             (address1, address2) -> {
                 System.out.println("duplicate key found!");
                 return address1;
             }
          ));

mergeFunction is a function that operates on two values associated with the same key. address1 corresponds to the first address encountered while collecting elements and address2 to the second: this lambda simply keeps the first address and ignores the second.

Crankshaft answered 31/8, 2015 at 13:58 Comment(10)
I'm confused: why are duplicate values (not keys) not allowed? And how can duplicate values be allowed?Dapplegray
is there any way to retrieve the key for which the collision occurs ? answer here: #40762454Economize
Is it possible to totally ignore this entry if there's a clash? Basically, if I ever encounter duplicate keys I don't want them to be added at all. In the example above, I don't want address1 or address2 in my map.Hogback
@Hendy Irawan: duplicate values are allowed. The merging function is to choose between (or merge) two values that have the same key.Orthochromatic
@Hogback Actually you can, you just have to make your remapping function return null. See the toMap doc, which points to the merge doc, which states: If the remapping function returns null, the mapping is removed.Orthochromatic
Shouldn't we return address2 to mimic standard map behavior? If this were a forEach instead of a collect, the standard behavior would be that a put of the second address wipes out the first. So, to avoid behavior changes when refactoring, address2 is the logical choice.Bibliophile
@Orthochromatic if there are 3 duplicate entries then it will ignore the first two entries but it will pick the third one. I tried with distinct() method to the stream too but same issue.Anagnos
@Anagnos I'm not sure I understand what you want. This code will pick the first one. If you want to pick the last one return address2 instead.Orthochromatic
@Orthochromatic: it might be that newday meant that returning null on the first collision to skip entry creation still creates an entry if there's a third element with the same key.Batch
@HendyIrawan In a hashmap the index of a value is based on its key: a hashing function converts the key into an index. If the same key were inserted again, the new value would be placed in exactly the same slot as the previous value. There are multiple ways of resolving this, but in the two-argument toMap it is simply not allowed.Jojo
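Following up on the comments above: per the toMap and Map.merge documentation, a merge function that returns null removes the mapping, so a colliding key can be dropped from the result altogether. A minimal sketch (the class and record names here are illustrative):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DropCollidingKeys {

    record Person(String name, String address) {}

    static Map<String, String> build(List<Person> people) {
        // Returning null from the merge function removes the mapping
        // (Map#merge semantics), so a colliding key is dropped entirely
        return people.stream()
                .collect(Collectors.toMap(Person::name, Person::address,
                        (a, b) -> null));
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Ann", "1 Main St"),
                new Person("Bob", "2 Oak Ave"),
                new Person("Ann", "9 Elm Rd"));
        System.out.println(build(people)); // {Bob=2 Oak Ave}
    }
}
```

Note the caveat raised in the comments: once the entry has been removed, a third occurrence of the same key inserts it afresh, so with an odd number of duplicates the key reappears.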

As the JavaDocs say:

If the mapped keys contains duplicates (according to Object.equals(Object)), an IllegalStateException is thrown when the collection operation is performed. If the mapped keys may have duplicates, use toMap(Function keyMapper, Function valueMapper, BinaryOperator mergeFunction) instead.

So you should use toMap(Function keyMapper, Function valueMapper, BinaryOperator mergeFunction) instead. Just provide a merge function that determines which of the duplicate values is put in the map.

For example, if you don't care which one, just call

Map<String, String> phoneBook = people.stream().collect(
        Collectors.toMap(Person::getName, Person::getAddress, (a1, a2) -> a1));
Sandstrom answered 31/8, 2015 at 14:3 Comment(2)
This could cause serious data loss if not understood correctly.Mostly
Well, yes. In most cases duplicated values must be combined somehow, or an exception is thrown (the default, and correct, behavior). But in some rare cases you need to ignore duplicates, and that was the question.Sandstrom

The answer from alaster helped me a lot, but I would like to add some useful information for anyone trying to group the data.

If you have, for example, two Orders with the same code but a different quantity of products each, and you want to sum the quantities, you can do the following:

List<Order> listOrders = new ArrayList<>();
listOrders.add(new Order("COD_1", 1L));
listOrders.add(new Order("COD_1", 5L));
listOrders.add(new Order("COD_1", 3L));
listOrders.add(new Order("COD_2", 3L));
listOrders.add(new Order("COD_3", 4L));

Map<String, Long> totals = listOrders.stream()
        .collect(Collectors.toMap(Order::getCode,
                                  Order::getQuantity,
                                  (q1, q2) -> q1 + q2));

Result:

{COD_3=4, COD_2=3, COD_1=9}

Or, from the javadocs, you can combine the addresses:

 Map<String, String> phoneBook =
     people.stream().collect(toMap(Person::getName,
                                   Person::getAddress,
                                   (s, a) -> s + ", " + a));
Harrumph answered 20/6, 2018 at 18:31 Comment(0)

For anyone else getting this issue but without duplicate keys in the map being streamed, make sure your keyMapper function isn't returning null values.

It's very annoying to track down, because when the second element is processed the exception says "Duplicate key 1", where 1 is actually the value of the entry, not the key.

In my case, my keyMapper function tried to look up values in a different map, but due to a typo in the strings it was returning null values.

final Map<String, String> doop = new HashMap<>();
doop.put("a", "1");
doop.put("b", "2");

final Map<String, String> lookup = new HashMap<>();
lookup.put("c", "e");
lookup.put("d", "f");

// lookup.get(...) returns null for both "a" and "b", so both entries
// collapse onto the null key and the collector reports a "duplicate key"
doop.entrySet().stream()
    .collect(Collectors.toMap(e -> lookup.get(e.getKey()), Map.Entry::getValue));
Uremia answered 18/12, 2019 at 0:51 Comment(0)

For grouping by objects:

Map<Integer, Data> dataMap = dataList.stream()
        .collect(Collectors.toMap(
                Data::getId,
                data -> data,
                (data1, data2) -> {
                    LOG.info("Duplicate group for: " + data2.getId());
                    return data1;
                }));
Tamtam answered 9/9, 2019 at 21:21 Comment(1)
How can you log the key name here if the values are strings?Tenorrhaphy
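On the question of logging the key: the merge function passed to Collectors.toMap only receives the two colliding values, never the key. One workaround is to accumulate with Map.merge yourself, where the element, and therefore the key, is still in scope (the class and record names here are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LogDuplicateKey {

    record Person(String name, String address) {}

    static Map<String, String> build(List<Person> people) {
        Map<String, String> phoneBook = new HashMap<>();
        for (Person p : people) {
            // The remapping function closes over p, so the colliding
            // key (p.name()) is available for logging
            phoneBook.merge(p.name(), p.address(), (existing, incoming) -> {
                System.out.println("duplicate key: " + p.name());
                return existing; // keep the first value
            });
        }
        return phoneBook;
    }

    public static void main(String[] args) {
        System.out.println(build(List.of(
                new Person("Ann", "1 Main St"),
                new Person("Ann", "9 Elm Rd"))));
    }
}
```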

Feels like toMap working often but not always is a dark underbelly of the Java streams API. Like they should have called it toUniqueMap or something...

The easiest way is to use Collectors.groupingBy instead of Collectors.toMap.

It will return a List-valued map by default, but the collision problem is gone, and that may be what you want in the presence of multiples anyway.

  Map<String, List<Person>> phoneBook = people.stream()
          .collect(groupingBy((x) -> x.name));

If you want a Set of the addresses associated with a particular name, groupingBy can do that as well:

Map<String, Set<String>> phoneBook = people.stream()
          .collect(groupingBy((x) -> x.name, mapping((x) -> x.address, toSet())));

The other way is to "start" with either a HashMap or a Set and carefully track the keys yourself so they never duplicate in the output. Ugh. Here's an example that happens to survive this...sometimes...
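A sketch of that manual approach (the class and record names are illustrative; the stateful filter is precisely the fragile part: it depends on encounter order and breaks on parallel streams):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class ManualDedup {

    record Person(String name, String address) {}

    static Map<String, String> build(List<Person> people) {
        // Set.add returns false for a repeated name, so only the first
        // occurrence of each name survives the filter
        Set<String> seen = new HashSet<>();
        return people.stream()
                .filter(p -> seen.add(p.name()))
                .collect(Collectors.toMap(Person::name, Person::address));
    }

    public static void main(String[] args) {
        System.out.println(build(List.of(
                new Person("Ann", "1 Main St"),
                new Person("Bob", "2 Oak Ave"),
                new Person("Ann", "9 Elm Rd"))));
    }
}
```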

Ph answered 7/5, 2021 at 20:58 Comment(0)

I got the same issue. A map stores key-value pairs and does not allow duplicate keys, so if individual objects have duplicate names you will get the error:

java.lang.IllegalStateException: Duplicate key

Ex:

List<Person> personList = new ArrayList<>();
personList.add(new Person(1, "Mark", "Menlo Park"));
personList.add(new Person(2, "Sundar", "1600 Amphitheatre Pkwy"));
personList.add(new Person(3, "Sundar", "232 Santa Margarita Ave"));
personList.add(new Person(4, "Steve", "Los Altos"));

Map<String, String> stringMap = personList.stream()
        .distinct()
        .collect(Collectors.toMap(Person::getName, Person::getAddress));


To resolve it, we need to use a different overload with an additional parameter, the mergeFunction.

stringMap = personList.stream()
        .distinct()
        .collect(Collectors.toMap(Person::getName, Person::getAddress,
                (existing, replacement) -> existing));

System.out.println("Map object output :" + stringMap);

Output: Map object output :{Steve=Los Altos, Mark=Menlo Park, Sundar=1600 Amphitheatre Pkwy}

NOTE: With (existing, replacement) -> replacement, the old value is replaced by the new one. And if you need to store all addresses for the same key, look at a Multimap.

Marquisette answered 6/3, 2023 at 15:43 Comment(0)

If someone is looking to get a Map<String, Person> that keeps only the first occurrence of each name, follow the steps below:

Map<String, Person> phoneBook = people
            .stream()
            .collect(Collectors.toMap(
                    Person::getName,
                    Function.identity(),
                    (first, second) -> first));

Function.identity() is a convenient way to indicate that the values in the map should be the Person objects themselves, without any modification.

(first, second) -> first: This is a merge function used to handle duplicate keys in the stream. If two Person objects have the same name (a duplicate key), this function specifies that the value associated with that key should be the first Person encountered, preserving the first occurrence and discarding the rest.

Use one of the two cases below as you need:

  • If we use (first, second) -> first, then in case of duplicate keys it keeps the first encountered Person.

  • If we use (first, second) -> second, then in case of duplicate keys it keeps the last encountered Person.
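The two cases above can be demonstrated side by side (the class and record names are illustrative; String addresses stand in for the Person values):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MergePolicy {

    record Person(String name, String address) {}

    // keep the first encountered value for a duplicate key
    static Map<String, String> keepFirst(List<Person> people) {
        return people.stream().collect(Collectors.toMap(
                Person::name, Person::address, (first, second) -> first));
    }

    // keep the last encountered value for a duplicate key
    static Map<String, String> keepLast(List<Person> people) {
        return people.stream().collect(Collectors.toMap(
                Person::name, Person::address, (first, second) -> second));
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Ann", "1 Main St"),
                new Person("Ann", "9 Elm Rd"));
        System.out.println(keepFirst(people)); // {Ann=1 Main St}
        System.out.println(keepLast(people));  // {Ann=9 Elm Rd}
    }
}
```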

Accused answered 22/9, 2023 at 15:0 Comment(0)

I have encountered this problem when grouping objects; I always resolved it in a simple way: perform a custom filter using a java.util.Set to remove duplicates by whatever attribute you choose, as below:

Set<String> uniqueNames = new HashSet<>();
Map<String, String> phoneBook = people
                  .stream()
                  // Set.add returns false for an already-seen name, so only the first occurrence passes
                  .filter(person -> person != null && uniqueNames.add(person.getName()))
                  .collect(toMap(Person::getName, Person::getAddress));

Hope this helps anyone having the same problem!

Flintshire answered 29/3, 2019 at 9:1 Comment(0)

For completeness, here's how to "reduce" duplicates down to just one.

If you are OK with keeping the last:

  Map<String, Person> phoneBook = people.stream()
          .collect(groupingBy(x -> x.name, reducing(null, identity(), (first, last) -> last)));

If you want only the first:

  Map<String, Person> phoneBook = people.stream()
          .collect(groupingBy(x -> x.name, reducing(null, identity(), (first, last) -> first != null ? first : last)));

And if you want the last, but with the address as a String (this one doesn't use identity() as a parameter):

  Map<String, String> phoneBook = people.stream()
          .collect(groupingBy(x -> x.name, reducing(null, x -> x.address, (first, last) -> last)));

source

So, in essence, groupingBy paired with a reducing collector behaves very similarly to the toMap collector, with something like its mergeFunction and an identical end result...

Ph answered 7/5, 2021 at 21:25 Comment(0)

You can use a lambda function: the comparison is done on the key string produced by key(...):

List<Blog> blogsNoDuplicates = blogs.stream()
        .collect(toMap(b -> key(b), b -> b, (b1, b2) -> b1)) // key(b) is the criterion for duplicate elimination
        .values().stream()
        .collect(Collectors.toList());

static String key(Blog b) {
    return b.getTitle() + b.getAuthor(); // the key defines the notion of "distinct"
}
Inocenciainoculable answered 17/3, 2022 at 11:25 Comment(0)

Assuming people is a List of Person objects:

  Map<String, String> phoneBook = people.stream()
          .collect(toMap(Person::getName, Person::getAddress));

Now you need two steps:

1)

people = removeDuplicate(people);

2)

Map<String, String> phoneBook = people.stream()
        .collect(toMap(Person::getName, Person::getAddress));

Here is the method to remove duplicates:

public static List<Person> removeDuplicate(Collection<Person> list) {
    if (list == null || list.isEmpty()) {
        return null;
    }

    // distinct() uses equals/hashCode; if Person doesn't override them,
    // this deduplicates by object identity only
    return list.stream()
               .distinct()
               .collect(Collectors.toList());
}

Adding the full example here:

package com.example.khan.vaquar;

import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class RemovedDuplicate {

    public static void main(String[] args) {
        Person vaquar = new Person(1, "Vaquar", "Khan");
        Person zidan = new Person(2, "Zidan", "Khan");
        Person zerina = new Person(3, "Zerina", "Khan");

        // Add some duplicate persons (the same instances are reused, so
        // distinct() works here even though Person doesn't override equals/hashCode)
        Collection<Person> duplicateList = Arrays.asList(vaquar, zidan, zerina, vaquar, zidan, vaquar);

        //
        System.out.println("Before removed duplicate list" + duplicateList);
        //
        Collection<Person> nonDuplicateList = removeDuplicate(duplicateList);
        //
        System.out.println("");
        System.out.println("After removed duplicate list" + nonDuplicateList);

        // 1) solution Working code
        Map<Object, Object> k = nonDuplicateList.stream().distinct()
                .collect(Collectors.toMap(s1 -> s1.getId(), s1 -> s1));
        System.out.println("");
        System.out.println("Result 1 using method_______________________________________________");
        System.out.println("k" + k);
        System.out.println("_____________________________________________________________________");

        // 2) solution using inline distinct()
        Map<Object, Object> k1 = duplicateList.stream().distinct()
                .collect(Collectors.toMap(s1 -> s1.getId(), s1 -> s1));
        System.out.println("");
        System.out.println("Result 2 using inline_______________________________________________");
        System.out.println("k1" + k1);
        System.out.println("_____________________________________________________________________");

        // breaking code: the two-argument toMap throws on the duplicate ids
        System.out.println("");
        System.out.println("Throwing exception _______________________________________________");
        Map<Object, Object> k2 = duplicateList.stream()
                .collect(Collectors.toMap(s1 -> s1.getId(), s1 -> s1));
        System.out.println("");
        System.out.println("k2" + k2);
        System.out.println("_____________________________________________________________________");
    }

    public static List<Person> removeDuplicate(Collection<Person> list) {
        if (list == null || list.isEmpty()) {
            return null;
        }

        return list.stream().distinct().collect(Collectors.toList());
    }

}

// Model class
class Person {
    public Person(Integer id, String fname, String lname) {
        super();
        this.id = id;
        this.fname = fname;
        this.lname = lname;
    }

    private Integer id;
    private String fname;
    private String lname;

    // Getters and Setters

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getFname() {
        return fname;
    }

    public void setFname(String fname) {
        this.fname = fname;
    }

    public String getLname() {
        return lname;
    }

    public void setLname(String lname) {
        this.lname = lname;
    }

    @Override
    public String toString() {
        return "Person [id=" + id + ", fname=" + fname + ", lname=" + lname + "]";
    }

}

Results :

Before removed duplicate list[Person [id=1, fname=Vaquar, lname=Khan], Person [id=2, fname=Zidan, lname=Khan], Person [id=3, fname=Zerina, lname=Khan], Person [id=1, fname=Vaquar, lname=Khan], Person [id=2, fname=Zidan, lname=Khan], Person [id=1, fname=Vaquar, lname=Khan]]

After removed duplicate list[Person [id=1, fname=Vaquar, lname=Khan], Person [id=2, fname=Zidan, lname=Khan], Person [id=3, fname=Zerina, lname=Khan]]

Result 1 using method_______________________________________________
k{1=Person [id=1, fname=Vaquar, lname=Khan], 2=Person [id=2, fname=Zidan, lname=Khan], 3=Person [id=3, fname=Zerina, lname=Khan]}
_____________________________________________________________________

Result 2 using inline_______________________________________________
k1{1=Person [id=1, fname=Vaquar, lname=Khan], 2=Person [id=2, fname=Zidan, lname=Khan], 3=Person [id=3, fname=Zerina, lname=Khan]}
_____________________________________________________________________

Throwing exception _______________________________________________
Exception in thread "main" java.lang.IllegalStateException: Duplicate key Person [id=1, fname=Vaquar, lname=Khan]
    at java.util.stream.Collectors.lambda$throwingMerger$0(Collectors.java:133)
    at java.util.HashMap.merge(HashMap.java:1253)
    at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1320)
    at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at com.example.khan.vaquar.RemovedDuplicate.main(RemovedDuplicate.java:48)
Hortensiahorter answered 28/11, 2018 at 22:24 Comment(0)

I had the same case and found that the simplest solution (assuming you just want to override the map value for a duplicate key) is:

Map<String, String> phoneBook =
       people.stream()
           .collect(Collectors.toMap(Person::getName,
                                     Person::getAddress,
                                     (address1, address2) -> address2));
Spier answered 20/3, 2022 at 11:53 Comment(1)
This is actually a duplicate answer. Please see the accepted one.Pommel

© 2022 - 2024 — McMap. All rights reserved.