Why there is no ConcurrentHashSet against ConcurrentHashMap

Asked 9/8, 2011 at 7:14 Answered 1/3 at 12:13

Solved java collections concurrency hashmap hashset

698

HashSet is based on HashMap.

If we look at HashSet<E> implementation, everything is been managed under HashMap<E,Object>.

<E> is used as a key of HashMap.

And we know that HashMap is not thread safe. That is why we have ConcurrentHashMap in Java.

Based on this, I am confused that why we don't have a ConcurrentHashSet which should be based on the ConcurrentHashMap?

Is there anything else that I am missing? I need to use Set in a multi-threaded environment.

Also, If I want to create my own ConcurrentHashSet can I achieve it by just replacing the HashMap to ConcurrentHashMap and leaving the rest as is?

Bainbridge answered 9/8, 2011 at 7:14 Comment(7)

After looking at the API, if I were to guess I would say that it seems to come down to 2 factors, (1) avoiding having to create a class in Java API for every little bit of functionality needed (2) Providing convenience classes for more frequently used objects. I personally prefer LinkedHashMap and LinkedHashSet since they guarantee order is the same as insertion order, the only reason for using a set is to avoid duplication, often I still want to maintain insertion order. – Erysipelas 9/8, 2011 at 7:31

@Ali, I personally prefer LinkedHashMap and LinkedHashSet you will go far :) – Colombi 9/8, 2011 at 20:54

A bit old question, but as it is the first result in Google, may be useful to know that ConcurrentSkipListSet already has the implementation of ConcurrentHashMap. See docs.oracle.com/javase/7/docs/api/java/util/concurrent/… – Dania 13/6, 2013 at 16:18

What I saw from Java source ConcurrentSkipListSet is built on ConcurrentSkipListMap, which implements ConcurrentNavigableMap and ConcurrentMap. – Bainbridge 14/6, 2013 at 12:18

possible duplicate of is Java HashSet thread-safe for read only? – Rejection 5/11, 2014 at 18:52

Very good point of view. Perhaps the guys at java thought that always is good to associate things.. so they promote fiercely this idea!! – Vizcacha 28/6, 2015 at 0:57

Now I need this feature too, and I have the same question as you have. I used ConcurrentSkipListSet for sorting as well, the bottom storage is a map, but unfortunately not a Hash one. I don't think this has an excuse to not support it, maybe comes in future edition. – Kerril 19/3, 2018 at 5:4

745

There's no built in type for ConcurrentHashSet because you can always derive a set from a map. Since there are many types of maps, you use a method to produce a set from a given map (or map class).

Prior to Java 8, you produce a concurrent hash set backed by a concurrent hash map, by using Collections.newSetFromMap(map)

In Java 8 (pointed out by @Matt), you can get a concurrent hash set view via ConcurrentHashMap.newKeySet(). This is a bit simpler than the old newSetFromMap which required you to pass in an empty map object. But it is specific to ConcurrentHashMap.

Anyway, the Java designers could have created a new set interface every time a new map interface was created, but that pattern would be impossible to enforce when third parties create their own maps. It is better to have the static methods that derive new sets; that approach always works, even when you create your own map implementations.

Vinculum answered 9/8, 2011 at 7:17 Comment(19)

Am I right to say that if you create the set this way from ConcurrentHashMap, you lose the benefits you'd get from ConcurrentHashMap ? – Unbreathed 1/11, 2011 at 20:47

There are no benefits to lose. newSetFromMap's implementation is found starting on line 3841 in docjar.com/html/api/java/util/Collections.java.html. It's just a wrapper.... – Vinculum 1/11, 2011 at 23:36

@zneak, isn't the add operation on a Set equivalent to what putIfAbsent does? The other operations in ConcurrentMap don't offer any functionality that doesn't exist in Set. Set already has a remove method and I don't see how a ConcurrentMap-like replace operation would make sense on a Set object. – Tinsel 19/7, 2012 at 10:36

@Andrew, I think the motivation behind using a "ConcurrentSet" stems from not the API but rather the implementation - thread safety but without a universal lock - multiple concurrent reads for instance. – Photoelectron 13/9, 2012 at 16:7

@UstamanSangat: hmmm, I think the comment I was responding to was deleted at some point. I vaguely remember it asking about why there wasn't a ConcurrentSet interface (and my argument was that it wouldn't provide any new methods; it'd only be a marker interface, like RandomAccess. Admittedly Set itself is basically a marker interface since it doesn't provide new methods; just new constraints). – Tinsel 14/9, 2012 at 21:42

If your elements are Comparable, why not simply use ConcurrentSkipListSet? – Selestina 2/10, 2014 at 10:13

ConcurrentSkipList has lots of (size) overhead and the lookups are slower. – Chism 10/6, 2015 at 15:44

My complaint is that having to go this route, you lose the HashSet(Collection<? extends E> c) constructor, with its nice syntactic sugar for instance converting a List to a Set. Or am I wrong? – Ressler 14/9, 2015 at 21:17

@rogerdpack: Good point. I just ran into this now. newSetFromMap could do that work for us. – Pennipennie 14/12, 2015 at 4:58

One thing I found with the ConcurrentHashMap.newKeySet (and the Collections alternative route) is that it doesn't cater for null keys and will NPE on a remove call (which HashSet will not do). – Caulk 6/10, 2016 at 8:12

The real problem when the JDK not providing a true ConcurrentSet interface is that you'll be missing some atomic operations on that Set, namely addIfAbsent, computeIfAbsent, etc. With JDK8 you can create those yourself because ConcurrentHashMap.KeySetView gives you access to the underlying ConcurrentMap. With JDK7 you're stuck and may have to resort to creating your own wrapper. – Jut 14/1, 2018 at 17:3

take care when using this approach, since some methods are not implemented correctly. Just follow the links: Collections.newSetFromMap creates a SetFromMap. e.g. the SetFromMap.removeAll method delegates to the KeySetView.removeAll, that inherits from ConcurrentHashMap$CollectionView.removeAll. This method is highly inefficient in bulk removing elements. imagine removeAll(Collections.emptySet()) traverses all elements in the Map without doing anything. Having a ConcurrentHashSet that is corretly implemented will be better in most cases. – Demott 1/3, 2018 at 9:50

To follow up on @Demott comment, HashSet gets it's removeAll from AbstractSet, which will at least iterate on the smaller of the two collections... so it's not super terrible how the CollectionView works... especially if you look a bit closer at the perf characteristics of concurrent map... by always using it's own iterator, it can use Iterator.remove to detach an item, whereas the AbstractSet version might iterate the argument and then call .remove(), which causes a complete hash table lookup (versus Iterator.remove, which can avoid the expensive checks / graph transversals). – Enfleurage 6/8, 2018 at 23:8

Basically, if the set/map knows it is expensive to do a "cold remove()", it can make the (right) choice to avoid pathological performance when both collections are large, at the expense of (maybe) poorer performance when collections are small or empty. Especially empty, since you, the caller, can optimize by filtering out empty queries before you make them. – Enfleurage 6/8, 2018 at 23:9

@Enfleurage my comment relates to Java8 and before. they have adressed this with Java9+, so the method indeed makes these checks now. – Demott 8/8, 2018 at 17:59

Ah, cool. many thanks for the update. My company hasn't made the leap to java9+ just yet, so always good to pick up tidbit for when we have to bite the bullet (soon). :-) – Enfleurage 10/8, 2018 at 15:42

Then why ConcurrentSkipListSet exists, when you can derive a set from a ConcurrentSkipListMap? – Risner 13/9, 2018 at 9:57

That's a good question for the author, Doug Lea. He definitely decided it was a good idea to build ConcurrentSkipListSet by wrapping a ConcurrentSkipListMap -- see line 65 of fuseyism.com/classpath/doc/java/util/concurrent/…. I suppose one could figure out why he did that by comparing the source code of the map and set (perhaps in this case there was a benefit to wrapping the map?) but it might be easier to ask Doug directly. I suspect it has something to do with skiplists being sorted sets as opposed to unrestricted sets that can use hashing. – Vinculum 14/9, 2018 at 5:53

@NickNick that’s a question we could expand to the entire Collection API. HashSet is just a wrapper around HashMap, LinkedHashSet a wrapper around LinkedHashMap, TreeSet is just a wrapper around TreeMap… – Wearable 17/3, 2022 at 10:33

129

Prior to Java 8:

Set<String> mySet = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

Java 8 or greater:

ConcurrentHashMap.KeySetView<String, Boolean> mySet = ConcurrentHashMap.newKeySet();

Eudocia answered 23/6, 2014 at 9:21 Comment(3)

As Ray Toal points out in his answer, the preferred way to this in 2021 is definitely ConcurrentHashMap.newKeySet() (which does a similar thing behind the scenes), as it's more succinct. – Digged 26/1, 2021 at 6:14

this is thread-safe right? – Gettings 10/12, 2021 at 7:3

yes, it is thread safe – Eudocia 17/10, 2023 at 8:32

103

With Guava 15 you can also simply use:

Set s = Sets.newConcurrentHashSet();

Bergerac answered 1/2, 2015 at 20:35 Comment(4)

This is always a nightmare. If you have a set or a map that does not indicate whether or not something is thread safe you find all kind of hazards and desasters happen in maintaince. I always would want a type that indicates thread safety for collections (or not). – Calpac 14/12, 2015 at 9:1

The method description is literally "Creates a thread-safe set backed by a hash map" – Bergerac 14/12, 2015 at 16:33

As I said, there is a ConcurrentSet<E> missing. ConcurrentHashMap comes along with a ConcurrentMap interface to indicate this. This is the very same reason I always add this ConcurrentSet interface as well. – Calpac 15/12, 2015 at 9:59

The ConcurrentMap interface is much more than a marker interface - it contains concurrency-specific methods such as putIfAbsent, remove and replace. – Pinter 10/12, 2023 at 10:32

Like Ray Toal mentioned it is as easy as:

Set<String> myConcurrentSet = ConcurrentHashMap.newKeySet();

Chronometry answered 7/7, 2016 at 17:28 Comment(1)

This seems to require Java 8. Looking at implementation, this also seems to be just a wrapper of ConcurrentHashMap. – Papandreou 8/10, 2018 at 5:10

It looks like Java provides a concurrent Set implementation with its ConcurrentSkipListSet. A SkipList Set is just a special kind of set implementation. It still implements the Serializable, Cloneable, Iterable, Collection, NavigableSet, Set, SortedSet interfaces. This might work for you if you only need the Set interface.

Regardant answered 2/10, 2013 at 16:17 Comment(4)

Note that ConcurrentSkipListSet's elements should be Comparable – Selestina 2/10, 2014 at 10:14

If you need to extend from a Set that is concurrent, this is the only solution here that will work. – Capacity 28/12, 2016 at 6:17

ConcurrentSkipListMap adds unnecessary performance penalty of having tree as base data structure, instead of using HashTable, even when you do not need sorting/navigation functionality. – Aforesaid 9/9, 2017 at 18:31

don't use ConcurrentSkipListSet unless you want a SortedSet. A usual operation like add or remove should be O(1) for a HashSet, but O(log(n)) for a SortedSet. – Demott 1/3, 2018 at 9:38

As pointed by this the best way to obtain a concurrency-able HashSet is by means of Collections.synchronizedSet()

Set s = Collections.synchronizedSet(new HashSet(...));

This worked for me and I haven't seen anybody really pointing to it.

EDIT This is less efficient than the currently aproved solution, as Eugene points out, since it just wraps your set into a synchronized decorator, while a ConcurrentHashMap actually implements low-level concurrency and it can back your Set just as fine. So thanks to Mr. Stepanenkov for making that clear.

http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#synchronizedSet-java.util.Set-

Rejection answered 5/11, 2014 at 18:56 Comment(1)

the synchronizedSet method just creates the decorator under Collection to wrap methods that could be thread-safe by synchronization the whole collection. But ConcurrentHashMap is implemented using non-blocking algorithms and "low-level" synchronisations without any locks of the whole collection. So wrapers from Collections.synchronized... is worse in multi-threads environments for performance reasons. – Woodring 1/1, 2015 at 19:53

You can use guava's Sets.newSetFromMap(map) to get one. Java 6 also has that method in java.util.Collections

Poinciana answered 9/8, 2011 at 7:17 Comment(6)

it's available in java.utll.Collections and set of CHM is usually a bad thing anyways. – Colombi 9/8, 2011 at 7:18

yeah, I noticed it is added in Java 6, so added it to the answer – Poinciana 9/8, 2011 at 7:19

The main this is that if it is ThreadSafe, and I really doubt that. – Bainbridge 9/8, 2011 at 7:22

@Talha, it's thread safe, however thread safety alone means nothing – Colombi 9/8, 2011 at 7:25

Sometimes it means everything. Its shaldom a performance problem unless it is part of an algorithm which are usually implemented in a way that the need for concurrent mapping is minimized. – Calpac 14/12, 2015 at 9:3

Indeed, I have faced visibility issues with Sets.newSetFromMap(map), and since then switched to ConcurrentHashMap.newKeySet(). – Bedford 11/1, 2017 at 11:56

import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;


public class ConcurrentHashSet<E> extends AbstractSet<E> implements Set<E>{
   private final ConcurrentMap<E, Object> theMap;

   private static final Object dummy = new Object();

   public ConcurrentHashSet(){
      theMap = new ConcurrentHashMap<E, Object>();
   }

   @Override
   public int size() {
      return theMap.size();
   }

   @Override
   public Iterator<E> iterator(){
      return theMap.keySet().iterator();
   }

   @Override
   public boolean isEmpty(){
      return theMap.isEmpty();
   }

   @Override
   public boolean add(final E o){
      return theMap.put(o, ConcurrentHashSet.dummy) == null;
   }

   @Override
   public boolean contains(final Object o){
      return theMap.containsKey(o);
   }

   @Override
   public void clear(){
      theMap.clear();
   }

   @Override
   public boolean remove(final Object o){
      return theMap.remove(o) == ConcurrentHashSet.dummy;
   }

   public boolean addIfAbsent(final E o){
      Object obj = theMap.putIfAbsent(o, ConcurrentHashSet.dummy);
      return obj == null;
   }
}

Wingspread answered 25/9, 2014 at 5:53 Comment(2)

I like the idea to use Boolean.TRUE instead of an dummy object. It is a little bit more elegant. Also using NULL is also possible since it would be available in the key set even if mapped to null. – Calpac 14/12, 2015 at 9:5

@MartinKersten fyi, ConcurrentHashMap doesn't allow null values – Crossroad 13/1, 2016 at 19:19

Why not use: CopyOnWriteArraySet from java.util.concurrent?

Schurman answered 7/10, 2016 at 14:56 Comment(3)

Because CopyOnWriteArraySet copies the entire collection on any state mutation, which is not always wanted due to the performance impact. It's designed to work only in special cases. – Eberhard 16/1, 2017 at 14:59

Additionally CopyOnWriteArraySet.contains() has a run-time of O(n) (has to check ever entry) where as HashSet/HashMap has O(1). – Hannus 1/12, 2020 at 12:2

because this works only one way: if you're reading data owned/managed by another thread. if you need to pass data to that thread, you'd have to additionally update its atomic ref to updated array - but that makes little sense – Gimlet 23/2, 2021 at 18:10

We have ConcurrentHashSet in io.vertx which can be used against java.util ConcurrentHashMap. https://www.javadoc.io/static/io.vertx/vertx-core/3.0.0/io/vertx/core/impl/ConcurrentHashSet.html

Halfmast answered 17/1, 2023 at 8:33 Comment(1)

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. – Secularity 18/1, 2023 at 11:56

Imho what we really need is a common ConcurrentSet interface which should be on par with ConcurrentMap interface.

All existing code should align on this new interface to clarify a bit when we have a Set that is concurrent (thread-safe) or not...

for example:

ConcurrentSet<String> set = ConcurrentHashMap.newKeySet();
ConcurrentSet<String> set = Collections.synchronizedSet(new HashSet<>());
ConcurrentSet<String> set = new ConcurrentSkipListSet<>();

Alexandro answered 1/3 at 12:13 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags