Pattern for Java ConcurrentHashMap of Sets
Asked Answered
S

2

4

A data structure that I use commonly in multi-threaded applications is a ConcurrentHashMap where I want to save a group of items that all share the same key. The problem occurs when installing the first item for a particular key value.

The pattern that I have been using is:

final ConcurrentMap<KEYTYPE, Set<VALUETYPE>> hashMap = new ConcurrentHashMap<KEYTYPE, Set<VALUETYPE>>();
// ...
Set<VALUETYPE> newSet = new HashSet<VALUETYPE>();
final Set<VALUETYPE> set = hashMap.putIfAbsent(key, newSet)
if (set != null) {
  newSet = set;
}
synchronized (newSet) {
  if (!newSet.contains(value)) {
    newSet.add(value);
  }
}

Is there a better pattern for doing this operation? Is this even thread-safe? Is there a better class to use for the inner Set than java.util.HashSet?

Sungod answered 22/3, 2012 at 12:29 Comment(0)
C
5

I strongly recommend using the Google Guava libraries for this, specifically an implementation of Multimap. The HashMultimap would be your best bet, though if you need concurrent update opterations you would need to wrap it in a delegate using Multimaps.synchronizedSetMultimap().

Another option is to use a ComputingMap (also from Guava), which is a map that, if the Value returned from a call to get(Key) does not exist, it is instantiated there and then. ComputingMaps are created using MapMaker.

The code from your question would be roughly:

ConcurrentMap<KEYTYPE, Set<VALUETYPE>> hashMap = new MapMaker()
                 .makeComputingMap(
        new Function<KEYTYPE, VALUETYPE>() {
         public Graph apply(KEYTYPE key) {
           return new HashSet<VALUETYPE>();
         }
       });

The Function would only be called when a call to get() for a specific key would otherwise return null. This means that you can then do this:

hashMap.get(key).put(value);

safely knowing that the HashSet<VALUETYPE> is created if it doesn't already exist.

MapMaker is also relevant because of the control it gives you over the tuning of the returned Map, letting you specify, for example, the concurrency level using the method concurrencyLevel(). You may find that useful:

Guides the allowed concurrency among update operations. Used as a hint for internal sizing. The table is internally partitioned to try to permit the indicated number of concurrent updates without contention. Because assignment of entries to these partitions is not necessarily uniform, the actual concurrency observed may vary.

Chaeta answered 22/3, 2012 at 12:34 Comment(6)
What kind of thread-safe guarantees are made by the Guava collections? From the JavaDocs for HashMultimap: "This class is not threadsafe when any concurrent operations update the multimap. Concurrent read operations will work correctly. To allow concurrent update operations, wrap your multimap with a call to Multimaps.synchronizedSetMultimap(com.google.common.collect.SetMultimap)." Within the java.util.* collections, the concurrent package was added because the performance of Collections.synchronizedMap() is very poor.Sungod
That is probably something you will need to evaluate yourself. Another option is to use a Computing Map - I will update my answer with an example of that too.Chaeta
Guava does not provide a fully concurrent Multimap, largely because it's extremely difficult to do.Cruelty
That said, I think you're doing it the way I would.Cruelty
@LouisWasserman when you say "you're", are you referring to the OP or to me? :)Chaeta
Sorry. The OP is doing things roughly the way I would, if I were trying to correctly implement a more concurrent multimap -- but synchronizedSetMultimap is known to work, whereas I'm not nearly as confident as I ought to be that the OP's technique works. Achieve a correct solution first, and only then optimize things like this.Cruelty
R
0

I think using java.util.concurrent.ConcurrentSkipListMap and java.util.concurrent.ConcurrentSkipListSet could help you resolve the concurrency concerns.

Recitativo answered 22/3, 2012 at 12:35 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.