Explain synchronization of collections when iterators are used?

I understand that collections like the Hashtable are synchronized, but can someone explain to me how it works, and at what point(s) access is restricted to concurrent calls? For example, let's say I use some iterators like this:

Hashtable<Integer,Integer> map = new Hashtable<Integer,Integer>();

void dosomething1(){
    for (Iterator<Map.Entry<Integer,Integer>> i = map.entrySet().iterator(); i.hasNext();){
        // do something
    }
}
void dosomething2(){
    for (Iterator<Map.Entry<Integer,Integer>> i = map.entrySet().iterator(); i.hasNext();){
        // do something
        // and remove it
        i.remove();
    }
}
void putsomething(int a, int b){
    map.put(a,b);
}
void removesomething(int a){
    map.remove(a);
}
void clear(){
    map = new Hashtable<Integer,Integer>();
}

Can someone please explain whether there are any pitfalls in calling these functions at random from different threads? How does the iterator, in particular, do its synchronization, especially when it is using entrySet(), which would seem to also require synchronization? What happens if clear() is called while one of the loops is in progress? What if removesomething() removes an item that has not yet been processed by a concurrent loop in dosomething1()?

Thanks for any help!

Lowe answered 21/11, 2009 at 15:5 Comment(0)

Iteration over collections in Java is not thread safe, even if you are using one of the synchronized wrappers (Collections.synchronizedMap(...)):

It is imperative that the user manually synchronize on the returned map when iterating over any of its collection views:

Map m = Collections.synchronizedMap(new HashMap());
...
Set s = m.keySet();  // Needn't be in synchronized block
...
synchronized(m) {  // Synchronizing on m, not s!
    Iterator i = s.iterator(); // Must be in synchronized block
    while (i.hasNext())
        foo(i.next());
}

Java Collection Framework docs

Other calls to synchronized collections are safe, as the wrapper classes surround them with synchronized blocks, which use the wrapper collection as their monitor:

public int size() {
    synchronized( this ) {
        return collection.size();
    }
}

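with collection being the original collection. This works for all methods exposed by a collection/map, except for iteration: iterator() returns an iterator that is not itself synchronized, so the caller has to hold the lock for the whole traversal.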

The key set of a map is made synchronized in just the same way: the synchronized wrapper does not return the original key set at all. Instead, it returns a special synchronized wrapper of the original map's key set, which locks on the same monitor as the map wrapper. The same applies to the entry set and the values collection.
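Applied to the Hashtable from the question, the same locking discipline might look like the sketch below. The class and method names are hypothetical, and it assumes (as in OpenJDK) that Hashtable's own methods and its entry set lock on the Hashtable instance, so holding that monitor keeps put() calls from other threads out for the duration of the loop.

```java
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class; its name and methods are illustrative, not from the post.
class SynchronizedIterationSketch {

    // Removes every odd key while holding the map's monitor for the whole traversal.
    static int removeOddKeys(Map<Integer, Integer> map) {
        int removed = 0;
        synchronized (map) { // lock the map itself, not its entry set
            for (Iterator<Map.Entry<Integer, Integer>> i = map.entrySet().iterator(); i.hasNext();) {
                if (i.next().getKey() % 2 != 0) {
                    i.remove(); // structural change made through the iterator
                    removed++;
                }
            }
        }
        return removed;
    }

    // Runs the traversal while another thread inserts even keys; returns the final size.
    static int demo() {
        Map<Integer, Integer> map = new Hashtable<>();
        for (int k = 0; k < 1000; k++) map.put(k, k);
        Thread writer = new Thread(() -> {
            for (int k = 1000; k < 3000; k += 2) map.put(k, k);
        });
        writer.start();
        removeOddKeys(map);
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return map.size(); // 500 even keys kept + 1000 inserted = 1500
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Without the synchronized (map) block, the writer's put() calls could interleave with the loop and the iterator could fail with a ConcurrentModificationException.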

Decagon answered 21/11, 2009 at 15:13 Comment(10)
Fixed the answer: the monitor used is actually the wrapper collection, not the original one.Decagon
That is super helpful and presented very well. I was having trouble finding a source that explains this clearly, so thanks very much!Lowe
"Iteration over collections in Java is not thread safe, even if you are using one of the synchronized wrappers" O_o terribleCoagulant
@Coagulant - this is just saying that the synchronized wrappers don't address all requirements. Some things need to be synchronized at a more coarse level.Tips
What do you mean java collections are not thread safe when iterating? The CopyOnWriteArrayList and ConcurrentHashMap are surely thread safe when iterating over them.Magnesium
@John W. Yes, those were particularly designed with concurrent access in mind. But the collections obtained by calling one of the Collections.synchronizedXxx are not easily iterable without a proper locking discipline which the client code has to provide.Decagon
shouldn't synchronized(m) { for(Object o : m){foo(o);} } be safe as well, or do you have to explicitly use the iterator? I wouldn't think this would give me a ConcurrentModificationException but it does...Cristincristina
The link is broken.Margiemargin
@schwiz for(l : list) ultimately uses an iterator as well.Rexrexana
SynchronizedCollection which is used for Collections.synchronizedList and value sets in Collections.synchronizedMap has the methods forEach and removeIf synchronized. These are great alternatives to iteration!Tris

I understand that collections like the Hashtable are synchronized

The Hashtable's entry set uses a SynchronizedSet, which is a type of SynchronizedCollection.

If you structurally modify a collection, synchronized or not, while an iterator over it is in use, and you do it by any means other than the iterator itself, a fail-fast iterator will throw a ConcurrentModificationException the next time it is used. (The check is best-effort, and the concurrent collections such as ConcurrentHashMap do not behave this way.)

An iterator is an Object that acts on a collection, being given the collection's state during construction. This lets you decide when you want to see the next item in the collection, if ever. Only use an iterator on a collection that you know won't be modified during the traversal, or that you will modify only through the iterator.

ConcurrentModificationException is thrown because of a check the iterator makes against the collection's current modification count: if the count doesn't match the value the iterator expects, the exception is thrown. The fail-fast java.util collections increment this modification count each time something is added or removed.
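The modification-count check can be observed in a single thread, with no concurrency involved. The following is a minimal sketch with hypothetical class and method names; it assumes the fail-fast behaviour of Hashtable's collection-view iterators as documented for java.util.Hashtable.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class; its name and methods are illustrative, not from the post.
class FailFastSketch {

    private static Map<Integer, Integer> sampleMap() {
        Map<Integer, Integer> map = new Hashtable<>();
        for (int k = 0; k < 3; k++) map.put(k, k);
        return map;
    }

    // Returns true if changing the map directly during iteration triggers the exception.
    static boolean modifyOutsideIterator() {
        Map<Integer, Integer> map = sampleMap();
        Iterator<Map.Entry<Integer, Integer>> i = map.entrySet().iterator();
        i.next();
        map.put(99, 99); // new key: the map's modification count changes behind the iterator
        try {
            i.next();    // the iterator's expected count no longer matches
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes every entry through the iterator and returns the remaining size.
    static int removeThroughIterator() {
        Map<Integer, Integer> map = sampleMap();
        Iterator<Map.Entry<Integer, Integer>> i = map.entrySet().iterator();
        while (i.hasNext()) {
            i.next();
            i.remove(); // keeps the iterator's expected count in step with the map
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(modifyOutsideIterator()); // true
        System.out.println(removeThroughIterator()); // 0
    }
}
```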

How does the iterator, in particular, do its synchronization, especially when it is using entrySet()

So the iterator doesn't do any synchronization of its own, and it is not safe to use when you expect the collection to be modified by other threads (or by the current thread outside of the iterator).

However, SynchronizedCollection does provide a way to go through the collection synchronously. Its implementation of the forEach method is synchronized.

public void forEach(Consumer<? super E> consumer)

Just keep in mind that the default forEach uses an enhanced for loop, which uses an iterator internally. This means forEach is only for reviewing the collection's contents, not for modifying the collection from inside the callback. Otherwise a ConcurrentModificationException will be thrown.
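As a rough illustration of these bulk operations, the sketch below reads the entries with forEach and deletes entries with removeIf, which the comments on the other answer mention as a synchronized alternative to hand-written iteration. The class and method names are hypothetical, and it assumes a Java 8+ runtime where the synchronized wrappers override both methods to hold the lock for the whole call (as OpenJDK does).

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo class; its name and methods are illustrative, not from the post.
class BulkOperationSketch {

    // Read-only pass: sums all values inside a single synchronized forEach call.
    static int sumValues(Map<Integer, Integer> map) {
        int[] sum = {0}; // one-element array so the lambda can update it
        map.entrySet().forEach(e -> sum[0] += e.getValue());
        return sum[0];
    }

    // Removal pass: drops entries with negative values in one removeIf call.
    static boolean removeNegative(Map<Integer, Integer> map) {
        return map.entrySet().removeIf(e -> e.getValue() < 0);
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new Hashtable<>();
        map.put(1, 5);
        map.put(2, -3);
        map.put(3, 7);
        System.out.println(sumValues(map));      // 9
        System.out.println(removeNegative(map)); // true
        System.out.println(sumValues(map));      // 12
    }
}
```

Each call is atomic on its own; two consecutive calls are not, so another thread may still change the map between them.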

can someone explain to me how it works, and at what point(s) access is restricted to concurrent calls

SynchronizedCollection causes threads to take turns accessing the collection whenever they call its synchronized methods, such as add, remove, and forEach.

It works by introducing a synchronized block similar to how it's shown in the following code:

public boolean add(Object o) {
    synchronized(this) {
        // the lock is held only for the duration of this single call
        return super.add(o);
    }
}

A synchronized block is introduced around all of the operations you can perform on the collection except for the following methods:

iterator(), spliterator(), stream(), parallelStream()
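For the unsynchronized methods listed above, the caller has to supply the lock. A minimal sketch for stream(), using a hypothetical class and a list created with Collections.synchronizedList, could look like this; the terminal operation has to finish inside the synchronized block.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; its name and methods are illustrative, not from the post.
class StreamLockSketch {

    // Sums the list through a stream while holding the wrapper's monitor.
    static int sum(List<Integer> syncList) {
        synchronized (syncList) { // same monitor the wrapper's own methods use
            return syncList.stream().mapToInt(Integer::intValue).sum();
        }
    }

    public static void main(String[] args) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        list.add(1);
        list.add(2);
        list.add(3);
        System.out.println(sum(list)); // 6
    }
}
```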

Java Official Documentation

Tris answered 5/1, 2019 at 22:1 Comment(0)
