keySet() method in HashMap could be terser [duplicate]

Asked 20/11, 2020 at 4:12 Answered 20/11, 2020 at 6:27

JDK 8 on mac OS, looking at following code from HashMap.java:

    public Set<K> keySet() {
        Set<K> ks = keySet;
        if (ks == null) {
            ks = new KeySet();
            keySet = ks;
        }
        return ks;
    }

Any changes to the returned ks will reflect in keySet as they always point to the same underlying set, if this is true, can it be written as:

    public Set<K> keySet() {
        if (keySet == null) {
            keySet = new KeySet();
        }
        return keySet;
    }

Are the two code snippets behave equivalent?

If so, why HashMap uses the first variation rather than the second variation?

Hulk answered 20/11, 2020 at 4:12 Comment(0)

Caching to a local variable is done to improve performance. The generated bytecode is smaller, the field is read once and therefore a cache miss could occur only once, and some other things.

This is quite an advanced optimization and should be carried only on very frequently run pieces of code. The reason it was applied here, likely is because HashMap was written in Java 1.2, when the JIT was very basic and therefore things like these had a considerable impact.

In this case, it is also done to support multi-threaded access. HashMap is not synchronized, however it can be shared via safe publication if later it is not modified. If two threads execute the method concurrently, a race condition could occur: the first read in if(keySet == null) could read a newer value written by another thread and the second read in return keySet; read the older (null) value. Using a local variable ensures that if and return use the same reference when non-null. So it can never return null.

Opium answered 20/11, 2020 at 6:14 Comment(4)

So basically it's the same reason for copying the Lock to a local variable in ArrayBlockingQueue – Eluviation 20/11, 2020 at 6:30

It’s not a real synchronization; the read still is racy, so it’s possible that two or more threads create their own KeySet instance instead of using a single one. Further, if multiple threads use a single KeySet instance published via a data race, it’s only safe because KeySet is an immutable stateless view to the actual data of the HashMap. – Gook 20/11, 2020 at 12:15

What @Gook is saying does not conflict in any way with my answer. – Opium 23/11, 2020 at 6:18

Right. It’s an addendum, originally a response to a now-removed comment calling it “kind of synchronization”, which is it not, while the main point is in agreement with the answer, that this is a thread safe construct, which it wouldn’t be without the vocal variable. And this safety is important to make keySet()’s behavior read-only from the caller’s perspective, as the internal caching is an implementation detail. – Gook 23/11, 2020 at 11:4

-1

The local variable is saved only as an optimisation as pointed out by @Fransesco. It also avoids the new object creation in a few cases.

The implementation does not store any state internally, and operates on the underlying hashmap for all operations and changes to the set are expected as per the java docs

add and addAll are not allowed (throw an UnsupportedOperationException)
remove, retainAll are expected to modify the underlying map

For reference

   /**
     * Returns a {@link Set} view of the keys contained in this map.
     * The set is backed by the map, so changes to the map are
     * reflected in the set, and vice-versa.  If the map is modified
     * while an iteration over the set is in progress (except through
     * the iterator's own <tt>remove</tt> operation), the results of
     * the iteration are undefined.  The set supports element removal,
     * which removes the corresponding mapping from the map, via the
     * <tt>Iterator.remove</tt>, <tt>Set.remove</tt>,
     * <tt>removeAll</tt>, <tt>retainAll</tt>, and <tt>clear</tt>
     * operations.  It does not support the <tt>add</tt> or <tt>addAll</tt>
     * operations.
     *
     * @return a set view of the keys contained in this map
     */
    public Set<K> keySet() {
        Set<K> ks = keySet;
        if (ks == null) {
            ks = new KeySet();
            keySet = ks;
        }
        return ks;
    }

    final class KeySet extends AbstractSet<K> {
        public final int size()                 { return size; }
        public final void clear()               { HashMap.this.clear(); }
        public final Iterator<K> iterator()     { return new KeyIterator(); }
        public final boolean contains(Object o) { return containsKey(o); }
        public final boolean remove(Object key) {
            return removeNode(hash(key), key, null, false, true) != null;
        }
        public final Spliterator<K> spliterator() {
            return new KeySpliterator<>(HashMap.this, 0, -1, 0, 0);
        }
        public final void forEach(Consumer<? super K> action) {
            Node<K,V>[] tab;
            if (action == null)
                throw new NullPointerException();
            if (size > 0 && (tab = table) != null) {
                int mc = modCount;
                for (int i = 0; i < tab.length; ++i) {
                    for (Node<K,V> e = tab[i]; e != null; e = e.next)
                        action.accept(e.key);
                }
                if (modCount != mc)
                    throw new ConcurrentModificationException();
            }
        }
    }

AFAIK the behaviour is same in all platforms, not just Mac

Exalted answered 20/11, 2020 at 6:27 Comment(0)

Recommended topics

Hot tags