How does a weak hash map know to garbage-collect an object?
Asked Answered
C

4

15

I recently found out about the WeakHashMap data structure in Java.

However, I don't understand what it means by it garbage-collects a mapping when it is no longer in ordinary use. How does the data structure know I will no longer use a key in my program? What if I don't refer to a key for a long time?

Clausewitz answered 23/6, 2012 at 23:27 Comment(2)
See this question for details regarding when to use these Weak data structures.Valdivia
WeakHashMap doesn't 'know to garbage-collect' anything. The garbage collector does. All that WeakHashMap adds to that is the use of weak references, and a reference queue.Determine
S
20

However, I don't understand what it means by it garbage-collects a mapping when it is no longer in ordinary use.

OK. Under normal circumstances, when the garbage collector runs it will remove objects that your program can no longer use. The technical term is an "unreachable object", and it means that the program execution has no way to get hold of a reference to the object any more. An object that is unreachable, may be collected in the next GC cycle ... or not. Either way, it is no longer the application's concern.

In this case, the WeakHashMap uses a special class called WeakReference to refer to the keys1. A weak reference is an object that acts sort of like an indirect pointer (a pointer to an object holding a pointer). It has the interesting property that the garbage collector is allowed to break the reference; i.e. replace the reference it contains with null. And the rule is that a weak reference to an object will be broken when the GC notices that the object is no longer reachable via a chain of normal (strong) or soft references2.

The phrase "no longer in ordinary use" really means that the key object is no longer strongly or softly reachable; i.e. via a chain of strong and / or soft references.

How does the data structure know I will no longer use a key in my program?

The WeakHashmap doesn't do it. Rather, it is the GC that notices that the key is not strongly reachable.

As part of its normal traversal, the GC will find and mark all strongly reachable objects. Then it goes through all of the WeakReference objects and checks to see if the objects they refer to have been marked, and breaks them if they have not. (Or something like that ... I've never looked at the actual GC implementation. And it is complicated by the fact that it has to deal with SoftReference and PhantomReference objects as well.)

The only involvement that WeakHashmap has is that:

  • it created and uses WeakReference objects for the keys, and
  • it expunges hash table entries whose key WeakReferences have been cleared by the GC.

What if I don't refer to a key for a long time?

The criterion for deciding that a weak reference should be broken is not time based.

But it is possible that timing influences whether not a key is removed. For instance, a key could 1) cease to be strongly reference, 2) be retrieved from the map, and 3) be assigned to a reachable variable making it strongly referenced once again. If the GC doesn't run during the window in which the key is not strongly reachable, the key and its associated value will stay in the map. (Which is what you'd want to happen ...)


1 - Implementation detail: in recent Java releases, the weak references actually refer to the map's internal Entry objects rather than the keys. This allows broken references to be purged from the map more efficiently. Look at the code for details.
2 - Soft references are a kind of reference that the GC is allowed to break if there is a shortage of heap memory.

Svensen answered 24/6, 2012 at 0:19 Comment(0)
P
4

Java has a system of references where the language can tell your code whether or not some object is still in use. You can use references to detect when some object has been specifically identified as no longer in use or usable, and can then take action accordingly. This tutorial covers references in some depth, in case you're curious how to use them.

Internally, WeakHashMap likely uses these references to automatically detect when a given key cannot be used any more. The implementation can then remove those objects from the hash table so that they no longer take up any space.

Hope this helps!

Polymerize answered 23/6, 2012 at 23:29 Comment(0)
U
4

The JVM garbage collects in this order:

  1. Unreferenced objects
  2. Objects whose only reference is a WeakReference
  3. Objects whose only reference is a SoftReference

Normally, the garbage collector only garbage collects unreferenced objects.

A weak reference to an object does not count as a reference as far as the garbage collector is concerned. The garbage collector may, or may not collect them. Typically, it will not collect them unless memory is running low, but there are no guarantees.

If the JVM is about run out of memory, the garbage collector will collect softly referenced objects. All weak referenced objects will be garbage collected before any soft referenced objects are garbage collected.

From the javadoc of SoftReference:

All soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError

Uphill answered 24/6, 2012 at 0:2 Comment(0)
F
2

I read your question as asking about the specific wording "regular use" and so assume you already know about strong and weak references. Regular use refers to the case where a weak hashmap contains keys that are also referenced (strongly) by some other data structure(s). The existence of at least one strong reference to a key is "regular use." The key can't be garbage collected as long as this other data structure references it. When the other data structure is no longer reachable (pointers to it no longer exist), the key also becomes unreachable. It's no longer in regular use because the only reference to it is the weak one in the mapping. The garbage collector can eventually reclaim it, and the mapping disappears.

This comes up when you'd like to extend a type C by subclassing but can't: for example when C is an interface with many implementations. You can work around this problem by using a weak hashmap with key type C and new fields to be added wrapped in a new class E. Whenever you create a C instance, you also create an E instance and add the pair to the map, which is then used to access the new fields during the life of the C intance. When the C instance becomes garbage, the mapping and E instance do also. This is automatic because the hashmap is weak. If it weren't, it would have to be cleaned up manually in the same manner you have to explicitly free storage in languages with no garbage collector.

Formative answered 24/6, 2012 at 0:10 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.