ConcurrentHashMap memory overhead
Asked Answered
G

3

11

Does somebody know what is the memory overhead of a ConcurrentHashMap (compared to a "classical" HashMap) ?

  • At construction ?
  • At insertion of an element ?
Guienne answered 19/6, 2012 at 14:32 Comment(3)
It doesn't make sense to create large number of ConcurrentHashMap give you only have a limited number of cores. The overhead of a small number of CHM is likely to be less than 1 cents worth of memory.Sadonia
@PeterLawrey I don't really get your point. "It's useless to create a large amount of ConcurrentHashMap", so what ? They still have an overhead. Beside, even if it's obviously weird to have many CHMs at the same time, one can easily imagine that short-living objects create a concurrent hash map at their construction (let's say a join operator in a DB oriented software ?).Guienne
A reason you might use concurrent collections is because you have more cores than collections. If you have many more collections than cores, it is highly unlikely you will have concurrent access.Sadonia
S
11

If you run the following with -XX:-UseTLAB -XX:NewSize=900m -mx1g on a 64-bit JVM.

public static void main(String... args) throws NoSuchMethodException, IllegalAccessException {
    for (int i = 0; i < 4; i++) {
        long used1 = usedMemory();
        populate(new HashMap());
        long used2 = usedMemory();
        populate(new ConcurrentHashMap());
        long used3 = usedMemory();
        System.out.println("The ratio of used memory is " + (double) (used3 - used2) / (used2 - used1));
        System.out.println("For an extra " + ((used3 - used2) - (used2 - used1)) / 1000000 + " bytes per entry was used.");
    }
}

private static void populate(Map map) {
    for (Integer i = 0; i < 1000000; i++)
        map.put(i, i);
}

private static long usedMemory() {
    return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
}

you get with Java 6 and 7 for one million entries.

The ratio of used memory is 1.1291128466982379
For an extra 8 bytes per entry was used.
The ratio of used memory is 1.1292086928728067
For an extra 8 bytes per entry was used.
The ratio of used memory is 1.1292086928728067
For an extra 8 bytes per entry was used.
The ratio of used memory is 1.1292086928728067
For an extra 8 bytes per entry was used.

Eight MB of memory costs around 5 cents.

Sadonia answered 19/6, 2012 at 15:33 Comment(7)
How reliable is that measurement of memory usage?Apotheosis
Can you explain the point of using thread local allocations? Thanks!Apotheosis
It may be different on Java 5 or other JVMs than HotSpot or OpenJDK but I would surprised if it is significantly different. The difference could be smaller on 32-bit JVMs.Sadonia
I am specifically turning them OFF because while they improve efficiency by allowing multi-threaded object allocation they make freeMemory() less accurate.Sadonia
Your code with empty maps shows that initial size of ConcurrentHashMap is 96 bytes higher than HashMap (224 vs 128 bytes). Quite remote from the 1,700 bytes figure mentioned in the other answer.Apotheosis
In other words, if I have a use case where I create 100.000's of CHM that are usually mostly empty (but can sometimes be accessed from multiple threads), I'm far better of using traditional synchronization on a traditional HM. 224 vs 128 bytes is quite a significant price to pay for convenience in this case.Lillalillard
@Lillalillard True, however if you have that many data structures, you will do much better with an actual class instead of a Map.Sadonia
C
5

ConcurrentHashMap doesn't use significantly more memory than HashMap, both at construction and at insertion.

At Intialization

ConcurrentHashMap uses almost the same amount of memory as a HashMap, may be slightly more for couple of extra book-keeping variables and locks.

During initialization, ConcurrentHashMap creates 16 Segments to store key-values, each Segment is equivalent to a HashMap.

The intial capacity/size of each Segment is 1/16 of the overall initial capacity. So in essence, ConcurrentHashMap creates 16 small HashMaps equivalent to one HashMap. Each Segment has own lock and couple of book-keeping variables (count, threshold etc), this is extra memory overhead.

You can control the number of Segments created by ConcurrentHashMap by passing appropriate value to concurrencyLevel parameter to the ConcurrentHashMap. Smaller this value, then less space will be used but more contention when high number of threads update the Map. Higher this value, then more Segments will be created but the performance of parallel updates will be faster. Note: Significantly higher value for concurrencyLevel parameter, affects both space and time.

This small overhead in memory is what a developer is willing to accept in exchange for concurrency.

At Insertion

When Segments get filled, the size of that Segment will be increased. The policy to increase the size is same as the HashMap. loadfactor parameter decides when to increase the size of the Segment. Note only that Segment which is filled will be increased. Once again, memory overhead is almost same as HashMap.

Overall, ConcurrentHashMap doesn't use significantly more memory than HashMap, but its really hard to measure each and every extra byte used by ConcurrentHashMap.

Commotion answered 19/6, 2012 at 16:18 Comment(0)
A
2

I don't really understand the premise of the question - either you need the concurrency or you do not.

However, according to this link, the memory footprint of an empty ConcurrentHashMap is 1700 bytes. It recommends that you use the ConcurrentHashMap if you have multiple threads that need read/write access, but a Hashtable if you have many threads that need read access but one with write.

Aisne answered 19/6, 2012 at 14:38 Comment(2)
The link is way too old to be relevant. Indeed, this article was published (04/28/2012) before Java 7 and the implementation has probably changed. Moreover, your answer is incomplete since I ask for insertion too (is there is one).Guienne
@Guienne Implementation probably hasn't changed that much, but in any case it describes the method they used to find the data. A bit of research shows that nobody else has yet done it.Aisne

© 2022 - 2024 — McMap. All rights reserved.