The JDK core libs in general do not specify, or even footnote, implementation details in their javadoc about all sorts of highly useful threading matters: they rarely document their 'color', and indeed don't document their locks very well. They should - this directly impacts a library caller's code and thus ought to be documented. Point is, even if you think 'this is a bug, can it be fixed?', the 'fix' would be: "Now the javadoc is updated to tell you that, indeed, it uses `synchronized`".
Situation: No need to modify it
If the map is fully formed before you start the first job (i.e. no method is called on that instance of `ConcurrentHashMap` that modifies it - no `put`, no `computeIfAbsent`, no `clear`, and so on), then you can just continue to use CHM: its read-only methods (`get`, `getOrDefault`, `size()`, that sort of stuff) do not `synchronized () {}` on anything.
The answer in this scenario would be: "Nope, you're good. Don't worry that `synchronized` is in the source code of ConcurrentHashMap.java - it will not affect you at all here".
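A minimal sketch of that 'read-only after construction' pattern (class and method names here are invented for illustration; assumes Java 21 for `Executors.newVirtualThreadPerTaskExecutor`):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.LongAdder;

public class ReadOnlyChmDemo {
    // Built fully before the first job is submitted; never modified afterwards.
    static final Map<String, Integer> PRICES = new ConcurrentHashMap<>(
            Map.of("apple", 3, "pear", 4, "plum", 2));

    public static long totalOf(String item, int jobs) {
        LongAdder total = new LongAdder();
        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < jobs; i++) {
                // getOrDefault takes no monitor; any number of VTs can read concurrently.
                pool.submit(() -> total.add(PRICES.getOrDefault(item, 0)));
            }
        } // close() waits for all submitted jobs to finish
        return total.sum();
    }
}
```

Every virtual thread only reads the map, so none of the `synchronized` inside CHM's source ever runs on its behalf.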
Situation: Modify during job
If the task does require the ability for a job (or some other process, after the first job starts) to modify the map, then fundamentally, you need to set up synchronization. You can't just tell the system to make a few million jobs and run em in whatever way it pleases (such as: make a few million VTs, toss em all in a queue, get a bunch of carriers to muddle through the lot), because fundamentally the job at hand just cannot be done that way. That's what 'synchronized' is all about, really: if you have a need to send data from one job to another job, then that's actually really complicated, and requires either an externally synced system (such as a DB, or a message queue), or synchronization primitives that tell the CPU to do the job (`synchronized`, `volatile`, or primitive constructs such as Compare-And-Set, used by stuff like `AtomicReference`, and available in various core APIs).
All those primitives have a significant cost associated to them. This should be obvious: The amount of time a CPU needs to read some data from one of its L1 cache blocks is on the order of magnitude of a single cycle of a single pipeline, whereas attempting to send any data to another core is hundreds of such cycles. The sheer distance between cores already means it cannot, by way of the speed of light, be done in a single cycle.
CHM can't magic that away.
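To make one of those primitives concrete, here is a hedged sketch of a Compare-And-Set retry loop (class and method names invented for illustration): many jobs publishing a running maximum without ever touching a monitor.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    // Publish a candidate maximum without any monitor: retry the
    // compare-and-set until our value is either installed or obsolete.
    public static void offer(int candidate) {
        int current;
        do {
            current = max.get();
            if (candidate <= current) return; // a larger value is already published
        } while (!max.compareAndSet(current, candidate));
    }
}
```

Even this 'cheap' primitive still forces cross-core communication on every contended update - which is the cost described above, just without the monitor bookkeeping.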
VTs are indeed 'not great' in this situation, to put it mildly. So simply don't use them here. Which sounds like useless advice - but perhaps not quite:
Solution 1 - split up a single job
Can you insulate the part of the code that needs to interact with that map (which is, by definition - or we wouldn't be in this section of the answer - a map shared amongst the VTs)? That is, can you extract a whole bunch of code that can be put in map-reduce terms: the code gets the data it needs as parameters, and reports its result by returning it. It does not modify any fields, nor read any fields that are modified by anything outside the job's own code - i.e. it does not need that map at all.
If that's possible, then do that: have a separate handler that extracts what each job needs from the CHM, registers the job, and processes job-now-done reports, re-integrating each result back into the CHM. At that point it might be worthwhile to reduce that part of the total task (the code that interacts with the map) to a single thread, as it will likely run at mostly single-threaded performance anyway due to all the `synchronized` going on - and then just use a plain jane `HashMap`.
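As a sketch of that split (all names invented; assumes Java 21 VTs, though any executor would do): the workers are pure functions, and a single coordinator thread owns a plain `HashMap`, so the shared state needs no synchronization at all.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitJobDemo {
    // The pure part: parameters in, result out, touches no shared state.
    static int work(int input) { return input * input; }

    // A single coordinator owns a plain HashMap; only finished results flow back.
    public static Map<Integer, Integer> run(List<Integer> inputs) throws Exception {
        Map<Integer, Integer> results = new HashMap<>();
        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            Map<Integer, Future<Integer>> pending = new HashMap<>();
            for (int in : inputs) pending.put(in, pool.submit(() -> work(in)));
            // Re-integration happens on this one thread, so no sync is needed.
            for (var e : pending.entrySet()) results.put(e.getKey(), e.getValue().get());
        }
        return results;
    }
}
```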
Solution 2 - different data structures
If that's not an option, you need a different data concept than what CHM provides. Can this stuff be put in terms of database accesses? Databases like postgres can be tasked to apply optimistic locking. Which may be a disaster here (if you do a lot of interaction with that map, in which case, it's probably fundamentally impossible to meaningfully speed it up with concurrency!), but if the job isn't 'primarily reading and writing that map', that might offer faster performance than CHM.
Solution 3 - different job delineation
If there are 1000 jobs and it is possible to group them into, say, 8 subsets with the property that no job in subset A affects what subset B has to do (a job might modify your CHM, but not in a way that has any effect whatsoever on jobs from subsets B, C, D, E, F, G, or H), then you can create 8 jobs instead of a thousand, each fully isolated (no need for a CHM: the job makes its own map, fills it, and returns it as its final product), and then simply combine your 8 sub-results into one larger result.
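A hedged sketch of that delineation (names invented; 8 subsets as in the example above): each subset builds a private `HashMap` on its own thread, and the cheap merge happens at the end on one thread.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

public class PartitionDemo {
    public static Map<Integer, Integer> run(int totalJobs, int subsets) throws Exception {
        List<Future<Map<Integer, Integer>>> parts;
        try (ExecutorService pool = Executors.newFixedThreadPool(subsets)) {
            parts = IntStream.range(0, subsets)
                .mapToObj(s -> pool.submit(() -> {
                    // Each subset builds a private map - no sharing, no CHM needed.
                    Map<Integer, Integer> local = new HashMap<>();
                    for (int job = s; job < totalJobs; job += subsets) local.put(job, job * 2);
                    return local;
                }))
                .toList();
        }
        Map<Integer, Integer> combined = new HashMap<>();
        for (var f : parts) combined.putAll(f.get()); // cheap single-threaded merge
        return combined;
    }
}
```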
Solution 4 - put the hammer away
If none of the above solutions are an answer, then the job just fundamentally is what it is. VTs aren't magic 'go fast' juice. They are a way to take a job that can be done quickly on current hardware, but where prior to the introduction of VTs, it was somewhat annoying and complicated to write the code in the java language such that it would hit the performance peak that the problem in theory could reach - and make it much, much easier to write it.
They don't open doors to writing code that you never could write before. This is more generally true of all language features (languages are Turing complete - you can do everything a computer can do in any of them; it's just that some jobs are needlessly convoluted to write in a given language, and a new lang feature might address that).
If you already had code that worked without VTs, then maybe that was already near peak performance, and refactoring it to use VTs neither makes it go any faster, nor makes it easier to maintain/understand.
Solution 5 - don't worry about it
Perhaps you are using CHM not for its 'concurrency' aspects at all, but solely for its API: CHM has various methods that normal maps do not have, which really have nothing to do with 'concurrent' specifically. For example, `chm.keySet(defaultValue)` gives you a set which has a working implementation of `add`: adding things to the key set adds them to the map, with that default value as the 'value' part. HashMap does not have this method. I'm not sure why not.
Similarly, there's stuff like `reduceEntriesToInt`, which you may like. You can probably replace that by using a plain jane HashMap and `hashMap.entrySet().stream().reduce` instead.
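For instance, a plain-HashMap stand-in for summing values - the kind of thing `reduceEntriesToInt` does (sketch only; class and method names invented):

```java
import java.util.Map;

public class StreamReduceDemo {
    // CHM version would be something like:
    //   chm.reduceEntriesToInt(1, e -> e.getValue(), 0, Integer::sum)
    // Plain-map equivalent via streams:
    public static int sumValues(Map<String, Integer> map) {
        return map.entrySet().stream()
                  .mapToInt(Map.Entry::getValue)
                  .sum();
    }
}
```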
However, if this is what you're doing:
- Each job makes a CHM. This object is never shared with any other job. It's made by the method, never assigned to a field, just used and ditched.
- It's not a plain HashMap only because you need those methods CHM has that HM doesn't.
Then CHM ends up doing 'useless synchronizing' - being the only thread that ever syncs on an object, with no other thread even having a reference to the monitor, let alone attempting to `synchronized()` on it. Hotspot should be quite good at eliminating those monitor acquires. Do you have an actual performance problem (as in: you measured it, and it's slower than you thought it should be), or are you merely afraid that the maxim 'VTs and `synchronized` don't play nice with each other' means that using job-local CHMs is a problem? That maxim is an oversimplification: the actual rule is 'VTs and waiting to acquire monitors don't play nice with each other' - and that waiting would never happen if each CHM is job-local.
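To illustrate 'job-local': in a sketch like the following (invented names), the CHM never escapes the method, so no other thread can ever contend for its monitors:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class JobLocalChmDemo {
    // The CHM below is created, used, and dropped inside this one method.
    // No other thread ever sees it, so no monitor is ever contended - and
    // Hotspot can often elide the uncontended acquires entirely.
    public static int countDistinct(int[] data) {
        Map<Integer, Boolean> seen = new ConcurrentHashMap<>(); // never escapes
        for (int d : data) seen.putIfAbsent(d, Boolean.TRUE);
        return seen.size();
    }
}
```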
Run your profiler to make sure, but the JVM sees lots of 'useless synchronizing', and Hotspot is all about optimizing common patterns. That's good news:
- useless sync is a common pattern.
- useless sync is easy to optimize (just... don't acquire anything)
- useless sync is easy enough to detect.
- Therefore, JVM hotspot engines are good at eliminating that cost, and will do it cheaply.
Either that last point is correct, or 'JVM hotspot engineers are incompetent' is. The latter could be true, but that's an extraordinary claim with no facts in evidence to support it.
Comments:
- `System.out.println` is synchronised, and so all threads are contending for the very same monitor. Try commenting out the println inside the `computeIfAbsent`. – Disremember
- `computeIfAbsent()`? Any non-trivial java software has that. Where do VTs fit in non-trivial software? – Dagda
- It's not that `ConcurrentHashMap` itself is much of a problem, but performing long tasks in locking operations like `computeIfAbsent` is. – Sturgeon
- From the docs of `computeIfAbsent`: "Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple." Performing operations that acquire a lock within the function surely does not qualify. And it defeats the entire purpose of a `ConcurrentHashMap`. – Postage