Normally when one uses Java 8's parallelStream(), the result is execution via the default, common fork-join pool (i.e. ForkJoinPool.commonPool()).
That is clearly undesirable, however, if one has work that is far from CPU bound, e.g. may be waiting on IO much of the time. In such cases one will want to use a separate pool, sized according to other criteria (e.g. how much of the time the tasks are likely to be actually using the CPU).
There's no obvious means of getting parallelStream() to use a different pool, but there is a way as detailed here.
Unfortunately, that approach entails invoking the terminal operation on the parallel stream from a fork-join pool thread. The downside of this is that if the target-fork join pool is completely busy with existing work, the whole execution will wait on it while doing absolutely nothing. Thus the pool can become a bottleneck worse than single threaded execution. By contrast, when one uses parallelStream() in the "normal" fashion, ForkJoinPool.common.externalHelpComplete() or ForkJoinPool.common.tryExternalUnpush() are used to let the calling thread from outside the pool help in the processing.
Does anyone know of a way to both get parallelStream() to use a non-default fork-join pool and have a calling thread from outside the fork-join pool help in the processing of this work (but not the rest of the fork-join pool's work)?
get
on your tasks which is not in the common pool, it will still callForkJoinPool.common.tryExternalUnpush()
, but, of course, won’t find the task in the common pool’s queue. – Seisin