Memory visibility in Fork-join
Asked Answered
B

2

7

Brian Goetz's wrote a nice article on fork-join at http://www.ibm.com/developerworks/java/library/j-jtp03048.html. In it, he lists a merge sort algorithm using the fork-join mechanism, in which he performs the sort on two sides of an array in parallel, then merges the result.

The algorithm sorts on two different sections of the same array simultaneously. Why isn't an AtomicIntegerArray or some other mechanism necessary to maintain visibility? What guarantee is there that one thread will see the writes done by the other, or is this a subtly bug? As a follow up, does Scala's ForkJoinScheduler also make this guarantee?

Thanks!

Banana answered 26/1, 2011 at 0:52 Comment(2)
They're operating on different sections of the array. There's no contention until it comes to merging.Shortwinded
I agree that they're operating on different sections. But the semantics of Java's memory model more or less say that not all threads are guaranteed to see all writes (unless the variable is volatile). According to this blog: jeremymanson.blogspot.com/2009/06/volatile-arrays-in-java.html even using a volatile int[] isn't enough to guarantee that other threads see your writes to the arrayBanana
D
9

The join (of ForkJoin) itself requires a synchronization point, thats the most important piece of information. A synchronization point will ensure that all writes that happen are visible after said point.

If you take a look at the code you can see where the synchronization point occurs. This is just one method call invokeAll

public static void invokeAll(ForkJoinTask<?> t1, ForkJoinTask<?> t2) {
    t2.fork();
    t1.invoke();
    t2.join();
}

Here t2 forks into another process, t1 executes its task and that calling thread will wait on t2.join(). When passing t2. All writes to t1 and t2 will then be visible.

Edit: This edit is just to give a little more of an explanation of what I meant by synchronization point.

Lets say that you have two variables

int x;
volatile int y;

Any time you write to y all writes that happened before you read y will be available. For example

public void doWork(){
   x = 10;
   y = 5;
}

If another thread reads y = 5 that thread is guaranteed to read x = 10. This is because the write to y creates a synchronization point in which all writes before said point will be visible after the write.

With the Fork Join pool the join of a ForkJoinTask will create a synchronization point. Now if t2.fork() and t1.invoke() the joining of t2 will ensure that all writes that previously happened will be seen. Since all the previous writes are within the same structure it will be safe for visibility.

I would be happy to explain further if that isnt as clear.

Dismast answered 26/1, 2011 at 1:34 Comment(5)
Is invokeAll called by coInvoke?Harding
And, just to complement the answer, see also java.sun.com/docs/books/jls/third_edition/html/….Harding
@Danciel C. Sobral interesting you ask, when I was writing this example I was actually looking for the coInvoke method. It seems that the coInvoke method itself has been purged from the source code. That is at least I cannot find it any longer.Dismast
According to the document Daniel posted: "All actions in a thread happen-before any other thread successfully returns from a join() on that thread." I started reading the source code myself - it's very, very tricky stuff. It looks like the join guarantees the visibility though. I'd love to have some more details on this, but I'll mark this answer correct.Banana
@Joshua Hartman feel free to look at my edit for more of an explanationDismast
P
1

Just a guess: merge includes joining on a Thread, and the join guarantees the visibility.

The second part is sure; I don't know how is merge implemented.

Phototube answered 26/1, 2011 at 1:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.