Java multi-threaded program using join() gives wrong results in calculating sum of adjacent numbers
Asked Answered
J

5

6

I wrote this simple multi-threaded program to add numbers from 1 to 100,000. When I ran this, I got different values as the end result ( values less than expected 5000050000 ). When I executed the program only using one thread, it gave correct result. Program also works for smaller values such as 100. What did possibly go wrong ? Thanks in advance.

class Calculation {

  private long total=0;
  public void calcSum(long start, long end) {
      long i = start;
      for( ;i<=end; i++) {
          total += i;
      }
  }
  public long getTotal() {
     return total;
  }
}

class CalculatorThread extends Thread{
   private long start;
   private long end;
   private Calculation calc;
   public CalculatorThread(long start, long end, Calculation calc) {
       this.start = start;
       this.end = end;
       this.calc = calc;
   }
   @Override
   public void run() {
       calc.calcSum(start, end);
   }
}

public class ParallelTest {

  public static void main(String[] args) throws InterruptedException {

      int start = 1;
      int end = 100000;

      Calculation calc = new Calculation();
      CalculatorThread ct1 = new CalculatorThread(start, end/2 , calc);
      CalculatorThread ct2 = new CalculatorThread( (end/2) + 1, end, calc);

      ct1.start();
      ct2.start();

      ct1.join();
      ct2.join();

      System.out.println(calc.getTotal());
  }
}
Jaimeejaimes answered 27/5, 2018 at 22:3 Comment(0)
K
3

Unsynchronized access to shared mutable state usually doesn't go well.

In your Calculation calc, there is a mutable variable long total. When you start the threads:

  CalculatorThread ct1 = new CalculatorThread(start, end/2 , calc);
  CalculatorThread ct2 = new CalculatorThread( (end/2) + 1, end, calc);

you share the mutable state of calc between those two threads. There is no synchronization whatsoever inside calc, so the threads will only trash each other's memory in random time intervals.

Here is a working version:

class ParallelSum {
  public static long calcSum(long start, long end) {
      long total = 0;
      for(long i = start; i < end; i++) {
          total += i;
      }
      return total;
  }

  public static class CalculatorThread extends Thread {
     private long result = 0;
     private long start;
     private long end;
     public CalculatorThread(long start, long end) {
         this.start = start;
         this.end = end;
     }
     @Override
     public void run() {
         result = calcSum(start, end);
     }
     public long getResult() {
         return result;
     }
  }

  public static void main(String[] args) throws InterruptedException {
    int start = 1;
    int end = 100000;
    int endExcl = end + 1;

    CalculatorThread ct1 = new CalculatorThread(start, endExcl/2);
    CalculatorThread ct2 = new CalculatorThread(endExcl / 2, endExcl);

    ct1.start();
    ct2.start();

    ct1.join();
    ct2.join();

    System.out.println(ct1.getResult() + ct2.getResult());
  }
}

Outputs:

5000050000

Additional note: always use [inclusive, exclusive) indexing of ranges. This decreases the chance for off-by-one errors dramatically. Also, I've replaced the Calculation class by a method: nothing can go wrong with local variables, inside a method, and the less mutable state there is, the better.

Kitten answered 27/5, 2018 at 22:20 Comment(0)
C
3

The class Calculation is not thread safe, and you are using the same instance on two threads. So each thread is replacing the value set by the other.

A simple solution will be to create two Calculation instances, one for every CalculatorThread.

Calculation calc1 = new Calculation();
Calculation calc2 = new Calculation();
CalculatorThread ct1 = new CalculatorThread(start, end/2 , calc1);
CalculatorThread ct2 = new CalculatorThread( (end/2) + 1, end, calc2);

This is a classic calculation that can be broken down to independent parts (that can execute concurrently) and reduced to a single result at the end.

Casaba answered 27/5, 2018 at 22:26 Comment(0)
K
0

total += i is not atomic (see this answer). You have two threads modifying the same value total at the same time, and the two threads are interfering with each other.

Take a look at AtomicLong (javadoc), which has a method addAndGet that atomically adds some number:

class Calculation {
    private AtomicLong total = new AtomicLong();
    public void calcSum(long start, long end) {
        long i = start;
        for( ;i<=end; i++) {
            total.addAndGet(i);
        }
    }
    public long getTotal() {
       return total.get();
    }
}

With this approach, both threads can add to total atomically, so they won't interfere anymore.

Knap answered 27/5, 2018 at 22:23 Comment(3)
Those two calculations can be performed completely independently, one single + is required in the very end. No reason to share any state at all between threads here.Kitten
Indeed. But that's only because the OP is learning/practicing with a toy problem. Using atomic operations is a more general solution, so I think my answer is more educational. (What if there was a third thread that needed to periodically print out the "current total"?)Knap
Fair enough, one should know about all possible options and techniques. Showing how it works with atomics certainly has educational value. :]Kitten
P
0

this is not the easiest way to achieve your result. If you want to calculate the sum of numbers from 1 to N you should use this formula: sum from 1 to N = N(N+1)/2. You are facing with what is called inconsistent reading. Your two threads are sharing the same field "calc" and expecially the long field total of Calculation class. With your solution this can happens:

trhead1 reads a total equals to 0, thread1 increment the value of total to 1, thread2 reads a total equals to 0, thread2 increment the value of total to 1.

Other scenarios are possible with your solution, the fact is that in this way the value of total is not predictible.So use the provided formula or a simple for, don't use async solution for a sync problem.

Pendley answered 27/5, 2018 at 22:37 Comment(0)
F
0

The fundamental problem with your code is that the Calculation is effectively a wrapper for a shared variable (the count field) which is being updated without any synchronization.

There are three ways to solve this:

  • synchronize the variable; e.g. by using synchronized at the appropriate points
  • use an atomic type for the variable
  • don't use a shared variable.

The third option is better from a performance perspective, and the implementation is straight-forward; see @JenniferP's answer.

Note that no extra synchronization is required if you give each thread its own Calculation object. The happens before relations that are associated with the Thread::start() and Thread::join() calls ensure that the main thread sees the correct results from the child threads when it reads them after the join.


For what it is worth, a thread-safe implementation of Calculation that will work when the instance is shared would look like this:

  class Calculation {

      private long total = 0;

      public void calcSum(long start, long end) {        
          for (long i = start; i <= end; i++) {
              increment(i);
          }
      }

      public synchronized long getTotal() {
         return total;
      }

      private synchronized void increment(long by) {
         total += by;
      }
  }

Note that you need to synchronize both on the getter and the increment method. An alternative is to use an AtomicLong to represent the total. But in both cases there will be a significant amount of overhead due to the use of the shared variable. (In the version with synchronized, the overhead could be extreme due to lock contention. I would be surprised if there was any speedup at all compared with the single-threaded version. Even if you could eliminate the thread creation overheads.)

Floribunda answered 27/5, 2018 at 23:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.