Passing a List Iterator to multiple Threads in Java

I have a list that contains roughly 200K elements.

Am I able to pass the iterator for this list to multiple threads and have them iterate over the whole lot, without any of them accessing the same elements?

This is what I am thinking of at the moment.

Main:

public static void main(String[] args)
{
    // Imagine this list has the 200,000 elements.
    ArrayList<Integer> list = new ArrayList<Integer>();

    // Get the iterator for the list.
    Iterator<Integer> i = list.iterator();

    // Create MyThread, passing in the iterator for the list.
    MyThread threadOne = new MyThread(i);
    MyThread threadTwo = new MyThread(i);
    MyThread threadThree = new MyThread(i);

    // Start the threads.
    threadOne.start();
    threadTwo.start();
    threadThree.start();
}

MyThread:

public class MyThread extends Thread
{

    Iterator<Integer> i;

    public MyThread(Iterator<Integer> i)
    {
        this.i = i;
    }

    public void run()
    {
        while (this.i.hasNext()) {
            Integer num = this.i.next();
            // Do something with num here.
        }
    }
}

My desired outcome is that each thread processes roughly 66,000 elements, without locking the iterator too much and without any two threads accessing the same element.

Does this sound doable?

Andradite answered 5/2, 2016 at 11:4 Comment(5)
Using Java 8 Streams and parallel() seems to be the appropriate approach here. - Frissell
No, you can't (safely, with this code), because the hasNext and next calls are not atomic. - Dnieper
@AndyTurner With streams, the OP would not handle the iterators manually. - Frissell
It's very hard to do safely, but it's possible with next to no wait time if you synchronize only the index. - Respectability
@AndyTurner Yeah, that's possible with Java 8. - Respectability
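
The "synchronize only the index" idea from the comments could look roughly like the sketch below. It is only an illustration under some assumptions: the list supports fast random access (as an ArrayList does), nobody modifies it while the workers run, and summing stands in for the real per-element work. The names SharedIndexDemo and sumWithSharedIndex are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SharedIndexDemo {

    // Sums the list using threadCount threads that claim indices from one shared counter.
    static long sumWithSharedIndex(List<Integer> list, int threadCount) throws InterruptedException {
        AtomicInteger nextIndex = new AtomicInteger();
        AtomicLong sum = new AtomicLong();

        Runnable worker = () -> {
            int index;
            // getAndIncrement() is atomic, so no two threads receive the same index.
            while ((index = nextIndex.getAndIncrement()) < list.size()) {
                sum.addAndGet(list.get(index)); // stand-in for "do something with num"
            }
        };

        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread thread = new Thread(worker);
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            thread.join(); // wait until every worker has run out of indices
        }
        return sum.get();
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> list = IntStream.range(0, 200_000).boxed().collect(Collectors.toList());
        System.out.println(sumWithSharedIndex(list, 3));
    }
}
```

The only shared mutable state is the counter, so the threads never block each other; how many elements each thread ends up with depends on scheduling.
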
F
7

Do you really need to manipulate threads and iterators manually? You could use Java 8 Streams and let parallel() do the job.

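By default, a parallel stream runs on the common ForkJoinPool, whose parallelism is one less than the number of available processors; the calling thread also takes part in the work.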

Example :

list.stream()
    .parallel()
    .forEach(this::doSomething)
;

//For example, display the current integer and the current thread number.
public void doSomething(Integer i) {
  System.out.println(String.format("%d, %d", i, Thread.currentThread().getId()));
}

Result :

49748, 13
49749, 13
49750, 13
192710, 14
105734, 17
105735, 17
105736, 17
[...]
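
To see how the elements end up spread over threads, a self-contained variant of the snippet above might count the elements handled per thread instead of printing each one. This is a sketch only; ParallelStreamDemo and countPerThread are names invented for the example, and the counts will differ from run to run.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelStreamDemo {

    // Processes every element on a parallel stream and returns how many
    // elements each thread handled, keyed by thread name.
    static Map<String, AtomicInteger> countPerThread(List<Integer> list) {
        Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();
        list.parallelStream().forEach(num -> {
            // The real "do something with num" would go here; this sketch only counts.
            counts.computeIfAbsent(Thread.currentThread().getName(), name -> new AtomicInteger())
                  .incrementAndGet();
        });
        return counts;
    }

    public static void main(String[] args) {
        List<Integer> list = IntStream.range(0, 200_000).boxed().collect(Collectors.toList());
        countPerThread(list).forEach((thread, count) -> System.out.println(thread + ": " + count));
    }
}
```

Each element is visited exactly once, so the per-thread counts always add up to the size of the list.
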

Edit: if you are using Maven, you will need to add this configuration to pom.xml in order to use Java 8:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.3</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>
Frissell answered 5/2, 2016 at 11:20 Comment(5)
Looks like a great solution, but I am getting "Method references are not supported at this language level" when I try to do the above. Any ideas? - Andradite
Your project is not configured to use Java 8. Hint: if you use Maven, you need to add a piece of configuration. - Frissell
@TomWright I added the Maven configuration for Java 8. - Frissell
Thanks Arnaud, that got it working. I'm now testing the code in your answer. - Andradite
Note: if you need to control the number of threads spun up so as not to overwhelm external systems (for example, when you make external calls from inside your threads), you should use the subList approach mentioned below in conjunction with the ExecutorService mentioned below, with a fixed thread pool sized to the number of threads you wish to use. - Fasciate
S
3

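Since the next() method of a class that implements the Iterator interface changes the iterator's internal state, concurrent calls to next() need synchronization. The hasNext() check has to sit inside the same synchronized block; otherwise another thread can take the last element between the check and the call. The synchronization can be accomplished with a synchronized block on the iterator object, as follows:

Integer num = null;
synchronized (i)
{
    // Check and advance under the same lock so the pair is atomic.
    if (i.hasNext()) {
        num = i.next();
    }
}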

That said, I recommend using the Stream API as in the answer above if all you need is parallel processing of the list.
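
For completeness, here is a minimal sketch of the shared-iterator design from the question with the check and the advance done under one lock. It assumes the list contains no null elements (null is used as the "exhausted" signal) and that the list is not modified while the workers run; SharedIteratorDemo, takeNext and processAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SharedIteratorDemo {

    // Returns the next element, or null once the iterator is exhausted.
    // hasNext() and next() run under the same lock, so the pair is atomic.
    static Integer takeNext(Iterator<Integer> it) {
        synchronized (it) {
            return it.hasNext() ? it.next() : null;
        }
    }

    // Lets threadCount threads drain one shared iterator and returns the elements they saw.
    static Set<Integer> processAll(List<Integer> list, int threadCount) throws InterruptedException {
        Iterator<Integer> it = list.iterator();
        Set<Integer> seen = ConcurrentHashMap.newKeySet();

        Runnable worker = () -> {
            Integer num;
            while ((num = takeNext(it)) != null) {
                seen.add(num); // stand-in for "do something with num"
            }
        };

        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread thread = new Thread(worker);
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> list = IntStream.range(0, 200_000).boxed().collect(Collectors.toList());
        System.out.println(processAll(list, 3).size());
    }
}
```

Every element passes through the lock, so this only pays off when the per-element work is much more expensive than taking the lock.
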

Stereoscopic answered 5/2, 2016 at 11:40 Comment(1)
Without synchronization, what is a possible outcome here? Is it that i.next() could return the same value for multiple calling threads? - Hangman
G
2

You can't do it in a thread-safe way with a single unsynchronized iterator. I suggest using sublists:

List<Integer> sub1 = list.subList(0, 100);
List<Integer> sub2 = list.subList(100, 200);

The ArrayList#subList() method returns a view of the given list without copying any elements, so it stays valid as long as the list is not structurally modified afterwards. You can then iterate over each sublist in a different thread.
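
Applied to the three threads from the question, the sublist approach might be sketched like this. The partition helper and the SubListDemo class are invented for the example, and the sketch assumes the list is not modified while the threads run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SubListDemo {

    // Splits the list into `parts` contiguous subList views of near-equal size.
    static <T> List<List<T>> partition(List<T> list, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        int size = list.size();
        for (int p = 0; p < parts; p++) {
            // long arithmetic avoids int overflow for very large lists
            int from = (int) ((long) size * p / parts);
            int to = (int) ((long) size * (p + 1) / parts);
            chunks.add(list.subList(from, to));
        }
        return chunks;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> list = IntStream.range(0, 200_000).boxed().collect(Collectors.toList());

        List<Thread> threads = new ArrayList<>();
        for (List<Integer> chunk : partition(list, 3)) {
            System.out.println("chunk size: " + chunk.size());
            Thread thread = new Thread(() -> {
                for (Integer num : chunk) {
                    // Do something with num here.
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
    }
}
```

With 200,000 elements and three parts, the chunks hold 66,666, 66,667 and 66,667 elements. Each thread iterates only its own view, so no locking is needed.
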

Golub answered 5/2, 2016 at 11:16 Comment(0)
C
0

Hi, to protect your threads from deadlocks or starvation you can use an ExecutorService backed by a thread pool. This works better for me than using synchronized, locks or reentrant locks. You can also try the Fork/Join framework, but I haven't used it before. This is only sample code, but I hope you get the idea:

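import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public static void main(String[] args) {
    // Size the pool by the number of worker threads (three, as in the
    // question), not by the number of elements.
    ExecutorService executor = Executors.newFixedThreadPool(3);
    List<Future<Integer>> futureList = new ArrayList<>();
    // iteration code goes here
    executor.shutdown();
}

public class MyThread implements Callable<Integer> {

    @Override
    public Integer call() throws Exception {
        // code goes here!
        return 0; // placeholder result
    }
}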
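
A fuller sketch of the same idea, combined with sublists so that each task gets its own slice of the list, might look as follows. Summing is only a stand-in for the real work, and ExecutorChunksDemo and sumInChunks are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ExecutorChunksDemo {

    // Sums the list by giving each of threadCount tasks its own subList view.
    static long sumInChunks(List<Integer> list, int threadCount)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(threadCount);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int size = list.size();
            for (int t = 0; t < threadCount; t++) {
                int from = (int) ((long) size * t / threadCount);
                int to = (int) ((long) size * (t + 1) / threadCount);
                List<Integer> chunk = list.subList(from, to);
                tasks.add(() -> {
                    long sum = 0;
                    for (Integer num : chunk) {
                        sum += num; // stand-in for "do something with num"
                    }
                    return sum;
                });
            }

            long total = 0;
            // invokeAll() blocks until every task has completed.
            for (Future<Long> future : executor.invokeAll(tasks)) {
                total += future.get();
            }
            return total;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> list = IntStream.range(0, 200_000).boxed().collect(Collectors.toList());
        System.out.println(sumInChunks(list, 3));
    }
}
```

The pool size caps how many tasks run at once, which is the knob to turn when the per-element work calls external systems.
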
Cryoscopy answered 5/2, 2016 at 11:42 Comment(0)
E
0

If you use a parallel stream, your code runs across several threads, and the elements are divided more or less evenly among them for you:

list.parallelStream().forEach(this::processInteger);

This approach makes it really simple to code; all the heavy lifting is done by the JRE.

Also, regarding your code: extending Thread is considered bad style. Instead, implement Runnable and pass an instance to the Thread constructor.
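
A minimal sketch of the Runnable variant, assuming each task is handed its own list of numbers; SumTask and getSum are names made up for the illustration, and summing stands in for the real work.

```java
import java.util.Arrays;
import java.util.List;

class SumTask implements Runnable {

    private final List<Integer> numbers;
    private long sum; // read this only after the thread has been joined

    SumTask(List<Integer> numbers) {
        this.numbers = numbers;
    }

    @Override
    public void run() {
        long total = 0;
        for (Integer num : numbers) {
            total += num; // stand-in for "do something with num"
        }
        sum = total;
    }

    long getSum() {
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        SumTask task = new SumTask(Arrays.asList(1, 2, 3, 4));
        Thread thread = new Thread(task); // the task is passed in, Thread is not subclassed
        thread.start();
        thread.join(); // after join(), the write to sum is visible to this thread
        System.out.println(task.getSum()); // prints 10
    }
}
```

Keeping the work in a Runnable means the same task can later be handed to an ExecutorService without changes.
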

Esma answered 5/2, 2016 at 12:3 Comment(0)
