Divide an uneven number of elements between threads

6

I am just learning threads in Java and I want to sort a list of words alphabetically. My program reads the words from a txt file and puts them into a String array. The user can choose how many threads to use. I want to split the array into chunks that are as even as possible, so that each thread can sort its own chunk.

So to my question:

How can I split the array (array.length elements) as evenly as possible across the threads? My mind is blanking and I can't think of a smart way to do this.

E.g.: if I have an array.length of 22 and 4 threads, how can I give the threads pieces of the array sized 6, 6, 5 and 5 in this case? The solution needs to be applicable to every number given.

I tried to explain it as best I could; please ask if something is unclear. Thank you!

Gordan answered 18/4, 2016 at 8:28 Comment(1)
The fact that you are doing this to split work amongst threads is largely irrelevant - you appear to be asking how to split an array into N roughly-equally-sized chunks.Prodrome
6

It doesn't need to be split as evenly as possible. If one thread gets 6 elements, that thread determines how long the whole job takes, so it doesn't matter how many of the others get up to 6.

You can do

int chunkSize = (tasks + threads - 1) / threads; // divide by threads, rounded up
for (int t = 0; t < threads; t++) {
    int start = t * chunkSize;
    int end = Math.min(start + chunkSize, tasks);
    executor.submit(() -> {
        // inside the thread
        for (int i = start; i < end; i++) {
            process(i);
        }
    });
}

Note: if you use Stream.of(array).parallel() it actually creates two tasks per thread. This mitigates the problem that some batches might take longer than others even though they have the same number of elements.
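
For the original word-sorting use case, here is a minimal sketch of that parallel-stream route (not from the answer above; the words array is just made-up sample data):

import java.util.Arrays;
import java.util.stream.Stream;

public class ParallelSortSketch {
    public static void main(String[] args) {
        String[] words = {"pear", "apple", "orange", "banana", "kiwi"};

        // A parallel stream lets the common ForkJoinPool split the work
        // internally, so no manual chunk arithmetic is needed.
        String[] sorted = Stream.of(words)
                .parallel()
                .sorted()
                .toArray(String[]::new);

        System.out.println(Arrays.toString(sorted)); // [apple, banana, kiwi, orange, pear]

        // Arrays.parallelSort(words) is another built-in alternative.
    }
}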

Yolandayolande answered 18/4, 2016 at 8:35 Comment(2)
@Marco13 the longest-running thread will most likely have two tasks, and this will determine how long it takes for them all to complete. I see your point though +1. Note: computers rarely scale linearly with the number of CPUs you use at once. If you have 5x2 task threads they might complete faster than 2x2 + 6x1, as the workload is even.Yolandayolande
Maybe I misunderstood something here, but for 10 tasks and 8 threads, this seems to assign -4 (!) elements to the last thread (I didn't do the maths, just tried it - it's certainly only a minor bug, but seems wrong to me). Beyond that, of course, there are many subtleties involved: Unrelated system workload, number of real cores vs. virtual cores, other threads that are running, the kind of computation that is done there (IO vs. arithmetic), and the general input numbers (e.g. feeding 4 threads with 100000 elements vs. 7 elements - in the first case, +/- 100 may not matter much...)Samala
7

Let me just take your example as it will be easy to explain. 22 elements amongst 4 threads.

22 % 4 = 2. This gives you the number of threads that will get one element more than the remaining threads.

22 / 4 = 5. This gives you the minimum number of elements per thread.

Now start handing out chunks of 5 elements, one chunk per thread, until you are left with the last (22 % 4 =) 2 threads; assign those the remaining (5 + 1 =) 6 elements each.
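
A minimal sketch of this arithmetic (not part of the original answer; the numbers are just the example from the question):

public class ChunkSizes {
    public static void main(String[] args) {
        int elements = 22;
        int threads = 4;
        int base = elements / threads;   // 5: minimum number of elements per thread
        int extra = elements % threads;  // 2: number of threads that get one element more
        int[] sizes = new int[threads];
        for (int t = 0; t < threads; t++) {
            // the last 'extra' threads get base + 1 elements, the rest get base
            sizes[t] = base + (t >= threads - extra ? 1 : 0);
        }
        System.out.println(java.util.Arrays.toString(sizes)); // [5, 5, 6, 6]
    }
}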

Gelatinize answered 18/4, 2016 at 8:37 Comment(0)
1

In order to make sure that the threads have a "similar" workload, it is important to find an even distribution. This is particularly important when the number of threads is "high" compared to the number of elements. In this case, one should make sure that the numbers of elements that the threads are responsible for differ by at most 1.

To achieve this, you can compute the remainder of dividing the number of elements (the array length, in your case) by the number of threads, and distribute this remainder, one by one, among the tasks.

I had the same problem a while ago. In fact, I tried to solve it in a slightly more general form, for some ParallelRangeExecutor class, which required the computation of the start and end indices of the intervals of an arbitrary range (which does not need to start at index 0). The following is "extracted" from this class:

import java.util.Arrays;

public class EvenTaskDistribution
{
    public static void main(String[] args)
    {
        test( 22, 4);
        test( 21, 4);
        test(100, 3);
        test(  3, 4);
    }

    private static void test(int numElements, int parallelism)
    {
        int taskSizes[] = computeTaskSizes(parallelism, 0, numElements);
        System.out.printf("Distributing %4d elements among %4d threads: %s\n",
            numElements, parallelism, Arrays.toString(taskSizes));
    }

    public static int[] computeTaskSizes(
        int parallelism, int globalMin, int globalMax)
    {
        if (parallelism <= 0)
        {
            throw new IllegalArgumentException(
                "Parallelism must be positive, but is " + parallelism);
        }
        if (globalMin > globalMax)
        {
            throw new IllegalArgumentException(
                "The global minimum may not be larger than the global " + 
                "maximum. Global minimum is "+globalMin+", " + 
                "global maximum is "+globalMax);
        }
        int range = globalMax - globalMin;
        if (range == 0)
        {
            return new int[0];
        }
        int numTasks = Math.min(range, parallelism);
        int localRange = (range - 1) / numTasks + 1;
        int spare = localRange * numTasks - range;
        int currentIndex = globalMin;
        int taskSizes[] = new int[numTasks];
        for (int i = 0; i < numTasks; i++)
        {
            final int min = currentIndex;
            final int max = min + localRange - (i < spare ? 1 : 0);
            taskSizes[i] = max - min; 
            currentIndex = max;
        }
        return taskSizes;
    }
}

The output is

Distributing   22 elements among    4 threads: [5, 5, 6, 6]
Distributing   21 elements among    4 threads: [5, 5, 5, 6]
Distributing  100 elements among    3 threads: [33, 33, 34]
Distributing    3 elements among    4 threads: [1, 1, 1]

(The last one shows one of the corner cases that one might have to take into account: one might expect [1, 1, 1, 0] here instead. This can easily be adjusted depending on the application.)

Samala answered 18/4, 2016 at 9:18 Comment(0)
0

You can do it in two phases. First: divide the length by the thread count, discarding the remainder, to get the base chunk size. Second: split the remainder between the chunks, adding 1 element per chunk until the remainder is used up; some chunks won't get the +1.
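
A compact way to express this two-phase split, as a sketch (the chunkStart helper is made up here, not from the answer): chunk t starts at t * base + min(t, remainder), so the first 'remainder' chunks are one element longer.

public class ChunkStart {
    // Start index of chunk t when splitting n elements into k chunks,
    // giving the first (n % k) chunks one extra element.
    static int chunkStart(int n, int k, int t) {
        int base = n / k;
        int extra = n % k;
        return t * base + Math.min(t, extra);
    }

    public static void main(String[] args) {
        int n = 22, k = 4;
        // Chunk t covers the half-open range [chunkStart(t), chunkStart(t + 1)).
        for (int t = 0; t < k; t++) {
            System.out.println("chunk " + t + ": [" + chunkStart(n, k, t)
                + ", " + chunkStart(n, k, t + 1) + ")"); // sizes 6, 6, 5, 5
        }
    }
}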

Patina answered 18/4, 2016 at 8:34 Comment(0)
0

Given n elements and k threads, you should assign 1 + n/k elements to the first n % k threads, and n/k elements to the remaining threads.

In your case you have n = 22 and k = 4, so n/k = 5 (rounded down) and n % k = 2: the first 2 threads get 5 + 1 = 6 elements each, and the remaining 2 threads get 5 each.

Clarsach answered 18/4, 2016 at 8:37 Comment(0)
0
  • Implementation of @MS Srikkanth's point.

    {
        int threadCount = 4;
        ExecutorService executorService = Executors.newFixedThreadPool(threadCount);
        int numberOfTasks = 22;
        int chunkSize = numberOfTasks / threadCount;
        int extras = numberOfTasks % threadCount;

        int startIndex, endIndex = 0;

        for (int threadId = 0; threadId < threadCount; threadId++) {
            startIndex = endIndex;
            if (threadId < (threadCount - extras)) {
                endIndex = Math.min(startIndex + chunkSize, numberOfTasks);
            } else {
                endIndex = Math.min(startIndex + chunkSize + 1, numberOfTasks);
            }

            int finalStartIndex = startIndex;
            int finalEndIndex = endIndex;
            executorService.submit(() -> {
                log.info("Running tasks from startIndex: {}, to endIndex: {}, total : {}",
                    finalStartIndex, finalEndIndex - 1, finalEndIndex - finalStartIndex);
                for (int i = finalStartIndex; i < finalEndIndex; i++) {
                    process(i);
                }
            });
        }
        executorService.shutdown();
    }
    
Venue answered 11/8, 2021 at 15:41 Comment(1)
Output is something like this: [pool-1-thread-1] Running tasks from startIndex: 0, to endIndex: 5, total : 5 [pool-1-thread-2] Running tasks from startIndex: 5, to endIndex: 10, total : 5 [pool-1-thread-4] Running tasks from startIndex: 16, to endIndex: 22, total : 6 [pool-1-thread-3] Running tasks from startIndex: 10, to endIndex: 16, total : 6Venue
