How can I wait for any/all pthreads to complete?

I just want my main thread to wait for any and all my (p)threads to complete before exiting.

The threads come and go a lot for different reasons, and I really don't want to keep track of all of them - I just want to know when they're all gone.

wait() does this for child processes, failing with ECHILD when there are no children left; however, wait() does not appear to work with (p)threads.

I really don't want to go through the trouble of keeping a list of every single outstanding thread (as they come and go), then having to call pthread_join on each.

Is there a quick-and-dirty way to do this?

Draper answered 27/5, 2011 at 15:35 Comment(0)
D
23

The proper way is to keep track of all of your thread IDs (the pthread_t values), but you asked for a quick and dirty way, so here it is. Basically:

  • just keep a total count of running threads,
  • increment it in the main loop before calling pthread_create,
  • decrement the thread count as each thread finishes.
  • Then sleep at the end of the main process until the count returns to 0.


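#include <pthread.h>
#include <unistd.h>

volatile int running_threads = 0;
pthread_mutex_t running_mutex = PTHREAD_MUTEX_INITIALIZER;

void *threadStart(void *arg)
{
   (void)arg;
   // do the thread work
   pthread_mutex_lock(&running_mutex);
   running_threads--;
   pthread_mutex_unlock(&running_mutex);
   return NULL;
}

int main(void)
{
  int i;
  int num_threads = 4;  // example value
  pthread_t tid;

  for (i = 0; i < num_threads; i++)
  {
     pthread_mutex_lock(&running_mutex);
     running_threads++;
     pthread_mutex_unlock(&running_mutex);
     // launch thread; detach it because nothing will join it
     if (pthread_create(&tid, NULL, threadStart, NULL) == 0)
     {
        pthread_detach(tid);
     }
     else
     {
        // creation failed, so undo the increment
        pthread_mutex_lock(&running_mutex);
        running_threads--;
        pthread_mutex_unlock(&running_mutex);
     }
  }

  // poll the counter once a second until every thread has finished
  while (running_threads > 0)
  {
     sleep(1);
  }
  return 0;
}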
Despumate answered 27/5, 2011 at 17:4 Comment(7)
This can be accomplished much more easily with a barrier instead of a counter and mutex. - Gothar
I like this solution - it hadn't occurred to me to simply count the running instances. I think you could even do away with the need for mutexes as the operations are all atomic, IINM. - Jalapa
The operations are definitely not atomic. The mutex is essential. Look up barriers though; they're a lot easier to use and do the counting for you. - Gothar
I just looked it up, yes, it's not atomic. Interesting though - what would be an intermediary state of an incrementation operation? Partially set/unset bits? Anyway, thanks for the note about barriers - I am investigating them, they sound interesting. - Jalapa
I don't like the solution - because it means I have to keep polling and sleeping - but it still is indeed the simplest! In reality - I would not fall into the while/sleep loop until I received a shutdown signal - so it wouldn't really burn any CPU cycles in the real world. - Draper
The mutexes are required to keep the increments/decrements atomic - unless you use an atomic inc/dec function (for which there are some GCC-specific hacks). You still, however, need a read memory barrier in the while() conditional. Thanks for all the help!! - Draper
@Jalapa An intermediary state might be loading the value to be incremented into a register. You can't be sure that the increment happens against memory directly. - Expository
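
For anyone who wants the counter without the sleep/poll loop: a condition variable lets the waiting thread block until the count drops to zero. This is only a minimal sketch under assumptions, not part of the answer above. The names running_count, finished_jobs, spawn_worker and wait_for_all are made up for illustration, and the workers are created detached since nothing joins them.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names throughout; none of them come from the answer. */
static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  count_zero  = PTHREAD_COND_INITIALIZER;
static int running_count = 0;   /* guarded by count_mutex */
static int finished_jobs = 0;   /* guarded by count_mutex, demo only */

void *worker(void *arg)
{
    (void)arg;
    /* ... do the thread work ... */
    pthread_mutex_lock(&count_mutex);
    finished_jobs++;
    if (--running_count == 0)
        pthread_cond_broadcast(&count_zero);   /* wake the waiter */
    pthread_mutex_unlock(&count_mutex);
    return NULL;
}

/* Start one detached worker. The count is raised before the thread exists. */
int spawn_worker(void)
{
    pthread_t tid;
    pthread_attr_t attr;
    int rc;

    pthread_mutex_lock(&count_mutex);
    running_count++;
    pthread_mutex_unlock(&count_mutex);

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    rc = pthread_create(&tid, &attr, worker, NULL);
    pthread_attr_destroy(&attr);

    if (rc != 0) {   /* creation failed: undo the increment */
        pthread_mutex_lock(&count_mutex);
        if (--running_count == 0)
            pthread_cond_broadcast(&count_zero);
        pthread_mutex_unlock(&count_mutex);
    }
    return rc;
}

/* Block without polling until every counted worker has finished.
   Returns the number of workers that ran to completion. */
int wait_for_all(void)
{
    int done;

    pthread_mutex_lock(&count_mutex);
    while (running_count > 0)
        pthread_cond_wait(&count_zero, &count_mutex);
    done = finished_jobs;
    pthread_mutex_unlock(&count_mutex);
    return done;
}
```

The idea would be to call spawn_worker() wherever threads get launched and wait_for_all() once before cleanup. It returns after the last worker has decremented the count, although that worker may still be executing its final return statement.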
D
27

Do you want your main thread to do anything in particular after all the threads have completed?

If not, you can have your main thread simply call pthread_exit() instead of returning (or calling exit()).

If main() returns it implicitly calls (or behaves as if it called) exit(), which will terminate the process. However, if main() calls pthread_exit() instead of returning, that implicit call to exit() doesn't occur and the process won't immediately end - it'll end when all threads have terminated.

Can't get too much quick-n-dirtier.

Here's a small example program that will let you see the difference. Pass -DUSE_PTHREAD_EXIT to the compiler to see the process wait for all threads to finish. Compile without that macro defined to see the process stop threads in their tracks.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <pthread.h>
#include <time.h>

// local millisecond sleep helper (not the POSIX sleep() from <unistd.h>)
static
void sleep(int ms)
{
    struct timespec waittime;

    waittime.tv_sec = (ms / 1000);
    ms = ms % 1000;
    waittime.tv_nsec = ms * 1000 * 1000;

    nanosleep( &waittime, NULL);
}

void* threadfunc( void* c)
{
    int id = (int)(intptr_t) c;   // the id travels by value inside the pointer
    int i = 0;

    for (i = 0 ; i < 12; ++i) {
        printf( "thread %d, iteration %d\n", id, i);
        sleep(10);
    }

    return 0;
}


int main()
{
    int i = 4;

    for (; i; --i) {
        // never freed: this is only a demo (see the comments)
        pthread_t* tcb = malloc( sizeof(*tcb));

        pthread_create( tcb, NULL, threadfunc, (void*)(intptr_t) i);
    }

    sleep(40);

#ifdef USE_PTHREAD_EXIT
    pthread_exit(0);
#endif

    return 0;
}
Depreciate answered 27/5, 2011 at 18:7 Comment(7)
Thanks for the reply! Actually - yes - the main thread needs to clean up/remove a shared memory segment - so I can't just call pthread_exit as you have described. (I now realize I should have stated this in the OP). Thanks for the response! - Draper
Lovely answer. I'm a little confused about why the return statement after the pthread_exit statement doesn't end the process though. Do we now have 2 different processes running, where one is terminated, and the other is not? I was under the (perhaps erroneous) impression that returning from main would destroy some process at least. - Agree
@Sammaron: if pthread_exit() is called, then the return statement at the end of main() is never executed. The thread is exited, but none of the machinery to tear down the process is performed. The OS will do that once all threads in the process have exited (or another thread can call a function like exit() to terminate the process). - Depreciate
Would not freeing tcb cause a memory leak? - Dyan
@razzak: yeah. This program was just a simple example to demonstrate the difference between calling exit() (or letting main() return) and calling pthread_exit() from the main program. It's not meant to show the complete correct handling of thread resources. - Depreciate
@MichaelBurr k thanks, I thought that pthread_exit() might be freeing the thread automatically when I saw your code snippet. - Dyan
Is there not still a risk that a thread created in the loop will end before the next thread is created? Ahhh, hence the sleep(). - Albarran
J
2

If you don't want to keep track of your threads then you can detach the threads so you don't have to care about them, but in order to tell when they are finished you will have to go a bit further.

One trick would be to keep a list (linked list, array, whatever) of the threads' statuses. When a thread starts it sets its status in the array to something like THREAD_STATUS_RUNNING and just before it ends it updates its status to something like THREAD_STATUS_STOPPED. Then when you want to check if all threads have stopped you can just iterate over this array and check all the statuses.

Don't forget though that if you do something like this, you will need to control access to the array so that only one thread can access (read and write) it at a time, so you'll need to use a mutex on it.
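
A rough sketch of what such a status table could look like, under some assumptions that are not in the answer: a fixed-size table (MAX_THREADS), detached threads, and the hypothetical helper names start_tracked_thread and all_threads_stopped. One deliberate deviation: the creating thread marks the slot as running before calling pthread_create, so there is no window in which a freshly created thread has not yet marked itself.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16   /* hypothetical fixed capacity */

enum thread_status {
    THREAD_STATUS_UNUSED,    /* zero value: slot never used */
    THREAD_STATUS_RUNNING,
    THREAD_STATUS_STOPPED
};

static enum thread_status status_table[MAX_THREADS];   /* guarded by status_mutex */
static pthread_mutex_t status_mutex = PTHREAD_MUTEX_INITIALIZER;

void *tracked_thread(void *arg)
{
    enum thread_status *slot = arg;

    /* ... do the thread work ... */
    pthread_mutex_lock(&status_mutex);
    *slot = THREAD_STATUS_STOPPED;   /* last thing the thread does */
    pthread_mutex_unlock(&status_mutex);
    return NULL;
}

/* Returns 0 on success, -1 if the table is full or creation fails. */
int start_tracked_thread(void)
{
    pthread_t tid;
    int i, slot = -1;

    pthread_mutex_lock(&status_mutex);
    for (i = 0; i < MAX_THREADS; i++) {
        if (status_table[i] != THREAD_STATUS_RUNNING) {
            /* The creator marks the slot before the thread exists. */
            status_table[i] = THREAD_STATUS_RUNNING;
            slot = i;
            break;
        }
    }
    pthread_mutex_unlock(&status_mutex);
    if (slot < 0)
        return -1;

    if (pthread_create(&tid, NULL, tracked_thread, &status_table[slot]) != 0) {
        pthread_mutex_lock(&status_mutex);
        status_table[slot] = THREAD_STATUS_STOPPED;   /* never started */
        pthread_mutex_unlock(&status_mutex);
        return -1;
    }
    pthread_detach(tid);
    return 0;
}

/* Returns 1 when no slot is marked as running, 0 otherwise. */
int all_threads_stopped(void)
{
    int i, stopped = 1;

    pthread_mutex_lock(&status_mutex);
    for (i = 0; i < MAX_THREADS; i++)
        if (status_table[i] == THREAD_STATUS_RUNNING)
            stopped = 0;
    pthread_mutex_unlock(&status_mutex);
    return stopped;
}
```

The main thread would still have to call all_threads_stopped() repeatedly, or pair the table with a condition variable, so this mostly pays off when the per-thread slots carry extra information.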

Jalapa answered 27/5, 2011 at 15:50 Comment(5)
This solution doesn't make things any easier. If you're going to make that ugly array (which you need to synchronize, by the way!), you could instead just store the pthread_t ids in it and pthread_join them all. - Gothar
You're right - but the OP does say that threads come and go and that he doesn't want to keep track of them, which I interpreted as meaning that he doesn't want to join all threads, but simply be able to wait for all current threads to end when an exit condition arises. As gravitron suggests, counting running threads would be simpler and avoid synchronisation, but the array approach adds flexibility (should it be required) so that each thread could be described by a struct containing more info such as when it was started, etc. to perhaps help with thread monitoring. - Jalapa
Regardless, your approach has serious bugs. A thread cannot set its own status to THREAD_STATUS_RUNNING because there will be a race condition before it's set. Instead, the creating thread needs to do this before calling pthread_create. There's a lot more synchronization that would need to be done to make your approach valid. If you're not an expert in this field, pthread_join (or barriers) would be a much simpler and much less error-prone solution. - Gothar
Absolutely - again, you're right, and your suggested method of avoiding the race condition is spot on. In fact, that's exactly how I solved it in one of my projects. I think gravitron or Michael Burr have solutions more in line with what the OP wants, I'm just giving food for thought :-) - Jalapa
I can't just call detach - because I need to do some cleanup in my main thread after all the threads have quiesced and exited. You are correct in that I was hoping for something much simpler than keeping track of (locking, synchronizing) and rejoining all the worker threads. Thanks for the reply! - Draper
O
1

You could keep a list of all your thread IDs and then call pthread_join on each one. Of course, you will need a mutex to control access to the thread ID list. You will also need some kind of list that can be modified while being iterated over, maybe a std::set<pthread_t>?

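#include <pthread.h>
#include <cstdio>
#include <set>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
// guarded by mutex; std::set needs pthread_t to be ordered, which holds
// on Linux/glibc but is not guaranteed by POSIX
std::set<pthread_t> threadIdList;

void CreateThread();
void RemoveThread();

// example thread body: do the work, then deregister
void *ThreadInit(void *)
{
   RemoveThread();
   return NULL;
}

int main() {
   for (int i = 0; i < 4; i++)
      CreateThread();

   pthread_mutex_lock(&mutex);

   void *data;
   while (!threadIdList.empty()) {
      // claim one id; no iterator is held while the mutex is unlocked
      pthread_t threadId = *threadIdList.begin();
      threadIdList.erase(threadIdList.begin());

      pthread_mutex_unlock(&mutex);
      pthread_join(threadId, &data);
      pthread_mutex_lock(&mutex);
   }
   pthread_mutex_unlock(&mutex);

   printf("All threads completed.\n");
}

// called by any thread to create another
void CreateThread()
{
   pthread_t id;

   pthread_mutex_lock(&mutex);
   if (pthread_create(&id, NULL, ThreadInit, NULL) == 0)
      threadIdList.insert(id);
   pthread_mutex_unlock(&mutex);
}

// called by each thread before it dies
void RemoveThread()
{
   pthread_mutex_lock(&mutex);
   // if main has not claimed this id, nobody will join us, so detach
   if (threadIdList.erase(pthread_self()) == 1)
      pthread_detach(pthread_self());
   pthread_mutex_unlock(&mutex);
}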
Orms answered 23/2, 2014 at 21:32 Comment(0)
D
0

Thanks all for the great answers! There has been a lot of talk about using memory barriers etc - so I figured I'd post an answer that properly showed them used for this.

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

#define NUM_THREADS 5

unsigned int thread_count;
void *threadfunc(void *arg) {
  printf("Thread %p running\n",arg);
  sleep(3);
  printf("Thread %p exiting\n",arg);
  __sync_fetch_and_sub(&thread_count,1);
  return 0L;
}

int main() {
  int i;
  pthread_t thread[NUM_THREADS];

  thread_count=NUM_THREADS;
  for (i=0;i<NUM_THREADS;i++) {
    pthread_create(&thread[i],0L,threadfunc,&thread[i]);
  }

  do {
    __sync_synchronize();
  } while (thread_count);
  printf("All threads done\n");
}

Note that the __sync functions are non-standard GCC built-ins, not macros. LLVM supports these too - but if you're using another compiler, you may have to do something different.
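
If the compiler supports C11, <stdatomic.h> offers a standard alternative to the __sync calls. The following is only a sketch under assumptions: live_workers, atomic_worker and run_workers are hypothetical names, the workers are detached, and the wait loop calls sched_yield() so that it at least gives up the CPU on each pass (it is still polling).

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_WORKERS 5   /* hypothetical, mirrors NUM_THREADS above */

static atomic_uint live_workers;   /* hypothetical name */

void *atomic_worker(void *arg)
{
    (void)arg;
    /* ... do the thread work ... */
    atomic_fetch_sub(&live_workers, 1);   /* sequentially consistent by default */
    return NULL;
}

/* Starts NUM_WORKERS detached threads, waits until the counter reaches
   zero, and returns how many threads failed to start. */
unsigned run_workers(void)
{
    pthread_t tid;
    unsigned failed = 0;
    int i;

    atomic_store(&live_workers, NUM_WORKERS);
    for (i = 0; i < NUM_WORKERS; i++) {
        if (pthread_create(&tid, NULL, atomic_worker, NULL) != 0) {
            atomic_fetch_sub(&live_workers, 1);   /* this one never started */
            failed++;
        } else {
            pthread_detach(tid);
        }
    }

    /* atomic_load is a sequentially consistent read, so no separate
       barrier call is needed in the loop condition. */
    while (atomic_load(&live_workers) != 0)
        sched_yield();

    return failed;
}
```

Because the default memory order of these operations is sequentially consistent, the explicit full barrier from the __sync version has no direct counterpart here.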

Another big thing to note is: Why would you burn an entire core, or waste "half" of a CPU spinning in a tight poll-loop just waiting for others to finish - when you could easily put it to work? The following mod uses the initial thread to run one of the workers, then wait for the others to complete:

  thread_count=NUM_THREADS;
  for (i=1;i<NUM_THREADS;i++) {
    pthread_create(&thread[i],0L,threadfunc,&thread[i]);
  }

  threadfunc(&thread[0]);

  do {
    __sync_synchronize();
  } while (thread_count);
  printf("All threads done\n");
}

Note that we start creating the threads starting at "1" instead of "0", then directly run "thread 0" inline, waiting for all threads to complete after it's done. We pass &thread[0] to it for consistency (even though it's meaningless here), though in reality you'd probably pass your own variables/context.

Draper answered 18/6, 2015 at 13:38 Comment(0)
