Parallelization: pthreads or OpenMP?

Asked 1/6, 2009 at 15:59 Answered 13/6, 2015 at 22:32

Solved multithreading optimization pthreads openmp

Most people in scientific computing use OpenMP as a quasi-standard when it comes to shared memory parallelization.

Is there any reason (other than readability) to use OpenMP over pthreads? The latter seems more basic and I suspect it could be faster and easier to optimize.

Millenarianism answered 1/6, 2009 at 15:59 Comment(0)

It basically boils down to what level of control you want over your parallelization. OpenMP is great if all you want to do is add a few #pragma statements and have a parallel version of your code quite quickly. If you want to do really interesting things with MIMD coding or complex queueing, you can still do all this with OpenMP, but it is probably a lot more straightforward to use threading in that case. OpenMP also has similar advantages in portability in that a lot of compilers for different platforms support it now, as with pthreads.

So you're absolutely correct - if you need fine-tuned control over your parallelization, use pthreads. If you want to parallelize with as little work as possible, use OpenMP.
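To illustrate the "few #pragma statements" route, here is a minimal sketch. It assumes a compiler run with an OpenMP flag such as GCC's -fopenmp; the helper name axpy is made up for this example. Without the flag the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the question): y[i] += a * x[i] for every i.
// Each iteration touches only its own element, so the iterations are
// independent and can be split across threads without any locking.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y)
{
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());

    // This single pragma is the only change from the serial version.
    // A signed loop index is used because older OpenMP implementations require it.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

The work distribution, thread creation and the join at the end of the loop are all left to the OpenMP runtime.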

Whichever way you decide to go, good luck!

Elinaelinor answered 1/6, 2009 at 16:10 Comment(0)

One other reason: OpenMP is task-based, while Pthreads is thread-based. This means that OpenMP will allocate the same number of threads as there are cores, so you get a scalable solution. That is not so easy to do with raw threads.

A second point: OpenMP provides reduction features for when you need to compute partial results in threads and combine them. You can implement this with a single line of code, whereas with raw threads you have to do more work.
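A small sketch of such a reduction, assuming OpenMP is enabled at compile time (the function name dot is hypothetical):

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: dot product of two equally sized vectors.
double dot(const std::vector<double>& a, const std::vector<double>& b)
{
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double sum = 0.0;

    // reduction(+:sum) gives every thread a private copy of sum, initialised
    // to 0, and adds the private copies into the shared sum after the loop.
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}
```

The reduction clause is the "single line" in question. With raw threads, the per-thread partial sums and the final combination step have to be written by hand.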

Just think about your requirements and try to understand: is OpenMP enough for you? You will save lots of time.

Zsazsa answered 20/6, 2009 at 14:24 Comment(3)

Please elaborate on solution scalability. Does the scalability only apply at compile time or is it determined at runtime? Or can runtime scalability only be done with threads? – Xenogenesis 5/2, 2012 at 23:59

You can set the number of threads at either compile time or runtime. If you choose to set it at runtime, you can do so through the environment variable OMP_NUM_THREADS, so that it can easily be set to an appropriate number for whatever architecture you're running on. – Appeasement 6/5, 2012 at 18:6
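Both options can be sketched as follows. This assumes the standard OpenMP runtime calls omp_get_max_threads() and omp_get_num_threads() and the standard environment variable OMP_NUM_THREADS; the helper names are made up. The _OPENMP guards keep the code compiling when OpenMP is switched off.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: the default team size for the next parallel region.
// With OpenMP enabled this reflects OMP_NUM_THREADS if it is set.
int available_threads()
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;  // built without OpenMP: everything runs on one thread
#endif
}

// Hypothetical helper: ask for a team size in the source code itself and
// report how many threads the runtime actually provided (it may give fewer).
int team_size(int requested)
{
    assert(requested > 0);
    int team = 1;
#ifdef _OPENMP
    #pragma omp parallel num_threads(requested)
    {
        #pragma omp single
        team = omp_get_num_threads();
    }
#endif
    return team;
}
```

Running the same binary as, say, OMP_NUM_THREADS=4 ./a.out changes what available_threads() reports without recompiling.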

This answer makes no sense. OpenMP is a threading model just like POSIX threads. OpenMP didn't even have tasks for the first few versions. – Pelagianism 5/9, 2015 at 21:8

OpenMP requires a compiler that supports it, and works with pragmas. The advantage to this is that when compiling without OpenMP-support (e.g. PCC or Clang/LLVM as of now), the code will still compile. Also, have a look at what Charles Leiserson wrote about DIY multithreading.

Pthreads is a POSIX standard (IEEE POSIX 1003.1c) for libraries, while OpenMP specifications are to be implemented on compilers; that being said, there are a variety of pthread implementations (e.g. OpenBSD rthreads, NPTL), and a number of compilers that support OpenMP (e.g. GCC with the -fopenmp flag, MSVC++ 2008).

Pthreads are only effective for parallelization when multiple processors are available, and only when the code is optimized for the number of processors available. Code for OpenMP is more easily scalable as a result. You can mix code that compiles with OpenMP with code using pthreads, too.

Sligo answered 29/12, 2009 at 20:41 Comment(2)

The last paragraph of this answer is all kinds of wrong. – Pelagianism 5/9, 2015 at 21:9

Really only the first sentence is super wrong. Second sentence is "meh, on the fence", while third sentence looks fine to me. – Appropriation 29/8, 2022 at 2:46

Your question is similar to the question "Should I program in C or assembly?", with C being OpenMP and assembly being pthreads.

With pthreads you can do much better parallelisation, better meaning very tightly adjusted to your algorithm and hardware. This will be a lot of work though.

With pthreads it is also much easier to produce poorly parallelised code.
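To give a feel for the extra work, here is a sketch of summing an array with raw pthreads. All names are hypothetical, and it assumes a POSIX system where the program is linked with thread support (for example GCC's -pthread). The chunking, thread start-up, joining and combining of partial results are all written by hand.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Work description handed to each thread.
struct Chunk {
    const double* data;
    std::size_t begin, end;
    double partial;  // written only by the thread that owns this chunk
};

// Thread entry point: pthreads requires the void* (*)(void*) signature.
extern "C" void* sum_chunk(void* arg)
{
    Chunk* c = static_cast<Chunk*>(arg);
    double s = 0.0;
    for (std::size_t i = c->begin; i < c->end; ++i)
        s += c->data[i];
    c->partial = s;
    return nullptr;
}

double parallel_sum(const std::vector<double>& v, std::size_t num_threads)
{
    assert(num_threads > 0);
    const std::size_t n = v.size();
    std::vector<Chunk> chunks(num_threads);
    std::vector<pthread_t> threads(num_threads);
    std::vector<bool> started(num_threads, false);

    // Split the index range into contiguous chunks and start one thread each.
    for (std::size_t t = 0; t < num_threads; ++t) {
        chunks[t] = Chunk{v.data(), n * t / num_threads, n * (t + 1) / num_threads, 0.0};
        started[t] = pthread_create(&threads[t], nullptr, sum_chunk, &chunks[t]) == 0;
        if (!started[t])
            sum_chunk(&chunks[t]);  // thread creation failed: do the chunk inline
    }

    // Wait for every thread, then combine the partial sums serially.
    double total = 0.0;
    for (std::size_t t = 0; t < num_threads; ++t) {
        if (started[t])
            pthread_join(threads[t], nullptr);
        total += chunks[t].partial;
    }
    return total;
}
```

Every decision here (chunk boundaries, how many threads, what to do on failure) is under the programmer's control, which is both the appeal and the risk.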

C answered 22/7, 2012 at 11:56 Comment(3)

This assumes OpenMP is implemented using Pthreads. That is not required, although generally true. If OpenMP were implemented to bare metal on a specialized architecture, it could be faster than Pthreads. – Pelagianism 5/9, 2015 at 21:13

@Jeff I am not assuming that, and my answer is independent of the implementation details. OpenMP and C are more "high-level" than pthreads and assembly. That's why I believe that both my statements remain true, no matter how C and OpenMP are implemented. – C 18/11, 2015 at 6:27

It seems you are conflating the syntactic simplicity of OpenMP with semantic burdens on the runtime. Have you compared the POSIX thread specification with the OpenMP 4 specification? In particular, have you considered what is required for pthread_create() vs pragma omp parallel {}? – Pelagianism 18/11, 2015 at 18:33

Is there any reason (other than readability) to use OpenMP over pthreads?

Mike kind of touched upon this:

OpenMP also has similar advantages in portability in that a lot of compilers for different platforms support it now, as with pthreads

Crypto++ is cross-platform, meaning it runs on Windows, Linux, OS X and the BSDs. It uses OpenMP for threading support in places where the operation can be expensive, like modular exponentiation and modular multiplication (and where concurrent operation can be performed).

Windows does not support pthreads natively, but modern Windows compilers do support OpenMP. So if you want portability to the non-*nix's, then OpenMP is often a good choice.

And as Mike also pointed out:

OpenMP is great if all you want to do is add a few #pragma statements and have a parallel version of your code quite quickly.

Below is an example of Crypto++ precomputing some values used in Rabin-Williams signatures using Tweaked Roots as described by Bernstein in RSA signatures and Rabin-Williams signatures...:

void InvertibleRWFunction::Precompute(unsigned int /*unused*/)
{
    ModularArithmetic modp(m_p), modq(m_q);

    #pragma omp parallel sections
    {
        #pragma omp section
            m_pre_2_9p = modp.Exponentiate(2, (9 * m_p - 11)/8);
        #pragma omp section
            m_pre_2_3q = modq.Exponentiate(2, (3 * m_q - 5)/8);
        #pragma omp section
            m_pre_q_p = modp.Exponentiate(m_q, m_p - 2);
    }
}

It fits with Mike's observation - fine-grained control and synchronization were not really needed. Parallelization was used to speed up execution, and the synchronization came at no cost in the source code.

And if OpenMP is not available, the code reduces to:

m_pre_2_9p = modp.Exponentiate(2, (9 * m_p - 11)/8);
m_pre_2_3q = modq.Exponentiate(2, (3 * m_q - 5)/8);
m_pre_q_p = modp.Exponentiate(m_q, m_p - 2);
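For readers without Crypto++ at hand, the same pattern can be tried with a self-contained sketch. Here mod_pow is a hypothetical 64-bit stand-in for ModularArithmetic::Exponentiate, and Precomputed and precompute are made-up names; it is not Crypto++ code and only works for moduli that fit in 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for modular exponentiation (square-and-multiply).
// Restricting mod to 32 bits guarantees the 64-bit products cannot overflow.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exp, std::uint64_t mod)
{
    assert(mod > 0 && mod <= 0xFFFFFFFFull);
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1)
            result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

struct Precomputed { std::uint64_t pre_2_9p, pre_2_3q, pre_q_p; };

Precomputed precompute(std::uint64_t p, std::uint64_t q)
{
    Precomputed out{0, 0, 0};

    // Three independent computations, each writing a different member,
    // so each section may run on its own thread without locking.
    #pragma omp parallel sections
    {
        #pragma omp section
        out.pre_2_9p = mod_pow(2, (9 * p - 11) / 8, p);
        #pragma omp section
        out.pre_2_3q = mod_pow(2, (3 * q - 5) / 8, q);
        #pragma omp section
        out.pre_q_p = mod_pow(q, p - 2, p);
    }
    return out;
}
```

Compiled with or without an OpenMP flag, precompute returns the same values; only the execution changes from parallel to serial.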

Whitebait answered 13/6, 2015 at 22:32 Comment(0)

OpenMP is ideal when you need to perform the same task in parallel (that is, on multiple data), a kind of SIMD machine (single-instruction multiple-data).

Pthreads is needed when you want to perform (quite different) tasks in parallel such as, for example, reading data in one thread and interacting with the user in another thread.

See this page:

http://berenger.eu/blog/c-cpp-openmp-vs-pthread-openmp-or-posix-thread/

Naamann answered 2/8, 2013 at 10:37 Comment(1)

OpenMP has always supported more than data parallelism. Do you actually understand OpenMP? – Pelagianism 5/9, 2015 at 21:11
