C++ Parallelization Libraries: OpenMP vs. Threading Building Blocks [closed]

I'm going to retrofit my custom graphics engine so that it takes advantage of multicore CPUs. More specifically, I am looking for a library to parallelize loops.

It seems to me that both OpenMP and Intel's Threading Building Blocks are very well suited for the job. Both are supported by Visual Studio's C++ compiler and by most other popular compilers, and both libraries seem quite straightforward to use.

So, which one should I choose? Has anyone tried both libraries and can give me some cons and pros of using either library? Also, what did you choose to work with in the end?

Thanks,

Adrian

Fruiterer answered 5/3, 2009 at 15:28 Comment(1)
Similar question: #326987 (I added a reference to this question in my question.) – Protostele

I haven't used TBB extensively, but my impression is that they complement each other rather than compete. TBB provides thread-safe containers and some parallel algorithms, whereas OpenMP is more of a way to parallelise existing code.

Personally I've found OpenMP very easy to drop into existing code where you have a parallelisable loop or a bunch of sections that can be run in parallel. However, it doesn't help you much in cases where you need to modify some shared data - which is where TBB's concurrent containers might be exactly what you want.

If all you want is to parallelise loops where the iterations are independent (or can be fairly easily made so), I'd go for OpenMP. If you're going to need more interaction between the threads, I think TBB may offer a little more in that regard.
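To make the distinction concrete, here is a minimal sketch (my own illustration, not from either project's documentation beyond the basic APIs) of the two being used together: OpenMP parallelises an ordinary loop, while TBB's concurrent_vector absorbs the shared writes safely.

#include <cmath>
#include <vector>
#include <tbb/concurrent_vector.h>  // TBB's thread-safe vector

int main()
{
    const int n = 1000000;
    std::vector<double> input(n, 2.0);
    tbb::concurrent_vector<double> results;  // push_back is safe from many threads

    // OpenMP takes care of splitting the loop across cores...
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
    {
        double v = std::sqrt(input[i]);
        if (v > 1.0)
            results.push_back(v);  // ...TBB takes care of the shared container
    }
    return 0;
}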

Centare answered 5/3, 2009 at 20:46 Comment(1)
Good point about the existing code. It's easier to plug a few pragmas in here and there. Plugging in TBB might be more difficult (much depends on the existing code style). – Monroemonroy

From Intel's software blog: Compare Windows* threads, OpenMP*, Intel® Threading Building Blocks for parallel programming

It is also a matter of style - for me TBB feels very C++-like, while I don't like OpenMP pragmas that much (they reek of C a bit; I would use OpenMP if I had to write in C).
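To illustrate the difference in flavour, here is roughly the same loop written both ways (just a sketch of mine; the TBB version assumes a compiler with C++0x lambda support, otherwise you would pass a function object):

#include <cstddef>
#include <tbb/parallel_for.h>

// OpenMP style: a pragma annotating an ordinary loop
void scale_omp(float* data, int n, float factor)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        data[i] *= factor;
}

// TBB style: a plain C++ function call taking a callable
void scale_tbb(float* data, std::size_t n, float factor)
{
    tbb::parallel_for(std::size_t(0), n, [=](std::size_t i) {
        data[i] *= factor;
    });
}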

I would also consider the existing knowledge and experience of the team. Learning a new library (especially when it comes to threading/concurrency) does take some time. I think that for now, OpenMP is more widely known and deployed than TBB (but this is just my opinion).

Yet another factor is portability - though considering the most common platforms, that is probably not an issue. The license, however, might be.

  • TBB incorporates some nice ideas originating from academic research, for example its recursive data-parallel approach.
  • There is also some work on cache-friendliness.
  • Reading the Intel blog seems really interesting.
Monroemonroy answered 5/3, 2009 at 15:41 Comment(2)
Thanks for the link, but since it is hosted on Intel's website, I would not really trust it to provide a completely unbiased opinion. Clearly they wrote the article to promote usage of their own library. – Fruiterer
Yes, I forgot the emoticon somewhere in the first line ;) – Monroemonroy

In general I have found that using TBB requires more time-consuming changes to the code base but with a higher payoff, while OpenMP gives a quick but moderate payoff. If you are starting a new module from scratch and thinking long term, go with TBB. If you want small but immediate gains, go with OpenMP.

Also, TBB and OpenMP are not mutually exclusive.

Larimer answered 6/3, 2009 at 10:51 Comment(0)

I've actually used both, and my general impression is that if your algorithm is fairly easy to make parallel (e.g. loops of roughly even size without too much data interdependence), OpenMP is easier and quite nice to work with. In fact, if you find you can use OpenMP, it's probably the better way to go, provided you know your platform will support it. I haven't used OpenMP's new task structures, which are much more general than the original loop and section options.

TBB gives you more data structures, but it definitely requires more up-front work. As a plus, it might be better at making you aware of race-condition bugs. What I mean is that in OpenMP it is fairly easy to introduce a race condition by not declaring something shared (or private) when it should be, and you only see this when you get bad results. I think this is a bit less likely to occur with TBB.
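To illustrate the kind of bug I mean (my own sketch, with heavy_computation standing in for real work): if a temporary is left with its default shared status the threads race on it, and default(none) is a cheap way to make the compiler force you to state every variable's sharing explicitly.

#include <vector>

double heavy_computation(int i) { return i * 0.5; }  // stand-in for the real per-element work

void fill(std::vector<double>& out)
{
    int n = (int)out.size();
    double tmp;

    // Buggy: 'tmp' is shared by default, so every thread writes the same
    // variable and out[i] may pick up another thread's value.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
    {
        tmp = heavy_computation(i);
        out[i] = tmp * 2.0;
    }

    // Safer: default(none) turns a forgotten private(tmp) into a compile error.
    #pragma omp parallel for default(none) shared(out, n) private(tmp)
    for (int i = 0; i < n; ++i)
    {
        tmp = heavy_computation(i);
        out[i] = tmp * 2.0;
    }
}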

Overall my personal preference was for OpenMP, especially given its increased expressiveness with tasks.
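For reference, the task construct (added in OpenMP 3.0) looks roughly like this; a sketch of processing a linked list, which the original loop and section constructs cannot express directly.

struct Node { Node* next; /* ...payload... */ };
void process(Node*) { /* real per-node work goes here */ }

void process_list(Node* head)
{
    #pragma omp parallel
    {
        #pragma omp single  // one thread walks the list and spawns tasks...
        {
            for (Node* n = head; n != 0; n = n->next)
            {
                #pragma omp task firstprivate(n)
                process(n);  // ...the whole team executes them
            }
        }  // implicit barrier: all tasks finish before the region ends
    }
}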

Putrefaction answered 28/4, 2009 at 11:54 Comment(0)

As far as I know, TBB (an open-source version is available under GPLv2) addresses the C++ area more than the C area. These days it's hard to find parallelization information specific to C++ and general OOP; most of it addresses functional, C-style code (the same goes for CUDA or OpenCL). If you need C++ support for parallelization, go for TBB!

Adallard answered 10/1, 2012 at 7:14 Comment(1)
TBB now uses the Apache license... – Zared

Yes, TBB is much more C++-friendly, while OpenMP is, given its design, more appropriate for FORTRAN-style C code. The new task feature in OpenMP looks very interesting, while at the same time lambdas and function objects in C++0x may make TBB easier to use.
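To show what that means in practice, here is the same TBB call written with a hand-rolled function object (the pre-C++0x way) and again with a C++0x lambda (a sketch of mine; Scale is just an illustrative name):

#include <cstddef>
#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>

// Pre-C++0x: the loop body has to live in a separate function object.
struct Scale
{
    float* data;
    float  factor;
    void operator()(const tbb::blocked_range<std::size_t>& r) const
    {
        for (std::size_t i = r.begin(); i != r.end(); ++i)
            data[i] *= factor;
    }
};

void scale_old(float* data, std::size_t n, float factor)
{
    Scale body = { data, factor };
    tbb::parallel_for(tbb::blocked_range<std::size_t>(0, n), body);
}

// With C++0x the body can be written inline as a lambda.
void scale_new(float* data, std::size_t n, float factor)
{
    tbb::parallel_for(tbb::blocked_range<std::size_t>(0, n),
        [=](const tbb::blocked_range<std::size_t>& r) {
            for (std::size_t i = r.begin(); i != r.end(); ++i)
                data[i] *= factor;
        });
}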

Defiance answered 24/1, 2013 at 6:51 Comment(0)

In Visual Studio 2008, you can add the following line to parallelize any "for" loop. It even works with multiple nested for loops. Here is an example:

int i, j;  // declared outside the loop so they can be listed in private()
#pragma omp parallel for private(i,j)
for (i=0; i<num_particles; i++)
{
  p[i].fitness = fitnessFunction(p[i].present);
  if (p[i].fitness > p[i].pbestFitness)
  { 
     p[i].pbestFitness = p[i].fitness;
     for (j=0; j<p[i].numVars; j++) p[i].pbest[j] = p[i].present[j];
  }
}  
gbest = pso_get_best(num_particles, p);

After we added the #pragma omp parallel for, both cores on my Core 2 Duo were used to their maximum capacity, so total CPU usage went from 50% to 100%.

Rhines answered 28/6, 2010 at 17:43 Comment(2)
Just a note: nested loops work only if the compiler supports it. – Irvin
Just another note: you can use omp parallel for to parallelize any parallelizable for loop. For example, you could not use omp parallel for if the body contains some code like this: p[j] = p[j] - p[j-1] – Jabon
