copton's web log: parallel-programming

Threading Building Blocks

Intel Threading Building Blocks (TBB) is an open source C++ library. It shares its aim with OpenMP: taking advantage of multi-core processors without being an expert in parallel programming.

TBB provides a rich and supposedly complete set of algorithms, containers, models and primitives for concurrent software. I would really call it the C++ library for concurrent programming. But its great flexibility comes hand in hand with complexity. You have to dig into the documentation for quite a while before you get your work done. And compared to OpenMP or Cilk++, you must understand more about concurrency.

Furthermore, compared to OpenMP or Cilk++, your concurrency-enabled code looks very different from its sequential equivalent. TBB programs must be designed to use TBB from the beginning.

Cilk++

Cilk Arts is an MIT spin-off which develops Cilk++, an enhancement of the Cilk project.

Cilk++ is similar to OpenMP in that you specify the program's concurrency structure with special keywords. The tool chain and the run time then turn your sequential program into a highly concurrent one. Cilk Arts focuses on porting existing software to multi-core systems with the least possible knowledge about concurrent programming.

In my opinion the most important part of their tool chain is the "Race Detector". Cilk Arts claims that it detects every data race that exists in the program. This means that, if the Race Detector finds no error, the concurrent program is correct if and only if the sequential one is. And the correctness of the latter has already been checked with traditional software tests.

I have no idea how the Race Detector works, and I'm afraid they won't tell anybody any details. What they do say is that they test whether all possible schedulings result in the same values of the data. But how does the Race Detector find all possible threads of execution without automatically "understanding" what the code does?

OpenMP

OpenMP is a set of compiler directives and a run time environment which enable programmers to implement fork-and-join models very easily.

Fork-and-join means that a sequential program comes to a point where parallel execution is possible and meaningful. At this point multiple threads of execution are started, which do all the partial computations in parallel. When everything is done, the partial results are collected and aggregated, and the threads are joined. From there on the program is sequential again.
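To make the fork-and-join idea concrete, here is a minimal sketch in C++. The function name sum_of_squares is made up for this illustration; the only OpenMP pieces are the standard "parallel for" directive and its reduction clause. If the compiler is run without OpenMP support (for example without -fopenmp on GCC), the pragma is ignored and the loop simply runs sequentially with the same result.

```cpp
#include <cassert>

// Hypothetical example function: sum of the squares 1*1 + 2*2 + ... + n*n.
long long sum_of_squares(int n) {
    assert(n >= 0);
    long long total = 0;

    // Fork: the iterations of the loop are divided among a team of threads.
    // reduction(+:total) gives every thread a private partial sum and
    // adds the partial sums together when the threads are joined.
    #pragma omp parallel for reduction(+:total)
    for (int i = 1; i <= n; ++i) {
        total += static_cast<long long>(i) * i;
    }

    // Join: from here on the program is sequential again.
    return total;
}
```

Calling sum_of_squares(10) gives 385 regardless of how many threads take part, because the reduction clause takes care of aggregating the partial results.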

With OpenMP's compiler directives you simply specify the program's concurrency structure. Everything else is done automatically by the compiler and the OpenMP run time environment. Your program becomes highly concurrent and you don't see a single lock or thread. Everything happens behind the scenes. Nevertheless, it is possible to configure the scheduling algorithm and its parameters.

Wikipedia has a great article about OpenMP, so there is no need to repeat everything here. If you have a program which fits the fork-and-join model, you definitely want to give OpenMP a try.
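As a rough sketch of what configuring the scheduler can look like, the loop below uses the standard schedule clause. The functions work and run_all are hypothetical and exist only for this illustration, and the chunk size of 4 is an arbitrary choice, not a recommendation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical uneven workload: the cost of item i grows with i.
// Returns 0 + 1 + ... + i.
long long work(int i) {
    long long acc = 0;
    for (int k = 0; k <= i; ++k) {
        acc += k;
    }
    return acc;
}

// Computes work(i) for i = 0 .. n-1 and stores each result in its own slot.
std::vector<long long> run_all(int n) {
    assert(n >= 0);
    std::vector<long long> out(static_cast<std::size_t>(n));

    // schedule(dynamic, 4): threads fetch chunks of 4 iterations as they
    // become idle, which helps when iterations take different amounts of time.
    // No lock is needed because every iteration writes a different element.
    #pragma omp parallel for schedule(dynamic, 4)
    for (int i = 0; i < n; ++i) {
        out[static_cast<std::size_t>(i)] = work(i);
    }
    return out;
}
```

Replacing schedule(dynamic, 4) with schedule(static) changes only how iterations are handed out to threads, not the contents of the returned vector.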

Monday, March 9, 2009