My computer has a quad-core i7 processor. I'm studying parallelization of scientific simulations. How does hyperthreading affect parallel performance? I know I should never use more than 4 working processes to get decent performance. But should I disable hyperthreading as well? Does it have an impact on parallel performance?
In my experience, running electromagnetic modelling and inversion codes, the answer is yes, you should disable hyperthreading. But this is not the sort of question which is well answered by other people's anecdotes (not even mine, fascinating and true as they are).
You are the student, so this is definitely a topic worth spending your time on to reach your own conclusions. There are so many factors involved that my experience running my codes on my platforms is nearly worthless to you.
Under Linux, if you have 4 busy threads on a quad-core i7, the scheduler will normally place each one on a different physical core. Provided the other hardware thread of each core is idle, performance should be the same as with hyperthreading disabled. If you are running other programs at the same time, it is debatable whether it is better to let hyperthreading run them on the spare hardware threads or to rely on context switching. (I suspect less context switching is better.)
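If you would rather check the placement than trust the scheduler, here is a minimal sketch. It assumes Linux and the sysfs topology files under /sys/devices/system/cpu/cpuN/topology/thread_siblings_list; the helper names are made up for illustration. It finds one logical CPU per physical core and pins the current process to those, so that 4 workers started from it cannot end up sharing a core.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,4' or '0-1' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(sibling_lists):
    """Keep the lowest-numbered logical CPU from each group of hyperthread siblings."""
    return sorted({min(siblings) for siblings in sibling_lists if siblings})


def read_sibling_lists():
    """Read the sibling list of every logical CPU (Linux only; empty elsewhere)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(parse_cpu_list(f.read()))
    return lists


if __name__ == "__main__":
    primaries = one_cpu_per_core(read_sibling_lists())
    print("one logical CPU per physical core:", primaries)
    if primaries and hasattr(os, "sched_setaffinity"):
        # Only pin to CPUs this process is actually allowed to use.
        target = set(primaries) & os.sched_getaffinity(0)
        if target:
            # Child processes started after this call inherit the affinity mask.
            os.sched_setaffinity(0, target)
```

On a quad-core i7 with HT enabled you would typically see sibling pairs such as 0,4 and 1,5, but the numbering varies between machines, so read it from the system rather than assuming it.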
A common mistake is assuming that using 8 threads instead of 4 will be twice as fast. It might be only slightly faster (in which case it might still be worth it) or slightly slower (in which case limit your program to 4 threads). I have found examples where using double the number of threads was slightly faster. IMHO, it's all a matter of testing to find the optimal number and using that many.
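A rough sketch of that kind of test, assuming the work can be launched as independent processes. The JOB command is a stand-in CPU-bound loop, not a real simulation, and the function names are hypothetical; swap in your own command line. It runs the same 8 jobs at several concurrency levels and reports the wall-clock time for each.

```python
import subprocess
import sys
import time

# Stand-in CPU-bound job (an assumption); replace with the real simulation command.
JOB = [sys.executable, "-c", "sum(i * i for i in range(2_000_000))"]


def run_in_batches(command, total_jobs, concurrent):
    """Run total_jobs copies of command, at most `concurrent` at a time.

    Each batch must finish before the next one starts. Returns wall-clock seconds.
    """
    start = time.perf_counter()
    remaining = total_jobs
    while remaining > 0:
        batch = min(concurrent, remaining)
        procs = [subprocess.Popen(command) for _ in range(batch)]
        for proc in procs:
            proc.wait()
        remaining -= batch
    return time.perf_counter() - start


def fastest(timings):
    """Return the concurrency level with the smallest measured time."""
    return min(timings, key=timings.get)


if __name__ == "__main__":
    timings = {n: run_in_batches(JOB, total_jobs=8, concurrent=n) for n in (1, 2, 4, 8)}
    for n, seconds in timings.items():
        print(f"{n} at a time: {seconds:.2f} s")
    print("fastest:", fastest(timings))
```

Make each job long enough (seconds, not milliseconds) that process startup does not dominate, and repeat the sweep a few times, since a single run can be noisy.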
The only time I can see you needing to turn HT off is when you have no control over how your application behaves and using 4 threads is faster.
You state:
I know I should never use more than 4 working processes to get decent performance.
This isn't necessarily true! Here is an example of what I have found running on an i7-3820 with HT enabled. All of the code I was running was C++. Consider that I have 8 separate programs (albeit identical) that I need to run. I tried the following two ways of running them:
- Run only 4 separate threads at a time, simultaneously. When these 4 complete, run the next 4 threads (4 x 2 = 8 total).
- Run all 8 as separate threads simultaneously (8 x 1 = 8 total).
As you can see these two scenarios achieve the same thing. However, what I have found is that the run times are:
- 1 hour for each set of 4 threads; for a total of 2 hours to complete all 8.
- 1.5 hours for the set of 8 threads.
What you find is that any single thread finishes sooner in the first case, but that overall the second case gives better performance, since ALL of your work is completed in less time. I found typical improvements of ~25% in total run time with HT enabled.
As is evident, there are scenarios when running 8 threads is faster than 4.
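To spell out the arithmetic behind those figures, here is a small calculation using only the run times quoted above (2 hours for two rounds of 4, 1.5 hours for all 8 at once). The variable names are just for illustration.

```python
def throughput(jobs, hours):
    """Jobs completed per hour of wall-clock time."""
    return jobs / hours


batched = throughput(8, 2.0)      # two rounds of 4 jobs, 1 hour per round
all_at_once = throughput(8, 1.5)  # all 8 jobs started together

time_saved = 1 - 1.5 / 2.0                   # fraction of total wall-clock time saved
throughput_gain = all_at_once / batched - 1  # fractional increase in jobs per hour
per_job_slowdown = 1.5 / 1.0 - 1             # how much longer each individual job takes

print(f"total time saved: {time_saved:.0%}")        # 25%
print(f"throughput gain:  {throughput_gain:.0%}")   # 33%
print(f"per-job slowdown: {per_job_slowdown:.0%}")  # 50%
```

So the same measurement reads as 25% less total time, about 33% more throughput, or a 50% longer wait for any individual job, depending on which number you care about.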
Hyper-Threading is Intel's implementation of Simultaneous Multithreading (SMT). In general, SMT is almost always beneficial (this is why it is usually enabled), unless your application is CPU-bound. If you know for sure that your application is CPU-bound, then disable SMT. Otherwise (your application is IO-bound or is not able to completely saturate the cores), leave it enabled.
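Before comparing anything, it helps to confirm whether SMT is actually on. A minimal sketch, assuming a reasonably recent Linux kernel that exposes /sys/devices/system/cpu/smt/active; on other systems the file is missing and the function returns None. The function names are hypothetical.

```python
from pathlib import Path

# Linux-only sysfs file; its presence is an assumption about the running kernel.
SMT_ACTIVE = Path("/sys/devices/system/cpu/smt/active")


def parse_smt_active(text):
    """Interpret the file contents: '1' means SMT is currently in use."""
    return text.strip() == "1"


def smt_is_active(path=SMT_ACTIVE):
    """Return True or False for the SMT state, or None if it cannot be read."""
    try:
        return parse_smt_active(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("SMT active:", smt_is_active())
```

On kernels that provide it, the neighbouring file /sys/devices/system/cpu/smt/control can be written as root to switch SMT off or on without a trip to the BIOS, which makes back-to-back comparisons much less painful.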