Intel MKL vs. AMD Math Core Library
Asked Answered
S

3

22

Does anybody have experience programming for both the Intel Math Kernel Library and the AMD Math Core Library? I'm building a personal computer for high performance statistical computations and am debating on the components to buy. An appeal of the AMD Math Core library is that it is free, but I am in academia so the MKL is not that expensive. But I'd be interested in hearing thoughts on:

  1. Which provides a better API?
  2. Which provides better performance, on average, per dollar, including licensing and hardware costs.
  3. Is the AMCL-GPU a factor I should consider?
Sorenson answered 29/10, 2009 at 16:22 Comment(2)
I wonder if there's a trick to make MKL work on AMD processors.Faber
@Drazick I understand that MKL also works with AMD processors, with optimizations and all.Overreact
Z
12

Intel MKL and ACML have similar APIs but MKL has a richer set of supported functionality including BLAS (and CBLAS)/LAPACK/FFTs/Vector and Statistical Math/Sparse direct and iterative solvers/Sparse BLAS, and so on. Intel MKL is also optimized for both Intel and AMD processors and has an active user forum you can turn to for help or guidance. An independent assessment of the two libraries is posted here: (http://www.advancedclustering.com/company-blog/high-performance-linpack-on-xeon-5500-v-opteron-2400.html)

• Shane Corder, Advanced Clustering, (also carried by HPCWire: Benchmark Challenge: Nehalem Versus Istanbul): “In our recent testing and through real world experience, we have found that the Intel compilers and Intel Math Kernel Library (MKL) usually provide the best performance. Instead of just settling on Intel's toolkit we tried various compilers including: Intel, GNU compilers, and Portland Group. We also tested various linear algebra libraries including: MKL, AMD Core Math Library (ACML), and libGOTO from the University of Texas. All of the testing showed we could achieve the highest performance when using both the Intel Compilers and Intel Math Library--even on the AMD system--so these were used them as the base of our benchmarks.” [Benchmark testing showed 4-core Nehalem X5550 2.66GHz at 74.0GFs vs. Istanbul 2435 2.6GHz at 99.4GFs; Istanbul only 34% faster despite 50% more cores]

Hope this helps.

Zoellick answered 30/10, 2009 at 23:36 Comment(1)
Are you sure MKL chose optimal code path for AMD CPU's as well?Faber
M
3

In fact, there are two versions of LAPACK routines in ACML. The ones without trailing underscore (_) are the C-version routines, which as Victor said, don't require workspace arrays and you can just pass values instead of references for the parameters. The ones with the underscore however are just vanilla Fortran routines. Do a "dumpbin /exports" on libacml_dll.dll and you'll see.

Messaline answered 21/6, 2010 at 19:45 Comment(0)
S
2

I have used AMCL for its BLAS/LAPACK routines, so this will probably not answer your question, but I hope it's useful for someone. Comparing them to vanilla BLAS/LAPACK, their performance was a factor of 2-3 better in my particular use case. I used it for dense nonsymmetric complex matrices, for both linear solves and eigensystem computations. You should know that the function declarations are not identical to the vanilla routines. This required a substantial amount of preprocessor macros to allow me to freely switch between the two. In particular all LAPACK routines in AMCL do not require work arrays. This is a major convenience if AMCL is the only library you will use.

Sori answered 29/10, 2009 at 18:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.