Setting up Anaconda for AMD Ryzen without MKL
J

3

5

Like many others, I've bought myself a new Ryzen CPU. I need to use Anaconda Python for my PhD (together with TensorFlow etc.). Since Anaconda now comes pre-packaged with MKL, which is slow on AMD CPUs, what is the best way to set up an Anaconda environment with OpenBLAS and link NumPy and scikit-learn against it, while keeping all other packages the same?

I've found the following posts, which all point to installing some packages one way or another.

https://anaconda.org/anaconda/nomkl

https://anaconda.org/anaconda/openblas

How to install scipy without mkl

Joycejoycelin answered 17/7, 2019 at 14:26 Comment(4)
I would suggest using the conda-forge channel to install your dependencies. Create a new environment with conda create -n name-of-env -c conda-forge scipy <other dependencies>.Staurolite
Thanks for your reply. Seems like there's no scipy in conda-forge without mkl (see github.com/conda-forge/numpy-feedstock/issues/153). So I'm going to try github.com/fo40225/Anaconda-Windows-AMDJoycejoycelin
Are you on Windows? That wasn't clear from your question...Staurolite
Yes sorry, I'm on Windows 10Joycejoycelin
W
2

This post from Reddit has a much more thorough explanation of what's going on, but the fix is just a one-liner in your terminal that tricks MKL into thinking you are on an Intel system, since MKL does nasty things to non-Intel devices: https://www.reddit.com/r/MachineLearning/comments/f2pbvz/discussion_workaround_for_mkl_on_amd/

WINDOWS:

Open a command prompt (CMD) with admin rights and type in:

setx /M MKL_DEBUG_CPU_TYPE 5

Doing this will make the change permanent and available to ALL Programs using the MKL on your system until you delete the entry again from the variables.

LINUX:

Simply type in a terminal:

export MKL_DEBUG_CPU_TYPE=5 

before running your script from the same instance of the terminal.

Permanent solution for Linux:

echo 'export MKL_DEBUG_CPU_TYPE=5' >> ~/.profile

will apply the setting profile-wide.
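If you would rather not touch system-wide settings, the same variable can also be set from inside Python. This is only a minimal sketch: it assumes MKL reads the variable when it is first loaded, so it has to be set before NumPy is imported. The helper name enable_mkl_workaround is made up for illustration.

```python
import os
import sys


def enable_mkl_workaround(cpu_type: str = "5") -> bool:
    """Set MKL_DEBUG_CPU_TYPE for the current process only.

    Returns True if NumPy had not been imported yet, i.e. the setting
    still has a chance to take effect (assumption: MKL reads the
    variable when it is first loaded).
    """
    numpy_already_loaded = "numpy" in sys.modules
    os.environ["MKL_DEBUG_CPU_TYPE"] = cpu_type
    return not numpy_already_loaded


if __name__ == "__main__":
    in_time = enable_mkl_workaround()
    print("MKL_DEBUG_CPU_TYPE =", os.environ["MKL_DEBUG_CPU_TYPE"])
    if not in_time:
        print("NumPy was already imported; restart the interpreter.")
```

Call it at the very top of your script, before any import that pulls in NumPy or SciPy. Child processes started from the script inherit the variable as well.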

Some highlights since I figure you can click the link to read the entire thing if interested:

"However, the numerical lib that comes with many of your packages by default is the Intel MKL. The MKL runs notoriously slow on AMD CPUs for some operations. This is because the Intel MKL uses a discriminative CPU Dispatcher that does not use efficient codepath according to SIMD support by the CPU, but based on the result of a vendor string query. If the CPU is from AMD, the MKL does not use SSE3-SSE4 or AVX1/2 extensions but falls back to SSE no matter whether the AMD CPU supports more efficient SIMD extensions like AVX2 or not.

The method provided here enforces AVX2 support by the MKL, independent of the vendor string result and takes less than a minute to apply. If you have an AMD CPU that is based on the Zen/Zen+/Zen2 µArch Ryzen/Threadripper, this will boost your performance tremendously."
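To make the difference described in the quote concrete, here is a toy model of the two dispatch strategies. It is purely illustrative and is not MKL's actual code: the function names and the preference list are my own assumptions, and only the vendor strings "GenuineIntel" and "AuthenticAMD" are the real CPUID values.

```python
# Code paths ordered from fastest to slowest (illustrative list only).
PREFERENCE = ["avx2", "avx", "sse4_2", "sse3", "sse"]


def best_supported(features: set) -> str:
    """Pick the fastest code path the CPU reports support for."""
    for name in PREFERENCE:
        if name in features:
            return name
    return "sse"


def dispatch_by_vendor(vendor: str, features: set) -> str:
    """Vendor-based dispatch: non-Intel CPUs always get the fallback."""
    if vendor != "GenuineIntel":
        return "sse"  # CPU features are ignored entirely
    return best_supported(features)


def dispatch_by_features(features: set) -> str:
    """Feature-based dispatch: the vendor string plays no role."""
    return best_supported(features)


if __name__ == "__main__":
    ryzen = {"sse", "sse3", "sse4_2", "avx", "avx2"}
    print("by vendor:  ", dispatch_by_vendor("AuthenticAMD", ryzen))
    print("by features:", dispatch_by_features(ryzen))
```

In this model the Ryzen CPU ends up on the slow path only because of its vendor string, which is the behaviour the environment variable is meant to override.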

Wellborn answered 4/6, 2020 at 2:37 Comment(0)
B
3

An alternative to giving up MKL is simply to make it run much faster on a Ryzen CPU by telling MKL to use a more Ryzen-compatible instruction set. By doing

conda install mkl -c intel --no-update-deps
set MKL_DEBUG_CPU_TYPE=5

I saw about a 15x speedup using numpy/theano/PyMC3 on my Ryzen CPU under Windows 10 vs the default initial miniconda installation.
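If you want to check the effect on your own machine, a rough timing script like the one below can be run once with and once without the variable set. It is only a sketch: the helper best_time and the matrix size are arbitrary choices of mine, and a single matrix multiplication is not a full benchmark of numpy/theano/PyMC3 workloads.

```python
import time


def best_time(fn, repeats: int = 3) -> float:
    """Run fn several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    try:
        import numpy as np
    except ImportError:
        print("NumPy is not installed in this environment.")
    else:
        n = 2000  # matrix size, chosen arbitrarily
        rng = np.random.default_rng(0)
        a = rng.random((n, n))
        b = rng.random((n, n))
        # The matrix product is handed to the BLAS library NumPy is linked against.
        print(f"{n}x{n} matmul: {best_time(lambda: a @ b):.3f} s")
```

Run it in a fresh terminal each time, since the variable has to be in place before NumPy is loaded.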

Becquerel answered 8/12, 2019 at 15:7 Comment(3)
Does the install (with the parameter setting) have to be redone each time you want to use your Python environment, or is it done once, leaving the environment fixed? (Sorry if this seems obvious, but I am currently looking to build a desktop similar to OP's and don't want to be bottlenecked with AMD.)Deli
How does this compare with the current Intel CPU release (similarly priced)? Is there any documented comparison? I'm at the point of getting a new PC and this consideration is hampering my decision. I don't want to buy a new PC and end up being slower in Python land, which is the main reason for the purchase...Wyrick
re: @SteveCarter the AMD CPUs of these past few generations generally have about twice the cores for 70% of the price and half the power usage (read: power bill). Performance per core is more or less the same, if not sometimes much better; check individual CPU reviews. Also, the AM4 motherboards are supposed to be compatible with new processors for a few years, compared with Intel forcing users to get new ones each generation. Motherboards can be pricey.Wellborn
W
3

As of 2021, Intel has unfortunately removed the MKL_DEBUG_CPU_TYPE option, which prevents people on AMD from using the workaround presented in the accepted answer. This means that the workaround no longer works with current MKL releases, and AMD users have to either switch to OpenBLAS or keep using MKL.

To still use the workaround, pin MKL to an older release (2019) as follows:

  1. Create a conda environment with NumPy and MKL pinned to version 2019.
  2. Activate the environment
  3. Set MKL_DEBUG_CPU_TYPE = 5

The commands for the above steps:

  1. conda create -n my_env -c anaconda python numpy mkl=2019.* blas=*=*mkl
  2. conda activate my_env
  3. conda env config vars set MKL_DEBUG_CPU_TYPE=5

And that's it!
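To confirm the setup from inside the environment, a small check like the following may help. It is a sketch rather than an official procedure: workaround_status is a name I made up, and note that conda asks you to reactivate the environment after conda env config vars set before the variable becomes visible.

```python
import os


def workaround_status(environ=os.environ) -> str:
    """Describe the MKL_DEBUG_CPU_TYPE setting visible to this process."""
    value = environ.get("MKL_DEBUG_CPU_TYPE")
    if value is None:
        return "not set"
    if value == "5":
        return "set to 5"
    return f"set to unexpected value {value!r}"


if __name__ == "__main__":
    print("MKL_DEBUG_CPU_TYPE:", workaround_status())
    try:
        import numpy as np
    except ImportError:
        print("NumPy is not installed in this environment.")
    else:
        # Prints NumPy's build configuration, including the BLAS/LAPACK
        # libraries it was linked against (look for mkl in the output).
        np.show_config()
```

If the status reads "not set", run conda deactivate followed by conda activate my_env and try again.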

Wofford answered 28/8, 2021 at 18:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.