I am trying to enable multithreading/multiprocessing in an Anaconda installation of NumPy. My test program is the following:
import os
import numpy as np
from timeit import timeit
size = 1024
A = np.random.random((size, size))
B = np.random.random((size, size))
print 'Time with %s threads: %f s' \
      % (os.environ.get('OMP_NUM_THREADS'),
         timeit(lambda: np.dot(A, B), number=4))
I change the environment variable OMP_NUM_THREADS, but regardless of its value, the code always takes the same amount of time to run and only a single core is ever used.
It appears that my NumPy is linked against OpenBLAS:
numpy.__config__.show()
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/home/myuser/anaconda3/envs/py2env/lib']
define_macros = [('HAVE_CBLAS', None)]
language = c
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/home/myuser/anaconda3/envs/py2env/lib']
define_macros = [('HAVE_CBLAS', None)]
language = c
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/home/myuser/anaconda3/envs/py2env/lib']
define_macros = [('HAVE_CBLAS', None)]
language = c
blis_info:
NOT AVAILABLE
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/home/myuser/anaconda3/envs/py2env/lib']
define_macros = [('HAVE_CBLAS', None)]
language = c
lapack_mkl_info:
NOT AVAILABLE
blas_mkl_info:
NOT AVAILABLE
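The build configuration above only reports what NumPy was compiled against. A rough way to cross-check which BLAS shared libraries are actually loaded at run time is sketched below. It assumes Linux (it reads `/proc/self/maps`), and the helper names `blas_libraries` and `loaded_blas_libraries` are hypothetical, not part of NumPy.

```python
import os
import re


def blas_libraries(maps_text):
    """Return sorted, de-duplicated paths of mapped files whose file name
    mentions blas, lapack or mkl. `maps_text` is the content of a
    /proc/<pid>/maps file."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if re.search(r"blas|lapack|mkl", os.path.basename(path), re.I):
            paths.add(path)
    return sorted(paths)


def loaded_blas_libraries():
    """Inspect the current process (Linux only, an assumption of this sketch)."""
    with open("/proc/self/maps") as f:
        return blas_libraries(f.read())


# Usage: import numpy first so that its BLAS library is loaded, then call
#     loaded_blas_libraries()
```

If the list shows a library other than the OpenBLAS one in the environment's `lib` directory, the configuration output and the run-time behaviour may not match.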
This is the relevant part of conda list:
conda list | grep blas
blas 1.1 openblas conda-forge
libblas 3.9.0 1_h6e990d7_netlib conda-forge
libcblas 3.9.0 3_h893e4fe_netlib conda-forge
numpy 1.14.6 py27_blas_openblashd3ea46f_200 [blas_openblas] conda-forge
openblas 0.2.20 8 conda-forge
scikit-learn 0.19.2 py27_blas_openblasha84fab4_201 [blas_openblas] conda-forge
I also tried setting OPENBLAS_NUM_THREADS, but it did not make any difference.
I use a Python 2.7 environment in conda 4.12.0.
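For anyone reproducing this, here is a minimal sketch of how the two variables can be varied per run. It starts a fresh process each time, on the assumption that the BLAS library reads these variables when it is loaded, so changing them after NumPy has been imported may have no effect. The helper `run_with_threads` and the file name `test_dot.py` are hypothetical.

```python
import os
import subprocess
import sys


def run_with_threads(num_threads, command):
    """Run `command` in a child process with both thread-count variables
    set to `num_threads`, and return its standard output as text."""
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["OPENBLAS_NUM_THREADS"] = str(num_threads)
    output = subprocess.check_output(command, env=env)
    return output.decode().strip()


# Usage, assuming the test program is saved as test_dot.py (hypothetical name):
#     for n in (1, 2, 4):
#         print(run_with_threads(n, [sys.executable, "test_dot.py"]))
```

If the reported time is the same for every thread count with this approach as well, the variables are reaching the process but are not affecting the library that performs the multiplication.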