I read in this question that Eigen has very good performance. However, when I compared the speed of Eigen MatrixXi multiplication against numpy array multiplication, numpy performed better (~26 seconds vs. ~29). Is there a more efficient way to do this in Eigen?
Here is my code:
Numpy:
import numpy as np
import time
n_a_rows = 4000
n_a_cols = 3000
n_b_rows = n_a_cols
n_b_cols = 200
a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)
start = time.time()
d = np.dot(a, b)
end = time.time()
print("time taken : {}".format(end - start))
Result:
time taken : 25.9291000366
Eigen:
#include <ctime>      // clock(), clock_t, CLOCKS_PER_SEC
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;
int main()
{
    int n_a_rows = 4000;
    int n_a_cols = 3000;
    int n_b_rows = n_a_cols;
    int n_b_cols = 200;
    // Fill a with 0, 1, 2, ... row by row, matching np.arange(...).reshape(...).
    MatrixXi a(n_a_rows, n_a_cols);
    for (int i = 0; i < n_a_rows; ++i)
        for (int j = 0; j < n_a_cols; ++j)
            a(i, j) = n_a_cols * i + j;
    // Fill b the same way.
    MatrixXi b(n_b_rows, n_b_cols);
    for (int i = 0; i < n_b_rows; ++i)
        for (int j = 0; j < n_b_cols; ++j)
            b(i, j) = n_b_cols * i + j;
    MatrixXi d(n_a_rows, n_b_cols);
    // Time only the matrix product.
    clock_t begin = clock();
    d = a * b;
    clock_t end = clock();
    double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    std::cout << "Time taken : " << elapsed_secs << std::endl;
}
Result:
Time taken : 29.05
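One thing worth keeping in mind when comparing the two numbers: the numpy script uses time.time(), which is wall-clock time, while the C++ program uses clock(), which on Linux reports CPU time consumed by the process. For a single-threaded run the two are usually close, but they are not the same quantity. Below is a minimal sketch of the difference in Python; the helper name `measure` is my own, and the sleep is only a stand-in for work that takes wall time without using CPU.

```python
import time


def measure(fn):
    """Run fn() and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process
    result = fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


# Sleeping takes wall-clock time but almost no CPU time.
_, wall, cpu = measure(lambda: time.sleep(0.2))
print("wall: {:.3f} s, cpu: {:.3f} s".format(wall, cpu))
```

Measuring both programs with the same kind of clock (wall-clock on both sides, for example) makes the comparison easier to interpret.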
I am using numpy 1.8.1 and eigen 3.2.0-4.
g++ -std=c++11 -I/usr/include/eigen3 time_eigen.cpp -o my_exec – Niobous
-O2 does not help the performance. – Niobous
-march=native would bring us more performance; however, the eigen impl is still slower (now just slightly slower) than the numpy version. I guess numpy makes use of heavily optimized BLAS packages, so it is not easy to beat it with the above eigen code, right? – Feingold
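On the BLAS guess: as far as I understand, numpy only hands floating-point and complex products to BLAS, and integer arrays like the ones in the question go through a generic loop instead. That would need checking against the numpy build in use. A rough way to see whether the dtype matters on a given machine is to time the same product for both dtypes. The sketch below uses sizes ten times smaller than in the question so that it finishes quickly; the helper `time_dot` is a name I made up, and the timings it prints will vary by machine and numpy build.

```python
import time

import numpy as np


def time_dot(a, b):
    """Return (a.dot(b), elapsed wall-clock seconds)."""
    start = time.perf_counter()
    d = np.dot(a, b)
    return d, time.perf_counter() - start


# Scaled-down version of the matrices in the question (assumed sizes).
n_a_rows, n_a_cols, n_b_cols = 400, 300, 20
a_int = np.arange(n_a_rows * n_a_cols, dtype=np.int64).reshape(n_a_rows, n_a_cols)
b_int = np.arange(n_a_cols * n_b_cols, dtype=np.int64).reshape(n_a_cols, n_b_cols)

# Same values stored as doubles.
a_flt = a_int.astype(np.float64)
b_flt = b_int.astype(np.float64)

d_int, t_int = time_dot(a_int, b_int)
d_flt, t_flt = time_dot(a_flt, b_flt)

print("int64   : {:.6f} s".format(t_int))
print("float64 : {:.6f} s".format(t_flt))
```

At these sizes every intermediate sum stays below 2**53, so the float64 result holds exactly the same values as the int64 one; that stops being true for the full-size matrices in the question.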