C++: Eigen conservativeResize too expensive?

I have some Eigen matrices whose dimensions I don't know in advance; I only have an upper bound. I have a loop in which I fill those matrices (initialized using the upper bound) column by column until a stopping criterion is fulfilled (let's say after j iterations).

My problem is this: after the loop, I need those matrices for matrix multiplications (obviously using only the first j columns). The straightforward solution would be to use Eigen's conservativeResize and go right ahead with the matrix multiplications. But since the matrices tend to be quite large (100000+ dimensions) and, as far as I can see (not sure, though), conservativeResize allocates new memory for the resized matrix and performs one deep copy, this solution is quite expensive.
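
To make this concrete, here is a minimal sketch of the pattern (the sizes and the stopping criterion are placeholders):

#include <Eigen/Dense>
using namespace Eigen;

int main()
{
    const int upperBound = 1000;        //known upper bound on the column count
    MatrixXd M(100000, upperBound);     //preallocate at the upper bound

    int j = 0;
    for (; j < upperBound; ++j) {
        M.col(j).setRandom();           //fill column j
        if (j == 42) break;             //stand-in for the stopping criterion
    }

    //conservativeResize allocates a new 100000 x (j+1) buffer and
    //deep-copies the first j+1 columns into it -- this is the cost in question
    M.conservativeResize(NoChange, j + 1);
}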

I was thinking about writing my own custom matrix multiplication function that uses the old (big) matrices and takes arguments specifying the number of columns to use. I fear, though, that Eigen's matrix multiplication is so much better optimized that in the end this solution would be slower than just doing the conservative resize and using the standard Eigen multiplication...
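
For illustration, a naive version of such a function might look like this (just a sketch; the function name and loop order are mine, and Eigen's vectorized, cache-blocked product will almost certainly beat it):

#include <Eigen/Dense>
using namespace Eigen;

//multiply A (m x jUsed) by B (jUsed x p), reading only the first jUsed
//columns of the oversized A; plain triple loop, no vectorization
MatrixXd multiplyFirstCols(const MatrixXd& A, const MatrixXd& B, int jUsed)
{
    MatrixXd C = MatrixXd::Zero(A.rows(), B.cols());
    for (int col = 0; col < B.cols(); ++col)
        for (int k = 0; k < jUsed; ++k)
            for (int row = 0; row < A.rows(); ++row) //innermost over rows: column-major friendly
                C(row, col) += A(row, k) * B(k, col);
    return C;
}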

Should I just bite the bullet and use conservativeResize, or does anyone have a better idea? BTW: the matrices we're talking about are used in 3 multiplications and 1 transpose after the loop/resize.

Thanks in advance!

Edit: this is the relevant part of the code (where X is a MatrixXd, y is a VectorXd, and numComponents is the number of latent variables PLS1 is supposed to use). The thing is, though: at the beginning, numComponents will always be the number of dimensions of X (X.cols()), but the stopping criterion is supposed to check the relative improvement in the explained variance of the output vector (which I have not implemented yet). If the relative improvement is too small, the algorithm is supposed to stop (since we are happy with the first j components) and then compute the regression coefficients. For that, I need the conservativeResize:

using namespace Eigen;
MatrixXd W,P,T,B;
VectorXd c,xMean;
double yMean, mse;

W.resize(X.cols(),numComponents);
P.resize(X.cols(),numComponents);
T.resize(X.rows(),numComponents);
c.resize(numComponents);
xMean.resize(X.cols());
xMean.setZero();
yMean=0;
VectorXd yCopy=y;
//perform PLS1
for(size_t j=0; j< numComponents; ++j){
    VectorXd tmp=X.transpose()*y;
    W.col(j)=tmp/tmp.norm();                    //weight vector w_j
    T.col(j)=X*W.col(j);                        //score vector t_j
    double divisorTmp=T.col(j).transpose()*T.col(j);
    c(j)=(T.col(j).transpose()*y);
    c(j)/=divisorTmp;                           //regression coefficient c_j
    P.col(j)=X.transpose()*T.col(j)/divisorTmp; //loading vector p_j
    X=X-T.col(j)*P.col(j).transpose();          //deflate X
    y=y-T.col(j)*c(j);                          //deflate y
    if(/*TODO: stopping criterion*/ false && j<numComponents-1){
        numComponents=j+1;
        W.conservativeResize(X.cols(),numComponents);
        P.conservativeResize(X.cols(),numComponents);
        T.conservativeResize(X.rows(),numComponents);
        c.conservativeResize(numComponents);
    }
}
//store regression matrix
MatrixXd tmp=P.transpose()*W;
B=W*tmp.inverse()*c;
yCopy=yCopy-T*c;
mse=(yCopy.transpose()*yCopy);
mse/=y.size();//Mean Square Error
Gabon answered 12/6, 2014 at 14:1 Comment(3)
Have you written any code or attempted to benchmark it? – Engler
Yes, I have most of the code (it's an adapted version of the PLS1 algorithm); the only thing missing is the stopping criterion. I'll append it to the question. I have not tried benchmarking it, though. – Gabon
s/benchmark/profile/. Also, interesting: channel9.msdn.com/Events/Build/2014/4-587 – Dominicdominica

I think you could allocate the large matrix once, then use block() to create a view of the part that contains meaningful data for the multiplication. You can then reuse the big matrix. This spares you the reallocations.

The following example demonstrates it fully.

./eigen_block_multiply.cpp:

#include <Eigen/Dense>
#include <iostream>
using namespace std;
using namespace Eigen;
int main()
{
  Matrix<float, 2, 3> small;
  small << 1,2,3,
           4,5,6;

  Matrix<float, 4, 4> big = Matrix<float, 4, 4>::Constant(0.6);
  cout << "Big matrix:\n";
  cout << big << endl;
  cout << "Block of big matrix:\n";
  cout << big.block(0,0,3,2) << endl;
  cout << "Small matrix:\n";
  cout << small << endl;
  cout << "Product:\n";
  cout << small * big.block(0,0,3,2) << endl;

  //reuse the same big matrix with different contents and a different block size
  big = Matrix<float, 4, 4>::Constant(6.66);
  cout << "Product2:\n";
  cout << small * big.block(0,0,3,3) << endl;
}

Output:

Big matrix:
0.6 0.6 0.6 0.6
0.6 0.6 0.6 0.6
0.6 0.6 0.6 0.6
0.6 0.6 0.6 0.6
Block of big matrix:
0.6 0.6
0.6 0.6
0.6 0.6
Small matrix:
1 2 3
4 5 6
Product:
3.6 3.6
  9   9
Product2:
39.96 39.96 39.96
 99.9  99.9  99.9
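
Applied to the code in the question, this means leaving W, P, T, and c at their preallocated sizes and running the final computations on views of the first columns, e.g. (a sketch; leftCols(n) is equivalent to block(0, 0, rows(), n), and head(n) is the vector analogue):

//sketch: views of the first `used` columns instead of conservativeResize
int used = numComponents;   //number of components actually computed
MatrixXd tmp = P.leftCols(used).transpose() * W.leftCols(used);
B = W.leftCols(used) * tmp.inverse() * c.head(used);
yCopy = yCopy - T.leftCols(used) * c.head(used);

Since MatrixXd is column-major, a view of the leading columns is even contiguous in memory.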
Balsamic answered 12/6, 2014 at 14:37 Comment(3)
I didn't know Eigen brings that functionality along; it's exactly what I was searching for. Thank you, sir, you made my day. – Gabon
OK, please note that this still might be slower than exactly tailored matrices because of cache friendliness. When you don't iterate over the whole matrix, the data access patterns have holes, and that can be less cache friendly. But it doesn't have to be; I would run some tests. Being cache friendly can bring great improvements. – Balsamic
Alright, I'll keep it in mind and test that once everything runs. – Gabon
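
A rough way to run such a test (a self-contained sketch; the sizes are arbitrary):

#include <Eigen/Dense>
#include <chrono>
#include <iostream>
using namespace Eigen;

int main()
{
    const int rows = 2000, usedCols = 500, capacity = 1000;
    MatrixXd big = MatrixXd::Random(rows, capacity);   //oversized buffer
    MatrixXd other = MatrixXd::Random(usedCols, rows);

    auto t0 = std::chrono::steady_clock::now();
    MatrixXd viaView = other * big.block(0, 0, rows, usedCols);  //no copy
    auto t1 = std::chrono::steady_clock::now();

    MatrixXd tailored = big.block(0, 0, rows, usedCols);         //pay the copy once
    MatrixXd viaCopy = other * tailored;
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::duration<double, std::milli>;
    std::cout << "via view:        " << ms(t1 - t0).count() << " ms\n"
              << "via copy + mult: " << ms(t2 - t1).count() << " ms\n";
}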
