Solving normal equation system in C++
I would like to solve the system of linear equations:

 Ax = b

A is an n x m matrix (not square), b is an n x 1 vector, and x is an m x 1 vector. A and b are known, n is on the order of 50-100, and m is about 2 (in other words, A is at most 100 x 2).

I know the solution for x: $x = (A^T A)^{-1} A^T b$

I found a few ways to solve it: uBLAS (Boost), LAPACK, Eigen, etc., but I don't know how fast the CPU computation of x is with those packages. I also don't know whether this is a numerically fast way to solve for x.

What is important to me is that the CPU computation time be as short as possible, along with good documentation, since I am a newbie.

After solving the normal equations for Ax = b, I would like to improve my approximation using regression, and maybe later apply a Kalman filter.

My question is: which C++ library is the most robust and fastest for the needs I describe above?

Ruddock answered 3/1, 2011 at 13:38 Comment(4)
How do you multiply an n x m matrix by an n-dimensional column vector? Presumably x is actually m-dimensional.Antedate
Also, have you got some requirement that states a minimum amount of buzzword compliance?Antedate
@Ruddock I don't think the Boost uBLAS library implements this but please correct me if I'm wrong. It rather seems that uBLAS provides you with vectors, matrices and basic operations (multiplication, addition) but nothing like LU, QR, SVD or matrix inversion, let alone OLS implementation. However it's probably a good library to implement such algorithms. Again, please tell me if I'm wrong or if you find a good Boost uBLAS OLS implementation...Hash
I was wrong: there is an LU decomposition in lu.hpp. Along with the included triangular solver, it lets you do quite a lot.Hash
This is a least squares solution, because you have more equations than unknowns. If m is indeed equal to 2, that tells me that a simple linear least squares will be sufficient for you. The formulas can be written out in closed form. You don't need a library.

If m is in single digits, I'd still say that you can easily solve this using A^T A x = A^T b. A simple LU decomposition to solve for the coefficients would be sufficient. It should be a much more straightforward problem than you're making it out to be.

Integrator answered 3/1, 2011 at 14:17 Comment(9)
He talks about a Kalman filter. I presume he is comfortable with linear algebra, and OLS in particular. He wants an optimized library.Lukasz
@duffymo, you are right, for now the solution $x = (A^T A)^{-1} A^T b$ is what I am searching for. The Kalman filter is maybe for future development. What was important to me is which library (supporting inverse, transpose, matrix multiplication, etc.) I should work with (Boost, Eigen, LAPACK, etc.)Ruddock
I don't believe you need any of those right now. Kalman filter would be a different matter, but it's a future consideration.Integrator
@duffymo, I thought I needed Eigen or Boost for matrix transpose, inverse and multiplication. If I don't need those things right now, then what do I need?Ruddock
@Ruddock you don't need to calculate the inverse.Antedate
Eigenvalue problems are different from solving systems of equations. I'll have to go back and review what I know about Kalman filtering to recall if eigenvalues are central to its implementation. (Sorry, my books are at home.)Integrator
At present I just want to solve $Ax = b$ (Kalman filtering is not for now). Nevertheless, for finding LU or eigenvalues I would still need to use Eigen or Boost... Which one should I use?Ruddock
If it's really just two coefficients, I'd use the closed form solution and be done with it. It's easy to derive.Integrator
"Which one should I use?" - the one you know best. If you've never used either, then pick one and try it. Or, try both. Your problem is pretty simple. I'd suggest setting up a bake off, pick a measure, and see which one wins.Integrator
uBlas is not optimized unless you use it with optimized BLAS bindings.

The following are optimized for multi-threading and SIMD:

  1. Intel MKL. FORTRAN library with C interface. Not free but very good.
  2. Eigen. True C++ library. Free and open source. Easy to use and good.
  3. Atlas. FORTRAN and C. Free and open source. Not Windows friendly, but otherwise good.

Btw, I don't know exactly what you are doing, but as a rule the normal equations are not the proper way to do linear regression. Unless your matrix is well conditioned, QR or SVD should be preferred.

Lukasz answered 3/1, 2011 at 14:4 Comment(11)
Also ACML for AMD chips. This one is free I believe.Rummer
I'm not sure the optimised multi-threaded versions would be that much of a benefit for matrices as minuscule as this.Antedate
Would boost::numeric::ublas be considered "optimized BLAS bindings"?Ruddock
@David Heffernan. 100x100 isn't that small.Lukasz
I wrote a simulator (master's thesis) which generates some values. There could be a jump in the data, and I am trying to detect it in real time using statistical tests and updating my normal-equation estimate. I know that there are other ways, but this is what I am using.Ruddock
@watson: it's 100x2. It is small.Tussle
@Eagle: So check @duffymo's solution. You need to compute A^T A and A^T b and perform an LU decomposition on a small matrix. All this is cheap and stable (cheaper and much more stable than solving a 100x100 system)Tussle
@watson @Alexandre I'd be happy for my phone to do this in real time, but I presume the OP is using a real computer!Antedate
Ah... it is 100x2. Maybe too small for OpenMP. Nevertheless, I didn't say to use multithreading :).Lukasz
@watson 100x100 is tiny too, unless I'm missing somethingAntedate
@David Heffernan. 100x100 is big enough for QR. SVD is even more expensive. Plenty of opportunity for multithreading. A 100x100 matrix of doubles won't fit into L1 cache. More reasons to use an optimized implementation.Lukasz
If licensing is not a problem, you might try the GNU Scientific Library (GSL):

http://www.gnu.org/software/gsl/

It comes with a BLAS library that you can swap for an optimised one later if you need to (for example Intel MKL, ATLAS, or ACML for AMD chips).

Rummer answered 3/1, 2011 at 14:14 Comment(4)
GSL linear algebra routines aren't optimized.Lukasz
@watson so what? You don't need fancy optimisation for 100x2.Antedate
@watson It provides an interface to the underlying BLAS library, however. You can exchange it for your favourite BLAS library at link time rather than in code if you find you really need to optimise.Rummer
@David Heffernan. Cache is no problem, because 100x2 fits into L1, but with SSE it can be up to 4-8 times faster than uBLAS (if done in single precision). Download Eigen and see for yourself.Lukasz
If you have access to MATLAB, I would recommend using its C libraries.

Hereford answered 3/1, 2011 at 13:42 Comment(6)
hmm, rather a brutal solution to a trivial problem!Antedate
AFAIK, the Matlab C library (at least the linear algebra routines) is based on some of the well-known publicly available libraries (LAPACK).Lukasz
@watson all the linear algebra libraries are essentially the same and derived from the HandbookAntedate
@David Heffernan. Hope you are joking. MKL, Eigen and ATLAS are optimized to exploit cache memory more efficiently. Stock LAPACK (are you referring to the LAPACK handbook?) provides only an (un-optimized) reference implementation and high-level routines.Lukasz
@watson the algorithms are essentially the sameAntedate
@David Heffernan. Math formulas are the same, but implementations are different. Tuning to memory hierarchy is done differently. LAPACK and uBLAS don't do any tuning.Lukasz
If you really need to specialize, you can approximate matrix inversion (to arbitrary precision) using the Skilling method. It uses only O(N^2) operations (rather than the O(N^3) of usual matrix inversion, e.g. LU decomposition).

It's described in the thesis of Gibbs linked to here (around page 27):

http://www.inference.phy.cam.ac.uk/mng10/GP/thesis.ps.gz

Rummer answered 3/1, 2011 at 14:24 Comment(3)
Never use matrix inversion to solve linear systems. Solving linear systems is in essence an O(n^2) problem.Tussle
@Alexandre Really? I would be interested in your solution. LU decomposition, for example, is O(N^3) (according to the thesis I cite, anyway).Rummer
@David I agree entirely, but the question asks for "as fast as possible", so I offer this as a future specialisation if they really need it.Rummer