I am having a hard time optimizing a program that relies on ad's conjugateGradientDescent function for most of its work.
Basically my code is a translation of an old paper's code, which is written in Matlab and C. I have not measured it, but that code runs at several iterations per second; mine is on the order of minutes per iteration ...
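For context, the whole program boils down to repeated calls shaped roughly like the sketch below. The quadratic objective and the iteration count are placeholders of mine, not the real cost function; I import conjugateGradientDescent from Numeric.AD here, which may need adjusting for other ad versions.

import Numeric.AD (conjugateGradientDescent)

-- Toy stand-in for the real objective: a simple quadratic bowl.
-- It has to stay polymorphic in the scalar type so that ad can
-- instantiate it at its own AD types.
objective :: Floating a => [a] -> a
objective [x, y] = (x - 1) ^ 2 + 2 * (y + 3) ^ 2
objective _      = error "objective: expected two parameters"

main :: IO ()
main =
  -- conjugateGradientDescent returns the list of iterates;
  -- here we just inspect an early one.
  print . last . take 50 $ conjugateGradientDescent objective [0, 0]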
The code is available in these repositories:
The code in question can be run by following these commands:
$ cd aer-utils
$ cabal sandbox init
$ cabal sandbox add-source ../aer
$ cabal run learngabors
Using GHC's profiling facilities I have confirmed that the descent is in fact the part that is taking most of the time:
(interactive version here: https://dl.dropboxusercontent.com/u/2359191/learngabors.svg)
The runtime statistics from -s are telling me that productivity is quite low:

Productivity 33.6% of total user, 33.6% of total elapsed
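(For reference, statistics like these can be produced with something like the following, assuming the executable is built with profiling enabled and linked with -rtsopts; the exact cabal flags may differ depending on your setup.)

$ cabal configure --enable-library-profiling --enable-executable-profiling
$ cabal run learngabors -- +RTS -s -p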
From what I have gathered there are two things that might lead to higher performance:
Unboxing: currently I use a custom matrix implementation (in src/Data/SimpleMat.hs). This was the only way I could get ad to work with matrices (see: How to do automatic differentiation on hmatrix?). My guess is that switching to a matrix type like newtype Mat w h a = Mat (Unboxed.Vector a) would achieve better performance due to unboxing and fusion. I found some code that has ad instances for unboxed vectors, but so far I haven't been able to use it with the conjugateGradientDescent function.
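To make the problem concrete, here is a sketch of the unboxed type I have in mind and of why I think it clashes with conjugateGradientDescent. The type-level size parameters are just how I imagine it; the actual type in the repository differs.

{-# LANGUAGE DataKinds, KindSignatures #-}
import qualified Data.Vector.Unboxed as U
import GHC.TypeLits (Nat)

-- The matrix type I would like to use: a flat unboxed vector with
-- phantom width/height parameters.
newtype Mat (w :: Nat) (h :: Nat) a = Mat (U.Vector a)

-- The obstacle, as far as I understand it: conjugateGradientDescent takes
-- the parameters in a Traversable container, but an unboxed vector cannot
-- be made Functor/Foldable/Traversable, because every element operation
-- needs an Unbox constraint. So this newtype cannot be handed to it
-- directly, and the ad-for-unboxed-vectors code I found presumably has to
-- work around that somehow.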
Matrix derivatives: in an email I can't find at the moment, Edward mentions that it would be better to use Forward instances for matrix types instead of having matrices filled with Forward instances. I have a faint idea of how to achieve that, but I have yet to figure out how to implement it in terms of ad's type classes.
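My rough understanding of that suggestion is sketched below: give the matrix type itself numeric instances, so that the AD mode can wrap whole matrices, i.e. work with Forward (Mat w h Double) rather than Mat w h (Forward Double). This is only a sketch of the idea (boxed vectors, names of my own choosing), not a working ad integration.

{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}
import qualified Data.Vector as V
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

newtype Mat (w :: Nat) (h :: Nat) a = Mat (V.Vector a)

-- Elementwise numeric structure: this is what would let a whole matrix
-- play the role of a single "scalar" from ad's point of view.
instance (KnownNat w, KnownNat h, Num a) => Num (Mat w h a) where
  Mat xs + Mat ys = Mat (V.zipWith (+) xs ys)
  Mat xs - Mat ys = Mat (V.zipWith (-) xs ys)
  Mat xs * Mat ys = Mat (V.zipWith (*) xs ys)   -- elementwise, not matrix product
  negate (Mat xs) = Mat (V.map negate xs)
  abs    (Mat xs) = Mat (V.map abs xs)
  signum (Mat xs) = Mat (V.map signum xs)
  fromInteger n   = Mat (V.replicate size (fromInteger n))
    where size = fromInteger (natVal (Proxy :: Proxy w) * natVal (Proxy :: Proxy h))

-- Missing piece (the part I haven't figured out): ad's own classes (Mode and
-- friends) would also need instances involving this matrix type before
-- conjugateGradientDescent could drive it in this "matrix as scalar" style.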
This is probably a question that is too broad to be answered on SO, so if you are willing to help me out, feel free to contact me on Github.
cabal run does run compiled code. Running the same thing from GHCi (i.e. use :main) is even slower. – Prohibitionist