Using derivatives as functions in CppAD
Asked Answered
L

2

28

I am trying to modify the example here:

# include <cppad/cppad.hpp>
namespace { // ---------------------------------------------------------
// define the template function JacobianCases<Vector> in empty namespace
template <typename Vector>
bool JacobianCases()
{     bool ok = true;
     using CppAD::AD;
     using CppAD::NearEqual;
     double eps99 = 99.0 * std::numeric_limits<double>::epsilon();
     using CppAD::exp;
     using CppAD::sin;
     using CppAD::cos;

     // domain space vector
     size_t n = 2;
     CPPAD_TESTVECTOR(AD<double>)  X(n);
     X[0] = 1.;
     X[1] = 2.;

     // declare independent variables and starting recording
     CppAD::Independent(X);

     // a calculation between the domain and range values
     AD<double> Square = X[0] * X[0];

     // range space vector
     size_t m = 3;
     CPPAD_TESTVECTOR(AD<double>)  Y(m);
     Y[0] = Square * exp( X[1] );
     Y[1] = Square * sin( X[1] );
     Y[2] = Square * cos( X[1] );

     // create f: X -> Y and stop tape recording
     CppAD::ADFun<double> f(X, Y);

     // new value for the independent variable vector
     Vector x(n);
     x[0] = 2.;
     x[1] = 1.;

     // compute the derivative at this x
     Vector jac( m * n );
     jac = f.Jacobian(x);

     /*
     F'(x) = [ 2 * x[0] * exp(x[1]) ,  x[0] * x[0] * exp(x[1]) ]
             [ 2 * x[0] * sin(x[1]) ,  x[0] * x[0] * cos(x[1]) ]
             [ 2 * x[0] * cos(x[1]) , -x[0] * x[0] * sin(x[i]) ]
     */
     ok &=  NearEqual( 2.*x[0]*exp(x[1]), jac[0*n+0], eps99, eps99);
     ok &=  NearEqual( 2.*x[0]*sin(x[1]), jac[1*n+0], eps99, eps99);
     ok &=  NearEqual( 2.*x[0]*cos(x[1]), jac[2*n+0], eps99, eps99);

     ok &=  NearEqual( x[0] * x[0] *exp(x[1]), jac[0*n+1], eps99, eps99);
     ok &=  NearEqual( x[0] * x[0] *cos(x[1]), jac[1*n+1], eps99, eps99);
     ok &=  NearEqual(-x[0] * x[0] *sin(x[1]), jac[2*n+1], eps99, eps99);

     return ok;
}
} // End empty namespace
# include <vector>
# include <valarray>
bool Jacobian(void)
{     bool ok = true;
     // Run with Vector equal to three different cases
     // all of which are Simple Vectors with elements of type double.
     ok &= JacobianCases< CppAD::vector  <double> >();
     ok &= JacobianCases< std::vector    <double> >();
     ok &= JacobianCases< std::valarray  <double> >();
     return ok;
}

I am trying to modify it in the following way:

Let G be the Jacobian jac that is calculated in this example, in the line:

jac = f.Jacobian(x);

and, as in the example, let X be the independent variables. I would like to construct a new function, H, which is a function of jac, i.e. H(jacobian(X)) = something, such that H is autodifferentiable. An example may be H(X) = jacobian( jacobian(X)[0]), i.e. the jacobian of the first element of jacobian(X) w.r.t X (a second derivative of sorts).

The problem is that jac as written here is of type Vector, which is a parameterized type on a raw double, not an AD<double>. To my knowledge, this means the output is not autodifferentiable.

I am looking for some advice on if it is possible to use the Jacobian in a larger operation, and take the Jacobian of that larger operation (not unlike any arithmetic operator) or if this is not possible.

EDIT: This has been put up for a bounty once, but I'm putting it up again to see if there's a better solution, because I think this is important. To be a bit more clear, the elements that the "correct" answer needs are:

a) A means of calculating arbitrary order derivatives.

b) An intelligent way of not having to specify the order of derivatives a priori. If the maximum order derivative must be known at compile time, the order of derivative can't be determined algorithmically. Further, specifying an enormously large order as in the current answer given will lead to memory allocation issues and, I imagine, performance issues.

c) Abstracting the templating of derivative order away from the end-user. This is important, because it can be difficult to keep track of the order of derivatives needed. This is probably something that comes "for free" if b) is solved.

If anybody can crack this, it would be an awesome contribution and an extremely useful operation.

Lacee answered 30/5, 2018 at 23:25 Comment(2)
I have added the requested code. Sorry about that!Lacee
So far, it looks like multiple levels might be the correct answer (such as in coin-or.github.io/CppAD/doc/mul_level.cpp.htm), but it's not clear to me how to apply this an arbitrary number of times, which is what I need.Lacee
S
6

If you want to nest functions, you should nest the AD<> as well. You can nest Jacobians as other functions, for instance see the code snippet below, which is computing the double derivative by nesting Jacobian

#include <cstring>
#include <iostream>      // standard input/output                                                                                                                                                                                      
#include <vector>        // standard vector                                                                                                                                                                                            
#include <cppad/cppad.hpp> // the CppAD package http://www.coin-or.org/CppAD/                                                                                                                                                          

// main program                                                                                                                                                                                                                        
int main(void)
{     using CppAD::AD;           // use AD as abbreviation for CppAD::AD                                                                                                                                                               
  using std::vector;         // use vector as abbreviation for std::vector                                                                                                                                                             
  size_t i;                  // a temporary index                                                                                                                                                                                      


  // domain space vector                                                                                                                                                                                                               
  auto Square = [](auto t){return t*t;};
  vector< AD<AD<double>> > X(1); // vector of domain space variables                                                                                                                                                                   

  // declare independent variables and start recording operation sequence                                                                                                                                                              
  CppAD::Independent(X);

  // range space vector                                                                                                                                                                                                                
  vector< AD<AD<double>> > Y(1); // vector of ranges space variables                                                                                                                                                                   
  Y[0] = Square(X[0]);      // value during recording of operations                                                                                                                                                                    

  // store operation sequence in f: X -> Y and stop recording                                                                                                                                                                          
  CppAD::ADFun<AD<double>> f(X, Y);

  // compute derivative using operation sequence stored in f                                                                                                                                                                           
  vector<AD<double>> jac(1); // Jacobian of f (m by n matrix)                                                                                                                                                                          
  vector<AD<double>> x(1);       // domain space vector                                                                                                                                                                                

  CppAD::Independent(x);
  jac  = f.Jacobian(x);      // Jacobian for operation sequence                                                                                                                                                                        
  CppAD::ADFun<double> f2(x, jac);
  vector<double> result(1);
  vector<double> x_res(1);
  x_res[0]=15.;
  result=f2.Jacobian(x_res);

  // print the results                                                                                                                                                                                                                 
  std::cout << "f'' computed by CppAD = " << result[0] << std::endl;
}

As a side-note, since C++14 or 11 implementing expression templates and automatic differentiation became easier and can be done with much less effort, as shown e.g. in this video towards the end https://www.youtube.com/watch?v=cC9MtflQ_nI (sorry for the poor quality). If I had to implement reasonably simple symbolic operations I would start from scratch with modern C++: you can write simpler code, and you get errors that you can understand easily.

Edit: Generalizing the example to build arbitrary order derivatives can be a template metaprogramming exercice. The snippet below shows it is possible using template recursion

#include <cstring>
#include <iostream>
#include <vector>
#include <cppad/cppad.hpp>

using CppAD::AD;
using std::vector;

template<typename T>
struct remove_ad{
    using type=T;
};

template<typename T>
struct remove_ad<AD<T>>{
    using type=T;
};

template<int N>
struct derivative{
    using type = AD<typename derivative<N-1>::type >;
    static constexpr int order = N;
};

template<>
struct derivative<0>{
    using type = double;
    static constexpr int order = 0;
};

template<typename T>
struct Jac{
    using value_type = typename remove_ad<typename T::type>::type;

    template<typename P, typename Q>
    auto operator()(P & X, Q & Y){

    CppAD::ADFun<value_type> f(X, Y);
    vector<value_type> jac(1);
    vector<value_type> x(1);

    CppAD::Independent(x);
    jac  = f.Jacobian(x);

    return Jac<derivative<T::order-1>>{}(x, jac);
    }

};

template<>
struct Jac<derivative<1>>{
    using value_type = derivative<0>::type;

    template<typename P, typename Q>
    auto operator()(P & x, Q & jac){

    CppAD::ADFun<value_type> f2(x, jac);
    vector<value_type> res(1);
    vector<value_type> x_res(1);
    x_res[0]=15.;
    return f2.Jacobian(x_res);
    }
};

int main(void)
{
    constexpr int order=4;
    auto Square = [](auto t){return t*t;};
    vector< typename derivative<order>::type > X(1);
    vector< typename derivative<order>::type > Y(1);

    CppAD::Independent(X);   
    Y[0] = Square(X[0]);
    auto result = Jac<derivative<order>>{}(X, Y);

    std::cout << "f'' computed by CppAD = " << result[0] << std::endl;
} 
Scar answered 6/6, 2018 at 7:42 Comment(7)
Thanks for the example. This almost works - except that you have to know the number of levels of differentiation you want a priori. Is there a way to take nth order derivatives without having to specify n in advance?Lacee
I added another snippet in which arbitrary order derivation is computedScar
Thank you. I haven't verified this yet (I hopefully will tomorrow), but I gave you the bounty anyway since I won't be able to check it until after it ended.Lacee
Can you explain in a little more depth what each of the components here do?Lacee
I believe there are actually bugs in this counstruction, since for order=2, it's outputting "f'' computed by CppAD = 0" Which is wrong; it should be 2.0.Lacee
oh right, there was a bug. I was calling Independent(X) and Y[0]=f(X) in the wrong order. I fixed it. I'll add some extra explanation in a bit..Scar
Thanks for the quick fix. One other question though - it is not clear how one would use this code to pass in a value for the dependent vector. How would one do that?Lacee
A
1

There is a new feature in CppAD that eliminates the need for AD< AD >, see https://coin-or.github.io/CppAD/doc/base2ad.cpp.htm

Altissimo answered 16/10, 2018 at 21:54 Comment(1)
Even though this technically answers the question, it would be much improved if it included any relevant code in the answer itself.Kerge

© 2022 - 2024 — McMap. All rights reserved.