using SciPy to integrate a function that returns a matrix or array
Asked Answered
H

6

16

I have a symbolic array that can be expressed as:

from sympy import lambdify, Matrix

g_sympy = Matrix([[   x,  2*x,  3*x,  4*x,  5*x,  6*x,  7*x,  8*x,   9*x,  10*x],
                  [x**2, x**3, x**4, x**5, x**6, x**7, x**8, x**9, x**10, x**11]])

g = lambdify( (x), g_sympy )

So that for each x I get a different matrix:

g(1.) # matrix([[  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.],
      #         [  1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.]])
g(2.) # matrix([[  2.00e+00,   4.00e+00,   6.00e+00,   8.00e+00,   1.00e+01, 1.20e+01,   1.40e+01,   1.60e+01,   1.80e+01,   2.00e+01],
      #         [  4.00e+00,   8.00e+00,   1.60e+01,   3.20e+01,   6.40e+01, 1.28e+02,   2.56e+02,   5.12e+02,   1.02e+03,   2.05e+03]])

and so on...


I need to numerically integrate g over x, say from 0. to 100. (in the real case the integral does not have an exact solution) and in my current approach I have to lambdify each element in g and integrate it individually. I am using quad to do an element-wise integration like:

ans = np.zeros( g_sympy.shape )
for (i,j), func_sympy in ndenumerate(g_sympy):
    func = lambdify( (x), func_sympy)
    ans[i,j] = quad( func, 0., 100. )

There are two problems here: 1) lambdify used many times and 2) for loop; and I believe the first one is the bottleneck, because the g_sympy matrix has at most 10000 terms (which is not a big deal to a for loop).

As shown above lambdify allows the evaluation of the whole matrix, so I thought: "Is there a way to integrate the whole matrix?"

scipy.integrate.quadrature has a parameter vec_func which gave me hope. I was expecting something like:

g_int = quadrature( g, x1, x2 )

to get the fully integrated matrix, but it gives the ValueError: matrix must be 2-dimensional


EDIT: What I am trying to do can apparently be done in Matlab using quadv and has already been discussed for SciPy

The real case has been made available here.

To run it you will need:

  • numpy
  • scipy
  • matplotlib
  • sympy

Just run: "python curved_beam_mrs.py".

You will see that the procedure is already slow, mainly because of the integration, indicated by the TODO in file curved_beam.py.

It will go much slower if you remove the comment indicated after the TODO in file curved_beam_mrs.py.

The matrix of functions which is integrated is showed in the print.txt file.

Thank you!

Hispaniola answered 12/6, 2013 at 16:10 Comment(10)
can you specify what do you mean by integrating a matrix --- in just math terms? What the result should be, is it a matrix or is it a scalar or something else?Abran
I am changing x from 0. to 100., and I want to integrate every element in the matrix, like doing quad( g[i,j], 0., 100. ) for (i,j),v in ndenumerate(g) but this is the element-wise approach that I am trying to avoid...Hispaniola
code looks awkward in comments, so added an answer --- more of a non-answer, actually :-).Abran
What kind of accuracy are you looking for here?Towandatoward
A solution with an accuracy of 1.e-3 would be great already.Hispaniola
Are the elements of the matrix simple polynomial functions of x?Lemma
In the real case they are long series of sin and cos functions...Hispaniola
Like sin(x)^m*cos(x)^n, with m,n arbitrary positive integer coefficients?Lemma
@flebool question updated!Hispaniola
How much does the getting rid of sympy help? and after that, how much of an issue is speed still?Newly
A
6

The first argument to either quad or quadrature must be a callable. The vec_func argument of the quadrature refers to whether the argument of this callable is a (possibly multidimensional) vector. Technically, you can vectorize the quad itself:

>>> from math import sin, cos, pi
>>> from scipy.integrate import quad
>>> from numpy import vectorize
>>> a = [sin, cos]
>>> vectorize(quad)(a, 0, pi)
(array([  2.00000000e+00,   4.92255263e-17]), array([  2.22044605e-14,   2.21022394e-14]))

But that's just equivalent to explicit looping over the elements of a. Specifically, it'll not give you any performance gains, if that's what you're after. So, all in all, the question is why and what exactly you are trying to achieve here.

Abran answered 13/6, 2013 at 23:2 Comment(2)
thank you for your answer. I am relly looking for performance gain. But your solution is more elegant than the for loop I am using now...Hispaniola
If I replace a = [sin, cos] with a = lambda t: [sin(t), cos(t)] the code does not work anymore. How should I adapt the code to get it to work for a = lambda t: [sin(t), cos(t)]?Lonely
N
3

In the real case the integral does not have an exact solution, do you mean singularities? Could you be more precise on it, as well as on the size of the matrix that you wish to integrate. I have to admit that sympy is dreadfully slow when it comes to some things (not sure if integration is part of it, but i prefer to stay away from sympy and stick to numpy solution). Do you want to get a more elegant solution, by doing it with a matrix or a faster one?

-note: apparently i cant add comment to your post to ask for this stuff, so i had to post this as answer, maybe this is because i dont have enough reputation or so?-

edit: something like this?

    import numpy
    from scipy.integrate import trapz
    g=lambda x: numpy.array([[x,2*x,3*x],[x**2,x**3,x**4]])
    xv=numpy.linspace(0,100,200)
    print trapz(g(xv))

having seen that you want to integrate stuff like sum(a*sin(bx+c)^n*cos(dx+e)^m), for different coefficients for the a,b,c,d,e,m,n, i suggest doing all of those analytically. (should have some formula for that since you can just rewrite sin to complex exponentials

Another thing i noted when checking those functions a bit better, is that sin(a*x+pi/2) and sin(a*x+pi) and stuff like that can be rewritten to cos or sin in a way that removes the pi/2 or pi. Also what i see is that just by looking at the first element in your matrix of functions:

a*sin(bx+c)^2+d*cos(bx+c)^2 = a*(sin^2+cos^2)+(d-a)*cos(bx+c)^2 = a+(d-a)*cos(bx+c)^2 

which also simplifies the calculations. If you had the formulas in a way which didnt involve a massive txtfile or so, id check what the most general formula is that you need to integrate, but i guess its something like a*sin^n(bx+c)*cos^m(dx+e), with m and n being 0 1 or 2, and those things can be simplified into something which can be analytically integrated. So if you find out the most general analytical function you got, you can easily make something like

f=lambda x: [[s1(x),s2(x)],[s3(x),s4(x)]]
res=f(x2)-f(x1)

where s1(x) etc are just the analytically integrated versions of your functions?

(not really planning on going through your entire code to see what all the rest does, but is it just integrating those functions in the txt file from a to b or something like that? or is there somewhere something like that you take the square of each function or whatever thing that might mess up the possibility of doing it analytically?)

this should simplify your integrals i guess?

first integral and: second one

hmm, that second link doesnt work, but you get the idea from the first one i guess

edit, since you do not want analytical solutions: the improvement remains in getting rid of sympy:

from sympy import sin as SIN
from numpy import sin as SIN2
from scipy.integrate import trapz
import time
import numpy as np

def integrand(rlin):
    first=np.arange(1,11)[:,None]
    second=np.arange(2,12)[:,None]
    return np.vstack((rlin*first,np.power(rlin,second)))

def simpson2(func,a,b,num):
    a=float(a)
    b=float(b)
    h=(b-a)/num
    p1=a+h*np.arange(1,num,2)
    p2=a+h*np.arange(2,num-1,2)
    points=np.hstack((p1,p2,a,b))
    mult=np.hstack((np.repeat(4,p1.shape[0]),np.repeat(2,p2.shape[0]),1,1))
    return np.dot(integrand(points),mult)*h/3


A=np.linspace(0,100.,200)

B=lambda x: SIN(x)
C=lambda x: SIN2(x)

t0=time.time()
D=simpson2(B,0,100.,200)
print time.time()-t0
t1=time.time()
E=trapz(C(A))
print time.time()-t1

    t2=time.time()
    F=simpson2(C,0,100.,200)
    print time.time()-t2

results in:

0.000764131546021 sec for the faster method, but when using sympy

7.58171081543e-05 sec for my slower method, but which uses numpy

0.000519037246704 sec for the faster method, when using numpy, 

conclusion: use numpy, ditch sympy, (my slower numpy method is actually faster in this case, because in this example i only tried it on one sin-function, instead of on a ndarray of them, but the point of ditching sympy still remains when comparing the time of the numpy version of the faster method to the one of the sympy version of the faster method)

Newly answered 13/6, 2013 at 9:0 Comment(12)
yes, you need 50 reputation to add comments, but thank you very much by the comment. I just want to integrate a function that returns a matrix/array, the sympy example was included to give some backgroundHispaniola
But is your function "gentle enough"? (does it have singularities or other nasty stuff in it? or is it just some function like a product of some stuff that cant be solved analytically but still is "gentle enough to be integrated numerically" ?)Newly
so, are you looking for a performance gain? or for a more elegant code?Newly
anyone care to explain the downvote? (first upvoted, than half an hour later downvoted again? )Newly
i suggest an analytical solution ;-) since your things you want to integrate are all asin(bx+c)*cos(e*x+f), or similar (assuming after a quick look, you just want to integrate the functions you got there in that txt file), (if it can be done analytical, do it that way), and get rid of sympy, since its SLOW. Change all your sympy.sin by numpy.sin etc, that should boost your performance a lotNewly
edited it a bit more with some explanation on the analytic version- that should definately make it MUCH faster to integrate everythingNewly
Thank you very much for your answer. I am still seeking for a solution that makes it possible to do numerical integration to be of a more general scope.Hispaniola
@sgpc, Just to be clear, you mean that you are NOT interested in an analytical solution?Lemma
Exactly, just because it should be general, and an analytical solution would fit only for a specific caseHispaniola
can you explain a bit more what the most general case is? because if its only sin n cos n things like that, but with different numbers in it, analytical still is the way to go? or do you expect future applications to contain besselfunctions n other stuff that might get messy? (and did getting rid of sympy help enough?)Newly
tested it a bit myself: sympy.sin is much slower than numpy.sinNewly
how much do these changes make it faster when you change it into your code, when using numpy instead of sympy? (if it is just with one sin function, it should be as described, but not sure when it is used in the entire code etc)Newly
T
3

Vectorizing trapezoidal and simpson's integral rules. Trapezoidal is just copy and pasted from another project that used logspace instead of linspace so that it can utilize non-uniform grids.

def trap(func,a,b,num):
    xlinear=np.linspace(a,b,num)
    slicevol=np.diff(xlinear)
    output=integrand(xlinear)
    output=output[:,:-1]+output[:,1:]
    return np.dot(output,slicevol)/2

def simpson(func,a,b,num):
    a=float(a)
    b=float(b)
    h=(b-a)/num

    output=4*np.sum(integrand(a+h*np.arange(1,num,2)),axis=1)
    output+=2*np.sum(integrand(a+h*np.arange(2,num-1,2)),axis=1)
    output+=np.sum(integrand(b),axis=1)
    output+=np.sum(integrand(a),axis=1)
    return output*h/3

def integrand(rlin):
    first=np.arange(1,11)[:,None]
    second=np.arange(2,12)[:,None]
    return np.vstack((rlin*first,np.power(rlin,second)))

Examine trapazoidal and simpsons rule cumulative relative errors:

b=float(100)
first=np.arange(1,11)*(b**2)/2
second=np.power(b,np.arange(3,13))/np.arange(3,13)
exact=np.vstack((first,second))

for x in range(3):
    num=x*100+100
    tr=trap(integrand,0,b,num).reshape(2,-1)
    si=simpson(integrand,0,b,num).reshape(2,-1)
    rel_trap=np.sum(abs((tr-exact)/exact))*100
    rel_simp=np.sum(abs((si-exact)/exact))*100
    print 'Number of points:',num,'Trap Rel',round(rel_trap,6),'Simp Rel',round(rel_simp,6)

Number of points: 100 Trap Rel 0.4846   Simp Rel 0.000171
Number of points: 200 Trap Rel 0.119944 Simp Rel 1.1e-05
Number of points: 300 Trap Rel 0.053131 Simp Rel 2e-06

Timeit. Note that both trapezoidal rules use 200 points while simpsons is timed at only 100 based on the above convergence. Sorry I dont have sympy:

s="""
import numpy as np
from scipy.integrate import trapz

def integrand(rlin):
    first=np.arange(1,11)[:,None]
    second=np.arange(2,12)[:,None]
    return np.vstack((rlin*first,np.power(rlin,second)))

def trap(func,a,b,num):
    xlinear=np.linspace(a,b,num)
    slicevol=np.diff(xlinear)
    output=integrand(xlinear)
    output=output[:,:-1]+output[:,1:]
    return np.dot(output,slicevol)/2

def simpson(func,a,b,num):
    a=float(a)
    b=float(b)
    h=(b-a)/num

    output=4*np.sum(integrand(a+h*np.arange(1,num,2)),axis=1)
    output+=2*np.sum(integrand(a+h*np.arange(2,num-1,2)),axis=1)
    output+=np.sum(integrand(b),axis=1)
    output+=np.sum(integrand(a),axis=1)
    return output*h/3

def simpson2(func,a,b,num):
    a=float(a)
    b=float(b)
    h=(b-a)/num
    p1=a+h*np.arange(1,num,2)
    p2=a+h*np.arange(2,num-1,2)
    points=np.hstack((p1,p2,a,b))
    mult=np.hstack((np.repeat(4,p1.shape[0]),np.repeat(2,p2.shape[0]),1,1))
    return np.dot(integrand(points),mult)*h/3

def x2(x):
    return x**2
def x3(x):
    return x**3
def x4(x):
    return x**4
def x5(x):
    return x**5
def x5(x):
    return x**5
def x6(x):
    return x**6
def x7(x):
    return x**7
def x8(x):
    return x**8
def x9(x):
    return x**9
def x10(x):
    return x**10
def x11(x):
    return x**11
def xt5(x):
    return 5*x
"""

zhenya="""
a=[xt5,xt5,xt5,xt5,xt5,xt5,xt5,xt5,xt5,xt5,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11]
vectorize(quad)(a, 0, 100)
"""

usethedeathstar="""
g=lambda x: np.array([[x,2*x,3*x,4*x,5*x,6*x,7*x,8*x,9*x,10*x],[x**2,x**3,x**4,x**5,x**6,x**7,x**8,x**9,x**10,x**11]])
xv=np.linspace(0,100,200)
trapz(g(xv))
"""

vectrap="""
trap(integrand,0,100,200)
"""

vecsimp="""
simpson(integrand,0,100,100)
"""

vecsimp2="""
simpson2(integrand,0,100,100)
"""

print 'zhenya took',timeit.timeit(zhenya,setup=s,number=100),'seconds.'
print 'usethedeathstar took',timeit.timeit(usethedeathstar,setup=s,number=100),'seconds.'
print 'vectrap took',timeit.timeit(vectrap,setup=s,number=100),'seconds.'
print 'vecsimp took',timeit.timeit(vecsimp,setup=s,number=100),'seconds.'
print 'vecsimp2 took',timeit.timeit(vecsimp2,setup=s,number=100),'seconds.'

Results:

zhenya took 0.0500509738922 seconds.
usethedeathstar took 0.109386920929 seconds.
vectrap took 0.041011095047 seconds.
vecsimp took 0.0376999378204 seconds.
vecsimp2 took 0.0311458110809 seconds.

Something to point out in the timings is zhenya's answer should be much more accurate. I believe everything is correct, please let me know if changes are required.

If you provide the functions and range that you will be using I can probably whip up something better for your system. Also would you be interested in utilizing additional cores/nodes?

Towandatoward answered 2/7, 2013 at 15:5 Comment(1)
thank you very much for your answer. I've updated the question with the real case I am integrating...Hispaniola
E
1

I might have found some interesting way of doing this, at the expense of defining different symbols for the matrix g_symp:

import numpy as np
from scipy.integrate import quad
import sympy as sy

@np.vectorize
def vec_lambdify(var, expr, *args, **kw):
    return sy.lambdify(var, expr, *args, **kw)

@np.vectorize
def vec_quad(f, a, b, *args, **kw):
    return quad(f, a, b, *args, **kw)[0]

Y = sy.symbols("y1:11")
x = sy.symbols("x")
mul_x = [y.subs(y,x*(i+1)) for (i,y) in enumerate(Y)]
pow_x = [y.subs(y,x**(i+1)) for (i,y) in enumerate(Y)]

g_sympy = np.array(mul_x + pow_x).reshape((2,10))
X = x*np.ones_like(g_sympy)
G = vec_lambdify(X, g_sympy)
I = vec_quad(G, 0, 100)
print(I)

with results:

[[  5.00000000e+03   1.00000000e+04   1.50000000e+04   2.00000000e+04
    2.50000000e+04   3.00000000e+04   3.50000000e+04   4.00000000e+04
    4.50000000e+04   5.00000000e+04]
[  5.00000000e+03   3.33333333e+05   2.50000000e+07   2.00000000e+09
   1.66666667e+11   1.42857143e+13   1.25000000e+15   1.11111111e+17
   1.00000000e+19   9.09090909e+20]]

and using the ipython magic%timeit vec_quad(G,0,100) I got

1000 loops, best of 3: 527 µs per loop

I think this approach is somewhat more clean, despite the juggling with symbols.

Elyn answered 27/3, 2014 at 16:51 Comment(0)
P
1

quadpy (a project of mine) does vectorized quadrature. This

import numpy
import quadpy


def f(x):
    return [
        [k * x for k in range(2, 11)],
        [x ** k for k in range(2, 11)],
    ]


sol, err = quadpy.quad(f, 0.0, 100.0, epsabs=numpy.inf, epsrel=1.0e-10)

print(sol)
print(err)

gives

[[1.00000000e+04 1.50000000e+04 2.00000000e+04 2.50000000e+04
  3.00000000e+04 3.50000000e+04 4.00000000e+04 4.50000000e+04
  5.00000000e+04]
 [3.33333333e+05 2.50000000e+07 2.00000000e+09 1.66666667e+11
  1.42857143e+13 1.25000000e+15 1.11111111e+17 1.00000000e+19
  9.09090909e+20]]
[[5.11783704e-16 4.17869644e-16 1.02356741e-15 9.15506521e-16
  8.35739289e-16 1.19125717e-15 2.04713482e-15 1.93005721e-15
  1.83101304e-15]
 [6.69117036e-14 9.26814751e-12 1.05290634e-09 1.12081237e-07
  1.09966583e-05 1.09356156e-03 1.00722052e-01 9.31052614e+00
  9.09545305e+02]]

Timings:

%timeit quadpy.quad(f, 0.0, 100.0, epsabs=numpy.inf, epsrel=1.0e-10)    
904 µs ± 3.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Pro answered 22/3, 2017 at 8:56 Comment(0)
B
0

SciPy quad_vec for adaptive quadrature of vector functions

For those like me landing on this question many years later, there is now a specific SciPy function scipy.integrate.quad_vec that specifically addresses what this question asks. It allows for adaptive quadrature of vector functions (or different functions).

This function is particularly useful when the different functions or components to integrate share some of the computation, which is often the case for vector functions.

A very efficient vectorized quadrature of the vector function func can be achieved with the following lines of code (I rewrote the precise function in the original question, which is slightly different from the one used in some of the other answers)

import  numpy as np
from scipy import integrate

def func(x):
    return np.vstack([x*np.arange(1, 11), x**np.arange(2, 12)])

res = integrate.quad_vec(func, 0, 100)

print(res[0])  # first element is the integral of every components

The printed result is

[[5.00000000e+03 1.00000000e+04 1.50000000e+04 2.00000000e+04
  2.50000000e+04 3.00000000e+04 3.50000000e+04 4.00000000e+04
  4.50000000e+04 5.00000000e+04]
 [3.33333333e+05 2.50000000e+07 2.00000000e+09 1.66666667e+11
  1.42857143e+13 1.25000000e+15 1.11111111e+17 1.00000000e+19
  9.09090909e+20 8.33333333e+22]]
Behre answered 13/6, 2023 at 13:7 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.