financial python library that has xirr and xnpv function?
Asked Answered
D

8

19

numpy has irr and npv function, but I need xirr and xnpv function.

this link points out that xirr and xnpv will be coming soon. http://www.projectdirigible.com/documentation/spreadsheet-functions.html#coming-soon

Is there any python library that has those two functions? tks.

Demilune answered 19/1, 2012 at 0:52 Comment(0)
P
13

With the help of various implementations I found in the net, I came up with a python implementation:

def xirr(transactions):
    years = [(ta[0] - transactions[0][0]).days / 365.0 for ta in transactions]
    residual = 1
    step = 0.05
    guess = 0.05
    epsilon = 0.0001
    limit = 10000
    while abs(residual) > epsilon and limit > 0:
        limit -= 1
        residual = 0.0
        for i, ta in enumerate(transactions):
            residual += ta[1] / pow(guess, years[i])
        if abs(residual) > epsilon:
            if residual > 0:
                guess += step
            else:
                guess -= step
                step /= 2.0
    return guess-1

from datetime import date
tas = [ (date(2010, 12, 29), -10000),
    (date(2012, 1, 25), 20),
    (date(2012, 3, 8), 10100)]
print xirr(tas) #0.0100612640381
Platypus answered 16/7, 2012 at 11:39 Comment(2)
WARNING: If you use these step and guess values, it will be impossible to obtain an irr in (-100%, -95%) Zugzwang
Since this is still one of the top search results for XIRR in Python, I feel it needs to be said that this calculation is not direction agnostic. It assumes inflow as negative and outflow as positive. If you reverse this, this doesn't work.Chole
B
20

Here is one way to implement the two functions.

import scipy.optimize

def xnpv(rate, values, dates):
    '''Equivalent of Excel's XNPV function.

    >>> from datetime import date
    >>> dates = [date(2010, 12, 29), date(2012, 1, 25), date(2012, 3, 8)]
    >>> values = [-10000, 20, 10100]
    >>> xnpv(0.1, values, dates)
    -966.4345...
    '''
    if rate <= -1.0:
        return float('inf')
    d0 = dates[0]    # or min(dates)
    return sum([ vi / (1.0 + rate)**((di - d0).days / 365.0) for vi, di in zip(values, dates)])

def xirr(values, dates):
    '''Equivalent of Excel's XIRR function.

    >>> from datetime import date
    >>> dates = [date(2010, 12, 29), date(2012, 1, 25), date(2012, 3, 8)]
    >>> values = [-10000, 20, 10100]
    >>> xirr(values, dates)
    0.0100612...
    '''
    try:
        return scipy.optimize.newton(lambda r: xnpv(r, values, dates), 0.0)
    except RuntimeError:    # Failed to converge?
        return scipy.optimize.brentq(lambda r: xnpv(r, values, dates), -1.0, 1e10)
Bottali answered 21/10, 2015 at 13:4 Comment(3)
Can you explain why your XNPV function returns infinity for any rate below -1.0 (-100%)? I understand the case for -100% exactly, but the exponentiation operator binds before the division, so you will not get a division by zero for rates not equal exactly -100%. For example at a rate of 105% a $100 payment a year from now has a NPV = 100 / (1 + 1.05) ** 1 = $48.78... The same future payment at a rate of -5% is around -$105 (100 / (1 - .05) ** 1). Currently some bonds "pay" negative rate so this is not just theoretical. Now consider rate -105%, and we get 100 / (1 - 1.05) ** 1 = -1999.999...Skyrocket
What about 100/(1-1.05)**2 = 40000? Would it make sense? As far as bonds with negative rate go, could it be the case that you are talking about rates less than 0 here, not rates less than -100%?Bottali
I've made the following improvement that seems to converge faster and more accurately. Basically it uses a guess based in gross return as a starting point for the newton method. ``` def xirr(values, dates): positives = [x if x > 0 else 0 for x in values] negatives = [x if x < 0 else 0 for x in values] return_guess = (sum(positives) + sum(negatives)) / (-sum(negatives)) try: return scipy.optimize.newton(lambda r: xnpv(r, values, dates), return_guess) ```Intorsion
P
13

With the help of various implementations I found in the net, I came up with a python implementation:

def xirr(transactions):
    years = [(ta[0] - transactions[0][0]).days / 365.0 for ta in transactions]
    residual = 1
    step = 0.05
    guess = 0.05
    epsilon = 0.0001
    limit = 10000
    while abs(residual) > epsilon and limit > 0:
        limit -= 1
        residual = 0.0
        for i, ta in enumerate(transactions):
            residual += ta[1] / pow(guess, years[i])
        if abs(residual) > epsilon:
            if residual > 0:
                guess += step
            else:
                guess -= step
                step /= 2.0
    return guess-1

from datetime import date
tas = [ (date(2010, 12, 29), -10000),
    (date(2012, 1, 25), 20),
    (date(2012, 3, 8), 10100)]
print xirr(tas) #0.0100612640381
Platypus answered 16/7, 2012 at 11:39 Comment(2)
WARNING: If you use these step and guess values, it will be impossible to obtain an irr in (-100%, -95%) Zugzwang
Since this is still one of the top search results for XIRR in Python, I feel it needs to be said that this calculation is not direction agnostic. It assumes inflow as negative and outflow as positive. If you reverse this, this doesn't work.Chole
E
12

Created a package for fast XIRR calculation, PyXIRR

It doesn't have external dependencies and works faster than any existing implementation.

from datetime import date
from pyxirr import xirr

dates = [date(2020, 1, 1), date(2021, 1, 1), date(2022, 1, 1)]
amounts = [-1000, 1000, 1000]

# feed columnar data
xirr(dates, amounts)

# feed tuples
xirr(zip(dates, amounts))

# feed DataFrame
import pandas as pd
xirr(pd.DataFrame({"dates": dates, "amounts": amounts}))
Excellence answered 7/5, 2021 at 11:19 Comment(3)
Hi Alexander. Can you tell me how to handle the following error? Context - I have a DataFrame which contains multiple id's having multiple amounts. Also I'm not looping the id's to calculate the XIRR but instead doing the following: df.sort_values(["id"],ascending=True).groupby("id")[["timestamp", "amount", "id"]].apply(xirr) pyxirr.InvalidPaymentsError: negative and positive payments are requiredAmitosis
@MohseenMulla, some of your groups have only negative or only positive payments, so it's impossible to calculate XIRR. InvalidPaymentsError is suppressed by passing silent=True flag (you will get None instead of error)Excellence
A few notes: you don't need either sorting or id column after groupby. Simply do df.groupby("id")[["timestamp", "amount"]].apply(xirr, silent=True)Excellence
C
1

This answer is an improvement on @uuazed's answer and derives from that. However, there are a few changes:

  1. It uses a pandas dataframe instead of a list of tuples
  2. It is cashflow direction agnostic, i.e., whether you treat inflows as negative and outflows as positive or vice versa, the result will be the same, as long as the treatment is consistent for all transactions.
  3. XIRR calculation with this method doesn't work if cashflows are not ordered by date. Hence I have handled sorting of the dataframe internally.
  4. In the earlier answer, there was an implicit assumption that XIRR will mostly be positive. which created the problem pointed out in the other comment, that XIRR between -100% and -95% cannot be calculated. This solution does away with that problem.
import pandas as pd
import numpy as np

def xirr(df, guess=0.05, date_column = 'date', amount_column = 'amount'):
    '''Calculates XIRR from a series of cashflows. 
       Needs a dataframe with columns date and amount, customisable through parameters. 
       Requires Pandas, NumPy libraries'''

    df = df.sort_values(by=date_column).reset_index(drop=True)
    df['years'] = df[date_column].apply(lambda x: (x-df[date_column][0]).days/365)
    step = 0.05
    epsilon = 0.0001
    limit = 1000
    residual = 1

    #Test for direction of cashflows
    disc_val_1 = df[[amount_column, 'years']].apply(
                lambda x: x[amount_column]/((1+guess)**x['years']), axis=1).sum()
    disc_val_2 = df[[amount_column, 'years']].apply(
                lambda x: x[amount_column]/((1.05+guess)**x['years']), axis=1).sum()
    mul = 1 if disc_val_2 < disc_val_1 else -1

    #Calculate XIRR    
    for i in range(limit):
        prev_residual = residual
        df['disc_val'] = df[[amount_column, 'years']].apply(
                lambda x: x[amount_column]/((1+guess)**x['years']), axis=1)
        residual = df['disc_val'].sum()
        if abs(residual) > epsilon:
            if np.sign(residual) != np.sign(prev_residual):
                step /= 2
            guess = guess + step * np.sign(residual) * mul   
        else:
            return guess

Explanation:

In the test block, it checks whether increasing the discounting rate increases the discounted value or reduces it. Based on this test, it is determined which direction the guess should move. This block makes the function handle cashflows regardless of direction assumed by the user.

The np.sign(residual) != np.sign(prev_residual) checks when the guess has increased/decreased beyond the required XIRR rate, because that's when the residual goes from negative to positive or vice versa. The step size is reduced at this point.

The numpy package is not absolutely necessary. without numpy, np.sign(residual) can be replaced with residual/abs(residual). I have used numpy to make the code more readable and intuitive

I have tried to test this code with a variety of cash flows. If you find any cases which are not handled by this function, do let me know.

Edit: Here's a cleaner and faster version of the code using numpy arrays. In my test with about 700 transaction, this code ran 5 times faster than the one above:

def xirr(df, guess=0.05, date_column='date', amount_column='amount'):
    '''Calculates XIRR from a series of cashflows. 
       Needs a dataframe with columns date and amount, customisable through parameters. 
       Requires Pandas, NumPy libraries'''

    df = df.sort_values(by=date_column).reset_index(drop=True)

    amounts = df[amount_column].values
    dates = df[date_column].values

    years = np.array(dates-dates[0], dtype='timedelta64[D]').astype(int)/365

    step = 0.05
    epsilon = 0.0001
    limit = 1000
    residual = 1

    #Test for direction of cashflows
    disc_val_1 = np.sum(amounts/((1+guess)**years))
    disc_val_2 = np.sum(amounts/((1.05+guess)**years))
    mul = 1 if disc_val_2 < disc_val_1 else -1

    #Calculate XIRR    
    for i in range(limit):
        prev_residual = residual
        residual = np.sum(amounts/((1+guess)**years))
        if abs(residual) > epsilon:
            if np.sign(residual) != np.sign(prev_residual):
                step /= 2
            guess = guess + step * np.sign(residual) * mul   
        else:
            return guess
Chole answered 11/5, 2020 at 4:58 Comment(1)
Hello Gourav, First of all, Thanks for your work, your code works perfectly except for few scenarios like below, do you mind taking a look at it. 6/29/2018, -23979294.56 9/18,2018,-363717.94 11/26/2018,-3788281.69 12/21/2018,-932400 3/14/2019,-1614520.85 6/25/2019,-1294261.08 9/26/2019,-1321393.94 12/18/2019,-1632714.2 2/13/2020,-428904 3/23/2020,-843655.5 4/2/2020,-983682 11/06/2020, 6124619.985 I have tried changing the iteration and it is not working. Thanks in Advance!Advection
P
1

I started from @KT 's solution but improved on it in a few ways:

  • as pointed out by others, there is no need for xnpv to return inf if the discount rate <= -100%
  • if the cashflows are all positive or all negative, we can return a nan straight away: no point in letting the algorithm search forever for a solution which doesn't exist
  • I have made the daycount convention an input; sometimes it is 365, some other times it is 360 - it depends on the case. I have not modelled 30/360. More details on Matlab's docs
  • I have added optional inputs for the maximum number of iterations and for the starting point of the algorithm
  • I have not changed the default tolerance of the algorithms but that's very easy to change

Key findings for the specific example below (results may well be different for other cases, I have not had the time to test many other cases):

  • starting from a value = -sum(all cashflows) / sum(negative cashflows) slows the algorithms a little bit (by 7-10%)
  • scipi's netwon is faster than scipy's fsolve


Execution time with newton vs fsolve:

import numpy as np
import pandas as pd
import scipy
import scipy.optimize
from datetime import date
import timeit


def xnpv(rate, values, dates , daycount = 365):
    daycount = float(daycount)
    # Why would you want to return inf if the rate <= -100%? I removed it, I don't see how it makes sense
    # if rate <= -1.0:
    #     return float('inf')
    d0 = dates[0]    # or min(dates)
    # NB: this xnpv implementation discounts the first value LIKE EXCEL
    # numpy's npv does NOT, it only starts discounting from the 2nd
    return sum([ vi / (1.0 + rate)**((di - d0).days / daycount) for vi, di in zip(values, dates)])

def find_guess(cf):
    whereneg = np.where(cf < 0)
    sumneg = np.sum( cf[whereneg] )
    return -np.sum(cf) / sumneg
    
    

def xirr_fsolve(values, dates, daycount = 365, guess = 0, maxiters = 1000):
    
    cf = np.array(values)
    
    if np.where(cf <0,1,0).sum() ==0 | np.where(cf>0,1,0).sum() == 0:
        #if the cashflows are all positive or all negative, no point letting the algorithm
        #search forever for a solution which doesn't exist
        return np.nan
    

   
    result = scipy.optimize.fsolve(lambda r: xnpv(r, values, dates, daycount), x0 = guess , maxfev = maxiters, full_output = True )
    
    if result[2]==1: #ie if the solution converged; if it didn't, result[0] will be the last iteration, which won't be a solution
        return result[0][0]
    else:
        #consider rasiing a warning
        return np.nan
    
def xirr_newton(values, dates, daycount = 365, guess = 0, maxiters = 1000, a = -100, b =1e5):
    # a and b: lower and upper bound for the brentq algorithm
    cf = np.array(values)
    
    if np.where(cf <0,1,0).sum() ==0 | np.where(cf>0,1,0).sum() == 0:
        #if the cashflows are all positive or all negative, no point letting the algorithm
        #search forever for a solution which doesn't exist
        return np.nan
    
    res_newton =  scipy.optimize.newton(lambda r: xnpv(r, values, dates, daycount), x0 = guess, maxiter = maxiters, full_output = True)
    
    if res_newton[1].converged == True:
        out = res_newton[0]
    else:
        res_b = scipy.optimize.brentq(lambda r: xnpv(r, values, dates, daycount), a = a , b = b, maxiter = maxiters, full_output = True)
        if res_b[1].converged == True:
            out = res_b[0]
        else:
            out = np.nan
            
    return out
            
# let's compare how long each takes
d0 = pd.to_datetime(date(2010,1,1))

# an investment in which we pay 100 in the first month, then get 2 each month for the next 59 months
df = pd.DataFrame()
df['month'] = np.arange(0,60)
df['dates'] = df.apply( lambda x: d0 + pd.DateOffset(months = x['month']) , axis = 1 )
df['cf'] = 0
df.iloc[0,2] = -100
df.iloc[1:,2] = 2

r = 100
n = 5

t_newton_no_guess = timeit.Timer ("xirr_newton(df['cf'], df['dates'], guess = find_guess(df['cf'].to_numpy() )  ) ", globals = globals() ).repeat(repeat = r, number = n)
t_fsolve_no_guess = timeit.Timer ("xirr_fsolve(df['cf'], df['dates'],  guess = find_guess(df['cf'].to_numpy() ) )", globals = globals() ).repeat(repeat = r, number = n)

t_newton_guess_0 = timeit.Timer ("xirr_newton(df['cf'], df['dates'] , guess =0.) ", globals = globals() ).repeat(repeat = r, number = n)
t_fsolve_guess_0 = timeit.Timer ("xirr_fsolve(df['cf'], df['dates'], guess =0.) ", globals = globals() ).repeat(repeat = r, number = n)

resdf = pd.DataFrame(index = ['min time'])
resdf['newton no guess'] = [min(t_newton_no_guess)]
resdf['fsolve no guess'] = [min(t_fsolve_no_guess)]
resdf['newton guess 0'] = [min(t_newton_guess_0)]
resdf['fsolve guess 0'] = [min(t_fsolve_guess_0)]
# the docs explain why we should take the min and not the avg
resdf = resdf.transpose()
resdf['% diff vs fastest'] = (resdf / resdf.min() -1) * 100

Conclusions

  • I noticed there were some cases in which newton and brentq didn't converge, but fsolve did, so I modified the function so that, in order, it starts with newton, then brentq, then, lastly, fsolve.
  • I haven't actually found a case in which brentq was used to find a solution. I'd be curious to understand when it would work, otherwise it's probably best to just remove it.
  • I went back to try/except because I noticed the code above wasn't identifying all the cases of non-convergence. That's something I'd like to look into when I have a bit more time

This is my final code:

def xirr(values, dates, daycount = 365, guess = 0, maxiters = 10000, a = -100, b =1e10):
    # a and b: lower and upper bound for the brentq algorithm
    cf = np.array(values)
    
    if np.where(cf <0,1,0).sum() ==0 | np.where(cf >0,1,0).sum() == 0:
        #if the cashflows are all positive or all negative, no point letting the algorithm
        #search forever for a solution which doesn't exist
        return np.nan
    
    try:
        output =  scipy.optimize.newton(lambda r: xnpv(r, values, dates, daycount),
                                        x0 = guess, maxiter = maxiters, full_output = True, disp = True)[0]
    except RuntimeError:
        try:

            output = scipy.optimize.brentq(lambda r: xnpv(r, values, dates, daycount),
                                      a = a , b = b, maxiter = maxiters, full_output = True, disp = True)[0]
        except:
            result = scipy.optimize.fsolve(lambda r: xnpv(r, values, dates, daycount),
                                           x0 = guess , maxfev = maxiters, full_output = True )
    
            if result[2]==1: #ie if the solution converged; if it didn't, result[0] will be the last iteration, which won't be a solution
                output = result[0][0]
            else:
                output = np.nan
                
    return output

Tests

These are some tests I have put together with pytest

import pytest
import numpy as np
import pandas as pd
import whatever_the_file_name_was as finc
from datetime import date

    
    
def test_xirr():

    dates = [date(2010, 12, 29), date(2012, 1, 25), date(2012, 3, 8)]
    values = [-10000, 20, 10100]
    assert pytest.approx( finc.xirr(values, dates) ) == 1.006127e-2

    dates = [date(2010, 1,1,), date(2010,12,27)]
    values = [-100,110]
    assert pytest.approx( finc.xirr(values, dates, daycount = 360) ) == 0.1
    
    values = [100,-110]
    assert pytest.approx( finc.xirr(values, dates, daycount = 360) ) == 0.1
    
    values = [-100,90]
    assert pytest.approx( finc.xirr(values, dates, daycount = 360) ) == -0.1
    
    # test numpy arrays
    values = np.array([-100,0,121])
    dates = [date(2010, 1,1,), date(2011,1,1), date(2012,1,1)]
    assert pytest.approx( finc.xirr(values, dates, daycount = 365) ) == 0.1
    
    # with a pandas df
    df = pd.DataFrame()
    df['values'] = values
    df['dates'] = dates
    assert pytest.approx( finc.xirr(df['values'], df['dates'], daycount = 365) ) == 0.1
    
    # with a pands df and datetypes
    df['dates'] = pd.to_datetime(dates)
    assert pytest.approx( finc.xirr(df['values'], df['dates'], daycount = 365) ) == 0.1
    
    # now for some unrealistic values
    df['values'] =[-100,5000,0]
    assert pytest.approx( finc.xirr(df['values'], df['dates'], daycount = 365) ) == 49
    
    df['values'] =[-1e3,0,1]
    rate = finc.xirr(df['values'], df['dates'], daycount = 365)
    npv = finc.xnpv(rate, df['values'], df['dates'])
    # this is an extreme case; as long as the corresponsing NPV is between these values it's not a bad result
    assertion = ( npv < 0.1 and npv > -.1)
    assert assertion == True
    

P.S. Important difference between this xnpv and numpy.npv

This is not, strictly speaking, relevant to this answer, but useful to know for whoever runs financial calculations with numpy:

numpy.npv doesn't discount the first item of cashflow - it starts from the second, e.g.

np.npv(0.1,[110,0]) = 110

and

np.npv(0.1,[0,110] = 100

Excel, however, discounts from the very first item:

NPV(0.1,[110,0]) = 100

Numpy's financial functions will be deprecated and replaced with those of numpy_financial, which however will likely continue to behave the same, if only for backward compatibility.

Proceeding answered 5/2, 2021 at 19:25 Comment(3)
Regarding rates <= -1.0 - what do they even mean? Let us compute the NPV of one dollar after half a year under the assumption of "-200% annual discounting rate". 1/(-1)^0.5 = 1/i = -i. So, the net present value of my dollar is, apparently, minus one imaginary dollar. What should I do with this information?Bottali
Another example. If you allow rates below -1, you may not blindly return "nan" for all positive cash flows. Let us evaluate NPV for "-200% rate" of the following cash flow: we get $1 in 0.5 years (its "NPV" is -i), $1 in a year (with a NPV of -1), $1 in 1.5 years (NPV=i) and $1 in 2 years (NPV=1). The total NPV of the cash flow is 0, hence we must conclude -200% is its IRR!Bottali
Note that in your main code where you're comparing performance, your dates array is constructed of timestamps. If instead, you construct an array of true date objects (i.e. with df.apply(lambda x: pd.to_datetime(d0 + pd.DateOffset(months=df.index[x.name])).date(), axis=1), you will net a relative performance increase of 600-800% with the same input. I was comparing performance with gouravkr's answer and found that yours was slower until I controlled the input accordingly (after which yours became 70-140% faster).Connotative
R
1

Created a python package finance-calulator which can be used for xirr calculation. underlying, it uses newton's method.

Also I did some time profiling and it is little better than the scipy's xnpv method as suggested in @KT.'s answer.

Here's the implementation.

Robbinrobbins answered 16/2, 2021 at 5:44 Comment(0)
E
0

With Pandas, I got the following to work: (note, I'm using ACT/365 convention)

rate = 0.10
dates= pandas.date_range(start=pandas.Timestamp('2015-01-01'),periods=5, freq="AS")
cfs = pandas.Series([-500,200,200,200,200],index=dates)

# intermediate calculations( if interested)
# cf_xnpv_days = [(cf.index[i]-cf.index[i-1]).days for i in range(1,len(cf.index))]
# cf_xnpv_days_cumulative = [(cf.index[i]-cf.index[0]).days for i in range(1,len(cf.index))]
# cf_xnpv_days_disc_factors = [(1+rate)**(float((cf.index[i]-cf.index[0]).days)/365.0)-1   for i in range(1,len(cf.index))]

cf_xnpv_days_pvs = [cf[i]/float(1+(1+rate)**(float((cf.index[i]-cf.index[0]).days)/365.0)-1)  for i in range(1,len(cf.index))]

cf_xnpv = cf[0]+ sum(cf_xnpv_days_pvs)
Equestrian answered 30/7, 2015 at 4:44 Comment(0)
B
0
def xirr(cashflows,transactions,guess=0.1):
#function to calculate internal rate of return.
#cashflow: list of tuple of date,transactions
#transactions: list of transactions
try:
    return optimize.newton(lambda r: xnpv(r,cashflows),guess)
except RuntimeError:
    positives = [x if x > 0 else 0 for x in transactions]
    negatives = [x if x < 0 else 0 for x in transactions]
    return_guess = (sum(positives) + sum(negatives)) / (-sum(negatives))
    return optimize.newton(lambda r: xnpv(r,cashflows),return_guess)
Breath answered 12/2, 2021 at 17:58 Comment(2)
Could you add some description to your code?Faux
@WBM The above code tries to find Xirr using he cashflow and guess that you provide. I needed to match it to excel XIRR function so was using guess=0.1. But I was getting runtime error in some cases so I added an except block which calculates guess using the transactions. And it solved the runtime error scenario for me. The defination of inputs being used is mentioned in the code.Breath

© 2022 - 2024 — McMap. All rights reserved.