Heiken Ashi Using pandas python
I was defining a function for Heikin-Ashi, one of the popular chart types in technical analysis. I was writing it with pandas but ran into some difficulty. This is how Heikin-Ashi (HA) candles are calculated:

                 Heikin-Ashi Candle Calculations
           HA_Close = (Open + High + Low + Close) / 4
           HA_Open = (previous HA_Open + previous HA_Close) / 2
           HA_Low = minimum of Low, HA_Open, and HA_Close
           HA_High = maximum of High, HA_Open, and HA_Close

               Heikin-Ashi Calculations on First Run
            HA_Close = (Open + High + Low + Close) / 4
                   HA_Open = (Open + Close) / 2
                           HA_Low = Low
                           HA_High = High
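
As a minimal sketch of the formulas above in plain Python (the function name `heikin_ashi_rows` and the tuple layout are assumptions made for illustration, not part of the original post), the recurrence could be written as follows. Each HA_Open depends on the previous HA values, which is why a single column-wise expression is not enough.

```python
def heikin_ashi_rows(candles):
    """Compute Heikin-Ashi candles from (open, high, low, close) tuples.

    Hypothetical helper for illustration; returns a list of
    (ha_open, ha_high, ha_low, ha_close) tuples.
    """
    result = []
    prev_ha_open = prev_ha_close = None
    for o, h, l, c in candles:
        ha_close = (o + h + l + c) / 4
        if prev_ha_open is None:
            # First run: seed HA_Open from the raw open and close.
            ha_open = (o + c) / 2
            ha_high, ha_low = h, l
        else:
            # Later rows: use the previous Heikin-Ashi open and close.
            ha_open = (prev_ha_open + prev_ha_close) / 2
            ha_high = max(h, ha_open, ha_close)
            ha_low = min(l, ha_open, ha_close)
        result.append((ha_open, ha_high, ha_low, ha_close))
        prev_ha_open, prev_ha_close = ha_open, ha_close
    return result


# Made-up sample data: two candles as (open, high, low, close).
sample = [(10.0, 12.0, 9.0, 11.0), (11.0, 13.0, 10.0, 12.0)]
rows = heikin_ashi_rows(sample)
```

A plain-Python version like this can serve as a reference when checking a pandas implementation on a few rows.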

There is a lot of material available on various websites using for loops and pure Python, but I think pandas can also do the job well. This is my progress so far:

   def HA(df):

       df['HA_Close']=(df['Open']+ df['High']+ df['Low']+ df['Close'])/4

       ha_o=df['Open']+df['Close']  #Creating a Variable
       #(for 1st row)

       HA_O=df['HA_Open'].shift(1)+df['HA_Close'].shift(1) #Another variable
       #(for subsequent rows)

       df['HA_Open']=[ha_o/2 if df['HA_Open']='nan' else HA_O/2]     
       #(error Part Where am i going wrong?)

       df['HA_High']=df[['HA_Open','HA_Close','High']].max(axis=1)

       df['HA_Low']=df[['HA_Open','HA_Close','Low']].min(axis=1)

       return df

Can anyone help me with this, please? It doesn't work. I tried it on this data:

  import pandas_datareader.data as web 
  import HA
  import pandas as pd
  start='2016-1-1'
  end='2016-10-30'
  DAX=web.DataReader('^GDAXI','yahoo',start,end)

This is the new code I wrote:

    def HA(df):
        df['HA_Close']=(df['Open']+ df['High']+ df['Low']+df['Close'])/4
        ha_o=df['Open']+df['Close']
        df['HA_Open']=0.0
        HA_O=df['HA_Open'].shift(1)+df['HA_Close'].shift(1)
        df['HA_Open']= np.where( df['HA_Open']==np.nan, ha_o/2, HA_O/2 )
        df['HA_High']=df[['HA_Open','HA_Close','High']].max(axis=1)
        df['HA_Low']=df[['HA_Open','HA_Close','Low']].min(axis=1)
        return df

But still the HA_Open result was not satisfactory
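
A likely reason the second attempt still starts with NaN: comparing against `np.nan` with `==` is always False, so `np.where` always takes the `HA_O/2` branch, and `shift(1)` only sees the column as it was before the assignment (all zeros), not the values being computed. The sketch below uses made-up numbers to illustrate both points; `isna` is the usual way to test for NaN.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1.0, 2.0])

# NaN never compares equal to anything, including itself.
eq_mask = (s == np.nan)   # all False
na_mask = s.isna()        # True only for the first element

# shift(1) works on the column as it exists now; it cannot see values
# that the same statement is about to write.
ha_open = pd.Series([0.0, 0.0, 0.0])      # placeholder column, as in the question
ha_close = pd.Series([10.5, 11.5, 12.5])  # made-up HA_Close values
shifted_sum = ha_open.shift(1) + ha_close.shift(1)  # first element is NaN
```

Because each HA_Open needs the previous, already computed HA_Open, the column has to be filled in order rather than from a shifted copy of itself.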

Ralf answered 15/11, 2016 at 15:19 Comment(9)
Does it work? If not, what's the problem? Please provide a sample dataframe also. - Englebert
It doesn't work. I tried it on this: import pandas_datareader.data as web; import HA; import pandas as pd; start='2016-1-1'; end='2016-10-30'; DAX=web.DataReader('^GDAXI','yahoo',start,end) - Ralf
Try this for the line that gives you an error: df['HA_Open']= np.where( df['HA_Open']==np.nan, ha_o/2, HA_O/2 ), but I think you also failed to define df['HA_Open']? - Englebert
Also do import numpy as np if you didn't already. - Englebert
Nope, no luck. I initialized df['HA_Open'] = 0.0 just before the line you suggested but I am still getting an error: KeyError: 'HA_Open' - Ralf
Is it possible to assign a specific formula to a particular cell (in this case the first row of the column df['HA_Open']) and write a function for the subsequent rows of the same Series? Of course we are trying to solve the same thing, but I wondered whether some specific line of code exists for it. - Ralf
The "New Code" you added above works fine for me. I mean, I doubt it gives the right answer, but it doesn't crash or anything. - Englebert
Yes, it doesn't crash, but df['HA_Open'] starts with NaN instead of returning ha_o/2. - Ralf
@Abbas Please don't inline images of code: meta.https://mcmap.net/q/490707/-should-application-users-be-database-users - Archive
O
22

Here is the fastest, accurate and efficient implementation as per my tests:

def HA(df):
    df['HA_Close']=(df['Open']+ df['High']+ df['Low']+df['Close'])/4

    idx = df.index.name
    df.reset_index(inplace=True)

    for i in range(0, len(df)):
        if i == 0:
            df.set_value(i, 'HA_Open', ((df.get_value(i, 'Open') + df.get_value(i, 'Close')) / 2))
        else:
            df.set_value(i, 'HA_Open', ((df.get_value(i - 1, 'HA_Open') + df.get_value(i - 1, 'HA_Close')) / 2))

    if idx:
        df.set_index(idx, inplace=True)

    df['HA_High']=df[['HA_Open','HA_Close','High']].max(axis=1)
    df['HA_Low']=df[['HA_Open','HA_Close','Low']].min(axis=1)
    return df

Here is my test algorithm (essentially I used the algorithm provided in this post to benchmark the speed results):

import quandl
import time

df = quandl.get("NSE/NIFTY_50", start_date='1997-01-01')

def test_HA():
    print('HA Test')
    start = time.time()
    HA(df)
    end = time.time()
    print('Time taken by set and get value functions for HA {}'.format(end-start))

    start = time.time()
    df['HA_Close_t']=(df['Open']+ df['High']+ df['Low']+df['Close'])/4

    from collections import namedtuple
    nt = namedtuple('nt', ['Open','Close'])
    previous_row = nt(df.ix[0,'Open'],df.ix[0,'Close'])
    i = 0
    for row in df.itertuples():
        ha_open = (previous_row.Open + previous_row.Close) / 2
        df.ix[i,'HA_Open_t'] = ha_open
        previous_row = nt(ha_open, row.Close)
        i += 1

    df['HA_High_t']=df[['HA_Open_t','HA_Close_t','High']].max(axis=1)
    df['HA_Low_t']=df[['HA_Open_t','HA_Close_t','Low']].min(axis=1)
    end = time.time()
    print('Time taken by ix (iloc, loc) functions for HA {}'.format(end-start))

Here is the output I got on my i7 processor (please note the results may vary depending on your processor speed but I assume that the results will be similar):

HA Test
Time taken by set and get value functions for HA 0.05005788803100586
Time taken by ix (iloc, loc) functions for HA 0.9360761642456055

My experience with pandas shows that accessors like ix, loc and iloc are slower than the set_value and get_value functions. Moreover, computing a column from its own shifted values gives erroneous results, because shift only sees the column as it was before the assignment.
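
For readers on current pandas, where `set_value` and `get_value` are no longer available (they were removed in pandas 1.0), a rough equivalent of the loop above can be sketched with the scalar accessor `DataFrame.at`. This is an adaptation, not the code that was benchmarked: the function name `ha_with_at` is hypothetical, it assumes `Open`, `High`, `Low`, `Close` columns and unique index labels, and it works on a copy rather than in place.

```python
import pandas as pd


def ha_with_at(df):
    out = df.copy()
    out['HA_Close'] = (out['Open'] + out['High'] + out['Low'] + out['Close']) / 4
    out['HA_Open'] = 0.0  # float placeholder, filled row by row below

    labels = out.index  # assumes unique index labels
    for i, label in enumerate(labels):
        if i == 0:
            # First row: seed from the raw open and close.
            out.at[label, 'HA_Open'] = (out.at[label, 'Open'] + out.at[label, 'Close']) / 2
        else:
            # Later rows: previous Heikin-Ashi open and close.
            prev = labels[i - 1]
            out.at[label, 'HA_Open'] = (out.at[prev, 'HA_Open'] + out.at[prev, 'HA_Close']) / 2

    out['HA_High'] = out[['HA_Open', 'HA_Close', 'High']].max(axis=1)
    out['HA_Low'] = out[['HA_Open', 'HA_Close', 'Low']].min(axis=1)
    return out


# Made-up sample data.
demo = pd.DataFrame({'Open': [10.0, 11.0], 'High': [12.0, 13.0],
                     'Low': [9.0, 10.0], 'Close': [11.0, 12.0]})
result = ha_with_at(demo)
```

Using index labels with `at` avoids the `reset_index` / `set_index` round trip, at the cost of requiring unique labels.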

Overprize answered 2/10, 2017 at 8:35 Comment(2)
I have added the explanation to the post. For more insight you may refer to my technical indicators project on GitHub. - Overprize
The .set_value function doesn't seem to work on a pandas DataFrame. - Ascites
E
4

Unfortunately, set_value() and get_value() are deprecated. Building off arkochhar's answer, I was able to get a 75% speed increase by using the following list comprehension method with my own OHLC data (7000 rows). It is faster than using at and iat as well.

def HA( dataframe ):

    df = dataframe.copy()

    df['HA_Close']=(df.Open + df.High + df.Low + df.Close)/4

    # Use positional access (.iloc / .values) so the index, named or not,
    # never has to be reset and restored
    ha_open = [ (df.Open.iloc[0] + df.Close.iloc[0]) / 2 ]
    [ ha_open.append((ha_open[i] + df.HA_Close.values[i]) / 2) \
    for i in range(0, len(df)-1) ]
    df['HA_Open'] = ha_open

    df['HA_High']=df[['HA_Open','HA_Close','High']].max(axis=1)
    df['HA_Low']=df[['HA_Open','HA_Close','Low']].min(axis=1)

    return df
Educatory answered 11/3, 2019 at 21:4 Comment(0)
F
3

I'm not that knowledgeable about Python or pandas, but after some research this is what I figured would be a good solution.

Please feel free to add any comments; I would very much appreciate them.

I used namedtuples and itertuples (which seem to be the fastest way to loop through a DataFrame).

I hope it helps!

from collections import namedtuple

def HA(df):
    df['HA_Close']=(df['Open']+ df['High']+ df['Low']+df['Close'])/4

    nt = namedtuple('nt', ['Open','Close'])
    # Seed with the raw open/close of the first row (.ix is gone from current pandas)
    previous_row = nt(df['Open'].iloc[0], df['Close'].iloc[0])
    ha_opens = []
    for row in df.itertuples():
        ha_open = (previous_row.Open + previous_row.Close) / 2
        ha_opens.append(ha_open)
        # Carry the HA open and HA close forward, not the raw close
        previous_row = nt(ha_open, row.HA_Close)
    df['HA_Open'] = ha_opens

    df['HA_High']=df[['HA_Open','HA_Close','High']].max(axis=1)
    df['HA_Low']=df[['HA_Open','HA_Close','Low']].min(axis=1)
    return df
Farreaching answered 9/2, 2017 at 13:3 Comment(0)
C
3
def heikenashi(df):
    df['HA_Close'] = (df['Open'] + df['High'] + df['Low'] + df['Close']) / 4
    df['HA_Open'] = (df['Open'].shift(1) + df['Open'].shift(1)) / 2
    df.iloc[0, df.columns.get_loc("HA_Open")] = (df.iloc[0]['Open'] + df.iloc[0]['Close'])/2
    df['HA_High'] = df[['High', 'Low', 'HA_Open', 'HA_Close']].max(axis=1)
    df['HA_Low'] = df[['High', 'Low', 'HA_Open', 'HA_Close']].min(axis=1)
    df = df.drop(['Open', 'High', 'Low', 'Close'], axis=1)  # remove old columns
    df = df.rename(columns={"HA_Open": "Open", "HA_High": "High", "HA_Low": "Low", "HA_Close": "Close", "Volume": "Volume"})
    df = df[['Open', 'High', 'Low', 'Close', 'Volume']]  # reorder columns
    return df
Cutaway answered 3/2, 2018 at 4:48 Comment(1)
This code is not quite right: the HA open should be (HA_Open(-1) + HA_Close(-1))/2, which requires iterating through the data frame as per arkochhar's answer. - Griseous
A
2

I adjusted the code to make it work with Python 3.7

def HA(df):
    df_HA = df
    df_HA['Close']=(df['Open']+ df['High']+ df['Low']+df['Close'])/4

    #idx = df_HA.index.name
    #df_HA.reset_index(inplace=True)

    for i in range(0, len(df)):
        if i == 0:
            df_HA['Open'][i]= ( (df['Open'][i] + df['Close'][i] )/ 2)
        else:
            df_HA['Open'][i] = ( (df['Open'][i-1] + df['Close'][i-1] )/ 2)


    #if idx:
        #df_HA.set_index(idx, inplace=True)

    df_HA['High']=df[['Open','Close','High']].max(axis=1)
    df_HA['Low']=df[['Open','Close','Low']].min(axis=1)
    return df_HA
Alysiaalyson answered 23/5, 2020 at 16:20 Comment(1)
This worked perfectly. Thank you. - Ascites
C
2

A perfectly working Heikin-Ashi function. I am not the original author of this code; I found it on GitHub (https://github.com/emreturan/heikin-ashi/blob/master/heikin_ashi.py).

def heikin_ashi(df):
    heikin_ashi_df = pd.DataFrame(index=df.index.values, columns=['open', 'high', 'low', 'close'])
    
    heikin_ashi_df['close'] = (df['open'] + df['high'] + df['low'] + df['close']) / 4
    
    for i in range(len(df)):
        if i == 0:
            heikin_ashi_df.iat[0, 0] = df['open'].iloc[0]
        else:
            heikin_ashi_df.iat[i, 0] = (heikin_ashi_df.iat[i-1, 0] + heikin_ashi_df.iat[i-1, 3]) / 2
        
    heikin_ashi_df['high'] = heikin_ashi_df.loc[:, ['open', 'close']].join(df['high']).max(axis=1)
    
    heikin_ashi_df['low'] = heikin_ashi_df.loc[:, ['open', 'close']].join(df['low']).min(axis=1)
    
    return heikin_ashi_df
Carlitacarlo answered 30/12, 2020 at 18:4 Comment(0)
F
2

NumPy version that works with Numba

import numpy as np
from numba import jit

@jit(nopython=True)
def heiken_ashi_numpy(c_open, c_high, c_low, c_close):
    ha_close = (c_open + c_high + c_low + c_close) / 4
    ha_open = np.empty_like(ha_close)
    ha_open[0] = (c_open[0] + c_close[0]) / 2
    for i in range(1, len(c_close)):
        # Use the previous Heikin-Ashi open/close, not the raw candle values
        ha_open[i] = (ha_open[i - 1] + ha_close[i - 1]) / 2
    ha_high = np.maximum(np.maximum(ha_open, ha_close), c_high)
    ha_low = np.minimum(np.minimum(ha_open, ha_close), c_low)
    return ha_open, ha_high, ha_low, ha_close
Flacon answered 6/12, 2021 at 14:8 Comment(0)
A
1

This will be faster with NumPy.

import numpy

def HEIKIN(O, H, L, C, oldO, oldC):
    # oldO and oldC are presumably the previous candle's HA_Open and HA_Close
    HA_Close = (O + H + L + C)/4
    HA_Open = (oldO + oldC)/2
    elements = numpy.array([H, L, HA_Open, HA_Close])
    HA_High = elements.max(0)
    HA_Low = elements.min(0)
    out = numpy.array([HA_Close, HA_Open, HA_High, HA_Low])
    return out
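
The answer does not show how the function is meant to be driven. One plausible reading, sketched below on made-up data, is that it is called once per candle, with `oldO` and `oldC` being the previous Heikin-Ashi open and close (and the raw open and close for the very first candle). The function is repeated here only so that the sketch is self-contained; the loop around it is an assumption.

```python
import numpy


def HEIKIN(O, H, L, C, oldO, oldC):
    # Same function as in the answer, repeated so the sketch runs on its own.
    HA_Close = (O + H + L + C) / 4
    HA_Open = (oldO + oldC) / 2
    elements = numpy.array([H, L, HA_Open, HA_Close])
    HA_High = elements.max(0)
    HA_Low = elements.min(0)
    return numpy.array([HA_Close, HA_Open, HA_High, HA_Low])


# Made-up candles as (O, H, L, C).
candles = [(10.0, 12.0, 9.0, 11.0), (11.0, 13.0, 10.0, 12.0)]

rows = []
# Seed with the raw open/close of the first candle.
old_open, old_close = candles[0][0], candles[0][3]
for O, H, L, C in candles:
    ha_close, ha_open, ha_high, ha_low = HEIKIN(O, H, L, C, old_open, old_close)
    rows.append((ha_open, ha_high, ha_low, ha_close))
    # Feed the Heikin-Ashi values (not the raw ones) into the next call.
    old_open, old_close = ha_open, ha_close
```

Called this way, the per-candle function still needs a Python-level loop, so any speed advantage depends on how the surrounding loop is written.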
Assuming answered 13/8, 2018 at 10:18 Comment(0)
F
1

No Loop Solution for DataFrames

This was the simplest, easy to understand, no-loop solution I could come up with for dataframes.

  • Temporarily store Heikin-Ashi output in 'o', 'h', 'l', 'c' columns
  • 'o' is based on yesterday's values, so we can use .shift(1) and copy the first entry
  • Replace 'Open', 'High', 'Low', 'Close' with 'o', 'h', 'l', 'c'

Python 3.9.7

def heikin_ashi(df):
    df = df.copy()
    df['c'] = (df['Open'] + df['High'] + df['Low'] + df['Close']) / 4
    df['o'] = ((df['Open'] + df['Close']) / 2).shift(1)
    df.iloc[0,-1] = df['o'].iloc[1]
    df['h'] = df[['High', 'o', 'c']].max(axis=1)
    df['l'] = df[['Low', 'o', 'c']].min(axis=1)
    df['Open'], df['High'], df['Low'], df['Close'] = df['o'], df['h'], df['l'], df['c']
    return df.drop(['o', 'h', 'l', 'c'], axis=1)
Figure answered 2/2, 2022 at 23:35 Comment(1)
I believe this is a wrong implementation. The next Heikin-Ashi open = (previous Heikin-Ashi open + previous Heikin-Ashi close) / 2. Instead, you are just using the previous NORMAL open and close. - Navarrete
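
For what it is worth, the correct recurrence can still be written without an explicit Python loop, because HA_Open[t] = 0.5 * HA_Open[t-1] + 0.5 * HA_Close[t-1] has the same form as an exponentially weighted mean with alpha = 0.5 and `adjust=False`. The sketch below is an alternative technique, not the method of the answer above; the function name `heikin_ashi_ewm` is hypothetical, and it assumes a non-empty frame with `Open`, `High`, `Low`, `Close` columns and no missing values.

```python
import pandas as pd


def heikin_ashi_ewm(df):
    out = pd.DataFrame(index=df.index)
    out['HA_Close'] = (df['Open'] + df['High'] + df['Low'] + df['Close']) / 4

    # Input series x: x[0] is the seed (Open[0] + Close[0]) / 2,
    # x[t] is the previous HA_Close for t >= 1.
    x = out['HA_Close'].shift(1)
    x.iloc[0] = (df['Open'].iloc[0] + df['Close'].iloc[0]) / 2

    # With adjust=False: y[0] = x[0] and y[t] = 0.5 * y[t-1] + 0.5 * x[t],
    # which matches HA_Open[t] = (HA_Open[t-1] + HA_Close[t-1]) / 2.
    out['HA_Open'] = x.ewm(alpha=0.5, adjust=False).mean()

    out['HA_High'] = pd.concat([out['HA_Open'], out['HA_Close'], df['High']], axis=1).max(axis=1)
    out['HA_Low'] = pd.concat([out['HA_Open'], out['HA_Close'], df['Low']], axis=1).min(axis=1)
    return out


# Made-up sample data.
demo = pd.DataFrame({'Open': [10.0, 11.0, 12.0], 'High': [12.0, 13.0, 14.0],
                     'Low': [9.0, 10.0, 11.0], 'Close': [11.0, 12.0, 13.0]})
ha = heikin_ashi_ewm(demo)
```

It is worth comparing the output against a plain loop on a few rows before relying on it, since the equivalence depends on the `adjust=False` behaviour of `ewm`.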
S
0
def HA(df):
    df_HA = df
    df_HA['Close']=(df['Open']+ df['High']+ df['Low']+df['Close'])/4


    for i in range(0, len(df)):
        if i == 0:
            df_HA['Open'][i]= ( (df['Open'][i] + df['Close'][i] )/ 2)
        else:
            df_HA['Open'][i] = ( (df['Open'][i-1] + df['Close'][i-1] )/ 2)


    df_HA['High']=df[['Open','Close','High']].max(axis=1)
    df_HA['Low']=df[['Open','Close','Low']].min(axis=1)
    return df_HA

This code works but calculates the HA candles incorrectly. The else branch looks at the normal candles' open and close instead of the HA values when calculating the next HA open. Replace it with:

    for i in range(0, len(df)):
        if i == 0:
            df_HA['Open'][i]= ( (df['Open'][i] + df['Close'][i] )/ 2)
        else:
            df_HA['Open'][i] = ( (df_HA['Open'][i-1] + df_HA['Close'][i-1] )/ 2)

Next are the HA high and low, which are not calculated correctly either:

    df_HA['High']=df[['Open','Close','High']].max(axis=1)
    df_HA['Low']=df[['Open','Close','Low']].min(axis=1)

Again, this compares only against the normal candles, instead of the current normal candle's High (or Low) together with the HA open and HA close. This code fixes the issue:

def HA_Initialise(df):
    # Assumes df has a default integer index (0, 1, 2, ...), as the original indexing did
    df_HA = pd.DataFrame(columns=['Date', 'Open', 'High', 'Low', 'Close'])

    df_HA['Close']=(df['Open']+ df['High']+ df['Low']+df['Close'])/4

    for i in range(0, len(df)):
        if i == 0:
            df_HA.loc[i, 'Open'] = (df.loc[i, 'Open'] + df.loc[i, 'Close']) / 2
        else:
            df_HA.loc[i, 'Open'] = (df_HA.loc[i-1, 'Open'] + df_HA.loc[i-1, 'Close']) / 2

        # High/Low for every row (including the first): compare the raw
        # High and Low with the HA open and HA close of the same row
        test = [df.loc[i, 'High'], df.loc[i, 'Low'],
                df_HA.loc[i, 'Open'], df_HA.loc[i, 'Close']]
        df_HA.loc[i, 'High'] = max(test)
        df_HA.loc[i, 'Low'] = min(test)

    return df_HA

Here df is the data frame with the normal candle data, and df_HA is the frame we are building and reading from while the code runs.

Sardis answered 28/3, 2022 at 11:4 Comment(0)
N
0

Assuming you have everything in a list of lists; where each row has: time, open, close, high, low, volume.

        if candles:
            close_values = [sum(row[1:5]) / 4 for row in candles]

            previous_close = close_values[0]
            previous_open = (candles[0][1] + previous_close) / 2

            opens = collections.deque()
            opens.append(previous_open)
            for close_value in close_values[1:]:
                previous_open = (previous_open + previous_close) / 2
                opens.append(previous_open)
                previous_close = close_value

            candles = [[row[0], o, c, max(row[3], o, c), min(row[4], o, c), row[5]]
                       for row, o, c in zip(candles, opens, close_values)]

This solution only uses list comprehensions and the collections module.

If you want to return the dataframe:

return pd.DataFrame.from_records(
    data=candles,
    columns=['Time', 'Open', 'Close', 'High', 'Low', 'Volume'],
    index='Time',
    coerce_float=True,
)
Navarrete answered 13/6, 2022 at 8:55 Comment(0)
M
0

import pandas_ta as ta  # pandas-ta; registers the df.ta accessor
import pandas as pd

Using the pandas-ta implementation was the easiest and fastest option for me:

dfHA = df.ta.ha()

I assume this wasn't available at the time the question was asked

Malmo answered 31/7, 2023 at 13:55 Comment(0)
F
-2

Fastest solution I found.

def HA(df):
    df['HA_Close']=(df['Open']+ df['High']+ df['Low']+df['Close'])/4

    # Convert the column to a NumPy array once, outside the loop
    ha_close_values = df['HA_Close'].values

    length = len(df)
    ha_open = np.zeros(length, dtype=float)
    # Positional access, so the index does not have to be reset
    ha_open[0] = (df['Open'].iloc[0] + df['Close'].iloc[0]) / 2

    for i in range(0, length - 1):
        ha_open[i + 1] = (ha_open[i] + ha_close_values[i]) / 2

    df['HA_Open'] = ha_open

    df['HA_High']=df[['HA_Open','HA_Close','High']].max(axis=1)
    df['HA_Low']=df[['HA_Open','HA_Close','Low']].min(axis=1)
    return df

This solution is similar to user11186769's, with two additional optimizations.

The major optimization, which gave a 3.5-4x speedup, is this part:

ha_close_values = df['HA_Close'].values

length = len(df)
ha_open = np.zeros(length, dtype=float)
ha_open[0] = (df['Open'].iloc[0] + df['Close'].iloc[0]) / 2

for i in range(0, length - 1):
    ha_open[i + 1] = (ha_open[i] + ha_close_values[i]) / 2

vs this:

[ha_open.append((ha_open[i] + df.HA_Close.values[i]) / 2) for i in range(0, len(df)-1)]

The first difference is that the other answer makes an unnecessary and expensive call in every iteration: df.HA_Close.values[i]. (It converts the series to a NumPy array in every iteration.)

In my solution that value is calculated only once and stored, as ha_close_values = df['HA_Close'].values, and the stored array is used in the for loop.

The other optimization is using a NumPy array with a fixed size instead of a Python list. Instead of appending to a list in every iteration, I use the current index + 1 to set the values of ha_open.
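
To confirm that the two variants compute the same HA_Open values before comparing their speed, a small check along these lines could be used. The function names `ha_open_list` and `ha_open_array` and the input numbers are made up for illustration, and a non-empty input is assumed; timings are deliberately left out because they depend on the machine and pandas version.

```python
import numpy as np


def ha_open_list(first_open, ha_close):
    # Variant 1: grow a Python list with append.
    ha_open = [first_open]
    for i in range(len(ha_close) - 1):
        ha_open.append((ha_open[i] + ha_close[i]) / 2)
    return np.array(ha_open)


def ha_open_array(first_open, ha_close):
    # Variant 2: write into a preallocated NumPy array.
    ha_open = np.zeros(len(ha_close), dtype=float)
    ha_open[0] = first_open
    for i in range(len(ha_close) - 1):
        ha_open[i + 1] = (ha_open[i] + ha_close[i]) / 2
    return ha_open


ha_close = np.array([10.5, 11.5, 12.5, 12.0])  # made-up HA_Close values
a = ha_open_list(10.5, ha_close)
b = ha_open_array(10.5, ha_close)
same = np.allclose(a, b)
```

Once the outputs agree, `timeit` on real data is a reasonable way to measure the difference between the two.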

Frenchpolish answered 21/7, 2021 at 11:34 Comment(1)
What is self. here? - Greening
