Finding the min max of a stock chart

Are there any specific algorithms that will allow me to find the min and max points in the picture above?

I have the data in text format, so I don't need to extract it from the picture. The problem with stocks is that they have so many local minima and maxima that simple derivatives won't work.

I am thinking of using digital filters (in the z-domain) to smooth out the graph, but I am still left with too many local minima and maxima.

I also tried a moving average to smooth out the graph, but again there are too many maxima and minima.

EDIT:

I read some of the comments; I just missed circling some of the minima and maxima by accident.

I think I came up with an algorithm that may work. First find the overall minimum and maximum (the high and the low of the day). Then draw three lines: one from the open to whichever of the high or low comes first, one between the low and the high, and one from there to the close. In each of these three regions, take the point farthest from its line as a new high or low, and repeat the loop on each region.
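The "point farthest from the line" step is essentially one round of the Ramer-Douglas-Peucker idea. A minimal sketch of just that step, assuming data as a NumPy array and measuring vertical distance to the chord (the function name is mine, not from the question):

```python
import numpy as np

def farthest_from_chord(y, i0, i1):
    """Index strictly between i0 and i1 whose value lies farthest
    (vertically) from the straight line joining the two endpoints."""
    x = np.arange(i0, i1 + 1)
    # Straight line (chord) from (i0, y[i0]) to (i1, y[i1])
    chord = y[i0] + (y[i1] - y[i0]) * (x - i0) / (i1 - i0)
    return i0 + int(np.argmax(np.abs(y[i0:i1 + 1] - chord)))
```

You would apply this to each of the three regions (open to first extreme, low to high, last extreme to close) and then recurse into the subregions it creates.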

Riverhead answered 15/8, 2011 at 1:57 Comment(5)
Related question: #6836909Newish
If there are too many maxima and minima you need a clear criterion to distinguish the ones you want. This criterion will be application- and dataset-specific. For example, I am not sure why you didn't circle the local minimum on the far left of the figure. I recommend you think of a factor that discriminates local minima, and then cycle through values of that factor to achieve the expected results for your dataset.Leavis
Greetings, please host your data file so that we can play around with a prototype algo for this.Spruill
Please restate your last sentence ("Then in each of these three regions find the point that is furthest points from the line as my high and low and then repeat loop."). It makes no sense.Inadmissible
What's wrong with an eye-tuned moving average? Just play with the window until you get as many extrema as you want. This kind of problem has no other satisfactory solution.Rod

I usually use a combination of a Moving Average and an Exponential Moving Average. Empirically it has proved well suited to the task (well enough for my needs, at least). The results are tuned with only two parameters. Here is a sample:

(plot: the price series, the MA+EMA curve, and the detected maxima/minima marked as points)

Edit

In case it is useful for somebody, here is my Mathematica code:

f[sym_] := Module[{l, l1, l2, l3, maxs, mins},
  (*get data*)
  l = FinancialData[sym, "Jan. 1, 2010"][[All, 2]];
  (*perform averages*)
  l1 = ExponentialMovingAverage[MovingAverage[l, 10], .2];
  (*calculate ma and min positions in the averaged list*)
  l2 = {#[[1]], l1[[#[[1]]]]} & /@ 
    MapIndexed[If[#1[[1]] < #1[[2]] > #1[[3]], #2, Sequence @@ {}] &, 
     Partition[l1, 3, 1]];
  l3 = {#[[1]], l1[[#[[1]]]]} & /@ 
    MapIndexed[If[#1[[1]] > #1[[2]] < #1[[3]], #2, Sequence @@ {}] &, 
     Partition[l1, 3, 1]];
  (*correlate with max and mins positions in the original list*)
  maxs = First /@ (Ordering[-l[[#[[1]] ;; #[[2]]]]] + #[[1]] - 
        1 & /@ ({4 + #[[1]] - 5, 4 + #[[1]] + 5} & /@ l2));
  mins = Last /@ (Ordering[-l[[#[[1]] ;; #[[2]]]]] + #[[1]] - 
        1 & /@ ({4 + #[[1]] - 5, 4 + #[[1]] + 5} & /@ l3));
  (*Show the plots*)
  Show[{
    ListPlot[l, Joined -> True, PlotRange -> All, 
     PlotLabel -> 
      Style[Framed[sym], 16, Blue, Background -> Lighter[Yellow]]],
    ListLinePlot[ExponentialMovingAverage[MovingAverage[l, 10], .2]], 
    ListPlot[{#, l[[#]]} & /@ maxs, 
     PlotStyle -> Directive[PointSize[Large], Red]],
    ListPlot[{#, l[[#]]} & /@ mins, 
     PlotStyle -> Directive[PointSize[Large], Black]]}, 
   ImageSize -> 400]
  ]
Mcgruder answered 19/9, 2011 at 20:41 Comment(4)
+1. MA + EWMA (and even a simple MA) is indeed the right tool for the job. Don't try to be too savvy on such a task; simple low-pass filters are your best friend, since you won't be able to define properly what a min and a max are here.Rod
@Alexandre I doubt that the "right tool" exists :) These plots are always "interpreted" by people that are masters in the art of chart exegesisMcgruder
what does this line mean in normal english/python? l2 = {#[[1]], l1[[#[[1]]]]} & /@ MapIndexed[If[#1[[1]] < #1[[2]] > #1[[3]], #2, Sequence @@ {}] &, Partition[l1, 3, 1]];Ophelia
@manav: (right to left) Partition[l1, 3, 1] partitions l1 into overlapping triples ({a,b,c}, {b,c,d}, {c,d,e}, ...). MapIndexed[If[#1[[1]] < #1[[2]] > #1[[3]], #2, Sequence @@ {}] &, ...] then returns the index of each triple whose middle element is the largest (and nothing otherwise), giving the list of indexes of such triples. Finally, {#[[1]], l1[[#[[1]]]]} & /@ maps this to a list of pairs, where the first element is the index and the second is the value of l1 at that index.Sophomore

You'll notice that a lot of the answers go for derivatives with some sort of low-pass filtering: a moving average of some kind, if you will. The FFT-based filter, the square-window moving average and the exponential moving average are all pretty similar at a fundamental level. However, given the choice over all moving averages, which is the best one?

The answer: the Gaussian moving average, i.e. the kernel of the normal distribution, of which you are aware.

The reason: the Gaussian filter is the only filter that never produces a "fake" maximum, i.e. a maximum that was not there to begin with. This has been proven theoretically for both continuous and discrete data (make sure you use the discrete Gaussian for discrete data, though!). As you increase the Gaussian sigma, the local maxima and minima merge in a most intuitive way. Thus if you want no more than one local maximum per day, set sigma to one, et cetera.
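A minimal sketch of this approach in Python with NumPy, using a truncated sampled Gaussian kernel rather than the exact discrete Gaussian the answer recommends (the helper names and the 4-sigma truncation radius are my choices):

```python
import numpy as np

def gaussian_smooth(y, sigma):
    """Smooth y by convolving with a truncated, normalized Gaussian kernel."""
    radius = int(4 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    return np.convolve(y, kernel, mode="same")

def local_maxima(y):
    """Indices whose value is strictly greater than both neighbours."""
    mid = y[1:-1]
    return np.where((mid > y[:-2]) & (mid > y[2:]))[0] + 1
```

Increasing sigma merges nearby extrema, so the number of detected maxima falls monotonically toward the "big picture" swings.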

Barone answered 23/9, 2011 at 16:16 Comment(0)

I don't know what you mean by "simple derivatives". I take it to mean you have tried gradient descent and found it unsatisfactory because of the abundance of local extrema. If so, you want to look at simulated annealing:

Annealing is a metallurgical process used to temper metals through a heating and cooling treatment. (...). These irregularities are due to atoms being stuck in the wrong place of the structure. In the process of annealing, the metal is heated up and then allowed to cool down slowly. Heating up gives the atoms the energy they need to get un-stuck, and the slow cool-down period allows them to move to their correct location in the structure.

(...) However, in order to escape local optima, the algorithm will have a probability of taking a step in a bad direction: in other words, of taking a step that increases the value for a minimization problem or that decreases the value for a maximization problem. To simulate the annealing process, this probability will depend in part on a "temperature" parameter in the algorithm, which is initialized at a high value and decreased at each iteration. Consequently, the algorithm will initially have a high probability of moving away from a nearby (likely local) optimum. Over the iterations that probability will decrease and the algorithm will converge on the (hopefully global) optimum it did not have the chance to escape from. (source, cuts &, emphasis mine)

I know that local optima are precisely what the circles in your drawing represent, and hence what you want to find. But, as I interpret the quote "so many local mins and maxes simple derivatives won't work", they are also precisely what you find too much of. I assume you have trouble with all the zig-zags your curve makes between two circled points.

All that seems to differentiate the optima you circle from the rest of the points of the curve is, precisely, their globality: to find a lower point than the first point you circle on the left, you have to move further away along the x axis than you would for its close neighbors. That's what annealing gives you: with the temperature parameter, you control the size of the jumps you allow yourself to make. There should be a value for which you catch the "big" local optima, yet miss the "small" ones. What I'm suggesting isn't revolutionary: there are several examples (e.g. 1 2) where people have obtained nice results from such noisy data.
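A toy sketch of the idea in Python: anneal over array indices, accepting occasional uphill moves so small zig-zags don't trap the search (the jump size, step count, starting temperature and cooling rate are arbitrary assumptions, not tuned values):

```python
import numpy as np

def anneal_min(y, start, steps=2000, t0=2.0, cooling=0.995, seed=0):
    """Random-neighbour search over indices of y; uphill moves are accepted
    with a Boltzmann probability that shrinks as the temperature cools."""
    rng = np.random.default_rng(seed)
    i = best = start
    t = t0
    for _ in range(steps):
        j = int(np.clip(i + rng.integers(-5, 6), 0, len(y) - 1))
        # Always accept downhill moves; accept uphill ones probabilistically.
        if y[j] <= y[i] or rng.random() < np.exp((y[i] - y[j]) / t):
            i = j
        if y[i] < y[best]:
            best = i  # track the best point seen so far
        t *= cooling
    return best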

Roseanneroseate answered 15/8, 2011 at 2:55 Comment(4)
The whole point of simulated annealing is to avoid local optima, which are exactly what the OP wants to find.Palp
I noticed he wanted to skip some local optima, and from what his circled curve looks like, I understand he wants to find some others. I think my suggestion is highly relevant to that problem, and have lengthily edited my answer to argue and explain that. Please have a look.Roseanneroseate
+1 Points taken. Adjusting the temperature parameter is the key.Palp
I just made a mistake in the points I circled. I meant to circle all the local minima and maxima.Riverhead

Simply define what you mean by minimum and maximum in a precise, but tunable way, and then tune it until it finds the right number of minima and maxima. For example, you can first smooth the graph by replacing each value with the average of that value and the N values left and right of it. By increasing N, you can reduce the number of minima and maxima you find.

You can then define a minimum as a point where if you skip A values left and right, the next B values all show a consistent increasing trend. By increasing B, you can find fewer minima and maxima. By adjusting A, you can tune how 'flat' a minimum or maximum is allowed to be.

Once you use a tunable algorithm, you can just tune it until it looks right.
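A sketch of both tunable steps in Python: N controls the smoothing window, A how flat an extremum may be, and B how long the trend away from it must last (the function names and the parameterization details are my assumptions):

```python
import numpy as np

def smooth(y, n):
    """Replace each value by the mean of itself and n neighbours per side."""
    kernel = np.ones(2 * n + 1) / (2 * n + 1)
    return np.convolve(y, kernel, mode="same")

def find_minima(y, a, b):
    """Indices i where, after skipping a points on each side, the next b
    points rise consistently as you move away from i."""
    hits = []
    for i in range(a + b, len(y) - a - b):
        right = y[i + a: i + a + b + 1]
        left = y[i - a - b: i - a + 1][::-1]  # reversed: walking away leftwards
        if np.all(np.diff(right) > 0) and np.all(np.diff(left) > 0):
            hits.append(i)
    return hits
```

Increasing n or b reduces the number of minima reported; a maximum detector is the same test with the inequalities flipped.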

Unaffected answered 15/8, 2011 at 9:11 Comment(0)

You can use the spline method to build a continuous approximating polynomial of the desired degree for your original function. Once you have this polynomial, look for its local minima/maxima using basic calculus.

Note that the spline method gives you an approximating polynomial that is both smooth, so the local minima/maxima are easy to find, and as close as possible to the original function, so its local minima/maxima should be very close to the true ones in the original function.

To improve accuracy, after finding the local minima/maxima of the generated polynomial, for each x0 that represents a local minimum/maximum, look at all x such that x0 - delta < x < x0 + delta to find the real extremum this point represents in the original data.
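As a rough sketch of this idea in Python, one can substitute a least-squares polynomial for the spline and then refine against the raw data as described (the degree, delta, and the sample series are arbitrary assumptions):

```python
import numpy as np

x = np.arange(100.0)
y = np.sin(x / 8.0) + 0.05 * np.sin(3.3 * x)  # smooth swing plus fast wiggle

# Fit one smoothing polynomial (standing in for the spline) and take the
# real roots of its derivative as candidate extremum locations.
poly = np.polynomial.Polynomial.fit(x, y, deg=9)
crit = poly.deriv().roots()
candidates = sorted(r.real for r in crit if abs(r.imag) < 1e-8 and 0 < r.real < 99)

# Refine: search the raw data within +/- delta samples of each candidate.
delta = 5
refined = []
for c in candidates:
    c0 = int(round(c))
    lo, hi = max(0, c0 - delta), min(len(y), c0 + delta + 1)
    refined.append(lo + int(np.argmin(y[lo:hi])))  # use argmax for maxima
```

Polynomial.fit rescales x internally, which keeps the high-degree fit numerically stable; roots() maps the critical points back to the original x range.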

Cosmogony answered 20/9, 2011 at 12:14 Comment(0)

I have often found that the extrema perceived subjectively by humans (read: the only extrema in stock charts, which are mostly random noise) emerge after Fourier bandpass filtering. You might try this algorithm:

  1. Perform an FFT
  2. Do a bandpass in frequency space. Select the bandpass parameters based on the range of data you want your extrema to look good on, that is, the timescale of interest.
  3. Perform an inverse FFT.
  4. Select the local maxima of the resulting curve.

The parameters of the second step seem pretty subjective, but again, subjectiveness is the very nature of stock chart analysis.
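The four steps above can be sketched in Python with NumPy; the band edges low/high (in cycles per window) are exactly the subjective parameters step 2 refers to (the function name is mine):

```python
import numpy as np

def bandpass_extrema(y, low, high):
    """FFT -> zero bins outside [low, high) cycles -> inverse FFT ->
    local-maximum indices of the filtered curve."""
    spec = np.fft.rfft(y)
    keep = np.zeros_like(spec)
    keep[low:high] = spec[low:high]          # crude brick-wall bandpass
    filtered = np.fft.irfft(keep, n=len(y))
    mid = filtered[1:-1]
    maxima = np.where((mid > filtered[:-2]) & (mid > filtered[2:]))[0] + 1
    return filtered, maxima
```

Narrowing the band keeps only the slow swings, so fewer, larger-scale maxima survive.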

Bootlick answered 18/9, 2011 at 0:51 Comment(0)

Here is @Dr. belisarius's code converted to Python:

import pandas as pd
import numpy as np

def emasma_maxmin(l: pd.Series, sma_period=10, ema_factor=.2):
    """
    mathematica's ema(sma(sma_period), ema_factor) minmax in python
    :param l: prices as pd.Series
    :param sma_period
    :param ema_factor
    :return: a tuple that contains max and min index lists
    """
    l1 = l.rolling(window=sma_period).mean().ewm(com=int(1/ema_factor)-1).mean().dropna()
    l1_triples = [w.to_list() for w in l1.rolling(window=3) if len(w.to_list()) == 3]
    # utilize the fact that original indexes are preserved in pd.Series
    l1_indexes = l1.index[:-2]
    # index into the original l that corresponds to the first element of the triple
    l2 = [index for (index, (fst, snd, trd)) in zip(l1_indexes, l1_triples) if fst < snd > trd]
    l3 = [index for (index, (fst, snd, trd)) in zip(l1_indexes, l1_triples) if fst > snd < trd]
    # argsort of the negated window sorts indices from highest to lowest value,
    # so element [0] of each range is the window maximum and [-1] the minimum
    max_ranges = [(-np.array([l[index] for index in range(index - sma_period + 1, index + 1)])).argsort() + (index - sma_period + 1) for index in l2]
    min_ranges = [(-np.array([l[index] for index in range(index - sma_period + 1, index + 1)])).argsort() + (index - sma_period + 1) for index in l3]
    return (
        [r[0] for r in max_ranges if len(r) > 0],
        [r[-1] for r in min_ranges if len(r) > 0],
    )

You may visualize the results via something like (using mplfinance):

import pandas as pd
import mplfinance as mpf

# read OHLC csv
msft = pd.read_csv("MSFT1440.csv", parse_dates=True)
df = msft.loc[msft['date'] >= '2021-01-01']
max_indexes, _ = emasma_maxmin(df.high)
_, min_indexes = emasma_maxmin(df.low)
maxs, mins = set(max_indexes), set(min_indexes)
# color minmax candlesticks
mco = ['green' if index in maxs else 'red' if index in mins else None
       for index in df.index.values]
# generate vertical lines at minmax
vlines = [pd.to_datetime(str(df['date'][index]))
          for index in df.index.values if index in max_indexes or index in min_indexes]
vline_colors = ['green' if index in maxs else 'red'
                for index in df.index.values if index in max_indexes or index in min_indexes]

df.set_index(pd.DatetimeIndex(df['date']), inplace=True)
mpf.plot(df,
         type='candle',
         marketcolor_overrides=mco,
         vlines=dict(vlines=vlines, linewidths=0.25, colors=vline_colors))

which produces a candlestick chart with the extremum candles recolored and a vertical line at each detected extremum.

Sophomore answered 19/2, 2023 at 16:49 Comment(1)
Would you please explain the meaning of max_ranges and min_ranges? What's the meaning of negating the list and argsorting it? Thanks!Reprove

As mentioned by belisarius, the best method seems to be filtering the data smooth. With enough smoothing, looking for changes in the slope will pinpoint the local minima and maxima (the derivative helps here). I would use a centered sliding window for a running median/average, or an ongoing EMA (or a similar IIR filter).
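A minimal sketch of the slope-sign-change test after a small centered average (the window length and sample data are arbitrary assumptions; the reported indices refer to the smoothed series):

```python
import numpy as np

y = np.array([3., 1., 2., 5., 4., 4.5, 2., 3.])
smoothed = np.convolve(y, np.ones(3) / 3, mode="valid")  # small running average
sign = np.sign(np.diff(smoothed))
# A local max is where the slope flips + -> -, a local min where it flips - -> +.
maxima = np.where((sign[:-1] > 0) & (sign[1:] < 0))[0] + 1
minima = np.where((sign[:-1] < 0) & (sign[1:] > 0))[0] + 1
```

With more smoothing (a wider window, or an EMA in place of the box average), fewer sign changes survive and only the larger swings remain.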

Cool answered 23/9, 2011 at 17:22 Comment(0)

This Python code detects local extrema over a range of 5 bars; df should contain OHLC columns:

df['H_5'], df['L_5'] = df['H'].shift(-5), df['L'].shift(-5)
df['MAXH5'] = df['H'].rolling(window=5).max()
df['MINL5'] = df['L'].rolling(window=5).min()
df['MAXH_5'] = df['H_5'].rolling(window=5).max()
df['MINL_5'] = df['L_5'].rolling(window=5).min()
# inplace=True is needed so the assignments land in df itself
df.eval("maximum5 = (MAXH5 == H) & (MAXH_5 == H)", inplace=True)
df.eval("minimum5 = (MINL5 == L) & (MINL_5 == L)", inplace=True)
df.eval("is_extremum_range5 = maximum5 | minimum5", inplace=True)

The result is in the boolean column is_extremum_range5.

Assam answered 20/11, 2019 at 11:6 Comment(0)

Fermat's theorem on stationary points (the derivative vanishes at interior extrema) will help you find the local minima and maxima.

Newish answered 15/8, 2011 at 2:10 Comment(1)
I quote the OP: " so many local mins and maxes simple derivatives won't work."Roseanneroseate
