Are there known techniques to generate realistic looking fake stock data?
B

12

25

I recently wrote some JavaScript code to generate random fake stock data, as I wanted to show a chart that at first glance looked like real stock data, but all I came up with was pretty noddy. I was just wondering if there are some resources that explain how this might be done "properly", i.e. so you get realistic-looking data that has the same patterns you see in real stock data?

Bindman answered 21/12, 2011 at 23:30 Comment(0)
L
10

I had a book Fractal Market Analysis (just got rid of it recently) that talked about the statistical properties of stock prices. Not very useful for investing, but it might have been able to help you.

You'll need something that models a random process with the desired statistical properties. Two examples of random processes are Gaussian white noise and a Wiener process (the latter of which models Brownian motion and is also the limit of a random walk with small steps).

If I remember right from the Fractal Market Analysis book, there was an assertion that the logarithm of stock prices had characteristics similar to so-called "1/f noise" or "pink noise", so you could try looking for articles on pink noise generation in software (and then take the results and plug them into e^x). (Edit: oops, I misremembered. It looks like it's more like fractional Brownian motion.)

(Here's a nice readable essay that talks about the history behind the study of fractal random processes -- and how the flooding of the Nile relates to the stock market -- unfortunately it doesn't get into technical data, but maybe there are search terms like Hurst exponent that can get you started.)

The problem becomes more difficult if you need multiple series of stock data, in which case there is some correlation between stocks that depends on various common factors (e.g. national economy, industry type, etc.). I'm not sure how you could go about that, but start with one random process first.
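One common way to get two series that tend to move together is to draw correlated normal shocks and feed them into two log-price random walks. The following is only a minimal Python sketch under stated assumptions: a single correlation coefficient `rho`, plain Gaussian log-returns, and a hypothetical function name and parameters that do not come from the book or the answer.

```python
import math
import random


def correlated_paths(n_steps, rho, sigma_a, sigma_b,
                     start_a=100.0, start_b=100.0, seed=None):
    """Return two price paths whose log-returns have correlation close to rho."""
    rng = random.Random(seed)
    a, b = [start_a], [start_b]
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        # Mix the shared shock z1 with an independent shock e.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * e
        a.append(a[-1] * math.exp(sigma_a * z1))
        b.append(b[-1] * math.exp(sigma_b * z2))
    return a, b
```

For example, `correlated_paths(250, 0.8, 0.01, 0.02, seed=1)` would give two year-long daily series that rise and fall together most of the time. More than two stocks would need a full correlation matrix, which this sketch does not cover.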

Luanneluanni answered 21/12, 2011 at 23:49 Comment(2)
Thanks for this. I'll have to get reading! Yeah, I see what you mean about multiple stocks. I guess if you want to mimic the stocks in a particular sector, say, that tend to go up and down together, it's way more complex. Also, getting it to look good over different periods (e.g. day, month and year) looks like a real challenge! - Bindman
It might also be news that suddenly pulls the whole market in one direction. - Hymnody
R
75

A simple approach is to use a single volatility number that limits how much the stock can change within a given period (say, a single day). The higher the number, the more volatile the stock. Each day you can compute the new price as follows:

rnd = Random_Float(); // generate number, 0 <= x < 1.0
change_percent = 2 * volatility * rnd;
if (change_percent > volatility)
    change_percent -= (2 * volatility);
change_amount = old_price * change_percent;
new_price = old_price + change_amount;

A stable stock would have a volatility number of perhaps 2%. A volatility of 10% would show some pretty large swings.

Not perfect, but it could look pretty realistic.
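The steps above can be sketched in Python as follows. This is a minimal sketch, assuming the volatility is given as a fraction (0.02 for 2%) and that each new price becomes the starting point for the next day; the function name `fake_prices` and the `seed` parameter are illustrative, not part of the original pseudocode.

```python
import random


def fake_prices(start_price, volatility, n_days, seed=None):
    """Uniform random walk: each day moves by at most +/- volatility."""
    rng = random.Random(seed)
    prices = [start_price]
    for _ in range(n_days):
        # Uniform fractional change in [-volatility, +volatility).
        change_percent = 2 * volatility * (rng.random() - 0.5)
        # The next day starts from the price just generated.
        prices.append(prices[-1] * (1 + change_percent))
    return prices
```

Calling `fake_prices(50.0, 0.02, 250)` would give roughly one trading year of daily prices for a fairly stable stock.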

Samples: [image: chart of prices generated with this algorithm]

Renn answered 21/12, 2011 at 23:53 Comment(8)
Downvoters: It's customary to supply a reason with a downvote. - Renn
I've used this just to mess about with a few things, it's great! However, maybe it's just my maths, but doesn't the change amount need to be: change_amount = (old_price / 100) * change_percent; - Millrace
I created a Java implementation based on this algorithm which worked very well for my needs. Because you can't post code in a comment, I added a reply down below with the code: https://mcmap.net/q/523066/-are-there-known-techniques-to-generate-realistic-looking-fake-stock-data - Apennines
I just want you to know that I've come back to this answer multiple times over the years. I wish I could upvote it more than once. - Reduce
@Jim Mischel - I added a picture showing how it looks. Hope that's cool with you. Btw nifty algorithm. Cheers! - Crocker
Note that if your volatility number is between 1 and 100, then yes, you do have to divide by 100 as mentioned in the comment. But if your volatility is a floating point number between 0 and 1 (i.e. .02 is 2%), then the division isn't necessary. - Renn
Simplification: rnd = Random_Float() - 0.5; and then remove if (change_percent > volatility) change_percent -= (2 * volatility); - Aguie
Don't forget old_price = new_price at the end of each iteration. - Coeval
L
8
# The following is an adaptation from a program shown at page 140 in
# "Stochastic Simulations and Applications in Finance",
# a book written by Huynh, Lai and Soumaré.
# That program was written in MatLab and this one was written in R by me.
# That program produced many price paths and this one produces one.
# The latter is also somewhat simpler and faster.

# Y is the time period in years, for instance 1 (year)
# NbSteps is the number of steps in the simulation,
# for instance 250 (trading days in a year).
# DeltaY is the resulting time step.

# The computations shown implement the exact solution
# to the stochastic differential equation for
# the geometric Brownian motion modelling stock prices,
# with mean mu and volatility sigma, thus generating a stochastic price path
# such as that exhibited by stock prices when price jumps are rare.

PricePath <- function(Y,NbSteps,mu,sigma,InitPrice) {
    DeltaY <- Y/NbSteps; SqrtDeltaY <- sqrt(DeltaY)
    DeltaW <- SqrtDeltaY * rnorm(NbSteps)
    Increments <- (mu-sigma*sigma/2)*DeltaY + sigma*DeltaW
    ExpIncr <- exp(Increments)
    PricePath <- cumprod(c(InitPrice,ExpIncr))
    return(PricePath)
}

The plot of the output from this program looks very much like a stock price path.
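For reference, the exact solution that the comments in the R program refer to can be written as a one-step update. The notation is an assumption chosen to mirror the variable names in the code: $\Delta t$ corresponds to `DeltaY`, and $Z_k$ to one draw from `rnorm`.

```latex
S_{k+1} = S_k \exp\!\left( \left(\mu - \tfrac{\sigma^2}{2}\right)\Delta t
          + \sigma \sqrt{\Delta t}\, Z_k \right),
\qquad Z_k \sim \mathcal{N}(0, 1), \quad \Delta t = \frac{Y}{\mathrm{NbSteps}}
```

Taking the cumulative product of these exponential factors, starting from `InitPrice`, gives the whole path, which is what `cumprod` does in the function above.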

Lundeen answered 22/12, 2011 at 20:44 Comment(0)
T
7

Several answers give a fairly textbook response: use geometric Brownian motion (GBM) to model stock prices. But there's one major reason to consider this wrong: real stock prices do not behave anything like GBM. I'll explain this in a bit.

The reason GBM is used in textbooks to model a stock price process is for simplicity. It helps you get the theory off the ground and derive some basic results which seem to be "essentially" correct. This doesn't mean you should think that's what stock prices "look like" however. That would be like deriving an equation of motion neglecting friction (which is theoretically very useful) and then thinking this is what motion looks like in real life, e.g. everyone slides around on their shoes like ice skates.

One of the theoretically most useful properties of GBM is that future changes are independent of past changes. Is this true of stock prices? Nope. Not at all. Serial correlation occurs everywhere. Not only that, large decreases are usually followed by increased volatility while large increases are usually followed by decreased volatility.

I suppose I might be accused of nitpicking, but these stylized facts are commonly known to investors and economists, so I think it's fair to say GBM doesn't look realistic to anybody that is familiar with stock market behavior.

Econometricians have come up with plenty of models for stock prices. The one that seems to work in a lot of situations is an autoregressive model for the conditional mean combined with a (G)ARCH-type model for the volatility. For the volatility model, an asymmetric GARCH with a fat-tailed distribution (like Student's t) seems to work best for a variety of financial markets.
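To make the idea concrete, here is a minimal Python sketch of an AR(1) mean with an asymmetric (GJR-type) GARCH(1,1) variance. Several things are assumptions rather than anything stated in the answer: the GJR form of the asymmetry, Gaussian innovations used in place of Student's t to keep the code to the standard library, and all parameter values, which are illustrative and not fitted to any market.

```python
import math
import random


def ar_garch_prices(n_steps, start_price=100.0, phi=0.05,
                    omega=2e-6, alpha=0.05, gamma=0.10, beta=0.85,
                    seed=None):
    """Simulate prices from AR(1) log-returns with GJR-GARCH(1,1) variance."""
    rng = random.Random(seed)
    # Start at the unconditional variance of the GJR-GARCH process.
    var = omega / (1.0 - alpha - gamma / 2.0 - beta)
    prev_ret = 0.0
    prev_shock = 0.0
    prices = [start_price]
    for _ in range(n_steps):
        # Asymmetry: negative shocks add an extra gamma term to the variance.
        leverage = gamma if prev_shock < 0 else 0.0
        var = omega + (alpha + leverage) * prev_shock ** 2 + beta * var
        shock = math.sqrt(var) * rng.gauss(0.0, 1.0)
        ret = phi * prev_ret + shock  # AR(1) conditional mean
        prices.append(prices[-1] * math.exp(ret))
        prev_ret, prev_shock = ret, shock
    return prices
```

With these values, volatility clusters after large moves and rises more after drops than after gains, which is the behaviour described above. Swapping in a fat-tailed innovation would need a Student's t sampler, which this sketch leaves out.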

Tanganyika answered 25/11, 2013 at 6:33 Comment(0)
N
6

I wrote a quick and dirty JavaScript version inspired by Peter P.'s response here. I needed to create weekly, yearly and overall trends, so this accepts an array of parameters and overlays these to get a more complex (fake) trend.

function getRandomData(numPoints, center, min, max, cycles)
{
    var result = [];
    var phase = Math.random() * Math.PI;
    var y = center;

    function randomPlusMinus() { return (Math.random() * 2) - 1; }

    $.each(cycles, function(i,thisCycle) {
        thisCycle.phase = Math.random() * Math.PI;
        thisCycle.increment = Math.PI / thisCycle.length;
    });

    for (var i = 0; i < numPoints; i++)
    {
        $.each(cycles, function(i,thisCycle) {
            thisCycle.phase += thisCycle.increment * randomPlusMinus();
            y += (Math.sin(thisCycle.phase) * (thisCycle.variance / thisCycle.length) * (randomPlusMinus() * thisCycle.noise)) + (thisCycle.trend / thisCycle.length);

        });
        if (min) y = Math.max(y,min);
        if (max) y = Math.min(y,max);
        result.push(y);
    }

    return result;
}

var data = getRandomData(365,80,20,100,
                      [{ length: 7, variance: 50, noise: 1, trend: 0},
                       { length: 365, variance: 30, noise: 1, trend: 0},
                       { length: 700, variance: 2, noise: 0, trend: 100}]);

I put a chart on there to show the result: http://jsfiddle.net/z64Jr/3/

Norman answered 7/1, 2014 at 18:34 Comment(0)
A
3

I wanted to reply to Jim Mischel's post above (https://mcmap.net/q/523066/-are-there-known-techniques-to-generate-realistic-looking-fake-stock-data) but since I wanted to include code, I am forced to put my reply here.

Based on Jim Mischel's algorithm, I wrote the following Java implementation, and it worked well for my needs, generating numbers that, when graphed, produced visually appealing, realistic-looking stock ticker prices.

Java:

private float getNextPrice(float oldPrice)
{
    // Instead of a fixed volatility, pick a random volatility
    // each time, between 2 and 12 (nextFloat() is in [0, 1)).
    float volatility = _random.nextFloat() * 10 + 2;

    float rnd = _random.nextFloat();

    float changePercent = 2 * volatility * rnd;

    if (changePercent > volatility) {
        changePercent -= (2 * volatility);
    }
    float changeAmount = oldPrice * changePercent/100;
    float newPrice = oldPrice + changeAmount;

    // Add a ceiling and floor.
    if (newPrice < MIN_PRICE) {
        newPrice += Math.abs(changeAmount) * 2;
    } else if (newPrice > MAX_PRICE) {
        newPrice -= Math.abs(changeAmount) * 2;
    }

    return newPrice;

}

Note that, as wiggles pointed out in his comment, I needed to divide percentage by 100 when declaring the changeAmount variable.

Apennines answered 12/3, 2014 at 15:12 Comment(0)
K
1

Take a look at Yahoo Finance; they offer free delayed data from the stock exchange, as well as charts.

Here's an article about using the feed: http://www.codeproject.com/KB/aspnet/StockQuote.aspx

You'll need jQuery, or you can just use XMLHttpRequest to consume the service. FYI, there's a plugin for jQuery to process a CSV: http://code.google.com/p/js-tables/

Keeton answered 21/12, 2011 at 23:53 Comment(1)
...or, depending on the need, one could possibly download actual stock price series with long histories (meaning: without on-the-fly updates). - Almond
C
1

I needed to create some dummy market data for a sim game I was working on. I needed the data to look like market data yet stay within certain ranges so it was predictable in terms of starting price, maximum / minimum for the day.

In the end, I combined sine waves of varying frequencies and then added in some randomness and the results don't just look good but are consistent (you don't get anything that looks odd). Even where the sine wave pattern can be perceived, it still looks okay.

[image: randomly generated market data]

The code is written in a BASIC scripting language, but it should be very simple to understand and convert to whatever language you want. Once you've got the array of normalised data, multiply the values by whatever maximum value you want to get a bounded dataset.

dim values[] as float
dim offsets[] as integer
dim frequencies[] as float

function GetPoint(x#, f#, a#, o#)

    f# = 360.0 / f#

    x# = FMod(x# + o#, f#)
    angle# = (x# / f#) * 360.0

    r# = Sin(angle#) * a#

endfunction r#

function Generate()

    // Empty arrays
    offsets.Length = -1
    frequencies.Length = -1
    values.Length = -1

    offsets.Insert(Random(0, 359))
    offsets.Insert(Random(0, 359))
    offsets.Insert(Random(0, 359))

    f# = Random(100, 300)
    f# = f# / 1000.0
    frequencies.Insert(f#)
    f# = Random(500, 1000)
    f# = f# / 1000.0
    frequencies.Insert(f#)
    f# = Random(2000, 4000)
    f# = f# / 1000.0
    frequencies.Insert(f#)

    c# = 0
    for i = 0 to 1919
        v# = 0
        v# = v# + GetPoint(i, frequencies[0], 190, offsets[0])
        v# = v# + GetPoint(i, frequencies[1], 85, offsets[1])
        v# = v# + GetPoint(i, frequencies[2], 40, offsets[2])

        r# = Random(0, 40)
        r# = r# - 20.0

        c# = Clamp(c# + r#, c# - 40, c# + 40)
        v# = v# + c#

        values.Insert(v#)
    next i

    start# = values[0]
    max# = 0.0
    for i = 0 to values.Length
        values[i] = values[i] - start#
        if Abs(values[i]) > max#
            max# = Abs(values[i])
        endif
    next i

    // Normalize
    for i = 0 to values.Length
        values[i] = (values[i] / max#)
    next i

endfunction

function Clamp(v#, min#, max#)

    if v# < min#
        exitfunction min#
    elseif v# > max#
        exitfunction max#
    endif

endfunction v#
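
For readers without a BASIC interpreter, the same idea can be sketched in Python. This is a loose sketch of the approach described above (three sine waves plus a random drift, shifted to start at zero and then normalised), not a line-by-line port: the periods, the phase handling and the function name `sine_market_data` are assumptions, while the amplitudes 190, 85 and 40 and the random step of up to 20 either way are taken from the BASIC code.

```python
import math
import random


def sine_market_data(n_points=1920, seed=None):
    """Sum three sine waves plus a random drift, normalised to [-1, 1]."""
    rng = random.Random(seed)
    # (period in samples, amplitude); the periods are illustrative assumptions.
    waves = [(rng.uniform(800, 1600), 190.0),
             (rng.uniform(300, 600), 85.0),
             (rng.uniform(80, 160), 40.0)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in waves]
    drift = 0.0
    values = []
    for i in range(n_points):
        v = sum(amp * math.sin(2.0 * math.pi * i / period + phase)
                for (period, amp), phase in zip(waves, phases))
        drift += rng.uniform(-20.0, 20.0)  # random component
        values.append(v + drift)
    # Shift so the series starts at 0, then scale by the largest excursion.
    start = values[0]
    shifted = [v - start for v in values]
    peak = max(abs(v) for v in shifted) or 1.0
    return [v / peak for v in shifted]
```

Because the result is normalised, a price series would be something like `base + scale * v` for each value `v`, where `base` and `scale` are whatever starting price and maximum swing you want.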
Chirp answered 20/5, 2017 at 21:2 Comment(2)
I converted this to ES6 and the data generated doesn't make sense in relation to your example graph. Can you explain how the generated data is supposed to be graphed? Thanks. - Fleurdelis
The data is normalised, so you'll need to multiply it by whatever maximum value you're looking for. Then simply iterate over the data and plot. - Chirp
O
0

Here's my attempt in Ruby! :) This will output a string you can copy and paste into Google Charts. I allow for positive, negative or no trending of the data. This code could probably be optimized and/or tweaked for randomness/regularity.

Google charts: https://code.google.com/apis/ajax/playground/?type=visualization#line_chart

# In order to generate a semi-realistic looking graph behavior
# we use a sine function to generate periodic behavior.  In order to avoid
# a graph that is too regular, we introduce randomness at two levels:
# The delta between steps across the x-axis is random, but within a range(deltavariance)
# The wavelength of the sine function is varied by randomly incrementing the index we pass
# to the sine function(sine_index)

# CONFIGURATION VARIABLES
yvalue = 1 # start value
range = 100 # y-range
deltavariance = 10 # allowable variance between changes
sine_index, wavelength = 0, 0.33 #index into our sine function that determines whether we change direction or not
i, maxi = 0, 100 # our counter and its maximum
data = {sine_index => yvalue} # seed our data structure with its first value
trend = :positive # :negative, :none # do we want the graph to trend upwards, downwards or neither
periodmin, periodmax = 0, 0 # vars to enforce trending
direction = 1 # start in a positive direction, -1 for negative

# DO NOT EDIT BELOW THIS LINE
while(i < maxi)

  olddirection = direction
  direction = Math.sin(sine_index).to_f
  direction = direction < 0 ? direction.floor : direction.ceil

  delta = rand(deltavariance) 
  yvalue += delta * direction

  if trend == :positive 
    yvalue = periodmin if yvalue < periodmin
    periodmin = yvalue if olddirection < direction
  elsif trend == :negative
    yvalue = periodmax if yvalue > periodmax
    periodmax = yvalue if olddirection > direction

  end

  data[sine_index] = yvalue
  sine_index += Math.sin(rand) # rand is in [0, 1), so this adds a random step between 0 and about 0.84
  i += 1
end

code = <<-CODE
function drawVisualization() {
  // Create and populate the data table.
  var data = google.visualization.arrayToDataTable([
    ['x', 'Cats'],
    DATASTR
  ]);

  // Create and draw the visualization.
  new google.visualization.LineChart(document.getElementById('visualization')).
      draw(data, {curveType: "function",
                  width: 500, height: 400,
                  vAxis: {maxValue: 10}}
          );
}
CODE

datastr = data.collect{|k,v|  "[#{k},#{v}]"}.join(",")
code = code.gsub('DATASTR', datastr)
puts code
Onondaga answered 20/11, 2013 at 6:48 Comment(1)
Sorry, don't know why the syntax highlighting isn't working... see this pastie: pastie.org/8494639 - Onondaga
H
0
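// Requires: import java.util.concurrent.ThreadLocalRandom;
double price = 2000;
while (true) { // runs until stopped; use a bounded loop to collect a fixed sample
    double min = price * -0.02;
    double max = price * 0.02;
    // Symmetric move of up to 2% of the current price in either direction.
    double randomNum = ThreadLocalRandom.current().nextDouble(min, max);
    price = price + randomNum;
    System.out.println(price);
}

It is in Java. Just plot the result in an Excel column to see the graph. Use a large set of values for the plot. It is intriguing to see how similar it looks to real stock data.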

Hinton answered 11/1, 2017 at 4:52 Comment(0)
A
0

Golang code based on the above algorithm by @Jim Mischel

package main

import (
    "fmt"
    "math/rand"
)

func main() {

    var (
        change_percent, change_amount, new_price, old_price float64
    )
    volatility := 0.02
    old_price = 50

    for i := 0; i < 100; i++ {
        rnd := rand.Float64() // generate number, 0 <= x < 1.0
        // fmt.Printf("rnd %v ", rnd)
        change_percent = 2 * volatility * rnd
        // fmt.Printf("change_percent %v\n", change_percent)
        if change_percent > volatility {
            change_percent = change_percent - (2 * volatility)
        }
        change_amount = old_price * change_percent
        new_price = old_price + change_amount
        fmt.Printf("new_price %f\n", new_price)
        old_price = new_price // carry the new price into the next iteration
    }

}

Adverb answered 7/11, 2022 at 21:31 Comment(0)
R
-1

Here is the code that I created for my usage. The prices are created for new candle-stick that includes Open, High, Low, Close, and Volume. The new prices are generated based on % of volatility. I used total 5% for prices.

The code is C# based.

public class PriceBar
{
    public DateTime Date { get; set; }
    public double Open { get; set; }
    public double High { get; set; }
    public double Low { get; set; }
    public double Close { get; set; }
    public long Volume { get; set; }
}

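// Reuse one Random instance; creating a new one on every call can repeat seeds.
private static readonly Random random = new Random();

public static double GetRandomNumber(double minimum, double maximum)
{
    return random.NextDouble() * (maximum - minimum) + minimum;
}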

public static void GenerateRandomBar(PriceBar newBar)
{
    double fluct = 0.025;
    double volFluct = 0.40;

    //Open is equal to the previous close
    newBar.Open = newBar.Close;
    newBar.Close = GetRandomNumber(newBar.Close - newBar.Close * fluct, newBar.Close + newBar.Close * fluct);
    newBar.High = GetRandomNumber(Math.Max(newBar.Close, newBar.Open), Math.Max(newBar.Close, newBar.Open) + Math.Abs(newBar.Close - newBar.Open) * fluct);
    newBar.Low = GetRandomNumber(Math.Min(newBar.Close, newBar.Open), Math.Min(newBar.Close, newBar.Open) - Math.Abs(newBar.Close - newBar.Open) * fluct);
    newBar.Volume = (long)GetRandomNumber(newBar.Volume * volFluct, newBar.Volume);
}

Usage:

Create an instance of PriceBar and fill in the previous bar's prices. Pass the PriceBar instance to the function GenerateRandomBar(). The function updates that instance in place with the new values.

Rok answered 5/8, 2014 at 1:13 Comment(1)
That's not how you generate HOLC data. The realistic looking (fake) stream of orders, once generated, can be decomposed into time frames, which are basically a grouping of all the orders placed within certain periods (1m, 3m, 10m, 1d and so on). Then you can extract the opening, highest, lowest and closing prices accordingly based on the tick data. Generating random HOLC data does not make any sense. - Shipmaster
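
The approach described in this comment can be sketched in Python: generate a stream of tick prices first, then group the ticks into bars and read the open, high, low and close off each group. This is a minimal sketch under stated assumptions: bars are formed from a fixed number of ticks rather than from timestamps, the tick generator is a plain uniform random walk, and both function names are hypothetical.

```python
import random


def ticks_to_ohlc(ticks, ticks_per_bar):
    """Group consecutive tick prices into (open, high, low, close) bars."""
    bars = []
    for i in range(0, len(ticks), ticks_per_bar):
        chunk = ticks[i:i + ticks_per_bar]
        bars.append((chunk[0], max(chunk), min(chunk), chunk[-1]))
    return bars


def random_ticks(n, start=100.0, volatility=0.001, seed=None):
    """Tiny uniform random walk standing in for a stream of trade prices."""
    rng = random.Random(seed)
    ticks = [start]
    for _ in range(n - 1):
        ticks.append(ticks[-1] * (1 + volatility * (2 * rng.random() - 1)))
    return ticks
```

For example, `ticks_to_ohlc(random_ticks(6000, seed=1), 60)` would give 100 bars in which the high and low always bracket the open and close, something the random-bar code above does not guarantee by construction.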
