Linear Regression in Javascript [closed]
I want to do least squares fitting in JavaScript in a web browser.

Currently users enter data point information using HTML text inputs and then I grab that data with jQuery and graph it with Flot.

After the user has entered their data points, I would like to present them with a "line of best fit". I imagine I would calculate the linear, polynomial, exponential and logarithmic equations and then choose the one with the highest R^2 value.

I can't seem to find any libraries that will help me do this, though. I stumbled upon jStat, but it is completely missing documentation (as far as I can find), and after digging through the source code it doesn't seem to have any linear regression functionality built in. I'm basing this purely on function names, however.

Does anyone know of any JavaScript libraries that offer simple regression analysis?


The hope would be that I could use the library like so...

If I had some set of scatter points in an array var points = [[3,4],[15,45],...[23,78]], I would be able to hand that to some function like lin_reg(points) and it would return something like [7.12,3] if the linear equation was y = 7.12 x + 3.
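As a rough illustration of the interface described above, here is a minimal sketch of an ordinary least squares fit. The name lin_reg and the [slope, intercept] return shape come from the question; the null return for degenerate input is an assumption added for the example, not something the question specifies.

```javascript
// Hypothetical helper matching the lin_reg(points) call imagined in the question.
// points: array of [x, y] pairs. Returns [slope, intercept], or null when
// the fit is undefined (fewer than two points, or all x values identical).
function lin_reg(points) {
    var n = points.length;
    var sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;

    for (var i = 0; i < n; i++) {
        var x = points[i][0];
        var y = points[i][1];
        sumX += x;
        sumY += y;
        sumXY += x * y;
        sumXX += x * x;
    }

    var denom = n * sumXX - sumX * sumX;
    if (n < 2 || denom === 0) {
        return null; // assumption: signal a degenerate fit with null
    }

    var slope = (n * sumXY - sumX * sumY) / denom;
    var intercept = (sumY - slope * sumX) / n;
    return [slope, intercept];
}

// Points lying exactly on y = 2x + 1
var fit = lin_reg([[0, 1], [1, 3], [2, 5], [3, 7]]); // [2, 1]
```

The same closed-form sums appear in several of the answers below; only the input and output shapes differ.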

Glyptography answered 1/6, 2011 at 1:28 Comment(1)
A popular library that does what you want, linear regression, is simple-statistics.Rupp
C
27

What kind of linear regression? For something simple like least squares, I'd just program it myself:

http://mathworld.wolfram.com/LeastSquaresFitting.html

The math there is not too hard to follow. Give it a shot for an hour or so and let me know if it's too hard; I can try it.

EDIT:

Found someone that did it:

http://dracoblue.net/dev/linear-least-squares-in-javascript/159/

Clement answered 1/6, 2011 at 1:32 Comment(8)
Sorry it has been a while since my last stats class. ;) Yes, least squares fitting is what I'm looking for. (I edited my question to say as much).Glyptography
Also, I will certainly program it myself if I have to but I would certainly prefer to utilize a library if possible--and I can't imagine there isn't something out there that does this.Glyptography
I did a small library some time ago to do simple matrix stuff like determinants and inverses that are handy for least squares calcs - it's not too hard. There are some neat tricks for performance but they're only necessary if that's an issue. They obfuscate the code.Interpret
@omega, duplicate x values? The basic algorithm doesn't seem to account for that, and I haven't done the math to figure it out, but I'm guessing you might be ok just averaging the different y values for each x that occurs multiple times? Or maybe a geometric average? Not sure, fiddle with it and post something if you figure it out.Clement
@Clement Hey, just a quick check: is the linear regression algorithm applicable for predicting the future profit of an organization based on past profit?Pepita
@hyperfkcb, it seems very unlikely that profit can be predicted by any kind of regression, let alone basic linear regression. For predictive models look at the Holt-Winters algorithm and other similar ideas. But also watch the movie Pi and see how he drilled a hole in his head because he went insane trying to predict the stock market. Rule of thumb: don't look for silver bullets.Clement
@Clement Even for cubic regression? I have found a solution here: #22320105 But I'm not sure how to actually apply itPepita
Profits do not follow mathematical functions that you and I can possibly understand :) Profits change according to thousands of variables, please think about the problem more carefully before you make some big mistakes.Clement
A
28

The simplest solution I found for the question at hand can be found in the following post: http://trentrichardson.com/2010/04/06/compute-linear-regressions-in-javascript/

Note that in addition to the linear equation, it also returns the R^2 score, which can be useful.

** EDIT **

Here is the actual code snippet:

function linearRegression(y,x){
        var lr = {};
        var n = y.length;
        var sum_x = 0;
        var sum_y = 0;
        var sum_xy = 0;
        var sum_xx = 0;
        var sum_yy = 0;

        for (var i = 0; i < y.length; i++) {

            sum_x += x[i];
            sum_y += y[i];
            sum_xy += (x[i]*y[i]);
            sum_xx += (x[i]*x[i]);
            sum_yy += (y[i]*y[i]);
        } 

        lr['slope'] = (n * sum_xy - sum_x * sum_y) / (n*sum_xx - sum_x * sum_x);
        lr['intercept'] = (sum_y - lr.slope * sum_x)/n;
        lr['r2'] = Math.pow((n*sum_xy - sum_x*sum_y)/Math.sqrt((n*sum_xx-sum_x*sum_x)*(n*sum_yy-sum_y*sum_y)),2);

        return lr;
}

To use it, pass two arrays: the known y values first, then the known x values (note the argument order). For example:

var known_y = [1, 2, 3, 4];
var known_x = [5.2, 5.7, 5.0, 4.2];

var lr = linearRegression(known_y, known_x);
// now you have:
// lr.slope
// lr.intercept
// lr.r2
Accrete answered 22/7, 2015 at 14:56 Comment(4)
Hey but what can I do with those values? For example, I wanted to predict a future profit of an organization based on the past records. How can I actually use the slope, intercept and r2 returned to calculate the forecast?Pepita
@hyperfkcb Given 2 lists of matching positions (x coordinates, y coordinates), find the linear equation's slope and intercept (Y = slope * X + intercept). You can use this equation to predict a future Y based on a given future X valueAccrete
Hey, sorry, but the algorithm above only works for data that grows in a straight line, right? If my data has a wavy shape, then the prediction will become inaccurate, am I right?Pepita
@hyperfkcb correct, that is the meaning of "linear" :)Accrete
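The prediction step described in these comments can be sketched as follows. predictY is a hypothetical helper, not part of the answer, and the exampleFit values are made up for illustration; in practice the object would come from linearRegression(known_y, known_x).

```javascript
// Hypothetical helper, not part of the original answer:
// evaluates the fitted line Y = slope * X + intercept at a given x.
function predictY(lr, x) {
    return lr.slope * x + lr.intercept;
}

// Stand-in for the object returned by linearRegression(known_y, known_x).
// These numbers are illustrative only.
var exampleFit = { slope: 1.5, intercept: 4, r2: 0.9 };

var predicted = predictY(exampleFit, 6); // 1.5 * 6 + 4 = 13
```

An r2 close to 1 suggests the points sit near a straight line; a low r2 is a hint that extrapolating with this line is unlikely to be reliable.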
H
13

I found this great JavaScript library.

It's very simple, and seems to work perfectly.

I also can't recommend Math.JS enough.

Helenahelene answered 22/7, 2013 at 16:47 Comment(1)
That library is no longer maintained and by default it rounds the gradient, etc. to two decimals, i.e., a gradient of 0.009 is rounded to 0, which makes absolutely no sense to do as default.Pucida
M
9

Simple linear regression with measures of variation (total sum of squares = regression sum of squares + error sum of squares), the standard error of estimate SEE (residual standard error), the coefficient of determination R^2, and the correlation coefficient R.

const regress = (x, y) => {
    const n = y.length;
    let sx = 0;
    let sy = 0;
    let sxy = 0;
    let sxx = 0;
    let syy = 0;
    for (let i = 0; i < n; i++) {
        sx += x[i];
        sy += y[i];
        sxy += x[i] * y[i];
        sxx += x[i] * x[i];
        syy += y[i] * y[i];
    }
    const mx = sx / n;
    const my = sy / n;
    const yy = n * syy - sy * sy;
    const xx = n * sxx - sx * sx;
    const xy = n * sxy - sx * sy;
    const slope = xy / xx;
    const intercept = my - slope * mx;
    const r = xy / Math.sqrt(xx * yy);
    const r2 = Math.pow(r,2);
    let sst = 0;
    for (let i = 0; i < n; i++) {
        sst += Math.pow((y[i] - my), 2);
    }
    const sse = sst - r2 * sst;
    const see = Math.sqrt(sse / (n - 2));
    const ssr = sst - sse;
    return {slope, intercept, r, r2, sse, ssr, sst, sy, sx, see};
}
regress([1, 2, 3, 4, 5], [1, 2, 3, 4, 3]);
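The identity SST = SSR + SSE quoted above can also be checked by computing each sum of squares directly from the fitted values rather than deriving SSE from R^2. The sketch below does that; checkDecomposition is a hypothetical name and the function is an independent illustration, not a rewrite of regress.

```javascript
// Hypothetical check: compute the three sums of squares directly
// from the fitted values and compare them.
const checkDecomposition = (x, y) => {
    const n = y.length;
    const mx = x.reduce((a, b) => a + b, 0) / n;
    const my = y.reduce((a, b) => a + b, 0) / n;

    // Centered sums give the same slope as the raw-sum formulas.
    let sxy = 0;
    let sxx = 0;
    for (let i = 0; i < n; i++) {
        sxy += (x[i] - mx) * (y[i] - my);
        sxx += (x[i] - mx) * (x[i] - mx);
    }
    const slope = sxy / sxx;
    const intercept = my - slope * mx;

    let sst = 0; // total: observed y versus mean of y
    let ssr = 0; // regression: fitted y versus mean of y
    let sse = 0; // error: observed y versus fitted y
    for (let i = 0; i < n; i++) {
        const fitted = slope * x[i] + intercept;
        sst += Math.pow(y[i] - my, 2);
        ssr += Math.pow(fitted - my, 2);
        sse += Math.pow(y[i] - fitted, 2);
    }
    return { slope, intercept, sst, ssr, sse, r2: ssr / sst };
};

// Same data as the regress call above.
const parts = checkDecomposition([1, 2, 3, 4, 5], [1, 2, 3, 4, 3]);
// parts.sst is approximately equal to parts.ssr + parts.sse
```

For this data the slope is about 0.6 and the intercept about 0.8, and the two routes to SSE agree up to floating-point rounding.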
Mingle answered 4/3, 2017 at 10:19 Comment(0)
C
7

Check out https://web.archive.org/web/20150523035452/https://cgwb.nci.nih.gov/cgwbreg.html (javascript regression calculator) - pure JavaScript, not CGI calls to server. The data and processing remains on your computer. Complete R style results and R code to check the work and a visualization of the results.

See the source code for the embedded JavaScript implementations of OLS and statistics associated with the results.

The code is my effort to port the GSL library functions to JavaScript.

The code is released under the GPL because it's basically a line-for-line port of GPL-licensed GNU Scientific Library (GSL) code.

EDIT: Paul Lutus also provides some GPL code for regression at: http://arachnoid.com/polysolve/index.html

Calamanco answered 20/10, 2012 at 18:13 Comment(0)
B
5

Here is a snippet that takes an array of triplets (x, y, r), where r is the weight of the (x, y) data point, and returns [a, b] such that Y = a*X + b approximates the data.

// return (a, b) that minimize
// sum_i r_i * (a*x_i+b - y_i)^2
function linear_regression( xyr )
{
    var i, 
        x, y, r,
        sumx=0, sumy=0, sumx2=0, sumy2=0, sumxy=0, sumr=0,
        a, b;

    for(i=0;i<xyr.length;i++)
    {   
        // this is our data pair
        x = xyr[i][0]; y = xyr[i][1]; 

        // this is the weight for that pair
        // set to 1 (and simplify code accordingly, ie, sumr becomes xyr.length) if weighting is not needed
        r = xyr[i][2];  

        // consider checking for NaN in the x, y and r variables here 
        // (add a continue statement in that case)

        sumr += r;
        sumx += r*x;
        sumx2 += r*(x*x);
        sumy += r*y;
        sumy2 += r*(y*y);
        sumxy += r*(x*y);
    }

    // note: the denominator is proportional to the (weighted) variance of X;
    // it is 0 only in the degenerate case X==constant
    b = (sumy*sumx2 - sumx*sumxy)/(sumr*sumx2-sumx*sumx);
    a = (sumr*sumxy - sumx*sumy)/(sumr*sumx2-sumx*sumx);

    return [a, b];
}
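One possible use of the weights, offered as an illustration rather than as the author's intent, is to reduce the influence of points you trust less. The sketch below uses a compact, hypothetical weightedFit function built on the same formulas and compares equal weights against giving a suspected outlier a weight of 0. The null return for a zero denominator is an assumption added for the example.

```javascript
// Hypothetical compact version of the weighted fit above.
// xyr: array of [x, y, weight] triplets. Returns [a, b] for Y = a*X + b,
// or null if the denominator is 0 (assumption added for this sketch).
function weightedFit(xyr) {
    var sumr = 0, sumx = 0, sumy = 0, sumx2 = 0, sumxy = 0;
    for (var i = 0; i < xyr.length; i++) {
        var x = xyr[i][0], y = xyr[i][1], r = xyr[i][2];
        sumr += r;
        sumx += r * x;
        sumy += r * y;
        sumx2 += r * x * x;
        sumxy += r * x * y;
    }
    var denom = sumr * sumx2 - sumx * sumx;
    if (denom === 0) {
        return null;
    }
    var a = (sumr * sumxy - sumx * sumy) / denom;
    var b = (sumy * sumx2 - sumx * sumxy) / denom;
    return [a, b];
}

// Three points on y = x plus one suspicious reading at (3, 10).
var equalWeights = weightedFit([[0, 0, 1], [1, 1, 1], [2, 2, 1], [3, 10, 1]]);
var outlierIgnored = weightedFit([[0, 0, 1], [1, 1, 1], [2, 2, 1], [3, 10, 0]]);
// equalWeights is roughly [3.1, -1.4]; outlierIgnored is [1, 0]
```

Intermediate weights (for example, the inverse of each measurement's variance) pull the line toward the more reliable points without discarding the others entirely.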
Bur answered 27/4, 2012 at 14:52 Comment(1)
What would this be used for?Velvetvelveteen
R
1

Somewhat based on Nic Mabon's answer.

function linearRegression(x, y)
{
    var xs = 0;  // sum(x)
    var ys = 0;  // sum(y)
    var xxs = 0; // sum(x*x)
    var xys = 0; // sum(x*y)
    var yys = 0; // sum(y*y)

    var n = 0;
    for (; n < x.length && n < y.length; n++)
    {
        xs += x[n];
        ys += y[n];
        xxs += x[n] * x[n];
        xys += x[n] * y[n];
        yys += y[n] * y[n];
    }

    var div = n * xxs - xs * xs;
    var gain = (n * xys - xs * ys) / div;
    var offset = (ys * xxs - xs * xys) / div;
    var correlation = Math.abs((xys * n - xs * ys) / Math.sqrt((xxs * n - xs * xs) * (yys * n - ys * ys)));

    return { gain: gain, offset: offset, correlation: correlation };
}

Then y' = x * gain + offset.

Repartee answered 10/6, 2015 at 10:2 Comment(7)
Hey sorry but what is the y' for? How do you actually execute that linePepita
y' is the estimate of y at a given x according to the linear regression. For example if you wanted to plot your linear regression on a graph you'd do something like: x1 = min(x); x2 = max(x); y1 = x1 * gain + offset; y2 = x2 * gain + offset; and then plot a line from x1, y1 to x2, y2.Repartee
Hey sorry but let's say my y-axis is the profit and x-axis is the month. So what I needed to do is pass in two arrays of data into the algorithm above to calculate for the gain and offset. After that, let's say I wanted to predict for June, by using the formula above, the 'x' should be 6 in this case, then I just get the returned gain and offset to calculate for the 'y'. Am I right?Pepita
Does the above algorithm work for a dataset that swings sharply up and down, rather than a smooth dataset that increases gradually? My dataset goes up and down, and I'm afraid that might reduce the accuracy of the prediction.Pepita
I ask because I realized a problem with linear regression prediction. Let's say I want to predict the profit for next year based on the current year's records. If my records for the current year go up and down each month, the predicted data for next year will only increase or decrease gradually. The predicted data will never go up and down, and I am afraid the prediction will not be accurate.Pepita
Yes that is how you would calculate it, but as you say, monthly profits are very unlikely to be linear. You need a more expressive model.Repartee
I could not find any non-linear example online though. :( Would you mind taking a look at this: #46881782? It is very similar, just non-linear.Pepita
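The plotting recipe from the comments above (evaluate the line at the smallest and largest x, then draw a segment between the two points) can be sketched like this. regressionLineEndpoints is a hypothetical helper, and the gain and offset values are illustrative stand-ins for the output of linearRegression(x, y).

```javascript
// Hypothetical helper: endpoints of the fitted line over the observed x range.
// Returns [[x1, y1], [x2, y2]], an array of [x, y] pairs.
function regressionLineEndpoints(x, gain, offset) {
    var x1 = Math.min.apply(null, x);
    var x2 = Math.max.apply(null, x);
    return [
        [x1, x1 * gain + offset],
        [x2, x2 * gain + offset]
    ];
}

// Illustrative values only, standing in for linearRegression(x, y) output.
var months = [1, 2, 3, 4, 5];
var gain = 2;
var offset = 10;

var endpoints = regressionLineEndpoints(months, gain, offset); // [[1, 12], [5, 20]]
```

Returning [x, y] pairs keeps the result in the same shape as the scatter points in the question, so it should be straightforward to hand to a charting library as a second series.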
