Increasing cost for linear regression

I implemented, for learning purposes, a linear regression in Python. The problem is that the cost increases instead of decreasing. For the data I use the Airfoil Self-Noise Data Set. Data can be found here

I import the data as follows:

import pandas as pd

def features():

    features = pd.read_csv("data/airfoil_self_noise/airfoil_self_noise.dat.txt", sep="\t", header=None)

    # the first five columns are the inputs, the last column is the target
    X = features.iloc[:, 0:5]
    Y = features.iloc[:, 5]

    return X.values, Y.values.reshape(Y.shape[0], 1)

My code for the linear regression is the following:

import numpy as np
import random

class linearRegression():

    def __init__(self, learning_rate=0.01, max_iter=20):
        """
        Initialize the hyperparameters of the linear regression.

        :param learning_rate: the learning rate
        :param max_iter: the maximum number of iterations to perform
        """

        self.lr = learning_rate
        self.max_iter = max_iter
        self.m = None
        self.weights = None
        self.bias = None

    def fit(self, X, Y):
        """
        Run gradient descent algorithm

        :param X: the inputs
        :param Y: the outputs
        :return:
        """

        self.m = X.shape[0]
        self.weights = np.random.normal(0, 0.1, (X.shape[1], 1))
        self.bias = random.normalvariate(0, 0.1)

        for iteration in range(self.max_iter):

            A = self.__forward(X)
            dw, db = self.__backward(A, X, Y)

            # mean squared error cost: J = 1/(2m) * sum((A - Y)^2)
            J = (1 / (2 * self.m)) * np.sum(np.power((A - Y), 2))

            print("at iteration %s cost is %s" % (iteration, J))

            # gradient descent update
            self.weights = self.weights - self.lr * dw
            self.bias = self.bias - self.lr * db

    def predict(self, X):
        """
        Make prediction on the inputs

        :param X: the inputs
        :return:
        """

        Y_pred = self.__forward(X)

        return Y_pred

    def __forward(self, X):
        """
        Compute the linear function on the inputs

        :param X: the inputs
        :return:
            A: the activation
        """

        A = np.dot(X, self.weights) + self.bias

        return A

    def __backward(self, A, X, Y):
        """

        :param A: the activation
        :param X: the inputs
        :param Y: the outputs
        :return:
            dw: the gradient for the weights
            db: the gradient for the bias
        """

        dw = (1 / self.m) * np.dot(X.T, (A - Y))
        db = (1 / self.m) * np.sum(A - Y)

        return dw, db

Then I instantiate the linearRegression class as follows:

from sklearn.model_selection import train_test_split

X, Y = features()
model = linearRegression()
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)
model.fit(X_train, y_train)

I tried to find out why the cost is increasing, but so far I have not been able to. If someone could point me in the right direction it would be appreciated.

Haply answered 30/8, 2018 at 9:32 Comment(4)
Post the full code where you create the class instance and call the functions, so that others can reproduce the error. – Unbolt
I edited my post. – Haply
What kind of results do you get when you use other packages on your data? What do you get when you do a few iterations by hand? – Edwardedwardian
It's probably your data. When I change the line in features() to X = features.iloc[:, 1:2] (instead of using the first five columns) your cost starts decreasing. Even when I use sklearn, I can't get a score better than 0.6 with the original data. Try constructing an artificial dataset that you know will play nicely with linear regression and see what kind of results you get with that; a sketch of such a check follows below. – Edwardedwardian
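
A minimal sketch of the synthetic sanity check suggested in the last comment, assuming the linearRegression class from the question (the coefficients and noise level below are made up purely for illustration):

import numpy as np

# Hypothetical sanity check: build a dataset with known coefficients.
# If the cost does not decrease here either, the bug is in the code;
# if it does, the issue lies with the real data (feature scaling, learning rate).
rng = np.random.RandomState(0)
X_fake = rng.rand(500, 5)                     # features already in [0, 1]
true_w = np.array([[1.0], [-2.0], [3.0], [0.5], [4.0]])
Y_fake = X_fake @ true_w + 1.5 + rng.normal(0, 0.01, size=(500, 1))

model = linearRegression(learning_rate=0.1, max_iter=100)
model.fit(X_fake, Y_fake)                     # the cost should shrink towards ~0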

Normally, if you choose too large a learning rate you can run into this kind of problem. I have examined your code and my observations are:

  • your cost function J looks correct.
  • but in your __backward function you subtract the actual values from your predictions when forming the gradients (A - Y). With a large learning rate, subtracting learning_rate * gradient from the weights can overshoot and even flip the sign of the weights, so each update makes the cost larger instead of smaller, as illustrated in the sketch below.
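
To make the overshooting concrete, here is a small self-contained sketch on a made-up one-dimensional dataset (not the airfoil data), comparing a step size that is too large with one that is small enough:

import numpy as np

# Toy data: y = 2x exactly, so the optimal cost is 0.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x
m = len(x)

for lr in (0.5, 0.05):                        # too large vs. small enough
    w = 0.0
    print("learning rate %s:" % lr)
    for i in range(5):
        grad = (1 / m) * np.sum((w * x - y) * x)       # same (A - Y) style gradient
        w = w - lr * grad
        cost = (1 / (2 * m)) * np.sum((w * x - y) ** 2)
        print("  iteration %s cost %s" % (i, cost))
# With lr = 0.5 the cost grows every step; with lr = 0.05 it steadily shrinks.
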
Rosiorosita answered 4/9, 2018 at 13:12 Comment(1)
As to the 2nd point: could this cause an oscillation problem, or a large-momentum problem (resulting from the large learning rate)? It seems the nature of the problem is as you describe. But can we avoid the problem you mention just by changing the learning rate, or is changing the cost function to MAE also suitable? – Abhorrent

Your learning rate is much too high. When I run your code unmodified except for a learning rate of 1e-7 instead of 0.01, I get reliably decreasing costs.
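
An alternative worth noting (my own suggestion, not something this answer claims): the airfoil features have very different scales, which is what forces such a tiny learning rate. Standardizing the inputs should let the original rate of 0.01 work; a sketch, reusing the features() and linearRegression definitions from the question:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, Y = features()
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)   # zero mean, unit variance per column
X_test_std = scaler.transform(X_test)         # reuse the training statistics

model = linearRegression(learning_rate=0.01, max_iter=200)
model.fit(X_train_std, y_train)               # the cost should now decrease steadily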

Adenosine answered 5/9, 2018 at 3:3 Comment(0)

Generally, an increasing cost means the learning rate is too high.
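
One quick way to confirm this, assuming the classes from the question and the X_train/y_train split shown there: sweep a few learning rates and keep the largest one for which the cost still goes down.

# Hypothetical learning-rate sweep over a few orders of magnitude.
for lr in (1e-2, 1e-4, 1e-6, 1e-7, 1e-8):
    print("--- learning rate %s ---" % lr)
    model = linearRegression(learning_rate=lr, max_iter=5)
    model.fit(X_train, y_train)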

Lindahl answered 11/9, 2018 at 10:1 Comment(0)
