Distinction between linear and non linear regression?

3

8

In Machine Learning, we say that:

  • w1·x1 + w2·x2 + ... + wn·xn is a linear regression model, where w1, w2, ..., wn are the weights and x1, x2, ..., xn are the features, whereas
  • w1·x1^2 + w2·x2^2 + ... + wn·xn^2 is a non-linear (polynomial) regression model.

However, in some lectures I have seen people say that a model is linear based on the weights, i.e. the model is linear in the weights and the degree of the features doesn't matter, whether they are linear (x1) or polynomial (x1^2). Is that true? How does one differentiate a linear model from a non-linear one? Is it based on the weights or on the feature values?

Muslin answered 4/5, 2016 at 5:17 Comment(0)
2

Both flavors exist.

If you are in the statistics community, it is usually the former (non-linearity in the features: x^2, e^x, etc.). See this for example.

In the machine learning community the focus is more on the weights; the feature functions can be anything (see for example the kernel trick in SVMs).
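To make the two notions concrete, here is a minimal sketch (my own illustration, not from any lecture or library) that spot-checks superposition, f(a·u + b·v) = a·f(u) + b·f(v), once with respect to the weights and once with respect to the feature. The function names `poly_model`, `exp_model`, `is_linear_in_weights` and `is_linear_in_feature` are hypothetical, and a check at a single point can only disprove linearity, never prove it.

```python
import math

def poly_model(w, x):
    """Prediction w1*x + w2*x^2: polynomial in the feature x."""
    return w[0] * x + w[1] * x ** 2

def exp_model(w, x):
    """Prediction exp(w1*x): the weight sits inside a non-linear function."""
    return math.exp(w[0] * x)

def is_linear_in_weights(model, w, v, x, a=2.0, b=-3.0):
    """Spot-check superposition over the weights at one input x."""
    combined = [a * wi + b * vi for wi, vi in zip(w, v)]
    return math.isclose(model(combined, x), a * model(w, x) + b * model(v, x))

def is_linear_in_feature(model, w, x1, x2, a=2.0, b=-3.0):
    """Spot-check superposition over the feature for fixed weights w."""
    return math.isclose(model(w, a * x1 + b * x2),
                        a * model(w, x1) + b * model(w, x2))

print(is_linear_in_weights(poly_model, [1.0, 2.0], [0.5, -1.0], 1.5))  # True
print(is_linear_in_feature(poly_model, [1.0, 2.0], 1.0, 2.0))          # False
print(is_linear_in_weights(exp_model, [1.0], [0.5], 1.5))              # False
```

So the polynomial model passes the weight check but fails the feature check, which is exactly why the two communities can label the same model differently.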

The reason for this is that different communities take different approaches to solving these similar problems. The statistics community has more of a direct, analytical approach, while the goal of machine learning is slightly different (modeling intricate, complex patterns in an unknown concept space).

Lysenko answered 4/5, 2016 at 6:18 Comment(2)
Thanks. In that case, when can weights be polynomial in nature? The weights are learnt through techniques like gradient descent, right? So under what circumstances will they be linear or polynomial? - Muslin
Anything can be optimized via gradient descent (and its cousins), as long as it is differentiable. - Lysenko
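As a rough illustration of that last comment, here is a sketch of plain gradient descent on a model that is non-linear in its weight, y = exp(w·x). The data, the "true" weight 0.5, the learning rate and the step count are all made up for the example; nothing here comes from the original thread.

```python
import math

# Synthetic data from y = exp(0.5 * x); 0.5 is an arbitrary choice for the demo.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
TRUE_W = 0.5
ys = [math.exp(TRUE_W * x) for x in xs]

def predict(w, x):
    """Model that is non-linear in its single weight w."""
    return math.exp(w * x)

def grad_mse(w):
    """Derivative of the mean squared error with respect to w (chain rule)."""
    n = len(xs)
    return sum(2.0 * (predict(w, x) - y) * x * predict(w, x)
               for x, y in zip(xs, ys)) / n

w = 0.0    # initial guess
lr = 0.02  # small fixed step size, picked by hand for this data
for _ in range(1000):
    w -= lr * grad_mse(w)

print(w)  # ends up close to TRUE_W on this data
```

The model only needs to be differentiable in w for this to work, although for models that are non-linear in the weights the loss is generally not convex, so convergence depends on the starting point and step size.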
2

How does one differentiate a linear and non linear model? Is it based on weights or feature values?

I've only heard or read about it in the sense of "a model is linear / non-linear with respect to the features". This is usually the interesting property. I don't see how having a term wi^2 in your model would help you, as it is essentially a constant. Only the features change at test time.

So a linear model is something that can be expressed as

[equation image not available: a weighted sum of the inputs xi with weights wi]

where the wi define your model and the xi are your input. Different wi result in a different model (but they are all linear with respect to the features). If your model does not fit that scheme, then it is not linear with respect to the features.

Now, you can add new features which are essentially only (handcrafted) non-linear transformations of the input. For example, you could make a model

[equation image not available: a model containing hand-crafted non-linear terms of the input]

You could argue that this is a non-linear model with respect to the input. However, you can also argue that it is essentially the model

[equation image not available: the same model written as a weighted sum of the transformed features]

I think the important part here is that it was hand-crafted. You changed the feature space, not the abilities of the model. So it is still a linear model, but in another feature space. If you go this way, you can make any model non-linear.
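A small sketch of this feature-space view, under my own assumptions: the raw input x is mapped by a hand-crafted `expand` function to (x, x^2), and an ordinary weighted sum is then fitted on those two features by least squares. The helper names, the target curve y = 3x - 2x^2 and the 2x2 normal-equation solver are all illustrative choices, not something from the answer above.

```python
def expand(x):
    """Hand-crafted feature map: raw input x becomes (z1, z2) = (x, x^2)."""
    return [x, x ** 2]

def linear_model(w, z):
    """Plain weighted sum over whatever features it is given."""
    return sum(wi * zi for wi, zi in zip(w, z))

def fit_two_weights(zs, ys):
    """Least squares for exactly two features via the 2x2 normal equations."""
    a11 = sum(z[0] * z[0] for z in zs)
    a12 = sum(z[0] * z[1] for z in zs)
    a22 = sum(z[1] * z[1] for z in zs)
    b1 = sum(z[0] * y for z, y in zip(zs, ys))
    b2 = sum(z[1] * y for z, y in zip(zs, ys))
    det = a11 * a22 - a12 * a12
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [3.0 * x - 2.0 * x ** 2 for x in xs]  # made-up curved target

zs = [expand(x) for x in xs]               # move to the new feature space
w = fit_two_weights(zs, ys)                # the fitting step is purely linear

print(w)                                   # [3.0, -2.0]
print(linear_model(w, expand(3.0)))        # -9.0
```

The fitting code never sees a square: all the curvature lives in `expand`, which is the sense in which the model stays linear while the feature space changes.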

After all, does it really matter? It sounds a bit like you're preparing for an exam. If that is the case, I suggest you just ask your lecturer and stick with whatever they define as linear / non-linear.

Tristichous answered 5/5, 2016 at 9:46 Comment(0)
0

Please join Stats SE and add to this discussion. I believe it is more appropriate in that context. However, to convince you to at least click the link, here is the SHORT ANSWER: "If (and only if) the statistical distribution of a model's noise (error) can be described using only linear combinations of observations, factors and/or predictors, that model is linear. Otherwise, it is not."

As you can see, I put a statistics spin on it because that is my educational background (actually more of applied mathematics with a recent heavy focus on probability). What this means is that when you subtract the prediction(s) of the model from the truth (vector), that equation must be expressible (sometimes through mathematical transformations) as a linear combination of the factors/predictors and the truth data as vectors in a linear space.

Extravagant answered 31/3, 2020 at 19:49 Comment(1)
The equation to which I refer is error(s) = truth value(s) - predicted value(s). So all of the quantities must be in the same units; one cannot subtract apples from oranges or vice versa. If you transform the entire equation, that would work as long as the transformation is a linear one. - Extravagant
