Fitting a function in R
Asked Answered
S

3

14

I have a few datapoints (x and y) that seem to have a logarithmic relationship.

> mydata
    x   y
1   0 123
2   2 116
3   4 113
4  15 100
5  48  87
6  75  84
7 122  77

> qplot(x, y, data=mydata, geom="line")

plot

Now I would like to find an underlying function that fits the graph and allows me to infer other datapoints (i.e. 3 or 82). I read about lm and nls but I'm not getting anywhere really.

At first, I created a function of which I thought it resembled the plot the most:

f <- function(x, a, b) {
    a * exp(b *-x)
}
x <- seq(0:100)
y <- f(seq(0:100), 1,1)
qplot(x,y, geom="line")

plot2

Afterwards, I tried to generate a fitting model using nls:

> fit <- nls(y ~ f(x, a, b), data=mydata, start=list(a=1, b=1))
   Error in numericDeriv(form[[3]], names(ind), env) :
   Missing value or an Infinity produced when evaluating the model

Can someone point me in the right direction on what to do from here?

Follow up

After reading your comments and googling around a bit further I adjusted the starting parameters for a, b and c and then suddenly the model converged.

fit <- nls(y~f(x,a,b,c), data=data.frame(mydata), start=list(a=1, b=30, c=-0.3))
x <- seq(0,120)
fitted.data <- data.frame(x=x, y=predict(fit, list(x=x))
ggplot(mydata, aes(x, y)) + geom_point(color="red", alpha=.5) + geom_line(alpha=.5) + geom_line(data=fitted.data)

plot3

Sulfide answered 7/8, 2012 at 11:1 Comment(5)
I think the right place to point you is towards a Statistics 101 course. You could at least show us your efforts with lm.Mackay
I advise you to read the R manual : type ?lm, ?nls and ?formula in your RConsoleTurnkey
Sorry for my laziness - I'm a little frustrated right now. I added the steps I did with nls and the error it produced.Sulfide
I know this is an old question, but have you every tried using lm(y ~ poly(x, 3), data = mydata) ? One can try different degrees of polynomial and compare the results of lm using anova.Mcmillen
Please edit your question - you now refer to f(x,a,b,c), which is not defined. What is it? You have only f(x,a,b)Shutz
S
11

Maybe using a cubic specification for your model and estimating via lm would give you a good fit.

# Importing your data
dataset <- read.table(text='
    x   y
1   0 123
2   2 116
3   4 113
4  15 100
5  48  87
6  75  84
7 122  77', header=T)

# I think one possible specification would be a cubic linear model
y.hat <- predict(lm(y~x+I(x^2)+I(x^3), data=dataset)) # estimating the model and obtaining the fitted values from the model

qplot(x, y, data=dataset, geom="line") # your plot black lines
last_plot() + geom_line(aes(x=x, y=y.hat), col=2) # the fitted values red lines

# It fits good.

enter image description here

Spew answered 7/8, 2012 at 12:42 Comment(0)
M
4

Try taking the log of your response variable and then using lm to fit a linear model:

fit <- lm(log(y) ~ x, data=mydata)

The adjusted R-squared is 0.8486, which at face value isn't bad. You can look at the fit using plot, for example:

plot(fit, which=2)

But perhaps, it's not such a good fit after all:

last_plot() + geom_line(aes(x=x, y=exp(fit$fitted.values)))
Mccaskill answered 7/8, 2012 at 11:32 Comment(2)
What is the implication of using log on response variable?Bocage
Simply that the OP suspected a log relationship so fitting a linear model of x ~ log(y) is a quick way to check that.Mccaskill
B
3

Check this document out: http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf

In brief, first you need to decide on the model to fit onto your data (e.g., exponential) and then estimate its parameters.

Here are some widely used distributions: http://www.itl.nist.gov/div898/handbook/eda/section3/eda366.htm

Balsam answered 7/8, 2012 at 11:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.