Using "predict" in nls
Asked Answered
R

3

5

I have data from the USGS Nation Water Data website. I am currently trying to plot and fit curves to the data to use in prediction for different measurements taken within the dataset (dissolved oxygen, pH, gage height and temp) all in relation to discharge rate. I used the "nls" command and I am using a book of equations to find which curve to use...for this example I have specifically used Schumacher’s equation (p.48 in the book).

Find the link to data:

curve book: http://www.for.gov.bc.ca/hfd/pubs/docs/bio/bio04.htm

data I used: http://waterdata.usgs.gov/mi/nwis/uv?referred_module=qw&search_station_nm=River%20Rouge%20at%20Detroit%20MI&search_station_nm_match_type=anywhere&index_pmcode_00065=1&index_pmcode_00060=1&index_pmcode_00300=1&index_pmcode_00400=1&index_pmcode_00095=1&index_pmcode_00010=1&group_key=NONE&sitefile_output_format=html_table&column_name=agency_cd&column_name=site_no&column_name=station_nm&range_selection=date_range&begin_date=2013-11-18&end_date=2013-12-18&format=html_table&date_format=YYYY-MM-DD&rdb_compression=file&list_of_search_criteria=search_station_nm,realtime_parameter_selection

My problem is that I CANNOT get nls to predict new values once I picked a curve coded it... I also can't quite figure out how to plot it...I'm guessing this can be with the residuals? In the code I have used "aggregate" to extract means of the listed measurements and the corresponding discharge rates, now I just need to get R to predict for me. I got as far as getting what I think are fitted values... but I'm not sure and I hit a wall with "?nls."

##Create new dataframes with means given date for each constituent
ph <- aggregate(Discharge~pH, data=River.Data, mean)

##pH models
pH <- ph$pH
disch <- ph$Discharge
phm <- nls(disch~exp(a+(b/pH)), data=ph, trace=T, start=list(a=-47.06 ,b=400.2))
newph<- data.frame(ph=c(3.0,4.0,5.0,6.0,7.0,8.0,9.0))
predict(phm, newdata=newph)
Rucksack answered 21/12, 2013 at 16:23 Comment(2)
It's a bit tricky: you have to use exactly the same variable names in predict as in the original model. Otherwise predict , as you found, returns the fitted values without any warning message. I think predict(phm,newdata=list(ph=newph)) will work.Isentropic
Carl I did try this... it still returns the fitted values. That's the frustrating part I can't get a grasp on why it's only returning these fitted values: predict(phm, newdata=newph) [1] 663.69857 460.76412 322.92607 228.39464 162.95840 117.25539 85.05862 62.18766Rucksack
L
5

Seems like you've already got your answer(??), but:

ph    <- aggregate(Discharge~pH, data=River.Data, mean)
phm   <- nls(Discharge~exp(a+(b/pH)), data=ph, trace=T, start=list(a=-47.06 ,b=400.2))
newph <- data.frame(pH=seq(3,9,by=0.1))
Discharge.pred <- predict(phm, newdata=newph)

plot(ph$pH, ph$Discharge, xlim=c(3,9), ylim=c(0,1000))
par(new=t)
plot(newph$pH,Discharge.pred, xlab="", ylab="", axes=F, xlim=c(3,9), ylim=c(0,1000), type="l")

The problem is that your data is for pH in [7.5,8.2] but you are trying to predict in [3,9]. The model you've chosen is not stable for pH that far outside the range.

Levitt answered 21/12, 2013 at 17:24 Comment(1)
Yes that seems to be the case! Thanks for your response it seems to work but perhaps I need to choose a better model?Rucksack
S
1

Jarrod, try this. Cheers, Robert.

#Try this
#pH <- ph$pH # you don't need this
#disch <- ph$Discharge # you don't need this
phm <- nls(Discharge~exp(a+(b/pH)), data=ph, trace=T, start=list(a=-47.06 ,b=400.2))
newph<- data.frame(pH=seq(3,9,0.1)) # it'll be smoother with a sequence in increments of 0.1
plot(newph,predict(phm, data=newph,type="l"))
Swindell answered 21/12, 2013 at 16:37 Comment(3)
It doesn't seem to like that either... It gives an error in the plot: > plot(newph,predict(phm, data=newph,type="l") + plot(newph,predict(phm, data=newph,type="l") Error: unexpected symbol in: "plot(newph,predict(phm, data=newph,type="l") plot"Rucksack
It still only returns the fitted values if you take this out of plot and just use predict... maybe it's an error with the model? Should I be using "glm" instead?Rucksack
I forgot a close parens in the 'plot()' statement. I've edited it, above.Swindell
A
1

This answer is based on @Carl Witthoft's comment. I'm highlighting this in a separate answer, because I skimmed over the comments and didn't really take in what Carl was proposing. Hoping it will help others who also find themselves here.

From the ?predict.nls file, the new data needs to be a "named list or dataframe". Which is where I have fallen down in the past; the list or dataframe column needs to have a name. And as Carl states, this needs to be exactly the same name as that used in the model. I think the OP's issue may be because they used pH in the model, but ph in the creation of new data.

Assam answered 16/12, 2018 at 14:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.