R package "spatstat": How do you get standard errors for non-smooth terms in a poisson process model (function: ppm) when use.gam=TRUE?
Asked Answered
O

2

7

In the R package spatstat (I am using the current version, 1.31-0) , there is an option use.gam. When you set this to true, you can include smooth terms in the linear predictor, the same way you do with the R package mgcv. For example,

g <- ppm(nztrees, ~1+s(x,y), use.gam=TRUE) 

Now, if I want a confidence interval for the intercept, you can usually use summary or vcov, which works when you don't use gam but fails when you do use gam

vcov(g)

which gives the error message

Error in model.frame.default(formula = fmla, data = 
    list(.mpl.W = c(7.09716796875,  :invalid type (list) for variable 's(x, y)'

I am aware that this standard error approximation here is not justified when you use gam, but this is captured by the warning message:

In addition: Warning message: model was fitted by gam();
            asymptotic variance calculation ignores this 

I'm not concerned about this - I am prepared to justify the use of these standard errors for the purpose I'm using them - I just want the numbers and would like to avoid "writing-my-own" to do so.

The error message I got above does not seem to depend on the data set I'm using. I used the nztrees example here because I know it comes pre-loaded with spatstat. It seems like it's complaining about the variable itself, but the model clearly understands the syntax since it fits the model (and the predicted values, for my own dataset, look quite good, so I know it's not just pumping out garbage).

Does anybody have any tips or insights about this? Is this a bug? To my surprise, I've been unable to find any discussion of this online. Any help or hints are appreciated.

Edit: Although I have definitively answered my own question here, I will not accept my answer for the time being. That way, if someone is interested and willing to put in the effort to find a "workaround" for this without waiting for the next edition of spatstat, I can award the bounty to him/her. Otherwise, I'll just accept my own answer at the end of the bounty period.

Olds answered 25/2, 2013 at 18:51 Comment(3)
I get the same error message if I just try to print the g to the screen. It seems that the problem occurs when calling the model.frame function. With simple debugging the error seems to occur in line data <- .Internal(model.frame(formula, rownames, variables, varnames, extras, extranames, subset, na.action))Upbraiding
Hi @Hemmo, I can see that. But, the model still is estimated (e.g. coef(g) works) and you can plot predicted values, etc. (although, when you try to get standard errors for the predictions, you're back to this error). Any tips?Olds
Quick looking at the code of ppm and mpl.engine, I would say that ppm and it's subfunctions do not use model.frame approach. It saves formula and data into the output (g$internal), but default formula parsing /model.frame.default cannot handle list s(x,y) as there is no such thing in the data frame. My guess is that this is a bug, and you should ask this from the package author. You could also test this with older version of the package and see if you get the same error.Upbraiding
O
4

I have contacted one of the package authors, Adrian Baddeley, about this. He promptly responded and let me know that this is indeed a bug with the software and, apparently, I am the first person to encounter it. Fortunately, it only took him a short time to track down the issue and correct it. The fix will be included in the next release of spatstat, 1.31-1.

Edit: The updated version of spatstat has been released and does not have this bug anymore:

g <- ppm(nztrees, ~1+s(x,y), use.gam=TRUE)
sqrt( vcov(g)[1,1] ) 
[1] 0.1150982
Warning message:
model was fitted by gam(); asymptotic variance calculation ignores this 

See the spatstat website for other release notes. Thanks to all who read and participated in this thread!

Olds answered 1/3, 2013 at 13:50 Comment(0)
E
1

I'm not sure you can specify the trend the way you have which is possibly what is causing the error. It doesn't seem to make sense according to the documentation:

The default formula, ~1, indicates the model is stationary and no trend is to be fitted.

But instead you can specify the model like so:

g <- ppm(nztrees, ~x+y, use.gam=TRUE)   
#Then to extract the coefficientss:
>coef(g)
(Intercept)             x             y 
-5.0346019490  0.0013582470 -0.0006416421 
#And calculate their se:
vc <- vcov(g)
se <- sqrt(diag(vc))
> se
(Intercept)           x           y 
0.264854030 0.002244702 0.003609366 

Does this make sense/expected result? I know that the package authors are very active on the r-sig-geo mailing lsit as they have helped me in the past. You may also want to post your question to that mailing list, but you should reference your question here when you do.

Esmeralda answered 28/2, 2013 at 17:57 Comment(3)
Hi @Simon, thanks for your attention. The s(x,y) is gam syntax for specifying the effect of x,y by a non-parametric smooth function (that is estimated). See the gam documentation. Note that the model is estimated as a functional parameter when you run my code (e.g. plot the predicted surface - plot(predict(g))), but it seems that the link with gam is incomplete, since SEs for the non-smooth terms aren't available. The model you've fit is a usual log-linear in x and y and I have had success getting those standard errors, etc.Olds
@Olds ah I should have paid more attention. I will look into it further to see if I can help, but I strongly advise to post to r-sig-geo, and I strongly advise to read the posting guide first. I have been slated for a poorly formed question the mailing list before!Panpipe
Non-local gam documentation for anyone else following thisPanpipe

© 2022 - 2024 — McMap. All rights reserved.