Confidence intervals of the muhaz package hazard function [duplicate]
Asked Answered
M

1

11

The muhaz package estimates the hazard function from right censored data using kernel smoothing methods. My question is, is there any way to obtain confidence intervals for the hazard function that muhaz calculates?

options(scipen=999)
library(muhaz)
data(ovarian, package="survival")
attach(ovarian)
fit1 <- muhaz(futime, fustat)
plot(fit1, lwd=3, ylim=c(0,0.002))

muhaz hazard function

In the above example the muhaz.object fit has some entries fit1$msemin, fit1$var.min, fit1$haz.est however their length is half of the fit1$haz.est.

Any ideas if it is possible to extract confidence intervals for the hazard function?

EDIT: I tried the following based with what @user20650 suggested

options(scipen=999)
library(muhaz)
data(ovarian, package="survival")
fit1 <- muhaz(ovarian$futime, ovarian$fustat,min.time=0, max.time=744)


h.df<-data.frame(est=fit1$est.grid, h.orig=fit1$haz.est)

for (i in 1:10000){
d.s.onarian<-ovarian[sample(1:nrow(ovarian), nrow(ovarian), replace = T),]
d.s.muhaz<-muhaz(d.s.onarian$futime, d.s.onarian$fustat, min.time=0, max.time=744 )
h.df<-cbind(h.df, d.s.muhaz$haz.est)
}


h.df$upper.ci<-apply(h.df[,c(-1,-2)], 1,  FUN=function(x) quantile(x, probs = 0.975))
h.df$lower.ci<-apply(h.df[,c(-1,-2)], 1,  FUN=function(x) quantile(x, probs = 0.025))
plot(h.df$est, h.df$h.orig, type="l", ylim=c(0,0.003), lwd=3)
lines(h.df$est, h.df$upper.ci,  lty=3, lwd=3)
lines(h.df$est, h.df$lower.ci,  lty=3, lwd=3)

Setting max.time seems to works, every bootstrap sample hast the same estimating grid points. However the CI obtained, make little sense. Normally I would expect that the intervals are narrow at t=0 and get wider with time (less information, more uncertainty) but the intervals obtained seem to be more or less constant with time.

enter image description here

Miltonmilty answered 16/2, 2015 at 14:23 Comment(3)
Could you bootstrap it?. This with(fit1, plot(est.grid, haz.est, type="l", lwd=3, ylim=c(0,0.002))) gives the same plot, so you will need estimates of haz.est at the same timepoints as for fit1. However, as you resample and refit the muhaz model the timepoints change, From a quick try, I think you can force est.grid to be at the same timepoints for each resample if you set the min.time and max.time to be the same as in the original fit. ie with(dat, muhaz(futime, fustat, min.time=0, max.time=744)), where dat is the bootstrap data.Sarraceniaceous
Setting max.time seems to works, every bootstrap sample hast the same estimating grid points. However the CI obtained, make little sense. Normally I would expect that the intervals get wider with time (less information, more uncertainty) but the intervals obtained seem to be more or less constant with time.Miltonmilty
Related: https://mcmap.net/q/1158962/-how-do-i-extract-hazards-from-survfit-in-r/6574038Telemark
H
5

Bootstrapping provides the answer as the commenter suggested. Your intuitions are right that you should expect the CIs to widen as the number at risk decreases. However, this effect is going to be diminished by the smoothing process and the longer the interval over which smoothing is applied the less you should notice the change in size of the CI. Try smoothing over a sufficiently short interval and you should notice the CIs widen more noticeably.

As you may find, these smoothed hazard plots can be of very limited use and are highly sensitive to how the smoothing is done. As an exercise, it is instructive to simulate survival times from a series of Weibull distributions w/ the shape parameter set to 0.8, 1.0, 1.2, and then look at these smoothed hazard plots and try to categorize them. To the extent that these plots are informative, it should be fairly easy to tell the difference between those three curves based on the trend rate of the hazard function. YMMV, but I haven't been very impressed with the results when I've done this tests with reasonable sample sizes consistent with clinical trials in oncology.

As an alternative to smoothed hazard plots, you might try fitting piecewise exponential curves using the method of Han et al. (http://www.ncbi.nlm.nih.gov/pubmed/23900779) and bootstrapping that. Their algorithm will identify the break points at which there is a statistically significant change in the hazard rate and may give you a better sense of the trend in the hazard rate than smoothed hazard plots. It will also avoid the somewhat arbitrary but consequential choice of smoothing parameters.

Hunchbacked answered 26/5, 2015 at 2:12 Comment(4)
Thank you for your elaborate answer. Do you know if there is an R package for that procedure?Miltonmilty
I don't believe they have published their R code yet. You may be able to get the code from the authors if you explain how you will use it and ask nicely.Alternately, their paper should explain the method well enough to make it an interesting project.Hunchbacked
cran.r-project.org/web/packages/RPEXE.RPEXT/index.htmlDowngrade
I should have included a label for that link. It is an implementation by Han of the methods used in Han, Schell, and Kim (2012 <doi:10.1080/19466315.2012.698945>, 2014 <doi:10.1002/sim.5915>), and Han et al. (2016 <doi:10.1111/biom.12590>). The 2014 paper cited above is free from Stat in MedDowngrade

© 2022 - 2024 — McMap. All rights reserved.