How to compute the mean survival time
Asked Answered
C

3

14

I'm using the survival library. After computing the Kaplan-Meier estimator of a survival function:

km = survfit(Surv(time, flag) ~ 1)

I know how to compute percentiles:

quantile(km, probs = c(0.05,0.25,0.5,0.75,0.95))

But, how do I compute the mean survival time?

Chlamys answered 2/4, 2017 at 20:10 Comment(0)
W
27

Calculate Mean Survival Time

The mean survival time will in general depend on what value is chosen for the maximum survival time. You can get the restricted mean survival time with print(km, print.rmean=TRUE). By default, this assumes that the longest survival time is equal to the longest survival time in the data. You can set this to a different value by adding an rmean argument (e.g., print(km, print.rmean=TRUE, rmean=250)).

Extract Value of Mean Survival Time and Store in an Object

In response to your comment: I initially figured one could extract the mean survival time by looking at the object returned by print(km, print.rmean=TRUE), but it turns out that print.survfit doesn't return a list object but just returns text to the console.

Instead, I looked through the code of print.survfit (you can see the code by typing getAnywhere(print.survfit) in the console) to see where the mean survival time is calculated. It turns out that a function called survmean takes care of this, but it's not an exported function, meaning R won't recognize the function when you try to run it like a "normal" function. So, to access the function, you need to run the code below (where you need to set rmean explicitly):

survival:::survmean(km, rmean=60) 

You'll see that the function returns a list where the first element is a matrix with several named values, including the mean and the standard error of the mean. So, to extract, for example, the mean survival time, you would do:

survival:::survmean(km, rmean=60)[[1]]["*rmean"]

Details on How the Mean Survival Time is Calculated

The help for print.survfit provides details on the options and how the restricted mean is calculated:

?print.survfit 

The mean and its variance are based on a truncated estimator. That is, if the last observation(s) is not a death, then the survival curve estimate does not go to zero and the mean is undefined. There are four possible approaches to resolve this, which are selected by the rmean option. The first is to set the upper limit to a constant, e.g.,rmean=365. In this case the reported mean would be the expected number of days, out of the first 365, that would be experienced by each group. This is useful if interest focuses on a fixed period. Other options are "none" (no estimate), "common" and "individual". The "common" option uses the maximum time for all curves in the object as a common upper limit for the auc calculation. For the "individual"options the mean is computed as the area under each curve, over the range from 0 to the maximum observed time for that curve. Since the end point is random, values for different curves are not comparable and the printed standard errors are an underestimate as they do not take into account this random variation. This option is provided mainly for backwards compatability, as this estimate was the default (only) one in earlier releases of the code. Note that SAS (as of version 9.3) uses the integral up to the last event time of each individual curve; we consider this the worst of the choices and do not provide an option for that calculation.

Wigley answered 2/4, 2017 at 21:9 Comment(6)
Nice, thanks! Is there some way to directly store the restricted mean into a variable, or do I have to copy it from print's output?Chlamys
Thank you very much! I would upvote you another time, but I can't. :-|Chlamys
Maybe the survival package was updated to achieve this since the question was asked, but today there's no need to use the "hidden" survival:::survmean(km, rmean=60). Use just summary(km)$table[,5:6], which gives you the RMST and its SE. The CI can be calculated using appropriate quantile of the normal distribution.Manhunt
@Manhunt I think you mean summary(km)$table[5:6] which works fine for the RMST across the entire survival curve. However, if we want to extract a time-specific RMST, we still have no other solution but to use survival:::survmean(km, rmean = 60).Hacienda
I'm looking to extract the same value, but I'm getting Error: 'survmean' is not an exported object from 'namespace:survival', despite the fact that when I run getAnywhere(print.survfit) I can see survmean. Any help with this would be much appreciated.Ofelia
In answer to my own question: can use summary(km, rmean = 60)$table[,"rmean"] - I guess this is what @Manhunt was getting at, but missing the rmean = 60 argument in the summary call.Ofelia
J
2

Using the tail formula (and since our variable is non negative) you can calculate the mean as the integral from 0 to infinity of 1-CDF, which equals the integral of the Survival function.

If we replace a parametric Survival curve with a non parametric KM estimate, the survival curve goes only until the last time point in our dataset. From there on it "assumes" that the line continues straight. So we can use the tail formula in a "restricted" manner only until some cut-off point, which we can define (default is the last time point in our dataset).

You can calculate it using the print function, or manually:

print(km, print.rmean=TRUE) # print function
sum(diff(c(0,km$time))*c(1,km$surv[1:(length(km$surv)-1)])) # manually

I add 0 in the beginning of the time vector, and 1 at the beginning of the survival vector since they're not included. I only take the survival vector up to the last point, since that is the last chunk. This basically calculates the area-under the survival curve up to the last time point in your data.

If you set up a manual cut-off point after the last point, it will simply add that area; e.g., here:

print(km, print.rmean=TRUE, rmean=4) # gives out 1.247
print(km, print.rmean=TRUE, rmean=4+2) # gives out 1.560
1.247+2*min(km$surv) # gives out 1.560

If the cut-off value is below the last, it will only calculate the area-under the KM curve up to that point.

Jollenta answered 30/4, 2022 at 8:48 Comment(2)
Thanks for your answer! I know the theory of how this works, but at the time I didn't know the specifics of how to do this in R, and I really wanted to make sure I wasn't missing an option in the survival library that would do the computation for me.Chlamys
Welcome. I wanted to add this, especially for the manual calculations.Jollenta
M
2

There's no need to use the "hidden" survival:::survmean(km, rmean=60).

Use just summary(km)$table[,5:6], which gives you the RMST and its SE. The CI can be calculated using appropriate quantile of the normal distribution.

Manhunt answered 20/5, 2022 at 1:54 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.