How to predict survival probabilities in R?
Asked Answered
C

1

5

I have data called veteran stored in R. I created a survival model and now wish to predict survival probability predictions. For example, what is the probability that a patient with 80 karno value, 10diagtime, age 65 and prior=10 and trt = 2 lives longer than 100 days?

In this case the design matrix is x = (1,0,1,0,80,10,65,10,2)

Here is my code:

library(survival)
attach(veteran)
weibull <- survreg(Surv(time,status)~celltype + karno+diagtime+age+prior+trt ,dist="w")

and here is the output:

enter image description here

Any idea how to predict the survival probabilities?

Cattan answered 10/12, 2014 at 19:3 Comment(9)
You could check out the function predict.survreg, which will allow you to compute survival probabilities.Karlykarlyn
Did you try the predict() function? To make it easier for others to help you, please provide a reproducible example with sample input so we might test out different solutions. Also, you should learn to avoid attach() and just use the data= parameter in modeling functions.Whydah
@MrFlick, does it make any difference in the results to use attach() in my way? Also, how could I write the code with colored background as you guys do? I used the predict function, but it produces the survival time, not the probabilities. Am I right?Belmonte
It doesn't make a different in the result, but using attach() is not a best practice. How are you defining "survival probabilities"? You usually need some notion of a specific time i would think.Whydah
Above I described a patient. How could I predict the probability that this patients lives longer than 100 days?Belmonte
@Günal. Sorry, you're right. The specific scenario is included in the question and I missed it.Whydah
stat.ethz.ch/R-manual/R-patched/library/survival/html/… includes different types of prediction. Look at the argument type in the predict functionSasnett
@DavidePassaretti, I used predict(weibull, data=x) but this gives the survival times, not the probabilities or am I missing something out?Belmonte
Perhaps this question can help you: #9152091Whydah
M
6

You can get predict.survreg to produce predicted times of survival for individual cases (to which you will pass values to newdata) with varying quantiles:

 casedat <- list(celltype="smallcell", karno =80, diagtime=10, age= 65 , prior=10 , trt = 2)
 predict(weibull, newdata=casedat,  type="quantile", p=(1:98)/100)
 [1]   1.996036   3.815924   5.585873   7.330350   9.060716  10.783617
 [7]  12.503458  14.223414  15.945909  17.672884  19.405946  21.146470
[13]  22.895661  24.654597  26.424264  28.205575  29.999388  31.806521
[19]  33.627761  35.463874  37.315609  39.183706  41.068901  42.971927
[25]  44.893525  46.834438  48.795420  50.777240  52.780679  54.806537
[31]  56.855637  58.928822  61.026962  63.150956  65.301733  67.480255
[37]  69.687524  71.924578  74.192502  76.492423  78.825521  81.193029
[43]  83.596238  86.036503  88.515246  91.033959  93.594216  96.197674
[49]  98.846083 **101.541291** 104.285254 107.080043 109.927857 112.831032
[55] 115.792052 118.813566 121.898401 125.049578 128.270334 131.564138
[61] 134.934720 138.386096 141.922598 145.548909 149.270101 153.091684
[67] 157.019655 161.060555 165.221547 169.510488 173.936025 178.507710
[73] 183.236126 188.133044 193.211610 198.486566 203.974520 209.694281
[79] 215.667262 221.917991 228.474741 235.370342 242.643219 250.338740
[85] 258.511005 267.225246 276.561118 286.617303 297.518110 309.423232
[91] 322.542621 337.160149 353.673075 372.662027 395.025122 422.263020
[97] 457.180183 506.048094
#asterisks added

You can then figure out which one is greater than the specified time and it looks to be around the 50th percentile, just as one might expect from a homework question.

png(); plot(x=predict(weibull, newdata=casedat,  type="quantile", 
             p=(1:98)/100),  y=(1:98)/100 , type="l") 
dev.off()

enter image description here

Maxwellmaxy answered 10/12, 2014 at 20:52 Comment(3)
thanks for the answer. Just to make sure I have understood you correctly, the probability a patients lives more than 500 is 1/98. IS that right? Can you also please briefly explain what those predictions mean?Belmonte
The quantiles are for the hazard rather than survival. If you plotted 1-p as the y-value you would get the typical estimated survivalMaxwellmaxy
And the quantile fo the time>506 is 98/100 rather than 1/98.Maxwellmaxy

© 2022 - 2024 — McMap. All rights reserved.