Left censoring for survival data in R
I want to perform survival analysis (Kaplan-Meier and Cox PH modelling) on data which is both left and right censored. I'm looking at the time to occurrence of a heart arrhythmia (AF) in the presence versus the absence of a particular gene (Gene 0 or 1). However, some subjects are found to already have the arrhythmia at recruitment and so should be left censored. I've read the survival package documentation but can't work out how to account for the left censoring. Some made-up example data are below. Subjects 1 and 3 had AF at baseline and so should be left censored. Subject 2 did not experience the event by the end of follow-up and so is right censored. Subjects 4 and 5 both experienced the event (at 8 and 3 months respectively).

Gene <- c(0, 0, 1, 1, 0)
AF_at_baseline <- c(1, 0, 1, 0, 0)
Followup_time <- c(11, 3, 8, 15, 7)
AF_time <- c(NA, NA, NA, 8, 3)
AF_data <- data.frame(Gene, AF_at_baseline, Followup_time, AF_time)
Thomajan answered 31/1, 2017 at 22:25 Comment(3)
Left censoring is appropriate where you have an observation start time, and you don't know the exact event time but you have an upper bound. See, e.g., the example in this answer. For this data left censoring would only make sense if your zero-time (observation start) was, say, birth. You seem to want to use recruitment as the observation start, so left censoring does not really apply. - Sephira
This doesn't seem on-topic for SO. I'd recommend asking additional questions at stats.stackexchange if you need more methodological/statistical guidance. - Sephira
Generally you would just omit the cases already in AF at baseline when trying to predict time to onset in persons currently free of the condition. - Fabyola
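
The suggestion in the last comment can be sketched roughly as follows. This is only an illustration on the made-up data from the question; the column names `event` and `time` and the object names `incident` and `km_incident` are introduced here and are not part of the original post.

```r
library(survival)

AF_data <- data.frame(
  Gene           = c(0, 0, 1, 1, 0),
  AF_at_baseline = c(1, 0, 1, 0, 0),
  Followup_time  = c(11, 3, 8, 15, 7),
  AF_time        = c(NA, NA, NA, 8, 3)
)

# Keep only subjects who were free of AF at recruitment
incident <- subset(AF_data, AF_at_baseline == 0)

# Event indicator: 1 if AF was observed during follow-up, 0 if right censored
incident$event <- as.integer(!is.na(incident$AF_time))

# Observed time: AF onset if it occurred, otherwise end of follow-up
incident$time <- ifelse(incident$event == 1,
                        incident$AF_time,
                        incident$Followup_time)

# Ordinary right-censored Kaplan-Meier fit, stratified by gene
km_incident <- survfit(Surv(time, event) ~ Gene, data = incident)
```

With only right censoring left in the data, the same `Surv(time, event)` response could presumably be passed to `coxph()` as well, although three subjects are far too few for a meaningful fit.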

I had a similar problem and solved it like this:

As stated in the help file for Surv in the survival package, you need to specify both time and time2.

You can think of left censored data as going from -infinity until the time you measured, and of right censored data as going from the time you measured (probably the last follow-up) until +infinity. Infinity is best coded with NA.

What solved my problem was creating two vectors: a start vector time and a stop vector time2.

For time you want all those values that are left censored to be NA. Right censored observations are filled in with the time of measurement, just as the Events.

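For time2 it is the other way around: right censored observations are NA, while left censored observations and events are filled in with the time of measurement.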

I don't really get your data, however. Why would you follow up on subjects if they already had the event? This is what you do for subjects 4 and 5 by saying AF_time was 8 and 3 but Followup_time was 15 and 7.

Trying to help, I assume the following:

You have 5 patients with

AF_at_baseline<-c(1,0,1,0,0) #where 1 indicates left censoring

Follow-up times are event times (or last time of follow-up for left and right censored)

So for the left censored data your Followup_time would look like this:

Followup_time <- c(NA, 3, NA, 15, 7)

For the right censored data:

Followup_time2 <- c(11, NA, 8, 15, 7)
# Since you indicated that only subject 2 didn't experience the event

Now you can call Surv

library(survival)
Surv.Obj <- Surv(Followup_time, Followup_time2, type = 'interval2')
Surv.Obj
[1] 11-  3+  8- 15   7 # with '-' indicating left censoring and '+' right censoring
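
Rather than typing the two vectors by hand, the same bounds could be derived from status indicators. The following is a minimal sketch under the assumptions made above (follow-up times are event times, and only subject 2 is right censored); `obs_time`, `had_event`, `lower` and `upper` are names made up for this illustration.

```r
library(survival)

AF_at_baseline <- c(1, 0, 1, 0, 0)      # 1 indicates left censoring
had_event      <- c(0, 0, 0, 1, 1)      # assumed: AF observed during follow-up
obs_time       <- c(11, 3, 8, 15, 7)    # event time or last follow-up

left_cens  <- AF_at_baseline == 1
right_cens <- !left_cens & had_event == 0

# Lower bound is NA (open towards -infinity) for left censored subjects
lower <- ifelse(left_cens, NA_real_, obs_time)
# Upper bound is NA (open towards +infinity) for right censored subjects
upper <- ifelse(right_cens, NA_real_, obs_time)

Surv.Obj2 <- Surv(lower, upper, type = "interval2")
Surv.Obj2
```

Here `lower` and `upper` reproduce the hand-written Followup_time and Followup_time2 vectors.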

Then you can call survfit and plot the Kaplan-Meier curve:

km <- survfit(Surv.Obj ~ 1, conf.type = "none")
km
Call: survfit(formula = Surv.Obj ~ 1, conf.type = "none")

      n  events  median 0.95LCL 0.95UCL 
      5       4       7       7      NA 

summary(km)
Call: survfit(formula = Surv.Obj ~ 1, conf.type = "none")

 time n.risk  n.event survival std.err lower 95% CI upper 95% CI
  7.0      4 3.00e+00     0.25   0.217       0.0458            1
  7.5      1 4.44e-16     0.25   0.217       0.0458            1
 15.0      1 1.00e+00     0.00     NaN           NA           NA


plot(km, conf.int = FALSE, mark.time = TRUE)

So far, I haven't found out how to fit a Cox PH model with interval censored data. See my question here.

Millenary answered 6/2, 2017 at 15:31 Comment(0)

If you have both left censored and right censored data, you can consider this to be a special case of interval censoring. This is the case when you know the event time only up to an interval. If you have left censoring, this interval is (-Inf, t), with right censoring this is (t, Inf).

As such, you can use my R package icenReg to model your data. For the Cox-PH model, this can be fit as

library(icenReg)
fit <- ic_sp(cbind(left, right) ~ covars,
             data = myData, model = 'ph',
             bs_samples = 500)

where left and right are the left and right sides of the interval in which the event occurred for an individual. If an event is uncensored, then just set left equal to right for that subject.
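
As a rough illustration of how the left and right columns might be built for the made-up data in this thread, consider the sketch below. It assumes the same reading of the data as the other answer (follow-up times are event times, only subject 2 is right censored) and uses 0 rather than -Inf as the lower bound for left censored subjects, on the assumption that event times cannot be negative; `ic_data`, `obs_time`, `left_cens` and `right_cens` are hypothetical names, and the package documentation should be checked for the exact coding it expects.

```r
# Made-up data in the shape used earlier in this thread
obs_time   <- c(11, 3, 8, 15, 7)
left_cens  <- c(TRUE, FALSE, TRUE, FALSE, FALSE)   # AF already present at baseline
right_cens <- c(FALSE, TRUE, FALSE, FALSE, FALSE)  # no AF by end of follow-up
Gene       <- c(0, 0, 1, 1, 0)

ic_data <- data.frame(
  # Left censored: event happened somewhere before obs_time (assumed lower bound 0)
  left  = ifelse(left_cens, 0, obs_time),
  # Right censored: event happens somewhere after obs_time
  right = ifelse(right_cens, Inf, obs_time),
  Gene  = Gene
)
ic_data

# The model call from the answer would then look like this (not run here,
# five observations are far too few for a sensible fit):
# fit <- icenReg::ic_sp(cbind(left, right) ~ Gene, data = ic_data,
#                       model = 'ph', bs_samples = 500)
```

Uncensored subjects (4 and 5 here) end up with left equal to right, as described above.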

Frumpy answered 29/4, 2017 at 23:46 Comment(0)
