Given n patient records with time and status variables (among others), I would like to obtain each patient's survival risk for the time period they fall within, i.e. 2, 4, 6, 8, or 10 years.
I have divided survival time into the following periods: 24-47 months (2 years), 48-83 months (4 years), 84-107 months (6 years), 108-119 months (8 years), and 120 months up to the longest follow-up available (10 years).
At the individual level, a patient with a survival time of 30 months falls in the two-year period, and, together with the other predictive variables, I want to know this patient's survival risk within two years.
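The division into periods described above can be sketched in R with `cut()`. This is only an illustration: the vector `months` holds made-up values, and the variable name `period` and the labels are assumptions, not part of my data.

```r
# Hypothetical survival times in months (made-up values for illustration)
months <- c(30, 47, 48, 90, 110, 119, 120, 200)

# Left-closed intervals: [24,48), [48,84), [84,108), [108,120), [120,Inf)
# Times below 24 months fall outside every interval and become NA.
period <- cut(months,
              breaks = c(24, 48, 84, 108, 120, Inf),
              labels = c("2 years", "4 years", "6 years", "8 years", "10 years"),
              right  = FALSE)

data.frame(months, period)
```

With `right = FALSE` each interval includes its lower bound, so 48 months is assigned to the four-year period rather than the two-year one.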
My method
I'm retrieving survival risk percentages for my data using R:

    km <- survfit(Surv(time, status) ~ 1, data = mydata)
    survest <- stepfun(km$time, c(1, km$surv))

The time variable is the survival months, and the status variable has values 1 and 0 for alive and dead, respectively.
The code outputs something like this (taken from here):
    > survest(0:100)
     [1] 1.0000000 0.9854015 0.9781022 0.9708029 0.9635036 0.9635036 0.9635036
     [8] 0.9416058 0.9124088 0.9124088 0.8978102 0.8905109 0.8759124 0.8613139
    [15] 0.8613139 0.8467153 0.8394161 0.8394161 0.8175182 0.8029197 0.7883212
    [22] 0.7737226 0.7664234 0.7664234 0.7518248 0.7299270 0.7299270 0.7225540
    [29] 0.7225540 0.7151810 0.7004350 0.6856890 0.6856890 0.6783160 0.6783160
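For reference, a minimal sketch on made-up data of how such a step function might be evaluated at the lower bounds of the periods (24, 48, 84, 108 and 120 months). The data frame `toy`, its column `event`, and the vector `horizons` are hypothetical. The sketch assumes the indicator is coded 1 = death observed and 0 = censored, which is the convention `Surv()` uses for a 0/1 status; a variable coded 1 = alive would need to be flipped first.

```r
library(survival)

# Made-up toy data: time in months; event = 1 if death was observed, 0 if censored
toy <- data.frame(
  time  = c(5, 12, 24, 30, 48, 60, 84, 100, 120, 130),
  event = c(1,  0,  1,  1,  0,  1,  1,   0,   1,   0)
)

# Kaplan-Meier fit for the whole sample (no covariates)
km <- survfit(Surv(time, event) ~ 1, data = toy)

# Right-continuous step function: estimated survival probability at time t
survest <- stepfun(km$time, c(1, km$surv))

# The argument of survest() is a time in months, not a record index
horizons <- c(24, 48, 84, 108, 120)
survest(horizons)

# summary() reports the same Kaplan-Meier estimates at those times
summary(km, times = horizons)$surv
```

In this sketch the values returned are estimates for the whole sample at each time point, so a range such as `0:100` is read as months 0 to 100.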
My question is:
- Are these the actual survival estimates for my 300,000 individual records, meaning I would need to call survest(0:300000)? I tried survest(0:1000), but the results converged to a single value, which does not answer my question.
- Can we do it using SPSS?