Impact of model misspecification in shared frailty survival models.
Journal contribution posted on 24.07.2019, 14:11 by Alessandro Gasparini, Mark S. Clements, Keith R. Abrams, Michael J. Crowther
Survival models incorporating random effects to account for unmeasured heterogeneity are increasingly used in biostatistical and applied research. Specifically, unmeasured covariates whose omission from the model would lead to biased, inefficient results are commonly accounted for by including a subject-specific (or cluster-specific) frailty term that follows a given distribution (e.g., gamma or lognormal). Nevertheless, in the context of parametric frailty models, little is known about the impact of misspecifying the baseline hazard, the frailty distribution, or both. Our aim is therefore to quantify the impact of such misspecification in a wide variety of clinically plausible scenarios via Monte Carlo simulation, using open-source software readily available to applied researchers. We generate clustered survival data assuming various baseline hazard functions, including mixture distributions with turning points, and assess the impact of sample size, variance of the frailty, baseline hazard function, and frailty distribution. Models compared include standard parametric distributions and more flexible spline-based approaches; we also include semiparametric Cox models. The resulting bias can be clinically relevant. In conclusion, we highlight the importance of fitting models that are flexible enough and the importance of assessing model fit. We illustrate our conclusions with two applications using data on diabetic retinopathy and bladder cancer. Our results show the importance of assessing model fit with respect to the baseline hazard function and the distribution of the frailty: misspecifying the former leads to biased relative and absolute risk estimates, whereas misspecifying the latter affects absolute risk estimates and measures of heterogeneity.
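As a rough illustration of the data-generating step described above, the following minimal sketch simulates clustered survival times from a shared gamma frailty model with a Weibull baseline hazard, using inverse transform sampling. It is not the authors' code: the function name, the parameter names and values, the single binary covariate, and the administrative censoring time are all assumptions made for illustration, and the study itself considers more complex baseline hazards (such as mixture distributions) that are not reproduced here.

```python
import math
import random


def simulate_shared_frailty(n_clusters, cluster_size, beta, lam, p, theta,
                            max_time, seed=None):
    """Simulate clustered survival data (hypothetical helper, illustration only).

    Assumed model: h_ij(t) = a_i * lam * p * t**(p - 1) * exp(beta * x_ij),
    where a_i is a gamma frailty shared within cluster i, with mean 1 and
    variance theta.
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        # Gamma frailty with shape 1/theta and scale theta: mean 1, variance theta.
        frailty = rng.gammavariate(1.0 / theta, theta)
        for _ in range(cluster_size):
            # Binary covariate (e.g., treatment arm), assigned with probability 0.5.
            x = 1 if rng.random() < 0.5 else 0
            # Uniform draw on (0, 1] so that log(u) is always defined.
            u = 1.0 - rng.random()
            # Invert S(t) = exp(-frailty * lam * t**p * exp(beta * x)) for t.
            rate = frailty * lam * math.exp(beta * x)
            t = (-math.log(u) / rate) ** (1.0 / p)
            # Administrative censoring at max_time (an assumption of this sketch).
            event = 1 if t <= max_time else 0
            rows.append({"cluster": cluster, "x": x,
                         "time": min(t, max_time), "event": event})
    return rows


# Example: 50 clusters of 4 subjects, frailty variance 0.75 (arbitrary values).
data = simulate_shared_frailty(n_clusters=50, cluster_size=4, beta=-0.5,
                               lam=0.5, p=1.2, theta=0.75, max_time=5.0,
                               seed=1)
```

In a simulation study of this kind, data sets generated in this way would be fitted repeatedly with correctly specified and misspecified models, and the estimates compared with the known true values of the parameters (here `beta` and `theta`).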