Assessing causal treatment effect estimation when using large observational datasets

journal contribution
posted on 07.05.2020, 08:17 by ER John, KR Abrams, CE Brightling, NA Sheehan
Background: Recently, there has been a heightened interest in developing and evaluating different methods for analysing observational data. This has been driven by the increased availability of large data resources such as Electronic Health Record (EHR) data alongside known limitations and changing characteristics of randomised controlled trials (RCTs). A wide range of methods are available for analysing observational data. However, various, sometimes strict, and often unverifiable assumptions must be made in order for the resulting effect estimates to have a causal interpretation. In this paper we will compare some common approaches to estimating treatment effects from observational data in order to highlight the importance of considering, and justifying, the relevant assumptions prior to conducting an observational analysis.

Methods: A simulation study was conducted based upon a small cohort of patients with chronic obstructive pulmonary disease. Two-stage least squares instrumental variables, propensity score, and linear regression models were compared under a range of different scenarios including different strengths of instrumental variable and unmeasured confounding. The effects of violating the assumptions of the instrumental variables analysis were also assessed. Sample sizes of up to 200,000 patients were considered.

Results: Two-stage least squares instrumental variable methods can yield unbiased treatment effect estimates in the presence of unmeasured confounding provided the sample size is sufficiently large. Adjusting for measured covariates in the analysis reduces the variability in the two-stage least squares estimates. In the simulation study, propensity score methods produced very similar results to linear regression for all scenarios. A weak instrument or strong unmeasured confounding led to an increase in uncertainty in the two-stage least squares instrumental variable effect estimates. A violation of the instrumental variable assumptions led to bias in the two-stage least squares effect estimates. Indeed, these were sometimes even more biased than those from a naïve linear regression model.

Conclusions: Instrumental variable methods can perform better than naïve regression and propensity scores. However, the assumptions need to be carefully considered and justified prior to conducting an analysis or performance may be worse than if the problem of unmeasured confounding had been ignored altogether.
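
The kind of comparison summarised in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation study: the data-generating model, the continuous treatment, the function names (`simulate`, `ols_slope`, `tsls_slope`) and all parameter values are assumptions chosen only to show how naïve linear regression and two-stage least squares might behave under unmeasured confounding, and what can happen when the instrument also affects the outcome directly.

```python
import numpy as np


def simulate(n, instrument_strength=0.5, confounding=1.0, true_effect=2.0,
             direct_effect_of_z=0.0, seed=0):
    """Draw one hypothetical dataset (illustrative model, not the paper's)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument
    u = rng.normal(size=n)  # unmeasured confounder (never given to the models)
    # Treatment depends on the instrument and on the confounder.
    x = instrument_strength * z + confounding * u + rng.normal(size=n)
    # Outcome depends on treatment and confounder; a non-zero
    # direct_effect_of_z violates the exclusion restriction.
    y = (true_effect * x + confounding * u
         + direct_effect_of_z * z + rng.normal(size=n))
    return z, x, y


def ols_slope(x, y):
    """Slope from a least squares regression of y on x with an intercept."""
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


def tsls_slope(z, x, y):
    """Two-stage least squares point estimate with a single instrument."""
    # Stage 1: predict the treatment from the instrument.
    design_z = np.column_stack([np.ones_like(z), z])
    gamma, *_ = np.linalg.lstsq(design_z, x, rcond=None)
    x_hat = design_z @ gamma
    # Stage 2: regress the outcome on the predicted treatment.
    return ols_slope(x_hat, y)


if __name__ == "__main__":
    # Valid instrument, unmeasured confounding present.
    z, x, y = simulate(n=200_000)
    print("true effect: 2.0")
    print("naive OLS  :", round(ols_slope(x, y), 3))
    print("2SLS       :", round(tsls_slope(z, x, y), 3))

    # Instrument with a direct path to the outcome (assumption violated).
    z, x, y = simulate(n=200_000, direct_effect_of_z=0.5)
    print("naive OLS (invalid instrument):", round(ols_slope(x, y), 3))
    print("2SLS      (invalid instrument):", round(tsls_slope(z, x, y), 3))
```

Under these assumed settings, the naïve regression estimate should sit noticeably above the true effect while the two-stage least squares estimate stays close to it. With `direct_effect_of_z=0.5` the two-stage least squares estimate drifts further from the truth than the naïve estimate, mirroring the abstract's warning about violated instrumental variable assumptions. Only point estimates are computed here; valid standard errors for two-stage least squares would need a dedicated routine.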

Funding

This report is independent research arising from a Doctoral Research Fellowship (Eleanor John, DRF-2018-11-ST2–034) and an NIHR Research Methods Fellowship (NIHR-RMFI-2016-07-10), supported by the National Institute for Health Research. The views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health. KRA is partially supported as a UK National Institute for Health Research (NIHR) Senior Investigator Emeritus (NI-SI-0512-10159).

History

Citation

John, E.R., Abrams, K.R., Brightling, C.E. et al. Assessing causal treatment effect estimation when using large observational datasets. BMC Med Res Methodol 19, 207 (2019). https://doi.org/10.1186/s12874-019-0858-x

Version

VoR (Version of Record)

Published in

BMC MEDICAL RESEARCH METHODOLOGY

Volume

19

Issue

1

Pagination

207 (15)

Publisher

BMC

issn

1471-2288

eissn

1471-2288

Acceptance date

23/10/2019

Copyright date

2019

Available date

14/11/2019

Publisher version

https://link.springer.com/article/10.1186/s12874-019-0858-x

Spatial coverage

England

Language

English