Please use this identifier to cite or link to this item:
Scopus Web of Science® Altmetric
Type: Journal article
Title: Multiple imputation for handling missing outcome data when estimating the relative risk
Author: Sullivan, T.
Lee, K.
Ryan, P.
Salter, A.
Citation: BMC Medical Research Methodology, 2017; 17(1):134-1-134-10
Publisher: BioMed Central
Issue Date: 2017
ISSN: 1471-2288
Statement of
Thomas R. Sullivan, Katherine J. Lee, Philip Ryan and Amy B. Salter
Abstract: Background: Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. Methods: Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. Results: Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. Conclusions: Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. However fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework.
Keywords: Relative risk
Missing data
Multiple imputation
Description: Published online: 06 September 2017
Rights: © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.
DOI: 10.1186/s12874-017-0414-5
Grant ID:
Published version:
Appears in Collections:Aurora harvest 8
Public Health publications

Files in This Item:
File Description SizeFormat 
hdl_109324.pdfPublished Version497.11 kBAdobe PDFView/Open

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.