Multiple Imputation for Handling Missing Outcome Data

Salter, Amy; Sullivan, Thomas Richard
2019-05-30
2017
http://hdl.handle.net/2440/119248

Background: Multiple imputation is a widely used approach to handling missing data. Despite a growing evidence base for its use, implementation in practical settings remains challenging. This thesis considers knowledge gaps in the application of multiple imputation for handling missing outcome data. Research has shown that deleting observations with multiply imputed outcomes before analysis can be beneficial when imputation and analysis models are the same. However, it is unclear how this approach performs with auxiliary variables, which are often available in practice. Another challenge arises when the outcome of interest is binary. The use of log binomial regression to produce relative risks is common, yet standard methods for imputing binary outcomes involve logistic regression or a multivariate normal assumption. It is uncertain whether inconsistencies between imputation and analysis models in this setting lead to biased or inefficient estimation. Questions also remain concerning the utility of multiple imputation in randomised trials. Unlike observational studies, the key exposure in randomised trials (randomised group) is always observed and independent of covariates for adjustment. If extended follow-up beyond completion of a randomised trial is planned, there may be more missing outcome data than in the original trial, and the use of eligibility restrictions and separate consent processes for participation in extended follow-up may complicate the use of multiple imputation. Unfortunately little is known about the extent of missing outcome data in this setting.

Aims: Specific aims are to:
1. Evaluate the effect of deleting imputed outcomes prior to analysis in the presence of auxiliary variables;
2. Investigate the performance of multiple imputation when estimating the relative risk;
3. Assess the utility of multiple imputation in randomised trials;
4. Summarise the extent of missing outcome data and provide guidance on the implementation of multiple imputation in extended follow-up studies.

Methods: The performance of multiple imputation was evaluated using data simulation and application to a real clinical trial. To summarise the extent of missing outcome data in extended follow-up studies, a systematic review of published follow-up studies was undertaken.

Results: Deleting imputed outcomes prior to analysis can lead to bias when the imputation model contains auxiliary variables associated with missingness in the outcome. For relative risk estimation, standard multiple imputation methods introduce bias and tend to produce confidence intervals that are too wide. Multiple imputation performs well in randomised trials, but simpler unbiased alternative methods for handling missing data are often slightly more efficient. Missing outcome data are a considerable threat to the validity of conclusions from extended follow-up studies. Eligibility restrictions and separate consent processes for participation are commonly employed in this setting, making the implementation of multiple imputation more challenging.

Conclusions: This thesis demonstrates the pitfalls of deleting imputed outcomes prior to analysis, the need for new methods of imputation when estimating the relative risk, and the limitations of multiple imputation for handling missing outcome data in randomised trials and extended follow-up studies. These findings will help to guide researchers on the appropriate use of multiple imputation for handling missing outcome data.

Language: en
Keywords: Multiple imputation; missing data; missing at random; clinical trials
Type: Thesis