Review
Analysing randomised controlled trials with missing data: Choice of approach affects conclusions

https://doi.org/10.1016/j.cct.2011.12.002

Abstract

Background

The publication of a wrong conclusion from a randomised trial could have disastrous consequences. Missing data are unavoidable in most studies, but ignoring the problem may introduce bias to the results. Finding an appropriate way to deal with missing data is of paramount importance. We show how the choice of analysis method can impact on the conclusion of the trial with regard to the quality of life outcomes.

Methods

Various analysis strategies (analysis of covariance, linear mixed effects model) with and without imputation were carried out to assess treatment difference in four quality of life outcomes in an example clinical trial.

Results

Across all four quality of life outcomes, the various analysis approaches provided different estimates of treatment difference, with varying precision, using different numbers of patients. In some cases the decision about statistical significance differed. The results suggested that, where possible, extra effort should be made to retrieve missing responses. In the presence of data missing at random, simple imputation was inappropriate; multiple imputation or a linear mixed effects model was more useful.

Conclusion

Different trial conclusions were obtained for a variety of analysis approaches for the same outcome. Collecting as much data as possible is of paramount importance. Careful consideration should be given to the most appropriate analysis strategy when missing data are involved, and this strategy should be pre-specified in the trial protocol. Inappropriate decisions could result in inappropriate conclusions, potentially leading to the adoption of a clinical intervention in error.

Introduction

The randomised controlled trial (RCT) is an important way of evaluating healthcare interventions, forming the basis of evidence based medicine [1]. Information gained from trials is optimal when the trial dataset is complete or relatively few data are missing. In practice this is very difficult to achieve and most trial datasets will contain missing data. Missing data are a problem for many different types of outcomes. Ignoring the presence of missing data could have major consequences and potentially lead to the publication of a wrong conclusion about a particular therapy, which ultimately could impact on clinical practice. Follow-up outcome data collected through postal questionnaires are particularly susceptible to the problems of missing data as completion cannot be enforced.

The focus of the work presented is quality of life (QoL) outcomes, but the results are applicable to the problem of missing data in general. Taking account of missing QoL outcome data is of paramount importance, as the reason why the data are missing is often related to the QoL itself. Patients may forget to complete and return the questionnaires, may not be physically or mentally able to do so, or may never receive them because they were lost in the post. The missing data mechanism describes the underlying reason why missing data have occurred [2]. If missingness relates to the QoL itself, an analysis that ignores it may be biased, so the mechanism needs to be considered when analysing the trial outcomes. If missingness is due to death, the implications of this should also be considered.
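The effect of the missing data mechanism can be illustrated with a small simulation. The sketch below is not taken from the REFLUX analysis: the score distribution, the cut-off of 45 and the response probabilities are all hypothetical values chosen for illustration. It compares the mean of the returned questionnaires with the mean of all scores, first when non-response is unrelated to QoL and then when patients with poorer QoL are less likely to respond.

```python
import random
import statistics


def simulate_observed_mean(n=20000, depends_on_qol=False, seed=1):
    """Return (mean of all scores, mean of the scores actually returned).

    Hypothetical setup: QoL scores are drawn from Normal(50, 10).
    If depends_on_qol is False, every patient responds with probability 0.7.
    If depends_on_qol is True, patients scoring below 45 respond with
    probability 0.4 and all other patients with probability 0.9.
    """
    rng = random.Random(seed)
    scores = [rng.gauss(50, 10) for _ in range(n)]
    observed = []
    for score in scores:
        if depends_on_qol:
            p_respond = 0.4 if score < 45 else 0.9
        else:
            p_respond = 0.7
        if rng.random() < p_respond:
            observed.append(score)
    return statistics.fmean(scores), statistics.fmean(observed)


for flag in (False, True):
    all_mean, returned_mean = simulate_observed_mean(depends_on_qol=flag)
    print(f"missingness depends on QoL: {flag}; "
          f"all patients {all_mean:.2f}, responders only {returned_mean:.2f}")
```

Under these assumptions the responders-only mean stays close to the full mean when non-response is unrelated to QoL, and is shifted upwards when patients with low scores respond less often.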

In an effort to tackle the problem of non-returned postal questionnaires, some organisations now employ a system of reminder questionnaires to help retrieve data that were initially missing. The rationale is that participants sometimes need prompting, and a reminder may lead them to respond. This increases the sample size, helps the study retain sufficient power to draw conclusions, and avoids the bias that can arise when some participants are removed from the analysis.

A common approach in analysing RCTs is to use a complete case analysis, whereby patients with incomplete data are excluded. In recent years imputation has been seen as a way of providing a sensitivity analysis for this. The choice between imputation methods is often limited to those that are readily available and easy to implement (e.g. mean imputation). Recent advances have led to multiple imputation being more widely used, but many researchers still regard it as something of a ‘black box’ [3]. Many trials (including our example, REFLUX) collect QoL outcome data at baseline and several times during follow-up, but only data from the final endpoint are analysed. A complete case analysis on the final endpoint ignores any patient without this final outcome, even though their interim responses may be valuable in deciding between treatment options. Using an example trial, we investigate alternative analysis strategies that utilise all responses and, alongside different approaches for dealing with missing data, show how conclusions about which treatment is best can be affected.
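The difference between a complete case analysis and mean imputation can be seen on a handful of made-up scores. The sketch below uses invented values, not REFLUX data, and the helper names `complete_cases` and `mean_impute` are ours. It shows that mean imputation restores the sample size and leaves the mean unchanged, but shrinks the standard deviation, which can overstate precision.

```python
import statistics


def complete_cases(values):
    """Keep only the observed (non-None) responses."""
    return [v for v in values if v is not None]


def mean_impute(values):
    """Replace each missing response with the mean of the observed ones."""
    observed_mean = statistics.fmean(complete_cases(values))
    return [observed_mean if v is None else v for v in values]


# Invented final-endpoint QoL scores; None marks a non-returned questionnaire.
scores = [62.0, None, 48.0, 71.0, None, 55.0, 66.0, None]

cc = complete_cases(scores)
imputed = mean_impute(scores)

print("complete case: n =", len(cc),
      "mean =", round(statistics.fmean(cc), 2),
      "sd =", round(statistics.stdev(cc), 2))
print("mean imputed:  n =", len(imputed),
      "mean =", round(statistics.fmean(imputed), 2),
      "sd =", round(statistics.stdev(imputed), 2))
```

The imputed values sit exactly at the observed mean, so they add patients without adding any spread. This is one reason simple imputation can give misleadingly narrow confidence intervals.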

Section snippets

Example trial

The REFLUX trial [4], [5] was undertaken by the Centre for Healthcare Randomised Trials, part of the Health Services Research Unit at the University of Aberdeen. The aim of this trial was to determine the relative benefits and risks of laparoscopic fundoplication surgery as an alternative to long-term drug treatment for gastro-oesophageal reflux disease (GORD). It was a multicentre trial and recruited 357 participants (178 to surgery and 179 to medical management) to the randomised part of the

Results

The REFLUX trial included 357 randomised participants. At the final endpoint (12 months), 38% responded immediately and a further 51% responded after a reminder, giving an overall response rate of 89%. The patient characteristics collected at baseline are shown in Table 1. The mean age was 46.3 years and two thirds were male. No obvious differences were seen between the two groups, which was to be expected since the groups were randomised. Table 2 shows the missing data pattern for the REFLUX

Discussion

The aim of this paper was to illustrate (using REFLUX) how different choices of analysis methods can impact upon a trial conclusion. Data from three QoL instruments (four outcomes) collected at three time points were obtained. Eight different analysis strategies were implemented. These included the original published ANCOVA, a linear mixed effects model, simple imputation (using LVCF) and multiple imputation (using a predictive mean matching model) followed by ANCOVA. It was found that the choice of
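As an illustration of the simple imputation strategy, the sketch below assumes that LVCF denotes last value carried forward, i.e. a missing response is replaced by the patient's most recent observed response. The four patients and their scores at three visits are invented, and the `lvcf` helper is ours rather than code from the trial.

```python
def lvcf(row):
    """Fill each missing (None) visit with the most recent observed value.

    Visits before the first observed value stay None because there is
    nothing to carry forward.
    """
    filled, last = [], None
    for value in row:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Invented scores at three visits (first, interim, final); None = missing.
patients = {
    "A": [40.0, 55.0, 60.0],
    "B": [35.0, 50.0, None],
    "C": [45.0, None, None],
    "D": [None, 30.0, None],
}

# Complete case analysis of the final endpoint keeps only patient A.
final_complete = [row[-1] for row in patients.values() if row[-1] is not None]

# LVCF keeps every patient with at least one earlier response.
final_lvcf = [lvcf(row)[-1] for row in patients.values()]

print("complete case final scores:", final_complete)
print("LVCF final scores:", final_lvcf)
```

In this toy example LVCF retains all four patients, but it does so by assuming that no change occurred after the last observed visit. That assumption may not hold when a treatment effect develops over time.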

Conclusion

In conclusion, researchers should carefully consider how best to analyse a study where missing data may be an issue. Since the choice of methods may provide different results, the methods chosen should be pre-specified in the trial protocol. Ensuring that as much data as possible is used is important. Use of reminders to recover data initially missing may be helpful. In addition, taking into account all available data (e.g. linear mixed effects model) may be of benefit as everyone with

Acknowledgements

We would like to thank the Centre for Healthcare Randomised Trials, based within the Health Services Research Unit, and their staff for providing the data used for this work. In particular, we thank Samantha Wileman, who assisted with data queries and provided background information on the trial.

References (24)

  • Brooks. EuroQoL: the current state of play. Health Policy (1996)
  • K.F. Schulz et al. Sample size slippages in randomised trials: exclusions and the lost and wayward. Lancet (2002)
  • S.J. Pocock. Clinical Trials: A Practical Approach (1983)
  • D.B. Rubin. Inference and missing data. Biometrika (1976)
  • J.R. Carpenter et al. Missing Data in Randomised Controlled Trials: A Practical Guide
  • A.M. Grant et al. Minimal access surgery compared with medical management for chronic gastro-oesophageal reflux disease: UK collaborative randomised trial. BMJ (2009)
  • A. Grant et al. The effectiveness and cost-effectiveness of minimal access surgery amongst people with gastro-oesophageal reflux disease: a UK collaborative study. The REFLUX trial. Health Technol Assess (2008)
  • S. Macran et al. The development of a new measure of quality of life in the management of gastro-oesophageal reflux disease: the Reflux questionnaire. Qual Life Res (2007)
  • J.R. Ware et al. SF-36 Health Survey Manual and Interpretation Guide (1993)
  • S. Fielding et al. A review of RCTs in four medical journals to assess the use of imputation to overcome missing data in quality of life outcomes. Trials (2008)
  • N.K. Aaronson et al. The European Organisation for research and Treatment of Cancer QLQ-C30: a quality of life instrument for use in international clinical trials in oncology. J Natl Cancer Inst (1993)
  • D.L. Fairclough. Design and Analysis of Quality of Life Studies in Clinical Trials (2002)

Funding: The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health Directorate. Shona Fielding was funded by the Chief Scientist Office on a Research Training Fellowship (CZF/1/31) while carrying out this work. The views expressed are, however, not necessarily those of the funding body.
