Medical research is conducted to answer uncertainties and to identify effective treatments for patients. Different questions are best addressed by different types of study design, but the randomised, controlled clinical trial is typically viewed as the gold standard, providing a very high level of evidence, when examining efficacy.1 While clinical trial methodology has advanced considerably, with clear guidance provided as to how to avoid sources of bias, even the most robustly designed study can succumb to missing data.2,3 In this statistics note, we discuss strategies for dealing with missing data, but what we hope emerges is a very clear message: there is no ideal solution to missing data, and prevention is the best strategy.
A senior colleague asks me to critique a publication of a randomised, controlled clinical trial comparing two drugs which aim to reduce intraocular pressure (IOP) in patients with primary open angle glaucoma. One eye per patient has been analysed and results are provided for IOP at 6 months. The study presents data on 147 subjects treated with drug A and 145 subjects with drug B. The mean pressure in patients on drug A is lower than in those on drug B, with an estimated treatment difference of 3.1 mm Hg, 95% CI (2.5, 3.8). A p value of <0.001 is provided. It seems clear that drug A is more efficacious in reducing IOP at 6 months than drug B, but does this mean that I am correct in deducing that A is better than B and therefore that patients should be given drug A?
Something about the numbers doesn't seem quite right: 147 versus 145 where I had expected equal numbers in the two groups. I learn (via the internet) that the researchers may have used simple randomisation, in which case chance imbalances can and do occur, particularly with smaller studies.4 While this might impact upon power, it does not in itself represent an issue; however, on careful scrutiny of the publication, I uncover that at the start of the study there were 150 patients in each arm. Three patients receiving treatment A do not provide 6-month outcome data, and five patients allocated to treatment B do not appear in the final outcome analysis. The paper is not clear as to what happened to these eight patients; however, my colleague knows the authors and agrees to drop them a note to find out what happened. He does however point out that 8 patients out of 300 is just 2.7%, which is well within the anticipated rate of loss to follow-up allowed for in the original sample size calculation.
Several months later I receive the information that I was after. The five patients on treatment arm B had merely not attended their 6-month follow-up visit: a couple had moved and three had simply not attended. They did, however, all have IOP data at 5 months and three attended at 7 months. IOP values at 5 months and 7 months were fairly similar to each other, indicating that perhaps the 5-month data (or indeed the 7-month data) could be viewed as a reasonable estimate of 6-month data. The 5-month IOP changes seen in these patients were similar to those seen in the patients who had attended at 6 months. The three patients on treatment A had, however, not attended their 6-month visit or indeed any further visits. Contact with their general practitioners revealed that they had each suffered respiratory issues. Later on, I learned that there was indeed a causal link between drug A and adverse respiratory problems. Treatment A no longer seems the best treatment, particularly for those at risk of respiratory problems.
This scenario is given to illustrate the potential for misleading conclusions to be drawn in the presence of missing data. Missing data were the focus of a statistics note by Altman and Bland.3 At the time of writing that note, the authors commented that ‘the topic of how to handle missing data is not often discussed outside statistics journals’. They stated also that the most common approach to dealing with missing data was simply to analyse only those with complete data (an available case or complete case analysis), as is illustrated in scenario one. While this method may be appropriate when there are few missing data, it can lead to incorrect conclusions, again as illustrated in scenario one. If an available case analysis is conducted, it is essential to examine reasons for data being missing. If the fact that an observation is missing is unrelated to both the observed and the unobserved data, the missing data are said to be missing completely at random (MCAR). By examining reasons for ‘missingness’ (if possible) it may become clear that data are not MCAR but that they are missing because of reasons related to the treatments and that these reasons may differ systematically by treatment (as illustrated in scenario one). If there are not many missing data, an available case analysis with a valid assumption of data being MCAR may be unbiased (ie, it does not overestimate or underestimate a treatment difference or evidence of association), but it will have lower power to detect a difference or association than if all data were present.5 Fewer data equate to less information, which in turn equates to less chance of being sure that if you fail to find a significant difference, it is because there truly is no difference.6
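The contrast between MCAR and informative missingness can be made concrete with a small simulation. The sketch below is illustrative only: the IOP distribution (mean 20 mm Hg, SD 3 mm Hg), the 10% loss rate, the 23 mm Hg threshold and the function name `simulate_missingness` are all assumptions made for this example and are not taken from any of the trials discussed.

```python
import random
from statistics import mean


def simulate_missingness(n=10_000, seed=1):
    """Compare the available case mean under two missingness mechanisms."""
    rng = random.Random(seed)

    # Hypothetical 6-month IOP values (mm Hg); mean and SD are assumed.
    iop = [rng.gauss(20.0, 3.0) for _ in range(n)]

    # MCAR: every value has the same 10% chance of being lost,
    # regardless of what the value is.
    mcar_observed = [x for x in iop if rng.random() >= 0.10]

    # Informative missingness: values above 23 mm Hg are lost half
    # of the time, so missingness depends on the unobserved outcome.
    informative_observed = [
        x for x in iop if not (x > 23.0 and rng.random() < 0.5)
    ]

    full_mean = mean(iop)
    return {
        "full_mean": full_mean,
        "mcar_bias": mean(mcar_observed) - full_mean,
        "informative_bias": mean(informative_observed) - full_mean,
    }


biases = simulate_missingness()
print(biases)
```

Under these assumptions the available case mean stays close to the full-data mean when data are MCAR, whereas selectively losing high pressures pulls the available case mean downwards, making the arm look better than it is.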
Clearly there are situations where there is no information about those who are missing. In such cases, we would recommend drawing attention to the presence of missing data and the fact that it was not possible to investigate further, so that readers are aware of the potential for bias. Best and worst case scenarios could be considered to show how conclusions might have differed under such circumstances. For example, in a study comparing drugs A and B where the primary outcome is treatment success, a best case scenario might be that all those lost to follow-up on treatment A were successes while all those on treatment B were failures. A worst case scenario would reverse these assumptions.
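A best and worst case analysis of this kind amounts to simple arithmetic on the counts. The following is a minimal sketch for a binary outcome; the function names and the success counts (100 and 90) are hypothetical, chosen only to show the calculation, while the arm sizes echo the 147 and 145 analysed patients with 3 and 5 lost to follow-up.

```python
def rate_difference(success_a, n_a, success_b, n_b):
    """Difference in success proportions, arm A minus arm B."""
    return success_a / n_a - success_b / n_b


def best_worst_case(obs_success_a, obs_n_a, missing_a,
                    obs_success_b, obs_n_b, missing_b):
    """Bound the treatment difference under extreme assumptions."""
    n_a = obs_n_a + missing_a
    n_b = obs_n_b + missing_b
    return {
        # Analyse only those with observed outcomes.
        "available_case": rate_difference(
            obs_success_a, obs_n_a, obs_success_b, obs_n_b),
        # Best case for A: missing A are successes, missing B are failures.
        "best_case_for_A": rate_difference(
            obs_success_a + missing_a, n_a, obs_success_b, n_b),
        # Worst case for A: missing A are failures, missing B are successes.
        "worst_case_for_A": rate_difference(
            obs_success_a, n_a, obs_success_b + missing_b, n_b),
    }


# Hypothetical success counts, for illustration only.
bounds = best_worst_case(
    obs_success_a=100, obs_n_a=147, missing_a=3,
    obs_success_b=90, obs_n_b=145, missing_b=5,
)
print(bounds)
```

If the study conclusion is the same at both extremes, the missing data are unlikely to have driven it; if the conclusion changes, readers should be told.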
An alternative to ‘available case analysis’, where subjects with missing data are simply omitted from the analysis, is to impute data. Imputation replaces missing data with some plausible value predicted from that subject's (or another subject's) data. One method of imputation, which is commonplace in ophthalmic literature, is that of ‘last observation carried forward’ (LOCF), where any missing value is replaced with the last observed value for that patient.
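As a minimal sketch of the mechanics, LOCF for a single patient can be written as below. The function name `locf` and the visit values are hypothetical; missing visits are represented by `None`.

```python
def locf(visits):
    """Last observation carried forward for one patient's visit series.

    Each None is replaced by the most recent earlier observed value.
    A None before any observation stays None, because there is
    nothing to carry forward.
    """
    filled = []
    last_seen = None
    for value in visits:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical IOP readings (mm Hg) at baseline and three later visits.
attended_until_last = locf([24.0, 19.0, 18.0, None])

# A patient seen only at baseline: the baseline value is carried
# all the way forward to the final visit.
baseline_only = locf([24.0, None, None, None])

print(attended_until_last)
print(baseline_only)
```

The second patient shows why the method can be hard to defend: the imputed final value assumes no change at all since baseline.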
A senior colleague draws my attention to a paper which has used LOCF. It is a paper published in a highly regarded journal. The primary outcome is visual acuity at 1 year after randomisation, and it compares two treatments for age-related macular degeneration. One hundred patients were randomised; three did not provide a measure at 1 year, one in one arm of the study and two in the other. The authors have looked at the reasons for the subjects being unable to provide data at 1 year and are satisfied that there is little to suggest that the data are not MCAR. Despite the proportion of data lost to follow-up being small and within the margin expected in the sample size calculation, and despite there being no overt evidence of data being anything other than MCAR, the authors have used LOCF. They state that this is essential in order to conduct a true intention to treat (ITT) analysis, where all randomised patients are included. ITT is, the authors state, essential in order to preserve the benefits of randomisation and protect against bias.7 The paper says that the patient who withdrew from treatment arm A actually did so prior to receiving any treatment. This means that the only available observation of visual acuity was that at baseline. The two patients who withdrew from treatment arm B moved out of the area, but did attend at 11 months. While it seems acceptable to use the 11-month data for treatment arm B, the use of baseline data for treatment arm A seems very tenuous indeed; note that our definition of valid imputation was ‘a plausible’ value. The authors did, however, compare their imputed analysis with the available case analysis and found little difference between results other than a slightly lower SE for the effect estimate with the LOCF assumption. In this scenario, therefore, while the assumptions made do not appear to be sensible, the conclusion drawn from the LOCF analysis is similar to that which would have been drawn from the available case analysis.
While LOCF is widely used and indeed required by the Food and Drug Administration (FDA) in the USA, it has serious and, in some cases, fundamental problems.8
An alternative to LOCF is ‘simple mean imputation’, which replaces missing data with the average value observed in that treatment arm. This is not ideal because if there are many patients with missing data, giving them all the same mean will reduce variability between observations and suggest more confidence in findings than is warranted.
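The loss of variability can be seen directly in a small worked example. The sketch below uses invented visual acuity letter scores, and an exaggerated proportion of missing values, purely to make the effect visible; `mean_impute` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev


def mean_impute(values):
    """Replace each None with the mean of the observed values in the arm."""
    observed = [v for v in values if v is not None]
    arm_mean = mean(observed)
    return [arm_mean if v is None else v for v in values]


# Hypothetical letter scores for one treatment arm; half are missing.
arm = [60.0, 65.0, 70.0, 75.0, 80.0, None, None, None, None, None]

observed = [v for v in arm if v is not None]
imputed = mean_impute(arm)

# The mean is unchanged, but the spread and the standard error shrink.
sd_observed = stdev(observed)
sd_imputed = stdev(imputed)
se_observed = sd_observed / sqrt(len(observed))
se_imputed = sd_imputed / sqrt(len(imputed))

print(mean(observed), mean(imputed))
print(sd_observed, sd_imputed)
print(se_observed, se_imputed)
```

The standard error falls both because the standard deviation shrinks and because the imputed values are counted as if they were real observations, which is how a borderline result can cross a significance threshold without any new information.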
A senior colleague draws my attention to a study comparing visual acuity in 80 patients with diabetic macular oedema after treatment with either drug A or drug B (table 1).
One eye only has been included and results are presented with imputation having been conducted by replacing missing data with the average observed in each treatment arm. Five patients were missing at 1 year after treatment with drug A but no patients were lost to follow-up from drug B. The study concludes that there is evidence that B is better than A. Something worries me however. I look at the available case analysis, and it suggests that there is no evidence of a difference between A and B. I now have disagreement between the available case analysis and the analysis which imputed for missing subjects. The authors have not commented upon this discrepancy and in reality there is little difference between a p value of 0.046 and 0.063—other than that one meets an arbitrary accepted value of being less than 0.05 while the other does not. Without further knowledge of why the five patients were lost to follow-up, one is unable to recommend which approach is best, and this illustrates how assumptions made about missingness and strategies to deal with missingness have the potential to mislead.
The scenarios presented thus far illustrate cases where subjects are missing at final follow-up, yet clearly missing data present challenges to researchers in other ways:
A validated questionnaire is used with a scoring algorithm provided for computing summary scores based on answers to all questions on the questionnaire. You determine that some subjects simply have not answered some of the questions. Some data are present, but not all.
Many studies involve assessment of the eye by imaging equipment, such as optical coherence tomography (OCT). Some of this equipment may be very expensive. No technology is immune from failure, and there may be times during a trial where the equipment fails—subjects do not therefore have assessments at particular visits in the trial schedule.
Postal questionnaires—not all individuals respond despite several attempts to encourage postal return.
While examples discussed here relate to tightly controlled clinical trials, it is evident that missing data are likely to be more of an issue outside the rigour of a randomised trial, for example, electronic patient records and observational studies.
The strategies presented here for missing data are simple, yet many better methods are now well described in the statistical literature, including multiple imputation and model-based approaches such as mixed models and weighted generalised estimating equations.9 Multiple imputation, which draws plausible values multiple times from the observed distributions of relevant variables and aggregates the results, incorporating the differences between them in the estimates of uncertainty, is a superior method, but is only appropriate when the assumption of missing at random can be made.10
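To show the aggregation step in miniature, the sketch below imputes a single arm's mean several times and pools the results with Rubin's rules, in which the total variance is the average within-imputation variance plus an inflated between-imputation variance. It is a deliberately simplified illustration: it uses an approximate Bayesian bootstrap with no covariates, which is only reasonable if data are MCAR within the arm, and the function names and letter scores are hypothetical. A real analysis would use established statistical software and a model including relevant covariates.

```python
import random
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M estimates and their variances using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # variance between imputed estimates
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


def multiply_impute_mean(values, m=20, seed=0):
    """Estimate an arm mean by multiple imputation (simplified sketch)."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    n_missing = len(values) - len(observed)
    estimates, variances = [], []
    for _ in range(m):
        # Approximate Bayesian bootstrap: resample the observed values
        # to form a donor pool, then draw the missing values from it.
        donors = rng.choices(observed, k=len(observed))
        draws = rng.choices(donors, k=n_missing)
        completed = observed + draws
        estimates.append(mean(completed))
        variances.append(variance(completed) / len(completed))
    return pool_rubin(estimates, variances)


# Hypothetical letter scores with two missing values.
arm = [60.0, 62.0, 65.0, 68.0, 70.0, 72.0, 75.0, 80.0, None, None]
pooled_mean, pooled_variance = multiply_impute_mean(arm)
print(pooled_mean, pooled_variance)
```

Unlike simple mean imputation, the between-imputation term carries the uncertainty about the missing values through to the final variance.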
A word of caution is provided by Streiner, however: ‘the easy methods are not good and the good ones … are not easy’.8
The examples provided are based on actual ophthalmic clinical trials. Numbers lost were small, which clearly limits their likely impact on study conclusions, yet they are realistic scenarios which researchers may face, and example 1 shows how even small numbers can alter study conclusions. A more detailed example is provided in a comprehensive paper by Fielding et al.11 This paper describes the REFLUX trial, which randomised 357 participants with gastro-oesophageal reflux disease to surgery or medicine and had an overall response rate of 89%. The authors examined the impact of missing data on a quality of life outcome measure, the EuroQol EQ-5D, which is the primary outcome of a large clinical trial currently being conducted on patients with glaucoma.12 Fielding et al explored eight different approaches to missing data and showed that while two approaches gave statistically significant results, six did not, and that of the two statistically significant models, one estimated an effect that was of clinical significance and the other did not. Choice of analysis method for missing data can thus impact on conclusions.
Whatever approach is adopted, missing data are ‘what it says on the tin’, ‘missing’, and as eloquently summarised by Bland and Altman, ‘there is no satisfactory solution to this’.3 Greater efforts should be made at the design stage to limit the likelihood of data being missing, and one simple yet very rewarding approach can be to talk to patients in advance of conducting a study. At a thyroid eye disease patient day organised by the National Institute for Health Research (NIHR) Biomedical Research Centre for Ophthalmology, patients were quite vocal about the need for researchers to carefully consider treatment schedules when designing studies. Patients with thyroid eye disease may find their condition leads to fatigue and disfigurement, resulting in their not wanting to venture out of the house; a trial requiring monthly visits, when standard practice is a 6-monthly visit, is likely to suffer recruitment and retention issues.13 Ensuring that everyone involved in research understands the potential for missing data to undermine the scientific integrity of the study and ultimately do a disservice to patients and public might also be a simple yet rewarding approach. Other strategies include targeting a population not currently served by treatment (so giving an incentive to remain in the study) and shortening the follow-up period for the primary outcome.14
Some missing data may however be inevitable: patients are human and humans get ill, go on holiday, look after sick children or drop out of studies; machines fail and post can go missing. Missing data undermine internal validity and cause loss of power, and simple methods of accounting for missing data can produce biased estimates of the treatment effect. Data will be missing for a reason and researchers are strongly encouraged to record why a value is missing. This paper hopefully highlights the need to be explicit in relation to the potential impact of missing outcome data and outlines some helpful strategies to consider when this does occur. Figure 1 is provided as a useful guide. Available case analysis used to be the standard and while there are indeed occasions where this is not helpful, simple imputation can also lead to erroneous conclusions. The final comment we leave with Streiner: ‘the solution is to consult with a statistician; most of them are (relatively) friendly’.8
Prevention is best, even in relation to missing data.
Report missing data where it occurs and record reasons for missingness wherever possible.
Statistical methods do exist for handling missing data, but assumptions made by such methods must be rigorously evaluated.
If the assumptions made in relation to missingness are incorrect, analyses may mislead.
Collaborators Valentina Cipriani, David Crabb, Philippa Cumberland, Gabriela Czanner, Andrew Elders, Marta Garcia Finana, Rachel Nash, Luke Saunders, Chris Rogers, Simon Skene, John Stephenson, Irene Stratton, Wen Xing.
Contributors CB and AQ drafted the paper. CB, CJD and NF critically reviewed and revised the paper.
Funding The posts of CB and AQ are partly funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology.
Disclaimer The views expressed in this article are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.