Editor,—We refer to the interesting article by Maar et al1 published in the BJO. The authors should be congratulated for presenting a new method of screening for diabetic macular oedema. However, the paper contains some statistical errors, and we believe the results can be interpreted in other ways.
Firstly, there are internal inconsistencies in some of the results reported in the paper. Sensitivity of the Mollon T test (score >1) for detecting clinically significant macular oedema (CSMO) was reported as 88.9% and specificity was reported as 99.3%. These values could not possibly be obtained with the subject numbers specified, and we believe they have been erroneously published.
From other data provided by Maar et al1 we are able to reconstruct values for these statistics. The mean and SD of the Mollon T score for non-CSMO patients imply that 28 non-CSMO patients passed the test (true negatives) and one failed (false positive). The number of eyes with CSMO was 10 and there were 29 eyes without CSMO. Table 1 can be constructed.

Table 1  Mollon T test result by CSMO status

                        CSMO present   CSMO absent   Total
Test failed (score >1)        8              1          9
Test passed                   2             28         30
Total                        10             29         39
It is apparent from this table that the sensitivity of the test is 8/10 or 80.0%, not the 88.9% quoted in the paper, and the specificity is 28/29 or 96.6%, not the quoted 99.3%.
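The arithmetic can be checked directly from the 2 × 2 counts; a minimal sketch (the function name is ours, for illustration only):

```python
def sens_spec(tp, fn, fp, tn):
    """Sensitivity and specificity from 2x2 screening-table counts.

    tp/fn: CSMO eyes that failed/passed the test;
    fp/tn: non-CSMO eyes that failed/passed the test.
    """
    sensitivity = tp / (tp + fn)  # proportion of CSMO eyes detected
    specificity = tn / (tn + fp)  # proportion of non-CSMO eyes passing
    return sensitivity, specificity

# Counts reconstructed above: 8 true positives, 2 false negatives,
# 1 false positive, 28 true negatives.
sens, spec = sens_spec(8, 2, 1, 28)  # 0.800 and 0.966 (to 3 decimal places)
```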
These values are of course only estimates of the true sensitivity and specificity, and are based on fairly small numbers of subjects. It is good practice to report such figures accompanied by their 95% confidence intervals.2 For the purpose of assessing the practical value of a screening tool, the lower confidence limit of each proportion is of clinical interest because it reflects the sensitivity and specificity of the test in the "worst case" scenario. Various methods could be used to calculate or approximate such an interval. For example, using Wilson's score method incorporating continuity correction,3 the 95% CI of the sensitivity is 44.2%–99.1% and of the specificity is 80.4%–99.8%. The clinician will be quick to notice the wide confidence interval for the sensitivity. This is an inherent problem in such studies, where the absolute number of cases of CSMO is relatively small. Thus it is possible that the Mollon test is only a mediocre indicator of the presence of CSMO and that the high sensitivity was obtained by chance.
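For readers who wish to reproduce such intervals, the following sketch implements Wilson's score method with continuity correction, using the closed-form expressions given by Newcombe; the function name is ours:

```python
import math

def wilson_ci_cc(x, n, z=1.959964):
    """95% Wilson score confidence interval for a proportion x/n,
    with continuity correction (Newcombe's closed-form expressions).

    Returns (lower, upper) as proportions in [0, 1].
    """
    p = x / n
    q = 1 - p
    denom = 2 * (n + z * z)
    lower = (2 * n * p + z * z - 1
             - z * math.sqrt(z * z - 2 - 1 / n + 4 * p * (n * q + 1))) / denom
    upper = (2 * n * p + z * z + 1
             + z * math.sqrt(z * z + 2 - 1 / n + 4 * p * (n * q - 1))) / denom
    # Boundary cases: the interval cannot extend past 0 or 1.
    lower = 0.0 if x == 0 else max(0.0, lower)
    upper = 1.0 if x == n else min(1.0, upper)
    return lower, upper

# Specificity 28/29: lower limit about 80.4%, upper about 99.8%.
spec_lo, spec_hi = wilson_ci_cc(28, 29)
```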
The authors have mentioned that "the Mollon-Reffin test was a better predictor than the other tests because it had a lower false positive rate (1 − specificity)" in the legend to Figure 3 of their paper.1 Perhaps it would be better (but not mandatory) to present the area under the receiver operating characteristic (ROC) curves. This gives the probability of correctly identifying a randomly selected participant as a case or non-case and is therefore a measure of overall test validity. It would also offer an objective way of comparing the different ROC curves. We also note that the shape of the ROC curve for the DD-15 test was unusual (some sensitivity values were associated with two different specificity values). It is difficult to conceive of a data set that could generate a non-monotonic ROC curve, and some comment on its unusual shape might have been useful.
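The equivalence between the area under the ROC curve and the probability of correctly ranking a randomly chosen case above a randomly chosen non-case can be illustrated with a short routine; the scores below are hypothetical and purely for illustration:

```python
def auc_mann_whitney(case_scores, control_scores):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a randomly chosen case scores higher than
    a randomly chosen control (ties count as half a win)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical test scores (higher = more abnormal), not taken from the paper:
cases = [2, 3, 1, 2]        # eyes with disease
controls = [0, 1, 0, 0, 1]  # eyes without disease
auc = auc_mann_whitney(cases, controls)  # 0.95 for these made-up scores
```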
Previous presentations: Nil.
Commercial interests: Nil.