Editor,—In a recent paper,1 the vertical cup/disc ratio (CDR) in relation to optic disc size was evaluated as an aid in the identification of optic discs with glaucomatous optic neuropathy. Two methods of using the vertical CDR were assessed, one method independent of disc size and the other dependent on disc size.

With the disc size independent method, for a group of patients with primary open angle glaucoma (POAG) and a control group, the authors calculated the vertical CDR and, based on a histogram plot of the control group, concluded that the vertical CDR is not normally distributed. An empirical cut off for the upper limit of normal was taken as the 97.5th percentile. When this test criterion (vertical CDR = 0.682) was applied to the two groups, this method yielded a sensitivity of 56.6% and specificity of 97.7% for the identification of glaucomatous optic discs. The conclusion that the vertical CDR is not normally distributed is not disputed (Chernoff-Lehmann test,2 p<0.10). However, the optimal vertical CDR may be selected rationally (rather than arbitrarily) by plotting sensitivity against (1 − specificity) to produce a receiver operating characteristic (ROC) curve (Fig 1A). The optimal test criterion is the point on the ROC curve furthest from the line of zero discrimination3; from the authors’ data, the optimal test criterion is a vertical CDR cut off of 0.587 (sensitivity 86.6%, specificity 87.5%). Having rationally selected the optimal test criterion, its value as a clinical aid is best assessed by the predictive power of a positive test (rather than by isolated sensitivity and specificity values). This predictive power (V+) is the proportion of true positives (by reference test) to total positives (true positives + false positives) and is a function not only of sensitivity and specificity but also of prevalence. V+ may be calculated from: $V_{+} = \left(\frac{(S - 1)(P - 1)}{N P} + 1\right)^{-1}$, where N = sensitivity, S = specificity, and P = prevalence, each expressed as a proportion.
The prevalence of glaucoma varies according to the population studied and the criteria used as the reference test, but is generally considered to be approximately 2% in adults older than 40 years of age.4-6 With this prevalence and a sensitivity of 86.6% and a specificity of 87.5%, for an optimal vertical CDR cut off of 0.587, the predictive power of a positive test is 12.4%. This indicates that 87.6% of positives would be false. The corresponding predictive value for the authors’ test criterion (vertical CDR cut off = 0.682) is 33.4%.
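As a check on the arithmetic, the following minimal sketch recomputes the two predictive values quoted above, using the definition of V+ as true positives divided by all positives. The function and variable names are illustrative only; the sensitivities, specificities, and the 2% prevalence are the figures given in the text.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return V+, the proportion of positive tests that are true positives.

    All three arguments are proportions between 0 and 1.
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


PREVALENCE = 0.02  # approximate prevalence in adults over 40, as stated in the text

# ROC-selected cut off (vertical CDR = 0.587)
ppv_roc_cut_off = positive_predictive_value(0.866, 0.875, PREVALENCE)

# Authors' 97.5th percentile cut off (vertical CDR = 0.682)
ppv_authors_cut_off = positive_predictive_value(0.566, 0.977, PREVALENCE)

print(f"ROC-selected cut off: V+ = {ppv_roc_cut_off:.1%}")
print(f"Authors' cut off:     V+ = {ppv_authors_cut_off:.1%}")
```

With these inputs the sketch gives approximately 12.4% and 33.4%, in agreement with the values quoted.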

With the disc size dependent method, the authors calculated the 95% confidence interval of the linear regression of the relation between vertical cup diameter and vertical disc diameter, after appropriate magnification correction. It then appears that the authors used the upper waist of the 95% confidence interval as a straight line to calculate a linear intercept of −0.87. The simple relation between vertical cup diameter and vertical disc diameter was then used to calculate the upper limit of the 95% confidence interval of the vertical CDR as: (1.193 × vertical disc diameter − 0.87)/vertical disc diameter. When this test criterion was applied to the two groups, sensitivity and specificity for the identification of glaucomatous optic discs were respectively 62.3% and 98.9%. The optimal confidence interval may be selected rationally using an ROC curve constructed from different confidence intervals: the optimal test criterion is a confidence interval of 72% (sensitivity 90.2%, specificity 92.3%) (Fig 1B) which yields a predictive value, V+ = 19.3%. The predictive value of the authors’ test criterion of a 95% confidence interval is 53.6%.
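To illustrate how the disc size dependent criterion behaves, the sketch below evaluates the upper limit of the vertical CDR for a few disc diameters. The slope (1.193) and intercept (−0.87) are those quoted above; the disc diameters are hypothetical values chosen only for illustration, and the unit is assumed to be millimetres, which the text does not state.

```python
SLOPE = 1.193      # slope of the upper 95% limit, as quoted in the text
INTERCEPT = -0.87  # intercept of the upper 95% limit, as quoted in the text


def cdr_upper_limit(vertical_disc_diameter):
    """Upper limit of normal for the vertical CDR at a given disc diameter."""
    cup_diameter_limit = SLOPE * vertical_disc_diameter + INTERCEPT
    return cup_diameter_limit / vertical_disc_diameter


# Hypothetical vertical disc diameters (assumed to be in millimetres)
for diameter in (1.5, 1.8, 2.1):
    print(f"disc diameter {diameter:.1f}: CDR upper limit {cdr_upper_limit(diameter):.3f}")
```

Because the intercept is negative, the permitted CDR rises with disc size, which is the sense in which this criterion depends on disc size.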

We therefore agree completely with the authors that optic disc biometry provides useful data in the identification of glaucomatous discs: the authors’ work has shown that the disc size dependent method (V+ = 19.3%) is superior to that which is disc size independent (V+ = 12.4%). Their disc size dependent method is reminiscent of the concept of the “rim index” (observed neuroretinal rim area/expected neuroretinal rim area) first described in 1991 by Montgomery.7

In general, the arbitrary selection of test criteria yields better predictive values but poorer sensitivities, whereas optimal selection of test criteria using ROC curves improves sensitivity but reduces predictive power. We therefore consider it important to appreciate that, in screening for a low prevalence disease with important health implications such as glaucoma, test criteria must be optimised to maintain a high sensitivity and steps taken to cater for a false positive rate of at least 80%, which is inevitable in these circumstances.
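The trade-off described here can be laid out numerically for the four operating points quoted in this letter. The sketch below is illustrative, not a reanalysis of the authors' data. It takes the perpendicular distance of a point (1 − specificity, sensitivity) from the line of zero discrimination to be (sensitivity + specificity − 1)/√2, which is Youden's index scaled by a constant, and it computes V+ at the 2% prevalence assumed above. The labels and function names are our own.

```python
import math


def distance_from_chance_line(sensitivity, specificity):
    """Perpendicular distance of an ROC point from the diagonal (zero discrimination)."""
    return (sensitivity + specificity - 1.0) / math.sqrt(2.0)


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Proportion of positive tests that are true positives (V+)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


PREVALENCE = 0.02

# (sensitivity, specificity) pairs quoted in the letter
OPERATING_POINTS = {
    "disc size independent, authors' cut off": (0.566, 0.977),
    "disc size independent, ROC-selected": (0.866, 0.875),
    "disc size dependent, authors' cut off": (0.623, 0.989),
    "disc size dependent, ROC-selected": (0.902, 0.923),
}

for label, (sens, spec) in OPERATING_POINTS.items():
    distance = distance_from_chance_line(sens, spec)
    ppv = positive_predictive_value(sens, spec, PREVALENCE)
    print(f"{label:42s} distance {distance:.3f}  V+ {ppv:.1%}")
```

In each pair the ROC-selected point lies further from the diagonal and has the higher sensitivity, while the authors' cut off has the higher V+.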

## References

# Reply

Editor,—We thank Barr and Nolan for their observations. We agree that the ROC curve is indeed a useful way of presenting such data, allowing a ready comprehension of the relation between sensitivity and specificity over a range of possible cut off values. However, the statement that the “optimal test criterion” is the point on the ROC curve furthest from line zero is an oversimplification. The optimal test criterion depends entirely on circumstances in which the test is applied, not on an abstract mathematical concept. Specifically, in this context, the estimation of the cup/disc ratio (CDR) is not a “test” performed in isolation. The vast majority of new glaucoma referrals to the hospital eye service are from optometrists, who perform a number of tests for glaucoma: tonometry, ophthalmoscopy (estimation of the CDR and qualitative assessment) and, increasingly, visual field testing. The optometrist integrates the results of disc assessment with the other test results, and comes to a reasoned decision of “glaucoma probability”.

The cut off values used in our paper were selected for high specificity, because the prevalence of glaucoma in the general population is low. In a population with a 2% glaucoma prevalence, application of our cut off values would yield the number of cases (per 1000 population) given in Table 11. For the disc size dependent method, there are 23.3 referrals per 1000, 53.6% of whom have glaucoma.

In a population with a 2% glaucoma prevalence, application of Barr and Nolan’s cut off values would yield the number of cases (per 1000 population) given in Table 12. For the disc size dependent method, there are 93.5 referrals per 1000, 19.3% of whom have glaucoma.
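The referral figures quoted for the disc size dependent method can be approximated from the published sensitivities and specificities alone. The following is a minimal sketch under that assumption; it does not reproduce the tables themselves, and the function and key names are illustrative.

```python
def screening_counts(sensitivity, specificity, prevalence, population=1000):
    """Expected outcomes of applying a test to a screened population."""
    with_disease = prevalence * population
    without_disease = population - with_disease
    true_positives = sensitivity * with_disease
    false_positives = (1.0 - specificity) * without_disease
    referrals = true_positives + false_positives
    return {
        "true_positives": true_positives,
        "false_negatives": with_disease - true_positives,
        "false_positives": false_positives,
        "referrals": referrals,
        "fraction_with_glaucoma": true_positives / referrals,
    }


# Disc size dependent method at 2% prevalence
our_cut_off = screening_counts(0.623, 0.989, 0.02)
barr_nolan_cut_off = screening_counts(0.902, 0.923, 0.02)

for label, counts in (("95% interval", our_cut_off), ("72% interval", barr_nolan_cut_off)):
    print(f"{label}: {counts['referrals']:.1f} referrals per 1000, "
          f"{counts['fraction_with_glaucoma']:.1%} with glaucoma")
```

This gives about 93.5 referrals per 1000 for the 72% interval and about 23.2 for the 95% interval. The latter differs slightly from the 23.3 quoted, presumably because of rounding in intermediate values.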

Application of the latter cut off values would result in a huge burden of false positive referrals to the hospital eye service, to be added to the false positive referrals that result from the application of the other tests (tonometry and visual field testing).

The true false negative rate using the cut off values we propose will be much lower than suggested in Table 11. Our data set comprised glaucoma patients with minimal visual field loss (average MD −3.44). In reality, there will be a greater range of glaucoma severity in the undiagnosed population, making the test more sensitive. In addition, the results of disc biometry (of which estimation of the CDR is only a part) have to be taken together with the results of other tests. If it is assumed that half of all individuals with glaucoma have raised IOP, then up to half the false negatives are likely to be referred for further assessment on that basis. Visual field testing would further reduce the false negative rate.
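The effect of the raised IOP assumption can be put in rough numbers. The sketch below is a best-case illustration only. It uses the disc size dependent sensitivity of 62.3% and the 2% prevalence, and it assumes that raised IOP is as common among the cases missed on disc assessment as among all cases, so that tonometry picks up at most half of them. The variable names are our own.

```python
POPULATION = 1000
PREVALENCE = 0.02
DISC_SENSITIVITY = 0.623   # disc size dependent method, 95% interval
RAISED_IOP_FRACTION = 0.5  # assumption stated in the text: half of cases have raised IOP

cases = POPULATION * PREVALENCE
missed_on_disc = cases * (1.0 - DISC_SENSITIVITY)

# Upper bound: every missed case with raised IOP is referred on that basis
referred_on_iop_at_most = missed_on_disc * RAISED_IOP_FRACTION
still_missed_at_least = missed_on_disc - referred_on_iop_at_most
best_case_sensitivity = (cases - still_missed_at_least) / cases

print(f"missed on disc assessment alone: {missed_on_disc:.2f} per 1000")
print(f"still missed after tonometry (at least): {still_missed_at_least:.2f} per 1000")
print(f"best-case combined sensitivity: {best_case_sensitivity:.1%}")
```

Under these assumptions the false negatives fall from about 7.5 to about 3.8 per 1000 at best, before any contribution from visual field testing.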

A more sophisticated application of the data is possible, with cut off values tailored according to the relative prevalence of glaucoma in subsets of the population. For instance, it is well established that the prevalence of glaucoma rises with the height of intraocular pressure (IOP). It would be possible to apply a cut off with high sensitivity to individuals with raised IOP and a cut off with high specificity to individuals with normal IOP. In this way, the overall pick up of cases can be maximised and the false positive rate kept to a minimum. A false positive rate of 80% is, therefore, by no means inevitable. However, making the method unnecessarily complicated would deter its use.

The method described in our paper was advocated as a simple adjunct to qualitative disc assessment. The great majority of cases not picked up by the CDR method (at our selected cut off) had focal changes in the neuroretinal rim, which can be detected by a careful qualitative examination. Over 90% of the more difficult cases to detect clinically, those with diffuse rim loss, were picked up by the method.

In summary, our selected cut off values were far from arbitrary, allowing CDR estimation to be integrated rationally with other aspects of the clinical examination, without resulting in unmanageable numbers of false positive glaucoma referrals.