AIM To determine whether pharmacological mydriasis leads to a significant difference in interobserver agreement of optic disc measurement compared with examination without mydriasis.
METHOD A cross sectional study was performed with a pair of observers examining the optic disc of two randomised groups of patients, one group before diagnostic mydriasis, and the other afterwards. Horizontal and vertical disc diameters and cup/disc ratios were measured with a 78 dioptre lens. The study was repeated with another observer pair and two further groups of patients.
RESULTS In study A 86 subjects were examined in total (52 without and 34 with mydriasis). In study B 87 subjects were examined (45 without and 42 with mydriasis). The 95% limits of agreement of the cup/disc ratio measurement differences were significantly larger without mydriasis (p<0.001 for all studies (F test)). For both studies examination after mydriasis gave significantly greater agreement for vertical and horizontal cup/disc ratios. The cases with good agreement (0.1 difference or better) for vertical cup/disc ratios were 37/52 (72%) and 34/45 (76%) without mydriasis and 33/34 (97%) and 40/42 (95%) respectively with mydriasis. Similar differences were recorded for horizontal cup/disc ratios. Disc diameter measurement results showed similar differences in study A but were not affected by mydriasis in study B.
CONCLUSIONS Examination of the optic disc without pharmacological mydriasis gives significantly poorer interobserver agreement. In this study, the mean 95% limits of agreement values for all cup/disc ratio values were 0.27 for examination without mydriasis and 0.13 for examination with mydriasis. A measure outside these limits would suggest a real difference. This study indicates that mydriasis is important for reproducible clinical examination in glaucoma.
- optic disc
- interobserver variability
Examination of the ocular fundus with mydriasis is generally considered an essential part of an ophthalmic assessment. However, mydriatic examination does have time and resource implications. Patients need to wait until topical mydriatics have become effective, and therefore need to be seen twice if intraocular pressure measurements and/or gonioscopy are performed.
Diagnostic yield from a routine mydriatic examination in general ophthalmic practice has been shown to be low and it has been suggested that for the detection of peripheral retinal lesions it is an expensive test for each case prevented.1 2 Modern slit lamp indirect ophthalmoscopy lenses are designed to offer an improved view through an undilated pupil and to enable a reasonable fundal view that would not previously be so accessible with other methods of funduscopy.
Since its introduction, the concept of cup/disc ratio has been the cornerstone of assessment of the optic disc in glaucoma.3 4 Optic disc size is also recognised to be important in assessment as it may have a profound effect on the clinical significance of the cup/disc ratio.5 Both these variables may readily be examined and are a regular part of clinical practice.
Sequential stereoscopic optic disc photographs may offer better reproducibility than clinical examination, and yet further improvements may be available with the confocal scanning laser ophthalmoscope.6 7 Ideally, serial stereoscopic photographs would enable a more accurate follow up. In clinical practice, however, stereoscopic photography is not available for many patients, and the confocal scanning laser ophthalmoscope has yet to be introduced into general clinical practice.
This study was designed to test the hypothesis that examination of the optic disc with mydriasis gives significantly improved interobserver agreement compared with examination without mydriasis in a clinical setting. Good interobserver agreement is important in the clinical management of glaucoma, as over the course of their disease, patients may well be managed by different clinicians. Additionally, an assessment with good interobserver agreement is more likely to indicate a higher “quality” examination than one without this. In addition to determining any difference, the study aimed to measure the magnitude of this difference, as a statistically significant difference may not represent a practical difference in terms of possible outcome. Given current pressure to improve throughput in ophthalmic clinics, it was designed to give meaningful data to suggest how useful (or otherwise) this practice might be for patient care.
Two prospective randomised observational studies were performed. The studies were performed in a university hospital eye department, with patients drawn from general ophthalmic outpatient and glaucoma clinics. Patients were invited to participate after giving informed consent to examination. The examination was performed as part of an outpatient appointment. Three observers examined two cohorts of patients. One cohort of patients was examined by a pair of observers who both had glaucoma subspecialty training (JFK and AEL: study A), and the other cohort was examined by one observer from the first pair and a less experienced observer (JFK and PG: study B). Observers did not undergo a prestudy validation, and were masked as to the results during the course of the study. Patients due to undergo a mydriatic examination were randomised by simple randomisation to undergo examination either before or after mydriasis. Randomisation was performed by the first examiner with a coin after consent had been obtained. One eye of each patient was randomised to undergo examination as part of the study. The patient was examined by one examiner, and then immediately afterward by the other examiner at the same slit lamp with the same examination lens. The second observer was unaware of any clinical information. For mydriasis, patients received tropicamide 1%, with or without phenylephrine 10%, depending on clinical judgment.
Sample size was estimated using a proportions method. The study was designed to have sufficient power to detect a reduction in good agreement from 90% to 65%, given an alpha error of 5% and a power of 80% (beta error of 20%). A total sample size of 88 patients was required for each study.
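The sample size reasoning can be illustrated with a minimal sketch. The text does not state which proportions formula was used, so the version below is an assumption: the common unpooled normal approximation for comparing two independent proportions, with a two sided alpha of 5% and 80% power. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per group to detect a change from p1 to p2.

    Assumption: unpooled normal approximation for two independent
    proportions, two sided test, no continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2


# Good agreement falling from 90% to 65%, as in the study design
n = n_per_group(0.90, 0.65)
per_group = ceil(n)
total = 2 * per_group
print(per_group, total)
```

Under these assumptions the sketch gives about 40 patients per group (80 in total), which is of the same order as the 88 reported. The remaining gap may reflect a different formula or a correction that is not described in the text.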
Patients were excluded from this study for the following reasons:
Unwillingness to participate
Media opacity causing an obscured view of the optic disc
Abnormal optic disc morphology, such as a tilted disc
Poor cooperation with examination—for example, where the fellow eye had poor vision
A poorly responding pupil—for example, as a result of chronic pilocarpine treatment.
Examination of the optic disc was performed with a Haag–Streit slit lamp and a 78 dioptre lens (Volk) using the slit beam column scale to estimate the apparent optic disc diameter as has been previously described.8 9 The examiner did not observe the reading on the slit lamp scale until the measurement was completed. The optic disc margin was defined as the inner border of the scleral ring. No correction was made for refractive error or axial length. The same slit lamp was used for each examination. Determination of the cup/disc ratio was left to the judgment of the examiners, with emphasis on the contour of the optic cup. No measurement devices or standardisation chart was used. Examiners were encouraged to estimate to the nearest 0.05 of a cup/disc ratio. Four variables were measured: vertical disc diameter, vertical cup/disc ratio, horizontal disc diameter, and horizontal cup/disc ratio.
Each variable was analysed both as continuous and categorical, to allow comparison with previous work.7 10 11 Bland–Altman plots were constructed,12 plotting the difference between two observers against their average, to detect whether there was any systematic bias between readings and whether the differences between readings varied in any way with the size of the measures. Where there appeared no evidence of varying differences, findings were summarised as the mean and 95% limits of agreement. The distributions were plotted to test for normality. The mean of the differences is an estimate of bias, and was assessed by t test for evidence of bias. To determine whether the variances were significantly different, the F test was used.
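The Bland–Altman summary described above can be sketched as follows. This is an illustration only, not the authors' analysis code: the readings are invented, the function name `bland_altman_summary` is hypothetical, and the limits are taken as the mean difference plus or minus 1.96 standard deviations of the differences.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman_summary(obs1, obs2, z=1.96):
    """Return bias, lower and upper 95% limits of agreement, and t statistic.

    obs1 and obs2 are paired readings from two observers on the same eyes.
    """
    diffs = [a - b for a, b in zip(obs1, obs2)]
    bias = mean(diffs)                    # mean difference estimates bias
    sd = stdev(diffs)                     # sample SD of the differences
    t_stat = bias / (sd / sqrt(len(diffs)))  # one sample t test of bias = 0
    return bias, bias - z * sd, bias + z * sd, t_stat


# Invented cup/disc ratio readings for four eyes (illustrative only)
observer_1 = [0.3, 0.4, 0.5, 0.6]
observer_2 = [0.3, 0.5, 0.4, 0.6]
bias, lower, upper, t_stat = bland_altman_summary(observer_1, observer_2)
print(bias, lower, upper, t_stat)
```

The F test then compares the variance of the differences between two groups as a ratio of the two sample variances. Obtaining its p value needs the F distribution, which the Python standard library does not provide, so that step is left out of the sketch.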
To assist comparison with previous work, a good degree of agreement was defined as a difference of 0.1 or less in the cup/disc ratio measurement. The differences in the proportion of examinations reaching this standard were then compared using the χ2 test.
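As an illustration of this comparison, the sketch below applies a Pearson χ2 test to the vertical cup/disc ratio counts reported in the abstract (37/52 v 33/34 for study A, 34/45 v 40/42 for study B). The text does not say whether a continuity correction was applied, so the uncorrected statistic is an assumption, and the function name `chi2_2x2` is hypothetical.

```python
from math import erfc, sqrt


def chi2_2x2(good_1, n_1, good_2, n_2):
    """Uncorrected Pearson chi-squared test for two proportions.

    Returns the statistic and its p value on 1 degree of freedom.
    """
    a, b = good_1, n_1 - good_1   # group 1: good agreement, poorer agreement
    c, d = good_2, n_2 - good_2   # group 2: good agreement, poorer agreement
    n = n_1 + n_2
    chi2 = n * (a * d - b * c) ** 2 / (n_1 * n_2 * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Vertical cup/disc ratio: without mydriasis v with mydriasis
chi2_a, p_a = chi2_2x2(37, 52, 33, 34)   # study A
chi2_b, p_b = chi2_2x2(34, 45, 40, 42)   # study B
print(chi2_a, p_a)
print(chi2_b, p_b)
```

Under these assumptions both comparisons are significant at the 5% level, in line with the differences described in the abstract.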
For the examiners in study A, 86 patients were examined and, of these, 52 were examined before mydriasis and 34 were examined after mydriasis.
For study B, 87 patients were examined and 45 examined before mydriasis and 42 were examined after mydriasis.
It can be seen that average agreement between observers did not differ significantly with use of mydriasis; however, it should be noted that the 95% limits of agreement were narrower when mydriasis was used than when not. This was true for each variable except disc diameter values in study B.
These 95% limits of agreement values give an indication of the degree of agreement. Taking the mean of vertical and horizontal cup/disc ratios for both studies A and B where examination was performed after mydriasis, the mean of their 95% limits of agreement was 0.1334. For the corresponding examinations before mydriasis, the mean 95% limits of agreement value was 0.2697. Figures 1 and 2 show the Bland–Altman plots of vertical cup/disc ratio, and vertical disc diameter for study A.
Overall, the optic disc diameters were not significantly different for patients examined with and without mydriasis in either study A or B. Cup/disc ratios were not significantly different comparing examinations with and without mydriasis in both groups. This indicates that a difference between the “non-mydriatic” and “mydriatic” groups is unlikely to have significantly affected the overall results. However, patients examined in study A had a larger mean cup/disc ratio than those in study B, reflecting a higher proportion of glaucoma patients in this study.
There was no evidence of significant measurement bias for any measure except vertical disc diameter in study B, where a small positive bias was evident for both non-mydriatic and mydriatic examinations (mean values 0.056 (p=0.002) and 0.054 (p<0.001) respectively). This was the study with the least experienced observer.
Table 3 shows the proportion of examinations where a good degree of agreement was found for cup/disc ratio measurement in studies A and B. As can be seen, significant differences are evident between the groups for both vertical and horizontal measurements.
These data indicate that examination of the optic disc without pupillary dilatation markedly impairs the interobserver agreement of cup/disc ratio measurements in this setting. For two sets of observers, significant differences were found for both vertical and horizontal measurements. In general, the standard deviation values were approximately doubled for both sets of observers when the examination was performed before pupillary dilatation. These data are supported by both parametric F tests and by comparison of the proportions of measurements with a given degree of agreement. Essentially this means that for a measured difference in cup/disc ratio to correspond with a “true” difference with 95% confidence, a difference of at least 0.13 needs to be detected with an examination with mydriasis. For a non-mydriatic examination, this measured difference in cup/disc ratio needs to be at least 0.27.
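The link between the doubled standard deviations and the 0.13 and 0.27 thresholds can be checked with a short calculation. It assumes that each reported 95% limits of agreement value is the half width of the interval (1.96 times the SD of the differences) and that bias is negligible. Neither assumption is stated explicitly in the text.

```python
Z_95 = 1.96  # multiplier for 95% limits of agreement

# Mean 95% limits of agreement for cup/disc ratio, from the results
loa_without_mydriasis = 0.2697
loa_with_mydriasis = 0.1334

# Implied SD of the interobserver differences under the assumption above
sd_without = loa_without_mydriasis / Z_95
sd_with = loa_with_mydriasis / Z_95

sd_ratio = sd_without / sd_with   # how much larger the SD is without mydriasis
variance_ratio = sd_ratio ** 2    # the quantity an F test compares
print(round(sd_without, 3), round(sd_with, 3))
print(round(sd_ratio, 2), round(variance_ratio, 2))
```

On these assumptions the SD ratio is close to 2, consistent with the statement that the standard deviations were approximately doubled without mydriasis.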
The data for the measurements of disc diameter were less clear. One pair of observers (A) demonstrated an increase in SD of approximately 50% when examination was performed before pupillary dilatation, whereas this was not evident for the other pair. Optic disc diameter measurements were not corrected for the optics of the eye, examination lens, and slit lamp. However, as all measurements were repeated with these held constant, correction was not necessary.
Variation of error across the range of measurement can be a problem, but the data showed a fairly even spread across the range of cup/disc ratio measurements, particularly in the mid range.
This study was designed to be relevant to clinical practice. Observers did not undergo a prestudy validation training because the purpose of this study was to assess the process of optic disc examination in a clinical outpatient setting rather than an optimised research setting, which might have produced more concordant results. Similarly, a standardisation chart to assist with cup/disc ratio measurements was not used. Patients with a poor fundal view were excluded as it was felt that these patients should always be examined with mydriasis so as to optimise the view. This study also excluded patients with a “difficult” optic disc; these patients require a mydriatic examination. The study focused on the more “normal” optic disc where it could be argued that an examination without mydriasis was reasonable. Two separate parallel studies (A and B) were performed to demonstrate reproducibility and improve external validity.
Many other workers have reported on interobserver and intraobserver reproducibility of optic disc examination. Results for examination with mydriasis are probably broadly comparable with those of others. Direct comparison between different studies is not always possible because different measurements and analysis techniques have been used, and different populations were examined. Some other studies have used weighted kappa values as a summary statistic. Kappa depends on the proportion of subjects in each category and it can be misleading to compare kappa values from different studies where prevalences differ. The use of weighted kappa has been criticised because the results are subjective; if different weights are used, results may not be comparable. Varma and coworkers compared stereoscopic and monoscopic photographs and found improved interobserver agreement for stereoscopic photographs. The median weighted kappas were 0.57 (monoscopic) and 0.67 (stereoscopic).7 Tielsch et al, using different criteria for weighted kappa values, found a mean kappa of 0.74 for vertical cup/disc ratio from stereo photographs.10 Haslett and coworkers have shown very good interobserver agreement with a modified 60 dioptre lens with a measuring graticule, using similar methods of analysis as this study.11 Spencer and Vernon similarly demonstrated good levels of interobserver agreement for disc diameter measurements with the Zeiss 4 mirror lens and 78 dioptre lens.13 14 A study using the confocal scanning laser ophthalmoscope demonstrated that coefficients of variation were significantly increased when pilocarpine was administered to subjects, and the authors suggested caution in interpreting such data.15
An alternative study design would be to compare the intraobserver agreement. Study of interobserver agreement was chosen for several reasons. Firstly, as glaucoma patients are likely to be examined by many clinicians during their lifetime, it was felt that in this setting the degree of agreement between different observers would be more helpful. Secondly, the study was logistically simpler as patients did not need to be recalled (and there was no need to consider any “clinician memory effects”). Thirdly, as intraobserver studies tend to have much higher levels of agreement, a much larger study would have been necessary.
Potential bias is always an issue in any study, and this was reduced as follows. The patients examined with and without mydriasis were not significantly different from each other in either group. By randomisation we hoped to reduce any selection bias. The examiners were masked throughout the study from the corresponding results of the other examiner. It was clearly not possible to prevent examiners from being aware whether the examination was performed with or without mydriasis, and this is a potential source of bias.
The greater effect of mydriasis on measurement of the cup/disc ratio compared with disc diameter is likely to be due to the need to examine the contour of the neuroretinal rim margin. This is more likely to require a binocular image than optic disc size measurement. The latter is likely to depend on detection of an edge rather than a three dimensional contour. This study did not consider the importance of other subtle features in the optic disc examination such as focal neuroretinal rim loss, blood vessel contour, and peripapillary atrophy. These were not examined in this study, but it is not unreasonable to infer that detection and description of these may be impaired where mydriasis is not used.
Obviously, this study has only addressed the examination of the optic disc, which is most relevant for assessment of glaucoma patients and suspects. Better agreement levels without mydriasis could possibly have been obtained if observers had undergone a prestudy validation exercise, possibly with a standardising chart. Such techniques may be appropriate in an epidemiological study; however, for this study we were concerned with the “outpatient clinic” setting. Given that the (mydriatic) reproducibility results are broadly comparable with other studies, the data are probably reasonably generalisable for epidemiological as well as clinical purposes.
In current practice, it is important to provide evidence for allocating limited time and resources. These data strongly suggest that examination of the optic disc without mydriasis significantly impairs the clinical examination of glaucoma patients.
The authors thank Mr Ian Murdoch and Mr Richard Wormald for their helpful comments.