Aim: To investigate the repeatability and sensitivity of two commonly used sine wave patch charts for contrast sensitivity (CS) measurement in cataract and refractive surgery outcomes.
Methods: The Vistech CS chart and its descendant, the Functional Acuity Contrast Test (FACT), were administered in three experiments: (1) Post-LASIK and age matched normal subjects; (2) Preoperative cataract surgery and age matched normal subjects; (3) Test-retest repeatability data in normal subjects.
Results: Contrast sensitivity was similar between post-LASIK and control groups and between the Vistech and FACT charts. The percentages of subjects one month post-LASIK achieving the maximum score across spatial frequencies (1.5, 3, 6, 12, 18 cycles per degree) were 50, 33, 13, 13, and 0 respectively for FACT, but only 0, 0, 13, 4, and 0 respectively for Vistech. A small number of cataract patients also registered the maximum score on the FACT, but up to 60% did not achieve the minimum score. Test-retest intraclass correlation coefficients varied from 0.28 to 0.64 for Vistech and 0.18 to 0.45 for FACT. Bland-Altman limits of agreement across spatial frequencies were between ±0.30 and ±0.85 logCS for Vistech, and ±0.30 to ±0.75 logCS for FACT.
Discussion: The Vistech was confirmed as providing poorly repeatable data. The FACT chart, likely because of a smaller step size, showed slightly better retest agreement. However, the reduced range of scores on the chart due to the smaller step size led to ceiling (post-LASIK) and floor (cataract) effects. These problems could mask subtle differences between groups of patients with near normal visual function as found post-refractive or cataract surgery. The Vistech and FACT CS charts are ill suited for refractive or cataract surgery outcomes research.
- contrast sensitivity
- refractive surgery
- test chart design
Much has been written about the inadequacy of visual acuity (VA) as the sole measure of visual performance after refractive and cataract surgery and the need to measure visual outcome in terms of contrast vision.1–8 This is gaining acceptance and many refractive surgery studies have included a measure of vision in the contrast domain—either contrast sensitivity (CS),9–16 or low contrast visual acuity (LCVA).17–20 Many cataract surgery outcome studies have also included a measure in the contrast domain.21–26 However, what is less accepted is which tests of CS are best suited to such outcomes studies.
There are several commercially available clinical tests that measure CS, but the most commonly used are the Vistech in its various versions (including wall charts VCTS 6500 and the vision screener based MCT-8000)27–29 and similar charts such as the Vector Vision CSV-1000.30,31 These charts have the advantage over letter CS charts, such as the Pelli-Robson chart,32 in that they can measure CS at several spatial frequencies. The Vistech CS chart was first introduced in 1984,27 and contains circular photographic plates of sine wave gratings arranged in five rows (spatial frequencies 1.5, 3, 6, 12 and 18 cycles per degree (cpd)) and nine columns (contrast levels). The step sizes are irregular, but the average step size is about 0.25 log units with a range of 1.75 log units. The gratings are either vertical or tilted 15° to the right or left. The patient indicates the orientation of each grating, or responds “blank” if nothing is seen. It is therefore essentially a criterion dependent method as the patient is allowed to decide when they cannot see a grating, and cautious observers may give slightly low CS values. However, there is a 3-alternative forced choice (AFC) check on “risk taking” patients as they must indicate the orientation of the gratings. The Vistech charts have been widely used to measure CS in cataract,21,23,33–37 and to assess changes after refractive surgery where they have typically shown no significant decrease in CS.9–11,29,38–41 However, the poor test-retest repeatability of the Vistech charts42–46 could obscure subtle differences between normal and abnormal. The Vistech charts consistently show very poor test-retest correlations of between 0.25 and 0.61 (an average of 0.48).42–46
The “second generation” Vistech chart, the Functional Acuity Contrast Test (FACT),47 which has also been used in refractive surgery studies,12,48–52 uses the same format as the Vistech: circular photographic plates arranged in five rows and nine columns; the same spatial frequencies; the same grating orientations. It differs in using smaller step sizes (0.15 log units) and an AFC method, presumably to try to improve repeatability. Given that the number of steps has not changed, a consequence of the smaller step size is a smaller range of scores for the FACT chart compared to the Vistech (Fig 1). It also has “blurred” grating patch edges with the gratings smoothed into a grey background and a larger patch size so that an increased number of cycles are presented at low spatial frequency.
The repeatability and sensitivity of the FACT chart to cataract or refractive surgery changes have not previously been reported, and the chart has not been compared to its predecessor, the Vistech chart. In this study, we assessed the sensitivity of the Vistech and FACT CS charts to changes after refractive surgery (experiment 1), assessed the sensitivity of the FACT CS chart in cataract subjects (experiment 2) and compared the test-retest repeatability of the FACT and Vistech charts (experiment 3). In addition, we investigated the clinical usefulness of having CS data from five spatial frequencies using factor analysis.
For all experiments, informed consent was obtained from all subjects after the nature of the study had been fully explained. The tenets of the Declaration of Helsinki were followed and the study gained approval from both the Bradford University and Leeds Regional Ethical Committees. For experiment 1, inclusion criteria were healthy eyes with a VA better than 0.1 logMAR (6/7.5) for the normal group, and previous LASIK refractive surgery for the refractive surgery group. For experiment 2, inclusion criteria were subjects aged 60 years or older with normal healthy eyes or presenting for cataract surgery (no VA limit); and for experiment 3, inclusion criteria were age 18 years or older with normal healthy eyes and a VA better than 0.1 logMAR (6/7.5). Exclusion criteria were any ocular pathology (other than cataract for the cataract subject group) or abnormality including amblyopia and strabismus; any previous ocular surgery (other than LASIK for the post-LASIK group); any neurological problem, any systemic disease, taking of any medication which may affect contrast sensitivity, inability to speak English sufficiently to be instructed to perform the tests, insufficient mental ability to perform the tests, and physical disability which would make it arduous to perform the tests (for example, wheelchair bound).
Contrast sensitivity and VA data were compared between 27 subjects at least five weeks (range 5–64 weeks, mean 22.4 (SD 18.1) weeks) after LASIK surgery (mean age 41.1 (SD 9.8) years) and 27 subjects with normal, healthy eyes (mean age 38.8 (SD 9.8) years). Both postoperative and control subjects were recruited from a refractive surgery centre (Ultralase, Leeds, UK). The groups were similar in age (ANOVA F1,52 = 0.71, p = 0.40).
Contrast sensitivity (FACT) and VA data were compared between 53 subjects with early cataract (age 73.3 (SD 7.4) years, VA 0.19 (SD 0.23) logMAR, Snellen 6/9) and 23 subjects with normal, healthy eyes (age 70.3 (SD 4.2) years, VA −0.03 (SD 0.08) logMAR, Snellen 6/6+). Cataract subjects were recruited from the ophthalmology pre-assessment clinic of one of the authors (RMLD) at Leeds General Infirmary, Leeds, UK and control subjects were recruited from the Eye Clinic at the University of Bradford. The groups were similar for age (ANOVA F1,74 = 3.40, p>0.05).
Thirty three subjects with normal, healthy eyes (mean age 31.6 (SD 15.1) years) had CS measurements repeated with a test-retest time of approximately one week. All subjects were recruited from the Eye Clinic at the University of Bradford.
In all experiments, CS was measured with the Vistech and FACT charts using the manufacturer’s recommended testing procedure. Measurements were made monocularly with optimal refractive correction and natural pupil, with a chart luminance of 120 cd/m² and a working distance of 3 m. The orders of test measurement and of spatial frequency measurement within each test were randomised. With the Vistech chart, the plate furthest along each row correctly seen by each subject determined CS and subjects were allowed to state that they could not see any gratings. As recommended by the manufacturers, a strict 3-AFC measurement paradigm was used with the FACT chart and subjects were forced to guess at a plate that they indicated they could not see. Visual acuity was measured using a Bailey-Lovie logMAR chart with a chart luminance of 160 cd/m², a working distance of 4 m and by-letter scoring.
The data were inspected for compliance with normality and significant differences between the groups were tested with one way analysis of variance (ANOVA). Test-retest reliability was determined by calculating the intraclass correlation coefficient (ICC), and the limits of agreement by the method of Bland and Altman.53 Factor analysis was performed to investigate for redundancy within each CS chart. The results from the five spatial frequencies along with VA were included in the analysis with the number of factors (with eigenvalues greater than 1.0) and the correlations taken from the Varimax rotated solution. These analyses were performed on SPSS v 10.1 for Windows (SPSS Inc, Chicago, IL, USA).
The post-LASIK group had slightly worse VA (mean −0.04 (SD 0.08) logMAR, Snellen 6/6++) compared to the controls (−0.09 (SD 0.06) logMAR, Snellen 6/5) (ANOVA F1,52 = 8.10, p<0.01). The mean CS data for both charts and for the post-LASIK and control groups are shown in Figure 2. The results from the Vistech and FACT charts were similar (2-factor ANOVA, p>0.05) within both control and post-LASIK groups, although a significant interaction effect (p<0.05) indicates that there are some significant differences between the chart scores from individual spatial frequencies. The FACT chart gave average scores higher than the Vistech for 1.5 and 3.0 cpd, but lower, or similar, scores for all other spatial frequencies. The post-LASIK group had similar FACT CS to the controls for all spatial frequencies (p>0.05) except at 1.5 cpd (p<0.05) where the LASIK group gave higher CS scores. The Vistech chart found the LASIK group had reduced CS at three spatial frequencies (p<0.05), and improved CS at one spatial frequency (p<0.001) compared to the control group.
There was a ceiling effect with many post-LASIK and control subjects scoring the highest CS value possible on the chart for several spatial frequencies. The proportions are listed in table 1. This ceiling effect is much greater on the FACT chart than on the Vistech. To further investigate this ceiling effect, CS was measured in an additional nine LASIK patients who were seen within one week of surgery (mean age 35.2 (SD 7.4) years, VA −0.05 (SD 0.05) logMAR, Snellen ~6/5−1). This group was not combined with the longer term follow up group as previous studies have shown larger losses of CS in the immediate postoperative period, recovering completely or to only subtle losses in the long term.38,48,54 However, the percentages of subjects achieving the maximum score on the FACT were similar to those of the normal and one month post-LASIK groups (table 1).
Factor analysis of the five CS results for each test and VA yielded two factors for both charts (table 2). The correlations of spatial frequencies and factors show that there is essentially a low spatial frequency factor (1.5 and 3.0 cpd) and a high spatial frequency factor (VA, 6, 12, and 18) although the 6.0 cpd data for FACT could equally be included in either.
The cataract group had poorer FACT CS, at every spatial frequency, than the normal group (ANOVA F1,74 = 19.99 to 44.85, p<0.001) (table 3). This confirms the sensitivity of the FACT to cataract; however, the groups were also different in terms of VA (cataract 0.19 (SD 0.23) logMAR, Snellen 6/9; normal subjects −0.03 (SD 0.08) logMAR, Snellen 6/6+1; ANOVA F1,74 = 20.88, p<0.001). The proportions of cataract and control subjects who achieved the top score for each spatial frequency on the FACT chart are listed in Table 3. The ceiling effect is notable in the normal group, and similar to that in the normal and LASIK groups from experiment 1. However, a small ceiling effect also exists in the cataract group (Table 3). There is also a large floor effect in the cataract group, especially at higher spatial frequencies, where subjects fail to see the first target and thereby fail to register a score.
Vistech and FACT CS were tested twice on a group of 33 normal subjects. There were no significant differences between test and retest at any spatial frequency (p>0.05). These subjects were younger (mean (SD) age 31.6 (15.1) years) than the groups used in experiments 1 and 2. The mean log CS scores are similar to those seen in the normal group in experiment 1, although slightly better at several spatial frequencies. The repeatability of the CS tests was determined using ICC, the coefficient of repeatability (COR) and the 95% limits for change (Table 4). The COR is calculated as 1.96 times the standard deviation of the differences between the test and retest scores.53 For measures that use a continuous score, the COR provides a criterion for statistically significant change. For tests that do not measure on a continuous scale, the criterion for significant change falls at the next log CS level above the coefficient of repeatability. Therefore, if the COR was ±0.23 log CS, but the chart used step sizes of 0.10 log CS, the criterion for change would be ±0.30 log CS or ±3 steps. The 95% limits of agreement are derived by adding and subtracting the COR from the mean difference.53 Due to the coarseness of the step sizes, the practical 95% limits of agreement would be one log CS step above this figure (Table 4).
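The repeatability calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SPSS procedure used in the study: the function names are hypothetical, the sample (n − 1) standard deviation is assumed, and the rounding rule (round the COR up to a whole number of chart steps) is one reading of "the next log CS level above the coefficient of repeatability".

```python
import math
import statistics


def coefficient_of_repeatability(test, retest):
    """COR = 1.96 x SD of the retest-minus-test differences (sample SD assumed)."""
    diffs = [r - t for t, r in zip(test, retest)]
    return 1.96 * statistics.stdev(diffs)


def limits_of_agreement(test, retest):
    """95% limits of agreement: mean difference minus and plus the COR."""
    diffs = [r - t for t, r in zip(test, retest)]
    mean_diff = statistics.mean(diffs)
    cor = coefficient_of_repeatability(test, retest)
    return mean_diff - cor, mean_diff + cor


def practical_criterion(cor, step):
    """Round the COR up to a whole number of chart steps (assumed rule).

    Returns (number of steps, criterion in log CS).
    """
    # round() guards against floating point noise such as 0.45 / 0.15
    steps = math.ceil(round(cor / step, 9))
    return steps, round(steps * step, 2)


# Worked example from the text: COR of 0.23 log CS on a chart with 0.10 log CS steps
print(practical_criterion(0.23, 0.10))
```

With the values quoted in the text (COR ±0.23 log CS, step size 0.10 log CS), the sketch returns three steps, or ±0.30 log CS.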
The higher CS value at 1.5 cpd on the FACT is likely to be because the low spatial frequency target is larger and thus displays more cycles, which would improve CS.28 The Vistech gave higher average readings at all other spatial frequencies probably because the highest CS values attainable on the FACT are lower than those on the Vistech (fig 1, table 1). Therefore a greater proportion of the subjects scored the maximum CS value on the FACT (42%) compared with the Vistech (10%) (table 1). Even within one week of surgery, when previous studies have shown that the greatest reductions in CS after refractive surgery occur,38,48,54 at least 33% showed maximal CS scores at 1.5, 3, and 6 cpd and 11% at 18 cpd. An additional reason why so many subjects scored the maximum on the FACT chart is because a strict 3-AFC method was used as suggested by the manufacturers. This gives a 33% probability of a subject scoring one step above their threshold due to chance, and an 11% probability of scoring two steps above threshold.
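The chance figures quoted above follow from treating each guess beyond threshold as an independent choice among three orientations. A minimal sketch of that arithmetic, with a hypothetical function name and the independence of guesses as an assumption:

```python
from fractions import Fraction


def chance_of_lucky_steps(n_alternatives, extra_steps):
    """Probability of guessing `extra_steps` consecutive plates correctly
    when each plate offers `n_alternatives` equally likely responses."""
    return Fraction(1, n_alternatives) ** extra_steps


for k in (1, 2):
    p = chance_of_lucky_steps(3, k)
    print(f"{k} step(s) above threshold: {p} = {float(p):.0%}")
```

For a 3-AFC task this gives 1/3 (about 33%) for one step and 1/9 (about 11%) for two steps, matching the figures in the text.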
Although there is a strong ceiling effect with the FACT chart, there is only a minor ceiling effect with the Vistech chart (table 1). It seems that the modification of the first generation Vistech to create the second generation FACT by reduction in step size without an increase in the number of steps has created a FACT chart with a truncated scale (fig 1), which fails to discriminate between subjects with good contrast sensitivity. So many cases reaching the ceiling of the chart is a serious problem as the FACT chart is missing the most important part of the scale if it were to be used for detecting any subtle loss of CS caused by refractive surgery or, possibly, intraocular lenses.
The advantage of sine wave grating CS tests is that they can measure CS at different spatial frequencies. However, this assumes that CS from neighbouring spatial frequencies provides useful additional information, which may not be the case.42,55 Principal components factor analysis with Varimax orthogonal transformation indicated that measurements of the grating CS tests can safely be summarised by two scores, one at low spatial frequency and the other at high (Table 2). In addition, VA was highly covariant with the high spatial frequency factor. This suggests that the two higher spatial frequency results are not necessary as the same information may be more reliably provided by a logMAR VA chart.45,46,56 Therefore if VA is already reliably measuring the high spatial frequency end of the contrast sensitivity function, all that remains is to measure low spatial frequencies reliably. This could be done with the Pelli-Robson contrast sensitivity chart, which is sensitive and reliable and free from ceiling and floor effects.32,45,46,57 For the Vistech and FACT with superfluous low spatial frequency data, the three lower spatial frequency scores could be considered together to improve reliability and repeatability. In addition, as the grey area on the results sheet represents the 90% limits of normal, the Vistech or FACT score could be taken to be abnormal if two of the three low frequency scores are below the grey area. 1 − (0.95)³ = 14.3%, so that there is a 2% probability (0.143 × 0.143 = 0.020) that two of the three values will be below the grey area due to chance.55 This seems to be an acceptable level of false positive results as most charts give the 2.5th percentile as the lower limit of normal.
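The false positive arithmetic in the preceding paragraph can be reproduced as follows. The first function mirrors the calculation exactly as stated in the text. The second is offered as a point of comparison only and is not part of the original analysis: it gives the binomial probability of at least two of three scores falling below the limit under the added assumption that the three scores are independent, which real CS scores at neighbouring spatial frequencies are unlikely to be. Both function names are hypothetical.

```python
from math import comb


def text_false_positive_rate(p_below=0.05, n_scores=3):
    """Calculation as written in the text: (1 - (1 - p)^n) squared."""
    p_any = 1 - (1 - p_below) ** n_scores  # chance that at least one score is low
    return p_any, p_any * p_any


def binomial_at_least(k_min, n_scores=3, p_below=0.05):
    """Comparison only: P(at least k_min of n independent scores are low)."""
    return sum(
        comb(n_scores, k) * p_below**k * (1 - p_below) ** (n_scores - k)
        for k in range(k_min, n_scores + 1)
    )


p_any, p_text = text_false_positive_rate()
print(f"text: {p_any:.3f} squared = {p_text:.3f}")
print(f"independent binomial, at least 2 of 3: {binomial_at_least(2):.5f}")
```

Under either calculation the chance of two low scores arising by chance stays at or below the 2% level quoted in the text.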
The FACT chart is sensitive to the presence of early cataract in that depressed scores are seen in the cataract group compared with the normal group. However, ceiling and floor effects hamper accurate measurement of CS in cataract subjects. A strong ceiling effect is seen in the normal group, similar to that seen in normal subjects in Experiment 1, but a weak ceiling effect is seen even in the cataract group. At least two of 53 subjects, who were scheduled for cataract surgery, had contrast sensitivity at two spatial frequencies that was better than that which could be measured with the FACT chart. The strong floor effect with early cataract subjects does indicate that many of them have very poor CS, which is in line with previous studies, and may be clinically useful.58 However, their actual CS is not measured: a score of zero is assigned when their true CS falls somewhere within a one log unit range from zero to the minimum possible score (0.60–1.05 log CS depending upon spatial frequency) (fig 1, table 4). This has implications for research if a mean score were required, as true CS will be underestimated if a zero score is assigned or overestimated if the score is treated as missing data. Therefore the FACT chart is missing the most important part of the scale for differentiating patients with loss of CS due to cataract, and as such is a poor test for research in subjects with cataract or other eye diseases causing severe losses of CS.
There were no significant differences between mean test and retest scores (table 4), as would be expected if there were no significant training or fatigue effect. The poor reliability of Vistech CS chart measurements was indicated by the low ICCs, which ranged between 0.28 and 0.64 (table 4). Previous studies have also found low test-retest correlations of between 0.25 and 0.61.42–46 Poor repeatability was also illustrated with coefficients of repeatability between ±0.26 and ±0.54. Similar values of repeatability have been found previously.44,46 This is probably because the Vistech uses large step sizes (~0.25 log units), a small number of decisions at each level (one), a criterion dependent method and a low number of alternatives (three) to catch risk takers.44,46
The FACT chart uses a smaller step size (0.15 log units) and a fully forced choice method, but the average test-retest ICC is similar to or worse than that of the Vistech (0.34 v 0.46 respectively) and the average COR is only slightly better (±0.35 v ±0.40 log CS respectively). The poor ICC values may be because of the truncated nature of the FACT data. Many of the scores are at the maximum value (particularly at 1.5 cpd), so the score is very poor at discriminating between subjects. The retest agreement of the FACT chart is better (95% limits for change average of ±0.42 log CS for the FACT, compared with ±0.57 log CS for Vistech) probably due to the smaller step size used on the FACT chart (0.15 compared to an average of 0.25 log CS). As poor as these reliabilities are, they may overestimate reliability for older (such as cataract) subjects, as it has been previously shown that older subjects have greater variability.46
Using the FACT in forced choice mode may not be the best approach. Forced choice tests must contain a large number of trials, otherwise their reliability will be poor.59,60 In the 3-AFC mode, the FACT offers a 33% chance of correctly identifying the grating orientation with the eyes closed. It may be that, given the design of the FACT with one decision per level and only three alternatives, allowing the patient to respond that they cannot see a grating is preferable. Although when it is used in this manner it is criterion dependent as cautious subjects can set a lower criterion for threshold, the technique still provides a 3-AFC check on risk takers. It may also provide less truncated data, as fewer subjects might reach a maximum score. The CSV-1000 chart30,31 has also been used in refractive surgery studies.38,49,61–66 Its psychophysical design with one decision per level, a criterion dependent method with a 2-AFC check on risk takers, and a relatively small step size of about 0.16 log units, would suggest its repeatability should be similar to that of the FACT. The Pelli-Robson CS chart or VA charts using Bailey-Lovie LogMAR design features, with three or five decisions per level, approximately 10 (or 26) alternative choices, and a step size of 0.15 or 0.10 log units have both been shown to provide more repeatable measurements of CS or low contrast VA.45,46
These reliability data must influence the way any differences in results in experiments are interpreted. The LASIK subjects appear to have lower contrast sensitivity at three spatial frequencies, and higher at one spatial frequency than the normal subjects on the Vistech chart but no differences on the FACT chart (fig 2). A sample size calculation based on the differences between normal and LASIK subjects (accounting for unequal variance between groups), for a power of 0.80 and a type I error rate (alpha) of 0.05, moderated by repeatability (sample size divided by ICC) gave minimum sample sizes of (1.5 cpd, n = 20 282; 3.0 cpd, n = 84; 6.0 cpd, n = 88; 12 cpd, n = 64; 18 cpd, n = 124) total cases to show a difference between groups. Therefore studies reporting results of small series with the Vistech or FACT may not be valid.
The drive to measure visual outcome of cataract and refractive surgery in the contrast domain1–4,6–8 is not aided by these negative findings for the Vistech and FACT charts. In addition to possible reliability problems discussed above, the psychophysical design of other photographic patch tests of CS such as the CSV-1000 chart30,31 with four spatial frequencies (3, 6, 12, 18 cpd), minimum CS values of (0.71, 0.91, 0.72, 0.17 log CS) and maximum values of (2.08, 2.29, 1.99, 1.55 log CS) suggests it may also suffer from ceiling effects in near normal subjects and floor effects in cataract subjects, similar to the FACT. However, other tests, including low contrast visual acuity, Pelli-Robson and monitor based CS are more reliable and sensitive to the vision changes seen with cataract and refractive surgery, as well as being free from ceiling and floor effects.45,46,56,67–70
This research was supported by a grant from the Vision Research Trust. The authors thank Stereo Optical Co for their loan of a FACT chart. We would also like to thank Ultralase, 77–85 Albion St, Leeds, LS1 5AP, UK for facilitating access to refractive surgery patients. NHMRC Sir Neil Hamilton Fairley Fellowship 007161 supports K Pesudovs.