Background/aims Retinal screening programmes in England and Scotland have similar photographic grading schemes for background (non-proliferative) and proliferative diabetic retinopathy, but diverge over maculopathy. We looked for the most cost-effective method of identifying diabetic macular oedema from retinal photographs, including the role of automated grading and optical coherence tomography, a technology that directly visualises oedema.
Methods Patients from seven UK centres were recruited. The following features in at least one eye were required for enrolment: microaneurysms/dot haemorrhages or blot haemorrhages within one disc diameter, or exudates within one or two disc diameters of the centre of the macula. Subjects had optical coherence tomography and digital photography. Manual and automated grading schemes were evaluated. Costs and QALYs were modelled using microsimulation techniques.
Results 3540 patients were recruited, 3170 were analysed. For diabetic macular oedema, England's scheme had a sensitivity of 72.6% and specificity of 66.8%; Scotland's had a sensitivity of 59.5% and specificity of 79.0%. When applying a ceiling ratio of £30 000 per quality-adjusted life year (QALY) gained, Scotland's scheme was preferred. Assuming automated grading could be implemented without increasing grading costs, automation produced a greater number of QALYs for a lower cost than England's scheme, but was not cost-effective, at the study's operating point, compared with Scotland's. The addition of optical coherence tomography to each scheme resulted in cost savings without reducing health benefits.
Conclusions Retinal screening programmes in the UK should reconsider the screening pathway to make best use of existing and new technologies.
- Diagnostic tests/Investigation
Diabetic retinal screening programmes in the UK differ over how surrogate photographic markers are used to screen patients for diabetic macular oedema. England uses exudates within two disc diameters of the centre of the macula and, if visual acuity is reduced, blot haemorrhages and microaneurysms/dot haemorrhages within one disc diameter. Scotland only uses exudates and blot haemorrhages within one disc diameter, regardless of the visual acuity.
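The two referral rules described above can be expressed as a minimal sketch. The field names, the `EyeFindings` container and the acuity cut-off used for "reduced" visual acuity are illustrative assumptions; the text does not state the cut-off, so it is left as a parameter.

```python
from dataclasses import dataclass


@dataclass
class EyeFindings:
    """Photographic findings for one eye (hypothetical field names)."""
    exudates_within_1dd: bool = False
    exudates_within_2dd: bool = False      # exudates anywhere within two disc diameters
    blot_haem_within_1dd: bool = False
    ma_dot_haem_within_1dd: bool = False   # microaneurysms/dot haemorrhages
    logmar: float = 0.0                    # best recorded visual acuity


# Placeholder only: the text says "if visual acuity is reduced" without a value.
REDUCED_VA_LOGMAR = 0.3


def england_refers(eye: EyeFindings, va_cutoff: float = REDUCED_VA_LOGMAR) -> bool:
    """Exudates within two disc diameters, or haemorrhages/microaneurysms
    within one disc diameter when visual acuity is reduced."""
    if eye.exudates_within_1dd or eye.exudates_within_2dd:
        return True
    reduced_va = eye.logmar >= va_cutoff
    return reduced_va and (eye.blot_haem_within_1dd or eye.ma_dot_haem_within_1dd)


def scotland_refers(eye: EyeFindings) -> bool:
    """Exudates or blot haemorrhages within one disc diameter, regardless of acuity."""
    return eye.exudates_within_1dd or eye.blot_haem_within_1dd


def patient_referred(scheme, left: EyeFindings, right: EyeFindings) -> bool:
    """A patient is referred if the scheme indicates referral in either eye."""
    return scheme(left) or scheme(right)
```

For example, an eye with only microaneurysms within one disc diameter and reduced acuity would be referred under the England rule but not under the Scotland rule.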
We investigated the accuracy and cost-effectiveness of these schemes using optical coherence tomography (OCT), a technology that directly visualises oedema, as the reference standard. Additionally, we investigated the accuracy and cost-effectiveness of automated grading and the role of OCT in screening for diabetic macular oedema.1–3
Materials and methods
This was a multi-centre, prospective, observational cohort study. Participants with diabetes were recruited from retinopathy screening and ophthalmology in Aberdeen, Birmingham, Dundee, Dunfermline, Edinburgh, Liverpool and Oxford. Patients aged 18 or older who gave informed consent were included. The following photographic features in at least one eye were required for recruitment: microaneurysms/dot haemorrhages or blot haemorrhages within one disc diameter or exudates within one or two disc diameters of the centre of the macula. Exclusions were: pregnancy; contra-indications to dilatation; intraocular surgery within 1 year; macular or pan-retinal laser treatment; or intraocular injection. The reference standard was an adequate OCT image of both eyes. Patients were omitted from analysis if they had an inadequate OCT image in either eye. Patients with an adequate retinal photograph in one eye were included.
To avoid inter-centre variation, OCT operators submitted a portfolio of images for accreditation.
A 45° macula-centred colour digital retinal photograph (3–8 megapixels, with or without JPEG compression) was obtained from each eye. OCT images were obtained from each eye producing a nine subfield ‘Early Treatment Diabetic Retinopathy Study’ (ETDRS) map showing average regional thickness, and a horizontal cross-section through the centre of the macula or the region of greatest thickness.4 The outer four regions were disregarded. Best logMAR visual acuity was recorded unaided, with pinhole or with glasses. There was a maximum of 4 weeks between photograph and OCT scan.
All images were graded and annotated by a quality assured grader (94.3% sensitivity, 95.7% specificity, for referable retinopathy/maculopathy, 2012)5 prior to reviewing the OCT data. Borderline images were referred to a senior ophthalmologist.
Diabetic macular oedema was deemed present if:
Central ETDRS region thickness >250 µm or any of inner five regions >300 µm;
AND visible intraretinal cyst or area of subretinal fluid on OCT cross-section;
AND no other visible cause for macular oedema, for example, vein occlusion.
Thickness thresholds were adjusted to account for all scanners used in the study.3
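The reference-standard definition above can be sketched as a single predicate. The function name and argument layout are assumptions for illustration. The default cut-offs are the unadjusted values quoted in the criteria, and they are parameters because the study adjusted thresholds per scanner.

```python
from typing import Sequence


def has_dmo(
    central_um: float,
    inner_regions_um: Sequence[float],
    cyst_or_subretinal_fluid: bool,
    other_cause: bool,
    central_cutoff_um: float = 250.0,
    inner_cutoff_um: float = 300.0,
) -> bool:
    """Return True when all three OCT criteria for diabetic macular oedema hold.

    central_um: average thickness of the central ETDRS region.
    inner_regions_um: average thicknesses of the inner regions considered
        (the outer four regions are disregarded).
    cyst_or_subretinal_fluid: visible on the OCT cross-section.
    other_cause: another visible cause of oedema, for example vein occlusion.
    """
    thickened = central_um > central_cutoff_um or any(
        t > inner_cutoff_um for t in inner_regions_um
    )
    return thickened and cyst_or_subretinal_fluid and not other_cause
```

Passing scanner-specific cut-offs through the keyword arguments keeps the rule itself unchanged across devices.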
England's scheme, Scotland's scheme and a hybrid scheme, using features from both, were assessed (table 1).
A fully automated grading scheme was developed using existing software.6 ,7 Automated inputs included: image feature intensity; image clarity; counts of microaneurysms/dot haemorrhages within one disc diameter and two disc diameters; likelihoods of haemorrhages within one disc diameter and anywhere in the image; likelihoods of exudates within one disc diameter, two disc diameters and anywhere in the image; and visual acuity.
Patients with inadequate quality photographs, but no referable disease, were sent for slit-lamp examination, reflecting clinical practice. For automated grading, it was assumed that patients assigned the outcome of inadequate quality photographs would be referred for manual grading (hybrid grading scheme), and then to slit-lamp, if manual grading concurred.
To identify sampling bias, patients were classified into a hierarchy of five mutually exclusive categories of features present in either eye:
Exudates within one disc diameter.
Blot haemorrhages, but no exudates, within one disc diameter.
Microaneurysms/dot haemorrhages, but no exudates or blots, within one disc diameter.
Exudates within one–two disc diameters with no relevant diabetic retinopathy features within one disc diameter.
None of the above.
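The hierarchy above can be sketched as an ordered series of checks, where the first matching feature decides the category. The function and argument names are illustrative assumptions, and each flag is taken to mean "present in either eye".

```python
def feature_category(
    exudates_1dd: bool,
    blot_haem_1dd: bool,
    ma_dot_haem_1dd: bool,
    exudates_1_to_2dd: bool,
) -> int:
    """Assign one of five mutually exclusive categories, highest priority first."""
    if exudates_1dd:
        return 1  # exudates within one disc diameter
    if blot_haem_1dd:
        return 2  # blot haemorrhages, no exudates, within one disc diameter
    if ma_dot_haem_1dd:
        return 3  # microaneurysms/dot haemorrhages only, within one disc diameter
    if exudates_1_to_2dd:
        return 4  # exudates at one to two disc diameters, nothing relevant inside one
    return 5      # none of the above
```

Because the checks are ordered, a patient with both exudates and blot haemorrhages within one disc diameter falls into category 1 only.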
Weighting was undertaken to correct for sampling bias based on observed proportions of the above categories in a consecutive cohort of 6900 patients attending retinal screening in Grampian.8 ,9 Each weight10–12 was calculated as the ratio of the observed proportion in the cohort study9 to that in the present study.
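As a minimal sketch of the weight calculation, each category weight is the proportion seen in the reference cohort divided by the proportion seen in the study sample. The function name and the example proportions in the usage note are hypothetical and are not the values in table 3.

```python
from collections import Counter
from typing import Dict, Sequence


def category_weights(
    study_categories: Sequence[int],
    cohort_proportions: Dict[int, float],
) -> Dict[int, float]:
    """Weight = cohort proportion / study proportion, per feature category.

    study_categories: category label of every analysed study patient.
    cohort_proportions: proportion of each category in the screening cohort.
    """
    n = len(study_categories)
    counts = Counter(study_categories)
    return {
        category: cohort_proportions[category] / (count / n)
        for category, count in counts.items()
    }
```

With an invented sample in which category 1 makes up half of the study but only a quarter of the cohort, its weight is 0.5, so those patients are down-weighted.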
For both weighted and unweighted data, the sensitivity and specificity of using each investigated scheme were estimated at the patient level.13 For these calculations, referral of the patient corresponded to a scheme applied to both eyes separately indicating referral in either eye (or both).
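A weighted patient-level sensitivity and specificity can be sketched as below. The record layout `(referred, has_dmo, weight)` is an assumption for illustration. Setting every weight to 1 gives the unweighted estimates.

```python
from typing import Iterable, Tuple


def weighted_sens_spec(
    records: Iterable[Tuple[bool, bool, float]],
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from weighted patient records.

    Each record is (referred, has_dmo, weight), where `referred` is True
    if the scheme indicated referral in either eye.
    """
    tp = fn = tn = fp = 0.0
    for referred, has_dmo, weight in records:
        if has_dmo:
            if referred:
                tp += weight
            else:
                fn += weight
        else:
            if referred:
                fp += weight
            else:
                tn += weight
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity
```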
A Markov microsimulation model was developed to assess the cost-effectiveness of the alternative grading schemes for triggering referral in the context of annual screening, with and without OCT prior to referral. A time horizon of 20 years was adopted. Based on epidemiological and clinical effectiveness data, the model simulated the progression of macular oedema and visual loss in each eye of referred and un-referred patients. The model assumed that patients with macular oedema would receive laser treatment while those not referred would be screened 1 year later.14 ,15 An alternative scenario was modelled whereby only those with macular oedema and visual acuity ≥0.3 logMAR received laser. Healthcare costs associated with photographic screening (£46.69 per patient), addition of OCT to the screening pathway (£31.96 per patient), initial referral (£143.35), treatment (£160 per treatment per eye) and ongoing monitoring (£117 per visit) were estimated from a resource use questionnaire sent to participating centres and other published sources.16 ,17 Health and social care costs of severe vision loss (£6295 per year) were taken from a previous study18 (see online supplementary appendix).
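To illustrate how the unit costs above interact with a scheme's accuracy, the following simplified sketch estimates the expected cost per patient for a single screening round. It is not the study's microsimulation model. It assumes that OCT is applied only to patients flagged by photographic grading, that OCT classifies oedema perfectly, and that treatment and monitoring costs are ignored. The function name is hypothetical.

```python
# Unit costs quoted in the text (GBP per patient).
PHOTO_SCREEN = 46.69
OCT_ADD_ON = 31.96
REFERRAL = 143.35


def round_cost_per_patient(
    sensitivity: float,
    specificity: float,
    prevalence: float,
    use_oct: bool,
) -> float:
    """Expected cost per screened patient for one round (simplified)."""
    p_true_positive = sensitivity * prevalence
    p_false_positive = (1.0 - specificity) * (1.0 - prevalence)
    p_flagged = p_true_positive + p_false_positive
    if use_oct:
        # Flagged patients get OCT; only those with oedema go on to referral.
        return PHOTO_SCREEN + p_flagged * OCT_ADD_ON + p_true_positive * REFERRAL
    # Without OCT every flagged patient is referred to ophthalmology.
    return PHOTO_SCREEN + p_flagged * REFERRAL
```

Under these assumptions the OCT step pays for itself whenever enough flagged patients are false positives, because an OCT scan costs less than a referral.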
The analysis simulated the passage of 100 000 ‘patients’, with characteristics matching those of patients in the clinical dataset, through the model individually. As above, the proportions of patients in the different feature categories were weighted. The impact of using alternative grading schemes within annual screening was assessed by applying the weighted sensitivities and specificities within the model. Modelling was also used to assess the cost per case of macular oedema detected from one round of screening for this cohort (see online supplementary appendix).
The mean costs, years free of moderate visual loss (in either eye) and quality-adjusted life years (QALYs) accruing to patients under the alternative grading schemes were compared to estimate incremental cost-effectiveness ratios. The schemes were compared both with and without the use of OCT prior to referral. We also assessed a scheme (scheme A) whereby anyone with markers of diabetic maculopathy would be examined with OCT. A ceiling willingness-to-pay ratio of £30 000 per QALY gained was applied to identify the optimal scheme on grounds of cost-effectiveness.19
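The comparison described above can be sketched with two standard quantities: the incremental cost-effectiveness ratio between two schemes, and the net monetary benefit of each scheme at a given ceiling ratio. The function names and any costs or QALYs passed in are hypothetical, and the net-benefit formulation is one common way to pick the preferred option rather than necessarily the procedure used in the study.

```python
from typing import Dict, Tuple


def icer(cost_a: float, qalys_a: float, cost_b: float, qalys_b: float) -> float:
    """Incremental cost per QALY gained of scheme B over scheme A."""
    delta_qalys = qalys_b - qalys_a
    if delta_qalys == 0:
        raise ValueError("No difference in QALYs; ICER is undefined")
    return (cost_b - cost_a) / delta_qalys


def net_monetary_benefit(cost: float, qalys: float, ceiling: float = 30_000.0) -> float:
    """QALYs valued at the ceiling ratio, minus cost."""
    return ceiling * qalys - cost


def preferred_scheme(
    schemes: Dict[str, Tuple[float, float]], ceiling: float = 30_000.0
) -> str:
    """Name of the scheme with the highest net monetary benefit.

    schemes maps a name to (mean cost, mean QALYs).
    """
    return max(
        schemes,
        key=lambda name: net_monetary_benefit(*schemes[name], ceiling=ceiling),
    )
```

A scheme whose ICER against the next-best option lies below the ceiling also has the higher net monetary benefit at that ceiling, so the two views agree.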
To characterise the uncertainty surrounding the cost-effectiveness of alternatives, deterministic and probabilistic sensitivity analyses were undertaken. The probabilistic analysis sampled from distributions assigned to each model parameter, and simulated the passage of 10 000 patients through the model 1000 times. This produced 1000 estimates of the mean cost and effects for each scheme. Cost-effectiveness acceptability curves were produced by calculating the proportion of these iterations favouring each of the schemes (on grounds of cost-effectiveness) at different ceiling ratios of willingness to pay per QALY.20 The methods used to derive probabilities for visual loss and the development of macular oedema precluded determination of the statistical imprecision surrounding these estimates. The impact of variation in these parameters was addressed through deterministic sensitivity analysis.
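A cost-effectiveness acceptability curve of the kind described above can be sketched as follows. The input layout (one dictionary of mean cost and mean QALYs per scheme for each probabilistic iteration) and the function name are assumptions. "Favouring" a scheme is taken here to mean that it has the highest net monetary benefit in that iteration.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Iteration = Dict[str, Tuple[float, float]]  # scheme -> (mean cost, mean QALYs)


def acceptability_curves(
    iterations: List[Iteration], ceilings: Sequence[float]
) -> Dict[float, Dict[str, float]]:
    """For each ceiling ratio, the share of iterations in which each scheme
    has the highest net monetary benefit."""
    curves: Dict[float, Dict[str, float]] = {}
    for ceiling in ceilings:
        wins: Counter = Counter()
        for iteration in iterations:
            best = max(
                iteration,
                key=lambda s: ceiling * iteration[s][1] - iteration[s][0],
            )
            wins[best] += 1
        curves[ceiling] = {
            scheme: wins[scheme] / len(iterations) for scheme in iterations[0]
        }
    return curves
```

Plotting each scheme's share against the ceiling ratio gives the acceptability curve for that scheme.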
Results
A total of 3540 patients were recruited between 31 July 2008 and 22 February 2011 (figure 1). Overall, 370 were excluded from analysis: in 329 the OCT failed in at least one eye; in a further 41, retinal photographs from both eyes were of inadequate quality; and there was one lost image (figure 2).
In all, 3170 patients were analysed (table 2), of whom 243 (7.7%) had diabetic macular oedema. Prevalence of diabetic macular oedema differed between centres (range 3.7%–12.2%) and scanners (range 4.5%–11.8%). Diabetic macular oedema was significantly more common in older people, Caucasians, and those with type 2 diabetes or poor visual acuity.
When mutually exclusive categories of lesions were considered, diabetic macular oedema was present in 14.1% of those with exudates within one disc diameter; 12.1% of those with blot haemorrhages (but no exudates within one disc diameter); and 3.2% of those with microaneurysms/dot haemorrhages (and no exudates or blot haemorrhages within one disc diameter) (table 2).
Table 3 shows the analysis weights used to correct for sampling bias. Exudates within one disc diameter and blot haemorrhages were down weighted. Exudates between one and two disc diameters and microaneurysms/dot haemorrhages were up weighted.
Table 4 shows the sensitivities and specificities for predicting the presence of diabetic macular oedema from certain lesion combinations for unweighted and weighted data. The presence of exudates within one disc diameter had the greatest influence on the prediction of macular oedema. The addition of exudates between one and two disc diameters did not identify any further cases (table 4).
England's scheme, after weighting, had sensitivity of 72.6% and specificity of 66.8% for detection of diabetic macular oedema; Scotland's scheme had sensitivity of 59.5% and specificity of 79.0%. The hybrid scheme had sensitivity of 73.3% and specificity of 70.9% (table 5).
The receiver operating characteristic curve for automated grading is shown in figure 3 together with the sensitivities and specificities for the three manual schemes. Compared with the manual schemes, for the same sensitivity, automated grading achieved a higher specificity. The automated system operating point used in the cost-effectiveness analysis had slightly higher sensitivity (75.9%) and specificity (73.7%) than the hybrid manual grading scheme (table 5).
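Choosing an operating point on a receiver operating characteristic curve can be sketched as below. The function scans candidate thresholds on a classifier score and keeps the one with the highest sensitivity that still meets a required specificity. The score scale, function name and selection rule are assumptions for illustration; the text does not describe how the study's operating point was selected.

```python
from typing import Optional, Sequence, Tuple


def operating_point(
    scores: Sequence[float],
    has_dmo: Sequence[bool],
    min_specificity: float,
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, sensitivity, specificity) or None if no threshold
    meets the specificity requirement. A patient is referred when
    score >= threshold."""
    positives = sum(1 for y in has_dmo if y)
    negatives = len(has_dmo) - positives
    best: Optional[Tuple[float, float, float]] = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, has_dmo) if y and s >= threshold)
        fp = sum(1 for s, y in zip(scores, has_dmo) if not y and s >= threshold)
        sensitivity = tp / positives
        specificity = 1.0 - fp / negatives
        if specificity >= min_specificity and (best is None or sensitivity > best[1]):
            best = (threshold, sensitivity, specificity)
    return best
```

Raising `min_specificity` moves the operating point along the curve towards fewer referrals at the expense of sensitivity.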
The results of the short term analysis of the cost per case detected from one round of screening are presented in the online supplementary appendix.
Table 6 shows the results of the cost-effectiveness analysis. The addition of OCT to each scheme resulted in cost savings without reducing health benefits. Scotland's scheme was found to be most cost-effective at the accepted ceiling ratio of £30 000 per QALY, with or without the addition of OCT. Even scheme A, where anyone with markers of diabetic maculopathy is examined with OCT, produces cost savings over all the manual schemes without OCT.
In the study, automated grading had higher specificity but similar sensitivity to England's and the hybrid scheme (figure 3). Assuming that automated grading was implemented for a similar cost to manual grading, it has the potential to produce a similar number of QALYs, but at a lower overall cost to the health service, than either England's or the hybrid scheme. Automated grading could be made cost-effective in Scotland, but an operating point at a higher specificity would have to be chosen.
Deterministic sensitivity analysis suggested that monitoring patients with suspected diabetic macular oedema (on a 6-monthly basis) with OCT and retinal photography remained cost-saving up to an incremental cost of ∼£58 per patient. Further scenario analyses assessed the sensitivity of findings to alterations in assumptions and parameters in favour of the more sensitive and less specific strategies (see online supplementary appendix). Only when a number of parameters were simultaneously weighted in favour of the more sensitive strategies did the incremental cost per QALY approach the accepted threshold range (£20 000–30 000 per QALY).
Figure 4 summarises the probabilistic sensitivity analysis results, showing that Scotland's scheme retains the highest probability of being cost-effective up to a ceiling ‘willingness to pay’ ratio of ∼£240 000 per QALY when used in conjunction with OCT.
Discussion
Comparison was made between England's and Scotland's maculopathy grading schemes, along with a hybrid scheme and an automated scheme. In the weighted analysis, Scotland achieved a sensitivity of 59.5% and specificity of 79.0%. England had a higher sensitivity (72.6%) but a lower specificity (66.8%). Compared with England, the hybrid scheme increased sensitivity by 0.7% and specificity by 4.1%.
Statistical analyses were completed at the patient level. This gave higher sensitivity and lower specificity for each scheme than if using a single eye per patient. However, the order of the performances and costs is unaffected compared with single eye analyses.
Based on weighted data, the English and hybrid schemes result in higher numbers of true cases being identified, costing an additional £910 and £639 per extra case, respectively, in the first cycle of the model. However, the repetitive nature of interval screening compromises the cost-effectiveness of schemes that have lower specificity. While the more sensitive schemes gave rise to small increases in years free from moderate visual loss (≥15 ETDRS letters), this translated into very small increases in QALYs, as such visual losses are associated with a modest utility decrement and may only affect the worse-seeing eye. Furthermore, patients missed in one round of screening have a chance of being detected at the next.
While the cost-effectiveness model assumed all patients referred with macular oedema undergo laser treatment, results remained robust when only those patients with macular oedema and visual acuity ≥0.3 logMAR were modelled to incur treatment costs. With several parameter values and assumptions weighted in favour of the more sensitive schemes, the additional costs of these schemes (per QALY gained) remained above thresholds for cost-effectiveness.19
With weighted data, automated grading (working at any operating point on its receiver operating characteristic curve) improved performance over the manual schemes. Cost-effectiveness will depend on the operating point chosen and the costs of implementation, balanced against cost savings resulting from reductions in manual grading time and unnecessary referrals.9
In this study, a variety of OCT scanners were used. A variation in detection of diabetic macular oedema between centres was noted, partly due to differences in the sensitivity of the scanner and partly due to case selection. Cases missed by less sensitive scanners may have biased the estimated sensitivities and specificities, but most likely in the same directions for all schemes. Hence they are unlikely to have affected the broader inferences.
Economic modelling suggests that the use of OCT in conjunction with photography within screening programmes, for patients with surrogate markers of oedema, is likely to be cost-effective. The estimated marginal cost of conducting OCT within the screening programme (£32) is low in comparison with the cost of referral to ophthalmology (£143) and consequent monitoring in the outpatient setting. As the analysis included a survey of costs and pathways of implementation in the participating centres, the results can be applied across England and Scotland.
We assumed that patients without treatment would progress at the rate observed in the Early Treatment Diabetic Retinopathy Study.14 To assess the benefits of improved detection and referral, the best available evidence was used.4 ,14 ,15 Although ranibizumab has now been approved,21 its impact on the cost-effectiveness of screening for macular oedema is unknown.
Considering the comparison of alternative photographic grading schemes in England and Scotland for triggering referral to ophthalmology or an OCT examination, we found Scotland's scheme to be preferred based on weighted data when applying a ceiling ratio of £30 000 per QALY gained.
Automated grading benefits from the ability to choose different operating points, depending on the sensitivity desired. At the study's chosen operating point, if it could be implemented without increasing grading costs, automation could produce a similar number of QALYs for a lower overall cost than England's scheme. Automated grading could be made cost-effective in Scotland, but an operating point at a higher specificity would have to be chosen.
Using OCT as part of the screening pathway could reduce costs to the health service.
Retinal screening programmes in the UK should reconsider the screening pathway to make best use of existing and new technologies.
This project could not have been successfully completed without the cooperation of all the centres. Much of the workload of acquiring the data was carried out by the retinal screeners, supported by the clinical staff. Our particular thanks go to Julie Raeper, a Senior Retinal Screener at Aberdeen Royal Infirmary, who manually graded and annotated all the images. Thanks must go to those who sat on the Trials Steering Committee and attended the Investigators’ Meetings, in particular those members not involved directly in the study: Mr Stephen Graham, our patient representative, Professor Alex Elliott, Dr Ayyakkannu Manivannan and Mrs Alison Farrow. Thanks also for the guidance of the chairs of the Trials Steering Committee, initially Dr Caroline Styles and later Dr Rod Harvey.
Contributors JAO was the principal investigator. JAO, PFS, KAG, SP, ADF, GSS, PMD, GJP, SB, DMB, VC, SPH, GPL, CJS, KS and HMW contributed to the study design. KAG managed the data collection. The web-based grading database was developed by KAG and ADF. CSa, SB, DMB, VC, PMD, SPH, GPL, CJS, RDM, KS and HMW were responsible for providing patient data from the collaborating centres. KAG and ADF developed the automated analysis and with RTS generated the results. JAO performed quality assurance. GJP, KAG and ADF performed the statistical analyses and GJP checked the validity of the statistical analyses. GSS performed all of the economic analyses. GJP and ADF wrote the first draft of the paper. All authors participated in the interpretation of the data, reviewed and revised the paper for important intellectual content and approved the final version. JAO takes responsibility for the content.
Funding Funding for this study was provided by the Health Technology Assessment programme of the National Institute for Health Research, project reference 06/402/49.
Competing interests JAO, GJP, PFS, SP, GSS and ADF have received funding for their institution from the Chief Scientist Office, Scotland and from Medalytix Ltd. ADF has received salary support from Medalytix Ltd. SPH has received conference funding from Allergan. KS sits on the Novartis Advisory Board (Scotland). Commercial implementation associated with some of the referenced work may in future provide some remuneration for the University of Aberdeen, NHS Grampian and Aberdeen based authors.
Patient consent Obtained.
Data sharing There is currently no plan for sharing of unpublished data beyond the seven research centres.
Ethics approval The North of Scotland Research Ethics Committee. REC reference: 07/S0801/107. Date of approval: 17/12/2007.
Provenance and peer review Not commissioned; externally peer reviewed.