
Autonomous screening for laser photocoagulation in fundus images using deep learning
  1. Idan Bressler1,
  2. Rachelle Aviv1,
  3. Danny Margalit1,
  4. Yovel Rom1,
  5. Tsontcho Ianchulev1,2,
  6. Zack Dvey-Aharon1
  1. AEYE Health, New York, New York, USA
  2. Ophthalmology, New York Eye and Ear Infirmary of Mount Sinai, New York, New York, USA
  Correspondence to Rachelle Aviv, AEYE Health, Tel Aviv 6473925, Israel; rachelle@aeyehealth.com

Abstract

Background Diabetic retinopathy (DR) is a leading cause of blindness in adults worldwide. Artificial intelligence (AI) with autonomous deep learning algorithms has been increasingly used in retinal image analysis, particularly for the screening of referrable DR. An established treatment for proliferative DR is panretinal or focal laser photocoagulation. Training autonomous models to discern laser patterns can be important in disease management and follow-up.

Methods A deep learning model was trained for laser treatment detection using the EyePACs dataset. Data was randomly assigned, by participant, into development (n=18 945) and validation (n=2105) sets. Analysis was conducted at the single image, eye, and patient levels. The model was then used to filter input for three independent AI models for retinal indications; changes in model efficacy were measured using area under the receiver operating characteristic curve (AUC) and mean absolute error (MAE).

Results On the task of laser photocoagulation detection: AUCs of 0.981, 0.950, and 0.979 were achieved at the patient, image, and eye levels, respectively. When analysing independent models, efficacy was shown to improve across the board after filtering. Diabetic macular oedema detection on images with artefacts was AUC 0.932 vs AUC 0.955 on those without. Participant sex detection on images with artefacts was AUC 0.872 vs AUC 0.922 on those without. Participant age detection on images with artefacts was MAE 5.33 vs MAE 3.81 on those without.

Conclusion The proposed model for laser treatment detection achieved high performance on all analysis metrics and has been demonstrated to positively affect the efficacy of different AI models, suggesting that laser detection can generally improve AI-powered applications for fundus images.

  • Treatment Lasers
  • Neovascularisation



This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.


What is already known on this topic

  • Laser photocoagulation is an established treatment for retinal conditions such as diabetic retinopathy. The performance of artificial intelligence models for the detection of various retinal indications may be affected by the presence of laser artefacts.

What this study adds

  • This study proposes a new state-of-the-art artificial intelligence model for the detection of laser photocoagulation artefacts and demonstrates its positive effect on the performance of other artificial intelligence models.

How this study might affect research, practice or policy

  • This study may improve or aid the development of future artificial intelligence models for the detection and diagnosis of various retinal conditions.

Introduction

Laser photocoagulation, in which laser pulses are used to coagulate retinal tissue, is a common and established procedure used to treat multiple retinal diseases.1–3 Ablative photocoagulation is mostly used to prevent leakage and ischaemic neovascularisation in vascular retinal conditions such as diabetic retinopathy (DR),4 5 diabetic macular oedema (DME),6–8 retinal vein occlusion,9 10 and neovascular age-related macular degeneration (AMD).11

Laser photocoagulation is generally divided into panretinal and focal; the former is delivered in the peripheral retina with deep ablative burns to stem the neovascular process,12 13 while the latter is a lighter photocoagulative treatment delivered in the central macula to treat macular conditions.14 15 There are well-established laser treatment protocols depending on disease severity and individual patient disease state.11 16–18 While laser photocoagulation is an effective treatment, it causes retinal scarring and is destructive to retinal tissue, leaving long-term anatomical defects.19–21

Artificial intelligence (AI) using fundus imaging has been increasingly employed in various ophthalmological applications.22 23 These applications include extraction of basic patient data, such as age and sex,24 detection of retinal pathologies,25 26 and prediction of pathology development.27 28 AI methods rely on image pattern recognition, especially in areas in which the pathology is present. Laser photocoagulation may therefore disrupt pattern recognition by adding patterns or artefacts, such as burns and scars, that the model was not trained to handle. This is particularly problematic because laser treatment is often applied to areas of interest, such as leaking blood vessels, which are precisely the areas most crucial to recognise.

The effect laser photocoagulation has on AI systems suggests that a tool to identify images of eyes which have undergone photocoagulation may be beneficial for autonomous retinal-based diagnosis and follow-up treatment of patients. While previous methods of laser photocoagulation detection exist,29–33 this work presents, to the best of our knowledge, the first laser-treated image detection method based on a large, diverse, and widely accepted database, EyePACS (https://www.eyepacs.org), which contains images of varying quality from a variety of camera manufacturers and patient populations.

Methods

Data

The data consisted of a subsample of the EyePACs dataset, which contains 45° angle fundus photography images and expert readings of those images. All images and data were deidentified in accordance with the Health Insurance Portability and Accountability Act ‘Safe Harbor’ provisions before they were transferred to the researchers.

The dataset contained up to six images per patient visit: one macula-centred image, one disc-centred image, and one centre-fixated image (fixated on the midpoint of a line connecting the foveola and the optic disc), per eye. Each eye underwent expert reading, including but not limited to the presence of panretinal laser treatment, the presence of focal laser treatment, and image quality. All images of the subsample deemed readable by expert annotations were used.

The resulting dataset consisted of 21 050 images from 9212 patients, of which 9484 images (45%) had artefacts of panretinal laser treatment, 1888 (9%) had artefacts of focal laser treatment and 847 (4%) had both. This work combined focal and panretinal laser treatments into one category of laser treatment, resulting in an overall 10 525 (50%) images with laser treatment artefacts (table 1). Roughly 77% of patients required dilation: 54% of all patients received 1 gtt. tropicamide 1%, 17% received 1 gtt. tropicamide 0.5%, and 5% received other dilation agents.

Table 1

Laser treatment prevalence in the EyePACs dataset

The average age of patients with laser treatment artefacts was 59.5 years (SD 10.0) and 55% were women, compared with an average age of 55.6 years (SD 11.3) and 61% women among patients who had not undergone laser treatment (table 2). The prevalence of laser photocoagulation across ethnic groups may be found in online supplemental table A. The distribution of laser treatment images across DR levels is given in online supplemental table B; all laser treatment images were from patients with more than mild DR, and the majority were from patients with grade 4 DR.

Supplemental material

Table 2

Patient demographics for patients who did and did not have laser treatment artefacts

Quality assessment

An image quality assessment tool was developed using classic computer vision methods; the tool detects the visibility of fundus-specific characteristics and assigns each image a score. The quality score for an image is an aggregation of visibility measurements from multiple areas within the fundus image. The tool was validated based on visual assessment of image scores and the readability of the images. Figure 1 shows a few example images and their respective scores, demonstrating the correlation between score and visual image quality. The tool was used to remove low-quality images from the dataset, as the quality scores assigned by EyePACs are assigned to patients and not to individual images.

Figure 1

Example images and their accompanying image quality scores, ordered from the worst quality (left) to the best quality (right).
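The paper does not describe the implementation of the quality tool in detail; the following is a minimal, illustrative sketch of tile-based visibility scoring, assuming OpenCV and NumPy. The grid size and the use of local contrast as a visibility proxy are assumptions, not the authors' method.

```python
import cv2  # assumed dependency; the authors' tool is not described in detail
import numpy as np


def quality_score(image_path: str, grid: int = 8) -> float:
    """Illustrative quality score: mean local contrast over a grid of tiles,
    aggregating visibility measurements from multiple areas of the image."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    h, w = img.shape
    tile_scores = []
    for i in range(grid):
        for j in range(grid):
            tile = img[i * h // grid:(i + 1) * h // grid,
                       j * w // grid:(j + 1) * w // grid]
            tile_scores.append(float(tile.std()))  # local contrast as a visibility proxy
    return float(np.mean(tile_scores))
```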

Preprocessing

Image preprocessing was performed in three steps for both datasets. First, image backgrounds were cropped along the convex hull containing the circular border between the fundus and the background; figure 2 shows an example of this process. Second, images were resized to 512×512 pixels. Lastly, using the aforementioned quality assessment tool, low-quality images were filtered out before training. The model was evaluated with multiple training configurations, each set by a different image quality threshold, and the threshold was chosen at the point beyond which filtering additional images no longer improved model performance; this resulted in 1373 images being filtered out, approximately 6.5% of the data.

Figure 2

Example of image cropping; blue lines represent the cropping boundaries.
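A minimal sketch of the cropping and resizing steps described above, assuming OpenCV; the binarisation threshold and the contour-based localisation of the fundus border are illustrative choices, not the authors' exact pipeline.

```python
import cv2
import numpy as np


def crop_and_resize(image_path: str, size: int = 512) -> np.ndarray:
    """Crop the background around the circular fundus region, then resize."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Separate the fundus disc from the near-black background (threshold assumed).
    _, mask = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    hull = cv2.convexHull(max(contours, key=cv2.contourArea))
    x, y, w, h = cv2.boundingRect(hull)  # bounding box of the circular border
    return cv2.resize(img[y:y + h, x:x + w], (size, size))
```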

Model training

The data was then divided into training, validation, and test datasets at a ratio of 80%, 10%, and 10%, respectively. A binary classification neural network was trained. The model architecture was automatically fitted to best balance the trade-off between model performance and model complexity. Hyperparameter tuning was done on the validation set.
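The splitting implementation is not specified in the paper; the sketch below shows one way to produce an 80/10/10 split grouped by patient, assuming pandas and scikit-learn and a hypothetical patient_id column.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit


def split_by_patient(df: pd.DataFrame, seed: int = 42):
    """80/10/10 train/validation/test split grouped by a hypothetical
    'patient_id' column, so that no patient appears in two subsets."""
    outer = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=seed)
    train_idx, rest_idx = next(outer.split(df, groups=df["patient_id"]))
    train, rest = df.iloc[train_idx], df.iloc[rest_idx]
    inner = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=seed)
    val_idx, test_idx = next(inner.split(rest, groups=rest["patient_id"]))
    return train, rest.iloc[val_idx], rest.iloc[test_idx]
```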

Statistical analysis

The metrics used for model assessment were accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). For each metric, the bias-corrected and accelerated (BCa) bootstrap method34 was used to produce a 95% CI.
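As an illustration of the interval estimation, the sketch below computes a 95% BCa bootstrap CI for the AUC using SciPy and scikit-learn; the authors' exact implementation is not specified, and the function and variable names are assumptions.

```python
import numpy as np
from scipy.stats import bootstrap
from sklearn.metrics import roc_auc_score


def auc_with_bca_ci(y_true: np.ndarray, y_score: np.ndarray, seed: int = 0):
    """Point estimate and 95% bias-corrected and accelerated (BCa) bootstrap
    CI for the AUC; label/score pairs are resampled together."""
    point = roc_auc_score(y_true, y_score)
    res = bootstrap(
        (y_true, y_score),
        statistic=roc_auc_score,
        paired=True,          # resample (label, score) pairs jointly
        vectorized=False,
        confidence_level=0.95,
        method="BCa",
        random_state=seed,
    )
    ci = res.confidence_interval
    return point, (ci.low, ci.high)
```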

Analysis levels

Laser detection was done at three different levels. The first, detection at the individual image level, was the basic task for which the model was trained. The second, detection at the eye level, used all images from a given eye; the image for which the model produced the highest probability score was selected for analysis. For the third, detection at the patient level, the results from both eyes were compared and the eye with the higher probability score was selected to produce a patient-level result. The eye-level and patient-level analyses operate on the logic that a single field or eye with photocoagulation artefacts is sufficient for the eye or patient, respectively, to be classified as positive.
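A minimal sketch of the eye-level and patient-level aggregation described above, assuming per-image model probabilities are stored in a pandas DataFrame with hypothetical patient_id, eye, and score columns.

```python
import pandas as pd


def aggregate_scores(per_image: pd.DataFrame):
    """Aggregate per-image laser probabilities to the eye and patient levels
    by taking the maximum probability at each level."""
    # Eye level: keep the highest-scoring image of each eye.
    eye_level = per_image.groupby(["patient_id", "eye"], as_index=False)["score"].max()
    # Patient level: keep the higher-scoring of the two eyes.
    patient_level = eye_level.groupby("patient_id", as_index=False)["score"].max()
    return eye_level, patient_level
```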

Effect on imaging tasks

The effect that laser treatment has on imaging tasks was measured by applying the laser detection model as a postprocessing step for a model for the detection of DME, which was developed based on the EyePACs dataset,35 and a model for sex detection, also developed based on the EyePACs dataset.

The performance on these tasks was measured in AUC on a separate validation set containing images both with and without laser treatment artefacts. The 95% CI was calculated using the accelerated bootstrap method for each population and compared for significance.

A regression model was additionally trained for age detection, and the mean absolute error (MAE) between the patient’s age and predicted age was calculated on a separate validation set. The validation set was separated into patients with and without laser treatment artefacts, such that the mean age was the same between these populations. The significance of the difference in MAE between the two populations was calculated using a Student’s t-test. Detailed patient statistics for these experiments, as well as details on model development, are given in online supplemental table C and the explanations that follow it.
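A minimal sketch of the subset comparisons described in this section, assuming NumPy, SciPy and scikit-learn; the array names (including the has_laser flag produced by the laser-detection model) are hypothetical.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.metrics import roc_auc_score


def compare_classifier(y_true, y_score, has_laser):
    """AUC of a downstream classifier (e.g. DME or sex detection) on images
    with and without laser artefacts."""
    return (roc_auc_score(y_true[has_laser], y_score[has_laser]),
            roc_auc_score(y_true[~has_laser], y_score[~has_laser]))


def compare_age_model(y_age, y_pred, has_laser):
    """MAE of the age model on the two subsets, with a Student's t-test on
    the absolute errors."""
    err_laser = np.abs(y_age[has_laser] - y_pred[has_laser])
    err_clean = np.abs(y_age[~has_laser] - y_pred[~has_laser])
    t_stat, p_value = ttest_ind(err_laser, err_clean)
    return err_laser.mean(), err_clean.mean(), p_value
```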

Results

The results for the different analysis methods of laser artefact detection were as follows (table 3): on the image level, sensitivity of 0.883 (95% CI 0.868 to 0.897), specificity of 0.880 (95% CI 0.864 to 0.894), and AUC of 0.950 (95% CI 0.943 to 0.956) were achieved. On the eye level, sensitivity of 0.925 (95% CI 0.900 to 0.945), specificity of 0.931 (95% CI 0.916 to 0.944), and AUC of 0.979 (95% CI 0.972 to 0.984) were achieved. On the patient level, sensitivity of 0.929 (95% CI 0.881 to 0.947), specificity of 0.926 (95% CI 0.911 to 0.944), and AUC of 0.981 (95% CI 0.971 to 0.987) were achieved.

Table 3

Laser treatment detection results on the EyePACs dataset for the three analysis levels performed, given in accuracy, sensitivity, specificity and AUC with a 95% CI

The results of laser artefact detection for each DR level are displayed in table 4: the model achieved 0.910 AUC (95% CI 0.866 to 0.941) for DR level 2, 0.887 AUC (95% CI 0.758 to 0.954) for DR level 3, 0.929 AUC (95% CI 0.918 to 0.938) for DR level 4, and 0.772 AUC (95% CI 0.904 to 0.968) for ungradable DR level. DR levels 0 and 1 did not have any laser treated examples, thus most metrics are not defined for these groups. The results of laser artefact detection stratified by ethnicity are available in online supplemental table D.

Table 4

Results on the EyePACs dataset across DR grades, given in accuracy, sensitivity, specificity, and AUC with a 95% CI

Online supplemental table E shows the difference in laser artefact detection results between patients with and without DME. The model achieved 0.955 AUC (95% CI 0.948 to 0.962) for non-DME patients versus 0.908 AUC (95% CI 0.884 to 0.927) for patients with DME, demonstrating that the condition does affect results, but that the model achieves high performance irrespective of it.

Online supplemental table F displays the results of laser artefact detection for images which passed (high quality) and did not pass (low quality) the quality filter, showing a significant difference between the populations. The results for low-quality images, which were filtered out, were 0.787 sensitivity (95% CI 0.710 to 0.849), 0.793 specificity (95% CI 0.709 to 0.860), and 0.857 AUC (95% CI 0.803 to 0.898); compared with 0.854 sensitivity (95% CI 0.838 to 0.869), 0.904 specificity (95% CI 0.890 to 0.917), and 0.948 AUC (95% CI 0.941 to 0.955) for high-quality images which passed the filter.

The effects of laser detection and subsequent filtration on the aforementioned three tasks of DME detection, age prediction, and sex detection were as follows: DME detection results for images with no laser artefacts were 0.955 AUC (95% CI 0.948 to 0.961), compared with images with laser artefacts, on which the model achieved 0.932 AUC (95% CI 0.905 to 0.951). Age prediction results for images with no laser artefacts, after age adjustment, were 3.81 MAE, compared with images with laser artefacts, on which the model achieved 5.33 MAE; t-test analysis shows a significance of p<1e−4. Sex detection results for images with no laser artefacts were 0.922 AUC (95% CI 0.916 to 0.927), compared with images with laser artefacts, for which the model achieved 0.872 AUC (95% CI 0.830 to 0.903).

The aggregation of these results is shown in table 5.

Table 5

Results for the three experiments conducted on the effect of laser treatment, showing AUC for sex and DME detection and mean absolute error (MAE) for age prediction

Discussion

This work proposed a method for the automatic detection of laser treatment artefacts in fundus images, which may also serve as a component in the future development of AI systems for different diagnoses based on retinal imaging. Such systems may need to consider images of laser-treated eyes differently from those of untreated eyes according to their design needs; some may choose to discard these images, while others may analyse them differently from images of untreated eyes. Accordingly, and depending on the degree to which laser treatment affects the task in question, the proposed system may be used at different operating points with different sensitivity–specificity balances. Discarding laser-treated images is a viable option for most automated retinal screening applications, as these patients should already be aware of the need for regular screening.

Previous studies on the autonomous detection of laser burns from fundus images have been conducted on a much smaller scale (roughly two orders of magnitude smaller).29–33 The importance of scale lies in the better representation of real-life conditions; specifically, this study better represents the variety of image qualities, camera manufacturers, and populations encountered in practice. Additionally, a wider range of clinical conditions, such as DR and DME, is represented in this study both with and without laser treatment, and the proposed system shows high performance across these conditions.

The effect of laser treatment on imaging tasks, and the model’s ability to detect the relevant images, were validated by measuring the model’s effect on different AI tasks involving retinal images. A significant difference was found for all three tasks, showing the relevance of the proposed method for future AI tasks.

A limitation of this work is the lack of differentiation between focal and panretinal laser treatments, which were grouped into a single category. Future work may differentiate between the two, given more data. Furthermore, even though the base characteristics of laser photocoagulation remain similar across conditions, the addition of AMD-specific databases to the training set may improve results.

In addition, and in the same vein as the presented work, machine learning methods could be developed to detect patients with DME who will require future laser treatment. This would require training a model, similar to the one presented, on a dataset generated from a longitudinal study tracking the progression of patients with diabetes.

Data availability statement

Data may be obtained from a third party and are not publicly available. Deidentified data used in this study are not publicly available at present. Parties interested in data access should contact Jorge Cuadros (jcuadros@eyepacs.com) for queries related to EyePACS.

Ethics statements

Patient consent for publication

Ethics approval

Institutional Review Board exemption was obtained from the Sterling Independent Review Board on the basis of a category 4 exemption (DHHS), pursuant to the terms of the U.S. Department of Health and Human Services’ Policy for Protection of Human Research Subjects at 45 C.F.R. §46.104(d).

References

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Contributors IB analysed the data, designed the study and conducted research. ZD-A and DM conceived the study and supervised research. YR provided assistance in assessing external models. TI provided medical and strategic guidance and oversight. IB and RA drafted the manuscript with input from all authors. ZD-A is guarantor.

  • Funding Employees and board members of AEYE Health designed and carried out the study; managed, analysed and interpreted the data; prepared, reviewed and approved the article; and were involved in the decision to submit the article. There were no grants or awards involved in the funding of this article.

  • Competing interests RA, IB and YR are employees of AEYE Health. TI is a board member of AEYE Health. DM is COO of AEYE Health. ZD-A is CEO of AEYE Health.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.