Prediction of OCT images of short-term response to anti-VEGF treatment for neovascular age-related macular degeneration using generative adversarial network
  1. Yutong Liu1,2,
  2. Jingyuan Yang1,2,
  3. Yang Zhou3,
  4. Weisen Wang4,
  5. Jianchun Zhao3,
  6. Weihong Yu1,2,
  7. Dingding Zhang5,
  8. Dayong Ding3,
  9. Xirong Li4,
  10. Youxin Chen1,2
  1. 1 Ophthalmology, Peking Union Medical College Hospital, Beijing, China
  2. 2 Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, China
  3. 3 Vistel AI Lab, Visionary Intelligence Ltd, Beijing, China
  4. 4 Key Lab of DEKE, Renmin University of China, Beijing, China
  5. 5 Central Research Laboratory, Peking Union Medical College Hospital, Beijing, China
  1. Correspondence to Dr Weihong Yu, Ophthalmology, Peking Union Medical College Hospital, Beijing 100730, China; yuwh{at}pumch.cn; Professor Youxin Chen, Ophthalmology, Peking Union Medical College Hospital, Beijing, China; chenyx{at}pumch.cn

Abstract

Background/aims The aim of this study was to generate and evaluate individualised post-therapeutic optical coherence tomography (OCT) images that could predict the short-term response of antivascular endothelial growth factor therapy for typical neovascular age-related macular degeneration (nAMD) based on pretherapeutic images using generative adversarial network (GAN).

Methods A total of 476 pairs of pretherapeutic and post-therapeutic OCT images of patients with nAMD were included in the training set, while 50 pretherapeutic OCT images were retrospectively included in the test set, and their corresponding post-therapeutic OCT images were used to evaluate the synthetic images. The pix2pixHD method was adopted for image synthesis. Three experiments were performed, in which retinal specialists evaluated the quality, authenticity and predictive power of the synthetic images.

Results We found that 92% of the synthetic OCT images had sufficient quality for further clinical interpretation. Only about 26%–30% of the synthetic post-therapeutic images could be accurately identified as synthetic images. The accuracy of predicting macular status (wet or dry) was 0.85 (95% CI 0.74 to 0.95).

Conclusion Our results revealed a great potential of GAN to generate post-therapeutic OCT images with both good quality and high accuracy.

  • retina

Introduction

Age-related macular degeneration (AMD) is one of the leading causes of irreversible vision loss in people over 50 years of age.1–3 In eyes with neovascular AMD (nAMD),4 a main subtype of AMD, photoreceptors are severely damaged and rapid vision loss could occur due to the development of choroidal neovascularisation, mainly resulting from an elevated intravitreal vascular endothelial growth factor (VEGF) level.5 6 At present, anti-VEGF therapy is the first-line treatment for nAMD.7 8 However, the therapeutic response varies widely, and it is difficult to predict the individual short-term structural and functional response after a single dose of anti-VEGF injection according to the current guidelines.9

Optical coherence tomography (OCT) allows us to visualise and quantitatively evaluate the pathomorphological changes in the retina and choroid in eyes with nAMD,10 which are closely associated with the visual prognosis.11 12 Presently, the monitoring of nAMD progression and the formulation of individualised treatment strategies are mainly dependent on OCT images.13–16 Hence, we considered the possibility of using these images to predict the therapeutic response, which would be of great interest and broad significance for promoting better decision making. This can be achieved by using artificial intelligence (AI) to generate predicted post-therapeutic OCT images from each patient's pretherapeutic OCT images, which would intuitively demonstrate the efficacy of a single dose of an anti-VEGF medication for that patient. A synthetic post-therapeutic OCT image is intended to assist doctors with clinical decision making, as it would enable a better understanding of the disease and treatment by presenting the expected post-therapeutic status of the fundus. Deep learning methods, such as generative adversarial networks (GANs),17 could be used for this purpose. To the best of our knowledge, the concept of synthesising a post-therapeutic OCT image for patients with nAMD has not been previously proposed.

The aim of this study was to generate and evaluate post-therapeutic OCT images synthesised from pretherapeutic OCT images using a GAN, which could be used to predict the short-term response to anti-VEGF therapy for individual patients with nAMD.

Materials and methods

Study design and participants

We retrospectively reviewed the records of patients with nAMD who underwent an intravitreal injection of anti-VEGF drugs at the Peking Union Medical College Hospital from November 2018 to June 2019. The inclusion criteria were as follows: (1) age ≥55 years; (2) a preoperative diagnosis of typical nAMD using fluorescein angiography (FA) or OCT angiography; (3) a recorded injection of anti-VEGF drugs including conbercept or ranibizumab at any phase in the treatment protocol of three consecutive monthly injections and pro re nata injections (3+PRN), with pretherapeutic and post-therapeutic OCT images taken within 4–6 weeks. The exclusion criteria were: (1) a diagnosis of polypoidal choroidal vasculopathy (PCV) or retinal angiomatous proliferation (RAP); (2) a history of previous operation, laser photocoagulation or intraocular injections of medications other than anti-VEGF agents and (3) a history of other ocular disorders, including glaucoma, pathological myopia, retinal vascular disorders and other disorders involving systemic diseases. Diagnosis of typical nAMD, PCV or RAP was made according to results of FA and indocyanine green angiography (ICGA), and was independently confirmed by two retinal specialists.

Data set creation

Pretherapeutic and post-therapeutic B-scan swept-source OCT (SS-OCT) images were obtained in a 16-line 9 mm radial macula pattern with the Topcon Deep Range Imaging (DRI) OCT Triton device (Topcon, Tokyo, Japan) using the follow-up mode, which enabled several pretherapeutic and post-therapeutic images to be scanned at the same location. OCT images were section-matched based on the retinal microstructure, including the position of the fovea and retinal vessels. Images with motion artefacts or insufficient quality for clinical judgement were excluded. The original resolution was 1024×992 pixels, which was resized to 256×256 pixels for model training.
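
As a rough illustration of the resizing step, the sketch below maps a 1024×992 image onto a 256×256 grid with nearest-neighbour sampling. The interpolation method used in the study is not stated, so nearest-neighbour is an assumption chosen only for simplicity, as are the function name `resize_nearest`, the synthetic stand-in image and the reading of 1024×992 as width×height.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2D image (list of rows) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    # Map each output pixel centre back to a source row/column index.
    rows = [min(in_h - 1, int((r + 0.5) * in_h / out_h)) for r in range(out_h)]
    cols = [min(in_w - 1, int((c + 0.5) * in_w / out_w)) for c in range(out_w)]
    return [[image[r][c] for c in cols] for r in rows]

# Synthetic stand-in for a B-scan: 992 rows x 1024 columns (assumed orientation).
IN_H, IN_W = 992, 1024
scan = [[(r + c) % 256 for c in range(IN_W)] for r in range(IN_H)]

small = resize_nearest(scan, 256, 256)
scale_w = IN_W / 256  # horizontal downscaling factor
scale_h = IN_H / 256  # vertical downscaling factor
```

Under these assumptions the two axes are scaled by different factors (4.0 and 3.875), so the aspect ratio of the resized image differs slightly from the original.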

B-scan SS-OCT images were further separated into two sets, a training set and a test set. Images taken between November 2018 and April 2019 were included in the training set, whereas those taken in May and June 2019 were included in the test set. No patient had images assigned to both data sets.
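
The chronological split with patient-level separation can be sketched as follows. The record fields (`patient_id`, `scan_date`), the use of a single date per image pair, the cutoff constant and the example records are all hypothetical; only the date ranges come from the text.

```python
from datetime import date

TEST_START = date(2019, 5, 1)  # images from May 2019 onward go to the test set

def split_by_date(records):
    """Split image-pair records chronologically and verify patient disjointness."""
    train = [r for r in records if r["scan_date"] < TEST_START]
    test = [r for r in records if r["scan_date"] >= TEST_START]
    overlap = {r["patient_id"] for r in train} & {r["patient_id"] for r in test}
    if overlap:
        raise ValueError(f"patients in both sets: {sorted(overlap)}")
    return train, test

# Hypothetical records, for illustration only.
records = [
    {"patient_id": "P01", "scan_date": date(2018, 11, 20)},
    {"patient_id": "P02", "scan_date": date(2019, 4, 3)},
    {"patient_id": "P03", "scan_date": date(2019, 5, 15)},
    {"patient_id": "P04", "scan_date": date(2019, 6, 9)},
]
train_set, test_set = split_by_date(records)
```

Raising an error on overlap, rather than silently dropping records, makes any leakage between the two sets visible before training.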

All OCT images were desensitised to patient identifying information.

Image synthesis

Synthesising a post-therapeutic OCT image based on a pretherapeutic OCT image can be essentially viewed as an image-to-image translation task. The state-of-the-art methods for image-to-image translation are pix2pix18 and its high-resolution variant pix2pixHD,19 both developed based on GANs.17 In this work, we repurposed the pix2pixHD method for generating a novel OCT image from a given OCT image.

A GAN for image-to-image translation consists of a generative model and a discriminative model, where the generative model is employed to make the translation, while the discriminative model is designed to optimise the translation process of the generative model. Our pix2pixHD-based solution is illustrated in figure 1. The generative model takes a pair of real pretherapeutic and post-therapeutic images as input data, with no label, segmentation or any clinical information, and generates a synthetic post-therapeutic OCT image as output, with no binary conclusion or any label. It is the responsibility of the discriminative model to discriminate the real post-therapeutic OCT images from the synthetic post-therapeutic OCT images.

Figure 1

A conceptual illustration of the pix2pixHD-based solution used in this study for generating post-therapeutic OCT images from pretherapeutic OCT images. OCT, optical coherence tomography.

The two models are trained in an adversarial manner. The generative model tries to generate synthetic images that are as realistic as possible, while the discriminative model aims to distinguish the fake pairs given by the generative model. In the training process, we train one model at a time with the weights of the other model fixed so that the two models can make improvements in turn. At the end of the training, the generative model is supposed to generate images that are realistic enough, as demonstrated in the last column of figure 1.
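
For reference, the adversarial game described above is usually written as the conditional GAN objective used by pix2pix, shown below. This is the standard formulation and not necessarily the exact loss optimised in this study; pix2pixHD additionally uses multi-scale discriminators and a feature-matching term. Here x denotes a pretherapeutic image, y the real post-therapeutic image, G the generative model and D the discriminative model.

```latex
\min_{G}\,\max_{D}\;
\mathbb{E}_{(x,y)}\bigl[\log D(x,y)\bigr]
+ \mathbb{E}_{x}\bigl[\log\bigl(1 - D\bigl(x, G(x)\bigr)\bigr)\bigr]
```

Training one model at a time with the other fixed corresponds to alternating a maximisation step over D with a minimisation step over G.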

The network was implemented in the PyTorch (V.1.1.0) framework with Python (V.3.5). All experiments were performed on Linux with an Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50 GHz and a GeForce RTX 2080 Ti.

Evaluation of synthetic images

Because the synthetic images were intended to assist clinical practice, the quality, authenticity and predictive power of the synthetic post-therapeutic OCT images were each evaluated in a separate experiment (figure 2).

Figure 2

Algorithm of the result generation and evaluation. GAN, generative adversarial network; nAMD, neovascular age-related macular degeneration; OCT, optical coherence tomography.

Figure 3

Illustration of the results of GAN training. (A–C) and (D–F) are two bundles of related OCT images. (A) and (D) are pretherapeutic images; (B) and (E) are synthetic post-therapeutic images; (C) and (F) are real post-therapeutic images. (A–C) In example 1, the cystoid macular oedema is completely absorbed after anti-VEGF treatment, that is, the macula turns ‘dry’, which is consistent with the synthetic image. (D–F) In example 2, the subretinal fluid is partially absorbed and remains between PEDs after anti-VEGF treatment, that is, the macula remains ‘wet’. The synthetic image could accurately predict the macular status. GAN, generative adversarial network; OCT, optical coherence tomography; PED, pigment epithelial detachment; VEGF, vascular endothelial growth factor.

Experiment 1 assessed the quality of the synthetic post-therapeutic OCT images; synthetic images with insufficient quality (ie, presence of two overlapped layers of neuroretina or chorioretinal coloboma) were excluded from further evaluation in experiments 2 and 3. Experiment 2 evaluated the authenticity of the synthetic images, in which images were judged to be real or synthetic by masked retinal specialists. Experiment 3 evaluated their predictive accuracy in clinical practice: the macular status (wet or dry) was assessed in the pretherapeutic images, synthetic post-therapeutic images and real post-therapeutic images, and the predictive accuracy of macular status in the synthetic images was validated. The details of these experiments are explained below.

Experiment 1: Evaluation of the quality of synthetic post-therapeutic OCT images for clinical interpretation.

All synthetic images were presented to two retinal specialists in turn. They independently answered Question 1: ‘Is the image qualified for clinical interpretation?’ They were required to reach an agreement by either discussion or referring to a more experienced retinal specialist. Only images with sufficient quality were further evaluated in experiments 2 and 3.

Experiment 2: Discrimination between real and synthetic images by retinal specialists based on image pairs.

The real pretherapeutic and post-therapeutic images and the corresponding synthetic post-therapeutic image of each OCT B-scan in the test set were simultaneously displayed to the two retinal specialists. The pretherapeutic image was marked for reference, and the real post-therapeutic and synthetic post-therapeutic images appeared in a random sequence with no label. The retinal specialists independently answered question 2: ‘Which is the synthetic image (the first, second or undecided)?’.
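
The masked presentation in experiment 2 can be sketched as below: for each case, the real and synthetic post-therapeutic images are shown in a random order, and the hidden order is used afterwards to score the grader's answer. The function names, the fixed seed, the case identifiers and the choice to count ‘undecided’ as not identified are illustrative assumptions, not details reported in the study.

```python
import random

def make_presentation(case_ids, seed=0):
    """Assign, per case, which slot (1 or 2) holds the synthetic image."""
    rng = random.Random(seed)
    return {cid: rng.choice([1, 2]) for cid in case_ids}

def identification_rate(synthetic_slot, answers):
    """Fraction of cases where the grader picked the synthetic image.

    answers maps case id -> 1, 2 or None (None means 'undecided',
    which is counted here as not identified).
    """
    correct = sum(1 for cid, slot in synthetic_slot.items()
                  if answers.get(cid) == slot)
    return correct / len(synthetic_slot)

cases = [f"case{i:02d}" for i in range(1, 11)]  # hypothetical case ids
slots = make_presentation(cases)
# Hypothetical grader who always answers 'undecided'.
rate_undecided = identification_rate(slots, {cid: None for cid in cases})
# Hypothetical grader who always picks the synthetic image.
rate_perfect = identification_rate(slots, dict(slots))
```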

Experiment 3: Classification of the macular status as wet or dry macula on post-therapeutic images.

All of the pretherapeutic, real post-therapeutic and synthetic post-therapeutic images were displayed separately, without masking, to two retinal specialists in turn, and they were asked to answer the following questions. Question 3: ‘Is the macula shown in this image wet or dry?’ Question 4: ‘Does the macula turn from wet to dry?’ A dry macula was defined as the absence of subretinal and intraretinal fluid on a single OCT image, whereas a wet macula was defined as the presence of intraretinal and/or subretinal fluid.13–15 The specialists needed to reach an agreement by discussion or by referring to a more experienced retinal specialist. We then evaluated the predictive accuracy of macular status in the synthetic post-therapeutic images, which indicates the short-term structural response to anti-VEGF therapy for eyes with nAMD. The predictive accuracy of macular wet-to-dry conversion was also evaluated based on the macular status results.
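
The definitions used in experiment 3 reduce to simple rules, sketched here. The function names and boolean inputs are illustrative; in the study these judgements were made by retinal specialists reading the images, not by software, and the `adjudicate` helper models only the referral to a more experienced specialist, not resolution by discussion.

```python
def macular_status(intraretinal_fluid, subretinal_fluid):
    """Return 'wet' if either fluid type is present on the image, else 'dry'."""
    return "wet" if (intraretinal_fluid or subretinal_fluid) else "dry"

def wet_to_dry(pre_status, post_status):
    """True if a wet macula turned dry; None if it was not wet beforehand."""
    if pre_status != "wet":
        return None  # conversion is only defined for previously wet maculae
    return post_status == "dry"

def adjudicate(grade_a, grade_b, senior_grade):
    """Use the two graders' shared answer, else defer to the senior grader."""
    return grade_a if grade_a == grade_b else senior_grade
```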

The primary outcome of this study was the diagnostic accuracy of question 3, that is, the diagnostic accuracy of the synthetic images.

Evaluation metrics and statistical analysis

Statistical analyses were performed using SPSS software V.22 (IBM-SPSS). The rate of identification in experiments 2 and 3 was described as rate (95% CI). The statistical analysis for experiment 3 was performed using metrics including accuracy, sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio and κ score. Results with p<0.05 were considered statistically significant.
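
The metrics listed above can all be derived from a 2×2 contingency table of synthetic-image status against real-image status. The sketch below is one way to compute them, not the SPSS procedure used in the study. The example counts are an assumption: they are chosen to be consistent with the percentages reported in the Results (25 wet real images out of 46, with 84% of wet and 86% of dry maculae identified) and are not copied from table 2.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic metrics; 'wet' is treated as the positive class.

    Assumes no cell combination makes a denominator zero
    (e.g. specificity must be below 1 for the positive likelihood ratio).
    """
    n = tp + fp + fn + tn
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    po = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "accuracy": po,
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "kappa": (po - pe) / (1 - pe),
    }

# Illustrative counts only (see the note above); not taken from the study's tables.
m = diagnostic_metrics(tp=21, fp=3, fn=4, tn=18)
```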

Results

A total of 476 pairs of OCT images were assigned to the training set, and 50 pairs from other patients were allocated to the test set (online supplementary figure 1). Consequently, a total of 50 synthetic post-therapeutic images were generated based on pretherapeutic OCT images of patients with nAMD in the test set. In the training set, the mean (SD) age of patients was 71.31 (9.46) years, and 67.0% were eyes from male patients. In the test set, the mean (SD) age of patients was 67.24 (8.72) years, and 52.0% were eyes from female patients. More baseline clinical characteristics are shown in table 1.

Table 1

Baseline clinical characteristics

In experiment 1, the two retinal specialists disagreed on the judgement of three synthetic images: specialist 1 judged them unqualified, whereas specialist 2 judged them qualified. The third specialist judged the three images qualified. Finally, 46 images (92%) were identified as qualified for clinical interpretation and underwent experiments 2 and 3. Examples of qualified and unqualified images are shown in figure 3 and online supplementary figure 2.

Discrimination between real and synthetic post-therapeutic images by retina specialists

In experiment 2, the rate of correctly identifying the synthetic image was 0.30 (95% CI 0.17 to 0.44) for specialist 1 and 0.26 (95% CI 0.13 to 0.39) for specialist 2.
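
The reported intervals are consistent with a normal-approximation (Wald) interval on 46 images, although the interval method is not stated in the text. The sketch below shows that calculation; the function name is illustrative, and the count of 14 correct identifications is inferred from the reported rate of 0.30 and is an assumption.

```python
from math import sqrt

def wald_ci(successes, n, z=1.96):
    """Normal-approximation 95% CI for a proportion, clipped to [0, 1]."""
    p = successes / n
    half = z * sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half), min(1.0, p + half)

# 14 of 46 is an inferred count, not a figure stated in the study.
p, lo, hi = wald_ci(14, 46)
print(f"{p:.2f} (95% CI {lo:.2f} to {hi:.2f})")  # 0.30 (95% CI 0.17 to 0.44)
```

With small samples or proportions near 0 or 1, a Wilson or exact interval would usually be preferred over the Wald interval.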

Classification of the macular status as wet or dry macula on post-therapeutic images

The macular status was classified as wet macula in 25 (54.35%) of the real post-therapeutic images (table 2).

Table 2

Contingency table of macula status in experiment 3

In experiment 3, the two retinal specialists disagreed on the judgement of one real post-therapeutic image and seven synthetic post-therapeutic images. The third specialist agreed with each of the two specialists on four images. The macular status was used to evaluate the diagnostic accuracy of the synthetic OCT images. The accuracy of predicting macular status (table 3) was 0.85 (95% CI 0.74 to 0.95). The rate of successfully identifying a wet macula was 84%, while that of a dry macula was 86%.

Table 3

Accuracy of the synthetic post-therapeutic OCT images in determining the macular status and predicting wet-to-dry macular conversion in experiment 3

To determine the accuracy of the synthetic OCT images in predicting whether the macula would turn from wet to dry after a single dose of anti-VEGF injection, we calculated indices of diagnostic accuracy based on the images of eyes with a wet macula before treatment (42 cases). The accuracy was 0.81 (95% CI 0.69 to 0.93) (table 3).

Discussion

In the current study, we presented and evaluated an AI-based method to generate synthetic post-therapy OCT images to predict the structural alterations after anti-VEGF therapy for nAMD. Our results demonstrated that 92% of the synthetic images have acceptable quality for clinical interpretation, with an 85% accuracy to predict the post-treatment macular status.

In real-world clinical practice, anti-VEGF treatment regimens often result in undertreatment, especially in low-income and middle-income countries, because patients find it hard to sustain the frequent, resource-consuming monitoring and costly treatments.20 Therefore, it is important to predict the therapeutic response to anti-VEGF therapy for nAMD. AI-based assessments of treatment requirements have been regarded as a reliable component of management in retinal practice.21 Bogunovic et al reported that machine learning could be used to predict the requirement of anti-VEGF injections over approximately 2 years based on OCT images from the HARBOR study (a pHase III, double-masked, multicenter, randomized, Active treatment-controlled study of the efficacy and safety of 0.5 mg and 2.0 mg Ranibizumab administered monthly or on an as-needed Basis (PRN) in patients with subfoveal neOvasculaR age-related macular degeneration).22 However, predicting the required number of anti-VEGF injections is a relatively simple output that cannot directly guide clinical practice, and because of the source of the patients and images used, the AI model established in that study can only reproduce the pattern of a specific protocol and apply to a specific population, that is, the protocol and population of the HARBOR study.21 In contrast, our study was based on real-world patients who had undergone the 3+PRN protocol, which is one of the most popular protocols. Our results might be applicable to real-world patients at different treatment stages, such as the loading phase and the PRN phase. Prahs et al included real-world patients in their study on anti-VEGF treatment indication using a deep learning algorithm.23 The input training data were images classified as ‘injection’ or ‘no injection’ based on subjective opinions from experts, aiming to achieve a binary prediction of treatment indication of anti-VEGF medications. However, the results lacked explainability and objective indices related to the treatment decision, a limitation that could be overcome by image prediction based on objective pretherapeutic and post-therapeutic structural changes.

The results of this study showed that over 90% of the synthetic OCT images had sufficient quality for clinical interpretation, revealing the potential of such synthetic images to be used in real clinical settings. Moreover, when retinal specialists were asked to discriminate between a synthetic and a real OCT image (experiment 2), their accuracy was low, indicating that the synthetic images were difficult to distinguish from the real ones. Furthermore, the acceptable accuracy of synthetic OCT images in predicting post-therapeutic macular status and macular wet-to-dry conversion is encouraging, because the macular status is one of the most important elements for the decision of anti-VEGF therapy.13–15 Several factors could be improved to optimise our results. First, our results were fully based on a training set with no label of lesions or anatomical structures. The performance of the GANs, in terms of image quality, image authenticity, predictive accuracy of macular status, accurate location of lesions and fluid quantity, would be expected to improve when trained with labelled OCT images. The current study is the first and essential step in investigating the possibility of using GANs to generate post-treatment OCT images. Second, baseline characteristics revealed that there was a significant difference between the training set and the test set in terms of the anti-VEGF medications, whose pharmacological mechanisms vary. This might result in a different proportion of macular wet-to-dry conversion after injection of anti-VEGF agents. However, the efficacy of various anti-VEGF agents needs further investigation in the real world.

In terms of model selection, we aimed to choose a model that could generate reliable and vivid post-therapeutic OCT images with detailed retinal layers and lesions. For this purpose, we selected pix2pixHD on the basis of the previous literature.19 Pix2pixHD uses a ‘coarse-to-fine’ training strategy. The generator does not directly synthesise high-resolution images. Instead, a low-resolution global generator is trained first, and a local enhancement generator is then added around it to increase the output resolution. Our results suggest that the model is not effective in processing images with large lesion areas and large deformation of the retinal structure. There are two likely reasons for this. First, there were few such instances in the training set, and second, the GAN encounters difficulties in synthesising images when the retinal structure is deformed and the layer information is inconspicuous. Therefore, in follow-up work, we will expand the data set and manually mark the lesion and retinal layer information to guide the training.

As this is the first study to apply GAN to predicting therapeutic response in ophthalmology, and as it is a small-sample study that included only Chinese patients, it has several limitations. First, the major concern is that there was no label of lesions and retinal structures on OCT images. In addition to the influence on the model performance, the lack of labels would hinder the quantitative evaluation of images, such as measuring the central retinal thickness. Future studies will be conducted with elaborate lesion labels and delineation to improve both the model performance and evaluation approach, as noted above. Second, there are multiple prognostic factors of anti-VEGF therapy in nAMD patients. Although we have chosen one of the most important ones, that is, the structural information provided by OCT images, it cannot represent all aspects of a patient’s prognosis. Thus, doctors should interpret the synthetic images carefully to avoid overinterpretation. Further research could cover more prognostic factors, such as the specific anti-VEGF agent and race, to make the model more applicable. Third, the test set used in this study was not an external validation dataset, although the baseline characteristics of included eyes differed significantly between the training set and the test set, and the subjects of the two sets were enrolled at different times. The possibility of overfitting therefore still deserves attention. Moreover, only OCT images of good quality were included, which might have introduced selection bias. Finally, although we put forward a method to generate predicted OCT images, which are more intuitive and convenient for ophthalmologists to understand than numeric metrics alone, we were not able to explain how the predictive images were generated, and the black-box issue of how the AI models understand the process of prediction remains unsolved.

Based on our findings, there are several implications for future research. We included only short-term post-therapeutic B-scans, but the long-term outcome is also of concern to ophthalmologists and patients. Furthermore, in the real clinical scenario, a cube is obtained from one scan of a single eye. Thus, it could be more practical if the input data were the images of an intact cube rather than a single selected image. In addition, as stated above, after applying lesion labels to this model, lesion-based image generation could be expected rather than image-based image generation.

In conclusion, our results revealed the great potential of GAN to generate post-therapeutic OCT images with both good quality and high accuracy. This work might also motivate more research into the application of GANs to OCT image prediction in ophthalmology.

Footnotes

  • YL and JY are joint first authors.

  • Correction notice This paper has been updated since it was published online. The affiliation, Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, China, was missed for the following authors, Yutong Liu, Jingyuan Yang, Weihong Yu, and Youxin Chen.

  • Contributors YL and JY proposed the idea, designed the study, collected images, evaluated images and wrote both the original and revised manuscript together. They contributed equally to these parts. YZ, WW and JZ were in charge of the AI training procedure and participated in writing the AI section of the Methods under the guidance of DD and XL. DZ participated in statistical analysis. WY and YC helped with study design and the writing of the manuscript, and supervised the conduct of this study.

  • Funding Chinese Academy of Medical Sciences Initiative for Innovative Medicine (CAMS-I2M, 2018-I2M-AI 001). Pharmaceutical collaborative innovation project of Beijing Science and Technology Commission (Z191100007719002). National Key Research and Development Project (SQ2018YFC200148). National Natural Science Foundation of China (NSFC) (81670879). Beijing Natural Science Foundation (4202033)

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval The Institutional Review Board of Peking Union Medical College Hospital approved this retrospective study (No. S-K631). The study followed the tenets of the Declaration of Helsinki.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available on reasonable request.
