Aims To develop a deep learning (DL) model for automatic classification of macular hole (MH) aetiology (idiopathic or secondary), and a multimodal deep fusion network (MDFN) model for reliable prediction of MH status (closed or open) at 1 month after vitrectomy and internal limiting membrane peeling (VILMP).
Methods In this multicentre retrospective cohort study, a total of 330 MH eyes with 1082 optical coherence tomography (OCT) images and 3300 clinical data items collected from four ophthalmic centres were used to train, validate and externally test the DL and MDFN models. A total of 266 eyes from three centres were randomly split at the eye level into a training set (80%) and a validation set (20%). In the external testing dataset, 64 eyes were included from the remaining centre. All eyes underwent macular OCT scanning at baseline and 1 month after VILMP. The area under the receiver operating characteristic curve (AUC), accuracy, specificity and sensitivity were used to evaluate the performance of the models.
Results In the external testing set, the AUC, accuracy, specificity and sensitivity of the MH aetiology classification model were 0.965, 0.950, 0.870 and 0.938, respectively; the AUC, accuracy, specificity and sensitivity of the postoperative MH status prediction model were 0.904, 0.825, 0.977 and 0.766, respectively; the AUC, accuracy, specificity and sensitivity of the postoperative idiopathic MH status prediction model were 0.947, 0.875, 0.815 and 0.979, respectively.
Conclusion Our DL-based models can accurately classify the MH aetiology and predict the MH status after VILMP. These models would help ophthalmologists in diagnosis and surgical planning of MH.
Data availability statement
Data are available on reasonable request (e-mail: email@example.com).
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Macular hole (MH), a full-thickness defect of the neurosensory retina tissue at the fovea, is one cause of central vision deterioration.1 MH can be divided into two categories according to the aetiology. The aetiology of idiopathic MH (IMH) is unknown, while secondary MH (SMH) is caused by known aetiologies like high myopia and trauma. IMH is the most common type of MH. It is estimated to affect 0.1%–0.8% of the population aged over 44 years, and nearly two-thirds of patients with IMH are female.2–4 IMH is thought to develop from pathological vitreoretinal traction at the central macula.5 Patients with IMH with progressive visual impairment and metamorphopsia usually require surgical intervention.6 Vitrectomy and internal limiting membrane peeling (VILMP) has proved effective in treating full-thickness IMH, with success rates of 80%–95%.7 8 However, IMH remains open after routine VILMP in some cases.9 For instance, it was reported that up to 44% of large MHs remained open after the first surgery.10 In patients with an open IMH after initial surgery, a second surgery is often mandatory.11 However, second surgery is typically associated with higher medical costs and less promising visual outcomes.9 SMH is usually caused by high myopia or blunt ocular trauma.12 13 The pathogenesis of SMH is more complicated and the success rate of initial repair is lower than that of IMH.12 14 Therefore, it is important to distinguish IMH from SMH, and to predict the postoperative MH status after initial repair during surgical planning.
Optical coherence tomography (OCT) has been widely used in diagnosis and prognosis assessment of MH.13 15 IMH may have different OCT images from SMH, especially those secondary to high myopia, with its own OCT characteristics.12 For prognosis assessment, some OCT parameters have been identified as factors related to outcomes of MH surgeries. These factors include the minimum diameter of MH (MIN), the base diameter of MH (BASE), hole form factor (HFF), macular hole index (MHI), tractional hole index (THI) and diameter hole index (DHI).12 13 16–19 However, many previous studies mainly focused on the prediction abilities of one OCT parameter only.15
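For orientation, the ratio indices named above are usually defined in the MH literature as simple quotients of OCT calliper measurements. The sketch below uses those commonly cited definitions as an assumption; the definitions actually used in this study are given in its supplemental material and may differ. The function name, argument names and example measurements are hypothetical.

```python
def mh_indices(min_diameter, base_diameter, height, left_arm, right_arm):
    """Ratio indices from OCT calliper measurements (all lengths in micrometres).

    Assumed, commonly cited definitions (not taken from this study's supplement):
      MHI = height / base diameter
      HFF = (left arm + right arm) / base diameter
      THI = height / minimum diameter
      DHI = minimum diameter / base diameter
    """
    if min(min_diameter, base_diameter) <= 0:
        raise ValueError("diameters must be positive")
    return {
        "MHI": height / base_diameter,
        "HFF": (left_arm + right_arm) / base_diameter,
        "THI": height / min_diameter,
        "DHI": min_diameter / base_diameter,
    }


# Hypothetical measurements for illustration only.
example = mh_indices(min_diameter=400, base_diameter=800, height=400,
                     left_arm=300, right_arm=340)
```

Because each index is a ratio of the same few measurements, the indices are strongly correlated with one another, which is one reason a model that weighs them jointly may be more informative than any single parameter.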
Deep learning (DL) is an emerging artificial intelligence technology, which has been successfully applied in various areas, such as image identification and classification.20 21 Convolutional neural network (CNN), one of the most popular DL models, employs multiple convolutional layers for automatic identification and extraction of feature representations. A CNN-based DL model was recently developed to automatically detect IMH using ultra-wide-field fundus images.22 In addition to the CNN, the multimodal deep fusion network (MDFN) is an important technology to exploit the comprehensive information of multimodal data extracted by different unimodal DL models. Different feature sets of the multimodal data can be fused by MDFN to improve the performance of the DL model.23 Given the fact that the surgical outcome of MH is affected by both morphological factors (such as preoperative macular OCT parameters) and other clinical factors (such as duration of MH), MDFN is potentially helpful in the prediction of postoperative MH status by combining the information of these factors.
The current study aimed to develop a DL model for automatic classification of MH aetiology based on macular OCT images, and an MDFN-based DL model to automatically predict MH status after VILMP surgery using preoperative OCT images and other clinical data. We also compared the performance of the MDFN model with that of the unimodal models.
Participants and data collection
In this multicentre retrospective cohort study, eyes with full-thickness MH (ie, IMH and SMH caused by high myopia or trauma) followed up for at least 1 month after VILMP surgery were included. The surgical process of VILMP is described in the online supplemental eMethod 1. The patients’ age, gender and duration of symptoms (eg, progressive visual impairment and metamorphopsia) were extracted from the electronic medical records (EMR). All of the eyes underwent ophthalmologic examinations including slit-lamp biomicroscope anterior segment and fundus examination, and spectral domain-OCT scanning (SD-OCT, Spectralis; Heidelberg Engineering, Heidelberg, Germany) at baseline and 1 month after VILMP. Additional details on OCT examination and OCT parameter measurements are presented in the online supplemental eMethod 2.
MH eyes were recruited retrospectively between January 2014 and August 2020. Among the 349 MH eyes of 334 patients with 1125 preoperative macular OCT images received from four ophthalmic settings, 5 eyes were excluded due to missing preoperative OCT images, 8 eyes were excluded due to missing postoperative OCT images and 6 eyes were excluded due to insufficient quality of OCT images. The remaining 330 eyes of 315 patients with 1082 preoperative macular OCT images (285 IMH eyes with 957 images and 45 SMH eyes with 125 images (36 eyes with 106 images secondary to high myopia, 9 eyes with 19 images secondary to trauma)) and 3300 clinical data (preoperative macular OCT parameters extracted from the OCT device and clinical data from EMR) were used to train, validate and externally test the MH aetiology classification model and postoperative MH/IMH status prediction model. Of these eyes, 266 MH eyes (232 IMH eyes and 34 SMH eyes) collected from the Zhongshan Ophthalmic Center, the Department of Ophthalmology in the Zhujiang Hospital of Southern Medical University and the Department of Ophthalmology in the First Affiliated Hospital of Kunming Medical University were randomly divided into the training set (80% of the eyes) and the validation set (the other 20% of the eyes). A separate set of 64 MH eyes (53 IMH eyes and 11 SMH eyes) collected from the Department of Ophthalmology in the Guangdong Provincial People’s Hospital was used for external testing. The flow chart of the research performed in this study is shown in figure 1.
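The eye-level split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the seed, the rounding rule and the synthetic eye identifiers are all assumptions. The point it shows is that whole eyes, not individual OCT images, are assigned to a subset, so images of one eye never appear in both training and validation.

```python
import random


def split_by_eye(eye_ids, train_fraction=0.8, seed=0):
    """Randomly assign whole eyes to training or validation.

    Splitting on eye identifiers, rather than on individual images, keeps
    every OCT image of a given eye in the same subset.
    """
    unique_eyes = sorted(set(eye_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_eyes)
    n_train = round(train_fraction * len(unique_eyes))
    return set(unique_eyes[:n_train]), set(unique_eyes[n_train:])


# Hypothetical image records: (eye identifier, image file name).
images = [(f"eye{i:03d}", f"eye{i:03d}_scan{j}.png")
          for i in range(266) for j in range(3)]
train_eyes, val_eyes = split_by_eye([eye for eye, _ in images])
train_images = [img for eye, img in images if eye in train_eyes]
val_images = [img for eye, img in images if eye in val_eyes]
```

With 266 eyes and an 80% training fraction, this rounding rule places 213 eyes in training and 53 in validation; the study does not report the exact counts, so these figures follow only from the assumed rule.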
Development of DL model for MH aetiology classification
The Visual Geometry Group network (VGG, Department of Engineering Science, The University of Oxford, Oxford, UK) with 16 convolutional layers was employed for the classification of MH aetiology.24 Preoperative macular OCT images were used as the input data. To meet the input requirements of the VGG network, the OCT images were preprocessed: saturated pixels (intensity value of 255) were removed from the raw images, which were then resized to 224×224 pixels. A threshold probability value of 0.5 was used to classify the MH aetiology (ie, SMH if the value was 0.5 or more, or IMH if the value was <0.5) (figure 2). A detailed description of MH aetiology labelling and the training procedure is provided in the online supplemental eMethod 3.
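The preprocessing and thresholding steps can be illustrated with a short, dependency-free sketch. Several details are assumptions made for illustration, since the text does not specify them: saturated pixels are set to zero, resizing uses nearest-neighbour sampling, and intensities are scaled to the range 0 to 1. The function names and the synthetic image are hypothetical.

```python
def preprocess(image, size=224, saturated=255):
    """Return a size x size list of floats in [0, 1] from rows of 8-bit grey levels.

    Assumptions: saturated pixels are zeroed, resizing is nearest-neighbour,
    and intensities are scaled by 1/255.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(size):
        src_row = image[r * rows // size]  # nearest source row
        row = []
        for c in range(size):
            value = src_row[c * cols // size]  # nearest source column
            if value >= saturated:
                value = 0  # drop the saturated pixel
            row.append(value / 255.0)
        out.append(row)
    return out


def classify_aetiology(p_smh, threshold=0.5):
    """SMH if the predicted probability is at or above the threshold, else IMH."""
    return "SMH" if p_smh >= threshold else "IMH"


# Hypothetical 496 x 512 image with a saturated top-left corner.
raw = [[255 if (r < 10 and c < 10) else 128 for c in range(512)]
       for r in range(496)]
prepared = preprocess(raw)
```

In practice an image library would handle the resizing with interpolation; the loop above only makes the two stated operations explicit.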
Development of DL models for MH/IMH status prediction
Considering that the pathogenesis of SMH is more complicated and the success rate of initial repair is even lower than IMH, two prediction tasks were performed. For the first prediction task, all of the 330 MH eyes, including both 45 SMH eyes and 285 IMH eyes, were included to predict the postoperative MH status. For the second prediction task, only 285 IMH eyes were enrolled to predict the postoperative status of IMH.
Preoperative macular OCT images and clinical data (ie, age, gender, duration of symptoms, MIN, BASE, height of hole, MHI, HFF, DHI and THI) of MH/IMH eyes were used as the input data. The DL model used for MH aetiology classification (VGG network) was also applied to obtain deep features of OCT images for MH/IMH status prediction. In addition, the fully connected (FC) network was employed to obtain deep features of the clinical data.
After the two types of feature sets (multimodal deep features) were extracted, the VGG network, with the top output layer removed, was connected to the FC network. Moreover, the multimodal deep features were fused to have a comprehensive characterisation of MH/IMH. The concatenate operation could integrate semantic data obtained from diverse feature maps, thereby increasing the number of channels and improving model performance. The fused features were linked to a two-channel softmax layer via the FC layer, to arrive at the predictive probability value of MH/IMH status. A threshold probability value of 0.5 was used to predict MH/IMH status (ie, closed if the value was 0.5 or more, or open if the value was <0.5). A detailed description of postoperative MH status labelling and training procedure is given in the online supplemental eMethod 4.
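The fusion step can be sketched in plain Python as a concatenation followed by one fully connected layer and a two-channel softmax. This is a toy illustration under stated assumptions, not the authors' network: the feature vectors, weights, bias and function names are hypothetical, and the real model learns its weights and uses far larger feature vectors from the VGG and FC branches.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def predict_status(image_features, clinical_features, weights, bias,
                   threshold=0.5):
    """Fuse two feature vectors by concatenation; return (P(closed), label)."""
    fused = list(image_features) + list(clinical_features)  # feature-level fusion
    p_open, p_closed = softmax(dense(fused, weights, bias))
    return p_closed, ("closed" if p_closed >= threshold else "open")


# Hypothetical toy inputs: 4 image features and 3 clinical features.
image_features = [0.2, 0.9, 0.1, 0.4]
clinical_features = [0.5, 0.3, 0.7]
weights = [[0.1] * 7,   # row producing the "open" logit
           [0.3] * 7]   # row producing the "closed" logit
bias = [0.0, 0.0]
p_closed, label = predict_status(image_features, clinical_features,
                                 weights, bias)
```

Dropping either input from the concatenation gives the corresponding unimodal predictor, which is how the comparison between the fused and unimodal models can be set up with the same building blocks.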
To compare the predictive performance of the MDFN model with that of the unimodal DL models, we divided the MDFN model into two unimodal DL models with the same structure of VGG network and FC network. We first used the VGG network and the FC network to extract deep features of OCT images and clinical data, respectively. Then the MH/IMH status was predicted by the unimodal DL models via the softmax layer. A schematic illustration of the DL models is presented in figure 2. Statistical analysis is described in the online supplemental eMethod 5. In addition, we generated heatmaps to aid interpretation of the results and increase model transparency (described in the online supplemental eMethod 6).
Demographic data of the eyes enrolled in this study are summarised in table 1. All patients were of the same ethnicity (Chinese). Among the total 330 MH eyes, 217 eyes were from female patients (66%), and the mean age of patients was 59.54±11.03 years. There were 229 eyes with a closed MH (209 IMH and 20 SMH) and 101 eyes with an open MH (76 IMH and 25 SMH) at the 1-month visit.
The performance metrics and the ROC curves of DL models are shown in table 2 and figure 3, respectively. The confusion matrices for external testing set are presented in the online supplemental eResult 1. For MH aetiology classification, the AUC of the DL model in the training set was 1.000, with an accuracy of 0.998, specificity of 1.000 and sensitivity of 1.000. In the validation set, the DL model showed an AUC of 0.997, with an accuracy of 0.986, specificity of 0.994 and sensitivity of 0.964. In the external testing set, the AUC, accuracy, specificity and sensitivity were 0.965, 0.950, 0.870 and 0.938, respectively (table 2A and figure 3A).
For postoperative MH status prediction, the AUC of the MDFN model in the training set was 0.928, with an accuracy of 0.855, specificity of 0.897 and sensitivity of 0.808. In the validation set, the AUC, accuracy, specificity and sensitivity of the MDFN model were 0.881, 0.826, 0.746 and 0.912, respectively. For external testing, the MDFN model achieved an AUC of 0.904, with an accuracy of 0.825, specificity of 0.977 and sensitivity of 0.766 (table 2B and figure 3B). For IMH status prediction, the AUC of the MDFN model was 0.999 in the training set, with an accuracy of 0.988, specificity of 0.989 and sensitivity of 0.987. In the validation set, the AUC of the MDFN model was 0.974, with an accuracy of 0.901, specificity of 1.000 and sensitivity of 0.865. For external testing, the AUC, accuracy, specificity and sensitivity of the MDFN model were 0.947, 0.875, 0.815 and 0.979, respectively (table 2C and figure 3C). These results indicate that our MDFN models can provide accurate prediction of postoperative MH/IMH status, and the accuracy of the IMH status prediction was slightly higher than that of the MH status prediction.
Additionally, the predictive performance of the two unimodal DL models was evaluated in the external testing set. For MH status prediction, the AUC of VGG network was 0.804, with 0.758 accuracy, 0.872 specificity and 0.656 sensitivity. The AUC of FC network was 0.797, with 0.813 accuracy, 0.652 specificity and 0.829 sensitivity (table 2B and figure 3D,E). For IMH status prediction, the AUC of VGG network was 0.836, with an accuracy of 0.755, specificity of 0.800 and sensitivity of 0.762. The AUC, accuracy, specificity and sensitivity of FC network were 0.768, 0.717, 0.625 and 0.892, respectively (table 2C and figure 3F,G). These results suggest that the performance of the MDFN model was significantly better than that of the unimodal DL models for MH/IMH status prediction.
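For readers less familiar with the four reported metrics, the sketch below shows how they are conventionally computed for a binary task. It uses standard textbook definitions rather than the study's own statistical code (which is described in the supplement); the labels, scores and function names are hypothetical. The AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Accuracy, sensitivity and specificity depend on the 0.5 threshold, whereas the AUC summarises ranking quality across all thresholds, which is why the two can diverge for the same model.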
Heatmaps illustrate the regions most important for the decision-making process of the MDFN model (figure 4). In these heatmaps, the gap between the edges of the neuro-retina was identified as the most critical pathological region for prediction of the MH status following VILMP.
In this study, we developed a DL model that could accurately distinguish IMH from SMH secondary to high myopia or trauma based on macular OCT images. We also developed an MDFN model capable of precisely predicting the postoperative IMH status based on preoperative macular OCT images and clinical data from a multicentre dataset. Moreover, the prediction accuracy of the MDFN model was better than that of the unimodal prediction models, suggesting that the MDFN model could improve predictive performance by the integration of image and text features. Furthermore, the prediction accuracy for the postoperative status of IMH only was better than that for all MH. This might be due to different prognostic factors between SMH and IMH, and the limited number of SMH eyes in the study.
MH can be classified into two categories, IMH and SMH caused by known aetiologies such as high myopia and trauma. Treatments for IMH and for SMH caused by high myopia are different. While standard VILMP and gas tamponade are often applied to IMH, SMH due to high myopia usually needs more complicated surgical modalities, such as inverted ILM flap and silicone oil tamponade.14 Macular OCT images of SMH caused by high myopia are also different from those of the IMH, due to the deformation of the posterior ocular surface.12 Thus, it is possible to distinguish these two types of MH based on macular OCT images only. In the present study, we further extended the possibility to automatic classification of IMH and SMH. Despite the limited number of macular OCT images of SMH, our DL model still achieved an AUC of 0.965 and an accuracy of 0.950 in external testing. This might result from the characteristic features of macular OCT images in eyes with SMH caused by high myopia.
Since IMH is the most common type of MH, it is clinically important to predict the anatomical outcomes of IMH surgeries. Accurate prediction of the postoperative IMH status can alleviate patients’ anxieties and help ophthalmologists make better surgical plans. In patients likely to have an unfavourable prognosis after VILMP, more advanced surgical techniques, such as inverted ILM flap and autologous ILM transplantation, can be recommended.10 25 OCT imaging is useful in measuring different aspects of IMH morphology with good repeatability and reproducibility.19 Some of the OCT parameters have been used as the prognostic factors of anatomical outcomes after surgery.17 However, these OCT parameters were evaluated individually in previous studies.15 The accuracy and universality of these unifactor prediction algorithms are limited, as they only analyse the predictive ability of a single parameter, while the anatomical outcomes of IMH surgery are affected by multiple factors.4 19 In this study, we propose an MDFN model that integrates OCT image features and various clinical data of IMH to make an accurate prediction of postoperative IMH status.
Multimodal fusion of different sets of deep features has been proposed in previous studies. For instance, a multimodal DL model was developed for diagnosis of age-related macular degeneration (AMD). Two sets of deep features were extracted from colour fundus photographs and OCT images. After integrating the two types of features, the DL model exhibited an AUC of 0.969.26 In another study, the traditional handcrafted features and deep features of colour fundus photographs were integrated to detect hard exudates for diabetic retinopathy screening, with an AUC of 0.9323 and 0.9644 in two benchmark databases, respectively.27 As mentioned before, the IMH status after surgery is affected by various factors, such as the preoperative OCT morphology and other clinical data. Some of the information on OCT morphology can be measured directly, whereas the rest is embedded in the OCT image and cannot be directly measured. It is important to integrate this information to obtain an accurate prediction of IMH status after surgery. Our MDFN model contains two modules, feature extraction and feature fusion. Two types of IMH deep features were extracted, including features of the preoperative macular OCT images extracted by VGG network and features of the clinical data extracted by FC network. These two sets of features were fused by the MDFN model to predict MH status after VILMP. The higher prediction accuracy of the MDFN model compared with the unimodal DL models in our study indicates that DL models based on the fusion of multimodal feature sets have better prediction performance than DL models based on a unimodal feature set. Consequently, multimodal DL models are more suitable for prediction tasks with influencing factors from multiple feature sets.
Since the prognosis of many ocular diseases such as IMH and AMD is affected by factors from different feature sets (eg, image and text), it is worth applying the multimodal DL models in further studies about prognosis prediction of these ocular diseases.
Multimodal fusion refers to the combination of multiple data sets in various forms (eg, image and text) to perform target prediction, which can exploit comprehensive information provided by multimodal data. Multimodal fusion typically occurs at the feature level. The advantage of feature-level fusion lies in two aspects. First, it can obtain the most discriminatory information from original feature sets. Second, it can eliminate the redundant information resulting from the correlation between different feature sets and make real-time decisions possible. In other words, feature fusion is capable of deriving and gaining the powerful and comprehensive features important for final prediction.23 28
There are several limitations to this study, one of which is the relatively small sample size. However, the IMH eyes included in the study were from multiple ophthalmic centres, and the promising performance of the MDFN model was validated by an independent external dataset, suggesting that our MDFN model has excellent adaptability and generalisability. Another limitation is the manual measurement of preoperative macular OCT parameters, which is subject to measurement errors. Nevertheless, the repeatability and reproducibility of manual measurements on SD-OCT have been shown to be good in previous studies.29 30 Recently, a fully automated DL model for 3D OCT image analysis has been developed for accurate measurement of MH parameters, which is potentially useful for automating MH measures in our DL models in the future.31 Besides, this is a preliminary study to evaluate the possibility of predicting the postoperative IMH status using an MDFN model. Prospective multicentre trials are needed to verify the accuracy of our MDFN model. Lastly, only a small number of eyes with SMH were included in the study, and we did not include eyes with SMH secondary to macular oedema or vitrectomy, as these patients were rarely seen in clinics. Further studies are needed to predict the postoperative status of SMH.
In conclusion, our DL-based models are highly accurate in classification of MH aetiology and prediction of postoperative MH/IMH status. The DL-based models are potentially useful to help make automatic diagnosis and better surgical planning for patients with MH.
Patient consent for publication
The study was conducted according to the Declaration of Helsinki and was approved by the Institutional Review Board of Guangdong Provincial People’s Hospital (GPPH, No. GDREC2020067H). Informed consent was taken from all patients.
The authors thank the School of Computer Science and Engineering, South China University of Technology for their technical assistance with the DL system; the Zhongshan Ophthalmic Centre, the Department of Ophthalmology in the Zhujiang Hospital of Southern Medical University and the Department of Ophthalmology in the First Affiliated Hospital of Kunming Medical University for contributing OCT images and electronic medical records for training, validating and testing DL-based models.
YX, YH and WQ contributed equally.
Contributors HY, HL and TL had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Conception and design: HL, HY, TL, HC, YX, YH and WQ. Acquisition, analysis or interpretation of data: YX, YF, YW, Yu H, SF, X Zeng, LY, WQ, BZ, QW, BL and ZL. Drafting of the manuscript: YX, YH and WQ. Critical revision of the manuscript for important intellectual content: HY, HL, HC, YY, XW, WL and X Zhang. Statistical analysis: YX and WQ. Obtained funding: HL, HY and YH. Administrative, technical or material support: all authors. Supervision: HY, HL and TL.
Funding This study was supported by the Science and Technology Planning Projects of Guangdong Province (2018B010109008 to HL), National Natural Science Foundation of China (81870663 to HY; 82070972 to TL), the Science and Technology Programme of Guangzhou (202002030074 to HY), the Outstanding Young Talent Trainee Programme of Guangdong Provincial People’s Hospital (KJ012019087 to HY), the GDPH Scientific Research Funds for Leading Medical Talents and Distinguished Young Scholars in Guangdong Province (KJ012019457 to HY), the talent introduction fund of Guangdong Provincial People’s Hospital (Y012018145 to HY), the Technology Innovation Guidance Program of Hunan Province (2018SK50106 to YH), the Science Research Foundation of Aier Eye Hospital Group (AM1909D2 to YH) and the Guangzhou Key Laboratory Project (20200201006 to HL).
Disclaimer The funders had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript and decision to submit the manuscript for publication.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.