
Validation of a deep learning system for the detection of diabetic retinopathy in Indigenous Australians
  1. Mark A Chia (1,2,3)
  2. Fred Hersch (4)
  3. Rory Sayres (4)
  4. Pinal Bavishi (4)
  5. Richa Tiwari (4)
  6. Pearse A Keane (1,2)
  7. Angus W Turner (3,5)

  1. Institute of Ophthalmology, University College London, London, UK
  2. Moorfields Eye Hospital NHS Foundation Trust, London, UK
  3. Lions Outback Vision, Lions Eye Institute, Nedlands, Western Australia, Australia
  4. Google Health, Palo Alto, California, USA
  5. Centre for Ophthalmology and Visual Science, The University of Western Australia, Nedlands, Western Australia, Australia

Correspondence to Dr Mark A Chia, Institute of Ophthalmology, University College London, London, EC1V 9EL, UK; mark.a.chia{at}outlook.com

Abstract

Background/aims Deep learning systems (DLSs) for diabetic retinopathy (DR) detection show promising results but can underperform in racial and ethnic minority groups; external validation within these populations is therefore critical for health equity. This study evaluates the performance of a DLS for DR detection among Indigenous Australians, an understudied ethnic group who suffer disproportionately from DR-related blindness.

Methods We performed a retrospective external validation study comparing the performance of a DLS against a retina specialist for the detection of more-than-mild DR (mtmDR), vision-threatening DR (vtDR) and all-cause referable DR. The validation set consisted of 1682 consecutive, single-field, macula-centred retinal photographs from 864 patients with diabetes (mean age 54.9 years, 52.4% women) at an Indigenous primary care service in Perth, Australia. Three-person adjudication by a panel of specialists served as the reference standard.

Results For mtmDR detection, sensitivity of the DLS was superior to the retina specialist (98.0% (95% CI, 96.5 to 99.4) vs 87.1% (95% CI, 83.6 to 90.6), McNemar’s test p<0.001) with a small reduction in specificity (95.1% (95% CI, 93.6 to 96.4) vs 97.0% (95% CI, 95.9 to 98.0), p=0.006). For vtDR, the DLS’s sensitivity was again superior to the human grader (96.2% (95% CI, 93.4 to 98.6) vs 84.4% (95% CI, 79.7 to 89.2), p<0.001) with a slight drop in specificity (95.8% (95% CI, 94.6 to 96.9) vs 97.8% (95% CI, 96.9 to 98.6), p=0.002). For all-cause referable DR, there was a substantial increase in sensitivity (93.7% (95% CI, 91.8 to 95.5) vs 74.4% (95% CI, 71.1 to 77.5), p<0.001) and a smaller reduction in specificity (91.7% (95% CI, 90.0 to 93.3) vs 96.3% (95% CI, 95.2 to 97.4), p<0.001).
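As an illustration of how metrics of this kind can be computed, the following is a minimal sketch, not the authors' analysis code. It assumes binary labels per image (1 = disease present by the reference standard), uses a Wilson score interval as a stand-in because the abstract does not state how the confidence intervals were derived, and uses an exact two-sided McNemar test on discordant pairs. All function names and the example labels are hypothetical and are not study data.

```python
from math import comb, sqrt


def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) for binary labels (1 = positive).

    Assumes the reference standard contains at least one positive and one
    negative case; otherwise a division by zero occurs.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)
    fn = sum(1 for r, p in pairs if r and not p)
    tn = sum(1 for r, p in pairs if not r and not p)
    fp = sum(1 for r, p in pairs if not r and p)
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method, z=1.96 for 95%)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value from discordant pair counts.

    For a sensitivity comparison, restrict to reference-positive images:
    b = images graded correctly by grader A only,
    c = images graded correctly by grader B only.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    reference = [1, 1, 1, 1, 0, 0, 0, 0]
    dls = [1, 1, 1, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(reference, dls))
    print(wilson_interval(8, 10))
    print(mcnemar_exact(1, 9))
```

McNemar's test suits this design because both graders read the same images, so only the discordant pairs carry information about a difference between them.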

Conclusion The DLS showed improved sensitivity and similar specificity compared with a retina specialist for DR detection. This demonstrates its potential to support DR screening among Indigenous Australians, an underserved population with a high burden of diabetic eye disease.

  • Retina
  • Diagnostic tests/Investigation
  • Imaging

Data availability statement

No data are available.

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Twitter @markachia, @pearsekeane

  • Contributors MAC: research design, data acquisition, data analysis, data interpretation, manuscript preparation, and guarantor. FH: research design, data interpretation and manuscript revision. RS: data analysis, data interpretation and manuscript revision. PB, RT and PAK: data interpretation and manuscript revision. AT: research design, data interpretation and manuscript revision. All authors approved the final manuscript.

  • Funding Google LLC funded this study, and participated in the design of the study, conducting the study, data collection, data management, data analysis, interpretation of the data, preparation, review and approval of the manuscript. MAC: Supported by a General Sir John Monash Scholarship. PAK: Supported by a Moorfields Eye Charity Career Development Award (R190028A) and a UK Research & Innovation Future Leaders Fellowship (MR/T019050/1).

  • Competing interests PAK has acted as a consultant for DeepMind, Roche, Novartis and Apellis and is an equity owner in Big Picture Medical. He has received speaker fees from Heidelberg Engineering, Topcon, Allergan and Bayer. FH, RS, PB and RT are employees of Google LLC and own Alphabet stock.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
