Table 4

Confusion matrices for the deep convolutional ensemble and board-certified ophthalmologists, showing the mean (SD) per cent agreement between ‘confident’ predicted labels and the ground-truth labels over the test set

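| Ground-truth label | Deep convolutional ensemble: Normal | Deep convolutional ensemble: DR | Deep convolutional ensemble: Glaucoma | Deep convolutional ensemble: AMD | Ophthalmologists: Normal | Ophthalmologists: DR | Ophthalmologists: Glaucoma | Ophthalmologists: AMD |
|---|---|---|---|---|---|---|---|---|
| Normal | 25.1% (0.9%) | 1.3% (0.9%) | 0.0% (0.0%) | 0.0% (0.0%) | 22.1% (3.0%) | 0.3% (0.6%) | 0.2% (0.6%) | 0.2% (0.5%) |
| DR | 5.7% (1.2%) | 18.9% (1.0%) | 0.0% (0.0%) | 0.0% (0.0%) | 7.8% (5.5%) | 12.7% (6.8%) | 0.2% (0.6%) | 2.1% (1.1%) |
| Glaucoma | 5.7% (2.3%) | 0.2% (0.5%) | 18.9% (2.7%) | 0.0% (0.0%) | 4.3% (4.1%) | 0.4% (1.0%) | 18.5% (4.2%) | 2.0% (0.8%) |
| AMD | 2.8% (0.6%) | 1.9% (1.2%) | 0.0% (0.0%) | 19.4% (1.5%) | 0.5% (0.9%) | 2.1% (1.6%) | 0.2% (0.6%) | 26.3% (2.3%) |
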
  • Green cells (the diagonal) indicate agreement between the ground-truth labels and the predictions of the deep convolutional ensemble or ophthalmologists; red cells indicate disagreement.

  • AMD, age-related macular degeneration; DR, diabetic retinopathy.
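
As a reading aid, the diagonal of each confusion matrix can be summed to give the mean overall agreement on ‘confident’ predictions, and each diagonal entry can be divided by its row total to approximate per-class sensitivity. The following is a minimal Python sketch using the mean values transcribed from Table 4. The names `ENSEMBLE`, `OPHTHALMOLOGISTS`, `overall_agreement` and `per_class_sensitivity` are illustrative and do not come from the original work. Because the published cells are rounded means, rows do not sum to exactly 100%, and a sensitivity computed from mean cells is only an approximation.

```python
# Mean per cent values from Table 4 (SDs omitted).
# Rows: ground-truth label; columns: predicted label.
CLASSES = ["Normal", "DR", "Glaucoma", "AMD"]

ENSEMBLE = [
    [25.1, 1.3, 0.0, 0.0],
    [5.7, 18.9, 0.0, 0.0],
    [5.7, 0.2, 18.9, 0.0],
    [2.8, 1.9, 0.0, 19.4],
]

OPHTHALMOLOGISTS = [
    [22.1, 0.3, 0.2, 0.2],
    [7.8, 12.7, 0.2, 2.1],
    [4.3, 0.4, 18.5, 2.0],
    [0.5, 2.1, 0.2, 26.3],
]


def overall_agreement(matrix):
    """Sum of the diagonal: per cent of cases where prediction matches ground truth."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def per_class_sensitivity(matrix):
    """Diagonal entry divided by its row total, for each ground-truth class.

    Assumption: a ratio of mean cells is used as a rough stand-in for
    the mean of per-run sensitivities, which the table does not report.
    """
    return {
        CLASSES[i]: matrix[i][i] / sum(matrix[i])
        for i in range(len(matrix))
    }


if __name__ == "__main__":
    for name, m in [("Ensemble", ENSEMBLE), ("Ophthalmologists", OPHTHALMOLOGISTS)]:
        print(name, "overall agreement (%):", round(overall_agreement(m), 1))
        for label, s in per_class_sensitivity(m).items():
            print("  ", label, "approx. sensitivity:", round(s, 3))
```

Summing the diagonals in this way gives about 82.3% for the ensemble and 79.6% for the ophthalmologists, which follows directly from the table entries.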