Table 2

Confusion matrices for the deep convolutional ensemble and board-certified ophthalmologists, showing the mean (and SD) per cent agreement between the predicted labels and the ground-truth labels over the test set

| Ground-truth label | Ensemble: Normal | Ensemble: DR | Ensemble: Glaucoma | Ensemble: AMD | Ophthalmologists: Normal | Ophthalmologists: DR | Ophthalmologists: Glaucoma | Ophthalmologists: AMD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Normal | 23.8% (0.8%) | 1.2% (0.8%) | 0.0% (0.0%) | 0.0% (0.0%) | 21.9% (2.3%) | 1.3% (1.4%) | 1.4% (2.1%) | 0.4% (0.5%) |
| DR | 6.6% (1.1%) | 18.2% (1.1%) | 0% (0.0%) | 0.2% (0.4%) | 9.7% (5.1%) | 12.4% (4.5%) | 1.0% (1.4%) | 1.9% (1.1%) |
| Glaucoma | 6.6% (2.3%) | 0.2% (0.4%) | 18.2% (2.7%) | 0.0% (0.0%) | 5.6% (4.8%) | 0.9% (1.6%) | 17.1% (3.8%) | 1.4% (0.5%) |
| AMD | 3.2% (0.4%) | 2.8% (1.5%) | 0.0% (0.0%) | 19% (1.2%) | 0.7% (0.8%) | 2.6% (1.5%) | 0.4% (0.5%) | 21.3% (1.4%) |
  • Green cells (the diagonal of each matrix) indicate agreement between the ground-truth labels and the predictions of the deep convolutional ensemble or the ophthalmologists; red cells (off the diagonal) indicate disagreement.

  • AMD, age-related macular degeneration; DR, diabetic retinopathy.
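As a reading aid, the sketch below shows one way to turn the mean percentages in Table 2 into an overall agreement figure and a per-class recall for each rater. It assumes that each matrix is normalised so that all 16 cells sum to 100% (each ground-truth row sums to 25%), which the transcribed means are consistent with. The function names are hypothetical, and the derived figures come from the rounded means in the table, so they are illustrative rather than values reported by the authors.

```python
# Mean per cent agreement transcribed from Table 2.
# Rows are ground-truth labels, columns are predicted labels.
CLASSES = ["Normal", "DR", "Glaucoma", "AMD"]

ENSEMBLE = [
    [23.8, 1.2, 0.0, 0.0],
    [6.6, 18.2, 0.0, 0.2],
    [6.6, 0.2, 18.2, 0.0],
    [3.2, 2.8, 0.0, 19.0],
]

OPHTHALMOLOGISTS = [
    [21.9, 1.3, 1.4, 0.4],
    [9.7, 12.4, 1.0, 1.9],
    [5.6, 0.9, 17.1, 1.4],
    [0.7, 2.6, 0.4, 21.3],
]


def overall_accuracy(matrix):
    """Diagonal mass as a percentage of the whole matrix (hypothetical helper)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


def per_class_recall(matrix):
    """For each ground-truth class, the percentage of its row on the diagonal."""
    return {
        name: 100.0 * row[i] / sum(row)
        for i, (name, row) in enumerate(zip(CLASSES, matrix))
    }


if __name__ == "__main__":
    for label, matrix in [("ensemble", ENSEMBLE), ("ophthalmologists", OPHTHALMOLOGISTS)]:
        print(label, round(overall_accuracy(matrix), 1))
        for name, recall in per_class_recall(matrix).items():
            print("  ", name, round(recall, 1))
```

Under these assumptions, the diagonal sums give an overall agreement of about 79.2% for the ensemble and about 72.7% for the ophthalmologists. The SD values are not propagated here, so the sketch says nothing about the spread of these derived figures.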