Disease-specific assessment of Vision Impairment in Low Luminance in age-related macular degeneration - a MACUSTAR study report
  Jan Henrik Terheyden (1), Susanne G Pondorfer (1), Charlotte Behning (2), Moritz Berger (3), Jill Carlton (4), Donna Rowen (4), Christine Bouchet (5), Stephen Poor (5), Ulrich F O Luhmann (6), Sergio Leal (7), Frank G Holz (1), Thomas Butt (8), John E Brazier (4), Robert P Finger (1), the MACUSTAR consortium

    1. Department of Ophthalmology, University Hospital Bonn, Bonn, Germany
    2. Department of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Bonn, Germany
    3. Department of Medical Biometry, Informatics and Epidemiology, University of Bonn, Bonn, Germany
    4. School of Health and Related Research, University of Sheffield, Sheffield, UK
    5. Novartis Pharma AG, Basel, Switzerland
    6. Roche Pharmaceutical Research and Early Development, Translational Medicine Ophthalmology, Roche Innovation Center Basel, Basel, Switzerland
    7. Bayer AG, Berlin, Germany
    8. UCL Institute of Ophthalmology, University College London, London, UK

    Correspondence to Professor Robert P Finger, Department of Ophthalmology, University Hospital Bonn, Bonn, Germany; Robert.Finger@ukbonn.de

    Abstract

    Background/aims To further validate the Vision Impairment in Low Luminance (VILL) questionnaire, which captures visual functioning and vision-related quality of life (VRQoL) under low luminance, low-contrast conditions relevant to age-related macular degeneration (AMD).

    Methods The VILL was translated from German into English (UK), Danish, Dutch, French, Italian and Portuguese. Rasch analysis was used to assess psychometric characteristics of 716 participants (65% female, mean age 72±7 years, 82% intermediate AMD) from the baseline visit of the MACUSTAR study. In a subset of participants (n=301), test–retest reliability (intraclass correlation coefficient (ICC) and coefficient of repeatability (CoR)) and construct validity were assessed.

    Results Four items were removed from the VILL with 37 items due to misfit. The resulting Vision Impairment in Low Luminance with 33 items (VILL-33) has three subscales with no disordered thresholds and no misfitting items. No differential item functioning and no multidimensionality were observed. Person reliability and person separation index were 0.91 and 3.27 for the Vision Impairment in Low Luminance Reading Subscale (VILL-R), 0.87 and 2.58 for the Vision Impairment in Low Luminance Mobility Subscale (VILL-M), and 0.78 and 1.90 for the Vision Impairment in Low Luminance Emotional Subscale (VILL-E). ICC and CoR were 0.92 and 1.9 for VILL-R, 0.93 and 1.8 for VILL-M and 0.82 and 5.0 for VILL-E. Reported VRQoL decreased with advanced AMD stage (p<0.0001) and was lower in the intermediate AMD group than in the no AMD group (p≤0.0053).

    Conclusion The VILL is a psychometrically sound patient-reported outcome instrument, and the results further support its reliability and validity across all AMD stages. We recommend the shortened version of the questionnaire with three subscales (VILL-33) for future use.

    Trial registration number NCT03349801.

    • Diagnostic tests/Investigation
    • Macula

    Data availability statement

    Data are available upon reasonable request. The datasets used in the present study are available from the MACUSTAR consortium upon reasonable request.

    This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.

    Key messages

    What is already known on this topic

    • Patient relevance is key for regulatory assessment of age-related macular degeneration (AMD) treatments, but existing patient-reported outcome instruments either do not fulfil regulators’ development requirements or capture AMD patients’ difficulties insufficiently.

    What this study adds

    • The Vision Impairment in Low Luminance (VILL) questionnaire has been developed according to regulatory guidelines and is implemented in the MACUSTAR study. This study supports the psychometric performance including internal consistency, item fit, subscale structure, test–retest reliability and construct validity of the VILL in a multinational, multilanguage setting.

    How this study might affect research, practice or policy

    • The study supports that the VILL is sufficiently precise to capture patient-reported deficits in AMD in future trials.

    Introduction

    There is a large unmet need for effective and safe treatments against onset and progression of age-related macular degeneration (AMD). However, this requires endpoints that capture disease progression reliably over the course of short interventional trials and that are accepted by regulatory authorities and health technology assessment bodies.1–3 Numerous structural biomarkers have previously been identified,4–6 but regulators agree that there is a need for patient-centred approaches, including novel functional tests and patient-reported outcomes (PROs).3

    The visual function deficit in early and intermediate age-related macular degeneration (iAMD) is most pronounced in low-contrast and low-luminance situations, while best-corrected visual acuity under high luminance is often unaffected.5 7–12 Few of the available PRO instruments capture difficulties in low-luminance and low-contrast situations, which are crucial for their use as an endpoint in early and iAMD trials. The Low Luminance Questionnaire (LLQ) and the Night Vision Questionnaire (NVQ) fulfil these specific requirements but have not been developed according to regulatory guidelines, which limits their use in future interventional trials.13–15 Also, available instruments have not been used in the context of multinational, multilingual and multicentre studies. The Vision Impairment in Low Luminance (VILL) questionnaire, a novel vision-related quality of life (VRQoL) instrument meeting these criteria, was developed recently.16 In order to further assess the VILL’s psychometric performance including internal consistency, item fit, subscale structure, test–retest reliability and construct validity in a multinational/multilanguage setting, we report data from the MACUSTAR study, a European low-interventional multicentre study on iAMD progression.1 2

    Materials and methods

    Participants

    The MACUSTAR study is a low-interventional study on the development and validation of functional, structural and patient-reported endpoints in iAMD, conducted at 20 clinical sites across Europe (Denmark, France, Germany, Italy, Netherlands, Portugal and UK).1 2 More details on the study’s design, assessment schedule and outcomes have been published elsewhere.2 In brief, an iAMD cohort (n=585) and three control cohorts (early AMD, n=34; late AMD, n=43; no AMD, n=56) were recruited. An extensive battery of functional, structural and PRO assessments (using the VILL and the generic EuroQol 5-dimension instrument, EQ-5D-5L) was performed by each participant at baseline and repeated within 2 weeks (1–3 weeks, ‘validation visit’) in a subset of 168 iAMD subjects and all subjects from the other three groups (42% of the overall sample) to assess test–retest reliability. This time frame has previously been considered appropriate to minimise recall bias.17 18 Disease stage was assessed independently at the test and retest visits by a central reading centre. Further visits are performed every 6 months over the entire study period for each individual but have not been included in this report. Study inclusion and disease stage classification were based on the current version of the clinical Beckman classification of AMD.19
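
    As a quick consistency check on the cohort sizes quoted above, the following sketch recomputes the overall sample size and the share of participants invited to the validation visit. It uses only the numbers stated in this paragraph; the variable names are illustrative.

```python
# Cohort sizes as stated in the text.
cohorts = {"iAMD": 585, "early AMD": 34, "late AMD": 43, "no AMD": 56}

# Overall MACUSTAR sample.
total = sum(cohorts.values())

# Validation visit: 168 iAMD participants plus everyone in the three control cohorts.
retest_subset = 168 + sum(n for name, n in cohorts.items() if name != "iAMD")

# Share of the overall sample, as a whole-number percentage.
retest_share_percent = round(100 * retest_subset / total)

print(total, retest_subset, retest_share_percent)
```

    The totals agree with the 718 participants, the subgroup of 301 and the 42% reported in the text.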

    The MACUSTAR study has been registered on clinicaltrials.gov.

    Vision Impairment in Low Luminance with 37 Items (VILL-37)

    The VILL questionnaire was developed using in-depth interviews, focus group discussions and cognitive debriefs with patients with AMD, as outlined previously.16 It consists of 37 items with four response options each, plus an additional “not applicable” response option (“Didn’t do this for other reasons” / “Does not apply to me”). The VILL includes two rating scales (online supplemental table 1), referring to difficulty (items 1–24) and frequency (items 25–37). The instrument consists of the three subscales “reading and accessing information” (abbreviated reading, 20 items), “mobility and safety” (abbreviated mobility, 13 items) and “emotional well-being” (abbreviated emotional, 4 items).16 Within the MACUSTAR study, a PRO administration manual was provided to the study sites, ensuring similar test conditions for all participants. Questionnaires were self-administered unless participants requested interviewer administration.2

    Translation and cultural adaptation

    The VILL was originally developed in Germany with German-speaking participants and subsequently translated and culturally adapted into English (United Kingdom, UK), following the principles of good practice for the translation and cultural adaptation process for PRO measures recommended by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR).20 The English (UK) version was evaluated and optimised on the basis of clarity, grammar and spelling, uniqueness, cultural diversity and layout. Five cognitive debriefing interviews were undertaken to ensure comprehension and lack of ambiguity for each item. Inconsistencies were resolved by discussion between translators, patients and the developer (RPF). The English (UK) version then served as the source version for translation and cultural adaptation of the following language versions: Danish (Denmark), Dutch (The Netherlands), French (France), Italian (Italy) and Portuguese (Portugal). Two forward translations into the target languages were provided by native speakers of the respective target language. The translations were subsequently reconciled to a single translation. The reconciled translation was then translated back into English by two independent native English speakers who were blinded to the original texts. Discrepancies were resolved in discussion. The target language versions were proofread by medical translators (native speakers of target language). Five cognitive debriefing interviews were undertaken per target language to ensure comprehension and lack of ambiguity for each item. Inconsistencies were resolved in discussion. The developer (RPF) reviewed all versions following initial translation as well as cognitive debriefs. The overall process of translation and cultural adaptation was performed in collaboration with Oxford University Innovation Ltd., following an established methodology and the ISPOR recommendations.20–22 All translations were undertaken by professional medical translators.

    Psychometric evaluation

    Only baseline data of participants included in the study were used for analysis. Rasch analysis, derived from item response theory, was used to assess the VILL’s psychometric characteristics.23 24 Using the three previously established subscales of the VILL-37,16 a polytomous Rasch model was employed. Rasch analysis was used to assess the unidimensionality of the three subscales, to identify misfitting items in each subscale, to indicate whether item levels were appropriately ordered, and to check that items did not perform differently depending on characteristics of respondents. First, a person-item map was generated and relative person abilities and item difficulties were assessed. We then evaluated threshold ordering of the response categories to investigate the validity of the rating scale. Categories were collapsed where disordered thresholds were observed. To assess item misfit we considered unweighted mean square statistics. Items showing outfit or infit mean-square values >1.4 were removed in an initial step and item fit was re-investigated afterwards. In the case of misfitting items outside a corridor of outfit or infit mean-square values of 0.6 to 1.4, persons with misfitting responses to said item were removed and item fit was re-investigated.25 When this did not improve item fit, the respective item was removed. Internal consistency and the instrument’s capability of detecting different ability levels were investigated using person reliability and person separation index. Values above 0.8 and 2.0, respectively, were considered acceptable.26 27 The targeting of the instrument was assessed based on the person-item map and mean values of person measures and item measures. An absolute difference ≤1.0 logits was considered adequate.26 Dimensionality of the subscales was assessed based on principal component analysis (PCA) of the residuals, with a first-contrast eigenvalue <2.5 supporting unidimensionality of the subscale.28 29 Lastly, we investigated differential item functioning (DIF) based on gender, age group and administration mode. A significant DIF contrast ≥0.64 logits was interpreted as suggestive of biased responses in one of the analysed subgroups.30 P-values <0.05 were considered statistically significant.
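
    The item-fit step described above can be illustrated with a minimal sketch. It assumes an Andrich rating-scale parameterisation of the polytomous Rasch model (the text does not state which polytomous variant was fitted) and assumes that person measures, item difficulty and thresholds have already been estimated by dedicated Rasch software. All function names and the example values are hypothetical; only the 0.6 to 1.4 corridor is taken from the text.

```python
import math

def category_probabilities(theta, item_difficulty, thresholds):
    """Probabilities of response categories 0..len(thresholds) for one person and item."""
    # Cumulative sums of (theta - difficulty - tau_j); category 0 has an empty sum.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - item_difficulty - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_and_variance(probs):
    """Model-expected score and its variance for one response."""
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return expected, variance

def item_fit(observed, thetas, item_difficulty, thresholds):
    """Return (outfit, infit) mean-square statistics for one item across persons."""
    z_squared, squared_residuals, variances = [], [], []
    for x, theta in zip(observed, thetas):
        probs = category_probabilities(theta, item_difficulty, thresholds)
        expected, variance = expected_and_variance(probs)
        squared_residuals.append((x - expected) ** 2)
        variances.append(variance)
        z_squared.append((x - expected) ** 2 / variance)
    outfit = sum(z_squared) / len(z_squared)        # unweighted mean of squared standardised residuals
    infit = sum(squared_residuals) / sum(variances)  # information-weighted mean square
    return outfit, infit

def within_fit_corridor(mean_square, lower=0.6, upper=1.4):
    """Fit corridor of 0.6 to 1.4 as used in the text."""
    return lower <= mean_square <= upper

# Hypothetical example: four persons, one item with four response options (three thresholds).
outfit, infit = item_fit([1, 2, 3, 0], [0.2, 0.5, 1.4, -1.0], 0.0, [-1.5, 0.0, 1.5])
print(within_fit_corridor(outfit), within_fit_corridor(infit))
```

    Outfit weights every response equally and is therefore more sensitive to unexpected responses from persons far from the item's difficulty, whereas infit down-weights them.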

    Statistical analysis

    We performed a subgroup analysis of participants who had baseline and re-test visit data available. Person measures were obtained from Rasch analysis and statistical analysis was performed with R software V.3.6.1 (R Core Team 2020, Vienna, Austria). P-values were reported as part of the descriptive analysis and considered significant when <0.05.

    Test–retest reliability

    Intraclass correlation coefficients (ICCs) with 95% confidence intervals were calculated, accounting for repeated measures within subjects by a random effects term and interpreted following Cicchetti and Sparrow.31 Bland-Altman plots with limits of agreement bands were constructed and compared. Coefficients of repeatability (CoRs) were calculated as 1.96×SD of the differences between the two measurements.32 Deming regression was performed and estimated intercept and slope values were compared, accounting for the variance in the test and retest datasets.33
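
    The agreement statistics named above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study (which was run in R): the ICC is computed in its one-way random-effects ANOVA form, the CoR is taken as 1.96 times the SD of the paired differences, and the Deming regression uses the closed-form slope with an assumed error-variance ratio delta (error variance of the retest divided by that of the test; 1.0 means equal measurement error). Function names and the example data are hypothetical.

```python
import math
import statistics

def bland_altman(test, retest):
    """Mean difference (bias), 95% limits of agreement and coefficient of repeatability."""
    diffs = [b - a for a, b in zip(test, retest)]
    bias = statistics.mean(diffs)
    cor = 1.96 * statistics.stdev(diffs)
    return {"bias": bias, "lower_loa": bias - cor, "upper_loa": bias + cor, "cor": cor}

def icc_oneway(test, retest):
    """One-way random-effects ICC for two measurements per person."""
    n, k = len(test), 2
    person_means = [(a + b) / 2 for a, b in zip(test, retest)]
    grand_mean = statistics.mean(person_means)
    ms_between = k * sum((m - grand_mean) ** 2 for m in person_means) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for a, b, m in zip(test, retest, person_means)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def deming(x, y, delta=1.0):
    """Deming regression of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    syy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    slope = (syy - delta * sxx
             + math.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return mean_y - slope * mean_x, slope

# Hypothetical person measures (logits) at test and retest.
test_scores = [1.2, 0.4, 2.5, 3.1, -0.3, 1.8]
retest_scores = [1.0, 0.6, 2.4, 2.8, -0.1, 1.9]
print(bland_altman(test_scores, retest_scores))
print(icc_oneway(test_scores, retest_scores))
print(deming(test_scores, retest_scores))
```

    A Deming slope below 1 corresponds to the pattern discussed in the results, where persons with high person measures at baseline score slightly lower at re-test.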

    Construct validity

    The association between baseline VILL person measures and AMD disease stage was further investigated with a t-test to support construct validity of the VILL, hypothesising VILL person measures to decrease with AMD stage. To control the analysis for age, gender, the number of comorbidities and administration mode, we additionally performed linear regression analysis with the VILL person measures as dependent variables and AMD stage as an independent variable.
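
    A group comparison of this kind can be sketched as below. The text does not specify which variant of the t-test was used, so the sketch assumes Welch's unequal-variance form; it returns only the test statistic and the approximate degrees of freedom, leaving the p-value to a statistics package. The function name and the example person measures are hypothetical and are not study data.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    se_squared = var_a / n_a + var_b / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se_squared)
    df = se_squared ** 2 / ((var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1))
    return t, df

# Hypothetical person measures (logits) for two disease-stage groups.
late_amd = [0.5, 1.1, -0.2, 0.8, 0.3]
intermediate_amd = [2.4, 3.0, 1.9, 2.7, 3.3]
t_stat, dof = welch_t(late_amd, intermediate_amd)
print(t_stat, dof)
```

    A negative statistic here means lower mean person measures in the first group, which is the direction hypothesised for more advanced AMD stages.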

    Results

    Psychometric evaluation

    We included 716 out of 718 MACUSTAR study participants (65% women) with baseline data in the psychometric evaluation of the VILL (table 1). Baseline data of two participants were unavailable for psychometric evaluation. Two hundred eighty-two participants were aged 55–70 years (39.4%) and 434 participants were aged 71–88 years (60.6%). All items had a low rate of not applicable or missing responses (≤20%), with the majority of these responses being not applicable to the respondent (1664 not applicable item responses; 11 missing item responses; 24 817 total valid responses). None of the items revealed floor effects, but ceiling effects (where respondents indicated no problems) were detectable in 16 items.
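
    The response counts and age-group percentages in this paragraph can be cross-checked with a few lines of arithmetic. The sketch uses only figures stated in the text (716 participants, 37 items, 1664 not applicable and 11 missing responses); the variable names are illustrative.

```python
participants = 716
items = 37
not_applicable = 1664
missing = 11

# Every participant could answer every item once.
possible_responses = participants * items
valid_responses = possible_responses - not_applicable - missing

# Age groups as reported in the text.
aged_55_to_70 = 282
aged_71_to_88 = 434
share_55_to_70 = round(100 * aged_55_to_70 / participants, 1)
share_71_to_88 = round(100 * aged_71_to_88 / participants, 1)

print(possible_responses, valid_responses, share_55_to_70, share_71_to_88)
```

    The result is consistent with the 24 817 valid responses and the 39.4% and 60.6% age-group shares given above.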

    Table 1

    Descriptive statistics of the overall sample and the subsample used for evaluation of repeatability

    The four items of the emotional subscale (items 34–37) loaded positively on the first factor in the PCA of the residuals (correlation coefficient >0.4). The remaining 33 items had an unexplained variance in the first contrast of 3.49, with items related to reading / accessing information and mobility / safety forming two clusters. This confirmed the subscales previously described. As the reading and mobility subscales had an eigenvalue of the unexplained variance in the first contrast >2.0 (table 2), we re-reviewed their content, which did not reveal any further dimensions. In addition, we investigated the person measure correlation between the reading and mobility subscales and clusters of items from these subscales based on the PCA of residuals. The results did not provide evidence for multidimensionality in any of the VILL subscales (online supplemental table 2). Thus, we proceeded with the subscale structure previously identified (reading and accessing information, mobility and safety, and emotional well-being subscales).

    Table 2

    Fit parameters of the VILL-37 and VILL-33, compared with Rasch model requirements

    None of the category thresholds were disordered. Some of the VILL-37 items showed misfit (table 2) which was addressed by successive item reduction (see below). Two additional items of the reading subscale revealed moderate overfit before item reduction but were retained for further evaluation. Reliability indices were in an acceptable range for the reading and mobility subscales, but below the recommended thresholds for the emotional subscale (table 2). There was no evidence of multidimensionality in any subscale.

    Following this, the VILL was revised based on psychometric findings. Three items from the reading subscale and one item from the mobility subscale were successively dropped due to misfit (table 2). The respective initial outfit mean-square values were 3.29, 2.01 and 1.55 for the removed reading / accessing information subscale items and 1.46 for the removed mobility / safety subscale item (online supplemental table 3). When re-investigating the psychometric properties of these two subscales after item reduction, three items of the reading subscale and one item of the mobility subscale showed initial misfit. Omitting 39 and 14 misfitting person responses to these items from the reading subscale and mobility subscales respectively, all items fit the Rasch model (online supplemental table 4). The reliability indices were in an acceptable range and no items showed DIF (table 2). The emotional subscale was less internally consistent than the reading and mobility subscales, but none of its four items showed relevant misfit or DIF (table 2). All emotional subscale items were retained. Similar to the VILL-37, person ability was higher than item difficulty in all subscales.
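
    Person reliability R and person separation G are two views of the same quantity, linked in Rasch measurement by G = sqrt(R / (1 - R)). The sketch below applies this relation to the VILL-33 values reported in the abstract. The function names are illustrative, and small deviations are expected because the published values are rounded.

```python
import math

def separation_from_reliability(reliability):
    """Person separation index implied by a person reliability value."""
    return math.sqrt(reliability / (1.0 - reliability))

def reliability_from_separation(separation):
    """Person reliability implied by a person separation index."""
    return separation ** 2 / (1.0 + separation ** 2)

# (person reliability, person separation index) per VILL-33 subscale, from the abstract.
reported = {"VILL-R": (0.91, 3.27), "VILL-M": (0.87, 2.58), "VILL-E": (0.78, 1.90)}

for subscale, (reliability, separation) in reported.items():
    implied = reliability_from_separation(separation)
    print(subscale, reliability, round(implied, 2))
```

    The same relation shows that a separation of 2.0 corresponds to a reliability of 0.8, so the two acceptability cut-offs express one criterion. Only the emotional subscale (0.78 and 1.90) falls slightly short of it.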

    Subgroup analysis

    A total of 301 participants (62% women) from all study groups were included in this subgroup analysis (table 1), and complete test–retest assessments were available for 289 of them.

    Test–retest reliability

    ICCs of all three subscales of the VILL were excellent in the overall cohort and in the intermediate AMD subgroup (table 3). The overall ICCs of the emotional subscale were significantly lower than ICCs of the reading and mobility subscales. Mean measurement differences in Bland-Altman analysis were close to 0 (figure 1) and Deming regression supported no systematic difference between initial assessment and re-test assessment across the overall sample (table 3). However, there was a trend that persons with high person measures at baseline achieved slightly lower person measures at re-test for some of the groups (Deming regression slope <1: reading subscale: overall group; mobility subscale: overall group, early AMD, late AMD; emotional subscale: overall group, iAMD, early AMD, late AMD; table 3). Though these proportional differences were most pronounced in the emotional subscale, they were not observed for the reading or mobility subscale in participants with iAMD.

    Table 3

    Test–retest reliability statistics of the VILL-33 subscales per AMD stage

    Figure 1

    Bland-Altman plots of the VILL-33 test and retest data: (A) reading and accessing information subscale, (B) mobility and safety subscale and (C) emotional well-being subscale. AMD, age-related macular degeneration; iAMD, intermediate age-related macular degeneration; VILL-33, Vision Impairment in Low Luminance with 33 items.

    Construct validity

    The mean person measures of all subscales of the VILL differed noticeably between AMD stages (figure 2). Higher person measures indicate better VRQoL. Mean person measures were significantly lower in the late AMD group than in the iAMD group (p<0.0001 for all three subscales). Person measures of all three VILL subscales were significantly lower in the iAMD group than in the no AMD group (p<0.0001, reading; p=0.0053, mobility; p=0.0011, emotional). Person measures of the reading and mobility subscale were significantly lower in the iAMD group than in the early AMD group (p=0.0006, reading; p=0.0197, mobility). This did not apply to the emotional subscale (early AMD<iAMD person measures, p=0.01). In linear regression analysis, all VILL subscale person measures were significantly associated with late AMD (p<0.0001) when controlling for age, gender, number of comorbidities and mode of administration. In addition, the reading and emotional subscale person measures were associated with iAMD (p=0.001 and 0.0003, respectively) and the emotional subscale person measures were associated with early AMD (p<0.0001).

    Figure 2

    Distributions of VILL-33 person measures across different age-related macular degeneration stages: (A) reading and accessing information subscale, (B) mobility and safety subscale and (C) emotional well-being subscale. VILL, Vision Impairment in Low Luminance; VILL-33, Vision Impairment in Low Luminance with 33 items.

    Discussion

    The VILL is a novel PRO instrument developed to meet the regulatory requirements for use in AMD trials, with a focus on intermediate AMD. Based on this further evaluation in the MACUSTAR study, we recommend the use of the 33-item VILL with its three subscales reading / accessing information, mobility / safety and emotional well-being. The Vision Impairment in Low Luminance with 33 items (VILL-33) has good psychometric properties, high test–retest reliability and adequate construct validity.

    The VILL-37 questionnaire was developed according to regulatory standards.16 Using data from the MACUSTAR study, we have continued an ongoing validation process following regulatory guidelines to be able to support labelling claims in the context of future drug trials.34 Overall, MACUSTAR participants were on average younger (mean age 72±7 years) than the cohort in which the VILL was developed (mean age 76±7 years). Noticeably, a lower proportion in the MACUSTAR cohort had late AMD (6% in the MACUSTAR sample, 42% in the development study).16 Both the initial development study and the present study are supportive of the internal consistency of the reading and mobility subscales of the VILL with person reliability and person separation values within the accepted ranges. Unlike the VILL-37, no items of the VILL-33 showed misfit.

    The emotional subscale had a lower internal consistency than the reading and mobility subscales in the MACUSTAR data, which is similar to the development study. Also, repeatability and construct validity were worse for the emotional subscale than for the other subscales of the VILL. These findings may be related to the lower number of items in the emotional subscale (4 items) than in the reading (17 items) or mobility (12 items) subscales, which could make the subscale more prone to measurement noise. The broad definition of the construct “emotional well-being” in the VILL, which was based on experiences of AMD patients and content from existing PRO instruments but not specifically obtained or validated in the context of psychiatric comorbidities, may also explain why the emotional subscale appears to be less reliable and to have lower construct validity than the reading and mobility subscales. However, we retained the emotional subscale on the basis of content validity while acknowledging the need to explore reliability and validity of this subscale further, including an exploration of its concurrent validity in the context of existing instruments measuring the underlying psychological concepts including worry, anxiety and depression.

    We recommend the VILL-33 to be used in future applications over the VILL-37. Both the VILL-33 and the VILL-37 were not well targeted to the MACUSTAR study sample, and ceiling effects were more prominent in the MACUSTAR data than in the VILL development study.16 This is likely due to the very good vision of the large majority of MACUSTAR participants at baseline who report greater ability than that required to perform several of the items. However, as the VILL was developed to capture changes in VRQoL associated with disease progression within iAMD and to late AMD, and has been shown to be appropriate for a sample with a larger proportion of late AMD participants, we are confident it will perform adequately in the longitudinal part of the MACUSTAR study as it retains scope to capture reduction in VRQoL as progression ensues. Against this background, several items were retained despite ceiling effects.

    Besides the VILL, only a limited number of PRO instruments were designed to capture the characteristic impairment of patients with AMD under low-luminance and low-contrast conditions, that is, the LLQ and the NVQ. The LLQ was designed based on focus group discussions with 80 patients with AMD and patients with inherited retinal disease and was administered to 125 participants including individuals with normal ageing changes.8 In psychometric testing using classical test theory, ceiling effects were present in a high proportion of items; for example, in 22% of the items obtained, the full sum score in all items related to general dim lighting problems.8 The validated German version of the LLQ included 23 of the 32 original items and was evaluated using a Rasch model in 274 participants (including 90 controls).35 While the instrument showed good internal consistency, item targeting was poor due to ceiling effects (difference in person and item mean 2.1). Though the targeting parameter in our study was similar, our population is not directly comparable to the population from the German LLQ validation study.35 Test–retest reliability of the reading and mobility VILL subscales was higher and the sample size larger than the available repeatability data of the LLQ-32 (Pearson correlation coefficients 0.46–0.88 in 60 participants).8 ICC and CoR values of the VILL were also similar to the Vision and Night Driving Questionnaire, which is specifically targeted at an elderly, driving population with good visual function.36

    Validation of the NVQ was originally based on 1052 participants of the Complications of AMD Prevention Trial.37 Again, internal consistency was good, but the instrument suffered from ceiling effects. A recent study investigated NVQ-10 responses of participants of the Laser Intervention in Early Stages of Age-Related Macular Degeneration study.14 38 Rasch analysis revealed disordered thresholds, poor discriminatory power of the items and underfit of items, as well as poor person separation (internal consistency). The authors recommended the NVQ-10 not to be used in iAMD samples based on these findings. Unlike the NVQ, the psychometric analysis of the VILL revealed good internal consistency, item fit and functioning of the rating scale, supporting use of the VILL in future AMD studies.

    A key strength of our study is its large, well-phenotyped sample, including confirmation of AMD staging by a central reading centre as well as central and on-site monitoring to ensure the study meets high quality requirements. Use of the current reference standard of item response theory enabled us to evaluate the VILL at quality standards that cannot be reached using classical test theory.39 However, despite its large overall sample, we did not evaluate differential item functioning between different language versions which needs to be examined in future studies.40 We have neither included functional data of the participants in our analyses nor investigated structural biomarkers besides AMD stage as both aspects were beyond the scope of this paper. The study groups (iAMD group and control groups) were not balanced in terms of age or participant characteristics, which may have affected the comparisons between disease stages.

    To conclude, we provide additional evidence for the validity of the VILL questionnaire in AMD based on MACUSTAR data. We recommend the shortened version of the questionnaire with 33 items (VILL-33) for use in future studies.

    Ethics statements

    Patient consent for publication

    Ethics approval

    This study involves human participants and was approved by 384/17, University Hospital Bonn ethics committee; 04/18_2, Paris Ouest IV; 032/2017/AIBILI/CE, AIBILI; 13507/2017, Nova Medical School; 18/LO/0145, London Queen Square Research Ethics Committee; H-18000126, Center for Sundhed Glostrup; 37910/2018, Comitato Etico Milano; 25/10/2018, Ospedale San Raffaele; 2017–3954, Radboudumc Technology Center; and L18.055/SH/sh, LUMC Commissie Medische Ethiek. The MACUSTAR study was approved by the local ethics committees of all participating clinical sites and adhered to the tenets of the Declaration of Helsinki. Written informed consent was obtained from all participants prior to participation.

    Acknowledgments

    We gratefully acknowledge the work put into this project by consortium members who have since moved on to work elsewhere. Likewise, we thank the staff of Oxford University Innovation for their support in making the Vision Impairment in Low Luminance available in seven languages.

    References

    Supplementary materials

    • Supplementary Data

      This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

    Footnotes

    • Collaborators MACUSTAR Consortium: H Agostini, L Altay, R Atia, F Bandello, P G Basile, C Behning, M Belmouhand, M Berger, A Binns, C J F Boon, M Böttger, C Bouchet, J E Brazier, T Butt, C Carapezzi, J Carlton, A Carneiro, A Charil, R Coimbra, M Cozzi, D P Crabb, J Cunha-Vaz, C Dahlke, L de Sisternes, H Dunbar, R P Finger, E Fletcher, H Floyd, C Francisco, M Gutfleisch, R Hogg, F G Holz, C B Hoyng, A Kilani, J Krätzschmar, L Kühlewein, M Larsen, S Leal, Y T E Lechanteur, U F O Luhmann, A Lüning, I Marques, C Martinho, G Montesano, Z Mulyukov, M Paques, B Parodi, M Parravano, S Penas, T Peters, T Peto, M Pfau, S Poor, S Priglinger, D Rowen, G S Rubin, J Sahel, D Sanches Fernandes, C Sánchez, O Sander, M Saßmannshausen, M Schmid, S Schmitz-Valckenberg, H Schrinner-Fenske, J Siedlecki, R Silva, A Skelly, E Souied, G Staurenghi, L Stöhr, D Tavares, J Tavares, D J Taylor, J H Terheyden, S Thiele, A Tufail, M Varano, L Vieweg, J Werner, L Wintergerst, A Wolf, N Zakaria.

    • Contributors JHT, SGP, CBe, SP, UFOL, SL, FGH and RPF designed the study. JHT, SGP, CB, MB, JC, DR, TB, JB and RPF interpreted the data. JHT, SGP, JC, DR and RPF drafted the manuscript. CBe, MB, CBo, SP, UFOL, SL, FGH, TB and JB critically revised the manuscript for important intellectual content. All authors approved the final version of the manuscript to be published and agreed to be accountable for all aspects of the work. JHT and RPF are the guarantors.

    • Funding This project received funding from the Innovative Medicines Initiative 2 Joint Undertaking (grant agreement number 116076). This joint undertaking received support from the European Union’s Horizon 2020 research and innovation programme and EFPIA.

    • Disclaimer The communication reflects the author's view and neither IMI nor the European Union, EFPIA, or any Associated Partners are responsible for any use that may be made of the information contained therein.

    • Competing interests JHT: Heidelberg Engineering, Optos, Carl Zeiss Meditec, CenterVue; SGP: Heidelberg Engineering, Optos, Carl Zeiss Meditec, CenterVue; CBe: None; MB: None; JC: None; DR: None; CBo: employee of Novartis; SP: employee of Novartis; UFOL: employee of F. Hoffmann-La Roche; SL: employee of Bayer; FGH: Acucela, Allergan, Apellis, Bayer, Boehringer-Ingelheim, Bioeq/Formycon, CenterVue, Ellex, Roche/Genentech, Geuder, Grayburg Vision, Heidelberg Engineering, Kanghong, LinBioscience, NightStarX, Novartis, Optos, Pixium Vision, Oxurion, Stealth BioTherapeutics, Zeiss; TB: None; JB: None; RPF: Bayer, Ellex, Novartis, Opthea, Alimera, Santhera, Roche/Genentech, CenterVue, Zeiss.

    • Provenance and peer review Not commissioned; externally peer reviewed.

    • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
