1 Item reduction | Identify items for possible elimination | Applied to each of the 45 items |
Missing data <5% |
No item redundancy (inter-item correlations <0.80) |
Item-total correlations ⩾0.20 |
Maximum endorsement frequencies <80% including floor/ceiling effects (that is, response categories with high endorsement rates at the bottom/top end of the scale respectively) |
Aggregate adjacent endorsement frequencies ⩾10% |
Item convergent/discriminant analyses (items must be scaling successes or probable scaling successes) |
2 Acceptability | Completeness of data and score distributions | Applied to each of the 33 items |
Missing data <5% |
Maximum endorsement frequencies <80% including floor/ceiling effects |
Applied to summary scores |
Missing data <5% |
Floor/ceiling effects <80% |
Skewness values between −1 and +1 |
3 Reliability | | |
3.1 Internal consistency | The extent to which items comprising a scale measure the same construct | Cronbach α coefficients for summary scores >0.70 |
Item-total correlations ⩾0.40 |
3.2 Test-retest reliability | The stability of an instrument assessed by administering the instrument to respondents on two separate occasions | Pearson/Spearman correlations >0.80 |
4 Validity | | |
4.1 Content validity | Extent to which content of instrument or scale is representative of intended conceptual domain | Content derived from focus groups and field testing |
4.2 Construct validity | | |
4.2.1 Within scale analyses | Evidence that a single construct is being measured | Internal consistency Cronbach α coefficient >0.70 |
Item-total correlations ⩾0.25 |
4.2.2 Analyses against external criteria | | |
4.2.2.1 Known group differences | Evidence that the instrument differentiates between groups who are known to differ—eg, by presence or severity of disease | Expected higher IND-VFQ scores in patients with eye disease compared to normals |
4.2.2.2 Convergent validity | Evidence that the instrument correlates with measures of the same or a similar construct | Expected correlation with visual acuity |
4.2.2.3 Discriminant validity | Evidence that the instrument is not correlated with measures of different constructs | Expected lack of association with age and sex |
5 Responsiveness | Ability of a scale to detect clinically significant change following a treatment of known efficacy | Expected improved scores after cataract surgery |
Effect sizes (calculated as mean difference in scores divided by pooled standard deviation at baseline) compatible with others published for similar treatment |
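
Several of the criteria above reduce to short calculations. The following is a minimal Python sketch, not the analysis code used for the IND-VFQ. The function names and the data layout (one list of scores per item, with respondents in the same order) are assumptions made for illustration. The effect size uses the baseline standard deviation of a single group as the denominator, which is one reading of "pooled standard deviation at baseline", and the sign convention (follow-up minus baseline) is likewise an assumption.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(items):
    """Cronbach alpha for a scale; items is a list of per-item score lists."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # summary score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def corrected_item_total(items, i):
    """Correlation of item i with the sum of the remaining items."""
    rest = [sum(scores) - scores[i] for scores in zip(*items)]
    return pearson(items[i], rest)


def max_endorsement(scores):
    """Largest proportion of respondents choosing any single response category."""
    return max(scores.count(c) for c in set(scores)) / len(scores)


def effect_size(baseline, follow_up):
    """Mean change in score divided by the baseline standard deviation (assumed form)."""
    return (mean(follow_up) - mean(baseline)) / stdev(baseline)
```

Under these assumptions, a scale would meet the internal consistency criterion when `cronbach_alpha(items)` exceeds 0.70, and an item would be flagged at the item reduction stage when `max_endorsement(scores)` reaches 0.80 or `corrected_item_total(items, i)` falls below 0.20.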