Table 1

Psychometric tests and criteria (adapted from Lamping et al [8])

| Psychometric property | Definition | Criteria |
| --- | --- | --- |
| 1 Item reduction | Identify items for possible elimination | Applied to each of the 45 items |
| | | Missing data <5% |
| | | No item redundancy (inter-item correlations <0.80) |
| | | Item-total correlations ⩾0.20 |
| | | Maximum endorsement frequencies <80% including floor/ceiling effects (that is, response categories with high endorsement rates at the bottom/top end of the scale respectively) |
| | | Aggregate adjacent endorsement frequencies ⩾10% |
| | | Item convergent/discriminant analyses (items must be scaling successes or probable scaling successes) |
| 2 Acceptability | Completeness of data and score distributions | Applied to each of the 33 items |
| | | Missing data <5% |
| | | Maximum endorsement frequencies <80% including floor/ceiling effects |
| | | Applied to summary scores |
| | | Missing data <5% |
| | | Floor/ceiling effects <80% |
| | | Skewness values between −1 and +1 |
| 3 Reliability | | |
| 3.1 Internal consistency | The extent to which items comprising a scale measure the same construct | Cronbach α coefficients for summary scores >0.70 |
| | | Item-total correlations ⩾0.40 |
| 3.2 Test-retest reliability | The stability of an instrument assessed by administering the instrument to respondents on two separate occasions | Pearson/Spearman correlations >0.80 |
| 4 Validity | | |
| 4.1 Content validity | Extent to which content of instrument or scale is representative of intended conceptual domain | Content derived from focus groups and field testing |
| 4.2 Construct validity | | |
| 4.2.1 Within scale analyses | Evidence that a single construct is being measured | Internal consistency Cronbach α coefficient >0.70 |
| | | Item-total correlations ⩾0.25 |
| 4.2.2 Analyses against external criteria | | |
| Known group differences | Evidence that the instrument differentiates between groups who are known to differ (eg, by presence or severity of disease) | Expected higher IND-VFQ scores in patients with eye disease compared to normals |
| Convergent validity | Evidence that the instrument correlates with measures of the same or a similar construct | Expected correlation with visual acuity |
| Discriminant validity | Evidence that the instrument is not correlated with measures of different constructs | Expected lack of association with age and sex |
| 5 Responsiveness | Ability of a scale to detect clinically significant change following a treatment of known efficacy | Expected improved scores after cataract surgery |
| | | Effect sizes (calculated as mean difference in scores divided by pooled standard deviation at baseline) compatible with others published for similar treatment |
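
Two of the criteria in Table 1 rest on simple formulas: the Cronbach α coefficient used for internal consistency, and the effect size used for responsiveness. The Python sketch below illustrates both; it is not the analysis code used in the study. It assumes complete data (no missing responses) and items scored in the same direction. The function names `cronbach_alpha` and `effect_size` are hypothetical. The table does not say how the baseline standard deviation is pooled, so the sketch assumes the sample standard deviation of all baseline scores.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a scale.

    `responses` is a list of respondents, each a list of item scores
    (assumption: no missing values, all items scored in the same direction).
    """
    k = len(responses[0])  # number of items in the scale
    items = list(zip(*responses))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def effect_size(baseline, follow_up):
    """Mean change in score divided by the baseline standard deviation.

    Assumption: the "pooled" baseline SD is taken to be the sample SD of
    all baseline scores; `baseline` and `follow_up` are paired by position.
    """
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)
```

Under these assumptions, a summary score would meet the internal consistency criterion when `cronbach_alpha` returns a value above 0.70. The sign of `effect_size` follows the direction of change, so its interpretation depends on whether higher scores indicate better or worse function.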