Development and validation of a patient based measure of outcome in ocular melanoma
- a Queen's Medical Centre, University Hospital, Nottingham, b Health Services Research Unit, London School of Hygiene and Tropical Medicine, London, c Moorfields Eye Hospital, London
- Mr Alexander Foss, Queen's Medical Centre, University Hospital, Nottingham NG7 2HU
- Accepted 23 August 1999
BACKGROUND Patients with uveal melanoma can be treated by a number of modalities. As none of the different treatments offers a survival advantage, a key factor in choosing among treatments is their differential impact on patients' quality of life. A short, patient based questionnaire was developed and validated for evaluating outcomes following treatment for uveal melanoma.
METHODS The 21 item measure of outcome in ocular disease (MOOD) assesses the patient's view of outcome in terms of visual function and the impact of treatment. The reliability and validity of the three MOOD scores (total, vision, impact) were evaluated in 176 patients who had been treated for uveal melanoma (75 brachytherapy, 78 proton beam radiotherapy, 23 enucleation). Of these, 165 patients also completed the SF-36.
RESULTS All three MOOD scales met standard criteria for acceptability, reliability, and validity. The proportion of missing data was low, and responses to all items were well distributed across response categories. Internal consistency, assessed by Cronbach's alpha coefficients, exceeded the standard criterion of 0.70 for all three summary scores. Item total correlations ranged from 0.22 to 0.77 (mean item total correlation 0.58), indicating good homogeneity. Test-retest correlations for all three summary scores exceeded 0.85. Scaling assumptions, assessed by item convergent and discriminant validity correlations, were met for the vision and impact scores. The MOOD showed good content validity, as assessed by review by ophthalmologists and patients. Construct validity was demonstrated by high intercorrelations between the vision and impact scores and the total scale; higher scores for patients who reported being very satisfied compared with those who were not very satisfied and for those who reported persistent red eye compared with those who did not have this complication (known group differences/hypothesis testing); moderate correlations between the MOOD and the SF-36 and visual acuity (convergent validity); and low correlations between the MOOD and age and sex (discriminant validity).
CONCLUSIONS The MOOD is a practical and scientifically sound patient based measure which can be used in research and audit to evaluate outcomes following treatment for uveal melanoma. It takes 5 minutes to complete and meets standard psychometric criteria for reliability and validity.
Enucleation was the standard treatment for uveal melanoma; it has excellent local tumour control rates, but it involves sacrificing the eye. A number of techniques which spare the eye, such as brachytherapy and proton beam radiotherapy, have since been developed. No differences in survival have been reported for patients treated by these modalities.1 2 Primary tumour control rates are excellent: 100% for those without extraocular extension treated by enucleation, 95% for brachytherapy, and 98%3 or more for charged particle radiotherapy.1 However, ocular radiotherapy is not without morbidity4 5 and charged particle beam radiotherapy6-12 is associated with side effects including keratitis, cataract, scleral and corneal necrosis, radiation retinopathy, radiation optic neuropathy, retinal detachment, phthisis, rubeosis iridis, and glaucoma. Furthermore, enucleation rates as high as 13% have been reported11 for treatment associated morbidity after charged particle beam radiotherapy.
As none of these treatments has a survival advantage over the others, the treatment of choice should be guided by the effects on quality of life. The SF-36, the current gold standard generic measure of quality of life, has been shown to be sensitive to “blurred vision” as a symptom, but it may not be sensitive to other ocular symptoms or to a single eye disease.13 Most disease specific measures of outcome in ocular disease, such as the VF-14, cataract symptom score, and the vision related sickness impact profile,14 15 were developed mainly for use on cataract patients and have focused exclusively on visual function. These instruments do not assess other outcomes that are important to patients such as cosmesis or pain or discharge in the eye. The measure of outcome in ocular disease (MOOD) questionnaire was developed to assess visual function and the impact of treatment on the eye.
Patients and methods
The study received approval from the Moorfields ethics committee.
The MOOD was validated in a sample of 176 patients, recruited at the ocular oncology clinic at Moorfields Eye Hospital between December 1993 and June 1994, who had been treated for uveal melanoma 1–197 months (mean 33 months, median 23 months) before the clinic visit. The sample included 100 (57%) men and 76 (43%) women, ranging in age from 22 to 86 years (mean 58.2 years). All patients were white, as ocular melanoma is a disease found predominantly in white people and is rare in other ethnic groups. Seventy eight (44%) had been treated by proton beam radiotherapy, 75 (43%) by brachytherapy, and 23 (13%) by enucleation. A subsample of 32 patients who completed the MOOD twice formed the test retest sample.
Patients completed two self completion questionnaires: the SF-36,16 17 a gold standard, generic measure of quality of life, and the measure of outcome in ocular disease (MOOD), a newly developed, disease specific measure to assess the patient's view of outcome in ocular disease.
The MOOD is a 21 item questionnaire which measures ocular outcomes in two domains, visual function and impact of treatment. It provides summary scores on three scales: total, vision, and impact. Low scores indicate good outcomes. Instructions for scoring the questionnaire can be obtained from the first author.
The 14 item vision scale includes eight items modified from the VF-14 (ref 14) and five new items, all of which are measured on five point Likert scales, and one global question measured on a 100 point visual analogue scale. The seven item impact scale includes six new items measured on five point Likert scales and one new item measured on a four point Likert scale. In addition to the 21 items that are scored quantitatively, the MOOD also includes four open ended questions which are not scored, but which provide descriptive information about the patient's view of the best and worst points of treatment and how treatment could be improved.
Information was also collected for the following: age, sex, pretreatment and post-treatment vision in both eyes (as measured by a Snellen chart at 6 metres, with the vision recorded on an ordinal scale with 6/6 = 1, 6/9 = 2, 6/12 = 3, 6/18 = 4, 6/24 = 5, 6/36 = 6, 6/60 = 7, 3/60 = 8, 1/60 = 9, HM = 10, PL = 11, NPL = 12), treatment modality, months since treatment, time from treatment, whether the treated eye required medication, and complications such as redness (either of the conjunctiva or the lids), strabismus, inflammation, or glaucoma. Patients were given the MOOD on arrival at the clinic and returned the completed questionnaire either on leaving the clinic or by post. A subsample of 32 patients also completed a second MOOD questionnaire at home, sent by post after a 3 week interval.
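The ordinal coding of visual acuity listed above can be written as a simple lookup. The following is a minimal sketch only: the names `SNELLEN_ORDINAL`, `acuity_score`, and `better_eye_score` are hypothetical and are not taken from the study.

```python
# Ordinal coding of visual acuity exactly as listed in the text
# (6/6 = 1 through NPL = 12). Lower scores mean better vision.
SNELLEN_ORDINAL = {
    "6/6": 1, "6/9": 2, "6/12": 3, "6/18": 4, "6/24": 5, "6/36": 6,
    "6/60": 7, "3/60": 8, "1/60": 9, "HM": 10, "PL": 11, "NPL": 12,
}


def acuity_score(recorded: str) -> int:
    """Return the ordinal score for a recorded acuity (hypothetical helper)."""
    key = recorded.strip().upper()
    if key not in SNELLEN_ORDINAL:
        raise ValueError(f"acuity not on the study scale: {recorded!r}")
    return SNELLEN_ORDINAL[key]


def better_eye_score(treated: str, fellow: str) -> int:
    """Score of the better eye: the minimum, since low scores are better."""
    return min(acuity_score(treated), acuity_score(fellow))
```

Because the scale is ordinal rather than interval, correlations computed on these scores should be read as indicating rank order only.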
The reliability and validity of the MOOD were evaluated using standard psychometric techniques.18 The acceptability of the questionnaire was evaluated through an examination of item non-response rate and the distribution of responses across response categories. Scaling assumptions were tested on the basis of item convergent and discriminant validity correlations. Reliability analyses included internal consistency (Cronbach's alpha) and test-retest reliability for all three summary scores. Item total correlations were used to evaluate the homogeneity of the questionnaire. Validity analyses included an evaluation of content and construct validity, including both within scale analyses (intercorrelations between scales, group differences/hypothesis testing) and analyses against external criteria, including correlations with other measures (SF-36, complications, visual acuity, sociodemographic factors).
Results
Preliminary psychometric analyses of the initial 22 item developmental version of the MOOD detected one item which failed to meet acceptability and reliability criteria. This item (“have you worn a patch to cover your treated/artificial eye?”) was therefore eliminated from the final 21 item version of the questionnaire and excluded from further psychometric analyses.
The proportion of missing data was low, ranging from 0–5% across items. Responses to all items were well distributed across response categories.
Cronbach's alpha coefficients for the three summary scores of the MOOD (Table 1) indicate excellent internal consistency. Alpha coefficients exceeded the standard criterion of 0.70 for all three summary scores. The removal of specific items did not substantially increase the internal consistency of any of the scales.
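For readers unfamiliar with the statistic, Cronbach's alpha for a scale of k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). The sketch below uses this standard formula; it is an illustration, not the authors' analysis code, and the function name and the small data set in the usage note are invented for the example.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for complete data.

    responses: list of respondents, each a list of k item scores.
    Sample variances are used for both the items and the total score.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

For the made-up responses `[[1, 2], [2, 1], [3, 4], [4, 3]]` the function returns 0.75, which would meet the 0.70 criterion.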
The homogeneity of the MOOD was evaluated on the basis of item-total correlations. These analyses compute the correlation between each item and the total score with the item of interest eliminated from the calculation of the total score. Items with item-total correlations below 0.20 are generally eliminated.18 As can be seen in Table 2, item-total correlations ranged from 0.22 to 0.77 (mean item-total correlation 0.58), indicating good homogeneity and no need for elimination of any items.
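The corrected item-total correlation described above can be sketched as follows. This assumes Pearson correlations and complete data, neither of which is stated in the text, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses):
    """Correlate each item with the total of the remaining items."""
    k = len(responses[0])
    result = []
    for i in range(k):
        item = [row[i] for row in responses]
        # Total score with the item of interest removed.
        rest = [sum(row) - row[i] for row in responses]
        result.append(pearson(item, rest))
    return result


def items_below(responses, threshold=0.20):
    """Indices of items whose corrected correlation is under the threshold."""
    correlations = corrected_item_total(responses)
    return [i for i, r in enumerate(correlations) if r < threshold]
```

Removing the item from the total before correlating avoids inflating the coefficient through the item's correlation with itself.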
Test-retest correlations for the three summary scores, shown in Table 3, all exceeded 0.85, indicating good test-retest reliability.
Scaling assumptions were tested by comparing the correlations between each item and the scale to which it belonged (item own scale correlation) and to the other scale to which it did not belong (item other scale correlation). Item own scale correlations should be higher (item-convergent validity) than item other scale correlations (item-discriminant validity) by at least two standard errors.19 As shown in Table 4, all items in both scales met this criterion.
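The two standard error criterion can be illustrated with a short sketch. It assumes the standard error of a correlation is approximated by 1/√n, a common convention in this type of analysis that is not stated in the text; the function names and the example correlations are hypothetical.

```python
from math import sqrt


def scaling_success(own_r, other_r, n):
    """True if the item own scale correlation exceeds the item other
    scale correlation by at least two standard errors.

    Assumption: the standard error of a correlation is taken as 1/sqrt(n).
    """
    standard_error = 1 / sqrt(n)
    return own_r - other_r >= 2 * standard_error


def failing_items(correlations, n):
    """correlations: dict mapping item label -> (own_r, other_r)."""
    return [
        label
        for label, (own_r, other_r) in correlations.items()
        if not scaling_success(own_r, other_r, n)
    ]
```

With n = 176, as in this sample, two standard errors under this approximation is about 0.15, so an item correlating 0.60 with its own scale and 0.30 with the other scale would pass, whereas 0.40 against 0.30 would not.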
Content validity was evaluated during the development of the MOOD. Five ophthalmologists and five patients reviewed the preliminary version of the questionnaire for completeness and appropriateness of content. The questionnaire was also pretested on a further five patients. The responses to open ended questions obtained during subsequent field testing in the full scale validation phase were also reviewed to determine whether patients identified any important new areas which had not been included in the questionnaire. Of the 153 patients who answered the open ended questions, 60% had nothing to add. The concerns raised by the remainder are presented in Table 5. Except for the concerns of two patients about the loss of binocular function, all responses that are relevant to treatment outcomes (for example, poor vision, pain, discomfort) are covered by the MOOD. The other responses to the open ended questions all pertain to service delivery and are thus more relevant to a measure evaluating the process of care than to a measure of the outcome of care such as the MOOD.
The construct validity of the MOOD is supported by three types of within scale analyses. Firstly, the high internal consistency of all three summary scales provides evidence for construct validity. Secondly, as shown in Table 6, intercorrelations between the three scales support the construct validity of the MOOD. The high correlations between the vision and impact scales and the total scale support the convergent validity of the questionnaire, while the moderate correlation between the vision and impact scales provides evidence of good discriminant validity. Thirdly, MOOD scores confirm hypotheses about differences expected between known groups defined on the basis of responses to individual questions on the MOOD. As shown in Table 7, MOOD total and impact scores were significantly higher for patients who reported being very satisfied compared with those who were not very satisfied.
Three types of between scale analyses also support the construct validity of the MOOD. Firstly, MOOD scores confirm hypotheses about differences expected between known groups defined on the basis of complications. As shown in Table 8, MOOD impact scores were significantly higher for patients who reported persistent red eye compared with those who did not have this complication. Secondly, correlations with other measures such as the SF-36 and a measure of visual acuity support the convergent validity of the MOOD. As shown in Table 9, correlations between the MOOD and the SF-36 are all in the moderate range. As both questionnaires measure quality of life, the MOOD at a more specific level than the generic SF-36, moderate correlations are expected. The moderate correlations between the total and vision scales of the MOOD and the measure of visual acuity in the better eye provide support for convergent validity. This compares favourably with correlations of 0.27–0.44 that have been reported between the widely used VF-14 and visual acuity.15 Thirdly, correlations between the MOOD and sociodemographic variables support the discriminant validity of the questionnaire. As shown in Table 9, correlations between the MOOD and age and sex are low, suggesting that responses to the MOOD are not biased in terms of sociodemographic factors. The lower correlation of the MOOD impact scale with the measure of visual acuity compared with the total and vision scales also demonstrates discriminant validity.
Discussion
In recent years, there has been a great expansion in the number and use of instruments to measure quality of life and other patient based outcomes in health care. Rigorous scientific methods, derived from psychometric theory, are now being used by clinicians and health services researchers to identify “good” measures which are credible in scientific as well as clinical terms.
This paper describes the development and validation of the measure of outcome in ocular disease (MOOD). The aim was to develop a measure to assess outcomes in patients treated for ocular melanoma. As none of the treatment modalities has shown a comparative survival advantage, choice should be guided by the effects of treatment on ocular morbidity and quality of life. For example, we have previously reported the outcome in 127 patients treated by proton beam radiotherapy using rubeosis as an indicator of poor outcome.20 Although the development of rubeosis correlated well with outcome (for example, subsequent ocular loss due to local morbidity), it is an indirect measure. Clinical outcome measures need to be supplemented with a more direct measure of greater relevance to patients. Most questionnaires developed for ocular conditions focus predominantly on vision and have been designed specifically for cataract patients. For many ocular diseases, there are other important outcomes such as ocular appearance, pain and discharge which are not assessed by existing instruments. Although the MOOD was developed and tested on patients with ocular melanoma, it may be applicable to other ocular diseases. Future studies should investigate the validity of the MOOD in other ocular conditions.
The MOOD proved to be highly acceptable to patients. It was designed to be short and suitable for routine use, taking only 5 minutes to complete. The instrument met the standard psychometric criteria for reliability and validity and is thus a scientifically sound measure of outcome in patients with ocular melanoma. A companion paper describes outcomes as measured by the MOOD in groups defined on the basis of tumour stage and grade and treatment modality.
We wish to acknowledge the contribution of Dr Andrew Jaskowski who died shortly after this study began. He introduced AJEF to the field of psychometrics and DLL to ocular oncology and initiated this collaboration.