The importance of prognosis and treatment effect
Decisions based on clinical examination are critical to the practice of ophthalmology. Although quite different disease processes can produce the same structural and functional outcomes, treatment decisions are often directly based on the observation of a complex of signs and symptoms. Thus, presented with a patient with sudden visual loss and/or ocular pain due to posterior uveitis, the ophthalmologist will rely on the clinical examination, in particular ophthalmoscopy, to answer three basic questions: what is wrong (diagnosis), what can we expect in the future (prognosis), and what can we do about it (effect of treatment)? But how reliable and accurate are ophthalmoscopic observations and how useful is ophthalmoscopy to support therapeutic decisions?
Stanford and co-workers, in a study published in this issue of the BJO (636), have tried to address some of these questions. They estimated the sensitivity and specificity of uveitis experts' interpretation of retinal photographs for the diagnosis of toxoplasma retinochoroiditis. Five experts were asked to classify the retinal photographs of 96 patients into four categories without any additional information (definitely, probably, possibly, or not toxoplasma retinochoroiditis). This is an important study as it is the first time that the diagnostic accuracy of these ophthalmoscopic findings has been investigated. A major problem the investigators had to overcome was that it is not possible to diagnose or exclude the disease with certainty, and a statistical model was therefore used to estimate the sensitivity and specificity. The sensitivity and specificity were found to vary considerably among the experts, which led the authors to conclude that a considerable number of decisions to treat patients will be wrong if they are based on a single fundal examination. There are a number of broader issues that this study raises.
Firstly, the most important aim of evaluating the signs and symptoms of patients is to identify those patients for whom the expected benefit of treatment outweighs the expected harm. For example, the best treatment for patients with toxoplasma retinochoroiditis is considered to be antibiotics in combination with systemic corticosteroids. As this treatment is associated with potentially serious adverse effects, an accurate distinction between patients with and without toxoplasma retinochoroiditis is of utmost importance. On reflection, however, this “stepping stone approach” (jumping from complaints to diagnosis and then from diagnosis to treatment) is in many cases rather artificial. A more pragmatic approach seems to occur in clinical practice, where the role of diagnostic information is less to identify those patients with a certain diagnosis and more to distinguish between patients who are expected to benefit from treatment and those who are not. Visual loss or pain in combination with focal retinitis and retinochoroidal scars is then taken as an indication to start treatment with antibiotics and systemic corticosteroids, as it is assumed that a patient with this complex of symptoms and signs is better off with this treatment than without it.
All this may sound rather academic, but it is important to realise that such an approach, focusing on clinical effectiveness (effect on patient outcome) rather than on diagnostic accuracy (for example, the sensitivity and specificity), would solve the problem that for many ophthalmological diseases an adequate reference, or gold standard, for the diagnosis is not available. It is often thought that the evaluation of diagnostic tests can be divided into a number of separate steps, and that the diagnostic accuracy should be evaluated before the clinical effectiveness of a test. However, this hierarchical approach is unhelpful when it is difficult or impossible to establish the diagnosis with certainty or when the indication for treatment is determined by the severity and nature of a known disorder rather than the presence or absence of disease itself (for example, glaucoma, diabetic retinopathy).
Secondly, the authors report the diagnostic accuracy of experts' interpretation of retinal photographs expressed in terms of sensitivity (the percentage of patients with toxoplasma retinochoroiditis who were correctly identified) and specificity (the percentage of patients without toxoplasma retinochoroiditis who were correctly identified) by dichotomising the four diagnostic categories (definite and probable toxoplasma retinochoroiditis versus possible or not). As a result, the agreement between the experts depends not only on their diagnostic abilities but also on their interpretation of terms such as probable and possible. Interestingly, the sensitivities and specificities reported for the five experts are inversely related: the sensitivity is higher if the specificity is lower. This may be partly explained by the fact that the experts indeed used different definitions or “thresholds” for the diagnostic categories. In principle, these thresholds should be based on the consequences of the diagnostic judgment for treatment. From this perspective, it is regrettable that the investigators asked the experts to classify the fundal photographs according to the presence or absence of toxoplasma retinochoroiditis and not according to the expected effectiveness of treatment with antibiotics and corticosteroids.
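The effect of the threshold can be illustrated with a minimal sketch in Python. The ratings below are entirely hypothetical and are not data from the study; the sketch also assumes that the true disease status of each patient is known, which was precisely what the investigators could not assume. The names CATEGORIES, RATINGS, and sens_spec are introduced here for illustration only.

```python
# Ordered from least to most certain; the order defines the thresholds.
CATEGORIES = ["not", "possible", "probable", "definite"]

# Hypothetical ratings by one expert: (category, patient truly has the disease).
# Invented for illustration; not taken from the study.
RATINGS = (
    [("definite", True)] * 5 + [("probable", True)] * 3
    + [("possible", True)] * 1 + [("not", True)] * 1
    + [("definite", False)] * 1 + [("probable", False)] * 1
    + [("possible", False)] * 3 + [("not", False)] * 5
)


def sens_spec(ratings, cutoff):
    """Dichotomise at `cutoff` (that category and above count as positive)
    and return (sensitivity, specificity)."""
    rank = CATEGORIES.index(cutoff)
    tp = fn = tn = fp = 0
    for category, diseased in ratings:
        positive = CATEGORIES.index(category) >= rank
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


for cutoff in ("definite", "probable", "possible"):
    sens, spec = sens_spec(RATINGS, cutoff)
    print(f"positive if at least '{cutoff}': "
          f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these invented ratings, lowering the cutoff from "definite" to "possible" raises the sensitivity and lowers the specificity, although the expert's underlying judgments have not changed. Two experts with identical diagnostic ability but different readings of "probable" would therefore report different, inversely related, values.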
Thirdly, the investigators explore the risk of mislabelling retinal appearance by applying the estimated sensitivity and specificity of ophthalmoscopy to populations with different prevalences of toxoplasma uveitis. Some caution is warranted when interpreting these mislabelling risks. The basic assumption underlying their estimation is that sensitivity and specificity are constant across different settings. There is a large body of evidence, however, indicating that sensitivity and specificity vary across different patient populations and even across subgroups within a population. In other words, the sensitivity and specificity estimated in one population of patients might not be applicable to another population. This again supports evaluating a diagnostic examination or test by focusing on clinical effectiveness in a particular clinical context rather than on diagnostic accuracy.
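The arithmetic behind such mislabelling risks can be sketched as follows. The sensitivity, specificity, and prevalences used here are hypothetical round numbers, not the estimates reported in the study, and the function name mislabelling_risk is introduced only for this illustration. The calculation makes exactly the assumption questioned above: that sensitivity and specificity stay fixed while the prevalence changes.

```python
def mislabelling_risk(sensitivity, specificity, prevalence):
    """Return (false_negative, false_positive, total) as proportions of all
    patients examined, assuming sensitivity and specificity do not depend
    on the prevalence."""
    false_negative = prevalence * (1 - sensitivity)
    false_positive = (1 - prevalence) * (1 - specificity)
    return false_negative, false_positive, false_negative + false_positive


# Hypothetical values chosen for illustration only.
SENSITIVITY, SPECIFICITY = 0.80, 0.90

for prevalence in (0.10, 0.50):
    fn, fp, total = mislabelling_risk(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}: missed={fn:.3f}, "
          f"wrongly labelled={fp:.3f}, total mislabelled={total:.3f}")
```

Under these assumed values, most errors at low prevalence are patients wrongly labelled as having the disease, whereas at high prevalence most errors are missed cases. If sensitivity and specificity themselves shift between populations, as the evidence suggests, these figures no longer hold.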
Lastly, the study was complicated by the lack of evidence on the prognosis of toxoplasma retinochoroiditis with and without treatment. The lack of evidence on the effectiveness of treatment is a wider problem in the evaluation of ophthalmic diagnostic tests. Many previous studies focused instead on agreement between and within different types of observers or on specific test characteristics.1 Thus, an important step forward would be for studies evaluating ophthalmic investigations always to consider whether the different diagnostic results help to identify patients who would benefit from the different treatment options. For the study by Stanford and co-workers, this would imply that the most important question for the experts to answer would have been whether the patient was expected to benefit from treatment with antibiotics and systemic corticosteroids. It is not the presence or absence of a particular disease that is of most interest but the future health outcome for a patient with and without treatment. Of the three basic questions that an ophthalmologist tries to answer in clinical practice for individual patients, those about the prognosis (“what can we expect?”) and the potential to influence the prognosis with treatment (“what can we do about it?”) ultimately take priority over the diagnostic question (“what is wrong?”). Wherever possible, future studies of diagnostic or screening tests should reflect this.