Ophthalmic statistics note: the perils of dichotomising continuous variables
Phillippa M Cumberland (1), Gabriela Czanner (2,3), Catey Bunce (4), Caroline J Doré (5), Nick Freemantle (6), Marta García-Fiñana (2), on behalf of the Ophthalmic Statistics Group

  1. Centre for Paediatric Epidemiology and Biostatistics, UCL Institute of Child Health, London, UK
  2. Department of Biostatistics, Faculty of Health and Life Sciences, University of Liverpool, Liverpool, UK
  3. Department of Eye and Vision Science, Faculty of Health and Life Sciences, University of Liverpool, Liverpool, UK
  4. NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
  5. UCL Clinical Trials Unit, University College London, London, UK
  6. Department of Primary Care and Population Health & PRIMENT Clinical Trials Unit, University College London, London, UK

Correspondence to Phillippa Cumberland, Centre for Paediatric Epidemiology and Biostatistics, University College London (UCL), 30 Guilford Street, London WC1N 1EH, UK; p.cumberland{at}ucl.ac.uk

Introduction

Continuous variables (such as intraocular pressure (IOP), visual acuity and contrast sensitivity) are commonly measured in clinical ophthalmology and vision research. In clinical practice, a ‘status’ (category) can sometimes be assigned to an individual patient using a cutpoint in the value of a continuous variable; for example, a diagnosis of glaucoma might be confirmed by an elevated IOP measurement (eg, IOP >21 mm Hg). Indeed, much of medicine revolves around an implicit classification of individuals into diseased and non-diseased. In clinical research, continuous variables may likewise be converted to categorical variables, assigning individuals to one of two groups. Although this may be appropriate for some specific studies where the underlying distribution of the variable shows a clear grouping, such dichotomisation has several drawbacks.1

Dichotomisation may be driven by the research question; for example, a study investigating the health service needs of those with low vision might dichotomise visual acuity at the WHO threshold for low vision.2 It may sometimes be used to bring the data in line with the clinical classification of patients. Often, however, data are dichotomised because doing so is thought to simplify the statistical analysis (eg, to enable use of a t test or a χ² test) and the presentation and interpretation of data. This simplification has a cost in terms of loss of information3 and may compromise the validity of the statistical analysis. We will discuss the disadvantages of dichotomisation and outline some points to consider before categorising continuous data. It should be noted that, while the focus here is on categorisation of data into two groups, the problems arising when dichotomising data are inherent in any categorisation of data (two or more groups).
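The cost of dichotomisation can be illustrated with a small simulation. The sketch below is not taken from the article: the IOP-like predictor (mean 16, SD 3), the linear outcome, the slope, the sample size and the function names are all hypothetical, and a median split is used rather than a clinical threshold. It compares the correlation between an outcome and the continuous predictor with the correlation between the same outcome and the dichotomised predictor.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, slope=0.5, seed=1):
    """Return (r_continuous, r_dichotomised) for simulated data.

    All values are hypothetical: 'iop' is a normally distributed
    predictor and 'outcome' depends on it linearly plus noise.
    """
    rng = random.Random(seed)
    iop = [rng.gauss(16.0, 3.0) for _ in range(n)]
    outcome = [slope * x + rng.gauss(0.0, 3.0) for x in iop]

    # Median split: 1.0 if above the sample median, otherwise 0.0.
    cut = sorted(iop)[n // 2]
    high = [1.0 if x > cut else 0.0 for x in iop]

    return pearson(iop, outcome), pearson(high, outcome)


r_cont, r_dich = simulate()
ratio = r_dich / r_cont
print(f"continuous predictor:   r = {r_cont:.3f}")
print(f"dichotomised predictor: r = {r_dich:.3f}")
print(f"ratio:                  {ratio:.3f}")
```

Under these assumptions (normally distributed predictor, linear relationship, split at the median), the dichotomised correlation is expected to be roughly 0.80 times the continuous one, so a larger sample would be needed to detect the same association. Cutpoints further from the median would generally be expected to lose more information still.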

Loss of information and statistical power

Dichotomisation results, first, in the loss of descriptive information on the study population. For example, the …
