Ophthalmic statistics note 10: data transformations
  1. Catey Bunce (1, 2)
  2. John Stephenson (3)
  3. Caroline J Doré (4)
  4. Nick Freemantle (5)
  5. On behalf of the Ophthalmic Statistics Group

Affiliations:
  1. Research & Development, NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
  2. Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
  3. School of Human and Health Sciences, University of Huddersfield, Queensgate, Huddersfield GB-HD1 3DH, Great Britain
  4. Comprehensive Clinical Trials Unit, University College London, Great Britain
  5. Department of Primary Care and Population Health, University College London, Great Britain

Correspondence to Dr John Stephenson: j.stephenson{at}


Many statistical analyses in ophthalmic and other clinical fields are concerned with describing relationships between one or more ‘predictors’ (explanatory or independent variables) and, usually, one outcome measure (response or dependent variable). Our earlier statistical notes make reference to the fact that statistical techniques often make assumptions about data.1,2 Assumptions may relate to the outcome variable, to the predictor variable or indeed to both; common assumptions are that data follow normal (Gaussian) distributions and that observations are independent. It is, of course, entirely possible to ignore such assumptions, but doing so is not good statistical practice, and in medicine poor statistical practice can impact negatively upon patients and the public.3

One approach when assumptions are not met is to use alternative tests which place fewer restrictions on the data: non-parametric or so-called distribution-free methods.2 A more powerful alternative, however, is to transform your data. While your ‘raw’ (untransformed) data may not satisfy the assumptions needed for a particular test, it is possible that a mathematical function or transformation of the data will. Analyses may then be conducted on the transformed data rather than the raw data.
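As a minimal sketch of this idea, the Python snippet below builds an idealised, positively skewed sample and compares its skewness before and after a natural log transformation. The values are synthetic and are not taken from any study; the sample size `n`, the parameters `mu` and `sigma`, and the helper `sample_skewness` are all illustrative assumptions. A skewness near zero indicates a roughly symmetric distribution.

```python
import math
import statistics


def sample_skewness(values):
    """Moment-based skewness m3 / m2**1.5 (close to 0 for symmetric data)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic, idealised data (an assumption for illustration only):
# n evenly spaced quantiles of a log-normal distribution, so the raw
# values have a long right tail while their logs are symmetric.
n, mu, sigma = 71, 3.2, 0.5
standard_normal = statistics.NormalDist()
raw = [
    math.exp(mu + sigma * standard_normal.inv_cdf((i + 0.5) / n))
    for i in range(n)
]

# Natural log transformation; valid only for strictly positive values.
logged = [math.log(v) for v in raw]

raw_skew = sample_skewness(raw)
log_skew = sample_skewness(logged)
print(f"skewness of raw data:             {raw_skew:.3f}")
print(f"skewness of log-transformed data: {log_skew:.3f}")
```

The log transformation requires strictly positive values, and real data will rarely become exactly symmetric. Whether a transformation has helped should still be checked, for example with a histogram of the transformed values.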

Scenario 1: A study to evaluate the accuracy of intraocular lens power estimation in eyes having phacovitrectomy for rhegmatogenous retinal detachment4 measured the axial length (in mm) of 71 eyes. The raw data (figure 1A) exhibited a fairly strong positive skew (rather than being symmetric, the histogram has an extended tail to the right); the same data with a logarithmic transformation applied (figure 1B) appear much more normal (less of a tail …


  • Correction notice This article has been corrected since it was published Online First. The title has been corrected to match the series title, and 'On behalf of the Ophthalmic Statistics Group' has been added to the author list.

  • Contributors JS drafted the paper. CJD, CB and NF critically reviewed and revised the paper. CB redrafted the paper after review. CJD, JS and JF critically reviewed the redraft.

  • Funding CB is partly funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology.

  • Competing interests None declared.