A senior colleague asks me to critique a paper that reports having used multivariate statistical methods to suggest an inhibitory effect of maternal smoking on the development of severe retinopathy of prematurity (ROP).

I access the internet and find that the paper has been published in a peer-reviewed journal of high repute and that it reports an analysis conducted using data from 86 premature (<32 weeks’ gestation) infants. ROP grading had been evaluated in accordance with the International Classification of Retinopathy of Prematurity.

I am not familiar with the term multivariate and so I consult the internet and statistical books.

In univariate techniques, there is a single outcome or dependent (response) variable (in this instance, development of severe ROP) and one independent or explanatory variable, sometimes also termed a ‘covariate’ (in this instance, birth weight or gestational age, for example). Univariate logistic regression could be used to identify which variables are associated with the odds of severe ROP. I would create a series of models, one for each of the explanatory variables recorded within this study, and in each case I would explore the association between that explanatory variable and severe ROP development. While this might be of interest, I would probably want to use information on several of the recorded variables simultaneously to determine disease development, and for this, I would use multiple variable or multivariable logistic regression. Multivariable methods are the tools to use when there is one dependent/response variable but more than one independent/predictor/explanatory variable. Multivariable methods may be used to identify which of several potential predictor variables is ‘important’, to develop a prognostic model from several predictor variables or to remove the possible effects of ‘nuisance’ variables or confounders.
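The two model forms can be written out in symbols. The notation here is an assumption for illustration and does not come from the paper: p is the probability of severe ROP and x_1, ..., x_k are the explanatory variables.

```latex
\begin{align}
\text{univariate:} \quad & \log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 \\
\text{multivariable:} \quad & \log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k
\end{align}
```

In both forms there is a single dependent variable. In the multivariable form, exp(β_j) is the odds ratio for a one-unit increase in x_j with the other explanatory variables held fixed.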

While I now understand univariate and multivariable techniques, I am unsure whether multivariate is the same as multivariable. I learn that, in statistical usage, the term multivariate is reserved for situations where all variables are treated equally or where there is more than one dependent variable.

The first situation, that is, where variables are regarded as on an equal footing, covers methods that are typically used for data reduction purposes (ie, reducing the number of variables in an analysis) to examine the relationships between individuals or the relationships between variables or to develop rules to classify subjects into groups. Such methods might also be used in the development of measures or scales for complex underlying concepts such as ‘visual dysfunction’ or ‘vision disability’.

The second situation is where we have two or more dependent variables and we wish to examine relationships between these and several explanatory variables. This class of multivariate analyses includes multivariate linear regression and multivariate analysis of variance.

It appears to me that multivariate methods have not been employed. This paper has a single dependent variable and multiple independent variables and the authors have used multivariable logistic regression and not a multivariate method. I wonder whether it matters and learn that the statistical methods section within a paper should allow the reader to comprehend fully what has been done.

I advise my senior colleague that multivariate methods have not been used and that perhaps this misunderstanding throws doubt on the statistical validity.

I consider other aspects of the multivariable model. My understanding is that the regression coefficients no longer give me a simple assessment of how each factor relates to the outcome variable but something more complicated. In a model with two independent variables or covariates (say maternal smoking and gestational age), the coefficients, now called marginal (or adjusted or conditional) coefficients, provide an estimate of the effect of maternal smoking on the development of ROP while ‘holding’ gestational age constant. My understanding is that this gives me a measure of association between the odds of severe ROP and maternal smoking in babies with similar gestational ages, for example, two babies with a gestational age of 27 weeks or two babies with a gestational age of 30 weeks. If only these two covariates are in the model, an assumption is being made that the effect of maternal smoking on the odds of severe ROP is the same irrespective of gestational age. If the effect of maternal smoking differed according to the gestational age of the baby (older babies having been exposed to indirect smoking for longer than younger babies), I learn that an interaction term would need to be included in the model. The paper does not mention any examination of potential interactions, but I learn that interactions are often not fully explored because detecting them requires a lot of data, and frequently the data are insufficient.
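The role of an interaction term can be sketched numerically. The coefficients below are invented purely to illustrate the algebra (they are not estimates from the paper), and the function names are hypothetical. Without an interaction term, the model implies the same odds ratio for smoking at every gestational age; with one, the odds ratio changes with gestational age.

```python
import math

def log_odds(smoker, ga_weeks, b0, b_smoke, b_ga, b_inter=0.0):
    """Linear predictor of a logistic model with an optional interaction term."""
    return b0 + b_smoke * smoker + b_ga * ga_weeks + b_inter * smoker * ga_weeks

def smoking_odds_ratio(ga_weeks, **coefs):
    """Odds ratio (smoker vs non-smoker) at a fixed gestational age."""
    diff = log_odds(1, ga_weeks, **coefs) - log_odds(0, ga_weeks, **coefs)
    return math.exp(diff)

# Hypothetical coefficients, chosen only for illustration.
no_interaction = dict(b0=10.0, b_smoke=-1.0, b_ga=-0.4)
with_interaction = dict(b0=10.0, b_smoke=-4.0, b_ga=-0.4, b_inter=0.1)

for ga in (27, 30):
    print(ga,
          round(smoking_odds_ratio(ga, **no_interaction), 3),
          round(smoking_odds_ratio(ga, **with_interaction), 3))
```

In the first model the odds ratio is exp(b_smoke) whatever the gestational age. In the second it is exp(b_smoke + b_inter × gestational age), so a single odds ratio no longer summarises the effect of smoking.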

In a model with three independent variables or covariates (say maternal smoking, gestational age and the gender of the baby), the marginal coefficients give an estimate of the association between the odds of severe ROP and maternal smoking while ‘holding’ gestational age and the gender of the baby constant. The model is therefore looking at the effect of maternal smoking versus not smoking in babies of the same sex and of the same gestational age. Again, this model assumes that these covariate effects do not depend on the levels of the other factors, that is, that there are no interactions. While 86 premature babies seemed like a reasonable number to explore associations, I now see why large numbers are needed to assess models reliably. The more variables that are included in the model, the more thinly the data are stretched, and for some combinations of covariate values there may simply be no data to support the examination. A model is then being fitted with limited ability to assess its fit.

The multivariable model reported in the paper contains seven covariates. I learn that a rule of thumb for logistic regression models is that the number of observed events (or non-events, whichever is smaller) for each independent variable considered within a model should be at least 10.
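The rule of thumb can be checked with simple arithmetic. The sketch below uses the figures reported for this study (86 babies, 27 of whom developed severe ROP, and seven covariates); the function names are hypothetical.

```python
def events_per_variable(n_events, n_total, n_covariates):
    """Events per variable, using the smaller of events and non-events."""
    limiting = min(n_events, n_total - n_events)
    return limiting / n_covariates

def max_covariates(n_events, n_total, min_epv=10):
    """Largest number of covariates that satisfies the rule of thumb."""
    limiting = min(n_events, n_total - n_events)
    return limiting // min_epv

epv = events_per_variable(n_events=27, n_total=86, n_covariates=7)
print(round(epv, 2))            # 3.86
print(max_covariates(27, 86))   # 2
```

With fewer than four events per variable, the seven-covariate model falls well short of the rule of thumb, which on these numbers would support about two covariates.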

In the model with three factors, the logistic regression model gives me an estimate of the effect of maternal smoking on severe ROP in premature babies of the same gestational age and the same gender. Each time a variable is added to the model, I must consider that an additional variable is being held constant. I start to realise how little data are contributing to these adjusted estimates. For example, only one mother of the 27 babies with severe retinopathy was a smoker.
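A rough count shows how quickly the data are stretched. The sketch below assumes, purely for illustration, that every covariate is binary, so that k covariates define 2^k distinct covariate patterns. Real covariates such as gestational age are not binary, which only makes matters worse. The function name is hypothetical.

```python
def babies_per_pattern(n_babies, n_binary_covariates):
    """Average number of babies per covariate pattern if every covariate were binary."""
    n_patterns = 2 ** n_binary_covariates
    return n_babies / n_patterns

# 86 babies spread over the patterns defined by 1, 2, 3 and 7 binary covariates.
for k in (1, 2, 3, 7):
    print(k, 2 ** k, round(babies_per_pattern(86, k), 2))
```

Under this simplification, seven covariates define 128 patterns for 86 babies, so on average there is fewer than one baby per pattern.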

The OR estimate under scrutiny is 0.01 with a CI of 0.00 to 0.48. If this were a univariate model, the interpretation would be that the odds of severe ROP in a premature baby of a mother who smoked are 0.01 times the odds of severe ROP in a premature baby of a mother who had not smoked. This, however, is a multivariable model and so I now acknowledge that it is actually saying something slightly different, that is, that the odds of severe ROP in a premature baby of a mother who smoked are 0.01 times the odds in a premature baby of a mother who had not smoked when the other six covariates in the model are held constant.

In addition to statistical uncertainty, I determine that there are sources of bias that do not appear to have been adequately dealt with. Smoking status was self-reported by mothers at their first visit to the mother and baby centre—might some wish not to disclose such information for fear of recrimination? Mothers who discontinued smoking during pregnancy or who had an ‘uncertain’ smoking status were excluded from the study (selection bias) and we are not told how many such exclusions there were.

At the journal club, I present the paper. We conclude that it provides no robust evidence of an inhibitory effect of maternal smoking on the development of severe ROP, and we come away with a greater understanding of multivariate and multivariable statistical techniques.

Multivariate methods are not the same as multivariable methods.

Multivariate methods have more than one dependent variable or place variables on an equal footing.

Multivariable methods have one dependent variable and more than one independent variable or covariate.

Regression coefficients from multivariable models need careful interpretation as their meaning differs from that of coefficients from a univariate model.

The number of observed events (or non-events, whichever is smaller) for each independent variable considered within a multiple variable logistic regression model should be at least 10.

CB drafted the paper. GZ, MG, CJD, CB and NF critically reviewed and revised the paper.

CB is partly funded/supported by the National Institute for Health Research (NIHR) Biomedical Research Centre based at Guy’s and St Thomas' NHS Foundation Trust and King’s College London. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

Competing interests: None declared.

Provenance and peer review: Not commissioned; externally peer reviewed.

This article has been corrected since it was published Online First. The affiliation for author Mariusz Tadeusz Grzeda has been corrected to affiliation number 3. The affiliation for author Gabriela Czanner has been corrected to affiliation number 2 only.