WOMBAT – A program for Mixed Model Analyses by Restricted Maximum Likelihood

Hypothesis testing for REML analyses

With a maximum likelihood framework of estimation, tests based on the (log) likelihood are the obvious choice for hypothesis testing.

Random effects

REML (restricted or residual maximum likelihood) estimation involves maximising the likelihood of the residuals, i.e. the observations adjusted for estimates of the fixed effects fitted. In this way, REML accounts for the degrees of freedom used by fitting fixed effects, in contrast to `full' maximum likelihood, which does not and thus yields underestimates of the residual variance.
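The difference is easiest to see in the simplest possible model, with the overall mean as the only fixed effect and a single residual variance. The sketch below uses made-up observations (the values and variable names are illustrative assumptions, not WOMBAT output): the `full' ML estimate divides the residual sum of squares by n, while the REML estimate divides by n minus the degrees of freedom used by the fixed effects.

```python
# Illustrative data only: five hypothetical observations, with the overall
# mean as the single fixed effect and one residual variance to estimate.
y = [4.1, 5.3, 6.0, 4.8, 5.9]

n = len(y)
n_fixed = 1  # degrees of freedom used by the fixed effects (here: the mean)

mean = sum(y) / n
ss_resid = sum((obs - mean) ** 2 for obs in y)  # residual sum of squares

ml_var = ss_resid / n                # 'full' ML: ignores the df used for the mean
reml_var = ss_resid / (n - n_fixed)  # REML: accounts for the df used for the mean

print(f"ML estimate of residual variance:   {ml_var:.4f}")
print(f"REML estimate of residual variance: {reml_var:.4f}")
```

The ML estimate is always smaller than the REML estimate by the factor (n - n_fixed) / n, which matters most when the number of fixed effects is large relative to the number of records.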

However, this does imply that the REML likelihood does not contain any information about the fixed effects fitted. Consequently, only models (with nested random effects) which fit exactly the same fixed effects can be compared using likelihood based tests.

Likelihood ratio test

Say we have a vector of parameters $\theta$ (i.e. (co)variance components or functions of such components) and their REML estimates $\hat{\theta}$, and let $\log L(\hat{\theta})$ denote the corresponding maximum log likelihood value. Partition $\theta$ into the vector of $p$ parameters we want to test, $\theta_1$, and the remaining $q$ parameters, $\theta_2$ (re-ordering $\theta$ if necessary). Further, let the null hypothesis to be tested be

$$H_0: \theta_1 = \theta_1^0$$

with the alternative hypothesis

$$H_1: \theta_1 \neq \theta_1^0$$

To carry out a likelihood ratio test, we need to find the REML estimates of $\theta_2$ fixing $\theta_1$ at the `test values' $\theta_1^0$, and the corresponding conditional maximum log likelihood value $\log L(\hat{\theta}_2 \mid \theta_1^0)$.

The likelihood ratio test criterion is then computed as

$$\lambda = -2 \left[ \log L(\hat{\theta}_2 \mid \theta_1^0) - \log L(\hat{\theta}) \right]$$

Asymptotically, $\lambda$ has a $\chi^2$ distribution with $p$ degrees of freedom. Hence, $H_0$ is rejected if $\lambda$ exceeds the value from the $\chi^2$ distribution for $p$ degrees of freedom and a chosen error probability, and $H_0$ is accepted otherwise. For example, for $p = 1$ and an error probability of 5%, $\lambda$ needs to exceed 3.84 for $H_0$ to be rejected.
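As a rough illustration of the procedure, the sketch below computes the test criterion from two maximum log likelihood values and looks up the corresponding tail probability of the chi-squared distribution. The log likelihood values and the function names `chi2_sf` and `likelihood_ratio_test` are hypothetical; in practice the two values would be taken from the output of the full and the reduced analysis. Only the Python standard library is used, so the chi-squared tail probability is built from its closed forms for 1 and 2 degrees of freedom and a standard recurrence.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for 1 and 2 degrees of freedom ...
    if df % 2 == 1:
        prob, k = math.erfc(math.sqrt(half)), 1
    else:
        prob, k = math.exp(-half), 2
    # ... extended two df at a time:
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        prob += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return prob


def likelihood_ratio_test(logl_full, logl_reduced, p):
    """Return the test criterion and its asymptotic p-value for p tested parameters."""
    lam = -2.0 * (logl_reduced - logl_full)
    return lam, chi2_sf(lam, p)


# Hypothetical maximum log likelihood values from two nested analyses
# fitting exactly the same fixed effects.
logl_full = -1234.5     # all parameters estimated
logl_reduced = -1237.2  # one parameter fixed at its test value

lam, p_value = likelihood_ratio_test(logl_full, logl_reduced, p=1)
print(f"test criterion = {lam:.2f}, p-value = {p_value:.4f}")
```

With these made-up values the criterion exceeds 3.84, so the null hypothesis would be rejected at an error probability of 5%.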

Note that the use of likelihood ratio tests for model comparison is only valid for nested models.

Parameters at the boundary

Note further that the alternative hypothesis implies a two-sided test. Hence the test is not directly applicable at the boundary of the parameter space, e.g. to test whether a variance component is greater than zero or not. In this case, the likelihood ratio test criterion is distributed as a mixture of $\chi^2$ variables, with details depending on the values of p and q and on how many of the tested and of the remaining parameters are at the boundary. Details are given by Self and Liang (1987); see also Dominicus et al. (2006) or Visscher (2006) for discussions in a genetics (variance component) context and suitable adjustments in simple cases.
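For the simplest case, a single variance component tested against zero with no other parameter at the boundary, the commonly quoted result is an equal mixture of chi-squared distributions with 0 and 1 degrees of freedom. The sketch below covers that case only; the function name `boundary_p_value` is hypothetical, and more complicated situations need the mixtures described in the references above.

```python
import math


def boundary_p_value(lam):
    """p-value under a 50:50 mixture of chi-squared(0) and chi-squared(1).

    Assumes one variance component is tested against zero and no other
    parameter lies on the boundary of the parameter space.
    """
    if lam <= 0:
        return 1.0
    # chi-squared(0) has all its mass at zero, so only the 1-df half
    # contributes; P(chi-squared(1) > lam) = erfc(sqrt(lam / 2)).
    return 0.5 * math.erfc(math.sqrt(lam / 2.0))


# Hypothetical test criterion from comparing analyses with and without
# the variance component in question.
lam = 3.0
print(f"naive p-value:    {math.erfc(math.sqrt(lam / 2.0)):.4f}")
print(f"adjusted p-value: {boundary_p_value(lam):.4f}")
```

In effect, the p-value from the usual 1-df test is halved, so that at an error probability of 5% the critical value drops from 3.84 to about 2.71.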

Information criteria

Likelihood ratio tests are known to favour more detailed models. Hence, for scenarios with many parameters, model comparisons are often based on the likelihood penalised for the number of parameters to be estimated.


This preference for the most detailed model is due to $\log L$ being a biased estimator of the `information'. Akaike (1973) showed that this bias is approximately proportional to the number of parameters estimated. The Akaike information criterion (AIC; pronounced, approximately, ah-kah-ee-kay) corrects for this bias by penalising the likelihood accordingly, i.e. $\mathrm{AIC} = -2 \log L + 2p$, with $p$ the number of parameters. The model considered to fit the data `best' is then the model with the lowest AIC value.
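A minimal sketch of such a comparison is given below. The model labels, log likelihood values and parameter counts are invented for illustration; with REML, all candidate models would need to fit exactly the same fixed effects for the comparison to be meaningful.

```python
def aic(logl, n_par):
    """Akaike information criterion: -2 log L + 2 p."""
    return -2.0 * logl + 2.0 * n_par


# Hypothetical candidate models: (maximum log likelihood, number of parameters).
models = {
    "A": (-1237.2, 2),
    "B": (-1234.5, 3),
    "C": (-1234.1, 5),
}

aic_values = {name: aic(logl, n_par) for name, (logl, n_par) in models.items()}
best = min(aic_values, key=aic_values.get)  # lowest AIC is considered 'best'

for name in sorted(aic_values):
    print(f"model {name}: AIC = {aic_values[name]:.1f}")
print(f"lowest AIC: model {best}")
```

Here model C has the highest likelihood, but the small improvement over model B does not outweigh the penalty for its two additional parameters.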

Schwarz' or Bayesian Information Criterion

Score test

Wald's test

Fixed effects

As emphasized above, REML estimation maximises the part of the likelihood which does not depend on the fixed effects fitted. This implies that the REML maximum log likelihood cannot be used directly in making any inference about fixed effects.

See Boik et al. (1993) and Welham and Thompson (1997) for discussions on testing of fixed effects in conjunction with REML estimation.


Self, S. G., & Liang, K. Y. (1987). Asymptotic properties of the maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Amer. Stat. Ass., 82, 605-610.
Boik, R. J., Tess, M. W., & Todd, C. (1993). Technical note: computing tests of fixed effects in a restricted class of mixed models. J. Anim. Sci., 71(1), 51-56.
Welham, S. J., & Thompson, R. (1997). Likelihood ratio tests for fixed model terms using residual maximum likelihood. J. Roy. Stat. Soc. B, 59, 701-714.
Verbeke, G., & Molenberghs, G. (2003). The use of score tests for inference on variance components. Biometrics, 59(2), 254-262.
Dominicus, A., Skrondal, A., Gjessing, H. K., Pedersen, N. L., & Palmgren, J. (2006). Likelihood ratio tests in behavioral genetics: problems and solutions. Behav. Genet., 36, 331-340.
Visscher, P. M. (2006). A note on the asymptotic distribution of likelihood ratio tests to test variance components. Twin Res. Hum. Genet., 9, 490-495.

Points to remember:

  1. REML maximises the part of the likelihood which does not depend on the fixed effects fitted.
  2. The REML maximum log likelihood `as is' cannot be used to test fixed effects.
  3. Only models with the same fixed effects can be compared using the REML log likelihood.
  4. The likelihood cannot decrease when another parameter is added to the model.
