WOMBAT – A program for Mixed Model Analyses by Restricted Maximum Likelihood

Hypothesis testing for REML analyses

With a maximum likelihood framework of estimation, tests based on the (log) likelihood are the obvious choice for hypothesis testing.

Random effects

REML (restricted or residual maximum likelihood) estimation involves maximising the likelihood of the residuals, i.e. the observations adjusted for estimates of the fixed effects fitted. In this way, REML accounts for the degrees of freedom used by fitting fixed effects, in contrast to `full' maximum likelihood, which does not and thus yields underestimates of the residual variance.
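The difference in degrees of freedom can be illustrated with a minimal sketch for the simplest possible case: a model with fixed effects only and independent residuals, where the two estimators reduce to dividing the residual sum of squares by n or by n minus the number of fixed effects. The function name `residual_variance` and the data values are hypothetical and chosen for illustration only; this is not how WOMBAT computes its estimates.

```python
def residual_variance(residuals, n_fixed, method="REML"):
    """Estimate the residual variance from least-squares residuals.

    Assumes a model with fixed effects only and independent residuals.
    n_fixed is the number of (independent) fixed effects fitted.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)  # residual sum of squares
    if method == "REML":
        # REML divides by the degrees of freedom left after fitting fixed effects
        return rss / (n - n_fixed)
    if method == "ML":
        # `full' ML ignores the degrees of freedom used by the fixed effects
        return rss / n
    raise ValueError("method must be 'REML' or 'ML'")


# Hypothetical observations with an overall mean as the only fixed effect
y = [4.1, 5.3, 6.0, 4.7, 5.9]
mean = sum(y) / len(y)
residuals = [v - mean for v in y]

ml = residual_variance(residuals, n_fixed=1, method="ML")
reml = residual_variance(residuals, n_fixed=1, method="REML")
print(f"ML: {ml:.4f}  REML: {reml:.4f}")
```

In this sketch the ML estimate is always smaller than the REML estimate by the factor (n - n_fixed)/n, which is the underestimation referred to above.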

However, this does imply that the REML likelihood does not contain any information about the fixed effects fitted. Consequently, only models (with nested random effects) which fit exactly the same fixed effects can be compared using likelihood based tests.

Points to remember:

  1. REML maximises the part of the likelihood which does not depend on the fixed effects fitted.
  2. The REML maximum log likelihood `as is' cannot be used to test fixed effects.
  3. Only models with the same fixed effects can be compared using the REML log likelihood.
  4. The likelihood cannot decrease when another parameter is added to the model.

Likelihood ratio test

Say we have a vector of parameters ${\pmb \theta}$ (i.e. (co)variance components or functions of such components) and their REML estimates $\hat{\pmb \theta}$, and let $\log L_{Max}(\hat{\pmb \theta})$ denote the corresponding maximum log likelihood value. Partition ${\pmb \theta}$ into the vector of p parameters we want to test, ${\pmb \theta}_{1}$, and the remaining q parameters, ${\pmb \theta}_{2}$ (re-ordering ${\pmb \theta}$ if necessary). Further, let the null hypothesis to be tested be $$ {\mathrm H}_{0}: {\pmb \theta}_{1} = {\bf t} $$

with the alternative hypothesis $$ {\mathrm H}_{A}: {\pmb \theta}_{1} \ne {\bf t} $$

To carry out a likelihood ratio test, we need to find the REML estimates of ${\pmb \theta}_{2}$ fixing ${\pmb \theta}_{1}$ at the `test values' ${\bf t}$, and the corresponding conditional maximum likelihood value $\log L_{Max}( \hat{\pmb \theta}_{2} | {\pmb \theta}_{1} = {\bf t})$.

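The likelihood ratio test criterion is then computed as $$ \Lambda = -2 \left( \log L_{Max}( \hat{\pmb \theta}_{2} | {\pmb \theta}_{1} = {\bf t} ) - \log L_{Max}( \hat{\pmb \theta} ) \right) $$ which is non-negative, as the maximum under the restriction cannot exceed the unrestricted maximum.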

Asymptotically, under ${\rm H}_{0}$, $\Lambda$ has a $\chi^{2}$ distribution with p degrees of freedom. Hence, ${\rm H}_{0}$ is rejected if $\Lambda$ exceeds the critical value of the $\chi^{2}$ distribution for p degrees of freedom and a chosen error probability, and is not rejected otherwise. For example, for $p=1$ and an error probability of 5%, $\Lambda$ needs to exceed 3.84 for ${\rm H}_{0}$ to be rejected.
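As a minimal sketch of this calculation, the following takes $\Lambda$ as twice the difference between the unrestricted and the restricted maximum log likelihood (so that it is non-negative) and evaluates the $\chi^{2}$ tail probability using only the Python standard library. The function names and the log likelihood values are hypothetical; in practice the two values would be taken from the output of two REML runs fitting the same fixed effects.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability P(X > x) for a chi-square variable with df d.f.

    Uses the series expansion of the regularised lower incomplete gamma
    function, which is adequate for the moderate values of x met here.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    log_prefactor = a * math.log(z) - z - math.lgamma(a)
    p_lower = total * math.exp(log_prefactor)
    return max(0.0, 1.0 - p_lower)


def likelihood_ratio_test(loglik_full, loglik_reduced, n_tested):
    """Return (Lambda, p-value) for a test of n_tested parameters.

    loglik_full:    maximum log likelihood with all parameters estimated
    loglik_reduced: maximum log likelihood with the tested parameters fixed
    """
    lam = 2.0 * (loglik_full - loglik_reduced)
    lam = max(lam, 0.0)  # guard against tiny negative values due to convergence
    return lam, chi2_sf(lam, n_tested)


# Hypothetical maximum log likelihood values from two nested analyses
lam, p_value = likelihood_ratio_test(-1234.56, -1237.10, n_tested=1)
print(f"Lambda = {lam:.2f}, p = {p_value:.4f}")
```

With these illustrative values $\Lambda$ exceeds 3.84, so the null hypothesis would be rejected at the 5% level for $p=1$.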

Note that the use of likelihood ratio tests for model comparison is only valid for nested models.

Parameters at the boundary

Note further that the alternative hypothesis implies a two-sided test. Hence it is not directly applicable for tests at the boundary of the parameter space, e.g. to test whether a variance component is greater than zero or not. In this case, $\Lambda$ is distributed as a mixture of $\chi^{2}$ variables, with details depending on the values of p and q and how many of the elements of $\hat{\pmb \theta}_{1}$ and $\hat{\pmb \theta}_{2}$ are at the boundary. Details are given by Self and Liang (1987); see also Dominicus et al. (2006) or Visscher (2006) for discussions in a genetics (variance component) context and suitable adjustments in simple cases.
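For the simplest such case, a test of a single variance component against zero with no other parameter at the boundary, the asymptotic distribution of $\Lambda$ is a 50:50 mixture of a point mass at zero and a $\chi^{2}$ variable with one degree of freedom, so that the usual tail probability is halved. A minimal sketch, assuming exactly this case and using a hypothetical function name, is given below; it does not cover the more general mixtures described by Self and Liang (1987).

```python
import math


def boundary_lrt_p_value(lam):
    """p-value for testing one variance component against zero.

    Assumes the 50:50 mixture of chi-square distributions with 0 and 1
    degrees of freedom, i.e. a single parameter tested at the boundary
    and no other parameters on the boundary.
    """
    if lam <= 0:
        return 1.0
    # Tail probability of a chi-square with 1 d.f. is erfc(sqrt(lam / 2))
    chi2_1_tail = math.erfc(math.sqrt(lam / 2.0))
    return 0.5 * chi2_1_tail


# The 5% critical value for the mixture is about 2.71 rather than 3.84
print(f"p at Lambda = 2.71: {boundary_lrt_p_value(2.71):.4f}")
print(f"p at Lambda = 3.84: {boundary_lrt_p_value(3.84):.4f}")
```

Using the standard critical value of 3.84 in this situation would make the test conservative, since 3.84 corresponds to an error probability of about 2.5% under the mixture.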

Information criteria

Likelihood ratio tests are known to favour more detailed models. Hence, for scenarios with many parameters, model comparisons are often based on the likelihood penalised for the number of parameters to be estimated.

Akaike Information Criterion

The tendency of likelihood ratio tests to favour the most detailed model arises because $\log L$ is a biased estimator of the `information'. Akaike (1973) showed that this bias is approximately proportional to the number of parameters estimated. The Akaike information criterion (AIC) (pronounced, approximately, ah-kah-ee-kay) corrects for this bias by penalising the likelihood accordingly, i.e. $$ \mathrm{AIC} = -2 \log L + 2 p $$ with p the number of parameters. The model considered to fit the data `best' is then the model with the lowest AIC value.
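A minimal sketch of a model comparison based on AIC follows. The model names, log likelihood values and parameter counts are hypothetical and serve only to show the arithmetic; as for likelihood ratio tests, the REML log likelihoods compared are assumed to come from analyses fitting the same fixed effects.

```python
def aic(loglik, n_params):
    """Akaike information criterion: -2 log L + 2 p."""
    return -2.0 * loglik + 2.0 * n_params


# Hypothetical analyses: name -> (maximum log likelihood, no. of parameters)
models = {
    "model_A": (-1237.10, 3),
    "model_B": (-1234.56, 4),
    "model_C": (-1234.20, 6),
}

scores = {name: aic(loglik, p) for name, (loglik, p) in models.items()}
best = min(scores, key=scores.get)  # the lowest AIC value is considered `best'

for name, value in scores.items():
    print(f"{name}: AIC = {value:.2f}")
print(f"Preferred model: {best}")
```

In this illustration the most detailed model has the highest log likelihood, but the small gain over the four-parameter model does not outweigh the penalty for its two additional parameters.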

Schwarz' or Bayesian Information Criterion

Score test

Wald's test

Fixed effects

As emphasised above, REML estimation maximises the part of the likelihood which does not depend on the fixed effects fitted. This implies that the REML maximum log likelihood cannot be used directly in making any inference about fixed effects.

See Boik et al. (1993) and Welham and Thompson (1997) for discussions on testing of fixed effects in conjunction with REML estimation.


References

Self, S. G. and Liang, K. Y. (1987). Asymptotic properties of the maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Amer. Stat. Ass. 82: 605–610.

Boik, R. J., Tess, M. W. and Todd, C. (1993). Technical note: computing tests of fixed effects in a restricted class of mixed models. J. Anim. Sci. 71(1): 51–56.

Welham, S. J. and Thompson, R. (1997). Likelihood ratio tests for fixed model terms using residual maximum likelihood. J. Roy. Stat. Soc. B 59: 701–714.

Verbeke, G. and Molenberghs, G. (2003). The Use of Score Tests for Inference on Variance. Biometrics 59(2): 254–254. doi:10.1111/1541-0420.00032

Dominicus, A., Skrondal, A., Gjessing, H. K., Pedersen, N. L. and Palmgren, J. (2006). Likelihood Ratio Tests in Behavioral Genetics: Problems and Solutions. Behav. Genet. 36: 331–340.

Visscher, P. M. (2006). A Note on the Asymptotic Distribution of Likelihood Ratio Tests to Test Variance Components. Twin Res. Hum. Genet. 9: 490–495.