Pooling covariance components
A standard task in quantitative genetics is the estimation of
covariance matrices, often for many traits and multiple sources of
variation. Frequently, full multivariate analyses for all traits of
interest are not computationally feasible, and estimates for all
components of interest are instead obtained from analyses by parts.
Typically, this comprises numerous low-dimensional analyses of
overlapping subsets of traits. In the simplest scenario, these may be
bivariate analyses for all pairs of traits. In other cases, e.g. when
some traits are subject to selection based on other traits of
interest, tri- or higher-variate analyses may be needed to counteract
selection bias.
The task then is to combine, or 'pool', the resulting estimates into
overall covariance matrices for all traits. The main problems inherent
in this are:
- The resulting matrices need to be positive (semi-)definite, i.e. must
not have negative eigenvalues.
- The elements of the overall matrices should be close to the estimates
from the individual analyses. In particular, the phenotypic components
should not be distorted.
- Different part analyses may involve very different numbers of
observations, and their results should thus be weighted differently.
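The first problem can be illustrated with a minimal sketch in Python (NumPy assumed). The numbers are made up, and the helper names `assemble_from_pairs`, `min_eigenvalue` and `clip_to_psd` are hypothetical. Eigenvalue clipping is shown only as a naive repair for comparison; it is not the penalized likelihood approach described on this page.

```python
import numpy as np

def assemble_from_pairs(variances, pair_covs):
    """Build a symmetric matrix from per-trait variances and pairwise covariances."""
    m = np.diag(np.asarray(variances, dtype=float))
    for (i, j), cov in pair_covs.items():
        m[i, j] = m[j, i] = cov
    return m

def min_eigenvalue(m):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(m).min())

def clip_to_psd(m, floor=0.0):
    """Naive repair: set eigenvalues below `floor` to `floor` and rebuild the matrix."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, floor, None)
    return (vecs * vals) @ vecs.T

# Made-up bivariate results for three traits: each pair is valid on its own,
# but the three covariances are mutually inconsistent.
pooled = assemble_from_pairs(
    [1.0, 1.0, 1.0],
    {(0, 1): 0.9, (0, 2): 0.9, (1, 2): -0.9},
)
repaired = clip_to_psd(pooled)

print("smallest eigenvalue before repair:", min_eigenvalue(pooled))
print("smallest eigenvalue after repair: ", min_eigenvalue(repaired))
print("variances after repair:", np.diag(repaired))
```

Clipping removes the negative eigenvalue but also changes the variances on the diagonal, which is the kind of distortion the second problem warns against.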
The penalized likelihood approach
We propose a likelihood approach to combine estimates for all sources
of variation simultaneously. In brief, this relies on treating
estimates from individual part analyses as data, i.e. matrices of
corrected mean squares and cross-products. As such, it is similar to
the methods suggested by Mantysaari (1999) (as implemented in
PDMATRIX) and Thompson et al. (2005). However, there are two
improvements:
- Estimates for different sources of variation from the same part
analysis are combined assuming a simple pseudo pedigree structure, and
all covariance matrices are pooled at the same time. Doing so
approximates sampling covariances between estimates from the same
analysis, and thus restricts changes in their sum, i.e. the phenotypic
components.
- We can maximize the likelihood subject to a penalty aimed at
'borrowing strength' from the estimate of the phenotypic covariance
matrix. As shown for full multivariate analyses, this can substantially
reduce sampling variation, and thus yield estimates which are, on
average, closer to the population values.
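The idea of treating part estimates as data can be sketched as follows, in Python with NumPy. This is a simplified illustration for a single source of variation only, under the assumption that each part estimate acts as a matrix of mean squares and cross-products with a Wishart-type log-likelihood kernel weighted by its number of observations. It includes neither the pseudo pedigree structure nor the penalty, and the function name `pooled_loglik` and all numbers are hypothetical.

```python
import numpy as np

def pooled_loglik(sigma, parts):
    """Log-likelihood kernel of a candidate matrix `sigma` given part estimates.

    `parts` is a list of (trait_indices, estimate, n_obs) tuples. Each part
    contributes -0.5 * n_obs * (log|S| + tr(S^-1 E)), where S is the submatrix
    of `sigma` for the traits in that part and E is the part estimate.
    """
    # A candidate with a non-positive eigenvalue is not a valid covariance matrix.
    if np.linalg.eigvalsh(sigma).min() <= 0.0:
        return -np.inf
    total = 0.0
    for idx, est, n_obs in parts:
        sub = sigma[np.ix_(idx, idx)]
        _, logdet = np.linalg.slogdet(sub)
        total += -0.5 * n_obs * (logdet + np.trace(np.linalg.solve(sub, est)))
    return total

# Made-up 'true' matrix; the part estimates are its exact bivariate submatrices,
# with different (made-up) numbers of observations acting as weights.
sigma_true = np.array([[1.0, 0.5, 0.3],
                       [0.5, 1.0, 0.4],
                       [0.3, 0.4, 1.0]])
parts = [
    ([0, 1], sigma_true[np.ix_([0, 1], [0, 1])], 500),
    ([0, 2], sigma_true[np.ix_([0, 2], [0, 2])], 200),
    ([1, 2], sigma_true[np.ix_([1, 2], [1, 2])], 1000),
]

ll_true = pooled_loglik(sigma_true, parts)
ll_identity = pooled_loglik(np.eye(3), parts)

# A candidate whose bivariate blocks look fine but which is not positive definite.
sigma_bad = np.array([[1.0, 0.9, 0.9],
                      [0.9, 1.0, -0.9],
                      [0.9, -0.9, 1.0]])
ll_bad = pooled_loglik(sigma_bad, parts)

print(ll_true > ll_identity, ll_bad)
```

In practice such a function would be maximized over a parameterization that keeps the full matrix positive definite (e.g. a Cholesky factor); here it is only evaluated at a few candidates to show how parts with more observations carry more weight.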
Software
The penalized likelihood approach has been implemented as an add-on
to our software package WOMBAT. It is invoked using the run option
--pool. There are two forms of input, one to combine part
results obtained using WOMBAT, and a 'general' form suitable for
estimates from any source. Details are described in the WOMBAT
User Manual.
There is a set of notes describing the specific options available in
conjunction with pooling (e.g. to select the pseudo pedigree
structures or penalty to be imposed), and a worked example.
Downloads
- Meyer, K. (2012). "Pooling estimates of covariance components by penalized maximum likelihood using WOMBAT".
· Notes (pdf file, 151 KB, 2904 downloads).
- Worked example for WOMBAT: Available as Example 15 from the WOMBAT downloads page.
- Meyer, K. (2012). "Pooling estimates of covariance components using a penalized likelihood approach". 4th International Congress on Quantitative Genetics, Edinburgh, Scotland, June 17-22, 2012.
· Poster (pdf file, 1.96 MB, 2528 downloads).
- Meyer, K. (2012). "A penalized likelihood approach to pooling estimates of covariance components from analyses by parts".
· Manuscript (pdf file, 220 KB, 3465 downloads).