This chapter explains some of the assumptions made, strategies and statistical procedures used in WOMBAT.
An essential setup step in REML analyses is to find a permutation of the rows and columns in the coefficient matrix of the mixed model equations that keeps the 'fill-in' arising during factorisation small, and thus minimises the computational effort per likelihood evaluation and REML iterate. This needs to be done only once per analysis (WOMBAT saves the results from this step for re-use in any subsequent steps). As it can have a dramatic impact on the time and memory required per analysis, it is well worth considerable effort to find the 'best' order. Especially for analyses involving large data sets or multiple random effects, the time spent trying several, or even numerous, alternative orderings is readily recouped within the first few iterates [31]. WOMBAT selects a default ordering strategy based on the number of equations in the analysis.
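The impact of the elimination order is easy to demonstrate on a small 'arrowhead' matrix: eliminating the dense 'hub' row first fills the Cholesky factor completely, while eliminating it last produces no fill-in at all. The following is a minimal numpy sketch, purely illustrative; WOMBAT's own ordering strategies operate on the graph of the full mixed model equations:

```python
import numpy as np

n = 6
# Arrowhead matrix: one "hub" equation coupled to all others.
A = 2.0 * np.eye(n)
A[0, 1:] = A[1:, 0] = 1.0
A[0, 0] = float(n)  # keep it diagonally dominant, hence positive definite

def cholesky_fill(M):
    """Count non-zero entries in the Cholesky factor of M."""
    L = np.linalg.cholesky(M)
    return int(np.sum(np.abs(L) > 1e-12))

# Natural order: eliminating the hub first makes the factor completely dense.
nnz_natural = cholesky_fill(A)

# Permute the hub to the end: no fill-in at all.
perm = list(range(1, n)) + [0]
nnz_reordered = cholesky_fill(A[np.ix_(perm, perm)])

print(nnz_natural, nnz_reordered)  # 21 vs 11 non-zeros in the factor
```

The same matrix thus requires almost twice the storage and arithmetic under the poor ordering, and the gap grows quadratically with the number of equations.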
Three different strategies are implemented:
WOMBAT calculates four different criteria to determine whether an analysis has converged. The first two are simple measures of change, available for all maximisation algorithms; the other two are based on the derivatives of the log likelihood function and can thus only be obtained for the AI algorithm. The criteria are:
The change in log likelihood values between iterates, log ℒ^{t} − log ℒ^{t−1}, and the change in the vector of parameter estimates, measured as
\[
\sqrt{\sum_{i=1}^{p} \left(\theta_i^{t} - \theta_i^{t-1}\right)^{2} \Big/ \sum_{i=1}^{p} \left(\theta_i^{t}\right)^{2}} \tag{A.1}
\]
where 𝜃_{i}^{t} denotes the estimate of the i−th parameter from iterate t, and p is the number of parameters.
The norm of the gradient vector,
\[
\left\|\mathbf{g}^{t}\right\| = \sqrt{\sum_{i=1}^{p} \left(\frac{\partial \log \mathcal{L}}{\partial \theta_i}\bigg|_{\boldsymbol{\theta}^{t}}\right)^{2}} \tag{A.2}
\]
The Newton decrement,
\[
\frac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} \frac{\partial \log \mathcal{L}}{\partial \theta_i}\, H_{ij}^{t}\, \frac{\partial \log \mathcal{L}}{\partial \theta_j} \tag{A.3}
\]
where H_{ij}^{t} is the ij-th element of the inverse of the average information matrix for iterate t. This gives a measure of the expected difference of log ℒ^{t} from the maximum, and has been suggested as an alternative convergence criterion [6].
Default values for the thresholds for these criteria used in WOMBAT are summarised in Table A.1.

Table A.1: Default thresholds for the convergence criteria

                                          Algorithm
  Criterion                     AI              (PX) EM       Powell
  -------------------------------------------------------------------
  Change in log ℒ               < 5 × 10^{−4}   < 10^{−5}     < 10^{−4}
  Change in parameters          < 10^{−8}       < 10^{−8}     < 10^{−8}
  Norm of gradient vector       < 10^{−3}       –             –
  Newton decrement              not used        –             –
N.B.: Current values used are rather stringent; ‘softer’ limits combining several criteria may be more appropriate for practical estimation.
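A convergence check along these lines can be sketched as follows. The function name and the exact form of the test are an illustrative reading of the criteria above, using the AI-algorithm thresholds from Table A.1; the Newton decrement is computed for information only, since Table A.1 lists it as not used as a stopping rule for AI:

```python
import numpy as np

def ai_converged(logl_t, logl_prev, theta_t, theta_prev, grad, ai_inv,
                 tol_logl=5e-4, tol_theta=1e-8, tol_grad=1e-3):
    """Check convergence criteria for the AI algorithm (Table A.1 defaults).

    Returns a flag and the Newton decrement, the latter being an estimate
    of the expected difference of log L from its maximum."""
    d_logl = abs(logl_t - logl_prev)
    # Relative change in the vector of parameter estimates
    d_theta = np.sqrt(np.sum((theta_t - theta_prev) ** 2)
                      / np.sum(theta_t ** 2))
    g_norm = np.linalg.norm(grad)
    # Newton decrement / 2: grad' (AI)^-1 grad / 2
    decrement = 0.5 * grad @ ai_inv @ grad
    ok = (d_logl < tol_logl) and (d_theta < tol_theta) and (g_norm < tol_grad)
    return ok, decrement

theta_prev = np.array([1.0, 0.5, 0.25])
theta_t = theta_prev * (1.0 + 1e-9)        # tiny relative change
grad = np.array([1e-4, -2e-4, 5e-5])       # near-zero gradient
ai_inv = np.eye(3)
ok, dec = ai_converged(-1234.5679, -1234.5680, theta_t, theta_prev,
                       grad, ai_inv)
```

With these illustrative values all three thresholds are met, and the decrement confirms the remaining distance to the maximum is negligible.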
At convergence, the inverse of the AI matrix provides lower-bound estimates of the sampling covariances among the parameters estimated. These are used to approximate sampling errors of covariance components and genetic parameters.
For full rank analyses, parameterising to the elements of the Cholesky factors of the corresponding covariance matrices, the AI matrix for the parameters estimated is obtained by first calculating the AI matrix for the covariance components, and then pre- and post-multiplying it by the corresponding Jacobian and its transpose, respectively. Full details are given in Meyer and Smith [39]. This implies that sampling covariances among the covariance components can be obtained directly, by simply inverting the corresponding AI matrix.
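For the full-rank case this equivalence can be verified directly: if J = ∂σ/∂θ is the (square, invertible) Jacobian of the covariance components with respect to the Cholesky elements, then mapping the inverse of the transformed AI matrix back to the covariance scale recovers the inverse of the original AI matrix exactly. A small numpy sketch for a 2 × 2 covariance matrix, with illustrative numbers standing in for an actual AI matrix:

```python
import numpy as np

# Cholesky factor of a 2x2 covariance matrix: parameters theta = (l11, l21, l22)
l11, l21, l22 = 2.0, 0.5, 1.5

# Covariance components sigma = (s11, s12, s22) = (l11^2, l11*l21, l21^2 + l22^2),
# so the Jacobian J[i, j] = d sigma_i / d theta_j is
J = np.array([[2 * l11, 0.0, 0.0],
              [l21, l11, 0.0],
              [0.0, 2 * l21, 2 * l22]])

# Symmetric positive definite stand-in for the AI matrix of the covariance
# components (illustrative values only)
M = np.array([[1.0, 0.2, 0.1],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 1.0]])
A = M @ M.T + np.eye(3)

# AI matrix on the Cholesky scale (J' A J); inverting it and mapping back
# to the covariance scale recovers inv(A) exactly when J is invertible.
A_chol = J.T @ A @ J
samp_cov = J @ np.linalg.inv(A_chol) @ J.T

print(np.allclose(samp_cov, np.linalg.inv(A)))  # True
```

Algebraically, J (J′AJ)⁻¹ J′ = J J⁻¹ A⁻¹ (J′)⁻¹ J′ = A⁻¹, which is why the full-rank case needs no separate approximation.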
For reduced rank analyses, however, the AI matrix for the parameters to be estimated is calculated directly, as outlined by Meyer and Kirkpatrick [36]. Hence, sampling covariances among the corresponding covariance components need to be approximated from the inverse of the AI matrix. WOMBAT estimates the leading columns of the Cholesky factor (L) of a matrix to obtain a reduced rank estimate, Σ = LL′. Let l_{ir} (for i ≥ r) denote the non-zero elements of L. The inverse of the AI matrix then gives approximate sampling covariances Cov(l_{ir}, l_{ks}) among these elements. The ij-th covariance component, σ_{ij}, is
\[
\sigma_{ij} = \sum_{r=1}^{q(i,j)} l_{ir}\, l_{jr}
\]
with q(i,j) = min(i,j,t) and t the rank which the estimate of Σ is set to have. The covariance between two covariance components, σ_{ij} and σ_{km}, is then
\[
\mathrm{Cov}\!\left(\sigma_{ij}, \sigma_{km}\right) = \mathrm{Cov}\!\left(\sum_{r=1}^{q(i,j)} l_{ir}\, l_{jr},\; \sum_{s=1}^{q(k,m)} l_{ks}\, l_{ms}\right)
\]
Using a first order Taylor series approximation to the product of two variables, this can be approximated as
\[
\mathrm{Cov}\!\left(\sigma_{ij}, \sigma_{km}\right) \approx \sum_{r=1}^{q(i,j)} \sum_{s=1}^{q(k,m)} \Big[\, l_{jr} l_{ms}\,\mathrm{Cov}(l_{ir}, l_{ks}) + l_{jr} l_{ks}\,\mathrm{Cov}(l_{ir}, l_{ms}) + l_{ir} l_{ms}\,\mathrm{Cov}(l_{jr}, l_{ks}) + l_{ir} l_{ks}\,\mathrm{Cov}(l_{jr}, l_{ms}) \,\Big] \tag{A.4}
\]
Equation A.4 extends readily to two covariance components belonging to diﬀerent covariance matrices, Σ_{1} and Σ_{2}, and their respective Cholesky factors.
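Equation A.4 is just the bilinear form g′ V g of the linearised σ_{ij}, so it can be cross-checked against an explicit Jacobian computation. A small numpy sketch for three traits and rank two, with 0-based trait and column indices and purely illustrative values:

```python
import numpy as np

q, t = 3, 2  # three traits, rank-2 estimate

# Non-zero elements l_ir (i >= r) of the q x t Cholesky factor, and an
# index map from (i, r) to a position in the parameter vector
elems = [(i, r) for i in range(q) for r in range(min(i + 1, t))]
idx = {e: k for k, e in enumerate(elems)}
Lmat = np.zeros((q, t))
for k, (i, r) in enumerate(elems):
    Lmat[i, r] = 0.5 + 0.25 * k  # illustrative values

# A symmetric positive definite sampling covariance matrix for the l_ir,
# standing in for the inverse of the AI matrix
p = len(elems)
V_l = np.eye(p) + 0.1 * np.ones((p, p))

def cov_sigma(i, j, k, m):
    """First-order approximation (Equation A.4) to Cov(sigma_ij, sigma_km)."""
    total = 0.0
    for r in range(min(i, j, t - 1) + 1):      # r <= q(i,j) = min(i,j,t)
        for s in range(min(k, m, t - 1) + 1):  # s <= q(k,m)
            total += (Lmat[j, r] * Lmat[m, s] * V_l[idx[i, r], idx[k, s]]
                      + Lmat[j, r] * Lmat[k, s] * V_l[idx[i, r], idx[m, s]]
                      + Lmat[i, r] * Lmat[m, s] * V_l[idx[j, r], idx[k, s]]
                      + Lmat[i, r] * Lmat[k, s] * V_l[idx[j, r], idx[m, s]])
    return total

def jac_row(i, j):
    """Gradient of sigma_ij = sum_r l_ir * l_jr w.r.t. the vector of l_ir."""
    g = np.zeros(p)
    for r in range(min(i, j, t - 1) + 1):
        g[idx[i, r]] += Lmat[j, r]
        g[idx[j, r]] += Lmat[i, r]
    return g

exact = jac_row(2, 1) @ V_l @ jac_row(2, 2)  # g' V g for the same linearisation
approx = cov_sigma(2, 1, 2, 2)
```

The pairwise sum and the Jacobian form agree term by term, which also makes clear that the approximation is exact for the linearised components and only approximate for the products l_{ir} l_{jr} themselves.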
Let σ denote the vector of covariance components in the model of analysis and V (σ) its approximate matrix of sampling covariances, obtained as described above (subsection A.4.1). The sampling covariance for any pair of linear functions of σ is then simply
\[
\mathrm{Cov}\!\left(\mathbf{w}_1'\boldsymbol{\sigma},\, \mathbf{w}_2'\boldsymbol{\sigma}\right) = \mathbf{w}_1'\, \mathbf{V}(\boldsymbol{\sigma})\, \mathbf{w}_2 \tag{A.5}
\]
with w_{i} the vector of weights in linear function i.
Sampling variances of nonlinear functions of covariance components are obtained by ﬁrst approximating the function by a ﬁrst order Taylor series expansion, and then calculating the variance of the resulting linear function. For example, for a variance ratio
\[
\mathrm{Var}\!\left(\frac{\sigma_i^2}{\sigma_j^2}\right) \approx \left(\frac{\sigma_i^2}{\sigma_j^2}\right)^{2} \left[\frac{\mathrm{Var}(\sigma_i^2)}{(\sigma_i^2)^2} + \frac{\mathrm{Var}(\sigma_j^2)}{(\sigma_j^2)^2} - \frac{2\,\mathrm{Cov}(\sigma_i^2, \sigma_j^2)}{\sigma_i^2\, \sigma_j^2}\right] \tag{A.6}
\]
Similarly, for a correlation, σ_{ij}/√(σ_i² σ_j²),
\[
\mathrm{Var}\!\left(\frac{\sigma_{ij}}{\sqrt{\sigma_i^2 \sigma_j^2}}\right) \approx \frac{\sigma_{ij}^2}{\sigma_i^2 \sigma_j^2} \left[\frac{\mathrm{Var}(\sigma_{ij})}{\sigma_{ij}^2} + \frac{\mathrm{Var}(\sigma_i^2)}{4 (\sigma_i^2)^2} + \frac{\mathrm{Var}(\sigma_j^2)}{4 (\sigma_j^2)^2} + \frac{\mathrm{Cov}(\sigma_i^2, \sigma_j^2)}{2\, \sigma_i^2 \sigma_j^2} - \frac{\mathrm{Cov}(\sigma_{ij}, \sigma_i^2)}{\sigma_{ij}\, \sigma_i^2} - \frac{\mathrm{Cov}(\sigma_{ij}, \sigma_j^2)}{\sigma_{ij}\, \sigma_j^2}\right] \tag{A.7}
\]
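The variance-ratio formula (Equation A.6) can be checked against a generic delta-method computation, where the gradient of the nonlinear function is obtained numerically. A minimal numpy sketch with illustrative values for the components and their sampling covariances:

```python
import numpy as np

def var_ratio(x, y, V):
    """Sampling variance of the ratio x/y as in Equation A.6; V is the
    2 x 2 sampling covariance matrix of (x, y)."""
    r = x / y
    return r * r * (V[0, 0] / x**2 + V[1, 1] / y**2 - 2.0 * V[0, 1] / (x * y))

def delta_var(f, theta, V, h=1e-6):
    """Generic first-order (delta method) variance g' V g, with g a
    central-difference gradient of f at theta."""
    theta = np.asarray(theta, dtype=float)
    g = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = h * max(1.0, abs(theta[i]))
        g[i] = (f(theta + e) - f(theta - e)) / (2.0 * e[i])
    return g @ V @ g

# A variance ratio, e.g. sigma_i^2 / sigma_j^2 = 0.4 / 1.0
V2 = np.array([[0.010, 0.004],
               [0.004, 0.020]])
v_formula = var_ratio(0.4, 1.0, V2)
v_numeric = delta_var(lambda s: s[0] / s[1], [0.4, 1.0], V2)

# A correlation sigma_ij / sqrt(sigma_i^2 sigma_j^2) is handled the same way
V3 = np.array([[0.008, 0.002, 0.001],
               [0.002, 0.010, 0.003],
               [0.001, 0.003, 0.012]])
v_corr = delta_var(lambda s: s[0] / np.sqrt(s[1] * s[2]), [0.3, 0.5, 0.8], V3)
```

Both routes give the same answer, because the closed-form expression is just the delta method with the gradient worked out analytically.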
To yield a search direction which is likely to improve the likelihood, or, equivalently, decrease −log ℒ, the Hessian matrix (or its approximation) in a Newton-type optimisation strategy must be positive definite. While the AI matrix is a matrix of sums of squares and cross-products, and thus virtually guaranteed to be positive definite, it can have a relatively large condition number or minimum eigenvalues close to zero. This can yield step sizes, calculated as the product of the inverse of the AI matrix and the vector of first derivatives, which are too large. Consequently, severe step size modifications may be required to achieve an improvement in log ℒ. This may, at best, require several additional likelihood evaluations or, at worst, cause the algorithm to fail. Modifying the AI matrix so that it is 'safely' positive definite and its condition number is not excessive may improve the performance of the AI algorithm in this instance.
Several modification strategies are available; none has been found to be universally 'best'. The choice of modification can have a substantial effect on the efficiency of the AI algorithm: in particular, too large a modification can slow convergence rates unnecessarily. Further experience is needed to determine good choices of modification for specific cases.
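One generic modification of this kind, shown here purely as an illustration and not as WOMBAT's exact scheme, floors the eigenvalues of the AI matrix, which simultaneously guarantees positive definiteness and caps the condition number:

```python
import numpy as np

def modify_ai(A, eps=1e-6, max_cond=1e6):
    """Return a 'safely' positive definite version of the symmetric matrix A
    by flooring its eigenvalues; this also bounds the condition number.
    One generic possibility among several modification strategies."""
    w, Q = np.linalg.eigh(A)
    floor = max(eps, w[-1] / max_cond)  # eigenvalues are sorted ascending
    return (Q * np.maximum(w, floor)) @ Q.T

# An AI-like matrix that is symmetric but numerically almost singular
A = np.array([[4.0, 2.0, 0.0],
              [2.0, 1.0 + 1e-12, 0.0],
              [0.0, 0.0, 3.0]])
A_mod = modify_ai(A)
w = np.linalg.eigvalsh(A_mod)
```

The modified matrix differs from the original only in the near-null directions, which are exactly the directions that would otherwise produce excessive step sizes.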
Consider the q × q covariance matrix V (among q traits) for a random eﬀect, e.g. additive genetic eﬀects. Assume we have S analyses of subsets of traits, with the s−th analysis comprising k_{s} traits. Further, let C_{s} denote the k_{s} × k_{s} matrix of estimates of covariance components for the random eﬀect from analysis s. The pooled matrix V is constructed by iterating on
\[
\mathbf{V}^{t+1} = \left[\sum_{s=1}^{S} w_s \left( \mathbf{P}_s \mathbf{C}_s \mathbf{P}_s' + \left(\mathbf{I} - \mathbf{P}_s \mathbf{P}_s'\right) \mathbf{V}^{t} \left(\mathbf{I} - \mathbf{P}_s \mathbf{P}_s'\right) \right)\right] \Bigg/ \sum_{s=1}^{S} w_s \tag{A.8}
\]
until V^{t} and V^{t+1} are virtually identical, with V^{t} the value of V for the t−th iterate.
Other components of Equation A.8 are: w_s, the weight given to analysis s; I, an identity matrix of size q × q; and the transformation matrix P_s for analysis s, of size q × k_s. P_s has k_s elements of unity, p_ij = 1 if the i-th trait overall is the j-th trait in analysis s, and zero elsewhere.
Starting values (V^0) are obtained by decomposing the covariance matrices C_s into variances and correlations, and calculating simple averages over the individual analyses. If the resulting correlation matrix is not positive definite, it is modified by regressing all eigenvalues towards their mean, choosing the regression factor so that the smallest eigenvalue becomes 10^{−6}, and pre- and post-multiplying the diagonal matrix of modified eigenvalues by the matrix of eigenvectors and its transpose, respectively. Using the average variances, V^0 is then obtained from the (modified) average correlation matrix. WOMBAT is set up to perform up to 100 000 iterates. Convergence is assumed to be reached when
\[
\sum_{i=1}^{q} \sum_{j=1}^{q} \left(v_{ij}^{t+1} - v_{ij}^{t}\right)^{2} \Big/ \sum_{i=1}^{q} \sum_{j=1}^{q} \left(v_{ij}^{t}\right)^{2}
\]
becomes sufficiently small,
with v_{ij}^{t} denoting the ij−th element of V^{t}.
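The pooling procedure can be sketched as follows. This is a minimal illustration of the iteration in Equation A.8 together with the eigenvalue 'bending' used for starting values; the crude identity start, the function names and the tolerances are assumptions of the sketch, and WOMBAT's own bookkeeping (averaged starting correlations and variances) differs:

```python
import numpy as np

def bend(C, min_eig=1e-6):
    """Regress eigenvalues towards their mean so the smallest becomes
    min_eig; returns C unchanged if it is already positive definite."""
    w, Q = np.linalg.eigh(C)
    if w[0] >= min_eig:
        return C
    mean = w.mean()
    rho = (mean - min_eig) / (mean - w[0])
    return (Q * (mean + rho * (w - mean))) @ Q.T

def pool(parts, traits, q, weights=None, tol=1e-12, max_iter=100000):
    """Pool part matrices C_s by iterating on Equation A.8; traits[s] lists
    the overall trait numbers (0-based) of the traits in analysis s."""
    w = np.ones(len(parts)) if weights is None else np.asarray(weights, float)
    Ps = []
    for tr in traits:
        P = np.zeros((q, len(tr)))
        P[tr, range(len(tr))] = 1.0
        Ps.append(P)
    V = np.eye(q)  # crude start; WOMBAT averages variances and correlations
    for _ in range(max_iter):
        Vnew = np.zeros((q, q))
        for ws, P, C in zip(w, Ps, parts):
            Q = np.eye(q) - P @ P.T   # projects on traits NOT in analysis s
            Vnew += ws * (P @ C @ P.T + Q @ V @ Q)
        Vnew /= w.sum()
        if np.sum((Vnew - V) ** 2) / np.sum(V ** 2) < tol:
            return Vnew
        V = Vnew
    return V

# One analysis of both traits reproduces its estimate exactly ...
C1 = np.array([[2.0, 0.5], [0.5, 3.0]])
V_full = pool([C1], [[0, 1]], q=2)
# ... while two single-trait analyses pool the variances, carrying no
# information about the covariance between the traits.
V_two = pool([np.array([[2.0]]), np.array([[3.0]])], [[0], [1]], q=2)

# Bending a non-positive definite average correlation matrix
Cbad = np.array([[1.0, 0.9, 0.9], [0.9, 1.0, -0.9], [0.9, -0.9, 1.0]])
C_bent = bend(Cbad)
```

Note that covariances between traits never analysed together are informed only through the current V, which is why sensible starting values and overlapping subsets matter in practice.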