7.3 ‘Utility’ files

In addition, WOMBAT produces a number of small ‘utility’ files. These serve to monitor progress during estimation, to carry information over to subsequent runs or to facilitate specialised post-estimation calculations by the user.

7.3.1 File ListOfCovs

This file lists the covariance components defined by the model of analysis, together with their running numbers and the starting values given. It is written out during the ‘set-up’ phase (see 5.2.3). It can be used to identify the running numbers needed when defining additional functions of covariance components to be evaluated (see 4.10.4).

7.3.2 File RepeatedRecordsCounts

This file gives a count of the numbers of repeated records per trait and, if option TSELECT is used, a count of the number of pairs taken at the same time.

7.3.3 File BestPoint

Whenever WOMBAT encounters a set of parameters which improves the likelihood, the currently ‘best’ point is written out to the file BestPoint.

The first line of BestPoint gives the following information :

(a)
The current value of the log likelihood,
(b)
The number of parameters.

This is followed by the covariance matrices estimated.

1.
Only the upper triangle, written out row-wise, is given.
2.
Each covariance matrix starts on a new line.
3.
The first matrix given is the matrix of residual covariances. The other covariance matrices are given in the same order as the matrices of starting values were specified in the parameter file.

N.B.: BestPoint is used in any continuation or post-estimation steps – do not delete it until the analysis is complete !
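As an illustration of the storage scheme described above, the following sketch rebuilds a full symmetric matrix from an upper triangle given row-wise. It is not part of WOMBAT; the function name `upper_triangle_to_matrix` and the numbers are made up for the example, and the values are assumed to have been read from BestPoint already (the order of each matrix must be known from the model of analysis).

```python
import math


def upper_triangle_to_matrix(values):
    """Rebuild a full symmetric matrix from its upper triangle, given row-wise."""
    count = len(values)
    # Solve n * (n + 1) / 2 == count for the order n of the matrix.
    n = (math.isqrt(8 * count + 1) - 1) // 2
    if n * (n + 1) // 2 != count:
        raise ValueError(f"{count} values do not form an upper triangle")
    matrix = [[0.0] * n for _ in range(n)]
    remaining = iter(values)
    for i in range(n):
        for j in range(i, n):
            # Element (i, j) of the upper triangle is mirrored to (j, i).
            matrix[i][j] = matrix[j][i] = float(next(remaining))
    return matrix


# Made-up values for a 3 x 3 matrix, in the order
# (1,1) (1,2) (1,3) (2,2) (2,3) (3,3).
example = upper_triangle_to_matrix([4.0, 1.0, 0.5, 3.0, 0.2, 2.0])
```

Here `example[1][0]` and `example[0][1]` both hold the covariance between the first and second trait.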

7.3.4 File Iterates

WOMBAT appends a line of summary information to the file Iterates on completion of an iterate of the AI, PX-EM or EM algorithm. This can be used to monitor the progress of an estimation run – useful for long runs in background mode. Each line gives the following information :

(a)
Column 1 gives a two-letter code identifying the algorithm (AI, PX, EM) used.
(b)
Column 2 gives the running number of the iterate.
(c)
Column 3 contains the log likelihood value at the end of the iterate.
(d)
Column 4 gives the change in log likelihood from the previous iterate.
(e)
Column 5 shows the norm of the vector of first derivatives of the log likelihood (zero for PX and EM).
(f)
Column 6 gives the norm of the vector of changes in the parameters, divided by the norm of the vector of parameters.
(g)
Column 7 gives the Newton decrement (absolute value) for the iterate (zero for PX and EM).
(h)
Column 8 shows a) the step size factor used for AI steps, b) the average squared deviation of the matrices of additional parameters in the PX-EM algorithm from the identity matrix, c) zero for EM steps.
(i)
Column 9 gives the CPU time for the iterate in seconds.
(j)
Column 10 gives the number of likelihood evaluations carried out so far.
(k)
Column 11 gives the factor used to ensure that the average information matrix is ‘safely’ positive definite.
(l)
Column 12 identifies the method used to modify the average information matrix (0: no modification, 1: modify eigenvalues directly, 2: add diagonal matrix, 3: modified Cholesky decomposition, 4: partial Cholesky decomposition – see A.5).

7.3.5 File OperationCounts

This small file accumulates the number of non-zero elements in the Cholesky factor (or inverse) of the mixed model matrix together with the resulting operation count for the factorisation. This can be used to compare the efficacy of different ordering strategies for a particular analysis. The file contains one line per ordering tried, with the following information :

(a)
The name of the ordering strategy (mmd, amd or metis).
(b)
For metis only : the values of the three options which can be set by the user, i.e. the number of graph separators used, the ‘density’ factor, and the option selecting the edge matching strategy (see 5.3.1).
(c)
The number of non-zero elements in the mixed model matrix after factorisation.
(d)
The operation count for the factorisation.

7.3.6 Files AvInfoParms and AvInfoCovs

These files are written out when the AI algorithm is used. After each iterate, they give the average information matrix (not its inverse !) corresponding to the ‘best’ estimates obtained by an AI step, as written out to BestPoint. These can be used to approximate sampling variances and errors of genetic parameters.

N.B.: If the AI iterates are followed by further estimation steps using a different algorithm, the average information matrices given may not pertain to the ‘best’ estimates any longer.

AvInfoParms contains the average information matrix for the parameters estimated. Generally, the parameters are the elements of the leading columns of the Cholesky factors of the covariance matrices estimated. This file is written out for both full and reduced rank estimation.

For full rank estimation, the average information is first calculated with respect to the covariance components and then transformed to the Cholesky scale. Hence, the average information for the covariances is available directly, and is written to the file AvInfoCovs.

Both files give the elements of the upper triangle of the symmetric information matrix row-wise. The first line gives the log likelihood value for the estimates to which the matrix pertains – this can be used to ensure corresponding files of estimates and average information are used. Each of the following lines in the file represents one element of the matrix, containing 3 variables :

(a)
row number,
(b)
column number, and
(c)
element of the average information matrix.

N.B.: Written out are the information matrices for all parameters. If some parameters (or covariances) are not estimated (such as zero residual covariances for traits measured on different animals), the corresponding rows and columns may be zero.
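The layout described above (a first line with the log likelihood, then one ‘row, column, element’ triplet per line) can be expanded into a full symmetric matrix as sketched below. This is a user-side illustration, not part of WOMBAT. It assumes white space separated fields, row and column numbers starting at 1, and the log likelihood as the first entry of the first line; the function name and the sample values are made up.

```python
def read_average_information(lines):
    """Return (log likelihood, full symmetric matrix) from triplet lines."""
    remaining = iter(lines)
    # Assumption: the log likelihood is the first entry on the first line.
    log_l = float(next(remaining).split()[0])
    entries = {}
    order = 0
    for line in remaining:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        row, col, value = int(parts[0]), int(parts[1]), float(parts[2])
        entries[(row, col)] = value
        order = max(order, row, col)
    matrix = [[0.0] * order for _ in range(order)]
    for (row, col), value in entries.items():
        # Only the upper triangle is stored; mirror it to the lower one.
        matrix[row - 1][col - 1] = value
        matrix[col - 1][row - 1] = value
    return log_l, matrix


# Made-up content for two parameters, for illustration only.
sample_lines = [
    "-1234.567",
    "1 1 10.0",
    "1 2 -2.0",
    "2 2 5.0",
]
log_l, info = read_average_information(sample_lines)
```

Comparing `log_l` with the log likelihood on the first line of BestPoint is a quick check that the two files belong together. Rows and columns which are entirely zero need to be removed before the matrix is inverted.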

7.3.7 Files Covariable.baf

For random regression analyses, file(s) with the basis functions evaluated for the values of the control variable(s) in the data are written out. These can be used, for example, in calculating covariances of predicted random effects at specific points.

The name of a file is equal to the name of the covariable (or ‘control’ variable), as given in the parameter file (model of analysis part), followed by the option describing the form of basis function (POL, LEG, BSP; see 4.8.1.1), the maximum number of coefficients, and the extension .baf. The file then contains one row for each value of the covariable, giving the covariable, followed by the coefficients of the basis function.

N.B.: These files pertain to the random regressions fitted! Your model may contain a fixed regression on the same covariable with the same number of specified regression coefficients, n, but with intercept omitted. If so, the coefficients in this file are not appropriate to evaluate the fixed regression curve.

7.3.8 File LogL4Quapprox.dat

For analyses involving an additional parameter, the values used for the parameter and the corresponding maximum log likelihoods are collected in this file. This is meant to facilitate estimation of the parameter through a quadratic approximation of the resulting profile likelihood curve via the run option --quapp. Note that this file is appended to at each run.
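To show the idea behind a quadratic approximation of a profile likelihood, the sketch below fits a parabola to pairs of (parameter value, maximum log likelihood) by least squares and returns the location of its maximum. This is a generic illustration and not the calculation carried out by --quapp; the pairs are passed in directly, as no particular column layout of the file is assumed, and all names and numbers are invented.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    if len(points) < 3:
        raise ValueError("at least three points are needed")
    # Sums of powers of x and of y * x**k for the normal equations.
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    rows = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("parameter values are not sufficiently distinct")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [u - factor * v for u, v in zip(rows[r], rows[col])]
    return tuple(rows[i][3] / rows[i][i] for i in range(3))


def quadratic_maximum(points):
    """Return (parameter value, log likelihood) at the maximum of the fit."""
    a, b, c = fit_quadratic(points)
    if c >= 0.0:
        raise ValueError("fitted curve has no maximum")
    x_max = -b / (2.0 * c)
    return x_max, a + b * x_max + c * x_max ** 2


# Made-up pairs lying on a parabola with its maximum at 0.3.
pairs = [(0.1, -100.08), (0.3, -100.0), (0.5, -100.08), (0.7, -100.32)]
best_value, best_log_l = quadratic_maximum(pairs)
```

The approximation is only as good as the points supplied: values of the parameter which bracket the maximum closely give a more reliable estimate than values far to one side.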

7.3.9 File SubSetsList

If analyses considering a subset of traits are carried out, WOMBAT writes out files EstimSubsetn+...+m.dat (see 7.2.6), to be used as input files in a run with option --itsum. In addition, for each run performed, this file name is appended to SubSetsList. This file contains one line per ‘partial’ run with two entries: the file name (EstimSubsetn+...+m.dat) and a weight given to the corresponding results when combining estimates. The default for the weight is unity.
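Before combining estimates, it can be useful to check which partial runs are listed and with which weights. The sketch below reads the two entries per line; it is a user-side helper, not part of WOMBAT, and assumes that the entries are separated by white space and that file names contain no blanks. The file names in the example are made up to follow the naming pattern above.

```python
def parse_subsets_list(lines):
    """Return a list of (file name, weight) pairs, one per partial run."""
    runs = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        # Assumption: a missing weight is treated as the default of unity.
        weight = float(parts[1]) if len(parts) > 1 else 1.0
        runs.append((name, weight))
    return runs


# Made-up content, for illustration only.
sample_lines = [
    "EstimSubset1+2.dat 1.0",
    "EstimSubset1+3.dat 1.0",
    "EstimSubset2+3.dat 0.5",
]
partial_runs = parse_subsets_list(sample_lines)
```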