7.2 Additional results

These are large files, most likely subject to further processing by other programs. They therefore contain minimal or no text. Most have the extension .dat.

7.2.1 File Residuals.dat

This file gives the residuals for all observations, for the model fitted and the current estimates of covariance components. It has the same order as the data file and contains two space-separated columns:

(a)
Column 1 contains the estimated residual.
(b)
Column 2 gives the corresponding observation, as deviation from the trait mean.

Summary statistics about the distribution of residuals can be readily obtained using standard statistical packages. For example, the following R commands compute means, standard deviations and quartiles, plot the two columns against each other, and draw a histogram of the residuals:

EXAMPLE:

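res <- read.table("Residuals.dat")   # column 1: residual, column 2: observation
summary(res); sapply(res, sd)        # sd() must be applied column by column
par(mfrow = c(1, 2))                 # two plots side by side
plot(res); hist(res[, 1])            # scatter plot and histogram of residuals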

7.2.2 File(s) RnSoln_rname.dat

Solutions for each random effect are written to a separate file. These files have names RnSoln_rname.dat, with rname representing the name of the random effect. Columns in these files are:

(a)
Column 1 gives the running number for the level considered
(b)
Column 2 gives the corresponding original code; this is only printed for the first ’effect’ fitted for the level.
(c)
Column 3 gives the ’effect’ number, where, depending on the analysis, ’effect’ is a trait, principal component or random regression coefficient.
(d)
Column 4 gives the solution.
(e)
Column 5 gives the sampling error of the solution, calculated as the square root of the corresponding diagonal element of the inverse of the coefficient matrix in the mixed model equations. This is only available if the last iterate has been carried out using an EM or PX-EM algorithm.

For genetic random effects with covariance option NRM, WOMBAT calculates inbreeding coefficients from the list of pedigrees specified. For such effects, there may be an additional column containing these coefficients (in %). This should be the last column in the RnSoln_rname.dat file (NB: For multivariate analyses, this is only given once per individual, in the line corresponding to the first trait).

There may be up to 7 columns – ignore column 6 unless you recognize the numbers (calculations for column 6 are not fully debugged, but may be o.k. for simple models).

If you have carried out a reduced rank analysis, i.e. specified the PC option for the analysis type, the solutions in RnSoln_rname.dat pertain to the principal components! You might then also be interested in the corresponding solutions on the original scale. WOMBAT endeavours to calculate these for you and writes them to the file RnSoln_rname-tr.dat. However, if you have carried out a run in which you have calculated standard errors for the effects fitted, these are ignored in the back-transformation and you will find that column 5 (in RnSoln_rname-tr.dat) consists entirely of zeros. This does not mean that these s.e. are zero, only that they have not been determined.

7.2.3 File(s) Curve_cvname(_trname).dat

At convergence, curves for fixed covariables fitted are evaluated and written to separate files, one per covariable and trait. These have names Curve_cvname.dat for univariate analyses and Curve_cvname_trname.dat for multivariate analyses, with cvname the name of the covariable as specified in the parameter file and, correspondingly, trname the name of the trait. Curves are only evaluated at points corresponding to the nearest integer values of the covariable values found in the data. Each file has four columns:

(a)
Column 1 gives the value of the covariable.
(b)
Column 2 gives the point on the fitted curve.
(c)
Column 3 contains the number of observations with this value of the covariable.
(d)
Column 4 gives the corresponding raw mean.

HINT: To get most information from these files, it might be worth your while scaling your covariables prior to analysis!
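As an illustration, the following R commands sketch how such a file might be inspected, assuming the four-column layout described above. The file name Curve_age.dat, the column names and all numbers are hypothetical; a small mock file is written to a temporary directory only so that the commands run on their own.

```r
# Hypothetical stand-in for a Curve_cvname.dat file (made-up numbers);
# point 'curve_file' at your own file instead.
curve_file <- file.path(tempdir(), "Curve_age.dat")
writeLines(c("1  10.2   5  10.0",
             "2  11.1  12  11.4",
             "3  11.8   8  11.6"), curve_file)

# Column names are our own labels, not written by WOMBAT
curve <- read.table(curve_file,
                    col.names = c("covariable", "fitted", "n_obs", "raw_mean"))

# Fitted curve as a line; raw means as points sized by number of observations
plot(curve$covariable, curve$fitted, type = "l",
     ylim = range(curve$fitted, curve$raw_mean),
     xlab = "Covariable", ylab = "Trait")
points(curve$covariable, curve$raw_mean, cex = sqrt(curve$n_obs) / 2)

# Deviation of the raw mean from the fitted curve at each covariable value
curve$deviation <- curve$raw_mean - curve$fitted
```

Large deviations at covariable values with few observations are usually less of a concern than deviations at well-represented values, which is why the point size is tied to column 3 here.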

7.2.4 File(s) RanRegname.dat

For random regression analyses, WOMBAT evaluates variance components (file RanRegVariances.dat), variance ratios (file RanRegVarRatios.dat, not written if more than one control variable is used) and selected correlations (file RanRegCorrels.dat) for values of the control variable(s) occurring in the data. If approximate sampling variances of parameters are available, WOMBAT attempts to approximate the corresponding sampling errors. The general layout of the files is as follows:

(a)
Column 1 gives the running number of the value of the control variable.
(b)
Column 2 gives the corresponding actual value (omitted if more than one control variable).
(c)
The following columns give the variance components, ratios or correlations.
(i)
If sampling errors are available, each source of variation is represented by two columns, i.e. value followed by the approximate lower bound sampling error, with additional spaces between ‘pairs’ of numbers.
(ii)
Random effects are listed in the same order as the starting values for random effects covariances are given in the parameter file.
(iii)
If the same control variable is used for all random effects, WOMBAT attempts to calculate a total, ‘phenotypic’ variance and corresponding variance ratios and correlations.
(iv)
Correlations are calculated for 5 values of the control variable, corresponding to lowest and highest value, and 3 approximately equidistant intermediate values.

In addition, the files contain some rudimentary headings.

If the special option RRCORR-ALL has been specified (see 4.10.6.13), a file RanRegCorrAll.dat is written out in addition. This contains the following columns:

(a)
The name of the random effect
(b)
The running number for trait one
(c)
The running number for trait two
(d)
The running number for the pair of traits
(e)
The value of the control variable (“age”) for trait one
(f)
The value of the control variable for trait two
(g)
The estimated covariance between traits one and two for the specified ages
(h)
The corresponding correlation

7.2.5 Files SimDatan.dat

Simulated records are written to files with the standard name SimDatan.dat, where n is a three-digit integer value (i.e. 001, 002, ...). These files have the same number of columns as specified for the data file in the parameter file (i.e. any trailing variables in the original data file not listed are ignored), with the trait values replaced by simulated records. These variables are followed by the simulated values for individual random effects: the first of these values is the residual error term; the other values are the random effects as sampled (standard uni-/multivariate analyses) or as evaluated using the random regression coefficients sampled, in the same order as the corresponding covariance matrices are specified in the parameter file. Except for the trait number in multivariate analyses (first variable), all variables are written out as real values.

7.2.6 Files EstimSubSetn+...+m.dat

If an analysis considering a subset of traits is carried out, WOMBAT writes out a file EstimSubsetn+...+m.dat with the estimates of covariance matrices for this analysis. Writing of this file is ‘switched on’ when encountering the syntax “m”->n, specifying the trait number in the parameter file (see 4.8.2). The first two lines of EstimSubsetn+...+m.dat give the following information:

(a)
The number of traits in the subset and their names, as given in the parameter file.
(b)
The corresponding trait numbers in the ‘full’ analysis.

This is followed by the covariance matrices estimated. The first matrix given is the matrix of residual covariances, the other covariance matrices are given in the same order as specified in the parameter file.

(c)
The first line for each covariance matrix gives the running number of the random effect, the order of fit and the name of the effect
(d)
The following lines give the elements of the covariance matrix, with one line per row.
  • The number of rows written is equal to the number of traits in the subset; for random effects not fitted for all traits, corresponding rows and columns of zeros are written out.

Finally, EstimSubsetn+...+m.dat gives some information on the data structure (not used subsequently):

(e)
The number of records for each trait
(f)
The number of individuals with pairs of records
(g)
The number of levels for the random effects fitted

7.2.7 Files PDMatrix.dat and PDBestPoint

These files give the pooled covariance matrices, obtained by running WOMBAT with option --itsum.

PDMatrix.dat is meant to be readily pasted (as starting values) into the parameter file for an analysis considering all traits simultaneously. It contains the following information for each covariance matrix:

(a)
A line with the qualifier VAR, followed by the name of the random effect and the order and rank of the covariance matrix.
(b)
The elements of the upper triangle of the covariance matrix; these are written out as one element per line.

PDBestPoint has the same form as BestPoint (see 7.3.3). It is meant to be copied (or renamed) to BestPoint, so that WOMBAT can be run with option --best to generate a ‘results’ file (BestSoFar) with correlations, variance ratios and eigenvalues of the pooled covariance matrices.

7.2.8 Files PoolEstimates.out and PoolBestPoint

These files provide the results from a run with the option --pool:

1.
PoolEstimates.out summarizes characteristics of the part estimates provided (input), options chosen, and results for all analyses carried out.
2.
PoolBestPoint is the equivalent to BestPoint. If penalized analyses are carried out, copies labelled PoolBestPoint_unpen and PoolBestPoint_txx, with xx equal to the tuning factor, are generated so that files for all sub-analyses are available at the end of the run.

7.2.9 Files MME*.dat

The two files MMECoeffMatrix.dat and MMEEqNos+Solns.dat are written out when the run option --mmeout is specified.
MMECoeffMatrix.dat contains the non-zero elements in the lower triangle of the coefficient matrix in the MME. There is one line per element, containing 3 space-separated variables:

(a)
The row number (integer); in running order from 1 to N, with N the total number of equations.
(b)
The column number (integer); in running order from 1 to N.
(c)
The element (real).

HINT: This file is in the correct format to be inverted using run option --invert or --invrev.
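For small problems, the triplets can also be assembled into a full symmetric matrix in R. The commands below are a minimal sketch assuming the three-column 'row, column, element' layout described above; the numbers are made up, and a dense matrix of this kind is only feasible when the number of equations is modest.

```r
# Hypothetical 3-equation example in the 'row column element' layout;
# point 'mme_file' at your own MMECoeffMatrix.dat instead.
mme_file <- file.path(tempdir(), "MMECoeffMatrix.dat")
writeLines(c("1 1 4.0",
             "2 1 1.0",
             "2 2 3.0",
             "3 2 0.5",
             "3 3 2.0"), mme_file)

trip <- read.table(mme_file, col.names = c("row", "col", "value"))

n <- max(trip$row, trip$col)                  # total number of equations N
C <- matrix(0, n, n)
C[cbind(trip$row, trip$col)] <- trip$value    # fill the lower triangle
C[cbind(trip$col, trip$row)] <- trip$value    # mirror into the upper triangle

# Only sensible for small, full-rank systems: direct dense inverse
Cinv <- solve(C)
```

Elements not listed in the file are zero, so initialising the matrix with zeros and filling in the triplets reproduces the full coefficient matrix.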

MMEEqNos+Solns.dat provides the mapping of equation numbers (1 to N) to effects in the model, as well as the right hand sides and solutions. This file has one line per equation, with the following space-separated variables:

(a)
The equation number (integer).
(b)
The name of the trait, truncated to 12 letters for long names.
(c)
The name of the effect, truncated to 12 letters (for random effects and analyses using the PC option, this is replaced by PCn).
(d)
The original code for this level and effect (integer); for covariables this is replaced by the ‘set number’ (= 1 for non-nested covariables).
(e)
The running number for this level (within effect).
(f)
The right hand side in the MME (real).
(g)
The solution for this effect (real).

7.2.10 File QTLSolutions.dat

This is the output file for a run using option --snap. It contains one line for each line found in the input file QTLAllels.dat, with the following variables:

(a)
The estimated SNP effect (regression coefficient).
(b)
Its standard error (from the inverse of the coefficient matrix).
(c)
The t-value, i.e. the ratio of the estimated effect to its standard error.
(d)
A character variable with a name for the line.
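The following R commands sketch one way of screening this file, assuming the four variables appear as columns in the order listed. The SNP names and numbers are made up, and the column names are our own labels.

```r
# Hypothetical stand-in for QTLSolutions.dat (made-up values);
# point 'qtl_file' at your own file instead.
qtl_file <- file.path(tempdir(), "QTLSolutions.dat")
writeLines(c(" 0.120  0.050  2.400  snp_001",
             "-0.030  0.060 -0.500  snp_002",
             " 0.210  0.070  3.000  snp_003"), qtl_file)

qtl <- read.table(qtl_file,
                  col.names = c("effect", "se", "t_value", "name"),
                  stringsAsFactors = FALSE)

# Consistency check: column 3 should equal column 1 / column 2 (up to rounding)
stopifnot(all(abs(qtl$effect / qtl$se - qtl$t_value) < 1e-3))

# Rank SNPs by absolute t-value, largest first
qtl_ranked <- qtl[order(-abs(qtl$t_value)), ]
```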

7.2.11 Files Pen*(.dat) and ValidateLogLike.dat

The following files are created in conjunction with penalized estimation. Some can be used by WOMBAT in additional steps.

7.2.11.1 File PenEstimates.dat

This file gives a brief summary of estimates together with log likelihood values for all values of the tuning parameter given.

7.2.11.2 File PenBestPoints.dat

This file collects the BestPoint information for all tuning parameters. The format for each is similar to that for BestPoint (see 7.3.3), except that the first line has 3 entries comprising the tuning factor, the maximum penalized likelihood and the corresponding unpenalized value. It is suitable as input for additional calculations aimed at comparing estimates, and is used as input file for ‘validation’ runs (see run option --valid, 5.3.2).

Output to this file is cumulative, i.e. if it exists in the working directory it is not over-written but appended to.

7.2.11.3 File PenCanEigvalues.dat

Similarly, this file collects the values of the tuning factors and corresponding penalized and unpenalized log likelihood values. It has one line for each tuning factor. For a penalty on the canonical eigenvalues, estimates of the latter are written out as well (in descending order). Again, if this file exists in the working directory it is not over-written but appended to.

7.2.11.4 File PenTargetMatrix

If the option PHENV (see 4.10.7) is specified, WOMBAT writes out this file with a suitable target matrix. For penalty COVARM a covariance matrix and for CORREL the corresponding correlation matrix is given. For a multivariate analysis fitting a simple animal model, this is the phenotypic covariance (correlation) matrix. For a random regression analysis, corresponding matrices are based on the sum of the covariance matrices among random regression coefficients due to the two random effects fitted, assumed to represent individuals’ genetic and permanent environmental effects (which must be fitted using the same number of basis functions).

Written out is the upper triangle of the matrix.

7.2.11.5 File ValidateLogLike.dat

This file is the output resulting from a run with the option --valid. It contains one line per tuning factor with the following entries:

(a)
A running number
(b)
The tuning factor
(c)
The unpenalized log likelihood in the validation data.
(d)
The penalized log likelihood in the training data.
(e)
The unpenalized log likelihood in the training data.

If this file exists in the working directory it is not over-written but appended to.
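One possible use of this file is to look for the tuning factor with the highest unpenalized log likelihood in the validation data. The R commands below are a sketch under the assumption that the five entries appear as columns in the order listed; all numbers are made up and the column names are our own labels. Because output is appended, the sketch keeps only the last line recorded for each tuning factor, which may or may not be what you want.

```r
# Hypothetical stand-in for ValidateLogLike.dat (made-up values);
# point 'val_file' at your own file instead.
val_file <- file.path(tempdir(), "ValidateLogLike.dat")
writeLines(c("1   0.0  -1250.4  -3101.2  -3101.2",
             "2   1.0  -1247.9  -3102.0  -3101.5",
             "3   5.0  -1246.3  -3104.8  -3102.6",
             "4  10.0  -1248.8  -3108.1  -3104.0"), val_file)

val <- read.table(val_file,
                  col.names = c("run", "tuning", "logl_valid",
                                "pen_logl_train", "logl_train"))

# Output is cumulative across runs: keep the last entry per tuning factor
val <- val[!duplicated(val$tuning, fromLast = TRUE), ]

# Tuning factor with the highest log likelihood in the validation data
best <- val[which.max(val$logl_valid), ]
best$tuning
```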

7.2.12 File CovSamples_name.dat

This (potentially large) file contains the samples drawn from the multivariate normal distribution of estimates, either for all random effects in the analysis (name = ALL) or for a single, selected effect. The file contains one line per replicate, with covariance matrices written in the same sequence as in the corresponding estimation run (for ALL), giving the upper triangle for each matrix.
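As a sketch of how the samples might be summarised in R, the commands below assume a file holding a single 2 x 2 covariance matrix, so that each line carries the three upper-triangle elements (variance 1, covariance, variance 2). The numbers are made up; for real files, the number and order of columns should be checked against your own analysis.

```r
# Hypothetical stand-in for CovSamples_ALL.dat with one 2 x 2 matrix per line
# (made-up values); point 'smp_file' at your own file instead.
smp_file <- file.path(tempdir(), "CovSamples_ALL.dat")
writeLines(c("1.0 0.2 2.0",
             "1.2 0.3 2.2",
             "0.8 0.1 1.8"), smp_file)

smp <- as.matrix(read.table(smp_file))   # one row per replicate

# Empirical mean and standard deviation of each element across replicates
elem_mean <- colMeans(smp)
elem_sd   <- apply(smp, 2, sd)

# Correlation for each replicate, and its empirical distribution
r <- smp[, 2] / sqrt(smp[, 1] * smp[, 3])
quantile(r, c(0.025, 0.975))
```

Functions of the covariance components, such as the correlation here, are evaluated per replicate, so that their sampling distribution can be summarised directly from the samples.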