Worked examples

A number of worked examples are provided to illustrate the use of WOMBAT and, in particular, show how to set up the parameter files.

Installation for the suite of examples is described in section 3.1.4. This generates the directory WOMBAT/Examples with one subdirectory, Example n, for each example n.

Each subdirectory contains the data and pedigree files for a particular example, a file WhatIsIt with a brief description of the example, and one or more subdirectories for individual runs (A, B, C, ...).

Each ‘run’ directory (A, B, ...) contains:

- (a) A parameter file (.par).
- (b) The file typescript (generated using the script command), which contains the screen output for the run. Run time options used can be found at the top of this file.
- (c) The numerous output files generated by WOMBAT.
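The directory layout described above can be inspected with a few lines of Python. This is a minimal sketch, not part of WOMBAT: the function name `list_runs` is hypothetical, and it assumes only what the text states, namely one directory per example, one subdirectory per run, and a `.par` parameter file in each run directory.

```python
from pathlib import Path


def list_runs(examples_root):
    """Return {example directory: {run directory: [parameter file names]}}.

    Assumes the layout described in the text: one directory per example,
    each holding one subdirectory per run with a *.par file inside.
    """
    root = Path(examples_root)
    layout = {}
    for example_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        runs = {}
        for run_dir in sorted(p for p in example_dir.iterdir() if p.is_dir()):
            par_files = sorted(f.name for f in run_dir.glob("*.par"))
            if par_files:
                # Only directories holding a parameter file count as runs.
                runs[run_dir.name] = par_files
        layout[example_dir.name] = runs
    return layout
```

Calling `list_runs("WOMBAT/Examples")` after installation should give an overview of which runs are available for each example.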

N.B.: The example data sets have been selected for ease of demonstration, and to allow fairly rapid replication of the example runs. Clearly, most of the data sets used are too small to support estimation of all the parameters fitted, in particular for the higher-dimensional analyses shown!

N.B.: Example runs have been carried out on a 64-bit machine (Intel i7 processor, rated at 3.20 GHz); numbers obtained on a 32-bit machine may vary slightly.

Note further that all the example files are Linux files; you may need to ‘translate’ them to DOS format if you plan to run the examples under Windows.
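If no conversion utility is at hand, the ‘translation’ can be sketched in Python. The sketch assumes that the difference between the two formats is the line ending only (LF under Linux, CR+LF under DOS/Windows); the function names are hypothetical.

```python
def unix_to_dos(data: bytes) -> bytes:
    """Convert LF line endings to CRLF, leaving existing CRLF pairs alone."""
    # Normalise first so that already-converted lines are not doubled up.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def convert_file(src_path, dst_path):
    """Write a DOS-format copy of the text file src_path to dst_path."""
    with open(src_path, "rb") as src:
        converted = unix_to_dos(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(converted)
```

Working on bytes rather than decoded text avoids any assumption about the character encoding of the data and pedigree files.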

Example 1

Example 2

Example 3

Example 4

Example 5

Example 6

Example 7

Example 8

Example 9

Example 10

Example 11

Example 13

Example 14

Example 15

Example 16

Example 17

Example 18

Example 19

Example 20

Example 1

This shows a univariate analysis for a simple animal model, fitting a single fixed effect only.

Source: Simulated data; Example 1 from DfReml

Example 2

This shows a bivariate analysis for the case where the same model is fitted for both traits and both traits are recorded for all animals. The model of analysis includes 3 cross-classified fixed effects and an additional random effect.

- A: Analysis fitting an animal model
- B: Analysis fitting a sire model

Source: Data from Edinburgh mouse lines; Example 2 from DfReml

Example 3

This example involves up to six repeated records for a single trait, recorded at different ages. The model of analysis is an animal model with a single fixed effect. Data are analysed:

- A: Fitting a univariate ‘repeatability’ model, with age as a covariable
- B: Fitting a multivariate analysis with 6 traits
- C: Fitting a univariate random regression model

Source: Wokalup selection experiment; Example 3 from DfReml

Example 4

This example shows a four-variate analysis for a simple animal model. Runs show:

- A: A ‘standard’ full rank analysis
- B: A reduced rank analysis, fitting the first two principal components only
- C: A reduced rank analysis using the EM algorithm

Source: Australian beef cattle field data

Example 5

Similar to Example 4, but involving 6 traits. Runs show:

- A: A ‘standard’ full rank analysis
- B: A reduced rank analysis, fitting the first four principal components
- C: An analysis fitting a factor-analytic structure for the genetic covariance matrix
- D: A full rank analysis, illustrating use of penalized REML (‘bending’) for a chosen tuning parameter

Source: Australian beef cattle data

Example 6

This example involves 4 measurements, with records on different sexes treated as different traits. This gives an eight-variate analysis, with 16 residual covariances equal to zero. The model of analysis is a simple animal model.

- A: A full rank analysis with ‘good’ starting values
- B: A full rank analysis with ‘bad’ starting values

Source: Australian beef cattle field data

Example 7

This example illustrates the analysis of 4 traits, subject to genetic and permanent environmental effects. The model of analysis involves several cross-classified fixed effects, nested covariables and different effects for different traits.

- A: Univariate analysis for trait 1
- B: Univariate analysis for trait 2
- B1: As B, but reworked by setting up the NRM inverse externally to illustrate use of the GIN option
- B2: As B, but fitting fixed effects only and a user-defined basis function for the covariable dam age
- C: Univariate analysis for trait 2, allowing for a non-zero direct-maternal genetic covariance
- C1: As C, but using the GIN instead of the NRM option
- D: Bivariate analysis for traits 1 and 2
- E: Bivariate analysis for traits 1 and 2, allowing for a non-zero direct-maternal genetic covariance
- F: Trivariate analysis for traits 1, 2 and 3
- G: Four-variate analysis of all traits
- H: Four-variate analysis of all traits, not fitting maternal effects for trait 4
- I: Reduced rank, four-variate analysis of all traits, not fitting maternal effects for trait 4

Source: Wokalup selection experiment

Example 8

This is an example of a model where different random effects are fitted for different traits. It is a bivariate analysis of mature cow weights together with gestation length. Mature cow weight involves repeated records per animal, and a permanent environmental effect of the animal is thus fitted for this trait. Gestation length is treated as a trait of the calf and assumed to be affected by both genetic and permanent environmental effects.

- A: Standard model
- B: Equivalent model, using the PEQ option for permanent environmental effects of the animal

Source: Australian beef cattle field data

Example 9

This example illustrates random regression analyses fitting an additional random effect, using B-splines as basis functions and imposing rank restrictions on estimated covariance functions. Data are monthly records for weights of calves from birth to weaning.

- A: Full rank analysis
- A1: As A, but evaluating the basis functions externally to illustrate the use of user-defined basis functions
- B: Reduced rank analysis
- C: Reduced rank analysis with different ranks

Source: Wokalup selection experiment

Example 10

In a recent paper, Wilson et al. [43] presented a tutorial on ‘animal model’ analyses of data from a wild population. This example replicates the analyses shown, and expands on them by demonstrating how to evaluate the likelihood with the genetic covariance fixed at zero, in order to carry out a likelihood ratio test for this component.

The tutorial and the data and pedigree files used in this example are not included; you have to download these from the authors’ web site.

- A: Simple univariate analysis
- B: Bivariate analysis
- B1: Bivariate analysis, fixing the genetic covariance at zero
- C: Repeated records model

Source: The Wild Animal Models Wiki
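The likelihood ratio test mentioned for this example compares the maximum log likelihoods from two nested runs, such as B (covariance estimated) and B1 (covariance fixed at zero). The following is a minimal sketch of the arithmetic only; the function names are hypothetical and the log likelihood values have to be read from the respective output. It assumes a chi-square reference distribution with one degree of freedom, which is reasonable here because a single parameter is fixed and zero is not on the boundary of the parameter space for a covariance.

```python
import math


def lrt_statistic(log_l_full, log_l_reduced):
    """Twice the difference in maximum log likelihood between nested models."""
    return 2.0 * (log_l_full - log_l_reduced)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if x <= 0.0:
        return 1.0
    # P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(x / 2.0))
```

For instance, `chi2_sf_1df(lrt_statistic(log_l_b, log_l_b1))` gives the P-value for the hypothesis that the genetic covariance is zero, with `log_l_b` and `log_l_b1` standing for the two values taken from the runs.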

Example 11

This example demonstrates simple bivariate random regression analyses.

- A: Using a RR model to carry out a multivariate analysis with repeated records where the pattern of temporary environmental covariances is determined through the control variable
- A1: Multivariate analysis corresponding to A
- B: Bivariate RR analysis fitting a quadratic regression on Legendre polynomials and homogeneous measurement error variances

Source: Simulated records
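To make the ‘quadratic regression on Legendre polynomials’ in run B concrete, the sketch below evaluates the first three Legendre polynomials at a standardised value of the control variable. It is an illustration only: the function names are hypothetical, the polynomials are in their classical, unnormalised form, and whether WOMBAT applies a normalising constant is not stated in this section.

```python
def standardise(age, age_min, age_max):
    """Map an age in [age_min, age_max] onto the interval [-1, 1]."""
    return -1.0 + 2.0 * (age - age_min) / (age_max - age_min)


def legendre_basis(x, order):
    """Return [P_0(x), ..., P_order(x)] using Bonnet's recursion."""
    values = [1.0]
    if order >= 1:
        values.append(x)
    for n in range(1, order):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values
```

A quadratic regression corresponds to `order=2`, i.e. three regression coefficients per random effect and trait.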

Example 13

This example illustrates multivariate analyses with repeated records per trait, especially some new features:

- WOMBAT now insists that the option RPTCOV is specified in the parameter file!
- WOMBAT writes out a file RepeatedRecordsCounts with some basic information on how many animals have how many records.
- There is now a mechanism – through RPTCOV TSELECT – to specify which records are measured at the same time and thus have a non-zero error covariance and which are not.
- Trait numbers need to be assigned so that any traits with repeated records have a lower number than traits with single records.

The data were obtained by simulating records for 4 traits recorded on 800 animals at 5 different times. A missing value indicator (999) is used to create different patterns of missing records. Note that analyses in the different subdirectories analyse different columns in the data file.

- A: Demonstrates an analysis without missing records, i.e. where all traits are recorded at the same time. This implies that there are non-zero error covariances between all traits and that the ALIGNED option is appropriate.
- B: Shows the analysis when some records are missing, but in a systematic fashion: traits 1 and 2 have records at all 5 times, but traits 3 and 4 are only recorded for times 1 and 2. As the ‘missing’ observations only occur for the later times, the option ALIGNED is still appropriate.
- C: Similar to B, but measurements for traits 3 and 4 are taken at times 2 and 4. This means that a time of recording indicator needs to be used to model the residual covariance structure correctly. This is done by specifying TSELECT together with the name of the column in the data file which contains the time variable.
- C1: As C, but using a multivariate random regression analysis.
- D: Illustrates the scenario where a trait with repeated records is analysed together with traits with single records, and where traits with single and repeated records are measured at different times, so that
  - i) there are no error covariances between these groups of traits, and
  - ii) we can ‘use’ the error covariance to model covariances between traits due to permanent environmental effects of the animal.

  For this example, we use records taken at times 1 to 4 for trait 1, and records taken at time 5 for traits 2 to 4. For this case, a model fitting a permanent environmental effect due to the animal for trait 1 only, together with the INDESCR option, is appropriate. Estimates of the error covariances between trait 1 and traits 2, 3 and 4 then reflect the permanent environmental covariance, while estimates of the (co)variances among the latter represent the sum of temporary and permanent environmental covariances.

- E: Shows the case where a trait with repeated records is analysed together with traits with single records, but where the single records are taken at the same time as one of the repeated records, so that we need to model non-zero error covariances. Here we consider records for trait 1 at all 5 times, and records for traits 2 to 3 taken at time 5. Again we need the TSELECT option to model this properly. In addition, we need to use the equivalent model invoked via the PEQ option in order to separate temporary and permanent environmental covariances between trait 1 and the other traits. Note that permanent environmental effects are fitted for all 4 traits, but that only the corresponding covariance components which can be disentangled from the environmental covariances are reported.

Source: Simulated records
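Before setting up an analysis of this kind it can help to check how many animals have how many records for each trait, the kind of summary the text says WOMBAT writes to RepeatedRecordsCounts. The sketch below produces such a tally independently of WOMBAT. The in-memory layout (a list of dictionaries with an `animal` key and one key per trait) and the function names are hypothetical; only the missing value indicator 999 is taken from the text.

```python
from collections import Counter

MISSING = 999  # missing value indicator used in the example data


def records_per_animal(rows, trait_column):
    """Count the non-missing records for one trait for each animal."""
    counts = Counter()
    for row in rows:
        counts[row["animal"]] += 0  # make sure animals without records appear
        if row[trait_column] != MISSING:
            counts[row["animal"]] += 1
    return counts


def animals_by_record_count(rows, trait_column):
    """Return {number of records: number of animals with that many records}."""
    return Counter(records_per_animal(rows, trait_column).values())
```

Traits for which any animal has more than one record are the ones with repeated records, which the text requires to be given the lower trait numbers.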

Example 14

This gives a toy example illustrating the use of the run option --snap for a simple GWAS type analysis.

Source: Simulated records

Example 15

This example illustrates how to pool estimates of covariance components by (penalized) maximum likelihood.

The example comprises 14 traits, with 6 traits measured on few animals and the remaining records representing 4 traits with measures on males and females treated as different traits (from [28]). There are results from 76 bivariate analyses and one six-variate analysis to be combined. Due to traits being measured on different subsets of animals (sexes), there are a number of residual covariances which are to be fixed at zero.

- A: All part analyses have been carried out using WOMBAT and a ‘full’ parameter file is available.
- B: Results from part analyses are summarized in a single file and a ‘minimum’ parameter file is used.

Example 16

This example shows what options are available in WOMBAT to fit ‘social interaction’ type models.

- A: An example run for simulated data for a dilution factor of 0, treating direct and social genetic effects as uncorrelated.
- B: As A, but allowing for a non-zero genetic covariance.
- C: As B, but treating social group and residual variances as heterogeneous.
- D: Fitting the same model as in B to data simulated with a non-zero dilution factor, this directory shows the multiple runs required to estimate this factor using a quadratic approximation to the profile likelihood.
- Z: Larry Schaeffer has some notes which contain a very simple example. This subdirectory shows how to obtain the BLUP solutions given, using WOMBAT. www.aps.uoguelph.ca/ lrs/ABModels/NOTES/SSocial.pdf
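The quadratic approximation to the profile likelihood used in run D can be illustrated as follows: given the maximum log likelihood from three runs with different fixed values of the dilution factor, a parabola is passed through the three points and its maximum taken as the estimate. This is a minimal sketch of that idea with a hypothetical function name; it is not WOMBAT code, and the three points have to be collected by hand from the individual runs.

```python
def quadratic_maximum(points):
    """Fit a parabola through three (x, log L) points and return its vertex x.

    Raises ValueError if the points do not define a parabola with a maximum.
    """
    (x1, y1), (x2, y2), (x3, y3) = points
    # Divided differences give the Newton form of the interpolating parabola.
    d12 = (y2 - y1) / (x2 - x1)
    d23 = (y3 - y2) / (x3 - x2)
    curvature = (d23 - d12) / (x3 - x1)  # coefficient of x^2
    if curvature >= 0.0:
        raise ValueError("points do not define a parabola with a maximum")
    # p(x) = y1 + d12 (x - x1) + curvature (x - x1)(x - x2); solve p'(x) = 0.
    return 0.5 * (x1 + x2) - d12 / (2.0 * curvature)
```

If the estimate falls outside the range of values tried, further runs around the new value are advisable, since the approximation is only reliable close to the maximum.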

Example 17

This example shows how WOMBAT can be used to approximate the sampling distribution of estimates by sampling from their asymptotic, multivariate normal distribution, as described by Meyer and Houle [30].

- A: Sample estimates for the genetic covariance matrix in a 5-trait analysis. (Note that 100 samples are used for illustration only; more should be used in practice!)
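The principle of sampling from a multivariate normal distribution can be sketched without any external library: draw independent standard normal deviates, multiply them by a Cholesky factor of the covariance matrix and add the vector of estimates. The code below is an illustration under these assumptions, not WOMBAT's implementation; the function names are hypothetical, and the estimates and their asymptotic covariance matrix have to be supplied by the user.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular L with L L' = matrix (symmetric, positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_estimates(estimates, covariance, n_samples, seed=None):
    """Draw n_samples vectors from N(estimates, covariance)."""
    rng = random.Random(seed)
    lower = cholesky(covariance)
    n = len(estimates)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        samples.append([
            estimates[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i in range(n)
        ])
    return samples
```

Empirical standard deviations or quantiles of any function of the sampled values then approximate the sampling distribution of that function.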

Example 18

This example illustrates the use of the single-step option, --s1step.

- A: Univariate analysis
- B: As A, using --blup
- C: Bivariate analysis
- D: Univariate analysis, fitting ‘explicit’ genetic groups

Example 19

This example illustrates the use of simple penalties to reduce sampling variances in REML estimation.

- 0: Parameter file to simulate data
- A: Standard unpenalized analysis
- B: Unpenalized analysis, parameterised to elements of the canonical decomposition
- C: Penalty on canonical eigenvalues with ESS=8
- D: Penalty on genetic partial correlations with ESS=8, shrinking towards phenotypic values
- Z: Script to simulate replicates and Fortran code to summarize results

Example 20

This example shows some of the options available for a pre-analysis step with run option --hinv to calculate relationship matrices required for genomic evaluation, and provides the relevant documentation as a pdf file.

- A: Calculation of the complete H-inverse for simulated data with 370 genotypes and 750 animals in total.
- A1: As A, but with scaling of the GRM as described by Christensen [6].
- B: Data as A. Illustrates calculation of GRM and GRM-inverse only.
- C: Data as A. Illustrates calculation of only.
- D: Shows computation of the inverse numerator relationship matrix with one metafounder; example from Legarra et al. [20].
- E: Computation of the inverse numerator relationship matrix with two metafounders; example from Legarra et al. [20].
- F: Illustrates computation of the H-inverse using an NRM with two metafounders. Example from github.com/alegarra/metafounders
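This section does not say how the genomic relationship matrix (GRM) is built. Purely as an illustration of what such a matrix is, the sketch below uses one widely used construction, VanRaden's first method, which centres allele counts by twice the allele frequency and scales by the sum of 2p(1-p) over markers. This is an assumption, not a description of what --hinv computes, and the function name is hypothetical; the documentation supplied with the example is the authority on the options actually available.

```python
def vanraden_grm(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts (animals x markers)."""
    n_animals = len(genotypes)
    n_markers = len(genotypes[0])
    # Allele frequencies estimated from the genotyped animals themselves.
    freqs = [
        sum(row[j] for row in genotypes) / (2.0 * n_animals)
        for j in range(n_markers)
    ]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    # Centre each marker column by twice its allele frequency.
    centred = [
        [row[j] - 2.0 * freqs[j] for j in range(n_markers)] for row in genotypes
    ]
    return [
        [sum(a * b for a, b in zip(zi, zk)) / scale for zk in centred]
        for zi in centred
    ]
```

The matrix obtained this way is often singular or poorly conditioned, which is one reason why scaling or blending steps precede its inversion in practice.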