WOMBAT – A program for Mixed Model Analyses by Restricted Maximum Likelihood

Example 1 for WOMBAT

Download Example1

One of the simplest analysis in quantitative genetics is the case where we have one trait of interest which is recorded only once, and where we have no additional random effects. Assuming a general pedigree (i.e. information from various types of relatives), we then fit a simple animal model, with additive genetic effects due to the individual to estimate the additive genetic and residual variance components and calculate the heritability of the trait from them.

Data and pedigree files

This is illustrated in Example 1 using some simulated data. The first few lines of the file tstmo1.d are shown in this box:

   10025       1       2   1  220.0000
   10026       1       2   1  212.0000
   10027       1       2   1  221.0000
   10028       1       2   1  207.0000
   10029       1       2   1  218.0000
   10030       1       2   1  201.0000
   10031       1       2   1  214.0000
   10032       1       2   1  229.0000

Column 1 contains the animal identity and columns 2 and 3 give the codes for its sire and dam, respectively. Note that all are numeric codes and that animal codes must be higher than both sire and dam codes. In this case, we have a single fixed effect – generation – with 2 levels which is listed in column 4. Column 5 finally contains the trait value. As the data file has only one record per individual and as it contains the pedigree information for all individuals (i.e. there is no additional pedigree information on parents without records), we can use the same file as pedigree file.

Parameter file

The parameter file to analyse these data then looks like this (Note that line numbers are not part of the parameter file, but are shown here for ease of description only):

 1: COM Example 1 from DFREML : Simple univariate analysis
 2: PED ../tstmo1.d
 3: DAT ../tstmo1.d
 4:    animal
 5:    sire
 6:    dam
 7:    fixeff 9
 8:    record
 9: end
10: ANAL UNI
11: MODEL
12:    RAN animal NRM
13:    FIX fixeff
14:    TR record
15: END
16: VAR animal 1
17: 60
18: VAR error 1
19: 40
  • Line 1 is a comment line (optional) – what follows the identifier COM is written at the start of the summary files generated by WOMBAT.
  • Line 2 identifies the pedigree file to be used. Here, it is held in the parent directory to the working directory (identified by ../). As no other option is given, this file is expected to have (at least) three columns, with columns 1, 2 and 3 containing the animal, sire and dam IDs, respectively.

Lines within a `block' are indented here – this is for clarity (and recommended), but not actually required by the program.

  • Line 3 specifies the data file to be used. It is the beginning of a `block' – consisting of lines 4 to 8 – which describes the content of the columns of the file, and is terminated by the END statement (line 9).
    For columns to be `used' in the analysis, the name given is to be followed by a number (must be on integer) giving the maximum number of levels for this effect (this can be larger but should not be 10 or 100 times larger). For a random effect associated with a pedigree file (animals in this case) this number be omitted (or given as 0) as the program will determine the correct number automatically.
  • Line 10 specifies the type of analysis to be carried out: UNI for a standard, univariate analysis.
  • Lines 11 to 15 specify the model of analysis. Again, this is a block of information (line 12 to 14) which is bracketed by the MODEL (line 11) and END (line 15) statements.
  • Lines 16 to 19 finally give the variance components to be estimated and the starting values to be used in the iterative estimation scheme.
    • There must be a line starting with VAR for each random effect fitted. This is followed by one or more lines with the starting values.
    • In addition there must be VAR statement with the name error or residual, and corresponding line(s) of starting values.

In this example, we call the parameter file wombat.par.

Running WOMBAT

Running WOMBAT then requires typing in the program name, possibly with the appropriate path, together with any run time options desired on the `command line' Here we use

wombat -v

This requires a set-up where the operating system knows where to look for the executable file! Otherwise, you may need to specify the complete path, e.g. /home/kmeyer/bin/wombat. The run time option -v requests `verbose' output to the screen. Since our parameter file has the default name wombat.par, we don't need to give its name.

Screen output

This generates the screen output as shown in the following (again the line numbers are not part of the output):

  1:  Parameter file opened : "wombat.par"
  2: **COM Example 1 from DFREML : Simple univariate analysis**
  3: **PED ../tstmo1.d**
  4:   Pedigree file : "../tstmo1.d"
  5: **DAT ../tstmo1.d**
  6:   Data file : "../tstmo1.d"
  7: **animal**
  8:   Data - column :   1  "animal"    -1
  9: **sire**
 10:   Data - column :   2  "sire"    -1
 11: **dam**
 12:   Data - column :   3  "dam"    -1
 13: **fixeff 9**
 14:   Data - column :   4  "fixeff"     9
 15: **record**
 16:   Data - column :   5  "record"    -1
 17: **end**
 18: **ANAL UNI**
 19:   Analysis type = 1  "UNI"
 20:   No. of traits =  1
 21: **MODEL**
 22: **RAN animal NRM**
 23: **FIX fixeff**
 24: **TR record**
 25:   Trait no.  1  "record"  Missing value  -123456789
 26: **END**
 27: **VAR animal 1**
 28: **60**
 29: **VAR error 1**
 30: **40**
 31:
 32:  End of parameter file reached :  19 lines read
 33:  Data file appers to be sorted correctly
 34:  No. of records                             =       282
 35:  No. of animal IDs in data                  =       282
 36:  No. of animal IDs in data                  =       282
 37:  Total no. of IDs in reduced ped file       =       306
 38:  Pedigree file                  =   ../tstmo1.d
 39:  Total no. of animal IDs found  = 306
 40:  no. of NRM matrices  1 1
 41:  Pruned pedigree for random effect no. 1
 42:  Original no. of animals = 306
 43:  New no. of animals      = 306
 44:
 45: Inbreeding coefficients for random effect   1 computed
 46: No. of inbred animals          =      38
 47: Average inbreeding coefficient =  2.2059 (in %)
 48: ... amongst inbred animals     = 17.7632 (in %)
 49:  NRM Inverse no. : 1   "animal"
 50:  No. of elements = 906   log determinant = -195.46750491790496
 51:  No. of "subjects" in data 282
 52:   maxnr = 2   ncombi = 1
 53:  Parameters to be estimated & their starting values
 54:     1  COVS Z 1 1           40.000000
 55:     2  COVS A 1 1           60.000000
 56:  END data
 57:  Coefficient matrix fixed effects - rank =     2  no. of rows =     2
 58:  END "lsqfixed"
 59:  No. of equations in full rank submatrix = 308
 60:  Elements in MMM - after data                 282
 61:  Elements in MMM - after RE    1              882
 62:  End of "GENMMD"      308  nofsub =           950
 63:  Elements in MMM - after RHS                 1166
 64:  End of "SMBFCT": maxsub                      318
 65:                   maxlnz                     1274
 66:                   nnops                      4599.  999999999999999.
 67:  END "eqnorders"
 68: ***log L    1          -760.345   Time       0
 69:  search 1
 70: ***AI iterate     0  log L =  -760.344597      D =   0.0000       0.0000      Time      0
 71:  AI matrix : min. & max. eigenvalue 0.580946      6.50567      tau =   0.0000
 72: ***log L    2          -759.598   Time       0
 73:  Tau =  0.0000     Alpha =  1.0000      log L =        -759.598
 74: ***AI iterate     1  log L =  -759.597721      D =  0.74688      0.18539      Time      0
 75:  Estimates for "tau" =  0.0000       "alpha" =  1.0000       "norm" =  1.6662
 76:    1   CHOL Z 1 1       Old =    6.324555     New =    7.396408
 77:    2   CHOL A 1 1       Old =    7.745967     New =    6.298573
 78:  AI matrix : min. & max. eigenvalue 0.689232      6.18917      tau =   0.0000
 79: ***log L    3          -759.507   Time       0
 80:  Tau =  0.0000     Alpha =  1.0000      log L =        -759.507
 81: ***AI iterate     2  log L =  -759.506748      D =  0.90973E-01  0.52339E-01  Time      0
 82:  Estimates for "tau" =  0.0000       "alpha" =  1.0000       "norm" = 0.64462
 83:    1   CHOL Z 1 1       Old =    7.396408     New =    7.097490
 84:    2   CHOL A 1 1       Old =    6.298573     New =    6.713435
 85:  AI matrix : min. & max. eigenvalue 0.637744      6.31627      tau =   0.0000
 86: ***log L    4          -759.504   Time       0
 87:  Tau =  0.0000     Alpha =  1.0000      log L =        -759.504
 88: ***AI iterate     3  log L =  -759.503659      D =  0.30884E-02  0.11687E-01  Time      0
 89:  Estimates for "tau" =  0.0000       "alpha" =  1.0000       "norm" = 0.72789E-01
 90:    1   CHOL Z 1 1       Old =    7.097490     New =    7.148321
 91:    2   CHOL A 1 1       Old =    6.713435     New =    6.611619
 92:  AI matrix : min. & max. eigenvalue 0.652444      6.34003      tau =   0.0000
 93: ***log L    5          -759.503   Time       0
 94:  Tau =  0.0000     Alpha =  1.0000      log L =        -759.503
 95: ***AI iterate     4  log L =  -759.503454      D =  0.20554E-03  0.29391E-02  Time      0
 96:  Estimates for "tau" =  0.0000       "alpha" =  1.0000       "norm" = 0.19922E-01
 97:    1   CHOL Z 1 1       Old =    7.148321     New =    7.134470
 98:    2   CHOL A 1 1       Old =    6.611619     New =    6.636684
 99: *** AIREML has converged
100:     Last change in log L 2.05541655532215373E-4 5.0000000000000001E-4
101:    1  animal                   306  zero soln.s       0
102:  SS residuals 9470.5877529900336
103:  "WOMBAT" has finished !

Results files

Summary of pedigree information

The first output file generated is a summary of the pedigree information found. This file is named SumPedigree.out.

 1: ======= Version 15-09-2009 ======================================= **KM** ====
 2:
 3:         Program WOMBAT : Summary of Pedigree Information
 4: ==============================================================================
 5:
 6:  Example 1 from DFREML : Simple univariate analysis
 7:
 8:  Analysis type          :   "UNI"
 9:  Data file              :   "../tstmo1.d"
10:  Pedigree file          :   "../tstmo1.d"
11:  Parameter file         :   "wombat.par"
  • Lines 1 to 4 represent a header to describe the file. This includes the date (a.k.a version) that the program executable has been compiled (line 1).
  • Line 6 contains the comment line given in the parameter file.
  • Lines 8 to 11 identify the type of analysis to be carried out and the names of the data, pedigree and parameter files used.
  • Lines 12 and 13 are blank.
14:  No. of animal IDs in data file =           =       282
15:  No. of animal IDs in total     =           =       306
16:
17: *****Pedigree Structure for random effect :    1  ****************************
18:  Original no. of animals                    =       306
19:  No. of animals after pruning               =       306
20:  ... proportion (%) remaining               =     100.0
21:
22:  No. of levels w/out records                =        24
23:  No. of levels with records                 =       282  100.0%
24:   ...  1 record(s)                          =       282  100.0%
25:
26:  No. of animals w/out offspring             =       258   84.3%
27:  No. of animals with offspring              =        48   15.7%
28:   ... and records                           =        24    7.8%
29:
30:  No. of animals with unknown sire           =        24
31:  No. of animals with unknown dam            =        24
32:  No. of animals with both parents unknown   =        24
33:  No. of animals with records                =
34:   ... and unknown sire                      =         0
35:   ... and unknown dam                       =         0
36:   ... and both parents unknown              =         0
37:  No. of sires                               =        12
38:   ... with progeny in the data              =        12
39:   ... with records & progeny in data        =         6
40:  No. of dams                                =        36
41:   ... with progeny in the data              =        36
42:   ... with records & progeny in data        =        18
43:
44:  No. of animals with known/unpruned grand-parents
45:   ... with paternal grandsire               =       144
46:   ... with paternal granddam                =       144
47:   ... with maternal grandsire               =       144
48:   ... with maternal granddam                =       144
49:
50: Inbreeding coefficients for random effect   1 computed
51: No. of inbred animals          =      38
52: Average inbreeding coefficient =  2.2059 (in %)
53: ... amongst inbred animals     = 17.7632 (in %)
54:   random effect no. = 1 NRM
55:   no. of elements in NRM/GIN inverse  906
56:   log determinant   = -195.46750491790496
57: ======== end of file ============================21-09-2009==========15:40====
  • Lines 14 and 15 give the numbers of distinct animal codes (ID for identity) that WOMBAT has identified in the data and pedigree file, respectively.
  • Lines 17 to 48 contain various counts on numbers of individuals and the numbers of ancestors identified together with some statistics on individuals which have both progeny and records in the data.
  • Lines 50 to 56 report on the calculation of inbreeding coefficients and the characteristics of the inverse of the numerator relationship matrix that has been set up from the pedigree information supplied.
  • Finally, line 57 marks the end of the file, together with a dat and time stamp.

Summary for data file and model specified

The second summary file generated at the beginning of a run is SumModel.out. It gives a summary of the number of records, means and standard deviations per trait and the number of levels for each effect fitted that WOMBAT has found.

  • It is a good idea to inspect this file closely `early on' to be sure that you and WOMBAT agree on the details of the analysis! WOMBAT provides the run-time option ––setup to facilitate this.
 1: ======= Version 15-09-2009 ======================================= **KM** ====
 2:
 3:         Program WOMBAT : Summary of information from Set-up step
 4: ==============================================================================
 5:
 6:  Example 1 from DFREML : Simple univariate analysis
 7:
 8:  Analysis type          :   "UNI"
 9:  Data file              :   "../tstmo1.d"
10:  Pedigree file          :   "../tstmo1.d"
11:  Parameter file         :   "wombat.par"
12:
13:
14:  No. of traits          =   1
15:                       nrec       mean         sdev         min.         max.
16:    1  "record"         282    229.787      13.4938      194.000      269.000
17:
18:  Fixed effects
19:  1 "record"           nlev
20:    1  "fixeff"           2
21:
22:  Random effects       nlev
23:    1  "animal"         306    NRM
24: ======== end of file ============================21-09-2009==========15:40====

Summary of variance component estimates

The main results file is named SumEstimates.out.

 1: ======= Version 15-09-2009 ======================================= **KM** ====
 2:
 3:         Program WOMBAT : Estimates of covariance components
 4: ==============================================================================
 5:
 6:  Example 1 from DFREML : Simple univariate analysis
 7:
 8:  Analysis type          :   "UNI"
 9:  Data file              :   "../tstmo1.d"
10:  Pedigree file          :   "../tstmo1.d"
11:  Parameter file         :   "wombat.par"
12:
13:
14:  No. of traits          =      1                record
15:  No. of records         =    282           282
16:  No. of parameters      =      2
17:  Maximum log L          =          -759.503
18:  -1/2 AIC & AICC        =          -761.503          -761.525
19:  -1/2 BIC               =          -765.138    "Penalty factor" =   2.817
20:
21:  Parameter estimates with approx. sampling erors
22:    1   CHOL Z 1 1              7.13447          0.662468
23:    2   CHOL A 1 1              6.63668           1.12245
24:
25:  Convergence criteria for last 3 iterates
26:  Change in log likelihood    =    0.090973    0.003088    0.000206
27:  Change in parameter vector  =    0.052339    0.011687    0.002939
28:  Norm of gradient vector     =      0.6446      0.0728      0.0199
29:  Newton decrement            =     -0.2226     -0.0083     -0.0005
  • Lines 1 to 11 again comprise the file header, together with information on the input files used.
  • Lines 16 to 19 give the number of parameters estimated and the maximum of the REML log likelihood at the end of the run, together with corresponding information criteria calculated from it.
31: ***** Estimates of residual covariances   ************************************
32:       Order of fit         =       1
33:  Covariance matrix
34:    1     50.901
35:  Matrix of correlations and variance ratios
36:    1      0.5361
37:  Covariances & correlations with approximate sampling errors
38:    1  COVS Z 1 1        50.9007        9.45272       vrat     0.536   0.123
39:
40: ***** Estimates for RE  1   "animal"   ***************************************
41:       No. of levels        =     306
42:       Covariance structure =    NRM
43:       Order of fit         =       1
44:  Covariance matrix
45:    1     44.046
46:  Matrix of correlations and variance ratios
47:    1      0.4639
48:  Covariances & correlations with approximate sampling errors
49:    2  COVS A 1 1        44.0456        14.8987       vrat     0.464   0.123
50:
51: ***** Estimates of phenotypic covariances  ***********************************
52:  Covariance matrix
53:    1     94.946
54:  Covariances & correlations with approximate sampling errors
55:    3  COVS T 1 1        94.9462        10.0250
56: ======== end of file ============================21-09-2009==========15:40====

Other output to look at

QR Code
QR Code wombat:ex1page (generated for current page)