private study only. The thesis may not be reproduced elsewhere without
the permission of the Author.
THE ESTIMATION OF GENETIC PARAMETERS
FOR CATEGORICAL TRAITS
A thesis presented in partial fulfilment of the requirements for the degree of
Doctor of Philosophy in Animal Science at Massey University.
ARTHUR RICHARD GILMOUR
1983
ABSTRACT
The estimation of heritabilities, genetic and environmental variances and covariances, and the prediction of breeding values, are major concerns among animal breeders. This study adapts existing statistical methods to provide a new method of estimating these parameters for categorical traits.
The problems associated with the analysis of categorical data arise because of the relationship between the mean and
variance of the sampling distribution. The parameters of the sampling distribution are assumed to be a non-linear function of values on an underlying scale. It is further assumed that fixed and random effects are additive on the underlying
scale. This scale cannot be observed and information about it must be deduced from the observed categorical trait.
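As a concrete illustration of these assumptions, consider the binomial case with the logit link adopted later in this abstract. The symbols below (counts $y_i$ out of $n_i$, incidence $p_i$, underlying value $\eta_i$, fixed effects $\boldsymbol\beta$, random effects $\mathbf{u}$ and design vectors $\mathbf{x}_i$, $\mathbf{z}_i$) are notation chosen for this sketch and are not necessarily the notation of the thesis.

```latex
\[
y_i \mid p_i \sim \mathrm{Binomial}(n_i, p_i), \qquad
\mathrm{E}\!\left[\tfrac{y_i}{n_i} \,\middle|\, p_i\right] = p_i, \qquad
\mathrm{Var}\!\left[\tfrac{y_i}{n_i} \,\middle|\, p_i\right] = \frac{p_i(1-p_i)}{n_i},
\]
\[
\eta_i = \log\frac{p_i}{1-p_i} = \mathbf{x}_i'\boldsymbol\beta + \mathbf{z}_i'\mathbf{u}.
\]
```

The first line shows the dependence of the variance on the mean; the second shows effects that are additive on the unobserved scale $\eta_i$ but non-linear in $p_i$.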
A common practice has been to estimate parameters (fixed effects and variances) on the categorical variable itself and then to transform these estimates to values applicable to the underlying scale. This procedure is theoretically invalid except for a fully random model in which the only fixed effect is the mean. The method developed in this thesis attempts to estimate parameters directly in the underlying scale by transforming the data before calculating estimates of the parameters. It is a synthesis of mixed model procedures (Henderson et al., 1959) and generalized linear models (Nelder and Wedderburn, 1972) and is called the generalized linear mixed model. The general method is for analysing data presumed to arise from a two-stage sampling procedure when the second sampling has an error distribution belonging to the single parameter exponential family.
The detailed algebra for applying the new method to binomial and multinomial traits for the estimation of fixed effects is presented. The logit transformation is used in this application and the resulting system of equations is called the logistic linear mixed model. A procedure for estimating variances, covariances and the intraclass correlation on the underlying scale is also developed.
The logistic linear mixed model is evaluated by comparing parameter estimates from the method with true values used to generate the data being analysed. Biases appear to be small except for some extreme combinations of parameters when
assumptions made while developing the algebra break down.
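The kind of simulated data used in such an evaluation can be sketched as follows. This is a minimal illustration only, not the simulation design of the thesis: the function names, the normal distribution for family effects, the parameter values, and the use of pi^2/3 (the variance of the standard logistic distribution) as the residual variance on the underlying scale are all assumptions made for this example.

```python
import math
import random


def inverse_logit(eta):
    """Map a value on the underlying (logit) scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def simulate_families(n_families, family_size, mu, sigma_u, seed=0):
    """Simulate the number of affected individuals in each family.

    Each family receives a random effect on the logit scale (assumed normal
    here), and individuals are then sampled as independent Bernoulli trials.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_families):
        u = rng.gauss(0.0, sigma_u)      # first stage: family effect
        p = inverse_logit(mu + u)        # incidence for this family
        affected = sum(1 for _ in range(family_size) if rng.random() < p)
        counts.append(affected)          # second stage: binomial count
    return counts


def intraclass_logit(sigma_u2):
    """Intraclass correlation on the logistic scale.

    Assumes the residual variance is pi^2 / 3, the variance of the
    standard logistic distribution.
    """
    return sigma_u2 / (sigma_u2 + math.pi ** 2 / 3.0)


# Hypothetical parameter values chosen only for illustration.
counts = simulate_families(n_families=50, family_size=20, mu=-1.0, sigma_u=0.5, seed=1)
observed_incidence = sum(counts) / (50 * 20)
print(observed_incidence, intraclass_logit(0.5 ** 2))
```

Estimates obtained from data generated this way can then be set against the known values of `mu` and `sigma_u`, which is the sense in which bias is assessed.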
The logistic linear mixed model is applied to two real problems for which fixed and random effects and variance components are estimated and comparisons made with parameters estimated by other methods. The first problem is the analysis of data on the foot characteristics of 2513 lambs; the second is the analysis of 1396 lambing performance records.
The concluding discussion considers the general use of the logistic linear mixed model and its relationship with other models.
ACKNOWLEDGEMENTS
I gratefully acknowledge the guidance of my supervisors, Professor R. D. Anderson, Professor A. L. Rae and Dr. G. A. Wickham.
The Australian Wool Corporation provided the post-graduate scholarship which enabled this study to be undertaken. It is thanked, along with the New South Wales Department of Agriculture, which provided study leave and financial assistance.
The Animal Science Department has provided a fine working environment and the opportunity to broaden my outlook.
My gratitude is extended to the friends we have made, particularly in Milson. Their willingness to include us in their community has made life both pleasant and rewarding for me and especially for my wife, Ellen, and our three daughters. In particular, I acknowledge the friendship of Graham and Lynette Toms and the members of Milson Christian Fellowship.
To Ellen and the three girls, thank you for being there when needed and bearing with me when times were busy.
To my God, thank you for providing all our needs - friends, a home, work and recreation, health, finance and the opportunity to serve - through Jesus Christ.
Table of Contents

Section  Heading  page

  Abstract  ii
  Acknowledgments  iv
  Table of Contents  v
  List of Tables  ix
  List of Figures  xiv

Chapter 1.  Introduction  1

Chapter 2.  Literature review  3
2.1.  Categorical data  4
2.2.  Analysis on the binary (0,1) scale  6
2.2.1.  Binomial traits  6
2.2.2.  Multinomial traits  11
2.2.3.  Correlation between variables  13
2.2.4.  Selection involving categorical traits  15
2.3.  Analysis on the underlying scale  17
2.3.1.  Falconer's liability method  17
2.3.2.  Generalized linear models and other proposals  19
2.4.  Scope of the thesis  21

Chapter 3.  Review of statistical procedures required in later chapters  22
3.1.  Some statistical procedures used in animal breeding  22
3.1.1.  The selection index  22
3.1.2.  Best linear unbiased prediction  23
3.1.3.  Estimation of variance components  24
3.2.  Other statistical procedures  26
3.2.1.  Logistic distribution  26
3.2.2.  Log likelihood expectations  28
3.2.3.  Generalized linear models (GLM)  30
3.2.3.1.  Derivation of the general expressions for members of the single parameter exponential family  30
3.2.3.2.  Testing the 'goodness of fit' of a proposed model  35
3.2.4.  Maximum likelihood solution for parameters of the multinomial distribution by generalized linear models  35
  Solution of re-weighted least squares equations  39

Chapter 4.  Generalized linear mixed model  40
4.1.  The joint-maximization method  41
4.2.  The maximization-expectation method  44

Chapter 5.  Deriving the logistic linear mixed model  48
5.1.  Mixed model analysis of binomial data  48
5.1.1.  Solution by joint-maximization method  49
5.1.2.  Solution by the maximization-expectation method  49
5.1.2.1.  Using the Normal distribution  50
5.1.2.2.  Using the moments of a symmetric distribution  52
5.1.2.3.  Using the logistic distribution  55
5.2.  Analysis of multinomial data  59
5.2.1.  Analysis of extremal characters  59
5.2.1.1.  Solution by joint-maximization method  59
5.2.1.2.  Solution by the maximization-expectation method  60
5.2.2.  Analysis of multiple threshold characters  66
5.2.2.1.  Fixed effects generalized linear model  68
5.2.2.2.  Mixed model threshold analysis by joint-maximization method  70
5.2.2.3.  Mixed model threshold analysis by maximization-expectation method  71

Chapter 6.  Application of the logistic linear mixed model to animal breeding  75
6.1.  The simultaneous analysis of categorical and continuous characters by the same model  75
6.1.1.  Covariance between a binomial character and a normal character  77
6.2.  Estimation of variances  79
6.3.  Implementation of the logistic linear mixed model in a generalized linear models programme  83
6.3.1.  Measurement of 'lack of fit'  83
6.3.2.  Estimation of variance components  84

Chapter 7.  Performance of the logistic linear mixed model in simulation studies  87
7.1.  The behaviour of estimates of mean and variance under the logistic linear mixed model  87
7.2.  The precision of estimates of the threshold and the intraclass correlation  92
7.3.  The estimation of breeding values under the logistic linear mixed model  97

Chapter 8.  A study of foot ailments associated with Merino-cross sheep grazing damp conditions  110
8.1.  Repeatability of assessment of foot-shape  111
8.2.  Estimation of heritability of incidence of foot ailments  112

Chapter 9.  Reproductive performance of Perendale ewes  125
9.1.  Analysis of Perendale data by ordinary multivariate least squares  129
9.2.  Analysis of binary traits on the probit scale under a fixed effects model  130
9.3.  Analysis of the Perendale data using the logistic linear mixed model  135

Chapter 10.  General Discussion  147
10.1.  How relevant is an estimate of heritability on the underlying scale?  147
10.2.  What relationship exists between the logistic linear mixed model and the ordinary mixed model analysis?  147
10.3.  What can be done in very large problems where the iterative procedures would be prohibitively expensive?  149
10.4.  When might it be more efficient to use equations based on the joint-maximization method rather than those based on the maximization-expectation method?  150
10.5.  What further work is required?  152
10.6.  Conclusion  153

Appendix A  Some matrix symbols and operations  154
Appendix B  More expectations for section 5.1.2.1 using the normal distribution  156
Appendix C  Derivation of the basic expressions required to apply the generalized linear mixed model to multinomial data using the multinomial logit  159
Appendix D  Some additional tables relating to chapter 8  176
Bibliography  181
List of Tables

Table  page

3.1  The relationship between the standard normal and standard logistic distributions when used as probability transformations.  28
7.1  Effects of differing family size and numbers of families on estimates of the intraclass correlation and the threshold in binomial samples with extra-binomial variation.  94
7.2  Statistics for 10 samples of 100 N(0, .10) random variables used to define the 10 flocks.  98
8.1  Analysis of variance of foot-shape scores.  111
8.2  Analysis of deviance of foot-shape scores when analysed as a double-threshold trait with an intraclass correlation of 0.731.  111
8.3  Iterative sequence for estimating the intraclass correlation for the double-threshold trait, L54 (lamb foot-shape score 5, 4 and less than 4).  117
8.4  Analysis of deviance for the double-threshold trait L54 (lamb foot-shape score 5, 4 and less than 4).  117
8.5  Estimates of intraclass correlation and heritability (4 times intraclass correlation) for 12 sheep foot traits.  118
8.6  Correlation between estimates of breeding value for foot-shape score, obtained by various methods.  118
8.7  Correlation between estimates of breeding value of 28 sires in four mating-groups for presence of foot-scald and presence of foot-rot.  119
8.8  Analysis of deviance for foot-shape traits L54 and H54, using 34 sires.  119
8.9  Analysis of deviance for foot-shape traits L5 and H5, using 34 sires.  120
8.10  Analysis of deviance for presence of foot-scald (LS) and presence of foot-rot (LR) in lambs using 28 sires.  120
8.11  Analysis of variance for foot-shape score (LC) using 34 sires. K coefficient is 73.4388.  121
8.12  Analysis of variance for foot-shape trait L5 using 34 sires. K coefficient is 73.4388.  122
8.13  Analysis of variance for presence of foot-scald (LS) using 28 sires. K coefficient is 69.7024.  123
8.14  Analysis of variance for presence of foot-rot (LR) using 28 sires. K coefficient is 69.7024.  124
8.15  Incidence of foot-scald in lambs (5 months) on percentage and logit scales.  124
9.1  Summary of the number of daughters sired by each of the 63 rams used in the flock from 1961 to 1972.  126
9.2  Dam age means for weaning weight, hogget weight and number of lambs weaned.  126
9.3  Year means for weaning weight, hogget weight and number of lambs weaned.  127
9.4  Variance and covariance components estimated by Henderson's (1953) method 3. Covariance components are below and correlations are above the diagonal.  129
9.5  Heritabilities of five traits by Henderson's (1953) method 3.  129
9.6  Heritability on the probit scale by Falconer's (1965) method.  132
9.7  Heritability on the probit scale using Falconer's method after including hogget weight as a covariate and assuming genetic correlations of 1.  133
9.8  Sires in order of average rank for lambed-or-not and twins-or-not.  134
9.9  Fixed effects for all traits estimated by maximum likelihood single trait methods.  139
9.10  Genetic and environmental variances, heritabilities and breeding values estimated by maximum likelihood single trait methods.  140
9.10  continued. Breeding values for all traits estimated by maximum likelihood single trait methods.  141
9.11  Some results from using various trial values for the variances and covariances associated with an extremal trait.  142
9.12  Genetic variances and covariances between two binary traits, lambed-or-not and twins-or-not, and hogget weight estimated for three environmental correlations, 0.0, 0.3 and -0.3.  143
9.13  Genetic variances and covariances between two categorical traits, lamb-t and lamb-e, and hogget weight, estimated for three environmental correlations, 0.0, 0.3 and -0.3.  144
9.14  Comparison of breeding values for lambed-or-not and hogget weight obtained assuming environmental correlation of 0.0 and genetic correlations of 0.0 and 0.6275.  145
9.15  Correlations between breeding values in table 9.14 and the rearing rank of the sire.  146
10.1  The probability that a group of nominated size and nominated incidence has observations all in the same category.  152
D.1  Foot-shape scores recorded at two times by two observers on 97 lambs.  176
D.2  Foot-scald scores recorded at two times by two observers on 97 lambs.  176
D.3  Foot-rot scores recorded at two times by two observers on 97 lambs.  177
D.4a  Experiment data for 1980 lambing.  177
D.4b  Experiment data for 1981 lambing.  178
D.5  Estimates of breeding values for the trait foot-shape score - obtained by various methods.  179
D.6  Estimates of breeding values for the traits presence of foot-scald and presence of foot-rot, obtained by two methods.  180
List of Figures

Figure  page

7.1  Comparison of the scale factors associated with the three expressions for E[p].  90
7.2  Comparison of the scale factors associated with the three expressions for E[p(1-p)].  91
7.3  Relationship between actual and estimated values of the intraclass correlation coefficient for various threshold values.  95
7.4  Relationship between actual and estimated values of the threshold for various intraclass correlations.  96
7.5  Relationship between B(.50) and family size.  103
7.6  Relationship between G(.50) and family size.  103
7.7  Relationship between B(.10) and family size.  104
7.8  Relationship between G(.10) and family size.  104
7.9  Relationship between B(.02) and family size.  105
7.10  Relationship between G(.02) and family size.  105
7.11  Relationship between correlation (u, u_x) and family size.  106
7.12  Relationship between correlation (u, u_p) and family size.  107
7.13  Relationship between correlation (u_x, u_p) and family size.  108
7.14  Relationship between deviance (499 df) and family size.  109