3.8: Statistical tools for data analysis
3.8.5 Estimation Method
In this section, we introduce the Maximum Likelihood Estimation (MLE) technique used to estimate parameters in the survey logistic regression model. Because of complex sampling design properties such as unequal probability of selection, clustering and stratification, standard maximum likelihood estimation may not work properly unless these features are taken into account.
Parameter estimation
In this sub-subsection we derive expressions for the maximum likelihood estimators in a typical survey logistic regression. Assume that the outcome variable $y_{ijh}$ follows a Bernoulli distribution with density function

$$g(Y_{ijh}=y_{ijh}) = \pi_{ijh}^{y_{ijh}}(1-\pi_{ijh})^{1-y_{ijh}} \qquad (3.78)$$

The mean and variance of $y_{ijh}$ are, respectively,

$$\mu_{ijh} = \frac{\exp[x'_{ijh}\beta]}{1+\exp[x'_{ijh}\beta]} \qquad (3.79)$$

and

$$\sigma^2_{ijh} = \mu_{ijh}(1-\mu_{ijh}) \qquad (3.80)$$

where $\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ denotes the vector of parameters. The log-likelihood function that forms the basis for maximum likelihood estimation is given by
$$\ell = \sum_{h=1}^{H}\sum_{j=1}^{n_h}\sum_{i=1}^{m_{hj}} w_{ijh}\vartheta_{ijh}\left[y_{ijh}\log(\mu_{ijh}) + (1-y_{ijh})\log(1-\mu_{ijh})\right] \qquad (3.81)$$
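The quantities in (3.79)–(3.81) can be sketched numerically. The following is a minimal illustration, assuming the product of the sampling weights $w_{ijh}\vartheta_{ijh}$ is collapsed into a single array; the array names and data are assumptions for illustration, not part of the text:

```python
import numpy as np

def weighted_loglik(X, y, beta, w):
    """Weighted Bernoulli log-likelihood of eq. (3.81), with the product of
    the sampling weights w_ijh * vartheta_ijh collapsed into one array w."""
    eta = X @ beta                           # linear predictor x'_ijh beta
    mu = np.exp(eta) / (1.0 + np.exp(eta))   # mean, eq. (3.79)
    sigma2 = mu * (1.0 - mu)                 # variance, eq. (3.80)
    ell = np.sum(w * (y * np.log(mu) + (1 - y) * np.log(1 - mu)))
    return ell, mu, sigma2

# Tiny assumed example: intercept plus one covariate, four observations,
# unequal sampling weights.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])
w = np.array([1.5, 0.8, 1.2, 1.0])
ell, mu, sigma2 = weighted_loglik(X, y, np.array([0.1, -0.2]), w)
```

Each observation contributes its Bernoulli log-density scaled by its weight, so the sum reduces to the unweighted log-likelihood when all weights equal one.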
Substituting the mean $\mu_{ijh}$ into this expression, we obtain

$$\ell = \sum_{h=1}^{H}\sum_{j=1}^{n_h}\sum_{i=1}^{m_{hj}} w_{ijh}\vartheta_{ijh}\left[y_{ijh}\log\left(\frac{\exp[x'_{ijh}\beta]}{1+\exp[x'_{ijh}\beta]}\right) + (1-y_{ijh})\log\left(1-\frac{\exp[x'_{ijh}\beta]}{1+\exp[x'_{ijh}\beta]}\right)\right] \qquad (3.82)$$

To obtain the unknown parameters we differentiate the log-likelihood with respect to $\beta$, which gives the following equation
$$\frac{\partial\ell}{\partial\beta} = \sum_{h=1}^{H}\sum_{j=1}^{n_h}\sum_{i=1}^{m_{hj}} w_{ijh}\vartheta_{ijh}\,\frac{\exp[x'_{ijh}\beta]}{\left(1+\exp[x'_{ijh}\beta]\right)^2}\left[\frac{y_{ijh}}{\mu_{ijh}} - \frac{1-y_{ijh}}{1-\mu_{ijh}}\right]x_{ijh} \qquad (3.83)$$

$$= \sum_{h=1}^{H}\sum_{j=1}^{n_h}\sum_{i=1}^{m_{hj}} w_{ijh}\vartheta_{ijh}\,A'_{ijh}\left[\sigma^2(y_{ijh})\right]^{-1}\left(y_{ijh}-\mu_{ijh}\right) \qquad (3.84)$$

where $A_{ijh} = \mu_{ijh}(1-\mu_{ijh})\,x'_{ijh}$.
The Fisher information matrix for the parameters of the Bernoulli model follows as

$$\omega = -E\left[\frac{\partial^2\ell}{\partial\beta\,\partial\beta'}\right] \qquad (3.85)$$

$$= \sum_{h=1}^{H}\sum_{j=1}^{n_h}\sum_{i=1}^{m_{hj}} D\,\frac{b}{c^2}\,x_{ijh}x'_{ijh} \qquad (3.86)$$

where $D = w_{ijh}\vartheta_{ijh}$, $b = \exp[x'_{ijh}\beta]$ and $c = 1+\exp[x'_{ijh}\beta]$.
Simplifying the previous equation gives the following expression, which is referred to as the Fisher information
$$\omega = \sum_{h=1}^{H}\sum_{j=1}^{n_h}\sum_{i=1}^{m_{hj}} w_{ijh}\vartheta_{ijh}\,A'_{ijh}\left[\sigma^2(y_{ijh})\right]^{-1}A_{ijh} \qquad (3.87)$$
Tests for goodness of fit

The Likelihood Ratio Test
The likelihood ratio (LR) test evaluates the significance of the joint effect of all the variables in the survey logistic regression procedure. The likelihood ratio compares the model with multiple parameters to the intercept-only model. Suppose the model contains $s$ explanatory effects. For the $i$th observation, let $\hat{\pi}_i$ be the estimated probability of the observed response. The $-2\log$-likelihood statistic is given by:
$$-2\log L = -2\sum_{i} w_i f_i \log(\hat{\pi}_i) \qquad (3.88)$$
where $w_i$ and $f_i$ are the weight and frequency values, respectively, of the $i$th observation. For binary response models that use the events/trials syntax, this is equivalent to
$$-2\log L = -2\sum_{i} w_i f_i\left[r_i\log(\hat{\pi}_i) + (n_i-r_i)\log(1-\hat{\pi}_i)\right] \qquad (3.89)$$
where $r_i$ is the number of events, $n_i$ is the number of trials, and $\hat{\pi}_i$ is the estimated event probability. The likelihood ratio compares the log-likelihoods of the two models; if the difference is statistically significant, the full model fits better than the restricted (intercept-only) model. A significant likelihood ratio test therefore means that the variables in the full model jointly improve on the intercept-only model. The likelihood ratio test has the following null and alternative hypotheses:
$$H_0: \beta_i = 0 \quad \text{for } i = 1, 2, \ldots, p$$
$$H_1: \text{not all } \beta_i = 0$$
The test statistic of the likelihood ratio test follows a chi-square distribution with $p$ degrees of freedom (Prempeh, 2009). The log-likelihood ratio is defined as the difference between the deviance of the null model and the deviance of the model with explanatory variable(s).
$$\text{Log-likelihood ratio} = Dv_{null} - Dv_{p-1} \qquad (3.90)$$

where $Dv_{null}$ is the deviance of the model with just the constant and $Dv_{p-1}$ is the deviance of the model with $p-1$ parameters.
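Equation (3.90) can be sketched directly from the fitted log-likelihoods, since the deviance drop equals twice the log-likelihood gain. A minimal sketch, assuming the two log-likelihood values come from already-fitted models (the numerical values below are hypothetical, for illustration only):

```python
from scipy.stats import chi2

def likelihood_ratio_test(loglik_null, loglik_full, p):
    """LR statistic of eq. (3.90): the drop in deviance between the
    intercept-only model and a model with p extra parameters, referred
    to a chi-square distribution with p degrees of freedom."""
    lr = 2.0 * (loglik_full - loglik_null)   # = Dv_null - Dv_{p-1}
    p_value = chi2.sf(lr, df=p)              # upper-tail chi-square prob.
    return lr, p_value

# Hypothetical fitted log-likelihoods, for illustration only.
lr, p_value = likelihood_ratio_test(loglik_null=-350.0, loglik_full=-340.0, p=3)
```

A small p-value leads to rejecting the global null, i.e. the explanatory effects jointly improve on the intercept-only model.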
Wald Test
The Wald test is an additional test that can be used to evaluate the significance of individual parameters in the survey logistic regression procedure. The expression for calculating the Wald statistic is given by
$$Wald = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)} \qquad (3.91)$$
where $\hat{\beta}_i$ is the regression coefficient estimate of an explanatory variable and $SE(\hat{\beta}_i)$ is the standard error of the corresponding regression coefficient estimate.
According to Rana et al. (2010), the squared value of the Wald statistic is chi-square distributed with one degree of freedom:

$$(Wald)^2 = \left(\frac{\hat{\beta}_i}{SE(\hat{\beta}_i)}\right)^2 \qquad (3.92)$$
The Wald statistic test has the following null and alternative hypotheses:
$$H_0: \beta_i = 0$$
$$H_1: \beta_i \neq 0$$
When testing one parameter at a time, the Wald statistic follows the $\chi^2$ distribution with 1 degree of freedom. The null hypothesis is rejected if the p-value $< \alpha = 0.05$, where $\alpha$ is the level of significance. A regression coefficient estimate whose Wald p-value is less than 0.05 indicates that the variable is important in the current model.
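The computation in (3.91)–(3.92) is a one-liner; the sketch below makes the p-value step explicit (the coefficient and standard error values are hypothetical, chosen only to illustrate the mechanics):

```python
from scipy.stats import chi2

def wald_test(beta_hat, se_beta_hat):
    """Wald statistic of eq. (3.91) for a single coefficient; its square,
    eq. (3.92), is referred to a chi-square with 1 degree of freedom."""
    wald = beta_hat / se_beta_hat
    p_value = chi2.sf(wald**2, df=1)
    return wald, p_value

# Hypothetical estimate and standard error, for illustration only.
wald, p_value = wald_test(beta_hat=0.8, se_beta_hat=0.25)
```

Here a p-value below 0.05 would lead to rejecting $H_0: \beta_i = 0$ for that coefficient.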
Akaike's Information Criterion (AIC)
Akaike's Information Criterion (AIC) measures the relative quality of each model fitted in the survey logistic regression and can be used to select the best model. The AIC is not informative for a single model in isolation; it is useful only for comparing different models fitted to the same data. The formula for calculating AIC is:
$$AIC = -2L(\hat{\beta}) + 2p \qquad (3.93)$$

where $L(\hat{\beta})$ is the maximized value of the log-likelihood function and $p$ is the number of parameters in the model. For the generalized logit model, $p = k(m+1)$, where $k$ is the total number of response levels minus one and $m$ is the number of explanatory effects. The model with the smallest AIC value is the most preferable model to use for analysis.
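The model-selection rule in (3.93) can be sketched as follows; the candidate models and their log-likelihood values below are assumptions for illustration, not fitted results from the text:

```python
def aic(loglik, p):
    """AIC of eq. (3.93): -2 * maximized log-likelihood + 2 * number of
    parameters. Smaller is better when comparing models on the same data."""
    return -2.0 * loglik + 2.0 * p

# Hypothetical candidate models (log-likelihoods assumed, for illustration).
models = {
    "intercept only": aic(-350.0, 1),
    "full model": aic(-340.0, 4),
}
best = min(models, key=models.get)  # model with the smallest AIC
```

The 2p term penalizes extra parameters, so a richer model is preferred only when its likelihood gain outweighs the penalty.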