Advanced experimental design and analysis 1


Environmental Scientists, Biologists, and Resource Managers

C. J. Schwarz

Department of Statistics and Actuarial Science, Simon Fraser University

cschwarz@stat.sfu.ca


100 Logistic Regression - Advanced Topics

100.1 Introduction

100.2 Sacrificial pseudo-replication

100.3 Example: Fox-proofing mice colonies

100.3.1 Using the simple proportions as data

100.3.2 Logistic regression using overdispersion

100.3.3 GLIMM modeling the random effect of colony


Logistic Regression - Advanced Topics

100.1 Introduction

The previous chapters on chi-square tests, logistic regression, and logistic ANOVA considered only the simplest experimental designs, in which the data were collected under a completely randomized design (CRD), i.e. every observation is independent of every other observation, with complete randomization over experimental units and treatments.

It is possible to extend logistic regression and logistic ANOVA to more complex experimental designs.

My course notes for the graduate course Stat-805 (http://www.stat.sfu.ca/~cschwarz/Stat-805) have some details on these more advanced topics.

It is only recently that software has become readily available to analyze these types of experiments. The illustrations below use Proc GLIMMIX, available in SAS v.9.1.3 or higher.

In this chapter some variations from the simple CRD will be discussed.

100.2 Sacrificial pseudo-replication

In many experiments, the experimental unit is a collection of individuals, but measurements take place on the individual.


Here are the data (Table 6 of Hurlbert (1984)):

Treatment   Colony   % Males   Number males   Number females
Foxes       A1       63%        22             13
            A2       56%         9              7
No foxes    B1       60%        15             10
            B2       43%        97            130

This data has the characteristics of a chi-square test or logistic ANOVA. The factor (type of fencing) is categorical. The response, the sex of the mouse, is also categorical. Many researchers would simply pool over the replicates to give the pooled table:

Treatment   Colonies   % Males   Number males   Number females
Foxes       A1+A2      61%        31             20
No foxes    B1+B2      44%       112            140

If a $\chi^2$ test is applied to the pooled data, the p-value is less than 5%, indicating there is evidence that the sex ratio is not independent of the presence of foxes.
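The pooled test can be reproduced by hand. The following is a minimal sketch in Python (not the SAS used in these notes); the function name `pearson_chi_square` is an illustrative choice, and the p-value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal.

```python
import math

# Pooled counts from the text: rows = (foxes, no foxes), columns = (males, females)
observed = [[31, 20], [112, 140]]

def pearson_chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

chi2 = pearson_chi_square(observed)
# For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.3f}")
```

The statistic comes out at about 4.5 with a p-value a little above 0.03, which is the (misleading) "significant" result of the pooled analysis.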

This “pooled analysis” is INCORRECT. According to Hurlbert (1984), the major problem is that individual units (the mice) are treated as independent objects when, in fact, they are not. Experimenters often pool experimental units from disparate sets of observations in order to do simple chi-square tests or logistic ANOVA. He specifically labels this pooling as sacrificial pseudo-replication.

Hurlbert (1984) identifies at least 4 reasons why the pooling is not valid:

• non-independence of observations. The 35 mice caught in A1 can be regarded as 35 observations all subject to a common cause, as can the 16 mice in A2, as each group was subject to a common influence in its patch. Consequently, the pooled mice are NOT independent; they represent two sets of interdependent or correlated observations. The pooled data set violates the fundamental assumption of independent observations.

• throws away some information. The pooling throws out the information on the variability among replicate plots. Without such information there is no proper way to assess the significance of the differences between treatments. Note that in previous cases of ordinary pseudo-replication (e.g. multiple fish within a tank), this information is also discarded but is not needed: what is needed is the variation among tanks, not among fish. In the latter case, averaging over the pseudo-replicates causes no problems.

• confusion of experimental and observational units. If one carries out a test on the pooled data,


• unequal weighting. Pooling weights the replicate plots differentially. For example, suppose that one enclosure had 1000 mice with 90% being male, and a second enclosure had 10 mice with 10% being male. The pooled data would have 1000 + 10 mice with 900 + 1 being male, for an overall male proportion of about 89%. Had the two enclosures been given equal weight, the average male percentage would be (90% + 10%)/2 = 50%. In the above example, the number of mice captured in the plots varies from 16 to over 200; the plot with over 200 mice essentially drives the results.
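The weighting issue is easy to see numerically. Below is a small Python sketch (an illustration only, not part of the SAS analysis) that contrasts the pooled proportion with the equally weighted mean, first for the hypothetical 1000-mouse and 10-mouse enclosures and then for the colonies in Hurlbert's table; the function names are illustrative choices.

```python
# Each group is (number of males, total mice)
enclosures = [(900, 1000), (1, 10)]          # hypothetical example from the text
foxes = [(22, 35), (9, 16)]                  # colonies A1, A2
no_foxes = [(15, 25), (97, 227)]             # colonies B1, B2

def pooled_proportion(groups):
    """Proportion of males after pooling: large groups dominate."""
    return sum(m for m, _ in groups) / sum(n for _, n in groups)

def equal_weight_proportion(groups):
    """Simple mean of the per-group proportions: every group counts equally."""
    return sum(m / n for m, n in groups) / len(groups)

for label, groups in [("enclosures", enclosures),
                      ("foxes", foxes),
                      ("no foxes", no_foxes)]:
    print(f"{label:10s} pooled = {pooled_proportion(groups):.3f}  "
          f"equal weight = {equal_weight_proportion(groups):.3f}")
```

For the no-foxes colonies the two summaries differ noticeably because colony B2 contributes 227 of the 252 mice.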

There are multiple ways to analyze this data that avoid the problems that render the pooled analysis invalid. Unfortunately, JMP 7.0 does NOT have the ability to properly analyze this type of data. SAS will be used to illustrate the various options in the sections that follow.

100.3 Example: Fox-proofing mice colonies

Hurlbert (1984) cites the example of an experiment to investigate the effect of fox predation upon the sex ratio of mice. Four colonies of mice are established. Two of the colonies are randomly chosen and a fox-proof fence is erected around the plots. The other two colonies serve as controls without any fencing.

Here are the data (Table 6 of Hurlbert (1984)):

Treatment   Colony   % Males   Number males   Number females
Foxes       A1       63%        22             13
            A2       56%         9              7
No foxes    B1       60%        15             10
            B2       43%        97            130

We begin by reading in the data:

data mice;
   length treatment $10.;
   input colony $ treatment $ sex $ count;
   datalines;


100.3.1 Using the simple proportions as data

Hurlbert (1984) suggests the proper way to analyze the above experiment is to essentially compute a single number for each plot and then do a two-sample t-test on the percentages. [This is equivalent to the ordinary averaging process that takes place in ordinary pseudo-replication or sub-sampling.]

We can have SAS compute the proportion of males directly using Proc Transpose:

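/* transpose the data to compute the proportions */
proc sort data=mice;
   by colony treatment;
run;

proc transpose data=mice out=trans_mice;
   by colony treatment;
   var count;
   id sex;      /* assumed: sex is coded m/f, giving columns m and f */
run;

data trans_mice;
   set trans_mice;
   p_males = m / (m + f);   /* assumed step: proportion of males per colony */
   format m f 5.0 p_males 7.3;
run;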

This gives:

colony  treatment   m    f    p_males
a1      foxes       22   13   0.629
a2      foxes        9    7   0.563
b1      no.foxes    15   10   0.600
b2      no.foxes    97  130   0.427

Proc Ttest is then used to analyze the data:

proc ttest data=trans_mice ci=none;
   title2 'Simple ttest on the proportion of males';
   class treatment;
   var p_males;
   ods output Statistics=ttest_statistics;
   ods output TTests=ttest_tests;
run;


The simple summary statistics are:

Variable  treatment   N  Mean    Std Error  Lower 95% CL Mean  Upper 95% CL Mean
p_males   foxes       2  0.5955  0.0330      0.1758             1.0153
p_males   no.foxes    2  0.5137  0.0863     -0.5834             1.6108
p_males   Diff (1-2)  _  0.0819  0.0924     -0.3159             0.4796

The results from a simple t-test conducted in SAS are:

Variable  Method         Variances  t Value  DF      Pr > |t|
p_males   Pooled         Equal      0.89     2       0.4692
p_males   Satterthwaite  Unequal    0.89     1.2866  0.5102

The estimated difference in the sex ratio between colonies that are subject to fox predation and colonies not subject to fox predation is .082 (SE .092), with p-values of .47 (pooled t-test) and .51 (unpooled t-test) respectively. As the p-values are quite large, there is NO evidence of a predation effect.
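The pooled t-test on the four colony proportions is small enough to verify by hand. The sketch below is in Python rather than SAS and is only an illustration; it relies on the fact that a t distribution with exactly 2 degrees of freedom has a closed-form CDF, so no statistics library is needed.

```python
import math
import statistics

# (males, females) per colony, from Table 6 of Hurlbert (1984) as quoted above
foxes = [(22, 13), (9, 7)]
no_foxes = [(15, 10), (97, 130)]

def proportions(colonies):
    """Proportion of males in each colony."""
    return [m / (m + f) for m, f in colonies]

p1, p2 = proportions(foxes), proportions(no_foxes)
n1, n2 = len(p1), len(p2)
diff = statistics.mean(p1) - statistics.mean(p2)

# Pooled (equal-variance) two-sample t-test
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * statistics.variance(p1)
              + (n2 - 1) * statistics.variance(p2)) / df
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = diff / se_diff

# With df = 2, the two-sided p-value is 1 - |t| / sqrt(2 + t^2)
p_value = 1 - abs(t_stat) / math.sqrt(2 + t_stat ** 2)

print(f"diff = {diff:.4f}, SE = {se_diff:.4f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```

The difference, standard error, and p-value agree with the pooled line of the SAS output to the digits shown there.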

With only two replicates (the colonies), this experiment is likely to have very poor power to detect anything but gross differences.

The above analysis is not entirely satisfactory. The proportions of males have different variabilities because they are based on different numbers of total mice. As well, there may be overdispersion among colonies under the same treatment, i.e. the variation in the proportion of males among the two colonies under the same treatment may be larger than expected under binomial sampling.

100.3.2 Logistic regression using overdispersion

Another “approximate” method to deal with the potential overdispersion among the colonies within the same treatment group (the colony effect) is to use a standard logistic regression but use the goodness-of-fit test to estimate an overdispersion effect. This overdispersion is then used to adjust the standard errors of estimates and the test statistics for hypothesis tests. Please consult the chapter on Logistic Regression for more details.


proc genmod data=mice descending;
   title2 'Generalized Linear Model allowing for overdispersion';
   class treatment colony;
   model sex = treatment / dist=binomial link=logit dscale aggregate=colony type3;
   freq count;
   lsmeans treatment / diff cl;
   ods output ParameterEstimates=GenModParameterEstimates;
   ods output Type3=GenModType3;
   ods output Diffs=GenModLSmeansDiff;
run;

The Dscale option on the model statement indicates that the deviance goodness-of-fit test is used to estimate the overdispersion factor.

The bottom line of the parameter estimates table:

Parameter  Level1    DF  Estimate  Standard Error  Lower 95% CL  Upper 95% CL  Chi-Square  Pr > ChiSq
Intercept            1   -0.2231   0.1527          -0.5225        0.0762       2.13        0.1441
treatment  foxes     1    0.6614   0.3778          -0.0791        1.4019       3.06        0.0800
treatment  no.foxes  0    0.0000   0.0000           0.0000        0.0000       .           .
Scale                0    1.2049   0.0000           1.2049        1.2049       _           _

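estimates the scale parameter as 1.20. With the Dscale option this is the square root of the overdispersion factor (deviance/df = 1.2049² ≈ 1.45). This implies that standard errors are inflated by a factor of 1.20 and test statistics for effect tests are deflated by a factor of 1.45.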
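As a check on how the reported Scale value relates to the deviance, the following Python sketch (an illustration, not SAS output) computes the binomial deviance of the treatment-only model over the four colonies and takes the square root of deviance/df. The unadjusted standard error of the log odds ratio is taken from the pooled 2x2 table; the variable names are illustrative.

```python
import math

# (males, total mice) per colony, grouped by treatment
data = {
    "foxes": [(22, 35), (9, 16)],
    "no.foxes": [(15, 25), (97, 227)],
}

def deviance(groups_by_treatment):
    """Binomial deviance of the treatment-only model, one term per colony."""
    total = 0.0
    for colonies in groups_by_treatment.values():
        # Fitted proportion of males is the pooled proportion within a treatment
        p_fit = sum(y for y, _ in colonies) / sum(n for _, n in colonies)
        for y, n in colonies:
            mu = n * p_fit
            total += 2 * (y * math.log(y / mu)
                          + (n - y) * math.log((n - y) / (n - mu)))
    return total

dev = deviance(data)
df = 4 - 2                       # 4 colonies, 2 fitted treatment proportions
scale = math.sqrt(dev / df)      # square root of deviance/df

# Unadjusted SE of the log odds ratio from the pooled 2x2 table
se_raw = math.sqrt(1 / 31 + 1 / 20 + 1 / 112 + 1 / 140)
se_adj = scale * se_raw          # SE after the overdispersion adjustment

print(f"deviance/df = {dev / df:.4f}, scale = {scale:.4f}, adjusted SE = {se_adj:.4f}")
```

The computed scale and adjusted standard error agree with the Scale (1.2049) and treatment standard error (0.3778) rows of the table, which indicates that the standard errors are multiplied by the Scale value itself.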

The test for a treatment effect:

Source     Num DF  Den DF  F Value  Pr > F  Chi-Square  Pr > ChiSq  Method
treatment  1       2       3.14     0.2185  3.14        0.0765      LR


The estimated treatment effect on the sex ratio (on the logit scale) is:

Effect     treatment  _treatment  Estimate  SE      z Value  Pr > |z|  Alpha  Lower     Upper
treatment  foxes      no.foxes    0.6614    0.3778  1.75     0.0800    0.05   -0.07912  1.4019

The estimate of .66 implies that the odds ratio for a mouse being male, comparing colonies with foxes to colonies without foxes, is exp(.66) = 1.94, but the 95% confidence interval for the odds ratio runs from exp(−.0791) = .92 to exp(1.40) = 4.06, which includes the value of 1 (indicating no difference in the odds of males).
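The back-transformation from the logit scale to an odds ratio is a single exponentiation of the estimate and of each confidence limit. A minimal Python sketch, using the values from the table above (variable names are illustrative):

```python
import math

# Logit-scale estimate and 95% confidence limits for foxes vs no.foxes
estimate, lower, upper = 0.6614, -0.0791, 1.4019

odds_ratio = math.exp(estimate)
ci = (math.exp(lower), math.exp(upper))
includes_one = ci[0] < 1 < ci[1]   # True means "no difference" cannot be ruled out

print(f"odds ratio = {odds_ratio:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), "
      f"includes 1: {includes_one}")
```

Because the logit-scale interval contains 0, the odds-ratio interval necessarily contains 1.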

The use of a simple overdispersion factor is not completely satisfactory. It assumes a single correction factor for all of the estimates and again ignores the different numbers of mice in each colony.

100.3.3 GLIMM modeling the random effect of colony

A more “refined” analysis is now available using Generalized Linear Mixed Models (GLIMM) which have been implemented in SAS.

GLIMM allow the specification of random effects in much the same way as in advanced ANOVA models. This is a very general treatment and now allows us to analyze data from very complex experimental designs.

In this example, the model would be specified as:

$$\text{logit}(p_{males}) = Treatment \quad Colony(Treatment)(R)$$

where the $Colony(Treatment)$ term would be the random effect of the experimental units (the colonies). A logistic type model is used.

This is specified in SAS as:

proc glimmix data=mice;
   title2 'Glimmix analysis';
   class treatment colony;
   model sex(event='m') = treatment / distribution=binary link=logit ddfm=kr;
   random colony(treatment);
   freq count;
   lsmeans treatment / cl ilink;
   lsmestimate treatment "trt effect" 1 -1 / cl;
   ods output CovParms=GlimmixCovParms;
   ods output LSMestimates=GlimmixLSMestimates;
run;

The output from GLIMMIX (SAS 9.3.1) follows. First is an estimate of the variability among colonies (on the logit scale):

Parameter Estimate Standard Error

colony(treatment) 0.06892 0.1675

Next is the test of the overall treatment effect:

Effect     Num DF  Den DF  F Value  Pr > F
treatment  1       1.847   1.22     0.3919

The p-value is .39; again there is no evidence of a predation effect on the proportion of males in the colonies.

Finally, an estimate of the treatment effect:

Effect     Label       Estimate  Standard Error  DF     t Value  Pr > |t|  Alpha  Lower    Upper
treatment  trt effect  0.5269    0.4763          1.847  1.11     0.3919    0.05   -1.6940  2.7478

Some caution is required. The estimate of .53 (SE .47) is the difference in the logit of the proportion of males between colonies with predators and colonies without predators. If you take exp(.53) = 1.69, this is the estimated odds ratio of males to females comparing colonies with predators to colonies without predators. The 95% confidence interval for the odds ratio is from exp(−1.6940) = .184 to exp(2.7478) = 15.61, which includes the value of 1 (indicating no effect).

Consult the chapter on logistic regression for an explanation of odds and odds-ratios.

100.4 Example: Over-dispersed Seeds Germination Data

This data is from the SAS manual.

In a seed germination test, seeds of two cultivars were planted in pots of two soil conditions. The following data contains the observed proportion of seeds that germinated for various combinations of cultivar and soil condition. Variable n represents the number of seeds planted in a pot, and r represents the


respectively.

The SAS program that analyzed this data is available in the file germination.sas in the Sample Program Library at: http://www.stat.sfu.ca/~cschwarz/Stat-650/Notes/MyPrograms.

Notice that the experimental unit is the pot (i.e. soil and cult were applied at the pot level), but the observational unit (what is actually measured) is the individual seed. The response variable for each individual seed is either yes or no, depending on whether it germinated or not.

First, how big is the pot random effect? One way to estimate this is to compare the variation of $\hat{p}$ among pots within the same soil-cultivar combination with the theoretical variation based on binomial sampling within each pot. To account for the differing sample sizes in each pot, we compute a “standardized normal” variable for pot $i$ within soil-cultivar combination $j$ as:

$$z_{ij} = \frac{\hat{p}_{ij} - p_j}{\sqrt{p_j (1 - p_j) / n_{ij}}}$$

If the additional pot-to-pot random variation was negligible, then Z should have an approximate standard normal distribution with a variance of 1. The actual variance of Z was found to be 4.5, indicating that the pot-to-pot variation in $\hat{p}$ was about 4× larger than expected from simple binomial variation.
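The standardization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standardizing variance is taken to be the binomial variance $p_j(1-p_j)/n_{ij}$, $p_j$ is estimated by the pooled proportion for the combination, and apart from the first pot (8 of 16 seeds, quoted in these notes) the pot counts are hypothetical, not the real germination data.

```python
import math
import statistics

# Pots for ONE soil-cultivar combination as (germinated r, planted n).
# Only the first pot (8 of 16) comes from the notes; the others are HYPOTHETICAL.
pots = [(8, 16), (20, 30), (10, 45), (40, 60), (12, 50)]

def standardized_residuals(pots):
    """z = (p_hat - p_j) / sqrt(p_j (1 - p_j) / n) for each pot."""
    # Assumption: p_j is estimated by the pooled proportion over the pots
    p_j = sum(r for r, _ in pots) / sum(n for _, n in pots)
    return [(r / n - p_j) / math.sqrt(p_j * (1 - p_j) / n) for r, n in pots]

z = standardized_residuals(pots)
var_z = statistics.variance(z)

# Under pure binomial sampling var_z should be near 1; a much larger
# value points to extra pot-to-pot variation.
print([round(v, 2) for v in z], round(var_z, 2))
```

With these made-up counts the variance of the z values is well above 1, which is the pattern the text describes for the real data.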

Because of this extra-binomial variation, it is not proper to simply “ignore” the pot and pool over the five pots for each cultivar-soil combination. This would be an example of sacrificial pseudo-replication as outlined by Hurlbert (1984). As you will see below, the pot-to-pot variation in the proportion that germinate is more than can be explained by simple binomial variation, i.e. there is a large random effect of pots that must be incorporated.

A naive analysis could proceed by finding the proportion of seeds that germinated in each pot (e.g. for pot 1, $\hat{p} = 8/16 = 0.50$) and then doing a two-factor CRD analysis on these proportions using the model:

$$\hat{p} = Soil \quad Cult \quad Soil*Cult$$

This is not satisfactory because the number of seeds in each pot (n) varies considerably from pot to pot, and hence the variance of $\hat{p}$ also varies. A weighted analysis could be performed, which would partially solve this problem.

This naive analysis could be done using the Proc Mixed code:

proc mixed data=seeds;
   title2 'naive analysis on the proportions in each plot - using n as weight';
   class cult soil;
   model phat = cult soil cult*soil / ddfm=kr;
   lsmeans soil cult soil*cult / diff adjust=tukey;
   weight n;
   ods output CovParms=NaiveMixedCovparms;
   ods output Tests3=NaiveMixedTests3;
   ods output LSmeans=NaiveMixedLSMeans;
run;

This gives an estimate of the residual variance of:

Cov Parm  Estimate
Residual  1.0057

The residual variance is a combination of the pot-to-pot variance and the variability of $\hat{p}$ within each pot. As a rough guess, the average germination rate is around 0.5 with an average sample size of around 45. This would give a binomial variance of .5(1 − .5)/45 ≈ .0056. Hence the pot-to-pot variance is about .026 − .0056 ≈ .020, which is about 4× that of the binomial variance, as we saw earlier.


The following results are obtained for the tests of the main effects and interactions:

Effect     Num DF  Den DF  F Value  Pr > F
cult       1       16       1.57    0.2287
soil       1       16      10.86    0.0046
cult*soil  1       16       0.05    0.8177

Hence the naive analysis finds no evidence of an interaction effect of soil and cultivar, no evidence of a main effect of cultivar, but strong evidence of a main effect of soil upon the germination rate.

Here are the estimates of the marginal means:

Effect     cult  soil  Estimate  Standard Error  DF  t Value  Pr > |t|
soil             0     0.3720    0.04891         16   7.61    <.0001
soil             1     0.5952    0.04688         16  12.70    <.0001
cult       0           0.5260    0.05172         16  10.17    <.0001
cult       1           0.4412    0.04376         16  10.08    <.0001
cult*soil  0     0     0.4064    0.07334         16   5.54    <.0001
cult*soil  0     1     0.6455    0.07295         16   8.85    <.0001
cult*soil  1     0     0.3375    0.06473         16   5.21    <.0001
cult*soil  1     1     0.5448    0.05889         16   9.25    <.0001

The estimated marginal mean germination rates are relatively precise. The standard errors are not equal because of the differing sample sizes in the pots in the various soil-cultivar combinations.

The pots serve as “clusters” in this experiment, so the ad hoc methods of correcting for overdispersion caused by “cluster” effects can also be used, i.e. estimating the overdispersion factor ($\hat{c}$) and multiplying the standard errors by $\sqrt{\hat{c}}$. This can be done using Proc Genmod in SAS:

proc genmod data=seeds;
   title2 'overdispersion model';
   class cult soil;
   model r/n = cult soil cult*soil / dist=binomial link=logit scale=deviance type3;
   lsmeans cult soil cult*soil / diff;
   ods output LSmeans=GenmodLSmeans;
run;

This will use the ratio of the deviance to its degrees of freedom to estimate the overdispersion factor. This gives the following estimate for the overdispersion factor:

Criterion DF Value Value/DF

Deviance 16 68.3465 4.2717

Pearson Chi-Square 16 66.7619 4.1726

Once again we see that the overall variation is over 4× larger than expected under the binomial model.

It is not possible to obtain an explicit estimate of the actual pot-to-pot variance.
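The overdispersion factor and the resulting inflation of the standard errors follow directly from the goodness-of-fit table. A short Python sketch of the arithmetic (variable names are illustrative):

```python
import math

# Values from the goodness-of-fit table above
deviance, pearson, df = 68.3465, 66.7619, 16

c_hat = deviance / df              # overdispersion factor based on the deviance
c_hat_pearson = pearson / df       # alternative estimate based on Pearson chi-square
se_inflation = math.sqrt(c_hat)    # multiplier applied to the binomial standard errors

print(f"c-hat = {c_hat:.4f} (Pearson: {c_hat_pearson:.4f}), "
      f"SE multiplier = {se_inflation:.3f}")
```

So the standard errors from a plain binomial fit are roughly doubled once the overdispersion is taken into account.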

The tests for effects are:

Source     Num DF  Den DF  F Value  Pr > F  Chi-Square  Pr > ChiSq  Method
cult       1       16       1.55    0.2317   1.55       0.2138      LR
soil       1       16      10.39    0.0053  10.39       0.0013      LR
cult*soil  1       16       0.05    0.8325   0.05       0.8298      LR

The results are similar to the naive analysis seen earlier.

The estimated “means” are now on the logit scale:

Statement Number  Effect     cult  soil  Estimate  Standard Error  z Value  Pr > |z|
1                 cult       0            0.1103   0.2199           0.50    0.6161
1                 cult       1           -0.2473   0.1864          -1.33    0.1846
1                 soil             0     -0.5266   0.2087          -2.52    0.0116
1                 soil             1      0.3896   0.1989           1.96    0.0501
1                 cult*soil  0     0     -0.3788   0.3077          -1.23    0.2183
1                 cult*soil  0     1      0.5993   0.3143           1.91    0.0565
1                 cult*soil  1     0     -0.6745   0.2821          -2.39    0.0168


These can be converted back to the regular scale using the inverse transformation:

$$\hat{p}_{cult=0} = \text{expit}(.1103) = .53$$

which is similar to the previous results. Note that the se must be converted using the delta-method, and not simply by applying the expit transformation, and is found to be

$$se(\hat{p}_{cult=0}) = se(\text{logit}(\hat{p}_{cult=0}))\,\hat{p}_{cult=0}\,(1-\hat{p}_{cult=0}) = .2199(.53)(1-.53) = .055$$

which again matches the naive analysis.
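The back-transformation and the delta-method standard error can be checked with a few lines of Python. This is an illustrative sketch, with `expit` defined locally rather than taken from any library:

```python
import math

def expit(x):
    """Inverse of the logit transformation."""
    return 1 / (1 + math.exp(-x))

# Logit-scale estimate and standard error for cult=0 from the table above
logit_mean, logit_se = 0.1103, 0.2199

p_hat = expit(logit_mean)
# Delta method: the derivative of expit(x) is p(1 - p)
se_p_hat = logit_se * p_hat * (1 - p_hat)

print(f"p-hat = {p_hat:.3f}, se = {se_p_hat:.3f}")
```

The same two lines can be applied to any other row of the logit-scale table.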

Finally, a logistic ANOVA that explicitly models the random effects of pots can be done using Proc Glimmix in SAS:

proc glimmix data=seeds plots=all;   /* plots= requests residual and other plots */
   title2 'random effect model';
   class cult soil pot;
   model r/n = cult soil cult*soil / dist=binomial link=logit ddfm=kr;
   random pot(cult*soil) / type=vc;
   lsmeans cult / diff adjust=tukey plots=(meanplot(cl ilink) diffplot);
   lsmeans soil / diff adjust=tukey plots=(meanplot(cl ilink) diffplot);
   lsmeans soil / diff cl oddsratio;   /* get the odds ratio */
   lsmeans cult*soil / diff adjust=tukey plots=(meanplot(cl ilink) diffplot);
   ods output CovParms=GlimmixCovParms;
   ods output Tests3=GlimmixTests3;
   ods output LSmeans=GlimmixLSmeans;
   ods output Diffs=GlimmixDiffs;
run;

This corresponds to the model in the shorthand notation:

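$$\text{logit}(p) = cult \quad soil \quad cult*soil \quad pots(cult*soil)-R$$

or the generalized linear model:

$$r_{ij} \sim \text{Binomial}(n_{ij}, p_{ij})$$

$$\theta_{ij} = \text{logit}(p_{ij})$$

$$\theta_{ij} = cult \quad soil \quad cult*soil \quad pots(cult*soil)-R$$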

Note that pots are nested within each cultivar-soil combination.

This procedure can also produce residual and other plots as requested using the plots keyword.

The estimated pot-to-pot variance (on the logit scale) is found as:


Parameter Estimate Standard Error

pot(cult*soil) 0.3208 0.1571

There is no easy way to convert this to an estimate of the pot-to-pot variation on the regular scale.

Tests for main effects and interactions are:

Effect     Num DF  Den DF  F Value  Pr > F
cult       1       15.07   1.00     0.3342
soil       1       15.07   7.71     0.0141
cult*soil  1       15.07   0.02     0.8791

These match the results seen earlier.

Similarly, estimates of main effects (on the logit scale) are found to be:

Effect     cult  soil  Estimate  Standard Error  DF     t Value  Pr > |t|
cult       0            0.02619  0.2164          16      0.12    0.9052
cult       1           -0.2705   0.2039          13.77  -1.33    0.2064
soil             0     -0.5350   0.2096          15.19  -2.55    0.0219
soil             1      0.2907   0.2109          14.96   1.38    0.1883
soil             0     -0.5350   0.2096          15.19  -2.55    0.0219
soil             1      0.2907   0.2109          14.96   1.38    0.1883
cult*soil  0     0     -0.4096   0.3001          15.87  -1.36    0.1913
cult*soil  0     1      0.4620   0.3118          16      1.48    0.1578
cult*soil  1     0     -0.6603   0.2927          14.51  -2.26    0.0400
cult*soil  1     1      0.1194   0.2841          13.04   0.42    0.6811

These are again comparable to the previous results.

The advantage of the Glimmix procedure is the availability of model diagnostic plots. For example, the


indicates the potential presence of at least one outlier whose germination rate (on the logit scale) is well below that predicted. The normal-probability plot (on the logit scale) also identifies this one potential outlier.

Estimates of the differences between the mean logits can also be found and the listing is available in the SAS output. For example, the estimated difference (on the logit scale) between the two marginal means of the soil conditions is:

Effect  soil  _soil  Estimate  Standard Error  DF  t Value  Pr > |t|  Adjustment  Adj P

Odds Ratio  Alpha  Lower Confidence Limit for Odds Ratio  Upper Confidence Limit for Odds Ratio
0.438       0.05   0.232                                  0.825

Hence the estimated difference in the mean logits is −.83 (SE .30). This can be converted to an odds ratio using the methods seen earlier, i.e.

$$OR_{soil\,0:1} = \exp(-.8257) = .44$$

which implies that the odds of germination in soil=0 are only 44% of the odds of germination in soil=1. The 95% confidence interval for the odds ratio is from .23 to .83.