Role of Statistics in research
Validity
Will this study help answer the research question?
Analysis
What analysis, and how should this be interpreted and reported?
Efficiency
Validity
Will the study answer the research question?
Surveys
select a sample from a population
describe, but can’t explain
can identify relationships, but
Surveys & Causality
In a survey:
farm income increased by 10% for each increase in fertiliser of 30 kg/ha
Is this relationship causal? Not necessarily, other factors are involved:
Managerial ability
Farm size
Educational level of farmer
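The point about other factors can be illustrated with a small simulation. This is only a sketch under invented assumptions: the numbers, the function names `simulate_survey` and `pearson`, and the idea that managerial ability alone drives both fertiliser use and income are all hypothetical and not taken from any real survey. In the simulated data fertiliser has no effect on income at all, yet the two are strongly correlated.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_survey(n_farms=2000, seed=0):
    """Hypothetical farms: ability drives both fertiliser use and income."""
    rng = random.Random(seed)
    fertiliser, income = [], []
    for _ in range(n_farms):
        ability = rng.gauss(0, 1)  # unobserved managerial ability
        fertiliser.append(100 + 30 * ability + rng.gauss(0, 10))
        # Income depends on ability only; there is no fertiliser term.
        income.append(50 + 5 * ability + rng.gauss(0, 2))
    return fertiliser, income


fertiliser, income = simulate_survey()
r = pearson(fertiliser, income)
print(f"correlation between fertiliser and income: {r:.2f}")
```

A survey of these farms would report a clear relationship between fertiliser and income, although by construction increasing fertiliser would change nothing.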
Survey Unit
Example:
In a survey to assess the height of Irish males vs English males, the unit is the individual male in that
one would sample a number of males of each country and take their heights rather than measure one male from
Comparing treatment effect
A well designed experiment leads to one of two conclusions:
Either the treatments have produced the observed effect
or
An improbable (chance < 1:20, 1:100 etc) event has occurred
Technically we calculate a p-value of the data:
i.e. the probability of obtaining an effect as large as that observed when in fact the average effect is zero
Essential elements of a designed experiment
1. COMPARATIVE: The objective is to compare a number (> 1) of treatments
2. REPLICATION: Each treatment is tested on more than one experimental unit
3. RANDOMISATION
Replication
Each treatment is tested on more than one experimental unit (the population item that receives the treatment)
To compare treatments we need to know the inherent variability of units receiving the same treatment: the background noise
Replication: 2 facts
Our faith in treatment means will:
Increase with greater replication
Decrease when noise increases
In particular the standard error of difference (SED) between 2 treatment means is
$\mathrm{SED} = s\sqrt{2/r}$
where:
r = (common) replication;
s = typical difference between observations from same treatment
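The two facts can be checked numerically. The sketch below assumes the usual form SED = s * sqrt(2 / r), which is the form used in the salmon calculation later in these slides (135 x (2/20)^0.5 = 42.7); the function name `sed` and the illustrative values of s and r are chosen here for demonstration only.

```python
import math


def sed(s, r):
    """Standard error of the difference between two treatment means.

    Assumes SED = s * sqrt(2 / r), with s the typical variation between
    units on the same treatment and r the common replication.
    """
    if r < 1:
        raise ValueError("replication r must be at least 1")
    return s * math.sqrt(2 / r)


# More replication shrinks the SED (illustrative s = 10).
for r in (2, 4, 8, 16):
    print(f"s = 10, r = {r:2d}: SED = {sed(10.0, r):.2f}")

# More noise inflates the SED (illustrative r = 8).
for s in (5.0, 10.0, 20.0):
    print(f"s = {s:4.1f}, r = 8: SED = {sed(s, 8):.2f}")
```

Because r sits under a square root, halving the SED takes four times the replication, whereas halving the noise s halves the SED directly.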
Validity & Efficiency
Validity: The first requirement of an experiment is that it be valid. Otherwise it is at best a waste of time and resources and at worst it is misleading.
Efficiency: the use of experimental resources to get the most precise answer to the question being asked, is not an absolute requirement but is certainly desirable
Pseudoreplication
- how to invalidate your experiment!
Treating multiple measurements on the same unit as if they were measurements on independent units
Example: In an experiment testing the effect of a
Example:
In an experiment to compare three cultivars of grass, a rectangular tray was assigned at random to each treatment. Trays were filled with John Innes Number 2 compost and 54 seedlings of the appropriate cultivar were planted in a rectangular pattern in each tray. After ten weeks the 28 central plants were harvested, dried and weighed and the 84 plant weights recorded. What was the
Example:
In an experiment to compare three cultivars of grass, 7 square pots were assigned at random to each treatment. Pots were filled with John Innes Number 2 compost and 16 seedlings of the appropriate cultivar planted in a square pattern in each pot. After ten weeks the 4 central plants were harvested, dried and weighed. Thus 84 plant weights were recorded. What is the
Randomisation
- allocating treatments to units
Ensures the only systematic force working on experimental units is that produced by the treatments
All other factors that might affect the outcome are randomly allocated
Randomisation - how it works
What do we mean by ‘In a randomised experiment any difference between the mean response on different treatments is due to treatment
Example: Suppose 8 experimental units, allocated at random to two treatments.

Unit                               1     2     3     4     5     6     7     8
Response if treated the same       4.1   5.3   7.2   2.6   3.5   6.4   5.5   4.7
Allocated at random to treatment   T1    T1    T2    T2    T2    T1    T2    T1
Treatment effect                   0     0     2     2     2     0     2     0
Experimental response              4.1   5.3   9.2   4.6   5.5   6.4   7.5   4.7

Mean response: T1 5.13, T2 6.70
The estimated treatment effect is the difference between these two means, 6.70 - 5.13 = 1.57. It is partly influenced by the treatment effect (2 units) and partly by the variation between experimental units, the background noise.
Now suppose the most extreme allocation, with the poorest experimental units receiving T2.

Unit                               1     2     3     4     5     6     7     8
Response if treated the same       4.1   5.3   7.2   2.6   3.5   6.4   5.5   4.7
Allocated at random to treatment   T2    T1    T1    T2    T2    T1    T1    T2
Treatment effect                   2     0     0     2     2     0     0     2
Experimental response              6.1   5.3   7.2   4.6   5.5   6.4   5.5   6.7

Mean response: T1 6.10, T2 5.73
The estimated treatment effect is 5.73 - 6.10 = -0.37. Again it is partly influenced by the treatment effect (+2) and partly by the variation between experimental units, the background noise. The treatment effect is
Again consider the same extreme allocation but with a larger treatment effect.

Unit                               1     2     3     4     5     6     7     8
Response if treated the same       4.1   5.3   7.2   2.6   3.5   6.4   5.5   4.7
Allocated at random to treatment   T2    T1    T1    T2    T2    T1    T1    T2
Treatment effect                   10    0     0     10    10    0     0     10
Experimental response              14.1  5.3   7.2   12.6  13.5  6.4   5.5   14.7

Mean response: T1 6.10, T2 13.73
Three points:
The observed treatment difference is due only to treatment effect and variation.
If the treatment effect is large relative to the background noise then even an extreme allocation will not obscure the treatment effect. (Signal/Noise ratio).
If the number of experimental units is large then a treatment effect will usually be more obvious, since an extreme allocation of experimental units is less likely.
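The three points can be checked by listing every possible allocation of the 8 units from the example, 4 to each treatment. The sketch below uses the baseline responses given in the tables; the function names are invented for illustration, and equal group sizes of 4 are assumed as in the example.

```python
from itertools import combinations

# Response of each unit if all were treated the same (units 1..8).
baseline = [4.1, 5.3, 7.2, 2.6, 3.5, 6.4, 5.5, 4.7]


def estimated_effect(t2_units, effect):
    """Mean(T2) - mean(T1) when the units in t2_units (0-based) get T2."""
    t2 = [baseline[i] + effect for i in t2_units]
    t1 = [baseline[i] for i in range(len(baseline)) if i not in t2_units]
    return sum(t2) / len(t2) - sum(t1) / len(t1)


def all_estimates(effect):
    """Estimated effect under each of the 70 possible 4-versus-4 allocations."""
    return [estimated_effect(set(c), effect)
            for c in combinations(range(len(baseline)), 4)]


first = estimated_effect({2, 3, 4, 6}, 2)    # units 3, 4, 5, 7 receive T2
extreme = estimated_effect({0, 3, 4, 7}, 2)  # poorest units 1, 4, 5, 8 receive T2
estimates = all_estimates(2)

print(f"first allocation:   {first:.3f}")
print(f"extreme allocation: {extreme:.3f}")
print(f"range over all {len(estimates)} allocations: "
      f"{min(estimates):.3f} to {max(estimates):.3f}")
print(f"average over all allocations: {sum(estimates) / len(estimates):.3f}")
print(f"worst case with effect 10: {min(all_estimates(10)):.3f}")
```

Averaged over all allocations the estimate equals the true effect, which is the sense in which randomisation leaves only treatment and chance as explanations. Any single allocation can still mislead when the effect is small relative to the noise.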
Tests of Hypotheses - Tests of Significance
Survey: Are the observed differences between groups compatible with a view that there are no differences between the populations from which the samples of values are drawn?
Designed experiments: Are observed differences between treatment means
Designed experiment - only two explanations for a negative answer: the difference is due to the applied treatments or a chance effect
Survey is silent in distinguishing
Example
An experiment on artificially raised salmon compared two treatments and 20 fish per treatment. Average gains (g) over the experimental period were 1210 and 1320. Variation between fish within a group was RSE = 135 g
Procedure
a) NULL HYPOTHESIS
Treatments have no effect and any difference observed between groups treated differently is due to chance (variation in the experimental material)
b) Measure
- the variation between groups treated differently
- the variation expected if due solely to chance
d) The observed difference could have occurred by chance. Statistical theory gives rules to determine how likely a given difference in variation is liable to be by chance.
e) SIGNIFICANCE TEST
Face the choice.
- This difference in variation could have occurred by chance with probability ? (5%, 1%, etc)
OR
- There is a real difference (produced by treatment).
Example: The t test
a) NULL HYPOTHESIS - Treatment does not affect salmon growth rate
b) Observed difference between groups: 1320 - 1210 = 110
Variation expected solely from chance: $135 \times \sqrt{2/20} = 42.7$
c) Test Statistic
t = 110/42.7 = 2.58
d) Statistical theory (t tables) shows that the chance of a value as large as 2.58 is about 1 in 100
e) Make the choice
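The salmon calculation can be reproduced in a few lines. This is a sketch, not the slides' own method: it assumes that RSE is the pooled within-group standard deviation with 2 x (20 - 1) = 38 degrees of freedom, and it replaces the t tables with a numerical integration of the Student t density (the helper names `t_pdf` and `two_sided_p` are invented here).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|), integrating the density on [0, |t|] by Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


mean_a, mean_b = 1210, 1320   # average gains (g)
rse, r = 135, 20              # residual SD (g) and fish per treatment

diff = mean_b - mean_a        # b) observed difference
sed = rse * math.sqrt(2 / r)  # b) variation expected from chance
t = diff / sed                # c) test statistic
df = 2 * (r - 1)              # assumed: pooled residual degrees of freedom
p = two_sided_p(t, df)        # d) chance of a |t| at least this large

print(f"difference = {diff}, SED = {sed:.1f}, t = {t:.2f}, df = {df}")
print(f"two-sided p = {p:.4f}, one-sided p = {p / 2:.4f}")
```

Under these assumptions the one-sided probability is a little under 1 in 100 and the two-sided probability a little over, in line with the "about 1 in 100" read from t tables.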
Responsibility of the Researcher and Statistician

PLANNING PHASE
Researcher                                    Statistician
Seek statistical training                     Keep up to date with statistical technology
Seek statistical advice                       Teach principles
Use minimum experiment size                   Provide statistical input to plan
Select experimental material properly         Give researcher different alternatives

EXECUTION PHASE
Researcher                                    Statistician
Carry out study as planned                    Give road map for execution
Log important dates related to data

ANALYSIS PHASE
Researcher                                    Statistician
Study data patterns                           Assist in studying data patterns
Keep integrity of data set
Choose proper statistical analytical          Assist in choosing analytical procedure
procedure
Choose probability levels, contrasts to       Assist in choosing probability levels,
make, etc.                                    contrasts, etc.

INTERPRETATION AND REPORTING
Researcher                                    Statistician
Provide description of statistical methods    Assist in writing statistical methods used
Present results in such a way that reader     Review interpretation. Modification if
can evaluate the interpretation               necessary
Responsibility of Researcher and Statistician
Promote high standards of scientific inquiry and professionalism
Involve appropriate techniques for research
Honor the rights of other researchers – give credit to other researchers where due
Consider interdependence of natural, social and technological systems
Give objectivity a major role
Good Practices Checklist
Planning is very important in experimentation
Statistician can assist in planning
Planning does not ensure success but avoids built-in disasters
Statistics cannot compensate for negative
Good Planning Can Prevent:
Costly waste of resources
Difficult statistical analysis
Data for which interpretation is controversial
Setting Up Original Hypothesis Objectively
2 Rules:
1. Hypothesis should be clearly related to original problem
IV. Discipline Specific Ethical Issues
Flexibility needed: Ethics vary Among Different Application Areas
Business Application: Withholding Negative Results
Problem Formulation Important – Involve Statistician
Statistician provides report but does not make decisions for management
Company should have same responsibilities to a salaried statistician as to a consulting one (and conversely).
See: Deming, W.E., Sample Designs in Business
Medical Application
Medical review boards
Informed consent
Methods of selecting subjects
Withholding a treatment from a control group
Access to data
V. Ethical Issues in Interpretation and Reporting
Insufficient statistical methods description
Statistical significance vs. practical significance
Access to data
Kinds of means reported in factorial experiments
Reporting of measures of dispersion
Proper decimal reporting
Bona fide scientific conclusions vs. speculation
Clarity of reporting
Indication that results are not final word
VI. Case Studies
Skagerrak Case – Precautionary Principle
2 highly respected scientists interpret their results differently
Case emphasized 2 critical aspects of research
1. The actual statistical analysis
2. How and when to disseminate the information from research
Elton’s Withholding of Anomalous Data