Role of Statistics in Research

(1)

(2)

Role of Statistics in research

Validity

Will this study help answer the research question?

Analysis

What analysis, & how should this be interpreted and reported?

Efficiency

(3)

Validity

Will the study answer the research question?

Surveys

select a sample from a population

describe, but can’t explain

can identify relationships, but

(4)


(5)

Surveys & Causality

In a survey:

farm income increased by 10% for each increase in fertiliser of 30 kg/ha

Is this relationship causal? Not necessarily, other factors are involved:

Managerial ability

Farm size

Educational level of farmer

(6)


(7)

Survey Unit

Example:

In a survey to assess the height of Irish males vs English males, the unit is the individual male in that one would sample a number of males of each country and take their heights rather than measure one male from

(8)

(9)

Comparing treatment effect

A well designed experiment leads to the conclusion:

Either the treatments have produced the observed effect

or

An improbable (chance < 1:20, 1:100 etc) event has occurred

Technically we calculate a p-value of the data:

i.e. the probability of obtaining an effect at least as large as that observed when in fact the average effect is zero

(10)

Essential elements of a designed experiment

1. COMPARATIVE

The objective is to compare a number (> 1) of treatments

2. REPLICATION

Each treatment is tested on more than one experimental unit

3. RANDOMISATION

(11)

Replication

Each treatment is tested on more than one experimental unit (the population item that receives the treatment)

To compare treatments we need to know the inherent variability of units receiving the same treatment

background noise

(12)

Replication: 2 facts

Our faith in treatment means will:

Increase with greater replication

Decrease when noise increases

In particular, the standard error of difference (SED) between 2 treatment means is $\mathrm{SED} = s\sqrt{2/r}$, where:

r = (common) replication;

s = typical difference between observations from the same treatment
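
The two facts about replication can be checked numerically. The sketch below assumes the usual form SED = s * sqrt(2/r) (the same arithmetic as the salmon example later in these slides) and assumes normally distributed noise; the function names `sed` and `simulated_sed` are illustrative only.

```python
import math
import random
import statistics


def sed(s, r):
    """Standard error of the difference between two treatment means,
    each based on r units with within-treatment standard deviation s."""
    return s * math.sqrt(2 / r)


def simulated_sed(s, r, n_experiments=10000, seed=1):
    """Empirical SD of (mean A - mean B) over many simulated experiments
    in which neither treatment has any effect (pure background noise)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_experiments):
        a = [rng.gauss(0, s) for _ in range(r)]
        b = [rng.gauss(0, s) for _ in range(r)]
        diffs.append(statistics.fmean(a) - statistics.fmean(b))
    return statistics.stdev(diffs)


# More replication -> smaller SED; more noise -> larger SED.
for s, r in [(1.0, 4), (1.0, 16), (2.0, 4)]:
    print(f"s={s} r={r}  formula={sed(s, r):.3f}  simulated={simulated_sed(s, r):.3f}")
```

Quadrupling the replication halves the SED, while doubling the noise doubles it.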

(13)

Validity & Efficiency

Validity: The first requirement of an experiment is that it be valid. Otherwise it is at best a waste of time and resources and at worst it is misleading.

Efficiency: The use of experimental resources to get the most precise answer to the question being asked. It is not an absolute requirement but is certainly desirable.

(14)

Pseudoreplication - how to invalidate your experiment!

Treating multiple measurements on the same unit as if they were measurements on independent units

Example: In an experiment testing the effect of a

(15)

Example:

In an experiment to compare three cultivars of grass, a rectangular tray was assigned at random to each treatment. Trays were filled with John Innes Number 2 compost and 54 seedlings of the appropriate cultivar were planted in a rectangular pattern in each tray.

After ten weeks the 28 central plants were harvested, dried and weighed and the 84 plant weights recorded. What was the

(16)

(17)

Example:

In an experiment to compare three cultivars of grass, 7 square pots were assigned at random to each treatment. Pots were filled with John Innes Number 2 compost and 16 seedlings of the appropriate cultivar planted in a square pattern in each pot.

After ten weeks the 4 central plants were harvested, dried and weighed. Thus 84 plant weights were recorded. What is the

(18)

(19)

Randomisation - allocating treatments to units

Ensures the only systematic force working on experimental units is that produced by the treatments

All other factors that might affect the outcome are randomly allocated

(20)

Randomisation - how it works

What do we mean by ‘In a randomised experiment any difference between the mean response on different treatments is due to treatment

(21)

Example: Suppose 8 experimental units, allocated at random to two treatments.

Unit                                1     2     3     4     5     6     7     8
Response if treated the same        4.1   5.3   7.2   2.6   3.5   6.4   5.5   4.7
Allocated at random to treatment    T1    T1    T2    T2    T2    T1    T2    T1
Treatment effect                    0     0     2     2     2     0     2     0
Experimental response               4.1   5.3   9.2   4.6   5.5   6.4   7.5   4.7

Mean response: T1 5.13, T2 6.70

The estimated treatment effect is the difference between these two means, 6.70 - 5.13 = 1.57. It is partly influenced by the treatment effect (2 units) and partly by the variation between experimental units, the background noise.

(22)

Now suppose the most extreme allocation, with the poorest experimental units receiving T2.

Unit                                1     2     3     4     5     6     7     8
Response if treated the same        4.1   5.3   7.2   2.6   3.5   6.4   5.5   4.7
Allocated at random to treatment    T2    T1    T1    T2    T2    T1    T1    T2
Treatment effect                    2     0     0     2     2     0     0     2
Experimental response               6.1   5.3   7.2   4.6   5.5   6.4   5.5   6.7

Mean response: T1 6.10, T2 5.73

The estimated treatment effect is 5.73 - 6.10 = -0.37. Again it is partly influenced by the treatment effect (+2) and partly by the variation between experimental units, the background noise. The treatment effect is

(23)

Again consider the same extreme allocation but with a larger treatment effect.

Unit                                1     2     3     4     5     6     7     8
Response if treated the same        4.1   5.3   7.2   2.6   3.5   6.4   5.5   4.7
Allocated at random to treatment    T2    T1    T1    T2    T2    T1    T1    T2
Treatment effect                    10    0     0     10    10    0     0     10
Experimental response               14.1  5.3   7.2   12.6  13.5  6.4   5.5   14.7

Mean response: T1 6.10, T2 13.73
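
The three allocation tables can be reproduced with a few lines of code. This is a minimal sketch using the unit responses and allocations from the slides; the helper name `estimated_effect` is illustrative only.

```python
# Response of each of the 8 units if all were treated the same (from the slides).
baseline = [4.1, 5.3, 7.2, 2.6, 3.5, 6.4, 5.5, 4.7]

random_alloc = ["T1", "T1", "T2", "T2", "T2", "T1", "T2", "T1"]
extreme_alloc = ["T2", "T1", "T1", "T2", "T2", "T1", "T1", "T2"]


def estimated_effect(allocation, true_effect):
    """Return (mean T1, mean T2, T2 - T1) when T2 adds true_effect to a unit."""
    t1 = [y for y, trt in zip(baseline, allocation) if trt == "T1"]
    t2 = [y + true_effect for y, trt in zip(baseline, allocation) if trt == "T2"]
    mean_t1 = sum(t1) / len(t1)
    mean_t2 = sum(t2) / len(t2)
    return mean_t1, mean_t2, mean_t2 - mean_t1


scenarios = [
    ("random allocation, effect 2", random_alloc, 2),
    ("extreme allocation, effect 2", extreme_alloc, 2),
    ("extreme allocation, effect 10", extreme_alloc, 10),
]
for label, alloc, effect in scenarios:
    m1, m2, diff = estimated_effect(alloc, effect)
    print(f"{label}: T1 {m1:.3f}  T2 {m2:.3f}  difference {diff:+.3f}")
```

The slides quote these means rounded to two decimal places, so a difference computed from the rounded means can differ from the unrounded one in the last digit.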

(24)

Three points:

The observed treatment difference is due only to treatment effect and variation.

If the treatment effect is large relative to the background noise then even an extreme allocation will not obscure the treatment effect. (Signal/Noise ratio).

If the number of experimental units is large then a treatment effect will usually be more obvious, since an extreme allocation of experimental units is less likely.
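
The third point can be made concrete by counting allocations. The sketch below assumes equal group sizes and enumerates every way of giving T2 to 4 of the 8 units in the example; the names `n_allocations` and `noise_only_diffs` are illustrative only.

```python
import math
from itertools import combinations

# Response of each of the 8 units if all were treated the same (from the slides).
baseline = [4.1, 5.3, 7.2, 2.6, 3.5, 6.4, 5.5, 4.7]


def n_allocations(n_units):
    """Number of ways to split n_units into two equal treatment groups."""
    return math.comb(n_units, n_units // 2)


# Difference (T2 mean - T1 mean) produced by background noise alone,
# for every possible choice of the 4 units that receive T2.
noise_only_diffs = []
for t2_units in combinations(range(8), 4):
    t2 = [baseline[i] for i in t2_units]
    t1 = [baseline[i] for i in range(8) if i not in t2_units]
    noise_only_diffs.append(sum(t2) / 4 - sum(t1) / 4)

worst = min(noise_only_diffs)
n_worst = sum(1 for d in noise_only_diffs if abs(d - worst) < 1e-9)

print("allocations with 8 units: ", n_allocations(8))
print("allocations with 16 units:", n_allocations(16))
print(f"most extreme noise-only difference: {worst:.3f} ({n_worst} allocation)")
```

Only one of the possible allocations of 8 units puts the four poorest units on T2, and the number of possible allocations grows rapidly with the number of units, so any single extreme allocation becomes correspondingly rarer.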

(25)

Tests of Hypotheses - Tests of Significance

Survey: Are the observed differences between groups compatible with a view that there are no differences between the populations from which the samples of values are drawn?

Designed experiments: Are observed differences between treatment means

(26)

Tests of Hypotheses - Tests of Significance

Designed experiment - only two explanations for a negative answer, difference is due to the applied treatments or a chance effect

Survey is silent in distinguishing

(27)

Example

An experiment on artificially raised salmon compared two treatments and 20 fish per treatment. Average gains (g) over the experimental period were 1210 and 1320. Variation between fish within a group was RSE = 135g

(28)

Procedure

a) NULL HYPOTHESIS

Treatments have no effect and any difference observed between groups treated differently is due to chance (variation in the experimental material)

b) Measure

- the variation between groups treated differently

- the variation expected if due solely to chance

(29)

d) The observed difference could have occurred by chance.

Statistical theory gives rules to determine how likely a given difference in variation is liable to be by chance.

e) SIGNIFICANCE TEST

Face the choice.

- This difference in variation could have occurred by chance with probability ? (5%, 1%, etc)

OR

- There is a real difference (produced by treatment).

(30)

Example: The t test

An experiment on artificially raised salmon compared two treatments and 20 fish per treatment. Average gains (g) over the experimental period were 1210 and 1320. Variation between fish within a group was RSE = 135g

(31)

Example

a) NULL HYPOTHESIS - Treatment does not affect salmon growth rate

b) Observed difference between groups: 1320 - 1210 = 110

Variation expected solely from chance: $135 \times \sqrt{2/20} = 42.7$

c) Test Statistic

t = 110/42.7 = 2.58

d) Statistical theory (t tables) shows that the chance of a value as large as 2.58 is about 1 in 100

e) Make the choice
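
The arithmetic in steps b) to d) can be sketched in code. This is a hedged illustration, not the slides' own procedure: it assumes RSE is the within-group standard deviation, assumes a pooled two-sample t test with 2 x (20 - 1) = 38 degrees of freedom, and replaces the t tables with a simple numerical integration of the t density. The function names are illustrative only.

```python
import math


def sed(s, r):
    """Standard error of the difference between two means of r units each."""
    return s * math.sqrt(2 / r)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, n=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (n must be even)."""
    h = t / n
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


gain_a, gain_b = 1210, 1320   # average gains (g) from the slide
s, r = 135, 20                # RSE (g) and fish per treatment
df = 2 * (r - 1)              # assumption: pooled two-sample t test

difference = gain_b - gain_a
se = sed(s, r)
t = difference / se
one_sided = upper_tail(t, df)
two_sided = 2 * one_sided

print(f"difference = {difference} g, SED = {se:.1f} g, t = {t:.2f}")
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

Under these assumptions both tail probabilities are of the order of 1 in 100, in line with step d).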

(32)

Responsibility of the Researcher and Statistician

PLANNING PHASE

Researcher                               Statistician
Seek statistical training                Keep up to date with statistical technology
Seek statistical advice                  Teach principles
Use minimum experiment size              Provide statistical input to plan
Select experimental material properly    Give researcher different alternatives

(33)

EXECUTION PHASE

Researcher                               Statistician
Carry out study as planned               Give road map for execution
Log important dates related to data

(34)

ANALYSIS PHASE

Researcher                                           Statistician
Study data patterns                                  Assist in studying data patterns
Keep integrity of data set
Choose proper statistical analytical procedure       Assist in choosing analytical procedure
Choose probability levels, contrasts to make, etc.   Assist in choosing probability levels, contrasts, etc.

(35)

INTERPRETATION AND REPORTING

Researcher                                           Statistician
Provide description of statistical methods           Assist in writing statistical methods used
Present results in such a way that reader can        Review interpretation. Modification if necessary
evaluate the interpretation

(36)

Responsibility of Researcher and Statistician

Promote high standards of scientific inquiry and professionalism

Involve appropriate techniques for research

Honor the rights of other researchers – give credit to other researchers where due

Consider interdependence of natural, social and technological systems

Give objectivity a major role

(37)

Good Practices Checklist

Planning is very important in experimentation

Statistician can assist in planning

Planning does not ensure success but avoids built-in disasters

Statistics cannot compensate for negative

(38)

Good Planning Can Prevent:

Costly waste of resources

Difficult statistical analysis

Data for which interpretation is controversial

(39)

Setting Up Original Hypothesis Objectively

2 Rules:

1.

Hypothesis should be clearly

Statistician provides report but does not make decisions for management

Company should have same responsibilities to a salaried statistician as to a consulting one (and conversely).

See: Deming, W.E., Sample Designs in Business

(42)

Medical Application

Medical review boards

Informed consent

Methods of selecting subjects

Withholding a treatment to a control group

Access to data

(43)

V. Ethical Issues in Interpretation and Reporting

Insufficient statistical methods description

Statistical significance vs. practical significance

Access to data

Kinds of means in the factorial experiment reporting

Reporting of measures of dispersion

Proper decimal reporting

Bona fide scientific conclusions vs. speculation

Clarity of reporting

Indication that results are not final word

(44)

VI. Case Studies

Skagerrak Case – Precautionary Principle

2 highly respected scientists interpret their results differently

Case emphasized 2 critical aspects of research

1. The actual statistical analysis

2. How and when to disseminate the information from research

Elton’s Withholding of Anomalous Data