Statistical test steps
Dr. Hissah Alzahrani
What do I need to know before starting statistical hypothesis testing?
Data types: quantitative vs. qualitative
Parametric vs. non-parametric tests
Number of study groups: one, two or >2 groups
Groups are related or not: independent, paired or matched
MEASUREMENT SCALES
Nominal Ordinal Interval Ratio
Parametric vs. non-parametric tests:
Parametric tests - assumptions
Random sampling from a defined population.
Data are measured on a continuous scale and follow a normal distribution.
The number of observations is greater than 30 (a common rule of thumb).
Non-parametric tests - fewer assumptions
Random sampling from a defined population.
Data distribution: normal vs. non-normal
Check whether the data are normal using a graph or a normality test, for example the Kolmogorov-Smirnov test for normality.
What can we do if the data are non-normal?
1- Try a data transformation.
2- If the data are still not normal, conduct a non-parametric test.
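The check-then-transform workflow above could be sketched as follows. This is only an illustration: the helper names, the choice of a log transformation, and the sample data are assumptions, and it relies on SciPy's `kstest`.

```python
import numpy as np
from scipy import stats


def looks_normal(values, alpha=0.05):
    """Kolmogorov-Smirnov check against a normal curve fitted to the sample."""
    values = np.asarray(values, dtype=float)
    # Assumption: the mean and SD of the reference normal are estimated from
    # the same sample, which makes the test conservative.
    result = stats.kstest(values, "norm", args=(values.mean(), values.std(ddof=1)))
    return bool(result.pvalue > alpha)


def choose_test_family(values, alpha=0.05):
    """Return which family of tests the workflow would point to."""
    values = np.asarray(values, dtype=float)
    if looks_normal(values, alpha):
        return "parametric"
    # Try one transformation (log, an arbitrary choice) before giving up.
    if np.all(values > 0) and looks_normal(np.log(values), alpha):
        return "parametric on log-transformed data"
    return "non-parametric"


# Hypothetical data: half the values at 1, half at 100 (clearly not bell-shaped).
two_point = np.array([1.0] * 50 + [100.0] * 50)
print(choose_test_family(two_point))
```

A graph (histogram or Q-Q plot) is still worth looking at alongside the test, since a p-value alone does not show how the data depart from normality.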
Groups are related or not:
Independent groups
Two separate, independent groups of subjects.
Matched groups
Each subject in one group is matched, on factors or conditions relevant to the study, to a subject in the other group.
Paired groups
The same subject is used to collect data for both groups. In many instances, a crossover design is used so that the same subject receives all treatments.
Statistical hypothesis testing steps:
Step 1: State the null hypothesis
Step 2: State the alternative hypothesis
Step 3: Set α
Step 4: Collect data
Step 5: Choose a statistical test and calculate a test statistic and p-value
Step 6: Construct acceptance / rejection regions
Step 7: Based on steps 5 and 6, draw a conclusion about H0
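As a rough illustration, the seven steps can be traced in a few lines of Python. The sketch assumes a two-sided one-sample z-test with a known population standard deviation. The data, the hypothesised mean and the standard deviation are all made up for the example.

```python
from math import sqrt
from statistics import NormalDist, mean

# Steps 1-2: H0: mu = mu0, H1: mu != mu0 (two-sided).
mu0 = 28.4      # hypothesised mean (illustrative)
sigma = 2.0     # assumed known population SD (hypothetical)

# Step 3: set the significance level.
alpha = 0.05

# Step 4: collect data (hypothetical measurements).
sample = [29.1, 30.2, 28.7, 31.0, 29.5, 30.4, 28.9, 29.8, 30.6]

# Step 5: test statistic and two-sided p-value.
z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 6: the rejection region is |z| > critical value.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 7: conclusion about H0.
reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, p = {p_value:.4f}, critical = {z_crit:.2f}, reject H0: {reject_h0}")
```

Comparing the statistic with the critical value and comparing the p-value with α lead to the same decision.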
What is the statistical test?
A statistical test provides a mechanism for making quantitative decisions about a process. The intent is to determine whether there is enough evidence to "reject" a hypothesis about the process.
Null Hypothesis and alternative hypothesis:
Example:
A case of suspected cheating on an exam is brought in front of the disciplinary committee at a certain university. There are two opposing claims in this case:
The student’s claim: I did not cheat on the exam.
The null hypothesis H0 is an initial claim that researchers specify using previous research or knowledge.
The instructor’s claim: The student did cheat on the exam.
The alternative hypothesis H1 is what the researcher believes to be true or hopes to prove true.
Errors in hypothesis testing
p-value
The p-value is the probability of obtaining the observed, or more extreme, results when the null hypothesis (H0) of a study question is true.
It is a quantity computed from the test statistic, for example p-value = P(Z < 3) for a left-tailed z-test whose observed statistic is 3.
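Which tail probability counts as "more extreme" depends on the alternative hypothesis. A small sketch, assuming a z statistic and using only the Python standard library (the observed value 3.0 is illustrative):

```python
from statistics import NormalDist

z_observed = 3.0            # hypothetical observed z statistic
std_normal = NormalDist()   # standard normal: mean 0, SD 1

# H1 says the parameter is smaller: P(Z < z)
p_left = std_normal.cdf(z_observed)
# H1 says the parameter is larger: P(Z > z)
p_right = 1 - std_normal.cdf(z_observed)
# H1 says the parameter is different: P(|Z| > |z|)
p_two_sided = 2 * (1 - std_normal.cdf(abs(z_observed)))

print(round(p_left, 4), round(p_right, 4), round(p_two_sided, 4))
```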
Acceptance region and making a decision
[Figure: acceptance and rejection regions for significance level α]
There are many statistical tests; here are some of them.
Statistical test: one sample t-test
Data variables:
dependent variable (continuous)
Parametric or nonparametric:
parametric test
Number of study groups:
one group
Groups are related or not:
---just one group---
Hypothesis:
H0: The true mean is equal to the specified mean
H1: The true mean is different from the specified mean
Statistical test: one sample t-test
Example: A medical researcher is interested in finding out whether the mean weight is different from 28.4.
Data variables:
weight, quantitative (continuous)
Parametric or nonparametric:
parametric test
Number of study groups:
one group
Groups are related or not:
---just one group---
Hypothesis:
H0: The true mean is 28.4
H1: The true mean is different from 28.4
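One way to run this example is with SciPy's `ttest_1samp`. The weights below are invented for illustration, and α = 0.05 is an assumed significance level.

```python
from scipy import stats

weights = [30.1, 29.4, 31.2, 28.9, 30.7, 29.8, 31.5, 30.2]  # hypothetical data
alpha = 0.05                                                 # assumed significance level

# H0: true mean = 28.4, H1: true mean != 28.4 (two-sided by default)
result = stats.ttest_1samp(weights, popmean=28.4)

print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```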
Statistical test: independent samples t-test
Data variables:
dependent variable (continuous), independent variable (binary)
Parametric or nonparametric:
parametric test
Number of study groups:
two groups
Groups are related or not:
independent
Hypothesis:
H0: There is no difference between the means of the two groups
H1: There is a difference between the means of the two groups
Statistical test: independent samples t-test
Example: A medical researcher is interested in finding out whether the mean weights of males and females are different.
Data variables:
dependent variable (weight), independent variable (male or female)
Parametric or nonparametric:
parametric test
Number of study groups:
two groups
Groups are related or not:
independent
Hypothesis:
H0: There is no difference between the mean weights of males and females
H1: There is a difference between the mean weights of males and females
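A minimal sketch of this comparison with SciPy's `ttest_ind`, using made-up weights for the two groups:

```python
from scipy import stats

males = [72, 80, 77, 85, 69, 90, 78, 83]     # hypothetical weights
females = [60, 65, 58, 70, 62, 67, 59, 64]   # hypothetical weights
alpha = 0.05                                 # assumed significance level

# Default is the classic t-test with equal variances assumed;
# passing equal_var=False would run Welch's version instead.
result = stats.ttest_ind(males, females)

print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```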
Statistical test: paired t-test
It is a statistical method used to test for differences between the means of the two paired or matched groups.
Data variables:
dependent variable (continuous), independent variable (binary)
Parametric or nonparametric:
parametric test
Number of study groups:
two groups
Groups are related or not:
Paired or matched.
Hypothesis:
H0: There is no difference between the two means.
H1: There is a difference between the two means.
Statistical test: paired t-test
Example: A medical researcher is interested in finding out whether the mean weight of a group of females is different before and after a diet program.
Data variables:
dependent variable (weight), independent variable (diet)
Parametric or nonparametric:
parametric test
Number of study groups:
two groups
Groups are related or not:
paired
Hypothesis:
H0: There is no difference between the females' mean weights before and after the diet
H1: There is a difference between the females' mean weights before and after the diet
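The before/after comparison could be run with SciPy's `ttest_rel`. The sketch below uses invented weights and also shows that the paired test is the same as a one-sample t-test on the per-subject differences.

```python
from scipy import stats

before = [82, 75, 90, 68, 77, 85, 71, 79]   # hypothetical weights before the diet
after = [79, 73, 86, 67, 74, 81, 70, 76]    # same subjects after the diet

# Paired t-test: H0 says the mean of the differences is zero.
paired = stats.ttest_rel(before, after)

# Equivalent formulation: one-sample t-test on the differences.
differences = [b - a for b, a in zip(before, after)]
on_differences = stats.ttest_1samp(differences, popmean=0)

print(f"paired:         t = {paired.statistic:.3f}, p = {paired.pvalue:.4f}")
print(f"on differences: t = {on_differences.statistic:.3f}, p = {on_differences.pvalue:.4f}")
```

The order of each list matters: position i in `before` and `after` must belong to the same subject.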
Statistical test: Analysis of Variance (ANOVA)
It is used to determine whether there are any differences between the means of three or more independent groups.
Data variables:
dependent variable (continuous), independent variable (categorical)
Parametric or nonparametric:
parametric test
Number of study groups:
three or more groups
Groups are related or not:
independent
Hypothesis:
H0: There is no difference between the means of the groups
H1: There is at least one difference between the means of the groups
Statistical test: Analysis of Variance (ANOVA)
Example: A medical researcher is interested in finding out whether the mean weights of three classes of students are different.
Data variables:
dependent variable (weight), independent variable (3 classes)
Parametric or nonparametric:
parametric test
Number of study groups:
three groups
Groups are related or not:
independent
Hypothesis:
H0: There is no difference between the mean weights of the 3 classes of students
H1: There is at least one difference between the mean weights of the 3 classes
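For three independent classes, a one-way ANOVA could be sketched with SciPy's `f_oneway`. The weights are hypothetical.

```python
from scipy import stats

class_a = [55, 60, 58, 62, 57]   # hypothetical weights
class_b = [65, 68, 70, 66, 69]
class_c = [60, 63, 61, 64, 62]
alpha = 0.05                     # assumed significance level

# H0: all three class means are equal.
result = stats.f_oneway(class_a, class_b, class_c)

print(f"F = {result.statistic:.2f}, p = {result.pvalue:.5f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```

A significant F only says that at least one mean differs; it does not say which one.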
Statistical test: Repeated Measures (ANOVA)
It is used to determine whether there are any differences between the means of three or more paired or matched groups. The Repeated Measures ANOVA is very common in bioequivalence studies.
Data variables:
dependent variable (continuous), independent variable (categorical)
Parametric or nonparametric:
parametric test
Number of study groups:
three or more groups
Groups are related or not:
Paired or matched
Hypothesis:
H0: There is no difference between the means of the groups
H1: There is at least one difference between the means of the groups
Statistical test: Repeated Measures (ANOVA)
Example: A medical researcher is interested in finding out whether the mean weights of three classes of students are different before and after a diet program.
Data variables:
dependent variable (weight of 3 classes), independent variable (the diet program)
Parametric or nonparametric:
parametric test
Number of study groups:
many groups
Groups are related or not:
paired
Hypothesis:
H0: There is no difference between the means of the 3 classes
H1: There is at least one difference between the means of the 3 classes
Statistical test: Kolmogorov-Smirnov test
The one sample Kolmogorov-Smirnov test is used to test whether a sample comes from a specific distribution. We can use this procedure to determine whether a sample comes from a population which is normally distributed.
Data variables:
dependent variable (continuous)
Parametric or nonparametric:
nonparametric test
Number of study groups:
one group
Groups are related or not:
---just one group---
Hypothesis:
H0: The data come from a normal distribution
H1: The data do not come from a normal distribution
Statistical test: Mann-Whitney U Test
The Mann-Whitney U test for independent samples is a nonparametric statistical method used to test for differences between the medians of two independent groups.
Data variables:
dependent variable (ordinal), independent variable (binary)
Parametric or nonparametric:
nonparametric test
Number of study groups:
two groups
Groups are related or not:
independent
Hypothesis:
H0: There is no difference between the medians of the two groups
H1: There is a difference between the medians of the two groups
Statistical test: Mann-Whitney U Test
Example: A medical researcher is interested in finding out whether the weight levels of males and females are different.
Data variables:
dependent variable (weight levels), independent variable (gender)
Parametric or nonparametric:
nonparametric test
Number of study groups:
two groups
Groups are related or not:
independent
Hypothesis:
H0: There is no difference between the medians of males and females
H1: There is a difference between the medians of males and females
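A sketch of this example with SciPy's `mannwhitneyu`. The 1 to 4 coding of the weight levels is an assumption made for illustration, and the data are invented.

```python
from scipy import stats

# Assumed coding: 1 = underweight, 2 = normal, 3 = overweight, 4 = obese
males = [3, 4, 3, 2, 4, 3, 4, 3]     # hypothetical weight levels
females = [2, 1, 2, 3, 2, 1, 2, 2]   # hypothetical weight levels
alpha = 0.05                         # assumed significance level

result = stats.mannwhitneyu(males, females, alternative="two-sided")

print(f"U = {result.statistic}, p = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```

The test works on ranks, so only the ordering of the codes matters, not the distances between them.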
Statistical test: Wilcoxon Signed Ranks Test
It is a statistical method used to test for differences between the medians of the two paired or matched groups.
Data variables:
dependent variable (ordinal), independent variable (binary)
Parametric or nonparametric:
nonparametric test
Number of study groups:
two groups
Groups are related or not:
Paired or matched
Hypothesis:
H0: There is no difference between the medians of the two groups
H1: There is a difference between the medians of the two groups
Statistical test: Wilcoxon Signed Ranks Test
Example: A medical researcher is interested in finding out whether the weight levels of a group of females are different before and after a diet program.
Data variables:
dependent variable (weight level), independent variable (diet program)
Parametric or nonparametric:
nonparametric test
Number of study groups:
two groups
Groups are related or not:
Paired
Hypothesis:
H0: There is no difference between the medians of the females before and after the diet program
H1: There is a difference between the medians of the females before and after the diet program
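With SciPy, the paired nonparametric comparison could look like the sketch below. The 1 to 10 scoring of the weight level and the data are assumptions for illustration.

```python
from scipy import stats

# Hypothetical weight-level scores (1-10) for the same ten females.
before = [7, 6, 8, 5, 9, 7, 6, 8, 7, 9]
after = [5, 5, 6, 4, 7, 6, 3, 7, 4, 8]
alpha = 0.05   # assumed significance level

# The test ranks the absolute before/after differences and compares
# the rank sums of the positive and negative differences.
result = stats.wilcoxon(before, after)

print(f"W = {result.statistic}, p = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```

By default SciPy drops pairs whose difference is exactly zero before ranking, which matters when many subjects do not change.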
Statistical test: Kruskal-Wallis One-Way ANOVA
It is used to determine whether there are any differences between the medians of three or more independent groups.
Data variables:
dependent variable (ordinal), independent variable (categorical)
Parametric or nonparametric:
nonparametric test
Number of study groups:
three or more groups
Groups are related or not:
independent
Hypothesis:
H0: There is no difference between the medians of the groups
H1: There is at least one difference between the medians of the groups
Statistical test: Kruskal-Wallis One-Way ANOVA
Example: A medical researcher is interested in finding out whether the weight levels of three classes of students are different.
Data variables:
dependent variable (weight levels), independent variable (3 classes)
Parametric or nonparametric:
nonparametric test
Number of study groups:
three groups
Groups are related or not:
independent
Hypothesis:
H0: There is no difference between the medians of the groups
H1: There is at least one difference between the medians of the groups
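A possible way to run this example is SciPy's `kruskal`. The weight-level codes (1 to 4) and the data are hypothetical.

```python
from scipy import stats

# Hypothetical weight levels, coded 1 (lowest) to 4 (highest).
class_a = [1, 2, 1, 2, 1, 2]
class_b = [3, 4, 3, 4, 4, 3]
class_c = [2, 3, 2, 3, 2, 2]
alpha = 0.05   # assumed significance level

# H statistic is computed from the ranks of the pooled observations.
result = stats.kruskal(class_a, class_b, class_c)

print(f"H = {result.statistic:.2f}, p = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```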
Statistical test: Friedman test
It is used to determine whether there are any differences between the medians of three or more dependent (paired or matched) groups.
Data variables:
dependent variable (ordinal), independent variable (categorical)
Parametric or nonparametric:
nonparametric test
Number of study groups:
three or more groups
Groups are related or not:
Paired or matched
Hypothesis:
H0: There is no difference between the medians of the groups
H1: There is at least one difference between the medians of the groups
Statistical test: Friedman test
Example: A medical researcher is interested in finding out whether the weight levels of three classes of students are different before and after a diet program.
Data variables:
dependent variable (weight levels), independent variable (three groups)
Parametric or nonparametric:
nonparametric test
Number of study groups:
three or more groups
Groups are related or not:
paired
Hypothesis:
H0: There is no difference between the medians of the groups
H1: There is at least one difference between the medians of the groups
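SciPy's `friedmanchisquare` expects one sequence per related measurement, all taken on the same subjects. The sketch below assumes three measurement occasions (baseline, month 1, month 2) on eight subjects, with invented weight-level codes; this layout is an assumption, not the slide's exact design.

```python
from scipy import stats

# Hypothetical weight levels (1-4) for the same eight subjects at three times.
baseline = [4, 3, 4, 4, 3, 4, 3, 4]
month_1 = [3, 2, 3, 3, 2, 3, 2, 3]
month_2 = [2, 1, 2, 2, 1, 2, 1, 2]
alpha = 0.05   # assumed significance level

# Each subject's three values are ranked against each other,
# then the rank sums per occasion are compared.
result = stats.friedmanchisquare(baseline, month_1, month_2)

print(f"chi-square = {result.statistic:.2f}, p = {result.pvalue:.5f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```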
Statistical test: chi-square test
It is used to test whether a categorical variable follows a particular probability distribution.
Data variables:
dependent variable (nominal)
Parametric or nonparametric:
nonparametric test
Number of study groups:
one group
Groups are related or not:
---one group---
Hypothesis:
H0: The population proportion equals a specific value
H1: The population proportion does not equal the specific value
Statistical test: chi-square test
Example: A researcher is interested in whether the fail rate in a math class is equal to 0.2.
Data variables:
dependent variable (nominal)
Parametric or nonparametric:
nonparametric test
Number of study groups:
one group
Groups are related or not:
---one group---
Hypothesis:
H0: The population proportion equals 0.2
H1: The population proportion does not equal 0.2
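The fail-rate example can be sketched as a chi-square goodness-of-fit test with SciPy's `chisquare`. The class size and the observed counts are hypothetical.

```python
from scipy import stats

n_students = 100            # hypothetical class size
observed = [30, 70]         # hypothetical counts: [fail, pass]
alpha = 0.05                # assumed significance level

# Expected counts under H0: fail rate = 0.2, pass rate = 0.8.
expected = [0.2 * n_students, 0.8 * n_students]

# Statistic is the sum over categories of (observed - expected)^2 / expected.
result = stats.chisquare(f_obs=observed, f_exp=expected)

print(f"chi-square = {result.statistic:.2f}, p = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```

The observed and expected counts must add up to the same total, so the expected values are built from the class size rather than typed in separately.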
Statistical test: chi-square test
The Chi-Square Test of Independence determines whether there is an association between two categorical variables.
Data variables:
dependent variable (nominal), independent variable (nominal)
Parametric or nonparametric:
nonparametric test
Number of study groups:
two groups
Groups are related or not:
independent
Hypothesis:
H0: Assumes that there is no association between the two variables
H1: Assumes that there is an association between the two variables
Statistical test: Chi-square test
Example: A researcher is interested in studying the association between weight levels and gender.
Data variables:
dependent variable (weight levels), independent variable (gender)
Parametric or nonparametric:
nonparametric test
Number of study groups:
two groups
Groups are related or not:
independent
Hypothesis:
H0: Assumes that there is no association between weight levels and gender
H1: Assumes that there is an association between weight levels and gender
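For the association between weight level and gender, one option is SciPy's `chi2_contingency` applied to a table of counts. The three weight levels and all counts below are invented for illustration.

```python
from scipy import stats

# Rows: gender (male, female). Columns: weight level (low, medium, high).
# All counts are hypothetical.
observed = [
    [10, 25, 15],
    [20, 20, 10],
]
alpha = 0.05   # assumed significance level

# Returns the statistic, the p-value, the degrees of freedom and the
# table of counts expected if the two variables were independent.
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

print(f"chi-square = {chi2:.3f}, dof = {dof}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

With these made-up counts the test does not reject H0 at the 0.05 level, which is a reminder that failing to reject is not the same as proving there is no association.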