Introduction to Biostatistics
© Farouq Mohammad A. Alam
Chapter 1
INTRODUCTION TO BIOSTATISTICS
© FAROUQ MOHAMMAD A. ALAM - These slides are not the primary reference for the course.
Learning Outcomes
› After studying this chapter, the student will:
1. understand the key concepts and terminology of biostatistics;
2. be able to classify types of scientific samples from a population of subjects;
3. understand the definition of a research study and an experiment;
4. understand the importance of using computers in statistical analysis.
1.1. INTRODUCTION
1.2. SOME BASIC CONCEPTS
What is the meaning of Statistics?
› Statistics is a field of study concerned with collecting, organizing, summarizing, analyzing and drawing inferences from data.
What is data?
› Data are numbers that result from taking a measurement or from the process of counting. They are the raw material of statistics.
Areas of Statistics
› The concepts and methods necessary for organizing, presenting, and summarizing data are called descriptive statistics.
› The concepts and methods necessary for making decisions about a large body of data by examining only a small part of it are called inferential statistics.
Course Hierarchy
Descriptive Statistics
• Chapter 1: Introduction to Biostatistics
• Chapter 2: Descriptive Statistics
• Chapter 15: Vital Statistics
Probability Theory
• Chapter 3: Some Basic Probability Concepts
• Chapter 4: Probability Distributions
• Chapter 5: Some Important Sampling Distributions
Inferential Statistics
• Chapter 7: Hypothesis Testing
• Chapter 12: The Chi-square Distribution and the Analysis of Frequencies
Sources of Data
› Records, which are day-to-day logs of the transactions and activities of an organization.
› Surveys, including but not limited to questionnaires.
› Experiments, such as applying different medical strategies.
› External sources, such as published reports, commercially available data banks, or the research literature.
What is the difference between Statistics and Biostatistics? (Class Activity)
› The tools of statistics are employed in many fields. Biostatistics is the application of statistical tools and concepts in the biological sciences and medicine.
What is the difference between data and variables? (Class Activity)
› A variable is an observable characteristic that takes on different values in different persons, places, or things.
Quantitative Variables vs. Qualitative Variables
› A qualitative variable is a characteristic that can be categorized, such as gender, ethnic group, medical diagnosis, etc.
› A quantitative variable is a characteristic that can be measured in the usual sense, such as height, weight, age, etc.
Provide at least one example for each type of variable. (Class Activity)
Random Variable
› A random variable is a quantitative variable whose values arise as a result of chance factors, so that they cannot be exactly predicted in advance, such as the weight that a newborn will reach at maturity.
› A discrete random variable possesses gaps or interruptions in the values that it can assume, such as the number of daily admissions to a general hospital.
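› A continuous random variable does not possess such gaps or interruptions; it can assume any value within a specified interval, such as height or weight.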
Population vs. Sample
› A population of entities is defined as the largest collection of entities for which we have an interest at a specific time.
› A population of values is defined as the largest collection of values of one random variable for which we have an interest at a specific time.
› A population is either finite or infinite.
› A sample is a part of a population.
Provide at least one example for each type of random variable. (Class Activity)
Population vs. Sample (cont.)
[Figure: population of entities vs. population of values]
Why use samples? (Class Activity)
› Studying a sample is less costly and less time-consuming than studying the whole population.
› Measuring the variable(s) of interest may involve the destruction of the population unit (e.g., blood sample).
› A population may be infinite.
1.3. MEASUREMENT AND MEASUREMENT SCALES
Measurement and Measurement Scales
› A measurement is the assignment of numbers to objects or events according to a set of rules.
[Diagram: Variable, branching into Qualitative and Quantitative]
Nominal vs. Ordinal Scales
› A nominal scale measurement is a number that only reflects the classification or categorization of a qualitative variable, without regard to ranking according to some criterion. The distance between any two measurements is meaningless, i.e., arithmetic operations (+, –, etc.) do not mean anything here.
› An ordinal scale measurement is a number that reflects both the classification or categorization of a qualitative variable and the ranking according to some criterion. The distance between any two measurements is still meaningless, i.e., arithmetic operations (+, –, etc.) do not mean anything here. However, a Likert scale can be used to treat ordinal scale measurements algebraically.
Nominal vs. Ordinal Scales (cont.)
NOMINAL SCALE
› What is your gender?
0 – Male 1 – Female
› Blood type:
0 – A 1 – B 2 – O
ORDINAL SCALE
› Pain Level:
1 – Mild 2 – Moderate 3 – Severe
› How satisfied are you with our medical service?
5 – very satisfied
Interval vs. Ratio Scales
› An interval scale measurement is a number that reflects the classification or categorization of a quantitative variable and the ranking according to some criterion. The distance between any two measurements is meaningful, but the number 0 does not indicate the total absence of the quantity being measured.
› A ratio scale measurement is a number that reflects the classification or categorization of a quantitative variable and the ranking according to some criterion. The distance between any two measurements is meaningful, and the number 0 is a true zero, i.e., it indicates the total absence of the quantity being measured.
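As a small worked illustration of the true-zero distinction, the sketch below uses temperature. It assumes two example readings (10 and 20 degrees Celsius) and a helper named `celsius_to_kelvin`, both chosen here for illustration only. Celsius is treated as an interval scale, while Kelvin has a true zero, so only Kelvin ratios describe "how many times as much".

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius reading to Kelvin (Kelvin has a true zero)."""
    return celsius + 273.15


# Two hypothetical readings, in degrees Celsius.
temp_a_c = 10.0
temp_b_c = 20.0

# Differences are meaningful on an interval scale: the gap is the same
# whether it is computed in Celsius or in Kelvin.
difference_c = temp_b_c - temp_a_c
difference_k = celsius_to_kelvin(temp_b_c) - celsius_to_kelvin(temp_a_c)

# Ratios are not meaningful in Celsius: 20/10 = 2 does not mean
# "twice as hot". The Kelvin ratio is the physically meaningful one.
naive_ratio_c = temp_b_c / temp_a_c
ratio_k = celsius_to_kelvin(temp_b_c) / celsius_to_kelvin(temp_a_c)

print(f"Difference: {difference_c:.2f} C, {difference_k:.2f} K")
print(f"Celsius ratio: {naive_ratio_c:.3f}, Kelvin ratio: {ratio_k:.3f}")
```

The Kelvin ratio is only slightly above 1, which is why a statement such as "20 °C is twice as warm as 10 °C" is not justified on an interval scale.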
Interval vs. Ratio Scales (cont.)
INTERVAL SCALE
› Temperature (in degrees Celsius or Fahrenheit).
› IQ score.
RATIO SCALE
› Dose amount.
› Weight.
› Height.
Provide at least one example for each measurement scale. (Class Activity)
1.4. SAMPLING AND STATISTICAL INFERENCE
Definitions
› Statistical inference is the procedure by which researchers reach a conclusion about a population based on the information contained in a sample that has been drawn from that population.
› A research study is a scientific study of a phenomenon of interest, and it involves designing sampling protocols, collecting and analyzing data, and providing valid conclusions based on the results of the analyses.
› Experiments are a special type of research study in which observations are made after specific manipulations of conditions have been carried out; they provide the foundation for scientific research.
Provide at least one example for each type of study. (Class Activity)
A RESEARCH STUDY
› Researchers want to study the influence of drinking at least one cup of tea on sleeping habits. They took a random sample of adults and asked them about the daily amount of tea they consume and the time at which they sleep.
AN EXPERIMENT
› Researchers want to study the influence of drinking at least one cup of tea on sleeping habits. They took a random sample of adults and divided them into three groups. The first group is told to drink less tea than usual, the second group is told to drink tea as usual, and the third group is told to drink more tea than usual. After a while, the participants are asked about the daily amount of tea they consumed and the time at which they slept.
Sampling with Replacement vs. Sampling without Replacement
› Suppose that the sample size is $n$ and the population size is $N$.
› In sampling with replacement, a selected unit is returned to the population before the next draw, so each population unit can be selected more than once. The number of possible samples (counting the order of selection) in this case is equal to $N^n$.
› In sampling without replacement, a selected unit is not returned to the population, so each population unit can be selected at most once. The number of possible samples (counting the order of selection) in this case is equal to $N \times (N-1) \times (N-2) \times \cdots \times (N-n+1)$.
Suppose that the population is A, B, C, and D and the aim is to sample two letters.
SAMPLING WITH REPLACEMENT
› AA, AB, AC, AD, BB, BC, BD, CC, CD, DD, BA, CA, DA, CB, DB, DC
› Notice that the number of possible samples = $4^2 = 16$.
SAMPLING WITHOUT REPLACEMENT
› AB, AC, AD, BC, BD, CD, BA, CA, DA, CB, DB, DC
› Notice that $N - n + 1 = 4 - 2 + 1 = 3$; hence, the number of possible samples = $4 \times 3 = 12$.
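The two counts for the A, B, C, D example can be checked by listing every ordered sample of two letters. The following is a minimal sketch using Python's standard `itertools` module. The variable names are chosen here for illustration, and order is assumed to matter (AB and BA count as different samples), as in the lists above.

```python
from itertools import permutations, product

population = ["A", "B", "C", "D"]  # N = 4
sample_size = 2                    # n = 2

# With replacement: a letter may repeat, giving N**n ordered samples.
with_replacement = ["".join(s) for s in product(population, repeat=sample_size)]

# Without replacement: no letter repeats, giving N * (N - 1) ordered samples.
without_replacement = ["".join(s) for s in permutations(population, sample_size)]

print(len(with_replacement), with_replacement)
print(len(without_replacement), without_replacement)
```

The four samples that appear only in the first list are AA, BB, CC, and DD, which is exactly the difference 16 − 12.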
Simple random sampling
› If a sample of size 𝒏 is drawn from a population of size 𝑵 in such a way that every possible sample of size 𝒏 has the same chance of being selected, then the sample is called a simple random sample.
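A simple random sample without replacement can be sketched with Python's standard `random` module, whose `sample` function draws distinct units with every subset of the requested size equally likely. The function name `simple_random_sample`, the subject numbering, and the seed are assumptions made for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(population), n)


# Hypothetical population: subjects numbered 1 to 20 (N = 20).
subjects = range(1, 21)
sample = simple_random_sample(subjects, n=5, seed=1)
print(sample)
```

Fixing the seed is only for reproducibility in teaching. In a real study, the draw should not be chosen or repeated until a convenient sample appears.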
[Figure: simple random sampling. Source: Sampling (statistics) - Wikipedia]
Systematic Random Sampling
› To perform systematic random sampling, a starting point is randomly selected from the population, and then every 𝑘th subject from the starting point is selected. The number 𝑘 is called the sampling interval.
› Let 𝑥 be the randomly selected starting point and 𝑘 be the sampling interval. Then, the systematic random sample consists of the subjects 𝑥, 𝑥 + 𝑘, 𝑥 + 2𝑘, and so on.
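The selection rule 𝑥, 𝑥 + 𝑘, 𝑥 + 2𝑘, ... can be sketched as follows. The function name `systematic_sample` and the population of 12 numbered subjects are assumptions for illustration. Positions are counted from 1, and the starting point is passed in explicitly so the result is easy to check by hand; in practice it would be chosen at random.

```python
def systematic_sample(population, start, interval):
    """Select the subjects at positions start, start + interval, ... (1-based)."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if not 1 <= start <= len(population):
        raise ValueError("start must be a valid position in the population")
    # Convert the 1-based starting position to a 0-based index, then step by k.
    return population[start - 1::interval]


# Hypothetical population: subjects numbered 1 to 12.
subjects = list(range(1, 13))
sample = systematic_sample(subjects, start=2, interval=3)
print(sample)  # [2, 5, 8, 11]
```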
Systematic Random Sampling (𝒙 = ?, 𝒌 = ?)
[Figure: systematic random sampling. Source: Sampling (statistics) - Wikipedia]
Systematic Random Sampling (𝒙 = 𝟐, 𝒌 = 𝟑)
Stratified Random Sampling
› To perform stratified random sampling, a population of interest is partitioned into groups (or strata) such that the sample units within a stratum are similar to each other; then a random sample is taken independently from each stratum.
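The two steps, partitioning into strata and then sampling independently within each stratum, can be sketched as below. The function name `stratified_sample`, the patient identifiers, the age-group strata, and the choice of an equal sample size per stratum are all assumptions made for illustration; other allocation rules (for example, proportional to stratum size) are also used.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Partition units into strata, then sample each stratum independently."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # Draw without replacement inside each stratum (capped at stratum size).
    return {
        label: rng.sample(members, min(n_per_stratum, len(members)))
        for label, members in strata.items()
    }


# Hypothetical population: 12 patients labelled by age group.
patients = [
    (f"P{i:02d}", "child" if i <= 4 else "adult" if i <= 9 else "senior")
    for i in range(1, 13)
]

sample = stratified_sample(patients, stratum_of=lambda p: p[1], n_per_stratum=2, seed=0)
for label, members in sample.items():
    print(label, [patient_id for patient_id, _ in members])
```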
Provide at least one example for each type of sampling. (Class Activity)
1.5. THE SCIENTIFIC METHOD AND THE DESIGN OF EXPERIMENTS
Important terminology
› The terms accuracy and validity refer to the correctness of a measurement.
› The terms precision and reliability refer to the consistency of a measurement.
› A treatment (experimental) group consists of randomly assigned subjects who are directly exposed to a treatment.
› A control (placebo) group consists of randomly assigned subjects who are not exposed to the treatment.
1.6. COMPUTERS AND BIOSTATISTICAL ANALYSIS
Computers and Biostatistical Analysis
› Computers can perform more calculations faster and far more accurately than can human technicians.
› Modern computers have random number generating capabilities.
› MS Excel/MegaStat will be used to perform data analysis in this course. You can download MS Excel from here for free.