by by Allan G.
Allan G. BlumanBluman
http://www.mhhe.com/math/stat/blumanbrief http://www.mhhe.com/math/stat/blumanbrief
SLIDES PREPARED SLIDES PREPARED
BY BY
Elementary Statistics Elementary Statistics
A Step by Step Approach Sixth Edition
BY BY LLOYD R.
LLOYD R. JAISINGHJAISINGH MOREHEAD STATE UNIVERSITY MOREHEAD STATE UNIVERSITY
MOREHEAD KY MOREHEAD KY Updated by Updated by Dr.
Dr. SaeedSaeed AlghamdiAlghamdi King
King AbdulazizAbdulaziz UniversityUniversity
Chapter Chapter 2 2
Frequency Distributions Frequency Distributions q q y y
and and Graphs Graphs
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
1-1
Organize data using frequency distributions.
Represent data in frequency distributions graphically using histograms, frequency
l d i
Objectives
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
polygons and ogives.
Represent data using bar chart, Pareto chart, pie graph and time series graph.
Draw and interpret a stem and leaf plot.
Notes
………...
………...
………...
………...
………...
2-2
When data are collected in original form, they are called raw data .
Wh th d t i d i t t bl
Organizing Data
When the raw data are organized into a table which called frequency distribution , the frequency will be the number of values in a specific class of the distribution.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-3
A frequency distribution is the organization of raw data in a table form, using classes and frequencies
Organizing Data
frequencies.
Types of frequency distributions are
categorical frequency distribution, ungrouped frequency distribution and grouped frequency distribution .
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
Why Construct Frequency Distributions?
2-4
To enable the reader to make comparisons among different data sets.
To organize the data in a meaningful, intelligible way.
To facilitate computational procedures for measures of average and spread.
To enable the reader to determine the nature or shape of the distribution.
To enable the researcher to draw charts and graphs for the presentation of data.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-5
When the sample size (n) is large, the data must be grouped into categories.
C t i l F Di t ib ti
Categorical Frequency Distributions
Categorical Frequency Distributions are used for data that can be placed in specific categories, such as nominal or ordinal level data.
Notes
………...
………...
………...
………...
2-6
Example: Blood Type Frequency Distribution Blood Type Frequency Distribution Categorical Frequency Distributions
A B B AB O O A O O B A B O AB O AB B B A A O B B O O O A O
Class Frequency Percent
A 6 21%
B 8 29%
O 11 39%
AB 3 11%
Total 28 100%
100 100
100 %
frequency f
Total f
f n
Sample Size
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-7
Ungrouped frequency distributions are used for data that can be enumerated and when the range of values in the data set is small
Ungrouped Frequency Distributions
the range of values in the data set is small and the sample size (n) is large.
Example: number of patients in the clinics within a hospital.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-8
Example: Number of patients in the waiting rooms of 16 6 clinics within a hospital at a specific time. clinics within a hospital at a specific time.
Ungrouped Frequency Distributions
Class FrequencyCumulative 5 4 4 8 %
Statistics Department, Faculty of Sciences, King Abdulaziz University
Class Frequency
Frequency %
4 8 8 50%
5 3 8+3=11 19%
8 5 11+5=16 31%
Total 16 ‐ 100%
8 5 8 4
4 4 8 4
5 8 4 4
Notes
………...
………...
………...
………...
………...
2-9
When the range of values in a data set is large, the data must be grouped into classes that are more than one unit in width, e.g., 24 – 30.
The
lower class limit
represents the smallest dataGrouped Frequency Distributions
The
lower class limit
represents the smallest data value that can be included in a class, e.g., 24 in the class limit 24 – 30.The
upper class limit
represents the largest value that can be included in the class, e.g., 30 in the class limit 24 – 30.Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-10
The class boundariesare used to separate the classes so that there are no gaps in the frequency
distribution. It can be found by subtracting 0.5 from the last digit in the lower class limit and adding 0.5 to
h l di i i h l li i
Grouped Frequency Distributions
the last digit in the upper class limit e.g., 23.5 – 30.5 for the class limit 24 – 30 and 2.65 – 6.85 for the class limit 2.7 – 6.8.
Rule of Thumb: Class limits should have the same decimal place value as the data, but the class boundaries have one additional place value and end in a 5.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-11
Finding Class Boundaries
The class width for a class in a frequency distribution is found by subtracting the lower (or upper) class limit of one class from the lower (or upper) class limit of the next class.
The class midpoint is found by adding the lower and upper boundaries (or limits) and dividing by 2.
Notes
………...
………...
………...
………...
Class Rules
2-12
There should be between 5 and 20 classes.
As a guide line, the number of classes can be found using
Th l idth h ld b dd b
Number of Classes 1 3.3 log( ) n
The class width should be an odd number.
The classes must be mutually exclusive.
The classes must be continuous.
The classes must be exhaustive.
The classes must be equal in width.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-13
Procedure for Constructing Procedure for Constructing
a Grouped Frequency Distribution a Grouped Frequency Distribution
Find the highest (H) and lowest (L) value.
Find the range (R). R=H – L
Select the number of classes desired, usually between 5 and 20.
Find the width by dividing the range by the number of classes and rounding up.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-14
Select a starting point (usually the lowest value); add the width to get the lower limits.
Find the upper class limits.
Fi d th b d i
Procedure for Constructing Procedure for Constructing
a Grouped Frequency Distribution a Grouped Frequency Distribution
Find the boundaries.
Tally the data.
Find the numerical frequencies from the tallies.
Find the cumulative frequencies.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-15
Example: Sample of birthweight (oz) from 40 consecutive deliveries.
Grouped Frequency Distributions
Class Limits Class Tally 58 118 92 108 132 Class Limits
Boundary Tally 32—50 31.5—50.5 / 51—69 50.5—69.5 //
70—88 69.5—88.5 ///
89—107 88.5—107.5 //// ////
108—126 107.5—126.5 //// //// //
127—145 126.5—145.5 //// ////
146—164 145.5—164.5 ///
Total 40
32 140 138 96 161 120 86 115 118 95 83 112 128 127 124 123 134 94 67 124 155 105 100 112 141 104 132 98 146 132 93 85 94 116 113
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-16
Example: Sample of birthweight (oz) from 40 consecutive deliveries.
Grouped Frequency Distributions
Class Limits Class Class
FrequencyCumulative
%
2 U p p e rL o w e r
Class Limits
Boundary Midpoint Frequency
Frequency %
32—50 31.5—50.5 41 1 1 2.5%
51—69 50.5—69.5 60 2 3 5%
70—88 69.5—88.5 79 3 6 7.5%
89—107 88.5—107.5 98 10 16 25%
108—126 107.5—126.5 117 12 28 30%
127—145 126.5—145.5 136 9 37 22.5%
146—164 145.5—164.5 155 3 40 7.5%
Total ‐ ‐ 40 ‐ 100%
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
The Role of Graphs
2-17
The purpose of graphs in statistics is to represent the data to the viewer in pictorial form
form.
Graphs are useful in getting the audience’s attention in a publication or a presentation.
Notes
………...
………...
………...
………...
2-18
The histogramdisplays the continuous data that are organized in a grouped frequency distribution by using vertical bars of various heights to represent the frequencies.
The Most Common Graphs
Class Boundary Frequency 31.5—50.5 1 50.5—69.5 2 69.5—88.5 3 88.5—107.5 10 107.5—126.5 12 126.5—145.5 9 145.5—164.5 3
Total 40
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University 0
2 4 6 8 10 12 14
Frequency
Birthweights (Class Boundary) Sample of birthweight (oz) from 40 consecutive
deliveries
350.5 369.5 388.5 3107.5 3126.5 3145.5 3164.5 331.5
Notes
………...
………...
………...
………...
………...
2-19
The frequency polygondisplays the continuous data that are organized in a grouped frequency distribution by using lines that connect points plotted for the frequencies at the midpoints of the classes.
The Most Common Graphs
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Class
Midpoint Frequency
41 1
60 2
79 3
98 10
117 12
136 9
155 3
0 2 4 6 8 10 12 14
22 41 60 79 98 117 136 155 174
Frequency
Class Midpoint Sample of birtgweight (oz) from 40 consecutive
deliveries
Notes
………...
………...
………...
………...
………...
2-20
The cumulative frequency graphor ogiverepresents the cumulative frequencies for the classes in a grouped frequency distribution.
The Most Common Graphs
S f i i ( ) f 40
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Class Boundary
Cumulative Frequency 31.5—50.5 1 50.5—69.5 3 69.5—88.5 6 88.5—107.5 16 107.5—126.5 28 126.5—145.5 37 145.5—164.5 40
0 5 10 15 20 25 30 35 40 45
31.5 50.5 69.5 88.5 107.5 126.5 145.5 164.5
Cumulative Frequency
Class Boundary Sample of birthweight (oz) from 40
consecutive deliveries
Notes
………...
………...
………...
………...
………...
2-21
Relative frequency graph
Graphs of relative frequencies are used instead of frequencies when the proportion of data
The Most Common Graphs
of frequencies when the proportion of data values that fall into a given class is more important than the actual number of data values that fall into that class.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-22
Relative frequency example:
The Most Common Graphs
Class
Boundary Frequency Relative Frequency
Cumulative Frequency
Cumulative Relative Frequency
frequency f f Total f n
Frequency
31.5—50.5 1 0.025 1 0.025
50.5—69.5 2 0.05 3 0.075
69.5—88.5 3 0.075 6 0.15
88.5—107.5 10 0.25 16 0.4
107.5—126.5 12 0.3 28 0.7
126.5—145.5 9 0.225 37 0.925
145.5—164.5 3 0.075 40 1
Total 40 1 ‐ ‐
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University nf
Notes
………...
………...
………...
………...
………...
2-23
The histogram
The Most Common Graphs
0 3 0.35
Sample of birthweight (oz) from 40 consecutive deliveries
0 0.05 0.1 0.15 0.2 0.25 0.3
Relative Frequency
50.5 69.5 88.5 107.5 126.5 145.5 164.5
31.5
Notes
………...
………...
………...
………...
2-24
The frequency polygon
The Most Common Graphs
0.3 0.35
Sample of birthweight (oz) from 40 consecutive deliveries
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University 0
0.05 0.1 0.15 0.2 0.25
22 41 60 79 98 117 136 155 174
Relative Frequency
Class Midpoint
Notes
………...
………...
………...
………...
………...
2-25
The cumulative frequency graph or ogive The Most Common Graphs
0 8 0.9 1.0
y
Sample of birthweight (oz) from 40 consecutive deliveries
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University 0.0
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8
31.5 50.5 69.5 88.5 107.5 126.5 145.5 164.5
Cumulative Relative Frequenc
Class Boundary
Notes
………...
………...
………...
………...
………...
2-26
The
bar charts
displays the data by using vertical bars of various heights to represent the frequencies of discrete or categorical variables.Other Types of Graphs
Cause of Death Frequency Motor Vehicle 47
Drowning 15 House Fire 12
Homicide 7
Other 19
Total 100
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University 0
5 10 15 20 25 30 35 40 45 50
Motor Vehicle Other Drowning House Fire Homicide
Frequency
Cause of Death
Notes
………...
………...
………...
………...
………...
2-27
Other Types of Graphs
A
Pareto chart
is used to represent a frequency distribution for categorical variable. The frequencies are displayed by the heights of vertical bars, which are arranged in order from highest to lowest are arranged in order from highest to lowest.Cause of Death Frequency Motor
Vehicle 47 Drowning 15 House Fire 12 Homicide 7
Other 19
Total 100
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University 0
5 10 15 20 25 30 35 40 45 50
Motor Vehicle Other Drowning House Fire Homicide
Frequency
Cause of Death
Notes
………...
………...
………...
………...
………...
2-28
The pie graphis a circle that is divided into sections according to the percentage of frequencies in each category of the distribution, . f 360
Degree
n
Other Types of Graphs
Class Frequency Percentage Degree
A 6 21.43% 77.14
B 8 28.57% 102.86
O 11 39.29% 141.43
AB 3 10.71% 38.57
Total 28 100% 360
21.43%
28.57%
10.71%
39.29%
Blood Type
n
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-29
The time series graphrepresents data that occur over a specific period of time.
Year Domestic
Other Types of Graphs
U. S. A. Cigarette Consumption, 1900-1990
1900 54
1910 151 1920 665 1930 1485 1940 1976 1950 3522 1960 4171 1970 3985 1980 3851 1990 2828
0 500 1000 1500 2000 2500 3000 3500 4000 4500
1900 1910 1920 1930 1940 1950 1960 1970 1980 1990
Number of Cigarettes
Year
U. S. . C ga ette Co su pt o , 900 990
Notes
………...
………...
………...
………...
2-30
A stem-and-leaf plot is a data plot that uses part of a data value as the stem, the most significant digit (i.e. the ‘tens’), and the other part of the data value as the leaf the less
Other Types of Graphs
part of the data value as the leaf, the less significant digits (the ‘units’), to form groups or classes.
It has the advantage over grouped frequency distribution of retaining the actual data while showing them in a graphic form.
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-32
Sample of birthweight (oz) from 40 consecutive deliveries.
58 118 92 108 132
32 140 138 96 161
Other Types of Graphs
32 140 138 96 161
120 86 115 118 95
83 112 128 127 124
123 134 94 67 124
155 105 100 112 141
104 132 98 146 132
93 85 94 116 113
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...
2-33
3 2
4
5 8
6 7
7
8 3 5 6
Other Types of Graphs
8 3 5 6
9 2 3 4 4 5 6 8
10 0 4 5 8
11 2 3 5 6 8 8
12 0 2 3 4 4 7 8
13 2 2 2 4 8
14 0 1 6
15 5 16 1
Dr. Saeed Alghamdi, Statistics Department, Faculty of Sciences, King Abdulaziz University
Notes
………...
………...
………...
………...
………...