Statistic for Business
Week 2
Agenda
Time Activity
90 minutes Central Tendency 60 minutes Variation and Shape
Objectives
By the end of this class, student should be able to understand:
• How to measures central tendency in statistics
Numerical Descriptive Measures
Central Tendency
Variation and Shape Measures for
a Population
The Covariance
and The Coefficient of
Central Tendency
Mean (Arithmetic
Mean)
Median
Mean
Consider this height data:
160 157 162 170 168 174 156 173 157
Mean
Sample size
n
Observed values The ith value
Mean
How about this data of business statistic’s
students monthly spending:
What is the MEAN?
Monthly Spending Frequency
less than Rp. 500.000 2
Mean
In this case we can only ESTIMATE the MEAN…
Keyword: “MIDPOINTS”
Spending Frequency
less than Rp. 500.000 2
Estimated Mean
Midpoint Frequency Mid * f
250000 2 500000
750000 7 5250000
1250000 13 16250000
1750000 5 8750000
Total 27 30750000
Mean
The following is “Student A” Score:
What is the average score of “Student A”?
Course Credits Score
Business Math 3 60
English 2 80
Organization Behavior 3 100
Statistics 4 90
Mean
Consider these two sets of data:
A 150 152 154 155 155
155 155 155 155 157 Mean?
B 150 152 154 155 155
It is DANGEROUS to ONLY use
MEAN in
Median
Consider these two sets of data:
A 150 152 154 155 155
155 155 155 155 157 Median?
B 150 152 154 155 155
Median
What is the median of this height data:
160 157 162 170 168 174 156 173 157
How about this data:
Median
How about this data of business statistic’s
students monthly spending:
What is the MEDIAN?
Monthly Spending Frequency
less than Rp. 500.000 2
Median
The MEDIAN group of monthly spending is Rp.
1.000.000 but less than Rp. 1.500.000
Or
ESTIMATE the
Estimated Median
Estimated Median = Rp. 1.173.076,92
Monthly Spending Frequency
less than Rp. 500.000 2
Mode
What is the mode of this height data:
160 157 162 170 168 174 156 173 157
How about this data:
Mode
How about this data of business statistic’s
students monthly spending:
What is the MODE?
Spending Frequency
less than Rp. 500.000 2
Mode
The MODAL group of monthly spending is Rp.
1.000.000 but less than Rp. 1.500.000
But the
actual Mode may not even be in that
Mode
Without the raw data we don't really know…
Estimated Mode
Estimated Mode = Rp. 1.214.285,72
Spending Frequency
less than Rp. 500.000 2
Central Tendency
Central Tendency
Arithmetic
Middle value in the ordered array
Most
3.10
This is the data of the amount that sample of nine customers spent for lunch ($) at a fast-food restaurant:
4.20 5.03 5.86 6.45 7.38 7.54 8.46 8.47 9.87
3.12
The following data is the overall miles per gallon (MPG) of 2010 small SUVs:
24 23 22 21 22 22 18 18 26
26 26 19 19 19 21 21 21 21
21 18 29 21 22 22 16 16
Compounding Data
Interest
Rate
Growth
Rate
Compounding Data
Suppose you have invested your savings in the stock market for five years. If your returns each year were 90%, 10%, 20%, 30% and -90%, what would your average return be during this
Compounding Data
If we use arithmetic mean in this case
Geometric Mean
This is called geometric mean rate of return
1
Measure of Central Tendency For The Rate Of Change Of A Variable Over Time:
The Geometric Mean & The Geometric Rate of Return
 Geometric mean
 Used to measure the rate of change of a variable over time
 Geometric mean rate of return
 Measures the status of an investment over time
Geometric Mean
Lets reconsider the previous problem. We knew that we invest $100 in year 0 (zero). However, by the end of year 5 the value of the stock became $32.6. Calculate the annual average return!
•
$100
Year 0
•
$32.6
Geometric Mean
Population of West Java
Population of West Java:
• Year 2000: 35.729.537
• Year 2010: 43.053.732
3.22
3.22
a. Compute the geometric mean rate of return per year for platinum, gold, and silver from 2006 through 2009.
Variation and Shape
Range
Variance and Standard Deviation
Coefficient of Variation
Z Scores
Review on Central Tendency
Consider this data:
160 157 162 170 168 174 156 173 157 150
Range
Consider this data:
160 157 162 170 168 174 156 173 157 150
Range
min
max
X
X
Measures of Variation:
Why The Range Can Be Misleading
 Ignores the way in which data are distributed
 Sensitive to outliers
Variance and Standard Deviation
Variance
Standard
Deviation
Let’s see this data again:
160 157 162 170 168 174 156 173 157 150
What is the mean?
Variance and Standard Deviation
Data Deviation (Dev)^2
160 -2.7 7.29
Variance and Standard Deviation
• Sample
• Population
Measures of Variation:
Comparing Standard Deviations
Standard Deviation
How about this data of business statistic’s
students monthly spending:
What is the STANDARD DEVIATION?
Monthly Spending Frequency
less than Rp. 500.000 2
Standard Deviation
How about this data of business statistic’s
students monthly spending:
What is the STANDARD DEVIATION?
Monthly Spending Frequency
less than Rp. 500.000 2
Rp. 500.000 but less than Rp. 1.000.000 7 Rp. 1.000.000 but less than Rp. 1.500.000 13 Rp. 1.500.000 but less than Rp. 2.000.000 5
Estimated Standard Deviation
Midpoint Frequency Dev^2 (Dev^2)*f
250000 2 790123456790.12 1580246913580.25
750000 7 151234567901.24 1058641975308.64
1250000 13 12345679012.35 160493827160.49
1750000 5 373456790123.46 1867283950617.28
Total 27 4666666666666.67
𝑉𝑎𝑟𝑖𝑎𝑛𝑐𝑒 = 4666666666666.67 27 = 172839506172.84
The Coefficient of Variation
The Coefficient of Variation
Let’s see this height data again:
160 157 162 170 168 174 156 173 157 150
What is the mean and standard deviation
The Coefficient of Variation
Students with height before is weighted as follows:
50 55 57 52 55
69 60 65 71 70
What is mean and standard deviation?
The Coefficient of Variation
Height Weight
Mean 162.7 60.4
SD 8.125 7.8
Which one has more variability?
Coefficient of Variation: CVHeight = 4.99%
Locating Extreme Outliers: Z Score
Let’s see this height data again:
160 157 162 170 168 174 156 173 157 150
160 162 164 150 152 154 156 158
Mean = 162.7
166 168 170 172 174
Locating Extreme Outliers: Z Score
Therefore, Z Score for 160 is?
160 162 164 150 152 154 156 158
Mean = 162.7
166 168 170 172 174
Z Score162.7 = 0
Z Score154.6 = -1 Z Score
170.8 = 1
Locating Extreme Outliers: Z Scores
Let’s see this height data again:
160 157 162 170 168 174 156 173 157 150
What is the Z Scores of 160, 174, 168 and 150?
Locating Extreme Outliers: Z Score
SD
X
X
Z
X
• A data value is considered an extreme outlier if its Z-score is less than -3.0 or greater than +3.0.
Shape
Let’s see this height data again:
160 157 162 170 168 174 156 173 157 150
160 162 164 150 152 154 156 158
Mean = 162.7
166 168 170 172 174
Shape
What if the height data is like this:
163 168 162 170 168 174 156 173 157 150
160 162 164 150 152 154 156 158
Mean = 164.1
166 168 170 172 174
Median = 165.5
Shape
Describes how data are distributed
Mean = Median
Mean < Median Median < Mean
Exploring Numerical Data
Quartiles
Interquartile Range
Quartiles
Let’s consider this height data:
160 157 162 170 168 174 156 What is the Q1, Q2 and Q3?
Q1 = 157
Quartiles
Let’s then consider this height data:
160 157 162 170 168 174 156 173 150 What is the Q1, Q2 and Q3?
Q1 = 156.5
Quartiles
And this height data:
160 157 162 170 168 174 156 173 157 150 What is the Q1, Q2 and Q3?
Q1 = 157
Quartiles
160 162 164 150 152 154 156 158
Median = 161
166 168 170 172 174
Q1 = 157 Q
3 = 170
25% 25% 25% 25%
Interquartile Range
160 162 164
150 152 154 156 158 166 168 170 172 174
Q1 = 157 Q
3 = 170
50%
Interquartile Range
160 162 164
150 152 154 156 158 166 168 170 172 174
Q1 = 157 Q
3 = 170
What is the Interquatile range?
Interquartile Range
1 3
Q
Q
Range
ile
Five-Number Summary
max 3
1
min
Q
Median
Q
X
Five-Number Summary
Let’s see again this height data:
160 157 162 170 168 174 156 173 157 150 What is the Five-Number Summary?
Boxplot
Boxplot
Let’s see again this height data:
160 157 162 170 168 174 156 173 157 150
Boxplot for the Height of Business
Statistic’s Student 2014
150 157 161 170 174
Distribution Shape and The Boxplot
Right-Skewed
Left-Skewed Symmetric
Karl Pearson's Measure of Skewness
S
Median X
Bowley's Formula for Measuring
Skewness
150 157 161 170 174
3.10
This is the data of the amount that sample of nine customers spent for lunch ($) at a fast-food restaurant:
4.20 5.03 5.86 6.45 7.38 7.54 8.46 8.47 9.87 a. Compute the mean and median.
b. Compute the variance, standard deviation, and range c. Are the data skewed? If so, how?
3.62
3.62
73 19 16 64 28 28 31 90 60 56 31 56 22 18 a. Compute the mean, median, first quartile, and third quartile.
b. Compute the range, interquartile range, variance, and standard deviation.
c. Are the data skewed? If so, how?
3.22
3.22
a. Compute the geometric mean rate of return per year for platinum, gold, and silver from 2006 through 2009.
3.66
3.66
a. For each variable, compute the mean, median, first quartile, and third quartile.
b. For each variable, compute the range, variance, and standard deviation
c. Are the data skewed? If so, how?
d. Compute the coefficient of correlation between calories and total fat.