Frequency Distributions

Today, businesses collect massive amounts of data they hope will be useful for making decisions. Every time a customer makes a purchase at a store like Macy’s or the Gap, data from that transaction are added to the store’s database. Major retail stores like Walmart capture the number of different product categories included in each “market basket” of items purchased. Table 2.1 shows these data for all customer transactions for one morning at one Walmart store in Dallas. A total of 450 customers made purchases on the morning in question. The first value in Table 2.1 is a 4, which indicates that the customer’s purchase included four different product categories (for example, food, sporting goods, photography supplies, and dry goods).

Although the data in Table 2.1 are easy to capture with the technology of today’s cash registers, in this form, the data provide little or no information that managers could use to determine the buying habits of their customers. However, these data can be converted into useful information through descriptive statistical analysis.

A more effective way to display the Dallas Walmart data would be to construct a frequency distribution.

The product data in Table 2.1 take on only a few possible values (1, 2, 3, ..., 11). The minimum number of product categories is 1 and the maximum number of categories in these data is 11. These data are called discrete data.

When you encounter discrete data, where the variable of interest can take on only a reasonably small number of possible values, a frequency distribution is constructed by counting the number of times each possible value occurs in the data set. We organize these counts into the frequency distribution table shown in Table 2.2. From this frequency distribution we are able to see how the data values are spread over the different number of possible product categories. For instance, you can see that the most frequently occurring number of product categories in a customer’s “market basket” is 4, which occurred 92 times. You can also see that the three most common numbers of product categories are 4, 5, and 6. Only a very few times do customers purchase 10 or 11 product categories in their trip to the store.

Outcome 1

Frequency Distribution

A summary of a set of data that displays the number of observations in each of the distribution’s distinct categories or classes.

Discrete Data

Data that can take on a countable number of possible values.

M02_GROE0383_10_GE_C02.indd 53 05/09/17 2:08 PM

54 Chapter 2

|

Graphs, Charts, and Tables—Describing Your Data

TABLE 2.1 Product Categories per Customer at the Dallas Walmart

4 2 5 8 8 10 1 4 8 3 4 1 1 3 4

1 4 4 5 4 4 4 9 5 4 4 10 7 11 4

10 2 6 7 10 5 4 6 4 6 2 3 2 4 5

5 4 11 1 4 1 9 2 4 6 6 7 6 2 3

6 5 3 4 5 6 5 3 10 6 5 7 7 4 3

8 2 2 6 5 11 9 9 5 5 6 5 3 1 7

6 6 5 3 8 4 3 3 4 4 4 7 6 4 9

1 6 5 5 4 4 7 5 6 6 9 5 6 10 4

7 5 8 4 4 7 4 6 6 4 4 2 10 4 5

4 11 8 7 9 5 6 4 2 8 4 2 6 6 6

6 4 6 5 7 1 6 9 1 5 9 10 5 5 10

5 4 7 5 7 6 9 5 3 2 1 5 5 5 5

5 9 5 3 2 5 7 2 4 6 4 4 4 4 4

6 5 8 5 5 5 5 5 2 5 5 6 4 6 5

5 7 10 2 2 6 8 3 1 3 5 6 3 3 6

5 4 5 3 3 7 9 4 4 5 10 6 10 5 9

4 3 8 7 1 8 4 3 1 3 6 7 5 5 5

4 7 4 11 6 6 3 7 9 4 4 2 9 7 5

1 6 6 8 3 8 4 4 1 9 3 9 3 4 2

9 5 5 7 10 5 3 4 7 7 6 2 2 4 4

4 7 3 5 4 9 2 3 4 3 2 1 6 4 6

1 8 1 4 3 5 5 10 4 4 4 6 9 2 7

9 4 5 3 6 5 5 3 4 6 5 7 3 6 8

3 6 1 5 7 7 5 4 6 6 6 3 6 9 5

4 5 10 1 5 5 7 8 9 1 6 5 6 6 4

10 6 5 5 5 1 6 5 6 4 7 9 10 2 6

4 4 6 11 9 5 4 4 3 5 4 6 2 6 7

3 5 6 7 4 5 4 6 9 4 3 3 6 9 4

3 7 5 6 11 4 4 8 4 2 8 2 4 2 3

6 5 1 10 5 9 5 4 5 1 4 9 5 4 4

TABLE 2.2 Dallas Walmart Product Categories Frequency Distribution

Number of Product Categories    Frequency

1 25

2 29

3 42

4 92

5 86

6 68

7 35

8 19

9 29

10 18

11 7

Total = 450
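The counting procedure behind Table 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses just the first row of Table 2.1 (15 of the 450 observations) to stay short, so its counts will not match Table 2.2, and the names `sample`, `counts`, and `frequency_table` are hypothetical rather than anything from the text.

```python
from collections import Counter

# First row of Table 2.1 only (15 of the 450 observations); a small sample
# chosen to keep the sketch short, so the counts differ from Table 2.2.
sample = [4, 2, 5, 8, 8, 10, 1, 4, 8, 3, 4, 1, 1, 3, 4]

# Count how many times each distinct value occurs in the sample.
counts = Counter(sample)

# List every possible value from the minimum to the maximum, including
# values that never occur in the sample (these get a frequency of 0).
frequency_table = [
    (value, counts.get(value, 0))
    for value in range(min(sample), max(sample) + 1)
]

print("Number of Product Categories  Frequency")
for value, frequency in frequency_table:
    print(f"{value:>28}  {frequency:>9}")
print(f"Total = {sum(counts.values())}")
```

Running the same steps on all 450 values would be expected to reproduce the frequencies in Table 2.2; listing the zero-count values explicitly keeps gaps in the distribution visible.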


2.1 Frequency Distributions and Histograms

|

^{Chapter 2} ⁵⁵

TABLE 2.3 Frequency Distributions of Years of College Education

Philadelphia    Knoxville

Years of College    Frequency    Years of College    Frequency

0 35 0 187

1 21 1 62

2 24 2 34

3 22 3 19

4 31 4 14

5 13 5 7

6 6 6 3

7 5 7 4

8 3 8 0

Total = 160 Total = 330

Consider another example in which a consulting firm surveyed random samples of residents in two cities, Philadelphia and Knoxville. The firm is investigating the labor markets in these two communities for a client that is thinking of relocating its corporate offices to one of the two locations. Education level of the workforce in the two cities is a key factor in making the relocation decision. The consulting firm surveyed 160 randomly selected adults in Philadelphia and 330 adults in Knoxville and recorded the number of years of college attended. The responses ranged from zero to eight years. Table 2.3 shows the frequency distributions for the two cities.

Suppose now we wished to compare the distribution for years of college for Philadelphia and Knoxville. How do the two cities’ distributions compare? Do you see any difficulties in making this comparison? Because the surveys contained different numbers of people, it is difficult to compare the frequency distributions directly. When the number of total observations differs, comparisons are easier to make if relative frequencies are computed. Equation 2.1 is used to compute the relative frequencies.

Table 2.4 shows the relative frequencies for each city’s distribution. This makes a comparison of the two much easier. We see that Knoxville has relatively more people without any college (56.7%) or with one year of college (18.8%) than Philadelphia (21.9% and 13.1%). At all other levels of education, Philadelphia has relatively more people than Knoxville.

Relative Frequency

The proportion of total observations that are in a given category. Relative frequency is computed by dividing the frequency in a category by the total number of observations. The relative frequencies can be converted to percentages by multiplying by 100.

TABLE 2.4 Relative Frequency Distributions of Years of College

Philadelphia    Knoxville

Years of College    Frequency    Relative Frequency    Frequency    Relative Frequency

0    35    35/160 = 0.219    187    187/330 = 0.567

1    21    21/160 = 0.131    62    62/330 = 0.188

2    24    24/160 = 0.150    34    34/330 = 0.103

3    22    22/160 = 0.138    19    19/330 = 0.058

4    31    31/160 = 0.194    14    14/330 = 0.042

5    13    13/160 = 0.081    7    7/330 = 0.021

6    6    6/160 = 0.038    3    3/330 = 0.009

7    5    5/160 = 0.031    4    4/330 = 0.012

8    3    3/160 = 0.019    0    0/330 = 0.000

Total    160    330
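The relative frequencies in Table 2.4 can be reproduced with a short Python sketch. The frequencies come from Table 2.3; the function name `relative_frequencies` and the variable names are assumptions made for this illustration.

```python
# Frequencies from Table 2.3, indexed by years of college (0 through 8).
philadelphia = [35, 21, 24, 22, 31, 13, 6, 5, 3]
knoxville = [187, 62, 34, 19, 14, 7, 3, 4, 0]


def relative_frequencies(frequencies):
    """Divide each frequency by the total number of observations."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


phil_rel = relative_frequencies(philadelphia)
knox_rel = relative_frequencies(knoxville)

# Print the two relative frequency distributions side by side.
print("Years  Philadelphia  Knoxville")
for years, (p, k) in enumerate(zip(phil_rel, knox_rel)):
    print(f"{years:>5}  {p:>12.3f}  {k:>9.3f}")
```

Because each list is divided by its own total, the two columns are on the same 0-to-1 scale even though the sample sizes (160 and 330) differ.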


EXAMPLE 2-1

Frequency and Relative Frequency Distributions

Real Estate Transactions In late 2008, the United States experienced a major economic decline thought to be due in part to the sub-prime loans that many lending institutions made during the previous few years. When the housing bubble burst, many institutions experienced severe problems. As a result, lenders became much more conservative in granting home loans, which in turn made buying and selling homes more challenging.

To demonstrate the magnitude of the problem in Kansas City, a survey of 16 real estate agencies was conducted to collect data on the number of real estate transactions closed in December 2008. The following data were observed:

3 0 0 1

1 2 2 0

0 2 1 0

2 1 4 2

The real estate analysts can use the following steps to construct a frequency distribution and a relative frequency distribution for the number of real estate transactions.

Step 1: List the possible values.

The possible values for the discrete variable, listed in order, are 0, 1, 2, 3, 4.

Step 2: Count the number of occurrences at each value.

The frequency distribution follows:

Transactions Frequency Relative Frequency

0 5 5/16 = 0.3125

1 4 4/16 = 0.2500

2 5 5/16 = 0.3125

3 1 1/16 = 0.0625

4 1 1/16 = 0.0625

Total = 16 1.0000

Step 3: Determine the relative frequencies.

The relative frequencies are determined by dividing each frequency by 16, as shown in the right-hand column above. Thus, just over 31% of those responding reported no transactions during December 2008.
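The three steps of Example 2-1 can also be carried out programmatically. The following Python sketch uses the 16 observations listed above; the variable names (`transactions`, `rows`, and so on) are chosen for this illustration only.

```python
from collections import Counter

# Transactions closed in December 2008 by the 16 agencies (Example 2-1).
transactions = [3, 0, 0, 1,
                1, 2, 2, 0,
                0, 2, 1, 0,
                2, 1, 4, 2]

n = len(transactions)            # total number of observations
counts = Counter(transactions)   # Step 2: occurrences of each value

# Step 1 lists the observed values in order; Step 3 divides each count by n.
rows = [(value, counts[value], counts[value] / n)
        for value in sorted(counts)]

for value, frequency, relative in rows:
    print(f"{value}  {frequency}  {frequency}/{n} = {relative:.4f}")
print(f"Total = {n}")
```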

TRY EXERCISE 2-1 (pg. 70)

HOW TO DO IT (Example 2-1)

Developing Frequency and Relative Frequency Distributions for Discrete Data

1. List all possible values of the variable. If the variable is ordinal level or higher, order the possible values from low to high.

2. Count the number of occurrences at each value of the variable and place this value in a column labeled “Frequency.”

To develop a relative frequency distribution, do the following:

3. Use Equation 2.1 to divide each frequency count by the total number of observations, and place the result in a column headed “Relative Frequency.”

TABLE 2.5 TV Source Frequency Distribution

TV Source Frequency

DISH 80

DIRECTV 90

Cable 20

Other 10

Total = 200

Relative Frequency

$$\text{Relative frequency} = \frac{f_i}{n} \tag{2.1}$$

where:

$f_i$ = Frequency of the $i$th value of the discrete variable

$n = \sum_{i=1}^{k} f_i$ = Total number of observations

$k$ = The number of different values for the discrete variable
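As a worked instance of Equation 2.1, consider the counts from Example 2-1, where the variable takes k = 5 different values (0, 1, 2, 3, 4) with frequencies 5, 4, 5, 1, and 1. Labeling the value 0 as the first value (i = 1) is a choice made for this illustration.

```latex
n = \sum_{i=1}^{5} f_i = 5 + 4 + 5 + 1 + 1 = 16,
\qquad
\text{Relative frequency of 0 transactions} = \frac{f_1}{n} = \frac{5}{16} = 0.3125
```

Multiplying by 100 gives 31.25%, which matches the statement in Example 2-1 that just over 31% of the agencies reported no transactions.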

The frequency distributions shown in Table 2.2 and Table 2.3 were developed from quantitative data. That is, the variable of interest was numerical (number of product categories or number of years of college). However, a frequency distribution can also be developed when the data are qualitative data, or nonnumerical data. For instance, if a survey asked homeowners how they get their TV signal, the possible responses in this region are:

DISH DIRECTV Cable Other

Table 2.5 shows the frequency distribution from a survey of 200 homeowners.


EXAMPLE 2-2

Frequency Distribution for Qualitative Data

Automobile Accidents State Farm Insurance recently surveyed a sample of the records for 15 policy holders to determine the make of the vehicle driven by the eldest member in the household. The following data reflect the results for 15 of the respondents:

Ford Dodge Toyota Ford Buick

Chevy Toyota Nissan Ford Chevy

Ford Toyota Chevy BMW Honda

The frequency distribution for this qualitative variable is found as follows:

Step 1: List the possible values.

For these sample data, the possible values for the variable are BMW, Buick, Chevy, Dodge, Ford, Honda, Nissan, Toyota.

Step 2: Count the number of occurrences at each value.

The frequency distribution is

Car Company Frequency

BMW 1

Buick 1

Chevy 3

Dodge 1

Ford 4

Honda 1

Nissan 1

Toyota 3

Total = 15
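The same counting approach works for qualitative data such as the vehicle makes in Example 2-2. In the Python sketch below, the names `makes` and `make_counts` are illustrative. Listing the categories alphabetically is a display convention assumed here, since a nominal variable has no natural low-to-high order.

```python
from collections import Counter

# Vehicle makes for the 15 policy holders in Example 2-2.
makes = ["Ford", "Dodge", "Toyota", "Ford", "Buick",
         "Chevy", "Toyota", "Nissan", "Ford", "Chevy",
         "Ford", "Toyota", "Chevy", "BMW", "Honda"]

# Count the occurrences of each category.
make_counts = Counter(makes)

# The variable is nominal, so alphabetical order is used only for display.
print("Car Company  Frequency")
for make in sorted(make_counts):
    print(f"{make:<11}  {make_counts[make]:>9}")
print(f"Total = {sum(make_counts.values())}")
```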

TRY EXERCISE 2-7 (pg. 71)

BUSINESS APPLICATION

Frequency Distributions

Athletic Shoe Survey In recent years, a status symbol for many students has been the brand and style of athletic shoes they wear. Companies such as Nike and Adidas compete for the top position in the sport shoe market. A survey was conducted in which 100 college students at a southern state school were asked a number of questions, including how many pairs of Nike shoes they currently own. The data are in a file called SportsShoes.

The variable Number of Nike is a discrete quantitative variable. Figure 2.1 shows the frequency distribution (output from Excel) for the number of Nike shoes owned by those surveyed. The frequency distribution shows that, although a few people own more than six pairs of Nike shoes, most of those surveyed own two or fewer pairs.

Excel Tutorial

From the document Business Statistics: A Decision-Making Approach (pages 54-58)