1. Decide on the appropriate type of graph, recalling that histograms and frequency polygons are appropriate for quantitative data, while bar graphs are appropriate for qualitative data and also are sometimes used with discrete quantitative data.
2. Draw the horizontal axis, then the vertical axis, remembering that the vertical axis should be about as tall as the horizontal axis is wide.
3. Identify the string of class intervals that eventually will be superimposed on the horizontal axis. For qualitative data or ungrouped quantitative data, this is easy: just use the classes suggested by the data. For grouped quantitative data, proceed as if you were creating a set of class intervals for a frequency distribution. (See the box “Constructing Frequency Distributions” on page 27.)
4. Superimpose the string of class intervals (with gaps for bar graphs) along the entire length of the horizontal axis. For histograms and frequency polygons, be prepared for some trial and error, so use a pencil! Do not use a string of empty class intervals to bridge a sizable gap between the origin of 0 and the smallest class interval. Instead, use wiggly lines to signal a break in scale, then begin with the smallest class interval. Also, do not clutter the horizontal scale with excessive numbers; use just a few convenient numbers.
5. Along the entire length of the vertical axis, superimpose a progression of convenient numbers, beginning at the bottom with 0 and ending at the top with a number as large as or slightly larger than the maximum observed frequency. If there is a considerable gap between the origin of 0 and the smallest observed frequency, use wiggly lines to signal a break in scale.
6. Using the scaled axes, construct bars (or dots and lines) to reflect the frequency of observations within each class interval. For frequency polygons, dots should be located above the midpoints of class intervals, and both tails of the graph should be anchored to the horizontal axis, as described under “Frequency Polygons” in Section 2.8.
7. Supply labels for both axes and a title (or even an explanatory sentence) for the graph.
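As a rough illustration of step 6 for a frequency polygon, the sketch below computes the points that would be plotted: one dot above the midpoint of each class interval, plus a zero-frequency point one class width beyond each end so that both tails are anchored to the horizontal axis. The function name `polygon_points` and the small data set are hypothetical and are not taken from the text; the sketch also assumes equal-width classes given as (lower limit, upper limit) pairs, lowest class first.

```python
def polygon_points(intervals, freqs):
    """Return (x, frequency) pairs for a frequency polygon.

    intervals: list of (lower, upper) tabled class limits, lowest class first.
    freqs: observed frequency for each class, in the same order.
    Assumes all classes have the same width.
    """
    # Class width is the distance between successive lower limits.
    width = intervals[1][0] - intervals[0][0]
    # Each dot sits above the midpoint of its class interval.
    midpoints = [(low + high) / 2 for low, high in intervals]
    points = list(zip(midpoints, freqs))
    # Anchor both tails to the horizontal axis with zero-frequency points.
    left_anchor = (midpoints[0] - width, 0)
    right_anchor = (midpoints[-1] + width, 0)
    return [left_anchor] + points + [right_anchor]


# Hypothetical grouped data (not from the text).
intervals = [(10, 19), (20, 29), (30, 39), (40, 49)]
freqs = [3, 7, 5, 1]
points = polygon_points(intervals, freqs)
for x, f in points:
    print(x, f)
```

The vertical axis for these points would run from 0 to a convenient number at or slightly above the largest frequency (here, 7), as described in step 5.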
constructed, step by step, to comply with a number of guidelines. (See the box “Constructing Frequency Distributions” on page 27.) Essentially, a well-constructed frequency distribution consists of a string of non-overlapping, equal classes that occupy the entire distance between the largest and smallest observations.
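The idea of a string of non-overlapping, equal classes spanning the smallest to the largest observation can be sketched in a few lines. This is a minimal sketch, not the full set of guidelines from the box: the function name `grouped_frequencies` and the scores are hypothetical, the scores are assumed to be whole numbers (unit of measurement of 1), and starting the lowest class at a multiple of the class width is an assumption made here for convenience.

```python
def grouped_frequencies(scores, width):
    """Tally whole-number scores into equal, non-overlapping classes.

    Returns a list of ((lower, upper), frequency) pairs, lowest class first.
    """
    # Assumption: start the lowest class at a multiple of the class width.
    low = (min(scores) // width) * width
    highest = max(scores)
    classes = []
    while low <= highest:
        # Tabled limits for whole numbers: e.g. 10-19 when width is 10.
        classes.append((low, low + width - 1))
        low += width
    counts = [sum(lo <= s <= hi for s in scores) for lo, hi in classes]
    return list(zip(classes, counts))


# Hypothetical scores (not from the text).
scores = [12, 15, 21, 22, 28, 33, 37, 39, 40, 44]
distribution = grouped_frequencies(scores, width=10)
# Print with the highest class at the top, as in the tables of this chapter.
for (lo, hi), f in reversed(distribution):
    print(f"{lo}-{hi}  {f}")
```

Because every score falls into exactly one class, the frequencies always sum to the total number of observations.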
Very extreme scores, or outliers, require special attention. Given a valid outlier, you might choose to relegate it to a footnote because of its potential for distortion, or you might even concentrate on it as a possible key to understanding the data.
When comparing two or more frequency distributions based on appreciably different total numbers of observations, it is often helpful to express frequencies as relative frequencies.
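A small sketch may make the point concrete. The two sets of frequencies below are hypothetical (they do not come from the text) and were chosen so that the totals differ greatly, 10 versus 200, while the proportions are identical; the function name `relative_frequencies` is likewise just an illustration.

```python
def relative_frequencies(freqs):
    """Convert frequencies to proportions of the total number of observations."""
    total = sum(freqs)
    return [f / total for f in freqs]


# Hypothetical frequencies for the same three classes in two groups.
group_a = [2, 5, 3]        # 10 observations in all
group_b = [40, 100, 60]    # 200 observations in all

rel_a = relative_frequencies(group_a)
rel_b = relative_frequencies(group_b)
print([round(p, 2) for p in rel_a])
print([round(p, 2) for p in rel_b])
```

The raw frequencies look very different, but once each is divided by its own total, the two distributions turn out to have the same relative frequencies.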
When relative standing within the distribution is important, convert frequency distributions into cumulative percentages, referred to as percentile ranks. The percentile rank of a score indicates the percentage of scores in the entire distribution with similar or smaller values.
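The conversion to cumulative percentages can be sketched as follows. The frequencies are hypothetical and must be listed from the lowest class to the highest; the function name `cumulative_percents` is an illustrative choice, not a term from the text.

```python
from itertools import accumulate


def cumulative_percents(freqs_low_to_high):
    """Return the cumulative percentage for each class, lowest class first."""
    total = sum(freqs_low_to_high)
    # accumulate() yields running totals: f1, f1 + f2, f1 + f2 + f3, ...
    return [100 * c / total for c in accumulate(freqs_low_to_high)]


# Hypothetical frequencies, lowest class first (not from the text).
freqs = [4, 6, 8, 2]
ranks = cumulative_percents(freqs)
print(ranks)  # [20.0, 50.0, 90.0, 100.0]
```

Read the result as approximate percentile ranks: for example, 90 percent of these hypothetical scores fall in the third class or below.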
Frequency distributions for qualitative data are easy to construct. They also can be converted into relative frequency distributions and, if the data can be ordered because of ordinal measurement, into percentile ranks.
Frequency distributions can be converted into graphs.
If the data are quantitative, histograms, frequency polygons, or stem and leaf displays are often used. Frequency polygons are particularly useful when two or more frequency distributions are to be included in the same graph.
Shape is an important characteristic of a histogram or a frequency polygon. Smoothed frequency polygons were used to describe four of the more typical shapes: normal, bimodal, positively skewed, and negatively skewed.
Bar graphs are often used with qualitative data and sometimes with discrete quantitative data. They resemble histograms except that gaps separate adjacent bars in bar graphs.
When interpreting graphs, beware of various unscrupulous techniques, such as using bizarre combinations of axes to either exaggerate or suppress a particular data pattern.
When constructing graphs, refer to the step-by-step procedure described in the box “Constructing Graphs” on page 42.
Important Terms
Frequency distribution
Frequency distribution for ungrouped data
Frequency distribution for grouped data
Unit of measurement
Real limits
Outlier
Relative frequency distribution
Cumulative frequency distribution
Percentile rank
Histogram
Frequency polygon
Stem and leaf display
Positively skewed distribution
Negatively skewed distribution
Bar graph
REVIEW QUESTIONS
2.14 (a) Construct a frequency distribution for the number of different residences occupied by graduating seniors during their college career, namely
1, 4, 2, 3, 3, 1, 6, 7, 4, 3, 3, 9, 2, 4, 2, 2, 3, 2, 3, 4, 4, 2, 3, 3, 5
(b) What is the shape of this distribution?
2.15 The number of friends reported by Facebook users is summarized in the following frequency distribution:
FRIENDS        f
400–above      2
350–399        5
300–349       12
250–299       17
200–249       23
150–199       49
100–149       27
50–99         29
0–49          36
Total        200
(a) What is the shape of this distribution?
(b) Find the relative frequencies.
(c) Find the approximate percentile rank of the interval 300–349.
(d) Convert to a histogram.
(e) Why would it not be possible to convert to a stem and leaf display?
2.16 Assume that student volunteers were assigned arbitrarily (according to a coin toss) either to be trained to meditate or to behave as usual. To determine whether meditation training (the independent variable) influences GPAs (the dependent variable), GPAs were calculated for each student at the end of the one-year experiment, yielding these results for the two groups:
MEDITATORS            NONMEDITATORS
3.25  2.25  2.75      3.67  3.79  3.00
3.56  3.33  2.25      2.50  2.75  1.90
3.57  2.45  3.75      3.50  2.67  2.90
2.95  3.30  3.56      2.80  2.65  2.58
3.56  3.78  3.75      2.83  3.10  3.37
3.45  3.00  3.35      3.25  2.76  2.86
3.10  2.75  3.09      2.90  2.10  2.66
2.58  2.95  3.56      2.34  3.20  2.67
3.30  3.43  3.47      3.59  3.00  3.08
(a) What is the unit of measurement for these data?
(b) Construct separate frequency distributions for meditators and for nonmeditators. (First, construct the frequency distribution for the group having the larger range. Then, to facilitate comparisons, use the same set of classes for the other frequency distribution.)
(c) Do the two groups tend to differ? (Eventually, tools from inferential statistics, as described in Part 2, will help you decide whether any apparent difference between the two groups probably is real or merely transitory, that is, attributable to variability or chance. See Review Question 14.15 on page 271.)
*2.17 Are there any conspicuous differences between the two distributions in the following table (one reflecting the ages of all residents of a small town and the other reflecting the ages of all U.S. residents)?
(a) To help make the desired comparison, convert the frequencies (f) for the small town to percentages.
(b) Describe any seemingly conspicuous differences between the two distributions.
(c) Using just one graph, construct frequency polygons for the two relative frequency distributions.
*NOTE: When segmenting the horizontal axis, assign the same width to the open-ended interval (65–above) as to any other class interval. (This tactic causes some distortion at the upper end of the histogram, since one class interval is doing the work of several. Nothing is free, including the convenience of open-ended intervals.)
Answers on pages 423 and 424.
TWO AGE DISTRIBUTIONS
AGE          SMALL TOWN f    U.S. POPULATION (2010) (%)
65–above     105             13
60–64         53              5
55–59         45              6
50–54         40              7
45–49         44              7
40–44         38              7
35–39         31              7
30–34         27              6
25–29         25              7
20–24         20              7
15–19         20              7
10–14         19              7
5–9           17              7
0–4           16              7
Total        500             99%
NOTE: The top class (65–above) has no upper boundary. Although less preferred, as discussed previously, this type of open-ended class is employed as a space-saving device when, as in the Statistical Abstract of the United States, many different tables must be listed.
Source: 2012 Statistical Abstract of the United States.
2.18 The following table shows distributions of bachelor’s degrees earned in 2011–2012 for selected fields of study by all male graduates and by all female graduates.
(a) How many female psychology majors graduated in 2011–2012?
(b) Since the total numbers of male and female graduates are fairly different (600.0 thousand and 803.6 thousand), it is helpful to convert first to relative frequencies before making comparisons between male and female graduates. Then, inspect these relative frequencies and note what appear to be the most conspicuous differences between male and female graduates.
(c) Would it be meaningful to cumulate the frequencies in either of these frequency distributions?
(d) Using just one graph, construct bar graphs for all male graduates and for all female graduates. Hint: Alternate shaded and unshaded bars for males and females, respectively.
BACHELOR’S DEGREES EARNED IN 2011–2012 BY SELECTED FIELD OF STUDY AND GENDER (IN THOUSANDS)
MAJOR FIELD OF STUDY MALES FEMALES
Business 190.0 176.7
Social sciences 90.6 87.9
Education 21.8 84.0
Health sciences 24.9 138.6
Psychology 25.4 83.6
Engineering 81.3 17.3
Life sciences 39.5 56.3
Fine arts 37.2 58.6
Communications 33.5 55.2
Computer sciences 38.8 8.6
English 17.0 36.8
Total 600.0 803.6
Source: 2013 Digest of Educational Statistics at http://nces.ed.gov.
*2.19 The following table is slightly more complex than previous tables, and shows both frequency distributions and relative frequency distributions of race/ethnicity for the U.S. population in 1980 and in 2010. It also shows the frequency (f) change and the percent (%) change of race/ethnicity between 1980 and 2010.
(a) Which group changed the most in terms of the actual number of people?
(b) Relative to its size in 1980, which group increased most?
(c) Relative to its size in 1980, which group increased less rapidly than the general population?
(d) What is the most striking trend in these data?
Answers on page 424.
RACE/ETHNICITY OF U.S. POPULATION (IN MILLIONS)
                        1980            2010            1980–2010
RACE/ETHNICITY          f       %       f       %       f CHANGE    % CHANGE
African American         26.7    12      37.7    12     11.0          41
Asian American            5.2     2      17.2     6     12.0         231
Hispanic                 14.6     6      50.5    17     35.9         246
White                   180.1    80     196.8    65     16.7           9
Total                   226.6   100     302.2   100     75.6          33
NOTE: The last column expresses the 1980–2010 change as a percentage of the 1980 population for that row.
Source: www.uscensus.gov/prod/cen2010/
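The two change columns in the table above can be reproduced with a short calculation, following the NOTE: the frequency change is the 2010 value minus the 1980 value, and the percent change expresses that difference as a percentage of the 1980 value. The sketch below uses the population figures (in millions) from the table; the function name `percent_change` and the dictionary layout are illustrative choices, not part of the text.

```python
def percent_change(old, new):
    """Change from old to new, expressed as a percentage of old."""
    return 100 * (new - old) / old


# Population in millions, (1980, 2010), taken from the table above.
population = {
    "African American": (26.7, 37.7),
    "Asian American": (5.2, 17.2),
    "Hispanic": (14.6, 50.5),
    "White": (180.1, 196.8),
    "Total": (226.6, 302.2),
}

for group, (y1980, y2010) in population.items():
    f_change = round(y2010 - y1980, 1)
    pct = round(percent_change(y1980, y2010))
    print(f"{group}: f change = {f_change}, % change = {pct}")
```

For example, the African American row gives 100 × (37.7 − 26.7) / 26.7, which rounds to 41, matching the last column of the table.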