There are seven steps to including graphics in a statistical report with excellent results:
1. Plan the list of figures. Here, we lay out the logic of our report.
2. Plan each chart or graph. First, we choose a chart type. Then we think through, and write down, the details.
3. Prepare each chart or graph. Here, we generate the graph from the numbers.
4. Prepare text to go with each chart or graph. Work with the examples in this chapter so that you can talk to your audience through each chart. Then write one to three paragraphs that do just that, whether people will be reading the report or you will be giving a presentation.
5. Check each chart or graph against the numbers. Does everything look right? Check titles, axes, variables, and values.
6. Check each figure against the table of graphs representing statistics.
7. Do a final copy-edit and proofreading of the entire report, figures included. Be sure to have someone else check your work. We catch each other’s mistakes much better than we catch our own.
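The check of each chart against the numbers can be partly automated. The following is only a minimal sketch in Python: it assumes the chart is described by a plain dictionary (a hypothetical layout with the keys `title`, `x_label`, `y_label`, `categories`, and `values`, invented here for illustration) and that the source numbers are a dictionary of category names and values.

```python
def check_chart(spec, data):
    """Compare a chart description with its source data; return a list of problems."""
    problems = []
    # Titles and axis labels must be present and non-empty.
    for key in ("title", "x_label", "y_label"):
        if not spec.get(key, "").strip():
            problems.append(f"missing {key}")
    # Pair each plotted category with the value drawn for it.
    plotted = dict(zip(spec.get("categories", []), spec.get("values", [])))
    # Every value in the data must appear on the chart, unchanged.
    for name, value in data.items():
        if name not in plotted:
            problems.append(f"{name} is in the data but not on the chart")
        elif plotted[name] != value:
            problems.append(f"{name}: chart shows {plotted[name]}, data says {value}")
    # Nothing may appear on the chart that is not in the data.
    for name in plotted:
        if name not in data:
            problems.append(f"{name} is on the chart but not in the data")
    return problems


# Hypothetical numbers, made up for this example.
data = {"Q1": 10, "Q2": 12}
spec = {"title": "Net income", "x_label": "Quarter", "y_label": "Income",
        "categories": ["Q1", "Q2"], "values": [10, 15]}
print(check_chart(spec, data))
```

A check like this catches mismatched values and missing labels, but it does not replace a second person looking at the finished figure.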
Several of these steps are worth a bit more explanation.
Planning the list of figures
If we make a logical list of the points we want to make, we can build a series of images that help our readers or listeners move from knowing nothing about a subject to knowing what we know. Each point should require a relatively small amount of information, and especially, it should not introduce many new types of information all at once. Once we have our logical presentation—or argument, or case, although we’re not going to argue with anyone—we can look at each point and decide if it needs a graph, and what kind of graph would be best. If we skip this step, we are likely to find that we are creating very complicated graphs because we are packing too many points into one picture.
HANDY HINTS Get Some Style and Save Some Time With Templates and Procedures
If we have several graphs of the same type (all line graphs or all bar charts, for instance), we should put them all in the same style, using consistent borders, fonts, line weights, and so forth. We can do this by creating a template, a procedure, or both. A template is a file with the basic settings for the graph all ready to go. To prepare each graph, we just drop in the data and make final adjustments. A procedure is a written series of steps that tells you, and others, what settings to choose in your graphics program to get the results you want.
The best approach is to make your first graph and play with it until it is exactly the way you want it. Then copy the axes and settings without the data to make the template, and note the steps and settings to write the procedure. Templates and procedures are particularly useful when graphics are being created by different people or with different tools. And, of course, you can keep them for your next project, producing high-quality graphs time after time.
To go to the last step, be willing to change style where it helps the presentation, but keep changes small. And, if you make an improvement that you like, be sure to change the template and procedure so you can use it the next time, and the next.
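For readers who build graphs with a script rather than a graphics program, the template idea might look something like the sketch below. It is an illustration only: the setting names (`font`, `line_weight`, and so on) are hypothetical and do not belong to any particular tool. The template holds the style with no data, and each chart is a copy of the template with the data dropped in.

```python
import copy

# Hypothetical style settings for all line graphs in one report.
LINE_GRAPH_TEMPLATE = {
    "chart_type": "line",
    "font": "Helvetica",
    "font_size": 10,
    "line_weight": 1.5,
    "border": True,
    "title": "",
    "x_label": "",
    "y_label": "",
    "series": [],   # the template itself carries no data
}


def chart_from_template(template, title, x_label, y_label, series, **overrides):
    """Copy the template, drop in the data, then apply small final adjustments."""
    chart = copy.deepcopy(template)      # never modify the template itself
    chart.update(title=title, x_label=x_label, y_label=y_label, series=series)
    chart.update(overrides)              # keep style changes small and explicit
    return chart


# Hypothetical data, made up for this example.
sales = chart_from_template(LINE_GRAPH_TEMPLATE, "Sales", "Quarter", "Units",
                            [[1, 2, 3, 4], [120, 135, 128, 150]])
costs = chart_from_template(LINE_GRAPH_TEMPLATE, "Costs", "Quarter", "Dollars",
                            [[1, 2, 3, 4], [80, 82, 90, 88]], line_weight=2.0)
```

Because every chart starts from the same copy, an improvement made to the template carries over to every later graph, which is the point of keeping the template up to date.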
Planning each chart or graph
Here, we choose the type of chart or graph that we will use for each point we want to make in our report. Remember that we may not need a graphic at all, if plain text or a simple table will do. When we do need a graph, we can choose the type from Table 7-1.
Table 7-1 Chart types.
Chart name | Purpose or example
Pie chart | For showing proportional values of a variable in relation to the total.
Bar charts
Horizontal bar chart | For comparing values in relation to one another, especially where we usually think of the item in horizontal terms, such as time, or where it is easiest to lay the graphic out horizontally.
Simple vertical bar chart | For comparing values in relation to one another, especially where we are comparing quantities or value.
Side-by-side or mirrored bar chart | For comparing two sets of values where the names of the values are the same, but the populations are different.
Segmented and stacked bar chart | The major variable is shown as a simple vertical bar chart. Each bar is then segmented into two or three parts, so that each variable is split into ranges, marked by color or shading, that indicate the values of another variable, most often a sub-category.
Multiple bar chart | Here, we cluster sets of bars in order, so that the chart can be read two ways to see how variables interact with one another. We might plot time, in quarters or years, and cluster divisions within each time or year, showing net income on the vertical axis. That would allow a comparison across time of each division, and a comparison across divisions within each time period.
Pareto diagram | A specialized bar chart for a single nominal variable.
Histogram or area chart | A chart (a bar chart in the case of a histogram, or a line graph in the case of an area chart) where the area shown on the chart shows the proportion of values in the sample. This requires meeting a number of specifications for the data and the graph. It allows visual inspection and representation of the sample for statistical and other purposes.
Line graphs
Single line graph | To show how a single variable changes across the X-axis. Very often, the X-axis is time.
Multiple line graph | To show how multiple variables change across the X-axis.
Area graph | To emphasize the areas on a line graph. We might want to emphasize one area that is, effectively, the subtraction of one value from another. Or we might want to compare several areas.
3-D graphs
3-D bar chart | To compare two different variables in relation to a third variable.
Contour plot | To compare the effects of two different variables on a third variable.
Other types of graphs
Scatter plot | To evaluate the relationship between two different variables in a sample. This is specifically a statistical tool related to regression. Each sample is shown as a dot on an area, its location indicating its value for the variable of each of the two axes.
Statistical map | To show values of a variable across a region. This could be a geographical region. However, we can also consider some astronomical charts to be statistical maps of space, and it is possible to create statistical maps of other actual or theoretical regions, as well.
Stem-and-leaf | To get a quick look at the shape of a distribution, especially when no computer is available.
Box-and-whisker plot | To detect extreme values or other unusual characteristics of a sample distribution.
Bubble chart | To add a third variable to a scatter plot or map, where the third variable indicates a magnitude.
Contour plot | To show a third continuous variable that is dependent on the first two continuous variables.
Diamond chart | To show the relationship between actual performance on four variables against a standard.
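The stem-and-leaf display in the table is simple enough to sketch in a few lines. The Python below is one possible version, under the assumption that the values are non-negative whole numbers, with the tens as the stem and the units digit as the leaf. The function name and the sample values are made up for illustration.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display (stem = tens, leaf = units)."""
    if not values:
        return []
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include empty stems so that gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}")
    return lines


# Hypothetical sample, made up for this example.
for line in stem_and_leaf([12, 15, 21, 34, 36, 39, 30]):
    print(line)
```

Each row is one stem, and the length of the row shows how many values fall in that range, so the display doubles as a rough sideways histogram.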
CRITICAL CAUTION Crucial Exceptions
There are times when a certain type of chart may seem like the right one, but using it would give the wrong impression. Be sure to read "Errors to avoid" when planning your figures.
Any list of options is incomplete, and prone to error. Think of Table 7-1 as a sample of the population of types of graphs you might use. There are some graphs you will encounter—especially the specialized graphs of your industry—which are not listed. In addition, there are some graphs that don’t fall easily into our categories. For instance, we can put a line through a bar graph, to illustrate both value and change.
Also, the lengths of our bars and the heights of our lines do not need to indicate frequency. They can indicate count, or any numerical value. Line graphs can show cumulative values. There are many options, and the best way to learn them is to know the graphs in your industry, find good examples, and use them as templates. Another excellent way to learn is to find bad graphs, understand what is wrong with them, and fix them, as illustrated in our case study.
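As a small illustration of the cumulative option, the same series can be plotted either as period-by-period counts or as a running total. The sketch below uses made-up monthly counts and Python's standard `itertools.accumulate`. The helper name `cumulative` is our own.

```python
from itertools import accumulate


def cumulative(values):
    """Turn period-by-period values into running totals for a cumulative line graph."""
    return list(accumulate(values))


# Hypothetical monthly counts, made up for this example.
monthly = [3, 5, 2, 4]
print(cumulative(monthly))   # heights of the cumulative line at each month
```

The final point of the cumulative line equals the total of all the counts, which is a quick check to make before plotting.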
CASE STUDY
What Type of Graph to Use? The Case of the Bad Weather Graph
Picking the right sort of graph is a complex process. There are certain rules of thumb, but there is also a real need for applied common sense. Here is a lovely case of using a multiple bar chart when an area chart would be a much better choice. A prominent cable TV channel that specializes in weather forecasts (we will mention no names!) also has a website that allows the user to display local weather and related information, including information about the local climate. The local climate, broken out by months, is displayed on the website in a multiple bar chart.
The original, in orange and blue, was a bit easier to read. Let that be a caution to those who rely on color in a world still full of black-and-white copiers and printers! First, note the poor spacing. When creating a multiple bar chart, it is always a good idea to have all of the bars in each cluster close together, or even touching, and to have greater separation between clusters. This allows the reader to see the clusters more easily. In Fig. 7-16, the only graphical elements that tell you that two bars belong to the same month are the clumsy, distracting grid lines in the background.
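The spacing rule can be expressed as simple arithmetic. The sketch below, in Python, computes the left edge of every bar so that bars within a cluster touch and clusters are separated by a gap. The function name and the default widths are assumptions chosen for illustration, not settings from any graphics program.

```python
def cluster_positions(n_clusters, bars_per_cluster, bar_width=1.0, gap=1.0):
    """Return, for each cluster, the left x-edge of each of its bars.

    Bars inside a cluster touch one another; clusters are separated by `gap`.
    """
    cluster_width = bars_per_cluster * bar_width
    positions = []
    for c in range(n_clusters):
        start = c * (cluster_width + gap)   # left edge of this cluster
        positions.append([start + b * bar_width for b in range(bars_per_cluster)])
    return positions


# Three clusters of two bars each, for example three months with a high and a low.
print(cluster_positions(3, 2))
```

With these defaults the gap between clusters is as wide as one bar, so the eye groups the bars by cluster without needing grid lines.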
Second, bars are usually used to indicate amounts, and temperatures are not amounts of anything. Even absolute, or Kelvin, temperatures are not amounts of heat, and, on the Fahrenheit temperature scale, the zero point is arbitrary.
Third, bar charts, whether vertical or horizontal or histogram, are designed to compare relative values. We might want to compare average temperature between January and June (although that is unlikely), but we would never need to compare the average high temperature for a month to the average low. We know before we look at the graph. The high will be higher than the low.
Fig. 7-16. Wrong choice of graph: monthly average temperature.
Now that we’ve shown a bar graph is not a good chart, let’s see why a line graph might do what we want. What is it that our reader might like to know from these data? Most likely, it is the trend of changing temperature across the seasons. Trends are better shown with line graphs. (Since trends in time are about change, always consider a line graph before a bar graph, when the x-axis is time.) Since we have two variables (high and low), we will need a multi-line graph. As we look at this graph, we see that the value of one variable is always higher than the other, so we should consider an area chart. An area chart is a good idea when the area between two lines on a line graph is meaningful. Here, the area between the average low and the average high for the month is the range of temperatures you can expect to feel on an average day. Above in Fig. 7-17, we have taken the data (kindly supplied by this same website) and created a simple area chart using Microsoft Excel.
Note that we could have easily added the record lows and highs for each month, since they would be below and above the average lows and highs. We could probably have actually used daily, instead of monthly information, keeping only the months labeled along the x-axis. While the bar chart above looks crowded and cramped with only the little bit of information it displays so poorly, the area chart is so uncluttered that we could probably add more information without either distracting or confusing our readers.
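To make the idea of a meaningful area concrete, here is a rough text-only sketch in Python that marks just the band between each month's low and high. It is not the Excel chart from the case study, and the temperatures are invented for illustration, not taken from the website. The months run down the page rather than across it, but the principle is the same: only the range between the two values is filled.

```python
def band_rows(labels, lows, highs, scale_min, scale_max, width=40):
    """Return one text row per label, with '#' filling the band from low to high."""
    span = scale_max - scale_min
    rows = []
    for label, low, high in zip(labels, lows, highs):
        if not scale_min <= low <= high <= scale_max:
            raise ValueError(f"{label}: need scale_min <= low <= high <= scale_max")
        # Convert both values to character columns on the chosen scale.
        start = round((low - scale_min) / span * width)
        end = round((high - scale_min) / span * width)
        rows.append(f"{label:<4}" + " " * start + "#" * (end - start))
    return rows


# Hypothetical average lows and highs (degrees Fahrenheit), made up for this example.
months = ["Jan", "Apr", "Jul", "Oct"]
lows = [20, 40, 65, 45]
highs = [35, 60, 85, 65]
for row in band_rows(months, lows, highs, scale_min=0, scale_max=100):
    print(row)
```

Record lows and highs could be added as a second, wider band drawn with a lighter character, since they always lie outside the average band.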
There are tons of examples of bad graphs. This particular firm is doing no worse a job than many, many others. We should appreciate the bad examples, and use them as cautions and tools to learn to make good graphs.