Graphs - FROM RESEARCH TO MANUSCRIPT

4. Figures

4.3. Graphs

When your data are numbers, graphs will show more visual detail than tables. Tables order data in lines, whereas graphs organize data in a full two-dimensional plane, so try to use graphs when you can.

4.3.1. Histograms

One simple graph is the histogram. A histogram, also called a “frequency-distribution diagram,” shows how many of your data points have each value. The range of data values is laid out along the x-axis, and the numbers of data points having each value are listed along the y-axis. In a histogram, the x values (the “classes” or measurement intervals for the values) must be divided into even intervals.

For our flower petal example, Table 1 (section 3.1.4, above) can also be presented as a histogram, Graph 1, where each flower is a data point and the number of petals on the flower is the value of that data point:

[Figure: "Number of Petals on Hybrid Rudbeckia Flowers"; x-axis: no. of petals (14 to 21); y-axis: no. of flowers (0 to 250)]

Graph 1. A histogram of the number of flowers that have each of the possible numbers of petals. All flowers had between 15 and 20 petals (x-axis). Of the 500 flowers examined, 221 (44%) had 18 petals and 416 (83%) had 17, 18, or 19 petals. The mean was 17.8 petals, the median was 18 petals, and the histogram is asymmetric.

By expanding our table into a graph, we have turned a numerical pattern into a spatial pattern. It takes some thinking to understand the numerical pattern in Table 1, but we can immediately use visual logic to comprehend the spatial pattern in Graph 1.
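The tallying that underlies a histogram can be sketched in a few lines of Python. This is only an illustration: the petal counts below are hypothetical (they are not the data behind Table 1), and the helper names `tally` and `text_histogram` are invented for this sketch. Each value is a class, classes run in even steps of one petal, and empty classes are kept so that the x-axis stays evenly divided.

```python
from collections import Counter

# Hypothetical petal counts, one per flower (NOT the data behind Table 1).
petal_counts = [17, 18, 18, 19, 18, 16, 17, 18, 20, 15, 18, 19]

def tally(values):
    """Return {value: number of data points with that value}.

    Every integer class between the smallest and largest value is
    included, even if its count is zero, so the classes stay evenly spaced.
    """
    counts = Counter(values)
    return {v: counts[v] for v in range(min(values), max(values) + 1)}

def text_histogram(freq):
    """Render one row per class, with one '#' per data point."""
    return "\n".join(f"{value:>3} | {'#' * n} ({n})" for value, n in freq.items())

freq = tally(petal_counts)
print(text_histogram(freq))
```

Printed sideways like this, the longest row of `#` marks plays the role of the tallest bar in the histogram.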

A histogram is simple because it portrays only a single aspect or variable of your data points. When your research project records two or more variables of each data point, you can make histograms of the individual variables. For example, if we had counted the number of petals of each flower and also measured the height of the entire plant, we could add an additional histogram showing the number of plants in each of the different height ranges that we found.

[Figure: histogram of plant heights; axes: height (cm) in the classes <11, 11-15, 16-20, 21-25, >25, and no. of plants (0 to 400)]
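For a measured variable such as height, each data point must first be assigned to a class before it can be counted. The sketch below uses the class labels from the height histogram; it assumes that heights are recorded in whole centimetres, and both the sample heights and the function names are hypothetical.

```python
from collections import Counter

# Class labels taken from the height histogram's axis.
CLASSES = ["<11", "11-15", "16-20", "21-25", ">25"]

def height_class(height_cm):
    """Map a height in whole centimetres (an assumption) to its class label."""
    if height_cm < 11:
        return "<11"
    if height_cm > 25:
        return ">25"
    # The three middle classes are each 5 cm wide:
    # 11-15 -> index 1, 16-20 -> index 2, 21-25 -> index 3.
    return CLASSES[(height_cm - 11) // 5 + 1]

def class_counts(heights_cm):
    """Return (label, count) pairs in axis order, including empty classes."""
    counts = Counter(height_class(h) for h in heights_cm)
    return [(label, counts[label]) for label in CLASSES]

# Hypothetical heights, for illustration only.
heights = [9, 12, 15, 16, 18, 20, 21, 24, 25, 27]
for label, n in class_counts(heights):
    print(f"{label:>5}: {n}")
```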

4.3.2. Scatter Plots

The two histograms (petal number and plant height) are independent data collections. They could just as easily be the results of two separate research projects. When more than one key variable has been recorded from the same data set in the same experiment, it is a good idea to make an independent histogram of each variable so that you can examine the pattern of the occurrence of that variable alone.

When your experiments record two key variables for each observation, however, you can also use a graph to look at the pattern of the co-occurrences of the variables.

In a graph of co-occurrences, each observation is treated as a data point. The position of the point in the graph is established by the values of its two variables. The many points you have recorded will be scattered throughout the graph, so this graph is called a “scatter plot” or “scatter diagram.”

[Figure: "Scatter Plot of the Co-occurrence of Two Variables Measured in the Same Experiment"; axes: variable #1 and variable #2]

For example, a scatter plot of the co-occurrence of petal number and plant height, in our hybrid plant example, would have 500 points, one for each plant in the study.
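As a rough sketch of how each observation becomes one point, the Python below pairs the two variables and marks each pair on a character grid. The seven observations are hypothetical, the function name `text_scatter` is invented here, and a real plot would normally be drawn with plotting software rather than text.

```python
# Hypothetical observations: (number of petals, plant height in cm).
observations = [(15, 12), (16, 14), (17, 17), (18, 19), (18, 22), (19, 21), (20, 24)]

def text_scatter(points, x_range, y_range):
    """Draw each (x, y) point as '*' on a character grid, y increasing upward.

    Points with identical coordinates collapse into a single mark.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    occupied = set(points)
    rows = []
    for y in range(y_max, y_min - 1, -1):
        cells = "".join(
            "*" if (x, y) in occupied else "." for x in range(x_min, x_max + 1)
        )
        rows.append(f"{y:>3} | {cells}")
    return "\n".join(rows)

plot = text_scatter(observations, (15, 20), (12, 24))
print(plot)
```

The position of every `*` is fixed entirely by the two values recorded for that observation, which is all a scatter plot encodes.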

4.3.3. Correlation

A scatter plot immediately challenges you to wonder whether the two key variables that you have graphed are correlated.

On the one hand, there may be no correlation. It is always possible that the only relation between the two variables is that you have chosen to measure them both each time you collect data. For example, suppose that, for each data point, you measured the number of petals on a flower and the Dow Jones Industrial Average at that moment. It is hard to imagine a mechanism by which the values of these two variables would change together. Therefore, you would probably not look for a correlation.

On the other hand, suppose that, for each data point, you measured both the height of a plant and the number of petals on its flower. You could imagine mechanisms by which the values of these two key variables could change together. Therefore, you might look for a correlation.

A scatter plot of co-occurrences of the two variables is a good place to start looking for correlation between them. Begin with your eye and your innate ability to see patterns.

[Figure: two scatter plots of y against x, with both axes running from 0 to 16]

Ask yourself:

– Does it look like the points in your scatter plot have some order or form a pattern?

– Are the points clumped?

– Are the points more concentrated in certain areas?

– Do the points look like a simple shape, e.g., a line, a curve, a wave?

Then, before using any mathematical tools, write two sentences:

(1) I (think, do not think) I see a pattern.

(2) The pattern looks like (a heterogeneous random dispersal of points, a straight line, a curve, a sine wave, one major clump of points, dense points on the left grading to sparse points on the right, . . . ).

Written simply and precisely, the essence of these sentences belongs in your paper.

For example, “To the eye, the points in this scatter plot appear to lie along a straight line” or “To the eye, the points in this scatter plot are arranged in no recognizable pattern.”

Now try to use correlation statistics to assign numerical confidence levels to these sentences. Specifically:

– Sentence (1) (which proposes the presence or absence of a pattern) can often be given a confidence level by calculating a non-parametric correlation coefficient.

– Sentence (2) (which proposes the presence of a particular pattern) can often be given a confidence level when the apparent pattern is a simple curve.
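One widely used non-parametric coefficient is Spearman's rank correlation; the text does not name a specific coefficient, so its use here is an assumption. The sketch below computes it as the Pearson correlation of the ranks, with tied values sharing the mean of their ranks. It returns the coefficient only; attaching a confidence level to that number requires a further significance test, which is best taken from a statistics package or a statistician.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical values: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [2, 4, 8, 16, 32]
print(spearman(x, y))
```

Because the coefficient depends only on rank order, any steadily rising pattern scores at the top of the scale whether or not it is a straight line, which is why it suits sentence (1) better than sentence (2).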

Experience is needed to make useful statistical statements about patterns. This is a good point at which to consult a statistician.

A scatter plot is a visual description of your data. Put the plot in your paper even if you don’t see a clear pattern in the points. A reader with a better eye may discover a pattern that you have not seen. And, if your data truly have no pattern, a scatter plot of the full data can show the heterogeneity convincingly.
