
Measures of Central Tendency

3.5 The Median

Unfortunately, many distributions possess more than one extreme score.

Skewed distributions, in fact, can feature a moderate percentage of scores trailing well off to one side. If we are interested in accurately communicating where scores of a distribution are bunched, and the existence of extreme scores would lead to a misleading impression, then a different measure of centrality is needed.

Question: What are the mean and median of this sample distribution?

2, 4, 7, 9, 12, 15, 17, 46, 54

Solution: M = 18.44, median = 12

Look closely at the two previous distributions. They are identical with the exception of two extreme scores added to the second. This has greatly influenced the mean, nearly doubling it. The median, however, was only shifted one score to the right.
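The comparison can be checked with a few lines of Python. This is only a sketch: it assumes, as the text states, that the earlier distribution is the same set of scores without the two extreme values (46 and 54), and the variable names are chosen here for illustration.

```python
from statistics import mean, median

# Scores from the worked example; the last two are the extreme scores.
with_extremes = [2, 4, 7, 9, 12, 15, 17, 46, 54]

# Assumption: the earlier distribution is the same set without 46 and 54.
without_extremes = with_extremes[:-2]

for label, scores in [("without extremes", without_extremes),
                      ("with extremes", with_extremes)]:
    print(f"{label}: mean = {mean(scores):.2f}, median = {median(scores)}")
```

Under that assumption the mean moves from about 9.43 to 18.44, while the median moves only from 9 to 12.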

Finding the Median When Given an Even Number of Scores

In the examples provided so far, it was easy to identify the median because there was an odd number of scores in the distribution. But what do we do in these situations?

4, 6, 9, 10, 11, 12

1, 2, 4, 6, 8, 11, 14, 18

The median will fall between two scores anytime there is an even number of scores; it will typically be a value that does not occur in the distribution. The median may even be a number, like a fraction, that does not seem to make sense in terms of what is being measured. For instance, imagine the median number of traffic tickets handed out each month for a given city equals 207.5. (What is the meaning of one-half of a traffic ticket?) Remember that statistical concepts convey a feature of an entire distribution; it is not a requirement that the statistical value itself make sense as a score in that distribution. The following examples will show us how to calculate medians when there is an even number of scores in the distribution.

Question: What is the median of this distribution?

3, 9, 15, 16, 19, 22

Median Solution: 15.50

The median of a distribution having an even number of values is the mean of the middle two numbers, provided there is not a string of identical numbers in the middle.
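The rule can be sketched in Python as follows. The function name `median_of_scores` is hypothetical, and the sketch deliberately ignores the special case of identical scores in the middle of a continuous measure, which is handled separately in the next subsection.

```python
def median_of_scores(scores):
    """Return the middle score, or the mean of the two middle scores
    when the number of scores is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single score sits in the middle.
        return ordered[mid]
    # Even count: average the two scores on either side of the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2

examples = [
    [3, 9, 15, 16, 19, 22],
    [4, 6, 9, 10, 11, 12],
    [1, 2, 4, 6, 8, 11, 14, 18],
]
for scores in examples:
    print(scores, "->", median_of_scores(scores))
```

For the three even-sized distributions shown above, this gives 15.5, 9.5, and 7.0, respectively.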

Finding the Median When There Are Identical Scores in the Middle of the Distribution

Question: What is the median of the following distribution?


7, 7, 7, 8, 8, 8, 9, 9, 10, 10

Median Solution

If discrete, the median = 8.00.

If continuous, the median = 8.17.

It may not be immediately obvious why the type of variable matters and why one answer would be 8.17. Recall from Chapter 2 that discrete variables can take on only a finite number of values. No meaningful values exist between any two adjacent values. In situations like this, since the same number is found on both sides of the middle count, the resulting median is simply that number. However, for continuous variables, every number of a distribution is considered to be at the midpoint of an interval; remember using real limits to draw histograms? OK, since there are 10 values, we need to have 5 values on each side. Coming up from the bottom, the three 7’s get us to within two values of the middle. So, we need 2/3rds of the three 8’s to get us to five on each side. Remember that for a continuous measure the value of 8 is the midpoint of the interval 7.5–8.5. So, we need two of those three values that are centered on 8 to go to the lower side of the middle and one of the values centered on 8 to go to the higher side.

But those 8’s cannot be separated since they are stacked on top of each other. (Look at Figure 3.1.) OK, so we will have to split those three boxes identically so that a total of two of them fall to the lower half of the distribution and the remaining parts of the boxes fall to the upper half. If we drew a line down through the three boxes such that 2/3rds of each box was to the left and 1/3rd of each box was to the right, that would “do the trick” (7, 7, 7, 2/3rds of the first 8, 2/3rds of the second 8, and 2/3rds of the third 8 would all be on the left). Well, what is 2/3rds of the way from 7.5 to 8.5? Let us add 0.67 to 7.5. That gives us 8.17.

Figure 3.1 A visual representation of how to find a median when there are identical scores in the middle of the distribution. (Histogram of frequency, 1 to 3, against score, 6 to 11; a vertical line through the three boxes at 8 leaves 2/3 of each box on the left and 1/3 on the right, marking Median = 8.17.)

Question: What is the median of this distribution?

7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 10, 10

Median Solution

If discrete, the median = 8.00.

If continuous, the median = 8.10.

Here is another one where, if the measure is continuous, we will need to split up the 8’s. Three of them need to go to the lower half of the distribution and two of them to the higher half. The fairest way to get three of the five 8’s would be to get 60% or 0.6 of each one. Since the 8’s actually start at 7.5, the answer would be 7.5 + 0.60 = 8.10.
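The box-splitting reasoning in the two continuous examples can be written as a small function. This is a minimal sketch, not a definitive implementation: the name `interpolated_median` and the `width` parameter are assumptions made here, and the function assumes every score is the midpoint of an interval of that width (1.0 for whole-number scores, so 8 stands for 7.5 to 8.5).

```python
def interpolated_median(scores, width=1.0):
    """Median for a continuous measure when scores tie in the middle.

    Each score is treated as the midpoint of an interval of the given
    width, and the tied middle scores are split proportionally.
    """
    ordered = sorted(scores)
    n = len(ordered)
    middle_value = ordered[n // 2]                        # score containing the median
    below = sum(1 for s in ordered if s < middle_value)   # scores entirely below it
    tied = ordered.count(middle_value)                    # number of stacked boxes
    lower_limit = middle_value - width / 2                # lower real limit, e.g. 7.5
    # Fraction of each tied box needed to reach half of the scores.
    fraction = (n / 2 - below) / tied
    return lower_limit + width * fraction

first = interpolated_median([7, 7, 7, 8, 8, 8, 9, 9, 10, 10])
second = interpolated_median([7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 10, 10])
print(f"{first:.2f}")   # 2/3 of the way from 7.5 to 8.5
print(f"{second:.2f}")  # 0.6 of the way from 7.5 to 8.5
```

For the first distribution the fraction is (5 − 3)/3 = 2/3, giving about 8.17; for the second it is (6 − 3)/5 = 0.6, giving 8.10. Python's standard library also offers `statistics.median_grouped`, which applies the same kind of interpolation for a given interval width.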

Box 3.1 The Central Tendency of Likert Scales: The Great Debate

In Chapter 2 readers were introduced to a critical difference between ordinal scales and interval or ratio scales: the nature of the relationship between numerical values. Ordinal scales are a quantitatively organized series of categories and, as such, make no assumptions about the quantitative distance between these categories. Interval and ratio scales hold the intervals constant throughout the measure. In this chapter we learned that the concept of a deviation score is necessary to find a mean. A mean is defined as the point where the deviation scores sum to zero. Deviation scores, however, cannot be found for numbers on an ordinal scale; the intervals are not held constant. (It would be like suggesting first- and third-place finishers in a pie bake-off are equidistant from second. We have no reason to presume that.) Readers were also made aware of the ambiguity that surrounds how to interpret numbers generated by a Likert scale: scales that typically offer 5–11 options ranging from strongly disagree to strongly agree (usually with a neutral point in the middle) that are used to measure the amount of agreement people have with a given statement.

The great debate is this: Is it appropriate to generate means for Likert-scale data?

It is very important because if we decide the answer is “no,” then we eliminate all statistical tests that make use of the concept of the mean. Much of the content in statistics textbooks is not applicable to data measured on an ordinal scale. An answer of “no” resigns us to use what are called nonparametric tests.


(Chapters 17 and 18 in this textbook are devoted to nonparametric tests.) These tests are less powerful (a concept we will explore in Chapter 11) and, therefore, less likely to help us find meaningful differences between groups.

The conservative approach is to argue that Likert scales have no way to determine constancy between values and should therefore be considered ordinal. As a result, data gathered using Likert scales must be analyzed nonparametrically; end of discussion. Others argue that the line between ordinal and interval is rather vague, some even calling it “fuzzy” (e.g. Abelson, 1995). If, they argue, the data from a Likert scale takes on the shape of a normal distribution, and if there are a good number of options for the respondent to choose from, then the data can be considered “normal” or “sufficiently close” to normal and analyzed with more standard statistical techniques.

We do not aim to settle the debate here, but merely raise it as an important issue. Perhaps it will be a good class discussion topic. As we think about this issue, keep in mind some recommendations made by Karen Grace-Martin, a specialist in data analysis. They are paraphrased below (Grace-Martin, 2008):

1) Realize the difference between a Likert-type item and a Likert scale. A Likert scale is actually made up of many items. Collectively, they attempt to provide a measure of the attitude in question. Many people, however, use the term Likert scale to refer to a single item.

2) Proceed with caution. Look at the particulars of our Likert-scale data. Would treating it as interval data influence our conclusions? The fact that everyone else is treating it as interval data is not sufficient justification in and of itself.

3) At the very least, insist (i) that the item have at least nine points, (ii) that the underlying concept be continuous, and (iii) that there be some indication that the intervals between the values are approximately equal. Make sure the other statistical assumptions for the test are met.

4) When we can, run the nonparametric equivalent to our test. If we get the same results, we can be more confident about our conclusions.

5) If we do choose to use Likert data in a parametric procedure, make sure we have particularly strong results before making a claim.

6) Consider the consequences of reporting inaccurate results. Is the analysis going to be published? Will it be used by others to make decisions?

The hope here is less about bringing readers to some desired position and more about helping students develop an appreciation for some of the more subtle and yet important debatable issues related to data analysis. How we understand what the values of our scores mean is critical, requiring us to first figure out the type of scale being used. And if we find that we are using Likert data, well then, welcome to the great debate!