After an analyst decides on a sampling plan, the question that naturally follows concerns how large the sample must be to answer the research question with sufficient confidence. There is no set answer to this question, but the analyst can arrive at an appropriate sample size through one of three approaches: by reducing the research question so that it can be answered, given statistical sampling theory; by experimenting with the accuracy of different sampling techniques and sample sizes; or by applying the split-half technique.

Statistical Sampling Theory

As noted above, the sampling of texts may not conform to the assumptions of statistical sampling theory. Sampling units and recording units tend to differ. Texts have their own connectivity, and recording units may not be as independent as the theory requires.

Table 6.1 Sample Size: Least Likely Units and Significance Level (all sampling units equally informative)

                        Probability of Least Likely Units in the Population
Desired Level of
Significance             .1       .01      .001      .0001      .00001
------------------------------------------------------------------------
    .5                    7        69       693      6,931      69,307
    .2                   16       161     1,609     16,094     160,942
    .1                   22       230     2,302     23,025     230,256
    .05                  29       299     2,995     29,955     299,563
    .02                  37       390     3,911     39,118     391,198
    .01                  44       459     4,603     46,049     460,512
    .005                 51       528     5,296     52,980     529,823
    .002                 59       619     6,212     62,143     621,453
    .001                 66       689     6,905     69,074     690,767

Textual units tend to be unequally informative, and the researcher must sample them so as to give the research question a fair chance of being answered correctly. Nevertheless, one solid generalization carries over from statistical sampling theory into content analysis: when the units of text that would make a difference in answering the research question are rare, the sample size must be larger than when such units are common.

This is illustrated by the figures in Table 6.1, which lists the sample sizes required to "catch" rare units at different levels of significance. For example, assuming the probability of the rarest relevant instances to be 1 in 1,000, or .001, and the desired significance level of the answers to research questions to be .05, a sample of 2,995 would give the analyst 95% certainty that it includes at least one of these instances. This logic applies not only to the sampling of rare incidences but also to critical decisions. When an election is close and its outcome depends on very few voters, political pollsters need larger sample sizes to predict the results accurately than they do when the candidates' levels of popularity are far apart. Although this generalization is sound, researchers who rely on the actual numbers in this table should understand that they derive from statistical sampling theory, from the binomial distribution in particular. Thus, an analyst should use these figures only if the research situation does not violate, in major ways, the assumptions on which they are based.
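The reasoning behind Table 6.1 can be reproduced directly: the probability that a random sample of n independent units contains none of the instances occurring with probability p is (1 - p)^n, and requiring this to fall below the significance level gives n >= ln(alpha) / ln(1 - p). The following sketch in Python (the function name and the rounding up to a whole unit are illustrative choices, not part of the original table) recovers the worked example above; a few cells of the table differ from this formula by a handful of units, presumably owing to rounding in the original computation.

```python
import math

def required_sample_size(p, alpha):
    """Smallest n for which a random sample of n units includes, with
    confidence 1 - alpha, at least one unit occurring with probability p
    in the population: solve (1 - p)**n <= alpha for n."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p))

# The example from the text: rarest relevant instances at p = .001 and
# a .05 significance level call for 2,995 units, as in Table 6.1.
print(required_sample_size(0.001, 0.05))  # 2995
```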

Sampling Experiments

Analysts may elect to experiment with various sample sizes and sampling techniques in order to find the combination best suited to answering their research questions. Stempel (1952), for example, compared samples of 6, 12, 18, 24, and 48 issues of a newspaper with the issues from an entire year and found, when he measured the average proportion of subject matter in each sample, that increasing the sample size beyond 12 did not produce significantly more accurate results. Riffe et al. (1998, pp. 97-103) have reported replications of these early studies as well as the results of experiments designed to determine how the use of different sampling techniques affects how well a sample represents a population.

In one study, Riffe et al. used local stories printed in a 39,000-circulation daily over a 6-month period as the closest practical approximation to the population. They then drew 20 samples for each of three methods, selecting issues at random (random sampling), at fixed intervals (systematic sampling), and by constructing artificial weeks (stratified sampling), "with 7-, 14-, 21-, and 28-day samples." The researchers defined the sufficiency of a technique as follows:

A sampling technique was sufficient when the percentage of accurate sample means fell within the percentage for one and two standard errors found in a normal curve. In other words, if 68% of the 20 sample means fell within plus or minus one standard error of the population mean and 95% of the sample means fell within plus or minus two standard errors of the mean, a sampling technique was adequate. (p. 98)

Riffe et al. found remarkable differences among the methods:

It took 28 days of editions for simple random sampling to be adequate, and consecutive-day sampling never adequately represented the population mean. One constructed week adequately predicted the population mean, and two constructed weeks worked even better. . . . one constructed week was as efficient as four, and its estimates exceeded what would be expected based on probability theory. (p. 98)

It follows that different sampling techniques yield samples of different degrees of efficiency. It is wise, however, to be wary of unchecked generalizations.

Different media may have different properties, and results like Stempel's and Riffe et al.'s reflect measurements of the frequencies of content categories; they may therefore be generalizable only within a genre. If newspapers were to change their reporting style and feature, say, more pictures, many more sections, and shorter stories (as is typical of today's tabloid papers), or if content analyses were to use measures other than proportions of subject matter or frequencies, the findings noted above might no longer be generalizable.

What is common to experimental generalizations regarding adequate sample sizes is the researchers' approach, which involves these steps (a code sketch follows the list):

• Establish a benchmark against which the accuracy of samples can be assessed, usually by analyzing a very large sample of textual units, thereafter taken as the population of texts. Obtain the standard error of this large sample for the adopted benchmark.

• Draw samples of increasing sizes and, if appropriate, by different sampling techniques, and test their accuracy by comparing the measures obtained for them with the confidence interval of the benchmark.

• Stop with the combination of sample size and sampling technique that consistently falls within the standard interval of the method (see Riffe et al.'s criteria above).
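To make the logic of these steps concrete, here is a minimal simulation sketch in Python. It is not Riffe et al.'s actual procedure: the benchmark data are fabricated, only simple random sampling is tried, and adequacy is judged by the criterion quoted above (roughly 68% of sample means within one standard error of the benchmark mean, 95% within two).

```python
import random
import statistics

def adequate(population, sample_size, trials=20):
    """Adequacy in the sense of Riffe et al.'s criterion: ~68% of sample
    means within one standard error of the population mean, ~95% within two."""
    mu = statistics.mean(population)
    se = statistics.pstdev(population) / sample_size ** 0.5
    means = [statistics.mean(random.sample(population, sample_size))
             for _ in range(trials)]
    within_one = sum(abs(m - mu) <= se for m in means) / trials
    within_two = sum(abs(m - mu) <= 2 * se for m in means) / trials
    return within_one >= 0.68 and within_two >= 0.95

# Fabricated benchmark: proportions of local news in 180 daily issues,
# thereafter taken as the population of texts.
population = [random.betavariate(2, 5) for _ in range(180)]
for days in (7, 14, 21, 28):  # samples of increasing size
    print(days, adequate(population, days))
```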

Such experiments require a benchmark, that is, the results from an analysis of a reasonably large sample of data against which smaller sample sizes can be measured. Researchers can conduct experiments like these only when they have a reasonable idea of the population proportions and when they intend to generalize statements about the minimal sample sizes needed. The former is rarely available, hence the following recommendation.

The Split-Half Technique

The split-half technique is similar to the experimental method described above, except that it requires no population measure against which the adequacy of samples is assessed, and it does not allow generalizations to other samples drawn within the same genre. It does not even require knowledge of the size of the population from which samples are drawn. The split-half technique calls for analysts to divide a sample randomly into two parts of equal size. If both parts independently lead to the same conclusions within a desired confidence level, then the whole sample can be accepted as being of adequate size. Analysts should repeat this test for several equal splits of the sample; an adequately sized sample is expected to yield the same results for as many splits as the confidence limit demands. If such tests fail, the content analysts must continue sampling until the condition for an adequate sample size is met.
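As an illustration of the mechanics, the following sketch reduces "the same conclusions" to agreement between the mean of a coded proportion in the two halves, with a tolerance parameter standing in for the desired confidence limit; both reductions, and the data, are hypothetical.

```python
import random
import statistics

def split_half_agreement(sample, splits=10, tolerance=0.05):
    """Repeatedly split the coded sample into two random halves of equal
    size and report the share of splits whose halves (nearly) agree."""
    agree = 0
    for _ in range(splits):
        shuffled = random.sample(sample, len(sample))  # random permutation
        half = len(sample) // 2
        a, b = shuffled[:half], shuffled[half:2 * half]
        if abs(statistics.mean(a) - statistics.mean(b)) <= tolerance:
            agree += 1
    return agree / splits

# Accept the sample as adequately large only if (nearly) all splits agree;
# otherwise continue sampling and repeat the test.
sample = [random.betavariate(2, 5) for _ in range(400)]  # fabricated codings
print(split_half_agreement(sample))
```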

CHAPTER 7

Recording/Coding

In making data, from recording or describing observations to transcribing or coding texts, human intelligence is required. This chapter addresses the cultural competencies that observers, interpreters, judges, or coders need to have; how training and instruction can help to channel these competencies so as to satisfy the reliability requirements of an analysis; and ways in which the syntax and semantics of data languages can be implemented cognitively. It also suggests designs for creating records of texts in a medium suitable for subsequent data processing.

THE FUNCTION OF