
Substantive Information About the Phenomena of Interest

Generating analyzable data is, of course, the raison d'être of the recording process. Whatever the device, coders must be able to record information with ease, verify instantaneously what they have entered, and correct their mistakes.

Each medium for recording data has its own properties and makes special demands on human coders. Optical scanners call for the use of pencils of a particular kind; otherwise some uncertainty can be created about how marks are read. The accuracy of punch cards is difficult to verify without mechanical readers. Spreadsheets offer convenient overviews of whole data arrays, but they often make it difficult for analysts to connect cell contents to recording units and available categories. Although computer aids are available that allow coders to generate electronic data files on the fly (during telephone interviews or while watching television, for example), such tools must be carefully designed so that they interface easily with coders, minimize mistakes, and provide ample feedback for verification, much as traditional paper data sheets do.

Most Americans are familiar with the conventions of filling out questionnaires, and many are also comfortable with using a mouse to point and click on a computer screen. Analysts should rely on coders' competencies where possible. Most people know how to write, how to copy texts, how to fill in blanks, how to circle options in a list, and how to enter a check mark to select an item. The more natural a recording medium is to the coders, the fewer errors they will make.

Above, I outlined several proven approaches to defining the semantics of a data language. Here, I focus on some easily avoidable errors that content analysts make when designing the instruments that coders use to record what they have observed, categorized, judged, or scaled. One frequent source of errors is the overuse of numbers. Numbers are short and concise, but when they are used for everything they can become confusing. Content analysts tend to number their categories, their variables, their coders, the units to be recorded, the pages of instructions where the numbered values of numbered variables are defined, and so on. Many times, the designers of content analysis instructions could specify the required organization of data by using descriptive words instead of numbers whose meanings must be learned, by using typographical or spatial arrangements instead of paragraphs of prose, or even by using icons, which may cause less confusion than numbers.
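To make the point about descriptive words concrete, here is a minimal sketch (my own illustration, not from the original): named categories are self-documenting, whereas bare numeric codes force coders to memorize a lookup table. The category labels come from the interaction-outcome example used later in this chapter.

```python
from enum import Enum

# Named categories carry their meaning with them; a coder reading the
# record does not need the instruction manual to decode a "2".
class Outcome(Enum):
    NEITHER = "favorable to neither"
    RECIPIENT = "favorable to recipient"
    INITIATOR = "favorable to initiator"
    BOTH = "favorable to both"

# A record coded with names remains readable on its own.
record = {"unit": 12, "outcome": Outcome.BOTH}
print(record["outcome"].value)  # favorable to both
```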

A second source of errors is the inconsistent use of category names or numbers across different variables. For example, when the default category of "not applicable" or "other" is coded "0" for one variable and "9" or "99" for another, confusion is bound to arise. The same is true when analysts use the same words but with different intended meanings in different variables. Explicitly defined differences are easily forgotten.
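The remedy is a single, shared convention. The sketch below is hypothetical (the variable names and the "NA" symbol are my own, not from the original): a codebook that defines one default category and one set of legal values per variable, so that ad hoc codes like "9" or "99" are rejected rather than silently recorded.

```python
# One symbol for "not applicable", used identically for every variable.
NOT_APPLICABLE = "NA"

# Hypothetical codebook: each variable's full set of legal categories.
codebook = {
    "outcome": {"neither", "recipient", "initiator", "both", NOT_APPLICABLE},
    "speaker": {"initiator", "recipient", NOT_APPLICABLE},
}

def check(variable, value):
    """Reject any value not defined for the given variable."""
    if value not in codebook[variable]:
        raise ValueError(f"{value!r} is not a legal category for {variable!r}")
    return value

check("outcome", "NA")   # fine: the same default works for every variable
check("speaker", "NA")   # fine
# check("outcome", "9")  # would raise: inconsistent numeric defaults are caught
```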

A third source of errors is the hand copying of uncommon text into a record.

This is one reason various qualitative software packages allow users to highlight text, assign codes, and cut and paste text. These features significantly reduce the chance of spelling errors, which are bound to introduce unintended differences.

A fourth source of errors is poor design of the presentation of options on the recording medium. In unwritten but widely used graphic computer interface conventions, users are asked either to check "boxes" on or off or to click on alternative "radio buttons," which select among mutually exclusive options. These are logically different operations. And whereas a computer interface can be designed to force users to comply with a designer's intentions (for example, by disabling unavailable options), paper instruments are not so intelligent. Nevertheless, the designer of a recording medium can do much to discourage coders from recording data incorrectly and thus avoid unreliability and the pollution of the data with illegitimate values. Consider the following three ways of recording the outcome of an interpersonal interaction:

Enter the appropriate number:

    ☐   0 - favorable to neither
        1 - favorable to recipient
        2 - favorable to initiator
        3 - favorable to both

Encircle one only:

    favorable to    neither    recipient only    initiator only    both

Check ☐ as many as applicable:

    ☐ favorable to initiator
    ☐ favorable to recipient

Although these three alternatives are effectively equivalent, they differ in the kinds of errors they invite. In the first version, the coder is asked to enter a number in a box. There is nothing to prevent a coder from writing a number larger than 3 in the box. Any number larger than 3 would be undefined, regardless of what the coder had in mind, and hence illegitimate. Leaving the box blank is not a legitimate option either, although it might make sense to a coder who found nothing favorable to record. This version is also sensitive to bad handwriting. In addition, it is not uncommon for coders to confuse category numbers with, for example, coder ID numbers, unit numbers, variable numbers, or scale points, as mentioned above. The second version does nothing to discourage or prevent the coder from circling more than one option, circling something between two equally imperfect alternatives, or failing to circle the category "neither" when none is evident. The third version resists illegitimate entries altogether, but this solution is limited to binary values: checked or not checked, present or absent. Checking or not checking a box is a simple, unambiguous alternative. Analysts can reduce recording errors by phrasing the recording options so that they require a minimum of writing; the best way to do this is to provide a list of appropriate alternatives and instruct coders to "check all that apply," without burdening the coders with the information that each option then becomes a binary variable of its own (see Chapter 8).

Errors can also occur if the recording instructions are not well spelled out and coders must exert too much effort to consult instructions when they need to. One extreme solution for this kind of problem is to merge the recording instructions with the recording medium, so that the recording medium is similar to a questionnaire in survey research. The high level of consistency this ensures, however, is counterbalanced by the fact that using such a medium is tedious and producing it is costly. Having a coder use one recording instruction and/or data sheet for each recording unit can be excessive when recording units are small and numerous (e.g., words, frames of videotape, seconds of verbal interaction). At the other extreme, the analyst presents the coder with a spreadsheet (a large grid of recording units by variables, as in Figure 7.2) that the coder completes according to separately formulated instructions. This method invites a host of confusions; for instance, while consulting the instruction manual to resolve indecision, coders may lose track of which row they are coding, or may enter the categories (numbers) for one variable into the cells of another. In addition, most of these kinds of errors are difficult to detect. The following recommendations chart a middle course between these two extremes:


At each data entry point, the analyst should present the coders with some verbal description of the variable and, where feasible, a list of options and what each means (abbreviations of the terms used in the more elaborate recording instructions are better than numbers that say nothing about the category). The analyst should also provide instructions just where they are needed.

The analyst should supply the coders with alternatives to be selected from a well-defined list (in computer applications, from pull-down menus, for example, or a row of bull's-eyes). Ideally, coders should not need to do much writing. Asking coders to enter numerical or alphabetical characters into boxes is problematic, especially when these characters have no intrinsic relation to the phenomena to be recorded, because the coders then need to learn to correlate the numbers or letters with what they mean and can easily forget the meanings, especially of rare categories, which tend to be the ones that matter most.

The analyst should create visual analogues (mappings) showing the relationship between the way the analyzed text is organized and the way the recording medium is designed. This is relatively easy when coders are recording geometric relations in text, such as the locations of newspaper articles (on the front page, above the center fold, inside), which can be shown with visual devices (depicting a few pages from which the locations of interest may be selected) reproduced in the recording medium; temporal sequences (with before on the left and after on the right); or such conceptual distinctions as those between sender, message, and receiver (which may be represented to coders diagrammatically or according to linguistic conventions).

The availability of computer software that allows users to enter choices and select among alternatives enables content analysis designers to take advantage of coders' increasing familiarity with reliable interface conventions. It has also opened up the possibility of using computers to enter content analysis data directly, allowing for validation and tests of reliability.
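A minimal sketch of what such direct, machine-side validation might look like (the variable names, category sets, and data below are hypothetical): each cell of a grid of recording units by variables is checked against the categories defined for its variable, catching the cross-column slips described above.

```python
# Hypothetical per-variable category sets for a two-variable design.
ALLOWED = {
    "outcome": {"0", "1", "2", "3"},   # defined scale points
    "speaker": {"I", "R"},             # initiator / recipient
}

# Illustrative coded records; in unit 2 the coder has swapped the columns.
rows = [
    {"unit": 1, "outcome": "2", "speaker": "I"},
    {"unit": 2, "outcome": "I", "speaker": "2"},
]

def audit(rows):
    """Return (unit, variable, value) for every cell outside its category set."""
    return [(r["unit"], var, r[var])
            for r in rows for var in ALLOWED
            if r[var] not in ALLOWED[var]]

print(audit(rows))  # [(2, 'outcome', 'I'), (2, 'speaker', '2')]
```

Flagging such cells at entry time, rather than during analysis, is precisely the kind of validation that direct computer entry makes possible.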