Errors in measurement arise from several sources that are broadly classified as errors due to the respondent, the instrument, or the situation. Below we describe some examples of each of these three sources of error.
Respondent Factors
In completing self-report questionnaires and interviews, respondents can make a number of errors. Possible respondent errors include
• Overreporting or underreporting agreement with items
• Overreporting or underreporting frequency of events
• Overreporting or underreporting positive or negative attitudes
• Giving partial answers
• Giving inconsistent answers
• Making recording errors, such as putting a response in the wrong place
The reasons that test takers make errors are as numerous as the possible types of errors. However, common reasons for making errors include simply misunderstanding what information was requested; maturation issues, such as fatigue and boredom; lack of interest or time; carelessness or guessing; copying; and failure to attend to instructions.
The study of how people respond to items on questionnaires has led to a greater understanding of the cognitive processes that people use to answer questions (Tourangeau & Rasinski, 1988) as well as to the identification of several patterns that people sometimes follow in responding to items. The literature often refers to these patterns as response sets or biases.
In reference to the cognitive processes that people use in answering items, the respondent assumes an attitude that may help him or her complete the task in an optimal manner or, conversely, one that may undermine the task and lead to more respondent error. Krosnick (1991) coined the term optimizing to refer to the situation in which people approach self-report tasks with the optimal mental attitude. People who take this approach use their mental resources to complete the questionnaire in the best way they can. They give each item full consideration and try to provide the best answer. Academic examinations offer the best example of optimizing. Here, students who are interested in doing well on a test read each item carefully, think about possible responses, and carefully record their answers. People who want to participate in clinical trials to receive treatment for a specific disorder, or who want to qualify for a particular event such as a reality show (for example, Survivor), might also try their best by carefully reading instructions, answering questions, and recording answers.
Krosnick (1991) notes that optimizing is the ideal attitude that people may assume when responding to self-report questionnaires. Because people are trying to give the appropriate information when optimizing, this situation results in the smallest amount of respondent error.
People who are not particularly invested in completing a specific questionnaire might opt for a satisficing rather than an optimizing approach. Krosnick (1991) used the term satisficing to mean giving the answer that seems most reasonable at the time. People who take a satisficing approach pay less attention to what is asked, spend less time considering their answers, and offer a “good enough” response. This mental attitude is probably most common among students taking a quiz that will not greatly affect their course grades or among people who do not have strong feelings about an issue but are asked to complete a survey about it. Because these people fail to give as much care to responding to items, the likelihood of making an error—for example, failure to follow directions, recording errors, and overreporting or underreporting attitudes—is greater for this group than for people with an optimizing attitude.
On the far end of the continuum are people who apply a negative attitude in responding to items (Silverman, 1977). A person might display a negative attitude if that person felt coerced into completing a survey and then perceived that the survey was biased toward a view opposite to his or her own. For example, a person who is opposed to abortion might believe that items presented on a survey represent a proabortion stance. A person who agrees to a telephone interview on political attitudes might find during questioning that the items seem slanted toward one political party. In this case, the person’s neutral attitude might shift negatively, resulting in extreme negative responses. Likewise, students who are asked to complete an evaluation of an instructor after learning that they failed an exam are likely to approach the evaluation with considerable negativity. In this case, respondents might overreport negative attitudes.
To minimize the tendency to respond in either a satisficing or a negative manner, the researcher has available a number of strategies to motivate participants to do their best. High on this list is developing rapport with participants. The strategies for developing rapport vary depending on the interview situation. In a face-to-face interview, the interviewer can greet participants in a pleasant manner, begin the conversation with small talk, explain the study in an easy-to-understand manner, and show respect for participants. During the interview, the interviewer can use comments such as “You’re doing a great job” and “We’re almost finished” to provide continued encouragement for the participant. Mail-out surveys can establish rapport through a letter to participants and through the formatting of the questionnaire. Other motivational techniques include carefully explaining the study to participants so that they know their role in responding to items. We assist participants by explaining the importance of this role in gathering accurate data and by instructing them to consider each item carefully before answering. Showing appreciation of the time and effort participants expend in completing the survey is important. In addition to using motivational techniques to gain their cooperation, keeping the task simple makes it easier for participants to complete it in a timely manner. Researchers should use short items with easy-to-understand words, easy-to-follow directions, and clear formatting to reduce respondent burden (Streiner & Norman, 1995).
Response Sets. In studying the ways in which people respond to items, researchers have identified several patterns of responses, referred to as response sets or biases. Pedhazur and Schmelkin (1991, p. 140) define response set as “the tendency to provide responses independent of item content.” Sometimes people are aware that they are responding to items in a biased manner, but more often, people are unaware of their biases. Researchers have identified several different response sets, including social desirability, acquiescence, end aversion, positive skew, and halo.
Social Desirability. Edwards (1957) defined social desirability as “the tendency of subjects to attribute to themselves in self-description personality statements with socially desirable scale values and to reject those with socially undesirable scale values” (p. vi). In Edwards’s description, people exhibiting a social desirability response set are more likely than most people to agree strongly with statements about their own good character, willingness to help others in all situations, refraining from lying, and friendly disposition.
In health behavior research, people displaying this response set are more likely to agree strongly with statements associated with healthy behavior and disagree with statements suggesting unhealthy behavior. That is, people may strongly agree with statements that health is important to them (for example, “I exercise every day no matter what”) and disagree with statements that might indicate poor health choices (for example, “I never read food labels”). People who intentionally try to create a favorable impression of themselves (as opposed to having an unconscious tendency to do so) exhibit a response set of faking goodness (Edwards, 1957). Thus, people who seek approval for being physically healthy may intentionally overestimate the amount of exercise they engage in each week or underestimate the amount of fat in their diets. For example, a person who does not eat many fruits and vegetables may indicate on a survey that he or she eats five fruits or vegetables per day based on his or her knowledge of the recommended amount. Likewise, someone who fails to exercise may respond that he or she does work out because he or she knows a person should do so to stay healthy.
The opposite of faking good is faking bad. A person who believes that portraying himself or herself in a negative way may be advantageous in a certain situation may answer items in a socially undesirable manner. For example, sometimes people choose to refrain from disclosing health behaviors that may make them less attractive for a research study. A woman who wants to participate in a research study that includes sessions with a personal trainer may fail to admit to regular exercise when in fact exercise is an important part of her life. Or a man who maintains good control of his diabetes may claim he finds it difficult to manage his disease so that he can participate in a study that offers a number of incentives.
Researchers can use several methods to reduce the tendency of participants to choose socially desirable answers. Participants who feel they can trust the researcher and believe in the confidentiality of the study are more likely to answer items truthfully. The researcher has an obligation to create an environment in which the respondent will feel comfortable, particularly when the researcher requests sensitive information (for example, “I have sex with more than one partner without using a condom”). To create an environment of trust, the researcher should provide adequate information about the study and explain how the data will be used. By mentioning that there are no right or wrong answers and that the data will be carefully secured, the researcher gives participants permission to respond as they truly feel.
To assess the extent to which social desirability may have influenced responses to a specific set of items, the researcher can correlate scores with a measure of social desirability. Crowne and Marlowe (1960) developed one of the more popular social desirability scales, which contains items requiring a yes or no answer. Most people are likely to respond yes to at least some of the items on the scale, such as, “I have sometimes told a lie.” Researchers believe that people who select mostly negative answers to this set of items are more likely to respond in a socially desirable manner to other items, including those of interest to the researcher. If scores on the items of interest are strongly correlated with scores on the social desirability scale, we can conclude that at least some respondents answered in a socially desirable manner. Because researchers usually administer the scale of interest and the social desirability scale at the same time, they must interpret the findings after the fact, noting as a limitation that the results may reflect, to some extent, socially desirable responding.
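As a rough sketch of this correlational check, consider the following. All scores are invented for illustration, and the use of simple sum scores for both scales is an assumption, not a prescribed scoring method:

```python
import numpy as np

# Hypothetical total scores for ten respondents (all values invented).
# health_scores: totals on the scale of interest (e.g., healthy eating).
# sd_scores: totals on a social desirability scale.
health_scores = np.array([34, 40, 28, 45, 38, 31, 44, 29, 42, 36])
sd_scores = np.array([8, 12, 5, 14, 10, 6, 13, 5, 12, 9])

# Pearson correlation between the two sets of scores.
r = np.corrcoef(health_scores, sd_scores)[0, 1]

# A strong positive correlation suggests that high scores on the scale
# of interest may partly reflect socially desirable responding rather
# than actual behavior.
print(round(r, 2))
```

With these invented data the correlation is very strong, which in a real study would prompt the cautious interpretation described above rather than a firm conclusion about any one respondent.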
Acquiescence and Nay-Saying. Cronbach (1946) first identified the tendency of individuals to choose true over false in test taking (true-saying). The concept has expanded to include the tendency to select agree over disagree in Likert scales (yea-saying) and the tendency to choose yes over no (yes-saying) in dichotomous items (Couch & Keniston, 1960). Individuals who display this type of response set have a tendency to agree with both favorable and unfavorable statements, or to answer true to any statement. In response to items on healthy eating habits, a person exhibiting the acquiescence response set would tend to answer yes to items regardless of content. Acquiescence often leads to inconsistent responses. For example, a person might respond yes to indicate that he or she eats fruit five times per day and later respond yes to an item stating that he or she eats no more than three fruits per day. The opposite tendency also exists, whereby respondents are more likely to disagree than agree, choose false rather than true, or say no rather than yes. We refer to this pattern as nay-saying. The same type of inconsistent response is possible with the nay-saying response set. A questionnaire on smoking might have two items, one in which a person indicates that he or she would not allow someone to smoke in the house, and a second in which he or she indicates that he or she would not ask someone to go outside to smoke. Someone exhibiting the nay-saying response set would answer no to both items, leading to inconsistent responses.
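One simple screen for this kind of inconsistency is to compare answers to pairs of items that cannot logically receive the same response. A minimal sketch, with invented item labels and one respondent's answers:

```python
# One respondent's yes/no answers (True = yes). Item labels and the
# contradictory pairings below are invented for illustration.
responses = {
    "eats_fruit_five_times_per_day": True,
    "eats_no_more_than_three_fruits_per_day": True,   # contradicts item above
    "would_allow_smoking_in_house": False,
    "would_ask_smoker_to_go_outside": False,          # contradicts item above
}

# Pairs of items that cannot logically receive the same answer.
contradictory_pairs = [
    ("eats_fruit_five_times_per_day", "eats_no_more_than_three_fruits_per_day"),
    ("would_allow_smoking_in_house", "would_ask_smoker_to_go_outside"),
]

# Flag each pair answered identically. Many flags for one respondent
# suggest an acquiescent (all-yes) or nay-saying (all-no) response set.
flags = [pair for pair in contradictory_pairs
         if responses[pair[0]] == responses[pair[1]]]
print(len(flags))
```

This respondent is flagged on both pairs, the pattern one would expect from an acquiescent or nay-saying response set rather than from careful reading of each item.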
End Aversion (Central Tendency). End aversion is a type of response set noted in items for which there are more than two response options (Streiner & Norman, 1995). Rating scales give people a choice of responses, most often between three and ten options. People who exhibit end aversion tend to avoid selecting extreme values on the rating scale. Thus, on a five-point rating scale, they would select neither one nor five. Likewise, on a ten-point rating scale, they would avoid one and ten. When interpreted, these responses indicate that the respondents fail to hold strong views on the topic. They are more likely to select an agree rather than a strongly agree response or a disagree rather than a strongly disagree response. Or they are more likely to note that they do something most of the time or sometimes rather than always or never. A particular type of end aversion involves selecting the middle category for all items. Thus, if the rating scale has an uneven number of choices (three, five, seven, or nine), the middle category will be selected (for example, three on a five-point rating scale). The best approach to minimize end aversion
is to encourage participants to select from the full range of choices. In a face-to-face interview, the interviewer may provide this encouragement, or on a written questionnaire, the researcher can include this encouragement in the instructions.
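A researcher can also screen completed questionnaires for end aversion by computing how often each respondent uses the endpoints of the scale. A minimal sketch, with invented ratings standing in for one respondent's answers across a questionnaire:

```python
def extreme_share(ratings, low=1, high=5):
    """Share of answers that fall at either endpoint of the rating scale."""
    return sum(r in (low, high) for r in ratings) / len(ratings)

# Invented five-point ratings for two hypothetical respondents.
end_averse = [2, 3, 3, 4, 3, 2, 3, 4, 3, 3]   # never uses 1 or 5
typical = [1, 5, 3, 4, 5, 2, 1, 3, 5, 4]      # uses the full range

print(extreme_share(end_averse))  # 0.0: consistent with end aversion
print(extreme_share(typical))     # 0.5: endpoints used freely
```

A share near zero across many items is consistent with end aversion, though it could also mean the respondent genuinely holds moderate views, so such a screen only raises a question rather than settling it.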
Positive Skew. In some cases, people tend to hold extreme positive or negative attitudes about a topic (Streiner & Norman, 1995). As a result, they are likely to select the extreme rating choice—for example, strongly agree or strongly disagree—for all items. This response set differs from that of acquiescence and nay-saying. With acquiescence and nay-saying, there is a tendency to answer in the affirmative (or negative) regardless of item content. In the positive skew response set, content matters. The positive skew response set is manifested more with some attitudes and beliefs than with others. For example, people tend to rate their self-efficacy for a variety of behaviors very high—often seven or more on a ten-point scale. One strategy to control for positive skew is to examine the item content and modify (or tone down) the wording to make it more difficult for someone to select an extreme value (Viswanathan, 2005). For example, the developer could add the term always in the stem of each item (for example, “I can always take my medicine at the same time each day”). Another strategy is to include items that assess the highest difficulty level (for example, “I can run an ultra marathon—100 miles”). Finally, providing a very extreme response category or providing more gradations (or options) of responses at one end of the scale can discourage extreme response choices. This last method is shown in Figure 3.1. Response Option A is a traditional Likert scale. If responses are skewed
FIGURE 3.1. RESPONSE OPTIONS.

Response Option A: Traditional Spacing for a Likert Scale (evenly spaced anchors: Unsatisfactory, Average, Superb).

Response Option B: Modified Spacing with Additional Options Between Average and Superb Ratings (same anchors, with the center shifted toward the left).

Source: Streiner and Norman, 1995, p. 80. Reprinted by permission of Oxford University Press.
toward the Superb end of the scale, most of the responses will be found at the far right of the scale. However, if the center is shifted to the left, as in Response Option B, the respondent has more choices between the responses Average and Superb.
Halo. The halo effect is the tendency for ratings of specific traits to be influenced by a general attitude, or set, toward a person (Cooper, 1981). Halo effects are noted commonly in situations in which a rater is required to evaluate selected characteristics of an individual. Halo ratings occur when the rater allows a general attitude toward the person being evaluated to influence the rating of specific characteristics. For example, at the end of a course, students are usually asked to evaluate the instructor. Students might give an instructor whom they like high ratings on a set of course evaluation items regardless of how well certain aspects of the course were implemented. Also, an instructor might overestimate the ability of students who participate the most in class and give them higher grades than the ones they earned. In clinical settings, where instructors rate students on performance abilities, they might reward students who are well liked with higher ratings than they give to reserved students who are less inclined to seek attention from the instructor. The opposite of the halo effect is the pitchfork effect, in which a general negative impression of a person influences ratings of specific characteristics (Lowe, 1986). Thus, an instructor who is less engaging might receive lower ratings, without justification, on specific characteristics of the course.
Cooper (1981), who studied the halo response set extensively, gives nine ways in which investigators can reduce the halo effect. The strategies most useful for health behavior researchers include training evaluators regarding ways to rate participants, taking steps to ensure consistency across raters, and having evaluators record critical incidents to use as justification for their ratings. Evaluators who are familiar with the participants are more likely to be aware of the full range of behaviors the participants might exhibit and thus to evaluate them accurately. For example, having a professor for more than one class allows the student to see that teacher function in different contexts, thus increasing the likelihood of an accurate evaluation.
Recall. In addition to response sets, a major source of measurement error in self-report instruments is related to the ability and motivation of respondents to provide accurate information about behaviors or events that have occurred in the past. When responding to items that require the recollection of information from memory, participants often feel pressure to respond immediately after a question is asked or read. Unless they are instructed otherwise by the interviewer, respondents tend to answer relatively quickly even though the requested information requires recall and calculation components. For example, when asked, “How many times did you eat fruit last week?” a respondent generally would not take the time to search his or her memory for the exact days on which fruit was eaten and then calculate how much was consumed each day during the week. This recall and calculation process would take some time. Thus, the most common response is a quick estimation of fruit intake based on general experience. In these situations, a process called telescoping often occurs, in which respondents include more events within a time period than they should.
There are a number of ways in which investigators can enhance recall of past events. The cognitive interview is one such method used by health professionals to enhance recall of health-related behaviors and events. Fisher and Quigley (1992) developed the cognitive interview to enhance recall of food consumption. Their work was based on the need to trace the sources of food-borne illnesses. When food-borne illness outbreaks occur, epidemiologists depend on affected people to provide accurate and complete recall of foods they have eaten within the past two to seven days. However, studies using standard interview questions reveal that errors in food consumption recall are common (Decker, Booth, et al., 1986; Mann, 1981). The cognitive interview is based on five principles of cognition and memory retrieval: (1) context reinstatement, (2) focused retrieval, (3) extensive retrieval, (4) varied retrieval, and (5) multiple representations. Using the principle of context reinstatement, Fisher and Quigley asked people to think about environmental conditions (for example, room, lighting, people present) and psychological context (for example, reason for selecting the food) when they ate a meal. To recall events, people need to be able to retrieve information from memory (focused retrieval). To encourage them to do so, the investigators allowed people to think without interruptions or distractions, and they encouraged the respondents to take their time in performing this task. The principles of extensive and varied retrieval suggest that the more attempts one makes and the more varied the attempts, the more successful one will be in recalling food eaten. Thus, the investigators asked people to continue to search their memories even after they said they could remember no more. To make this task less redundant, the investigators varied the questions.
For example, a direct question, such as “Did you have mayonnaise on your sandwich?”
would later be followed by the request, “Tell me what was on your sandwich.” The multiple-representation principle is based on the idea that images of an event are stored in different forms and asking about these different images may elicit new information.
Food images might include food on the plate, food carried by the waiter, and food shared with others. Information from Fisher and Quigley’s studies indicates that retrieval of information and accuracy are greater using the cognitive interview than using standard interview questions.
This approach has its disadvantages. The most significant is the amount of time it takes to conduct a cognitive interview. In their study, Fisher and Quigley noted that the cognitive interview took an average of 15 minutes and the interview with standard questions an average of 1.5 minutes. The cognitive interview is also unsuitable for the written questionnaire. The investigator might include specific instructions on a questionnaire asking a person to be mindful of the context of a situation and spend