Basic Concepts in Research

1.5 Validity Issues

Internal Validity

Much of this chapter has focused on the importance of and the means by which definitive conclusions can be made regarding the effect of the independent variable on the values yielded by the dependent variable. As we might imagine, some studies do a better job than others of making this connection clear and unambiguous. We use the term internal validity to capture this idea. That is, to what degree can one unambiguously attribute changes in the dependent variable exclusively to the action of the independent variable? As our certainty grows, internal validity goes up. As we recognize more and more competing explanations for changes in the dependent variable, our internal validity goes down (Shavelson, 1988). We might want to think of the internal validity of a study as a measure of how logically “tight” and “tidy” the inner workings of the research effort are. Various kinds of reasons can account for a lack of internal validity. For example, real-world constraints may impose inescapable limitations on the experimenter’s ability to fully control all extraneous variables, properly manipulate the independent variable, or carefully measure the dependent variable. Not all causes of diminished internal validity are inevitable, however. There are also many avoidable causes that are actually just instances of poor methodological thinking or poor methodological execution.

Thankfully, we can often learn from our mishaps. In fact, frequently the critique of an initial study can reveal an unconsidered rival explanation for the findings, and this, in turn, will open up the door to a new and fruitful study. What was found to be a confounding variable in the initial experiment can then be used as an independent or dependent variable in a follow-up study. For example, let us revisit our research on the relationship between mental imagery and pain tolerance, but now we will introduce the variable “anxiety.” Evidence from past research suggests that highly anxious people tolerate pain poorly (Barber & Hahn, 1962). Suppose that in our study we unintentionally varied anxiety systematically with the imagery conditions. For example, perhaps close analysis of the experiment would reveal that we had participants in the second imagery condition waiting longer in the reception room before their pain tolerance was measured, compared with those in the initial imagery condition. This delay in time could have allowed more anxiety to build. When a reviewer alerts us to this potential confound, not only could we rerun the study correcting this confound, but we might also decide to add anxiety as a new independent or dependent variable to explore.
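To make the logic of this confound concrete, the following is a minimal simulation sketch in Python. Everything in it is a hypothetical assumption chosen for illustration: the waiting times, the numerical link between waiting and anxiety, the link between anxiety and pain tolerance, and the names `simulate_participant` and `group_mean`. The simulated imagery conditions are deliberately given no effect of their own, so any difference between the groups is produced entirely by the unequal waiting times.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

def simulate_participant(condition):
    # Assumption: the second imagery condition waits longer in reception.
    wait_minutes = 5 if condition == "imagery_1" else 25
    # Assumption: anxiety rises with waiting time, plus individual noise.
    anxiety = 20 + 1.0 * wait_minutes + rng.gauss(0, 5)
    # Assumption: tolerance (seconds) depends on anxiety only. Imagery has
    # NO effect here, so any group difference comes from the confound.
    tolerance = 120 - 1.5 * anxiety + rng.gauss(0, 10)
    return {"condition": condition, "anxiety": anxiety, "tolerance": tolerance}

# 100 simulated participants in each imagery condition.
participants = [
    simulate_participant(condition)
    for condition in ["imagery_1", "imagery_2"]
    for _ in range(100)
]

def group_mean(condition, key):
    values = [p[key] for p in participants if p["condition"] == condition]
    return sum(values) / len(values)

# Positive values mean the first condition tolerated pain longer and the
# second condition was more anxious.
tolerance_gap = group_mean("imagery_1", "tolerance") - group_mean("imagery_2", "tolerance")
anxiety_gap = group_mean("imagery_2", "anxiety") - group_mean("imagery_1", "anxiety")

print(f"Tolerance gap (seconds): {tolerance_gap:.1f}")
print(f"Anxiety gap (scale points): {anxiety_gap:.1f}")
```

A researcher looking only at the tolerance gap would be tempted to credit the type of imagery, even though in this simulated world the gap is due to anxiety alone. Equalizing the waiting time across conditions in the sketch would remove the gap.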

As previously noted, a tightly controlled study, one having high internal validity, can provide the researcher with evidence that a cause–effect relationship exists between the independent and dependent variables. Conversely, studies that are not tightly controlled and have low internal validity will not justifiably merit the claim of causality. Box 1.2 presents a published study in which a confounding variable reduces the internal validity of the study and presents a rival explanation for its results.

Box 1.2 Feeling Good and Helping Others: A Study with a Confound

The topic of generosity and helpfulness has long been a popular topic in psychology. Discovering the circumstances that encourage and discourage helpfulness increases our understanding of this important behavioral phenomenon and may suggest ways in which we can facilitate prosocial behavior for the betterment of society (Kanfer & Grimm, 1980). Some of the factors that influence helpfulness include the observation of a charitable model (Rosenhan & White, 1967), the relationship between the helper and the recipient (Goranson & Berkowitz, 1966), a predisposition to value the welfare of a person in need (Batson, Eklund, Chermok, Hoyt, & Ortiz, 2007), and past help received by the would-be helper (Berkowitz & Daniels, 1964). Based on an intuitive formulation (i.e. a hunch), Isen (1970) predicted that the positive feelings one experiences after success (the warm glow of success) would promote generosity. Participants were randomly assigned to experimental conditions in which half of them received success feedback after completing perceptual–motor tasks and the other half were told that they had failed at the tasks. After the experimental manipulation (success or failure feedback), a confederate entered the room and casually placed a canister on a nearby table for donations toward a school project. (A confederate is someone who, unbeknown to the participant, is really part of the experiment and has a prescribed role to play in the study.) The dependent variable was the amount of money participants donated. Consistent with the hypothesis, those participants experiencing the positive feelings of success donated almost twice the amount of money as did those who had been told they had failed at the task.

What is the confound?

The researcher asserted a causal connection between a positive emotional state and generosity. The question to ask is: “Did the experimental manipulation (success/failure feedback) only alter the emotional states of the participants?” The answer is likely “no.” Participants’ self-perceptions of competence were probably also altered by the success or failure feedback. “Success” participants were not only in a better mood than “failure” participants but may have also seen themselves as more competent than those who failed. And so, was it the participants’ emotional state that determined generosity (as the researcher intended to show), or was it their perceptions of competence? The variables competence and emotional state were confounded, each varying systematically with the other. This created a problem when it came to interpreting the finding.

Thankfully, in a subsequent experiment, the researcher was able to induce a positive mood in participants through an experimental procedure that did not affect self-perceptions of competence, thereby isolating “mood” as a causal variable. The results of this second study showed that a positive affective state did enhance helpfulness (Isen & Levin, 1972).

When real-world constraints do not allow for full control of all variables, a researcher may still choose to move forward with a study, even though the internal validity may be compromised. For example, a professor who wanted to see if a new teaching tool (e.g. a new type of review game) was effective in the classroom might expose one section of the class to this new teaching tool but leave another section unchanged and then compare test performances over the relevant material. Please note that students were not randomly assigned to the classes; rather, they signed up for the class they wanted or that best fit into their schedule. This opens up the possibility that one class may represent a different type of student population than the other. Imagine that perhaps one class meets at 8:00 in the morning and the other at 2:00 in the afternoon. These classes, although they are both composed of students from the same institution and may have many other similarities, may represent quite different subgroups of students, namely, those who prefer morning classes and those who prefer afternoon classes. Furthermore, during the course of the experiment, each class may experience other unique events that only occur in that class (e.g. a fire drill, problems with the classroom technology). Since these other events are not the independent variable, they now compete with the new teaching tool to explain any difference in the outcomes between the classes. These are not ideal experimental situations; however, oftentimes they are the only viable method available to a researcher attempting to establish a causal relationship. Such designs are called quasi-experiments because they share many of the same characteristics with true experiments and yet the participants are not randomly assigned to conditions.

External Validity

A researcher must also analyze the extent to which the experimental findings can be justifiably generalized, thereby reaching beyond the limited context of the study. After all, no one would be interested in learning about the relationship between mental imagery and pain tolerance if it is believed that the findings only pertain to those participants involved in the study and to only the specific relationship between a particular form of mental imagery and the act of holding one’s hand in ice-cold water. Obviously, the value of the study only emerges when we consider the finding in a more generalized context: people in general and pain tolerance in general. The degree to which our study legitimately applies to these broader external categories is of critical importance. Stated formally, “External validity asks the question of generalizability: To what populations, settings, treatment variables, and measurement variables can this effect be generalized?” (Campbell & Stanley, 1963, p. 5). The external validity of a study can be very difficult to judge and is often subject to intense professional debate.

Strictly speaking, there is no way, without running numerous other studies, to determine whether the results of a research study would be replicated if the experiment were conducted with different participants, in a different location, using a slightly different independent variable or measuring a slightly different dependent variable. Thankfully, there are ways to think about these applicability issues that do not require the running of an endless series of near-identical studies.

Let us look at the problem of different participants first. One’s confidence in generalizing to others not involved in the study can be increased by the method used to select participants for the study. Random sampling is the “gold standard” for sampling participants. It occurs when the means of selecting participants for a study is such that each participant in the population has an equal chance of being included in the sample. (Note that this is different from “random assignment.” Random assignment is a technique used to assign already selected research participants to the various conditions of a study. Failure to assign randomly affects the internal validity of the study, not the external validity.) A biased sample may occur whenever each member of a population does not have an equal chance of being included in the sample.
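The difference between random sampling and random assignment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population list, the sample size of 40, the condition labels, and the helper names `draw_random_sample` and `randomly_assign` are all hypothetical. The first step decides who enters the study at all; the second step decides which condition each already selected participant experiences.

```python
import random

def draw_random_sample(population, sample_size, rng):
    """Random sampling: every member of the population has an equal
    chance of being included in the sample."""
    return rng.sample(population, sample_size)

def randomly_assign(participants, conditions, rng):
    """Random assignment: shuffle the already selected participants, then
    deal them out to the conditions in turn so group sizes stay nearly equal."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Hypothetical population list; in real research such a complete list
# is rarely available.
population = [f"person_{i}" for i in range(1000)]

sample = draw_random_sample(population, 40, rng)                   # who is studied
groups = randomly_assign(sample, ["imagery_A", "imagery_B"], rng)  # who gets which condition

for condition, members in groups.items():
    print(condition, len(members))
```

Note that the two steps are independent of one another. A study can assign participants randomly to conditions even when the sample itself was not randomly drawn, and the reverse is also possible.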

Before moving on, let us clarify a few new terms that have just been introduced: populations, sampling, and samples. A population can be simply defined as “every member of a given group.” For example, imagine we want to study the effectiveness of a new teaching technique designed to help adults learn how to read English. We can label our population, then, as adults who are engaged in learning how to read English. Of course, we cannot include every single person in this entire population in our study; rather, we will select a subset of individuals from that population to investigate. The process of selecting participants is called sampling; and the group of individuals, once selected, is called a sample.

Let us return to the issue of participant variables. Within any given population, there will be a host of participant variables distinguishing one participant from another. Just as random assignment creates groups within a study that are roughly similar across all of these participant variables, random sampling creates a group of participants that roughly captures the blend of participant variables found in that population. This is important because it allows the researcher justifiably to claim that findings obtained from their study should also pertain to the population as a whole. This gives us good reason to presume our findings are externally valid and applicable to other participants in our population. Conversely, if a sample is not randomly gathered, certain participant variables may be overrepresented, and other participant variables may be underrepresented. When this is the case, confidence regarding the external validity of the findings falters. For example, if we tested our new teaching technique only on new immigrants coming from countries using the Roman alphabet (the same letters we use in writing English), we should be cautious about concluding that our findings will pertain to immigrants coming from places where writing systems employing different characters (e.g. Chinese characters) were used. Our sample does not adequately “represent” this larger population. Logically, if this were the case, we should realize that the population we have actually sampled from is not adults learning how to read English, but rather adults learning how to read English who came from countries using the Roman alphabet. Clearly, this is a much smaller population, and the external validity of our findings necessarily becomes more suspect as we extrapolate to people outside the population from which we sampled.
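The consequence of an unrepresentative sample can be illustrated with a small simulation, again written as a Python sketch. All of the numbers are invented for illustration: the population of 10,000 learners, the even split between writing systems, and especially the assumption that learners already familiar with the Roman alphabet gain more from the technique. The names `make_learner` and `mean_gain` are likewise hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

def make_learner(script):
    # Assumption for illustration only: learners already familiar with the
    # Roman alphabet gain more (mean 12) than other learners (mean 6).
    typical_gain = 12 if script == "roman" else 6
    return {"script": script, "gain": rng.gauss(typical_gain, 2)}

# Hypothetical population: adults learning how to read English.
population = (
    [make_learner("roman") for _ in range(5000)]
    + [make_learner("other") for _ in range(5000)]
)

def mean_gain(learners):
    return sum(person["gain"] for person in learners) / len(learners)

# Random sample: every member of the population has an equal chance.
random_sample = rng.sample(population, 200)

# Biased sample: only learners from Roman-alphabet countries can be chosen.
roman_only = [person for person in population if person["script"] == "roman"]
biased_sample = rng.sample(roman_only, 200)

print(f"Population mean gain:    {mean_gain(population):.2f}")
print(f"Random sample mean gain: {mean_gain(random_sample):.2f}")
print(f"Biased sample mean gain: {mean_gain(biased_sample):.2f}")
```

Under these assumptions, the random sample tends to land close to the population value, while the biased sample overstates how well the technique works for the population as a whole. The biased estimate is not wrong for the narrower population it was actually drawn from; it is only misleading when extended beyond it.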

It is important to realize that even though random sampling is the gold standard for achieving representative samples, rarely is it employed. Take the example we used above: How could one perfectly select a truly random sample of adults learning how to read English? While the population can be easily described in the abstract, it is functionally impossible to gain access to it for the purpose of random sampling. Fortunately, many other less-than-ideal-but-more-or-less-adequate sampling procedures can be employed when doing social science research. The specifics of these techniques are usually treated with greater detail in methodology textbooks. In a similar vein, there is the issue of sample size. What is the minimally acceptable number of participants necessary for a sample to be considered representative of its parent population? This can be a complex determination and is customarily discussed with greater precision in textbooks focused on research methodology.

Let us now return to the concept of external validity regarding slightly different locations, independent variables, and dependent variables. Determining the external validity of findings in these situations is much more a matter of reasoned argumentation than one of calculating probabilities and likelihoods. For example, sometimes the findings generated by a study will naturally prompt researchers to think of other similar settings and variables for which these findings might apply. A good example of this is a study by Baddeley and Longman (1978) that compared mass practice with distributed practice for learning the new skill of typing. The basic question was this: How is practice time best spent if one is learning to use a typewriter with a new arrangement of keys? One group practiced in mass (i.e. long practice sessions within a short window of time), while another group distributed their practice (i.e. shorter practice sessions spread out over a longer stretch of time). Both groups actually spent the same total amount of time typing on the new keyboards. The findings showed quite convincingly that those who had a more distributed set of practice sessions ended up typing more quickly and with fewer errors. Many theorists correctly suspected that Baddeley and Longman demonstrated a more general principle regarding the relationship between mass and distributed practice (for example, see Box 1.3). The external validity for Baddeley and Longman’s findings appears to be high.

The external validity of a particular finding is best judged by examining the body of existing research to see if a similar finding has occurred with participants from different populations using different experimental procedures and different measures of the dependent variable. Findings that hold up under a wide range of circumstances (like the “distributed over mass practice” finding) are termed robust. Oftentimes a researcher who has uncovered an interesting finding will go on to conduct a series of related studies in which they systematically alter populations, settings, and related variables in order to establish the robustness of their finding. This is often referred to as conducting a line of research.

Internal and external validity are only two among many different kinds of validities. They are all important to the research process, and much fuller treatment of these concepts can be found in various methodology textbooks.