Figure 5.1 presents the steps for test construction and evaluation. The six steps are (1) state the purpose of the test, (2) state the test objectives, (3) review content, (4) develop a table of test specifications, (5) write test items, and (6) conduct an item analysis to evaluate the quality of the items and that of the overall test.
State the Purpose of the Test
The test construction process begins with a statement of the purpose of the test.
Among behavioral scientists, the three primary reasons for using knowledge tests are (1) to assess the knowledge base of a particular population on a particular health topic, (2) to determine whether knowledge of a health topic is related to behavior or attitudes in a particular population, and (3) to determine whether knowledge of a health topic changes following an intervention. It is important to state the purpose of the test because the reason for measuring knowledge dictates to some extent the content, specificity, and difficulty level of the items included in the test. In the achievement test literature, the link between the purpose of the test and the characteristics of the test (type and difficulty of items) is fairly well established. It is generally accepted that there are two major uses of achievement tests (Gronlund, 1977). One is to rank students according to their test performance. This type of test, called a norm-referenced test, allows teachers to compare each student’s performance with that of other students. The second type of test, called a criterion-referenced test, gauges the student’s performance against preset criteria. In this type of test, the teacher is interested in the extent to which each student has met the objectives for the course. Unfortunately, the link between the primary uses of knowledge tests in health behavior research and the characteristics of the tests has not been fully addressed. However, drawing from the achievement test literature, we will provide some suggestions for the behavioral scientist and the health educator to consider when developing knowledge tests.
In addition to providing direction for the ultimate use of the test, the statement of purpose also places boundaries around the content to be tested. Suppose a researcher wants to find out how much college students know about HIV. Because HIV knowledge assessment could include many aspects (incidence, prevalence, transmission, prevention, treatment, antiretroviral medications, quality of life, access to care, current research, or cost), the researcher must decide which information is most important to collect from college students. Most students are unlikely to be knowledgeable about HIV treatments, research, costs, and access to care. Because college students are sexually active and many do not use condoms, the researcher’s focus is most likely to be the students’ knowledge of HIV transmission and prevention practices.
FIGURE 5.1. STEPS IN THE DEVELOPMENT OF A KNOWLEDGE TEST.
[Flowchart boxes: State purpose of test; State test objectives; Review content; Develop table of test specifications; Write items; Conduct item analysis.]
State the Test Objectives
After determining the general purpose of the test, the researcher lists specific objectives to guide the development of actual test items. The process of developing a test is much easier when specific test objectives are stated. In the achievement test literature, test objectives are synonymous with learning objectives. Learning objectives are developed for each course, and generally these objectives are used to create tests of the course content. There are several approaches to the development of learning objectives (Gronlund, 1978; Osterlind, 1998). One of the most common, the taxonomy of educational objectives, was developed by Bloom (1956). Bloom describes three primary domains of objectives: cognitive (knowledge), affective (attitudes or emotions), and psychomotor (skills). Because our focus is on knowledge tests, we present only the cognitive domain.
The six cognitive objectives from Bloom’s taxonomy, in ascending order of complexity, are knowledge, comprehension, application, analysis, synthesis, and evaluation. Knowledge is the lowest level of complexity, requiring simple recall or recognition of information. Comprehension requires the translation of concepts or principles. Application is the use of concepts or principles in new situations. For example, defining the term measurement using one’s own words demonstrates comprehension, and using rules for item writing demonstrates application. Analysis entails breaking down information, and synthesis means combining information to form a new product. Students who are asked to compare and contrast different approaches to scale development use analytic skills, and those who are asked to construct a scale use synthesis skills. Evaluation, at the highest level of complexity, involves appraisal of information. For example, the psychometric assessment of a scale designed to measure quality of life for use among women with breast cancer demonstrates the ability to evaluate information.
Bloom (1956) provides a list of verbs to develop learning objectives for each level of the cognitive objectives presented above. For knowledge, Bloom suggests using verbs such as define, identify, list, know, name, and state. A few objectives for a test of HIV knowledge might be:
• Define HIV.
• List three ways in which HIV can be transmitted.
• Name four strategies to prevent HIV.
• Identify five ways in which HIV cannot be transmitted.
• Identify five strategies that are ineffective in preventing HIV.
• Know that HIV cannot be cured.
In educational institutions, instructors use Bloom’s cognitive objectives to develop learning objectives for courses. Teachers decide which information to present and test at each of the six levels. In fact, we developed the learning objectives at the beginning of each chapter in this book using Bloom’s taxonomy. Health educators who develop and offer programs on health topics are likely to use a similar taxonomy to create program objectives. Likewise, behavioral scientists testing interventions to change behaviors are likely to state intervention objectives that indicate expected outcomes associated with participation in the interventions. In each case, learning objectives written to assess the cognitive domain (knowledge) can be used to create test objectives.
The situation for a researcher who wants to add a test of knowledge to a survey or who wants to examine the relationship between knowledge and a specific health behavior is a bit different. Generally, the researcher does not have a set of learning objectives from which to create test objectives. Instead, the researcher needs to consider carefully the content of the test and the population for which it is being developed. Using this information, the researcher can create a set of test objectives.
Review Content
After making a decision about the scope of the test, the researcher uses certain resources to obtain accurate information about the topic. These resources include a review of the literature, discussions with scientists or other experts in the topic area, and contact with agencies or organizations that provide information on the topic. For example, the Centers for Disease Control and Prevention (CDC) houses the most up-to-date information on HIV prevention. Our researcher who wants to test students’ knowledge about HIV prevention would be wise to obtain literature from the CDC and visit the CDC Web site. In Chapter Seven, we will provide a more detailed discussion of the types of resources that can be used to develop test items.
Develop a Table of Test Specifications
Because behavioral scientists and health educators generally limit their assessment of the cognitive domain to that of knowledge, that is, the recall or recognition of facts, rules, or other information, we will emphasize tests of knowledge in the discussion that follows.
After the decision on the scope of the test and the review of materials, the researcher is ready to write the knowledge items. The item-writing process for tests begins with the development of a table of test specifications (also called a test blueprint). The table of specifications is a matrix in which the rows identify the content to be tested and the columns indicate the dimensions of that content. For educational tests, developers use cognitive objectives such as Bloom’s as the test dimensions (columns).
Table 5.1 is a sample table of test specifications for a test for a course on measurement. The content domains are a definition of measurement, the history of measurement, types of questionnaire items, types of scales, and the item-writing process. This sample table of test specifications represents all six cognitive objectives (knowledge, comprehension, application, analysis, synthesis, and evaluation) in the columns. X’s indicate the cognitive objective that will be assessed on the test for each content domain. For example, items written about the definition of measurement will be written at the knowledge level, whereas those written to assess the history of measurement will be written at both the knowledge and comprehension levels. Some items assessing the content of types of questionnaire items and scales will be written at the higher levels, including analysis, synthesis, and evaluation.
Table 5.2 is a more useful table of test specifications for test development. In this table, specific learning objectives make up the content domain (rows), and the cognitive objectives make up the dimensions (columns). Notice that rather than X’s, there are numbers in the boxes, to indicate the number of items that will be written for each intersection of content and dimension.
A table of test specifications for a ten-item HIV knowledge test might look like Table 5.3. According to this table, the researcher will create a ten-item test with the following items: two items to measure knowledge of ways in which HIV can be transmitted, two for knowledge of ways in which it cannot be transmitted, two for knowledge of strategies that are effective in preventing HIV, two for knowledge of strategies that are ineffective, and one item each to measure knowledge about what HIV is and its ultimate outcome. The researcher will assess each item at the lowest level of cognitive complexity: knowledge.
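The blueprint just described can also be written down as a small data structure, which makes it easy to confirm that the planned item counts add up to the intended test length. The following is only a minimal sketch in Python: the names BLUEPRINT, total_items, and check_blueprint are hypothetical and chosen for illustration, and the item counts are the ones given in the text for the ten-item HIV knowledge test.

```python
# Hypothetical encoding of the ten-item HIV knowledge blueprint.
# Keys are the test objectives (rows of the table); values are the number
# of items planned at the knowledge level (the only column used here).
BLUEPRINT = {
    "List three ways in which HIV can be transmitted": 2,
    "Identify five ways in which HIV cannot be transmitted": 2,
    "Name four strategies to prevent HIV": 2,
    "Identify five strategies that are ineffective in preventing HIV": 2,
    "Define HIV": 1,
    "Know that HIV cannot be cured": 1,
}


def total_items(blueprint):
    """Return the total number of items planned across all objectives."""
    return sum(blueprint.values())


def check_blueprint(blueprint, planned_length):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    for objective, count in blueprint.items():
        if count < 1:
            problems.append(f"No items planned for objective: {objective}")
    total = total_items(blueprint)
    if total != planned_length:
        problems.append(
            f"Blueprint has {total} items but the test should have {planned_length}"
        )
    return problems


if __name__ == "__main__":
    print("Total items:", total_items(BLUEPRINT))
    print("Problems:", check_blueprint(BLUEPRINT, planned_length=10))
```

A researcher who later adds or drops an objective could rerun the check to see whether the test still fits the time available to respondents.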
Domain-Sampling Model
Notice that in Table 5.3, the researcher expects to write fewer items than those listed in the test objectives (presented earlier under test objectives). Although students may be expected to know many specific facts about HIV, the researcher anticipates that in his or her study the students will have only a few minutes to respond about their HIV knowledge. Thus, the investigator must make a decision about which items to include on the test. Sometimes researchers are concerned that a brief test may inadequately reflect students’ knowledge. However, if the researcher develops and evaluates the test from a domain-sampling model perspective, ten items may prove sufficient to provide a reasonable estimate of the student’s knowledge of HIV prevention. The domain-sampling model assumes that items selected for a test are a random sample of all possible items that measure the content of interest. We can develop a number of tests using a large pool of test items. Each test would be composed of a different set of items, but some items from one test might appear on another. If we select the items randomly, a student’s scores on any one test will be strongly correlated with scores on all possible tests created from the same pool of items. Thus, any one test should provide a good estimate of the student’s knowledge.
TABLE 5.1. SAMPLE TABLE OF TEST SPECIFICATIONS.
Cognitive Objectives (columns): Knowledge, Comprehension, Application, Analysis, Synthesis, Evaluation
Content Domain                  Cognitive Objectives Marked
Definition of measurement       X
History of measurement          X X
Types of questionnaire items    X X X X X X
Types of scales                 X X X X X X
Item-writing process            X X X X
TABLE 5.2. SAMPLE TABLE OF TEST SPECIFICATIONS USING INDIVIDUAL LEARNING OBJECTIVES.
Cognitive Objectives (columns): Knowledge, Comprehension, Application, Analysis, Synthesis, Evaluation
Content Domain                                                        Number of Items
Describe the characteristics of a survey                              2
Discuss the basic principles of survey construction                   2
Apply principles of item writing                                      2
Revise poorly written items                                           2
Describe the association between item wording and measurement error   2
Figure 5.2 presents the domain-sampling model. The larger circle represents the pool of all possible items for a given content domain. The smaller circles represent randomly selected sets of items that make up individual tests. Each letter represents a test item. If randomization is conducted in such a way that items are replaced after selection for each test, an item can appear on more than one test.
TABLE 5.3. TABLE OF TEST SPECIFICATIONS FOR AN HIV KNOWLEDGE TEST.
Content Domain                                                    Knowledge (# of Items)
List three ways in which HIV can be transmitted                   2
Identify five ways in which HIV cannot be transmitted             2
Name four strategies to prevent HIV                               2
Identify five strategies that are ineffective in preventing HIV   2
Define HIV                                                        1
Know that HIV cannot be cured                                     1
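FIGURE 5.2. DIAGRAM OF THE DOMAIN-SAMPLING METHOD.
[Diagram: a large circle holding the pool of test items, each labeled with a letter (A through O, Q, and R), with smaller circles marking the item sets selected for individual tests.]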
The domain-sampling method requires the development of a pool of items. The process of creating a large pool of items is time-consuming and sometimes impractical when a researcher plans to use just one knowledge test in a study. Even when not using the test pool, the researcher can develop more items than are required for the test. The researcher can then use the item analysis techniques presented later in this chapter to select the items that perform best for the population of interest.
Behavioral scientists often include all possible knowledge items in a given content domain to create a comprehensive test. For example, they may include all possible routes of HIV transmission on an HIV knowledge test. However, if the test is conceptualized within the domain-sampling model, scores on a test composed of an adequate representation of items should correlate with the scores on a test composed of a comprehensive set of items.
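The reasoning behind the domain-sampling model can be illustrated with a small simulation. The sketch below is not taken from the chapter and rests on simplifying assumptions that are labeled in the comments: every student is given a single, made-up level of knowledge, every item in the pool is treated as equally difficult, and two ten-item tests are drawn at random from a pool of 100 items. The function names pearson and simulate, and all parameter values, are hypothetical. The simulation returns the correlation between scores on the two short tests and the correlation between one short test and the comprehensive test built from the whole pool.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_students=500, pool_size=100, test_length=10, seed=0):
    """Correlate two random short tests with each other and with the full pool."""
    rng = random.Random(seed)

    # Assumption: each student's knowledge is a single probability of
    # answering any item in the pool correctly (all items equally difficult).
    knowledge = [rng.uniform(0.1, 0.95) for _ in range(n_students)]

    # responses[s][i] is 1 if student s answers pool item i correctly, else 0.
    responses = [
        [1 if rng.random() < p else 0 for _ in range(pool_size)]
        for p in knowledge
    ]

    # Each test is a random sample of items from the pool. The two samples
    # are drawn independently, so an item may appear on both tests.
    test_a = rng.sample(range(pool_size), test_length)
    test_b = rng.sample(range(pool_size), test_length)

    scores_a = [sum(row[i] for i in test_a) for row in responses]
    scores_b = [sum(row[i] for i in test_b) for row in responses]
    scores_pool = [sum(row) for row in responses]  # comprehensive test

    return pearson(scores_a, scores_b), pearson(scores_a, scores_pool)


if __name__ == "__main__":
    r_tests, r_pool = simulate()
    print(f"Correlation between two ten-item tests: {r_tests:.2f}")
    print(f"Correlation between a ten-item test and the full pool: {r_pool:.2f}")
```

Changing test_length in this sketch shows how the agreement between a short test and the comprehensive test depends on how many items are sampled. With real items, which differ in difficulty and quality, the correlations would depend on the item pool and the population tested.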