THE DEVELOPMENT AND VALIDATION OF A GROUP TEST OF LOGICAL THINKING



KENNETH G. TOBIN¹
University of Georgia

WILLIAM CAPIE
University of Georgia

The paper describes the development of the Test of Logical Thinking (TOLT) to measure five modes of formal reasoning: controlling variables, proportional reasoning, combinatorial reasoning, probabilistic reasoning, and correlational reasoning. Each of the 10 items requires participants to select a correct response and justification from a number of alternatives. Analysis of data from 682 students from grades 6 through college indicated high test reliability (coefficient α = .85) and provided confirmation that the test measured one major underlying dimension termed formal thought. Evidence of criterion-related validity was obtained from a study in which 88 students from grades 10 through college were assessed on the TOLT and on five interview tasks. A correlation of .80 (p < .0001) suggested a strong relationship between the two measures of formal reasoning.

Educational and Psychological Measurement, 1981, 41

Initially studied by Piaget and his colleagues, the development of formal reasoning ability has been extensively researched in adolescents and adults (e.g., Arlin, 1975; Chiappetta, 1976; Farrell, 1969; Lovell, 1961). In the majority of cases, clinical interviews based on protocols described by Inhelder and Piaget (1958, 1975) have been used to assess formal reasoning ability. Two important trends that have emerged from research are that many adolescents and adults are limited in their ability to use formal modes of reasoning and that formal reasoning ability is an important mediator of cognitive achievement (e.g., Cantu and Herron, 1978; Goodstein and Howe, 1978). As a consequence, researchers have emphasized the importance of modifying instructional objectives, materials, and activities so that they are suited to the cognitive development of learners. Concomitantly they have urged that priority be given to the development of formal reasoning ability of middle and high school students through the use of appropriate curriculum materials.

¹ Senior author has returned to the faculty at Mount Lawley College, Western Australia 6050.

Copyright © 1981 by Educational and Psychological Measurement

Action on either of these concerns requires that formal reasoning ability be assessed in a valid and reliable manner. The clinical interview procedure is not suited for widespread administration because of the time required to administer a set of tasks to a group of students and because of the level of expertise required of the interviewer. In a research context an additional problem derives from the subjectivity of the procedures used. The advantages of simultaneously testing large groups of subjects have led to a number of attempts to develop valid group measures of formal reasoning ability (e.g., Burney, 1974; Lawson, 1978; Staver and Gabel, 1979; Tisher and Dale, 1975). Items from most of these tests were based on content and logical processes derived from the works of Inhelder and Piaget (1958, 1975).

Ease of administration accompanied by objective scoring procedures is an inducement for researchers to develop a valid and reliable pencil-and-paper measure of formal reasoning ability. A major difficulty, however, is to ensure that subjects use formal reasoning ability to solve the items on the test. Whether subjects use formal reasoning or not may be ascertained from their reasons for developing or choosing a response. A characteristic of the clinical procedures that has generally not been incorporated into pencil-and-paper tests is the necessity for subjects to justify their solution to a problem. Exceptions were the test items described by Lawson (1978) and Lawson, Adi, and Karplus (1979), which required subjects to provide written justifications for their solutions to problems. Another important aspect of the procedure employed by Lawson was the use of demonstrations to provide a realistic context for the problems to be solved.

The purpose of this study was to develop a group test of formal reasoning ability that would require students to solve problems and to justify the solutions obtained. Development of the Test of Logical Thinking (TOLT) commenced with a selection of ten items previously reported by Lawson (1978) and Lawson et al. (1979).

Procedures

In the sections that follow, procedures are described for the development of the TOLT, and investigations of internal consistency, construct validity, and criterion-related validity are detailed.


Development of TOLT

Items that had been used in prior research (Lawson, 1978; Lawson et al., 1979) were employed as a basis for developing an initial version of TOLT. The adoption of this procedure assured that TOLT would contain items that had been previously reported as valid measures of formal reasoning ability. Two items were selected to measure each of five modes of formal reasoning: controlling variables, proportional reasoning, probabilistic reasoning, correlational reasoning, and combinatorial reasoning. The test incorporated demonstrations to provide a context for the items. Students selected a correct response from a number of alternatives and provided written justification for their selection.

Although the reliability estimate of the TOLT was reasonably high (α = .74), several factors were apparently reducing validity. For example, many high school students were unable to formulate a clear written justification for selecting a particular response. Rescoring these responses suggested many inconsistencies in scoring. Because of these problems, the initial version of TOLT was modified so that multiple justifications were provided as well as multiple solutions for each problem. Reasons that had been volunteered by subjects on the first version of TOLT were modified and incorporated into a revised version of the test. In the revised test a correct solution required selection of the correct response and the best justification for the response. A sample item is included in Figure 1 to illustrate the item format.
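The scoring rule for the revised test (credit only when both the response and the best justification are selected) can be illustrated with a minimal sketch. The option labels, the answer key, and the function names below are hypothetical and are not taken from the TOLT itself; the sketch only shows the conjunctive rule described above.

```python
from typing import Sequence, Tuple

# Each answer is a (response, justification) pair of option labels.
Answer = Tuple[str, str]


def score_item(given: Answer, key: Answer) -> int:
    """Return 1 only when both the response and the justification match the key."""
    return int(given[0] == key[0] and given[1] == key[1])


def score_test(answers: Sequence[Answer], keys: Sequence[Answer]) -> int:
    """Total score: the number of items on which both parts are correct."""
    if len(answers) != len(keys):
        raise ValueError("answers and keys must have the same length")
    return sum(score_item(a, k) for a, k in zip(answers, keys))


# Hypothetical three-item key and one student's answers (labels are made up).
keys = [("b", "3"), ("a", "1"), ("d", "2")]
student = [("b", "3"), ("a", "4"), ("c", "2")]
total = score_test(student, keys)
```

In this made-up case the student earns credit on the first item only: the second item has the right response with the wrong justification, and the third has the wrong response.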

The revised test also utilized a color video-tape to present the context for each problem and to standardize administration procedures. Adequate time was provided for each item to be completed. The time required for the test to be administered was approximately 40 minutes.

Figure 1. A sample item from the TOLT.

Investigations of TOLT: Validity and Reliability

Three samples were used in an investigation of the reliability and validity of the TOLT. The TOLT was administered to a sample of 353 students in middle school grades 6, 7 and 8; to a sample of 82 physics and chemistry students from grades 11 and 12; and to 247 students enrolled in college science courses.

Intact classes were employed in all cases to obtain data for the reliability and construct validity investigations. In such studies the purpose was not to differentiate subjects at different grade levels, but to identify the extent to which test items measure a common dimension. An adequate investigation of these test characteristics requires a range in ability of examinees on the construct being measured. As a consequence, subjects were selected from a number of grade levels.

Prior research (Renner and Grant, 1978) had shown the developmental levels of physics students to be distributed differently from the developmental levels of all students. A higher proportion of physics than of other students was found to be operating at a formal operational level. As a consequence, students in grades 11 and 12 from physics and chemistry classes were selected in an endeavor to obtain a greater range in formal reasoning. The sample was not chosen to be representative of the population of grade 11 and 12 students.

The internal consistency of TOLT was assessed from the complete data set (n = 682) by using coefficient α (Cronbach, 1951). Two separate factor analyses were conducted in an investigation of the underlying structure of the performance data. In the first, the data to be analyzed consisted of performance on each of the five hypothesized modes of formal reasoning (each mode representing performance on two items). In the second analysis, performance on each item was analyzed.
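For readers who wish to reproduce an internal consistency estimate of this kind, a minimal sketch of coefficient α follows. It assumes items scored 0 or 1 and uses the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The response matrix is invented for illustration and is not TOLT data.

```python
from statistics import variance
from typing import List


def cronbach_alpha(scores: List[List[int]]) -> float:
    """Coefficient alpha; scores[i][j] is the score of examinee i on item j."""
    k = len(scores[0])
    # Sample variance of each item across examinees.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total scores (raises ZeroDivisionError if it is 0).
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five examinees to four items (not TOLT data).
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(demo)
```

The same function could be applied to a two-item subtest by passing only the two relevant columns.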

In each factor analysis a principal axis procedure was used to extract common factors. In this procedure, diagonal elements of the correlation matrix of the five modes of reasoning were replaced with communality estimates. The initial estimates were the squared multiple correlation of performance on each mode of reasoning with that on the remaining modes of reasoning. A Scree test (Cattell, 1966) and the Kaiser varimax criterion (Kaiser, 1960) were jointly applied to obtain a final solution to the factor analysis. Rotation for interpretation was not required, as one-factor solutions were obtained in each case.
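The principal axis procedure described above can be sketched for the one-factor case as follows. This is an illustrative reconstruction under stated assumptions, not the authors' program: the iteration scheme, the power-iteration eigen-solver, the stopping rule, and all function names are choices made here, and the correlation matrix is generated from made-up loadings rather than from the TOLT data.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def invert(m: Matrix) -> Matrix:
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dominant_eigenpair(m: Matrix, steps: int = 200) -> Tuple[float, List[float]]:
    """Power iteration; assumes the largest eigenvalue is positive and dominant."""
    n = len(m)
    v = [1.0] * n
    lam = 0.0
    for _ in range(steps):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(x * x for x in w) ** 0.5
        v = [x / lam for x in w]
    return lam, v


def principal_axis_one_factor(r: Matrix, max_iter: int = 200, tol: float = 1e-8):
    """Iterated principal axis factoring, extracting a single common factor."""
    n = len(r)
    inv = invert(r)
    # Initial communalities: squared multiple correlations, 1 - 1 / diag(R^-1).
    h2 = [1.0 - 1.0 / inv[i][i] for i in range(n)]
    loadings = [0.0] * n
    for _ in range(max_iter):
        # Replace the diagonal of R with the current communality estimates.
        reduced = [[h2[i] if i == j else r[i][j] for j in range(n)] for i in range(n)]
        lam, vec = dominant_eigenpair(reduced)
        loadings = [x * lam ** 0.5 for x in vec]
        new_h2 = [x * x for x in loadings]
        done = max(abs(a - b) for a, b in zip(new_h2, h2)) < tol
        h2 = new_h2
        if done:
            break
    return loadings, h2


# Hypothetical one-factor correlation matrix built from made-up loadings.
true_loadings = [0.7, 0.6, 0.8, 0.5]
R = [[1.0 if i == j else true_loadings[i] * true_loadings[j] for j in range(4)]
     for i in range(4)]
loadings, communalities = principal_axis_one_factor(R)
```

Because the demonstration matrix has an exact one-factor structure, the recovered loadings should agree closely with the made-up values used to build it. With real data, the number of factors to retain would first be judged from the eigenvalues, for example with a Scree plot.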


Criterion-Related Validity of TOLT

The samples for this investigation consisted of 25 students enrolled in college science education courses and of 63 high school students from grades 10 through 12. The TOLT was administered to students followed by a battery of five clinical interviews selected from those described in Inhelder and Piaget (1958, 1975) to provide a measure of each of the formal modes of reasoning assessed by TOLT. On the basis of performance on each task students were rated one if they demonstrated formal thought in solving a problem or zero if they did not. Ratings for each task were summed to provide a measure of performance on the clinical interviews.
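The criterion measure described above (a sum of five 0/1 ratings) and its relation to the TOLT total can be sketched as follows. The ratings and TOLT totals are invented for illustration, and the use of a Pearson product-moment coefficient is an assumption, since the text does not name the coefficient.

```python
from math import sqrt
from typing import Sequence


def interview_score(ratings: Sequence[int]) -> int:
    """Sum of 0/1 ratings across the interview tasks."""
    if any(value not in (0, 1) for value in ratings):
        raise ValueError("each rating must be 0 or 1")
    return sum(ratings)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings for four students on five tasks, and their TOLT totals.
ratings = [[1, 1, 1, 1, 0], [1, 0, 0, 1, 0], [0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]
tolt_totals = [8, 3, 1, 10]
interview_totals = [interview_score(row) for row in ratings]
r_xy = pearson_r(interview_totals, tolt_totals)
```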

Results

Coefficient α for the TOLT, which was based on the total sample of 682 students, was .85. The internal consistency estimate of each two-item subtest ranged from .56 to .82. Descriptive data related to the items and subtests of the TOLT are contained in Table 1. Item difficulties ranged from .18 to .41 with an average of .30. Item discrimination indices ranged from .39 to .71 with an average of .55.
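The text does not state how the item statistics were computed. As a sketch under assumptions, the code below takes item difficulty to be the proportion of examinees answering the item correctly, and item discrimination to be the correlation between the item score and the total on the remaining items. Both definitions are common conventions adopted here for illustration, and the data are invented.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation (undefined when either sequence has zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_difficulty(scores: List[List[int]], j: int) -> float:
    """Proportion of examinees with a correct (1) score on item j."""
    return sum(row[j] for row in scores) / len(scores)


def item_discrimination(scores: List[List[int]], j: int) -> float:
    """Correlation of item j with the total score on the other items."""
    item = [row[j] for row in scores]
    rest = [sum(row) - row[j] for row in scores]
    return pearson_r(item, rest)


# Hypothetical responses of five examinees to four items (not TOLT data).
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
difficulties = [item_difficulty(demo, j) for j in range(4)]
discriminations = [item_discrimination(demo, j) for j in range(4)]
```

Under this convention a low difficulty value means a hard item, so an average of .30 would indicate that about three in ten examinees answered a typical item correctly.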

When the data were separately analyzed for each grade level a gradual increase in performance was evident from grade 6 to college level. A frequency distribution for each grade level in the sample is provided in Table 2.

Performance on the five modes of reasoning was moderately intercorrelated. The correlation coefficients included in Table 3 demonstrate the interrelated nature of performance on the five modes of formal reasoning incorporated into the TOLT.

TABLE 1
Descriptive Data for the TOLT

TABLE 2
Distribution of Formal Reasoning Ability

The factor analysis of the intercorrelations of measures, each of which was hypothesized to reflect one of the five modes of reasoning, produced a one-factor solution which accounted for 43% of the common variance. Each measured mode of reasoning was highly correlated with the factor, with the factor structure loadings ranging from .60 to .72. The factor structure loadings and communality estimates for this analysis are contained in Table 4.

In the second factor analysis of the intercorrelations of items a one-factor solution which accounted for 38 percent of the common variance was obtained. In this case the factor structure loadings, which ranged from .49 to .73 (Table 5), were indicative of a common (unidimensional) structure underlying performance on each item.

TABLE 3
Intercorrelations among Hypothesized Modes of Formal Reasoning on the TOLT (n = 682)
a Two items were included for each Mode.

Criterion-Related Validity

The correlation between performance on the interviews and scores on the TOLT was .80. Descriptive data for the clinical interviews are provided in Table 6.

Predictive Validity

Results from concurrent investigations at the University of Georgia (Table 7) indicated significant relationships between TOLT performance and that on other variables. These studies provided an indication of the predictive validity of the TOLT.

In a study involving 353 students from grades 6 through 8 (Tobin and Capie, 1980), 35 percent of the variance on a test of integrated science processes was attributable to variation in performance on the TOLT. Attenuation correction suggested a validity coefficient of .74.
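The attenuation correction referred to above is conventionally computed as the observed correlation divided by the square root of the product of the two reliabilities. The sketch below illustrates that formula only. The observed correlation is taken as the square root of the reported 35 percent of variance, but both reliability values are assumptions made here: the criterion reliability is not reported in this text, and the TOLT reliability in that particular sample may differ from the overall α of .85.

```python
from math import sqrt


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimated correlation between true scores, given the two reliabilities."""
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / sqrt(r_xx * r_yy)


# 35 percent shared variance corresponds to an observed r of about .59.
observed = sqrt(0.35)
# Assumed reliabilities, for illustration only (not values reported by the authors).
corrected = correct_for_attenuation(observed, r_xx=0.85, r_yy=0.75)
```

With these assumed reliabilities the corrected coefficient comes out near .74, which shows that the reported figure is plausible for values in this range. It should not be read as a reconstruction of the authors' actual calculation.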

TABLE 5
Factor Structure Loadings for the TOLT Items

TABLE 6
Descriptive Data for Clinical Interviews
a Two items are included for each Mode.

Tobin, Capie and Bradley (1980) also found that the TOLT performance of 150 college students was significantly correlated with integrated science process achievement (r = .49, p < .0001). They also reported a correlation of .49 (p < .0001) with a sample of high school students.

Bradley (1980) reported a significant relationship between TOLT performance for a sample of college students and two measures of visualization: the paper folding test (r = .61, p < .0001) and the surface development test (r = .55, p < .0001). Significant relationships were also reported for scores obtained on the College Board Scholastic Aptitude Test (SAT) (Educational Testing Service, 1948-1980). Yeany, Helseth, and Barstow (1980) cited significant relationships between TOLT performance and (a) achievement in college biology and (b) SAT scores.

TABLE 7
Predictive Validity Data for the TOLT

Discussion

The TOLT furnished a reliable means of assessing formal reasoning ability. The ten-item test has a high internal consistency and several of the subtests exhibit sufficient reliability to allow decision making at the subtest level. The reliability coefficients are of sufficient magnitude to enable the test to be used in diagnostic assessment, in a research context, or in studies designed to promote specific formal reasoning abilities. The magnitudes of intercorrelations among modes of formal reasoning were suggestive of a common underlying unidimensional structure. When the coefficients were corrected for the attenuation that occurs because of the unreliability of the subscales, the intercorrelations of the subscales ranged from .46 to .70.

Each of the measures reflecting different hypothesized modes of formal reasoning was highly correlated with the single factor in the solution obtained from the factor analysis of subtest scores. This result can be interpreted in terms of each mode of reasoning contributing to a common underlying one-factor structure. A similar interpretation is applicable to the factor analysis of the separate item data. These results provided support for the construct validity of the TOLT. The data suggested that each item affords a measure on one underlying dimension which is defined to be formal reasoning ability.

Although a comparatively small sample was involved in the criterion-related validation of TOLT, a high non-chance relationship was established between TOLT performance and clinical interview performance. A correlation of .80 between the two variables was indicative of similar performance on each variable. The evidence suggests that the same processes that allow subjects to solve problems correctly in the clinical interview are involved in the solution of TOLT items. The data obtained in the investigation of criterion-related validity suggest that TOLT is measuring formal reasoning ability. Additional studies of this type are required using larger samples that encompass the entire range of formal reasoning ability.

The initial purpose of developing TOLT was to construct an instrument that could be used in studies of teaching and learning. On the basis of psychometric properties described in the paper, the use of the TOLT is advocated to obtain measures that differentiate student formal reasoning ability. Data so generated would be useful in relating formal reasoning ability to achievement; in investigating possible interactions of formal reasoning ability with teacher variables; or in statistically controlling for variations in formal reasoning so that the effects of other teacher and student variables can be determined. In these respects the TOLT can be used to obtain continuous or categorical measures of formal reasoning ability. The data presented in Table 7 show that the TOLT has already been employed for the three purposes just specified.

Although the development of the TOLT grew from practical concerns relating to research and diagnostic teaching, the methodology used to develop the TOLT is applicable to further investigations into the nature of formal reasoning. Specific tests for each mode of reasoning could be developed, or existing TOLT items could be modified to enable additional information to be obtained from each item. For example, a set of follow-up questions could be constructed. The principle of using a multiple-choice format for the justification of each answer offers the potential for developing measures of formal reasoning that are valid and reliable. With these instruments based on a sound clinical interview experience, researchers would have efficient tools to study developmental patterns among large numbers of subjects.

Conclusion

Evidence suggests that the TOLT does measure formal thinking. The reliability data are indicative of high internal consistency and the validity data are diverse and supportive of an effective group test of formal thought. The TOLT provides a means of assessing formal reasoning ability as a diagnostic aid for teachers or as data for researchers investigating the nature of learning. The test is suitable for administration on a group basis to students from grade 6 through college.

REFERENCES

Arlin, P. Cognitive development in adulthood: A fifth stage. Developmental Psychology, 1975, 11, 602-606.

Bradley, C. F. The effects of imagery stimulation, visualization skill and cognitive development level on the science achievement of college students. (Doctoral dissertation, University of Georgia, 1980).

Burney, G. M. The construction and validation of an objective formal reasoning instrument. (Doctoral dissertation, University of Northern Colorado, 1974.) Dissertation Abstracts International, 1975, 35, 4535-B. (University Microfilms No. 75-05, 403)

Cantu, L. and Herron, D. Concrete and formal Piagetian stages and science concept attainment. Journal of Research in Science Teaching, 1978, 15, 135-143.

Cattell, R. B. The Scree test for the number of factors. Multivariate Behavioral Research, 1966, 1, 245-276.

Chiappetta, E. A review of Piagetian studies relevant to science instruction at the secondary and college level. Science Education, 1976, 60, 253-261.

Cronbach, L. J. Coefficient alpha and the internal structure of tests. Psychometrika, 1951, 16, 297-334.

Educational Testing Service. College Entrance Examination Board Scholastic Aptitude Test. Princeton, New Jersey, 1948-1980.

Farrell, M. A. The formal stage: A review of the research. Journal of Research and Development in Education, 1969, 3, 111-118.

Goodstein, M. and Howe, A. The use of concrete methods in secondary chemistry instruction. Journal of Research in Science Teaching, 1978, 15, 361-366.

Inhelder, B. and Piaget, J. The growth of logical thinking from childhood to adolescence. New York: Basic, 1958.

Inhelder, B. and Piaget, J. The origin of the idea of chance in children. New York: W. W. Norton, 1975.

Kaiser, H. F. The application of electronic computers to factor analysis. Educational and Psychological Measurement, 1960, 20, 141-151.

Lawson, A. E. The development and validation of a classroom test of formal reasoning. Journal of Research in Science Teaching, 1978, 15, 11-24.

Lawson, A. E., Adi, H. and Karplus, R. Development of correlational reasoning in secondary schools: Do biology courses make a difference? The American Biology Teacher, 1979, 41, 420-425.

Lovell, K. A follow-up study of Inhelder and Piaget's the growth of logical thinking. British Journal of Psychology, 1961, 52, 142-153.

Renner, J. and Grant, R. Can students grasp physics concepts? The Science Teacher, 1978, 45, 30-33.

Staver, J. R. and Gabel, D. L. The development and construct validation of a group-administered test of formal thought. Journal of Research in Science Teaching, 1979, 16, 535-544.

Tisher, R. P. and Dale, L. G. Understanding in science test. Victoria: Australian Council for Educational Research, 1975.

Tobin, K. G. and Capie, W. Relationships between formal reasoning ability, locus of control, academic engagement, and integrated process skill achievement. Journal of Research in Science Teaching, 1981, 18 (in press).

Tobin, K. G., Capie, W. and Bradley, C. F. The relationship of formal reasoning ability and process skill achievement (Tech. Rep. 39), University of Georgia, October 1980.

Yeany, R., Helseth, E. H. and Barstow, W. Interactive instructional video-tapes, scholastic aptitude, cognitive development and locus of control as variables influencing science achievement. A paper presented at the annual meeting of the National Association for Research in Science Teaching, Boston, April, 1980.
