
ITEMS ANALYSIS ON THE SCORE OF THE ENGLISH

SUMMATIVE TEST

(A Descriptive Study of the Tenth Grade Students of SMK N 3 Salatiga in

the Academic Year of 2013/2014)

A GRADUATING PAPER

Submitted to the Board of Examiners as a Partial Fulfillment of the Requirements

for the Degree of Sarjana Pendidikan Bahasa Inggris (S.Pd.I) in the English Department of Education Faculty

SITI MUNADLIROH

NIM. 11310142

ENGLISH DEPARTMENT OF EDUCATION FACULTY

STATE INSTITUTE FOR ISLAMIC STUDIES (STAIN)

SALATIGA


MOTTO

“Intelligence is not the determinant of success, but hard work is the real determinant of

your success.”

~Alexander Graham Bell~

-

“No surrender term. Winners never give up, because people who give up would never win.”

~Ted Turner~

-

“Life is like to reach the bike, drive as fast as possible.”


DEDICATION

This work is sincerely dedicated to:

1. My beloved parents, my father (Sujadi) and my mother (Siti Alfiah) who

always pray for, guide, and motivate me to become a better person.

2. My beloved sisters (Nurul Hikmah and Nana Farida) who fill my life with

love and affection.

3. My beloved uncle and aunt, my uncle (Yaseri) and my aunt (Sujiyem)

who motivate me directly and my big family who fill my life with love,

affection and pleasantness.

4. My closest friends at STAIN Salatiga who always motivate and help me.

Too many memories and impressions together with you and I can’t forget


ACKNOWLEDGEMENT

Bismillahirrahmanirrahim,

In the name of Allah, the Most Gracious and the Most Merciful, the King of the universe

and space. Thanks be to Allah, because the writer could complete this graduating

paper as one of the requirements to finish the study in the English Department of the Education Faculty of the State

Institute for Islamic Studies.

This graduating paper would not have been completed without support,

guidance and help from individuals and institutions. Therefore, I would like to express

special thanks to:

1. Mr. Dr. Rahmat Hariyadi, M.Pd as the Rector of State Institute for Islamic

Studies Salatiga.

2. Mrs. Rr. Dewi Wahyu Mustikasari, M.Pd as the head of the English Department of

the State Institute for Islamic Studies (STAIN) Salatiga and the consultant of this

graduating paper. Thank you for all of your suggestions, recommendations and

support for this graduating paper from the beginning until the end.

3. Mrs. Setia Rini, M.Pd as the consultant who has educated, supported, directed and

given the writer advice, suggestions and recommendations for this graduating paper

from the beginning until the end. Thank you for your patience and care.

4. All lecturers in English Department Faculty of STAIN Salatiga. Thank you for


ABSTRACT

Munadliroh, Siti. 2014. “Items Analysis on the Students’ Score of the English

Summative Test (Descriptive Study of the Tenth Grade Students of SMK N 3 Salatiga in the Academic Year of 2013/2014)”. Graduating Paper. Educational Faculty. English Department. State Institute for Islamic Studies (STAIN). Consultant: Setia Rini, M. Pd

This research aimed to describe for readers an item analysis of the students’ scores on the English summative test. It can serve as input for readers, especially English teachers, the headmaster, and all people who are involved in and responsible for developing good-quality tests. The objective of this research was to measure the difficulty level and discrimination index of the items of the English summative test taken by the tenth grade students at SMK N 3 Salatiga in the academic year of 2013/2014. This research was a descriptive study using a quantitative method, and it applied a purposive sampling technique. The sample consisted of three classes with a total of 102 students. The data were taken from observation and documentation, which were used to obtain school data such as the students’ names and general information. The results were as follows. Based on the difficulty index, 21 questions (42%) met the criteria of a moderate question, 27 questions (54%) met the criteria of an easy question, and 2 questions (4%) met the criteria of a hard question. Based on the discrimination index, 24 questions (48%) were in the good category and 17 questions (34%) were in the satisfactory category. In contrast, the writer found 9 questions (18%) in the poor category. They were rejected due to either the difficulty level or the discrimination index. It can be concluded that 9 questions must be removed or revised to become good questions.


LIST OF TABLES AND FIGURES

Table 3.1 List of X-O2 Students’ Scores of SMK N 3 Salatiga

Table 3.2 List of X-W1 Students’ Scores of SMK N 3 Salatiga

Table 3.3 List of X-TKR1 Students’ Scores of SMK N 3 Salatiga

Table 3.4 Classification of the Difficulty Index

Table 3.5 Classification of the Discrimination Index

Table 4.1 List of the Difficulty Index on the Test Items

Table 4.2 Classification of Discrimination Index

Table 4.3 List of Upper and Lower Group based on the Scores

Table 4.4 List of the Discrimination Index on the Test Items

Table 4.5 List of the Difficulty Index on the Test Items

Table 4.6 List of the Discrimination Index on the Test Items

Figure 4.1 Item Frequency for each Difficulty Index Range

Figure 4.2 Item Frequency for each Discrimination Index


CHAPTER I

INTRODUCTION

A. Background of Study

English is a tool of communication for getting information, and it is

taught in formal education as an academic subject. In the global era,

English is increasingly needed because it is one of the international languages

most widely used in the world.

English as an international language has an important role in every

sphere of activity as a means of communication, both written and

spoken, so English has become the first foreign language that should be

taught to students at every level of education in Indonesia. English is

taught as a compulsory subject in elementary, junior and senior high schools,

and as a complementary subject in university.

The purpose of teaching English in Indonesia is to develop

communication skills, especially oral and written skills (listening, speaking,

reading and writing). To reach the purpose of the instructional activities, the

teachers apply evaluation to measure how far the students understand

the material.

In education, goals are identified on the basis of students’ and society’s

needs. Based on these needs, educational programs are established so that

students’ behaviour is needed as to judge the success of the students in

reaching the goals. Evaluation must be done because education is not

automatically successful. The core of evaluation is then to evaluate the

success of students which is periodically gathered in terms of the objectives.

One of the most important aspects of the teaching-learning process is

evaluation. It contributes directly to the teaching and learning process used in

classroom instruction. According to Sudijono (1996: 13), the main focus of

classroom evaluation is the students and their learning process. To measure

the students’ competence in the learning process, the teachers need to hold an

evaluation. Evaluation plays an important role in teaching-learning activities.

It is an integral part of the instructional program.

The measurement of educational achievement is essential to effective

formal education. Formal education is a complex process, requiring a great

deal of time and money and cooperative efforts of many people. Effort must

be directed toward the attainment of specific goals, because education is not

automatically successful. Teachers, students, parents and school officials need

to know periodically how successful their efforts have been, so that they can

decide which practices to continue and which to change (Gronlund, 1982: 9).

Teachers are those who know the characteristics of their classes. Thus,

they are in the best position to construct a test to measure their students’

achievement, although it is not an easy job. Some teachers make tests carelessly.

teaching-learning process. A high-quality test can give information about how well the

students have comprehended the material that has been taught by the

teacher. Thus, the teaching-learning process will be more effective, without any

overlapping.

One form of evaluating the students’ ability is a test.

This test could be a teacher-made test or a standardized

test. For a teacher-made test, the teachers who make the test should know and

master the principles and the steps that must be followed in making the test. With

this knowledge the teachers will get a clear picture of the general

systematic framework of evaluation.

There are numerous types of test: placement tests,

achievement tests, proficiency tests and aptitude tests. The test which is usually

used by teachers to know how far students have mastered the lessons is the

achievement test. The achievement test is intended to establish how successful

individual students, groups of students, or the courses themselves have been in

achieving the objectives of language courses. There are two kinds of

achievement test: progress achievement tests and final achievement tests.

Progress achievement tests are those intended to measure the progress that

students are making, and final achievement tests or summative tests are

intended to measure the students’ achievement at the end of a course of study


In order to measure accurately, the teachers should use a good test. It

is not easy work for them to make one because there are some characteristics

or requirements that must be fulfilled. The characteristics of a good test

include validity, reliability, objectivity and practicality (Sudijono, 1996: 93).

Validity is the most important consideration in test evaluation. The

concept refers to the appropriateness, meaning and usefulness of the specific

inferences made from the score. Test validation is the process of accumulating

evidence to support such inference. The former types of validity are content,

criterion related, and construct (Tinambunan, 1988: 11).

Most teachers apply tests in multiple-choice form at the end of the

teaching-learning program. According to Tinambunan (1988: 9),

the summative test is intended to show the standard which the students have

now reached in relation to other students at the same stage.

Item analysis is an important and necessary step in the preparation of a

good multiple-choice test. Because of this fact, it is suggested that every

classroom teacher who uses multiple-choice test data should know something

of item analysis: how it is done and what it means. Item analysis provides two

kinds of information on items: item difficulty and item

discrimination (Oller, 1979: 254).

In SMK N 3 Salatiga, the English summative test is regarded as one of the

most important aspects that can be used as a tool of evaluation to measure

not. Since the English summative test is the main indicator of student

ability in English, it is very important to analyze this test. If the test

is not valid, we can say that the test cannot be used as a tool of measurement.

The writer focuses the research on the tenth grade students in this

school and on the observation of the English summative test. As we

know, the tenth grade is the beginning class of senior high school, where the

teacher can get general and valid information about the students’ ability.

Knowing valid information about the students’ ability might help the

teacher to find suitable steps in treating students in class.

For the reasons above, the writer concludes that this research is very

important because the English summative test in SMK N 3

Salatiga is intended to measure whether the target of

learning has been achieved or not. The teachers in this school also use the

result of the English summative test in the tenth grade as the reference point

in treating the students at the next level. If the test

cannot give valid information for these purposes, the test itself is not valid.

The explanation above inspired the writer to

conduct research on how to carry out an item analysis of the

summative test scores. The research is entitled “ITEMS ANALYSIS ON

THE SCORE OF THE ENGLISH SUMMATIVE TEST (A Descriptive

Study of the Tenth Grade Students of SMK N 3 Salatiga in the Academic

Year of 2013/2014)”.

B. Problem Statements

Based on the background described above, this research aims to

answer the following question: what are the difficulty level and the discrimination index

of the items of the English summative test taken by the tenth grade students at SMK

N 3 Salatiga in the academic year of 2013/2014?

C. Objectives of the Study

The general purpose of the study is to describe the item

analysis of the students’ scores on the English summative test of the tenth

grade students at SMK N 3 Salatiga. The specific objective of this study is

to measure the difficulty level and discrimination index of the

items of the English summative test taken by the tenth grade students at SMK N

3 Salatiga in the academic year of 2013/2014.

D. Benefit of the Study

The result of this study is expected to give readers a description

of the item analysis of the summative test scores. It can be used as

input for the readers, especially the English teachers, the headmaster,

and all people who are involved in and responsible for developing good-quality

tests. In other words, it is useful for all people to know the characteristics of a

E. Scope and Limitation of the Study

The discussion of the study will be focused on items analysis on the

students’ score of the English summative tests of the tenth grade students of

SMK N 3 Salatiga. According to Anthony (1983: 284), item analysis refers to

the process of collecting, summarizing, and using information about

individual test items especially information about pupil’s response to items.

According to Widdowson (2000: 60), item analysis usually provides two

kinds of information on items:

1. Item facility (Item difficulty), which helps us decide if the test items are at

the right level for the target group.

2. Item discrimination, which allows us to see if the individual items are

providing information on candidates’ abilities consistent with that

provided by the other items on the test.
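The two kinds of information listed above can be illustrated with a short computation. The following is only a minimal sketch under stated assumptions: answers are assumed to be coded 1 for correct and 0 for wrong, item facility is taken as the proportion of correct answers, and item discrimination is taken as the difference in correct answers between an upper and a lower scoring group divided by the group size. The function names, the sample data, and the default group fraction of 27% are hypothetical choices for illustration and are not taken from this study.

```python
from typing import List


def item_facility(responses: List[List[int]], item: int) -> float:
    """Proportion of test takers who answered the item correctly (1 = correct, 0 = wrong)."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses: List[List[int]], item: int,
                        fraction: float = 0.27) -> float:
    """Upper-lower discrimination index for one item.

    Test takers are ranked by total score; the top and bottom `fraction`
    form the upper and lower groups (the 27% default is an assumption).
    """
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))  # size of each group
    upper, lower = ranked[:n], ranked[-n:]
    correct_upper = sum(row[item] for row in upper)
    correct_lower = sum(row[item] for row in lower)
    return (correct_upper - correct_lower) / n


# Hypothetical data: four test takers (rows) and three items (columns).
sample = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

for i in range(3):
    print(i, item_facility(sample, i), item_discrimination(sample, i, fraction=0.5))
```

A facility value close to 1 indicates an item that most test takers answered correctly, while a discrimination value close to 0 or below indicates an item that does not separate stronger from weaker test takers.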

F. Definition of Key Terms

1. Validity

Validity is the most important consideration in test evaluation. The

concept refers to the appropriateness, meaning and usefulness of the

specific inferences made from the score. Test validation is the process of

accumulating evidence to support such inference. The former types of

validity (content, criterion related, and construct) are simply considered to


of an interpretation but the primary concern for classroom achievement

testing is content validity (Tinambunan, 1988: 11).

Validity is a standard or criterion that shows whether the instrument

is valid or not. A test is valid to the extent that it measures what it claims to

measure (Sukardi, 1987: 173).

2. Items Analysis

According to Anthony (1983: 284), item analysis refers to the

process of collecting, summarizing, and using information about individual

test items especially information about pupil’s response to items.

3. Test

A test is a particular type of assessment that typically consists of a set of

questions administered during a fixed period of time under reasonably

comparable conditions for all students (Linn & Gronlund, 1995: 5).

According to Arikunto (2010: 226), a test is used to measure the object

being analyzed. It is used to measure basic competence and

achievement. There are two types of achievement test used in school:

a. Teacher-made tests, which are arranged by a certain procedure but have

not been tried out many times, so their characteristics and strengths are

not known.

b. Standardized test; a test that usually has been available in a test


4. Summative Test

A summative test is a final test which is administered after a

teaching-learning program is completed (Sudijono, 1996: 72).

The summative test is intended to show the standard which the

students have now reached in relation to other students at the same stage

(Tinambunan, 1988: 9). The conditions for setting a summative test are that

it covers a much wider range of material than a diagnostic test and relates to

long-term rather than short-term objectives. This brings up problems of

sampling, since what has been learnt, for example in a year, cannot be

assessed in one day, yet the test must reflect the content of the whole

course. The test must be able to determine the extent to which the

instructional objectives have been achieved by the pupils, and it is used primarily

for assigning course grades or certifying pupils’ mastery of the intended

learning outcomes.

G. Review of Previous Research

In this graduating paper, the writer reviews some other

studies as a comparison for this research. The first is a journal article by Hanik

public elementary schools in Udanawu District, Blitar Regency. They

analyzed the quality of English summative test in terms of the test

construction, content validity, reliability, level of difficulty, level of

discrimination, and the effectiveness of distraction. It was compiled in

descriptive evaluation research.

The second is “Items Analysis of English Formative Test made by English Teacher (A Study of Eleventh Grade Students at SMA N 1 Angkek)”. It was written by Sari S. Octavia, a student of STKIP PGRI Sumatra Barat, in the

academic year of 2012. She examined the scores of the daily tests

achieved by eleventh grade students of SMA N 1 Angkek. Then she examined

the results of the daily tests given by the English teachers.

H. Research Organization

The writer arranges the graduating paper so that the reader

can follow the content easily. It is divided into five chapters.

Chapter I is Introduction. It consists of background of study, problem

statements, objectives of the study, benefit of the study, limitation of the

study, definition of key terms, and review of previous research.

Chapter II is Theoretical Framework. This chapter is divided into three

sub chapters. The first sub chapter discusses language tests and

describes the definition of test, the kinds of test and the characteristics of

includes content validity, face validity, construct validity and empirical

validity. The last discusses item validity.

Chapter III explains the research methods, which consist of the setting

of the research, subject of the research, method of the research, procedure of

the research, technique of collecting data and technique of data analysis.

Chapter IV is Findings and Data Analysis. It consists of description of

data, analysis of data, interpretation of data, finding, discussion, and result.

Chapter V is Closure. The writer states summary of the study includes

CHAPTER II

THEORETICAL FRAMEWORK

A. Language Test

Language testing is the practice and study of evaluating the

proficiency of an individual in using a particular language effectively (Brown,

2003: 42). The purpose of a language test is to determine a person’s knowledge

and ability in the language and to discriminate that person’s ability from

that of others. Such ability may be of different kinds: achievement,

proficiency or aptitude. Tests, unlike scales, consist of specified tasks through

which language abilities are elicited. The term language test is used somewhat

more widely to include, for example, classroom testing for learning and

institutional examinations.

There are many ways to evaluate the learning process.

One of them is the test. Generally, a test serves to motivate the learner and to

give unity to portions of the material studied at different times. It

can be a device to demonstrate skills and abilities in learning.

From explanation above, the writer tries to develop the specific


B. Test

1. Definition of Test

There are several definitions of test. A test is a particular type of

assessment that typically consists of a set of questions administered during a

fixed period of time under reasonably comparable conditions for all

students (Linn & Gronlund, 1995: 5). According to Tinambunan (1988:

3), a test is a set of questions, each of which has a correct answer, that

examinees usually answer orally or in writing. Furthermore, according to

Brown (2003: 3), a test is a method of measuring a person’s ability,

knowledge, or performance in a given domain. Additionally, according to

Griffin and Nix, tests are settings for structured observations and are

expected to provide an efficient source of many types of assessment

information. They also said that a test is a formal, systematic procedure

used to gather information about students’ achievement or other cognitive

skills (Griffin and Nix, 1989: 5-6).

In order to know how well the learning process has gone, the teacher

should evaluate it. By evaluating, teachers can collect information on

whether the teaching and learning activity has succeeded or not.

Gronlund said that “tests are used as a means to motivate students

to learn or review specific material” (Gronlund, 1982: 6). It means that

test is one motivation of students to learn or review material in their

Furthermore, Fernandes states that a test is a systematic procedure

for surveying a person’s behavior and explaining it with the aid of a

numeric scale or a category system (Fernandes, 1984: 1).

In addition, according to Arikunto (2012: 67), a test is an instrument or

procedure which is used to find out or measure something in a given situation,

following determined methods and rules.

Based on the definitions above, the writer concludes that a test

is a particular type of assessment used to reinforce learning and to motivate

the students by giving a task or a set of tasks. Through the test, teachers

not only measure the students’ ability and motivate the students but also improve the

lessons in the teaching-learning process. In order to make a proper decision, the

teacher needs accurate data, so a good instrument is needed.

2. The Kinds of Test

There are many types of test used to measure students’

achievement. The writer discusses the kinds of test based on two

experts’ opinions. First, according to Tinambunan, there are four types of

1) Placement test

A placement test is designed to determine pupil

performance at the beginning of instruction.

2) Formative test

Formative test is intended to monitor learning progress

during the instruction and to provide continuous feedback to both

pupil and teacher concerning learning successes and failures. It is

used at the end of a unit in the course book or after a lesson

designed. The result of this test will give the students immediate

feedback.

3) Diagnostic test

Diagnostic test is intended to diagnose learning difficulties

during instruction. The main aim of diagnostic test is to determine

the causes of learning difficulties and then to formulate a plan for

remedial action.

4) Summative test

According to Sudijono, a summative test is a final test which is

administered after a teaching-learning program is completed (Sudijono,

1998: 7-9).

In addition, according to Tinambunan (1988: 9), a summative

test is intended to show the standard which the students have now

Second, according to Brown, tests are divided into three

categories: achievement tests, aptitude tests, and proficiency

tests. Here, the writer explains more about these kinds of test.

1) Achievement test

An achievement test is designed to measure a variety of

learning outcomes, such as knowledge of specific facts and the ability to

apply facts and principles (Tinambunan, 1988: 28). A classroom

test is made by a teacher for his/her students and may or may not

be used again.

According to Gronlund, an achievement test is a systematic

procedure for determining the amount a student has learned.

Although the emphasis is on measuring learning outcomes, it

should not be implied that testing is to be done only at the end of

instruction (Gronlund, 1982: 1).

Meanwhile, in Sudijono’s opinion (1996: 73), an achievement test is a

test which is used to reveal the level of attainment or learning

achievement. It is usually a formal examination given at the end of

the school year or at the end of the course. The achievement test

may be written and administered by ministries of education,

According to Hughes, achievement tests are directly related

to language courses, their purpose being to establish how successful

individual students, groups of students, or the courses themselves

have been in achieving objectives. There are two kinds of achievement test: final

achievement tests and progress achievement tests.

a) Final achievement tests are those administered at the end of a

course of study.

b) Progress achievement tests, as their name suggests, are

intended to measure the progress that students are making

(Hughes, 2003: 13).

Furthermore, Brown (2003: 47) said that an achievement

test is related directly to classroom lessons, units, or even a total

curriculum. Achievement tests are limited to particular material

covered in a curriculum within a particular time frame, and are

offered after a course has covered the objectives in question.

Achievement tests are often summative because they are

administered at the end of a unit or term of study.

In another opinion, an achievement test is

designed to indicate the degree of students’ success in some past

learning activities (Tinambunan, 1998: 9). This purpose of

aptitude test, where the aptitude test is designed to predict success

in some future learning activities.

In order to have a good achievement test, a test maker

should ensure that the achievement test is constructed well by

paying attention to the following basic principles (Gronlund,

1988: 303):

a) Achievement tests should measure clearly defined learning

outcomes that are in harmony with the instructional objectives.

b) Achievement tests should measure an adequate sample of the

learning outcomes and subject matter content included in the

instruction.

c) Achievement tests should include the test items which are

most appropriate for measuring the desired learning outcomes.

d) Achievement tests should be designed to fit the particular uses

to be made of the results.

e) Achievement tests should be made as reliable as possible and

should then be interpreted with caution.

f) Achievement tests should be used to improve student learning.

Basing the content of tests on the course objectives gives a

number of advantages. First, it compels course designers to be

explicit about objectives. Second, it makes it possible for

achieved the instructional objectives. Consequently, the course

designer or teacher should construct a syllabus based on the

instructional objectives and should select books and materials

which are consistent with the course objectives.

Based on the explanation above, the writer concludes that

achievement tests should support and reinforce other aspects of the

instructional process. They can aid both the teacher and the

student in assessing learning readiness.

2) Aptitude test

The second type of test is the aptitude test. Aptitude tests

are designed to predict, before beginning language study, a

subject’s capability of acquiring the language (Merry and Sydney,

1993: 7). From the term “predict”, it can be recognized that

these tests give some clues as to whether, how well and how

quickly a person is likely to succeed in learning.

According to Sudijono (1996: 73), an aptitude test is a test

which aims to reveal a basic competence or special

aptitude that students have.

Besides, Brown states that a language aptitude test is designed

to measure a person’s capacity or general ability to learn a foreign

are considered to be independent of a particular language (Brown,

2001: 391).

Fundamentally, aptitude tests differ in

nature from achievement tests, which have been discussed

previously. Aptitude tests are primarily designed to predict success

in some future learning activities, whereas achievement tests are

designed to indicate the degree of success in some past learning

activity (Tinambunan, 1998: 7). From the comparison above, it can

be understood that the distinction between these two

tests is made in terms of the use of the results rather than the

qualities of the tests themselves.

3) Proficiency test

The third type of test is the proficiency test. This test is used to

know the proficiency of test-takers. It is hoped that after taking this test

the test-takers will know their ability in a language,

especially in English.

According to Hughes (2003: 11), proficiency tests are

designed to measure people’s ability in a language. The content of a

proficiency test is based on a specification of what candidates have to be able to do in

the language in order to be considered proficient.

Meanwhile, Harmer (2001: 321) said that proficiency tests

than measures progress). They are frequently used as stages people

have to reach if they want to be admitted to a foreign university,

get a job, or obtain some kind of certificate. Proficiency tests have

a profound backwash effect since, where they are external exams,

students obviously want to pass them, and teachers’ reputations

sometimes depend (probably unfairly) upon how many of them

succeed.

In the writer’s experience as a learner,

this test usually consists of standardized multiple-choice items

on structure, reading comprehension, listening comprehension, and

sometimes writing.

Based on the explanations of the kinds of test above, the writer

concludes that in general a test is a systematic and objective procedure to

find out the knowledge and ability that someone has

learned.

C. Summative Test

According to Brown, a summative test is clearly related to summative

assessment. Summative assessment aims to measure or summarize what a

student has grasped and typically occurs at the end of a course or unit of

and taking stock of how well that student has accomplished objectives. But it

does not necessarily point the way to future progress. Final exams in a course

and general exams are examples of summative assessment (Brown, 2003: 6).

In this part, the writer discusses the summative test further, as follows:

1. Definitions of Summative Test

According to Sudijono, a summative test is a final test which is

administered after a teaching-learning program is completed (Sudijono, 1998:

7-9).

In addition, according to Tinambunan, a summative test is intended to

show the standard which the students have now reached in relation to

other students at the same stage (Tinambunan, 1988: 9).

2. Purpose of Summative Test

The purpose of a summative test is to establish learning success;

its result is used to fill in the students’ grade report and to decide

class promotion. It is therefore not used to improve the teaching-learning

process, because all the materials have already been delivered. If a student

fails, he or she is considered not to have passed the lesson covered in the

learning (Sutomo, 1985: 20).

3. Advantages of Summative Test

According to Arikunto (2008: 39), a summative test has several important advantages:


b. To find out the students’ ability to follow the next teaching program.

c. To complete the learning progress notes, which can be useful to the students’ parents, consultants, and mentors in the school.

The main purpose of a summative test is to determine the score that represents a student’s success after going through the learning process over a certain period, so that the teacher can determine the student’s position in the class. It relates to the students’ condition in following a teaching program (Silverus, 1991: 10).

4. Assessment Aspects of Summative Test

According to Sutomo (1985: 20), the aspects assessed in summative assessment are all aspects of ability that result from learning during the teaching program. They are knowledge (cognitive), skill (psychomotor), and behavior (affective).

Based on the statement above, the writer concludes that the conditions for setting a summative test are that it covers a much wider range of material than a diagnostic test and relates to long-term rather than short-term objectives. This brings up problems of sampling, since what has been learned, for example in a year, cannot be assessed in one day, yet the test must reflect the content of the whole course, and the test must be


achieved by the pupils and is used primarily for assigning course grades or certifying pupils’ mastery of the intended learning outcomes.

D. The Characteristics of a Good Test

A test that is a good measuring instrument must meet certain requirements, namely validity, reliability, and usability (Arikunto, 2009: 58). The first characteristic of a good test is validity. Validity refers to the adequacy and appropriateness of the interpretations made from test results, with regard to a particular use. Data can be said to be valid when they accord with actual circumstances. The second characteristic of a good test is reliability. A test should be reliable as a measuring instrument. Reliability is the consistency of assessment results (Linn and Gronlund, 1995: 48).

If teachers obtain quite similar scores when the same test procedure is used with the same students on two different occasions, they can conclude that their results have a high degree of reliability from one occasion to another. Similarly, if different teachers independently rate student performances on the same test task and obtain similar ratings, they can conclude that the test is reliable. If a test gives results that remain stable when it is administered to students many times, it can also be concluded that the test can be


The third characteristic of a good test is usability in the preparation of a new test. The term usability, then, refers only to the practicality of the procedure and says nothing about the other qualities present (Linn and Gronlund, 1995: 49). The teacher must keep in mind a number of very practical considerations, which involve economy, ease of administration, scoring, and interpretation of results. The teacher must also consider how long the administration and scoring of the test will take, choosing a shorter test rather than a longer one.

In the writer’s opinion, the practicality of a test is important so that the test materials can be administered well. It must be determined in terms of the materials, time, and effort that the test requires.

1. Validity

Based on the previous explanation, the writer notes that one characteristic of a good test is validity. Test validity is the most critical factor to be judged in foreign language testing as a whole. Validity is the extent to which a test measures what it is intended to measure (William, 1990: 183). This means that validity refers to the extent to which the results of an evaluation procedure serve the particular uses for which they are intended. For example, if a test is designed to measure oral comprehension, it should not attempt to measure another skill such as reading comprehension. If a test is intended to measure a person’s ability to speak the language, it is valid only if speaking skills and not writing ability are the specific


Traditionally, validity has been defined as “the degree to which a

test measures what it claims or purports to be measuring”. According to

Gronlund, the meaning of validity has typically been defined for the

testing profession by a set of standards. In the most recent edition of the

standards, validity has been described as follows: “validity is the most

important consideration in test evaluation. The concept refers to the

appropriateness, meaningfulness, and usefulness of the specific inferences

made from the scores. Test validation is the process of accumulating

evidence to support such inferences. A variety of inferences may be made

from scores produced by a given test, and there are many ways of

accumulating evidence to support any particular inference. Validity,

however, is a unitary concept. Although evidence may be accumulated in

many ways, validity always refers to the degree to which that evidence

supports the inferences that are made from the scores. The inferences

regarding specific uses of a test are validated, not the test itself (Gronlund,

1993: 159)”.

On the other hand, Tinambunan states that validity refers to the extent to which the results of an evaluation procedure serve the particular uses for which they are intended. Thus, the validity of a test is the extent to which the test measures what it is intended to measure (Tinambunan, 1988: 11).

According to Gronlund, validity refers to the appropriateness of


testing, can be clarified further by noting the following general points: validity refers to the interpretation of test results (not to the test itself); validity is inferred from available evidence (not measured); validity is specific to a particular use (selection, placement, evaluation of learning, and so forth); and validity is expressed by degree, for example high, moderate, or low (Gronlund, 1982: 126).

In everyday language, we say that something is valid if it is sound and meaningful, or well grounded on principles or evidence. For example, we speak of a valid theory, a valid argument, or a valid reason. Validation is the process of gathering and evaluating validity evidence. Both the test developer and the test user may play a role in the validation of a test for a specific purpose (Ronald and Mark, 1988: 175).

Another opinion comes from Fernandes, who states that an important characteristic of a test is its validity. Validity can be viewed as the accuracy of the specified inferences made from scores (Fernandes, 1986: 6).

From the definitions above, the writer concludes that there are no differences in the essence of validity; the definitions differ only in terminology, such as extent, degree, and worth, while all of them concern whether a test measures what it is intended to measure.

There are three types of validity, namely content validity, construct


a. Content Validity

The principal validity for achievement tests is content validity, sometimes called content relevance. Content validity concerns the content of the test. Febru and Erna state, “Content validity is concerned with the extent to which the test is representative of a defined body of content consisting of topics and processes” (Febru and Erna, 2011: 167). Therefore, the test should reflect the instructional objectives or subject matter. It is not expected, however, that every piece of knowledge or skill will always appear in the test; there may simply be too many things for all of them to appear in a single test.

According to Hughes, a test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. (Hughes, 2003: 11).

The content validity is concerned with how the test measures

the subject matter and behavior under consideration. The test items

must be a representative sample of the domain of possible content or

behavior. Content validity is the most appropriate method for


b. Construct Validity

In construct validity, the difficulties that the students have with the test must be measured in order to judge whether the test is qualified. Terminologically, according to Anas Sudijono, a learning achievement test can be said to have construct validity if, viewed in terms of its composition, framework, or design, it accurately reflects a construct in psychological theory (Sudijono, 1996: 166).

According to Gronlund, construct validity is applicable to both norm-referenced and criterion-referenced tests, although evidence in the latter case would, of necessity, be less dependent on statistical measures requiring score variability (Gronlund, 1982: 131).

Bachman and Palmer state that construct validity is the on-going process of demonstrating that a particular interpretation of test scores is justified. It involves, essentially, building a logical case in support of a particular interpretation and providing evidence justifying that interpretation (Bachman and Palmer, 1984: 520).

Besides that, Hughes and Porter state that construct validity focuses attention on the desirability of basing test construction on an explicitly recognized theoretical foundation. A possible danger in the application of construct validity is that it may open the way for subjective, unverified assertions about test validity (Hughes and


c. Empirical Validity

Empirical validity is accuracy of measurement that is based on the results of empirical analysis (Sudijono, 1996: 167).

According to Charles, empirical validity depends on empirical

and statistical evidence as to whether students’ marks on the test are

similar to their marks on other appropriate measures of their ability,

such as their scores on other tests, their self-assessments, or their teachers’ ratings of their ability (Anderson, 1995: 171).

Whether a test has empirical validity can be examined in two ways: the first is concurrent validity and the second is predictive validity. Concurrent validity applies if data on the two measures (test and criterion) are collected at or about the same time. Predictive validity applies if there is an intervening period (e.g., three or six months) between the time of testing and the collection of data on the criterion. Operationally, the time of criterion data collection is the distinction between the two types of criterion validity. Specifically, the question of concurrent validity is whether or not the test scores estimate a specified present performance; that of predictive validity is whether or not the test scores predict a specified future performance


In the writer’s opinion, the validity of a test is important for knowing whether or not the test is of good quality in testing someone’s capability.

2. Reliability

A test should be reliable as a measuring instrument. A test cannot measure anything well unless it measures consistently. According to Anderson (1995: 187), a test cannot be valid unless it is reliable. If the test is administered to the same students on different occasions and there is no difference in the results, it can be said that the test is reliable.
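The idea of consistent results across two administrations can be illustrated numerically. The following is a minimal sketch and not part of the procedure used in this study: it assumes that agreement between two administrations of the same test is summarized with a Pearson correlation coefficient, which is a common choice but is not stated in the sources cited above. The function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    sxx = sum((a - mean_first) ** 2 for a in first)
    syy = sum((b - mean_second) ** 2 for b in second)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both administrations")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of the same four students on two occasions.
first_occasion = [60, 70, 80, 90]
second_occasion = [62, 69, 83, 88]
print(round(pearson_r(first_occasion, second_occasion), 2))
```

A coefficient close to 1 would suggest that the students kept roughly the same relative standing on both occasions, which is the sense of reliability described above.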

3. Practicality

The third characteristic of a good test is practicality, or usability, in the preparation of a new test. The teacher must keep in mind a number of very practical considerations, which involve economy, ease of administration, scoring, and interpretation of results. Economy means that the test is not costly. The teachers must take into account the cost per copy and how many scorers will be needed (the more personnel who must be involved in giving and scoring a test, the more costly the process becomes). They must also consider how long the administration and scoring of the test will take, choosing a shorter test rather than a longer one. Ease of administration and scoring means that the test administrator can perform his task quickly and efficiently. We must also consider the ease with which the test can be


According to Heaton (1988: 161), the final point concerns the presentation of the test paper itself. Where possible, it should be printed or typewritten and appear neat, tidy, and aesthetically pleasing. Nothing is worse and more disconcerting to the testees than an untidy test paper, full of misspellings, omissions, and corrections. If this happens, it will be difficult for the students or testees to interpret the test items.

Besides meeting these general criteria, another characteristic of a test that is more important and specific is the quality of its items. To find out the quality of the test items, teachers should use a method called item analysis.

E. Item Analysis

There are several definitions of item analysis. According to Anthony (1983: 284), item analysis refers to the process of collecting, summarizing, and using information about individual test items, especially information about pupils’ responses to items.

Item analysis is an important and necessary step in the preparation of a good multiple-choice test. Because of this, it is suggested that every classroom teacher who uses multiple-choice test data should know something of item analysis: how it is done and what it means (Oller, 1979: 254).

For teacher-made tests, the following are the important uses of


back to students about their performance and as a basis for class discussion, feedback about pupil difficulties, areas for curriculum improvement, revising the items, and improving item-writing skills.

According to Widdowson (2000: 60), item analysis usually provides two kinds of information on items:

1. Item Difficulty (Item Facility)

Item facility helps us decide whether the test items are at the right level for the target group. Item facility expresses the proportion of the people taking the test who got a given item right. According to Arikunto (1995: 211), item facility refers to item difficulty. Item difficulty is sometimes used to express similar information, in this case the proportion that got an item wrong. Where the purpose of the test is to make distinctions between candidates, to spread them out in terms of their performance on the test, the items should be neither too easy nor too difficult. A good test consists of items that are neither too easy nor too difficult. If the items are too easy, then people with differing levels of ability or knowledge will all get them right, and the differences in ability or knowledge will not be revealed by the items. Similarly, if the items are too hard, then able and less able candidates alike will get them wrong and the items will not help us in distinguishing


2. Item Discrimination

According to Arikunto (1995: 215), analysis of item discrimination addresses a different target: consistency of performance by candidates across items. The usual method for calculating item discrimination involves comparing performance on each item by different groups of test takers: those who have done relatively well and those who have done relatively poorly. For example, as items get harder, we would expect those who do best on the test overall to be the ones who in the main get them right. Poor item discrimination indices are a signal that an item deserves revision.
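The comparison of groups described above starts by dividing the test takers according to their total scores. The following is a minimal sketch under stated assumptions: the function name `split_upper_lower` and the student labels are hypothetical, and the groups are taken as the top and bottom halves of the ranked list, with the middle student left out when the number of test takers is odd. The source does not say how large each group should be.

```python
def split_upper_lower(total_scores: dict[str, int]) -> tuple[list[str], list[str]]:
    """Rank test takers by total score and return (upper group, lower group)."""
    # Highest total score first; ties keep their original order.
    ranked = sorted(total_scores, key=total_scores.get, reverse=True)
    half = len(ranked) // 2  # assumption: each group is half of the test takers
    upper = ranked[:half]
    lower = ranked[len(ranked) - half:]
    return upper, lower


# Hypothetical total scores on a 50-item test.
scores = {"S1": 45, "S2": 30, "S3": 25, "S4": 20, "S5": 10}
upper_group, lower_group = split_upper_lower(scores)
print(upper_group, lower_group)
```

Once the two groups are formed, the number of correct answers to each item can be counted separately for the upper and the lower group.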

If there are a lot of items with problems of discrimination, the information coming out of the test is confusing, as it means that some items suggest that certain candidates are relatively better, while other items suggest that other individuals are better; no clear picture of the candidates’ abilities emerges from the test. (The scores, in other words, are misleading and not reliable indicators of the underlying abilities of the candidates.) Such a test will need considerable revision (Arikunto, 1995: 216).

F. English Curriculum

In this part, the writer explains the English Curriculum 2013. Based on the training module for the implementation of Curriculum 2013 (2014: 2), the


1. Concept of Curriculum 2013

The curriculum is one of the elements that contribute to developing the students’ potential. It is developed based on the competences that are needed as an instrument. It aims to direct the students to become:

a. Quality human beings who are capable and proactive toward God; human beings of good character who are skillful, creative, and powerful.

b. Citizens who are democratic and responsible.

2. Rationale of Curriculum 2013 Development

The development of Curriculum 2013 is a further step from the competence-based curriculum that was pioneered in 2004. It is also a further step from the KTSP (2006) curriculum, which includes cognitive competence,


CHAPTER III

METHODOLOGY

I. Setting of the Research

This research was conducted at SMK N 3 Salatiga, which is located at Jl. Ja’far Shodiq Rt. 01 Rw. 03, Salatiga 50744, Phone/Fax (0298) 7103119. The subjects of this research were the tenth grade students of SMK N 3 Salatiga in the academic year of 2013/2014.

The existence of SMK N 3 Salatiga had long been expected by the community, especially in Salatiga. It was meant to address the need for diverse and quality education. On May 21st, 2007, this was formalized in the decree on competence program providers No. 420.5/1510 of the Head of the Salatiga. The competence programs opened were Mechatronic Technique, Welding Technique, Ototronic Technique, and Agribusiness & Horticulture.

In the face of increasingly fierce competition with public schools, the management of SMK N 3 Salatiga must create educational programs with the aim of improving services to the stakeholders.

SMK N 3 Salatiga is committed to education and training as the fulfillment of the needs of the labor market by establishing a human resource


J. Subject of the Research

In this research, the writer chose SMK N 3 Salatiga as the object of the study, especially the tenth grade students. The tenth grade consists of twelve classes, but the writer took three classes: X-W1, X-O2, and X-TKR1. The numbers of participants are 33 students in X-W1, 37 students in X-O2, and 32 students in X-TKR1; all of them are boys. Their native language is Bahasa Indonesia. The average age of the participants was 16 years old. They have at least one English meeting a week, in which one lesson hour lasts 45 minutes.

a. Population

According to Arikunto (2010: 173), “population is all respondents of the research subject”. The population of this research was all of the tenth grade students of SMK N 3 Salatiga in the academic year of 2013/2014.

b. Sample and Sampling Technique

A sample is a representative part of the population that is observed (Arikunto, 2010: 174). From the total population of tenth grade students, the writer took the X-W1, X-O2, and X-TKR1 classes as the sample of this research. It consists of 102 students.

In this research, the writer used purposive sampling. According to Ary, Jacobs and Sorensen (2006: 156), purposive sampling also referred to


representative, are chosen from the population. The assumption is that

errors of judgment in the selection will counterbalance one another.

The writer used this sampling technique because there was a specific purpose in choosing these classes as the sample: the classes fall into categories based on the students’ ability in English. They are X-O2 (the excellent class), X-W1 (the moderate class), and X-TKR1 (the low class).

The English scores of the X-O2, X-W1, and X-TKR1 students, which are used as the source of the research data, are presented as follows:


List of X-TKR1 Students’ Scores of SMK N 3 Salatiga in


According to Stephen and Michael (1982: 46), a descriptive study is used in the literal sense of describing situations or events. It is the accumulation of a data base that is solely descriptive. The purpose of this approach is to describe systematically the facts and characteristics of a given population or area of interest, factually and accurately.

In this research, the writer describes the item analysis on the score of the English summative test of the tenth grade students of SMK N 3 Salatiga in the academic year of 2013/2014.

The quantitative method was applied in this study. According to

Lodico (2006: 13), quantitative methods are those which focus on numbers

and frequencies rather than on meaning and experience. Quantitative methods

(e.g. experiments, questionnaires and psychometric test) provide information

which is easy to analyze statistically and fairly reliable. In quantitative

method, the researcher focuses on collecting the data about the statistical

inferences. In addition, quantitative methodology assumes the necessity,

desirability, and even the possibility of applying some underlying empirical

standard to social phenomena (Quinn, 1978: 212).

L. Technique of Collecting Data

According to Muslich (2012: 40), research techniques consist of:

a. Observation

Observation is a written note about what is seen, heard, and experienced in collecting data and reflecting on qualitative data. Observation is used to study the particular target that is observed. (Sam’s,


The writer visited the school, asked for the results of the summative test of the English subject, and asked for the question sheet of the English subject to be analyzed. The writer also interviewed the English teacher of the tenth grade students of SMK N 3 Salatiga.

b. Documentation

According to Arikunto (2010: 274), documentation is an activity of looking for variables in the form of notes, transcripts, books, newspapers, magazines, etc. In this method, the writer provided a checklist to look for the variables that had been decided. Whenever a wanted variable appeared, the writer gave a check (√) in the checklist form.

Documentation means collecting the files or data of related information, including the results of the tenth grade students’ examination in the even semester. There are two instruments used in this research: the English summative test and the English syllabus. The writer came to the school and asked for the English summative tests of the tenth grade students of SMK N 3 Salatiga. Then, the writer collected the data about the English syllabus, the students’ profile data, the students’ English scores, and the general information


M. Technique of Analyzing Data

The writer conducts an item analysis on the score of the English summative test (a descriptive study of the tenth grade students of SMK N 3 Salatiga in the academic year of 2013/2014). According to Stephen and Michael (1982: 46), a descriptive study is used in the literal sense of describing situations or events. It is the accumulation of a data base that is solely descriptive. The purpose of this approach is to describe systematically the facts and characteristics of a given population or area of interest, factually and accurately. In this study, the writer described all of the data and analyzed them to get the results and conclusions.

In analyzing the data, the writer uses a quantitative approach. A quantitative approach summarizes data using numbers. Hypotheses and methods of data collection are created before the research begins (Lodico, 2006: 6). In this research, the writer needs to identify, classify, and interpret the data.

Based on the type of information needed in this research, the writer focuses on collecting data for statistical inference. The English summative test consisted of 50 items. It was developed from the syllabus of Curriculum 2013.

Item analysis provides two kinds of information on items, namely item difficulty and item discrimination. First, the writer measures the difficulty level of the items in the English summative test. According to Arikunto (1995: 212), the number that indicates whether an item is difficult or easy is called the difficulty index. The difficulty index ranges from 0.00 to 1.00 and shows the level of test difficulty. An index of 0.00 shows that the item is too hard; conversely, an index of 1.00 shows that the item is too easy.

0.0                                   1.0
Hard                                  Easy

To measure the difficulty index, the writer used the formula below:

$$P = \frac{B}{JS}$$

P = Difficulty index.
B = Total students who answered correctly.
JS = Total students who took the test.


The result is then compared with the classification of the difficulty index. According to Daryanto (1999: 182), the difficulty index is classified by the criteria below:

Table 3.4
Classification of the Difficulty Index

Difficulty Index (P)    Criteria
0.00 – 0.30             Hard Question
0.30 – 0.70             Moderate Question
0.70 – 1.00             Easy Question
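The calculation and classification of the difficulty index can be sketched in code. This is a minimal illustration rather than the writer’s actual procedure: it takes the difficulty index as the number of students who answered an item correctly divided by the number of students who took the test, and the function names are hypothetical. Because the ranges in Table 3.4 share their boundary values, the sketch assumes that 0.30 and 0.70 both fall in the moderate category.

```python
def difficulty_index(correct: int, test_takers: int) -> float:
    """Proportion of test takers who answered the item correctly."""
    if test_takers <= 0:
        raise ValueError("test_takers must be positive")
    if not 0 <= correct <= test_takers:
        raise ValueError("correct must be between 0 and test_takers")
    return correct / test_takers


def classify_difficulty(p: float) -> str:
    """Map a difficulty index onto the criteria of Table 3.4."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("the difficulty index must lie between 0.00 and 1.00")
    if p < 0.30:
        return "Hard Question"
    if p <= 0.70:  # assumption: boundary values count as moderate
        return "Moderate Question"
    return "Easy Question"


# Hypothetical item answered correctly by 51 of the 102 sampled students.
p = difficulty_index(51, 102)
print(p, classify_difficulty(p))
```

With 102 test takers, for example, an item answered correctly by only 10 students would have an index below 0.30 and would be classified as a hard question.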

Second, the writer measures the discrimination index. According to Arikunto (1995: 215), the discrimination index is the ability of an item to discriminate between high-ability and low-ability students. The number that shows this ability is called the discrimination index (D). Like the difficulty index, it ranges from 0.00 to 1.00. Unlike the difficulty index, however, which never takes a negative (-) sign, the discrimination index can take a negative (-) sign.

-1.00                    0.00                     +1.00
Discrimination index     Discrimination index     Discrimination index
Negative                 Low                      High/Positive

A test item is not good when it is answered correctly by both the upper and the lower students, because it then has no discriminating power. Such


was the same. Such an item has D = 0.00 because it has no discriminating power. To measure the discrimination index, the writer used the formula below:

$$D = \frac{BA}{JA} - \frac{BB}{JB} = PA - PB$$

D = Discrimination index
J = Total students
JA = Total of upper group
JB = Total of lower group
BA = Total of upper group who answered correctly
BB = Total of lower group who answered correctly
PA = BA / JA = Proportion of the upper group who answered correctly (P as the difficulty index).
PB = BB / JB = Proportion of the lower group who answered correctly.
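The discrimination index can be computed directly from the four counts defined above. The following is a minimal sketch; the function and parameter names are hypothetical and simply spell out the symbols BA, JA, BB, and JB, and the numbers in the example are invented for illustration.

```python
def discrimination_index(upper_correct: int, upper_size: int,
                         lower_correct: int, lower_size: int) -> float:
    """D = BA/JA - BB/JB: upper-group proportion minus lower-group proportion."""
    if upper_size <= 0 or lower_size <= 0:
        raise ValueError("both groups must contain at least one student")
    if not 0 <= upper_correct <= upper_size:
        raise ValueError("upper_correct must be between 0 and upper_size")
    if not 0 <= lower_correct <= lower_size:
        raise ValueError("lower_correct must be between 0 and lower_size")
    pa = upper_correct / upper_size  # PA: proportion correct in the upper group
    pb = lower_correct / lower_size  # PB: proportion correct in the lower group
    return pa - pb


# Hypothetical item: 30 of 50 upper-group and 15 of 50 lower-group students correct.
d = discrimination_index(30, 50, 15, 50)
print(round(d, 2))
```

When both groups answer an item equally well, the two proportions cancel and D is 0.00; when the lower group does better than the upper group, D becomes negative.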

According to Daryanto (1999: 189), a good item is one that distinguishes between high-ability and low-ability students. This can be seen from whether the students are able or unable to answer the item. The test items were poor if the test items could be


The result is then compared with the classification of the discrimination index. According to Arikunto (1995: 223), the discrimination index is classified by the criteria below:

Table 3.5
Classification of the Discrimination Index

Discrimination Index (D)    Criteria
0.70 – 1.00                 Excellent
0.40 – 0.70                 Good
0.20 – 0.40                 Satisfactory
0.00 – 0.20                 Poor
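The criteria in Table 3.5 can be applied mechanically once a discrimination index has been calculated. The following is a minimal sketch with a hypothetical function name. It also applies the rule, attributed in this chapter to Aggrawal (1986), that items with negative discrimination are rejected. Because the ranges in the table share their boundary values, the sketch assumes that a boundary value belongs to the higher category.

```python
def classify_discrimination(d: float) -> str:
    """Map a discrimination index onto the criteria of Table 3.5."""
    if not -1.0 <= d <= 1.0:
        raise ValueError("the discrimination index must lie between -1.00 and 1.00")
    if d < 0.0:
        return "Rejected"  # negative discrimination
    if d < 0.20:
        return "Poor"
    if d < 0.40:  # assumption: a boundary value belongs to the higher category
        return "Satisfactory"
    if d < 0.70:
        return "Good"
    return "Excellent"


# Hypothetical indices for five items.
for value in (-0.10, 0.10, 0.30, 0.50, 0.80):
    print(value, classify_discrimination(value))
```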

In addition to this classification, according to Aggrawal (1986), items having negative discrimination are rejected. Items having a discrimination index above 0.20 are ordinarily regarded as satisfactory for use in most tests


CHAPTER IV

DATA ANALYSIS

This chapter focuses on analyzing the collected data. The writer gives the details of the findings. This chapter is the main discussion of the research conducted. It presents the findings from the collected data from the beginning until the end of the research.

A. Analysis

In this study, the writer provided the whole data analyses of this

research which are explained in the description below:

1. Analysis of the Difficulty Index

First, the writer measures the difficulty level of the items in the English summative test. According to Arikunto (1995: 212), the number that indicates whether an item is difficult or easy is called the difficulty index. The difficulty index ranges from 0.00 to 1.00 and shows the level of test difficulty. An index of 0.00 shows that the item is too hard; conversely, an index of 1.00 shows that the item is too easy.

0.0                                   1.0
