Six qualities of useful language tests in 2012 senior high school bahasa Inggris national examination in Kota Yogyakarta.


SIX QUALITIES OF USEFUL LANGUAGE TESTS IN 2012 SENIOR HIGH SCHOOL BAHASA INGGRIS NATIONAL EXAMINATION IN

KOTA YOGYAKARTA

A SARJANA PENDIDIKAN THESIS

Presented as Partial Fulfillment of the Requirements to Obtain the Sarjana Pendidikan Degree

in English Language Education

By Sabina Thipani Student Number: 081214032

ENGLISH LANGUAGE EDUCATION STUDY PROGRAM DEPARTMENT OF LANGUAGE AND ARTS EDUCATION FACULTY OF TEACHERS TRAINING AND EDUCATION

SANATA DHARMA UNIVERSITY YOGYAKARTA


This thesis is dedicated to the education world of Indonesia.

“Education either functions as an instrument which is used to facilitate integration of the younger generation into the logic of the present system and bring about conformity or it becomes the practice of freedom, the means by which men and women deal critically and creatively with reality and discover how to participate in the transformation of their world.”

― Paulo Freire, Pedagogy of the Oppressed

“We are more than robotic bookshelves, conditioned to blurt out facts we were taught in school. We are all very special, every human on this planet is so special, so aren't we all deserving of something better, of using our minds for innovation, rather than memorization, for creativity, rather than futile activity, for rumination rather than stagnation?”

- Erica Goldson, Here I Stand

“School is supposed to emancipate soul, not inventing working class.”


STATEMENT OF WORK’S ORIGINALITY

I honestly declare that this thesis, which I have written, does not contain the work or parts of the work of other people, except those cited in the quotations and the references, as a scientific paper should.

Yogyakarta, June 5, 2013

The Writer


STATEMENT OF CONSENT (LEMBAR PERNYATAAN PERSETUJUAN)

TO THE PUBLICATION OF ACADEMIC WORK FOR ACADEMIC PURPOSES

I, the undersigned, a student of Sanata Dharma University:

Name : Sabina Thipani

Student Number : 08 1214 032

for the advancement of knowledge, grant the Sanata Dharma University Library my academic work entitled:

SIX QUALITIES OF USEFUL LANGUAGE TESTS IN 2012 SENIOR HIGH SCHOOL BAHASA INGGRIS NATIONAL EXAMINATION IN

KOTA YOGYAKARTA

together with any necessary supporting materials (if any). I thereby grant the Sanata Dharma University Library the right to store it, convert it into other media formats, manage it in a database, distribute it on a limited basis, and publish it on the Internet or other media for academic purposes, without needing to ask my permission or pay me royalties, as long as my name remains credited as the author.

I make this statement truthfully. Made in Yogyakarta

Date: June 5, 2013 Declared by

ABSTRACT

THIPANI, SABINA. 2013. Six Qualities of Useful Language Tests in 2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta. Yogyakarta: Sanata Dharma University.

The National Examination has always been a controversial issue in Indonesia. This controversy inspired the researcher to analyze the National Examination using the theory of six qualities of useful language tests, which was chosen for four reasons. The theory led the researcher to propose the following problem formulation: Does the 2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta correspond to the six qualities of useful language tests? The six qualities are reliability, construct validity, authenticity, interactiveness, impact, and practicality.

In the research, the researcher used document/content analysis and sample survey as methods. Document analysis was especially used to answer the questions related to construct validity, authenticity, and practicality. Sample survey was especially used to answer the questions related to reliability, interactiveness, and impact.

Based on the research results and discussion, it can be concluded that, firstly, the 2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta was reliable in terms of inter-rater reliability. However, the test was not fully reliable in terms of student, test administration, and test reliability. Secondly, the test cannot be considered valid in terms of its construct because it did not represent all aspects of either the Kisi-kisi UN or the Competence Standard and Basic Competence. Thirdly, the test tasks were not fully authentic because some parts of them did not reflect the target language use (TLU) tasks. Fourthly, the test was interactive in that the students’ personal characteristics (especially their level and type of general education and the type and amount of their preparation) and language ability (especially their language knowledge) helped them to be involved in the test. However, the test can also be considered not interactive in that other characteristics of the students (especially their family background, topical knowledge, affective schemata, and, within language ability, their strategic competence) did not help them to be involved in the test.

It can also be concluded that, fifthly, the test developer’s goals were in accord with the goals of society and the education system. However, the interpretation of the test scores conflicted with those goals, and the test did not have a significant impact on the students and teachers. Sixthly, the test can be considered practical in terms of the availability of human resources and time allocation. However, it was not practical in terms of the availability of material resources.

ABSTRAK

THIPANI, SABINA. 2013. Six Qualities of Useful Language Tests in 2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta. Yogyakarta: Sanata Dharma University.

Ujian Nasional (UN) telah menjadi isu yang kontroversial di Indonesia. Kontroversi ini menginspirasi peneliti untuk menganalisa UN. Untuk menganalisa UN peneliti menggunakan teori enam kualitas tes yang berguna. Ada empat alasan yang melandasi pemilihan teori ini. Dengan teori ini, peneliti mengajukan rumusan masalah berikut: Apakah UN Bahasa Inggris SMA Tahun 2012 sesuai dengan enam kualitas tes bahasa yang berguna? Kualitas yang dimaksud adalah: keterandalan, validitas konsep, otentisitas, keinteraktifan, dampak, dan kepraktisan.

Peneliti menggunakan metode analisis isi dan survei sampel dalam penelitian ini. Analisis isi digunakan untuk menjawab pertanyaan-pertanyaan, terutama yang berhubungan dengan validitas konsep, keotentikan, dan kepraktisan. Survei sampel digunakan untuk menjawab pertanyaan-pertanyaan, terutama yang berhubungan dengan keterandalan, keinteraktifan, dan dampak.

Berdasarkan hasil penelitian dan pembahasan, dapat disimpulkan bahwa, pertama, UN Bahasa Inggris SMA Tahun 2012 di Kota Yogyakarta adalah tes yang dapat diandalkan, terutama dalam konteks antarpenilai. Namun, UN tersebut tidak sepenuhnya dapat diandalkan dalam konteks siswa, administrasi tes, dan tes. Kedua, UN tidak valid secara konseptual karena tidak merepresentasikan semua aspek dalam Kisi-kisi UN maupun Standar Kompetensi dan Kompetensi Dasar. Ketiga, tugas tes UN tidak sepenuhnya otentik karena beberapa bagian dari tugas tersebut tidak mencerminkan tugas kegunaan bahasa target. Keempat, UN adalah tes yang interaktif karena karakter personal siswa (terutama tingkat & jenis pendidikan umum dan jenis & kuantitas persiapan) serta kemampuan bahasa mereka (khususnya pengetahuan bahasa) membuat para siswa terlibat dalam tes. Namun, UN juga merupakan tes yang tidak interaktif karena karakteristik personal para siswa (khususnya latar belakang keluarga, pengetahuan topikal, dan skema afektif) tidak membuat mereka terlibat dalam tes.

Selain itu, dapat pula disimpulkan bahwa, kelima, tujuan dari pengembang tes sesuai dengan tujuan masyarakat/sistem pendidikan. Namun, interpretasi nilai UN bertentangan dengan tujuan masyarakat/sistem pendidikan. UN juga tidak berdampak signifikan pada siswa dan guru. Keenam, dapat disimpulkan bahwa UN merupakan tes yang praktis dalam konteks ketersediaan sumber daya manusia dan alokasi waktu. Namun, UN bukanlah tes yang praktis dalam konteks ketersediaan sumber daya material.


ACKNOWLEDGMENTS

First of all, I would like to express my gratitude to the Universe, for helping me to reveal the new truths from my thesis writing process. Secondly, I would like to address my gratitude to my major sponsor, Ag. Hardi Prasetyo, S.Pd., M.A., for the guidance, suggestions, and correction during my thesis writing process.

I would also like to thank all the teachers and students from the three senior high schools where I conducted my research. I owe them a debt of gratitude for their willingness to share their thoughts and experiences related to the 2012 Senior High School Bahasa Inggris National Examination. Because of their openness, I was finally able to finish this thesis.

I would like to give my warmest thanks to my family; my mother, Sisilia Sumarsih, my father, Ign. Sugeng Mulyono, and my dearest brother, Hugo Probo Gumelar. I thank them for their prayers, support, patience, understanding, and endless love, not only during my thesis writing process and my study, but also during the worst and the best phases in my life.

My gratitude also goes to Wahmuji. I would like to thank him for always listening to my worries, for every comforting word he gave in hard times, and for moments given to know me better and for helping me to be stronger and braver, not only during this thesis writing process, but also during my study.


intellectual development, something which I found to be very valuable to help me to deal with my thesis.

I would also like to show my gratitude to my awesome partners in mediasastra.com. All of them helped me get through my intellectual drought during the thesis writing process by showing me fresh and innovative ways to write.

Last but not least, I wish to express my gratitude to all friends in the English Language Education Study Program, especially Theresia Pangestu, Maria Oktaviarini, Yustian Pristantyo, Clara Belinda Carismalanni and Ratih Mayasari. I would like to thank them for the support and time we spent at the last moment of the thesis writing process.


TABLE OF CONTENTS

TITLE PAGE

APPROVAL PAGE

DEDICATION PAGE

STATEMENT OF WORK’S ORIGINALITY

PERNYATAAN PERSETUJUAN PUBLIKASI

ABSTRACT

ABSTRAK

ACKNOWLEDGMENTS

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF APPENDICES

CHAPTER I. INTRODUCTION

A. Background

B. Research Problem

C. Problem Limitation

D. Research Objectives

E. Research Benefits

CHAPTER II. REVIEW OF RELATED LITERATURE

A. Theoretical Description

1. National Examination

2. Assessment and Test

3. Six Qualities of Useful Language Tests

B. Theoretical Framework

CHAPTER III. RESEARCH METHODOLOGY

A. Research Method

B. Research Setting

C. Research Participants

D. Instruments and Data Gathering Techniques

E. Data Analysis Techniques

F. Research Procedure

CHAPTER IV. RESEARCH RESULTS AND DISCUSSION

A. Reliability of 2012 Senior High School Bahasa Inggris National Examination

1. Student Reliability

2. Inter-rater Reliability

3. Test Administration Reliability

B. Construct Validity of 2012 Senior High School Bahasa Inggris National Examination

1. Relevance of Kisi-kisi UN

2. Relevance of National Examination

C. Authenticity of 2012 Senior High School Bahasa Inggris National Examination

D. Interactiveness of 2012 Senior High School Bahasa Inggris National Examination

1. Personal Characteristics

a. Level and Type of General Education

b. Family Background

c. Type and Amount of Preparation

2. Topical Knowledge

3. Affective Schemata

4. Language Ability

a. Language Knowledge

b. Strategic Competence

E. Impact of 2012 Senior High School Bahasa Inggris National Examination

1. Washback

2. Impact on Individuals

F. Practicality of 2012 Senior High School Bahasa Inggris National Examination

CHAPTER V. CONCLUSIONS AND RECOMMENDATIONS

A. Conclusions

B. Recommendations

REFERENCES

LIST OF TABLES

2.1. Elements of Language Ability

3.1. Implementation of Research Methods in Six Qualities

3.2. Samples of the Research

3.3. Research Participants

3.4. Participants Interviewed and Given Questionnaires

3.5. Instruments of Data Gathering

3.6. Data Analysis Techniques Summary

4.1. Excluded Elements of Competence Standards and Basic Competences

4.2. Relevance of National Examination to Kisi-kisi UN

4.3. Authenticity of 2012 Senior High School Bahasa Inggris National Examination

4.4. Student Affective Responses in SMA A, SMA B, and SMA C

4.5. Language Knowledge Areas Involved in Test Tasks

4.6. Language Knowledge of Students of SMA A, SMA B, and SMA C

LIST OF FIGURES


LIST OF APPENDICES

Appendix A : Questions List

Appendix B : Reliability Rubric

Table B1 Student Reliability Rubric

Table B2 Test Administration Reliability Rubric

Table B3 Test Reliability Rubric

Appendix C : Kisi-kisi UN

Table C1 Kisi-kisi UN Bahasa Inggris SMA 2012

Table C2 2012 Senior High School Bahasa Inggris National Examination Kisi-kisi

Appendix D : Item Analysis Rubric

Appendix E : Authenticity Rubric

Appendix F : Interactiveness Rubric

Table F1 Personal Characteristics Rubric

Table F2 Topical Knowledge Rubric

Table F3 Affective Schemata Rubric

Table F4 Language Ability Rubric

Appendix G : Impact Rubric

Table G1 Impact on Test Takers Rubric

Table G2 Impact on Teachers Rubric

Table G3 Impact on Society and Education Systems Rubric

Appendix H : Practicality Rubric

CHAPTER I

INTRODUCTION

This chapter describes the background information, nature, and content of the research. It is divided into six parts. They are: research background, research problem, problem limitation, research objective, research benefits, and definition of terms.

A. Research Background

Since 1950, the Indonesian government has been implementing national final tests for all students in Indonesia. The name of the national final test changes every several years. It began as Ujian Penghabisan (1950-1960) and then changed to Ujian Negara (1965-1971). The name changed again to EBTANAS (Evaluasi Belajar Tahap Akhir Nasional) (1980-2001). The national final test in 2002 was called UAN (Ujian Akhir Nasional). Since 2005, it has been called UN (Ujian Nasional), which the researcher refers to as the National Examination in this thesis (Cessnasari, 2005). The term UN is still in use at the time of writing (2013).

The National Examination has always been a controversial issue in Indonesia. Tim Ujian Nasional Universitas Negeri Malang (National Examination Team of Malang State University) states that the National Examination in Indonesia has been a subject of controversy for years:


selalu berlangsung setiap tahun di Indonesia, bahkan sepanjang tahun sehingga menjadi perdebatan laten penilaian pendidikan nasional (n.d., p.1.).

English version:

Discussion or polemic on the National Examination – related to its policy, form or design, mechanism, or implementation – takes place every year in Indonesia, even all year round, so that it has become a latent debate on national education assessment.

Conflicts have arisen between those who support and those who oppose the implementation of the National Examination. Tim Ujian Nasional Universitas Negeri Malang also writes that both proponents and opponents put forward principles, ideas, logic, arguments, and empirical evidence to support their claims about the National Examination. The discussion generally concerns the National Examination from the perspectives of pedagogy, law, economics, and society, as well as the way it is implemented.

From the perspective of pedagogy, for example, the opponents state that the National Examination does not cover all three aspects of learning objectives (cognitive, affective, and psychomotor); it covers only the cognitive aspect (Harti, n.d.). By contrast, the proponents state that the National Examination is pedagogically acceptable because it reflects the standards-based education paradigm implemented in Indonesia (Tim Ujian Nasional Universitas Negeri Malang, n.d.).

1 (National Education System Statute No. 20/2003 Article 35 Clause 1) (Harti, n.d.) which states that:

Standar nasional pendidikan terdiri atas standar isi, proses, kompetensi lulusan, tenaga kependidikan, sarana dan prasarana, pengelolaan, pembiayaan, dan penilaian pendidikan yang harus ditingkatkan secara berencana dan berkala.

English version:

The national education standards consist of standards for content, process, graduate competence, education personnel, facilities and infrastructure, management, financing, and education assessment, which must be improved in a planned and periodic manner.

By contrast, the proponents state that the National Examination complies with National Education System Statute No. 20/2003.

These conflicts inspired the researcher to conduct research on the National Examination, especially the Senior High School Bahasa Inggris National Examination. In this research, the National Examination was analyzed using the theory of six qualities of useful language tests, which was proposed by Lyle F. Bachman and Adrian S. Palmer (2004) in their book Language Testing in Practice: Designing and Developing Useful Language Tests. This theory was chosen for four reasons.

21-35). The correlation between language testing and language use is explained in the discussion of the authenticity quality (Bachman & Palmer, 2004, pp. 23-25).

The second reason was that the theory makes it possible to analyze tests from a social perspective. A more detailed explanation is provided in the discussion of the impact quality (Bachman & Palmer, 2004), where the impact of tests on society and education systems is treated in a dedicated sub-subchapter (Bachman & Palmer, 2004, pp. 34-35).

The third reason was that the theory provides a discussion of test implementation. The language testing philosophy adopted by Bachman and Palmer (2004) is related to the ways tests are implemented. This philosophy is reflected in the explanation of the reliability, interactiveness, and practicality qualities (Bachman & Palmer, 2004, pp. 19-37).

The fourth reason for choosing the six qualities of useful language tests was related to the applicative character of the theory. Bachman and Palmer (2004) state that “… in order to be useful, any given test must be developed with a specific purpose, a particular group of test takers and a specific language use domain…” (p. 18). From this quotation, the researcher interpreted

researcher, which is the city in which English is used as a foreign language, can be analyzed by the six qualities of useful language tests.

In this research, the researcher also used the principles of language assessment proposed by H. Douglas Brown (2004) in his book Language Assessment: Principles and Classroom Practices. These principles were used in order to provide a more comprehensive discussion of the National Examination. Brown’s principles are applied especially in the discussion of the reliability and impact qualities.

B. Research Problem

Based on the background presented in subchapter A, the researcher proposed one research problem: “Does the 2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta correspond to the six qualities of useful language tests?”

C. Problem Limitation

To keep the research realistic and focused, the researcher limited the object of the research problem discussed in subchapter B, the Senior High School Bahasa Inggris National Examination, to the 2011/2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta. Henceforth, for practical reasons, the ‘2011/2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta’ will be referred to as the ‘2012 Senior High School

The researcher chose 2012 as the time setting because the 2012 National Examination was the most recent National Examination conducted, at least at the time the research proposal was completed. By analyzing the 2012 Senior High School Bahasa Inggris National Examination, the researcher was able to show the most current condition of this test, which can help teachers, students, and academicians respond critically to future Senior High School Bahasa Inggris National Examinations. The recency of the National Examination also benefited this research because recent documents were easy to access and the risk of the research documents being lost or damaged was small.

Senior High School was chosen because, at the time the research proposal was completed, it was one of the formal educational institutions used as the setting of Program Pengalaman Lapangan/PPL (Field Internship Unit), besides Junior and Vocational High Schools. PPL is a crucial subject in the Faculty of Teachers Training and Education because it is the culmination of all the education programs taken by the students in the faculty (Fakultas Keguruan dan Ilmu Pendidikan Universitas Sanata Dharma Yogyakarta [FKIP USD YK], 2007). In short, the researcher chose Senior High School because it is one of the settings of this crucial subject.

that, in Kota Yogyakarta the variation of the Senior High Schools, especially in terms of their accreditation, is higher; thus the researcher was able to pick ideal samples for the research.

D. Research Objective

Through this research, the researcher aimed to determine the reliability, construct validity, authenticity, interactiveness, impact, and practicality of the Senior High School Bahasa Inggris National Examination in general and the 2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta in particular.

E. Research Benefits

The results of this research are expected to be useful for:

1. Education academicians, teachers, and students of Senior High Schools in responding to Senior High School Bahasa Inggris National Examinations critically.

2. Test makers of Senior High School Bahasa Inggris National Examinations to develop more useful tests in the future.

3. Lecturers and students of the English Language Education Study Program (ELESP) of Sanata Dharma University to evaluate Senior High School Bahasa Inggris National Examinations and to enrich their knowledge

F. Definition of Terms

To help readers understand this research better, the researcher provides definitions of the following terms:

1. Six Qualities of Useful Language Tests

In this study, six qualities of useful language tests refer to the combination of reliability, construct validity, authenticity, interactiveness, impact, and practicality (Bachman & Palmer, 2004). This combination is recommended for analyzing both new language tests and existing ones which need to be developed (Bachman & Palmer, 2004, p. 9). Thus, in this research, the researcher checked whether the National Examination was reliable, valid in terms of its construct, authentic, interactive, and practical, and whether it had an impact. If so, the National Examination can be considered a useful test.

a. Reliability

According to Bachman and Palmer (2004), a useful language test is a reliable test. Reliability refers to the consistency of a test “across different


b. Construct Validity

A useful test is also a test with construct validity, meaning that it “reflects the area(s) of language ability we want to measure…” (Bachman & Palmer, 2004, p. 21) or, in Brown’s terminology, reflects its theoretical construct (Brown, 2004, p. 25). In the context of the National Examination, one of the theoretical constructs was Kisi-kisi Ujian Nasional Bahasa Inggris SMA/MA Tahun Pelajaran 2011/2012 (the list of competences and indicators which must be tested in the National Examination). Henceforth, it will be referred to as ‘Kisi-kisi UN’. Kisi-kisi UN was grounded on Standar Kompetensi dan Kompetensi Dasar dalam Standar Isi di Peraturan Menteri No. 22 Tahun 2006 (Competence Standard and Basic Competence specified in Content Standard in Ministerial Regulation No. 22 Year 2006), which will later be referred to as ‘Competence Standard and Basic Competence’. Thus, indirectly, another theoretical construct of the National Examination was the Competence Standard and Basic Competence. By checking whether the National Examination reflected the two theoretical constructs, the researcher was able to determine the construct validity of the test.

c. Authenticity

A useful test is also an authentic test. Authenticity concerns the correspondence of “the characteristics of a given language test task to the features of a TLU task” (Bachman & Palmer, 2004, p. 23). TLU stands for Target

should be meaningful for the learner (called as relevant by Bachman and Palmer), there should be thematic organization, and the tasks should represent real-world tasks.

d. Interactiveness

A useful test is also an interactive test. Interactive here means allowing the test takers’ individual characteristics to be involved in the accomplishment of the test tasks (Bachman & Palmer, 2004, p. 25). The individual characteristics most relevant to language testing are the test takers’ personal characteristics (Bachman & Palmer, 2004, p. 64), language ability (language knowledge and strategic competence), topical knowledge, and affective schemata (Bachman & Palmer, 2004, p. 25). These characteristics are explained in more detail in Chapter II. However, one example can illustrate the concept of interactiveness. In the case of affective schemata, if the test topics motivate the test takers to respond to the test positively and thus to perform at their best with ease, the test can be considered interactive (Bachman & Palmer, 2004, p. 145). By performing at their best, the test takers are involved in the test, and being involved is an important indicator of test interactiveness.

e. Impact

Impact on the test takers involves discussion of how a test affects the test takers’ preparation, how the test feedback is given to the test takers, and how the test takers respond to the decisions that are made on the basis of the test scores (Bachman & Palmer, 2004, p. 31). Impact on the teachers involves discussion of the relation between the test and the teaching material, the teaching and learning activities, and the instructional program (Bachman & Palmer, 2004, p. 147). Impact on society and education systems involves discussion of the relation between the test and the goals of society and education systems (Bachman & Palmer, 2004, pp. 147-148).

f. Practicality

A useful test must also be practical. A test is considered practical if all the resources needed for its implementation are available. Practicality is defined as “the relation between the resources that will be required in the design, development, and use of the test and the resources that will be available for these activities” (Bachman & Palmer, 2004, p. 36). If, for example, a test needs to be conducted in ten classrooms, and the test administrator can provide all ten classrooms, the test can be considered practical.
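The relation between required and available resources can be illustrated with a minimal sketch. It assumes one possible reading of the definition, namely that each resource is compared as a ratio of available to required amounts and that a test is practical only when every ratio is at least 1. The class and function names are hypothetical, and the answer-sheet figures are invented for illustration only; they are not data from this research.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str         # hypothetical label, e.g. "classrooms"
    required: float   # amount the test needs (assumed to be > 0)
    available: float  # amount the administrator can provide


def practicality_ratio(resource: Resource) -> float:
    """Available amount divided by required amount for one resource."""
    return resource.available / resource.required


def is_practical(resources: list[Resource]) -> bool:
    """Assumed rule: practical only if every resource ratio is at least 1."""
    return all(practicality_ratio(r) >= 1 for r in resources)


# The classroom example from the text: ten needed, ten provided.
classrooms = Resource("classrooms", required=10, available=10)
# Invented figures, for illustration only.
answer_sheets = Resource("answer sheets", required=300, available=240)

print(is_practical([classrooms]))
print(is_practical([classrooms, answer_sheets]))
```

Under these assumptions, the classroom example alone counts as practical, while adding a resource that falls short of what is required makes the whole test impractical.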

2. Senior High School Bahasa Inggris National Examination

measures Senior High School students’ ability in English nationally (“Prosedur

CHAPTER II

REVIEW OF RELATED LITERATURE

This chapter describes the theoretical background of the research. It consists of two main parts: the theoretical description and the theoretical framework. The first part summarizes previous research related to the study, the nature of the National Examination, the definition of assessment and test, and a detailed explanation of the theory of six qualities of useful language tests. The second part describes the relation between the theoretical description and the subject of the study.

A. Theoretical Description

In his thesis, Albertus Fiharsono, a student of ELESP year 1998, analyzed the use of multiple-choice items in the reading section of the Junior High School National Final Examination (2005). His findings show that the multiple-choice format was suitable for the National Final Examination. The National Final Examination did not always contain plausible distractors, and it was rapid to score. In addition, its material was not completely new to the students. However, he found that the National Final Examination did not encourage the students to be creative, and its stems were not well constructed.

Vocational High Schools. In the thesis, he found that 2007 National Final Examination for Vocational High Schools met the criteria of face and content validity.

This research was expected to give a new perspective on the Bahasa Inggris National Examination. It analyzes the 2012 Senior High School Bahasa Inggris National Examination in Kota Yogyakarta using the six qualities of useful language tests. Unlike the previous research conducted by Fiharsono, which focused on only one skill, this research covers all skills tested in the National Examination. Unlike the previous research by Wiratmo, which focused only on face and content validity, this research covers all six qualities of useful language tests (reliability, construct validity, authenticity, interactiveness, impact, and practicality). Finally, whereas Fiharsono analyzed the Junior High School examination and Wiratmo the Vocational High School examination, this research analyzes the Senior High School National Examination.

1. National Examination

According to Prosedur Operasi Standar Ujian Nasional Sekolah Menengah Pertama, Madrasah Tsanawiyah, Sekolah Menengah Pertama Luar Biasa, Sekolah Menengah Atas, Madrasah Aliyah dan Sekolah Menengah Kejuruan Tahun Pelajaran 2011/2012 (National Examination Standard Operating Procedure for Junior High School, Madrasah Tsanawiyah, Special Needs Junior High School, Senior High School, Madrasah Aliyah, and Vocational High School Year 2011/2012), the National Examination is an examination which is conducted nationally and simultaneously throughout Indonesia. In 2012, the Senior High School National Examination was conducted from April 16 to April 19, and the Bahasa Inggris National Examination took place on April 17, 2012. There was also a substitution test for those who were ill or unable to attend because of a particular obstacle (proven by an official letter of explanation). This substitution test was conducted from April 23 to April 26, 2012, with the Bahasa Inggris National Examination on April 24, 2012.

The same document also describes the procedure for designing the 2012 National Examination. Before designing the Bahasa Inggris National Examination questions, Penyelenggara UN Tingkat Pusat (National Examination Central Organizer) was responsible for designing Kisi-kisi UN based on the Competence Standard and Basic Competence. After that, the National Examination Central Organizer designed and validated the draft questions of the Bahasa Inggris National Examination by involving teachers, lecturers, and education experts. The final draft was proposed to Kementerian Pendidikan Nasional (Ministry of National Education) as an attachment to Peraturan Menteri Pendidikan Nasional (National Education Ministerial Regulation). The draft consisted of the 50 multiple-choice questions of the 2012 Bahasa Inggris National Examination, which the students were to complete in 120 minutes. It covered listening, reading, and ‘writing’ skills. The word writing is put between quotation marks for a reason which is elaborated in Chapter IV.

Before taking the test, the National Examination participants had to fulfill some requirements. First, they had to be in the last year of their education level. They also had to have a report of their education record up to the first semester of that last year. Finally, they had to register themselves for the National Examination.

In its administration, the National Examination was organized by Badan Standar Nasional Pendidikan/BSNP (National Education Standard Agency). It was helped by other education institutions. One of the institutions was Departemen Pendidikan Provinsi (Provincial Education Department). This


Departemen Pendidikan Provinsi dan Daerah (Provincial and Regional Education Department) and schools had the responsibility to keep those files secret and safe.

After the test ended, the National Examination organizers still had obligations to fulfill. The completed National Examination answer sheets were scanned by the chosen private universities. Scores were given based on the number of correct answers. Because the answer choices were limited to a, b, c, d, or e, an exact answer key was used to decide whether an answer was correct. The scanned answer sheets were then sent to the Ministry of National Education. The chosen private universities also had to send the report on the preparation and implementation of the National Examination to the Ministry of National Education, which forwarded it to BSNP. The results of the National Examination were then announced on May 24, 2012.
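The key-based scoring described above can be illustrated with a minimal sketch. It only shows the idea of counting answers that match an exact key; the function name, the five-item key, and the sample answer sheet are hypothetical and are not taken from the 2012 examination or from the scanning software actually used.

```python
# A minimal sketch of key-based scoring; the key and answers below are
# hypothetical and are not taken from the 2012 examination.

VALID_OPTIONS = {"a", "b", "c", "d", "e"}


def count_correct(answers, answer_key):
    """Count the items whose chosen option matches the answer key."""
    if len(answers) != len(answer_key):
        raise ValueError("answers and answer_key must have the same length")
    correct = 0
    for given, expected in zip(answers, answer_key):
        # A blank or invalid mark (anything outside a-e) never counts as correct.
        if given in VALID_OPTIONS and given == expected:
            correct += 1
    return correct


hypothetical_key = ["a", "c", "e", "b", "d"]
hypothetical_sheet = ["a", "c", "b", "b", ""]
print(count_correct(hypothetical_sheet, hypothetical_key))  # prints 3
```

Because every item has exactly one keyed option, two scorers applying the same key to the same sheet would arrive at the same count.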

The passing grade of the National Examination was 5.5. The results of the National Examination were used as:

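“a. pemetaan mutu pendidikan dan/atau satuan pendidikan.
b. dasar seleksi masuk jenjang pendidikan selanjutnya.
c. penentuan kelulusan peserta didik dari program dan/atau satuan pendidikan.
d. pembinaan dan pemberian bantuan kepada satuan pendidikan dalam upayanya untuk meningkatkan mutu pendidikan” (Badan Standar Nasional Pendidikan Kementerian Pendidikan dan Kebudayaan Republik Indonesia [BSNP Kemendikbud RI], 2011/2012).

English version:

a. mapping the quality of education and/or education units.
b. a basis for selection for admission to the next level of education.
c. determining students’ graduation from a program and/or education unit.
d. guiding and assisting education units in their efforts to improve the quality of education.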

Regarding point c, before 2011 the National Examination was the only determinant of students’ graduation. In 2012, however, the result of the National Examination was combined with the result of Ujian Sekolah/US (School Examination) before it was used to decide whether a student could pass a level of education. To decide a student’s graduation, 60% of the National Examination result was combined with 40% of the School Examination result (Kelulusan UN, 2012).
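As a worked illustration of the 60%/40% combination, the sketch below computes the weighted result for one student. The two weights come from the text; the function name and the sample scores are hypothetical. The sketch deliberately stops at the combined value, since the text does not state how the combined result was compared with a passing grade.

```python
# Weights stated in the text: 60% National Examination, 40% School Examination.
NATIONAL_EXAM_WEIGHT = 0.6
SCHOOL_EXAM_WEIGHT = 0.4


def combined_score(national_exam_score, school_exam_score):
    """Return the weighted combination of the two examination results."""
    return (NATIONAL_EXAM_WEIGHT * national_exam_score
            + SCHOOL_EXAM_WEIGHT * school_exam_score)


# Hypothetical scores, for illustration only.
print(f"{combined_score(5.0, 8.0):.2f}")  # prints 6.20
```

In this hypothetical case a relatively low National Examination score is raised by a higher School Examination score, which shows how the 2012 rule reduced the weight of the National Examination in the graduation decision.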

2. Assessment and Test

Assessment is a process of collecting information about aspects of a test taker’s language ability systematically, on the basis of substantive grounds (Bachman & Palmer, 2012, p. 20). As a wider domain than test, assessment can be divided into informal and formal assessments and into formative and summative assessments (Brown, 2004, pp. 5-7). Informal assessments include “a number of forms, starting with incidental, unplanned comments and responses, along with coaching and other impromptu feedback to the student” (Brown, 2004, p. 5) while formal assessments “are exercises or procedures specifically designed to tap into a storehouse of skills and knowledge” (Brown, 2004, p. 6). Formative assessments are activities of “evaluating students in the process of ‘forming’ their competencies and skills with the goal of helping them to continue that growth process” (Brown, 2004, p. 6) while summative assessments “aims to measure, or


One of the forms of assessment is test (Brown, 2004). Brown (2004) defines test as “a method of measuring a person’s ability, knowledge, or performance in a given domain” (p. 3). The ‘method’ here refers to “a set of techniques, procedures, or items – that requires performance on the part of the test-taker” (Brown, 2004, p. 3). What is measured by a test can be general ability or very specific competence or objectives (Brown, 2004). Since tests measure persons’ ability, knowledge, or performance, “[t]esters need to understand who the test-takers are” (Brown, 2004, p. 3). Even though a test measures test taker performance, its “results must imply the test-taker’s ability, or, to use a concept common in the field of linguistics, competence” (Brown, 2004, p. 3). A test which measures a given domain is a test which measures general competence in all skills of a language even though it only involves a sampling of skills (Brown, 2004). “A well-constructed test is an instrument that provides an accurate measure of the test-taker’s ability within a particular domain” (Brown, 2004, p. 4).

Within the assessment and test framework above, the 2012 Senior High School Bahasa Inggris National Examination can be categorized as a test. More specifically, it is a formal, summative test. As a test, it is appropriate to analyze it with the six qualities of useful language tests proposed by Bachman and Palmer.

3. Six Qualities of Useful Language Tests


The theory implies that language tests can be considered useful if they correspond to six qualities. The qualities are:

a. Reliability

Bachman and Palmer (2004) define reliability as the consistency of a test “across different characteristics of the testing situation” (p. 19). Brown (2004) mentions four types of conditions which might affect test consistency. They are: 1) student-related reliability, 2) rater reliability, 3) test administration reliability, and 4) test reliability.

Each of the four is explained below. Student-related reliability concerns physical or psychological factors of the test takers which may influence test consistency.

Rater reliability concerns the rating and the scorers’ condition, which may affect test consistency. It is divided into two: inter-rater reliability and intra-rater reliability. Inter-rater reliability is lacking “when two or more scorers yield inconsistent scores of the same test, possibly for lack of attention to scoring criteria, inexperience, inattention, or even preconceived biases” (Brown, 2004, p. 21).

Intra-rater reliability is related to tests in the classroom scope. Intra-rater unreliability “is a common occurrence for classroom teachers because of unclear scoring criteria, fatigue, bias toward particular good and bad students, or simple carelessness” (Brown, 2004, p. 21). In this research, the more appropriate rater


Test administration reliability concerns “the condition in which the test is administered” (Brown, 2004, p. 21), which may affect test consistency. Those conditions are the photocopied test sheets received by the students, sound amplification, and classroom conditions (Brown, 2004). In addition to those aspects, the researcher includes two other conditions which belong to test administration but are not mentioned by Brown: time allotment and test observer performance.

Test reliability concerns “the nature of the test” (Brown, 2004, p. 22), which influences test consistency. Bachman and Palmer (2004) define it as the extent of variation in the characteristics of the test setting, test rubric, test input, expected response, and the relation between test input and expected response. Since the essence of the test setting is the same as that of test administration explained in the previous paragraph, the researcher did not discuss variation in the test setting in the discussion of test reliability.

b. Construct Validity

Construct validity is defined as “the evidence that the test score reflects the area(s) of language ability we want to measure…” (Bachman & Palmer, 2004, p. 21). To investigate a test’s construct validity, researchers might ask, “Does this test actually tap into the theoretical construct as it has been defined?” (Brown, 2004, p. 25). Construct here refers to “any theory, hypothesis,


c. Authenticity

Authenticity is “the degree of correspondence of the characteristics of a given language test task to the features of a TLU task” (Bachman & Palmer, 2004, p. 23). Brown (2004) specifies five features of an authentic test: the language should be natural, the items should be contextualized, the topics should be meaningful for the learner (called relevant by Bachman and Palmer), there should be thematic organization, and the tasks should represent real-world tasks.

To discuss the authenticity of a test, two questions should be addressed. The first question is “[to] what extent does the description of tasks in the TLU domain include information about the setting, input, expected response, and relationship between input and response?” (Bachman & Palmer, 2004, p. 142) and the second one is “[to] what extent do the characteristics of the test task correspond to those of TLU tasks?” (Bachman & Palmer, 2004, p. 142).

d. Interactiveness

Interactiveness is defined “as the extent and type of involvement of the test taker’s individual characteristics in accomplishing a test task” (Bachman & Palmer, 2004, p. 25). The characteristics meant are test taker’s personal


1) Personal Characteristics

“Personal characteristics are individual attributes that are not part of test takers’ language ability but which may still influence their performance on language tests” (Bachman & Palmer, 2004, p. 64). According to Bachman and Palmer (2004), personal characteristics are broad, and some of them are problematic. It is thus impossible to analyze all personal characteristics. For this reason, the researcher chose several characteristics which are relevant to the context of the 2012 Senior High School Bahasa Inggris National Examination: level and type of general education, family background, and type and amount of preparation.

2) Topical Knowledge

Topical knowledge is “knowledge structures in long-term memory” (Bachman & Palmer, 2004, p. 65). Interactiveness in the context of topical knowledge can be defined as the involvement of the students’ long-term memory structures in the test. The involvement of students’ topical knowledge is indicated by their understanding of, interest in, and gaining of new knowledge from the topics in the test.

3) Affective Schemata

Affective schemata “provide[s] the basis on which language users assess,


Palmer, 2004, p. 65)”. According to Bachman and Palmer (2004), “[i]n a language test, test takers’ affective schemata may influence the ways in which they process and attempt to complete the test tasks” (p. 65). If the test task evokes an affective response that makes it easy for the test takers to perform at their best, the test is interactive (Bachman & Palmer, 2004).

4) Language Ability

Language ability is “a combination of language knowledge and strategic competence which provides language users with the ability, or capacity, to create and interpret discourse, either in responding to tasks on language tests or in non-test language use” (Bachman & Palmer, 2004, p. 67). Table 2.1 below (Bachman & Palmer, 2004, pp. 67-71) shows all parts of language ability.

Table 2.1 Elements of Language Ability

Language Ability
1. Language Knowledge
   1.1. Organizational Knowledge
      1.1.1. Grammatical Knowledge
         1.1.1.1. Knowledge of Vocabulary
         1.1.1.2. Knowledge of Syntax
         1.1.1.3. Knowledge of Phonology/Graphology
      1.1.2. Textual Knowledge
         1.1.2.1. Knowledge of Cohesion
         1.1.2.2. Knowledge of Rhetorical or Conversational Organization
   1.2. Pragmatic Knowledge
      1.2.1. Functional Knowledge
         1.2.1.1. Knowledge of Ideational Functions
         1.2.1.2. Knowledge of Manipulative Functions
         1.2.1.3. Knowledge of Heuristic Functions
         1.2.1.4. Knowledge of Imaginative Functions
      1.2.2. Sociolinguistic Knowledge
         1.2.2.1. Knowledge of dialects/varieties
         1.2.2.2. Knowledge of registers
         1.2.2.3. Knowledge of natural or idiomatic expressions
         1.2.2.4. Knowledge of references and figures of speech
2. Strategic Competence
   2.1. Goal Setting
   2.2. Assessment
   2.3. Planning

As shown in Table 2.1, language ability is divided into two parts: language knowledge and strategic competence. Both are explained in more detail below.

a) Language Knowledge

Language knowledge is “a domain of information in memory that is available for use by metacognitive strategies in creating and interpreting discourse in language use” (Bachman & Palmer, 2004, p. 67). As seen in Table 2.1, language knowledge consists of two parts, organizational knowledge and pragmatic knowledge.

Organizational knowledge affects “how utterances or sentences and texts are organized” (Bachman & Palmer, 2004, p. 68) because it “is involved in controlling the formal structure of language for producing or comprehending grammatically acceptable utterances or sentences, and for organizing these to form texts, both oral and written” (Bachman & Palmer, 2004, pp. 67-68).


comprehending formally accurate utterances or sentences” (Bachman & Palmer, 2004, p. 68) while textual knowledge “is involved in producing or comprehending texts, which are units of language – spoken or written – that consist of two or more utterances or sentences” (Bachman & Palmer, 2004, p. 68). Grammatical knowledge consists of three parts (knowledge of vocabulary, syntax, and phonology/graphology) and textual knowledge consists of two areas (knowledge of cohesion and knowledge of rhetorical/conversational organization) (Bachman & Palmer, 2004).

Pragmatic knowledge “enables us to create or interpret discourse by relating utterances or sentences and texts to their meanings, to the intentions of language users, and to relevant characteristics of the language use setting” (Bachman & Palmer, 2004, p. 69). Pragmatic knowledge consists of two parts, which are functional knowledge and sociolinguistic knowledge. Functional knowledge “enables us to interpret relationships between utterances or sentences and texts and the intentions of language users” (Bachman & Palmer, 2004, p. 69) while sociolinguistic knowledge “enables us to create or interpret language that is appropriate to a particular language use setting” (Bachman & Palmer, 2004, p.


In the context of language knowledge, an interactive test requires the involvement of a wide range of areas of language knowledge and the engagement of the test takers’ areas of language knowledge (Bachman & Palmer, 2004). The involvement of language functions in the test tasks is also needed. According to Bachman and Palmer (2004), test tasks must not only ask test takers to demonstrate language knowledge but also involve them in language functions.

b) Strategic Competence

Strategic competence is “a set of metacognitive components or strategies, which can be thought of as higher order executive processes that provide a cognitive management function in language use, as well as in other cognitive activities” (Bachman & Palmer, 2004, p. 70). There are three activities involved in


e. Impact

Impact means certain values/goals and the consequences of the acts of administering and taking a test; it concerns the importance of the test to individuals, education systems, and society (Bachman & Palmer, 2004, p. 30). According to Bachman and Palmer (2004), “[t]he impact of test use operates at two levels: a micro level, in terms of the individuals who are affected by the particular test use, and a macro level, in terms of the educational systems or society” (p. 30). In addition to these two levels, one more aspect needs to be considered in the discussion of impact: washback.

1) Washback

Hughes defined washback as “the effect of testing on teaching and learning” (as cited in Brown, 2004, p. 28). Washback is supposed to help test takers enhance aspects of their language acquisition, such as intrinsic motivation, autonomy, self-confidence, language ego, interlanguage, and strategic investment (Brown, 2004). In the context of large-scale assessment, Brown (2004) defines washback as “effects the tests have on instruction in terms of how students


2) Impact on Individuals

a) Impact on Test Takers

According to Bachman and Palmer (2004), test takers might be affected by three testing phases. The first phase is “the experience of taking and, in some cases, of preparing for the test…” (Bachman & Palmer, 2004, p. 31). Bachman and Palmer (2004) narrow this first phase into one question, which is “[t]o what extent might the experience of taking the test or the feedback received affect characteristics of test takers that relate to language use?” (p. 146). The characteristics of test takers meant are topical knowledge, language knowledge, strategic investment (Bachman & Palmer, 2004), autonomy, self-confidence, language ego, interlanguage, and intrinsic motivation (Brown, 2004).

The second phase of testing is “[t]he feedback they receive about their performance on the test…” (Bachman & Palmer, 2004, p. 31). The second phase is specified into two questions, which are “[w]hat provisions are there for involving test takers directly, or for collecting and utilizing feedback from test takers in the design and development of the test?” (Bachman & Palmer, 2004, p. 146) and “[h]ow relevant, complete, and meaningful is the feedback that is provided to test takers?” (Bachman & Palmer, 2004, p. 146).

The third phase is “[t]he decisions that may be made about them on the basis of their test scores” (Bachman & Palmer, 2004, p. 31). It includes four questions. They are: 1) “Are decision procedures and criteria applied uniformly to all groups of test takers?” (Bachman & Palmer, 2004, p. 146) 2) “How relevant


Palmer, 2004, p. 146)? 3) “Are test takers fully informed about the procedures and criteria that will be used in making decisions?” (Bachman & Palmer, 2004, p. 147) and 4) “Are these procedures and criteria actually followed in making the decisions?” (Bachman & Palmer, 2004, p. 147)

b) Impact on Teachers

According to Bachman and Palmer (2004), three questions need to be answered regarding the impact of a test on teachers. They are: 1) “How consistent are the areas of language ability to be measured with those that are included in teaching materials?” (Bachman & Palmer, 2004, p. 154) 2) “How consistent are the characteristics of the test and test tasks with the characteristics of teaching and learning activities?” (Bachman & Palmer, 2004, p. 154) 3) “How consistent is the purpose of the test with the values and goals of teachers and of the instructional program?” (Bachman & Palmer, 2004, p. 154)

3) Impact on Society and Education Systems

The last part of impact to be discussed is the impact on society and education systems. For this part, Bachman and Palmer (2004) suggest five questions to be answered. They are: 1) “Are the interpretations we make of the test scores consistent with the values and goals of society and the education system?” (Bachman & Palmer, 2004, p. 154) 2) “To what extent do the values and

Practicality = Available resources / Required resources

If practicality ≥ 1, the test development and use is practical.
If practicality < 1, the test development and use is not practical.

Figure 2.1 Practicality

consequences, both positive and negative, for society and the education system, of using the test in this particular way?” (Bachman & Palmer, 2004, p. 154) 4) “What is the most desirable positive consequence, or the best thing that could happen as a result of using the test in this particular way, and how likely is this to happen?” (Bachman & Palmer, 2004, p. 155) 5) “What is the least desirable negative consequence, or the worst thing that could happen as a result of using the test in this particular way, and how likely is this to happen?” (Bachman & Palmer, 2004, p. 155)

f. Practicality

Practicality is “the relation between the resources that will be required in the design, development, and use of the test and the resources that will be available for these activities” (Bachman & Palmer, 2004, p. 36). Bachman and Palmer (2004) use Figure 2.1 to describe the concept of practicality.


resources which were required for the design stage, the operationalization stage, and the administration stage. The second one is related to the resources which were available for carrying out the design stage, the operationalization stage, and the administration stage.

Related to the first question, Bachman and Palmer (2004) mention three types of resources. They are: human resources, material, and time.
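The relation in Figure 2.1 can be illustrated with a short sketch. The ratio and the threshold of 1 come from the figure; the function names, the resource quantities, and the rule that every resource type must reach a ratio of at least 1 are assumptions made only for this illustration and are not stated by Bachman and Palmer.

```python
def practicality(available, required):
    """Return available resources divided by required resources (Figure 2.1)."""
    if required <= 0:
        raise ValueError("required resources must be positive")
    return available / required


def is_practical(available_by_type, required_by_type):
    """Assumption: practical only if every resource type has a ratio >= 1."""
    return all(
        practicality(available_by_type[name], required_by_type[name]) >= 1
        for name in required_by_type
    )


# Hypothetical figures in arbitrary units, for illustration only.
required = {"human": 4, "material": 10, "time": 120}
available = {"human": 5, "material": 10, "time": 90}

print(practicality(available["time"], required["time"]))  # prints 0.75
print(is_practical(available, required))                  # prints False
```

In this hypothetical case the human and material resources are sufficient, but the available time covers only three quarters of what is required, so the test would not count as practical under the assumed rule.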

B. Theoretical Framework

From the National Examination SOP, it can be concluded that the National Examination is the examination which is conducted nationally and simultaneously throughout Indonesia. In 2012, it was conducted from April 16 to April 19. The Senior High School Bahasa Inggris National Examination was conducted on April 17, 2012. It consisted of 50 multiple-choice questions which the students were to complete in 120 minutes, and it covered listening, reading, and ‘writing’ skills. The National Examination was made on the basis of Kisi-Kisi UN, which was grounded on the Competence Standard and Basic Competence. Before taking the test, the National Examination participants had to fulfill some requirements. In its administration, the National Examination was organized by Badan Standar Nasional Pendidikan/BSNP (National Education Standard Agency). After the test


made since the choices of the answers were only limited to a, b, c, d, or e. The results of the National Examination were announced on May 24, 2012. The passing grade of the National Examination was 5.5. The results of the National Examination were used for four purposes, one of which was to determine student graduation. To decide student graduation, 60% of the National Examination result was combined with 40% of the School Examination result (Kelulusan UN, 2012).

The 2012 Senior High School Bahasa Inggris National Examination can be categorized as a test. More specifically, it is a formal, summative test. As a test, it is appropriate to analyze it with the six qualities of useful language tests theory proposed by Bachman and Palmer (2004). To find the correspondence between the 2012 Senior High School Bahasa Inggris National Examination and the six qualities of useful language tests, the researcher must analyze the test’s reliability, construct validity, authenticity, interactiveness, impact, and practicality.

Since reliability is defined as the consistency of a test “across different characteristics of the testing situation” (Bachman & Palmer, 2004, p. 19), the researcher analyzed the test reliability by checking the test’s consistency and dependability. More specifically, as recommended by Brown (2004), the researcher analyzed the consistency and dependability of the students, the raters, the test administration, and the test itself. To analyze the student-related reliability, based on Brown’s (2004) theory, the researcher examined the students’ physical and psychological


To analyze the test administration reliability, the researcher examined the condition of the photocopied test sheets received by the students, sound amplification, classroom conditions, time allotment, and test observer performance. To analyze the test reliability, based on Bachman and Palmer’s (2004) recommendation, the researcher examined the consistency and dependability of the test rubric, test input, expected response, and the relation between input and response.

Since construct validity is defined as “the evidence that the test score reflects the area(s) of language ability we want to measure…” (Bachman & Palmer, 2004, p. 21), to find out the test’s construct validity the researcher checked the relevance between: 1) the 2012 Senior High School Bahasa Inggris National Examination and Kisi-kisi UN and 2) Kisi-kisi UN and the Basic Competence and Competence Standard.

Authenticity is “the degree of correspondence of the characteristics of a given language test task to the features of a TLU task” (Bachman & Palmer, 2004, p. 23). Thus, to see the authenticity of the test, the researcher first identified “the critical features that define tasks in the TLU domain” (Bachman & Palmer, 2004, p. 23). After specifying the TLU tasks, the researcher checked whether the test tasks were relevant to the TLU tasks.

Interactiveness is defined “as the extent and type of involvement of the test taker’s individual characteristics in accomplishing a test task” (Bachman & Palmer, 2004, p. 25). The characteristics meant are test taker’s personal


order to find out the test interactiveness, the researcher checked if the test takers’ characteristics helped them to get involved in the test.


received by the society and education system by using the test in a particular way (Bachman & Palmer, 2004).


CHAPTER III

RESEARCH METHODOLOGY

This chapter describes the methodology used in this research: the research methods, research setting, research participants, research instruments and data gathering technique, data analysis technique, and research procedure.

A. Research Methods

In this research, the researcher used document/content analysis and sample survey as methods. Document analysis is a technique which enables researchers to find the meaning of certain materials (Ary, Jacobs, Razavieh, & Sorensen, 2010). It is “… focusing on analyzing and interpreting recorded material within its own context” (Ary, Jacobs, & Razavieh, 2002, pp. 22-28). Specifically, what is analyzed and interpreted from the recorded material is the characteristics of the material. Ary et al. (2010) state that document analysis “describes the characteristics of the materials” (p. 452). The recorded material is usually in written format. As Fraenkel and Wallen (2008) state, what is analyzed in document analysis is usually “the written contents of a communication” (p.


In this research, four written documents were analyzed by the researcher: the question sheet of the 2012 Senior High School Bahasa Inggris National Examination which was tested in Kota Yogyakarta, Kisi-kisi UN, the Basic Competence & Competence Standard, and the National Examination SOP. The analysis of the four documents was useful for this research, especially to answer the questions related to construct validity, authenticity, and practicality.

Even though Ary et al. (2010) classify document analysis as qualitative research, they also write that document analysis can be both qualitative and quantitative. In this research, the researcher used both the qualitative and the quantitative approach. The quantitative approach complemented the qualitative analysis; it was used when the researcher needed to find out the percentages of certain phenomena. Both approaches were used not only for the qualities analyzed by document analysis but also for the qualities analyzed by sample survey.

As stated above, in addition to document analysis, the researcher also used the sample survey method. Ary et al. (1990) write that a survey “samples populations in order to discover the incidence and distribution of, and the interrelationships among sociological, psychological, and educational variables” (p. 407). Further, Ary et al. (1990) state that the data of survey research are usually “responses to predetermined questions that are asked of a sample of respondents” (p. 407). It is also written that in survey research the researcher tends to “generalize


Specifically, a sample survey can be defined as “[a] survey that studies only a portion of the population” (Ary et al., 1990, p. 408). The researcher categorized her research method as a sample survey because she did not analyze the whole population of senior high schools. Instead, she picked only several senior high schools which were considered representative. The researcher decided to use the sample survey method because, as implied by Ary et al. (1990), analyzing the whole population is usually expensive.

In this research, the researcher used tangible and intangible sample surveys. The tangible sample survey was used especially in the analysis of test practicality, while the intangible sample survey was used specifically in the analysis of test impact. Both were used together in the analysis of test reliability and interactiveness.

The sample survey conducted in this research can be classified as a descriptive sample survey. Descriptive surveys are defined by Ary et al. (1990) as surveys which “inquire into the status quo; they attempt to measure what exists without questioning why it exists” (p. 407). In this research, especially when explaining test reliability, interactiveness, impact, and practicality, the researcher explains their existence in the 2012 Senior High School Bahasa Inggris National Examination without explaining why they existed. A clearer picture of the use of the research methods can be seen in Table 3.1 below.

Table 3.1 Implementation of Research Methods in Six Qualities

No.  Qualities           Methods Implemented
1.   Reliability         Sample Survey (Tangible & Intangible)
2.   Construct Validity  Document Analysis
3.   Authenticity        Document Analysis
4.   Interactiveness     Sample Survey (Tangible & Intangible)
5.   Impact              Sample Survey (Intangible)
6.   Practicality        Sample Survey (Tangible), Document Analysis

B. Research Setting


Table 3.2 Samples of the Research

Status / Accreditation   A       C
Public                   SMA A   -
Private                  SMA B   SMA C

C. Research Participants


Table 3.3 Research Participants

No. Participants SMA A SMA B SMA C

1. Students Student A Student B

Student C Student D

Student E Student F Student G Student H Student I Student J Student K Student L

2. Teachers Teacher W Teacher X

Teacher Y

Teacher Z

The student participants were chosen based on their cognitive levels. The researcher involved students whose cognitive level was high, average,
