VOCABULARY PROFILE IN THE TENTH GRADE ENGLISH TEXTBOOK USED IN SMK N 1 SALATIGA
THESIS
Submitted in Partial Fulfillment
of the Requirements for the Degree of
Sarjana Pendidikan
Raissa Junita Iwan
112012071
ENGLISH LANGUAGE EDUCATION PROGRAM
FACULTY OF LANGUAGE AND LITERATURE
SATYA WACANA CHRISTIAN UNIVERSITY
SALATIGA
COPYRIGHT STATEMENT
This thesis contains no such material as has been submitted for examination in
any course or accepted for the fulfillment of any degree or diploma in any
university. To the best of my knowledge and my belief, this contains no material
previously published or written by any other person except where due reference is
made in the text.
Copyright © 2016. Raissa Junita Iwan and Anne Indrayanti Timotius, M.Ed.
All rights reserved. No part of this thesis may be reproduced by any means
without the permission of at least one of the copyright owners or the English
Language Education Program, Faculty of Language and Literature, Satya Wacana
Christian University, Salatiga.
TABLE OF CONTENTS
COVER PAGE
STATEMENT OF NON-PLAGIARISM
STATEMENT OF ACCESS APPROVAL
APPROVAL PAGE
COPYRIGHT STATEMENT
TABLE OF CONTENTS
ABSTRACT
A. INTRODUCTION
B. LITERATURE REVIEW
The Definition of Vocabulary
The Importance of Vocabulary Learning
Problems That Appear in Vocabulary Learning
Vocabulary Profile
Four Types of Words
Relevant Previous Studies
C. THE STUDY
Context of the Study
Material
Data Collection Instruments
Data Collection Procedures
Data Analysis Procedure
D. FINDINGS AND DISCUSSIONS
1. Overall Result of Vocabulary Profile
2. Negative Vocabulary Profiles of the Textbooks
Negative VP of K-1
Negative VP of K-2
Negative VP of AWL
3. Block Frequency Output of Off-List Words
4. Comparison of Vocabulary Frequency Levels of the Textbooks
5. Text Comparison of the Textbooks
Comparison of Chapter 1 vs. Chapter 5
Comparison of Chapter 6 vs. Chapter 5
CONCLUSION
REFERENCES
APPENDIXES
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
ABSTRACT
Vocabulary is among the most important elements of language learning because it
is one of the basic components of communication. However, problems may arise
when the material given does not suit the learners’ vocabulary level. This
study aimed to identify the vocabulary profile of the ‘Bahasa Inggris’ textbook
for grade X used in a vocational school (SMKN 1 Salatiga). The study had three
objectives: first, to find the vocabulary profile of the grade 10 English
textbook; second, to analyze the vocabulary that does not appear in the
textbook; and third, to produce the token recycling index of the textbook. A
descriptive method was used. All chapters of the book served as the sample, and
an electronic tool named The Compleat Lexical Tutor was used to identify the
vocabulary profile of the ‘Bahasa Inggris’ book. The study reached three
conclusions. First, the overall findings showed that the tokens were 80.25% K1,
7.77% K2, 4.20% AWL, and 7.78% Off-List words. Second, the proportion of word
families that did not appear in the textbook was 32.05% for K1, 72.11% for K2,
and 75.92% for AWL. Last, the two chapter comparisons gave two results: 83.28%
of the words in Chapter 5 also appeared in Chapter 1, and 80.63% of the words
in Chapter 5 also appeared in Chapter 6.
Keywords: vocabulary, vocabulary profile, English textbook
A. INTRODUCTION
Vocabulary is central to the development of language learning because it is
one of the basic components of communication. It serves as the medium for
conveying the meaning of ideas. Vocabulary is also an important part of
reading. According to Astika (2014), vocabulary learning is essential to
reading comprehension. Learners may face difficulties when they read and
encounter unfamiliar words.
Meanwhile, reading comprehension can be said as a success if most of the words
getting 95% of recognizable words to make the learning process occur. Therefore,
foreign language learners should build a rich knowledge of vocabulary through
adequate language learning.
Since vocabulary is essential to comprehending reading texts, it is
important for the teacher to know which level of vocabulary is appropriate for
the learners. Learners at different levels need vocabulary at different
levels. Milton (2009) said that the learning burden of the vocabulary that
learners need to learn depends on word-internal factors such as form,
cognateness, abstractness, and word length. That is why it is important for
learners to receive vocabulary at an appropriate level, especially when
comprehending reading texts.
Vocabulary is an essential part of any foreign language learning
activity, in both formal and informal education. In formal education,
vocabulary is taught in every school through the English subject, including
in vocational schools. Budiantri, Nitiasih, and Budasih (2013) said that the
English subject in vocational high schools aims to train students to
communicate at an intermediate level of English.
Because of its importance in language learning, vocabulary needs to be
learnt. By using a vocabulary profiler, students and teachers can identify the
vocabulary that needs to be taught. “Vocabulary profile is a tool to measure the
vocabulary production that is contained in materials” (Astika, 2014). It describes
and calculates the frequency of word used and which groups that the words are
contained in the students’ textbook. By identifying the vocabulary profile of the
textbook, teachers can more easily clarify the relation between the vocabulary
used in the textbook and the students’ capability in English. Different schools
have students at different levels. To make the learning process effective, the
teacher must know whether the textbook used as the learning tool is
appropriate for the students’ level. Students with
limited vocabulary knowledge may have difficulty coping with the
vocabulary of the textbook if that vocabulary is somewhat complex.
This study aims to investigate the vocabulary profile of the Vocational High
School English textbook for grade X by using a vocabulary profiler. Furthermore,
this study aims to analyze the kinds of vocabulary that appear most often in the
textbook. The book, entitled ‘Bahasa Inggris’, was published by KEMENDIKBUD in
2014. It has been used as the English learning guide in SMKN 1 Salatiga. The
study addresses three research questions:
1. What is the vocabulary profile in the English textbook of the tenth
grade used in SMKN 1 Salatiga?
2. What is the percentage of vocabulary that is not included in the
textbook?
3. What is the token recycling index of the textbook?
From the research questions, the following objectives arise:
a. To find the vocabulary profile of the English textbook of the tenth
grade used in SMKN 1 Salatiga.
b. To identify the vocabulary words that are not included in the textbook.
c. To produce the token recycling index of the textbook.
This study could benefit English teachers because, by knowing
the vocabulary profile of the tenth-grade book, teachers can pay attention to
the frequency levels of the vocabulary that appears in the book (K1, K2, AWL,
and Off-List words). The study may also help teachers prepare the words that
need to be taught to the students. It may also help the book’s
publisher revise the vocabulary level of the book if
some words do not fit the students’ level.
B. LITERATURE REVIEW
The Definition of Vocabulary
Vocabulary is very important for the success of language
learning. According to Huyen and Nga (2003), vocabulary is the collection of
words that an individual knows. Olmos (2009) added that vocabulary is the
basic tool for creating and communicating meaning. In other words,
vocabulary is a set of words that serves as an important communication tool.
The Importance of Vocabulary Learning
The need for and importance of vocabulary learning for Foreign Language
Learners (FLL) have been analyzed by several experts. Matsuoka and Hirsh (2010)
said that learners need to recognize 95% of the words for the learning
process to occur. Schmitt, Jiang, and Grabe (2011) added that FLL need to
understand around 95% to 98% of the words in a written text.
any fields. According to Richard and Renandya (2002), vocabulary is a
basic component of language proficiency. It also underlies
the learners’ speaking, listening, reading, and interaction with others. By
learning vocabulary, FLL will be able to achieve the other skills in English
learning. The more complex the vocabulary that FLL master, the higher their
level of English. White, Graves and Slated (1990, as cited in Wessels, 2011)
said that learners’ level in language learning is best measured by
their vocabulary knowledge.
Problems That Appear in Vocabulary Learning
In the language learning process, learners inevitably face some barriers.
In vocabulary learning, problems sometimes arise
while the vocabulary is being absorbed. Milton (2009) said, “The
vocabulary that learners are required to learn may expose the different degree of
learning burden depending on word-internal factors such as forms, cognateness,
abstractness, and word length”. The students’ level of language learning is
determined by the amount of vocabulary that they need to master. Schmitt
(2000) said that students at early stages should learn about 1,000-2,000
high-frequency words and then extend their knowledge to about 3,000 word
families to be able to read authentic texts, which may include the academic
words needed for reading material at the university level. Problems appear when
FLL do not receive vocabulary at the right level and are confused by the
vocabulary they encounter.
Problems also appear for some learners who get the vocabulary that is too hard for
Vocabulary Profile
To address these problems, strategies for vocabulary learning have been
developed. One strategy is to examine the vocabulary profile
of English learning materials such as textbooks and handouts. According to
Astika (2014), a vocabulary profile is used to determine which group the
vocabulary in the materials belongs to. Graves (2005) added
that a vocabulary profile is a set of vocabulary words that are used frequently.
A vocabulary profile helps FLL judge, based on the word lists, whether the
material suits their capability. Morris and Cobb (2004) said that a vocabulary
profile provides a breakdown of the text by percentage for each word list.
According to Meara (2005), a vocabulary profile or Lexical Frequency Profile
(LFP) is used as a tool for assessing whether a particular text is suitable for
use with learners at a particular level of proficiency.
Four Types of Words
Four types of words are analyzed using the vocabulary
profile. Nation (1990) said that there are four types of word frequency. The
first two are high-frequency words, usually called K1 and K2 in the Vocabulary
Profiler. According to Cooper (2002), high-frequency words are the words that
are found most often in all kinds of text. K1 covers the 1-1,000 most frequent
words, and K2 covers words 1,001-2,000. In K1, the words are divided into two
parts, function and content words. According to Saville and Troike (2006),
function words are the words that give grammatical meaning to a sentence such as a
whereas content word is the word that has literal meaning. The third type is
academic words or AWL (Academic Word List). Nation (1990) added that AWL
words are those that occur around 800 times, or 8%, in most kinds of academic
texts. The last type is Off-List words. Nobert and Diane (2012) said that
Off-List words are the words that do not belong to any of the other lists
(K1, K2, and AWL). They include proper names,
words from other languages, and misspelled words. The tables below give examples
of the four kinds of words.
Table 1. High Frequency Words (K1) (www.vocabulary.com)
Function Content
A And Many New Other
Had By Or Live Now
Have I Had Take Boy
Is Of Will Thing Paper
Table 2. High Frequency Words (K2) (www.vocabulary.com)
Approve Diligent Moody Presence Distinguish
Favor Appointed Excellent Useless Administration
Pregnant Confidence Appropriate Servant Appointed
Enormous Better Supply Justify Charming
Table 3. Academic Words (www.vocabulary.com)
Approach Legal Occur Create Environment
Indicate Available Policy Data Established
Assessment Benefit Context Definition Significant
Table 4. Off-List Words (www.vocabulary.com)
Fauntleroy Asthma Conspiracy Discalced Excursus
Rubicon Ballyhoo Cormorant Drivel Freesia
Affiance Blucher Corollary Divergence Gambrel
Affidavit Bulgur Courier Endogenous Hebetude
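The four-way classification illustrated in Tables 1 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three sets below are tiny samples taken from the tables, not the real word lists, and the helper names `band_of` and `profile` are hypothetical. A real profiler such as the one used in this study also matches words by word family rather than by exact form.

```python
import re
from collections import Counter

# Tiny illustrative samples taken from Tables 1-3; the real lists are far longer.
K1 = {"a", "and", "many", "new", "other", "have", "is", "of", "take", "boy"}
K2 = {"approve", "diligent", "moody", "excellent", "useless"}
AWL = {"approach", "legal", "occur", "create", "environment", "data"}


def band_of(word):
    """Return the frequency band of one word (hypothetical helper)."""
    w = word.lower()
    if w in K1:
        return "K1"
    if w in K2:
        return "K2"
    if w in AWL:
        return "AWL"
    # Anything not on the three lists falls into the Off-List category.
    return "Off-List"


def profile(text):
    """Count the tokens of a text per frequency band."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return Counter(band_of(t) for t in tokens)


sample = "A diligent boy and a moody cormorant create new data"
counts = profile(sample)
print(counts)
```

In the sample sentence, "cormorant" is the only word outside the three sample lists, so it is the only Off-List token.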
Relevant Previous Studies
Studies using vocabulary profilers have been carried out by many
researchers. Morris and Cobb (2001), in their article ‘The use of
vocabulary profiles in predicting the academic and pedagogic performance of
TESL trainees’, aimed to examine the potential of vocabulary profiles as
predictors of academic performance in undergraduate Teaching English as a
Second Language (TESL) programs. They used the vocabulary profile to analyze
the writing of 122 TESL students and examined whether the results
correlated with the grades the students received in their program of study. The
study found that the vocabulary profile results correlated significantly with
the grades. The words contained in the students’ writing were consistent with
the word levels that the TESL students had learned before. The study also found
that the vocabulary profile
proved useful in setting a standard to create an appropriate
The other study comes from Graves (2005), ‘Vocabulary Profiles
of Letters and Novels of Austen and her Contemporaries’. This study aimed
to analyze and compare the vocabulary of three authors: Jane
Austen, Fanny Burney, and Maria Edgeworth. He analyzed whether they were similar
in their word choice. Graves used the vocabulary profile to compare the word
frequencies used by the same author in two or more texts. A text
should have at least twenty-five thousand words to create a vocabulary profile
that accurately represents the writer’s style. The study analyzed
three novels in the control group to produce a profile word set that was most
suitable for differentiating the three authors. The first analysis resulted in
12 words that occur in every novel: ‘on, upon, again, already, till,
enough, however, thus, that, then, where, and why’. Those words were then
analyzed to find the correlation between the novels and letters that each author
wrote. The result revealed that the correlation in word choice between novels
and letters was stronger for Austen and Burney than for Edgeworth. It showed
that the authors basically wrote in a similar manner in both novels and letters.
Both studies analyzed the vocabulary profile of a source text, which can be
a book or a written text. From the analysis, both studies then identified
whether the words in the text were relevant to its subject. Both studies
also used the vocabulary profiler as the tool to analyze the vocabulary profile
of the text. However, the contexts of the two studies were different. Morris and
Cobb (2001) in their article discussed the vocabulary profile that was used to
(2005) in his article compared literary texts by three different
authors and analyzed their writing style in novels, poetry, and other
literary texts. The two studies by Morris and Cobb (2001) and Graves (2005)
have a purpose similar to that of this study. Like both articles, this study
analyzes the vocabulary profile of a text and identifies whether the
words are appropriate for the level of the text. However, this study analyzed a
textbook for senior high school students, whereas the two articles
analyzed texts at a considerably higher academic level and literary texts.
C. THE STUDY
The study examined the vocabulary profile of the textbook titled ‘Bahasa
Inggris’ (2014), published by Kementrian Pendidikan dan Kebudayaan
Republik Indonesia. The study used a descriptive
method. According to Rivera and Rivera (2007), the descriptive method
is used to explore the facts that appear based on professional judgment or
theories. This method is used to identify whether the words of the textbook
belong to the first 1,000-word list (K1), the 1,001-2,000 word list (K2),
the academic word list (AWL), or the off-list words.
Context of the Study
The study took place at SMK Negeri 1 Salatiga, one of the vocational schools in
Salatiga. The school is located at Jl. Nakulo
Sadewo 1/3 Salatiga. There are several kinds of the program in this school:
Tata Busana, and Tata Boga. The study used a vocational school as the sample
because vocational schools prepare their students to compete in the
occupational world, so it is important for them to have a good foundation in
English to support them in the world of work later. SMK N 1 Salatiga was
chosen because it is considered one of the best vocational schools in
Salatiga. The writer’s personal experience of doing the Teaching Practicum at
SMK N 1 Salatiga was another reason for choosing it as the
context of the study. The researcher could easily borrow the
textbook used by SMK N 1 Salatiga because of the good relationship
maintained since the writer’s Teaching Practicum there.
Material
This study used all chapters (9 chapters) of the ‘Bahasa Inggris’ textbook
for the tenth grade. Tenth grade was chosen for several reasons. The first
reason was that tenth grade is the transition period between junior high school
and senior high school, which have different levels of English. The second
reason was that
tenth grade is where the students need to strengthen their basic knowledge of
Figure 1. The Cover Side of English Textbook ‘Bahasa Inggris’ Grade 10
Data Collection Instruments
The study analyzed the vocabulary profile of the English textbook
using an electronic tool named The Compleat Lexical Tutor. This tool can be
accessed at www.lextutor.ca/vp. The program is a vocabulary profiler
that was used to examine the vocabulary used in the textbook.
Data Collection Procedures
The data were collected by copying or retyping all words from all chapters of
the textbook into Microsoft Word. Each chapter was saved in its own Microsoft
Word file. In total, there were 9 Microsoft Word files to be
analyzed by the vocabulary profiler. Names of persons, names of towns, and
numbers were deleted from all of the texts.
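The deletion step described above could be sketched as follows. This is a minimal illustration, not the procedure actually used: the thesis does not say how names were identified, and the deletion may well have been done by hand. The name list and the helper `clean_chapter` are hypothetical.

```python
import re

# Hypothetical list of proper names to strip; the thesis does not give one.
NAMES_TO_REMOVE = {"Salatiga", "Raissa"}


def clean_chapter(text, names=NAMES_TO_REMOVE):
    """Drop number tokens and listed proper names, keep everything else."""
    kept = []
    for token in text.split():
        # Compare without surrounding punctuation so "Salatiga," still matches.
        core = token.strip(".,;:!?()\"'")
        if core in names:
            continue
        # Skip tokens made only of digits and separators, e.g. "2" or "1,000".
        if re.fullmatch(r"[\d.,]+", core):
            continue
        kept.append(token)
    return " ".join(kept)


cleaned = clean_chapter("Raissa lives in Salatiga with 2 cats.")
print(cleaned)
```

Each cleaned chapter could then be pasted into the profiler separately, matching the one-file-per-chapter layout described above.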
Data Analysis Procedure
There were several steps in analyzing the data copied into the
Microsoft Word files. First, open the vocabulary profiler website at
http://www.lextutor.ca/vp and choose the programs that are needed (Vocabulary
Profile, Frequency, Lex...., and ...). Next, paste the text from the Microsoft
Word file into the box provided and click submit under the box. The result of
the vocabulary profile appears after the program analyzes the text pasted in the
box. Save the analyzed result from the program. The vocabulary profiler’s result
can also be saved in Microsoft Word by clicking the editor print-friendly table.
The vocabulary profile of all chapters of the textbook
is calculated automatically by the vocabulary profiler site. The analysis of the
vocabulary that does not appear in the textbook is also shown, as is the
token recycling of the textbook. The data are grouped into
K1, K2, Academic Word List (AWL), and Off-list words. The result would be
D. FINDINGS AND DISCUSSIONS
The study discussed the vocabulary profile of ‘Bahasa Inggris’ grade X
(2014). Nine chapters were analyzed using The Compleat Lexical Tutor V.4.
The results were divided into five major parts. The first part
showed the overall vocabulary profile of the textbook, classifying the
proportion of vocabulary at each frequency level. The second part
presented the negative vocabulary profiles of K1, K2, and AWL along with the
lists of words that were not used in any chapter. The third part showed the
block frequency output of Off-List words. The fourth part discussed the
comparison of vocabulary frequency levels across the chapters of the textbook.
The last part presented the comparison of chapters whose vocabulary items seemed
unique. The last part also showed the token recycling index that
1. Overall Result of Vocabulary Profile
Table 5. Overall Vocabulary Profile
                             FAMILIES (%)    TYPES (%)         TOKENS (%)         CUMULATIVE %
K-1 WORDS                    661 (61.43%)    1,116 (52.05%)    10,774 (80.25%)    80.25%
K-2 WORDS                    277 (25.74%)    375 (17.49%)      1,043 (7.77%)      88.02%
AWL (570 FAMS, TOT: 2,570)   138 (12.83%)    181 (8.44%)       564 (4.20%)        92.22%
OFF-LIST                     ?               472 (22.01%)      1,044 (7.78%)      100%
TOTAL                        1,076+?         2,144 (100%)      13,425 (100%)
Table 5 shows the overall findings for the ‘Bahasa Inggris’ textbook (2014).
The first row of the table contains three terms: families, types, and tokens. A
word family is a head word together with the words derived from it. For example,
‘friend’ is the head word of ‘friendly’. Types are the distinct word forms in a
text. For example, ‘friend’ and ‘friendly’ belong to the same family but count
as two types, and ‘mother’ and ‘moved’ are likewise different types. Tokens are
the running words
in a text. For example in the Vocabulary Profile showed another [1] answer[2]
The findings in Table 5 show that more than half of the
vocabulary in the textbook belonged to K1. Hirsch (2003) said that
readers need to understand at least 95% of the words to
comprehend a text. With the K2 share of
7.77%, K1 and K2 together covered 88.02%. Therefore, it could be said
that the textbook was fairly hard to comprehend. With the AWL percentage added,
the cumulative percentage of K1, K2, and AWL was only 92.22%. Therefore,
students may also need to understand the Off-List words even
though those words occur infrequently (low-frequency words). Teachers
should also pay attention to the Off-List words and select the important words
that are appropriate for the students’ level.
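The cumulative percentages discussed above can be checked directly from the token counts in Table 5. The short sketch below recomputes them and measures how far K1 and K2 together fall short of the 95% threshold; the variable names are illustrative only.

```python
# Token counts per band, copied from Table 5.
tokens = {"K1": 10774, "K2": 1043, "AWL": 564, "Off-List": 1044}
THRESHOLD = 95.0  # comprehension threshold cited in the text

total = sum(tokens.values())
running = 0
cumulative = {}
for band, count in tokens.items():
    # Add each band in table order and record the running coverage in percent.
    running += count
    cumulative[band] = round(100 * running / total, 2)

# Distance between the K1 + K2 coverage and the 95% threshold.
k1_k2_gap = round(THRESHOLD - cumulative["K2"], 2)
print(cumulative, k1_k2_gap)
```

The recomputed values agree with the table: 80.25% for K1, 88.02% with K2 added, and 92.22% with AWL added, which leaves K1 and K2 about seven percentage points below the threshold.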
2. Negative Vocabulary Profiles of the Textbooks
This section presents the negative vocabulary profiles.
The negative vocabulary consists of the listed words that were not included in
the textbook. These words were identified by comparing the words in the New
General Word List (web) with the words in the textbook used in this study. The
negative vocabulary profiles of K1, K2, and AWL are presented below. The list of
negative K1 words can help teachers choose words that students could
use in developing their vocabulary knowledge even the words were not included
Negative VP of K-1
The result shows all the K1 word families, or head words, that were not
found in the textbook. The summary of the negative vocabulary profile for the
K-1 level is presented below.
K-1 Total word families: 964
K-1 families in input: 656 (68.05%)
K-1 families not in input: 309 (32.05%)
The percentages refer only to the number of word families; tokens
were not counted. As the summary shows,
68.05% of the word families were found in the textbook. It means that
32.05% of the word families listed in the New General Service List (NGSL)
were not included in the book.
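A negative profile is essentially a set difference between the families on a word list and the families found in the text. The sketch below shows the idea on a hypothetical five-word list and then repeats the percentage arithmetic on the K-1 counts reported above; `negative_profile` is an illustrative helper, not part of the profiler.

```python
def negative_profile(list_families, text_families):
    """Return the listed families missing from the text and their share in percent."""
    listed = set(list_families)
    missing = sorted(listed - set(text_families))
    share = round(100 * len(missing) / len(listed), 2)
    return missing, share


# Hypothetical miniature word list and text, for illustration only.
toy_list = ["across", "allow", "friend", "mother", "student"]
toy_text = ["friend", "mother", "student", "niagara"]
missing, share = negative_profile(toy_list, toy_text)

# The same arithmetic applied to the K-1 counts reported above (964 families).
k1_found_pct = round(100 * 656 / 964, 2)
k1_missing_pct = round(100 * 309 / 964, 2)
print(missing, share, k1_found_pct, k1_missing_pct)
```

Both reported percentages are reproduced when 964 is used as the denominator.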
Below are some of the word families that were not found in
ACCOUNTABLE
ACROSS
ACTRESS
ADDRESS
ADMIT
ADOPT
ADVANCE
ADVANTAGE
ALLOW
ALMOST
ALONE
ALONG
ALREADY
ARISE
ARM
ARMY
ARTICLE
ATTACK
ATTEMPT
AVERAGE
BANK
BAR
BATTLE
BEAR
BANK
BED
BENEATH
BEYOND
BILL
BREAD
BREAK
BROAD
CASE
Negative VP of K-2
The data show the K2 word families that were not included in the
textbook.
The result did not count the K2 tokens in the textbook, only the K2 word
families. As much as 27.99% of the K2 word families were found in the
textbook. It means that 72.11% of the K2 word families listed in the
New General Service List (NGSL) were not found. Below are some of the ‘missing’
K2 words from the textbook. The complete word list can be found in Appendix B.
K-2 Total word families: 986
K-2 families in input: 276 (27.99%)
ABROAD
ABSENCE
ABSOLUTE
ABSOLUTELY
ACCUSE
ACCUSTOM
ACHE
ADVERTISE
ADVICE
AEROPLANE
AFFORD
AGRICULTURE
AHEAD
AIRPLANE
ALIVE
ALOUD
ALTOGETHER
AMBITION
ANGER
ANGLE
APART
APOLOGIZE
APOLOGY
APPLAUD
APPLAUSE
APPLE
ARCH
ARREST
ARTIFICIAL
ASH
ASHAMED
ASIDE
ASLEEP
Negative VP of AWL
The data show all AWL word families that were not
found in the textbook. The percentages do not refer to the AWL tokens
in the book.
K-3 Total word families: 569
K-3 families in input: 138 (24.25%)
K-3 families not in input: 432 (75.92%)
From the data above, 24.25% of the AWL word families were found in
the textbook. It means that 75.92% of the word families did not appear in the
textbook based on the words in the New General Service List (NGSL).
The following are some word families that did not appear in the
ABSTRACT
ACADEMY
ACCUMULATE
ACCURATE
ACQUIRE
ADAPT
ADEQUATE
ADJACENT
ADJUST
ADMINISTRATE
ADULT
ADVOCATE
AFFECT
AGGREGATE
AID
ALBEIT
ALLOCATE
ALTER
ALTERNATIVE
AMBIGUOUS
AMEND
ANALOGY
ANALYSE
ANTICIPATE
APPARENT
APPEND
APPROACH
APPROPRIATE
APPROXIMATE
ARBITRARY
ASPECT
ASSEMBLE
ASSESS
ASSIST
ASSUME
ASSURE
3. Block Frequency Output of Off-List Words.
The words listed below belong to the ‘Off-list’ category, that is, words that
do not belong to the other three categories. Using a feature of the Vocabulary
Profiler, the Off-list words were frequency-blocked per ten words and
arranged from high to low frequency. It is hoped that the list gives
teachers useful information about which Off-List words are important to
teach. Teachers can also choose which words are
appropriate for tenth-grade Senior High School students. Below is the list of
‘Off-list’ words, with 1,044 tokens and 472 types. The table shows four kinds of
information. RANK is the ranking of the word, FREQ is frequency of
or cumulative, and the vocabulary item or word. The complete block frequency
list is given in Appendix D.
Table 6. Block Frequency Output of Off-List Words.
RANK FREQ COVERAGE (individual) COVERAGE (cumulative) WORD
1. 39 3.74% 3.74% ANNOUNCEMENT
2. 22 2.11% 5.85% NIAGARA
3. 20 1.92% 7.77% STONEHENGE
4. 18 1.72% 9.49% ADJECTIVES
5. 18 1.72% 11.21% EMAIL
6. 18 1.72% 12.93% VOCABULARY
7. 16 1.53% 14.46% COMPLIMENT
8. 15 1.44% 15.90% DURRINGTON
9. 12 1.15% 17.05% COOKIES
10. 12 1.15% 18.20% JUNGLE
11. 12 1.15% 19.35% PHRASES
12. 11 1.05% 20.40% ADJECTIVE
13. 10 0.96% 21.36% COMPLIMENTS
14. 10 0.96% 22.32% CONCERT
15. 8 0.77% 23.09% AMAZING
16. 8 0.77% 23.86% CLASSMATES
17. 8 0.77% 24.63% COMPREHENSION
18. 8 0.77% 25.40% PRONUNCIATION
19. 8 0.77% 26.17% WATERFALL
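The layout of Table 6 can be reproduced from a plain list of Off-List tokens. The following sketch ranks words by frequency and adds individual and cumulative coverage; the token list is a hypothetical toy example and `frequency_table` is an illustrative name. Cumulative values here are computed from raw counts, so they may differ in the last digit from a table that sums already-rounded percentages.

```python
from collections import Counter


def frequency_table(off_list_tokens):
    """Rank words by frequency with individual and cumulative coverage (%)."""
    counts = Counter(off_list_tokens)
    total = sum(counts.values())
    rows, running = [], 0
    # Sort by descending frequency, then alphabetically, as in Table 6.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for rank, (word, freq) in enumerate(ordered, start=1):
        running += freq
        rows.append((rank, freq, round(100 * freq / total, 2),
                     round(100 * running / total, 2), word))
    return rows


# Hypothetical toy input: 8 Off-List tokens, 4 types.
toy = ["email"] * 3 + ["jungle"] * 2 + ["concert"] * 2 + ["cookies"]
rows = frequency_table(toy)

# Individual coverage of the top word in Table 6: 39 of 1,044 tokens.
top_share = round(100 * 39 / 1044, 2)
print(rows, top_share)
```

The last line reproduces the 3.74% individual coverage reported for the first row of Table 6.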
4. Comparison of Vocabulary Frequency Levels of the Textbooks
Below is the comparison of the frequency of K1, K2, AWL, and Off-list words
across the chapters of the textbook. Table 7 gives a broad picture of the
differences in frequency among the chapters.
Table 7. Comparison of word frequency levels
The table above shows the word frequency results for each chapter. The
chapters did not appear to differ significantly in their percentages of
K1, K2, AWL, and Off-list words. Therefore, it may be concluded that the
difficulty level of the chapters of the textbook was similar. The AWL proportion
was not as high as that of the other kinds of words. However, the proportion of
AWL might still be a challenge for the students since AWL words
usually appear in academic texts, so the students might face difficulty
in understanding the textbook. The cumulative percentage of K1 and K2 suggests
that all chapters are still hard to comprehend because in every chapter it
was below 95%, an important threshold for an understandable text.
5. Text Comparison of the Textbooks
Comparison of Chapter 1 vs. Chapter 5
The comparison between two chapters of the textbook is the last section
discussed. The comparison yields the token
recycling index between the two chapters. The recycling index is the ratio of
the tokens in the second chapter that also occur in the first chapter to the
total number of tokens in the second chapter. The index provides useful
information about the words that the two chapters share and the words that are
unique to the second chapter. The result may help teachers focus on the words
that are unique to the second chapter. The first comparison was between Chapter
1 and Chapter 5. The two chapters were chosen because they had contrasting K1
results. Chapter 1
had the highest proportion of K1 whereas Chapter 5 was calculated as the
83.28%, indicating that as much as 83.28% of the words in Chapter 5 also
appeared in Chapter 1. This percentage suggests that the textbook is difficult
to comprehend because the result was below the 95% of recognizable words
proposed in the theory. Accordingly, the new or unique words in Chapter 5
amounted to 16.72%. The shared and unique words are shown in
Table 8 below. Next to each word is a number showing
how many times the word appeared.
Table 8. Comparison of Chapter 1 and Chapter 5.
The complete table could be seen in Appendix E.
Unique to first
595 tokens
317 families
001. student 10
002. sister 9
003. attend 7
004. mother 7
005. music 7
006. pal 7
Shared
817 tokens
177 families
001. the 66
002. be 38
003. she 38
004. you 32
005. to 29
006. friend 24
Unique to second
164 tokens
112 families
Freq first (then alpha)
001. point 6
002. hair 4
003. photograph 4
004. tall 4
005. company 3
006. face 3
Same list Alpha first
001. #number 1
002. adventure 1
003. alike 1
004. appear 2
005. bad 1
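The token recycling index reported for Table 8 can be sketched as follows. The function is a simplified illustration that compares exact word forms, whereas the profiler compares word families; the two example sentences are hypothetical. The last lines apply the same ratio to the token counts reported in Table 8.

```python
def recycling_index(first_tokens, second_tokens):
    """Percentage of tokens in the second text whose word also occurs in the first."""
    first_vocab = set(first_tokens)
    shared = sum(1 for t in second_tokens if t in first_vocab)
    return round(100 * shared / len(second_tokens), 2)


# Hypothetical miniature "chapters", for illustration only.
first = "the student and her sister attend music class".split()
second = "the tall student has long hair".split()
toy_index = recycling_index(first, second)

# Reported token counts for Chapter 1 vs. Chapter 5 (Table 8).
shared_tokens, unique_to_second = 817, 164
index_ch1_ch5 = round(100 * shared_tokens / (shared_tokens + unique_to_second), 2)
unique_ch5 = round(100 * unique_to_second / (shared_tokens + unique_to_second), 2)
print(toy_index, index_ch1_ch5, unique_ch5)
```

With 817 shared tokens out of 981 tokens in Chapter 5, the ratio gives the reported 83.28%, and the remaining 164 tokens give the 16.72% of unique words.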
Comparison of Chapter 6 vs Chapter 5
The second comparison was between Chapter 6 and Chapter 5. These two
chapters were chosen because they had contrasting K2 results:
Chapter 6 had the highest proportion of K2 while Chapter 5 had the lowest.
The data showed that the token recycling index was 80.63%,
which means that 80.63% of the words in Chapter 5 also appeared in Chapter 6.
Still, the two chapters were fairly hard to comprehend because the percentage
was below 95%. Accordingly, the unique words of the second chapter amounted to
19.37%. The shared and unique words are shown in the table below.
Table 9. Comparison of Chapter 6 and Chapter 5.
The complete table has been put in Appendix F.
Unique to first
873 tokens
397 families
001. noun 28
002. park 17
003. beauty 13
004. phrase 13
005. jungle 12
006. nation 12
Shared
791 tokens
177 families
001. the 66
002. be 38
003. she 38
004. you 32
005. to 29
006. friend 24
Unique to second
190 tokens
112 families
Freq first (then alpha)
001. picture 11
002. point 6
003. best 5
004. discuss 4
005. hair 4
006. photograph 4
VP novel items
Same list Alpha first
001. adventure 1
002. alike 1
003. appear 2
004. bad 1
005. best 5
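The percentages reported for the two comparisons can be checked against the token counts in Table 8 and Table 9. The short Python sketch below assumes that the token recycling index is the share of the second text's tokens (here Chapter 5) that also occur in the first text; the function name `recycling_index` is hypothetical.

```python
def recycling_index(shared_tokens, unique_second_tokens):
    """Return (shared %, unique %) of the tokens in the second text."""
    total = shared_tokens + unique_second_tokens
    shared_pct = round(100 * shared_tokens / total, 2)
    unique_pct = round(100 * unique_second_tokens / total, 2)
    return shared_pct, unique_pct


# Token counts as reported in Table 8 and Table 9.
chapter1_vs_5 = recycling_index(817, 164)
chapter6_vs_5 = recycling_index(791, 190)

print(chapter1_vs_5)  # (83.28, 16.72)
print(chapter6_vs_5)  # (80.63, 19.37)
```

In both comparisons the shared and unique counts add up to 981 tokens, which is consistent with Chapter 5 being the second text each time.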
CONCLUSION
This study aimed to answer three research questions: ‘What is the vocabulary
profile in the English textbook of the tenth grade used in SMKN 1 Salatiga?’,
‘What is the percentage of vocabulary that is not included in the textbook?’ and
‘What is the token recycling index of the textbook?’ It did so by investigating
the vocabulary profile of the grade X Vocational High School English textbook
using a vocabulary profiler. The findings showed the vocabulary profile of the
‘Bahasa Inggris’ book.
The findings revealed three major conclusions. First, the overall findings
showed that K1 was 80.25%, K2 was 7.77%, AWL was 4.20%, and Off-list words were
7.78%. Therefore, it can be concluded that about four fifths of the words in
the book belonged to K1. Second, 32.05% of K1, 72.11% of K2, and 75.92% of AWL
were not included in the textbook. By knowing the vocabulary that does not
appear in the textbook, teachers can select appropriate words for students’
language development. Last, there were two comparisons, Chapter 1 vs Chapter 5
and Chapter 6 vs Chapter 5. The first comparison showed that 83.28% of the
words were shared between Chapter 1 and Chapter 5. The second comparison showed
that 80.63% of the words were shared between Chapter 6 and Chapter 5.
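How a vocabulary profile and a negative vocabulary profile of this kind are derived can be sketched in a few lines of Python. The word lists and sample tokens below are tiny hypothetical stand-ins for the real K1, K2 and AWL lists, the function names are assumptions, and the sketch matches word forms rather than word families, so it illustrates the idea rather than the profiler used in this study.

```python
from collections import Counter

# Hypothetical miniature word lists standing in for the real K1, K2 and AWL lists.
WORD_LISTS = {
    "K1": {"the", "student", "friend", "go", "school"},
    "K2": {"breakfast", "telephone"},
    "AWL": {"abstract", "status"},
}


def profile(tokens):
    """Return the percentage of tokens that falls into each band."""
    counts = Counter()
    for token in tokens:
        # First list that contains the token; anything else is Off-list.
        band = next(
            (name for name, words in WORD_LISTS.items() if token in words),
            "Off-list",
        )
        counts[band] += 1
    return {band: round(100 * n / len(tokens), 2) for band, n in counts.items()}


def negative_profile(tokens, band):
    """Return the words of `band` that never occur in the text, with their share."""
    missing = sorted(WORD_LISTS[band] - set(tokens))
    return missing, round(100 * len(missing) / len(WORD_LISTS[band]), 2)


# Hypothetical sample tokens, not taken from the textbook.
tokens = ["the", "student", "go", "school", "the", "friend", "breakfast", "jungle"]

print(profile(tokens))
print(negative_profile(tokens, "K2"))
```

The first function corresponds to the overall profile (the share of tokens in each band), and the second to the negative profile (the share of a list that the text never uses).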
This study still has a limitation: it used only one textbook and one grade
level as the source of analysis. It would be better if a future study used more
than one textbook and analyzed more than one grade level. Such a study could
analyze the vocabulary profile of the 11th and 12th grade textbooks to obtain
richer results.
By knowing the vocabulary profile of the ‘Bahasa Inggris’ book, teachers are
expected to pay more attention to the vocabulary content of each chapter that
supports students’ knowledge. Teachers can also decide on their choice of words
when teaching the material, in writing or orally, based on the vocabulary
profile of the book. More research on other textbooks and other levels may be
beneficial for
REFERENCES
Astika, G. (2014). Profiling the vocabulary of news texts as capaity building for language teachers. Indonesian Journal of Applied Linguistics, 4(2), 257-266.
Bauer, Laurie. & Nation, P. (1993). Word Families. International Journal of Lexicography, 6 (4), 253-279.
Budiantri, P. Y., Nitisih, P. Y, & Budiasi, I. G. (2013). Developing authentic reading material for the tenth year students of state vocational high school 1 kubutambahan. Journal Program Pascasarjana Universitas Pendidikan Ganesha Program Studi Pendidikan Bahasa Inggris (1).
Graves, D. (2005). Vocabulary profiles of letters and novels of jane austen and her contemporaries. A publication of the Jane Austen Society of North America, 26 (1).
Huyen, N.T. & Nga, K. T. (2003). Learning vocabulary through games. Asian EFL Journal, 5 (4).
Matsuoka, W. & Hirsh, D. (2010). Vocabulary learning through reading : does an ELT course book provide good opportunity? Reading in a Foreign Language, 22 (1), 56-70.
Meara, P. (2005). Lexical frequency profiles : a monte carlo analysis. Applied Linguistics, 16 (1), 32-47.
29
Morris, L & Cobb, T. (2004). Vocabulary profile as predictors of the academic performance of teaching English as a second language trainees. System 32, 75-78.
Nation. (1990). Teaching and learning vocabulary. Victoria: Victoria University Wellington.
Norbert & Diane, S. (2012). Plenary speech as reassessment of frequency and vocabulary size in L2 vocabulary teaching. Cambridge: Cambridge
University Press.
Richard, C. J. and Renandya, A. W. (2002). Methodology in language teaching. Cambridge: Cambridge University Press.
Rivera, M. & Rivera, R. (2007). Practical guide to thesis and dissertation writing. Quezon city: Katha Publishing Inc.
Saville, M. & Troike. (2006). Introducing second language acquisition. Cambridge: Cambridge University Press.
Schmitt, N. (2000). Vocabulary in language teaching. Cambridge : Cambridge University Press.
Schmitt, N., Jiang, X. & Grabe , W. (2011). The percentage of words known in a text and reading comprehension. The modern Language Journal, 95, 26-43.
30
Vocabulary list (n.d.) Retrieved November 13, 2015, from
http://www.vocabulary.com/lists/
31
APPENDIXES
Appendix A
The Negative Vocabulary Profile of K1
ACROSS
ACTRESS
ADDRESS
ADMIT
ADOPT
ADVANCE
ADVANTAGE
ALLOW
ALMOST
ALONE
ALONG
ALREADY
ARISE
ARM
ARMY
ARTICLE
ATTACK
ATTEMPT
AVERAGE
BANK
BAR
BATTLE
BEAR
BED
BENEATH
BEYOND
BILL
BREAD
BREAK
BROAD
CASE
CASTLE
CAUSE
CHANCE
CHARGE
CHIEF
CHURCH
CLAIM
CLOUD
COAL
COAST
COIN
COLONY
COMMAND
COMMON
CONTROL
COST
COTTON
COUNCIL
COUNT
COURT
CROWD
CROWN
CURRENT
DANGER
DEAL
DECLARE
DEGREE
DEMAND
DESERT
DESIRE
DESTROY
DISTINGUISH
DISTRICT
EFFICIENT
EFFORT
INFLUENCE
IRON
JOINT
JOINTED
LITERATURE
LORD
LOW
MACHINE
MANUFACTURE
MARK
NECESSITY
NEITHER
OTHERWISE
OUGHT
OWE
PAGE
PER
PLAIN
POLITICAL
POOR
POPULATION
PROMISE
PROOF
PROPERTY
PROVE
PROVISION
PULL
RECOGNIZE
RECORD
SENSITIVE
SERIOUS
SUBSTANCE
TEAR
TEN
THIRTEEN
THIRTY
THURSDAY
THUS
TILL
TON
TOTAL
TOUCH
TOWARD
TRADE
TRUST
TUESDAY
TWELVE
TWENTY
UNDER
UNION
UNLESS
UPON
VARIETY
VESSEL
VICTORY
VIEW
VIRTUE
VOTE
WAGE
WAR
WEDNESDAY
WESTERN
WHOLE
WIFE
WILD
WINDOW
WISE
WITHIN
WORTH
WOUND
WRONG
YIELD
Appendix B
The Negative Vocabulary Profile of K2
ABROAD
ADVERTISE
ADVICE
AEROPLANE
AFFORD
ARTIFICIAL
BREAKFAST
BREATH
BREATHE
BRIBE
BRICK
BROADCAST
BROWN
CALCULATE
CANAL
CENTIMETRE
CHAIN
CHRISTMAS
CULTIVATE
CUP
CUPBOARDS
CURE
DISCIPLINE
TELEPHONE
TEMPER
TRANSLATE
WARN
WASH
WASTE
WEAK
WEAPON
WEAVE
WEED
WEIGH
WHEAT
WHIP
WHISTLE
WICKED
WIDOW
WINE
WING
WIPE
WIRE
WITNESS
WOOL
WORM
WORRY
WORSE
WORSHIP
WRAP
WRECK
WRIST
YARD
Appendix C
The Negative Vocabulary Profile of AWL
ABSTRACT
AGGREGATE
CORE
CORPORATE
GUIDELINE
INTRINSIC
INVEST
INVOKE
LEGISLATE
PRINCIPLE
STATISTIC
STATUS
SUFFICIENT
TRANSFORM
TRANSIT
TRANSMIT
TRIGGER
ULTIMATE
UNDERGO
UNDERLIE
UNDERTAKE
UNIFY
UTILISE
VALID
VARY
VEHICLE
VIOLATE
VIRTUAL
VISIBLE
VISION
VISUAL
VOLUME
VOLUNTARY
WHEREAS
WHEREBY
Appendix D
Block Frequency Output of Off-List Words
RANK FREQ COVERAGE (individual) COVERAGE (cumulative) WORD
1. 39 3.74% 3.74% ANNOUNCEMENT
2. 22 2.11% 5.85% NIAGARA
3. 20 1.92% 7.77% STONEHENGE
4. 18 1.72% 9.49% ADJECTIVES
5. 18 1.72% 11.21% EMAIL
6. 18 1.72% 12.93% VOCABULARY
7. 16 1.53% 14.46% COMPLIMENT
8. 15 1.44% 15.90% DURRINGTON
9. 12 1.15% 17.05% COOKIES
10. 12 1.15% 18.20% JUNGLE
11. 12 1.15% 19.35% PHRASES
12. 11 1.05% 20.40% ADJECTIVE
13. 10 0.96% 21.36% COMPLIMENTS
14. 10 0.96% 22.32% CONCERT
15. 8 0.77% 23.09% AMAZING
16. 8 0.77% 23.86% CLASSMATES
17. 8 0.77% 24.63% COMPREHENSION
18. 8 0.77% 25.40% PRONUNCIATION
19. 8 0.77% 26.17% WATERFALL
20. 7 0.67% 26.84% HOBBIES
21. 7 0.67% 27.51% IMPRESSIVE
23. 6 0.57% 28.65% EQUIVALENTS
24. 6 0.57% 29.22% MAGNIFICENT
25. 6 0.57% 29.79% PROBOSCIS
26. 6 0.57% 30.36% RAINBOW
27. 5 0.48% 30.84% BELLOW
28. 5 0.48% 31.32% BORING
29. 5 0.48% 31.80% CAPTIVE
30. 5 0.48% 32.28% CLASSMATE
31. 5 0.48% 32.76% DELICIOUS
32. 5 0.48% 33.24% DESTINATION
33. 5 0.48% 33.72% ECOTOURISM
34. 5 0.48% 34.20% ESSAY
35. 5 0.48% 34.68% EX
36. 5 0.48% 35.16% INTERNET
37. 5 0.48% 35.64% JACKET
38. 5 0.48% 36.12% MALL
39. 5 0.48% 36.60% MIST
40. 5 0.48% 37.08% MODIFIERS
41. 5 0.48% 37.56% MONUMENTS
42. 5 0.48% 38.04% PAL
43. 5 0.48% 38.52% WATERFALLS
44. 4 0.38% 38.90% ARCHEOLOGISTS
45. 4 0.38% 39.28% CANCEL
46. 4 0.38% 39.66% CANCELLATION
47. 4 0.38% 40.04% CIVILIZATIONS
48. 4 0.38% 40.42% COMMUTER
49. 4 0.38% 40.80% CONGRATULATION
50. 4 0.38% 41.18% DESTINATIONS
51. 4 0.38% 41.56% DROPLETS
52. 4 0.38% 41.94% EXHAUSTED
53. 4 0.38% 42.32% GIGANTIC
54. 4 0.38% 42.70% HABITAT
56. 4 0.38% 43.46% MUSEUM
57. 4 0.38% 43.84% PARKER
58. 4 0.38% 44.22% PEARSON
59. 4 0.38% 44.60% SNOUT
60. 4 0.38% 44.98% STADIUM
61. 3 0.29% 45.27% ARTISTE
62. 3 0.29% 45.56% BARISTA
63. 3 0.29% 45.85% BRAT
64. 3 0.29% 46.14% CHASE
65. 3 0.29% 46.43% CHIPS
66. 3 0.29% 46.72% CLUES
67. 3 0.29% 47.01% CONTENTEDLY
68. 3 0.29% 47.30% CONTEST
69. 3 0.29% 47.59% CUTE
70. 3 0.29% 47.88% DIALOGUE
71. 3 0.29% 48.17% DUSK
72. 3 0.29% 48.46% FLUENT
73. 3 0.29% 48.75% GORGE
74. 3 0.29% 49.04% HAIRCUT
75. 3 0.29% 49.33% HOBBY
76. 3 0.29% 49.62% ILLUMINATED
77. 3 0.29% 49.91% MAID
78. 3 0.29% 50.20% MODIFIER
79. 3 0.29% 50.49% MUSLIMS
80. 3 0.29% 50.78% NATIONALITY
81. 3 0.29% 51.07% NOVELS
82. 3 0.29% 51.36% ORALLY
83. 3 0.29% 51.65% OUTDOOR
84. 3 0.29% 51.94% PENINSULA
85. 3 0.29% 52.23% PERSONALITY
86. 3 0.29% 52.52% PHRASE
88. 3 0.29% 53.10% PRISTINE
89. 3 0.29% 53.39% SANCTUARY
90. 3 0.29% 53.68% SCENIC
91. 3 0.29% 53.97% SENIOR
92. 3 0.29% 54.26% SINGAPORE
93. 3 0.29% 54.55% SMART
94. 3 0.29% 54.84% SPLASH
95. 3 0.29% 55.13% TINY
96. 3 0.29% 55.42% TREMENDOUS
97. 3 0.29% 55.71% UNFORESEEN
98. 2 0.19% 55.90% AMAZED
99. 2 0.19% 56.09% AMAZINGLY
100. 2 0.19% 56.28% ANNOUNCE
101. 2 0.19% 56.47% ANNOUNCES
102. 2 0.19% 56.66% ANTS
103. 2 0.19% 56.85% APPARATUS
104. 2 0.19% 57.04% BASKETBALL
105. 2 0.19% 57.23% BEACH
106. 2 0.19% 57.42% BETRAYED
107. 2 0.19% 57.61% BIOLOGY
108. 2 0.19% 57.80% BLONDE
109. 2 0.19% 57.99% BORED
110. 2 0.19% 58.18% CAMPAIGN
111. 2 0.19% 58.37% CASUAL
112. 2 0.19% 58.56% CHAOTIC
113. 2 0.19% 58.75% CHINESE
114. 2 0.19% 58.94% CHOCO
115. 2 0.19% 59.13% CHUBBY
116. 2 0.19% 59.32% CIVILIZATION
117. 2 0.19% 59.51% CORNS
118. 2 0.19% 59.70% DASH
120. 2 0.19% 60.08% DOWNFALL
121. 2 0.19% 60.27% DOWNTOWN
122. 2 0.19% 60.46% DRAM
123. 2 0.19% 60.65% ELEMENTARY
124. 2 0.19% 60.84% EXCERPT
125. 2 0.19% 61.03% EXHILARATING
126. 2 0.19% 61.22% EYEBROW
127. 2 0.19% 61.41% FANTASTIC
128. 2 0.19% 61.60% FLASHLIGHT
129. 2 0.19% 61.79% FLUFF
130. 2 0.19% 61.98% GINGER
131. 2 0.19% 62.17% GORGEOUS
132. 2 0.19% 62.36% GRADUATE
133. 2 0.19% 62.55% HEADMASTER
134. 2 0.19% 62.74% HELICOPTER
135. 2 0.19% 62.93% HOMETOWN
136. 2 0.19% 63.12% HUGE
137. 2 0.19% 63.31% HURRICANE
138. 2 0.19% 63.50% INCREDIBLE
139. 2 0.19% 63.69% INDSIA
140. 2 0.19% 63.88% INHERITS
141. 2 0.19% 64.07% INSPIRE
142. 2 0.19% 64.26% INSPIRED
143. 2 0.19% 64.45% ITALICS
144. 2 0.19% 64.64% JAM
145. 2 0.19% 64.83% JIGSAW
146. 2 0.19% 65.02% JUNIOR
147. 2 0.19% 65.21% KILOMETERS
148. 2 0.19% 65.40% MALAGASY
149. 2 0.19% 65.59% MARY
150. 2 0.19% 65.78% MCMASTER
152. 2 0.19% 66.16% MESS
153. 2 0.19% 66.35% MINI
154. 2 0.19% 66.54% MISSPELLED
155. 2 0.19% 66.73% MONUMENT
156. 2 0.19% 66.92% MOVIES
157. 2 0.19% 67.11% OD
158. 2 0.19% 67.30% OLYMPIAD
159. 2 0.19% 67.49% PALS
160. 2 0.19% 67.68% PARTICIPLES
161. 2 0.19% 67.87% PLUMP
162. 2 0.19% 68.06% PLUNGE
163. 2 0.19% 68.25% POETTRY
164. 2 0.19% 68.44% POUNDING
165. 2 0.19% 68.63% PROGRAMMER
166. 2 0.19% 68.82% REHABILITATION
167. 2 0.19% 69.01% REWRITE
168. 2 0.19% 69.20% SCAN
169. 2 0.19% 69.39% SCARF
170. 2 0.19% 69.58% SCORE
171. 2 0.19% 69.77% SEMESTER
172. 2 0.19% 69.96% SHY
173. 2 0.19% 70.15% SKINNY
174. 2 0.19% 70.34% SNEAKERS
175. 2 0.19% 70.53% SOAKED
176. 2 0.19% 70.72% STRAIGHTEN
177. 2 0.19% 70.91% STUBBORN
178. 2 0.19% 71.10% TALKATIVE
179. 2 0.19% 71.29% TERRIFIC
180. 2 0.19% 71.48% THEATER
181. 2 0.19% 71.67% TRAFFIC
182. 2 0.19% 71.86% UNEARTH
184. 2 0.19% 72.24% WATERPROOF
185. 2 0.19% 72.43% WOW
186. 1 0.10% 72.53% ACCOMPLISHMENT
187. 1 0.10% 72.63% ACE
188. 1 0.10% 72.73% ACES
189. 1 0.10% 72.83% ADITTED
190. 1 0.10% 72.93% ADVERB
191. 1 0.10% 73.03% AL
192. 1 0.10% 73.13% ALMST
193. 1 0.10% 73.23% ANEW
194. 1 0.10% 73.33% ANMAL
195. 1 0.10% 73.43% ANNIVERSARY
196. 1 0.10% 73.53% ANNOUNCED
197. 1 0.10% 73.63% ARCHEOLOGIST
198. 1 0.10% 73.73% ARTIFACTS
199. 1 0.10% 73.83% ASTON
200. 1 0.10% 73.93% ATHER
201. 1 0.10% 74.03% ATMOSPHERE
202. 1 0.10% 74.13% AVENU
203. 1 0.10% 74.23% AWESOME
204. 1 0.10% 74.33% AWHAT
205. 1 0.10% 74.43% BACKDOOR
206. 1 0.10% 74.53% BACKPACK
207. 1 0.10% 74.63% BACKYARD
208. 1 0.10% 74.73% BADMINTON
209. 1 0.10% 74.83% BANANA
210. 1 0.10% 74.93% BATU
211. 1 0.10% 75.03% BETRAY
212. 1 0.10% 75.13% BLACKSMITH
213. 1 0.10% 75.23% BLANKET
214. 1 0.10% 75.33% BLANKS
216. 1 0.10% 75.53% BLUESTONES
217. 1 0.10% 75.63% BOATHOUSE
218. 1 0.10% 75.73% BOOKSTORES
219. 1 0.10% 75.83% BOTANICAL
220. 1 0.10% 75.93% BREATHAKING
221. 1 0.10% 76.03% BREATHTAKING
222. 1 0.10% 76.13% BREEZE
223. 1 0.10% 76.23% BRIDAL
224. 1 0.10% 76.33% BRO
225. 1 0.10% 76.43% BROCHURES
226. 1 0.10% 76.53% BRUISES
227. 1 0.10% 76.63% BUDGET
228. 1 0.10% 76.73% CAFE
229. 1 0.10% 76.83% CAMPUS
230. 1 0.10% 76.93% CANDIDATE
231. 1 0.10% 77.03% CANEL
232. 1 0.10% 77.13% CANOE
233. 1 0.10% 77.23% CANOPY
234. 1 0.10% 77.33% CARRED
235. 1 0.10% 77.43% CASK
236. 1 0.10% 77.53% CELEBRATE
237. 1 0.10% 77.63% CERAMIC
238. 1 0.10% 77.73% CERTIFICATES
239. 1 0.10% 77.83% CHAT
240. 1 0.10% 77.93% CHEDDAR
241. 1 0.10% 78.03% CHEECH
242. 1 0.10% 78.13% CHEERFULLY
243. 1 0.10% 78.23% CHEF
244. 1 0.10% 78.33% CIGARETTE
245. 1 0.10% 78.43% COLLABORATIVE
246. 1 0.10% 78.53% COLUMN
248. 1 0.10% 78.73% COMEDIES
249. 1 0.10% 78.83% COMMERCIALS
250. 1 0.10% 78.93% COMPLIMENTED
251. 1 0.10% 79.03% CONGRATULATS
252. 1 0.10% 79.13% CONTIBUTIONS
253. 1 0.10% 79.23% CONTST
254. 1 0.10% 79.33% CORAL
255. 1 0.10% 79.43% CORPS
256. 1 0.10% 79.53% CRATER
257. 1 0.10% 79.63% CRAZY
258. 1 0.10% 79.73% CRYSTAL
259. 1 0.10% 79.83% CULINARY
260. 1 0.10% 79.93% CURRICULAR
261. 1 0.10% 80.03% DECK
262. 1 0.10% 80.13% DED
263. 1 0.10% 80.23% DEPARTING
264. 1 0.10% 80.33% DEPLOYED
265. 1 0.10% 80.43% DEPOSITED
266. 1 0.10% 80.53% DIALOGS
267. 1 0.10% 80.63% DIARY
268. 1 0.10% 80.73% DONATE
269. 1 0.10% 80.83% DORMITORY
270. 1 0.10% 80.93% ELDEST
271. 1 0.10% 81.03% ELEVATOR
272. 1 0.10% 81.13% EMAILS
273. 1 0.10% 81.23% EMBARRASSED
274. 1 0.10% 81.33% EMBARRASSING
275. 1 0.10% 81.43% ENDANGERED
276. 1 0.10% 81.53% ENGLAND
277. 1 0.10% 81.63% ERA
278. 1 0.10% 81.73% ETC
280. 1 0.10% 81.93% FAUNA
281. 1 0.10% 82.03% FERRIS
282. 1 0.10% 82.13% FICTION
283. 1 0.10% 82.23% FLIP
284. 1 0.10% 82.33% FLOP
285. 1 0.10% 82.43% FLORA
286. 1 0.10% 82.53% FOE
287. 1 0.10% 82.63% FOLK
288. 1 0.10% 82.73% FOLOWWING
289. 1 0.10% 82.83% FOOTBALLER
290. 1 0.10% 82.93% FOOTSTEPS
291. 1 0.10% 83.03% FOREVER
292. 1 0.10% 83.13% FORT
293. 1 0.10% 83.23% FRIEN
294. 1 0.10% 83.33% FRUSTRATED
295. 1 0.10% 83.43% FRUSTRATING
296. 1 0.10% 83.53% GADGET
297. 1 0.10% 83.63% GARDENING
298. 1 0.10% 83.73% GATHERD
299. 1 0.10% 83.83% GEOGRAPHY
300. 1 0.10% 83.93% GIANT
301. 1 0.10% 84.03% GRADUATED
302. 1 0.10% 84.13% GRADUATING
303. 1 0.10% 84.23% GRADUATION
304. 1 0.10% 84.33% GRVES
305. 1 0.10% 84.43% GUISING
306. 1 0.10% 84.53% GUITAR
307. 1 0.10% 84.63% HARMONY
308. 1 0.10% 84.73% HEADSETS
309. 1 0.10% 84.83% HEARTFELT
310. 1 0.10% 84.93% HIKING
312. 1 0.10% 85.13% HMONG
313. 1 0.10% 85.23% HOSPITALIZED
314. 1 0.10% 85.33% HYDROELECTRIC
315. 1 0.10% 85.43% ID
316. 1 0.10% 85.53% IGUANA
317. 1 0.10% 85.63% IMPRESSION
318. 1 0.10% 85.73% INDENTATION
319. 1 0.10% 85.83% INDIANA
320. 1 0.10% 85.93% INDO
321. 1 0.10% 86.03% INDONESIA
322. 1 0.10% 86.13% INDOOR
323. 1 0.10% 86.23% INFORMATIVE
324. 1 0.10% 86.33% INHERIT
325. 1 0.10% 86.43% INTER
326. 1 0.10% 86.53% INTERPRETER
327. 1 0.10% 86.63% IRRITATED
328. 1 0.10% 86.73% IRRITATING
329. 1 0.10% 86.83% ISLAMIC
330. 1 0.10% 86.93% JEANS
331. 1 0.10% 87.03% KINDERGARTEN
332. 1 0.10% 87.13% LAS
333. 1 0.10% 87.23% LIGE
334. 1 0.10% 87.33% LINKD
335. 1 0.10% 87.43% LONDON
336. 1 0.10% 87.53% LONGED
337. 1 0.10% 87.63% LOTION
338. 1 0.10% 87.73% LUXURIOUS
339. 1 0.10% 87.83% MADAGASKAR
340. 1 0.10% 87.93% MAGAZINE
341. 1 0.10% 88.03% MANIAC
342. 1 0.10% 88.13% MARVELOUS
344. 1 0.10% 88.33% MAYOR
345. 1 0.10% 88.43% MEATBALL
346. 1 0.10% 88.53% MEMORABLE
347. 1 0.10% 88.63% MENUS
348. 1 0.10% 88.73% MIKE
349. 1 0.10% 88.83% MISSION
350. 1 0.10% 88.93% MISTY
351. 1 0.10% 89.03% MOBILE
352. 1 0.10% 89.13% MOSQUITO
353. 1 0.10% 89.23% MOTH
354. 1 0.10% 89.33% MOTORBIKE
355. 1 0.10% 89.43% MOUNT
356. 1 0.10% 89.53% MULTI
357. 1 0.10% 89.63% MYSTIFIED
358. 1 0.10% 89.73% NEARBI
359. 1 0.10% 89.83% NEER
360. 1 0.10% 89.93% NIGH
361. 1 0.10% 90.03% NON
362. 1 0.10% 90.13% NOTIFIED
363. 1 0.10% 90.23% NUMBERAM
364. 1 0.10% 90.33% NURA
365. 1 0.10% 90.43% OPTIMISTIC
366. 1 0.10% 90.53% ORCHIDS
367. 1 0.10% 90.63% OT
368. 1 0.10% 90.73% OU
369. 1 0.10% 90.83% OUD
370. 1 0.10% 90.93% OUTFIT
371. 1 0.10% 91.03% PARL
372. 1 0.10% 91.13% PATCHING
373. 1 0.10% 91.23% PAYED
374. 1 0.10% 91.33% PH
376. 1 0.10% 91.53% PIER
377. 1 0.10% 91.63% PILLOW
378. 1 0.10% 91.73% PIMPLES
379. 1 0.10% 91.83% PLAGIARIZING
380. 1 0.10% 91.93% PONYTAIL
381. 1 0.10% 92.03% POP
382. 1 0.10% 92.13% PORTRAYING
383. 1 0.10% 92.23% POSTCARD
384. 1 0.10% 92.33% POTATO
385. 1 0.10% 92.43% PREPOSITION
386. 1 0.10% 92.53% PRESELL
387. 1 0.10% 92.63% PRIVILEGE
388. 1 0.10% 92.73% PROVINCE
389. 1 0.10% 92.83% QUITTED
390. 1 0.10% 92.93% RAFT
391. 1 0.10% 93.03% RAINBOWS
392. 1 0.10% 93.13% RAINCOAT
393. 1 0.10% 93.23% RANGER
394. 1 0.10% 93.33% RECESS
395. 1 0.10% 93.43% RECREATIONAL
396. 1 0.10% 93.53% REFERENCES
397. 1 0.10% 93.63% REGAIN
398. 1 0.10% 93.73% REGAINS
399. 1 0.10% 93.83% RELIGIUS
400. 1 0.10% 93.93% RENOWNED
401. 1 0.10% 94.03% REORGANIZE
402. 1 0.10% 94.13% REPELLENT
403. 1 0.10% 94.23% RESIDENCES
404. 1 0.10% 94.33% RIDICULOUS
405. 1 0.10% 94.43% ROBOT
406. 1 0.10% 94.53% ROBOTS
408. 1 0.10% 94.73% SAVANNAH
409. 1 0.10% 94.83% SCHOLARSHIP
410. 1 0.10% 94.93% SCISSOR
411. 1 0.10% 95.03% SCUBA
412. 1 0.10% 95.13% SEMINAR
413. 1 0.10% 95.23% SHORTST
414. 1 0.10% 95.33% SIBLINGS
415. 1 0.10% 95.43% SIGH
416. 1 0.10% 95.53% SINGULAR
417. 1 0.10% 95.63% SKETCH
418. 1 0.10% 95.73% SKETCHBOOK
419. 1 0.10% 95.83% SKETCHES
420. 1 0.10% 95.93% SOAR
421. 1 0.10% 96.03% SOARING
422. 1 0.10% 96.13% SOCCER
423. 1 0.10% 96.23% SOCIOLOGY
424. 1 0.10% 96.33% SOFA
425. 1 0.10% 96.43% SOLSTICES
426. 1 0.10% 96.53% SOMERSET
427. 1 0.10% 96.63% SORROW
428. 1 0.10% 96.73% SOUVENIRS
429. 1 0.10% 96.83% SPECTACULAR
430. 1 0.10% 96.93% SPILED
431. 1 0.10% 97.03% SPOOKY
432. 1 0.10% 97.13% STEWARDS
433. 1 0.10% 97.23% STOMACHACHE
434. 1 0.10% 97.33% STONHENGE
435. 1 0.10% 97.43% SUITCASE
436. 1 0.10% 97.53% SUPERVISED
437. 1 0.10% 97.63% SUPERVISOR
438. 1 0.10% 97.73% SUPERVISORS
440. 1 0.10% 97.93% SWERVED
441. 1 0.10% 98.03% TALENTED
442. 1 0.10% 98.13% TE
443. 1 0.10% 98.23% TELEVISION
444. 1 0.10% 98.33% TENNIS
445. 1 0.10% 98.43% THER
446. 1 0.10% 98.53% THESEE
447. 1 0.10% 98.63% THET
448. 1 0.10% 98.73% TONE
449. 1 0.10% 98.83% TRANSITIVE
450. 1 0.10% 98.93% TREFFIC
451. 1 0.10% 99.03% TROPHY
452. 1 0.10% 99.13% TROPICAL
453. 1 0.10% 99.23% UNDERLINING
454. 1 0.10% 99.33% UNEARTHED
455. 1 0.10% 99.43% UNEXPLAINED
456. 1 0.10% 99.53% UNINTERESTING
457. 1 0.10% 99.63% UNSUALLY
458. 1 0.10% 99.73% UNTINTERRUPTER
459. 1 0.10% 99.83% UPTHE
460. 1 0.10% 99.93% USD
461. 1 0.10% 100.00% UTENSILS
462. 1 0.10% 100.00% VACATION
463. 1 0.10% 100.00% VASE
464. 1 0.10% 100.00% VEGETATION
465. 1 0.10% 100.00% VEST
466. 1 0.10% 100.00% VICE
467. 1 0.10% 100.00% WATERFAL
468. 1 0.10% 100.00% WAVY
469. 1 0.10% 100.00% WEDDING
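The columns of a block frequency table like the one above can be reproduced from a plain list of off-list tokens. The following Python sketch uses a short hypothetical token list rather than the textbook data, and the function name `block_frequency` is an assumption. Each word's individual coverage is its frequency divided by the total number of tokens; the cumulative column is taken here as a running total of the rounded individual values, capped at 100%, which appears to match the way the cumulative figures in the table increase.

```python
from collections import Counter


def block_frequency(tokens):
    """Build rows of (rank, freq, individual %, cumulative %, word)."""
    counts = Counter(tokens)
    total = len(tokens)
    # Sort by descending frequency, then alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    rows = []
    cumulative = 0.0
    for rank, (word, freq) in enumerate(ordered, start=1):
        individual = round(100 * freq / total, 2)
        cumulative = min(round(cumulative + individual, 2), 100.0)
        rows.append((rank, freq, individual, cumulative, word))
    return rows


# Hypothetical off-list tokens, not the textbook data.
sample = ["JUNGLE", "EMAIL", "JUNGLE", "CONCERT", "EMAIL", "JUNGLE", "WATERFALL", "HOBBY"]

for rank, freq, individual, cumulative, word in block_frequency(sample):
    print(f"{rank}. {freq} {individual:.2f}% {cumulative:.2f}% {word}")
```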
Appendix E
Comparison of Chapter 1 and Chapter 5
Unique to first
595 tokens
317 families
001. student 10
016. magnificent 5
017. most 5
Shared
817 tokens
177 families
001. the 66
Unique to second
164 tokens
112 families
094. live 2
095. main 2
096. marry 2
097. mathematics 2
098. might 2
139. certificate 1
140. city 1
147. communicate 1
202. hometown 1
203. hospital 1
204. house 1
205. hullo 1
206. iguana 1
207. individual 1
208. instead 1
209. instrument 1
210. invite 1
211. islam 1
212. island 1
213. isn’ 1
214. it’ 1
215. i’ve 1
216. jam 1
217. jigsaw 1
218. kindergarten 1
219. knowledge 1
220. large 1
221. let 1
222. letter: 1
223. life 1
224. like: 1
225. line 1
226. luck 1
227. luxury 1
228. maniac 1
229. memory 1
230. mention 1
231. menu 1
232. middle 1
233. minute 1
234. mobile 1
235. mother’ 1
236. move 1
237. movie 1
238. muslim 1
239. nation 1
240. necessary 1
241. never 1
242. next 1
243. no 1
244. notice 1
245. number 1
246. object 1
247. offer 1
248. office 1
249. optimist 1
250. order 1
Appendix F
Comparison of Chapter 6 and Chapter 5
Unique to first
873 tokens
397 families
001. noun 28
012. destination 8
013. paragraph 8
Shared
791 tokens
177 families
001. the 66
Unique to second
190 tokens
112 families
168. accommodate 1
169. admire 1
192. breathaking 1
193. breathtaking 1
194. breeze 1