Modelling and Simulation of Search Engine

(1)

 All sources 45  Internet sources 30  Own documents 11  Organization archive 4

[0]  iopscience.iop.org/article/10.1088/1742-6596/801/1/012078/pdf 13.0% 50 matches

[1]  https://archive.org/stream/arxiv-1212.3906/1212.3906_djvu.txt 7.3% 40 matches

[2]  "CR-INT110-Semantic interpretation ..." dated 2017-10-09 6.0% 33 matches

[3]  https://www.science.gov/topicpages/s/simulation sheds model.html 3.3% 9 matches

[4]  iopscience.iop.org/article/10.1088/1742-6596/801/1/012022 2.8% 11 matches

[5]  iopscience.iop.org/article/10.1088/1742-6596/801/1/012020/meta 2.8% 11 matches

[6]  "CR-INT136-Social network extractio..." dated 2017-10-09 2.6% 12 matches

[7]  https://archive.org/stream/arxiv-1212.4702/1212.4702_djvu.txt 2.1% 10 matches

[8]  https://arxiv.org/pdf/1303.3964.pdf 2.0% 10 matches

[9]  eprints.binadarma.ac.id/2778/1/ICIBA2016-13-086-091-Nasution-SocialNetworkMining.pdf 2.0% 12 matches

[10]  https://arxiv.org/pdf/1604.06976.pdf 2.3% 12 matches

[11]  "CR-INT137-Enhancing to method for ..." dated 2017-10-09 2.0% 8 matches

[12]  "16. IOP.pdf" dated 2017-12-07 1.9% 7 matches (1 document with identical matches)

[14]  "CR-INT135-Information Retrieval on..." dated 2017-10-09 2.0% 10 matches

[15]

[17]  "3028-3639-1-RV.pdf" dated 2017-10-30 1.7% 8 matches

[18]  "Text artikel no_11.pdf" dated 2017-12-06 1.6% 5 matches

[20]  www.academia.edu/3144214/Simple_Search_Engine_Model_Adaptive_Properties 1.3% 7 matches

[21]  "Abdullah_2017_J._Phys.__Conf._Ser._890_012102.pdf" dated 2017-10-12 1.3% 5 matches

[22]  https://www.researchgate.net/publication..._and_Design_Patterns 1.4% 5 matches

[23]  www.academia.edu/3144197/Simple_Search_Engine_Model_Adaptive_Properties_for_Doubleton 1.0% 5 matches

[24]  https://www.researchgate.net/publication...ion_of_Search_Engine 1.2% 5 matches

[25]  https://link.springer.com/content/pdf/10.1007/978-3-319-05476-6_9.pdf 0.8% 6 matches

[26]

24.7%

Results of plagiarism analysis from 2017-12-27 23:47 UTC

M ahyuddinKM Nasution_2017_J._Phys.__Conf._Ser._801_012078.pdf

(2)

[26] 1.0% 3 matches

[32]  https://link.springer.com/chapter/10.1007/978-3-319-67621-0_20 0.9% 4 matches

[34]  https://www.sciencedirect.com/science/article/pii/S2288430014500243 0.7% 1 matches

[35]  https://vdocuments.site/chi-square-history.html 0.6% 2 matches

[36]  citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.747.3201&rep=rep1&type=pdf 0.5% 2 matches

[37]  "3472-4529-1-SM.pdf" dated 2017-10-30 0.5% 4 matches

[38]  https://www.researchgate.net/profile/Mah...n=publication_detail 0.5% 4 matches

[39]  https://www.researchgate.net/profile/Sha...tion-perspective.pdf 0.4% 3 matches

[40]  iopscience.iop.org/article/10.1088/1742-6596/822/1/012078/pdf 0.5% 2 matches

[41]  iopscience.iop.org/article/10.1088/1757-899X/171/1/012078/pdf 0.5% 2 matches

[42]  https://www.researchgate.net/publication..._Adaptive_Properties 0.1% 2 matches

[43]  www.finedictionary.com/Nasute.html 0.2% 2 matches

[45]  https://rd.springer.com/content/pdf/10.1007/978-3-319-05476-6_9.pdf 0.3% 2 matches

[46]  https://www.researchgate.net/publication...n=publication_detail 0.2% 1 matches

[47]  https://research.google.com/pubs/AndreiBroder.html 0.2% 1 matches

[50]  https://link.springer.com/chapter/10.1007/978-3-319-29544-2_7 0.1% 1 matches

[51]  https://quizlet.com/128157095/formal-logic-3-flash-cards/ 0.1% 1 matches

[52]  https://www.gcflearnfree.org/internetbasics/using-search-engines/1/ 0.1% 1 matches

9 pages, 5997 words

PlagLevel: selected / overall

101 matches from 53 sources, of which 30 are online sources.

Settings

Data policy: Compare with web sources, Check against my documents, Check against my documents in the organization repository, Check against organization repository, Check against the Plagiarism Prevention Pool

Sensitivity: Medium

Bibliography: Consider text

Citation detection: Reduce PlagLevel

(3)

Journal of Physics: Conference Series

PAPER • OPEN ACCESS

Modelling and Simulation of Search Engine

To cite this article: Mahyuddin K M Nasution 2017 J. Phys.: Conf. Ser. 801 012078

View the article online for updates and enhancements.

Recent citations

- Mahyuddin K. M. Nasution et al

- Social Network Extraction Based on Web. A Comparison of Superficial Methods
  Mahyuddin K.M. Nasution and Shahrul Azman Noah

(4)

Modelling and Simulation of Search Engine

___

1_Information_{Technology, Fasilkom-TI, Universitas Sumatera Utara, Padang Bulan}

USU Medan 20155, Indonesia.

_{}__{}

Abstract. The best tool currently used to access information is a search engine. Meanwhile, the information space has its own behaviour. An information space needs to be described systematically in mathematical terms so that the characteristics associated with it can be identified easily. This paper reveals some characteristics of search engines based on a model of a document collection, and then estimates their impact on the feasibility of information. We state these characteristics as lemmas and a theorem about singletons and doubletons, and then compute the characteristics statistically by simulating the use of search engines, in this case Google and Yahoo. The two search engines behave differently, although in theory both are based on the concept of a document collection.

1. Introduction

To access or search for information in an information space or system, we need tools [1]. One such tool is the search engine, which we know as a software system [2, 3]. In general, to know and understand a system we build a model of it, so that a mathematical model can represent the search engine [4], whereas simulation can be used to estimate the effect of the search engine model on the information space or system [5].

There are many different search engines: those that arose naturally with databases and those that grew up with the web (web search engines) [6, 7]. Faced with the complexity of information, some search engines become helpless and disappear, some shift to meet the required capabilities, and some change their appearance and return as new ones. All of this affects access to information in the space. In this case, the mathematical principle is used not only to systematize but also to optimize the creation of a search engine on an information space. This paper aims to express the characteristics of a search engine based on the constraints in the information space.

 ___

Suppose we denote the information space or system by $\Omega$ [8]. The information space contains groups of documents $D$ [9]. Each group of documents consists of documents $d_j$ in which there is a word $w$, i.e. the basic unit of discrete data, defined to be an item from a vocabulary indexed by $\{1,\dots,K\}$, with $w_k = 1$ if $k$ is in $K$ and $w_k = 0$ otherwise [10, 11]. Next, we define the terms related to the word.

Definition 1. A term $t_x$ consists of at least one or more words, i.e. $t_x = (w_l \mid l = 1,\dots,L)$, $l \le k$, where $l$ is the number of parameters representing the words $w_l$, $k$ is the number of vocabularies in $t_x$, and $|t_x| = l$ is the size of $t_x$.

Suppose that we have a term that is a person name, $t_l$ = 'Mahyuddin Khairuddin Matyuso Nasution', or $\{w_1,w_2,w_3,w_4\}$ = {"Mahyuddin", "Khairuddin", "Matyuso", "Nasution"} as a set of words. We obtain

International Conference on Computing and Applied Informatics 2016, IOP Publishing

IOP Conf. Series: Journal of Physics: Conf. Series 801 (2017) 012078, doi:10.1088/1742-6596/801/1/012078

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Published under licence by IOP Publishing Ltd

(5)

the power set without the empty set, $\{\{w_1\},\{w_2\},\{w_3\},\{w_4\},\{w_1,w_2\},\{w_1,w_3\},\{w_1,w_4\},\{w_2,w_3\},\{w_2,w_4\},\{w_3,w_4\},\{w_1,w_2,w_3\},\{w_1,w_2,w_4\},\{w_1,w_3,w_4\},\{w_2,w_3,w_4\},\{w_1,w_2,w_3,w_4\}\} = \{t_{2^k-1}\}$ (note: $2^k$ is called the $k$-th power of 2). In this case we have $|\{t_{2^k-1}\}| = 15$ and $l = k$. Therefore, the probability of $t_l$ is $p(t_l) = 1/(2^l-1)$. Suppose the vector space of $t_l$ is $\{t_{2^l-1}\}$; then we have a design for searching information in the search space by a software system such as the search engine. We express it as follows [12, 13].
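The count of candidate terms for the four-word name can be checked with a short sketch. It assumes that every non-empty subset of the words is an equally likely candidate term; the function name `nonempty_subsets` is illustrative and not part of the paper.

```python
from itertools import combinations

def nonempty_subsets(words):
    """Return every non-empty subset of `words`, smallest subsets first."""
    return [set(c)
            for r in range(1, len(words) + 1)
            for c in combinations(words, r)]

words = ["Mahyuddin", "Khairuddin", "Matyuso", "Nasution"]
candidates = nonempty_subsets(words)

count = len(candidates)      # 2**k - 1 with k = 4
probability = 1 / count      # uniform choice among the candidate terms
print(count, round(probability, 4))
```

For $k = 4$ words this enumerates $2^4 - 1 = 15$ subsets, matching $p(t_l) = 1/15$.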

Definition 2. Suppose $\Omega$ is a set of documents indexed by a search engine, i.e. a set consisting of ordered pairs of terms $t_{li}$ and documents $d_{lj}$, or $(t_{li}, d_{lj})$, $i = 1,\dots,I$, $j = 1,\dots,J$. The relation table with two columns $t_l$ and $d_l$ is a representation of the search engine whereby $\Omega_l = \{(t_l, d_l)_{ij}\}$ is a subset of $\Omega$. The size of $\Omega$ is denoted by $|\Omega|$.

Definition 3. Let $t_l$ be a search term and $q$ be a query; then $t_l$ is in $q$ for $t_l$ in $d_l$, $d_l$ in $\Omega$.

In terms of logical implication, Definition 3 expresses that a document is relevant to a query if it implies the query, that is, if $d \Rightarrow q$ is true or $d \Rightarrow t_l$ is true for all $d$ in $\Omega$: $(d \Rightarrow t_l) = 1$. Thus, the degree of $d \Rightarrow q$ is measured by $P(d \Rightarrow q)$. Therefore there is a uniform probability mass function for $\Omega$, i.e.

$P : \Omega \to [0,1]$ (1)

where $\sum_{\Omega} P(d) = 1$.

Definition 4. Suppose $t_x$ is a search term, or $t_x$ in $S$, whereby $S$ is a set of singleton search terms of the search engine. A vector space $\Omega_x$, a subset of $\Omega$, is a singleton search engine event (singleton) of documents that contain an occurrence of $t_x$ in $d_x$.

Equivalently, $\Omega_x$ is a subset of $\Omega$ if $d \Rightarrow t_x$ has a true value, or $\Omega_x(t_x) \approx 1$ if $t_x$ is true at $d$ in $\Omega$ and $\Omega_x(t_x) \approx 0$ otherwise, and the cardinality of $\Omega_x$ is $|\Omega_x| = \sum_{\Omega}(\Omega_x(t_x) \approx 1)$. In other words, each document that is indexed by the search engine contains at least one occurrence of the search term. The degree of uncertainty of $d \Rightarrow t_x$ on $d \Rightarrow q$ means that

$P(\Omega_x) = P(\Omega_x(t_x) \approx 1) = \sum_{\Omega}(\Omega_x(t_x) \approx 1)/|\Omega| = |\Omega_x|/|\Omega|$. (2)

However, if the search term is a pattern, like $t_x$ = "Mahyuddin Khairuddin Matyuso Nasution", then a different result appears. In other words, $\Omega_{xp}("t_x") = 1$ if $t_x$ is true at $d$ in $\Omega$ exactly, or $\Omega_{xp}("t_x") = 0$ otherwise, and the cardinality of $\Omega_{xp}$ is $|\Omega_{xp}| = \sum_{\Omega}(\Omega_{xp}("t_x") = 1)$. In this case, each document that is indexed by the search engine contains at least one occurrence of the search term. The degree of uncertainty of $d \Rightarrow "t_x"$ on $d \Rightarrow q$ is

$P(\Omega_{xp}) = P(\Omega_{xp}("t_x") = 1) = \sum_{\Omega}(\Omega_{xp}("t_x") = 1)/|\Omega| = |\Omega_{xp}|/|\Omega|$. (2)

Thus $|\Omega_{xp}|/|\Omega| \le |\Omega_x|/|\Omega|$, so $|\Omega_{xp}| \le |\Omega_x|$, or $\Omega_{xp}$ is a subset of $\Omega_x$.
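A minimal sketch of the two singleton probabilities follows. It assumes that an unquoted term matches a document when all of its words occur anywhere in it, and that a quoted term matches only as a contiguous phrase. This matching rule, the toy documents, and the function names are illustrative assumptions, not the behaviour of any real search engine.

```python
def contains_all_words(term, document):
    """True if every word of `term` occurs somewhere in `document`."""
    tokens = document.lower().split()
    return all(word in tokens for word in term.lower().split())

def contains_phrase(term, document):
    """True if the words of `term` occur contiguously, in order."""
    words = term.lower().split()
    tokens = document.lower().split()
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))

# Hypothetical document collection standing in for the indexed space.
omega = [
    "mahyuddin nasution wrote about search engines",
    "nasution and mahyuddin are names",
    "a page about yahoo and google",
    "mahyuddin nasution teaches at usu",
]
term = "Mahyuddin Nasution"

omega_x = [d for d in omega if contains_all_words(term, d)]   # unquoted
omega_xp = [d for d in omega if contains_phrase(term, d)]     # quoted

p_x = len(omega_x) / len(omega)
p_xp = len(omega_xp) / len(omega)
print(p_x, p_xp)
```

Under these assumptions every phrase match is also a word match, so the quoted count can never exceed the unquoted one, which is the inequality stated above.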

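Let $t_x$ and $t_y$ be two different search terms. If $t_x = t_y$, $t_x \ne t_y$, or $|t_x| \le |t_y|$, then $\Omega_{xp}$ is a subset of $\Omega_x$, or $\Omega_{yp}$ is a subset of $\Omega_y$, or $\Omega_{xp}$ is a subset of $\Omega_y$, or $\Omega_{yp}$ is a subset of $\Omega_y$.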

 __

Let $t_x$ and $t_y$ be search terms. Referring to the definitions above, we reveal some characteristics of the search engine as a system. All characteristics are derived from the adaptation of formulas that build a model of problem solving related to the possible results of the search engine. Some of the adaptive characteristics are as follows [12, 13, 14].

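Lemma 1. If $t_x \ne t_y$ and $t_x \cap t_y = \emptyset$, then $|\Omega_x \cap \Omega_y| = 0$ and $|\Omega_x \cup \Omega_y| = |\Omega_x| + |\Omega_y|$, where $\Omega_x$ and $\Omega_y$ are subsets of $\Omega$.

Proof. $t_x \ne t_y$ and $t_x \cap t_y = \emptyset$ mean that for all $w_x$ in $t_x$, $w_x$ is not in $t_y$, and for all $w_y$ in $t_y$, $w_y$ is not in $t_x$; then for all $w_x$ in $d_x$, $w_x$ is not in $d_y$, and for all $w_y$ in $d_y$, $w_y$ is not in $d_x$, such that $t_x \cup t_y = t_y \cup t_x$ and $d_x \cup d_y = d_y \cup d_x$.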


(6)

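Therefore, $\Omega_x = \{(t_x,d_x)\}$ and $\Omega_y = \{(t_y,d_y)\}$ are two independent events from queries, or $t_x$ and $t_y$ are true at $d$ in $\Omega$, respectively. In this case, $\Omega_x \cap \Omega_y = \emptyset$. In other words, $\{(t_x,d_x)\} \cup \{(t_y,d_y)\} = \Omega_x \cup \Omega_y$. Therefore, we have

$|\Omega_x \cap \Omega_y| = 0$ (3)

and

$|\Omega_x \cup \Omega_y| = |\Omega_x| + |\Omega_y|$. (4)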
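Equations (3) and (4) are the disjoint case of inclusion-exclusion, which can be illustrated with a small sketch. The document identifiers are hypothetical, and the sketch simply takes the lemma's premise (disjoint result sets) as given rather than deriving it from the terms.

```python
def union_size(a, b):
    """Size of the union by inclusion-exclusion: |A| + |B| - |A ∩ B|."""
    return len(a) + len(b) - len(a & b)

# Hypothetical result sets (document ids) for two terms sharing no words.
omega_x = {1, 2}
omega_y = {3, 4, 5}

intersection_size = len(omega_x & omega_y)      # Equation (3)
total = union_size(omega_x, omega_y)            # Equation (4)
print(intersection_size, total)
```

When the intersection is empty the correction term vanishes and the union size is just the sum of the two sizes.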

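Lemma 2. If $t_x \ne t_y$, $t_x \cap t_y \ne \emptyset$ and $|t_y| < |t_x|$, then $|\Omega_x| = |\Omega_x| + |\Omega_y|$, where $\Omega_x$ and $\Omega_y$ are subsets of $\Omega$.

Proof. By assumption, for all $w_y$ in $t_y$, $w_y$ is in $t_x$, but there are $w_x$ in $t_x$ with $w_x$ not in $t_y$, such that $t_x \cap t_y = t_y$ and $t_x \cup t_y = t_x$. Similarly, for all $w_y$ in $t_y$, $w_y$ is in $d_y$, and because all $w_y$ are in $t_x$ we conclude that $w_y$ is also in $d_x$, but there are $w_x$ in $t_x$ and $w_x$ in $d_x$ with $w_x$ not in $t_y$, such that $w_x$ is not in $d_y$. Thus, if $t_x \cap t_y = t_y$ then $d_x \cap d_y = d_y$, and if $t_x \cup t_y = t_x$ then $d_x \cup d_y = d_x$. Therefore, $\Omega_x = \{(t_x,d_x)\} = \{(t_x \cup t_y, d_x \cup d_y)\} = \{(t_x,d_x) \cup (t_y,d_y)\} = \{(t_x,d_x)\} \cup \{(t_y,d_y)\} = \Omega_x \cup \Omega_y$. In other words,

$|\Omega_x| = |\Omega_x| + |\Omega_y|$. (5)

Proposition 1. For search terms $t_z \ne \dots \ne t_y \ne t_x$ and $|t_z| < \dots < |t_y| < |t_x|$, we have $|\Omega_x| = |\Omega_x| + |\Omega_y| + \dots + |\Omega_z|$, where $\Omega_z, \dots, \Omega_y, \Omega_x$ are subsets of $\Omega$.

Proof. Based on a generalization of Equation (3) and Equation (4), we derive $|\Omega_x| = |\Omega_x \cup \Omega_y| = |\Omega_x| + |\Omega_y| = |\Omega_x| + |\Omega_y \cup \dots| = |\Omega_x| + |\Omega_y| + \dots = |\Omega_x| + |\Omega_y| + |\dots \cup \Omega_z|$, and

$|\Omega_x| = |\Omega_x| + |\Omega_y| + \dots + |\Omega_z|$. (6)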

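Lemma 3. If $t_x \ne t_y$, $t_x \cap t_y = \emptyset$ and $d_x \cap d_y \ne \emptyset$, then $|\Omega_x| \approx |\Omega_y|$, where $\Omega_x$ and $\Omega_y$ are subsets of $\Omega$.

Proof. $t_x \ne t_y$, $t_x \cap t_y = \emptyset$ and $d_x \cap d_y \ne \emptyset$ mean that for all $w_x$ in $t_x$, $w_x$ is not in $t_y$, and for all $w_y$ in $t_y$, $w_y$ is not in $t_x$; then $t_x \cup t_y = t_y \cup t_x$, but for all $w_x$ in $d_x$ there are $w_x$ in $d_y$, and for all $w_y$ in $d_y$ there are $w_y$ in $d_x$ also, so $d_x \cap d_y = d_x$. For $\Omega_x = \{(t_x,d_x)\}$ and $\Omega_y = \{(t_y,d_y)\}$ we obtain $\Omega_x \cap \Omega_y = \{(t_x,d_x)\} \cap \{(t_y,d_y)\} = \{(t_x,d_y)\} \cap \{(t_y,d_y)\} = \{(t_y,d_y)\} \cap \{(t_y,d_y)\}$, or $\Omega_x \cap \Omega_y = \Omega_y \cap \Omega_y$, or

$\Omega_x \cap \Omega_y = \Omega_y$. (7)

Similarly,

$\Omega_x \cap \Omega_y = \Omega_x$. (8)

In other words, $\Omega_x = \{(t_x,d_x)\} = \{(t_x, d_x \cup d_y)\} = \{(t_x,d_x) \cup (t_x,d_y)\} = \{(t_x,d_x) \cup (t_y,d_y)\} = \{(t_y,d_x) \cup (t_y,d_y)\} = \{(t_y, d_x \cup d_y)\} = \{(t_y,d_y)\} = \Omega_y$. Therefore, based on Equation (7) and Equation (8), we have $|\Omega_x| \approx |\Omega_y|$.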

Definition 5. Suppose $t_x$ and $t_y$ are two different search terms. Let $t_x \ne t_y$, with $t_x$ and $t_y$ in $S$, where $S$ is a set of singleton search terms of the search engine. A doubleton search term is $D = \{\{t_x,t_y\} : t_x, t_y \in S\}$, whereby the vector space of the doubleton search term, denoted by $\Omega_x \cap \Omega_y$, is a doubleton search engine event of documents that contain a co-occurrence of $t_x$ and $t_y$ such that $t_x, t_y$ are in $d_x$ and $t_x, t_y$ are in $d_y$, whereby $\Omega_x$, $\Omega_y$, and $\Omega_x \cap \Omega_y$ are subsets of $\Omega$.

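Theorem 1. Suppose $t_x$ and $t_y$ are two different search terms. If $\Omega_x \cap \Omega_y$ is a doubleton search engine event for $t_x$ and $t_y$, whereby $\Omega_x$ and $\Omega_y$ are subsets of $\Omega$, then $|\Omega_x \cap \Omega_y| \le |\Omega_x| \le |\Omega|$ and $|\Omega_x \cap \Omega_y| \le |\Omega_y| \le |\Omega|$.

Proof. Based on set theory, $\Omega_x \cap \Omega_y$ is a subset of $\Omega_x$ and $\Omega_x \cap \Omega_y$ is a subset of $\Omega_y$, thus $|\Omega_x \cap \Omega_y| \le |\Omega_x|$ and $|\Omega_x \cap \Omega_y| \le |\Omega_y|$. Meanwhile, based on Lemma 3 we have $|\Omega_x \cap \Omega_y| = |\Omega_x|$ and $|\Omega_x \cap \Omega_y| = |\Omega_y|$. For $\Omega_x = \{(t_x,d_x)\}$, we have $\{(t_x \cap t_y, d_x \cap d_y)\} = \{(t_x,d_x) \cap (t_y,d_y)\} = \{(t_x,d_x)\} \cap \{(t_y,d_y)\} = \Omega_x \cap \Omega_y$ if $t_x \ne t_y$ and $|t_x| \le |t_y|$. Thus $|\Omega_x \cap \Omega_y| = |\Omega_x|$. Similarly, we obtain $\Omega_y = \Omega_x \cap \Omega_y$, so $|\Omega_x \cap \Omega_y| = |\Omega_y|$. Based on Definition 5, $D = \{\{t_x,t_y\} : t_x, t_y \in S\}$, or $\{t_x,t_y\} = \{(t_x,d_x),(t_y,d_y)\} = \{(t_x,d_x) \cap (t_y,d_y)\} = \{(t_x \cap t_y),(d_x \cap d_y)\}$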


(7)

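$= \{(t_x,t_y),(d_x,d_y)\}$; for $\{t_x,t_y\} = \Omega_x \cap \Omega_y$, we get $\Omega_x \cap \Omega_y = \Omega_x$ and $\Omega_x \cap \Omega_y = \Omega_y$. In other words, $\{t_x,t_y\} = \{(t_x,d_x),(t_y,d_x)\} = \{(t_x,d_x)\},\{(t_y,d_y)\}$; for $\{t_x,t_y\} = \Omega_x \cap \Omega_y$ we get $\Omega_x \cap \Omega_y = \Omega_x \cap \Omega_y$, and $\Omega_x \cap \Omega_y$ is a subset of $\Omega_x$ or $\Omega_y$. If the comma logically means "and", in set theory it means an intersection.

Therefore, $|\Omega_x \cap \Omega_y| \le |\Omega_x| \le |\Omega|$ and $|\Omega_x \cap \Omega_y| \le |\Omega_y| \le |\Omega|$ for all search terms $t_x$ and $t_y$.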
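The bounds in Theorem 1 can be illustrated with a toy inverted index. This is a sketch under stated assumptions: the index contents, the document ids, and the keys `"nasution"` and `"usu"` are hypothetical, and a doubleton event is modelled simply as the intersection of the two posting sets.

```python
# Hypothetical indexed space: six document ids.
omega = {1, 2, 3, 4, 5, 6}

# Hypothetical inverted index: term -> ids of documents containing it.
index = {
    "nasution": {1, 2, 3, 5},
    "usu": {2, 3, 4},
}

omega_x = index["nasution"]
omega_y = index["usu"]
doubleton = omega_x & omega_y   # documents where both terms co-occur

print(len(doubleton), len(omega_x), len(omega_y), len(omega))
```

Because the doubleton event is a subset of each singleton event, and each singleton event is a subset of the indexed space, the hit counts are ordered exactly as the theorem states.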

 ____

The purpose of simulation, in this case, is to construct an approach for selecting documents in the information space or for disclosing the information in a repository [15]. As an experiment to collect data, we select $n$ objects from a community. For example, we collect data from the academic community of the Faculty of Medicine, University of Sumatera Utara (USU), i.e. $n = 51$ academic actors, listed as $A$ = {Abdul Majid, Abdul Rachman Saragih, Abdul Rasyid, Abdullah Afif Siregar, Achsanuddin Hanafie, Adi Kusuma Aman, Alfred C. Satyo, Askaroellah Aboet, Atan Baas Sinuhaji, Ayodhia Pitaloka Pasaribu, Aznan Lelo, Bachtiar Surya, Budi R. Hadibroto, Chairuddin Panusunan Lubis, Chairul Yoel, Darwin Dalimunthe, Daulat Hasiholan Sibuea, Delfi Lutan, Delfitri Munir, Erwin Dharma Kadar}. With the names of actors as terms, two different terms $t_x$ and $t_y$ have several options that correspond to the words of each name, such as mutual, including, or intersection. Therefore, each term has the opportunity to be placed at a particular index position. The position of each term in the search engines is based, for example, on the selected collection of documents related to the term.

Table 1. Experiment design for simulation of search engine (first column: Actor Name)

From a sample that can represent the population, we develop a table of information as the experiment design for providing data (Table 1), i.e. data that reveal characteristics of a search engine. In the table, the first column holds the actors' names in alphabetical order. The second column contains the academic level, which is used to test whether the sample is random: the academic level serves as the medium of randomness test (mrt). The third column holds data on scientific publications indexed by Scopus, whereby each actor falls into one of two categories, author or not; these data serve as the comparative mrt and are intended to support the randomness test of the sample. The next columns contain the lists of singletons for $t_x$ and for $t_x$ in quotes, and a list of doubletons of $t_x$ and $t_y$ (singleton with keyword). In this case, we ensure that the singletons are also random.


(8)

In general, the information space consisting of documents is viewed as the population. Statistically, the population is random, and we test whether this characteristic is also passed down to the sample, so that any measurement of the sample describes the population. We separate the sample into two categories, with the number of elements in the first category

$n_1 = \sum_{i=1}^{n} a_{i1}$ (1)

and in the second category

$n_2 = \sum_{i=1}^{n} a_{i2}$ (2)

whereby $a_{i1}$ are the elements of $A$ that meet the first category and $a_{i2}$ are the elements of $A$ that meet the second category. The run ($r$) is how many times the category changes in the sample. Thus, the mean of the runs is

$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1$ (3)

and the standard deviation of the runs is

$\sigma_r = \left( \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)} \right)^{1/2}$. (4)

Then, we have $Z_{count}$ as follows

$Z_{count} = (r - \mu_r)/\sigma_r$ (5)

where the hypotheses are H0: the data sequence is random, and H1: the data sequence is not random. For academic level as category, professor (pr) or lecturer (lc), we have $n_1 = 34$ and $n_2 = 17$. By using Equations (3), (4), and (5), we obtain $\mu_r = 23.67$, $\sigma_r = 0.93$ and $Z_{count} = -1.79$, and for $\alpha = 0.05$ we obtain $-Z_{0.025} = -1.96 \le Z_{count} \le Z_{0.025} = 1.96$; because $Z_{count}$ lies between the critical values, the decision is to accept H0. Seen from the publication of scientific papers indexed by Scopus, author (a) or not (n), we have similar conditions, such that the sequence of data is random.
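Equations (3) to (5) can be evaluated with a few lines of code. This is a sketch only: the function name `runs_test` is illustrative, and the run count `r = 22` passed in the example is a hypothetical value, since the text does not report the observed number of runs.

```python
import math

def runs_test(n1, n2, r):
    """Return (mu_r, sigma_r, z) for a runs test, following Eqs. (3)-(5)."""
    n = n1 + n2
    mu = 2 * n1 * n2 / n + 1                                   # Eq. (3)
    sigma = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                      / (n ** 2 * (n - 1)))                    # Eq. (4)
    z = (r - mu) / sigma                                       # Eq. (5)
    return mu, sigma, z

# n1 = 34 professors and n2 = 17 lecturers come from the text;
# r = 22 is an assumed run count used only to exercise the function.
mu, sigma, z = runs_test(34, 17, r=22)
accept_h0 = -1.96 <= z <= 1.96      # two-sided test at alpha = 0.05
print(round(mu, 2), round(sigma, 2), round(z, 2), accept_h0)
```

The mean reproduces the reported $\mu_r = 23.67$. Evaluated as written, Equation (4) gives $\sigma_r \approx 3.13$ for these sample sizes rather than the 0.93 quoted in the text, so the sketch should be read as an illustration of the formulas, not a reproduction of the reported $Z_{count}$.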

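Furthermore, to test the randomness more thoroughly, we test the independence of two data spaces by using chi-square ($\chi^2$). Suppose the data space ($ds$) is presented in matrix form as follows,

$$ds = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1n} \\ x_{21} & x_{22} & \dots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \dots & x_{mn} \end{pmatrix}$$

The total of the data $x_{ij}$ is $S_{ij}$ as follows

$S_{ij} = \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}$. (6)

So that we can calculate the expectation of each datum as follows

$e_{11} = \left(\sum_{j=1}^{n} x_{1j}\right)\left(\sum_{i=1}^{m} x_{i1}\right)/S_{ij}$
$e_{12} = \left(\sum_{j=1}^{n} x_{1j}\right)\left(\sum_{i=1}^{m} x_{i2}\right)/S_{ij}$
… (7)
$e_{mn} = \left(\sum_{j=1}^{n} x_{mj}\right)\left(\sum_{i=1}^{m} x_{in}\right)/S_{ij}$

and we have a matrix of expectations as follows

$$es = \begin{pmatrix} e_{11} & e_{12} & \dots & e_{1n} \\ e_{21} & e_{22} & \dots & e_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ e_{m1} & e_{m2} & \dots & e_{mn} \end{pmatrix}$$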


(9)

The total of the data $e_{ij}$ is $E_{ij}$ as follows

$E_{ij} = \sum_{i=1}^{m} \sum_{j=1}^{n} e_{ij}$. (8)

Then, we have

$\chi^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} (x_{ij} - e_{ij})^2 / e_{ij}$ (9)

with degrees of freedom (df) equal to $(m-1)(n-1)$. For example, among 51 actor names we have $x_{11} = 34$ professors, $x_{21} = 17$ lecturers, $x_{12} = 17$ authors, and $x_{22} = 34$ non-authors. Based on Eq. (7) we can calculate their expectations, i.e. $e_{11} = e_{12} = e_{21} = e_{22} = 25.5$, and based on Eq. (9) we obtain $\chi^2 = 11.33$. The test statistic $T$ follows a chi-squared distribution with $(m-1)(n-1) = (2-1)(2-1) = 1$ degree of freedom, and the critical value at a significance level of 5% is 3.841, so we reject the null hypothesis of independence because $\chi^2 > 3.841$. This tells us there is a relationship between academic level and authorship.
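The worked example can be reproduced with a short sketch of Equations (6) to (9). The function name `chi_square` is illustrative, and the table layout (rows and columns as given in the text) is taken at face value.

```python
def chi_square(table):
    """Return (expected, chi2, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)                                    # Eq. (6)
    # Expected count = row total * column total / grand total, Eq. (7).
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum((x - e) ** 2 / e                                # Eq. (9)
               for row, exp_row in zip(table, expected)
               for x, e in zip(row, exp_row))
    df = (len(table) - 1) * (len(col_totals) - 1)
    return expected, chi2, df

# x11 = 34, x12 = 17, x21 = 17, x22 = 34 as in the text.
table = [[34, 17],
         [17, 34]]
expected, chi2, df = chi_square(table)
reject_h0 = chi2 > 3.841     # critical value for df = 1 at the 5% level
print(expected, round(chi2, 2), df, reject_h0)
```

Every expected count is 25.5 and the statistic evaluates to about 11.33 with one degree of freedom, in line with the values quoted above.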

To reveal characteristics of the search engine based on the model, we conduct an experiment on singletons and doubletons, with the Google search engine as the test simulation and the Yahoo search engine as the comparative simulation, as follows.

1. Randomness test: Calculate the randomness test for $t_x$, $t_{"x"}$, and $t_x,t_y$ by completing the computations as follows:

   a. For $t_x$, the sum of the 51 values of $|\Omega_x|$ from the Google search engine is 2635514, with average (avg) = 51676.75. The number of $|\Omega_x|$ greater than or equal to avg is 10, while the number of $|\Omega_x|$ less than avg is 41. By using Eq. (3), Eq. (4), and Eq. (5), $Z_{count} = -6.54 < -Z_{0.025} = -1.96$. Therefore, H0 is rejected, and the 51 singletons of $t_x$ from the Google search engine are not random.

   b. However, the sum of the 51 values of $|\Omega_{"x"}|$ from the Google search engine is 1095045, with average (avg) = 21471.47. The number of $|\Omega_{"x"}|$ greater than or equal to avg is 3, while the number of $|\Omega_{"x"}|$ less than avg is 48. By using the same equations, $Z_{count} = -1.50 > -Z_{0.025} = -1.96$, and H0 is accepted, whereby the 51 singletons of $t_{"x"}$ from the Google search engine are random.

   c. For the doubleton $t_x,t_y$, whereby $t_y$ = "Universitas Sumatera Utara" as a keyword, the sum of the 51 values of $|\Omega_x \cap \Omega_y|$ from the Google search engine is 61092, with avg = 1197.88. The number of $|\Omega_x \cap \Omega_y|$ greater than or equal to avg is 15 and the number of $|\Omega_x \cap \Omega_y|$ less than avg is 36. With that, we obtain $Z_{count} = -1.31 > -Z_{0.025} = -1.96$ based on Eq. (3), Eq. (4) and Eq. (5), and H0 is accepted; thus the 51 doubletons of $t_x,t_y$ from the Google search engine are random.

   d. Whereas, for $t_x$ using the Yahoo search engine, the sum of the 51 values of $|\Omega_x|$ is 2365061, with avg = 46373.76. So $n_1$(pr) = 5 and $n_2$(lc) = 46, and $Z_{count} = -0.03 > -Z_{0.025} = -1.96$. On that basis, H0 is accepted; thus the 51 singletons of $t_x$ from the Yahoo search engine are random.

   e. Similarly for $t_{"x"}$, the sum of the 51 values of $|\Omega_{"x"}|$ from the Yahoo search engine is 395815, with average (avg) = 7761.08. The number of $|\Omega_{"x"}|$ greater than or equal to avg is 3, while the number of $|\Omega_{"x"}|$ less than avg is 48. By using the same equations, $Z_{count} = -1.50 > -Z_{0.025} = -1.96$, and H0 is accepted, whereby the 51 singletons of $t_{"x"}$ from the Yahoo search engine are random.

   f. For the doubleton $t_x,t_y$, whereby $t_y$ = "Universitas Sumatera Utara" as a keyword, the sum of the 51 values of $|\Omega_x \cap \Omega_y|$ from the Yahoo search engine is 15361, with avg = 301.19. The number of $|\Omega_x \cap \Omega_y|$ greater than or equal to avg is 12 and the number of $|\Omega_x \cap \Omega_y|$ less than avg is 39. With that, we obtain $Z_{count} = -0.42 > -Z_{0.025} = -1.96$ based on Eq. (3), Eq. (4) and Eq. (5), and H0 is accepted; thus the 51 doubletons of $t_x,t_y$ from the Yahoo search engine are random.
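The categorization step used in items a to f (split the hit counts at their average, then count runs) can be sketched as follows. The hit counts below are hypothetical placeholders, not the paper's data, and the sketch counts a run as a maximal block of consecutive equal categories, which is an assumption about how $r$ was obtained.

```python
def categorize_by_average(counts):
    """Return (avg, n1, n2, runs) for a sequence of hit counts.

    n1 counts values >= avg, n2 counts values < avg, and runs is the
    number of maximal blocks of consecutive equal categories.
    """
    avg = sum(counts) / len(counts)
    labels = [c >= avg for c in counts]
    n1 = sum(labels)
    n2 = len(labels) - n1
    changes = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return avg, n1, n2, changes + 1

# Hypothetical hit counts in list order (e.g. alphabetical by actor name).
hits = [120, 40, 300, 10, 15, 500, 20, 30]
avg, n1, n2, runs = categorize_by_average(hits)
print(avg, n1, n2, runs)
```

The resulting `n1`, `n2` and `runs` are the inputs that Eq. (3), Eq. (4) and Eq. (5) need to produce $Z_{count}$ for a given search engine.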


(10)

2. Independence test: For a contingency table with $m$ rows and $n$ columns, the null and alternative hypotheses of the test of independence are:

H0: The two or more categorical variables are independent.

H1: The two or more categorical variables are related.

Table 2. Samples and categories
