Information Retrieval Based on the Extracted Social Network

(1)

 All sources 46  Internet sources 13

[4]  https://www.researchgate.net/publication...nformation_Retrieval 8.4% 18 matches

[15]  https://www.researchgate.net/publication...c_of_negative_issues 4.2% 7 matches

[28]  https://arxiv.org/pdf/1602.00104.pdf 2.4% 3 matches

[33]  https://www.semanticscholar.org/paper/Ex...a74b74012814c5508e20 1.7% 6 matches

[34]  https://www.researchgate.net/publication...traction_perspective 1.1% 4 matches

[37]  https://www.springer.com/gp/book/9783319676203 0.9% 2 matches

[39]  https://www.coursehero.com/file/p1ip7ubi...ian-coordinates-and/ 0.5% 1 matches

[40]  https://www.coursehero.com/file/p6lfa5am...etween-the-Phillips/ 0.5% 1 matches

[41]  https://timesquare.inria.fr/simple-relation-example/ 0.5% 1 matches

[42]  cdn-pubs.acs.org/doi/suppl/10.1021/acs.i...ic6b01699_si_001.pdf 0.5% 1 matches

[43]  https://en.mimi.hu/finance/leakages.html 0.5% 1 matches

[44]  https://scirate.com/arxiv/1506.07975 0.5% 1 matches

[45]  www.amosweb.com/cgi-bin/awb_nav.pl?s=wpd&c=dsp&k=average product 0.5% 1 matches

12.3%

Results of plagiarism analysis from 2018-07-04 13:11 UTC

0 - 20 - Information Retrieval Based on the Extracted Social Network.pdf

Date: 2018-07-04 13:01 UTC

7 pages, 2214 words

PlagLevel: selected / overall

133 matches from 50 sources, of which 15 are online sources. Settings

Data policy: Compare with web sources, Check against my documents, Check against my documents in the organization repository, Check against organization repository, Check against the Plagiarism Prevention Pool

(2)

--Information Retrieval Based on the Extracted

Social Network

Mahyuddin K. M.[4]Nasution,Rahmad Syah, and Maria Elfida

Information TechnologyDepartment,

Fakultas Ilmu Komputer dan Teknologi Informasi (Fasilkom-TI), and

Information system Centre,

Universitas Sumatera Utara, Medan 1500 USU, Sumatera Utara, Indonesia [email protected]

Abstract.[15]It is possible that a technology affects other technologies.

In this paper, explored the possibility to reveal improved information

[15]

retrieval performance through the extraction method of social network.

Any extracted social network structurally is not a perfect graph so it is

[15]

possible to build the star social networks as the optimal form of graph, which guides to model the implication of information retrieval: the for-mulation of recall and the precision, which generally show better perfor-mance on average over 90%and 58%, respectively.

Key words: information space, webpages, recall, precision, graph, de-gree.

1 Introduction

The social network as a resultant of extraction method from Web is a repre-sentation of relationship between web pages through any search engine [1]. The resultant based on occurrence and co-occurrence [2]. It depends not only on the logically relevance between the query and the webpage but also the similarity between the query and the information available, i.e. the answer to the required information as accurately as possible [3]. The latter case has become the con-centration of information retrieval that is knowledge technology that focuses on the effectiveness and efficiency for retrieving information from information space like Web [4].

The method of social network extraction from the Web not onlygives birth [4]

(3)

2 M. K. M. Nasution, R. Syah, and M. Elfida computationally within similarity distance [10], e.g. using

simour (ai, a j) = 2|ai||aj| any relation between social actors, but encourages the growth of social networks. Moreover, semantically the weights indicate the proximity of webpages that are not actually interlinked with each other [11]. Thus, social structures indirectly form other structures of documents scattered within the information space, the different structures than links built into webpages or within the server in which the webpage resides.

(4)

compotision of information: a hitcount|Ωai∩Ωaj|and list of snippetsLsaisaj. In other words, for all el∈E we have

el= q(a ai j) =Ωaiaj=Ωai∩Ωaj={ωk|k = 1, . . . , m l} (3)

where Ωaiaj is a cluster of webpages.

The cluster of webpages in either occurrence or co-occurrence consists of webpages in general conical to intersection between a cluster based on ai and a cluster based on aj. Therefore, to obtain a reliabletechnology it is necessary to assess the performance of social network extraction methods. Assessmentis done on the side of getting trusty information and getting side of technology associated with it. AssumeDr is the set of documents relevant to the query and Dcis an accomplished documents, then the commonly used the assesment measures are recall Rec and precision P res as follows.

1. To measure the permissibility (approximate ability) of reaching approach to all relevant documents based on query,

Rec =|Dr∩ Dc|

|Dr| (4)

2. To measure the ability of the approach to achieve only documents relevant to the query and may reject irrelevant documents,

P res =|Dr∩ Dc|

|Dc|

(5)

[4]

Theorem 1. If a social network is representation of social structure, then re-sultant of method for extracting social network from Webis representation of webpages structure.

3 An Approach

If the social network extracted from the information source consists of n social actors, then in general the social structure is formed can be expressed through the degree of social actors. Inthis case, the social network theoretically consist of two categories as defined below.

Definition 1. A social network is completely shaped or abbreviated a complete social network consists n vertices and n(n−1) edges. Hereinafter it denoted as SN (n, n(n−1)).

Definition 2. A star-shaped social network or abbreviated a star social network consists n vertices and n−1 edges. Hereinafter it denoted as SN (n, n−1).

(5)

4 M. K. M. Nasution, R. Syah, and M. Elfida

Fig. 1. Star and complete social network in comparison

Lemma 1. If social actor has the highest degree in social network, then the social actor become candidate of center in social network.

Proof. Suppose there are m social actors within social networks within each degree canbe sorted as follow d(v1) d (v2)≥d v3)( ≥. . . ≥d vm). It is clear( that the social actor with the highest degree d(v1) is center candidate in the social network.

Proposition 3. If there is more than one social actor is of the highest degree and likely to have multiple leaves, the social actors are the candidates of center in subs of social network.

Proof. Based on Lemma 1, if vertices with the highest degree in social network are eliminated but each vertex still has leaves, it is clear that the social actors represented by the vertices become the candidates of center in subs of social network.

In graph theory, a tree is optimal-shaped of graph. The staris one of tree forms.

Lemma 2. If there are vertices with the highest degree in social network, then the subs of social network with social actors as the center are the optimal form of social networks.

(6)

This social network is denoted by SNs(mi, m i−1) be subs of star-shaped social

network or the star social networks.

Proposition 4. If there are the star social networks in a social network, then the collection of information spaces about social actors is conjoined into the social actors' information space as the center of star social networks.

Personal Name Position Number of Documents

Abdul Razak Hamdan Professor (A1) 103

Abdullah MohdZin Professor (A2) 105 information retrieval in social network extraction perspective basedon Eq. 4 and Eq. 5. The documents setDris modeled as a collection of all the collected

(7)

6 M. K. M. Nasution, R. Syah, and M. Elfida

Personal Name Disambiguation Star Social Network (Professor) Recall Precision Recall Precision

A1 47/103 (45.63%) 47/154(30.52%) 99/103 (96.12%) 99/194(51.03%) A2 49/105 (46.67%) 49/158(31.01%) 101/105 (96.19%) 101/188(33.72%) A3 72/160 (45.00%) 72/152(47.37%) 155/160 (96.88%) 155/222(69.82%) A4 92/210 (43.81%) 92/148(62.16%) 200/210 (95.24%) 200/258(77.52%) A5 32/70 (45.71%) 32/154(20.78%) 68/70 (97.14%) 68/164(41.46%)

of 648 webpages like Table 1. By using conceptthe name disambiguation, Eq. 4 and Eq. 5 produce Table 2.

In general, there is an increase in the performance of the access process by means of involving social networks whereby the webpages of influential social actors will be dug up by other social actors as keywords rather than keywords that are not the names of social actors. Thus, theformulation of recall and precision based on Proposition 4 are as follows:

Reca=|Dr∩

S_mi=1Dc|

|Dr| (8)

and

P res a=

|Dr∩

S_mi=1Dc|

|Smi=1 Dc|

(9)

where m is degree of center in star social network.

5 Conclusion and Future Work

By studying the principle of social network extraction, from the extraction method, there is a change in formulation on the implication of information re-trieval that is recall and precision, which generally have better performance based on computation. Furthermore, to test this formula will be built larger datasets.

References

1. Mahyuddin K. M. Nasution, and S. A. M. Noah. 2010.[33]Superficial method for extracting social network for academic using web snippets.In Yu, J. et al., (eds.).

Rough Set and Knowledge Technology, LNAI 6401, Heidelberg, Springer: 483-490.

[4]

2. Mahyuddin K. M. Nasution. 2016. Social network mining (SNM): Adefinition of relation between the resources and SNA. International Journal on Advanced Science Engineering Information Technology 6(6): 975-981.

3. Mahyuddin K. M. Nasution and Shahrul Azman Noah. 2012. Information Retrieval Model: A Social Network Extraction Perspective. CAMP 2012, IEEE.

[4]

(8)

[4]

5. Mahyuddin K. M.Nasution,Opim Salim Sitompul, EmersonP. Sinulingga, and Shahrul Azman Noah. 2016.[4]An extracted social network mining.SAI Computing Conference, IEEE.

6. Mahyuddin K. M. Nasution. 2016. Modelling and simulation of search engine. International Conference on Computing and Applied Informatics (ICCAI), IOP. 7. Mahyuddin K. M. Nasution. 2014. Newmethod for extracting keyword for the

social actor. Intelligent Information andDatabase Systems LNCS Volume 8397: 83-92.

8. Mahyuddin K. M. Nasution. 2012.[33]Simple search engine model:Adaptive proper-ties.[33]Cornell University Library arXiv:1212.3906v1.

9. Mahyuddin K. M. Nasution. 2012. Simple search engine model:[33]Adaptive properties for doubleton. Cornell University Library arXiv:1212.4702v1.

10. Mahyuddin K. M. Nasution. 2016. New similarity. Annual Applied Science and Engineering Conference (AASEC), IOP.