Enhancing Extraction Method for Aggregating Strength Relation Between Social Actors
Mahyuddin K.M. Nasution and Opim Salim Sitompul
Information Technology Department, Fakultas Ilmu Komputer Dan Teknologi Informasi (Fasilkom-TI), and Information System Centre,
Universitas Sumatera Utara, 1500 USU, Medan, Sumatera Utara, Indonesia
Abstract. In principle, the two streams of approaches to extracting the relations between social actors produce different results. However, a method from one stream, such as the superficial method, can be upgraded for information extraction by using the principles of the other stream, and this needs to be proved systematically. This paper presents some formulations that serve to resolve this issue. Based on the results of the experiments conducted, the expanded method is adequate.
Keywords: Search engine · Search term · Query · Social actor · Singleton · Doubleton

1 Introduction
Extracting social networks from the Web has been carried out with a variety of approaches ranging from simple to complex [1]. The unsupervised method, or superficial method, is generally more concise and low cost, but it only generates the strength relations between social actors from heterogeneous and unstructured sources such as the Web [2]. In contrast, supervised methods are generally more complicated and high cost, and they produce labels of the relationships between social actors, but they draw on homogeneous and semi-structured sources such as corpora [3, 4]. However, generating social networks that can express semantic meaning is not easy [5]. This requires a method that combines the advantages of both: an approach that not only produces a relationship but also re-interprets the relationship based on the aggregation principle. This paper aims to enhance the superficial method for extracting social networks from the Web.
2 Problem Definition
The initial concept, semantically, of extracting a social network from the Web is to explore a series of names through co-occurrence using a search engine [6, 7]. The extraction of a social network is then made possible by also involving the occurrence. Formally, we state the extraction of social networks as follows [8, 9].
© Springer International Publishing AG 2017
R. Silhavy et al. (eds.), Artificial Intelligence Trends in Intelligent Systems,
Occurrence and co-occurrence are, respectively, a query (q) representing a social actor and a query representing a pair of social actors. For the occurrence, q contains the name of a social actor, for example q = ”Mahyuddin K. M. Nasution”. For the co-occurrence, q contains the names of two social actors, for example q = ”Mahyuddin K. M. Nasution”, ”Shahrul Azman Noah” [2]. Therefore, the names of social actors are the search terms, which we define formally as follows.
Definition 2. A search term $t_k$ consists of words or a phrase, i.e. $t_k = \{w_k \mid k = 1, \ldots, o\}$.
We use a well-defined query to pry information from the Web by submitting it to a search engine. A search engine works on a collection of documents or web pages, or more precisely as follows [10].
Definition 3. $\Omega$ is a set of web pages indexed by a search engine if there is a table relation of $(t_i, \omega_j)$ such that $\Omega = \{(t, \omega)_{ij}\}$, where $t_i$ is a search term and $\omega_j$ is a web page that is indexed by the search engine and contains at least one occurrence of $t_x$,
As information about any social actor, the singleton is the basic property of a search engine that is statistically related to the social actor. In this case, the singleton is the necessary condition for gaining information about a social actor from the Web, although it carries a connatural trait (bias and ambiguity) that naturally reflects the social dynamics of human beings [2]. The hit count is the main Web-based information about a social actor, and this information can be validated by crawling, one by one, the list of snippets returned by the search engine [11].
proved that $w$ has the character, i.e. the relative probability of $w$
$p(w) = |w| \,/\, \ldots$
hit count of doubleton as follows
$$|\Omega_x \cap \Omega_y| = \sum_{\Omega} \big( (\Omega_x(t_x \wedge t_y) \cap \Omega_y(t_x \wedge t_y)) = 1 \big). \tag{3}$$
Lemma 2. If $w$ is a token in $L_D$, the list of snippets based on doubleton, then $w$ statistically has the character in the doubleton.
Proof. Similar to Lemma 1, and based on Definitions 5 and 6, we have the character of $w$ in the doubleton as follows
$$p_D(w) = \frac{|w|}{|\Omega_x \cap \Omega_y|} \in [0, 1], \tag{4}$$
where $|w| \le |\Omega_x \cap \Omega_y|$ and $|\Omega_x \cap \Omega_y| \ne 0$.
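As an illustration of Eq. (4), the following minimal sketch computes the character of each token from a list of snippets. It assumes that $|w|$ is counted as the number of snippets containing the token, which the text does not state explicitly; the function name `token_character` and the simple lowercase tokenizer are hypothetical choices made only for this example.

```python
import re
from collections import Counter


def token_character(snippets, doubleton_hits):
    """Return {token: p_D(w)} for the tokens found in a list of snippets.

    Assumption: |w| is the number of snippets that contain the token w,
    and doubleton_hits is the doubleton hit count |Omega_x ∩ Omega_y|.
    """
    if doubleton_hits <= 0:
        raise ValueError("the doubleton hit count must be positive")
    counts = Counter()
    for snippet in snippets:
        # Count each token at most once per snippet.
        tokens = set(re.findall(r"[a-z]+", snippet.lower()))
        counts.update(tokens)
    return {token: n / doubleton_hits for token, n in counts.items()}
```

For example, with two snippets that both contain the word "paper" and a doubleton hit count of 4, the sketch assigns "paper" the character 2/4 = 0.5.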
Fig. 1. Type of snippets based on co-occurrence (Google search engine)
As information about the relations between social actors, the doubleton is naturally the basis for refining the information about a social actor, where one of the search terms is a keyword for the other. Therefore, this is the sufficient condition for eliminating the connatural trait of the singleton. The snippets of doubleton, however, naturally show different kinds of information about relations. We conclude that
of three (triple) dots between two names of social actors. Triple dots naturally are a word in the text. The direct relations are represented by direct co-occurrences such as co-authorship, whereas the indirect relations are represented by indirect co-occurrences such as a citation or presence at the same event.
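The distinction between direct and indirect co-occurrences can be sketched as a simple check for triple dots between the two names in a snippet. This is only an illustrative heuristic under stated assumptions: names are matched as exact substrings, only the first occurrence of each name is considered, and the function names `is_indirect` and `count_relations` are hypothetical.

```python
def is_indirect(snippet, name_a, name_b):
    """Return True if triple dots occur between the two names in the snippet."""
    text = snippet.replace("\u2026", "...")  # normalise the one-character ellipsis
    pos_a, pos_b = text.find(name_a), text.find(name_b)
    if pos_a < 0 or pos_b < 0:
        return False
    # Text between the end of the first name and the start of the second name.
    first_end = min(pos_a + len(name_a), pos_b + len(name_b))
    second_start = max(pos_a, pos_b)
    return "..." in text[first_end:second_start]


def count_relations(snippets, name_a, name_b):
    """Count direct and indirect co-occurrences, returned as (|dr|, |ir|)."""
    n_dr = n_ir = 0
    for snippet in snippets:
        if name_a not in snippet or name_b not in snippet:
            continue  # not a co-occurrence of both names
        if is_indirect(snippet, name_a, name_b):
            n_ir += 1
        else:
            n_dr += 1
    return n_dr, n_ir
```

A snippet in which both names appear with no triple dots between them is counted as direct, which matches the co-author example in the text.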
3 The Proposed Approach
The method of extracting information from the Web is recognized as the superficial method, categorized in the unsupervised stream, and involves a search engine to obtain information such as the hit counts used in computation [12]. Generally, a similarity measurement is applied to generate the relation between actors [13].
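Definition 7. $sr \in R$ is the strength relation between two social actors $a, b \in A$ if it meets the comparison among the different information of the two actors ($\mathbf{a}$ and $\mathbf{b}$) and their common information ($\mathbf{a} \cap \mathbf{b}$) in the similarity measurement, i.e. $sr = sim(\mathbf{a}, \mathbf{b}, \mathbf{a} \cap \mathbf{b})$ in $[0, 1]$, with $\mathbf{a} \cap \mathbf{b} \le \mathbf{a}$ and $\mathbf{a} \cap \mathbf{b} \le \mathbf{b}$.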
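Suppose we use the Jaccard coefficient; we then obtain $sr$ based on hit counts as
$$sr = \frac{|\Omega_a \cap \Omega_b|}{|\Omega_a| + |\Omega_b| - |\Omega_a \cap \Omega_b|} \in [0, 1]. \tag{5}$$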
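The Jaccard coefficient on hit counts can be sketched in a few lines. The function name `jaccard_strength` is hypothetical, and the hit counts in the usage note are the ones reported for three actors in the experiment section.

```python
def jaccard_strength(hits_a, hits_b, hits_ab):
    """Strength relation sr from two singleton hit counts and one doubleton hit count."""
    union = hits_a + hits_b - hits_ab
    if union <= 0:
        return 0.0  # no indexed pages for either actor
    return hits_ab / union


# Hit counts reported in the experiment section of this paper.
sr_12 = jaccard_strength(7740, 8280, 736)
sr_13 = jaccard_strength(7740, 3860, 441)
sr_23 = jaccard_strength(8280, 3860, 281)
print(round(sr_12, 4), round(sr_13, 4), round(sr_23, 4))  # 0.0482 0.0395 0.0237
```

Rounded to four decimals, these values agree with the strength relations listed in Table 1.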
Lemma 3. If $ir$ is an indirect relation between two social actors $a, b \in A$, then $ir$ statistically has the character in the doubleton.
Proof. Suppose the indirect relations $ir$ can be recognized in each snippet based on doubleton; we then have the number of indirect relations in the list of snippets based on doubleton, $|ir|$, where $|ir|$ is the number of snippets that contain triple dots. Therefore, we generate the character of $ir$ as follows
$$p(ir) = \frac{|ir|}{|\Omega_a \cap \Omega_b|} \in [0, 1]. \tag{6}$$
Proposition 3. If $sr$ is a strength relation between two social actors $a, b \in A$, then the aggregation of $sr$ consists of three binderies.
Proof. Suppose $p(ir) \in [0, 1]$ (Eq. (6)) is the probability of the indirect relation based on doubleton; then the probability of the direct relation ($dr$) based on doubleton is as follows
characteristics are $p(ir)$ and $p(dr)$, respectively. However, $1 - p(ir) - p(dr) \ge 0$; if $p(ir) + p(dr) - 1 \ne 0$, we obtain
$$p(ur) = 1 - (p(ir) + p(dr)), \tag{8}$$
i.e. the character of a relation that has not been determined with certainty through the co-occurrence. Because $p(ir)$, $p(dr)$ and $p(ur)$ can be considered as percentage values, the multiplication of a characteristic with the strength relation is regarded as a bindery based on the type of relations. Therefore, we have three binderies of the strength relations as follows
J1 A bindery of strength relations based on the direct relations,
$$sr_{dr} = sr * p(dr) \in [0, 1] \tag{9}$$
J2 A bindery of strength relations based on the indirect relations,
$$sr_{ir} = sr * p(ir) \in [0, 1] \tag{10}$$
J3 A bindery of strength relations based on the unclear relations,
$$sr_{ur} = sr * p(ur) \in [0, 1] \tag{11}$$
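The three binderies J1 to J3 can be sketched as follows, taking the unclear share as the remainder after the direct and indirect shares, as in Eq. (8). The function name `binderies` and the input values in the usage note are hypothetical and serve only to show that the three binderies partition the strength relation.

```python
def binderies(sr, p_dr, p_ir):
    """Split a strength relation sr into direct, indirect and unclear binderies."""
    p_ur = 1.0 - (p_ir + p_dr)  # remainder not determined by co-occurrence
    if p_ur < 0:
        raise ValueError("p(dr) + p(ir) must not exceed 1")
    return {
        "dr": sr * p_dr,  # J1: direct relations
        "ir": sr * p_ir,  # J2: indirect relations
        "ur": sr * p_ur,  # J3: unclear relations
    }
```

With the illustrative inputs `binderies(0.5, 0.2, 0.3)`, the three parts are approximately 0.10, 0.15 and 0.25, and they sum back to the strength relation 0.5.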
Fig. 2. Type of relations based on the social network extraction
Proposition 4. If $sr$ is a strength relation between two social actors $a, b \in A$, then the aggregation of $sr$ consists of sheets.
Proof. Based on Proposition 3 and by applying Eq. (4) to the strength relation $sr$, we can generate the aggregations based on words, and we call them the sheets of relations $sh$, i.e.
Generally, this concept is considered an approach to the concept of latent semantic analysis [14] that has been put forward to produce labels on social networks based on the supervised stream or the generative probabilistic model (PGM) [4, 15]. This approach thus serves as an enhancement of the superficial method [16, 17].
Theorem 1. $sr$ is the strength relation between two actors $a, b \in A$ if and only if there is an aggregation.
Proof. This is a direct consequence of Propositions 3 and 4 as the necessary conditions, and Lemmas 1, 2, and 3 as the sufficient conditions, see Fig. 2.
generate (keyword)
INPUT: A set of actors
OUTPUT: Aggregation of the strength relations
STEPS:
1. |Ωa| ← ta query and search engine.
2. |Ωb| ← tb query and search engine.
3. |Ωa∩Ωb| ← ta∧tb query and search engine. A = {w1, w2, . . . , wn} ← Collect words (terms) per pair of actors from snippets based on doubleton.
4. |dr| ← List of snippets based on doubleton.
5. |ir| ← List of snippets based on doubleton.
6. sr∗p(dr) and sr∗p(ir).
7. Aggregating sr∗p(dr) and sr∗p(ir) based on the summation of sheets per domain.
8. Measuring recall and precision of relations.
4 Experiment
In this experiment, we involve n = 469 social actors, or n(n − 1) = 219,492 potential relations. There are 30,044 strength relations between the 469 actors, or 14% of the potential relations; among them are (a) 4,422 direct relations (2%), (b) 21,462 indirect relations (10%), and (c) 4,160 direct and indirect relations (2%). Therefore, there are 21,462 lists of snippets of doubleton (LD) that contain the triple dots in all snippets, and 4,422 lists of snippets of doubleton (LD) that have no dots in all snippets.
Suppose we define the ontology domains and taxonomically interpret each of them as a set of words as follows.
1. Direct relations:
(a) author-relationship = {activity, article, author, authors, award, journal, journals, paper, patent, presentation, proceedings, publication, theme, poster, . . .}.
(b) academic rule = {supervisor, cosupervisor, editor, editors, graduate, lecturer, professor, prof, researcher, reviewer, student, . . .}.
Table 1. The strength relation, direct and indirect relations, and author-relationship

Actors: 1. Abdullah Mohd Zin; 2. Abdul Razak Hamdan; 3. Tengku Mohd Tengku Sembok

Pair     sr       dr       ir
1 & 2    0.0482   0.0163   0.0815
1 & 3    0.0395   0.0975   0.0612
2 & 3    0.0237   0.0000   0.2349
2. Indirect relations:
(a) scientific event = {chair, conference, conferences, meeting, programme, schedule, seminar, session, sponsor, symposium, track, workshop, . . .}.
(b) citation = {reference, references, bibliography, . . .}.
With the concept of aggregation starting from the bindery, each bindery consists of chapters (domains), and each chapter contains the sheets (words).
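The summation of sheets per domain can be sketched as below. This is a minimal illustration that assumes each sheet carries the character $p_D(w)$ of its word and that a chapter is the plain sum of its sheets; the word sets are abbreviated from the lists in the text, and the function name `aggregate_sheets` and the input values in the usage note are hypothetical.

```python
# Abbreviated word sets per domain (chapter), taken from the lists in the text.
DOMAINS = {
    "author-relationship": {"article", "author", "journal", "paper", "proceedings"},
    "academic rule": {"supervisor", "editor", "lecturer", "professor", "student"},
    "scientific event": {"chair", "conference", "seminar", "session", "workshop"},
    "citation": {"reference", "references", "bibliography"},
}


def aggregate_sheets(characters, domains=DOMAINS):
    """Sum the sheet values p_D(w) of the words that belong to each domain."""
    return {
        name: sum(characters.get(word, 0.0) for word in words)
        for name, words in domains.items()
    }
```

For instance, `aggregate_sheets({"paper": 0.5, "author": 0.25, "conference": 0.25})` yields 0.75 for the author-relationship chapter, 0.25 for the scientific event chapter, and 0 for the remaining chapters; words outside every domain are ignored.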
For example, the hit counts (|Ωa|) of “Abdullah Mohd Zin”, “Abdul Razak Hamdan”, and “Tengku Mohd Tengku Sembok” are 7,740, 8,280, and 3,860, respectively, while |Ωa∩Ωb| between “Abdullah Mohd Zin” and “Abdul Razak Hamdan” is 736, |Ωa∩Ωc| between “Abdullah Mohd Zin” and “Tengku Mohd Tengku Sembok” is 441, and |Ωb∩Ωc| between “Abdul Razak Hamdan” and “Tengku Mohd Tengku Sembok” is 281. Therefore, based on Eq. (5) we have three strength relations sr as in Table 1. From 100 snippets based on doubleton, we have:
1. 60 snippets contain the indirect relations and 12 snippets contain the direct relations for “Abdullah Mohd Zin” and “Abdul Razak Hamdan”,
2. 27 snippets contain the indirect relations and 43 snippets contain the direct relations for “Abdullah Mohd Zin” and “Tengku Mohd Tengku Sembok”, and
3. 66 snippets contain the indirect relations for “Abdul Razak Hamdan” and “Tengku Mohd Tengku Sembok”.
In this case, p(dr) and p(ir) for each pair of actors are given in Table 1. From the 100 snippets for each pair of actors, pD(w) is calculated for each word, and its value is directly transferred to the sheets in the appropriate domain, as in Table 1.
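As a worked check of the dr and ir values in Table 1, the sketch below divides the reported snippet counts by the doubleton hit count of each pair. That this is the normalisation used is inferred from the reported numbers rather than stated in the text, the direct count of 0 for the third pair is likewise taken from the 0.0000 entry in Table 1, and the function name `relation_characters` is hypothetical.

```python
def relation_characters(doubleton_hits, n_dr, n_ir):
    """Return (p(dr), p(ir)) rounded to four decimals."""
    return (round(n_dr / doubleton_hits, 4), round(n_ir / doubleton_hits, 4))


# (doubleton hit count, direct snippets, indirect snippets) per pair of actors.
pairs = {
    "1 & 2": (736, 12, 60),
    "1 & 3": (441, 43, 27),
    "2 & 3": (281, 0, 66),  # direct count inferred from Table 1
}
for label, (hits, n_dr, n_ir) in pairs.items():
    print(label, relation_characters(hits, n_dr, n_ir))
# 1 & 2 (0.0163, 0.0815)
# 1 & 3 (0.0975, 0.0612)
# 2 & 3 (0.0, 0.2349)
```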
Table 2.

    Aggregation           Recall    Precision
1   Author-relationship   61.76%    17.65%
2   Research group        55.88%    7.28%
3   Academic rule         61.94%    13.15%
4   Scientific event      61.76%    6.10%
5   Citation              50.01%    6.63%
We conduct an experiment using 65 social actors that have direct and indirect relations between them, or n(n − 1) = 4,160 potential relations. Based on a survey, we obtain the relevant relations, which serve as a comparison for the results obtained through extraction from the Web. Based on Table 2, the recall and the precision give the impression that the activation of each aggregation of the strength relation is adequate.
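Recall and precision for one aggregation can be sketched as a comparison between the set of extracted relations and the set of relevant relations from the survey. The representation of a relation as a pair of actor identifiers and the function name `recall_precision` are assumptions made for illustration; the text does not describe the evaluation procedure in more detail.

```python
def recall_precision(extracted, relevant):
    """Return (recall, precision) for extracted relations against relevant ones."""
    extracted, relevant = set(extracted), set(relevant)
    hits = len(extracted & relevant)  # relations that are both extracted and relevant
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(extracted) if extracted else 0.0
    return recall, precision
```

For example, if four relations are extracted, three are relevant, and two appear in both sets, the recall is 2/3 and the precision is 2/4.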
5 Conclusion and Future Work
By studying the principles of the methods for extracting the relations between social actors, we obtain an enhanced method for aggregating the relations in order to interpret social networks more richly. However, this new method still needs further verification. In future work, we will study the combination of sheets and domains based on ontology.
References

2. Nasution, M.K.M., Noah, S.A.: Superficial method for extracting social network for academics using web snippets. In: Yu, J., Greco, S., Lingras, P., Wang, G., Skowron, A. (eds.) RSKT 2010. LNCS (LNAI), vol. 6401, pp. 483–490. Springer, Heidelberg (2010). doi:10.1007/978-3-642-16248-0_68
3. Culotta, A., Bekkerman, R., McCallum, A.: Extracting social networks and contact information from email and the Web. In: Proceedings of the 1st Conference on Email and Anti-Spam (CEAS) (2004)
4. McCallum, A., Corrada-Emmanuel, A., Wang, X.: The author-recipient-topic model for topic and role discovery in social networks, with application to Enron and academic email. In: Proceedings of the Workshop and Link Analysis, Coun-
5. Heras, S., Atkinson, K., Botti, V., Grasso, F., Julián, V., McBurney, P.: Research opportunities for argumentation in social networks. Artif. Intell. Rev. 39, 39–62 (2013)
6. Kautz, H., Selman, B., Shah, M.: ReferralWeb: combining social networks and collaborative filtering. Commun. ACM 40(3), 63–65 (1997)
7. Finin, T., Ding, L., Zhou, L., Joshi, A.: Social networking on the semantic web. Learn. Organ. 12(5), 418–435 (2005)
8. Nasution, M.K.M., Sitompul, O.S., Sinulingga, E.P., Noah, S.A.: An extracted social network mining. In: SAI Computing Conference. IEEE (2016)
9. Nasution, M.K.M.: Social network mining (SNM): a definition of relation between the resources and SNA. Int. J. Adv. Sci. Eng. Inf. Technol. 6(6), 975–981 (2016)
10. Nasution, M.K.M.: Modelling and simulation of search engine. In: International Conference on Computing and Applied Informatics (ICCAI). IOP (2016)
11. Nasution, M.K.M.: New method for extracting keyword for the social actor. In: Nguyen, N.T., Attachoo, B., Trawiński, B., Somboonviwat, K. (eds.) ACIIDS 2014. LNCS (LNAI), vol. 8397, pp. 83–92. Springer, Cham (2014). doi:10.1007/978-3-319-05476-6_9
12. Matsuo, Y., Mori, J., Hamasaki, M., Nishimura, T., Takeda, T., Hasida, K., Ishizuka, M.: POLYPHONET: an advanced social networks extraction system from the web. J. Web Semant. Sci. Serv. Agents World Wide Web 5, 262–278 (2007)
13. Nasution, M.K.M.: New similarity. In: Annual Applied Science and Engineering Conference (AASEC). IOP (2016)
14. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
15. McCallum, A., Corrada-Emmanuel, A., Wang, X.: Topic and role discovery in social networks. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence, pp. 786–791 (2005)
16. Nasution, M.K.M., Mohd Noah, S.A.: Extraction of academic social network from online database. In: Mohd Noah, S.A. et al. (eds.) Proceeding of 2011 International Conference on Semantic Technology and Information Retrieval (STAIRS 2011), pp. 64–69. IEEE, Putrajaya (2011)