Conclusion - Opinion Analysis Corpora Across Languages

Opinion Analysis Corpora Across Languages

6.4 Conclusion

In this paper, we have discussed the contributions made by our development ofNTCIR MOAT. We created a cross-lingual opinion corpus using the news document genre, following which, several researchers have conducted cross-lingual opinion research using our test collections. Although sentiment classification accuracy is improved by using a cross-lingual corpus, research investigating linguistic opinion properties characterized by languages rooted in different cultures and opinion retrieval strategies preferable for different language characteristics remain to be undertaken.

6https://nlp.stanford.edu/sentiment/.

6 Opinion Analysis Corpora Across Languages 93 In recent research, high-quality contextual representations based on neural archi- tectures such as ELMo (Peters et al.2018a) and BERT (Devlin et al.2019) are proving to be effective in NLP research. In addition, linguistic properties such as morpholog- ical, local-syntax, and longer-range semantics tend to be treated at different layers, such as the word-embedding layer, lower contextual layers, or upper layers in each of these cases (Peters et al.2018b; Jawahar et al.2019). As an extension of bilingual sentiment word-embedding frameworks (Zhou et al.2015), cross-lingual sentiment retrieval research that considers syntax and semantics in different languages will be an interesting direction for future work.

Acknowledgements This work was partially supported by JSPS Grants-in-Aid for Scientific Research (B) (#19H04420).

References

Balahur A, Steinberger R (2009) Rethinking sentiment analysis in the news: from theory to practice and back. In: Proceedings of workshop on opinion mining and sentiment analysis’ (WOMSA), Sevilla, Spain, pp 1–12

Balahur A, Steinberger R, Kabadjov M, Zavarella V, van der Goot E, Halkia M, Pouliquen B, Belyaeva J (2010) Sentiment analysis in the news. In: Proceedings of the 7th international conference on language resources and evaluation (LREC 2010), Valletta, Malta, pp 2216–2220 Blitzer J, Dredze M, Pereira F (2007) Biographies, bollywood, boom-boxes and blenders: domain

adaptation for sentiment classification. In: Proceedings of the 45th annual meeting of the association of computational linguistics (ACL), pp 440–447

Brody S, Diakopoulos N (2011) Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! using word lengthening to detect sentiment in microblogs. In: Proceedings of the 2011 conference on empirical methods in natural language processing (EMNLP 2011), Edinburgh, Scotland, UK, pp 562–570 Choi Y, Cardie C, Riloff E, Patwardhan S (2005) Identifying sources of opinions with conditional

random fields and extraction patterns. In: Proceedings of the 2005 human language technology conference and conference on empirical methods in natural language processing (HLT/EMNLP 2005), Vancouver, B. C

Dang HT (2008) Overview of the TAC 2008 opinion question answering and summarization tasks.

In: Text analysis conference (TAC 2008) workshop notebook papers. Accessed 1 Apr 2010.

Available from:http://www.nist.gov/tac/tracks/2008/index.html

Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional trans- formers for language understanding. In: Proceedings of the 2019 conference of the North Amer- ican chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), Minneapolis, Minnesota, pp 4171–4186.https://doi.org/10.18653/

v1/N19-1423

Elming J, Plank B, Hovy D (2014) Robust cross-domain sentiment analysis for low-resource languages. In: Proceedings of the 5th workshop on computational approaches to subjectivity, sentiment and social media analysis, association for computational linguistics, Baltimore, Maryland, pp 2–7.https://doi.org/10.3115/v1/W14-2602,https://www.aclweb.org/anthology/W14-2602 Felbo B, Mislove A, Søgaard A, Rahwan I, Lehmann S (2017) Using millions of emoji occurrences

to learn any-domain representations for detecting sentiment, emotion and sarcasm. In: Proceed- ings of the 2017 conference on empirical methods in natural language processing, Copenhagen, Denmark, pp 1615–1625.https://doi.org/10.18653/v1/D17-1169

Gao D, Wei F, Li W, Liu X, Zhou M (2015) Cross-lingual sentiment lexicon learning with bilingual word graph label propagation. Comput Linguist 41:21–40

94 Y. Seki Jawahar G, Sagot B, Seddah D (2019) What does BERT learn about the structure of language? In:

Proceedings of the 57th annual meeting of the association for computational linguistics, Florence, Italy, pp 3651–3657.https://doi.org/10.18653/v1/P19-1356

Jiang L, Yu M, Zhou M, Liu X, Zhao T (2011) Target-dependent twitter sentiment classification.

In: Proceedings of the 49th annual meeting of the association for computational linguistics (ACL 2011), Portland, Oregon, pp 151–160

Le TA, Moeljadi D, Miura Y, Ohkuma T (2016) Sentiment analysis for low resource languages: a study on informal Indonesian tweets. In: Proceedings of the 12th workshop on Asian language resources (ALR), The COLING 2016 Organizing Committee, Osaka, Japan, pp 123–131.https://

www.aclweb.org/anthology/W16-5415

Louviere JJ, Flynn TN, Marley AAJ (2015) Best-worst scaling: theory, methods and applications.

Cambridge University Press, Cambridge

Lu B, Tan C, Cardie C, Tsou BK (2011) Joint bilingual sentiment classification with unlabeled parallel corpora. In: Proceedings of the 49th annual meeting of the association for computational linguistics (ACL 2011), Portland, Oregon, pp 320–330

Macdonald C, Ounis I (2006) The TREC Blogs06 collection: creating and analysing a blog test collection. Technical report TR-2006-224, Department of Computing Science, University of Glasgow

Martinez-Camara E, Martin-Valdivia MT, Urena-Lopez LA, Montejo-Raez AR (2013) Sentiment analysis in twitter. Nat Lang Eng 11(2):1–28

Meng X, Wei F, Liu X, Zhou M, Xu G, Wang H (2012) Cross-lingual mixture model for sentiment classification. In: Proceedings of the 50th annual meeting of the association for computational linguistics (ACL 2012), Jeju, Republic of Korea, pp 572–581

Mohammad SM, Bravo-Marquez F, Salameh M, Kiritchenko S (2018) SemEval-2018 task 1: affect in tweets. In: Proceedings of the 12th international workshop on semantic evaluation (SemEval- 2018), New Orleans, Louisiana, pp 1–17.https://doi.org/10.18653/v1/S18-1001

Pang B, Lee L (2005) Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (ACL 2005), Ann Arbor, Michigan, pp 115–124

Pang B, Lee L (2008) Opinion mining and sentiment analysis. Now Publishers Inc, Boston Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? sentiment classification using machine learning

techniques. In: Proceedings of the 2002 conference on empirical methods in natural language processing (EMNLP 2011), pp 79–86

Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018a) Deep con- textualized word representations. In: Proceedings of the 2018 conference of the North American Chapter of the association for computational linguistics: human language technologies, volume 1 (long papers), New Orleans, Louisiana, pp 2227–2237.https://doi.org/10.18653/v1/N18-1202 Peters M, Neumann M, Zettlemoyer L, tau Yih W (2018b) Dissecting contextual word embeddings:

architecture and representation. In: Proceedings of the 2018 conference on empirical methods in natural language processing (EMNLP 2018), Brussels, Belgium, pp 1499–1509

Purver M, Battersby S (2012) Experimenting with distant supervision for emotion classification. In:

Proceedings of the 13th Eurpoean chapter of the association for computational linguistics (EACL 2012), Avignon, France, pp 482–491

Ruppenhofer J, Somasundaran S, Wiebe J (2008) Finding the sources and targets of subjective expressions. In: Proceedings of the 6th international language resources and evaluation (LREC’08), Marrakech, Morocco

Seki Y, Evans DK, Ku LW, Chen HH, Kando N, Lin CY (2007) Overview of opinion analysis pilot task at NTCIR-6. In: Proceedings of the 6th NTCIR workshop meeting, NII, Japan, pp 265–278 Seki Y, Evans DK, Ku LW, Sun L, Chen HH, Kando N (2008) Overview of multilingual opinion analysis task at NTCIR-7. In: Proceedings of the 7th NTCIR workshop meeting, NII, Japan, pp 185–203

6 Opinion Analysis Corpora Across Languages 95 Seki Y, Ku LW, Sun L, Chen HH, Kando N (2010) Overview of multilingual opinion analysis task at NTCIR-8 - a step toward cross lingual opinion analysis. In: Proceedings of the 8th NTCIR workshop meeting, NII, Japan, pp 209–220

Shanahan JG, Qu Y, Wiebe JM (2006) Computing attitude and affect in text: theory and applications.

Springer, Berlin

Socher R, Perelygin A, Wu J, Manning JCCD, Ng A, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the (2013) conference on empirical methods in natural language processing (EMNLP 2013). Association for Computational Linguistics, Seattle, Washington, USA, pp 1631–1642

Steinberger R, Pouliquen B, van der Goo E, (2009) An introduction to the Europe media monitor. In:

Proceedings of ACM SIGIR 2009 workshop: information access in a multilingual world, Boston, MA, USA

Stoyanov V, Cardie C, Wiebe J (2005) Multi-perspective question answering using the OpQA corpus. In: Proceedings of the 2005 human language technology conference and conference on empirical methods in natural language processing (HLT/EMNLP 2005), Vancouver, B. C Strapparava C, Mihalcea R (2007) SemEval-2007 task 14: affective text. In: Proceedings of the 4th

international workshop on semantic evaluations (SemEval-2007), Prague, pp 70–74

Turney P (2002) Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th annual meeting on association for computational linguistics, Philadelphia, USA, pp 417–424

Wan X (2009) Co-training for cross-lingual sentiment classification. In: Proceedings of the 47th annual meeting of the association of computational linguistics (ACL), Suntec, Singapore, pp 235–243

Wiebe J, Breck E, Buckley C, Cardie C, Davis P, Fraser B, Litman D, Pierce D, Riloff E, Wilson T (2002) NRRC summer workshop on multiple-perspective question answering final report Wiebe J, Wilson T, Cardie C (2005) Annotating expressions of opinions and emotions in language.

Lang Resour Eval 39(2–3):165–210

Wiebe JM, Wilson T, Bruce RF, Bell M, Martin M (2004) Learning subjective language. Comput Linguist 30(3):277–308

Wilson T, Wiebe J, Hoffmann P (2005) Recognizing contextual polarity in phrase-level sentiment analysis. In: Proceedings of the 2005 human language technology conference and conference on empirical methods in natural language processing (HLT/EMNLP 2005), Vancouver, B. C Zhou H, Chen L, Shi F, Huang D (2015) Learning bilingual sentiment word embeddings for cross-

language sentiment classification. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 1: long papers), Beijing, China, pp 430–440.https://doi.org/10.3115/v1/P15- 1042

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Chapter 7

Dalam dokumen OER000001404.pdf (Halaman 104-108)