Understanding Emojis and their Semantics: A Survey
Kajal Gupta, Kanchan and Pushpak Bhattacharyya
Dept. of Computer Science and Engineering
Indian Institute of Technology, Bombay
{guptakajal9411, pal.kanchan04, pushpakbh}@gmail.com
Abstract
In social media, it has become a trend to use emojis along with text to be more expressive. These emojis have become an essential part of computer-mediated communication. In recent years, emojis have become a popular topic of research in the field of natural language processing due to their varied semantics and senses. This paper gives an introduction to various open resources that are available for emojis. It also compiles various works that have been done in different research areas of emojis to gain some insights.
1 Introduction
Nowadays, people are so engaged with social media that the number of users across the globe was expected to rise from 2.46 billion in 2017 to 2.77 billion in 2019.1 Moreover, in recent times, the writing habits of social media users have also changed. They have increased the use of pictographs, called emojis, along with text to be more lively and expressive.
Emojis can play different roles or functions depending on the situation. They can visually represent objects or can simply express or intensify some sentiment or emotion in a tweet. In some cases, they can also reverse the sentiment of the tweet, thus causing sarcasm. Example tweets for these functions are given in Table 1.
Along with playing different roles, emojis can also take different senses depending on the context. Also, like words, emoji interpretations can vary across cultures, which makes emojis difficult to interpret even for humans. This calls for research on emojis in Natural Language Processing in order to fully understand their meanings and usage and to capture the opinions expressed by people. In this paper, we explain various works that have been done in the field of emojis.
1https://www.statista.com/statistics/278414/number-of-worldwide-social-network-users/
Example Usage
F1  Travelling to UK for vacation
F2  I met Dwayne Johnson today!!!
F3  Ughh I hate my life
F4  Such an awesome birthday

Table 1: Example usage of emojis to represent objects (F1), express sentiment (F2), intensify sentiment (F3) and express sarcasm (F4).
The organization of the paper is as follows. The paper starts with a description of various open resources which are available for emojis on the web in Section 2. Then, Section 3 describes how different emojis are interpreted differently across languages. Section 4 describes how emojis can be helpful for detecting sarcasm and irony. The works done in the fields of Emoji Sense Disambiguation and Emoji Sense Similarity are explained in Section 5 and Section 6 respectively. Section 7 explains an approach used for the task of emoji prediction on Twitter. Finally, we conclude the paper in Section 8.
2 Open Resources for Emoji
This section gives an introduction to various open resources which are available for emojis on the web. These resources convey a lot of information about an emoji, such as its sentiment, sense definitions, keywords, platform-dependent representations, and related emojis.
2.1 Unicode Consortium
Unicode is a text encoding standard enforcing a uniform interpretation of text byte codes by computers.2 This enables people around the world
2http://www.unicode.org/
to use computers in any language. This consortium provides a complete list of the Unicode emoji characters.3
The consortium not only provides the Unicode code point for each emoji but also provides the different emoji representations of vendors such as Google, Apple, and Twitter. It also provides the CLDR short name and keywords associated with each emoji.
2.1.1 Emoji Sentiment Ranking
Novak et al. (2015) created the Emoji Sentiment Ranking,4 the first emoji sentiment lexicon, covering the 751 most frequent emojis. For its creation, 83 human annotators were recruited to classify 1.6 million tweets, collected in 13 different European languages, into negative, neutral, or positive classes.
Based on the 4% of the collected tweets which contained emojis, they ranked the popular emojis using the sentiment score of the plain text of the tweet. The Emoji Sentiment Ranking has a format similar to SentiWordNet (Esuli and Sebastiani, 2006). For each emoji, a negativity p−, neutrality p0, positivity p+ and sentiment score s̄ are assigned.
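To make the lexicon format concrete, the sketch below derives the sentiment score of an emoji from its class probabilities; it is a minimal illustration under the assumption that the score is the mean of the discrete sentiment distribution over {−1, 0, +1}, which reduces to p+ − p−. The field names and example values are hypothetical, not taken from the actual lexicon.

```python
# Minimal sketch: deriving an emoji sentiment score from its class probabilities
# in an Emoji Sentiment Ranking-style lexicon. Field names and values are hypothetical.

def sentiment_score(p_neg: float, p_neutral: float, p_pos: float) -> float:
    """Mean of the discrete sentiment distribution over {-1, 0, +1},
    which simplifies to p_pos - p_neg."""
    assert abs(p_neg + p_neutral + p_pos - 1.0) < 1e-6, "probabilities must sum to 1"
    return p_pos - p_neg

# A hypothetical, mostly positive emoji entry.
entry = {"emoji": "\U0001F602", "p_neg": 0.25, "p_neutral": 0.22, "p_pos": 0.53}
print(entry["emoji"], sentiment_score(entry["p_neg"], entry["p_neutral"], entry["p_pos"]))
```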
2.2 Emojipedia
Emojipedia is an emoji reference site.5 It hosts various categories of emoji, such as Smileys and People or Animals and Nature, as well as popular emojis and emoji-related news. For each emoji, it provides the Unicode representation, a definition, images based on different rendering platforms, short code names, and other emojis manually asserted to be related.
2.3 iEmoji
iEmoji6 is a web service that allows you to convert an emoji into a viewable format. It also helps in understanding how emojis are being used in social media posts. For each emoji, it provides the Twitter emoji popularity rank, images across platforms, the short code name, and keywords describing the emoji. It also provides a human-generated description, the Unicode character representation, a category within a manually-built hierarchy, examples of use in social media (Twitter) posts, and the history of each emoji.
3http://www.unicode.org/emoji/charts/full-emoji-list.html
4http://kt.ijs.si/data/Emoji_sentiment_ranking/
5https://emojipedia.org/
6http://www.iemoji.com/
2.4 The Emoji Dictionary
The Emoji Dictionary7 is the first crowdsourced emoji resource on the web. Since it is a crowdsourced site, it lets users add definitions or sense labels to any emoji. These sense labels define how an emoji can be used in a sentence. It organizes the meanings of an emoji under three part-of-speech tags, namely nouns, verbs, and adjectives. It also lists an image of the emoji and its definition, with example uses spanning multiple sense labels.
3 Emoji Misinterpretation
Emojis can be defined as pictures which are naturally combined with plain text, creating a new form of language. These pictures remain the same independently of where we live, but they can be used and interpreted differently. In order to understand this, Barbieri et al. (2016a) compared the meaning and usage of emojis across different languages from a natural language processing point of view. An empirical research methodology relying on current vector space representation modeling (Turney and Pantel, 2010) was adopted to understand the "semantics" of these emojis.
A corpus of more than 30 million tweets in four different languages, American English (USA), British English (UK), Peninsular Spanish (ESP), and Italian (ITA), was collected and different experiments were carried out to compare emojis. In order to preprocess the tweets, the same procedure as in Barbieri et al. (2016b) was followed. Both the words and the emojis were modelled in the same vector space, relying on the skip-gram embedding model introduced by Mikolov et al. (2013), with 300 dimensions and a window size of 6 tokens.
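As a rough illustration of this setup (not the authors' original pipeline), the sketch below trains one such skip-gram model over pre-tokenized tweets with gensim, treating every emoji as an ordinary token so that words and emojis end up in the same vector space. The corpus file name and tokenization are assumptions.

```python
# Sketch: words and emojis in one skip-gram space (gensim 4.x API assumed).
from gensim.models import Word2Vec

# Each line of "tweets_usa.txt" is assumed to be one pre-tokenized tweet,
# with emojis kept as separate tokens.
with open("tweets_usa.txt", encoding="utf-8") as f:
    tweets = [line.split() for line in f]

model = Word2Vec(
    sentences=tweets,
    vector_size=300,   # 300-dimensional embeddings, as described above
    window=6,          # window size of 6 tokens
    sg=1,              # skip-gram
    min_count=5,
    workers=4,
)
model.wv.save("emoji_word_vectors_usa.kv")  # hypothetical output path
```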
Several experiments were performed to compare the way the semantics of emojis vary across languages. The main aim is to quantify how much the meaning of an emoji A is preserved across different languages by measuring to what extent the emojis that are most similar to A overlap across languages. The vector representation of each emoji in a specific language is exploited to select the emojis with similar vectors, which are presumably closest in meaning.
The Nearest Neighbours NN_l(e) of an emoji e in a language l are defined as the set of the 10 emojis nearest to e in the semantic space of language l.
7https://emojidictionary.emojifoundation.com/
The nearest neighbours of each emoji were retrieved with respect to their cosine similarity to other emojis. Since an emoji is thus defined by other similar emojis, these representations can be compared across different languages. In order to see whether an emoji is defined similarly in two languages, the common elements in the NN representations of that emoji in the two languages are examined. If the representations of the emoji in the two languages share many elements, the emoji is defined, and thus used, in a similar way. If there are no common elements between the two representations, the emoji is more likely to mean something different in the two languages.
More precisely, to determine whether an emoji e is used similarly in languages l1 and l2, the size of the intersection of its NN sets is measured:

sim_{l1,l2}(e) = |NN_{l1}(e) ∩ NN_{l2}(e)|

It is assumed that if sim_{l1,l2}(e) is equal to 10, the emoji e has the same meaning in languages l1 and l2. On the other hand, if sim_{l1,l2}(e) is equal to 0, the emoji means something different in the two languages. Whether an emoji means the same across all the languages (sim_all) is also measured, by looking at the overlap of all the sets of emojis that are most similar to e in each language.
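A minimal sketch of this comparison, assuming two gensim KeyedVectors files (one per language, with hypothetical file names) in which emojis appear as ordinary vocabulary tokens:

```python
# Sketch: cross-lingual overlap of an emoji's nearest emoji neighbours.
from gensim.models import KeyedVectors

def emoji_nearest_neighbours(kv: KeyedVectors, emoji: str, k: int = 10) -> set:
    """Return the k emojis most similar to `emoji`, skipping ordinary words."""
    neighbours = []
    for token, _score in kv.most_similar(emoji, topn=500):
        if any(ord(ch) > 0x1F000 for ch in token):  # crude emoji test (assumption)
            neighbours.append(token)
        if len(neighbours) == k:
            break
    return set(neighbours)

kv_usa = KeyedVectors.load("emoji_word_vectors_usa.kv")  # hypothetical paths
kv_uk = KeyedVectors.load("emoji_word_vectors_uk.kv")

emoji = "\U0001F44B"  # waving hand
sim = len(emoji_nearest_neighbours(kv_usa, emoji) & emoji_nearest_neighbours(kv_uk, emoji))
print(f"sim_USA,UK({emoji}) = {sim} out of 10")
```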
The top half of Figure 1 shows the emojis with a high value of sim_all, which indicates that emojis related to music, nature and food are mostly language independent. The lower half of the figure contains emojis with a low value of sim_all, which indicates that they are language dependent. For each emoji, the rank (where 1 is the most common emoji over the four languages and 150 the least used one), the six combinations of sim_{l1,l2}, and sim_all are indicated.
Looking at the bottom of the figure, it can be seen that some emojis are used in a very different way across all the languages, and each language seems to have its own way of defining them. A few other emojis also do not seem to keep their meaning across languages (the 100 emoji, for example, might stand for just a number or for an excellent grade). For the waving-hand emoji, the NN sets for USA and UK are plotted in Figure 2, and it can be observed that the emoji is interpreted in two different ways. In the case of American English, the waving hand seems to mean bye/see you later (smiley and people-waving emojis), while in British English the waving-hand emoji is related
Figure 1: Emojis with high sim_all on the top and with low sim_all on the bottom (Barbieri et al., 2016a)
to travelling (country flags, a train and an airplane are included in the NN set).
4 Use of Emojis to Detect Sarcasm and Irony
Automatic detection of figurative language is a challenging task in computational linguistics. It is very difficult for a machine to differentiate between the literal and the figurative meaning of a text. Sarcasm and irony refer to cases where the intended meaning of a text is the opposite of its literal meaning. The difference between sarcasm and irony is that the former is always negative and is aimed at ridiculing another person, while this is not necessarily true for the latter. Detecting sarcasm and irony in text is therefore very challenging, and a lot of research is being done on these topics.
Emojis have been used as a feature for sarcasm detection (Barbieri et al., 2014). Other features that
Figure 2: Nearest Neighbours of the waving hand emoji for USA (left) and UK (right) (Barbieri et al., 2016a)
were used in this work include:
• Frequency: gap between rare and common words
• Written-Spoken: written vs. spoken style usage
• Intensity: intensity of adverbs and adjectives
• Structure: length and punctuation
• Sentiments: gap between positive and negative terms
• Synonyms: common vs. rare synonym usage
• Ambiguity: measure of possible ambiguities

The study shows that the emojis were useful in this sarcasm detection task.
Emoticons have also been used for detecting irony in text (Carvalho et al., 2009). This work described eight linguistic patterns that can be helpful in detecting irony. The aim of the work was to recognize irony in apparently positive sentences involving human named entities (NE) in Portuguese. These patterns include:
1. Demonstrative Determiners: In Portuguese, the occurrence of any demonstrative form before a human NE usually indicates that such an entity is being negatively mentioned.
2. Interjections: It is believed that some interjections can be used as potential clues for irony detection.
3. Verb Morphology: The type of pronoun used for addressing people can also be an important clue for irony detection.
4. Cross-Constructions: In Portuguese, evaluative adjectives with a prior positive or neutral polarity usually take a negative or ironic interpretation whenever they appear in cross-constructions.
5. Heavy Punctuation: It is assumed that the presence of more than one exclamation mark and/or question mark in a sentence can be used as a clue for irony detection.
6. Quotation Marks: They are also frequently used to express and emphasize ironic content, especially if the content has a prior positive polarity, which is the main focus of this work.
7. Laughter Expressions: This includes the use of acronyms like LOL, onomatopoeic expressions such as "ah", "eh", "hi", etc., and prior positive emoticons such as ":)", ";-)" and ":P" to detect irony.
The study proved that the emoticons were useful for detecting irony.
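Some of these surface patterns are easy to operationalize. The sketch below is our own illustration (not the original system, which targeted Portuguese and used additional lexical resources); it checks a sentence for three of the clues above: heavy punctuation, quoted content, and laughter expressions or prior positive emoticons.

```python
# Sketch: simple surface clues for irony, loosely following Carvalho et al. (2009).
import re

LAUGHTER = re.compile(r"\b(lol\w*|haha\w*|hehe\w*)\b", re.IGNORECASE)
POSITIVE_EMOTICONS = (":)", ";-)", ":P", ":-)")

def irony_clues(sentence: str) -> dict:
    """Return which of three surface clues fire on the sentence."""
    return {
        # more than one '!' and/or '?' in a row
        "heavy_punctuation": bool(re.search(r"[!?]{2,}", sentence)),
        # something placed inside quotation marks
        "quotation_marks": bool(re.search(r'"[^"]+"|“[^”]+”', sentence)),
        # laughter acronyms/onomatopoeia or prior positive emoticons
        "laughter_or_emoticon": bool(LAUGHTER.search(sentence))
        or any(e in sentence for e in POSITIVE_EMOTICONS),
    }

print(irony_clues('Oh sure, he is such a "genius"!!! lol :P'))
```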
5 Emoji Sense Disambiguation
Emojis have become an extremely popular form of communication because of their powerful way of expressing emotions in a single character. But an emoji may be used in different contexts to express different senses, which makes emojis hard to disambiguate using traditional NLP techniques. This section first introduces an emoji resource, EmojiNet,8 and then describes how it is useful for the task of Emoji Sense Disambiguation.
5.1 EmojiNet: A Sense Inventory for Emoji

EmojiNet is an emoji resource built by Wijeratne et al. (2016), similar to the lexical database WordNet (Fellbaum, 1998). The difference is that the former assigns context-based meanings, i.e., senses, to different emojis, while the latter assigns senses to different words and groups them into sets of synonyms called synsets. EmojiNet is the first machine-readable sense inventory for emojis and provides the following information about each emoji:
1. the part-of-speech tags (PoS tags) for a particular use of the emoji,
2. the definition of the emoji and the senses it is used in,
3. example uses of the emoji for each sense,
4. links of emoji senses to other knowledge bases such as BabelNet9 or Wikipedia10.

8http://emojinet.knoesis.org/

EmojiNet integrates four openly available emoji resources (Unicode Consortium, Emojipedia, iEmoji and The Emoji Dictionary) along with BabelNet to create a dataset consisting of 12,904 sense labels over 2,389 emojis.
5.2 Emoji Sense Disambiguation using EmojiNet
Emoji sense disambiguation is the ability to identify the meaning of an emoji in the context of a message in a computational manner. To use EmojiNet for this task, Wijeratne et al. (2017b) made the following enhancements:
1. They first created two word embedding models, learned over Twitter and over news articles respectively, using the Word2Vec model (Mikolov et al., 2013).
2. For each emoji e_i ∈ E, they extracted the definition d_i of the emoji e_i and the set of all emoji sense definitions S_i of e_i from EmojiNet. Then, for each word w in d_i, they extracted the twenty most similar words from the two word embedding models as two separate sets, namely CW^T_{e_i} and CW^N_{e_i}. Similarly, for each emoji sense definition s_i ∈ S_i that belongs to e_i, they extracted the words w_{s_i} in s_i and repeated the same process to learn two separate context word sets, CW^T_{e_i−s_i} and CW^N_{e_i−s_i}.
3. For example, for the pistol emoji, EmojiNet lists "A gun emoji, more precisely a pistol. A weapon that has potential to cause great harm" as its emoji definition. For each word in this definition, the top twenty most similar words learned using the two word embedding models are used to generate context words. The same process is applied to each of its emoji sense definitions as well.
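A minimal sketch of this context-word expansion step, assuming a pre-trained gensim KeyedVectors model (the file name is hypothetical):

```python
# Sketch: expand an emoji definition into a context word set using the
# twenty most similar words per definition word (Twitter embeddings assumed).
from gensim.models import KeyedVectors

kv_twitter = KeyedVectors.load("twitter_word_vectors.kv")  # hypothetical path

def context_words(definition: str, kv: KeyedVectors, topn: int = 20) -> set:
    """Union of the topn most similar words for every in-vocabulary definition word."""
    words = set()
    for w in definition.lower().split():
        if w in kv.key_to_index:               # skip out-of-vocabulary tokens
            words.update(t for t, _ in kv.most_similar(w, topn=topn))
    return words

pistol_definition = ("A gun emoji, more precisely a pistol. "
                     "A weapon that has potential to cause great harm")
cw_pistol = context_words(pistol_definition, kv_twitter)
```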
The process of emoji sense disambiguation using EmojiNet is as follows:
1. First, 25 emojis are selected which have been shown by Miller et al. (2016) to be interpreted differently when used in communication.
9http://babelnet.org/about
10https://www.wikipedia.org
2. Then, 50 tweets are randomly selected for each of the 25 emojis from the Twitter corpus that was used to train the word embedding model.
3. Three sets of contexts are defined for an emoji sense based on the three different datasets:
(a) BabelNet-based context: This contains the set of words included in the BabelNet sense definitions for an emoji.
(b) Twitter-based context: This contains the set of context words learned by using the Twitter word embedding model for the emoji.
(c) News-based context: This contains the set of context words learned by using the Google News word embedding model for an emoji.
4. Now, to find the sense of an emoji in a tweet, the context overlap between the context of the emoji in the tweet and the context words taken from each of the above three sets is calculated.
5. The sense which has the highest context word overlap is then assigned to the emoji for that particular tweet.
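A minimal sketch of this overlap-based sense assignment (a simplified reading of the procedure; the data structures and example senses are assumptions):

```python
# Sketch: assign the sense whose context words overlap most with the tweet.
def disambiguate(tweet_tokens: list, sense_contexts: dict) -> str:
    """sense_contexts maps a sense label (e.g. 'gun(noun)') to its context word set,
    built from BabelNet definitions and the Twitter/News embedding expansions."""
    tweet_context = {t.lower() for t in tweet_tokens}
    return max(sense_contexts, key=lambda sense: len(sense_contexts[sense] & tweet_context))

# Hypothetical context sets for two senses of the pistol emoji.
senses = {
    "gun(noun)": {"weapon", "shoot", "pistol", "harm", "bullet"},
    "toy(noun)": {"toy", "water", "squirt", "play", "summer"},
}
print(disambiguate("heading to the range to shoot my new pistol".split(), senses))
```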
6 Emoji Sense Similarity
Like words, different emojis can also convey similar meanings when used in the same context, which raises the problem of emoji similarity. Different research works have proposed creating emoji embeddings to help in finding the semantic similarity between different emojis.
6.1 Vector Skip-Gram Model
Barbieri et al. (2016b) employed the skip-gram neural embedding model introduced by Mikolov et al. (2013), mapping both words and emojis into the same space.
They did two evaluations of their system. These are:
• Quantitative Evaluation
They compiled EmoTwi50, a human gold standard dataset that contains a set of 50 pairs of emojis annotated with degrees of similarity (functional similarity) and relatedness (topical similarity). Pearson correlations between the human gold standard and the similarity scores given by their models (for each pair, the cosine similarity of the vectors of the two emojis) were calculated to see whether the emoji embedding models are able to capture both similarity and relatedness (a minimal sketch of this evaluation is given after this list).
• Qualitative Evaluation
This evaluation was performed to see the quality of the vectors that represent each emoji. It is composed of two parts:
1. Single emojis
In the first part of this evaluation, it was explored whether similar emojis are plotted close to each other. Here, the vectors of 100 emojis were reduced to two dimensions and plotted in the same space. It can be seen from Figure 3 that, overall, the vectors are able to group similar emojis together.
Figure 3: Visualisation of 100 emoji vectors (Barbieri et al., 2016b).
The second part of this evaluation looked at the relation between words and emojis. In Figure 4, five facial-expression emojis and five object, people and place emojis are selected, and for each emoji the five most similar text tokens are shown.
2. Clusters of emojis
Figure 4: The five text tokens that best characterize each emoji (Barbieri et al., 2016b).

The purpose of this evaluation was to see whether emojis can be grouped by topic. They built 11 clusters with K-means from the 300 most frequent emojis. In Figure 3, the colour of the cluster is shown behind each emoji, and the emojis plotted are the 10 emojis closest to the centroid of each cluster. It can be seen that most of the clusters have a clear identity and seem to be quite consistent.
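The quantitative evaluation mentioned above can be sketched as follows; this is our illustration, and the embedding file, the gold-standard pairs and their scores are all hypothetical:

```python
# Sketch: correlate model cosine similarities with human similarity judgements.
from gensim.models import KeyedVectors
from scipy.stats import pearsonr

kv = KeyedVectors.load("emoji_word_vectors_usa.kv")  # hypothetical model

# Hypothetical gold-standard pairs: (emoji_a, emoji_b, human_similarity_score)
gold = [
    ("\U0001F602", "\U0001F606", 4.2),
    ("\U0001F602", "\U0001F622", 2.1),
    ("\U0001F602", "\U0001F697", 0.8),
]

model_scores = [kv.similarity(a, b) for a, b, _ in gold]   # cosine similarity
human_scores = [score for _, _, score in gold]

r, p = pearsonr(model_scores, human_scores)
print(f"Pearson correlation: {r:.3f} (p = {p:.3f})")
```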
6.2 Emoji2Vec
Eisner et al. (2016) proposed an approach for creating emoji embeddings in which emojis are mapped into the same space as the 300-dimensional word2vec embeddings so that both can be used together in any task. In order to learn the emoji embeddings, they used the descriptions of emojis from the Unicode standard, an example of which is given in Figure 5.
Figure 5: Description of U+1F574 taken from the Unicode standard (Eisner et al., 2016)
They trained emoji embeddings using examples consisting of an emoji and a sequence of words w_1, ..., w_N describing that emoji. They took the sum of the individual word vectors of the descriptive phrase, as found in the Google News word2vec embeddings:

v_j = Σ_{k=1}^{N} w_k    (1)

where w_k is the word2vec vector of the k-th word in the description and v_j is the vector representation of the description.
Now, for each emoji, they define a vector x_i to represent its embedding, which is learned by modelling the probability of a match between x_i and v_j using the sigmoid of the dot product of the two vectors. A logistic loss is used for training the embeddings.
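A minimal sketch of this training setup, using PyTorch and random toy data (our illustration of the idea, not the authors' implementation; the vocabulary size and training details are assumptions):

```python
# Sketch of emoji2vec-style training: learn emoji vectors x_i such that
# sigmoid(x_i . v_j) predicts whether description j matches emoji i.
# Description vectors v_j are assumed precomputed (summed word2vec vectors) and frozen.
import torch
import torch.nn as nn

class Emoji2Vec(nn.Module):
    def __init__(self, num_emojis: int, dim: int = 300):
        super().__init__()
        self.emoji_vecs = nn.Embedding(num_emojis, dim)   # the x_i to be learned

    def forward(self, emoji_ids, description_vecs):
        x = self.emoji_vecs(emoji_ids)                    # (batch, dim)
        return (x * description_vecs).sum(dim=-1)         # dot product x_i . v_j

num_emojis, dim = 1000, 300                               # hypothetical vocabulary size
model = Emoji2Vec(num_emojis, dim)
loss_fn = nn.BCEWithLogitsLoss()                          # logistic loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One toy training step on random data; real training pairs each emoji with its own
# description (label 1) and with mismatched descriptions (label 0).
emoji_ids = torch.randint(0, num_emojis, (32,))
desc_vecs = torch.randn(32, dim)                          # stand-in for the v_j
labels = torch.randint(0, 2, (32,)).float()

loss = loss_fn(model(emoji_ids, desc_vecs), labels)
loss.backward()
optimizer.step()
print(float(loss))
```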
The embeddings learned this way were then used in a sentiment analysis task to find out how much information they capture. For this purpose, the dataset by Novak et al. (2015) was used, which contains 67k English tweets labeled as positive, negative or neutral. The feature vector of a tweet was the sum of the embeddings corresponding to each word or emoji occurring in the tweet. The results showed that the embeddings learned from descriptions are more informative and capture the semantics of emojis very well.
6.3 A Semantics-Based Measure of Emoji Similarity
Similar to the work in subsection 6.2, Wijeratne et al. (2017a) created emoji embedding models that were learned over the machine-readable emoji meanings in the EmojiNet knowledge base and presented a thorough analysis of the semantic similarity of emojis. For training these emoji embedding models, they constructed three different ways to represent the meaning of an emoji using the information in EmojiNet. These are:
• Emoji Description (Sense_Desc.): The description of each emoji was extracted because it gives information about what is depicted in the emoji and its intended uses.
• Emoji Sense Labels (Sense_Label): Emoji sense labels (word-POS tag pairs) were also extracted, as they describe the senses, and the parts of speech, under which an emoji can be used in a sentence.
• Emoji Sense Definitions (Sense_Def.): Emoji sense definitions were also collected, as they give a textual description of each emoji sense label.
An example of an emoji with a representation containing this information is given in Figure 6.
The details of their work are as follows:
Figure 6: Example emoji with its representation (Wijeratne et al., 2017a).
1. Model:
They used Word2Vec (skip-gram with negative sampling) to obtain word embeddings for two different corpora: a Twitter corpus and a Google News corpus. These word embeddings were then used to obtain emoji embeddings for the different types of emoji definitions listed above. All words in each emoji definition were replaced with their corresponding word vectors, and the final emoji embedding was obtained by taking the average of the word vectors of all words in the emoji definition to form a single vector of size 300 (a minimal sketch of this step is given after this list).
2. Evaluation:
For the purpose of evaluating these emoji embeddings, they created an emoji similarity dataset called 'EmoSim508', which consists of 508 emoji pairs that were assigned similarity scores by ten human judges. They then used Spearman's rank correlation coefficient11 to evaluate the alignment of the emoji similarity rankings generated by their emoji embedding models (cosine similarity of two emoji vectors) with the emoji similarity rankings of the EmoSim508 dataset. They also performed a qualitative evaluation of these embeddings by applying them to a sentiment analysis task, where their model outperformed the Emoji2Vec model described in the previous subsection.

11https://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient
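The averaging step referred to above can be sketched as follows, assuming a pre-trained gensim KeyedVectors model (the file name is hypothetical):

```python
# Sketch: an emoji embedding as the average of the word vectors of its
# EmojiNet definition (description, sense labels or sense definitions).
import numpy as np
from gensim.models import KeyedVectors

kv = KeyedVectors.load("twitter_word_vectors.kv")  # hypothetical 300-d model

def emoji_embedding(definition_text: str, kv: KeyedVectors) -> np.ndarray:
    """Average the vectors of all in-vocabulary words in the definition."""
    vectors = [kv[w] for w in definition_text.lower().split() if w in kv.key_to_index]
    if not vectors:
        return np.zeros(kv.vector_size)
    return np.mean(vectors, axis=0)                # a single 300-dimensional vector

emb = emoji_embedding("a yellow face with a broad open smile", kv)
```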
6.4 Grouping of Emojis
With a huge number of emojis available, it becomes difficult for a user to figure out an appropriate emoji and then to search for it, which takes a lot of time. To reduce this search time, Pohl et al. (2017) proposed a keyboard design in which similar emojis are placed close to each other, which again exploits emoji similarity. On the other hand, Guibon et al. (2018) proposed an automatic system to group emojis according to their usage. They did this for 63 face emojis, and the approach can be extended to any number of emojis. For this purpose, tweets were retrieved using the Twitter Streaming API.
To learn emoji embeddings of 300 dimensions, a Continuous Bag of Words model with hierarchical softmax was used. Then, to build the clusters, they applied spectral clustering. The initial number of clusters was set to 63, assuming one cluster per emoji. After applying spectral clustering and removing empty clusters, they finally obtained 18 clusters, which are shown in Figure 7.
Figure 7: Emoji clusters using spectral clustering (Guibon et al., 2018)
To verify whether this categorization represents some known emotions, they labeled the clusters with Ekman's 16 basic facial expressions of emotion (Ekman, 1992). The labels are also shown in Figure 7. Some emojis appear alone in a cluster, like the sleep or closed-mouth emojis, while some clusters are divided based on the intensity of the emotions they deliver, e.g., mild contentment and more intense contentment. This comparison with Ekman's emotions supported the idea that these facial emojis represent the same emotions as those depicted by their faces.
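A minimal sketch of such a clustering step over pre-trained emoji embeddings (our illustration with scikit-learn; the embedding matrix is random stand-in data, the affinity settings are assumptions, and Guibon et al. additionally removed empty clusters):

```python
# Sketch: group face-emoji embeddings with spectral clustering.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
emojis = [f"face_{i}" for i in range(63)]        # placeholder identifiers
embeddings = rng.normal(size=(63, 300))          # stand-in for the CBOW vectors

clustering = SpectralClustering(
    n_clusters=18,                               # final cluster count reported above
    affinity="nearest_neighbors",
    n_neighbors=10,
    random_state=0,
)
labels = clustering.fit_predict(embeddings)

for cluster_id in sorted(set(labels)):
    members = [e for e, l in zip(emojis, labels) if l == cluster_id]
    print(cluster_id, members[:5])
```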
7 Emoji Prediction
In order to study the relation between words and emojis, Barbieri et al. (2017) proposed an automatic system that predicts the most suitable emoji for an input tweet from the set of the 20 most frequent emojis. For this, tweets were retrieved using the Twitter API and filtered to retain those which contain exactly one of the 20 most frequent emojis. They also considered the subsets of the 10 and 5 most frequent emojis.
They proposed a neural framework based on a bi-directional Long Short Term Memory (bi-LSTM) network, which can be formalized as:

s = max{0, W[f_w, b_w] + d}

where W is a learned parameter matrix, f_w and b_w are the forward and backward encodings of the message respectively, and d is a bias term. The vector s can be used to calculate a probability distribution over emojis in order to predict the most likely one. They used two standard representations for the experiments: a word-based LSTM and a character-based LSTM. A standard bag of words and the skip-gram vector average were used as baselines.
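A minimal sketch of such a classifier in PyTorch (our illustration of the architecture described above, not the authors' code; vocabulary size, hidden size and other hyperparameters are assumptions):

```python
# Sketch: bi-LSTM tweet encoder for 20-way emoji prediction.
import torch
import torch.nn as nn

class EmojiPredictor(nn.Module):
    def __init__(self, vocab_size=50_000, embed_dim=300, hidden_dim=256, num_emojis=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # W acts on the concatenated forward/backward encodings [f_w, b_w]
        self.proj = nn.Linear(2 * hidden_dim, num_emojis)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))           # (batch, seq, 2 * hidden)
        fw = h[:, -1, :h.size(-1) // 2]                   # last forward state, f_w
        bw = h[:, 0, h.size(-1) // 2:]                    # first backward state, b_w
        return torch.relu(self.proj(torch.cat([fw, bw], dim=-1)))  # s = max{0, W[f_w, b_w] + d}

model = EmojiPredictor()
scores = model(torch.randint(1, 50_000, (8, 30)))         # batch of 8 tweets, 30 tokens each
predicted_emoji = scores.argmax(dim=-1)                   # most likely emoji per tweet
```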
The results showed that the character-based bi-LSTM along with pre-trained embeddings outperformed the other models. They also compared their best performing model against a human evaluation; the results show that the automatic system is better at generalizing the semantics of emojis when compared to humans.
8 Conclusion
In this paper, we have covered various works in the field of emojis. The paper started by introducing some open resources for emojis on the web. These resources give various information related to emojis, such as sentiment, meanings and example usage. As most emojis are polysemous, i.e., they can have different meanings based on context, and have varying platform representations, we have covered past works on emoji misinterpretation across cultures, emoji sense disambiguation (ESD) and emoji sense similarity (ESS). Sometimes, emojis can also be used to disambiguate the meaning expressed by the accompanying text, as in the cases of sarcasm or irony. Finally, we covered the state of the art in the growing field of automatic emoji prediction.
References
Francesco Barbieri, Miguel Ballesteros, and Horacio Saggion. 2017. Are emojis predictable? In Proceedings of the 2017 Conference of the European Chapter of the Association for Computational Linguistics, pages 105–111. ACL.

Francesco Barbieri, German Kruszewski, Francesco Ronzano, and Horacio Saggion. 2016a. How cosmopolitan are emojis?: Exploring emojis usage and meaning over different languages with distributional semantics. In Proceedings of the 2016 ACM on Multimedia Conference, pages 531–535. ACM.

Francesco Barbieri, Francesco Ronzano, and Horacio Saggion. 2016b. What does this emoji mean? A vector space skip-gram model for twitter emojis. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC).

Francesco Barbieri, Horacio Saggion, and Francesco Ronzano. 2014. Modelling sarcasm in twitter, a novel approach. In WASSA@ACL, pages 50–58.

Paula Carvalho, Luís Sarmento, Mário J. Silva, and Eugénio De Oliveira. 2009. Clues for detecting irony in user-generated contents: oh...!! it's so easy ;-). In Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pages 53–56. ACM.

Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, and Sebastian Riedel. 2016. emoji2vec: Learning emoji representations from their description. In Proceedings of The Fourth International Workshop on Natural Language Processing for Social Media, pages 48–54. ACL.

Paul Ekman. 1992. An argument for basic emotions. Cognition & Emotion, 6(3-4):169–200.

Andrea Esuli and Fabrizio Sebastiani. 2006. SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of the 5th Conference on Language Resources and Evaluation (LREC'06), pages 417–422.

Christiane Fellbaum. 1998. WordNet: An Electronic Lexical Database. A Bradford Book.

Gaël Guibon, Magalie Ochs, and Patrice Bellot. 2018. From emoji usage to categorical emoji prediction. In International Conference on Intelligent Text Processing and Computational Linguistics. Springer.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In ICLR.

Hannah Miller, Jacob Thebault-Spieker, Shuo Chang, Isaac Johnson, Loren Terveen, and Brent Hecht. 2016. "Blissfully happy" or "ready to fight": Varying interpretations of emoji. AAAI Press.

Petra Kralj Novak, Jasmina Smailovic, Borut Sluban, and Igor Mozetic. 2015. Sentiment of emojis. CoRR, abs/1509.07761.

Henning Pohl, Christian Domin, and Michael Rohs. 2017. Beyond just text: Semantic emoji similarity modeling to support expressive communication. ACM Transactions on Computer-Human Interaction (TOCHI), 24(1):6.

Peter D. Turney and Patrick Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37:141–188.

Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, and Derek Doran. 2016. EmojiNet: Building a machine readable sense inventory for emoji. In International Conference on Social Informatics, pages 527–541. Springer.

Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, and Derek Doran. 2017a. A semantics-based measure of emoji similarity. In Web Intelligence.

Sanjaya Wijeratne, Lakshika Balasuriya, Amit P. Sheth, and Derek Doran. 2017b. EmojiNet: An open service and API for emoji sense discovery. In 11th Intl. AAAI Conf. on Web and Social Media (ICWSM), pages 437–446. Montreal, Canada.