Understanding Emojis and their Semantics: A Survey
Kajal Gupta, Kanchan and Pushpak Bhattacharyya
Dept. of Computer Science and Engineering
Indian Institute of Technology, Bombay
{guptakajal9411, pal.kanchan04, pushpakbh}@gmail.com
Abstract
In social media, it has become a trend to use emojis along with text to be more expressive. These emojis have become an essential part of computer-mediated communication. In recent years, emojis have become a popular topic of research in the field of natural language processing due to their varied semantics and senses. This paper gives an introduction to various open resources that are available for emojis. It also compiles various works that have been done in different research areas of emojis to gain some insights.
1 Introduction
Nowadays, people are so engaged with social media that the number of users across the globe was expected to rise from 2.46 billion in 2017 to 2.77 billion in 2019.1 Moreover, in recent times, the writing habits of social media users have also changed. They have increased the use of pictographs, called emojis, along with text to be more lively and expressive.
Emojis can play different roles or functions depending on the situation. They can visually represent objects or can simply express or intensify some sentiment or emotion in a tweet. In some cases, they can also reverse the sentiment of the tweet, thus causing sarcasm. Example tweets for these functions are given in Table 1.
Along with playing different roles, emojis can also take different senses depending on the context. Also, like words, emoji interpretations can vary across cultures, which makes emojis difficult to interpret even for humans. This calls for research on emojis in Natural Language Processing in order to fully understand their meanings and usage and to capture the opinions expressed by people. In this paper, we explain various works that have been done in the field of emojis.
1https://www.statista.com/statistics/278414/number-of-worldwide-social-network-users/
Example Usage
F1  Travelling to UK for vacation
F2  I met Dwayne Johnson today!!!
F3  Ughh I hate my life
F4  Such an awesome birthday

Table 1: Example usage of emojis to represent objects (F1), express sentiment (F2), intensify sentiment (F3) and express sarcasm (F4).
The organization of the paper is as follows. The paper starts with a description of various open resources which are available for emojis on the web in Section 2. Then, Section 3 describes how different emojis are interpreted differently across languages. Section 4 describes how emojis can be helpful for detecting sarcasm and irony. The works done in the fields of Emoji Sense Disambiguation and Emoji Sense Similarity are explained in Section 5 and Section 6 respectively. Section 7 explains an approach used for the task of emoji prediction on Twitter. Finally, we conclude the paper in Section 8.
2 Open Resources for Emoji
This section gives an introduction to various open resources which are available for emojis on the web. These resources convey a lot of information about an emoji, such as its sentiment, sense definitions, keywords, platform-dependent representations, and related emojis.
2.1 Unicode Consortium
Unicode is a text encoding standard enforcing a uniform interpretation of text byte codes by computers.2 This enables people around the world
2http://www.unicode.org/
to use computers in any language. This consortium provides a complete list of the Unicode emoji characters.3
The consortium not only provides the Unicode code point for each emoji but also provides the different emoji representations of vendors such as Google, Apple, and Twitter. It also provides the CLDR short name and keywords associated with each emoji.
2.1.1 Emoji Sentiment Ranking
Novak et al. (2015) created the Emoji Sentiment Ranking,4 the first emoji sentiment lexicon, covering the 751 most frequent emojis. For its creation, 83 human annotators were recruited to classify 1.6 million tweets, collected in 13 different European languages, into negative, neutral, or positive classes.
Based on the 4% of the collected tweets which contained emojis, they ranked the popular emojis using the sentiment score of the plain text of the tweet. The Emoji Sentiment Ranking has a format similar to SentiWordNet (Esuli and Sebastiani, 2006). For each emoji, a negativity p−, neutrality p0, positivity p+ and sentiment score s̄ are assigned.
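To make the lexicon format concrete, the sketch below derives the sentiment score of an emoji from its class probabilities; it is a minimal illustration under the assumption that the score is the mean of the discrete sentiment distribution over {−1, 0, +1}, which reduces to p+ − p−. The field names and example values are hypothetical, not taken from the actual lexicon.

```python
# Minimal sketch: deriving an emoji sentiment score from its class probabilities
# in an Emoji Sentiment Ranking-style lexicon. Field names and values are hypothetical.

def sentiment_score(p_neg: float, p_neutral: float, p_pos: float) -> float:
    """Mean of the discrete sentiment distribution over {-1, 0, +1},
    which simplifies to p_pos - p_neg."""
    assert abs(p_neg + p_neutral + p_pos - 1.0) < 1e-6, "probabilities must sum to 1"
    return p_pos - p_neg

# A hypothetical, mostly positive emoji entry.
entry = {"emoji": "\U0001F602", "p_neg": 0.25, "p_neutral": 0.22, "p_pos": 0.53}
print(entry["emoji"], sentiment_score(entry["p_neg"], entry["p_neutral"], entry["p_pos"]))
```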
2.2 Emojipedia
Emojipedia is an emoji reference site.5 It hosts various categories of emoji, such as Smileys and People or Animals and Nature, as well as popular emojis and emoji-related news. For each emoji, it provides the Unicode representation, a definition, images based on different rendering platforms, short code names, and other emojis manually asserted to be related.
2.3 iEmoji
iEmoji6 is a web service that allows you to convert an emoji into a viewable format. It also helps in understanding how emojis are being used in social media posts. For each emoji, it provides the Twitter emoji popularity rank, images across platforms, the short code name, and keywords describing the emoji. It also provides a human-generated description, the Unicode character representation, a category within a manually-built hierarchy, examples of use in social media (Twitter) posts, and the history of each emoji.
3http://www.unicode.org/emoji/charts/full-emoji-list.html
4http://kt.ijs.si/data/Emoji_sentiment_ranking/
5https://emojipedia.org/
6http://www.iemoji.com/
2.4 The Emoji Dictionary
The Emoji Dictionary7 is the first crowdsourced emoji resource on the web. Since it is a crowdsourced site, it lets users add definitions or sense labels to any emoji. These sense labels define how an emoji can be used in a sentence. It organizes the meanings of an emoji under three part-of-speech tags, namely nouns, verbs, and adjectives. It also lists an image of the emoji and its definition, with example uses spanning multiple sense labels.
3 Emoji Misinterpretation
Emojis can be defined as pictures which are naturally combined with plain text, creating a new form of language. These pictures remain the same independently of where we live, but they can be used and interpreted differently. In order to understand this, Barbieri et al. (2016a) compared the meaning and usage of emojis across different languages from a natural language processing point of view. An empirical research methodology relying on current vector space representation modeling (Turney and Pantel, 2010) was adopted to understand the "semantics" of these emojis.
A corpus of more than 30 million tweets in four different languages, American English (USA), British English (UK), Peninsular Spanish (ESP), and Italian (ITA), was collected and different experiments were carried out to compare emojis. In order to preprocess the tweets, the same procedure as in Barbieri et al. (2016b) was followed. Both the words and the emojis were modelled in the same vector space, relying on the skip-gram embedding model introduced by Mikolov et al. (2013), with 300 dimensions and a window size of 6 tokens.
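As a rough illustration of this setup (not the authors' original pipeline), the sketch below trains one such skip-gram model over pre-tokenized tweets with gensim, treating every emoji as an ordinary token so that words and emojis end up in the same vector space. The corpus file name and tokenization are assumptions.

```python
# Sketch: words and emojis in one skip-gram space (gensim 4.x API assumed).
from gensim.models import Word2Vec

# Each line of "tweets_usa.txt" is assumed to be one pre-tokenized tweet,
# with emojis kept as separate tokens.
with open("tweets_usa.txt", encoding="utf-8") as f:
    tweets = [line.split() for line in f]

model = Word2Vec(
    sentences=tweets,
    vector_size=300,   # 300-dimensional embeddings, as described above
    window=6,          # window size of 6 tokens
    sg=1,              # skip-gram
    min_count=5,
    workers=4,
)
model.wv.save("emoji_word_vectors_usa.kv")  # hypothetical output path
```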
Several experiments were performed to compare the way the semantics of emojis vary across languages. The main aim is to quantify how much the meaning of an emoji A is preserved across different languages by measuring to what extent the emojis that are most similar to A overlap across languages. The vector representation of each emoji in a specific language is exploited to select the emojis with similar vectors, which are presumably closest in meaning.
The Nearest Neighbours NN_l(e) of an emoji e in a language l are defined as the set of the 10 emojis nearest to e in the semantic space of language l.
7https://emojidictionary.emojifoundation.com/
The nearest neighbours of each emoji were retrieved with respect to their cosine similarity to other emojis. Since an emoji is thus defined by other similar emojis, these representations can be compared across different languages. In order to see whether an emoji is defined similarly in two languages, the common elements in the NN representations of that emoji in the two languages are examined. If the representations of the emoji in the two languages share many elements, the emoji is defined, and thus used, in a similar way. If there are no common elements between the two representations, the emoji is more likely to mean something different in the two languages.
More precisely, to determine whether an emoji e is used similarly in languages l1 and l2, the size of the intersection of its NN sets is measured:

sim_{l1,l2}(e) = |NN_{l1}(e) ∩ NN_{l2}(e)|

It is assumed that if sim_{l1,l2}(e) is equal to 10, the emoji e has the same meaning in languages l1 and l2. On the other hand, if sim_{l1,l2}(e) is equal to 0, the emoji means something different in the two languages. Whether an emoji means the same across all the languages (sim_all) is also measured, by looking at the overlap of all the sets of emojis that are most similar to e in each language.
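A minimal sketch of this comparison, assuming two gensim KeyedVectors files (one per language, with hypothetical file names) in which emojis appear as ordinary vocabulary tokens:

```python
# Sketch: cross-lingual overlap of an emoji's nearest emoji neighbours.
from gensim.models import KeyedVectors

def emoji_nearest_neighbours(kv: KeyedVectors, emoji: str, k: int = 10) -> set:
    """Return the k emojis most similar to `emoji`, skipping ordinary words."""
    neighbours = []
    for token, _score in kv.most_similar(emoji, topn=500):
        if any(ord(ch) > 0x1F000 for ch in token):  # crude emoji test (assumption)
            neighbours.append(token)
        if len(neighbours) == k:
            break
    return set(neighbours)

kv_usa = KeyedVectors.load("emoji_word_vectors_usa.kv")  # hypothetical paths
kv_uk = KeyedVectors.load("emoji_word_vectors_uk.kv")

emoji = "\U0001F44B"  # waving hand
sim = len(emoji_nearest_neighbours(kv_usa, emoji) & emoji_nearest_neighbours(kv_uk, emoji))
print(f"sim_USA,UK({emoji}) = {sim} out of 10")
```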
The top half of Figure 1 shows the emojis with a high value of sim_all, which indicates that emojis related to music, nature and food are mostly language independent. The lower half of the figure contains emojis with a low value of sim_all, which indicates that they are language dependent. For each emoji, the rank (where 1 is the most common emoji over the four languages and 150 the least used one), the six combinations of sim_{l1,l2}, and sim_all are indicated.
Looking at the bottom of the figure, it can be seen that some emojis are used in a very different way across all the languages, and each language seems to have its own way of defining them. A few other emojis also do not seem to keep their meaning across languages (the 100 emoji, for example, might stand for just a number or for an excellent grade). For the waving-hand emoji, the NN sets for USA and UK are plotted in Figure 2, and it can be observed that the emoji is interpreted in two different ways. In the case of American English, the waving hand seems to mean bye/see you later (smiley and people-waving emojis), while in British English the waving-hand emoji is related
Figure 1: Emojis with high sim_all on the top and with low sim_all on the bottom (Barbieri et al., 2016a)
to travelling (country flags, a train and an airplane are included in the NN set).
4 Use of Emojis to Detect Sarcasm and Irony
Automatic detection of figurative language is a challenging task in computational linguistics. It is very difficult for a machine to differentiate between the literal and the figurative meaning of a text. Sarcasm and irony refer to cases where the intended meaning of a text is the opposite of its literal meaning. The difference between sarcasm and irony is that the former is always negative and is aimed at ridiculing another person, while this is not necessarily true for the latter. Detecting sarcasm and irony in text is therefore very challenging, and a lot of research is being done on these topics.
Emojis have been used as a feature for sarcasm detection (Barbieri et al., 2014). Other features that
Figure 2: Nearest Neighbours of the waving hand emoji for USA (left) and UK (right) (Barbieri et al., 2016a)
were used in this work include:
• Frequency: gap between rare and common words
• Written-Spoken: written vs. spoken style usage
• Intensity: intensity of adverbs and adjectives
• Structure: length and punctuation
• Sentiments: gap between positive and negative terms
• Synonyms: common vs. rare synonym usage
• Ambiguity: measure of possible ambiguities

The study shows that the emojis were useful in this sarcasm detection task.
Emoticons have also been used for detecting irony in text (Carvalho et al., 2009). This work described eight linguistic patterns that can be helpful in detecting irony. The aim of the work was to recognize irony in apparently positive sentences involving human named entities (NE) in Portuguese. These patterns include:
1. Demonstrative Determiners: In Portuguese, the occurrence of any demonstrative form before a human NE usually indicates that such an entity is being negatively mentioned.
2. Interjections: It is believed that some interjections can be used as potential clues for irony detection.
3. Verb Morphology: The type of pronoun used for addressing people can also be an important clue for irony detection.
4. Cross-Constructions: In Portuguese, evaluative adjectives with a prior positive or neutral polarity usually take a negative or ironic interpretation whenever they appear in cross-constructions.
5. Heavy Punctuation: It is assumed that the presence of more than one exclamation mark and/or question mark in a sentence can be used as a clue for irony detection.
6. Quotation Marks: They are also frequently used to express and emphasize ironic content, especially if the content has a prior positive polarity, which is the main focus of this work.
7. Laughter Expressions: This includes the use of acronyms like LOL, onomatopoeic expressions such as "ah", "eh", "hi", etc., and prior positive emoticons such as ":)", ";-)" and ":P" to detect irony.
The study proved that the emoticons were useful for detecting irony.
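Some of these surface patterns are easy to operationalize. The sketch below is our own illustration (not the original system, which targeted Portuguese and used additional lexical resources); it checks a sentence for three of the clues above: heavy punctuation, quoted content, and laughter expressions or prior positive emoticons.

```python
# Sketch: simple surface clues for irony, loosely following Carvalho et al. (2009).
import re

LAUGHTER = re.compile(r"\b(lol\w*|haha\w*|hehe\w*)\b", re.IGNORECASE)
POSITIVE_EMOTICONS = (":)", ";-)", ":P", ":-)")

def irony_clues(sentence: str) -> dict:
    """Return which of three surface clues fire on the sentence."""
    return {
        # more than one '!' and/or '?' in a row
        "heavy_punctuation": bool(re.search(r"[!?]{2,}", sentence)),
        # something placed inside quotation marks
        "quotation_marks": bool(re.search(r'"[^"]+"|“[^”]+”', sentence)),
        # laughter acronyms/onomatopoeia or prior positive emoticons
        "laughter_or_emoticon": bool(LAUGHTER.search(sentence))
        or any(e in sentence for e in POSITIVE_EMOTICONS),
    }

print(irony_clues('Oh sure, he is such a "genius"!!! lol :P'))
```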
5 Emoji Sense Disambiguation
Emojis have become an extremely popular form of communication because of their powerful way of expressing emotions in a single character. But an emoji may be used in different contexts to express different senses, which makes emojis hard to disambiguate using traditional NLP techniques. This section first introduces an emoji resource, EmojiNet,8 and then describes how it is useful for the task of Emoji Sense Disambiguation.
5.1 EmojiNet: A Sense Inventory for Emoji

EmojiNet is an emoji resource built by Wijeratne et al. (2016), similar to the lexical database WordNet (Fellbaum, 1998). The difference is that the former assigns context-based meanings, i.e., senses, to different emojis, while the latter assigns senses to different words and groups them into sets of synonyms called synsets. EmojiNet is the first machine-readable sense inventory for emojis and provides the following information about each emoji:
1. the part-of-speech tags (PoS tags) for a particular use of the emoji,
2. the definition of the emoji and the senses it is used in,
3. example uses of the emoji for each sense,
4. links of emoji senses to other knowledge bases such as BabelNet9 or Wikipedia10.

8http://emojinet.knoesis.org/

EmojiNet integrates four openly available emoji resources (Unicode Consortium, Emojipedia, iEmoji and The Emoji Dictionary) along with BabelNet to create a dataset consisting of 12,904 sense labels over 2,389 emojis.
5.2 Emoji Sense Disambiguation using EmojiNet
Emoji sense disambiguation is the ability to identify the meaning of an emoji in the context of a message in a computational manner. To use EmojiNet for this task, Wijeratne et al. (2017b) made the following enhancements:
1. They first created two word embedding models, learned over Twitter and over news articles respectively, using the Word2Vec model (Mikolov et al., 2013).
2. For each emoji e_i ∈ E, they extracted the definition d_i of the emoji e_i and the set of all emoji sense definitions S_i of e_i from EmojiNet. Then, for each word w in d_i, they extracted the twenty most similar words from the two word embedding models as two separate sets, namely CW^T_{e_i} and CW^N_{e_i}. Similarly, for each emoji sense definition s_i ∈ S_i that belongs to e_i, they extracted the words w_{s_i} in s_i and repeated the same process to learn two separate context word sets, CW^T_{e_i−s_i} and CW^N_{e_i−s_i}.
3. For example, for the pistol emoji, EmojiNet lists "A gun emoji, more precisely a pistol. A weapon that has potential to cause great harm" as its emoji definition. For each word in this definition, the top twenty most similar words learned using the two word embedding models are used to generate context words. The same process is applied to each of its emoji sense definitions as well.
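A minimal sketch of this context-word expansion step, assuming a pre-trained gensim KeyedVectors model (the file name is hypothetical):

```python
# Sketch: expand an emoji definition into a context word set using the
# twenty most similar words per definition word (Twitter embeddings assumed).
from gensim.models import KeyedVectors

kv_twitter = KeyedVectors.load("twitter_word_vectors.kv")  # hypothetical path

def context_words(definition: str, kv: KeyedVectors, topn: int = 20) -> set:
    """Union of the topn most similar words for every in-vocabulary definition word."""
    words = set()
    for w in definition.lower().split():
        if w in kv.key_to_index:               # skip out-of-vocabulary tokens
            words.update(t for t, _ in kv.most_similar(w, topn=topn))
    return words

pistol_definition = ("A gun emoji, more precisely a pistol. "
                     "A weapon that has potential to cause great harm")
cw_pistol = context_words(pistol_definition, kv_twitter)
```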
The process of emoji sense disambiguation using EmojiNet is as follows:
1. First, 25 emojis are selected which have been shown by Miller et al. (2016) to be interpreted differently when used in communication.
9http://babelnet.org/about
10https://www.wikipedia.org
2. Then, 50 tweets are randomly selected for each of the 25 emojis from the Twitter corpus that was used to train the word embedding model.
3. Three sets of contexts are defined for an emoji sense based on the three different datasets:
(a) BabelNet-based context: This contains the set of words included in the BabelNet sense definitions for an emoji.
(b) Twitter-based context: This contains the set of context words learned by using the Twitter word embedding model for the emoji.
(c) News-based context: This contains the set of context words learned by using the Google News word embedding model for an emoji.
4. Now, to find the sense of an emoji in a tweet, the context overlap between the context of the emoji in the tweet and the context words taken from each of the above three sets is calculated.
5. The sense which has the highest context word overlap is then assigned to the emoji for that particular tweet.
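A minimal sketch of this overlap-based sense assignment (a simplified reading of the procedure; the data structures and example senses are assumptions):

```python
# Sketch: assign the sense whose context words overlap most with the tweet.
def disambiguate(tweet_tokens: list, sense_contexts: dict) -> str:
    """sense_contexts maps a sense label (e.g. 'gun(noun)') to its context word set,
    built from BabelNet definitions and the Twitter/News embedding expansions."""
    tweet_context = {t.lower() for t in tweet_tokens}
    return max(sense_contexts, key=lambda sense: len(sense_contexts[sense] & tweet_context))

# Hypothetical context sets for two senses of the pistol emoji.
senses = {
    "gun(noun)": {"weapon", "shoot", "pistol", "harm", "bullet"},
    "toy(noun)": {"toy", "water", "squirt", "play", "summer"},
}
print(disambiguate("heading to the range to shoot my new pistol".split(), senses))
```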
6 Emoji Sense Similarity
Like words, different emojis can also convey similar meanings when used in the same context, which raises the problem of emoji similarity. Different research works have proposed creating emoji embeddings to help in finding the semantic similarity between different emojis.
6.1 Vector Skip-Gram Model
Barbieri et al. (2016b) employed the skip-gram neural embedding model introduced by Mikolov et al. (2013), mapping both words and emojis into the same space.
They did two evaluations of their system. These are:
• Quantitative Evaluation
They compiled EmoTwi50, a human gold standard dataset that contains a set of 50 pairs of emojis annotated with degrees of similarity (functional similarity) and relatedness (topical similarity). Pearson correlations between the human gold standard and the similarity scores given by their models (for each pair, the cosine similarity of the vectors of the two emojis) were calculated to see whether the emoji embedding models are able to capture both similarity and relatedness (a minimal sketch of this evaluation is given after this list).
• Qualitative Evaluation
This evaluation was performed to see the quality of the vectors that represent each emoji. It is composed of two parts:
1. Single emojis
In the first part of this evaluation, it was explored whether similar emojis are plotted close to each other. Here, the vectors of 100 emojis were reduced to two dimensions and plotted in the same space. It can be seen from Figure 3 that, overall, the vectors are able to group similar emojis together.
Figure 3: Visualisation of 100 emoji vectors (Barbieri et al., 2016b).
The second part of this evaluation looked at the relation between words and emojis. In Figure 4, five facial-expression emojis and five object, people and place emojis are selected, and for each emoji the five most similar text tokens are shown.
2. Clusters of emojis
Figure 4: The five text tokens that best characterize each emoji (Barbieri et al., 2016b).

The purpose of this evaluation was to see whether emojis can be grouped by topic. They built 11 clusters with K-means from the 300 most frequent emojis. In Figure 3, the colour of the cluster is shown behind each emoji, and the emojis plotted are the 10 emojis closest to the centroid of each cluster. It can be seen that most of the clusters have a clear identity and seem to be quite consistent.
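The quantitative evaluation mentioned above can be sketched as follows; this is our illustration, and the embedding file, the gold-standard pairs and their scores are all hypothetical:

```python
# Sketch: correlate model cosine similarities with human similarity judgements.
from gensim.models import KeyedVectors
from scipy.stats import pearsonr

kv = KeyedVectors.load("emoji_word_vectors_usa.kv")  # hypothetical model

# Hypothetical gold-standard pairs: (emoji_a, emoji_b, human_similarity_score)
gold = [
    ("\U0001F602", "\U0001F606", 4.2),
    ("\U0001F602", "\U0001F622", 2.1),
    ("\U0001F602", "\U0001F697", 0.8),
]

model_scores = [kv.similarity(a, b) for a, b, _ in gold]   # cosine similarity
human_scores = [score for _, _, score in gold]

r, p = pearsonr(model_scores, human_scores)
print(f"Pearson correlation: {r:.3f} (p = {p:.3f})")
```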
6.2 Emoji2Vec
Eisner et al. (2016) proposed an approach for creating emoji embeddings in which emojis are mapped into the same space as the 300-dimensional word2vec embeddings so that both can be used together in any task. In order to learn the emoji embeddings, they used the descriptions of emojis from the Unicode standard, an example of which is given in Figure 5.
Figure 5: Description of U+1F574 taken from the Unicode standard (Eisner et al., 2016)
They trained emoji embeddings using examples consisting of an emoji and a sequence of words w_1, ..., w_N describing that emoji. They took the sum of the individual word vectors of the descriptive phrase, as found in the Google News word2vec embeddings:

v_j = Σ_{k=1}^{N} w_k    (1)

where w_k is the word2vec vector of the k-th word in the description and v_j is the vector representation of the description.
Now, for each emoji, they define a vector x_i to represent its embedding, which is learned by modelling the probability of a match between x_i and v_j using the sigmoid of the dot product of the two vectors. A logistic loss is used for training the embeddings.
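A minimal sketch of this training setup, using PyTorch and random toy data (our illustration of the idea, not the authors' implementation; the vocabulary size and training details are assumptions):

```python
# Sketch of emoji2vec-style training: learn emoji vectors x_i such that
# sigmoid(x_i . v_j) predicts whether description j matches emoji i.
# Description vectors v_j are assumed precomputed (summed word2vec vectors) and frozen.
import torch
import torch.nn as nn

class Emoji2Vec(nn.Module):
    def __init__(self, num_emojis: int, dim: int = 300):
        super().__init__()
        self.emoji_vecs = nn.Embedding(num_emojis, dim)   # the x_i to be learned

    def forward(self, emoji_ids, description_vecs):
        x = self.emoji_vecs(emoji_ids)                    # (batch, dim)
        return (x * description_vecs).sum(dim=-1)         # dot product x_i . v_j

num_emojis, dim = 1000, 300                               # hypothetical vocabulary size
model = Emoji2Vec(num_emojis, dim)
loss_fn = nn.BCEWithLogitsLoss()                          # logistic loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One toy training step on random data; real training pairs each emoji with its own
# description (label 1) and with mismatched descriptions (label 0).
emoji_ids = torch.randint(0, num_emojis, (32,))
desc_vecs = torch.randn(32, dim)                          # stand-in for the v_j
labels = torch.randint(0, 2, (32,)).float()

loss = loss_fn(model(emoji_ids, desc_vecs), labels)
loss.backward()
optimizer.step()
print(float(loss))
```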
The embeddings learned this way were then used in a sentiment analysis task to find out how much information they capture. For this purpose, the dataset by Novak et al. (2015) was used, which contains 67k English tweets labeled as positive, negative or neutral. The feature vector of a tweet was the sum of the embeddings corresponding to each word or emoji occurring in the tweet. The results showed that the embeddings learned from descriptions are more informative and capture the semantics of emojis very well.
6.3 A Semantics-Based Measure of Emoji Similarity
Similar to the work in subsection 6.2, Wijeratne et al. (2017a) created emoji embedding models that were learned over the machine-readable emoji meanings in the EmojiNet knowledge base and presented a thorough analysis of the semantic similarity of emojis. For training these emoji embedding models, they constructed three different ways to represent the meaning of an emoji using the information in EmojiNet. These are:
• Emoji Description (Sense_Desc.): The description of each emoji was extracted because it gives information about what is depicted in the emoji and its intended uses.
• Emoji Sense Labels (Sense_Label): Emoji sense labels (word-POS tag pairs) were also extracted, as they describe the senses, and the parts of speech, under which an emoji can be used in a sentence.
• Emoji Sense Definitions (Sense_Def.): Emoji sense definitions were also collected, as they give a textual description of each emoji sense label.
An example of an emoji with a representation containing this information is given in Figure 6.
The details of their work are as follows:
Figure 6: Example emoji with its representation (Wijeratne et al., 2017a).
1. Model:
They used Word2Vec (skip-gram with negative sampling) to obtain word embeddings for two different corpora: a Twitter corpus and a Google News corpus. These word embeddings were then used to obtain emoji embeddings for the different types of emoji definitions listed above. All words in each emoji definition were replaced with their corresponding word vectors, and the final emoji embedding was obtained by taking the average of the word vectors of all words in the emoji definition to form a single vector of size 300 (a minimal sketch of this step is given after this list).
2. Evaluation:
For the purpose of evaluating these emoji embeddings, they created an emoji similarity dataset called 'EmoSim508', which consists of 508 emoji pairs that were assigned similarity scores by ten human judges. They then used Spearman's rank correlation coefficient11 to evaluate the alignment of the emoji similarity rankings generated by their emoji embedding models (cosine similarity of two emoji vectors) with the emoji similarity rankings of the EmoSim508 dataset. They also performed a qualitative evaluation of these embeddings by applying them to a sentiment analysis task, where their model outperformed the Emoji2Vec model described in the previous subsection.

11https://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient
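The averaging step referred to above can be sketched as follows, assuming a pre-trained gensim KeyedVectors model (the file name is hypothetical):

```python
# Sketch: an emoji embedding as the average of the word vectors of its
# EmojiNet definition (description, sense labels or sense definitions).
import numpy as np
from gensim.models import KeyedVectors

kv = KeyedVectors.load("twitter_word_vectors.kv")  # hypothetical 300-d model

def emoji_embedding(definition_text: str, kv: KeyedVectors) -> np.ndarray:
    """Average the vectors of all in-vocabulary words in the definition."""
    vectors = [kv[w] for w in definition_text.lower().split() if w in kv.key_to_index]
    if not vectors:
        return np.zeros(kv.vector_size)
    return np.mean(vectors, axis=0)                # a single 300-dimensional vector

emb = emoji_embedding("a yellow face with a broad open smile", kv)
```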
6.4 Grouping of Emojis
With a huge number of emojis available, it becomes difficult for a user to figure out an appropriate emoji and then to search for it, which takes a lot of time. To reduce this search time, Pohl et al. (2017) proposed a keyboard design in which similar emojis are placed close to each other, which again exploits emoji similarity. On the other hand, Guibon et al. (2018) proposed an automatic system to group emojis according to their usage. They did this for 63 face emojis, and the approach can be extended to any number of emojis. For this purpose, tweets were retrieved using the Twitter Streaming API.
To learn emoji embeddings of 300 dimensions, a Continuous Bag of Words model with hierarchical softmax was used. Then, to build the clusters, they applied spectral clustering. The initial number of clusters was set to 63, assuming one cluster per emoji. After applying spectral clustering and removing empty clusters, they finally obtained 18 clusters, which are shown in Figure 7.
Figure 7: Emoji clusters using spectral clustering (Guibon et al., 2018)
To verify whether this categorization represents some known emotions, they labeled the clusters with Ekman's 16 basic facial expressions of emotion (Ekman, 1992). The labels are also shown in Figure 7. Some emojis appear alone in a cluster, like the sleep or closed-mouth emojis, while some clusters are divided based on the intensity of the emotions they deliver, e.g., mild contentment and more intense contentment. This comparison with Ekman's emotions supported the idea that these facial emojis represent the same emotions as those depicted by their faces.
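A minimal sketch of such a clustering step over pre-trained emoji embeddings (our illustration with scikit-learn; the embedding matrix is random stand-in data, the affinity settings are assumptions, and Guibon et al. additionally removed empty clusters):

```python
# Sketch: group face-emoji embeddings with spectral clustering.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
emojis = [f"face_{i}" for i in range(63)]        # placeholder identifiers
embeddings = rng.normal(size=(63, 300))          # stand-in for the CBOW vectors

clustering = SpectralClustering(
    n_clusters=18,                               # final cluster count reported above
    affinity="nearest_neighbors",
    n_neighbors=10,
    random_state=0,
)
labels = clustering.fit_predict(embeddings)

for cluster_id in sorted(set(labels)):
    members = [e for e, l in zip(emojis, labels) if l == cluster_id]
    print(cluster_id, members[:5])
```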
7 Emoji Prediction
In order to study the relation between words and emojis, Barbieri et al. (2017) proposed an automatic system that predicts the most suitable emoji for an input tweet from the set of the 20 most frequent emojis. For this, tweets were retrieved using the Twitter API and filtered to retain those which contain exactly one of the 20 most frequent emojis. They also considered the subsets of the 10 and 5 most frequent emojis.
They proposed a neural framework based on a bi-directional Long Short Term Memory (bi-LSTM) network, which can be formalized as:

s = max{0, W[f_w, b_w] + d}

where W is a learned parameter matrix, f_w and b_w are the forward and backward encodings of the message respectively, and d is a bias term. The vector s can be used to calculate a probability distribution over emojis in order to predict the most likely one. They used two standard representations for the experiments: a word-based LSTM and a character-based LSTM. A standard bag of words and the skip-gram vector average were used as baselines.
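A minimal sketch of such a classifier in PyTorch (our illustration of the architecture described above, not the authors' code; vocabulary size, hidden size and other hyperparameters are assumptions):

```python
# Sketch: bi-LSTM tweet encoder for 20-way emoji prediction.
import torch
import torch.nn as nn

class EmojiPredictor(nn.Module):
    def __init__(self, vocab_size=50_000, embed_dim=300, hidden_dim=256, num_emojis=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # W acts on the concatenated forward/backward encodings [f_w, b_w]
        self.proj = nn.Linear(2 * hidden_dim, num_emojis)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))           # (batch, seq, 2 * hidden)
        fw = h[:, -1, :h.size(-1) // 2]                   # last forward state, f_w
        bw = h[:, 0, h.size(-1) // 2:]                    # first backward state, b_w
        return torch.relu(self.proj(torch.cat([fw, bw], dim=-1)))  # s = max{0, W[f_w, b_w] + d}

model = EmojiPredictor()
scores = model(torch.randint(1, 50_000, (8, 30)))         # batch of 8 tweets, 30 tokens each
predicted_emoji = scores.argmax(dim=-1)                   # most likely emoji per tweet
```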
The results showed that the character-based bi-LSTM along with pre-trained embeddings outperformed the other models. They also compared their best performing model against a human evaluation; the results show that the automatic system is better at generalizing the semantics of emojis when compared to humans.
8 Conclusion
In this paper, we have covered various works in the field of emojis. The paper started by introducing some open resources for emojis on the web. These resources give various information related to emojis, such as sentiment, meanings and example usage. As most emojis are polysemous, i.e., they can have different meanings based on context, and have varying platform representations, we have covered past works on emoji misinterpretation across cultures, emoji sense disambiguation (ESD) and emoji sense similarity (ESS). Sometimes, emojis can also be used to disambiguate the meaning expressed by the accompanying text, as in the cases of sarcasm or irony. Finally, we covered the state of the art in the growing field of automatic emoji prediction.
References
Francesco Barbieri, Miguel Ballesteros, and Horacio Saggion. 2017. Are emojis predictable? In Proceedings of the 2017 Conference of the European Chapter of the Association for Computational Linguistics, pages 105–111. ACL.

Francesco Barbieri, German Kruszewski, Francesco Ronzano, and Horacio Saggion. 2016a. How cosmopolitan are emojis?: Exploring emojis usage and meaning over different languages with distributional semantics. In Proceedings of the 2016 ACM on Multimedia Conference, pages 531–535. ACM.

Francesco Barbieri, Francesco Ronzano, and Horacio Saggion. 2016b. What does this emoji mean? A vector space skip-gram model for twitter emojis. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC).

Francesco Barbieri, Horacio Saggion, and Francesco Ronzano. 2014. Modelling sarcasm in twitter, a novel approach. In WASSA@ACL, pages 50–58.

Paula Carvalho, Luís Sarmento, Mário J. Silva, and Eugénio De Oliveira. 2009. Clues for detecting irony in user-generated contents: oh...!! it's so easy ;-). In Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pages 53–56. ACM.

Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, and Sebastian Riedel. 2016. emoji2vec: Learning emoji representations from their description. In Proceedings of The Fourth International Workshop on Natural Language Processing for Social Media, pages 48–54. ACL.

Paul Ekman. 1992. An argument for basic emotions. Cognition & Emotion, 6(3-4):169–200.

Andrea Esuli and Fabrizio Sebastiani. 2006. SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of the 5th Conference on Language Resources and Evaluation (LREC'06), pages 417–422.

Christiane Fellbaum. 1998. WordNet: An Electronic Lexical Database. A Bradford Book.

Gaël Guibon, Magalie Ochs, and Patrice Bellot. 2018. From emoji usage to categorical emoji prediction. In International Conference on Intelligent Text Processing and Computational Linguistics. Springer.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In ICLR.

Hannah Miller, Jacob Thebault-Spieker, Shuo Chang, Isaac Johnson, Loren Terveen, and Brent Hecht. 2016. "Blissfully happy" or "ready to fight": Varying interpretations of emoji. AAAI Press.

Petra Kralj Novak, Jasmina Smailovic, Borut Sluban, and Igor Mozetic. 2015. Sentiment of emojis. CoRR, abs/1509.07761.

Henning Pohl, Christian Domin, and Michael Rohs. 2017. Beyond just text: Semantic emoji similarity modeling to support expressive communication. ACM Transactions on Computer-Human Interaction (TOCHI), 24(1):6.

Peter D. Turney and Patrick Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37:141–188.

Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, and Derek Doran. 2016. EmojiNet: Building a machine readable sense inventory for emoji. In International Conference on Social Informatics, pages 527–541. Springer.

Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, and Derek Doran. 2017a. A semantics-based measure of emoji similarity. In Web Intelligence.

Sanjaya Wijeratne, Lakshika Balasuriya, Amit P. Sheth, and Derek Doran. 2017b. EmojiNet: An open service and API for emoji sense discovery. In 11th Intl. AAAI Conf. on Web and Social Media (ICWSM), pages 437–446. Montreal, Canada.