2.3 Review-Based Recommender Systems


2.3.1 Alleviating Rating Sparsity with Reviews

In many e-commerce websites, users can write textual reviews of the products they have purchased, in addition to providing rating data. In reviews, users can explain the reasons behind their ratings of items, which offers richer and more meaningful information than numeric ratings alone. Such meaningful content in reviews has been recognized as a valuable source of information for alleviating the rating sparsity that occurs in a standard CF-based approach [21]. For example, consider the review in Fig. 2.6. Even though this user provided only one review, some very useful information can still be mined from it, such as having traveled in summer with family members, liking city views at night, and being concerned about the cleanliness of the room. If such useful information is extracted effectively, it could help to model user preferences or item features more efficiently than utilizing ratings alone.

By leveraging the review text (Review) in rating prediction, the objective function R of a review-based recommendation can be defined by:

R : User × Item × Review → Rating.   (2.3.1)
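Read operationally, Eq. 2.3.1 says that a review-based recommender consumes the review text alongside the user and item identities when producing a rating. A minimal sketch of such an interface in Python (the names here are hypothetical and not taken from any cited system):

```python
from typing import Protocol

class ReviewBasedRecommender(Protocol):
    """Eq. 2.3.1 as an interface: (user, item, review) -> rating."""

    def predict_rating(self, user_id: str, item_id: str, review_text: str) -> float:
        """Predict a numeric rating given the user, the item, and the review text."""
        ...
```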

2.3.2 Leveraging Reviews for Making Recommendations

Since a review is provided in an unstructured textual form, it cannot be easily interpreted by the system. Therefore, a method for extracting and representing information from reviews is required before recommendations can be made. For example, [29] built user and item profiles based on the frequency of words extracted from reviews by applying the term frequency–inverse document frequency (TF-IDF) technique. Recommendations are then made based on the items whose profiles are most similar to the user's. Moreover, Poirier et al. [76] inferred ratings from reviews to create the user–item rating matrix used by a CF-based approach. They represented a review as a word frequency vector and combined it with the rating to train a Naive Bayes classifier, which was then used to infer ratings for new reviews. Furthermore, McAuley and Leskovec [64] proposed the Hidden Factors as Topics (HFT) model, which combines the latent topics learned from reviews with the latent factor model learned from ratings. The authors represented a review with the set of topic distributions retrieved from latent Dirichlet allocation (LDA) [66], which is linked with the item latent factors. Specifically, the distribution θ_{v_j,k} of topic k for item v_j is defined by the following transformation:

θ_{v_j,k} = exp(κ x_{v_j,k}) / Σ_{k'} exp(κ x_{v_j,k'}),   (2.3.2)

where x_{v_j,k} denotes the latent feature k of the representation of item v_j, and κ is a parameter that controls the peakiness of the transformation. With Eq. 2.3.2, the latent topics can be incorporated into the process of learning the latent factor model.
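The transformation in Eq. 2.3.2 is a softmax over the item's latent factors, with κ controlling how peaked the resulting topic distribution is. A small numerical sketch (NumPy, with hypothetical factor values):

```python
import numpy as np

def topic_distribution(x_item, kappa=1.0):
    """Eq. 2.3.2: map an item's K latent factors to a K-dimensional topic distribution."""
    scores = kappa * np.asarray(x_item, dtype=float)
    scores -= scores.max()            # subtract the max for numerical stability (softmax unchanged)
    theta = np.exp(scores)
    return theta / theta.sum()

# A larger kappa makes the distribution peakier around the dominant latent factor.
x_vj = [0.2, -0.1, 0.7, 0.0, 0.3]     # hypothetical 5-factor item representation
print(topic_distribution(x_vj, kappa=1.0))
print(topic_distribution(x_vj, kappa=5.0))
```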

In recent years, many deep learning techniques have been adopted to model user and item representations from reviews, owing to their superior predictive performance [19, 47, 86, 103].

For example, DeepCoNN [103] applied a convolutional neural network (CNN) [102] to learn such representations, which were then used to predict ratings based on the latent factor model.
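As a rough illustration of this idea, the sketch below (PyTorch, simplified and hypothetical) encodes a user's and an item's review documents with parallel CNNs and scores the pair with a simple dot-product interaction; the actual DeepCoNN model differs in details such as its final prediction layer.

```python
import torch
import torch.nn as nn

class ReviewEncoder(nn.Module):
    """CNN text encoder: word ids -> latent representation (simplified sketch)."""
    def __init__(self, vocab_size, emb_dim=100, n_filters=64, kernel_size=3, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size, padding=1)
        self.fc = nn.Linear(n_filters, latent_dim)

    def forward(self, word_ids):                  # word_ids: (batch, doc_len)
        e = self.embed(word_ids).transpose(1, 2)  # (batch, emb_dim, doc_len)
        h = torch.relu(self.conv(e))              # (batch, n_filters, doc_len)
        h = h.max(dim=2).values                   # max-pool over word positions
        return self.fc(h)                         # (batch, latent_dim)

class DeepCoNNLike(nn.Module):
    """Parallel encoders for user and item review documents; rating = interaction of the two."""
    def __init__(self, vocab_size):
        super().__init__()
        self.user_enc = ReviewEncoder(vocab_size)
        self.item_enc = ReviewEncoder(vocab_size)

    def forward(self, user_doc_ids, item_doc_ids):
        x_u = self.user_enc(user_doc_ids)
        x_v = self.item_enc(item_doc_ids)
        return (x_u * x_v).sum(dim=1)             # dot product as the predicted rating score
```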

Figure 2.7: A neural network framework for constructing user representations from reviews.

In an extension to the CNN-based approach, NARRE [19] applied an attention mechanism [96] to construct representations by considering the different contributions of reviews based on their usefulness.

Despite their variations, these techniques share a common network framework for constructing representations, shown schematically in Fig. 2.7. To learn a representation for user u_i, the framework first creates a user document for u_i by concatenating all of the user's previous reviews. Each of the M words in u_i's document is then looked up and mapped to its word embedding, which can be initialized randomly or by utilizing a pretrained word embedding such as Word2Vec [65], GloVe [73], or BERT [24]. These word embeddings are then fed into the neural network components to learn x_{u_i} as a representation of u_i. Note that an item representation can be constructed in the same way as a user representation. The output of such a framework is a static representation for every user and item in the training data.
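A minimal sketch of the first two stages in Fig. 2.7, building the user document and looking up word embeddings; the tokenization, truncation length, and handling of unknown words below are assumptions rather than details taken from any particular paper:

```python
import numpy as np

def build_user_document(reviews, max_words=500):
    """Concatenate a user's previous reviews into one word sequence, truncated to max_words."""
    words = []
    for review in reviews:
        words.extend(review.lower().split())
    return words[:max_words]

def lookup_embeddings(words, embedding_table, dim=100, seed=0):
    """Map each word to its embedding; unknown words receive a small random vector."""
    rng = np.random.default_rng(seed)
    return np.stack([np.asarray(embedding_table[w]) if w in embedding_table
                     else rng.normal(scale=0.1, size=dim) for w in words])

# The resulting (M x dim) matrix is what the neural network components in Fig. 2.7
# consume in order to produce the static user representation x_ui.
```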

2.3.3 Challenges for Review-Based Recommendations

Although deep learning based models for review-based recommendations utilize different types of networks to learn user and item representations, they share two principles that could limit their potential: the way they utilize the relevant words, and the way they incorporate the relevant reviews when learning such representations.

Identifying Relevant Words

First, most deep-learning based methods consider every word in a review as an input when learning user and item representations. Given that some words are relevant neither to user preferences nor to item features, such words should not be given any weight when modeling these representations. For example, in hotel recommendations, words such as "clean" or "breakfast" are more relevant to a user's preferences toward a hotel's features than words such as "he" or "run". If only the words relevant to a specific recommendation domain were identified and utilized, the user and item representations could be constructed in a more efficient and meaningful way.

Utilizing a Particular Review’s Content

Moreover, the user and item representations are constructed in a static manner by aggregating their relevant previous reviews, so each user or item ends up with a single fixed representation regardless of the review for which a rating is being predicted. However, to predict the rating associated with a particular review, with the aim of modeling a user's preferences and an item's features as they apply to the user's current situation, I believe it is more important to concentrate on, and leverage, the most relevant information embedded in that review. For example, the review in Fig. 2.6 mentions that the room offers breathtaking views of the city at night. To generate user and item representations for predicting the rating of this review, it would be beneficial to know how much the user prefers city views at night and how well known the hotel's rooms are for them. That is, my assumption is that the user and item representations should be dynamically constructed for each particular review, in order to capture the interactions between the user's preferences (or the item's features) and the relevant information in that review.

2.3.4 Extracting Contexts from User Reviews

When writing reviews, users can express opinions describing their experiences with and satisfaction toward items, which makes reviews a valuable source of contexts [21]. As shown in Fig. 2.6, for example, underlined words such as "summer", "family", or "night" can be considered contexts embedded in a review. Successfully identifying and utilizing contexts from reviews could be the key to both achieving recommendation accuracy and alleviating rating sparsity in recommender systems.

However, unlike context-aware recommendation methods that rely on a predefined list of contexts [5, 7, 11, 88], the contexts embedded in reviews need to be recognized before they can be used for making recommendations. There are two main approaches to extracting contexts from reviews: supervised and unsupervised. A supervised approach extracts contexts based on a predefined list of contextual variables and their corresponding values [2, 20, 38, 54, 55, 58]. Using the predefined contexts in Table 2.1, words such as "summer", "family", or "night" could be extracted as contexts from the review in Fig. 2.6. However, non-predefined words in Fig. 2.6 such as "clean", "free-wifi", or "breakfast", which could also be considered contexts, are overlooked. For a supervised approach to be robust, therefore, the contextual variables and their corresponding values must be defined optimally for each specific recommendation domain.
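A supervised extractor of this kind essentially amounts to dictionary matching against the predefined variables and values. The sketch below uses hypothetical contextual variables in the spirit of Table 2.1 (the table's actual contents are not reproduced here):

```python
# Hypothetical contextual variables and values, in the spirit of Table 2.1
PREDEFINED_CONTEXTS = {
    "season": {"summer", "winter", "spring", "autumn"},
    "companion": {"family", "couple", "friends", "alone"},
    "time": {"morning", "afternoon", "evening", "night"},
}

def extract_predefined_contexts(review_text):
    """Return {contextual variable: matched values} for review words found in the predefined lists."""
    tokens = set(review_text.lower().split())
    return {variable: tokens & values
            for variable, values in PREDEFINED_CONTEXTS.items()
            if tokens & values}

print(extract_predefined_contexts(
    "We stayed here in summer with my family and loved the city view at night"))
# Words such as "clean" or "breakfast" are ignored because they are not predefined.
```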

In contrast, an unsupervised approach aims to infer contexts from reviews without having to predefine them [13, 75, 101]. Some of these approaches [13, 75] classify reviews into context-rich and context-free reviews based on features of each review, such as the number of words, verbs, and verbs in the past tense. Contexts are then extracted as those words or topics that occur more often in the context-rich reviews. These two methods, however, require manual annotation of the review data (as context or non-context) as part of the training process. More recently, CARL [101] applied a CNN and word-level attention to semantically infer contexts from reviews. Its user and item representations are constructed by modeling the attention weight of each word as its influence, in each context, on a user–item pair. This method was, however, implemented using the framework shown in Fig. 2.7, which means that it suffers from the limitations of utilizing irrelevant words and constructing only static representations.
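CARL's exact formulation is not reproduced here, but the word-level attention it relies on follows the usual pattern: score each word vector against a query (for example, a vector derived from the user–item pair), normalize the scores with a softmax, and combine the words by their weights. A generic sketch:

```python
import numpy as np

def word_attention(word_vectors, query):
    """Generic word-level attention: weights over M word vectors and their weighted combination."""
    W = np.asarray(word_vectors, dtype=float)   # (M, d) word representations
    q = np.asarray(query, dtype=float)          # (d,) query, e.g. from the user-item pair
    scores = W @ q                              # relevance score per word
    scores -= scores.max()                      # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights, weights @ W                 # attention weights and the attended vector
```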

In addition, most context extraction methods [2, 13, 20, 38, 54, 55, 58] define and extract a context in the form of a single word, such as those shown in Table 2.1. However, when users write reviews, they have flexibility in how their contexts are expressed, including using phrases in addition to single words. For example, some contexts in the review in Fig. 2.6 might be best extracted as "family trip", "night city view", or "friendly staff", which are more meaningful than just "family", "night", or "friendly". I believe that other words that often accompany (or appear in the same text region as) context words can help in capturing the appropriate meaning of contexts, and should therefore also be extracted in order to represent contexts accurately.
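One simple way to realize this, sketched below under the assumption that a set of seed context words is already available, is to extract a small window of neighboring words around each seed word rather than the seed word alone (the window size and seed words are illustrative):

```python
def extract_context_phrases(review_words, seed_contexts, window=1):
    """Return short phrases around seed context words, e.g. 'family' -> 'a family trip'."""
    phrases = []
    for i, word in enumerate(review_words):
        if word in seed_contexts:
            lo, hi = max(0, i - window), min(len(review_words), i + window + 1)
            phrases.append(" ".join(review_words[lo:hi]))
    return phrases

words = "we took a family trip and loved the night city view".split()
print(extract_context_phrases(words, {"family", "night"}))
# -> ['a family trip', 'the night city']
```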

I strongly believe that effectively extracting and utilizing the contexts in reviews could help overcome the challenge of obtaining and identifying relevant contexts in context-aware methods, in addition to the limitations of modeling user and item representations from reviews via deep learning techniques.
