Jurnal Media Informatika Budidarma

Food and Beverage Recommendation in EatAja Application Using the Alternating Least Square Method Recommender System

Elsa Rachel Dementieva, Z K A Baizal*, Donni Richasdy

School of Computing, Telkom University, Bandung, Indonesia

Email: ¹[email protected], ²,*[email protected], ³[email protected]

Corresponding Author Email: [email protected]

Abstract−EatAja is a startup in Indonesia that provides a mobile application-based food and beverage ordering solution for restaurants. The EatAja application uses transaction data to recommend food and beverage menus to customers. Previous studies have developed recommender systems for it using the Apriori and Collaborative Filtering methods. However, both methods have shortcomings, namely a lack of personalization and low scalability. Learning methods based on matrix factorization can overcome these problems. In this study, we improve the food and beverage recommender system in the EatAja application using the Alternating Least Square (ALS) matrix factorization method on Apache Spark. We compare the results of the recommender system using the ALS method with those of the Collaborative Filtering method, using the Mean Absolute Error (MAE) as the evaluation metric. The results show that the MAE decreased by 0.07 with the ALS matrix factorization method.

Keywords: Recommender System; Learning Method; Matrix Factorization; Alternating Least Square; EatAja

1. INTRODUCTION

A recommender system is a system that uses various data sources to make decisions according to the wishes of the user [1][2]. It is one of the essential applications of artificial intelligence in the commercial domain [3]. Research on recommender systems has grown recently, from traditional methods such as content-based filtering to machine learning methods. Companies in the technology industry, such as Amazon, Netflix, Facebook, and YouTube, often use recommender systems to increase sales or user interaction [4]. One example in an e-commerce marketplace is Amazon, which utilizes user feedback data such as the list of products viewed and purchased by customers. This interaction data can be analyzed to predict which products will be purchased in the future [5]. With this recommender system, Amazon increased its sales [6].

A recommender system can also be implemented in a food and beverage business application. The EatAja application focuses on ordering food and beverages online. When the application recommends a menu that suits each customer's preferences, customers can more easily choose the menu they want to order.

Previous studies used two different methods to create a recommender system in the EatAja application. The first method is Apriori [7], which utilizes data mining techniques. The second method is Collaborative Filtering, which takes advantage of similarities between two or more users to recommend products [8]. Although both methods are straightforward and intuitive, they have drawbacks: Apriori loses the personalization factor, while Collaborative Filtering has low scalability and cannot overcome the problem of overfitting [1].

One method that can overcome the shortcomings of the Collaborative Filtering and Apriori methods is ALS matrix factorization. The ALS algorithm is one of the popular implementations of matrix factorization because it is advantageous in at least two respects. First, it can counter overfitting through a regularization parameter. Second, a recommender system using the ALS method can be built on both implicit and explicit data [9][10].

Based on the problems discussed above, in this study we developed a recommender system for the EatAja application using the ALS method. We compare the accuracy obtained with ALS against the accuracy obtained with the Collaborative Filtering method to determine the effect of using the ALS method in the food and beverage business domain on the EatAja platform.

2. RESEARCH METHODOLOGY

2.1 Recommender System

A recommender system is a subclass of information filtering systems that can predict the "rating" or "preference" a user would give to an item [11]. Recommendation systems are important because of their potential to help customers in their shopping journey in many ways [3]. A recommender system collects data from users directly or indirectly [12]. After the data are collected, the chosen algorithm computes food and beverage recommendations for the user based on the user's parameters. The three types of recommender systems are:

a. Collaborative filtering, which uses user-item interaction data;

b. Content-based filtering, which uses user and item attribute information; and

c. Hybrid recommender systems, which combine Collaborative Filtering and Content-Based Filtering.

2.2 User Feedback

User feedback is qualitative and quantitative data from customers about their likes, dislikes, impressions, and requests about a product [13]. User feedback can potentially improve machine learning systems significantly [14].

To train the model, the recommender system requires feedback data from the user. There are two types of user feedback:

a. Implicit: When interacting with an item, the user is usually unaware that he or she is providing feedback on the item. For example, a user clicks on a specific item or adds it to the cart [15].

b. Explicit: When interacting with an item, the user is aware that he or she is providing feedback on the item. For example, users give ratings and reviews on items [15].

In this study, we use explicit feedback data.

2.3 Matrix Factorization Methods

Matrix factorization is a way to decompose a matrix into a product of two or more matrices. High correspondence between item and user factors leads to a recommendation [16]. Matrix factorization is widely used in recommender systems. In addition, it offers much flexibility for modeling various real-life situations [9][17]. Matrix factorization-based approaches have proven to be efficient for rating-based recommender systems [18].

Each item 𝑖 is associated with a vector 𝑞_𝑖∈ ℝ^𝑓, and each user 𝑢 is associated with a vector 𝑝_𝑢∈ ℝ^𝑓. For a given item 𝑖, the elements of 𝑞_𝑖 measure the extent to which the item possesses those factors, positive or negative. For a given user 𝑢, the elements of 𝑝_𝑢 measure the extent of interest the user has in items that are high on the corresponding factors, positive or negative. The resulting dot product, 𝑞_𝑖^𝑇𝑝_𝑢, captures the interaction between user 𝑢 and item 𝑖, that is, the user’s overall interest in the item’s characteristics. This approximates user 𝑢’s rating of item 𝑖, which is denoted by 𝑟_𝑢𝑖, leading to the estimate:

$$\hat{r}_{ui} = q_i^T p_u \tag{1}$$

To learn the factor vectors (𝑝_𝑢 and 𝑞_𝑖 ), the system minimizes the regularized squared error on the set of known ratings:

$$\min_{q^*, p^*} \sum_{(u,i) \in K} (r_{ui} - q_i^T p_u)^2 + \lambda \left( \|q_i\|^2 + \|p_u\|^2 \right) \tag{2}$$
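Equations 1 and 2 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the factor matrices and ratings below are hypothetical toy values (not EatAja data), the function names are ours, and the penalty term is read as being applied once per known rating, which is one possible reading of Equation 2.

```python
import numpy as np


def predict(q_i, p_u):
    """Estimated rating: dot product of item and user factors (Equation 1)."""
    return float(np.dot(q_i, p_u))


def regularized_loss(ratings, P, Q, lam):
    """Regularized squared error over the known ratings (Equation 2).

    ratings: dict mapping (u, i) -> observed rating r_ui
    P: user factors, shape (n_users, f); Q: item factors, shape (n_items, f)
    Assumption: the penalty is added once per known rating.
    """
    total = 0.0
    for (u, i), r_ui in ratings.items():
        err = r_ui - predict(Q[i], P[u])
        penalty = np.dot(Q[i], Q[i]) + np.dot(P[u], P[u])
        total += err ** 2 + lam * penalty
    return float(total)


# Hypothetical toy values with f = 2 latent factors.
P = np.array([[1.0, 0.5], [0.2, 1.5]])
Q = np.array([[2.0, 1.0], [0.5, 2.0]])
ratings = {(0, 0): 3.0, (1, 1): 4.0}

r_hat_00 = predict(Q[0], P[0])
loss = regularized_loss(ratings, P, Q, lam=0.1)
```

Here `r_hat_00` is the estimate for user 0 and item 0, and `loss` is the quantity that training tries to reduce by adjusting `P` and `Q`.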

2.4 Alternating Least Square

Alternating Least Squares is an algorithm in the collaborative filtering paradigm [19]. ALS is an iterative optimization process in which each iteration moves the factors closer to a representation of the original data by solving a least-squares problem. In short, ALS minimizes the sum of the squared differences between the predicted ratings and the actual ratings.

We chose to use the ALS method in this study because it can overcome the problems of the previously used methods, namely low personalization and low scalability. In addition, the ALS method can overcome the overfitting problem through regularization parameters and can be trained on both implicit and explicit data [9][10].

Equation 2 is not convex because both 𝑞_𝑖 and 𝑝_𝑢 are unknown. If we fix one of the unknowns, the optimization problem becomes quadratic and can be solved optimally [9]. Thus, ALS techniques alternate between fixing the 𝑞_𝑖’s and fixing the 𝑝_𝑢’s. When all 𝑝_𝑢’s are fixed, the system recomputes the 𝑞_𝑖’s by solving a least-squares problem, and vice versa. This ensures that each step decreases Equation 2 until convergence [20].
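The alternation described above can be sketched with NumPy as follows. This is a minimal single-machine illustration, not the Apache Spark implementation used in this study. The toy rating matrix, the function names, and the choice to scale the penalty by the number of ratings of each user or item (which corresponds to applying the penalty of Equation 2 once per known rating) are assumptions made for the example.

```python
import numpy as np


def objective(R, mask, P, Q, lam):
    """Equation 2, with the penalty applied once per known rating (assumption)."""
    users, items = np.nonzero(mask)
    err = R[users, items] - np.sum(P[users] * Q[items], axis=1)
    penalty = np.sum(P[users] ** 2, axis=1) + np.sum(Q[items] ** 2, axis=1)
    return float(np.sum(err ** 2 + lam * penalty))


def als(R, mask, f=2, lam=0.1, iters=10, seed=0):
    """Alternate between closed-form least-squares solves for P and Q."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, f))
    Q = rng.normal(scale=0.1, size=(n_items, f))
    eye = np.eye(f)
    history = []
    for _ in range(iters):
        # Fix all q_i and solve a regularized least-squares problem per user.
        for u in range(n_users):
            idx = mask[u]
            if idx.any():
                Qu = Q[idx]
                A = Qu.T @ Qu + lam * idx.sum() * eye
                P[u] = np.linalg.solve(A, Qu.T @ R[u, idx])
        # Fix all p_u and solve a regularized least-squares problem per item.
        for i in range(n_items):
            idx = mask[:, i]
            if idx.any():
                Pi = P[idx]
                A = Pi.T @ Pi + lam * idx.sum() * eye
                Q[i] = np.linalg.solve(A, Pi.T @ R[idx, i])
        history.append(objective(R, mask, P, Q, lam))
    return P, Q, history


# Hypothetical toy ratings (0 marks an unknown rating).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0
P, Q, history = als(R, mask, f=2, lam=0.1, iters=10)
```

Because each half-step exactly minimizes the objective over one block of variables while the other block is fixed, the values recorded in `history` never increase from one iteration to the next.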

The observed rating is broken down into four components: global average, item bias, user bias, and user-item interaction. This allows each component to explain only the part of the signal relevant to it. The system learns by minimizing the squared error function [9]:

$$\min_{p^*, q^*, b^*} \sum_{(u,i) \in K} (r_{ui} - \mu - b_u - b_i - p_u^T q_i)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 + b_u^2 + b_i^2 \right) \tag{3}$$
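The four-component decomposition inside Equation 3 can be illustrated with a short sketch. All numbers below are hypothetical and chosen only to make the arithmetic easy to follow; the function name is ours.

```python
import numpy as np


def predict_with_bias(mu, b_u, b_i, p_u, q_i):
    """Predicted rating = global average + user bias + item bias + interaction."""
    return mu + b_u + b_i + float(np.dot(p_u, q_i))


# Hypothetical values: the global average is 3.7, this user rates 0.3 below
# average, this item is rated 0.5 above average, and the latent factors
# contribute an interaction of 0.4 * 1.0 + 0.1 * (-2.0) = 0.2.
mu, b_u, b_i = 3.7, -0.3, 0.5
p_u = np.array([0.4, 0.1])
q_i = np.array([1.0, -2.0])

r_hat = predict_with_bias(mu, b_u, b_i, p_u, q_i)
```

The prediction is the sum 3.7 - 0.3 + 0.5 + 0.2, so each component accounts for a separate part of the rating.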

2.5 Mean Absolute Error

Mean Absolute Error (MAE) is a method to measure the accuracy of a predictive model. The MAE value represents the average absolute error between the forecast results and the actual value. The formula for MAE:

$$MAE = \frac{\sum_{i=1}^{N} |P_i - q_i|}{N} \tag{4}$$

where 𝑁 is the total number of ratings, 𝑃𝑖 is the predicted rating, and 𝑞𝑖 is the actual rating given by the user. MAE is well suited to measuring accuracy on user ratings, and its calculation is relatively straightforward [21].
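As a small worked example of the MAE calculation, the sketch below averages the absolute differences between predicted and actual ratings. The rating values are hypothetical and the function name is ours.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - q) for p, q in zip(predicted, actual)) / len(predicted)


# Hypothetical predicted and actual ratings on the 1-5 scale.
predicted = [4.5, 3.0, 2.0, 5.0]
actual = [5, 3, 1, 4]

mae = mean_absolute_error(predicted, actual)  # (0.5 + 0 + 1 + 1) / 4
```

A lower value means the predictions are, on average, closer to the ratings users actually gave.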

2.6 System Design

The first stage is data collection, which gathers the data required for this study. The dataset was accessed through the EatAja API on November 19, 2021, and consists of menu data, order data, and restaurant data. After that, data exploration is used to analyze patterns in the dataset, and data cleaning is performed to prevent unexpected results from the model. The data are then converted into the format required by the tool. Next, the ALS model produces predicted recommendations for users, and its accuracy is measured using MAE. Finally, we evaluate the recommender system's results and compare its accuracy with that of the Collaborative Filtering method.

Figure 1. Design Of System

3. RESULT AND DISCUSSION

3.1 Description Of Dataset

In this study, the recommender system was built using the ALS method and tested on a dataset from the EatAja application. The three datasets collected were menus, orders, and restaurants. Detailed dataset information is in Table 1. Ratings range from 1 to 5.

Table 1. Dataset Details

| No. | Description | Total |
|---|---|---|
| 1 | Item (Menu) | 98 |
| 2 | Items with an average rating of ≥ 3 | 98 |
| 3 | Items with an average rating of ≤ 3 | 0 |
| 4 | Partners | 10 |
| 5 | Number of users who ordered | 2890 |
| 6 | Users who gave many ratings (≥ 5) | 737 |
| 7 | Restaurant | 8 |

The histograms in Figure 2 and Figure 3 show that the dataset does not contain many outliers. Outliers are data points whose characteristics differ significantly from other observations and that appear as extreme values of either single variables or combinations of variables. Because few such points are present, the dataset used in this study is suitable for obtaining relevant results.

Figure 2. Distributions Rating by User

Figure 2 is a histogram of the number of ratings given by each user. The number of ratings per user spans a narrow range, from a minimum of 2 to a maximum of 11.

Figure 3. Distributions Rating by Menu

Figure 3 is a histogram of the number of ratings per menu. The number of ratings per menu spans a narrow range, from a minimum of 80 to a maximum of 120. This means there are no unpopular items with only a tiny number of ratings. Users can rate the menus they order in the EatAja application. Table 2 shows the meaning of each rating.

Table 2. Rating Details

| Rating | Description |
|---|---|
| 1 | Very Bad |
| 2 | Bad |
| 3 | Enough |
| 4 | Good |
| 5 | Very Good |

3.2 Analysis of Result

MAE is used to measure the prediction accuracy of the model trained in this study. The MAE value depends on the parameter values that are set, i.e., the maximum number of iterations (𝑚𝑎𝑥𝐼𝑡𝑒𝑟), the regularization parameter (𝑟𝑒𝑔𝑃𝑎𝑟𝑎𝑚), and the number of latent factors used (rank). The recommender system is implemented on Apache Spark and Apache Hadoop clusters using the ALS library, and it is tested by varying these parameter values to find the best MAE value. The input parameter values used are in Table 3.

Table 3. Parameter Setting

| Parameter | Testing Values |
|---|---|
| maxIter | 1, 2, 3, 4, 5 |
| rank | 1, 2, 3 |
| regParam | 0.02, 0.04, 0.06, 0.08, 0.1 |

Using the input parameter values in Table 3, we obtained the MAE values shown in Table 4. The minimum MAE is 0.896, obtained with a maximum iteration value of 5, a regularization parameter of 0.04, and a rank of 1.

Table 4. Mean Absolute Error

| rank | maxIter | regParam = 0.02 | regParam = 0.04 | regParam = 0.06 | regParam = 0.08 | regParam = 0.1 |
|---|---|---|---|---|---|---|
| 1 | 1 | 4.17 | 4.12 | 4.08 | 4.06 | 4.04 |
| 1 | 2 | 3.6 | 3.49 | 3.41 | 3.35 | 3.3 |
| 1 | 3 | 2.07 | 1.8 | 1.6 | 1.5 | 1.48 |
| 1 | 4 | 1.09 | 1.007 | 0.99 | 0.97 | 0.96 |
| 1 | 5 | 0.9 | 0.896 | 0.899 | 0.902 | 0.902 |
| 2 | 1 | 4.5 | 4.4 | 4.3 | 4.2 | 4.1 |
| 2 | 2 | 4.3 | 4.1 | 3.6 | 3.8 | 3.7 |
| 2 | 3 | 3.7 | 3.3 | 3.1 | 2.9 | 2.7 |
| 2 | 4 | 2.7 | 2.2 | 1.9 | 1.7 | 1.6 |
| 2 | 5 | 1.9 | 1.5 | 1.3 | 1.1 | 1 |
| 3 | 1 | 5.01 | 4.6 | 4.3 | 4.1 | 4.02 |
| 3 | 2 | 4.4 | 3.9 | 3.6 | 3.3 | 3.1 |
| 3 | 3 | 3.8 | 3.2 | 2.7 | 2.4 | 2.2 |
| 3 | 4 | 3.4 | 2.6 | 2.2 | 1.9 | 1.6 |
| 3 | 5 | 2.9 | 2.2 | 1.8 | 1.5 | 1.4 |
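The selection of the best parameter combination from Table 4 can be reproduced with a short script. The MAE values are transcribed from Table 4; the variable and function names are ours, and the script only searches the reported values rather than retraining any model.

```python
REG_PARAMS = [0.02, 0.04, 0.06, 0.08, 0.1]

# MAE values transcribed from Table 4, keyed by (rank, maxIter).
# Each list follows the order of REG_PARAMS.
TABLE_4 = {
    (1, 1): [4.17, 4.12, 4.08, 4.06, 4.04],
    (1, 2): [3.6, 3.49, 3.41, 3.35, 3.3],
    (1, 3): [2.07, 1.8, 1.6, 1.5, 1.48],
    (1, 4): [1.09, 1.007, 0.99, 0.97, 0.96],
    (1, 5): [0.9, 0.896, 0.899, 0.902, 0.902],
    (2, 1): [4.5, 4.4, 4.3, 4.2, 4.1],
    (2, 2): [4.3, 4.1, 3.6, 3.8, 3.7],
    (2, 3): [3.7, 3.3, 3.1, 2.9, 2.7],
    (2, 4): [2.7, 2.2, 1.9, 1.7, 1.6],
    (2, 5): [1.9, 1.5, 1.3, 1.1, 1.0],
    (3, 1): [5.01, 4.6, 4.3, 4.1, 4.02],
    (3, 2): [4.4, 3.9, 3.6, 3.3, 3.1],
    (3, 3): [3.8, 3.2, 2.7, 2.4, 2.2],
    (3, 4): [3.4, 2.6, 2.2, 1.9, 1.6],
    (3, 5): [2.9, 2.2, 1.8, 1.5, 1.4],
}


def best_setting(table, reg_params):
    """Return (mae, rank, max_iter, reg_param) for the lowest reported MAE."""
    return min(
        (mae, rank, max_iter, reg)
        for (rank, max_iter), row in table.items()
        for reg, mae in zip(reg_params, row)
    )


best = best_setting(TABLE_4, REG_PARAMS)

# Lowest reported MAE for each rank, across all maxIter and regParam values.
best_per_rank = {
    r: min(m for (rk, _), row in TABLE_4.items() if rk == r for m in row)
    for r in (1, 2, 3)
}
print(best)  # (0.896, 1, 5, 0.04)
```

The tuple in `best` matches the combination reported in the text, and `best_per_rank` shows that within the tested grid the lowest MAE rises as the rank increases.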

In the experiment for the regularization parameter in Figure 4, the maximum iteration value is set to 5 and the rank parameter to 1; the minimum MAE is found at a regularization parameter of 0.04. The regularization parameter helps the model generalize so that it does not overfit. The appropriate value differs from model to model. If the regularization parameter value is too low, the model will be too complex and will overfit. If it is too high, the model will underfit. Underfitting occurs when the model cannot capture the pattern in the data, so it cannot make predictions correctly. Overfitting occurs when the model is too focused on a specific training dataset, so it cannot make predictions correctly when given another, similar dataset. In either case, the model produces low accuracy. Therefore, the best regularization parameter value lies between the extremes of the range of values. In this study, the dataset is not too large and the data is not too sparse, so the regularization parameter value does not need to be very large or very small.
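The mechanism by which the regularization parameter limits model complexity can be illustrated on a single regularized least-squares problem of the kind ALS solves at each step. This sketch uses randomly generated, hypothetical data and our own function name; it shows only how the penalty shrinks the solution and does not reproduce the experiment in Figure 4.

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.normal(size=(20, 3))   # hypothetical fixed factors: 20 ratings, 3 latent factors
b = rng.normal(size=20)        # hypothetical observed ratings


def ridge_solution(A, b, lam):
    """Closed-form minimizer of ||A x - b||^2 + lam * ||x||^2."""
    f = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(f), A.T @ b)


# The first five values are the regParam values tested in this study; the last
# two are deliberately extreme to make the shrinkage visible.
lams = [0.02, 0.04, 0.06, 0.08, 0.1, 10.0, 1000.0]
norms = [float(np.linalg.norm(ridge_solution(A, b, lam))) for lam in lams]
```

The norm of the solution decreases as the penalty grows: a very small penalty leaves the factors free to fit the training ratings closely, while a very large penalty pushes them toward zero, which corresponds to the underfitting case described above.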

Figure 4. Experiment For Regularization Parameter

The experiment for the rank parameter is in Figure 5. With the maximum iteration value set to 5 and the regularization parameter set to 0.04, the higher the rank value, the higher the resulting MAE. The maximum number of iterations is too small for ranks 2, 3, 4, and 5: increasing the model's rank increases the model's complexity, so the model requires more training time.

Figure 5. Experiment for Rank Parameter

The following experiment is for the maximum iteration parameter. With the regularization parameter set to 0.04 and the rank parameter set to 1, Figure 6 shows that the MAE decreases as the maximum iteration value increases; that is, the higher the maximum iteration, the smaller the MAE. In this experiment, we ran at most 5 iterations because the configuration of the hardware used to run the model limited the maximum number of iterations.

Figure 6. Experiment for Maximum Iteration

As shown in Figure 7 and Figure 8, after analyzing and checking the recommendations from the best model, we found that each user gets the same top 10 recommended items, so there is no personalization in these recommendations. This happens because the best model uses rank = 1, which means there is only one latent factor for each user and each item. Thus, the recommendation for each user is determined solely by the items with the highest latent factor values and does not take full advantage of the product of user and item latent factors. Therefore, the model with rank = 1 is not suitable as the best model.

Figure 7. Menu Prediction Value to User

Figure 8. Menu Recommendation Results to Each User
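The loss of personalization at rank = 1 can be illustrated with a small sketch. With a single latent factor, the predicted score for user u and item i is the product of two scalars, so when every user factor is positive, all users rank the items in the same order. The factor values below are randomly generated and hypothetical, and the positivity of the user factors is an assumption of the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 6, 3

# Rank 1: one positive latent factor per user and one per item (hypothetical).
p = rng.uniform(0.5, 1.5, size=(n_users, 1))
q = rng.uniform(1.0, 5.0, size=(n_items, 1))
scores_rank1 = p @ q.T
top_k_rank1 = np.argsort(-scores_rank1, axis=1)[:, :k]
same_for_all = bool((top_k_rank1 == top_k_rank1[0]).all())

# Rank 2: two users who weight the two factors differently (hypothetical).
P2 = np.array([[1.0, 0.0], [0.0, 1.0]])
Q2 = np.array([[3.0, 1.0], [1.0, 3.0]])
top_item_rank2 = np.argmax(P2 @ Q2.T, axis=1)
```

In the rank 1 case `same_for_all` is true, since every row of `top_k_rank1` lists the same items, while in the rank 2 case the two users receive different top items.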

In this study, a comparative analysis is used to evaluate the performance of the proposed ALS-based recommender system. The comparison of the two methods is in Table 5. The table shows that the ALS matrix factorization method performs better because it obtains a lower MAE value; the difference between the MAE values of the two methods is 0.07. The experiments show that the ALS algorithm is suitable for training on explicit feedback datasets in which the user assigns a rating to an item. The ALS method effectively addresses the problems of scalability and data sparsity.

Table 5. MAE Comparison

| Method | MAE |
|---|---|
| Collaborative Filtering | 0.96 |
| Alternating Least Square | 0.89 |
| Score difference | 0.07 |

There are two reasons why the ALS method is better than the Collaborative Filtering method. First, in matrix factorization, the similarity between users and items is obtained from latent factors produced by dimensionality reduction, while Collaborative Filtering uses the ratings directly. Searching for similarity on dense data, such as latent factors, produces better neighbors than using sparse data. Second, ALS uses regularization parameters to counter overfitting; ordinary Collaborative Filtering has no such technique.

4. CONCLUSION

Based on the testing and analysis results of the recommender system, we conclude that the ALS method applied to transaction data from the EatAja application performs better than the Collaborative Filtering method. The ALS method is also suitable for addressing overfitting on sparse data and for improving prediction accuracy. The results show that the MAE decreased to 0.89 with the ALS matrix factorization method. Thus, the ALS algorithm is suitable for training on explicit feedback datasets in which users rate items, which will benefit users when placing orders in the EatAja application. Future research should apply the method to other domains and search for the best combination of parameter values for the model. In addition, more transaction data should be processed, because the algorithm's performance depends on the amount of data used.

REFERENCES

[1] C. C. Aggarwal, Recommender Systems The Textbook, vol. 39, no. 4. 2016.

[2] Z. K. A. Baizal, D. H. Widyantoro, and N. U. Maulidevi, “Computational model for generating interactions in conversational recommender system based on product functional requirements,” Data Knowl. Eng., vol. 128, 2020, doi: 10.1016/j.datak.2020.101813.

[3] L. Evalina, A. I. Riaddy, S. Savitri, and R. A. Permadi, “Toward improving similar item recommendation for a C2C marketplace,” 2019. doi: 10.1109/ICACSIS47736.2019.8979770.

[4] P. B. Thorat, R. M. Goudar, and S. Barve, “Survey on Collaborative Filtering, Content-based Filtering and Hybrid Recommendation System,” Int. J. Comput. Appl., vol. 110, no. 4, 2015, doi: 10.5120/19308-0760.

[5] Z. K. A. Baizal, D. Tarwidi, Adiwijaya, and B. Wijaya, “Tourism Destination Recommendation Using Ontology-based Conversational Recommender System,” Int. J. Comput. Digit. Syst., vol. 10, no. 1, 2021, doi: 10.12785/IJCDS/100176.

[6] M. R. Rezaei, “Amazon Product Recommender System,” pp. 1–5, 2021, [Online]. Available: http://arxiv.org/abs/2102.04238

[7] R. Produk, M. Perusahaan, I. P. Yuda, D. Jaya, and B. Dirgantoro, “Implementasi Algoritma a-Priori Untuk Sistem ( Recommendation System for E-Commerce Eataja Company Partner Using a-Priori Algorithm ),” e-Proceeding Eng., vol. 8, no. 5, pp. 6222–6229, 2021.

[8] M. Naufal et al., “SISTEM REKOMENDASI LAYANAN PEMESANAN MAKANAN ‘EatAja’ MENGGUNAKAN ALGORITMA COLLABORATIVE FILTERING,” e-Proceeding Eng., vol. 8, no. 5, 2021.

[9] Y. Koren, R. Bell, and C. Volinsky, “Matrix factorization techniques for recommender systems,” Computer (Long. Beach. Calif)., vol. 42, no. 8, 2009, doi: 10.1109/MC.2009.263.

[10] Z. Gantner, “Supervised Machine Learning Methods for Item Recommendation,” Opus.Bsz-Bw.De, 2012.

[11] L. Sharma and A. Gera, “A Survey of Recommendation System : Research Challenges,” Int. J. Eng. Trends Technol., vol. 4, no. 5, 2013.

[12] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Item-based collaborative filtering recommendation algorithms,” Proc. 10th Int. Conf. World Wide Web, WWW 2001, pp. 285–295, 2001, doi: 10.1145/371920.372071.

[13] X. Yu et al., “Recommendation in heterogeneous information networks with implicit user feedback,” 2013. doi: 10.1145/2507157.2507230.

[14] S. Stumpf et al., “Toward harnessing user feedback for machine learning,” 2007. doi: 10.1145/1216295.1216316.

[15] F. Marisa, S. S. S. Ahmad, Z. I. M. Yusoh, T. M. Akhriza, W. Purnomowati, and R. K. Pandey, “Performance comparison of collaborative-filtering approach with implicit and explicit data,” Int. J. Adv. Comput. Sci. Appl., vol. 10, no. 10, 2019, doi: 10.14569/ijacsa.2019.0101016.

[16] Y. Bao, H. Fang, and J. Zhang, “TopicMF: Simultaneously exploiting ratings and reviews for recommendation,” in Proceedings of the National Conference on Artificial Intelligence, 2014, vol. 1. doi: 10.1609/aaai.v28i1.8715.

[17] A. Grover and J. Leskovec, “node2vec Real-time Video Recommendation Exploration Categories and Subject Descriptors,” World Neurosurg., vol. 95, no. 1, 2016.

[18] G. Takács, I. Pilászy, and B. Németh, “Investigation of Various Matrix Factorization Methods for Large Recommender Systems Categories and Subject Descriptors,” 2nd Netflix-KDD Work., vol. 1, 2008.

[19] E. V. V. Cervantes, L. V. C. Quispe, and J. E. O. Luna, “Performance of alternating least squares in a distributed approach using GraphLab and MapReduce,” in CEUR Workshop Proceedings, 2015, vol. 1478.

[20] R. M. Bell and Y. Koren, “Scalable collaborative filtering with jointly derived neighborhood interpolation weights,” 2007. doi: 10.1109/ICDM.2007.90.

[21] C. J. Willmott and K. Matsuura, “Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance,” Clim. Res., vol. 30, no. 1, 2005, doi: 10.3354/cr030079.