A Similarity Measure for Collaborative Filtering with Implicit Feedback

Tong Queue Lee 1, Young Park 2, and Yong-Tae Park 3

1 Dept. of Mobile Internet, Dongyang Technical College, 62-160 Gocheok-dong, Guro-gu, Seoul 152-714, Korea
[email protected]
2 Dept. of Computer Science & Information Systems, Bradley University, W. Bradley Ave., Peoria, IL 61625, USA
[email protected]
3 Dept. of Industrial Engineering, Seoul National University, San 56-1, Sillim-dong, Gwanak-gu, Seoul 151-742, Korea
[email protected]

Abstract. Collaborative Filtering (CF) is a widely accepted method of creating recommender systems. CF is based on the similarities among users or items. Similarity measures such as the Pearson Correlation Coefficient and the Cosine Similarity work quite well for explicit ratings, but do not capture the real similarity in ratings derived from implicit feedback. This paper identifies some problems that existing similarity measures have with implicit ratings by analyzing the characteristics of implicit feedback, and proposes a new similarity measure, called Inner Product, that is more appropriate for implicit ratings. We conducted experiments on user-based collaborative filtering using the proposed similarity measure in two e-commerce environments. Empirical results show that our similarity measure better captures similarities for implicit ratings and leads to more accurate recommendations. Our inner product-based similarity measure could be useful for CF-based recommender systems using implicit ratings, in which negative ratings are difficult to incorporate.

Keywords: E-commerce, recommender system, collaborative filtering, implicit feedback, similarity measure, recommendation accuracy.
D.-S. Huang, L. Heutte, and M. Loog (Eds.): ICIC 2007, LNAI 4682, pp. 385-397, 2007. © Springer-Verlag Berlin Heidelberg 2007

1 Introduction

Today users face the problem of choosing the right products or services within a flood of information. A variety of recommender systems help users select relevant products or services. Among these, collaborative filtering-based recommender systems are used effectively in many practical areas [1,2]. A hybrid method is also used, combining item content information with user feedback data [3].

Collaborative filtering determines a user's preference from the user's rating data. In general, rating data is generated by explicit feedback from users. Obtaining explicit feedback is not always easy and is sometimes infeasible. Users tend to be reluctant to give explicit feedback because of its intrusiveness. In some cases, users give arbitrary ratings, leading to incorrect recommendations. There has been research on constructing rating data from implicit feedback, such as Web logs, instead of explicit feedback [4,5,6,7].

Once user rating data is established, collaborative filtering computes similarity among users or items using some similarity measure. A number of similarity measures have been used; the Pearson Correlation Coefficient and the Cosine Similarity are two popular ones. These measures do not distinguish between explicit and implicit rating data. They work quite well with explicit ratings, but do not capture the real similarity of implicit ratings, because rating data derived from implicit feedback differs from explicit rating data.

In this paper we examine the characteristics of implicit feedback and propose a new similarity measure. We investigate the effectiveness of the proposed measure through experiments on real data from e-commerce environments. Our similarity measure could be used for collaborative filtering-based recommender systems using only implicit ratings, in which negative ratings are difficult to incorporate.

The rest of this paper is organized as follows: Section 2 describes the characteristics of implicit ratings compared with explicit ratings. Some problems of existing similarity measures with implicit ratings are discussed in Section 3. In Section 4, a new similarity measure for implicit ratings is proposed. Experiments and empirical results are described in Section 5. Section 6 concludes with future work.
2 Deriving Ratings from Implicit Feedback

User preference is the basis of collaborative filtering. There are two ways of finding user preferences: explicit feedback and implicit feedback. Ratings and reviews are popular forms of explicit feedback. Ratings are easily quantifiable and thus are used as the basis of collaborative filtering in practice; this is called rating-based CF. For example, consider explicit ratings for movies on a scale of 1 (negative preference) to 5 (positive preference), as shown in Table 1.

Table 1. Explicit Movie Ratings (scale 1-5)

          Movie 1   Movie 2   Movie 3   Movie 4
User A       5                   3         1
User B                 1         5         4
User C       1
User A's preference for Movie 1 and User B's preference for Movie 3 are high, meaning they like those movies. User A's preference for Movie 4, User B's preference for Movie 2, and User C's preference for Movie 1 are very low, meaning they dislike those movies. With explicit feedback, users can clearly express positive or negative preferences. However, it is not always easy to obtain explicit feedback. It is practically impossible
in some situations, such as mobile e-commerce environments. In such cases, recommender systems must rely on implicit feedback. Implicit feedback includes purchase patterns, page visits, page viewing times, and Web surfing paths. This data is usually obtained by analyzing the Web log. This approach requires preprocessing to build implicit ratings by extracting meaningful data from the whole Web log; the amount of meaningful data in the Web log is usually small. Collaborative filtering based on this data is called log-based CF [8,9].

With implicit feedback, users cannot clearly express negative preferences, so implicit ratings constructed from implicit feedback do not include negative preferences. For example, consider the implicit ratings shown in Table 2, constructed from the number of visits to each item's Web page.

Table 2. Implicit Ratings from the Number of Visits to Each Item's Web Page

          Item 1   Item 2   Item 3   Item 4
User A      15                 7        3
User B                2       13       12
User C       4
From Table 2, we infer that User A has a high preference for Item 1 and User B has a high preference for Item 3. We can also see that User A's preference for Item 4 and User B's preference for Item 2 are relatively low. However, it is difficult to conclude that they dislike those items: the values are derived from implicit feedback, and lower values do not necessarily correspond to lower preferences. As another example, consider the implicit ratings constructed from item purchases (Table 3).

Table 3. Implicit Ratings from the Purchase of Items

          Item 1   Item 2   Item 3   Item 4
User A       1                 1        1
User B                1        1        1
User C       1
In Table 3, 1 indicates that the user purchased the item. In this case, we can infer that the user likes the purchased item. However, we cannot conclude that the user dislikes all the items that were not purchased.
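Rating matrices like those in Tables 2 and 3 can be derived mechanically from raw feedback records. The sketch below is a minimal illustration of this preprocessing step, not the authors' actual pipeline; the log format (a list of user-item pairs) and the function names are our assumptions.

```python
from collections import defaultdict

def visit_count_ratings(page_log):
    """Implicit ratings as in Table 2: count each user's visits to each item's page."""
    ratings = defaultdict(int)
    for user, item in page_log:
        ratings[(user, item)] += 1
    return dict(ratings)

def purchase_ratings(purchase_log):
    """Implicit ratings as in Table 3: 1 if the user purchased the item, absent otherwise."""
    return {(user, item): 1 for user, item in purchase_log}

# Toy Web log: one record per page visit.
page_log = [("A", "Item 1")] * 15 + [("B", "Item 2")] * 2 + [("A", "Item 3")] * 7
print(visit_count_ratings(page_log)[("A", "Item 1")])  # 15

# Toy purchase log: one record per purchase.
print(purchase_ratings([("A", "Item 1"), ("B", "Item 2")]))
```

Note that non-rated pairs are simply absent, not zero: treating absence as dislike is exactly the negative-preference pitfall discussed above.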
3 Similarity Problems with Implicit Ratings

A similarity measure is used in collaborative filtering to determine the similarity between two users or items from the users' item ratings. The Pearson Correlation Coefficient and the Cosine Similarity are two popular measures of similarity. These two measures work quite well with explicit user ratings. However, some problems arise when they are applied to implicit ratings.

3.1 Pearson Correlation Coefficient

The Pearson Correlation Coefficient is one of the most widely used similarity measures, from the early days of collaborative filtering to the present [1]. It is defined as follows:
P_sim(a, b) = \frac{\sum_j (P_{aj} - \bar{P}_a)(P_{bj} - \bar{P}_b)}{\sqrt{\sum_j (P_{aj} - \bar{P}_a)^2} \, \sqrt{\sum_j (P_{bj} - \bar{P}_b)^2}}    (1)

Here, a and b are users, P_{aj} is the current preference of user a on item j, P_{bj} is the current preference of user b on item j, \bar{P}_a is the average current preference of user a, and \bar{P}_b is the average current preference of user b.

The Pearson Correlation Coefficient accounts for differences in users' average preferences by subtracting the average preference from the current preference. By dividing by the standard deviations, it also accounts for differences in the spread of users' rating values. For instance, consider the explicit ratings for movies on a scale of 1 (negative preference) to 5 (positive preference) shown in Table 4.

Table 4. Explicit Ratings (scale 1-5)

          Movie 1   Movie 2   Movie 3
User A       1         2         3
User B       5         4         3
From Table 4, we see that the rating trends of User A and User B are opposite. When we compute the Pearson Correlation Coefficient between User A and User B, it is negative, which shows that these two users are dissimilar, as illustrated in Fig. 1.

[Fig. 1. Similarity using the Pearson Correlation Coefficient: the normalized rating curves of User A and User B over the movies run in opposite directions, so the similarity is negative and the two users are dissimilar.]

The Pearson Correlation Coefficient appears to be a good similarity measure for explicit ratings given by users. However, it does not capture the real similarity between users from implicit ratings. For example, consider the number of Web page visits. Table 5 shows an example implicit rating matrix.

Table 5. Implicit Ratings Based on Page Visit Counts

          Page 1   Page 2   Page 3
User A       2        4        6
User B      10        8        6

Note that Table 5 looks similar to Table 4, but it contains the number of visits rather than actual rating values. Thus, as with Table 4, the Pearson Correlation Coefficient between User A and User B is negative, which implies that these two users are dissimilar. However, because the values in the implicit rating matrix do not express any negative preferences, it is difficult to conclude that the two users are dissimilar. Smaller numbers of visits do not necessarily correspond to negative preferences. In fact, User A and User B may have very similar preference trends, as shown in Fig. 2.

[Fig. 2. Similarity with implicit feedback: the visit-count curves of User A and User B over the pages follow the same trend, so the similarity is positive and the two users are somewhat similar.]

3.2 Cosine Similarity

The Cosine Similarity is also widely used in collaborative filtering. It is defined as follows:
C_sim(a, b) = \frac{\sum_j P_{aj} P_{bj}}{\sqrt{\sum_j P_{aj}^2} \, \sqrt{\sum_j P_{bj}^2}}    (2)

Here, a and b are users, P_{aj} is the current preference of user a on item j, and P_{bj} is the current preference of user b on item j.
The Cosine Similarity between user u1 and user u2 can be viewed in terms of the angle between u1's preference vector and u2's preference vector: the smaller the angle, the greater the similarity between the users. For example, consider explicit ratings for articles on a scale of 1 (negative preference) to 5 (positive preference), as shown in Table 6.

Table 6. Explicit Ratings (scale 1-5)

          Article 1   Article 2
User A        2           3
User B        1           2
User C        2           4
The Cosine Similarity between User A and User B is the same as the Cosine Similarity between User A and User C. Since User C's rating values are proportionately larger than User B's (exactly twice as large), we infer that User B and User C are equally similar to User A. The Cosine Similarity normalizes a user's rating values in order to factor out the user's tendencies in choosing rating values. Thus, as shown in Fig. 3, the Cosine Similarity seems reasonable for explicit ratings.

[Fig. 3. Similarity using Cosine Similarity: the preference vectors OA, OB, and OC over the articles; since angle AOB = angle AOC, Sim(A,B) = Sim(A,C).]

Like the Pearson Correlation Coefficient, however, the Cosine Similarity is problematic in capturing the real similarity between users from implicit ratings. For example, consider page viewing time. Table 7 shows an example implicit rating matrix.

Table 7. Implicit Ratings Based on View Time (seconds)

          Article 1   Article 2
User A       20          30
User B       10          20
User C       20          40
Note that Table 7 looks similar to Table 6, but it contains the viewing duration in seconds rather than actual rating values. The Cosine Similarity between User A and User B is the same as the Cosine Similarity between User A and User C.
Still, it is difficult to conclude that User B and User C are similar to User A to the same extent, because the values in the implicit rating matrix are not preference values. It is more natural to view the values in the implicit rating matrix themselves, without normalization, as preferences. User C spent more time viewing the articles than User B did. Thus, as shown in Fig. 4, it may be that User C is more similar to User A than User B is.

[Fig. 4. Similarity with implicit feedback: the view-time vectors OB and OC have different lengths, so Sim(A,B) < Sim(A,C).]
4 A New Similarity Measure for Implicit Ratings

We propose a new similarity measure for implicit ratings. It addresses the problems of negative preferences and normalization in implicit ratings constructed from various kinds of implicit feedback. The new similarity measure, called Inner Product, is defined as follows:

IP_sim(a, b) = \vec{P}_a \cdot \vec{P}_b = \sum_j P_{aj} P_{bj}    (3)

Here, a and b are users, \vec{P}_a is the preference vector of user a, \vec{P}_b is the preference vector of user b, P_{aj} is the current preference of user a on item j, and P_{bj} is the current preference of user b on item j.

Compared with the Pearson Correlation Coefficient and the Cosine Similarity, the Inner Product measure better captures real similarity among users from implicit ratings. For example, consider the implicit ratings based on page visit counts in Table 5; the rating values indicate only positive preferences. The similarity value between User A and User B using the Pearson Correlation Coefficient is -1, which implies that the two users are dissimilar. Using the Inner Product measure, however, the similarity value is 88, indicating that the two users have very similar preferences. The Inner Product measure thus reflects users' real preferences, as expressed by implicit ratings, better than the Pearson Correlation Coefficient does. Moreover, the Pearson Correlation Coefficient cannot be computed when the standard deviation of User A or User B is 0, because the denominator becomes 0; the Inner Product similarity can be computed regardless of the users' standard deviations.
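The Table 5 comparison can be verified with a short sketch (an illustration only; the function names are ours, and the vectors are the visit counts from Table 5):

```python
import math

def pearson_sim(p, q):
    """Pearson Correlation Coefficient (Eq. 1); undefined if either vector has zero variance."""
    mp, mq = sum(p) / len(p), sum(q) / len(q)
    num = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    den = (math.sqrt(sum((a - mp) ** 2 for a in p))
           * math.sqrt(sum((b - mq) ** 2 for b in q)))
    return num / den

def ip_sim(p, q):
    """Proposed Inner Product similarity (Eq. 3)."""
    return sum(a * b for a, b in zip(p, q))

user_a = [2, 4, 6]    # User A's page visit counts (Table 5)
user_b = [10, 8, 6]   # User B's page visit counts (Table 5)

print(pearson_sim(user_a, user_b))  # close to -1.0: maximally dissimilar under Pearson
print(ip_sim(user_a, user_b))       # 88: strongly similar under the Inner Product
```

Note also that `pearson_sim` raises a ZeroDivisionError for a constant rating vector, while `ip_sim` is always defined.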
Consider, for example, the implicit rating matrix in Table 7, based on page view time. The Cosine Similarity between User A and User B is 8/√65 ≈ 0.992, and the Cosine Similarity between User A and User C is also 8/√65. However, because User C spent more time viewing the articles than User B, User C seems to be more similar to User A than User B is. When the Inner Product measure is used, the similarity value between User A and User B is 800, while the similarity value between User A and User C is 1600, twice the similarity value between User A and User B. The Inner Product measure therefore reflects similarity between users more accurately than the Cosine Similarity in the context of implicit ratings. The proposed Inner Product measure has the following improvements over the major existing similarity measures:
• The Inner Product measure solves the negative-preference problem of the Pearson Correlation Coefficient.
• The Inner Product measure solves the normalization problem of the Cosine Similarity.
• The Inner Product measure remains computable in the zero-standard-deviation case, where the Pearson Correlation Coefficient is undefined.
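The Table 7 comparison can likewise be checked numerically (a sketch; the function names are ours, and the vectors are the view times from Table 7):

```python
import math

def cosine_sim(p, q):
    """Cosine Similarity (Eq. 2)."""
    num = sum(a * b for a, b in zip(p, q))
    return num / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))

def ip_sim(p, q):
    """Inner Product similarity (Eq. 3)."""
    return sum(a * b for a, b in zip(p, q))

user_a, user_b, user_c = [20, 30], [10, 20], [20, 40]  # view times from Table 7

# Cosine cannot distinguish B from C with respect to A (both equal 8/sqrt(65)) ...
print(abs(cosine_sim(user_a, user_b) - cosine_sim(user_a, user_c)) < 1e-9)  # True
# ... but the Inner Product ranks C twice as close to A as B.
print(ip_sim(user_a, user_b), ip_sim(user_a, user_c))  # 800 1600
```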
5 Experiments and Results

In order to investigate the effectiveness of our similarity measure, we conducted experiments on user-based collaborative filtering recommender systems using two data sets: data set 1 and data set 2. Data set 1 is a set of purchase transactions of character images in a mobile environment, provided by SKT in 2004. SKT is one of the leading mobile service companies in Korea. The number of users that purchased at least once is 1,922; the number of character images is 9,131; and the total number of transactions is 65,101. Data set 2 is the Web log of an on-line cosmetics store, "H", in 2005. "H" is an Internet shopping mall in Korea. The number of users is 208; the number of items is 1,682; and the total number of transactions is 16,959.

We used 80% of the transaction data as training data. The remaining 20% was used to test the accuracy of the user-based collaborative filtering recommender systems. We used the 10 nearest neighbors and recommended 10 items. The simulation was implemented in VBA (Visual Basic for Applications) on the Excel worksheet containing the data.

5.1 Experiment I-A: Using Implicit Ratings from Purchase Information of Data Set 1
In Experiment I-A, we constructed implicit ratings from purchase information. When someone purchased an item, we assigned the rating value 1 to the user-item pair.
An example user-item rating matrix is shown in Table 8. Here, User A purchased Items 1, 3, and 4; User B purchased Items 2 and 3; and User C purchased Items 1 and 4.

Table 8. Implicit Rating Matrix Example Based on Purchase Information Only

          Item 1   Item 2   Item 3   Item 4
User A       1                 1        1
User B                1        1
User C       1                          1
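For reference, the user-based CF recommendation step over such a matrix can be sketched as follows. This is a simplified stand-in for the authors' VBA implementation (the neighbor-selection and scoring details are our assumptions); it uses the Inner Product similarity over sparse rating dictionaries:

```python
def inner_product(p, q):
    """Inner Product similarity over sparse rating dicts; missing entries count as 0."""
    return sum(p.get(i, 0) * q.get(i, 0) for i in set(p) | set(q))

def recommend(target, ratings, k=10, n=10):
    """User-based CF: score unseen items by similarity-weighted neighbor ratings."""
    others = [(inner_product(ratings[target], ratings[u]), u)
              for u in ratings if u != target]
    neighbors = sorted(others, reverse=True)[:k]   # k most similar users
    scores = {}
    for sim, u in neighbors:
        for item, r in ratings[u].items():
            if item not in ratings[target]:        # only items the target has not rated
                scores[item] = scores.get(item, 0) + sim * r
    return [i for i, _ in sorted(scores.items(), key=lambda x: -x[1])][:n]

# Binary purchase matrix from Table 8.
ratings = {"A": {"Item1": 1, "Item3": 1, "Item4": 1},
           "B": {"Item2": 1, "Item3": 1},
           "C": {"Item1": 1, "Item4": 1}}
print(recommend("C", ratings))  # ['Item3', 'Item2']
```

User C's only overlapping neighbor is User A (similarity 2), so Item 3, which A purchased and C has not, is ranked first.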
In order to evaluate accuracy, we compared the number of items actually purchased from the items recommended by the user-based collaborative filtering-based recommender systems using the Pearson Correlation Coefficient, Cosine Similarity, and Inner Product. The empirical results of Experiment I-A are summarized in Table 9. Table 9. Empirical Results with Purchase Information Only of Data Set 1
# of items purchased from recommended items # of items per user
Pearson Correlation Coefficient
Cosine Similarity
New IP Similarity
123
127
118
0.11
0.11
0.11
The Pearson Correlation Coefficient led to 123 actual purchases and the Cosine Similarity to 127; our Inner Product measure resulted in 118 actual purchases from the recommended lists. As shown in Fig. 5, the three similarity measures all achieved similar accuracy.

[Fig. 5. Comparison of Similarity Measures with Purchase Information Only of Data Set 1: a bar chart of the number of items recommended and purchased (axis 0-250) for the three measures.]
The Cosine Similarity showed slightly better accuracy than the other two, and our Inner Product measure slightly worse. This is because implicit ratings built solely from purchase information are binary.
5.2 Experiment I-B: Using Implicit Ratings from Purchase and Time Information of Data Set 1
In Experiment I-B, we constructed the implicit ratings from both purchase and time information. We used two kinds of time information: item launch time and user purchase time. Item launch time has been used to improve the scalability and accuracy of collaborative filtering-based recommender systems [10]; user purchase time has also been used to improve recommendation accuracy [11]. The original rating values are weighted by assigning more weight to recent launch times and recent purchase times. We divided the launch times and purchase times into three groups each, and gave more weight to the recent groups. The weight scheme used is given in Table 10.

Table 10. Weight Scheme for Time Information

                        Old launch   Middle launch   Recent launch
                          group          group           group
Old purchase group         0.7            1.0             1.3
Middle purchase group      1.7            2.0             2.3
Recent purchase group      2.7            3.0             3.3
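The scheme in Table 10 amounts to a lookup on the (launch group, purchase group) pair. A minimal sketch (the group labels and function name are ours):

```python
# Time weights from Table 10, keyed by (launch group, purchase group).
WEIGHTS = {
    ("old", "old"): 0.7,    ("old", "middle"): 1.7,    ("old", "recent"): 2.7,
    ("middle", "old"): 1.0, ("middle", "middle"): 2.0, ("middle", "recent"): 3.0,
    ("recent", "old"): 1.3, ("recent", "middle"): 2.3, ("recent", "recent"): 3.3,
}

def weighted_rating(rating, launch_group, purchase_group):
    """Scale a base rating (1 for a purchase) by its time weight."""
    return rating * WEIGHTS[(launch_group, purchase_group)]

# User A's purchase of Item 1: Old launch group, Old purchase group.
print(weighted_rating(1, "old", "old"))        # 0.7
# User C's purchase of Item 4: Recent launch group, Recent purchase group.
print(weighted_rating(1, "recent", "recent"))  # 3.3
```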
For example, consider the user-item rating matrix shown in Table 8, and assume the following time information: Item 1 belongs to the Old launch group, Items 2 and 3 belong to the Middle launch group, and Item 4 belongs to the Recent launch group. Suppose that the User A-Item 1, User B-Item 2, and User A-Item 4 purchases belong to the Old purchase group, the User B-Item 3 purchase belongs to the Middle purchase group, and the User C-Item 1, User A-Item 3, and User C-Item 4 purchases belong to the Recent purchase group. The corresponding user-item rating matrix is shown in Table 11.

Table 11. Implicit Rating Matrix Example Based on Purchase and Time Information

          Item 1   Item 2   Item 3   Item 4
User A      0.7                3       1.3
User B                1        2
User C      2.7                        3.3

The empirical results of Experiment I-B are summarized in Table 12.

Table 12. Empirical Results with Purchase and Temporal Information of Data Set 1

                                              Pearson Correlation   Cosine       New IP
                                              Coefficient           Similarity   Similarity
# of items purchased from recommended items          180                170          229
# of items per user                                  0.16               0.15         0.21

The Pearson Correlation Coefficient resulted in 180 actual purchases and the Cosine Similarity in 170; our Inner Product measure yielded 229 actual purchases from the recommended lists. Fig. 6 depicts the accuracy of the three similarity measures.
[Fig. 6. Comparison of Similarity Measures with Purchase and Temporal Information of Data Set 1: a bar chart of the number of items recommended and purchased (axis 0-250) for the three measures.]
Our Inner Product measure showed a 27% increase in accuracy over the Pearson Correlation Coefficient and a 35% increase over the Cosine Similarity.

5.3 Experiment II: Using Implicit Ratings from Web Log of Data Set 2
In Experiment II, we constructed implicit ratings from Web-log information as follows: when a user clicked an item, we assigned the rating value 1 to the user-item pair; when a user put an item in the shopping cart, we assigned the rating value 2; and when a user actually purchased an item, we assigned the rating value 3. We compared recommendation accuracy using the MAE (Mean Absolute Error). The empirical results of Experiment II are summarized in Table 13, and the accuracy comparison of the three similarity measures is shown in Fig. 7.

Table 13. Empirical Results with Web Log of Data Set 2

                       Pearson Correlation   Cosine       New IP
                       Coefficient           Similarity   Similarity
Mean Absolute Error          0.483             0.472        0.418
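MAE over the held-out ratings can be computed as follows (a sketch; the paper does not specify its evaluation code, and the toy values below are illustrative):

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    errors = [abs(predicted[key] - actual[key]) for key in actual]
    return sum(errors) / len(errors)

# Held-out ratings on the 1 (click) / 2 (cart) / 3 (purchase) scale.
actual = {("u1", "i1"): 3, ("u1", "i2"): 1, ("u2", "i1"): 2}
predicted = {("u1", "i1"): 2.5, ("u1", "i2"): 1.5, ("u2", "i1"): 2.0}
print(round(mean_absolute_error(predicted, actual), 3))  # 0.333
```

Lower MAE means more accurate recommendations, which is why the smallest value in Table 13 marks the best measure.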
Our Inner Product measure reduced the MAE by 13% relative to the Pearson Correlation Coefficient and by 11% relative to the Cosine Similarity.
[Fig. 7. Comparison of Similarity Measures with Web Log of Data Set 2: a bar chart of MAE (axis 0.38-0.5) for the three measures.]
6 Conclusion and Future Work

We have presented a new similarity measure suitable for implicit ratings. It is based on the inner product and resolves some problems that existing similarity measures (including the Pearson Correlation Coefficient and the Cosine Similarity) have with implicit ratings in collaborative filtering. Empirical results from two e-commerce environments (including a mobile environment) showed that user-based collaborative filtering using the proposed similarity measure produced more accurate recommendations. Our inner product similarity measure could be useful for collaborative filtering-based recommender systems using implicit ratings, in which negative ratings are not readily incorporated.

Future work will focus on conducting more experiments with a variety of implicit rating data. Further research is also needed to incorporate rating scales and rating-average shifts into the inner product measure.

Acknowledgments. We would like to thank Dr. Y. H. Cho for permitting us to share the data sets. This work is supported in part by Dongyang Technical College Academy Research Expenses and the Caterpillar Research Fellowship.
References

1. Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., Riedl, J.: GroupLens: An Open Architecture for Collaborative Filtering of Netnews. Proceedings of CSCW '94 (1994) 175-186
2. Linden, G., Smith, B., York, J.: Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Internet Computing (2003)
3. Melville, P., Mooney, R.J., Nagarajan, R.: Content-Boosted Collaborative Filtering for Improved Recommendations. Proceedings of the Eighteenth National Conference on Artificial Intelligence (2002) 187-192
4. Caglayan, A., Snorrason, M., Jacoby, J., Mazzu, J., Jones, R., Kumar, K.: Learn Sesame - A Learning Agent Engine. Applied Artificial Intelligence, Vol. 11 (1997) 393-412
5. Middleton, S.E., Shadbolt, N.R., De Roure, D.C.: Ontological User Profiling in Recommender Systems. ACM Trans. Information Systems, Vol. 22, No. 1 (2004) 54-88
6. Oard, D.W., Kim, J.: Implicit Feedback for Recommender Systems. Proceedings of the AAAI Workshop on Recommender Systems (1998)
7. Schein, A.I., Popescul, A., Ungar, L.H., Pennock, D.M.: Methods and Metrics for Cold-Start Recommendations. Proceedings of the Ann. Int'l ACM SIGIR Conf. (2002)
8. Mobasher, B., Dai, H., Luo, T., Sun, Y., Zhu, J.: Automatic Personalization Based on Web Usage Mining. Communications of the ACM, Vol. 43, No. 8 (2000) 142-151
9. Anderson, C.R., Domingos, P., Weld, D.S.: Personalizing Web Sites for Mobile Users. Proceedings of the 10th International World Wide Web Conference (2001)
10. Tang, T.Y., Winoto, P., Chan, K.C.C.: Scaling Down Candidate Sets Based on the Temporal Feature of Items for Improved Hybrid Recommendations. Intelligent Techniques in Web Personalization, LNAI 3169 (2003) 169-185
11. Ding, Y., Li, X., Orlowska, M.: Recency-Based Collaborative Filtering. Australian Computer Science Communications, Vol. 28, No. 2, Australasian Database Conference, ACM Digital Library (2006) 99-107