ABSTRACT
The use of IR methodology in the evaluation of recommender systems has become common practice in recent years. IR metrics have been found, however, to be strongly biased towards rewarding algorithms that recommend popular items, the same bias that state-of-the-art recommendation algorithms display. Recent research has confirmed and measured such biases, and proposed methods to avoid them. The fundamental question remains open, though, whether popularity is really a bias we should avoid: whether it could be a useful and reliable signal in recommendation, or whether it is unfairly rewarded by the experimental biases. We address this question at a formal level by identifying and modeling the conditions that can determine the answer, in terms of dependencies between key random variables involving item rating, discovery and relevance. We find conditions that guarantee popularity to be effective, or quite the opposite, and conditions under which the measured metric values reflect true effectiveness or qualitatively deviate from it. We exemplify and confirm the theoretical findings with empirical results. We build a crowdsourced dataset devoid of the usual biases displayed by common publicly available data, in which we illustrate contradictions between the accuracy that would be measured in a common biased offline experimental setting and the actual accuracy that can be measured with unbiased observations.
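The kind of contradiction described above can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes, purely for illustration, that discovery and relevance are independent, that every discovered item gets rated, and that popularity is the expected share of users with an observed positive rating. The `Item` class, the helper functions and all probabilities are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    p_discovery: float  # assumed probability that a user has discovered the item
    p_relevance: float  # assumed probability that the item is relevant to a user


def observed_positive_rate(item: Item) -> float:
    """Expected fraction of users with an observed positive rating.

    Assumes discovery and relevance are independent and that every
    discovered item is rated (both are simplifying assumptions).
    """
    return item.p_discovery * item.p_relevance


def rank(items, key):
    """Return item names sorted by the given score, best first."""
    return [item.name for item in sorted(items, key=key, reverse=True)]


# Hypothetical items: A is widely discovered but less often relevant,
# B is rarely discovered but more often relevant.
items = [Item("A", 0.8, 0.3), Item("B", 0.2, 0.6)]

# Ranking by popularity, i.e. by what a biased offline experiment would observe.
popularity_ranking = rank(items, observed_positive_rate)

# Ranking by true relevance, i.e. what unbiased observations would support.
true_ranking = rank(items, lambda item: item.p_relevance)

print(popularity_ranking)  # ['A', 'B']
print(true_ranking)        # ['B', 'A']
```

Under these assumed numbers the popularity ranking puts A first because it has more observed positive ratings (0.24 versus 0.12 of users), even though B is relevant to twice as many users. With other dependencies between discovery and relevance the two rankings can agree, which is why the conditions matter.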
Index Terms
- Should I Follow the Crowd?: A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems