Mining Skewed and Sparse Transaction Data for Personalized Shopping Recommendation

Hsu, Chun-Nan; Chung, Hao-Hsiang; Huang, Han-Shen

doi:10.1023/B:MACH.0000035471.28235.6d

Published: October 2004

Machine Learning, Volume 57, pages 35–59 (2004)

Chun-Nan Hsu¹,
Hao-Hsiang Chung² &
Han-Shen Huang²


Abstract

A good shopping recommender system can boost sales in a retail store. To provide accurate recommendations, the recommender must accurately predict a customer's preferences, an ability that is difficult to acquire. Conventional data mining techniques, such as association rule mining and collaborative filtering, can generally be applied to this problem, but they rarely produce satisfying results because transaction data are skewed and sparse. In this paper, we report the lessons that we learned in two real-world data mining applications for personalized shopping recommendation. We learned that extending a rating-based collaborative filtering method (e.g., GroupLens) to perform personalized shopping recommendation is not trivial, and that association-rule-based methods (e.g., the IBM SmartPad system) are not appropriate for large-scale prediction of customers' shopping preferences. Instead, a probabilistic graphical model can be more effective in handling skewed and sparse data. By casting collaborative filtering algorithms in a probabilistic framework, we derived HyPAM (Hybrid Poisson Aspect Modelling), a novel probabilistic graphical model for personalized shopping recommendation. Experimental results show that HyPAM outperforms GroupLens and the IBM method by generating much more accurate predictions of what items a customer will actually purchase in the unseen test data. The data sets and the results are made available for download at http://chunnan.iis.sinica.edu.tw/hypam/HyPAM.html.
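
The two data properties named in the abstract can be made concrete with a small sketch. The code below is an illustration only, not the authors' method or data: the toy transactions, the function names `sparsity` and `top_item_share`, and the choice of these two summary measures are all assumptions introduced here. Sparsity is taken as the fraction of empty cells in the binary customer-by-item purchase matrix, and skewness is approximated by the share of all purchases that falls on the most popular items.

```python
from collections import Counter

# Toy transactions (hypothetical data, not from the paper): customer -> items bought.
transactions = {
    "c1": ["milk", "bread"],
    "c2": ["milk"],
    "c3": ["milk", "eggs"],
    "c4": ["milk", "bread", "tea"],
    "c5": ["milk"],
}


def sparsity(transactions):
    """Fraction of empty cells in the binary customer-by-item purchase matrix."""
    items = {item for basket in transactions.values() for item in basket}
    filled = sum(len(set(basket)) for basket in transactions.values())
    return 1.0 - filled / (len(transactions) * len(items))


def top_item_share(transactions, k=1):
    """Share of all purchases accounted for by the k most frequently bought items."""
    counts = Counter(item for basket in transactions.values() for item in basket)
    total = sum(counts.values())
    return sum(n for _, n in counts.most_common(k)) / total


if __name__ == "__main__":
    print(f"sparsity: {sparsity(transactions):.2f}")
    print(f"share of purchases on the top item: {top_item_share(transactions):.2f}")
```

On real retail data both numbers tend to be far more extreme than in this toy case, which is the situation the abstract describes: most customers buy only a tiny fraction of the catalogue, and a few items dominate the purchase counts.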


References

Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Data Bases (pp. 487–499).
Apte, C., Liu, B., Pednault, E. P. D., & Smyth, P. (2002). Business applications of data mining. Communications of the ACM, 45:8, 49–53.
Billsus, D., & Pazzani, M. J. (1998). Learning collaborative information filters. In Proceedings of the Fifteenth International Conference on Machine Learning (pp. 46–54).
Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (pp. 43–52).
Brijs, T., Goethals, B., Swinnen, G., Vanhoof, K., & Wets, G. (2000). A data mining framework for optimal product selection in retail supermarket data: The generalized PROFSET model. In Proceedings of the 6th ACM International Conference on Knowledge Discovery and Data Mining (pp. 300–304).
Cadez, I. V., Smyth, P., & Mannila, H. (2001). Probabilistic modeling of transaction data with applications to profiling, visualization, and prediction. In Proceedings of the 7th ACM International Conference on Knowledge Discovery and Data Mining (pp. 37–46).
Dempster, A., Laird, N., & Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, B39, 1–37.
Goldberg, D., Nichols, D., Oki, B. M., & Terry, D. (1992). Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35:12, 61–70.
Herlocker, J. L., Konstan, J. A., Borchers, A., & Riedl, J. (1999). An algorithmic framework for performing collaborative filtering. In Proceedings of the Conference on Research and Development in Information Retrieval (pp. 230–237).
Hofmann, T. (1999). Probabilistic latent semantic analysis. In Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence (pp. 289–296).
Lawrence, R. D., Almasi, G. S., Kotlyar, V., Viveros, M. S., & Duri, S. (2001). Personalization of supermarket product recommendations. Data Mining and Knowledge Discovery, 5, 11–32.
Ling, C., & Li, C. (1998). Data mining for direct marketing: Problems and solutions. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (pp. 73–79).
Pennock, D. M., Horvitz, E., & Giles, C. L. (2000). Collaborative filtering by personality diagnosis: A hybrid memory- and model-based approach. In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (pp. 473–480).
Popescul, A., Ungar, L., Pennock, D., & Lawrence, S. (2001). Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (pp. 437–444).
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994). GroupLens: An open architecture for collaborative filtering of netnews. In Proceedings of ACM Conference on Computer Supported Cooperative Work (pp. 175–186).
Shardanand, U., & Maes, P. (1995). Social information filtering: Algorithms for automating "Word of Mouth". In Proceedings of ACM Conference on Human Factors in Computing Systems (pp. 210–217).
Wedel, M., & Kamakura, W. A. (1999). Market segmentation: Conceptual and methodological foundations. Kluwer Academic Publishers.


Author information

Authors and Affiliations

Institute of Information Science, Academia Sinica, Taiwan
Chun-Nan Hsu
Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
Hao-Hsiang Chung & Han-Shen Huang


About this article

Cite this article

Hsu, CN., Chung, HH. & Huang, HS. Mining Skewed and Sparse Transaction Data for Personalized Shopping Recommendation. Machine Learning 57, 35–59 (2004). https://doi.org/10.1023/B:MACH.0000035471.28235.6d


Issue Date: October 2004
DOI: https://doi.org/10.1023/B:MACH.0000035471.28235.6d
