Hierarchical Attention Factorization Machine for CTR Prediction

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13246)

Abstract

Click-through rate (CTR) prediction is a crucial task in recommender systems and online advertising, and its most critical step is modeling feature interactions. Factorization machines (FMs) were proposed to model second-order feature interactions and thereby improve prediction accuracy, but they cannot capture high-order feature interactions. In recent years, many state-of-the-art models have employed shallow neural networks to capture high-order feature interactions. However, some studies have shown that the additive feature interactions computed by feedforward neural networks are inefficient at capturing common high-order feature interactions. To address this problem, we propose a hierarchical attention network for capturing high-order feature interactions. At each layer, the network refines the feature representation produced by the previous layer, yielding a new representation that contains the previous layer's feature interaction information; the next layer then refines this representation again. We also use shallow neural networks to apply higher-order non-linear interactions to the feature interaction terms, further improving prediction accuracy. Experimental results on four real-world datasets demonstrate that the proposed HFM model outperforms state-of-the-art models.
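
For readers unfamiliar with the second-order interaction term that factorization machines compute, the following is a minimal sketch of the standard FM pairwise term (as introduced by Rendle), not of the HFM model proposed in this paper. The function names, the toy feature vector `x`, and the embedding table `V` are illustrative assumptions. The sketch evaluates the sum over all feature pairs of <v_i, v_j> x_i x_j in two ways: directly, and with the well-known linear-time reformulation.

```python
from itertools import combinations


def fm_second_order_naive(x, V):
    """Pairwise FM term: sum over i < j of <V[i], V[j]> * x[i] * x[j]."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        total += dot * x[i] * x[j]
    return total


def fm_second_order_fast(x, V):
    """Same quantity in O(n * k) time, using the identity
    0.5 * sum_f [ (sum_i V[i][f] * x[i])^2 - sum_i (V[i][f] * x[i])^2 ].
    """
    k = len(V[0])  # embedding dimension
    total = 0.0
    for f in range(k):
        weighted = [V[i][f] * x[i] for i in range(len(x))]
        s = sum(weighted)
        sq = sum(w * w for w in weighted)
        total += s * s - sq
    return 0.5 * total


# Hypothetical toy input: four features, three of them active,
# each with a 2-dimensional latent vector.
x = [1.0, 0.0, 1.0, 1.0]
V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.1]]

naive = fm_second_order_naive(x, V)
fast = fm_second_order_fast(x, V)
```

Both functions return the same value up to floating-point error. The reformulated version is what makes FMs practical on sparse, high-dimensional CTR data, and it also shows why the model stops at second order: each product involves exactly two latent vectors.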

Supported by the Natural Science Foundation of China under Grant 61962038, and by the Guangxi Bagui Teams for Innovation and Research under Grant 201979.

Notes

  1. https://www.kaggle.com/c/criteo-display-ad-challenge

  2. https://www.kaggle.com/c/avazu-ctr-prediction

  3. https://grouplens.org/datasets/movielens

  4. https://baltrunas.info/research-menu/frappe

Author information

Corresponding authors

Correspondence to Yunfei Yin or Faliang Huang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Long, L., Yin, Y., Huang, F. (2022). Hierarchical Attention Factorization Machine for CTR Prediction. In: Bhattacharya, A., et al. Database Systems for Advanced Applications. DASFAA 2022. Lecture Notes in Computer Science, vol 13246. Springer, Cham. https://doi.org/10.1007/978-3-031-00126-0_27

  • DOI: https://doi.org/10.1007/978-3-031-00126-0_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-00125-3

  • Online ISBN: 978-3-031-00126-0

  • eBook Packages: Computer Science, Computer Science (R0)
