Measuring Immigrants Adoption of Natives Shopping Consumption with Machine Learning

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track (ECML PKDD 2020)

Abstract

“Tell me what you eat and I will tell you what you are”. Jean Anthelme Brillat-Savarin was among the first to recognize the relationship between identity and food consumption. Food adoption choices are much less exposed to external judgment and social pressure than other individual behaviours, and they can be observed over a long period. This makes them an interesting basis for, among other applications, studying the integration of immigrants from a food consumption viewpoint. In this work we analyze immigrants’ food consumption in retail shopping data to understand if and how it converges towards that of natives. As the core contribution of our proposal, we define an individual’s score of adoption of natives’ consumption habits as the probability of being recognized as a native by a machine learning classifier, thus adopting a completely data-driven approach. We measure immigrants’ adoption of natives’ consumption behavior over a long period, and we identify different trends. A case study on real data from a large nation-wide supermarket chain reveals that we can distinguish five main groups of immigrants depending on their trends of native consumption adoption.
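
The adoption score described above can be illustrated with a small sketch. This is not the authors' TINCA implementation: the two features (shares of spending on two product groups), the toy profiles, and the hand-written logistic regression are all assumptions made for the example, and any classifier that outputs class probabilities could take its place.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a logistic regression by batch gradient descent.
    X: list of feature vectors; y: 1 for native, 0 for immigrant."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def adoption_score(model, x):
    """Probability that the classifier labels shopping profile x as native."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical profiles: [share of spend on group A, share of spend on group B].
natives = [[0.8, 0.1], [0.7, 0.2], [0.9, 0.1], [0.75, 0.15]]
immigrants = [[0.2, 0.7], [0.1, 0.8], [0.3, 0.6], [0.25, 0.65]]
X = natives + immigrants
y = [1] * len(natives) + [0] * len(immigrants)
model = train_logistic(X, y)

# One hypothetical immigrant customer observed over consecutive periods.
trajectory = [[0.2, 0.7], [0.4, 0.5], [0.6, 0.3], [0.8, 0.1]]
scores = [adoption_score(model, x) for x in trajectory]
```

Here `scores` is the customer's adoption trend: one probability per period, rising as the profile moves towards the native examples. Series of this kind are what a trend analysis would then group.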

Notes

  1. Data for refugees of Turkey: http://d4r.turktelekom.com.tr/.

  2. We assume that the expenditure function E also accounts for the quantity.

  3. We also consider features derived from others, like \( AL \) and \( AE \), since they might capture different aspects of the customer's shopping behavior. Where needed, redundant features can be removed at the preprocessing stage preceding the training phase of the machine learning classifier.

  4. Notice that we are implicitly assuming that the food consumption habits of natives do not change over time. While not true in general, we empirically observed that it holds for the vast majority of customers in our data. Studying natives’ evolution over time is part of our future work.

  5. The source code of TINCA is available here: https://github.com/riccotti/TINCA.

  6. https://www.unicooptirreno.it/, data: https://sobigdata.d4science.org/.

  7. The 100 product groups are available in the shared repository. The grouping was performed manually to respect the implicit semantic meaning. Each product group models on average 1.9 ± 2.0 categories of items of the UniCoop dataset. The largest product groups are those modeling “bread”, “fish”, and “vegetables”.

  8. Eurostat data: https://ec.europa.eu/eurostat/statistics-explained/index.php/Migration_and_migrant_population_statistics.

  9. Pearson correlation of 0.75 and Spearman correlation of 0.78, in both cases with p-value < 0.0005.

  10. We leave to future work the study of the effect of other functions specific to time series clustering, such as dynamic time warping.

  11. For each group we emphasize the countries having the largest relative number of customers in that group, normalized by the total number of customers from that specific country. Focusing on countries with a larger absolute presence would be less interesting, as a few countries with a very large overall presence (e.g. Romania, Switzerland and Germany) would simply overwhelm the others in all groups.
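
The per-country normalization described in the last note can be sketched as follows. The country labels, group labels, and counts are hypothetical toy values, and the helper names are introduced only for this illustration.

```python
from collections import Counter

def relative_presence(customers):
    """customers: iterable of (country, group) pairs, one per customer.
    Returns {group: {country: fraction of that country's customers in the group}}."""
    customers = list(customers)
    per_country = Counter(country for country, _ in customers)
    per_pair = Counter(customers)
    shares = {}
    for (country, group), count in per_pair.items():
        shares.setdefault(group, {})[country] = count / per_country[country]
    return shares

def top_countries(shares, group, k=1):
    """Countries ranked by relative (not absolute) presence in the given group."""
    ranked = sorted(shares[group].items(), key=lambda kv: kv[1], reverse=True)
    return [country for country, _ in ranked[:k]]

# Hypothetical data: country "A" has 100 customers, country "B" only 10.
customers = (
    [("A", "g1")] * 60 + [("A", "g2")] * 40
    + [("B", "g1")] * 1 + [("B", "g2")] * 9
)
shares = relative_presence(customers)
```

By absolute count, country "A" dominates group "g2" (40 customers against 9), but after normalization country "B" stands out, since 90% of its customers fall in that group against 40% for "A".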

References

  1. Abramitzky, R., et al.: Cultural assimilation during the age of mass migration. Technical report, National Bureau of Economic Research (2016)

  2. Agrawal, R., et al.: Fast algorithms for mining association rules. In: Proceedings of 20th International Conference Very Large Data Bases, VLDB, vol. 1215, pp. 487–499 (1994)

  3. Akerlof, G.A., et al.: Identity economics. Econ. Voice 7(2), 1–3 (2010)

  4. Alba, R., et al.: Only English by the third generation? Demography 39(3), 467 (2002)

  5. Alesina, A., Tabellini, G., Trebbi, F.: Is Europe an optimal political area? Technical report, National Bureau of Economic Research (2017)

  6. Arai, M., et al.: Renouncing personal names: an empirical examination of surname change and earnings. J. Labor Econ. 27(1), 127–147 (2009)

  7. Atkin, D.: The caloric costs of culture: evidence from Indian migrants. Am. Econ. Rev. 106(4), 1144–1181 (2016)

  8. Bertoli, S., et al.: Integration of Syrian refugees: insights from D4R, media events and housing market data. In: Salah, A.A., Pentland, A., Lepri, B., Letouzé, E. (eds.) Guide to Mobile Data Analytics in Refugee Scenarios, pp. 179–199. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12554-7_10

  9. Bertrand, M., Kamenica, E.: Coming apart? Cultural distances in the United States over time. Technical report, National Bureau of Economic Research (2018)

  10. Borjas, G.J.: The analytics of the wage effect of immigration. IZA J. Migr. 2(1), 1–25 (2013). https://doi.org/10.1186/2193-9039-2-22

  11. Borjas, G.J.: Unraveling the immigration narrative. N&C (2016)

  12. Brillat-Savarin, J.A.: Physiologie du goût. Charpentier (1841)

  13. Bronnenberg, B.J., et al.: The evolution of brand preferences: evidence from consumer migration. Am. Econ. Rev. 102(6), 2472–2508 (2012)

  14. Bucheli, J.R., Fontenla, M., Waddell, B.J.: Return migration and violence. World Dev. 116, 113–124 (2019)

  15. Chaffey, D., Ellis-Chadwick, F., Mayer, R., Johnston, K.: Internet Marketing: Strategy, Implementation and Practice. Pearson Education, London (2009)

  16. Chamberlain, B.P., et al.: Customer lifetime value prediction using embeddings. In: ACM SIGKDD, pp. 1753–1762 (2017)

  17. Chen, M.-C., Chiu, A.-L., Chang, H.-H.: Mining changes in customer behavior in retail marketing. Expert Syst. Appl. 28(4), 773–781 (2005)

  18. Docquier, F., et al.: Emigration and democracy. The World Bank (2011)

  19. Dustmann, C., et al.: Labor supply shocks, native wages, and the adjustment of local employment. Q. J. Econ. 132(1), 435–483 (2017)

  20. Fryer Jr., R.G., Levitt, S.D.: The causes and consequences of distinctively black names. Q. J. Econ. 119(3), 767–805 (2004)

  21. Guidotti, R., Coscia, M., Pedreschi, D., Pennacchioli, D.: Behavioral entropy and profitability in retail. In: 2015 IEEE DSAA, pp. 1–10. IEEE (2015)

  22. Guidotti, R., Gabrielli, L.: Recognizing residents and tourists with retail data using shopping profiles. In: Guidi, B., Ricci, L., Calafate, C., Gaggi, O., Marquez-Barja, J. (eds.) GOODTECHS 2017. LNICST, vol. 233, pp. 353–363. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76111-4_35

  23. Guidotti, R., Gabrielli, L., Monreale, A., et al.: Discovering temporal regularities in retail customers’ shopping behavior. EPJ Data Sci. 7(1), 1–26 (2018)

  24. Guidotti, R., Monreale, A., Nanni, M.: Clustering individual transactional data for masses of users. In: KDD, pp. 195–204. ACM (2017)

  25. Guidotti, R., Monreale, A., Ruggieri, S., et al.: A survey of methods for explaining black box models. ACM Comput. Surv. (CSUR) 51(5), 1–42 (2018)

  26. Guidotti, R., Rossetti, G., et al.: Personalized market basket prediction with temporal annotated recurring sequences. IEEE TKDE 31(11), 2151–2163 (2018)

  27. Herdağdelen, A., State, B., Adamic, L., Mason, W.: The social ties of immigrant communities in the United States. In: ACM WEBSCI, pp. 78–84 (2016)

  28. Hyndman, R.J., et al.: Forecasting: Principles and Practice. OTexts, Melbourne (2018)

  29. Kulkarni, V., et al.: Freshman or fresher? Quantifying the geographic variation of language in online social media. In: AAAI ICWSM, pp. 615–618 (2016)

  30. Lamanna, F., Lenormand, M., et al.: Immigrant community integration in world cities. PLoS ONE 13(3), e0191612 (2018)

  31. Logan, T.D., Rhode, P.W.: Moveable feasts: a new approach to endogenizing tastes. Manuscript, The Ohio State University (2010)

  32. Luo, L., et al.: Tracking the evolution of customer purchase behavior segmentation via a fragmentation-coagulation process. In: IJCAI, pp. 2414–2420 (2017)

  33. Magdy, A., Ghanem, T.M., Musleh, M., Mokbel, M.F.: Exploiting geo-tagged tweets to understand localized language diversity. In: GeoRich, pp. 1–6 (2014)

  34. Qian, Z., et al.: Social boundaries and marital assimilation: interpreting trends in racial and ethnic intermarriage. Am. Sociol. Rev. 72(1), 68–94 (2007)

  35. Ray, K.: The Migrants Table: Meals And Memories In. Temple University Press, Philadelphia (2004)

  36. Sîrbu, A., et al.: Human migration: the big data perspective. Int. J. Data Sci. Anal. 1–20 (2020). https://doi.org/10.1007/s41060-020-00213-5

  37. Spilimbergo, A.: Democracy and foreign education. Am. Econ. Rev. 99(1), 528–543 (2009)

  38. Tan, P.-N., et al.: Introduction to Data Mining. Pearson Education India, Noida (2016)

  39. Wedel, M., Kamakura, W.A.: Market Segmentation: Conceptual and Methodological Foundations, vol. 8. Springer, New York (2012)

  40. Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003)

Acknowledgment

This work is partially supported by the European Community H2020 programme under the funding schemes: H2020-INFRAIA-2019-1: Res. Infr. G.A. 871042 SoBigData++, G.A. 825619 AI4EU, G.A. 761758 Humane AI, and G.A. 780754 Track&Know. We thank UniCoop Tirreno for providing the data, and Roberto Zicaro for preliminary studies on the proposed methodology and analysis.

Corresponding author

Correspondence to Riccardo Guidotti.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Guidotti, R. et al. (2021). Measuring Immigrants Adoption of Natives Shopping Consumption with Machine Learning. In: Dong, Y., Ifrim, G., Mladenić, D., Saunders, C., Van Hoecke, S. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12461. Springer, Cham. https://doi.org/10.1007/978-3-030-67670-4_23

  • DOI: https://doi.org/10.1007/978-3-030-67670-4_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67669-8

  • Online ISBN: 978-3-030-67670-4

  • eBook Packages: Computer Science, Computer Science (R0)
