Estimating city-level poverty rate based on e-commerce data with machine learning

Electronic Commerce Research

Abstract

Indonesia has many big data sources, including social media, financial transactions, transportation, call detail records, and e-commerce. Such data have been considered potential resources to complement periodic surveys and censuses in monitoring development indicators such as poverty levels. E-commerce data in particular could potentially represent the real expenditure of households and so align more closely with the formal calculation of the poverty line than other datasets do. This research contributes a framework for estimating poverty rates from e-commerce data using machine learning algorithms. The influence of individual items and aspects of e-commerce data on poverty rate estimation was also investigated. The experimental results showed that e-commerce data could potentially be used as a proxy for calculating city-level poverty rates. Cars and motorbikes were found to be the two most significant items for poverty prediction in Indonesia.
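
The idea of using e-commerce activity as a proxy for a city-level poverty rate can be illustrated with a very small sketch. The abstract does not name the algorithms or features used, so everything below is an assumption made for illustration: the city names, the single feature (share of car and motorbike listings), the poverty figures, and the choice of ordinary least squares as a stand-in for the paper's machine learning models. The numbers are invented and the direction of the relationship is not a result taken from the article.

```python
from statistics import mean

# Hypothetical records: (city, share of car/motorbike listings, poverty rate in %).
# All values are made up for illustration; they are not data from the article.
CITIES = [
    ("city_a", 0.30, 6.2),
    ("city_b", 0.20, 8.9),
    ("city_c", 0.10, 12.1),
    ("city_d", 0.25, 7.4),
    ("city_e", 0.15, 10.6),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return (sum(squared) / len(squared)) ** 0.5


def leave_one_out_rmse(records):
    """Hold out each city in turn, fit on the others, and score the held-out prediction."""
    actual, predicted = [], []
    for i, (_, x, y) in enumerate(records):
        train = records[:i] + records[i + 1:]
        slope, intercept = fit_line([r[1] for r in train], [r[2] for r in train])
        actual.append(y)
        predicted.append(slope * x + intercept)
    return rmse(actual, predicted)


if __name__ == "__main__":
    slope, intercept = fit_line([r[1] for r in CITIES], [r[2] for r in CITIES])
    print("slope:", slope, "intercept:", intercept)
    print("leave-one-out RMSE:", leave_one_out_rmse(CITIES))
```

Leave-one-out evaluation is used here only because the toy dataset has five cities; with real data, official poverty figures would serve as the target and a held-out set of cities would be a more natural check.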

Acknowledgements

This work was supported by Pulse Lab Jakarta (PLJ), which is a joint initiative of the United Nations and the Government of Indonesia.

Author information

Corresponding author

Correspondence to Dedy Rahman Wijaya.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wijaya, D.R., Paramita, N.L.P.S.P., Uluwiyah, A. et al. Estimating city-level poverty rate based on e-commerce data with machine learning. Electron Commer Res 22, 195–221 (2022). https://doi.org/10.1007/s10660-020-09424-1

  • DOI: https://doi.org/10.1007/s10660-020-09424-1
