Abstract
This paper examines digital inequalities in Nepal using a publicly available dataset. We build several random forest classification models and apply sensitivity analysis, a permutation variable test, and partial dependence analysis to characterize digital inequality in access, skill, and use at the household and individual levels. Our analysis reveals important Nepal-specific findings about digital inequality. In addition, our random forest-based analysis illustrates how non-parametric methods can explicate the complex nonlinear relationships that prevail between demographic variables. This paper also illustrates how sensitivity and partial dependence analyses can aid in interpreting so-called ‘black box’ models such as random forests. One of our notable findings is that caste has very little power to explain the adoption of digital technologies. Gender, on the other hand, is still a strong predictor of an individual’s computer skills. Although the analysis in this paper is limited to Nepal, the methodology applies to similar datasets from other countries as well.
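The permutation idea behind variable importance can be illustrated with a minimal sketch. This is the generic shuffle-one-column form of permutation importance, not necessarily the exact test procedure used in the paper, and the data, the `toy_model` rule, and all variable names below are hypothetical stand-ins for a fitted random forest and the survey variables.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction matches the label."""
    hits = sum(model(row) == label for row, label in zip(rows, labels))
    return hits / len(labels)


def permutation_importance(model, rows, labels, column, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one column of the data."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(n_repeats):
        # Shuffle the chosen column across rows, leaving other columns intact.
        shuffled = [row[column] for row in rows]
        rng.shuffle(shuffled)
        permuted = [
            row[:column] + [value] + row[column + 1:]
            for row, value in zip(rows, shuffled)
        ]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / n_repeats


# Hypothetical toy data: [years_of_schooling, unrelated_noise] per person.
data_rng = random.Random(42)
rows = [[data_rng.randint(0, 16), data_rng.random()] for _ in range(200)]
labels = [int(row[0] >= 10) for row in rows]  # made-up labelling rule


def toy_model(row):
    """Stand-in for a fitted classifier; it only looks at schooling."""
    return int(row[0] >= 10)


schooling_importance = permutation_importance(toy_model, rows, labels, column=0)
noise_importance = permutation_importance(toy_model, rows, labels, column=1)
print(schooling_importance, noise_importance)
```

Shuffling a variable the model relies on lowers accuracy, while shuffling a variable the model ignores leaves it unchanged, which is the sense in which a variable such as caste can be said to carry little predictive power.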
Notes
Other measures of sensitivity could be used as well. We restrict ourselves to AAD because our objective is only to triangulate the variable importance inferred from the permutation-based and sensitivity-based analyses.
This can be observed in the Management Information Report series published on the website of the Nepal Telecommunication Authority, the telecom regulator of Nepal.
‘Sudoorpaschim’ literally translates to ‘far-western’. According to the 2011 census data, the far-western region is the most underdeveloped in terms of ICT access, income, education, and other indicators. This fact has been explored in [7]. Recent reports on human development by the Central Bureau of Statistics Nepal, available on its official website, show that the situation is still similar.
Around 67 percent of the population had access to electricity in 2011, which increased to around 90 percent in 2019. This data is provided online by the World Bank.
A good review of the socioeconomic impacts of the caste system can be found in [15].
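The AAD measure mentioned in the first note can be sketched as follows, assuming AAD denotes the average absolute deviation from the median of a model's responses as one input is varied, which is how the measure is described for sensitivity analysis in Cortez and Embrechts. The `toy_probability` model, the baseline row, and the grid of levels are hypothetical and chosen only for illustration.

```python
from statistics import median


def average_absolute_deviation(responses):
    """Mean absolute distance of the responses from their median."""
    center = median(responses)
    return sum(abs(r - center) for r in responses) / len(responses)


def sensitivity_aad(model, baseline_row, column, levels):
    """Vary one input over `levels`, hold the rest at baseline, score the responses."""
    responses = []
    for level in levels:
        probe = list(baseline_row)
        probe[column] = level
        responses.append(model(probe))
    return average_absolute_deviation(responses)


def toy_probability(row):
    """Hypothetical model: output rises with row[0] and ignores row[1]."""
    return min(1.0, 0.1 * row[0])


baseline = [5, 3]
levels = [0, 2, 4, 6, 8]
aad_used = sensitivity_aad(toy_probability, baseline, column=0, levels=levels)
aad_ignored = sensitivity_aad(toy_probability, baseline, column=1, levels=levels)
print(aad_used, aad_ignored)
```

An input whose variation moves the model output receives a larger AAD than an input the model ignores, so ranking inputs by AAD gives an importance ordering that can be compared with the permutation-based one.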
References
DiMaggio, P., Hargittai, E., et al.: From the ‘digital divide’ to ‘digital inequality’: studying internet use as penetration increases. Princeton: Center for Arts and Cultural Policy Studies. Woodrow Wilson Sch. Princeton Univ. 4(1), 4–2 (2001)
Van Dijk, J. A.: A theory of the digital divide. In: The digital divide. Routledge, pp. 49–72 (2013)
Robinson, L., Schulz, J., Blank, G., Ragnedda, M., Ono, H., Hogan, B., Mesch, G. S., Cotten, S. R., Kretchmer, S. B., Hale, T. M., Drabowicz, T., Yan, P., Wellman, B., Harper, M.-G., Quan-Haase, A., Dunn, H. S., Casilli, A. A., Tubaro, P., Carvath, R., Chen, W., Wiest, J. B., Dodel, M., Stern, M. J., Ball, C., Huang, K.-T., Khilnani, A.: Digital inequalities 2.0: legacy inequalities in the information age. First Monday 25(7) (2020)
Robinson, L., Schulz, J., Dunn, H. S., Casilli, A. A., Tubaro, P., Carvath, R., Chen, W., Wiest, J. B., Dodel, M., Stern, M. J., Ball, C., Huang, K.-T., Blank, G., Ragnedda, M., Ono, H., Hogan, B., Mesch, G. S., Cotten, S. R., Kretchmer, S. B., Hale, T. M., Drabowicz, T., Yan, P., Wellman, B., Harper, M.-G., Quan-Haase, A., Khilnani, A.: Digital inequalities 3.0: emergent inequalities in the information age. First Monday 25(7) (2020)
Pandey, S., Raj, Y.: Free float internet policies of Nepal. Studies Nepali Hist. Soc. 21(1), 1–60 (2016)
Regmi, N.: Expectations versus reality: a case of internet in Nepal. Electron. J. Inform. Syst. Dev. Count. 82(1), 1–20 (2017)
Pandey, S., Regmi, N.: Changing connectivities and renewed priorities: status and challenges facing Nepali internet. First Monday (2018)
Pandey, S.B., Regmi, N.: If you build it, will they come? Exploring narratives that shape the internet in Nepal. Sci. Technol. Soc. 25(3), 444–464 (2020)
Chautari, M.: Moving beyond access: the landscape of internet use and digital inequality in Nepal. Martin Chautari Research Brief 23. Martin Chautari, Tech. Rep. (2012)
CBSN and UNICEF: Nepal multiple indicator cluster survey report 2019 survey findings report. Tech. Rep., Central Bureau of Statistics Nepal (2020)
Cortez, P., Embrechts, M.J.: Using sensitivity analysis and visualization techniques to open black box data mining models. Inform. Sci. 225, 1–17 (2013)
Greenwell, B.M.: pdp: an R package for constructing partial dependence plots. R J. 9(1), 421 (2017)
Hastie, T., Tibshirani, R., Friedman, J.: The elements of statistical learning: data mining, inference, and prediction. Springer, Berlin (2009)
Archer, E.: Package ‘rfPermute’ (2020)
Mosse, D.: Caste and development: contemporary perspectives on a structure of discrimination and advantage. World Dev. 110, 422–436 (2018)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Strobl, C., Malley, J., Tutz, G.: An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol. Methods 14(4), 323 (2009)
Strobl, C., Boulesteix, A.-L., Kneib, T., Augustin, T., Zeileis, A.: Conditional variable importance for random forests. BMC Bioinform. 9(1), 1–11 (2008)
CBSN: Population monograph of Nepal, volume II. Central Bureau of Statistics Nepal, Tech. Rep. (2014)
Filmer, D., Pritchett, L.H.: Estimating wealth effects without expenditure data-or tears: an application to educational enrollments in states of India. Demography 38(1), 115–132 (2001)
Liaw, A.: Package ‘randomForest’. University of California, Berkeley (2018)
Cortez, P.: Package ‘rminer’. Teaching Report, 59 (2020)
Guha, A., Mukerji, M., et al.: Determinants of digital divide using demand-supply framework. Aust. J. Inform. Syst. 25 (2021)
Thapa, D., Sæbø, Ø.: Exploring the link between ICT and development in the context of developing countries: a literature review. Electron. J. Inform. Syst. Dev. Count. 64(1), 1–15 (2014)
Van Deursen, A. J., Helsper, E. J., Eynon, R.: Measuring digital skills. In: From digital skills to tangible outcomes project report (2014)
Funding
This work has not been funded by any institution.
Author information
Contributions
All the work has been contributed by the corresponding author.
Ethics declarations
Conflict of interest
There is no conflict of interest to disclose.
Human Participants
This work does not directly involve any human participants. The data used is published by UNICEF and is publicly available.
Informed Consent
Not applicable, as this work uses public data provided by UNICEF.
About this article
Cite this article
Regmi, N. A random forest-based analysis of household survey data to infer insights on digital inequality. Iran J Comput Sci 6, 333–344 (2023). https://doi.org/10.1007/s42044-023-00143-y
DOI: https://doi.org/10.1007/s42044-023-00143-y