Examining user behavior with machine learning for effective mobile peer-to-peer payment adoption

Abstract

Disruptive innovations caused by FinTech (i.e., technology-assisted customized financial services) have brought digital peer-to-peer (P2P) payments to the fore. In this challenging environment and based on theories about customer behavior in response to technological innovations, this paper identifies the drivers of consumer adoption of mobile P2P payments and develops a machine learning model to predict the use of this thriving payment option. To do so, we use a unique data set with information from 701 participants (observations) who completed a questionnaire about the adoption of Bizum, a leading mobile P2P platform worldwide. The respondent profile was the average Spanish citizen within the framework of European culture and lifestyle. We document (in this order of priority) the usefulness of mobile P2P payments, influence of peers and other social groups such as friends, family, and colleagues on individual behavior (that is, subjective norms), perceived trust, and enjoyment of the user experience within the digital context and how those attributes better classify (potential) users of mobile P2P payments. We also find that nonparametric approaches based on machine learning algorithms outperform traditional parametric methods. Finally, our results show that feature selection based on random forest, such as the Boruta procedure, as a preprocessing technique substantially increases prediction performance while reducing noise, redundancy of the resulting model, and computational costs. The main limitation of this research is that it only has a place within the sociocultural and institutional framework of the Spanish population. It is therefore desirable to replicate this study by surveying people from other countries to analyze the effects of the institutional environment on the adoption of mobile P2P payments.

Introduction

The financial services industry has recently been forced to adopt technological changes to innovate its processes and products (Frame et al. 2018; Kou et al. 2021). As a result, a set of technology-assisted customized financial services (FinTech) has arisen in the banking market (Thakor 2020). Prominent among them are nonintermediated peer-to-peer (P2P) transactions based on digital infrastructures, such as lending and payments. Indeed, mobile P2P payments are a business vector with deeper market penetration (Abdullah and Naved Khan 2021) and have experienced an extraordinary boom, particularly since the beginning of the COVID-19 pandemic (Higueras-Castillo et al. 2023). It should be noted that mobile P2P payments constitute a real threat to traditional payment methods and were born from a need to break the domination of cash and credit card payments for common day-to-day purchases (Belanche et al. 2022; Insider Intelligence 2022). Mobile P2P payments have emerged as a singular digital payment system and are simpler, faster, more convenient, usually cost-free, and feature a social component that other (digital and not digital) systems lack (Li et al. 2021; Nasir et al. 2020, 2021).

Given that mobile P2P payments are a disruptive innovation in the financial services sector, previous research has focused on identifying the factors determining their use (Leong et al. 2022). In practice, financial entities drive change by fostering digital payments among customers. Thus, they need to know the attributes that explain customary resistance to change and the barriers to using new technologies and transferring know-how (Irimia-Diéguez et al. 2023). In this vein, Liébana-Cabanillas et al. (2021) showed that the precursors and barriers to using P2P payments differ from those of mobile-based payment methods, calling for further research.

Therefore, the key research question that this paper addresses is which drivers and barriers shape the adoption of mobile P2P payments among banking customers (Shaikh et al. 2023). Accordingly, the main objective of this paper is to analyze the factors that determine customers’ adoption of mobile P2P payments. Our contribution lies in finding the key variables that allow banking customers to be classified as users (or nonusers) of mobile P2P payments. To this end, we compare traditional parametric statistical techniques with a set of nonparametric approaches based on machine learning (ML) methods oriented to classification problems. These learning algorithms are the foundations of data mining and big data, which are current trending topics in the financial innovation field, and are considered a crucial part of a wider research area known as Knowledge Discovery from Data, which focuses on identifying patterns in data sets (Nguyen et al. 2022).

It is worth highlighting, as one of the core strengths of the present study, the use of a unique data set with information from 701 individuals (observations) who were asked about the use of mobile P2P payments; namely, the use of Bizum, one of the leading and pioneering mobile P2P payment applications worldwide, whose success is comparable to Venmo in the USA (Acker and Murthy 2020).

This paper contributes to the FinTech and ML literature in two ways. Practically, our findings have significant implications for banks with a high interest in precisely knowing the factors that impact the intent to use mobile P2P payment services to (i) create more customized products and services to satisfy the needs of their customers to a greater extent and (ii) properly plan their business, human resource, and marketing strategies. One of the key points of this research is the sample, which is built on a survey conducted with users of the mobile P2P payment platform Bizum. We highlight that one of the main variables explaining the adoption of Bizum as a mobile P2P payment is its full connection and integration with traditional financial players. In other words, given that Bizum is a bank-based platform with a largely predefined bank–customer relationship, it has benefited from its deep market penetration into the traditional banking industry to create new business relationships and become a trustworthy and massively used mobile P2P payment platform. Indeed, this, together with the development of technology allowing the widespread use of smartphones, is a primary factor explaining the strong expansion and adoption of Bizum as a mobile P2P payment method.

Theoretically, our framework employs the most relevant models from technology acceptance theories. We use variables from the theory of reasoned action from Fishbein and Ajzen (1977), technology acceptance model (TAM) from Davis et al. (1989), theory of planned behavior from Ajzen (1991), extended TAM, namely TAM 2 from Venkatesh and Davis (2000) and TAM 3 from Venkatesh and Bala (2008), unified theory of acceptance and use of technology (UTAUT) from Venkatesh et al. (2003), UTAUT2 from Venkatesh et al. (2012), and mobile payment technology acceptance model from Liébana-Cabanillas et al. (2014). Empirically, we follow Witten and Frank (2005), who suggest implementing various statistical languages and search procedures that serve some problems well and others badly, an added motivation for more carefully constructing and comparing alternative ML techniques. In addition, the first preselection of independent variables is applied by combining Boruta and Gini index procedures to obtain a more parsimonious model. Thus, the comparison of different ML techniques in the field of user adoption of mobile P2P payments constitutes the second contribution of this study.

The rest of this paper is structured as follows. "Theoretical background" section reviews the literature on payment systems, mobile payment adoption, and ML. "Methodology" section describes the dataset and the ML models used in this research. "Results" section presents the empirical results, "Discussion" section contains the discussion, and the final section sets out the conclusions, implications, and areas for future research.

Theoretical background

Evolution of payment systems

New payment systems have emerged from advancements in information and communication technology for financial transactions between businesses and their customers. Specifically, these systems arise as a means of addressing certain issues associated with handling physical money (Tamayo 1999), the need to reduce the cost of money and existing payment methods, providing flexibility for small purchases and instant payments, enhancing security and protection against fraud and other forms of crime, and the rise of e-commerce on the Internet and online payments.

Consequently, the financial sector is undergoing a profound transformation where traditional payment systems relying on cash are being replaced by electronic payment systems (see Fig. 1). According to a recent study by the European Central Bank (2022), the total number of noncash payment transactions in the euro area, encompassing all types of payment services, increased by 12.5% compared to the previous year, reaching 114.2 billion transactions, with a total amount increase of 18.6% to 197 trillion euros. Card payments accounted for 49% of the total transactions, transfers represented 22%, and direct debits represented 20%.

Fig. 1 Classification of payment systems. Source: Own elaboration based on Huang (2021)

In addition to this trend, the extensive use of technologies such as mobile phones has also brought about significant changes in user payment behaviors (Liébana-Cabanillas et al. 2022a). Current mobile payment solutions are based on the technological development of smartphones, enabling the creation of payment applications that can be used in various ways to conduct payment transactions with a mobile device (Liébana-Cabanillas et al. 2017). Mobile payments can be classified into four types. First, smartphones can be used at the point of sale, where they perform economic transactions for purchasing products or services and can even function as a point-of-sale terminal for customers. Second, mobile phones can serve as a standard payment platform, offering various functionalities such as executing payments and sending money. Third, these phones can be used as a payment channel through the user’s telecommunications operator, with whom they have a contracted phone line.

Finally, closed-loop payments refer to mobile applications specifically developed for a particular store or brand, where the mobile phone functions not only as a payment option within that store but also includes additional payment-related services such as promotional notifications, loyalty programs, and discount coupons.

Previous research on mobile payment adoption

Since the seminal work of Dahlberg et al. (2008) on mobile payment systems, various authors have analyzed the field of mobile payments up to the present day (Liébana-Cabanillas et al. 2022b; Migliore et al. 2022). Dennehy and Sammon (2015) concluded that research on mobile payments is a well-established area that will continue to receive increased attention from various disciplines in the coming years, recognizing the potential and enrichment of mobile payment services as their adoption becomes increasingly imperative. To date, customer adoption continues to be of interest to many researchers, but the focus remains on investigating adoption in specific countries separately, with less attention given to comparing survey results across multiple countries and examining their differences. More recently, authors such as Abdullah and Naved Khan (2021), Tounekti et al. (2022), and Panetta et al. (2023) have proposed bibliometric reviews that highlight the importance of this current and future research topic. Furthermore, recent studies on adoption have specifically examined technology, security, and architecture. Table 1 summarizes recent research that has analyzed the adoption of mobile payment systems.

Table 1 Recent research on mobile payment adoption

Peer-to-peer mobile payment system: Bizum

P2P payments are made through peer-to-peer applications that allow money to be transferred immediately between mobile devices, anywhere. Furthermore, this type of payment, which was previously widespread in the private sphere, is also starting to extend into the commercial realm for making purchases at physical establishments. An increasing number of consumers are using P2P payment apps to pay for their purchases at retail stores. This trend is driven by the growing acceptance of P2P payments by merchants (Visconti-Caparrós et al. 2022).

One pioneering P2P payment system in Europe is Bizum, which is known for its origin and comparative competitiveness. It offers its users three major advantages: (i) immediacy, as transferred funds reach recipients’ bank accounts within seconds; (ii) universality, as customers do not need to switch financial institutions, and the system is connected to all participating banks; and (iii) user-friendliness, as it allows users to make payments between individuals as well as at physical and online stores.

In addition, its operation is straightforward: to send money, the Bizum user selects a contact from their mobile phone lists and sets the desired transfer amount. The sender’s bank then sends a code to their mobile phone, which the user enters into the app, and the recipient immediately receives the money in their linked bank account.

Bizum is supported by all Spanish banks, with an option for each e-banking application, and it is used by more than 21 million active users (nearly 50% of the Spanish population), having a historical track record of 1,362 million transactions and more than EUR 70.5 million transferred since its launch in 2016 (Bizum 2022). Bizum can be considered a transversal payment method because its customer profile includes people of any age, educational level, and socioeconomic class (Belanche et al. 2022).

Considering this review of the adoption of mobile payment systems in general, and P2P systems in particular, as well as in line with our objectives, the current research proposes an improvement in the analysis techniques that may determine the variables that foster the intention to use P2P payment systems through the application of different statistical languages combining Boruta and Gini index procedures to obtain a more parsimonious model.

Machine learning and mobile payments

Comparative analysis of key machine learning techniques

ML is a part of artificial intelligence that, by compiling statistical algorithms and systems, demonstrates intelligence to interpret external data correctly and subsequently make decisions (Davenport et al. 2020). In essence, ML models seek to learn relationships and patterns from a given dataset, and therefore, they can be used to solve both predictive and classification/categorization problems (Bishop 2006).

ML is emerging in parallel with the development of computational science and data-driven business management (Sheth and Kellstadt 2021). This is why, in recent years, numerous ML-based intelligent systems have been massively penetrating our business and personal lives (known as the Internet of Things, IoT) (Kaplan and Haenlein 2019). Indeed, ML shows a much greater performance in high-dimensional data environments, where variable interactions and nonlinear relationships often arise, and automatized recurrent decisions are required (Vanini et al. 2023). Accordingly, ML algorithms have been successfully applied in many fields, such as banking, to decide whether to approve or reject a loan application (Alonso-Robisco and Carbó-Martínez 2022), and in engineering for structural design (Thai 2022).

One of the pioneering ML models that has subsequently achieved remarkable expansion and relevance is the artificial neural network (ANN). ANNs attempt to emulate human brain functioning by creating a set of interconnected nodes (artificial neurons) placed on several layers that operate in a network architecture (Selvamuthu et al. 2019). Among the most used neural networks in business research is the multilayer perceptron (MLP) (Vellido et al. 1999), whose main theoretical advantage is that it satisfies the universal approximation property (Bishop 2006). Nevertheless, in recent years, complex ANN architectures have emerged, namely deep ANNs (e.g., convolutional ANNs), which already exceed human performance in some environments (Madani et al. 2018).

Despite their advantages, the main limitation of ANNs is their black-box nature, which hinders the interpretation of results and of the importance, effects, and relationships between the variables. In addition, ANNs have a high computational cost to tune the training parameters, which lengthens the time required to design the topology of the optimal network.

At the beginning of the current century, ensemble methods emerged. Their main premise is that the nature of a phenomenon is captured to a greater extent by combining several alternative models that are subsequently synthesized into a single model. That is, ensemble algorithms benefit from the strengths of different models without conducting a biased model preselection. Within the ensemble-based approach, two primary methods emerge: bagging and boosting models.

First, the bagging algorithm proposed by Breiman (1996) fits the same underlying algorithm to each bootstrap sample of the training set and creates a final prediction by aggregating the bootstrap predictions. Given a classification model, bagging draws B independent samples with replacement from the available training set (bootstrap samples), fits a model to each bootstrap sample, and finally aggregates the B models by majority voting. Since the final prediction is always an aggregate of several bootstrap fits, bagging substantially decreases the model variance, leading to a model with higher generalization ability that is less prone to overfitting (Schapire et al. 1998).
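The bagging steps described above (draw B bootstrap samples, fit one model per sample, aggregate by majority voting) can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the base learner (a one-variable threshold rule, or decision stump), the toy data, and all function names are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Fit a decision stump (hypothetical base learner) by exhaustive search.

    rows is a list of (features, label) pairs with labels in {0, 1}.
    """
    best = None
    n_features = len(rows[0][0])
    for j in range(n_features):
        for x, _ in rows:
            thr = x[j]
            # Try both orientations of the threshold rule.
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if xi[j] <= thr else right) != yi for xi, yi in rows
                )
                if best is None or errors < best[0]:
                    best = (errors, j, thr, left, right)
    _, j, thr, left, right = best
    return lambda x: left if x[j] <= thr else right


def bagging_fit(rows, n_models=25, seed=0):
    """Fit n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: n draws with replacement from the training set.
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return models


def bagging_predict(models, x):
    """Aggregate the individual predictions by majority voting."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data (assumed): class 1 roughly corresponds to a large first feature.
data = [((0.1, 0.9), 0), ((0.2, 0.8), 0), ((0.3, 0.7), 0), ((0.4, 0.2), 0),
        ((0.6, 0.3), 1), ((0.7, 0.9), 1), ((0.8, 0.1), 1), ((0.9, 0.5), 1)]
models = bagging_fit(data)
print(bagging_predict(models, (0.05, 0.5)), bagging_predict(models, (0.95, 0.5)))
```

Each stump sees a slightly different sample, so individual fits differ; the vote smooths out those differences, which is the variance-reduction effect referred to in the text.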

This advantage allows bagging to be successfully applied to generate other ML models. In this vein, when bagging is applied to a tree-based method, this results in a model called random forests (RF), which is one of the most relevant ML techniques (Breiman 2001). RF is an ML method based on the building and combination of a large set of trees. The main strength of RF is that in each split, a random subset of predictors is considered, increasing the probability of weak predictors being selected and thereby reducing bias in the model. Otherwise, stronger predictors would be used by many trees as a first split. To do so, it randomly selects the variables to split the dataset and create each node while each tree grows from a bootstrap sample of the training dataset.

The growing interest in the use of RFs is also due to their capacity to rank predictor variables according to their importance in explaining the studied phenomenon (Friedman et al. 2000). That is, unlike most ML methods that have a black-box nature, RF shows how each variable influences the understanding of the analyzed event. Indeed, this is the procedure employed in this study to select the most relevant variables (see "Feature selection results" section). Moreover, other positive aspects of this method are that it does not generally overfit and that Bayes consistency is obtained with a simple version of RF (Breiman 2001).

Note that RF can be considered an improved version of the classification and regression trees (CART) approach. Because RF randomly selects the variables used to split the dataset at each node and grows each tree from a bootstrap sample of the training dataset, it avoids the main disadvantage of CART, namely that a change in a higher-level node can, by a domino effect, lead to a completely different tree. Moreover, the performance of CART is strongly dependent on the stopping criteria implemented because this model is developed using binary recursive partitioning, which is an iterative procedure of splitting the dataset until reaching the final nodes. Of course, CART also has advantages. Indeed, Breiman (2001) considers CART to be the model that is easiest to understand and interpret. Further, CART also accommodates nonlinear relationships between variables and higher-order interactions (Boulesteix et al. 2015).

Second, unlike bagging, boosting trains models sequentially by analyzing the prediction errors, which results in a powerful improvement of the classifiers (Freund 1995). AdaBoost is the most relevant model within this approach. AdaBoost assigns increasing weights to observations that were incorrectly classified in the previous iteration of the classifier. Consequently, the subsequent iterations focus on correctly classifying these observations, which ultimately reduces the prediction errors. In this paper, we implement AdaBoost as well as Binomial Boosting and L2 Boosting. Other boosting algorithms related to additive basis expansion were developed by Friedman et al. (2000).
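The reweighting idea can be made concrete with one round of the classic discrete AdaBoost update. The sketch below is an illustration under stated assumptions, not the configuration used in this study: labels are coded as -1 and +1 (the paper codes classes as 0 and 1), and the function name, toy labels, and toy predictions are hypothetical.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One discrete AdaBoost round for labels in {-1, +1}.

    Returns the weak learner's vote weight (alpha) and the updated,
    renormalized observation weights.
    """
    # Weighted error of the current weak learner.
    err = sum(w for w, yt, yp in zip(weights, y_true, y_pred) if yt != yp)
    err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified observations (yt * yp = -1) are multiplied by exp(alpha);
    # correctly classified ones (yt * yp = +1) by exp(-alpha).
    unnorm = [
        w * math.exp(-alpha * yt * yp)
        for w, yt, yp in zip(weights, y_true, y_pred)
    ]
    total = sum(unnorm)
    return alpha, [w / total for w in unnorm]


y_true = [1, 1, -1, -1, 1]
y_pred = [1, -1, -1, -1, 1]  # the weak learner misclassifies index 1 only
weights = [0.2] * 5          # uniform starting weights
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(alpha, new_weights)
```

In this toy round the single misclassified observation ends up carrying half of the total weight, so the next weak learner is pushed to classify it correctly.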

Finally, the support vector machine (SVM) is a powerful technique that builds a hyperplane to separate the observations of different classes. It is mainly used for binary classification problems, although it can also be applied to multiclass classification. To do so, the SVM uses support vectors, which are the observations lying closest to the hyperplane. Although SVM usually generates low misclassification errors and can function well in environments with high-dimensional data, it has a high operational cost in terms of time consumption. Moreover, SVM sometimes works with a nonoptimal function, which undermines its performance.
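As a compact illustration of the separating hyperplane, the standard hard-margin formulation can be written as follows. The notation is assumed here for illustration and is not taken from the paper: $w$ is the weight vector, $b$ the intercept, $x_i$ the predictors of observation $i$, and $y_i \in \{-1, +1\}$ the class label (the paper codes classes as 0 and 1). The formulation also assumes linearly separable data.

$$\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^{2} \quad \text{subject to} \quad y_i\left(w^{\top} x_i + b\right) \ge 1, \qquad i = 1, \dots, n$$

$$\hat{y}(x) = \operatorname{sign}\left(w^{\top} x + b\right)$$

The support vectors are the observations for which the constraint holds with equality, that is, those lying on the margin closest to the hyperplane $w^{\top} x + b = 0$.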

User behavior prediction

Behavior analysis was introduced by Skinner (1953) and focused on analyzing human behavior from a psychological perspective. However, technological advancements have allowed massive data processing and the powerful development of data mining and ML algorithms, which have been increasingly applied to explore human behavior, shifting behavior analysis toward the computational science area. Indeed, behavior analysis is currently called behavioral analytics (Cao et al. 2015), whose aim is to model human behavior by understanding the past to predict the future, and thus to create business strategies using statistical and ML approaches (Martín et al. 2021).

From the beginning, these analyses have essentially addressed how individuals interact and the role that they play when acting as a group (collaboration-competition) as well as individually (routines–attitudes–intentions). However, the study of human behavior is not altruistic. Rather, there is a strong economic interest that companies try to exploit to increase their market share, their brand and product-service positioning, and, ultimately, their profits. For this reason, this discipline is currently closely connected to economics and organizational management and is encompassed in the field called user behavior analysis, which brings together human behavior, ML techniques, and business decision-making (LeCun et al. 2015; Cui et al. 2016).

In practice, ML has been successfully employed in different domains related to disruptive innovations and marketing, such as the recommendation of products to potential customers (Hagenauer and Helbich 2017) or the estimation of consumer preferences for technology products (Guo et al. 2021).

Particularly relevant is the use of ML in P2P finance (also known as Internet or Digital Finance), which mainly operates through the Internet; therefore, a large amount of data must be processed before decision-making (Wu et al. 2018). As suggested by Gomber et al. (2017), digital finance is a new form of finance based on third-party payment, cloud computing, big data, social networks, and e-commerce platforms to obtain financing and credit as well as to make payments and other financial transactions. In this challenging environment, ML can collect new data, update the model, and provide an output, thus adapting to rapidly evolving environments, such as economic patterns and shocks.

Indeed, ML is being effectively used to explore the factors that influence users’ digital finance behavior (Xiong et al. 2022). Authentication technology, the nonrepudiation of transactions, privacy protection, data integrity, and user trust have a significant impact on users’ Internet finance behavior.

Focusing on e-payment users, Bajari et al. (2015) suggested that ML techniques outperform discrete choice models, which have been the reference statistical methods used to analyze consumers’ preferences and adoption of means of payment and other digital financial services (Hernández-Murillo et al. 2010). As pointed out by Cui et al. (2016), ML is a powerful methodological approach that promises to generate new insights into payment behavior. In this sense, Lee et al. (2020) used a two-stage analysis, employing Partial Least Squares and subsequently an artificial neural network, to explore the antecedents that affect users’ behavioral intention to use wearable payments. Also, Aslam et al. (2022), using SVM, studied the users’ behavioral factors that explain the adoption of mobile payments. They found perceived value to be the most important predictor of usage behavior. Users’ behavior with mobile payments has even been employed as a driver to forecast, through ML, stores’ total customer flows (Ma and Fildes 2020).

To the best of the authors’ knowledge, only the few research articles above analyze users’ behavior regarding digital payments; therefore, more empirical evidence is needed. This is not surprising given that mobile payment applications are not yet widely used by the population and, more importantly, very few leading e-payment applications operate massively in a country (as Bizum does). Therefore, it is rarely possible to question users about the behavioral factors that lead them to adopt these mobile payments. This reinforces the relevance of the findings of the present study.

Methodology

In this study, we use a primary source of data obtained from a survey of 701 Spanish smartphone users who are considered potential users of mobile P2P payment systems. All the users who participated in the survey had experience using their cell phones for commercial activities, either for shopping or payments. The profile of the respondents was that of an average Spanish citizen within the framework of European culture and lifestyle. To collect the data, nonprobability snowball sampling was employed through a mailing list and social networks. Although simple random sampling is generally preferable, many empirical studies published in high-impact journals have used a snowball method when collecting data (Belanche et al. 2022; Huang et al. 2019).

The questionnaire included items to measure the variables defined in Table 2. The items were selected through a review of the relevant literature, adapting the original scales to the nature of the research. The participants expressed their attitudes on a seven-point Likert scale (1: strongly disagree; 7: strongly agree). The questionnaire was developed using a multi-item approach, where three or more items measured each latent variable. This is a common procedure in the field of marketing research. Appendix 1 provides the questionnaire used in the study for reference.

Table 2 Variables and theoretical background

The dependent variable is a dummy variable that takes the value one (1) if the respondent uses the mobile payment system and zero (0) otherwise, according to the following:

$$Y_{it} = \begin{cases} 1 & \text{uses mobile payment system} \\ 0 & \text{does not use mobile payment system} \end{cases}$$

For this research, we classify the independent variables into two categories. We established a group of behavioral variables related to the main theories concerning the adoption of technologies (perceived ease of use, perceived risk, trust, personal innovativeness, subjective norms, perceived enjoyment, loyalty to the banking brand, and perceived quality) and a second group of variables linked to the demographic classification of potential users of the payment system (gender and age).

Regarding the first group of variables, the classic scientific literature has developed multiple theories that have analyzed the behavior of individuals in response to innovation. In recent years, some authors have applied these theories to the field of mobile and P2P payments (Upadhyay et al. 2022; Belanche et al. 2022). Table 2 describes the variables used and the sources employed for their definition.

The second block of variables refers to the gender and age of potential users of the proposed payment system. For these variables, our study uses the same categories used by the Spanish National Employment Institute in its statistical reports to classify a population.

Results

Feature selection results

We performed two preprocessing procedures because, as supported by Chen et al. (2020), their use substantially improves the prediction result. First, all predictor variables were rescaled into the [0,1] interval to align the scale of the predictors with that of the dependent (dummy) variable. Second, given that we have high-dimensional data in terms of the number of features (forty-two independent variables, see Table 3), it is necessary to apply a procedure to reduce the complexity of the model by capturing only the most relevant inputs. The inclusion of many predictors in a model to solve a classification problem has severe theoretical disadvantages such as: (i) overfitting, (ii) correlation problems, (iii) difficulty in interpreting results, and (iv) a slower training process. The idea is to reduce the noise and redundancy in the final model. Indeed, the principle of parsimony states that, among models with comparable performance, the best statistical model is the one with fewer parameters (variables) and less dimensionality (Arora and Kaur 2020; Speiser et al. 2019).
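The rescaling step can be sketched with min-max scaling, which is one common way to map a variable into the [0,1] interval. The paper does not state the exact formula, so the choice of min-max scaling, the function name, and the example Likert responses below are assumptions.

```python
def min_max_scale(column):
    """Rescale a list of numbers into [0, 1] via (v - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical responses to one seven-point Likert item.
likert_item = [1, 4, 7, 3, 7, 2]
scaled = min_max_scale(likert_item)
print(scaled)
```

After this step, a response of 1 maps to 0, a response of 7 maps to 1, and the midpoint 4 maps to 0.5, so every predictor shares the range of the 0/1 dependent variable.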

Table 3 Feature selection (FS) under random forest approach

Consequently, we performed a procedure to select the most relevant predictors. This minimizes the complexity of our model and accelerates its training, and it also improves the robustness of performance measurements, in terms of higher accuracy or lower errors, by boosting the generalization capacity of the classifier. Dewi (2019) indicated that feature selection (FS) reduces the original features of a dataset to a smaller set while preserving the relevant information and rejecting redundant information. As Chen et al. (2020) argue, FS crucially impacts the performance of the classification model. Indeed, FS has been considered more important than designing the prediction model itself.

Following Chen et al. (2020), we implement the random forest (RF) algorithm as a method to select the most relevant features from the data. Unlike parametric techniques grounded in subset selection, such as logistic regression (LR) with forward or backward procedures, RF is a nonparametric method based on supervised ML that can be combined with two procedures to select the most important variables: (i) the varImp() function in R, where the mean decrease of the Gini index is calculated, and (ii) Boruta (Fahimifar et al. 2022).

The varImp() function in R is applied after running the RF model. This is a postestimation procedure applied to each tree obtained: the prediction accuracy is calculated, each predictor variable is then permuted, and the accuracy is recalculated. Afterward, the difference between the two accuracies is averaged over all the trees and normalized by the standard error. The function provides two measures of importance for each predictor, disaggregating the results by outcome class (1, when Bizum is adopted, and 0 otherwise). The first of these metrics indicates the decrease, on average, in accuracy when the values of a variable are permuted. The second measure provides the reduction of the Gini impurity when a variable is chosen to split a node. It should be noted that the sample used to calculate the accuracy-based importance of each variable is the out-of-bag data, that is, the observations that were not used during tree construction. The recommendation is to analyze both measures together because this enables a comparison of the importance ranking of each one. However, their main disadvantage is that they may overstate the importance of correlated variables.
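The second measure, the reduction in Gini impurity produced by a split, can be worked through by hand. The following Python sketch computes it for a single binary split; it illustrates the quantity that an RF accumulates per variable, not the R routine used in this study, and the function names and the toy node contents are assumptions.

```python
def gini(labels):
    """Gini impurity of a node holding binary labels in {0, 1}."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


# Hypothetical node: four adopters (1) and four non-adopters (0).
parent = [1, 1, 1, 1, 0, 0, 0, 0]

# A moderately informative split on some variable.
decrease_weak = gini_decrease(parent, [1, 1, 1, 0], [1, 0, 0, 0])

# A perfectly separating split on another variable.
decrease_strong = gini_decrease(parent, [1, 1, 1, 1], [0, 0, 0, 0])

print(decrease_weak, decrease_strong)
```

Averaging such decreases over every node where a variable is used, and over all trees, gives the mean decrease in Gini: variables that tend to produce purer child nodes rank higher.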

To benchmark the FS results, we also implemented the Boruta algorithm, which ranks the predictor variables based on their significance (default values: pValue = 0.01 and maxRuns = 100). One of the most important advantages of Boruta is that it classifies the variables into three groups: (i) confirmed, for the significant variables (the most relevant); (ii) tentative, for variables that may be selected but have less importance; and (iii) rejected, for variables that the method considers should not be included.
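The three-group classification can be illustrated with a simplified decision rule. Boruta compares each variable with randomized "shadow" copies of the predictors and counts how often the variable beats the best shadow; the sketch below assumes those hit counts are already available and applies a plain binomial test at the 0.01 level. It is a hedged approximation: the actual package also corrects for multiple testing and removes rejected variables between runs, and the function names are hypothetical.

```python
from math import comb

def binom_prob(lo, hi, runs, p=0.5):
    """P(lo <= X <= hi) for X ~ Binomial(runs, p)."""
    return sum(comb(runs, k) * p**k * (1 - p)**(runs - k) for k in range(lo, hi + 1))

def boruta_style_decision(hits, runs, alpha=0.01):
    """Classify a variable from how often it beat the best shadow variable."""
    p_high = binom_prob(hits, runs, runs)  # chance of at least this many hits
    p_low = binom_prob(0, hits, runs)      # chance of at most this many hits
    if p_high < alpha:
        return "confirmed"
    if p_low < alpha:
        return "rejected"
    return "tentative"

for hits in (90, 50, 10):
    print(hits, boruta_style_decision(hits, runs=100))
```

A variable that beats the shadows in 90 of 100 runs is confirmed, one that does so in only 10 runs is rejected, and one near 50 remains tentative because the test cannot separate it from chance.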

The results of the FS analysis are reported in Table 3 (and graphically in Fig. 2). As shown there, the two FS procedures employed (Boruta and the Gini index) agree on most of the rankings, especially for the first variables, i.e., the variables with the highest importance.

Fig. 2 Importance measure for each variable using Boruta

Unlike the Gini index, the Boruta procedure has the advantage of indicating which variables should be included in the model. However, as can be observed in Table 3, the Boruta procedure deems that the entire list of variables should enter the model because all of them have a high importance level. It is therefore of little practical use for reducing dimensionality and computational cost. Consequently, to sharpen the selection, we retain only those variables among the first ten that are ranked there under both criteria (Boruta and the Gini index).

Eight of the first ten variables are the most relevant under both FS criteria (see Table 3); thus, these variables are included in our classification model. It should be noted that this procedure dramatically reduces the number of variables introduced into our model: only eight of the forty-two original variables (about 19%) are retained. This selection of the data’s critical features reduces the noise and redundancy of the final model and improves its interpretation while decreasing the computational costs.
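The matching rule amounts to intersecting the two top-ten lists. The following Python sketch shows the operation on placeholder variable names; the rankings are hypothetical and do not reproduce Table 3, and select_common_top_k is an illustrative helper rather than part of any package.

```python
def select_common_top_k(ranking_a, ranking_b, k=10):
    """Keep variables that appear in the top k of both rankings, in ranking_a order."""
    top_b = set(ranking_b[:k])
    return [name for name in ranking_a[:k] if name in top_b]

# Hypothetical rankings over placeholder names (not the values in Table 3).
boruta_rank = [f"V{i}" for i in range(1, 13)]
gini_rank = ["V2", "V1", "V3", "V4", "V11", "V5", "V6", "V12", "V7", "V8", "V9", "V10"]

selected = select_common_top_k(boruta_rank, gini_rank)
print(selected)
```

In this made-up example, two of the first ten variables under each criterion are missing from the other top ten, so eight variables survive the intersection.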

Despite the advantages of the Boruta and Gini index procedures shown above, their main disadvantage is that they do not consider the potential multicollinearity problems that may arise between the resulting explanatory variables. Indeed, multicollinearity remains understudied in the context of AI and ML algorithms, although it is one of the most important aspects to consider in an econometric model (Chan et al. 2022). However, contrary to what is often claimed, correlation does not necessarily imply multicollinearity; thus, multicollinearity should be analyzed not with the correlation matrix but with the Variance Inflation Factor (VIF) (Chan et al. 2022). The variable PU2 has the maximum VIF value (6.548), which suggests the absence of serious multicollinearity problems. Note that, although there is no strict VIF threshold to confirm the presence of multicollinearity, there is wide consensus in previous research that a VIF of 10 or higher often indicates multicollinearity (Weisberg 2005). Additionally, as a robustness check, we also implement forward stepwise logistic regression as an alternative parametric approach to select the most relevant variables. Here we obtain only four variables (PU2, SN3, TR5, and PENJ3), of which three match those obtained with the Boruta and Gini index procedures. Our main result, namely that the nonparametric techniques based on ML outperform the classical LDA and LR, remains unaltered whether the variables are selected by Boruta, the Gini index, or forward stepwise logistic regression.
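For reference, the VIF of a predictor j is conventionally obtained from the coefficient of determination of an auxiliary regression of that predictor on all the remaining predictors. The text does not spell out this definition, so the following worked relation assumes the standard one:

$$VIF_{j} = \frac{1}{1 - R_{j}^{2}}\quad \Leftrightarrow \quad R_{j}^{2} = 1 - \frac{1}{VIF_{j}}$$

$$VIF = 6.548 \Rightarrow R^{2} = 1 - \frac{1}{6.548} \approx 0.847;\quad VIF = 10 \Rightarrow R^{2} = 0.90$$

Under this reading, about 85% of the variance of PU2 is explained by the other selected predictors, which lies below the 90% implied by the conventional threshold of 10.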

From a theoretical point of view, FS analysis suggests that the variables corresponding to usefulness, subjective usage norms, trust, and perceived enjoyment have a strong influence on the intention to use mobile payment systems and media.

Specifically, our results suggest that the usefulness of mobile payment media (PU1, PU2, PU3, and PU4) is a strong explanatory factor in their usage intention, which is an advance over the previous literature (Bhattacherjee and Premkumar 2004), where the concept of perceived usefulness is posited to understand changes in beliefs and attitudes toward information technology use.

Second, moving on to personal innovation in the information technology domain, two subjective customer profile variables (SN3 and SN4) show high explanatory and predictive power for the intention to use mobile payment methods (Agarwal and Prasad 1998a; Taylor and Todd 1995).

Turning to variables related to perceived trust in mobile payment systems, in line with Ba and Pavlou (2002), our results identify a strong link between bank customers’ perceived trustworthiness in the mobile payment medium (TR2) and their direct intention to use.

Furthermore, our findings represent an advance over the previous literature regarding the variable related to the perceived enjoyment of using online payment systems (Agarwal and Karahanna 2000; Rouibah et al. 2016), as our results identify a significant relationship between the perceived enjoyment of using a mobile payment means and the intention to continue using this technology (PENJ3).

To better illustrate the discriminatory power of the RF model after applying the FS procedure, we present the area under the ROC curve (AUC) in Fig. 3. The ROC curve is obtained by plotting the true positive rate against the false positive rate at various threshold settings, and the AUC is the area beneath it. The curve thus depicts the tradeoff between sensitivity and specificity, given that an increase in sensitivity will typically cause a reduction in specificity. The model has greater classification power the closer the curve is to the upper left corner. Similarly, Fig. 4 shows the out-of-bag (OOB) error, which can be defined as the average error obtained when each observation is predicted only by the trees whose bootstrap sample does not contain it. The OOB error is used to assess the classification power of the RF model while it is being trained. As depicted in Fig. 4, the OOB error decreases drastically (i.e., the fit of the model improves) over the first 150 trees and oscillates within a narrow band thereafter.

Fig. 3 Area under ROC curve for random forest (AUC)

Fig. 4 The out-of-bag (OOB) error for the final random forest model
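The construction of a ROC curve by sweeping the classification threshold, and the area beneath it, can be sketched as follows. The scores and labels are hypothetical and serve only to make the mechanics visible; the helper names roc_points and trapezoid_auc are illustrative, and the area is approximated with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by sweeping the threshold over every distinct score."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Everything scored at or above the threshold is predicted positive.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def trapezoid_auc(points):
    """Area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # hypothetical predicted probabilities
labels = [1, 1, 0, 1, 0, 0]              # 1 = adopter, 0 = non-adopter
curve = roc_points(scores, labels)
auc = trapezoid_auc(curve)
print(curve, auc)
```

Lowering the threshold moves the curve from (0, 0) toward (1, 1): sensitivity rises while specificity falls, which is the tradeoff the figure depicts.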

Validation measures

The performance of each model is evaluated using different accuracy measurements computed on the out-of-sample results of each method. In binary classification problems, two relevant metrics arise: sensitivity and specificity. Sensitivity measures the probability that the model classifies a real Bizum user as a user; in other words, it measures the model’s ability to detect Bizum usage when it is present. Conversely, specificity measures the probability that the model classifies a real Bizum nonuser as a nonuser; that is, it measures the ability of the model to rule out Bizum usage when it is absent. Sensitivity and specificity are defined as follows:

$$Sensitivity = \frac{TP}{{TP + FN}};\quad Specificity = \frac{TN}{{TN + FP}}$$

where

TP = True Positive, the number of positive cases (adopt mobile P2P payment) that are correctly identified as positive,

TN = True Negative, the number of negative cases (do not adopt mobile P2P payment) that are correctly identified as negative,

FN = False Negative, the number of positive cases (adopt mobile P2P payment) that are misclassified as negative (do not adopt mobile P2P payment),

FP = False Positive, the number of negative cases (do not adopt mobile P2P payment) that are misclassified as positive (adopt mobile P2P payment).
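The four counts and the two ratios can be computed directly from the observed and predicted classes. The Python sketch below is a minimal illustration on made-up labels; which class is treated as positive is passed explicitly through the positive argument, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for the class designated as positive."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def sensitivity_specificity(tp, tn, fp, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Made-up observed and predicted classes for ten respondents.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sens, spec = sensitivity_specificity(tp, tn, fp, fn)
print(tp, tn, fp, fn, sens, spec)
```

Here three of the four positive cases are detected (sensitivity of 0.75) and four of the six negative cases are correctly excluded (specificity of about 0.67).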

Following Petropoulos et al. (2020), we built several performance measurements based on sensitivity and specificity to overcome the limitations of traditional accuracy metrics based only on the overall predictive ability. In this vein, we calculate the following measures:

  • G-mean: The geometric mean (G-mean) is the square root of the product of sensitivity and specificity. This metric illustrates the balance between the classification performances of the majority and minority classes.

    $$G = \sqrt {Sensitivity \cdot Specificity}$$

A poor performance in predicting positive cases will lead to a low G-mean value, even if the negative cases are correctly classified by the algorithm.

  • LR: The negative likelihood ratio is the ratio between the probability of predicting a case as negative when it is positive and the probability of predicting a case as negative when it is actually negative.

    $$LR = \frac{1 - Sensitivity}{{Specificity}}$$

A lower negative likelihood ratio signifies better performance in negative cases. This is a key point of interest in this study, as we model the adoption of mobile P2P payments.

  • DP: Discriminant power is a measurement that summarizes sensitivity and specificity.

    $$DP = \frac{\sqrt 3 }{\pi }\left[ {log\left( {\frac{Sensitivity}{{1 - Sensitivity}}} \right) + log\left( {\frac{Specificity}{{1 - Specificity}}} \right)} \right]$$

The algorithm is considered to distinguish well between positive and negative cases for DP values greater than 3.

  • BA: Balanced accuracy is the average of sensitivity and specificity. If the classifier performs equally well on either class, this term reduces to the conventional accuracy measure.

    $$BA = \frac{1}{2}\left( {Sensitivity + Specificity} \right)$$

In contrast, if the conventional accuracy is high simply because the classifier takes advantage of a good prediction on the majority class, the balanced accuracy will decrease, thus signaling any performance issues. That is, BA does not disregard the accuracy of the model in the minority class (i.e., adopt Bizum in our case).

  • Youden’s γ: Youden’s index is a linear transformation of the mean sensitivity and specificity; consequently, it is difficult to interpret.

    $$\gamma = Sensitivity - \left( {1 - Specificity} \right)$$

As a general rule, a higher value of Youden’s γ signifies a better ability of the algorithm to avoid misclassification of the population.

  • WBA1: A weighted balanced accuracy measure that weighs specificity more than sensitivity (75%/25%).

  • WBA2: A weighted balanced accuracy measure that weighs sensitivity more than specificity (75%/25%).
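The measures listed above all derive from sensitivity and specificity, so they can be gathered in a single function. The following Python sketch rests on two assumptions that the text does not state explicitly: the logarithm in DP is taken as the natural logarithm, and WBA1 and WBA2 are read as weighted averages with weights 0.75 and 0.25. The function name derived_metrics is hypothetical.

```python
from math import log, pi, sqrt

def derived_metrics(sensitivity, specificity):
    """Metrics built from sensitivity and specificity (both strictly between 0 and 1)."""
    odds_sens = sensitivity / (1 - sensitivity)
    odds_spec = specificity / (1 - specificity)
    return {
        "G": sqrt(sensitivity * specificity),
        "LR": (1 - sensitivity) / specificity,
        # Assumption: natural logarithm; DP is undefined if either rate is 0 or 1.
        "DP": sqrt(3) / pi * (log(odds_sens) + log(odds_spec)),
        "BA": (sensitivity + specificity) / 2,
        "Youden": sensitivity - (1 - specificity),
        # Assumption: weighted averages with the 75%/25% weights named in the text.
        "WBA1": 0.25 * sensitivity + 0.75 * specificity,
        "WBA2": 0.75 * sensitivity + 0.25 * specificity,
    }

m = derived_metrics(0.8, 0.9)
for name, value in m.items():
    print(name, round(value, 4))
```

With a sensitivity of 0.8 and a specificity of 0.9, for instance, BA equals 0.85 and Youden’s γ equals 0.7, which illustrates that γ = 2·BA − 1.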

Additionally, we calculate the AUC, which can be defined as the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance. For a classifier that performs at least as well as random guessing, the AUC varies between 0.50 and 1, and researchers generally accept that a value above 0.80 denotes high performance.
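This probabilistic definition can be checked by brute force over all positive-negative pairs. The Python sketch below uses hypothetical scores and counts a tie as half a correct ranking, which is a common convention but an assumption here; pairwise_auc is an illustrative name.

```python
def pairwise_auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.7, 0.7, 0.3]  # hypothetical predicted probabilities
labels = [1, 1, 0, 0]          # 1 = positive class, 0 = negative class
auc_value = pairwise_auc(scores, labels)
print(auc_value)
```

Of the four pairs, three are ranked correctly and one is tied, giving 3.5 / 4 = 0.875.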

Finally, to facilitate the interpretation of the results, we build a metric, the Global Performance Index (GPI), which summarizes the results of the previous performance measurements. We define GPI as the arithmetic average of all previous metrics, except for Type I and II errors, because they are complementary ratios to specificity and sensitivity. Moreover, given that a model performs better with lower values of LR, this metric is subtracted in the following expression:

$$GPI = \frac{AUC + Accuracy\;ratio + Sensitivity + Specificity + Gmean - LR + DP + BA + Youden^{\prime}s + WBA1 + WBA2}{{11}}$$
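The expression translates directly into code. The Python sketch below assumes the eleven metrics are supplied in a dictionary whose keys are chosen here for illustration; the placeholder values of 0.5 are not results from the study.

```python
ADDED = ("AUC", "Accuracy", "Sensitivity", "Specificity", "G",
         "DP", "BA", "Youden", "WBA1", "WBA2")

def global_performance_index(metrics):
    """Arithmetic average of the eleven metrics, with LR entering negatively."""
    total = sum(metrics[name] for name in ADDED) - metrics["LR"]
    return total / 11

# Placeholder values only, to show the arithmetic.
placeholder = dict.fromkeys(ADDED + ("LR",), 0.5)
gpi = global_performance_index(placeholder)
print(round(gpi, 4))
```

With every metric set to 0.5, the ten added terms sum to 5.0, LR removes 0.5, and the index equals 4.5 / 11.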

Results

The final sample, after eliminating questionnaires that were completed too quickly or exceeded the recommended time, amounted to 701 participants, of whom 46.22% were male and 53.78% were female. By age, 42.37% were between 18 and 24 years old, 51.21% were between 25 and 44 years old, and 6.28% were over 44 years old. Regarding education, 4.28% had doctoral studies, 49.93% had university studies, 26.68% had secondary school studies, 15.83% had primary school studies, and the remaining 3.28% had no studies at all. Only 13 invalid questionnaires were rejected; thus, the valid response rate was 98%.

Table 4 summarizes the results in terms of performance metrics in the test set. It shows that no single model obtains the best performance on all metrics. However, our results demonstrate that nonparametric techniques based on ML often outperform classical LDA and LR. In particular, we find that binomial boosting, MLP 4, and L2 boosting are the models that obtain the best performance in terms of GPI. Specifically, binomial boosting obtains the best GPI score with a value of 0.6859, followed by MLP 4 and L2 boosting, which reach GPIs of 0.6613 and 0.6609, respectively. In contrast, the two models based on classification trees, CART and CTBag, obtained the worst performance in terms of GPI.

Table 4 Performance results in out-of-sample

Since the AUC is based on conceptual and methodological foundations different from the rest of the metrics, which, as previously argued, are based on specificity and sensitivity (complementary measurements of type I error and type II error, respectively), we analyze this metric in more detail. In this sense, our findings show that the methods with the highest AUC values are the neural networks (MLP 1 and MLP 2), followed closely by SVM and L2 boosting. As with the GPI, CART and CTBag are the two worst-performing methods in terms of AUC.

When comparing the models built using all the variables with the models that apply FS to reduce the dimensionality of the data, our results suggest that performance increases when FS is used. More importantly, we find that the performance gain from implementing FS holds for all the methods and all the performance metrics.

Discussion

Theoretical implications

Our empirical research has two relevant results. First, related to the classification accuracy of methods, our findings suggest that using FS analysis as a preprocessing technique substantially improves prediction performance while reducing the noise and redundancy of the resulting model and the computational costs of its implementation due to lower data dimensionality. All of this improves the theoretical interpretation of the final model and allows analysis of how each independent variable contributes to explaining and predicting the use of mobile P2P payments. We also find that no single method outperforms the others on all metrics, although, in general, nonparametric techniques based on ML outperform classical LDA and LR. In particular, the results show that binomial boosting, MLP 4, and L2 boosting are the models that obtain the best performance according to the Global Performance Index (GPI).

Second, from a theoretical point of view, we document that (in this order of priority) the usefulness of mobile P2P payments, the influence of peers and other social groups such as friends, family, and colleagues on an individual’s behavior (i.e., subjective norms), and the perceived trust and enjoyment of the user experience in the digital context are the attributes that classify the (potential) users of mobile P2P payments with greater ability.

The major importance of usefulness in the intention to use this P2P payment service may be mainly based on the number of current users (approximately half of the population of Spain). This network effect is crucial to the success of the service because the application must be used by both the sender and receiver. In addition, adequate resources or support are essential for users to perceive the usefulness of the service and may even directly influence the intention of use. Subjective norms, the next most significant factor in the intention to use the service, show that the information that users share about their experience when using the P2P payment service influences the intention of other users due to the social requirements of these services. This fact is highly relevant for companies that provide these payment services since their plans of action should focus on developing word-of-mouth strategies and encouraging current clients to directly recommend the service. Our results also show that perceived trust and enjoyment significantly affect the intention to use P2P payment services. This finding corroborates the need for service providers to develop P2P payment services that are easy to use, secure, and attractive to consumers.

The future landscape of the payment sector will be promising for financial entities and FinTech organizations that are open to change, innovation, and forward-thinking. These players need to rapidly accelerate their transformation efforts to address unmet customer demands and plug the gaps. In this vein, our findings are novel and useful for both traditional and new financial intermediaries, businesses, customers, and other stakeholders that are part of financial systems, such as policymakers and regulators. More importantly, our findings could be of interest to financial institutions to define ad hoc financial services customized for their target market.

From a theoretical perspective, our results support the necessity of implementing statistical procedures to reduce the complexity of the data. The Boruta and Gini procedures are preferable methods because both build on the ability of Random Forest, one of the most advanced current ML methods, to capture nonlinear relationships.

Practical implications

From a managerial standpoint, our research findings provide valuable insights for service providers in the mobile P2P payments industry. To effectively promote the adoption and usage of their platforms, providers must prioritize enhancing usability and user experience. This can be achieved by streamlining the payment process, simplifying user interfaces, and ensuring smooth and intuitive navigation. By focusing on subjective norms, providers can tap into the power of social influence, leveraging the positive perceptions and recommendations of existing users to attract new users. Implementing strategies to encourage word-of-mouth marketing, such as referral programs or incentives for users who refer others to the service, can be an effective approach to expanding user adoption.

Building trust is another critical aspect of driving the adoption of mobile P2P payments. Service providers should prioritize security measures and communicate them transparently to users. Highlighting the safety of transactions, data protection protocols, and robust authentication methods can help alleviate concerns and increase users’ trust in the platform. In addition, incorporating features that enhance user enjoyment and engagement, such as personalized experiences, rewards, or gamification elements, can contribute to positive user perception and encourage continued usage.

Beyond the immediate managerial implications, our research findings have broader societal and economic implications. Promoting the adoption of mobile P2P payments can contribute to financial inclusion, particularly for marginalized populations, such as the young, the unemployed, and those with limited access to traditional banking services in rural areas. By providing these individuals with convenient and accessible payment solutions, barriers to financial participation can be reduced, enabling them to engage in economic activities, make transactions, and manage their finances more effectively. This, in turn, can lead to increased economic empowerment, poverty reduction, and overall societal development.

Furthermore, from a macroeconomic perspective, higher adoption of mobile P2P payments can lead to increased financial stability. By reducing the reliance on cash transactions and expanding digital payment options, the risks associated with handling physical currency, such as theft or counterfeiting, can be mitigated. Additionally, the digitization of payments enables better tracking and monitoring of financial flows, contributing to enhanced transparency and accountability within the financial system. This improved oversight can help prevent illicit activities, such as money laundering and tax evasion, while facilitating more efficient financial regulations and policy implementations.

In conclusion, the implications of our research emphasize the importance of prioritizing usability, trust, and enjoyment in mobile P2P payment services. By addressing these factors and promoting the adoption of mobile P2P payments, service providers can not only drive their business success but also contribute to financial inclusion, economic development, and financial stability at both the individual and societal levels.

Limitations and avenues for future research

Despite the valuable insights gained from this study, it is important to acknowledge its limitations, which open up avenues for future research. First, enhancing the dataset by incorporating additional information, such as users’ training in new technologies, educational background, and risk aversion, would provide valuable control and moderating variables to deepen our understanding of the factors influencing the use of mobile P2P payments. This could shed light on how these individual characteristics interact with other factors and impact adoption.

Second, obtaining data on the average size of digital payment transactions would allow for an analysis of how users’ risk aversion influences their adoption of mobile P2P payments. Examining whether risk-averse individuals are more or less likely to engage in larger transactions through these payment methods could provide valuable insights into the relationship between risk perception and usage behavior.

Third, replicating this study by surveying individuals from different countries would enable an analysis of the effects of institutional frameworks on the adoption of mobile P2P payments. Comparing adoption patterns across countries with varying regulatory environments and financial infrastructures could reveal the influence of these contextual factors on user behavior.

Finally, it is important to address the limitations associated with the sample selection process, specifically the use of a nonprobability snowball sampling method. Future research should consider employing alternative sampling techniques, such as simple random sampling or quota sampling, to ensure a more representative and generalizable sample. This would enhance the external validity of the findings and provide a more comprehensive understanding of the factors influencing mobile P2P payment adoption across diverse populations.

By addressing these limitations and pursuing further research in these areas, we can gain a more nuanced understanding of the adoption and usage of mobile P2P payments, leading to more effective strategies for service providers and policymakers in driving the growth and acceptance of these payment methods. A further limitation concerns ML itself, whose use entails making suitable choices among manifold implementation options, handling bias and drift in data, and mitigating black-box properties.

Conclusion

In the current era of increasing digitalization and massive use of FinTech services, digital P2P payments are spreading rapidly as the preferred payment option, mainly among the young. The rise in P2P payments has been enhanced by the explosion of secure mobile payment applications as well as the COVID-19 pandemic, which dramatically limited cash payments to prevent transmission of the virus. Moreover, the need to align individuals’ behaviors with the Sustainable Development Goals also requires boosting digital P2P payments as a way to increase the financial inclusion of many individuals excluded from traditional banking services (Danisman and Tarazi 2020). Indeed, banks and other financial players are currently playing a relevant role in developing innovative payment services where P2P payments are becoming widespread. Thus, it is crucial to examine the factors that determine customers’ adoption of mobile P2P payments to exploit their potential.

This study explores the drivers of mobile P2P adoption by using ML to predict usage among FinTech disruptions in financial services. Our main conclusion is that ML must be applied by banks and other financial intermediaries to predict their customers’ adoption of mobile/digital P2P payments. Indeed, to the authors’ knowledge, this approach has not yet been employed in this field of research. In addition, our findings emphasize the relevance of usefulness, subjective norms, trust, and user enjoyment in classifying potential mobile P2P users.

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

P2P: Peer-to-peer
LDA: Linear discriminant analysis
LR: Logistic regression
IWLS: Iterative weighted least squares
AIC: Akaike information criterion
MLP: Multilayer perceptron
SVM: Support vector machine
CART: Classification and regression tree
CTBag: Bagged tree model
RF: Random forests
FS: Feature selection
AUC: Area under the ROC curve
OOB: Out-of-bag
GPI: Global performance index
SDG: Sustainable development goals

References

  • Abdullah S, Naved Khan M (2021) Determining mobile payment adoption: a systematic literature search and bibliometric analysis. Cogent Bus Manag 8(1):1893245

  • Acker A, Murthy D (2020) What is Venmo? A descriptive analysis of social features in the mobile payment platform. Telemat Inform 52:101429

  • Agarwal R, Karahanna E (2000) Time flies when you’re having fun: cognitive absorption and beliefs about information technology usage. MIS Q 24(4):665–694

  • Agarwal R, Prasad J (1998a) A conceptual and operational definition of personal innovativeness in the domain of information technology. Inf Syst Res 9(2):204–215

  • Agarwal R, Prasad J (1998b) The antecedents and consequents of user perceptions in information technology adoption. Decis Support Syst 22(1):15–29

  • Ajzen I (1991) The theory of planned behaviour. Organ Behav Hum Decis Process 50:179–211

  • Alonso Robisco A, Carbó Martínez JM (2022) Measuring the model risk-adjusted performance of machine learning algorithms in credit default prediction. Financ Innov 8:70. https://doi.org/10.1186/s40854-022-00366-1

  • Arora N, Kaur PD (2020) A Bolasso based consistent feature selection enabled random forest classification algorithm: an application to credit risk assessment. Appl Soft Comput 86:105936. https://doi.org/10.1016/j.asoc.2019.105936

  • Aslam F, Awan TM, Fatima T (2022) Classification of m-payment users’ behavior using machine learning models. J Financ Serv Mark 27:264–275. https://doi.org/10.1057/s41264-021-00114-z

  • Ba S, Pavlou P (2002) Evidence of trust building technology in electronic markets: price premiums and buyer behavior. MIS Q 26:243–268. https://doi.org/10.2307/4132332

  • Bailey AA, Bonifield CM, Arias A, Villegas J (2022) Mobile payment adoption in Latin America. J Serv Mark 36(8):1058–1075

  • Bajari P, Nekipelov D, Ryan SP, Yang M (2015) Machine learning methods for demand estimation. Am Econ Rev 105(5):481–485

  • Belanche D, Guinalíu M, Albás P (2022) Customer adoption of P2P mobile payment systems: the role of perceived risk. Telemat Inform 72:101851. https://doi.org/10.1016/j.tele.2022.101851

  • Bhattacherjee A, Premkumar G (2004) Understanding changes in belief and attitude toward information technology usage: a theoretical model and longitudinal test. MIS Q 28(2):229–254

  • Bishop CM (2006) Pattern recognition and machine learning. Springer, Berlin

  • Bizum (2022) https://bizum.es/datos/. Accessed 21 Mar 2022

  • Boulesteix AL, Janitza S, Hapfelmeier A, Van Steen K, Strobl C (2015) Letter to the editor: on the term “interaction” and related phrases in the literature on random forests. Brief Bioinform 16(2):338–345

  • Breiman L (1996) Bagging predictors. Mach Learn 24:123–140. https://doi.org/10.1007/BF00058655

  • Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  • Bühlmann P, Hothorn T (2007) Boosting algorithms: regularization, prediction and model fitting. Stat Sci 22:477–505

  • Cao L, Philip SY, Kumar V (2015) Nonoccurring behavior analytics: a new area. IEEE Intell Syst 30(6):4–11

  • Chan JY, Leow SM, Bea KT, Cheng WK, Phoong SW, Hong ZW, Chen YL (2022) Mitigating the multicollinearity problem and its machine learning approach: a review. Mathematics 10(8):1283. https://doi.org/10.3390/math10081283

  • Chen RC, Dewi C, Huang SW, Caraka RE (2020) Selecting critical features for data classification based on machine learning methods. J Big Data. https://doi.org/10.1186/s40537-020-00327-4

  • Cui G, Wong ML, Lui HK (2016) Machine learning for direct marketing response models: Bayesian networks with evolutionary programming. Manag Sci 52(4):597–612

  • Dahlberg T, Mallat N, Ondrus J, Zmijewska A (2008) Past, present and future of mobile payments research: a literature review. Electron Commer Res Appl 7(2):165–181

  • Danisman GO, Tarazi A (2020) Financial inclusion and bank stability: evidence from Europe. Eur J Finance 26(18):1842–1855. https://doi.org/10.1080/1351847X.2020.1782958

  • Davenport T, Guha A, Grewal D, Bressgott T (2020) How artificial intelligence will change the future of marketing. J Acad Mark Sci 48(1):24–42. https://doi.org/10.1007/s11747-019-00696-0

  • Davis FD (1989) Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q 13:319–340

  • Davis FD, Bagozzi RP, Warshaw PR (1989) User acceptance of computer technology: a comparison of two theoretical models. Manag Sci 35(8):982–1003

  • Dennehy D, Sammon D (2015) Trends in mobile payments research: a literature review. J Innov Manag 3(1):49–61

  • Dewi C (2019) Random forest and support vector machine on features selection for regression analysis. Int J Innov Comput Inf Control 15(6):2027–2037

  • Dimitriadou E, Hornik K, Leisch F, Meyer D, Weingessel D (2022) e1071: misc functions of the department of statistics (e1071) TU Wien. R package version 1.6. https://cran.r-project.org/web/packages/e1071/index.html

  • European Central Bank (2022) Estadísticas sobre pagos: 2021. www.bce.es

  • Fahimifar S, Mousavi K, Mozaffari F, Ausloos M (2022) Identification of the most important external features of highly cited scholarly papers through 3 (i.e., Ridge, Lasso, and Boruta) feature selection data mining methods. Qual Quant. https://doi.org/10.1007/s11135-022-01480-z

  • Fishbein M, Ajzen I (1975) Belief, attitude, intention, and behavior: an introduction to theory and research. Addison-Wesley, Reading, MA

  • Fishbein M, Ajzen I (1977) Belief, attitude, intention and behavior: an introduction to theory and research. Philos Rhetor 10(2):130–132

  • Flavián C, Guinaliu M, Lu Y (2020) Mobile payments adoption–introducing mindfulness to better understand consumer behavior. Int J Bank Mark 38(7):1575–1599

  • Frame WS, Wall LD, White LJ (2018) Technological change and financial innovation in banking: some implications for fintech. FRB Atlanta, working paper no. 2018-11

  • Freund Y (1995) Boosting a weak learning algorithm by majority. Inf Comput 121(2):256–285

  • Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting (with discussion). Ann Stat 28:337–407

  • Gomber P, Koch JA, Siering M (2017) Digital finance and FinTech: current research and future research directions. J Bus Econ 87:537–580. https://doi.org/10.1007/s11573-017-0852-x

  • Gefen D, Karahanna E, Straub DW (2003) Trust and TAM in online shopping: an integrated model. MIS Q 27(1):51–90

  • Gözükara İ, Çolakoğlu N (2016) A research on generation Y students: brand innovation, brand trust and brand loyalty. Int J Bus Manag Econ Res 7(2):603–611

  • Guo M, Zhang Q, Liao X, Chen FY, Zeng DD (2021) A hybrid machine learning framework for analyzing human decision-making through learning preferences. Omega 101:102263. https://doi.org/10.1016/j.omega.2020.102263

  • Hagenauer J, Helbich MA (2017) Comparative study of machine learning classifiers for modeling travel mode choice. Expert Syst Appl 78:273–282

  • Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer, New York

  • Hernández-Murillo R, Llobet G, Fuentes R (2010) Strategic online banking adoption. J Bank Finance 34(7):1650–1663

  • Higueras-Castillo E, Liébana-Cabanillas FJ, Villarejo-Ramos ÁF (2023) Intention to use e-commerce vs physical shopping. Difference between consumers in the post-COVID era. J Bus Res 157:113622

  • Hothorn T, Bühlmann P, Kneib T, Schmid M, Hofner B (2022) mboost: model-based boosting. R package version 2.1-2. https://cran.r-project.org/web/packages/mboost/mboost.pdf

  • Huang Y (2021) Retail fintech payments: facts, benefits, challenges, and policies

  • Huang D, Liu X, Lai D, Li Z (2019) Users and non-users of P2P accommodation: differences in perceived risks and behavioral intentions. J Hosp Tour Technol 10(3):369–382

  • Insider Intelligence (2022) The payment industry’s biggest trends in 2022—and the pandemic’s impact on digitization in the payments landscape. https://www.businessinsider.com/payments-ecosystem-report. Accessed 21 Mar 2022

  • Irimia-Diéguez A, Velicia-Martín F, Aguayo-Camacho M (2023) Predicting Fintech innovation adoption: the mediator role of social norms and attitudes. Financ Innov. https://doi.org/10.1186/s40854-022-00434-6

  • Jarvenpaa SL, Tractinsky N, Vitale M (2000) Consumer trust in an internet store information technology and management. J Inf Syst 12(1):41–48

  • Jun J, Cho I, Park H (2018) Factors influencing continued use of mobile easy payment service: an empirical investigation. Total Qual Manag Bus Excell 29(9–10):1043–1057

  • Kalinic Z, Marinkovic V, Molinillo S, Liébana-Cabanillas F (2019) A multi-analytical approach to peer-to-peer mobile payment acceptance prediction. J Retail Consum Serv 49:143–153. https://doi.org/10.1016/j.jretconser.2019.03.016

  • Kaplan A, Haenlein M (2019) Siri, Siri, in my hand: Who’s the fairest in the land? On the interpretations, illustrations, and implications of artificial intelligence. Bus Horiz 62(1):15–25. https://doi.org/10.1016/j.bushor.2018.08.004

  • Kou G, Olgu Akdeniz Ö, Dinçer H, Yüksel S (2021) Fintech investments in European banks: a hybrid IT2 fuzzy multidimensional decision-making approach. Financ Innov 7(1):1–28

  • Lai F, Hutchinson J, Li D, Bai C (2007) An empirical assessment and application of SERVQUAL in mainland China’s mobile communications industry. Int J Qual Reliab Manag 24(3):244–262

  • LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444. https://doi.org/10.1038/nature14539

  • Lee VH, Hew JJ, Leong LY, Tan GWH, Ooi KB (2020) Wearable payment: a deep learning-based dual-stage SEM-ANN analysis. Expert Syst Appl 157:113477. https://doi.org/10.1016/j.eswa.2020.113477

  • Leong LY, Hew JJ, Wong LW, Lin B (2022) The past and beyond of mobile payment research: a development of the mobile payment framework. Internet Res 32(6):1757–1782

  • Lewis BR, Soureli M (2006) The antecedents of consumer loyalty in retail banking. J Consum Behav 5(1):15–31

  • Li L, Freeman G, Wohn DY (2021) The interplay of financial exchanges and offline interpersonal relationships through digital peer-to-peer payments. Telemat Inform. https://doi.org/10.1016/j.tele.2021.101671

  • Liaw A, Wiener M (2022) Classification and regression by random forest. R News 2:18–22

  • Liébana-Cabanillas F, Sánchez-Fernández J, Muñoz-Leiva F (2014) Role of gender on acceptance of mobile payment. Ind Manag Data Syst 114(2):220–240

  • Liébana-Cabanillas F, Ramos de Luna I, Montoro-Ríos F (2017) Intention to use new mobile payment systems: a comparative analysis of SMS and NFC payments. Econ Res-Ekonomska Istraživanja 30(1):892–910

  • Liébana-Cabanillas F, Molinillo S, Ruiz-Montañez M (2019) To use or not to use, that is the question: analysis of the determining factors for using NFC mobile payment systems in public transportation. Technol Forecast Soc Change 139:266–276

  • Liébana-Cabanillas F, Singh N, Kalinic Z, Carvajal-Trujillo E (2021) Examining the determinants of continuance intention to use and the moderating effect of the gender and age of users of NFC mobile payments: a multi-analytical approach. Inf Technol Manag 22:133–161. https://doi.org/10.1007/s10799-021-00328-6

  • Liébana-Cabanillas F, Kalinic Z, Luna IRD, Marinkovic V (2022a) A holistic analysis of near field communication mobile payments: an empirical analysis. Int J Mob Commun 20(6):703–726

  • Liébana-Cabanillas F, Muñoz-Leiva F, Molinillo S, Higueras-Castillo E (2022b) Do biometric payment systems work during the COVID-19 pandemic? Insights from the Spanish users’ viewpoint. Financ Innov 8(1):1–25

  • Ma S, Fildes R (2020) Forecasting third-party mobile payments with implications for customer flow prediction. Int J Forecast 36(3):739–760. https://doi.org/10.1016/j.ijforecast.2019.08.012

  • Madani A, Ong JR, Tibrewal A, Mofrad MR (2018) Deep echocardiography: data-efficient supervised and semi-supervised deep learning towards automated diagnosis of cardiac disease. Npj Digit Med 1:59. https://doi.org/10.1038/s41746-018-0065-x

  • Maindonald J, Braun J (2003) Data analysis and graphics using R: an example-based approach. Cambridge University Press, Cambridge

  • Martín A, Fernández-Isabel A, Martín de Diego I, Beltrán M (2021) A survey for user behavior analysis based on machine learning techniques: current models and applications. Appl Intell 51:6029–6055. https://doi.org/10.1007/s10489-020-02160-x

  • Meyer D (2012) Support vector machines. The interface to libsvm in package e1071. Available at svmdoc.pdf

  • Migliore G, Wagner R, Cechella FS, Liébana-Cabanillas F (2022) Antecedents to the adoption of mobile payment in China and Italy: an integration of UTAUT2 and innovation resistance theory. Inf Syst Front 24:1–24

  • Moorthy K, Chun T’ing L, Chea Yee K, Wen Huey A, Joe In L, Chyi Feng P, Jia Yi T (2020) What drives the adoption of mobile payment? A Malaysian perspective. Int J Finance Econ 25(3):349–364

  • Nasir A, Shaukat K, Khan KI, Hameed IA, Alam TM, Luo S (2020) What is core and what future holds for blockchain technologies and cryptocurrencies: a bibliometric analysis. IEEE Access 9:989–1004

  • Nasir A, Shaukat K, Iqbal Khan K, Hameed IA, Alam TM, Luo S (2021) Trends and directions of financial technology (Fintech) in society and environment: a bibliometric study. Appl Sci 11(21):10353

  • Nguyen DK, Sermpinis G, Stasinakis C (2022) Big data, artificial intelligence and machine learning: a transformative symbiosis in favour of financial technology. Eur Financ Manag. https://doi.org/10.1111/eufm.12365

  • Panetta IC, Leo S, Delle Foglie A (2023) The development of digital payments–past, present, and future–from the literature. Res Int Bus Finance 64:101855

  • Patil PP, Dwivedi YK, Rana NP (2017) Digital payments adoption: an analysis of literature. Conference on e-Business, e-Services and e-Society. Springer, Cham, pp 61–70

  • Pavlou PA (2002) Institution-based trust in interorganizational exchange relationships: the role of online B2B marketplaces on trust formation. J Strateg Inf Syst 11(3–4):215–243

  • Peters A, Hothorn T (2016) Improved predictive models by indirect classification and bagging for classification, regression and survival problems as well as resampling based estimators of prediction error. https://cran.r-project.org/web/packages/ipred/index.html

  • Petropoulos A, Siakoulis V, Stavroulakis E, Vlachogiannakis NE (2020) Predicting bank insolvencies using machine learning techniques. Int J Forecast 36(3):1092–1113. https://doi.org/10.1016/j.ijforecast.2019.11.005

  • Rafdinal W, Senalasari W (2021) Predicting the adoption of mobile payment applications during the COVID-19 pandemic. Int J Bank Mark 39(6):984–1002

  • Ramos-de-Luna I, Montoro-Ríos F, Liébana-Cabanillas F (2016) Determinants of the intention to use NFC technology as a payment system: an acceptance model approach. IseB 14(2):293–314

  • Rouibah K, Lowry PB, Hwang Y (2016) The effects of perceived enjoyment and perceived risks on trust formation and intentions to use online payment systems: new perspectives from an Arab country. Electron Commer Res Appl 19:33–43. https://doi.org/10.1016/j.elerap.2016.07.001

  • Schapire RE, Freund Y, Bartlett P, Lee WS (1998) Boosting the margin: a new explanation for the effectiveness of voting methods. Ann Stat 26(5):1651–1686

  • Selvamuthu D, Kumar V, Mishra A (2019) Indian stock market prediction using artificial neural networks on tick data. Financ Innov 5:16. https://doi.org/10.1186/s40854-019-0131-7

  • Shaikh A, Liébana-Cabanillas F, Glavee-Geo R (2023) Factors inhibiting the adoption intention of digital payment platforms. In: Responsible finance and digitalization. Routledge, pp 140–154

  • Sheth J, Kellstadt CH (2021) Next frontiers of research in data driven marketing: Will techniques keep up with data tsunami? J Bus Res 125:780–784. https://doi.org/10.1016/j.jbusres.2020.04.050

  • Singh J, Sirdeshmukh D (2000) Agency and trust mechanisms in consumer satisfaction and loyalty judgments. J Acad Mark Sci 28:150–167. https://doi.org/10.1177/0092070300281014

  • Skinner BF (1953) Science and human behavior. Simon and Schuster, New York, p 92904

  • Speiser JL, Miller ME, Tooze J, Ip E (2019) A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst Appl 134:93–101. https://doi.org/10.1016/j.eswa.2019.05.028

  • Tamayo B (1999) Nuevos campos para la innovación: Internet y el comercio electrónico de bienes y servicios. Retrieved from www.navactiva.com/es/descargas/pdf/atic/cotec.pdf

  • Taylor S, Todd PA (1995) Understanding information technology usage: a test of competing models. Inf Syst Res 6(2):144–176

  • Thai HT (2022) Machine learning for structural engineering: a state-of-the-art review. Structures 38:448–491. https://doi.org/10.1016/j.istruc.2022.02.003

  • Thakor AV (2020) Fintech and banking: What do we know? J Financ Intermed 41:100883

  • Tounekti O, Ruiz-Martínez A, Skarmeta Gomez AF (2022) Research in electronic and mobile payment systems: a bibliometric analysis. Sustainability 14(13):7661

  • Türker C, Altay BC, Okumuş A (2022) Understanding user acceptance of QR code mobile payment systems in Turkey: an extended TAM. Technol Forecast Soc Change 184:121968

  • Upadhyay N, Upadhyay S, Abed SS, Dwivedi YK (2022) Consumer adoption of mobile payment services during COVID-19: extending meta-UTAUT with perceived severity and self-efficacy. Int J Bank Mark 40(5):960–991

  • Vanini P, Rossi S, Zvizdic E, Domenig T (2023) Online payment fraud: from anomaly detection to risk management. Financ Innov 9:66. https://doi.org/10.1186/s40854-023-00470-w

  • Vellido A, Lisboa PJG, Vaughan J (1999) Neural networks in business: a survey of applications (1992–1998). Expert Syst Appl 17:51–70. https://doi.org/10.1016/S0957-4174(99)00016-0

  • Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, New York, NY

  • Venkatesh V, Bala H (2008) Technology acceptance model 3 and a research agenda on interventions. Decis Sci 39(2):273–315

  • Venkatesh V, Davis FD (2000) A theoretical extension of the technology acceptance model: four longitudinal field studies. Manag Sci 46(2):186–204

  • Venkatesh V, Morris MG, Davis GB, Davis FD (2003) User acceptance of information technology: toward a unified view. MIS Q 27:425–478

  • Venkatesh V, Thong J, Xu X (2012) Consumer acceptance and use of information technology: extending the unified theory of acceptance and use of technology. MIS Q 36(1):157–178

  • Visconti-Caparrós JM, Campos-Blázquez JR (2022) The development of alternate payment methods and their impact on customer behavior: the Bizum case in Spain. Technol Forecast Soc Change 175:121330

  • Wakefield RL, Whitten D (2006) Examining user perceptions of third-party organizations credibility and trust in an e-retailer. J Organ End User Comput (JOEUC) 18(2):1–19

  • Weisberg S (2005) Applied linear regression, vol 528. Wiley, Hoboken

  • Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann Publishers, Massachusetts

  • Wu R-Z, Lee J-H, Tian X-F (2021) Determinants of the intention to use cross-border mobile payments in Korea among Chinese tourists: An integrated perspective of UTAUT2 with TTF and ITM. J Theor Appl Electron Commer Res 16(5):1537–1556

  • Wu Y, Zhang W, Shen J, Mo Z, Peng Y (2018) Smart city with Chinese characteristics against the background of big data: idea, action and risk. J Clean Prod 173:60–66

  • Xiong T, Ma Z, Li Z, Dai J (2022) The analysis of influence mechanism for internet financial fraud identification and user behavior based on machine learning approaches. Int J Syst Assur Eng Manag 13(3):996–1007. https://doi.org/10.1007/s13198-021-01181-0

Acknowledgements

Not applicable.

Funding

Not applicable.

Author information

Contributions

ABO: data collection, methodology, and implementation of algorithms. JLR: conceptualization and original draft. AID: writing and editing. FLC: conceptualization, theoretical framework, and positioning of the research. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Blanco-Oliver Antonio.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: constructs and measurement items

Perceived ease of use (Venkatesh and Bala 2008)

  • Interaction with the system does not require great effort.

  • Interaction with the system is straightforward.

  • It’s easy to get the system to do what I want.

  • The system is useful for making small payments.

  • In general, the system is easy to use.

Perceived risk of peer-to-peer mobile payment system (Jarvenpaa et al. 2000; Wakefield and Whitten 2006)

  • Other people can get information about my online transactions if I use this tool.

  • There is a high potential for money wasted if I make purchases on the internet/social networks using this tool.

  • There is significant risk in making purchases on the internet/social networks using this tool.

  • I think that making purchases on the internet/social networks with this tool is a risky choice.

Perceived usefulness of peer-to-peer mobile payment systems (Bhattacherjee and Premkumar 2004)

  • Peer-to-peer mobile payment systems are useful payment methods.

  • Using peer-to-peer mobile payment systems makes it easier to handle payments.

  • Peer-to-peer mobile payment systems allow quick use of mobile applications.

  • In general, peer-to-peer mobile payment systems could be useful for me.

Perceived trust of peer-to-peer mobile payment system (Pavlou 2002)

  • I believe the peer-to-peer mobile payment system will keep its promises and commitments.

  • The peer-to-peer mobile payment system is trustworthy.

  • I would describe the peer-to-peer mobile payment system as honest.

  • I believe the peer-to-peer mobile payment system is responsible.

  • In general, I trust the peer-to-peer mobile payment system.

Personal innovativeness in information technology (Agarwal and Prasad 1998a; Ramos-de-Luna et al. 2016)

  • If I find out about new information technology, I seek ways to experience it.

  • I am usually one of the first among my colleagues/peers to explore new information technology.

  • In general, I am reluctant to try new information technologies.

  • I like to try new information technologies.

Subjective norms (Taylor and Todd 1995; Agarwal and Prasad 1998b)

  • The people whose opinions I value would approve of me using a peer-to-peer mobile payment system.

  • Most of the people I have in mind think that I should use a peer-to-peer mobile payment system.

  • They expect me to use a peer-to-peer mobile payment system.

  • The people who are close to me would agree with me in using a peer-to-peer mobile payment system.

Perceived enjoyment of the peer-to-peer mobile payment system (Agarwal and Karahanna 2000; Rouibah et al. 2016)

  • I have fun interacting with this peer-to-peer mobile payment system.

  • Using this peer-to-peer mobile payment system provides me with a lot of enjoyment.

  • I enjoy using this peer-to-peer mobile payment system.

Loyalty to the bank brand (Gözükara and Çolakoğlu 2016)

  • I will not buy other brands if this brand is available at the store.

  • I consider myself loyal to this brand.

  • This brand would be my first choice.

  • I rarely switch from this brand just to try something different.

Perceived quality (Lai et al. 2007)

  • When peer-to-peer mobile payment systems promise they will do something, they do.

  • I consider peer-to-peer mobile payment systems to be dependable.

  • Peer-to-peer mobile payment systems provide the services they promise when they are supposed to.

  • Peer-to-peer mobile payment systems accurately maintain the statement.

  • It is easy to obtain related service information.

  • It feels safe to do business with the company.

  • The statement is clear and easy to understand.

Appendix 2: criteria for the implementation of algorithms

Linear and quadratic discriminant analysis

We select the threshold pc from the grid (0.01, 0.02, …, 0.99), choosing the value that minimises the classification error in a tenfold cross-validation; the value obtained was 0.42. LDA was fitted with the R function lda (Venables and Ripley 2002), available in the MASS library.

We also compute quadratic discriminant analysis (QDA), which does not assume that the class covariance matrices are equal. For this, we use the function qda from the MASS library (Venables and Ripley 2002). In this case, the cut-off obtained was 0.43.
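
As a rough illustration of the cut-off search described above, the following Python sketch scans the grid 0.01, 0.02, …, 0.99 and returns the value with the lowest classification error. It is not the authors' R code: the function name best_cutoff, the toy data, and the assumption that probs holds out-of-fold posterior probabilities pooled from the tenfold cross-validation are all illustrative.

```python
def best_cutoff(probs, labels, grid=None):
    """Return (cut-off, error) for the grid value with the lowest error.

    probs  : predicted probabilities of class 1 (assumed to be out-of-fold)
    labels : observed classes coded as 0/1
    """
    if grid is None:
        grid = [k / 100 for k in range(1, 100)]  # 0.01, 0.02, ..., 0.99
    best_c, best_err = None, float("inf")
    for c in grid:
        # An observation is assigned to class 1 when its probability >= c.
        errors = sum((p >= c) != bool(y) for p, y in zip(probs, labels))
        err = errors / len(labels)
        if err < best_err:  # strict '<' keeps the smallest cut-off on ties
            best_c, best_err = c, err
    return best_c, best_err


# Hypothetical toy data, not taken from the study.
probs = [0.10, 0.30, 0.41, 0.45, 0.80, 0.90]
labels = [0, 0, 0, 1, 1, 1]
cutoff, error = best_cutoff(probs, labels)
```

The same search could be reused for any classifier that outputs class probabilities, which is presumably why the grid is identical for LDA, QDA, and logistic regression.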

Logistic regression

We use the step.glm function in R (Venables and Ripley 2002), which computes the maximum likelihood estimators of the n + 1 parameters by means of an iteratively reweighted least squares (IWLS) algorithm, applied under a forward sequential method based on the Akaike Information Criterion (AIC). The optimal cut-off is searched for in the grid (0.01, 0.02, …, 0.99), selecting the value that minimises the tenfold cross-validation error; the value obtained was 0.46.
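
To make the IWLS idea concrete, here is a minimal Python sketch for the simplest case of an intercept plus one predictor. It is an assumption-laden illustration rather than the routine used in the paper: the function name iwls_logistic, the fixed number of iterations, and the toy data are hypothetical, and the forward AIC selection step is not shown.

```python
import math


def iwls_logistic(x, y, n_iter=25):
    """Fit logit P(y = 1) = b0 + b1 * x by iteratively reweighted least squares."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi                 # linear predictor
            p = 1.0 / (1.0 + math.exp(-eta))   # fitted probability
            w = p * (1.0 - p)                  # IWLS weight
            z = eta + (yi - p) / w             # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted normal equations for the updated coefficients.
        det = s_w * s_wxx - s_wx ** 2
        b0, b1 = ((s_wxx * s_wz - s_wx * s_wxz) / det,
                  (s_w * s_wxz - s_wx * s_wz) / det)
    return b0, b1


# Hypothetical, non-separable toy data (IWLS diverges on separable data).
x = [0, 1, 2, 3, 4, 5]
y = [0, 0, 1, 0, 1, 1]
b0, b1 = iwls_logistic(x, y)
```

At convergence the score equations hold, that is, the residuals y − p sum to zero both unweighted and weighted by x, which is a convenient check on the fit.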

Multilayer perceptron

The size of the hidden layer (H) and the decay parameter (k) are tuned by tenfold cross-validation over the grids {1, 2, …, 40} and {0, 0.01, 0.05, 0.10, …, 2}, respectively. The output of an MLP for a vector of inputs \(\left( x_{1}, \ldots, x_{p} \right)\) is given by:

$$\hat{y} = g\left( W_{0} + \sum_{h = 1}^{H} W_{h}\, g\left( v_{0h} + \sum_{i = 1}^{p} v_{ih} x_{i} \right) \right)$$

where \(g(\cdot)\) is the activation function, \(\left\{ v_{ih},\; i = 0, 1, 2, \ldots, p,\; h = 1, 2, 3, \ldots, H \right\}\) are the synaptic weights for the connections between the p-sized input and the hidden layer, and \(\left\{ W_{h},\; h = 0, 1, 2, \ldots, H \right\}\) are the synaptic weights for the connections between the hidden nodes and the output node.

We use the function nnet from R (Venables and Ripley 2002), which employs the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method, a quasi-Newton procedure that minimises an error criterion including a decay term k intended to avoid overfitting. As shown by Hastie et al. (2009), for classification problems an appropriate error function is the conditional maximum likelihood (or entropy) criterion, so that the BFGS procedure solves the problem defined as:

$$\min_{W} \; - \sum_{i = 1}^{n} \left( y_{i} \ln \hat{y}_{i} + \left( 1 - y_{i} \right) \ln \left( 1 - \hat{y}_{i} \right) \right) + k \sum_{j = 1}^{M} W_{j}^{2}$$

where \(W_{j}\) \(\left( j = 1, \ldots, M \right)\) are the M coefficients of the network.
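
The two expressions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: g is taken to be the logistic function, the names mlp_output and penalised_entropy are hypothetical, and the weights below are made-up values rather than fitted ones. The criterion is written as the negative log-likelihood plus the decay penalty, which is the quantity a minimiser would reduce.

```python
import math


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


def mlp_output(x, v, w):
    """Output of a single-hidden-layer MLP.

    x : inputs (x_1, ..., x_p)
    v : v[h] = [v_0h, v_1h, ..., v_ph], one list per hidden node
    w : [W_0, W_1, ..., W_H], hidden-to-output weights
    """
    hidden = [logistic(vh[0] + sum(c * xi for c, xi in zip(vh[1:], x)))
              for vh in v]
    return logistic(w[0] + sum(c * z for c, z in zip(w[1:], hidden)))


def penalised_entropy(X, y, v, w, k):
    """Negative log-likelihood plus weight decay k * (sum of squared weights)."""
    fit = 0.0
    for xi, yi in zip(X, y):
        y_hat = mlp_output(xi, v, w)
        fit -= yi * math.log(y_hat) + (1 - yi) * math.log(1.0 - y_hat)
    decay = sum(c * c for vh in v for c in vh) + sum(c * c for c in w)
    return fit + k * decay


# Hypothetical network with p = 2 inputs and H = 2 hidden nodes.
v = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
w = [0.0, 1.0, 1.0]
X = [[1.0, 2.0], [3.0, 4.0]]
y = [0, 1]
loss = penalised_entropy(X, y, v, w, k=0.1)
```

A larger k shrinks the weights towards zero, which is why k is tuned jointly with H in the cross-validation grid.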

Support vector machine

Mathematically, the SVM is defined on n training vectors \(\left\{ \left( X_{i}, y_{i} \right) \right\}\), \(i = 1, 2, \ldots, n\), where the multi-dimensional vectors \(X_{i}\) contain the predictor features and the n labels \(y_{i} \in \left\{ -1, 1 \right\}\) identify the class of each vector. In accordance with Meyer (2012), we use the Gaussian radial basis function (RBF) kernel from the library e1071 (Dimitriadou et al. 2022):

$$K\left( u, v \right) = \exp \left( - \theta \left\| u - v \right\|^{2} \right)$$

and solve the following quadratic programming problem:

$$\begin{aligned} & \min_{w, b, \delta} \; \frac{1}{2} w^{t} w + C \sum_{i = 1}^{n} \delta_{i} \\ & \text{subject to} \quad y_{i} \left( w^{t} \omega \left( X_{i} \right) + b \right) \ge 1 - \delta_{i}, \\ & \delta_{i} \ge 0, \quad i = 1, 2, \ldots, n \\ \end{aligned}$$

where \(\omega(\cdot)\) is the feature map induced by the kernel, \(\delta_{i}\) are slack variables, and C is the cost parameter.

Given that the choice of the parameters C and \(\theta\) strongly affects the performance of the model, we apply a grid search with tenfold cross-validation over the sets {1, 10, 20, 30, 40, …, 1000} and {0.10, 0.15, 0.20, …, 0.90}, respectively, using the function tune.svm in the library e1071.
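
For orientation, the kernel and the size of the tuning grid can be sketched in Python as follows. The function name rbf_kernel is hypothetical, and reading the elided C grid as steps of 10 is an assumption based on the listed values; the sketch does not reproduce tune.svm.

```python
import math


def rbf_kernel(u, v, theta):
    """Gaussian RBF kernel K(u, v) = exp(-theta * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-theta * sq_dist)


# Tuning grids as read from the text; the step of 10 for C is an assumption.
c_grid = [1] + list(range(10, 1001, 10))                     # 1, 10, 20, ..., 1000
theta_grid = [round(0.10 + 0.05 * k, 2) for k in range(17)]  # 0.10, 0.15, ..., 0.90
n_candidates = len(c_grid) * len(theta_grid)                 # (C, theta) pairs to evaluate
```

Under this reading, each of the candidate pairs would be evaluated by tenfold cross-validation, so the search involves ten model fits per pair.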

Classification trees

We employ the rpart package to build the CART model, which uses the Gini index as an impurity measure to split the dataset. To avoid overfitting, and in accordance with Maindonald and Braun (2003), we apply the one-standard-deviation rule to determine the number of terminal nodes.
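
The Gini criterion mentioned above can be illustrated with a short Python sketch. The function names gini and gini_gain and the toy node are hypothetical, and the sketch covers only the impurity computation, not the tree-growing or pruning logic of rpart.

```python
def gini(labels):
    """Gini impurity 1 - sum_c p_c^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def gini_gain(parent, left, right):
    """Impurity decrease obtained by splitting parent into left and right."""
    n = len(parent)
    children = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - children


# Hypothetical node with two users (1) and two non-users (0).
node = [0, 0, 1, 1]
gain = gini_gain(node, [0, 0], [1, 1])
```

A split that separates the classes perfectly removes all impurity, whereas a split that leaves both children as mixed as the parent yields no gain.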

Bagging

We aggregate the B models by majority voting. To compute bagged tree models (CTBag) we use the package ipred (Peters and Hothorn 2016). We consider two values for B, 50 and 100, and select the one that minimises the tenfold cross-validation classification error.
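
A minimal Python sketch of the two ingredients of bagging, bootstrap resampling and majority voting, is given below. It is not the ipred implementation: the helper names are hypothetical, and simple threshold rules stand in for the fitted trees.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) observations with replacement."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Return the most frequent predicted class."""
    return Counter(predictions).most_common(1)[0][0]


def bagged_predict(models, x):
    """Aggregate the B fitted models by majority voting."""
    return majority_vote([model(x) for model in models])


# Hypothetical stand-ins for B = 3 fitted trees (simple threshold rules).
models = [lambda x: int(x > 0.4), lambda x: int(x > 0.5), lambda x: int(x > 0.6)]
rng = random.Random(0)
sample = bootstrap_sample([1, 2, 3, 4, 5], rng)
```

In a full implementation each model would be a tree fitted to its own bootstrap sample. With an even B such as 50 or 100, ties are possible and need an explicit rule, which this sketch does not address.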

Random forest

To implement this ensemble method, we use the package randomForest (Liaw and Wiener 2022). The number of variables randomly sampled as candidates at each split (mtry) was selected through a tenfold cross-validation search around the default value (mtry = square root of the number of predictors), namely from mtry − 3 to mtry + 3.
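
The candidate values for mtry can be enumerated as in the Python sketch below. The function name mtry_candidates is hypothetical, and both the integer square root and the clipping of the range to valid values (between 1 and the number of predictors) are assumptions; the text does not say how out-of-range values were handled.

```python
import math


def mtry_candidates(n_predictors, radius=3):
    """Candidate mtry values from default - radius to default + radius."""
    default = max(1, math.isqrt(n_predictors))   # floor of the square root
    low = max(1, default - radius)               # assumed lower clip
    high = min(n_predictors, default + radius)   # assumed upper clip
    return list(range(low, high + 1))


# Hypothetical number of predictors, chosen only for illustration.
candidates = mtry_candidates(36)
```

Each candidate would then be scored by tenfold cross-validation and the value with the lowest classification error retained.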

Boosting

AdaBoost, Binomial Boosting, and L2 Boosting were performed using the function glmboost from the mboost library (Hothorn et al. 2022). To fit the number of iterations (m) of each model, we perform a tenfold cross-validation search, from 1 to 3000, for the value minimising the empirical loss.

This library considers the problem of estimating a real-valued function:

$$f^{*} \left( \cdot \right) = \mathop{\arg\min}\limits_{f\left( \cdot \right)} E\left[ \rho \left( Y, f\left( X \right) \right) \right]$$

where \(\rho\) is a loss function. We assume n training vectors \(\left\{ {X_{i} ,y_{i} } \right\}\), \(i = 1, 2, \ldots , n,\) and having selected a base procedure, the generic functional gradient descent algorithm is:

  1. Initialise \(\hat{f}^{\left[ 0 \right]} \left( \cdot \right)\) with an offset value. Set \(m = 0\).

  2. Increase m by 1. Evaluate the negative gradient of the loss function at \(\hat{f}^{\left[ m - 1 \right]} \left( X_{i} \right)\):

    $$U_{i} = - \left. \frac{\partial}{\partial f} \rho \left( y_{i}, f \right) \right|_{f = \hat{f}^{\left[ m - 1 \right]} \left( X_{i} \right)}, \quad i = 1, \ldots, n$$

  3. Fit the base procedure to predict \(\left\{ U_{i},\; i = 1, \ldots, n \right\}\) from \(\left\{ X_{i},\; i = 1, \ldots, n \right\}\), obtaining \(\hat{g}^{\left[ m \right]} \left( \cdot \right)\).

  4. Update \(\hat{f}^{\left[ m \right]} \left( \cdot \right) = \hat{f}^{\left[ m - 1 \right]} \left( \cdot \right) + v\, \hat{g}^{\left[ m \right]} \left( \cdot \right)\), where \(v\) is the step-length factor.

  5. Iterate steps 2–4 until m reaches some stopping value M.

We use \(m = 1\) since, as shown by Bühlmann and Hothorn (2007), a small value for the step-length factor does not affect the stability of the model. Following Bühlmann and Hothorn (2007), we use three main boosting procedures to select the other elements of this algorithm. All of them share the base procedure: select the best variable in a simple linear model in the sense of ordinary least squares fitting. The final model \(\hat{f}^{\left[ m \right]} \left( \cdot \right)\) is a linear combination of the input variables.
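
The generic algorithm and the shared base procedure can be sketched in Python for the squared-error (L2) loss, where the negative gradient is simply the current residual. This is an illustrative sketch, not the glmboost implementation: the function name l2_boost, the number of iterations, the toy data, and the step length nu = 0.1 are assumptions made for the example.

```python
def l2_boost(X, y, n_iter=200, nu=0.1):
    """Component-wise L2 boosting with a single-variable least squares learner."""
    n, p = len(X), len(X[0])
    offset = sum(y) / n                  # step 1: initialise with an offset
    coef = [0.0] * p
    fitted = [offset] * n
    for _ in range(n_iter):
        # Step 2: negative gradient of (y - f)^2 / 2 is the residual y - f.
        u = [yi - fi for yi, fi in zip(y, fitted)]
        # Step 3: base procedure, the single variable with the best OLS fit to u.
        best_j, best_beta, best_rss = 0, 0.0, float("inf")
        for j in range(p):
            sxx = sum(row[j] ** 2 for row in X)
            if sxx == 0.0:
                continue
            beta = sum(row[j] * ui for row, ui in zip(X, u)) / sxx
            rss = sum((ui - beta * row[j]) ** 2 for row, ui in zip(X, u))
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss
        # Step 4: update the fit with step length nu.
        coef[best_j] += nu * best_beta
        fitted = [fi + nu * best_beta * row[best_j] for fi, row in zip(fitted, X)]
    return offset, coef


# Hypothetical centred predictors; y depends on the first column only.
X = [[-2, 1], [-1, -1], [0, 1], [1, -1], [2, 0]]
y = [3 + 2 * row[0] for row in X]
offset, coef = l2_boost(X, y)
```

Because every update adds a multiple of one input variable, the fitted model remains a linear combination of the inputs, and variables that are never selected keep a zero coefficient.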

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Antonio, BO., Juan, LR., Ana, ID. et al. Examining user behavior with machine learning for effective mobile peer-to-peer payment adoption. Financ Innov 10, 94 (2024). https://doi.org/10.1186/s40854-024-00625-3

  • DOI: https://doi.org/10.1186/s40854-024-00625-3

Keywords

JEL Classification