Statistical analysis of bank deposits dataset

This article presents a statistical analysis of the deposit activities in each of the account types of a leading bank in Nigeria. The mean effect of these account types on the bank was determined using analysis of variance (ANOVA). Further tests, including Tukey's simultaneous test for differences of means, were also conducted.


© 2018 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data source location
The data were obtained from one of the leading banks in Nigeria.

Data accessibility
All the data are available within this data article.

Value of the data
The data are useful in calculating the loan-to-deposit ratio. The data could be used as a tool in assessing bank competitiveness [1]. The data analysis could be helpful in detecting non-performing loans (NPL) in credit management [2].
The data could be helpful in monitoring off-balance-sheet engagements [3]. The data could be used to monitor compliance with banking decisions and strategy implementation; for example, innovative savings products [4][5][6].
The data analysis can be applied to monitor statutory policies and regulations; for example, the effect of monetary policies [7].
The data can be extended to include behavioral attitudes and customer preferences for some types of accounts.

Data
The data in this article consist of the amounts of money (in Naira) deposited into six different account types available in a leading bank in Nigeria on a particular day in 2017. The data also give the number of people who made deposits into the various account types.
The bank has six different account types, which we denote as Account Type 1 (Savings), Account Type 2 (Current), Account Type 3 (Corporate), Account Type 4, Account Type 5 and Account Type 6. Because the data are real and sensitive, some details are withheld to protect the privacy of the bank. Descriptive statistics were used to summarize the data and to provide plots for visualization. SPSS version 20 and Minitab version 17 were used for the analyses in this paper.
The data set is summarized in Table 1. Table 1 shows that more people patronize Account Type 1 (savings account) than any other account type, but the total amount deposited in that account type is not the largest. The account type that attracts the highest deposits is Account Type 2 (current account): its number of depositors is not the highest, but on average customers deposited the largest amounts there. This is reasonable because current accounts may be held by individuals, businesses and corporate organizations.
A chart that summarizes the whole dataset is presented in Fig. 1. The deposit patterns for Account Types 1-6 are shown as histograms in Figs. 2-7, respectively.
A boxplot of the amounts deposited in the various account types is displayed in Fig. 8.
The impact of the current account is also evident in Fig. 8. The mean deposit in each account type, with its 95% confidence interval (C.I.), is displayed in Table 2.
The 95% confidence interval plot for the mean of the amount deposited in the various account types is displayed in Fig. 9.
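The per-account-type means and 95% confidence intervals described above can be sketched as follows. This is a minimal illustration, not the procedure used in Minitab: the deposit values are hypothetical (they are not the bank's data), the function name `mean_ci` is invented for this example, and the interval uses a normal approximation. Statistical packages typically use a t critical value instead, which gives slightly wider intervals for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(sample, level=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))     # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    return m, m - z * se, m + z * se


# Hypothetical deposits keyed by account type (assumed values, for illustration).
deposits = {
    "type_1": [10.0, 12.0, 11.0, 13.0],
    "type_2": [30.0, 34.0, 32.0, 36.0],
    "type_3": [29.0, 33.0, 31.0, 35.0],
}

for name, sample in deposits.items():
    m, lo, hi = mean_ci(sample)
    print(f"{name}: mean={m:.2f}, 95% C.I.=({lo:.2f}, {hi:.2f})")
```

Intervals that do not overlap, as for the first two hypothetical groups here, hint at a difference in means, but the formal comparison is left to the ANOVA and the pairwise tests.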

Experimental design, materials and methods
Analysis of variance has traditionally been used to investigate the mean effects of groups of subjects. In this research, a one-way ANOVA is applied. ANOVA and other statistical tools have been applied to the analysis of economic data in areas such as econometric models, credit management, accounting and auditing, among many others. Furthermore, statistical tools are often combined with other tools for better analysis. Some examples include: macroeconomic volatility generation [8], economic impact of transportation [9], economic impact of professional negotiation [10], gross domestic product and exchange rate [11], economic impact of tourism [12], income inequality [13], the effects of expenditure [14], human capital in energy growth [15], quality of life [16], economic impact of portfolio selection [17], economics of refugees and asylum seekers [18], economic recovery [19] and energy needs for economic development [20].
Since we are dealing with a one-way ANOVA, the underlying model is:

$$Y_{ij} = \mu + \alpha_i + e_{ij}$$

where $Y_{ij}$ is the jth observation in the ith treatment, $\mu$ is the overall mean, $\alpha_i$ is the effect of treatment i, and $e_{ij}$ is the error term.

The specific hypotheses are:
$H_0$: The mean deposits in all the account types are equal
versus
$H_1$: The mean deposits are not equal for at least one of the account types

Minitab version 17 was used for the analysis of variance (ANOVA) and the further tests. The level of significance used for all the analyses is 0.05. The result is displayed in Table 3.
Decision Rule: Reject $H_0$ if the p-value is less than or equal to the level of significance.

Fig. 9. A plot of the 95% C.I. for the mean amount of deposits.

Decision:
We reject $H_0$ since the p-value (reported as 0.000) is less than the level of significance (0.05). Inference: The mean deposits are not equal for at least one of the account types. The ANOVA model is summarized in Table 4.
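The one-way ANOVA computation behind this decision can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the Minitab procedure: the deposit values are hypothetical, and the function name `one_way_anova` is invented for this example. The sketch stops at the F statistic; the p-value would then be read from the F distribution with (k - 1, N - k) degrees of freedom, for instance with `scipy.stats.f.sf` if SciPy is available.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical deposits for three account types (assumed values).
samples = [
    [10.0, 12.0, 11.0, 13.0],
    [30.0, 34.0, 32.0, 36.0],
    [29.0, 33.0, 31.0, 35.0],
]

f_stat, df_b, df_w = one_way_anova(samples)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

A large F statistic means the spread between the group means is large relative to the spread within groups, which is the situation in which $H_0$ is rejected.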

Tukey pairwise comparisons
Since $H_0$ was rejected, we are interested in knowing which pairs of means are significantly different from each other, using Tukey pairwise comparisons. The means are paired, the differences between the means are calculated, and Tukey's simultaneous test for differences of means of the deposits is obtained. The result is displayed in Table 5.
Pairs with a p-value less than 0.05 are significantly different from each other. For a clearer picture, the result is summarized in Table 6.
Remark: The means that do not share the same letter are significantly different from each other.

The residuals are shown as a histogram in Fig. 10. The normal probability plot for the residuals is displayed in Fig. 11.
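The pairwise comparison step in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not Minitab's implementation: the deposit values are hypothetical, the function name `tukey_q` is invented for this example, and the statistic is the Tukey-Kramer studentized range statistic computed from the pooled within-group mean square. The critical value `Q_CRIT` of about 3.95 is taken from standard studentized range tables for 3 groups and 9 error degrees of freedom at the 0.05 level; it applies only to this toy example and would differ for the six account types in the paper.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def tukey_q(groups):
    """Return {(a, b): q} with the Tukey-Kramer statistic for each pair."""
    n_total = sum(len(g) for g in groups.values())
    df_within = n_total - len(groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups.values() for y in g)
    ms_within = ss_within / df_within  # pooled error variance from the ANOVA
    q = {}
    for a, b in combinations(groups, 2):
        ga, gb = groups[a], groups[b]
        se = sqrt(ms_within / 2 * (1 / len(ga) + 1 / len(gb)))
        q[(a, b)] = abs(mean(ga) - mean(gb)) / se
    return q


# Hypothetical deposits keyed by account type (assumed values).
deposits = {
    "type_1": [10.0, 12.0, 11.0, 13.0],
    "type_2": [30.0, 34.0, 32.0, 36.0],
    "type_3": [29.0, 33.0, 31.0, 35.0],
}

Q_CRIT = 3.95  # assumed table value for k = 3 groups, df = 9, alpha = 0.05

q_stats = tukey_q(deposits)
significant = {pair: q > Q_CRIT for pair, q in q_stats.items()}
for pair, q in q_stats.items():
    print(pair, f"q={q:.2f}", "different" if significant[pair] else "not different")
```

The hypothetical values were chosen so that two groups have close means and the third is far from both, which is the kind of pattern a letter grouping such as the one in Table 6 is meant to summarize.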

Key information from the results
The mean effects of the current account and the corporate account on the bank are not significantly different from each other. The mean effects of the savings account and Account Types 4, 5 and 6 on the bank are not significantly different from each other. The current account and the corporate account attract more deposits than the other account types.