Random number datasets generated from statistical analysis of randomly sampled GSM recharge cards

In this article, random number datasets were generated from random samples of used GSM (Global System for Mobile Communications) recharge cards. Statistical analyses were performed to refine the raw data into random number datasets arranged in a table. A detailed description of the method and the relevant tests of randomness are also presented.



Data
The datasets are the table of random numbers in the raw Excel file and the same data grouped in four digits in the PDF file. The statistical tests for randomness indicate the level of confidence in the reliability of the data for any given purpose.

Experimental design, materials and methods
Every attempt to construct a random number table must ensure that the entries are independent across rows and columns. Furthermore, the data must not follow any observable pattern(s). See [2][3][4][5][6][7][8][9][10] for details on other methodologies and results. Used recharge cards of a GSM network operator were chosen because recharge cards are produced by strong computational algorithms that are programmed to generate random digits and numbers. The steps undertaken to obtain the table of random number datasets are listed below in detail.
Step 1: A random sample of used recharge cards from a particular GSM network was taken. A total of 380 cards were obtained, each carrying a 16-digit number. The number of occurrences of each digit (0-9) in each of the 16 digit positions (a-p) is tabulated to show the frequency distribution (Table 1).
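The tabulation in Step 1 can be illustrated with a minimal sketch. The function name `digit_frequencies` and the three sample PINs are hypothetical and are not taken from the dataset; the sketch only assumes that each card number is available as a 16-character string of digits.

```python
def digit_frequencies(pins):
    """Return a 10 x 16 table: counts[d][j] is how often digit d occurs at position j."""
    width = len(pins[0])
    counts = [[0] * width for _ in range(10)]
    for pin in pins:
        for j, ch in enumerate(pin):
            counts[int(ch)][j] += 1
    return counts

# Hypothetical 16-digit numbers used only for illustration.
sample_pins = ["1234567890123456", "1098765432109876", "5555000011112222"]
table = digit_frequencies(sample_pins)

positions = "abcdefghijklmnop"  # column labels a-p as used in the article
print("digit " + " ".join(positions))
for d, row in enumerate(table):
    print(f"{d:>5} " + " ".join(str(c) for c in row))
```

Each column of the resulting table sums to the number of cards, which is a quick check that no digit was dropped during tabulation.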
Step 5: Correlation among the columns was investigated by computing the Spearman rank correlation coefficients for pairs of columns. Randomness is indicated by zero or near-zero correlations, as shown in Table 7. The chi-square test of independence is conducted on each pair of columns to investigate whether there is association between them; the results are reported as p-values. The bold entries of Table 8 indicate association between the columns, and hence a small probability of randomness.

Table 9b. Goodness of fit test summary of the raw dataset.
Test statistic    Value
Chi-square        19.359
df                9
P-value           0.022

Table 10. The goodness of fit test for all the columns of the raw datasets.

Step 6: A chi-square goodness of fit test is conducted to investigate the random distribution of the digits (0-9), as shown in Tables 9a and 9b. P-values near zero imply a lower probability of randomness.
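The goodness of fit test in Step 6 can be sketched as follows, assuming the expected count is the same for every digit. The helper names `chi2_sf` and `goodness_of_fit` and the digit counts are hypothetical. The tail probability uses the standard finite-series form of the chi-square distribution for integer degrees of freedom, so only the standard library is needed; a statistics package would normally be used instead.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in x^j / (1*3*5*...*(2j+1)).
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)

def goodness_of_fit(observed):
    """Chi-square statistic, df and p-value against equal expected counts."""
    expected = sum(observed) / len(observed)
    stat = sum((o - expected) ** 2 / expected for o in observed)
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)

# Hypothetical counts of the digits 0-9 (not the counts behind Table 9a).
counts = [40, 35, 42, 38, 36, 39, 37, 41, 34, 38]
stat, df, p = goodness_of_fit(counts)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.3f}")

# Tail probability for the statistic reported in Table 9b (19.359 on 9 df),
# for comparison with the reported P-value of 0.022.
print(f"{chi2_sf(19.359, 9):.3f}")
```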
Step 7: A chi-square goodness of fit test is conducted to check the random distribution of the digits (0-9) across the columns (a-p), as shown in Table 10. Higher p-values are desirable for randomness, irrespective of the values of the chi-square statistics.
Step 8: The residuals obtained in Step 7 for all the columns against the digits are tabulated (Table 11).
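A residual table of the kind described in Step 8 can be sketched per column as below. The sign convention (observed count minus expected count, so that a positive value marks a surplus digit) is an assumption inferred from the way the residuals are used in the next step; the function name `column_residuals` and the short sample column are hypothetical.

```python
def column_residuals(column):
    """Observed minus expected count for each digit 0-9 in one column.

    Assumed convention: positive = digit occurs too often, negative = too rarely.
    """
    expected = len(column) / 10
    counts = [0] * 10
    for d in column:
        counts[d] += 1
    return [c - expected for c in counts]

# Hypothetical column of 20 digits, so each digit is expected twice.
column = [0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 0, 1, 2, 3, 4, 5]
residuals = column_residuals(column)
for digit, r in enumerate(residuals):
    print(digit, r)
```

The residuals of a column always sum to zero, because every surplus of one digit is matched by a shortage of others.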
Step 9: Randomness is improved when the digits (0-9) are equally distributed in the columns (a-p). This step involves random manipulation of the numbers in each column, using the table of residuals as a guide. This can be done manually or computationally. For example, in column a: randomly remove 0 in 9 places, introduce 1 in 3 places, remove 2 in 13 places, and so on. This is repeated for the other 15 columns. The rationale is to achieve equal representation of the digits in random sampling, irrespective of the column.
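One possible computational reading of Step 9 is sketched below. It is an assumption, not the authors' procedure: the article describes removing and introducing digits, whereas this sketch overwrites randomly chosen occurrences of surplus digits with the digits that are short, so the column length stays fixed. With 380 rows the target is 38 occurrences per digit. The function name `balance_column` is hypothetical.

```python
import random

def balance_column(column, rng):
    """Return a copy of column in which every digit 0-9 occurs equally often."""
    n = len(column)
    if n % 10 != 0:
        raise ValueError("column length must be a multiple of 10")
    target = n // 10
    result = list(column)

    # Record where each digit currently sits.
    positions = {d: [] for d in range(10)}
    for i, d in enumerate(result):
        positions[d].append(i)

    # Randomly pick the surplus occurrences to overwrite (positive residuals).
    freed = []
    for d in range(10):
        surplus = len(positions[d]) - target
        if surplus > 0:
            freed.extend(rng.sample(positions[d], surplus))

    # List the digits that are short, once per missing occurrence.
    needed = []
    for d in range(10):
        needed.extend([d] * max(target - len(positions[d]), 0))
    rng.shuffle(needed)

    for i, d in zip(freed, needed):
        result[i] = d
    return result

rng = random.Random(0)  # fixed seed so the illustration is repeatable
raw = [rng.randrange(10) for _ in range(380)]  # hypothetical raw column
balanced = balance_column(raw, rng)
print([balanced.count(d) for d in range(10)])
```

Only as many entries are changed as the residuals require, so most of the original column is left untouched.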
Step 10: Analysis of variance (ANOVA) is performed to show the variation between and within the columns. If Step 9 is done correctly, the variation between the groups is expected to be zero, as shown in Table 12.
Step 11: The chi-square goodness of fit test is performed to show that the occurrences of the digits are equal in distribution and random (Table 13).
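The expectation that the between-group variation vanishes can be checked with a small sketch of the one-way ANOVA sums of squares. The columns here are hypothetical: 16 shuffled columns that each contain every digit exactly 38 times. Since every such column has the same mean of 4.5, the between-column sum of squares is zero. The function name `anova_sums_of_squares` is an illustrative choice.

```python
import random

def anova_sums_of_squares(columns):
    """Between-group and within-group sums of squares for a one-way layout."""
    values = [v for col in columns for v in col]
    grand_mean = sum(values) / len(values)
    means = [sum(col) / len(col) for col in columns]
    ss_between = sum(len(col) * (m - grand_mean) ** 2
                     for col, m in zip(columns, means))
    ss_within = sum((v - m) ** 2
                    for col, m in zip(columns, means) for v in col)
    return ss_between, ss_within

rng = random.Random(0)
columns = []
for _col in range(16):
    # Each digit 0-9 appears exactly 38 times, then the order is shuffled.
    col = [d for d in range(10) for _rep in range(38)]
    rng.shuffle(col)
    columns.append(col)

ss_between, ss_within = anova_sums_of_squares(columns)
print("between:", ss_between, "within:", ss_within)
```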
Step 12: Correlations among the columns are computed to verify that the associations among the various columns are weak (Table 14). To show the degree of randomness, the chi-square test of independence is performed on pairs of columns to check for independence or association. Higher p-values are desirable for randomness. It can be seen from Table 15 that all the p-values are greater than 0.05.
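The two pairwise checks used here, the Spearman rank correlation and the chi-square test of independence, can be sketched for one pair of columns as below. All function names and the two random columns are hypothetical. Tied digits receive average ranks, and only the chi-square statistic and its degrees of freedom are computed; the p-value would follow from the chi-square distribution with those degrees of freedom.

```python
import random

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def independence_chi2(x, y):
    """Chi-square statistic and df for the contingency table of two columns."""
    n = len(x)
    rows, cols = sorted(set(x)), sorted(set(y))
    observed = {(a, b): 0 for a in rows for b in cols}
    for a, b in zip(x, y):
        observed[(a, b)] += 1
    row_total = {a: x.count(a) for a in rows}
    col_total = {b: y.count(b) for b in cols}
    stat = 0.0
    for a in rows:
        for b in cols:
            expected = row_total[a] * col_total[b] / n
            stat += (observed[(a, b)] - expected) ** 2 / expected
    return stat, (len(rows) - 1) * (len(cols) - 1)

rng = random.Random(1)
col_a = [rng.randrange(10) for _ in range(380)]  # hypothetical column a
col_b = [rng.randrange(10) for _ in range(380)]  # hypothetical column b
print("Spearman rho:", round(spearman(col_a, col_b), 3))
print("chi-square, df:", independence_chi2(col_a, col_b))
```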
Step 13: The final data is a 380 by 16 table of random numbers.