Exploration of UK Lotto results classified into two periods

United Kingdom Lotto results are obtained from an urn containing a set of numbers, of which six winning numbers and one bonus number are drawn at each draw event. Prospective players continually seek analyses that may help them increase their chances of winning. In this paper, historical data of the United Kingdom Lotto results were grouped into two periods (19/11/1994–7/10/2015 and 10/10/2015–10/5/2017). The grouping reflects the increase in the lotto numbers from 49 to 59. Exploratory statistical and mathematical tools were used to obtain new patterns of winning numbers. The data can provide insights into the random nature and distribution of the winning numbers and may help prospective players in increasing their chances of winning the lotto.


Value of the data
• The data analysis provides a different approach to classifying the winning numbers of the UK lotto results [1][2][3][4][5][6].
• The data analysis can be extended to winning pairs and triples.
• The use of digital roots provides another avenue for studying probabilities of winning [7,8].
• The discovery of new patterns can encourage more players, thereby improving the economic conditions and welfare of the country [9][10][11].
• The data can be useful for educational purposes and for gambling researchers, number theorists, lotto operators, statisticians, journalists and so on.
• The method and analysis can be replicated for other lotto game results.

Data
The data for this study have been analysed to a certain extent, archived and updated at each draw in [1]. This data article contains data generated using an approach different from that in [1], and it is publicly available. The data were gathered on a draw-by-draw basis. The data are divided into two periods: period A, when the lotto numbers ran from one to forty-nine (19/11/1994-7/10/2015), and period B, when the lotto numbers ran from one to fifty-nine (10/10/2015-10/5/2017). The numbers of draws in periods A and B are 2065 and 166 respectively. The data obtained for periods A and B when the winning numbers are classified using certain number criteria are shown in Tables 1-4. The frequency distribution of the lotto winning numbers when they are classified according to their digital roots is shown in Table 6, and the lotto numbers that constitute each digital root are listed in Table 5. This article also introduces the use of the frequencies of digital roots in chi-square tests. Finally, simulated data showed the uniformity, randomness and non-normality of occurrence of winning numbers in the UK lotto game.

Table 1
The lotto single winning numbers classified in decimal (base 10). Columns: Numbers, Period A, Period B.
Remark: In period A, the class with the most single winning numbers is 31-40 and the class with the fewest is 41-50. Understandably, the last class contains only 9 numbers for period A. From the analysis so far, the classes 31-40 and 11-20 have occurred more frequently than the other classes.

Remark: The frequency of occurrence decreases with increasing multiples of a number for both periods.

Table 3
The lotto single winning numbers classified in odd and even numbers.
Remark: More single odd winning numbers were drawn in period A. However, odd and even single winning numbers were drawn with almost the same frequency in period B. Chi-square tests and t-tests may not be useful in confirming the result, since the possible odd winning numbers exceed the possible even numbers by one.

Table 4
The lotto single winning numbers classified in prime and non-prime numbers.
Numbers      Period A    Period B
Prime        3770        307
Non-prime    8620        689
Total        12,390      996
Remark: Prime numbers accounted for about 30% (3770 of 12,390) and 31% (307 of 996) of all the single winning numbers in periods A and B respectively.
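The classification criteria behind Tables 1, 3 and 4 (base-10 class, odd or even, prime or non-prime) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to build the tables: the function names and the sample draw are hypothetical and do not come from the data set.

```python
from collections import Counter

def is_prime(n: int) -> bool:
    """Trial-division primality check; adequate for numbers up to 59."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def decade_class(n: int) -> str:
    """Base-10 class label: 1-10, 11-20, ..., 51-60."""
    low = ((n - 1) // 10) * 10 + 1
    return f"{low}-{low + 9}"

def classify(numbers):
    """Tally a flat list of winning numbers under three criteria."""
    return {
        "decade": Counter(decade_class(n) for n in numbers),
        "parity": Counter("odd" if n % 2 else "even" for n in numbers),
        "prime": Counter("prime" if is_prime(n) else "non-prime" for n in numbers),
    }

# Hypothetical single draw (six main numbers), for illustration only.
sample_draw = [3, 14, 22, 31, 40, 47]
tallies = classify(sample_draw)

# Number of primes among the possible lotto numbers in each period.
primes_a = sum(is_prime(n) for n in range(1, 50))  # numbers 1..49
primes_b = sum(is_prime(n) for n in range(1, 60))  # numbers 1..59
```

Under this sketch, 15 of the 49 numbers in period A (about 30.6%) and 17 of the 59 numbers in period B (about 28.8%) are prime, which gives a baseline against which observed prime frequencies can be compared.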

Digital root
The digital root of a number is obtained by summing its digits repeatedly until a single-digit number remains [19][20][21][22]. Digital roots often reveal hidden patterns of distributions, as seen in [23][24][25]. The same idea can be applied to the lotto to look for hidden patterns in the distribution of winning numbers. The complete list of numbers grouped under their respective digital roots is shown in Table 5. The digital roots of the single winning numbers for periods A and B are shown in Table 6.
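The definition of the digital root and the kind of grouping listed in Table 5 can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch reproduces only the grouping of the possible lotto numbers, not the observed frequencies in Table 6.

```python
def digital_root(n: int) -> int:
    """Sum the decimal digits of n repeatedly until one digit remains."""
    while n >= 10:
        n = sum(int(digit) for digit in str(n))
    return n

def group_by_digital_root(max_number: int) -> dict:
    """Map each digital root 1..9 to the numbers 1..max_number that share it."""
    groups = {root: [] for root in range(1, 10)}
    for n in range(1, max_number + 1):
        groups[digital_root(n)].append(n)
    return groups

groups_a = group_by_digital_root(49)  # period A: numbers 1..49
groups_b = group_by_digital_root(59)  # period B: numbers 1..59
```

For example, `digital_root(49)` evaluates 4 + 9 = 13 and then 1 + 3 = 4, so 49 is grouped with 4, 13, 22, 31 and 40. The groups are not all the same size: in period A, digital roots 1 to 4 each cover six numbers and roots 5 to 9 each cover five.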

Chi-square test of independence
The Pearson chi-square test is conducted to determine whether the observed values conform to theoretical expectations. The expected frequencies in the chi-square test follow the uniform distribution. Details on the chi-square test and other tests can be found in [26][27][28][29][30][31]. This paper introduces the use of the frequencies obtained from the digital roots of the numbers, instead of the frequencies of all the numbers, in the chi-square test of independence. This approach was compared with the traditional procedure that uses the frequency data in totality. The results of the chi-square tests for periods A and B using Table 6 are shown in Tables 7 and 9, while the decision rules based on different confidence levels are shown in Tables 8 and 10.
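In symbols, the comparison of observed digital-root frequencies with equal expected frequencies can be written as follows. The notation $O_i$, $E$ and $N$ is introduced here for illustration and is not taken from the source; $O_i$ is the observed frequency of digital root $i$ and $N$ is the column total.

```latex
\[
E = \frac{1}{9}\sum_{i=1}^{9} O_i = \frac{N}{9},
\qquad
\chi_{\mathrm{cal}} = \sum_{i=1}^{9} \frac{(O_i - E)^2}{E}
\]
```

With nine digital-root classes the statistic has 8 degrees of freedom. Assuming the column totals of Table 6 equal the totals reported in Table 4, this gives $E = 12{,}390/9 \approx 1376.67$ for period A and $E = 996/9 \approx 110.67$ for period B.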
For period A, the expected value was obtained from Table 6 as the sum of all the values in the Period A column divided by 9.
The statistical hypotheses are stated as follows: the null hypothesis implies independence, while the alternative implies otherwise.
$\chi_{\mathrm{cal}} < \chi_{\mathrm{sig}}$: accept the null hypothesis (independence);
$\chi_{\mathrm{cal}} > \chi_{\mathrm{sig}}$: accept the alternative hypothesis (association);
$\chi_{\mathrm{cal}} = 2.593494056$ (from Table 7).
The decision rule for the different levels of significance of the chi-square test for period A is shown in Table 8.

Table 6
The frequency distribution of the single winning numbers classified according to their digital root for periods A and B.
For period B, the expected value was obtained from Table 6 as the sum of all the values in the Period B column divided by 9.
The statistical hypotheses are stated as follows: the null hypothesis implies independence, while the alternative implies otherwise.
$\chi_{\mathrm{cal}} < \chi_{\mathrm{sig}}$: accept the null hypothesis (independence);
$\chi_{\mathrm{cal}} > \chi_{\mathrm{sig}}$: accept the alternative hypothesis (association);
$\chi_{\mathrm{cal}} = 8.758306$ (from Table 9).
The decision rule for the different levels of significance of the chi-square test for period B is shown in Table 10.
The statistical decision is based on comparing the calculated chi-square statistic with the tabulated value at the relevant degrees of freedom and level of significance. The comparison indicated that the distribution of the winning numbers of the UK lotto is consistent with randomness, especially at high confidence levels. This suggests that the UK lotto game is fair.
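The calculation and decision rule described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the example frequencies are invented and are not the Table 6 values, and the three significance levels are illustrative choices that may differ from those in Tables 8 and 10. The critical values are the standard upper-tail values for 8 degrees of freedom.

```python
def chi_square_uniform(observed):
    """Pearson chi-square statistic against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Upper-tail critical values of the chi-square distribution with 8 degrees
# of freedom (9 digital-root classes minus 1), from standard tables.
CRITICAL_DF8 = {0.10: 13.362, 0.05: 15.507, 0.01: 20.090}

def decide(chi_cal, critical=CRITICAL_DF8):
    """Apply the decision rule from the text at each significance level."""
    return {
        alpha: "accept null" if chi_cal < chi_sig else "accept alternative"
        for alpha, chi_sig in critical.items()
    }

# Hypothetical digital-root frequencies (roots 1..9); NOT the Table 6 values.
observed_example = [104, 95, 98, 107, 100, 93, 102, 96, 105]
chi_example = chi_square_uniform(observed_example)

# Statistics reported in the text (Tables 7 and 9).
decision_a = decide(2.593494056)
decision_b = decide(8.758306)
```

Both reported statistics fall below the smallest of the three critical values used here (13.362), so the sketch returns "accept null" at every level for periods A and B.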

Simulation analysis
Monte Carlo simulation was used to generate 20,000 simulated results using the discrete uniform distributions for periods A and B. The results are shown as histograms in Figs. 1 and 2.
The simulation results revealed uniformity in the frequency distributions of the lotto numbers; hence, the winning numbers do not appear to cluster around any specific value. However, the extreme values 1, 49 and 59 seem to deviate from uniformity. This is one of the major drawbacks of the Monte Carlo simulation used to generate those results.
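A simulation of this kind can be sketched with Python's standard library. The source does not state which software or sampling scheme was used, so everything here is an assumption: the sketch draws 20,000 single values independently from the discrete uniform distribution on 1..49 or 1..59, and the function name and seed are hypothetical.

```python
import random
from collections import Counter

def simulate_counts(max_number, n_draws=20_000, seed=0):
    """Draw n_draws values uniformly from 1..max_number and tally them."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return Counter(rng.randint(1, max_number) for _ in range(n_draws))

counts_a = simulate_counts(49)  # period A: numbers 1..49
counts_b = simulate_counts(59)  # period B: numbers 1..59

# Expected count per number under exact uniformity.
expected_a = 20_000 / 49
expected_b = 20_000 / 59
```

`random.randint` includes both endpoints, so in this sketch the values 1, 49 and 59 are drawn with the same probability as every other number. The tallies in `counts_a` and `counts_b` can be plotted as bar charts or compared with `expected_a` and `expected_b`.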