Survey on machine leaning based game predictions

In the world wide millions of people interested on games and competitive matches. The stakeholders stand for one team and produce the sponsorship to the players. Huge amount of money transferred from one hand to other hand. So that stakeholder wants to select a good players into his teams. Here Machine Learning based multi variant regression algorithms used to calculate the progress of each player based on previous datasets to predict the performance at on-going match. To extract the features from on-going match characterized with learned datasets by implementing the Support Vector Machine (SVM), Gaussian Fit-chime (GAU) and KNN algorithms which perform the optimal classification on trained datasets. Feature selection and game predictions are become critical analytical process. The performance of the model effected and produces the outcome based on the feature selection. In this process some irrelevant variables removed to reduce the burden of algorithms and input datasets dimensions. This process speed up the dataset learning using various algorithms to produce the game predictions. The machine learning models mostly preferred algorithms to implement in feature selection are Linear Regression, Decision Tree Regression, Random Forest Regression and Boosting Algorithm like Adaptive Boosting (AdaBoost) Algorithm. In this paper we discussed about how to predict the game score based on trained datasets using various algorithms on Machine Learning platform.


Introduction
With technology innovation developing increasingly more progressed over the most recent couple of years, a top to bottom securing of information has gotten generally simple. Thus, Machine Learning is be-coming a significant pattern in sports examination due to the accessibility of live just as historical data. Sports analytics is the way toward gathering past match's information and examining them to separate the basic information out of it, with an expectation that it encourages in decision making [1]. Decision making be anything including which player to purchase during a closeout, which player to set on the field for the upcoming match, or something more key assignment like, constructing the strategies for forthcoming matches dependent on players' past performances. Machine Learning can be utilized affectively over different events in sports, both on-the-field and offthe-field. At the point when it is about on-the-field, AI applies to the investigation of a player's wellness level, plan of strategies, or choose shot choice. It is additionally utilized in anticipating the  [3]. Then again off-the-field situation concerns the business point of view of the game, which incorporates understanding deals design (tickets, product) and doling out costs in like manner. In recent days sport analysis and prediction also become a business for stakeholder .The sponsors and stakeholders invest lot of money consequently they want to select good team for participation in the competitions [4]. The tracking technologies also introduced to observe the each and every activity including fitness of a player. These kinds of systems helped to many stakeholders to select good team on bases of sponsorship. Due to technology up gradation presently sports analysis need of machine learning on game predictions [5] [6]. A predictive unsupervised leaning model introduced and constructed on historical data to predict the game based on the Naïve Bayes Classifier. Another research introduced a artificial neural networks to predict the game results.ML also implemented to extracting the prediction from on-going match for that researchers applied the prediction on previous dataset and calculated based on SVM, Gaussian Fitchime (GAU) and K-Nearest Neighbors ( .The Linear Regression and Random Forest algorithms also came on the game predictions [20].

Framework of game prediction
The game prediction on the cricket is most noticeable with huge datasets in coverage of more number of years. Here the Ml base predictions applied on single player and team to assess the on-going game winning strategy [8]. The objective of our proposal has twofold. First, we need to identify the features set which created major impacts on the result of games in the CRICKET. Second, when the feature set is known, we will use that data and use ML algorithm to fabricate a prediction model. For this situation, supervised learning appears to be the most fitting technique for such a goal. In supervised learning, the information will incorporate a training dataset with independent factors, for assist, steal, and free throws made[9][10]. Every one of these factors shows the group's capacities against a relevant variable (the result of past games). A short time later, the point is to foresee the result factors by applying a model from recorded cases (subordinate factors just as obvious objective variable qualities). This model will be used to gauge the objective variable incentive in a concealed game (test information). In this exploration venture, the attention was on factors identified with groups, players and rivals, for example,   3 explained the steps of complete machine learning based prediction on data model. Initially the system need to ready with primitive data sets of statistics and the features extracted .By implementing the label on data sets machine learning algorithms applies the supervised learning strategies to produce the predicted result [19]. If we want to predict a model initially system need to learned dataset by either supervised or unsupervised learning algorithms. Consider a model y=f(x) to predict this model we need to create the dataset's=((x1 ,y1),(x2 ,y2),(x3 ,y3),…..(xn ,yn)).Here the output(y) type also a key point. Based on the output type only algorithms operate on D. Supervised learning perform the regression which produce output in continuous value based and classification which produce the discrete kind of values [21][22][23]. In the dataset various columns created for to predict the performance of a single player. The columns are number boundaries, number of catches, number matches played, number wickets, Number of over's played and average strike rate. The multivariate regression model produce the out y based on above six mentioned attributes [18]. y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 (1) Here β weight value of each attributes. The total team weight prediction evaluated based on each individual payer strength which gained by outcome y. The total team weight T w given as: total appearance (2)

Feature selection and learning models
Feature selection and game predictions are become critical analytical process. The performance of the model effected and produces the outcome based on the feature selection. In this process some irrelevant variables removed to reduce the burden of algorithms and input datasets dimensions. This process speed up the dataset learning using various algorithms to produce the game predictions [13] [14]. The machine learning models mostly preferred to implement in feature selection given as: Mean while some errors also rectified with respect to algorithms and the value of error vary based on performance of the algorithm [15]. The errors are probably Mean Absolute Error(MAE),Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) [16].

Methodology
In our project after preprocessing the data we have 9 out of 15 and rows of 76,014. Then we have removed few batting teams and bowling teams so has to consider only consistent teams which were presently playing the match in data pre-processing itself. consistent _teams = ['Kolkata Knight Riders', 'Chennai Super Kings', 'Rajasthan Royals', 'Mumbai Indians', 'Kings XI Punjab', 'Royal Challengers Bangalore', 'Delhi Daredevils', 'Sunrisers Hyderabad'] The above teams were only considered out of some other teams then dataset was reduced to 53811 rows and 9 columns. Then we were removing the first five over's such that at least 5 over's data is required for good prediction. So after removing these rows we get the 40108 rows and 9 columns. Then we applied one hot encoding replacing with numerical data since before the data is categorical then data is replaced with 0's and 1's which is easy to predict the final score. Then based on this numerical data we have spitted the data into test train splitting. this splitting is done based on time since data set is a time series kind .Then test train splitting is done >2017 is taken has test and remaining from 2008 -2016 taken as training. Training set: (37330, 21) and Test set: (2778, 21). Using logistic regression we were getting very low prediction like 4.8956083513318935.

Conclusions
In this paper we discussed about game prediction based on the trained datasets of a player. By implementing of Machine Learning algorithms the stakeholder can select a good team based on the players previous performance result which analyzed by algorithms. The technology usage on game prediction succeeded in some cases but not in all cases. In the real time scenario this could helps to predict game winners no obvious. In the entire game time our feature extraction will not balance the concept game prediction. In the future I will continue my research on game predictions and analyze the better way of algorithms implantation for game predictions.