Score Prediction of Sports Events Based on Parallel Self-Organizing Nonlinear Neural Network

This paper introduces the basic concepts and main characteristics of parallel self-organizing networks and analyzes and predicts parallel self-organizing networks through neural networks and their hybrid models. First, we train and describe the law and development trend of the parallel self-organizing network through historical data of the parallel self-organizing network and then use the discovered law to predict the performance of the new data and compare it with its true value. Second, this paper takes the prediction and application of chaotic parallel self-organizing networks as the main research line and neural networks as the main research method. Based on the summary and analysis of traditional neural networks, it jumps out of inertial thinking and first proposes phase space. Reconstruction parameters and neural network structure parameters are unified and optimized, and then, the idea of dividing the phase space into multiple subspaces is proposed. The multi-neural network method is adopted to track and predict the local trajectory of the chaotic attractor in the subspace with high precision to improve overall forecasting performance. During the experiment, short-term and longer-term prediction experiments were performed on the chaotic parallel self-organizing network. The results show that not only the accuracy of the simulation results is greatly improved but also the prediction performance of the real data observed in reality is also greatly improved. When predicting the parallel self-organizing network, the minimum error of the self-organizing difference model is 0.3691, and the minimum error of the self-organizing autoregressive neural network is 0.008, and neural network minimum error is 0.0081. In the parallel self-organizing network prediction of sports event scores, the errors of the above models are 0.0174, 0.0081, 0.0135, and 0.0381, respectively.


Introduction
With the mature application of neural network technology and the rapid development of Internet, the research of data prediction has emerged. Data prediction often faces some noisy, messy, and nonlinear data. Neural networks have good robustness, [1]. is article briefly introduces the main clustering algorithms in data prediction, conducts an indepth study, and optimizes the SOM learning algorithm, which improves the network iteration to a certain extent. e speed of training layer expansion self-organizing mapping network (GHSOM) was studied, the gray relational analysis was introduced into the GHSOM network, and the GRAGHSOM algorithm was proposed.
e experimental results show that the GRAGHSOM algorithm reflects the importance of each component of the sample vector in the model in the process of high-dimensional data clustering and can perform clustering more accurately [2][3][4][5].
Because the fitting of a single neural network to a nonlinear dynamic system is generally computationally intensive, slow in convergence, time-consuming, and poor in generalization ability, a modular neural network chaotic parallel self-organizing network prediction method is proposed. at is, the phase space after the reconstruction of the chaotic parallel self-organizing network is divided into multiple subspaces, and an unbalanced subnet strategy is adopted. A neural network with a different structure is used in each subspace to calculate and track the local attractor trajectory, and the prediction result is finally passed into integrated module output [6]. is method not only makes use of the characteristics of self-organization and selflearning of neural networks but also greatly improves its computational performance compared with a single neural network. It can track the attractor trajectory more accurately in the subspace, thereby improving the overall prediction. Clustering refers to the grouping of objects in the database into meaningful subsets, so that the members in a class are as similar as possible, and the differences in members among different classes are as large as possible. Clustering has unsupervised learning capabilities and is widely used in many fields [7][8][9][10].
is article first reviews the commonly used BP neural network and self-organizing neural network in neural network models. en referring to the article, a new selforganizing difference theoretical model was established on the basis of self-organizing, and a better neural network structure was found through comparison. A better model was found by comparing the number of different classes and the prediction error of the hidden layers. e reference article combines the self-organizing neural network with the autoregressive model, improves the weight initialization and learning rate of the neural network, and finds a model with less error by comparing the number of different categories. In the article, the BP neural network is selected as the comparison model to evaluate the effect of the new model. e single-even network and multi-even network forecast models of the forecast system are studied. Matlab 6.5 programming is used to train the breakout forecast system based on the nonlinear neural network, using the simple, practical, and easy-to-program Visual C++ software for interface design, and a visual forecasting system based on nonlinear neural networks was developed. Experiments have proved that the use of a breakout prediction system based on a nonlinear neural network for breakout prediction can improve the prediction accuracy.

Related Work
erefore, in recent years, the prediction method of chaotic parallel self-organizing network based on a neural network has obtained preliminary theoretical and application research results. However, on the one hand, most of the current researches are focused on the optimization of the network structure itself and the improvement of the two independent processes of phase-space reconstruction parameter selection, without considering the two as a whole. e reconstructed phase umbrella is divided into subspaces in order to track the attractor's trajectory with higher precision locally.
Torrents et al. [11] believe that the phase space of the system can be reconstructed from the parallel self-organizing network of a variable, and all the information of the entire dynamic system is contained in the chaotic parallel self-organizing network itself. e embedding theorem of Grossberg [12] further proves the idea and provides a theoretical basis for predicting the chaotic time series. However, how to construct the prediction model according to the phase reconstruction theory, select the reconstruction phase-space parameters, and predict model parameters is a key problem in the prediction of chaotic parallel self-organizing networks. For this reason, many nonlinear prediction methods for predicting chaotic parallel self-organizing networks have been proposed. Among them, the neural network-based prediction method has distributed processing, self-organization, and self-adaptation. e excellent characteristics of self-learning columns and fault tolerance can better deal with uncertain and nonlinear prediction problems, so more research and applications have been obtained in the prediction of chaotic parallel self-organizing networks. Mohammadi et al. [13] used the neural network global method to predict the chaotic parallel self-organizing network. e results proved that when the MLP neural network is used as a global prediction method to predict the chaotic parallel self-organizing network, the effect is better than most local prediction methods. RBF networks have a strong biological background and are often used in the prediction of chaotic parallel self-organizing networks. Researchers are committed to improving the hidden layer node function and number selection of classic RBF networks. e initial global prediction methods for chaotic parallel self-organizing networks mostly used feed-forward neural networks. ere are many types of feed-forward networks, mainly including multilayer perceptrons (MLP) neural networks and radial basis functions neural network, nonlinear neural network (WNN), and fuzzy neural network (FNN). A representative MLP network is Park [14], which uses a multilayer feedforward network to simulate Lorenz chaotic sequences, where m represents phase-space reconstruction. e embedding dimension and its conclusion illustrate the important role of neural network in the prediction of chaotic time series. Yuan and Li [15] studied the gradient changes of the parallel self-organizing network. e experimental results show that the prediction performance of the classic RBF network has been improved [16]. e researchers proposed a cross-validation subspace method to select the optimal number of hidden layer nodes. is method is used in the prediction of noise-containing chaotic parallel self-organizing networks. e simulation object is the chaotic sequence of radar sea echoes. Nonlinear neural network is a new type of feedforward neural network based on nonlinear theory. Nonlinear neural network is compatible with the superiority of nonlinear and neural networks [17][18][19][20].

Construction of Sports Event Score Prediction
Model Based on a Parallel Self-Organizing Nonlinear Neural Network

Parallel Self-Organizing Architecture Nesting.
According to the different objects to be studied, the behaviors of parallel self-organizing networks are very different. Even for the same object, there will be different performances at different times. erefore, in order to treat different time series in different ways, parallel self-organizing networks can be classified according to the way of data selection and the form of data changes over time. Figure 1 shows the nested parallel self-organizing architecture. When using the combination of difference weighted average to predict the parallel self-organizing network, the self-organizing neural network to the historical data is first used and the input data are divided into several categories. e target value is calculated according to the formula to calculate the weighted average of the difference. When predicting the behavior of new data, first classify it according to the trained self-organizing neural network and then add the weighted average corresponding to the class of the data to the last value of the new data, which is the prediction corresponding to the new data.
First, we must establish a self-organizing neural network to classify the known data. e known data exist in the form of vectors. e elements of each vector are the parallel selforganizing network values selected from the parallel selforganizing network at a certain time interval. And all vector elements are selected in the same way. Let the dimension of the vector be l. e network weights of the initial self-organizing neural network are randomly selected. e risk value of the network is obtained by adding the risk value of the host When all the known data are input to the self-organizing neural network, the optimal neuron is obtained, and the weight vector of the optimal neuron and its surrounding neurons are adjusted, and the self-organizing neural network completes one step of training. en, the data are re-input into the selforganizing neural network for the next step of training, until the optimal neuron corresponding to each input vector in the two trainings before and after does not change, and the selforganizing neural network completes the training. Taking the weight vector of the neurons in the competitive layer after training as the clustering center, the data can be classified finally.
e neurons receive the data passed by the neurons in the previous layer and adjust the data according to their own transformation functions. We perform the transformation and then pass the transformed data to the next layer of neurons. e last part receives the data of the last layer of hidden layer neurons and calculates the final result and error of this input.

Nonlinear Neural Network Node
Output. Structurally, the characteristics of nonlinear neural networks are as follows: (1) distributed storage: that is, all neurons of the neural network jointly complete the information processing function, and each neuron has an independent function to process data; (2) parallelism: that is, the ability of all neurons of the neural network to complete information processing concurrently; and (3) fault tolerance: a single neuron's error in information processing will not affect the entire system. In terms of ability, the neural network can spontaneously react to external input and output data and makes appropriate adjustments to its own network parameters according to the  Computational Intelligence and Neuroscience 3 input and output samples to adapt to the new input samples. Each neuron in the input layer stores an element of the input vector, and all neurons assign data values to the discriminant function of the neurons in the competition layer through the connection with the data according to the discriminant function, select the best matching neuron by comparing the value of the discriminant function of each neuron, and modify the network connection weight parameter according to the input data. e connections represent a certain distance between neurons in the competition layer. e distribution of this weight in one or two dimensions is fixed and will not change during training. Figure 2 shows the distribution of nodes in a nonlinear neural network. In a forward network without feedback, once a signal passes through a neuron, the neuron's processing process ends. In an interconnected network, signals have to be repeatedly transmitted between neurons, and the network is in a dynamic state of constant change. Starting from a certain initial state, after several changes, it will reach a certain equilibrium state. According to the structure of the network and the characteristics of neurons, the operation of the network may also enter periodic oscillations or other equilibrium states such as chaos. is parallel self-organizing network can also be called the aggregate index parallel selforganizing network, which is a statistical sequence arranged in time, and the data content is the statistical value of the same phenomenon or characteristic value. From the absolute parallel self-organizing network, the overall level of the phenomenon to be studied in a certain period can be observed, and it can also reflect the level of the phenomenon at a certain point in time. erefore, the absolute parallel selforganizing network can be further divided into time series and time-point series according to the time selection method. e changes in the value of the curvilinear parallel self-organizing network are also inclined, but the tendency is not fixed, the magnitude and direction of the change may change, and the change of the value is not periodic. e linear parallel self-organizing network value changes have a certain tendency; that is, within a long time range, the value of the parallel self-organizing network gradually rises or decreases, and an upward or downward trend can be observed.

Quantification of Score Prediction in Sports Events.
e normalization of artificial neural network sports score prediction requires proper processing of data to meet the special requirements of artificial neural network for data. According to different neuron transfer functions, many artificial neural network models only accept numerical data in the range of [0, 1] and [−1, 1]. erefore, the data must be scaled down to this interval. Scalar numerical data are generally evenly distributed in a certain range and can be directly mapped to the interval [0, 1]. If the distribution of numerical data is not uniform, you can use piecewise linear equation or logarithmic equation to transform the data, and then scale down d, N to specify the interval. Discrete data are represented by encoding them with 0 and 1, or assigning a value to them in a specified continuous interval. Numerical data represented by vectors or arrays can sometimes be processed in groups; that is, the vector as a whole is subjected to regularization processing. ere are several regularization methods, the most commonly used is to calculate the square root of the sum of the squares of the elements, and then it is used to remove each element. e second method is to first find the sum of all the elements, and then it is used to divide each number. In this case, the sum of the elements after regularization is 1.0, and the value of each element represents their contribution to the group. e third method is to remove each element with the maximum value in the vector. Figure 3 shows the quantitative processing of score prediction for sports events. erefore, all of the above factors can be converted into time factors, and the parallel self-organizing network analysis method can be used to mathematically model the time factor, that is, the value of the score of a sports event.
rough the historical value of the score of a sports event, it is possible to discover the change of the score of a sports event over time and predict the future development of sports event scores. On the other hand, the score of sports events is a relatively open data, which is relatively easy to obtain, which also provides convenience for research. e parallel self-organizing network of sports scores is a parallel selforganizing network with the value of sports scores as data elements.
e scores of sports events can be taken every hour, every day, or other time as the parallel self-organizing network of sports scores. In competition, a vector is used as the input of the neural network. Each unit of the competition layer will form a discriminant function according to the weight of its own network and calculates the discriminant function with the input vector as the value of the independent variable, and the function value is the best. e neuron becomes the best matching neuron in this input. Other neurons surrounding the best matching neuron within a certain range will participate in the cooperation and become the neuron to be adjusted in the process of selfadaptation. Adaptive neurons within the topological neighborhood of the optimal neuron must modify their own network connection weights according to the value of the input vector to achieve the discriminative mode of these neurons.

Recursive Prediction Model Weight
Factor. Data preprocessing is an important step in the data prediction process. e quality of the data directly affects the quality of the mining effect. Data preprocessing includes data cleaning, data integration, data conversion, and data reduction. e task of data cleaning is to eliminate noise or inconsistent data. e inconsistency of the data will reduce the credibility of the data prediction results. e risk value of the host is obtained by adding the risk value caused by the vulnerabilities of the related state nodes. e database construction module is responsible for processing the system configuration information and vulnerability information obtained by the scanning module, extracting their nonquantitative attributes and quantifiable attributes, forming an information database and a vulnerability database; it is also responsible for analyzing the relationship among vulnerabilities. Mining was performed to form a vulnerability database. e data used for data prediction may come from multiple actual systems, so there is a problem of heterogeneous data conversion. Because there may be many inconsistencies between the data of multiple data sources, data integration is not a simple copying process. Data transformation is the transformation or unification of data into a form suitable for mining, including data generalization, normalization, attribute construction, and smoothing. Data reduction includes operations such as data aggregation, attribute selection, and data compression. Figure 4 shows the clustering of the weight factors of the prediction model.
Multilayer perceptron is a popularization form of the single-layer perceptron. e information of the multilayer perceptron network is forwarded layer by layer, and each unit of the lower layer is connected to each unit of the upper layer. e input unit operates layer by layer according to the input/output relationship, and the connection weight between each layer can be adjusted by learning rules. It can be seen that the multilayer perceptron is actually a design of multiple single-layer perceptrons through appropriate combination, which can realize any shape division. In this method, each discrete class is assigned a value from 1 to N represented by a binary number. For example, if a variable has 32 possible values, it can be represented by a binary vector of length 5. As long as the discrete value is arbitrary Computational Intelligence and Neuroscience and does not have any order, binary code is a better representation. However, when the discrete value is converted to a binary code, the bit value varies greatly. For example, the binary code of the seventh element in a discrete class is 000111, and the binary code of the eighth element is 001000.
Hamming distance is a measure of the similarity between two binary numbers, expressed by the number of bits of different symbols in two codes. In this case, the Hamming distance between 7 and 8 is 4. If you want the neural network to think that the two input modes of 7 and 8 are similar, you must choose the thermometer code.

Parallel Self-Organizing Nonlinear Neural Network Data
Preprocessing. e nonlinear neural network selected in this paper has a hidden layer, and every 10 pieces of data are used as a set of inputs. In the training phase, a matrix composed of 500 groups of 10-dimensional input vectors is selected as the input matrix, and the matrix composed of the target value corresponding to each group of input vectors is used as the output matrix. e parameters are adjusted until the error reaches the target value of training times, stop training. In the prediction stage, the new data are used as the input vector, the corresponding predicted target value is calculated according to the adjusted BP neural network, and the actual target is compared to evaluate the performance in Figure 5.
For the measured parallel network, we calculate the ratio of pseudo-neighbors from the minimum embedding dimension. When the ratio of embedding dimension m to pseudo-neighbors is less than 5%, pseudo-neighbors no longer increase with embedding dimension. When it decreases, it can be considered that the singular attractor is fully expanded, and m at this time is the minimum embedding dimension. Generally speaking, when the distribution of phase-space points is relatively sparse, it is better to use the pseudo-neighbor point method, because this method can automatically adjust the distance scale with the sparse changes of the phase-space points so that the statistical results are more reliable. When the phase-space points are densely distributed, it is more reasonable to use the saturated correlation dimension method.
To quantify the risk value of the state node, it is necessary to calculate the probability that the attacker can successfully attack to reach each state node. It can be seen that the average relative error of the PCA-GABP prediction model is both the smallest. e number less than 3% is 39, which is 2 more than the number where the absolute value of the relative error of the GABP model is less than 3%. After comparing and analyzing the data in the article, it can be concluded that the model whose input data have been processed by PCA has a higher prediction accuracy than the unprocessed model; at the same time, regardless of whether the input data has been processed by PCA, the BP neural network improved by GA prediction accuracy is significantly higher.

Realization of the Simulation of the Score Prediction Model for Sports Events.
Also taking the sports event factor as an example, the GA nonlinear neural network has 48 inputs; that is, there are 48 neurons in the input layer of the network, and the output is also 48; that is, there are 48 neurons in the output layer of the network. Using the floating-point coding scheme, each connection weight or threshold is directly represented by a real number, and each individual contains a network ownership value and threshold. e weight of a network is composed of two parts.
Finally, the asset damage value caused by the related vulnerabilities is multiplied by the asset value of the host node to quantify the risk. is article has 48 input neurons and 48 output neurons. Assuming there are 36 hidden layers, plus the thresholds of the hidden layer and the output layer, the code length of each individual is 48×36 + 36 × 48 + 36 + 48 � 3540. We input the training samples in the article into the prediction model, and after many debugging, it is finally determined that there are 36 nodes in the hidden layer. e simulation results of the GABP neural network model are as described in the text. Figure 6 shows the score forecasting training situation of sports events.
If the total input of the neuron is far away from the threshold, the actual output of the neuron is either the maximum value or a small value due to the saturated nonlinear characteristics of the neuron. When the total input of a neuron is too far above the threshold, it is said to enter the saturation zone. When the total input falls into the saturation zone and the actual output contradicts the target output, it is incorrect saturation, that is, false saturation. At this time, there needs to be a large amount of modification to the weight, but in fact, because the derivative value is close to zero at this time. e amount of modification of the derivative weight is very small. e same is true for hidden layer neurons, but the error is calculated by error back propagation. Once learning enters the false saturation state, it will take a long time to get rid of this state or even unable to find the global optimum. Table 1 is the description of hidden layer neuron training.
Calculated by the pseudo-neighbor method, the minimum embedding dimension m is 9, the DRNN model is constructed, the recursive delay is 1, and the same 9 × 21 × 1 nonlinear neural network is constructed. e first 2000 points among the 3000 points are used as training data, and the last 1000 points are used as prediction data. Since the predicted curve is relatively close to the actual curve, it is  Computational Intelligence and Neuroscience difficult to distinguish it from the graph, so only the absolute error P(1) of the DRNN model and the nonlinear neural network prediction are given. When network administrators find unknown vulnerabilities, they need to integrate the vulnerability information and submit it to the national information security vulnerabilities by e-mail. In the training phase, the network is used to layer the input data into several categories and calculate the weighted average difference corresponding to each category of data. In the prediction stage, the trained neural network is used to divide each group of input data into a certain classification area, and then, the prediction is made based on the calculated weighted average difference. e article lists the results of the self-organizing difference weighted average method to predict the results of the parallel self-organizing network of the sports event score, that is, when the number of historical data classifications is different.

Example Application and Analysis.
In the experiment, we set the output as y(t + 1) and then input as x(t − 4). e network has 5 nodes, and the time interval between each neuron is 1. In the experiment, 2500 data are generated on each chaotic parallel self-organizing network of the benchmark, and the whole of them is divided into three parts: training data, verification data, and test data. e ratio of the amount of data of the three parts is 14 : 6 : 5, that is, 1,400 training data, 600 verification data, and 500 prediction data. It can be seen that the hidden layer nodes should be twice the input layer, that is, 10 nodes.
en, the staff will analyze the vulnerabilities, and only after several working days can the information such as the category and hazard value of the vulnerabilities, be published on the official vulnerability platform. e output layer has only one neuron, that is, the predicted value of the stock closing index on the next day. e structure of the recurrent neural network is 51. e DRNN model in this paper is greatly improved relative to the nonlinear neural network model, and it can be seen that the fitting degree of the true exponential curve is also better than the fitting degree of the nonlinear neural network model. It can be seen that the error of the recursive model in this article is concentrated between (−20, 20), the error value is small, the error of the nonlinear neural network model is between (−80, 40), the error value is large, and some points deviate greatly from the true value, which is caused by the high degree of the feedforward network. Figure 7 shows the output effect of the sports event score prediction network.
is will not only cause a huge workload for the vulnerability management staff, but, more importantly, the network administrators will not be able to obtain information about unknown vulnerabilities in a timely manner. Because these feature variables may have high correlations, these features are redundant features. e results show that although the overall detection rate is very high, the U2R class is almost undetectable, and the R2L detection rate is relatively low. ere is no doubt that this is because the number of samples is too small, which makes it difficult to build a classifier for these attack classes. e error detection rate is high, and the two types are mostly detected as normal types, which are absolutely not allowed in IDS. To avoid this, deduplication is performed before training. In order to improve the performance of the classifier without greatly changing the number of samples in the data set, this article increases the number of U2R samples in the data set to ensure the basic classification performance. e system structure adopts the client/server structure. e server is responsible for scanning the hosts and devices in the network to obtain the configuration information and vulnerability information of all the hosts in the network; the client is responsible for collating the scanning results to form various databases. On this basis, certain mathematical methods are used to measure the overall risk assessment value of the network and determine the risk level. erefore, in addition to filtering irrelevant features, the optimal feature subset must also filter redundant features. Figure 8 shows the convergence of the sports event score prediction network.
ere are N individual individuals to form the initial population. e value of the population number N will affect the efficiency of the entire algorithm. If the value of N is too large, the calculation amount will increase, and the running speed will be too slow. If the value of N is too small, although the operating speed can be increased, the diversity of the population is reduced to a certain extent, which may cause premature convergence, that is, premature maturity. According to experience, the value of N is generally between 10 and 200.
Feature selection methods can affect machine learning performance, model complexity, and model running speed. One of the more popular feature selection methods is to select n variables that are highly correlated with categorical variables. erefore, only the first six principal components need to be extracted under the condition that the loss of the original data is not too much information, so as to achieve the purpose of principal component extraction. It can be seen that the prediction points with larger relative errors are significantly reduced, which is 0.5% less than the maximum relative error of the nonlinear neural network, After taking the absolute value of 48 relative errors, 4 points have an error of more than 5%, 4 points are between 4% and 5%, 3 points are between 3% and 4%, and 8 points are between 2% and 2%, and the improved BP neural network has achieved better prediction results.

Conclusion
By reviewing relevant materials and literature, we outline data prediction and neural network related research. e artificial neural network data prediction, data transformation, data representation, and data preprocessing are studied. We made a more in-depth study on the use of artificial neural network methods for clustering, based on the indepth study of the SOM clustering algorithm, optimized the SOM learning algorithm, and then studied the layer expansion self-organizing mapping network (GHSOM), and put forward the GRAGHSOM algorithm that introduces gray relational analysis. Regional competitiveness is a typical nonlinear complex giant system, and it is difficult for traditional linear methods to obtain satisfactory prediction results. At the same time, the experiment proves that this model has great advantages in pattern recognition. e hidden layer quantum neuron of the nonlinear model uses the linear superposition of the nonlinear basis function as the activation function. At the same time, the thesis gives the learning algorithm of a nonlinear neural network. However, in feature selection, combining variables that are highly correlated with categorical variables may not necessarily improve the performance of the classifier. e neural network method can handle systems that are difficult to describe with mathematical models. It has strong parallel processing, self-adaptation, self-organization, and arbitrary approximation to nonlinearity. e experimental results show that the GRAGHSOM algorithm reflects the importance of each component of the sample vector in the model in the process of high-dimensional data clustering and can perform clustering more accurately; for the application of local sports score prediction (LWF) in training, we studied the perceptron neural network and successfully applied the perceptron neural network to the local sports event score prediction system.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e author declares that there are no potential conflicts of interest in this article. Computational Intelligence and Neuroscience 9