Volleyball Data Analysis System and Method Based on Machine Learning

Since the reform and opening up, China's economy has grown rapidly and living standards have improved steadily. As a result, people have more time to pay attention to their health, which has promoted the rapid development of the country's sports industry, a trend that was further strengthened by the successful hosting of the 2008 Beijing Olympics. As sports develop, they generate large amounts of information, and machine learning can process and analyze this data well. Based on this, this article aims to help volleyball players and coaches better understand the technical structure and technique of spiking. The paper examines the characteristics of muscle activity over time and uses machine learning methods to analyze a large amount of volleyball data. In this experiment, 12 volleyball players from a college of physical education were selected. According to their physical fitness and skills, they were divided into a group using the two-arm preswing technique (type A) and a group using the technique without a two-arm preswing (type B). The study focuses on the volleyball spiking action: the representative front-row position-4 strong attack and the back-row position-6 attack were selected for comparison, and the process from the take-off stage to the air shot stage, among the four stages of the spike, was analyzed through kinematic, kinetic, and surface electromyography parameters. The experiments show that, for type A, the integrated EMG sum value of the left gluteus maximus differs significantly between front-row and back-row spikes (P < 0.05). The discharge of the left gluteus maximus during the front-row spike is greater than during the back-row spike. This difference is mainly reflected in the push-off stage and the air shot stage.
This shows that volleyball data analysis has broad prospects for exploration and application and can create considerable social and economic benefits. Kinematic analysis is a demanding research topic and will remain part of future sports data analysis, with both academic value and broad practical significance.


Introduction
When large amounts of text and image information are involved, extracting useful information for volleyball data analysis by hand is unrealistic because the workload is enormous and the process is slow. Using the computing power of a computer to complete the task is a far more effective solution; machine learning can then be used to analyze volleyball data and can be applied in sports and other fields [1]. In every case, the key to volleyball data analysis is kinematics analysis and industrial analysis.
At present, the most mature industrial analysis is the complete analysis of surface electromyography.
With the adoption of the scoring method in which every ball scores a point, volleyball games have become more intense, faster paced, more confrontational, and more exciting than before, attracting a large number of loyal fans. As the intensity of the competition increases, the requirements for the physical fitness, skills, and tactics of volleyball players are getting higher and higher. In a volleyball game, both sides engage in fierce offensive and defensive confrontations at the net. Because the direction of the ball is uncertain, players must use different movement skills during the game. In volleyball, the spike is the most spectacular and entertaining technique.
Machine learning is playing an increasingly important role in data analysis [2], and it has been applied to many practical problems at home and abroad. Zhao et al. pointed out that the rapid development of DNA microarray technology provides a wide range of data sources, which is expected to pave the way for better prediction and diagnosis of cancer and for the identification of key targets for drug development. DNA microarray data have been analyzed using statistical analysis as well as machine learning and data mining methods. They conducted a comprehensive review of the machine learning methods that have been used on acute lymphoblastic leukemia chip data. Following research conducted by the Leukemia Research Group in Child Biology and Medicine, machine learning has been used to enhance the diagnosis and subtype classification of cancer, develop new treatments, and accurately identify the risk stratification of patients. These methods have been used in the four main areas of microarray data analysis: gene selection, clustering, classification, and pathway analysis. However, their research did not explain the advantages and disadvantages of each machine learning algorithm [3]. Charlton et al. proposed that the support vector machine (SVM) classification algorithm, a recent development in the machine learning community, has proved its potential in structure-activity relationship analysis. In a benchmark test, the support vector machine was compared with several machine learning techniques currently used in the field. The classification task involved using data obtained from the UCI machine learning repository to predict the inhibition of dihydrofolate reductase by pyrimidines. The support vector machine performed better than three artificial neural networks, a radial basis function network, and a C5.0 decision tree. SVM is significantly better than all of these.
The only exception is a neural network with manually controlled capacity, whose training time is much longer. However, the experimental results lack sufficient data support, so whether the machine learning mechanism can predict the inhibitory effect of pyrimidines on dihydrofolate reductase remains doubtful [4]. Astuti noted that the Internet of Things (IoT) continues to expand the current Internet by providing connectivity and interaction between the physical world and the network world. In addition to increased volume, the IoT produces big data characterized by velocity, dependence on time and location, a variety of standards, and varying data quality [5]. Intelligent big data processing and analysis is the key to developing intelligent IoT applications. That article uses smart cities as the main use case for evaluating various machine learning methods that address the data challenges of the Internet of Things. However, the research does not clearly address how to evaluate different machine learning methods, and the overall research lacks data support [6].
This article combines kinematics analysis and biomechanical analysis to analyze volleyball data in terms of kinematics, dynamics, and electromyography. It aims to give volleyball players and coaches a more in-depth understanding of the technical structure of certain volleyball movements and of the activity characteristics of the related muscles during these movements. At the same time, it aims to provide more targeted guidance data for athletes' physical and technical training. Front-row and back-row volleyball spikes are selected for a comprehensive analysis, and a preliminary, more in-depth description of the action is given from the perspective of biomechanics.

Application of Machine Learning in Volleyball Data Analysis System
2.1. Technical Structure of Volleyball Spike. The purpose of the approach is to bring the athlete closer to the ball in the horizontal dimension, to choose a suitable take-off position, and to help increase the height of the jump. The purpose of the jump is to bring the athlete closer to the ball in the spatial dimension, and the ultimate goal is to reach a proper spiking position. The approach and the jump are related to each other and together bring the athlete closer to the ball [7,8]. In a volleyball game, the athlete's posture directly affects the speed of the spike, so the run-up and take-off should both be fully used to prepare for a quick shot. After completing the spike, the athlete falls from the air and must land skillfully to avoid injury. Good landing skills can reduce the impact on the knee and ankle joints. Figure 1 shows a series of action diagrams of the volleyball spike.

Machine Learning-Related Algorithms. Machine learning addresses the problem of how to construct a learning algorithm that automatically improves as it acquires experience or information and that performs tasks such as acquiring knowledge, making predictions, making decisions, or building models based on given input data [9,10]. Data can be viewed as a collection of information containing the relationships between related variables. The complete set of all possible patterns may be too large to be covered by the information in the training data. Therefore, the difficulty of machine learning lies in generalizing well from the observed data set so that useful models can be produced for new data [11,12]. Many machine learning algorithms have been successfully applied to a wide range of scientific and engineering problems. In typical scenarios, the output of a machine learning algorithm can be a quantitative result [13]. Figure 2 is a flowchart of machine learning. In supervised learning, variables are divided into two groups: explanatory variables and dependent variables. The goal of supervised learning is to determine the relationship between the explanatory variables (input) and the dependent variable (output) and to generate a function that maps the input to the output [14,15]. Some supervised algorithms can be used for classification and prediction tasks. In these cases, the paired labeled training samples are represented as follows:

$$\{(x_i, y_i)\}_{i=1}^{n} \tag{1}$$

Here, $x_i$ is the input and $y_i$ is the output. Supervised learning needs to learn a prediction function $f$ that maps $x_i$ to $y_i$ and compares $f(x_i)$ with $y_i$ to calculate the error rate of the prediction function [16,17]. Two types of machine learning methods are introduced below: matrix factorization algorithms and subspace segmentation algorithms.
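The error rate of a prediction function described above can be illustrated with a minimal sketch. The toy samples, the threshold classifier, and the function name `error_rate` are hypothetical choices made for this illustration; they are not taken from the study.

```python
from typing import Callable, Sequence, Tuple


def error_rate(f: Callable[[float], int],
               samples: Sequence[Tuple[float, int]]) -> float:
    """Fraction of labeled samples (x_i, y_i) for which f(x_i) != y_i."""
    if not samples:
        raise ValueError("need at least one labeled sample")
    mistakes = sum(1 for x_i, y_i in samples if f(x_i) != y_i)
    return mistakes / len(samples)


# Hypothetical 1-D toy data (not from the paper): pairs of (input, label).
toy_samples = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1), (0.45, 1)]


def threshold_classifier(x: float) -> int:
    """Hypothetical prediction function: predicts 1 when x exceeds 0.5."""
    return 1 if x > 0.5 else 0


# One of the five samples (0.45, labeled 1) is misclassified.
toy_error = error_rate(threshold_classifier, toy_samples)
```

In practice the error rate would be estimated on data held out from training, since the text stresses generalization to new data.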
(1) Matrix factorization algorithm. Matrix factorization has been successfully applied in many fields, and there are many kinds of decomposition methods. For any matrix $A \in \mathbb{R}^{m \times n}$, there is a singular value decomposition

$$A = U \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} V^{T}, \tag{2}$$

where $U$ and $V$ are orthogonal matrices, $D = \operatorname{diag}(\sigma_1, \sigma_2, \cdots, \sigma_r)$, $\sigma_i > 0$ $(i = 1, 2, \cdots, r)$ are the singular values of matrix $A$, and $r = \operatorname{rank}(A)$ is the rank of matrix $A$. For the data matrix $A \in \mathbb{R}^{m \times n}$, which includes $n$ samples and $m$ features, with $a_i$ denoting the $i$-th sample (column) and $\bar{a}$ the sample mean, the total covariance matrix of the samples is as follows:

$$S = \frac{1}{n} \sum_{i=1}^{n} (a_i - \bar{a})(a_i - \bar{a})^{T}. \tag{3}$$

The purpose of the principal component analysis algorithm is to maximize the covariance after projection, and its objective function can be expressed as follows:

$$\max_{W} \operatorname{tr}\left(W^{T} S W\right) \quad \text{s.t.} \quad W^{T} W = I, \tag{4}$$

where the constraint $W^{T} W = I$ prevents the covariance from increasing indefinitely. Assuming that the rank of $S$ is $r$, there is

$$S W_i = \lambda_i W_i, \quad i = 1, 2, \cdots, r. \tag{5}$$

Given that $\lambda_1, \lambda_2, \cdots, \lambda_d$ are the $d$ largest eigenvalues of Equation (5), the corresponding eigenvectors are $W_1, W_2, \cdots, W_d$. Therefore, with $W = [W_1, W_2, \cdots, W_d]$, the low-dimensional feature $Z$ for any data $X$ can be expressed as follows:

$$Z = W^{T} X. \tag{6}$$

(2) Subspace segmentation algorithm. Real-world data can be viewed as approximate samples drawn from a mixture of multiple low-dimensional subspaces [18,19]. Given a data matrix $A \in \mathbb{R}^{m \times n}$, including $n$ samples and $m$ features, the objective function contains an adjustable parameter $\lambda > 0$, and it can be optimized by using the nuclear norm or the L1 norm.
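As a minimal sketch of the principal component analysis step in item (1) above, the following NumPy code builds a covariance matrix, takes the eigenvectors of its largest eigenvalues, and projects the data. Several choices here are assumptions rather than details from the article: the 1/n normalization of the covariance, centering the data before projection, the function name `pca_project`, and the random demo matrix.

```python
import numpy as np


def pca_project(X: np.ndarray, d: int):
    """Project the columns of X (m features x n samples) onto d principal axes.

    Returns (W, Z, eigvals): W is m x d with orthonormal columns, Z is the
    d x n low-dimensional representation, eigvals are the d largest eigenvalues.
    """
    if X.ndim != 2:
        raise ValueError("X must be a 2-D array of shape (m, n)")
    m, n = X.shape
    if not 1 <= d <= m:
        raise ValueError("d must satisfy 1 <= d <= m")
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean                      # centering is an assumption of this sketch
    S = (Xc @ Xc.T) / n                # covariance matrix, 1/n normalization assumed
    eigvals, eigvecs = np.linalg.eigh(S)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:d]      # indices of the d largest
    W = eigvecs[:, order]
    Z = W.T @ Xc
    return W, Z, eigvals[order]


# Hypothetical demo data: 3 features, 50 samples drawn at random.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(3, 50))
W_demo, Z_demo, top_eigvals = pca_project(X_demo, d=2)
```

`np.linalg.eigh` is used because the covariance matrix is symmetric. The variance of each row of `Z_demo` equals the corresponding eigenvalue, which is the quantity the objective function maximizes.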

Wireless Communications and Mobile Computing
Here, $\|A\|_{*} = \sum_{i} \sigma_i(A)$ is the nuclear norm of matrix $A$, and $\|E\|_{1} = \sum_{ij} |E_{ij}|$ is the L1 norm of matrix $E$. Assuming that the data is noise-free, the subspace clustering method can express the data matrix $X$ as a dictionary $D$ multiplied by the low-rank representation matrix $A$ through dictionary learning; the objective function of the subspace clustering method can then be expressed as follows [20,21]:

$$\min_{A} \operatorname{rank}(A) \quad \text{s.t.} \quad X = DA, \tag{9}$$

where $\operatorname{rank}(A)$ is the rank of $A$. Equation (9) can be relaxed by using the nuclear norm to constrain $A$, which gives

$$\min_{A} \|A\|_{*} \quad \text{s.t.} \quad X = DA. \tag{10}$$

Real data sets contain a lot of noise, so the subspace clustering method can be extended to a robust form:

$$\min_{A, E} \|A\|_{*} + \lambda \|E\|_{2,1} \quad \text{s.t.} \quad X = DA + E. \tag{11}$$
Here, $\lambda > 0$ is a balance parameter, and $\|E\|_{2,1} = \sum_{j} \sqrt{\sum_{i} E_{ij}^{2}}$ is the $\ell_{2,1}$ norm of matrix $E$.
(1) Support vector machines. Among the countless straight lines that can separate the two classes, the ideal one lies exactly midway between them, at the same distance from the nearest points of each class. The support vectors are the points with the smallest perpendicular distance to the dividing line [22,23], as shown in Figure 3.
This line is at the same distance from both classes. Writing the line as $w^{T} x + b = 0$, the distance from any point $x_0$ to the line is

$$\frac{|w^{T} x_0 + b|}{\|w\|}. \tag{12}$$

Then, $w$ and $b$ are normalized so that the linearly separable training set $(x_i, y_i)$, $i = 1, 2, \cdots, n$, $x_i \in \mathbb{R}^{m}$, $y_i \in \{-1, +1\}$, satisfies the following formula:

$$y_i \left(w^{T} x_i + b\right) \geq 1. \tag{13}$$

At this point, the distance between the two sides of the dividing line is equal to $2/\|w\|$. The ultimate goal is to maximize this distance, which is equivalent to minimizing $\|w\|^{2}$. The resulting classification surface is the optimal classification surface [24,25]. For the information $(x_n, y_n)$ of $N$ training points, it can also be written as shown in Equation (14):

$$\min_{w, b} \frac{1}{2} \|w\|^{2} \quad \text{s.t.} \quad y_n \left(w^{T} x_n + b\right) \geq 1, \quad n = 1, 2, \cdots, N. \tag{14}$$
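Although the objective function is clearly expressed, it is difficult to compute directly, so the Lagrangian multiplier method and a series of mathematical operations are used to obtain the final objective function [26]:

$$\max_{a} \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m y_n y_m x_n^{T} x_m. \tag{15}$$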
The constraints are $a_n \geq 0$ for all $n$ and $\sum_{n=1}^{N} a_n y_n = 0$. Ideally, every classification boundary would be a straight line, but in practice it may be a curve, a plane, a curved surface, or a higher-dimensional surface [27,28]. The radial basis kernel function takes vectors as input and returns a scalar computed from the distance between them. The specific formula is shown in Equation (16):

$$k(x, x') = \exp\left(-\frac{\|x - x'\|^{2}}{2\sigma^{2}}\right). \tag{16}$$
This is the Gaussian version of the radial basis kernel function. The $\sigma$ in the formula controls how quickly the function value decays to 0. The kernel function maps the input data to an infinite-dimensional space. The radial basis kernel function depends strongly on its parameters, and training is very time-consuming [29,30].
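To make the role of σ concrete, the following is a minimal sketch of the Gaussian radial basis kernel in the standard form exp(-||x - y||² / (2σ²)). The function name `rbf_kernel` and the sample points are illustrative assumptions, not part of the article.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Gaussian radial basis kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Hypothetical points at Euclidean distance 5 from each other.
origin = [0.0, 0.0]
point = [3.0, 4.0]
narrow = rbf_kernel(origin, point, sigma=1.0)  # small sigma: value is close to 0
wide = rbf_kernel(origin, point, sigma=5.0)    # large sigma: value decays slowly
```

A smaller σ makes the kernel value fall toward 0 faster as the two inputs move apart, which is why the classifier is sensitive to this parameter.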

Experimental Design of Volleyball Data Analysis System
3.1. Test Subject. Twelve volleyball players from a physical education college were selected. According to their actual physical abilities and skills, they were divided into type A and type B groups, with 6 people in each group. None of the 12 subjects suffered serious sports injuries before or after the experiment, and their basic physical condition was essentially the same at the beginning of the experiment. The basic conditions of the subjects are shown in Table 1. No subject had a history of injury or smoking. Before the test, the whole test process and the precautions were explained to the subjects in detail, and they agreed to participate in this study voluntarily. This experiment mainly studies the volleyball spiking action and selects the representative front-row position-4 strong attack and back-row position-6 strong attack for comparison and analysis. The research mainly analyzes the process from the take-off stage to the air shot stage.

Testing Process. The experiment was performed indoors and completed in one day. The test equipment was arranged according to the site layout in the experimental design, and the test officially began after debugging was completed. Subjects were required to be bare-chested and to wear tight shorts. This made it easier to attach the electrode sheets for the electromyography test and to analyze the 3D images after the experiment. A dedicated experimenter, following the EMG test plan, helped each subject attach the electrode sheets, connect the wires, and fix them with a bandage. Afterwards, the subject was asked whether the bandage was uncomfortably tight or restricted the normal range of limb movement. When everything was normal, the subject practiced spiking in the test field to become familiar with the action. An action was considered qualified if both feet were on the platform composed of four force plates at take-off and the shot neither hit the net nor went out of bounds. A volleyball expert was specially invited to evaluate each subject's 3 qualified smashes, whose quality was recorded as excellent, medium, or poor. This study selects the best movement for analysis. Finally, the instruments were put away and checked for damage or omissions, the experimental site was cleaned, and the measured data were copied out. The experimental process is shown in Figure 4.

Test Site. The experimental site layout is shown in Figure 4. As the figure shows, the 4 force plates in the center of the site are laid flat in a recess; they are fixed and cannot be moved, so the other instruments and equipment are arranged around the force plates. The direction the subject faces is the positive direction, which is the direction of the approach and spike. The layout is described relative to the subject's left, right, front, and back. To the subject's left are the force plate and EMG operation area and a fill light aimed at the force plates. To the subject's right is the three-dimensional high-speed camera operation area: two high-speed cameras are placed at the right rear and right front of the spiking subject, with an angle of about 90° between their main optical axes, and two fill lights, one in front and one behind, are placed near the force plates and aimed at them. The fill lights are positioned so that they do not block the high-speed cameras. Directly in front is a volleyball net, which can be moved to adjust the distance for front-row and back-row smashes. The area behind the subject is empty, so that the subject can choose a suitable approach distance. Since the domestic high-speed cameras cannot be synchronized with the force plates and the EMG, this experiment synchronizes only the force plates and the EMG: the two sets of instruments are connected through a synchronization device, and the force plate triggers the synchronization signal for synchronous measurement. The two high-speed cameras were synchronized manually by means of an arm movement. After the equipment is connected and warmed up, the subject is placed in the test area to simulate the test process, during which the camera angles are adjusted to ensure that the subject stays within the test range and that the images from the take-off stage to the air shot stage can be recorded completely. Finally, the three-dimensional frame calibration is carried out before the formal test begins. The layout of the experimental site is shown in Figure 5.

Statistical Processing. Kinematics data processing: export the high-speed video, use video synthesis software to synchronize the video from the two cameras [31], trim the synchronized video to the span from the take-off stage to the air shot stage, analyze it with video analysis software, and export the TSV files. Then, use the QTOOLS software and the EXCEL office software to process the exported raw data and obtain parameters such as time, joint angle, link speed, and displacement.
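A joint angle such as the knee angle can be computed from three marker positions. The sketch below shows one common way to do this, using the angle between the two segment vectors that meet at the joint. The marker coordinates and the function name `joint_angle_deg` are hypothetical, and the article does not state how its software computes joint angles.

```python
import math
from typing import Sequence


def joint_angle_deg(proximal: Sequence[float],
                    joint: Sequence[float],
                    distal: Sequence[float]) -> float:
    """Angle at `joint` (degrees) between the segments to `proximal` and `distal`."""
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("markers must not coincide with the joint")
    cosine = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cosine))


# Hypothetical 3-D marker positions in meters: hip, knee, ankle.
hip = (0.00, 0.00, 0.90)
knee = (0.05, 0.00, 0.50)
ankle = (0.00, 0.00, 0.10)
knee_angle = joint_angle_deg(hip, knee, ankle)
```

With this convention a fully extended leg gives 180°, so a smaller angle at the moment of maximum buffering means deeper flexion.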
Kinetic data processing: use the Kistler software to screen and export the test results, use the EXCEL software for calculation and statistical processing, and obtain parameter indexes such as force value, impulse, and power.
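The impulse mentioned above is the time integral of force. As a minimal sketch, the code below approximates it from a sampled force trace with the trapezoidal rule. The sampling rate, the force values, and the function names are hypothetical; the article does not describe the exact procedure used in its spreadsheet calculations.

```python
from typing import Sequence


def impulse_trapezoid(force_n: Sequence[float], sample_rate_hz: float) -> float:
    """Approximate the impulse (N*s) of a uniformly sampled force trace."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    if len(force_n) < 2:
        raise ValueError("need at least two samples")
    dt = 1.0 / sample_rate_hz
    return sum((force_n[i] + force_n[i + 1]) * 0.5 * dt
               for i in range(len(force_n) - 1))


def peak_force(force_n: Sequence[float]) -> float:
    """Largest force value in the trace (N)."""
    return max(force_n)


# Hypothetical vertical force trace sampled at 1000 Hz: a constant 1000 N for 0.1 s.
demo_force = [1000.0] * 101
demo_impulse = impulse_trapezoid(demo_force, sample_rate_hz=1000.0)  # about 100 N*s
demo_peak = peak_force(demo_force)
```

Dividing the peak force by body weight would give a relative force value, which allows athletes of different mass to be compared.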
EMG data processing: the test results are processed with the Mega software, and the ASCII file of each muscle's integrated EMG is exported. The EXCEL software is then used for calculation and statistical processing to obtain the integrated EMG sum value, the contribution rate, and other parameter indexes.
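The two EMG indexes named above can be sketched as follows. This code assumes that the integrated EMG is the time integral of the rectified signal and that a muscle's contribution rate is its integrated EMG divided by the total over all measured muscles; the article does not state its formulas, so both definitions, the signal values, and the function names are assumptions.

```python
from typing import Dict, Sequence


def integrated_emg(signal_uv: Sequence[float], sample_rate_hz: float) -> float:
    """Integrated EMG: sum of the rectified signal multiplied by the sample interval."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    dt = 1.0 / sample_rate_hz
    return sum(abs(v) for v in signal_uv) * dt


def contribution_rates(iemg_by_muscle: Dict[str, float]) -> Dict[str, float]:
    """Each muscle's share of the total integrated EMG, in percent (assumed definition)."""
    total = sum(iemg_by_muscle.values())
    if total <= 0:
        raise ValueError("total integrated EMG must be positive")
    return {name: 100.0 * value / total for name, value in iemg_by_muscle.items()}


# Hypothetical raw EMG samples in microvolts at 1000 Hz (made-up numbers).
raw_signals = {
    "left gluteus maximus": [10.0, -20.0, 30.0, -40.0],
    "right rectus abdominis": [5.0, -5.0, 5.0, -5.0],
}
iemg = {name: integrated_emg(sig, 1000.0) for name, sig in raw_signals.items()}
rates = contribution_rates(iemg)
```

Because the rates are normalized by the total, they sum to 100% and describe the relative rather than the absolute activity of each muscle.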

Experimental Volleyball Data Analysis System

Kinematics Results and Analysis
(1) Results and analysis of the take-off phase First, perform a kinematics analysis on the take-off phase. Table 2 shows the joint angle of the lower limbs at the maximum buffer time in the take-off phase.
It can be seen from Table 2 that type A and type B follow the same pattern at the moment of maximum take-off buffering for both front-row and back-row spikes: there is no significant difference between the corresponding left and right angles, but the right hip angle > left hip angle, the right knee angle < left knee angle, and the right ankle angle < left ankle angle, which means that the right leg is cushioned to a greater degree and plays the major role in the cushioning process. For type A, the paired-sample t-test showed no significant difference between the left and right hip angles, the left and right knee angles, or the left and right ankle angles (P > 0.05); for type B, the right-side joint angles were mainly examined. The hip, knee, and ankle angles of the front-row spike were all greater than those of the back-row spike. In other words, the lower limb joints flex more in the back-row spike than in the front-row spike. For both front-row and back-row smashes, the corresponding joint angles satisfy type B < type A, indicating that type B has a larger cushioning range. As mentioned earlier, the buffering time of the type B technique is much longer than that of the type A technique, which shows that the type B technique has sufficient time for buffering. Since there is no double-arm preswing coordination, the buffering process depends entirely on the lower limbs, so the lower limb joints must flex more fully to store enough energy for the subsequent extension.
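The paired-sample t-test used above can be sketched with the Python standard library. The angle values below are invented for illustration only and are not the data in Table 2; the critical value 2.571 is the standard two-tailed 5% value of the t distribution for 5 degrees of freedom, which corresponds to a group of 6 subjects.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    if len(a) < 2:
        raise ValueError("need at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)          # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical knee angles (degrees) for the same 6 athletes in two conditions.
front_row = [150.0, 152.0, 148.0, 151.0, 149.0, 153.0]
back_row = [145.0, 146.0, 144.0, 147.0, 143.0, 148.0]

t_value, dof = paired_t_statistic(front_row, back_row)
T_CRITICAL_DF5 = 2.571                    # two-tailed, alpha = 0.05, 5 degrees of freedom
significant = abs(t_value) > T_CRITICAL_DF5
```

The test works on the within-subject differences, which suits this design because each athlete performs both the front-row and the back-row spike.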
(2) Results and analysis of the air shot stage Kinematics analysis is carried out on the air shot stage. The relevant indexes of the center of gravity at the moment of take-off include the rise angle of the center of gravity, the vertical speed of the center of gravity, its horizontal speed, and its resultant speed. The results are shown in Table 3.
It can be seen from Table 3 that, comparing the front-row and back-row spikes of the type A technique with the paired-sample t-test, the rise angle, horizontal speed, and resultant speed of the center of gravity at the moment of take-off are all significantly different (P < 0.05), while there is no difference in the vertical speed of the center of gravity (P > 0.05). The rise angle of the center of gravity in the type A front-row smash is about 67 degrees, which is much larger than the 53 degrees of the back-row smash. It can be seen that front-row smashes jump upward, whereas back-row smashes jump forward. This can also be seen from the horizontal speed of the center of gravity: the front row, at about 1.52 m/s, is lower than the back row, at 2.55 m/s. A high horizontal speed indicates a high approach speed. This shows that in a game, because the front-row smash takes place closer to the net, the spiker controls the approach speed to avoid touching the net and jumps as high as possible in order to gain height in front of the net and have more options for the smash, which makes the spike more threatening.
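The reported rise angles and horizontal speeds can be cross-checked with simple trigonometry. The sketch below assumes that the rise angle is the angle between the velocity of the center of gravity and the horizontal, so that vertical speed = horizontal speed × tan(angle); this definition is an assumption, and the derived values are illustrative rather than results reported in Table 3.

```python
import math


def implied_vertical_speed(horizontal_speed: float, rise_angle_deg: float) -> float:
    """Vertical speed implied by a horizontal speed and a rise angle (assumed definition)."""
    return horizontal_speed * math.tan(math.radians(rise_angle_deg))


def resultant_speed(horizontal_speed: float, vertical_speed: float) -> float:
    """Magnitude of the velocity vector from its two components."""
    return math.hypot(horizontal_speed, vertical_speed)


# Values quoted in the text for type A: front row 1.52 m/s at 67 degrees,
# back row 2.55 m/s at 53 degrees.
front_vy = implied_vertical_speed(1.52, 67.0)
back_vy = implied_vertical_speed(2.55, 53.0)
front_speed = resultant_speed(1.52, front_vy)
back_speed = resultant_speed(2.55, back_vy)
```

Under this assumption the implied vertical speeds are close to each other (roughly 3.6 m/s and 3.4 m/s), which is in line with the reported absence of a difference in vertical speed, while the horizontal components differ clearly.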
(3) Results and analysis at the moment of hitting The quality of the shot determines the success or failure of the attack. Regarding the shot, leaving aside various additional factors on the field, the most important indicator in this technique is speed. The results are shown in Figure 6.
It can be seen from Figure 6 that, comparing front-row and back-row smashes for type A with the paired-sample t-test, there is a significant difference in shoulder speed between the back-row smash and the front-row smash (P < 0.05). There is no difference in elbow speed, wrist speed, fingertip speed, or ball speed. The shoulder speed at the moment of impact can represent the speed of the trunk movement. The back row (3.60 m/s) is greater than the front row (2.48 m/s).
For type B, the shoulder speed of the back row is also significantly greater than that of the front row. Comparing the two technique types on this basis, types A and B do not differ in the corresponding front-row and back-row indexes, except for the back-row spiking speed. The type A back-row spiking speed is higher than the type B back-row spiking speed, which may be related to the center of gravity.

Kinetic Results and Analysis
(1) Comparative analysis of force-related parameters The force value statistics of the take-off link include the vertical force value and the horizontal plane force value. The results are shown in Table 4.
It can be seen from Table 4 that, for the type A technique, there is no significant difference in either the absolute or the relative peak force between front-row and back-row spikes (P > 0.05), but the peak push-off force of the back-row spike tends to be greater than that of the front-row spike. For the type B technique, the peak extension force differs little between front-row and back-row spikes, with the back row being larger than the front row. Comparing A and B, type B is smaller than type A for both the relative and absolute values of the front-row smash and the back-row smash. The reason is that the arm swing in the type A take-off stage greatly increases the reaction force of the ground on the human body.
It can be seen from Figure 7 that, for the type A technique, the buffer impulse during the take-off phase differs significantly between front-row and back-row smashes (P < 0.05). Compared with the back-row smash, the front-row smash has sufficient buffer braking during the take-off, which controls the forward speed of the body while the lower limb muscles stretch and store energy before the upward jump is completed. In the back-row smash, the time is short, so the buffering is not sufficient and the braking is not obvious. This reduces the loss of horizontal approach speed and momentum, so that the athlete can jump forward during the take-off and maintain a large horizontal speed.

EMG Results and Analysis
(1) Integrated EMG sum results and analysis Figure 8 shows the integrated EMG sum value of each muscle during the whole action from the take-off to the air shot.
It can be seen from Figure 8 that, for type A, the integrated EMG sum value of the left gluteus maximus differs significantly between the front and back rows (P < 0.05), and the discharge of the left gluteus maximus during the front-row spike is greater than during the back-row spike. This difference is mainly reflected in the push-off stage and the air shot stage. The integrated EMG sum value of the right rectus abdominis also differs significantly between the front and back rows (P < 0.05), and the discharge of the right rectus abdominis during the back-row spike is significantly larger than during the front-row smash. This shows that the right rectus abdominis plays a more obvious role in the back-row smash, which is determined by the characteristics of the type A technique in front-row and back-row smashes. On the whole, the discharge of the upper limb, lower limb, and trunk muscles during the front-row and back-row smashes of the type A technique is relatively balanced, the difference between front-row and back-row smashes is small, and the discharge of the left and right limbs is also relatively balanced. Comparing type A with type B as a whole, the type B technique shows a larger muscle discharge than the type A technique in both front-row and back-row spikes; the type B technique is inferior to type A in the balance of overall muscle discharge; and the type B technique shows a clearer division between primary and secondary muscles, while type A shows a clear balance.
(2) Contribution rate of myoelectric activity during the flight shot stage The contribution rate of each muscle's integrated EMG activity during the flight shot stage is shown in Figure 9.
It can be seen from Figure 9 that, comparing front-row and back-row spikes, there is a significant difference for the left gluteus maximus and the right rectus abdominis (P < 0.05) and no significant difference for the other muscles. For the gluteus maximus, the contribution rates of the left and right sides at this stage are relatively balanced in the front-row smash. In the front-row smash, the body arches backward and twists considerably in the air, and the degree of torsion of the back arch in the back-row smash is comparable. For the right rectus abdominis, the contribution in the back-row spike is significantly greater than in the front-row spike, which means that this muscle plays a greater role in the back-row spike than in the front-row spike during the forward bending of the body. On the whole, the contribution rate of the upper limb and trunk muscles is greater than that of the lower limbs: the former dominate and the latter are secondary. But this does not mean that the lower limbs are not important. In the process of the air shot, the back bow of the trunk is clockwise. Twisting and smashing the back

Conclusions
The vertical force curves of the spike jump for both front-row and back-row spikes are single-peak curves. The peak force appears at a certain moment in the stretch phase, and the curve resembles a plateau in the buffer phase. Through comparative analysis, this article finds that type B athletes must have strong muscle power in the lower limbs and good waist and abdominal strength. However, type B has shortcomings and cannot combine technique with physical fitness well. Therefore, this article recommends that type B athletes improve their technique and transition to type A so that, in light of their own conditions, they can release their potential and make the most of their energy in future competitions. This article comprehensively analyzes and compares the two types of technique in terms of kinematics, dynamics, and electromyography, summarizes some rules for the characteristics of front-row and back-row spikes, and identifies some differences. However, these rules and differences are established at the group level, and because individual differences in spiking technique are large, this article suggests that training should still include targeted exercises for each athlete to improve technical ability and sport-specific strength.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.