Research on Intelligent Scheduling Scheme of Aerobics Competition for Multi-Intelligent Decision-Making

Multi-intelligent decision-making is developing well at present. Built on artificial intelligence and a series of related technologies, it is applied in many fields, and the state attaches great importance to the development of this science and technology. At the same time, the sports industry is developing unevenly: even though the state values it, it still needs investment in science and technology. This paper studies the scheduling scheme of the aerobics competition in sports competition. By introducing the design scheme of a multi-intelligent decision-making system and improving the MFDRL-CTDE algorithm, the similarity between the action sequences of aerobics competitors and the standard action sequence is obtained. Three algorithms, the Markov decision process, the MFDRL-CTDE algorithm, and the improved MFDRL-CTDE algorithm, are used in simulation experiments, and the improved MFDRL-CTDE algorithm proves more effective and stable for aerobics competition decision-making.


Introduction
Our country is deeply invested in the development of science and technology. The application of intelligent decision-making systems has penetrated all walks of life and has also gained international recognition. Researchers have studied intelligent decision-making in depth, which greatly improves the performance of related decision-making algorithms and makes their applications more efficient. While supporting the development of science and technology, the country pays increasing attention to the development of sports, especially aerobics competition. The history of aerobics is not long, but it is loved by the broad masses of people, combining the performance of sports competition with aesthetic appeal. However, its development is unbalanced, so it is necessary to bring science and technology, and intelligent decision-making in particular, into aerobics competition, so that it can develop further.
In the process of studying intelligent decision support systems, many new methods have emerged [1]. According to the structural and system characteristics of a system, an improved model can be designed. The group is the foundation on which individual intelligence is organized, and the models and strategies discussed here are based on this characteristic; combined with a combination mechanism, the model can run efficiently. The decision-making process plays an important role in the interactive behavior of multi-intelligence combinations [2]. Accurate prediction cannot rely on probability or theory alone; game theory offers strong decision-making and identification capabilities. The Bayesian game model combines games with graphs and can handle the complex situations caused by multi-intelligent agent interaction. Compared with other intelligent decision information systems, this model has better decision efficiency. Because decision-making and evaluation are not carried out at the same time, their operation is affected by many factors. Nevertheless [3], improving the efficiency of group decision-making is extremely urgent. Therefore, the rational politics model has been put forward, with the purpose of improving the operational efficiency of group decision-making in the face of uncertainty. This model clarifies the relevant factors and, when evaluations, information, and opinions differ, obtains the best group decision-making scheme on the basis of a sequence framework. In real-time settings [4], such as the intensive care unit, doctors need information matched to the moment, and the key value of an intelligent decision support system lies in agents cooperating, as two or more people would, to complete tasks.
At present, a multiagent system architecture has been proposed to support decision-making over the clinical data and predictive status required by doctors. The artificial neural network algorithm is divided into two steps [5]. The first step uses the improved algorithm to calculate the relevant weight matrix. The second step applies the method to real system decision-making, such as a multi-intelligent robot soccer system. The final simulation results show that this technique can be used efficiently for decision-making in small- and medium-sized league systems. The multicriteria decision-making method can be applied effectively to aerospace systems [6] and helps decision makers resolve related conflicts. Faced with a variety of multicriteria decision-making methods, we need to choose the appropriate one, but this is itself a complex multicriteria decision-making problem. A proposal of 14 criteria can evaluate the feasibility of a multicriteria decision-making method, so that the preferences of decision makers can be better captured as information. To develop an intelligent knowledge system, the suitability index can be optimized; the aircraft selection problem demonstrates the effectiveness of the system well. To realize the intelligent decision support system UniComBOS [7], the verbal decision analysis paradigm is used. Within this scope, for multicriteria alternatives, the processing capacity of individual decision makers is considered, and on this premise, the decision makers' psychological preferences are accurately extracted. In the user interface, the decision maker's psychological preference is distinguished by graphic color, which provides a psychological comparison and can test whether the decision maker's answers are consistent.
Like conventional decision-making tools, this scheme decomposes the whole with the least number of criteria, so that the comparison range of decision makers can be reduced to a single criterion; criteria are then added until the advantages of the optimal substitute are found. However, in the case of multiple criteria, the system cannot find a single optimal alternative, only a group of alternatives. These alternatives are not comparable with one another, but they still have great advantages over those not selected. Literature [8] uses the EFQM excellence model to achieve business excellence, mostly for the self-evaluation of many non-large enterprises in Britain. This is in fact a problem of multicriteria decision analysis. Experiments have shown that, in order to achieve business excellence through action, an intelligent decision system is used to support program groups, which not only produces average scores but also yields numerical results and graphical comparisons. Society is now as interested in sports as in the development of science and technology [9]. In Korea, aerobic exercise is popular. However, it has recently been emphasized that sports need technology and competition, which gives the development of aerobic exercise an unbalanced trend; among its branches, aerobics is developing relatively rapidly. To find a new way forward for aerobics, it is important to understand its history and how it was introduced. Society's demand for aerobics, and its rapid change, are the keys to the development of aerobics, and development must proceed moderately; no field can afford to be too radical.
According to the data shown, the results of aerobics competitions need to be analyzed with various related techniques [10], which can reveal whether the scores given by the referees are reasonable and consistent. This makes the selection and scoring of referees more scientific and effective. Referees need to maintain an objective and fair attitude towards competition scoring, and rational evaluation can better develop and build the future of aerobics competition. Aerobics championships require movement analysis [11]; according to the rules, difficulties, and characteristics of the world aerobics competition, each group performs differently.
There are outstanding performances in dynamic strength and outstanding performances in balance and flexibility, and the most outstanding performers in each group differ. For the analysis of aerobics competition results, related research methods can be used, such as comprehensive omission research and variance analysis [12]. Through the experimental data, we can observe that the scores given by the judges are relatively consistent, and the performance displayed by the execution judges is better than that of the artistic judges. Therefore, every kind of aerobics competition needs a systematic scoring test program. Relevant technologies are used to simplify scoring and evaluate objectively; at the same time, participating members can question scores and seek arbitration. Based on a statistical analysis of the skills of the first six participants in a certain aerobics competition [13], and a summary of the techniques of music rhythm and aerobic exercise, step aerobics can be developed and its characteristics identified. Most importantly, this gives junior coaches more references for aerobics competition. For students' aerobics competitions, on-site observation, interviews, and video analysis of later competitions are also very important [14]. This strengthens aerobics education for students and provides a reference for the competition itself. Our country attaches great importance to the development of gymnastics and aerobics [15], and many related training competitions are held by national-level organizations. However, the future development of aerobics needs to be analyzed through the data literature, with reference to recent training, and changed to solve the problems on its road of development.

Basic Concepts.
Multi-intelligent decision-making refers to the collective integration of multiple intelligent units, so that a large system is composed of multiple agents. One of the main tasks of multi-intelligent decision-making is for decision makers to disassemble a comparatively complex system, which is itself a decision-making process. A complex system is divided into several small systems, and these small systems are then recombined. Note that these small systems interact, which is how we ensure that the whole remains interactive and thus improve working efficiency. Compared with other technologies, multi-intelligent decision-making has clear advantages, manifested in its independence, relevance, and application development. The classification of intelligent decision-making is shown in Figure 1.

Markov.
The sequential decision means that after the t-th iteration, the agent receives the state s of the environment and, on this basis, takes an action a. Because the agent and the environment interact, earlier actions also affect the environment, which gives the agent a reward R_{t+1} and drives it into a new state. It is precisely this interaction between agent and environment that generates sequences. The Markov decision process formalizes the above, and the Markov property is defined as follows:

P(S_{t+1} | S_t) = P(S_{t+1} | S_1, S_2, ..., S_t). (1)

That is, in a Markov process, the state of the (t + 1)-th iteration is related to the state of the t-th iteration.
The transition probability p from state s to its next state s′ is expressed as follows:

p(s′ | s) = P(S_{t+1} = s′ | S_t = s).

That is, in a Markov process, the state of the (t + 1)-th iteration is related only to the state of the t-th iteration.
The expected reward when state s moves to the next state s′ is as follows:

R(s) = E[R_{t+1} | S_t = s].

The total discounted return G from state s to the final state point is as follows:

G_t = R_{t+1} + γR_{t+2} + γ²R_{t+3} + ⋯ = Σ_{k=0}^{∞} γ^k R_{t+k+1},

where γ is the discount factor, a parameter used to express attenuation.
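As a concrete illustration of the return G, here is a short Python sketch (the function name is ours) that accumulates discounted rewards from the end of a finite episode backwards:

```python
def discounted_return(rewards, gamma=0.9):
    """Total discounted return G = R_{t+1} + gamma*R_{t+2} + ...
    for a finite episode, accumulated from the last reward backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Example: rewards [1, 1, 1] with gamma = 0.5 give 1 + 0.5 + 0.25 = 1.75
```

Iterating backwards avoids computing powers of γ explicitly: each step folds the remaining return into a single multiply-add.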

MFDRL-CTDE Algorithm
The main task of a fuzzy inference system is to fuzzify and defuzzify variables. Its core idea is that when a state is input, the variables are fuzzified, behavior is derived through the prescribed fuzzy rules, and finally the behavior is defuzzified and output. When the output variable is confined to [0, 1], the input fusion weight w_F of the system combines the cumulative average reward and the sample priority within the iteration period, and its normalization formula is as follows: where I is the total number of agents. However, the network parameters θ in the system will not equal 0 after many iterations, so the formula for updating the parameters is as follows:
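The normalization formula for w_F is not reproduced in the extracted text; the sketch below is only one plausible reading, assuming each agent's raw weight is the sum of its cumulative average reward and its sample priority, normalized over the I agents (all names and the additive combination are our assumptions):

```python
def fusion_weights(avg_rewards, priorities):
    """Hypothetical normalization of the fusion weight w_F: combine each
    agent's cumulative average reward and sample priority, then divide by
    the total over all I agents so the weights lie in [0, 1] and sum to 1."""
    raw = [r + p for r, p in zip(avg_rewards, priorities)]
    total = sum(raw)
    return [x / total for x in raw]
```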

Centralized Training and Decentralized Execution.
Each agent affects the value function that determines every other, which is indispensable in multi-intelligence deep reinforcement learning. Because of this, the Markov decision process does not work well in a multi-intelligence system, and hidden information is difficult for an agent to discover fully, so this information must be captured during training. Centralized training and decentralized execution exploit this point to help agents improve efficiency and, at the same time, store the available information from agent interaction in a shared experience pool, further improving efficiency. In centralized training, the value functions of the agents are related to joint behaviors rather than local ones, and they are trained with extra information. In decentralized execution, an agent's decision depends only on the information it has mined, rather than on all the complete information.
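The shared experience pool used during centralized training can be sketched minimally as follows (a generic ring buffer; the paper does not specify the internal layout, so the class and field names are ours):

```python
from collections import deque
import random

class SharedExperiencePool:
    """Shared replay buffer: every agent pushes its interaction tuples
    during centralized training, and batches are sampled from the common
    pool. A deque with maxlen discards the oldest experience when full."""
    def __init__(self, capacity=2000):
        self.buf = deque(maxlen=capacity)

    def push(self, agent_id, state, action, reward, next_state):
        self.buf.append((agent_id, state, action, reward, next_state))

    def sample(self, batch_size):
        return random.sample(list(self.buf), min(batch_size, len(self.buf)))
```

During decentralized execution, agents no longer read the pool; only their local policies are consulted.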

Optimized MFDRL-CTDE Algorithm.
Because of the low fitting ability of the value function in the initial stage, the MFDRL-CTDE algorithm cannot guarantee high efficiency. Therefore, in order to improve the efficiency of the algorithm and reduce its volatility, competitive DQN and prioritized experience replay are added. The algorithm flow is shown in Figure 2.

Action Selection Strategy.
In a typical action selection strategy, the parameter ε is constant or decreasing, so the agent may still execute random actions in the later stage, greatly reducing the convergence speed of the algorithm. To solve this, an optimized action selection strategy is needed:

a_t^i = a_random, if rand < ε; π_i(s_t^i), otherwise,

where π_i(s_t^i) is the best action strategy of the i-th agent in state s_t^i at the t-th iterative update, a_random is a randomly selected action, rand is a random number in [0, 1], λ controls the rate of descent of ε, t is the current number of training rounds, and T is the total number of training rounds.
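A sketch of such a decaying ε-greedy rule in Python; the exponential decay form exp(−λt/T) is our assumption, since the paper's exact expression for ε is not reproduced:

```python
import math
import random

def select_action(best_action, action_space, t, T, lam=40.0):
    """Optimized epsilon-greedy selection: epsilon decays with the training
    round t (decay form exp(-lam*t/T) is an assumption), so early rounds
    explore randomly while later rounds follow the learned policy pi_i."""
    eps = math.exp(-lam * t / T)
    if random.random() < eps:
        return random.choice(action_space)  # a_random
    return best_action  # pi_i(s_t)
```

At t = 0 the agent always explores (ε = 1); by t = T with λ = 40, ε is effectively zero and the learned best action is always taken.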

Competitive DQN.
In deep reinforcement learning, DQN is undoubtedly the most frequently used method, but it applies only to a single agent. Facing multiple agents, ordinary DQN obviously cannot keep up with the large number of state and action combinations of each agent. Therefore, for multi-intelligent systems, competitive DQN must be applied as the basic structure to improve the original structure and the efficiency of the algorithm.
Competitive DQN addresses the situation in which a state is insensitive to any action, so that no action affects the subsequent state. It decouples the values of state and action, which improves efficiency. The competitive DQN structure is shown in Figure 3.
Without constraints, the state value function could take a value of 0 while the action advantage function takes a value equal to Q. To avoid this, the Q value must be reduced by the mean advantage, without changing the ordering of the action dominance function, thereby removing the redundant degree of freedom. The formula is as follows:

Q(s_t, a_t; θ, θ_V, θ_A) = V(s_t; θ, θ_V) + (A(s_t, a_t; θ, θ_A) − (1/N_A) Σ_{a′} A(s_t, a′; θ, θ_A)),

where Q(s_t, a_t; θ, θ_V, θ_A) is the Q value of the agent executing action a_t in state s_t at the t-th iterative update, V(s_t; θ, θ_V) is the state value function, that is, the value of the state itself, A(s_t, a_t; θ, θ_A) is the action dominance function, that is, the extra value generated after the corresponding action is selected, and N_A is the number of possible actions.
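The aggregation step of competitive DQN can be written directly with NumPy (a sketch of the aggregation only, not of the full network):

```python
import numpy as np

def dueling_q(v, advantages):
    """Competitive (dueling) aggregation:
    Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').
    Subtracting the mean advantage removes the redundant degree of freedom
    without changing the ordering of the actions."""
    a = np.asarray(advantages, dtype=float)
    return v + a - a.mean()

# The argmax over actions is unchanged by the mean subtraction.
```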

Prioritized Experience Replay.
In the ordinary MFDRL-CTDE algorithm, experience replay uses random sampling, which undoubtedly reduces replay efficiency. Therefore, in order to improve sample efficiency and strengthen experience replay, prioritized experience replay is introduced: samples in the shared experience pool are given priorities that determine their probability of being sampled. At the same time, high-priority samples are selected many times, which reduces the diversity of samples; importance sampling weights are needed to correct for this. The priority is judged by the temporal difference error:

δ_ij^t = r_ij^t + γ max_{a′} Q_i(s_{t+1}^i, a′; θ_tar, θ_Vtar, θ_Atar) − Q_i(s_t^i, a_t^i; θ, θ_V, θ_A).

When |δ_ij^t| is much greater than 0, the prediction accuracy has more room for improvement, which makes the convergence of the algorithm more efficient. Here, δ_ij^t is the temporal difference error of the j-th sample of the i-th agent at the t-th iterative update, r_ij^t is the immediate reward of that sample, Q_i(s_{t+1}^i, a′; θ_tar, θ_Vtar, θ_Atar) is the Q value obtained from the centralized target network of the i-th agent, and Q_i(s_t^i, a_t^i; θ, θ_V, θ_A) is the Q value obtained from the estimate of the i-th agent. The probability G_ij that a sample is sampled is as follows:

G_ij = g_ij^α / Σ_j g_ij^α.

The sample priority g_ij is as follows:

g_ij = |δ_ij^t| + σ,

where σ is a small positive number and α is the degree coefficient, which controls the priority and has a value range of [0, 1]; when the degree coefficient is 0, sampling is not based on priority but is purely random. The importance sampling weight w_ij is as follows:

w_ij = (1 / (N_g · G_ij))^β,

where N_g is the experience pool capacity and β is the correction degree parameter.
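The priority, sampling probability, and importance weight above can be computed together in a few lines (a sketch; normalizing the weights by their maximum is a common convention we have assumed, not something stated in the text):

```python
import numpy as np

def per_quantities(td_errors, sigma=0.01, alpha=0.6, beta=0.4):
    """Prioritized replay bookkeeping for one batch:
    priority      g_ij = |delta_ij| + sigma,
    probability   G_ij = g_ij^alpha / sum_j g_ij^alpha,
    IS weight     w_ij = (N * G_ij)^(-beta), normalized by its maximum."""
    g = np.abs(np.asarray(td_errors, dtype=float)) + sigma
    probs = g**alpha / np.sum(g**alpha)
    w = (len(g) * probs) ** (-beta)
    return probs, w / w.max()
```

Setting alpha = 0 flattens the probabilities to uniform random sampling, matching the role of the degree coefficient described above.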
The parameter training target value y_ij is as follows:

y_ij = r_ij^t, if the current training round ends;
y_ij = r_ij^t + γ max_{a′} Q_i(s_{t+1}^i, a′; θ_tar, θ_Vtar, θ_Atar), otherwise.

The corrected loss function, weighted by the importance sampling weights, is as follows:

L(θ) = (1/m) Σ_{i,j} w_ij (y_ij − Q_i(s_t^i, a_t^i; θ, θ_V, θ_A))²,

where m is the number of playback samples.

Simulation Experiment

Six benchmark test functions are selected: the first three are single-mode benchmark test functions, and the latter three are multimodal benchmark test functions. Among them, the Rastrigin function is

f(x) = Σ_i (x_i² − 10 cos(2πx_i) + 10), −5.12 ≤ x_i ≤ 5.12.

The average and standard deviation of the three algorithms are calculated after 1000 iterations on the benchmark test functions. These statistics reflect the convergence speed of the algorithms and show whether they are stable. The calculation results are shown in Table 1.
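One of the multimodal benchmarks that appears here, the Rastrigin function, can be evaluated directly:

```python
import math

def rastrigin(x):
    """Rastrigin benchmark: f(x) = sum_i (x_i^2 - 10*cos(2*pi*x_i) + 10),
    with each x_i in [-5.12, 5.12]. Highly multimodal, with the global
    minimum f(x) = 0 at x = 0."""
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)

# rastrigin([0.0, 0.0]) evaluates to 0.0 (the global minimum)
```

Its many regularly spaced local minima are what make it a useful stress test for the stability comparisons reported in Table 1.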
The graphs of functions 1 to 6 are shown in Figures 4 to 9, respectively. It can be seen from these comparison diagrams that the stability and convergence speed of the three algorithms are very close on the single-mode functions. On the multimodal functions, however, Markov clearly does not show good stability, and its convergence rate is very slow. Comparing the improved MFDRL-CTDE algorithm with the common MFDRL-CTDE algorithm, the improved algorithm converges slightly faster, reaches a better best-performance position, and is more stable. Therefore, in general, the improved MFDRL-CTDE algorithm is superior to the original algorithm.

Objective Function.
Set the standard aerobics routine as action sequence P and each participant's aerobics routine as an action sequence Q, compare them, and calculate the similarity value. The similarity evaluation formula is as follows: where R is the total number of key actions, K_p^r is the key action set in the standard action sequence P, and K_q^{r′} is the key action set in action sequence Q.
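The similarity formula itself is not reproduced in the extracted text; the sketch below is only one plausible reading, scoring the fraction of the R key actions in P that are matched by the corresponding key actions in Q within a tolerance (the function name and the matching rule are ours):

```python
def sequence_similarity(p_keys, q_keys, tol=0.1):
    """Hypothetical similarity between the standard key-action sequence P
    and a participant's sequence Q: the fraction of the R key actions whose
    paired feature values differ by at most tol."""
    matched = sum(1 for kp, kq in zip(p_keys, q_keys) if abs(kp - kq) <= tol)
    return matched / len(p_keys)
```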

Simulation Experiment and Result Analysis.
In the aerobics competition, the movements of each member of a team are set into an action sequence, and the corresponding standard action sequence is compared with it. Suppose two aerobics teams participate in the five-person competition and the ten-person competition, respectively. The standard action sequence displayed by each team is known. On the same competition venue, each team completes its display within 90 seconds, with every 10 seconds taken as a judgment decision point; the recording is set into an action sequence whose similarity informs the judgment decision.
Taking aerobics as an example, important actions are intercepted as elements of the action sequence, and pictures of the intercepted actions are shown in Figure 10.
The four intercepted actions are put into the action sequence as a_t^1, a_t^2, a_t^3, a_t^4, …, and the action sequences are compared to calculate the similarity.
Set the capacity of the shared experience pool to 2000, the number of playback samples to 10, T = 90, γ = 0, α = 0.6, β = 0.4, and λ = 40. The similarity comparison data of the five-person group are shown in Table 2.
The similarity comparison data of the ten-person group are shown in Table 3. The decision prediction comparison of the five-person action sequence is shown in Figure 11, and that of the ten-person action sequence is shown in Figure 12. For the aerobics competition, the results of the three algorithms on the evaluation-scheme decision for the competition action sequences are clear. As the number of decision objects increases, the accuracy of the Markov algorithm decreases markedly, which is consistent with Markov being suited to single-intelligent decision systems. Comparing the MFDRL-CTDE algorithm with the improved MFDRL-CTDE algorithm, the figures clearly show that the improved MFDRL-CTDE algorithm is more stable in the accuracy of its predicted values. Although in a very few cases the MFDRL-CTDE predictions are more accurate than those of the improved algorithm, most of its predictions are not accurate and stable enough. Therefore, it can be concluded that the improved MFDRL-CTDE algorithm is more stable and efficient for multi-intelligent decision-making systems. In multi-intelligent decision-making, the system can make good use of similarity comparisons across all kinds of dance competition actions, giving sports competition a further step of development.

Conclusion
Compared with MFDRL-CTDE, the improved MFDRL-CTDE algorithm is more suitable for multi-intelligent decision-making. Even after many iterations, the algorithm's decision-making does not collapse onto a single agent before executing actions; actions are still executed according to priority, which preserves the efficiency of the algorithm. Markov is more suitable for single intelligent decision-making and cannot be implemented effectively for multi-intelligent decision-making. Therefore, for the scheme scheduling of aerobics competition, the improved MFDRL-CTDE has obvious advantages.

Data Availability
The experimental data used to support the findings of this study are available from the corresponding author upon request.