RETRACTED ARTICLE: Forecasting algorithm of tourism service trade based on PSO-optimized hybrid RVM model

As a comprehensive form of trade, tourism service trade has had a profound impact on the economies of many countries. This research discusses a tourism service trade forecasting algorithm based on a PSO-optimized hybrid RVM model. The study extracts 8 indicators, including gross national product, total fixed asset investment, total import and export, China's import and export tariff rate, the exchange rate of the renminbi to the US dollar, and the global economic growth rate, as the impact indicators of tourism service trade; however, these indicators show a certain degree of redundancy and correlation. To measure the correlation between the evaluation indicators, the autocorrelation evaluation function in MATLAB is used, and principal component analysis is used to extract the principal components that represent a large percentage of the information in the indicators. To improve the prediction accuracy of the RVM model, the PSO algorithm is used to optimize the RVM model weights on the basis of the adaptively constructed model structure and initial model weights. The optimization process takes the minimum error of the RVM model as the search target, and each particle represents an RVM model. The algorithm finds the weight values and threshold of the optimal RVM model through particle swarm tracking search. The original RVM model and the optimized RVM model are then each used to predict the total amount of tourism service trade in City A, the prediction errors of the single RVM method and the PSO-optimized RVM method are compared, and the reduction in prediction error achieved by PSO optimization is analyzed. According to the forecast result, the average relative error for 2020 is 5.7%, so the forecast is relatively accurate. This research helps provide a scientific reference for China's tourism service trade.

Since the daily passenger flow reflects the main regular patterns in the development and change of tourist flow, it is of great significance for tourist attractions to derive a model with good predictive ability from the time series of normal daily passenger flow. At present, RVM has become an important method for solving nonlinear time series forecasting problems, and it has been applied successfully in many forecasting fields.
The development of the service industry and service trade plays an important role in the transformation of China's economic structure. Yan C reported that, after the implementation of CEPA, traditional service industries such as transportation between Hong Kong and Mainland China developed significantly, whereas modern service industries such as finance and commerce developed little; the study was limited to tourism economic planning [1]. Sul set out to confirm quantitatively whether hosting the Winter Olympics improves the balance of tourism or increases the number of inbound tourists, to identify other key variables that improve the balance of tourism, and to analyze the relationship between the balance of the tourism industry and the Olympic Games. To analyze the specialization characteristics of medals, he used the traditional revealed comparative advantage model and the bit estimation method. His findings confirmed the role of hosting the Winter Olympics in improving the balance of the tourism industry and increasing the number of inbound tourists, but the research took too long [2]. Gupta M R established a two-sector dynamic model of an underdeveloped economy, with an import-traded goods sector and a non-traded tourism service sector that provides services to international tourists. He analyzed the comparative steady-state effects and showed that the development of tourism raises the level of the capital stock and national income but reduces environmental quality in the new steady-state equilibrium, which leads to the relative expansion (shrinkage) of the capital (labor)-intensive non-tourism (tourism) sector. The pollution reduction policy has exactly the opposite effect. Although the tourism development policy and the pollution abatement policy in his research are complementary, they cannot ensure green growth [3].
Suhartanto D studied an innovative attraction loyalty model. Data were collected at four culture-based attractions in Bandung, Indonesia. After visitors had experienced the attraction, they were given a self-administered questionnaire, and a total of 415 usable questionnaires were collected. A partial least squares structural equation model was used to test the proposed hypotheses. Despite the extent of this research, the proposed attraction loyalty model shows no significant difference between tourists and residents [4].
This study extracts 8 indicators, including gross national product, total fixed asset investment, gross industrial production, and the global economic growth rate. To measure the correlation between the evaluation indicators, the autocorrelation evaluation function in MATLAB is used, and principal component analysis is used to extract the principal components that represent a large percentage of the information in the indicators. To improve the prediction accuracy of the RVM model, the PSO algorithm is used to optimize the RVM model weights on the basis of the adaptively constructed model structure and initial model weights. The optimization process takes the minimum error of the RVM model as the search target, and each particle represents an RVM model. The algorithm finds the weight values and threshold of the optimal RVM model through particle swarm tracking search, and the original RVM model and the optimized RVM model are then each used to predict the total amount of tourism service trade in City A.

2 Forecast of tourism service trade

Tourism service trade
Service trade, also called "labor trade," refers to the economic exchange in which countries provide services to each other. It can be understood in a broad sense and a narrow sense. The broad sense covers both tangible and intangible transactions between traders, while the narrow sense covers only service transactions between two countries in which one party provides a service and the other party accepts and pays for it. Traditional research on tourism identifies "six major elements," namely transportation, accommodation, food, sightseeing, shopping, and entertainment [5,6].
Most service trade today refers to the output of labor over a certain period of time, most of which is consumed as it is produced. Tourism service trade, however, differs from traditional trade. Trade in goods is completed only when the goods are delivered, whereas tourism service trade is completed through the purchase of goods and services at the destination and cannot be returned [7]. When tourists travel to another country and consume goods or request services at the destination, the local provider of those goods or services is the exporter: it exports locally and earns foreign exchange income [8]. A clustering is optimal when the sum of squared deviations from the sample points to the cluster center within each cluster is smallest, where $C = \{c_k,\ k = 1, \ldots, K\}$ denotes the partition into $K$ clusters [9,10].
Here, $w_{ik}$ represents the feature vector of the text [11]. In addition, the binary Jaccard coefficient applies only to attributes that take the two values 0 and 1, so it is extended to multi-valued or continuous attributes [12,13], where $T_{J2}(D_i, D_j)$ denotes the extension of the binary Jaccard distance [14]. Since the daily passenger flow reflects the main regular patterns in the development and change of tourist flow, it is of great significance for tourist attractions to derive a model with good predictive ability from the time series of normal daily passenger flow, and doing so is also a necessary part of the current construction of smart scenic spots. However, on the one hand, the time series of ordinary daily passenger flow is mainly nonlinear and volatile. On the other hand, because domestic tourist attractions have had limited time for informatization, only small samples of actual data such as passenger flow, weather, and e-commerce are available, which poses great challenges for daily passenger flow forecasting.

Particle swarm optimization algorithm
The particle swarm optimization algorithm was first proposed by Dr. Eberhart and Dr. Kennedy. The algorithm is derived from the study of the foraging behavior of birds. At first, researchers tried to depict graphically the graceful and unpredictable movements of birds. In studying how a flock forages, they found that the whole population stays close together while searching, so that every bird in the population can find food. Based on this model, particle swarm optimization (PSO) was designed. The search process of the PSO algorithm is similar to that of the genetic algorithm, the ant colony algorithm, and related methods. The algorithm first initializes a set of solutions and then obtains the individual optimal solution and the group optimal solution. Because it is simple to operate and has strong search ability, the algorithm has been applied to function optimization and other fields with good results.
To prevent erroneous evaluation results caused by data interference, the model's performance over a certain period of time is used as the evaluation criterion [15].
Here, $L$ is the evaluation time window and $E$ is the estimated mean square error of the model in the time domain. The purpose of crowding distance selection is to calculate the density of individuals, select relatively sparse individuals, improve the diversity of individuals, and keep individuals evenly dispersed. The population is sorted in ascending order according to the target value in each dimension, and the crowding value of each individual is finally obtained [16].
Here, $\mathrm{Crowd}[i]_d$ represents the maintenance target value of MTH, and the initial antibody data are formed according to the increase in the amount of remaining probability variables [17]. $\theta_{1j}$ and $\theta_{2k}$ are the threshold vectors of the hidden layer and the output layer, respectively [15].

The PSO algorithm has the following characteristics:
1. The search strategy of the algorithm is global search.
2. The algorithm searches with a velocity-position swarm intelligence model, and the operation is simple and effective.
3. The algorithm has a memory function: each individual dynamically tracks the historical optimal solution to complete the search and can adaptively adjust the search step according to the number of iterations.
4. The concept is clear, the code is short, and it is easy to implement.
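The velocity-position search and the memory of individual and group optima described above can be illustrated with a minimal global-best PSO. This is a sketch under stated assumptions rather than the configuration used in this study: the inertia weight `w`, the learning factors `c1` and `c2`, the swarm size, the search box, and the `sphere` test function are all illustrative choices that do not come from the source.

```python
import random

def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions and zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # individual best positions
    pbest_val = [objective(p) for p in pos]    # individual best fitness values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # group best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to own best + pull to group best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clipped to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Illustrative test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_f = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

In a forecasting setting, `objective` would be replaced by the prediction error of the model whose parameters are encoded in the particle.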

RVM model
The relevance vector machine (RVM) is a supervised sparse probabilistic model similar to the support vector machine (SVM), but its theoretical framework is completely different. The RVM adopts automatic relevance determination, in which irrelevant points are screened out through the prior probability, to obtain a sparse model. The RVM is well suited to regression prediction problems, which is in line with the direction of this article. Assuming that the input vector is x and the target variable is g, the regression prediction process of the RVM is: 1. Calculate the probability distribution of the target variable; 2. Form the data matrix from multiple measurements of the input vector, and calculate the likelihood function; 3. Introduce a separate hyperparameter for each weight parameter, and calculate the prior form of the weights; 4. Combine the results of the linear model, and obtain the posterior probability of the parameters through integration; 5. Maximize the result of step 4 to obtain the weights corresponding to the relevance vectors and form the final model [18].
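For reference, the five steps above correspond to the standard sparse Bayesian regression formulation of the RVM. The notation below is the common textbook convention and is an assumption here, since the source defines only $x$ and $g$: $K$ is the kernel function, $\Phi$ the design matrix, $\sigma^2$ the noise variance, $\alpha_i$ the hyperparameter attached to weight $w_i$, and $A = \mathrm{diag}(\alpha_0, \ldots, \alpha_N)$.

```latex
\begin{align}
g_n &= \sum_{i=1}^{N} w_i K(x_n, x_i) + w_0 + \varepsilon_n,
      \qquad \varepsilon_n \sim \mathcal{N}(0, \sigma^2) \\
p(\mathbf{g} \mid \mathbf{w}, \sigma^2) &= (2\pi\sigma^2)^{-N/2}
      \exp\!\left(-\frac{\lVert \mathbf{g} - \Phi\mathbf{w} \rVert^2}{2\sigma^2}\right) \\
p(\mathbf{w} \mid \boldsymbol{\alpha}) &= \prod_{i=0}^{N} \mathcal{N}\!\left(w_i \mid 0, \alpha_i^{-1}\right) \\
p(\mathbf{w} \mid \mathbf{g}, \boldsymbol{\alpha}, \sigma^2) &= \mathcal{N}(\mathbf{w} \mid \boldsymbol{\mu}, \Sigma),
      \qquad \Sigma = \left(\sigma^{-2}\Phi^{\top}\Phi + A\right)^{-1},
      \qquad \boldsymbol{\mu} = \sigma^{-2}\,\Sigma\,\Phi^{\top}\mathbf{g} \\
p(\mathbf{g} \mid \boldsymbol{\alpha}, \sigma^2) &= \mathcal{N}\!\left(\mathbf{g} \mid \mathbf{0},\; \sigma^2 I + \Phi A^{-1} \Phi^{\top}\right)
\end{align}
```

Maximizing the last expression over $\boldsymbol{\alpha}$ and $\sigma^2$ drives many $\alpha_i$ toward infinity, which forces the corresponding weights to zero; the training points that keep nonzero weights are the relevance vectors.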
Here, $x_{ij}$ is the original data [19]. The standard deviation of the $j$-th indicator is computed [20,21], the standardized matrix is formed [22], and the pairwise correlation matrix $R$ is calculated [23,24]. The variance contribution rate $a_i$ of each component is then analyzed [25]. The contribution rates decrease in turn, and the components do not influence one another, which avoids duplication of information [26].
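The preprocessing chain above (standardize each indicator, form the correlation matrix $R$, and rank components by variance contribution rate) can be sketched as follows. The exact formulas of the source are not recoverable, so this sketch assumes the usual conventions: z-scores with the sample standard deviation, $R = Z^{\top}Z/(n-1)$, and $a_i = \lambda_i / \sum_k \lambda_k$ for the eigenvalues $\lambda_i$ of $R$. The function names and the toy data are hypothetical.

```python
import math

def standardize(data):
    """Column-wise z-scores: z_ij = (x_ij - mean_j) / s_j, with sample std (n - 1)."""
    n, m = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / (n - 1))
            for j in range(m)]
    return [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in data]

def correlation_matrix(z):
    """Pairwise correlation matrix R = Z^T Z / (n - 1) for standardized data Z."""
    n, m = len(z), len(z[0])
    return [[sum(z[k][i] * z[k][j] for k in range(n)) / (n - 1)
             for j in range(m)] for i in range(m)]

def contribution_rates(eigenvalues):
    """Variance contribution rates a_i = lambda_i / sum(lambda), in descending order."""
    lam = sorted(eigenvalues, reverse=True)
    total = sum(lam)
    return [v / total for v in lam]

def n_components_for(rates, threshold=0.95):
    """Smallest number of leading components whose cumulative rate reaches threshold."""
    cumulative = 0.0
    for k, a in enumerate(rates, start=1):
        cumulative += a
        if cumulative >= threshold:
            return k
    return len(rates)

# Hypothetical toy data: 4 observations of 2 indicators (not from the paper).
toy = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
R = correlation_matrix(standardize(toy))
# For a 2x2 correlation matrix [[1, r], [r, 1]] the eigenvalues are 1 + r and 1 - r.
r = R[0][1]
rates = contribution_rates([1 + r, 1 - r])
```

For more than two indicators the eigenvalues of `R` would come from a numerical eigensolver; the two-indicator case is used here only because its eigenvalues have a closed form.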

PSO-optimized RVM model
1. Determine the structure of the neural network, including the number of neurons in the input layer, hidden layer, and output layer.
2. Initialize the particle swarm.
(1) Set the dimension of the particle position and velocity vectors (dimsize), which covers all weights and thresholds: dimsize = the number of input-to-hidden connection weights (the number of input layer neurons × the number of hidden layer neurons) + the number of hidden-to-output connection weights + the number of hidden layer thresholds (the number of neurons in the hidden layer) + the number of output layer thresholds.
(2) Set the size of the particle swarm (popsize).
(3) Initialize the learning factors.
(4) Initialize the particle swarm and the velocity of each particle.
(5) Initialize the individual extremum of each particle and the global optimal solution, and record the corresponding weights and thresholds.
3. Determine the fitness function. The mean square error (MSE) of the neural network is used as the fitness, that is, the evaluation index of particle search performance, and minimizing it guides the search of the population.
4. For each particle, perform forward propagation over all training samples, obtain the training error of the particle on those samples, and calculate its fitness.
5. Update the individual extremum and the global optimal value according to the fitness of each particle. For each particle, if its current fitness is less than its individual extremum before the iteration, the individual extremum is updated; otherwise, it remains unchanged. If the current fitness is less than the global optimal value, the global optimal value is updated; otherwise, it remains unchanged. Among all the individual extrema, the one with the best fitness is the global extremum, and the neural network weights and thresholds corresponding to it are the current optimal solution of the particle population.
6. Update the weighting coefficient.
7. Update the velocity and position of each particle.
8. Check the stop condition. The fitness of the new particle population generated by the iteration is evaluated, and it is judged whether the algorithm has reached the maximum number of iterations or met the specified error standard.
9. Generate the optimal solution.
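The particle encoding in step 2 and the fitness evaluation in steps 3 and 4 can be sketched as follows. The `dimsize` formula follows the text; everything else is an assumption made for illustration, in particular the tanh hidden activation, the linear output layer, the convention of subtracting thresholds, and the order in which weights and thresholds are packed into the particle.

```python
import math

def dimsize(n_in, n_hidden, n_out):
    """Number of weights and thresholds encoded in one particle."""
    return n_in * n_hidden + n_hidden * n_out + n_hidden + n_out

def decode(particle, n_in, n_hidden, n_out):
    """Split a flat particle into weight matrices and threshold vectors."""
    it = iter(particle)
    w1 = [[next(it) for _ in range(n_hidden)] for _ in range(n_in)]
    w2 = [[next(it) for _ in range(n_out)] for _ in range(n_hidden)]
    theta1 = [next(it) for _ in range(n_hidden)]   # hidden layer thresholds
    theta2 = [next(it) for _ in range(n_out)]      # output layer thresholds
    return w1, w2, theta1, theta2

def forward(x, w1, w2, theta1, theta2):
    """Forward propagation: tanh hidden layer, linear output layer (assumed)."""
    hidden = [math.tanh(sum(x[i] * w1[i][j] for i in range(len(x))) - theta1[j])
              for j in range(len(theta1))]
    return [sum(hidden[j] * w2[j][k] for j in range(len(hidden))) - theta2[k]
            for k in range(len(theta2))]

def fitness(particle, samples, n_in, n_hidden, n_out):
    """Mean square error of the decoded network over all training samples."""
    params = decode(particle, n_in, n_hidden, n_out)
    squared_error, count = 0.0, 0
    for x, target in samples:
        y = forward(x, *params)
        squared_error += sum((yk - tk) ** 2 for yk, tk in zip(y, target))
        count += len(target)
    return squared_error / count

# Hypothetical example: 3 inputs, 4 hidden neurons, 1 output.
n_dims = dimsize(3, 4, 1)
zero_fit = fitness([0.0] * n_dims, [([1.0, 2.0, 3.0], [2.0])], 3, 4, 1)
```

A PSO loop would call `fitness` once per particle per iteration and keep the particle with the smallest value as the global extremum.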
Here, data determination refers to identifying the indicators related to the predicted data according to the characteristics of that data. The relevant indicators are used as the input data of the RVM model and the predicted data as its output data, which yields input-output data pairs. N sets of data are selected at random as training data to construct the RVM model through the self-organizing adaptation method, and M sets of data are selected at random as test data to test the fitting performance of the RVM model.

PSO algorithm optimization weight:
After the structure of the RVM model has been determined by self-organization, the PSO algorithm represents the weights of the RVM model with particles, takes the prediction error as the particle fitness value, and takes the minimum prediction error as the evolution goal of the particle swarm.
Weight assignment: Assign the optimal weights obtained by PSO optimization to the RVM model, so that the structure and weights of the PSO-optimized RVM model are determined from the training data.

Dependent variables
Whether in Porter's Diamond Model or the Gravity Model, there are many evaluation criteria for international tourism service trade. Here, international tourism foreign exchange income is used as the measure: it is taken as the dependent variable, denoted ITR (the international tourism foreign exchange income of each province of China), and used to evaluate the development and scale of China's international tourism service trade.

Independent variables
According to the Porter Diamond Model, and taking into account multicollinearity and the availability of variables, the independent variables are selected from tourism demand conditions, government support, and tourism and related auxiliary industries as follows: urbanization rate (UR), number of national A-level attractions per capita (QTA), proportion of tourism industry employees (QL), exchange rate (ER), gross product of each province (GDP), and gross product of other countries in the world (WGDP).

RVM model
The tourism market is a complex abstract system that is affected and restricted by many factors. According to grey relational analysis, the factors affecting the number of domestic tourists rank by degree of correlation as follows: the total mileage of roads and railways in the country > the disposable income of urban residents > the consumer price index > domestic tourism income. Therefore, according to the influencing factors, the RVM model is divided into two categories: the single-factor RVM forecast of the number of domestic tourists, and the RVM model that uses multiple influencing factors to estimate and predict the number of domestic tourists.

Correlation test
We extract 8 indicators, namely gross national product, total fixed asset investment, gross industrial production, total actual use of foreign capital, total import and export, China's import and export tariff rate, the exchange rate of RMB against the US dollar, and the global economic growth rate, as the impact indicators of City A's tourism service trade; however, these indicators show some redundancy and correlation. Correlation analysis studies the correlation relationships between data and expresses and resolves them qualitatively. It is also a calculation method based on statistical correlation. To measure the correlation between the evaluation indicators, the autocorrelation evaluation function in MATLAB is used, and principal component analysis is used to extract the principal components that represent a large percentage of the information in the indicators. After the principal components are divided into training principal components and test principal components, the RVM model is constructed adaptively from the training principal components, and the resulting adaptive RVM model is used to predict the test principal component data. To improve the prediction accuracy of the RVM model, the PSO algorithm is used to optimize the RVM model weights on the basis of the adaptively constructed model structure and initial model weights. The optimization process takes the minimum error of the RVM model as the search target, and each particle represents an RVM model.
The algorithm finds the weight values and threshold of the optimal RVM model through particle swarm tracking search. The original RVM model and the optimized RVM model are then each used to predict the total amount of tourism service trade in City A, the prediction errors of the single RVM method and the PSO-optimized RVM method are compared, and the reduction in prediction error achieved by PSO optimization is analyzed. The regression results of the model are shown in Table 1.

4 Results and discussion
Figure 1 shows the number of domestic tourists and disposable income. Statistical analysis of the per capita disposable income of urban residents and the number of domestic tourists shows a strong positive correlation between the two. The grey relational calculation above gives a correlation coefficient of 0.9405 between them, indicating that they are closely connected. As the per capita disposable income of urban residents increases, the number of domestic tourists increases accordingly.

Influencing factors of tourism service trade
The MS index (international market share index) of the tourism service trade of various countries is shown in Fig. 2. In terms of the international market share of tourism service trade from 2018 to 2020, the USA has the highest share, basically between 13% and 17% in each year, which is significantly higher than that of other countries. The market share of China's tourism service trade has declined significantly. Since 2010 it has been on a downward path, falling basically year by year, and it has now dropped by nearly 2 percentage points. Therefore, from the perspective of market share, the international competitiveness of China's tourism service trade can be considered to be declining, to the point that its international market share was surpassed by Japan in 2019. Table 2 shows the average value of all variables describing the tourism service trade of each province in each year. Table 2 shows that the overall trend of tourism foreign exchange income in China's provinces during 2017-2020 is upward. It developed rapidly before 2018, and its scale expanded nearly fivefold in four years. The urbanization rate, the number of scenic spots per capita, and the GDP of each province are all generally increasing and are highly correlated. They can all be regarded as economic development indicators, although each has a different focus. The average urbanization rate increased from 42.1% to 56.6%, and the number of scenic spots per capita almost doubled, but the proportion of tourism practitioners hardly changed or even decreased, while the GDP of each province nearly quadrupled. China's exchange rate has not changed much over the long term and is relatively stable.
Over the past 13 years the global economy has developed reasonably well overall, and world GDP has nearly doubled.
A horizontal comparison of the 2020 data of the provinces is carried out, and the results are shown in Table 3. The data in the table show that the standard deviations of tourism foreign exchange income and of total provincial GDP are large, indicating that development is very uneven across provinces, not only in tourism service trade but also in the overall economic situation. Relatively speaking, the gaps in the urbanization rate and in the number of scenic spots per capita are not as large as those in the other two variables. This reflects the disparities between China's provinces.

Page 11 of 16 Dong and Chen EURASIP J. Adv. Signal Process. (2021) 2021:76
The initial particle path of tourism trade is shown in Fig. 3. It can be seen from Fig. 3 that the regression prediction data for the number of domestic tourists and the original data of the training sample lie on essentially the same curve and maintain the same growth trend, and the fitted values are relatively close to the true values.
The Pearson correlation coefficient test was performed on each independent variable, and the results are shown in Table 4. It can be seen from Table 4 that the correlation coefficients between the variables are not large. Generally speaking, a coefficient above 0.75 indicates obvious multicollinearity. The largest correlation coefficient among the above variables is 0.4424, indicating that there is no obvious multicollinearity problem between the variables.
Since the kernel function parameter (radius size) σ and the penalty constant c are important factors that affect the model's performance and simulation effect, randomly selected parameters may cause over-fitting. This article therefore uses cross-validation to select the optimal parameter combination, which improves the generalization ability of SVM learning and achieves the best simulation effect. In practice, the best c and σ are first located roughly over a large range; the range of c and σ is then narrowed around this rough selection, and cross-validation is used for fine parameter selection. This avoids a large number of calculation steps and makes the result easier to obtain. The selection process and results are as follows: after processing by the RVM algorithm, the particles begin to converge toward the optimum. The optimal solution finally obtained is shown in Fig. 4.
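The rough-then-fine selection of c and σ described above can be sketched as a two-stage grid search. This is an illustrative sketch only: the source does not give the grids, so the power-of-two grids and step sizes are assumptions, and `toy_cv_error` is a hypothetical stand-in for a real cross-validation error (it is not a trained model). In practice `cv_error(c, sigma)` would train the model on the training folds and return the average error on the held-out folds.

```python
import math

def log_grid(lo_exp, hi_exp, step):
    """Values 2**e for exponents e from lo_exp to hi_exp inclusive."""
    n = int(round((hi_exp - lo_exp) / step))
    return [2.0 ** (lo_exp + i * step) for i in range(n + 1)]

def grid_search(cv_error, c_values, sigma_values):
    """Return (error, c, sigma) with the lowest cross-validation error."""
    return min((cv_error(c, s), c, s) for c in c_values for s in sigma_values)

def coarse_to_fine(cv_error, lo_exp=-8, hi_exp=8, coarse_step=2.0, fine_step=0.25):
    # Stage 1: coarse scan over a wide range of exponents.
    coarse = log_grid(lo_exp, hi_exp, coarse_step)
    _, c0, s0 = grid_search(cv_error, coarse, coarse)
    # Stage 2: fine scan within one coarse step on either side of the coarse optimum.
    ce, se = math.log2(c0), math.log2(s0)
    fine_c = log_grid(ce - coarse_step, ce + coarse_step, fine_step)
    fine_s = log_grid(se - coarse_step, se + coarse_step, fine_step)
    return grid_search(cv_error, fine_c, fine_s)

def toy_cv_error(c, sigma):
    """Hypothetical stand-in error surface with its minimum at c = 100, sigma = 3."""
    return ((math.log2(c) - math.log2(100.0)) ** 2
            + (math.log2(sigma) - math.log2(3.0)) ** 2)

err, best_c, best_sigma = coarse_to_fine(toy_cv_error)
```

With these settings the coarse stage evaluates 81 parameter pairs and the fine stage 289, compared with 4225 pairs for a single fine grid over the whole range, which is the saving in calculation that the two-stage procedure is meant to provide.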

PSO-optimized hybrid RVM model prediction
Using MATLAB and the RVM prediction model, the 2018-2019 domestic tourist number data are selected as the training sample and the 2019-2020 domestic tourist number data as the test sample. The specific procedure is to use the sample data of the previous three years to predict the number of domestic tourists in the next year (e.g., use the sample data from 2018 to 2019 to predict the number of domestic tourists in 2020, and so on). Finally, the RVM parameters are optimized through PSO to obtain the optimized penalty constant, insensitive loss function parameter, and kernel function parameter, namely C = 1000, ε = 0.001, and σ = 9.2. The optimized parameters are used to simulate and fit the training samples, and the fitting results are shown in Fig. 5. It can be seen from Fig. 5 that the actual and fitted values of the number of domestic tourists in each year maintain basically the same growth trend and differ very little. Table 5 compares the actual and predicted values of domestic tourist arrivals from 2018 to 2020.
It can be seen from Table 5 that the relative error between the actual value of the number of domestic tourists and the fitted value is within 5%.
The RVM method based on PSO optimization is used to predict the amount of tourism service trade in City A. The data of five indicators are shown in Table 6.
The indicators are normalized with the MATLAB normalization function mapminmax. Principal component analysis is then applied to the data to reduce its dimensionality, and the principal components that explain more than 95% of the normalized indicator matrix are extracted as the data used to calculate the tourism service trade amount of City A. The principal component data are shown in Table 7.
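For readers without MATLAB, the linear rescaling that mapminmax performs with its default output range of [-1, 1] can be sketched in a few lines. The function names below are hypothetical, the sample values are illustrative rather than taken from Table 6, and the handling of a constant series is an assumption made for this sketch rather than a statement about MATLAB's behavior.

```python
def minmax_fit(values, ymin=-1.0, ymax=1.0):
    """Record the settings needed to scale a series and to undo the scaling."""
    return {"xmin": min(values), "xmax": max(values), "ymin": ymin, "ymax": ymax}

def minmax_apply(values, s):
    """y = (ymax - ymin) * (x - xmin) / (xmax - xmin) + ymin."""
    span = s["xmax"] - s["xmin"]
    if span == 0:
        # Assumption for this sketch: a constant series maps to the midpoint.
        return [(s["ymin"] + s["ymax"]) / 2.0 for _ in values]
    return [(s["ymax"] - s["ymin"]) * (v - s["xmin"]) / span + s["ymin"]
            for v in values]

def minmax_reverse(scaled, s):
    """Map scaled values back to the original units."""
    span = s["xmax"] - s["xmin"]
    return [(y - s["ymin"]) * span / (s["ymax"] - s["ymin"]) + s["xmin"]
            for y in scaled]

# Illustrative values only (not from the paper).
series = [10.0, 15.0, 20.0]
settings = minmax_fit(series)
scaled = minmax_apply(series, settings)
restored = minmax_reverse(scaled, settings)
```

The settings should be fitted on the training data only and then reused for the test data, so that the test set does not influence the scaling; predictions are mapped back to original units with `minmax_reverse`.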
The RVM model is initialized with three principal component factors. Through the adaptive learning and evolution of the model, a two-layer RVM model is finally obtained, consisting of the connection weights of the first layer and the connection weights of the second layer. Based on the adaptive structure of the model, the forecast results for the tourism service trade of City A from 2018 to 2020 are shown in Fig. 7. According to the forecast result, the average relative error for 2020 is 5.7%, so the forecast is relatively accurate.
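The error comparison between the single RVM and the PSO-optimized RVM can be expressed with two small helper functions. The source does not state how its relative average error is defined, so this sketch assumes the mean of the absolute relative errors; the function names and all numbers below are hypothetical and are not the paper's data.

```python
def relative_errors(actual, predicted):
    """Absolute relative error |predicted - actual| / |actual| for each point."""
    return [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]

def mean_relative_error(actual, predicted):
    """Average of the absolute relative errors."""
    errors = relative_errors(actual, predicted)
    return sum(errors) / len(errors)

def error_reduction(err_baseline, err_optimized):
    """Fraction by which the optimized model lowers the baseline error."""
    return (err_baseline - err_optimized) / err_baseline

# Hypothetical values for illustration only.
actual = [100.0, 200.0]
rvm_pred = [110.0, 180.0]       # single RVM forecast (made up)
pso_rvm_pred = [105.0, 190.0]   # PSO-optimized RVM forecast (made up)

err_rvm = mean_relative_error(actual, rvm_pred)
err_pso = mean_relative_error(actual, pso_rvm_pred)
reduction = error_reduction(err_rvm, err_pso)
```

On these made-up numbers the single model has a mean relative error of 10% and the optimized model 5%, a 50% reduction; the same calculation applied to the real forecasts would quantify the improvement reported in the text.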

Conclusion
This study extracts 8 indicators, including gross national product, total fixed asset investment, gross industrial production, total actual use of foreign capital, total import and export, the exchange rate of the renminbi to the US dollar, and the global economic growth rate. To measure the correlation between the evaluation indicators, the autocorrelation evaluation function in MATLAB is used, and principal component analysis is used to extract the principal components that represent a large percentage of the information in the indicators. To improve the prediction accuracy of the RVM model, the PSO algorithm is used to optimize the RVM model weights on the basis of the adaptively constructed model structure and initial model weights. The optimization process takes the minimum error of the RVM model as the search target, and each particle represents an RVM model. The algorithm finds the weight values and threshold of the optimal RVM model through the particle swarm tracking