Introduction

With the development of society and improvements in incomes and living standards, popular demand for tourism is increasing. The global number of tourists reached \(1.231\times 10^{10}\) in 2019, a rise of \(5.38\times 10^{8}\) on 2018 (a growth rate of 4.6%, which is 0.9% higher than the growth rate in 2018). In recent years, the increase in tourist numbers has triggered many problems in tourism [1, 2]. Tourism resources are finite relative to visitor numbers. In particular, existing tourism routes are poorly designed, which makes it hard to group tourists with the same preferences and to assign tourists with different demands to the most suitable route. This causes a series of problems, e.g., crowds, congestion, and failure to meet expectations. How to group tourists and assign them to the most suitable tourism route according to their potential demand is therefore a difficulty to be solved. More precisely, the tourism route planning problem addressed in this research can be abstracted as a travelling salesman problem, which is a combinatorial optimization problem. Unlike traditional combinatorial optimization problems, the tourism route planning problem studied here considers a more complex situation in which multiple tourists share limited tourism resources and have differentiated preferences. The problem is divided into two steps: first, tourists with the same attributes are grouped together to form different sets of tourists; second, tourist attractions offer tourism services by selecting several appropriate sets of tourists. The ant colony algorithm is a typical solving tool for problems with complex-system features and is often used to solve the travelling salesman problem. Therefore, this paper builds a new ant colony algorithm to solve the tourism route optimization problem.

To be specific, route planning for tourist attractions focuses on searching for the set of tourist groups corresponding to each tourism route, with the aim of maximizing tourist satisfaction. Exploring the problem can help solve a series of related route planning problems, for example, routes entailing irregular travel, turning back, and repeated segments, such as are found in logistics distribution [3], military command, satellite scheduling [4], robot travel, and propagation and navigation exercises. Therefore, solving the route planning problem for tourist attractions has practical and guiding significance for other similar problems. Reasonable use of tourism resources can not only reduce costs and increase efficiency, but also play a positive role in the ecological economy. Analysis of tourism ecology can not only reduce the carbon emissions of tourism, but also promote the development of the ecological economy [5, 6].

It is necessary to take multiple factors (such as greatly differing customer demands, complex routes in tourist attractions, and ambiguous planning objectives) into account to solve the route planning problem of tourist attractions. The existing research on the problem is insufficient and fails to consider its characteristics, including large scale, high difficulty in calculating the revenue, high decision-making dimension, and complex constraints. Most models of tourism route planning are nonlinear. For nonlinear models, some scholars have designed an evolutionary algorithm to solve the newsboy problem [7]. It is therefore expected that overall tourist satisfaction can be improved by investigating methods for solving the route planning problem for tourist attractions, and some scholars have conducted related studies [8, 9]. Other scholars have carried out research aimed at the dynamic and balanced development of the ecological environment and the social economy [10]. Tourism route planning is an NP-hard problem. Considering the basic characteristics of the tourism route planning problem, a mathematical model is established that takes each group of tourists as the research object and the maximization of the overall satisfaction of all tourist groups as the objective function. From the perspective of how to assign different types of tourists to the most suitable route, the model takes the ages and preferences of tourists, as well as the upper limits of the tourist carrying capacities along the various tourism routes, as constraints. The bacterial foraging algorithm is introduced into the ant colony algorithm to enhance its performance while avoiding locally optimal solutions [11]. To increase the convergence speed and search ability of the algorithm, this paper also introduces two different knowledge models to improve the traditional ant colony algorithm. In this way, the solution obtained by the algorithm further approaches the optimal solution.

The study is divided into six parts: the introduction in “Introduction” comprehensively narrates the research problems; “Literature review” covers the literature review, which briefly generalizes the existing research; “Model establishment” describes the variables and mathematical modelling; “Algorithm design” selects an algorithm according to the studied problem and improves the algorithm; “Simulation results” introduces the source of the examples, problem solving environment, test and result analysis; the last section concludes.

Literature review

The tourism route planning and combination belongs to a route planning problem as used in robotics, logistics, navigation, and industry [12]. In recent years, scholars have further explored the path planning problem [13, 14], considered different factors, and constructed many single objective and multi-objective function models [15]. Many solutions and algorithms are proposed for these models, such as a genetic algorithm (GA), particle swarm optimization (PSO), neural network (NN), heuristic search, evolutionary algorithms and ant colony algorithm [16,17,18,19,20,21].

The tourist route planning (TRP) problem studied here is in essence a type of traveling salesman problem (TSP) with constraints on multiple combinations and on the capacity of tourist routes. With the progress of research on TSPs, numerous scholars have established various mixed-integer linear programming models for the TSP of unmanned aerial vehicles (UAVs) in different fields. Moreover, effective algorithms have been designed to solve the problem, including a quadratic heuristic clustering algorithm [22], a local search algorithm [23], a multi-child genetic algorithm [24], variable neighborhood search [25], and branch and cut algorithms [26, 27]. However, because the constraints on the TSP are specific to each research context, some researchers have introduced effective strategies, such as a discretization scheme and a two-stage greedy initialization algorithm, into traditional algorithms to further improve their search performance [28,29,30,31,32,33].

TRP is a variant of the TSP. With the deepening of research on TSPs, many scholars have in recent years studied TRP by mining travellers' hobbies and interests, setting intermediate scenic spots, and considering the influence of the environment on tourists [34, 35]. However, because of changes in the tourism environment and the influence of tourists’ preferences, different scholars focus on different aspects, so that the problems to be solved exhibit different complexities. Therefore, many scholars have designed effective algorithms for different TRP problems, such as the firefly algorithm [36], casual meeting algorithm [37], evolutionary algorithm [38], ant colony algorithm (ACA) [39], and two-stage heuristic algorithm [40]. For example, Xia et al. [41] put forward an ant colony algorithm with multi-pheromone dynamic update (MPDACO), which includes an MPDACO local optimization algorithm and an MPDACO global optimization algorithm. The algorithm can dynamically adapt to various conditions, such as service failure and changes of QoS for services in the network. Simulation tests of the algorithm on service recommendation systems for tourism agencies show that it performs better for service selection than the basic ant colony algorithm and a GA. The foraging behavior of ants is highly similar to tourist travel behavior in scenic spots [42, 43]. Therefore, the ant colony algorithm performs favorably in solving the tourism route planning problem. In addition, some scholars have designed many heuristic algorithms based on complex network optimization problems [44,45,46,47]. Nowadays, more and more scholars fuse various types of knowledge with nature-inspired heuristic search algorithms to solve complex optimization problems in many scenarios [48,49,50,51,52].
In previous studies, researchers designed various knowledge models for all kinds of problems and fused the knowledge models with heuristic algorithms. In this process, the knowledge models are designed mainly to retain the knowledge attained during the solving process, so as to keep the subsequent iterations moving in a reasonable direction. The neighborhood search in the solving process of tourism route planning problems is very complex. In view of this, and to keep the iterations moving in a reasonable direction while preserving the diversity of populations, two knowledge models are established in the current research: a dynamic selection probability model and a dynamic solving priority model. The two models are fused with the ant colony algorithm.

The tourism route planning problem differs from the classical traveling salesman problem (TSP) mainly in that (1) tourists can turn back and even repeatedly pass through some scenic spots; (2) tourists are likely to visit only some scenic spots within a tourist attraction instead of traversing all of them; (3) the starting point may be the same as, or different from, the destination [7]. Currently, most studies propose optimization methods for tourism routes based on a single tourist group, and it is hard for the traditional ant colony algorithm to solve the proposed problem. Therefore, based on existing research, a decision-making model is designed according to the characteristics of the problem, and the existing ant colony algorithm is improved by introducing the bacterial foraging algorithm and knowledge models, thus solving the proposed problem more effectively.

Model establishment

Description of the problem

The tourism combinatorial optimization problem can be defined as follows: there are K optional routes for tourists in a tourist attraction; the tourists in the attraction are divided into n groups, and different groups are allowed to overlap; each group corresponds to one tourist route and only has a satisfaction value for that route. Subject to not exceeding the upper limit of the tourist carrying capacity of each route and to serving all tourists, groups are selected to maximize the overall tourist satisfaction along the K routes.

The assumptions of the model include:

  1. The related parameters for each of the K routes are known;

  2. Some information (including expected expenditure and age) of all tourists is known;

  3. Each tourist is only served once;

  4. The overall satisfaction of each group of tourists corresponds to a specific tourism route;

  5. The tourist groups on each route are not larger than the upper limit of tourist carrying capacity;

  6. Various factors (such as weather and environment) along tourist routes do not influence the tourist satisfaction.

It is supposed that there are n groups of data to be solved; in the worst case, the calculation needs to be performed \(n\cdot \left(n-1\right)\cdots 1\) times. Hence, the time complexity of exhaustive search is \(T\left(n\right)=O(n!)\), which grows faster than any polynomial as the input n increases. On this basis, the problem is treated as NP-hard.

Representation of variables of the problem

The parameters and variables of the established model are listed in Table 1.

Table 1 Model parameters and variables

By taking tourist groups as the object, the research is conducted by taking the maximization of the overall satisfaction of all tourist groups as the optimization objective on the premise that all tourists are served and the tourist groups in each route are not larger than the upper limit of tourist carrying capacity. Overall, a mathematical model for the above problem is established:

Objective function:

$${\mathrm{max}}\sum_{i=1}^{n}{x}_{i}{b}_{i}.$$
(1)

Constraints:

$${x}_{i},{y}_{ij},{P}_{ik}\in \{0,1\}$$
(2)
$$\sum_{i=1}^{n}{x}_{i}{\mathrm{com}}_{i}=\sum_{j=1}^{N}{T}_{j}$$
(3)
$$\sum_{i=1}^{n}{P}_{ik}{x}_{i}\le {U}_{k},\quad\forall k\in K$$
(4)
$$\sum_{i=1}^{n}{y}_{ij}{x}_{i}=1,\quad\forall j\in m$$
(5)
$$\sum_{i=1}^{n}{x}_{i}\le 1$$
(6)
$${b}_{i}={\omega }_{1}{t}_{i}+{\omega }_{2}{c}_{i}+{\omega }_{3}{\delta }_{i}.$$
(7)

The objective function (1) represents the maximal overall satisfaction; constraint (2) denotes the data range of variables; constraint (3) means that all tourists must be served; constraint (4) represents the fact that the tourist groups in each optional route are not larger than the corresponding upper limit of tourist carrying capacity; constraint (5) means that each tourist only visits along a single route and constraint (6) means that each group is selected once at most. Constraint (7) represents the calculation formula of group i satisfaction.
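As a rough illustration of how objective (1) and constraints (4), (5), and (7) fit together, the following Python sketch evaluates one candidate selection of groups. The `Group` record, the weight values, and the helper names are hypothetical and introduced only for illustration; the meaning of the components `t`, `c`, and `delta` is defined in Table 1, and constraint (4) is read literally as a limit on the number of selected groups per route.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    members: frozenset  # tourists j covered by this candidate group (y_ij = 1)
    route: int          # route k the group is tied to (P_ik = 1)
    t: float            # components of Eq. (7); their meaning is given in Table 1
    c: float
    delta: float


def satisfaction(g, w=(0.4, 0.3, 0.3)):
    """Eq. (7): b_i = w1*t_i + w2*c_i + w3*delta_i (weights are illustrative)."""
    return w[0] * g.t + w[1] * g.c + w[2] * g.delta


def is_feasible(selected, tourists, capacity):
    """Check constraints (4) and (5) for the selected groups (those with x_i = 1)."""
    served = sorted(j for g in selected for j in g.members)
    # Constraint (5): every tourist is covered by exactly one selected group.
    if served != sorted(tourists):
        return False
    # Constraint (4): selected groups on route k must not exceed U_k.
    counts = Counter(g.route for g in selected)
    return all(n <= capacity.get(k, 0) for k, n in counts.items())


def objective(selected, w=(0.4, 0.3, 0.3)):
    """Objective (1): total satisfaction of the selected groups."""
    return sum(satisfaction(g, w) for g in selected)
```

A selection would be scored with `objective(selected)` only after `is_feasible(selected, tourists, capacity)` returns `True`.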

Algorithm design

The traditional ant colony algorithm is improved using the bacterial foraging algorithm and knowledge models. Because the problem is NP-hard, the following describes the algorithm and its improvements.

To improve solution efficiency, a meta-heuristic algorithm is used. The ant colony algorithm is an evolutionary algorithm that was originally developed by scientists observing the foraging behavior of ants in nature. Ants leave pheromones along a route while foraging, and other ants judge whether to select the route according to the concentration of pheromones. The more ants pass along a route, the higher the probability that other ants select it. Eventually, the ants cooperate through pheromone signals and gradually concentrate on a superior route.

Knowledge models

As a global optimization algorithm, the ant colony algorithm performs poorly in local search ability and convergence speed. To improve the solving speed and quality of the algorithm, two knowledge-based models are introduced in this paper: a knowledge-based dynamic selection probability and a knowledge-based dynamic solution priority. After the algorithm completes an iteration, the knowledge-based models extract knowledge from the optimal solution obtained in the current generation and use it to guide the subsequent search. At the same time, to retain the guiding ability of the knowledge models, this paper also modifies the pheromone update rules of the algorithm. In summary, the operation mechanism of the knowledge-based ant colony algorithm designed in this paper is shown in Fig. 1.

Fig. 1 The operation mechanism of knowledge-based ant colony algorithm

Definition of pheromone matrix

Because the order in which ants select routes has no influence on the overall satisfaction, the pheromone matrix is defined as 1 × n. This avoids the influence of existing solutions on the subsequent solution process and increases the rate of convergence of the algorithm.

Rules on route transfer

A random number \(q\in (0,1)\) is generated when selecting a route. Given \(q\le {q}_{0}\), the group \({m}^{*}\) with the maximum \({P}_{m}(t)\) is selected, that is, \({m}^{*}\) is calculated as follows:

$${m}^{*}={\mathrm{argmax}}_{m\in {\mathrm{allowed}}_{m}}[{\tau }_{m}^{\alpha }\left(t\right)*{\eta }_{m}^{\beta }\left(t\right)*1/{\mathrm{Ti}}_{m}^{\gamma }(t)].$$

Otherwise, in the case of \(q>{q}_{0}\), a group is stochastically selected through roulette, where the probability that ants select route j at time t is calculated as follows:

$$ P_{j}\left(t\right)=\begin{cases}\dfrac{\tau_{j}^{\alpha}\left(t\right)*\eta_{j}^{\beta}\left(t\right)*1/{\text{Ti}}_{j}^{\gamma}\left(t\right)}{\sum_{s\in {\text{allowed}}_{m}}\tau_{s}^{\alpha}\left(t\right)*\eta_{s}^{\beta}\left(t\right)*1/{\text{Ti}}_{s}^{\gamma}\left(t\right)}, & j\in {\text{allowed}}_{m}\left(t\right)\\ 0, & j\notin {\text{allowed}}_{m}\left(t\right)\end{cases} $$

where \(P_{j} \left( t \right)\) is the probability that ants select route j at time t; α, β, and γ are the importance factor of pheromones, the importance factor of heuristic values, and the importance of the knowledge model, respectively; \({\text{allowed}}_{m} \left( t \right)\) is the set of routes to which ants are allowed to transfer at time t; \(\eta_{j} \left( t \right) = b_{j}\) is the heuristic value of route j at time t; \({\text{Ti}}_{j} \left( t \right)\) is the number of times path j has been selected in the optimal solution of each generation up to time t; \(\Delta c\) is the update parameter of the route selection times; and \(\tau_{j} \left( t \right)\) is the pheromone level on route j at time t.

When the algorithm completes one iteration, it needs to update the number of times each path in the contemporary optimal solution is selected. The rules are as follows:

$${\mathrm{Ti}}_{{n}^{*}}\left(t+1\right)={\mathrm{Ti}}_{{n}^{*}}\left(t\right)+\Delta c,{n}^{*}\in \text{bg}$$
$$\text{bg}=g{s}_{{\mathrm{argmax}}[{\mathrm{S}}(t)]}.$$

where \({\mathrm{S}}(t)\) represents the set of total satisfaction values obtained by the ants at time t, \(\text{bg}\) represents the contemporary optimal path, and \(g{s}_{m}\) represents the path sought by ant m.

Additionally, if the set \({\mathrm{allowed}}_{m}(t)\) of optional groups is empty while some tourists are still not served, the partial solution should be cleared and a new solution generated under constraint (3).
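The route transfer rule above can be sketched in Python as follows. This is a minimal sketch, not the authors' MATLAB implementation: the function name `select_group`, the default values of `alpha`, `beta`, and `q0`, and the assumption that every \({\text{Ti}}_{j}\) starts at a positive value (so the division is defined) are all illustrative; `gamma=3.0` follows the importance factor of the knowledge model reported in the parameter sensitivity analysis.

```python
import random


def select_group(allowed, tau, eta, ti, alpha=1.0, beta=2.0, gamma=3.0,
                 q0=0.5, rng=random):
    """Pick one candidate group index from `allowed`.

    weight_j = tau_j^alpha * eta_j^beta / Ti_j^gamma, so groups that already
    appeared often in past best solutions (large Ti_j) are damped.
    Assumes every ti[j] > 0 (e.g. initialised to 1).
    """
    if not allowed:
        # Mirrors the text: an empty candidate set means the partial
        # solution has to be cleared and rebuilt by the caller.
        raise ValueError("no selectable group left")
    weights = [tau[j] ** alpha * eta[j] ** beta / ti[j] ** gamma
               for j in allowed]
    if rng.random() <= q0:
        # Exploitation: take the group with the largest weight.
        best_pos = max(range(len(allowed)), key=weights.__getitem__)
        return allowed[best_pos]
    # Exploration: roulette-wheel selection proportional to the weights.
    return rng.choices(allowed, weights=weights, k=1)[0]
```

For example, `select_group([0, 1, 2], tau, eta, ti, q0=1.0)` always returns the candidate with the largest weight, whereas smaller values of `q0` leave more of the choice to the roulette wheel.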

Knowledge-based dynamic solution priority model

Under assumption 3 above (each tourist is only served once), and considering that the solution object of this paper is the tourist group, the initial transfer path selected by the algorithm has a large impact on the subsequent solution space. To address this, this paper sets the tourists whose group had poor average satisfaction in the previous generation as the priority solution objects, which is conducive not only to improving the solving ability of the algorithm, but also to ensuring the tourism experience of tourists.

Based on this, this paper introduces the knowledge-based dynamic solution priority to improve the algorithm. After completing an iteration, the average group satisfaction of each tourist is calculated based on the contemporary bg. The group average satisfaction of tourist j at time t, \({\mathrm{avg}}_{j}(t)\), and the group average satisfaction record of tourist j, \({\mathrm{Rd}}_{j}(t)\), are calculated as follows:

$$ {\text{avg}}_{j} \left( t \right) = \frac{{b_{i} }}{{{\text{tn}}_{i} }},\quad y_{ij} = 1, \, i \in \text{bg} $$
$$ {\text{Rd}}_{j} \left( {t + 1} \right) = \{ {\text{Rd}}_{j} \left( t \right),{\text{avg}}_{j} \left( t \right)\} . $$

Here, \({\mathrm{tn}}_{i}\) represents the number of tourists in tourist group i. Starting from the second iteration, after the average group satisfaction of each tourist has been calculated, the group average satisfaction at time t is compared with the record up to time t − 1. If \({\mathrm{avg}}_{j}\left(t\right)\le \min {\mathrm{Rd}}_{j}(t-1)\), then the initial transfer path of the first \(m/2\) ants of the ant colony algorithm at time t + 1 only considers the tourist groups containing tourist j.
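A small Python sketch may make the bookkeeping of the priority model concrete. The function names, the representation of bg as a list of `(members, b_i)` pairs, and the fallback to the full candidate list when no group contains a priority tourist are assumptions made for illustration only.

```python
def update_priority(best_groups, records):
    """Update Rd_j and return the tourists flagged as priority.

    best_groups: list of (members, b_i) pairs for the groups in bg.
    records: dict mapping tourist j to the list Rd_j of past averages
             (modified in place).
    """
    priority = set()
    for members, b in best_groups:
        avg = b / len(members)          # avg_j(t) = b_i / tn_i
        for j in members:
            history = records.setdefault(j, [])
            # The comparison only applies once a history exists,
            # i.e. from the second iteration onwards.
            if history and avg <= min(history):
                priority.add(j)
            history.append(avg)         # Rd_j(t+1) = {Rd_j(t), avg_j(t)}
    return priority


def initial_candidates(ant_index, n_ants, groups, priority):
    """Candidate groups for an ant's first move.

    The first n_ants // 2 ants are restricted to groups containing a
    priority tourist; the remaining ants may start from any group.
    """
    everything = list(range(len(groups)))
    if ant_index < n_ants // 2 and priority:
        restricted = [i for i in everything if groups[i] & priority]
        if restricted:                  # illustrative fallback when empty
            return restricted
    return everything
```

In use, `update_priority` would be called once per iteration with the current bg, and its result passed to `initial_candidates` when the next generation of ants makes its first selection.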

Rules on pheromone updating

To retain the knowledge matrix’s ability to guide the ant colony algorithm, this paper refers to the pheromone updating rules in [53], which update the global pheromone only after the algorithm completes an iteration. When the algorithm completes an iteration, the maximum and minimum satisfaction values, Smax and Smin, are found. To make the algorithm converge, the pheromone on all paths is first updated by global evaporation:

$${\tau }_{n}\left(t+1\right)=\left(1-\rho \right)*{\tau }_{n}\left(t\right).$$

After performing global pheromone evaporation, the pheromone on the route of the optimal solution is updated according to the following formula:

$${\tau }_{{n}^{*}}(t+1)={\tau }_{{n}^{*}}(t+1)+\rho \Delta {\tau }_{{n}^{*}}(t),\quad {n}^{*}\in \text{bg}$$

where \(\rho \) refers to the global pheromone evaporation parameter, \({\tau }_{n}(t)\) represents the pheromone concentration on route n at time t, and \(\Delta {\tau }_{{n}^{*}}(t)\) represents the pheromone increment on the route of the optimal solution at time t, which is calculated as follows:

$$\Delta {\tau }_{{n}^{*}}\left(t\right)=\frac{Q}{S{\mathrm{max}}-S{\mathrm{min}}}.$$
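The two-stage update (global evaporation, then a deposit on the groups of the best solution) can be sketched as below. The function name and the guard for the case where Smax equals Smin are assumptions added for illustration; the defaults `rho=0.3` and `q=3500.0` follow the evaporation parameter and pheromone release listed in the selected parameter set, with Q read here as the pheromone release.

```python
def update_pheromone(tau, best, s_max, s_min, rho=0.3, q=3500.0):
    """Return the pheromone vector after one global update.

    tau:   current pheromone value of each candidate group (1 x n).
    best:  indices of the groups in the iteration's best solution (bg).
    """
    # Global evaporation on every group.
    new_tau = [(1.0 - rho) * value for value in tau]
    spread = s_max - s_min
    if spread <= 0:
        # Assumption: skip the deposit when all ants scored the same,
        # since the increment Q / (Smax - Smin) would be undefined.
        return new_tau
    delta = q / spread
    # Deposit only on the groups of the best solution.
    for n in best:
        new_tau[n] += rho * delta
    return new_tau
```

With `tau = [1.0, 1.0, 1.0]`, `best = [1]`, `s_max = 45`, and `s_min = 10`, the increment is 3500 / 35 = 100, so the updated values are approximately 0.7, 30.7, and 0.7.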

Bacterial foraging algorithm mechanism

Similar to the ant colony algorithm, the bacterial foraging algorithm is a bionic optimization algorithm, proposed to simulate the foraging process of Escherichia coli. E. coli shows three typical behavior modes (i.e., chemotaxis, reproduction, and elimination and dispersal) in the foraging process. Chemotaxis means that bacteria seek areas with abundant food through tumbling and running; reproduction means that poor bacteria are eliminated to increase the proportion of excellent bacteria; elimination and dispersal means that bacteria die or migrate to other positions due to changes in their living environment. Performing elimination and dispersal in the algorithm is conducive to avoiding local optima and enhancing the solution ability of the algorithm. The solution process of the traditional bacterial foraging algorithm is shown in Fig. 2.

Fig. 2 Solution of the traditional bacterial foraging algorithm

In this problem, the tourists are divided in advance into several groups corresponding to specific routes. After a solution is attained using the ant colony algorithm, the generated solution is improved. Owing to constraint (4), and because each group of data corresponds to only one route, the traditional bacterial foraging algorithm is not directly applicable. Therefore, the following modifications are made to it. The chemotaxis, elimination and dispersal steps are modified such that the existing solutions obtained through the ant colony algorithm are stochastically interrupted and single points are allowed when generating the rest of the new solutions. The reproduction step is modified so that the optimal solution obtained through the bacterial foraging algorithm is compared with that attained through the ant colony algorithm, and the superior one is taken to update all pheromones. An example of optimization is shown in Fig. 3.

Fig. 3 An example of bacterial foraging algorithm

Chemotaxis of bacteria

The chemotaxis of bacteria aims to make bacteria tumble and run towards the area with abundant food to search for the optimal solution. Based on a random function, the combination of tourist groups solved by employing the ant colony algorithm is interrupted to imitate this bacterial tumbling. After eliminating the primary solutions from the range of optional routes, the existing solutions are regenerated to mimic bacterial running after tumbling.

The regeneration of solutions involves two strategies. The first is stochastic generation through roulette, with the selection probability calculated as follows:

$$ P_{j}\left(t\right)=\begin{cases}\dfrac{\tau_{j}^{\alpha}\left(t\right)*\eta_{j}^{\beta}\left(t\right)*1/{\text{Ti}}_{j}^{\gamma}\left(t\right)}{\sum_{s\in {\text{allowed}}_{m}}\tau_{s}^{\alpha}\left(t\right)*\eta_{s}^{\beta}\left(t\right)*1/{\text{Ti}}_{s}^{\gamma}\left(t\right)}, & j\in {\text{allowed}}_{m}\left(t\right)\\ 0, & j\notin {\text{allowed}}_{m}\left(t\right)\end{cases} $$

where \({P}_{j}(t)\) is the probability that bacteria select route j at time t; α, β, and γ are the importance factor of pheromones, the importance factor of heuristic values, and the importance of the dynamic selection probability, respectively; \({\text{allowed}}_{m}(t)\) is the set of routes to which bacteria are allowed to transfer at time t; \({\eta }_{j}(t)\) is the heuristic value of route j at time t; \({\text{Ti}}_{j}(t)\) is the number of times route j has been selected up to time t; and \({\tau }_{j}(t)\) is the pheromone level on route j at time t.

The other strategy is to stochastically select a route based on the optional groups without considering pheromones and heuristic values.

To guarantee the search ability of the algorithm, both strategies for selecting a route are used: given a threshold \({P}_{b}\), a random number \(R\) is generated using the random function before each selection. If \(R>{P}_{b}\), the first strategy is used to select the route; otherwise, the second is used.
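The tumble-and-run step can be sketched in Python under several simplifying assumptions: each ant's solution is a list of group indices, a group is a non-empty `frozenset` of tourists, the interruption keeps each selected group with a hypothetical probability `keep_prob`, route capacities are ignored, and the function returns `None` when no candidate group fits the remaining tourists. The `weight` list stands for the numerator of the roulette formula above and is assumed to be precomputed.

```python
import random


def chemotaxis(solution, groups, tourists, weight, p_b=0.5, keep_prob=0.5,
               rng=random):
    """Interrupt a solution at random and regenerate the missing part."""
    # Tumble: randomly break up the existing combination of groups.
    kept = [i for i in solution if rng.random() < keep_prob]
    dropped = set(solution) - set(kept)
    remaining = set(tourists) - {j for i in kept for j in groups[i]}
    new_solution = list(kept)
    # Run: rebuild a cover of the remaining tourists, excluding the
    # groups that were just removed from the original solution.
    while remaining:
        allowed = [i for i, g in enumerate(groups)
                   if i not in dropped and g <= remaining]
        if not allowed:
            return None  # sketch only: the caller discards this attempt
        if rng.random() > p_b:
            # Strategy 1: roulette on pheromone/heuristic weights.
            pick = rng.choices(allowed,
                               weights=[weight[i] for i in allowed], k=1)[0]
        else:
            # Strategy 2: uniform random choice among the candidates.
            pick = rng.choice(allowed)
        new_solution.append(pick)
        remaining -= groups[pick]
    return new_solution
```

Including single-tourist groups among `groups` plays the role of the single points mentioned in the text, since they allow the regeneration to complete more often.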

Reproduction of bacteria

The purpose of reproduction is to increase the proportion of excellent individuals and reduce that of inferior individuals in the bacterial community. Here, the bacterial foraging algorithm and the ant colony algorithm are combined: the solution obtained after performing the chemotaxis of bacteria is compared with the historical optimal solution, and the superior one is taken to update all pheromones. This simulates reproduction and strengthens the capacity to obtain a superior solution in subsequent operations (Fig. 4).

Fig. 4 Reproduction by the bacterial foraging algorithm

Elimination and dispersal of bacteria

In terms of elimination and dispersal, transferring bacteria to other positions in the solution space helps the bacterial foraging algorithm escape from local optima. During chemotaxis, the stochastically interrupted solution is allowed to contain single points in order to simulate the elimination and dispersal of bacteria. The presence of single-point solutions obtained by the interrupting operation minimizes the restriction that the existing solutions place on the solution space and enlarges the search range.

Additionally, after updating the ant colony algorithm using the bacterial foraging algorithm, the pheromones corresponding to the optimal solution are checked and updated to increase the probability of escaping from a suboptimal solution in the subsequent solving process: when a solution superior to the existing solution appears during calculation using the bacterial foraging algorithm, the pheromones of the groups contained in the solution are compared with those of the existing solution. If pheromones of the groups are lower than \(1/u\) of the maximum of the current pheromones, they are set to \(1/u\); otherwise, the pheromones are not processed.
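The pheromone check can be sketched as a simple floor operation. This sketch assumes that "set to \(1/u\)" means raising the value to \(1/u\) of the current maximum pheromone; the function name and the default `u=4.0` are illustrative, since the value of u is not stated in the text.

```python
def lift_pheromone(tau, improved, u=4.0):
    """Raise the pheromone of groups in an improved solution to a floor.

    tau:      current pheromone value of each candidate group.
    improved: indices of the groups contained in the improved solution.
    Assumption: the floor is max(tau) / u.
    """
    floor = max(tau) / u
    return [max(value, floor) if n in improved else value
            for n, value in enumerate(tau)]
```

For instance, with `tau = [8.0, 1.0, 0.5, 3.0]`, `improved = {1, 3}`, and `u = 4.0`, the floor is 2.0, so only the second value is raised and the result is `[8.0, 2.0, 0.5, 3.0]`.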

Algorithm procedure of the knowledge-based hybrid ant colony algorithm

The solution steps of the ant colony algorithm improved by employing the bacterial foraging algorithm and knowledge models are as follows:

  • Step 1: Start the procedure, initialize the solution parameters and counters, and calculate the tabu list based on the groups of tourists to be solved.

  • Step 2: Initialize the ant population and set K = 1.

  • Step 3: Ant K finds the optional paths according to the tabu list.

  • Step 4: Calculate the probability of selecting each path according to the pheromone concentration, satisfaction degree, and dynamic selection probability, and use roulette to select a path.

  • Step 5: If all tourists have been assigned, go to Step 6; otherwise, go to Step 3.

  • Step 6: If all ants have completed their path selections, calculate the fitness value of all ants, update the current optimal value, and go to Step 7; otherwise, set K = K + 1 and go to Step 3.

  • Step 7: Randomly interrupt the solution generated by each ant, and determine the regeneration method by comparing a random number with the threshold.

  • Step 8: Calculate the fitness value of all improved solutions, compare the optimal solution obtained by the ant colony algorithm with the best of the improved solutions, and take the one with the larger value as the optimal solution of this iteration.

  • Step 9: Update the global pheromone based on the optimal solution of this iteration.

  • Step 10: If the iteration upper limit is reached, go to Step 13; otherwise, go to Step 11.

  • Step 11: Obtain the tourist groups in the current optimal solution and update the number of times each has been selected.

  • Step 12: Calculate the group satisfaction per tourist, compare it with the historical group satisfaction, set the priority solution objects according to the result, and then go to Step 2 for the next iteration.

  • Step 13: Output the results (Fig. 5).

Fig. 5 Flow chart of knowledge-based hybrid ant colony algorithm

Simulation results

Parameter setting and solution environment

The solution code is programmed in MATLAB™, and the solution aims to maximize the overall satisfaction. The parameters used in the solution process are listed in Table 2.

Table 2 Parameter setting

To verify the effect of the parameters set in Table 2 on the improved ant colony algorithm, the code is run with these parameters in a Windows 10 environment. The detailed environment settings are listed in Table 3.

Table 3 Solution environment

Generation of examples

Taking a travel agency as the direct data source, this paper preprocessed the data of tourists who have registered with the travel agency to obtain data of 1000 related tourists. Then, using the statistical ideas of hierarchical clustering and random sampling, the preprocessed data are processed again to obtain 15 data sets for simulation experiments in this paper. Specifically, it includes the following three steps:

  1. First, the raw data obtained from the travel agency are preprocessed: only the data of tourists who registered and completed a trip are retained, and tourists who registered but did not travel are deleted, giving a tourism data set of 1000 tourists;

  2. To realize the centralized management of tourists of different age groups through age division and improve overall tourist satisfaction [54], the processed data set of 1000 tourists is hierarchically clustered by age into the four segments (0, 18], [19, 40], [41, 60], and [61, +∞), giving data sets for four age groups of tourists;

  3. According to the proportion of each age group in the total number of tourists, each age group is randomly sampled, and a total of 15 tourists are finally obtained as the objects to be combined.
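Steps 2 and 3 amount to stratified sampling with proportional allocation, which can be sketched in Python as follows. The function names, the use of integer ages, and the largest-remainder rule for rounding the quotas to a total of 15 are assumptions for illustration; the source does not state how fractional quotas are resolved.

```python
import random

# Age segments from step 2, written as inclusive integer bounds.
BINS = [(1, 18), (19, 40), (41, 60), (61, float("inf"))]


def age_bin(age):
    """Return the index of the age segment that contains `age`."""
    for idx, (low, high) in enumerate(BINS):
        if low <= age <= high:
            return idx
    raise ValueError(f"age out of range: {age}")


def stratified_sample(ages, total=15, rng=random):
    """Sample `total` tourist ids, proportionally to age-group sizes."""
    strata = [[] for _ in BINS]
    for tourist_id, age in enumerate(ages):
        strata[age_bin(age)].append(tourist_id)
    # Proportional quotas; leftovers go to the largest remainders.
    exact = [total * len(s) / len(ages) for s in strata]
    quota = [int(e) for e in exact]
    by_remainder = sorted(range(len(strata)),
                          key=lambda i: exact[i] - quota[i], reverse=True)
    for i in by_remainder[: total - sum(quota)]:
        quota[i] += 1
    # Draw without replacement inside each age group.
    return [rng.sample(s, q) for s, q in zip(strata, quota)]
```

Calling `stratified_sample(ages)` on the list of 1000 ages returns four lists of tourist indices, one per age segment, whose lengths sum to 15.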

The tourism route planning problem studied in this article is essentially a combinatorial optimization problem: tourists are allocated into different combinations and suitable travel routes are formulated for the different combinations, so that tourists obtain the best travel experience at a lower price within a certain period of time. Based on the 15 tourists obtained through the above processing, and referring to the data combination method in [55], 300 sets and 1300 sets of data to be solved are obtained, respectively. The 300 sets of data are then randomly distributed over 2, 3, and 4 optional paths, the 1300 sets of data are randomly allocated to 3 optional paths, and the tourism satisfaction value of each tourist combination is calculated according to constraint (7) above. Some of the tourist combinations and their satisfaction values are shown in Table 4.

Table 4 Typical data
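As a rough illustration of how candidate tourist combinations might be generated from the 15 sampled tourists and spread over the optional paths, the sketch below enumerates subsets and assigns each chosen subset to a random route. It does not reproduce the combination method of [55]: the group-size bounds (2 to 4), the function names, and the uniform random allocation are all assumptions made for the example, and the attribute constraints on age and preference are omitted.

```python
import itertools
import random


def candidate_groups(tourist_ids, min_size, max_size):
    """Yield every subset of tourists whose size lies in [min_size, max_size]."""
    for size in range(min_size, max_size + 1):
        yield from itertools.combinations(tourist_ids, size)


def sample_and_allocate(tourist_ids, n_groups, n_routes,
                        min_size=2, max_size=4, seed=0):
    """Pick `n_groups` distinct candidate groups and give each a random route.

    Returns a list of (group, route_index) pairs. The size bounds are
    hypothetical defaults, not values taken from the paper.
    """
    rng = random.Random(seed)
    pool = list(candidate_groups(tourist_ids, min_size, max_size))
    chosen = rng.sample(pool, n_groups)
    return [(group, rng.randrange(n_routes)) for group in chosen]
```

With 15 tourists and sizes 2 to 4 there are 1925 candidate groups, so both the 300-set and the 1300-set instances can be drawn without repetition under these assumed bounds.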

Test results and analysis

Parameter sensitivity analysis

Parameters of the model and algorithm were analyzed from two aspects. First, to ensure the accuracy of the parameters of the established mathematical model and algorithm, 100 tests were conducted on each parameter set within a certain deviation range at two data sizes, and the averages over the 100 tests were taken as the test results and compared. Second, to verify that the difference between the results of the selected parameter set and those of the other parameter sets is not a low-probability event, Student’s t-tests were conducted on the comprehensive cost computed for all parameter sets. The results of the parameter sensitivity tests are listed in Table 5.

Table 5 Comparison of parameter sensitivity analysis results

The table indicates that although increasing the number of neighborhood searches, the neighborhood preserving space, and the number of tabu searches can ensure the quality of the solution to a certain extent, it lowers the rate of convergence of the algorithm. The importance factors of the pheromone and heuristic values guide the ACA in constructing feasible solutions. Importance factors that are too large may trap the algorithm in a local optimum, while factors that are too small provide little guidance. Raising the guidance threshold of the knowledge models weakens the convergence ability of the algorithm to some extent and cannot guarantee the quality of the guidance these models provide. If the pheromone evaporation parameter is too small, too much pheromone is retained on the various routes, which slows convergence; if it is too large, the algorithm is trapped in a local optimum and the search quality cannot be guaranteed.
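The roles of the importance factors and the evaporation parameter can be illustrated with the standard Ant System rules, in which an option is chosen with probability proportional to pheromone^alpha × heuristic^beta and pheromone decays by a factor (1 − rho) each iteration. This is a minimal sketch of the textbook rules, assumed here for illustration; the names `alpha`, `beta`, and `rho` and both functions are not taken from the paper, whose exact update formulas may differ.

```python
def selection_probabilities(tau, eta, alpha, beta):
    """Textbook Ant System choice rule.

    tau: pheromone levels, eta: heuristic values (one entry per option).
    alpha and beta are the importance factors of pheromone and heuristic.
    """
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau, eta)]
    total = sum(weights)
    return [w / total for w in weights]


def evaporate_and_deposit(tau, rho, deposits):
    """Global pheromone update: keep a (1 - rho) share, then add new deposits.

    A small rho retains most of the old pheromone; a large rho erases it fast.
    """
    return [(1.0 - rho) * t + d for t, d in zip(tau, deposits)]
```

For two options with pheromone levels 1 and 2 and equal heuristic values, raising `alpha` from 1 to 3 moves the probability of the stronger option from 2/3 to 8/9, which is the concentration effect that can trap the search when the importance factor is too large.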

Student’s t-tests were carried out on the 11 parameter sets at the 2 data sizes in the table. The tests indicate that the optimal results attained using the selected parameter set are not a low-probability event. To obtain a better solution in a shorter time, the parameter set selected in this paper is: path selection times update parameter, 1; importance factor of the knowledge model, 3; pheromone release, 3500; global pheromone evaporation parameter, 0.3; maximum number of iterations, 500; number of ants, 34.
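A two-sample Student’s t statistic of the kind used in such a comparison can be computed as in the sketch below. It assumes two independent samples with equal variances (the pooled form); the paper does not say which variant of the test was applied, so this choice and the function name are assumptions.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes independent samples with
    equal population variances.
    """
    na, nb = len(a), len(b)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df
```

Here `a` and `b` would hold the comprehensive costs from the 100 tests of two parameter sets; the p-value then follows from the t distribution with the returned degrees of freedom, for example from a statistics library or a table.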

Comparison of different numbers of optional routes

The study focuses on grouping tourists in advance and computing the solution that maximizes overall satisfaction. To verify the applicability of the algorithm to different tourist attractions, the number of optional routes is varied while the upper limit of tourist groups on each route is fixed at five. The 300 groups of tourist data are stochastically allocated to two, three, and four routes, and the resulting data groups are input to the solver. The optimal solution equals 10,359.8030 in all three cases. The code is executed 100 times, and the mean of the maximum satisfactions and the number of times the optimal solution is found are recorded as follows (Tables 6, 7, 8).

Table 6 Comparison of the means of the maximum satisfactions with different numbers of optional routes
Table 7 Comparison of the numbers of the optimal solution occurring when there are different numbers of optional routes
Table 8 Comparison of solution times with different numbers of optional routes (unit: s)

Both before and after the number of optional routes is increased, improving the ant colony algorithm with the bacterial foraging algorithm enhances its performance: the number of optimal solutions and the mean of the maximum satisfactions are larger after the improvement than before. Although the improvement increases the solution time, the algorithm remains practical.
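The measurement protocol used in these comparisons (repeat the solver, then record the mean of the best satisfactions, the number of runs that reach the known optimum, and the mean solution time) can be sketched as follows. The harness and its names are assumptions, and `toy_solver` is a hypothetical random stand-in, not any of the algorithms compared in this paper.

```python
import random
import statistics
import time


def summarize_runs(best_values, known_optimum, tol=1e-6):
    """Mean of the per-run best values and count of runs hitting the optimum."""
    hits = sum(1 for v in best_values if abs(v - known_optimum) <= tol)
    return {
        "runs": len(best_values),
        "mean_best": statistics.fmean(best_values),
        "optimal_hits": hits,
    }


def run_experiment(solver, n_runs, known_optimum, base_seed=0):
    """Call solver(seed) n_runs times; also record the mean wall-clock time."""
    best_values, seconds = [], []
    for i in range(n_runs):
        start = time.perf_counter()
        best_values.append(solver(base_seed + i))
        seconds.append(time.perf_counter() - start)
    summary = summarize_runs(best_values, known_optimum)
    summary["mean_seconds"] = statistics.fmean(seconds)
    return summary


OPTIMUM = 10359.8030  # optimal satisfaction value reported in the text


def toy_solver(seed):
    """Hypothetical stand-in: returns the optimum about half of the time."""
    rng = random.Random(seed)
    return OPTIMUM if rng.random() < 0.5 else OPTIMUM - rng.uniform(1.0, 50.0)
```

Replacing `toy_solver` with a real solver that returns its best satisfaction for a given seed, `run_experiment(solver, 100, OPTIMUM)` yields the three quantities of the kind reported in Tables 6, 7, and 8.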

Comparison under different scales of problems

To verify the applicability of the algorithm under different scales of tourist groups, the comparison test for solutions is conducted by substituting 1300 and 300 groups of calculated tourist data. The number of optional routes and the upper limit of tourist groups in each route are set to five, respectively. The optimal solutions under 1300 and 300 groups of tourist data are 11,125.80585 and 10,359.8030, respectively. The code is executed 100 times and the means of maximum satisfactions and the number of optimal solutions occurring during each operation are recorded as follows (Tables 9, 10, 11):

Table 9 Comparison of the average satisfactions with different scales of tourist groups to be solved
Table 10 Comparison of the numbers of the optimal solutions with different scales of tourist groups to be solved
Table 11 Comparison of the solution times with different scales of tourist groups to be solved (unit: s)

The experimental results show that as the number of groups to be solved grows, the lead in average satisfaction of the ant colony algorithm improved by the bacterial foraging algorithm over the ordinary ant colony algorithm widens from about 1.84% to about 2.78%. The lead of the hybrid ant colony algorithm over the traditional ant colony algorithm widens from about 3.2% to about 4.54%.
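One plausible reading of the percentage lead quoted above is the relative difference between the mean satisfaction of an improved algorithm and that of the baseline. The helper below encodes that reading; the definition and the function name are assumptions, as the paper does not give the formula explicitly.

```python
def relative_lead(candidate_mean, baseline_mean):
    """Percentage by which candidate_mean exceeds baseline_mean (assumed definition)."""
    return 100.0 * (candidate_mean - baseline_mean) / baseline_mean
```

Under this definition, a candidate mean of 102 against a baseline mean of 100 (made-up numbers, not values from the tables) gives a lead of 2%.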

Although the number of optimal solutions decreases and the solution time increases after the solution space is expanded, comparing the number of optimal solutions before and after the expansion still shows that the knowledge-based hybrid ant colony algorithm outperforms both the ordinary ant colony algorithm and the ant colony algorithm improved by the bacterial foraging algorithm.

Comparison test with different upper limits on the number of tourist groups

To verify the applicability of the algorithm in terms of the upper limit of tourist groups within attractions, 300 groups of tourist data are distributed to 4 routes and the upper limits of tourist groups are set to 5 and 3 to obtain the solutions and compare the data. The optimal solutions are both 10,359.8030. The code is executed 100 times and the means of maximum satisfactions and the number of optimal solutions occurring during each operation are recorded as follows (Tables 12, 13, 14):

Table 12 Comparison of the average satisfactions with different scales of tourist groups to be solved
Table 13 Comparison of the number of optimal solutions with different scales of tourist groups to be solved
Table 14 Comparison of solution times with different scales of tourist groups to be solved (unit: s)

The experimental results show that when the number of groups to be solved is unchanged and the maximum reception limit is reduced, the traditional ant colony algorithm can no longer cope with the problem because of its poor local search ability, and its average satisfaction drops by about 0.7%. Relative to the traditional ant colony algorithm, the lead in average satisfaction of the ant colony algorithm improved by the bacterial foraging algorithm widens from about 2% to about 2.4%, and that of the knowledge-based hybrid ant colony algorithm widens from 2.76% to 3.3%. At the same time, the hybrid ant colony algorithm finds far more optimal solutions than the other two algorithms within an acceptable increase in time, and it shows better stability as the reception ceiling shrinks.

Algorithm performance analysis

Based on the above, algorithms that have performed well in recent years in solving TSPs and their variants were selected to verify the reliability and superiority of the designed knowledge-based ACA (KACA) in solving the problem. Experiments and comparisons were conducted at two data sizes, 300 and 1300, each with three routes and an upper limit of 5 on the capacity of each route. For convenience, the four algorithms selected for comparison are labelled MGGS [56], GATS [57], HGA [58], and HVNS [59]. The results are listed in Table 15:

Table 15 Comparison of results of different algorithms

As shown in the above table, because of the characteristics of the studied problem, algorithms that have been effective in recent years for TSPs, TSP variants, or vehicle routing problems show certain limitations on the problem of interest. Owing to the global search performance of the knowledge models and of the ACA itself, over 30 solution processes the devised knowledge-based ACA yields an average satisfaction and an optimum solution superior to those of the other 4 algorithms. The specific reasons are listed below:

  1. In terms of the problem. Unlike traditional TSPs, the TRP problem studied has stricter constraints: subject to standards such as age and preference, the 15 tourists studied are combined in different ways to form the candidate tourist groups for the algorithm. Because the ages and preferences of different tourists show little regularity, the feasible solutions to the problem are sparsely distributed, which increases the complexity of the problem and places higher demands on the global and local search capacity of the solution algorithm.

  2. In terms of the solution algorithm. On one hand, the ACA, as a global optimization evolutionary algorithm, has strong global search capacity, so it is relatively suitable as the basic algorithm for solving the problem. On the other hand, because of the characteristics of the problem, the effective operators designed for TSPs in recent years fail to play their role here. This is mainly because the local optima of the problem lie deep in the solution space, which makes the four comparative algorithms prone to being trapped in a local optimum during the solution process. To address this, a problem-specific knowledge model was established. Within the basic framework of the ACA, the knowledge model uses valuable knowledge gained by the algorithm during iteration to guide subsequent iterations. This prevents the algorithm from being trapped in a local optimum, improves its ability to identify local optima, and therefore guides it toward the global optimum.

Conclusion

With the rapid growth of tourism demand, it is increasingly difficult for travel agencies and tourist attractions to meet the complex needs of tourists with limited tourism resources. In view of this, the paper considers the age of tourists, their acceptable play time, and their acceptable play expenses, and builds a tourism portfolio optimization model, subject to certain constraints, whose goal is to maximize overall tourism satisfaction. To remedy the poor local search ability and slow convergence of the traditional ant colony algorithm, the paper introduces mechanisms such as the bacterial tendency operation, the replication operation, the migration operation, and the knowledge model. It designs a knowledge-based hybrid ant colony algorithm that draws on the bacterial foraging algorithm while retaining the robustness of the traditional ant colony algorithm, and demonstrates the effectiveness of the improved algorithm on data obtained by hierarchical clustering and random sampling. Unlike most studies, which enrich tourism resources through development models or improve tourism paths for the experience of individual tourists, this paper optimizes the overall satisfaction of tourists at the existing level of resources, and the optimization results give enterprises a reference scheme for tourism planning.

The tourism combination optimization and the hybrid ant colony algorithm proposed in this paper enrich the relevant research fields and offer a reference for the future allocation of tourism resources and the development of tourism. Based on the optimization results, travel agencies and other enterprises can obtain better combinations of tourists, thereby improving their service and profit levels. Tourist attractions can analyze their load levels from the optimization results and use them to further optimize the allocation of resources.

The model and algorithm proposed in this paper lay a theoretical foundation for decision support systems of tourism enterprises that need tourism combination optimization. Because tourist formation is complex, follow-up research can extend the work to the two-tier planning problem of “tourist group preparation and tourism portfolio optimization”, further explore and optimize decision-making schemes under the constraint of limited tourism resources, and provide more comprehensive and efficient theoretical decision support for tourism resource planning.