A Stochastic Dominance-Based Approach for Hotel Selection under Probabilistic Linguistic Environment

Abstract: Online reviews of hotels reflect tourists' perceptions and evaluations, and are becoming an important perspective for studying hotel selection. In this paper, we use a probabilistic linguistic term set (PLTS) to fully reveal the evaluation grades and the corresponding probability distribution in online hotel reviews. On this basis, we propose a novel stochastic dominance-based approach that combines stochastic dominance degrees of PLTSs with a stochastic multi-criteria acceptability analysis (SMAA) method that tolerates missing information. First-, second-, and third-order stochastic dominance degrees of PLTSs are calculated on the premise that the dominance relationships between PLTSs can be defined by first-, second-, and third-order stochastic dominance rules of PLTSs. Based on these foundations, five hotels are selected as alternatives in a case study to verify the validity and feasibility of the proposed approach. Finally, data analysis illustrates the influence of the parameter and the linguistic scale functions and shows how to choose appropriate parameter values. Furthermore, comparative analysis with other methods shows the stability of the proposed approach.


Introduction
With the development of network technology, various hotel reservation platforms have been designed to help people choose and book hotels online. The location, environment, price and service facilities of hotels of different star ratings are presented on these platforms in detail. In addition, the numerous reviews given by former tourists provide supplementary information about hotels [1]. In practice, several factors such as price and location are taken into consideration when tourists choose hotels online. In other words, hotel selection is a complicated multi-criteria decision-making problem in which tourists select favored hotels from hotel listings by weighing these factors [2]. Hence, many studies focusing on hotels' factors and on tourists' preferences and purposes have been conducted to help tourists make choices and to give managers insight into tourists' preferences [3].
From the point of view of decision-making theory, each tourist is also an evaluator, and factors are identified as criteria [2,4]. Online reviews, which are composed of ratings and natural language, are used to evaluate hotels with respect to multiple criteria [5,6]. Some problems must be solved first, such as how to extract valuable information from massive online reviews for further processing and how to describe this information. As can be seen on such websites, although different people give their reviews freely, the numerous reviews about a hotel criterion such as service are basically 'good', 'bad', or 'excellent'. In this situation, a probabilistic linguistic term set (PLTS), composed of linguistic terms and probability values, can be used to describe these reviews under different criteria correctly and precisely [7]. Subsequently, Gou and Xu [8] redefined some operational laws for PLTSs based on two equivalent transformation functions. Bai et al. [9] proposed a new possibility degree formula for ranking PLTSs. Wu and Liao [10] proposed a probability aggregation method to integrate individuals' subjective evaluations into group evaluations expressed as PLTSs. Based on PLTSs and the relevant theory, Song et al. [11] proposed a novel text representation model named Word2PLTS for short text sentiment analysis. The advantages of PLTSs have attracted researchers to apply them to different problems [4,12-17]. Considering the advantage of PLTSs in expressing massive online hotel reviews, PLTSs are used to describe these reviews in our study. Reviews containing natural language and ratings can be transformed into linguistic terms, and the frequency of each common review can be defined as the probability value of the corresponding linguistic term.
The explosive growth of online hotel reviews on these platforms aggravates the complexity of hotel selection problems. For example, famous hotels tend to be the preferred choice of tourists in most cases, and the number of online reviews for these hotels can exceed five thousand or even ten thousand. Comments about a hotel's service might be depicted by a PLTS $\{s_4(0.8755), s_3(0.1102), s_2(0.0093), s_1(0.0025), s_0(0.0025)\}$. In this PLTS, $s_4$, $s_3$, $s_2$, $s_1$ and $s_0$ stand for 'excellent', 'very good', 'average', 'poor', and 'terrible', respectively. A future tourist's experience of this hotel's service may be 'excellent' or 'poor'. Based on this idea, the linguistic terms are treated as discrete random variables, and the probability values of the corresponding linguistic terms can be defined as the probability distribution of these discrete random variables. A probabilistic linguistic multi-criteria decision-making (PLMCDM) problem can thereby be defined as a stochastic multi-criteria decision-making (SMCDM) problem. In addition, when evaluation values are random variables, how to calculate the weights of a hotel's criteria is another difficult issue. To deal with random variables, Martel and Zaras [18] considered a stochastic dominance set that consists of First-degree Stochastic Dominance (FSD), Second-degree Stochastic Dominance (SSD) and Third-degree Stochastic Dominance (TSD). For the weight problem, Lahdelma et al. [19] proposed a stochastic multi-criteria acceptability analysis (SMAA) method that deals with imprecise weight information by exploring the weight space and obtains the most preferred weights of criteria. Stochastic dominance and the SMAA method have been studied over the last two decades and shown to solve SMCDM problems effectively [20-22]. However, a PLMCDM method based on stochastic dominance and the SMAA method has not yet been proposed and used to solve actual hotel selection problems.
The above issue should be solved in stages. The processing stages and our contributions are summarized as follows.
(1) Online hotel reviews are generated naturally on the website. We treat the linguistic terms of a PLTS as discrete random variables, and take the probability values of the corresponding linguistic terms as the probability distribution of these discrete random variables.
(2) First-, second-, and third-order stochastic dominance rules of PLTSs are defined under different conditions. Moreover, first-, second-, and third-order stochastic dominance degrees of PLTSs are stated on the premise that the stochastic dominance rules hold. Some theorems and properties are also proved to explain the meanings and features of these rules and degrees.
(3) The quality of decision results depends on the accuracy of the information. Weights of criteria and evaluation values under different criteria on the website may often be missing. Thus, we construct a novel approach based on the SMAA method, which tolerates missing information, to deal with online hotel reviews and obtain reliable decision results.
On the basis of the above analysis, cumulative distribution functions, stochastic dominance rules and stochastic dominance degrees are defined. Then, a novel approach is proposed based on stochastic dominance degrees and the SMAA method. The remainder of this paper is organized as follows. Related literature is reviewed and analyzed in Section 2. Some basic research, including stochastic dominance rules and stochastic dominance degrees, is conducted in Section 3. A novel approach based on stochastic dominance degrees and the SMAA method is proposed in Section 4. A case study is presented in Section 5 to illustrate the feasibility and applicability of the proposed approach. Finally, originality, limitations, and further work are discussed in Section 6.

Studies on Transformation of Online Hotel Reviews
As an emerging type of online information resource, online hotel reviews have unique features, such as public availability, and have an important impact on hotel selection [23]. Based on online reviews, researchers have studied tourists' basic requirements and preferences regarding hotels [24], and travel platforms can offer hotel rankings to help tourists choose a satisfying hotel [25-27]. Online hotel reviews contain review ratings and review texts [28]. Review ratings are a quantitative evaluation, and review texts are an in-depth qualitative analysis of hotels [29]. Lengthier review texts are more useful and enjoyable than shorter ones [30]. However, the correlation between ratings and hotel listings can help tourists choose a hotel in a much simpler and more intuitive manner, and review texts that include less valuable information may have poor readability [29]. Thus, considering the determinacy of ratings and the complexity of texts, we focus on online hotel review ratings in the case study in Section 5.
Hotel ratings are expressed as 'five stars', 'four stars' and so forth. Although we can understand what these stars mean, we cannot use them directly in computation. Transforming hotel review ratings into computable values therefore plays an important role in solving hotel selection problems. Several researchers have studied this transformation, and some methods have been designed. Li et al. [31] normalized hotel rating scores into the range [0,1] before fitting them into a computational function. Analogously, Gavilan et al. [32] used a 1-10 scale and four quartiles to differentiate these ratings. However, the meanings of ratings are close to natural language. Therefore, to better preserve the comprehensive information of ratings, several fuzzy sets have been used to describe review ratings. Based on an analysis of hotel reviews, Yu et al. [33] collected linguistic assessment values given by evaluators, expressed these values with interval type-2 trapezoidal fuzzy numbers (IT2FNs), and then proposed an extended multi-attributive border approximation area comparison (MABAC) method based on the likelihood of IT2FNs. By utilizing the advantage of PLTSs in depicting large numbers of linguistic values, Peng et al. [34] defined a novel concept of a probabilistic linguistic integrated cloud (PLIC) and established a hotel decision support model based on the essential algorithms and distance measure of PLICs.

Studies on Stochastic Multi-Criteria Decision-Making Problem
The common methods used to solve SMCDM problems are the stochastic dominance method [35] and the SMAA method [19]. Because of the variability and complexity of actual SMCDM problems, extended approaches based on the stochastic dominance method and the SMAA method have been proposed and applied in various fields. Furthermore, our proposed approach is closely related to both methods. Studies on these two methods are therefore reviewed in detail in this section.
Stochastic dominance rules were defined to compare alternatives and construct dominance relations between them, and a stochastic dominance method was then proposed [35]. Martel and Zaras [18] considered a stochastic dominance set composed of FSD, SSD and TSD, and defined the three stochastic dominances in detail. To deal with project evaluation problems that involve both qualitative and quantitative information, Zaras [36] applied deterministic, stochastic, and fuzzy evaluations to build mixed-data multi-attribute dominance relations, and proposed a method based on the dominance-based rough set approach. Nowak [37] distinguished situations of strict and weak preference based on stochastic dominance rules and preference thresholds, and then constructed a ranking method based on the ELECTRE-III method. Subsequently, Nowak [38] utilized stochastic dominance to generate efficient actions, constructed rankings of actions with respect to criteria, and proposed a new interactive technique for discrete stochastic multi-attribute decision-making problems. However, an overall ranking of alternatives cannot be obtained from the dominance relationships between pairs of alternatives alone. Thus, Zhang et al. [39] defined the stochastic dominance degree to describe the degree to which one alternative dominates another, and developed a method based on PROMETHEE-II to obtain the overall ranking of alternatives. In addition, the concept of the stochastic dominance degree has been applied in different methods to solve actual stochastic decision-making problems [40,41].
Compared with the stochastic dominance method, the SMAA method, which explores the weight space, is suited to handling incomplete weight information and random variables [19]. In order to consider all ranks in the analysis, Lahdelma and Salminen [42] extended the original SMAA to the SMAA-2 method, which can be used to identify good compromise candidates. Based on ELECTRE III-type pseudo-criteria (double threshold model), Lahdelma and Salminen [43] further extended the original SMAA to the SMAA-3 method to deal with preference structure and inaccuracy of criteria. Along with these extended SMAA methods, other extensions and integrated methods that combine the SMAA method with classical decision methods have been proposed for different situations. Considering the whole space of preference parameters compatible with the decision maker's preference, Angilella et al. [44] integrated the SMAA method with the Choquet integral preference model to obtain robust recommendations. Corrente et al. [45] applied the SMAA method to the classical PROMETHEE method and to the bipolar PROMETHEE method. Okul et al. [46] developed the SMAA-TOPSIS method to analyze drug benefit-risk and select a machine gun. To address the uncertainties inherent in TODIM simultaneously, Zhang et al. [47] put forward the SMAA-TODIM method. Govindan et al. [48] proposed a hybrid SMAA-ELECTRE I method to explore all parameters of an outranking model compatible with incomplete preference information, and then selected a reverse logistics provider. A series of SMAA methods and SMAA-based methods have been used to handle real-life decision-making problems, and their capacity to deal with complex decision problems characterized by multiple conflicting criteria, uncertain criteria measurements, missing preference information and the participation of multiple decision makers has been demonstrated [49].

Basic Researches
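Some basic definitions of PLTSs are presented before the definitions of stochastic dominance rules and stochastic dominance degrees.
Definition 1. [7] Let $S = \{s_0, s_1, s_2, \ldots, s_{2T}\}$ be a linguistic term set (LTS). A PLTS can be defined as:
$$L(p) = \left\{ s_{\theta(k)}\left(p^{(k)}\right) \,\middle|\, s_{\theta(k)} \in S,\ p^{(k)} \ge 0,\ k = 1, 2, \ldots, \#L(p),\ \sum_{k=1}^{\#L(p)} p^{(k)} \le 1 \right\},$$
where $s_{\theta(k)}\left(p^{(k)}\right)$ is the linguistic term $s_{\theta(k)}$ associated with the probability $p^{(k)}$, and $\#L(p)$ is the number of all different linguistic terms in $L(p)$.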

Definition 2.
Given a PLTS $L(p) = \left\{ s_{\theta(k)}\left(p^{(k)}\right) \,\middle|\, s_{\theta(k)} \in S,\ p^{(k)} \ge 0,\ k = 1, 2, \ldots, \#L(p) \right\}$ with $\sum_{k=1}^{\#L(p)} p^{(k)} < 1$, the normalized PLTS $\bar{L}(p)$ can be defined by
$$\bar{L}(p) = \left\{ s_{\theta(k)}\left(\bar{p}^{(k)}\right) \,\middle|\, k = 1, 2, \ldots, \#L(p) \right\}, \qquad \bar{p}^{(k)} = \frac{p^{(k)}}{\sum_{l=1}^{\#L(p)} p^{(l)}}.$$
Definition 3. Let $S = \{s_i \mid i = 0, 1, \ldots, 2T\}$ be an LTS and $L(p) = \left\{ s_i\left(p^{(i)}\right) \,\middle|\, i = 0, 1, \ldots, 2T,\ s_i \in S \right\}$ be a normalized PLTS. In view of the fact that a normalized PLTS contains several linguistic terms and the related probability values, $L(p)$ is treated as a discrete random variable. Then, the discrete probability distribution function $P(s_x)$ corresponding to $L(p)$ is defined as:
$$P(s_x) = p^{(x)}, \quad x = 0, 1, \ldots, 2T. \tag{1}$$
According to Equation (1), the cumulative distribution function $F(s_x)$ can be obtained by
$$F(s_x) = \sum_{i=0}^{x} p^{(i)}. \tag{2}$$
Obviously, another cumulative distribution function $F(f^*(s_x))$ that corresponds to the values of the linguistic terms can be defined based on $F(s_x)$ and the linguistic scale functions presented in [50]. $F(f^*(s_x))$ is shown as
$$F(f^*(s_x)) = \sum_{i:\, f^*(s_i) \le f^*(s_x)} p^{(i)}, \tag{3}$$
where $f^*(s_x)$ is the numerical value calculated by the linguistic scale function $f_1(s_x)$, $f_2(s_x)$ or $f_3(s_x)$. Then new stochastic dominance rules of PLTSs can be defined.
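The normalization in Definition 2 and the cumulative distribution function in Definition 3 can be illustrated with a minimal Python sketch. It assumes the usual normalization of [7], in which each probability is divided by the total probability mass, and it represents a PLTS simply as a list of probabilities ordered from $s_0$ to $s_{2T}$. The function names `normalize` and `cdf` are hypothetical; the probabilities are those of $L_{11}(p)$ in the case study.

```python
from itertools import accumulate


def normalize(probs):
    """Rescale probabilities so that they sum to 1 (assumed form of Definition 2)."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    return [p / total for p in probs]


def cdf(probs):
    """Cumulative distribution F(s_x): running sum of p^(i) for i <= x."""
    return list(accumulate(probs))


# L_11(p) from the case study, listed for s_0, s_1, s_2, s_3, s_4.
l11 = [0.0021, 0.0053, 0.0056, 0.145, 0.842]
f11 = cdf(normalize(l11))
print([round(v, 4) for v in f11])
```

The list representation keeps the linguistic terms implicit in the index, which is sufficient here because every term of the LTS appears in a normalized PLTS of Definition 3.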

Stochastic Dominance Rules
Definition 4. Let $L_1(p) = \left\{ s_i\left(p_1^i\right) \,\middle|\, i = 0, 1, 2, \ldots, 2T \right\}$ and $L_2(p) = \left\{ s_i\left(p_2^i\right) \,\middle|\, i = 0, 1, 2, \ldots, 2T \right\}$ be any two normalized PLTSs. Two cumulative distribution functions $F_1(f^*(s_x))$ and $F_2(f^*(s_x))$ are obtained by the above transformation process. When the condition $F_1(f^*(s_x)) \neq F_2(f^*(s_x))$ holds, the stochastic dominance rules of $L_1(p)$ and $L_2(p)$ exist and can be defined as follows:
Proof. According to the known conditions p In addition, based on the conditions that p .
Transitivity of the first-order stochastic dominance rule is proved as follows.
Property 1. Let $L_1(p)$, $L_2(p)$ and $L_3(p)$ be any three normalized PLTSs, and let $F_1(s_x)$, $F_2(s_x)$ and $F_3(s_x)$ be the three cumulative distribution functions obtained from $L_1(p)$, $L_2(p)$ and $L_3(p)$. Then the following properties hold.
The proof of Property 1 is provided in Appendix A.
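As a rough illustration of how a first-order rule can be checked on two normalized PLTSs, the following Python sketch uses the classical definition of first-order stochastic dominance for discrete distributions: $F_1$ dominates $F_2$ when the two functions differ and $F_1 \le F_2$ at every term. This classical rule is an assumption and may differ in detail from the rule in Definition 4; the function name `first_order_dominates` and the example probabilities are hypothetical.

```python
from itertools import accumulate


def first_order_dominates(p1, p2, tol=1e-12):
    """Classical FSD check on probability lists ordered from s_0 to s_2T.

    Returns True when the CDF of p1 is nowhere above the CDF of p2 and is
    strictly below it for at least one term.
    """
    f1 = list(accumulate(p1))
    f2 = list(accumulate(p2))
    diffs = [a - b for a, b in zip(f1, f2)]
    return all(d <= tol for d in diffs) and any(d < -tol for d in diffs)


# Made-up PLTSs over s_0..s_4 (not taken from the case study).
better = [0.0, 0.0, 0.1, 0.2, 0.7]
worse = [0.1, 0.1, 0.2, 0.3, 0.3]
mixed = [0.3, 0.0, 0.0, 0.0, 0.7]

print(first_order_dominates(better, worse))  # better has more mass on high terms
print(first_order_dominates(mixed, worse))   # the CDFs cross, so no first-order dominance
```

When the cumulative distribution functions cross, as for `mixed` and `worse`, neither PLTS dominates the other at first order, which is the situation the second- and third-order rules are meant to resolve.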

Stochastic Dominance Degrees
Although the relations between PLTSs can be obtained from the stochastic dominance rules of PLTSs, the exact degree of dominance remains unknown. As a result, stochastic dominance degrees of PLTSs are defined to measure the dominance between PLTSs precisely.
can be defined as follows:
(1) (First-order stochastic dominance degree) If $F_1(s_x)\,SD_1\,F_2(s_x)$, the first-order stochastic dominance degree is denoted as $\psi_1(F_1(s_x)\,SD_1\,F_2(s_x))$ and calculated by Equation (4).
(2) (Second-order stochastic dominance degree) If $F_1(s_x)\,SD_2\,F_2(s_x)$, the second-order stochastic dominance degree is denoted as $\psi_2(F_1(s_x)\,SD_2\,F_2(s_x))$ and calculated by Equation (5).
(3) (Third-order stochastic dominance degree) If $F_1(s_x)\,SD_3\,F_2(s_x)$, the third-order stochastic dominance degree is denoted as $\psi_3(F_1(s_x)\,SD_3\,F_2(s_x))$ and calculated by Equation (6).
The proof of Property 2 is provided in Appendix B.
Note: The stochastic dominance relation between PLTSs should first be confirmed based on the first-, second-, and third-order stochastic dominance rules, and only then can the stochastic dominance degrees of PLTSs be calculated. Otherwise, the stochastic dominance degrees of PLTSs are meaningless.

A Novel Stochastic Dominance-Based Approach
In this section, the PLMCDM problem is formulated. A novel stochastic dominance-based approach is then proposed to solve the above problem.

Problem Formulation
In the calculation, one decision-maker provides only one linguistic evaluation value. However, the linguistic evaluation values increase with the number of decision-makers, denoted by $S_N$. As a result, the cumulative distribution functions can be obtained based on these PLTSs. In the following, a novel stochastic dominance-based approach is proposed based on stochastic dominance degrees and the SMAA method.

Proposed Approach
The framework of the proposed approach is shown in Figure 1. The decision steps are described after Figure 1.

Figure 1. Framework of the proposed approach. Stage 1 (Steps 1-4) obtains the matrix consisting of stochastic dominance degrees; Stage 2 (Steps 5-11) obtains the comprehensive ranking result.
Stage 1. Obtain the matrix consisting of stochastic dominance degrees.
Step 1: Construct and normalize the probabilistic linguistic decision matrix. The basic probabilistic linguistic decision matrix is constructed in terms of the criteria and the evaluation values $L_{ij}(p)$, and the normalized decision matrix is calculated based on Definition 2 and $L_{ij}(p)$.
Step 2: Define the cumulative distribution functions corresponding to the PLTSs in the decision matrix. Based on Equations (1) and (2), the cumulative distribution function $F_{ij}(s_x)$ corresponding to $L_{ij}(p)$ can be defined.
Step 3: Construct the decision matrix consisting of stochastic dominance relations. According to the cumulative distribution functions and Definition 4, the stochastic dominance relation between alternatives $a_i$ and $a_l$ under criterion $c_j$ can be judged and denoted by $rsd_{ilj}$, as given by Equation (8), where $rsd_{ilj} = \phi$ means that there is no dominance relation between alternatives $a_i$ and $a_l$ under criterion $c_j$. Subsequently, the decision matrix consisting of the stochastic dominance relations $rsd_{ilj}$ can be constructed and denoted by $RSD_j$.
Step 4: Obtain the decision matrix consisting of stochastic dominance degrees. Based on $rsd_{ilj}$ and Definition 5, the stochastic dominance degree $q_{ilj}$ between alternatives $a_i$ and $a_l$ under criterion $c_j$ can be calculated by Equation (9).
Thus, the decision matrix $Q_j = [q_{ilj}]_{m \times m}$ consisting of stochastic dominance degrees can be obtained.
Stage 2. Obtain the comprehensive ranking result. In this stage, $K$ weight vectors are generated to calculate ranking results. Each weight vector yields one ranking result, so the number of ranking results equals the number of weight vectors.
Step 5: Generate $K$ weight vectors. Each criterion $c_j$ is assigned a random number $w_j^k$ ranging from 0 to 1. A vector of these numbers that satisfies the given constraint conditions, such as $\sum_{j=1}^{n} w_j^k = 1$, becomes a weight vector $W^k = (w_1^k, w_2^k, \ldots, w_n^k)$, where $k = 1, 2, \ldots, K$ and $K$ is the number of weight vectors.
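Step 5 can be sketched in Python with simple rejection sampling. This is one possible way to do it and not necessarily the procedure used by the authors: the sketch draws candidate vectors uniformly from the unit simplex by normalizing exponential random numbers, which is a swapped-in sampling technique, and keeps only those that satisfy the constraints. The constraint function encodes the conditions reported for the case study in the data analysis ($w_1 > w_4$, $w_6 > w_4$, $w_2 > w_3$, $0 \le w_4 \le 0.1$); the names `sample_weights` and `case_constraints`, the seed, and `max_tries` are hypothetical.

```python
import random


def sample_weights(n, satisfies, rng, max_tries=100_000):
    """Draw one weight vector on the unit simplex that passes `satisfies`."""
    for _ in range(max_tries):
        # Normalized exponential draws are uniformly distributed on the simplex.
        raw = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(raw)
        w = [x / total for x in raw]
        if satisfies(w):
            return w
    raise RuntimeError("no feasible weight vector found")


def case_constraints(w):
    """Constraints on the six criteria weights taken from the case study."""
    w1, w2, w3, w4, w5, w6 = w
    return w1 > w4 and w6 > w4 and w2 > w3 and w4 <= 0.1


rng = random.Random(0)  # fixed seed only for reproducibility of the sketch
K = 1000
weights = [sample_weights(6, case_constraints, rng) for _ in range(K)]
print(len(weights), [round(x, 4) for x in weights[0]])
```

Rejection sampling is easy to verify but becomes slow when the feasible region is a small part of the simplex; in that case a dedicated sampler for constrained weight spaces would be a better choice.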
Step 6: Calculate the comprehensive stochastic dominance degrees between alternatives. By combining the weight vector $W^k = (w_1^k, w_2^k, \ldots, w_n^k)$ with the decision matrices $Q_j^k = [q_{ilj}^k]_{m \times m}$ consisting of stochastic dominance degrees, the comprehensive stochastic dominance degree $\varphi_{il}^k$ between alternatives $a_i$ and $a_l$ can be calculated by Equation (10).
Step 7: Calculate the comprehensive evaluation value of each alternative. Inspired by the way comprehensive evaluation values of alternatives are calculated in the TODIM method, the comprehensive evaluation values $Z_i^k$ are calculated from the comprehensive stochastic dominance degrees by Equation (11).
Step 8: Rank the alternatives. According to the values of $Z_i^k$, a ranking result can be obtained. Steps 6-8 are then repeated for $k = 2, 3, \ldots, K$.
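Steps 6-8 can be sketched in Python under two explicit assumptions, since the exact forms of Equations (10) and (11) are not reproduced here. First, the comprehensive dominance degree is assumed to be the weighted sum $\varphi_{il} = \sum_j w_j q_{ilj}$. Second, the comprehensive evaluation value is assumed to follow the min-max normalization of row totals used in TODIM. All function names and the small input data are hypothetical.

```python
def aggregate_dominance(q, w):
    """Assumed Equation (10): weighted sum of q[j][i][l] over criteria j."""
    m = len(q[0])
    return [
        [sum(w[j] * q[j][i][l] for j in range(len(w))) for l in range(m)]
        for i in range(m)
    ]


def evaluation_values(phi):
    """Assumed Equation (11): TODIM-style min-max normalization of row totals."""
    totals = [sum(row) for row in phi]
    lo, hi = min(totals), max(totals)
    if hi == lo:
        return [0.0 for _ in totals]  # all alternatives tie
    return [(t - lo) / (hi - lo) for t in totals]


def rank(z):
    """Indices of alternatives sorted from best to worst."""
    return sorted(range(len(z)), key=lambda i: z[i], reverse=True)


# Made-up dominance degrees for 3 alternatives under 2 criteria.
q = [
    [[0.0, 0.4, 0.6], [0.0, 0.0, 0.2], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.5], [0.3, 0.0, 0.1], [0.0, 0.0, 0.0]],
]
w = [0.5, 0.5]
phi = aggregate_dominance(q, w)
z = evaluation_values(phi)
print(rank(z))
```

Repeating these three functions for each of the $K$ weight vectors yields the $K$ ranking results used in Step 9.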
Step 9: Calculate the possibility degree $\partial_i^r$ of alternative $a_i$. $K$ ranking results are obtained from Steps 5-8. Then, $\partial_i^r$ is calculated by Equation (12) as the probability that alternative $a_i$ is placed in the $r$-th position of the ranking results, that is, $\partial_i^r = M_i^r / K$, where $M_i^r$ is the number of times alternative $a_i$ is placed in the $r$-th position of the $K$ ranking results, $i = 1, 2, \ldots, m$ and $r = 1, 2, \ldots, m$.
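The counting in Step 9 can be sketched in Python as follows. The sketch assumes that each ranking result is stored as a list of alternative indices ordered from best to worst and that the possibility degree is the relative frequency $M_i^r / K$, as described in the text; the function name `possibility_degrees` and the four sample rankings are hypothetical.

```python
def possibility_degrees(rankings, m):
    """Return a matrix d with d[i][r] = share of rankings placing alternative i at position r."""
    k = len(rankings)
    counts = [[0] * m for _ in range(m)]
    for order in rankings:
        for position, alternative in enumerate(order):
            counts[alternative][position] += 1
    return [[c / k for c in row] for row in counts]


# Four made-up ranking results for three alternatives (indices 0, 1, 2).
rankings = [[0, 1, 2], [0, 2, 1], [1, 0, 2], [0, 1, 2]]
degrees = possibility_degrees(rankings, 3)
for row in degrees:
    print(row)
```

Each row and each column of the resulting matrix sums to 1, because every ranking places each alternative in exactly one position.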
Step 10: Calculate the comprehensive possibility degree of each alternative. The comprehensive possibility degree $\partial_i^C$ of each alternative is calculated by Equation (13) from the possibility degrees $\partial_i^r$ and the position weights $\alpha_r$. Generally speaking, the position weight $\alpha_r$ decreases as the rank $r$ increases. The values of $\alpha_r$ can be determined by the idea of linear weights or reverse weights, or can be assigned numerical values as needed.
Step 11: Obtain the comprehensive ranking result. According to the values of $\partial_i^C$, the comprehensive ranking result of the alternatives is obtained. The alternative $a_i$ with the largest $\partial_i^C$ is the best alternative.
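Steps 10 and 11 can be illustrated with a short Python sketch. Two points are assumptions rather than statements of the authors' formulas: the comprehensive possibility degree is taken to be the weighted sum $\sum_r \alpha_r \partial_i^r$, and the linear position weights are taken to be $\alpha_r = (m - r)/(m - 1)$, as commonly used for holistic acceptability in SMAA-2. The function names and the input matrix are hypothetical.

```python
def linear_position_weights(m):
    """Assumed linear weights: 1 for rank 1, decreasing evenly to 0 for rank m."""
    if m < 2:
        return [1.0] * m
    return [(m - r) / (m - 1) for r in range(1, m + 1)]


def comprehensive_possibility(degrees, alphas):
    """Assumed aggregation: weighted sum of possibility degrees over positions."""
    return [sum(a * d for a, d in zip(alphas, row)) for row in degrees]


# Made-up possibility degrees: degrees[i][r] for 3 alternatives and 3 positions.
degrees = [[0.75, 0.25, 0.0], [0.25, 0.5, 0.25], [0.0, 0.25, 0.75]]
alphas = linear_position_weights(3)
scores = comprehensive_possibility(degrees, alphas)
best = max(range(len(scores)), key=scores.__getitem__)
print(alphas, scores, best)
```

Any decreasing sequence of position weights can be passed in place of the linear ones, which matches the remark that the weights may be assigned as needed.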

Case Study
As a leading large-scale travel website, TripAdvisor.com includes more than 500 million online reviews of hotels, restaurants and scenic spots. Furthermore, tens of thousands of online reviews are provided daily by tourists from different countries. A popular hotel can have more than 5000 online reviews. It is difficult for tourists or hotel managers to pick out useful information from numerous online reviews and make decisions. Naturally, a hotel's ratings on different criteria provided by tourists can help people understand the hotel quickly. The ratings on six criteria of Amari Phuket given by Penny C on TripAdvisor are shown in Figure 2. In general, although there are numerous hotels on TripAdvisor, tourists can choose several hotels from the listings as alternatives based on experience. Each of these hotels often has its own advantages, so it is difficult for a tourist to choose a suitable one. Accordingly, five popular hotels, each with its own advantages and disadvantages, are selected from TripAdvisor as alternatives and denoted by $A = \{a_1, a_2, a_3, a_4, a_5\}$, and the best alternative can be picked out by the novel stochastic dominance-based approach in Section 4. Six criteria, namely sleep quality, location, rooms, service, value and cleanliness (Figure 2), denoted hereafter by $c_1$, $c_2$, $c_3$, $c_4$, $c_5$ and $c_6$, are used to assess the five hotels. Five linguistic ratings, namely excellent, very good, average, poor, and terrible (Figure 3), are used and denoted hereafter by the LTS $S = \{s_4, s_3, s_2, s_1, s_0\}$.
For each hotel, the linguistic ratings given by travelers under each criterion are collected, and the ratio of the number of each linguistic rating to the number of this hotel's reviews is calculated. The linguistic ratings and the ratios correspond to the linguistic terms and probabilities in a PLTS. The evaluation values for each hotel under each criterion can thus be expressed by PLTSs, as shown in Tables 1-3.
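The transformation from collected ratings to a PLTS can be sketched in Python as a simple frequency count. The sketch assumes that the ratings of one hotel under one criterion are available as a list of labels; the label spellings, the function name `ratings_to_plts`, and the ten sample ratings are made up for illustration and are not data from the case study.

```python
from collections import Counter

# Rating labels ordered from s_0 (worst) to s_4 (best).
TERMS = ["terrible", "poor", "average", "very good", "excellent"]


def ratings_to_plts(ratings):
    """Map a list of rating labels to {term index: probability}.

    Labels that are not in TERMS are ignored, and terms that never occur
    are left out of the resulting PLTS.
    """
    counts = Counter(r for r in ratings if r in TERMS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid ratings")
    return {i: counts[t] / total for i, t in enumerate(TERMS) if counts[t] > 0}


# Ten made-up ratings for one hotel under one criterion.
sample = ["excellent"] * 7 + ["very good"] * 2 + ["poor"]
plts = ratings_to_plts(sample)
print(plts)
```

Because the probabilities are relative frequencies over all valid ratings, the resulting PLTS is already normalized.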

Results
According to the proposed approach in Section 4, the decision process is as follows.
Step 1: Construct and normalize the probabilistic linguistic decision matrix. Based on the evaluation values for each hotel and the fact that all six criteria are benefit criteria, a normalized probabilistic linguistic decision matrix can be constructed and represented by $R = [L_{ij}(p)]_{5 \times 6}$.
Step 2: Define the cumulative distribution functions corresponding to the PLTSs in the decision matrix. The cumulative distribution function $F_{ij}(s_x)$ corresponding to $L_{ij}(p)$ can be defined based on Equations (1) and (2). Taking $L_{11}(p) = \{s_0(0.0021), s_1(0.0053), s_2(0.0056), s_3(0.145), s_4(0.842)\}$ as an example, the cumulative distribution function $F_{11}(s_x)$ takes the values $F_{11}(s_0) = 0.0021$, $F_{11}(s_1) = 0.0074$, $F_{11}(s_2) = 0.0130$, $F_{11}(s_3) = 0.1580$ and $F_{11}(s_4) = 1$. The other cumulative distribution functions are omitted.
Step 3: Construct the decision matrix consisting of stochastic dominance relations. Let $f^*(s_x) = f_1(s_x)$. The decision matrix $RSD_j$ consisting of the stochastic dominance relations between alternatives $a_i$ and $a_l$ under criterion $c_j$ can be constructed according to Equation (8).
Step 4: Obtain the decision matrix consisting of stochastic dominance degrees. By Equation (9), the decision matrix $Q_j = [q_{ilj}]_{m \times m}$ consisting of the stochastic dominance degrees between alternatives $a_i$ and $a_l$ under criterion $c_j$ can be obtained.
Step 6: Calculate the comprehensive stochastic dominance degrees between alternatives. With the first weight vector $W^1$, the comprehensive stochastic dominance degree $\varphi_{il}^1$ is calculated by Equation (10) and shown in Table 4.
Step 7: Calculate the comprehensive evaluation value of each alternative. Based on Equation (11), the comprehensive evaluation values of the five hotels are shown in Table 5.
Step 8: Rank the alternatives. The first ranking result $a_5 \succ a_3 \succ a_2 \succ a_1 \succ a_4$ is obtained from $Z_i^1$. Steps 6-8 are then repeated for $k = 2, 3, \ldots, 1000$, so that 1000 ranking results are obtained.
Step 9: Calculate the possibility degree $\partial_i^r$ of alternative $a_i$. The possibility degree $\partial_i^r$ of hotel $a_i$ can be calculated by Equation (12) and is shown in Table 6.
Step 10: Calculate the comprehensive possibility degree of each alternative. Based on the idea of linear weights and Equation (13), the comprehensive possibility degree of each hotel can be calculated, as shown in Table 7.
Step 11: Obtain the comprehensive ranking result. According to Table 7, the comprehensive ranking result of the five hotels is $a_3 \succ a_2 \succ a_5 \succ a_1 \succ a_4$, which means that $a_3$ is the best hotel.

Data Analysis
In this section, several ranking results of the five hotels in Section 5.1 are obtained either with different values of the parameter $K$ and the same linguistic scale function, or with different linguistic scale functions and the same parameter $K$.
(1) Analysis of the parameter $K$ with the same linguistic scale function. In order to explore the influence of the parameter $K$ on the final comprehensive ranking results of hotels, $K$ is set to 10, 100, 1000, 10,000 and 100,000 in turn, while $f^*(s_x) = f_1(s_x)$ remains fixed. The ranking results are shown in Table 8.
Table 8. Ranking results based on different parameter K.
Table 8 demonstrates that the ranking result for $K = 10$ differs from the other ranking results, while the ranking results are the same for $K = 100$, 1000, 10,000 and 100,000. In addition, the comprehensive possibility degrees under $K = 10{,}000$ and $K = 100{,}000$ are approximately equal from $\partial_1^C$ to $\partial_5^C$. This is related to the variable weight vectors that are generated under the given constraint conditions. The known constraint conditions $w_1^k > w_4^k$, $w_6^k > w_4^k$, $w_2^k > w_3^k$, $0 \le w_4^k \le 0.1$, $0 \le w_j^k \le 1$ and $\sum_{j=1}^{n} w_j^k = 1$ in Section 5.1 are so few and simple that numerous weight vectors satisfy them. $K = 10$ just means that ten weight vectors satisfying the conditions are selected to calculate the final ranking result in one run. However, another ten weight vectors meeting the same conditions can be used to obtain a result that may differ from the previous one. Naturally, the ranking result becomes relatively stable as the parameter $K$ increases, because enough weight vectors take part in the calculation.
For further illustration, five ranking results are calculated under $K = 10$ and another five ranking results are calculated with the same program under $K = 100{,}000$. All of these ranking results are shown in Figure 4. The curves of comprehensive possibility degrees obtained with $K = 100{,}000$ are nearly identical. In other words, the parameter $K$ should be as large as possible. Furthermore, the results based on $K = 1000$ and $K = 10{,}000$ are also acceptable in complicated decision-making problems.
(2) Analysis of linguistic scale functions with the same parameter $K$. Similarly, in order to explore the influence of linguistic scale functions on the final comprehensive ranking results of hotels, $f_1(s_x)$, $f_2(s_x)$ and $f_3(s_x)$ are selected in turn, while $K = 10{,}000$ remains fixed. The ranking results are shown in Table 9.

Table 9 states that the linguistic scale functions have an impact on the comprehensive possibility degrees, but this influence does not change the three ranking results. The features of the linguistic scale functions differ, and decision-makers can select appropriate linguistic scale functions when solving practical problems.

Comparison Analysis and Discussion
In this section, the feasibility and effectiveness of the proposed approach are examined by means of comparison analysis. In order to reduce the impact of the criteria's weights, the central weight vector w = (0.1759, 0.3909, 0.079, 0.0458, 0.1466, 0.1618) generated in the proposed approach is used. With this weight vector, the method based on stochastic dominance theory and the PROMETHEE method in Liang et al. [51], named Method 1, is used to solve the same problem. Furthermore, a method based on the stochastic dominance degrees of PLTSs defined in this paper and the PROMETHEE method, named Method 2, is also used to solve the same problem. The decision processes of these two methods are presented below.
(1) The method based on stochastic dominance theory and the PROMETHEE method in [51]. Based on the stochastic dominance relations between hotels under each criterion and the method in [51], the expectation values of hotels under different criteria can be calculated, as shown in Table 10. The preference thresholds $q_j$ of the criteria $c_j$ are defined as 0.055, 0.008, 0.004, 0.005, 0.008 and 0.02, respectively, and the comprehensive dominance degrees between hotels can then be obtained from the dominance degrees between hotels under each criterion and the criteria's weights. The comprehensive dominance degrees between hotels are shown in Table 11. Then the ranking result $a_3 \succ a_2 \succ a_5 \succ a_1 \succ a_4$ is obtained according to the comprehensive flows.
(2) A method based on stochastic dominance degrees of PLTSs and the PROMETHEE method. Based on the weight vector w = (0.1759, 0.3909, 0.079, 0.0458, 0.1466, 0.1618) and the stochastic dominance degrees between hotels $a_i$ and $a_l$ under criterion $c_j$, the comprehensive stochastic dominance degrees between hotels can be calculated and are shown in Table 12. Furthermore, the leaving flow $\Phi_2^{+}(a_i)$, entering flow $\Phi_2^{-}(a_i)$ and comprehensive flow $\Phi_2(a_i)$ of hotels in Method 2 are presented in Figure 6. Then the ranking result $a_3 \succ a_2 \succ a_1 \succ a_5 \succ a_4$ is obtained.
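The aggregation and flow computation used in Method 2 can be sketched as follows. This is a minimal sketch under stated assumptions: the comprehensive degree is taken as the weighted sum of the per-criterion degrees, and the flows follow the usual PROMETHEE II convention of averaging over the other m − 1 alternatives. The paper's exact normalization is not given in this section, and the function names are hypothetical.

```python
def aggregate(weights, per_criterion):
    """Weighted sum of per-criterion dominance degrees.

    per_criterion[j][i][l] is the degree to which alternative i dominates
    alternative l under criterion j (assumed layout).
    """
    m = len(per_criterion[0])
    return [
        [sum(w * psi[i][l] for w, psi in zip(weights, per_criterion)) for l in range(m)]
        for i in range(m)
    ]


def promethee_flows(pi):
    """Leaving, entering and comprehensive (net) flows from a pairwise matrix pi."""
    m = len(pi)
    leaving = [sum(pi[i][l] for l in range(m) if l != i) / (m - 1) for i in range(m)]
    entering = [sum(pi[l][i] for l in range(m) if l != i) / (m - 1) for i in range(m)]
    net = [out - inc for out, inc in zip(leaving, entering)]
    return leaving, entering, net


def rank_by_net_flow(net):
    """Indices of alternatives sorted from best to worst comprehensive flow."""
    return sorted(range(len(net)), key=lambda i: net[i], reverse=True)


if __name__ == "__main__":
    # Made-up 3x3 matrix for illustration only; these are not the values of Table 12.
    pi = [[0.0, 0.6, 0.8], [0.2, 0.0, 0.5], [0.1, 0.3, 0.0]]
    leaving, entering, net = promethee_flows(pi)
    print(rank_by_net_flow(net))
```

Because every pairwise degree enters once as a leaving contribution and once as an entering contribution, the comprehensive flows always sum to zero, which is a quick sanity check on any computed flow table.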
According to the above two methods and the proposed approach in this paper, three ranking results are obtained, as shown in Table 13.
The similar ranking results in Table 13 demonstrate the feasibility of the proposed approach. Hotel $a_3$ is always the best hotel, but the deviations between hotels differ from method to method. As one of the outranking methods, the PROMETHEE method is reliable for managing dominance relations between alternatives under different criteria. It is therefore often used to calculate the overall ranking result based on the dominance degrees among alternatives. However, neither the dominance relations nor the PROMETHEE method can be applied to solve the weight problem. Furthermore, obtaining reasonable criteria weights is a problem that inevitably has to be settled when selecting a hotel on a website based on online reviews. As a result, our proposed approach is more suitable for dealing with the hotel selection problem.
Based on the decision processes of the three methods, the features of the proposed approach can be summarized as follows.
(1) Considering that the gaps between two linguistic variables differ across situations, the stochastic dominance rules and stochastic dominance degrees of PLTSs are defined based on the linguistic scale function. This improves the flexibility of the stochastic dominance rules and stochastic dominance degrees. Furthermore, PLMCDM problems under different semantic environments can be solved.
(2) The preference threshold $q_j$ of criterion $c_j$ is such an important parameter for Method 1 that it has a great influence on the comprehensive dominance degrees and, in turn, on the ranking. However, how to assign a value to the preference threshold $q_j$ is not specified in [51]. Compared with Method 1, the ranking results of Method 2 and the proposed approach are more stable. Although the ranking result of the proposed approach is affected by parameter K, it tends toward stability as K increases.
(3) The weights of the criteria are another influencing factor in the decision process. In Method 1 and Method 2, the ranking results cannot be calculated when the weights are unknown or when only some constraint conditions on the weights are known. However, the proposed approach can deal with PLMCDM problems in which the weights are partly known or only some constraint conditions on the weights are known. If several weight vectors satisfy the known conditions, all of them can be used to calculate the final ranking result. In this paper, to avoid the case in which too many weight vectors impair the calculation efficiency of the proposed approach, parameter K is defined as the number of weight vectors and used to control the decision process. A reasonable K can ensure both the stability and the calculation efficiency of the proposed approach.
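The idea in feature (3), using every sampled feasible weight vector to form the final ranking, can be sketched as a generic SMAA-style rank acceptability count. This is only an illustration under assumptions: a plain weighted sum of hypothetical scores stands in for the paper's aggregation of stochastic dominance degrees, and the result is a rank acceptability share rather than the paper's comprehensive possibility degree. The names `rank_acceptability` and `feasible` are hypothetical.

```python
import random


def sample_simplex(n, rng):
    """One weight vector drawn uniformly from the simplex sum(w) = 1."""
    raw = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(raw)
    return [x / total for x in raw]


def rank_acceptability(scores, k, feasible=lambda w: True, seed=0):
    """Share of the k feasible weight vectors placing each alternative at each rank.

    scores[i][j] is a hypothetical score of alternative i on criterion j.
    Returns acc with acc[i][r] = share of weight vectors ranking i at position r.
    """
    rng = random.Random(seed)
    m, n = len(scores), len(scores[0])
    counts = [[0] * m for _ in range(m)]
    drawn = 0
    while drawn < k:
        w = sample_simplex(n, rng)
        if not feasible(w):
            continue  # reject weight vectors that violate the known constraints
        drawn += 1
        totals = [sum(w[j] * scores[i][j] for j in range(n)) for i in range(m)]
        order = sorted(range(m), key=lambda i: totals[i], reverse=True)
        for position, i in enumerate(order):
            counts[i][position] += 1
    return [[c / k for c in row] for row in counts]


if __name__ == "__main__":
    # Made-up scores for three alternatives on three criteria.
    scores = [[0.9, 0.9, 0.9], [0.8, 0.2, 0.5], [0.1, 0.7, 0.4]]
    for row in rank_acceptability(scores, k=1000, feasible=lambda w: w[0] > w[2]):
        print([round(x, 3) for x in row])
```

A larger k costs proportionally more evaluations but gives shares that change less between runs, which is the trade-off between stability and calculation efficiency that parameter K controls.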
Moreover, the decision process shows that more than half of tourists' attention is attracted by sleep quality and location. Hoteliers therefore need to ensure that tourists are satisfied with these two criteria. Apart from the weight problem, we also recognize that the number of linguistic terms in a PLTS has a great impact on the decision results. A smaller number indicates consensus among tourists about a criterion. For hoteliers and hotel managers, problems emerge when online reviews contain many different kinds of opinion.

Conclusions
Researchers and hoteliers realize that online reviews on websites reveal the characteristics of tourists and the features of their tourism preferences. The proposed stochastic dominance-based approach using PLTSs helps to describe tourists' reviews and obtains stable results with less information about hotel criteria. In this paper, the linguistic terms in a PLTS and their corresponding probability values are treated as discrete random variables and the probability distributions of these variables, respectively. We then construct first-, second-, and third-order stochastic dominance rules of PLTSs. According to these rules, first-, second-, and third-order stochastic dominance degrees of PLTSs are calculated and integrated into the decision procedure. The decision results illustrate the reliability and applicability of the approach. Meanwhile, some limitations remain to be addressed in the future. In the case study, we selected five hotels as the research objects, because tourists usually hesitate among only several hotels. The proposed approach can be improved to rank the hotel listing on a website once incomparable relations are resolved. Furthermore, clustering methods [52,53] can be used to deal with large-scale online reviews and obtain reasonable decision sets. Finally, the preferences of tourists are diverse, and they can be further refined by recognizing tourists' positive or negative emotions.
Based on Theorem 1, Property 1(1) and Property 1(2), the transitivity of the second- (third-) order stochastic dominance rule can also be proved. In this case, if $F_1(s_x)\,SD_i\,F_2(s_x)$ ($i = 1, 2, 3$), $F_2(s_x)\,SD_j\,F_3(s_x)$ ($j = 1, 2, 3$) and $j > i$, then $F_1(s_x)\,SD_r\,F_3(s_x)$ ($r = \max\{i, j\}$) can be proved on account of Property 1(1) and the transitivity of the stochastic dominance rule.

Appendix B. The Proof of Property 2
Proof. Since $F_1(s_x)\,SD_1\,F_2(s_x)$, the numerator of $\psi_1(F_1(s_x)\,SD_1\,F_2(s_x))$ is the difference between the areas under the function curves $F_1(s_x)$ and $F_2(s_x)$, and its denominator is the area under the function curve $F_2(s_x)$. That is, $\psi_i(F_1(s_x)\,SD_i\,F_2(s_x))$ is the ratio of the difference between the areas under the two function curves to the area under the function curve $F_2(s_x)$.
For convenience, the areas under the function curves $F_1(s_x)$ and $F_2(s_x)$ are denoted by $A_{F_1}$ and $A_{F_2}$, respectively. Obviously, $0 \le \psi_1(F_1(s_x)\,SD_1\,F_2(s_x)) \le 1$. Similarly, $0 \le \psi_2(F_1(s_x)\,SD_2\,F_2(s_x)) \le 1$ and $0 \le \psi_3(F_1(s_x)\,SD_3\,F_2(s_x)) \le 1$ can be proved. The detailed proof is omitted.
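The area-ratio reading of the first-order degree can be checked numerically with a small sketch. It rests on several assumptions that are not taken from this appendix: the usual convention that first-order dominance of $F_1$ over $F_2$ means $F_1(s_x) \le F_2(s_x)$ for every term, a step-shaped distribution function whose area is measured on the values of a linguistic scale function, and the evenly spaced scale $x/4$ for five terms. The function names are hypothetical.

```python
def cdf(probabilities):
    """Cumulative distribution over the ordered linguistic terms."""
    total, out = 0.0, []
    for p in probabilities:
        total += p
        out.append(total)
    return out


def area_under_cdf(probabilities, scale):
    """Area under the step CDF between the lowest and highest scale values."""
    F = cdf(probabilities)
    return sum(F[x] * (scale[x + 1] - scale[x]) for x in range(len(scale) - 1))


def first_order_degree(p1, p2, scale, tol=1e-12):
    """Assumed form: (A_F2 - A_F1) / A_F2 when F1 first-order dominates F2."""
    F1, F2 = cdf(p1), cdf(p2)
    if any(a > b + tol for a, b in zip(F1, F2)):
        raise ValueError("F1 does not first-order dominate F2")
    a1, a2 = area_under_cdf(p1, scale), area_under_cdf(p2, scale)
    if a2 == 0.0:
        raise ValueError("area under F2 is zero; the ratio is undefined")
    return (a2 - a1) / a2


if __name__ == "__main__":
    scale = [x / 4 for x in range(5)]   # assumed evenly spaced scale for s_0..s_4
    p1 = [0.0, 0.0, 0.2, 0.3, 0.5]      # made-up PLTS probabilities
    p2 = [0.1, 0.2, 0.3, 0.2, 0.2]
    print(first_order_degree(p1, p2, scale))
```

Under these assumptions the bound is immediate: the areas satisfy $0 \le A_{F_1} \le A_{F_2}$, so the ratio lies between 0 and 1.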