An Application of Cluster Analysis Method to Determine Vietnam Airlines’ Ground Handling Service Quality Benchmarks

This paper recommends that Vietnam Airlines use a proposed model to both evaluate and improve the service network it currently operates at international airports. The model combines cluster analysis, ANOVA, and the Scheffé post hoc test to provide service performance insights and to serve as a complementary corporate benchmark for evaluating service potential and identifying deficient service areas. By means of this model, the managerial board can formulate an effective strategy for ground handling service. Additionally, the model gives expatriate station managers a clearer view of local productivity levels relative to the other airports within their own clusters.


Introduction
In Vietnam, both the domestic and the international air travel markets have experienced intensifying competition in the past few years. Thus, Vietnam Airlines (VNA), the national carrier, now faces significant and unprecedented operational challenges on a global scale. Within Vietnam's domestic market, apart from fast-growing local rivalry, market saturation remains a regular obstacle for VNA. As domestic market expansion contracts, the company has had to consolidate in the international marketplace. As a result, VNA experiences fierce and direct daily competition from major international carriers such as Singapore Airlines, Japan Airlines, American Airlines, and Lufthansa as well as a slew of local low-cost carriers located in Northeast Asia and Southeast Asia.
Under current circumstances, and aware of the distinct challenge posed by international expansion, VNA has introduced further promotional incentives for customers, such as mileage reward programs and frequent flyer membership programs. However, the benefits of such marketing strategies erode over time because other airlines have already offered similar promotions. Realizing the limited impact that marketing strategies may produce, the national carrier now focuses on improving overall customer service quality and satisfaction to attract repeat business.
This is also the purpose of this article, which recommends applying the cluster analysis method to the airline's operations.
Currently, the airline provides online ticketing and reservations, professional ground services, and in-flight guest services. Among these offerings, airport ground services are of considerable importance since they most often directly impact the customer service experience. VNA has signed service-level agreements (SLAs) with numerous ground handling companies at international airports in order to regulate these services. The carrier also attaches an expatriate station manager to each ground handling company to supervise the expected on-site service quality level. The SLA defines the mutually agreed-upon set of standards and targets used to monitor the ground handling company's performance. Regular meetings are then organized between VNA and the handling company to track assessed performance against the agreed-upon standards and service quality targets. The exact indicators or criteria of the different SLAs may vary from site to site. However, they generally cover the following: (1) punctuality; (2) check-in process; (3) boarding process; (4) staff attitude; (5) customer complaints; (6) baggage mishandling; and (7) personal documents mishandling.
The first 3 columns of Table 1, shown below, list the main indicators specified in the SLA signed between Vietnam Airlines and China Airlines Ltd., the ground handling company located at Kaohsiung's Siaogang International Airport (KHH) in southern Taiwan, for the year 2016.
Even though the SLAs may indicate the putative service standards and service quality targets for handling companies to follow, they still do not guarantee that customers' needs are being met. In order to determine whether or not customers are satisfied with airport ground handling services, VNA conducts regular passenger surveys to obtain feedback related to the service quality provided.
In addition, VNA's Market Service Department personnel maintain comprehensive data records of the handling companies' regular performance levels. The data from these sources and from part of the passenger survey are then consolidated and checked against the predetermined set of standards and targets. From these results, the airline can determine which airports provide ground handling services that are up to standard and which do not. The last column in Table 1 is an example of how the service quality of the handling company at Kaohsiung's Siaogang International Airport in Taiwan was evaluated for the year 2016.
Presently, VNA evaluates SLA compliance by comparing the performance results of each ground handling company with the respective service standards or targets preset in the SLA. If the performance results are not lower than the targets, then the carrier concludes that the handling company has met the desired performance criteria. Apart from this rather general conclusion, little else is drawn from the passenger surveys and recorded data.
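The comparison rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `evaluate_sla`, the criterion keys, and all numbers are hypothetical and are not taken from Table 1. It also assumes every score is oriented so that higher is better; for criteria where a lower figure is better, the comparison would need to be reversed.

```python
def evaluate_sla(results, targets):
    """Return per-criterion pass/fail flags and the share of targets met.

    A criterion passes when its result is not lower than its target,
    following the rule described in the text. Assumes higher is better.
    """
    flags = {name: results[name] >= target for name, target in targets.items()}
    share = sum(flags.values()) / len(flags)
    return flags, share


# Hypothetical numbers for one station, not taken from the paper.
targets = {"check_in": 80.0, "boarding": 80.0, "staff_attitude": 85.0, "punctuality": 90.0}
results = {"check_in": 82.5, "boarding": 78.0, "staff_attitude": 85.0, "punctuality": 93.1}

flags, share = evaluate_sla(results, targets)
print(flags)
print(f"targets met: {share:.0%}")
```

A pass/fail flag per criterion says nothing about how a station compares with similar stations, which is the gap the clustering approach is meant to address.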
In this study, the authors use the same VNA data that were consolidated in 2015 and 2016; however, other tools were used for the analysis. Cluster analysis, ANOVA, and the Scheffé post hoc test were used to provide VNA with a more accurate picture of how these handling companies are performing, especially in relation to the others in their group or cluster and then to those in other clusters. It is also argued that these tools can enable the carrier to establish more achievable targets for each group of ground handling companies, followed by benchmarking against the performance levels of the next higher group or subgroup. The primary contribution of the article lies in its suggested methodology and in its approach to evaluating ground handling service quality, a topic that has received insufficient coverage in the literature because of its relatively small role in airline service quality evaluation. Such an evaluation may be adapted and then extended to other service industries. The evaluation method provides an insight into service performance, and it can serve as a company's complementary benchmark for evaluating its own services as well as identifying deficient service areas. Moreover, it reflects the industry's perspective on service quality evaluation, and unlike other, more academic articles related to airline service quality, it ensures the practicality of the findings and methods for the industry. The remainder of this paper is organized as follows. Section 2 explains what cluster analysis is, its benefits, and its application.
This section also emphasizes service quality along with the criteria used to establish benchmarks for the airline service. Section 3 introduces the research methodology applied, and Section 4 describes and then discusses the data analysis results. Finally, Section 5 draws some relevant conclusions.

Cluster Analysis.
Cluster analysis, or clustering, is a statistical method used to classify objects into groups [1]. It is performed to discover groups of objects in data on the basis of some form of proximity measurement defined among them [2,3]. Those group memberships are created with respect to the objects' proximity to one another. In hierarchical clustering, however, data are not separated into a specific number of classes or groups in a single step. Instead, the classification procedure consists of a series of partitions running from one group containing all individuals to n groups each containing a single individual. For both types of hierarchical clustering techniques, i.e., agglomerative methods and divisive methods, the analyst must decide at which stage the optimal grouping has been reached. Agglomerative methods form a sequence of successive fusions of the n individuals into groups, while divisive methods successively partition the n individuals into finer groupings [1]. In the agglomerative case, the process of merging mutually exclusive subgroups is repeated until only one group remains [4]. The result is a dendrogram, which provides a convenient visual aid displaying the hierarchical sequence of clustering assignments. It takes the form of a simple tree in which each node indicates a cluster: each leaf represents a single data point, and the root node indicates the cluster consisting of the whole dataset.
Ward's method has been widely employed to conduct cluster analyses in prior research (e.g., [5-8]). Everitt et al. [1] emphasized that, of the seven methods of cluster analysis available, the standard agglomerative hierarchical clustering method (i.e., Ward's method) had the most marked tendency to find same-size, spherical clusters and was sensitive to outliers. Hands and Everitt [9] claimed that this method was more complicated but also produced more accurate results and minimized the variance between objects. Euclidean distance was recommended as the distance measure and is the one most used when applying Ward's method. For this study, Ward's method applied with Euclidean distance presented the clearest image of clustering.
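To make the merging rule concrete, the following is a minimal pure-Python sketch of agglomerative clustering with Ward's criterion and Euclidean distance. It is not the authors' implementation: the helper names, the airport labels (`AP1` to `AP5`), and the scores are hypothetical. At each step it merges the pair of clusters whose fusion causes the smallest increase in the within-cluster sum of squares, which for clusters of sizes n_a and n_b with centroids c_a and c_b equals n_a·n_b/(n_a+n_b)·‖c_a − c_b‖². In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally be used instead.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(data, k):
    """Merge {label: vector} items bottom-up until k clusters remain."""
    clusters = [([label], [vec]) for label, vec in data.items()]
    while len(clusters) > k:
        # Pick the pair whose merge is cheapest under Ward's criterion.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return [sorted(labels) for labels, _ in clusters]


# Hypothetical scores (check-in, boarding, staff attitude), not VNA data.
scores = {
    "AP1": [62.0, 60.0, 65.0],
    "AP2": [64.0, 61.0, 66.0],
    "AP3": [85.0, 88.0, 90.0],
    "AP4": [86.0, 87.0, 91.0],
    "AP5": [74.0, 75.0, 73.0],
}
groups = ward_clusters(scores, 3)
print(sorted(groups))
```

The exhaustive pair search is quadratic per merge, which is acceptable for a few dozen airports but not for large datasets.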

Airport Classification.
Classifying airports according to service capabilities is popular. In air transport research, major scholars have grouped airports according to their common attributes (e.g., [6, 7, 10-14]). They have adopted various aims and approaches for classifying airports depending on the purpose of the survey. Classification schemes have frequently been used for benchmarking purposes [10, 14-16].
Although airport classification has already been studied in a variety of ways and from diverse angles, current airport research has seemingly ignored the intangible issue of passenger service quality. Airports have been classified mostly in terms of their connectivity, geographic location, functionality, traffic distribution, airport size, cargo capacity, utilization and technical characteristics, ownership, efficiency and productivity, and network position [6-8, 10-12, 14, 17]. Nevertheless, throughout the airport classification process, variables related to passengers have been primarily included. For instance, Adikariwattage et al. [10] used the U.S. Bureau of Transportation Statistics survey database with the additional variables of gate numbers, annual volume including origin-destination passengers, and transfers from both international and domestic flights at U.S. airports. Furthermore, by considering an airport's position within the network, Malighetti et al. [6] found strategic groups in their cluster analysis of 467 European airports. These researchers used variables related to passenger connectivity, such as the seat availability on scheduled flights, the number of destinations offered, and traffic distribution among routes. Rodríguez-Déniz et al. [13] classified airports based on the available air ticket information taken from more than 30 major U.S. air carriers. Apart from concentrating on passenger variables related to airports, some studies have focused on overall airport efficiency. For example, Rodríguez-Déniz and Voltes-Dorta [16] used output and input pricing, with cost elasticity and factor shares serving as optimal variable weights in order to estimate an airport's efficiency.
They identified 17 distinct airport clusters using this technique. Sarkis and Talluri [14] measured the operational efficiency of 44 major U.S. airports. In this case, four input measures (e.g., airport operational costs, number of airport employees, number of gates, and number of runways) and 5 output measures (e.g., operational revenue, passenger flow, commercial and general aviation movement, and total cargo) were used to provide efficiency measures.
While cluster analysis in general is a useful tool for grouping airports, hierarchical clustering in particular is commonly used in the classification of airports, as employed by Malighetti et al. [6]; Mayer [7]; Rodríguez-Déniz et al. [13]; Sarkis and Talluri [14]; and Vogel and Graham [8]. One of the key benefits of hierarchical clustering over k-means clustering, as remarked by Rodríguez-Déniz et al. [13], is that, by presenting the results as a tree, hierarchical classification reveals a structure that is more informative than the flat clusters obtained from partitioning methods such as k-means.
Once airports have been grouped, a main purpose of the resulting clusters is their use in benchmarking exercises. Sarkis and Talluri [14] applied a clustering method to the efficiency scores of major U.S. airports across 5 years in order to identify benchmarks for improving poorly performing airports. Mayer [7] based his airport cluster analysis upon cargo tonnage throughput. The objective of that study was to provide an insight into the heterogeneity of cargo airports and then to define comparator airports in the air cargo marketplace.
This research applies cluster analysis to the classification of the international airports served by VNA. This hierarchical clustering of airports is based on ground handling service determinants, which were determined by VNA itself. The service quality provided by the ground handling companies at VNA's international airport destinations represents performance efficiency as evaluated by the Market Service Department's review of VNA passenger surveys and records.
Journal of Advanced Transportation

Service Quality in the Airline Industry.
It has been recognized that delivering superior service quality can be of key importance to success and survival in today's hypercompetitive business environment. Hill et al. [18] highlighted the impact of service quality on a company's service strategy formulated to improve profitability: "(. . .) for services, "Production" typically takes place while the service is delivered to the customer. The production function can also perform its activities in a way that is consistent with high product quality, which leads to differentiation (and higher value) and lower costs." (p.92). Moreover, service quality was found to have an independent, positive, and direct effect on satisfaction [19]. By offering superior service, companies were able to charge premium prices for their products [18,20] while gaining market share [21]. It was also noted that companies could still make a profit while reducing the customer defection rate at the same time [18]. Thus, airlines can agree with the motto "Improving service quality is improving profitability." Past service quality studies indicate that any service improvement can increase the customer base through new and repeated purchases from more loyal customers. In this regard, Johnson et al. [22] claimed that future retention behavior and profitability could be predicted by determining a cumulative customer satisfaction level. In the airline industry, carriers have made considerable efforts to improve overall service quality with a view to satisfying passengers and meeting customer expectations, which serves to maximize long-term profitability. Therefore, service quality needs to be perceptible and assessable by customers (e.g., [23-27]). Gilbert and Wong [24] argued that when flying with an airline, all passengers should share the same positive expectation of desirable service quality.
However, this same expectation does not necessarily exist when passengers are not homogeneous in terms of their racial identities and travel objectives. Chen and Chang [23] investigated airline service quality from a process viewpoint by studying the gaps between passengers' service expectations and the actual service rendered on the ground and in flight. Furthermore, these researchers examined the gaps between passengers' service expectations and the perceptions of these expectations held by the relevant frontline managers and employees. They stated that such a gap existed, and fliers seemed to be most concerned about the responsiveness and assurance shown by the airline's frontline staff. Wu and Cheng [28] designed a purpose-built model to represent passengers' overall perceptions of airline service quality, with the aim of providing the most satisfactory experience for passengers as a whole. Apart from the satisfaction factor, passenger expectations and perceptions have been examined in relation to airline service in different contexts, including airline service quality perceived by passengers in an uncertain environment [27]; consumer expectations and perceptions in an international setting [26]; passengers' perceptions of service quality leading to the choice of carriers for international air travel in Taiwan [29]; and passenger expectations and airline services [24]. For the airline service industry, understanding passengers' expectations and perceptions is of key importance to satisfying them, because this is precisely what keeps passengers flying repeatedly with an airline, which in turn leads to profitability.

Criteria and Subcriteria for an Airline Service Quality Evaluation.
The literature has made considerable progress as to how service quality perceptions are to be measured (e.g., [25, 30-36]). Parasuraman et al. [25] used terms such as reliability, responsiveness, empathy, assurance, and tangibles to describe service encounter characteristics. These 5 criteria are well known as the SERVQUAL model and integrate 22 subcriteria used to measure functional service quality levels. This model has been widely used in airline service research (e.g., [24, 37-39]), and it was followed by a number of other models constructed over time, such as the performance-based SERVPERF model developed by Cronin and Taylor [31] and SERVPEX, which can measure disconfirmation in a single model [32]. The business of airlines depends on the quality of the services they provide. An evaluation of such service quality may be formulated using multiple criteria and subcriteria to cover the entire business activities of an airline (e.g., [24, 37-39]). Gilbert and Wong [24] conducted a study on passengers' service expectations at the Hong Kong airport and used 7 dimensions, with 26 scaled questions, as a management diagnostic tool for service quality. Gupta [38] carried out research with 29 subdimensions under 7 main dimensions, which represented the largest number of dimensions and subdimensions ever used to evaluate the Indian airline industry. Chou et al. [37] tested their model on a Taiwanese airline at the Siaogang International Airport in Kaohsiung, Taiwan. They evaluated the airline's service quality using 28 items under 5 major dimensions. However, Tsaur et al. [39] pointed out that the real meaning of quality in airline services was difficult to describe and to measure because of its heterogeneity, intangibility, and inseparability.
In the airline industry, the major criteria used to evaluate service delivery systems are management, staffing, passenger satisfaction, reliability, and tangibility, all of which are found in previous research [38-41]. These 5 criteria cover the whole airline business, including ground service and in-flight service. Ground handling is considered a part of ground service and takes place in an airport context. The criteria and subcriteria used to measure ground handling service provided to passengers at airports are relatively consistent in nature, and they have been frequently employed in empirical studies. Research commonly involved on-time performance, the check-in process, the boarding process, staffing attitude, customer complaints, and baggage mishandling incidents [23, 24, 28, 37-45]. Zhang et al. [46] and Bowen and Headley [42] focused on 5 determinants which included on-time arrivals, mishandled baggage (flight delays), involuntarily denied boarding (oversales), and consumer complaints. Bowen and Headley [42] even used 12 subelements for the consumer complaints determinant. However, the findings of the two studies were not consistent because different weightings were used in Bowen and Headley's research. Moreover, staffing attitude was a key aspect in measuring the appropriate level of staff member interaction with passengers. Positive and negative feelings towards staff behavior played an important role in determining customer satisfaction [43,44]. Likewise, on-time performance was of similar importance. An airline's punctuality was a major source of passenger satisfaction with the airline's perceived reliability [47,48]. Additionally, Wu and Cheng [28] stated that passengers generally considered their eventual waiting time a vital criterion in evaluating their service quality experience.
Furthermore, personal interactions between passengers and airline employees are likely to affect passengers' perceptions of airline service quality, positively or negatively. Communication attributes used in much of the research to describe regular interactions between airline staff and passengers, such as the professional appearance of the staff [40], professional knowledge and skills [39,43,49], consistent service and willingness to help [29,50], and language ability [29], were considered to matter greatly to fliers. The aforementioned empirical research relates to airline operations, which may include ground handling services. However, no studies have covered ground handling activity on its own, which is what this paper sets out to investigate.

Research Design.
The proposed model comprises cluster analysis using Ward's method with Euclidean distance, ANOVA, and the Scheffé post hoc test. Accordingly, the research design was divided into 5 steps (see Figure 1).
Step 1: criteria were selected for analysis. In this study, 6 criteria were adopted.
Step 2: airports were selected. VNA's 28 international airports were chosen because sufficient data were available for them (see Table 2).
Step 3: cluster analysis was used to group all of the airports according to the given evaluation criteria. Ward's method was trialed with different distance measures, but it was only with Euclidean distance that the dendrograms presented the clearest tree-like graphics of clustering.
Step 4: ANOVA was applied in order to identify the criteria with a significant performance gap. In this way, an overview of the service was obtained.
Step 5: the Scheffé post hoc test was applied to the criteria with significant differences in order to determine the cluster performance levels. Based on these levels, the benchmarks and possible targets for the next period could be readily defined.

Sample Characteristics.
The research data were collected from VNA's Market Service Department. They consist in large part of quarterly SLA service quality reports for ground handling companies at VNA's 28 international airport destinations. From these quarterly figures, the authors formed two data sets by averaging, one for the year 2015 and the other for 2016. The majority of sample airports are located in Asia, which accounts for 80% (23 airports) of airports concerned in the study, while the balance of 14% (4 airports) are located in Europe, with a further 6% (2 airports) located in Australia.
The VNA Market Service Department has developed 7 criteria to assess the service quality provided by ground handlers. The data for these criteria were drawn from different sources, including the information systems used both internationally and internally, daily reports produced by station managers, and autogenerated messages relevant to routine operational activities. The criteria used are described as follows:
(i) Check-in process, boarding process, and staffing attitude: the figures for these first 3 criteria originate from passenger surveys administered quarterly and distributed randomly onboard VNA flights. In the survey, passengers are asked to report the amount of time spent queuing and waiting at check-in counters; the perceived level of convenience concerning boarding time; the gate-area staff's assistance in providing instructions, together with information signage and multilingual announcements; and the perceived helpfulness and courtesy of airport staff during their communications and interactions with passengers.
(ii) Customer complaints: data with respect to this criterion were derived from internal record keeping, indicated by the rate of consumer complaints per 1,000 passengers.
(iii) Baggage mishandling: the figures for this criterion were drawn from the World Tracer system, which traces lost baggage items worldwide. This criterion indicates the rate of mishandled baggage reported per 1,000 passengers.
(iv) Personal documents mishandling: the data related to this criterion are also drawn from VNA's internal records. This criterion indicates errors which occur when checking passengers' travel documents for compliance with entry, transit, or other requirements based on the countries of origin and arrival.
(v) Punctuality: the actual time an airplane departed, or arrived, is compared with the scheduled time of departure (STD) and expected time of departure (ETD) to determine whether the flight was on time. A flight is considered late if the time difference is fifteen minutes or more. The actual time is measured by the automatic messages sent system-wide when an airplane takes off or lands.
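Two of the definitions above are simple enough to express directly: the rate of incidents per 1,000 passengers used for criteria (ii) and (iii), and the fifteen-minute lateness rule used for punctuality. The sketch below is illustrative only; the function names and the example figures are hypothetical, and it compares the actual time against a single scheduled time for simplicity.

```python
from datetime import datetime, timedelta

LATE_THRESHOLD = timedelta(minutes=15)  # "fifteen minutes or more" counts as late


def is_late(scheduled, actual):
    """True when the actual time is 15 minutes or more after the scheduled time."""
    return actual - scheduled >= LATE_THRESHOLD


def rate_per_1000(incidents, passengers):
    """Incidents (complaints, mishandled bags) per 1,000 passengers."""
    if passengers <= 0:
        raise ValueError("passengers must be positive")
    return 1000 * incidents / passengers


# Hypothetical flight and station figures, not VNA data.
std = datetime(2016, 3, 1, 10, 0)
print(is_late(std, datetime(2016, 3, 1, 10, 14)))  # 14 minutes after STD
print(is_late(std, datetime(2016, 3, 1, 10, 15)))  # exactly 15 minutes after STD
print(rate_per_1000(incidents=3, passengers=1500))
```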
Although there are 7 criteria that have been commonly used in the literature, as previously stated, only the first 6 criteria were examined in this study. The criterion "Punctuality" was excluded because almost all airports were on time, to a greater or lesser extent. The figures for these criteria are calculated on a 100-point scale. For the first 3 criteria, the scores represent the effectiveness of the ground handling companies, whereas for the last 3 criteria, the scores represent their perceived ineffectiveness.
Compared with the company targets for both years, the data values indicate that the system reached only about half of the preset targets. More targets were achieved for the last 3 criteria; in particular, the rate of customer complaints was considerably low. Although the dendrograms (Figure 3 for 2016) revealed cluster analysis results with numerous clusters at different distances, the authors applied a cutoff on the dendrograms at the distance of 12 for both years. In this way, 5 airport clusters were observed for each year. The distance was chosen to benefit operations management and for the sake of visual presentation, so that the airports could be quickly joined to form 5 main groups. This method showed a higher degree of homogeneity among the airports in each cluster in terms of efficiency. The 5 resulting clusters in 2015 were named A1, A2, A3, A4, and A5, and the 5 clusters in 2016 were named B1, B2, B3, B4, and B5. According to their performance, as measured by VNA, the features of these clusters are described as follows:

Research Findings and Discussion
Cluster A1: RGN, PNH, KUL, and HKG (4 airports). The cluster's performance needed some improvement in the first three attributes, namely check-in, boarding, and staff attitude. It was average in baggage handling and personal travel documents. The number of cases filed by passengers was lower than expected. Less than 40% of the VNA targets set for this cluster were reached.
Cluster A2: NRT, KIX, FUK, VTE, NGO, and CGK (6 airports). The cluster exceeded performance standards in baggage handling. On the other criteria, airport performance was at a median level. Moreover, performance met the airline's expectations in terms of customer complaints and mishandled baggage. It achieved over 40% of the airline's targets.
Cluster A3: REP, PVG, TPE, PEK, PUS, CTU, SIN, and CAN (8 airports). The cluster was above performance standards in the boarding process, yet it remained relatively unproductive in terms of check-in handling and staffing attitude. The majority of airports in this grouping received no complaints from passengers. It reached about 40% of the airline's targets.
Cluster A4: SYD, MEL, LHR, and FRA (4 airports). The cluster met 50% of the targets set by the airline. It ranked among the best in the system in terms of passenger check-in, the boarding process, and staffing attitude. However, its rates of mishandled baggage and personal documents were the highest.

Cluster A5: KHH, DME, ICN, HND, LPQ, and BKK (6 airports). The cluster provided above-average service quality. Airports in this grouping achieved over 65% of company targets. Like A4, the cluster exceeded standards for the first three criteria. It also performed above average on the remaining criteria.
Cluster B1: FUK, KIX, NGO, and RGN (4 airports). For check-in, the boarding process, and staffing courtesy, B1 needed improvement because it ranked lowest in the system. Its performance on the last 3 attributes was better than on the first three. In addition, the cluster reached less than 40% of the stipulated targets.
Cluster B2: CGK, SYD, and VTE (3 airports). Although the cluster achieved most of its targets (65%), its boarding process control needed improvement. The performance levels for check-in, the number of negative comments, and the helpfulness of staff were all above average.
Cluster B3: CAN, ICN, KUL, NRT, PNH, and PUS (6 airports). This cluster reached 30% of targets, which means that performance was average on most criteria.

Cluster B4: LPQ and LHR (2 airports). This cluster comprises two airports that performed at a high level in check-in service, the boarding process, and staff communication with passengers. It reached over 50% of the targets set. It should be noted that one airport was much better than the other at baggage handling and personal travel document control.
Cluster B5: REP, PVG, DME, CTU, TPE, PEK, KHH, MEL, FRA, HKG, SIN, HND, and BKK (13 airports). This cluster performed well on the first four criteria. No passenger complaints were made against this cluster, with the possible exception of BKK. In addition, most airports in this cluster consistently provided a high level of baggage handling and personal travel document handling. This grouping reached less than 50% of the set targets.
In conclusion, the number of targets met in both years was lower than expected. More importantly, performance levels on individual criteria varied among the 5 clusters of 2015 and among the 5 clusters of 2016, while the airports within each cluster showed similar performance levels on one or more criteria.
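The step of cutting a dendrogram at a fixed distance, as done at distance 12 in this study, can be sketched as follows. The sketch assumes the merge history is already available as a list of `(member_of_left, member_of_right, height)` tuples; the function name, the labels, and the heights are hypothetical, and the numeric scale of merge heights depends on the software used, so the value 12 here is only illustrative. Because Ward merge heights do not decrease as clustering proceeds, applying every merge at or below the cutoff gives the same groups as cutting the tree at that height.

```python
def cut_dendrogram(labels, merges, cutoff):
    """Group labels by applying only the merges with height <= cutoff."""
    parent = {label: label for label in labels}

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for left, right, height in merges:
        if height <= cutoff:
            parent[find(left)] = find(right)

    groups = {}
    for label in labels:
        groups.setdefault(find(label), []).append(label)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical merge history for five airports, not the paper's dendrogram.
labels = ["AP1", "AP2", "AP3", "AP4", "AP5"]
merges = [
    ("AP3", "AP4", 2.0),
    ("AP1", "AP2", 3.5),
    ("AP3", "AP5", 14.0),   # above the cutoff, so AP5 stays on its own
    ("AP1", "AP3", 25.0),
]
clusters_at_12 = cut_dendrogram(labels, merges, cutoff=12.0)
print(clusters_at_12)
```

Raising the cutoff yields fewer, larger clusters; lowering it yields more, smaller ones, which is the trade-off behind choosing a single distance for both years.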

Specification of Service Quality Levels.
After the 10 clusters were produced, ANOVA was applied to discover the criteria in which statistically significant differences in service quality existed. Then, for the criteria with significant differences, a Scheffé post hoc test was applied to reveal which clusters outperformed others and which performed at the same level, even though their raw performance outcomes always varied to some degree. The results of these 2 tests (i.e., ANOVA and the Scheffé post hoc test) are presented in Tables 3 and 4.
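As an illustration of the first test, the one-way ANOVA F statistic for a single criterion can be computed as in the minimal sketch below. The function name `one_way_anova` and the scores are hypothetical and are not taken from the VNA data; the sketch returns only the F statistic and its degrees of freedom, so the p value would still have to be read from an F table or a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    Each inner list holds the scores of the airports in one cluster
    for a single criterion (hypothetical layout).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of cluster means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of airport scores around their own cluster mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical check-in scores for three clusters (not the paper's data).
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # prints F(2, 6) = 13.00
```

A large F value relative to the critical value for the given degrees of freedom suggests that at least one pair of clusters differs on that criterion.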

Overview of the Service Quality
(1) Criteria with No Significant Differences.
The ANOVA results indicate that the differences in customer complaints were not significant (F = 1.013, p = 0.421) in 2015, as shown in Tables 3 and 4. These figures mean that, for the above-mentioned criteria, all clusters delivered similar service quality levels in both years because the scores they attained were statistically the same.
The ANOVA results also indicate where ground handling service quality differed in both years. In 2015, the service quality of the check-in process, the boarding process, staffing attitude, baggage mishandling, and personal document mishandling differed considerably among the 5 clusters. In other words, there were substantial performance gaps among the 5 clusters. Similarly, in 2016, statistically significant differences were found in the check-in process, the boarding process, and staffing attitude. For these 3 criteria, the performance gaps among the clusters were considerable. The 2 years present dissimilar performance values for certain criteria. As such, it is believed that more managerial attention needs to be focused on these criteria.
In theory, ANOVA identifies the criteria for which at least one pair of clusters differs significantly, but it does not indicate where the differences lie [51]. In order to locate the differences in performance among the clusters for these individual criteria, the authors used the Scheffé post hoc test. In this way, the benchmarks and the underperforming clusters for the designated criteria could be determined in the next part. Tables 3 and 4 show the results of the Scheffé post hoc test and present an overview of the service quality of the network in 2015 and 2016. By means of this test, actual cluster performance levels can be determined, which leads to benchmark formulation. The detailed descriptions for the 5 criteria with significant differences are as follows:

Benchmarks Revealed from Comparison of Clusters' Service Quality.
(i) Check-in process: A1 was the worst-performing cluster on this criterion. Airports in this cluster kept passengers waiting in lengthy check-in queues and took longer than average to process check-in. A2 performed better than A1. There was no performance gap between A4 and A5. Both clusters exceeded performance standards and kept passengers' queue wait times under control, and passengers appeared satisfied with the quality of the check-in service. However, A5 met more of the targets stipulated by VNA than did A4. Hence, A4 and A5 should become the benchmark for A3, while A2 can become the benchmark for A1.
In 2016, the check-in criterion showed 4 statistically significant service levels, the most of any criterion across the 5 clusters. Regarding these levels, B3 offered a service quality level similar to that of B1. They needed improvement because they were formed by the lowest-performing airports; notably, they included 4 of the 5 Japanese airports, where passengers expected a higher level of service. Their check-in service quality was below flyers' expectations. Also, 2 pairs of clusters, B3 and B5 and B5 and B2, were at equal performance levels. In contrast, B4 was able to deliver an exceptional standard of service, and it became the benchmark for B5 and B2, while B5 should become the benchmark for B1, and B2 the benchmark for B3.
(ii) Boarding process: the same performance level occurred between A1 and A2 and among A3, A5, and A4. However, A3, A5, and A4 outperformed A1 and A2. Neither A1 nor A2 provided a smooth, convenient boarding process. The information given at the boarding gates remained unclear. Boarding announcements were sometimes unclear, which caused confusion among passengers. Likewise, the staff did not promptly render assistance that met passengers' requirements. Thus, one benchmark is suggested. In 2016, by contrast, the findings revealed no differences in service quality between B4 and B5, between B3 and B1, and between B1 and B2. Significant performance gaps existed between B2 and B3; between B3 and B5 or B4; and between B1 and B5. B4 and B5 were the front runners in handling boarding tasks, while B3, B1, and B2 needed to demonstrate better control of this process. B4, B5, and B3 turned out to be benchmarks for the lower-performing clusters that follow them.
(iii) Staffing attitude: the productivity of A4 and A5, and of A2 and A3, did not differ significantly. Two substantial gaps appeared. A1 needs to solve the problem of communication between staff and passengers. Staff should deliver information clearly and appear helpful and friendly. Based on the two gaps, A4 and A5 can be seen as benchmarks for A3 or A2, while A2 can be A1's benchmark.
The 2016 results presented 3 significant levels because performance was the same among B5, B2, and B3 and between B3 and B1. Benchmarks can therefore be seen as B4, B2, or B5.
(iv) Baggage mishandling: in the comparison of the 5 clusters in 2015, there were no gaps among A2, A5, A3, and A1, but relatively large gaps between them and A4. Therefore, they can become the benchmarks for A4. The airports in A4 had the lowest scores. There were serious problems with misrouted baggage, damaged baggage, wrongly tagged baggage, and wrongly loaded baggage. The number of mishandled baggage cases at these airports was the highest recorded in the system.
(v) Personal document mishandling: the results indicate that the performance of A5 was statistically equal to that of A3, A1, and A2, while A4 was the worst-performing cluster. A4's check-in agents made errors in checking passengers' travel documents, in direct noncompliance with the entry, transit, or other requirements of the countries of travel. Hence, any of A1, A2, A3, and A5 can become the benchmark for A4.
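The pairwise comparisons described above can be sketched as follows. This is a minimal illustration of a Scheffé pairwise test under stated assumptions, not the authors' implementation: the function name `scheffe_pairs`, the cluster labels, and the scores are hypothetical, and the critical value `f_crit` for F(k − 1, N − k) is assumed to be supplied by the reader from an F table.

```python
from itertools import combinations
from statistics import mean


def scheffe_pairs(groups, f_crit):
    """Flag which pairs of clusters differ under Scheffe's criterion.

    groups: dict mapping a cluster label to its list of scores.
    f_crit: critical F(k - 1, N - k) value at the chosen alpha,
            taken from an F table (an assumption of this sketch).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups.values())
    # Pooled within-cluster variance (mean square within).
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    ms_within = ss_within / (n_total - k)

    flags = {}
    for (a, ga), (b, gb) in combinations(groups.items(), 2):
        diff = mean(ga) - mean(gb)
        # Scheffe statistic: squared mean difference scaled by its
        # standard error and by the number of clusters minus one.
        f_s = diff ** 2 / (ms_within * (1 / len(ga) + 1 / len(gb)) * (k - 1))
        flags[(a, b)] = f_s > f_crit
    return flags


# Hypothetical scores for three clusters (not the paper's data).
clusters = {"C1": [1, 2, 3], "C2": [2, 3, 4], "C3": [5, 6, 7]}
# 5.14 is the tabulated critical value of F(2, 6) at alpha = 0.05.
flags = scheffe_pairs(clusters, f_crit=5.14)
for pair, differs in flags.items():
    print(pair, "differ" if differs else "same level")
```

In this toy example C1 and C2 fall on the same level while C3 differs from both, which is the kind of grouping from which performance levels and benchmarks are read.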
To sum up, these two tests may provide VNA's administrators with criterion-based insight into the underperforming service areas that directly affect VNA passengers throughout its international operations. For each of these criteria, managers need to recognize the actual service-level performance of each airport cluster. Airline management will be able to identify outperforming clusters, for which proffered incentives can encourage and maintain service excellence. Managers can also identify the performance parameters that call for improvement action. It is believed that the test results will support VNA's strategic management in directing effective and efficient ground handling performance. Furthermore, these findings may also serve the station managers in charge of service provision at the airports, who can compare the productivity level of their airport with that of other airports in their cluster. Identifying the significantly different levels of service performance among the clusters is of great managerial importance since the specific performance level of the next higher cluster can easily be used as a benchmark (suggested target) for the cluster immediately below it. In this way, VNA management is able to establish a more realistic and readily achievable set of targets for certain clusters in the future.
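The benchmarking rule stated above, in which the next higher performance level serves as the suggested target for the level immediately below it, can be sketched as follows. The function name `assign_benchmarks` and the cluster labels are hypothetical; the levels are assumed to have already been derived from the post hoc test and ordered from lowest to highest.

```python
def assign_benchmarks(levels):
    """Map each cluster to the clusters on the next higher level.

    levels: list of lists of cluster labels, ordered from the lowest
            to the highest performance level; clusters in the same
            inner list are assumed not to differ significantly.
    """
    benchmarks = {}
    for i, level in enumerate(levels):
        # The top level has no benchmark above it.
        higher = levels[i + 1] if i + 1 < len(levels) else None
        for cluster in level:
            benchmarks[cluster] = higher
    return benchmarks


# Hypothetical levels for one criterion (not the paper's results).
levels = [["C1"], ["C2", "C3"], ["C4"]]
benchmarks = assign_benchmarks(levels)
for cluster, target in benchmarks.items():
    print(cluster, "->", target)
```

Here C1 would take C2 or C3 as its benchmark, C2 and C3 would take C4, and C4, as the top performer, has no benchmark within the system.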

Conclusion and Recommendation
The present paper has used a model of cluster analysis, ANOVA, and the Scheffé post hoc test to better understand the quality of VNA's ground handling service as recently delivered to VNA passengers at 28 international airports, measured against 6 criteria. The method used to find benchmarks for general improvement in the next service period has been demonstrated.
This analysis aims to support the company's overall management strategy of attracting repeat business by focusing on customer satisfaction, which leads to profitability. The study findings present an overview of the service performance level of this organization, which is beneficial to VNA's top management as well as to station managers. Top management may identify the service system's weaker areas in order to plan improvement, benchmarking, and potential targets for the next evaluation periods. Through the analysis, 5 airport clusters were obtained for 2015. These clusters differed significantly in 5 of the criteria, namely, the check-in process, the boarding process, staffing attitude, baggage mishandling, and personal document mishandling. The 5 clusters obtained for 2016 differed significantly in 3 criteria, namely, the check-in process, the boarding process, and staffing attitude. For these individual criteria, the significant differences in productivity levels among the clusters have been identified. Subsequently, the outperforming clusters and the underperforming clusters are revealed. It is recommended that the outperforming clusters become the benchmark for the clusters immediately below them. As such, the scores of the benchmark clusters may also suggest realistic and objective targets for the coming service period.
More importantly, by determining the meaningfully different levels of service quality among the clusters, this outcome provides the expatriate station managers with a clearer view of their local productivity level in relation to other airports within their own cluster.
Using the cluster analysis method to obtain benchmarks as achievable targets advances the understanding of the quality of the service system, so that further recommendations may be made to VNA's senior managers and VNA's international airport representatives, as follows.
Firstly, in engaging ground handling companies as its representatives, VNA faces a distinct loss of management control [43]. In order to minimize this risk, the company has been implementing the SLA through the supervision of expatriate station managers who can vet the desired service quality. Therefore, the selection of station managers, the use of an effective motivational program, and the application of state-of-the-art check-in technologies are of key importance.
Station managers who are selected should be experienced and skillful since they play an important role in building partnerships, as well as in coordinating directly with the ground handling agents.
Regarding the motivation of these supervisory representatives, VNA should design an effective managerial program. Such a program should include routine meetings of all station managers to ensure knowledge sharing, and special attention should be paid to underperforming airports through reports that explain problem-solving activities and justify events.
Although VNA's online check-in service is already available, VNA should consider additional technology-based self-service alternatives such as self-check-in kiosks and barcode-activated check-in. These automated check-in facilities offer numerous advantages to fliers, such as enabling them to pick desired seating, reducing check-in times, and preventing congestion and delay at check-in counters [52]. However, passengers' intention to use them and their satisfaction in doing so are still low. When adopting these technologies, airlines such as VNA may offer additional benefits or incentives for prior seat selection in order to encourage departing passengers to use self-service technologies, and may allocate local staff to provide specialized instruction to assist passengers in the use of these facilities [53]. Moreover, where space is limited, locating the self-check-in kiosks close to the luggage conveyor system will further enable satisfactory service for both the airline and the fliers [54]. In sum, reducing waiting time helps ensure passenger satisfaction since excessive wait times are considered to be one factor directly affecting overall flier satisfaction [53], and check-in wait times carry the heaviest weight in the overall passenger service process.
Secondly, the interaction of passengers, distinct airport features, and airport entry-exit procedures are all crucial to the customer service experience. The station managers should be aware of the profiles of the passengers they serve at the airports (i.e., nationality, specialized groups, and various traveling purposes) since these features affect passengers' expectations of service quality [24]. Similarly, although the signed SLA is rather detailed, the different characteristics of individual airports should be taken into account.
There are both advantageous and disadvantageous elements, such as employee skill sets, passenger traffic and flow, and aircraft movement and placement. Employees with good knowledge and skills contribute to a better service quality level [43], and airports with capacity constraints may impose delays on aircraft and passengers [19]. Apart from regular procedures, such as security filters, passport control, and pinch-points that may differ between airports, there may also be occasional or frequent security measures imposed on passengers at certain airports because of threat levels. The station managers need to ascertain the situation in a timely manner in order to carry out the necessary actions that may assist passengers.

Suggested Research.
Future research should focus on comparing VNA's ground handling service quality management methods with those of other carriers and on how to generate service quality targets by group, which would support a positive assessment of VNA's ground handling service worldwide.

Data Availability
Access to data is restricted due to commercial confidentiality.

Conflicts of Interest
The authors declare that they have no conflicts of interest.