A Mixture Model for the Analysis of Categorical Variables Measured on Five-point Semantic Differential Scales

Ordered response scales are often used in questionnaires to measure individuals’ attitudes or perceptions. Among different response scale formats, we focus on multi-point semantic differential scales, requiring the respondent to position himself/herself on a rating between two bipolar adjectives. The obtained rating data require appropriate statistical models. We resort to the CUM model (Combination of a discrete Uniform and a - linearly transformed - Multinomial random variable), recently proposed in the framework of the CUB (Combination of discrete Uniform and shifted Binomial random variables) class of models. CUM is also suited to all the ordinal response scales with a middle “indifference” option. In the seminal paper on CUM, the methodological approach was developed for an odd number m of response categories, while simulations, case studies and implementation in R were limited to m = 7. The objective of this paper is to extend the original proposal and investigate the model performance in the case of m = 5, which often arises in real situations. The R functions for fitting a CUM model with m = 5 are implemented and made available; simulation studies are developed and compared with results obtained for m = 7 and a case study concerned with the evaluation of museums’ visitor experience is proposed.


Introduction
Ordered response scales are often used in questionnaires to measure individuals' perceptions, attitudes, evaluations, habits and behaviours. For example, when we investigate patient satisfaction, respondents are often asked to rate their satisfaction with a given service on a response scale from 1 = "completely dissatisfied" to, say, 5 = "completely satisfied." Among the variety of possible response scales (Dawis 1987), we focus on (multi-point) semantic differential scales (mSDS), used when respondents are asked to assess their position between two opposite words/concepts/adjectives (for example, sad and happy, weak and strong, dissatisfied and satisfied, etc.). Instead of positioning themselves on a continuous trait, for example by drawing a tick on a segment, respondents facing a multi-point semantic differential scale select their preferred option among multiple ratings ranging from one of the two words/concepts/adjectives to the other.
When dealing with this kind of response scale, the middle option plays a relevant role, because it indicates a central position between two extremes. However, this can be considered true for all response scales having an odd number of ordered categories with a midpoint that means indifference, like "neutral," "neither positive nor negative," or "neither agree nor disagree", as is often the case with the Likert scales commonly used in several fields.
Rating data obtained by the administration of questionnaires need to be modelled with appropriate statistical models and methods, properly designed to account for the categorical ordinal nature of these variables (Agresti 2010, 2013). In addition, specific models can be considered when rating data come from mSDS.
Among the statistical models existing in the literature for the analysis of ordinal data measured on semantic differential scales (SDS), we focus on a mixture model, called CUM model (Combination of a discrete Uniform and a - linearly transformed - Multinomial random variable), recently proposed by Manisera and Zuccolotto (2022) in the framework of the CUB class of models (where CUB stands for Combination of discrete Uniform and shifted Binomial random variables).
The CUB class of models (Piccolo 2003; D'Elia and Piccolo 2005) innovates with respect to other existing models for categorical data by interpreting the decision process that underlies the respondent's choice as the result of a combination of two latent components. They are called feeling and uncertainty and measure (i) the individuals' level of perception, attitude or evaluation (for example, satisfaction) with respect to the object under evaluation and (ii) the indecision that surrounds any choice among multiple ratings. The two components are then collected in a mixture distribution, combining a (Shifted) Binomial random variable (r.v.) that models feeling and a Uniform r.v. that models uncertainty. CUB models have been extended in several directions and applied in many fields (Piccolo and Simone 2019). The most recent contributions deal with the possibility of modelling subjective heterogeneity (Simone, Tutz, and Iannario 2020), some computational issues (Simone 2021; Cerulli, Simone, Di Iorio, Piccolo, and Baum 2022), a new approach combining CUB models with decision trees (Cappelli, Simone, and Iorio 2019; Simone, Cappelli, and Di Iorio 2019) and cluster analysis (Biasetton, Disegna, Barzizza, and Salmaso 2023).
The CUM model was introduced (Manisera and Zuccolotto 2022) to extend the CUB paradigm to specifically model the peculiar decision process followed by respondents facing mSDS (or rating scales with a neutral midpoint). CUM was developed within the framework for modelling the decision process proposed in Manisera and Zuccolotto (2014a), as a mixture model combining a linearly transformed Multinomial (ltM) and a discrete Uniform r.v. In Manisera and Zuccolotto (2022) the methodological approach of the CUM model was developed for the general situation of an odd number m of response categories, while simulations, case studies and implementation in R (R Core Team 2019) were limited to m = 7.
The aim of this paper is to further investigate the functioning of the CUM model in the presence of a semantic differential response scale with m = 5 categories. In particular, we develop a simulation study following the configuration in Manisera and Zuccolotto (2022) to compare the results for m = 5 with those for m = 7, propose a case study and adapt the R functions to cope with m = 5.
At https://bodai.unibs.it/cub/, interested readers can find references, the R functions and some data examples about the CUM model, besides other methodological advancements on rating data analysis developed by the "Big & Open Data Innovation Laboratory" research group of the University of Brescia, Italy.
The paper is structured as follows. The CUM model is briefly recalled in Section 2. Subsection 2.1 details some of the CUM characteristics in the case of m = 5. Section 3 presents the results of the simulation study, focusing on the comparison between the two situations of m = 7 and m = 5. In Section 4 we present an application of the CUM model to real data coming from a recent survey aimed at evaluating the museums' visitor experience. Section 5 draws some conclusions.

The CUM model
According to Manisera and Zuccolotto (2014a), the decision process (DP) underlying the respondents' choice on a response scale with ordered categories is composed of two components that, borrowing the CUB terminology, are called feeling and uncertainty. They originate in the respondent's mind, where two approaches are unconsciously combined to obtain a final choice on the response scale.
The feeling and uncertainty approaches acting in the DP are rigorously defined in Manisera and Zuccolotto (2014a). Several variants of the CUB models can be derived from the DP as special cases, for example the NonLinear CUB model (Manisera and Zuccolotto 2014a, 2014b, 2015). In addition, novel models can be proposed by modifying some of the ingredients of the DP (the distribution of the r.v.s involved, and the so-called accumulation and likertization functions) (Manisera and Zuccolotto 2019), like the CUM model (Manisera and Zuccolotto 2022).
In the CUM model, the observed rating $r$, $r = 1, 2, \ldots, m$, with $m = 2k + 1$, is the realization of $R$, a discrete r.v. with probability mass given by:

$$p(R = r \mid \theta) = \pi \, P(W_k = r \mid \xi_D, \xi_U) + (1 - \pi) \, P(U_m = r), \tag{1}$$

where $\theta = (\pi, \xi_D, \xi_U)'$ and $P(U_m = r) = 1/m$. For a given $m$, $R$ is distributed as a mixture of a ltM r.v. $W_k(r \mid \xi_D, \xi_U)$ and a discrete Uniform r.v. $U_m$ defined over $\{1, 2, \ldots, m\}$.
In detail, $U_m$ models the uncertainty component and $1 - \pi$ is its weight in the mixture. Therefore, $1 - \pi$ can be taken as a measure of the uncertainty component.
On the feeling side, the r.v.

$$W_k = M_{k,U} - M_{k,D} + k + 1 \tag{2}$$

is a linear transformation of the Multinomial r.v. $M_k$.
The Multinomial r.v. $M_k$ originates from a DP where the feeling path unconsciously followed by the respondent is composed of $k$ steps. At each step, the respondent formulates an elementary judgement that can be modelled by a Multinoulli distribution assuming value [0, 1, 0] with probability $\xi_U$ when the respondent makes one step towards the highest rating (the right-hand side of the response scale), value [1, 0, 0] with probability $\xi_D$ when the respondent makes one step towards the lowest rating (the left-hand side of the response scale), and value [0, 0, 1] with probability $1 - \xi_D - \xi_U$ when the respondent decides to stay still. The judgement achieved at the $k$-th step of the feeling path is given by the total number of elementary judgements provided up to the $k$-th step, $M_k = (M_{k,D}, M_{k,U}, k - M_{k,D} - M_{k,U})$, where $M_{k,D}$ and $M_{k,U}$ are respectively the total number of steps towards the lowest and highest rating made up to the $k$-th step. Therefore, the total number of elementary judgements at step $k$ can be modelled by the sum of independent Multinoulli r.v.s, that is a Multinomial r.v.

$$M_k \sim \text{Multinomial}(k; \, \xi_D, \xi_U, 1 - \xi_D - \xi_U)$$

with trial parameter $k$. The linear transformation (2) is then needed to map from the domain of $M_k$ to $\{1, 2, \ldots, m\}$.
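The feeling path described above can be sketched as a short simulation. This is a minimal illustration in Python, not the authors' R implementation; the function name `feeling_path` and the parameter values are hypothetical, and the step rule assumes that each "up" step adds one and each "down" step subtracts one from the midpoint k + 1.

```python
import random

def feeling_path(k, xi_d, xi_u, rng):
    """Walk k elementary judgements from the midpoint k + 1; return the rating."""
    rating = k + 1  # the respondent starts from the middle category
    for _ in range(k):
        u = rng.random()
        if u < xi_d:
            rating -= 1  # one step towards the lowest rating
        elif u < xi_d + xi_u:
            rating += 1  # one step towards the highest rating
        # otherwise (probability 1 - xi_d - xi_u) the respondent stays still
    return rating

rng = random.Random(0)
# Hypothetical parameter values, chosen only for illustration.
draws = [feeling_path(2, 0.1, 0.6, rng) for _ in range(10000)]
shares = [draws.count(r) / len(draws) for r in range(1, 6)]
print(shares)
```

With k = 2 the simulated ratings always fall in {1, ..., 5}, and their relative frequencies approximate the distribution of the ltM component alone, that is, a respondent with no uncertainty.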
The CUM model then has two feeling parameters, $\xi_D$ and $\xi_U$, which measure the respondents' propensity to move towards the left-hand side (i.e., the lowest rating) and the right-hand side (i.e., the highest rating) of the response scale, respectively, starting from the mid-point $k + 1$. The difference $1 - \xi_D - \xi_U$ then measures the propensity to stay still.
In Manisera and Zuccolotto (2022), the probability mass function of $W_k(r \mid \xi_D, \xi_U)$ was derived in the general case of $m = 2k + 1$ (Formula (3)); its expression involves the operators $\lfloor \cdot \rfloor$ and $\lceil \cdot \rceil$, which round down and up, respectively. Formula (3) is the starting point to derive, in Subsection 2.1, the probability mass function of $W_2$.

Parameter estimation was carried out by the Maximum Likelihood method. For fixed $m$ and given the ratings $r = (r_1, \ldots, r_n)'$ of $n$ independent subjects in a sample, the log-likelihood function is

$$\ell(\theta) = \sum_{r=1}^{m} n_r \log p_r,$$

where $\theta = (\pi, \xi_D, \xi_U)'$, $n_r$ is the (absolute) frequency of $r$, and $p_r = p(R = r \mid \theta)$.
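As a numerical companion to the definitions above, the following Python sketch computes the probability mass of the ltM component by direct enumeration of the Multinomial outcomes, then the CUM mixture and the log-likelihood. It does not use the closed form of Formula (3) and is not the authors' R code; the function names and the example frequencies are hypothetical.

```python
from math import factorial, log

def ltm_pmf(k, xi_d, xi_u):
    """P(W_k = r) for r = 1..2k+1, summing Multinomial probabilities."""
    xi_s = 1.0 - xi_d - xi_u  # probability of staying still
    pmf = [0.0] * (2 * k + 1)
    for down in range(k + 1):
        for up in range(k + 1 - down):
            stay = k - down - up
            coef = factorial(k) // (factorial(down) * factorial(up) * factorial(stay))
            prob = coef * xi_d**down * xi_u**up * xi_s**stay
            rating = up - down + k + 1  # linear map from step counts to rating
            pmf[rating - 1] += prob
    return pmf

def cum_pmf(k, pi, xi_d, xi_u):
    """Mixture of the ltM component (weight pi) and a discrete Uniform."""
    m = 2 * k + 1
    return [pi * w + (1.0 - pi) / m for w in ltm_pmf(k, xi_d, xi_u)]

def log_likelihood(freqs, k, pi, xi_d, xi_u):
    """Sum of n_r * log(p_r) over the observed rating frequencies."""
    probs = cum_pmf(k, pi, xi_d, xi_u)
    return sum(n * log(p) for n, p in zip(freqs, probs) if n > 0)

# Hypothetical absolute frequencies on a five-point scale (k = 2).
example_freqs = [48, 88, 208, 328, 328]
print(cum_pmf(2, 0.8, 0.1, 0.6))
print(log_likelihood(example_freqs, 2, 0.8, 0.1, 0.6))
```

Enumeration is only practical for small k, which is the relevant case here (k = 2 or k = 3).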
Since the maximization of the log-likelihood function by numerical methods (Nelder and Mead 1965) showed some convergence issues and high sensitivity to starting values, parameters of the CUM models can be estimated by resorting to the EM algorithm (Manisera and Zuccolotto 2022), as is commonly done with mixture models and, in particular, in mixture distributions of the CUB class (Manisera and Zuccolotto 2017).
The derivatives to be used to compute the M-step of the EM algorithm for CUM estimation and the (asymptotic) variance-covariance matrix were obtained in Manisera and Zuccolotto (2022) for m = 7. In this paper, we use the same approach and, with simple algebra, adapt the computation of those derivatives to the case of m = 5 (computations available upon request from the Authors).
The CUM model has a complex likelihood surface, which causes stability issues in the estimates obtained via the EM algorithm. In addition, the identifiability of the CUM model over the whole parameter space is still an open issue. Therefore, for estimation in this paper we checked the uniqueness of the EM estimates obtained from a grid of different starting values.
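The multi-start strategy can be illustrated with a small EM sketch in Python for m = 5. It is an assumption-laden illustration rather than the authors' procedure: instead of the derivative-based M-step of Manisera and Zuccolotto (2022), it treats the whole feeling path (the Multinomial outcome) as latent, which gives closed-form updates. All names, the starting grid and the frequencies are hypothetical.

```python
from math import factorial, log

K = 2  # five-point scale, m = 2K + 1

def outcomes(xi_d, xi_u):
    """(rating, down, up, probability) for every Multinomial outcome."""
    xi_s = 1.0 - xi_d - xi_u
    out = []
    for down in range(K + 1):
        for up in range(K + 1 - down):
            stay = K - down - up
            coef = factorial(K) // (factorial(down) * factorial(up) * factorial(stay))
            out.append((up - down + K + 1, down, up,
                        coef * xi_d**down * xi_u**up * xi_s**stay))
    return out

def mixture(pi, xi_d, xi_u):
    """CUM probabilities p_r for r = 1..m."""
    m = 2 * K + 1
    w = [0.0] * m
    for r, _, _, p in outcomes(xi_d, xi_u):
        w[r - 1] += p
    return [pi * wr + (1.0 - pi) / m for wr in w]

def loglik(freqs, pi, xi_d, xi_u):
    return sum(n * log(p) for n, p in zip(freqs, mixture(pi, xi_d, xi_u)) if n)

def em(freqs, start, n_iter=500):
    pi, xi_d, xi_u = start
    n = sum(freqs)
    for _ in range(n_iter):
        p_mix = mixture(pi, xi_d, xi_u)
        feel = down_sum = up_sum = 0.0
        # E-step: expected number of respondents on each feeling path
        for r, down, up, p in outcomes(xi_d, xi_u):
            if freqs[r - 1] == 0:
                continue
            weight = freqs[r - 1] * pi * p / p_mix[r - 1]
            feel += weight
            down_sum += weight * down
            up_sum += weight * up
        # M-step: closed-form updates of the complete-data likelihood
        pi = feel / n
        xi_d = down_sum / (K * feel)
        xi_u = up_sum / (K * feel)
    return pi, xi_d, xi_u

freqs = [48, 88, 208, 328, 328]  # hypothetical frequencies
starts = [(p, d, u) for p in (0.3, 0.7) for d in (0.1, 0.4) for u in (0.1, 0.4)]
fits = [em(freqs, s) for s in starts]
best = max(fits, key=lambda th: loglik(freqs, *th))
print(best)
print(sorted(round(loglik(freqs, *th), 4) for th in fits))
```

Comparing the log-likelihoods and the estimates reached from the different starting points is a simple way to check whether the solution is unique for a given dataset.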
Parameter estimates can be effectively represented with ternary plots (Manisera and Zuccolotto 2022). The dissimilarity (diss) index, as reported in Piccolo and Simone (2019, p. 408), measures the distance between the observed relative frequencies and the estimated probabilities; it is a normalized index (diss ∈ [0, 1]) and can be used to assess the goodness of fit of a CUM model, with lower values indicating a better fit.
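The dissimilarity index is commonly computed in the CUB literature as half the sum of the absolute differences between observed relative frequencies and fitted probabilities. A minimal Python sketch under that assumption follows; the function name and the example numbers are hypothetical.

```python
def dissimilarity(freqs, probs):
    """Half the L1 distance between relative frequencies and fitted probabilities."""
    n = sum(freqs)
    return 0.5 * sum(abs(f / n - p) for f, p in zip(freqs, probs))

# Hypothetical observed frequencies and fitted probabilities on a 5-point scale.
observed = [50, 90, 200, 330, 330]
fitted = [0.048, 0.088, 0.208, 0.328, 0.328]
print(dissimilarity(observed, fitted))
```

The index is 0 when the fitted probabilities reproduce the observed distribution exactly and 1 when the two distributions have disjoint support.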

Characteristics of the CUM model for five-point semantic differential scale
We now consider a five-point SDS, where 1 and 5 indicate two opposite adjectives, for example sad and happy or completely dissatisfied and completely satisfied. First, if the rating scale has 5 categories, the feeling path has 2 steps and, starting from the middle option 3, the total number of steps $M_{k,D}$ towards the lowest rating can be 0, 1 or 2; likewise, the total number of steps $M_{k,U}$ towards the highest rating can be 0, 1 or 2. The linear transformation in (2) yields the ratings 1, ..., 5 as in Table 1, which reports the values of the linearly transformed r.v. $W_2$ for the realizations of $M_{2,D}$ and $M_{2,U}$.
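Second, applying Formula (3) to the case $m = 5$ (that is, $k = 2$), we can obtain the probability mass function of $W_2$, which is given by:

$$
P(W_2 = r \mid \xi_D, \xi_U) =
\begin{cases}
\xi_D^2 & r = 1 \\
2 \, \xi_D \, (1 - \xi_D - \xi_U) & r = 2 \\
(1 - \xi_D - \xi_U)^2 + 2 \, \xi_D \, \xi_U & r = 3 \\
2 \, \xi_U \, (1 - \xi_D - \xi_U) & r = 4 \\
\xi_U^2 & r = 5
\end{cases}
$$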
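The five-point case is small enough to be checked by hand or with a few lines of Python. The sketch below lists the feasible step counts with the rating they map to (the content described for Table 1) and evaluates the five probabilities implied by a two-step path. It assumes that an "up" step adds one and a "down" step subtracts one from the midpoint 3; names and parameter values are hypothetical.

```python
K = 2  # two steps, five ratings

# Feasible (down, up) step counts and the rating they lead to.
table = {}
for down in range(K + 1):
    for up in range(K + 1 - down):
        table[(down, up)] = up - down + K + 1

for (down, up), rating in sorted(table.items()):
    print(f"M_D = {down}, M_U = {up} -> W_2 = {rating}")

def w2_pmf(xi_d, xi_u):
    """Probabilities of ratings 1..5 for a two-step feeling path."""
    xi_s = 1.0 - xi_d - xi_u  # probability of staying still
    return [
        xi_d**2,                      # two steps down
        2 * xi_d * xi_s,              # one step down, one stay
        xi_s**2 + 2 * xi_d * xi_u,    # two stays, or one down and one up
        2 * xi_u * xi_s,              # one step up, one stay
        xi_u**2,                      # two steps up
    ]

print(w2_pmf(0.1, 0.6))
```

Rating 3 is the only one reachable through two different kinds of path, which is why its probability is a sum of two terms.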

Simulation study
In this section, we show the results of a simulation study performed by fitting the CUM model with m = 5 (hereafter CUM5) to data generated on the basis of the parameter values summarised in Table 2. The 18 scenarios are the same as those of the simulation study performed to evaluate the CUM model with m = 7 (CUM7) in Manisera and Zuccolotto (2022). For each scenario, iter = 1000 simulations with n = 1000 observations are executed.
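One replication of such a study amounts to drawing n ratings from a CUM5 distribution with known parameters. A minimal Python sketch of the data-generating step is given below; the parameter values are hypothetical and are not taken from Table 2, and the function names are illustrative.

```python
import random
from collections import Counter

def cum5_pmf(pi, xi_d, xi_u):
    """CUM probabilities for m = 5: ltM component mixed with a Uniform."""
    xi_s = 1.0 - xi_d - xi_u
    w = [xi_d**2, 2 * xi_d * xi_s, xi_s**2 + 2 * xi_d * xi_u,
         2 * xi_u * xi_s, xi_u**2]
    return [pi * wr + (1.0 - pi) / 5 for wr in w]

def simulate_ratings(n, pi, xi_d, xi_u, seed):
    """Draw n independent ratings in 1..5 from the CUM5 distribution."""
    rng = random.Random(seed)
    return rng.choices(range(1, 6), weights=cum5_pmf(pi, xi_d, xi_u), k=n)

# Hypothetical scenario: one sample of n = 1000 observations.
sample = simulate_ratings(1000, pi=0.8, xi_d=0.1, xi_u=0.6, seed=123)
counts = Counter(sample)
print([counts[r] for r in range(1, 6)])
```

Repeating the draw with different seeds and refitting the model each time gives the empirical distribution of the estimates for a scenario.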

CUM5 simulation study
The ternary plots of the CUM5 simulation study for the 18 scenarios are represented in Figure 1. The ternary plot represents $\xi_U$ on the red axis, labeled as "Up", $\xi_D$ on the blue axis, labeled as "Down", and $1 - \xi_D - \xi_U$ on the green axis, labeled as "Stay". Each estimated CUM model can be represented as a point in this plot, with coordinates given by the estimated parameters. The estimated uncertainty is included in the plot as the point size. In general, the estimated values tend to be quite close to the true parameter value, except for Case 1a, where a portion of the estimated values tends to concentrate in a wrong area of the parameter space. As will be clear in the next subsection, this pattern, although still present, tends to be less critical with CUM7. Future research will aim at a deeper understanding of the source of this anomaly.
Averages and standard errors of the iter = 1000 estimated values are reported in Table 3.

CUM7 simulation study
The ternary plots of the CUM7 simulation study for the 18 scenarios are represented in Figure 2. The results of the CUM7 simulation study, assessed with the same metrics used for CUM5 (Tables 6, 7 and 8), show an overall slightly better performance, mainly in terms of efficiency. Also with CUM7, Case 1a exhibits a problematic pattern but, with respect to Figure 1, observations are more concentrated around the true parameter value.

Case study
In this section, we present the results obtained from the application of the CUM model to real data obtained by administering a questionnaire to the visitors of the Santa Giulia Museum in Brescia, Italy. The Santa Giulia Museum, included in the UNESCO World Heritage List, is the most important museum in Brescia and unique in Italy and in Europe due to its display concept and location. The data analyzed in this paper were collected during the period April-July 2022 by a survey developed within the activities of the project "Data Science for Brescia (DS4BS) - Arts and cultural places" (https://bodai.unibs.it/ds4bs/).

diss
Case 1   0.0124  0.0028  0.0199   0.0010  0.0007  0.0185   0.0002  0.0005  0.0174
Case 2   0.0101  0.0036  0.0206   0.0035  0.0018  0.0205   0.0029  0.0012  0.0205
Case 3   0.0009  0.0009  0.0172   0.0008  0.0008  0.0158   0.0003  0.0004  0.0148
Case 4   0.0043  0.0045  0.0218   0.0029  0.0027  0.0211   0.0027  0.0020  0.0212
Case 5   0.0056  0.0019  0.0213   0.0018  0.0016  0.0215   0.0011  0.0013  0.0211
Case 6   0.0019  0.0012  0.0180   0.0009  0.0007  0.0174   0.0005  0.0005  0.0164

The dataset contains 665 evaluations expressed by visitors about a question related to the easiness in visiting the museum. The adopted 7-point semantic differential scale ranges from "difficult" to "easy". The absolute frequency distribution of answers is displayed in Figure 3.
Results obtained from the application of CUM5 are compared with those obtained from CUM7 and CUB. Specifically, the aim of the case study is twofold: (1) to comparatively assess the results of the CUM and traditional CUB approaches and (2) to compare the parameter estimates obtained with CUM5 and CUM7. As for (2), we propose to deal with the original data collected on the 7-point semantic differential scale and then with the same data where some categories are merged using two strategies to create two different 5-point scales. So, data are fitted with CUM7 and CUM5 in order to check the consistency of results. In order to adapt the dataset for CUM5 application, categories need to be reduced from seven to five. To this purpose, two different strategies have been implemented:
1. The first one is based on merging ratings two/three and five/six, to maintain the original end and central categories of the scale. This dataset will be referred to as dataset-1 in the following.
2. The second one is based on merging ratings one/two/three and leaving the other categories unchanged. This second choice is based on the analysis of the frequencies of the original ratings (see Figure 3), where categories on the right-hand side of the scale have an appreciably higher frequency than the other categories. This dataset will be referred to as dataset-2 in the following.
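The two merging strategies can be written as explicit recoding maps from the 7-point to the 5-point scale. The following Python sketch encodes them as described in the two items above; the dictionary and function names, and the example ratings, are hypothetical.

```python
# Strategy 1: merge ratings 2/3 and 5/6, keeping the end and central categories.
DATASET_1_MAP = {1: 1, 2: 2, 3: 2, 4: 3, 5: 4, 6: 4, 7: 5}

# Strategy 2: merge ratings 1/2/3, leaving the other categories distinct.
DATASET_2_MAP = {1: 1, 2: 1, 3: 1, 4: 2, 5: 3, 6: 4, 7: 5}

def recode(ratings, mapping):
    """Map each original 7-point rating to its 5-point category."""
    return [mapping[r] for r in ratings]

# Hypothetical ratings, only to show the effect of the two maps.
original = [7, 6, 6, 5, 4, 3, 1]
print(recode(original, DATASET_1_MAP))
print(recode(original, DATASET_2_MAP))
```

Under the second map the new midpoint (category 3) corresponds to the original rating 5, so the starting position assumed by the CUM decision process shifts towards the right-hand side of the original scale.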
The rearranged datasets have the same number of observations as the original one, and their frequency distributions are displayed in Figure 4. The strategies we have implemented to reduce the number of categories from 7 to 5 are arbitrary, even though they are motivated by reasons (1) related to a psychological argument, stating that respondents give much importance to the extreme values of the scale and the middle value, and (2) suggested by data analysis. Other options could also be proposed. Alternatively, we could have considered a dataset obtained from surveys with questions based on 5-point response scales. However, in this case, we would not have been able to fit the CUM7 model.

Models for the dataset with 7 categories
In this subsection, the results of applying CUB and CUM7 to the original dataset with 7 categories are described and compared. Table 9 reports the estimated parameters; the ternary plot for CUM7 is displayed in Figure 5. Both the CUM7 and CUB models suggest a low level of uncertainty ($1 - \hat{\pi}$ is equal to 0.0509 and 0.0829, respectively). As for feeling, it is quite [...] (Table 10) are all lower for CUM7 than for CUB, suggesting that the improved goodness of fit of CUM7 justifies the additional parameter.

Models for the datasets with 5 categories: dataset-1
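In this subsection, CUM5 and CUB are used to fit the ratings of dataset-1. For the CUB approach, a model with shelter on the fourth category was also fitted, but the shelter parameter turned out not to be significant. Table 11 reports the estimated parameters for the CUM5 and CUB models; for CUM5, the ternary plot is in Figure 7. Also with these data, both models suggest a low level of uncertainty ($1 - \hat{\pi}$ is equal to 0.0151 and 0.0174, respectively). As for feeling, it is again quite high for the CUB model ($1 - \hat{\xi} = 0.7668$), and also in this case CUM5 recognises the presence of the different components of the assumed DP, with estimated values consistent with those obtained with CUM7 fitted to the original dataset ($\hat{\xi}_U = 0.5764$, $\hat{\xi}_D = 0.0245$ and $1 - \hat{\xi}_U - \hat{\xi}_D = 0.3991$). Plots of observed versus fitted frequencies are displayed in Figure 8. Neither CUM5 nor CUB is able to model the large observed frequency of the fourth rating, and this is reflected in high values of the diss index, which is however lower for CUM5. Also in this case, according to the diss index, BIC and AIC (Table 12), the CUM model outperforms the others.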

Models for the datasets with 5 categories: dataset-2
In this subsection, CUM5 and CUB models are fitted to dataset-2. In this case the observed frequencies do not suggest the presence of any shelter effect, so only the basic CUB model is used. The parameter estimates are in Table 13; the ternary plot of CUM is shown in Figure 9. The different aggregation of categories proposed in dataset-2 generates data with higher uncertainty, as confirmed by both the CUM5 and CUB approaches ($1 - \hat{\pi}$ is equal to 0.2078 and 0.2874 for CUM5 and CUB, respectively). However, no appreciable difference emerges for feeling, whose parameters have estimates quite similar to those obtained with dataset-1, both for the CUB model ($1 - \hat{\xi} = 0.7248$) and for CUM5, except for a slightly higher probability of moving toward "difficult" ($\hat{\xi}_U = 0.5393$, $\hat{\xi}_D = 0.1276$ and $1 - \hat{\xi}_U - \hat{\xi}_D = 0.3331$). The higher value of $\hat{\xi}_D$ can be justified by the fact that the aggregation rule generating dataset-2 (merging of the old categories 1-2-3) moves the middle position of the scale upward. Since the CUM DP assumes that the respondent's reasoning starts from the middle position, if this is moved upward we can reasonably expect a higher probability of moving toward "difficult". So, the different aggregation of categories seems to have modified only the uncertainty assessment, while the feeling measurement remains consistent with that obtained with the other datasets, denoting a very robust assessment of this component.
Figure 10 displays the plots of observed versus fitted frequencies, suggesting a good fit for both the models.According to diss, BIC and AIC indices (Table 14), the CUM model outperforms the other with respect to diss and AIC, while in this case the lowest BIC is reached by CUB.
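The disagreement between AIC and BIC follows from their different penalties for the additional CUM parameter. The short Python check below assumes the usual definitions (AIC = -2 log-likelihood + 2p, BIC = -2 log-likelihood + p log n), with p = 3 for CUM5, p = 2 for CUB and n = 665, and recovers the reported BIC values from the reported AIC values in Table 14. The function names are hypothetical.

```python
from math import log

def aic(neg2_loglik, n_params):
    """Akaike criterion: penalty of 2 per parameter."""
    return neg2_loglik + 2 * n_params

def bic(neg2_loglik, n_params, n_obs):
    """Bayesian criterion: penalty of log(n) per parameter."""
    return neg2_loglik + n_params * log(n_obs)

N_OBS = 665
# -2 * log-likelihood recovered from the AIC values reported in Table 14.
neg2ll_cum5 = 1949.79 - 2 * 3
neg2ll_cub = 1951.79 - 2 * 2

print(round(bic(neg2ll_cum5, 3, N_OBS), 2))
print(round(bic(neg2ll_cub, 2, N_OBS), 2))
```

With n = 665 each parameter costs about 6.5 units in BIC against 2 in AIC, so the improvement in fit of CUM5 on dataset-2 is large enough to pay for the third parameter under AIC but not under BIC.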

Conclusions
In this paper the CUM model, conceived in the framework of the CUB class, is recalled with reference to its original formulation (suited for multi-point semantic differential scales with 7 categories, CUM7) and then extended to the case of 5 categories (CUM5). R functions implementing this new proposal have been developed and made available on the website https://bodai.unibs.it/cub/. The results of a simulation study applying CUM5 to 18 scenarios based on different parameter values are reported and compared to the results of an equivalent CUM7 simulation, showing that fitting measures are similar in the two cases, with a single case that requires further investigation.
A case study concerned with the evaluation of the visitor experience at the Santa Giulia Museum is proposed. The original dataset with 7 ratings was analysed via CUM7 and CUB. Then, the original dataset was transformed into a 5-point scale, following two different strategies, and analysed by means of CUM5 and CUB. The idea was to compare the CUM and CUB approaches and also to check the robustness of results with respect to different aggregations of the original dataset with 7 categories into two datasets with 5 categories.
From the point of view of the comparison of the different approaches, in general the CUM model outperforms CUB, apart from a single case where CUB exhibits a lower BIC value.So, the additional parameter of CUM with respect to CUB seems to provide significant information and goodness-of-fit improvement.
As for parameter estimates, in the examined case study, the different aggregation rule of the response categories impacts on the uncertainty measurement, while the assessment of the feeling component remains stable, except for a small, easily interpretable, difference. So, both CUM and CUB exhibit high robustness with respect to different manipulations of data.
Next steps will concern a deeper investigation of the single simulation case producing estimates far from the actual parameter values both for CUM5 and, to a lesser degree, for CUM7. In particular, this raises concerns about the identifiability of the CUM5 model over the entire parameter space, an unresolved issue that requires further investigation. While awaiting conclusive results, we recommend considering CUM7 whenever feasible, as it appears to be less affected by identifiability concerns. Additionally, we propose estimating the model by first verifying the uniqueness of the estimates using different starting values for the EM algorithm, as done in this paper.

Figure 3: Absolute frequencies of the 7 original ratings for the question about easiness in visiting Santa Giulia museum

Figure 4: Absolute frequencies of dataset-1 and dataset-2 for the question about easiness in visiting the museum (5 ratings)

Table 1: Values assumed by the r.v. $W_2$ for the possible realizations of the Multinoulli r.v.s $M_{2,D}$ and $M_{2,U}$

Table 2: Summary of parameter values used in the simulation study

Table 3: Averages and standard errors of the iter = 1000 estimated values, CUM5

Table 4: Quality metrics for results of the simulation study, CUM5

Table 5: Best and worst results for CUM5 simulation study

Table 6: Averages and standard errors of the iter = 1000 estimated values, CUM7

Table 7: Quality metrics for results of the simulation study, CUM7

Table 8: Best and worst results for CUM7 simulation study

Table 9
models suggest a low level of uncertainty (1 − π is equal to 0.0151 and 0.0174, respectively). As for feeling, it is again quite high for CUB model (1 − ξ = 0.7668), and also in this case CUM5 recognises the presence of the different components of the assumed DP, with estimated values consistent with those obtained with CUM7 fitted to the original dataset (ξU = 0.5764, ξD = 0.0245 and 1 − ξU − ξD = 0.3991).

Table 11: Estimated parameters (standard errors in parentheses) - CUM5 and CUB models fitted to dataset-1

Table 12: Diss index, BIC and AIC - CUM5 and CUB models fitted to dataset-1

Table 13: Estimated parameters (standard errors in parentheses) - CUM5 and CUB models fitted to dataset-2

Table 14: Diss index, BIC and AIC - CUM5 and CUB models fitted to dataset-2

        diss     BIC       AIC
CUM5    0.0012   1963.29   1949.79
CUB     0.0373   1960.79   1951.79

feeling component remains stable, except for a small, easily interpretable, difference. So, both CUM and CUB exhibit high robustness with respect to different manipulations of data.