prefmod : An R Package for Modeling Preferences Based on Paired Comparisons, Rankings, or Ratings

The first part of this paper describes a series of loglinear preference models based on paired comparisons, a method of measurement whose aim is to order a set of objects according to an attribute of interest by asking subjects to compare pairs of objects. Based on the basic Bradley-Terry specification, two types of models, the loglinear Bradley-Terry model and a pattern approach, are presented. Both methods are extended to include subject- and object-specific covariates and some further structural effects. In addition, models for derived paired comparisons (based on rankings and ratings) are included, and latent classes and missing values can be accommodated. The second part of the paper describes the package prefmod that implements the above models in R. Illustrative applications are provided in the last part of the paper.


Introduction
The method of paired comparisons is a well-established technique for measuring relative preferences assigned to certain objects or items of any kind. The aim is to establish an ordering of the objects on a preference scale according to an attribute (we use preference as an umbrella term for orderings in general). The attributes may be based on subjective evaluations of properties of the objects (e.g., tastiness of food, beauty of flowers, perceived risk of portfolios) or on 'objective' outcomes under some predefined rules (e.g., strength of football teams, corruptness of countries, quality of scientific journals).
Given a set of objects, the paired comparison (PC) method splits the ordering process into a series of evaluations carried out on two objects at a time. For each pair, the objects are compared and a decision is made as to which of the two objects possesses more of the attribute. The comparisons are replicated, where the units may be different subjects (usually called judges) or different circumstances under which a comparison is observed. This paper presents some preference models covering and extending the basic paired comparison formulation by Bradley and Terry, and also describes their implementation through the package prefmod in R (R Development Core Team 2012), available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=prefmod. Section 2 presents two basic models that are extended in Section 3 by including subject- and object-specific covariates and some further structural effects. Alternative response formats (rankings and ratings) are dealt with in Section 4. Section 5 outlines PC models which incorporate missing values and latent classes. In Sections 6 and 7 the prefmod package is covered and some illustrative applications are given.

The basic paired comparison models
Given J objects (indexed by j and k, j < k), the core specification for one paired comparison jk is the Bradley-Terry (BT) model, which defines the probabilities to prefer object j to object k (j ≻ k) or to prefer object k to object j (k ≻ j) as

p(j ≻ k | π_j, π_k) = π_j / (π_j + π_k),   p(k ≻ j | π_j, π_k) = π_k / (π_j + π_k),    (1)

where the πs (also called worth parameters) are the locations of the objects on the preference scale.
To ease later presentation, we rewrite (1) as

p(j ≻ k | π_j, π_k) = √(π_j/π_k) / ( √(π_j/π_k) + √(π_k/π_j) ),

adopting a formulation suggested by Sinclair (1982), which leads to

p(y_jk | π_j, π_k) = c_jk (√(π_j/π_k))^{y_jk} ,    (2)

where y_jk is the response to comparison jk and takes the value 1 if object j is preferred to k and −1 if object k is preferred to j, and c_jk is a normalizing constant.
This formulation is used when the recorded paired comparisons are assumed to be independent. All models based on specification (1) or (2) and relying on the independence assumption are called BT models.
In the pattern approach, the responses of a subject are considered simultaneously. The response is defined as y = (y_12, y_13, ..., y_jk, ..., y_{J−1,J}) and the probability for a pattern is given by

p(y) = p(y_12, ..., y_{J−1,J}) = c ∏_{j<k} (√(π_j/π_k))^{y_jk} .    (3)

Models based on the joint probability for all PCs are called pattern models.

The basic loglinear Bradley-Terry model: LLBT
The BT models can be fitted as loglinear models (see, e.g., Fienberg and Larntz 1976; Sinclair 1982; Dittrich, Hatzinger, and Katzenbeisser 1998). It is assumed that the observed numbers of preferences n_{j≻k} = n(y_jk = 1) and n_{k≻j} = n(y_jk = −1) in a given comparison jk follow a Poisson distribution. Conditional on a fixed number of subjects n_jk making the comparison jk, the observed numbers of preferences follow a binomial distribution.

  Response     Counts        µ_12  µ_13  µ_23   λ_1   λ_2   λ_3
  y_12 =  1    n(y_12 =  1)   1     0     0      1    -1     0
  y_12 = -1    n(y_12 = -1)   1     0     0     -1     1     0
  y_13 =  1    n(y_13 =  1)   0     1     0      1     0    -1
  y_13 = -1    n(y_13 = -1)   0     1     0     -1     0     1
  y_23 =  1    n(y_23 =  1)   0     0     1      0     1    -1
  y_23 = -1    n(y_23 = -1)   0     0     1      0    -1     1

Table 1: Design structure for a simple LLBT. The counts are the observed numbers of preferences (the dependent variate in the loglinear model), the other columns are covariates for the µs (dummies representing the comparisons) and covariates for the object parameters λ_1, λ_2, and λ_3.
In the loglinear BT model (LLBT) the expected number m(y_jk) for the response y_jk ∈ {1, −1} is given by m(y_jk) = n_jk p(y_jk). Using equation (2) and standard notation for loglinear models, the linear predictor η for the basic LLBT is given by

ln m(y_jk) = µ_jk + y_jk (λ_j − λ_k) ,    (4)

where µ_jk is a nuisance parameter for the comparison jk which fixes the marginal distribution to n_jk. The λs represent object parameters and are related to the worth parameters π in (1) or (2) by ln π_j = 2 λ_j. In total, we estimate 2·(J choose 2) expected counts. Table 1 shows the design structure for 3 objects and (3 choose 2) = 3 comparisons.

The paired comparison pattern model
The pattern models can also be formulated as loglinear models (Dittrich, Hatzinger, and Katzenbeisser 2002). The expected number for a whole sequence of preferences y is given as m(y) = n p(y), where n is the total number of respondents. Given n, the observed numbers of subjects responding with certain patterns y = (y_12, ..., y_{J−1,J}) for y_jk ∈ {1, −1} are assumed to follow a multinomial distribution with pattern probabilities p(y) given in equation (3). In total, there are L = 2^(J choose 2) possible distinct response patterns y. The linear predictor η_y for the pattern model is

η_y = ln m(y) = µ + Σ_j x_j λ_j .    (5)

  Pattern y             Counts   µ       λ_1   λ_2   λ_3   θ_12.13     θ_12.23     θ_13.23
  (y_12, y_13, y_23)             Const   x_1   x_2   x_3   y_12·y_13   y_12·y_23   y_13·y_23

Table 2: Design structure for a pattern model including dependencies. The counts are the observed numbers of preference patterns (the dependent variate in the loglinear model), the other columns are the covariate for µ (the normalizing constant), variates for the objects, and variates for the dependence parameters.
The µ is constant for all patterns and the covariates x_j for the λs are generated by summation over all paired comparisons, i.e.,

x_j = Σ_{k>j} y_jk − Σ_{k<j} y_kj .

The x_j vary according to the effective pattern y and are obtained by counting how often object j was preferred minus how often object j was not preferred.
An important feature of the pattern models is the possibility to include dependencies between the decisions, and therefore to abandon the (sometimes unrealistic) assumption of independent decisions. Dependencies are defined as associations between pairs of PCs with a common object. It is assumed that the dependencies between responses come from the repeated evaluation of the same objects. Dependence parameters have the form θ_jk,jl and are defined for triples of objects {j, k, l}, i.e., on pairs of PCs with one common object j. Pairs of PCs are assumed to be independent if there is no common object.
Including dependencies, the probability for a pattern p(y) is extended to

p(y) = c ∏_{j<k} (√(π_j/π_k))^{y_jk} · ∏ exp(θ_{jk,jl} y_jk y_jl) ,

where the second product runs over all pairs of comparisons with a common object. The linear predictor for a certain response pattern then becomes

η_y = µ + Σ_j x_j λ_j + Σ θ_{jk,jl} y_jk y_jl ,    (6)

with the θ sum again taken over all pairs of comparisons sharing a common object. The design structure for a pattern model for J = 3 objects including dependencies is given in Table 2.
In fact, the LLBT and the PC pattern model are generalized linear models (GLMs) and can be fitted using any software capable of estimating GLMs.
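Because the model is a Poisson GLM, the design in Table 1 can be fitted directly with base R. The following is a minimal sketch; the counts are hypothetical illustrative numbers, not data from the paper.

```r
## Basic LLBT for three objects, fitted as a Poisson GLM following Table 1.
cnt <- c(20, 10, 15, 15, 8, 22)          # hypothetical counts n(y_jk = +/-1)
mu  <- factor(rep(c("12", "13", "23"), each = 2))  # comparison dummies
o1  <- c( 1, -1,  1, -1,  0,  0)         # covariate for lambda_1
o2  <- c(-1,  1,  0,  0,  1, -1)         # covariate for lambda_2
o3  <- c( 0,  0, -1,  1, -1,  1)         # covariate for lambda_3 (redundant:
                                         # object 3 serves as reference)
fit <- glm(cnt ~ mu + o1 + o2, family = poisson)
lam <- c(coef(fit)[c("o1", "o2")], o3 = 0)
worth <- exp(2 * lam) / sum(exp(2 * lam))   # ln pi_j = 2 lambda_j, normalized
round(worth, 3)
```

The worth parameters are recovered from the estimated λs via π_j ∝ exp(2 λ_j), as stated below equation (4).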

Undecided responses
In some experiments, the researcher may allow subjects to give an undecided or tie response. Such data also occur naturally in certain sports, such as football. Now a PC response y_jk can take three values: y_jk = 1 if object j is preferred, y_jk = −1 if object k is preferred, and, additionally, y_jk = 0 for an undecided response.
Adapting a formulation suggested by Davidson and Beaver (1977) and using (2), the probability for a PC response including ties is given as

p(y_jk | π_j, π_k) = c_jk (√(π_j/π_k))^{y_jk} υ^{1 − y_jk²} .    (7)

In this case we define a general undecided effect υ for all PCs, and the equations defined in (4) are extended by a third equation for an undecided response in the LLBT model,

ln m(y_jk) = µ_jk + y_jk (λ_j − λ_k) + (1 − y_jk²) γ ,

where the undecided effect γ is ln υ. This can easily be extended if a different undecided effect is needed for each paired comparison by substituting the common γ with γ_jk.
In pattern models, undecided responses can also be incorporated by extending (3). The linear predictor for a pattern model including a general undecided effect then becomes

η_y = µ + Σ_j x_j λ_j + u γ ,    (8)

where u is the number of undecided responses in the corresponding pattern.
To estimate different undecided effects for each paired comparison, (8) becomes

η_y = µ + Σ_j x_j λ_j + Σ_{j<k} u_jk γ_jk ,    (9)

where u_jk = 1 if there is an undecided response in comparison jk and zero otherwise.

Categorical and numerical subject effects
Preference decisions can depend on characteristics of the subjects making the decisions (Dittrich, Hatzinger, and Katzenbeisser 1998). The inclusion of subject covariates allows the data modeler to move away from the assumption that all subjects have the same preference ordering (given by the object parameters λ or the worth parameters π). Here, we are interested in how the object parameters vary according to characteristics of the subjects, which can be either categorical (e.g., gender) or numerical (e.g., age).

Categorical subject effects
For the LLBT the starting model equations are given in (4). To illustrate the approach we assume that the subjects are cross-classified according to one or more factors into s = 1, ..., S groups.
The loglinear representation for the log expected numbers of preferences in comparison jk and for subjects in covariate group s is given by

ln m(y_jk;s) = µ_jk;s + y_jk;s (λ_j + λ_js − λ_k − λ_ks)  for y_jk;s ∈ {1, −1},

where the µ_jk;s are nuisance parameters which fix the margins in a multinomial model and reproduce the marginal totals for a specific factor level s and comparison jk.
There are different ways to parameterize the model of interest. The approach taken here is to define a reference group, where λ_j and λ_k are the object parameters for that reference group (e.g., s = 1) and λ_js and λ_ks, s = 2, ..., S, are the parameters for the other groups. The main effects λ_s are absorbed into µ_jk;s. While the λ_j represent the ordering for the reference group, the preference orders for the other groups are obtained by adding the λ_js specific to group s.

Numerical subject effects
When numerical subject characteristics are considered, the model development is similar, but now the LLBT and the pattern models have to be extended for each individual subject.
The LLBT equations for subject i and comparison jk are

ln m(y_jk;i) = µ_jk;i + y_jk;i (λ_j;i − λ_k;i) .

We model the λ_j;i through the linear relationship

λ_j;i = λ_j + Σ_{p=1}^{P} β_jp x_p;i ,

where x_p;i corresponds to the pth covariate (p = 1, ..., P) for individual i. For each object j there is a separate set of β parameters which describe the effect of the covariates on that object; λ_j is an intercept for the location of object j at x = 0.
All considerations for categorical and numerical subject covariates in the LLBT also apply to pattern models. Effectively, categorical and numerical subject covariates can be incorporated by constructing a separate table (cf. Tables 1 and 2) for each subject group s or each subject i. The basic design matrices are repeated with extra indicators (dummies) for the groups or individuals. This allows us to extend the models to more than one categorical and/or numerical subject covariate in the LLBT and pattern models.

Object-specific effects
Often objects can be described by certain characteristics which govern the preferences. A group of objects having a certain property might be preferred to objects which do not have this property. Or the preferences could be affected by some numerical properties of the objects.
To incorporate object characteristics (e.g., the price of a product) the following linear reparameterisation is used:

λ_j = Σ_q β_q x_jq ,

where x_jq is the covariate for characteristic q of object j and β_q is the effect of characteristic q.
The LLBT equations for the preferences in comparison jk are then given by

ln m(y_jk) = µ_jk + y_jk Σ_q (x_jq − x_kq) β_q ,

and for pattern models by

ln m(y) = µ + Σ_j x_j Σ_q β_q x_jq .
If the number of object characteristics is small compared to the number of objects, the model will be a simplification of the standard model. Note, however, that this reparameterisation is useful mainly in situations where the x_jq are fixed according to an experimental setting, such as in conjoint studies. For observational x_jq, the introduction of an additional error term might be useful (cf. Turner and Firth 2012).

Position effect
If the order of the presentation of objects in a comparison is believed to affect the responses we might consider a position effect.
We differentiate between the comparison orders by using the notation jk·j if j is presented first and jk·k if k is presented first. When preferring one object to another, there are two distinct expected counts, which we denote by m(y_jk·j = 1) and m(y_jk·k = 1). In both cases j is preferred to k.
The LLBT equations for these two expected counts are

ln m(y_jk·j = 1) = µ_jk·j + λ_j − λ_k + δ ,
ln m(y_jk·k = 1) = µ_jk·k + λ_j − λ_k − δ ,

with an extra parameter δ representing a general position effect for the first presented object being chosen.

Pattern models for rankings
So far, we have only considered data that originate from (real) paired comparisons. In many studies, however, rankings of objects are collected to get information about the preference order. Subjects are asked directly which object is the most preferred, which is the second most preferred, and so on. Such rankings can also be modeled using a pattern approach (model (5) or any extension as presented in the previous sections) by transforming the rankings into (derived) PC patterns (Francis, Dittrich, and Hatzinger 2010).
Given, for instance, three objects O_1, O_2, and O_3 and the preference order O_1 ≻ O_3 ≻ O_2, we can easily derive the corresponding PC response pattern to be (1, 1, −1).
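This conversion can be sketched in a few lines of R; rank_to_pc below is a hypothetical helper for illustration, not a prefmod function (the package's design functions perform this conversion internally).

```r
## Derive the PC pattern (y_12, y_13, ..., y_{J-1,J}) from a ranking,
## where rnk[j] is the rank assigned to object j (1 = most preferred).
rank_to_pc <- function(rnk) {
  J <- length(rnk)
  y <- c()
  for (j in 1:(J - 1))
    for (k in (j + 1):J)              # comparisons in the order 12, 13, ..., J-1,J
      y <- c(y, if (rnk[j] < rnk[k]) 1 else -1)
  y
}
rank_to_pc(c(1, 3, 2))   # O1 > O3 > O2  ->  (1, 1, -1)
```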
One major difference is the number of possible patterns L. For rankings L = J!, whereas for (real) PCs L = 2^(J choose 2). The reason is that there are many real PC patterns that cannot be generated from a ranking. The patterns left out are intransitive patterns, which cannot result in a ranking. For example, the PC pattern (1, −1, 1) describes the intransitive preferences 1 ≻ 2, 3 ≻ 1, and 2 ≻ 3, which cannot occur with rankings. Another difference to models for (real) PCs is the absence of undecided responses. Moreover, for non-tied rankings not all dependence parameters can be estimated due to their relation to intransitivity (for some discussion see Dittrich et al. 2002).

Table 3: Design structure for a pattern model for rankings, showing the ranks, the derived PC patterns, and the design columns. The counts are the observed numbers of preferences (the dependent variate in the loglinear model), the other columns are covariates for the µ (the normalizing constant) and for the object parameters.
Table 3 shows all possible (full) rankings for J = 3 objects, derived PC patterns and the design structure for a ranking pattern model.

Pattern models for ratings
Responses to Likert-type items, often called ratings, are another form of data collection which can be used to obtain preference orderings. Such items (these are the objects in our terminology) are useful for gathering respondents' feelings, opinions, attitudes, etc. It is not uncommon that mean or median values are used to establish an order of these items, although this is based on questionable assumptions. Sometimes the Likert-type responses are treated as categorical, where the frequencies of the high categories for each item are used to determine a ranking of the items. A detailed discussion is given in Dittrich, Francis, Hatzinger, and Katzenbeisser (2007).
Alternatively, we can use the paired comparison pattern model approach to rank Likert items according to the 'measured' dimension (e.g., dangerousness, importance or liking).
The rating pattern models based on Likert-type data can be seen as ranking models naturally including undecided decisions (they also apply to ranking data with ties). To convert Likert data on sets of items to (derived) PCs, we compare pairs of item responses. In the following we assume low rating scores to represent a higher value on the preference scale and that the items all represent the same dimension. Comparing the responses of item j to k, item j is preferred (j ≻ k) if j has the lower score and the PC response is y_jk = 1. Item k is preferred (k ≻ j) if k has the lower score and y_jk = −1. Neither of the items is preferred if both items have the same score (j = k). This corresponds to an undecided response and y_jk = 0.
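The derivation of PC responses from ratings then amounts to comparing item scores pairwise; rating_to_pc below is again a hypothetical helper for illustration (prefmod's design functions do this internally).

```r
## Derive PC responses from Likert ratings, low score = more preferred;
## equal scores give the undecided response y_jk = 0.
rating_to_pc <- function(resp) {
  J <- length(resp)
  y <- c()
  for (j in 1:(J - 1))
    for (k in (j + 1):J)
      y <- c(y, sign(resp[k] - resp[j]))   # +1, 0, or -1
  y
}
rating_to_pc(c(1, 1, 2))   # -> (0, 1, 1)
rating_to_pc(c(2, 2, 3))   # -> (0, 1, 1), the same derived pattern
```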
The probabilities for each pair of items jk and a general undecided effect are defined in (7). The linear predictor for this rating pattern model is given by (8). If comparison-specific tie effects are required, the model can be extended as in (9). The set of possible response patterns y, however, is different.
In contrast to rankings, undecided responses are inherent in rating pattern models as items can be rated equally. This affects the number of possible derived PC patterns. The number of possible patterns for real PCs including an undecided category is 3^(J choose 2), whereas there are fewer derived patterns for rating data since several rating patterns have the same representation as derived PC patterns (e.g., the rating patterns (1, 1, 2) and (2, 2, 3) are both represented by the unique PC pattern (0, 1, 1)). For details see Dittrich, Francis, Hatzinger, and Katzenbeisser (2007).
The rating pattern models can also be extended to subject and object covariates and, additionally, dependencies can be modeled.
It should be noted that both ratings and rankings should, in general, not be modeled using LLBTs but rather pattern models. In the case of real paired comparisons the LLBT and the independence pattern model coincide. Inherently, the LLBT considers all transitive and intransitive PC patterns. Ranking and rating pattern models, however, are based on fewer response patterns than PC pattern models, and consequently the LLBT leads to biased estimates for data without intransitive patterns.

Missing values
Missing observations in paired comparison data can occur for several reasons. One is 'by design', if deliberately not all possible comparisons are presented to all subjects. Another is that subjects do not know which object to prefer, are unwilling to make a decision, are tired, etc. In the following we will denote a missing value by NA (not available).
In the LLBT models, missing values can be handled easily if they occur at random and are not left out systematically by the respondents. The decisions are assumed to be independent (as coming from different subjects) and the n_jk can therefore vary for each comparison. But as previously mentioned, the LLBT does not allow for dependencies and gives biased estimates for derived PCs originating from rating and ranking data.
To take incomplete response patterns into account, the composite link approach as suggested by Thompson and Baker (1981) is used (Dittrich, Francis, Hatzinger, and Katzenbeisser 2012). We distinguish between the observed patterns and the complete data. Whereas the observed patterns include all possible outcomes including NAs, the complete data are a combination of all possible patterns without NAs (denoted by y) and all possible nonresponse patterns (denoted by r). A nonresponse pattern r describes the NA structure and is of the form r = (r_12, ..., r_{J−1,J}), where r_jk is 1 if comparison jk is missing and 0 otherwise. The observed and complete data are combined by the composite link method.
The set of all possible patterns that can be observed is divided into blocks according to their NA patterns. We use square brackets to denote such blocks. For instance, the block with no missing responses is written as [ ], i.e., it consists of all (2^(J choose 2) in the case of no ties) possible patterns y. In this block, the observed and complete patterns coincide. If, e.g., the comparison jk is missing, the block containing all patterns that possibly could be observed is denoted by [jk]. Given three objects, Table 4 shows the data structure for block [23]. In total there are 2^(J choose 2) possible blocks. The missing value models are based on the complete data, whose probability is factorized into two parts:

(1) The first part f(y; λ) is the outcome model, where λ stands for all parameters depending on the specified outcome model. The linear predictor for the outcome model is η_y;λ and the expected counts for the response patterns depend on the λ parameters.
(2) The second part g(r|y; ψ) is the nonresponse model, where the parameters ψ are related to r and y depending on the specified nonresponse model. The linear predictor for the nonresponse model is η_r|y and the expected counts for the nonresponse patterns depend on the ψ parameters.
In the following we consider only two of the missing data mechanisms described by Rubin (1976), i.e., missing completely at random (MCAR) and missing not at random (MNAR). The MCAR assumption, defined as g(r; ψ), is valid if the conditional distribution for the nonresponse pattern r is thought to be independent of the response pattern y, so that the parameters ψ are related to r only.
We consider two different models under the MCAR assumption, where we use α parameters to specify ψ for these models.
In the MCAR model 1, a common parameter α representing the missingness process is assumed, which is the same for all comparisons (α_jk = α). Using the logit formulation for a nonresponse, logit(r_jk) = α, and assuming independence, the linear predictor for a nonresponse pattern is given as

η_r = α Σ_{j<k} r_jk ,

where Σ_{j<k} r_jk denotes the total number of nonresponses.
The probability for a nonresponse pattern under model 1 is then

g(r; ψ) ∝ exp(α Σ_{j<k} r_jk) .

For MCAR model 2 we specify an α parameter for each object by reparameterizing α_jk to α_j + α_k, and the linear predictor for a nonresponse pattern is

η_r = Σ_{j<k} r_jk (α_j + α_k) .

The MNAR assumption, defined as g(r | y_obs, y_mis; ψ), is used if the MCAR assumption does not hold; in other words, the conditional distribution for the nonresponse pattern r depends on both the observed and the missing values, and the parameters ψ are related to r, y_obs, and y_mis.
For the MNAR model we additionally introduce β parameters related to r jk and y jk to characterize the objects.
In this model the linear predictor for a nonresponse pattern is

η_r|y = Σ_{j<k} r_jk (α_j + α_k) + Σ_{j<k} r_jk y_jk (β_j − β_k) .    (10)

Here the β parameters depend on the product of the missing indicators and the PC responses, and therefore only the missing observations y_mis contribute.
For example, the pattern (1, 1, NA) in Table 4 can only be generated by the complete patterns (1, 1, 1) or (1, 1, −1). The observed part is y_obs = (1, 1) and the missing element y_mis is y_23, which takes the value 1 or −1. For this example, if no nonresponse model is specified, the expected count for the observed pattern is the sum over the compatible complete patterns,

m(1, 1, NA) = m(1, 1, 1) + m(1, 1, −1) .

The specifications for the ηs are given in (5) if no dependencies are assumed and in (6) when dependencies are included.
If additionally a nonresponse model is specified, e.g., the MNAR model given in (10), the corresponding nonresponse terms are added to the linear predictors of the complete patterns generating the observation (1, 1, NA). To estimate the outcome model f(y; λ), the total likelihood L is the product of the likelihoods for each NA pattern block [·]. The individual contribution for the block with no missing responses is

∏_y ( exp(η_y) / Σ_{y′ ∈ [ ]} exp(η_{y′}) )^{n_y} ,

where n_y is the observed count of pattern y.

Latent subject effects
Variation between responses may be due to different characteristics of the subjects. The models presented in the previous sections allowed for including observed (categorical and numerical) subject covariates. However, there might also be unmeasured or unmeasurable characteristics of the subjects that could affect the responses. Using random effects models is one common approach to account for such heterogeneity (Francis, Dittrich, and Hatzinger 2010).
To introduce random effects into a pattern model (that includes fixed subject effects) we have to specify J random effect components δ_js. For each subject in covariate group s and response pattern there are J separate random effects, one for each object j.
The linear predictor for a standard random effects approach would be

η_y;s = µ_s + Σ_j x_j (λ_j + λ_js + δ_js) ,

where the location of the preference parameter for item j is shifted up or down for each response pattern in each subject covariate group s.
The likelihood becomes

L = ∏_s ∫ p(y_s | δ_s) g(δ_s) dδ_s ,

where g(δ_s) is the multivariate probability density function, or mixing distribution, of the random effects vector.
Francis, Dittrich, and Hatzinger (2010) use an alternative approach, the non-parametric maximum likelihood (NPML) method suggested by Aitkin (1996) and Mallet (1986). The multivariate distribution is replaced by a series of mass point components with unknown probabilities and unknown locations. The mass point approach is a mixture model, where the multinomial (fixed effects) model is replaced by a mixture of multinomials.
If the number of components (latent classes) is known, say C, we get C vectors of mass point locations δ_c = (δ_1c, δ_2c, ..., δ_{J−1;c}), each with unknown component probability q_c. The likelihood becomes

L = ∏_s Σ_c q_c p_sc ,

where p_sc is the probability of the response pattern for subject group s under mass point c, with Σ_c q_c = 1.
To estimate the nonparametric random effects model the EM algorithm is used, treating class membership as a missing data problem. A latent class membership indicator z_sc is introduced for each combination of s and c and set to one if s belongs to class c (zero otherwise). The likelihood now is

L = ∏_s ∏_c (q_c p_sc)^{z_sc} .

The E-step (re)calculates the expectation E(z_sc | y_s) = w_sc, which is P(z_sc = 1 | y_s). This expected value can be thought of as the posterior probability that a subject characterized by the pattern s belongs to class c.
The M-step maximizes the multinomial likelihood with respect to the λs and δs, carried out through a loglinear model with weights w_sc. The estimate of q_c, the proportion of patterns in latent class c, is obtained from q̂_c = Σ_s w_sc / L, where L here is the number of patterns y.

The prefmod package
Several existing packages in R (R Development Core Team 2012) will estimate Bradley-Terry models, e.g., BradleyTerry2 (Turner and Firth 2012), eba (Wickelmaier and Schmid 2004), and psychotree (Strobl, Wickelmaier, and Zeileis 2011). These packages are limited to true PC data and cannot deal with dependencies. The prefmod package (Hatzinger 2012) provides an integrated suite of functions which covers the wide variety of models suitable for PC and derived PC data as discussed above. Apart from model fitting routines there are several example data sets, functions to simulate PC data, and functions to transform raw data.
The two basic model specifications are the LLBT and the pattern model (as presented in Section 2). The names of the functions that relate to these two models echo this distinction, starting with either llbt or patt.

General usage
The work flow to estimate a PC model is: (1) prepare the data, (2) set up a design matrix, (3) fit the model, and (4) display the results. Since the loglinear PC models are GLMs, the model fitting can be accomplished using the R function glm or, preferably (because it is more efficient), the function gnm from the gnm package (Turner and Firth 2011). The tricky part is to set up the design matrices. This is supported by the functions llbt.design and patt.design. Once the design has been devised, model fitting using one of the generalized linear model functions is straightforward. Each of the four steps is explained in more detail below:
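The four steps can be sketched as follows for an LLBT with simulated responses. The data, the object labels, and the objnames argument are illustrative assumptions; consult the package documentation for the exact calling conventions.

```r
library(prefmod)
library(gnm)
## (1) Data: one row per subject, PC responses in the leftmost columns
##     (simulated here; 1 = first object preferred, -1 = second).
set.seed(1)
dat <- as.data.frame(matrix(sample(c(1, -1), 300, replace = TRUE), ncol = 3))
## (2) Design matrix; objnames labels the object columns in the design.
des <- llbt.design(dat, nitems = 3, objnames = c("A", "B", "C"))
## (3) Model fit: object C acts as reference; mu fixes the margins.
fit <- gnm(y ~ A + B, eliminate = mu, data = des, family = poisson)
## (4) Results: worth parameters on the preference scale.
llbt.worth(fit)
```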

Data preparation
The result must be a data frame at the individual level (one line per subject), where the leftmost columns must be the responses, optionally followed by columns for subject covariates. Three main types of responses are covered by the prefmod package: (real) paired comparisons, rankings, and ratings.
Rankings and ratings: Each column corresponds to an object; the responses have to be coded in a similar way to PC responses.
Subject covariates: If present, they have to be specified as numeric vectors (not as factors). Levels of categorical covariates have to be represented by consecutive integers starting with 1.
Missing values are represented by NA and are allowed only for responses but not for subject covariates.

Design matrix
The two functions llbt.design and patt.design generate data frames containing the design structures from the above-mentioned data for LLBT and pattern models, respectively. Both functions allow for specifying categorical and numeric subject covariates (using the arguments cat.scovs and num.scovs). For categorical subject covariates, a separate design matrix is generated for every combination of covariate levels, and these matrices are stacked to form the overall design matrix. For numeric subject covariates, design matrices for every individual are generated and again stacked to form the overall design matrix. The option casewise (which is set to TRUE by default when numeric subject covariates are present) applies in two situations. First, if the number of combined levels of the factors specified in cat.scovs exceeds the number of subjects, a smaller overall table is generated. Secondly, the option casewise allows for fitting a suitably dimensioned null model (without subject covariates) to serve as a reference model.
Common to both functions llbt.design and patt.design, there is also the option to specify a data frame for object-specific covariates (objcov). For both subject and object-specific covariates the corresponding vectors are generated in the resulting design structure.
For pattern models there are a few more arguments. The resptype argument allows the user to set the response format to paircomp (for real PCs), rating, or ranking. Covariates for dependence parameters are generated through the ia argument. The preference coding of the responses can be reversed using reverse = TRUE. Then the highest value for codings with nonnegative integers denotes a preference for the first object, or, if the coding is of the form (−1, 1), it will be interpreted as (1, −1), i.e., 1 denotes the preference for the second object.
It is worth noting that the function patt.design automatically provides dummies for undecided response categories for each comparison (for rating models and for PC models, if ties are present). The function llbt.design produces dummies for each response category. They are labelled g0 and g1 (in case of no undecided response) or g0, g1, and g2 (in the presence of an undecided response category). The dummy g1 can then be used to fit a global undecided effect; the interaction term g1:mu allows for estimating comparison-specific undecided effects. Moreover, the function llbt.design generates a factor for the comparisons (mu). Both functions create a factor CASE with a separate level for each row of the original data matrix if the argument casewise was set to TRUE. For both functions, the output is a data frame of class llbtdes or pattdes containing all variates needed to fit the corresponding models.
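As an illustration, a hypothetical patt.design call for a rating pattern model with dependence covariates might look as follows (dat and nitems are assumptions; see ?patt.design for the exact argument set):

```r
## Design structure for a Likert (rating) pattern model; ia = TRUE requests
## covariates for the dependence parameters.
pdes <- patt.design(dat, nitems = 3, resptype = "rating", ia = TRUE)
names(pdes)   # counts, object covariates, undecided dummies, dependence terms
```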

Model fit
Both types of models can be fitted using glm or gnm with family = poisson and the above design as the data argument. If categorical subject covariates are present, the highest interaction term amongst them must be present in the linear predictor as a nuisance term, having the sole purpose of fixing the marginal distribution of the contingency table. For fitting numerical subject (and optionally additional categorical) covariates this interaction term is instead represented by the factor CASE. For LLBT models this interaction must include the comparison factor mu, i.e., mu must always be present. It is preferable to use gnm rather than glm, since these nuisance terms can be specified in gnm's eliminate argument, resulting in faster computations (see Hatzinger and Francis 2004), fewer memory requirements, and the suppression of the nuisance parameter estimates from the display of the results.
Effects of the subject covariates are specified using interaction terms between the subject and the object(-specific) covariates.
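Assuming a design des generated with cat.scovs = "SEX" and object columns A, B, C (all names illustrative), a gender-specific LLBT could be sketched as:

```r
## The interaction terms (A + B):SEX estimate the shift of the object
## parameters relative to the reference group; the nuisance term mu:SEX
## fixes the margins within every covariate group.
fit.s <- gnm(y ~ A + B + (A + B):SEX,
             eliminate = mu:SEX, data = des, family = poisson)
```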

Display results
Numerical summaries are obtained from the generalized linear model functions as usual. The two functions llbt.worth and patt.worth, applied to the output of glm or gnm, calculate the worth parameters for all objects (or combinations of objects if reparameterized using object-specific covariates), optionally grouped for all combinations of categorical subject covariates.
The output is a matrix of worth parameters with columns for all combinations of levels of the categorical subject covariates, which can then be plotted using plot. Optionally, the original estimates (prior to conversion to worths) can be calculated and plotted. The functions do not work directly for models including numerical subject covariates. However, the resulting estimates can be used for generating appropriate plots (for more details see ?llbt.worth or ?patt.worth).
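A minimal sketch of this step (m stands for a hypothetical fitted gnm object from an LLBT design):

```r
library(prefmod)

# worth parameters from a fitted LLBT model; the columns of the
# resulting matrix correspond to combinations of the levels of the
# categorical subject covariates
w <- llbt.worth(m)

# preference scale plot of the worth parameters
plot(w)
```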

Alternative work flow
To ease computations, some convenience functions have been devised that directly perform model fitting without the need to set up a design matrix. These are llbtPC.fit for LLBT models and pattPC.fit, pattR.fit, and pattL.fit for pattern models. The capital letters denote: PC for (real) paired comparisons, R for rankings, and L for ratings/Likert-type responses. They all require the data prepared as described in the previous section.
There are some limitations when using these convenience functions. These mainly concern the lack of support for object-specific and continuous subject covariates. Only a general undecided effect can be fitted. Moreover, there is no general formula interface, and hence no special model terms (e.g., interaction terms with single objects) can be specified. However, the fitting is based on a likelihood that is computed directly from the internally constructed design matrices. This allows for a higher number of subject covariates and the treatment of missing values in pattern models using a composite link approach.
By default, all objects are included in the models (without any requirement to explicitly specify them). Categorical subject effects can be included using the formel and elim arguments.
formel specifies the actual model to be fitted. For instance, if specified as formel = ~SEX, a different preference scale for the objects will be estimated for males and females. For two or more covariates, the operators + or * can be used to model main or interaction effects, respectively. The operator : is not allowed. The specification for elim follows the same rules as for formel. However, elim specifies the basic contingency table to be set up but does not specify any covariates to be fitted. This feature allows for the successive fitting of nested models to enable the use of deviance differences for model selection. It is essential, when comparing models through deviance differences, that the elim formula is the same.
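For illustration, a minimal sketch of two nested fits sharing the same elim formula (dat is a hypothetical data frame of ratings for four items plus a factor SEX):

```r
library(prefmod)

# null model and SEX-effect model on the same contingency table
m0 <- pattL.fit(dat, nitems = 4, formel = ~1,   elim = ~SEX)
m1 <- pattL.fit(dat, nitems = 4, formel = ~SEX, elim = ~SEX)

# because elim is identical in both calls, the difference of the
# reported deviances is a valid test for the SEX effect
```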
Where applicable, parameters for undecided responses (argument undec) and dependencies (argument ia) can be included into the models.
The functions for calculating worth parameters and for plotting the preference scales described earlier are also defined for the output of the convenience functions described here and can be used analogously.
Table 5 gives an overview of the work flow and the basic functions of the prefmod package.

Missing responses
In the LLBT models, missing responses under the missing completely at random (MCAR) assumption do not pose any problem, since the number of respondents can vary across comparisons. This is not so straightforward for pattern models. Here the dependent variate is the number of subjects showing a particular (complete) response pattern. If certain comparisons are missing, standard GLMs do not apply. The convenience functions pattPC.fit, pattR.fit, and pattL.fit have been implemented to allow for including response patterns where some comparisons are missing (MCAR). This enables the modeling of specialized response designs such as partial rankings (only a subset of objects is ranked) or "best-worst" choices. Based on a composite link approach, the total likelihood of these models is maximized as the sum of individual likelihoods that are calculated for each block of the design structure with differing response/nonresponse patterns (for details see Section 5).
The convenience functions provide some treatment of missing not at random (MNAR) mechanisms. Setting the argument NItest to TRUE gives separate estimates of the object parameters for the group of respondents showing complete responses and the group having at least one NA in the responses. If the estimates differ significantly, there is some evidence for different response behaviour in the groups.
For real paired comparisons, two additional nonresponse models can be specified that allow for the evaluation of nonignorable missings (MNAR). With the outcome model specified as above, the nonresponse models either assume a common nonresponse probability for all objects (argument MIScommon) or different nonresponse probabilities for the objects (arguments MISalpha and MISbeta). MISalpha and MISbeta are specified using logical vectors with entries for each object. A TRUE specifies that the corresponding parameter for the object should be estimated. MISalpha defines the set of NA indicator parameters and MISbeta defines the set of interactions between the NA indicator parameters and the object parameters of the outcome model. If only MISalpha is specified, the result is an MCAR model. The utility function checkMIS facilitates the specification of MISalpha and MISbeta. Applied to the data, checkMIS returns a logical vector that indicates for each object whether missing values have been recorded at all. If not used in the context of model fitting, checkMIS returns some descriptive statistics on the response structure in the data.
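A minimal sketch of this workflow (pcdat is a hypothetical paired comparison data frame for four objects):

```r
library(prefmod)

# which objects have missing responses at all? (logical vector)
has.na <- checkMIS(pcdat, nitems = 4)

# MNAR model: object-specific NA indicators (MISalpha) plus their
# interactions with the object parameters (MISbeta)
m.mnar <- pattPC.fit(pcdat, nitems = 4,
                     MISalpha = has.na, MISbeta = has.na)

# MCAR variant for comparison: NA indicators only
m.mcar <- pattPC.fit(pcdat, nitems = 4, MISalpha = has.na)
```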

Latent class preference models
The function pattnpml.fit allows for the computation of latent class models for real and derived paired comparisons utilizing a nonparametric maximum likelihood (NPML) estimation method (Aitkin 1996). The function is a wrapper for alldistPC, which in turn is a modified version of the function alldist from the npmlreg package (Einbeck, Darnell, and Hinde 2012). This function extends the NPML approach to allow the fitting of overdispersed paired comparison models. It assumes that overdispersion arises because of dependence in the patterns. Fitting a nonparametric random effects term is equivalent to specifying distinct latent classes of response patterns.
Prior to using pattnpml.fit, a design matrix has to be generated with patt.design and supplied using the design argument. Model specification is accomplished with two model formula arguments, formula for the fixed effects and random for the random effects. Fitting latent class models can be achieved by specifying the paired comparison objects as additive terms in the random part of the model formula. A separate estimate for each item and for each mass point component is produced. The number of components of the finite mixture has to be specified beforehand using the argument k.
Fitting subject covariate models with the same effect for each mass point component is achieved by the following specifications: the formula part must contain a) the interaction of all subject factors (this corresponds to the specification of the eliminate argument when using gnm for pattern models) and b) an interaction of the chosen subject covariates with the objects.The random argument contains only the objects and gives a model with a set of random intercepts (one set for each mass point).
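As a sketch of such a specification (hypothetical design dsg from patt.design with objects o1, o2, o3 and a single subject factor SEX, so that the "interaction of all subject factors" reduces to SEX itself):

```r
library(prefmod)

# k = 2 mass points; the SEX effect is common to all components,
# so SEX and its interaction with the objects enter the fixed part,
# while the objects alone form the random intercepts
m.lc <- pattnpml.fit(y ~ SEX + SEX:(o1 + o2 + o3),
                     random = ~ o1 + o2 + o3,
                     k = 2, design = dsg)
```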
Fitting subject covariate models with a different effect for each mass point component (sometimes called random coefficient models, see Aitkin, Francis, Hinde, and Darnell 2009, pp. 497) is possible by additionally specifying an interaction of the subject covariates with the objects in the random part. Thus the setting random = ~x:(o1 + o2 + o3) gives a model with a set of random slopes (one set for each mass point) and a set of random intercepts (again one set for each mass point).

These variables were coded as sex (1 male, 2 female), age (numeric in years), ltype (1 accommodator, 2 diverger, 3 converger, 4 assimilator). The learning types were identified from a questionnaire (Kolb and Kolb 2005). The theory of Kolb and Kolb portrays "two dialectically related modes of grasping experience - Concrete Experience (CE) and Abstract Conceptualization (AC) - and two dialectically related modes of transforming experience - Reflective Observation (RO) and Active Experimentation (AE)." The learning types are characterized as accommodating (CE and AE), diverging (CE and RO), assimilating (AC and RO), and converging (AC and AE). The trdel dataset is an excerpt from the original data. Missing values and a few more subject variables have been removed. A more detailed analysis is given in Schöll and Veith (2011).
For fitting models with numerical subject covariates we cannot use the convenience function llbtPC.fit. We use gnm instead and have to generate a design structure first (we display the first two lines).
Figure 1: Worth estimates for presentation modes (one panel per learning type ltype1 to ltype4; objects AU, TV, PA, CL, and CO).
For a more complex analysis we include all subject covariates into the model. Additionally, we define an object-specific covariate type of presentation media (pres), where CO, TV, and AU are coded with 0, and PA and CL with 1. We redefine the design structure as

R> PRES <- data.frame(pres = c(0, 0, 1, 0, 1))
R> dsg <- llbt.design(trdel, 5, objnames = objnames, objcov = PRES,
+    cat.scovs = c("sex", "ltype"), num.scovs = "age")

The maximal model includes the highest interaction term between sex, ltype, and age. Since age is continuous, the eliminate argument is specified using the mu:CASE interaction.
Since the data is in aggregated form, we cannot directly use prefmod's design generating functions. However, a little trick helps.
We can use llbt.design to set up the basic design structure and then modify the structure, adding in the aggregated counts. We form a simple invented data set (pseudo) with 21 columns (the number of comparisons for 7 objects) and two lines (one for matches played at home and one for matches played away), augmented with an extra column for a home/away covariate cov. The actual values of the observed comparisons stored in the 21 columns do not matter at this stage, as the y variate (counts) generated here will later be replaced by the actual number of wins and losses. So we make all values in the first line 1 and all values in the second line −1.
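The trick just described could be sketched as follows (the names pseudo and cov come from the text; the vector of observed aggregated counts is hypothetical):

```r
library(prefmod)

# invented data: 21 comparisons (7 objects), first row all 1 (home),
# second row all -1 (away), plus a home/away covariate
pseudo <- as.data.frame(matrix(c(rep(1, 21), rep(-1, 21)),
                               nrow = 2, byrow = TRUE))
pseudo$cov <- factor(c("home", "away"))

# basic design structure including the home/away covariate
dsg <- llbt.design(pseudo, nitems = 7, cat.scovs = "cov")

# replace the generated counts by the observed wins/losses
# dsg$y <- observed.counts   # hypothetical vector of aggregated counts
```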

Analysis of Deviance
For fitting models with object-specific covariates we need a design matrix.
R> data("salad")
R> conc <- data.frame(acet = c(0.5, 0.5, 1, 0), gluc = c(0, 10, 0, 10))
R> sal <- patt.design(salad, nitems = 4, resptype = "ranking",
+    objcov = conc)
R> m.sal.0 <- gnm(y ~ A + B + C + D, family = poisson, data = sal)
R> m.sal.1 <- gnm(y ~ acet + gluc, family = poisson, data = sal)
R> anova(m.sal.1, m.sal.0)

Online configuration systems allow customers to actively participate in the creation of products and have become increasingly important in various market sectors. Dabic and Hatzinger (2009) report a study on car configurators that aimed at investigating the effects of certain person characteristics (such as gender) on the configuration process. Subjects were asked to configure a car according to their preferences. They could choose freely from several modules, such as exterior and interior design, technical equipment, brand, price, and producing country. The order of module choice was recorded as ranks. Since not all modules had to be chosen, the response format was partial rankings. For illustrational purposes, we want to assess the effect of gender. First we fit the null model (without sex), but for later comparisons we have to set up the contingency table including sex through the elim argument (formel = ~1 is the default).

Music style preference data: Ratings, latent classes
The function pattnpml.fit can be used to fit models for latent classes using a nonparametric random effects distribution. We apply the method to the previously used music style preference data. We first have to generate a design matrix. Missing values in the responses are not allowed, and we therefore carry out a complete case analysis.
R> mod.lc <- pattnpml.fit(y ~ clas + coun + rap + conr, k = 1, design = dsg)

For k > 1 latent classes (and no fixed subject effects) the fixed part contains only the intercept, whereas the objects are specified in the random argument. We fit models with up to five latent classes.
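The sequence of fits with two to five mass points could be sketched as follows (object names clas, coun, rap, conr and the design dsg as above):

```r
library(prefmod)

# for k > 1: intercept-only fixed part, objects in the random part
fits <- lapply(2:5, function(k)
  pattnpml.fit(y ~ 1, random = ~ clas + coun + rap + conr,
               k = k, design = dsg))
```

The list fits can then be inspected, e.g., by comparing the disparities (deviances) of the components to choose the number of classes.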

Figure 4: Worth plot for hypothetical migration threats under an MCAR and an MNAR assumption.

Table 4: Data structure for blocks and PC responses.

The joint probability for the y and r patterns is

p(y, r; λ, ψ) = f(y; λ) g(r | y; ψ)    (1)

The first part f(y; λ) is called the outcome model (defined for the complete patterns y).

Table 5: Overview of basic functions.
In Table 1, mu is represented by dummy variables; g1 and g2 are dummies for the response categories (see Section 6.1) and CASE is a factor representing the subjects. The eliminate argument of gnm has to be specified as mu:CASE if there is any continuous subject covariate in the model. The mu:CASE interaction determines which comparison has been made by which subject. For a graphical display of the worth parameters we have to evaluate the fitted linear predictor for a range of values of age, and transform them to worths. The main effect estimates for the objects (CO through CL) are the intercepts; the interactions of the objects with age (CO:age through CL:age) are the slopes, e.g., est = CO + (CO:age) * x, where we use x = 15, ..., 55, the range of observed age values. The parameter estimates relating to the reference object have to be set to zero. The gnm function parameters is used to obtain the coefficients with NAs replaced by zeros.
Constructing worth plots follows the same lines as before, but is now complicated by the presence of the main effect sex and the age:ltype interaction. A separate set of fitted linear predictors for the objects has to be computed for each combination of levels of ltype and sex (e.g., CO + CO:sex2 + CO:ltype3 + (CO:age + CO:ltype3:age) * x).

The model including a home advantage effect provides a reasonable fit (residual deviance 38.64 on 35 df). The results indicate a strong home advantage effect, with the estimated parameter for home being 0.302. The home team of two evenly matched teams therefore has a winning probability of