Extending the Multiple Discrete Continuous (MDC) modelling framework to consider complementarity, substitution, and an unobserved budget

Abstract


Introduction
Many choices can be represented as multiple discrete continuous decisions. In these, a decision maker faces a finite set of alternatives and must choose how much to "consume" of each one, potentially consuming none, one, or several of them. Examples of these situations include activities performed during a day, grocery shopping, and investment allocation. Traditional choice models are not well suited to these situations, as they only allow the choice of a single alternative.
Continuous models, on the other hand, often underestimate the probability of zero consumption for individual alternatives, also known as the "corner solution". Joint models, where the continuous choice is conditional on the discrete one, usually lack a strong grounding in economic theory, though there are exceptions (Hausman et al., 1995).
The Karush-Kuhn-Tucker multiple discrete continuous (MDC) consumer demand models (Bhat, 2008, 2018; Chintagunta, 1993; Hanemann, 1978; Kim et al., 2002; Mehta and Ma, 2012; Phaneuf and Herriges, 1999; Song and Chintagunta, 2007; Wales and Woodland, 1983) address the issues mentioned in the previous paragraph. These models begin by explicitly formulating the consumer utility maximisation problem, assuming either a direct or indirect utility function with associated randomness. The optimal solution is then derived using the Karush-Kuhn-Tucker conditions. Finally, the likelihood function of these conditions is written given the distributional assumptions on the utility function. Nowadays, one of the most popular models in this category is the Multiple Discrete Continuous Extreme Value (MDCEV) model (Bhat, 2008). It has been applied in different areas, such as transport (Jäggi et al., 2012), time use (Enam et al., 2018), social interactions (Calastri et al., 2017), alcohol purchase (Lu et al., 2017), energy consumption (Jeong et al., 2011), investment decisions (Lim and Kim, 2015), household expenditure data (Ferdous et al., 2010), price promotions (Richards et al., 2012), and tourism (Pellegrini et al., 2017).
In this paper, we propose two extensions to the MDC modelling framework. First, we propose a new non-additive functional form for the utility that includes explicit complementarity and substitution effects. Second, we present an MDC model formulation that does not require the definition of a budget, while still allowing for explicit complementarity and substitution. The second approach is a suitable approximation of a full MDC model for the (relatively common) situation where the expenditure on all alternatives included in the model (i.e. inside goods) is small compared to the overall budget, which allows us to drop the budget from the model likelihood. To allow for a tractable likelihood function, we do not include a stochastic error term in the marginal utility of the outside good in either of the two proposed models.
Substitution and complementarity define relationships between the demand for pairs of products. If the demand for one of them increases, then the demand for the other is reduced in the case of substitution and increased in the case of complementarity (Hicks and Allen, 1934). While the budget constraint naturally induces substitution between products due to income effects, this is only an indirect effect. The inclusion of complementarity and substitution is necessary for a more realistic representation of behaviour in applications as diverse as time use or grocery shopping.
For example, in the first case, it could be that going to the cinema makes it more likely for individuals to also eat at a restaurant. In the second case, it could be that products such as pasta and tomato sauce are usually bought together. On the other hand, it could be that the more hours an individual works, the fewer hours they allocate to leisure activities, or that purchasing more bread leads to a reduction in the consumption of biscuits.
Concerning the budget, while determining it can be easy in some applications, it can be challenging in others. For example, in purchase decisions, the budget will rarely be an individual's full income, as mental accounting and recurring expenses are likely to play a role, neither of which is observable. Investment decisions face a similar problem, as the total budget may expand or shrink as a function of the expected performance of the investment alternatives. There are other scenarios where even the simple definition of a budget is problematic, for example when modelling the number of recreational trips during a year, or the number of activities performed by an individual during a week. The problem becomes more acute in forecasting. Any prediction from a model requires a budget, and predicting the budget, e.g. the future income of individuals, is another problem in itself that introduces cascading errors in the forecast values.
While other models including complementarity and substitution effects through non-additively separable utility functions have been proposed in the literature, they either require complementarity and substitution effects to add up to zero (Song and Chintagunta, 2007), or impose specific constraints on their parameters, making either estimation or model transferability difficult (Bhat et al., 2015; Mehta and Ma, 2012; Pellegrini et al., 2021a). Models with an implicit (also called infinite) budget have also been proposed by Bhat (2018) and ? for models with neither complementarity nor substitution effects. A detailed comparison between the models in this paper and those already in the literature is presented in section 5.
The remainder of this document is structured as follows. The next section introduces the formulation, derivation, likelihood function and forecasting algorithm of the model with complementarity and substitution. Section 3 presents the same for the model with complementarity, substitution and an implicit budget. Section 4 discusses the identification of both models' parameters and some constraints that theory and estimation impose on them, and compares the forecasting performance of the two models. Section 5 compares the proposed models' formulation to that of similar models in the literature. Section 6 presents applications of the proposed models to four different datasets, dealing with time use, household expenditure, supermarket scanner data, and number of trips, respectively. The paper closes with a brief summary of the capabilities and limitations of the proposed model formulations.
2 An MDC model with complementarity and substitution

Model formulation
Consider the classical (consumer) utility maximisation problem, where an individual $n$ must decide which products $k$ to consume from a set of alternatives by maximising his or her utility subject to a budget constraint (Eqn. 1).
where $n = 1, \dots, N$ indexes individuals and $k = 1, \dots, K$ indexes alternatives, $x_n = [x_{n0}, x_{n1}, \dots, x_{nK}]$ is a vector grouping the consumed amount of each alternative (product), $p_{nk}$ is the price of alternative $k$ faced by individual $n$, and $B_n$ is the total budget available to individual $n$. The good $x_{n0}$ is an outside or numeraire good, i.e. a good that aggregates all consumption outside of the category of interest.
For example, if the researcher is interested in modelling demand for food, $x_{n1}, \dots, x_{nK}$ would represent consumption of different food categories (the inside goods), while $x_{n0}$ would represent the aggregate consumption of housing, transport, leisure, etc. It is usually assumed that $p_{n0} = 1$, so that $x_{n0}$ becomes the total expenditure on categories other than the one of interest. To simplify the notation, we use this convention henceforth. It is assumed that the numeraire good is always consumed, so $x_{n0} > 0$ always.
The formulation in eqn. 1 is consistent with a two-stage budgeting approach, where the individual first allocates expenditure to broad groups (e.g. food, utilities, transport, entertainment, etc.) based on price indices representative of each group, followed by independent within-group allocations to individual products. According to Edgerton (1997), such an approach is sensible and subject to only small approximation errors when (i) the preferences for groups are weakly separable, i.e. the utility provided by each group is not affected by the level of consumption of other groups; and (ii) the group price indices being used do not vary too greatly with the utility or expenditure level. The first condition can be satisfied as long as the inside goods are reasonably separable from excluded goods. Edgerton (1997) argues that empirical and theoretical arguments support the fulfilment of the second condition.
We assume the following functional forms for the different parts of the utility function.
We take the definition of $u_k$ from Bhat (2008). In this formulation, $\psi_{nk}$ represents alternative $k$'s base utility, i.e. its marginal utility at zero consumption. This parameter could be interpreted as the scale of the utility of product $k$. The $\gamma_k$ parameters, on the other hand, relate mainly to consumption satiation, by altering the curvature of alternative $k$'s utility function. In general, a higher $\gamma_k$ indicates higher consumption of alternative $k$, when consumed. While a common interpretation is that $\psi_{nk}$ and $\gamma_k$ determine what and how much of alternative $k$ to consume, respectively, this is not completely true. There is a level of interaction between these parameters, and in some circumstances a low value of $\psi_{nk}$ can be compensated by a high value of $\gamma_k$ (Bhat, 2008, 2018).
Parameters $\psi_{nk}$ must always be positive, as they represent the marginal utility of alternatives at the point of zero consumption. We ensure this using the definition in Eqn. 5, where $z_{n0}$ is a column vector of characteristics of the decision maker that are expected to correlate with that individual's marginal utility of the outside good (e.g. socio-demographics); $\alpha$ is a row vector of parameters representing the weights of those characteristics on the marginal utility of the outside good; $z_{nk}$ are attributes of alternative $k$; $\beta_k$ are vectors of parameters representing the weights of those attributes on the alternative's base utility; and $\varepsilon_{nk}$ is a random disturbance term.
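The positivity requirement can be illustrated with a small sketch. It assumes that Eqn. 5 uses an exponential link, i.e. $\psi_{n0} = \exp(\alpha z_{n0})$ and $\psi_{nk} = \exp(\beta_k z_{nk} + \varepsilon_{nk})$, which is consistent with the logarithms that appear in the derivation but is an assumption here. The function names and numeric values are hypothetical.

```python
import math


def dot(weights, values):
    """Inner product of two equal-length sequences."""
    return sum(w * v for w, v in zip(weights, values))


def base_utility_outside(alpha, z0):
    # Assumed form: psi_0 = exp(alpha . z_0), with no random disturbance.
    return math.exp(dot(alpha, z0))


def base_utility_inside(beta_k, z_k, eps_k):
    # Assumed form: psi_k = exp(beta_k . z_k + eps_k).
    return math.exp(dot(beta_k, z_k) + eps_k)


# Even strongly negative linear indices give a strictly positive base utility.
psi_0 = base_utility_outside(alpha=[-0.5], z0=[1.2])
psi_k = base_utility_inside(beta_k=[-3.0, 0.4], z_k=[1.0, 2.0], eps_k=-1.5)
print(psi_0, psi_k)
```

Under this assumed link, no constraint on $\alpha$ or $\beta_k$ is needed during estimation to keep the base utilities positive.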
We only include random disturbances in the base utility of the inside goods, as this leads to a computationally tractable likelihood function. We discuss the inclusion of a random disturbance in the marginal utility of the outside good in Section 4.1.
The final component of the utility function, $u_{kl}(x_{nk}, x_{nl})$, captures the complementarity and substitution effects between inside goods. This particular functional form is inspired by the translog function, and previous formulations by Vásquez Lavín and Hanemann (2008) and Bhat et al. (2015). Figure 1 presents the behaviour of this component for a set of $\delta_{kl}$ parameters, and different values of $x_{nk}$ and $x_{nl}$, which are assumed to be equal. If $\delta_{kl} > 0$, there is complementarity between alternatives $k$ and $l$, as this component will increase the overall utility. If $\delta_{kl} < 0$, there is a substitution effect between alternatives $k$ and $l$, as $u_{kl}$ becomes more negative as $x_{nk}$ and $x_{nl}$ increase. If $\delta_{kl} = 0$, the consumption of both alternatives is independent of each other. The value of $u_{kl}$ is bounded to the interval $[0, \delta_{kl})$, ensuring transferability of estimated models to other datasets, a point we discuss in Section 4.2.
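The bounded behaviour of the interaction component can be sketched numerically. The snippet assumes the form $u_{kl} = \delta_{kl}(1 - e^{-x_{nk}})(1 - e^{-x_{nl}})$, which matches the bounds and the exponential transformation described in the text but is an assumption here. The function name is hypothetical.

```python
import math


def interaction_utility(delta_kl, x_k, x_l):
    # Assumed form: delta_kl * (1 - exp(-x_k)) * (1 - exp(-x_l)).
    return delta_kl * (1.0 - math.exp(-x_k)) * (1.0 - math.exp(-x_l))


# Equal consumption of both goods, for a complementarity (delta > 0)
# and a substitution (delta < 0) parameter of the same magnitude.
for x in (0.0, 1.0, 5.0, 50.0):
    print(x, interaction_utility(0.5, x, x), interaction_utility(-0.5, x, x))
```

Under this assumed form the component is zero whenever either good is not consumed, and its magnitude never exceeds $|\delta_{kl}|$, however large the consumption.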
In summary, the proposed MDC model has two main characteristics. First, it contains no stochastic error in the marginal utility of the outside good, allowing for a tractable likelihood function. Second, its non-additive utility function allows for interaction (complementarity and substitution) among alternatives.

Model derivation
To solve the optimisation problem, we begin by writing its Lagrangian (Eqn. 6) and the Karush-Kuhn-Tucker conditions of optimality (eqns. 7 and 8). We drop the $n$ subindex to simplify the notation.
Eqn. 8 will be an equality when alternative $k$ is consumed (i.e. $x^*_{nk} > 0$, with $x^*_{nk}$ the consumption at the optimum, i.e. the observed consumption), and an inequality when $x^*_{nk} = 0$. In other words, the marginal utility of any consumed product $k$ at the optimum level of consumption will be $\lambda$ scaled by the alternative's price $p_{nk}$. If the product is not consumed, its marginal utility will instead be lower. By combining eqns. 7 and 8, we obtain:
Replacing $\psi_0$ and $\psi_k$ by their definitions (Eqn. 5), and isolating the random component $\varepsilon_k$, we obtain
Now, if we assume all $\varepsilon_k$ disturbances to follow identical and independent distributions, we only need to apply the Change of Variable Theorem from $\varepsilon_k$ to $x_k$ (only over the consumed alternatives) to obtain the likelihood function of the model. Then, if $f$ and $F$ are the density and cumulative distribution functions of $\varepsilon_k$, respectively, we can write the likelihood function as follows:
In this set of equations, $|J|$ is the value of the determinant of the Jacobian $J$ of vector $-W_m$, where $m$ indexes consumed alternatives. The elements of this Jacobian are defined in Eqn. 12 ($i$ indexes rows, and $j$ columns). No obvious compact form exists for this determinant. $I_{x_k > 0}$ and $I_{x_k = 0}$ are indicator variables taking value 1 if $x_k > 0$ or $x_k = 0$, respectively, and zero otherwise.
If no alternative is consumed, the Jacobian drops out of Eqn. 11.
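For orientation, the generic structure of the Lagrangian and the KKT conditions for a problem of this type can be sketched as below. This is the textbook form for a utility $U(x)$ maximised under a linear budget constraint with $p_0 = 1$ and an outside good that is always consumed. It is written under the assumption that Eqns. 6 to 8 follow this structure, and it is not a transcription of those equations.

```latex
\begin{align}
\mathcal{L} &= U(x) + \lambda \Big( B - x_0 - \sum_{k=1}^{K} p_k x_k \Big) \\
\frac{\partial U}{\partial x_0} - \lambda &= 0 \\
\frac{\partial U}{\partial x_k} - \lambda p_k &\le 0,
  \quad \text{with equality if } x_k^* > 0, \qquad k = 1, \dots, K
\end{align}
```

The second line fixes $\lambda$ as the marginal utility of the outside good, and substituting it into the third line gives one condition per inside good in terms of utilities and prices only.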
In the remainder of this paper, we assume all $\varepsilon_k$ disturbances to follow identical and independent Normal distributions with mean fixed to zero and a standard deviation $\sigma$, which is estimated.
Other distributions are possible. In particular, a Gumbel distribution leads to a closed-form likelihood, but has the disadvantage of generating a high rate of outliers during prediction, due to the thick tails of the distribution. The Normal distribution, on the other hand, has thinner tails and is a natural choice given the central limit theorem, while remaining computationally tractable.
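The difference in tail thickness can be made concrete with a short calculation. The sketch below compares the probability of a draw falling more than $z$ standard deviations above the mean for a standard Normal and for a standard Gumbel (location 0, scale 1) distribution. It is only an illustration of the tail argument, not part of the estimation procedure, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # mean of the standard Gumbel distribution


def normal_upper_tail(z):
    """P(Z > z) for a standard Normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def gumbel_upper_tail(z):
    """P(G > mean + z * sd) for a standard Gumbel (location 0, scale 1)."""
    sd = math.pi / math.sqrt(6.0)
    x = EULER_GAMMA + z * sd
    # The Gumbel CDF is exp(-exp(-x)); -expm1(-t) equals 1 - exp(-t)
    # and stays accurate when t is small.
    return -math.expm1(-math.exp(-x))


for z in (2.0, 3.0, 4.0):
    print(z, normal_upper_tail(z), gumbel_upper_tail(z))
```

The gap between the two probabilities widens as $z$ grows, which is why extreme draws, and therefore extreme forecast consumption values, are more frequent under Gumbel disturbances.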

Forecasting
Once the model has been estimated, forecasting requires solving the original maximisation problem proposed in eqn. 1 several times, each time using different draws of $\varepsilon_k$ from a Normal distribution with mean zero and standard deviation $\sigma$, and then averaging the result across these draws. This must be done separately for each observation in the sample. The optimisation problem can be solved using any algorithm, with Newton and gradient descent algorithms being the most common types. This forecasting procedure is computationally demanding, especially if a high number of draws is used for each individual. However, as the forecasts for each individual and draw are independent of one another, calculating them in parallel can significantly reduce the overall processing time. The software implementation in Apollo (ApolloChoiceModelling.com) uses parallel computing to speed up the forecasting.
3 An MDC model with complementarity, substitution and an implicit budget
In this section we introduce an extension of the model presented in section 2 that does not require defining a budget. The formulation and derivation of the model are very similar to those presented in the previous section, so here we only highlight the points where the two models differ.

Model formulation
Considering the classical consumer utility maximisation problem described in eqn. 1, we now assume a different utility formulation for the outside good, while all other definitions remain as in the previous section (i.e. as in eqns. 3, 4, and 5).
We assume a linear utility function for the outside good (eqn. 13), as this will later allow us to drop both the outside good consumption $x_0$ and the budget $B$ from the final model formulation.
While a linear utility function does not comply with the law of diminishing marginal utility (a common assumption in demand models), it should be considered as an approximation of a function that does, valid when most of the budget is spent on the outside good and only a relatively small amount is spent on the inside goods. In such a case, changes in the total expenditure on inside goods would lead to a relatively small change in the consumed amount of the outside good, and therefore to a negligible change in its marginal utility.
More formally, we can write changes in the utility of the outside good using a second degree Taylor expansion as $u_0(x_0 + \Delta) \approx u_0(x_0) + u_0'(x_0)\Delta + \frac{1}{2} u_0''(x_0)\Delta^2$, where $u_0'$ and $u_0''$ are the first and second derivatives of $u_0$, respectively, and $\Delta$ is a small change in the consumption of the outside good. If $u_0$ is continuous, monotonically increasing, and satisfies the law of diminishing returns, then $\lim_{x_0 \to +\infty} u_0'$ is a constant equal to or bigger than zero, because the slope must smoothly decrease as $x_0$ increases, without ever becoming negative. It then follows that $\lim_{x_0 \to +\infty} u_0'' = 0$. Therefore, for a large value of $x_0$, we can assume that $u_0''(x_0)$ is small, and approximate $u_0$ using a linear function, making $u_0' \approx \psi_0$.
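A numerical check of this argument is sketched below. It takes a logarithmic utility $u_0(x_0) = \psi_0 \log(x_0)$ purely as an example of a concave, increasing function (an assumption made for illustration) and measures how far the first-order (linear) approximation is from the exact utility change when one unit of expenditure moves from the outside good to the inside goods. The function name is hypothetical.

```python
import math


def linear_approx_error(psi_0, x0, delta):
    """Gap between an exact log-utility change and its linear approximation."""
    exact = psi_0 * (math.log(x0 + delta) - math.log(x0))
    linear = (psi_0 / x0) * delta  # u0'(x0) * delta for u0 = psi_0 * log(x0)
    return abs(exact - linear)


# Same one-unit shift away from the outside good, for growing x0.
errors = [linear_approx_error(1.0, x0, delta=-1.0) for x0 in (5.0, 50.0, 500.0)]
print(errors)
```

The error shrinks rapidly as the outside good grows relative to the shift, which is the situation in which the linear utility is intended to be used.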
Assuming a linear utility function for the outside good does not necessarily imply that all individuals have the same marginal utility for it, nor that absolutely no information on the budget can be included in the model. The proposed formulation allows for parameterisation of the $\psi_0$ parameter. The modeller could make $\psi_0$ a function of socio-demographics, or other proxies of the budget. For example, $\psi_0$ could be explained by an individual's full income, occupation, or level of education.

Model derivation
Proceeding in the same way as in section 2.2, we first find a difference when calculating the derivative of the Lagrangian (Eqn. 6) with respect to the outside good, as follows.
which, combined with Eqn. 8, leads to Eqn. 15.
Replacing $\psi_0$ and $\psi_k$ by their definitions (Eqn. 5), and isolating the random component $\varepsilon_k$, we obtain
Assuming all $\varepsilon_k$ disturbances follow identical and independent distributions, and applying the Change of Variable Theorem from $\varepsilon_k$ to $x_k$ for the consumed alternatives, we obtain the likelihood function of the model as described in eqn. 11, except that this time the Jacobian elements are defined as in eqn. 17, with $E_i$ the same as in eqn. 12.
Just as with the model with observed budget, we assume all $\varepsilon_k$ disturbances to follow identical and independent Normal distributions with mean zero and a standard deviation $\sigma$ to be estimated.

Forecasting
Once the model has been estimated, forecasting requires solving the original maximisation problem proposed in Eqn. 1 several times, each time using different draws of $\varepsilon_{nk}$ from a Normal$(0, \sigma)$ distribution, and then averaging the result across these draws.
To solve the optimisation problem we once again use the Lagrangian in Eqn. 6 and the KKT conditions in eqns. 14 and 8, leading us to Eqn. 15. Assuming an equality and isolating $x_k$, we obtain
where the definition of $E_k$ can be found in eqn. 17, and where it depends on the value of all $x_n$.
Eqn. 18 is a fixed point problem, i.e. a problem of the form $x = h(x)$. According to the Existence and Uniqueness theorem, as the right-hand side of Eqn. 18 is continuous in $x_n$ over the closed interval $[0, B_n / p_{nk}]$, at least one solution to the problem exists. However, we cannot ensure that the solution is unique. We solve Eqn. 18 iteratively: starting from a consumption vector $[x_1, \dots, x_K]$ initialised to zero, the right-hand side of Eqn. 18 is re-evaluated until the change between successive iterations falls below $\tau$ or $S$ iterations have been performed, where $S$ is the maximum number of iterations allowed and $\tau$ is the convergence tolerance, which can be set to the desired precision. This procedure must be performed multiple times for each observation, each time with a different set of draws for the $\varepsilon_k$ disturbances. The results must then be averaged across sets of draws.
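A fixed-point iteration of this kind can be sketched as follows, for a single observation and a single set of draws. The sketch rests on several assumptions that are not taken from the equations of this paper: a linear outside-good utility $\psi_0 x_0$, inside-good utilities of the form $\psi_k \gamma_k \log(x_k/\gamma_k + 1)$, and interaction terms $\delta_{kl}(1 - e^{-x_k})(1 - e^{-x_l})$. Under those assumptions the equality condition rearranges to $x_k = \gamma_k\left(\psi_k / (\psi_0 p_k - E_k) - 1\right)$, with $E_k$ the derivative of the interaction terms with respect to $x_k$. All function and argument names are hypothetical, and corner solutions are handled by simply truncating at zero.

```python
import math


def interaction_gradient(k, x, delta):
    # Assumed E_k: derivative with respect to x_k of
    # sum_l delta[k][l] * (1 - exp(-x_k)) * (1 - exp(-x_l)).
    return math.exp(-x[k]) * sum(
        delta[k][l] * (1.0 - math.exp(-x[l])) for l in range(len(x)) if l != k
    )


def forecast_fixed_point(psi_0, psi, gamma, price, delta, max_iter=200, tol=1e-8):
    """Iterate x <- h(x) starting from x = 0. Returns (x, converged)."""
    n_alt = len(psi)
    x = [0.0] * n_alt
    for _ in range(max_iter):  # max_iter plays the role of S
        x_new = []
        for k in range(n_alt):
            denom = psi_0 * price[k] - interaction_gradient(k, x, delta)
            if denom <= 0.0:
                raise ValueError("parameters violate the positivity condition")
            # Negative candidates correspond to corner solutions (x_k = 0).
            x_new.append(max(0.0, gamma[k] * (psi[k] / denom - 1.0)))
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:  # tol plays the role of tau
            return x_new, True
        x = x_new
    return x, False


delta = [[0.0, 0.05], [0.05, 0.0]]  # mild complementarity between two goods
x_hat, converged = forecast_fixed_point(
    psi_0=1.0, psi=[2.0, 1.5], gamma=[1.0, 1.0], price=[1.0, 1.0], delta=delta
)
print(x_hat, converged)
```

In a full forecast, the values in `psi` would be recomputed for each draw of the disturbances and the resulting consumption vectors averaged. Because uniqueness is not guaranteed, the result may in general depend on the starting point.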
As this model assumes a very large budget, in practice there is no bound on the magnitude of the forecast consumption. Therefore, we recommend only forecasting for values of the explanatory variables in a reasonable vicinity of the values observed in the estimation dataset. What defines reasonable is difficult to quantify but, for example, if an explanatory variable $z_1 \in [0, 1]$ in the estimation dataset, forecasting for $z_1 = 10$ could lead to unreasonably high consumption levels. This is similar to how linear models are usually valid only in the vicinity of the values on which they were estimated.

Model properties
In this section, we discuss some of the most relevant properties of the model, namely the identifiability of its parameters, including the possibility of using random coefficients; some theoretical constraints on its parameters; and the performance of the model with implicit budget as compared to the model with observed budget.

Identification of parameters
When estimating the proposed models, the modeller should consider the following six points regarding identifiability of parameters.
First, observations that do not consume any inside good should not be excluded from the sample. Even though these observations do not provide any information on the value of $\psi_k$, they do provide information on the value of $\psi_0$ in relation to the inside goods.
Second, there should be no constant (intercept) in the definition of $\psi_0$, i.e. $z_0$ should not contain an element equal to 1 for every individual. As utility does not have any meaningful units, we need to set a base against which all other utilities are measured. To do this, we recommend setting the intercept of the outside good to zero. Any variable that changes across observations can be included in $z_0$, even if it is not centred around zero. We recommend populating $z_0$ with characteristics of decision makers, such as socio-demographics.
In the case of the model with implicit budget (see section 3) we recommend including the individual's income in $z_0$. Including income in this way does not imply that the budget is equal to the income, but only that the marginal utility of the outside good depends on it. We would expect a negative coefficient for income if included in $\psi_0$, as an increase in income usually leads to increased overall consumption, and therefore a smaller marginal utility of the outside good.
In general, a negative coefficient in $\alpha$ indicates that an increase in the corresponding explanatory variable leads to increased consumption. The opposite is true for a positive coefficient.
Third, just like most other MDC models, the two formulations presented in this paper are not scale-independent. This means that the magnitude of the dependent variable influences the results of the model. For example, expressing the dependent variable in grammes or kilogrammes might lead to different forecasts and marginal rates of substitution. This is due to the non-linear nature of the utility functions used in the models. We recommend testing different scalings of the dependent variable, favouring those making the dependent variable range between zero and five, so as to match the range of maximum variability of the transformation in $u_{kl}$, which is mostly flat for values $x_k > 5$ (see figure 1).
Fourth, in the case of the model with implicit budget, complementarity and substitution effects can be confounded with income effects. In this model, all interactions between the consumption of alternatives are captured by the $\delta_{kl}$ parameters. The cause of interaction could be complementarity or substitution, but it could also be an income effect. For example, a restricted budget could induce increased demand for an inexpensive product while decreasing the demand for an expensive one. This could be captured by the model as substitution between the two products. This problem will be attenuated if the budget is large in comparison with the expenditure on the inside goods.
Fifth, concerning the number of complementarity and substitution parameters ($\delta_{kl}$), while the model formulation defines one parameter per pair of products, the modeller can easily impose restrictions to reduce the number of parameters to estimate. For example, if alternatives can be grouped into non-overlapping sets, the modeller could constrain all $\delta_{kl}$ parameters to be the same within each group, and across the same pair of groups. Alternatively, the modeller could perform a Principal Component Analysis on the dependent variables to identify the most important interactions between alternatives, and then estimate only those $\delta_{kl}$ parameters while fixing all others to zero (as done in section 6.2). These or other strategies are recommended when the number of alternatives is large.
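The group-based restriction can be sketched as a mapping from a few group-level parameters to the full matrix of pairwise $\delta_{kl}$ values. The group labels, parameter values and function name below are hypothetical and only illustrate the bookkeeping, not an estimation routine.

```python
def build_delta_matrix(groups, group_delta):
    """Expand group-level interaction parameters into a K x K delta matrix.

    groups[k] is the (hypothetical) group label of alternative k.
    group_delta maps an unordered pair of labels to one shared parameter.
    """
    n_alt = len(groups)
    delta = [[0.0] * n_alt for _ in range(n_alt)]
    for k in range(n_alt):
        for l in range(k + 1, n_alt):
            key = frozenset((groups[k], groups[l]))
            value = group_delta.get(key, 0.0)  # unlisted pairs are fixed to zero
            delta[k][l] = value
            delta[l][k] = value  # interactions are symmetric
    return delta


groups = ["dairy", "dairy", "bakery", "bakery"]
group_delta = {
    frozenset(("dairy",)): 0.02,            # pairs within the dairy group
    frozenset(("bakery",)): -0.01,          # pairs within the bakery group
    frozenset(("dairy", "bakery")): 0.005,  # pairs across the two groups
}
delta = build_delta_matrix(groups, group_delta)
print(delta)
```

With four alternatives there are six pairwise parameters, and this restriction reduces them to three free parameters.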
Finally, as recommended by Manchanda et al. (1999), the proposed models allow for complementarity, substitution, and coincidence effects, both in a deterministic and random way. Complementarity and substitution effects are captured by the $\delta_{kl}$ parameters. Coincidence effects are shocks to demand influencing either one or multiple alternatives at the same time, and they can be captured by either $\psi_0$ (common shocks to all alternatives), or $\psi_k$ and $\gamma_k$ (independent shocks). All of these parameters allow for deterministic heterogeneity, for example by defining $\delta_{kl}$ as a function of socio-demographic characteristics. It is also possible to incorporate random heterogeneity in $\psi_k$ and $\gamma_k$ by using simulated maximum likelihood techniques (Train, 2009), but we do not recommend including such heterogeneity in $\psi_0$ or $\delta_{kl}$, as it could lead to violations of eqns. 23 and 24 (see section 4.2).
To test identifiability of the model through simulation, we created 50 datasets using the generation process of the model with observed budget, and another 50 datasets using the generation process of the model with implicit budget.We then estimated the corresponding model on each generated dataset to check if we were able to recover the parameters used during data generation.
All datasets were composed of 500 observations with four alternatives each. All models shared the specification described in eqn. 19, but with the value of their parameters randomly drawn on each occasion from the distributions defined in table 1. The range of parameters was influenced by other models estimated in section 6 and considerations discussed in section 4.2. All explanatory variables ($z$, $x$, $y$) followed a $U(0, 1)$ distribution, except for $z_1 \sim \text{Bernoulli}(0.5)$. Prices were drawn from a $U(0.1, 1)$ distribution, while the budget was set to 10 for the models with observed budget.
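The input side of such a simulation can be sketched as follows. Only the distributions stated above are used; the layout of the explanatory variables (one binary variable plus one uniform variable per alternative) and all names are hypothetical, since the specification in eqn. 19 is not reproduced here. Generating the consumption outcomes would additionally require solving the model for each observation.

```python
import random


def simulate_inputs(n_obs=500, n_alt=4, budget=10.0, seed=42):
    """Draw explanatory variables and prices for one simulated dataset."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_obs):
        rows.append({
            "z1": 1 if rng.random() < 0.5 else 0,                       # Bernoulli(0.5)
            "z_other": [rng.uniform(0.0, 1.0) for _ in range(n_alt)],   # U(0, 1)
            "price": [rng.uniform(0.1, 1.0) for _ in range(n_alt)],     # U(0.1, 1)
            "budget": budget,  # only used by the model with observed budget
        })
    return rows


inputs = simulate_inputs()
print(len(inputs), inputs[0])
```

Fixing the seed makes each simulated dataset reproducible, which helps when comparing recovered parameters across repeated estimations.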
Table 1: Distributions used to draw parameters from when simulating datasets.
Observed budget Implicit budget

Constraints on estimated parameters
The derivation of the likelihood function relies on the assumption that the utility function is monotonically increasing with decreasing marginal returns to consumption. In other words, it assumes $\partial U / \partial x_k > 0$, where $U$ is the global utility. Failing to comply with this assumption renders the likelihood function invalid, as second order derivatives of the Lagrangian would have to be checked to make sure the critical point is not a minimum. Furthermore, it could lead to the existence of multiple local critical points, i.e. the solution may not be unique, which is once again contrary to the assumptions made during the derivation of the likelihood function. The marginal utility of the outside good is always positive in both models proposed in this paper. However, the marginal utility with respect to an inside good will only be positive when the inequality in Eqn. 20 is satisfied.
Additionally, the argument of the logarithm inside $W_k$ must be larger than zero, so as to avoid undefined operations. In the case of the model with observed budget, this translates into the inequality in Eqn. 21. In the case of the model with implicit budget, it implies that Eqn. 22 must be satisfied.
These conditions are functions of $x_k$, making their fulfilment dependent on the particular dataset at hand. We would like to instead derive dataset-independent conditions. This is possible by noting that the impact of $x_k$ in both conditions is bounded by its exponential transformation to the interval $0 \le e^{-x_k} \le 1$ (because $x_k \ge 0$). This allows us to derive more general conditions than Eqns. 20, 21 and 22 by analysing the extreme cases $x_k = 0$ and $x_k = \infty$, as the value of the conditions for all other $x_k$ values will fall between these. These extreme cases have the benefit of removing $x_k$ from the conditions. Table 2 summarises the results from this analysis.
All conditions in table 2 with zero on the right-hand side are always fulfilled because $\psi_k$, $\gamma_k$, $p_k$, ∆− and ∆+ are all equal to or greater than zero. Eqn. 20 for $x_k = \infty$ will also always be true, as zero is approached from the right (i.e. from positive values). Among the remaining conditions,
Therefore, the sufficient conditions for the model with observed budget can be summarised as in eqn. 23,
where:
The sufficient conditions for the model with implicit budget are summarised in eqn. 24.
Conditions in eqns. 23 and 24 are based on extreme cases, so they represent sufficient but not necessary conditions for the validity of the parameters. In other words, estimated parameters need only comply with eqn. 20, and with eqn. 21 or 22, but satisfying eqn. 23 or 24 guarantees that those conditions are met.
If individuals in the dataset behave rationally and in accordance with economic theory, then the estimated parameters should naturally comply with eqn. 23 or 24. At the time of writing, we have not encountered inconsistent parameters, nor have we had to impose parameter constraints during estimation to enforce compliance with these equations.

Suitability of a linear utility for the outside good
In the model with implicit budget, we propose a linear utility for the outside good as an approximation of the case where expenditure on the inside goods (i.e. the considered alternatives) is small compared to that on the outside (numeraire) good. In these cases, we expect only very small changes to the marginal utility of the outside good due to changes in the consumption of the inside goods. For example, consider consumption of the yoghurt product category. The expenditure on yoghurt will be small compared to the total expenditure on food, and even smaller compared to the entire disposable income of the household. By using the model with implicit budget, the modeller does not need to determine what the correct budget is, but only needs to know that total expenditure in the category of interest is small compared to the budget, whatever that may be.
If our interpretation is correct, then the forecast of the model with implicit budget should approach that of the model with observed budget when the expenditure on the outside good is large compared to that on the inside goods. We tested this assumption through simulation. We first created 30 different datasets of 500 observations each, assuming a data generation process with observed budget, i.e. using the model presented in section 2. Besides having an outside good, each dataset had four inside goods that were always available. The base utility of the outside good was set to zero, while the base utility of the inside goods was composed of a single constant, each drawn from $U(-2, 0)$, i.e. a uniform distribution between -2 and 0. Satiation parameters $\gamma_k$ were drawn from $U(0.5, 1.5)$, $\delta_{kl}$ were drawn from $U(-0.01, 0.01)$, prices $p_k$ followed a $U(0.1, 1)$ distribution, and the budget was set to 10 for every observation. We measured the fit of each model on each dataset using the Root Mean Squared Error (RMSE) of the forecast aggregate demand in the whole sample. Results are exhibited in figure 4.
As figure 4 shows, the fit of the model with implicit budget approaches that of the model with observed budget as the expenditure on the outside good increases. This indicates that the model with implicit budget is an appropriate approximation when the expenditure on the outside good is large relative to the expenditure on inside goods.
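One possible reading of the fit measure used above is sketched below: consumption is summed over all observations for each alternative, and the RMSE is taken across alternatives between the observed and forecast totals. This aggregation choice is an assumption, as the exact definition is not spelled out in the text, and the function names and toy numbers are hypothetical.

```python
import math


def aggregate_demand(consumption):
    """Sum consumption over observations, per alternative (rows = observations)."""
    return [sum(column) for column in zip(*consumption)]


def rmse_aggregate(observed, forecast):
    """RMSE across alternatives between observed and forecast aggregate demand."""
    obs_total = aggregate_demand(observed)
    fc_total = aggregate_demand(forecast)
    squared = [(o - f) ** 2 for o, f in zip(obs_total, fc_total)]
    return math.sqrt(sum(squared) / len(squared))


# Toy example: two observations, two alternatives.
observed = [[1.0, 0.0], [2.0, 1.0]]
forecast = [[1.5, 0.0], [1.5, 2.0]]
error = rmse_aggregate(observed, forecast)
print(error)
```

Note that individual-level errors that cancel out within an alternative do not contribute to this measure, so it assesses aggregate rather than individual fit.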

Comparison with other MDC formulations
The MDC models presented in this paper are not the first in the literature to include complementarity, substitution or an implicit budget. In this section, we discuss other MDC models with these properties, and compare them to the models proposed in this paper. We begin with a very brief review of models without complementarity or substitution (other than income effects), which form the basis for more flexible models.

No complementarity or substitution, and an observed budget
One of the most popular models in this category is the MDCEV model by Bhat (2008). It is derived from the same consumer optimisation problem proposed in eqn. 1, but using a different functional form for the utility components. While there are several possible formulations, the most common one is the alpha-gamma formulation, because it allows for an efficient forecasting algorithm (Pinjari and Bhat, 2011). In this case, the utility takes the form described in eqn. 25, where α can either tend towards zero during the estimation process, or be fixed a priori by the modeller.
Parameter interpretation in the MDCEV model is essentially the same as in the models described in this paper, except for two differences.First, the outside good's marginal utility contains no covariates, but only a stochastic error term, i.e. ψ 0 = e ε 0 .Second, α measures satiation across the whole choice set in MDCEV, and not the influence of covariates in the outside good's marginal utility as in the models proposed in this paper.And while it is possible to introduce explanatory variables into the base utility of the outside good in MDCEV models (either directly, or by including them with the same coefficient in all inside goods' base utility), it is not commonly done in practice.
By setting u_kl = 0, the MDCEV model does not allow for pure complementarity or substitution effects, though product substitution can still take place due to income effects. Also, the form of u_0 requires the value of x_0, and therefore the budget, to be observed. Von Haefen and Phaneuf (2005) also present a similar model to MDCEV, but without an error term in the marginal utility of the outside good. Other models in this category include Habib and Miller (2008) and Habib and Miller (2009), who present models similar to that by Von Haefen and Phaneuf (2005).

Introducing complementarity and substitution through new functional forms
Vásquez Lavín and Hanemann (2008) propose a model formulation allowing for complementarity and substitution using a non-additively separable utility function and an observed budget.This formulation was later refined by Bhat et al. (2015), who called it the NASUF model.Beginning from the consumer optimisation problem set in eqn. 1, the utility components are defined as described in eqn.26.
The definition of u kl makes the NASUF utility function non-additive, effectively introducing complementarity and substitution effects.A positive value of θ kl is indicative of complementarity, while a negative one represents substitution, and θ kl = 0 implies no complementarity or substitution.Yet, this formulation has three main drawbacks.
The first drawback is that the utility function is valid only for some values of θ_kl. Just as in the case of the models proposed in this paper, and as discussed in section 4.2, the derivation of the likelihood function assumes ∂U/∂x_k > 0. For this to be true, the inequality in eqn. 27 must be satisfied.

∂U ∂x
While it is possible to bound the value of parameters during estimation, the problem with the condition in eqn.27 is that it depends on the value of x k .As the logarithm is not a bounded function, whether or not this condition is satisfied will depend on the level of consumption x of each individual, making it impossible to assess the correctness of a model without associating it to a particular dataset.This hinders model transferability from one dataset to another, and jeopardises forecasting, as only scenarios that fulfil the condition above should be permissible forecasts.
If all individuals in the dataset behave in accordance with economic theory, then the parameters should automatically fulfil eqn. 27. Yet, this does not prevent the estimation algorithm from trying parameter values that violate eqn. 27 during the parameter search. Furthermore, calculating the likelihood of the model requires calculating the logarithm of the expression in eqn. 27, leading to an error if the expression is less than or equal to zero.
The second issue with the solution proposed by Bhat et al. (2015) is that the stochasticity is introduced midway through the derivation of the model, in the Karush-Kuhn-Tucker conditions, and not in the initial formulation of the model. While this is merely a formal issue, it does imply that the origin of the randomness is not clear, and it is not possible to easily associate it with unobserved variables or measurement errors, as would be the case in more traditional econometric models.
The third issue is that γ parameters have a role both in satiation and in the interaction term (i.e.complementarity and substitution) of the utility, making their interpretation difficult.A similar formulation was proposed by Lee and Allenby (2009), but using a quadratic function to incorporate satiation, complementarity, and substitution.This model only considers inside goods, defining the global utility as x l (we assume only one product per category to simplify the analysis).Note that θ kk is not restricted to zero in this case, as is in the models proposed in this paper.The validity of the formulation rests on the condition which depends on the value of x k , leading to the same issue already discussed in the context of the NASUF model.
Finally, Lee et al. (2010) propose a model allowing for asymmetric complementarity and substitution among product categories. However, the formulation of the model does not satisfy the principle of weak complementarity (Maler, 1974), i.e. that an individual's utility is not influenced by the attributes of non-consumed goods or, in other words, that goods provide utility only through their use. This is a reasonable assumption in cases where non-use values are believed to be absent or small (see von Haefen (2004) for a more detailed discussion).

Introducing complementarity and substitution through the indirect utility function
While in this paper we derived MDC models from the direct utility function of consumers, it is also possible to make assumptions on the indirect utility instead, and then calculate the optimal consumption using Roy's identity, as described in section 3.1 of Chintagunta and Nair (2011). Song and Chintagunta (2007) propose an MDC model following the indirect utility approach, considering not only a set of alternatives, but grouping them into categories, and assuming that at most one alternative inside each category is consumed. Furthermore, this model imposes a symmetry constraint on its complementarity and substitution parameters, as described in eqn. 28.

$$\sum_{l=0}^{M} \theta_{kl} = 0 \quad \forall k \qquad (28)$$

where θ_kl represents the complementarity and substitution parameters (originally called β in Song and Chintagunta (2007)). Eqn. 28 requires that, for each product, the amounts of complementarity and substitution with other products add up to zero. But there are no theoretical reasons for this to necessarily be the case in any given application. This requirement prevents, for example, a product from being complementary to one other product without also being a substitute for at least one other. Mehta and Ma (2012) propose a model with a similar formulation to that of Song and Chintagunta (2007), but without the symmetry constraint. However, it requires the matrix of complementarity and substitution parameters (whose elements are θ_kl) to be positive semi-definite. Additionally, the likelihood function does not have a closed functional form, requiring multi-dimensional integration; and the number of parameters increases geometrically with the number of alternatives.
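To make the restriction in eqn. 28 concrete, the following sketch checks whether every row of a parameter matrix sums to zero. The two matrices are hypothetical values chosen for illustration (index 0 is the outside good), and the symmetric, zero-diagonal layout is an assumption of this example rather than a property taken from Song and Chintagunta (2007).

```python
def satisfies_zero_sum(theta, tol=1e-9):
    """Return True if every row of theta sums to zero, as eqn. 28 requires."""
    return all(abs(sum(row)) <= tol for row in theta)


# Hypothetical case: products 1 and 2 are complements (0.3), and no product
# has any substitution with any other. Rows 1 and 2 sum to 0.3, not zero.
theta_single_complement = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.0],
    [0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# Hypothetical case: the same complementarity between products 1 and 2 is
# admissible only once offsetting substitution effects are added elsewhere.
theta_balanced = [
    [0.0, 0.0, -0.3, 0.3],
    [0.0, 0.0, 0.3, -0.3],
    [-0.3, 0.3, 0.0, 0.0],
    [0.3, -0.3, 0.0, 0.0],
]

print(satisfies_zero_sum(theta_single_complement))  # constraint violated
print(satisfies_zero_sum(theta_balanced))           # constraint satisfied
```

In the second matrix, the complementarity between products 1 and 2 forces substitution between products 1 and 3, and between product 2 and the outside good, regardless of whether the data support those effects.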

Introducing complementarity and substitution through correlation in utility functions
An alternative way to introduce complementarity and substitution into an MDC model is by introducing correlation across the utility of alternatives. This can be done in two ways: (i) by directly correlating the random error term ε in the utility function of each alternative across multiple alternatives, or (ii) by adding new random error terms common to the utility of multiple alternatives. Pinjari and Bhat (2010) use the first approach, using extreme value distributions to nest alternatives together into mutually exclusive subsets, allowing for perfect substitutes but not for complementarity. This approach was generalised by Pinjari (2011), by allowing for overlapping non-exclusive nests, but its applicability to complementarity remains limited. Bhat et al. (2013) make ε follow a multivariate normal distribution across alternatives, allowing for flexible correlation patterns. Calastri et al. (2020a) follow the second approach, by using random intercepts and coefficients (β in our notation) correlated across alternatives.
As Pellegrini et al. (2021a) discuss, the main limitation of introducing complementarity and substitution through correlation in the utility functions of different alternatives is that of confounding effects.Indeed, using this approach it is impossible to discriminate between correlation due to common heterogeneity in preferences, from correlation due to complementarity and substitution.For example, two utilities could be positively correlated due to them sharing unobserved attributes, but not because the alternatives are complementary.
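The confounding problem can be illustrated with a toy simulation. The data-generating process below is not an MDC model: it is a simple censored demand for two goods with no interaction term between them, so there is no complementarity by construction. The two demands share an unobserved component, and all names and numerical values are assumptions made for this example.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate_demand(n=5000, shared_sd=1.0, seed=42):
    """Simulate demand for two goods with NO complementarity between them.

    Each demand is max(0, constant + shared + own noise), where `shared`
    stands for an unobserved attribute common to both goods.
    """
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        shared = shared_sd * rng.gauss(0.0, 1.0)
        x1.append(max(0.0, 0.5 + shared + rng.gauss(0.0, 1.0)))
        x2.append(max(0.0, 0.5 + shared + rng.gauss(0.0, 1.0)))
    return x1, x2


corr_shared = pearson(*simulate_demand(shared_sd=1.0))
corr_independent = pearson(*simulate_demand(shared_sd=0.0))
print(f"correlation with shared unobserved attribute: {corr_shared:.2f}")
print(f"correlation without it: {corr_independent:.2f}")
```

The first correlation is clearly positive even though neither good affects the utility of the other, which is the pattern a correlation-based model could not distinguish from complementarity.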

Two stage approaches to unobserved budgets
The necessity to observe the budget can lead to two separate issues. The first one arises during estimation, when the budget is not observed. This forces the modeller to assume some value for the budget before even estimating an MDC model. A common solution to this problem in past work has been to use the total expenditure as the budget. This is a strong assumption, as it implies that the total expenditure will not change as a function of prices or other attributes of the products. For example, it implies that consumers will spend the same amount regardless of the level of discount offered.
The second problem due to the necessity of an observed budget in MDC models manifests during forecasting. Forecasting for any future scenario requires exogenously defining a budget. Any errors in the forecasting of the budget will cascade down to the MDC model, as shown in section 6.2.
In the literature, these problems have been addressed mostly through two-stage procedures, where in the first stage, a model is used to estimate (and predict) the budget, and in the second stage, a traditional MDC model with observed budget is used to allocate the budget to the different alternatives. Pinjari et al. (2016) propose a two-stage approach. In the first stage, they use either a stochastic frontier or a log-linear regression to estimate the expected budget, and in the second stage they use the expected budget in an MDCEV model. They compare the performance of both approaches against arbitrarily determined budgets. When using the stochastic frontier method, they assume the budget to be an unobservable characteristic of decision makers, defined as the maximum amount they are willing to spend. This implies that the expected budget under this approach tends to be bigger than the total expenditure. The log-linear regression, on the other hand, attempts to predict total expenditure, so it leads to expected budgets that are of the same magnitude as the total expenditure. While both approaches offer similar performance, and both outperform the arbitrarily determined budget, the stochastic frontier approach leads to bigger expected budgets, therefore allowing for more variability in the forecast, as the total expenditure has room to grow if the attributes of the alternatives improve. This approach is also used by Pellegrini et al. (2021b).
Dumont et al. (2013) propose a different two-step approach to estimate the budget. In the first step, they estimate a Structural Equation Model (SEM) where the budget is a latent variable, whose structural equation has socio-demographics as explanatory variables. The budget can have several indicators, such as average expenditure in the category during the last three months, expected expenditure in the future, and ownership of goods from the same category. Income is also considered a latent variable, with at least stated income as indicator. More formally, the latent budget B_n and latent income I_n relate as follows : where Z_n are socio-demographics of individual n, y_nj is indicator j of the budget, S_n is the stated income, η_n, ξ_n, ε_nj and ε_ns are standard normal error terms, and ζ_z, ζ_I, λ_j, σ_j, λ_s and σ_s are parameters to be estimated. As expected, the authors report lower log-likelihoods when using the SEM approximation to the budget than when using maximum expenditure, but they also note an improvement in the significance levels of the MDC parameters. They do not report changes in forecast performance, making it difficult to evaluate the performance of the proposed approach.
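The first stage of a two-stage procedure based on a log-linear regression can be sketched as follows. This is a minimal illustration with a single hypothetical covariate z (for example, household size), fitted by ordinary least squares. It is not the specification used by Pinjari et al. (2016), and the second stage (an MDC model with observed budget receiving the predicted budget) is not implemented here.

```python
import math


def fit_log_linear(z, expenditure):
    """OLS fit of log(expenditure) = a + b * z for a single covariate z."""
    y = [math.log(e) for e in expenditure]
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    b = s_zy / s_zz
    a = y_bar - b * z_bar
    return a, b


def predict_budget(a, b, z_new):
    """Predicted budget for covariate value z_new, back on the original scale."""
    return math.exp(a + b * z_new)


# Hypothetical training data: total expenditure observed for three households.
z_train = [0.0, 1.0, 2.0]
expenditure_train = [math.exp(1.0), math.exp(1.5), math.exp(2.0)]

a_hat, b_hat = fit_log_linear(z_train, expenditure_train)
budget_forecast = predict_budget(a_hat, b_hat, 3.0)
```

Any error in `budget_forecast` is passed directly to the second-stage model, which is the cascading effect discussed in this section.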

Other MDC models with implicit budget
Other models in the literature have also used linear utility functions for the outside good, in the same way as the models proposed in this paper. This functional form leads to a likelihood function that does not depend on the budget, effectively allowing for unobserved budgets.
In the context of the MDCEV model and its derivations, Bhat (2018) was the first one to propose using a linear utility function for the outside good.This functional form, however, was not motivated by the need to drop the budget from the model formulation, but it was used to allow for more separability between the parameters that determine the discrete choice (i.e.what to choose), from those that determine the continuous choice (i.e.how much to choose).Therefore, this property of the model is hardly explored in that paper.
More recently, Saxena et al. (2022) discussed the consequences of using a linear utility for the outside good in models with additively separable utility functions.Such a configuration leads to models that do not consider complementarity, substitution, nor income effects, therefore making demand from one product independent from another, unlike the model proposed in this paper (though it does allow for parameterising ψ 0 ).Similarly to our own advice, they recommend using a linear utility function for the outside good only when the total expenditure in the inside goods is no more than 35% of the budget (or more strictly, less than 5%).If the expenditure in inside goods is higher than those values, they find bias in the model estimates and poor forecasting performance.
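The rule of thumb above can be expressed as a simple check, provided that at least a rough value of the budget is available. The thresholds (35% and 5%) are those quoted from Saxena et al. (2022); the function names and the returned labels are assumptions made for this sketch.

```python
def inside_share(prices, quantities, budget):
    """Share of the budget spent on the inside goods."""
    expenditure = sum(p * x for p, x in zip(prices, quantities))
    return expenditure / budget


def linear_outside_good_check(share, loose=0.35, strict=0.05):
    """Classify the expenditure share against the two thresholds in the text."""
    if share < strict:
        return "below strict threshold"
    if share <= loose:
        return "within loose threshold"
    return "above thresholds"


# Hypothetical basket: two inside goods bought at prices 1 and 2, budget of 100.
share = inside_share([1.0, 2.0], [1.0, 1.0], 100.0)
print(share, linear_outside_good_check(share))
```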
While we did not find evidence of biased parameters in the proposed model (see figure 3), we did find evidence of poor forecast performance (see figure 4). The absence of parameter bias in the proposed model could be due to it including complementarity and substitution effects, and the fact that the error term follows a Normal distribution instead of a Gumbel distribution.

Model application and comparison
In this section we apply the proposed models to four different datasets. The first dataset records time use, where all participants face the same budget (24 hours a day), and all alternatives (in this case, activities) have the same price (one unit of time). This dataset allows us to measure how much fit is lost when using the model with implicit budget when the budget is known, as well as to compare the proposed models against a model without complementarity or substitution. The second dataset deals with household expenditure, where budgets vary between different households, but consumption is aggregated to categories, so prices are still unitary (one unit of money).
This dataset helps us illustrate how the fit of the model with observed budget degrades when the budget is misspecified, a case particularly relevant in forecasting.The third dataset contains scanner data from a supermarket, where both budgets and prices vary from one observation to the next.This dataset allows us to compare the sensitivity to price of the models with observed and implicit budget.The last dataset reports the number of trips performed by travellers for different purposes.This dataset is a case where the very definition of a budget is problematic, as there is no evident limit on the number of trips during a day.

Fixed budget and fixed prices: time use dataset
The first dataset records time use of 447 individuals across 2,826 days in total.Details about the data collection can be found in Calastri et al. (2020b), and an application to time use analysis using this data can be found in Calastri et al. (2019) and Palma et al. (2021).Only out-ofhome activities are registered in the dataset, which we aggregate to six plus the outside good, as described in table 3. We estimated three different models using the Time Use data.First we estimated a traditional MDCEV model (Bhat, 2008), which has an observed budget and no complementarity.
We also estimated the first model proposed in this paper (eMDC1), with an observed budget, complementarity and substitution. Finally, we estimated the second model proposed in this paper (eMDC2), with an implicit budget, complementarity and substitution.
In the case of time use, the budget is observed (24 hours a day for everyone), and remains unchanged in forecasting scenarios, giving a clear advantage to the MDCEV and eMDC1 models.
Nevertheless, we are interested in exploring the consistency of results across the models with observed budget, as well as the loss of fit in the eMDC2 model (which uses an implicit budget) with respect to the others.We estimated the models using 70% of the sample, and forecast for the remaining 30%.Table 4 presents the estimated parameters, likelihood and root mean squared error (RMSE) of the forecast consumption at the aggregate sample level for each model.
The parameter estimates point towards consistent effects across models.And while parameters across models change in magnitude, their signs remain unchanged.Parameter interpretation is equivalent across models, except for α.In the MDCEV model α measures satiation across all alternatives.Instead, in the proposed eMDC models α represents the impact of the associated explanatory variable (z 0 ) on the marginal utility of the outside good (ψ 0 ).In the proposed models, α > 0 (α < 0 ) implies a positive (negative) effect of z 0 on ψ 0 , therefore an increased (decreased) consumption of the outside good, and a decreased (increased) consumption of the inside goods when z 0 grows.In this particular application, the negative sign of α female indicates that, after controlling for other variables, women on average perform more out-of-home activities than men.
Concerning the β parameters, all of them are negative because all "inside" activities are less common than the "outside" activity (staying at home, see table 3). These parameters become more negative as the engagement with their corresponding activity decreases, except for leisure and work in eMDC1, probably due to the effect of interactions. As expected, working full time increases the chance to engage in work activities, while the weekend decreases it but increases the chance of engaging in leisure activities; and being 30 years old or younger increases the probability of engaging in school activities. The γ parameters follow a similar trend, with higher values associated with activities performed for longer periods of time. The only exception is school, which has a large γ parameter despite being consumed for shorter periods than leisure, probably to compensate for its small ψ_school.
Only the eMDC models provide information on complementarity and substitution through their δ parameters, which are fairly consistent across eMDC1 and eMDC2.As expected, there is substitution between work and school, because few people work and study concurrently.On the other hand, we observe complementarity between shopping, private business and leisure, probably because all of these activities are often performed at the city centre, and therefore easier to chain into a single trip.As table 3 shows, correlations between time consumption are negative for all pairs of activities, because of the fixed budget and competing nature of the activities.Yet we do observe that correlations with a magnitude smaller than 0.05 tend to be associated with complementarity effects.In section 6.3, we again compare correlations and complementarity/substitution parameters, but in a dataset where the budget constraint is less strenuous, finding a much stronger connection between them.
Concerning fit, the eMDC1 model achieves the lowest RMSE of the three models, followed by eMDC2 and MDCEV. We expected eMDC1 to achieve the best fit, as it uses all the available information, including the total consumption or budget, and it includes complementarity and substitution effects. On the other hand, it was hard to predict which of the other two models would achieve the second best fit, as the MDCEV model omits complementarity and substitution, while the eMDC2 model does not use information about the budget. In this particular case, the eMDC2 model fit better than MDCEV, but this is probably a dataset-dependent result, and may change in other study scenarios. The log-likelihood is not comparable across models, as they have different formulations, making the RMSE a better indicator of fit. In summary, when the budget is known, and will be known in future scenarios when forecasting is relevant, we recommend using the eMDC model with observed budget.
eMDC1-100 and eMDC2 are presented in Table 6.Parameter estimates of eMDC1-80 and eMDC1-120 followed similar trends, and are available from the authors.
α, β and γ parameters follow a similar trend in models eMDC1-100 and eMDC2. Results indicate that having a female or older household head increases the marginal utility of the outside good (i.e. decreases expenditure in the inside goods), while a more educated household head has the opposite effect. These effects can be explained by the low female participation in the labour market (Contreras and Plaza, 2010), higher levels of education among younger individuals (Organisation for Economic Co-operation and Development, 2009), and a strong correlation between level of education and income among the Chilean population (Bilbao, 2013). Among β parameters, we observe that a higher number of adults, children, elders, workers and students per household increases the chance of spending money on alcohol, clothing, health, transport and education, all of which are reasonable effects. Furthermore, the estimates of the γ parameters indicate that more populous households tend to spend more on food, transport, communications, leisure, education and others, but not necessarily on alcohol, clothing, homeware, health, and restaurants, as these categories are more discretionary.
Complementarity and substitution parameters δ are particularly different between the model with observed and implicit budget (eMDC1-100 and eMDC2, respectively).While the model with observed budget captures substitution between multiple pairs of categories, the model without it is dominated by complementarity.This is because when the budget is not controlled for, all categories of consumption seem to increase or decrease in tandem, because a higher (lower) income implies a higher (lower) expenditure across all categories.In other words, the income effect is confounded with complementarity in the model with implicit budget, as discussed in section 4.1.
Our main objective with this dataset was to analyse how errors in the definition of the budget lead to different forecast errors in models with observed budget.To do this, we first estimated the models using 70% of the full sample (training dataset), and then forecast demand on the remaining 30% of observations (validation dataset) multiple times, assuming a different value of the budget in each occasion.We repeated this for each of the eMDC1 models we estimated.
Different budgets lead to different forecasts in the eMDC1 models, but not in the eMDC2 model.
Figure 5 presents the results of this exercise.We used the root mean squared error (RMSE) of the aggregate predictions in the validation sample as an indicator of error in the forecast.
As Figure 5 shows, the forecast performance of the model with implicit budget (eMDC2 ) does not change as a function of the budget.Instead, the eMDC1 models achieve a better forecast performance when the forecast budget is close to the estimation budget, but their error grows in a quadratic way with the budget misspecification.It does not seem to be very important how the estimation budget is defined in eMDC1 models.For example, the estimation budget could be defined as the total income of the household or just the total expenditure on the inside goods plus one.However, once a budget has been used during estimation, it is very important to accurately and consistently predict the budget for any forecasting scenario, otherwise the forecast error can  These results reveal that in contexts where the forecasting of the budget implies even mild uncertainty, the proposed model with implicit budget can ensure a bounded level of error in the forecast.

Variable budget and variable prices: supermarket scanner dataset
The third application deals with scanner data from a chain of supermarkets (Venkatesan, 2014).
After dropping all records of transactions from households with missing socio-demographic characteristics, and limiting the analysis to only four product categories, the dataset contains 4,002 purchase baskets from 656 households. All the considered product categories are fresh fruits: oranges, peaches, pears, and pineapples. Each fruit can be purchased in packs of different weights, but to simplify the analysis, we calculated the average price per kg of each product, and expressed the amount purchased in kg. Table 7 summarises consumption in the dataset. Our objective with this dataset was to compare the models with observed and implicit budget in terms of their sensitivity to changes in price. We estimated two models on the supermarket dataset: eMDC1 is the model with observed budget, which we set to the observed consumption plus one; the second model (eMDC2) assumes an implicit budget. The parameter estimates and log-likelihood at convergence of these models are shown in Table 8. Non-significant parameters were not removed from the model formulation. To compare their sensitivity to price, we changed the price of oranges between 70% and 130% of their original price, and calculated both models' aggregated forecast demand on the training dataset. Figure 6 plots the demand forecast by each model, for different prices.
As can be seen in Figure 6, both models predict a similar demand for the product whose price changes (oranges), but offer different predictions for the other products, whose prices remain constant.This is because of the income effect only being present in the model with observed budget, pushing for a much more dramatic reassignment of consumption when price changes.
On the other hand, the model with implicit budget assumes a large unobserved budget, inducing smaller reassignment effects caused only by the δ parameters.Assuming a larger budget in eMDC1 would decrease the sensitivity of the forecast demand among the products whose price does not change, making it more similar to the forecast of the eMDC2 model (not reported).Based on the available data we cannot determine which of the two predictions is more accurate, as we are forecasting for unobserved prices.
The complementarity and substitution (δ_kl) parameters are significantly different across models. While eMDC1 captures only complementarity, eMDC2 captures both complementarity and substitution. This is because the δ parameters in eMDC2 are not only capturing the complementarity and substitution effects, but are also confounded with the income effect. This is apparent as the signs of the δ parameters in eMDC2 mirror those of the correlation of demand in the dataset (see table 7). This also explains why the δ parameters in eMDC2 have higher t-ratios, as they are used to capture any interaction between the demand of different products, be it due to complementarity, substitution, or income effects. Larger budgets (as compared to expenditure in inside goods) will reduce the size of income effects, making the model with implicit budget more suitable for such scenarios.
Our objective with this dataset is to compare out-of-sample forecast performance between the proposed models with explicit and implicit budget (eMDC1 and eMDC2, respectively) when the definition of the budget is arbitrary. In theory, the budget in our dataset should be the maximum number of trips a household could generate during a day, but this value is very difficult to determine. Defining the budget as any lower (but more reasonable) value would be an arbitrary decision. A common approach in situations without an evident budget is to use the observed total consumption as the budget (Bhat and Sen, 2006). We follow this approach when estimating eMDC1, assuming the budget to be equal to the observed total number of trips plus one, so that the "outside good" is always consumed. However, this strategy poses a problem when predicting out of sample, as the budget needs to be predicted using an auxiliary model. To reproduce this situation, we estimate our models using only 70% of the whole sample, and predict for the remaining 30%. In the case of eMDC1 we predict the budget using a linear regression on the training data. In the case of eMDC2 we have no need to make assumptions on the budget nor to use an auxiliary model for out-of-sample prediction, as the budget is not needed during estimation or forecasting.
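The "total consumption plus one" convention used for eMDC1 can be written out explicitly. The sketch below assumes unit prices when none are given, as in the trips dataset; the function name is hypothetical.

```python
def budget_plus_one(quantities, prices=None):
    """Budget defined as observed total expenditure on inside goods plus one.

    Returns the budget and the implied consumption of the outside good.
    Prices default to one unit each (an assumption suited to trip counts).
    """
    if prices is None:
        prices = [1.0] * len(quantities)
    expenditure = sum(p * x for p, x in zip(prices, quantities))
    budget = expenditure + 1.0
    outside_good = budget - expenditure
    return budget, outside_good


# Hypothetical household making 2, 0 and 3 trips for three purposes.
print(budget_plus_one([2, 0, 3]))
```

Under this convention the outside good always equals one in the estimation data, so the budget carries no information beyond total consumption. Out of sample, total consumption is unknown, which is why an auxiliary model is needed to predict the budget.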
In both eMDC1 and eMDC2 we use a linear function with the same socio-demographics to explain the base utility of the outside good (ψ_0). The base utility of the inside goods and their satiation are described by a single constant each. The linear regression used to predict the budget has the same socio-demographics as explanatory variables as the discrete-continuous models. Table 10 presents the coefficients of each model estimated with the training dataset (70% of the whole sample), and their forecast performance when predicting on the validation dataset (remaining 30% of the sample). Table 11 presents the complementarity/substitution (δ) parameters of both eMDC1 and eMDC2. Establishing parallels between the parameters of both models is difficult. In the model with observed budget (eMDC1) the effect of socio-demographics has two components: their effect on the budget prediction, and their effect on the multiple discrete continuous model itself. On the other hand, the model with implicit budget (eMDC2) does not have this complexity. The sign of another that -additionally to these effects-does not require the analyst to define a budget. The inclusion of explicit complementarity and substitution effects enriches the interpretability and realism of the model (Manchanda et al., 1999), while its functional form avoids issues present in previous formulations proposed in the literature (see section 1). The second model, with its implicit budget, is particularly useful when forecasting as it avoids cascading errors due to inaccurate budget predictions (see section 6.2).
The model with implicit budget is based on the hypothesis that total expenditure on the alternatives under consideration is small compared to the overall budget.This hypothesis allows us to approximate the utility of the numeraire good by a linear function, hence removing the necessity to define a budget.This approximation comes at the cost of reduced fit, as compared to the model with observed budget.However, simulations show that the fit of both models converges when the hypothesis above is fulfilled (see section 4.3).Such an assumption is realistic in most daily consumption decisions, but should always be justified when using the model.In general, if the budget can be determined with a great degree of confidence in forecasting scenarios, then we recommend using the model with observed budget.But if there is significant uncertainty in the budget prediction, the model with implicit budget can be a useful alternative, as it makes the prediction error independent from the budget estimation.
A computational implementation of the proposed model is available for R, as an extension of the Apollo package (Hess and Palma, 2019).To download this extension and see examples, visit ApolloChoiceModelling.com.
The models proposed in this paper contribute to the literature on Kuhn-Tucker demand system models for the study of multiple discrete choices. There are still several avenues for improvement and further investigation. New functional forms for the complementarity and substitution term in the direct utility function could be explored, with special emphasis on those leading to a compact form of the Jacobian in the likelihood function. More generally, including a random component in the marginal utility of the outside good would be a useful development, especially if it leads to a closed-form likelihood function. Alternative formulations based on indirect utility functions could be less restrictive, as they avoid assumptions on the shape of decision makers' direct utility functions. The model formulation could also be modified to incorporate multiple constraints, for example a monetary and a time budget, or a storage capacity. Of particular interest would be an approach that mixes constraints with an explicit and an implicit budget. Finally, an empirical comparison of alternative formulations for the complementarity and substitution component of the utility, as well as the utility of the outside good, is of much interest, especially given recent developments in Bhat (2018) and Pellegrini et al. (2021a).
Figures 2 and 3 summarise the true and estimated parameters for the models with observed and implicit budget, respectively. In the graphs, the horizontal axis indicates the true value of the parameter, while the vertical axis indicates the estimated value. A perfect recovery of a parameter is represented by a dot on the identity line (in blue). The graphs also show the 95% confidence interval for each estimated parameter. Both figures offer a similar perspective: all parameters are recovered correctly, but α and β parameters are recovered more precisely than γ and δ parameters (especially the latter), which are harder to recover.
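The recovery check behind such plots can be sketched in a few lines. The example below assumes a normal approximation for the 95% confidence interval (estimate ± 1.96 standard errors) and flags a parameter as recovered when its true value lies inside that interval. The function names and all numbers are made up for illustration; they are not results from the simulations reported here.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Normal-approximation confidence interval (z = 1.96 gives roughly 95%)."""
    return estimate - z * std_error, estimate + z * std_error


def recovery_summary(true_values, estimates, std_errors):
    """Flag, for each parameter, whether its true value lies inside the interval."""
    rows = []
    for true, est, se in zip(true_values, estimates, std_errors):
        low, high = confidence_interval(est, se)
        rows.append({
            "true": true,
            "estimate": est,
            "ci": (low, high),
            "recovered": low <= true <= high,
        })
    return rows


if __name__ == "__main__":
    # Made-up values for illustration only.
    true_values = [0.5, -1.0, 2.0]
    estimates = [0.52, -0.90, 2.60]
    std_errors = [0.02, 0.10, 0.20]
    for row in recovery_summary(true_values, estimates, std_errors):
        print(row)
```

In a plot of estimates against true values, a parameter flagged as recovered corresponds to an interval bar that crosses the identity line.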

Figure 2: Recovery of parameters for the model with observed budget.

Figure 4: Comparison of the fit of the models with observed and implicit budget, on data generated assuming a data generation process with observed budget.

Kim et al. (2002) use a similar utility function to the MDCEV model, but assume that the random disturbances follow a multivariate normal distribution. While more flexible, this distribution makes the model much more computationally demanding. Von Haefen and Phaneuf (2005) [...] Pellegrini et al. (2019) refine the model proposed in Bhat et al. (2015) by proposing a different interaction term in the utility function. While this new formulation leads to an improved fit and provides a clear interpretation of γ parameters, it retains at least the first issue associated with the formulation of Bhat et al. (2015). Pellegrini et al. (2021a) further expand the NASUF model by allowing for two budget constraints in an application where both time and monetary constraints are considered jointly.

Figure 5: Comparison of the forecast precision of the models with implicit and observed budget, when the budget is wrongly specified in the latter.

Figure 6: Relative aggregated sample demand forecast by the traditional and extended MDCEV models for variations in the price of oranges. The black line indicates unity (i.e. original demand).

Table 2: Constraints on proposed model parameters for extreme levels of consumption (columns: x_k; x_l with δ_kl > 0; x_l with δ_kl < 0).

Table 3: Main descriptive statistics of the time use database. * outside good; † when engaged.

Table 4: Comparison of the proposed extended MDC and traditional MDCEV models on a time use dataset.

Table 6: Comparison of the models with observed and implicit budget on the expenditure dataset.

Table 7: Main descriptive statistics of the supermarket scanner data.

Table 8: Parameter estimates of the models with observed and implicit budget on the supermarket scanner dataset.

Table 9: Main descriptive statistics of the number of trips database.

The last application deals with the number of trips generated by a household, split across different purposes: work, study, personal business, leisure and return home. Data comes from the 2012 Origin-Destination survey of Santiago, Chile (Observatorio Social, 2014). The database contains observations for a single day from 10,927 households. Table 9 summarises the average number of trips per purpose by households' number of vehicles and income.

Table 10: Parameter estimates and forecast performance for models on the number of trips dataset. * Robust t-ratio. † Calculated based on out-of-sample prediction.