Specification of Informative Prior Distributions for Multinomial Models Using Vine Copulas

We consider the specification of an informative prior distribution for the probabilities in a multinomial model. We utilise vine copulas: flexible multivariate distributions built using bivariate copulas stacked in a tree structure. We take advantage of a specific vine structure, called a D-vine, to separate the specification of the multivariate prior distribution into that of marginal distributions for the probabilities and parameter values for the bivariate copulas in the vine. We provide guidance on each of the choices to be made in the prior specification and each of the questions to ask the expert to specify the model parameters within the context of an engineering application. We then give full details of the approach for the general problem.


Introduction
The specification of an informative prior distribution over multiple unknowns, for example the probabilities in a multinomial distribution, is a challenging task. When the unknowns are dependent in the prior beliefs of the expert, the most flexible approaches currently used require the specification of a prior covariance or correlation matrix. The requirement that this matrix be positive definite adds further complexity, as it restricts the dependencies that the expert can specify. Vine copulas impose no such restriction: all correlations to be specified are algebraically free between -1 and +1. Thus, vines offer a promising approach to the specification of informative prior distributions.
One of the first approaches to the specification of a prior distribution for multinomial models was that of Chaloner and Duncan (1987) who proposed an extension of a method to elicit binomial probabilities in which the specifications of the expert, in the form of modes, were converted to the parameters of a Dirichlet distribution. While this is conjugate to the multinomial distribution, it is a very restrictive form for the dependency structure between the probabilities (O'Hagan et al., 2006). Dickey (1983) also proposed the Dirichlet distribution as a prior for multinomial models, using hypothetical future samples to specify the parameters. The method adjusted the expert's specifications. Other approaches based on the Dirichlet distribution and ordered Dirichlet distribution are given in van Dorp and Mazzuchi (2003); Zapata-Vázquez et al. (2014). Garthwaite (2013, 2016) considered the Dirichlet distribution as a prior for multinomial models and proposed two extensions: the Connor-Mosimann distribution, which is also conjugate to the multinomial distribution, and the multivariate Gaussian copula. The Connor-Mosimann distribution has more parameters than the Dirichlet distribution and so provides more flexibility for prior specification. However, some restrictive structure remains, such as forcing all of the probabilities to be negatively correlated with the first probability. The Gaussian copula relaxes the restrictions still further on the dependency structure, while still requiring a positive definite covariance matrix. Elfadaly and Garthwaite (2016) developed a procedure to ensure this condition holds.
Vines offer an extension to multivariate copulas in which a multivariate distribution can be built from the marginal distributions of the unknowns and a series of bivariate copulas encoding dependence in a tree structure (Joe, 1996; Bedford and Cooke, 2002). Kurowicka and Cooke (2007) and Cooke et al. (2015) considered sampling from, conditioning in and searching through vines. Haff et al. (2010) and Stöber et al. (2013) considered simplified vines, in which the conditional copulas are assumed not to depend on the values of the conditioning variables. Schepsmeier (2015) and Dißmann et al. (2013) proposed methods for model selection in vines, which have been extended to Bayesian approaches by Gruber and Czado (2015) and Min and Czado (2010a,b). Bayesian inference for D-vines has been considered by Czado and Min (2011).
Example 2 in Bedford et al. (2016) considered prior specification of a bivariate copula within a vine. All other applications to date have been concerned with fitting vines to data. This represents a missed opportunity for vines, whose sequential specification and completely flexible dependence structure make them an ideal tool for prior specification. In this paper we describe how to operationalise vines as prior distributions for multinomial models. We describe the implementation of the prior specification in detail through an extended application, illustrating each step explicitly, indicating how each of the necessary choices is to be made and providing specific elicitation questions to ask the expert. This is followed by a discussion of the general problem, providing the mathematical details of the approach.
The context of the application is a group of engineers responsible for the safety of a complex structure. In engineering applications such as this, interest typically lies in the tails of the joint distributions, as this is where safety critical events will manifest. By assuming relatively light-tailed forms for joint distributions such as the Gaussian copula, there is a risk that the probability of safety critical events will be underestimated. By allowing more flexibility in the specification of joint distributions, we are able to provide more accurate, and more conservative, estimates.
In Section 2 we provide the engineering application, detailing the specification of vines as prior distributions for multinomial models within an example. In Section 3 we give details of the approach for the general problem of the specification of a prior distribution for multinomial probabilities. In Section 4 we give a summary and areas for future work.

Background and structure
We consider a desensitised industrial case of an ageing complex engineering structure as described in Wilson et al. (2013). The problem under consideration was to assess the underlying condition of the structure. There were several possible tests which could be carried out to assess the condition, and each would give information about specific aspects of the condition. The tests varied in their costs, use of time and resources and invasiveness. A Bayesian network was developed to assess the potential reduction in uncertainty which would result from the tests. It is given in Figure 1. Each condition variable, such as grout condition and pitting, was represented by a multinomial distribution categorised by the condition of that element of the structure. For example, grout condition was categorised into fully effective, partially effective and not effective. In each node in the network, a prior distribution was required on the probabilities of belonging to each category, which would be updated on observation of results from the chosen test.
Consider the variable "capacity". It has four possible states: original level, acceptable reduced capacity, unacceptable reduced capacity and failed. There are a large number of locations across the structure where we are interested in the state of the capacity. The multinomial observation is therefore the number of locations where capacity is in each of its states. Prior to the elicitation, full definitions of capacity and each of its states were agreed with the expert.
We wish to define a prior distribution on $(p_1, p_2, p_3, p_4)$, the probabilities that capacity is in each of its states, ordered as above. However, these probabilities have a unit sum constraint, $\sum_{i=1}^{4} p_i = 1$, and if we ask an expert about the probabilities directly, they are likely to give us beliefs which are incoherent. Therefore, we instead choose to elicit a prior distribution over $(\theta_1, \theta_2, \theta_3, \theta_4)$, the probabilities that capacity is in each of its possible states conditional on not being in any of the previous states. We can then recover $(p_1, p_2, p_3, p_4)$ via
$$p_1 = \theta_1, \qquad p_i = \theta_i \prod_{j=1}^{i-1} (1 - \theta_j), \quad i = 2, 3, 4. \tag{1}$$
Asking experts about the conditional probabilities allows them a free choice and ensures coherence of the resulting probability distribution. Because the prior distribution is defined over the conditional probabilities, we only need to consider $\theta = (\theta_1, \theta_2, \theta_3)$, as $\theta_4 = 1$. Thus, the ordering of the states is an important consideration. We choose to make the original level state 1, and so failed is state 4. We could have chosen the reverse order. However, complex engineering structures are typically very reliable. This means that the engineers will have a lot of experience of locations where capacity is at its original level, and relatively little experience of locations where it has failed. It is therefore more reasonable to ask them questions about the original level, acceptable reduced level and unacceptable reduced level than about the failed and two reduced levels.
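The mapping between the conditional probabilities and the unconditional probabilities can be sketched in a few lines. This is a minimal illustration rather than the author's code: the function names are hypothetical, and the fourth value in `p_values` is simply the remainder implied by the unit sum constraint once the three values used later in the elicitation questions (0.75, 0.175 and 0.01875) are fixed.

```python
from typing import List, Sequence


def theta_to_p(theta: Sequence[float]) -> List[float]:
    """Map conditional probabilities (last element equal to 1) to unconditional ones."""
    p = []
    remaining = 1.0  # probability of not being in any earlier state
    for t in theta:
        p.append(t * remaining)
        remaining *= 1.0 - t
    return p


def p_to_theta(p: Sequence[float]) -> List[float]:
    """Inverse map: theta_i = p_i / (1 - sum of the earlier p_j)."""
    theta = []
    remaining = 1.0
    for p_i in p:
        theta.append(p_i / remaining)
        remaining -= p_i
    return theta


# Values taken from the elicitation questions; the last entry is the
# remainder 1 - 0.75 - 0.175 - 0.01875 (an assumption made for illustration).
p_values = [0.75, 0.175, 0.01875, 0.05625]
theta_values = p_to_theta(p_values)
print([round(t, 4) for t in theta_values])
print([round(x, 5) for x in theta_to_p(theta_values)])
```

Because each conditional probability is free to take any value in [0, 1], any set of answers from the expert maps to a valid probability vector, which is the coherence property described above.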
We represent the prior distribution for $\theta$ using a D-vine, and so it takes the form
$$f(\theta) = \prod_{i=1}^{3} f_i(\theta_i) \; c_{1,2}\{F_1(\theta_1), F_2(\theta_2)\} \; c_{2,3}\{F_2(\theta_2), F_3(\theta_3)\} \; c_{1,3}\{F_{1|2}(\theta_1 \mid \theta_2), F_{3|2}(\theta_3 \mid \theta_2)\},$$
where $f_i(\theta_i)$ and $F_i(\theta_i)$ are the prior Probability Density Function (PDF) and Cumulative Distribution Function (CDF) of $\theta_i$, $c_{i,j}(\cdot, \cdot)$ is a bivariate copula density and $F_{i|j}(\theta_i \mid \theta_j)$ is the prior conditional CDF of $\theta_i \mid \theta_j$. From this, we see that, to fully define the prior distribution for $\theta$, we need to specify the marginal distributions for $\theta_1$, $\theta_2$ and $\theta_3$ and the dependence between $\theta_1$ and $\theta_2$, between $\theta_2$ and $\theta_3$, and between $\theta_1$ and $\theta_3$ conditional on $\theta_2$. The vine structure is particularly suitable for prior specification in this case because it allows us to consider each of the specifications in isolation, reducing the elicitation burden on the expert. The vine, like multivariate copulas, preserves the marginal distributions of the parameters and uses the bivariate copulas to encode the dependencies between the parameters.

Marginal distributions
Consider the marginal distribution for $\theta_1$, the probability that capacity is in its original state. To elicit the marginal distribution for $\theta_1$, the expert is asked three questions:
Q1. Consider the proportion of locations in the structure in which capacity is at its original level.
(a) What is the value for which you think that the true proportion would be equally likely to be above or below this value? Call this $q_{0.5,1}$.
(b) What is the value for which you think the true proportion is equally likely to lie between 0 and this value as between this value and $q_{0.5,1}$?
(c) What is the value for which you think the true proportion is equally likely to lie between $q_{0.5,1}$ and this value as between this value and 1?
We choose to ask the expert about proportions to avoid asking for probabilities of probabilities, and to link the questions more closely to observable quantities. Part (a) gives the expert's median for $\theta_1$ and parts (b) and (c) give their lower and upper quartiles respectively. For $\theta_2$ and $\theta_3$, parts (a), (b) and (c) would remain the same (using the values given by the expert for $q_{0.5,2}$ and $q_{0.5,3}$ respectively) but the populations the expert was to consider would be:
Q2. Consider all of the locations where capacity is not at its original level. We are interested in the proportion of these locations in which capacity is at a reduced acceptable level.
Q3. Consider all of the locations where capacity is not at its original level or an acceptable reduced level. We are interested in the proportion of these locations in which capacity is at a reduced unacceptable level.
Suppose that the elicited quantiles for an expert are given in Table 1. We see that, in the beliefs of this expert, the capacity is most likely to be at its original level. Conditional on capacity not being at its original level, it is most likely to be at an acceptable reduced level. Of course, once we know that it is not in any of the first three categories, it will be failed with probability one.
For each pair of quantiles, we can calculate the parameter values of the beta distribution which matches them exactly. For all three quartiles together, there is in general no beta distribution which matches them exactly. We therefore choose a beta distribution which approximately fits all three quantiles by matching the first two moments. Full details are given in the general problem section. Fitting a two-parameter distribution to three elicited quantiles allows us to provide feedback to the expert on the consistency of their quantile specifications. If the beta distributions implied by the three pairs of quantiles are very different, the expert may wish to revise one or more of their quantiles.
From the elicited quantiles in Table 1, we find exact beta distribution parameter values $(a_{i,j}, b_{i,j})$, where $j = 1, 2, 3$ represent the pairs of quantiles $(q_{0.25,i}, q_{0.5,i})$, $(q_{0.5,i}, q_{0.75,i})$ and $(q_{0.25,i}, q_{0.75,i})$ respectively, and the parameter values which approximately match all three quantiles, $(a_i, b_i)$. They are given in Table 2 for $\theta_1$, $\theta_2$ and $\theta_3$.
We see that there are differences between the beta distributions resulting from the different pairs of quantiles, though in general these are relatively small. We can also view the densities of the marginal beta prior distributions in each case, and these are given in Figure 2. The colours distinguish the fits to the individual pairs of quantiles, $(a_{i,j}, b_{i,j})$, from the overall fit, $(a_i, b_i)$, which is shown in blue. We see that the overall densities in blue represent a reasonable aggregation of the other densities for $\theta_1$, $\theta_2$ and $\theta_3$. For $\theta_2$, one of the elicited quantiles appears to be inconsistent with the other two. We may wish to ask the expert to reconsider their quantile specifications in light of this. We also see the differences in the variances of the prior distributions, with the largest uncertainty in the value of $\theta_2$ and the smallest in the value of $\theta_1$.

Bivariate copulas
We need to specify the dependence structure between the elements of θ. This is equivalent to specifying the copulas c 1,2 (·, ·), c 2,3 (·, ·) and c 1,3 (·, ·). The elicitation in this stage asks the expert to condition on the value of one of the probabilities and provide revised beliefs about another probability in light of this. It would be extremely cognitively challenging to ask the expert to condition on the value of a conditional probability and then give their updated beliefs about a further conditional probability. Therefore, instead, we return to p = (p 1 , p 2 , p 3 ) , the unconditional probabilities that capacity is at its original level, a reduced acceptable level and a reduced unacceptable level respectively.
To specify the two unconditional copulas in the vine, we ask the expert questions about $p_2 \mid p_1$ and $p_3 \mid p_2$ respectively. In particular, we choose to condition on the probabilities taking their prior median values. We could condition on any value of the probabilities, but by choosing the median we ensure that the expert is considering values that are consistent with their prior beliefs. The questions we ask the expert to elicit $c_{1,2}(\cdot, \cdot)$ are:
Q4. In Q1, you identified that you think that the proportion of locations where capacity is at its original level is equally likely to be above or below 0.75. Suppose that this is correct. In light of this information:
(a) Are you still comfortable that it is equally likely that the proportion of locations where the capacity is at an acceptable reduced level is above or below 0.175? If so,
(b) What is the value for which you think the true proportion is equally likely to lie between 0 and this value as between this value and 0.175?
(c) What is the value for which you think the true proportion is equally likely to lie between 0.175 and this value as between this value and 1?
This provides $(q_{0.25,2}, q_{0.5,2}, q_{0.75,2})$, the lower quartile, median and upper quartile of $p_2 \mid p_1$. The same three quantiles of $p_3 \mid p_2$ can be elicited via:
Q5. In Q2, you identified that you think that the proportion of locations where capacity is at an acceptable reduced level is equally likely to be above or below 0.175. Suppose that this is correct. In light of this information:
(a) Are you still comfortable that it is equally likely that the proportion of locations where the capacity is at an unacceptable reduced level is above or below 0.01875? If so,
(b) What is the value for which you think the true proportion is equally likely to lie between 0 and this value as between this value and 0.01875?
(c) What is the value for which you think the true proportion is equally likely to lie between 0.01875 and this value as between this value and 1?
We notice a strength of vines which makes them an ideal structure for the prior distribution here. Vines are not symmetric in the sense that they do not treat the relationships between all of the variables equally, as opposed to a Gaussian copula for example. The only unconditional relationships to be specified in the vine are between adjacent nodes. Similarly, in this application, the multinomial categories have a specific structure. They are ordered. Thus by choosing the ordering as in (1), we can elicit both of the dependencies in the first tree of the vine by only conditioning on adjacent categories. It would be a much more cognitively challenging task for an expert to revise their quartiles for one category on observation of a non-adjacent category only. The vine structure allows us to avoid this.
Returning to the example, we saw in Q4 that the prior median for $p_1$ was 0.75. Suppose that, in Q4, the expert chooses to accept 0.175 as the median of $p_2$ conditional on $p_1 = 0.75$ and gives revised quartiles of $q_{0.25,2} = 0.165$ and $q_{0.75,2} = 0.185$. This compares to unconditional quartiles of $q^{\#}_{0.25,2} = 0.15$ and $q^{\#}_{0.75,2} = 0.188$. We can transform the revised quartiles to quartiles of $\theta_2 \mid \theta_1$, i.e., $q^{*}_{0.25,2} = 0.66$ and $q^{*}_{0.75,2} = 0.74$, using
$$q^{*}_{\alpha,2} = \frac{q_{\alpha,2}}{1 - q^{\#}_{0.5,1}}.$$
We see that the upper and lower quartiles reflect reduced uncertainty in light of the extra information. The expert further identifies that they would expect the dependence to be positive.
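As a worked check of this transformation, each quantile of $p_2 \mid p_1$ is divided by $1 - 0.75 = 0.25$, the proportion of locations not at the original level when $p_1$ is at its prior median. Only values quoted in the text are used.

```latex
\begin{align*}
q^{*}_{0.25,2} &= \frac{0.165}{1 - 0.75} = 0.66, \\
q^{*}_{0.5,2}  &= \frac{0.175}{1 - 0.75} = 0.70, \\
q^{*}_{0.75,2} &= \frac{0.185}{1 - 0.75} = 0.74.
\end{align*}
```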
We need to use the three revised quantiles for θ 2 | θ 1 to specify the bivariate copula c 1,2 (·, ·). We could do so by fitting a non-parametric copula which exactly matches the quantiles, for example using minimum information methods (Bedford et al., 2016). This in general would be under-constrained. Alternatively, we could impose a specific parametric copula, for example the Gaussian copula, for each bivariate copula in the vine and choose the parameter values to most closely match the elicited quantiles. This is more restrictive, but offers a more convenient form for inference.
Instead, for each bivariate copula we will consider a number of parametric copulas suitable for representing dependence: the Gaussian, Frank, Clayton, Gumbel and t-copulas. We fit each to the three quantiles given by the expert using least squares and then choose the copula which provides the best fit, again in the least squares sense. This approach offers more flexibility than assuming a single form for all copulas in the vine, and the resulting vine is less complex than one built from non-parametric copulas. An alternative approach would be to choose the copula based on the expert's knowledge of the dependencies and how this relates to properties of different copulas. We will assess the impact of this approach compared to assuming a Gaussian copula for all bivariate relationships in the next section.
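The least squares fit for a single candidate copula can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce Table 3: it treats only the Clayton copula, measures the squared differences on the probability scale (the text does not say which scale is used), uses a simple grid search, and works with quantiles already mapped to the copula scale by the marginal CDF. The "elicited" values are synthetic, generated from a known parameter so that the fit can be checked.

```python
from typing import Sequence, Tuple


def clayton_h(v: float, u: float, lam: float) -> float:
    """Conditional CDF P(V <= v | U = u) for a Clayton copula with lam > 0."""
    return u ** (-(lam + 1.0)) * (u ** -lam + v ** -lam - 1.0) ** (-(lam + 1.0) / lam)


def clayton_h_inverse(alpha: float, u: float, lam: float) -> float:
    """Value v such that P(V <= v | U = u) = alpha."""
    return (u ** -lam * (alpha ** (-lam / (lam + 1.0)) - 1.0) + 1.0) ** (-1.0 / lam)


def fit_clayton(
    v_quantiles: Sequence[float],
    levels: Sequence[float],
    u: float,
    grid_step: float = 0.01,
    lam_max: float = 20.0,
) -> Tuple[float, float]:
    """Grid-search least squares; returns (lam, sum of squared differences)."""
    best_lam, best_sse = grid_step, float("inf")
    for k in range(1, int(round(lam_max / grid_step)) + 1):
        lam = k * grid_step
        sse = sum((clayton_h(v, u, lam) - a) ** 2 for v, a in zip(v_quantiles, levels))
        if sse < best_sse:
            best_lam, best_sse = lam, sse
    return best_lam, best_sse


levels = (0.25, 0.5, 0.75)
u_median = 0.5  # conditioning on the prior median corresponds to u = 0.5
# Synthetic conditional quartiles on the copula scale (illustrative only).
v_synthetic = [clayton_h_inverse(a, u_median, 3.61) for a in levels]
lam_hat, sse = fit_clayton(v_synthetic, levels, u_median)
print(round(lam_hat, 2), sse)
```

Repeating the same fit with the conditional CDF of each other candidate family and comparing the resulting sums of squares gives the selection rule described above.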
In Table 3 we provide the parameter values which most closely match the revised quantiles and the fit, in terms of the sum of squared differences, for each of the candidate copulas for θ 2 | θ 1 .
From the table, we see that the Clayton copula provides the best fit to the elicited judgements of the expert, though the Frank, Gaussian and t-copulas also appear to fit the quantiles well. The parameter value chosen for the Clayton copula is $\lambda = 3.61$ and so we see positive dependence between $\theta_1$ and $\theta_2$.
Table 3: A comparison of each of the candidate copulas for the prior dependence between $(\theta_1, \theta_2)$.
We can see the differences between the fitted copulas, and the fits of the copulas to the elicited quantiles, in Figure 3, which plots the conditional CDF of $\theta_2 \mid \theta_1 = 0.75$ for each candidate copula together with the three elicited quantiles.
Figure 3: The conditional CDF of $\theta_2$ given $\theta_1 = q_{0.5,1}$ for the Gaussian copula (black), Frank copula (red), Clayton copula (green), Gumbel copula (light blue) and t-copula (pink), as well as the three elicited quartiles (dark blue).
We see from the figure that most of the copulas produce relatively similar conditional distributions, with all except the Gumbel and Clayton copulas appearing very similar. The Clayton copula provides a conditional distribution closest to the median and upper quartile. We choose the Clayton copula to represent the dependence between θ 1 and θ 2 . This gives a Kendall's Tau value of τ = 0.64, which indicates moderate correlation.
For the copula between θ 2 and θ 3 , based on the quantiles elicited in Q5, we can use the same approach. In this case, the Clayton copula again provides the best fit with λ = 3.06 and a value of Kendall's Tau of τ = 0.61.
In the second tree of the vine, we require the copula between $\theta_1$ and $\theta_3$ conditional on $\theta_2$. We again choose to ask questions concerning $p$ as they are cognitively simpler for the expert. In this case, we condition on $p_1$ and $p_2$ both taking their prior median values, $q^{\#}_{0.5,1} = 0.75$ and $q^{\#}_{0.5,2} = 0.175$, and ask the expert for revised quantiles for $p_3$. The specific questions are:
Q6. Suppose that the proportions of locations where capacity is at its original level and at an acceptable reduced level are 0.75 and 0.175 respectively. In light of this information:
(a) Are you still comfortable that it is equally likely that the proportion of locations where the capacity is at an unacceptable reduced level is above or below 0.01875? If so,
(b) What is the value for which you think the true proportion is equally likely to lie between 0 and this value as between this value and 0.01875?
(c) What is the value for which you think the true proportion is equally likely to lie between 0.01875 and this value as between this value and 1?
This provides revised quantiles for $p_3 \mid p_1, p_2$, which we convert to revised quantiles of $\theta_3 \mid \theta_1, \theta_2$ using the same approach as for the unconditional copulas. This provides information about the conditional CDF $F_{3|1,2}(\theta_3 \mid \theta_1, \theta_2)$ which, together with the conditional CDFs $F_{1|2}$ and $F_{3|2}$ obtained from the copulas in the first tree of the vine, allows us to specify $c_{1,3}(\cdot, \cdot)$. Full details are given in the general problem section.
Suppose that the quantiles resulting from Q6, are (q 0.25,3 , q 0.5,3 , q 0.75,3 ) = (0.018, 0.01875, 0.0195). The revised quantiles for θ 3 | θ 1 , θ 2 are then (0.24, 0.25, 0.26). Fitting each of the candidate copulas to these three quantiles results in the Frank copula providing the best fit. The parameter value which most closely matches the quantiles specified by the expert is λ = 0.98 and the Kendall's Tau value is τ = 0.11. We see that the correlation between θ 1 and θ 3 conditional on θ 2 is weaker than either of those in the first tree of the vine. This is partly as a result of θ 1 and θ 3 representing non-adjacent categories, i.e., original level and unacceptable reduced level. Thus, it is more suitable to consider this relationship in the second tree of the vine. This fully defines the prior distribution.
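The Kendall's tau values quoted for the fitted copulas can be checked from the parameter values. The sketch below assumes the standard parameterisations of the Clayton and Frank copulas, for which $\tau = \lambda / (\lambda + 2)$ and $\tau = 1 - (4/\lambda)\{1 - D_1(\lambda)\}$ respectively, where $D_1$ is the first Debye function. The function names and the midpoint-rule integration are illustrative choices.

```python
from math import expm1


def clayton_tau(lam: float) -> float:
    """Kendall's tau for a Clayton copula with parameter lam > 0."""
    return lam / (lam + 2.0)


def debye1(x: float, n: int = 2000) -> float:
    """First Debye function (1/x) * integral_0^x t / (exp(t) - 1) dt, for x > 0."""
    step = x / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * step  # midpoint rule avoids the removable singularity at t = 0
        total += t / expm1(t)
    return total * step / x


def frank_tau(lam: float) -> float:
    """Kendall's tau for a Frank copula with parameter lam > 0."""
    return 1.0 - 4.0 / lam * (1.0 - debye1(lam))


print(round(clayton_tau(3.61), 2))  # copula between theta_1 and theta_2
print(round(frank_tau(0.98), 2))    # copula between theta_1 and theta_3 given theta_2
```

Under these assumptions the two printed values agree with the quoted $\tau = 0.64$ and $\tau = 0.11$ to two decimal places.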

Impact of using a range of copulas
Let us suppose that there are safety implications for the structure if more than a quarter of the locations have a capacity in the unacceptable reduced or failed states. Then the engineers are interested in $\Pr(p_3 + p_4 > 0.25)$. We simulate from our prior distribution to obtain samples $\theta^{(j)} = (\theta_1^{(j)}, \theta_2^{(j)}, \theta_3^{(j)}, \theta_4^{(j)})$, for $j = 1, \ldots, 100{,}000$, and transform these to samples $p^{(j)} = (p_1^{(j)}, p_2^{(j)}, p_3^{(j)}, p_4^{(j)})$. We plot the resulting densities for $p$ in Figure 4. We calculate the proportion of samples in which $p_3 + p_4 > 0.25$, which provides our estimate of $\Pr(p_3 + p_4 > 0.25)$. We obtain 0.0123.
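Simulation from a D-vine proceeds by conditional inversion, one variable at a time. The following is a minimal sketch on the copula (uniform) scale only, assuming the fitted families and parameter values reported in the example (Clayton, Clayton and Frank) in their standard parameterisations. The function names are hypothetical. The final step, mapping each $u_i$ to $\theta_i$ through the marginal beta quantile function, is omitted because the Python standard library has no beta quantile function.

```python
import random
from math import exp, expm1, log1p
from typing import List, Tuple


def clayton_h(v: float, u: float, lam: float) -> float:
    """P(V <= v | U = u) under a Clayton copula."""
    return u ** (-(lam + 1.0)) * (u ** -lam + v ** -lam - 1.0) ** (-(lam + 1.0) / lam)


def clayton_h_inv(alpha: float, u: float, lam: float) -> float:
    """Inverse of clayton_h in its first argument."""
    return (u ** -lam * (alpha ** (-lam / (lam + 1.0)) - 1.0) + 1.0) ** (-1.0 / lam)


def frank_h_inv(alpha: float, u: float, lam: float) -> float:
    """Value v with P(V <= v | U = u) = alpha under a Frank copula."""
    b = alpha * expm1(-lam) / (alpha + (1.0 - alpha) * exp(-lam * u))
    return -log1p(b) / lam


def sample_dvine(
    n: int, lam12: float = 3.61, lam23: float = 3.06, lam13: float = 0.98, seed: int = 1
) -> List[Tuple[float, float, float]]:
    """Draw n triples (u1, u2, u3) from the three-dimensional D-vine 1-2-3."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Guard against an exact zero, which the inverse functions cannot handle.
        w1, w2, w3 = (max(rng.random(), 1e-12) for _ in range(3))
        u1 = w1
        u2 = clayton_h_inv(w2, u1, lam12)
        # F(u3 | u1, u2) = h13( h23(u3 | u2) | h12(u1 | u2) ): invert from the outside in.
        t = frank_h_inv(w3, clayton_h(u1, u2, lam12), lam13)
        u3 = clayton_h_inv(t, u2, lam23)
        draws.append((u1, u2, u3))
    return draws


draws = sample_dvine(5)
print(draws[0])
```

Since $\theta_4 = 1$, the unit sum constraint gives $p_3 + p_4 = (1 - \theta_1)(1 - \theta_2)$, so once the marginal quantile functions have been applied the estimate of the tail probability is the proportion of draws with $(1 - \theta_1)(1 - \theta_2) > 0.25$.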
We can also fit a vine to the specifications of the expert in which we make each bivariate copula a Gaussian copula. If we do so, then the correlation parameter $\rho$ takes the values 0.85, 0.79 and 0.04 for $c_{1,2}(\cdot, \cdot)$, $c_{2,3}(\cdot, \cdot)$ and $c_{1,3}(\cdot, \cdot)$ respectively. Under this vine, $\Pr(p_3 + p_4 > 0.25) = 0.0102$. Thus, by allowing flexibility in the choice of copulas in the vine, and fitting the parametric copula which most closely matches the expert's specifications, the estimated probability of safety implications for the structure increases by 21%. This demonstrates the value of taking the extra effort to investigate each of the candidate copulas.
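The relative increase follows directly from the two estimates quoted above:

```latex
\[
\frac{0.0123 - 0.0102}{0.0102} = \frac{0.0021}{0.0102} \approx 0.206,
\]
```

which is approximately 21%.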
We can see the differences between the joint densities under the flexible choice of copulas and the vine built only using the Gaussian copula in Figure 5, where we have plotted the joint prior density between p 3 and p 4 from the simulations.
In particular, we see stark differences in the tail behaviour under the two vines.

D-vine prior distribution structure
Suppose that we have counts in $(m + 1)$ categories, $Y = (Y_1, \ldots, Y_{m+1})$, and, conditional on the probabilities of each of the categories, $(p_1, \ldots, p_{m+1})$, they follow a multinomial distribution $Y \mid (p_1, \ldots, p_{m+1}) \sim \mathrm{MN}(N, (p_1, \ldots, p_{m+1}))$, for some number of observations $N$. The probabilities are constrained to sum to one and so we set $p = (p_1, \ldots, p_m)$ and then $p_{m+1} = 1 - \sum_{i=1}^{m} p_i$. To perform inference we require a prior distribution $f^{(0)}(p)$ representing the beliefs of an expert. It will typically be the case that $p_i$ and $p_j$ are not independent, $p_i \not\perp\!\!\!\perp p_j$, for $i \neq j$. We wish to define a flexible prior distribution with this property.
Figure 5: The prior joint density between $p_3$ and $p_4$ using the best fitting parametric bivariate copulas (left) and using only Gaussian copulas (right).
We will use a D-vine to construct such a distribution. A background on copulas and vines is given in the Supplementary Material (Wilson, 2017).
We will use a D-vine to represent the prior distribution $f^{(0)}(\theta)$. The vine is fully defined by the ordering of the variables in the first tree of the vine. Suppose this ordering is $\tilde{\theta} = (\tilde{\theta}_1, \ldots, \tilde{\theta}_m)$. Then the structure of the vine is given in Figure 6.
In some cases, there will be a natural ordering of the categories which should be maintained in $\tilde{\theta}$. This was the case in the application and will always be the case if the multinomial variable represents the partitioning of a continuous univariate variable. If there is no natural ordering, the ordering $\tilde{\theta}$ should be chosen to assess the strongest dependencies in the first tree of the vine. This is consistent with advice in Aas et al. (2009) and Min and Czado (2011).
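The prior distribution is given by the vine distribution representing the D-vine in Figure 6. This is
$$f^{(0)}(\tilde{\theta}) = \prod_{i=1}^{m} f^{(0)}_i(\tilde{\theta}_i) \prod_{i=1}^{m-1} \prod_{j=i+1}^{m} c_{i,j}\left\{ F^{(0)}_{i|i+1,\ldots,j-1}(\tilde{\theta}_i \mid \tilde{\theta}_{i+1}, \ldots, \tilde{\theta}_{j-1}), \; F^{(0)}_{j|i+1,\ldots,j-1}(\tilde{\theta}_j \mid \tilde{\theta}_{i+1}, \ldots, \tilde{\theta}_{j-1}) \right\},$$
where $f^{(0)}_i(\tilde{\theta}_i)$ is the marginal prior PDF of $\tilde{\theta}_i$, $F^{(0)}_i(\tilde{\theta}_i)$ is the marginal prior CDF of $\tilde{\theta}_i$ and $F^{(0)}_{i|i+1,\ldots,j-1}(\tilde{\theta}_i \mid \tilde{\theta}_{i+1}, \ldots, \tilde{\theta}_{j-1})$ is the prior conditional CDF of $\tilde{\theta}_i \mid \tilde{\theta}_{i+1}, \ldots, \tilde{\theta}_{j-1}$. For $j = i + 1$ the conditioning set is empty and the conditional CDFs reduce to the marginal CDFs. In order to write the prior density in this form we have made the assumption of a simplified vine. Full details of this are given in the Supplementary Material (Wilson, 2017).
Figure 6: The structure of a general D-vine in $m$ dimensions.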
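In order to fully specify the D-vine we need to specify the marginal distributions of each of the probabilities, $f^{(0)}_i(\tilde{\theta}_i)$, the unconditional copulas in the first tree of the vine, $c_{i,i+1}(\cdot, \cdot)$, and the conditional copulas in trees 2 to $m - 1$ of the vine, $c_{i,j}(\cdot, \cdot)$. We do not need to specify the prior conditional distributions $F^{(0)}_{i|i+1,\ldots,j-1}(\cdot \mid \cdot)$, as they can be calculated from the other specifications. For example, if we require $F^{(0)}_{i|i+1}(\tilde{\theta}_i \mid \tilde{\theta}_{i+1})$, we can use
$$f^{(0)}_{i|i+1}(\tilde{\theta}_i \mid \tilde{\theta}_{i+1}) = c_{i,i+1}\left\{F^{(0)}_i(\tilde{\theta}_i), F^{(0)}_{i+1}(\tilde{\theta}_{i+1})\right\} f^{(0)}_i(\tilde{\theta}_i)$$
and integrate over $\tilde{\theta}_i$. These prior conditional distributions from the second tree of the vine can then be used to calculate the prior conditional distributions in the third tree of the vine, for example $F^{(0)}_{i|i+1,i+2}(\tilde{\theta}_i \mid \tilde{\theta}_{i+1}, \tilde{\theta}_{i+2})$, and so on.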

Elicitation and specification
For each conditional probability $\tilde{\theta}_i$, we require a marginal prior distribution with support on $[0, 1]$. A flexible and much used prior distribution is the beta distribution. We define $\tilde{\theta}_i \sim \mathrm{beta}(a_i, b_i)$, $i = 1, \ldots, m$, and so we need to elicit information to specify $(a_i, b_i)$.
We choose to elicit three quantiles to specify these two parameters, which provides a consistency check on the quantiles (Garthwaite et al., 2005). We elicit the median q 0.5,i , a quantile below the median, q L,i and a quantile above the median, q U,i . In the example we chose to elicit the lower and upper quartiles, q 0.25,i and q 0.75,i , which allowed us to use the bisection method. Other common choices are (q 0.33,i , q 0.67,i ) and (q 0.05,i , q 0.95,i ). There are no exact values for (a i , b i ) in general which match all three quantiles. We can, however, find exact beta parameter values for each pair of quantile specifications from the expert.
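Suppose that the exact parameter values are $(a_{i,1}, b_{i,1})$ for $(q_{L,i}, q_{0.5,i})$, $(a_{i,2}, b_{i,2})$ for $(q_{0.5,i}, q_{U,i})$ and $(a_{i,3}, b_{i,3})$ for $(q_{L,i}, q_{U,i})$. We can find the means and variances of each of these beta distributions,
$$\mu_{i,j} = \frac{a_{i,j}}{a_{i,j} + b_{i,j}}, \qquad \sigma^2_{i,j} = \frac{a_{i,j} b_{i,j}}{(a_{i,j} + b_{i,j})^2 (a_{i,j} + b_{i,j} + 1)}.$$
We specify the prior distribution for $\tilde{\theta}_i$ by setting its mean and variance to be weighted averages of $(\mu_{i,j}, \sigma^2_{i,j})$, $j = 1, 2, 3$:
$$\mu_i = \sum_{j=1}^{3} w_{i,j} \mu_{i,j}, \qquad \sigma^2_i = \frac{1}{w_i} \sum_{j=1}^{3} w^2_{i,j} \sigma^2_{i,j},$$
where $\sum_{j=1}^{3} w_{i,j} = 1$ and $w_i = \sum_{j=1}^{3} w^2_{i,j}$. The beta distribution parameters which satisfy this mean and variance are
$$a_i = \mu_i \left\{ \frac{\mu_i (1 - \mu_i)}{\sigma^2_i} - 1 \right\}, \qquad b_i = (1 - \mu_i) \left\{ \frac{\mu_i (1 - \mu_i)}{\sigma^2_i} - 1 \right\}.$$
We also need to specify each of the bivariate copulas in the vine, which represent the dependencies between the probabilities. To do so, we condition on the values of some of the probabilities and ask for revised quantiles for a further probability. For this to be a manageable task for the expert, we choose to elicit quantiles of the elements of $\tilde{p}$, the ordered unconditional probabilities, then convert them to quantiles of the elements of $\tilde{\theta}$.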
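The moment-matching step for the marginal beta distributions described above can be sketched as follows. This is an illustration under stated assumptions: the variance is taken to be the average of the three variances with weights $w^2_{i,j} / w_i$, which is the reading suggested by the normalising constant $w_i$; equal weights are used by default; and the three parameter pairs in the example are hypothetical, not the values in Table 2.

```python
from typing import Sequence, Tuple


def beta_mean_var(a: float, b: float) -> Tuple[float, float]:
    """Mean and variance of a beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var


def aggregate_beta(
    params: Sequence[Tuple[float, float]],
    weights: Sequence[float] = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0),
) -> Tuple[float, float]:
    """Combine the pairwise beta fits into one beta(a, b) by matching two moments."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    moments = [beta_mean_var(a, b) for a, b in params]
    mean = sum(w * m for w, (m, _) in zip(weights, moments))
    w_norm = sum(w * w for w in weights)  # assumed normalising constant w_i
    var = sum(w * w * v for w, (_, v) in zip(weights, moments)) / w_norm
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


# Hypothetical exact fits to the three pairs of quantiles for one theta_i.
pairwise_fits = [(14.0, 6.0), (12.0, 5.0), (16.0, 7.5)]
a_i, b_i = aggregate_beta(pairwise_fits)
print(round(a_i, 3), round(b_i, 3))
```

When the three pairwise fits coincide, the aggregated parameters reproduce them exactly, so large differences between the pairwise fits and the aggregate are a direct signal that the expert's quantiles are not mutually consistent.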
There are many different parametric bivariate copulas which could be used as $c_{1,j}(\cdot, \cdot)$. We advocate choosing, from a number of candidates, the bivariate copula which minimises the sum of squared differences from the expert's revised quantiles. Suitable candidates for both positive and negative dependence are the Gaussian, t- and Frank copulas; for positive dependence only, the Clayton and Gumbel copulas; and for negative dependence only, the rotated Clayton and rotated Gumbel copulas. Details of each of these copulas are given in the Supplementary Material (Wilson, 2017).
This procedure will specify each of the bivariate copulas on the very left of each tree of the vine. To specify the subsequent copulas in each tree, for example the $k$th copulas, we ask the expert to suppose that an observation has not fallen into the first $(k - 1)$ categories. This redefines the remaining unconditional probabilities as
$$\tilde{p}^{*}_j = \frac{\tilde{p}_j}{1 - \sum_{l=1}^{k-1} \tilde{p}_l}, \quad j \geq k.$$
The method above will then specify all copulas of the form $c_{k,j}(\cdot, \cdot)$, for $j > k$.

Summary
In this paper we have considered the problem of specifying a prior distribution for probabilities in a multinomial model. We have proposed a method based on a structure called a D-vine, in which the specification of the multivariate prior distribution reduces to that of the marginal distributions, the ordering of the probabilities and the conditional and unconditional bivariate copulas. This provides a flexible structure which avoids both the underparameterisation associated with conjugate priors such as the Dirichlet distribution and the need to specify a positive definite correlation matrix directly.
While this paper focuses on multinomial models, there is much scope to utilise this approach to specify informative prior distributions more generally. The vine structure separates the marginal and dependency specifications and allows the Bayesian statistician to relate questions on observable quantities for elicitation to the parameters in the vine. With the recent advances in Bayesian inferential techniques for vines, the use of vines as a structure for multivariate prior distributions incorporating dependency is now very appealing.