Hierarchical model for forecasting the outcomes of binary referenda

A Bayesian hierarchical model is proposed to forecast outcomes of binary referenda based on opinion poll data acquired over a period of time. It is demonstrated how the model provides a consistent probabilistic prediction of the final outcomes over the preceding months, effectively smoothing the volatility exhibited by individual polls. The method is illustrated using opinion poll data published before the Scottish independence referendum in 2014, in which Scotland voted to remain a part of the United Kingdom, and is subsequently validated on data related to the 2016 referendum on the continuing membership of the United Kingdom in the European Union. © 2018 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Opinion polls provide predictions of the outcomes of voting events such as elections or referenda, based on samples drawn from the population that is eligible to vote. Forecasting from opinion polls can be of widespread interest: to the public, to the media, to those running campaigns, or to individuals standing for election. The forecasts can also aid decision making, for example on the amount of money spent on campaigning, by providing a description of uncertainty about the outcome of the voting event.
In this article, we propose a method for predicting the outcome of two referenda recently held in the United Kingdom. The first one is the Scottish independence referendum, held on 18 September 2014, and the second one is the referendum on the UK's membership in the European Union (EU) that took place on 23 June 2016. We use a Bayesian hierarchical model to capture the dynamics of the opinion polls. The share of the votes in the polls is assumed to be sampled from a multinomial distribution with overdispersion and a possibility of polling company bias. Next, the logit-transformed probabilities of voting Yes, voting No and being Undecided are assumed to follow a stationary Ornstein-Uhlenbeck process. We provide a framework that can be subsequently generalised to forecast voting outcomes on the polling day where there are more than two options, such as parliamentary elections with several parties, or to forecast turnout as well as the outcome.
The paper is structured as follows. In Section 2, we review the literature on opinion poll forecasting, and provide a general background for both the Scottish independence and the EU membership referenda. Section 3 contains a description of the opinion poll data used for making predictions of referendum outcomes. In Section 4 we present the forecasting model. The results are described and analysed in Section 5, with detailed outcomes presented for the Scottish referendum, and the EU membership referendum used for external model validation. In Section 6 we conclude and suggest issues for further research.

Forecasting from opinion polls
Forecasting the result of voting events has long been of great interest in democratic countries. The most prominent examples in the statistical literature describe methods for predicting the share of votes for political parties and presidential candidates in national elections, based on pre-election opinion polls (Campbell, 1992; Jackman, 2005, 2009; Lock and Gelman, 2010; Silver, 2012; Linzer, 2013). There are also models for forecasting the outcome of an election, such as a UK general election, based on exit polls (Brown et al., 1999; Curtice and Firth, 2008). These models have been successful in predicting the outcome before the final result is declared. The earlier the polls take place before the event, the less precise they are, as opinion changes over time (e.g., Campbell and Wink, 1990; Gelman and King, 1993; Campbell, 1996). Nevertheless, they can be useful in identifying trends in preferences of the electorate.
In many models used to forecast US national election outcomes, covariate information, called 'fundamentals', is used to create initial forecasts. These, in turn, are used as prior input into a binomial model (Campbell, 1992; Lock and Gelman, 2010; Linzer, 2013). Fundamentals are based on theories of retrospective voting, according to which voters tend to punish current authorities for economic or social crises (e.g., Kinder and Kiewiet, 1981; Nadeau and Lewis-Beck, 2001; Duch and Stevenson, 2008; citations after Linzer, 2013). Hence, such fundamentals include economic growth, unemployment rate and geographical variation (e.g., Campbell, 1992; Linzer, 2013). Campbell (1992) predicted the outcome of a presidential election for each state in the US. He applied a linear regression with 16 covariates measured at national, regional and state levels, which include, amongst others, macroeconomic variables, opinion polls from the early campaign, the state's voting record in the previous two elections, incumbency of a candidate and measures of partisan shifts over time. The use of linear regression to predict an outcome bounded between zero and one is justified by the fact that the predicted shares are likely to lie around 0.5 rather than near the boundaries. Lock and Gelman (2010) analysed deviations of opinion on the state level from the nationwide average. Their Bayesian forecasting model is based on a normal approximation to a binomial outcome. Their method integrates past election data to form prior distributions for each state outcome, and combines them with information from the state-level opinion polls to create posterior distributions of the shares of vote before the election. To account for overdispersion due to survey issues, such as weighting or clustering, as well as uncertainty about opinion shifts between the forecast and the election day, the prior for the variance includes information from past opinion polls averaged over states.
The model proposed by Linzer (2013) extends the Lock and Gelman (2010) approach by explicitly introducing binomial variability and allowing the model to borrow strength not only across states, but also over time. This is achieved by introducing a random walk model, similar to that of Jackman (2005), for national-level and state-level effects. The logit of the share of the votes is the sum of these effects. Again, historical state-level forecasts are used to form a prior distribution for the state-level effects.
In the models described above, the historic data on actual elections can be utilised for forecasting the outcome of presidential or parliamentary elections, as well as assessing the performance of the opinion polls in predicting their outcomes. The 2014 Scottish referendum had no direct precedent, this being the first time that the choice of full independence had been offered to the Scottish electorate. Therefore, there are no data with which to assess the influence of fundamentals on the opinion of the electorate. Furthermore, within the time frame for analysis, fundamental macroeconomic variables do not vary greatly. We consider that the influence of other factors, such as the position of the UK government and the main opposition party on the independence question (both pro-Union), can be more effectively incorporated in the analysis through expert-based priors. A Bayesian hierarchical model for the evolution of public opinion over time, based on a series of opinion polls, which does not utilise fundamentals, is the state-space model described by Jackman (2005, 2009). This model assumes that the result of the poll, denoted by y, is a normal variable with a mean that includes 'true' intentions and a polling company bias (also called 'house effects'), and a variance approximated from the central limit theorem as y(1 − y)/n, where n is the sample size, as in Lock and Gelman (2010). This model, however, does not allow for overdispersion. Polls are pooled over time assuming random walk models for the true intentions. The polling company bias is estimated by assuming that the trajectory converges to the known result of the vote on the day of the election.
An alternative approach of Bunker and Bauchowitz (2016) is based on a two-stage model. In the first stage, the poll results are aggregated using weighting procedures that account for the accuracy of the pollster, the margin of error resulting from the size of the poll and the timing of the poll. In the second stage, the variability of the voting preferences is taken into account by using a multinomial distribution with a conjugate Dirichlet prior.
There are also a number of articles in the literature that provide models for forecasting the outcome of the voting event on the British general election night (Brown et al., 1999; Curtice and Firth, 2008). These models are useful for predicting the outcome immediately before the event takes place or before the final result has been published. In general, they estimate the deviations of votes from the known outcome of the previous election.
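The conjugate step mentioned above can be sketched in a few lines. This is only an illustration of the generic Dirichlet-multinomial update, not a reproduction of the Bunker and Bauchowitz (2016) procedure: their first-stage weighting is omitted, and the prior parameters, the poll counts and the function names are hypothetical.

```python
def dirichlet_posterior(prior_alpha, counts):
    """Conjugate update: a Dirichlet(prior_alpha) prior combined with
    multinomial counts gives a Dirichlet(prior_alpha + counts) posterior."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior and counts must have the same length")
    return [a + c for a, c in zip(prior_alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution: each parameter over the total."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical flat prior and hypothetical (Yes, No, Undecided) counts.
prior = [1.0, 1.0, 1.0]
poll_counts = [420, 480, 100]

posterior = dirichlet_posterior(prior, poll_counts)
print(dirichlet_mean(posterior))
```

With a flat prior the posterior mean stays close to the raw poll shares; a more informative prior would pull the estimate towards the prior proportions.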
In our model, we extend Jackman's approach to allow us to model simultaneously turnout and the binary outcome of the poll. We also use a logistic model which implies that the variance is not a data-dependent approximation as it is in Jackman (2005) and Lock and Gelman (2010). We model the variability of the polls over time by using a stationary Ornstein-Uhlenbeck process, which accounts for the fact that polls were infrequent in the early stage of the campaign but very frequent close to the referendum day. Following Lock and Gelman (2010), we also explicitly allow for overdispersion due to the survey issues and the fact that variability between polls is not simply a function of the underlying time-varying proportions. Finally, in our forecasts we assume that the outcome of election remains unknown to the forecasters.

The Scottish referendum of 2014
Scotland and England have had a shared sovereign since James VI and the Union of the Crowns of 1603. However, after the Acts of Union of 1707 the Scottish Parliament was incorporated into the Parliament of the Kingdom of Great Britain, with its seat in Westminster, effectively ending the era of independence of both countries. In modern times, the question of self-rule and possible independence of Scotland returned to the political agenda over 250 years later, mainly since the 1970s. The first referendum on establishing a separate Scottish Parliament took place on 1 March 1979. Although 52% of voters were in favour, the result did not meet the special condition outlined in the Scotland Act 1978 that 40% of the entire electorate had to vote in favour (BBC, 1997a).
As a result of the referendum on 11 September 1997, the Scottish Parliament was convened by the Scotland Act 1998 (BBC, 1997b; Scottish Parliament, 2014a). The Scottish Parliament has the power to make laws in so-called Devolved Matters, such as education, justice or health. The Reserved Matters are outside the competence of the Scottish Parliament and remain with the UK Parliament in Westminster. They include, amongst others, defence, foreign policy, energy, and immigration (Scottish Parliament, 2014b).
The independence referendum that took place on 18 September 2014 was enabled by the Scottish Independence Referendum Act 2013 (Scottish Parliament, 2013). It asked residents of Scotland a question: 'Should Scotland be an independent country?' If the majority of respondents had answered positively, Scotland, according to the Scottish Government White Paper Scotland's Future (Scottish Government, 2013), would have been able to decide about all Reserved Matters, with the monarch of the UK remaining the head of state.
The motivation for developing a model to forecast the outcome of the referendum arose from research on future Scottish migration. As mentioned above, migration policy implemented in Scotland is currently that of the entire UK (McCollum et al., 2014). Forecasts of Scottish migration, prepared before the referendum, had to take into account the uncertainty of the referendum outcome. This was achieved by incorporating the probability that Scotland will vote 'Yes' in the forecasting framework. This probability was used for averaging the forecasts of Scottish migration under two scenarios: independence and status quo. The model developed in the current paper was designed to provide this probability for particular dates prior to the referendum.

The UK membership in the EU referendum of 2016
On 1 January 1973, the UK joined the European Communities (in 1992 transformed into the European Union; see e.g. Church and Phinnemore, 1995). This was confirmed by the 1975 European Communities referendum, which saw a 65% turnout and 67% of voters supporting the membership (BBC, 1975). The 2016 referendum on the continuation of the membership in the European Union (EU) was called by the then Prime Minister David Cameron, amidst the rounds of renegotiating the conditions of the UK's future EU membership (UK Parliament, 2016; Nugent, 2017).
The referendum took place on 23 June 2016 and, despite the opinion polls suggesting the outcome in favour of the Remain camp (e.g. BBC, 2016), 51.9% of voters chose to leave the EU, with turnout just above 72% (Electoral Commission, 2016). The inconsistency of the polls with the final outcome spurred a discussion on the reasons for discrepancies between the opinion polls and the outcomes of voting events, similar to the one after the 2015 general election (Sturgis et al., 2016). The opinion poll data for the EU membership referendum are used to estimate the probability of the 'Leave' outcome and validate the model developed for forecasting the Scottish referendum.

Scottish independence opinion polls data
The data on opinion polls for forecasting the Scottish referendum were obtained from the UK Polling Report website for each of the ten polling companies. These companies apply various methodologies for polling: YouGov (online polling from a recruited panel of respondents); Survation (online interviews, telephone and face-to-face); TNS-BMRB (face-to-face computer-assisted interviews at respondents' homes); ICM (online polling with stratification by demographic characteristics); Ipsos-Mori (computer-assisted telephone interviews with quota sampling); Panelbase (online polling); Progressive (online polling and telephone interviews); Angus Reid (online polling); Ashcroft (documentation unclear but apparently face-to-face interviews, which seem to have been subcontracted to different companies); and Opinium (online interviews). The first observation comes from 29 January 2012, and the last polls were carried out on the eve of the referendum, on 17 September 2014. The voter categories 'undecided' and 'not voting' have been merged into one category called 'Undecided', though the category of those 'not voting' was used in very few polls. Proportional corrections to the Undecided were made if the shares of votes did not sum to one. In the upper plot of Fig. 1 we present the percentage of respondents to the opinion poll who declared an intention to vote in favour of independence, out of the total number of those expressing an intention to vote. The lower plot shows the percentage of respondents who were classified as 'Undecided' at the time of each poll.

EU membership opinion polls data
The data on opinion polls for the EU membership referendum were obtained from the same source as above, and are presented in Fig. 2 for each of the ten polling companies. The last category, 'Other', contains polls from companies that polled only once or twice during the considered period. Here, the first observation comes from 9 September 2010 and the last six polls were collected on the eve of the referendum. For a clearer picture of the last days before the referendum, the figures in the Online Supplementary Material depict the data with the time scale converted into the natural logarithm of the number of days left until the referendum day. Given that the data come from the same source as for the Scottish referendum, the same general remarks and caveats as discussed above hold with respect to their key features, interpretations and limitations.

Data assessment and assumptions
Throughout the paper, we assume that the main sources of error are the sampling error, variability over time and overdispersion. Possible causes of overdispersion include survey issues such as non-independence of respondents or differences in methodologies used by the polling companies, as well as other factors contributing to differences in polls carried out on the same day such as measurement error. We also allow for the fact that individual polling companies may exhibit a consistent bias across time, but we assume that the average bias across companies is zero. This strong assumption is a direct consequence of the fact that we did not know the outcome of the referendum at the moment of creating a forecast. If the final result is known, the approach of Jackman (2005) can be used to learn about the polling company biases. Alternatively, if there were relevant data on the historic performance of pre-referendum opinion polls, this assumption could be relaxed.
In general, the polling company bias can stem from sources such as the mode of data collection (Jäckle et al., 2010), item non-response (National Research Council et al., 2013) or type of survey (probability and non-probability surveys, see e.g. Sturgis et al., 2016; Baker et al., 2013), to name a few. For instance, face-to-face interviews may induce social desirability bias and the reporting of behaviours which do not actually occur. This bias tends to be reduced when using telephone interviews (Presser and Stinson, 1998; DeBell et al., 2018). However, in the current application, we do not have access to such details for most of the polls used, as we rely on available aggregate summaries of the data described in previous subsections.
We recognise that the usual polling techniques used for elections may not provide such accurate results for a referendum for which polling companies have no previous experience. The main two problems that may distort the accuracy of the opinion polls are overestimation of turnout and differential response rate (e.g., McAllister and Studlar, 1991;Lynn and Jowell, 1996;Wells, 2014). The differential response rate between prospective Yes and No voters, which, if present here, might be expected to indicate a reluctance to reveal a preference for Scotland remaining in the UK or UK leaving the EU, can lead to a bias in opinion poll results.
The one-off referendum, on the one hand, may engage those who stay at home for parliamentary elections, which would diminish the bias. On the other hand, polls usually overestimate the turnout, mainly due to the social desirability and acquiescence response biases, which can put the validity of the entire prediction of the outcomes into question (DeBell et al., 2018). In their experimental study, Großer and Schram (2010) find that the turnout depends on the information from the opinion polls and on floating (undecided) voters. A decrease in turnout can be expected if polls indicate a landslide victory of one option; conversely, turnout increases when polls show a narrowing gap between the two options. Further, the floating voters are almost entirely responsible for the turnout boosts. Interestingly, in 2004, the pollster Populus categorised 35 percent of the electorate in the UK as floating voters (The Times, 2004).

Model specification
In this section we present the bespoke model for forecasting the referendum outcome based on the opinion poll data. Even though the model was initially designed to fit the data on the Scottish referendum, its generality has been subsequently assessed and confirmed in the context of the referendum on the EU membership.

Forecasting model
In our application, we consider three categories for voting intentions: Yes (1), No (2) and Undecided (3). Let $y_i = (y_{i1}, y_{i2}, y_{i3})$ be a vector containing the number of persons declaring to vote for each category in the $i$th opinion poll taken at time $t(i)$ days before the referendum, by polling company $c(i)$, with a sample size $n_i$. For each poll $i$, $t(i)$ takes a value in $\{t_1, \dots, t_D\}$, representing $D$ distinct polling occasions, and $c(i)$ takes a value in $\{1, \dots, 10\}$, as there are ten polling companies.
To simplify, we assume that the results of a poll are published at the same time on a given day.
In particular, we assume that

$$ y_i \sim \text{Multinomial}(n_i, p_i), \tag{1} $$

where $p_i = (p_{i1}, p_{i2}, p_{i3})$ is the corresponding probability vector. Let $q_i = (q_{i1}, q_{i2})$ be the logit transformation of the category Yes against No, and of the Yes and No categories jointly against the Undecided,

$$ q_i = \left( \log\frac{p_{i1}}{p_{i2}},\; \log\frac{p_{i1} + p_{i2}}{p_{i3}} \right). \tag{2} $$

The logit transformation in (2) ensures that the estimated probabilities $p_{i1}$, $p_{i2}$, $p_{i3}$ remain bounded by 0 and 1.
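To make the transformation concrete, the following minimal sketch maps a probability vector to the two logits and back. It assumes the transformation takes the form described in the text, that is the log-odds of Yes against No and of Yes and No jointly against Undecided; the function names and the example shares are hypothetical.

```python
import math


def to_logits(p_yes, p_no, p_undecided):
    """Map (Yes, No, Undecided) probabilities to the two logits."""
    q1 = math.log(p_yes / p_no)                    # Yes against No
    q2 = math.log((p_yes + p_no) / p_undecided)    # decided against Undecided
    return q1, q2


def from_logits(q1, q2):
    """Invert the transformation; the result always sums to one."""
    decided = 1.0 / (1.0 + math.exp(-q2))            # P(Yes or No)
    yes_given_decided = 1.0 / (1.0 + math.exp(-q1))  # P(Yes | decided)
    p_yes = decided * yes_given_decided
    p_no = decided * (1.0 - yes_given_decided)
    return p_yes, p_no, 1.0 - decided


# Hypothetical poll shares, used only to check the round trip.
q = to_logits(0.45, 0.47, 0.08)
print(q, from_logits(*q))
```

Whatever real values the two logits take, the back-transformed probabilities lie strictly between 0 and 1, which is the property the text relies on.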
For $q_i$ we make the following assumptions. First, we assume overdispersion resulting from the issues discussed in Section 3.3. Second, we account for the possibility of polling company biases, and introduce parameter $\gamma$ to capture it. Finally, we include a parameter $\rho$ to allow for correlation between $q_{i1}$ and $q_{i2}$. Hence, a hierarchical model for $q_i$ is as follows:

$$ q_i \sim N\left( \begin{pmatrix} \theta_1(t(i)) + \gamma_{c(i)1} \\ \theta_2(t(i)) + \gamma_{c(i)2} \end{pmatrix},\; \Sigma \right), \tag{3} $$

where $N(\mu, \Sigma)$ denotes a normal distribution with mean $\mu$ and variance matrix $\Sigma$, which we parameterise using marginal precisions $\tau_{21}$ and $\tau_{22}$ and correlation $\rho$. In (3), $\theta_1(t)$ and $\theta_2(t)$ are logits of the corresponding proportions in the general population at a time $t$ days before the referendum. Such a specification assumes that the only difference between opinion polls on a given day $t$ is due to the polling company biases $\gamma_c = (\gamma_{c1}, \gamma_{c2})$, $c \in \{1, \dots, 10\}$. These effects are constrained to have a zero mean and are assumed to follow a univariate normal distribution:

$$ \gamma_{ck} \sim N(0, 1/\tau_{3k}), \quad k = 1, 2. \tag{4} $$

The zero mean here reflects the assumption, discussed in Section 3.3, that the average bias across polling companies is zero.
Next, we assume that the underlying logits $\theta_k(t)$ follow independent Ornstein-Uhlenbeck processes. For $k = 1, 2$, we define $\theta_k = (\theta_{1k}, \dots, \theta_{Dk})$ as logit-transformed proportions in the general population on the dates of the opinion polls in our sample, so that $\theta_{jk} = \theta_k(t_j)$. Therefore, the model for $\theta_k$ is a discretely observed Ornstein-Uhlenbeck process indexed by $j = 1, \dots, D$, with $0 \le \alpha_k < 1$. Function $f(t_j, \beta_k)$ is the marginal mean of $q_{ik}$, for any poll $i$ taken on day $t_j$, and specifies the time trend using a vector parameter $\beta_k$. Parameter $\alpha_k$ controls the extent of serial correlation. For function $f$ we consider constant, linear, quadratic and cubic trend specifications. For example, for the quadratic trend we have

$$ f(t, \beta_k) = \beta_{k0} + \beta_{k1} t + \beta_{k2} t^2. $$

By setting the date of the referendum as the origin of the time variable $t$ ($t = 0$), the intercept parameter $\beta_{10}$ is the marginal expected logit of the probability of a Yes outcome in the referendum. The forecasted share of votes in favour of independence, $s_0$, is given by

$$ s_0 = \frac{\exp(\theta_1(0))}{1 + \exp(\theta_1(0))}, \tag{6} $$

where $\theta_1(0)$ is a forecast forward to $t = 0$. Hence, $s_0$ is the key parameter of interest, taking into account both the underlying trend $\beta_{10}$ and recent divergences of opinion from this trend.
The best fitting trend is selected by using the Deviance Information Criterion (Spiegelhalter et al., 2002). Time trend, autocorrelation and precision parameters are specific to the level of the model hierarchy $k$, as the odds Yes/No ($q_{i1}$) and (Yes+No)/Undecided ($q_{i2}$) can behave in different ways over time. In general, different specifications of the time trend $f$ for both logits are also possible.
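The behaviour of such a process at irregularly spaced polling dates can be illustrated with a short simulation. The sketch below uses the textbook exact discretisation of a stationary Ornstein-Uhlenbeck process around a quadratic trend, in which the correlation between two dates is `alpha` raised to the number of days between them. This parameterisation, the treatment of `tau` as the marginal (stationary) precision, and all numerical values are assumptions made for illustration and may differ from the authors' specification.

```python
import math
import random


def quadratic_trend(t, beta):
    """Trend b0 + b1*t + b2*t^2, with t measured in days before the vote."""
    b0, b1, b2 = beta
    return b0 + b1 * t + b2 * t * t


def simulate_ou_path(times, beta, alpha, tau, rng):
    """Simulate logits at the given times (days before the referendum,
    in decreasing order). Assumed form: deviations from the trend have
    correlation alpha**gap across a gap of `gap` days and marginal
    variance 1/tau."""
    marginal_sd = math.sqrt(1.0 / tau)
    theta = [quadratic_trend(times[0], beta) + rng.gauss(0.0, marginal_sd)]
    for prev_t, t in zip(times, times[1:]):
        corr = alpha ** abs(prev_t - t)
        mean = quadratic_trend(t, beta) + corr * (
            theta[-1] - quadratic_trend(prev_t, beta)
        )
        sd = marginal_sd * math.sqrt(1.0 - corr ** 2)
        theta.append(rng.gauss(mean, sd))
    return theta


def inverse_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


rng = random.Random(2014)
times = [365, 200, 100, 30, 7, 0]  # hypothetical polling occasions
path = simulate_ou_path(times, beta=(-0.15, 0.0, 0.0), alpha=0.98,
                        tau=50.0, rng=rng)
s0 = inverse_logit(path[-1])  # share of Yes among decided voters at t = 0
print(round(s0, 3))
```

Because the correlation decays with the gap between dates, widely spaced early polls are only loosely tied together, while closely spaced late polls are strongly linked, which is the feature that suits an uneven polling calendar.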

Prior distributions
For the model parameters, we assume the following prior distributions. For the key parameter $\beta_{10}$, the intercept in the mean model for Yes/No odds in (5), we utilise subjective expert information obtained from the Delphi questionnaire on future migration trends in Scotland (Wiśniowski et al., 2014). The Delphi survey was carried out in June 2013. Information was provided by eight out of 12 experts, who had been asked to state the probability that Scotland would gain independence. This probability can also be interpreted as a probability that the share of the votes will be larger than 50%. This provides partial prior information about the experts' uncertainty about the share of the votes, which we utilise to construct the prior distribution for $\beta_{10}$ as follows.
We assume that the prior distribution is an equally-weighted mixture of normal distributions, constructed for each of the eight experts. Since $\beta_{10}$ represents the logit of the expected share of votes on the referendum day, we assume that the probability of the share of votes being larger than 50% provided by expert $l$ is

$$ P(S_l > 0.5) = \Phi\left( \frac{\mu_l}{\sigma_l} \right), \tag{7} $$

where $\Phi$ is the distribution function of a standard normal distribution. Here $S_l$ is a random variable representing the $l$th expert's opinion on the share of votes on the referendum day, which we assume to have a logistic-normal distribution with mean $\mu_l$ and variance $\sigma_l^2$. As full information was not provided by the experts about their uncertainty concerning the share of the votes, there are 16 unknowns in eight equations (7). Therefore, we assume that each expert's opinion is equivalent to a poll of size $n$. In such a poll, we would have $nS_l \sim \text{Binomial}(n, p_l)$. Hence, by applying the delta method we obtain

$$ \sigma_l^2 \approx \frac{1}{n\, p_l (1 - p_l)}. \tag{8} $$

Since the binomial variance of $S_l$ is the largest for $p_l = 0.5$, we evaluate (8) at this value and simplify it to $\sigma_l^2 \approx 4/n$, for $l = 1, \dots, 8$. For our application, we assume that each expert is equivalent to $n = 125$ respondents in a representative opinion poll, so that a total of eight experts is equivalent to 1000 respondents, which is a typical size for an opinion poll. In the sensitivity analysis below we assess departures from this assumption by considering $n = 1$ as well as an approximately uniform prior.
For the other elements of the vector $\beta_k$ we use normal distributions centred around zero with very large variances: $\beta_{ki} \sim N(0, 10^4)$, $i, k = 1, 2$.
These priors are extremely diffuse, which in principle allows opinion to exhibit variation in time which is much more rapid than might be considered plausible. However, when we used priors with much smaller variances, restricting variation in opinion to a more plausible range, the final results were indistinguishable from those with the diffuse priors, indicating the lack of sensitivity to the precise values of the prior variances.
For $\alpha_k$ we assume a uniform distribution on (0, 1), which prevents this parameter taking negative values or exhibiting explosive behaviour: $\alpha_k \sim U(0, 1)$. For precisions $\tau_{1k}$, $\tau_{2k}$ and $\tau_{3k}$, we assumed truncated normal distributions, with the truncation expressed through an indicator function $1(g)$ that takes the value one if $g$ is true, and zero otherwise. Finally, for the correlation parameter $\rho$, we assume $\rho \sim U(-1, 1)$.
To select the prior distributions for the precisions, including the values of the hyperparameters, we analysed the model with data simulated to resemble opinion polls related to the Scottish referendum. This simulation study demonstrated that the use of diffuse conjugate gamma priors (e.g. $\Gamma(10^{-3}, 10^{-3})$) for the precision parameters $\tau_{1k}$ and $\tau_{2k}$ results in poor mixing of the MCMC algorithm and undesirable levels of sensitivity. Hence, following recommendations of Gelman (2006), we investigated (i) diffuse half-normal prior distributions for precisions, and (ii) uniform distributions over a large range for standard deviations. If the number of days with multiple polls was small, the overdispersion parameter $\tau_{2k}$ was weakly identified. This led to poor convergence of the MCMC algorithm and high correlation between $\tau_{1k}$ and $\tau_{2k}$. The truncated normal distributions performed considerably better than the uniform distributions.
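The construction of one mixture component per expert can be sketched as follows. The sketch assumes that the stated probability equals the standard normal distribution function evaluated at the ratio of the component mean to its standard deviation, so that the mean is recovered by inverting that relation once the variance is fixed at 4/n. The expert probabilities below are hypothetical placeholders, not the values elicited in the Delphi survey, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def expert_component(prob_yes_wins, n_equiv):
    """Return (mu, sigma) of the normal component on the logit scale for
    one expert, given the stated probability that the Yes share exceeds
    50%. The probability must lie strictly between 0 and 1."""
    sigma = math.sqrt(4.0 / n_equiv)
    mu = sigma * NormalDist().inv_cdf(prob_yes_wins)
    return mu, sigma


def mixture_prior_density(beta10, components):
    """Equally weighted mixture of the experts' normal components."""
    total = sum(NormalDist(mu, sigma).pdf(beta10) for mu, sigma in components)
    return total / len(components)


# Hypothetical stated probabilities of independence (illustration only).
hypothetical_probs = [0.10, 0.20, 0.30, 0.50]
components = [expert_component(p, n_equiv=125) for p in hypothetical_probs]
print(components)
print(mixture_prior_density(0.0, components))
```

Reducing `n_equiv` widens every component and moves its mean further from zero for the same stated probability, which is how the weight given to each expert can be moderated in a sensitivity analysis.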

General remarks
Predictive distributions of the referendum result, based on the opinion poll data and expert opinion, were obtained by using OpenBUGS (Lunn et al., 2009) and JAGS with the rjags R package (Plummer et al., 2003; see the code repository in the Online Supplementary Material). As discussed in Section 4.1, $s_0$ in (6) in the context of the Scottish referendum is the share of the Yes votes, in favour of independence, on the referendum day, and hence uncertainty about the referendum outcome is represented by the posterior distribution of $s_0$. The probability of an outcome favouring independence is therefore calculated as the probability mass of the predicted share of the Yes votes larger than 50 percent, i.e. $P(s_0 > 0.5)$. A similar operationalisation was adopted for the EU membership referendum, where $s_0$ relates to the share of the Leave votes, favouring the withdrawal of the UK from the European Union.
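Given posterior draws of the predicted share, this probability is simply the proportion of draws above one half. The sketch below is a generic post-processing step rather than the authors' code: the draws are generated artificially to stand in for MCMC output, and the function names are hypothetical.

```python
import random
import statistics


def prob_exceeds_half(draws):
    """Monte Carlo estimate of P(s0 > 0.5) from posterior draws of s0."""
    if not draws:
        raise ValueError("at least one draw is required")
    return sum(1 for s in draws if s > 0.5) / len(draws)


def quartiles(draws):
    """Lower quartile, median and upper quartile of the draws."""
    q1, q2, q3 = statistics.quantiles(draws, n=4)
    return q1, q2, q3


# Artificial draws standing in for MCMC output (illustration only).
rng = random.Random(0)
fake_draws = [min(max(rng.gauss(0.46, 0.02), 0.0), 1.0) for _ in range(5000)]

print(prob_exceeds_half(fake_draws))
print(quartiles(fake_draws))
```

The same two summaries, the probability above 50 percent and the quartiles of the predicted share, are the quantities reported for each forecast date.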
Below, we first present the history of forecasts for the Scottish independence based on the data one year, and then 200, 100, 30 and seven days before the referendum. Next, we use the data that include the polls carried out within the week prior to the referendum as this was the period when nine additional polls were carried out. The time trend with the lowest DIC is quadratic for all subsets of the polling data. Finally, we apply the model to the EU membership referendum data, in order to externally validate the robustness of the method.

Scottish referendum: history of forecasts
Forecasts of the result of the referendum, together with the opinion poll data until one week before the referendum, are presented in the upper panels of Table 1 and Fig. 3. In the last three columns of Table 1, we present the median and quartiles of the predicted share of Yes votes (excluding Undecided). We observe that the median is almost constant over time, at around 46 percent (with a drop to 44.7 percent one month before the referendum), but the predictive distribution of the share of votes for independence narrows as additional observations are utilised in the model. An earlier version of our model, which did not include the polling company effect $\gamma$, produced undesirable sensitivity of the forecasts to the order in which different companies published their polls and led to an overestimation of the proportion in favour of independence at certain dates.
The results indicate that one year before the referendum, there was still room for a change in the opinion for both Yes and No camps. Closer to the referendum, however, the chances of any dramatic shifts were shrinking. The decrease observed one month before indicates the influence of the debate between the First Minister of Scotland, Alex Salmond, and the leader of the pro-Union campaign, Alistair Darling, on 5 August 2014, deemed to have been lost by the former. Even in the final days of campaigning, the forecasts remained at a similar level, which can be explained by relatively large differences amongst the polls carried out by various companies in the last week. Hence, the uncertainty of the predicted vote share remained rather large, with an interquartile range of 45.2-48.1 percent.
The probability of independence decreased over time from 23 percent one year prior to the referendum, to two percent with 30 days to go (and after the first TV debate on 5 August 2014). The upward shift in the polls conducted a week before the referendum led to an increase of the probability to five percent, which might have been a result of the second TV debate on 25 August, said to have been won by Alex Salmond. The changes in this probability over time are summarised in the upper panel of Table 1 and the corresponding distributions of the share of the Yes vote are presented in the upper panel of Fig. 3.

Scottish referendum: on-the-eve forecasts
Our final prediction, based on the opinion polls reported up to 17 September 2014, i.e. on the eve of the referendum, is presented in Fig. 4, together with the referendum outcome on 18 September, which was 44.7 percent in favour of independence. This outcome is approximately the 18th percentile of our predictive distribution. Despite the last-minute increase in the probability of independence to seven percent, the median share of Yes votes, 46.7 percent, remained at a similar level to the earlier predictions. The final result also suggests that the opinion polls carried out shortly before the referendum might have been biased towards independence, as they all predicted a higher share of the Yes votes than the final outcome.
The predicted share of the undecided on the day of the referendum was 8.5 percent, with a 95% predictive interval of 6.5-10.9 percent. If we assume that all 'decided' respondents vote, this provides an estimated turnout of 91.5 percent. The actual turnout was lower, at 84.6 percent, which aligns with the findings about the overestimation of turnout described in Section 3.3. Finally, the comparison of the models with quadratic and cubic trends, which in the end were the two competing models, reveals that the differences in the predictions were minimal.

Scottish referendum: sensitivity analysis of the priors
To assess the sensitivity of the results to the expert-based prior on β_{10}, we considered two priors in which the expert opinion is moderated. In the first, we reduce the effective weight of each expert to n = 1, and in the second we use a diffuse prior β_{10} ∼ N(0, 2), which is approximately equivalent to a uniform prior on (0, 1) for E(s_0). The results are presented in the second and third panels of Table 1 and Fig. 3. As would be expected, the expert-based prior has an influence on the predictive distributions at the earlier polling dates, in 2013 and the spring of 2014, effectively shrinking the predictive distributions. However, there is little sensitivity to the choice of the prior in the run-up to the referendum, as the polling evidence outweighs the earlier expert opinion. As a result, the discrepancy between the predicted results close to the referendum day is negligible.
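The claim that a N(0, 2) prior on the logit scale is roughly uniform on the probability scale can be checked by simulation. The sketch below rests on two assumptions that are not confirmed by the text above: that the second argument of N(0, 2) is a variance (rather than a standard deviation or a precision), and that E(s_0) is simply the inverse logit of β_{10}. Under a uniform distribution each of the ten bins would hold a share of 0.1.

```python
import math
import random


def expit(x):
    """Inverse logit transform."""
    return 1.0 / (1.0 + math.exp(-x))


def implied_share_histogram(sd, n_draws=100_000, n_bins=10, seed=0):
    """Bin the probabilities implied by a N(0, sd^2) prior on the logit scale."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_draws):
        p = expit(rng.gauss(0.0, sd))
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return [c / n_draws for c in counts]


# Assumption: the "2" in N(0, 2) is a variance, so the sd is sqrt(2).
hist = implied_share_histogram(sd=math.sqrt(2.0))
for i, share in enumerate(hist):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f}: {share:.3f}")
```

Under these assumptions the implied distribution is flat only approximately: the central bins receive somewhat more mass than the extreme ones.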

Model validation: EU membership referendum
The EU membership referendum carried out in the United Kingdom in 2016 enabled us to carry out an external validation exercise for the model proposed in Section 4. As in this case we did not have access to expert opinion comparable to the information elicited for the Scottish referendum (Section 3), the estimation only includes a vague prior distribution for the parameter β_{10}. As before, the best-fitting trend was selected using the Deviance Information Criterion (DIC), with the caveat that forecasts made for different pre-referendum cut-off dates could have been based on different trend functions. In particular, the on-the-eve forecasts based on all three trend functions yielded similar predictions as well as approximately the same DIC values.
The specific results of the forecasts are presented in Figs. 5-6, as well as in Table 2. Overall, as was the case with the Scottish independence referendum, there is a visible narrowing of predictive intervals over time. Towards the end of the prediction horizon, and especially on the eve of the referendum, the predictions formally remained inconclusive, with the interquartile range of the predictive distributions covering the 50 percent threshold for the Leave vote. Nevertheless, as seen especially in Table 2 and Fig. 6, the odds were only slightly against the Leave vote winning. As it happened, the final share of the Leave vote was 51.9 percent. The predicted share of the Undecided vote, a proxy for the share of the electorate that did not vote, was 10.2 percent, with a 95% predictive interval between 7.9 and 13.2 percent. Hence, the model predicted a turnout of 89.8 percent, which is again well above the actual turnout of 72.2 percent. Here, a likely explanation might be the estimated lower turnout of young voters aged 18-34 compared to those aged 55-74 (Ipsos-Mori, 2016). A speculative hypothesis for that behaviour might be that, on the day of the referendum, a victory of the Remain camp was expected, as revealed by, e.g., the last published poll, which showed 48 to 46 percent for Remain. This explanation would align with the findings of Großer and Schram (2010). Overall, as in the case of the Scottish referendum, the model performed reasonably well, clearly indicating the uncertainty around the polling outcome.

Model sensitivity: time trend specification
We analyse the use of cubic B-splines to specify the time function f(t, β_k) in (5). This can be viewed as a more general specification of the marginal means of the logit q_{ik} in (2) that might be useful in cases where the long-term time pattern exhibits more erratic behaviour.
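To make the spline specification more concrete, the sketch below evaluates a cubic B-spline basis by the Cox-de Boor recursion in plain Python, with two interior knots placed at the tertiles of the polling days so that each segment holds a similar number of observations. It is an illustrative analogue, not the code used for the analysis: the polling days are hypothetical, and the function name bspline_basis is introduced here only for illustration. Note also that this sketch returns the full basis of six functions, whereas bs() in R omits one column by default (intercept = FALSE).

```python
import statistics


def bspline_basis(x, interior_knots, lower, upper, degree=3):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    # Clamped knot vector: boundary knots repeated degree + 1 times.
    knots = [lower] * (degree + 1) + list(interior_knots) + [upper] * (degree + 1)
    n_basis = len(knots) - degree - 1

    # Degree-0 basis: indicator of the knot interval containing x.
    b = [0.0] * (len(knots) - 1)
    for i in range(len(knots) - 1):
        if knots[i] <= x < knots[i + 1]:
            b[i] = 1.0
    if x == upper:
        # Close the last non-empty interval on the right.
        b[n_basis - 1] = 1.0

    # Raise the degree step by step; zero-width intervals contribute nothing.
    for d in range(1, degree + 1):
        nb = [0.0] * (len(knots) - d - 1)
        for i in range(len(nb)):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * b[i] if left_den > 0 else 0.0
            right = (knots[i + d + 1] - x) / right_den * b[i + 1] if right_den > 0 else 0.0
            nb[i] = left + right
        b = nb
    return b


# Hypothetical polling days, counted relative to the referendum day.
days = list(range(-360, 0, 10))
k1, k2 = statistics.quantiles(days, n=3)  # tertile cut points as knots

row = bspline_basis(-30, [k1, k2], min(days), max(days))
print([round(v, 4) for v in row])
```

Each row of the resulting design matrix multiplies the coefficient vector β_k in place of the polynomial terms in t.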
In particular, we use two knots that split the data set into three subsets with a similar number of observations in each of them. The indices for the splines are created using the function bs() from the package splines in R (R Core Team, 2018). For long-term forecasting horizons (100 days before or longer), we observe relatively large uncertainty in the predictive posteriors for the vote share under study compared with the polynomial trends. These differences diminish for short-term horizons. On the eve of the referendum, the predicted probability of Scottish independence is estimated to be 13 percent compared to 7 percent in Table 1; for the EU membership referendum, the corresponding probability of Leave is 43 percent compared to 42 percent in Table 2. The DIC values yielded by the models based on B-splines for short-term forecasts are also close to the ones produced by the polynomial trends. Hence, in cases where no informed judgement can be made about the long-term behaviour of the voting intentions, we recommend Bayesian model averaging (Hoeting et al., 1999) of the results to account for the uncertainty about the trend specification for the mean.
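The averaging step itself amounts to a weighted mean of the model-specific predictions, with weights given by the posterior model probabilities. The sketch below applies it to the two on-the-eve probabilities of Scottish independence quoted above. The equal weights are purely hypothetical, since no posterior model probabilities are reported here, and the function name model_average is introduced only for illustration.

```python
def model_average(probabilities, weights):
    """Weighted average of model-specific probabilities of the same event."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probabilities, weights)) / total


# On-the-eve probabilities of independence quoted in the text.
p_yes = {"polynomial": 0.07, "b_spline": 0.13}

# Hypothetical posterior model probabilities (assumption: equal weights).
weights = {"polynomial": 0.5, "b_spline": 0.5}

models = list(p_yes)
averaged = model_average([p_yes[m] for m in models], [weights[m] for m in models])
print(round(averaged, 3))
```

With equal weights the averaged probability is the midpoint of the two inputs. The same weights could be applied to full predictive samples rather than to a single probability, which would yield a mixture of the model-specific predictive distributions.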

Discussion
In this paper we have proposed a method of estimating the probability of the outcome of a referendum by using opinion poll data. It takes into account the variation between the polls over time and between the polls carried out on the same day, and accounts for the biases of various polling companies. It smooths out the volatility of the opinion polls carried out before the referendum, yet is still able to capture swings in voting intentions. The model can also predict approximate turnout in addition to the outcome of the vote. The method has been applied to the 2014 independence referendum in Scotland and validated on the data for the 2016 UK referendum on continuing membership of the European Union. The actual outcomes of both referenda fall well within the respective predictive probability distributions obtained from our model, and in both cases are close to our predicted median values, with differences of around ±2 percentage points.
The predictions generated by the model proved quite robust with respect to the specification of the underlying time trend function, especially for the short-term horizons. For long-term horizons, B-splines tend to increase uncertainty compared to using a single polynomial function. The model also yielded results comparable to those that could be obtained by using other methods and data sources, such as the prediction markets or bookmakers' bets, while retaining the advantage of much more modest data demands. In particular, for the EU referendum, according to Auld and Linton (2018), the prediction markets or currency markets alone were indicating a probability of approximately 0.1 for Leave until the polls had closed, whereas a Bayesian model estimated on their basis, using real-time, high-frequency data, had this probability estimated as approximately 0.37 (Auld and Linton, 2018, Figure 10 on p. 36). The latter (0.37) is more in line with our main finding. In that respect, for the longer time horizons, the additional information gains provided by the prediction or currency markets can be seen as negligible. For the Scottish referendum, on the other hand, the prediction markets performed well, and similarly to the model presented in this paper, picking up the first leaders' debate as the decisive turning point in favour of Scotland remaining in the Union (Wall et al., 2017).
Further research is required for applications such as election forecasting, where additional covariates, such as fundamentals or outcomes of historical events, as well as information on the past performance of the polls, can be utilised. This information could be used for forecasting the spatial distribution of votes, which is important for predicting results in individual electoral districts, for example in the UK parliamentary elections or the US presidential elections. It could also be used to strengthen the predictions of turnout, which in the current version are biased compared with the actual outcome, being based only on the share of Undecided (i.e. floating) voters, whose decisions about turning out are difficult to predict until the actual voting event (Großer and Schram, 2010). Further, the bias across polling companies could be addressed by utilising, if available, information on the mode of data collection, non-response or the type of survey. This information can be incorporated in our modelling framework by including, for example, a generalised linear model for the parameters γ in (4); alternatively, informative priors reflecting potential biases can be constructed based on expert assessment of the polls. Another extension could be to perform Bayesian model averaging to account for the uncertainty about the particular time trend used to make the prediction. Finally, the time trend could be replaced by another semi- or non-parametric function, such as a normal kernel or other types of splines. After the necessary modifications, mainly extending the two-level model hierarchy to more levels, the proposed framework can be readily generalised to forecast voting outcomes with more than two options.