The Performance of Multivariate Calibration on Ratios, Means and Proportions

In this paper, the calibration approach is revisited in order to allow new calibration weights that are subject to multiple calibration equations on a vector of ratios, means and proportions. The classical approach is extended in such a way that the calibration equations are not based on a vector of totals, but on a vector of other nonlinear parameters. We state some properties of the resulting estimators and carry out empirical simulations in order to assess the performance of this approach. We find that this methodology is suitable for practical situations such as vote intention estimation, estimation of the labor force, and retrospective studies. The methodology is applied in the context of the presidential elections held in Colombia in 2014, for which we estimate the vote intention in the second round using information from an election poll, taking the results from the first round as auxiliary information.


Introduction
Consider a finite population $U$ as a set of $N$ units labeled $\{1, \dots, N\}$. The size of the population $U$ is not necessarily known. A vector of variables $x_k = (x_{1k}, \dots, x_{Qk})'$ and a variable $y_k$ are associated with every unit $k$ in the population. Likewise, assume that a random sample $s$ of size $n$ is drawn from $U$ according to a (usually complex) sampling design $p(s)$. Let $\pi_k = Pr(k \in s)$ be the first-order inclusion probability and $\pi_{kl} = Pr(k, l \in s)$ the second-order inclusion probability. If the purpose of the study is to estimate without bias the total of $y$ in the finite population, $t_y = \sum_{k \in U} y_k$, then the Horvitz-Thompson (HT) estimator can be used, which is defined as

$$\hat{t}_{y,\pi} = \sum_{k \in s} d_k y_k, \qquad (1)$$

where $d_k = 1/\pi_k$ is known as the sampling weight. An unbiased estimator of the variance of the HT estimator is

$$\widehat{Var}(\hat{t}_{y,\pi}) = \sum_{k \in s} \sum_{l \in s} \frac{\Delta_{kl}}{\pi_{kl}} \frac{y_k}{\pi_k} \frac{y_l}{\pi_l},$$

where $\Delta_{kl} = \pi_{kl} - \pi_k \pi_l$.

It is well known that the use of auxiliary information is important in survey sampling theory, not only in the design stage but also in the estimation step. One of the most plausible ways to incorporate auxiliary information is by using calibration estimators, where the calibration equations only involve totals of auxiliary variables. Deville & Särndal (1992) proposed a class of linear estimators of population totals of the form

$$\hat{t}_{y,cal} = \sum_{k \in s} w_k y_k, \qquad (2)$$

where $w_k$ ($k \in s$) is the calibrated weight of unit $k$, induced by the use of auxiliary information in the form of a vector of population totals $t_x = (t_{x_1}, \dots, t_{x_Q})'$. The calibration weights must satisfy the calibration equations

$$\sum_{k \in s} w_k x_k = t_x. \qquad (3)$$

In the classical approach, the calibration weights are defined so as to minimise a pseudo-distance $\Phi_s$ from the design weights $d_k = 1/\pi_k$, subject to the calibration equations (3). In this perspective, the calibration weights are given by

$$\{w_k\}_{k \in s} = \arg\min_{w} \Phi_s(w, d) \quad \text{subject to} \quad \sum_{k \in s} w_k x_k = t_x. \qquad (4)$$

For example, by minimising the chi-squared distance $\Phi_s = \sum_{k \in s} c_k (w_k - d_k)^2 / d_k$, the calibration weights can be expressed as

$$w_k = d_k + \frac{d_k}{c_k} \, x_k' \left( \sum_{k \in s} \frac{d_k x_k x_k'}{c_k} \right)^{-1} (t_x - \hat{t}_{x,\pi}), \qquad (5)$$

where $\hat{t}_{x,\pi} = \sum_{k \in s} d_k x_k$. Särndal (2007) reviews the calibration approach and the use of auxiliary information. Note that the estimated variance of the calibration estimator (2) under the chi-squared distance is

$$\widehat{Var}(\hat{t}_{y,cal}) = \sum_{k \in s} \sum_{l \in s} \frac{\Delta_{kl}}{\pi_{kl}} (w_k e_k)(w_l e_l), \qquad (6)$$

where $e_k = y_k - x_k' \hat{B}$ and

$$\hat{B} = \left( \sum_{k \in s} \frac{d_k x_k x_k'}{c_k} \right)^{-1} \sum_{k \in s} \frac{d_k x_k y_k}{c_k}. \qquad (7)$$

Other approaches to variance estimation of calibration estimators are provided by Kim & Park (2010).

Note that many weight systems may satisfy (3). For example, Estevao, Särndal & Sautory (2000) found that, by taking into account a set of instrumental variables $z_k$, the calibration weights become

$$w_k = d_k (1 + \lambda' z_k), \qquad \lambda' = (t_x - \hat{t}_{x,\pi})' \left( \sum_{k \in s} d_k z_k x_k' \right)^{-1}.$$

Observe that $\dim(z_k)$ must be equal to $\dim(x_k)$. Kott (2004) and Estevao & Särndal (2004) characterised the optimal choice of the instrumental vector. A deeper view of instrumental calibration may be found in Kott (2003), Kim & Park (2010), and Park & Kim (2014). Estevao et al. (2000) and Estevao & Särndal (2006) also considered a further set of calibration weights, including the particular case $z_k = x_k$.

As we can see, there are many choices when calibrating on known totals, but not all of them are efficient. Moreover, calibrating on totals is not always suitable because current totals are frequently unavailable. This situation is more severe in developing countries, where censuses are not carried out regularly. Totals are dynamic and change over time; most of the time they increase year after year. In an official statistics context, however, ratios, means and proportions are more stable over time. Following the findings of Krapavickaite & Plikusas (2005), Plikusas (2006) and Lesage (2011), when a population ratio $R$ is accurately known, it is possible to compute new calibration weights that are subject to this new benchmark constraint:

$$\hat{R}_{cal} = \frac{\sum_{k \in s} w_k y_k}{\sum_{k \in s} w_k x_k} = R.$$

Note that we do not necessarily know $t_y$ or $t_x$. Moreover, with this approach we can simultaneously estimate the totals that define the ratio while maintaining their structural relation. From a methodological perspective, the restriction can be extended to a vector of ratios, means and proportions. This paper deals with this issue and can be considered a follow-up to the suggestion of Lesage (2011), who claimed that it would be interesting to determine the practical cases in which the use of complex parameters in the calibration improves the precision of the estimates of the parameters of interest. He also examined calibration in terms of the ratio, median and variance of auxiliary variables. Kim, Sungur & Heo (2007) proposed a calibration approach to estimate the population mean in stratified sampling by defining the calibration equation in terms of the population mean of one auxiliary variable.
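To make the chi-squared calibration weights concrete, the following minimal sketch computes them for a single auxiliary variable, so that the matrix inverse reduces to a scalar division. The function name, the toy sample and the value of the known total are illustrative assumptions and do not come from the paper.

```python
def linear_calibration_weights(d, x, t_x, c=None):
    """Chi-squared calibration weights for ONE auxiliary variable.

    d   : design weights d_k = 1 / pi_k
    x   : auxiliary values x_k observed in the sample
    t_x : known population total of x
    c   : optional positive scalars c_k (default 1)
    """
    if c is None:
        c = [1.0] * len(d)
    t_x_ht = sum(dk * xk for dk, xk in zip(d, x))            # HT estimate of t_x
    denom = sum(dk * xk * xk / ck for dk, xk, ck in zip(d, x, c))
    lam = (t_x - t_x_ht) / denom                             # scalar Lagrange multiplier
    return [dk * (1.0 + lam * xk / ck) for dk, xk, ck in zip(d, x, c)]


# Hypothetical sample of four units with equal design weights.
d = [10.0, 10.0, 10.0, 10.0]
x = [12.0, 15.0, 11.0, 18.0]
y = [150.0, 148.0, 160.0, 141.0]
t_x = 600.0                                                  # assumed known total

w = linear_calibration_weights(d, x, t_x)
t_x_cal = sum(wk * xk for wk, xk in zip(w, x))               # reproduces t_x
t_y_cal = sum(wk * yk for wk, yk in zip(w, y))               # calibration estimate of t_y
```

The calibrated weights reproduce the known total of $x$ exactly, which is the defining property of the calibration equations; the estimate of $t_y$ then reuses the same weights.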
In this article, we extend the ratio calibration approach to the multivariate case; that is, we propose a calibration methodology based on more than one ratio. We also present the variance estimation based on Taylor linearization. The paper is organised as follows: after this brief introduction, Section 2 describes the estimation of totals by using the calibration approach with a vector of known ratios, along with some properties and some specific scenarios. Section 3 reports a Monte Carlo simulation, the results of which show that, in some scenarios, the approach can be more efficient than calibrating on known totals. In Section 4, the proposed methodology is applied in an electoral survey context to estimate voting intention in a second round. Since the voting in the second round is influenced by the first round, we were able to calibrate the sample weights using known first-round ratios, which substantially improved the estimation of the second-round voting intention. Section 5 concludes with a brief discussion on the use of this approach.

Multivariate Calibration Over Ratios
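Let us consider a different approach to calibration. Assume that the $k$th population element is associated with vectors $x_k = (x_{1k}, \dots, x_{Qk})'$ and $y_k = (y_{1k}, \dots, y_{Qk})'$. For the elements $k \in s$, we observe both $y_k$ and $x_k$. The population ratios $R_q = t_{y_q} / t_{x_q}$ ($q = 1, \dots, Q$) are assumed to be known, even when $t_{y_q}$ and $t_{x_q}$ remain unknown. The goal is to estimate all of the population totals $t_{y_q} = \sum_{k \in U} y_{qk}$ and $t_{x_q} = \sum_{k \in U} x_{qk}$ through the estimators $\hat{t}_{y_q,cal} = \sum_{k \in s} w_k y_{qk}$ and $\hat{t}_{x_q,cal} = \sum_{k \in s} w_k x_{qk}$, where the new weights $w_k$ satisfy the following constraints:

$$\frac{\hat{t}_{y_q,cal}}{\hat{t}_{x_q,cal}} = \frac{\sum_{k \in s} w_k y_{qk}}{\sum_{k \in s} w_k x_{qk}} = R_q, \qquad q = 1, \dots, Q.$$

Equivalently, the calibration equations are defined by

$$\hat{R}_{cal} = (\hat{R}_{1,cal}, \dots, \hat{R}_{Q,cal})' = R.$$

The following result develops the idea of Lesage (2011).

Result 1. Assume that we have access to a vector of known ratios defined by

$$R = (R_1, \dots, R_Q)',$$

where $R_q = t_{y_q} / t_{x_q}$. Then calibrating over $R$ is equivalent to calibrating over the vector of totals

$$t_z = (t_{z_1}, \dots, t_{z_Q})',$$

where $t_{z_q} = \sum_{k \in U} z_{qk}$ and $z_{qk} = y_{qk} - R_q x_{qk}$. That is, calibrating over $R$ is equivalent to calibrating over $t_z = 0$.

Proof. First, notice that, from the calibration equations, we have

$$\frac{\sum_{k \in s} w_k y_{qk}}{\sum_{k \in s} w_k x_{qk}} = R_q.$$

In this way, for every $q = 1, \dots, Q$,

$$0 = \hat{t}_{y_q,cal} - R_q \hat{t}_{x_q,cal} = \sum_{k \in s} w_k (y_{qk} - R_q x_{qk}) = \sum_{k \in s} w_k z_{qk} = \hat{t}_{z_q,cal}.$$

Then the calibration equations $\hat{t}_{z,cal} = 0$ and $\hat{R}_{cal} = R$ are equivalent.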
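The equivalence in Result 1 can be checked numerically. The sketch below handles a single known ratio: it builds the auxiliary variable $z_k = y_k - R x_k$, calibrates its total to zero under the chi-squared distance with $c_k = 1$, and confirms that the calibrated ratio equals $R$. The function name, the data and the value of $R$ are hypothetical.

```python
def ratio_calibration_weights(d, y, x, R):
    """Calibrate on one known ratio R = t_y / t_x via z_k = y_k - R * x_k.

    Uses the chi-squared distance with c_k = 1, so the weights are
    w_k = d_k * (1 + lam * z_k) with the target total t_z = 0.
    """
    z = [yk - R * xk for yk, xk in zip(y, x)]
    t_z_ht = sum(dk * zk for dk, zk in zip(d, z))       # HT estimate of t_z
    denom = sum(dk * zk * zk for dk, zk in zip(d, z))
    lam = (0.0 - t_z_ht) / denom
    return [dk * (1.0 + lam * zk) for dk, zk in zip(d, z)]


# Hypothetical sample and an assumed known ratio.
d = [10.0, 10.0, 10.0, 10.0]
x = [12.0, 15.0, 11.0, 18.0]
y = [150.0, 148.0, 160.0, 141.0]
R = 10.0

w = ratio_calibration_weights(d, y, x, R)
t_y_cal = sum(wk * yk for wk, yk in zip(w, y))
t_x_cal = sum(wk * xk for wk, xk in zip(w, x))
ratio_cal = t_y_cal / t_x_cal                           # equals R up to rounding
```

Neither $t_y$ nor $t_x$ is supplied to the function; only the ratio is used, which mirrors the setting in which the totals are unknown.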
Note that even though $R_q$ is known, the totals $t_{y_q}$ and $t_{x_q}$ are not necessarily known. This condition ensures the flexibility of the approach, because the methodology can be used not only to calibrate over ratios but also to estimate other parameters of interest while keeping the ratio restriction on the calibrated weights.
Result 2. Suppose that a total of interest $t_y$ is estimated by means of the approach given in Result 1. Then an asymptotically unbiased estimator of $t_y$ is

$$\hat{t}_{y,cal} = \sum_{k \in s} w_k y_k, \qquad (18)$$

where the set of weights $w_k$ satisfies the calibration restriction (12) over the auxiliary ratios.
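Proposition 1. For every $q = 1, \dots, Q$, the expectation of the calibration estimators for totals satisfies

$$E(\hat{t}_{y_q,cal}) = R_q \, E(\hat{t}_{x_q,cal}). \qquad (19)$$

The variance of the calibration estimators for totals satisfies

$$Var(\hat{t}_{y_q,cal}) = R_q^2 \, Var(\hat{t}_{x_q,cal}). \qquad (20)$$

The coefficients of variation of the calibration estimators for totals coincide,

$$CV(\hat{t}_{y_q,cal}) = CV(\hat{t}_{x_q,cal}),$$

and so do their relative biases,

$$RB(\hat{t}_{y_q,cal}) = RB(\hat{t}_{x_q,cal}).$$

Proof. Equations (19) and (20) follow directly from the calibration constraint $\hat{t}_{y_q,cal} = R_q \hat{t}_{x_q,cal}$. For the coefficient of variation, taking $R_q > 0$, it should be noted that

$$CV(\hat{t}_{y_q,cal}) = \frac{\sqrt{Var(\hat{t}_{y_q,cal})}}{E(\hat{t}_{y_q,cal})} = \frac{R_q \sqrt{Var(\hat{t}_{x_q,cal})}}{R_q \, E(\hat{t}_{x_q,cal})} = CV(\hat{t}_{x_q,cal}).$$

With respect to the relative bias, since $t_{y_q} = R_q t_{x_q}$,

$$RB(\hat{t}_{y_q,cal}) = \frac{E(\hat{t}_{y_q,cal}) - t_{y_q}}{t_{y_q}} = \frac{R_q \, E(\hat{t}_{x_q,cal}) - R_q t_{x_q}}{R_q t_{x_q}} = RB(\hat{t}_{x_q,cal}).$$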

Some Particular Cases
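When the parameters of interest are means or proportions, they can be estimated as particular cases of the proposed methodology. In the case of means, the corresponding calibration equation is

$$\sum_{k \in s} w_k z_k = 0, \qquad z_k = (z_{1k}, \dots, z_{Qk})',$$

where, for every $q = 1, \dots, Q$,

$$z_{qk} = y_{qk} - \bar{y}_q. \qquad (24)$$

In the case of proportions, the calibration equation has the same form, $\sum_{k \in s} w_k z_k = 0$, where, for every $q = 1, \dots, Q$,

$$z_{qk} = \delta_{qk} - P_q, \qquad (26)$$

with $\delta_{qk}$ a binary variable indicating whether unit $k$ has the $q$-th characteristic.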
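As a short check of why means fit the ratio framework, the following derivation sketches the equivalence between fixing a calibrated mean and forcing the calibrated total of $z_{qk}$ to zero. It assumes that the calibrated mean is the ratio of the calibrated total to the calibrated population size $\sum_{k \in s} w_k$, that is, a ratio with $x_{qk} = 1$; this reading is an assumption made for illustration.

```latex
\frac{\sum_{k \in s} w_k y_{qk}}{\sum_{k \in s} w_k} = \bar{y}_q
\iff \sum_{k \in s} w_k y_{qk} - \bar{y}_q \sum_{k \in s} w_k = 0
\iff \sum_{k \in s} w_k \,(y_{qk} - \bar{y}_q) = \sum_{k \in s} w_k z_{qk} = 0 .
```

Replacing $y_{qk}$ by a binary indicator and $\bar{y}_q$ by $P_q$ gives the same argument for proportions.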

Another Perspective to Post-Stratification
Now, suppose that the population is partitioned into $Q$ subgroups called poststrata. If the ratios for those population subgroups are known, we should find weights satisfying the following constraints:

$$\hat{R}_{cal} = (\hat{R}_{1,cal}, \dots, \hat{R}_{Q,cal})' = (R_1, \dots, R_Q)' = R,$$

where, for every $q = 1, \dots, Q$, $\hat{R}_{q,cal}$ is the ratio of the calibrated totals of $y$ and $x$ in the $q$-th poststratum. Note that if the ratio for the whole population is known, it is also possible to impose the additional constraint $\hat{R}_{cal} = R_U$ on the calibration equations.

If means are known for the population subgroups, as in poststratification, the proper constraints are

$$\hat{\bar{y}}_{cal} = (\hat{\bar{y}}_{1,cal}, \dots, \hat{\bar{y}}_{Q,cal})' = (\bar{y}_1, \dots, \bar{y}_Q)' = \bar{y},$$

where, for every $q = 1, \dots, Q$, $\hat{\bar{y}}_{q,cal}$ is the calibrated mean of $y$ in the $q$-th poststratum. If the population mean is known, it is also possible to impose the additional constraint $\hat{\bar{y}}_{cal} = \bar{y}_U$.

Finally, if proportions are known for the population subgroups, the proper constraints are

$$\hat{P}_{cal} = (\hat{P}_{1,cal}, \dots, \hat{P}_{Q,cal})' = (P_1, \dots, P_Q)' = P, \qquad (31)$$

where, for every $q = 1, \dots, Q$, $\hat{P}_{q,cal}$ is the calibrated proportion in the $q$-th poststratum. If the population proportion is known, it is also possible to impose the additional constraint $\hat{P}_{cal} = P_U$.

Extending the Approach
The proposed calibration estimator can be extended to the situation in which we want to estimate the population totals by means of the calibration estimators $\hat{t}^*_{y_q} = \sum_{k \in s} w^*_k y_{qk}$, with $q = 1, \dots, Q$, where the weights $w^*_k$ are calibrated on ratios of other variables $\tilde{y}_{qk}$ and $\tilde{x}_{qk}$, with $q = 1, \dots, Q^*$. In a special case, these variables could represent the same characteristic of interest measured at a previous period. For example, this estimator can be useful when estimating the unemployment rate for a particular period of time, calibrating over the previous unemployment rate. Moreover, in runoff elections, we can calibrate on the results of the first round in order to estimate the voting intention in the second round. Note that $Q^*$ may be different from $Q$.

The aim is then to find new weights $w^*_k$ that satisfy the following calibration equations:

$$\frac{\sum_{k \in s} w^*_k \tilde{y}_{qk}}{\sum_{k \in s} w^*_k \tilde{x}_{qk}} = R^*_q, \qquad q = 1, \dots, Q^*,$$

where $R^*_q$ denotes the known population ratio of the totals of $\tilde{y}_q$ and $\tilde{x}_q$. By the argument of Result 1, this is equivalent to $\sum_{k \in s} w^*_k (\tilde{y}_{qk} - R^*_q \tilde{x}_{qk}) = 0$ for every $q$.

Empirical Simulation
In this section, simulation experiments were carried out in order to compare the performance of three estimators of a total of interest $t_y$: the calibration estimator on auxiliary ratios (CALR), given by (18); the classical calibration estimator on auxiliary totals (CAL), given by (2); and the Horvitz-Thompson (HT) estimator, given by (1).
A finite population of size $N = 10000$ was simulated from a superpopulation model $\xi$. It was supposed that the relationship between $y_k$ and $x_k$ can be described through a general model $\xi$ such as $y_k = x_k' \beta + \varepsilon_k$. This model adopts different forms throughout this section. The values of the vector of auxiliary information were generated from a uniform distribution, and it was assumed that the $\varepsilon_k$ values were independent and distributed as $N(0, \sigma^2)$, where $\sigma^2$ was chosen to produce different values of the R-squared of the model.
In each run, random samples were drawn according to a simple random sampling design without replacement (SI design). We considered two sample sizes: $n = 400$ and $n = 2000$. The parameter vector $\beta$ was estimated by (7) with $c_k = 1$. This process was repeated $M = 1000$ times. The simulation was written in the statistical software R 3.1.1 (R Development Core Team 2007). In the simulation, the performance of an estimator $\hat{t}_y$ was tracked by means of the Relative Bias (RB), the Coefficient of Variation (CV) and the Relative Efficiency (RE). The RB was given by

$$RB(\hat{t}_y) = \frac{1}{M} \sum_{m=1}^{M} \frac{\hat{t}_{y,m} - t_y}{t_y},$$

where $\hat{t}_{y,m}$ is computed in the $m$th simulated sample, $m = 1, \dots, M$. The CV was given by

$$CV(\hat{t}_y) = \frac{\sqrt{MSE(\hat{t}_y)}}{t_y},$$

where the Mean Square Error (MSE) is defined as

$$MSE(\hat{t}_y) = \frac{1}{M} \sum_{m=1}^{M} (\hat{t}_{y,m} - t_y)^2.$$

Finally, we computed the RE between the CALR and the CAL estimators as

$$RE = \frac{MSE(\hat{t}_{y,CAL})}{MSE(\hat{t}_{y,CALR})}.$$

As such, if the RE takes values higher than unity, it is concluded that the CALR estimator outperforms the CAL one. The first two simulations dealt with superpopulation models for the entire population, whereas the remaining scenarios dealt with models involving groups or poststrata. In those cases, we simulated $H = 3$ groups of the following sizes: $N_1 = 5000$, $N_2 = 2500$ and $N_3 = 2500$. In other words, the population $U$ was divided into three unequal groups $U_1$, $U_2$ and $U_3$.
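The performance measures can be computed from the replicated estimates in a few lines. The sketch below assumes the usual Monte Carlo definitions (mean relative error for RB, root MSE over the true total for CV, and a ratio of MSEs for RE); the function names and the toy estimates are illustrative and are not taken from the simulation itself.

```python
import math


def mse(estimates, t_y):
    """Monte Carlo mean square error around the true total t_y."""
    return sum((e - t_y) ** 2 for e in estimates) / len(estimates)


def relative_bias(estimates, t_y):
    """Average relative deviation of the estimates from t_y."""
    return sum((e - t_y) / t_y for e in estimates) / len(estimates)


def coefficient_of_variation(estimates, t_y):
    """Root MSE expressed relative to the true total."""
    return math.sqrt(mse(estimates, t_y)) / t_y


def relative_efficiency(est_cal, est_calr, t_y):
    """MSE(CAL) / MSE(CALR); values above 1 favour CALR."""
    return mse(est_cal, t_y) / mse(est_calr, t_y)


# Toy replicated estimates (M = 4) around a true total of 100.
t_y = 100.0
est_cal = [98.0, 103.0, 101.0, 96.0]
est_calr = [99.0, 101.0, 100.0, 98.0]

rb_calr = relative_bias(est_calr, t_y)
cv_calr = coefficient_of_variation(est_calr, t_y)
re = relative_efficiency(est_cal, est_calr, t_y)
```

In this toy example the MSE of CAL is 7.5 and that of CALR is 1.5, so the relative efficiency is 5, which would be read as CALR outperforming CAL.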

Simple Regression Model
This first scenario deals with a simple regression model:

$$y_k = \beta_0 + \beta_1 x_k + \varepsilon_k.$$

We assumed that $\varepsilon_k \sim N(0, \sigma^2)$. The values of $x_k$ were obtained from the distribution $U(10, 20)$, the values of the regression coefficients were set at $\beta_0 = 180$ and $\beta_1 = -2$, and we chose convenient values for $\sigma$ in order to get a predetermined R-squared.
For the CAL estimator, we assumed that the vector of auxiliary totals $t_x = (N, t_x)'$ was known, and it was used in computing this estimator. Note that $t_x = \sum_{k \in U} x_k$ is the population total of the variable $x$. For the CALR estimator, in contrast, it was assumed that the auxiliary ratio $R = t_y / t_x$ was known, and it was used when computing this estimator. Also, note that $t_y = \sum_{k \in U} y_k$ is the population total of the variable $y$. Tables 1 and 2 show the performance of the estimators that were considered. In this scenario, we found that all of the relative biases are negligible, and the lowest coefficient of variation is that of the CALR estimator. Likewise, the relative efficiency of the CALR estimator is higher than those obtained in all other scenarios. When the R-squared increases, the efficiency of the CALR estimator decreases. One explanation is that when the R-squared increases, the correlation between $x$ and $y$ also increases, and the variance of the CAL estimator gets smaller, decreasing faster than the variance of the CALR estimator. The simulations show a similar performance when the sample size increases.
We chose a convenient value for σ in order to obtain a predetermined R-squared.
For the CAL estimator, it was assumed that the vector of auxiliary totals $t_x = (N, t_{x_1}, t_{x_2}, t_{x_3})'$ was known, and it was used in the computation of this estimator. Note that $t_{x_q} = \sum_{k \in U} x_{qk}$ is the population total of the variable $x_q$ for $q = 1, 2, 3$. For the CALR estimator, it was assumed that the auxiliary vector of ratios was known, and it was used when computing this estimator. Tables 3 and 4 show the performance of the estimators that were considered. In this scenario, we found that all of the relative biases are negligible, and the lowest coefficient of variation is that of the CALR estimator. In the same way, the relative efficiency of the CALR estimator is higher than those in all other scenarios. When the R-squared increases, the efficiency of the CALR estimator decreases. The simulations show a similar performance when the sample size increases.
For the CAL estimator, we assumed that the vector of auxiliary totals $t_x = (t_x^1, t_x^2, t_x^3)'$ was known, and it was used when computing this estimator. Note that $t_x^h = \sum_{k \in U_h} x_k$ is the total of the variable $x$ for the subpopulation $U_h$, and that the population total of $x$ over $U$ is $t_x = \sum_{h=1}^{3} t_x^h$. For the CALR estimator, it was assumed that the auxiliary vector of subgroup ratios $R$, with components $t_y^h / t_x^h$, was known, and it was used in the computation of this estimator. Here, $t_y^h = \sum_{k \in U_h} y_k$ is the total of the variable $y$ for the subpopulation $U_h$, and the population total of $y$ over $U$ is $t_y = \sum_{h=1}^{3} t_y^h$. Tables 5 and 6 show the performance of the estimators considered in this scenario.
In this scenario, we found that all of the relative biases are negligible and that the lowest coefficient of variation is that of the CALR estimator. Also, the relative efficiency of the CALR estimator is higher than those achieved in other scenarios. When the R-squared decreases, the efficiency of the CALR estimator increases. The simulations show a similar performance when the sample size increases.
For the CAL estimator, we assumed that the vector of auxiliary totals $t_x = (N_1, N_2, N_3)'$ was known, and it was used when computing this estimator. For the CALR estimator, it was assumed that the auxiliary vector of means was known, and it was used when computing this estimator. Tables 7 and 8 show the performance of the estimators considered in this scenario. In this scenario, we observed that all of the relative biases are negligible. We also found that the relative efficiency of the CALR estimator is higher than one when the R-squared is lower than 0.4. When the R-squared decreases, the efficiency of the CALR estimator increases. The simulations show a similar performance when the sample size increases.

Poststratified calibration over proportions
This scenario deals with a poststratified model in which the variable $y$ is not continuous but discrete, taking only two values: one and zero. As such, $y_{kh} = 1$ if the element $k$ has a certain characteristic of interest and $y_{kh} = 0$ otherwise. As before, $N_1 = 5000$, $N_2 = 2500$ and $N_3 = 2500$. The values of $\beta_h$ were chosen conveniently in order to obtain a suitable R-squared.
For the CAL estimator, we assumed that the vector of auxiliary totals $t_x = (N_1, N_2, N_3)'$ was known, and it was used when computing this estimator. For the CALR estimator, it was assumed that the auxiliary vector of proportions was known, and it was used when computing this estimator.
Note that $\sum_{k \in U_h} y_{kh}$ is the number of elements of $U_h$ with the characteristic of interest, and bear in mind that $N_h$ is the size of the population subgroup $U_h$. Tables 9 and 10 show the performance of the estimators that were considered in this scenario. In this final scenario, we found that all of the relative biases are negligible, and that the relative efficiency of the CALR estimator is higher than one when the R-squared is lower than 0.4 and the sample size is large. When the R-squared decreases, the efficiency of the CALR estimator increases.

Calibration Over Any Set of Ratios
In this section, we show the results of some empirical simulations when calibrating over a vector of known ratios $R^*$. Generally speaking, the results obtained with a sample size of 400 are very similar to those of 1000, so we only show the results for a sample size of 400. Furthermore, the relative biases of the estimators are negligible, so we only show the relative efficiency between the estimators. $R^{HT}_{C}$ denotes the relative efficiency between the HT estimator and the CAL estimator, $R^{HT}_{R}$ denotes the relative efficiency between the HT estimator and the CALR estimator, and $R^{C}_{R}$ denotes the relative efficiency between the CALR estimator and the CAL estimator.

Simple Regression Model
We first consider a simple regression model that relates the variable ỹk to xk .As such, ỹk = β 0 + β 1 xk + ε k with ε k ∼ N (0, σ 2 ): the values xk were simulated from the distribution U (10, 20).The values of the regression coefficients were set to β 0 = and β 1 = −2, and we chose convenient values for σ 2 in order to get a predetermined R-squared.
In order to create the variable of interest $y_k$, we assumed that $y_k = \gamma_0 + \gamma_1 \tilde{y}_k + \epsilon_k$, with $\epsilon_k \sim N(0, 10^2)$. We varied $\gamma_1$ to get different coefficients of correlation ($\rho$) between $y_k$ and $\tilde{y}_k$. The results of this empirical study are shown in Table 11. We can observe that the performance of the classical calibration (CAL) estimator improves as the correlation between $y_k$ and $\tilde{y}_k$ increases, which is well known. With respect to the proposed CALR estimator, we can conclude that its performance is the same as that of the CAL estimator for lower correlation coefficients. We emphasise that the proposed CALR estimator is useful when no population totals are available for the variable $\tilde{y}$, which prevents the classical calibration estimator from being used. In these situations, if we know population ratios, we can use the CALR estimator. Note that the CALR estimator is always better than the Horvitz-Thompson estimator and is as efficient as the classical calibration estimator when the coefficient of correlation is low.

Multiple Regression Model
We now consider a multiple regression model that relates the variable $\tilde{y}_k$ to $\tilde{x}_k$. As such, we consider the following model:

$$\tilde{y}_k = \beta_0 + \beta_1 \tilde{x}_{1k} + \beta_2 \tilde{x}_{2k} + \beta_3 \tilde{x}_{3k} + \varepsilon_k.$$

The values of $\tilde{x}_{1k}$, $\tilde{x}_{2k}$ and $\tilde{x}_{3k}$ were simulated from the distributions $U(10, 20)$, $U(100, 150)$ and $U(1, 1.8)$, respectively. The values of the regression coefficients were set to $\beta_0 = 400$, $\beta_1 = -2$, $\beta_2 = -0.8$ and $\beta_3 = 50$. Convenient values for $\sigma^2$ were chosen in order to get a predetermined R-squared.
In order to create the variable of interest $y_k$, we assumed that $y_k = \gamma_0 + \gamma_1 \tilde{y}_k + \epsilon_k$ with $\epsilon_k \sim N(0, 10^2)$ and $\gamma_0 = 100$, and we varied $\gamma_1$ to get different coefficients of correlation between $y_k$ and $\tilde{y}_k$. For the CAL estimator, the calibration is made over the population total $t_{\tilde{y}}$, while for the CALR estimator, the calibration is made over the vector of known ratios $R^*$. The results of this simulation are shown in Table 12.

Estimation of Vote Intention
In a runoff election, a candidate wins in the first round if he or she obtains an absolute majority of the votes. If no candidate wins in the first round, then a second round must be held between the two candidates who obtained the most votes in the first round. The winner of that round wins the election (Bouton & Gratton 2015). This system is used around the world for the election of presidents in Afghanistan, Argentina, Austria, Brazil, Bulgaria, Cape Verde, Chile, Colombia, Costa Rica, Croatia, the Czech Republic, Cyprus, Dominican Republic, Ecuador, Egypt, El Salvador, Finland, France, Ghana, Guatemala, India, Indonesia, Liberia, Peru, Poland, Portugal, Romania, Senegal, Serbia, Slovakia, Slovenia, Timor-Leste, Turkey, Ukraine, Uruguay and Zimbabwe.
As stated by Pérez-Liñán (2006), over the last two decades a majority of Latin American countries have adopted presidential runoff elections in order to strengthen the legitimacy of their elected presidents. During 2014, out of the 20 countries in Latin America, 7 had presidential elections, and 5 of them had to use the runoff mechanism. Table 13 shows the dates of the first and second rounds held in these nations in 2014, as well as the winners of the second rounds. Now, let us assume that after the first round we conduct a survey of a sample $s$ of $n$ citizens who are able to participate in the second round. In that survey, we ask the following questions: a) the vote intention in the second round; b) whether they voted in the first round; and c) for which candidate they voted in the first round. Note that the estimates of the survey may be calibrated in order to improve the estimation of the results in the runoff by including auxiliary information from the official results of the first round.
To do this, we must understand that for $k \in s$ there are four variables of interest that address the problem of vote intention. For the first round we define

$$v_k = \begin{cases} 1 & \text{if the } k\text{-th individual voted in the first round,} \\ 0 & \text{otherwise.} \end{cases}$$

Assuming that $Q$ candidates (blank vote included) were contending in the first round, we define, for every $q = 1, \dots, Q$,

$$x_{qk} = \begin{cases} 1 & \text{if the } k\text{-th individual voted for the } q\text{-th candidate in the first round,} \\ 0 & \text{otherwise.} \end{cases}$$

For the second round, assuming that the vote intention is going to be measured for only two candidates and a blank vote, we define

$$u_k = \begin{cases} 1 & \text{if the } k\text{-th individual will vote in the second round,} \\ 0 & \text{otherwise.} \end{cases}$$

Finally, assuming that $M$ (blank vote included) of the $Q$ candidates remain in the second round, we define, for every $m = 1, \dots, M$,

$$y_{mk} = \begin{cases} 1 & \text{if the } k\text{-th individual intends to vote for the } m\text{-th candidate,} \\ 0 & \text{otherwise.} \end{cases}$$
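Note that, as the survey is carried out between the first and the second rounds, the vector of total votes in the first round $t_x = (t_{x_1}, \dots, t_{x_Q})'$ is already known. By defining $t_v = \sum_{k \in U} v_k$ as the number of voters in the first round, the vector of ratios per candidate in the first round is

$$R = (R_1, \dots, R_Q)' = \left( \frac{t_{x_1}}{t_v}, \dots, \frac{t_{x_Q}}{t_v} \right)'.$$

The approach in this paper addresses the construction of new weights $w_k$ that are obtained by calibrating over the vector of population ratios $R$. If the objective is to reproduce exactly the percentage of votes per candidate in the first round, then the ratio calibration estimators presented in this paper should be used. In order to create the weights $w_k$, it is necessary to define the following calibration equations:

$$\frac{\sum_{k \in s} w_k x_{qk}}{\sum_{k \in s} w_k v_k} = R_q, \qquad q = 1, \dots, Q.$$

Applying the methodology proposed in this article, we define the variables $z_{qk} = x_{qk} - R_q v_k$, so that, for every $q = 1, \dots, Q$, we obtain

$$z_{qk} = \begin{cases} 1 - R_q & \text{if the } k\text{-th element voted for the } q\text{-th candidate in the first round,} \\ -R_q & \text{if the } k\text{-th element voted for another option in the first round,} \\ 0 & \text{otherwise.} \end{cases}$$

Therefore, we may also address the estimation of the percentage of potential voters for the $m$-th candidate in the second round by defining the following estimator:

$$\hat{R}_{m,cal} = \frac{\hat{t}_{y_m,cal}}{\hat{t}_{u,cal}} = \frac{\sum_{k \in s} w_k y_{mk}}{\sum_{k \in s} w_k u_k}.$$

Note that, if we use the chi-squared distance to solve the calibration problem, the estimator $\hat{t}_{y_m,cal}$ adopts the following form, since $t_z = 0$:

$$\hat{t}_{y_m,cal} = \hat{t}_{y_m,\pi} - \hat{t}_{z,\pi}' \hat{B}_{yz},$$

where

$$\hat{B}_{yz} = \left( \sum_{k \in s} \frac{d_k z_k z_k'}{c_k} \right)^{-1} \sum_{k \in s} \frac{d_k z_k y_{mk}}{c_k},$$

$\hat{t}_{z,\pi} = \sum_{k \in s} d_k z_k$ and $z_k = (z_{1k}, \dots, z_{Qk})'$. In the same way, we can define $\hat{t}_{u,cal}$ as follows:

$$\hat{t}_{u,cal} = \hat{t}_{u,\pi} - \hat{t}_{z,\pi}' \hat{B}_{uz},$$

where

$$\hat{B}_{uz} = \left( \sum_{k \in s} \frac{d_k z_k z_k'}{c_k} \right)^{-1} \sum_{k \in s} \frac{d_k z_k u_k}{c_k},$$

and the ratio estimator takes the following form:

$$\hat{R}_{m,cal} = \frac{\hat{t}_{y_m,\pi} - \hat{t}_{z,\pi}' \hat{B}_{yz}}{\hat{t}_{u,\pi} - \hat{t}_{z,\pi}' \hat{B}_{uz}}.$$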
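The construction of weights calibrated on several first-round ratios at once can be sketched as follows, under the chi-squared distance with $c_k = 1$. All data are hypothetical: ten respondents, three first-round options with assumed official shares 0.5, 0.3 and 0.2, and equal design weights. One assumption of this sketch differs from the text: when every respondent voted for exactly one option and the shares sum to one, the variables $z_{qk}$ are linearly dependent, so the code calibrates on $Q - 1$ ratios and lets the last one follow.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def calibrate_on_ratios(d, X, v, R):
    """Weights w_k = d_k (1 + lam' z_k) with z_qk = x_qk - R_q v_k and t_z = 0."""
    n, Q = len(d), len(R)
    Z = [[X[k][q] - R[q] * v[k] for q in range(Q)] for k in range(n)]
    A = [[sum(d[k] * Z[k][i] * Z[k][j] for k in range(n)) for j in range(Q)]
         for i in range(Q)]
    b = [-sum(d[k] * Z[k][i] for k in range(n)) for i in range(Q)]
    lam = solve(A, b)
    return [d[k] * (1.0 + sum(l * z for l, z in zip(lam, Z[k]))) for k in range(n)]


# Hypothetical poll: first-round vote for options A, B, C (4, 4 and 2 respondents).
first_round = ["A"] * 4 + ["B"] * 4 + ["C"] * 2
X = [[1.0 if f == "A" else 0.0, 1.0 if f == "B" else 0.0] for f in first_round]
v = [1.0] * 10                       # everyone voted in the first round
u = [1.0] * 10                       # everyone intends to vote in the second round
y1 = [1, 1, 1, 1, 1, 0, 0, 0, 1, 0]  # intends to vote for candidate 1
d = [100.0] * 10
R = [0.5, 0.3]                       # assumed shares of A and B; C is implied

w = calibrate_on_ratios(d, X, v, R)
t_v_cal = sum(wk * vk for wk, vk in zip(w, v))
share_a = sum(wk * xk[0] for wk, xk in zip(w, X)) / t_v_cal
share_b = sum(wk * xk[1] for wk, xk in zip(w, X)) / t_v_cal
p1_cal = sum(wk * yk for wk, yk in zip(w, y1)) / sum(wk * uk for wk, uk in zip(w, u))
```

After calibration the weighted first-round shares match the assumed official shares, and the same weights are then used for the second-round ratio `p1_cal`.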

Variance Estimator
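We propose a variance estimator for $\hat{R}_{m,cal}$ by using a Taylor approximation (see Särndal, Swensson & Wretman (2003) for detailed information). The ratio estimator $\hat{R}_{m,cal}$ can be approximated by

$$\hat{R}_{m,cal} \approx R_m + \frac{1}{t_u} \sum_{k \in s} d_k E_k,$$

with

$$E_k = (y_{mk} - z_k' B_{yz}) - R_m (u_k - z_k' B_{uz}),$$

where $R_m = t_{y_m} / t_u$, and $B_{yz}$ and $B_{uz}$ are the population counterparts of $\hat{B}_{yz}$ and $\hat{B}_{uz}$, respectively. As such, the approximate variance of $\hat{R}_{m,cal}$ is

$$AV(\hat{R}_{m,cal}) = \frac{1}{t_u^2} \sum_{k \in U} \sum_{l \in U} \Delta_{kl} \frac{E_k}{\pi_k} \frac{E_l}{\pi_l},$$

which can be estimated by

$$\widehat{Var}(\hat{R}_{m,cal}) = \frac{1}{\hat{t}_{u,cal}^2} \sum_{k \in s} \sum_{l \in s} \frac{\Delta_{kl}}{\pi_{kl}} \frac{e_k}{\pi_k} \frac{e_l}{\pi_l},$$

where $e_k = (y_{mk} - z_k' \hat{B}_{yz}) - \hat{R}_{m,cal} (u_k - z_k' \hat{B}_{uz})$.

Revista Colombiana de Estadística 39 (2016) 281-305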

Presidential Elections Held in Colombia (2014)
Presidential elections are the electoral mechanism through which citizens determine who will be the president of Colombia for a four-year period (Blais, Massicotte & Dobrzynska 1997). A candidate is elected in the first round when he or she obtains 50% of the total votes plus one (an absolute majority). If none of the candidates obtains the absolute majority, it is necessary to conduct a second round of voting: a runoff election. This includes the two candidates who obtained the most votes in the first round, as stated in Article 190 of the 1991 Colombian Constitution. Of the six presidential elections held since 1991, the second-round mechanism has been used on four occasions: in 1994, 1998, 2010 and, recently, in 2014. The exceptions occurred in 2002 and 2006, when the most popular politician in recent years, Álvaro Uribe Vélez, obtained an absolute majority in the first round with 53.04% and 62.35%, respectively. In Table 14, we show the results of the second rounds since 1991. We can conclude that the two candidates achieved quite similar numbers in all the second rounds, with the exception of 2010, when the candidate of the Colombian Green Party, Antanas Mockus, lost with 27.47% despite his popularity among young voters. Furthermore, the estimation of vote intention in the second round is also important because the candidates who go on to the second round form alliances with those who did not. These partnerships are important as the candidates try to win over these potential voters, who are the ones who will decide the winner of the second round. We applied the methods proposed in this article to the results of a survey in order to estimate the voting intention in the second round of the presidential elections held in Colombia in 2014. Such a survey must be conducted between the first and the second rounds by selecting a probability sample of voters according to a sampling design. Therefore, inclusion probabilities must be included, which allows us to make estimations about the population of potential voters in the second round.
In this section, we apply the proposed estimator to the Colombian presidential runoff election held in 2014. The first round was held on May 25th, 2014. The results of this first round are shown in Table 15. These results indicate that if Colombia used the simple majority system, the president would have been Zuluaga and not Santos, who, in fact, is the current president of Colombia. As stated above, the candidates involved in the second round were Santos and Zuluaga. The population of interest consisted of the voters who cast a valid vote in the first round, including votes that did not choose any candidate. This way, $N = 12{,}851{,}650$, of which 94% voted for a candidate while the other 6% did not vote for any candidate. Our goal was to estimate the number of people who planned on voting for Santos, Zuluaga or no candidate. This estimation is computed by constructing new weights $w_k$, which are created using the first-round voting rates for each candidate and for the vote for no candidate as auxiliary information.
We also used the results of one survey carried out between the first and the second electoral rounds. This sample contains the opinions of $n = 2594$ potential voters. We present the summary information of this survey in Table 16. Note that the first-round results are based on the actual voting of the respondents, whereas the second-round results are based on their intentions. In order to estimate the vote intention in the second round, and simultaneously calibrate over the known ratios of the first round, we defined the following calibration equations:

$$\hat{R}_{cal} = \left( \frac{\sum_{k \in s} w_k x_{1k}}{\sum_{k \in s} w_k v_k}, \dots, \frac{\sum_{k \in s} w_k x_{6k}}{\sum_{k \in s} w_k v_k} \right)' = (0.2925, 0.2569, 0.1552, 0.1523, 0.0828, 0.0599)',$$

where the weights $w_k$ are used to compute the proposed estimator for the total votes and the corresponding proportions given in equations (44) and (45), respectively. Additionally, it is also possible to calibrate by using the number of votes in the first round. That is, we computed the classical calibration estimator (CAL) using the calibration equation given by

$$\hat{t}_x = \left( \sum_{k \in s} w^*_k x_{1k}, \dots, \sum_{k \in s} w^*_k x_{6k} \right)' = (3759971, 3301815, 1995698, 1958414, 1065142, 770610)'.$$

We computed the Horvitz-Thompson (HT) estimator, the proposed estimator (CALR) and the classical calibration estimator (CAL), and we found the new weights $w_k$ and $w^*_k$ using the function calib from the package sampling (Tillé & Matei 2013). The dataset and the computational codes are available upon request from the main author.
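The relation between the two sets of benchmarks can be verified with simple arithmetic: the six first-round totals add up to the population size, and dividing each total by that sum gives the ratios used for calibration. The short sketch below performs this check; the variable names are illustrative, and the comparison assumes that the published ratios were truncated, not rounded, to four decimals.

```python
import math

# First-round totals used in the CAL calibration equation.
first_round_votes = [3759971, 3301815, 1995698, 1958414, 1065142, 770610]

N = sum(first_round_votes)                      # size of the population of interest
ratios = [t / N for t in first_round_votes]     # first-round shares

# Truncate to four decimals to compare with the ratios used for CALR.
truncated = [math.floor(r * 10_000) / 10_000 for r in ratios]
published = [0.2925, 0.2569, 0.1552, 0.1523, 0.0828, 0.0599]
```

With these inputs the total is 12851650 and the truncated shares coincide with the six calibration ratios, so both calibration equations carry the same first-round information, expressed once as ratios and once as totals.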
Table 17 presents the results of the estimation of potential voters per candidate for the second round using the new weights. We can see that all three estimators identified Santos as the winner of the election, which was the actual outcome. However, the HT estimator assigns a much higher percentage to the vote for no candidate than the other two estimators. The results of the CAL and CALR estimators, in this particular dataset, are similar. However, only the estimator proposed in this paper calibrates over the known ratios of the first round.

Discussion
In this paper, we have proposed a ratio calibration estimator that uses several known ratios as calibration constraints. From the empirical research, we found that the proposed estimator has a smaller variance than the Horvitz-Thompson estimator, and even a smaller one than the classical calibration estimator, for most simulation scenarios considered in this article. Furthermore, the proposed estimator is able to estimate the population totals with negligible empirical bias. We illustrated the particular usefulness of the proposed methodology in the runoff election system to estimate the vote intention in the second round. Despite the good performance of the proposed estimator, we noted that the estimated total number of voters is far smaller than the real one, and that the estimated vote for no candidate is too high. For future research, one way to improve the estimation of the voting intention in the second round could be to attempt to estimate the abstention percentage.
The proposed estimator can also be useful in other survey studies. For example, by taking into account the autocorrelation and seasonal behaviour of macroeconomic variables, we can use the unemployment rate of a particular month of the year as auxiliary information in order to estimate the current value.
In order to keep the model-consistency and design-unbiasedness of the calibration estimators, Brewer (1999) argued that the proper choice of $c_k$, as in equation (5), should be $d_k - 1$. For further work, the appropriateness of these scalars should be investigated. In terms of consistency, this approach can also be used jointly from a model-based perspective.
Further work on using this approach in the presence of non-response and frame imperfections is necessary. This methodology could also be used in surveys with multiple frames, as in the work of Elkasabi, Heeringa & Lepkowski (2015); its applicability, statistical properties and the effect of misclassified domains are of great interest for further investigation.

Table 10: Performance of the sampling estimators for model (41) for a sample size of n = 2000. The relative biases have been multiplied by 10000; the unit of the CV is %.

Table 11: Relative efficiency of the sampling estimators for the simple regression model considering a sample size of n = 400.

Table 12: Relative efficiency of the sampling estimators for the multiple regression model considering a sample size of n = 400.

Table 13: Latin American presidential elections held in 2014.

Table 14: Second round results in Colombia.

Table 15: Results of the first round of the Colombian presidential elections held in 2014.

Table 16: Results from the survey carried out with a total of 2594 persons.

Table 17: Estimations for the second round of the Colombian presidential elections held in 2014: estimated vote intention, proportion of estimated votes and its corresponding standard error (SE).