What Do We Learn About Voter Preferences From Conjoint Experiments?

Political scientists frequently interpret the results of conjoint experiments as reflective of voter preferences. In this paper we show that the target estimand of conjoint experiments, the AMCE, is not well-defined in these terms. Even with individually rational experimental subjects, unbiased estimates of the AMCE can indicate the opposite of the true preference of the majority. To show this, we characterize the preference aggregation rule implied by the AMCE and demonstrate several of its undesirable properties. With this result we provide a method for placing sharp bounds on the proportion of experimental subjects with a strict preference for a given candidate-feature. We provide testable assumptions that help reduce the size of these bounds. Finally, we offer a structural interpretation of the AMCE and highlight that the problem we describe persists even when a model of voting is imposed.

voter preferences and electoral outcomes.
The goal of factorial designs like those in forced-choice conjoint experiments is to mimic the comparisons individual voters make at the ballot box. By randomizing a large number of candidate and platform features, researchers seek to construct realistic approximations of the choices voters face. With a simple difference-in-means or least-squares regression, researchers compare the attributes of the candidates most frequently chosen to the attributes of the candidates least frequently chosen in order to make empirical claims about the preferences of voters.
For example, results from conjoint experiments are used to make claims about voters' preferences for particular policies, such as: "Americans express a pronounced preference for immigrants who are well educated, are in high-skilled professions, and plan to work upon arrival" (Hainmueller and Hopkins, 2015); and "[there is] strong evidence for progressive preferences over taxation among the American public" (Ballard-Rosa, Martin and Scheve, 2017). Even more frequently, conjoint results are used to make statements about candidates for elected office, such as: "voters prefer experienced or locally born politicians, but do not prefer politicians affiliated with a major political party... and are indifferent with regard to dynastic family ties and gender" (Horiuchi, Smith and Yamamoto, 2018); and "voters and legislators do not seem to hold female candidates in disregard; all else equal, they prefer female to male candidates" (Teele, Kalla and Rosenbluth, 2018). Put simply, political scientists use conjoint results to make statements about a binary preference relation for a representative voter in the context of elections. Researchers interpret findings from conjoints as evidence that candidates with particular features are most preferred and thereby more likely to win elections (Carnes and Lupu, 2016; Teele, Kalla and Rosenbluth, 2018). What is more, the conjoint method and this common interpretation have even migrated to the public discourse. CBS News and POLITICO, for example, have both highlighted results from conjoint experiments, asserting that the "[Democratic] party's primary voters prefer female candidates of color in 2020" (Magni and Reynolds, 2019) and that [Democratic] "voters showed a clear preference for females, all else equal" (Khanna, 2019). By way of example and formal proof, we show that the AMCE produces a representative voter that is uninformative with respect to empirical claims about electoral contests.
The AMCE is defined as the average effect of varying one attribute of a candidate profile, e.g. the race or gender of the candidate, from A to A′, on the probability that the candidate will be chosen by a respondent, where the expectation is defined over the distribution of the other attributes. To be clear, we do not dispute that the estimators proposed by Hainmueller, Hopkins and Yamamoto (2014) for this quantity are unbiased under their assumptions. Rather, we show that even when these assumptions hold, a positive AMCE of candidate-feature A over A′ does not indicate that: (1) a majority of voters prefer candidates with feature A to those with A′; (2) all else equal, the median voter prefers candidates with A to those with A′; or (3) candidates with feature A beat candidates with feature A′ in most elections.
This occurs because the AMCE averages over two aspects of individual preferences: their direction (whether or not an individual prefers A to A′) and their intensity (how much they prefer A to A′). Because the AMCE produces a literally average voter, it assigns greater weight to voters who intensely prefer a particular outcome, the consequence of which can be inaccurate out-of-sample predictions. For example, a large majority of people may have a strict preference for male candidates over female candidates, but the AMCE may be positive for female candidates if there is a small minority of voters who have an intense preference for women. Far from being a statistical accident, this structure undergirds numerous political questions where the direction and intensity of preferences are potentially correlated.
Our point is not merely semantic. In the field of market research, where the tools of conjoint experiments were first developed, scholars are typically interested in the demand for a given product, which is determined by both the intensive and extensive margins of consumer choice. By contrast, political scientists typically care about elections, which are won on the extensive margin. Indeed, outside of fantastical institutional designs (e.g., Lalley and Weyl, 2018), electoral contests are not swayed by how much a subset of voters prefers a given candidate but, rather, by how many voters have a strict preference for each candidate. By averaging over both margins of choice, the AMCE can prove uninformative with respect to the questions of interest to political scientists.
Since the objective of conjoint experiments is to construct a mapping from individual to aggregate preferences, we build on the literature in positive political theory that formally evaluates mechanisms that do just that. That is, we characterize the AMCE as a preference aggregation rule, a mapping from individual to aggregate preferences (Austen-Smith and Banks, 2000, p. 26). In doing so, we show that the AMCE is a perturbation of the Borda rule and, as such, inherits some of its undesirable properties. Namely, we demonstrate that the AMCE does not satisfy the majority or independence of irrelevant alternatives (IIA) criteria. In this paper we focus on violations of the majority criterion (a principle stating that if a majority of voters prefer a particular feature, the aggregation mechanism should select it) and only briefly discuss the implications of IIA violations for conjoint designs that restrict attribute-combinations from their randomization schemes.
Having characterized the preference aggregation rule of the AMCE, we then use results from this exercise to provide a method that, for a given AMCE estimate, allows researchers to place sharp bounds on the proportion of experimental subjects that maintain a strict preference for a candidate-feature. Using this method, we re-evaluate the findings of every conjoint experiment published in the American Political Science Review, American Journal of Political Science, and the Journal of Politics between 2016 and 2019 and show that, with two exceptions, their results are consistent with either the majority or the minority of respondents holding a strict preference for the candidate-feature that yielded each study's largest estimated effect.
Finally, we explore the relationship between the AMCE and a simple model of voting. In providing a structural interpretation of the AMCE we show that it reflects an average of individual ideal points over candidate-features. This highlights how conjoints combine information about both the intensity and direction of preferences and demonstrates the need to impose additional structure in order to obtain estimates of theoretically relevant quantities of interest. We conclude with some directions for future research on how to make conjoints more informative about voter preferences.

I. An Example
To start, we work through a toy example of how the AMCE aggregates preferences. We aim to make as few assumptions about the underlying preferences of individual voters as possible. While we view our assumptions as benign, we note that if the AMCE exhibits undesirable properties under these assumptions, placing even less structure will not rectify whatever problems we identify and will only obscure what drives these results. Furthermore, we emphasize that we are agnostic with respect to the content of voters' preferences.
Individuals may be self-interested, other-regarding, or some mixture thereof. We impose only that individual preferences are complete and transitive. Without completeness and transitivity we can learn about neither individual nor aggregate preferences. As such, these are the minimal assumptions about individual preferences we can make and still hope to recover meaningful insight into the AMCE.
Since, fundamentally, the object researchers seek to describe concerns a preference relation over candidate-features, the primitives we begin with are over these features. For simplicity, consider an electorate of five voters (V1, V2, V3, V4, V5), whose preferences over candidates we would like to study with a conjoint experiment. To eliminate concerns about estimation, suppose we can fully observe every potential choice between candidates made by every member of this population. In this world, there are two attributes of candidates that are important to voters: their gender (female or male), denoted by G ∈ {F, M}, and their age (old or young), denoted by A ∈ {O, Y}. Each candidate is an ordered pair of gender and age, so that there are four different candidate profiles: FO, FY, MO, and MY. The voters' preferences over attributes are a strict partial order ≻, and are given in Table 1.

We construct preferences over candidates from preferences over attributes in the following way: voters prefer candidates that have both of the attributes they like to those that have one attribute they like, which in turn they prefer to candidates who have neither of the attributes they like. Notice that there are two types of candidates that have only one attribute that matches a voter's preference. For these candidates, whether a voter prefers one or the other depends on which attribute the voter places a greater weight on.
For example, if a voter places more weight on gender, we would expect them to choose a candidate who has their preferred gender but not their preferred age over a candidate who has the voter's preferred age but not the gender.
Formally, such preferences over candidate profiles can be written as the lexicographic preference relation ≿, where for each voter one attribute is given a greater weight in determining the preference ordering.
Accordingly, we assume that voters 1, 2, and 5 place more weight on the candidate's age, A ≿ G, whereas voters 3 and 4 place more weight on the candidate's gender, G ≿ A. Combining weights with preferences over attributes, we can produce voters' preferences over candidate profiles. These are presented in Table 2. Given these preferences, in Table 3 we present the votes candidates would obtain in each head-to-head election for every possible pairwise comparison. Note that in this example women and men win the same number of elections (the winner is bolded in the first column).

The intuition behind the comparisons being made when estimating the AMCE is given in Table 4. Here, Y(C_1, C_2) denotes the fraction of votes that candidate C_1 obtains when run against candidate C_2. For each contest we can obtain Ȳ from the last column of Table 3. To obtain the AMCE for males we compare how male candidates (column 1) fare relative to female candidates (column 2) when they run against the same opponent, then sum this difference over all possible opponents. This sum is finally normalized by the number of possible profiles minus one (3) times the number of possible values for gender (2). The procedure yields an AMCE for male equal to −1/15, meaning that the average probability of being chosen is higher for female candidates than it is for male candidates.

Yet holding all else constant (in the case of this example, age), a male candidate would always win. Furthermore, women and men win an equal number of electoral contests. The AMCE produces an estimate that indicates the opposite of the true majority preference because the minority, who place the greatest weight on the gender dimension, also have a preference for female candidates, while the majority, who prefer men, do not place much weight on gender when making their decisions. When aggregating preferences over gender, the AMCE mechanically assigns greater weight to the minority that strongly prefer women.
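The arithmetic of this example can be checked by brute force. The following is a minimal Python sketch, not the authors' code. The gender preferences and attribute priorities follow the text; the age preferences are not stated in this section and are assumed purely for illustration (with lexicographic preferences they do not affect the gender AMCE). All names are hypothetical.

```python
from fractions import Fraction

# The four candidate profiles as (gender, age) pairs.
PROFILES = [(g, a) for g in "FM" for a in "OY"]

# Each voter: (preferred gender, preferred age, attribute given priority).
# Gender preferences and priorities follow the text; the age preferences
# are ASSUMED for illustration only.
VOTERS = [
    ("M", "O", "age"),     # V1
    ("M", "Y", "age"),     # V2
    ("F", "O", "gender"),  # V3
    ("F", "Y", "gender"),  # V4
    ("M", "O", "age"),     # V5
]


def utility(voter, profile):
    """Lexicographic ranking: a match on the priority attribute counts double."""
    pref_gender, pref_age, priority = voter
    gender_match = int(profile[0] == pref_gender)
    age_match = int(profile[1] == pref_age)
    if priority == "gender":
        return 2 * gender_match + age_match
    return 2 * age_match + gender_match


def vote_share(c1, c2):
    """Y(C_1, C_2): fraction of voters who choose c1 over c2."""
    votes = sum(utility(v, c1) > utility(v, c2) for v in VOTERS)
    return Fraction(votes, len(VOTERS))


def mean_share(gender):
    """Average vote share of profiles with `gender` against every other profile."""
    shares = [
        vote_share(p, q)
        for p in PROFILES if p[0] == gender
        for q in PROFILES if q != p
    ]
    return sum(shares) / len(shares)


amce_male = mean_share("M") - mean_share("F")
share_preferring_male = Fraction(sum(v[0] == "M" for v in VOTERS), len(VOTERS))
print(amce_male, share_preferring_male)  # -1/15 3/5
```

Under these assumptions the sketch returns an AMCE for male of −1/15 even though three of the five voters prefer male candidates.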
Crucially, this result is a feature of the target estimand and is not a problem of estimation. Our example is analogous to a survey in which each respondent is asked to evaluate all possible head-to-head comparisons.
To highlight this, we conduct a simulation exercise where we run a three-question conjoint experiment on a population characterized by the distribution of voter preferences in our toy example. That is, we take a population of five voters with the preferences detailed in Table 3. Then, we randomly construct pairs of candidates, perturbing their gender and age. Knowing voter preferences for candidate profiles we then obtain a winner in each contest and estimate the AMCE for male candidates. In Figure 1 we present results from conducting this exercise 1,000 times. Of course, because the AMCE is unbiased, the effect is centered on −1/15, despite being generated from a population of voters where 3/5 prefer men.

[Figure 1: simulated AMCE estimates for male candidates, where voters have the preferences in Table 3 and where candidates are randomly generated from a combination of gender and age.]
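A simulation along these lines might look like the sketch below. It is an illustration under assumptions rather than the procedure used for Figure 1: age preferences are assumed (as they are not given in this section), each question draws two distinct profiles uniformly at random, and the estimator is a plain difference in means. Function names are hypothetical.

```python
import random
from statistics import mean

PROFILES = [(g, a) for g in "FM" for a in "OY"]

# (preferred gender, preferred age, priority attribute); ages are ASSUMED.
VOTERS = [
    ("M", "O", "age"),
    ("M", "Y", "age"),
    ("F", "O", "gender"),
    ("F", "Y", "gender"),
    ("M", "O", "age"),
]


def utility(voter, profile):
    """Lexicographic ranking: a match on the priority attribute counts double."""
    pref_gender, pref_age, priority = voter
    gender_match = int(profile[0] == pref_gender)
    age_match = int(profile[1] == pref_age)
    if priority == "gender":
        return 2 * gender_match + age_match
    return 2 * age_match + gender_match


def one_experiment(rng, n_questions=3):
    """Run one conjoint survey and return the difference-in-means estimate."""
    chosen = {"M": [], "F": []}
    for voter in VOTERS:
        for _ in range(n_questions):
            p, q = rng.sample(PROFILES, 2)  # two distinct random profiles
            p_wins = utility(voter, p) > utility(voter, q)
            chosen[p[0]].append(int(p_wins))
            chosen[q[0]].append(int(not p_wins))
    if not chosen["M"] or not chosen["F"]:
        return None  # no contrast available in this draw
    return mean(chosen["M"]) - mean(chosen["F"])


def average_estimate(n_reps=1000, seed=0):
    """Average the estimated AMCE for male over repeated experiments."""
    rng = random.Random(seed)
    draws = [one_experiment(rng) for _ in range(n_reps)]
    return mean(d for d in draws if d is not None)


print(average_estimate())
```

With enough replications the average should sit close to −1/15, although each individual three-question survey is noisy.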

II. The AMCE as a Preference Aggregation Rule
In this section, we show that the above example is a general feature of the AMCE. To accomplish this we start by showing that the AMCE has a direct correspondence to the Borda rule, a voting system that assigns points to candidates according to their order of preference. Borda rule voting is implemented as follows. With n candidates, the Borda rule assigns zero points to each voter's least preferred candidate, one point to the candidate preferred to that but no other, and so on until the most preferred candidate receives n − 1 points. Thus for each voter, the Borda score contributed to a candidate corresponds to the number of other candidates to whom he or she is preferred. This in turn is equal to the number of times that candidate would be chosen if the voter was presented with every possible binary comparison. A candidate's Borda score is the sum of the individual Borda scores assigned to that candidate by each voter, and is equal to the total number of times that candidate would be chosen if each voter was subjected to each binary comparison. This is summarized in Lemma 1.

LEMMA 1: The Borda score of each profile is equal to the total number of times that profile is chosen in all pairwise comparisons.

PROOF:
All proofs are in the appendix.
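For intuition, the single-voter version of Lemma 1 can be verified exhaustively for a small set of profiles. This is a minimal sketch with hypothetical function names; it checks every strict ranking of the four profiles from Section I.

```python
from itertools import permutations


def borda_scores(ranking):
    """Borda points for one voter: n - 1 for the top choice down to 0."""
    n = len(ranking)
    return {cand: n - 1 - pos for pos, cand in enumerate(ranking)}


def pairwise_wins(ranking):
    """Times each candidate is chosen when the voter faces every pair."""
    pos = {cand: i for i, cand in enumerate(ranking)}
    return {
        cand: sum(pos[cand] < pos[other] for other in ranking if other != cand)
        for cand in ranking
    }


profiles = ["FO", "FY", "MO", "MY"]
all_match = all(
    borda_scores(r) == pairwise_wins(r) for r in permutations(profiles)
)
print(all_match)  # True
```

Summing either quantity over voters gives the aggregate statement in the lemma.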
In the context of conjoint experiments, we further define the Borda score of a feature as the sum of the Borda scores of each profile that has that feature. For example, the Borda score of "female" is the sum of the Borda scores of all female candidates. We can now state our result pertaining to the equivalence of Borda and AMCE.

PROPOSITION 1: The difference of the Borda scores of a feature and the benchmark is proportional to the AMCE of that attribute.
The proof of Proposition 1 follows from Lemma 1 and the observation that Borda and AMCE measure aggregate preferences in analogous ways. They both tally the number of alternatives that are defeated by candidates with a given feature, then use that tally to compare across features. The AMCE estimates are constructed by taking the difference of these tallies and normalizing them to be between −1 and 1. In the proof we formally walk through the steps of how to get to the AMCE from Borda counts, and produce the same expression as the AMCE in Equation 5 of Hainmueller, Hopkins and Yamamoto (2014). This equivalence is important, because it is well known in the social choice literature that the Borda rule has several undesirable properties. We have shown that these properties extend to the AMCE. For example, the Borda rule violates the independence of irrelevant alternatives (IIA) criterion, which states that the relative ranking of two candidates should not depend on the presence of another candidate. In the supplemental appendix we show that the AMCE violates IIA. That is, we demonstrate via a simple example that the AMCE of a given candidate-feature depends on the other feature-combinations included in the experiment.
In our example, the estimated AMCE on male versus female depends on the particular randomization of party and education. By restricting, for example, educated Republicans from the randomization scheme, the AMCE on male changes sign. Of course, this is deeply problematic for conjoint experiments that frequently exploit constrained randomizations.

In this paper we focus upon a second social choice property of the AMCE, one that it also inherits from the Borda rule, and show that it violates the majority criterion. This states that if a majority of voters prefer one candidate, then that candidate must win. Our example shows that this feature of the Borda rule extends to attributes, where a majority of voters prefer male candidates to female candidates, but the Borda score of F is greater than that of M. Here, we establish this result more generally. Specifically, we show that when a majority of voters prefer a feature, the AMCE may still indicate that the feature has a negative effect on the probability of being chosen. This discrepancy is driven by respondents assigning different weights, or importance, to attributes. For example, if respondents who like a feature also put more weight on it than those who dislike it, the AMCE estimate will be higher than the margin of respondents who strictly prefer that attribute. More importantly, a small minority that cares intensely about an attribute can overtake a much larger majority that has the opposite preference but cares less intensely about it. This may result in an AMCE in favor of the feature the minority prefers, even if that feature would in fact lead to a large electoral disadvantage between otherwise similar candidates.
We leverage the correspondence between the AMCE and the Borda rule to derive sharp bounds on the fraction of the population that prefers a feature over the benchmark and show that the potential divergence with AMCE grows in the number of unique candidate profiles, K. More precisely, for any given value of the AMCE for a feature, total number of candidate profiles, and the number of values the attribute can take, we define the maximum and minimum fractions of voters who prefer that feature over the benchmark attribute. These bounds are given in our next result.
PROPOSITION 2: Let y denote the fraction of voters who prefer t 1 over t 0 . Given an AMCE estimate of where τ is the number of distinct values the attribute of interest can take.
To find these bounds, we calculate the highest and lowest possible Borda scores a respondent can contribute to a feature as a function of the total number of possible profiles, and the number of distinct values the attribute of interest can take. We first assume that for all proponents of a feature, the attribute involved is the top priority. This means that all profiles with that feature are preferred to all profiles without that feature. This results in the highest possible Borda score to the feature, and minimum possible Borda score to the benchmark. Thus we obtain the maximum net Borda score a proponent can contribute to a feature.
In contrast, we assume for all opponents of that feature, the attribute has the lowest priority. This means where we formally state and carefully trace the arguments summarized here.
In Figure 2, we apply this proposition to compute the bounds for AMCEs of 0.05, 0.10, 0.15, and 0.25 for a binary feature, plotting the upper and lower bounds of the proportion of experimental subjects who prefer a binary feature on the y-axis against the number of potential candidate profiles that respondents can choose from on the x-axis. As the figure shows, even for AMCE estimates of a fairly large magnitude, it takes fewer than five possible profiles for these bounds to grow to a completely uninformative range.
Of course, nearly all conjoint experiments exceed five possible candidate profiles. For instance, with six attributes taking two possible values each (still a conservative design by recent standards), there are already 2^6 = 64 possible profiles. Only when the AMCE is extremely large (an effect size of 0.25, which is rarely achieved by anything other than controls such as a candidate's partisanship or experience) do the bounds become informative regarding the majority preference. Even then, if the feature of interest were ternary instead of binary, an AMCE of 0.25 would still be inconclusive.
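The closed-form expression in Proposition 2 is not legible in this text, so the sketch below is a reconstruction from the verbal argument only and should not be read as the authors' exact formula. It assumes that every voter strictly prefers one of the two features, that each attribute value appears in the same number of profiles, and that a voter's normalized net Borda contribution lies between 1/(K − 1) (attribute has lowest priority) and K(τ − 1)/(τ(K − 1)) (attribute has top priority). The function name and arguments are hypothetical.

```python
def majority_bounds(amce, n_profiles, n_values):
    """Range of the share of voters preferring the feature over the benchmark.

    Reconstructed from the extreme cases described in the text; an
    illustrative assumption, not the paper's stated expression.
    """
    K, tau = n_profiles, n_values
    low = 1 / (K - 1)                         # attribute ranked last in priority
    high = K * (tau - 1) / (tau * (K - 1))    # attribute ranked first in priority
    # Fewest proponents: each adds `high`, each opponent subtracts only `low`.
    y_min = (amce + low) / (low + high)
    # Most proponents: each adds only `low`, each opponent subtracts `high`.
    y_max = (amce + high) / (low + high)
    return max(0.0, y_min), min(1.0, y_max)


for tau in (2, 3):
    lower, upper = majority_bounds(0.25, 64, tau)
    print(tau, round(lower, 3), round(upper, 3))
```

Under these assumptions, an AMCE of 0.25 with 64 profiles gives a lower bound just above one half for a binary attribute and well below one half for a ternary one, in line with the discussion above.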
In Table 5, we conduct this exercise for every forced-choice conjoint experiment in the APSR, AJPS, and JOP published between 2016 and the first quarter of 2019. We construct our bounds for the largest estimated effect presented in each of these papers. In this way, for each paper, we provide the best possible case for informative bounds. Nevertheless, of the eleven papers we analyze, only two, those of Mummolo (2016) and Hemker and Rink (2017), prove informative with respect to a majority preference. In both of these papers, the effect sizes are quite large (0.30 and 0.33, respectively) and the number of possible candidate profiles is comparatively small (6 and 32, respectively). Furthermore, in both cases, the attribute of interest is binary; note that, by contrast, the very largest effect size of 0.35, found in Newman and Malhotra (2018), produces uninformative bounds due to the large number of possible profiles (over 120,000) and relevant features (9).
The bounding exercise we propose contains the entire range of preferences that are consistent with a given AMCE. In other words, the upper and lower bounds reflect a worst-case scenario for researchers, which is realized when preferences over features and weights over attributes are highly correlated. Thus, Proposition 2 underscores the dangers of making statements about aggregate preferences with so little structure on individual choices.

Of course, in reality, this correlation may not be so large. As such, researchers may want to know how the AMCE performs in the best-case scenario. We can use the logic underlying Proposition 2 to show that when voters have homogeneous weights (that is, when every respondent has the same priorities over attributes), the AMCE and the majority preference must point in the same direction. That is, when all subjects assign the same priority ranking to a binary attribute, we show that the sign of the AMCE must correspond to the sign of the margin of victory for the relevant feature over the baseline. Usefully for researchers, under these conditions the AMCE will be smaller in magnitude than the size of the margin, thus providing a downwardly biased, and therefore conservative, estimate for that quantity. Furthermore, we find that as the weight assigned to an attribute relative to other attributes grows, the distance between the AMCE and the size of the margin shrinks.
COROLLARY 1 (Homogeneous weights): When voters assign homogeneous weights to attributes, the AMCE of a binary attribute has the same sign as the majority preference, but underestimates the size of the margin.
The size of the underestimation grows as the relative weight assigned to the attribute of interest falls. In the limit, as the relative weight of the attribute of interest goes to zero, so does the AMCE, even when the margin is arbitrarily close to one.
The proof of Corollary 1 closely follows the logic of Proposition 2: when weights are identical across proponents and opponents, each proponent contributes as many net Borda points to a feature as an opponent takes away from it. As such, when the points contributed by proponents and opponents cancel out, the remainder corresponds to the margin of victory for the feature preferred by the majority. Because Borda scores are increasing in the weight assigned to an attribute, the remainder also increases with that weight. Thus the AMCE is sensitive to the weight assigned to that attribute and therefore does not capture the size of this margin, even when it has the correct sign.
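The corollary can be illustrated numerically. The sketch below assumes binary attributes and strictly lexicographic voters who all share one priority order; both are simplifying assumptions made for this example, and the function name is hypothetical. It computes the AMCE of a binary feature from Borda scores when seven of ten voters prefer it, once with the attribute as top priority and once as lowest priority.

```python
from fractions import Fraction
from itertools import product


def amce_homogeneous(n_attrs, rank, proponents, opponents):
    """AMCE of value 1 (vs. 0) on attribute 0 when every voter shares one
    lexicographic priority order with attribute 0 at position `rank` (1 = top)."""
    profiles = list(product((0, 1), repeat=n_attrs))
    K = len(profiles)
    others = list(range(1, n_attrs))
    order = others[:rank - 1] + [0] + others[rank - 1:]
    # With binary attributes, power-of-two weights reproduce the lexicographic
    # ranking, and the weighted sum of matches equals the Borda score.
    weight = {attr: 2 ** (n_attrs - 1 - pos) for pos, attr in enumerate(order)}

    def borda(profile, likes_feature):
        score = 0
        for attr, value in enumerate(profile):
            # Assumed: all voters prefer value 1 on the other attributes.
            preferred = 1 if (attr != 0 or likes_feature) else 0
            score += weight[attr] * int(value == preferred)
        return score

    net = 0  # Borda score of the feature minus Borda score of the benchmark
    for p in profiles:
        sign = 1 if p[0] == 1 else -1
        net += sign * (proponents * borda(p, True) + opponents * borda(p, False))
    n_voters = proponents + opponents
    return Fraction(net, (K // 2) * n_voters * (K - 1))


margin = Fraction(7 - 3, 10)
top = amce_homogeneous(3, 1, 7, 3)     # attribute is everyone's top priority
bottom = amce_homogeneous(3, 3, 7, 3)  # attribute is everyone's last priority
print(margin, top, bottom)  # 2/5 8/35 2/35
```

In both cases the AMCE has the same sign as the margin of 2/5 but is smaller, and it shrinks further as the attribute's priority falls.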

III. Structural Interpretation of the AMCE
Although the proposed estimator of the AMCE of Hainmueller, Hopkins and Yamamoto (2014) is "model free," in this section we demonstrate how it relates to an underlying model of choice. Our purpose in providing this simple structural interpretation of the AMCE is to illustrate from another angle the same aggregation problem that we have already identified in the preceding sections, wherein we cannot disentangle the intensity and direction of individual preferences. To start, consider two candidates c ∈ {1, 2} running in contest j who offer platforms x_ijc to voter i. A platform x_ijc is a vector of policies of length M that fully characterizes a candidate in contest j, which we will eventually recast as a vector capturing all the features (e.g. female, white, Republican) of that candidate. Let b_i represent an M-length vector of voter i's preferred policy locations (e.g., their issue-specific ideal points), and assume that voters have quadratic utility functions. Thus, voter i's utility is maximized when candidate c offers a platform that exactly matches her preferred policy positions, and the loss she obtains is a function of the distance between the candidate's policies and her ideal platform. Her utilities from Candidate 1's and Candidate 2's respective platforms are given by:

(1)

While the imposition of quadratic loss utilities may seem restrictive, in the appendix we show that our results are numerically identical if we assume an absolute linear loss utility function. Regardless, it follows that: where y_ij1 is a binary indicator that equals 1 when respondent i chooses Candidate 1 in contest j and 0 otherwise. Now consider data generated from a conjoint experiment, where x_ij1 and x_ij2 are vectors of randomized candidate attributes that have been discretized into binary indicators with an omitted category.
Typically, we would estimate Equation 2 with a probit or logit-like regression as is common in the discrete choice/voting literature. Instead consider a linear model of the form: Finally, averaging over all individuals, we obtain E(β_im) as the coefficient from the regression: where the estimated coefficient β̂_m recovers the AMCE for feature m.
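The regression equations themselves are not legible in this text, so the following sketch does not reproduce them. It only illustrates, under assumptions, the general point that a least-squares slope on a binary feature indicator is a difference in means: using the toy electorate of Section I (with assumed age preferences) and every ordered pair of distinct profiles, regressing the choice indicator on a male indicator returns the AMCE of −1/15. All names are hypothetical.

```python
from fractions import Fraction

PROFILES = [(g, a) for g in "FM" for a in "OY"]

# (preferred gender, preferred age, priority attribute); ages are ASSUMED.
VOTERS = [
    ("M", "O", "age"),
    ("M", "Y", "age"),
    ("F", "O", "gender"),
    ("F", "Y", "gender"),
    ("M", "O", "age"),
]


def utility(voter, profile):
    """Lexicographic ranking: a match on the priority attribute counts double."""
    pref_gender, pref_age, priority = voter
    gender_match = int(profile[0] == pref_gender)
    age_match = int(profile[1] == pref_age)
    if priority == "gender":
        return 2 * gender_match + age_match
    return 2 * age_match + gender_match


# One row per (voter, profile, opponent): x = 1 if the profile is male,
# y = 1 if the voter chooses the profile over the opponent.
x, y = [], []
for voter in VOTERS:
    for p in PROFILES:
        for q in PROFILES:
            if p != q:
                x.append(Fraction(int(p[0] == "M")))
                y.append(Fraction(int(utility(voter, p) > utility(voter, q))))

# Bivariate least-squares slope: cov(x, y) / var(x).
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
var_x = sum((xi - x_bar) ** 2 for xi in x)
beta_male = cov_xy / var_x
print(beta_male)  # -1/15
```

Because the slope averages over voters with different priorities, it blends direction and intensity in the same way as the AMCE.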

IV. Discussion and Recommendations
We have shown why the AMCE does not support most interpretations made by political scientists. A positive AMCE for a particular candidate-feature does not imply that the majority of respondents prefer that feature over the baseline. It does not indicate that they prefer a candidate with that feature to a candidate without it, all else equal. It does not mean that voters are more likely to elect a candidate with that feature than candidates without it. Furthermore, this is not the consequence of uncertainty introduced by sampling or measurement; all of it is inherent to the AMCE's properties as an aggregation mechanism.
Even when the universe of respondents is fully observed and every conceivable contest between candidates is assessed carefully and honestly, claims about voter preferences and electoral outcomes are not generally supported by the results from conjoint experiments.
Instead, what we have demonstrated is that the AMCE can be thought of as an average of the direction and intensity of voters' preferences, or essentially an average of ideal points. As a consequence, it can point in the opposite direction as the majority preference when there is a minority that intensely prefers a feature and a majority that feels the opposite, but less strongly. The larger the correlation between direction and intensity, the more misleading the AMCE. When it comes to the sorts of issues that interest political scientists (and for which conjoints are often deployed), such as gender parity in elected office (Teele, Kalla and Rosenbluth, 2018) or the sorts of people who should be favored by the nation's immigration policy (Hainmueller and Hopkins, 2015), this problematic preference structure is pervasive.
Building on well-known results from the social choice literature, we have derived sharp bounds on the proportion of a sample that prefers a feature based on a given AMCE. Unfortunately, the vast majority of findings published in the top political science journals in the past few years fail to support claims about majority preferences. That said, we have also shown that if there is no variation in preference intensity in the sample, then at the very least the sign of the AMCE indicates the majority preference.
Our findings leave us with three types of practical advice for applied researchers. First, we address how the discipline should assess the large body of research that has already been produced using the standard conjoint experiment framework. Then, we offer some guidance on how to design a conjoint experiment if a researcher wishes to use this framework, based on the results of our bounding exercise. We conclude with some promising avenues for future methodological research to strengthen the link between conjoint results and majority preferences.
Our first and most important point relates to the body of research already conducted using conjoint experiments. We strongly urge researchers to place the "representative voter" implied by the AMCE in the correct context and to use precise language when interpreting the results of conjoint experiments. While common interpretations such as "voters prefer A to A′" are not well-defined, colloquially they evoke some notion of a majority, one that is not supported by the typical estimand presented in conjoint analysis in political science. By the same token, political scientists should, on the whole, stop making inferences about electoral contests from the AMCE unless such claims are supported by further evidence about the distribution of voters' priorities. As a consequence, the discipline must reevaluate what we have learned from conjoint experiments with this clearer understanding of the AMCE in mind. We do not know whether most voters prefer male or female candidates; we have only learned that the "average preference" for women

Of course, there may be research questions for which conjoint designs are appropriate. For example, if preference intensity is an important object of inquiry, then conjoint experiments may prove a way forward.
Even then, researchers must be willing to make inter-personal utility comparisons. As such, when focusing on preference intensity, we recommend conjoint-like designs that recover a marginal willingness to pay for particular candidate features. Nevertheless, if researchers must rely upon a forced-choice conjoint experiment in the context of elections, our results indicate they should restrict themselves to conservative randomization schemes that limit the number of attributes and potential candidate-profiles. Still, as our bounding exercise demonstrates, even with a conservative design and a small number of binary attributes, the effect size that produces informative bounds is extremely high by social science standards.
If researchers want to make claims about majority preferences from conjoint experiments, one potential way forward may be to combine them with experiments designed to recover voters' priorities. As we have shown in Corollary 1, if respondents have homogeneous weights on the dimensions of choice, claims about a majority preference can be sustained with existing research designs. However, this may not be a fruitful avenue since the likelihood of homogeneous priorities in realistic political contexts is limited.
Finally, we suggest that researchers should be willing to accept stronger assumptions in exchange for the ability to make claims about electoral outcomes. A fully structural approach to conjoint analysis may prove best capable of combining the realistic approximations of candidates that randomizing a large number of candidate-features provides with an ability to make claims about electoral contests. By imposing and estimating a model of voter choice, researchers may be able to have their cake and eat it too.

Suppose there are $N$ voters and $K$ profiles. Consider voter $i$'s preference ranking over profiles. For any two profiles $x_m$ and $x_n$, let $\mathbb{1}_i(x_m, x_n) = 1$ if $i$ prefers $x_m$ to $x_n$, and $0$ otherwise. Without loss of generality, reorder the profiles such that the profile most preferred by $i$ is $x_1$, the second most preferred is $x_2$, and so on such that the least preferred is $x_K$. Assign $i$'s most preferred profile a Borda score of $b_i(x_1) = K - 1$, their second most preferred profile a score of $b_i(x_2) = K - 2$, and so on such that their least preferred profile has a score of zero. Notice that when $i$ is presented with each pairwise comparison, their most preferred profile $x_1$ will be chosen every time it is on the ballot, so $K - 1$ times. The second most preferred will be chosen in each pairing except the one with the most preferred profile, so $K - 2$ times, and so on. The aggregate Borda score of a profile is the sum of individual voters' Borda scores of that profile. When we sum across voters the times each profile $x_m$ is chosen in all pairwise comparisons, the sum must be equal to the sum of individual Borda scores. Formally,

$$\sum_{i=1}^{N} \sum_{n \neq m} \mathbb{1}_i(x_m, x_n) = \sum_{i=1}^{N} b_i(x_m).$$

PROOF OF PROPOSITION 1:
Recall that we defined the Borda score of a feature as the total number of times all the profiles with that feature are chosen in all pairwise comparisons. Formally, let the Borda score of a feature $t_1$, $B(t_1)$, be

$$B(t_1) = \sum_{x \in \kappa(t_1)} \sum_{i=1}^{N} b_i(x),$$

where $\kappa(t_1)$ denotes the set of all profiles that have the feature $t_1$. Dividing $B(t_1)$ by the total number of pairwise comparisons $t_1$ appears in, $|\kappa(t_1)| N (K-1)$, and taking the difference with the Borda score of the benchmark attribute $t_0$, divided by $|\kappa(t_0)| N (K-1)$, yields exactly the AMCE as defined in Hainmueller, Hopkins and Yamamoto (2014):

$$\pi(t_1, t_0) = \frac{B(t_1)}{|\kappa(t_1)| N (K-1)} - \frac{B(t_0)}{|\kappa(t_0)| N (K-1)}.$$
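As an illustrative check of this equivalence (a minimal sketch, not the authors' code), the following Python snippet enumerates every pairwise comparison for voters with arbitrary strict rankings and compares the resulting difference in choice frequencies with the normalized difference in Borda scores. The function names, the encoding of profiles as tuples of binary attributes, and the randomly drawn rankings are assumptions made purely for illustration.

```python
import itertools
import random
from fractions import Fraction


def amce_from_choices(rankings, profiles, attr, t1, t0):
    """AMCE as the difference in choice frequencies over all pairwise comparisons."""
    chosen = {t1: 0, t0: 0}
    shown = {t1: 0, t0: 0}
    for ranking in rankings:
        position = {p: r for r, p in enumerate(ranking)}  # 0 = most preferred
        for a, b in itertools.combinations(profiles, 2):
            winner = a if position[a] < position[b] else b
            for p in (a, b):
                level = p[attr]
                if level in shown:
                    shown[level] += 1
                    chosen[level] += int(p == winner)
    return Fraction(chosen[t1], shown[t1]) - Fraction(chosen[t0], shown[t0])


def amce_from_borda(rankings, profiles, attr, t1, t0):
    """AMCE as the normalized difference in aggregate Borda scores."""
    N, K = len(rankings), len(profiles)
    total = {t1: 0, t0: 0}
    for ranking in rankings:
        for pos, p in enumerate(ranking):
            if p[attr] in total:
                total[p[attr]] += K - 1 - pos  # individual Borda score b_i(x)
    size = {t: sum(1 for p in profiles if p[attr] == t) for t in (t1, t0)}
    return (Fraction(total[t1], size[t1] * N * (K - 1))
            - Fraction(total[t0], size[t0] * N * (K - 1)))


random.seed(0)
profiles = list(itertools.product((0, 1), repeat=3))  # K = 8 hypothetical profiles
rankings = [random.sample(profiles, len(profiles)) for _ in range(25)]
print(amce_from_choices(rankings, profiles, 0, 1, 0)
      == amce_from_borda(rankings, profiles, 0, 1, 0))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the two quantities can be compared for equality rather than up to floating-point tolerance.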

PROOF OF PROPOSITION 2:
Since we have already established the equivalence of Borda and AMCE in Proposition 1, we prove this proposition by finding the range of Borda scores of $t_1$ and $t_0$ that can be rationalized for some proportion of voters who prefer $t_1$ over $t_0$, and then inverting to find the minimum and maximum proportions for a given AMCE.
Let us find the minimum fraction of voters who prefer $t_1$ over $t_0$ that is consistent with an AMCE estimate.
Notice first that, for a fixed fraction of voters, the AMCE is maximized when voters in favor of $t_1$ assign the highest priority to the attribute and rank $t_1$ best and $t_0$ worst, whereas those who prefer $t_0$ rank $t_1$ second and assign the lowest priority to the attribute. In other words, when those who prefer $t_1$ rank all profiles with $t_1$ at the top and all profiles with $t_0$ at the bottom, this drives the AMCE estimate up. To help with the intuition, the preferences of such a voter $i$ might look like:

$$(t_1, x) \succ_i (t_1, x') \succ_i \cdots \succ_i (t_0, x) \succ_i (t_0, x') \succ_i \cdots$$

Holding constant the other features, the difference in Borda scores of a profile with $t_1$ and with $t_0$ is thus

$$b_i(t_1, x) - b_i(t_0, x) = \frac{K(\tau - 1)}{\tau}$$

for any arbitrary combination of other attributes, $x$. Since each voter makes $K/\tau$ such comparisons between $t_1$ and $t_0$, each voter who prefers $t_1$ maximally generates $K^2(\tau - 1)/\tau^2$ scores in favor of $t_1$.
Similarly, when those who prefer $t_0$ assign the lowest priority to this attribute, the preferences of such a voter $j$ might look like:

$$(t_0, x) \succ_j (t_1, x) \succ_j \cdots \succ_j (t_0, x') \succ_j (t_1, x') \succ_j \cdots$$

For these voters, holding constant the other features, the maximum difference is $b_j(t_1, x) - b_j(t_0, x) = -1$ for any arbitrary combination of other attributes, $x$. Therefore, each voter who prefers $t_0$ maximally generates $-K/\tau$ scores in favor of $t_1$. Thus, for a given AMCE $\pi(t_1, t_0)$, we can derive the minimum fraction $y$ of voters who prefer $t_1$ by summing these scores and normalizing by $|\kappa(t_1)| N (K-1) = (K/\tau) N (K-1)$:

$$\pi(t_1, t_0) \leq \frac{1}{K-1}\left[\, y\,\frac{K(\tau - 1)}{\tau} - (1 - y) \right].$$

Simple algebra reveals

$$y \geq \frac{\tau\left[\pi(t_1, t_0)(K-1) + 1\right]}{K(\tau - 1) + \tau}.$$

A very similar argument establishes the upper bound of $y$.
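The bounding argument can be sketched numerically. The Python function below is a hypothetical helper, not part of the original analysis: it uses the per-voter extremes stated in the proof (at most $K^2(\tau-1)/\tau^2$ net scores from each voter who prefers $t_1$, and at most $-K/\tau$ from each voter who prefers $t_0$) to compute the lower bound on $y$. The upper bound is obtained here by mirroring the same argument with the roles of the two groups reversed, which is an assumption about what the "very similar argument" delivers. The argument names `pi`, `K`, and `tau` follow the notation of the proof.

```python
from fractions import Fraction


def amce_bounds(pi, K, tau):
    """Bounds on the fraction y of voters who prefer t1 to t0, given an AMCE pi.

    K is the number of profiles and tau the number of values the attribute takes.
    Lower bound: t1-supporters weight the attribute maximally, t0-supporters minimally.
    Upper bound: the mirror case (assumed by symmetry, not stated in the text).
    """
    denom = K * (tau - 1) + tau
    lower = (pi * (K - 1) * tau + tau) / denom
    upper = (pi * (K - 1) * tau + K * (tau - 1)) / denom
    # A proportion must lie in [0, 1], so truncate the algebraic bounds.
    return max(lower, 0), min(upper, 1)


# Hypothetical design: two binary attributes (K = 4 profiles) and an AMCE of zero.
print(amce_bounds(Fraction(0), 4, 2))
```

In this hypothetical design, an AMCE of exactly zero is consistent with anywhere between one third and two thirds of voters preferring $t_1$, which illustrates why the sign of the AMCE alone does not identify the majority preference.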
■

PROOF OF COROLLARY 1:
When weights are homogeneous, all voters who prefer a feature contribute the same number of net points to it, whereas voters with the opposite preference each take away as many net points. Formally, suppose there is at least one voter $i$ who prefers $t_1$ to $t_0$, and swap labels if there is not. Then, for any $j$ who prefers $t_0$ to $t_1$, we have that for all combinations of other attributes $x$,

$$b_j(t_1, x) - b_j(t_0, x) = -\big(b_i(t_1, x) - b_i(t_0, x)\big).$$

Each voter makes $K/2$ comparisons involving $t_1$ and $t_0$. Therefore, if a fraction $y$ of voters prefer $t_1$ to $t_0$ and a fraction $1 - y$ prefer $t_0$ to $t_1$, we can write that for any $x$,

$$B(t_1) - B(t_0) = \frac{NK}{2}\,(2y - 1)\,\big(b_i(t_1, x) - b_i(t_0, x)\big). \tag{A.11}$$

We know from the proof of Proposition 2 that if $i$ prefers $t_1$ to $t_0$, the maximum value $b_i(t_1, x) - b_i(t_0, x)$ can take for a binary attribute is $K/2$, which obtains when the weight assigned to $t$ is so high that $i$ prefers any profile with $t_1$ to any profile without. The minimum value it can take is 1, which obtains when the weight assigned to $t$ is so low that there is no profile that is ranked lower than $(t_1, x)$ but higher than $(t_0, x)$. Moreover, $b_i(t_1, x) - b_i(t_0, x)$ is monotone increasing in the relative weight of $t$. This is because, as the weight of $t$ rises, profiles with $t_1$ and some less preferred values on other attributes become preferred to some profiles that have $t_0$ and more preferred values on other attributes with lower weights. This drives the ranking of all profiles with $t_1$ higher, and those with $t_0$ lower.

Next, recall that $\pi(t_1, t_0) = \frac{B(t_1)}{|\kappa(t_1)| N (K-1)} - \frac{B(t_0)}{|\kappa(t_0)| N (K-1)}$. Combining this with Equation A.11 and $|\kappa(t_1)| = |\kappa(t_0)| = K/2$ gives:

$$\pi(t_1, t_0) = \frac{(2y - 1)\,\big(b_i(t_1, x) - b_i(t_0, x)\big)}{K - 1}.$$

Since $b_i(t_1, x) - b_i(t_0, x) > 0$, $\pi(t_1, t_0)$ is positive if and only if $y > 1/2$. That is, under homogeneous weights, the AMCE returns a positive estimate for $t_1$ if and only if there are more people who prefer $t_1$ to $t_0$. When the weight assigned to $t$ is highest, we have $\pi(t_1, t_0) = \frac{K(2y - 1)}{2(K - 1)}$, and so the AMCE corresponds to roughly half the size of the margin. As attributes with higher weights are added, the AMCE falls. In particular, when the weight assigned to $t$ is lowest, we have $\pi(t_1, t_0) = \frac{2y - 1}{K - 1}$.
In this case, it is clear that for any $y$, as the number of profiles grows, the AMCE goes to zero in the limit.
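The two limiting cases of the corollary can be checked by brute force. The sketch below is illustrative only: it assumes lexicographic preferences over binary attributes as one simple way to represent homogeneous weights, treats attribute 0 as $t$, and lets voters differ only in which value of $t$ they prefer. The function names and the ten-voter electorate are hypothetical.

```python
import itertools
from fractions import Fraction


def lexicographic_ranking(profiles, preferred_t, t_first):
    """Rank profiles lexicographically, best first.

    Attribute 0 is t. It is either the first criterion (highest weight)
    or the last criterion (lowest weight), depending on t_first.
    """
    def key(p):
        t_ok = int(p[0] == preferred_t)  # 1 if the profile has the preferred value
        others = p[1:]
        return (t_ok,) + others if t_first else others + (t_ok,)
    return sorted(profiles, key=key, reverse=True)


def amce_t(rankings, K):
    """AMCE of t = 1 versus t = 0 as the normalized Borda score difference."""
    N = len(rankings)
    net = 0
    for ranking in rankings:
        for pos, p in enumerate(ranking):
            score = K - 1 - pos  # individual Borda score
            net += score if p[0] == 1 else -score
    return Fraction(net, (K // 2) * N * (K - 1))


profiles = list(itertools.product((0, 1), repeat=3))  # K = 8
K = len(profiles)
y = Fraction(7, 10)  # hypothetical: 7 of 10 voters prefer t = 1

for t_first in (True, False):
    rankings = ([lexicographic_ranking(profiles, 1, t_first)] * 7
                + [lexicographic_ranking(profiles, 0, t_first)] * 3)
    formula = (K * (2 * y - 1) / (2 * (K - 1)) if t_first
               else (2 * y - 1) / (K - 1))
    print(t_first, amce_t(rankings, K), formula)
```

With the same 70 percent majority in both cases, the computed AMCE matches the closed-form expression and is four times smaller when $t$ carries the lowest weight than when it carries the highest, even though the underlying majority is unchanged.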

PROOF THAT EQUATION 4 IS EQUIVALENT TO THE AMCE:
To show that the estimation of Equation 4 yields the AMCE, note first that Hainmueller, Hopkins and Yamamoto (2014) show that the following regression recovers an unbiased estimate of the AMCE:

$$y_{ijc} = \delta + x_{ijmc}\,\rho_m + \upsilon_{ijmc},$$

where $\hat{\rho}_m$ gives the AMCE for feature $m$. From the randomization of $x$, it follows from standard results that the vector of coefficients $\beta$ from Equation 4 can be obtained from the separate regression of the outcome $y_{ij1}$ on each column $m$ of the matrix $\Delta X_{ij}$, e.g. $y_{ij1} = \Delta x_{ijm}\,\beta_m + \epsilon_{ijm}$. It is sufficient to show that $\hat{\rho}_m = \hat{\beta}_m$.

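The above equation implies

$$\hat{\rho}_m = \frac{Cov(x_{ijmc}, y_{ijc})}{Var(x_{ijmc})}.$$

Similarly, estimating Equation 4 via least squares without an intercept implies, since $\Delta x_{ijm} = x_{ijm1} - x_{ijm2}$ has mean zero under randomization,

$$\hat{\beta}_m = \frac{Cov(\Delta x_{ijm}, y_{ij1})}{Var(\Delta x_{ijm})}.$$

Consider first the numerator:

$$Cov(\Delta x_{ijm}, y_{ij1}) = Cov(x_{ijm1}, y_{ij1}) - Cov(x_{ijm2}, y_{ij1}) = Cov(x_{ijm1}, y_{ij1}) + Cov(x_{ijm2}, y_{ij2}) = 2\,Cov(x_{ijmc}, y_{ijc}).$$

The second equality uses $y_{ij1} = 1 - y_{ij2}$ in a forced choice. The last line follows from the fact that $Cov(x_{ijm1}, y_{ij1}) = Cov(x_{ijm2}, y_{ij2})$. Next consider the denominator:

$$Var(\Delta x_{ijm}) = Var(x_{ijm1}) + Var(x_{ijm2}) - 2\,Cov(x_{ijm1}, x_{ijm2}) = 2\,Var(x_{ijmc}),$$

which again follows from the randomization of features. It directly follows that $\hat{\beta}_m = \hat{\rho}_m = AMCE$.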
■

PROOF OF THE EQUIVALENCE OF THE QUADRATIC LOSS AND ABSOLUTE LOSS:
Since $x_{j1}$ and $x_{j2}$ can take on only two values, $\{0, 1\}$, it follows that $x_{j1} \leq b_i \leq x_{j2}$ or $x_{j2} \leq b_i \leq x_{j1}$. This yields:

$$\Pr(y_{ij1} = 1) = \Pr\big(\eta_{j1} - \nu_{j2} < \Delta x_j (2b_i - 1)\big) \tag{A.14}$$

If we were to estimate this via a linear probability model, we obtain

$$y_{ij1} = \Delta x_j (2b_i - 1) + \eta_{ij} - \nu_{ij} = \Delta x_j \beta_i + \epsilon_{ij} \tag{A.15}$$

AN EXAMPLE OF THE AMCE VIOLATING IIA:
Consider three types of voters with preferences over three candidate-features, Gender ($M$ or $F$), Age ($O$ or $Y$), and Race ($B$ or $W$). Preferences over features are given in Table A1.