Managing spatial sustainability trade-offs: The case of wind power

The deployment of onshore wind power involves spatial sustainability trade-offs, e.g., between the minimization of energy system costs, the mitigation of impacts on humans and biodiversity, and equity concerns. We analyze challenges arising for decision-making if wind power generation capacity has to be allocated spatially in the presence of such trade-offs. The analysis is based on a game developed for and played by stakeholders in Germany. The results of the game illustrate that there is no unanimously agreed ranking of sustainability criteria among the participating stakeholders. They disagreed not only on the weights of different criteria but also their definition and measurement. Group discussions further revealed that equity concerns mattered for spatial allocation. Yet, stakeholders used quite different concepts of equity. The results support the importance of transparent, multi-level and participatory approaches to take decisions on the spatial allocation of wind power generation capacity.


Introduction
Onshore wind power is one of the key renewable energy sources that needs to be developed at large scale to decarbonize the energy sector (Rogelj et al., 2018). An important question discussed by scientists and policy-makers is where to site wind power generation capacity. Answering this question is non-trivial due to sustainability trade-offs. A sustainability trade-off implies that one sustainability criterion or objective can only be attained at the cost of impairing the achievement of another. For example, installing wind turbines at sites with the highest wind yields to minimize generation costs is often in conflict with a spatial allocation minimizing other energy system costs (e.g., related to the extension of networks and storage) (Agora Energiewende, 2013; Bucksteeg, 2019; Drechsler et al., 2017; Eriksen et al., 2017; Fürsch et al., 2013; Hagspiel et al., 2014; Schaber et al., 2012a; Schaber et al., 2012b; Schlachtberger et al., 2017; Schmid and Knopf, 2015). Similarly, spatial sustainability trade-offs may arise with respect to minimizing adverse impacts on humans and ecosystems (Drechsler et al., 2011; Eichhorn et al., 2019; Eichhorn et al., 2017; Gauglitz et al., 2019; Latinopolous and Kechagia, 2015) or concerns of spatial equity (BMWi, 2017c; Drechsler et al., 2017; Sasse and Trutnevyte, 2019).
Which spatial allocation of wind power generation capacity is considered as sustainable (or optimal) therefore crucially hinges on which criteria are considered and how they are weighted. Against this background, we analyze the differing arguments and associated challenges that arise for practical decision-making if wind power generation capacity has to be allocated in space in the presence of sustainability trade-offs. We aim to understand whether there is a generally accepted definition and ranking of sustainability criteria that are used to derive an optimal spatial allocation of wind power generation capacity. In this respect, we are also interested in the relative importance of arguments of efficiency (minimization of aggregate costs or impacts) and equity (distribution of costs or impacts).
These restrictions are partly overcome by multi-criteria decision analyses (MCDA), which do not necessarily rely on monetization. Instead, MCDA typically build on physical measures of impacts of deploying wind power (e.g., distances to settlements or habitats), and thus allow for a broader variety of sustainability criteria to be included. The Achilles heel of these studies is the weighting of the different criteria. MCDA results typically respond very sensitively to changes in weights (Baban and Parry, 2001; Gorsevski et al., 2013; Sánchez-Lozano et al., 2016; Tegou et al., 2010). Despite this insight, existing MCDA studies often make quite strong assumptions regarding weights. In a pragmatic approach, some analyses attach the same weights to all criteria (Baban and Parry, 2001; Egli et al., 2017; Eichhorn et al., 2019; Eichhorn et al., 2017; Huber et al., 2017; Kienast et al., 2017; Latinopolous and Kechagia, 2015; Sánchez-Lozano et al., 2016; Tegou et al., 2010; Watson and Hudson, 2015). If differentiated weights are used, these are often chosen explicitly or implicitly by the authors themselves (Atici et al., 2015; Baban and Parry, 2001; Baseer et al., 2017; Hanssen et al., 2018; Jahangiri et al., 2016; Janke, 2010; Rodman and Meentemeyer, 2006; Tegou et al., 2010; van Haaren and Fthenakis, 2011; Villacreses et al., 2017) or by small groups of experts (sometimes only one) (Asakereh et al., 2017; Gigović et al., 2017; Gorsevski et al., 2013; Höfer et al., 2017; Jangid et al., 2016; Sánchez-Lozano et al., 2014; Sánchez-Lozano et al., 2016; Watson and Hudson, 2015). If multiple experts are interviewed, the average or jointly agreed ranking is used. Overall, these studies seem to assume that an unequivocal ranking of sustainability criteria can be identified to optimize the spatial allocation of wind power deployment.[1] Our study challenges this assumption. We aim to understand potential difficulties and disagreement regarding the ranking of sustainability criteria.
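The sensitivity of MCDA results to the chosen weights can be illustrated with a minimal weighted-sum sketch. Weighted-sum aggregation is assumed here as one common MCDA rule; the three regions, their normalized scores, and the weight sets are hypothetical and not taken from any of the cited studies.

```python
# Hypothetical, normalized criterion scores (1.0 = best) for three regions.
# All numbers are invented for illustration only.
SCORES = {
    "region_a": {"wind_yield": 0.9, "load_proximity": 0.2, "low_conflict": 0.6},
    "region_b": {"wind_yield": 0.5, "load_proximity": 0.9, "low_conflict": 0.4},
    "region_c": {"wind_yield": 0.6, "load_proximity": 0.5, "low_conflict": 0.9},
}


def rank_regions(weights):
    """Rank regions by weighted sum of scores, best first."""
    total_weight = sum(weights.values())
    # Normalize weights so that they sum to one.
    norm = {criterion: w / total_weight for criterion, w in weights.items()}
    totals = {
        region: sum(norm[c] * scores[c] for c in norm)
        for region, scores in SCORES.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Three hypothetical weight sets: equal weights, and one criterion tripled.
equal = rank_regions({"wind_yield": 1, "load_proximity": 1, "low_conflict": 1})
yield_first = rank_regions({"wind_yield": 3, "load_proximity": 1, "low_conflict": 1})
load_first = rank_regions({"wind_yield": 1, "load_proximity": 3, "low_conflict": 1})

print("equal weights:", equal)
print("wind yield prioritized:", yield_first)
print("load proximity prioritized:", load_first)
```

In this constructed example, each weight set puts a different region in first place, which is the kind of sensitivity the MCDA literature reports.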
To analyze these challenges for decision-making, we developed a transdisciplinary game and played it with German stakeholders from administration, industry, civil society, science, and intermediary organizations who were frequently confronted with questions of the spatial allocation of wind energy. Participants were asked to allocate wind power generation capacity across German States to attain a pre-defined generation target. When taking their decision, they were provided with State-specific, spatial information on land availability, wind yield, electricity demand, and nature conservation risks. In addition, we asked participants to discuss potential spatial equity concerns. Both the course of discussion in different groups as well as the respective quantitative outcomes were evaluated with respect to similarities and differences in the ranking of the different sustainability criteria.
The remainder of the paper is organized as follows: Section 2 introduces the methodological approach in more detail. Section 3 presents the quantitative and qualitative results of our game. Section 4 discusses implications of our game for identifying optimal allocations of wind power deployment. Section 5 concludes with implications for future research and policy-making.

[1] To be fair, many MCDA studies rather aim to develop methodology. The empirical applications in these studies, including weights, are often primarily used for illustrative purposes.


Method

Function of the game and setting
To understand the challenges arising for the spatial allocation of wind power generation capacity across regions in the presence of sustainability trade-offs, we decided to involve stakeholders in our research. The involvement of stakeholders aimed at including different types of knowledge about challenges in practical decision-making and how to solve them (Cash et al., 2006).
We used a game to involve stakeholders. Games provide a multitude of benefits to further our understanding. According to Bassi et al. (2015), games help to engage stakeholders by helping to close the gap between different forms of knowledge. Bots and van Daalen (2007) highlight that games support the analysis of a policy issue when it is not possible to study the real system and hence when it is not possible to study alternative solutions. In our case, we used the game as a design studio to understand how wind turbines may be spatially allocated if sustainability trade-offs (and synergies) arise. We chose a game because it has advantages over simple brainstorming, i.e., providing wild and unexpected answers in large quantity, and other selection techniques, e.g., rankings (for an application to the energy transition in general, see Joas et al., 2016). In particular, the game allows going beyond listings to 'capture and integrate both the technical-physical and the social-political complexities of policy problems' (Mayer, 2009, p. 825). In doing so, games provide a space for problem-solving (Brynen and Milante, 2012). We assumed that playing the game, i.e., deep-diving into the complexities, made participants more aware of relevant trade-offs and prompted them to search for solutions that have fewer side effects according to their perception (Bots and van Daalen, 2007). Thereby, the game offers participants the opportunity to enter in-depth discussions and reveal their respective rankings of different sustainability criteria as well as clarify arguments behind different points of view (Gandziarowska-Ziołecka and Stasiak, 2017). Consequently, the eventually revealed rankings of sustainability criteria may be more reliable (e.g., Bots and van Daalen, 2007; Mayer, 2009).
To understand different arguments and perspectives, we decided to invite a broad range of stakeholders from Germany who were knowledgeable about the problem of spatially allocating wind power generation capacity. Stakeholders were invited from administration, industry (associations), civil society, science, and intermediary organizations to cover a range of different values and perspectives. In total, 27 stakeholders agreed to play the game with us during a joint workshop. Apart from civil society, where none of the invited representatives accepted our invitation, we had a rather good coverage of the different sectors. Stakeholders also represented different levels of decision-making, from local and regional level planning to national level policy-making.
We chose a realistic representation of a physical system, i.e., we asked the invited stakeholders to discuss how wind power generation capacity should be allocated to States on a map representing Germany, taking into consideration different sustainability criteria (see Figure 1 and below). We invited the stakeholders to play the game in five small groups of 4 to 6 participants. Groups were composed of stakeholders from different stakeholder groups to incorporate different types of expertise and perspectives, to facilitate reflection and creativity, and to foster mutual learning (Douven et al., 2014).
Reflection and learning were further encouraged in a round of discussions after the game was finished, when the players were asked to review their experience concerning the trade-offs they encountered during the game.

Rules of the game
For each group, the goal of the game was the same: Allocate as much wind power generation capacity (in GW) across German States as it takes to generate 200 TWh per year. This figure roughly corresponds to the wind power generation deemed necessary for 2030 to attain Germany's long-term renewable energy targets (see, e.g., Agora Energiewende, 2013). Groups were asked to take a greenfield approach, assuming that the generation capacity installed in each State was zero at the start of the game. The conversion of generation capacity into power generation was assisted by an Excel tool, considering the average wind yield in each State.
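The conversion from installed capacity to annual generation can be approximated as follows. This is a sketch only, not the Excel tool used in the game: it assumes that the average wind yield of a State is expressed as annual full-load hours, and the three regions and all numbers are hypothetical.

```python
# Hypothetical average wind yield per region, in full-load hours per year.
FULL_LOAD_HOURS = {"north": 3000, "centre": 2400, "south": 2000}


def annual_generation_twh(capacity_gw, full_load_hours):
    """Annual generation in TWh: GW * h = GWh, divided by 1000."""
    gwh = sum(capacity_gw[r] * full_load_hours[r] for r in capacity_gw)
    return gwh / 1000.0


def implied_full_load_hours(generation_twh, capacity_gw):
    """Average full-load hours implied by a generation target and a capacity."""
    return generation_twh * 1000.0 / capacity_gw


# Two hypothetical allocations (in GW) that both reach 200 TWh.
north_heavy = {"north": 40, "centre": 25, "south": 10}
south_heavy = {"north": 20, "centre": 30, "south": 34}

for name, alloc in [("north-heavy", north_heavy), ("south-heavy", south_heavy)]:
    total_gw = sum(alloc.values())
    twh = annual_generation_twh(alloc, FULL_LOAD_HOURS)
    print(f"{name}: {total_gw} GW installed, {twh:.1f} TWh per year")
```

With these invented numbers, the allocation concentrated in the windier region reaches the target with 75 GW, while the other needs 84 GW. Applying `implied_full_load_hours` to the group totals reported in the results (79 GW and 87 GW for 200 TWh) gives averages of roughly 2,530 and 2,300 full-load hours, respectively.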
To take their decisions, all groups got the same information, as displayed in Figure 1 (the "game board"). The pieces of information illustrated on the game board were extensively explained before the game by experts in the respective areas. When allocating generation capacity spatially, groups first of all had to respect the potential for wind power generation capacity in each State as an allocation constraint (see Appendix 1 for details on data). In addition, they were asked to consider information on four types of variables representing four different sustainability criteria (see also Appendix 1 for further details on data): (1) wind yield, (2) load proximity, (3) nature and landscape conflicts as well as (4) equity. Regarding the latter, no specific measure was proposed. Instead, groups were asked to specify equity, if they considered this criterion as important. All groups had 30 minutes to discuss and agree on one spatial allocation of generation capacity. Installed capacity in each State was marked by pins on the game board, which allowed for constant readjustments throughout the time of discussion.
We also asked each group to document and explain the definition and ranking of sustainability criteria they relied on for their decision, once they had realized that there were trade-offs between the criteria.
After this group discussion, a participant of each group presented the results to the other groups.
Subsequently, we organized a second round of group discussions, where members of different groups were brought together to reflect on (a) the trade-offs that showed up in the game, (b) the different arguments and corresponding difficulties and opportunities to identify priority areas for future wind power deployment, and (c) resulting policy implications.

Data, documentation and analysis
During the game, we were thus able to collect three types of material:
• quantitative data on the spatial allocation of wind power generation capacity chosen by each group
• the documentation by the groups themselves summarizing definitions, the discussion process, and explicit rankings of the sustainability criteria, as well as the documentation of the second, reflexive group discussion
• recordings of the group discussions, capturing details not included in the groups' documentations
All data were entered either in Excel tables or Word documents to make them accessible for further analysis. Based on the available quantitative data, we computed a correlation coefficient, which measures the strength of the relationship between the amount of wind power generation capacity allocated by each group to each State and the respective State values of the variables representing different sustainability criteria (wind yield, load proximity, nature and landscape conflicts) as well as the State potentials of wind power generation capacity.
We complemented the quantitative analysis by qualitative evidence. For this purpose, we transcribed the recordings of the group discussions. The transcriptions were analyzed to cross-check whether they support, specify, or qualify the definitions and rankings of sustainability criteria obtained from the quantitative analysis and the self-reported documentation. We outline in the following section to what extent the results derived from these different types of material were consistent or contradictory.

Spatial allocation of wind power generation capacity chosen by groups
Figure 2 illustrates the wind power generation capacity allocated by the five groups to each German State (sorted from North to South). It highlights some similarities across groups. All groups allocated at least 3 GW to each State. The State of Lower Saxony/Bremen received the highest amount of capacity in all groups. Except for group 4, the spatial allocations of all groups exhibited at least a slight north-south divide.
Apart from these few similarities, differences in group results were substantial. In absolute terms, capacity allocations to individual States varied between groups by 2 to 7 GW (recall that 2 GW correspond to roughly 500 modern wind turbines). For some States, this range amounts to up to 30 percent of the respective States' potential for wind generation capacity (in the States of Schleswig-Holstein/Hamburg and North Rhine-Westphalia). Differences in the spatial allocation between groups also explain the variation in overall amounts of generation capacity needed to satisfy the national 200 TWh generation target (last column of the data table in Figure 2). Groups allocating more capacity to windier States in the North of Germany (e.g., group 1) needed to install less capacity to attain the 200 TWh target.

Ranking of sustainability criteria
In this section, we explore whether the observed variation in the spatial allocation of wind power generation capacity across groups may be due to structural differences in the ranking of sustainability criteria. We summarize the recorded line of argument in each group. This is subsequently related to the self-reported ranking of sustainability criteria (Table 1) and the correlation coefficient of each group's spatial allocation with the different indicators discussed (Table 2). During the discussion, groups 4 and 5 mentioned the criteria of grid expansion and current allocation of wind turbines, which they considered as relevant in addition to those explicitly introduced for the game.
[Table 1: Self-reported ranking of sustainability criteria by group]
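The correlation analysis behind Table 2 can be sketched as follows. The text does not preserve the formula that was used, so the Pearson coefficient is an assumption here; the five States and all values are hypothetical and do not reproduce the game data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical values for five States, listed in the same order.
allocation_gw = [12, 9, 7, 5, 4]                # capacity chosen by one group
wind_yield = [3000, 2800, 2500, 2200, 2000]     # full-load hours (assumed unit)
conflict_index = [1, 2, 3, 4, 5]                # higher = more conflicts

r_yield = pearson(allocation_gw, wind_yield)
r_conflict = pearson(allocation_gw, conflict_index)
print(f"correlation with wind yield: {r_yield:.2f}")
print(f"correlation with conflicts:  {r_conflict:.2f}")
```

In this constructed case, the allocation follows wind yield closely and is therefore also negatively correlated with the conflict index, because the two hypothetical indicators move in opposite directions across States. A correlation of this kind describes the outcome of an allocation and does not by itself show which criterion a group actually used.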

Group 1
Group 1 first allocated capacity to the Northern States, arguing that these exhibited high wind yield, a high potential of wind power generation capacity as well as relatively low nature and landscape conflicts. Subsequently, additionally necessary capacity was spread across the remaining States from North to South. For this purpose, the group equally considered load proximity, wind yield, and nature and landscape conflicts. Moreover, the group accounted for equity concerns as it made sure that at least a minimum amount of capacity was installed in every State. Therefore, the final spatial allocation chosen by group 1 exhibits a strong positive correlation with wind yield. This is consistent with the self-reported ranking of criteria. As a consequence, group 1 installed the least capacity (79 GW) among all groups to achieve the 200 TWh generation target.

Group 2
Group 2 started off allocating capacity proportionally to load proximity. Load proximity was in fact understood as a measure of equity: The group argued that those States consuming more power should also contribute more to producing wind power. Subsequently, the group readjusted the spatial allocation for wind yield and the potential of wind power generation capacity. The group argued that this step reflected another dimension of equity: States with a higher potential should contribute more to producing wind power. Nature and landscape conflicts were hardly considered. The group discussed that those conflicts were already properly accounted for when the potential of wind power generation capacity was determined (see Appendix for details). This ranking of sustainability criteria was also self-reported by the group. It explains why the spatial allocation chosen by group 2 is primarily correlated with load proximity.

Group 3
As a first step, group 3 allocated capacity based on load proximity. As in group 2, load proximity was understood as an indicator of an equitable allocation. As a second step, additionally necessary capacity was installed in States with high wind yield and high potential of wind power generation capacity. During this step, the group also pursued a second equity objective: They made sure that at least some capacity was installed in every State. As a third and final step, some capacity was reallocated from South to North to mitigate nature and landscape conflicts. The importance of load proximity for spatial choice is confirmed by the self-reported ranking of criteria. However, this ranking is not in line with the quantitative data, which show the strongest correlation with wind yield and nature and landscape conflicts. Possibly, the readjustments of capacity undertaken in steps two and three eventually overruled the initial allocation of capacity following load proximity.

Group 4
Group 4 started with a perfectly equal allocation of capacity: each State initially received 5 GW.
Subsequently, this allocation was adjusted to better consider load proximity. In this context, the group also discussed potential grid expansion requirements. Nature and landscape conflicts were ignored when capacity was allocated. The group argued that an aggregate consideration of this criterion at the State level is not meaningful, inter alia, because there is significant intra-state variation of nature and landscape conflicts. Instead, this criterion should only be accounted for when specific siting decisions are taken at the local level. This ranking of criteria was also self-reported by the group after the game.
It explains why the eventually chosen spatial allocation is strongly correlated with load proximity.
Interestingly, group 4 was the only group for which wind yield did not play any role. In fact, their eventual spatial allocation was negatively correlated with wind yield and did not exhibit the North-South divide of the other groups. Consequently, this group installed the most capacity (87 GW) of all groups to attain the 200 TWh generation target. Table 2 also illustrates that the allocation chosen by group 4 exhibits the strongest correlation with the potential of wind power generation capacity.
Presumably, this criterion at least implicitly guided the group's decisions, even though it was not explicitly reported as relevant.

Group 5
Notably, stakeholders in group 5 were well informed about the existing allocation of wind power generation capacity in Germany. Consequently, they allocated capacity to each State based on existing capacity plus an add-on. The add-on depended on the respective level of wind yield and load proximity.
The group started with States exhibiting high load (North Rhine-Westphalia, Baden-Württemberg, Bavaria) and continued with States further North and East. Equity and nature and landscape conflicts only played a minor role for allocation decisions in the discussion. This procedure is partly reflected in the self-reported ranking. Even though load proximity did not show up in the ranking, it may be represented by the criterion of grid expansion. In any case, the group's allocation is most strongly correlated with wind yield, load proximity and nature and landscape conflicts. The correlation with wind yield may have also occurred because the group's decisions were based on existing capacity, which has been strongly correlated with wind yield in Germany in the past (e.g., Lauf et al., 2020).

Disagreement between groups on the ranking of sustainability criteria
The rankings of sustainability criteria exhibit some similarities across the groups. Wind yield and load proximity were the most important criteria in most groups. Similarly, equity concerns mattered in all groups (see below). In contrast, nature and landscape conflicts were of lower importance for group outcomes. These observations notwithstanding, our results indicate that the variation in the spatial allocation of wind power chosen by the groups can be explained by differences in the weighting of sustainability criteria.
Two alternative general lines of argument can be observed. One set of groups favored a spatial allocation based primarily on wind yield (groups 1, 3 and 5). These groups allocated most capacity to Northern and Eastern States. The analysis of the group recordings suggested that the underlying intention was to reduce the overall amount of capacity necessary to attain the generation target of 200 TWh. The spatial allocation chosen by these groups also exhibited a strong negative correlation with nature and landscape conflicts. At first glance, this result could suggest that these groups paid a lot of attention to avoiding nature and landscape conflicts. An alternative interpretation stems from the fact that wind yield and nature and landscape conflicts were strongly negatively correlated in our game set-up. In other words, strong synergies existed between minimizing generation costs and nature and landscape conflicts.
The other set of groups attached more priority to load proximity (groups 2 and 4). Consequently, relatively more generation capacity was allocated to Western and Southern States. One motivation behind this ranking was to reduce energy system costs other than generation costs, like grid expansion and balancing costs. In addition, these groups also understood load proximity as an indicator of an equitable spatial allocation (see next section).
Interestingly, differing rankings of wind yield and load proximity mirror the inconclusive results of quantitative studies for Germany. A study by Agora Energiewende (2013) finds that spatial optimizations of wind power generation capacity based on wind yield and load proximity lead to similar total energy system costs. Bucksteeg's (2019) results indicate that the spatially optimal allocation of capacity in terms of total energy system costs is close to the one that minimizes generation costs. In contrast, Drechsler et al. (2017) show that considering grid expansion costs shifts optimal wind power deployment to the South.
More fundamentally, groups disagreed on which criteria to include at all when discussing the optimal spatial allocation of wind power generation capacity. This disagreement materialized, for example, with respect to the criterion of nature and landscape conflicts. Group 2 abstained from considering this criterion, arguing it was already properly accounted for when the potential of wind generation capacity was determined for each State. Other groups also added criteria (grid expansion, current allocation of generation capacity), which they considered important for choosing their spatial allocation. Thinking beyond the set-up of our game, similar disagreement can be expected with respect to other criteria not yet included in our investigation. A prime example may be the consideration of impacts on local residents. In order not to overload our game, we accounted for this criterion only when determining the potential of wind power generation capacity. In this context, legally set minimum distances of wind turbines to settlements constrained the availability of sites for installing wind turbines. Yet, it may be debatable whether this approach is sufficient to properly account for impacts on local residents, also in the light of mixed empirical evidence regarding the relationship between social acceptance of wind power and its spatial allocation (Ellis and Ferraro, 2016). Importantly, the spatial scale at which the allocation decisions were taken also mattered for the ranking of criteria. In our game, this became obvious in group 4's decision to ignore the criterion of nature and landscape conflicts. The group argued that the conflicts cannot be properly measured by an aggregate indicator at the State level because intra-State variation may matter as much as inter-State variation. For them this criterion would potentially gain more importance if local allocation decisions were to be taken.
The weighting of a sustainability criterion may therefore also hinge on how well its spatial heterogeneity can be accounted for at the spatial scale of the decision problem at hand.
But apparently this was viewed differently by other stakeholders who played our game. In other words, there may also be disagreement on what a proper spatial scale for assessing sustainability trade-offs is.
Finally, the ranking of sustainability criteria may depend on the underlying assumptions made by stakeholders regarding the future development of the entire energy system. In our game, this mattered, for example, when stakeholders assessed the importance of the criterion load proximity.
For this purpose, assumptions needed to be made about the future use of other renewable and nonrenewable energy sources as well as on the development of grids, storage and demand. These aspects were not systematically considered in our game. This notwithstanding, related arguments and assumptions were made explicit in the group discussions and varied between groups.

Importance of equity concerns
We explicitly asked all groups to also consider possible equity concerns when taking the decisions on where to install wind power generation capacity. In principle, these concerns may be related to an equitable spatial allocation of both local burdens (e.g., opportunity costs and negative externalities arising from land-use changes at the local scale) as well as local benefits (e.g., economic development due to local investments) of wind power deployment. For the purpose of our game, we focused primarily on local burdens and suggested that there is a positive causal relationship between equity in the allocation of capacity and equity in the resulting local burdens.[2] However, we deliberately left it to the groups to specify how exactly they understood equity. The group discussions revealed that equity considerations seemed to matter for the spatial allocation of wind power generation capacity in all groups. Yet, groups used different concepts of equity, sometimes even simultaneously.
A first equity-driven approach pursued by some groups was to allocate wind power generation capacity proportional to the States' potential of wind power generation capacity, i.e., the availability of land. This concept was most pronounced in groups 2 and 3. Similar approaches to attain an equitable allocation of generation capacity were also implemented in previous studies carried out for Germany by BMWi (2017c) (regional allocation proportional to wind energy potential) and Drechsler et al. (2017) (regional allocation proportional to renewable energy potential weighted by a State's population size).
Land availability can be interpreted as an indicator of the ability of a State to bear the local burden of wind power deployment. The more land there is, the easier it may be to identify sites where wind turbines produce relatively small impacts on humans or nature. This approach exhibits similarities to the ability-to-pay principle known from the theory of taxation (e.g., Pigou, 1932) or, more generally, the idea of distributive justice. Certainly, the group strategies observed in our game could only imperfectly mimic the concept of distributive justice, which refers to equity among individuals. Groups playing our game could only consider equity between States due to the set-up of our game. Depending on the respective allocation of wind turbines within a State, the resulting burden per individual could still deviate a lot from individuals' abilities to bear the burden, both within as well as across States. This could be due to differences in physical (e.g., landscape profiles) and socio-economic (spatial allocation and size of affected population) endowments as well as heterogeneity in preferences of individuals (e.g., when it comes to evaluating changes in landscape aesthetics).
A second equity-driven approach pursued also by groups 2 and 3 was to allocate wind power generation capacity across States proportional to States' demand for power (i.e., load proximity in our game). In States with a higher power demand, relatively more capacity was installed. A similar concept is also applied by Sasse and Trutnevyte (2019). This approach can be interpreted as an attempt to allocate wind power generation capacity and the resulting burden proportional to, and thus in return for, the benefits received from wind power generation in terms of satisfaction of electricity demand. This approach is similar to the benefit (or quid pro quo) principle known from the theory of taxation (Buchanan, 1949; Lindahl, 1919) or, more generally, the idea of commutative justice. Again, however, relating an equitable allocation of capacity to State power demand is only an imperfect proxy for commutative justice among individuals. Reasons include, for example, heterogeneous levels of electricity consumption between individuals within a State.
Finally, several groups addressed equity by equalizing the absolute amount of wind power generation capacity installed in every State. This was most pronounced in group 4, which initially allocated 5 GW to each State. Groups 1, 3, and 5 also made sure that at least some capacity was installed in each State. These strategies may be one explanation why, in our game, every State ended up with at least 3 GW installed. They can be interpreted as an implementation of either of the two equity approaches discussed above. On the one hand, these strategies may be a simple representation of the approach of distributive justice, as the mere existence of a State could be understood as a fair proxy for the availability of land. On the other hand, they may also implement the idea of commutative justice, if one assumes that each State should contribute to wind power deployment because electricity is demanded everywhere.
Overall, equity concerns thus mattered in all groups when allocation decisions were taken. However, we observed significant differences regarding the fundamental concepts of equity applied (distributive vs. commutative justice).
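The three equity rules discussed above can be illustrated with a minimal sketch. It assumes that capacity is split strictly in proportion to a single State-level indicator and ignores the capacity potential of each State; the State names, indicator values, and the 10 GW target are hypothetical and are not data from the game.

```python
def allocate_proportional(total_gw, weights):
    """Split total_gw across States in proportion to non-negative weights."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return {state: total_gw * w / weight_sum for state, w in weights.items()}


def allocate_equal(total_gw, states):
    """Give every State the same absolute capacity."""
    return allocate_proportional(total_gw, {state: 1.0 for state in states})


# Hypothetical illustrative inputs (not taken from the game board).
land_km2 = {"A": 300.0, "B": 100.0, "C": 100.0}   # land availability
demand_twh = {"A": 20.0, "B": 60.0, "C": 20.0}    # power demand
TARGET_GW = 10.0

by_land = allocate_proportional(TARGET_GW, land_km2)      # ability-to-bear rule
by_demand = allocate_proportional(TARGET_GW, demand_twh)  # benefit rule
by_equal = allocate_equal(TARGET_GW, land_km2)            # equal absolute capacity

for name, result in [("land", by_land), ("demand", by_demand), ("equal", by_equal)]:
    print(name, {state: round(gw, 2) for state, gw in result.items()})
```

In this stylized example the land-based rule assigns most capacity to State A and the demand-based rule to State B, which shows how the choice of equity concept alone can change the spatial allocation.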

Methodological reflections
The results of our game are sensitive to its design. Schaafsma et al. (2018) point out numerous methodological stumbling blocks that need to be considered when applying deliberative methods, such as:
• Sampling: We invited stakeholders from a variety of fields and backgrounds. Yet, representatives of civil society or affected citizens were not present.
• Information provision: Before playing the game, we provided participants with information, e.g., on the sustainability criteria. During the game, it turned out that stakeholders had quite different levels of knowledge, e.g., regarding the current spatial allocation of wind power generation capacity.
• Group effects: We equipped each group with a moderator to provide for a balanced discussion. Still, the eventual group decisions were most likely also influenced by individuals with strong leadership skills.
• Topics of deliberation: We strongly framed the game and limited the remaining degrees of freedom in various ways. First of all, we defined an explicit target for wind power generation to be attained in the game (so not deploying wind power was not an option). The level of ambition of the deployment target may be decisive for the course of discussion and the eventual outcome. If the level is low, it may be relatively easier to identify a spatial allocation with few trade-offs. If the level of ambition is high, the degrees of freedom for choosing a spatial allocation vanish. In either case, the actual ranking of criteria may lose importance. In addition, we pre-selected sustainability criteria to be considered in the game. Choosing other criteria may well lead to different rankings. We also asked for an allocation of capacity across States. This scale cannot fully represent the spatial heterogeneity of all sustainability criteria, e.g., for wind yield or nature and landscape conflicts. Playing the game at a smaller scale may therefore result in different outcomes.
Thus, our results, regarding both the spatial allocation of wind power generation capacity and the underlying definition and ranking of sustainability criteria, cannot claim to be representative. However, it was not our intention to derive an "optimal" spatial allocation or a "correct" ranking of sustainability criteria. Rather, we aimed to provide more general insight into whether or not there is disagreement among stakeholders regarding these aspects and to shed light on lines of argumentation.

Conclusion
The spatial allocation of generation capacity for onshore wind power brings about trade-offs between different sustainability criteria, such as energy system costs or impacts on humans and ecosystems. Consequently, the decision how to allocate wind turbines depends on the weight a decision-maker attaches to the different sustainability criteria. Results of the game we played with stakeholders suggest that there is no generally accepted ranking of sustainability criteria when it comes to the spatial allocation of wind power generation capacity. Disagreement was not only related to the weights of sustainability criteria. It was also due to the fact that stakeholders had different opinions regarding the definition and measurement of some of the sustainability criteria as well as the underlying assumptions regarding relevant future developments of the energy system.
Interestingly, concerns of spatial equity seemed to matter for all stakeholders. Yet, different concepts of spatial equity, relating to distributive or commutative justice, could be observed. Disagreement on the proper ranking, definition and/or measurement of sustainability criteria implies that there is no objectively optimal spatial allocation of wind power generation capacity. Arguably, this insight is likely to hold even if more sophisticated empirical and deliberative methods are used.
Our results point to promising avenues for future research. Scientific studies investigating the optimal spatial allocation of wind power, or renewables more generally, should not primarily rely on aggregate monetary or multi-criteria assessments. These may conceal the sensitivity of results with respect to the monetary or non-monetary weights attached to different criteria, and this sensitivity may be decisive if there is disagreement on the ranking of criteria. Instead, more spatial trade-off analyses are required which compare different spatial optimizations for individual sustainability criteria in a consistent modelling framework. While such trade-off analyses do not derive an optimization across multiple criteria, they may help to identify "no-regret" sites for future wind power development. On "no-regret" sites, wind turbines are installed under any mono-criterion optimization. Moreover, it may also be promising to carry out a comparative assessment of different concepts of spatial equity. Existing studies are scarce and apply quite different approaches to spatial equity so far (see Drechsler et al., 2017; Sasse and Trutnevyte, 2019). A prerequisite for any such analysis would be a more comprehensive conceptual discussion of possibly meaningful definitions of spatial equity with regard to deploying renewable energy sources.
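The notion of "no-regret" sites can be expressed as a set intersection: a site qualifies if it is selected under every mono-criterion optimization. The following minimal sketch assumes that each optimization has already produced a set of selected site identifiers; the criterion labels and site identifiers are hypothetical.

```python
def no_regret_sites(selections):
    """Return the sites chosen under every mono-criterion optimization.

    selections maps a criterion name to the collection of site identifiers
    selected when optimizing for that criterion alone.
    """
    site_sets = [set(sites) for sites in selections.values()]
    if not site_sets:
        return set()
    return set.intersection(*site_sets)


# Hypothetical results of three separate mono-criterion optimizations.
selections = {
    "wind_yield": {"s1", "s2", "s3", "s5"},
    "load_proximity": {"s2", "s3", "s4", "s5"},
    "nature_and_landscape": {"s2", "s5", "s6", "s7"},
}

common = no_regret_sites(selections)
print(sorted(common))
```

Sites outside this intersection are the ones for which the ranking of criteria actually matters.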
Our results also have implications for policy debates regarding the optimal spatial allocation of wind power deployment. In Germany, for example, policy-makers are discussing a stronger spatial differentiation of the tender scheme for renewable energy support. Following this proposal, capacities may be tendered separately for regions, or bids from certain regions may be awarded bonuses. Similarly, there is an ongoing public debate on how much priority area for wind power development each German State should assign. Either decision requires a ranking of sustainability criteria. If society disagrees on the ranking, as indicated by our study, several policy recommendations apply. First, as a minimum requirement, the choice and ranking of sustainability criteria underlying political decisions on the spatial allocation of wind power generation capacity should be made as explicit, transparent, and specific as possible. Second, policy-makers should facilitate public debate and, where possible, societal consensus regarding which criteria should be reviewed and how strongly they should matter for governing the spatial allocation of wind power deployment. A prerequisite for any such consensus may be a better communication and discussion of benefits and costs of wind power deployment at different spatial scales. Finally, political decisions on the spatial allocation of wind power deployment should involve all relevant stakeholders with possibly diverging views on the ranking of sustainability criteria. This strengthens the case for multi-level governance and participatory decision-making. It also calls for scrutinizing proposals aiming at more centralized decisions or shifting competencies from parliaments to executive and judiciary branches. In Germany, for example, such proposals are made frequently with the intention to speed up wind power deployment.

Appendix
The following data sources were used to generate the "game board" presented in Figure 1.

Potential for wind power generation capacity
The potential for wind power generation capacity is taken from Masurowski (2016). He carries out a two-step procedure. First, he uses a GIS-based approach to identify areas suitable for wind power deployment. For this purpose, he considers a comprehensive set of geophysical, technical and legal constraints. Subsequently, he uses an algorithm optimizing energy yield per area to identify potential sites for a reference wind turbine (Enercon E-101) with a capacity of 3 MW. In total, this approach yields 191,873 potential sites for Germany. Considering the available sites for each State as well as the assumed wind turbine capacity of 3 MW yields the corresponding aggregated potential for wind power generation capacity.
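The aggregation step described above is simple arithmetic: the number of potential sites multiplied by the 3 MW capacity of the reference turbine. The sketch below applies this to the national total reported in the text; the function name is illustrative, and State-level site counts would be plugged in the same way.

```python
TURBINE_CAPACITY_MW = 3  # capacity of the reference turbine stated in the text


def potential_capacity_gw(n_sites, turbine_mw=TURBINE_CAPACITY_MW):
    """Aggregate capacity potential in GW, assuming one turbine per site."""
    return n_sites * turbine_mw / 1000


# National total of potential sites reported in the text.
national_gw = potential_capacity_gw(191_873)
print(f"{national_gw:.1f} GW")
```

For the 191,873 sites this yields about 575.6 GW, which is an upper bound on deployable capacity under the stated constraints rather than a deployment target.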

Wind yield
Information on full-load hours is also derived from Masurowski (2016, p. 80). They are calculated considering both the average wind speed at the potential sites and the performance curve of the reference wind turbine. Full-load hours per State are calculated as the average for all respective potential sites. To also account for potential yield losses onsite (e.g., due to shadowing effects), we adjusted the values of full-load hours downwards proportionally to the German average full-load hours of 2300 h projected for the year 2030 (Nitsch et al., 2012).
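One way to read the proportional adjustment is as a uniform scaling factor applied to all States so that the national average matches 2300 h. The sketch below follows this reading and assumes, as an illustration only, that the national average is weighted by the number of potential sites per State; the State names and input values are hypothetical.

```python
TARGET_AVG_FLH = 2300.0  # projected German average for 2030 (from the text)


def rescale_full_load_hours(raw_flh, n_sites, target_avg=TARGET_AVG_FLH):
    """Scale State-level full-load hours by one common factor.

    Assumption: the national average is the site-weighted mean of the
    State values. Relative differences between States are preserved.
    """
    total_sites = sum(n_sites[state] for state in raw_flh)
    raw_avg = sum(raw_flh[state] * n_sites[state] for state in raw_flh) / total_sites
    factor = target_avg / raw_avg
    return {state: flh * factor for state, flh in raw_flh.items()}


# Hypothetical inputs for two States.
raw_flh = {"North": 3000.0, "South": 2000.0}
n_sites = {"North": 100, "South": 100}

adjusted = rescale_full_load_hours(raw_flh, n_sites)
print({state: round(flh) for state, flh in adjusted.items()})
```

Because a single factor is used, the ranking of States by wind yield is unchanged; only the absolute level is lowered.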

Proximity to load centers
Information on electricity consumption as well as power generation from other renewable energy sources is based on the assumptions made in Scenario B 2033 of the German grid development plan (50Hertz et al., 2013).

Nature and landscape conflicts
Nature and landscape conflicts are measured by an ordinal conflict risk scale. The full procedure is provided in Gauglitz et al. (2019). The conflict risk is the result of an iterative expert discourse and describes the probability of conflicts between wind turbine development and nature conservation assets on the basis of available environmental information. A low risk level indicates an area where wind power use is unlikely to produce significant conflicts, whereas a high risk level indicates an unsuitable area with multiple possible conflicts. The risk level is the result of a nationwide comparative assessment.
Mapping spatially differentiated risk levels on a 6-point scale for wind energy is achieved in a combined GIS-based and discursive process. Under consideration of the typical effects of wind turbines, potential conflicts are identified especially with avifauna, bats or recreational functions. Altogether, all objects of protection (flora and fauna, biodiversity, water, soil, air and climate, diversity, characteristic features and beauty as well as the recreational value of nature and the landscape) are operationalized. For the operationalization, nationwide available data, e.g., Natura 2000 sites, are used.
The potential conflicts represented by these datasets are rated considering impact and vulnerability of the objects of protection. Based on these ratings and additional information about their normative meaning and accuracy, the conflict risk of datasets is generated. To map a nationwide nature conservation conflict risk rating concerning wind turbines, the datasets for each object of protection are aggregated using a rule-based procedure. The result is a nationwide map rating sites according to the overall conflict risk that can be used as a criterion to allocate wind power plants. These results can be utilized in planning to avoid areas with high conflict risk. They can be used as an instrument to assess and develop wind energy scenarios. The assessment is carried out on a site-specific basis and was aggregated to the State level for the purpose of the game. The pie chart in Figure 1 illustrates the shares of the different conflict risk classes in the potentially available sites for wind power deployment.
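The aggregation from site level to State level shown in the pie charts can be sketched as a share calculation over the six risk classes. The example assumes that each potential site carries one integer risk class from 1 (lowest) to 6 (highest); the numeric coding and the sample data are hypothetical and do not reproduce the rule-based aggregation of Gauglitz et al. (2019).

```python
from collections import Counter

RISK_CLASSES = range(1, 7)  # 6-point ordinal scale; numeric coding is assumed


def risk_class_shares(site_risk_classes):
    """Share of potential sites in each conflict risk class for one State."""
    classes = list(site_risk_classes)
    if not classes:
        raise ValueError("at least one site is required")
    if any(c not in RISK_CLASSES for c in classes):
        raise ValueError("risk classes must be integers from 1 to 6")
    counts = Counter(classes)
    return {c: counts.get(c, 0) / len(classes) for c in RISK_CLASSES}


# Hypothetical risk classes of the potential sites in one State.
shares = risk_class_shares([1, 1, 2, 6])
print(shares)
```

The resulting shares sum to one per State and correspond to the slices of a State's pie chart.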