The Role of Market Structure, Technology, and Policy



Introduction
Installations of solar photovoltaic (PV) systems have expanded rapidly over the past decade, with continued growth anticipated over the near and longer term (Baker et al. 2013, IPCC 2014). In 2013 alone, 38 GW of solar PV were installed globally, marking a roughly 25% increase from the prior year (EPIA 2014). The United States has also witnessed this dramatic growth, with residential and commercial solar PV installations growing by more than 40 percent per year on average over the last decade (SEIA/GTM Research 2014). This growth has been accompanied by a substantial decline in PV system prices, which fell by roughly 50 percent from 2009 to 2013 (Bazilian et al. 2013, Candelise et al. 2013). Amid this decline, however, there remains considerable heterogeneity in PV system pricing. For example, among residential and small commercial systems installed in 2013, roughly 20 percent were sold for less than $3.90/Watt (W), while a similar percentage was priced above $5.60/W (Barbose et al. 2014).
This paper empirically examines the observed heterogeneity in equilibrium PV system prices in the United States. We explore different sources of variation to provide evidence on the determinants of such price dispersion, including classic hedonic variables such as the characteristics of the systems and demographics, as well as other plausible equilibrium price shifters, including market structure and policy variables. We use a rich dataset of nearly 100,000 individual solar PV systems installed across the United States over the 2010-2012 period, focusing our analysis on residential and commercial PV systems under 10 kW. We focus on consumer-owned PV systems, but also include a subset of third-party owned (TPO) systems in our analysis. As expected, we find that PV system prices differ based on characteristics of the systems. However, our results also point to the important roles of market structure, firm-specific characteristics, and government policy.
Understanding the determinants of the equilibrium price of solar PV systems is useful from both an academic and policy perspective. Economists have developed a deep theoretical and empirical literature that points to a variety of explanations for price variation (e.g., see Baye et al. (2006) for a comprehensive review). The simplest explanation is that products are not homogeneous, so we observe a differentiated product equilibrium, such as in the Bertrand oligopoly model. This hypothesis would lead to price variation based on system characteristics as well as market structure. But even when we have a homogeneous product, there are theoretical explanations supporting equilibrium price variation due to information or search costs by consumers or firms (Stigler 1961). Over the past several decades, a variety of information acquisition and transmission models have been posited in the theoretical literature (Burdett and Judd 1983, Carlson and McAfee 1983, Salop and Stiglitz 1977, Varian 1980). With frictions from information acquisition and transmission, consumers trade off the costs (e.g., opportunity or other costs) of obtaining a quote from another firm against the expected benefit from this quote. Price variation then follows from heterogeneous consumer or firm costs, or from market power.
An extensive empirical literature examines the extent to which these factors influence equilibrium prices in a variety of settings, with much of the recent work in online markets (Baye et al. 2004, Brynjolfsson and Smith 2000, Ellison and Ellison 2009) and gasoline markets (Barron et al. 2004, Chouinard and Perloff 2007, Shepard 1991). Evidence of significant price variation has also been demonstrated in air travel (Borenstein and Rose 1994), pharmaceuticals (Sorensen 2000), books (Clay et al. 2001), and for many other goods and services in the economy (Crucini and Yilmazkuday 2014). A common theme in many of these papers is that variables capturing market structure, firm characteristics, and policy interventions are important in many settings. This relates closely to a body of work in electricity markets, such as Andersson and Bergman (1995), which establishes the importance of market structure and regulation for equilibrium electricity prices.
By examining market structure and policy variables, this paper also provides policy-relevant insights. Seel et al. (2014) find the striking result that average residential PV system prices in Germany are roughly half of those in the United States. Since solar PV modules are a globally traded commodity, the most likely explanations for this difference in price are differences in policy, characteristics of the local installer base, and market structure. By exploring how these factors influence equilibrium prices in the United States, we shed light on sources of price variability that may be amenable to policy interventions aimed at facilitating cost reductions. With the U.S. Department of Energy's SunShot Initiative dedicated to reducing the cost of installing solar PV systems, and with a wide variety of local, state and national incentive programs for solar, these results can provide relevant policy insights. Our findings, for example, suggest not only that the level of installer competition may affect solar PV prices, but also that some combination of installer experience and scale is bringing down costs, consistent with findings in Bollinger and Gillingham (2014). Moreover, we find evidence that policy measures are associated with differences in PV system prices, much as Dong and Wiser (2013) and Burkhardt et al. (2014) find that local permitting processes can influence PV system prices.
Our approach follows a vein of the rich price dispersion literature by estimating the reduced-form relationship between equilibrium prices and likely supply and demand shifters of these prices (Barron et al. 2004, Chouinard and Perloff 2007, Clay et al. 2001, Haynes and Thompson 2008, Shepard 1991). This approach draws from the classic hedonic pricing literature, widely used in economics (Rosen 1974). Wiser et al. (2007) explore a similar approach using data from the early years of California's solar PV market, finding evidence suggestive of higher solar PV prices with policy incentives, economies of scale at the system level, and lower prices for systems on new homes. Davidson and Steinberg (2013) use more recent data on solar PV prices in California and focus on differences in prices between third-party-owned systems that are priced based on the appraised value and all other systems, finding quite different results for each. Their findings also suggest the importance of heterogeneity in installers for the equilibrium price of PV systems. The present paper uses a much more comprehensive dataset to more deeply explore the primary factors that influence equilibrium prices in non-appraised value PV systems across the United States.
The remainder of the paper is organized as follows. In Section 2, we describe our dataset. Section 3 descriptively demonstrates the extensive heterogeneity in PV system prices and posits several hypotheses for this heterogeneity. Section 4 describes our empirical methodology, while Section 5 presents our primary results and robustness checks. Section 6 concludes and discusses policy implications.

Data
This paper leverages an extensive dataset of PV system installations, compiled for Lawrence Berkeley National Laboratory's annual Tracking the Sun (TTS) report series. The dataset used in this study, from Barbose et al. (2013), includes reported PV system prices for approximately 231,000 PV installations, representing 68% of cumulative grid-connected residential and commercial PV capacity in the United States as of year-end 2012. The data were collected from 47 PV incentive programs in 29 states. All installations that receive a government incentive from these programs are included in our raw data.
The data contain the total transaction price for each PV system installation, whether that transaction is between installer and site host (in the case of customer-owned systems) or between installer and third-party financier (in the case of third-party owned systems). The transaction price is the pre-incentive price the system owner pays prior to any subsidy payments, such as upfront rebates, tax credits, renewable energy certificates, or performance-based incentives based on the production of the system. The data also contain information on direct incentive payments provided by programs in the sample, which are incorporated into the consumer value of PV variable described below. Additionally, the data contain detailed information about each PV installation: the date of installation; system size; zip code of the installation; whether the system is residential, commercial, or other; whether the system is customer-owned or owned by a third party (where the host consumer leases the PV system or purchases power from the system); whether the system is installed in new construction; installer of the system; presence of a battery; module and inverter manufacturer; and module and inverter model. Based on the module and inverter model, we can infer further characteristics, including whether the module is building integrated PV (vs. rack-mounted), thin-film PV (vs. crystalline), Chinese made (vs. non-Chinese made), and whether the PV system uses micro-inverters (vs. central or string inverters).
We also bring in data on module and inverter price indices from SEIA/GTM Research to capture U.S. module and inverter price trends (SEIA/GTM Research 2014). Since both modules and inverters are globally traded commodities, the trends in these costs will generally be similar across systems installed in the U.S., and thus these indices are useful cost shifters. 1 Furthermore, we use data on the 2011 combined average state and local sales tax at the state level (Tax Foundation 2014), accounting for the existence and timing of sales tax exemptions for solar PV (DSIRE 2014), and a state-level interconnection score measuring the ease of interconnection of a solar PV system to the grid (IREC/Vote Solar 2013). We include a detailed set of household socioeconomic and demographic data at the zip code level from the 2010 U.S. Census (U.S. Census Bureau 2014). Additionally, from this, we calculate the average household density in each county. We also derive a county-level composite labor cost index by averaging contractor, electrician, and roofing wage data from the Bureau of Labor Statistics (BLS) (U.S. Bureau of Labor Statistics 2014).
We clean and standardize the above data, and in the process create several additional variables relevant for our analysis. We adjust all dollars to real 2012 dollars. We divide the PV system price variable by the system's nameplate capacity (in DC units under standard test conditions) in order to calculate the price per watt.
We create a "consumer value of solar" variable to broadly capture how the financial attractiveness of PV systems might influence pricing. This variable is calculated as the sum of the discounted value of all incentives (including direct rebates, performance-based incentives, solar renewable energy credit [SREC] payments, and state and federal tax credits, as applicable) and electricity bill savings over the expected life of the PV system (using a 7% discount rate, a 20-year system lifespan, and electricity rates increasing with inflation). The variable accounts for solar insolation levels, which are used to estimate electricity bill savings. However, it does not account for all system-level details (e.g., precise rate tier of customer or orientation of PV system), so it is not a perfect measure. Using this variable, we also calculate the percentage of total consumer value of solar that would be expected to come from SREC payments; those incentives are relatively uncertain and therefore are of particular interest. Appendix 1 provides further details on the construction of these variables.
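The present-value logic behind the consumer value of solar can be illustrated with a minimal sketch. It assumes annual discounting, cash flows received at the end of each year, and constant real annual bill savings (consistent with electricity rates rising with inflation). The function and argument names are hypothetical, and the actual construction in Appendix 1 may differ in its details.

```python
def present_value(annual_amount, rate=0.07, years=20):
    """Present value of a constant real annual cash flow.

    Assumption: payments arrive at the end of years 1..years.
    """
    return sum(annual_amount / (1.0 + rate) ** t for t in range(1, years + 1))


def consumer_value_per_watt(upfront_incentives, annual_bill_savings,
                            annual_performance_payments, system_watts,
                            rate=0.07, years=20):
    """Illustrative consumer value of solar in $/W (hypothetical helper).

    upfront_incentives: rebates and tax credits received at installation ($).
    annual_bill_savings: real electricity bill savings per year ($).
    annual_performance_payments: performance-based or SREC payments per year ($).
    """
    total = (upfront_incentives
             + present_value(annual_bill_savings, rate, years)
             + present_value(annual_performance_payments, rate, years))
    return total / system_watts
```

Under these assumptions, each $1 of real annual savings over 20 years at a 7% discount rate is worth about $10.59 in present value.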
We create variables to capture aspects of market structure, including the county-level installer density (i.e., number of installers with at least two installations in the previous six months divided by the number of households) and the county-level Herfindahl-Hirschman Index (HHI). 2 We define the market share used to calculate the HHI as the share of cumulative installations in a county by the installer in the previous year.
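The two market structure measures can be sketched as follows. This is an illustration only: it assumes the caller has already selected the installations in one county over the relevant look-back window, and it expresses market shares as fractions so that the HHI lies between 0 and 1 (the scale used in the paper is not stated here). All names are hypothetical.

```python
from collections import Counter


def hhi(installer_ids):
    """Herfindahl-Hirschman Index from one installer ID per installation.

    Shares are fractions of installations, so the result is in (0, 1].
    """
    counts = Counter(installer_ids)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())


def installer_density(installer_ids, households, min_installs=2):
    """Active installers (at least min_installs installations) per household."""
    counts = Counter(installer_ids)
    active = sum(1 for n in counts.values() if n >= min_installs)
    return active / households
```

For example, a county whose installations are split evenly between two installers has an HHI of 0.5, while a county served by a single installer has an HHI of 1.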
Further, we create variables for the installer's depreciated experience (at county and state levels) and aggregate depreciated experience (at county level), accounting for mergers and acquisitions. The depreciated experience variable is motivated by the possibility that installation experience (which possibly includes some of the effect of scale, which we do not include separately) reduces PV system installation costs, as is suggested by van Benthem et al. (2008) and Bollinger and Gillingham (2014). It is defined as the number of installations by the installer in the geographic region depreciated at 20% per quarter. 3 In constructing these variables, we use two sets of dates in order to capture the process by which an installation contract is signed and then completed. We note that in states that report both the incentive application date (usually within a week of when the contract was signed) and the system installation date, there is on average a roughly 120-day difference between the two dates. We assume that all pricing decisions are made at the time the price is quoted, which is approximated by the incentive application date (assumed to be 120 days before system installation). In contrast, installer experience is based on installations completed at the time the contract was signed. Accordingly, we use the number of installations completed by this date in constructing installer experience.
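A depreciated experience stock of this kind can be sketched as a simple recursion. The timing convention below is an assumption made for illustration: the stock available at the start of a quarter is the previous stock depreciated by 20%, plus the installations completed in the previous quarter. The function name is hypothetical, and the sketch ignores the adjustments for mergers and acquisitions.

```python
def depreciated_experience(installs_per_quarter, depreciation=0.20):
    """Experience stock available at the start of each quarter.

    installs_per_quarter: completed installations in each consecutive quarter.
    Assumed recursion: stock[q] = (1 - depreciation) * stock[q-1] + installs[q-1].
    """
    stock = 0.0
    stocks = []
    for n in installs_per_quarter:
        stocks.append(stock)  # experience the installer brings into this quarter
        stock = (1.0 - depreciation) * stock + n
    return stocks
```

With ten installations in the first quarter and none afterwards, the stock entering the second quarter is 10 and the stock entering the third quarter is 8.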
From the raw data, we take a subsample that is more relevant for understanding price variation in recent years. In particular, we restrict the dataset to PV systems that were installed between 2010 and 2012. Additionally, in order to focus on comparable small-scale systems, we restrict the data to PV systems between 1 and 10 kW and remove systems with installed prices outside the range of $1.50/W to $20/W (in 2012$).
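These restrictions amount to a simple row filter. A minimal sketch, assuming inclusive bounds (the text does not say whether the endpoints are included); the function and argument names are hypothetical:

```python
def in_analysis_sample(install_year, size_kw, price_per_watt_2012usd):
    """True if an installation passes the year, size, and price screens.

    Assumption: all bounds are inclusive.
    """
    return (2010 <= install_year <= 2012
            and 1.0 <= size_kw <= 10.0
            and 1.5 <= price_per_watt_2012usd <= 20.0)
```

The exclusion of appraised-value systems and of records with missing installer or location data would be applied as additional screens.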
Further, we exclude appraised-value TPO systems. Appraised-value systems are installed by integrated firms that both perform the installation and provide consumer financing. In these cases, installation prices within our dataset are based on an appraised value stipulated by the firm, rather than on an actual transaction price. Podolefsky (2013) and Davidson and Steinberg (2013) show that appraised values are a poor measure of transaction prices, supporting our decision to exclude these systems from the analysis. We do retain TPO systems that do not report appraised value prices. These are PV systems financed by non-integrated TPO providers (i.e., companies that provide consumer financing but purchase the system from an engineering, procurement, and construction [EPC] contractor). The reported price data for these installations generally represent the actual price paid to the EPC contractor by the consumer finance provider. Our results include a control for these systems and we also perform a robustness check excluding them. It is worth noting that for those TPO systems retained in our core regressions, our results only pertain to the transaction price between the PV EPC installation contractor and the financier rather than to the pricing of the financing arrangement between the TPO provider and the end-use customer: our data do not contain information on the TPO-customer pricing arrangements. Given the growing size of the TPO PV market in recent years, future work is recommended that specifically evaluates TPO-customer pricing arrangements.
Appendix 1 contains further details on the construction of the variables in the final dataset. Our final sample contains 99,029 observations, where an observation is an installation. Figure 1 shows the geographic distribution of these installations in our final dataset across the country, illustrating the concentration of systems in California, comprising 60% of the final sample, as well as elsewhere in the Southwest and Northeast. There are other states with a significant number of installations that are not included, such as Hawaii, Colorado, and Oregon, but these states either are not included within the initial raw dataset or do not provide data on either the installer or the geographic location of the installation. Of the states included in our raw dataset, we retain nearly 80% of the small scale systems from 2010 to 2012. Most of the excluded systems are dropped because they are appraised value or are missing the geographic location. Table 1 provides summary statistics for the full dataset. The average pre-incentive price of an installation is $32,605, but with a very large standard deviation of $14,345. In terms of price per watt, the mean is $6.43/W, with a considerable one standard deviation range of $4.53/W to $8.33/W. This evidence already underscores the great variability in prices. The next section provides further evidence and will begin to disentangle why installation prices vary so considerably.

Descriptive Evidence of Variation in Solar PV System Prices
A kernel density plot of the pre-incentive price per watt in our dataset (Figure 2) clearly illustrates the considerable variation in installed prices.

Figure 2. A kernel density plot shows dramatic variation in the price per watt in our dataset.
Why might the per-watt price of solar PV installations vary so significantly? Several hypotheses were discussed in the introduction. The variation may simply reflect differences in technical characteristics of the PV systems in the sample, as well as differences in costs across locations and installers, perhaps due to differences in wages and installer experience. The market may be imperfectly competitive, so that demand factors shift the markup installers charge. Information and search costs, a common explanation for price dispersion, could influence the degree of market competition. In addition, policy actions may influence the price, such as efforts to reduce interconnection costs. The effect of incentive policies on the equilibrium price depends in part on pass-through, a topic examined in several recent papers (Henwood 2014, Hughes and Podolefsky 2014).
To begin to understand these factors, we examine the variation in prices both across and within county-level markets. Figure 3 shows the heterogeneity in prices across markets, by plotting the county average price per watt (dropping counties with fewer than five installations). Counties in California, the Northeast, Texas, and Arizona/New Mexico are color-coded, with darker colors indicating higher prices. In Figure 3, we can see that California, the most mature market in the United States, has relatively homogeneous prices across geography, at least in terms of cross-county pricing, with average county-level prices mostly in the $5/W to $7/W range. Other states exhibit more significant price variation across counties; in some states, such as Texas and Arizona, greater variability in county-average prices may be tied partly to the absence of a uniform statewide PV incentive program, and therefore greater variation in both the consumer value of solar and demand for solar.
One hypothesis for the variability in the above figure is that the maturity or size of the market explains the differences in prices. However, in comparing Figure 4 to Figure 3, we can see that there is much more to the story. Counties with high average prices are sometimes relatively large markets and in other instances relatively small markets. Installation hot spots, such as the San Francisco Bay Area, Los Angeles, and San Diego, coastal areas in the Northeast, and Maricopa and Pima counties (around Phoenix and Tucson) in Arizona, do not necessarily have the lowest installation price in their states. Indeed, the Pearson correlation coefficient between the county average price per watt and the cumulative number of PV installations from 2010 to 2012 is only -0.04. 4 Moreover, this finding is not simply a result of differences in population density; Figure A1 in the appendix shows that the map plotting the density of installations per household differs similarly from Figure 3.
A second plausible explanation for the differences in prices is heterogeneity in labor wages across the United States. Indeed, there is substantial variation in wages in our dataset, as is shown in Panel (a) of Figure 5, and one might expect higher wages to lead to higher costs and thus higher prices. The Pearson correlation coefficient between wages and average prices in our dataset is 0.06, indicating a weak positive correlation but also suggesting that other factors are needed to explain the observed heterogeneity in prices.
A third hypothesis is that costs may vary at a relatively localized level due to some combination of firm experience and scale (Bollinger and Gillingham 2014). If firms in a particular county have more experience in installing solar PV, the equilibrium price might be lower. Panel (b) of Figure 5 shows that there is indeed considerable variation in the aggregate installer experience in a county, although most installations are performed by an installer with fewer than 500 previous installations in that county. The Pearson correlation coefficient between installer experience in a county and the average price per watt is -0.108, consistent with the hypothesis.
A fourth hypothesis for the differences in price is that there is imperfect competition in the solar PV market and consumers face search and information costs. In this case, one would expect that as the number of active installers (i.e., installers with at least two installations in the county in the previous six months) relevant to a consumer increases, equilibrium PV prices would decline. Panel (c) of Figure 5 shows that the number of active installers varies significantly across installations. The Pearson correlation coefficient between installer density (a measure of active installers) and the average price per watt is -0.105, which is suggestive of the importance of information and search costs in the PV market.
A fifth hypothesis is that if there is imperfect competition in the solar PV market, we might expect firms to price discriminate, charging more when there is high demand for solar, perhaps due to the consumer benefits from solar or due to demographics. On the other hand, some of the incentives that provide for a high consumer value of solar may be passed through to consumers. The Pearson correlation coefficient between the consumer value of solar variable and the average price per watt is -0.109, suggesting that the latter effect may be more important. Figure A2 in the appendix presents a map of the county-level consumer value of solar to show the difference in heterogeneity across space between this variable and the price per watt in Figure 3.
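The pairwise correlations reported in this section are standard Pearson coefficients. A minimal sketch of the calculation follows, with a hypothetical function name; how the underlying series are aggregated (for example, to county averages) is an assumption that depends on the pair of variables being compared.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Coefficients near zero, such as the -0.04 and 0.06 values above, indicate that a single variable explains very little of the variation in prices, which motivates the multivariate approach that follows.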
Finally, as shown in several recent papers, such as Shrimali and Jenner (2013), there is considerable variation across states and localities in policies relating to renewable energy. We hypothesize that several policy variables may also influence equilibrium pricing.

Estimation Approach
Our estimation approach follows an extensive literature that aims to provide evidence on price dispersion or the factors influencing prices. Papers taking this approach include Shepard (1991), Goldberg and Verboven (2001), Clay et al. (2001), Barron et al. (2004), Chouinard and Perloff (2007), Haynes and Thompson (2008), Busse et al. (2013), and Crucini and Yilmazkuday (2014). Of course, firm pricing is based on supply and demand (and marginal revenue under imperfect competition); so just as in the hedonic pricing literature, our reduced-form approach captures the equilibrium pricing response to a variety of factors.
Our empirical specification regresses the PV system price per watt on a number of covariates that we hypothesize will influence the equilibrium price. For convenience, we group these covariates into the categories in Table 1. We model the price per watt (\(p_{ijst}\)) for installation i by installer j in state s at time t as follows:
\[ p_{ijst} = M_{ijst}'\beta_M + E_{ijst}'\beta_E + P_{ijst}'\beta_P + D_{ijst}'\beta_D + X_{ijst}'\beta_X + \eta_j + \mu_s + \tau_t + \varepsilon_{ijst} \qquad (1) \]
Here \(M_{ijst}\) is a vector of market structure or competition variables, \(E_{ijst}\) is a vector of experience variables, \(P_{ijst}\) is a vector of policy-related variables, \(D_{ijst}\) is a vector of zip code-level and county-level demographic variables, \(X_{ijst}\) is a vector of installation characteristics, and \(\varepsilon_{ijst}\) is the error term. Each of these vectors contains the variables in that group listed in Table 1. A few variables of potential interest are not included, such as a variable specifically to capture firm-level economies of scale. We do not attempt to separately identify this effect, since it is highly correlated with our depreciated experience variables. 5 Thus, our experience variables can reasonably be interpreted to capture the combined effect of experience and economies of scale.
We explore different sources of variation by examining specifications with a variety of fixed effects, including installer fixed effects (\(\eta_j\)), state fixed effects (\(\mu_s\)), and year-month fixed effects (\(\tau_t\)). Without any fixed effects, our coefficients are identified from variation in prices and the covariates both over time and cross-sectionally. By including year-month fixed effects, we control for any unobserved time-varying factors by using the within-year-month variation in our data. However, when we include state fixed effects to control for state-level unobserved heterogeneity, we can no longer examine the effect of variables that we observe only at the state level, including some of our policy-relevant variables. Using installer fixed effects is valuable for addressing installer-specific unobserved heterogeneity, such as better marketing strategies or negotiation for lower module prices. However, since there are 1,098 installers in the dataset, we cannot identify most of the market structure variables in the presence of installer fixed effects. Thus, by exploring specifications with and without different fixed effects, we obtain a more complete picture of the factors influencing equilibrium prices.
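To illustrate how fixed effects change the source of identifying variation, the sketch below implements a within estimator for a single regressor and a single set of fixed effects. It is a deliberately simplified stand-in for the multi-covariate specifications discussed above: the function name is hypothetical, and it handles only one covariate and one grouping variable.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """OLS slope of y on x after removing group means from both variables.

    Demeaning by group is equivalent to including one fixed effect per group.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum_x, sum_y, count]
    for g, xi, yi in zip(groups, x, y):
        s = sums[g]
        s[0] += xi
        s[1] += yi
        s[2] += 1
    num = 0.0
    den = 0.0
    for g, xi, yi in zip(groups, x, y):
        sum_x, sum_y, n = sums[g]
        dx = xi - sum_x / n
        dy = yi - sum_y / n
        num += dx * dy
        den += dx * dx
    return num / den
```

As a made-up example, take two installers "A" and "B" with x = [1, 2, 3, 4, 5, 6] and y = [6.5, 6.0, 5.5, 7.5, 7.0, 6.5], where the first three observations belong to "A". The within-installer slope is -0.5, whereas pooling all observations into one group gives a positive slope of about 0.14, because installer "B" has both higher x and a higher average price. A covariate that does not vary within a group contributes nothing to the denominator, which is why variables observed only at the state level cannot be estimated alongside state fixed effects.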

Primary Results
Our primary results, shown in Table 2, include different sets of the variables in our model specification in (1). Moving across the columns, we include various sets of fixed effects in order to explore different sources of variation and estimate coefficients on variables that vary at different levels. Moving from column 1 to column 6, the specifications alternate between including state fixed effects (ST) and including policy-relevant variables (POL) that vary principally at the state level. Columns 3 and 4 add installer fixed effects (INS) to examine the results relying on within-installer variation. Columns 5 and 6 include market structure and experience variables (COMP), with and without state fixed effects respectively. The rows in Table 2 are ordered following equation (1).
Table 2 notes: ... (3) and (4) and zip code in the remaining columns. * p < 0.05, ** p < 0.01, *** p < 0.001. The column headings are as follows: ST = "state fixed effects," POL = "policy-related variables," INS = "installer fixed effects," COMP = "market competition variables."
A first observation about Table 2 is that even with a very rich set of covariates, much of the variation in prices remains unexplained; the adjusted R² value ranges between 0.33 and 0.38 across specifications. This suggests that much of the variation may be due to measurement error in the reported prices or, more likely, to highly installation-specific unobservables, such as the suitability of the roof or the willingness of the consumer to search for a lower price.
We are next interested in both the statistical significance and economic significance of the results. Broadly, the coefficient estimates are highly statistically significant for most core variables. Notable exceptions are some of the variables in columns 3 and 4, which become statistically insignificant when installer fixed effects are added. The decreased significance is likely due to limited within-installer variation in these variables. Our preferred results for examining the market structure, experience, and policy variables are in column 6. However, we are interested in the coefficient estimates in other columns as well, in part because several of those provide controls for unobserved heterogeneity across both installers and states and in part because the full set of specifications provides a check on the robustness of our key findings.
The signs and magnitudes of the coefficients are generally sensible. The system characteristic variables act as cost shifters and are particularly easy to explain. The coefficients on the system size variables suggest economies of scale with diminishing marginal returns. The coefficients in column 6, for example, indicate that increasing the system size by 1 kW at the mean system size of 5.27 kW would decrease the price by $0.19/W. 6 The prices of small commercial systems are not statistically different from those of similar-sized residential systems, while other systems (hosted by government, school, and non-profit customers) are associated with higher prices than residential systems, perhaps due to more complicated and less standardized installations, such as carport structures. The coefficients on third party-owned, non-appraised value systems vary in sign across specifications, but within a relatively narrow range, suggesting similar pricing between systems sold directly to host customers and those sold to third party financiers.
System designs with tracking (most of which are ground mounted), thin-film panels, building-integrated panels, or batteries all serve to considerably increase the price. For example, including a battery system increases the price by around $2.50/W. Including tracking increases the price by almost $2/W. In contrast, systems installed with new construction and self-installed systems have lower prices. For self-installed systems, which exclude labor costs, there is, not surprisingly, a strong effect: a self-installed system is associated with about a $2/W decrease in price. New construction systems in most of our specifications are associated with a roughly $0.75/W decrease in price, suggestive of economies of scope and scale that may arise in new housing developments with systems installed on multiple homes, where labor or materials costs are shared between PV installations and other elements of home construction.
Several key insights arise from the estimates of the remaining variables in Table 2 and Table 3. First, installer density has a strong effect on prices, such that a move from the 5th to the 95th percentile of this variable decreases prices by $0.49/W in column 6. Greater installer density is consistent with greater competition in the market and lower information search costs, which may bring down prices. In contrast, the HHI has a smaller impact on prices and in fact has a negative coefficient, suggesting that greater market concentration is associated with slightly lower prices. This seemingly counter-intuitive result may arise because high market share firms in concentrated markets can achieve lower costs through economies of scale, which may outweigh the higher margins they can charge due to lower competition. Another explanation is that high-HHI markets may be more concentrated because of unobserved factors that reduce demand for PV (e.g., a high proportion of homes with unsuitable roofs or low levels of electricity usage) and thus also attract fewer installers who must deliver low-priced systems in order to stimulate demand. This latter explanation is suggestive of imperfect competition and price discrimination (our fourth hypothesis above) over geography if those areas with lower (greater) demand for PV are associated with lower (higher) prices.
We find that increasing installer experience at the state or county level reduces the equilibrium price, consistent with installer-level learning or scale lowering costs (our third hypothesis). Furthermore, county-level experience ($0.23/W shown in column 6 of Table 3) appears to have a much stronger effect on the variability of prices than experience at the state level ($0.07/W shown in column 6 of Table 3). This suggests that much of firm learning is localized, perhaps embodied in the knowledge of individual offices or teams, or associated with knowledge of local conditions. Given that we do not separately control for economies of scale, our experience variable may also capture certain cost advantages of larger firms, such as greater efficiencies in transportation, less down-time for installation crews, and lower materials costs associated with bulk purchasing. Interestingly, the aggregate number of installations in a county (recall our first hypothesis) has an increasing effect on installed prices. This variable may be capturing an important demand effect: the total number of installations is a measure of market size, which may be associated with stronger unobserved preferences for solar PV (e.g., a high proportion of homes with relatively high electricity usage levels) and therefore higher prices.
Among the policy variables (our last hypothesis), perhaps the most revealing are the results for the consumer value of solar, which represents an estimate of the combined value of all incentives and utility bill savings, where the latter is a function of both solar insolation and retail electricity rate levels. In Table 2, we find a positive relationship between PV prices and the consumer value of PV. For example, under model (6), we estimate a coefficient of 0.095 for this variable, suggesting that for each $1/W increase in the combined present value of incentives and bill savings, installed prices increase by roughly $0.095/W. Across systems in the data sample, the consumer value of PV varies widely, with a difference of $4.97/W between the 5th and 95th percentiles. Given the model (6) results, this corresponds to pricing variation of $0.47/W, making consumer value of PV among the more economically significant of the variables considered. These results may stem from a demand shift due to higher incentives, or alternatively, they may be a symptom of imperfect competition, whereby installers are able to "value-price" systems based on consumer willingness to pay. Our results further suggest that these effects are slightly diminished when a larger fraction of incentives are provided through SREC payments, perhaps because these incentives are inherently uncertain and are thus likely discounted by the consumer, or perhaps because in some instances installers retain ownership of SRECs.
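The $0.47/W figure quoted above follows directly from multiplying the model (6) coefficient by the 5th-to-95th percentile spread. A minimal sketch of that arithmetic is shown here; the variable names are illustrative and are not taken from the analysis code.

```python
# Coefficient on consumer value of PV under model (6), as reported in the text
# ($/W change in installed price per $1/W of consumer value).
coef_value_of_solar = 0.095

# Difference in consumer value of PV between the 5th and 95th percentiles ($/W),
# as reported in the text.
percentile_spread = 4.97

# Implied variation in installed price across that range ($/W).
implied_price_variation = coef_value_of_solar * percentile_spread
print(f"Implied price variation: ${implied_price_variation:.2f}/W")
```

The same calculation (coefficient times percentile spread) underlies the other percentile-based effect sizes reported in this section.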
The two remaining policy-related variables, the sales tax per watt and the interconnection score, both have strong positive effects on price, and contribute substantially to overall pricing variability. The result for sales tax has important implications for incentives: under the assumption that consumers and firms react symmetrically, incentives will have the same effect as the sales tax only with the opposite sign. The results for the interconnection score, however, are contrary to the expected relationship; that is, one would anticipate higher interconnection scores, which signify more streamlined processes, to imply lower prices. This counter-intuitive result may be associated with a demand-side effect: Higher interconnection scores are in places where building permitting offices have considerable experience with solar PV, due to high demand. Indeed, there is a positive Pearson correlation coefficient between aggregate installations in a county and the interconnection score of 0.10.
Among the demographic variables, household density is positively correlated with prices, such that going from the 5th to 95th percentile of this variable increases prices by $0.32/W in column (6). Several possible explanations could be posited for this relationship. It may be partly a demand effect: more densely populated regions may have higher demand, which increases prices. In addition, dense areas may tend to contain smaller houses with more-complicated roof profiles, which may require more complex and costly PV installations. More densely populated regions are also often associated with a higher cost of living, which itself may tend to drive up PV installation costs if it raises the overall cost of doing business (apart from wages). That being said, our results show that higher local labor costs are, in fact, associated with lower, rather than higher, PV system prices (our second hypothesis). One possible explanation is that higher labor prices, after conditioning on income and education, are in areas with lower PV system demand, and thus lower PV prices.
Interestingly, we find that higher education levels in a zip code are associated with lower prices, perhaps suggesting a negotiation process whereby education is related to more effective search and a stronger bargaining position for consumers. At the same time, we find that higher income levels in a zip code are associated with higher prices (consistent with our fifth hypothesis). For example, going from the 5th to 95th percentile for the highest income bracket is associated with a price increase of $0.40/W in column (6). This can be interpreted as a demand effect, whereby wealthier households have higher demand for solar PV, perhaps because they have higher electricity bills or can afford the systems (though this result is not robust to inclusion of installer fixed effects). It may also be interpreted as a negotiation effect, as wealthier households may have a less-binding budget constraint or higher information search costs (i.e., their time is more valuable), so they may not negotiate as hard on price or seek out as many competing bids. Furthermore, wealthier households may request additional add-ons to improve the look or performance of the system or have more complicated roofs, also raising the price.

Robustness Checks
We conduct several robustness checks to confirm that our results are not unduly impacted by model specification and data selection. Appendix 3 presents the results of two of these estimations.
In the first, and perhaps most important, of these additional robustness checks, we exclude all third party-owned (TPO) systems. While we control for this characteristic in our primary specifications above, one may be concerned that a simple dummy is not adequate as a control. We find that our results are quite robust to excluding TPO systems. One of the few variables that changes sign is the installer experience at the state level. We interpret this sign change as due to the removal of most of the installations by some of the largest installers.
A second robustness check uses absolute price rather than price per watt as the dependent variable. Naturally, the R-squared in the alternative specification is significantly higher due to the strong relationship between price and system size. The results are otherwise very similar between model specifications in terms of both signs and relative magnitudes. However, some statistical significance is lost for a number of key explanatory variables. For example, the coefficient on installer experience at the state level becomes small and statistically insignificant. We also perform several other robustness checks, which are discussed in Appendix 3.

Conclusions
This paper examines the variability in equilibrium prices of solar PV systems in the United States using a large dataset of installations in fourteen states between 2010 and 2012. We first demonstrate considerable heterogeneity in prices, both across and within states. We then seek to explain this heterogeneity in terms of system characteristics, market structure, installer characteristics, demographics, and policy conditions, all of which influence pricing through their impacts on underlying costs and demand. Even after controlling for many plausible observable factors, a surprising amount of unexplained variation in PV system prices remains. The size of this residual suggests that unobserved conditions, including installation-specific factors such as the characteristics of the roof or the negotiating power of the household, may play an important role in the heterogeneity in pricing.
Our estimates provide great insight into installer pricing behavior. Not surprisingly, we find that system characteristics act as important cost shifters that influence price. For example, batteries and tracking equipment are associated with higher prices, while self-installed systems and systems installed in residential new construction are associated with lower prices. However, the vast majority of the systems in our sample do not have any of these characteristics, implying that observable system characteristics alone are unlikely to explain the dramatic variability in prices.
In examining the impact of market structure, our results are consistent with PV markets with imperfect competition and consumers who face search costs. We find that greater installer density leads to lower prices, consistent with increased competition. We also provide suggestive evidence of installer experience leading to lower costs, consistent with a large literature on learning-by-doing in new technologies, and we find that these effects manifest most prominently at the local (county) level for PV installers. Most of our other results can be interpreted as capturing factors correlated with higher demand for solar PV systems. With greater demand, we would expect a higher equilibrium price, an effect that would be exacerbated under imperfect competition.
One notable demand-side effect is the consumer value of solar. The results show that regions with a higher consumer value of solar, considering retail electricity prices, solar insolation levels, and incentives, tend to face higher prices. This phenomenon may be the result of a shift in consumer demand caused by the presence of rich incentives, enabling entry by higher-cost installers and allowing for higher-cost systems. Alternatively, the results may be a symptom of high information search costs or otherwise imperfect competition, whereby installers in these markets are able to "value price" their systems, effectively retaining some portion of the incentive offered. In this sense, the finding may be related to the burgeoning literature on PV system incentive pass-through , Henwood 2014, Hughes and Podolefsky 2014.
These results have several implications for policy. First, they provide a broad view of the factors influencing equilibrium PV system pricing. This overview is important given that price reduction itself is a stated federal policy objective. Second, several of the results are directly relevant for policymakers since they may involve market failures or other justifications for government intervention. Our results suggest that market structure varies considerably across markets and plays an important role in determining equilibrium prices. Government efforts to foster a competitive market in solar PV, e.g., by encouraging entrants and reducing information search costs, have strong potential to bring down prices. We also find suggestive evidence that experience reduces prices. This result is important for forecasting future prices for PV systems, and indicates that efforts to increase deployment, whether publicly or privately funded, are likely to reduce costs. Finally, we find evidence of how specific policy actions, for example sales tax exemptions for PV and changes to the magnitude of financial incentives offered for PV, may directly influence prices. This has potential implications for deployment efforts aimed at reducing costs. In the short-run at least, policies that stimulate demand for PV may have the exact opposite of their intended effect, by causing prices to go up rather than down. Attention may therefore be required when designing deployment policies, to minimize any perverse effects, and when evaluating such policies, to distinguish between short-term price increases caused by demand shifts and longer term reductions caused by learning-by-doing.
This analysis also points to several directions for future research. While this paper sheds light on some of the factors underlying price differences in PV systems, a deeper analysis into price dispersion may provide further guidance. Similarly, by looking at our empirical results, one can get a sense for the characteristics of the lowest priced systems, but a targeted analysis may provide further policy guidance and elucidate important factors unobserved within the present research. Finally, given the growth in third-party PV ownership, and claims that "value-based" lease and power-purchase agreement pricing is common within that segment, targeted analysis of the drivers of TPO-customer pricing would be valuable. Additional work along the lines described here could lay the groundwork for more carefully designed policies for solar PV systems, especially where the policy objectives are not only to increase deployment but also to reduce the social costs of that deployment.

Appendix 1: Further Details on Data Construction
This appendix outlines some of the details on the construction of the variables in the final dataset used in the analysis.

Data Sample Selection:
The raw data received from all PV incentive program administrators was initially cleaned to remove systems with missing data for installation date, system size, or installed price, as well as duplicate systems participating in multiple programs. These initial cleaning steps yielded a raw data sample consisting of 240,633 residential and commercial PV systems installed from 2010 through 2012. This time frame provides insight into recent installer pricing in the United States, but includes enough years for a sufficiently large dataset. The raw sample was then further cleaned by removing projects under any of the following conditions (these numbers are not based on sequential dropping of systems; they are just the total number of systems in the raw dataset in each category):
• systems installed before January 1, 2010 or after December 31, 2012 (82,566 systems)
• systems with installed price less than $1.5/W (147 systems) or greater than $20/W (618 systems)
• systems less than 1 kW (1,287 systems) or greater than 10 kW (31,494 systems)
• systems for which the reported installed price was deemed likely to be an appraised value (19,839 systems)
• systems with missing or unknown values for any of the variables included in the regression specifications (43,292 systems)
In total, the final data sample contains 99,029 installations, though we use the entire raw data sample to construct experience and market structure variables. For a subset of the system characteristic variables, when the value was unknown or not reported, we set it to the most probable value based on the set of known variables. Specifically, unknown values for the following variables were set to 0: tracking, thin film module, BIPV, new construction, battery backup, self-install.
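The sample-selection rules above can be expressed as a simple record filter. The following is a minimal sketch only: the field names (`install_date`, `price_per_watt`, `size_kw`, `likely_appraised`, and the flag names) are hypothetical and do not correspond to the actual dataset schema, and the appraised-value indicator is assumed to have been computed elsewhere.

```python
from datetime import date

# Hypothetical names for the system-characteristic flags listed in the text.
FLAG_FIELDS = ("tracking", "thin_film", "bipv", "new_construction",
               "battery_backup", "self_install")


def fill_unknown_flags(record):
    """Return a copy of the record with unknown (None) flags set to 0."""
    filled = dict(record)
    for field in FLAG_FIELDS:
        if filled.get(field) is None:
            filled[field] = 0
    return filled


def keep_system(record):
    """Apply the exclusion criteria; True means the system stays in the sample."""
    in_window = date(2010, 1, 1) <= record["install_date"] <= date(2012, 12, 31)
    price_ok = 1.5 <= record["price_per_watt"] <= 20.0   # $/W bounds
    size_ok = 1.0 <= record["size_kw"] <= 10.0           # kW bounds
    not_appraised = not record["likely_appraised"]
    complete = all(value is not None for value in record.values())
    return in_window and price_ok and size_ok and not_appraised and complete


example = fill_unknown_flags({
    "install_date": date(2011, 6, 15), "price_per_watt": 5.2, "size_kw": 4.8,
    "likely_appraised": False, "tracking": None, "thin_film": 0, "bipv": 0,
    "new_construction": None, "battery_backup": 0, "self_install": 0,
})
print(keep_system(example))
```

Filling the unknown flags before applying the completeness check mirrors the order implied by the text, in which those six characteristics are defaulted to 0 rather than treated as missing.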
Additional details on several of the cleaning steps are provided below.

Systems in Multiple PV Incentive Programs:
In order to eliminate double-counting of individual systems, an effort was made to identify systems that received incentives from multiple PV incentive programs in the data sample. Where these systems could be identified (either using data fields that explicitly indicated participation in other programs or by matching addresses or other system characteristics across programs), duplicate entries were eliminated, and records associated with those programs were consolidated under a single program. Based on this process, records were consolidated for systems in both ETO's and OR DOE's programs, systems in both the Massachusetts DOER's and the MassCEC's programs, systems in both the Florida Energy & Climate Commission's program and either GRU's or OPUC's programs, and systems in both California's SGIP and either SMUD's or LADWP's programs.

Identification and Removal of Appraised Value Systems:
Systems were removed from the data sample if the reported installed price within the raw data was deemed likely to represent an appraised value. Appraised-value reporting occurs for a particular type of third party owned (TPO) system, namely TPO systems financed by integrated third party providers that provide both the installation service and customer financing. In order to eliminate any bias that such data could introduce into the summary statistics presented in this report, an effort was made to identify and remove appraised-value systems from the data sample. Details of the appraised value system identification process can be found in Barbose et al. (2013).

Application and Completion Date:
The data provided by several PV incentive programs did not identify installation dates. In lieu of this information, the best available proxy was used (e.g., the date of the incentive payment or the post-installation site inspection). For all systems, the application date was assumed to be 120 days prior to the installation date.

Incorporation of Data on Module and Inverter Characteristics:
A number of variables within this report distinguish between systems based on characteristics of the module, including distinctions between building-integrated PV vs. rack-mounted systems and crystalline vs. thin-film modules. The raw data provided by PV incentive program administrators generally included module and inverter manufacturer and model names, but did not include any further information about the characteristics of the components. The aforementioned information about component characteristics was therefore appended to the dataset by cross-referencing reported module manufacturer and model data against existing databases of PV component specification data, including SolarHub (www.solarhub.com) and the California Solar Initiative's List of Eligible Modules.

Conversion to 2012 Real Dollars:
Installed price and incentive data are expressed throughout this report in real 2012 dollars (2012$). Data provided by PV program administrators in nominal dollars were converted to 2012$ using the "Monthly Consumer Price Index for All Urban Consumers," published by the U.S. Bureau of Labor Statistics.
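A conversion of this kind is typically a ratio of index values. The sketch below illustrates the idea under stated assumptions: the function name is hypothetical, the choice of 2012 base index (for example, an annual average versus a specific month) is not specified in the text, and the index values used in the example are placeholders rather than actual CPI data.

```python
def to_real_2012_dollars(nominal_dollars, cpi_at_transaction, cpi_2012_base):
    """Scale a nominal amount by the ratio of the 2012 base index to the
    index in the month of the transaction."""
    if cpi_at_transaction <= 0 or cpi_2012_base <= 0:
        raise ValueError("CPI index values must be positive")
    return nominal_dollars * cpi_2012_base / cpi_at_transaction


# Placeholder index values for illustration only (not actual CPI figures).
real_price = to_real_2012_dollars(10_000.0, cpi_at_transaction=220.0,
                                  cpi_2012_base=231.0)
print(f"{real_price:.2f}")
```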

Conversion of Capacity Data to Direct Current (DC) Watts at Standard Test Conditions (DC-STC):
Throughout this report, all capacity and dollars-per-watt ($/W) data are expressed using DC-STC capacity ratings. Most PV incentive programs directly provided data in units of DC-STC; however, several programs provided capacity data only in terms of the California Energy Commission Alternating Current (CEC-AC) rating convention, which represents peak AC power output at PVUSA Test Conditions (PTC). DC-STC capacity ratings for systems funded through these programs were calculated according to the procedures described in Barbose et al. (2013).

Customer Value of Solar variable:
The value of solar variable encompasses all the elements that contribute to the economic value of the PV system to the customer. This includes the following:

1. Tax credits. The federal government and a number of states offer investment tax credits (ITC) for PV systems. Since 2009, the federal ITC has been 30% of system costs. For host-owned residential systems, the credit is based on the total system price net of any cash rebates (since the cash rebates are not taxable income). For commercial and third-party owned residential systems, the credit is based on the total system price (since the cash rebates are taxable income for commercial entities).

2. Cash incentives and rebates, from state and local governments. In most cases, the exact amounts for the cash incentives and rebates were received directly from the incentive programs. In some cases, the incentive programs did not provide incentive data for all systems. For those systems, the cash incentive was estimated by using the average known incentive amount (in $/W) from other PV systems in a similar size range that had applied for an incentive within 1 month from the same incentive program. Since cash incentives are taxable for commercial entities, we assumed that commercial and third-party owned systems were taxed at the appropriate corporate federal and state tax rate.

3. Performance based incentives (PBI) and feed-in tariffs (FiT). PBIs and FiTs are tied to actual or estimated PV generation, and in most cases disbursed annually for a fixed amount of time (5-20 years, depending on the incentive program). In order to calculate the annual PBI or FiT payment, we estimate the PV production using NREL's PVWatts model (http://pvwatts.nrel.gov/), unless an estimated lifetime PBI amount is specified by the incentive program. In the latter case, we use those data directly, subject to discounting. Inputting system location (i.e., zip code) and system size, and making a number of assumptions regarding system characteristics such as south-facing panels with a 25 degree tilt and a derate factor of 0.77, the model returns the system's estimated annual generation. We then calculate the annual PBI or FiT payment (subject to applicable state and federal income taxes), assuming a system degradation rate of 0.5% per year (Jordan and Kurtz 2013) and a discount rate of 7%. The present value of the income stream is calculated and included in the value of solar variable.

4. SREC payments. Seventeen states plus the District of Columbia have enacted renewable portfolio standards with solar or distributed generation set asides, and in many of those states, compliance with the set-aside is achieved through the purchase and retirement of tradable solar renewable energy credits (SRECs). Among the states in our sample, active SREC markets exist in the District of Columbia, Delaware, Massachusetts, Maryland, New Hampshire, New Jersey, Ohio, and Pennsylvania. Given the uncertainty in future SREC prices, we have chosen to extrapolate the 2-year rolling average price from the state's SREC market over five years, then assumed $100/MWh SREC payment for the following 10 years. As with the PBI calculations, we use estimated PV system generation to calculate total SREC payments, and sum the present value of all future SREC payments (again, with a discount rate of 7% and a system degradation rate of 0.5% per year).

5. Electricity bill savings. We estimate the present value of all electricity bill savings over the lifetime of the PV system. We use NREL's OpenEI platform to determine each system's appropriate utility (assuming the default service provider in areas with retail competition). We then use the utility's average retail electricity rates for commercial and residential customers for 2010, 2011, and 2012, as appropriate, extracted from the US Department of Energy's Energy Information Administration Form 861, and the estimated annual PV system generation to calculate annual electricity bill savings for each PV system. To account for inclining block pricing in California investor-owned utilities, we multiply the utilities' average rate by a tiering factor. The tiering factor is based on how much higher the average rate is for net-metered customers (based on their gross consumption) than for average non-solar customers, following work by the environmental consulting company E3. Utilities with inclining block pricing in other states have much less steep price tiers, and hence tiered pricing is not modeled for utilities outside California. For commercial systems and third-party owned systems, the bill savings are taxed at the applicable state and federal corporate tax rate, to reflect the fact that the utility service costs are an expense that reduces taxable income. We assume that rates rise with inflation through the lifetime of the system (20 years), and calculate the present value of each year's bill savings from PV.
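The generation-based components of the value of solar variable (PBI or FiT payments, SREC payments, and bill savings) share a common present-value structure. The sketch below illustrates that structure using the 7% discount rate and 0.5% annual degradation rate stated above. Several details are assumptions made for illustration: the function names are hypothetical, payments are treated as arriving at the end of each year, taxes are omitted, and the example generation and price inputs are placeholders rather than values from the dataset.

```python
def present_value(first_year_kwh, price_per_kwh_by_year,
                  discount_rate=0.07, degradation_rate=0.005):
    """Present value of a stream of generation-based payments.

    Assumes year-end payments: year 1 generation is undegraded and its
    payment is discounted by one full year.
    """
    total = 0.0
    for year, price in enumerate(price_per_kwh_by_year, start=1):
        generation = first_year_kwh * (1.0 - degradation_rate) ** (year - 1)
        total += generation * price / (1.0 + discount_rate) ** year
    return total


def srec_price_schedule(rolling_avg_per_mwh, later_price_per_mwh=100.0):
    """SREC prices in $/kWh: five years at the 2-year rolling average,
    then ten years at $100/MWh, as described in the text."""
    per_mwh = [rolling_avg_per_mwh] * 5 + [later_price_per_mwh] * 10
    return [price / 1000.0 for price in per_mwh]


# Placeholder inputs: 6,000 kWh in year 1 and a $250/MWh rolling average.
srec_value = present_value(6000.0, srec_price_schedule(250.0))

# Bill savings over 20 years at a constant real rate (placeholder $0.15/kWh),
# reflecting the assumption that rates rise with inflation.
bill_savings_value = present_value(6000.0, [0.15] * 20)
print(round(srec_value, 2), round(bill_savings_value, 2))
```

Dividing each present value by system size in watts would put the result on the same $/W basis as the installed price.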

Experience variables:
Three installer experience variables are included in the analysis: installer experience in county, installer experience in state, and aggregate installations in county. The first two are installer specific, and hence the installer spellings needed to be cleaned and standardized. We also use Bloomberg and a thorough internet search to find mergers between different installers. We use the combined capacity of the two firms after the date of the merger. For each PV system in the dataset, we then calculated depreciated experience, \(E_{i,t}\), for installer i at incentive application date t. This measure accumulates \(\gamma_{i,\tau}\), the number of installations by installer i within time period \(\tau\) (at the county or state level), starting with the earliest installation in the dataset, with past installations depreciated at rate \(\delta\) (20% per quarter in this analysis).
The aggregate number of installations in county variable is the depreciated number of installations in the county at time t, constructed in the same way from \(\gamma_{\tau}\), the aggregate number of installations within time period \(\tau\) in the county, starting with the earliest installation in the dataset.
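To make the notion of depreciated experience concrete, the following sketch shows one plausible recursive form, in which the stock carried into each quarter is the previous stock depreciated by 20% plus the previous quarter's installations. This is an illustrative assumption rather than the exact specification used in the analysis; in particular, the timing convention (whether current-period installations are included, and when depreciation is applied) and the function name are hypothetical.

```python
def depreciated_experience(installs_per_quarter, depreciation=0.20):
    """Experience stock at the start of each quarter.

    The stock entering a quarter is the prior stock scaled by
    (1 - depreciation) plus the installations completed in the prior quarter.
    """
    experience = 0.0
    history = []
    for installs in installs_per_quarter:
        history.append(experience)  # stock available at the start of the quarter
        experience = (1.0 - depreciation) * experience + installs
    return history


# Quarterly installation counts for a hypothetical installer in one county.
print(depreciated_experience([10, 0, 5]))
```

Passing county-wide installation counts instead of a single installer's counts gives the corresponding aggregate measure.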

Market Structure Variables:
Installer density and the Herfindahl-Hirschman Index (HHI) are meant to capture elements of market structure. The installer density is the number of installers in the county with at least two installations in the previous six months divided by the number of households. The HHI is calculated by first determining the market share, \(MS_{i,g,t}\), at time t for installer i in county g, where:
\[ MS_{i,g,t} = \frac{\gamma_{i,g,t}}{\sum_{j=1}^{N} \gamma_{j,g,t}} \]
where \(\gamma_{i,g,t}\) is the number of PV systems installed by installer i in period t in the county, and N is the number of installers in the county. \(HHI_{g,t}\) is then calculated:
\[ HHI_{g,t} = \sum_{i=1}^{N} MS_{i,g,t}^{2} \]

Using median prices for construction of price-dependent variables:
Several of the variables depend in part on price. To avoid a spurious correlation, we constructed these variables using representative, median prices rather than the actual values. These variables include the value of solar variable (which includes price-dependent federal and state investment tax credit calculations), the percentage incentive that is SREC-based (which uses the value of solar variable), and sales tax. To calculate these values, we replaced actual PV system price with the PV system size multiplied by the median US price for small solar systems (≤10 kW) for the appropriate year.

Figure A2. Consumer value of solar per watt by county (dropping counties with fewer than five installations)
Figure A2 shows that when we account for electricity rate differences, solar insolation differences, and differences in incentives, the value to consumers of solar PV does not simply scale with latitude.
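The market structure variables described in this appendix (installer market shares, the HHI, and installer density) can be illustrated with a short sketch. The function names and example counts are hypothetical, and the HHI is expressed here on a 0-to-1 scale as the sum of squared fractional shares, which is an assumption about scaling rather than a detail confirmed by the text.

```python
def market_shares(installs_by_installer):
    """Each installer's share of county installations in the period."""
    total = sum(installs_by_installer.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in installs_by_installer.items()}


def hhi(installs_by_installer):
    """Herfindahl-Hirschman Index: the sum of squared market shares."""
    return sum(share ** 2 for share in market_shares(installs_by_installer).values())


def installer_density(installs_last_six_months, households):
    """Installers with at least two installations in the previous six months,
    divided by the number of households in the county."""
    active = sum(1 for count in installs_last_six_months.values() if count >= 2)
    return active / households


# Hypothetical county with three installers.
county_installs = {"Installer A": 50, "Installer B": 30, "Installer C": 20}
print(round(hhi(county_installs), 2))
```

A county served by a single installer yields an HHI of 1 on this scale, while a county with many equally sized installers yields a value close to 0.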

Appendix 3: Robustness Check Results
This appendix includes the results of two additional robustness checks and discusses further robustness checks. The first, Table A1, shows the results excluding third party-owned systems. The results are very similar to the primary regression results in Table 2. The second, Table A2, presents the results using the pre-incentive price, rather than the price per watt, as the dependent variable. These results are largely similar, although some of the statistical significance is lost.

[Tables A1 and A2 are not reproduced here. Recovered portion of the table notes: "... (3) and (4) and zip code in the remaining columns. * p < 0.05, ** p < 0.01, *** p < 0.001."]

In addition to the above two specifications, we also conducted a variety of other robustness checks. For example, we examine variations on the percentage SREC incentive variable by including performance-based incentives. This makes little difference. We break out consumer segment into additional categories, but find that residential systems drive the results. We explore three separate wage rates (electrician, roofing, and general contracting), but find that we lose statistical significance, while the general magnitude and sign for all three are similar to those of our composite wage index. We also examine several alternative specifications for market structure and experience variables, but find little difference from our preferred results. Finally, we include dummy variables for systems with Chinese-brand modules and for the presence of microinverters. Including either of these variables involves dropping some of our observations (9,389 for Chinese-brand modules and 8,234 for microinverters). The coefficients on both are negative and highly statistically significant. The remaining coefficients largely remain the same, with the one exception being the percentage of the incentive from SRECs, which moves closer to zero and becomes statistically insignificant.