Assessing Office Building Marketability before and after the Implementation of Energy Benchmarking and Disclosure Policies—Lessons Learned from Major U.S. Cities

An increasing number of U.S. cities have adopted energy benchmarking and disclosure policies that require commercial/office properties to publicly disclose their energy performance. This level of transparency provides an additional in-depth assessment of a building’s performance beyond a sustainability certification (e.g., Energy Star, LEED) and may lead less energy-efficient buildings to invest in energy retrofits, thereby improving their marketability. However, research assessing the impact of such policies on office building marketability is scarce. This study addresses this gap by investigating the impact of energy benchmarking policies on the performance of office buildings in four major U.S. cities (New York; Washington, D.C.; San Francisco; and Chicago). We use interrupted time series analysis (ITSA), while accounting for sustainability certification, public policy adoption, and property real estate performance. The results reveal that in some cities, energy-efficient buildings generally perform better than less energy-efficient buildings after the policy implementation, especially if they are Class A. The real estate performance of energy-efficient buildings also exhibited continuously increasing trends after the policy implementation. However, due to potentially confounding factors, further analysis is required to conclude that the policy impacts on energy-efficient buildings are more positive than those on less energy-efficient buildings.


Introduction
The U.S. commercial building stock grew by 47.4% between 1979 and 2012, to 5.6 million buildings, and total floorspace is expected to reach 124 billion square feet by 2050 [1], although this growth might be tempered by post-COVID-19 workplace flexibility. The concentration of commercial buildings in city cores, together with the adoption of energy benchmarking and public disclosure policies, can contribute to a more sustainable built environment and to increased awareness among owners/investors and tenants [2,3].
Previous studies have reported that sustainable, energy-efficient buildings, such as those certified under Leadership in Energy and Environmental Design (LEED) or Energy Star, command higher rents and sale prices and have lower vacancies than comparable non-energy-efficient buildings in the market [4][5][6][7][8][9][10][11][12]. A recent study of electric vehicle (EV) charging stations has also found evidence of a positive association between these stations and sale prices for office properties [4]. Furthermore, higher adoption of EV chargers is linked to areas with higher socioeconomic attributes, including home values [13]. As more states begin to mandate the transition to electric vehicles, more such stations will be required, adding energy consumption not previously accounted for. Owners and investors are already sensitive to sustainability certifications, since federal agencies (e.g., the U.S. General Services Administration) require their facilities to be LEED Gold and private tenants with core sustainability values prefer sustainably certified buildings. Even so, the open disclosure of energy performance provides, for the first time, a transparent and level field of comparison among all buildings. As such, mandatory energy benchmarking and disclosure policies could affect the leasing and purchasing decisions of real estate customers as they become more aware of such data. Consequently, such policies are expected to motivate the owners of less energy-efficient buildings to invest in energy retrofits with the goal of improving the short- and long-term performance and marketability of their buildings.
However, there is a lack of studies focusing specifically on the impact of such policies on office buildings. In view of the significance of energy benchmarking and disclosure policies and their potential impacts on real estate markets, the authors previously examined the effectiveness of the benchmarking policy on the real estate performance of office buildings in downtown Chicago and argued that energy disclosure policies have a positive impact on the real estate performance of energy-efficient buildings [14]. This study expands that preliminary effort by adding more major U.S. cities to the analysis and by conducting a more comprehensive and robust analysis of the relationship between the energy policy and office building marketability. Specifically, the research assesses the real estate performance of sustainable buildings before and after policy implementation while taking market cycles (e.g., seasonality) into consideration; further, it investigates whether building characteristics (e.g., building class level) moderate the effect of energy policies on a building's marketability.
This paper first provides a review of the previous research regarding the exploration of the relationship between energy policies and the corresponding impacts on the real estate market. It is then followed by an overview of the research data and description of the methodology used to examine the policy impact. Last, the results are discussed, followed by some concluding remarks.

Building Energy Efficiency Policies
For several years there has been evidence of increasing adoption of sustainability practices among buildings, including energy-efficient strategies that provide cost savings to owners and tenants. Sustainability has become an important factor in property evaluation and decision-making in the U.S. real estate market [15]. Among office buildings, LEED and Energy Star labeled offices (i.e., those with a certificate of building energy efficiency) have significantly higher occupancy rates than less energy-efficient buildings [16]. These energy efficiency certificates also show a positive effect on office rental revenues [5,6,10,17]. For industrial facilities, 'green' certified properties have higher values [18].
Energy efficiency policies such as building energy benchmarking and disclosure require property owners to share their building energy performance with the public; these policies aim to raise awareness of energy-efficient properties among owners, investors, and tenants. Zalejska-Jonsson [19] showed that in an area with high availability of public energy information, energy efficiency is a key factor in a buyer's decision to purchase a property, whereas in an area with less awareness of energy information, energy efficiency has only a minor impact on decision-making. The adoption of energy policies is largely determined by economic, political, and climate factors [20]. Cities that are categorized as policy pioneers with a high education rate usually have a higher adoption rate of those policies and tend to have lower carbon emissions per capita [21]. The adoption of energy disclosure policies also increases the availability of building energy-performance data, which supports academic research on the relationship between building characteristics and energy performance. For example, Kontokosta [22] showed that building structural, mechanical, and locational attributes can be used to predict energy consumption. To help governments better implement energy policies, studies have also summarized best practices and recommendations on the adoption of such policies [23,24].
Studies indicate that there is a substantial need for greater transparency of building energy performance and that standardized energy reporting is desirable [25]. The U.S. Department of Energy [26] states that measuring and disclosing building energy use would drive building owners to make improvements that lower energy costs for their property, savings that can also be passed through to their tenants. It has also been demonstrated that disclosure of energy performance positively affects energy savings [2,27,28,29]. The policies would improve the market penetration of energy-efficient buildings in various real estate markets (e.g., office and residential) [30]; promote green office building designations [31]; and increase the purchase of energy-efficient equipment [2]. Furthermore, Barrett et al. [32] investigated the energy ordinances requiring energy retrofits for rental properties in Boulder, Colorado, and found that early engagement of people committed to energy efficiency is conducive to the adoption of such requirements in an economically driven environment.
To the best of the authors' knowledge, research on the relationship between energy disclosure policies and office building marketability is scarce. In a preliminary study [14], the authors found that energy-efficient buildings show a decreasing trend in vacancy rates after such disclosure policies were implemented, which implies that such policies improve marketability. Therefore, a study of this nature can be viewed as a significant step forward in facilitating informed decision-making by building owners in future energy-efficiency improvement projects.

Data Source
This research focuses on four major cities across the U.S. (New York City; Washington, D.C.; San Francisco; and Chicago). The key consideration in selecting the cities is their disclosure policy history: these cities have relatively long histories of adopting energy disclosure policies (Washington, D.C., since 2008; New York since 2009; San Francisco since 2011; and Chicago since 2013). In addition to data availability, the current research also targets cities with a higher level of sustainability interest/awareness; that is, it begins by analyzing the effect of the disclosure policy in cities whose populations are known to be generally more sensitive to sustainable practices than those of other cities.
The data were collected from three different databases. The types of data collected were as follows:
a. Real estate data: Building characteristics (e.g., building class, size, etc.) and real estate performance data (e.g., occupancy rate, rents, etc.) were collected from the CoStar Group database for office buildings of more than 10,000 square feet.
b. Sustainability labeling data: Sustainability data such as rating, certification level, and points are publicly available and were collected from the U.S. Green Building Council (USGBC)'s website. The Energy Star label data were obtained from the Energy Star building database.
c. Energy consumption data: The energy benchmarking and disclosure policy requires building owners to publicly disclose their building's energy performance. Such data were available at city web portals.

Data Processing
Data processing consisted of two main tasks: data cleaning and database merging. During data cleaning, incomplete, incorrect, inaccurate, and unreasonable data points were detected and addressed (e.g., by replacing, modifying, or deleting them). For example, Figure 1 shows that San Francisco's energy consumption database contains some outliers (highlighted in red). Because these values are implausible, we removed the data points whose site EUI (energy use intensity, i.e., the amount of heat and electricity consumed by a building per unit of floor area) exceeds the mean plus three standard deviations. We also compared the trend of the mean site EUI before and after removing the outliers. As shown in Figure 2, the trend after removal is consistent with that of the original data, which implies that the removal would not significantly affect our analysis of policy effects on trends.
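The outlier rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `drop_high_eui_outliers`, the use of the sample standard deviation, and the EUI values are hypothetical and are not taken from the study's data or code.

```python
import statistics

def drop_high_eui_outliers(site_eui, n_sd=3.0):
    """Keep values at or below mean + n_sd * standard deviation."""
    mean = statistics.mean(site_eui)
    sd = statistics.stdev(site_eui)  # assumption: sample standard deviation
    cutoff = mean + n_sd * sd
    kept = [v for v in site_eui if v <= cutoff]
    return kept, cutoff

# Hypothetical site EUI values (kBtu per square foot), not from the study
eui = [60.0 + (i % 10) for i in range(30)] + [5000.0]
kept, cutoff = drop_high_eui_outliers(eui)
print(len(eui) - len(kept), "value(s) removed above cutoff", round(cutoff, 1))
```

Because the mean and standard deviation are themselves inflated by extreme values, a single pass of this rule removes only the most extreme points, which is consistent with comparing the trend before and after removal as a sanity check.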

To examine the impact of energy policies on the real estate performance of office buildings, this study further merged the three separate data dimensions into an integrated database. Building address was used as the primary key to join the datasets. Because addresses are recorded inconsistently and contain typos across datasets, a fuzzy merge method [33] was used to calculate the matching degree of the addresses, and the two addresses with the highest matching degree were identified as the same building and merged. Specifically, we designated the buildings featured in the sustainability labeling dataset as Energy Star (ES) certified and considered the remaining ones as non-ES certified buildings. Figure 3 illustrates the merging process and the data size of each dataset before and after the merge.
The basic structure of the integrated database of each city is summarized in Table 1. A variety of variables can be used to assess the marketability of office buildings. In this study, the annual occupancy rate was chosen as the metric to evaluate real estate performance because it has relatively high data quality (e.g., no missing values) and reflects tenant demand for all buildings regardless of sustainability certification/label. The second column in Table 1 presents the available years of real estate data (i.e., occupancy) for each city. The third column shows the year the energy benchmarking policy was implemented in each city. In this project, the Energy Star label is the main feature used to group the buildings as energy-efficient (sustainable) buildings vs. non-energy-efficient buildings. Thus, Table 1 also summarizes the number of buildings with and without the Energy Star (ES) label. Note that in this research, a building is considered an Energy Star building if it obtained the label at least once.
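The address-matching step described above can be illustrated with a small sketch. The study's fuzzy merge method [33] is not specified here, so this example swaps in Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure; the helper names, the 0.7 threshold, and the addresses are all hypothetical.

```python
from difflib import SequenceMatcher

def normalize(address):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " "
                      for ch in address.lower())
    return " ".join(cleaned.split())

def best_match(address, candidates, min_score=0.7):
    """Return (candidate, score) for the most similar candidate.

    The candidate is None when the best score is below min_score
    (the threshold is an assumption, not a value from the study).
    """
    target = normalize(address)
    scored = [(SequenceMatcher(None, target, normalize(c)).ratio(), c)
              for c in candidates]
    score, match = max(scored)
    return (match if score >= min_score else None), score

# Illustrative address strings only
portal_addresses = ["500 West Lake Street", "12 Harbor Blvd", "77 Pine Avenue"]
match, score = best_match("500 W. Lake St", portal_addresses)
print(match, round(score, 2))
```

Purely string-based scores can rank a different building (for example, the same number on an "E" rather than a "W" street) above the intended one, so matches near the threshold would normally be reviewed by hand.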

Data Description
In addition to grouping buildings based on their sustainability status (i.e., Energy Star vs. non-Energy Star), the study also classified the buildings by building class in order to test whether different building characteristics affect the impact of the policy on real estate performance. In CoStar (our real estate data source), office buildings are generally classified into three classes, A, B, and C, with Class A representing the highest-quality buildings in each city. Buildings are rated based on parameters such as age, building systems (e.g., HVAC), location, how well the building is maintained, and amenities. Figure 4 shows the number of buildings in each class. Except for Washington, D.C., the cities have a similar distribution of buildings across classes. Washington, D.C., differs from the other three cities for Class C buildings mainly because of data loss caused by data merging.
This study further compared the number of Energy Star buildings with the number of non-Energy Star buildings in each class level, and the comparison is summarized in Figure 5. Generally, in Class A, the proportion of Energy Star buildings is relatively higher, while the proportion of non-Energy Star buildings is higher for Class B buildings. Additionally, there are only a few Energy Star Class C buildings, which follows the expectation as buildings with better energy efficiency are more likely to achieve a better classification level. The subsequent analysis focuses on A and B classes since the number of ES buildings in Class C is too small for meaningful statistical analysis.
A variety of variables can be used to measure the real estate performance of office buildings. The occupancy rate was chosen because of the relatively high data quality (e.g., absence of missing values) and its better reflection of tenant demand for properties that have or have not embraced sustainability. Figures 6-9 show the annual trends of average occupancy for each class in each city for Energy Star buildings and non-Energy Star buildings.

Methodology
This interdisciplinary research is at the interface of building energy efficiency, policy planning, and real estate economics, making contributions to each field. In order to achieve the research objective of assessing the impact of energy benchmarking policy on the real estate performance of office buildings, this study applied two interrupted time series analyses based on the occupancy rates of office buildings in the four cities. The general research process is summarized in Figure 10.

Interrupted Time Series Analysis (ITSA)
Once the real estate performance variable (occupancy rate) was selected, the next step was to set up hypotheses regarding how the policy would impact the variable. The policy impacts fall into three main scenarios: the trend of the performance variable shows a change in level (i.e., an immediate change) after the policy, a change in gradient (i.e., a continuous change), or both. It is difficult to conclude whether such impacts exist by simply observing the trends shown in Figures 6-9 in the previous section. Therefore, this study applies an interrupted time series analysis (ITSA) to test for these changes statistically.
ITSA is a quasi-experimental method that was developed to assess if a time series of a specified outcome (e.g., occupancy rate) was affected by intervention(s) at a known point or points in time [34][35][36]. It has become increasingly popular in political science, in which it is usually used to study the impact of changes in laws or regulations on the behavior of people or the market [37][38][39]. ITSA is based on the key assumption that data trends remain unchanged without interventions. In other words, if there were no interventions, an expected trend can be predicted based on the pre-existing trend. A comparison between the expected trend and the actual trend observed in the post-intervention period reveals the difference, which provides evidence for the impact of the intervention. However, the assumption of the unchanged data trends has the risk of yielding biased results if the time series data is seasonal. For example, ITSA may detect changes after the policy implementation, but it is difficult to determine if that change is caused by the policy or seasonality. Therefore, it is necessary to first adjust for the seasonality in the time series data before conducting ITSA.
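The counterfactual logic described above (extrapolate the pre-intervention trend, then compare it with what was observed) can be sketched as follows. This is a simplified illustration, not the regression used in the study: the function name, the choice to treat the policy year itself as post-intervention, and the occupancy numbers are assumptions.

```python
import numpy as np

def gap_from_pre_trend(years, values, policy_year):
    """Observed minus expected values for the post-policy years."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    t = years - years[0]                # time since the first year
    pre = years < policy_year           # assumption: policy year counts as "post"
    slope, intercept = np.polyfit(t[pre], values[pre], deg=1)
    expected = intercept + slope * t[~pre]   # trend expected without intervention
    return values[~pre] - expected

# Hypothetical occupancy series (percent), not taken from the study
years = np.arange(2005, 2019)
occupancy = 85.0 + 0.5 * (years - 2005)
occupancy = occupancy + np.where(years >= 2013, 2.0 + 0.3 * (years - 2013), 0.0)
gap = gap_from_pre_trend(years, occupancy, policy_year=2013)
print(np.round(gap, 2))
```

A gap that starts away from zero suggests a level change, and a gap that widens over time suggests a change in gradient; the regression form of ITSA estimates both together with their standard errors.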

Seasonality Adjustment
The process of adjusting for seasonality can be divided into two main steps. In the first step, Fourier transformation was used to detect seasonality [40]. The main purpose of this step is to detect the seasonal cycle of the time series data (i.e., occupancy rate). For instance, Table 2 shows the detection results based on the non-Energy Star data in Chicago. It shows that the data have two seasonal cycles, and the main cycle comprises 8 years. The next step is to extract seasonality from the time series data through decomposition. Seasonality may exist in time series data in two forms: additive or multiplicative. According to the occupancy rate trends shown in Figures 6-9, the time series data in this research show an additive pattern. Based on the seasonal cycle determined in the previous step, the original time series data can be decomposed into three parts (i.e., seasonal, trend, and random), and the seasonal part can be removed from the original data. Figures 11-14 show the annual trends of the average occupancy for each city after the seasonality adjustment, which can be compared with the original trends shown in Figures 6-9.
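The two steps above can be sketched with NumPy as follows. This is a minimal illustration under simplifying assumptions, not the study's procedure: the trend is taken to be linear (decomposition routines often use a moving average instead), only the single strongest cycle is detected, and the function names and the synthetic series are hypothetical.

```python
import numpy as np

def dominant_period(series):
    """Period (in samples) of the strongest non-zero frequency after linear detrending."""
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y))
    detrended = y - np.polyval(np.polyfit(t, y, 1), t)
    power = np.abs(np.fft.rfft(detrended)) ** 2
    freqs = np.fft.rfftfreq(len(y), d=1.0)
    k = 1 + np.argmax(power[1:])        # skip the zero-frequency term
    return int(round(1.0 / freqs[k]))

def remove_additive_seasonality(series, period):
    """Subtract the average deviation from a linear trend at each cycle position."""
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y))
    resid = y - np.polyval(np.polyfit(t, y, 1), t)
    seasonal = np.array([resid[i::period].mean() for i in range(period)])
    seasonal -= seasonal.mean()         # seasonal component sums to zero over a cycle
    return y - seasonal[t % period]

# Synthetic annual series with an 8-year additive cycle (illustrative only)
t = np.arange(24)
y = 85.0 + 0.2 * t + 2.0 * np.sin(2 * np.pi * t / 8)
period = dominant_period(y)
adjusted = remove_additive_seasonality(y, period)
print(period, np.round(np.max(np.abs(adjusted - (85.0 + 0.2 * t))), 3))
```

With short annual series, only a few complete cycles are available, so the detected period and the estimated seasonal component should be read as approximate.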

The Multiple-Group ITSA on Aggregated Vacancy of Two Building Groups
When studying the impact of a large-scale intervention (e.g., a policy affecting all buildings in a city), researchers often have an effective sample size of N = 1 (treatment group) or N = 2 (treatment group with a control group) [41], and it is common to use an aggregated value (e.g., median or mean) to represent the sample in the ITSA. In the present study, the treatment group consists of all the Energy Star buildings of each class for each city, and the mean occupancy rate is used as the aggregated outcome variable for the ITSA.
In addition to the energy policy, many unobserved factors could potentially affect occupancy rates. Including a control group in the ITSA can help account for the other confounding factors when an exogenous intervention affects all the groups, which is called multiple-group ITSA [41]. The multiple-group ITSA hypothesizes that the level or trend of the outcome variables remains unchanged for all groups if no intervention occurs. It assumes the unobserved factors affect both groups to the same extent.
This study conducted multiple-group ITSAs for each city based on two comparable groups: one control group consisting of the non-Energy Star buildings and one treatment group consisting of the Energy Star buildings. By accounting for confounding factors, this grouping enables us to focus the study on investigating how the benchmarking policy affected occupancy rates differently between the energy-efficient buildings and their non-energy-efficient counterparts. The multiple-group ITSA with two groups is based on the following regression model [42,43]:
$$Y_t = \beta_0 + \beta_1 T_t + \beta_2 X_t + \beta_3 X_t T_t + \beta_4 Z + \beta_5 Z T_t + \beta_6 Z X_t + \beta_7 Z X_t T_t + \varepsilon_t \tag{1}$$
where $Y_t$ is the aggregated outcome variable (average occupancy rate) at each equally spaced (annual) time point $t$; $Z$ is the dummy variable indicating the group (0 = control and 1 = treatment); $T_t$ is the time since the starting year of the database; and $X_t$ is the dummy variable indicating the pre- or post-intervention period (0 = pre-intervention period and 1 = post-intervention period). In Equation (1), the first four coefficients, $\beta_0$ through $\beta_3$, refer to the control group, while the last four coefficients, $\beta_4$ through $\beta_7$, refer to the treatment group. Specifically, $\beta_0$ is the intercept of the outcome variable; $\beta_1$ is the initial trend before the intervention; $\beta_2$ is the level change that occurs immediately after the intervention; and $\beta_3$ is the continuous change of the trend after the intervention. Furthermore, $\beta_4$ is the difference in the intercept of the outcome variable between the treatment and control groups before the intervention; $\beta_5$ is the difference in the trend between the two groups before the intervention; $\beta_6$ is the difference between the two groups in the level change immediately after the intervention; and $\beta_7$ is the difference between the two groups in the continuous change of the trend after the intervention. $\varepsilon_t$ is a random error term.
To ensure the comparability between the groups, the control and treatment groups should not be significantly different in either the initial intercept or the trend of the outcome variable before the intervention. Thus, an appropriate control group should have p-values for both $\beta_4$ and $\beta_5$ greater than the required threshold (i.e., 0.05). The p-values of $\beta_2$ and $\beta_3$ show whether there are significant changes (immediate and continual) in the control group (non-Energy Star group) after the intervention. The p-values for $\beta_6$ and $\beta_7$ then provide statistical evidence on whether the policy affects the treatment group differently from the control group.
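As an illustration of how the eight coefficients of the multiple-group ITSA map onto a regression design matrix, the following is a minimal sketch rather than the procedure used in this study. It assumes ordinary least squares via NumPy, synthetic noiseless data, and hypothetical names (`itsa_design`, `fit_multigroup_itsa`, `post_start`). It also assumes that the post-intervention trend term is coded as time elapsed since the intervention, a common coding under which the third coefficient reads as the level change at the intervention year. Standard errors and p-values are omitted.

```python
import numpy as np

def itsa_design(t, z, post_start):
    """Build the 8-column design matrix of a two-group ITSA.

    t          : time index of each observation (0 = first year of the data)
    z          : group indicator (0 = control, 1 = treatment)
    post_start : first post-intervention time index (hypothetical parameter)
    """
    t = np.asarray(t, dtype=float)
    z = np.asarray(z, dtype=float)
    x = (t >= post_start).astype(float)   # X_t: post-intervention dummy
    xt = x * (t - post_start)             # trend term, coded as time since intervention (assumption)
    return np.column_stack([np.ones_like(t), t, x, xt, z, z * t, z * x, z * xt])

def fit_multigroup_itsa(y, t, z, post_start):
    """Return OLS point estimates (beta_0 ... beta_7); no standard errors."""
    design = itsa_design(t, z, post_start)
    beta, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return beta

# Synthetic example: 12 annual observations per group, intervention at index 6.
years = np.arange(12)
t = np.tile(years, 2)                     # control years followed by treatment years
z = np.repeat([0.0, 1.0], years.size)     # 0 = control, 1 = treatment
true_beta = np.array([88.0, 0.2, 2.0, -0.3, 0.5, 0.1, -4.0, 0.6])  # made-up values
y = itsa_design(t, z, post_start=6) @ true_beta
beta_hat = fit_multigroup_itsa(y, t, z, post_start=6)
```

Because the synthetic series contains no noise, `beta_hat` recovers `true_beta` up to floating-point error. With real occupancy data, the comparability check described above would additionally require p-values for the fifth and sixth coefficients, which this sketch does not compute.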

Single-Group ITSA on Individual Building
Using aggregated data (i.e., the average occupancy rate) carries a potential loss of information. Because the average occupancy rate was used to represent a whole population (i.e., the ES group or the non-ES group), the analysis could not evaluate the policy impacts on each building. To address this limitation, the study extends the aforementioned analysis by conducting an ITSA on each individual building. In this case, a single-group ITSA is used.
The single-group ITSA is a simpler version of the multiple-group ITSA that only examines the changes for the treatment group (i.e., the occupancy rate of each building). It is based on the following model [42,44]:
$$Y_t = \beta_0 + \beta_1 T_t + \beta_2 X_t + \beta_3 X_t T_t + \varepsilon_t \tag{2}$$
where $Y_t$ is the occupancy rate of each building at year $t$; $T_t$ is the time since the starting year of the database; and $X_t$ is the dummy variable indicating the pre- or post-intervention period (0 = pre-intervention period and 1 = post-intervention period). Note that if we set $Z$ in Equation (1) to 0, the two functions become the same. The meanings of the coefficients (i.e., $\beta_0$ to $\beta_3$) in Equation (2) are the same as those in Equation (1).
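The single-group model can be sketched for one building as follows. This is an illustrative example under stated assumptions, not the study's implementation: it uses plain OLS with classical standard errors, approximates two-sided p-values with the normal distribution, codes the post-intervention trend as time since the intervention, and runs on a synthetic occupancy series. The function name `fit_single_itsa` and the parameter `post_start` are hypothetical.

```python
import math
import numpy as np

def fit_single_itsa(y, post_start):
    """Fit a single-group ITSA to one annual series; return (beta, p_values)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size, dtype=float)            # T_t: years since the first observation
    x = (t >= post_start).astype(float)           # X_t: post-intervention dummy
    design = np.column_stack([np.ones_like(t), t, x, x * (t - post_start)])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = y.size - design.shape[1]                # residual degrees of freedom
    sigma2 = float(resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)   # classical OLS covariance
    se = np.sqrt(np.diag(cov))
    # Two-sided p-values from a normal approximation (assumption; a t
    # distribution or autocorrelation-robust errors may be preferable).
    p_values = np.array([math.erfc(abs(b / s) / math.sqrt(2.0))
                         for b, s in zip(beta, se)])
    return beta, p_values

# Synthetic building: 14 years, a 5-point drop at year 7, then a recovery.
t = np.arange(14)
noise = 0.2 * (-1.0) ** t                          # small deterministic perturbation
occupancy = 90 + 0.1 * t - 5.0 * (t >= 7) + 0.5 * (t >= 7) * (t - 7) + noise
beta, p = fit_single_itsa(occupancy, post_start=7)
```

Here `beta[2]` estimates the immediate level change and `beta[3]` the change in trend; the corresponding entries of `p` can be compared against the 0.05 threshold.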

The Result of the Multiple-Group ITSA on Two Building Groups
The multiple-group ITSA was conducted based on the annual average occupancy rates of the office buildings for each class and each city in order to examine the change(s) after the policy implementation and then to infer the impact of the benchmarking policy on real estate performance. This multi-group analysis specified Energy Star buildings as the treatment group and non-Energy Star buildings as the control group.

New York City
For Class A buildings, as shown in Table 3, the starting level of difference between the treatment group and the control group ($\beta_4$: z) was not significant (p = 0.80), and the initial trend difference ($\beta_5$: z_t) was not significant either (p = 0.87). As mentioned earlier, groups with p-values greater than the specified threshold (i.e., 0.05) for both $\beta_4$ and $\beta_5$ in Equation (1) are preferred to ensure comparability. Thus, for NYC Class A buildings, the Energy Star group (i.e., treatment group) and the non-Energy Star group (i.e., control group) behave similarly before the policy intervention. After the intervention, the occupancy rate of the non-Energy Star group increases by 2.28% immediately ($\beta_2$; p = 0.001), while that of the Energy Star group drops immediately by 2.50% ($\beta_2 + \beta_6$; p = 0.01). The policy was implemented following the beginning of the financial crisis, with Class A buildings commanding the highest rents, and the crisis could have led to tenant flight to Class B. For the continuous trends ($\beta_3$ and $\beta_7$), there is a slightly increasing trend for the Energy Star group's occupancy rate and a decreasing trend for the non-Energy Star group's occupancy rate, but neither is statistically significant. Note that $\beta_6$ and $\beta_7$ represent the differences between the Energy Star group and the non-Energy Star group rather than the changes of the Energy Star group before and after the policy implementation. The results are consistent with the visual display in Figure 15a.
For Class B buildings, in Table 3, the occupancy rate of the non-Energy Star group increases by 2.28% immediately after the policy implementation with a continuously decreasing trend (−0.21% per year). The p-values of $\beta_6$ and $\beta_7$ show that there is no significant difference in either the immediate change or the trend between the Energy Star and non-Energy Star groups after the intervention. This indicates that the occupancy rate of the Energy Star group also had an immediate increase with a continuously decreasing trend. The results are also exhibited in Figure 15b.

Washington, D.C.
As shown in Table 4, for Class A buildings, there is a significant drop in occupancy rate for the non-Energy Star group (−6.89%) after the policy implementation. Although the immediate change of the Energy Star group is slightly different (1.10%) from that of the non-Energy Star group, the difference is not statistically significant. Thus, the occupancy rate of the Energy Star group also had an immediate drop after the policy. Similar to New York City, these drops might also be due to the financial crisis. However, after the policy implementation (year 2008), the occupancy of the Energy Star group shows an increasing trend, while the non-Energy Star group has a decreasing trend (statistically significant; p < 0.001). The results are visualized in Figure 16a.
For Class B buildings, in Table 4, the occupancy rates have significant decreasing trends after the policy implementation for both groups. However, there is no significant difference between the two groups, which implies that the policy affected both Energy Star and non-Energy Star buildings to a comparable extent. Visualization is shown in Figure 16b.

San Francisco
As shown in Table 5 and Figure 17a, for Class A buildings, both groups have an increasing trend of occupancy rate (but not statistically significant) after the policy implementation. The results show that there is no significant difference in the trend between these two groups. For Class B buildings, after the policy implementation, the occupancy rates of both groups have statistically significant increasing trends, and there is no significant difference between the two groups, as shown in Table 5 and Figure 17b.

Chicago
As shown in Table 6 for Class A buildings, both groups have an increasing trend of occupancy after the policy implementation. However, there is a significant difference in the immediate change between the two groups. The non-Energy Star group has an immediate jump after the implementation. Furthermore, Figure 18a visualizes the results. For Class B buildings, the non-Energy Star group has a significant jump in occupancy after the policy implementation, which is different from the Energy Star group. However, the continual trend of the non-Energy Star group is decreasing after the policy implementation, while the Energy Star group has an increasing trend, as shown in Table 6 and Figure 18b.

The Result of ITSA on Individual Buildings
To maximize the use of the collected data and avoid the loss of information caused by the aggregated-data-based analysis, the study further adopted the single-group ITSA to examine whether the implementation of the policy resulted in a shift in the occupancy rate for each building. Based on the analysis results, the study counted the number of buildings with a statistically significant shift (i.e., p-value < 0.05) in the occupancy rate after the policy implementation. Furthermore, among the buildings with statistically significant changes, the number with positive changes (i.e., the occupancy rate increases after the policy implementation) was also counted. Table 7 summarizes the numbers for each city. Note that in Table 7, the columns labelled "Total" give the total number of buildings in each group (Energy Star vs. non-Energy Star); the columns labelled "Significant" give the number of buildings that have statistically significant changes (either immediately or continuously) after the year the policy was implemented; and the columns labelled "Sign & Pos" give the number of buildings with statistically significant and also positive changes (increase in occupancy rate) after the year of policy implementation.
By comparing these totals (i.e., significant and sign & pos) with the total number of buildings under the Energy Star group and the non-Energy Star group, respectively, two ratios can be derived. The first ratio (significant/total) indicates the percentage of buildings that have statistically significant changes in occupancy rate after the policy implementation, and the second ratio (significant and positive/significant) indicates the percentage of the buildings that are positively affected by the policy among the buildings that have significant changes. The results can be used to infer if the policy has different impacts on the real estate performance between the Energy Star and non-Energy Star groups. Figure 19 shows the corresponding ratios.
The first ratio (significant/total) can be used to check which type(s) of buildings are more likely to have a change in real estate performance after the policy implementation. As shown in Figure 19, in New York City, overall, the ratios of both groups (Energy Star and non-Energy Star) are very close. For Class A buildings, the ratios are approximately 0.5, which implies the real estate performances of about half of Class A buildings have significantly changed after the policy implementation. For Class B, these ratios are slightly smaller, around 0.4. In Washington, D.C. and San Francisco, the significant/total ratio of the Energy Star group is higher than that of the non-Energy Star group, while the difference in this ratio between Class A and Class B is not obvious. This indicates that for Washington, D.C., and San Francisco, the real estate performance of the Energy Star buildings may be more sensitive to the energy policy (i.e., more prone to change) compared to the non-Energy Star buildings. For Chicago, both the Energy Star and non-Energy Star Class A buildings have a relatively high significant/total ratio (close to and over 0.7, respectively), which means occupancy rates of a large proportion of Class A buildings significantly changed after the policy implementation. For Class B buildings, the ratio of the Energy Star group is very close to that of the Energy Star group in Class A. This implies that for Energy Star buildings, the class level may not affect the sensitivity of their real estate performance to the policy. However, for non-Energy Star buildings, the ratio of Class A buildings and that of Class B buildings are obviously different (0.75 vs. 0.53), which implies that the sensitivity of non-Energy Star buildings' real estate performance to the policy may depend on the building class.
The second ratios (positive and significant/significant) can be used to check which type(s) of buildings are more likely to be positively affected by the energy policy. From Figure 19, in New York City, among the buildings that have statistically significant changes in occupancy rate after the policy, Class A non-Energy Star buildings have a higher ratio showing positive changes (0.64). However, only 39% of Class A Energy Star buildings exhibit an increase in occupancy rate after the policy implementation, which means, after the policy implementation, more Class A Energy Star buildings have a decreasing trend in occupancy rate. For Class B buildings, the occupancy rates of 52% of Energy Star buildings increased after the policy implementation, which is slightly higher than non-Energy Star buildings (49%). In Washington, D.C., the 'significant and positive/significant' ratios of both Energy Star and non-Energy Star groups are relatively low, which implies that more buildings experienced declines in occupancy rate after 2008 (the year of the energy policy implementation). This low ratio might have been caused by other confounding factors, such as the financial crisis. However, it is noted that the ratio of the Energy Star group is higher than that of the non-Energy Star group for Class A buildings, which implies that Energy Star buildings are more likely to be positively affected by the policy. For Class A buildings in San Francisco, the 'significant and positive/significant' ratio of the Energy Star group is approximately 0.76, while that of the non-Energy Star group is only 0.5. This implies that after the policy implementation, Energy Star buildings tend to have better improvement in real estate performance. 
In Chicago, the 'significant and positive/significant' ratios of both groups for Class A buildings are relatively high, which shows that the policy generally has a positive influence on Class A buildings, with the influence not being substantially different between the Energy Star and non-Energy Star groups. For Class B buildings, 53% of Energy Star buildings exhibit an increase in occupancy rates after the policy implementation, which is lower than the ratio of non-Energy Star buildings (68%).
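The two ratios discussed in this section can be tallied from per-building ITSA outputs as in the following minimal sketch. The field names (`p_level`, `p_trend`, `increased`) and the example values are hypothetical and are not taken from Table 7. A building is counted as significant if either its immediate or its continuous change has a p-value below 0.05. How the direction of change is determined for each building is treated here as a given input, which is an assumption.

```python
from dataclasses import dataclass

ALPHA = 0.05  # significance threshold used in the study

@dataclass
class BuildingResult:
    p_level: float    # p-value of the immediate (level) change
    p_trend: float    # p-value of the continuous (trend) change
    increased: bool   # True if occupancy rose after the policy (assumed given)

def summarize(results):
    """Count buildings and derive the two ratios for one group of buildings."""
    total = len(results)
    significant = [r for r in results if min(r.p_level, r.p_trend) < ALPHA]
    sign_pos = [r for r in significant if r.increased]
    return {
        "total": total,
        "significant": len(significant),
        "sign_pos": len(sign_pos),
        # First ratio: share of buildings with a significant change.
        "significant/total": len(significant) / total if total else float("nan"),
        # Second ratio: share of significant buildings that improved.
        "sign_pos/significant": (len(sign_pos) / len(significant)
                                 if significant else float("nan")),
    }

# Made-up group of four buildings, for illustration only.
example = [
    BuildingResult(0.01, 0.40, True),
    BuildingResult(0.30, 0.02, False),
    BuildingResult(0.60, 0.70, True),
    BuildingResult(0.03, 0.04, True),
]
summary = summarize(example)
```

Running `summarize` separately on the Energy Star and non-Energy Star buildings of each class and city would yield the kind of paired ratios compared above.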

Discussion
According to the multiple-group ITSA on the average occupancy rate, the results are mixed. In New York City and Washington, D.C., the occupancy rate of Class A Energy Star buildings fell immediately after the policy implementation and then recovered with a gradual upward trend. In contrast, the Class A non-Energy Star buildings exhibited a gradually decreasing trend after the policy implementation. According to the ITSA, however, there was no statistical evidence of a continuous occupancy rate trend difference between the two groups after the policy implementation. These effects may have their roots in the financial crisis, as the policies were implemented in 2008 (Washington, D.C.) and 2009 (New York City), and rents are much higher in these properties. On the other hand, the Class A buildings (both Energy Star and non-Energy Star) in San Francisco and Chicago show an increase in occupancy after the policy implementation. For Class B buildings, the multiple-group ITSA shows that New York City and Washington, D.C. experienced a decreasing trend in occupancy rate after the policy implementation, while San Francisco experienced an increase. There is also no evidence to indicate that occupancy rates performed differently between Energy Star and non-Energy Star groups for the aforementioned cities. Chicago is the only city with statistically significant differences between the two groups (Figure 18b and Table 6).
The results from the single-group ITSA complemented the previous analysis. For Class A buildings, the grouping of Washington, D.C. with New York City as well as San Francisco with Chicago is maintained with the former group showing a slower occupancy increase than the latter. For Class B buildings, in contrast to all other cities, Washington, D.C., maintains a smaller ratio of the buildings that have increasing occupancy. Again, the financial crisis may be a confounding factor for the result.
This preliminary study has certain limitations. Due to the limited timeframe since the policy implementation in certain cities, the established model may not fully capture the trend of occupancy rate before the policy implementation and therefore will limit the forecast of the trend after the policy implementation. With more available data points in the future, a more sophisticated ARIMA (autoregressive integrated moving average) model could improve the forecast accuracy. Additionally, an external confounding factor such as the financial crisis may have affected our analysis results.

Conclusions and Policy Implications
The implementation of energy benchmarking and disclosure policies aims to raise awareness of energy-efficient properties among owners, investors, and tenants. Consequently, they are expected to motivate owners of less energy-efficient buildings to invest in energy retrofits to improve the energy and sustainability performance of their buildings. This study focuses on assessing the impact of such policies in relation to the real estate performance of office buildings.
To contribute to the body of knowledge in sustainability, public policy, and real estate, this research investigated the impacts of the benchmarking policy on real estate performances by applying two types of ITSAs to office buildings in four cities across the U.S., namely, New York City; Washington, D.C.; San Francisco; and Chicago. Initially, we assessed the impact of the policy on real estate performances between energy-efficient and non-energy-efficient buildings based on the aggregated data (i.e., the mean of occupancy rates) by using the multiple-group ITSA. To avoid the potential issue of information loss due to the use of aggregated data, the second analysis focused on the impact of the policy on real estate performance of each building using the single-group ITSA and counted how many buildings under each group showed statistically significant and positive results.
Generally, the results revealed that for some cities, the Energy Star buildings have better real estate performances in both analyses, but it is hard to conclude that the policy impacts on Energy Star buildings are more positive than those on non-Energy Star buildings. Specifically, the results obtained from the first analysis revealed that the energy policies might not immediately affect the real estate performance of office buildings. However, the real estate performances of energy-efficient Class A buildings exhibit gradually increasing trends once the policies have been in place for a while, which is evidenced by the ITSA of all four cities. For Class B buildings, the results are mixed for New York City, while Washington, D.C., exhibited a decline in the real estate performance after the policy implementation. This effect may also have its roots in the financial crisis, as the policies were implemented in 2009 and 2008, respectively, in these two cities, with rents being much higher in these properties.
The result from the single-group ITSA is consistent with the result of the first analysis. For San Francisco and Chicago, the energy-efficient buildings have higher 'positive and significant/significant' ratios, which implies that the energy-efficient buildings are more likely to be positively affected by the policy. However, such ratios are relatively low for New York City and Washington, D.C., which may be caused by other confounding factors, such as the financial crisis. To reach more robust conclusions, more sophisticated analyses with additional data are needed in future work to account for the confounding effects, such as including control-group cities without disclosure policies in the analyses.
These findings have implications for other cities or countries considering energy policies and their impact on real estate performance. Immediate impacts of energy policies on real estate performance were limited, but gradual improvements were observed over time for energy-efficient Class A buildings. Different building classes yielded mixed results across cities, emphasizing the importance of tailored policies for each class within a coun-