Roofpedia: Automatic mapping of green and solar roofs for an open roofscape registry and evaluation of urban sustainability

Sustainable roofs, such as those with greenery and photovoltaic panels, contribute to the roadmap for reducing the carbon footprint of cities. However, research on sustainable urban roofscapes focuses largely on their potential and is hindered by the scarcity of data, limiting our understanding of their current content, spatial distribution, and temporal evolution. To tackle this issue, we introduce Roofpedia, a set of three contributions: (i) automatic mapping of relevant urban roof typology from satellite imagery; (ii) an open roof registry mapping the spatial distribution and area of solar and green roofs of more than one million buildings across 17 cities; and (iii) the Roofpedia Index, a derivative of the registry, to benchmark the cities by the extent of sustainable roofscape in terms of solar and green roof penetration. This project, partly inspired by its street greenery counterpart 'Treepedia', is made possible by a multi-step pipeline that combines deep learning and geospatial techniques, demonstrating the feasibility of an automated methodology that generalises successfully across cities, with an accuracy of detecting sustainable roofs of up to 100% in some cities. We offer our results as an interactive map and open dataset so that our work can help researchers, local governments, and the public to uncover the pattern of sustainable rooftops across cities, track and monitor the current use of rooftops, complement studies on their potential, evaluate the effectiveness of existing incentives, verify the use of subsidies and fulfilment of climate pledges, estimate carbon offset capacities of cities, and ultimately support better policies and strategies to increase the adoption of instruments contributing to the sustainable development of cities.


Introduction
As urban population continues to grow, cities need to search for new spaces to feed the expansion. Traditionally, cities expanded horizontally to accommodate more housing, amenities, and public spaces, and this process took away forests and agricultural land which had a direct impact on the surrounding environment (Benson-Lira et al., 2016;Du et al., 2007). According to Murshed et al. (2018), compact and vertical cities are considered to be more energy-efficient and environmentally friendly, and one method of increasing the compactness of cities is to build on the roofs where there is a multitude of potential uses including civic spaces, urban farms, and solar power.
A roof serving a dual purpose is not a new invention. The flat roofs of traditional Egyptian houses were often used as an extension of their living spaces (de Garis Davies, 1929), and Le Corbusier (1927) believed that a roof garden is an essential element of modernist architecture. Today, lush sky gardens with swimming pools have become a symbol of high-end residential developments, fetching considerable premiums over ordinary condominiums (Shukri and Misni, 2017) and offering residents a place to get in touch with nature, which is critically correlated to happiness in an urban setting (Han and Kim, 2019).
Besides hosting civic activities, the roofscape also has enormous potential in improving the climatic and energy performance of a building and reducing or offsetting its carbon footprint, e.g. by providing thermal insulation, which may result in decreased energy consumption (Coutts et al., 2013; Jaffal et al., 2012; Theodosiou, 2003; Wong et al., 2021). Yang et al. (2018) show that green roofs can reduce a building's heat gain by 31% on a summer day and, when applied at large scale, green roofs could mitigate the urban heat island (UHI) effect, with an average decrease of the peak ambient temperature close to 0.9 K (Santamouris, 2014). These benefits would increase the thermal comfort of urban residents and reduce excessive air-conditioning energy usage due to the UHI effect (Wong et al., 2003a,b; ...).
First, we develop an automated method to identify rooftops that are vegetated and/or have photovoltaic system installations. The method relies on high-resolution satellite imagery, which is available for an increasing number of cities around the world, facilitating replication and scalability of our work. Second, we implement the method on 17 cities to create a cadastre of the roofscape, revealing the quantity and spatial distribution of rooftops that contribute towards a city's sustainable development. Our dataset contains information on more than one million buildings, and we are making this sustainable roofscape inventory available publicly, aiding researchers, practitioners, local governments, and the public to understand the current status of the roofscape in the context of sustainable urban development and achieving carbon neutrality. Third, our registry enables us to benchmark and compare cities with regard to the utilisation of their roofscape. We establish the level of penetration of green and solar roofs in each city, and quantify it, creating the Roofpedia Index. The scores for the 17 cities in our study are computed, and the cities are compared.
Some of the aspects of the project and its name, Roofpedia, are inspired by Treepedia (Seiferling et al., 2017; MIT Senseable City Lab, 2016), a project to measure and map the amount of street greenery in cities from the pedestrian perspective, and compare cities around the world. A notable characteristic of our modular work is that it enables replication and extensibility; thus, additional cities can be included in Roofpedia in the future, from mapping the roofscape to calculating their index for benchmarking purposes, and potentially supporting future extensions to include other typologies of rooftops such as social uses. Furthermore, this service may potentially provide a basis for a crowdsourced platform that would integrate additional information on rooftops.
The paper is organised as follows. In Section 2 we expand on the introduction and continue the literature review. Section 3 describes our methodology for automatically mapping green and solar roofs, and evaluates the results we achieved. Section 4 analyses the results, presents our database, and the derived Roofpedia Index to assess the performance of cities. In Section 5 we discuss the work, expose limitations, and map out ideas for future work. Section 6 concludes the paper.

Sustainable roof policies and documentation
From the previous section on the benefits of a sustainable roofscape, it is evident that the roof could offer much more to a city than just being a plain enclosure for shelter and interior comfort. The roofscape of a city can contribute positively to a city's environment and quality of life, especially when instruments for sustainable development such as vegetation and PV panels are applied widely.
The potential is even greater when considering that not only new buildings may be enriched with green and solar installations. For example, many existing unvegetated roofs can be converted into green roofs (Wilkinson and Reed, 2013), which may be especially appropriate for buildings with flat roofs and sufficient structural strength (Clark et al., 2008;Stovin, 2010;Castleton et al., 2010).
While green roofs are a viable expenditure from the societal, environmental and economic perspectives, they come at a cost, and their installation is inhibited by high investments (Nelms et al., 2007; Bianchini and Hewage, 2012; Sproul et al., 2014). To offset these costs, many local governments worldwide have introduced various incentives and guidelines to encourage the utilisation of rooftop spaces for activities that contribute to sustainable development (Claus and Rousseau, 2012; Chen, 2013). For example, sky gardens for communal uses are exempted from Gross Floor Area calculations in Hong Kong (Hong Kong Buildings Department, 2019), and Singapore's National Parks Board will fund up to 50% of installation costs of green roofs for developers (National Parks Board, 2009). The incentives in these two land-scarce and densely populated jurisdictions also suggest the proliferation of the role of roofscapes in transcending the frontiers of space utilisation, especially in promoting carbon offsetting. In the case of solar roofs, private PV panel owners in the United States are allowed to 'turn back' their meter if they use less energy than the solar panel generates (U.S. Department of Energy, 2020). Residents in the Australian state of Victoria can enjoy a rebate for the installation of a solar panel system with a four-year, interest-free loan for the remaining amount (Solar Victoria, 2020). While the above incentives could potentially increase the adoption of sustainable roofs, publicly available data on the current state of adoption in most cities is very limited, making it difficult to assess the effectiveness of current policies.
Some projects have endeavoured to tackle this issue by mapping rooftop typologies at the city-scale. There are two initiatives that are closely related to our work. First, the City of Melbourne (2016) created a one-of-a-kind roof registry that maps the city's existing green and solar roofs and also potential roofs that could be converted to sustainable roofs. Another study was done in Berlin by the Senate Department for Urban Development and Housing (2017). An automatic mapping was carried out with GIS processing that returned detailed statistical data on the area and location of green roofs in the city. Both projects offer an interactive map of the dataset on sustainable roofs and provide an open database for further research.
Figure 1: Illustration describing the project and its organisation in three parts.
These two projects set a commendable example in popularising roof data, as they enable everyone to explore the roofscape intuitively and may spark interest in the topic. However, these projects have their fair share of limitations. For example, the project in Berlin is restricted to mapping green roofs. Further, both are focused on a single city, which does not allow easy comparison of how a city fares against others when it comes to sustainable installations on its rooftops. Additionally, no statistical validation was carried out to assess the accuracy of their methodology, nor has the work been published in international scientific outlets.
In this paper, we seek to overcome their shortcomings. We advance the steps taken by these two projects further and aim to map both green and solar roofs across multiple cities to create a global registry of sustainable roofscapes. We aim to assess the feasibility and accuracy of an automated mapping methodology and present the information intuitively to the public. It is important to underline that our work is not confined to a single city: our cross-city analysis not only includes more than a dozen cities, but is also scalable further as long as satellite imagery of a quality comparable to that of the training set is available. Such scalability has allowed us to create a cross-city benchmark (Roofpedia Index) that measures the penetration of solar and green roofs across 17 different cities. Furthermore, we have provided open access to the Roofpedia prediction pipeline, with which users can carry out the mapping and estimations using their own imagery. Additionally, our methodology of using satellite imagery allows us to examine the temporal evolution of sustainable rooftops, which would enable multiple use cases in the future, e.g. understanding the effectiveness of policies on a temporally fine scale. As photovoltaic capacity is dynamic and rapidly evolving, it is important to track changes through time.

Automated roof mapping with deep learning
In recent years, deep learning, especially Convolutional Neural Networks (CNNs) (Shelhamer et al., 2017), has offered new opportunities for object detection in optical remote sensing images (Cheng and Han, 2016), and it has been used for various applications in the built environment (Middel et al., 2019; Chen et al., 2021). Basu et al. (2015) demonstrated that deep learning algorithms can classify satellite image tiles with an accuracy of 97.95%. Going further in granularity, Chen et al. (2014) created a pipeline for vehicle detection in satellite images using a Deep Convolutional Network.
There exists a multitude of research on the feasibility of mapping the roofscape with CNNs from satellite or aerial images. Chen and Li (2019) showed that CNNs can detect and segment building boundaries of small detached houses from aerial images in the city of Christchurch. Castello et al. (2019) demonstrated a deep learning method that predicts solar panels at pixel level with an accuracy of about 94% and an Intersection over Union of up to 64% using high-resolution aerial photos, which is a significant leap compared to traditional machine learning approaches (Malof et al., 2016). However, the research was only an evaluation of a small region, and its scalability remains to be tested.
While aerial images are still limited to selected areas, satellite images have a wider coverage, and previous studies have demonstrated that deep learning methods on satellite images can still provide reliable predictions of the shapes and the sizes of building roofs or building footprints (Nosrati and Saeedi, 2009; Lee et al., 2019). Therefore, deep learning performed on satellite images is not limited to a specific area but has the potential to scale up to cover many cities around the globe.
Using a CNN architecture called Inception-v3 and a large-scale training set of 366,467 images, Yu et al. (2018) created a solar installation database for the contiguous U.S. This project demonstrates the scalability of computer vision algorithms in mapping solar panels from satellite images, and their results are encouraging for the development of our project. However, the study provides visualisation only at the scale of census tracts, unlike the Melbourne study where the resolution is at the building level. It is also difficult to use the same method on green roofs as there is yet to be a large labelled green roof dataset available. Further, while appreciating the scope of the study given the large extent of the U.S., the project is still focusing on only one country.
Nevertheless, compared to traditional spectral classification, CNNs can be used on lower quality images that are widely available in the public realm. Using a modified CNN pipeline with U-Net architecture, in the next section, we present an automated method to identify both green and solar roofs from satellite images and assign corresponding labels to building datasets derived from OpenStreetMap to create a roof registry at the building level for spatial visualisation and further analyses.
Besides the novelties and contributions discussed in Section 2.1, our method has several advantages compared to existing methods mentioned in this section. Firstly, we present work at a substantially higher spatial resolution than the current state of the art (i.e. census tract level to building level). Secondly, the model is created with international scalability in mind and it can be used in multiple cities around the world with different urban morphology and building typology. Thirdly, our method identifies both green and solar roofs, which adds another dimension to understanding the sustainability of the roofscape across cities. Finally, as the cities are assessed with the same model and within the same system, comparisons between cities would be fairer and easier than comparing results from different research projects. This advantage enables us to create a sustainable index that ascribes standardised scores to the state of sustainable roof practice in each city. Therefore, one of the outcomes of the work is not only a database of such roofs across many cities, but also gauging how sustainable and effective cities are when it comes to unlocking the space provided by rooftops.

Data and study areas
To demonstrate the scalability and feasibility of our methodology, 17 morphologically diverse and geographically distributed cities spanning Europe, North America, Australia, and Asia are selected as our study areas. These are: Berlin, Copenhagen, Las Vegas, Los Angeles, Luxembourg City, Melbourne, New York, Paris, Phoenix, Portland, San Diego, San Francisco, San Jose, Seattle, Singapore, Vancouver, and Zurich.
The satellite imagery used in this study is retrieved with the Mapbox static tiles API, except for Luxembourg, for which the data is acquired from the open data of the Luxembourgish Land Registry and Topography Administration, affirming that our method is not constrained to one data source. Mapbox has a generous free tier usage limit, which makes it possible to obtain imagery of several cities. According to Mapbox, the images available for access are from a combination of different sources, including Maxar's Vivid products for much of the world, Nearmap and USDA's NAIP 2011-2013 in the contiguous United States, and open aerial imagery from Denmark, Finland, Germany, and other regions. These datasets offer resolutions of about 50 cm or better in major cities in the world, and provide an effective way to assess the scalability and adaptability of our methodology. Data on building footprints have been extracted from OpenStreetMap, which is being increasingly used in research (Cerri et al., 2020; Feldmeyer et al., 2021), and which has a very good level of completeness for the study areas we have selected (Fan et al., 2014; Biljecki, 2020). The building footprints serve a dual purpose: they filter out greenery and solar panels not installed on rooftops, and they are used to integrate the results of our work, enriching the building data. The visualisations of the final result are rendered on Mapbox for public access, and both the dataset and code are openly available on GitHub.
In our research, we have considered a few more cities besides the listed ones (e.g. Washington D.C. and Marseille). However, we eventually did not include them in our database and index, because the performance of the model in identifying sustainable roofs deviated there, largely due to external factors such as the quality of imagery and the presence of particular features in these cities (e.g. a high density of skylights that tend to be misidentified as solar panels). Nevertheless, considering these cities was vital for understanding the data requirements and exposing limitations of the method, and further examples are given in Section 3.3.
Our methodology can be broken down into four steps (Figure 2). First, we label the training images across multiple diverse locations and tile the satellite images to a uniform shape for the deep learning algorithm. The polygonal labels required for the training are manually labelled across multiple cities to create a wide range of examples in different urban contexts and image conditions. Secondly, a Convolutional Neural Network based on a modified U-Net architecture is trained on the processed dataset. The network is initialised with a pre-trained model (ResNet50) to improve the accuracy of the prediction. Thirdly, the building footprints of the predicted areas are passed in as new inputs, and the probabilities predicted by the CNN are converted into georeferenced polygons to tag the respective building footprints with either 'Solar' or 'Green' or both. Lastly, the resulting semantically enriched building footprints are analysed quantitatively and visualised spatially to create the Roofpedia Registry and the Roofpedia Index.
For training, validation and testing, there are a total of 1,517 labelled tiles for rooftop greenery and 1,380 labelled tiles for solar panels.

Labelling and tiling
For training, we have used the imagery of 8 instead of all 17 cities in our study to determine whether the trained models will be scalable in new locations. In each of the cities, areas are randomly selected during labelling and thus the labelled data do not represent all green and solar roofs in the city. Nevertheless, they are sufficiently representative of the green and solar roof typology in the city, and as the continuation of this paper will demonstrate, the model trained on these locations scales successfully in new areas without training data, facilitating the extension of the work to easily and accurately include new cities after the publication of this paper.
After creating training labels, we slice the satellite images and their vector polygon masks into tiles at zoom level 19, following the Slippy Map tiling convention described by OpenStreetMap. This zoom level enables the model to focus on roof objects and minimise confusion with objects outside of the building boundaries. The images and masks are further sliced into 256 × 256-pixel tiles for training. Labelled polygons are similarly sliced and converted into image masks for training (Figure 4). In total, 2,897 images are prepared, with 80% serving as training data and 20% as test data to validate the accuracy of the model.
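The tile indexing implied by the Slippy Map convention can be sketched as follows. This is a minimal illustration using the standard Web Mercator tile formula, not the project's own tiling code; the function names `lonlat_to_tile` and `ground_resolution` are hypothetical.

```python
import math

TILE_SIZE = 256  # pixels per tile side, matching the 256 x 256 training tiles


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) Slippy Map tile index that contains a WGS84 point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern/southern edge stay inside the grid.
    return min(x, n - 1), min(y, n - 1)


def ground_resolution(lat_deg, zoom):
    """Approximate metres per pixel of a Web Mercator tile at a given latitude."""
    equator_m = 2 * math.pi * 6378137.0  # WGS84 equatorial circumference
    return equator_m * math.cos(math.radians(lat_deg)) / (TILE_SIZE * 2 ** zoom)
```

At zoom level 19 the tile grid has 2^19 tiles per axis, and `ground_resolution(0.0, 19)` evaluates to roughly 0.3 m per pixel at the equator, becoming finer towards the poles.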

Image segmentation with deep learning
The deep learning model used is a type of Fully Convolutional Neural Network (Shelhamer et al., 2017), which excels at image boundary identification and instance segmentation. The model is modified from U-Net (Ronneberger et al., 2015), with a pre-trained ResNet50 (He et al., 2015) as its encoder. The U-Net architecture is chosen because it relies on data augmentation, which uses the available training samples more efficiently. As such, a U-Net can be trained from very few images and still provide reliable results. By using the pre-trained ResNet50 as the encoding layer of the U-Net, we leverage the power of transfer learning to reduce false negatives and further improve the validation accuracy of the model (Shin et al., 2016; Hussain et al., 2018).
We use PyTorch (Paszke et al., 2019) for building the deep learning model and we have borrowed a preprocessing pipeline from Robosat, a satellite feature extraction toolbox (Ng and Hofmann, 2018). We created our own pipeline integrating these packages, which is available publicly for reproducibility.
To avoid confusion in shared features between greenery and solar panels, the model is trained on the solar dataset and green dataset separately, learning the weights independently for each feature. During training, the model learns the respective features that make up a patch of rooftop greenery or a solar panel from the input image and mask, and validates its accuracy against the test set. Figure 5 shows the results of the prediction model against the ground truth in the evaluation set for the green and solar models respectively. The performance of the model is measured by Mean Intersection over Union (mIoU), which measures the number of pixels common to the ground truth and prediction masks divided by the total number of pixels present across both masks. The model for identifying solar portions of roofs achieved an mIoU score of 0.784, while the green roof counterpart achieved an mIoU of 0.396. Compared to solar panels, which have a well-defined edge, some green roofs contain trees and vegetation that do not have a defined edge. This fuzziness has resulted in a lower mIoU score for the green roof model as it struggles to find these edges. Nevertheless, this shortcoming does not have a major influence on the quality of the final result, as only the locations of the predicted polygons are used in GIS operations.
Figure 6: Geospatial engineering of the predicted result: tagging buildings with sustainable rooftops based on detected greenery and solar panels on them.
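The Intersection over Union measure described in this section can be illustrated with a short sketch. It assumes binary masks given as nested lists of 0/1 values and averages the score over mask pairs; the exact averaging used in the project is not stated in the text, so the `mean_iou` helper and the convention for two empty masks are assumptions.

```python
def iou(pred_mask, truth_mask):
    """Intersection over Union of two binary masks of identical shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred_mask, truth_mask):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # pixel present in both masks
            if p or t:
                union += 1  # pixel present in at least one mask
    # Assumption: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(mask_pairs):
    """Average IoU over an iterable of (prediction, ground truth) pairs."""
    scores = [iou(pred, truth) for pred, truth in mask_pairs]
    return sum(scores) / len(scores)
```

For example, a prediction `[[1, 1], [0, 0]]` against a ground truth `[[1, 0], [1, 0]]` shares one pixel out of three covered in total, giving an IoU of 1/3. A mask with fuzzy edges loses score in exactly this way even when its location is right.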

Classification with geospatial engineering
Even though the training labels of the deep learning model only contain solar panels and greenery on building rooftops, the entire image tile is passed through the model during training and inference. This process causes features that look like solar panels or greenery outside of the building rooftops to be picked up. For example, green patches in a park or trees on the streets could potentially be identified since they share similar features with rooftop gardens. Open-air carparks also have a similar grid-like feature that could be confused with solar panels. While this issue can technically be solved by complex image pre-processing procedures within the model architecture (e.g. creating a cropped image for each building), doing so would increase the size of the data and the complexity of the model, and thus the computational cost of training and prediction.
To avoid this issue, we added a post-processing algorithm to remove false positives and restrict the predictions to the building polygons. The prediction results from the deep learning model are converted into georeferenced polygons for GIS processing. When converting the pixel masks into predicted polygons for GIS operations, there is a small degree of rectification to simplify the polygons, reducing the number of edges while maintaining the general shape of the prediction. In addition, a denoiser is added to remove 'speckles' and so reduce the number of false positives. The setting of the denoiser is fine-tuned to provide the best balance between accuracy and precision.
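The text does not specify how the denoiser works. One common way to remove speckles is to drop connected components below a size threshold, sketched here under that assumption; the function name `remove_speckles` and the `min_pixels` parameter are hypothetical and not taken from the Roofpedia code.

```python
from collections import deque


def remove_speckles(mask, min_pixels):
    """Return a copy of a binary mask without 4-connected blobs smaller than min_pixels."""
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Keep only components that are large enough to be plausible.
            if len(component) >= min_pixels:
                for y, x in component:
                    cleaned[y][x] = 1
    return cleaned
```

Raising `min_pixels` removes more false positives but also discards small true detections, which is the trade-off the denoiser setting has to balance.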
Following the conversion, the resultant polygons that do not intersect building polygons are removed, and polygons that intersect building polygons but are too irregular or insignificant are ignored, since we have noticed that these are predominantly false positives. Finally, building footprints that contain or intersect with the cleaned prediction set are assigned a label depending on whether they have a solar or green roof, or both (Figure 6).
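The tagging step can be sketched in simplified form. To stay self-contained, this illustration replaces real footprints and predicted polygons with axis-aligned boxes and uses a minimum overlap area as a stand-in for the 'insignificant' filter; the irregularity check is not modelled. The `Box` type, the `tag_buildings` function and the `min_area` threshold are all hypothetical, and a real pipeline would operate on georeferenced polygons with a GIS library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle used here as a stand-in for a polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def overlap_area(a, b):
    """Area shared by two boxes, or 0.0 if they do not overlap."""
    width = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    height = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return width * height if width > 0 and height > 0 else 0.0


def tag_buildings(buildings, solar_preds, green_preds, min_area=1.0):
    """Label each building with 'Solar', 'Green', both, or neither.

    Predictions that miss every footprint never produce a label, and
    overlaps smaller than min_area are ignored as likely false positives.
    """
    tags = {}
    for building_id, footprint in buildings.items():
        labels = set()
        if any(overlap_area(footprint, p) >= min_area for p in solar_preds):
            labels.add("Solar")
        if any(overlap_area(footprint, p) >= min_area for p in green_preds):
            labels.add("Green")
        tags[building_id] = labels
    return tags
```

Because labels are attached to whole footprints, a prediction only needs to be in the right place rather than to have an exact outline, so fuzzy prediction edges matter less at this stage.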

Results and evaluation
To determine the scalability and accuracy of the prediction, 11 neighbourhoods in different cities are selected for evaluation. These areas are not included in the original training set and their ground truths are labelled manually. As such, the results obtained are unbiased and effective in measuring the adaptability of the model across cities. Furthermore, some of the neighbourhoods are in cities excluded from the training data, which helps to measure the adaptability of the model when applied to a new city on which it has not been trained. The size of the neighbourhoods varies, ranging from 475 to 3425 buildings, and all buildings in the selected areas are carefully labelled as the ground truth, which is then compared against the predicted results.

Metrics for evaluation
We evaluate the performance of the methodology in terms of both the number of buildings (Count) and the area of rooftops (Area) identified with green or solar roofs. A model that predicts Count accurately allows us to analyse the distribution of the identified rooftops in the urban fabric, whereas a model that predicts Area accurately allows us to analyse how extensively a city has adopted sustainable roof typologies across its total roof area. Learning from the performance metrics proposed by Wentz and Zhao (2015), the following variables are used for calculating the performance of the method. These variables are also shown as columns in Tables 1-4.

Green roof results
Table 1 and Table 2 present the prediction results of green roofs against the manually labelled ground truth. Using absolute Count as the metric, the accuracy of the model reached 77.45% of the ground truth, and the average %FP reached 1.92%. Comparing the Area of the prediction with the ground truth, accuracy increased to 87.59% but %FP also increased, to 3.66%.
In both cases, %FPs are under 5% for most of the areas except Berlin 1 (Stadtmitte) and Berlin 2 (Friedrichshain). The percentage of matching (%Matching) is generally above 80% in terms of Area except for New York City 1 (Hudson Square), and above 70% in terms of Count except for Zurich (Altstetten) and New York City 1 (Hudson Square). Looking closely into these regions, we observed that the model had been confused to some degree. As shown in Figure 7, some roofs in Berlin (7a) have a light green and 'furry' appearance, which the model confuses with a green roof. In the case of Zurich (7b, 7c), the model failed to recognise seasonal changes in green roof colour (browning of leaves in autumn) in the presence of solar panels. It successfully picked up some badly maintained green roofs (dark brown in colour with a furry texture) but confused them with dark brown pitched roofs in the vicinity. Finally, the model also failed to identify some green roofs on smaller buildings, such as in New York City (7d). This issue might be caused by the removal of noise in the prediction pipeline, where some true positives are sacrificed to reduce the percentage of false positives (in the case of New York City, removing noise contributed to an average 67.6% reduction of false positives (4.17% to 1.35%) at the cost of a 5.48% decrease of accuracy (88.38% to 83.54%)).

Solar roof results
Table 3 and Table 4 present the prediction results of solar roofs against the manually labelled ground truth. Using the same method of calculation as that of the green roofs, the accuracy of the solar roof model in terms of Count reached 91.90%, with an average %FP of 1.00%. Accuracy in terms of Area increased to 94.06%, while %FP also increased, to 4.42%. This performance suggests that larger buildings might have higher chances of having features that confuse the model into categorising them as solar roof buildings.
There is some confusion between building skylights and solar panels, as they have almost identical features and are difficult to distinguish even for humans. This issue is the major cause of the rise of %FP in Berlin (8a) and Washington D.C. (8b), as illustrated in Figure 8. Another factor that causes confusion is the angle of projection of tall buildings. When the satellite photo is not ortho-rectified, the image may show the building facade, where the glass windows can be confused with solar panels, as seen in Marseille (8c). Lastly, objects that share similar features with solar panels, such as dark rectangles with a white border, might also be counted as false positives. These objects could be rooftop sunshades or stadium seats, such as the example in Melbourne (8d).

Evaluation conclusion
The evaluation results suggest that the percentage of false positives for both solar and green roofs is within acceptable levels, and that the two models are sufficiently accurate to provide insightful city-wide spatial and quantitative analyses, both in the location and size of sustainable instruments on rooftops. In future work, it would be worthwhile to expand our approach by integrating other forms of data. For example, point clouds obtained from airborne Light Detection and Ranging (LiDAR) might provide further information that would aid the classification, such as 3D geometry and intensity of the returns, which may hint at the characteristics of the surface.

Roofpedia registry and index
The results from the predictions on the 17 cities (Section 3) are used to create (i) a prototype of a sustainable roof registry, i.e. open dataset and web viewer (Section 4.1); and (ii) an index to assess the penetration of sustainable roof typologies in cities (Section 4.2).

Roof registry
The roof registry is derived for the 17 cities and it is released as an openly accessible interactive map (Figure 9) that visualises the spatial distribution of green and solar roofs in two styles. At smaller scale levels, the centroids of the buildings are shown as yellow (solar roofs) or green (green roofs) dots, which makes the city-wide distribution pattern easy to identify. At larger scales, the building polygons are shown to highlight buildings with sustainable roofs, and addresses can be searched. Further, the generated data can be visualised as a heatmap to understand the density of sustainable rooftops.

Roofpedia Index - a measure for rooftop sustainability across cities
The idea of the index is comparable to the one described by the Treepedia project (MIT Senseable City Lab, 2016), which measures the street canopy cover in different cities with a Green View Index (Li et al., 2015;Seiferling et al., 2017). In a similar fashion, we create a city-scale summarised index for the roofscape, which assigns scores based on the density and area coverage of green and solar roofs in a city. We hope to raise curiosity and awareness in rooftop utilisation among the public and also provide planners and researchers with a barometer to assess the effectiveness of roofscaping policies in different cities. As indicated in the previous section, the percentage cover of green and solar roofs is calculated based on Count and Area separately. Calculating the percentage by Count helps us to understand the degree of adoption by individual owners, while calculating the percentage by Area indicates the extensiveness of the cover. For example, the evaluation subset in New York City (NYC2) has a low coverage of green roofs in terms of Count (3.51%) but a much higher coverage in terms of Area (27.59%). Such a discrepancy suggests that green roofs in NYC are mostly installed on several large buildings while smaller buildings have yet to adopt green roofs. Conversely, although the evaluated subsets Marseille and Berlin 2 have a similar percentage of cover by Area, Berlin 2 has a 2.9 times higher percentage by Count (3.72% vs 1.28%). This difference suggests that the density of buildings with solar roofs is higher in Berlin and more homeowners and developers have adopted solar panels on their roofs.
The Roofpedia Index recognises that both factors are important and takes them into consideration. %Count (Equation 4) and %Area (Equation 5) for both solar and green roof penetration are calculated for each city and normalised with Equation 6 and Equation 7. A combined score is calculated for both solar and green roofs with Equation 8 and Equation 9. An overall score that aggregates both the Solar Score and Green Score is then calculated by taking the average of the sum of two scores as in Equation 10.
\[ \%\text{Count} = 100 \times \frac{\text{Matching Count (TP)}}{\text{Total Count}} \tag{4} \]
\[ \%\text{Area} = 100 \times \frac{\text{Matching Area (TP)}}{\text{Total Area}} \tag{5} \]
\[ \text{Score by Count} = 100 \times \frac{\%\text{Count} - \min(\%\text{Count})}{\max(\%\text{Count}) - \min(\%\text{Count})} \tag{6} \]
\[ \text{Score by Area} = 100 \times \frac{\%\text{Area} - \min(\%\text{Area})}{\max(\%\text{Area}) - \min(\%\text{Area})} \tag{7} \]
\[ \text{Solar Score} = \frac{\text{Score by Count}_{\text{Solar}} + \text{Score by Area}_{\text{Solar}}}{2} \tag{8} \]
\[ \text{Green Score} = \frac{\text{Score by Count}_{\text{Green}} + \text{Score by Area}_{\text{Green}}}{2} \tag{9} \]
\[ \text{Overall Score} = \frac{\text{Solar Score} + \text{Green Score}}{2} \tag{10} \]
Table 5 and Table 6 show the ranking of Solar and Green Roof coverage of the 17 cities and Table 7 shows the combined ranking of both scores, deriving a holistic index indicating the proliferation of sustainable roofscapes in cities.
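The index calculation in Equations 4 to 10 can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's implementation: the function names, the three placeholder cities and their percentages are made up for the example, and the handling of the case where all cities have the same percentage is an assumption that the equations do not cover.

```python
from typing import Dict

Scores = Dict[str, float]


def percent(matching: float, total: float) -> float:
    """Equations 4 and 5: matching (TP) count or area as a percentage of the total."""
    return 100.0 * matching / total


def min_max_score(values: Scores) -> Scores:
    """Equations 6 and 7: rescale a percentage to 0-100 across the cities."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # Assumption (not in the source): if all cities tie, every score is 0.
        return {city: 0.0 for city in values}
    return {city: 100.0 * (v - lo) / (hi - lo) for city, v in values.items()}


def mean_score(a: Scores, b: Scores) -> Scores:
    """Equations 8 to 10: average two scores city by city."""
    return {city: (a[city] + b[city]) / 2 for city in a}


# Made-up percentages for three placeholder cities (not results from the study).
solar_pct_count = {"A": 0.0, "B": 5.0, "C": 10.0}
solar_pct_area = {"A": 0.0, "B": 4.0, "C": 16.0}
green_pct_count = {"A": 30.0, "B": 10.0, "C": 0.0}
green_pct_area = {"A": 40.0, "B": 20.0, "C": 0.0}

solar_score = mean_score(min_max_score(solar_pct_count), min_max_score(solar_pct_area))
green_score = mean_score(min_max_score(green_pct_count), min_max_score(green_pct_area))
overall = mean_score(solar_score, green_score)

# Print the cities from the highest to the lowest overall score.
for city in sorted(overall, key=overall.get, reverse=True):
    print(city, round(solar_score[city], 1), round(green_score[city], 1), round(overall[city], 1))
```

Because the normalisation is a min-max rescaling, every score is relative to the set of cities being compared: adding or removing a city that holds the minimum or maximum changes the scores of all the others.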
The results offer material for discussion and further deliberations. Looking at the ranking for green roofs, Zurich tops the chart with 41.6% cover in Area and 31.2% cover in Count. This impressive lead in green roof coverage affirms the efforts taken by the Zurich City Government in making Green Roofs mandatory for all new buildings since 1991 (Tiefbau und Entsorgungsdepartement, 2020). Berlin comes in second place with approximately 24.8% in Area and 13.6% in Count. This result also presents a substantial lead compared to the rest of the cities in the study and echoes the long tradition of green roofs in Berlin as early as the beginning of the 19th century (Ahrendt, 2007). New York ranks highest in the U.S. with the largest green roof cover. With the new Climate Mobilization Act (NYC Mayor's Office of Sustainability, 2019), which made solar or green roofs mandatory on all new construction as well as buildings undertaking major roof renovations in the city, it is highly likely that New York will continue to maintain its edge in a sustainable roofscape and increase its score in the years to come.
[Figure: (a) Mapping buildings with sustainable rooftops (example for New York City). (b) A visualisation for understanding the spatial density of solar roofs (example for Singapore).]
In the case of solar roofs, Las Vegas tops the chart with 17.3% coverage in Area. This result is probably due to the high solar potential of the geography. Phoenix is another city in the U.S. that leverages its high solar potential throughout the year to harness solar energy. Other highly ranked cities in the study offer incentives to subsidise solar panel installation, supporting our vision that this work can be used to assess and monitor the effectiveness of policies. For example, homeowners in Melbourne can enjoy up to AUD 1850 plus the option of an interest-free loan for their solar roof installation until mid-2021 under the Solar Victoria incentive (Solar Victoria, 2020), and residents in Copenhagen enjoy tax deductions of up to DKK 7000 to encourage the installation of renewable energy plants such as solar panels (Bundgaard and Lexner, 2011).
It is important to note that the Roofpedia Index is not created to promote competition among cities, as each city has its unique characteristics, and the exact benefit of greenery or solar panels on rooftops depends much on the urban morphology (Ng et al., 2012). The adoption of these rooftop typologies is also affected by the geolocation and macroclimate of the city. In drier areas, green roofs are harder to maintain, while in rainy and dark areas, solar roofs might not make economic sense. Taking these limitations into consideration, a city could still be environmentally progressive without a sustainable roofscape. For example, although Vancouver did not perform well in our index, it is nevertheless consistently ranked as one of the most sustainable cities in the world. According to the Sustainable Cities Index 2016 (Arcadis, 2016), Vancouver ranks highest among North American cities in terms of environmental sustainability. Further, according to Treepedia, Vancouver is one of the cities highest on the list in terms of green canopy. Finally, Vancouver has access to plenty of hydropower, which provides 25% of the city's energy need alone, and has plans to derive 100% of the energy used from renewable sources before 2050 (City of Vancouver, 2015). Therefore, we believe that the Roofpedia Index complements existing sustainability indices (e.g. Phillis et al., 2017) by adding a new dimension of consideration in assessing the overall sustainability of a city.

Discussion and limitations
We have demonstrated the feasibility and reliability of a scalable method to identify sustainable roofscapes using computer vision and geospatial techniques. The result from the 17 cities has also revealed insights into the distribution and penetration of sustainable roofscapes in these urban areas. Besides the perennial importance of space utilisation in the built environment and the positive role of greenery and solar panels, our work is also important at a time of the increased collocation of photovoltaic and farming systems, and the growing importance of urban agriculture (Tablada et al., 2018;Ciriminna et al., 2019;Langemeyer et al., 2021).
We have shown that a Convolutional Neural Network could be an alternative approach in roof typology detection apart from traditional classification methods on satellite images. The advantage over the traditional methods is that CNNs are more forgiving of image quality. Where traditional classification methods require multi-spectral satellite imagery that is only available for select cities, our method can cover more cities with RGB images that are available at a reasonable degree of resolution. As such data is becoming available for more locations, additional cities can be added to the registry and index. Our modular pipeline allows not only new cities to be added, but also new roof typologies. Adding a typology requires training a custom model, which can be plugged into the system to predict an additional aspect of the content of roofs.
However, one challenge we have encountered in the process is the impact of inconsistent image quality across cities on prediction accuracy. Running the prediction pipeline on cities with lower quality images results in excessive noise, rendering the prediction unreliable. For example, when predicting rooftop greenery in Singapore, the targets were too blurry even for humans to discern. Fortunately, due to their darker colours and more defined edges, solar panels were much more perceptible than green roofs, so we have included Singapore in the Solar Index and the registry of solar rooftops. A general rule of thumb to follow is that as long as a human could discern the rooftop typology effortlessly and create labels without inferring from the surrounding context, the model would be able to automate the detection accurately. The issue of human legibility is also present in the labelling of datasets. When labelling solar panels on rooftops, we are unable to discern whether the solar panel belongs to a solar thermal module. As such, there is not yet an effective way to separate solar thermal modules from generic solar panels, as they are hard to differentiate even for humans.
Another consideration in interpreting the result of the predictions is the correlation between the percentage of cover area and the actual roof area covered by greenery and solar panels. In our method, the entire footprint area of a building is counted as the cover area because we assumed that the size of the roof area is proportional to the size of rooftop greenery and solar panels. While this is largely true, there are cases where rooftop greenery and solar panels cover only a small proportion of the roof area.
On the note of building footprints, it is important to keep in mind that they are sourced from OpenStreetMap, a volunteered geospatial dataset with global coverage. While the quality is sufficiently good, the data is crowdsourced, and its myriad contributors across the world may have different approaches to mapping buildings. For example, a set of adjacent buildings (e.g. terraced houses) might be mapped as a single building by one mapper, while another one might map them as multiple individual buildings. These different modelling approaches might have an impact on the results.
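The effect of these two considerations (whole footprints counted as cover area, and differing mapping granularity) can be illustrated with a small sketch. The neighbourhood below is entirely hypothetical: a terrace of four 50 m² houses, one of which has solar panels, next to six 100 m² buildings without any. The function name and the data layout are assumptions made for the example.

```python
from typing import List, Tuple

Building = Tuple[float, bool]  # (footprint area in m2, flagged as solar roof)


def pct_count_and_area(buildings: List[Building]) -> Tuple[float, float]:
    """Return (%Count, %Area), counting the whole footprint of a flagged building."""
    total_count = len(buildings)
    total_area = sum(area for area, _ in buildings)
    matching_count = sum(1 for _, flagged in buildings if flagged)
    matching_area = sum(area for area, flagged in buildings if flagged)
    return 100.0 * matching_count / total_count, 100.0 * matching_area / total_area


others: List[Building] = [(100.0, False)] * 6

# Mapper 1: each terraced house is an individual building.
terrace_separate: List[Building] = [(50.0, True), (50.0, False), (50.0, False), (50.0, False)]
# Mapper 2: the same terrace is drawn as a single 200 m2 building.
terrace_merged: List[Building] = [(200.0, True)]

separate = pct_count_and_area(terrace_separate + others)
merged = pct_count_and_area(terrace_merged + others)
print("separate:", separate)
print("merged:  ", merged)
```

In this made-up case the same physical installation yields 10% by Count and 6.25% by Area when the houses are mapped individually, but roughly 14.3% by Count and 25% by Area when the terrace is mapped as one building, which is why the mapping approach should be kept in mind when comparing cities.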
While interpreting the prediction results, we did not take false positives into consideration. As shown in Section 3.3, %FP fluctuates across cities and could potentially reach 11.79% in terms of Area and 2.49% in terms of Count. Although the average %FP is well below 2% in terms of Count and 4% in terms of Area, a city with a high %FP could mistakenly be placed at a higher position in the index, affecting the interpretation of results. Therefore, quick manual inspections are still required on the predicted dataset to make sure the results are not exceptionally skewed. On a broader scope, this line of research also grants an opportunity for the public to participate in the mapping process. Just like OpenStreetMap, an interactive platform can be created for the public to verify the result of the AI models to eliminate false positives, and the models could then be retrained with a larger dataset to improve their accuracy and precision iteratively.
It is important to recognise that this work is focused on rooftops, and it is geared towards elevating the prominence of the roofscape in the sustainability context. However, rooftops are not the only venue for greenery and PV panels. For example, solar panels may be installed on walls of buildings (Saretta et al., 2020), but also on other platforms in cities, e.g. noise barriers (Zhong et al., 2021) and vehicles (Brito et al., 2021). The same goes for urban greening efforts, which may use vertical spaces of buildings such as walls (Tan et al., 2014;Song et al., 2018;Palliwal et al., 2021;Huang et al., 2021). Furthermore, innovations in BIPV technology have created panels that mimic the appearance of traditional building materials (Verberne et al., 2014). Therefore, even if such panels are installed on building rooftops, it is difficult for both humans and the model to detect their presence. A possible direction for future work is to complement our work with other forms of urban data that provide another perspective and cover further venues, e.g. using street view imagery (Tang and Long, 2019;Ding et al., 2021).
Our work has further benefits for use cases in different domains, predominantly in energy and urban planning. For example, the city-wide dataset that we have generated may aid researchers to focus on studies at the district or urban scale, such as large-scale retrofit studies (Ang et al., 2020;Wang et al., 2020;Johari et al., 2020). The dataset may also aid urban planners in understanding the spatial distribution of sustainable roof typologies, e.g. associating them with socio-economic attributes of neighbourhoods. Furthermore, it can provide location and density information (Figure 9) for further studies on the impact of green and solar roofs on urban microclimate as the positive effect of solar panels on the UHI effect remains debatable (Wang et al., 2005;Barron-Gafford et al., 2016).

Conclusion
As urban population continues to grow, cities need effective measures to improve their environment for both the residents and the surrounding ecology. The roofscape has a demonstrated potential in contributing to a city's sustainability, such as doubling its use through greening and the installation of solar panels. While most studies so far have been focused on estimating the potential of rooftops, Roofpedia enables us to understand the current and actual status of the rooftops, i.e. how many of them are currently occupied with solar panels and/or extensive greenery. Such information advances the state of the art, complements existing studies (including those estimating the potential), and is useful for several purposes, e.g. gauging the efficiency of government policies, tracking and monitoring pledges by businesses, verifying the use of subsidies, estimating the current carbon offsetting capacity of cities and benchmarking them, and determining how much of the potential has already been realised.
By mapping the distribution of sustainable rooftops across cities in the world, we hope to raise public awareness of the role of rooftops in supporting sustainable development. We expect that interest in rooftops will gain more traction in urban analytics, and thus, our work and the generated data may provide a solid basis for follow-up studies. While our work is focused on green roofs and those with solar panels, we believe that it paves the way for future work in large-scale identification of further rooftop typologies using computer vision.
Besides the ideas discussed in Section 5, there are many further opportunities for future work. For example, our method can be used to study the temporal evolution of sustainable rooftops, as their landscape is increasingly dynamic thanks to decreasing costs of technology and widely available satellite images (Zhang et al., 2019). The presented approach can be used to analyse satellite imagery from different epochs to provide the difference and evolution in green and solar installations over time, adding the temporal aspect to this project. Such a line of research would be welcomed in different use cases, such as understanding the effect of urban policies. In fact, our work suggests that cities that rank highly in the index usually have incentives for green or solar roofs in place, indicating that Roofpedia could be a useful instrument for urban planners and policymakers.
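As a rough illustration of the temporal comparison, the sketch below contrasts the sets of buildings flagged as solar roofs at two epochs. It assumes that predictions from both epochs are keyed by the same stable building identifiers and that the footprints themselves do not change between epochs; the identifiers, the variable names and the `roof_changes` helper are hypothetical.

```python
from typing import Dict, Set


def roof_changes(earlier: Set[str], later: Set[str]) -> Dict[str, Set[str]]:
    """Compare buildings flagged at two epochs using plain set operations."""
    return {
        "added": later - earlier,      # flagged only in the later epoch
        "removed": earlier - later,    # flagged only in the earlier epoch
        "retained": earlier & later,   # flagged in both epochs
    }


# Hypothetical building identifiers flagged as solar roofs at each epoch.
solar_epoch_1 = {"b001", "b002", "b003"}
solar_epoch_2 = {"b002", "b003", "b004", "b005"}

changes = roof_changes(solar_epoch_1, solar_epoch_2)
net_growth = len(solar_epoch_2) - len(solar_epoch_1)
print(sorted(changes["added"]), sorted(changes["removed"]), net_growth)
```

A building that appears under "removed" may reflect a false positive or an imagery difference in one of the epochs rather than a dismantled installation, so such cases would still need manual inspection.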
Another idea we have in mind is to expand our work with a crowdsourced platform, bringing it closer to the public and enabling us to enrich our dataset with additional information that might not be available from satellite imagery, and with information that may widen the work beyond the sustainability context. For example, it might be beneficial to expand the work into providing information about social uses of rooftops through a web mapping service for the public who would like to explore the roofscape around them, e.g. to find rooftops in their neighbourhood that have public access and a nice view.
Further options include enrichment of building datasets using the instances that we generated. For example, as OpenStreetMap and 3D building models support modelling and integrating features such as solar panels (Stowell et al., 2020), it might be worthwhile exploring whether our project could serve as a source for tagging buildings with green or solar installations and/or modelling their geometry.