INACITY - INvestigate and Analyze a CITY

INACITY is a platform that integrates Geo-located Imagery Databases (GIDs), Geographical Information Systems (GIS), digital maps, and Computer Vision (CV) techniques to collect and analyze urban street-level images.


Introduction
Open government data, crowd-sourced data, and even corporate data are usually available on the Internet, and their correct and proper use may be significant for both governments and citizens [1]. Several social investigations and projects are carried out by inspecting places in person and collecting data through surveys or photographs. Some of these investigations have been conducted to establish a relationship between neighborhood audit data and health-related subjects such as adolescent mental health [2], overall and lower-extremity functional loss [3,4], poverty and mortality [5], low birth weight, asthma and respiratory diseases [6,7], physical activity [8], self-rated health [9], and depression and stress [10,11]. Some studies targeted more specifically the relationship between neighborhood physical characteristics (e.g., greenery, graffiti, visual urban decay), psychosocial characteristics, and obesity [12][13][14][15][16]. Other studies analyze the causes and consequences of physical and social disorder at the neighborhood level [17][18][19].
Collecting data can be an expensive and time-demanding process [20]. To the best of our knowledge, there is no open-source platform to gather, combine, analyze, and visualize multimodal data. When image data at the street level is needed, Geo-located Imagery Databases (GIDs), that is, image retrieval systems, can be used to collect the urban images [20,21]. Manually looking for a given feature in thousands of images can be infeasible, and Computer Vision (CV) systems can be employed to mitigate this problem.
INACITY is an open-source platform that integrates GIDs, Geographical Information Systems (GIS) databases, digital maps, and CV techniques to collect and analyze urban street-level images. The software architecture of the platform is a client-server model, where the client side is a simple Web page that allows the user to select regions of a map and to select filters to analyze and visualize urban features. The server side is a Django-powered Web service with PostgreSQL and Neo4j databases. The platform is not a replacement for a Geographical Information System such as QGIS [22]. Instead, it can complement a GIS, which is used to define geographical locations of interest. The workflow would start by using an imagery database system to collect images from the defined locations, then extract features from those images, and finally store the data back into a GIS, as will be described later when we explain how we store geographical data in the graph-oriented Database Management System Neo4j [23].
From the perspective of end-users, the platform allows them to select a region of interest on a map, select a feature (e.g., greenery, bus stops, traffic signs) and then analyze the presence, or even the distribution, of that feature based on images collected from GIDs inside the selected region. Besides, end-users can also use INACITY's front-end to visualize geographical entities from a GIS, such as bus stops, schools, overpasses, and others, over the same region of interest.
From a developer's perspective, INACITY offers three kinds of extensions: new GIS, GIDs, and CV techniques. The architecture is extensible, and it is easy to add new modules or replace the existing ones with new digital maps, GIS databases, CV filters, or GIDs.

Methodology and software description
This section presents the software architecture, components, classes, current functionalities, and the way to extend it to add new features. The components responsible for collecting data in INACITY's back-end keep a common interface, so that integrating a new data collecting component is just a matter of implementing that interface. Besides, custom objects allow heterogeneous data to be combined and displayed at INACITY's front-end.

Software architecture
INACITY's client-server model [24] is twofold: a back-end (data access layer) and a front-end (presentation layer) component. This model was chosen to maximize the accessibility of the platform: instead of requiring a client application for each major OS (e.g., Linux, macOS, and Windows), the platform is made available as a web platform, so any HTTP-based client front-end can consume it. To make the platform even more accessible, the main deployment tool used is Docker [25].
The programming language adopted to develop the back-end was Python 3. The reason to choose Python rather than a more business-driven language like C# or even Java is that those languages usually come coupled with an environment of their own, like the Java Virtual Machine or proprietary libraries from Microsoft. Python tends to be more friendly for newcomers and has a rich set of libraries and packages publicly available. Besides, there is a considerably large body of scientific work produced with Python (e.g., the Scikit libraries family [26]) and web-development frameworks (e.g., Django and Flask), which in turn are tuned to deal easily with database modeling even with geographical data.
The Django framework [27] was chosen as the back-end core because it provides all the machinery for handling web-based requests (i.e., HTTP or, more specifically, REST-based requests), database access and modeling, user authentication and authorization, and real-time communication (with the Django Channels extension), and because it has a stable and large community. Having all of these pieces managed and put together by the Django framework leaves only the main concept (data integration) to be dealt with in the INACITY platform. In the back-end, the main body of work, besides setting up the Django machinery, is modeling the classes responsible for collecting and integrating data.
The front-end development also followed the Django guidelines. Using Django templates, we developed a coupled front-end served by the same back-end, rather than having one framework to deal with server-side tasks (e.g., processing and database access) and another for front-end serving (e.g., NodeJS). The front-end provided by the INACITY platform is a visualization tool that consumes data from the back-end, and it also allows user account creation, login, and the management of users' work sessions. The front-end was developed using the Django template language (based on HTML) and JavaScript.
The back-end holds a manager system¹ responsible for keeping track of the classes that collect data from GIS and GIDs, and of the classes that implement CV techniques for extracting and processing data from images. Communication with the back-end is performed through a REST API, so any client application can send requests to the back-end, not only the front-end developed in the INACITY platform. A diagram describing the back-end components, along with some comments about their functions, is available at the project's GitHub entry [30]. The diagram also illustrates the relationship between managers, abstract classes, and derived counterparts. Each manager is responsible for delegating requests from some front-end client to the components responsible for collecting urban imagery, GIS data, and data extracted by some Image Filter. MapMiner, ImageProvider, and ImageFilter are abstract classes that define a common interface to enable the manager classes to delegate requests in the same fashion to different external systems. The URLs component defines the end-points that external clients can call; those end-points define the REST API functions of the back-end.
¹ A class that follows the design pattern known as Strategy [28], that is, when a class is a manager (usually called a Manager Component) it delegates a request to an associated class and the response depends on the associated class. These components have nothing to do with Django's Manager class, which is an interface that allows Django models to query a database [29].
A class diagram describing the Manager classes, arranged according to the Strategy design pattern, is available at the project's GitHub entry [31]. The diagram shows the abstract base classes that allow the extension of the INACITY data sources and processors. For example, the class OSMMiner is a subclass derived from MapMiner, and it provides a unified way to collect data from the OpenStreetMap GIS [32]. The OSMMiner class implements functions whose signatures are defined in its base class. By calling those functions, the MapMinerManager class can collect data from the OpenStreetMap GIS seamlessly, without the need to be adjusted to the data model or even the connection details of the OpenStreetMap GIS. This design is possible because the OSMMiner class translates requests from the MapMinerManager into queries for OpenStreetMap, and also translates the response from the latter into a common format (i.e., GeoJSON [33]) that can be transmitted back to the front-end client.
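The delegation described above can be illustrated with a minimal sketch of the Strategy arrangement. The method names (register, request, get_streets), the registry keyed by a name attribute, and the FakeOSMMiner stand-in that returns canned data are assumptions made for illustration; they are not taken from the INACITY source code.

```python
from abc import ABC, abstractmethod


class MapMiner(ABC):
    """Common interface every GIS connector implements (illustrative names)."""

    name = "base"

    @abstractmethod
    def get_streets(self, region: dict) -> dict:
        """Return a GeoJSON FeatureCollection for the given region."""


class FakeOSMMiner(MapMiner):
    """Stand-in connector: returns canned data instead of querying a GIS."""

    name = "osm"

    def get_streets(self, region: dict) -> dict:
        feature = {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [[-46.73, -23.56], [-46.72, -23.55]],
            },
            "properties": {"source": self.name},
        }
        return {"type": "FeatureCollection", "features": [feature]}


class MapMinerManager:
    """Keeps a registry of miners and delegates each request to the chosen one."""

    def __init__(self):
        self._miners = {}

    def register(self, miner: MapMiner) -> None:
        self._miners[miner.name] = miner

    def request(self, miner_name: str, region: dict) -> dict:
        # The manager knows nothing about the target GIS; it only relies
        # on the interface declared by the abstract base class.
        return self._miners[miner_name].get_streets(region)


manager = MapMinerManager()
manager.register(FakeOSMMiner())
streets = manager.request("osm", {"type": "Polygon", "coordinates": []})
print(streets["features"][0]["properties"]["source"])
```

In this arrangement, supporting another GIS only requires registering one more MapMiner subclass; the manager and its callers remain unchanged.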
The front-end comprises a website for the end-user. A diagram describing the front-end is available at the project's GitHub entry [34]. The main components are its pages and communication classes. The pages essentially enable the end-user to interact with the website and render updated information as the user makes requests. The communication components are responsible for encapsulating requests to the back-end and, in the current version, to Google Street View (GSV) servers. It is noteworthy that the GSVService component, responsible for communication with GSV servers, is implemented at the front-end due to restrictions on GSV. However, the signing key and the GSV request formulation algorithms are implemented at the back-end. Google Street View [35] was chosen as the standard GID due to its worldwide coverage and because it is commonly used in scientific papers that analyze the urban environment through street-level urban imagery.

Software functionalities
This section presents the components of the system that allow one to add functionalities to the system. By combining geolocated features from GIS and geo-located images from an image provider, one can enrich the system's functionalities. One example of such functionality consists of observing specific kinds of trees cataloged in a city greenery database like the Pasadena Urban Trees dataset [36], or the road afforestation (''Arborização viária'') layer from the GeoSampa dataset [37] of the city of São Paulo, Brazil.
The INACITY platform is concerned with integrating imagery data, and the information extracted from it, with a GIS. Such integration allows a comprehensive range of applications, the most direct ones being the assessment of the quality or presence of urban features. For example, it allows one to assess visually and automatically the quality of a segment of road, precisely spotting cracks, potholes, paint damage, and other signs of degradation. Concerning the detection of urban features, one possibility would be to implement deep neural networks to detect traffic signs [38] in images collected from a crowdsourced imagery platform such as KartaView (previously known as OpenStreetCam) [39]. By creating subclasses of the abstract base classes ImageProvider and ImageFilter, one can integrate KartaView into INACITY (to collect the images) and then process the collected images to detect traffic signs or other urban elements.

Extending the platform
The platform provides the means for users to bring their own datasets or new image filter algorithms, and its design allows new components to be easily integrated with previously implemented ones. Integrating a new component requires the user to implement it directly in the platform's Python source code. There are three main ways to extend the platform: new geographical databases, new imagery platforms, and new Computer Vision algorithms.
Each possible extension is made by implementing a subclass of a specific base class, as detailed in the following sections, that defines an interface for a corresponding Manager Component.

GeoImage
To facilitate the integration between imagery data and geographical data, one can use the GeoImage object. A class diagram describing the GeoImage component based on GeoJSON is available at the project's GitHub entry [40].
The implementation of this object is similar to the GeoJSON object to achieve better interoperability. As specified by RFC 7946 [33], the GeoJSON object has well-defined fields with proper semantics, except for the properties field of the Feature object, which can contain any JSON (JavaScript Object Notation) object. Therefore, we keep the imagery data related to the coordinates of a Feature object inside the properties field under the key geoimages. Every GeoJSON object is either a FeatureCollection, a Feature, or one of seven kinds of geometries [33]. We consider each geographical entity as a FeatureCollection, usually containing only a single Feature.
Each request to an ImageProvider will contain a FeatureCollection with an array of Features. The coordinates of the latter define the locations of the images to be retrieved by the ImageProvider. Fig. 1 shows a diagram of GeoJSON as an abstract class with nine possible subclasses (considering that the Geometry class can be one of seven distinct types).
The GeoImage object keeps metadata about an image collected from an image provider and data extracted from that particular image using some Computer Vision algorithm. The extracted data is kept in a separate object called ProcessedImageData. Notice that the same GeoImage can hold references to multiple ProcessedImageData objects, because each image can be processed by different Computer Vision algorithms, yielding multiple distinct sets of extracted data.
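To make the data layout concrete, the following sketch builds a GeoJSON Feature whose properties field carries a geoimages entry, with one GeoImage per coordinate and one ProcessedImageData entry per filter. Only the geoimages key and the density property come from the text; every other field name (id, heading, pitch, processed_data) is a hypothetical choice for illustration.

```python
import json


def make_geoimage(image_id, heading, pitch):
    """Build a GeoImage-like dict; field names are illustrative assumptions."""
    return {
        "id": image_id,
        "heading": heading,      # horizontal camera angle, in degrees
        "pitch": pitch,          # vertical camera angle, in degrees
        "processed_data": {},    # one ProcessedImageData entry per filter
    }


# A street segment as a LineString with two [longitude, latitude] positions.
coordinates = [[-46.7300, -23.5600], [-46.7290, -23.5595]]

# One GeoImage per coordinate, stored in the same order as the coordinates.
geoimages = [make_geoimage(f"img{i}", 0.0, 0.0) for i in range(len(coordinates))]

# The same image may be processed by several filters; each result is
# stored under its own key, so the results do not overwrite each other.
geoimages[0]["processed_data"]["greenery"] = {"density": 0.31}
geoimages[0]["processed_data"]["scene_captioning"] = {"caption": "a tree-lined street"}

feature = {
    "type": "Feature",
    "geometry": {"type": "LineString", "coordinates": coordinates},
    "properties": {"geoimages": geoimages},
}
feature_collection = {"type": "FeatureCollection", "features": [feature]}

# The structure is plain JSON, so it can be serialized and sent to a client.
payload = json.dumps(feature_collection)
print(len(feature["properties"]["geoimages"]) == len(coordinates))
```

Because the extra data lives only inside properties, any GeoJSON-aware tool can still read the geometry while ignoring the imagery metadata.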
We add a new entry with the key geoimages to the JSON field properties of the Feature object, keeping the same indexes as the coordinates of the geometry property. That is an easy way to access the GeoImage related to a particular coordinate. The geoimages entry has the same structure (i.e., nesting of indexes) as the coordinates in the geometry property of the Feature. When an image is not available for a particular coordinate, an error string fills that particular index position in the geoimages entry.

Image filter module
The Scene Captioning class could, for example, be such that, given an image, it provides a textual description of the scene. A geo-located image's description can be stored in the geographical feature related to the image or even in a completely decoupled GIS. The Pavement quality class is another example of an object that could provide a score indicating to what extent the pavement of a road is damaged. If the input image's point of view is such that the camera points downwards, the main object of interest will be the pavement and its eventual defects (e.g., cracks and potholes). The component could provide a mask image, obtained by segmenting the defects, that highlights the pavement's most relevant damages. The interface design and the guidelines provided by the abstract class ImageFilter can be used to add a new component that derives from ImageFilter. The newly implemented component can then be readily used by the ImageFilterManager and integrated into the INACITY platform.
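A new filter could be added along the following lines. This is a minimal sketch: the interface (a filter_id attribute and a process_image method), the processed_data key, and the toy rule that treats dark pixels of a grayscale image as damage are all assumptions for illustration, not the platform's actual ImageFilter API or a real pavement assessment method.

```python
from abc import ABC, abstractmethod


class ImageFilter(ABC):
    """Interface assumed for this sketch; the real base class may differ."""

    filter_id = "base"

    @abstractmethod
    def process_image(self, pixels):
        """Return a dict with the data extracted from one image."""


class PavementQuality(ImageFilter):
    """Toy stand-in: treats dark pixels of a grayscale image as damage."""

    filter_id = "pavement_quality"

    def __init__(self, dark_threshold=50):
        self.dark_threshold = dark_threshold

    def process_image(self, pixels):
        # Mask with True where a pixel is considered damaged.
        mask = [[value < self.dark_threshold for value in row] for row in pixels]
        total = sum(len(row) for row in pixels)
        damaged = sum(sum(row) for row in mask)
        return {"damage_ratio": damaged / total, "mask": mask}


def apply_filter(image_filter, geoimage, pixels):
    """Attach the filter output to the GeoImage, keyed by the filter id."""
    processed = geoimage.setdefault("processed_data", {})
    processed[image_filter.filter_id] = image_filter.process_image(pixels)
    return geoimage


geoimage = {"id": "img0"}
pixels = [[200, 30], [180, 190]]  # 2x2 grayscale image with one dark pixel
apply_filter(PavementQuality(), geoimage, pixels)
print(geoimage["processed_data"]["pavement_quality"]["damage_ratio"])  # 0.25
```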

Image Provider module
Similarly, the same extensibility and modularity concepts can be extended to the Image Provider and the Map Miner classes. Fig. 3 presents the same concept as presented for image filters but applied to image providers. The main difference is that instead of processing an input image, the component is responsible for collecting an image associated with a given coordinate. For example, the GSV API allows one to query for the closest panorama with relation to some given coordinate. It also allows selecting the vertical and horizontal angles, the field of view, image resolution, and other parameters regarding an image from some specific panorama.
The component GoogleStreetView in Fig. 3 receives as input a set of coordinates and converts them into appropriate requests for the GSV platform, treats possible exceptions, and finally returns a formatted response (typically a GeoImage instance) back to the ImageProviderManager class.
To extend the Image Provider module, a subclass derived from the ImageProvider abstract class must be created, as exemplified in Fig. 3 by the highlighted (yellow boxes) subclasses Mapillary [41], Baidu Total View [42] and Crowdsource. Such subclasses encapsulate all the code responsible for communicating with the target Image Provider system. Given a set of coordinates, the subclasses must formulate a query to the target system, and then merge the response from that system with the input coordinates, composing a GeoImage response that is returned to the ImageProviderManager class. In the case of the Crowdsource subclass, the target system is the INACITY platform back-end itself, and this component provides the means for users to supply urban images themselves, which in turn can be retrieved later by the Image Provider Manager component under another user's request.
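The merge step described above can be sketched as follows. The method names (fetch, get_images), the error string, and the InMemoryProvider stand-in are hypothetical; a real subclass would query an external imagery service instead of a dictionary. The sketch also assumes LineString geometries, whose coordinates are a flat list of [longitude, latitude] pairs.

```python
from abc import ABC, abstractmethod

NO_IMAGE = "Error: no image available"  # hypothetical error string


class ImageProvider(ABC):
    """Interface assumed for this sketch; the real base class may differ."""

    @abstractmethod
    def fetch(self, lon, lat):
        """Return metadata of the image closest to (lon, lat), or None."""

    def get_images(self, feature_collection):
        """Fill properties['geoimages'] of every Feature (modified in place)."""
        for feature in feature_collection["features"]:
            geoimages = []
            # Assumes LineString geometries: a flat list of [lon, lat] pairs.
            for lon, lat in feature["geometry"]["coordinates"]:
                metadata = self.fetch(lon, lat)
                # Keep one entry per coordinate so that indexes stay aligned.
                geoimages.append(metadata if metadata is not None else NO_IMAGE)
            feature.setdefault("properties", {})["geoimages"] = geoimages
        return feature_collection


class InMemoryProvider(ImageProvider):
    """Stand-in provider backed by a dict keyed by (lon, lat)."""

    def __init__(self, images):
        self._images = images

    def fetch(self, lon, lat):
        return self._images.get((lon, lat))


provider = InMemoryProvider({(-46.73, -23.56): {"id": "pano-1", "heading": 90.0}})
request = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "LineString",
                     "coordinates": [[-46.73, -23.56], [-46.72, -23.55]]},
        "properties": {},
    }],
}
response = provider.get_images(request)
print(response["features"][0]["properties"]["geoimages"])
```

Here the second coordinate has no image, so its position in geoimages holds the error string rather than being dropped, which preserves the correspondence with the coordinates array.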

Map Miner module
The Map Miner module is responsible for integrating GIS databases into INACITY. It has some particularities that are worth mentioning. Fig. 4 presents the Map Miner module and its connections with some GIS databases. As in the other module figures, the blue components correspond to those already implemented, and the yellow ones are example components to be implemented in future versions of the INACITY platform. The GeoSampa component defines the means to collect data regarding bus stops of the city of São Paulo from the internal PostgreSQL database. An extract from the GeoSampa [37] database was imported directly into the INACITY database to reduce the number of external queries. The database also keeps users' access and working session data.
The OSMMiner class implements the means to collect street information from the OpenStreetMap [32] platform. The streets are represented as a collection of interconnected LineString objects (according to the GeoJSON specification [33]).
The sequence of collected LineString objects (each holding its geographical coordinates) can be the input to get geolocalized images from some Image Provider system. Besides that, the relationships (e.g. direction) between each pair of objects can be used to determine camera angles between two adjacent panoramas.
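One way to derive a camera direction from adjacent positions is the standard initial great-circle bearing formula, sketched below. Whether INACITY computes headings this way is an assumption; the function names and the choice of reusing the previous heading for the last point are illustrative.

```python
import math


def initial_bearing(lon1, lat1, lon2, lat2):
    """Initial great-circle bearing from point 1 to point 2.

    Returns degrees clockwise from true North, in the range [0, 360).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(x, y)) + 360.0) % 360.0


def headings_along(coordinates):
    """Heading at each [lon, lat] position of a LineString, looking forward."""
    headings = [initial_bearing(*a, *b)
                for a, b in zip(coordinates, coordinates[1:])]
    if headings:
        headings.append(headings[-1])  # last point reuses the previous heading
    return headings


line = [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # north first, then roughly east
print([round(h, 1) for h in headings_along(line)])
```

Such headings could be passed to an image provider so that each requested view looks along the street rather than at an arbitrary angle.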
The PanoramaMiner class performs queries over a Neo4j database instance, a graph-oriented Database Management System (DBMS) [23]. The main advantage of using a graph-oriented database is that streets, regions, and other geographical objects can be modeled naturally as a graph. In the current version, the Neo4j instance is hosted together with the Django server in the same Docker container. This database is responsible for relating data from physical entities (e.g., objects from some GIS), their images (sampled from some Image Provider system), and even data extracted from those images (using some Image Filter component). Fig. 5 shows an example of some GSV panoramas, image metadata from each panorama, and data extracted from the images, as seen in the Neo4j Browser user interface. An orange circle represents each panorama, which holds information such as the address, the pitch and heading of the camera, and the time of the shot. Each panorama may yield different images, each with its own pitch and heading. Therefore, each panorama may be related to multiple images, and the metadata regarding these images is stored in the blue circles, called Views. The data resulting from processing the images is stored in vertices called FilterResult, represented by the gray circles in Fig. 5 (each related to the View that corresponds to the processed image). This data representation in a graph-oriented DBMS allows faster retrieval of results already collected and processed, reducing the time users need to wait for their requests to be completed while keeping the system flexible and extensible.
To extend the Map Miner module, a subclass derived from the MapMiner abstract class must be created. This subclass must translate the user request into a request to the target GIS (e.g., OpenStreetMap) or local database (e.g., Neo4j), which contains geographical data (e.g., the location of GSV panoramas or even bus stops imported from GeoSampa into the local PostgreSQL database).
Notice that the Django ORM is used to interact with the PostgreSQL database, while the interaction with Neo4j is done through the Python packages provided by Neo4j, Inc.

Results and illustrative examples
The platform's current implementation includes the OpenStreetMap GIS as a source of coordinates to be sampled from the GSV Image Provider. This combination allows multiple uses of the platform in a practical way. In this section, we present two use case examples of the platform.

Neighborhood visual inspection
The platform's simplest use case involves selecting a region of interest and fetching images from that region. In the current implementation, the OpenStreetMap and the GSV platforms are the source of coordinates and the Image Provider, respectively. Once collected, the streets define a route along which the user can view pictures from the selected region as if they were there. This use case is handy for auditing neighborhoods [21]. A short video showing the use case accompanies this article [43].
The pipeline begins when a person using INACITY's front-end selects a region and presses the ''Get Images'' button. We assume that the user keeps the default GIS option (MapMiner). This action triggers a request from the front-end class UIModel to INACITY's back-end. The request asks for the geographical entities inside the selected region, which are later sent to the Image Provider.
A diagram of the process that follows the UIModel request is available at the project's GitHub entry [44]. Since this is a request for a geographical entity, the request is received by the MapMinerManager component, which in turn delegates it to the appropriate MapMiner subclass. In this example, OSMMiner is that subclass, since it is responsible for handling requests to the OpenStreetMap platform. The OSMMiner subclass formulates a query written in the Overpass Query Language [45] and sends it to the OpenStreetMap platform.
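For illustration, a query of this kind could be assembled as in the sketch below. The exact query issued by OSMMiner is not shown in this article, so the filter on the highway tag, the bounding-box form of the region, and the function name build_streets_query are assumptions; only the Overpass QL syntax itself (settings, a way filter with a south, west, north, east bounding box, and out geom) is standard.

```python
def build_streets_query(south, west, north, east, timeout=25):
    """Build an Overpass QL query for the ways tagged 'highway' in a bounding box.

    Overpass expects the bounding box as (south, west, north, east).
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];"
        f'way["highway"]({bbox});'
        "out geom;"  # 'geom' asks for the coordinates of each way
    )


query = build_streets_query(-23.57, -46.74, -23.55, -46.72)
print(query)
```

The resulting string would be sent to an Overpass API endpoint, and the JSON response would then be converted into GeoJSON LineString features.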
After OpenStreetMap returns the query results, the OSMMiner subclass will format the response using the GeoJSON [33] specification and return it to the MapMinerManager, which in turn will return it to the front-end. INACITY's front-end will then display the returned geographical entities, streets in this case. To better visualize the results, the system represents streets as blue lines in the digital map.
Fig. 5. [...] where images were taken. Views (in blue) represent an image and encapsulate the horizontal (heading) and vertical (pitch) angles of the camera (with relation to true North and flat ground, respectively) at the moment of the shot. Filter results (nodes in gray) maintain data extracted from a view by some ImageFilter subclass (Fig. 2).
Following this step, the UIModel class issues a second request consisting of the geographical entities collected in the first step and an Image Provider. This request follows a similar flow to the one used to collect street geographical data, except that this time the ImageProviderManager receives the request. Assuming that the default option for the Image Provider is the GSV, the ImageProviderManager will delegate the request to the GoogleStreetViewImageProvider subclass of the ImageProvider component.

Urban feature visualization (greenery)
Visualizing the distribution of urban features, like greenery, in a given geographical region is another use case of the INACITY platform.
The pipeline starts as before (a user selecting a region of interest) and then triggers a request for processing the images collected during a neighborhood audit. In the video that accompanies this paper, the user applies a filter called Greenery filter to estimate the Green View Index by segmenting the parts of the image that correspond to green vegetation. The proportion of the image (in relation to the image size) regarded as green vegetation is stored in the density property of a ProcessedImageData object associated with the GeoImage of the processed image. When the FeatureCollection is returned to the front-end, each of its features and corresponding GeoImages (if available) will have an associated ProcessedImageData, which in turn may hold extracted data such as the Green View Index density, which is displayed as in Fig. 6, for example.
Besides the heat-map visualization, INACITY can present some urban features overlaid on the original images. Fig. 7(b) presents an example of an image in which the greenery parts (as classified by the back-end) are highlighted in green and the non-greenery ones in blue, overlaid on the image. This kind of visualization allows a fine-grained inspection of each image rather than of the analyzed region as a whole. The greenery image filter module uses the Python packages numpy [46] and scikit-image [47].
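The density computation and the overlay can be sketched with numpy alone. The segmentation rule used here (a pixel counts as vegetation when its green channel exceeds both red and blue) is a deliberately crude assumption for illustration; the actual Greenery filter may use a different criterion, and the function names are hypothetical.

```python
import numpy as np


def greenery_density(rgb):
    """Return (mask, density) for an H x W x 3 RGB image.

    Assumed rule: a pixel is 'green' when G > R and G > B.
    """
    rgb = np.asarray(rgb, dtype=np.int32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mask = (g > r) & (g > b)
    # Proportion of the image regarded as vegetation.
    return mask, float(mask.mean())


def overlay(rgb, mask, alpha=0.5):
    """Blend green over vegetation pixels and blue over the remaining ones."""
    rgb = np.asarray(rgb, dtype=np.float64)
    tint = np.where(mask[..., None], [0, 255, 0], [0, 0, 255])
    return ((1 - alpha) * rgb + alpha * tint).astype(np.uint8)


image = np.array(
    [[[10, 200, 10], [200, 10, 10]],
     [[10, 10, 200], [100, 100, 100]]],
    dtype=np.uint8,
)
mask, density = greenery_density(image)
highlighted = overlay(image, mask)
print(density)  # 0.25: one of the four pixels is classified as green
```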

Demo site, impact and limitations
INACITY is available at http://inacity.org to anyone who wishes to try the implemented analysis without deploying it. Because of GSV costs, a user-level quota system is necessary to allow more users to try the platform through the demo site. Users can, however, supply the platform with their own GSV credentials, in which case the quota does not apply.
A non-specialist user can use the public instance to select a region, query images from it, and extract features from those images. Nevertheless, this approach limits the user to the capabilities already implemented in the public instance, namely street network locations from OpenStreetMap [32], bus stops from GeoSampa [37], and images from Google Street View [35]. Currently, the only image processing algorithm implemented in INACITY's public instance is the greenery one. Additional capabilities can be implemented by developers and researchers in the future, possibly in locally deployed instances of the INACITY platform.
INACITY can be part of a more extensive pipeline of research. For instance, in [48], we collect images from some locations in the cities of Porto Alegre (BR) and São Paulo to build a machine learning model to detect entanglements between electric wires and tree branches. If the model is successful, it can be coupled into INACITY by subclassing the ImageFilter class, thus enabling the platform to help city managers to detect tree and wire entanglements and prevent accidents.

Quota module
The quota module keeps track of how many calls are performed by a registered user. An anonymous user, identified by a Django session-id, can also use the system. The main components of the quota subsystem are shown at the project's GitHub entry [49]. The class QuotaManager is responsible for registering a new entry in the database and keeping track of each registered user's available quota. The decorator factory quota_request_decorator_factory is used as a decorator over the functions whose usage is tracked. The parameters of the decorator factory are default_user_quota, default_anonymous_quota and skip_condition. The first two specify how many calls will be available for a registered user and for an anonymous user, respectively. The last parameter is a Boolean function called to test whether the quota manager should be skipped (e.g., when users are using their own GSV credentials).

Performance tests
We created a benchmark test to assess the effect of multiple simultaneous requests to the back-end. The tests consist of collecting streets and images for two disjoint urban regions. Each region has a different size and, consequently, a different number of streets and images. Table 1 presents the results and the details of both areas used in the benchmark. The features are the size (in square meters) of each region, the number of streets, and the number of images collected. The minimum, maximum, average, and standard deviation of the response times are reported separately for collecting the streets in each region and for collecting the images for those streets. Finally, the table reports the time spent processing all the collected images using the currently implemented Greenery image filter. To collect the response times, 100 requests were performed, 50 for area 1 and 50 for area 2, with ten simultaneous requests being made at any given time. The requests for area 1 were interleaved with requests for area 2, that is, the first request performed was related to area 1, the second to area 2, the third to area 1 again, and so on. The server hosting the platform during the benchmark was an Intel Xeon E5420 2.5 GHz with eight cores.
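The request schedule described above can be reproduced with a small harness such as the one sketched here. The function fake_request is a placeholder that merely sleeps; in a real run it would be replaced by an HTTP call to the back-end. The harness itself (a thread pool of ten workers and alternating area identifiers) is an assumption about how such a test could be driven, not the script used to produce Table 1.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(area):
    """Placeholder for a back-end call; returns (area, elapsed seconds)."""
    start = time.perf_counter()
    time.sleep(0.001 * area)  # pretend that area 2 is slower than area 1
    return area, time.perf_counter() - start


def run_benchmark(request_fn, total=100, concurrency=10):
    # Interleave the areas: 1, 2, 1, 2, ...
    areas = [1 if i % 2 == 0 else 2 for i in range(total)]
    # At most `concurrency` requests are in flight at any given time.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(request_fn, areas))
    summary = {}
    for area in (1, 2):
        times = [elapsed for a, elapsed in results if a == area]
        summary[area] = {
            "n": len(times),
            "min": min(times),
            "max": max(times),
            "mean": statistics.mean(times),
            "stdev": statistics.stdev(times),
        }
    return summary


summary = run_benchmark(fake_request)
for area, stats in summary.items():
    print(area, stats["n"], round(stats["mean"], 4))
```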

Table 1. Multiple statistics taken upon the execution times and request sizes (in terms of streets and images collected) for two distinct regions.

Discussion and conclusions
In the context of smart cities and the Internet of the future, using government public data or even private data available on the Internet is essential to assess features in a city. We created the INACITY platform with three concepts in mind: geolocated images, geolocated data, and algorithms to extract information from images. These concepts directed the platform modules' design and architecture such that implementing a new image provider, GIS, or a new CV/image processing algorithm can be done without impacting any of the other modules. In other words, by following each module base class's specifications, new components can be seamlessly integrated. The front-end is simple and allows end-users (e.g., citizens, developers, researchers, or government administration agents) to use the platform to gather data for future use and, because it is open-source, further improve the platform by modifying it to their needs.
The use cases presented are useful but straightforward, and in the future we plan to extend them to other city problems, such as studying the intersection of power lines with trees.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.