Scalable 3D mapping of cities using computer vision and signals of opportunity

Abstract Three-dimensional (3D) maps are used extensively in a variety of applications, from air and noise pollution modelling to location-based services such as 3D mapping-aided Global Navigation Satellite Systems (GNSS), and positioning and navigation for emergency service personnel, unmanned aerial vehicles and autonomous vehicles. However, the financial cost associated with creating and updating 3D maps using the current state-of-the-art methods such as laser scanning and aerial photogrammetry are prohibitively expensive. To overcome this, researchers have proposed using GNSS signals to create 3D maps. This paper advances that family of methods by proposing and implementing a novel technique that avoids the difficult step of directly classifying GNSS signals into line-of-sight and non-line-of-sight classes by utilising edge detection techniques adapted from computer vision. This prevents classification biases and increases the range of environments in which GNSS-based 3D mapping methods can be accurately deployed. Being based on the patterns of blockage and attenuation of GNSS signals that are freely and globally available to receive by many mobile phones, makes the proposed technique a free, scalable and accessible solution. This paper also identifies some key indicators affecting data collection scalability and efficiency of the 3D mapping solution.


Introduction
Many applications and location-based services either require three-dimensional (3D) maps to function or can significantly improve their quality by using 3D maps (Biljecki et al. 2015) and (Huang et al. 2018). They include urban pollution modelling (Lin et al. 2009), 3D ray-tracing for radio wave propagation modelling (Yun and Iskander 2015), positioning and navigation for autonomous vehicles (Levinson et al. 2011) and low-altitude unmanned aerial vehicles (Floreano and Wood 2015), 3D mapping-aided Global Navigation Satellite Systems (GNSS) positioning (Groves et al. 2012) and virtual and augmented reality (Santana et al. 2017). Many of these applications only require a Level Of Detail (LOD) 1 map (Biljecki et al. 2015), also known as a 2.5D map. LOD1 maps represent buildings as flat-roofed blocks extruded from ground level (Kolbe 2009), as shown in Figure 1.
In order to produce LOD1 maps, there are several techniques, including technology-based (direct) approaches and inference-based (indirect) approaches. While the technology-based, such as airborne photogrammetry or laser scanning Mallet 2012, Dukai et al. 2019) can provide more consistently accurate and reliable 3D maps, they are relatively expensive to apply. For example, the cost of modelling a 3D map for the UK has been estimated at over £19 M (Wong 2018). The indirect approaches, on the other hand, tend to develop models that make use of multiple sources of data, such as images, laser scans (Musialski et al. 2013), digital surface model (Musialski et al. 2013) and two-dimensional (2D) building footprints (Dukai et al. 2019, Roy et al. 2022 to estimate height values in urban environments (Roy et al. 2022, Labetski et al. 2023. Although LOD1 map creation using ground-based imagery or laser scans is possible, aerial imagery and airborne laser scans are best suited to the problem (Musialski et al. 2013). This raises issues of cost, however, since acquiring that type of data is financially expensive and can often involve overcoming regulatory constraints (Duan and Lafarge 2016). Given the achievable accuracy (Roy et al. 2022) at a large-scale of cities that are required for some sensitive applications, such as drone navigation, neither of the approaches yet can provide a solution, and hence we do not have good coverage of 3D maps at a large scale.
In Yin et al. (2009), a survey of methods that use digitised architectural drawings to construct highly detailed LOD3 models is presented. But these methods are not fully automated, requiring human intervention at a scale that makes them unsuitable for modelling large urban areas (Musialski et al. 2013). Goetz (2013) proposed a fully automated framework to generate 3D models using Volunteered Geographic Information (VGI). Similarly, Biljecki et al. (2017) used the building attribute data to estimate building height, this time sourcing it from administrative databases using data-mining techniques. In this method, the final LOD1 maps are created by extruding the height from the known 2D building footprints. Both methods however are limited by the availability and accuracy of the input data. Administrative records can often lack completeness in terms of the building attribute information they hold (Biljecki et al. 2017), while VGI information also faces issues of completeness and coverage, in addition to potentially having lower accuracy in terms of the position, shape and architectural detail of the building footprint (Fan et al. 2014).
Image-based modelling methods that utilise readily available ground-level images mined from online photo databases can also be used for 3D mapping. These include structure from motion (SFM) based methods that produce 3D point cloud representations of varying density, most notably (Pollefeys et al. 2004, Snavely et al. 2008, Xiao et al. 2009, Fang and Quan 2010, Agarwal et al. 2011, Crandall et al. 2011, Irschara et al. 2012. These methods form part of the wider computer vision literature on image-based 3D reconstruction, which continues to be a much-studied problem even in the era of deep learning (Han et al. 2021). The survey (Musialski et al. 2013) provides a more detailed treatment of SFM and other multi-input photogrammetry methods.
This paper proposes a novel solution to extract a 3D map from the freely and globally available GNSS signals. The proposed approach can provide a scalable solution that can be applied in different environments including 'urban canyons' where the other GNSS-based solutions may not be reliable.
The rest of this paper is structured as follows: Section 2 explains GNSS-based 3D mapping in the context of computer vision. Section 3 proposes and implements a novel technique to extract 3D maps using edge-detection techniques. Section 4 introduces metrics for dataset size and measures their relationships to the algorithm, examining how data collection efficiency varies in space for some common scenarios. And finally, the results are presented and discussed followed by a conclusion.

Edge detection techniques for GNSS-based 3D mapping
3D mapping using GNSS data has been implemented in a small number of pilot studies demonstrating its potential (Swinford 2005, Kim et al. 2008, Weissman et al. 2013, Isaacs et al. 2014, Irish et al. 2014a, 2014b, Rodrigues and Aguiar 2019, Lines and Basiri 2021. GNSS signals are an appealing source of input data for 3D mapping, since they are free to use, highly available globally (Groves, 2011) and cheaper to collect, store and process in comparison to other forms of input data such as high-resolution photographs. Moreover, GNSS data can now be passively collected at an even lower cost using smartphones. All smartphones running version 7 and above of the Android operating system can access the 'raw GNSS' measurements necessary for GNSS-based 3D mapping (Banville and van Diggelen 2016). Consequently, many of the studies listed above have used Android smartphones for data collection (Irish et al. 2014a, 2014b, Rodrigues and Aguiar 2019, Lines and Basiri 2021, enthused by the potential for large-scale crowd sensing models reminiscent of OpenStreetMap (Haklay and Weber 2008) and volunteered GPS trajectories (Huang et al. 2013).
The advantage of using GNSS to create 3D maps (GNSS-based 3D mapping) is that it can be implemented on the majority of smartphones, ie an 85% share of sales globally (European Global Navigation Satellite Systems Agency 2020, International Data Corporation 2020) so it can use freely receivable, globally available GNSS signals (Basiri et al. 2017). This can potentially provide a high (potentially global) coverage and lowcost solution to the challenges of creating 3D maps. It is suitable for a large-scale crowdsourcing approach as it has been used already in such contexts (Huang et al. 2013). Data collection on recent Android-running smartphones ie phones manufactured since 2016 is straightforward (Banville and van Diggelen 2016). However, 3D mapping using GNSS data has been implemented only in a small number of pilot studies (Swinford 2005, Kim et al. 2008, Weissman et al. 2013, Isaacs et al. 2014, Irish et al. 2014a, 2014b, Rodrigues and Aguiar 2019, Lines and Basiri 2021. The significant challenges of using GNSS signals for 3D mapping have made these pilots only demonstrate potential, and wider use of GNSS signals for 3D mapping still requires overcoming such challenges. The premise of GNSS-based 3D mapping is to use patterns of signal blockages and attenuation that arise as signals are occluded, attenuated, reflected and diffracted by the urban environment to make inferences about the presence or absence of any intervening objects along the direct line-of-sight (LOS) signal path (Lines and Basiri 2021). This is possible assuming that the location of the receiver is well-known via the normal operation of GNSS positioning, which requires a favourable satellite geometry not guaranteed within all urban environments, eg urban canyons. Under these conditions, the position, shape and height of the intervening objects can be determined.
In order to convert these patterns into a 3D map, approaches to-date rely on classifying signals as open (ie having a LOS signal) or closed (ie only comprising a non-lineof-sight (NLOS) signal which has either been reflected or blocked entirely). This classification is highly challenging for data collected by smartphones, due to limitations in smartphone antenna directionality, polarisation sensitivity and gain, which make it hard to differentiate true NLOS signals from signals merely exhibiting multipath interference, or which have been reflected or attenuated due to foliage, body masking or as a consequence of the antenna gain pattern Jiang 2013, McGraw et al. 2020). In addition, creating accurate ground truth labels is challenging without also having accurate receiver positions, which poses a particular challenge to crowedsensed data. The existing GNSS-based mapping methods do not adequately reflect this uncertainty in the signal classifier or receiver position in their reconstruction algorithms. We, therefore, propose a novel solution to directly extract a 3D map from the GNSS signal's carrier-to-noise ratio without having to undertake any intermediate NLOS/LOS classification step. We expect the proposed approach to improve the final accuracy of the 3D map and to provide a more scalable solution that can be applied in challenging environments such as 'urban canyons' where the accuracy of NLOS/LOS signal classifiers is particularly unreliable.
The contribution of this paper is thus two-fold: we propose an approach to GNSSbased 3D mapping based on edge-detection techniques borrowed from computer vision and apply them to GNSS data to create LOD1 maps. Secondly, we identify measures to meaningfully quantify the size of a GNSS signal dataset and investigate the effectiveness of different receiver and satellite positions with respect to the accuracy of the final estimation. This allows us to draw conclusions on sampling strategies and the suitability of the method for wide-scale deployment.

GNSS-based 3D mapping
GNSS-based 3D mapping can be considered a form of multi-view stereography (MVS) (Furukawa and Hern andez 2015), ie the identification of common features within a set of overlapping images and the use of stereographic correspondence to calculate the distance of the features, thus reconstructing a 3D object. In GNSS-mapping the basic image is a skyplot, which is an azimuth-elevation plot of the above-horizon satellites from the point of view of a receiver at a single epoch, as shown in Figure 2.
The commonly used GNSS mapping algorithms are limited due to being relied on a pre-specified signal strength classifier (Weissman et al. 2013, Isaacs et al. 2014, Irish et al. 2014a, 2014b, Rodrigues and Aguiar 2019. They also do not reflect the input uncertainty of the signal classification or positional data in the reconstruction algorithm. The main challenge of GNSS mapping is the extreme sparsity of the observations at each epoch, which they struggle to deal with. A GNSS receiver may receive up to around 40 signals in all directions, which is insufficient to match common features between skyplots. This requires an alternative approach to MVS, based on inverse ray-tracing (Liu and Cooper 2011) wherein the object is given an assumed shape, which is then refined in such a way as to minimise the inconsistencies between the shape of the object and its corresponding representation in the given set of 2D images depicting the object.
The act of measuring the consistency between images is a typical component of MVS techniques (Seitz et al. 2006). In GNSS-based mapping, techniques investigated to-date have measured consistency by comparing the results of a signal classifier to the results of using the 3D map as a classifier geometrically, ie all signals which pass through a building are blocked (NLOS) and LOS otherwise. However, creating accurate NLOS/LOS classifiers for smartphone GNSS data is a challenging problem. As previously mentioned, smartphone receivers are linearly polarised meaning they are not sensitive to the polarisation of the GNSS signal (McGraw et al. 2020). They also tend to have low gains, but high sensitivity, meaning they can track weak GNSS signals that often turn out to be NLOS (Wang et al. 2015). This means that the commonly used approach of classifying signals based on signal strength, ie carrier-to-noise ratio, is unreliable at best since the specific properties of the antenna result in large overlaps between the carrier-to-noise ratio distributions of the NLOS and LOS signals (Wang et al. 2015).
The inverse ray-tracing approach proposed in this paper bypasses this problem. Instead of the 3D map being used as a geometric classifier, it can be used as the basis to create a composite image from the many skyplots, by using standard techniques of epipolar geometry to transform the image viewpoints. This process is illustrated in Figure 3. The receiver, R, observes a signal p R with azimuth a and elevation . This defines a line P R which transforms to the line P V when considering the image from the viewpoint V (a 2D image with xy coordinates but no z coordinate). If the distance between the object and the receiver is known, defining point P, the transformed signal is a point p V in the image.
However, this distance is not known directly from the skyplot. Instead, the 3D map can provide a distance estimate through intersecting the line P R with the map objects. Importantly, not all dimensions of the shape need to be fixed to allow this process to occur. For example, one of the building dimensions can be allowed to extrude without end. A simple and practical illustration of this idea is to assume the building floor plan is known while the height is unknown. This is the primary case in practice because we Figure 3. Generating a composite image from the skyplots using the epipolar geometry. assume 2D maps are available. Hence, an image can be generated on the assumption of an infinite building height, which allows us to directly extract the shape of the building roof by applying low-level image processing, ie edge detection techniques, to detect the edges of the building roof.
To clarify the different approaches, GNSS-based 3D mapping may be represented as a common set of potential pre-processing operations to combine the set of skyplot images, followed by generating the 3D map using either a consistency-measuring approach or the proposed edge-detection approach. The pre-processing operations are as follows: 1. Transformation. A signal classifier is used to transform raw signal features into a probabilistic or binary classification of open and closed signals. This may help in subsequent feature detection, depending on the classifier accuracy. However, the primary reason for this transformation is that corresponding classified signals can also be generated using a hypothesised 3D map, facilitating a consistency-based approach. 2. Dimension-reduction. As illustrated in Figure 3, the path of a signal is represented as a point in the corresponding skyplot, but a line under a change of viewpoint. To combine points across a series of skyplots into a common viewpoint, it is necessary to intersect the signals with a polygon or plane, eg the footprint of a building. This reduces each line to a point within a 2D image. A further dimension reduction can also be achieved by binding the 3D map to the LOD1 assumption, ie assuming that the building can only have a flat roof which means the height as the only dimension of freedom. Similar to the approaches that are taken in Swinford (2005), Kim et al. (2008), Weissman et al. (2013) and Lines and Basiri (2021). This would allow the edge detection approach to directly extract the LOD1 building height by detecting the 'edge of the building' as it were. 3. Pixelation. Converting the signals into a pixel (or voxel) grid with one value per pixel. Each pixel contains many signals, which may mitigate the effects of measurement noise. A coarse resolution (eg 1-4 m cubed) has been used in implementations (Irish et al. 2014b, Rodrigues and Aguiar 2019), on the basis that it is appropriate for the underlying position uncertainty and size of collected data sets. For example, Irish et al. (2014b) used a probabilistic classifier on each signal and then combined using the Bayes rule, and Rodrigues and Aguiar (2019) directly classified pixels based on features of the set of intersecting signals. The advantages of pixelation are that it can combine signals without requiring dimension reduction and the pixel grid can be used directly as a 3D map or as the input into further steps, at the expense of discarding a great deal of information.
Having processed the images into a combined set, a number of consistency metrics between the map and image-set can be implemented to generate the 3D map through minimising inconsistency. They include the number of misclassified LOS signals (Swinford 2005), the ratio of misclassified NLOS signals (Kim et al. 2008) and hinge-loss function (Weissman et al. 2013). While consistency-based measures cannot be regarded as reliable as they are highly susceptible to bias in the underlying signal classifier (Lines and Basiri 2021), an edge detection approach does not require any signal classification to take place. Therefore, the building shape can be extracted from a composite image using edge detection methods, a form of low-level image feature extraction, as described in Cyganek and Siebert (2009).

Edge detection for 3D mapping
Edge detection relies on identifying the portion of the image where the signal undergoes significant change. This can be achieved by considering the signal gradient vector: The norm of the vector, jjrIjj, is maximised in areas of rapid change, such as edges. An equivalent approach is to consider zeros of the Laplacian: which equates to local maxima of the gradient vector norm. Prior to applying these methods to edge detection, it is necessary to remove noise from the image. Two common approaches include (A) applying a smoothing filter (eg Gaussian) to the image, prior to applying a differentiating operator (Cyganek and Siebert 2009) or (B) using Savitsky-Golay filters (Press and Teukolsky 1990), which fit a polynomial to a window around each sample-point. As images are naturally pixelated, a number of different operators exist to best approximate the continuous functions and their derivatives from the discrete measurements. However, as the intent is to reconstruct an underlying continuous function, we can apply the same principles to non-pixelated signal data.
A similar method named the bootstrapped four-parameter logistic regression (4PL-B) was introduced in Lines and Basiri (2021). On the basis of class-conditional independency between the height and the signal strength, the approach used a four-parameter logistic curve fit to classified signal data. A four-parameter logistic curve is a generalised logistic curve where the asymptotes can vary from 0 and 1, respectively. The varying asymptotes allowed for inaccuracy in classifying the signal data. Once a 3D map was generated it was used to train the signal classifier and the process was repeated, hence the bootstrapping part of the algorithm. Similar to edge-detection methods, the height was considered as a function of the maximum gradient of the curve with some allowances for asymmetric diffraction effects. Key differences to the techniques proposed here are that it was fit to the classified signals as opposed to the untransformed data, and that more generalised smoothing functions were not considered. The four-parameter logistic curve has some differences to other smoothing functions that may be useful for edge detection. Firstly, it is fit using the entire data set, unlike typical edge detection methods which utilise local data to generate a smoothed function. Secondly, it imposes a bell-curve shape to the gradient. Together, these points are equivalent to the restriction that there is only one edge in the image and no other variations in the intensity. This inflexibility may avoid false-positives in detecting edges, although this needs to be tested in practice.

Experiment and results
To examine the performance of different smoothing curves with regard to their edge detection properties, we apply them to a dataset of GNSS measurements collected in several areas with different urban structures and features, including urban canyons, open sky and residential neighbourhoods. The proposed approach has been implemented and tested using data collected in several locations, however, for the sake of comparability with previous work, the common three representative environments of open sky, residential and urban canyon are discussed here. We select the three sites in Greater London which are the same sites where the dataset was collected and made accessible for Lines and Basiri (2021), allowing direct comparison with their method, which has already been benchmarked against other GNSS-based mapping approaches.

Implementation
The GNSS dataset was collected in three representative locations with diverse building heights, materials and surrounding environments: the main quadrangle of University College London (UCL), a block of flats in an urban canyon (Goodchild Road) and a Victorian-era terrace of residential housing (Hermitage Road). Data presented in this paper were collected using a Samsung Galaxy 10 smartphone running Android 10.0. Four different smartphones were also used as GNSS receivers to compare and control the receivers' sensitivity. Those devices were: a Huawei P30 running Android 9.0, Google Pixel 3 and 4 running Android 9.0 and 10.0, respectively, and another Samsung Galaxy 10 running Android 10.0. All phones are embedded with multi-constellation GNSS receivers that are able to receive all four major GNSS constellations: GPS, GLObal NAvigation Satellite System (GLONASS), Beidou and Galileo. To collect the GNSS signal data, we developed our own data collection application based on Google's GNSS Logger (Banville and van Diggelen 2016), that allowed for the manual input of location along with access to the 'raw GNSS' data: GNSS position solutions and satellite identifiers which are used to determine the direct signal path and the carrier-to-noise ratios. Locations were manually recorded using visual map matching, since inaccuracies in the position estimate can result in consequential building height estimation errors for the recorded locations that can come from map matching. The errors in position solutions are minimised as they can lead to consequential errors in calculated intersection heights.
The measurement protocol was to place the receiver in a static location at each site and record at 1 Hz frequency for around 30 min. This was repeated at different locations and times of day to allow the satellite geometries to change. Movement during recording varied: six sets were in a static location and eight sets were walking loops with up to 50 different predetermined positions where the observer paused for a brief period of around 20 s. For these locations, we manually inputted the observer location based on a visual determination against a small-scale map, with an estimated location error of below 5 m. Between 90,000 and 150,000 signals that intersected with the building floor plan were recorded at each site. Figure 4 shows the spatial patterns of the intersections, illustrating typical paths generated by the orbiting satellites and a static receiver.
Due to the relative sparsity of the dataset, we made the further dimensionality reducing assumption introduced in Section 2, of assuming the building to be LOD1, ie flat-roofed. Because of this, and because the effects of occlusion are not being considered, we are therefore able to work directly with the intersection heights as a proxy for the height of the building, rather than imposing a viewpoint from a particular direction. This reduces the interpolation to fitting univariate smoothing functions to a dataset of signal strength versus intersection heights, independent of other signal path variables.
Signals that were blocked entirely, but are known to have been above the horizon from orbit records, were recorded with a 'not available' (NA) signal strength. These missing values were imputed as 10dBHz. This appears to be the minimum value for which the receiver registers a signal based on the truncated distribution of observed values and assumes the NLOS signal distribution follows a typical Rayleigh distribution (Molisch 2011).
Four different smoothing methods were tested: four-parameter logistic curve, as described above. smoothing spline regression spline LOWESS (locally weighted scatterplot smoothing) curve with cubic polynomials used in the splines and LOWESS curve to ensure continuity of the second derivative (Hastie et al. 2009).

Results and discussion
The smoothing parameter for the smoothing spline, knots for the regression spline and span of the LOWESS curve were determined by manual inspection of the results. The four-parameter logistic curve does not require parameterisation. All curves were fit to the observations using least-squares. The smoothing spline was rejected at this stage because it seemed to be too sensitive to the smoothing parameter used and tended to either collapse into a cubic function or be insufficiently smoothed. The knots for the regression spline were spaced on the X-axis at the greater of 5 m and 10% of the data between knots. This mitigated the uneven distribution of observations at different heights and generated around three to four knots in the spline. The LOWESS curve was fit with a span of 70% of the dataset and used a tri-cube weight function.
Both the LOWESS curve and regression spline take into account a relatively large proportion of the data, compared to typical edge detection, which tend to fit to localised data. This is primarily due to the high levels of noise in the data, but also because ideally only one edge should be detected, and hence a high amount of smoothing must occur.
The results of the smoothing process applied to the dataset are shown in Figure 5. The experiment was repeated using the GNSS-position solutions to calculate intersection heights, with results shown in Figure 6. The errors in the height estimates produced by each curve are shown in Table 1, alongside the benchmark error estimates achieved in Lines and Basiri (2021). The uncertainty of the estimated height reported by LOWESS can be found in Table 2. The results show an improvement on the existing methods.
Overall, the results show that building edges can be successfully detected directly from signal strength without the need to transform the signal through classification, by generating interpolations of the signal strength intensity and identifying gradient maxima representing the building edge. This means that by using building footprints, which are relatively widely available through proprietary and open source VGI databases, one can apply the edge detection method to produce LOD1 maps since the detected edges represent the LOD1 building height. An example of this is shown in Figure 7, where the darker coloured/purple building represents the UCL building whose height has been estimated using the edge detection method.
Edge detection seems to work with a similar magnitude of accuracy to the 4PL-B algorithm. The GNSS readings, shown in Figure 4 for each site created composite images, having made the images blurred horizontally, it is possible to use the edge detection algorithm. As shown in Figure 8, the most distinct edges correspond well to the actual height. The actual height for Hermitage Road, Goodchild Road and UCL buildings are, respectively, 34 m, 47 m and 46 m. These are estimated using the proposed approach, ie horizontal blurring the GNSS readings and applying the edge       Table 1.
The LOWESS curve and regression spline generated similar interpolations, with the four-parameter logistic curve producing a different result. The differences appear due to the four-parameter logistic curve's (A) imposed asymptotically-zero gradient at boundaries and (B) restriction to one maxima. These differences lead to some advantages and disadvantages. An advantage was that the other curves tended to overfit to the sparse data at high intersection heights, and at both boundaries the other curves often showed large gradients, leading to ambiguity as to whether an edge may be present at the boundary of the data. A disadvantage of the maxima restriction is when there are false patterns of changing signal strength intensity, as this can lead to the incorrect maxima being chosen. This is particularly problematic due to the leastsquares fitting of the four-parameter logistic function which will tend to generate a curve where there is more data, even if there is a strong change of intensity elsewhere.
An incorrectly chosen maxima can be seen in the results for Goodchild Road in Figure 5, where there is a pattern of increasing signal strength at low heights. We believe this may be due to signals reflecting on nearby buildings. For urban canyons, such as the Goodchild Road environment, the false patterns may be prone to being identified because the spatial distribution of data collection may potentially lead to a number of recorded signals intersecting the building at a low height.
The shapes of the interpolations are similar when the dataset is used with the GNSS-position solutions ( Figure 6). However, the additional noise causes the peaks in the gradient to be less well-defined. This may lead to decreased success in identifying building height, with peaks being missed or false peaks identified. In these cases, the flexibility of the LOWESS curve and regression spline appear to have an advantage as the changes of intensity identifying an edge are still present around the true heights, albeit other false peaks are generated. Table 1 compares the accuracy of the investigated edge-detection techniques against a benchmark classification-based technique: the bootstrapped four-parameter logistic ('4PL-B') technique (Lines and Basiri 2021), which was shown to perform better than others. This fits the four-parameter logistic curve to classified signals and uses the results to iteratively train the signal classifier. Results are mixed: the regression spline and LOWESS curves outperform the benchmarks in Goodchild Road, which is the most challenging environment, however, the 4PL-B approach is better at estimating the height of buildings in non-urban canyon environments, such as at the UCL site.
It is unclear why the 4PL-B technique is better at estimating the height of the UCL building. However, one possible reason is the asymmetric effect of diffraction on signal intensity: signals diffracted by a building edge can still be received at up to. Five degrees of diffraction (Bradbury 2007). The 4PL-B technique accounts for this by estimating height at around half-way down the upper slope of the gradient peak, as opposed to the top. A similar amendment to the edge detection techniques would improve the UCL height estimate but worsen the estimates for other locations. Overall, the results show there are advantages, especially in more challenging environments, to using the more flexible interpolation methods over the 4PL, which can be too restrictive. It also suggests that a better interpolation function may be possible by combining elements from both types, ie a flexible function which is restricted by ensuring gradients tends to zero at boundaries. As the transformed 4PL method outperforms the 4PL method in some instances, improvements may be possible by transforming the data before applying the LOWESS curve and regression spline.
A limitation of the approach is that it requires accurate 2D maps, in the form of buildings floor plans, to initialise the approach. It is still of valuable application in the many areas where such maps are readily available, however, future work would consider extending the approach to scenarios where the footprint is not accurately known.

Sampling effectiveness
Some of the existing GNSS-based mapping methods have chosen to collect data using a crowd-sensing model, relying on GNSS data collected voluntarily via android application as in Rodrigues and Aguiar (2019). To justify large-scale data collection, however, it is important to understand the relationship between the quantity of data collected and the resulting 3D maps. To do this, the following section introduces metrics to meaningfully quantify the amount of data collected and thereafter, we investigate the relationship between data size and data collection strategy and the quality of the resulting dataset and building height estimates.

Data size metrics
The proposed edge detection approach heavily relies on GNSS data, particularly the observations that pass close to a building's edge. This implies that the value of a GNSS-mapping data set is defined specific to an individual building and varies between buildings. Of course, the definition can then be extended across multiple buildings by combining those specific values. Two natural metrics can be considered: an Intersection metric, which measures the number of observations in a data set that pass within a certain distance of the building edges; and a Window metric, which measures the distance from the building edges needed in order to include a certain number of observations. Given a set of observations D and building B, these metrics can be formally stated as: where v is the indicator function. For a given D and B, I d and W n are monotonic functions in d and m, respectively, ie I m > I n () m > n, and W m > W n () m > n: Furthermore, I 1 is the size of D.

Sampling
The proposed 3D mapping technique takes the building footprint from existing 2D maps and uses GNSS signals to determine building height. Here the datasets are filtered to only include signals that intersect the footprint of the chosen buildings. After filtering, the total number of observations in each dataset ranges from 109,467 at Goodchild Road to 245,619 at UCL. To create datasets of varying sizes, each location's dataset was repeatedly sampled; 350,000 different datasets were created with a size distribution shown in Figure 9.
To estimate the building heights, a four-parameter logistic curve was applied to signal strengths. A four-parameter logistic curve is a generalised logistic curve where the asymptotes can vary from 0 and 1 (Healy 1972). Its equation is as follows: where a and d are the upper and lower asymptotes, respectively, and b and c are the standard linear coefficients in x. This algorithm directly identifies building edges. It works by taking a dataset of (signal strength, intersection height) pairs and generating signal strength as a smooth function of height by fitting a four-parameter logistic curve using ordinary least-squares. The height is predicted as c À 1:5=b, (the curve's inflection point plus circa half the curve's width), to correct for diffraction effects (Lines and Basiri 2021).

Results
Six different intersection metrics (I 1 ,I 2 ,I 4 ,I 8 ,I 16 and I 1 , with the distance measured in metres) and four different window metrics (W 1 ,W 10 ,W 100 ,W 1000 ) were calculated for each sample. The intersection metric and the window metric have an inverse relationship ie there is a linear relationship between I d and 1=W n , as shown in Figure 10. A pairwise Pearson's correlation matrix is generated to quantify the relationship between the intersection metrics and the inverse of the window metrics, as shown in Figure 11. All the intersection metrics are strongly positively correlated, with the correlation increasing steadily as the distance parameters of the metrics become closer. The W 1 metric is an outlier, showing little correlation to either the other window metrics or the intersection metrics. The other window metrics are almost perfectly  correlated with each other and highly correlated to the intersection metrics with a small distance parameter eg I 1 .
On this basis, the intersection metrics I 2 , I 8 and I 1 were chosen as typical sample size metrics to categorise a dataset. For each location and measured sample size, the root mean square error (RMSE) of the height estimates was calculated and a smooth trend line was produced using a LOWESS function, as shown in Figure 12. The trend lines show that the error of the height estimate decreases exponentially with dataset size, but rapidly reaches a stable error beyond which additional data does not improve the algorithm's estimate. This is achieved at a relatively small sample size, ie between 10 and 100 samples consistently, ie in any of the metrics. For the collected samples, the high correlation between metrics makes it difficult to determine whether a particular metric is more closely related to algorithm performance or not. The correlation seems to be partly due to the closeness of the observer to the building. Closer the location of the receiver to the building can result in a larger proportion of observations become at low heights as most of the satellites are at a low altitude. If observations further from the building were used, it is possible the I 1 metric in particular would decorrelate. The relationship between distance and usefulness of observation is examined in the next section.

Spatiotemporal location and collection efficiency
A GNSS observation, taken at a location and time, will contribute to the accuracy of a GNSS-based 3D-mapping algorithm if the rays to the GNSS satellites pass close to the edge of a building, increasing the sample size as measured by the sample size metrics defined in section 4.1. Because GNSS satellites orbit the earth in predetermined patterns, the usefulness of a location and time combination for GNSS-based 3D-mapping is predetermined by the building's shape. As the building shape is not known in advance, this does not give precise guidance on when and where an observation should be taken, however, key determinants of observation efficiency can still be determined.
A simulation was implemented as follows. First, a square building with sides of 10 m and height of 10 m was generated on an otherwise uniform plane with a hypothetical location of British National Grid 528000 Northing, 183000 Easting, which is in central London. The location only affects the relative position of satellite orbits. Next, a uniformly random process in space and time was used to generate over 10 million observer locations within a 100-m square bounding box centred at the building in the 24-h period of 1 January 2021. The date and location have no significance. Lastly, GNSS orbit data was used to reconstruct the satellite positions at each moment for each of the main four GNSS constellations of GPS, GLONASS, Beidou and Galileo, allowing the satellite signal paths to be calculated.
Only 7% of observations are useful with regard to the coarsest metric I 1 , and the number of useful observations decreases as the metric parameter decreases, as shown in Table 3. As a reminder, this is a slight abuse of notation as we are only considering vertical distance, and requiring all signals to intersect the given building floor plan.  We first consider sample size under the I 1 metric as this is independent of the building height. There is a strong connection between distance and the probability of an observation contributing to the I 1 metric. A logistic regression on the simulated data, shows the probability of an observation contributing is related to the distance of the observation is as follows: LogOdds ¼ 0:1288 ð0:120 À 0:138Þ À 0:846 ð0:843 À 0:849Þ log ð1 þ distanceÞ (6) where the bracketed figures are the 95% confidence intervals, and 1 is added to distance to avoid taking the logarithm of 0. The logistic model seems to represent a strong explanatory power. This is illustrated in Figure 13. As it is illustrated in Figure 14 there are explainable spatial correlations to the residuals. The intuitive interpretation of the spatial distribution of residuals can confirm the result shown in Figure 14. An observer standing near the corner of a building is less likely to have rays that intersect it versus standing along a wall. Other patterns reflect the relative orbit positions of satellites (to the South-East and South-West) with respect to the hypothetical Northern hemisphere location. These results relate to the simulated 10 m square building, and thus should be scaled proportionately with building size.
The I 1 metric does not take into account the height at which a ray intersects a building, however, other metrics do. The majority of observations have a low elevation, causing the intersection height to be a small fraction of their distance from the building, as shown in Figure 15. Hence for a taller building, as the distance from a building increases the more likely it is that an intersection is close to the building edge. This effect combines with the opposite effect of decreasing the likelihood of intersection at any height with distance, leading to lower probabilities of useful observations at either extreme of distance, with a maximum likelihood obtained somewhere in the middle. The relationship depends on the height of the building, as shown in Figure 16.
For taller buildings, the best distance for observations is further away, and the maximum likelihood obtained is lower, as it is always bounded above by the monotonically decreasing I 1 metric. Excluding the I 1 metric as a special case, different Figure 13. Logistic regression and binned empirical proportions for the probability of an observation contributing to I 1 metric against distance from the building.
intersection metrics vary with distance in a practically identical way, but with a proportionately lower probability as the metric parameter decreases.

Conclusion
GNSS-based 3D mapping could be a low-cost alternative for producing large-scale 3D maps. This would be of particular benefit in areas unable to produce 3D maps due to the prohibitive cost of existing technologies. These maps are a key requirement for improved location-based services, in particular navigation and positioning. Existing methods to create GNSS-based 3D maps have been limited by their reliance on signal classification techniques. This paper demonstrates the ability to generate 3D maps directly from GNSS signals by creating composite images and extracting low-level features, without requiring  classification. The methods introduced in this paper appear to be more robust than existing approaches to identifying building heights in challenging environments and with typical position uncertainty. The paper also demonstrates that the amount of data required by the edge detection method is low, provided the data are strategically collected. The insight from sampling effectiveness can be used for crowdsourcing mobile app design and incentivising volunteers to collect data where it is most advantageous, for example, for taller buildings the best distance for making observations is further away.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
The authors received the support from the UK Research and Innovation Future Leaders Fellowship "Indicative Data: Extracting 3D Models of Cities from Unavailability and Degradation of Global Navigation Satellite Systems (GNSS)", grant number MR/S01795X/2, the Royal Society Research Grant, grant reference RSG/R1/180396.

Notes on contributors
Anahid Basiri is a Professor of Geospatial Data Science, a UKRI Future Leaders Fellow, and the Director of the Centre for Data Science and Artificial Intelligence at the University of Glasgow. Anahid contributed to the ideation, conceptualising, developing of the idea and performing analysis. Anahid drafted and submitted the manuscript and revised the paper.
Terence Lines is a researcher in the field of urban analytics, with specific expertise in real estate investment and spatial data science. He holds master's degrees from UCL and the University of Cambridge. He works in the industry investigating data-led and quantitative approaches to real estate investment. As a research assistant on the project, he led the design and development of the experiment and analysis and contributed to the data collection and interpretation of results.
Miguel Fidel Pereira is a PhD student at the University of Glasgow, where he studies positioning and navigation within the Geospatial Data Science research group. He received a BSc in Mechatronics Engineering from the University of Cape Town and an MSc in Applied Computational Science from Imperial College London. Miguel assisted with revising the literature review of the manuscript. Data and codes availability statement