Voxelization algorithms for geospatial applications

Graphical abstract


Introduction
We present three methods for voxelating point, curve, and surface objects. For curve (1D) and surface (2D) objects we present algorithms for the methods mathematically described by Laine [1] and for voxelizing points (0D) we devise our own algorithms. Note that 0D, 1D, and 2D refer to the topological dimension of inputs; this, however does not contradict the fact that all inputs are embedded in three-dimensional Euclidean space of R 3 , i.e. a Cartesian product of X, Y and Z coordinates represented in the domain of real numbers R. Full scripts of algorithms are available in the abovementioned repository. Laine [1] mathematically proves that using the so-called intersection targets desired connectivity levels such as 6 or 26 connected voxel results could be achieved. He does not present an algorithm in detail though. In developing an algorithm for dealing with large datasets, we had to come up with an efficient way of iteration. In general, one can either iterate over all possible environmental modelling studies while ensuring they are spatially correct, meaning they correctly represent topological and semantic relations among objects. In this article, we present algorithms that generate voxels (volumetric pixels) out of point cloud, curve, or surface objects. The algorithms for voxelization of surfaces and curves are a customization of the topological voxelization approach [1]; we additionally provide an extension of this method for voxelization of point clouds. The developed software has the following advantages: It provides easy management of connectivity levels in the resulting voxels. It is not dependant on any external library except for primitive types and constructs; therefore, it is easy to integrate them in any application. One of the algorithms is implemented in C++ and C for platform independence and efficiency. ß 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http:// creativecommons.org/licenses/by/4.0/). voxels in a bounding box or iterate over objects. It sounds obvious that iterating over objects is more efficient as the whole space is not usually filled with objects. This approach can also lead to inefficiency when it comes to a large mesh object such as a TIN (Triangular Irregular Network) terrain model. However, this issue can be resolved if all objects are decomposed into primitives such as triangles (in case of surface input) or line segments (in case of curve inputs). In such cases the algorithms can iterate over triangles or line segments. In our implementation of the topological voxelization method, we have chosen four intersection targets from those defined by Laine, namely two hairline targets for surface inputs and two mesh targets for curve inputs as shown in Table 1. In the following section we give an overview of topological voxelization.

Topological voxelization in brief
Laine [1] mathematically proves that using the so-called ''intersection targets'' desired connectivity levels such as 6 or 26 connected voxel results could be achieved. The idea can be best understood by asking the following question based on the definitions of connectivity given below: 6-Connected Voxel Collection: a voxel collection in which every voxel has at least one faceneighbour, i.e. by virtue of having adjacent faces. 18-Connected Voxel Collection: a voxel collection in which every voxel has at least one edgeneighbour, i.e. by virtue of having adjacent edges. 26-Connected Voxel Collection: a voxel collection which every voxel has at least one vertexneighbour, i.e. by virtue of having adjacent vertices.
Question 1: given a curve (1D manifold input, as vector data model) how can we obtain a 'thin' voxelated curve (i.e. a voxel collection as a raster representing the curve in question without unnecessary voxels) that is 6 connected or 26 connected?
Question 2: given a surface (2D manifold input, as vector data model) how can we obtain a 'thin' voxelated surface (i.e. a voxel collection as a raster representing the surface in question without unnecessary voxels) that is 6 connected or 26 connected?
Formally, considering adjacency between voxels in a voxel collection can be in the sense of faceadjacency, edge-adjacency or vertex adjacency. Analogous to these definitions, we can mention edgeadjacency or vertex adjacency between pixels which correspond to 4-neighbourhoods (a.k.a. von Newman neighbourhoods 1 ) and 8-neighbourhoods (a.k.a. Moore neighbourhoods 2 ) respectively. These neighbourhood definitions are mostly known in definition of Cellular Automata 3 (CA) models. The term 'connected' refers to the graph that represents the adjacencies between voxels or pixels with the given definition of adjacency. It is easier to conceive of the principle of guaranteeing connectivity in raster (pixel or voxel) output by looking at the 2D case.
(saleable) 26-Connected Intersection Target: 3 plane faces cutting a voxel cube in halves along X,Y and Z, meeting at the centre of the voxel cube.
(saleable) 26-Connected Intersection Target: A 3D crosshair centred at a voxel cube made of lines going through face centres along X,Y and Z axes.

What is an intersection target?
The fundamental idea behind using Intersection Targets is based on what is called Poincare Duality Theorem [2] similar to the approach of [3,4]), which establishes a primal-dual pairing between 'primal' kdimensional features and their (n-k)-dimensional 'duals' in an n-dimensional space ðR n Þ, within which these objects are 'embedded'. In 3D space, we can think of dualities as shown in Table 4. For a clearer picture we also show potential dual pairs in 1D and 2D spaces respectively in Tables 2 and 3 an Intersection Target is a dual feature of dimension n-k (3-k in case of the 3D space of R 3 ) that must be intersected with the input feature (primal) to determine whether a voxel should be added to the voxelated output object or not. A well-known example of duality in 2D maps is the duality between a Delaunay triangulation and a Voronoi tessellation.
An example of dual relationships in 3D is shown in Fig. 2. A face in the left image can be considered as the element through which two 3D cells are connected; this is why representing the same face with an edge in the dual graph makes sense as it connects two vertices representing the respective 3D cells (Fig. 2).
What is special about topological approach over alternatives?
In many geospatial applications it is important to preserve topological relations among objects [5,6]. This set of topological relations is used for many purposes such as to analyze relationships between geographical objects [7], to ensure validity of geospatial datasets [8], or to construct 'graphs', as in navigation and path planning [9]. In the raster domain, connectivity levels in the output voxels of a voxelation process are essentially topological properties, which are best handled through an explicitly topological approach. The topological approach brings added elegance and clarity to the voxelization process and allows for obtaining desired connectivity levels in a systematic way.
In the topological approach to voxelization [9] the idea is that if we are to decide whether a voxel 'needs to be' in the output (so as to ensure preserving the topological properties of the input features); we need to ensure its relevant Intersection Target intersects with the input. The nature of these Intersection Targets is dual to the nature of the primal inputs. This way, there will be no doubt on the necessity of having a voxel in the output; i.e. the results are guaranteed to be 'minimally thin' as to representing a raster version of the original vector features. We refer the reader to the mathematical proofs given in [1]. Here we give an intuitive explanation for the 2D example shown in Fig. 1. This figure shows a 2D version of topological rasterization (pixelization) of a 1D input curve. If we want to ensure that every necessary pixel is added to the results, we need to focus on the connectivity level desired (4 or 8 neighbours for each pixel). If we want our pixelated curve to be a 4-Connected path so as to best represent the connected underlying curve, then we need a determining factor for including a pixel in the outcome. Let us figuratively imagine an ant going through that curve somewhere, we want to know from which spaces (pixels) it has passed; and that we want to reconstruct its path as a sequence of interconnected pixels sharing a wall with one another. If the ant in question crosses a wall-edge in one pixel it definitely enters to a 4-neighbouring cell. Therefore, checking the crossings of the path curve with the pixel boundaries would be enough to decide for including a pixel in the rasterized path.
The prevalent alternative approaches, which are quite common as to their efficiency can be seen as variants of Scan Conversion [10], which works by interacting rays in 3D directions passing through objects. If all voxels inside are needed then voxels coinciding with intersection points between odd and even intersections will be added to the rasterized (voxelated) output. In our approach this   operation corresponds to finding those 0D voxel centre points which happen to be intersecting with the 3D volume (inside or on the volume). 4 Observe that the last pair of dual features in Table 4 actually corresponds to voxelating a volume in this way. The topological approach can also eventually be adapted to be implemented based on rays, so as to make it more efficient, but that subject would fall out of scope of the current paper. We can simply say that such a scaling approach would be based on considering Meta Intersection Targets such as rays for meshes and half-planes for curves and finding intersection parameters along these Meta Intersection Targets to locate meeting voxels.

Point voxelization (0D inputs, 3D targets)
In this section the voxelization algorithm for 0D data inputs, i.e. point clouds, is described in detail. The algorithm has been devised and implemented in C# and its implementation is available in our GitHub repository. Its pseudo code is in Algorithm 1.
In the initialization phase, the algorithm creates a bounding box for R 3 coordinates that is larger than or equal to the size of the bounding box of the point cloud in R 3 (1.a, 1.c). It then finds how many voxels could be in each direction by finding the floor integer closest to the size of bounding box in each direction divided by the voxel size in the corresponding direction (1.e, 1.f, 1.g). It then initializes a 3dimensional array of Boolean (1 bit) values of respective X, Y, and Z sizes plus one (i.e. minimal in size for it only stores bits). This array will be used to keep track of voxels already visited as tuples of [i,j,k] in R 3 .
If not yet visited, it marks it as visited (true). It then continues by 'embedding' the voxel in R 3 , that is through mapping the [i,j,k] voxel in R 3 to R 3 by first creating a point of respective voxel size and then shifting it first for half of the voxel size vector and then for the point at the minimum corner of the R 3 (2.g,2.h) bounding box.   iii. Voxelout = Voxelout + VSize / 2 + Zbbox.Min; // change its reference as to the Z iv. VoxelsOut.Append(VoxelOut); //and add it to the voxel output 3. return voxels The ZBBox algorithm adjusts a bounding box in R 3 to a larger or equal size bounding box in R 3 . Its pseudo code, described in Algorithm 2, uses MinBoundRP_to_ZP and MaxBoundRP_to_ZP to ensure the new bounding box contains the R 3 bounding box. The time complexity of the algorithm is linear, i.e., O(n) where n is the number of points in the point cloud.  Fig. 3. This point cloud is consisted of 136,828 points as X, Y, and Z coordinate tuples, we have provided this point cloud in our GitHub repository. Fig. 5 depicts voxels with higher resolution than those shown in Fig. 4.

Curve voxelization (1D inputs, 2D targets)
According to the topological voxelization approach [1] if an effectively one dimensional input such as a curve is intersected with voxels replaced by relevant intersection targets  Table 1, then the resulting set of voxels is guaranteed to have the desired connectivity level, namely 6 or 26. The idea is to construct intersection targets as triangle meshes and then: 1. form a bounding box for the curve in question; 2. adjust the bounding box to ensure its size is an integer multiple of the voxel size and that it covers the whole curve; 3. fill in the bounding box with voxels; find out voxels which potentially intersect with the curve in question, i.e. those closer than half of the size of a virtual vector representing the diagonal of a voxel cube; 4. for each voxel that is near enough to be possibly included in the result, find out if its relevant connectivity target intersects with the curve in question; if yes then add it to the voxels.
Here we explain our voxelization algorithm for 1D data inputs, i.e. lines, polylines or curves that can be approximated as such. 5
The function 1D_26_intersectionTarge described by the pseudo code in Algorithm 4 creates mesh objects that ensure 26 connectivity if used as intersection target for voxelizing 1D input.  The above algorithm is currently implemented such that it produces 12 quadrangular faces and thus 24 triangles. It will be therefore more efficient to implement the above function instead.
The function 1D_6_intersectionTarge described by the pseudo code in Algorithm 5 creates 1D mesh ensuring 6-connectivity. Note that this intersection target produces 12 triangles and that this number cannot be reduced, even if the other alternative for a 6-connected result were implemented (i.e. the outline faces of a voxel cube). This implies that if we want to create a better-connected raster with more voxels (6-connected), then we need to do more computation.

Algorithm 5 1D_6_intersectionTarget.
function: 1D_6_intersectionTarget input: voxel size as Vector3d output: an intersection target for 1D input ensuring 6 connectivity as a mesh object as in the figure below [ ( F i g . _ 5 ) T D $ F I G ] 12 triangles. It is important to note that for solving intersections unambiguously an intersection between a ray or a half-line and a triangle should be found. This means that in practice the diagonal target will 12/8 = 1.5 times faster than a corresponding cube outline target. This is not the final conclusion on efficiency however. As for implementing Meta Inetrsection Targets, straight faces would be advantageous as they can be eventually replaced by finding intersections to half-planes (Fig. 7).
Note that every line in the voxelized curve network in Fig. 6 is 6-connected but the whole network is not. Ensuring such connectivity for the network would require extra measures and techniques.

Surface voxelization (2D inputs, 1D targets)
Function voxelizeSurface is used for topological voxelization of 2D surfaces, which are represented as TIN or triangular polygon mesh objects. Note that other surfaces such as NURBS surfaces are also approximated as such for rendering purposes and so they can be the input of this method. The algorithm works by iterating over triangle faces of the surface in question by checking whether they intersect with the relevant intersection target of each voxel that could possibly be in resulted 3D raster. The points that possibly intersect with an intersection target are those whose distance to the input object is less than half of the length of the voxel size vector, i.e. a virtual vector comprised of the voxel size in X, Y and Z directions. Intersection between a connectivity target and the input surface is eventually determined by checking intersections between individual line segments and triangles. A standard algorithm for determining whether a triangle and a line intersect has been used for this purpose (see Curve Voxelization (1D Inputs, 2D Targets)).
Users can choose a level of connectivity (6 or 26), because of which a corresponding connectivity target will be chosen here to produce appropriate results. The pseudo code is described in Algorithm 6. and Mesh, which can be replaced straightforwardly. An example code without such dependencies is the surface voxelization code provided in the repository. Codes dependant on Rhinocommon can be complied and tested in an environment such as Grasshopper3D 10 also developed by McNeel.
As can be seen in curve and surface voxelization algorithms, we are currently forming a 3D grid full of voxels for a bounding box of the input object. This can be problematic in case this bounding box is big. In case of surfaces, it is more efficient to iterate over individual triangle faces to avoid this problem. Similarly, it will be more efficient to iterate over individual line segments in a polyline approximating the input curve compared to iterating over curve objects directly. 11 This, however, means that there will be many duplicate voxels in the raster output. To avoid this problem, a 3D raster data model [ ( F i g . _ 8 ) T D $ F I G ] [ ( F i g . _ 9 ) T D $ F I G ] Fig. 9. A surface (triangular polygon mesh) voxelized with connectivity level 6. 10 http://www.grasshopper3d.com/. 11 This is not the case in our current implementation now. We are iterating over 'generic curve objects' and use a function of Rhinocommon to intersect a curve and a mesh. However, deep inside, this function is also based on intersecting line segments and triangles. Note that free-form curve objects such as NURBS curves are rendered as polylines with fine line segments. similar to the model implemented in the voxelization algorithm for point clouds needs to be implemented to keep track of voxels already created and to ensure there will not be duplicates. Such a 3D raster data model will be possibly very space efficient as it should only store one bit per voxel. Such an implementation then will bring other advantages such as the ease of Boolean operations on multiple 3D raster objects such as Boolean operations (union, intersection, etc.) and mathematical morphology operations (erosion, dilation, gradient, etc.) as it will only involve bitwise operations, provided that every two 3D raster models are defined in reference to the same bounding box so that their R 3 representations are compatible. Then such a requirement imposes another requirement that regards the existence of many zeros in any such 3D raster data model. Consider a house model voxelized in reference to a bounding box over the whole neighbourhood or city. It is obvious that most of the voxels represented in the 3D array of bits are zero. This suggests implementation of a data structure like sparse matrix so that such 3D raster models can be compressed in memory.
What is interesting about a binary representation of a voxel collection (as 3D raster) is that the same way 2D raster data models (2D images) can be processed and analyzed, 3D raster models (3D images if you like) can be processed and analyzed. What is common in both cases is the binary representation of coloured picture elements (pixels) or coloured volume elements (voxels). In short, we can enter a completely new world of possibilities inspired by digital image processing, merely by representing voxel grids as Boolean tensors (3D matrices).