Acoustic Sensor Planning for Gunshot Location in National Parks: A Pareto Front Approach

In this paper, we propose a solution for gunshot location in national parks. In Spain there are agencies such as SEPRONA that fight against poaching with considerable success. The DiANa project, which is endorsed by Cabaneros National Park and the SEPRONA service, proposes a system to automatically detect and locate gunshots. This work presents its technical aspects related to network design and planning. The system consists of a network of acoustic sensors that locate gunshots by hyperbolic multi-lateration estimation. The differences in sound time arrivals allow the computation of a low error estimator of gunshot location. The accuracy of this method depends on tight sensor clock synchronization, which an ad-hoc time synchronization protocol provides. On the other hand, since the areas under surveillance are wide, and electric power is scarce, it is necessary to maximize detection coverage and minimize system cost at the same time. Therefore, sensor network planning has two targets, i.e., coverage and cost. We model planning as an unconstrained problem with two objective functions. We determine a set of candidate solutions of interest by combining a derivative-free descent method we have recently proposed with a Pareto front approach. The results are clearly superior to random seeding in a realistic simulation scenario.

The hyperbolic location method [17] minimizes an error measure that is a nonlinear function of the potential source location. This approach is scalable, since location accuracy increases with the number of nodes that detect the gunshot (see Section 5.). Its domain can be a plane or a three-dimensional space.
TDoA-based ranging techniques require accurate clock synchronization. Every sensor knows its own position exactly, and it records the arrival time of the sound event. So, the sensor clocks must be as tightly synchronized as possible, using dedicated time-synchronization algorithms. Since a gunshot location network must scale to large sizes, it is necessary to minimize the number of exchanged messages to attain convergence, keeping energy consumption at reasonable levels if the electric grid is not available. Selecting a synchronization schema for a real application like ours is not easy. The best known synchronization schemas implement network mechanisms to adjust all local clocks to the same value. This is achieved by exchanging time stamps between node pairs. The more frequent the exchanges, the higher the time accuracy. Two representative examples are Reference Broadcast Synchronization (RBS) [18] and the Timing-Sync Protocol for sensor Networks (TSPN) [19]. We discarded GPS receivers due to their high cost. In Section 2.3. we propose a new ad-hoc flood method to set the clock times in every network node to the same value. In this method, the nodes do not exchange synchronization messages, and thus they save power. Once a node detects a gunshot, the time of the event is transmitted to the sink node through a previously generated path. The method performs a cooperative backward time adjustment, so that every node along the path is able to estimate the event time.
Regarding network planning, since the areas under surveillance are wide, and electric power is seldom available, it is necessary to both maximize detection coverage and minimize system cost. Therefore, we model sensor network planning as an unconstrained problem with two objective functions. We provide a set of candidate solutions of interest by combining a derivative-free descent method we have recently proposed and a Pareto front approach.
Due to the inherent difficulties exhibited by the thus far formulated models in sensor network planning, several heuristic optimization strategies have been proposed in the literature: variants of simulated annealing [20], genetic algorithms [21], gradient descent (when applicable) [22] and others [23,24].
Some of these approaches (simulated annealing, genetic algorithms and the like) do not guarantee theoretical convergence. Regarding gradient descent methods, the gradient is often unavailable or too costly to compute. Therefore, we have adapted a non-monotone derivative-free optimization technique with guaranteed convergence [25] to formulate and solve a computationally efficient optimization model with a dual objective: maximization of acoustic network coverage and minimization of power infrastructure cost. The results are presented as a Pareto front, revealing solutions that are clearly superior to random seeding.
Our notation is as follows: Lower case Greek letters are scalars, lower case Latin letters are vectors in IR 2 , x j k is the j-th component of the vector x k , and ||x|| is the Euclidean norm. A capital Latin letter, say S, stands for a collection of vectors in IR 2 ; if S = {s 1 , . . . , s p } we also say that S ∈ IR 2p ; τ S = {τ s 1 , . . . , τ s p }, and the sum Z = S + D means that S and D have the same number of elements, say p, and z k ∈ Z if and only if ∃k ∈ {1, . . . , p} | z k = s k + d k . In general the subindex i is the value of an entity (scalar, vector, set, and so forth) at the i-th iteration of an algorithm; for instance S i = {s i1 , . . . , s ip } is a set of p vectors in IR 2 , at the i-th iteration.
The rest of the paper is organized as follows: Section 2. describes the sound location and node synchronization procedures. In Section 3. we present the optimization model to plan the acoustic sensor network. In Section 4. we verify that the problem fulfills the conditions in [26] to apply the advanced derivative-free algorithms in [25]. In Section 5. we perform numerical tests to evaluate the approach. Finally, Section 6. concludes the article.

Gunshot Location
In this section we describe the location procedure for acoustic events, which we have implemented for MicaZ motes. The goal of our system is gunshot location by means of a sensor network. Location is based on hyperbolic positioning [17]. Hyperbolic positioning requires the sensor clocks to be synchronized, in order to apply the TDoA technique. For this reason we have implemented a synchronization protocol in the MICAz motes. Next, we describe the system architecture, the location method and, finally, the synchronization schema. Figure 2 shows the system architecture, with three components:

System architecture
• Sensor nodes: The sensor nodes in known positions are equipped with the necessary hardware for the detection of acoustic events. They can discriminate between normal and shot segment classes in audio streams. When a sensor node detects a sound event, it transmits a packet with information about the type of sound event and a sound timestamp to a special node, the sink.
• Sink nodes: Sink nodes collect the packets sent by sensor nodes and deliver them to the GIS server to calculate the position of the sound event. Sink nodes may be sensing nodes as well.
• GIS server: Using the information from the sink nodes, the GIS server estimates the position of the acoustic event by means of a hyperbolic method, described in Section 2.2..

Source location procedure
Since our optimization model (see Section 3.) considers a flat landscape with scattered trees, we have selected a two-dimensional hyperbolic positioning algorithm. Nevertheless, it could be easily extended to three-dimensional location in rough scenarios.
Let us consider an acoustic event that takes place at an unknown position x ∈ IR 2 , which we wish to determine. Formally, all sensing node locations s = (s 1 , s 2 ) belong to a well-defined compact set X ⊂ IR 2 . We denote the Euclidean distance δ(s, x) between a sensor location s and another point x in the scenario, s, x ∈ X, by Let us also assume that a subset B ⊂ X of the nodes have detected the event, and that the position of those nodes is known. Let us denote the sound propagation speed as υ. Then, the time of arrival t b of the event at any node b ∈ B is: and the difference between the arrival times of the event to a pair of nodes s b and s b is: This expression defines hyperboles, whose intersection determines the source of the shot. Figure 3 shows this process graphically.
For two given receiver locations, s b and s b , a set of emitter locations would yield the same TDoA measurement. This is the locus of possible emitter locations and it describes a hyperbole. If we now consider a receiver at a third location s b , it provides a second TDoA measurement, and, hence, allows the location of the emitter on a second hyperbole. In general, a set of distances from the source to every sensor pair identifies a set of hyperboles. As a consequence, the location of the source is the intersection of this set of hyperboles. In Section 5., our tests show that location accuracy increases with the number of gunshot detection nodes. In most trials, the accuracy in scenarios with three detection nodes or more is sufficient to find the acoustic source (the detailed TDoA location technique is described in [17]). Although non-linear expression (3) has a unique solution if there are enough hyperboles, there is some uncertainty in the calculation since: • The speed of sound varies depending on altitude, humidity and air temperature. As we have mentioned, multi-path propagation affects the accuracy of acoustic signal detection. Single spread-spectrum techniques such as those in [27] largely mitigate it.
• The microphone directionality or polar pattern affects the result.
• The clock drift may drastically vary in time due to environmental temperature and humidity changes. In Section 2.3. we propose an approach to reduce sensor clock deviations.
Due to these inaccuracies, expression (3) may be inconsistent. Nevertheless, low-error estimations are possible. A nonlinear optimization problem can be formulated [28] to minimize the difference between estimated and real positions. We minimize the square error of the location, defined as the differences between the squares of the theoretical and measured differential arrival times to a reference node s b : The least square minimization problem in (4) is not convex. Thus, standard optimization algorithms, such as incremental gradient, are not guaranteed to converge to the global minimum. The initial conditions in an iterative algorithm may lead to a local optimum or a saddle point at the termination, adding imprecision to gunshot location. The required computing power is moderate, and the GIS server ( Figure 2) can handle the calculations. The solutions can be obtained practically in real time (see Section 5.). Our synchronization algorithm (see Section 2.3.) also contributes to the location procedure, since it is required for the sink node to compute the TDoA between the detectors. The tests with our reference implementation in MICAz motes reveal that, the more sensors that detect the sound event, the greater the location precision (see Section 5.). This is due to error compensation in arrival time measures. In conclusion, a high density of sensors around a sound event always improves location precision. Therefore, optimal sensor positioning improves the performance of the location system.

Synchronization schema
Time synchronization is a fundamental aspect in distributed sensor networks. In the proposed shot detection system, the differences in arrival times can only be computed if the nodes are tightly synchronized. To adjust the clocks, the nodes must exchange messages indicating the time reference. In fact, time adjustment degrades progressively due to clock drifts, and it is indeed mandatory to readjust it periodically (by sending new messages). However, power consumption is higher if synchronization packets are continuously transmitted. Even if a continuous energy source is available, it may be necessary to extend the network with autonomous nodes at its edges. Therefore, the number of exchanged messages should be as low possible, for a given accuracy goal. For gunshot location purposes, if we assume a maximum error of few meters, the maximum allowed error in time synchronization is in the order of a tenth of a second (d e /υ, where d e is the allowed spatial error). For example, three milliseconds of clock drift will cause an estimated error of one meter, relative to the real source position. Thus, fine-grained clock synchronization is mandatory. The tests with our synchronization algorithm show a time accuracy of tenths of microseconds. The implementation has been carried out with TinyOS in MICAz devices.
The synchronization algorithm has two steps: 1. Level discovery: This step is similar to the level discovery stage in TPSN [19]. Before the synchronization process takes place, the network has to organize itself as a hierarchical tree, beginning at a root node (in our case we choose the sink). According to the minimum number of hops to the sink, a level is assigned to each node (level 0 to the root). To compute the tree, the process starts at the root, broadcasting a level discovery packet to the nodes at level 0. The nodes that receive this packet are marked as children of the root node, and they set their level to 1. The nodes ignore further level discovery packets with greater or equal level numbers. Then, level 1 nodes broadcast their level discovery packets, and so on. Note that this process also permits discover of optimal communication paths (in number of hops) to the root, and, thus, it is valid for network routing.
2. Synchronization: Once the hierarchical network structure is completed, the synchronization process may start. In general, level k nodes synchronize their children (of level k + 1).
Besides its own local clock, a sensor node will maintain an estimation of its synchronizer node clock in the upper hierarchical level. The approximation consists of calculating the regression line of those two clocks. Previously, the level k node receives several synchronized time-stamps of level k − 1 (see Figure 4), which are broadcast following the tree structure that was created at the level discovery step. Figure 4 shows the regression line used to calculate the parent node clock in a level k node. Value α k represents the clock offset at reference time t = 0, and the slope β k is the rate of change (clock drift) of the local clock.
Once a node detects a gunshot, it sends the event to its parent node in the upper level, according to the parent time clock. After one or more hops, level 0 (sink node) will receive estimations of the detection time that are synchronized with the sink clock, from one or more level 1 nodes. This way, local clock exchanges do not spend power. Since clock drift varies slowly, the regression line must only be calculated every 6 or 8 hours, according to our tests with MicaZ motes. Only large temperature variations affect the regression line slope, requiring node re-synchronization.

Optimization Model
The sound spectrum of a gunshot is dominated by the 130 Hz to 3 kHz frequency range [29]. According to [30], at these frequencies there is an extra attenuation of 3 dB each ∼25 m in woods, yielding 12 dB vs. 9 dB in open space. Given the sensitivity threshold of the sensing node, its maximum ranges are 1 Km in open space and 750 m in a wood.
We consider a large outdoor flat scenario with open space areas and wood patches. The scenario is crossed by a few power lines. In it, we wish to deploy a given number of fixed sensing nodes, so that the detection coverage is maximum. A point gets covered if it is reachable by one sensing node at least, although this capability increases if more sensing nodes see the point (to achieve source signal location we require a coverage of at least three nodes, as explained in Section 2.2.).
At the same time, we wish to minimize the distance between the sensing nodes and the power lines so that the cost of the power infrastructure is minimal in case the nodes are not autonomous. The nodes communicate through the electric grid itself, so transmission coverage is not an issue of interest.
This scenario clearly prevents an optimal educated guess, specially when only a small number of sensing nodes is available.
• ω 2 (s, x) corresponds to propagation distance through open space.
Given a sensor location s ∈ X and a point x ∈ X we say that x is visible from sensor s if: Let S = {s 1 . . . , s p }, s k ∈ X, k = 1, . . . , p, be the positions of p acoustic sensors on X. We denote by V (x) the set of all sensors s ∈ S that are visible from the location x. We also define an arbitrary grid G ⊆ X. Objective function f 1 , which measures acoustic sensor coverage, is defined as follows: This function penalizes grid points that see fewer than three sensors (a gunshot in those points cannot be located with highest precision), and gives a small bonus to grid points that see more than three sensors (the minimum number of sensors for highest precision location).
A second objective function f 2 measures the cost of the sensor deployment. As in the case of sensor coverage, there are many ways to model this. In this paper we assume that the cost of a sensor unit is negligible compared to the cost of a permanent power line. As we previously said, the scenario is crossed by m power lines. We define: where δ i (s) is the Euclidean distance between s and the i-th power line, i.e., between s and the point in that line that is closest to s in Euclidean distance.
Our ultimate task is to place p sensors on X in such a way that the coverage f 1 on X is maximized and the cost f 2 on X is minimized. Clearly, these objectives are contradictory. Minimizing −f 1 (i.e., maximizing f 1 ) tends to spread the sensors, whereas minimizing f 2 tends to concentrate them around the power lines. For that reason, it is desirable to produce the Pareto front [31] of these two functions, which represents a pool of candidate solutions. A point x * ∈ X belongs to the Pareto front of a set of functions in X if a further decrease in one of them is not possible without causing an increase in some of them. The methodology in [31] obtains joint descent directions for all the objective functions in a set, but it requires all of them to be differentiable, and that is not the case with f 1 .
In our case, since there are only two objective functions, we define the following unconstrained optimization problem: By solving problem (8)

Solving the Optimization Model
Many optimization algorithms are iterative. Starting with a solution estimate S 1 = {s 11 , . . . , s 1p }, a subsequence {S i } i∈I = {s i1 , . . . , s ip } i∈I is generated that hopefully converges to the solution of the problem. As it is common in all implementations, there are several parameters (magic numbers) the user must set. Some of them will notably influence the performance of the algorithm, and often depend upon the structure of the objective function.

Alternative approaches
Two methods were suggested in [23,24] for access point coverage optimization, a similar problem to ours: neighborhood search and simulated annealing [32], which we compared in [33]. These methods have no guaranteed convergence. On the other hand, gradient descent methods converge, but they can only be applied when the objective function is smooth, which is unusual in realistic models like ours.

Derivative-free unconstrained minimization
The function f (S), with θ ∈ (0, 1], is non-smooth on X with directional derivatives everywhere defined, which is a required assumption in a recent algorithm for unconstrained minimization (ignoring θ = 0 is not relevant when estimating the Pareto front, because θ can be arbitrarily close to 0). Numerical results show that the algorithm is competitive with others that try to find a good local minimum [25,26]. Essentially the algorithm is an iterative process that does not force the decrease of f (S i ), but imposes a controlled bound ϕ i ≥ f (S i ) at every iteration. More specifically, given a stepsize τ i ∆ > 0, and a unitary direction D ∈ IR 2p , D = {d 1 , . . . , d p }, one iteration of the algorithm succeeds if where ν(·), ϕ satisfy A4 -A5 given below. The point S i is blocked when the algorithm fails to satisfy (9) on a set of directions {D 1 , . . . , D n }, n > 2p that positively spans IR 2p . It is shown in [25] that under assumptions A1 -A5 given below, the sequence of blocked points converges to a point S * that satisfies the zero order stationary point of f (S), i.e., f (S * , D k ) ≥ 0, k = 1, . . . , n, where the directional derivative is nonnegative along the given directions. In theory, if A6 also holds, then ∇f (S * ) = 0, but we are aware that A6 is seldom fulfilled for our kind of function and we do not stress this result in this paper. The reader may read [25,26] to complete the details. We reproduce [25, Table 1] in Table 1.
A4. The function ν(·) : are upper bounds of f (·), i.e., ϕ i ≥ f (S i ) for all i, and decrease sufficiently after a given finite number of successful iterations.
A6. f (S) is strictly differentiable or locally convex at all limit points of the sequence {S i } ∞ i=1 generated by the algorithm.

Remark 2:
The simplest case of INTERPOLATE(S, D, τ ∆) is Z = S + τ ∆D, f Z = f (Z), which we follow in this paper, although there exist more elaborate alternatives [25]. Remark 3: If α = 0, we obtain the monotone version of the derivative-free algorithm, which converges to the local minimum in the neighborhood of the starting point. Remark 4: A5 holds. Remark 5: In order to improve convergence, [25] suggested expanding τ every time a significant number of successes is achieved. Nevertheless, convergence is also guaranteed if τ only decreases.

Numerical Tests
We model the 10 km × 10 km flat wood scenario in Figure 5. The light areas represent open space, and the dark ones represent wood patches. The scenario is crossed by three "vertical" power lines, respectively at 1.66, 5 and 8.33 Km from the left side. This terrain representation partly encloses the difficulties in modelling Cabañeros national park. Grid G is given by a uniform discretization of the scenario in 50-m steps. In it we want to install 200 acoustic sensors (p = 200) with maximum detection coverage, at a minimum power cost. In order to obtain the Pareto front, we obtain the solution of problem (8) several times, starting from different random points S 0 . We normalize the objective function by dividing it by its value at the starting points.
Since an algorithm can lead to a local optimum of problem (4) or to a solution that differs from the global optimum, we have chosen a brute force approach that computes the function at every grid point. Real-time system response is not affected, as the calculation time in a 10 km × 10 km scenario is in the order of a tenth of millisecond.
Following the results in [33], instead of applying the non-monotone derivative-free search as described in Section 4., we applied a zone search variant. If we simply "move" one sensor at a time, instead of "moving" them all, the evaluation of the objective function is significantly less time consuming, since most computations in f (S) do not change.
To formalize this approach, we split the scenario into q non-overlapping zones X j , j = 1, . . . , q, such that X = ∪ j=1..q X j and X k ∩ X j = ∅, k = j. Let V (s i ) = {x ∈ G | s i ∈ V (x)}. When we move the k-th sensor, the remaining sensor positions do not change, i.e., the group moves from S = {s 1 , . . . , s k , . . . , s p } to S = {s 1 , . . . , s k , . . . , s p }. When this happens, , we discard all X j | ∀x ∈ X j min(δ(s k , x), δ(s k , x)) > 1 Km. Let T nv be the set of discarded zones and let T v = V (S ) − T nv . Then, we compute x∈Tv f (S ), and keep x∈Tnv f (S ) = x∈Tnv f (S) from the previous iterate. In the numerical tests that follow, we divided our scenario in 100 [1 Km × 1 Km] zones (q = 100). We compute one objective function component per zone at the beginning of the algorithm execution. When only one sensor is moved, there is no need to recompute the objective function in zones that are not affected by the sensor movement.
This way, the time to compute an objective function value drops from 20 seconds on average to just 1.5 seconds on a Pentium IV.
Let S = {s 1 , . . . , s k , . . . , s p }. In order to move only one AP at a time, say s k , we generate a set of unit search directions d j ∈ IR 2 , j = 1, . . . , n such that the set D = {d 1 , . . . , d n } positively spans IR 2 . We recall that n ≥ 3. We declare a success when for some d j ∈ D and some s ∈ S. The point S is blocked if the algorithm is unable to move a single s ∈ S. We observe that this schema may be carried out simultaneously in a multi processor environment and it is straightforward to show that after we try all s ∈ S we have searched on a set of directions that positively span IR 2p , although we are using at least 3p directions of search.
We tuned the method in preliminary trials and determined the following parameter values: • ∆ = 0.1 Km, i.e., the method stops when the maximum sensor displacement is under 100 m.
• We initially set ϕ to the value of the objective function at the starting point.
• Finally, we set ν(τ ∆) = 0.001(τ ∆) 2 . The results reveal that the Pareto front approach is useful. The solutions show a compromise between cost and coverage. In the V (x) ≥ 1 case there is no real advantage over a random seeding in terms of coverage, although the cost drops considerably for 100% coverage. However, in the V (x) ≥ 3 case, the optimization result is clearly superior to random seeding across the whole Pareto front. At maximum cost, there is a 20% increase in coverage, and at 85% coverage (the best coverage of random seeding) there is a 50% decrease in cost.     Figure 5, for three different choices of θ = {0.1, 0.5, 0.9} (red points represent sensor positions, blue areas zones with less coverage, dark blue areas indicate no coverage at all). The results are consistent with the fact that coverage improves for low θ values. As the values of θ become higher, the sensors tend to concentrate around the power lines. Therefore, power cost decreases, but so does the coverage (Figure 11 for θ = 0.9).   Figure 11. A sensor deployment for θ = 0.9. Figure 12 represents the percentage of grid points covered by different numbers of sensors in the three previous deployments. In the θ = 0.9 case, where the cost function has more weight, over 35% grid points are out of coverage, and, due to the concentration of the sensors around the power lines, there are grid points that are covered by 13 sensors or more. As θ decreases, grid points covered by less than 3 sensors are rare, and the majority of the grid points are covered by 3-5 sensors.  Table 2 shows location precision and expected coverage results. The location algorithm described in Section 2.2. has been tested in the three previous deployments (θ = {0.1, 0.5, 0.9}). Table 2 shows the average location error in meters (distance between estimated and real positions) for 10, 000 random gunshot positions in the simulation area, for each θ value and three levels of variation in sensor clocks. These variations are generated by adding a normal Gaussian random toss to the arrival time at each node, for three different values of σ between 0 and 10 ms. The results show that location accuracy increases with the number of detecting nodes, as well as with a better clock synchronization. In addition, this table provides information on the expected coverage at each scenario. For instance, in the θ = 0.1 case, the probability that one or more nodes detect the event is 0.99973, for two or more nodes it is 0.99777, and so on. Clearly, there must be a compromise between detection accuracy, coverage and deployment cost.

Conclusions
In this paper, we have proposed a gunshot location procedure based on sensor networks and a Pareto front approach to optimize large-scale deployments. The location procedure is based on hyperbolic multi-lateration using data from a synchronized sensor network. We also propose a practical distributed synchronization algorithm for that purpose, with low energy consumption. Sensor network planning follows a Pareto front approach, using a monotone descent method without derivatives that is compatible with realistic optimization functions. Our results are clearly superior to random placement, achieving a 50% cost reduction for 85% coverage. In the two-dimensional scenario, event detection by at least four nodes is required to achieve satisfactory gunshot location accuracy.