Multi-Scan Multi-Sensor Multi-Object State Estimation

If computational tractability were not an issue, multi-object estimation would integrate all measurements from multiple sensors across multiple scans. In this article, we propose an efficient numerical solution to the multi-scan multi-sensor multi-object estimation problem by computing the (labeled) multi-sensor multi-object posterior density. Minimizing the $L_{1}$-norm error from the exact posterior density requires solving large-scale multi-dimensional assignment problems that are NP-hard. An efficient multi-dimensional assignment algorithm is developed based on Gibbs sampling, together with convergence analysis. The resulting multi-scan multi-sensor multi-object estimation algorithm can be applied either offline in one batch or recursively. The efficacy of the algorithm is demonstrated using numerical experiments with a simulated dataset.


I. INTRODUCTION
Instead of the state or trajectory of a single object, multi-object state estimation is concerned with the joint estimation of the number of objects and their trajectories [1], [2], [3]. This problem has a wide range of application areas, from aerospace [4] and computer vision [5], [6], [7] to cell biology [8], [9]. Multi-object state estimation is challenging because we have to address the unknown and time-varying number of objects, false negatives/positives, and data association uncertainty, which together incur a combinatorial complexity. The three main approaches to multi-object estimation in the literature are Multiple Hypothesis Tracking (MHT) [2], Joint Probabilistic Data Association (JPDA) [10], and Random Finite Set (RFS) [1], [3]. Apart from the number of objects and their trajectories, the RFS approach also provides a statistical characterization of the entire ensemble of objects [11].
In general, employing multiple sensors in state estimation enhances detection capability and spatial coverage, reliability, observability, and reduces uncertainty [12], [13]. There are many applications that fundamentally require multiple sensors because a single sensor is simply not adequate, see e.g., [14], [15]. Moreover, given the proliferation of inexpensive sensors, effective multiple sensor solutions are imperative for exploiting their intended capabilities. The benefits of multiple sensors are even more significant in multi-object estimation, where there is inherent additional uncertainty.
Whereas filtering only considers the current timestep, the multi-scan approach considers the history of the multi-object states over a time window, thereby allowing the correction of previous errors [29], [2], [30]. MHT forms hypotheses by associating measurements to tracks across a time window [31], and delays making estimates until further information becomes available. In [32], a fixed-lag smoother was integrated into the JPDA filter [10], where it was shown that a significant improvement in tracking accuracy over filtering can be achieved with small windows of sizes two and three. The first RFS multi-scan multi-object filter was developed in [30], where all statistical information pertinent to the multi-object system over the time window is encapsulated in the multi-object posterior, including statistics on the ensemble of trajectories. Computing this posterior, however, also requires solving NP-hard multi-dimensional ranked assignment problems. Recently, an efficient numerical algorithm with polynomial complexity was developed for propagating the so-called (multi-scan) GLMB posterior, where the multi-dimensional ranked assignment problems are solved using Gibbs sampling [11].
Ideally, if computational tractability were not an issue, to inherit all the benefits of both multi-sensor and multi-scan solutions we should make use of measurements from all sensors across multiple scans. However, keeping in mind that the multi-scan problem itself requires solving NP-hard multi-dimensional ranked assignment problems, and likewise for the multi-sensor problem, the joint multi-scan multi-sensor problem is far more computationally demanding. Consequently, research on multi-scan multi-sensor multi-object estimation is very limited. To the best of our knowledge, only a two-scan two-sensor demonstration with four objects, using an Interacting Multiple Model (IMM) JPDA filter, has been reported [33].
This article addresses multi-scan multi-sensor multi-object estimation via the labeled RFS formulation, which translates to computing a labeled multi-object posterior consisting of an intractably large weighted sum of set functions. Approximating this so-called GLMB posterior by retaining the highest weighted terms minimizes the $L_{1}$-norm approximation error [11], and requires solving far more challenging multi-dimensional assignment problems than those for multi-scan (with a single sensor) or multi-sensor (with a single scan). Indeed, the posterior truncation problem for V sensors across K scans is a (KV)-dimensional assignment problem, and existing techniques are not adequate for K and V greater than two, even with only four objects.
We propose an efficient multi-dimensional assignment solution with polynomial complexity using Gibbs sampling, together with convergence analysis. This solution generalizes those for multi-sensor (with single-scan) and multi-scan (with single-sensor) in [21], [11]. The resultant multi-object estimation algorithm can be used offline in one batch or recursively at each measurement update, i.e., smoothing-while-filtering. The efficacy and utility of the proposed solution are demonstrated in numerical experiments. For the purposes of establishing the theoretical foundation and scalability of the algorithm, we focus on the (labeled) multi-object posterior, which involves a growing window of scans. While such a growing window results in a complexity per time step that grows with time, the algorithm can be easily adapted to a fixed-length moving window, so that the complexity per time step is fixed. Note that filtering is a special case of the moving window approach with a window length of one. A formulation that produces trajectories with a fixed complexity per time step is only possible with labels, see e.g., [34], [35] for further discussion. Without labels, the only way to obtain trajectories is by using a growing window, which incurs a complexity per time step that grows exponentially with time. Such an approach is impractical even for the simplest case of single-object state estimation with a single sensor [36].
The remainder of this article is organized as follows. Section II summarizes relevant concepts in multi-object estimation, and Section III presents the implementation details of the multi-sensor GLMB posterior recursion. Section IV presents the numerical studies, and Section V concludes the paper. Mathematical proofs are given in the supplementary materials.

II. BACKGROUND
As per the convention in [11], the symbols and notations used throughout the article are summarized in Table I.

A. Multi-object State and Multi-object Trajectory
The labeled state of an object is represented by x = (x, ℓ), where the vector x ∈ X is its kinematic state, and ℓ = (s, ι) is its (unique) label, with s representing the time of birth and ι an index distinguishing objects born at the same time [37].
[Table I (excerpt): 1_S — indicator function for a given set S; F(S) — class of all finite subsets of a given set S; ⟨f, g⟩ — inner product ∫ f(x)g(x) dx of two functions f and g; Δ(X) — distinct label indicator δ_{|X|}[|L(X)|].]
[Fig. 1: Labeled multi-object trajectory and history. The two objects born at time 1 are given labels (1,1) and (1,2), colored red and white. The object born at time 4 is given label (4,1), colored blue. The multi-object history X_{0:k} (note X_0 = ∅) is represented by two equivalent groupings according to: (a) time (vertical strips, i.e., multi-object states); or (b) labels (strips containing states of the same color, i.e., trajectories).]
Let B_s denote the label space of objects born at time s,
then the space of all labels up to time k is the disjoint union L_k = B_0 ⊎ B_1 ⊎ ··· ⊎ B_k. A sequence of consecutive labeled states (say, from time s to t) with kinematic states x_s, x_{s+1}, ..., x_t ∈ X, and label ℓ = (s, ι), is called a trajectory. The labeled state of trajectory τ at time i is denoted by τ(i). At time i, a labeled multi-object state X_i is a finite subset of X × L_i with distinct labels. Let L denote the projection defined by L((x, ℓ)) = ℓ, and let L(X_i) denote the set of labels of X_i. Since a valid labeled multi-object state X_i must have distinct labels, we require the distinct label indicator ∆(X_i) ≜ δ_{|X_i|}[|L(X_i)|] to be 1. Given a set S of trajectories with distinct labels, the labeled multi-object state at time i is X_i = {τ(i) : τ ∈ S, τ exists at time i}; see Fig. 1 for illustration.
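As a concrete illustration, checking the distinct label indicator Δ(X) = δ_{|X|}[|L(X)|] amounts to comparing the number of distinct labels with the cardinality of X. A minimal sketch, where the tuple representation of labeled states is a hypothetical simplification:

```python
def distinct_label_indicator(X):
    """Delta(X) = delta_{|X|}[|L(X)|]: 1 iff all labels in X are distinct."""
    labels = {ell for (_, ell) in X}  # L(X): the set of labels of X
    return 1 if len(labels) == len(X) else 0

# Two objects born at time 1 with distinct indices: a valid labeled state.
X_valid = [((0.0, 1.0), (1, 1)), ((2.0, -1.0), (1, 2))]
# Duplicated label (1, 1): not a valid labeled multi-object state.
X_invalid = [((0.0, 1.0), (1, 1)), ((2.0, -1.0), (1, 1))]
```

Here a labeled state is a `(kinematic_state, (birth_time, index))` tuple; the empty set is trivially valid.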
Given a sequence X_{j:k} of labeled multi-object states (from a set of labeled trajectories) over the interval {j : k}, the trajectory of (the object with) label ℓ ∈ ∪_{i=j}^{k} L(X_i) is τ^{(ℓ)} = ((x^{(ℓ)}_{s(ℓ)}, ℓ), ..., (x^{(ℓ)}_{t(ℓ)}, ℓ)), where s(ℓ) and t(ℓ) are, respectively, the earliest and latest times on {j : k} at which the label ℓ exists, and (x^{(ℓ)}_i, ℓ) denotes the element of X_i with label ℓ and unlabeled state x^{(ℓ)}_i. Hence, the sequence X_{j:k} can be equivalently represented by the trajectories of all labels in ∪_{i=j}^{k} L(X_i), i.e., X_{j:k} ≡ {τ^{(ℓ)} : ℓ ∈ ∪_{i=j}^{k} L(X_i)}. (3) This equivalence is illustrated in Fig. 1.
Analogous to a single-object system, where the state trajectory is represented by the state history x_{0:k}, in a multi-object system the multi-object trajectory is represented by the labeled multi-object (state) history X_{0:k}. Further, the equivalence (3) enables the following extension of the multi-object exponential to multiple scans. For any non-negative integer n and times i_1 < ··· < i_n in {j : k}, let T_{{i_1:i_n}} denote the space of trajectories defined on those times. For any function h : ⊎_{I⊆{j:k}} T_I → [0, ∞), the multi-scan exponential [11] is defined as h^{X_{j:k}} ≜ ∏_{ℓ ∈ ∪_{i=j}^{k} L(X_i)} h(τ^{(ℓ)}). Note that when j = k, this reduces to the single-scan multi-object exponential h^{X_j} in Table I.
Hereon, single object states are represented by lower case letters (i.e., x and x), and multi-object states are represented by upper case letters (i.e., X and X), where the symbols for labeled states and their distributions are bolded (i.e., x, X, π, etc.) to distinguish them from unlabeled states.
In Bayesian estimation, the states and measurements are modeled as random variables. Hence, in a multi-object system, the multi-object state is modeled as an RFS, and the system model is described by the multi-object state transition kernel and measurement likelihood function, presented respectively in Subsections II-B and II-C. The multi-object posterior recursion is presented in Subsection II-D.

B. Multi-object Dynamic Model
Given the multi-object state X k−1 at time k − 1, each x k−1 = (x k−1 , ℓ k−1 ) ∈ X k−1 either survives with probability P S,k−1 (x k−1 ) and moves to a new state x k = (x k , ℓ k ) with transition density f S,k|k−1 (x k |x k−1 , ℓ k )δ ℓ k−1 [ℓ k ] at time k, or dies with probability Q S,k−1 (x k−1 ) = 1 − P S,k−1 (x k−1 ). An object with label ℓ k ∈ B k is either born at time k with probability P B,k (ℓ k ) and state x k with probability density f B,k (x k , ℓ k ), or not born with probability Q B,k (ℓ k ) = 1 − P B,k (ℓ k ). The multi-object state X k at time k is the superposition of surviving and new birth states. In the standard multi-object dynamic model, the birth and survival sets are independent of each other, and each object moves and dies independently of the others. The multi-object dynamic model is encapsulated in the multi-object transition density f k|k−1 (X k |X k−1 ), see [11], [37] for the actual expressions.
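The standard dynamic model above can be sketched for a toy one-dimensional state; the random-walk motion, the survival probability, and the birth parameters are illustrative stand-ins, not the model used in the experiments:

```python
import random

def transition(X_prev, births=None, P_S=0.95):
    """One step of the standard multi-object dynamic model (toy 1D version).

    X_prev: list of labeled states (x, label); births: list of
    (P_B, label, x_init) tuples for the labels in B_k. Survival, motion,
    and birth are all mutually independent, and labels never change.
    """
    X = []
    for (x, ell) in X_prev:
        if random.random() < P_S:                      # survives w.p. P_S
            X.append((x + random.gauss(0.0, 1.0), ell))  # moves; label kept
    for (P_B, ell, x0) in births or []:
        if random.random() < P_B:                      # born w.p. P_B
            X.append((x0 + random.gauss(0.0, 1.0), ell))
    return X

random.seed(0)
# One survivor candidate with label (0, 1); one birth candidate with label (1, 1).
X1 = transition([(0.0, (0, 1))], births=[(0.9, (1, 1), 5.0)])
```

The resulting set is the superposition of surviving and new-born labeled states, so its labels are always drawn from the previous labels plus the birth label space.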

C. Multi-object Measurement Model
Consider the multi-object state X_k at time k, and V sensors, each of which produces a measurement set Z_k^{(v)}, v ∈ {1 : V}. Each x ∈ X_k is either detected by sensor v with probability P_{D,k}^{(v)}(x), generating a detection z ∈ Z_k^{(v)}, or missed with probability 1 − P_{D,k}^{(v)}(x). Sensor v also receives false alarms (clutter), modeled by a Poisson RFS with intensity function κ^{(v)}. The measurement set Z_k^{(v)} is the superposition of detections and false alarms, which are assumed independent, conditional on X_k.
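A single sensor's measurement process (independent detections plus Poisson clutter) can be sketched in one dimension; the detection probability, noise level, clutter rate, and region below are illustrative values, not those of the article's experiments:

```python
import math
import random

def observe(X, P_D=0.3, sigma=1.0, clutter_rate=3.0, region=(-10.0, 10.0)):
    """Generate one sensor's measurement set: detections plus Poisson clutter.

    Each object in X is detected independently with probability P_D; a
    detection is the object's (1D) position corrupted by Gaussian noise.
    Clutter is a Poisson number of points, uniform over the region.
    """
    Z = [x + random.gauss(0.0, sigma) for (x, _ell) in X
         if random.random() < P_D]
    # Knuth's inversion method for a Poisson draw (stdlib-only)
    limit, p, n_clutter = math.exp(-clutter_rate), 1.0, 0
    while True:
        p *= random.random()
        if p < limit:
            break
        n_clutter += 1
    Z += [random.uniform(*region) for _ in range(n_clutter)]
    random.shuffle(Z)  # the measurement set is unordered
    return Z

random.seed(1)
Z = observe([(0.0, (1, 1)), (5.0, (1, 2))])
```

The shuffle emphasizes that the sensor reports an unordered set: the origin of each measurement (object or clutter) is unknown, which is precisely what the association maps below must resolve.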
An association map γ_k^{(v)} : L_k → {−1, 0, 1, ..., |Z_k^{(v)}|} of sensor v is a positive 1-1 map (i.e., no two distinct labels are mapped to the same positive value), where γ_k^{(v)}(ℓ) = −1 means label ℓ does not exist, γ_k^{(v)}(ℓ) = 0 means ℓ is misdetected, and γ_k^{(v)}(ℓ) > 0 means ℓ generates the measurement with index γ_k^{(v)}(ℓ) in Z_k^{(v)}. Let Γ_k^{(v)} be the space of all association maps for sensor v; then the multi-object likelihood of sensor v is given in [21]. The multi-sensor association map γ_k = (γ_k^{(1)}, ..., γ_k^{(V)}) is defined from the single-sensor association maps, each of which is required to be positive 1-1, in which case γ_k is said to be positive 1-1. Here, γ_k(ℓ) = −1_V means label ℓ does not exist, whereas γ_k(ℓ) ∈ Λ_k^{(1:V)} means label ℓ exists, in which case it is misdetected by sensor v if γ_k^{(v)}(ℓ) = 0. Since the set of labels with non-negative values is independent of v, we write it as L(γ_k).
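The positive 1-1 property and the common live-label set L(γ_k) can be checked directly; association maps are represented here as hypothetical `{label: index}` dictionaries, with −1 for non-existence and 0 for misdetection:

```python
def is_positive_one_to_one(gamma):
    """Check a single-sensor association map {label: index} is positive 1-1:
    no two distinct labels map to the same positive measurement index."""
    positives = [a for a in gamma.values() if a > 0]
    return len(positives) == len(set(positives))

def multisensor_label_set(gammas):
    """L(gamma): the live labels (those with gamma^(v)(l) >= 0). The model
    requires this set to be the same for every sensor v."""
    live = [{l for l, a in g.items() if a >= 0} for g in gammas]
    assert all(s == live[0] for s in live), "live labels must agree across sensors"
    return live[0]
```

For example, a map sending labels (1,1) and (1,2) to measurements 2 and 0 (misdetection) with (4,1) non-existent is valid, while two labels sharing positive index 1 is not.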
Assuming that the sensors are independent conditional on X_k, the multi-sensor multi-object likelihood function is simply the product of the likelihoods of the individual sensors. Let Z_k ≜ (Z_k^{(1)}, ..., Z_k^{(V)}) denote the multi-sensor observation; then the multi-sensor multi-object likelihood function is g_k(Z_k|X_k) = ∏_{v=1}^{V} g_k^{(v)}(Z_k^{(v)}|X_k), where Γ_k ≜ Γ_k^{(1)} × ··· × Γ_k^{(V)} is the space of multi-sensor association maps.

D. Multi-object Bayes Recursion
All information about the set of objects in the surveillance region over the interval {0 : k} is captured by the multi-object posterior, which can be recursively propagated in time by π_{0:k}(X_{0:k}) ∝ g_k(Z_k|X_k) f_{k|k−1}(X_k|X_{k−1}) π_{0:k−1}(X_{0:k−1}). Note that the posterior is conditioned on the measurement history Z_{1:k}, but we have omitted this dependence for brevity. The dimensionality of the posterior (and hence the complexity per timestep) grows with time, and computing it at each timestep is impractical. Nevertheless, the complexity per timestep can be capped by smoothing over short windows and linking trajectory estimates between windows via their labels. Multi-object filtering can be regarded as a special case with a window length of one.
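The growth of the smoothing posterior with the window length can be seen even in a toy discrete hidden Markov model, where the exact posterior over state histories x_{1:k} has support of size |X|^k; this analogy concerns a single discrete state, not the multi-object recursion itself, and all model parameters are illustrative:

```python
import itertools

def history_posterior(prior, trans, like, Z):
    """Exact posterior over state histories x_{1:k} of a tiny discrete HMM:
    p(x_{1:k} | z_{1:k}) is proportional to prior(x_1) g(z_1|x_1)
    prod_i f(x_i|x_{i-1}) g(z_i|x_i). Its support grows as |X|^k."""
    k, states = len(Z), list(prior)
    post = {}
    for hist in itertools.product(states, repeat=k):
        w = prior[hist[0]] * like[hist[0]][Z[0]]
        for i in range(1, k):
            w *= trans[hist[i - 1]][hist[i]] * like[hist[i]][Z[i]]
        post[hist] = w
    total = sum(post.values())
    return {h: w / total for h, w in post.items()}

# Two states, sticky transitions, noisy binary observations.
prior = {0: 0.5, 1: 0.5}
trans = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
like = {0: {'a': 0.8, 'b': 0.2}, 1: {'a': 0.2, 'b': 0.8}}
post = history_posterior(prior, trans, like, ['a', 'a', 'b'])
```

With k = 3 scans the posterior already has 2³ = 8 histories; this exponential blow-up is what the short-window smoothing strategy caps.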
For the single-sensor special case, Particle Markov Chain Monte Carlo [38] has been applied to approximate the multi-object posterior in [30]. Further, an analytic solution called the (multi-scan) GLMB filter/smoother was developed in [11], which conceptually extends to the multi-sensor case.

E. Multi-Sensor GLMB Posterior
Before delving into the multi-sensor GLMB posterior, it is informative to consider the association weight and trajectory posterior of a label ℓ. Recall that s(ℓ) and t(ℓ) are, respectively, the earliest and latest times on {0 : k} at which ℓ exists. There are four possible cases for its trajectory, determined by whether ℓ is new born or pre-existing, and by whether or not it survives to time k. Suppose that ℓ generated the sequence α_{s(ℓ):k} of multi-sensor measurement indices; then its trajectory posterior at time k is given by (7). In addition, the association weight for ℓ ∈ L_k is defined by (13). We now present the multi-object posterior as follows. Starting with the initial prior π_0(X_0) = δ_∅[L(X_0)] (i.e., there are no live objects at the beginning) with weight w_0^{(γ_0)} = 1, the multi-sensor GLMB posterior at time k can be written as a sum over multi-sensor association histories γ_{0:k} of components weighted by w_{0:k}^{(γ_{0:k})} (16). Some of the relevant posterior statistics of the multi-scan GLMB are [11]:
• the cardinality distribution, i.e., the distribution of the number of trajectories (17);
• the cardinality distributions of births and deaths at each time u ∈ {0 : k} (18), (19);
• the distribution of trajectory lengths (20).
Several multi-object trajectory estimators [11] are applicable to the GLMB posterior (14). This work uses the GLMB estimator, which first determines the most probable cardinality n* by maximizing (17), and then the highest weighted component (indexed by) γ*_{0:k} with cardinality n*. The n* estimated trajectories can be taken as the mean or mode of the trajectory posterior τ_{0:k}^{(γ*_{0:k}(ℓ))}(·, ℓ) for each ℓ ∈ L(γ*_{0:k}). The GLMB posterior (14) is completely parameterized by the set of components {(w_{0:k}^{(γ_{0:k})}, τ_{0:k}^{(γ_{0:k})})}, indexed by γ_{0:k}, which grow super-exponentially in number; hence a tractable approximation is necessary. Truncation by retaining a prescribed number of components with the highest weights minimizes the L_1-norm approximation error [11].
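The two-stage GLMB estimator described above can be sketched over a generic list of weighted components; representing a component by just its weight and its set of track labels is a hypothetical simplification of the full parameterization:

```python
from collections import defaultdict

def glmb_estimate(components):
    """components: list of (weight, labels) pairs, labels = set of track labels.

    Stage 1: pick the most probable cardinality n* from the (unnormalized)
    cardinality distribution. Stage 2: return the highest-weighted component
    whose number of labels equals n*.
    """
    card = defaultdict(float)
    for w, labels in components:
        card[len(labels)] += w            # cardinality distribution
    n_star = max(card, key=card.get)      # most probable number of trajectories
    best = max((c for c in components if len(c[1]) == n_star),
               key=lambda c: c[0])        # best component of cardinality n*
    return n_star, best[1]

components = [
    (0.30, {(1, 1), (1, 2)}),
    (0.25, {(1, 1), (1, 2)}),
    (0.25, {(1, 1)}),
    (0.20, {(1, 1), (1, 2), (4, 1)}),
]
```

Note the two stages can disagree with a naive "take the single best component" rule: here cardinality 2 carries the most total weight (0.55) even though no single component dominates.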

III. MULTI-SENSOR GLMB POSTERIOR PROPAGATION
This section presents techniques to truncate the multi-sensor GLMB posterior by extending the Gibbs samplers proposed in [11]. Subsection III-A introduces the distributions from which significant components of the GLMB posterior are drawn. Subsection III-B presents an efficient algorithm to generate valid components that can be used to initialize the full Gibbs sampler presented in Subsection III-C, which generates components according to a desired distribution.

A. Sampling Distributions
We truncate the GLMB posterior by sampling its components from a discrete distribution π such that components with higher weights are more likely to be chosen. Without loss of generality, we start with L(γ_0) = ∅, and consider a factorized distribution π(γ_{0:k}) = ∏_{j=1}^{k} π(γ_j|γ_{0:j−1}) (21), with per-scan conditionals π(γ_j|γ_{0:j−1}) (22), where (22) is proportional to (16), and (21) is proportional to (15).
Hence, only the values of γ_j on B_j ⊎ L(γ_{j−1}) need to be considered. In addition, the following decomposition of 1_{Γ_i}(γ_i) is instrumental for the proposed Gibbs samplers (the proof is given in the Supplementary Material, Subsection A).

B. Sampling from the Factors
Sampling from (21) can be performed by iteratively sampling from the factors (22), i.e., γ_j ∼ π(·|γ_{0:j−1}), for times j = 1 : k (note that γ_j consists of the values γ_j(ℓ_n) for ℓ_n ∈ L_j = {ℓ_1, ..., ℓ_{|L_j|}}). We sample from π(·|γ_{0:j−1}) using a Gibbs sampler [40], which constructs a sequence of iterates by generating the next iterate γ′_j from the current γ_j via the conditionals (25) of π(·|γ_{0:j−1}). The efficiency of Gibbs sampling hinges on the availability of conditionals that are easy to sample from. We adopt the multi-sensor Gibbs sampling technique developed in [21] to sample γ′_j(ℓ_n) from the n-th conditional (25). The basic steps of this technique are illustrated in Fig. 2 and summarized as pseudocode in Algorithm 1. The pseudocode for generating R iterates of the factor Gibbs sampler is given in Algorithm 2. Theoretical justifications and analyses of Algorithms 1 and 2 are given in the remainder of this subsection.
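A generic systematic-scan Gibbs sampler over a discrete joint distribution captures the structure of Algorithm 2: each coordinate is resampled in turn from its conditional given all the others. The toy conditionals below, encoding a constraint reminiscent of the positive 1-1 requirement, are purely illustrative:

```python
import random

def gibbs_sweep(gamma, conditionals):
    """One systematic-scan Gibbs sweep: resample each coordinate n from its
    conditional given the current values of all other coordinates.

    conditionals(n, gamma) -> dict {value: unnormalized probability}."""
    gamma = list(gamma)
    for n in range(len(gamma)):
        weights = conditionals(n, gamma)
        total = sum(weights.values())
        r, acc = random.random() * total, 0.0
        for v, w in weights.items():   # inverse-CDF draw from the conditional
            acc += w
            if r <= acc:
                gamma[n] = v
                break
    return gamma

def gibbs(gamma0, conditionals, R):
    """R iterates of the Gibbs sampler; returns the chain of samples."""
    chain, g = [], list(gamma0)
    for _ in range(R):
        g = gibbs_sweep(g, conditionals)
        chain.append(tuple(g))
    return chain

random.seed(2)
# Toy target: uniform over {0,1}^2 excluding (1,1), i.e., the two coordinates
# may not both take the "positive" value -- a 1-1-style exclusion constraint.
def cond(n, g):
    return {0: 1.0} if g[1 - n] == 1 else {0: 1.0, 1: 1.0}
chain = gibbs([0, 0], cond, 50)
```

Because every conditional assigns zero mass to violating the constraint, each iterate of the chain remains a valid configuration, mirroring how the samplers here always produce positive 1-1 association maps.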
Consider first the n-th conditional distribution, obtained by substituting (22) into the right-hand side of (25) and noting that we are only interested in the functional dependency on α, which yields (26). Note that, given a positive 1-1 multi-sensor association map, any association map sampled from these conditionals is also positive 1-1.
In general, sampling α directly from (26) is both memory- and computationally expensive, since ϑ_j^{((γ_{0:j−1}(ℓ_n),α))}(ℓ_n) needs to be evaluated for each of the 1 + ∏_{v=1}^{V}(1 + |Z_j^{(v)}|) possible values of α. Fortunately, the computational cost can be drastically reduced by using the strategy in [21] via the so-called minimally-Markovian conditional distributions.
The conditional (26) is said to be Markovian if ϑ_j factorizes into per-sensor terms ϑ_{j,v}, and minimally-Markovian if each ϑ_{j,v} admits a further simplified per-sensor form. The Markov property allows us to sample α^{(1)}, ..., α^{(V)} individually, as shown in Fig. 2, thereby alleviating the evaluation of ϑ_j over all possible values of α.
Note that although this approach produces valid multi-sensor association histories γ_0:k, in order to produce samples distributed according to (21), the Gibbs sampler for each factor π_n^(j) needs to be iterated for a sufficiently long time (i.e., executing Algorithm 2 with a large R) before proceeding to the next timestep. While this is inefficient, sampling from the factors provides a cheap way to generate valid γ_0:k to initialize the full Gibbs sampler (for faster convergence) presented in the next subsection.
For completeness, the pseudocode to generate a set of valid γ 0:k is given in Algorithm 3, where the factor sampler (Algorithm 2) is used to sample γ j for times j = 1 : k.
Let M_j ≜ max_{v∈{1:V}} |Z_j^{(v)}|, P_j ≜ |B_j ⊎ L(γ_{j−1})|, and let R be the number of new samples generated at time j. Then, the complexity of this algorithm is O(R Σ_{j=1}^{k} P_j² V M_j), although it can be implemented in O(R Σ_{j=1}^{k} P_j(P_j + V M_j)) in practice by pre-calculating β.

C. The Full Gibbs Sampler
Instead of sampling from each factor, a full Gibbs sampler for (21) constructs a sequence of iterates, where the next iterate γ′_0:k is generated from the current γ_0:k by sampling γ′_j(ℓ_n) from the conditional π_j,n for each j ∈ {1 : k} and ℓ_n ∈ L_j. The proposed algorithm for sampling γ′_j(ℓ_n) from the conditional π_j,n is illustrated in Fig. 3 and summarized as pseudocode in Algorithm 4. The pseudocode for generating T iterates of the full Gibbs sampler is given in Algorithm 5. Theoretical justifications and analyses of Algorithms 4 and 5 are given in the remainder of this subsection. Mathematical proofs are given in the Supplementary Material.

Similar to sampling from the factors, in general, sampling α from the conditional (41) is both memory- and computationally expensive, since ϑ_j(α|γ̄_j(ℓ_n), ℓ_n) needs to be evaluated for each of the 1 + ∏_{v=1}^{V}(1 + |Z_j^{(v)}|) possible values of α. Further, since each ϑ_j(α|γ̄_j(ℓ_n), ℓ_n) is a product of terms that need to be evaluated for i = j : k ∨ (t(ℓ_n) + 1), it is more computationally expensive than the factor sampler.
Again, the computational cost can be drastically reduced using minimally-Markovian conditional distributions.

Definition 5. The conditional (41) is said to be Markovian if ϑ_j factorizes into per-sensor terms ϑ_j,v, and minimally-Markovian if each ϑ_j,v admits a further simplified per-sensor form. The Markovian conditional allows us to sample the individual α^(1), α^(2), ..., α^(V), alleviating the evaluation of ϑ_j(α|γ̄_j(ℓ_n), ℓ_n) over all possible values of α, but still incurs a total complexity of O(kVṀ²|t(ℓ_n) − s(ℓ_n)|) for computing the normalization constants. Further, a linear complexity of O(kVṀ|t(ℓ_n) − s(ℓ_n)|) can be achieved using the minimally-Markovian conditional given in the following Proposition, which extends Corollary 4 of [21] to multi-scan.
[Fig. 3: Sampling γ′_j(ℓ_n) from the conditional π_j,n: each α^(v) is sampled from the categorical distribution π^(v)_j,n(·), which depends on the pre-computed ϑ̄^(v)_j,n(·); to ensure that γ′_j is positive 1-1 (and γ′_0:k ∈ Γ_0:k), the positive indices of sensor v already allocated to other labels are masked out, by multiplying ϑ^(v)_j,n(·) with the mask M^(v)_j,n(·).]
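The masked per-sensor sampling described above can be illustrated generically: α^(1), ..., α^(V) are drawn one sensor at a time from categorical distributions, with positive indices already allocated to other labels masked out so the resulting map stays positive 1-1. The uniform weight tables below are illustrative stand-ins for the pre-computed ϑ̄^(v)_j,n(·), and for simplicity each sensor's weights here do not depend on the earlier sensors' draws:

```python
import random

def sample_alpha(weights, taken):
    """Draw alpha = (alpha^(1), ..., alpha^(V)) one sensor at a time.

    weights[v]: dict {index: unnormalized weight} over {0, 1, ..., M_v},
    where 0 denotes misdetection. taken[v]: positive indices of sensor v
    already allocated to other labels; these are masked out so that the
    resulting multi-sensor association map remains positive 1-1.
    """
    alpha = []
    for v, w in enumerate(weights):
        masked = {a: wi for a, wi in w.items() if a == 0 or a not in taken[v]}
        total = sum(masked.values())
        r, acc = random.random() * total, 0.0
        for a, wi in masked.items():   # inverse-CDF categorical draw
            acc += wi
            if r <= acc:
                alpha.append(a)
                break
    return tuple(alpha)

random.seed(3)
# Two sensors with two measurements each; sensor 1's measurement 1 is taken.
weights = [{0: 1.0, 1: 1.0, 2: 1.0}, {0: 1.0, 1: 1.0, 2: 1.0}]
alpha = sample_alpha(weights, taken=[{1}, set()])
```

Misdetection (index 0) is never masked, so the draw always succeeds even when all positive indices of a sensor are taken.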
(51) can be interpreted as the (unnormalized) probability that, conditioned on γ̄_j(ℓ) and Z_1:k, label ℓ generates the measurement indexed by α^(v). The rationale for choosing (51) can be seen by substituting (51) into (43) and (44), which gives the product of the probabilities Pr_j(ℓ_n ∼ z^(v)_{α^(v)}) over the sensors. Consequently, the approximation in (50) boils down to approximating Pr_j(ℓ_n ∼ z^(v)_{α^(v)}). This is reasonable because the event space is α ∈ {−1}^V ⊎ Λ_j^(1:V), and conditioned on Z_1:k and γ̄_j, the probability of ℓ generating a measurement from one sensor is almost independent of it generating a measurement from another sensor. Furthermore, similar to [21], the support of (21) with the minimally-Markovian property contains the support of (15), as stated in Proposition 7.
Proposition 7. The support of (21) with the minimally-Markovian property contains the support of (15).
Our proposed full Gibbs sampler (Algorithm 5) uses the minimally-Markovian strategy above. Note that, once each γ′_j(ℓ_n) is sampled, the corresponding trajectory posterior and association weight are updated using (7) and (13). The complexity of this algorithm is O(kTVP²Ṁ). Similar to the single-sensor case [11], the full Gibbs sampler has an exponential convergence rate.
Algorithm 6 provides the pseudocode for batch computation of the multi-sensor GLMB posterior (14) using the full Gibbs sampler (Algorithm 5). Note that the factor sampler is used to initialize the full Gibbs sampler with a set of Q_0 valid association histories, each of which is used to generate a set of T samples distributed according to (21). Since Algorithm 6 can be executed in parallel for each q, indicatively its complexity is also O(kTVP²Ṁ). Alternatively, the posterior (14) can be recursively propagated, i.e., smoothing-while-filtering, as shown in Algorithm 7. Algorithms 6 and 7 produce the same posterior with the same complexity. However, the smoothing-while-filtering algorithm provides an estimate of the multi-object trajectory at each point in time, whereas the batch algorithm is an offline method in which the multi-object trajectory can only be estimated after all the observations have been collected. For a fixed complexity per time step, a moving-window implementation can be adopted, which incurs an O(LTVP²Ṁ) complexity per time step, where L is the length of the window.

IV. NUMERICAL EXPERIMENTS
This section evaluates the performance of the proposed (multi-scan) multi-sensor GLMB smoother and benchmarks it against the multi-sensor GLMB filter via simulations on single-sensor, two-sensor, and four-sensor scenarios.
Births, deaths and movements of an unknown and time-varying number of objects are simulated in a 3D surveillance region over k = 100 s. The objects' 3D positions are measured using multiple sensors with severely limited observability (detection probability) and high measurement noise. There are 12 births at j = 1 s, 20 s, 40 s, 60 s and 80 s (respectively 3, 3, 2, 2, and 2 objects), and the probability of survival is set at P_S = 0.95. Two of the objects born at time j = 1 s die at 70 s, and the peak number of 10 live objects occurs from 80 s onwards. The three objects born at j = 1 s cross paths at [0, −400, 0]^T at time 40 s, and two of the objects born at time 20 s cross paths at [300, −200, 200]^T at time 59 s, constituting a more challenging multi-object tracking scenario.
The kinematic state of an object is represented by a 6D state vector x_k = [p_x,k, ṗ_x,k, p_y,k, ṗ_y,k, p_z,k, ṗ_z,k]^T consisting of 3D position and velocity, and follows a constant velocity motion model. The single-object transition density is linear Gaussian, f_S,k|k−1(x_k|x_k−1) = N(x_k; F x_k−1, Q), where the constant velocity matrices F and Q are built from the 3 × 3 identity matrix I_3 via the Kronecker product ⊗, with sampling time ∆ = 1 s and acceleration noise standard deviation σ_a = 5 m/s². Births are modeled by a Labeled Multi-Bernoulli (LMB) process with given birth probabilities and spatial distributions. Each sensor measures the 3D position, z_k = [z_x,k, z_y,k, z_z,k]^T, modeled by a linear Gaussian likelihood function with measurement noise standard deviation of 20 m per coordinate, i.e., covariance diag([20, 20, 20]²). The probability of detection of each sensor is 0.3, and clutter is modeled by a Poisson RFS with uniform intensity κ(z) = 3.75 × 10^−10 m^−3 over the surveillance region, i.e., a clutter rate of 3 per scan. This scenario is more challenging than those in [11], [21] due to the far lower detection probability and higher measurement noise in a cluttered 3D environment.
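The constant velocity matrices F and Q have a standard construction via the Kronecker product; this sketch (with numpy) uses the interleaved state ordering [p_x, ṗ_x, p_y, ṗ_y, p_z, ṗ_z] from the text, and the white-acceleration form of Q is the usual textbook choice rather than something specified here:

```python
import numpy as np

def cv_model(dt=1.0, sigma_a=5.0, dim=3):
    """Constant velocity transition matrix F and process noise Q for a
    position/velocity state in `dim` spatial dimensions, with the
    interleaved ordering [p_x, v_x, p_y, v_y, ...]."""
    F1 = np.array([[1.0, dt],
                   [0.0, 1.0]])                        # per-axis CV dynamics
    Q1 = sigma_a**2 * np.array([[dt**4 / 4, dt**3 / 2],
                                [dt**3 / 2, dt**2]])   # white-accel. noise
    I = np.eye(dim)
    # kron(I, F1) places one 2x2 CV block per spatial axis on the diagonal.
    return np.kron(I, F1), np.kron(I, Q1)

F, Q = cv_model()
```

Using `kron(F1, I)` instead would give the de-interleaved ordering [p_x, p_y, p_z, v_x, v_y, v_z]; both are common, and only the state convention differs.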
We plot single runs of the multi-sensor GLMB smoother and filter for the same sets of measurements under the single-sensor, two-sensor and four-sensor cases, and provide an analysis of the results in the remainder of this section.

A. Filtering vs. Smoothing
The multi-sensor GLMB smoother (Algorithm 6) is run with 100 scans and 1000 valid initial samples generated via factor sampling. Its performance is compared with the single-sensor GLMB filter (Algorithm 2 of [41]), the single-sensor multi-scan GLMB smoother (Algorithm 3 of [11] with the same parameters as the proposed multi-sensor smoother), and the multi-sensor GLMB filter (Algorithm 3 of [21] using the minimally-Markovian strategy with the same multi-sensor parameters). The smoothing problem involves 400-dimensional ranked assignment problems with approximately 10 variables in each dimension; state-of-the-art solvers with dedicated hardware cannot handle problems of such large dimensions [17]. Fig. 4 shows the estimated trajectories from the multi-sensor GLMB filter and the multi-sensor GLMB smoother (for the single-sensor, two-sensor and four-sensor scenarios) superimposed on the true trajectories. Due to the challenging signal settings, the single-sensor GLMB filter produces several fragmented tracks, fails to estimate large portions of some tracks, and yields significant errors. Increasing the number of sensors to two and four results in fewer fragmented tracks, produces estimates for almost all portions of the true tracks, and improves accuracy. However, it does not prevent track switchings (at [0, −400, 0]^T and [300, −200, 200]^T, respectively, for the two-sensor and four-sensor scenarios) from occurring at the track crossings.
The multi-sensor GLMB smoother outperforms the filter in each scenario. The single-sensor scenario shows worse performance than the multi-sensor scenarios due to lower observability, resulting in fragmented tracks, undetected tracks, and significant errors in two tracks. In the two-sensor scenario, except for the two track switchings at [0, −400, 0]^T, all other tracks promptly start and terminate, resulting in smooth trajectory estimates with smaller errors. The four-sensor scenario produces superior results due to increased observability: it handles the track crossings well, with no track switchings or fragmented tracks, and produces even smaller errors, with each pertinent track starting and terminating promptly. Fig. 5 compares the OSPA and OSPA(2) errors over time between the multi-sensor GLMB smoother (final estimate) and the multi-sensor GLMB filter; the OSPA and OSPA(2) parameters are c = 100 m and p = 1, with a scan window length of 10 for OSPA(2). It is clear that increasing the number of sensors improves the performance of the multi-sensor GLMB filter, and the smoother produces significantly lower errors still, owing to its ability to correct earlier errors in the multi-sensor assignments.
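The OSPA metric used in Fig. 5 can be sketched for small sets of scalar positions by brute force over assignments; this illustrates the c = 100 m, p = 1 parameterization (OSPA(2) additionally averages a base distance over a window of scans and is not reproduced here):

```python
import itertools

def ospa(X, Y, c=100.0, p=1):
    """OSPA distance between two sets of 1D positions, combining
    localization error (cutoff at c) and cardinality error; brute force
    over assignments, fine for small sets."""
    if len(X) > len(Y):
        X, Y = Y, X                      # ensure |X| <= |Y|
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0                       # both sets empty
    best = min(                          # optimal partial assignment
        sum(min(abs(x - Y[j]), c) ** p for x, j in zip(X, perm))
        for perm in itertools.permutations(range(n), m)
    )
    # unassigned elements each contribute the cutoff c (cardinality penalty)
    return ((best + c ** p * (n - m)) / n) ** (1 / p)
```

For instance, a perfect estimate gives 0, while one missed object out of two contributes c/2 = 50 m regardless of how far away the missed object is, thanks to the cutoff.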

B. Posterior Statistics Computation
This section illustrates the capability of the proposed multi-object posterior to provide useful statistical information [11] about the ensemble of multi-object trajectories (mathematical expressions are given in (17)-(20)). Fig. 6 shows the probability distribution of the number of trajectories. As the number of sensors increases, the mode of this distribution settles at 13, although the actual number of trajectories is 12. This mismatch arises from a short (3 s) false track appearing near the birth location (−800, −200, −400)^T, due to a false measurement falling close to it (recall that the detection probability of each sensor is 0.3). From the distribution of trajectory lengths, also shown in Fig. 6, it can be seen that, except for the mode at a length of 3 s (corresponding to the short false track), the posterior correctly captures the modes at 20 s, 40 s, 60 s, 80 s and 100 s in the four-sensor scenario. Also note that the uncertainty at time 80 s is higher in the two-sensor scenario than in the four-sensor scenario. Fig. 7 shows the birth and death cardinality distributions against time. Except for the birth and death around time 30 s (due to the short false track), the smoother correctly identifies all instances of births and deaths in the four-sensor scenario.

V. CONCLUSION
In this paper, we developed an efficient numerical solution to the multi-sensor multi-object smoothing problem via Gibbs sampling, which unifies the implementation techniques for multi-sensor GLMB filtering and multi-scan GLMB smoothing in [21], [11]. The proposed solution, which involves solving large-scale NP-hard multi-dimensional assignment problems, reduces to that of [11] when the number of sensors is one, and to that of [21] when the number of scans is one. Theoretical justification and analysis of the proposed algorithms are also presented. Numerical studies, conducted for the first time with 100 scans and up to four sensors, show that excellent tracking performance can be achieved, despite sensors with poor detection probabilities (as low as 0.3), by combining them over a longer integration time. This also demonstrates the benefits of multi-sensor fusion for low-observability sensors, which is vital for applications in harsh signal environments such as underwater.