LapTrack: linear assignment particle tracking with tunable metrics

Abstract Motivation Particle tracking is an important step of analysis in a variety of scientific fields and is particularly indispensable for the construction of cellular lineages from live images. Although various supervised machine learning methods have been developed for cell tracking, the diversity of the data still necessitates heuristic methods that require parameter estimations from small amounts of data. For this, solving tracking as a linear assignment problem (LAP) has been widely applied and demonstrated to be efficient. However, there has been no implementation that allows custom connection costs, parallel parameter tuning with ground truth annotations, and the functionality to preserve ground truth connections, limiting the application to datasets with partial annotations. Results We developed LapTrack, a LAP-based tracker which allows including arbitrary cost functions and inputs, parallel parameter tuning and ground-truth track preservation. Analysis of real and artificial datasets demonstrates the advantage of custom metric functions for tracking score improvement from distance-only cases. The tracker can be easily combined with other Python-based tools for particle detection, segmentation and visualization. Availability and implementation LapTrack is available as a Python package on PyPi, and the notebook examples are shared at https://github.com/yfukai/laptrack. The data and code for this publication are hosted at https://github.com/NoneqPhysLivingMatterLab/laptrack-optimisation. Supplementary information Supplementary data are available at Bioinformatics online.


Introduction
Automated tracking of particles in timelapse images is important in a wide range of fields in science and is especially crucial in creating large datasets of cell lineages in biological studies. Recently there has been considerable development in tracking algorithms, where methods based on probabilistic modeling (Bise et al., 2011;Bove et al., 2017;Chen, 2021;Chenouard et al., 2009Chenouard et al., , 2014Meijering et al., 2009;Ulicna et al., 2021;Ulman et al., 2017) and supervised machine learning (Ben-Haim and Riklin-Raviv, 2022;Chen, 2021;Lou and Hamprecht, 2011;Ulman et al., 2017) are increasingly being developed. The diverse nature of live imaging tasks, however, frequently requires tracking without underlining model or largescale ground-truth annotations, emphasizing the need for a robust tracking algorithm with a small number of parameters that can be tuned by manual annotations.
Defining and optimizing a global cost function to appropriately penalize wrong connections is a common approach in robust tracking methods. If the cost function is a linear sum of the costs associated with the connections, we can employ efficient algorithms (Jonker and Volgenant, 1987;Kuhn, 1955) to solve the global optimization problem called the linear assignment problem (LAP). The LAP-based tracking method has proven to be accurate and robust, especially for data with higher particle density. To deal with particle splitting (by division or oversegmentation) or merging (by undersegmentation), which is common in live cell data, Jaqaman et al. (2008) further developed a two-stage LAP method, with the second stage dedicated to the connection of splitting and merging branches. The cost function in their case was the squared Euclidean distance between the positions of the objects, with additional intensityassociated costs for splitting and merging.
Tools have been developed to provide similar LAP-based algorithms with splitting and merging detection; TrackMate (Ershov et al., 2022;Tinevez et al., 2017), for example, provides distancebased LAP tracker with particle detection and segmentation workflow and a method to conduct manual correction, all within the Java-based framework in ImageJ (Schindelin et al., 2012;Schneider et al., 2012). Cell-ACDC (Padovani et al., 2022), which was originally designed for yeast analysis, also implements an overlap-based LAP tracker with splitting detection, as well as various functions ranging from image alignment to manual correction that support the entire analysis workflow in Python. In addition, TracX (Cuny et al., 2022) employs a multi-round tracking and correction workflow using a LAP tracker and mistracking detector by matching image features.
Although other highly accurate methods have been proposed to work for the tracking problem with cell divisions, no single tracking algorithm will be perfect for all the diverse experimental situations (Ulman et al., 2017). To obtain near-perfect segmentation and tracking for specific data, users must still optimize the segmentation and tracking steps, automatically or manually. In this regard, the LAP-based algorithm that robustly works with a small number of parameters continues to play a key role in generating the initial tracking data without large-scale manual annotation.
An adaptive improvement to the original LAP-based tracking with distance can be made by using additional features taken from the cell images. For example, we can extract the morphology of each cell, such as its shape and size, from typical live cell images, as well as the signal levels from multiple fluorescent channels. The consistency of cell shape and fluorescent signals across time frames is useful when tracking is conducted by human eyes, especially when the frame rate of the data is not high enough. Therefore, it is desirable to be able to implement arbitrary inputs and cost functions in the LAP-based tracking scheme, as well as to tune the parameters using partial ground-truth annotations.
These requirements motivated us to build a tool that recapitulates the LAP algorithm (Jaqaman et al., 2008;Tinevez et al., 2017) with additional flexibility and modularity; LapTrack is designed as a simple intermediate in the entire tracking pipeline that takes the positions and features of particles and returns LAP-optimized tracks. The three unique features of LapTrack are (i) arbitrary tunable cost functions for particle connection, (ii) integrability with other Python tools and (iii) the functionality to preserve ground-truth (annotated) connections. Within this framework, we can implement user-defined cost functions for connections that can take an arbitrary number of inputs. The tracking function is modularized and documented as an application programming interface (API) so that it can be integrated into any custom workflow in Python, allowing parallel parameter optimization as well as visualization of results in easy steps.
In this article, we demonstrate how this pipeline can be used not only to optimize the tracking in a supervised manner, but how it is also useful for efficient manual correction of the tracks when combined with visualization tools such as napari (Sofroniew et al., 2022).

Datasets
We here describe the data that we used to demonstrate the use cases of LapTrack: live cell images with ground truth segmentation and tracking (mouse paw epidermis dataset, cell migration dataset, Yeast Image Toolkit dataset and C2C12 dataset) and simulated data (colored particles) provided in https://github.com/NoneqPhysLivingMatterLab/lap track-optimisation. We also used high-density vesicles, yeast and 3D Drosophilla data to show that the tracking pipeline works for a wide range of applications.

Mouse paw epidermis dataset
The segmentation data and the ground truth tracking result collected and analyzed in Mesa et al. (2018); Yamamoto et al. (2022) were used as a reference. The dataset contains 236 to 327 cells in the observation area and has 15 frames.

Cell migration dataset
Images, segmentation data for a portion of frames and the ground truth tracking result were downloaded from Zenodo (Pylvänä inen et al., 2022). Segmentation was conducted by Cellpose (Stringer et al., 2021) and manually corrected in napari (Sofroniew et al., 2022). The ground-truth tracking result was also manually validated and corrected. The dataset contains 218 to 434 cells in the 648.95 mm Â 648.95 mm observation area and has 86 frames.

Yeast image toolkit dataset
The dataset was downloaded from the Yeast Image Toolkit website http://yeast-image-toolkit.org/ (Versari et al., 2017). The data included the ground-truth cell positions at each time frame, which were used for the tracking in the benchmark (Section 3.2).

C2C12 dataset
The dataset (Ker et al., 2018) was downloaded from the public repository (Ker, 2017). We used the first 780 frames of sequence 9 with the 'BMP2' condition for the benchmark (Section 3.2), since it included the annotation for all cells in the field. We manually validated the dataset and removed duplicated annotations on a single cell.

Colored particles
We simulated the Brownian motion of 400 particles with colors in a 2D box of size 20 Â 20 with periodic boundary conditions. The particles were split into two species, a and b, where the interaction between the particles was set as harmonic repulsion with the spring constants set as 1 for a and a pairs, 1.2 for a and b pairs, and 1.4 for b and b pairs. The dynamics was simulated with the simulate.brownian routine in Jax-MD (Schoenholz and Cubuk, 2020) with the parameters kT ¼ 0.1 and dt ¼ 0.001, where the mass and friction coefficient were set to the default values, 1 and 0.1. For each particle, labeled by i, a random integer n i between 0 and 7 is assigned. The feature vector c i 2 R 3 , corresponding to RGB colors, of each particle at each time step is then assigned as where n k i is the kth digit of n i in the binary representation and RðxÞ ¼ d x;0 N ð2; 0:5Þ þ d x;1 N ð6; 0:5Þ, where N ðl; rÞ is the normal random variable with mean l and the standard deviation r. When used for the tracking benchmark, particles crossing the boundary are regarded as disconnected and belong to different tracks.
For Figure 1c, the Yeast Image Toolkit data in IT-Benchmark2/ TestSet4/RawData were segmented by Cellpose 0.7.2 (Stringer et al., 2021) with the parameters model_type¼ 'cyto', net_avg¼True, and diameter ¼ 30 in the eval function. The centroids of each segmented region were tracked by LapTrack with the default metric and track_cost_cutoff ¼ 100, splitting_ cost_cutoff ¼ 2500.

Tracking implementation
The implemented particle tracking algorithm follows the method proposed in Jaqaman et al. (2008), with modifications following TrackMate (Ershov et al., 2022;Tinevez et al., 2017) and additional flexibility, as we describe in the following sections.

Frame-to-frame LAP
In the first step, the points in successive frames are connected by solving LAP, and then generating tracks without splits and merges (Fig. 1a, left top). Specifically, for every pair of points with properties (such as Euclidean coordinates) x i and x j at frames t and t þ 1, the costs l ij ¼ lðx i ; x j Þ are computed using a user-definable metric function l. The costs d and b are then assigned to the particles not connected to any of the particles in the next and previous timesteps, respectively. The optimal assignment is found by minimizing the cost (Jaqaman et al., 2008): where C is the set of all connected index pairs, B and D are the numbers of the points which does not have a connection to the previous and next timesteps, respectively, and l 0 ¼ minðl ij ; d; bÞ (see Supplementary Material for algorithm details). In the default setting, d and b are calculated as 1:05 Â c 90% , where c 90% is the 90% percentile value of the all finite entries in fl ij g ij (Jaqaman et al., 2008). The default metric for l is set to the squared Euclidean distance lðx i ; x j Þ ¼ kx i À x j k 2 2 (Ershov et al., 2022;Jaqaman et al., 2008;Tinevez et al., 2017) with which the cost-minimizing association can be interpreted as the maximum log-likelihood solution for Brownian particles when we ignore splitting and merging (Crocker and Grier, 1996).

Segment-connecting LAP
In the second step, another LAP is solved to predict splitting, merging, and gap closing (Fig. 1a, left bottom). Gap closing connects free segment ends with allowing frame skips. The gap closing cost g ab ¼ gðx a ; x b Þ is calculated by a user-definable metric g for all possible connections between free ends up to a specified frame difference, and the splitting and merging costs s ab ¼ sðx a ; x b Þ and m ab ¼ mðx a ; x b Þ are calculated for all possible connections between a free end and a track midpoint by user-definable metrics s and m. The metrics g, s and m default to the squared Euclidean distance. Then, the optimal assignment is calculated by minimizing the overall cost:

Freezing annotated tracks
We implemented an option to specify partial tracks within the data to be fixed as ground-truth verified connections (Fig. 1d). Fixing the correct tracks is especially useful when performing manual corrections using visualization tools such as napari. As we demonstrate (https://github.com/NoneqPhysLivingMatterLab/laptrack-optimisa tion) (Fig. 1e), track connections can be specified to be fixed by annotating the cell regions before rerunning the LAP-based tracking. The resulting track preserves the training data tracks due to the masking scheme (Fig. 1d).

Parameter optimization
In practice, we introduce the cut-off for the costs l ij , g ab , s ab and m ab , above which those values are regarded as infinity. The values of the cut-offs can affect the performance as demonstrated in Section 3.1, but it is difficult to optimize those values due to the nondifferentiability of the LAP algorithm (Xu et al., 2020) and the high computational cost for repeating the tracking routine. We therefore used non-gradient optimization methods to optimize the specified sets of the parameters in parallel using the package Ray Tune (Moritz et al., 2018) with the Optuna optimizer (Akiba et al., 2019) and random search. We selected the parameters that achieved the highest connection Jaccard index value or true positive rate, depending on the type of the training data (Section 2.3.1).

Analysis pipeline
LapTrack is written in Python with explicit API documentation and can be integrated with, for example, particle detectors in scikitimage and deep learning-based segmentation packages such as Cellpose (Stringer et al., 2021) (Fig. 1b and c). The output data is a networkx (Hagberg et al., 2008) directed graph, which can be analyzed using the network analysis functions in the package. We also implemented utilities to convert data into pandas dataframes (pandas development team, 2020; Wes McKinney, 2010) and shorthand functions to track coordinates organized in a dataframe. In this paper, we used the ground-truth segmentation for each dataset as the input and analyzed the result tracks by networkx and pandas. Python scripts for tracking and analysis are provided at https:// github.com/NoneqPhysLivingMatterLab/laptrack-optimisation.

Metrics for the tracking results
To measure the performance of tracking, we employed the following metrics, which can also be calculated within LapTrack.

Overall tracking scores
To measure the overall track consistency, we calculated the target effectiveness (TE) and track purity (TP) (Bise et al., 2011;Chen, 2021), which penalize the false negative and the false positive detections, respectively. Let us denote the set of ground truth tracks by fT g j g j and predicted tracks by fT p j g j . TE for a single ground truth track T g j is calculated by finding the predicted track T p k that overlaps with T g j in the largest number of the frames and then dividing the overlap frame counts by the total frame counts for T g j . The TE for the total dataset is calculated as the mean of TEs for all ground truth tracks, weighted by the length of the tracks. TP is defined analogously, with T g j and T p j being swapped in the definition. We also measured the mitotic branching correctness (Bise et al., 2011;Chen, 2021), defined as the fraction of the number of correctly detected divisions over the total number of the divisions.

Overlap between predicted and ground truth connections
During the parameter optimization, we used a less computationally expensive quantity, the Jaccard index and the true positive rate of the connections to measure how well the predicted connections overlap with the ground truth. The quantity is defined by jE p \ E g j=jE p [ E g j and jE p \ E g j=jE g j, respectively, where we denote the set of predicted and ground-truth connections by E p and E g , respectively, and the size of a set E by jEj. In the benchmark of the Yeast Image Toolkit dataset (Section 3.2), we additionally calculated the F-score of the assignment 2jE p \ E g j=ðjE p j þ jE g jÞ to compare the performance with previously reported results.

Distance cut-off points can be optimized to increase performance
We first investigated the performance against various cost cut-off points in the simplest cases where the costs for connecting, gap closing and splitting are the squared Euclidean distance between the centroids. Specifically, we varied the maximum distance allowed for frame-to-frame particle association (max_distance) and splitting and gap-closing association (splitting_max_distance), which defines the cut-off for l ij and s ab (g ab ), respectively, and investigated how the overall performance changes. In the mouse epidermis dataset (Fig. 2a), we performed grid search in the parameters max_distance and splitting_max_distance. We found that there exists a maxima in the TE around some finite length scale, suggesting that optimization is useful in performance improvement even for the cut-off parameters (Fig. 2b). We also found that the correlation of the tracking scores between mouse epidermis data from different regions are high upon changing of the parameters [r ¼ 0.96 (r ¼ 0.90) for TE (TP) using data with TE > 0:75 (TP > 0:75), respectively ( Supplementary Fig. S1)], meaning that the optimized parameters are transferable within similar data.

Distance-only LAP tracker can achieve comparable performance to data-specific methods
We then benchmarked the tracking performance of the simple distance-only LAP tracker with the Yeast Image Toolkit dataset. Since the published benchmark results in Versari et al. (2017) do not include divisions, we tracked ground truth segmentation positions without splitting, with different cut-off points max_distance and gap_closing_max_distance (the cut-off point for g ab ). We then calculated the TE, the assignment F-score (tracking F-score), and the F-score for the assignments between the first and the last frames (f) TE score for the colored particles dataset with different frame intervals, with or without the feature difference term in the metric. The error bar indicates the standard deviation of five trials. (g) Mitotic branching correctness score for the mouse epidermis dataset, tracked with the centroid distances (centroid) or the overlap ratio (overlap). The error bar indicates the standard deviation of five trials (long-term tracking F-score). We used the Evaluation Platform software (Versari et al., 2017) to calculate the F-scores. Supplementary Figure S2 shows that this simple tracker achieves TE higher than 0.9 for all the datasets, and the F-scores are comparable to or higher than most published methods (Versari et al., 2017), except the longterm tracking F-score for TestSet 3 and 4, which have frames with large cell displacements. Note that the previous methods track the cells after their segmentation pipeline, whereas we started with the ground-truth segmentation which can be advantageous. Nevertheless, this result suggests that the distance-based LAP tracker can generate tracks with accuracy comparable to data-specific tracking methods, as long as we start with sufficiently accurate segmentation.
We also performed a similar benchmark with the C2C12 dataset and found that the distance-only tracker yields the maximum TE of 0.998 when starting from ground-truth segmentation ( Supplementary  Fig. S3). This is higher than the score from the cutting-edge graph neural network-based tracking method; 0.976, which was obtained from the test including segmentation and in a larger dataset (Ben-Haim and Riklin-Raviv, 2022).

Tunable cost function improves tracking performance
We next investigated if variable cost functions help improve the tracking score for different datasets.
In Figure 2c, we show a snapshot of the cell migration dataset. Here, the cells are moving collectively toward the upper open region. Due to this drift, LAP-based tracking based solely on Euclidean distance fails with large frame intervals, as demonstrated in Figure 2d using datasets with skipped frames. This situation can be easily fixed by changing the cost function by adding a drift term to the Euclidean distance as with the drift parameter d 2 R 2 and defining g and s analogously (Fig. 2d, Supplementary Fig. S4). We used 5% of the non-dividing and dividing connections to tune d as well as the cut-offs so that they optimize the true positive rate of the connections. The details are summarized in the Supplementary Material.
In real experimental data, particles may have features that help to identify species, such as the size, shape, and fluorescent intensities of genetic labels. In those cases, we can use those features in addition to the Euclidean distances to improve the performance. To illustrate this, we measured the tracking performance for simulated particles with eight species, characterized by different sets of feature values corresponding to RGB colors (Fig. 1e, see Section 2.1.5 for details). We then defined the cost function as where c i ; c j 2 R 3 are the feature vectors. We tuned the parameter w as well as the distance cut-off using the training data with 100 frames so that the tracking result maximizes the connection Jaccard index. We then measured the tracking scores for an independent dataset with 100 frames. As shown in Fig. 2f, with the features used in the metric, the target effectiveness with large frame interval remains above 0.8 while it drops to $0:4 when only Euclidean distance is in the metric (w ¼ 0), illustrating the performance improvement by including the particle features. We also observed an improvement in the other scores ( Supplementary Fig. S5).
For segmented images, we can also use the overlap between segmented regions to calculate the cost (Chalfoun et al., 2010;Ershov et al., 2022;Padovani et al., 2022). The flexible implementation allows us to integrate the overlap metric in addition to the distance in the LAP framework. We define l (with g and s analogously) as lðL i ; L j Þ ¼ Àlog which measures the overlap, where L i and L j are the set of pixel coordinates of the segmentation area for the particle i and j and A is a parameter. By comparing the tracking performance for the mouse epidermis dataset with the squared centroid Euclidean distance cases, we found that replacing the metric improves the mitotic branching correctness by $10% (Fig. 2g).

Conclusion
In this article, we showed how the LAP-based tracking pipeline with additional flexibility and optimizability can be useful in improving tracking performance in certain situations, and easily combined with visualization tools to conduct manual corrections. LapTrack, in large part, is complementary to TrackMate (Ershov et al., 2022), which has a useful graphical user interface, support for including feature value differences, and its own optimization pipeline. Compared with TrackMate, LapTrack can take arbitrary inputs and cost functions and is flexible in its output, making it easier to connect with other upstream and downstream analysis pipelines. Trackpy (Allan et al., 2021) provides a tracking routine based on the algorithm by Crocker and Grier (1996) in Python, as well as functions for particle detection, analysis and data input/output. One major difference is LapTrack's ability to detect splitting and merging particles, which makes it more suitable for cell tracking. The tracking function in LapTrack is designed to help make accurate and validated tracks quickly and efficiently, with the hope to increase the amount of ground-truth data that can be used in training more sophisticated tracking methods.
With a sufficient amount of manually annotated ground-truth data, machine learning-based approaches will likely outperform the current parameter optimization strategy of simple affinity metrics. Due to its flexibility, our package can be easily combined with strategies such as one-to-one association affinity learning (Emami et al., 2020;Li et al., 2009), structured learning (Lou and Hamprecht, 2011), and the metric learning approach combined with graph neural networks (Weng et al., 2020), serving as a reusable platform for implementation.