Enforcing temporal consistency in physically constrained flow field reconstruction with FlowFit by use of virtual tracer particles

Processing techniques for particle-based optical flow measurement data such as 3D particle tracking velocimetry (PTV) or the novel dense Lagrangian particle tracking method ‘Shake-the-Box’ (STB) can provide time-series of velocity and acceleration information scattered in space. The following post-processing is key to the quality of space-filling velocity and pressure field reconstruction from the scattered particle data. In this work we describe a straight-forward extension of the recently developed data assimilation scheme FlowFit, which applies physical constraints from the Navier–Stokes equations in order to simultaneously determine velocity and pressure fields as solutions to an inverse problem. We propose the use of additional artificial Lagrangian tracers (virtual particles), which are advected between the flow fields at single time instants to achieve meaningful temporal coupling. This is the most natural way of a temporal constraint in the Lagrangian data framework. FlowFit’s core method is not altered in the current work, but rather its input in the form of Lagrangian tracks. This work shows that the introduction of such particle memory to the reconstruction process significantly improves the resulting flow fields. The method is validated in virtual experiments with two independent DNS test cases. Several contributions are revised to explain the improvements, including correlations of velocity and acceleration errors in the reconstructions and the flow field regularization within the inverse problem.


Introduction
Today, in applications of particle based optical flow measurement techniques images, in the form of subsequently illuminated tracer particle pictures, are captured from which velocity information can be estimated by PIV evaluation methods [10] or particle tracks can be inferred by tracking techniques such Original Content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. as classical particle tracking velocimetry (PTV) [1,2] or tomographic PTV [3][4][5] or by the novel dense Lagrangian particle tracking method Shake-the-Box [6].
For all of the flow measurement techniques mentioned above, an important goal is the ability to measure instantaneous velocity vector fields and corresponding velocity gradient tensor elements in 2D or better still in 3D, in order to analyze the topologies and dynamics of the governing flow structures (e.g. in turbulence). The regular grid oriented crosscorrelation based methods like 2D PIV [10] and later 3D tomo PIV [11] displayed distinct advantages in comparison to classical PTV methods in terms of achievable spatial resolutions of the desired velocity vector fields, and robustness against outliers in several developmental steps during the past few decades [10]. Nevertheless, the robustness of the local crosscorrelation of subsequent particle image distributions brings with it a disadvantage: a low-pass filtering effect caused by the used correlation window size, which is a well-known issue of PIV methods especially when aiming at measuring strong shear flows.
Tomographic PTV on the other hand typically achieves a better spatial resolution compared to classical PTV methods, as it combines the ability of MART based schemes [11] for reconstructing quite dense 3D particle distributions with the high quality of the latest particle tracking schemes. However, due to the tendency of MART to produce an increasing amount of s.c. 'ghost' particles at higher particle image densities, tomo-PTV is still restricted in terms of the reconstructed particle position accuracy and achievable particle image density to values of below approximately 0.05 ppp (particles per pixel). The latest step forward towards higher reconstruction accuracy and increased particle image densities, including an almost complete suppression of 'ghost' particles with much lower computational effort, is the Shakethe-Box LPT technique [6], which combines the iterative particle reconstruction technique [12] in an initial phase with an inherent tracking, prediction and correction scheme. The tracking and particle position prediction of STB uses 3rd order B-splines for temporal filtering and the predicted particle distribution of the next time step is corrected by an image matching ('shaking') procedure using calibrated optical transfer functions (OTFs) of the particle images [13] before any reconstruction algorithm is applied, thereby making optimal use of the physical properties of the imaged particles inside the flow over many time-steps. Nevertheless, although the found particle trajectories and related velocity and acceleration information is well-defined regarding their individual positions in space, their scattered appearance and (mean) interparticle distances does not enable a simple and global spatial filling calculation of the desired (time-resolved) 3D velocity gradient tensor [14] and pressure fields. For this purpose, different spatial interpolation algorithms were proposed, such as FlowFit [7,15] and VIC+ [8,16], which take instantaneous Lagrangian particle-track data (position, velocity and acceleration) as input and exploit known physical properties such as vanishing divergence and the Navier-Stokes equations for incompressible and uniform-density flows to reconstruct accurate and space-filling velocity, acceleration and/or pressure fields [17][18][19]. The working principle in both, VIC+ and FlowFit, is a very similar data assimilation approach. A complete state space consisting of the velocity and pressure fields (u, p) in FlowFit and consisting of velocity and vorticity fields (u, ω) in VIC+ is fitted to measurements of instantaneous particle velocity and acceleration under the aforementioned constraints. These algorithms reach spatial resolutions that exceed the classical Nyquist wave number limit for the velocity interpolation problem alone, owing to the increased amount of data in the coupled reconstruction [7]. Consequently, the data assimilation schemes can achieve higher spatial resolutions than simpler interpolation or filtering schemes, e.g. [20,21], that make use of the constraint of vanishing divergence only. This approach had already distinguished the aforementioned algorithms in the benchmark cases of the 4th PIV challenge [22], where the DLR contribution was evaluated with a predecessor of FlowFit. Figure 1 clarifies the difference between an interpolation approach without constraints and a reconstruction approach with constraints: Consider s as ground truth for an instantaneous velocity and pressure field of an incompressible flow with constant density. The green line represents the subset of this space where the divergence of velocity is zero and the pressure Poisson equation holds. We know s to be an element of this subset but can only approximate it with the help of particle movement observations (velocity and acceleration at n particular points in space) due to measurement errors (r) and the sparsity of these observations. The lower row shows the potential difference between two reconstruction methods where only one would enforce the aforementioned physical constraints. One reconstruction (s ′ ), e.g. a simple interpolation, lies outside of the physically relevant subset. This is to be expected in the face of measurement errors and undersampling if the physical constraints are not enforced because this approach has more degrees of freedom to distance itself from the ground truth compared to the method that enforces the physical constraints.
We want to stress that FlowFit reconstructs single time instants. FlowFit's input is a temporal snapshot of particles tracks, i.e. a frozen point cloud of particles scattered in 3D space with known instantaneous velocities and accelerations and their estimated uncertainties. The inverse nonlinear optimization problem is addressed by minimization of a cost function that is composed of the deviations from the incompressible Navier-Stokes equations ; see the appendix for a brief recapitulation of the applied method or [7] and [15].

Temporal consistency
Two subsequent reconstructions that are a time shift τ apart can in principle be compared by means of any CFD time marching simulation method (e.g. direct numerical simulation, large eddie simulation) with reconstructed field values as initial conditions. As described above for FlowFit, subsequently reconstructed fields are only coupled by the physical behavior of the Lagrangian tracers themselves, but the fields u, p are reconstructed independently in a space-filling manner for each time step. This implies that two subsequent reconstructions simulated forward in time will not necessarily show an identical time evolution, and are thus not temporally consistent with each other. This temporal inconsistency is due to a missing space-filling temporal constraint in addition to the spatial constraints in FlowFit. Only field values close to measured particles are temporally consistent in this sense. Consequently, there is still potential for further increasing precision by combining the information from subsequent time instants, since the current procedure does not a priori impose temporal consistency. One could argue that, to some extent it is guaranteed, owing to the physical input of temporally B-spline filtered Lagrangian particle-tracking data, but the temporal dynamics of smaller scales in the reconstructions are often particularly severely affected by spatial undersampling. The latter is illustrated in figure 2. For the sake of simplicity consider a stationary velocity field u(x, t) = u(x) seeded with some tracers. The reconstructed velocity field u rec (x, D track (t)) is now obtained from sparse Lagrangian particle data denoting the position, velocity and acceleration of each of the N particles. But after a time τ , the Lagrangian tracers have traveled through the field, so their position, velocity and acceleration have changed to D track (t + τ ). With missing temporal constraint data the reconstructed fields do not necessarily coincide, so in general is obtained. By temporal variations of the input data, the time derivative of the reconstructed velocity field d dt u rec (x, D track (t)) is non-zero, although the underlying field was known to be stationary d dt u(x) = 0. Consequently, in the more general case of instationary fields, time derivatives of true and reconstructed fields do not coincide. This results in physically inconsistent behavior of flow features if the fields of interest are spatially undersampled, e.g. due to limited particle seeding and high Reynolds numbers. If multiple particles randomly occur in close proximity to high velocity gradients, the latter will be reconstructed with higher accuracy than in a case with a more unfavorable particle distribution or particle shift, even if the gradients themselves persist over time. Preventing this scenario in a computationally affordable manner is among the aims of this work.
More recently [9] proposed a method called time-segmentassimilation (TSA) that incorporates multiple time steps into the inverse problem of the flow field reconstruction. Therein, the temporal evolution of flow fields is obtained by forward and backward simulation applying the vortex-in-cell time integration method [CHRISTIANSEN1973363]. It performed well in recovering additional structures and implements the missing temporal constraint to the detriment of computational cost.
Using temporal information within the reconstruction process provides the following benefits: As pointed out by [24], the achievable spatial resolution should in this case be limited by inter-streak distance r s instead of inter-particle distance r p and r s < r p holds true at all times. Also according to [25], the additional temporal information yields an increased ratio of the number of realizations from STB measurement to the degrees of freedom of the reconstruction target. Innovatively, the naturally Lagrangian data framework of VIC# [26] and FlowFit has been utilized to address the missing temporal constraint by taking into account information at additional non-particle positions: [25] applied Taylor's hypothesis to project spatial information onto preceding and prospective positions on particle streaks, whereas [27] introduced virtual tracer particles (VPs) to induce memory in the reconstruction process. The former was demonstrated on experimental data of a thermal plume, where considerable temporal smoothing was achieved. The focus of this paper lies in an elaboration on the virtual particle seeding approach introduced by [27] in artificial experiments. This way, improvements in precision can be distinguished from immoderate temporal smoothing of the fields.

Uncertainty quantification and error scaling
The uncertainty quantification of LPT/PTV methods and spatial field reconstruction schemes is a current topic due to recent developments, see e.g. [28]. As mentioned above, FlowFit and VIC+ can retrieve smaller flow scales than the interparticle distance (Nyquist limit) owing to the combination of both velocity and acceleration data in the reconstruction process. With few exceptions, particle tracking experiments regarding recent turbulence research are carried out in the high-Reynolds-number-regime, targeting general turbulence characterization, the occurrence of a secondary peak in the energy spectrum, and the origin of superstructures in wallbounded flows [29][30][31][32][33]. This carries the experimental research to the boundaries of feasibility and requires, e.g. high flow rates in wind tunnel facilities and/or small tracer particles, which complicates dense volumetric seeding owing to limited particle seeding rates on one hand and a lack of spatial resolution on the other. Using such particle-tracking data for volumetric field reconstructions raises the question of uncertainty and error quantification for applied reconstruction methods like FlowFit. It is anticipated that inter-particle distances greater than several Kolmogorov length scales η will effectively cause truncation of the turbulent spectrum. But an assessment of the scaling of expected velocity and acceleration errors with average inter-particle distance has not yet been performed for various benchmark flows. The average inter-particle distance is a measure for seeding concentration but does not cover local concentration variations due to rapid particle dispersion in turbulent flows. In principle, the uncertainty in flow field estimation is expected to decrease locally whenever particles approach each other. This motivates us to inspect the connection between reconstruction errors and local particle configurations within the scope of this work.

Structure of the paper
This paper addresses the questions of both uncertainty quantification and temporal consistency in FlowFit's flow field reconstructions, as an understanding of either one of these topics may greatly contribute to the other. For this purpose, two local criteria based on properties of the underlying particle cloud that might correlate with errors in estimated field values are introduced in section 2.1. In the next step, a scheme for temporal coupling of the reconstructions via virtual particles (VPs) is presented in section 2.2. Section 3 describes two artificial particle-tracking-experiments in homogeneous isotropic turbulence and a channel flow designed utilizing the Johns Hopkins Turbulence Database (JHTDB) test cases and tools [34][35][36]. These form the basis of further analysis, which aims to (i) establish a connection between local particle cloud configuration and errors in estimated velocity and acceleration field values using FlowFit 2.0 and (ii) quantify and discuss the impact of additional information carriers in the form of VPs on the reconstructed fields. Besides the temporal coupling, a further beneficial effect of VP seeding is identified. The distance d from a measurement point (grey) to the nearest neighbor in the initial particle cloud (black). Right: The more uniform/spherical the arrangement of three nearest neighbors, the smaller s becomes.

Candidate properties for local uncertainty quantification
FlowFit optimizes flow fields based on the Lagrangian data D track at certain time instants. This optimization aims to find continuous fields coinciding with measured field values U k , A k at particle positions X k . Owing to continuity, estimated field values will become less reliable with increased distance from particles, which qualifies the nearest neighbor distance d, i.e. the distance between the position at which the reconstructed field is evaluated and the position of the closest particle in D track , as a first candidate for local uncertainty quantification ; see also figure 3. On the other hand, acceleration field values contain contributions from local velocity field derivatives, which might presumably be better resolved in a situation of a more uniform, i.e. spherical arrangement of nearest neighbors. An unambiguous mathematical criterion to assess uniformity can be constructed from the first three nearest neighbors by the summation of the unit vectors from point of interest to neighbor position. The absolute of the resulting vector, denoted s, will read zero in the case of a perfectly spherical composition, but will yield a value of up to three if all neighbors are found in the same direction, see figure 3. The general statement holds true that the greater the value of s , the less uniform the arrangement of neighbors becomes. Note that in the case of four or more neighbors, there are degenerate, non-spherical cases with small s, e.g. four particles aligned.

Algorithm for the advection and weighting of virtual particles in FlowFit reconstruction
The spatial gaps in measured clouds of Lagrangian tracers were identified to be potential sources of unphysical temporal behavior. If the Lagrangian tracers were dense enough to resolve the smallest relevant scales at inter-particle distances smaller than η, no such problem would occur. Consequently, suppression of unnatural changes in undersampled flow regions requires a sufficient number of tracers. More generally speaking, a sufficient number of data points is required to reduce those degrees of freedom that allow for temporal deviations from the governing equations. Since any experimental record cannot be changed in hindsight, we aim to artificially enhance the number of tracers from the reconstructed fields themselves. Additional virtual tracer particles (VPs) can make use of the physically meaningful velocity and acceleration information provided by the reconstruction process and restrict the material derivative within spatial gaps without real tracers. The motivation for their application (instead of using joint methods formulating a 4D inverse problem) is the incontestable simplicity of the approach and the ease of implementation in the naturally Lagrangian data framework of FlowFit. An outline of the procedure can be found in figure 4. Positions are drawn randomly from the flow-domain to select the VPs. In the following, these are advected into the next time step by means of the reconstructed velocity and acceleration data. Next, they are added to the existing particle cloud with individual weights for the updated velocity and zero weight for their initial acceleration. Based on the extended particle cloud and the weights, FlowFit reconstructs the field under these additional constraints. In this way, direct temporal coupling with almost no further computational cost can be achieved, and reconstructions run on the enhanced clouds are less sensitive against positional changes of the real particles due to artificially increased seeding density. However, care must be taken in case of possible noise amplification. To circumvent this problem, the weight with which virtual particle velocities enter the cost function is clearly required to be lower than the corresponding weight of real tracer velocities. This will guarantee that FlowFit decides in favor of the real particle information in case of any contradictions. Additionally, a dampening of virtual particles with locally estimated velocity gradients can be applied.

Weighting of Virtual Particles
The deviation from real particle velocities enters the cost function with a weight close to 1. Virtual particle velocities are applied with a lower weight for velocity and zero weight for acceleration. On the other hand, the weight should not be too small in order to let them be smoothed out. The advection is typically subject to relatively large errors, so it is worthwhile to deduce equations for their temporal evolution. The latter can subsequently be used to define proper weights. Note that a physically motivated estimate for the acceleration exists in the current case, so the advection incorporates a higher temporal derivative and is presumably less erroneous than in the case of known initial velocity alone. However, this acceleration is uncertain (FlowFit error) and undergoes unknown continuous changes over time owing to dynamics (dynamic error), which leads to additional errors in the estimation of the new position and velocity. Consequently, we aim to incorporate both effects into the equations by considering the Lagrangian statistics and the reconstruction errors. In the following, the deviation of true velocities and the reconstructed velocities (FlowFit velocity error) is denoted as ν, which reads ν = u reconstructed − u true or ν = u reconstructed − u true for vectors or a single component, respectively. Analogously, to denote acceleration, α or α is used. Because the reconstructions represent a lowpass-filtered version of true fields, the assumption of vanishing mean values of the errors ν and α is justified.
The Taylor series for a position and velocity component along a trajectory read with error termsΘ(τ ). In fact, the true values for initial velocity u(t) and initial acceleration a(t) are unknown so the estimation become where the Θ's now contain both errors, owing to deviations from the constant acceleration model and errors due to the reconstruction, respectively. The mean square errors (MSEs) of position and velocity become Herein, the time dependent shifts . . ⟩ denotes the temporal average over t (plus additional average over spatial symmetries of the flow) and repeating time dependence noted by ...(t) was dropped above for better readability. So, terms like ⟨∆x(t + τ )u(t)⟩ or ⟨∆x(t + τ )a(t)⟩ implicitly contain the Lagrangian velocity and acceleration autocovariances and crosscovariances which can be evaluated on the basis of the available particle tracks. In general, the MSEs show quite complex dependencies but there is less reason to assume that the FlowFit errors are correlated to any of the other properties so crossterms containing ν or α are neglected. This way, equations (5) and (6) yield the simpler versions All of the terms can be evaluated and binned spatially by use of (i) the particle-track data and (ii) track benchmarking in order to estimate the reconstruction errors [37]. To elaborate on the latter: If no ground truth data in form of a reference or DNS is available, reconstruction errors ν and α can be obtained indirectly by the exclusion of a small number of particles from the FlowFit reconstruction. This way, FlowFits input data is not altered much, but independent ground truth data modulo measurement noise is created to compare with the FlowFit results.
Since reconstruction errors will certainly exceed the measurement noise in spatially undersampled flows, this will be a reliable ansatz. In the JHTDB test cases the reconstruction error can be accessed directly. Use of binned Lagrangian statistics ensures that even spatial variations of the dynamic time scales are covered, as is the case in boundary layers. Note that ⟨Θ 2 x ⟩ and ⟨Θ 2 u ⟩ are thus functions of position and prediction time τ . Usually, the time spacing between camera snapshots is constant and so only a single τ needs to be evaluated, therefore the dependency is dropped in the following notation. To apply the results for a weighting of virtual particles, the considerations have to be generalized to three dimensions. Because FlowFit does not consider a positional error, the MSE in position must be added to the velocity error by consideration of the estimated velocity gradients, so and analogously for the velocity components v and w. C gradient is a constant greater than or equal to 1, which accounts for the smoothing of gradients in the reconstruction. With these errors at hand, a scalar weight W for virtual particles can be introduced thus which takes values smaller than 1 to account for the lower reliablity of the virtual particles. RMSE denotes the root mean square errors. Here, another constant C weight is introduced, because tracked velocities in the JHTDB test cases show no experimental error. We want to stress that in an experimental study, no C weight has to be chosen. Note that the scalar weight W can vary with VP position due to spatial variations of Lagrangian statistics in equations (7) and (8) and the consideration of local gradients in equation (9). Further, the dependence of local particle cloud configuration expressed by d and s has not yet been considered but may be included in the calculation.

Description of the JHTDB test cases
Virtual experiments in two test cases from the Johns Hopkins Turbulence Database were performed. Cuboid sub-volumes were chosen in the forced isotropic turbulence (FIT) dataset [34] with Re λ ∼ 433 and the channel flow (CF) dataset [35] with Re τ ∼ 1000. The latter sub-volume is co-moving in streamwise x-direction with a speed of 0.7 in order to keep track of the same flow structures over a greater range of time.
The position and extent of the sub-domains is documented in table 1. Several sets of particle tracks with different seeding densities were generated by making use of the database functions [36]. To achieve homogeneous seeding, the subvolumes were extended by a buffer volume within which, random positions were drawn from a uniform probability density. All of these particles are integrated forward in time. At each time step, the particles inside the sub-volumes are saved and utilized to generate track-files, whereas the particles in the buffer layer are replaced by newly drawn random positions to guarantee a constant particle inflow into the sub-domain. The velocities and accelerations in the trackfiles were determined from velocity and pressure provided by the spatio-temporal interpolation schemes (8th-and/or 4th-order Lagrange polynomials and 4th-order centered finite differencing in space and piecewise cubic hermite interpolation polynomials in time). The frame spacing ∆t, i.e. the size of the time step between successive reconstructions, was chosen rather conservatively to enable variations in hindsight. The frame spacing in FIT was adjusted to approximately τ η /4, which corresponds to roughly 1/8 acceleration decorrelation times. The frame spacing in CF was chosen to be approximately 1/4 of the acceleration decorrelation time in the layer closest to the wall, i.e. the layer of the sub-domain with the smallest dynamic timescale.
In the following, the track-data at every time instant is processed with FlowFit. A B-spline grid is set such that the number of particles per cell is in the order of 0.04 unless the grid spacing would fall below the DNS grid spacing. FlowFit relies on a parameter R (acceleration-velocity measurement error ratio) which is to define for proper weighting in the cost function (see appendix), and can be evaluated for any measurement [7]. However, by accessing DNS flow fields directly, no measurement error can be quantified in the artificial test case. A first attempt to cast velocity and acceleration errors comparable in cost function (and to define R) is by means of a normalization with the intensity of turbulent fluctuations, i.e. the standard deviations of velocity and acceleration. This predicts a value of R ≈ 4 in FIT. But according to a small parameter pre-study of the data the parameter was corrected to 30 in FIT and 40 in CF in order to account for error-prone acceleration calculation from pressure values. The latter arises since the simulations were performed with spectral methods, but finite difference schemes and Lagrangian interpolation polynomials are applied using JHTDB database tools as mentioned above. If R is heavily underestimated or overestimated, this will be equivalent to fitting to either only acceleration values or only velocity values on particle positions, respectively. If only the particle velocity is being fitted to, this will basically yield a divergence -free interpolation of the field without making use of acceleration data (and Navier-Stokes constraint) but provides a nonsense pressure field. Fitting solely to acceleration data will result in a nonsense velocity field with high frequency content defined mostly by measurement errors (or interpolation errors in the DNS test case) alone. It is therefore not critical to slightly overestimate R here, as the utilisation of acceleration data is a plus, but an underestimation results in totally disregarding measured velocities, which are usually known with higher accuracy.

Uncertainty analysis and link between particle configuration and reconstruction quality
The errors in FlowFit's flow field reconstruction need to be characterized first in order to enhance the reconstruction results in any second step. To that end, the FIT test case is best suited for the application of the developed candidate properties d and s because errors in any quantity can be treated the same way for each spatial coordinate. The reconstructed B-spline fields are sampled on the DNS grid for best comparison of reconstructed field values and DNS field values by means of ν and α. Additionally, the errors very close to the boundaries of the sub-domain are observed to exceed the error in its interior by a factor of up to 4 due to unknown boundary conditions. Thus, a shell with thickness of 0.08 sub-domain-edge-length was excluded from the following analysis. This was found to be sufficient to compensate for the effect. Also, the latter step prevents a bias in the evaluation of s and respective errors because boundary points are known to exhibit high s-values. Note that distance, velocity and acceleration are obtained in dimensionless form from the simulation, which by no means hinders the assessment of the relevance of s and d. An example PDF of obtained velocity and acceleration error is provided in figure 5. The distributions are symmetric and long-tailed. Furthermore, the error PDFs are found to be isotropic, as expected for an isotropically forced test case and an isotropic cost function. The pronounced tails are inherent to the velocity increment statistics at small spatial shifts as well as to the velocity gradient tensor elements, as shown in [38]. Consequently, long tails in the error PDFs will certainly occur if the reconstructions are seen as low-pass filtered versions of the true fields. So it is expected that the error PDFs resemble turbulent velocity increment distributions.
Exemplary joint PDFs of squared deviations ν 2 and α 2 and candidate properties d and s are provided by figure 6. Besides different error amplitudes, the histograms appear very similar regardless of seeding density. The maximum of the distributions on slices with fixed d basically shifts towards smaller values if the nearest neighbor distance is decreased. So, a clear trend of reduced errors for decreased nearest neighbor distance is observed, as expected from the form of the cost function. Interestingly, no clear relation between errors and nearest neighbor configuration can be found, since no trend can be inferred from the joint PDFs involving s. Figure 7 shows the presence of d-dependence in terms of root mean square errors (RMSEs) for all considered particle seeding densities. The RMSEs approach a constant value close to the particles, which can be interpreted as the tolerance owing to the weighting of different terms in the cost function. Furthermore, RMSEs grow monotonically and exhibit a region with linear behavior in the double logarithmic plot. Note that the statistics to the very left and very right end of the d-axis are not converged due to the small amount of data points per bin, compare to figure 6, and that only every second data point is displayed. Owing to the former observations, a functional ansatz in the form of a power-law plus a constant for small to intermediate distances is chosen, explicitly for both velocity and acceleration. The constant C 2 therein corresponds to the error that is admitted at the particle positions. Weighted least square fits with regard to equation (11) are  shown as lines, and data to the far right end was masked. Exponents γ close to one indicate linear scaling, which cannot be perceived visually in the double-logarithmic plot but appears obvious from a Taylor-series perspective. Furthermore, the weighted least squares fit reveals that the tolerance for deviation from particle velocity grows only slightly with decreasing particle number, whereas the tolerance for deviations in particle acceleration increases significantly. This cannot be caused by a changing ratio of B-spline grid points to particles, since the number of particles per B-spline cell is tuned to about 0.04 to allow for sufficient degrees of freedom in the reconstructed field, so an exact reasoning remains puzzling. The C 2 -values allow us to check for consistency, because the internal FlowFit parameter R should approximately coincide with C α 2 /C ν 2 , if the remaining terms in the cost function are disregarded. C α 2 /C ν 2 = 33.81 compares well to R = 30 in a sufficiently seeded case N = 26000, but the discrepancy increases with lesser N, indicating that a reconstruction of matching velocity and acceleration fields is hampered by spatial undersampling.

Scaling of Total Mean Square Error on Number of Particles
The total MSE averaged over time and space (disregarding field values too close to the boundary) is a simple quantitative measure for the quality of the reconstruction. It will certainly decrease the more information on the present flow field is available, i.e. the more tracer particles are present, or equivalently, the higher the seeding density. Figure 8 shows the available data points and indicates by a weighted least squares fit that reconstruction accuracy in velocity and acceleration scales roughly with the inverse of N in FIT. In contrast, the situation in the non-isotropic CF case is less intuitive. A roughly inverse scaling with N is obtained for total velocity MSE but the exponents for total acceleration MSE suggest an N −2/3 dependence for the considered range of seeding densities inside the loglayer of the wall-bounded flow.

Impact of virtual particles
Section 3.2 aimed to link the uncertainty in flow field estimation to two local properties associated with the Lagrangian particle cloud. In the course of this analysis, the local arrangement of particles was found to be irrelevant, and only the nearest neighbor distance could be identified as a measure of reconstruction quality. Although a clear trend can be quantified, the errors scatter widely regardless of neighbor distance. In consequence, the preliminary study hardly assists in the design of an optimal seeding strategy for additional VPs. The latter aims to preserve dispersive structures that are resolved within a certain particle distribution, but whose temporal evolution is highly dependent on ingoing and outgoing tracers, which suggests VP seeding close to real particles or on their trail, respectively. However, as pointed out in the introduction, it is at spatial gaps furthest away from real particles where the time derivative of reconstructed fields is least constrained. A reasonable first approach would therefore be to seed VPs randomly with a uniform probability density in the flow domain. The nearest neighbour distance is disregarded from the weight of VPs due to the latter demand and the great scatter in figure 6, i.e. temporally averaged values are used for the α and ν terms in equations (7) and (8) as initially assumed. The parameters C gradient and C weight are adjusted to set the relative weight W of VP velocities in the cost function to a value of about 0.7. Much smaller values resulted in negligible effects on the reconstructions.
One aims to minimize the error over time and so the total MSEs, i.e. MSEs averaged spatially over the sub-domains, serve as comparison criterion for the impact of VPs. The temporal evolution of the total MSEs under the influence of additional VPs is presented in figures 9 and 10 for the FIT and CF test case, respectively. The graphs show that the velocity MSE decreases rapidly within the first ∼10 time steps and even the very first step causes a great drop in the FIT test case. Thereafter, the velocity MSE ratio levels were measured at values between 0.9 and 0.7 corresponding to a total MSE reduction of 10% to 30%. Only velocity data is provided at the VP positions, but still the acceleration MSE decreases and levels accordingly. The gain in acceleration ranges from 5% to 15%. Note that the graphs display the temporal evolution of the total MSEs of single realizations in rather small subdomains. Therefore, they are expected to be noisy and show  some abrupt changes whenever badly resolved structures leave or enter the domain. Moreover, the MSE values to which the graphs are normalized differ greatly due to the higher accuracy in the case of high seeding densities, compare figure 8. For this reason, the noisy appearance of the acceleration MSE ratio in the highly seeded channel flow test case should be interpreted with care.
The enhancements can also be observed visually. Figure 12 (13) and yield positive values whenever the quadratic error is reduced after VP seeding, but negative values otherwise. In the depicted flow domain, the locations of positive acceleration gain appear inside the Q-isosurfaces (encasing high Qvalues with regard to a defined threshold), which motivates us to evaluate conditional averages of the gain with respect to the Q-value determined by FlowFit. Figure 13 shows an exemplary result for the FIT test case, in which the trend can be confirmed statistically. Interestingly, high velocity and acceleration gain is not found solely at high positive Q-values, but rather increases with absolute Q-value.

Discussion of the improvements
The previous section demonstrated that virtual particles beneficially affect physically constrained flow field reconstruction by temporal coupling. The percental reduction of mean square errors in figures 9 and 10 appeared mostly independent of undersampling, although the reconstructed flow fields are of quite different quality due to varying seeding densities. The improvement was shown to appear in regions with high Q-absolutes, but most notably affects the vortical structures. A clearly unphysical effect of VPs manifests in the great reduction of MSEs after only one advection step in the highly seeded FIT cases. Two successive time steps contain almost the same information and there is no clear physical reason for an instant error drop of this size. This reduction could theoretically depend on the transport of well resolved structures/information into the boundary where the reconstruction is known to yield significantly higher errors. However, this effect was already accounted for by excluding the boundary from the MSE calculation itself, so it cannot be the cause of this considerable gain. Furthermore, one could raise the question of time reversibility, since the idea of additional information carriers in the form of VPs should not be limited to advection progressive in time. This is rendered reasonable because the dissipation is practically negligible on Kolmogorov time scales and the Euler equations are time reversible. In fact, it is not yet possible to enhance the flow fields under backward advection or alternating backward and forward advection, respectively, since the MSEs would grow backwards and decrease again during a forward step. Several trials yield the same qualitative results. One reason for this peculiar behavior is found by a reconsideration of formulae (5) and (6). During the advection process the velocity MSE following the VP positions changes, according to equation (6). Therein, the shift ∆u can be approximately expressed by a(t)τ when dynamic changes of the acceleration are ignored for the moment. This yields the neat and compact equation Therein, all but the last term have positive signs by definition. If the last term is non-vanishing and negative, the velocity MSE ⟨Θ 2 u ⟩ will initially decrease for small positive time steps until the quadratic term takes over. For negative time shifts, all terms will be positive and no such effect can occur. Indeed, when the reconstructions are checked for the a priori neglected covariance term ⟨να⟩, the average correlation coefficients C να = ⟨να⟩(⟨ν 2 ⟩⟨α 2 ⟩) −1/2 in tables 2 and 3 are obtained for the FIT case and the CF case, respectively. They take absolute values between circa 0.07 and 0.45 and grow with the number of seeded particles. Equation (14) suggests that the possible gain will depend on the strength of this correlation and the time shift. The minimal velocity MSE ⟨Θ 2 u ⟩ in equation (14) occurs at a time shift τ * and causes a relative By pure chance, the initially chosen frame spacings in table 1 coincide almost perfectly with the optimal time steps τ * for both test cases. Thus, the effect predicts maximum MSE improvements between 0.6% and 20% dependent on the seeding density. Consequently, it explains both the instant error reduction in the highly seeded FIT cases and parts of the error amplification during backwards advection. Here, the question for the origin of the correlations can be raised. FlowFit distinguishes from simpler reconstruction schemes, since both velocity and acceleration fields are found simultaneously rather than solving the fields separately. The joint approach enhances the reconstruction quality, but at the same time it becomes likely that deviations in velocity and acceleration are slightly correlated also. The error-correlation model fails to capture the  total gain in the other cases, so more mechanisms must clearly exist to explain the improvements. The prediction of the velocity MSE ratio along the advection paths of VPs from equation (14) ignored dynamic changes in acceleration, and assumed that the virtual particle would land on the right position, which is obviously not the case according to the error provided in equation (5). Instead, the velocity MSE ratio along VP tracks can be calculated on the fly by comparison of reconstructed and true flow fields as follows. The velocity error in the flow domain can be thought of as an instationary field ν(x, t). Now, a VP is placed at a random position x 0 and advected with reconstructed velocity and acceleration, u 0,rec = u(x 0 , t) + ν(x 0 , t) = u 0 + ν 0 and a 0,rec = a(x 0 , t) + α(x 0 , t) = a 0 + α 0 . From the data, we can now compare the velocity MSE ν 2 1 at new position and new time to the initial value ν 2 0 , which reads averaged over multiple realizations and time. Representative results in forward and backward time direction are provided in figure 14. We want to stress that equation (17) and figure  14 do not present the enhancements of the whole algorithm, i.e. the iterative advection-reconstruction procedure of figure  4, but only enhancements caused by the error-prone advection step alone. The ratio ⟨ν 2 k+1 ⟩/⟨ν 2 k ⟩ was evaluated dependent on nearest neighbor distance. Close to real particles, the velocity error along the path of the virtual particle increases under advection, regardless of time direction. This is reasonable, since the error is typically very small in the vicinity of real particles and so is the denominator in (17) and the total contribution to the total velocity MSE. Favorably, those VPs further away from real particles experience a reduction of their squared velocity error in accordance with equation (16), but an improvement of about ten percent is not captured by the obtained correlation C να alone (compare to table 2). Furthermore, the partition of improved versus worsened velocities shows that the effect is not due to outliers. Similar results showing a velocity MSE reduction of about 10% after a single forward advection are obtained for the rest of the FIT cases and in addition all considered CF cases, which shows that VPs act as error dampening only in forward time direction and preferably in the spatial gaps with no real particles. In these gaps, the Navier-Stokes constraints and the regularization contribute to FlowFit's cost function, of which the latter can potentially add partly to unphysical behavior. One could argue that the puzzling behavior under time reversal contradicts the notion of VPs as additional information carriers. In fact, the turbulent fields of interest do not behave in a time reversible way themselves. The characteristic teardrop shape of the joint probability density of velocity gradient tensor invariants is believed to be a universal aspect of small scale turbulence, [39][40][41] only to mention a few. It shows in particular that the occurrence of stable and unstable foci corresponding to stretched and compressed vortical motion is not equally likely. Under time reversal, stable and unstable foci interchange, which will reverse the chronological order by which tracer particles pass certain structures. High gain due to VPs emerges within vortical structures. So by time reversal, any advantageous mechanism present in the forward direction can change to the opposite. On the other hand, time-symmetry breaking is already quantifiable from a particle perspective. Two-particle dispersion (or more generally multiparticle dispersion) is known to be non-reversible in time, both on very short and long time scales ( [42] and [43], respectively). Therefore, the bulk of virtual particles will undergo different relative positional changes dependent on time direction. Furthermore, it is not the interaction with the true turbulent topologies but with those favored by FlowFit which causes the  self-focusing effect and renders the temporal coupling extrauseful. It appears that the reconstructed topologies and the real turbulent topologies interact in such a manner as to shift the reconstruction closer towards the truth under forward advection. The above reasoning cannot elucidate the missing timereversibility in the application of virtual tracers, but should at least suffice to illustrate the complexity of the problem.

Revision of the regularization
FlowFit relies on a regularization (curvature penalization) to make the underdetermined inverse problem uniquely solvable. Additionally, the regularization aims to dampen noise inherent to particle tracking data. In order to elaborate on its necessity, think of the reconstruction process as an interpolation problem. Given the scattered sample points for, e.g. velocity, one could decide to use scalar radial basis functions (RBFs) for interpolation. However, if the chosen interpolation function is too narrow, the field will not appear smooth but exhibit symptomatic, spurious oscillations between sample points. Related problems, such as overfitting, are circumvented by FlowFits regularization, which constrains the B-spline interpolant to be as smooth as possible under simultaneous fulfillment of Navier-Stokes. In the previous section, reconstructed flow fields were presented showing unconnected vortex blobs owing to missing particle data in the intermediate zones. In particular, that these gaps are reduced partly by use of VPs, which were shown to act as error dampeners in the absence of real particles. It is those gaps where Navier-Stokes constraint and regularization compete against each other, which possibly contributes to the error correlation C να and the effect of VPs. It is therefore desirable to design a regularization that is justified not only by the nature of interpolation problems in general but also from a physical perspective. Basically, it would be desirable to make the regularization itself favor vortex tubes instead because these are more common structures: The teardrop shape of the joint PDF of velocity gradient tensor invariants Q and R reveals that vortical motion is more likely to appear in the form of stable foci together with a stretching motion. Vorticity itself is accordingly distributed in a prolate fashion owing to the self-stretching term. Thus the vortex dynamics induce local anisotropy such that, e.g. the gradient of vorticity is smaller in the direction of vorticity itself : compare to the sketch in figure 15. This suggests that the second derivatives of the velocity fields, i.e. their curvature, exhibits the same behavior. Therefore, the problem of vortex blobs can be addressed by an alternative definition of smoothness in terms of anisotropic curvature penalization. This means that if the direction of vorticity is known approximately, the occurrence of gaps within Q-envelopes of vortex tubes can be penalized more.
To base the idea on solid ground, the Hessian H of the velocity fields in the FIT test case is evaluated by use of the provided analysis tools. The curvature of a velocity field along the direction of vorticity n ∥ = ω|ω| −1 is expressed by c p = |n ⊺ ∥ Hn ∥ |, which will be called parallel curvature in the following. Analogously, the orthogonal curvature reads c o = |n ⊺ ⊥ Hn ⊥ | with n ⊥ · n ∥ = 0. The unconditional joint probability density of parallel and orthogonal curvatures obtained in the FIT case is provided in figure 16, confirming the hypothesis. The mean values of both curvature absolutes are comparable in the absence of swirling motion, but their ratio increases rapidly with vorticity, see figure 17. Consequently, an estimate of the vorticity field can be translated into an advantageous unconditional regularization: As opposed to an isotropic penalization of terms corresponding to one can specify a local coordinate system by the three orthogonal vectors n ∥ , n 1 ⊥ and n 2 ⊥ and penalize the weighted second derivatives instead to account for the expected anisotropy. Note that curvature ratio and coordinates both depend on the vorticity  field. Thus, a clever hierarchic implementation with updated vorticity field estimates needs to be found to circumvent the given chicken-and-egg problem.

Results and outlook
This feasibility study verified the beneficial effect of additional virtual tracer particles (VPs) in physically constrained flow field reconstructions. The additional VPs constrain the material derivative in undersampled regions of the flow domain and counteract unphysical, temporal changes of the Eulerian field, too. The flow topologies are observed to appear more consistently after application of virtual tracers, although no considerable gain in the form of additional small scale structures can be found in the spectrum. This suggests that VPs assist in recovering and shaping those persistent structures that are occasionally found by FlowFit, but they cannot be considered for enhancing spatial resolution significantly. Their application is native with regard to the data framework of FlowFit and VIC+, straightforward to implement based on available Lagrangian statistics, and comes with almost no further computational costs in contrast to joint methods that accompany the spatial reconstruction with computationally expensive simulation methods for time integration. Besides the temporal coupling, another beneficial effect of VP seeding is identified due to moderate correlations of velocity errors and acceleration errors in reconstruction results obtained with FlowFit 2.0. This has hindered the application of VPs in reversed-time direction so far. However,the steady enhancement of flow fields regardless of initial seeding density and correlation strength hints at another mechanism causing the improvements, rendering the temporal coupling responsible. In order to rule out whether missing time reversibility is caused by the physical problem or the reconstruction algorithm, a thorough parameter study would need to be performed. This should include several particle-to-VP ratios, different weighting approaches and seeding strategies. To elaborate on the latter: VPs are shown to take most effect far from real particles and at high Qvalues. As such, seeding on trails of real particles or within regions of high Q-values could be considered useful. As the uncertainty quantification based on local particle cloud properties could not assist in the choice of an optimal VP seeding strategy, the development of uncertainty quantification and seeding strategies based on estimated flow properties should be considered. In the long term, the approach should be applied to experimental benchmark cases. Currently, a third version of FlowFit is under development. This implements an advantageous divergence-free B-spline basis and a fast Poisson-solver for the pressure field, effectively reducing the necessary terms in the cost function. VPs can potentially take greater effect in combination with a simplified cost function and an anisotropic regularization.
The DLR developed the flow reconstruction tool FlowFit [7], which can in principle be described as an interpolation and data assimilation scheme. Particle positions, velocities and accelerations can be deduced from optical particle tracking experiments, either following the classical particle tracking velocimetry approach (PTV) [1,2] or the novel dense Lagrangian particle tracking method Shake-the-Box [6]. FlowFit's input is a temporal snapshot of these tracks, i.e. a frozen point cloud of particles scattered in 3D space with known instantaneous velocities and accelerations. The output is a space-filling velocity field and a pressure field, both represented with 3rd-order B-spline basis functions, that fits the provided measurement.
The data assimilation approach distinguishes FlowFit from simple interpolation schemes, as the measurement values and the underlying physical constraints (prescribed by incompressible Navier-Stokes equations) are both exploited. This is key to a high-quality velocity and pressure field reconstruction. Therefore, we use the term reconstruction throughout the paper to distinguish FlowFit from simpler interpolation approaches that come without the application of physical constraints. FlowFit relies on cubic 3D B-splines as interpolation functions for three cartesian velocity components and pressure. Therefore, the output of the reconstruction is a continuous field represented by coefficient grids for velocity components and pressure rather than a grid of velocity and pressure values. B-splines are piecewise polynomials with compact support. They were first introduced by Schoenberg and have been a standard tool in numerical calculations since the work of [44] and [45]. FlowFit was designed to solve the inverse problem/ optimization process relying on minimization of a cost function. This cost function punishes deviations from measured particle velocities and accelerations as well as deviations from the incompressible Navier-Stokes equations in spatial gaps between particles. The constraints are u(x p ) ! = u p , a(x p ) = −∇p + µ ρ ∆u ! = a p (A1) at the particle locations x p , with particle velocities u p , particle accelerations a p , pressure p, viscosity µ and density ρ. On all B-spline gridpoints, i.e. particularly in the gaps between particles, the constraints read The cost function contains a weighted combination of the squared deviations from the constraints (A1) and (A2). Additionally, FlowFit includes a regularization to make the reconstruction problem uniquely solvable and to dampen measurement noise. This is achieved by penalization of the approximate curvature of the scalar velocity functions in order to favor low wavenumber contributions to the spectrum, where most of the energy is concentrated. It can be shown by dimensional analysis and spectral arguments that this regularization will be approximately consistent with the −5/3 Kolomogorov scaling of the energy spectrum in the inertial range. The detailed procedure works as follows: FlowFit calculates a lowpass-filtered version of the velocity field B-spline coefficients. From the difference of this and the coefficients, a highpass-filtered version is obtained. The L2 norm of this highpass-filtered version (that is an estimator for the curvature) is penalized. The weighting of these five terms in the cost function is crucial to the reconstruction process. The acceleration of the tracers is usually noisier and takes greater values, which can be accounted for by the introduction of an error ratio R between these terms. This ratio is estimated during the filtering process of the raw tracks. The weighting of the constraints (A2) against (A1) is more problematic, since the measurement will always be erroneous, but Navier-Stokes and conservation of mass must theoretically be fulfilled exactly in the absence of external forces. For a discussion of this problem and exact weighting coefficients see [7]. The non-linear least squares problem is solved with a limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) solver [7,46].