Quantum sensing networks for the estimation of linear functions

The theoretical framework for networked quantum sensing has been developed to a great extent in the past few years, but there are still a number of open questions. Among these, a problem of great significance, both fundamentally and for constructing efficient sensing networks, is the role of inter-sensor correlations in the simultaneous estimation of multiple linear functions, where the latter are taken over a collection of local parameters and can thus be seen as global properties. In this work we provide a solution to this problem when each node is a qubit and the state of the network is sensor-symmetric. First we derive a general expression linking the amount of inter-sensor correlations and the geometry of the vectors associated with the functions, such that the asymptotic error is optimal. Using this we show that if the vectors are clustered around two special subspaces, then the optimum is achieved when the correlation strength approaches its extreme values, while there is a monotonic transition between such extremes for any other geometry. Furthermore, we demonstrate that entanglement can be detrimental for estimating non-trivial global properties, and that sometimes it is in fact irrelevant. Finally, we perform a non-asymptotic analysis of these results using a Bayesian approach, finding that the amount of correlations needed to enhance the precision crucially depends on the number of measurement data. Our results will serve as a basis to investigate how to harness correlations in networks of quantum sensors operating both in and out of the asymptotic regime.


Introduction
arXiv:2003.04867v1 [quant-ph] 10 Mar 2020

An important task in quantum information science is to devise protocols for multiparameter metrology and estimation by exploiting the quantum properties of light and matter. This problem has been widely explored not only theoretically, but also in applications [9,15,16,23–39] and experiments [27,40–42]. As a result, new practical ways of enhancing our estimation schemes have recently emerged [43–47]. These protocols are normally formulated on the basis of d unknown parameters θ = (θ_1, ..., θ_d) that arise naturally in the description of the system at hand, and in many cases these are the quantities of interest. However, sometimes we may wish or need to find l new quantities that are functions of θ, that is, f(θ) = (f_1(θ), ..., f_l(θ)). This is the case, in particular, when we analyse global properties in a quantum sensing network [32,33], which is a model for spatially distributed sensing [46] and the main focus of this work. Indeed, in [32,33] this model is defined as an array of quantum sensors where one or several parameters are locally encoded in each of them; while a property of the network is said to be local if it is represented by parameters at a single sensor, a global property is a non-trivial function of two or more parameters at different sensors. Here we consider that a single parameter θ_i is encoded in the i-th sensor, so that θ is a collection of local properties, and we assume that both parameters and functions are real-valued quantities. See figure 1 for a schematic representation.
Networked scenarios where global properties are relevant provide a natural testbed to identify the potential usefulness of entanglement in a broad range of multi-parameter schemes [32,37]. Within this context, the optimal estimation of a single function f(θ) has been extensively studied [32,33,37,46,48–57], and it has been established that one can find entangled states that beat the best separable probe when that function is linear [32,33]. In addition, Eldredge et al. [48] derived a bound on the error for this scenario that was later generalised to accommodate a single analytical function [51], which can also be estimated with an enhanced precision when there is entanglement, while Gross and Caves [58] have re-examined the linear case using an elegant geometric approach. On the opposite extreme, it has been shown that a collection of l = d linear functions that generates an orthogonal transformation (i.e., f(θ) = V^⊤θ with V^⊤V = V V^⊤ = I) can be estimated optimally with a local strategy [32,37].
Beyond these two types of global properties, the simultaneous estimation of l > 1 linear but otherwise arbitrary real functions has been a less travelled path. There exist generic bounds for this problem (see, e.g., [32,59]), which in practice may arise in scenarios such as the estimation of phase differences [29,59]. However, how quantum correlations may help for linear functions with arbitrary geometry has not been examined in detail. Given that this represents a richer regime than the cases l = 1 and l = d with orthogonal functions, it can be argued that answering this question is essential for further progress in networked quantum metrology.
While a general answer is beyond the scope of our methods, here we obtain a definite solution for a subclass of schemes with sensor-symmetric pure qubit states, which we introduce in section 2.1. Using the Helstrom Cramér-Rao bound and the associated quantum Fisher information matrix, in section 3 we derive a general expression linking the geometry of the vectors associated with the functions and the strength of the inter-sensor correlations, such that the uncertainty in the asymptotic regime of many trials is optimal. Moreover, we show that there exists a physical state for many of the optimal configurations that our formula predicts. Equipped with this, we then derive a number of important results. First we find that the largest amounts of correlations are associated, for sensor-symmetric states, with two special subspaces: the direction of the vector of ones 1 ≡ (1, 1, ...), and the subspace orthogonal to it. This connection between entanglement in a pure state and how much the vectors are clustered around certain directions was precisely one of the open questions identified in [32], and our findings contribute towards its solution. In addition, we demonstrate that entanglement can be detrimental for estimating global properties other than those associated with orthogonal transformations, while a three-sensor network reveals that entanglement is sometimes irrelevant. This is consistent with the fact that the asymptotic uncertainty only depends on correlations of a pairwise nature, and thus other forms of entanglement do not affect the asymptotic error.

Figure 1. A network of d = 5 sensors. The parameters θ = (θ_1, ..., θ_5) represent local properties, since each of them is locally encoded in a single sensor. On the contrary, f_1(θ_1, θ_3) and f_2(θ_2, θ_4, θ_5) are global properties associated with sensors 1 and 3 (green solid lines) and sensors 2, 4 and 5 (purple dashed lines), respectively.
On the other hand, it is known that strategies with a good asymptotic precision found by optimising the Cramér-Rao bound sometimes have a particularly poor performance when the number of trials is very low (see, e.g., [60]). In fact, there is compelling evidence of the existence of a potential trade-off between the performances in the asymptotic and non-asymptotic regimes [61]. In view of this, a non-asymptotic analysis of our findings for sensing networks is in order. To do so, in section 2.2 we propose a multi-parameter Bayesian procedure that generalises its single-parameter counterpart in [60], and in section 4 we utilise it to examine the non-asymptotic properties of some of our results in section 3. Our central insight here is that trading a part of the asymptotic enhancement is sometimes associated with an improved performance in the non-asymptotic regime also in networked quantum metrology, and in general we find that the amount of correlations needed to enhance the precision crucially depends on the amount of data that has been collected. Due to the more complex (and often numerical) nature of Bayesian calculations, this study is restricted to the d = 2 case, although in section 5 we discuss some potential directions to overcome this limitation. To the best of our knowledge, this work, together with [16,53], constitutes one of the first Bayesian studies of a network of quantum sensors in this context.
Our approach to the simultaneous estimation of linear functions in a scheme for distributed quantum sensing will serve as a basis to investigate how to harness correlations in multi-parameter schemes, operating both in and out of the asymptotic regime. Since the construction of entangled networks is likely to be difficult in practice, these insights may prove to be crucial in the study and implementation of quantum sensing networks that operate with a realistic amount of data.

Physical scheme and available information
Consider a network of d qubit sensors prepared in some initial pure state ρ_0 = |ψ_0⟩⟨ψ_0|, with basis elements ⟨0|_j = (1, 0) and ⟨1|_j = (0, 1) for the j-th sensor. In addition, suppose we encode d local parameters θ = (θ_1, ..., θ_d), one per sensor, as ρ(θ) = e^{−iK·θ} ρ_0 e^{iK·θ}, where K = (K_1, ..., K_d) and each generator has the form

K_i = σ_z,i / 2,

with σ_z,i the Pauli z operator acting on the i-th qubit. This is an instance of the type of unitary encoding that arises in spatially distributed sensing [32,33], and while the encoding itself is separable, i.e.,

e^{−iK·θ} = ⊗_{j=1}^d e^{−i θ_j σ_z,j / 2},

in principle we allow for entangled pure states and any general measurement acting on all the sensors at once. When the state and the measurement present no quantum correlations, we say that the scheme implements a local strategy; otherwise we have a global strategy. We also note that

[K_i, K_j] = 0 for all i, j,    (5)

which is a useful feature of this system because it will allow us to saturate the asymptotic bound in section 2.2.
To introduce the subclass of sensor-symmetric states that we will exploit, first we recall that the strength of correlations between any pair of sensors, which we call inter-sensor correlations, may be quantified as [29,32]

J_ij = (⟨K_i K_j⟩ − ⟨K_i⟩⟨K_j⟩)/(ΔK_i ΔK_j)    (6)

for i ≠ j, where ΔK_i² = ⟨K_i²⟩ − ⟨K_i⟩² and we use the notation ⟨·⟩ ≡ ⟨ψ_0|·|ψ_0⟩. Furthermore, J_ij in equation (6) is bounded as −1 ≤ J_ij ≤ 1. Using this quantifier, we define sensor-symmetric states as those satisfying

ΔK_i² = v and ⟨K_i K_j⟩ − ⟨K_i⟩⟨K_j⟩ = c    (7)

for all i and all j ≠ i, where c and v are fixed values that characterise the preparation of the network and the encoding of the parameters. In turn, equation (6) becomes J_ij = J = c/v, also for all i ≠ j, and for our qubit model we see that

4v = 1 − ⟨σ_z,i⟩²,    (8)

where 0 ≤ 4v ≤ 1 due to the fact that the eigenvalues of σ_z are ±1 and thus |⟨σ_z,i⟩| ≤ 1. This definition in terms of the conditions in equation (7) is a way of generalising the notion of path-symmetric states in optical interferometry [29,62,63], and it motivates our choice of initial probe.

The final piece required before we can formulate the estimation problem of interest is to establish what prior information is available. The properties of the network that we wish to estimate are those that can be modelled linearly as

f(θ) = V^⊤θ + a,    (9)

where V is a (d × l) matrix and a is a column vector with l components. We consider that the form of these functions is known, and so there is no uncertainty associated with the matrix V or the vector a. Furthermore, we assume that the unknown parameters θ can be initially thought of as independent in the statistical sense, such that there are no prior correlations between them, and we suppose that the magnitude of the i-th parameter can be found somewhere within an interval of width W_0,i centred around θ̄_i, which is a moderate amount of prior knowledge [45,61,64]. This state of information can be represented by the separable prior probability

p(θ) = ∏_{i=1}^d 1/W_0,i when |θ_i − θ̄_i| ≤ W_0,i/2 for all i,    (10)

and zero otherwise. Equivalently, equation (10) may also be written as p(θ) = 1/Δ_0, with hypervolume Δ_0 = ∏_{i=1}^d W_0,i centred around θ̄ = (θ̄_1, ..., θ̄_d). The interested reader will find in Appendix A a way of justifying this prior from the perspective of the so-called objective version of the Bayesian framework.
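As a quick numerical illustration of the correlation quantifier in equation (6), the following sketch (illustrative code, not part of the paper) evaluates J for a two-qubit network with generators K_i = σ_z,i/2; a maximally entangled probe saturates the extreme value J = 1, while a product state gives J = 0.

```python
import numpy as np

# Illustrative check of the correlation quantifier J_ij (equation (6)) for a
# two-qubit probe with generators K_i = sigma_z,i / 2. Example states only.
sz = np.diag([1.0, -1.0])
I2 = np.eye(2)
K1 = np.kron(sz, I2) / 2          # acts on sensor 1
K2 = np.kron(I2, sz) / 2          # acts on sensor 2

def correlation_strength(psi, KA, KB):
    """J = cov(KA, KB) / (Delta KA * Delta KB) for a pure state psi."""
    mean = lambda O: np.vdot(psi, O @ psi).real
    cov = mean(KA @ KB) - mean(KA) * mean(KB)
    var = lambda O: mean(O @ O) - mean(O) ** 2
    return cov / np.sqrt(var(KA) * var(KB))

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
plus = np.ones(4) / 2                        # |+>|+>, a product state
print(round(correlation_strength(bell, K1, K2), 10))   # -> 1.0
print(round(correlation_strength(plus, K1, K2), 10))   # -> 0.0
```

Both probes have ⟨σ_z,i⟩ = 0, so they also satisfy 4v = 1 in equation (8).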

Estimation method: a hybrid approach
Starting with the transformed network state ρ(θ) in section 2.1, the next step is to consider μ identical and independent measurements on this system, which we see as trials or repetitions. In particular, the i-th measurement is represented by a POVM E(m_i) with outcome m_i, and the probability of this process generating the outcomes m = (m_1, ..., m_μ) is given by the likelihood function

p(m|θ) = ∏_{i=1}^μ p(m_i|θ), with p(m_i|θ) = Tr[E(m_i) ρ(θ)].    (11)

Since the form of the functions f(·) has been assumed to be known, it is appropriate to construct their estimators as

f̃(m) = V^⊤ θ̃(m) + a,    (12)

where θ̃(m) = (θ̃_1(m), ..., θ̃_d(m)) are the estimators for the parameters θ, and we evaluate the uncertainty of our estimates f̃(m) as

ε̄_mse = ∫ dθ dm p(θ) p(m|θ) [f̃(m) − f(θ)]^⊤ W [f̃(m) − f(θ)],    (13)

where p(θ) is the prior, W = diag(w_1, ..., w_l) is a weighting matrix, w_i ≥ 0 represents the relative importance of estimating the i-th function, and Tr(W) = 1. Importantly, although a square error is generally not suitable for quantities associated with topologies other than that of the real line, it can still be a good approximation to the uncertainty for other topologies when the prior knowledge about θ is moderate or high (see, e.g., [45,47,60,61,65,66]), which is our case. By using equations (10)–(12) and the network configuration in section 2.1, equation (13) becomes

ε̄_mse = (1/Δ_0) ∫_{Δ_0} dθ Σ_m p(m|θ) Σ_{i=1}^l w_i [f̃_i(m) − f_i(θ)]²    (14)

for our system. We note that this error does not depend on a, so that we can set a = 0 without loss of generality. Hence, from now on the functions are f(θ) = V^⊤θ and the coefficients are encoded in the columns of V. Ideally, we would like to minimise the error in equation (14) with respect to the estimators θ̃(m), the measurement scheme E(m_i) and the initial sensor-symmetric state ρ_0, so that we can find the optimal configuration of the network and study its properties. Since, in general, this is a very challenging problem, in this work we follow an approximate procedure that combines asymptotic and non-asymptotic optimisations.
We now describe this hybrid approach and how to use it for our analysis of sensing networks (a discussion of other methods in the literature can be found in Appendix B).
On the one hand, equation (14) can be minimised with respect to θ̃(m) in a straightforward way (e.g., using calculus of variations; see [16,67]). This provides the familiar result that

θ̃(m) = ∫ dθ p(θ|m) θ    (15)

are the optimal estimators [67,68], where p(θ|m) = p(m|θ)/[Δ_0 p(m)] is the posterior probability and p(m) = ∫ dθ p(m|θ)/Δ_0. As a consequence, inserting equation (15) in equation (14) we have that

ε̄_opt = Σ_m p(m) Σ_{i=1}^l w_i { ∫ dθ p(θ|m) f_i(θ)² − [∫ dθ p(θ|m) f_i(θ)]² },    (16)

where f_i(θ) = Σ_{j=1}^d V_ji θ_j. This is the optimal uncertainty based on the probabilities that emerge from the measurements in a given quantum strategy (E(m_i) plus ρ_0), and it is valid and exact for any number of trials μ.
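The posterior-mean estimator of equation (15) is easy to evaluate on a numerical grid. The following minimal sketch uses a hypothetical single-parameter example, with an illustrative likelihood p(0|θ) = cos²(θ/2) and a flat prior of width W_0 = 2, rather than the full network model; all numbers are test values.

```python
import numpy as np

# Sketch of the posterior-mean (optimal Bayesian) estimator of equation (15),
# for a toy single-qubit likelihood p(0|theta) = cos^2(theta/2) and a flat
# prior on [-W0/2, W0/2]. Illustrative scenario, not the network model itself.
rng = np.random.default_rng(1)
theta_true, W0 = 0.4, 2.0
grid = np.linspace(-W0 / 2, W0 / 2, 2000)    # even count: avoids theta = 0 exactly

def posterior_mean(outcomes):
    p0 = np.cos(grid / 2) ** 2               # p(m = 0 | theta)
    log_like = np.zeros_like(grid)
    for m in outcomes:
        log_like += np.log(p0 if m == 0 else 1 - p0)
    post = np.exp(log_like - log_like.max()) # posterior, up to normalisation
    post /= post.sum()
    return float((grid * post).sum())        # theta_tilde = E[theta | m]

m = rng.binomial(1, np.sin(theta_true / 2) ** 2, size=200)   # mu = 200 trials
theta_tilde = posterior_mean(m)
print(theta_tilde)
```

With μ = 200 simulated outcomes the estimate typically lands close to θ_true; the same grid-based construction extends to d parameters at the cost of a d-dimensional grid, which is why the Bayesian analysis of section 4 is restricted to d = 2.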
On the other hand, we may select the quantum strategy such that it is optimal in the asymptotic regime of many trials, where μ ≫ 1. First we recall that, if the true values θ lie within the prior hypervolume Δ_0, and the likelihood p(m|θ), which we assume to be sufficiently regular, becomes concentrated around θ as μ grows, then the posterior probability p(θ|m) can be approximated as a multivariate Gaussian density and the uncertainty ε̄_opt in equation (16) satisfies [67,69,70]

ε̄_opt ≈ Tr[W V^⊤ F(θ)^{−1} V]/μ ≡ ε̄_asym,    (17)

where

F(θ) = Σ_m p(m|θ) [∂ log p(m|θ)/∂θ] [∂ log p(m|θ)/∂θ]^⊤    (18)

is the Fisher information matrix for a single trial with outcome m (for a derivation of this approximation, see, e.g., [67,69,70] and section 6.2.2 of [45], and [8,71,72] for a rigorous treatment). At the same time, given that the form of the unitary encoding is exp(−iK·θ) and the state ρ_0 = |ψ_0⟩⟨ψ_0| is pure, the Helstrom Cramér-Rao bound establishes the matrix inequality [43,44,46,47]

F(θ) ≤ F_q, with [F_q]_ij = 4(⟨K_i K_j⟩ − ⟨K_i⟩⟨K_j⟩),    (19)

F_q being the quantum counterpart of the information matrix. Then, the combination of equations (16), (17) and (19) implies that, in the asymptotic regime,

ε̄_opt ≈ ε̄_asym ≥ Tr(W V^⊤ F_q^{−1} V)/μ ≡ ε̄_cr.    (20)

The quantum Cramér-Rao bound ε̄_cr in equation (20) is a function of ρ_0 only, since K, V, W and μ are fixed, and it does not depend on the measurement. As such, if we choose the POVM E(m_i) for the i-th repetition such that ε̄_asym = ε̄_cr, then that measurement will be asymptotically optimal. It can be shown that a measurement such that F(θ) = F_q (and thus ε̄_asym = ε̄_cr) always exists when the generators K commute with each other [12,13], and equation (5) demonstrates that this is indeed satisfied by our qubit network. Hence, we will use this criterion to construct the POVM. Regarding the optimisation of the state, we will proceed by first calculating ε̄_cr as a function of the properties that characterise the sensor-symmetric state ρ_0, which, as we will see, are the variance v and the correlation strength J, and then minimising the resulting bound with respect to the pair (v, J).
Once we know the optimal estimators and the asymptotically optimal state and measurement as prescribed above, we can complete the estimation by inserting these in the Bayesian uncertainty for µ repetitions in equation (14), which here will be calculated numerically with the algorithm in section 6.2.3 of [45] (the reader interested in reproducing our numerical results will find the associated MATLAB code in Appendix C of the same work).
It is important to realise that our approach can fail when the asymptotic approximation is not valid. This could happen, for example, if the prior information provided within the hypervolume ∆ 0 is not sufficient to distinguish a single point [60,67], or if the Fisher information matrix (classical or quantum) is singular. Therefore, we will concern ourselves with schemes where the information matrix is invertible, and, once we have found the asymptotically optimal quantum strategy, we will also check that the likelihood p(m|θ) associated with it does not present ambiguities in the relevant portion of the parameter space. Nevertheless, note that, in general, a potentially ambiguous likelihood function or a singular F (θ) do not introduce any fundamental difficulty for Bayesian estimation itself (this will be demonstrated in section 4 with an example).
In summary, the estimation method that emerges from the previous discussion requires that we: (i) calculate the quantum Cramér-Rao bound ε̄_cr and find the sensor-symmetric state that makes it minimal, (ii) search for a POVM such that ε̄_asym = ε̄_cr, (iii) verify that the quantum strategy (state plus POVM) allows for unambiguous estimation given the prior information represented in equation (10), (iv) calculate the optimal estimators for the linear functions in equation (21), and (v) calculate the μ-trial Bayesian uncertainty in equation (14).
While the protocols constructed in this way may not be optimal for low µ, [60] demonstrated that this technique can provide important information about the nonasymptotic regime in optical interferometry, and here we will show that this is also true for networked quantum sensing. Moreover, a very useful feature of our approach is that the analysis of the role of inter-sensor correlations emerging from (i, ii) will be relevant for researchers interested only in the Cramér-Rao bound, while those that also require an analysis based on a finite number of repetitions will benefit from the insights arising from (iii -v). The next section is dedicated to the former.

Estimation of arbitrary linear functions
Our first step is to examine the quantum strategies that are optimal in the regime where the square error ε̄_mse converges to the quantum Cramér-Rao bound ε̄_cr = Tr(W V^⊤ F_q^{−1} V)/μ as μ grows. If we denote by {e_i} the basis vectors of the real space where W, V and F_q are defined, with e_i^⊤ e_j = δ_ij, then from equations (8) and (19) we have that

F_q = 4v[(1 − J) I + J 11^⊤],    (22)

where 11^⊤ is the (d × d) matrix of ones and I the (d × d) identity matrix. This is the quantum Fisher information matrix for sensor-symmetric states.
To invert F_q, we need to impose the condition of positive definiteness, which is equivalent to requiring that its eigenvalues are strictly positive. Recalling that 1 = (1, 1, ...)^⊤ is the vector of ones, so that 11^⊤ is the matrix of ones, the characteristic equation for the eigenvalues {λ} is

det{4v[(1 − J) I + J 11^⊤] − λ I} = 0,    (23)

which upon using the identity det(X + y z^⊤) = det(X)(1 + z^⊤ X^{−1} y) becomes

[4v(1 − J) − λ]^{d−1} {4v[1 + (d − 1)J] − λ} = 0.    (24)

As a result, the eigenvalues of F_q are λ_1 = 4v[1 + (d − 1)J], with multiplicity 1, and λ_2 = 4v(1 − J), with multiplicity d − 1, and by imposing that they are positive we conclude that F_q is invertible when 1/(1 − d) < J < 1. The rest of our calculations assume that J lies in this open interval.
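A minimal numerical check of this spectrum, assuming the sensor-symmetric form F_q = 4v[(1 − J)I + J 11^⊤]; d, v and J below are arbitrary test values.

```python
import numpy as np

# Check that F_q = 4v[(1 - J)I + J 11^T] has eigenvalue 4v[1 + (d-1)J] once
# (eigenvector along 1) and 4v(1 - J) with multiplicity d - 1. Test values only.
d, v, J = 5, 0.25, 0.3
Fq = 4 * v * ((1 - J) * np.eye(d) + J * np.ones((d, d)))
eig = np.sort(np.linalg.eigvalsh(Fq))
lam1 = 4 * v * (1 + (d - 1) * J)   # multiplicity 1
lam2 = 4 * v * (1 - J)             # multiplicity d - 1
print(np.allclose(eig, [lam2] * (d - 1) + [lam1]))   # -> True
```

For J > 0 the eigenvalue along 1 is the largest, which is why probes correlated along the sum direction favour functions clustered around 1.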
We can now calculate the inverse of F_q in equation (22), which is [32]

F_q^{−1} = [1/(4v(1 − J))] { I − J 11^⊤/[1 + (d − 1)J] }.    (25)

Utilising this result we find that the asymptotic uncertainty for the estimation of linear functions is given by

ε̄_cr = { [1 + (d − 2)J] Tr(W V^⊤ V) − J Tr(W V^⊤ X V) } / { 4μv(1 − J)[1 + (d − 1)J] },    (26)

where we have introduced the (d × d) matrix X ≡ 11^⊤ − I to separate the contribution to the uncertainty due to the diagonal elements of F_q^{−1}, which are the errors for each of the parameters, from that of the rest of the matrix.
The expression in equation (26) shows that the uncertainty depends on three types of quantities: (i) the number of repetitions μ and the number of parameters d, (ii) the combined properties of state and generators through the correlation strength J and the variance v, and (iii) two quantities, Tr(W V^⊤ V) and Tr(W V^⊤ X V), that are defined in terms of the functions encoded in V and the weighting matrix W. The next step is to investigate the physical meaning of the two quantities in (iii).
By relabelling the vector formed by the components of the j-th linear function as f_j, i.e., the j-th column of V, we can rewrite the first quantity in the more suggestive form

Tr(W V^⊤ V) = Σ_{j=1}^l w_j |f_j|²,    (27)

where the norm in the last term is defined as |v|² = Σ_k v_k² for a real vector v. This is the weighted sum of the squared magnitudes of the vectors associated with the linear functions. Since V W V^⊤ is positive semi-definite, and excluding the degenerate case where all the coefficients vanish, we have that Tr(W V^⊤ V) = Tr(V W V^⊤) > 0. In addition, when the functions are normalised, that is, |f_i| = 1 for 1 ≤ i ≤ l, and recalling that Tr(W) = Σ_{i=1}^l w_i = 1, we have that Tr(W V^⊤ V) = 1. Hence, we define the normalisation term

N ≡ Tr(W V^⊤ V),    (28)

satisfying N > 0, with N = 1 for normalised linear functions.
As for the second quantity, we can rewrite it as

Tr(W V^⊤ X V) = Σ_{j=1}^l w_j |f_j|² [d cos²(φ_{1,j}) − 1],    (29)

where φ_{1,j} is the angle between the vector associated with the j-th function and the direction defined by the vector of ones 1, and we have used the fact that |1| = √d. Recalling that |cos(φ_{1,j})| ≤ 1 and using equation (29), we see that

−Tr(W V^⊤ V) ≤ Tr(W V^⊤ X V) ≤ (d − 1) Tr(W V^⊤ V),    (30)

and that the extremes are realised when either the functions are aligned with the direction of the vector of ones 1, or they lie in the subspace orthogonal to it, which has dimension (d − 1). So, for sensor-symmetric networks with properties modelled by linear functions, there are two kinds of global properties that play a special role: the sum of all the natural parameters with equal weights, and any linear combination of them such that the sum of its coefficients vanishes. Any other set of global properties will produce some value for Tr(W V^⊤ X V) lying within the interval in equation (30), and this will be given by the geometry of the transformation defined by V W V^⊤. This motivates the introduction of the geometry parameter

G ≡ Tr(W V^⊤ X V)/N.    (31)

Inserting equations (28) and (31) in equation (26), the asymptotic uncertainty finally becomes

ε̄_cr = N h(J, G, d)/(4μv),    (32)

where

h(J, G, d) = [1 + (d − 2 − G)J] / {(1 − J)[1 + (d − 1)J]}.    (33)

Given a sensor-symmetric network with d local properties, the factor h(J, G, d) in equation (33) codifies the interplay between the inter-sensor correlations of strength J and the geometry parameter G for any linear property, which may be local or global. A representation of this interplay can be found in figure 2. The formulas in equations (32) and (33) have been obtained without imposing further restrictions on the functions, and this implies that this formalism can be applied to any number of linear functions whose coefficients generate vectors that can form any angle and have any length.
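The identity ε̄_cr = N h(J, G, d)/(4μv) can be verified numerically against a direct evaluation of Tr(W V^⊤ F_q^{−1} V)/μ; the sketch below uses the definitions N = Tr(W V^⊤ V), G = Tr(W V^⊤ X V)/N and X = 11^⊤ − I as stated here, with arbitrary random test functions.

```python
import numpy as np

# Numerical check that Tr(W V^T F_q^{-1} V)/mu equals N h(J, G, d)/(4 mu v),
# with h = [1 + (d-2-G)J] / [(1-J)(1 + (d-1)J)]. All numbers are test values.
rng = np.random.default_rng(0)
d, l, v, J, mu = 4, 2, 0.25, -0.2, 100       # J within (1/(1-d), 1)
V = rng.standard_normal((d, l))              # columns = function coefficients
w = rng.random(l); W = np.diag(w / w.sum())  # weights with Tr(W) = 1
X = np.ones((d, d)) - np.eye(d)
Fq = 4 * v * ((1 - J) * np.eye(d) + J * np.ones((d, d)))

lhs = np.trace(W @ V.T @ np.linalg.inv(Fq) @ V) / mu
N = np.trace(W @ V.T @ V)
G = np.trace(W @ V.T @ X @ V) / N
h = (1 + (d - 2 - G) * J) / ((1 - J) * (1 + (d - 1) * J))
print(np.isclose(lhs, N * h / (4 * mu * v)))   # -> True
```

The same run also confirms that G falls inside the interval −1 ≤ G ≤ d − 1 of equation (30) for randomly drawn functions.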

The role of inter-sensor correlations I
Let us exploit the previous result to address the problem of selecting a sensor-symmetric network state that is optimal for estimating a given set of linear functions. This amounts to finding the values for v and J that are optimal for a given G. One approach is to use the fact that, for qubits, 0 ≤ 4v ≤ 1, which allows us to lower bound equation (32) as

ε̄_cr ≥ N h(J, G, d)/μ ≡ ε̄_f.    (34)

We then search for the J that minimises this bound after having fixed G, d and μ. In principle, there is no guarantee that the pairs of values (4v = 1, J) generated by this method will correspond to any physical state, although the bounds on the asymptotic error constructed in this way would still be valid. Nevertheless, later we will study an example that realises a large portion of the pairs (4v = 1, J) that we will predict. By minimising ε̄_f (see Appendix C) we find that, if 4v = 1, and restricting our attention to the range 1/(1 − d) < J < 1 where the information matrix is invertible, the optimal strength for the inter-sensor correlations of the network is

J_opt = G / {(d − 1)[1 + √(1 + (d − 2 − G)G/(d − 1))]}    (35)

for −1 < G < d − 1, which is determined by the structure of the functions alone via G (once d has been fixed). This provides a map between correlation strength and geometry with one-to-one correspondence (note that J_opt increases monotonically with G), as is illustrated in figure 3, and this is the central result of our asymptotic analysis. The expression in equation (35) reveals that, the more a collection of functions is clustered around the vector of ones 1, the larger the amount of positive correlations is required to be in order to perform the estimation optimally (provided that 4v = 1). Similarly, the amount of correlations with negative strength needs to be large if the functions are instead clustered around the subspace orthogonal to 1. The potential existence of this type of connection between geometry and quantum correlations was precisely one of the general open questions identified in [32].

Figure 3. Optimal inter-sensor correlation strength J_opt versus the geometry G of a set of arbitrary linear functions, for d = 2, 3, 5 and 10 parameters (lines (a – d) respectively). These monotonic curves provide a quantitative representation for the uncertainty minima identified in figure 2, and the associated analytical formula is in equation (35). This result shows that, the more a collection of functions is clustered around the direction of 1, so that G = d − 1, the larger the amount of correlations must be in order to perform the estimation optimally (provided that 4v = 1), while the opposite is true if the functions are instead clustered around the subspace orthogonal to 1, for which G = −1. Remarkably, any amount of correlations is detrimental when G = 0, even though a vanishing geometry parameter can also be obtained for properties of the network that are global.
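As a consistency check, the closed-form optimum can be compared against a brute-force minimisation of h(J, G, d) = [1 + (d − 2 − G)J]/{(1 − J)[1 + (d − 1)J]} over the invertibility range 1/(1 − d) < J < 1; the expression for J_opt used below is our reconstruction of equation (35), and all numbers are test values.

```python
import numpy as np

# Cross-check of the optimal correlation strength: closed form versus a
# brute-force scan of h(J, G, d) over the invertibility range. Test values only.
def h(J, G, d):
    return (1 + (d - 2 - G) * J) / ((1 - J) * (1 + (d - 1) * J))

def J_opt(G, d):
    return G / ((d - 1) * (1 + np.sqrt(1 + (d - 2 - G) * G / (d - 1))))

results = []
for d, G in [(3, 1.2), (5, -0.7), (10, 4.0)]:
    Js = np.linspace(1 / (1 - d) + 1e-4, 1 - 1e-4, 200001)
    J_num = Js[np.argmin(h(Js, G, d))]       # numerical minimiser of the bound
    results.append(np.isclose(J_num, J_opt(G, d), atol=1e-4))
print(all(results))   # -> True
```

Note that the argument of the square root, 1 + (d − 2 − G)G/(d − 1) = −(G − d + 1)(G + 1)/(d − 1), is non-negative precisely on −1 ≤ G ≤ d − 1, vanishing at the endpoints where J_opt reaches its extreme values 1/(1 − d) and 1.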
Furthermore, equation (35) (and figure 3) shows that any non-zero pairwise correlation strength is detrimental whenever the geometry parameter vanishes. It is therefore interesting to investigate which kind of linear functions imply that G = 0, as well as the form of the associated optimal strategy. To achieve this, let us recall the original definition of G in equation (31), that is, G = Tr(W V^⊤ X V)/N. If we choose the uniform weighting matrix W = I/l, and if V is an orthogonal transformation (i.e., V^⊤ V = V V^⊤ = I, with l = d), then

G = Tr(V^⊤ X V)/(l N) = Tr(X)/d = 0.    (36)

Now we observe that J = 0, which is the optimal choice for the previous scenario, is always achieved by a separable qubit state

|ψ_0⟩ = ⊗_{i=1}^d [√(a_i) |0⟩ + √(1 − a_i) |1⟩],    (37)

and by selecting a_i = 1/2 for all i we have that 4v = 1. Thus we can say that the estimation of a set of l = d linear functions that are equally relevant and orthogonal can be carried out optimally by preparing our scheme with separable states. Moreover, since the estimation of the parameters θ is equivalent to choosing V = I, our result implies that separable states are also optimal in that case. So, our present formalism is consistent with previous results [32,33,37,73].

The above conclusion is sufficient to affirm that, while entangled pure states are generally useful for the optimal estimation of global properties, it is not true that we always need entangled probes in such a case. However, a transformation that is orthogonal preserves angles and lengths, and thus one may argue that, in a sense, the information encoded by a set of functions that gives rise to an orthogonal transformation is equivalent to the information content of the original parameters, provided that the weighting matrices are uniform. Hence, it is perhaps not surprising that a local estimation strategy is preferred here, since [32,33] had already shown that the estimation of local properties associated with commuting generators can be performed optimally with a local scheme. In view of this, it is important to establish whether there are other global properties with G = 0 that instead select information that is not equivalent to estimating all the original parameters. First we observe that the eigendecomposition of X, which is a symmetric matrix, is (see Appendix D)

X = U_X D_X U_X^⊤, with D_X = diag(d − 1, −1, ..., −1),    (38)

where the eigenvector for the first eigenvalue is proportional to 1 and those for the other eigenvalues belong to the orthogonal subspace. That implies that, if we choose a single linear function as V = f = U_X 1, then we will have that G = 1^⊤ U_X^⊤ X U_X 1/N = 1^⊤ D_X 1/N = 0. Now consider a three-parameter network, so that f = U_X 1 generically has non-zero components along all three local parameters. Clearly, this gives rise to a global property, as these are the coefficients of a non-trivial function of three local parameters.
Yet G = 0, and so, according to equation (35), pairwise correlations are detrimental. Therefore, entanglement is sometimes not needed in scenarios where we are estimating non-trivial global properties. Interestingly, the same argument fails for d = 2, since in that case f = U_X 1 = (√2, 0)^⊤, and this is associated with a local property because it simply rescales the first parameter. Nonetheless, our conclusion above is still valid in general.

For the link between geometry and correlations in equation (35) to be truly relevant, it is necessary that there are physical states with the properties that such a link predicts as optimal. In [32] we studied the estimation of 1 ≤ l ≤ d = 2 linear and normalised but otherwise arbitrary functions using the sensor-symmetric state

|ψ_0⟩ = [|00⟩ + γ(|01⟩ + |10⟩) + |11⟩]/√(2(1 + γ²)),    (40)

with −∞ < γ < ∞, and we provided a complete solution to this two-parameter estimation problem. The fact that this is a particular case of the more general formalism that we develop in this work suggests that, for the d = 2 case, it may be possible to use the state in equation (40) to realise all the pairs (4v = 1, J) that are optimal according to our results. We will now show that this is the case.
It is interesting to observe that γ splits the state into a part where the sum of the parameters is encoded and a part that encodes the difference. More concretely,

e^{−(i/2)(σ_z,1 θ_1 + σ_z,2 θ_2)} |ψ_0⟩ = [e^{−i(θ_1+θ_2)/2}|00⟩ + e^{i(θ_1+θ_2)/2}|11⟩ + γ(e^{−i(θ_1−θ_2)/2}|01⟩ + e^{i(θ_1−θ_2)/2}|10⟩)]/√(2(1 + γ²)).    (41)

A partial extension of this idea to the d-parameter case can be achieved by constructing a state where the part that encodes functions aligned with the direction of 1 is isolated in an analogous fashion, i.e.,

|ψ_0⟩ = [|0⟩^{⊗d} + |1⟩^{⊗d} + γ Σ'_k |k⟩]/√(2 + (2^d − 2)γ²),    (42)

where the sum Σ'_k runs over the 2^d − 2 computational basis states other than |0⟩^{⊗d} and |1⟩^{⊗d}. For this probe, 4v_i = 1 − ⟨σ_z,i⟩² = 1 = 4v for all i, and 4c_ij = ⟨σ_z,i σ_z,j⟩ − ⟨σ_z,i⟩⟨σ_z,j⟩ = ⟨σ_z,i σ_z,j⟩ = (1 − γ²)/[1 + (2^{d−1} − 1)γ²] = 4c for all i ≠ j, which verifies that the state in equation (42) is also sensor-symmetric. As a result, we can see that its inter-sensor correlations are given by

J = 4c/(4v) = (1 − γ²)/[1 + (2^{d−1} − 1)γ²].    (43)

If 0 ≤ |γ| ≤ 1, then we have that 1 ≥ J ≥ 0. This implies that there always exists a physical state associated with all the results in this section that require either positive inter-sensor correlations or the absence of them. On the other hand, the amount of negative correlations that this state can cover lies in 0 > J > −1/(2^{d−1} − 1), which corresponds to 1 < |γ| < ∞. Unfortunately, the amount of negative correlations that equation (35) might predict can lie in 0 > J > 1/(1 − d), where 1/(1 − d) ≤ −1/(2^{d−1} − 1) for d ≥ 2 and the inequality is only saturated when d = 2. Thus there is a subinterval not covered by equation (42). Whether there are other physical states that may realise the missing values is an open question.
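The value of J in equation (43) can be checked by direct construction; the code below builds a state under our reading of equation (42), with amplitude 1 on |0...0⟩ and |1...1⟩ and γ on the remaining basis states, and compares the resulting correlation strength with the closed form. All numbers are test values.

```python
import numpy as np
from itertools import product

# Check of equation (43): the probe with amplitudes 1 on |0...0> and |1...1>
# and gamma elsewhere has J = (1 - g^2)/(1 + (2^(d-1) - 1) g^2). Test values.
def J_state(d, g):
    amps = np.full(2 ** d, g, dtype=float)
    amps[0] = amps[-1] = 1.0
    amps /= np.linalg.norm(amps)
    signs = np.array(list(product([1, -1], repeat=d)))   # sigma_z eigenvalues
    p = amps ** 2
    m1 = p @ signs[:, 0]                    # <sigma_z,1> (= <sigma_z,2>)
    m12 = p @ (signs[:, 0] * signs[:, 1])   # <sigma_z,1 sigma_z,2>
    c4, v4 = m12 - m1 ** 2, 1 - m1 ** 2     # 4c and 4v
    return c4 / v4

d, g = 4, 0.6
print(np.isclose(J_state(d, g), (1 - g**2) / (1 + (2**(d - 1) - 1) * g**2)))  # -> True
```

Taking g → ∞ in the same construction reproduces the negative-correlation floor −1/(2^{d−1} − 1) quoted in the text.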
Finally, we note that the only entangled pure probes that may be asymptotically relevant for sensor-symmetric networks are those that give rise to inter-sensor correlations, while any other form of entanglement will be irrelevant in this type of scenario. To illustrate this idea, let us consider the state in equation (42) for d = 3, and suppose that the functions to be estimated give rise to the geometry parameter G = 0. We have seen that, in that case, no inter-sensor correlations are needed to perform the estimation optimally, which implies that, according to equation (43), γ = ±1. By inserting these values in equation (42) we find that the optimal states are

|ψ_+⟩ = (1/√8) Σ_k |k⟩ = |+⟩ ⊗ |+⟩ ⊗ |+⟩    (44)

and

|ψ_−⟩ = (1/√8) [|000⟩ + |111⟩ − (|001⟩ + |010⟩ + |100⟩ + |011⟩ + |101⟩ + |110⟩)].    (45)

The first state is separable, but |ψ_−⟩ is not. More concretely, if we tried to write the latter as |ψ_−⟩ = (x_0|0⟩ + x_1|1⟩)(y_0|0⟩ + y_1|1⟩)(z_0|0⟩ + z_1|1⟩), with |x_0|² + |x_1|² = |y_0|² + |y_1|² = |z_0|² + |z_1|² = 1, we would find contradictions such as

(x_0 y_0 z_0)(x_0 y_1 z_1) = (x_0 y_0 z_1)(x_0 y_1 z_0), i.e., −1/8 = 1/8,    (46)

which by reductio ad absurdum allows us to conclude that the state with γ = −1 and d = 3 is entangled. Hence, while here entanglement is not required to reach the asymptotic optimum, neither is it necessarily detrimental. The only requirement imposed by our formalism is the absence of pairwise correlations, and the presence or absence of any other kind of correlation does not affect the asymptotic uncertainty.
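This can also be confirmed numerically: under our reading of equation (45), the γ = −1 probe has vanishing pairwise correlations yet a mixed single-qubit reduced state, which certifies entanglement for a pure state.

```python
import numpy as np

# d = 3, gamma = -1: amplitudes +1/sqrt(8) on |000>, |111> and -1/sqrt(8) on
# the six remaining basis states. J = 0 (no pairwise correlations), but the
# reduced state of one qubit is maximally mixed, so the probe is entangled.
amps = -np.ones(8) / np.sqrt(8)
amps[0] = amps[7] = 1 / np.sqrt(8)

bits = (np.arange(8)[:, None] >> np.array([2, 1, 0])) & 1
s = 1 - 2 * bits                            # sigma_z eigenvalues per sensor
p = amps ** 2
m12 = p @ (s[:, 0] * s[:, 1])               # <sz1 sz2>; equals J here (<sz_i> = 0, 4v = 1)
print(np.isclose(float(m12), 0.0))          # -> True: no pairwise correlations

psi = amps.reshape(2, 4)                    # bipartition: qubit 1 | qubits 2, 3
rho1 = psi @ psi.conj().T                   # reduced state of qubit 1
print(round(float(np.trace(rho1 @ rho1).real), 6))   # -> 0.5: mixed, hence entangled
```

The purity 1/2 means qubit 1 is maximally entangled with the rest of the network, even though this entanglement leaves the asymptotic uncertainty untouched.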

Optimal POVM in the asymptotic regime
The final step of the asymptotic analysis is to find some POVM that is optimal in the large-µ regime, in the sense that it saturates the quantum Cramér-Rao bound as c asym =¯ cr , and we can achieve this by requiring that F (θ) = F q [12,13]. That the latter condition refers to the parameters but not to the functions, together with the fact that the former can be estimated optimally using a local strategy [32,33] (see also section 3.2), suggests that a local POVM might be sufficient to make the classical and quantum information matrices equal. In fact, this would be very useful, since then we could associate any enhancement derived from the presence of correlations with the initial state alone. In the following we demonstrate this for a network with d = 2 parameters. Consider a local POVM with elements Furthermore, we have seen that, if d = 2, then the state in equation (40) is general enough to realise all the asymptotic results predicted by our theory. As such, this is the probe that we will use in this calculation. Combining this POVM with the transformed state |ψ(θ 1 , θ 1 ) = e − i 2 (σ z,1 θ 1 +σ z,2 θ 2 ) |ψ 0 in equation (41), we find the amplitude the modulus of the proportionality factor being 1/[2(1 + γ 2 )]. This allows us to arrive at the likelihood function where we have introduced the notation x ± ≡ [θ 1 ± θ 2 + π(k ± n)] /2. The elements of the classical Fisher information matrix in equation (18) for the quantum probability in equation (49) are and [F (θ)] 12 = 1 n,k=0 1 p(n, k|θ 1 , θ 2 ) ∂p(n, k|θ 1 , θ 2 ) ∂θ 1 ∂p(n, k|θ 1 , θ 2 ) ∂θ 2 with [F (θ)] 21 = [F (θ)] 12 . Additionally, in sections 3.1 and 3.2 we have seen that, for this configuration, which is identical to the classical Fisher information matrix in equations (50 -52). We thus conclude that the quantum strategy formed by the local POVM in equation (47) and the state in equation (40) is asymptotically optimal. 
This completes our solution for the asymptotic estimation of linear functions in a two-parameter network, and will be our starting point to perform a Bayesian analysis.

Bayesian analysis of non-asymptotic quantum sensing networks
Now we turn to the more general problem of estimating linear functions when different amounts of data are available, which may include cases with a low number of trials.
Thanks to the simplicity of the asymptotic approach, in section 3 we were able to discuss examples where d = 2, 3, 5 and 10, and many of the results there were valid for any d. However, due to the more challenging nature of the numerical calculations associated with Bayesian estimation, in the remainder of this work we will focus on two-parameter sensor-symmetric qubit networks.

Regions of unambiguous information
Our aim is to use the asymptotically optimal strategy in equations (41), (47) and (49) as a guide to perform a non-asymptotic analysis. Following our discussion in section 2.2, this approach is best justified when, as µ grows, the likelihood function with each p(n i , k i |θ 1 , θ 2 ) given by equation (49), becomes concentrated around a unique absolute maximum within the prior area ∆ 0 . Indeed, this condition helps to prevent the estimation process from giving ambiguous answers [67]. Hence, before we proceed we need to find how large ∆ 0 can be such that the above requirement is satisfied. One way of estimating this size is to first represent the posterior probability p(θ 1 , θ 2 |n, k) ∝ p(n, k|θ 1 , θ 2 ) as a function of (θ 1 , θ 2 ), where the outcomes (n, k) come from a simulation with true values (θ 1 , θ 2 ), and then visualise the regions with an asymptotically unique maximum in a direct fashion (see [60]).
The result of this is shown in figure 4 for several values of γ. First we note that the simulations in figure 4 have been restricted to the area (θ 1 , θ 2 ) ∈ [0, 2π] × [0, 2π] because the single-shot likelihood in (49) is invariant under θ i → θ i + 2πm, with m = 0, ±1, ±2, . . . and i = 1, 2, and thus it suffices to examine its symmetries within one period. While the number of maxima changes with γ, we can observe that all the ambiguities in figures 4.i -4.iv can be avoided if the prior area satisfies that ∆ 0 π 2 .
The situation for γ = 0 in figure 4.v is, however, different. In that case, no single peak can be selected even if µ 1, which implies that such scheme does not have an asymptotic approximation in the sense of section 2.2. This is consistent with the fact that, if γ = 0, then J = 1, and this case must be excluded for F q to be invertible (see section 3.1). Moreover, the same type of behaviour would have been observed if we had examined the limit |γ| → ∞, for which J → −1. Hence, we only need to impose the existence of a unique maximum for 0 < |γ| < ∞. Crucially, this does not imply that the scheme with γ = 0 is useless. Figure 4.v shows that this scheme is giving information about the combination θ 2 + θ 1 = πm, with m = 0, ±1, ±2, . . . , that is, about the sum of the parameters. In fact, this can be readily seen by inserting γ = 0 in equation  (49), with (i) γ = 1, (ii) γ = 0.9, (iii) γ = 0.531, (iv) γ = 0.334 and (v) γ = 0. The simulated true values of the parameters are θ 1 = 1 and θ 2 = 2. This figure shows that the potential ambiguities in the estimation associated with scenarios (i -iv) can be generally avoided if the prior area satisfies ∆ 0 π 2 . On the contrary, while the scheme (v) can be exploited to estimate the sum of the parameters, in general it cannot provide good estimates for other linear functions, independently of the value for ∆ 0 . We draw attention to the fact that the representation for 1 < γ < ∞ follows the same pattern but with the posterior peaks tending to the direction orthogonal to that in (v).
(49), since then the likelihood for a single shot is only sensitive to the equally weighted sum of the parameters. The calculations in the next section will reveal that while the asymptotic performance of this scheme is poor, it can be useful when µ is low.

The role of inter-sensor correlations II
Given the quantum strategy in equations (41) and (47) for a two-parameter qubit network, we wish to estimate two global properties of such network when the experiment operates both in and out of the regime of limited data. In particular, consider the linear functions f 1 (θ) = (2θ 1 +πθ 2 )/ √ 4 + π 2 and f 2 (θ) = (2θ 1 +θ 2 )/ √ 5, which can be encoded in the columns of V as We assume that both functions are equally relevant, so that W = I/2, and that our prior knowledge is represented by the prior probability p(θ 1 , θ 2 ) = 4/π 2 , when (θ 1 , θ 2 ) ∈ [0, π/2]×[0, π/2], and zero otherwise. Thanks to our analysis in section 4.1, we know that this prior assignment will allow us to perform the estimation unambiguously when the asymptotically optimal strategies are employed, since ∆ 0 = π 2 /4 < π 2 . Let us start by comparing a local strategy with an entangled scheme that is asymptotically optimal. The former assumes that the experiment is arranged such that γ = 1, J = 0, while to find the properties of the latter we need to recall our results in section 3.2 for the asymptotic role of inter-sensor correlations. Equation (35) indicates that, for d = 2, when G = 0, and J opt = 0 if G = 0. In addition, J = (1 − γ 2 )/(1 + γ 2 ), and by combining the latter expression with equation (56) we find that when G = 0, and γ opt = 1 if G = 0. The normalisation term for the functions in equation (55) is simply N = Tr(WV V ) = 1, while the geometry parameter is By inserting this result in equations (56) and (57) we have that γ opt ≈ ±0.531 (we can choose the positive solution without loss of generality) and J = 0.561, where the latter verifies that this state is indeed entangled (note that the two-sensor state in equation (40) is only separable when γ 2 = 1). 
Next we perform the numerical calculation of the Bayesian uncertainty¯ mse in equation (14) for these two sensor-symmetric states, whose form as a function of γ is in equation (40); the measurement E(n i , k i ) = |n i , k i n i , k i | in equation (47) for the i-th repetition in a sequence of µ trials; and the optimal estimators which arise from equation (21) after inserting equation (55). The results have been represented in figure 5.i as graphs (a) for the local scheme and (b) for the optimal entangled strategy. We can observe that the local strategy performs worse than the entangled one for any number of repetitions. Therefore, in this case we have that the prediction made by the asymptotic theory is qualitatively preserved in the nonasymptotic regime. However, a closer analysis reveals that the distance between the  (a -c), respectively, verifying that the latter is recovered asymptotically. All the calculations assume the weighting matrix W = I/2 and a flat prior of area ∆ 0 = π 2 /4 centred around (π/4, π/4). two lines is considerably less when 1 µ 20 than when µ 1, and this behaviour is reminiscent of that of a Mach-Zehnder interferometer [61]. Indeed, optical probes with a large Fisher information (and thus a good asymptotic performance) have sometimes an error very close to that of a coherent laser beam in the regime of limited data, and coherent probes can be seen as an optical analogue of the notion of local strategy in this work. Moreover, the optical study in [61] also demonstrated that a better asymptotic error is sometimes associated with a worse performance in the regime of low µ. As a consequence, a natural question is whether a similar phenomenon can be exploited here, so that we can obtain an uncertainty that is lower than the error for the asymptotically optimal entangled state when the network operates in the non-asymptotic regime.
To test this idea, let us select a third arrangement with an asymptotic error that lies between those of the local scheme and the asymptotically optimal strategy. The asymptotic error for our network can be written in terms of γ as (see equations (32) and (33) Using this we can find the the value of γ for the strategy satisfying our desideratum above by imposing that and the solutions are γ ≈ ±0.334, ±0.842. So we take our third strategy to be the state in equation (40) with γ = 0.334 (and thus J = 0.799), a choice motivated by the fact that this is the option with the lower uncertainty for a single shot (in particular, The uncertainty¯ mse for the third scheme has been represented as a function of the number of trials in figure 5.i, where it is labelled as (c). As expected, this error lies equidistantly between the local and the asymptotically optimal strategies when µ 1, but this is no longer the case in the regime of limited data. More concretely, the graphs for the asymptotically optimal strategy and the new scheme cross each other when µ ≈ 40, so that the former is optimal when µ > 40 and the latter is the preferred choice if 1 µ 40. Consequently, we may say that trading a part of the asymptotic enhancement is sometimes associated with an improved performance in the non-asymptotic regime, which constitutes a multi-parameter generalisation of the analogous phenomenon in [61] for a Mach-Zehnder interferometer.
Interestingly, the balanced strategy (γ = 0.334, J = 0.799), which provides a better precision in the non-asymptotic regime, is associated with larger inter-sensor correlations, and in what follows we propose a potential explanation for this. Let us first recall that, when µ is large, the information about the global properties is essentially provided by the measurement outcomes that we accumulate as µ grows, which contrasts with the non-asymptotic regime where the information is a mixture of prior knowledge and experimental data. This implies that the optimal correlation strength predicted by the asymptotic theory is implicitly assuming a large amount of information, while the information available in the non-asymptotic regime is poorer because µ is low and the prior in equation (10) is only moderately informative. It is thus reasonable to expect that the asymptotically optimal amount of entanglement is generally inappropriate in the non-asymptotic regime. One can then try to compensate the low amount of information Asymptotically optimal 0.531 0.561 4.3 · 10 Balanced enhancement 0.334 0.799 5.37 · 10 2 Maximally entangled 0 1 − Table 1. Properties of different strategies based on a two-parameter qubit network, where γ selects the state and J is the amount of inter-sensor correlations. The POVM is separable for all four schemes, but only the local strategy is based on a separable state. The asymptotically optimal strategy minimises the quantum Cramér-Rao bound. The balanced strategy has also been enhanced via quantum correlations, but it is not asymptotically optimal because part of this enhancement has been traded to instead enhance its non-asymptotic performance. Finally, the fourth strategy uses a maximally entangled state. 
We note that the fourth column provides the number of repetitions µ τ needed such that the relative error between the Bayesian uncertainty and the Cramér-Rao bound is equal to or less than a 5% threshold (see [60]), and in general it depends on the available prior information. Importantly, this calculation does not apply to the strategy with a maximally entangled state, since the estimation uncertainty for the latter does not have an asymptotic limit in the sense of section 2.2. These results demonstrate the state-dependent nature of the conditions required to approach the Cramér-Rao in multi-parameter systems.
in the regime with limited data by choosing J judiciously. In our case, we observe that our functions are clustered around the equally weighted sum of the parameters, since the geometry parameter of the former is G ≈ 0.853 and this is relatively close to the geometry parameter of the latter, G = 1. In turn, this motivates choosing a J that is closer to that associated with 1, which is J = 1, in order to enhance the precision when µ is low, and this is what (b) and (c) in figure 5.i show. We may push this intuition further and consider a network with γ = 0, J = 1, which makes the state in equation (40) maximally entangled. Its graph has been labelled as (d) in figure 5.i, and upon comparing it with the three previous strategies we see that the maximally entangled state is the best option when 1 µ 10. The price that we pay for this low-µ enhancement is that the scheme ceases to be useful after µ ≈ 20 trials, and it is asymptotically beaten by the rest of schemes, including the local strategy. We notice that this result is consistent with our analysis in section 4.1, where we established that this probe is only sensitive to the equally weighted sum of the parameters.
The maximally entangled state also illustrates how, despite the lack of an asymptotic approximation in the sense of section 2.2, we can still perform a Bayesian estimation using such strategy, even when it has limited usefulness. On the contrary, for the local, asymptotically optimal and balanced strategies we have that the Bayesian mean square errors converge to their respective Cramér-Rao bounds, as it may be verified by observing figures 5.ii -5.iv. The number of repetitions required for the relative error between these Bayesian uncertainties and the asymptotic bounds to be equal to or less than 5% runs from µ ∼ 10 to µ ∼ 10 2 (see table 1 for more details).
In summary, in this section we have demonstrated that the strength of the intersensor correlations that is useful to estimate a given collection of global properties changes substantially for different amounts of data, i.e., for different values of µ. Since this is the same type of behaviour that we had established for single-parameter schemes in [61], we conjecture that the novel effects associated with a limited number of trials, which here have been uncovered using specific examples, are a general feature of nonasymptotic quantum metrology, and that they are generally present in a wide range of experiments operating in the regime of limited data.

Summary and outlook
The central question addressed in this work has been that of the role of inter-sensor correlations in the estimation of linear functions with arbitrary geometry, having exploited a sensor-symmetric qubit network in the presence of different amounts of data. First we focused on the asymptotic part of the problem, and by optimising the class of sensor-symmetric states, we have established an optimal link between correlation strength and the geometry of the linear functions. Thanks to this we have been able to demonstrate that, while entanglement is useful for many geometrical configurations, it is sometimes detrimental even with functions that are non-trivial global properties. Furthermore, we have found that forms of entanglement other than those of a pairwise nature are in fact irrelevant in this regime. Hence, our approach significantly extends previous studies in networked quantum sensing that had only considered the estimation of a single function or a collection of l = d orthogonal ones.
Given that, in practice, the number of trials µ is always finite and possibly small, we have also performed a non-asymptotic analysis of sensing networks. To this end we have introduced a hybrid estimation technique combining asymptotic and non-asymptotic optimisations in Bayesian estimation. This approximate but powerful approach has revealed that the correlation strength that is optimal for sensor-symmetric networks crucially depends on the number of times that we repeat the experiment. Additionally, we have demonstrated how the non-asymptotic precision may be enhanced by trading precision enhancements associated with the asymptotic regime.
Admittedly, while many of our asymptotic results are valid for d parameters, our Bayesian analysis has been restricted to the d = 2 case due to numerical complexity. Hence, developing methods to overcome this limitation may have a major impact in the long run. For instance, it would be interesting to examine whether the irrelevancy of forms of entanglement other than those that generate pairwise correlations is also true for a low number of trials, which is a question that requires simulations where d 3.
One possibility is to modify the multi-parameter algorithm in [45] that we have exploited in section 4, such that the integrals associated with the parameters θ are performed with Monte Carlo techniques. Alternatively, we could employ some other quantum bound whose calculation is simple enough to study cases where both µ and d are unrestricted.
One potential candidate fulfilling the latter is the multi-parameter quantum Ziv-Zakai bound in [9], although, according to our findings in [16], we cannot expect the results derived using this type of tool to be tight in general.
Notwithstanding these limitations, our methodology has revealed new important aspects of the role of entanglement in the simultaneous estimation of linear functions with networked schemes, and these results could contribute decisively towards a powerful theoretical framework for networked quantum sensing. independent, we may express the hypervolume ∆ 0 as ∆ 0 = d i=1 W 0,i , where W 0,i is the prior width for the i-th parameter. Therefore, our multi-parameter prior will be for θ ∈ [θ 1 − W 0,1 /2,θ 1 + W 0,1 /2] × · · · × [θ d − W 0,d /2,θ d + W 0,d /2], and zero otherwise, which is the prior introduced in section 2.1 and employed in the main text. We notice that this is a multi-parameter application of a method proposed by Jaynes to construct objective prior probabilities [67,74]. Other methods can be found in [75,76].
Appendix B. Optimising the multi-parameter Bayesian uncertainty: review of techniques There are several ways of addressing the problem of optimising the uncertainty in equation (14) with respect to the estimatorsθ(m), the measurement scheme E(m i ) and the initial sensor-symmetric state ρ 0 . One option is to perform a direct minimisation [2][3][4][5], which is sometimes possible in covariant estimation [7,23,47,77] but generally intractable. Alternatively, one can bound the estimation error and search for the strategy that better approaches that bound, which may be attempted with tools such as the Yuen-Lax bound [1], the quantum Weiss-Weinstein bound [11], or some multiparameter version of the quantum Ziv-Zakai bound [9,10], among others [8,47]. This method usually suffers from the lack of tightness of the bounds, although this can be partially overcome with the Bayesian analogue of the Helstrom Cramér-Rao bound that we recently constructed in [16] (see also [46,47]), since it can be saturated in certain cases and we showed how to exploit it for the estimation of local parameters (i.e., θ). Nevertheless, we have followed the weaker but computationally simpler hybrid approach in section 2.2 because the theory of estimating global properties of a network is more challenging, and we leave the application of more sophisticated methods to the estimation of linear functions for future work.

Appendix C. Minimisation of the asymptotic uncertainty for linear functions
The optimal strength for the inter-sensor correlations in equation (35) can be found as follows. If we look at¯ f = N h (J , G, d) /µ as a function of J , where we recall that, according to the discussion in section 3.
whose solutions are Since we need to restrict our study to the range 1/(1−d) < J < 1 for F q to be invertible, only J + is a valid candidate to find a minimum. Next we examine the sign of the slope in the left hand side of equation (C.2) for some values of J around J + . By noticing that N /µ > 0 and using the endpoints of the domain for J we find that for an arbitrarily small ε > 0 when G = −1, G = d − 1, which we exclude to guarantee that J = 1/(1 − d), J = 1. Consequently, J + gives rise to the minimum that we were looking for. giving the eigenvalues λ 1 = d − 1, with multiplicity 1, and λ 2 = −1, with multiplicity d − 1 (see the calculations associated with equation (23), whose eigenvalues are obtained in the same way). By inspection we see that 1 is one of the eigenvectors. Since the latter satisfies that X 1 = (11 − I)1 = (d − 1)1, the rest of the eigenvalues must be associated with the subspace orthogonal to 1, and this concludes the eigendecomposition of X .