Optimal quantum states for frequency estimation

We investigate different quantum parameter estimation scenarios in the presence of noise, and identify optimal probe states. For frequency estimation of local Hamiltonians with dephasing noise, we determine optimal probe states for up to 70 qubits, and determine their key properties. We find that the so-called one-axis twisted spin-squeezed states are only almost optimal, and that optimal states need not be spin-squeezed. For different kinds of noise models, we investigate whether optimal states in the noiseless case remain superior to product states also in the presence of noise. For certain spatially and temporally correlated noise, we find that product states no longer allow one to reach the standard quantum limit in precision, while certain entangled states do. Our conclusions are based on numerical evidence using efficient numerical algorithms which we developed in order to treat permutationally invariant systems.


I. INTRODUCTION
How can one determine a quantity or parameter with high precision? This is one of the central questions throughout all fields of physics. In addition to the numerous practical problems in this quest, one faces some fundamental limits, for example, those set by measurement statistics. Metrology, as the science of measurement, aims at identifying these limits, and at developing optimal schemes for estimating an unknown parameter in a given experimental setup. In the seminal paper by Caves [1] these questions were tackled within the quantum mechanical framework. It was found that quantum metrology offers a significant advantage as compared to classical strategies, where a quadratic improvement in achievable precision is obtained owing to the use of quantum entanglement. Nowadays, these insights find widespread applications in interferometry [2,3], atomic clocks [4,5], gravitational wave detectors [6,7] and frequency estimation [8,9].
In this work, we consider frequency estimation where the strength, ω, of a Hamiltonian, H, describing the interaction of N qubits with an external field should be estimated. In the noiseless case one finds that for classical strategies the achievable precision, δω, that measures the statistical deviation of the estimated parameter from the actual value, is given by the standard quantum limit (SQL), $\delta\omega = O(1/\sqrt{N})$, where N is the number of probe systems. In turn, entangled input states allow one to achieve Heisenberg scaling, where $\delta\omega = O(1/N)$. Optimal input states are readily identified as a coherent superposition of eigenstates of H corresponding to the minimal and maximal eigenvalue, which we will refer to as GHZ states (after Greenberger, Horne and Zeilinger [10]).
In the presence of noise and imperfections the situation changes drastically. For a local Hamiltonian proportional to $\sigma_z$ and dephasing noise (which we will refer to as the "standard scenario" in the following), one finds that GHZ states do not offer any advantage as compared to classical strategies with noise [11]. Moreover, no input state can reach the quadratic gain in precision and at most a constant gain factor can be achieved. This is expressed in terms of general bounds [12,13], which show that, for generic kinds of noise, Heisenberg scaling cannot be reached [51]. In the standard scenario so-called one-axis twisted spin-squeezed states (SSS) [14] were identified to reach the optimal constant gain factor [15] in the limit of large N, and it is widely believed that such states are optimal.
However, the situation is far from being fully understood. On the one hand, optimal states for finite N in the standard scenario are unknown, and it is not known whether states other than the SSS are optimal in the asymptotic case. On the other hand, beyond the standard scenario where one considers different kinds of noise models (e.g., depolarizing rather than dephasing noise), or different Hamiltonians [16], basically nothing is known about optimal states be it for finite N or in the asymptotic limit (an exception is transversal noise considered in [17]).
In this paper, we are concerned with the question of identifying optimal input states for frequency estimation in different noisy metrology scenarios. Our central results for the standard scenario are as follows:
• We determine optimal states for finite N. To this aim we develop numerical tools to treat permutationally invariant states that allow us to investigate systems of up to 70 qubits.
• We find that spin-squeezing is not necessary for optimality. For finite N one-axis twisted SSS are in fact only almost optimal.
• We identify a key feature for optimality which is a specific global distribution of coefficients of the input state in the eigenbasis of H. This also implies a certain value of the variance of H.
Beyond the standard scenario, where we consider local depolarizing noise or spatially and temporally correlated noise, we find:
• Contrary to the standard scenario, GHZ states may remain superior to product states also in the noisy case.

arXiv:1402.6946v1 [quant-ph] 27 Feb 2014
• For spatially and temporally correlated noise, N qubit product states do not even reach the SQL, whereas GHZ states do. This opens again a gap in the scaling. In addition, the equivalence between parallel and sequential strategies [18] (i.e., to either use one particle N times, or N particles in parallel) does not hold in this case.
The paper is organized as follows. In Sec. II, we provide relevant background information, and describe the different scenarios we consider. There, we also discuss the ansatz space of input states we consider and outline the numerical method we develop. In Sec. III, we consider the standard scenario with local Hamiltonian and dephasing noise, and determine optimal states and their features. In Sec. IV, we present results for frequency estimation scenarios with different kinds of noise models. We summarize and conclude in Sec. V. Some technical details, in particular regarding the numerical algorithm we develop here, are presented in the appendices.

II. PRELIMINARIES
In this section we briefly review the quantum metrology scenarios considered in this work, as well as the main theoretical tools behind our numerical routines. Specifically, in Sec. II A we review the main results of classical and quantum metrology and provide the mathematical descriptions for all the models we investigate. Sec. II B introduces the ansatz state space used in our numerical optimization routines, and Sec. II C provides a brief overview of the mathematical tools on which our numerical optimization is based.
A. Dynamical evolution models for quantum frequency estimation
In a metrological scenario the goal is to estimate a parameter, ω ∈ R, from a probability distribution, p(x|ω), where x ∈ R denotes the outcomes of a suitable measurement. An unbiased estimate, $\hat{\omega}$, of ω is obtained by suitable post-processing of the measurement outcomes [52]. The variance, $\delta\omega^2 = \langle(\hat{\omega}/|d\langle\hat{\omega}\rangle/d\omega| - \omega)^2\rangle$, of an unbiased estimator is lower-bounded via the well-known Cramér-Rao inequality [19]
$$\delta\omega^2 \geq \frac{1}{nF}, \tag{1}$$
where F is the Fisher information given by [20]
$$F = \sum_x p(x|\omega)\left(\frac{\partial \ln p(x|\omega)}{\partial\omega}\right)^2, \tag{2}$$
and n is the number of measurements. It is known that the lower bound in Eq. (1) can be achieved in the limit n → ∞ by the maximum likelihood estimator [21].
Assume that the parameter of interest is a quantity governing the evolution of a physical system, such as the frequency of rotation of a spin about a given axis. Then, the scenario described above is implemented by preparing the system in some initial known state, ρ, and allowing it to evolve under the requisite dynamics for some time before measuring its final state, ρ(ω, t). This process is repeated n times to obtain the measurement statistics, p(x|ω), from which an estimate $\hat{\omega}$ can be extracted. If the physical system is quantum mechanical in nature, i.e. $\rho(\omega,t) \in \mathcal{B}(\mathcal{H})$, then the probability distribution, p(x|ω), is given by $p(x|\omega) = \mathrm{Tr}(M_x\,\rho(\omega,t))$, where the measurement operators, $\{M_x\}$, satisfy $\sum_x M_x = I$. As any set of measurement operators, $\{M_x\}$, constitutes an admissible measurement, it is natural to ask which measurement minimizes Eq. (1) or, equivalently, maximizes the Fisher information. The Quantum Fisher Information (QFI), $\mathcal{F}$, is defined as the maximal Fisher information over all allowable measurements, and is given by [22][23][24]
$$\mathcal{F}(\rho(\omega,t)) = \mathrm{Tr}\left[\rho(\omega,t)\,L^2\right], \tag{3}$$
where
$$L = 2\sum_{i,j}\frac{\langle\psi_i|\dot{\rho}(\omega,t)|\psi_j\rangle}{\lambda_i+\lambda_j}\,|\psi_i\rangle\langle\psi_j| \tag{4}$$
is known as the symmetric logarithmic derivative. Here, $\dot{\rho}(\omega,t) = \partial\rho(\omega,t)/\partial\omega$, $\lambda_j$ are the eigenvalues of ρ(ω, t), $|\psi_j\rangle$ the corresponding eigenvectors, and the sum in Eq. (4) is over all i, j satisfying $\lambda_i + \lambda_j \neq 0$.
The most informative measurement is the one whose elements are the projectors on the eigenspaces of the symmetric logarithmic derivative. Substituting Eq. (3) in place of the Fisher information in Eq. (1) yields the quantum Cramér-Rao inequality, which provides the ultimate lower bound on precision achievable by a quantum mechanical strategy. As the QFI already incorporates the optimization over all measurements, it remains to optimize the precision in ω with respect to all other available resources.
In this work we focus solely on estimating the frequency of rotation of a spin around a given axis. For this task the two relevant resources are the number of probe systems, N, in a given run of the experiment and the total running time, T = nt, of the experiment. Here, t denotes the interrogation time, i.e. the time interval during which the N probes are subjected to the ω-dependent dynamical evolution before they are measured. With respect to these resources the quantum Cramér-Rao bound for frequency estimation reads [11]
$$\delta\omega^2 \geq \frac{1}{T\,\mathcal{F}(\rho(\omega,t))/t}, \tag{5}$$
where $\mathcal{F}(\rho(\omega,t))$ denotes the QFI of the final state of the N probes. The interrogation time, t, is a controllable parameter which needs to be optimized in order to maximize the precision in frequency estimation. Henceforth, when we refer to the maximal QFI for a given input state we mean the maximal value of $\mathcal{F}/t$, after the optimization over t has been performed.
The dynamical evolution of the N probe systems is described by a master equation of the Lindblad form
$$\frac{\partial\rho(t)}{\partial t} = -i[H,\rho(t)] + \mathcal{L}(\rho(t)), \tag{6}$$
where ω is the frequency we are interested in estimating, ρ(0) = ρ is the initial state of the N probes, and H = ωh and $\mathcal{L}$ are the Hamiltonian and Lindblad operators generating the unitary and non-unitary (noisy) part of the evolution, respectively. In this work we consider the local generator
$$h = \frac{1}{2}\sum_{j=1}^{N}\sigma_z^{(j)} = S_z. \tag{7}$$
We note that for local Hamiltonians, and in the absence of noise, the optimal precision in estimating frequency using an initial pure product state scales at the SQL, $\delta\omega^2 = (TN)^{-1}$.
By contrast, if the N probes are initially prepared in a GHZ state, i.e. an equal superposition of the eigenstates corresponding to the maximum and minimum eigenvalue of h, then the precision in estimation scales at the Heisenberg limit, $\delta\omega^2 = (TN^2)^{-1}$.
The noise models we consider here are of two main types. The first type of noise we consider is local, uncorrelated noise described by the Lindblad operator
$$\mathcal{L}(\rho) = \frac{\gamma}{2}\sum_{j=1}^{N}\left(-\rho + \mu_x\,\sigma_x^{(j)}\rho\,\sigma_x^{(j)} + \mu_y\,\sigma_y^{(j)}\rho\,\sigma_y^{(j)} + \mu_z\,\sigma_z^{(j)}\rho\,\sigma_z^{(j)}\right), \tag{8}$$
where γ denotes the strength of the noise. The case $\mu_x = \mu_y = 0$, $\mu_z = 1$ corresponds to local, uncorrelated dephasing noise and is the main type of noise considered in this work (Sec. III), whereas the case $\mu_x = \mu_y = \mu_z = 1/3$ corresponds to local, uncorrelated depolarizing noise (Sec. IV A).
The second type of noise we consider is correlated dephasing noise, where correlations are both in space and time (Sec. IV B), with a Lindblad operator of the form
$$\mathcal{L}(\rho) \propto -f(t)\,[S_z,[S_z,\rho]]. \tag{9}$$
Here, $f(t) = 1 - \exp(-\gamma t)$ denotes the temporal profile of the noise, whereas the double-commutator in Eq. (9) governs the spatial correlations. This type of noise is physically relevant in ion trap setups due to fluctuations in the phase reference beams addressing all N ions collectively, i.e. with infinite spatial correlation length, but finite memory [26][27][28][29]. One might expect that correlated noise is more damaging than uncorrelated noise. However, it has been demonstrated that for local Hamiltonians temporal correlations alone allow for sub-SQL precision in optical interferometry [30], whereas spatially correlated noise alone, with finite or infinite correlation length, can even allow for Heisenberg scaling in precision using a suitably chosen higher dimensional entangled state [31] or considering a different Hamiltonian [32].
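A key feature for all scenarios considered in this paper is that the unitary and noisy part in Eq. (6) commute. This means that the solution for the time-dependent density operator reads
$$\rho(\omega,t) = e^{-i\omega h t}\,\mathcal{E}[\rho]\,e^{i\omega h t}, \tag{10}$$
where $\mathcal{E}$ denotes the noisy channel generated by $\mathcal{L}$ over the time t. Inserting Eq. (10) into Eqs. (3) and (4) gives
$$\mathcal{F} = 2t^2\sum_{i,j}\frac{(\lambda_i-\lambda_j)^2}{\lambda_i+\lambda_j}\,|\langle\psi_i|h|\psi_j\rangle|^2, \tag{11}$$
where $\lambda_i$ and $|\psi_i\rangle$ are the eigenvalues and eigenvectors of $\mathcal{E}[\rho]$, and the sum runs over all i, j with $\lambda_i + \lambda_j \neq 0$. Note that $\mathcal{F}$ is independent of ω. For the case of pure states and in the absence of noise it can be easily verified that $\mathcal{F} = 4t^2 V(h) := 4t^2(\langle h^2\rangle - \langle h\rangle^2)$. However, even if we start with an initially pure state, the evolution of such a state under the full dynamics of Eq. (6) will in general yield a mixed state. Computing the QFI in this case is difficult due to the high numerical cost of diagonalizing a mixed state describing N qubits. To ease the numerical cost we restrict our initial state space to a subspace of the total Hilbert space. Our choice of state space is described in the next subsection.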
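As a consistency check of the spectral form of the QFI for commuting unitary and noisy dynamics, the following worked example treats a single qubit prepared in $|+\rangle$ with $h = \sigma_z/2$. It assumes that local dephasing damps the coherence by a factor $e^{-\gamma t}$; this normalization is an assumption made here, chosen because it reproduces the product-state value $N/(2\gamma e)$ quoted in Sec. III.

```latex
\begin{align}
\mathcal{E}\big[|+\rangle\langle+|\big]
  &= \frac{1}{2}\begin{pmatrix} 1 & e^{-\gamma t} \\ e^{-\gamma t} & 1 \end{pmatrix},
  \qquad
  \lambda_\pm = \tfrac{1}{2}\left(1 \pm e^{-\gamma t}\right),
  \quad |\psi_\pm\rangle = |\pm\rangle, \\
\mathcal{F}
  &= 2t^2 \sum_{i,j} \frac{(\lambda_i-\lambda_j)^2}{\lambda_i+\lambda_j}\,
     \big|\langle\psi_i|h|\psi_j\rangle\big|^2
   = 4t^2\,\frac{(\lambda_+-\lambda_-)^2}{\lambda_++\lambda_-}\,
     \big|\langle +|\tfrac{1}{2}\sigma_z|-\rangle\big|^2
   = t^2 e^{-2\gamma t}.
\end{align}
```

For $\gamma = 0$ this reduces to $t^2 = 4t^2 V(h)$ with $V(h) = 1/4$, in line with the pure-state expression. Since the QFI is additive for product states, N such qubits give $\mathcal{F} = N t^2 e^{-2\gamma t}$, and maximizing $\mathcal{F}/t$ over t yields $t = 1/(2\gamma)$ and $N/(2\gamma e)$.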

B. Ansatz space
Our goal is to numerically determine the input states leading to maximal sensitivity for the case of noisy frequency estimation and to identify the key properties of such states. Specifically, we are interested in determining the states that maximize the QFI. As the latter is convex [33] it suffices to consider only pure states of N qubits. However, numerically searching for arbitrary pure states that maximize the QFI seems unfeasible for a large number of qubits. Thus, we restrict ourselves to a subspace of the total Hilbert space in order to reduce computational effort.
In order to pick the most suitable subspace we note that the dynamical evolutions we consider are symmetric under particle exchange. This observation has motivated several authors to consider pure input states that are symmetric under particle permutations [11,32,34], i.e. input states of the form
$$|\psi\rangle = \sum_{m=-N/2}^{N/2} c_m\,|j_{\max},m\rangle. \tag{12}$$
The states $|j_{\max},m\rangle$ are simultaneous eigenstates of the total angular momentum operator, $S^2$, and its projection onto the z-axis, $S_z$, with corresponding eigenvalues $j_{\max}(j_{\max}+1) = N/2\,(N/2+1)$ and m, respectively, and are the so-called Dicke states with N/2 + m excitations [35]. Note that we define $|0\rangle$ as the excited state, so that for instance the Dicke state of 3 qubits, one of which is in the $|0\rangle$ state, is $|3/2,-1/2\rangle = (|011\rangle + |101\rangle + |110\rangle)/\sqrt{3}$. We can reduce the number of parameters in Eq. (12) further by requiring that the coefficients $c_m$ are real and positive, as any complex phase can be taken into account by applying a phase gate that commutes with the dynamics (Eq. (6)). In addition, the dynamics are invariant under collective spin flips $\sigma_x^{\otimes N}$. Requiring the same symmetry for the initial states leads to $c_m = c_{-m}$.
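The labelling convention for Dicke states can be checked with a short sketch. The helper `dicke_state` below is a hypothetical name introduced only for illustration; it expands $|j_{\max},m\rangle$ in the computational basis using the convention stated above, namely that $|0\rangle$ is the excited state and the state contains N/2 + m zeros.

```python
from itertools import combinations
from math import sqrt


def dicke_state(n_qubits, two_m):
    """Return {bitstring: amplitude} for |j_max, m> with m = two_m / 2.

    Convention from the text: |0> is the excited state, so every
    bitstring in the superposition contains N/2 + m zeros.
    """
    if (n_qubits + two_m) % 2 != 0 or abs(two_m) > n_qubits:
        raise ValueError("need -N/2 <= m <= N/2 with N/2 + m an integer")
    n_zeros = (n_qubits + two_m) // 2
    strings = []
    for zero_positions in combinations(range(n_qubits), n_zeros):
        bits = ["1"] * n_qubits
        for pos in zero_positions:
            bits[pos] = "0"
        strings.append("".join(bits))
    # Equal-weight superposition of all permutations.
    amplitude = 1.0 / sqrt(len(strings))
    return {s: amplitude for s in strings}


# |3/2, -1/2>: three qubits, one of which is in |0>.
state = dicke_state(3, -1)
print(sorted(state))
```

The argument `two_m` is passed as an integer (2m) so that half-integer m for odd N can be handled without floating-point comparisons.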
With the exception of correlated noise (see Eq. (9)) [32], we are not aware of any proof that the state that maximizes the QFI in the presence of local uncorrelated noise belongs to our ansatz space. However, extensive numerical studies for N = 2, 3, and comparisons of specific examples of asymmetric states with their symmetrized counterparts, suggest that the optimal state must exhibit the same symmetry as the dynamics. With all these considerations taken into account we choose our ansatz space as the space spanned by the states of Eq. (12) with $c_m \in \mathbb{R}$, $c_m > 0$, and $c_m = c_{-m}$. Note that with these restrictions, states in Eq. (12) are specified by $\lfloor N/2\rfloor + 1$ real coefficients.
Within this ansatz space, there are several state families that are of particular interest. We now define four such state families for comparison to the optimal states for noisy frequency estimation. The first family of states we examine are the product states. The optimal product state for all considered scenarios is $|\mathrm{PS}\rangle = |+\rangle^{\otimes N}$ which, when expressed in the $\{|j_{\max},m\rangle\}$ basis, reads
$$|\mathrm{PS}\rangle = 2^{-N/2}\sum_{m=-N/2}^{N/2}\sqrt{\binom{N}{N/2+m}}\;|j_{\max},m\rangle. \tag{13}$$
The second family of states we consider is the GHZ state
$$|\mathrm{GHZ}\rangle = \frac{1}{\sqrt{2}}\left(|j_{\max},N/2\rangle + |j_{\max},-N/2\rangle\right). \tag{14}$$
As mentioned in Sec. II A these states achieve Heisenberg scaling in precision for noiseless frequency estimation using the Hamiltonian in Eq. (7). The third family of states we consider are the one-axis twisted SSS [14], $|\mathrm{SSS}(\mu)\rangle$ (Eq. (15)). This state family is defined by two parameters: the squeezing parameter, µ, and a local rotation parameter, ν, which serves to re-orient the squeezing axis so that the benefit for the specific Hamiltonian is optimized. As the value of ν depends only on µ, the one-axis twisted SSS are essentially a one-parameter family. It was shown that for local Hamiltonians and for local uncorrelated dephasing noise (see Sec. III) one-axis twisted SSS are asymptotically optimal [15]. The fourth family of states we consider are the Dicke states in the x-basis,
$$|D_k\rangle = \mathrm{Had}^{\otimes N}\,|j_{\max},N/2-k\rangle, \tag{16}$$
where Had denotes the Hadamard operation. Note that $|\mathrm{SSS}(0)\rangle = |D_0\rangle = |\mathrm{PS}\rangle$.
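To make the Dicke-basis representation concrete, the sketch below builds the coefficient vectors of the product state and the GHZ state and evaluates the variance V(h) for h = S_z. The function names are illustrative choices, not taken from the paper; the coefficient lists are indexed by k = 0, ..., N with m = k - N/2.

```python
from math import comb, sqrt


def m_values(n):
    """Eigenvalues m = -N/2, ..., N/2 of S_z in the symmetric subspace."""
    return [k - n / 2 for k in range(n + 1)]


def product_state(n):
    """Coefficients of |+>^N: c_m = sqrt(binom(N, N/2 + m) / 2^N)."""
    return [sqrt(comb(n, k) / 2 ** n) for k in range(n + 1)]


def ghz_state(n):
    """Equal superposition of the m = -N/2 and m = +N/2 states."""
    c = [0.0] * (n + 1)
    c[0] = c[-1] = 1 / sqrt(2)
    return c


def variance_h(c):
    """V(h) = <h^2> - <h>^2 for h = S_z and a normalized coefficient list."""
    n = len(c) - 1
    probs = [x * x for x in c]
    mean = sum(p * m for p, m in zip(probs, m_values(n)))
    mean_sq = sum(p * m * m for p, m in zip(probs, m_values(n)))
    return mean_sq - mean ** 2


n = 20
print(variance_h(product_state(n)), variance_h(ghz_state(n)))
```

With the noiseless relation QFI = 4t²V(h), the two variances N/4 and N²/4 translate into the linear (SQL) and quadratic (Heisenberg) dependence of the QFI on N.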
In addition to the four families of states above, we conduct a numerical search for the optimal state in the entire ansatz space. For this task, we choose the Nelder-Mead simplex algorithm [36], which is known to be successful for low-dimensional, unconstrained search problems [37]. This algorithm takes as input an initial state and time and optimizes the coefficients $c_m$ of Eq. (12) by maximizing $\mathcal{F}/t$. The initial state is either a specific state or a randomly chosen one. The coefficients, $c_m$, of a random state are stochastic variables which follow a normal distribution. After all coefficients are chosen the random state is normalized.
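A minimal sketch of the random-state sampling described above is given below. The text only states that the coefficients are normally distributed and then normalized; taking the absolute value to enforce $c_m > 0$ and drawing one value per pair (m, -m) to enforce $c_m = c_{-m}$ are assumptions made here for illustration, as is the function name.

```python
import random
from math import sqrt


def random_ansatz_state(n, rng):
    """Sample coefficients c_m, m = -N/2..N/2, with c_m = c_{-m} > 0.

    Index k corresponds to m = k - N/2. One Gaussian value is drawn per
    pair (k, N - k); abs() enforcing positivity is an assumption.
    """
    c = [0.0] * (n + 1)
    for k in range(n // 2 + 1):
        value = abs(rng.gauss(0.0, 1.0))
        c[k] = c[n - k] = value
    norm = sqrt(sum(x * x for x in c))
    return [x / norm for x in c]


rng = random.Random(0)  # fixed seed for reproducibility
c = random_ansatz_state(10, rng)
print(len(c), sum(x * x for x in c))
```

Such a vector could then serve as the starting point of a simplex search over the independent coefficients and the interrogation time.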
In the next subsection we describe how the QFI can be efficiently calculated for states in our ansatz space.

C. Numerical methods
Evaluating the QFI (see Eq. (11)) is computationally hard as it requires full diagonalization of the density matrix $\mathcal{E}[|\psi\rangle\langle\psi|]$, where $\mathcal{E}$ denotes the noisy channel (see Eq. (10)). However, if the eigenvalues of $\mathcal{E}[|\psi\rangle\langle\psi|]$ are highly degenerate, then the computational effort for calculating the QFI can be significantly reduced. A state that is symmetric under particle permutations exhibits such large degeneracies. In particular, for $h = S_z$ and $\mathcal{L}$ as in Eqs. (8) and (9), we are able to use a specific representation of ρ(t) and h that allows for an efficient calculation of $\mathcal{F}$. Here, we briefly summarize this representation for the local uncorrelated dephasing scenario, Eq. (8), with $\mu_x = \mu_y = 0$, $\mu_z = 1$, and refer the reader to Appendix A for more details.
Under the effect of local dephasing, any state in the ansatz space remains permutationally invariant. Hartmann showed in [38] that any permutationally invariant state can be represented by a weighted sum of $O(N^3)$ specific operators. Thus, even though $\mathcal{E}[|\psi\rangle\langle\psi|]$ may be of full rank, where $|\psi\rangle$ is permutationally symmetric, its representation is always efficient. Moreover, the operators introduced in [38] are such that the action of local dephasing is easy to express. As shown in Appendix A the spectral decomposition of $\mathcal{E}[|\psi\rangle\langle\psi|]$ can then be efficiently computed. For an efficient computation of $\mathcal{F}$, it is necessary that h obeys the same symmetries as the state.
The general procedure for computing the QFI is hence the following. Express the initial state, Eq. (12), in terms of the "Hartmann operators" introduced in [38]. Next, calculate the action of local dephasing on the basis operators. Then, express the operators in terms of joint eigenvectors of $S^2$ and $S_z$, and numerically determine the spectral decomposition of $\mathcal{E}[|\psi\rangle\langle\psi|]$ as a function of the coefficients, $c_m$. One can then efficiently evaluate Eq. (11). We note that the same method can be applied for the case of depolarizing noise, even though the effect of this noise model when computed in terms of the "Hartmann operators" is more cumbersome. Spatially correlated noise, on the other hand, is easier to treat since the states of Eq. (12) stay within the subspace spanned by $\{|j,m\rangle\}_m$ under the action of the noise (see Appendix A for more details).

III. LOCAL HAMILTONIAN WITH DEPHASING NOISE
In this section we consider the standard scenario where h is given by Eq. (7) and $\mathcal{L}$ is given by Eq. (8) with $\mu_x = \mu_y = 0$, $\mu_z = 1$. Hence, the unitary and noisy evolutions in the standard scenario act parallel to each other if visualized on the Bloch sphere.
Before presenting our numerical findings we first recall the known results pertaining to the standard scenario. It is easy to see that the product state, $|+\rangle^{\otimes N}$, leads to a maximal QFI of $\max_t(\mathcal{F}_{\rm PS}/t) = N/(2\gamma e)$, which has the same scaling as in the absence of noise. The GHZ state, however, leads to the maximal QFI $\max_t(\mathcal{F}_{\rm GHZ}/t) = \max_t(\mathcal{F}_{\rm PS}/t)$, which is qualitatively different from the quadratic scaling in N for γ = 0 [53]. Hence, the advantage of the GHZ state compared to the product state is lost as soon as γ ≠ 0. As already mentioned in Sec. II A, it is known that asymptotically the QFI for any input state scales at most linearly with the system size, and the maximal improvement, for any N, compared to product states as input states is bounded by
$$\max_t(\mathcal{F}/t) \leq e\,\max_t(\mathcal{F}_{\rm PS}/t), \tag{17}$$
where the bound can only be achieved for N → ∞ [12]. Thus, the minimum error obtainable is given by $0.61\,(\delta\omega)_{\rm PS}$. It is also known that one-axis twisted SSS with a specific squeezing parameter µ (which depends on N only) achieve this bound asymptotically [15] (see Sec. III D for more details).
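The known results recalled above can be reproduced numerically with a few lines. The sketch below assumes the standard closed forms F_PS = N t² e^{-2γt} and F_GHZ = N² t² e^{-2Nγt}, which follow if single-qubit coherences decay as e^{-γt} under dephasing; these expressions and the golden-section helper `maximize` are assumptions of this example, not code from the paper.

```python
from math import e, exp, sqrt


def qfi_rate_product(n, gamma, t):
    """F/t for |+>^N, assuming coherences decay as exp(-gamma * t)."""
    return n * t * exp(-2 * gamma * t)


def qfi_rate_ghz(n, gamma, t):
    """F/t for the GHZ state, whose coherence decays as exp(-N * gamma * t)."""
    return n ** 2 * t * exp(-2 * n * gamma * t)


def maximize(f, lo, hi, iterations=200):
    """Golden-section search for the maximum of a unimodal function."""
    inv_phi = (sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    t_best = (a + b) / 2
    return t_best, f(t_best)


n, gamma = 10, 0.1
t_ps, best_ps = maximize(lambda t: qfi_rate_product(n, gamma, t), 0.0, 50.0)
t_ghz, best_ghz = maximize(lambda t: qfi_rate_ghz(n, gamma, t), 0.0, 50.0)
print(t_ps, best_ps, t_ghz, best_ghz, n / (2 * gamma * e))

# A gain of at most a factor e in the QFI corresponds to an error ratio 1/sqrt(e).
error_ratio = 1 / sqrt(e)
```

Both states reach the same maximal value of F/t, but the GHZ state does so at an interrogation time that is shorter by a factor N.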
Let us now consider the finite N case. The aim here is to numerically determine the optimal input states and identify their properties. We show in Sec. III A that there exists a $\mu = \mu_{\rm opt}$ such that $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ is already almost optimal for finite system sizes. However, we find that there exist states that perform slightly better than $|\mathrm{SSS}(\mu_{\rm opt})\rangle$. In addition, we find that the property of being spin-squeezed is not necessary for obtaining high precision. The conclusions for both statements are drawn from comparing the optimal QFI obtained by $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ and the one obtained for states within the ansatz class, Eq. (12), which numerically maximize $\mathcal{F}/t$ for a given N. Surprisingly, our results show that squeezing, often considered as the key property for high sensitivity experiments, is not necessary. The only common property which seems to be essential for optimal states is the global distribution of the amplitudes $c_m$ in Eq. (12). In particular, we find that the numerically obtained optimal states all have the same variance of h, which is, as mentioned above, proportional to the QFI in the noiseless case. In fact, we find that for all optimal states the variance of h scales as $V(h) \propto N^{1.674}$. The numerically fitted exponent is larger than one, which corresponds to the variance of the product states, but strictly smaller than two, which corresponds to the overall maximum of the variance for the given Hamiltonian.
In Sec. III B we show that a randomly chosen state from the ansatz class, Eq. (12), leads to a QFI that is larger than $\mathcal{F}_{\rm PS}$. This highlights the importance of this ansatz class, since this result is not expected for states randomly chosen from the entire Hilbert space. It also shows that a quantum improvement is nothing extraordinary and can be achieved by many different states (within the ansatz class). This result is obtained numerically by comparing the maximal QFI obtained from millions of randomly chosen states with the one obtained for the product state for various values of N. In Sec. III C we identify the properties of optimal states. We show that optimal states not only have a particular value of V(h), but must also satisfy an additional condition on their coefficients.
A. Spin-squeezing is not necessary for optimality
In this subsection we show that spin-squeezing is not necessary for optimal frequency estimation.
The search for the optimal one-axis twisted SSS for a given N is numerically relatively simple as one only needs to optimize over two parameters, the measurement time t and the squeezing parameter µ. By contrast, optimization within the ansatz class (see Eq. (12)) is much more demanding as one has to optimize over N/2 + 1 parameters including time. For each value of N, ten random states from the ansatz space are used as initial states. The states are then evolved according to Eq. (10) and the QFI of the resulting state is optimized. The maximal QFI of the optimal SSS is then compared with that of the state obtained when the optimization is initialized with the optimal SSS (see Fig. 1). We observe only small relative fluctuations between the single runs of the algorithm. We are confident that we found the global maximum within the ansatz class (Eq. (12)) for N ≤ 70 since, if the algorithm ended up in local optima, one would expect to obtain significantly different values for the QFI. In the following, we refer to "optimal states" as those states that are numerically found by this algorithm.
For all values of N ≤ 70, we found a relative improvement compared to $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ of approximately $10^{-5}$ to $10^{-3}$, which is significantly larger than the fluctuations of the optimization algorithm, see Fig. 1. We therefore conclude that whereas one-axis twisted SSS are very close to the optimal state, there exist states which perform better. With the confidence of discussing the actual global optimal states, we can analyze their properties. Here, this is done for the squeezing strength, $\xi^2$, and in Sec. III C for the variance V(h).
The squeezing strength [39,40] measures how spin-squeezed an N-qubit state is. For states in the ansatz class (Eq. (12)), it is defined in Eq. (18). A state is called spin-squeezed if ξ < 1. Let us remark that the optimal one-axis twisted SSS, $|\mathrm{SSS}(\mu_{\rm opt})\rangle$, is not the state with the smallest value of ξ.
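For readers who want to experiment with the squeezing strength, the sketch below evaluates a Wineland-type parameter, $\xi^2 = N\,V(S_y)/\langle S_x\rangle^2$, for real coefficients $c_m$. This particular form is an assumption made here (the mean spin of ansatz states points along x, and $S_y$ is the direction perpendicular to both the mean spin and the rotation axis); it is not necessarily identical to the paper's Eq. (18). The function name is likewise illustrative.

```python
from math import comb, sqrt


def squeezing_parameter(c):
    """xi^2 = N V(S_y) / <S_x>^2 for real coefficients c_m, m = -N/2..N/2.

    Assumed Wineland-type definition; index k corresponds to m = k - N/2.
    """
    n = len(c) - 1
    j = n / 2
    m = [k - j for k in range(n + 1)]
    # a[k] = <j, m+1| S_+ |j, m> for m = m[k]
    a = [sqrt(j * (j + 1) - mk * (mk + 1)) for mk in m]
    # <S_x> = sum_m c_m c_{m+1} a_m for real coefficients
    mean_sx = sum(c[k] * c[k + 1] * a[k] for k in range(n))
    mean_sz2 = sum(c[k] ** 2 * m[k] ** 2 for k in range(n + 1))
    # <S_+^2> = <S_-^2> = sum_m c_m c_{m+2} a_m a_{m+1}
    double_raise = sum(c[k] * c[k + 2] * a[k] * a[k + 1] for k in range(n - 1))
    # <S_y> = 0 for real c_m, so V(S_y) = <S_y^2>
    var_sy = (j * (j + 1) - mean_sz2 - double_raise) / 2
    return n * var_sy / mean_sx ** 2


n = 10
product = [sqrt(comb(n, k) / 2 ** n) for k in range(n + 1)]
xi2_product = squeezing_parameter(product)

# Two-qubit example with enhanced weight at m = +-1
xi2_two_qubits = squeezing_parameter([sqrt(0.3), sqrt(0.4), sqrt(0.3)])
print(xi2_product, xi2_two_qubits)
```

Under this definition the product state sits exactly at the boundary ξ² = 1, while the two-qubit example has ξ² = 5/6 and therefore counts as spin-squeezed.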
If we calculate ξ for the optimal input state found by the optimization routine, we observe an interesting phenomenon (see Fig. 2). If the initial state was the spin-squeezed state $|\mathrm{SSS}(\mu_{\rm opt})\rangle$, then the resultant state after the optimization is also spin-squeezed (squares in Fig. 2). In contrast, optimal states found with random starting states are not necessarily spin-squeezed (dots in Fig. 2). In particular, for N > 50, we found no such state which was spin-squeezed. We therefore conclude that spin-squeezing is not a necessary property for optimal frequency estimation. It is not sufficient either, as there exist SSS that are sub-optimal.

FIG. 2: Squeezing strength (Eq. (18)) for optimal input states. The blue line represents the border below which states are spin-squeezed. In case $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ was used as an initial state in the search algorithm the resulting state, after optimization, was found to be spin-squeezed as well (squares). However, if the initialization was done with a random initial state this is, for larger N, typically not the case.

B. Typical ansatz states perform better than classical ones
In this section we calculate the optimal QFI for states chosen randomly from our ansatz space and compare this to the maximal QFI for the four state families introduced in Sec. II B. The motivation is to learn about the performance of "generic" states from the ansatz space (Eq. (12)). For each N ∈ {10, 20, . . . , 70}, we generate $10^6$ random states as described in Sec. II B, and for each state the maximal QFI and V(h) are calculated. We find that for larger N almost all random states perform better than both product states and the GHZ state. In Fig. 3, the maximal QFI of different state families including the optimal states and the average maximal QFI from the sampling are compared.

C. Properties of optimal states
In this subsection we identify two important properties for the optimal states. We find that a key property of the optimal states is that the distribution of their coefficients, $c_m$ (see Eq. (12)), follows the general shape of a cosine distribution. Moreover, only the global shape of the distribution of the $c_m$'s plays a role, resulting in optimal states having a variance, $V(h) = O(N^{1.674})$. Recall that in the absence of noise V(h) is proportional to the QFI.

FIG. 3: Maximal QFI for different state classes, relative to the maximal QFI of product states (and GHZ states). The results for Dicke states $|D_k\rangle$ (Eq. (16)) where we optimize over k, the numerically found optimal states of Sec. III A (whose QFI is almost that of $|\mathrm{SSS}(\mu_{\rm opt})\rangle$), and the randomly chosen states are shown. The random states with conditioned variances are introduced in Sec. III C. For N = 70, the minimal uncertainty is about $0.80\,(\delta\omega)_{\rm PS}$. The standard deviations of the results for the random states are indicated as shaded areas.
A first intuition about these results is obtained from Fig. 4, where the maximal QFI is plotted against the variance V(h) for N = 10 for the four families of states introduced in Sec. II B. One notices not only the relation between QFI and V(h), but also that the states from the ansatz class cluster around a particular interval of V(h) close to the maximal QFI. Similar observations can be made for all the different values of N considered here.
Whereas we find a large deviation of $\xi^2$ for the optimal states (see Sec. III A), one observes the contrary for V(h) which, as mentioned before, is the figure of merit in the noiseless case. In Fig. 5, the lower blue line represents the average variance of h for ten optimal states obtained by random sampling. The standard deviation from this average is plotted as well. However, the relative deviations from the average values are of the order of $10^{-6}$ to $10^{-3}$ and therefore hardly visible. Even though the random states used for the initialization of the optimization algorithm exhibit very different variances (scaling as $O(N^2)$), the optimal states found all show a reduced variance that is very well approximated by $V(h) = 0.196\,N^{1.674}$. This gives strong evidence that the variance of the state is relevant even in noisy metrology. More importantly, the variance does not scale with $N^2$, as it would in case of a randomly selected state from the ansatz class, or for optimal states in the noiseless case.
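To put the fitted law into perspective, the short sketch below tabulates it next to three reference variances: N/4 for the product state, N²/4 for the GHZ state, and N(N+2)/12, the variance of a uniform distribution over the N+1 values of m. The dictionary keys and function name are illustrative.

```python
def variance_scalings(n):
    """Reference values of V(h) for N = n qubits."""
    return {
        "product": n / 4,                    # |+>^N
        "optimal_fit": 0.196 * n ** 1.674,   # numerical fit quoted in the text
        "uniform": n * (n + 2) / 12,         # uniform weights over m
        "ghz": n ** 2 / 4,                   # maximal variance
    }


for n in range(10, 71, 10):
    row = variance_scalings(n)
    print(n, {name: round(value, 1) for name, value in row.items()})
```

Over the investigated range the fitted variance lies strictly between the product-state value and the uniform-distribution value, well below the GHZ maximum.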
Clearly the variance is not the only quantity which determines the usefulness of a quantum state for noisy frequency estimation. In fact, a state that exhibits the same variance as the optimal states does not necessarily have to have an optimal QFI. For instance, the Dicke states $|D_1\rangle$ and $|D_2\rangle$ in Fig. 4 are close to the optimal variance, yet their maximal QFI is much smaller than the optimal value. To understand why states such as $|D_1\rangle$ and $|D_2\rangle$ with the same variance as the optimal states yield different maximal QFIs we compare the distributions of the coefficients, $c_m$, of these states (see Eq. (12)). An example for N = 75 is shown in Fig. 6, where the coefficients, $c_m$, of $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ and $|D_6\rangle$ are compared. Both states have almost the same value for V(h). There are however two important differences: first, the distribution, $\{c_m\}$, of $|D_6\rangle$ is not a "smooth" function of m as it oscillates, having k + 1 = 7 maxima; second, the tails of the distribution, i.e. $c_{\pm k}$ for k ≈ N/2, of $|D_6\rangle$ are exponentially suppressed. Note that the maximal QFI of $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ is about 21% higher than the one of $|D_6\rangle$.

FIG. 5: For each N, $10^6$ states were sampled. The average value is very well fitted by N(N + 2)/12, which is the variance of the uniform distribution. The lower curve (blue) is the average variance from the ten optimal states found with random starts (Sec. III A). Numerically, we find that the variance is excellently fitted by $0.196\,N^{1.674}$. The standard deviation (bars) from the average for these states is comparatively small.

FIG. 6: Comparison of the distributions, $|c_m|^2$, for $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ ($\mu_{\rm opt} \approx 9.51\times 10^{-2}$) and the Dicke state $|D_6\rangle$ for N = 75. Both states exhibit almost the same value for V(h). Apart from the oscillations in the distribution of $|D_6\rangle$, the main difference is that for $|D_6\rangle$ the coefficients close to ±N/2 are exponentially suppressed.
In order to discover which of the two differences above has the largest impact on the maximal QFI we performed the following numerical test. As in Sec. III B, we sample a large number of random states ($10^4$ for each value of $N \in \{10, 20, \dots, 70\}$). Then the coefficients, $c_m$, of the states are multiplied by $\cos(\vartheta m)$, with $\vartheta$ chosen such that the variance of the random state is equal to the variance of the optimal states [54]. Then, the maximal QFI is calculated. We find that for increasing $N$, this class of modified random states typically performs much better than the unmodified random states (see Fig. 3). What is more, the relative difference between the maximal QFI of the optimal states and these modified random states decreases with $N$, in contrast to the unmodified random states. This indicates that, for larger $N$, the detailed distribution of $c_m$ is not so important, as these modified random states do not have smooth distributions. In particular, the maximal QFI apparently does not depend on a specific kind of "smoothness", which is present, e.g., for $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ (compare also to the sharpness quantity in [41,42]). However, the global structure of a relatively broad distribution that is suppressed at the boundary seems to play a major role.
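The cosine modulation described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used for the data: $h = S_z$ is assumed, the target variance is taken from the fit $0.196\,N^{1.674}$, $\vartheta$ is searched by bisection on $[0, \pi/N]$ (where $\cos(\vartheta m) \ge 0$ for all $|m| \le N/2$), and the names `variance_h` and `match_variance` are hypothetical.

```python
import numpy as np


def variance_h(c):
    """Variance of h = S_z (assumed) for the state sum_m c_m |N/2, m>."""
    c = np.asarray(c, dtype=float)
    n = len(c) - 1
    m = np.arange(n + 1) - n / 2
    w = c**2 / np.sum(c**2)
    return np.sum(w * m**2) - np.sum(w * m) ** 2


def match_variance(c, target, iters=100):
    """Find theta in [0, pi/N] such that c_m * cos(theta * m) has variance `target`."""
    c = np.asarray(c, dtype=float)
    n = len(c) - 1
    m = np.arange(n + 1) - n / 2

    def var_at(theta):
        return variance_h(c * np.cos(theta * m))

    lo, hi = 0.0, np.pi / n
    if not (var_at(hi) < target < var_at(lo)):
        raise ValueError("target variance is not bracketed on [0, pi/N]")
    for _ in range(iters):                 # plain bisection on the bracket
        mid = 0.5 * (lo + hi)
        if var_at(mid) > target:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    c_mod = c * np.cos(theta * m)
    return theta, c_mod / np.linalg.norm(c_mod)


n_qubits = 50
target = 0.196 * n_qubits**1.674           # fitted variance of the optimal states
rng = np.random.default_rng(0)
c_random = rng.uniform(0.0, 1.0, n_qubits + 1)   # one random state, c_m in [0, 1]
try:
    theta, c_mod = match_variance(c_random, target)
    print(theta, variance_h(c_random), variance_h(c_mod), target)
except ValueError as err:
    print("no solution for this sample:", err)
```

The modulated coefficients keep the irregular fine structure of the random state while acquiring a broad envelope that vanishes at $m = \pm N/2$.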

D. Asymptotic behavior
The numerical results thus far suggest that the variance as well as a specific behavior of the coefficients, $c_m$, of the state, in the eigenbasis of the Hamiltonian, play a key role in the metrological performance of a quantum state in the standard scenario. An interesting question is what happens to the average improvement of the maximal QFI for certain state families as $N$ becomes large. For instance, although the relative difference between the maximal QFI of the optimal states and the random states increases in the investigated range of $N$, this gap could stabilize to a finite value, or even close in the limit $N \to \infty$. Unfortunately, in this limit numerical results do not provide definitive answers, as calculation of the QFI requires the spectral decomposition of the time-evolved quantum state. Even for simple state families, this is asymptotically not feasible.
In this subsection we consider a particular measurement that clearly leads to a lower bound on the QFI. The measurement is assumed to be in the eigenbasis of $S_\varphi = \exp(-i\varphi S_z)\,S_x \exp(i\varphi S_z)$, i.e. a measurement in the $x$-$y$ plane. The quantity $G$ of Eq. (20), which has also been considered in [11,15], is equivalent to the QFI for this particular measurement for any input state if $N = 1$, and for the product state for any $N$.
It was recently shown [43] that $G$ is in general a lower bound on the QFI, i.e. $G \le F$. Optimizing over $\varphi$ and inserting the time-evolved state (see Eq. (6)) for a symmetric input state, $|\psi\rangle$, we obtain $G$ as a function of $s = 2\langle S_x\rangle_\psi/N$ and $V_\psi(S_y)$. An expression for $s$ in terms of the coefficients, $c_m$, is given in Appendix B 1. If $s \to 1$ and $V_\psi(S_y)/N \to 0$ for $N \to \infty$, $G/t$ reaches the optimal value $N/(2\gamma)$ at $t \to 0$. In Appendix B 1, we show that states $|\psi\rangle$ whose distribution of coefficients, $\{c_m\}$, is sufficiently smooth and for which $V(h) = O(N^2)$, cannot reach $s = 1$ asymptotically. Therefore, such a state cannot achieve the bound $N/(2\gamma)$, and is suboptimal with respect to the quantity $G$ given in Eq. (20). Note that if a similar statement were to hold true for the QFI, one could conclude that random states and other states that satisfy $V(h) = O(N^2)$ are asymptotically less useful than the optimal states.
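The behavior of $s$ can be illustrated numerically. The sketch below assumes real coefficients $c_m$ and uses the standard ladder-operator matrix elements, which give $s = (2/N)\sum_m c_m c_{m+1}\sqrt{j(j+1) - m(m+1)}$ with $j = N/2$; this explicit form is derived here for the example and is not quoted from the text. The function names `s_value` and `coherent_x` are hypothetical.

```python
import math

import numpy as np


def s_value(c):
    """s = 2 <S_x> / N for the real-coefficient state sum_m c_m |N/2, m>."""
    c = np.asarray(c, dtype=float)
    c = c / np.linalg.norm(c)
    n = len(c) - 1
    j = n / 2
    m = np.arange(n) - j                       # m = -j, ..., j - 1
    ladder = np.sqrt(j * (j + 1) - m * (m + 1))
    sx = np.sum(c[:-1] * c[1:] * ladder)       # <S_x> = sum_m c_m c_{m+1} * ladder
    return 2 * sx / n


def coherent_x(n):
    """Product state with all spins along +x, written in the Dicke basis."""
    return np.array([math.sqrt(math.comb(n, k) / 2**n) for k in range(n + 1)])


print(s_value(coherent_x(100)))     # product state: s = 1
print(s_value(np.ones(1001)))       # flat distribution, V(h) = O(N^2): s stays below 1
print(math.pi / 4)                  # continuum value of s for the flat distribution
```

For the flat distribution the continuum limit gives $\frac{1}{2}\int_{-1}^{1}\sqrt{1 - x^2}\,dx = \pi/4$, so $s$ remains bounded away from unity, in line with the statement proved in Appendix B 1.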

IV. BEYOND THE STANDARD SCENARIO
The scenario with local Hamiltonian and dephasing noise discussed in Sec. III is certainly an important instance in frequency estimation theory. However, many experimental setups have to be described by different unitary evolutions and/or different noise models. Hence, the investigation of scenarios different from the standard scenario is of fundamental and practical relevance. This section is devoted to two such scenarios. In Sec. IV A we consider the situation where local dephasing noise is replaced by local depolarization noise (Eq. (8) with $\mu_x = \mu_y = \mu_z = 1/3$), which can be viewed as a combination of local dephasing and transversal bit-flip noise, and show that this change does not lead to a dramatically different picture. In Sec. IV B we consider spatially and temporally correlated dephasing noise, described by the Lindblad operator given in Eq. (9), and show that the GHZ state remains superior to product states. In fact, a favorable scaling of the QFI can be achieved for the GHZ state.

A. Local depolarization
In this section, we study local depolarization noise, Eq. (8) with $\mu_x = \mu_y = \mu_z = 1/3$. Physically, this model describes uncorrelated interactions between the system qubits and a bath at infinite temperature. As can be easily seen, the minimal error achievable with product states as input states is the same as for dephasing noise, i.e. $(\delta\omega)_{\rm PS} = \sqrt{2\gamma e/(NT)}$. Recently, an upper bound on the maximal QFI was obtained [13], stating that the minimal error $\delta\omega$ for any input state scales with the SQL and cannot fall below $0.53\,(\delta\omega)_{\rm PS}$, a value smaller than the corresponding one for local dephasing noise. It is not known, however, whether this bound is tight. As mentioned before, the one-axis twisted SSS were shown to asymptotically reach the (optimal) minimal error, given by Eq. (20), of $0.61\,(\delta\omega)_{\rm PS}$ in the presence of phase noise. As the replacement of phase noise by depolarization noise does not change $G$ in Eq. (20), it follows that, asymptotically, $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ reaches at least the same precision with depolarization noise. To our knowledge there exists no example of a state that overcomes this limit. Therefore, it is not clear whether the bound of Ref. [13] can actually be achieved.
Let us now come to the results of our numerical studies. The algorithm to compute the QFI for this noise model is still efficient for the symmetric ansatz states of Eq. (12). However, in practice it is much more demanding (see Appendix A) and we are therefore limited to system sizes up to $N = 30$ (see Fig. 7). Whereas for $N = 2, 3$ the GHZ state is the optimal initial state, this is no longer the case for $N \ge 4$. However, the GHZ state remains superior to the product state. The QFI for the GHZ state is derived in Appendix B. For large $N$, the maximization of $F/t$ over time leads to $(\delta\omega)_{\rm GHZ} = \sqrt{3\gamma e/(2NT)} \approx 0.87\,(\delta\omega)_{\rm PS}$. Note that this improvement is already achieved for a relatively small $N$. The optimal states are found using the algorithm described in Sec. III A. One observes that the relative difference between the optimal states and $|\mathrm{SSS}(\mu_{\rm opt})\rangle$ is larger than in the standard scenario. For $N \le 30$, it is in the range of a few percent. However, this gap seems to vanish as $N$ increases, such that one could expect that SSS become optimal for larger $N$ (see Fig. 7). Compared to the standard scenario of Sec. III, the best Dicke states are closer to the optimal states for small $N$. However, the difference in their performance as compared to the standard scenario seems to vanish for larger $N$ (see Figs. 3 and 7). Also the average improvement of the randomly chosen initial states ($10^6$ random states for each $N \in \{10, 15, \dots, 30\}$) is compatible with that of the phase noise scenario. Altogether, these results are qualitatively very similar to those obtained in the standard scenario. It is worth noting, however, that there are certain differences in the asymptotic case, e.g. that the GHZ state is superior to the product state in the case of depolarizing noise.

B. Correlated dephasing
We now discuss our findings for the case of correlated dephasing noise, a dominant source of noise in experiments based on trapped ions [26-29] that arises mainly due to fluctuations in the phase reference that collectively affect all the ions. It was shown that optimal states for the standard scenario offer no quantum advantage in this case, as spatially correlated noise causes quantum coherences to vanish even faster than in the case of local dephasing noise [32]. However, clever use of decoherence-free subspaces [32], or higher-dimensional probe systems, i.e. qutrits [31], allows for higher precision compared to uncorrelated noise and can even restore Heisenberg scaling in precision. On the other hand, temporal correlations can lead to an improved performance of highly entangled states such as the GHZ state [44,45]. The reason is that the optimal measurement time $t$ decreases with $N$, eventually entering the regime where $\gamma t \ll 1$. Then, (short) temporal correlations lead to a quadratic suppression of the noise strength.
Here, we will focus on qubits and study the effect of temporally as well as spatially correlated noise as defined in Eq. (9). For spatially correlated noise, the master equation reads as in Eq. (6). An additional (time-dependent) factor $f(t)$ multiplying the super-operator $\mathcal{L}$ (see Eq. (9)) models the time-evolution assumed in [29]. For $\gamma t \gg 1$, we are in the Markovian regime, $f(t) \approx 1$, while for $\gamma t \ll 1$, the effect of noise is suppressed through the temporal correlations by $f(t) \approx \gamma t$.
For the GHZ state, pure spatial correlations lead to a maximal QFI constant in $N$ [32], whereas pure temporal correlations give rise to an improved scaling of the optimal QFI of $O(N^{3/2})$ [44]. With $h = S_z$, the QFI for the GHZ state can be evaluated with both kinds of correlations present. For $N \gg 1$ the optimization over time can be easily performed and leads to $\max_t (F/t) = N/(\sqrt{2e}\,\gamma)$, i.e. the SQL.
We now compare this result to the one obtained for product states as input states. Note that, in order to avoid the effect of the correlated noise, one might use a sequential approach, where one qubit after the other is exposed to the evolution. In this case the total time is given by $NT$, where $T$ denotes the evolution time for a single experiment. Then, the maximal QFI coincides with the one obtained for local dephasing. The improvement of using a GHZ state is hence limited to a constant factor, $(\delta\omega)_{\rm GHZ} \approx 0.65\,(\delta\omega)_{\rm PS}$. If, however, a parallel scheme is employed, where an $N$-qubit product state is used as input state, the effect of noise changes the scaling of the maximal QFI. Indeed, numerically we find that then $(\delta\omega)_{\rm PS} = O(N^{-1/4})$, which is worse than the SQL. Hence, for a parallel scheme, we indeed encounter a quantum improvement in scaling. This is because, for the GHZ state, the time-like and space-like correlations partially compensate each other. In contrast, the product state, whose optimal measurement time is relatively long (and constant in $N$), cannot benefit from short correlations in time as much as it is impaired by spatial correlations. It is worth mentioning that, in this scenario, the equivalence between the sequential and parallel strategies for product states, discussed in [18], is not valid.
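The GHZ optimum and the constant factor relative to the sequential product-state strategy can be checked numerically. The sketch below rests on an assumption that is not quoted from the text: the GHZ coherence between $m = \pm N/2$ is taken to be damped by $p^{N^2}$ with $p = \exp(-\gamma t - e^{-\gamma t} + 1)$, the factor given for correlated noise in Appendix A, so that $F = N^2 t^2 p^{2N^2}$. The function name `ghz_qfi_rate` and the chosen values of $N$ and $\gamma$ are illustrative only.

```python
import numpy as np


def ghz_qfi_rate(t, n, gamma):
    """F/t for the GHZ state, assuming F = n^2 t^2 p^(2 n^2) (see lead-in)."""
    log_p = -(gamma * t + np.expm1(-gamma * t))   # log p = -gamma t - e^{-gamma t} + 1
    return n**2 * t * np.exp(2 * n**2 * log_p)


n, gamma = 200, 1.0
# Scan times in units of 1/(N gamma); the optimum is expected near 1/sqrt(2).
t_grid = np.linspace(0.05, 5.0, 100001) / (n * gamma)
rates = ghz_qfi_rate(t_grid, n, gamma)
best = np.argmax(rates)
t_opt, rate_opt = t_grid[best], rates[best]

rate_sql = n / (np.sqrt(2 * np.e) * gamma)        # value quoted in the text
rate_product = n / (2 * gamma * np.e)             # local-dephasing product-state value
ratio = np.sqrt(rate_product / rate_sql)          # (delta omega)_GHZ / (delta omega)_PS

print(t_opt * n * gamma, rate_opt / rate_sql, ratio)
```

Under this assumption the optimal time scales as $1/(\sqrt{2}N\gamma)$, so for large $N$ the whole evolution takes place in the regime $\gamma t \ll 1$, and the ratio evaluates to $(2e)^{-1/4}$.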
Numerically we searched for the optimal states within the ansatz class of Eq. (12) for N ≤ 75 using the same algorithm as explained in Sec. III A. We find that the GHZ states are the optimal states. We believe that this is due to the fact that the GHZ state has the shortest optimal measurement time and, for large enough N , its entire time evolution occurs in the regime where the strength of the noise is quadratically suppressed.

V. SUMMARY AND CONCLUSIONS
In this paper, we investigated different frequency estimation scenarios mainly by numerical means.
In the scenario of local interactions and local dephasing, we found that one-axis twisted SSS are almost optimal for a certain squeezing parameter. However, we also gave strong evidence that the property of being spin-squeezed is not necessary by presenting optimal states that are not spin-squeezed. Moreover, we identified the key property all optimal states have in common, namely the variance with respect to the Hamiltonian as a consequence of a specific global distribution of the coefficients $c_m$ in Eq. (12). The envelope of this distribution seems to be of cosine form (as a function of the coefficient index). We showed that similar results apply to the case of local depolarization noise. However, there the GHZ state outperforms product states. In contrast to that, we demonstrated that more drastic changes of the dynamics give qualitatively different results. In the case of temporally and spatially correlated noise, the scaling of the maximal QFI was shown to be better for the GHZ state than for the product state. Note that recently phase estimation in the presence of local or spatially correlated noise has been investigated, and a family of states similar to the one presented here has been identified as optimal input states for parameter estimation for large $N$ [46].
There are still several important open questions. For example, why is there asymptotically no improvement in scaling for the standard scenario of Sec. III? Clearly, the maximal QFI scales with $N^{F(N)}$, where $F(N) > 1$ for finite $N$ but is strictly one for $N \to \infty$. Furthermore, the numerical findings indicate which variance $V(h)$ is optimal, but it is not clear why it has this particular value. Another big open problem, which is also relevant from the experimental point of view, is the one regarding the optimal measurement. By calculating the QFI, the optimization of the measurement is intrinsically performed. In fact, for states other than SSS, where it is known that the optimal measurement is local, nothing substantial is known about the optimal measurement basis (apart from the fact that it is the projection onto the eigenspaces of the symmetric logarithmic derivative).
Here, $j$ labels the inequivalent irreducible representations, $\pi^{(j)}$, of $S_N$ present in the unitary representation $\pi$, and $d_j$ denotes the multiplicity of the corresponding $\pi^{(j)}$. Consequently, the total Hilbert space, $\mathcal{H}^{\otimes N}$, can be conveniently written as

$$\mathcal{H}^{\otimes N} \cong \bigoplus_j \mathcal{M}^{(j)} \otimes \mathcal{N}^{(j)}, \qquad \text{(A5)}$$

where the space $\mathcal{M}^{(j)}$, of dimension $\binom{N}{N/2-j}\frac{2j+1}{N/2+j+1}$, is the space upon which the irreducible representation $\pi^{(j)}$ acts non-trivially, and the space $\mathcal{N}^{(j)} \equiv \mathrm{span}\{|j,m\rangle\}_{m=1}^{d_j}$, of dimension $d_j = 2j + 1$, is the space upon which $\{\pi_g;\, g \in S_N\}$ act trivially [48]. The tensor product in Eq. (A5) is not a tensor product between real physical systems, but rather a tensor product between two virtual systems described by the state spaces $\mathcal{M}^{(j)}$ and $\mathcal{N}^{(j)}$, respectively. The states of these virtual systems can be defined via the isomorphism $|j,m,\alpha\rangle \equiv |j,\alpha\rangle \otimes |j,m\rangle$.
Hartmann showed that an operator basis for the permutationally symmetric states is given by the operators $K_{m,m',d}$ of Eq. (A6) [38]. Note that the definition of the basis operators in Eq. (A6) differs from that of Ref. [38] by a factor $N!/(i_1!\,i_2!\,i_3!\,i_4!)$.
Using Schur's lemmas it can be shown that Eq. (A6) can be written with respect to the $\{|j,m,\alpha\rangle\}$ basis in terms of three maps: $\mathcal{D}$, the completely depolarizing map, $\mathcal{D}[A] = \frac{\mathrm{tr}(A)}{\dim(\mathcal{H})}\,\mathbb{1}$ for all $A \in \mathcal{B}(\mathcal{H})$; $\mathcal{I}$, the identity map; and $\mathcal{P}^{(j)}[A] = \Pi_j A \Pi_j$, where $\Pi_j$ is the projector onto the space $\mathcal{M}^{(j)} \otimes \mathcal{N}^{(j)}$ [48]. Careful counting gives a total of $(N+1)(N+2)(N+3)/6$ different basis operators for the space of permutationally invariant matrices [38].
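The operator count can be verified by brute force. The sketch assumes that each basis operator is labelled by four non-negative integers $(i_1, i_2, i_3, i_4)$ summing to $N$, as suggested by the multinomial factor $N!/(i_1!\,i_2!\,i_3!\,i_4!)$ mentioned above; the function name `count_index_tuples` is hypothetical.

```python
def count_index_tuples(n):
    """Count tuples (i1, i2, i3, i4) of non-negative integers with sum n."""
    count = 0
    for i1 in range(n + 1):
        for i2 in range(n + 1 - i1):
            for i3 in range(n + 1 - i1 - i2):
                # i4 = n - i1 - i2 - i3 is fixed by the constraint
                count += 1
    return count


for n in (2, 10, 30, 70):
    print(n, count_index_tuples(n), (n + 1) * (n + 2) * (n + 3) // 6)
```

The cubic growth of this count, compared with the $4^N$ entries of a general density matrix, is what makes the permutationally invariant description efficient.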

Efficient representation of permutationally invariant states under phase noise
In this sub-appendix we seek an efficient description of states from our ansatz space (see Eq. (12)) in terms of the Hartmann operators $K_{m,m',d}$ of Eq. (A6). Consider an arbitrary pure state, $\rho = \sum_{m,m'} c_m c_{m'}\,|N/2,m\rangle\langle N/2,m'|$ where $c_m \in [0,1]$, belonging to our ansatz space. For $m \ge m'$ and $m + m' \le 0$ the matrix elements $|N/2,m\rangle\langle N/2,m'|$ can be expressed as where In addition, one finds Combining Eqs. (A16, A17) gives As $\langle S^2\rangle_{|j,m,\alpha\rangle} = j(j+1)$, one finds that $m^2 + N/2 + \mu^{(j)}_{m,m,1} = j(j+1)$. Hence, one can calculate the values of $\mu^{(j)}_{m,m,1}$ for all $j, m$. Multiplying $X_{(N/2+m-d,\,d,\,d,\,N/2-m-d)}$ by $\pi_g X_{(N/2+m-1,\,1,\,1,\,N/2-m+1)}\,\pi_g^\dagger$, for some $g \in S_N$ and $d > 1$, results in one out of four possibilities, depending on the choice of $g$ As $\langle j,m,\alpha|\,S^{2n} S_-^{i}\,|j,m+1,\alpha\rangle = [j(j+1)]^n \prod_{k=1}^{i} C^-_{j,m+k}$, with $C^-_{j,m} = j(j+1) - m(m-1)$, one has linear equations for any $(n, i)$ which can be solved recursively using the solution for $(n-1, i)$: In this way, one finds all $\mu^{(j)}_{m,m',d}$ and therefore an efficient way to express $\rho$ in the basis of $P^{(j)}_{m,m'}$. The actual implementation of the effective density operator description is done with four-dimensional tensors. Their processing is facilitated by using the Tensor Toolbox [49]. The QFI is then calculated via numerical diagonalization of $\rho$.

Remarks on other scenarios
We briefly note how the method changes if one considers different scenarios.

Depolarization noise.- In Sec. IV A, local depolarization noise is considered. The action of depolarizing noise on diagonal elements is $\mathcal{E}(|0\rangle\langle 0|) = \frac{1 + e^{-\gamma t}}{2}\,|0\rangle\langle 0| + \frac{1 - e^{-\gamma t}}{2}\,|1\rangle\langle 1|$ and $\mathcal{E}(|1\rangle\langle 1|) = \frac{1 + e^{-\gamma t}}{2}\,|1\rangle\langle 1| + \frac{1 - e^{-\gamma t}}{2}\,|0\rangle\langle 0|$. Using this fact it is straightforward to determine the action of depolarizing noise on the Hartmann operators as $\mathcal{E}^{\otimes N} K_{m,m',d} = \sum_{\nu=-n/2-i/2}^{n/2-i/2} e^{-2d\gamma t}\,\beta_\nu\, K_{\nu,\nu+i,d}$, where $n = N/2 - d$, $i = m - m'$, and the coefficients $\beta_\nu$ depend on $\nu - m$. The effort in calculating the action of depolarization noise is thus higher than that of local phase noise. For this reason the maximal system size is restricted to $N = 30$.

Correlated noise.- In the presence of correlated noise (see Sec. IV B), the numerical treatment is straightforward. The action on an off-diagonal element simply reads $\mathcal{E}(|N/2,m\rangle\langle N/2,m'|) = p^{(m-m')^2}\,|N/2,m\rangle\langle N/2,m'|$ with $p = \exp(-\gamma t - e^{-\gamma t} + 1)$. This means that the noisy state stays within the subspace spanned by $\{|N/2,m\rangle\}_m$. Therefore, a more complicated treatment using the Hartmann basis is not necessary.
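For the correlated-noise case the whole computation fits in the $(N+1)$-dimensional symmetric subspace, as the following sketch illustrates. It applies the damping factor $p^{(m-m')^2}$ to a density matrix in the $\{|N/2,m\rangle\}$ basis and evaluates the QFI from the spectral decomposition. Two points are assumptions of the example rather than statements from the text: the encoding Hamiltonian is taken as $h = S_z$, and, since the noise is diagonal in this basis and commutes with the rotation generated by $S_z$, the QFI is computed with the generator $t S_z$ at $\omega = 0$. All function names are hypothetical.

```python
import numpy as np


def correlated_dephasing(rho, gamma, t):
    """Multiply element (m, m') by p^((m - m')^2), p = exp(-gamma t - e^{-gamma t} + 1)."""
    n = rho.shape[0] - 1
    m = np.arange(n + 1) - n / 2
    log_p = -(gamma * t + np.expm1(-gamma * t))
    diff = m[:, None] - m[None, :]
    return rho * np.exp(log_p * diff**2)


def qfi(rho, generator, tol=1e-12):
    """QFI for d(rho) = -i [generator, rho] d(omega), via eigendecomposition of rho."""
    vals, vecs = np.linalg.eigh(rho)
    g = vecs.conj().T @ generator @ vecs
    total = 0.0
    for a in range(len(vals)):
        for b in range(len(vals)):
            denom = vals[a] + vals[b]
            if denom > tol:                     # skip pairs inside the kernel of rho
                total += 2 * (vals[a] - vals[b]) ** 2 / denom * abs(g[a, b]) ** 2
    return total


def ghz_state(n):
    """Density matrix of (|N/2, -N/2> + |N/2, N/2>)/sqrt(2) in the Dicke basis."""
    psi = np.zeros(n + 1)
    psi[0] = psi[-1] = 1 / np.sqrt(2)
    return np.outer(psi, psi)


n, gamma, t = 6, 1.0, 0.1
s_z = np.diag(np.arange(n + 1) - n / 2)
noisy = correlated_dephasing(ghz_state(n), gamma, t)
coherence = np.exp(-(gamma * t + np.expm1(-gamma * t)) * n**2)   # p^(N^2)
print(qfi(noisy, t * s_z), n**2 * t**2 * coherence**2)
```

For the GHZ state the two printed numbers agree, since the noisy state remains in a two-dimensional subspace and its QFI is the squared coherence times $N^2 t^2$.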
Proof. We first compute $\langle S_x\rangle_\psi$. From the theory of angular momentum, $S_x = (S_+ + S_-)/2$, where the operators $S_\pm$ are the standard raising and lowering operators whose action on the basis states $\{|j,m\rangle\}$ is given by $S_\pm|j,m\rangle = \sqrt{j(j+1) - m(m\pm1)}\,|j,m\pm1\rangle$. In evaluating $\langle S_x\rangle_\psi$, we have used the smoothness of the coefficients $c_m$ in the second line of Eq. (B3). In order to convert the sum into an integral, let $x = 2m/N$. Taking the limit $N \to \infty$ one obtains

$$\lim_{N\to\infty} \frac{2\langle S_x\rangle_\psi}{N} = \int_{-1}^{1} c(x)^2 \sqrt{1 - x^2}\, dx,$$

where $c(x) = \lim_{N\to\infty} c_{xN/2} \in [0,1]$ with $\int_{-1}^{1} c(x)^2\, dx = 1$. Note that $\sqrt{1 - x^2} \le 1$ and equals unity if and only if $x = 0$. Due to the normalization of $c(x)$, we find that $\lim_{N\to\infty} 2\langle S_x\rangle_\psi/N = 1$ only if $c(x)^2 = \delta(x)$, the Dirac delta function. However, as $V(h) = O(N^2)$ by assumption, $c(x)^2$ cannot be the delta function. Hence, $\lim_{N\to\infty} 2\langle S_x\rangle_\psi/N < 1$. This completes the proof.

Maximal QFI for GHZ under depolarizing noise
We consider here the depolarizing channel (see Eq. (8)). In [50] (Appendix I), we have shown a closed expression for the QFI with the GHZ state as input state. For the time optimization, we assume that, for large $N$, one can restrict oneself to $\gamma t \ll 1$, which is a posteriori justified. Then, one approximates $(1 + e^{-\gamma t})/2$ by $e^{-\gamma t/2}$ and neglects $[(1 - e^{-\gamma t})/2]^N$. Hence, the QFI equals $F \approx t^2 N^2 \exp(-3N\gamma t/2)$. The small difference in the power then gives rise to a modified optimal time $t_{\rm opt} \approx 2/(3N\gamma)$, which leads to a maximal QFI of $2N/(3\gamma e)$.
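The time optimization can be reproduced numerically from the approximate expression alone. The following sketch maximizes $F/t$ for $F \approx t^2 N^2 \exp(-3N\gamma t/2)$ on a grid and compares the result with $t_{\rm opt} \approx 2/(3N\gamma)$ and $2N/(3\gamma e)$; the function name `ghz_depol_rate` and the values of $N$ and $\gamma$ are illustrative choices, and the product-state reference $N/(2\gamma e)$ is the local-dephasing value implied by $(\delta\omega)_{\rm PS}$ in Sec. IV A.

```python
import numpy as np


def ghz_depol_rate(t, n, gamma):
    """F/t using the large-N approximation F = t^2 n^2 exp(-3 n gamma t / 2)."""
    return n**2 * t * np.exp(-1.5 * n * gamma * t)


n, gamma = 100, 1.0
# Scan times in units of 1/(N gamma); the optimum is expected at 2/3.
t_grid = np.linspace(1e-3, 5.0, 200001) / (n * gamma)
rates = ghz_depol_rate(t_grid, n, gamma)
best = np.argmax(rates)
t_opt, rate_opt = t_grid[best], rates[best]

rate_expected = 2 * n / (3 * gamma * np.e)
rate_product = n / (2 * gamma * np.e)
ratio = np.sqrt(rate_product / rate_opt)      # (delta omega)_GHZ / (delta omega)_PS

print(t_opt * n * gamma, rate_opt / rate_expected, ratio)
```

The ratio evaluates to $\sqrt{3/4} \approx 0.87$, the constant improvement over product states quoted in Sec. IV A.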