Bayesian optimization to estimate hyperﬁne couplings from 19 F ENDOR spectra

We propose Bayesian optimization for a rapid, global parameter search with little prior knowledge, followed by a reﬁnement by more standard gradient-based ﬁtting procedures. Indeed, the latter suffer from ﬁnding local rather than global minima of a suitably deﬁned loss function. Using a new and accelerated simulation procedure, results for the semi-rigid nitroxide-ﬂuorine two and three spin systems lead to physically reasonable solutions, if minima of similar loss can be distinguished by DFT predictions. The approach also delivers the stochastic error of the obtained parameter estimates. Future developments and perspectives are discussed. (cid:1) 2023 The Authors. Published by Elsevier Inc. ThisisanopenaccessarticleundertheCCBYlicense(http:// creativecommons.org/licenses/by/4.0/).

Recently, the introduction of fluorine labels has provided an additional opportunity to employ ENDOR for distance measurements in structural biology.The approach exploits some unique properties of the 19 F nucleus [16], i.e. its nuclear spin I ¼ 1  2 in combination with the large gyromagnetic ratio, providing relatively simple ENDOR spectra.The spectra are dominated by dipolar interaction, as long as the 19 F nucleus is sufficiently far from the electron spin such that no effective spin density transfer mechanism is operative.Analysis of the spectra reveals the electron spin-fluorine dipolar tensor, from which the inter-spin distance can be extracted [16].In the last year, the method has been extended in combination with other paramagnetic labels (trityl, Gd 3+ ) [17,18] as well as with endogenous tyrosyl radicals [19], and distances in the range between 0.5 and 2 nm have been reported so far.As an attractive feature, ENDOR samples can also be investigated by paramagnetic NMR techniques [20], for instance paramagnetic relaxation enhancements (PRE) [21] or pseudocontact shifts [22], which opens avenues to an integrative approach for structural biology studies.Very recently, this approach has been reported for investigations of proteins in cell [23].
Similarly to NMR, ENDOR spectroscopy benefits from high magnetic fields and frequencies, as nuclear Larmor frequencies become naturally separated, and consequently HF powder patterns arising from different types of nuclei can be better resolved.This is particularly relevant for studies with 19 F, which has a gyromagnetic ratio very close to that of protons.For instance, at 34 GHz/1.2Tesla (Qband) the 19 F and 1 H Larmor frequencies are separated by only ca. 3 MHz, which means that 1 H/ 19 F overlap will occur in typical nitroxides featuring proton HF coupling constants on the order of 6 MHz [24].Even 94 GHz/3.4Tesla (W-band) can present severe complications with proton background subtraction, for example if using tyrosyl radicals as paramagnetic centers [19].Thus, exploration of ENDOR at even higher frequencies (263 GHz/9.4Tesla), although instrumentally demanding [25], becomes crucial for future developments.
Alongside their mentioned advantages, 19 F ENDOR spectra at high frequency come with complications in the analysis due to two factors: (i) a strong orientation selection that usually prevents immediate read-out of couplings from peak positions, and (ii) the emerging resolution of chemical shielding (CS) anisotropy.Recently, we have reported 263 GHz 19 F Mims ENDOR spectra of nitroxide-fluorine model systems and demonstrated an unprecedented, visible asymmetry arising from CS anisotropy [26].This latter interaction contributes six additional parameters (three tensor eigenvalues plus three Euler angles) per 19 F nucleus in ENDOR spectral simulations, rendering standard estimation procedures based on least-square fitting with gradient methods unreliable.In our previous work, spectral simulations were achieved by using a fully DFT-predicted parameter set as input and subsequent, minor manual adjustment.The study provided the motivation to search for more rigorous methods of parameter estimation.
Recently, Stoll et al. have presented an example for inferring information about Fermi contact interaction as well as electronnuclear distances from ENDOR spectra, including estimation of their uncertainties and inference on distributions [27].The method demonstrates that multiple HF couplings can be extracted from ENDOR spectra if a Bayesian prior distribution based on DFT calculations is considered.
Herein, we follow an alternative route for determining the best parameter set from 263 GHz ENDOR spectra.We employ two representative model fluorine-nitroxide compounds that were investigated in [26], with one and two fluorine nuclei, respectively, see Fig. 1.We first neglect distance distributions, an approximation which turned out acceptable for the investigated semi-rigid model systems, but will be considered in more detail in future work.We combine statistical spectral uncertainties, recently made available through a statistical drift model [27,29] called here SDM, with an accelerated simulation code (SimSpec) that explicitly calculates the effect of orientation selection.Since the parameter space associated with the analysis is of moderately high dimension (typically 10 dimensions for a single 19 F nucleus) and objective functions typically possess many local minima, we employ Bayesian optimization for determining the set of interaction parameters that provides the best fit to several ENDOR spectra simultaneously.Using SimSpec, we then compute how the simulated spectra change when we vary the interaction parameters.Combining the statistical uncertainty of the spectrum with the dependence of the spectrum on the interaction parameters in turn yields stochastic uncertainties of the interaction parameters.
The paper is organized as follows: in Section 2 we experimentally determine the g-values of the two investigated compounds 1 and 2 from EPR spectra and then apply the SDM to 19 F Mims ENDOR data to remove baseline and other experimental artefacts.Subsequently, in Section 3, we describe the spin and experimental parameters required in the optimization procedure as well as the accelerated spectral simulation algorithm.In Section 4 we set out the statistical inference methodology.We then report and discuss results obtained using the proposed methodology in Section 5 and provide further details on materials and methods in Section 6.

Experiments and data processing
Estimation of HF and CS tensors from ENDOR data requires a work-flow that starts with the generation and examination of experimental data.Two types of experimental data are used here: (i) EPR spectra to characterize the nitroxide radical, i.e. the g-and the 14 N hyperfine coupling tensors, required to simulate orientation selection for ENDOR, and (ii) ENDOR spectra that are free from background signals and other experimental artefacts as the latter considerably affect the results of the optimization procedure.Such ENDOR spectra along with their uncertainties are extracted from recorded ENDOR data using the recently developed SDM [28], here described for 19 F-ENDOR.

EPR spectra
For optimization to succeed, we aimed to reduce the number of parameters that need to be estimated.To this end, we determined the g-values of the system in a measurement using a carbon (C) fibre (g ¼ 2:002644, [30]) as internal reference standard.Based on this calibration, the g-values of compound 1 were obtained from a simulation using the SimSpec code described in Section 3.3.The HF and quadrupole interaction parameters for the nitroxide's 14 N nucleus were adopted from our previous report [26] with a minor modification of the HF tensor A14 N eigenvalues to 15; 11; 95:8 ½ MHz to improve the fit.We estimate that the uncertainties of the HF tensor's eigenvalues are within the line width used in the EPR simulation, which is AE4 MHz or 2:8 G full width half height (FWHW).For completeness, we specify the eigenvalues of 1:3; 0:5; À1:8 ½ MHz used for the quadrupolar tensor P14 N taken from [31] and the Euler angles 0; 0; 0 ½ relative to the g-tensor used for both A14 N and P14 N .This simulation yielded g x;y;z ¼ 2:00886; ½ 2:00610; 2:00211 with approximate uncertainties of AE0:00005 estimated from uncertainties of magnetic field strengths of the fibre resonance.
The EPR spectra of compounds 1 and 2 showed a second, weaker contribution with a smaller g x -value (see Fig. 2 A and B, marked with an asterisk).This was found dependent on the freezing conditions and is attributed to a fraction of the sample with a different H-bonding environment of the nitroxide [32].For the simulation of the second contribution, g x;y;z ¼ 2:00835; 2:00610; ½ 2:00211 and A14 N eigenvalues of 15; 11; 95:8 ½ MHz were used (relative weight 0.15 for compound 1 and 0.25 for compound 2, inferred from relative g x peak heights in the measured EPR spectra).The influence of this second contribution on the analysis is discussed in Section 5.

ENDOR Spectra and Data Processing with the Statistical Drift Model (SDM)
19 F 263 GHz Mims ENDOR spectra of compounds 1 and 2 were recorded at orientations of the g-tensor as indicated in Fig. 2 B and C. We used quadrature detection yielding an in phase and an orthogonal component, referred to as real (R Y ð Þ) and imaginary (I Y ð Þ) parts which, as for complex numbers generally, we equiva- . Spectra of compound 1 were adopted from [26] while spectra of compound 2 were recorded again to obtain a better signal to noise (S/N-) ratio as compared to [26].Experimental details are given in Section 6 and raw data are displayed in the Supplementary Information (SI) A.1.
The spectra were processed with the SDM of [28] whose validity is established here for 19 F ENDOR at 263 GHz.The SDM allows quantification of the noise in the spectra as well as compensation of possible phase drifts of the echo signal that maximizes S/N-ratio.
In order to apply the SDM, the ENDOR data are recorded and stored in so-called batches indexed by b 2 1; . . .; B f g in a data Fig. 1.A: Chemical structure of compounds 1 and 2. B: Visualization of the expected orientation of the g-, HF (A) and CS (r) tensors with respect to the chemical structure, see [16].C, D: Energy minimized structure predicted by DFT calculations of compounds 1 (C, [16]) and 2 (D, [26]).Those DFT calculations also predicted that these compounds assume only one predominant conformation.
Here, i:i:d.denotes that the b;m are independent and all follow the same Gaussian distribution (with mean zero and covariance matrix R 2 R 2Â2 ), i.e. we use additive Gaussian white noise.In (1), / b models the batch-dependent strength and phase of the ENDOR effect and j m captures the ENDOR spectrum.Applying our procedure delivers maximum likelihood estimates ŵ; /; ĵ and R as described in [28], where the hat symbol denotes an estimator.To identify the direction in the complex plane along which ĵ contains the spec-  3) are examined for goodness of fit using a Kolmogorov-Smirnov (KS) test [33] and the resulting p-values are available in SI A.3.This examination leads us to conclude that the Gaussian distribution is a suitable error model and, therefore, the SDM is found to also fit 19 F 263 GHz ENDOR data.
In order to obtain a confidence region for the estimated spectrum, we employ the bootstrap procedure.This consists of the repeated generation of synthetic data from the estimated spectrum via adding simulated noise, followed by estimation of the spectrum implied by these synthetic data.The variability of the spectra thus obtained indicates the stochastic error of the spectrum.In detail, following [28] From J independent samples of these synthetic data, maximum likelihood estimates w Ã;j ; / Ã;j ; j Ã;j m and hence I Ã;j m of w; /; j and I respectively, indexed by j 2 1; . . .; J f g , are obtained.Their standard deviation is used to obtain approximate 95% confidence intervals displayed as shaded regions in panels A, B, C and D of Fig. 3.The resulting ENDOR spectra and their uncertainties for all orientations and compounds are shown in Fig. 4. Here, we also compare these spectra with the spectra obtained through the standard averaging method, i.e. summing of Y b;m over batches followed by normalization and phasing.We note that, in contrast with [27,29] which tackled 1 H ENDOR, there is little difference between the spectra resulting from these two methods because there is very little phase drift in / b .The advantage of the SDM is that approximate 95% confidence regions naturally arise from the model.If an SDM is not available, it is possible to extract an indication of the stochastic error from spectra obtained by the averaging method.This can proceed via comparing the measured spectrum with a smoothed version, as in the quasi-bootstrap method in SI B. In order to illustrate this, we used the quasibootstrap method to compute approximate 95% confidence regions for those spectra in Fig. 4 that result from the averaging method.However, the distinction between signal and noise is then less reliable and influenced by manual tuning which is why we proceed with the bootstrap method on the basis of the SDM in the remainder of the paper.
In order to prepare uncertainty estimation of tensor parameters, we also estimate the covariance matrix., which means that the stochastic errors at different RF frequencies are approximately uncorrelated.This is consistent with scanning RF frequencies in pseudo-random order (stochastic acquisition), see Section 6.2.Finally, we examine the distribution of the for some chosen fixed values of m and find that it is well-approximated by a Gaussian distribution in each case, see Fig. S3 in SI A.4.This justifies a Gaussian error model for the stochastic error of the spectra which enables propagation of the stochastic error in the spectra to the stochastic error in the interaction parameters.

Spectral simulation: parameters and algorithm
In this section, we discuss the parameters included in the optimization process.These comprise the spin parameters in the spin Hamiltonian and the experimental parameters magnetic field strength and line broadening.

Spin Hamiltonian parameters for the nitroxide -19 F system
The general spin Hamiltonian for the nitroxide- 19 F spin system at 9.4 T / 263 GHz was discussed in our previous publication [26].Briefly, it consists of two parts, Ĥ1 and Ĥ2 : where h ¼ 2p h is Planck's constant, l B and l N are the Bohr and nuclear magnetons, respectively, g n is the nuclear g-factor and k enumerates the fluorine nuclei whose total number is N19 F ¼ 1 for compound 1 and N19 F ¼ 2 for compound 2. A; P and r denote HF, quadrupolar and CS tensors, respectively.Please note that the hat symbol in this section indicates a quantum mechanical operator, not an estimator.
In the high-field approximation for the 19  Ĥ2 ' The parameters of the spin Hamiltonian Ĥ1 are inferred from the simulation of the EPR spectra using full matrix diagonalization.These parameters are not included in the parameterization because they are either unidentifiable from the ENDOR spectra (e.g. the g-tensor) or their impact on the simulated spectrum is assumed to be small.The parameters of the spin Hamiltonian Ĥ2 report on the inter-spin distance between the nitroxide and the fluorine and are the subject of optimization.In the case of   19 F Mims ENDOR spectra of compounds 1 and 2 at different orientations, respectively, estimated via the SDM (black) and the averaging method (blue).The chemical structures are indicated in the inset.Shaded areas correspond to approximate 95% pointwise confidence intervals obtained via bootstrap for the SDM (grey) and quasi-bootstrap for the averaging method (light blue).The dipolar splitting corresponding to the nitroxide-fluorine inter-spin distance is well visible for compound 1.For compound 2, only one dipolar splitting is well resolved, whereas the second, smaller splitting is partially suppressed by the central spectral hole of the Mims sequence.The resolved features of the dipolar splittings are indicated by colored asterisks for both F-atoms.The CS tensor leads to an asymmetry in the spectra and it cannot be evaluated visually.The MW frequency used for compound 1, orientation g y , differs from the one used for all other spectra resulting in a shift of the 19 F resonance.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)compounds 1 and 2, this distance is large enough so that we assume zero rhombicity [16], whence the HF interaction tensor can be expressed as where T represents the dipolar coupling strength, which depends on the nucleus and the inter-spin distance R. The order of the eigenvalues was adopted from [16] as it was assumed that the dipolar axis of the HF tensor would be close to parallel with g x .Generally, the order is such that the largest tensor component is along the z direction as detailed in [34].The value a iso describes the isotropic part arising through the Fermi contact mechanism.Finally, the rotation matrix R A determines the orientation of A in the nitroxide g-tensor frame: we use reduced 1 Euler angles a A ; b A .In total, only four parameters are required to describe the A-tensor: a iso ; T; a A and b A .
Similarly to the HF tensor, the CS tensor r19 F may be parameterized through its eigenvalues rxx ; ryy and rzz along with an associated rotation matrix R r 19 F (parameterized using the three Euler angles a r ; b r ; c r ) as given in ( 6), left.In our code, we adopt an alternative representation via the diagonal and off-diagonal entries of the symmetric 3x3 matrix ((6), right) which avoids singularities of Euler angles as coordinates (for b r ¼ 0, any combination of a r and c r with the same sum a r þ c r implies the same orientation, so one degree of freedom is lost) so that we alternatively use parameters r xx ; r yy ; r zz ; r xy ; r xz ; r yz .
Finally, we note that a rotation symmetry in the Hamiltonian leads to identical spectra for several distinct sets of interaction parameters.Careful treatment of this symmetry prevents its interfering with interaction parameter estimation: seemingly different minima at parameter values corresponding to images under this symmetry are removed.In the g-tensor frame, symmetries under rotation of coordinates can be expressed by those rotation matrices M that leave the g-tensor invariant.Since g is diagonal with distinct diagonal entries in this frame and these matrices must satisfy MgM T ¼ g, only the rotation matrices M from the set leave g invariant.Just like g, the HF and quadrupole interaction tensors A14 N and P14 N are diagonal in the g frame and will therefore also be invariant under transformation by M 2 M. Hence, the interaction parameter subsets r19 F ; A19 F ð Þand Mr19 F M T ; MA19 F M T will yield equivalent Hamiltonians Ĥ1 ; Ĥ2 and hence identical spectra for all M 2 M. Geometrically, these symmetry operations correspond to rotations about axes of the g-tensor frame by 0 or 180 degrees.

Experimental parameters
Several experimental quantities, particularly those connected to the pulse sequence such as MW frequency, MW pulse length and shape, inter-pulse delay s and RF axis values, were assumed known with sufficient precision as their estimation would add considerable complexity and lead to non-identifiability absent penalization or prior knowledge.Thus, we considered only those experimental parameters that are known to be a major source of error: the static magnetic field strength and the ENDOR line broadening.

Magnetic field strength B 0
The magnetic field strength affects orientation selection and has been observed to slowly degrade, likely due to residual resistance losses of the superconducting magnet.This drift amounts to approximately -2:7 G per day, also see SI C. We previously proposed measuring 1 H ENDOR resonance frequencies to reference the Larmor frequency of fluorine nuclei [26].In the present study, we have re-examined this method to calibrate the absolute magnetic field strength and we found it to be subject to stochastic errors of several G (data not shown).Thus, the magnetic field strength was retained as a parameter.

Line width convolution parameter
In ENDOR spectral simulations, a convolution with a line broadening function of either Gaussian or Lorentzian shape is usually applied to account for the experimentally observed line width.While a lower bound for the line broadening can be determined from the inverse of the RF pulse length, we previously found that optimal line width typically exceeds this minimum [16] and this depends on the selected orientation.Thus, the line width was retained as an optimization parameter.

Accelerated ENDOR Simulation Algorithm (SimSpec)
In order to estimate interaction parameters, frequently repeated simulation of ENDOR spectra for slightly different parameter values is required.Therefore, a major step of this work consisted in the acceleration of the spectral simulation code in Matlab, previously reported in [26], to which we refer as Sim.This was achieved by re-writing Sim in Python along with several algorithmic improvements detailed below.
Briefly, the accelerated code SimSpec computes transition energies by exact diagonalization of the nitroxide Hamiltonian Ĥ1 .These resonance energies are utilized to compute orientation selection and accumulate contributions to the ENDOR spectrum through a histogram weighted by the 'hole'-function proposed by Mehring [36] (given explicitly in [25]), parameterized via the pulse bandwidth of a p pulse.In our setting, varying this parameter in a reasonable range has very limited impact on the spectrum, see Section C of the SI.Optionally, the resonance energies can also be used to compute the EPR spectrum.Subsequently, the selected orientations are used to compute the fluorine resonance according to Ĥ2 in high-field approximation, see (4).The effect of the Mims ENDOR blind spot is treated analytically as proposed by [37].For more details, see [26].
For the powder pattern and orientation selection, SimSpec offers the choice between a Polar grid and the SOPHE grid [38], whereas Sim uses a grid similar to the Polar grid.Significant speed improvements were obtained by (i) pre-calculation of trigonometric expressions for all positions on the grid so that only changed quantities are re-computed and (ii) tensorification of the code exploiting high performance numerical linear algebra subroutines available through Numerical Python to reduce the number of explicit for loops.
A speed comparison between Sim and SimSpec to simulate the 263 GHz 19 F-ENDOR spectra of compound 1 yields the execution times in Table 1.All parameters were chosen for comparable computational accuracy between these codes.Repeating diagonalization of the Hamiltonian is only necessary if the magnetic field strength B 0 or the g eigenvalues have been changed and so computational speed-ups are available by selectively updating subsets of the parameters.Additionally, considerable computational savings from pre-calculation of the EPR spectrum and trigonometric expressions are apparent comparing execution times ('w/precalc' vs 'w/o precalc').Overall, speed-ups by a factor between 10 and 100 or more were achieved.More details are available in SI D.

Inference methodology
In this section, we introduce the methodology employed for estimation and quantification of the stochastic part of the error in this estimation.Chiefly, this consists of choosing a reasonable loss function that quantifies the fit between measured and simulated spectrum and an optimization procedure that needs to be carefully designed.We propose Bayesian optimization to perform a global search followed by refinement through a gradient-based method.Bayesian optimization is particularly suited as the loss function exhibits a large number of local minima that many other optimization algorithms tend to get stuck in.

Loss function
The full set of spin and experimental parameters can be described by a parameter vector h (not to be confused with any polar angle) of dimension 2N o þ 10N19 F , where N o denotes the number of orientations available (enumerated as o 2 1; . . .; N o f g corresponding to those g x ; g xy ; g y ; g yz ; g z for which data are available).
For a given h, our SimSpec code (Section 3.3) simulates spectra I o;m h ð Þ which are compared to the measured spectra 2 b I o;m obtained using the SDM from Section 2.2.
We seek a value for h such as to yield simulated spectra that match the measured spectra for all five orientations as closely as possible.We quantify the deviation of the simulated spectra from the measured spectra through the loss function (a re-scaled mean square deviation) The choice of this particular loss function is partly justified by the properties of b I: for large numbers B of batches we expect the estimator b I to approximately follow a Gaussian distribution and this is also observed approximately in bootstrap experiments (see SI A).Hence, the sum of squared deviations, though often used as a default choice without further justification, approximately makes L h ð Þ a negative multiple of the log likelihood for h and is therefore statistically appropriate in our case.Other choices (such as Wasserstein-based deviations, see [39]) have also been tried but were found to add computational cost at no gain in optimization speed.The problem of finding those parameter values h Ã in the parameter space H that yield the best fit between simulated and measured spectra is hence cast as the problem of minimizing L over h.

Optimization algorithms
We seek to solve the problem of minimizing the loss function from (8): where L : R m ' H ! R, in the context of optimization algorithms, is known as the objective function.We observed empirically that L was reasonably smooth if sufficiently precise spectral simulation was used (see Fig. S6 in SI D).However, it possessed a multitude of local minima and its domain is of moderately high dimension (up to 12 in our data depending on the compound under investigation).Hence, we decided to use Bayesian optimization which is an iterative strategy to solve the global optimization problem (9).There are two main ingredients involved in Bayesian optimization: a statistical model for the objective function L and an acquisition function to determine which h 2 H should be tried next.A priori, the loss function is modelled as a Gaussian process with some mean and covariance kernel ideally chosen to reflect pre-existing understanding of the loss function.Here, we adopted a standard zero mean function and Matérn covariance kernel [40] of order m ¼ 1:5 with expected improvement as acquisition function.An introduction to this algorithm is provided in SI F, a more detailed exposition can be found in [40].Instead of specifying a starting value, Bayesian optimization requires boundaries to be specified for all parameters: some parameters have natural boundaries such as a A and b A in the HF tensors, whereas other parameter boundaries are chosen to consider only physically reasonable values.One advantage of Bayesian optimization is that it deals well with large regions of parameter space, which enabled us to include the full range of theoretically possible Euler angles.
Once Bayesian optimization has identified a parameter value near the global minimum, refining the estimate using a quasi-Newton method such as Broyden-Fletcher-Goldfarb-Shanno (BFGS) [41] is standard practice because this will converge to h Ã quickly.To enhance the performance of BFGS, we elected to supply gradients of L approximated via a manually tuned finite difference method, see Fig. S11 in SI F.

Approximate confidence regions
Having approximately computed the best parameter value h Ã , there is a need to assess its error.Statistical methods allow quantification of the stochastic error, i.e. the parameter uncertainty implied by the measurement error of the spectrum.We start from the approximate covariance matrix of the measured spectrum, v 2 R NÂN , which arises from the measurement error and is computed using the bootstrap method, see Section 2.2 for details.We then compute the matrix of partial derivatives of the simulated spectrum with respect to the parameters, , by finite difference approximation.A linear approximation of the spectral simulation algorithm, , then enables us to approximately obtain the covariance matrix describing the stochastic uncertainty in h Ã that is implied by the uncertainty in b I via The least-square solution of (10) for Cov h ð Þ is used to construct approximate confidence regions for h Ã .This corresponds to linear propagation of Gaussian errors and delivers good results for small uncertainty ranges but less reliable results for particularly large uncertainty ranges.More detail on how uncertainties of orientations have been handled is available in SI E. 2 All spectra are normalized by imposing

Optimization results
Bayesian optimization requires the specification of boundaries of the parameter space and is sensitive to its dimension.Hence, we proceeded by firstly fixing the experimental parameters (see SubSection 3.2) at reasonable initial values (a line width of 20 kHz from [26] and the magnetic field strengths estimated from 1 H Larmor frequencies).Then, we used Bayesian optimization for all spin parameters except for a iso ¼ 0 which was kept fixed.Examples of intermediate steps (iterations 20 and 300) of the Bayesian optimization are plotted in Fig. S10 in SI G in the case of compound 1.These show that after 300 iterations, parameter estimates start to yield reasonably matched spectra and therefore Bayesian Optimization was halted at this point to limit computational time.In a second step, all experimental and spin parameters including a iso were jointly optimized using BFGS.
Choosing boundaries for the parameter space requires prior knowledge.Since Bayesian optimization deals well with large regions of parameter space, the boundaries were chosen to include all plausible parameter values.We considered the visible features in the spectrum (e.g.dipolar peaks in the spectrum, see Fig. 4) as well as the spread of DFT-predicted values for CS tensors, reported in [26].This resulted in the boundaries given in Table 2.

Compound 1
For compound 1, two local minima of the loss function were identified, corresponding to two solutions.These two parameter sets are compared in panel A of Fig. 5, where blue and orange bars illustrate the approximate 95% confidence interval of each parameter.Notably, the two minima have indistinguishable HF parameters with narrow confidence intervals.While the stochastic error in the spectra at each individual RF frequency is large enough to allow greater deviations in the HF parameters, such greater deviations would induce small errors at many RF frequencies.The simultaneous occurrence of such a large number of small errors, however, is unlikely under the Gaussian error model with independent errors which explains the confidence intervals being so narrow.The agreement of the HF parameters between the two minima is an important finding, since we consider the HF parameters as the primary source of information from 19 F ENDOR spectroscopy for (biological) structure determination.The two minima differ in their CS parameters and their occurrence is not surprising because of a lack of resolution of the CS tensor and a parameter space of moderately high dimension.Based on the optimization procedure alone, we cannot decide which set of parameters is preferable.We selected the parameter set represented in blue in this panel (values given in panel B) because the CS tensor Euler angle b r shows better consistency with DFT whereas the eigenvalues differ less between the two minima considering their confidence regions.In this regard, our approach is similar to that of including DFT-derived information penalizing deviations of estimated parameter values from their anticipated values [27].
The spectral residuals, shown in panel A of Fig. 6, demonstrate that the optimization procedure leads to a very close fit, substantially improving over the previously published fit shown in panel A of Fig. 7. Close inspection of the spectral residuals reveals some structure, i.e. the spectral residuals deviate from the expected pure noise indicating the presence of some systematic error due to imperfect model fit.Our statistical approach also yields correlations between stochastic errors for different parameters (see panel B of Fig. 6): for instance, we observe a strong positive correlation between magnetic field strengths and CS tensor values and a weaker negative correlation between magnetic field strengths and HF tensor parameters.
Given the closeness of measured and simulated spectra, the deviation of the estimated T-value from DFT and the previously reported value visible in panels A and B of Fig. 5 is striking.This triggered an examination of instrumental parameters, which revealed that nominal RF frequency differed from actual RF frequency, possibly by up to 8 kHz, due to a resolution issue in the commercial RF unit.This lead to a systematic error in the spectra and could partially explain the observed difference (see panel B in Fig. 5) in T-values to the ones previously obtained at 94 GHz [16], where the RF issue was absent.This finding underlines that a comparison of stochastic error with observed deviation from theoretical values can trigger a more careful study of instrumental errors.

Compound 2
Compound 2 exhibits a larger number of parameters due to its two non-equivalent 19 F nuclei.Therefore, we decided first to use Bayesian optimization to search for the CS parameters, keeping the HF interaction parameters fixed at reasonable values based on the peak positions in the spectra (Fig. 4 B).In a second step, BFGS was carried out over all spin parameters of the two nuclei.In a third step, we used BFGS over both spin and experimental parameters.Increasing the dimension of the parameter space via the second step ensures that BFGS stays near the minimum identified by Bayesian optimization rather than being attracted by another local minimum.
A substantial number of local minima of the loss function were identified: a fairly intensive search yielded four local minima (enumerated as minimum #1 to #4) but it is probable that there are further local minima not yet identified.To decide between the four minima, we relied firstly on the qualitative reproduction of the mfluorine HF coupling.We plotted the HF tensor parameters for both ortho and meta fluorine nuclei for the four minima and compared them with DFT and X-ray-derived values in Fig. 8. Panel A of this figure shows that estimated HF parameters for the ortho fluorine nucleus are all indistinguishable taking their stochastic error into account.We note a slight deviation of the T-value from the DFT Table 1 Approximate execution times (in seconds) for the simulation of the 263 GHz ENDOR spectra of compound 1. 'Total' refers to the full execution time including computation of the EPR spectrum and other set-up costs, whereas 'ENDOR' refers to the execution time of the ENDOR spectrum only.Similarly, 'w/precalc' includes the execution time of setting up the grid, pre-calculating trigonometric expressions and diagonalizing the Hamiltonian whereas 'w/o precalc' excludes these times.and previously reported values in the same direction and of similar magnitude to that observed for compound 1, likely due to similar systematic errors.Panel B displays the values for the m-fluorine nucleus and it is readily visible that minimum #2 (in orange) exhibits a strong deviation of dipolar tensor orientation from DFT values as well as from those of all other minima.In spectral Boundaries for the spin parameters used in Bayesian optimization for both compounds 1 and 2. For CS tensors, we use the parameterization via matrix entries as per Equation ( 6).Note that the precise values of the boundaries (e.g.À129 vs À130) are irrelevant as long as the ranges are large enough to include all plausible values.No boundaries for Bayesian optimization are provided for the experimental parameters and aiso because these are optimized by BFGS only.Fig. 5. A: Estimated HF interaction and CS parameters and the corresponding approximate 95% confidence regions for the 19 F nucleus in compound 1: minimum #1 in blue, minimum #2 in orange, DFT values as dotted green line and T-value calculated from X-ray structure as dotted red line from [16].B: Comparison of minimum #1 with manual fitting results and DFT values reported in [26].g-values were g x;y;z ¼ 2:00886; 2:00610; 2:00211 ½ , as reported in Section 2.1.Parameter uncertainties (approximate 95 % confidence regions) consider only stochastic error, not systematic error.Parameter estimates in [26] were based on 94 and 263 GHz data whereas optimized values here are derived solely from 263 GHz data.Uncertainties in [26] were assessed only from the impact of parameter-wise changes on the spectrum and are therefore not reported here.Similarly, DFT uncertainties are difficult to assess in general and are therefore not specified here, either.4. The spectral residuals are plotted below each of the spectra using the same color.B: correlation matrix of the corresponding parameters (from top to bottom/left to right: magnetic field strength B0 at each of the five orientations g x ; . . .; g z ; aiso; T; aA; b A ; rxx; ryy; rzz; ar; b r ; c r and line width for each of the five orientations g x ; . . .; g z ) calculated as described in Section 4. Therefore, we discarded minimum #2.In order to decide between minima #1, #3 and #4, we considered the relative orientation of the dominant eigenvectors of the CS tensors (i.e. the eigenvector associated with the largest eigenvalue of each CS tensor).These should both be orthogonal to the phenyl ring based on NMR studies [42] and therefore parallel to each other.To visualize this, we computed the angle between these two dominant eigenvectors for each minimum, taking stochastic error into account.This is represented in Fig. S15 in SI G.3 which shows that minimum #1 is best compatible with angle 0 , although minimum #3 is also acceptable.Minimum #4 can be excluded.

Parameter name
The results of the optimization procedure including correlations for minimum #1 are summarized in Fig. 9 and reported in Table 3.The approximate confidence regions for the CS tensors were determined using error propagation from Section 4.3 and tensor alignment as detailed in SI E. The spectral residuals are plotted in panel A of this figure.We find a very close fit: spectral residuals consist mostly of noise.This represents a main result, namely a clear improvement in the spectral residuals over the results of previous manual parameter tuning [26] as displayed in panel B of Fig. 7.Moreover, a great advantage is that our optimization procedure typically has computational time amounting to a few hours on a standard PC as compared to several weeks of manual parameter tuning underlying the results in [26].Overall, despite limited resolution of CS tensors and moderately high dimension of the parameter space, we conclude that the method was able to identify a physically reasonable set of parameter values.Fig. 7. A, B: Spectra, simulation and spectral residuals resulting from the estimated parameters reported in [26] for compounds 1 and 2, respectively, with the chemical structures indicated in the inset.Parameters are given in panel B of Fig. 5 for compound 1 and Table 3 for compound 2.

Final remarks and conclusion
Finally, we discuss the results regarding estimated magnetic field strengths and line widths which are reported in Table 4.We observe major deviations between magnetic field strengths previously used in spectral simulation and their current estimates, particularly for compound 2.This strongly suggests that in the previous analysis, a systematic error was present in the combined determination of the magnetic field strengths and g-values.Indeed, only an internal magnetic field strength calibration (Bruker field linearization) was available in that study.
The ENDOR line widths for a Lorentzian convolution were also optimized and for most orientations the values obtained are reasonably close to the ones used in [26] which are based on RF pulse lengths.However, orientations g xy and g z in compound 1 as well as orientation g xy in compound 2 showed a markedly increased line width, see Table 4.This may be attributable to two potential factors: Firstly, the observed magnetic field drift (see Section 3.2) which is not part of the model results in a mixing of ENDOR spectra across a narrow range of magnetic field strengths.Secondly, the weaker contribution with a smaller g x value as described in Section 2.1 will show different orientation selectivity at the magnetic field strength selected in the experiment, particularly between g x and g y .The resulting broadening is then matched by an increased estimated line width.The combined impact of deviations in line widths and magnetic field strengths could explain the systematic error observed in the spectral residuals in Fig. 7.
Turning to the specified errors, it is important to point out that the specified approximate confidence regions account solely for stochastic error.While the spectral residuals, e.g. in Fig. 9, are small, this does not necessarily mean that the implied systematic parameter errors are small.Indeed, including additional parameters in the spectral simulation such as two of the eigenvalues of the g-tensor, has been observed to lead to a notable change in estimated parameters, larger than would be expected from stochastic error.This tendency is especially pronounced when the additional parameters are strongly correlated with ones already included, as is the case with eigenvalues of g.Out of concern for such larger than expected parameter changes, we have tested an optimization considering the small spectral contribution with a second g x value Fig. 9. A: the 19 F ENDOR spectra b Io;m (black) extracted with the SDM for all orientations of compound 2. In different colors: The corresponding spectra Io;m ĥ simulated with HF and CS parameters from minimum #1 and experimental parameters as in Table 4.The spectral residuals b Io;m À Io;m ĥ are plotted below each of the spectra using the same color.B: The correlation matrix of the corresponding parameters calculated as described in Section 4.3 for compound 2. Parameters are sorted as in panel B of Fig. 6 except for the CS tensors where matrix entries rxx; . . .; ryz according to (6) are used.

Table 3
HF interaction and CS tensor parameters for the 19 F nuclei in compound 2. g-values were g x;y;z ¼ 2:00886; 2:00610; 2:00211 ½ , as reported in Section 2.1.Parameter uncertainties (approximate 95 % confidence regions) consider only stochastic error, not systematic error.Uncertainties for CS eigenvalues and Euler angles are computed using the method in SI E. Euler angles marked with an asterisk have been changed by symmetry transformations according to (7) to map them closer to the current confidence regions.Previous estimated values from manual parameter tuning (Sim.) and corresponding DFT values (DFT) used as input, both from [26], are included for comparison.Parameter estimates in [26] were based on 94 and 263 GHz data whereas optimized values here are derived solely from 263 GHz data.Uncertainties in [26] were assessed only from the impact of parameter-wise changes on the spectrum and are therefore not reported here.Similarly, DFT uncertainties are difficult to assess in general and are therefore not specified here, either.for compound 1, as reported in Section 2.1.The results are displayed in Fig. S13 in SI G.2 and show similar spectral residuals as using only a single g x value, so no unexpectedly large parameter changes were observed in this case.

Spin
The main sources of systematic error likely include instrumental error as well as our assumption of a single molecular conformation.Also, the spectral simulation approach does not consider any spin dynamics and relaxation.Modelling of these complex effects requires a different mathematical approach and is left for future work.Our study shows that knowledge of experimental parameters, such as magnetic field strengths or actual RF frequency, is crucial for parameter estimation of fluorine nuclei; it is easy to over-estimate the precision with which such experimental parameters are known.
For future work, a Bayesian approach including molecular dynamics as a source of prior information is a possible route towards systems of greater relevance.Similarly, replication of the present results on additional compounds as well as use of ENDOR spectra recorded at other MW frequencies are planned to establish robustness of the method and its ability to deliver HF tensors.Extension to spin systems with a larger number of nuclei is likely challenging and will require the imposition of penalties or the adoption of a Bayesian approach.Nonetheless, the present work provides a first step towards improved and faster parameter estimation through a better fit between measured and simulated spectra.

Sample preparation
Compounds 1 and 2 were synthesized as described in [26].For ESE and ENDOR, solutions contained sample concentrations of about 300 lM in a 1:1.5 mixture of deuterated DMSO and CD 3 OD.Solutions were loaded into Suprasil capillary tubes (VitroCom CV2033-S-100; O.D./I.D. = 0.33/0.20 mm), and shock frozen in liquid nitrogen.Then, sample capillaries were inserted into the precooled resonator immersed in a liquid nitrogen bath.Subsequently, the cold resonator was transferred into the cryostat, precooled to about 80 K. Samples contained typical volumes of about 50 nL.A carbon fibre was used for g-value calibration, as proposed in [30].For this measurement, a solution of compound 1 was loaded into the capillary along with carbon fibre and shock frozen in liquid nitrogen.

263 GHz EPR/ENDOR spectroscopy
263 GHz pulsed and CW EPR (ESE) as well as pulse ENDOR were recorded on a Bruker ElexSys E780 spectrometer equipped with a Bruker cylindrical TE012-mode EPR/ENDOR resonator (Model E9501510) [25].For ENDOR, the RF was produced by a Bruker DICE II RF synthesizer and pulse forming unit.A 125 W RF-amplifier (Amplifier Research, model 125W1000) was used, and the Mims pulse sequence [1] was applied.All ENDOR spectra were recorded in batches, each containing the sum of between 50 and 100 individual scans.The batches were stored in one data matrix, containing the batches in one dimension and the RF frequency as a second dimension.This matrix was used for data processing with the SDM.All spectra, including CW-EPR, were detected using a quadrature detection scheme, which enabled phasing of the signal prior to acquisition.
Experimental settings and conditions: ESE (compound 1 or 2): Mims-ENDOR (compound 1 or 2): T = 50 K, m = 263 GHz, p/2pulse (MW) = 32-40 ns, s = 850 ns, p-pulse (RF) = 50 ls, SRT = 3 ms, 1 shot/point in stochastic acquisition, RF resolution: 333 RF points recorded at nominal resolution of 3 kHz.During the writing of the manuscript we became aware of a bug in the Bruker spectrometer software connected with the DICE II unit, which limits the actual RF resolution to likely up to 8 kHz.While this issue is being addressed, the analysis in this paper was performed assuming the nominal 3 kHz resolution to be the actual resolution.Acquisition time was between 2 to 21 hours depending on sample and orientation as discussed in [26].

Table 4
Magnetic field strengths and line widths obtained from the reported optimization procedure.Parameter uncertainties (approximate 95 % confidence intervals) consider only the stochastic error, they do not include systematic error.The values yielding the simulated spectra in our previous report [26] are included for comparison.The values marked with y concern data not presented in [26] and are obtained using the method employed there.

Fig. 2 .
Fig. 2. 263 GHz EPR spectra (black) and their simulations (red).ESE spectrum of compounds 1 (A) and 2 (B), respectively, with an asterisk indicating the weaker contribution with a different g x -value.The first derivatives of the smoothed echo-detected spectra are displayed.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) error in the spectra.This matrix captures the standard deviation of the stochastic error at each frequency m as ffiffiffiffiffiffiffi ffi v m;m p as well as the dependency of stochastic error at different frequencies m and m 0 .Empirically, we found v to be approximately diagonal, see Fig. S2 in SI A.4

Fig. 3 .
Fig.3.Representative data processing by the SDM for orientation g x of compound 1.A: the estimated spectrum b I. B: the component x that is orthogonal to the estimated spectrum b I and contains no ENDOR signal, as expected.C, D: the real (black) and imaginary (red) parts of / and ŵ, respectively.A small phase and baseline drift is visible, particularly in the imaginary component.In A-D, 95% approximate pointwise confidence intervals are indicated as shaded regions; in D, these are so small as to be invisible.E: Kernel-density-estimation of the complex residuals b;m .F, G: histograms for the real and imaginary parts of the residuals, respectively.

Fig. 4 .
Fig. 4. A, B: Comparison of 263 GHz19 F Mims ENDOR spectra of compounds 1 and 2 at different orientations, respectively, estimated via the SDM (black) and the averaging method (blue).The chemical structures are indicated in the inset.Shaded areas correspond to approximate 95% pointwise confidence intervals obtained via bootstrap for the SDM (grey) and quasi-bootstrap for the averaging method (light blue).The dipolar splitting corresponding to the nitroxide-fluorine inter-spin distance is well visible for compound 1.For compound 2, only one dipolar splitting is well resolved, whereas the second, smaller splitting is partially suppressed by the central spectral hole of the Mims sequence.The resolved features of the dipolar splittings are indicated by colored asterisks for both F-atoms.The CS tensor leads to an asymmetry in the spectra and it cannot be evaluated visually.The MW frequency used for compound 1, orientation g y , differs from the one used for all other spectra resulting in a shift of the19 F resonance.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6 .
Fig. 6.A: The 19 F ENDOR spectra b Io;m (black) extracted with the SDM for all orientations.In different colors: the corresponding spectra Io;m ĥ simulated with the values indicated in the Table 4.The spectral residuals are plotted below each of the spectra using the same color.B: correlation matrix of the corresponding parameters (from top to 3. H. Wiechers, A. Kehl, M. Hiller et al.Journal of Magnetic Resonance 353 (2023) 107491 simulation, this corresponds to the failure to reproduce the m-fluorine peak in orientation g y , see panel A of Fig. S14 in SI G.3.

Fig. 8 .
Fig. 8.Estimated HF parameters for minima #1, #2, #3 and #4 (blue, orange, pink, yellow) for compound 2. A: ortho 19 F nucleus, B: meta 19 F nucleus.For minimum #1, the corresponding approximate 95% confidence regions are shown in light blue.DFT values are shown in green dotted lines and the T-value calculated from X-ray structure in red dotted lines from [26].
F nuclei, the spin operators Ŝ and Î19 F k can be replaced by the m S and m I tum numbers, where r zz19F k À Á and A zz 19 F k À Á are the scalar zzcomponents of the respective r 19 F k À Á and A 19 F k À Á tensors: