Expanding neutrino oscillation parameter measurements in NOvA using a Bayesian approach

NOvA is a long-baseline neutrino oscillation experiment that measures oscillations in charged-current $\nu_{\mu} \rightarrow \nu_{\mu}$ (disappearance) and $\nu_{\mu} \rightarrow \nu_{e}$ (appearance) channels, and their antineutrino counterparts, using neutrinos of energies around 2 GeV over a distance of 810 km. In this work we reanalyze the dataset first examined in our previous paper [Phys. Rev. D 106, 032004 (2022)] using an alternative statistical approach based on Bayesian Markov Chain Monte Carlo. We measure oscillation parameters consistent with the previous results. We also extend our inferences to include the first NOvA measurements of the reactor mixing angle $\theta_{13}$ and the Jarlskog invariant. We use these results to quantify the strength of our inferences about CP violation, as well as to examine the effects of constraints from short-baseline measurements of $\theta_{13}$ using antineutrinos from nuclear reactors when making NOvA measurements of $\theta_{23}$. Our long-baseline measurement of $\theta_{13}$ is also shown to be consistent with the reactor measurements, supporting the general applicability and robustness of the PMNS framework for neutrino oscillations.

Long-baseline (LBL) accelerator neutrino oscillation experiments measure oscillations in $\nu_\mu \rightarrow \nu_\mu$ (disappearance) and $\nu_\mu \rightarrow \nu_e$ (appearance) channels. These channels constrain the mixing angles $\theta_{13}$ and $\theta_{23}$, the mass-squared splitting $\Delta m^2_{32}$, and the CP-violating phase $\delta_{CP}$. Current measurements of $\theta_{23}$ are consistent with maximal mixing ($\theta_{23} = \pi/4$), which would suggest a $\mu$-$\tau$ symmetry in their mixing into the $\nu_3$ mass eigenstate; a nonmaximal value such as $\theta_{23} < \pi/4$ (lower octant, LO) or $\theta_{23} > \pi/4$ (upper octant, UO) would indicate a preferential coupling of $\nu_\tau$ or $\nu_\mu$, respectively, with $\nu_3$. Current experimental uncertainties on $\theta_{23}$ are the largest among the mixing angles [20]. LBL oscillation measurements, where neutrinos traverse significant quantities of matter, are also impacted by the coherent forward scattering of $\nu_e$s on electrons in the Earth [21]. This modifies the oscillation probabilities $P(\nu_\mu \rightarrow \nu_e)$ and $P(\bar\nu_\mu \rightarrow \bar\nu_e)$ with opposite signs. The direction of the resulting change in rate depends on whether $\nu_3$ is the heaviest neutrino state (Normal Ordering, NO) or the lightest (Inverted Ordering, IO). Current observations from LBL (and other) experiments prefer the NO, though the strength of the NO hypothesis depends on which measurements are included [22]. The value of $\delta_{CP}$ is currently essentially unknown; LBL experiments provide the only constraints, but they are weak.
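As a rough numerical illustration of the disappearance channel (not NOvA's analysis code), the leading two-flavor vacuum approximation can be sketched as follows; it neglects matter effects and the full three-flavor structure, and the parameter values are assumed for illustration only:

```python
import numpy as np

def p_numu_survival(e_gev, sin2_2theta23=1.0, dm2_32=2.4e-3, baseline_km=810.0):
    """Two-flavor vacuum approximation to P(nu_mu -> nu_mu).

    The factor 1.267 converts (eV^2 * km / GeV) into the oscillation phase.
    """
    phase = 1.267 * dm2_32 * baseline_km / e_gev
    return 1.0 - sin2_2theta23 * np.sin(phase) ** 2

# For an 810 km baseline and these illustrative parameters, the first
# oscillation maximum (full disappearance for maximal mixing) falls near
# 1.6 GeV, inside NOvA's 1-3 GeV narrow-band flux.
e_dip = 1.267 * 2.4e-3 * 810.0 / (np.pi / 2)
print(e_dip, p_numu_survival(e_dip))
```

This makes concrete why the off-axis beam energy is chosen where it is: the flux peak sits on top of the disappearance dip.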
NOvA is a long-baseline neutrino oscillation experiment that observes the $\nu_\mu$ disappearance and $\nu_e$ appearance channels using neutrinos of energies around 2 GeV over a distance of 810 km. Previously [23-29], we have presented NOvA constraints on $\Delta m^2_{32}$, $\sin^2\theta_{23}$, and $\delta_{CP}$ using a classical frequentist approach. However, the Feldman-Cousins technique that is required in order to obtain correct classical confidence regions for these variables [30,31] poses challenges when confronted with highly degenerate sets of parameters. It also does not allow for post hoc transformations of the variables considered in the analysis. In this work we present a new analysis of the dataset from Ref. [29], based on Bayesian Markov Chain Monte Carlo, which enables us to extend our inferences to include $\theta_{13}$ and the Jarlskog invariant $J$, for which $J \neq 0$ unambiguously indicates CP violation. We use these results to examine the implications of assuming short-baseline, nuclear-reactor antineutrino constraints on $\theta_{13}$ when making measurements of other oscillation parameters in NOvA.
We also investigate the consistency of the PMNS framework by comparing the constraint from reactor experiments with our long-baseline measurement of $\theta_{13}$.

II. 3-FLAVOR NEUTRINO OSCILLATIONS IN NOvA
In this paper, we reanalyze data collected from an exposure of $13.6\times10^{20}$ 14-kton-equivalent protons on target (POT) in the neutrino-enriched beam mode and $12.5\times10^{20}$ POT in the analogous antineutrino mode. The dataset, simulations, reconstruction, and estimation of systematic uncertainties remain unchanged for this analysis. A brief overview of these components follows, with detailed descriptions available in Ref. [29]. Extensive discussion of the new analysis method, its implementations, and the resulting inferences is presented in Sec. III.

A. The NOvA experiment
NOvA observes oscillations using neutrinos from the NuMI beamline [32] at Fermilab using two functionally identical tracking calorimeter detectors that differ primarily in size. The 0.3 kton near detector (ND), the smaller of the detectors, is located 1 km from the neutrino production target, 100 m underground. The far detector (FD), by contrast, is 14 kton and is located on the surface at Ash River, Minnesota, 810 km from the target. Both detectors are built from PVC cells of $3.9 \times 6.6$ cm$^2$ cross-sectional area and 3.9 m (ND) or 15.5 m (FD) length, which are arranged in alternating horizontal and vertical planes and filled with a mineral-oil-based liquid scintillator. A stack of alternating active planes and steel plates is placed behind the ND to range out muons, while a small rock overburden is placed above the FD to aid in rejecting the cosmic-ray background. The detectors are placed 14.6 mrad from the central axis of the beam to receive a narrow-band neutrino flux predominantly between 1 and 3 GeV.
We use groups of spatially and temporally proximate cells with activity above threshold to apply basic data quality and containment selection cuts to events in both data and simulation. Within these groups we assign vertices and reconstruct likely particle trajectories. Ultimately we divide the events into $\nu_\mu$ charged-current (CC), $\nu_e$ CC, neutral-current (NC), or cosmogenic background categories using NOvA's convolutional neural network (CNN)-based classifier [67]. Boosted decision trees (BDTs) are used to further reject cosmic backgrounds in the FD samples. Both sets of tags are utilized together to create $\nu_\mu$ CC and $\nu_e$ CC candidate samples. Fully contained $\nu_e$ candidates at the FD are further divided into high- and low-purity samples based on the CNN score in order to enhance the fit's ability to separate signal from background. To improve the statistical strength of the fit, we recover an additional sample of "peripheral" events that fail the containment cuts or cosmic rejection BDT but pass stricter particle ID requirements. We estimate the neutrino energy for $\nu_\mu$ CC events using the muon track length and the total deposited calorimetric energy of the hadronic system. The energy for $\nu_e$ CC events is estimated using a function of calorimetric energy that takes as input the energy of the event's reconstructed trajectories divided into electromagnetic and hadronic components, as identified by a separate CNN-based classifier [68].

C. Near-to-far extrapolation and systematics
Predictions for the neutrino event rates at the FD are constrained using the high-statistics neutrino interactions measured in the ND. These measurements imply corrections to the ND prediction, which we propagate to the FD by adjusting for the differing efficiency and flux between the detectors using simulations. We call this process "extrapolation." We apply oscillations to this data-driven prediction when comparing to FD data during inference (Sec. III B below).
The signal spectra for both the $\nu_\mu$ disappearance and $\nu_e$ appearance channels are predicted using the $\nu_\mu$ spectra at the ND. The near-to-far extrapolation for the $\nu_\mu$ disappearance samples at the FD is performed in quartiles of hadronic energy fraction ($f_\mathrm{had} = E_\mathrm{had}/E_\nu$, with $E_\nu$ being the reconstructed neutrino energy and $E_\mathrm{had}$ the reconstructed hadronic energy).
Extrapolating in the $f_\mathrm{had}$ bins has the effect of grouping events that share similar hadronic system characteristics so that compatible events are constrained together in the FD sample, despite the detectors' somewhat different acceptances. Performing the oscillation fit in these bins enhances sensitivity to the oscillation dip as a function of $E_\nu$, since we achieve a finer energy resolution for bins with smaller $f_\mathrm{had}$. We further subdivide the near-to-far extrapolation into three bins of reconstructed transverse momentum $p_T$ of the outgoing charged lepton for both appearance and disappearance channels. Like the $f_\mathrm{had}$ subdivision, this allows us to better match the constraints from the much smaller ND to the FD, in this case adjusting for the differing containment of events with leptons that emerge at large angles relative to the beam direction (i.e., large $p_T$). The predictions in $p_T$ bins are summed prior to the oscillation fit. We constrain the small beam backgrounds in the $\nu_e$ appearance channels with a similar procedure based on ND $\nu_e$ candidates after first decomposing them into NC, $\nu_\mu$ CC, and intrinsic $\nu_e$ categories using data-driven constraints. ($\bar\nu_e$ beam backgrounds are all constrained together rather than being decomposed this way.) Cosmic backgrounds for all FD samples are determined from dedicated FD cosmic data samples. The remaining minor backgrounds are estimated from simulation. More detail on these procedures can be found in our previous paper [29].
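In highly simplified, binned form, the extrapolation idea amounts to multiplying the simulated far/near ratio by the ND data in each analysis bin. The sketch below is a toy with made-up numbers and names of our own choosing, not NOvA's actual binning or code:

```python
import numpy as np

# Toy binned event counts. Real NOvA bins are in reconstructed energy,
# further split by hadronic energy fraction and lepton pT (assumed here).
nd_data = np.array([1050.0, 2100.0, 1580.0, 640.0])  # ND observed counts
nd_mc   = np.array([1000.0, 2000.0, 1500.0, 600.0])  # ND simulation
fd_mc   = np.array([   2.1,    4.0,    3.2,   1.3])  # FD simulation (unoscillated)

# Extrapolation: scale the ND data by the simulated FD/ND ratio per bin,
# which corrects the FD prediction by the observed ND data/MC discrepancy.
fd_pred = nd_data * (fd_mc / nd_mc)
print(fd_pred)
```

Oscillation weights would then be applied to this data-driven FD prediction before comparing with FD data, as described in the text.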
We evaluate the impact of systematic uncertainties on the analysis by repredicting the sample spectra described above with altered parameters in the simulation. Uncertainties in the neutrino flux and interaction model are treated using event reweighting. Uncertainties in the detector calibration and the custom modeling of light in the detectors, on the other hand, must be fully resimulated. The 67 systematically shifted simulations that arise from these techniques are extrapolated to the FD using ND data via the same process as above, which constrains the impact of the uncertainties on the predicted spectra. Parameters corresponding to these uncertainties are marginalized over in the oscillation inference in Sec. III.
Figure 1 shows the reconstructed energy spectra of the data observed at the FD using the $\nu_e$ and $\nu_\mu$ CC selections described in Sec. II B, separated by the neutrino-enriched and antineutrino-enriched beam modes. Overlaid on these plots are bands of FD predictions produced according to the extrapolation procedure above. These illustrate the spectra predicted using the 68.3%, 95.4%, and 99.7% highest-probability values of the systematics just described and of the relevant oscillation parameters, determined using the MCMC algorithms that will be detailed in the next section.
A Poisson likelihood [20] computed over the bins between the FD data and predictions such as those shown here forms the likelihood used in Bayes' theorem below.
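The binned Poisson likelihood can be sketched with the standard likelihood-ratio form (the Baker-Cousins convention); the inputs below are toy values of ours, not NOvA spectra:

```python
import numpy as np

def neg2_log_poisson(n_obs, mu):
    """-2 ln(likelihood ratio) for binned Poisson data.

    Per bin: 2 * [mu - n + n * ln(n / mu)]; bins with n == 0 contribute 2 * mu.
    A perfect prediction (mu == n in every bin) gives exactly zero.
    """
    n_obs = np.asarray(n_obs, dtype=float)
    mu = np.asarray(mu, dtype=float)
    terms = mu - n_obs
    nonzero = n_obs > 0
    terms[nonzero] += n_obs[nonzero] * np.log(n_obs[nonzero] / mu[nonzero])
    return 2.0 * terms.sum()

print(neg2_log_poisson([5, 0, 12], [5.0, 0.5, 12.0]))  # 1.0: only the empty bin contributes
```

In the analysis this quantity, summed over all FD analysis bins, plays the role of $-2\ln(\mathrm{likelihood})$ in Bayes' theorem.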

III. OSCILLATION INFERENCES USING MARKOV CHAIN MONTE CARLO
We derive posterior probability distributions for the relevant oscillation parameters using Bayes' theorem [69]. Marginalizing away the nuisance parameters of our model, which include the tens of systematic uncertainties described in Sec. II C, is a challenging problem because it requires an integral over many dimensions. We therefore turn to a Monte Carlo method for computing the posterior: Markov Chain Monte Carlo (MCMC). In MCMC, we draw a collection of sequential samples from the posterior with frequency proportional to the posterior density. Histograms that approximate the posterior shape (with accuracy governed by the sample count) may be computed in any variable(s) of interest using these samples. In so doing, any dimensions not explicitly summed over are implicitly marginalized. Numerous algorithms for obtaining MCMC samples exist; we have implemented two for this analysis. The conclusions obtained from them agree with one another. Though descriptions of both methods are readily found in the literature, there are certain implementation choices that must be made for each, which we discuss in Sec. III A. Sec. III B lays out our resulting inferences on the parameters.
FIG. 1: Reconstructed neutrino energy distribution of selected data events (black crosses) in FD $\nu_e$ CC samples (top) and FD $\nu_\mu$ CC samples (bottom) in neutrino-enriched beam mode (left) and antineutrino-enriched beam mode (right). The colored bands correspond to the range of 68.3% (darkest), 95.4%, and 99.7% (lightest) of the extrapolated FD spectra produced using the combinations of the oscillation and systematic parameters sampled by our MCMC algorithms (illustrating the posterior distributions described further in Sec. III B). The FD $\nu_e$ samples are divided into bins of low (I) and high (II) particle ID confidence as well as the peripheral (III) sample discussed in Sec. II B. The four $f_\mathrm{had}$ quartile $\nu_\mu$ subsamples have been combined together in each of the lower two plots.

The traditional MCMC algorithm, the Metropolis-Rosenbluth-Rosenbluth-Teller-Teller (MR$^2$T$^2$) method, is straightforward. We call our implementation ARIA, in honor of Arianna Rosenbluth, who first implemented the method in machine code [75]. Proceeding from an initial seed in the parameter space, subsequent samples are selected by proposing a jump to a new set of coordinates, and accepting or rejecting that proposal according to an acceptance rule [20]. This process is repeated until a sufficient number of samples has been collected; there is no explicit stopping criterion. The method also does not specify the distribution to be used in the proposal algorithm. In our implementation, we use the most common choice, a multivariate Gaussian. We determine the characteristic length scale of the Gaussian empirically in order to optimize sampling efficiency (see App. A). We also "thin" the resulting chains to reduce autocorrelations among the samples (see App. B). Our ARIA results below have $5\times10^5$ effective samples.
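A minimal sketch of an MR$^2$T$^2$ update with a Gaussian proposal and thinning follows. ARIA itself is a full production implementation; everything here, including the toy target and tuning constants, is illustrative only:

```python
import numpy as np

def mr2t2(log_post, x0, n_samples, step=0.5, thin=10, seed=0):
    """Random-walk Metropolis sampling with a Gaussian proposal.

    Keeps every `thin`-th state to reduce autocorrelation among samples.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    out = []
    for i in range(n_samples * thin):
        prop = x + step * rng.standard_normal(x.shape)  # Gaussian proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:        # acceptance rule
            x, lp = prop, lp_prop
        if (i + 1) % thin == 0:                          # thinning
            out.append(x.copy())
    return np.array(out)

# Toy target: a one-dimensional unit Gaussian posterior.
samples = mr2t2(lambda x: -0.5 * np.sum(x**2), np.zeros(1), 20000)
print(samples.mean(), samples.std())
```

The proposal `step` plays the role of the empirically tuned length scale described in the text: too small and the chain diffuses slowly, too large and most proposals are rejected.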

"Stan:" Hamiltonian MCMC
Though the MR$^2$T$^2$ method proposes samples quickly, they are typically highly autocorrelated. Its sampling proposals can also be inefficient if the posterior is sharply concentrated. Other MCMC methods have been developed to address these shortcomings, including one called "Hamiltonian" MCMC inference (HMCMC). We implemented a C++ interface to the Stan modeling platform [76] to obtain HMCMC samples.
The main difference between HMCMC and MR$^2$T$^2$ is how proposals are generated. Rather than proposing randomly, HMCMC views the posterior surface as a topographical one that can be explored by a fictitious particle. Samples correspond to trajectories under the influence of a gravitational potential whose gradient corresponds to that of higher posterior density. Endowing the particle with an initial momentum that counterbalances the centripetal force from gravitation produces stable trajectories that traverse the highest-density region of posterior space [77]. HMCMC does this by numerically integrating Hamilton's equations for the fictitious particle system, with position coordinates $\vec{q}$ (which correspond to the parameters of interest), momentum coordinates $\vec{p}$, and a Hamiltonian $H = -\log(\mathrm{posterior})$. This approach produces samples that are nearly uncorrelated without thinning, at the expense of additional computing cycles to compute the gradient of the posterior. (Compare the $7\times10^7$ Stan samples we obtained, which require no thinning, to the $5\times10^5$ effective samples mentioned above for ARIA that remain after thinning.) We find Stan's default choices of the sampling distribution for the pseudoparticle kinetic energies and the integration stopping condition to be sufficient for our needs (see App. C).
Its topographical nature means HMCMC is ill-suited to parameters that assume only one of a discrete set of values, which would manifest as discontinuities in the trajectories considered. This presents a difficulty in neutrino oscillation parameter inference, where the absolute value of $\Delta m^2_{32}$ is known with relatively good precision, but its sign remains an important unknown. While it is possible to allow HMCMC to explore the entire range of $\Delta m^2_{32}$, we find in practice that this results in poor exploration, as few trajectories manage to "jump" across the wide disfavored region at low $|\Delta m^2_{32}|$. Instead, at the end of each trajectory determination, we introduce a separate MR$^2$T$^2$-like step which considers the possibility of changing the sign of $\Delta m^2_{32}$ according to the prior probability chosen for it (50%, i.e., a uniform prior; see Sec. III A 3 below). If the acceptance ratio between the two mass orderings satisfies the MR$^2$T$^2$ criteria, the proposed sign is retained; if not, it is reverted to its previous value.

Choices of prior
In Bayesian inference, the posterior probabilities are influenced by the choice of prior probability densities, which encode assumptions made about the parameters before the data is examined. If the data used for a measurement is sufficient, its constraint on the posterior will overwhelm the prior, and the prior choice is unimportant. However, when data is sparse, the choice of prior may affect the result. The priors we choose differ according to the parameters being considered.
a. Parameters of interest: $|\Delta m^2_{32}|$, $\mathrm{sgn}(\Delta m^2_{32})$, $\sin^2\theta_{23}$. We prefer to use "uninformed" priors, which do not favor any particular value, for the physics parameters we intend to measure directly. In practice, this usually amounts to a prior uniform in the variable in question. However, uniformity is not preserved under a change of variable: for instance, a prior uniform in $\theta_{23}$ is not uniform in the measured variable $\sin^2\theta_{23}$. In our results below, we have studied the impact of priors uniform both in a particular variable and in relevant functions of it, and we report when the prior choice significantly affects the results.
b. Parameter of interest: $\delta_{CP}$. While $\delta_{CP}$ is intended to have a uniform prior as well, it receives additional special treatment. Its cyclical nature as the phase of a complex number results in an infinite set of values having identical consistency with the data. This can cause MCMC samplers never to converge on a single value for $\delta_{CP}$ and consequently to sample much more slowly. To combat this problem, we developed a novel special prior over $\delta_{CP}$:

$\pi(\delta_{CP}) = \cos^2\left[(\delta_{CP} - \pi)/4\right]$ for $-1 \le \delta_{CP}/\pi \le 3$, and $\pi(\delta_{CP}) = 0$ otherwise. (1)

This function is illustrated in Fig. 2. This prior forces the samplers to remain near a single phase of $\delta_{CP}$, $0 \le \delta_{CP}/\pi \le 2$, as the prior vanishes outside of [-1, 3]. Because it makes the transition to the vanishing regions in a differentiable manner, this prior is suitable for use with HMCMC. Moreover, the sum of the prior's value at every point $0 \le \delta_{CP}/\pi \le 2$ with that of its aliases outside that interval, $\delta_{CP} \pm 2\pi$, is a value that does not depend on $\delta_{CP}$:

$\pi(\delta_{CP}) + \pi(\delta_{CP} - 2\pi) + \pi(\delta_{CP} + 2\pi) = 1.$

This implies that the prior behaves identically to a uniform prior when used in conjunction with the oscillation probability. For the rest of this paper, whenever we refer to a "uniform" prior in $\delta_{CP}$, we mean this prior. When in the subsequent sections we study the effect of choosing a prior uniform in $\delta_{CP}$ vs. $\sin(\delta_{CP})$, we reweight the MCMC samples obtained with the prior above, taking the Jacobian factor $\partial(\sin\delta_{CP})/\partial\delta_{CP} = \cos(\delta_{CP})$ as the weight.
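One function with the stated properties, vanishing differentiably outside $-1 \le \delta_{CP}/\pi \le 3$ with a $\delta_{CP}$-independent wrapped sum, is a $\cos^2$ taper. The following numerical check of that candidate form (assumed here for illustration) verifies the constant-sum property:

```python
import numpy as np

def prior(delta):
    """cos^2 taper: nonzero only on [-pi, 3*pi], smooth at the boundaries."""
    delta = np.asarray(delta, dtype=float)
    inside = (delta >= -np.pi) & (delta <= 3 * np.pi)
    return np.where(inside, np.cos((delta - np.pi) / 4.0) ** 2, 0.0)

# Over one physical period [0, 2*pi], the prior plus its aliases at
# delta +/- 2*pi should sum to a constant, mimicking a uniform prior.
delta = np.linspace(0.0, 2 * np.pi, 1001)
wrapped = prior(delta) + prior(delta - 2 * np.pi) + prior(delta + 2 * np.pi)
print(wrapped.min(), wrapped.max())  # both 1.0
```

This is why the taper can be treated as "uniform" in practice: any sample landing outside the central period contributes exactly the probability mass that the central period is missing.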
c. Parameter of interest: $\sin^2 2\theta_{13}$. As discussed further in Sec. III B below, we consider two separate cases for $\sin^2 2\theta_{13}$: one where its prior is treated identically to that of $\sin^2\theta_{23}$ (see Sec. III A 3 a), and one where measurements from short-baseline reactor antineutrino oscillations are applied as a constraint. In the latter case, we impose a Gaussian prior with central value and standard deviation obtained from the 2019 world average of reactor measurements [78], analogous to the treatment in our most recent frequentist result [29]: $\sin^2 2\theta_{13} = 0.085 \pm 0.003$.
d. Systematic uncertainties. For all systematic uncertainties we use a unit Gaussian prior.

B. Oscillation inferences: results and discussion
In this section we describe our inferences regarding the PMNS neutrino oscillation parameters, given the data, the model, and the Bayesian methodology described above. Two separate sets of results are obtained. The first uses a Gaussian prior on $\sin^2 2\theta_{13}$, imposing a constraint from reactor antineutrino experiments (Secs. III B 2-III B 3). The second uses a prior uniform in $\sin^2 2\theta_{13}$, yielding results constrained only by the NOvA data (Sec. III B 4). Samples obtained from ARIA and Stan produce essentially identical distributions in all the variables considered (App. D).

Goodness of fit
To evaluate the goodness of fit for this Bayesian analysis we use posterior predictive p-values (PPP) [79]. In a PPP test, the coordinates from each MCMC sample are used to form a prediction for the observed spectra. A $\chi^2$ statistic, which we denote $\chi^2_\mathrm{data}$, is computed between each prediction and the data. A second prediction is made for each sample by applying Poisson fluctuations to the prediction above, and a second statistic, $\chi^2_\mathrm{pseudodata}$, is calculated between this pseudodata and the original prediction. Because the data spectra are unchanged in this process, $\chi^2_\mathrm{data}$ incorporates only variations in the oscillation parameters and systematic uncertainties. Conversely, since the pseudodata distributions have Poisson fluctuations applied but use the same oscillation parameters and systematic uncertainty pulls as the base model for each MCMC sample, $\chi^2_\mathrm{pseudodata}$ reflects only statistical uncertainties.
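Schematically, the PPP construction looks like the following toy, with a single four-bin spectrum and Gaussian-jittered stand-ins for the per-sample MCMC predictions (all numbers are ours, for illustration):

```python
import numpy as np

def chi2(n, mu):
    """Binned Poisson likelihood-ratio chi-square (bins with n == 0 give 2*mu)."""
    n = np.asarray(n, dtype=float)
    mu = np.asarray(mu, dtype=float)
    t = mu - n
    nz = n > 0
    t[nz] += n[nz] * np.log(n[nz] / mu[nz])
    return 2.0 * t.sum()

rng = np.random.default_rng(2)
truth = np.array([40.0, 85.0, 60.0, 25.0])   # toy true spectrum
data = rng.poisson(truth)                    # one observed dataset

chi2_data, chi2_pseudo = [], []
for _ in range(5000):                        # stand-ins for MCMC samples
    mu = truth * rng.normal(1.0, 0.05, size=truth.shape)  # jittered prediction
    pseudo = rng.poisson(mu)                 # Poisson-fluctuated copy
    chi2_data.append(chi2(data, mu))         # model + systematic variation only
    chi2_pseudo.append(chi2(pseudo, mu))     # statistical variation only

ppp = np.mean(np.array(chi2_pseudo) > np.array(chi2_data))
print(ppp)  # fraction above the diagonal in the (chi2_data, chi2_pseudo) plane
```

A model that describes the data well yields a PPP near 0.5 on average over data realizations, which is the benchmark quoted in the text.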
The distribution of these ($\chi^2_\mathrm{data}$, $\chi^2_\mathrm{pseudodata}$) pairs for the entire ensemble of spectra considered in Fig. 1 is shown as the purple shading in Fig. 3. The PPP then consists of the fraction of points in this ensemble that lie above the $\chi^2_\mathrm{data} = \chi^2_\mathrm{pseudodata}$ line. In the limit of infinite MCMC samples, a model that perfectly describes the data apart from statistical variations will produce a PPP of 0.5. We observe that the shaded distribution in Fig. 3 is distributed evenly around 1 unit of $\chi^2$ per degree of freedom on both axes, and the PPP we obtain is 0.56. Both of these imply that the model is a good representation of the data.
We also find good p-values for the $\nu_\mu$ and $\bar\nu_\mu$ samples considered independently, whose 68.3% credible regions are indicated by the dashed light blue and light red contours in Fig. 3, respectively. By contrast, the effect of fluctuations in the smaller-statistics samples can be seen more readily in the analogous $\nu_e$ (solid dark blue) and $\bar\nu_e$ (solid red) contours. Both contours are much larger than their $\nu_\mu$ counterparts, particularly along the y-axis, which corresponds to the dimension where statistical uncertainties are considered. In the $\nu_e$ contour, the offset downwards from unity along the x-axis, and the corresponding shift above the diagonal along the y-axis, together suggest a set of fluctuations that are relatively close to the Asimov (unfluctuated) model prediction. Each member of the ensemble of pseudodata spectra thus typically has a larger $\chi^2$ relative to the Asimov prediction than the data do. This relative closeness of the $\nu_e$ prediction to the data can also be seen in the top left panel of Fig. 1. The unusual shape of this contour arises because the PPP distribution for $\nu_e$ consists of two modes superimposed upon one another, corresponding to the high-probability regions within the NO that will be discussed below in Fig. 7. On the other hand, the situation is reversed for the $\bar\nu_e$ spectra, where the data fluctuations result in larger deviations from the Asimov prediction than the bulk of the pseudodata spectra. The fact that combining all four subsamples together produces a PPP that is closer to 0.5 than any of them are individually is good evidence that the deviations from a PPP of 0.5 in the various subsamples are predominantly driven by statistical, rather than systematic, effects. We conclude, therefore, that our set of MCMC samples reflects PMNS parameters that provide a good description of the physics exhibited in our data.

FIG. 3: [...] and $\bar\nu_\mu$ (light red, right dashed) data samples. The posterior predictive p-value is the fraction of the distribution that lies above the diagonal $\chi^2_\mathrm{data} = \chi^2_\mathrm{pseudodata}$ line (black). The fact that the purple distribution is centered at (1, 1) and is evenly distributed above and below the diagonal, with the associated p-value of 0.56, indicates a good fit.

PMNS parameter measurements
We produced Bayesian credible regions for the PMNS neutrino oscillation parameters using the data spectra and the MCMC samplers described in Secs. II C and III A, respectively. These credible regions, together with the posterior probability distributions, are shown in Figs. 4 and 7. In both cases the reactor $\theta_{13}$ constraint used is the 2019 PDG combination of extant measurements [78], applied in the form of a Gaussian prior on $\sin^2 2\theta_{13}$, as explained in Sec. III A 3 c. In the figures in this section, credible intervals that show the normal and inverted mass orderings separately are created by first making one shared credible interval that spans both. The separate panels then display the relevant regions from this shared interval that apply to the specified ordering. By constructing them in this manner, we ensure the NO and IO intervals share a highest-posterior-density point and posterior probability distribution, preserving any NOvA preference towards one of the mass orderings in the credible region limits. Similarly, the distributions and intervals labeled "both orderings" are created from MCMC samples over all values of $\Delta m^2_{32}$, summing together the normal and inverted ordering posteriors before extracting the credible intervals.
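Highest-posterior-density regions of this kind can be built by ranking histogram bins by density and accumulating until the desired mass is enclosed. The sketch below (binning and toy posterior are ours) also illustrates the shared-interval construction across a sign-split parameter, where bins from both orderings compete for inclusion:

```python
import numpy as np

def hpd_mask(weights, cred=0.683):
    """Boolean mask selecting the smallest set of bins holding `cred` of the mass."""
    order = np.argsort(weights)[::-1]                    # densest bins first
    csum = np.cumsum(weights[order]) / weights.sum()
    keep = order[: np.searchsorted(csum, cred) + 1]      # stop once mass reached
    mask = np.zeros(weights.shape, dtype=bool)
    mask[keep] = True
    return mask

rng = np.random.default_rng(3)
# Toy posterior for dm2_32: 70% of the mass in a "normal ordering" mode
# and 30% in an "inverted ordering" mode (values illustrative only).
n = 100000
dm2 = np.where(rng.uniform(size=n) < 0.7,
               rng.normal(+2.4e-3, 6e-5, n),
               rng.normal(-2.4e-3, 6e-5, n))
counts, edges = np.histogram(dm2, bins=200, range=(-3e-3, 3e-3))
mask = hpd_mask(counts.astype(float))
print(counts[mask].sum() / counts.sum())  # just above 0.683
```

Because a single ranking spans both signs, the resulting NO and IO panels automatically share the same density threshold, preserving any ordering preference, as described in the text.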
Table I shows the highest posterior probability points together with the credible intervals enclosing 68.3% of the posterior (henceforth called the 1σ contour in analogy with Gaussian p-values in a frequentist context). These points are given for all the PMNS oscillation parameters of interest, split into both, normal, and inverted mass orderings (using the same methodology regarding the mass ordering as for the figures). For some of the parameters the 1σ region spans disjoint areas; we denote this with a union symbol ∪. These high posterior probability regions are in generally good agreement with the frequentist analysis of the same dataset [29].

FIG. 4: [...] and the inverted (bottom) mass orderings (marginalization over the mass orderings is explained at the beginning of Sec. III B 2). Contours indicate regions enclosing 68.3%, 95.5%, and 99.7% of the posterior probability, and for convenience are labeled in terms of Gaussian standard deviations σ using the corresponding traditional z-scores (i.e., the number of standard deviations away from the central value).
Figure 4 shows the $\sin^2\theta_{23}$-$\Delta m^2_{32}$ plane, where the denser MCMC samples (darker color) and larger credible regions in the upper panel relative to the lower indicate a mild preference for the normal ordering. This conclusion holds in the presence of the entire systematic uncertainty model discussed in Sec. II C. However, the preferred regions in each mass ordering depend in a nontrivial way on the systematic uncertainties, as illustrated in Fig. 5. The most significant effect is in $\Delta m^2_{32}$, where the systematic effects not only broaden the preferred region but also shift the central value to larger absolute magnitudes. The most important of the uncertainties contributing to this movement is in the absolute calibration of the calorimetric energy scale. Because this directly affects reconstructed neutrino energies, it shifts the expected number of events in the trough of the $\nu_\mu$ and $\bar\nu_\mu$ disappearance spectra and is thus anticorrelated with $\Delta m^2_{32}$, as can be seen in Fig. 6. Shifting $\Delta m^2_{32}$ in this way also moves $\sin^2\theta_{23}$ closer to maximal disappearance ($\sin^2\theta_{23}^\mathrm{MD} \approx 0.51$; the value depends slightly on $\sin^2\theta_{13}$). Because the $\nu_\mu$ disappearance spectra are essentially identical for values reflected across the maximal-disappearance line, the credible regions are nearly symmetric around it (the degeneracy is broken only by the $\nu_e$ appearance spectra, which have less statistical power). Thus, the credible regions for smaller $|\sin^2\theta_{23} - \sin^2\theta_{23}^\mathrm{MD}|$ appear narrower, even though the sensitivity is unchanged. This effect is responsible for the apparently slightly tighter constraint on $\sin^2\theta_{23}$ observed in Fig. 5 under the effect of systematic uncertainties.
The situation is different in the $\delta_{CP}$-$\sin^2\theta_{23}$ plane, which is shown in Fig. 7. Here, we observe preferences for CP-nonconserving values of $\delta_{CP}$ (i.e., nonintegral values of $\delta_{CP}/\pi$) in both normal and inverted orderings, and for the upper octant ($\sin^2\theta_{23} > 0.5$) of $\theta_{23}$ (reflected also in Fig. 4), though the CP-conserving points are only weakly disfavored in the NO. However, in contrast to the conclusions for $\sin^2\theta_{23}$-$\Delta m^2_{32}$, these inferences are minimally affected by the presence of systematic uncertainties, as Fig. 8 makes clear, even if the resolution does degrade slightly and the credible regions grow. Here the minimal impact of systematic uncertainties owes primarily to the smaller statistics of the $\nu_e$ and especially $\bar\nu_e$ appearance samples that drive the sensitivity to these parameters, which results in statistical uncertainties dominating the uncertainty budget.
To more quantitatively assess the mass ordering and octant preferences, we give the posterior probabilities inferred for each combination of hypotheses in Table II. We also express them in a less prior-dependent way using Bayes factors. In both cases the evidence for either option is weak; the analogous Gaussian p-values also correspond to significances of less than 1σ. (The reader unfamiliar with the interpretation of Bayes factors is referred to the standard treatment of Jeffreys [82].) The interpretations of the octant and the mass ordering hypothesis preferences are in good agreement with the 2020 frequentist analysis of the same dataset, which used profiling with Feldman-Cousins corrections instead of marginalization [29,31]. This general agreement also holds for the parameters' intervals, with small differences expected given the differing statistical methods used. To examine the CP-conservation situation more comprehensively, taking the whole PMNS matrix into consideration, we use the Jarlskog invariant measure, explored in the next section.

CP violation - Jarlskog invariant
The Jarlskog invariant is a measure of the strength of charge-parity violation that is independent of how the mixing matrix is parameterized. The neutrino-mixing variant [84] parallels its development in the quark sector [85]. Under the three-neutrino-flavor assumption, it takes the form

$J = \frac{1}{8} \sin(2\theta_{12}) \sin(2\theta_{13}) \sin(2\theta_{23}) \cos(\theta_{13}) \sin(\delta_{CP}),$

where if any of the factors is zero, the invariant vanishes and the neutrino mixing matrix is CP-conserving. Nonzero values of $J$, on the other hand, indicate CP violation. We produce this measurement by taking the MCMC chain with all the oscillation parameters' values and calculating the Jarlskog invariant at each MCMC step. As NOvA is not sensitive to the value of $\theta_{12}$, we use a Gaussian prior on $\sin^2\theta_{12}$, analogous to the one used for the reactor constraint (Sec. III A 3 c), with central value and width taken from the PDG's 2019 average of solar and LBL reactor neutrino measurements: $\sin^2\theta_{12} = 0.307 \pm 0.013$ [78], the same value used in our previous results.
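Computing $J$ per MCMC step amounts to evaluating the standard formula at each sampled parameter set, with $\sin^2\theta_{12}$ drawn from its external prior. The sketch below uses Gaussian toy samples of our own choosing as stand-ins for the real chain:

```python
import numpy as np

def jarlskog(s2_12, s2_13, s2_23, delta_cp):
    """J = (1/8) sin(2*t12) sin(2*t13) sin(2*t23) cos(t13) sin(delta_cp)."""
    t12, t13, t23 = (np.arcsin(np.sqrt(s)) for s in (s2_12, s2_13, s2_23))
    return (0.125 * np.sin(2 * t12) * np.sin(2 * t13) * np.sin(2 * t23)
            * np.cos(t13) * np.sin(delta_cp))

rng = np.random.default_rng(4)
n = 100000
# Toy stand-ins for MCMC samples of the NOvA-constrained parameters
# (widths and central values are illustrative, not fit results).
s2_23 = rng.normal(0.56, 0.02, n)
s2_13 = rng.normal(0.022, 0.001, n)        # ~reactor-constrained scale
delta = rng.uniform(0.0, 2 * np.pi, n)
# theta_12 is not constrained by NOvA: draw from the external prior.
s2_12 = rng.normal(0.307, 0.013, n)

j = jarlskog(s2_12, s2_13, s2_23, delta)
print(j.min(), j.max())  # spans roughly +/- 0.033 for these toy inputs
```

Histogramming `j` then gives the marginal posterior for the invariant, with all other parameters implicitly marginalized, exactly as in the histogram-based marginalization described in Sec. III.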
As described in Sec. III A 3, we use uninformed priors for the oscillation parameters where possible, meaning that the priors are uniform in the variable that is being shown. Since the Jarlskog invariant is written in terms of $\sin\delta_{CP}$, the natural formulation is in terms of a prior uniform in $\sin\delta_{CP}$. At the same time, there are theoretical considerations that suggest a prior uniform in $\delta_{CP}$ may be more appropriate [86]. We therefore consider both a prior uniform in $\sin\delta_{CP}$ and one uniform in $\delta_{CP}$. Priors on the other oscillation parameters were also investigated; these showed very weak effects on the $J$ posterior because the NOvA data constrain the other PMNS parameters far more strongly than $\delta_{CP}$.
Figure 9 shows the inferred values of the Jarlskog invariant extracted from a fit to the NOvA data with the external constraint on θ 13 from the reactor experiments, marginalized over the normal (top) and inverted (bottom) mass orderings. The regions most favored by the data tend towards CP violation in the NO, although the 1 σ intervals lie very close to J = 0. The IO's preferred regions are more markedly distant from CP conservation, particularly for the prior uniform in δ CP . As expected, these trends mirror those observed when considering CP-conserving and CP-violating values of δ CP in Sec. III B 2.
A more quantitative way of measuring the NOvA preference for CP conservation with the Jarlskog invariant is through Bayes factors. These can be calculated with the Savage-Dickey density ratio method, which computes Bayes factors for point hypotheses [87,88]. Using this method we can calculate the Bayes factor for the CP-conserving value J = 0, nested under the unconstrained hypothesis where J can take any value. Bayes factors always depend formally on the choice of prior, but in many circumstances (such as those discussed above), if two hypotheses being compared use the same prior, it will cancel. This is not the case in the Savage-Dickey method. Therefore, we compute the Bayes factor for J = 0 under both priors for δ CP considered above. We then invert it to obtain the associated Bayes factors for CP violation over CP conservation, J ≠ 0. Table III shows the Bayes factors for CP violation against CP conservation for normal, inverted, and both mass orderings, calculated for priors uniform in sin δ CP or δ CP . All of these point towards a preference for CP nonconservation, although the preferences are minimal regardless of the δ CP prior or the mass ordering (an analogous frequentist p-value would indicate significances of less than 1 σ, apart from the inverted ordering, where they range over 1.1 − 1.2 σ).

FIG. 10: Posterior probability density (purple shading)-that is, the probability that a given (ν e , νe ) pair of counts is the true underlying value in nature (see text)-compared to the measurement and associated expected statistical variance (black cross) of the total number of νe candidates vs. ν e candidates. The black solid contours indicate the regions enclosing 68.3% of the posterior. Each panel overlays a set of ellipses corresponding to the predicted event counts over the range of possible values of δ CP , with the parameter notated nearby held at that value (blue for NO, red for IO, with varying shading depending on the parameter value) and other parameter values as given in Table IV. The triangular markers show the highest posterior density in the ν e -νe space when restricting to the NO (blue) and IO (red) hypotheses. The posteriors shown here employ a uniform prior over sin 2 2θ 13 .
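The Savage-Dickey ratio described above amounts to comparing the prior and posterior densities at the point J = 0. A minimal sketch, with toy Gaussian samples standing in for the actual prior and posterior chains and a deliberately crude window-based density estimate:

```python
import numpy as np

def density_at_zero(samples, half_width=0.05):
    """Crude density estimate at 0: fraction of samples falling in a
    narrow window around 0, divided by the window width."""
    samples = np.asarray(samples, float)
    return np.mean(np.abs(samples) < half_width) / (2.0 * half_width)

def savage_dickey_bf10(prior_samples, posterior_samples):
    """Bayes factor for the unconstrained hypothesis (J != 0) over the
    nested point hypothesis (J = 0): the Savage-Dickey ratio of prior to
    posterior density at the point, inverted relative to BF(J = 0)."""
    return density_at_zero(prior_samples) / density_at_zero(posterior_samples)

# Toy demonstration: a broad prior on J and a posterior pulled away from 0.
rng = np.random.default_rng(42)
prior = rng.normal(0.0, 1.0, 200_000)
posterior = rng.normal(1.0, 0.5, 200_000)
bf10 = savage_dickey_bf10(prior, posterior)
```

In practice a smoother density estimator (e.g. a kernel density estimate) would be used at J = 0, but the ratio structure is the same: the further the posterior mass moves away from the nested point, the larger the Bayes factor for CP violation.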

Using only NOvA constraints on θ 13
NOvA's oscillation measurements simultaneously constrain a combination of mixing angles (θ 23 , θ 13 ), the mass-squared splitting ∆m 2  32 , the CP phase δ CP , and the neutrino mass ordering. Therefore, applying a strong external 1D constraint on θ 13 from reactor antineutrino experiments-and thereby reducing the available solution space-increases the sensitivity to the other parameters. This is especially true of the octant of θ 23 , as we show below. However, using an uninformed (uniform) prior in sin 2 2θ 13 enables us to directly compare a NOvA-only measurement against that of the reactor experiments. In so doing we can examine the robustness of the PMNS description across short-baseline reactor antineutrino experiments and long-baseline accelerator-based oscillation experiments. We may also study whether our preferences change in the presence of an external constraint on the data.
Without an external constraint on sin 2 2θ 13 , the relevant degrees of freedom present a complex space with many intercorrelations among the parameters and numerous possible solutions. We can use the type of posterior distributions shown in Fig. 1 to explore these possibilities: for example, by computing the total number of predicted ν e and νe candidates for the parameters of each MCMC sample and comparing the distribution of these predictions in (ν e , νe ) candidate space to what we observe in the data. This is shown in Fig. 10. To guide our intuition, we overlay ellipses corresponding to the ranges of predicted numbers of events for the possible values of δ CP at fixed values of the other parameters, subject to the constraint from our data in that the highest posterior density (HPD) value in each ordering is used for parameters not explicitly varied. We note several important features. First, the highest-density (highest-probability) region lies essentially equally along the overlap region between the NO and IO ellipses in all panels, meaning we will find good solutions for either, given an appropriate value of δ CP . However, the NO ellipses subtend more of the posterior region, which results in a slight overall preference for NO. Second, it is clear from the left panel that the value of sin 2 2θ 13 preferred by the reactor average, sin 2 2θ 13 = 0.085, is very close to the highest-probability region for NOvA. Third, comparing the left and center panels, there is an obvious degeneracy between sin 2 2θ 13 and sin 2 θ 23 , since independently varying them produces similar effects on the predictions. And finally, the right panel demonstrates that the value of ∆m 2  32 plays a minor (though not negligible) role in the appearance channel; its primary constraints arise from the ν µ disappearance measurement. These features will be examined more quantitatively in the distributions that follow.

Table IV shows the highest posterior probability points together with the 1 σ credible intervals for all the PMNS oscillation parameters of interest, extracted from the fit with a prior uniform in sin 2 2θ 13 . Similarly to Table I, the values are split into both, normal, and inverted mass orderings, with any disjoint 1 σ regions denoted with a union symbol ∪. Compared to the results of the fit with the external constraint from reactor experiments, the central values for δ CP and sin 2 θ 23 shown here differ the most. This is not unexpected, since there is also a degeneracy between these two parameters, and both δ CP and sin 2 θ 23 have multiple areas of high probability.
Fig. 11 shows our preferred regions in the δ CP -sin 2 θ 23 and sin 2 θ 23 -∆m 2  32 spaces without the external θ 13 constraint. Comparing with Figs. 4 and 7, it is evident that removing the constraint diminishes the NOvA sensitivity, especially to the octant of θ 23 (there is now near symmetry across sin 2 θ 23 = 0.5). Although the octant preference is weakened, the central values of the PMNS parameters sin 2 θ 23 , ∆m 2  32 , and especially δ CP do not otherwise see any significant change. This is as expected from Fig. 10. The results without the reactor constraint are therefore fully compatible with the standard NOvA results with the external constraint on sin 2 2θ 13 applied. We also observe that although the sensitivity is reduced, and although for a given value of δ CP there is always a combination of parameters within either mass ordering compatible with the data, certain δ CP -sin 2 θ 23 -ordering combinations are still excluded with reasonable confidence, such as (δ CP = π/2, IO) and (δ CP = 3π/2, sin 2 θ 23 = 0.5, NO).
Combinations such as these produce strong asymmetries in the (ν e , νe ) counts, and as Fig. 10 makes clear, such parameter values lie in regions outside the posterior point cloud and are thus disfavored. Table V examines the posterior probabilities for each of the octant and mass-ordering hypotheses more quantitatively. Compared to Table II, we note a weakened preference for the upper octant of θ 23 in the fit without the reactor constraint.
In Fig. 12 we study the impact of applying the reactor constraint on our mass-ordering inference. As expected from Fig. 10, the normal ordering contains more posterior probability, as evidenced by the larger NO contour in the lower panel. Moreover, as we see in the marginal posterior distribution shown in the top panel, the posterior restricted to the IO prefers generally larger values of sin 2 2θ 13 . The associated correlation with sin 2 θ 23 also pushes θ 23 into the lower octant, as seen in the lower panel; this point will be developed further momentarily. Because the reactor value (the yellow bar indicates its 1 σ range) lies at lower sin 2 2θ 13 , applying it results in the slightly stronger preference for the NO in Table II as compared to Table V. However, the difference is small, which indicates that the (mild) NO preference observed in the data is largely independent of the reactor constraint.
The degeneracy in the measurement between sin 2 θ 23 and sin 2 2θ 13 we noted in Fig. 10 can be studied directly by examining their joint posterior probability distribution, marginalized over all other parameters and the mass ordering. We show this in Fig. 13. The central panel exhibits a clear anticorrelation between the octant of θ 23 and the value of sin 2 2θ 13 , which is expected since both parameters enter at leading order in the ν µ → ν e and νµ → νe oscillation probabilities. Here the overlap of the reactor measurements (again indicated by the yellow hatched bar) with our marginal posterior for sin 2 2θ 13 (right panel) favors the upper octant over the lower octant when we constrain the results to specifically the upper or lower octant of θ 23 . Thus, we see that the preference for the upper octant of θ 23 in Table II is an emergent behavior that arises from the application of the reactor constraint. We also note that though the marginal posterior for sin 2 θ 23 , in the top panel, shows a higher posterior density in the lower octant, the total posterior probability integrated across the upper octant is slightly larger.

We emphasize that reactor neutrino experiments and accelerator neutrino experiments measure the PMNS sin 2 2θ 13 by examining different sectors of neutrino oscillations, over a wide range of baselines. Reactor neutrino experiments measure the νe survival probability P( νe → νe ) with low-energy (few-MeV) νe s over a short (few-km) baseline. Conversely, accelerator neutrino experiments simultaneously measure ν e appearance in a few-GeV ν µ beam, and νe appearance in a νµ beam, both over a long (hundreds-of-km) baseline. In long-baseline measurements P(ν µ → ν µ ), P( νµ → νµ ), P(ν µ → ν e ), and P( νµ → νe ) are all exploited to constrain the PMNS oscillation parameters, including sin 2 2θ 13 . Thus, the consistency observed between long- and short-baseline measurements lends strong support to the PMNS interpretation of neutrino oscillations.
FIG. 13: sin 2 θ 23 -sin 2 2θ 13 posterior probability densities with 1, 2, and 3 σ credible intervals, marginalized over both mass orderings. The posterior density was extracted from a fit with a prior uniform in sin 2 2θ 13 . The reactor experiments' 1 σ interval in sin 2 2θ 13 from the PDG 2019 [78] is shown as the yellow hatched bar. The right panel shows the posterior probability for sin 2 2θ 13 , with its contributions from the upper octant (UO, transparent, red outline) and lower octant (LO, transparent, blue outline) of θ 23 .

IV. CONCLUSIONS
We have refit the NOvA dataset of 13.6 × 10 20 POT in neutrino beam mode and 12.5 × 10 20 POT in antineutrino mode using a Bayesian statistical approach. NOvA data continue to be consistent with maximal mixing for sin 2 θ 23 , and the regions of the sin 2 θ 23 -∆m 2  32 and δ CP -sin 2 θ 23 spaces preferred by this analysis are consistent with those of the previous analysis done using a frequentist fit.
With the introduction of the Bayesian analysis, we also expand the set of neutrino oscillation parameters measured by NOvA. We report for the first time NOvA measurements that do not require constraints on sin 2 2θ 13 from reactor antineutrino oscillations. Moreover, we are also able to include new results for sin 2 2θ 13 and the Jarlskog invariant J; these were impractical to produce under the preceding frequentist method due to the necessity of Feldman-Cousins corrections. The inferences on J provide a model-independent measurement of CP violation and indicate that NOvA data has a weak preference for CP violation, which becomes more pronounced when assuming the inverted mass ordering.
As NOvA measures a convolution of mixing angles, δ CP , and mass ordering in the oscillation probabilities, removing the external constraint on θ 13 reduces our constraining power to determine the octant of θ 23 , the mass ordering, and δ CP . While our sensitivity is reduced without the external constraint (particularly for the octant), we note that the conclusions arising from the analysis remain unchanged.
This measurement of sin 2 2θ 13 using electron neutrinos and antineutrinos with energies in the GeV range and propagating for hundreds of kilometers is fully consistent with measurements performed using few-MeV electron antineutrinos from nuclear reactors propagating for a few kilometers. The consistency of results using the PMNS framework across a broad regime of conditions bolsters its applicability. More stringent tests of CP violation and of the consistency of our sin 2 2θ 13 measurement with those of reactors will be possible with increased statistics in upcoming NOvA measurements.

V. ACKNOWLEDGMENTS
This document was prepared by the NOvA collaboration using the resources of the Fermi National Accelerator Laboratory (Fermilab).

Appendix A: Step-size tuning for MR 2 T 2

As noted in Sec. III A 1, an MR 2 T 2 chain is derived sample-by-sample using a repeated two-step procedure:

1. Proposal: The coordinates of a potential new sample are selected from a probability distribution centered on the current sample (or initial seed).
2. Acceptance: The proposal selected above is either accepted or rejected according to the rule of detailed balance, i.e., that every step in the chain be exactly reversible.
If accepted, the proposed coordinates become the next sample. If rejected, the previous sample is repeated to become the next sample. The MR 2 T 2 algorithm does not specify the distribution to be used in step 1 above, however. In our implementation, we use the most common choice, a multivariate Gaussian:

q(⃗x ′ | ⃗x) = (2π) −N/2 |Σ| −1/2 exp[ −(⃗x ′ − ⃗x) T Σ −1 (⃗x ′ − ⃗x)/2 ],  (A1)

where ⃗x represents the current sample coordinates, ⃗x ′ the proposed next coordinates, and N the dimensionality of the coordinate space. The matrix Σ imposes a length scale on the "distance" between successive samples, and (especially when it is diagonal) its elements are usually called the "step sizes" of the sampling for each degree of freedom. The ideal asymptotic fraction of samples accepted in step 2, α, is 23.4% under a wide range of circumstances [89,90]. Though this figure is strictly true only for N → ∞, it has been shown to hold approximately even for parameter counts as low as N = 5 [91].
Because the outcome of step 2 is related to the proposals generated in step 1, we tuned the values of Σ to arrive at α = 23.4%.
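A minimal sketch of this proposal-acceptance loop, with a toy Gaussian target standing in for the NOvA posterior (the function names and the toy target are illustrative, not the collaboration's implementation):

```python
import numpy as np

def metropolis(log_post, x0, step_cov, n_samples, rng):
    """Random-walk Metropolis (MR2T2): Gaussian proposal with covariance
    step_cov, accept/reject enforcing detailed balance."""
    x = np.array(x0, float)
    lp = log_post(x)
    chain = np.empty((n_samples, x.size))
    n_accept = 0
    L = np.linalg.cholesky(step_cov)  # proposal step sizes (Eq. A1's Sigma)
    for i in range(n_samples):
        x_prop = x + L @ rng.standard_normal(x.size)  # symmetric proposal
        lp_prop = log_post(x_prop)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = x_prop, lp_prop
            n_accept += 1
        chain[i] = x  # on rejection the previous sample repeats
    return chain, n_accept / n_samples

# Toy target: standard 2D Gaussian, so log posterior = -|x|^2 / 2.
rng = np.random.default_rng(1)
log_post = lambda x: -0.5 * x @ x
chain, acc = metropolis(log_post, [0.0, 0.0], 2.0 * np.eye(2), 20_000, rng)
```

Scaling `step_cov` up or down changes the acceptance fraction `acc`, which is exactly the knob being tuned toward α = 23.4% in the procedure above.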
Our overall heuristic in the tuning procedure is to maintain step sizes that yield similar autocorrelations (defined rigorously below) across all the parameters. This results in the most efficient exploration of the parameter space [71]. We first optimized the step sizes for the parameters of interest, θ 13 , θ 23 , |∆m 2  32 |, and δ CP . We constructed a chain that sampled only those parameters using a unit matrix for Σ. We computed α for this chain and scaled the relevant elements of Σ in order to arrive at a tolerable preliminary acceptance rate of about 20%. We then computed the k-lag autocorrelation for each parameter θ, which measures the average correlation between MCMC sample n and sample n + k across all n [92]:

r k = ∑ n (θ n − θ̄)(θ n+k − θ̄) / ∑ n (θ n − θ̄) 2 ,  (A2)

where θ n refers to the value of parameter θ at step n, and θ̄ is its mean value. These autocorrelations are shown in Fig. 14. Using these r k , we further adjusted the elements of Σ so the oscillation parameters would have similar autocorrelations.
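The lag-k autocorrelation of Eq. A2 can be sketched directly; the AR(1) toy chain below (a hypothetical stand-in for an MCMC parameter trace) has successive-sample correlation 0.9 by construction, which r 1 should recover:

```python
import numpy as np

def autocorr(theta, k):
    """Lag-k autocorrelation r_k (Eq. A2): average correlation between
    chain samples n and n + k, normalized by the chain variance."""
    d = np.asarray(theta, float) - np.mean(theta)
    return float(np.dot(d[:-k], d[k:]) / np.dot(d, d))

# Toy chain with known autocorrelation: x_n = 0.9 x_{n-1} + noise.
rng = np.random.default_rng(7)
x = np.empty(50_000)
x[0] = 0.0
for i in range(1, x.size):
    x[i] = 0.9 * x[i - 1] + rng.standard_normal()
```

For such a chain r k decays roughly as 0.9^k, which is the kind of falloff the step-size tuning aims to equalize across parameters.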
To optimize the step sizes for the nuisance parameters (systematic uncertainties), we constructed a chain sampling only those parameters, again beginning with Σ entries of unity for them. As with the oscillation parameters, we adjusted Σ to ensure none of the nuisance parameters had significantly different autocorrelations from the others. We then constructed a new, much longer chain, and subsequently computed a covariance matrix over the nuisance parameters from it. The Cholesky decomposition of this matrix, L, was used in the next step.
A final chain, this time sampling both oscillation and nuisance parameters, was constructed using the adjusted Σ entries for both. While sampling, the proposed values for the nuisance parameters were multiplied by the decomposed covariance scaled by a tunable factor, βL.
Using this chain, we recomputed the autocorrelations for all the parameters. The elements of Σ were readjusted to obtain similar autocorrelations from the oscillation parameters, now in the presence of the nuisance parameters. We also adjusted β so that the nuisance parameters' autocorrelations were similar to those of the oscillation parameters. A final global scale was applied to Σ and β to finally arrive at α = 23.4%. The autocorrelations for the oscillation parameters at the end of the tuning procedure are shown in Fig. 15.

The property of sample proportionality to posterior density in the MR 2 T 2 method is only guaranteed as asymptotic behavior. Therefore, it is usually necessary to discard some number of samples N 0 at the beginning of the Markov chain while the chain "burns in" or "warms up." Moreover, it is usually impossible to find a choice of step sizes Σ in Eq. A1 that proposes samples that are both fully uncorrelated with the previous one and whose acceptance probabilities are high enough to not impose severe computing requirements. We therefore "thin" our chains, discarding all but every kth sample, to reduce autocorrelations. The resulting count N eff = (N − N 0 )/k, the "number of effective samples," corresponds to the statistical power of the chain.
Proper thinning results in minimal autocorrelations. Though there is no universally agreed-upon threshold below which autocorrelations are considered acceptable, we found the plateau in Fig. 15 of around or less than 1% to be sufficient for our needs. We therefore chose to thin at k = 10 4 .
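The burn-in and thinning bookkeeping amounts to a single slice; a sketch with a toy stand-in chain (the sample counts below are hypothetical, not the analysis values):

```python
import numpy as np

def thin_chain(chain, n_burn, k):
    """Discard the first n_burn samples, then keep every k-th sample."""
    return np.asarray(chain)[n_burn::k]

# Toy stand-in: a chain of 10^6 raw samples, burning 10^5 and thinning
# by k = 10^4 leaves N_eff = (N - N0)/k = 90 effective samples.
chain = np.arange(1_000_000)
thinned = thin_chain(chain, n_burn=100_000, k=10_000)
n_eff = (chain.size - 100_000) // 10_000
```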

Appendix C: Algorithm choices for HMCMC
As noted in Sec. III, HMCMC generates proposals by numerically integrating a Hamiltonian for a fictitious particle, whose potential arises from treating the log-posterior in analogy to gravity:

H = T + V, with V(⃗x) = − ln P(⃗x | data),  (C1)

where T and V are the kinetic and potential energies of the system, respectively. There are two ingredients of HMCMC left unspecified by the method. In both cases Stan's default choices were found to be suitable for our needs. The first is the distribution of kinetic energies from which T in Eq. C1 is chosen. Stan's default is the Euclidean-Gaussian kinetic energy distribution:

T(⃗p) = ⃗p T M −1 ⃗p / 2, with ⃗p ∼ N(0, M).  (C2)

Here the mass matrix M (analogous to the effect of mass in gravitation) is a parameter that is automatically inferred by Stan during its warm-up sampling by iterative adjustments based on a running covariance over the samples. The second implementation choice is how long the integrator is allowed to run for each particular trajectory.
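The numerical integration behind each HMCMC proposal is typically done with the leapfrog scheme, which is volume-preserving and exactly time-reversible. A minimal sketch with a toy 2D Gaussian target (so the gradient of the log posterior is simply −⃗x and M = I; this illustrates the integrator only, not Stan's internals):

```python
import numpy as np

def leapfrog(x, p, grad_log_post, eps, n_steps, m_inv):
    """Leapfrog integration of Hamilton's equations for
    H = p^T M^-1 p / 2 - log posterior(x)."""
    x, p = np.array(x, float), np.array(p, float)
    p += 0.5 * eps * grad_log_post(x)      # initial half step in momentum
    for _ in range(n_steps - 1):
        x += eps * (m_inv @ p)             # full step in position
        p += eps * grad_log_post(x)        # full step in momentum
    x += eps * (m_inv @ p)
    p += 0.5 * eps * grad_log_post(x)      # final half step in momentum
    return x, p

# Toy target: standard 2D Gaussian.
m_inv = np.eye(2)
grad = lambda x: -x
H = lambda x, p: 0.5 * p @ m_inv @ p + 0.5 * x @ x
x0, p0 = np.array([1.0, -0.5]), np.array([0.3, 0.8])
x1, p1 = leapfrog(x0, p0, grad, eps=0.1, n_steps=50, m_inv=m_inv)
```

Because the integrator nearly conserves H, the Metropolis correction at the end of each trajectory accepts almost every proposal, which is what makes HMCMC efficient in high dimensions.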
Stan uses an algorithm called the No-U-Turn Sampler (NUTS) [93], a heuristic method that halts integration when the two trajectories, extending in each direction from the starting point, begin to turn back toward one another (a "U-turn"), so that further integration would retrace ground already covered.

FIG. 3: Posterior predictive p-values from real-data MCMC samples, with the θ 13 constraint from reactor experiments applied. The purple distribution is a scatterplot of the binned χ 2 computed between the model and real data spectra (x axis), against a similar χ 2 between the model and pseudodata spectra (y axis), both divided by the number of degrees of freedom (DOF) in the fit, computed for each MCMC sample. (See the text for more on how the pseudodata spectra are constructed.) The dashed and solid contours show 1 σ intervals from the same posterior-predictive distributions calculated only for ν e (dark blue, left solid), νe (red, right solid), ν µ (light blue, left dashed) and νµ (light red, right dashed) data samples. The posterior predictive p-value is the fraction of the distribution that lies above the diagonal χ 2 data = χ 2 pseudodata line (black). The fact that the purple distribution is centered at (1, 1) and is evenly distributed above and below the diagonal, with an associated p-value of 0.56, indicates a good fit.

FIG. 6: Binned posterior probability density (shaded) with 1, 2, and 3 σ credible intervals in |∆m 2  32 | and the NOvA absolute calibration systematic uncertainty, where the horizontal axis is measured in units of standard deviations from the nominal value. The external constraint on θ 13 from reactor experiments was applied in this fit.

FIG. 7:

FIG. 8: Credible interval comparisons for δ CP -sin 2 θ 23 between statistical-only fits and fits including all the NOvA systematic parameters, with the external constraint on θ 13 from reactor experiments applied.

FIG. 9: Posterior probability density for the Jarlskog invariant in the NO (top plot) and IO (bottom plot). The top half of each panel shows the posterior with a prior uniform in sin δ CP , while the bottom half uses a prior uniform in δ CP . A CP-conserving line is drawn at J = 0. The external constraint on θ 13 from the reactor experiments is used.
TABLE III: Bayes factors for preference of CP violation over CP conservation, extracted using the Savage-Dickey method at J = 0. Priors uniform in δ CP and sin δ CP are both shown. The preferences are given for the normal (NO), inverted (IO), and both (BO) mass orderings.

Prior              NO   IO   BO
Uniform sin δ CP   1.3  3.5  1.5
Uniform δ CP       1.1  4.4  1.5

FIG. 15: Lag autocorrelation (see Eq. A2) computed using an MR 2 T 2 chain after step-size tuning with all parameters included.

TABLE I: The highest posterior density (HPD) points together with the 1 σ Bayesian credible interval limits for the PMNS parameters of interest, marginalized over all the mass ordering (MO) hypotheses. Marginalization over the mass orderings is explained at the beginning of Sec. III B 2. In these results a Gaussian prior corresponding to the reactor constraint on sin 2 2θ 13 (see Sec. III A 3 c) is applied.

TABLE II: Bayes factors (posterior probabilities) for all the θ 23 octant and mass ordering hypothesis combinations. A weak preference towards the normal mass ordering and upper octant is observed. Numbers are extracted from a fit with the external reactor constraint on sin 2 2θ 13 applied. Probabilities summed across rows or columns may differ slightly from the totals due to rounding.

TABLE IV: The highest posterior density (HPD) points together with the 1 σ Bayesian credible interval limits for the PMNS parameters of interest, marginalized over all the mass ordering (MO) hypotheses. Marginalization over the mass orderings is explained at the beginning of Sec. III B 2. Extracted from a fit with a prior uniform in sin 2 2θ 13 .

TABLE V: Bayes factors (posterior probabilities) for all the θ 23 octant and mass ordering hypotheses, with a marginal preference towards the normal mass ordering and upper octant. Numbers are extracted from a fit with a uniform prior in sin 2 2θ 13 . Probabilities summed across rows or columns may differ slightly from the totals due to rounding.
Fermilab is a U.S. Department of Energy, Office of Science, HEP User Facility, managed by Fermi Research Alliance, LLC (FRA), acting under Contract No. DE-AC02-07CH11359. This work was supported by the U.S. Department of Energy; the U.S. National Science Foundation; the Department of Science and Technology, India; the European Research Council; the MSMT CR, GA UK, Czech Republic; the RAS, MSHE, and RFBR, Russia; CNPq and FAPEG, Brazil; UKRI, STFC and the Royal Society, United Kingdom; and the state and University of Minnesota. We are grateful for the contributions of the staffs of the University of Minnesota at the Ash River Laboratory, and of Fermilab. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) license to any Author Accepted Manuscript version arising.