Likelihood preservation and statistical reproduction of searches for new physics

Likelihoods associated with statistical fits in searches for new physics are beginning to be published by LHC experiments on HEPData. The first of these is the search for bottom-squark pair production by ATLAS. These likelihoods adhere to a specification first defined by the HistFactory p.d.f. template, which is per se independent of its implementation in ROOT, making it useful to be able to run statistical analysis outside of the ROOT and RooStats / RooFit framework. We introduce a JSON schema that fully describes the HistFactory statistical model and is sufficient to reproduce key results from published ATLAS analyses. Using two independent implementations of the model, one in ROOT and one in pure Python, we reproduce the sbottom multi-b limits from the published likelihoods on HEPData, underscoring the implementation independence and long-term viability of the archived data.


Introduction
Measurements in High Energy Physics (HEP) aim to determine the compatibility of observed events with theoretical predictions. The relationship between them is often formalised in a statistical model f(x|φ) describing the probability of data x given model parameters φ. Given observed data, the likelihood L(φ) then serves as the basis to test hypotheses on the parameters φ. For measurements based on binned data (histograms), the HistFactory [1] family of statistical models has been widely used for likelihood construction in both Standard Model (SM) measurements (e.g. Refs. [2,3]) and searches for new physics (e.g. Ref. [4]), as well as reinterpretation studies (e.g. Ref. [5]). Here, a declarative, plain-text format for describing HistFactory-based likelihoods [6] is presented, targeted at reinterpretation and long-term preservation in analysis data repositories such as HEPData [7].

Formalism
HistFactory statistical models, described in depth in Ref. [6], center around the simultaneous measurement of disjoint binned distributions (channels) observed as event counts n. For each channel, the overall expected event rate is the sum over a number of physics processes (samples). The sample rates may be subject to parametrised variations, both to express the effect of free parameters η and to account for systematic uncertainties as a function of constrained parameters χ, whose impact on the expected event rates relative to the nominal rates is limited by constraint terms. In a frequentist framework these constraint terms can be viewed as auxiliary measurements with additional global observable data a, which paired with the channel data n completes the observation x = (n, a). The full parameter set can be partitioned into free and constrained parameters φ = (η, χ), where a subset of the free parameters are declared parameters of interest (POI) ψ (e.g. the signal strength) and all remaining parameters are nuisance parameters θ.
The overall structure of a HistFactory probability model is then a product of the analysis-specific model term describing the measurements of the channels and the analysis-independent set of constraint terms:

\[
f(\boldsymbol{x} \,|\, \boldsymbol{\phi}) = \prod_{c \,\in\, \text{channels}} \prod_{b \,\in\, \text{bins}_c} \operatorname{Pois}\!\left(n_{cb} \,\middle|\, \nu_{cb}(\boldsymbol{\eta}, \boldsymbol{\chi})\right) \cdot \prod_{\chi \,\in\, \boldsymbol{\chi}} c_\chi(a_\chi \,|\, \chi),
\]

where within a certain integrated luminosity one observes n_cb events given the expected rate of events ν_cb(η, χ) as a function of unconstrained parameters η and constrained parameters χ. The latter have corresponding one-dimensional constraint terms c_χ(a_χ|χ) with auxiliary data a_χ constraining the parameter χ. The expected event rates ν_cb are defined as

\[
\nu_{cb}(\boldsymbol{\eta}, \boldsymbol{\chi}) = \sum_{s \,\in\, \text{samples}} \left( \prod_{\kappa \,\in\, \boldsymbol{\kappa}} \kappa_{scb}(\boldsymbol{\eta}, \boldsymbol{\chi}) \right) \left( \nu^0_{scb} + \sum_{\Delta \,\in\, \boldsymbol{\Delta}} \Delta_{scb}(\boldsymbol{\eta}, \boldsymbol{\chi}) \right)
\]

from the constant nominal rates ν⁰_scb and a set of multiplicative and additive rate modifiers κ(φ) and Δ(φ).
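As an illustration of this structure, the following is a minimal sketch using pyhf's convenience model builder, assuming a recent pyhf release (in older releases this helper was named hepdata_like); all rates and counts are illustrative values, not taken from any published analysis.

import pyhf

# One channel with a signal sample carrying a free normalisation factor
# and a background sample with per-bin constrained ("shapesys") rate
# uncertainties (illustrative values)
model = pyhf.simplemodels.uncorrelated_background(
    signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[5.0, 10.4]
)

# x = (n, a): channel counts n together with the auxiliary data a of the
# constraint terms
observations = [51.0, 48.0] + model.config.auxdata

# evaluate log f(x | phi) at the suggested initial parameter point
print(model.logpdf(model.config.suggested_init(), observations))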

JSON Schema
The structure of the JSON specification of HistFactory models closely follows the original XML-based specification [1]. The JSON specification for a HistFactory workspace is a primary focus of Ref. [6], but a workspace can be summarised as consisting of a set of channels (analysis regions) that include samples and possible parameterised modifiers, a set of measurements (including the POI), and observations (the observed data). Listing 1 demonstrates a simple workspace representing the measurement of a single two-bin channel with two samples: a signal sample and a background sample. The signal sample has an unconstrained normalisation factor µ, while the background sample carries an uncorrelated shape systematic. The background uncertainties for the two bins are 10% and 20%, respectively.
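As Listing 1 itself is not reproduced here, the following is a minimal sketch of what such a two-bin workspace looks like in the pyhf JSON format, written as a Python dictionary for compactness; the sample rates, observed counts, and modifier names are illustrative, with the shapesys data chosen as 10% and 20% of the nominal background rates.

import json

# Sketch of a HistFactory JSON workspace: channels, observations, and
# measurements (including the POI), following the pyhf schema
workspace = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "signal",
                    "data": [12.0, 11.0],
                    "modifiers": [
                        {"name": "mu", "type": "normfactor", "data": None}
                    ],
                },
                {
                    "name": "background",
                    "data": [50.0, 52.0],
                    "modifiers": [
                        {
                            "name": "uncorr_bkguncrt",
                            "type": "shapesys",
                            # absolute uncertainties: 10% and 20% of nominal
                            "data": [5.0, 10.4],
                        }
                    ],
                },
            ],
        }
    ],
    "observations": [{"name": "singlechannel", "data": [51.0, 48.0]}],
    "measurements": [
        {"name": "Measurement", "config": {"poi": "mu", "parameters": []}}
    ],
    "version": "1.0.0",
}

print(json.dumps(workspace, indent=2))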

Likelihood Preservation and Result Reproduction
Through the use of the HistFactory JSON specification, the statistical model used in a search for sbottom squarks [8] with the ATLAS detector [9], based on the full Run-2 dataset of 139 fb⁻¹ of proton-proton collision data, was both preserved and reproduced. The search for new physics performs hypothesis tests on a simplified model that is parameterised by the masses of the sbottom squark $\tilde{b}_1$ and the neutralinos $\tilde{\chi}^0_2$ and $\tilde{\chi}^0_1$, and defines three separate statistical models. The full set of likelihoods for the three models is included as auxiliary material in the HEPData record of the analysis [10] for preservation and can be streamed from HEPData on demand. This is the first open publication of a full likelihood by an LHC experiment, fulfilling a proposal from the first Workshop on Confidence Limits (2000) [11].
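The archived likelihoods can be retrieved and used directly. The following is a sketch of fetching them from HEPData and running a background-only fit for one region, assuming a recent pyhf release installed with its "contrib" extra; the resource DOI and the file layout (RegionA/BkgOnly.json) follow pyhf's documentation and should be checked against the HEPData record [10].

import json
import pyhf
from pyhf.contrib.utils import download

# stream the archived likelihoods from the HEPData record's resource DOI
download(
    "https://doi.org/10.17182/hepdata.89408.v1/r2", "sbottom-likelihoods"
)

# load the background-only workspace for one of the three models
with open("sbottom-likelihoods/RegionA/BkgOnly.json") as f:
    workspace = pyhf.Workspace(json.load(f))

model = workspace.model()
data = workspace.data(model)

# maximum-likelihood fit of the background-only model
best_fit = pyhf.infer.mle.fit(data, model)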
In a demonstration of the full encapsulation of the HistFactory model in the JSON specification, a subset of the results from Ref. [8] is reproduced. The original analysis workspaces are converted to a set of XML and ROOT [12] files via RooStats [13], from which a JSON HistFactory workspace is made using the pyhf [14] library's xml2json command-line tool. The subset of results is then reproduced using both a pyhf implementation and a ROOT implementation of the HistFactory model. pyhf implements the HistFactory model purely within the scientific Python software stack, i.e. using the scipy [15] and numpy [16] libraries. To convert from the JSON HistFactory workspace to a ROOT-readable format, the pyhf json2xml command-line tool is used to convert the JSON to a set of XML and ROOT files, which are then converted into a RooFit workspace using the hist2workspace command-line tool. The full process is illustrated in Figure 1. In both model implementations the background-only fit from Ref. [8] is reproduced, as well as upper limits on the visible cross section of Beyond the Standard Model physics, with excellent agreement as detailed in Ref. [6]. Based on the single-point hypothesis test procedure at fixed µ = 1.0, i.e. the nominal Beyond the Standard Model expectation, a set of tests for all simulated grid points is performed to infer a 95% CLs exclusion contour. Using the procedure described in Figure 1, the results obtained from the archived statistical models using the original ROOT workspaces, the round-tripped ROOT workspaces, and pyhf are overlaid in Figure 2, showing excellent agreement with only minor numerical differences, validating the completeness of the JSON HistFactory specification.
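In pyhf, the single-point hypothesis test underlying each grid point of the exclusion contour can be sketched as follows, with model and data as built above after applying the signal patch for that grid point; the test-statistic keyword shown here assumes pyhf ≥ 0.6 (older releases used a qtilde boolean flag).

import pyhf

# single-point hypothesis test at fixed mu = 1.0 (the nominal BSM
# expectation), returning the observed CLs and the expected CLs band
cls_obs, cls_exp_band = pyhf.infer.hypotest(
    1.0, data, model, test_stat="qtilde", return_expected_set=True
)

is_excluded = cls_obs < 0.05  # grid point excluded at 95% CLs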

Reinterpretation
The preservation of the statistical model in a structured form also aids in the derivation of new results through the method of reinterpretation. In reinterpretations a subset of the samples contributing to the expected event rates, most commonly those associated with Beyond the Standard Model processes, are replaced with alternative predictions derived from a new theoretical model, while keeping the remaining estimates, typically those derived for Standard Model processes, unchanged.

[Figure 2 caption: Expected and observed 95% CLs exclusion contours in the $(m(\tilde{b}_1), m(\tilde{\chi}^0_2))$ phase space for the $m(\tilde{\chi}^0_1) = 60$ GeV signal scenario, using the SR with the best-expected sensitivity. The shaded band shows the impact of the theory uncertainties on the SM background, and the experimental uncertainty on both the background and the signal. The contours labeled ROOT are calculated from the original workspaces of the analysis. From these original workspaces, xml2json was run and pyhf was used to produce the contours labeled pyhf. Finally, json2xml was used to generate XML and ROOT files, from which ROOT workspaces can be built, to produce the contours labeled roundtrip. The overlaid expected and observed limits and exclusion contours, produced by pyhf and ROOT, reproduce the contours of Figure 8(a) in Ref. [8] with excellent agreement. All curves are superimposed at the level of graphical precision [6].]
The process of replacing certain samples of the original likelihood with updated ones can be viewed as applying a patch p to the likelihood L to derive a new one L′, i.e. p: L → L′. The choice of JSON as a serialisation format for the likelihood also enables an unambiguous definition of such likelihood patches using the JSONPatch format [17], an ordered array of transformations applied to the original document. The patch format provides a well-defined target for reinterpretation tools to produce; combined with the original likelihood, it yields the likelihood for a reinterpretation. Using the two-bin toy example in Listing 1, a JSON patch, seen in Listing 2, can be applied to replace the nominal expected event rates, an array of two floats, with new values. This patch, provided as a file patch.json, can be applied to the original likelihood, stored in a file original.json, using the jsonpatch command-line tool, which produces the result in Listing 3. The new JSON file can then be processed through either the ROOT implementation or the pyhf implementation.
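The same patching step can be sketched in Python with the jsonpatch library rather than its command-line tool; the path below assumes the signal sample is the first sample of the first channel, as in the toy workspace above, and the new two-bin signal rates are illustrative values.

import json
import jsonpatch

# an ordered array of transformations, here a single "replace" operation
# swapping out the nominal signal rates (cf. Listing 2)
patch = jsonpatch.JsonPatch(
    [
        {
            "op": "replace",
            "path": "/channels/0/samples/0/data",
            "value": [5.0, 6.0],
        }
    ]
)

# apply the patch to the original likelihood to obtain the new one
with open("original.json") as f:
    original = json.load(f)

patched = patch.apply(original)

with open("patched.json", "w") as f:
    json.dump(patched, f, indent=2)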

Conclusions
HistFactory statistical models are widely used for published results within HEP to model the analysis and perform statistical tests. The simple structure of HistFactory allows the full statistical model to be easily archived in the JSON format introduced in Ref. [6], which is optimised for long-term archival on data repositories such as HEPData. The ability to archive the full models from a recent search for sbottom squarks, based on 139 fb⁻¹ of proton-proton collision data recorded with the ATLAS detector, is demonstrated for the first time by an LHC experiment using the plain-text JSON specification. Finally, key statistical results of the analysis are reproduced with two independent implementations of the HistFactory model, one in the ROOT ecosystem and one in the Python scientific software ecosystem, underscoring the implementation independence and long-term viability of the archived data.