Extended Maximum Likelihood Halo-independent Analysis of Dark Matter Direct Detection Data

We extend and correct a recently proposed maximum-likelihood halo-independent method to analyze unbinned direct dark matter detection data. Instead of the recoil energy as independent variable we use the minimum speed a dark matter particle must have to impart a given recoil energy to a nucleus. This has the advantage of allowing us to apply the method to any type of target composition and interaction, e.g. with general momentum and velocity dependence, and with elastic or inelastic scattering. We prove the method and provide a rigorous statistical interpretation of the results. As first applications, we find that for dark matter particles with elastic spin-independent interactions and neutron to proton coupling ratio $f_n/f_p=-0.7$, the WIMP interpretation of the signal observed by CDMS-II-Si is compatible with the constraints imposed by all other experiments with null results. We also find a similar compatibility for exothermic inelastic spin-independent interactions with $f_n/f_p=-0.8$.


Introduction
We do not know what the dark matter (DM), the most abundant form of matter in the universe, consists of. Weakly interacting massive particles (WIMPs) are the most extensively studied DM particle candidates, not only because of their theoretical appeal but also because they could be detected in the near future.
The conventional analysis of direct search data relies on a specific model of the DM halo of our galaxy, often chosen to be the standard halo model (SHM). A halo-independent data comparison method has been developed more recently [19][20][21][22][23][24]. The basic idea behind this method is that all the dependence of the scattering rate on the halo model, in any detector, resides in the same function which we callη(v min , t) of the speed v min and the time t.
Sinceη(v min , t) is common to all direct search experiments, this function can be measured by all experiments, and the compatibility of the different measurements can be studied. The speed v min is the minimum speed necessary for the incoming interacting DM particle to impart a recoil energy E R to a nucleus in each detector. Conversely, given an incoming WIMP speed v = v min , E R is the extremum recoil energy (maximum energy for elastic collisions, or either maximum or minimum for inelastic collisions) that the DM particle can impart to a nucleus. Notice that E R and v min are exchangeable variables only for a single nuclide. When a target consists of multiple nuclides, a choice must be made between the two, E R and v min . Taking E R as independent variable (as is done in [19,20,22]) v min depends on each target nuclide. In our approach, v min and the observed energy E are the independent variables. This allows us to incorporate any isotopic composition of the target by summing over target nuclide dependent E R (v min ) for fixed observed E .
In earlier implementations of the halo-independent method, only weighted averages over v min intervals of the time average,η 0 (v min ), and annual modulation amplitudes,η 1 (v min ), of η(v min , t) have been obtained from putative DM signals in direct detection. These averages over v min intervals are represented in plots by a set of crosses in the v min −η plane, whose vertical and horizontal bars show the uncertainty inη 0 (v min ) orη 1 (v min ) and the v min range where they are measured, respectively. Combined with upper limits, these crosses can be used to assess the compatibility of data sets from various experiments. However, making a statistically meaningful evaluation of the compatibility of the data in this manner is not possible.
The compatibility of different data has been studied in [25] using the "parameter goodness-of-fit" test statistic [26]. The analysis is based on the likelihoods maximized with η 0 written as a sum of very large number of step functions, following a method presented in [27]. In this case, the level of compatibility is given by the p-value of the test statistic, which was calculated by Monte Carlo simulations in [25]. Another test statistic for comparing one data set with a positive result and another with a negative result has been defined in [28].
An alternative method to study the compatibility of a positive result with upper limits uses a band in v min −η 0 space at a given confidence level [29], derived from unbinned data, with an extended likelihood [30]. In this case, as shown in [29] for single-nuclide detectors, the likelihood is maximized by a non-increasing piecewise constantη 0 function, because of the exponential prefactor in the extended likelihood.
The proof presented in [29] relies on the assumption that the target is made of a single component. The main limitation of the approach of [29] relies on their use of the recoil energy E R as independent variable. Here we provide a derivation of the extended maximum likelihood halo-independent (EHI) analysis method using v min as a variable, which applies to any type of WIMP interaction, including inelastic scattering, and any target composition. We correct and extend the original proof of [29] using the formulation developed for the generalized halo-independent analysis in [24]. The proof for the realistic case of finite experimental energy resolution presented in [29] relies on the application to the likelihood functional maximization of the Karush-Kuhn-Tucker (KKT) conditions in [31,32]. The KKT conditions in [31,32], however, apply only to the minimization of functions with a finite number of variables subject to a finite number of inequalities, and they do not apply to functionals. Eqs. (A.3) to (A.6) of [29] are given without proof and without a reference. Moreover, Eq. (A.4) seems problematic for ag function (which in our paper we callη) that has discontinuities, as in the solutions found in [29]. In this case, Eq. (A.4) requires a Dirac δ function to be smaller or equal to zero, which is mathematically problematic. As we explain in Sec. 3, although the KKT conditions have been extended to functionals defined on specific kinds of function spaces and constraints, we did not find in the literature a proof that clearly applies to our problem. Thus, in Sec. 3 we present our own proof of the KKT conditions we use, Eqs. (3.22)-(3.25), which are clearly valid for discontinuous functions.
As in [29], here we find that the best fitη function is piecewise constant with a number of discontinuities at most equal to the number of observed events. In [29], this is a result found forg given as a function of the recoil energy, which can be easily translated to v min space only for a single target nuclide. Besides, the proof in [29] applies only to resolution functions with certain properties. We instead prove the result forη as a function of v min for any target composition and general resolution functions.
Besides these extensions, we make a correction to the method of [29] by providing a clear definition of the uncertainty band. In [29], the uncertainty band is defined in Eq. (2.16) through a numerical Monte Carlo simulation. In Sec. 4, we explain our objections to this procedure. We define instead a pointwise confidence band with a new method (see Sec. 4.2) and provide a clear statistical interpretation for this band using Wilks' theorem. The different definitions of the band, here and in [29], yield very different values of the parameter ∆L defined in both papers for the same confidence level.
In Sec. 2, we review the formulation of the generalized halo-independent analysis, on which the following sections are based. In Sec. 3, we prove crucial properties of the extended likelihood for unbinned direct dark matter detection data. In Sec. 4, we develop the EHI analysis method, and discuss the statistical interpretation of the confidence band computed with this method. In Sec. 5, we apply the method to the CDMS-II-Si [7] data for WIMPs with elastic isospin-conserving and isospin-violating SI interactions [33][34][35], and exothermic inelastic isospin-violating SI interactions [36,37], and compare the results with the upper limits imposed by other experiments. Finally, we give our conclusions in Sec. 6.

Generalized halo-independent analysis method
The differential recoil rate per unit detector mass, typically given in units of counts/day/kg/keV, for the scattering of WIMPs of mass m off a target nuclide T with mass m T is where C T is the mass fraction of nuclide T in the detector, E R is the nuclear recoil energy, ρ is the WIMP local energy density, f (v, t) is the WIMP velocity distribution in Earth's frame, dσ T /dE R is the WIMP-nucleus differential scattering cross section, and v min (E R ) is the minimum WIMP speed needed to impart to the target nucleus a recoil energy E R . The revolution of Earth around the Sun introduces an annual modulation of f (v, t). In detectors with more than one nuclide in their target, the total differential recoil rate is Allowing for the possibility of inelastic scattering of an incoming WIMP with mass m into another outgoing WIMP with mass m = m + δ, for µ T |δ|/m 2 where µ T ≡ (m m T )/(m + m T ) is the WIMP-nucleus reduced mass. By inverting this equation one obtains the minimum and maximum recoil energies E T,± R (v) that are kinematically allowed for a fixed DM speed v, The minimum possible value of v min , thus also of v, for the interaction to be kinematically allowed is v T δ = 2δ/µ T for endothermic scattering, and v T δ = 0 for elastic and exothermic scattering. This speed value corresponds to the point of intersection of the two E T,± R branches. We define v δ to be the smallest of the v T δ values among all nuclides T in the detector.
Most experiments do not measure the recoil energy E R directly. They measure instead a proxy E for it, such as the ionization or scintillation signals, subject to experimental uncertainties and fluctuations. These are represented by an energy response function G T (E R , E ), which is the probability distribution for an event with recoil energy E R to be measured with energy E . Including the experimental acceptance (E R , E ) from various experimental cuts and efficiencies, which, in general, is a function of both the recoil energy E R and the detected energy E , the differential event rate in the detected energy E can be written as By inserting (2.1) into (2.5) we can express the differential event rate in the detected energy E as a double integral from where changing the integration order and extracting a reference parameter σ ref from the cross section we get Here the function dH/dE is (2.9) In (2.7) and (2.9) we have written explicitly a parameter σ ref extracted from the differential cross section to represent the strength of the interaction. This is preferably, but not necessarily, the WIMP-proton scattering cross section.
Here we consider only differential cross sections (and thus also dH/dE functions) which depend only on the speed v = |v|, and not on the direction of the initial WIMP velocity v. The cross section depends only on v if the incoming WIMPs and the target nuclei are unpolarized and the detector response is isotropic, as is most common. In this case, one can write the differential event rate in a simpler form as . We now define the functionη(v min , t) as and (2.10) becomes Using thatη(∞, t) = 0 (see (2.11)) and dH/dE (E , v δ ) = 0 (since E T,− R (v δ ) = E T,+ R (v δ ) and the integrand in (2.8) is a regular function), the integration by parts of (2.13) leads to where we choose to call v min the integration variable because it makes obvious the physical meaning ofη as a function of v min , and where we define the "differential response function" dR/dE of the detector as Notice that dR/dE is a function of the target dependent recoil energies, E T,± R (v min ), which are functions of the independent variable v min . It is clear that, in (2.14), all the dependence on the halo model is in theη function which is independent of the experimental apparatus, and thus is common to all direct detection experiments. Therefore, by mapping the rate data intoη, it is possible to compare the different experimental results without any assumption on the dark halo of our galaxy.
With dR/dE given in (2.14), the energy-integrated rate over an energy interval [E 1 , E 2 ] is Direct detection experiments can measure the time-average R 0 [E 1 ,E 2 ] and the annual modu- The only source of time dependence in the rate isη, thus  can be used to infer the values ofη 0 andη 1 over the v min range in which the response function is non-zero. This is true for WIMPs whose differential cross section is inversely proportional to v 2 , such as for the usual spin-independent (SI) and spin-dependent interactions. Otherwise R [E 1 ,E 2 ] may need to be regularized (see [24] for the details).
For SI interactions, the WIMP-nucleus differential cross section can be written in terms of the effective couplings of the WIMP with neutrons and protons, f n and f p , as where σ p is the WIMP-proton cross section, µ p is the WIMP-proton reduced mass, A T and Z T are the atomic and charge numbers of the nuclide T , respectively, and F 2 T (E R ) is a nuclear form factor, for which we take the Helm form factor [38] normalized to F 2 and from (2.15), we get the following differential response function: dR SI /dE (2.21) For elastic scattering this reduces to 3 Piecewise constantη(v min ) resulting from the EHI method Most direct detection experiments measure energy-integrated rates and/or their annual modulation amplitudes in given energy intervals. CDMS-II-Si gives instead the recoil energies of three candidate DM events. Most halo-independent analyses of the CDMS-II-Si candidate events have chosen a binning scheme, which is arbitrary and may lose some of the information in the data [22-24, 36, 37, 39, 40].
Ref. [29] has introduced a halo-independent analysis method without binning. The method relies on the fact that the extended likelihood [30] yields piecewise constant functions as solutions of the likelihood maximization. The extended likelihood for unbinned data can be written as For simplicity we useη here for the time-average component of theη function (we call itη 0 in previous sections). Here N O is the total number of observed events, each with energy E a , with a = 1, . . . , is the total number of expected events within the energy range [E min , E max ] detectable in the experiment, which we write as a functional of the functioñ η(v min ): where N BG is the expected number of background events Here M T is the detector exposure, dR tot /dE is the total predicted differential event rate and dR BG /dE is the differential rate of the background events. Writing the rate in this form allows to take into account a non-trivial target composition (not included in [29]), through the differential response function dR/dE , defined in (2.8) and (2.15), or in (2.21) for SI interactions.
Without fixing the halo model, the likelihood function in (3.1) is actually a functional of theη function. If there is no uncertainty in the measurement of recoil energies, for a single target nuclide it was proven in [29] that the likelihood is maximized by a piecewise constantη function with the number of steps equal to or smaller than the number of observed events, N O . The proof for the realistic case with a finite energy resolution presented in [29] applies only to resolution functions with certain properties such as having a single local maximum, and relies on the application to the likelihood functional maximization of the Karush-Kuhn-Tucker (KKT) conditions in [31,32]. The KKT conditions in [31,32] apply to the minimization of functions with a finite number of variables subject to a finite number of inequalities. The proofs in [31,32] do not apply to functionals. The likelihood L[η] in (3.1) is instead a functional ofη(v min ) subject to an infinite number of inequalities, one for each value of v min . The inequality given in Eq. (A.4) of [29], dη/dE R in our notation, is actually an infinite set of inequalities, one for each E R .
The KKT conditions have been extended to functionals defined on specific kinds of function spaces and constraints. Banach spaces have been considered extensively (see e.g. [41]). However, the functions we are looking for, i.e. step functions, do not have derivatives everywhere unless interpreted as distributions, and the spaces of distributions defined on noncompact intervals like [0, ∞) are not Banach (under the usual weak-* topology). More general spaces, i.e. locally convex topological vector spaces, have been considered by Dubovitskii and Milyutin [42]. As explained in the book by R. B. Holmes [43] (see pages 51 to 53), the Dubovitskii and Milyutin theory applies to constrained sets of functions that have nonempty interiors. However, the set of non-increasing functions such asη has empty interior. A functionη ∈ S is in the interior of a set S if there is a neighborhood around it that belongs to the set, but for non-increasing functions there is no such neighborhood. This is because a non-increasing function always has a non-monotonic function arbitrarily close to it.
Since we did not find in the literature any proof that clearly applies to our problems, in the following we present our own proof by first discretizing the variable v min into a finite set of values v i min so that the KKT conditions are applicable, and then taking the continuum limit at the end. Our proof is heuristic only in that it does not address the convergence of the limit.
For convenience, let us define a different functional ofη, (−2 times the log-likelihood), With this definition, finding theη function that maximizes the extended likelihood is equivalent to finding the function that minimizes L. To simplify the problem, we discretize the v min space into a set of K + 1 positive variables With a K-dimensional vector η = (η 0 ,η 1 , . . . ,η K−1 ), we can define a piecewise constant functionη(v min ; η) given byη Notice that there is no loss of generality of theη(v min ) considered, since any physically meaningful function is the limit of a sequence of piecewise constant functions as the number of steps tends to infinity. The corresponding L functional becomes a function f L of the vector η, With this discretization we can formalize the minimization of the functional L as a limit of the function minimization of f L , and by doing so we can safely apply the KKT conditions. The KKT conditions for minimizing the function f L ( η) under the constraintsη i ≥η i+1 on its variables (i.e. requiring the piecewise constant functionη(v min ; η) to be non-increasing) are the minimization conditions for the function with respect to the variables η and q ≡ (q 0 , . . . , q K−1 ) considered as unconstrained, and the supplementary conditions q i ≥ 0 and q i (η i+1 −η i ) = 0. Written explicitly, the KKT conditions are: Choosing η to be a unit vectorη i along the ith componentη i , the first term on the left-hand side of (3.9) and (3.10) can be written as (3.14) Using (3.14), we now have (3.15) can be written in terms of the functional derivative of the L functional, .
From (3.6) one can easily see that the functionη(v min ;η i ) in (3.17) has a rectangular shape with value 1 between v i min and v i+1 min and zero everywhere else. The summation over i from i = 0 to j ≤ K − 1 of the left-hand side of (3.9) and (3.10) is thus Using an interpolation function q(v min ) satisfying q(v j min ) = q j , we finally conclude, using (3.20), that the KKT I conditions (3.9) and (3.10) imply By taking a large enough K with v MAX fixed, we can find an integer j such that v j min is arbitrarily close to a given v min value, if v δ ≤ v min ≤ v MAX . Therefore, in the limit K → ∞, and thus ∆v → 0, we can write the conditions (3.21) and (3.11) to (3.13) for continuous v min andη variables: Note that although we write the conditions in terms of continuous variables, they should always be understood as a limit of the conditions for discrete variables.
In (3.2) N E is given in terms of R, given in turn in (2.18), where dR/dE is in (2.15). Using these equations, (3.26) becomes We define: and Replacing (3.28) In this equation, the onlyη dependence is in γ a [η]. The functions ξ(v min ) and H a (v min ) do not depend onη. Fig. 1 shows the functions H a (v min ) and ξ(v min ) for the three candidate events of CDMS-II-Si assuming an SI cross section with f n /f p = 1 and m = 9 GeV. In order to explain the form of these functions, let us first consider a simple situation where the target material consists of a single nuclide, or multiple isotopes of the same element, as in CDMS-II-Si. In this case, the integrands of the different terms dH T /dE in (2.9) contributing to dH/dE in (2.8) are similarly localized in E R for all nuclides T (for a fixed E ). Notice that these integrands are independent of v if dσ T /dE R is proportional to v −2 . In this case, the v dependence of ]. If so, as v increases, this range covers more of the region in which the integrand is non-zero. Thus, dH/dE grows with v in a certain range. When v is large enough for the integration in (2.9) to cover all the region in which the integrand is non-zero, dH T /dE becomes constant, and so does dH/dE . This explains the step-like functional form of H a (v min ) given in (3.29), which is dH/dE with E = E a , as can be seen in the left panel of Fig. 1.
Looking at (2.9) which defines dH T /dE for each nuclide T , we see that the only dependence on E of the integrand is in (E R , E )G T (E R , E ). To compute ξ(v min ), we need thus a double integration, first in E R to obtain dH T /dE , and then in E , after summing all dH T /dE contributing to dH/dE . If we exchange the order of integration, performing the E integration first, we see that as E R increases, for E R very small the integrand G T will be zero within the E integration range. Then, the non-zero portion of G T within the E integration range will increase, then be entirely contained, and then decrease and become zero again. Thus, the resulting integrand in E R will be slowly changing in the E R range in which it is non-zero. As v min increases, the integration range in E R encompasses more of the slowly varying integrand, resulting in a smoothly increasing function ξ(v min ), as shown in the right panel of Fig. 1. Once v min becomes large enough for the integration range in E R to cover all the non-zero part of the integrand in ξ(v min ), this function becomes constant (see Fig. 1).
Let us return to study the discontinuity points of the best-fitη function which happen at the zeros of q(v min ) given in (3.31). For elastic (δ = 0) or endothermic (δ > 0) scattering, there is a region at small v min values where both H a and ξ vanish. Looking at (2.9), when E T + R (v) is below the experimental threshold, the integrand, in particular the acceptance , is zero, thus dH/dE = 0. In this v min region the condition q(v min ) = 0 is trivially satisfied, and the shape of the best-fitη function is undetermined.
Changes inη(v min ) produce changes in γ a [η]. For values of γ a which make the second term of the right hand side of (3.30) large enough to reach the first term 2ξ(v min ) from below, q(v min ) (see (3.31)) has non-trivial zeros where ξ and H a are non-zero. The non-negativity of q(v) ≥ 0 means that q(v) = 0 only when the monotonically increasing function ξ touches the step-like a H a /γ a function from above. Since a H a /γ a has N O steps, this can happen only at a number of v min values smaller than or equal to N O . Examples of these functions ξ, a H a /γ a , and q will be shown below in Figs. 5, 7 and 8.
To guess the generic shape of the ξ(v min ) and H a (v min ) functions for differential cross sections whose WIMP speed v dependence is different from ∝ v −2 , let us assume the differential cross section for a given E R behaves as v (n−2) for large values of v. One such example is that of WIMPs interacting with nuclei through a magnetic dipole moment, where n = 2 at large v (see (3.9) of [39]). In this case, from (2.9) one can easily see that the shapes of the functions H a (v)/v n and ξ(v)/v n should be similar to those of H a (v) and ξ(v) for a differential cross section proportional to v −2 . Therefore, the argument given above can be used for q(v)/v n , whose zeros are the same as those of q(v), leading to the same conclusions.
When the target consists of several elements, each H a has multiple step-like features, one for each element. This is illustrated in Fig. 2 for a fictitious CDMS-II-like detector composed of equal mass fractions of Si and Ge. We see in the left panel of Fig. 2 that for each of the three elements there are two step-like features in H a . One may naively expect that because in this case there are 2N O step-like features in a H a /γ a , the number of zeros of the function q(v min ) would equally double. However, this is not the case. Because ξ and H a are independent ofη, by changingη and thus γ a in general one can make at most one of the two steps per observed event in a H a /γ a touch the function ξ(v min ) from below. Thus the number of zeros of q(v min ) is still at most N O . This can be seen in the right panel of Fig. 2.
In summary, in this section we proved that theη function maximizing the extended likelihood is a piecewise constant function with a number of steps smaller than or equal to the number N O of observed events.

EHI analysis in the v min -space
In this section we show how to find the solution to the maximization of the extended likelihood in the EHI method, in the v min -η space. As shown in the previous section, the best-fit function, which we callη BF (v min ) from now on, is a piecewise constant function with at most N O steps (note that in the statistics literature the subscript "ML" for maximum likelihood is usually used instead of "BF"). We will also find a statistically meaningful confidence band aroundη BF (v min ), which we will define as a pointwise confidence band.

Finding the best-fit functionη BF (v min )
The properties of theη function maximizing the extended likelihood we have proven in the previous section can be utilized to findη BF . We can define a function f (4.1) The piecewise constant functionη (N O ) is defined as where a = 1, . . . , N O . Here we assume v min and v a 's are all larger than v δ , and the constraints (3.24)η a ≤η b for a > b are satisfied. Since the functionη cannot change after the last step and it must reach zero for large v min , it must be zero for v min > v N O . We do not specify the value ofη (N O ) below the minimum v δ since the event rate is independent of it. Notice that (4.2) requires the definition of v 0 . We define v 0 = v δ for convenience. From these definitions and the theorem we have proven, we can easily obtainη BF and L min , the minimum value of the functional L[η], by finding v BF and η BF that minimize f and with N BG given in (3.3). Defining the N a and M ai functions of v a as and the fixed constants we can write (4.5) as The minimization of the function f can be done numerically using a global minimization algorithm. In the implementation, we express f in terms of lnη a and use lnη a instead ofη a as variables, sinceη a span many orders of magnitude. This also accounts for theη a > 0 constraints, leaving only the constraints in (4.10) and (4.11) to be enforced in the minimization. Note that in general minimization algorithms may attempt to evaluate the function in regions where the constraints are not satisfied, and in these regions the function f is not well defined, thus a fictitious function must be used that grows smoothly with the absolute value of the unsatisfied constraints in (4.10) and (4.11).

Finding the confidence band
In order to compare theη BF we obtained with the upper limits imposed by other experiments, we need a way to represent the uncertainty in our determination ofη BF . This can be achieved by finding a region in the v min -η space satisfying a certain statistical criterion, analogous to the confidence interval in the usual analysis with a fixed halo model. The region in the v min -η space which is densely filled by the family of all possibleη(v min ) curves satisfying with given ∆L * , is a natural candidate to examine. The condition in (4.12) defines a twosided interval aroundη BF for each v min value, and the collection of those intervals forms a pointwise confidence band in v min -η space. From now on we will call it simply "the confidence band". Conceptually, computing the confidence band is a straightforward procedure, but in practice, finding all theη functions satisfying (4.12) and constructing the band from them is not possible. If the same band can be formed by a much smaller subset of them, and this subset is much easier to find than the whole set, the construction of the band would be practical.
As a possible subset, let us consider the set ofη functions which minimize L[η] subject to the constraintη (v * ) =η * . If ∆L c min (v * ,η * ) is larger than a chosen ∆L * , it simply means that the point (v * ,η * ) lies outside of the confidence band. If it were inside the band, there should be at least oneη function passing through the (v * ,η * ) point, for which ∆L[η] ≤ ∆L * , in contradiction with the fact that ∆L c min (v * ,η * ) > ∆L * . On the other hand, if ∆L c min (v * ,η * ) ≤ ∆L * , the confidence band should cover the point (v * ,η * ) by definition. Therefore, by finding the range ofη * values which satisfy ∆L c min (v * ,η * ) ≤ ∆L * for each v * value, we can construct the band. The remaining problem is how to find an easy way of computing L c min (v * ,η * ) (and therefore ∆L c min (v * ,η * )). We will now prove that theη function minimizing L[η] subject to the constraint (4.13) should be a piecewise constant function with at most N O +1 discontinuities.
Let us rewrite the KKT conditions in (3.9)-(3.13) but now with an additional equality constraintη k =η * , (4.15) where the index k is chosen to satisfy v k min ≤ v * < v k+1 min , so that v k min can be arbitrarily close to v * for large enough K values. The additional constraint leads to the necessity of adding the term p * (η k −η * ) to the function f L in (3.8) introducing a Lagrange multiplier p * , so we define another function f L ( η, q, p * ) as (4.17) and use it to derive new KKT conditions. The new KKT conditions consist of the unconstrained minimization conditions of the function f L ( η, q, p * ) with respect to the parameters η, q and p * , plus the complementary conditions, which are the same as before. Therefore, besides the constraint with additional terms p * δ ki and p * δ k0 , respectively, and the constraint (4.15). Following similar steps as those in Sec. 3 from (3.14) to (3.21), the summation of (4.18) and (4.19) over In the limit of K → ∞, the first condition for theη functions minimizing L[η] subject to the constraint (4.13) becomes with ξ(v min ), H a (v min ) and γ a [η] defined in (3.28), (3.29) and (3.30), respectively. Again, the conditions in (4.21) and (3.25) tell that theη function we find is piecewise constant with discontinuities only at the isolated zeros of q(v min ). We already argued that ξ(v min ) can touch the function N O a=1 H a /γ a from above at a number of points equal to or less than the number of observed events N O . Since p * θ(v min − v * ) introduces another step on the right hand side of (4.22), with the right p * value q(v min ) could have an additional zero. Thus theη(v min ) function minimizing L[η] subject to the constraint (4.13) is piecewise constant with at most N O + 1 discontinuities.
Using a functionη of this type in (4.12) for each (v * , η * ), we minimize L[η] in (3.5) as in Sec. 4.1 to compute ∆L c min (v * ,η * ) in (4.14). We define a function f can again be done numerically using a global minimization algorithm, subject to the same constraints as in (4.10)- (4.11), where in addition we keep (v i ,η i ) fixed at (v * ,η * ). As before, in our implementation of the algorithm we write f

Statistical interpretation of the confidence band
From the procedure described above we can get both the best-fitη function,η BF (v min ), and the confidence band. For a quantitative assessment of the compatibility with other experimental data, we need to know the statistical meaning of a particular choice for ∆L * . One may be tempted to interpret ∆L as −2 times the logarithm of the likelihood ratio with 2N O parameters, since we parametrized theη function with 2N O parameters (plus v * and η * which are fixed each time) to obtain the confidence band. However, this is not the proper interpretation. Note that the defining properties of the best-fitη BF and the band do not rely on how we compute them.
Let us return to the definition of ∆L c min (v * ,η) and use again the discretization procedure introduced to derive the KKT conditions in Sec. 3. With a discretization of v min we can define a likelihood function L(η 0 , . . . ,η K−1 ) = L[η(v min ; η)] (4.23) withη(v min ; η) defined in (3.6). With this discretization, ∆L c min (v * ,η) defined in (4.14) is replaced by a collection of functions ∆L c,k Here the η i values maximize the function L subject to the constraintη k =η * , while η i maximize the function L without the constraint. Thus ∆L c,k min is −2 ln of the profile likelihood ratio (see e.g. equation (38.53) of [44]) with only one parameterη k =η * . Notice that the continuous parameter v * becomes the discrete index k, and is no longer an additional parameter. According to Wilks' theorem, the distribution of ∆L c,k min approaches a chi-squared distribution with one degree of freedom, in the limit where the data sample is very large [44,45] (and this is independent of the value of K). In short, this amounts to profiling the likelihood at fixed v * over the nuisance parametersη 0 , . . . ,η k−1 ,η k+1 , . . . ,η K . In this language, the fact that the likelihood ratio in (4.24) has one degree of freedom is proven mathematically in corollary 2 of [46] even for the case K → ∞.
By taking large enough K, we can make v k min and v k+1 min arbitrarily close to v * , and for each v * , ∆L c,k min (η * ) approaches ∆L c min (v * ,η * ). Therefore, the natural interpretation of the band is the collection of the confidence intervals inη for each v min value, which defines a pointwise confidence band, based on a profile likelihood ratio with one degree of freedom. With this interpretation, we can now compare the confidence band with other limits or measurements in a statistically meaningful way. If any upper limit at some CL crosses the lower boundary of the band, at some other CL, it means that the two data, providing the limit and the band, are incompatible at their respective CLs.
The Wilks theorem ensures the asymptotic behavior of the distribution of ∆L c min as the number of events becomes large, and the 3 observed number in CDMS-II-Si may not be a large enough number to ensure that ∆L follows the classical chi-squared distribution. Assuming that ∆L c min is chi-squared distributed, the choices of ∆L * = 1.0 and ∆L * = 2.7 correspond to the confidence intervals ofη at the 68% and 90% CL, respectively, for each v min . The question on the convergence to the true confidence interval is also present in the analysis of the CDMS-II-Si data with a fixed halo model, if one uses the confidence interval estimator derived from the same likelihood function [22,36].
In [29], ∆L * = 9.2 is used to compute the confidence band at the 90% CL, a value much larger than our choice, corresponding to the 90% CL limit for a chi-squared distribution with five degrees of freedom, resulting from a numerical Monte Carlo simulation. However, in the simulation described in [29], only fake data with three simulated events are selectively generated instead of allowing for any number of simulated events, as would be necessary to avoid generating a biased data set. Yet, allowing the number of simulated events to vary does not seem compatible with the ∆L definition in Eq. (2.16) of [29]. In this equation, √ ∆L is defined as the radius of a hyper-ellipsoid in a 6-dimensional parameter space defined by the positions and heights of the three steps in the best-fitη BF for a number of simulated events N O = 3. This leads to a chi-squared distribution for ∆L with 2N O − 1 = 5 degrees of freedom (because there is one constraint). Allowing the number of simulated events N O to change, the dimension of theη BF parameter space is not fixed to 6, but would be 2N O , leading to a number of degrees of freedom 2N O − 1 that would change from simulated set to simulated set.

Application to the CDMS-II-Si data
In this section, we apply the EHI method to the three events observed by CDMS-II-Si in their signal region with recoil energies 8.2, 9.5, and 12.3 keV. We follow the procedure developed above. We use ∆L * = 1.0 and 2.7 for the 68% CL and 90% CL confidence bands, and compare the bands with the 90% CL upper limits from CDMSlite [16], SuperCDMS [18], LUX [17], XENON100 [10] data, as well as the CDMS-II-Si data itself. The data analysis to obtain the upper limits in this paper is the same as in [36]. Recent analyses of the CDMS-II-Ge data [47,48] use the same data set of [11], shown in [40] to provide weaker upper limits in the halo-independent analysis than SuperCDMS (and thus not included here).
The data analysis described in this paper is implemented in the CoddsDM software [49], an open-source Python program for comparing the data from direct detection experiments.

Elastic SI scattering
In this subsection we present the result of our analysis for elastic scattering with isospinconserving f n /f p = 1 and with isospin-violating f n /f p = −0.7 (Xe-phobic) and f n /f p = −0.8 (Ge-phobic) SI interactions [33][34][35]. They are shown in the left and right panels of Fig. 3 and in Fig. 4, respectively, for a WIMP of mass m = 9 GeV. This value of the mass is within the 68% CL CDMS-II-Si regions obtained assuming the Standard Halo Model (SHM) in [40] and [36]. Figs. 3 and 4 show the best-fitη BF (dark red line) and the 68% and 90% CL confidence bands derived from the CDMS-II-Si data shaded in darker and lighter red, respectively. Despite starting with three observed events, thus three steps inη, theη BF has only two steps, located at the zeros of the q(v min ) function shown in Fig. 5.  Figs. 3 and 4 show the 90% CL CDMSlite (cyan), SuperCDMS (dark yellow), LUX (magenta), XENON100(blue) and CDMS-II-Si (red) upper limits, and the red crosses derived from the halo-independent analysis using binned data [40]. The crosses represent 68% CL intervals of averagedη and the corresponding v min ranges for the CDMS-II-Si data with three equally spaced bins spanning the recoil energy range from 7 to 13 keV. Notice that the 68% CL crosses are similar in vertical extent to the 68% CL confidence band. Notice also that the 90% CL CDMS-II-Si limit follows closely the upper limit of the 90% CL confidence band.  As one can see in the left panel of Fig. 3 for f n /f p = 1, the 68% CL confidence band is excluded in the v min range from 370 to 560 km/s, by the combination of the 90% CL CDMSlite, SuperCDMS, and LUX upper limits. The lower boundary of the 90% CL confidence band is also cut at 450 km/s by the SuperCDMS 90% CL limit. Since there is no single continuous curve within the 90% CL confidence band which does not cross any 90% CL upper limit, we conclude that the potential signal and limits are incompatible for any halo model.

CDMS II Si
On the other hand, in the right panel of Fig. 3  confidence band remains below all the 90% CL upper limits. This shows that for SI interactions with f n /f p = −0.7 the CDMS-II-Si signal is consistent with the null results of all other experiments.
The choice of f n /f p = −0.8 (Fig. 4) disfavors maximally the Ge limits (while f n /f p = −0.7 disfavors maximally Xe couplings instead). Thus, as expected, in Fig. 4 the SuperCDMS limit is weakened with respect to Fig. 3, but the LUX upper limits exclude almost completely both confidence bands.
The dashed gray curves in Figs. 3 and 4 are theη functions assuming the SHM for WIMP-proton cross sections σ p = 10 −41 cm 2 and σ p = 10 −40 cm 2 in the left and right panels of Fig. 3, and σ p = 10 −39 cm 2 in Fig. 4. For m = 9 GeV, these σ p values are within the 68% and 90% CL CDMS-II-Si regions obtained assuming the SHM (in Fig. 1 of [40] and in Fig. 4 of [36], respectively). In the analyses of [40] and [36] assuming the SHM, the m and σ p choices for f n /f p = 1 and f n /f p = −0.8 interactions are shown to be rejected, while the choice for f n /f p = −0.7 interactions are allowed by all 90% upper limits. The same conclusions are evident in Figs. 3 and 4, where the dashed gray lines are above the upper limits for f n /f p = 1 and f n /f p = −0.8 and below them for f n /f p = −0.7.

Inelastic SI scattering
In this subsection we present the results of the analysis for the exothermic Ge-phobic WIMP proposed in [36,37] as an interpretation of the CDMS-II-Si data, shown in Fig. 6.This choice of f n /f p = −0.8 suppresses maximally the coupling to Ge. The limits due to Xe are weakened by the exothermic nature of the scattering, which disfavors heavier targets (such as Xe) with respect to lighter ones (such as Si) [36], leaving in principle Ge limits as the most important.  by the present halo-independent analysis, since the correspondingη functions assuming the SHM shown in Fig. 6 (dashed gray lines) escape all upper limits from experiments with null results.
The best-fitη BF functions for both Ge-phobic cases are shown in dark red in Fig. 6. They have two and one steps respectively in the left and right panels of Fig. 6, corresponding to the zeros of the q(v min ) functions in the right panels of Figs. 7 and 8 (located at v min values of 437 and 678 km/s in Fig. 7 and 792 km/s in Fig. 8).
Figs. 7 and 8 show the functions ξ (red) and N O a=1 H a (v min )/γ a (blue) in the left panels, and twice their difference, q(v min ), in the right panels, for the two Ge-phobic cases in Fig. 6.
In the previous analysis of Ge-phobic exothermic WIMP based on the SHM [36], the m and σ p parameters chosen in the current analysis are found to be compatible with the null results of all other experiments. Consistently with this result, we find a large portion of the 68% CL confidence band is below all the 90% CL upper limits imposed by all null results. Thus WIMP-nucleus scattering through Ge-phobic interaction can potentially explain the CDMS-II-Si data as a WIMP signal without any conflict with the null results of all other searches.

Conclusions
We have expanded and corrected a recently proposed extended maximum likelihood haloindependent (EHI) method to analyze unbinned direct dark matter detection data. Instead of the recoil energy E R as independent variable, we use v min , the minimum speed a dark matter particle must have to impart a given recoil energy to a nucleus. An earlier version of the method, using E R as variable, was introduced in [29]. The use of v min as variable allows to incorporate in the analysis any type of target composition and of WIMP-nucleus interaction, including elastic and inelastic collisions. This is not possible using E R . The advantages of using v min instead of E R in a halo-independent analysis was first pointed out in [21] and extensively used later on [23,24,36,39,40].
The EHI method uses unbinned direct dark matter detection data. The predicted differential rate as a function of the observed energy E in all direct detection experiments can be written in terms of a common functionη(v min ) (see (2.14)). The aim of the method is to find theη function that provides the best fit for the unbinned data. We have proven rigorously that the best-fitη function,η BF (v min ), is a piecewise constant function with a number of discontinuities smaller than or equal to the number of observed events N O . We have also rigorously defined a two-sided pointwise confidence band with a clear statistical meaning, as a collection of confidence intervals inη for every v min value. We can assign a confidence level to the band and thus compare with upper limits given at particular confidence levels. This allows to quantitatively assess the compatibility of the unbinned data with upper limits due to null results.
Using this method, we analyzed the compatibility of the three candidate events found by CDMS-II-Si with the best available upper bounds, for spin-independent (SI) WIMP-nucleus interactions with different neutron to proton coupling ratio f n /f p values and either elastic or exothermic inelastic scattering. We found the best-fitη BF function and 68% and 90% CL confidence bands. We chose values of the WIMP mass within the CDMS-II-Si regions in the m-σ p plane that we had found in previous studies [36,40] assuming the Standard Halo Model (SHM). Our results for f n /f p = 1, WIMP mass m = 9 GeV and elastic scattering are shown in the left panel of Fig. 3. The 90% CL CDMSlite, SuperCDMS and LUX limits derived as in [36] exclude the entire 90% CL band for this candidate. This case was also studied in [29], where the best-fitη BF is very similar to ours, but the 90% CL band is much larger. In [29], only 90% CL limits derived from LUX and XENON10 data are presented. The LUX limit in [29] is similar to ours, but it does not exclude their much larger confidence band.
The right panel of Fig. 3 shows our results for f n /f p = −0.7 (Xe-phobic) and m = 9 GeV. We found that in this case a significant portion of the 68% and 90% CL confidence bands remains below all the 90% CL upper limits. Thus, a WIMP candidate with these characteristics provides an explanation for the three CDMS-II-Si events compatible with all present null results of other direct searches. This case was also studied in [29], where their best-fitη BF function has the same number and position of steps as ours, but is an order of magnitude larger. We think this difference might be due, at least in part, to the inclusion of the isotopic composition of Si in our computation, which can not be done with the method used in [29]. The LUX limit presented in [29] for this case is similar to ours, but their 90% CL band is again much larger.
The Ge-phobic f n /f p = −0.8 case, again for m = 9 GeV, is presented in Fig. 4. The 90% CL confidence band is almost completely excluded by the 90% CL LUX limit.
Our results for the Ge-phobic coupling and exothermic inelastic scattering are presented in Fig. 6, for two different values of the WIMP mass m and mass split δ: m = 3.5 GeV, δ = −50 keV and m = 1.3 GeV, δ = −200 keV. In these cases the 68% and 90% CL confidence bands are almost entirely below all the 90% CL limits. Thus, again we found compatibility between a dark matter interpretation of the CDMS-II-Si data and all null results.
In all cases studied we included the crosses derived from the CDMS-II-Si data obtained with our previous halo-independent analysis using binned data [36,40]. The crosses represent 68% CL intervals of averagedη and v min ranges corresponding to three equally-spaced bins spanning the recoil energy range from 7 to 13 keV. We found that the crosses are similar in vertical extent to the 68% CL confidence bands in all cases. This shows agreement between both types of halo-independent analyses, but the present method is much more powerful.
We found remarkable that the 90% CL limit derived from the CDMS-II-Si data itself using the Maximum Gap method, as described in [36,40] (and references therein) is almost identical to the 90% CL upper boundary of the 90% CL confidence band in all cases studied. Again, this indicates agreement between the two different analyses.
SI elastic scattering was also studied in [25] and [28], where two different statistics were used to quantify the compatibility among different direct search data sets. In [25], for isospin-conserving SI interactions and WIMP mass 7 GeV, which is slightly smaller than our choice of 9 GeV, the parameter goodness-of-fit value derived from the global likelihood of the CDMS-II-Si, SuperCDMS and LUX data has a p-value of only 0.44%. This poor compatibility level is consistent with our results. For isospin-violating interactions, [25] used slightly different parameter sets, f n /f p = −0.71, m = 6.2 GeV, and f n /f p = −0.79, m = 6.3 GeV, with corresponding p-values of 18.7% and 18.5%. Thus the compatibility is significantly improved, which is also consistent with our results. In [28], a test statistic "p joint " is proposed and calculated, said to be the upper bound on the joint probability of obtaining the outcomes of two potentially conflicting experiments. Only if the value of p joint is small there is a clear interpretation of incompatibility, but a large p joint value does not imply compatibility. For m = 9 GeV, [28] finds incompatibility between CDMS-II-Si and SuperCDMS for f n /f p = 1, but not for f n /f p = −0.7 or −0.8. In this respect, we agree.
The use of a test statistic such as defined in [25] or [28] is complementary to our method of using a confidence band and upper limits in v min −η space to assess the compatibility among different data sets.