Combined search for anomalous pseudoscalar HVV couplings in VH production and H to VV decay

A search for anomalous pseudoscalar couplings of the Higgs boson H to electroweak vector bosons V (= W or Z) in a sample of proton-proton collision events corresponding to an integrated luminosity of 18.9 inverse femtobarns at a center-of-mass energy of 8 TeV is presented. Events consistent with the topology of associated VH production, where the Higgs boson decays to a pair of bottom quarks and the vector boson decays leptonically, are analyzed. The consistency of data with a potential pseudoscalar contribution to the HVV interaction, expressed by the effective pseudoscalar cross section fractions f[a3], is assessed by means of profile likelihood scans. Results are given for the VH channels alone and for a combined analysis of the VH and previously published H to VV channels. Under certain assumptions, f[a3](ZZ)>0.0034 is excluded at 95% confidence level in the combination. Scenarios in which these assumptions are relaxed are also considered.


Introduction
The observation of a new boson [1][2][3] with a mass around 125 GeV and properties consistent with those of the standard model (SM) Higgs boson [4][5][6][7][8][9][10] has ushered in a new era of precision Higgs physics. The ATLAS and CMS collaborations at the CERN LHC have begun a comprehensive study of the boson properties. The spin-parity of the Higgs boson has been studied in H → ZZ, Zγ*, γ*γ* → 4ℓ, H → WW → ℓνℓν, and H → γγ decays [11][12][13][14][15][16], where ℓ is an electron or muon. The CDF and D0 collaborations have set limits on the pp̄ → VH production cross section (with V = W or Z) at the Tevatron, for two exotic spin-parity models of the Higgs boson [17]. In all cases, the spin-parity $J^{CP}$ of the boson has been found to be consistent with the SM prediction. Based on a study of anomalous couplings in H → ZZ → 4ℓ decays, the CMS collaboration has excluded the hypothesis of a pure pseudoscalar spin-zero boson at 99.98% confidence level (CL), while an effective pseudoscalar cross section fraction $f_{a3}^{ZZ} > 0.43$ is excluded at 95% CL (assuming a positive, real-valued ratio of scalar and pseudoscalar couplings) [15]. Under the same assumptions, the ATLAS collaboration has excluded $f_{a3}^{ZZ} > 0.11$ at 95% CL [18].
We present here the first search for anomalous pseudoscalar HVV couplings at the LHC in the topology of associated production, VH. It will be shown that the VH channels are strong probes of the structure of the HVV interaction, with sensitivity even to small anomalous couplings. The ultimate LHC sensitivity to a potential pseudoscalar interaction in these channels is expected to greatly exceed that of H → VV [19]. Due to the highly off-shell nature of the propagator in VH production, small anomalous couplings can lead to significant modifications of cross sections and kinematic features. In particular, the propagator mass, measured by the VH invariant mass, m(VH), is highly sensitive to anomalous HVV couplings [20].
Results from the VH channels are ultimately combined with those from H → VV measurements [15]. The qq̄ → VH → Vbb̄ and gg → H → VV processes involve the Yukawa fermion coupling Hff and the same HVV coupling, assuming gluon fusion production is dominated by the top-quark loop. The dominance of the gluon fusion production mechanism of the Higgs boson at the LHC is supported by experimental measurements [4][5][6][7][8][9][10]. It is interesting to consider models where the ratio of the Hbb and Htt coupling strengths in the VH and H → VV processes is not affected by the presence of anomalous contributions [21]. In such a case, it is possible to relate the cross sections of the two processes for arbitrary anomalous HVV couplings and perform a combined analysis of the VH and H → VV processes, exploiting both kinematics and the relative signal strengths of the two processes. The H → VV signal strength is relatively well measured and can provide a strong constraint on the VH signal strength. For modest values of $f_{a3}^{ZZ}$, the VH signal strength is constrained to large values. The added constraint thereby significantly improves the sensitivity to anomalous couplings.
In the following, we consider only the interactions of a spin-zero boson with the W and Z bosons, for which the scattering amplitude is parameterized as

$$A(HVV) \sim \left[ a_1^{HVV} + \frac{\kappa_1^{HVV} q_{V1}^2 + \kappa_2^{HVV} q_{V2}^2}{\left(\Lambda_1^{HVV}\right)^2} \right] m_{V1}^2\, \epsilon_{V1}^{*} \epsilon_{V2}^{*} + a_2^{HVV} f^{*(1)}_{\mu\nu} f^{*(2),\mu\nu} + a_3^{HVV} f^{*(1)}_{\mu\nu} \tilde{f}^{*(2),\mu\nu}, \qquad (1)$$

where the $a_i^{HVV}$ are arbitrary complex coupling parameters which can depend on the V1 and V2 squared four-momenta, $q_{V1}^2$ and $q_{V2}^2$; $f^{(i)\mu\nu}$ is the field strength tensor of a gauge boson with momentum $q_{Vi}$ and polarization vector $\epsilon_{Vi}$, given by $\epsilon_{Vi}^{\mu} q_{Vi}^{\nu} - \epsilon_{Vi}^{\nu} q_{Vi}^{\mu}$; $\tilde{f}^{(i)}_{\mu\nu}$ is the dual field strength tensor, given by $\frac{1}{2}\epsilon_{\mu\nu\rho\sigma} f^{(i)\rho\sigma}$; $m_{V1}$ is the pole mass of the vector boson; and $\Lambda_1^{HVV}$ is the energy scale where phenomena not included in the SM become relevant [19]. The $a_1^{HVV}$, $\kappa_i^{HVV}$, and $a_2^{HVV}$ terms represent parity-conserving interactions of a scalar, while the $a_3^{HVV}$ term represents a parity-conserving interaction of a pseudoscalar. In the SM, $a_1^{HVV} = 2$, which is the only nonzero coupling at tree level. All other terms in Eq. (1) are generated within the SM by loop-induced processes at levels below current experimental sensitivity. Therefore, any evidence for these terms in the available data should be interpreted as evidence of new physics.
We search for an anomalous $a_3^{HVV}$ term of the HVV interaction, assuming that the $\kappa_i^{HVV}$ and $a_2^{HVV}$ terms are negligible. Throughout the remainder of the paper, the term "scalar interaction" will be used to describe the $a_1^{HVV}$ term. The effective pseudoscalar cross section fraction for process j (WH, ZH, WW, or ZZ) is defined as

$$f_{a3}^{j} = \frac{|a_3^{HVV}|^2\, \sigma_3^j}{|a_1^{HVV}|^2\, \sigma_1^j + |a_3^{HVV}|^2\, \sigma_3^j}, \qquad (2)$$

where $\sigma_i^j$ is the production cross section for process j with $a_i^{HVV} = 1$ and all other couplings assumed to be equal to zero. A superscript is not included when making a general statement not related to a particular process. The purely scalar (pseudoscalar) case corresponds to $f_{a3} = 0$ ($f_{a3} = 1$). The signal strength parameter $\mu_j$ for process j can also be defined in terms of the $a_i^{HVV}$ as

$$\mu_j \propto |a_1^{HVV}|^2 + |a_3^{HVV}|^2\, \sigma_3^j / \sigma_1^j. \qquad (3)$$

For a given set of coupling constants, the physical observables $f_{a3}^j$ and $\mu_j$ vary for different processes as a result of the dependence on the $\sigma_i^j$. The $f_{a3}^{ZH}$ and $f_{a3}^{WH}$ variables are defined with respect to the ZH and WH production cross sections in √s = 8 TeV pp collisions, whereas the $f_{a3}^{VV}$ variables are defined with respect to the cross section times branching fraction for the corresponding pp → H → VV process. In the latter case, the dependence on the pp → H cross section cancels.
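As an illustration of the definition above, the following minimal Python sketch evaluates the effective pseudoscalar fraction from a pair of couplings, assuming the standard form in which each coupling enters as its squared modulus weighted by the corresponding unit-coupling cross section and no interference term contributes. The function names and the cross section values are hypothetical and serve only as placeholders; they are not taken from the analysis.

```python
def f_a3(a1, a3, sigma1, sigma3):
    """Effective pseudoscalar cross section fraction for one process.

    sigma1 (sigma3) is the cross section obtained with a1 = 1 (a3 = 1) and
    all other couplings set to zero. Couplings may be complex numbers.
    """
    pseudo = abs(a3) ** 2 * sigma3
    scalar = abs(a1) ** 2 * sigma1
    return pseudo / (scalar + pseudo)


def coupling_ratio(f, sigma1, sigma3):
    """Invert the definition: return |a3 / a1| for a fraction 0 <= f < 1."""
    return (f / (1.0 - f) * sigma1 / sigma3) ** 0.5


# Placeholder cross sections, chosen for illustration only.
SIGMA1, SIGMA3 = 1.0, 0.5

print(f_a3(2.0, 0.0, SIGMA1, SIGMA3))  # pure scalar case
print(f_a3(0.0, 1.0, SIGMA1, SIGMA3))  # pure pseudoscalar case
print(coupling_ratio(0.3, SIGMA1, SIGMA3))
```

Because the cross section ratio differs from process to process, the same pair of couplings generally maps to a different fraction in each channel.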

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [22].
Ref. [15] is used to determine $f_{a3}$ confidence intervals. The discriminant of the boosted decision tree (BDT) described in Ref. [23] serves as one dimension of the templates. This BDT is trained separately for the WH and ZH channels to exploit various kinematic features typical of signal and background, and the correlations among observables. The b-tagging likelihood discriminants of the jets used to construct the Higgs boson candidate, the invariant mass of the Higgs boson candidate, and the angular separation between final-state leptons and jets are the most important variables in terms of background rejection. Although initially trained to separate background from a scalar Higgs boson signal, it has been demonstrated with simulated events that the BDT is also effective for signals with anomalous $f_{a3}$ values. The second dimension of the templates is m(VH). Effectively, the BDT dimension provides a background-depleted region at high values of the BDT discriminant with which to test various signal hypotheses using the m(VH) distribution.
Signal templates in the $\vec{x}$ = {BDT, m(VH)} plane are constructed for arbitrary values of $f_{a3}$ from a linear superposition of templates representing the pure scalar ($P_{0^+}(\vec{x})$) and pseudoscalar ($P_{0^-}(\vec{x})$) hypotheses and a template ($P^{int}_{0^+,0^-}(\vec{x}; \phi_{a3})$) that accounts for interference between the $a_1^{HVV}$ and $a_3^{HVV}$ terms in Eq. (1), as follows:

$$P_{sig}(\vec{x}; f_{a3}, \phi_{a3}) = (1 - f_{a3})\, P_{0^+}(\vec{x}) + f_{a3}\, P_{0^-}(\vec{x}) + \sqrt{f_{a3}(1 - f_{a3})}\, P^{int}_{0^+,0^-}(\vec{x}; \phi_{a3}). \qquad (4)$$

The phase between the $a_1^{HVV}$ and $a_3^{HVV}$ couplings is represented by $\phi_{a3}$. The interference contributions to the BDT discriminant and m(VH) distributions are negligible, as verified with simulated events. Therefore the last term in Eq. (4) is ignored in the VH channels. Equation (4) is also used to parameterize the H → VV signals. Anomalous couplings that result from loops with particles much heavier than the Higgs boson are real valued, allowing phases of 0 and π. In the H → VV channels, we assume $\phi_{a3} = 0$. The resulting templates are used to perform profile likelihood scans [24] to assess the consistency of various signal hypotheses with the data. One-dimensional profile likelihood scans of $f_{a3}$ are performed (where µ is profiled), as well as two-dimensional scans in the µ versus $f_{a3}$ plane.
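The template construction and the one-dimensional scan described above can be sketched with a toy binned model. This is only an illustration under simplifying assumptions: the interference template is dropped, there are no nuisance parameters, the signal strength is profiled on a coarse grid rather than with a minimizer, and all templates and yields are invented numbers rather than values from the analysis.

```python
import math


def signal_template(f_a3, p_scalar, p_pseudo):
    """Mix normalized scalar and pseudoscalar templates, neglecting interference."""
    return [(1.0 - f_a3) * s + f_a3 * p for s, p in zip(p_scalar, p_pseudo)]


def nll(n_obs, expected):
    """Binned Poisson negative log-likelihood, up to model-independent constants."""
    return sum(e - n * math.log(e) for n, e in zip(n_obs, expected))


def profile_scan(n_obs, bkg, p_scalar, p_pseudo, n_sm, f_grid, mu_grid):
    """Return -2 Delta ln L for each f_a3 value, minimizing over mu on a grid."""
    profiled = []
    for f in f_grid:
        shape = signal_template(f, p_scalar, p_pseudo)
        profiled.append(min(
            nll(n_obs, [b + mu * n_sm * s for b, s in zip(bkg, shape)])
            for mu in mu_grid
        ))
    best = min(profiled)
    return [2.0 * (value - best) for value in profiled]


# Toy inputs (hypothetical): three bins ordered by increasing m(VH).
p_scalar = [0.6, 0.3, 0.1]
p_pseudo = [0.1, 0.3, 0.6]
bkg = [50.0, 20.0, 5.0]
n_sm = 10.0  # expected signal yield for mu = 1

# Asimov-like data set for a pure scalar signal with mu = 1.
asimov = [b + n_sm * s for b, s in zip(bkg, p_scalar)]

f_grid = [i / 20 for i in range(21)]
mu_grid = [0.05 * i for i in range(1, 61)]
scan = profile_scan(asimov, bkg, p_scalar, p_pseudo, n_sm, f_grid, mu_grid)
for f, q in zip(f_grid[::5], scan[::5]):
    print(f"f_a3 = {f:.2f}  -2 Delta ln L = {q:.3f}")
```

With data generated from the pure scalar hypothesis, the scan has its minimum at zero pseudoscalar fraction and rises as the harder pseudoscalar shape is mixed in.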
In order to combine channels that depend on the $a_i^{HZZ}$ with those depending on the $a_i^{HWW}$, some assumption on the relationship between the couplings is required, and custodial symmetry is assumed ($a_1^{HZZ} = a_1^{HWW}$). It is further assumed that $a_3^{HWW} = a_3^{HZZ}$. With these assumptions, the $f_{a3}$ and µ values in the WH and ZH channels are related by

$$f_{a3}^{ZH} = \frac{\Omega^{ZH,WH}\, f_{a3}^{WH}}{1 + \left(\Omega^{ZH,WH} - 1\right) f_{a3}^{WH}} \qquad (5)$$

and

$$\frac{\mu_{ZH}}{\mu_{WH}} = 1 + f_{a3}^{WH} \left(\Omega^{ZH,WH} - 1\right), \qquad (6)$$

where

$$\Omega^{i,j} = \frac{\sigma_3^i / \sigma_1^i}{\sigma_3^j / \sigma_1^j}. \qquad (7)$$

The $\sigma_1/\sigma_3$ ratios given by the JHUGEN 4.3 [19,25,26] event generator and values of $\Omega^{i,j}$ are given in Tables 1 and 2, respectively.
In order to improve the sensitivity to anomalous couplings, results from the VH channels are combined with those from H → VV [15]. We assume the signal yield in the H → VV analysis to be dominated by gluon fusion production with negligible contamination from vector boson fusion or VH production, as in Ref. [15]. Provided that the ratio of the Hbb and Htt coupling strengths is given by the SM prediction, Eq. (6) can be used to relate the signal strength in the VH and H → VV analyses, with an appropriate change of indices (replacing 'WH' with 'ZZ' to relate the ZZ and ZH channels, or 'ZH' with 'WW' to relate the WW and WH channels). In the combination of the WH and H → WW channels, the ratio of the signal strengths $\mu_{WH}/\mu_{WW}$ increases linearly from 1 to 173 as $f_{a3}^{WW}$ increases from 0 to 1, according to Eq. (6). The WH signal strength has been measured by CMS to be 1.1 ± 0.9 [23], and for H → WW it has been measured to be 0.76 ± 0.21 [13]. Thus, for intermediate and large values of $f_{a3}^{WW}$ it is not possible to reconcile the expected signal yield with data in both channels simultaneously. A similar effect occurs in a combination of the ZH and H → ZZ channels, where the ratio of the signal strengths $\mu_{ZH}/\mu_{ZZ}$ rises sharply with $f_{a3}^{ZZ}$.
However, an anomalous ratio of the Hbb and Htt coupling strengths spoils the relationship in Eq. (6). We therefore perform two interpretations of the VH and H → VV combination; one interpretation in which this relationship is enforced, and one interpretation in which the signal strengths in the VH and H → VV channels are allowed to vary independently. These are referred to as the 'correlated-µ' and 'uncorrelated-µ' combinations, respectively.
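The relations between channels used in the correlated-µ interpretation follow from the definition of the effective fraction once the couplings are assumed equal across processes. The sketch below writes them for a generic pair of processes i and j, with Ω denoting the ratio of σ3/σ1 in process i to that in process j. The only number taken from the text is the value 173 for the WH and WW pair, inferred from the statement that the signal strength ratio rises from 1 to 173; the function names are hypothetical.

```python
def convert_f_a3(f_j, omega_ij):
    """Translate the pseudoscalar fraction of process j into that of process i."""
    return omega_ij * f_j / (1.0 + (omega_ij - 1.0) * f_j)


def mu_ratio(f_j, omega_ij):
    """Signal strength ratio mu_i / mu_j implied by equal couplings."""
    return 1.0 + f_j * (omega_ij - 1.0)


OMEGA_WH_WW = 173.0  # inferred from the quoted range of mu_WH / mu_WW

for f_ww in (0.0, 0.01, 0.1, 1.0):
    f_wh = convert_f_a3(f_ww, OMEGA_WH_WW)
    ratio = mu_ratio(f_ww, OMEGA_WH_WW)
    print(f"f_a3(WW) = {f_ww:<5} f_a3(WH) = {f_wh:.3f}  mu_WH/mu_WW = {ratio:.2f}")
```

In this sketch a fraction of only 0.01 in the WW process already corresponds to a fraction of about 0.64 in WH and to a WH signal strength 2.72 times that of WW, which shows how a well-measured H → VV signal strength constrains small anomalous couplings when the two are tied together.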

Simulation
Simulated qq̄ → VH signal events are generated for pure scalar and pseudoscalar hypotheses with the leading-order (LO) event generator JHUGEN, assuming a mass $m_H$ = 125.6 GeV. The simulated event sample is reweighted based on the vector boson $p_T$ to include corrections up to next-to-next-to-LO and next-to-LO (NLO) in the QCD and electroweak (EW) couplings, respectively [27][28][29][30][31]. These corrections are derived for a scalar Higgs boson, and applied to both scalar and pseudoscalar simulated event samples.
The gg → ZH process includes diagrams with quark triangle and box loops, as shown in Fig. 1. These diagrams interfere destructively with one another [32]. The box diagram contains no HVV vertex. The triangle diagram does, but is unaffected by the $a_3^{HVV}$ term in Eq. (1). The triangle diagram mediated by a CP-odd HVV interaction is completely anti-symmetric under the reversal of the direction of loop momentum flow; the diagrams with opposite loop momentum flow therefore perfectly cancel one another. As the $a_1^{HZZ}$ coupling varies within a profile likelihood scan, the box contribution remains fixed while the triangle contribution and the interference must be varied accordingly. This is accomplished by reweighting the simulated gg → ZH event sample to have the correct m(VH) distribution at the generator level, including interference effects. This reweighting is based on results obtained with the VBFNLO event generator [32,33], modified for this analysis to allow variation of the Hff and HZZ coupling strengths.
Simulated background event samples are generated with a variety of event generators. Diboson, W+jets, Z+jets, and tt̄ samples are generated with MADGRAPH 5.1 [34], while POWHEG 1.0 [35] is used to generate single top quark samples, as well as the gluon-initiated contribution to ZH production (gg → ZH). The HERWIG++ 2.5 [36] generator is used along with alternative matrix element generators to produce additional simulated background samples to assess the systematic uncertainty related to event simulation accuracy, as described in Section 6.
The PYTHIA 6.4 [37] and HERWIG++ generators are used to simulate parton showering and hadronization. Detector simulation is performed with GEANT4 [38]. Uncorrelated proton-proton collisions occurring in the same bunch crossing as the signal event (pileup) are overlaid on the hard interaction, in accord with the distribution observed in data. Corrections are applied to the simulation in order to account for differences in object reconstruction efficiencies and resolutions with respect to the data.
Control regions in data are defined in Ref. [23], from which normalization scale factors for the dominant backgrounds are derived. A simultaneous fit to data across control regions is performed to extract the scale factors, which are applied here. The shape of the W(V) boson transverse momentum $p_T$ distribution is corrected in the simulated tt̄ (V+jets) event sample, based on a fit to data in a background-enriched control region.

Object and event selection
All objects are reconstructed using a particle-flow (PF) approach [39,40]. Among all reconstructed primary vertices satisfying basic quality criteria, the vertex with the largest value of $\sum p_T^2$ is selected. Electrons are reconstructed from inner detector tracks matched to calorimeter superclusters, and selected with a multivariate identification algorithm [41]. Electrons are required to have $p_T$ > 30 GeV and pseudorapidity |η| < 2.5, with a veto applied to the barrel-endcap transition region (1.44 < |η| < 1.57), where electron reconstruction is sub-optimal. Muons are reconstructed from inner detector tracks matched to tracks reconstructed in the muon system, and selected with a cut-based identification algorithm [42]. Muons are required to have $p_T$ > 20 GeV and |η| < 2.4. Both electrons and muons are required to be well isolated from other reconstructed objects. Jets are reconstructed using the anti-$k_T$ algorithm [43], with a distance parameter of 0.5, from the reconstructed objects, after removing charged objects with a trajectory inconsistent with production at the primary vertex. Additionally, the energy contribution from neutral pileup activity is subtracted with an area-based approach [44].
Table 3: Summary of the event selection criteria. Numbers in parentheses refer to the high-boost region defined in the text.
Events are categorized based on the flavour and number of charged leptons into four channels. Events with two same-flavour, opposite-sign electrons (muons) are assigned to the Z → ee (Z → µµ) channel. Events with one electron (muon) and large $E_T^{miss}$ are assigned to the W → eν (W → µν) channel. In the W → ℓν (Z → ℓℓ) channels, Higgs boson candidates are constructed from the pair of jets (referred to as $j_1$ and $j_2$) with the largest vector $p_T$ sum among jets with $p_T$ > 30 (20) GeV and |η| < 2.5. The Z boson candidates are constructed from lepton pairs whose invariant mass is consistent with the Z boson mass. The W boson candidates are constructed by combining the momentum of the identified lepton with the event $E_T^{miss}$, and calculating the neutrino momentum along the beam axis based on a W boson mass constraint. To suppress contributions from QCD multijet events, in the W → ℓν channels the magnitude of the $E_T^{miss}$ vector must exceed 45 GeV and it must be separated in direction from the charged lepton by less than π/2 radians in azimuth. In addition, the Higgs boson candidate $p_T$ must exceed 100 GeV.
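The W boson mass constraint mentioned above leads to a quadratic equation for the longitudinal neutrino momentum. The sketch below shows one common way to solve it, treating the lepton and neutrino as massless and identifying the missing transverse momentum with the neutrino. The W mass value, the handling of a negative discriminant, and the rule for choosing between the two solutions are assumptions made for this illustration; the text does not specify the conventions used in the analysis.

```python
import math

M_W = 80.4  # GeV, approximate W boson mass (assumed value for this sketch)


def neutrino_pz(lep, met_x, met_y, m_w=M_W):
    """Return both neutrino pz solutions of the W mass constraint.

    lep is (px, py, pz) of the charged lepton in GeV, treated as massless.
    If the discriminant is negative, it is set to zero, so both solutions
    coincide (an assumed convention).
    """
    lx, ly, lz = lep
    pt2 = lx * lx + ly * ly
    e_lep = math.sqrt(pt2 + lz * lz)
    a = 0.5 * m_w * m_w + lx * met_x + ly * met_y
    disc = a * a - pt2 * (met_x * met_x + met_y * met_y)
    root = e_lep * math.sqrt(max(disc, 0.0))
    return ((a * lz - root) / pt2, (a * lz + root) / pt2)


def pick_pz(solutions):
    """Choose the solution with the smaller magnitude (assumed convention)."""
    return min(solutions, key=abs)


# Example with invented kinematics (GeV).
solutions = neutrino_pz((40.0, 10.0, 25.0), -20.0, 30.0)
print(solutions, pick_pz(solutions))
```

With the neutrino four-momentum fixed in this way, the W candidate and hence m(VH) can be computed for each event.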
The analysis sensitivity is increased further by categorizing events into medium- and high-boost regions based on the $p_T$ of the vector boson candidate. The bulk of the sensitivity comes from the high-boost region. These regions are later combined statistically. In the W → ℓν channels, the medium- and high-boost regions are defined by 130 < $p_T$(W) < 180 GeV and $p_T$(W) > 180 GeV, respectively. In the Z → ℓℓ channels, the regions are instead defined by 50 < $p_T$(Z) < 100 GeV and $p_T$(Z) > 100 GeV. The low-boost region described in Ref. [23] is not included because of its negligible sensitivity to anomalous couplings. Requirements on the Higgs boson candidate mass and the b-tagging likelihood discriminants of the jets used to construct the Higgs boson candidate are also applied. The selection criteria are summarized in Table 3.
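The boost categorization can be written as a small lookup, shown here as a sketch. The thresholds are those quoted in the text; the function name, the channel labels, and the treatment of values that fall exactly on a boundary (left unassigned here) are choices made for this illustration.

```python
# (medium lower edge, medium/high boundary) in GeV for each vector boson type.
BOOST_EDGES = {"W": (130.0, 180.0), "Z": (50.0, 100.0)}


def boost_region(boson, pt_v):
    """Return 'medium', 'high', or None for a vector boson candidate pT in GeV."""
    low, high = BOOST_EDGES[boson]
    if pt_v > high:
        return "high"
    if low < pt_v < high:
        return "medium"
    return None  # below the medium region, or exactly on a boundary


print(boost_region("W", 150.0))
print(boost_region("Z", 150.0))
print(boost_region("W", 100.0))
```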
The expected scalar, pseudoscalar, and total background templates for the high-boost W → eν channel are shown in Fig. 2. One-dimensional projections of the templates for the high-boost W → µν and Z → ee channels onto the m(VH) axis are shown in Fig. 3. The discrimination power of m(VH) for the scalar and pseudoscalar hypotheses can be seen clearly; the pseudoscalar hypothesis tends to produce larger values of m(VH) than the scalar hypothesis.

Systematic uncertainties
A variety of sources of uncertainty are considered in this analysis. These include the energy scale, energy resolution, and reconstruction efficiencies of the relevant physics objects; integrated luminosity determination; cross section and background normalization scale factor uncertainties; and the accuracy and finite size of the simulated event samples. The treatment of most uncertainties is identical to that of Ref. [23], with the exceptions discussed below. All uncertainties are summarized in Table 4.
Uncertainties are assigned to both the scalar and pseudoscalar signal yields, related to the calculation of higher-order QCD and EW corrections. In the pseudoscalar case, the uncertainty in the NLO EW corrections is taken to be the size of the corrections for a scalar Higgs boson. A slight mismodeling of the m(VH) distribution is observed in a sideband of the medium-boost regions with values of the BDT discriminant less than −0.3. This sideband has negligible signal content. The ratio of data to the background prediction has an approximately constant, positive slope. As a result, an additional m(VH) modeling systematic uncertainty is included, which allows for a linear correction of the background model. The size of this uncertainty is taken as twice the ratio of data to prediction, as fitted by a linear function in m(VH).

Results
Results of one-dimensional profile likelihood scans in the VH channels are shown in Fig. 4, in terms of $f_{a3}^{ZH}$. Throughout the paper, expected results are derived from an Asimov data set [46] for a pure scalar Higgs boson with µ = 1. This data set represents the expectation for an SM Higgs boson in the asymptotic limit of large statistics. The combined VH scan assumes $a_i^{HWW} = a_i^{HZZ}$. The expected −2Δ ln L values reach a plateau above $f_{a3}^{ZH}$ ≈ 0.3, as a result of the small $\sigma_1/\sigma_3$ values in the VH channels. Even for modest values of $f_{a3}^{ZH}$, the total signal cross section, and therefore the m(VH) shape, is dominated by the pseudoscalar contribution. Increasing $f_{a3}^{ZH}$ further has little impact on the m(VH) shape, and therefore on the likelihood.
Based on the available data, the VH channels alone do not have sufficient sensitivity to derive any constraint on $f_{a3}$ at 95% CL. Although there is some discrepancy between the expected and observed scans, all observed results are consistent with the SM prediction of $f_{a3}$ = 0. This discrepancy is driven by a modest excess (deficit) at high (low) values of m(VH) in a selected number of background-depleted bins in the high-boost Z → ee and W → µν channels, which is consistent with the SM prediction within statistical and systematic uncertainties. Results from the VH channels are combined with results from the H → VV channels [15], with and without assuming the SM ratio of the Hbb and Htt coupling strengths. Combined profile likelihood scans are shown in Figs. 5 and 6, in terms of $f_{a3}^{ZZ}$ or $f_{a3}^{WW}$. The −2Δ ln L distributions shown here for the VH channels alone are the same as those shown in Fig. 4, after a transformation of the x-axis to $f_{a3}^{WW}$ or $f_{a3}^{ZZ}$. These transformations compress (stretch) the low (high) $f_{a3}$ region, resulting in the distributions shown. The positions of the −2Δ ln L minima and the $f_{a3}$ confidence intervals are given in Table 5.
The WH (ZH) channel is first combined with the H → WW (H → ZZ) channel, enhancing the sensitivity to anomalous HWW (HZZ) interactions, without the need to introduce any assumption on the relationship between HWW and HZZ couplings. These results are shown in the upper (lower) portion of Fig. 5. The H → WW channel alone is not able to constrain $f_{a3}$ at 68% CL. However, in the uncorrelated-µ combination of the WH and H → WW channels, $f_{a3}^{WW}$ > 0.21 is disfavoured at 68% CL. Due to the modest preference in the ZH channel for large $f_{a3}$, the uncorrelated-µ combination of the ZH and H → ZZ channels results in a bound on $f_{a3}$ that is slightly weaker than that from the H → ZZ channel alone.
All four channels are combined under the assumption $a_i^{HWW} = a_i^{HZZ}$. The results of this uncorrelated-µ combination are shown in the top of Fig. 6. A slight improvement over the constraint from the H → VV channels alone is observed, with $f_{a3}^{ZZ}$ > 0.25 excluded at 95% CL. Correlated-µ combinations of the VH and H → VV channels are performed as well, which are based on the assumption of the SM ratio of the Hbb and Htt coupling strengths. This assumption fixes the relationship between the signal strengths in the VH and H → VV channels. As a result of the relatively well measured signal strengths in the H → VV channels, for intermediate and large values of $f_{a3}$ the signal strengths in the VH channels are constrained to large values, and such a signal cannot be accommodated by the data. The results are shown in the bottom of Fig. 6. Relative to the $f_{a3}$ exclusions obtained from the H → VV channels alone, the results obtained here are significantly stronger, with $f_{a3}^{ZZ}$ > 0.0034 excluded at 95% CL in the full combination of all channels.
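Exclusions such as those quoted above are read off a scan by comparing −2Δ ln L with a threshold for the chosen confidence level. The sketch below assumes the usual asymptotic approximation in which −2Δ ln L follows a chi-square distribution with one degree of freedom; the text does not state that this is the exact procedure used, and the scan values in the example are invented.

```python
from statistics import NormalDist


def threshold(cl):
    """-2 Delta ln L value for confidence level cl (chi-square, one degree of freedom)."""
    z = NormalDist().inv_cdf(0.5 * (1.0 + cl))
    return z * z


def first_crossing(f_grid, scan, level):
    """Smallest f at which the scan rises through level, by linear interpolation.

    Returns None if the scan never reaches the level.
    """
    points = list(zip(f_grid, scan))
    for (f0, q0), (f1, q1) in zip(points, points[1:]):
        if q0 < level <= q1:
            return f0 + (level - q0) * (f1 - f0) / (q1 - q0)
    return None


for cl in (0.68, 0.95, 0.99):
    print(f"{cl:.2f} CL -> -2 Delta ln L = {threshold(cl):.2f}")

# Invented scan values, for illustration only.
toy_f = [0.0, 0.002, 0.004, 0.006]
toy_scan = [0.0, 1.5, 4.5, 9.0]
print(first_crossing(toy_f, toy_scan, threshold(0.95)))
```

A scan that plateaus below the 95% threshold, as described for the VH channels alone, returns no crossing in this sketch, which corresponds to the absence of a constraint at that confidence level.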
The future power of the VH channels at probing small anomalous HVV couplings is demonstrated on the right side of Figs. 5 and 6. Although the expected exclusion of anomalous couplings in these channels is only at the ∼68% CL level with the current 8 TeV data set, the −2Δ ln L values increase sharply for small, non-zero values of $f_{a3}^{ZZ}$ and reach a plateau at $f_{a3}^{ZZ}$ ≈ 0.05. With the inclusion of √s = 13 TeV collision data from the ongoing LHC run, the shape of these −2Δ ln L distributions will not change significantly, but the plateau will reach larger values of −2Δ ln L. As soon as the exclusion of a pure pseudoscalar becomes possible, it will be possible to exclude small values of $f_{a3}^{ZZ}$ as well.
Results of two-dimensional profile likelihood scans in the $\mu_{ZH}$ versus $f_{a3}^{ZH}$ plane based on a combination of WH and ZH channels are shown in Fig. 7. Smaller $\mu_{ZH}$ values are preferred with increasing $f_{a3}^{ZH}$ as a result of increasing signal efficiency, due to the harder m(VH) distribution of a potential pseudoscalar signal compared to that of a scalar. The minimum of the −2Δ ln L values corresponds to $\mu_{ZH}$ = 1.11 and $f_{a3}^{ZH}$ = 0.22.
Figure 6: Results of profile likelihood scans for the VH and VV channels, as well as their combination. The dotted (solid) lines show the expected (observed) −2Δ ln L value as a function of $f_{a3}$. The full range of $f_{a3}$ is shown on the left, with the low $f_{a3}$ region highlighted on the right. The bottom plots contain the results of correlated-µ scans. Horizontal dashed lines represent the 68%, 95%, and 99% CL. In the legend, VH refers to the combination of the WH and ZH channels, and VV refers to the combination of the H → WW and H → ZZ channels.
Finally, we allow for the modification of the $a_3^{HVV}$ coupling by a momentum-dependent form factor [19], in which Λ represents a scale of new physics at which the $a_3^{HVV}$ coupling can no longer be treated as a constant.
Unlike earlier results in H → VV [15], where the vector boson $q^2$ is restricted to ≲ 100 GeV, in VH production much larger values are accessible. This fact is responsible for much of the sensitivity of this analysis, but also necessitates the consideration of form factor effects. Profile likelihood scans based on a combination of the WH and ZH channels for various values of Λ are shown in Fig. 8.
For Λ ≳ 10 TeV, a potential momentum-dependent form factor has a negligible impact on the analysis. For smaller values of Λ, however, the tail of the m(VH) distribution is diminished, and along with it the sensitivity to anomalous couplings. Even so, for Λ values as small as 1 TeV the VH channels maintain significant sensitivity.
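The qualitative dependence on Λ can be illustrated with a simple numerical sketch. The dipole-like form used below, Λ⁴ / ((Λ² + |q1²|)(Λ² + |q2²|)), is an assumption adopted here because it is a common choice in the literature; it should not be read as the expression used in the analysis. In VH production the off-shell propagator momentum is identified with m(VH) and the second vector boson is taken on shell; the Z boson mass value and the function name are likewise illustrative.

```python
M_Z = 91.19  # GeV, approximate Z boson mass (assumed value for this sketch)


def form_factor(lam, m_vh, m_v=M_Z):
    """Assumed dipole-like suppression of the coupling at a given m(VH), in GeV.

    lam is the new-physics scale. The coupling is multiplied by the returned
    value, so the pseudoscalar rate scales with its square.
    """
    lam2 = lam * lam
    return lam2 * lam2 / ((lam2 + m_vh * m_vh) * (lam2 + m_v * m_v))


for lam in (10000.0, 3000.0, 1000.0):
    row = ", ".join(f"{form_factor(lam, m):.3f}" for m in (250.0, 500.0, 1000.0))
    print(f"Lambda = {lam / 1000:.0f} TeV: factor at m(VH) = 250, 500, 1000 GeV -> {row}")
```

Under this assumed form the factor stays within a fraction of a percent of unity for Λ = 10 TeV over the m(VH) values shown, while for Λ = 1 TeV the coupling at m(VH) = 1 TeV is reduced by about half, which suppresses the high-mass tail.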

Summary
A search has been performed for anomalous pseudoscalar HVV interactions in √s = 8 TeV pp data collected with the CMS detector. This is the first study of such interactions at the LHC in associated VH production. The results based on the VH channels are combined statistically with those from a previously published study of H → VV decays, which assumes the signal yield is dominated by gluon fusion production of the Higgs boson. Channels sensitive to the HWW and HZZ interactions are combined assuming equality of the couplings of the Higgs boson to W and Z bosons. The couplings are first treated as constants, but later modified to allow potential momentum-dependent form factor effects in VH production. Profile likelihood scans are used to assess the consistency of the data with various effective pseudoscalar cross section fractions, $f_{a3}$.
The VH channels alone do not currently have sufficient sensitivity to constrain $f_{a3}$ at 95% CL. However, $f_{a3}^{ZZ}$ can be constrained to the sub-percent level in a combination of VH and H → VV channels, when assuming the standard model ratio of the coupling strengths of the Higgs boson to top and bottom quarks. Under this assumption, and ignoring form factor effects, $f_{a3}^{ZZ}$ > 0.0034 is excluded at 95% CL in the combination of all channels.