Search for excited states of light and heavy flavor quarks in the $\gamma$+jet final state in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search is presented for excited quarks of light and heavy flavor that decay to $\gamma$+jet final states. The analysis is based on data corresponding to an integrated luminosity of 35.9 fb$^{-1}$ collected by the CMS experiment in proton-proton collisions at $\sqrt{s} =$ 13 TeV at the LHC. A signal would appear as a resonant contribution to the invariant mass spectrum of the $\gamma$+jet system, above the background expected from standard model processes. No resonant excess is found, and upper limits are set on the product of the excited quark cross section and its branching fraction as a function of its mass. These are the most stringent limits to date in the $\gamma$+jet final state, and exclude excited light quarks with masses below 5.5 TeV and excited b quarks with masses below 1.8 TeV, assuming standard-model-like couplings.


Introduction
High-energy proton-proton collisions resulting in a photon and a jet with large transverse momenta ($p_T$) provide a powerful means of searching for new physics. For example, models involving compositeness [1][2][3] predict excited states of quarks that can be identified by searching for events that contain a photon and a jet from their decays. We present a search for excited states of light (u, d) and heavy (b) quarks using this decay signature.
We assume that the coupling between the excited quark ($q^*$), the ordinary quarks, and the gauge bosons proceeds through a gauge-invariant magnetic-moment operator, described by the effective Lagrangian [4]:
$$\mathcal{L}_{\text{int}} = \frac{1}{2\Lambda}\, \bar{q}^*_R\, \sigma^{\mu\nu} \left[ g_s f_s \frac{\lambda_a}{2} G^a_{\mu\nu} + g f \frac{\tau}{2} W_{\mu\nu} + g' f' \frac{Y}{2} B_{\mu\nu} \right] q_L + \text{h.c.},$$
where $q^*_R$ is the right-handed excited quark field; $\sigma^{\mu\nu}$ is the Pauli spin matrix; $q_L$ is the left-handed quark field; $G^a_{\mu\nu}$, $W_{\mu\nu}$, and $B_{\mu\nu}$ are the field tensors of the SU(3), SU(2), and U(1) gauge fields, respectively; $\lambda_a$, $\tau$, and $Y$ are the corresponding gauge structure constants; and $g_s$, $g$, and $g'$ are the gauge couplings. The compositeness scale $\Lambda$ is the energy scale typical for these interactions. The quantities $f_s$, $f$, and $f'$ are unknown dimensionless constants that represent the strengths of the excited quark couplings to the standard model (SM) partners. Their values are determined by the compositeness dynamics, and are usually assumed to be of order unity.
In pp collisions, excited quarks are expected to be produced predominantly through quark-gluon fusion (qg), and then decay into a quark and a gauge boson (g, W, Z, γ). Searches have been performed in different channels [5][6][7][8][9][10][11][12], but no evidence for the existence of excited quarks has yet been found. This analysis looks for evidence of $qg \to q^* \to q\gamma$ (where q represents u or d) and $bg \to b^* \to b\gamma$ production by searching for resonances in γ + jet final states. The signal model includes excited quarks with spin-1/2, and assumes a compositeness scale that equals the mass of the resonance ($m_{\text{Res}}$). An assumption is also made that $f_s$, $f$, and $f'$ have identical values [3,4], and henceforth these will be referred to collectively as $f$. The data correspond to an integrated luminosity of 35.9 fb$^{-1}$ collected by the CMS experiment in pp collisions at $\sqrt{s} = 13$ TeV at the CERN LHC in 2016.
A final state with a photon and a jet is produced in the SM mainly through $qg \to q\gamma$, $q\bar{q} \to g\gamma$, $gg \to g\gamma$, multijet, and W/Z+γ processes. Among these, the main irreducible backgrounds are quark-gluon Compton scattering ($qg \to q\gamma$) and quark-antiquark annihilation ($q\bar{q} \to g\gamma$). Although the probability for a jet to be reconstructed as a photon is only about $10^{-4}$ to $10^{-3}$, the cross section for multijet production is two to three orders of magnitude larger than that for the irreducible backgrounds, depending on the $p_T$ of the jet [13], making jet misidentification the second-largest source of background. Electroweak production of W/Z+γ, where the W or Z boson decays to a pair of quark jets, contributes a very small fraction of the background because of its small production cross section.
This Letter provides a brief description of the CMS detector in Section 2. The main strategy used in selecting the events is discussed in Section 3. Section 4 contains information about signal and background models, while Section 5 lists the systematic uncertainties estimated in this analysis. The results of the study are presented in Section 6 and summarized in Section 7.

The CMS detector
The central feature of the CMS detector is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. The very forward regions of the detector near the beam line are covered by the forward calorimeters. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. In the barrel section of the ECAL, an energy resolution of about 1% is achieved for unconverted or late-converting photons in the tens of GeV energy range. The remaining barrel photons have a resolution of about 1.3% up to pseudorapidity |η| = 1, rising to about 2.5% at |η| = 1.4 [13], where η is defined as −ln[tan(θ/2)], θ being the polar angle of the cylindrical coordinates of the CMS detector. In the endcaps, the resolution of unconverted or late-converting photons is about 2.5%, while the remaining endcap photons have a resolution between 3 and 4%. When combining information from the entire detector, the jet energy resolution is typically around 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in [14].
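The pseudorapidity definition quoted above, η = −ln[tan(θ/2)], can be illustrated with a short sketch. The function names are hypothetical and chosen only for this example; they are not part of any CMS software.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta: float) -> float:
    """Invert the definition: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


if __name__ == "__main__":
    # Polar angle (in degrees) at the barrel photon acceptance edge |eta| = 1.4442.
    print(math.degrees(polar_angle(1.4442)))
```

A particle emitted perpendicular to the beam (θ = π/2) has η = 0, and |η| grows without bound as the direction approaches the beam line.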
The CMS experiment selects physics events using a two-tier trigger system, a hardware-based level-1 (L1) and a software-based high-level trigger (HLT). The L1 trigger selects events of interest using information from the calorimeters and the muon system only, and reduces the readout rate from the bunch crossing frequency of 40 MHz to below 100 kHz. The HLT system further decreases this rate to an average of a few 100 Hz to a maximum of 1 kHz. The events selected by the HLT are then reconstructed offline and used for analysis.

Event selection
Events are analyzed using a particle-flow (PF) algorithm [15], which reconstructs and identifies each individual particle with an optimized combination of information from the various elements of the CMS detector. The energy of photons is directly obtained from the ECAL measurement, corrected for zero-suppression effects [13]. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy.
The jets in each event are formed mainly from photons, charged hadrons, and neutral hadrons using the infrared- and collinear-safe anti-$k_T$ algorithm [16], with distance parameter $\Delta R = 0.4$, where $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, with $\Delta\eta$ and $\Delta\phi$ being the differences in pseudorapidity and azimuthal angle (in radians) between the jet axis and its constituents. Jet momenta and energies are corrected to establish a uniform calorimetric response in η and an absolute response in $p_T$ at the particle level using calibration constants [17] obtained from simulation, test beam results, and pp collision data at $\sqrt{s} = 13$ TeV.
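As an illustration of the angular distance defined above, the following sketch computes ΔR = sqrt((Δη)² + (Δφ)²) between two directions. The only subtlety is that Δφ has to be wrapped into [−π, π]; the function names are hypothetical.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into the interval [-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Angular distance sqrt(d_eta^2 + d_phi^2), with angles in radians."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

Without the wrapping step, two objects at φ = 3.0 and φ = −3.0 would appear to be 6 radians apart rather than about 0.28.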
The data sample used in this analysis consists of events that are selected by a photon trigger having a $p_T^\gamma$ threshold of 165 GeV and an additional condition on the ratio of the photon energy deposited in the HCAL to that in the ECAL (H/E), which is required to be less than 10%. The efficiency of the trigger used in the study has been evaluated separately using samples collected with photon, muon, or jet triggers to account for possible biases in trigger selection. The trigger efficiencies measured in these samples are greater than 95% for $p_T^\gamma > 200$ GeV, as measured offline.
In the offline selection, each event is required to have at least one reconstructed primary vertex that has at least four associated tracks and lies within 24 cm along the z direction and within 2 cm in the transverse plane of the nominal collision point. The reconstructed vertex with the largest value of summed physics-object $p_T^2$ is taken as the primary pp interaction vertex. The physics objects are the jets, clustered using the jet finding algorithm [16,18] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the $p_T$ of those jets.
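The vertex requirements above can be sketched as follows. This is a simplified illustration, not the CMS reconstruction: the `Vertex` class and its fields are hypothetical, the physics-object $p_T$ values are assumed to be precomputed, and the choice to rank only vertices that pass the quality requirements is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Vertex:
    z_cm: float      # longitudinal distance from the nominal collision point
    rho_cm: float    # transverse distance from the nominal collision point
    n_tracks: int    # number of tracks associated with the vertex
    object_pts: List[float] = field(default_factory=list)  # object pT in GeV


def is_good(vertex: Vertex) -> bool:
    """Quality requirements quoted in the text."""
    return vertex.n_tracks >= 4 and abs(vertex.z_cm) < 24.0 and vertex.rho_cm < 2.0


def sum_pt2(vertex: Vertex) -> float:
    """Scalar sum of squared transverse momenta of the physics objects."""
    return sum(pt * pt for pt in vertex.object_pts)


def select_primary_vertex(vertices: List[Vertex]) -> Optional[Vertex]:
    """Return the good vertex with the largest summed pT^2, or None."""
    good = [v for v in vertices if is_good(v)]
    return max(good, key=sum_pt2) if good else None
```

An event for which `select_primary_vertex` returns `None` would be rejected by the offline selection.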
The photon identification [13] is based on requirements on H/E and the shower profile of the photon. The photon is isolated from identified electrons in the detector by requiring the absence of hits in the inner tracker layers near the photon direction. The photon is also required to be well isolated from other photons and hadrons within a cone of $\Delta R = 0.3$ around its axis. The photon must have $p_T^\gamma > 200$ GeV and lie in the central barrel region ($|\eta^\gamma| < 1.4442$). Among the photons passing the above criteria in each event, the one with the highest $p_T$ is selected to reconstruct the mass of the photon+jet system in the event. The isolation quantities are corrected for effects from overlapping pp interactions (pileup) in the same or adjacent bunch crossings, by subtracting the energy calculated from the mean energy density in the event, as computed using the FASTJET package [18]. The photon identification and isolation criteria used in this analysis lead to a signal efficiency of about 80% with an estimated background rejection of about 90%.
In order to be combined with a photon to form a resonance candidate, the selected jet must be separated from the chosen photon candidate by $\Delta R > 0.5$ and satisfy the tight jet identification criteria [19]. The jet identification criteria comprise requirements on the number of constituents, and on the fraction of jet energy carried by each constituent type. The jet is required to be within the region $|\eta^{\text{jet}}| < 2.4$ and must have $p_T^{\text{jet}} > 170$ GeV. The angular separation between the selected photon and jet is restricted by applying a requirement of $|\Delta\eta(\gamma, \text{jet})| < 1.5$. This selection removes a large fraction of the multijet background coming from non-isolated $\pi^0$s, without rejecting signal events, and thus enhances the signal-over-background ratio. If more than one jet candidate is present in the event, the jet with the highest $p_T$ is used in the analysis. The selected events form the "inclusive category" for the search for light excited quarks.
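The kinematic part of the photon and jet selection can be summarized in a short sketch. It is only an illustration under stated assumptions: the `Candidate` class is hypothetical, identification and isolation are reduced to a precomputed `passes_id` flag, and the order in which the requirements are applied (leading jet chosen first, Δη requirement applied to the chosen pair) is one reasonable reading of the text.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    pt: float    # transverse momentum in GeV
    eta: float   # pseudorapidity
    phi: float   # azimuthal angle in radians
    passes_id: bool = True  # stand-in for identification/isolation criteria


def delta_r(a: Candidate, b: Candidate) -> float:
    """Angular distance with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (a.phi - b.phi + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a.eta - b.eta, dphi)


def select_pair(photons: List[Candidate],
                jets: List[Candidate]) -> Optional[Tuple[Candidate, Candidate]]:
    """Return the (photon, jet) resonance candidate, or None if none is found."""
    good_photons = [p for p in photons
                    if p.passes_id and p.pt > 200.0 and abs(p.eta) < 1.4442]
    if not good_photons:
        return None
    photon = max(good_photons, key=lambda p: p.pt)  # highest-pT photon

    good_jets = [j for j in jets
                 if j.passes_id and j.pt > 170.0 and abs(j.eta) < 2.4
                 and delta_r(photon, j) > 0.5]
    if not good_jets:
        return None
    jet = max(good_jets, key=lambda j: j.pt)  # highest-pT jet

    if abs(photon.eta - jet.eta) >= 1.5:  # pseudorapidity separation requirement
        return None
    return photon, jet
```

The ΔR > 0.5 requirement removes jets that overlap with the photon itself, since an energetic photon is usually also clustered as a jet by the jet algorithm.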
Jets originating from b quarks are identified using the combined secondary vertex v2 algorithm (CSVv2) [20,21]. The algorithm combines the information from the primary vertex, impact parameters, and secondary vertices within the jet using a neural network discriminator. The loose working point used in the analysis has about 81% b jet selection efficiency, about 10% misidentification rate for light-quark and gluon jets, and about 40% misidentification rate for c quark jets [20]. Depending on the outcome of the CSVv2 algorithm, a jet is tagged either as a b jet or a non-b jet candidate. According to this tagging, for the $b^*$ analysis, the events are classified into "1b tag" and "0b tag" categories, corresponding to the selections with b jets and without b jets, respectively. Since the $b^*$ acceptance falls off slightly for the 1b tag category at higher masses (Fig. 1), the sensitivity of the search is improved by including the results from the 0b tag category in the $b^*$ limit computation.
The above selection criteria are optimized for the best expected 95% confidence level (CL) limits on the cross section versus mass of $q^*$ and $b^*$.
The efficiencies for assigning events to the 1b tag and 0b tag categories, determined from the Monte Carlo (MC) simulation, are corrected using b tag scale factors (SFs), to take into account the observed differences between the b tagging efficiency of the CSVv2 tagger applied to data and to MC simulation. The SFs are defined as $\epsilon_{\text{data}}/\epsilon_{\text{MC}}$, where $\epsilon_{\text{data}}$ and $\epsilon_{\text{MC}}$ correspond to the b tagging efficiencies of the CSVv2 algorithm in data and MC simulation, respectively. These SFs have been measured using the techniques described in [20].
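A minimal sketch of how a scale factor of this kind could be applied is given below. It assumes a single selected jet that is a true b jet, so that the corrected probability of landing in the 1b tag category is SF × ε_MC and the remainder goes to the 0b tag category. The function names and the numbers in the usage example are hypothetical, and the actual procedure in the analysis may differ.

```python
from typing import Tuple


def scale_factor(eff_data: float, eff_mc: float) -> float:
    """SF = eff_data / eff_mc, the data-to-simulation b tagging efficiency ratio."""
    if eff_mc <= 0.0:
        raise ValueError("MC efficiency must be positive")
    return eff_data / eff_mc


def corrected_category_fractions(eff_mc: float, sf: float) -> Tuple[float, float]:
    """Return (1b tag, 0b tag) fractions for a true b jet after applying the SF."""
    eff_corrected = min(1.0, sf * eff_mc)  # keep the result a valid probability
    return eff_corrected, 1.0 - eff_corrected


if __name__ == "__main__":
    sf = scale_factor(eff_data=0.77, eff_mc=0.81)  # hypothetical efficiencies
    print(corrected_category_fractions(0.81, sf))
```

Because the two categories are complementary, a downward correction of the 1b tag efficiency moves the corresponding events into the 0b tag category rather than removing them.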
The invariant mass of the selected γ + jet (γ + b jet) system is required to be $m_{\gamma+\text{jet}} > 700$ GeV, to avoid the turn-on region due to the requirements imposed on the kinematic properties of the trigger objects. Fig. 1 shows the total selection and reconstruction efficiency times acceptance for the $q^* \to q\gamma$ and $b^* \to b\gamma$ processes. The acceptance times efficiency for the 1b tag category decreases with increasing mass owing to the decrease in the efficiency of the track reconstruction and the degraded resolution of the reconstructed track parameters with increasing $p_T$ of the jet.
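For reference, the invariant mass of the photon+jet system follows from the sum of the two four-momenta, $m^2 = E^2 - |\vec{p}|^2$. The sketch below builds four-vectors from ($p_T$, η, φ, mass) and applies the 700 GeV requirement; the function names are hypothetical and natural units (GeV) are assumed throughout.

```python
import math
from typing import Tuple

FourVector = Tuple[float, float, float, float]  # (E, px, py, pz) in GeV


def four_vector(pt: float, eta: float, phi: float, mass: float = 0.0) -> FourVector:
    """Build (E, px, py, pz) from transverse momentum, pseudorapidity, azimuth, mass."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(p1: FourVector, p2: FourVector) -> float:
    """Invariant mass of the two-body system, sqrt(E^2 - |p|^2)."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


def passes_mass_requirement(mass_gev: float, threshold_gev: float = 700.0) -> bool:
    """Mass requirement quoted in the text."""
    return mass_gev > threshold_gev
```

For a massless photon and a massless jet that are back to back at η = 0 with $p_T$ = 500 GeV each, the invariant mass is 1000 GeV.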

Modeling signal and background
The signal samples for $q^*$ and $b^*$ are simulated at leading order (LO) with the PYTHIA 8.212 event generator [22] for $f$ = 1.0, 0.5, and 0.1 at different resonance masses in the range from 1 to 7 TeV at intervals of 1 TeV and from 1 to 5 TeV at intervals of 0.5 TeV, respectively. The generated events are processed through a full CMS detector simulation based on GEANT4 [23]. The simulation uses the CUETP8M1 underlying event tune [24,25], a renormalization and factorization scale corresponding to $\mu = p_T$ for the hard-scattered partons, and NNPDF2.3LO parton distribution functions (PDFs) [26]. The natural width of the resonance, at parton level, can be approximated as $\Gamma \approx 0.03 f^2 m_{\text{Res}}$ [3]. The production cross section is also proportional to $f^2$. The signals for intermediate mass points are interpolated at intervals of 50 GeV.
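The two scaling relations quoted above, Γ ≈ 0.03 f² m_Res and a production cross section proportional to f², can be written out as a small worked example. The function names are hypothetical, and the cross section value in the usage example is a placeholder rather than a prediction from the analysis.

```python
def natural_width(m_res_tev: float, f: float = 1.0) -> float:
    """Approximate parton-level width, Gamma ~ 0.03 * f^2 * m_Res (same units as m_Res)."""
    return 0.03 * f * f * m_res_tev


def scale_cross_section(sigma_at_f1: float, f: float) -> float:
    """Rescale a cross section computed at f = 1 to another coupling, sigma ~ f^2."""
    return sigma_at_f1 * f * f


if __name__ == "__main__":
    for coupling in (1.0, 0.5, 0.1):
        width = natural_width(2.0, coupling)        # 2 TeV resonance
        sigma = scale_cross_section(1.0, coupling)  # placeholder sigma(f=1) = 1
        print(coupling, width, sigma)
```

For f = 1 the relative width is about 3%, which is comparable to the γ + jet mass resolution, and it shrinks quadratically for smaller couplings.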
The MADGRAPH5_aMC@NLO v2.2.2 program [27] has been used to generate the γ + jet and W/Z+γ background MC samples at LO, with the showering and hadronization carried out by the PYTHIA 8.212 program. Double counting of the partons generated with MADGRAPH5_aMC@NLO and those generated with PYTHIA is removed using the MLM [28] matching scheme. The multijet MC events are generated using the PYTHIA 8.212 event generator. The same event reconstruction is employed in data and MC simulations. However, the background is evaluated from data, and the MC simulation is used only for the optimization of the event selection. The invariant mass distribution of the SM γ + jet background falls smoothly and can be described by an analytic function.
The inclusive invariant mass distribution and the distributions for the 1b tag and 0b tag categories, expressed in TeV, are shown in Figs. 2 and 3, respectively. The binning is chosen to have a bin width approximately equal to the expected γ + jet mass resolution, which varies from about 4.5% at a mass of 1 TeV to 3.3% at 6 TeV. These distributions are modeled using an empirical parametrization that has been used widely in similar previous searches [7, 8, 10, 11]:
$$\frac{d\sigma}{dm} = \frac{P_0\,(1 - m/\sqrt{s})^{P_1}}{(m/\sqrt{s})^{P_2 + P_3 \ln(m/\sqrt{s})}},$$
where $m$ is the γ + jet invariant mass, $\sqrt{s} = 13$ TeV, and $P_0$, $P_1$, $P_2$, and $P_3$ are four parameters used to describe the background distribution and its normalization. The order of the function has been chosen by performing Fisher tests [29], with a cut-off p-value of 0.05. The function is found to be in good agreement with data with a $\chi^2/\text{ndf}$ = 40.7/41.4. The highest invariant mass event observed in data has $m_{\gamma+\text{jet}}$ of 4.6 TeV with a b-tagged jet, and thus belongs to both the inclusive and 1b tag categories.
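As a sketch of how such a background shape can be evaluated, the function below implements the four-parameter form commonly used in resonance searches of this kind, $d\sigma/dm = P_0 (1 - x)^{P_1} / x^{P_2 + P_3 \ln x}$ with $x = m/\sqrt{s}$. That this is the exact form used here is an assumption of the example, and the parameter values in the usage lines are hypothetical, not fitted values from the analysis.

```python
import math

SQRT_S_TEV = 13.0  # centre-of-mass energy in TeV


def background(m_tev: float, p0: float, p1: float, p2: float, p3: float) -> float:
    """Evaluate P0 * (1 - x)^P1 / x^(P2 + P3 * ln x), with x = m / sqrt(s)."""
    x = m_tev / SQRT_S_TEV
    if not 0.0 < x < 1.0:
        raise ValueError("mass must lie between 0 and sqrt(s)")
    return p0 * (1.0 - x) ** p1 / x ** (p2 + p3 * math.log(x))


if __name__ == "__main__":
    params = (1.0e-3, 5.0, 6.0, 0.0)  # hypothetical parameter values
    for mass in (1.0, 2.0, 4.0):
        print(mass, background(mass, *params))
```

Setting $P_3 = 0$ recovers the three-parameter version of the function; a Fisher test of the kind mentioned above compares such nested forms to decide whether the extra parameter is needed.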
In order to examine the presence of a possible systematic bias due to the choice of background fitting function, tests are performed using alternative functional forms. These alternative expressions are polynomial functions that also provide adequate descriptions of the data. To perform these tests, an invariant mass distribution of the SM background is obtained from MC simulation. This invariant mass distribution is fitted with the alternative test functions, and the results of the fit, considered as the truth model, are used to generate a large number of pseudo-data samples that have bin-to-bin statistical fluctuations similar to those of the data. A signal with a cross section close to the expected sensitivity is also injected in the pseudo-data distributions. These distributions are then fitted using the default background function along with a signal model, and the signal cross section is extracted. Pull distributions, defined as the difference between the true and extracted signal cross sections divided by the estimated statistical uncertainty, are constructed for the extracted signal cross sections. The deviation of the mean of the pull distribution from zero is a measure of the bias present in the model. The pull distributions for $q^*$ and $b^*$ modeling over the studied mass range are found to be consistent with normalized Gaussian forms, with medians deviating by no more than 0.5 from zero and widths consistent with unity for the full mass range. When added in quadrature with the statistical uncertainty, the bias uncertainty is found to contribute approximately 10% of the total. Therefore, it is concluded that the systematic uncertainty associated with the choice of the parametric function is negligible, and the statistical uncertainty of the fit is the only uncertainty in the background prediction that needs to be considered.
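The pull construction described above can be illustrated with a toy sketch. It does not perform any real fit: the "extracted" cross section of each pseudo-experiment is simply drawn from a Gaussian centred on the true value plus an assumed bias, which is a stand-in chosen for this example. All names and numbers are hypothetical.

```python
import math
import random
import statistics
from typing import List, Tuple


def pull(true_xs: float, extracted_xs: float, stat_unc: float) -> float:
    """Pull = (true - extracted) / statistical uncertainty."""
    return (true_xs - extracted_xs) / stat_unc


def toy_pulls(true_xs: float, stat_unc: float, bias: float,
              n_toys: int, seed: int = 1) -> List[float]:
    """Generate pulls from pseudo-experiments with a Gaussian stand-in for the fit."""
    rng = random.Random(seed)
    return [pull(true_xs, rng.gauss(true_xs + bias, stat_unc), stat_unc)
            for _ in range(n_toys)]


def summarize(pulls: List[float]) -> Tuple[float, float, float]:
    """Return (mean, median, width) of the pull distribution."""
    return statistics.mean(pulls), statistics.median(pulls), statistics.stdev(pulls)


def quadrature_inflation(bias_in_sigma: float) -> float:
    """Relative growth of the total uncertainty when a bias is added in quadrature."""
    return math.hypot(1.0, bias_in_sigma) - 1.0
```

An unbiased procedure gives a pull distribution centred on zero with unit width. Under one reading of the text, a bias of 0.5 statistical standard deviations added in quadrature increases the total uncertainty by about 12% (`quadrature_inflation(0.5)`), which is of the same order as the roughly 10% quoted above.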

Systematic uncertainties
The dominant sources of the systematic uncertainties affecting the $q^*$ and $b^*$ signals are summarized in Table 1.
The uncertainties in the jet energy scale and jet energy resolution [17] affect both the signal yield and its distribution. The size of the effect is determined by varying the four-momenta of the jets by the corresponding uncertainties and repeating the full analysis with the modified quantities.
The systematic uncertainties in the photon energy scale and resolution, and in the photon identification efficiency, are derived from $Z \to e^+e^-$ events. The uncertainty in the photon energy scale is found to be about 1%, and it includes the uncertainty in the extrapolation to higher $p_T$, beyond the reach of the $Z \to e^+e^-$ control samples [13]. The uncertainty in the photon identification is estimated to be around 2%. Also, a systematic uncertainty of 5% has been included to account for the precision of the photon trigger efficiency measurement. The effect of the b tagging scale-factor uncertainty on the distribution of the signal is evaluated to be around 1%, while on the normalization the effect is around 2%. The method used to interpolate the signal distributions from the generated distributions is assigned an uncertainty of 0.5-1.0%, which accounts for the difference between the generated and interpolated signals. The PDF uncertainty affects the signal acceptance by 1.5-3.0% for both $q^*$ and $b^*$ quarks and is evaluated using PDF4LHC recommendations [30].
The uncertainties in the measurement of the integrated luminosity (2.5%) [31] and pileup description (1%) affect the overall signal yield. The uncertainty in the background estimate is accounted for in the fit by varying the parameter values within their respective uncertainties, with no additional constraints.

Results
In the mass region studied, no significant excess has been observed. We use the γ + jet invariant mass spectra (Figs. 2, 3), the background parametrization, and the $q^*$ and $b^*$ theoretical predictions to set 95% CL upper limits on the production cross section of $q^*$ and $b^*$ decaying to $q\gamma$ and $b\gamma$, respectively.
The modified frequentist $\text{CL}_s$ method [32,33] in the asymptotic approximation [34] is utilized to set upper limits on signal cross sections. The asymptotic approximation is found to be in good agreement with the full $\text{CL}_s$ calculation over the entire mass range. In order to evaluate limits, a likelihood function is constructed that is the product of the Poisson likelihoods of all the bins in the distribution. The systematic uncertainties in the signal are implemented in terms of nuisance parameters with Gaussian and log-normal constraints. The uncertainty due to the background parametrization is found to have the largest impact and is quantified by considering the effect of changing the parameters from their central values by their estimated ±1 standard deviation uncertainties. We calculate limits by evaluating the likelihood independently at successive values of resonance mass from 1 to 6 TeV for $q^*$, and from 1 to 5 TeV for $b^*$, in steps of 50 GeV. The cross section limits are not evaluated below 1 TeV, because of uncertainties in the signal efficiency associated with the invariant mass selection, $m_{\gamma+\text{jet}} > 700$ GeV.
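The binned likelihood mentioned above, a product of Poisson terms over the bins of the mass distribution, can be sketched as follows. The example works with the logarithm of the likelihood and omits nuisance parameters entirely; the function names and the signal-strength parametrization are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def expected_counts(background: Sequence[float], signal_shape: Sequence[float],
                    signal_strength: float) -> List[float]:
    """Expected yield per bin: background plus scaled signal template."""
    if len(background) != len(signal_shape):
        raise ValueError("background and signal must have the same binning")
    return [b + signal_strength * s for b, s in zip(background, signal_shape)]


def poisson_log_likelihood(observed: Sequence[int], expected: Sequence[float]) -> float:
    """Sum over bins of ln Poisson(n | mu) = n ln(mu) - mu - ln(n!)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same binning")
    total = 0.0
    for n, mu in zip(observed, expected):
        if mu <= 0.0:
            raise ValueError("expected yields must be positive")
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total
```

Summing log-likelihoods is numerically safer than multiplying many small probabilities, and a combination of categories (such as the 1b tag and 0b tag categories) corresponds to adding their log-likelihoods.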
In order to evaluate limits for $b^*$, the likelihoods for the 1b tag and 0b tag categories are combined. The observed and expected mass limits for $q^*$ and $b^*$ are computed at 95% CL. The results are presented in terms of limits on the product of the cross section ($\sigma$) and branching fraction ($B$). The cross section upper limits are compared to the LO theoretical predictions, for all three couplings, to estimate the lower mass limit on excited quarks. In Figs. 4 and 5, the experimental limits for $f$ = 1.0 are shown for $q^*$ and $b^*$, respectively, with the theoretical predictions for the different couplings overlaid. There is a small dependence of $\sigma \times B$ on $f$, of the order of 10-20%, which is taken into account when extracting the mass limits. Observed lower bounds of 5.5 and 1.8 TeV are obtained for $q^*$ and $b^*$, respectively, for $f$ = 1.0. The corresponding expected mass limits are 5.4 (1.8) TeV for $q^*$ ($b^*$). The variation of the excluded mass as a function of the coupling strength, obtained by interpolating the efficiencies for three signal MC samples corresponding to $f$ = 1.0, 0.5, and 0.1, is shown for $q^*$ and $b^*$ in Fig. 6. This result can also be interpreted in terms of the ratio of the resonance mass and $\Lambda$, i.e., if we relax the assumption of $\Lambda = m_{\text{Res}}$, the excited quark production cross section is proportional to $f$ as well as $m_{\text{Res}}/\Lambda$.
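The extraction of a lower mass limit from the comparison of the cross section upper limit with the theoretical prediction can be sketched as a search for the crossing point of the two curves. The sketch interpolates the logarithm of the ratio linearly in mass, which is an assumption of this example, and the numbers in the usage lines are hypothetical placeholders, not results from the analysis.

```python
import math
from typing import Optional, Sequence


def mass_limit(masses: Sequence[float], observed_limit: Sequence[float],
               theory: Sequence[float]) -> Optional[float]:
    """Mass at which the theory curve first drops below the observed upper limit.

    Masses below the returned value are excluded; None means no crossing was found.
    """
    log_ratio = [math.log(t) - math.log(o) for t, o in zip(theory, observed_limit)]
    for i in range(len(masses) - 1):
        d0, d1 = log_ratio[i], log_ratio[i + 1]
        if d0 > 0.0 >= d1:  # theory above the limit at i, at or below it at i+1
            fraction = d0 / (d0 - d1)
            return masses[i] + fraction * (masses[i + 1] - masses[i])
    return None


if __name__ == "__main__":
    masses_tev = [1.0, 2.0, 3.0, 4.0]      # hypothetical mass points
    limit_pb = [1e-2, 3e-3, 1e-3, 8e-4]    # hypothetical observed limits
    theory_pb = [1.0, 5e-2, 2e-3, 1e-4]    # hypothetical sigma x B
    print(mass_limit(masses_tev, limit_pb, theory_pb))
```

Working in the logarithm is a natural choice because both curves fall steeply with mass and are usually displayed on a logarithmic scale.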

Summary
A search has been presented for excited states of light and b quarks in γ + jet final states, using data corresponding to an integrated luminosity of 35.9 fb$^{-1}$, collected at $\sqrt{s} = 13$ TeV. Upper limits at the 95% confidence level are placed on the product of production cross section and decay branching fraction for the presence of $q^*$ and $b^*$ excited quarks in γ + jet final states. Comparing these upper limits with theoretical predictions, excited light quarks within the mass range $1.0 < m_{q^*} < 5.5$ TeV and excited b quarks within the mass range $1.0 < m_{b^*} < 1.8$ TeV are excluded at 95% confidence level, assuming standard-model-like coupling strengths. These are the most sensitive limits for $q^*$ and $b^*$ searches in the γ + jet final states. In addition, the search for excited b quarks is the first to be presented in any final state at $\sqrt{s} = 13$ TeV.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Aus

References
[12] ATLAS Collaboration, "Search for new phenomena in high-mass final states with a photon and a jet from pp collisions at $\sqrt{s} = 13$ TeV with the ATLAS detector", Eur. Phys. J. C 78 (2018) 102, doi:10.1140/epjc/s10052-018-5553-2, arXiv:1709.10440.