Probing Compositeness with the CMS $eejj$&$eej$ Data

Quark-lepton compositeness is a well-known beyond the Standard Model (SM) scenario that features heavy exotic particles such as leptoquarks (LQs) and leptogluons (LGs). These particles can couple to leptons and jets simultaneously. In this letter, we use the recent CMS scalar LQ search data in the $eejj$ and $eej$ channels to probe this scenario. We recast the data in terms of a color octet partner of the SM electron (or a first generation spin-1/2 LG) that couples to an electron and a gluon via a dimension five operator suppressed by the quark-lepton compositeness scale ($\Lambda$). By combining different production processes of the color octet electron ($e_8$) at the LHC, we use the CMS 8 TeV data to obtain a simultaneous bound on $\Lambda$ and the mass of the $e_8$ ($M_{e_8}$). We also study the reach of the 13 TeV LHC to discover the $e_8$ and interpret the required luminosity in terms of $M_{e_8}$ and $\Lambda$.


I. INTRODUCTION
The idea of quark-lepton compositeness [1][2][3][4][5][6][7][8] goes along with our intention to describe nature in terms of its most fundamental building blocks. As its name suggests, in models with quark-lepton compositeness, the Standard Model (SM) fermions are not elementary but rather have finer substructures. Similarities between the SM lepton and quark sectors (for example, both come with three flavors and behave similarly under the $SU(2)_L \times U(1)_Y$ gauge symmetry with the same weak coupling) can be explained if they are assumed to be different bound states of some fundamental constituents. These fundamental constituents, called preons by Pati and Salam [1], are charged under some new strong force which confines them below a certain scale $\Lambda$, known as the compositeness scale.
As we have hadrons in QCD, in this scenario one expects a host of new excited preonic condensates. Some of these condensates would be quite exotic, as they would carry both $SU(3)_c$ color charges and lepton numbers, like the bosonic leptoquarks (LQs or $\ell_q$'s) that transform as triplets under $SU(3)_c$ [9][10][11] or the leptogluons (LGs or $\ell_8$'s) that are color-octet fermions [12][13][14][15][16][17]. Because of their color charges, if these exotic condensates have TeV-range masses, they would be produced copiously at the Large Hadron Collider (LHC), making it possible to probe this scenario experimentally.
The LHC has already put some constraints on the masses of scalar LQs decaying to SM quarks and leptons [18][19][20][21]. Of these, we look at the most recent search by CMS for the first and second generations of scalar LQs in the $\ell\ell jj$ and the $\ell\nu_\ell jj$ channels with 19.7 fb$^{-1}$ of integrated luminosity at the 8 TeV LHC [18]. With pair production, the 95% confidence level (CL) exclusion limit on the mass of the first (second) generation scalar LQ now stands at $M_{\ell_q} = 1005$ (1080) GeV, assuming it always decays to an electron (a muon) and a jet. Note that unless specified otherwise, we do not distinguish between any particle and its anti-particle. Hence, an electron here could mean a positron as well. In the first generation search, mild excesses of events compared to the SM background were observed in both the $eejj$ and the $eej$ channels for $M_{\ell_q} \sim 650$ GeV. These excesses have attracted considerable attention in the literature. CMS has also performed a dedicated search for the single productions of the first two generations of LQs in the $\ell\ell j$ channels [21]. However, unlike the mostly QCD mediated pair production, the single productions depend strongly on an unknown coupling $\lambda$, the $\ell_q$-$\ell$-$q$ coupling. Hence, the exclusion limits from this search are $\lambda$ dependent. For the first generation, the exclusion limit goes from 895 GeV to 1730 GeV when $\lambda$ goes from 0.4 to 1.0, and for the second generation the data exclude $M_{\ell_q}$ below 530 GeV for $\lambda = 1.0$.

* Electronic address: tanumoy.mandal@physics.uu.se
† Electronic address: subhadip.mitra@iiit.ac.in
‡ Electronic address: sseth@uni-mainz.de
In this letter, we recast the CMS 8 TeV $eejj$ [18] and $eej$ [21] data in terms of the first generation spin-1/2 LG carrying unit electric charge, i.e., the color octet partner of the SM electron ($e_8$), to probe the composite quark-lepton scenarios and obtain the most stringent limits available on the $e_8$. This is possible because a LG can also decay to a lepton and a jet (gluon) just like a LQ. Hence, the pair production of $e_8$'s would have $eejj$ final states (footnote 1). Earlier, there have been other phenomenological studies on LGs [22][23][24][25][26], and the CMS 7 TeV $eejj$ data [27] were used to infer bounds on $M_{e_8}$ [28,29]. Considering the pair production, Ref. [29] put the mass exclusion limit at about 1.2-1.3 TeV. Similarly, an $e_8$ could be produced singly in association with an electron and give rise to an $eej$ final state. Interestingly, the single productions of LGs open up a way to probe the compositeness scale. This is because, at the leading order (LO), the $\ell_8$-$\ell$-$g$ interaction comes from an effective operator of dimension five that is suppressed by the compositeness scale $\Lambda$ [28,30] (see the next section). This is unlike the LQ interactions, where the LO terms are of dimension four and hence apparently insensitive to $\Lambda$.

Footnote 1: In the absence of any BSM decay, the only two body decay a LG can have is either $\ell_8 \to \ell g$ or $\nu_8 \to \nu_\ell g$ ($\nu_8$ being the color octet partner of a neutrino), but not both. Hence, unlike LQs, the QCD mediated pair production of LGs cannot have a $\ell\nu_\ell jj$ final state. However, depending on the underlying model, a charged $\ell_8$ and a neutral $\nu_8$ might couple simultaneously to a SM $W$ boson, allowing a weak interaction mediated process with the $\ell\nu_\ell jj$ final state.
In a recent paper [31], we pointed out that the single productions of LQs can also lead to the $eejj$ final state and, similarly, events from the pair productions could also pass the signal selection criteria of the single production search in the $eej$ channel. Combining these production processes in the signal simulations can provide better limits in the $M_{\ell_q}$-$\lambda$ plane from both the $eejj$ and the $eej$ channels. The same argument applies to LGs too. Hence, following Ref. [31], here we systematically combine both the pair and the single production processes of the $e_8$ while reinterpreting the CMS $eejj$ and $eej$ data and obtain exclusion limits in the $M_{e_8}$-$\Lambda$ plane. This way, we obtain the mass exclusion limits as well as the limits on the compositeness scale from both the $eejj$ and the $eej$ data and compare them.
Our presentation is organized as follows. In the next section we discuss the details of the signal we consider, in section III, we present the results of our recast analysis, in section IV we investigate the prospect of discovering the color octet electron at the 13 TeV LHC and then in section V we conclude.

II. LEPTOGLUON (COMBINED) SIGNALS
If we assume $M_{e_8}$ is smaller than $\Lambda$ and there is no violation of lepton flavor, we can write a generic effective Lagrangian for the $e_8$ allowed by the SM gauge symmetry as [28],

[Eq. (1)]

with [30],

[Eq. (2)]

In the Lagrangian, we have displayed only those dimension five terms that are important for our study (footnote 2). Here, $G^a_{\mu\nu}$ is the gluon field strength tensor, and $\eta_{L/R}$ are the chirality factors. Since electron chirality conservation implies $\eta_L \eta_R = 0$, we set $\eta_L = 1$ and $\eta_R = 0$ in our analysis without any loss of generality. This dimension five interaction opens two decay modes for the color octet electron: $e_8 \to eg$ and $e_8 \to egg$. However, since the three body decay is more suppressed than the two body one, we simply set the total width of the $e_8$ as [22,28],

[Eq. (3)]

The production processes of the $e_8$ at the LHC (see Fig. 1 for some representative Feynman diagrams) are discussed in much detail in Ref. [28]. Here, we focus only on some essential points. The main contribution to the $e_8$ pair production comes from the purely QCD mediated diagrams (see, e.g., Fig. 1a). At the LO, there is an additional $t$-channel electron exchange diagram whose amplitude is proportional to $1/\Lambda^2$ (Fig. 1b) but, for the ranges of $M_{e_8}$ and $\Lambda$ we consider in this letter, its contribution is small compared to the model independent QCD mediated contribution. That is why the pair production process is practically insensitive to the compositeness scale. On the other hand, all the single production diagrams contain at least one $e_8$-$e$-$g$ or $e_8$-$e$-$g$-$g$ vertex ($\sim 1/\Lambda$) coming from the interaction term of Eq. (2).

Footnote 2: As pointed out in Ref. [28], there are more dimension five operators allowed by the SM gauge symmetries and lepton number conservation, like [...]. However, these terms lead to $e_8 e_8 V$ or $e_8 e_8 VV$ vertices (which may contain form factors) that would affect the production cross section. For simplicity, we assume the unknown coefficients associated with these terms are negligible.
We simulate the pair and the single productions of the $e_8$ at the 8 TeV LHC to estimate their contributions to the $eejj$ and the $eej$ channels by modeling Eqs. (1) and (2) in FEYNRULES [32]. We use the CTEQ6L1 Parton Distribution Functions (PDFs) [33] to generate events with MADGRAPH5 [34] and then shower them with PYTHIA6 [35]. We set the factorization and the renormalization scales to $\mu_F = \mu_R = M_{e_8}$. We use DELPHES 3.3.1 [36] to simulate the CMS detector environment and implement the selection cuts. In DELPHES, jets are clustered with FASTJET [37] using the anti-$k_T$ jet clustering algorithm [38] with the clustering parameter $R = 0.4$. Since we generate the pair and the single productions separately, any possible interference between them is ignored. This is justified because, for the parameters considered, the $e_8$ decay width is much smaller than its mass (i.e., we are in the narrow width regime).
[FIG. 1. ... LG single productions at the LHC.]

We generate events for the inclusive single production for a certain $\Lambda = \Lambda_o$ by combining the following processes,

[Eq. (4)]

where the curved connections indicate a pair of electron and gluon coming from an on-shell $e_8$. However, a straightforward computation of the cross section for the combined single and pair production processes would lead to some difficulties. For instance, the jets that do not come from a LG could be soft and lead to divergences. Ideally, to handle these divergences, one has to go beyond a tree level computation while combining the different single production processes as in Eq. (4). Moreover, such a combination can lead to double counting of some diagrams while showering. Following Ref. [31], we avoid these difficulties by employing the matrix element-parton shower matching (ME⊕PS) technique with the shower-$k_T$ scheme [39,40], which effectively provides a consistent interpolation between the hard partons and the PYTHIA parton showers (PS). It relies on the PYTHIA PS for the soft jets and the parton level matrix elements for the hard jets and thereby bypasses the double counting and the soft jets problems. The cross section for any other value of $\Lambda = \Lambda_n$ (say) is obtained by simply multiplying the cross section for $\Lambda_o$ by $\Lambda_o^2/\Lambda_n^2$, since, as explained earlier, the $\Lambda$ dependence of the inclusive single production cross section ($\sigma_s$) can be written as,

$$\sigma_s(M_{e_8}, \Lambda) \approx \sigma_s(M_{e_8}, \Lambda_o)\,\frac{\Lambda_o^2}{\Lambda^2}, \qquad (5)$$

if we ignore terms of $\mathcal{O}(1/\Lambda^4)$ or higher. In Table I, [...] TeV. There, we also show the LO values of the pair production cross section ($\sigma_p^{\rm LO}$) for the four masses. While combining the pair and the single productions, we use the next-to-leading order (NLO) QCD K-factors only for the pair production, available from Ref. [29] for masses up to 1.5 TeV. Beyond this, guided by the trend, we assume a constant $K_{\rm NLO} = 2.0$. Note, however, that no K-factor is available for the single productions.
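Hence, for a particular $\Lambda$, we utilize the available information in the best possible manner and use the combined signal with the following cross section,

$$\sigma_{p+s}(M_{e_8}, \Lambda) = K_{\rm NLO}\,\sigma_p^{\rm LO}(M_{e_8}) + \sigma_s(M_{e_8}, \Lambda_o)\,\frac{\Lambda_o^2}{\Lambda^2}. \qquad (6)$$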
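The bookkeeping described above (a $1/\Lambda^2$ rescaling of the inclusive single production cross section from a reference scale $\Lambda_o$, plus a K-factor applied to the pair production only) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the inputs must come from the actual simulation, and terms of $\mathcal{O}(1/\Lambda^4)$ are assumed negligible, as stated in the text.

```python
def scale_single_xsec(sigma_s_ref, lambda_ref, lambda_new):
    """Rescale the inclusive single production cross section.

    Assumes sigma_s is proportional to 1/Lambda^2 (higher orders dropped).
    sigma_s_ref: cross section computed at the reference scale lambda_ref.
    Both scales must be in the same units (e.g. TeV).
    """
    if lambda_ref <= 0 or lambda_new <= 0:
        raise ValueError("compositeness scales must be positive")
    return sigma_s_ref * (lambda_ref / lambda_new) ** 2


def combined_xsec(sigma_p_lo, k_nlo, sigma_s_ref, lambda_ref, lambda_new):
    """Combined signal: K-factor on the pair part only, LO single part."""
    return k_nlo * sigma_p_lo + scale_single_xsec(
        sigma_s_ref, lambda_ref, lambda_new
    )


if __name__ == "__main__":
    # Placeholder numbers for illustration only (not taken from Table I).
    print(combined_xsec(sigma_p_lo=1.0, k_nlo=2.0,
                        sigma_s_ref=4.0, lambda_ref=5.0, lambda_new=2.5))
```

Passing `lambda_new=float("inf")` reproduces the pair-only limit, since the single production term then vanishes.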

III. RECAST ANALYSIS AND NEW LIMITS
In Fig. 2, we show the recast mass exclusion plots obtained from the CMS $eejj$ [18] data for three different values of $\Lambda$, namely, $\Lambda \to \infty$ (i.e., the pair production only) in Fig. 2a, $\Lambda = 5$ TeV in Fig. 2b and $\Lambda = 2.5$ TeV in Fig. 2c. To obtain the expected and the observed 95% CL upper limits (ULs) for the recast plots, we rescale the corresponding limits from the CMS plot [18] by multiplying with a factor [31],

[Eq. (7)]

where $\epsilon(M_{e_8}, \Lambda)$ denotes the fraction of the combined signal events that survives the selection cuts optimized for $M_{\ell_q} = M_{e_8}$. Since the CMS $eejj$ optimized cuts stop at $M_{\ell_q} = 1.2$ TeV, we extrapolate beyond this mass by assuming identical selection cuts for $M_{\ell_q} \geq 1.2$ TeV. Because of the single productions, the lower limit on the allowed mass increases with decreasing $\Lambda$. For example, from the pure QCD mediated pair production ($\Lambda \to \infty$) the limit stands at about 1.56 TeV, and it improves to about 1.66 (1.90) TeV for $\Lambda = 5$ (2.5) TeV (footnote 4). Note that with increasing mass, the pair production becomes more phase space suppressed than the single productions and hence, beyond a certain mass, the single productions dominate over the pair production. The crossover point depends on $\Lambda$, since all the single productions depend on it. With this in mind, we can now understand the behavior shown by the 95% CL UL lines in the high $M_{e_8}$ limit for finite $\Lambda$. We expect the single productions to take over from the pair production earlier when $\Lambda = 2.5$ TeV than when $\Lambda = 5$ TeV. This can be seen from Figs. 2b & 2c: the small rise in any UL line with increasing $M_{e_8}$ (that it indeed comes from the single productions can be confirmed from its absence in the pair-only plot) comes earlier for $\Lambda = 2.5$ TeV than for $\Lambda = 5$ TeV (footnote 5).

[FIG. 2. ... [18] with three different choices of the compositeness scale $\Lambda$. For these plots, combined production of color octet electrons (i.e., the QCD mediated pair production plus the inclusive single productions from Eqs. (4) & (5)) with cross section (solid red lines) $\sigma_p + \sigma_s/\Lambda^2$, Eq. (6), is considered to simulate the signal. To obtain the expected 95% CL upper limits (dashed lines) beyond 1.2 TeV, the selection cuts [18] are assumed to be identical for ...]

In Fig. 3, the recast plots for $\Lambda = 2.5$ and 5 TeV obtained from the CMS $eej$ [21] data are shown. For Fig. 3a, we have considered only the single productions in the signal to compare the mass exclusion limits for the two values of $\Lambda$, while in Figs. 3b & 3c we consider the combined productions. Here, we rescale the CMS limits [21] by

[Eq. (8)]

for the single-only plot (Fig. 3a) and by

[Eq. (9)]

for the other two (Figs. 3b & 3c). Here, $\epsilon_s^{(\ell_q|eej)}(M_{e_8})$ is the efficiency of the final event selection cuts optimized for the single productions of the first generation scalar LQ of mass $M_{\ell_q} = M_{e_8}$ [21]. Notice that though the single productions of the LQ depend on the unknown $\ell_q$-$\ell$-$q$ coupling $\lambda$, the efficiency $\epsilon_s^{(\ell_q|eej)}$, being a ratio of numbers of events, does not depend on any overall factor in the cross section like $\lambda$ [31]. By the same argument, $\epsilon_s^{(e_8|eej)}$, which is the cut efficiency for the inclusive single production of the $e_8$, does not depend on $\Lambda$ even though $\epsilon_{p+s}^{(e_8|eej)}$ does. If we compare Fig. 3a with Figs. 3b & 3c, it is clear how the inclusion of the pair production in the signal for the $eej$ search improves the mass exclusion limits. For example, for $\Lambda = 5$ (2.5) TeV the $eej$ data disfavor $M_{e_8}$ below 1.28 (1.84) TeV when only the single productions are considered. But the same limit goes up to about 1.62 (1.86) TeV when the pair production is also included. Obviously, the improvement is more prominent when the single productions are relatively smaller because of a larger $\Lambda$.

Footnote 4: Production processes for LGs generally have larger color factors than LQ production processes (color octet LGs vs. color triplet LQs). As a result, from the same data one generally obtains higher mass exclusion limits for LGs than for LQs for a similar choice of parameters.

Footnote 5: It is not very straightforward to understand the reason behind the rise itself intuitively. When these selection cuts [18] are held fixed, both the efficiencies start to increase with increasing $M_{e_8}$ till they saturate. However, since they evolve differently, there is a competition between the numerator and the denominator of Eq. (7).
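The recast procedure described above can be illustrated with a short Python sketch. It assumes, as a simplification suggested by the discussion of efficiencies, that the rescaling factor is the ratio of the LQ cut efficiency to the $e_8$ cut efficiency at the same mass, and that the mass limit is read off where the signal cross section falls below the rescaled upper limit. The function names and the log-linear interpolation are choices made here for illustration, not taken from the analysis.

```python
import math


def rescale_upper_limit(ul_cms, eff_lq, eff_e8):
    """Rescale CMS cross-section upper limits point by point.

    Assumption: the same number of events is excluded, so the limit on
    the new signal is UL_cms * eff_lq / eff_e8 at each mass point.
    """
    return [u * a / b for u, a, b in zip(ul_cms, eff_lq, eff_e8)]


def mass_exclusion(masses, sigma_signal, ul_recast):
    """Return the first mass where the signal drops below the limit.

    Interpolates log(sigma/UL) linearly between neighbouring mass
    points; returns None if the signal never crosses the limit.
    """
    diff = [math.log(s / u) for s, u in zip(sigma_signal, ul_recast)]
    for i in range(len(masses) - 1):
        if diff[i] > 0 >= diff[i + 1]:
            t = diff[i] / (diff[i] - diff[i + 1])
            return masses[i] + t * (masses[i + 1] - masses[i])
    return None


if __name__ == "__main__":
    # Placeholder inputs for illustration only (not the CMS numbers).
    masses = [1000.0, 1500.0, 2000.0]          # GeV
    ul = rescale_upper_limit([1.0, 1.0, 1.0], [0.5, 0.5, 0.5],
                             [0.5, 0.5, 0.5])  # fb
    print(mass_exclusion(masses, [10.0, 1.0, 0.1], ul))
```

Repeating the crossing search for a grid of $\Lambda$ values would trace out an exclusion contour in the $M_{e_8}$-$\Lambda$ plane.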
In Fig. 4, we show the rescaled 95% CL exclusion limits in the $M_{e_8}$-$\Lambda$ plane. The blue shaded regions are disfavored by the data. We show the exclusion contours obtained from the CMS $eejj$ data (Fig. 4a) and the $eej$ data (Fig. 4b). We compare these two in Fig. 4c. The pair production dominates in the lower mass region and gives a limit on $M_{e_8}$ that is practically independent of $\Lambda$. From Fig. 4a or 4c, it is clear that irrespective of $\Lambda$, the $eejj$ data disfavor the $e_8$ with mass below $\sim 1.55$ TeV. In the high mass region, the pair production becomes negligible and the inclusive single production puts a strong limit on $\Lambda$. However, what is remarkable is that the $eejj$ data give an almost identical limit to the $eej$ data in this regime. In other words, in the high mass limit, the contamination from single production in a search optimized for pair production is very significant (footnote 6). As explained in the introduction, the $\Lambda$-dependent mass exclusion limits can also be translated into limits on $\Lambda$. The overlapping limits in Fig. 4c indicate that the lowest limit on $\Lambda$ stands at about $\Lambda \approx 2$ TeV $\approx M_{e_8}$ within the domain of the effective theory. If $M_{e_8}$ lies between 1.64 TeV and 2 TeV, $\Lambda$ must be higher.

Footnote 6: Since the pair production search is insensitive to the spin of the particle being probed, kinematically it does not matter much whether the search is for LQs or LGs, at least in the narrow width regime.

IV. FUTURE PROSPECTS
So far our discussions were centered on reinterpreting the available data. Now, let us look at the prospect of a discovery of the $e_8$ at the LHC in its 13 TeV runs. In this section, we assume a future search in the $eej$ channel optimized for finding the $e_8$ and estimate the discovery reach using the combined production. We expect two high-$p_T$ electrons and at least one high-$p_T$ jet as the typical signature of the combined production of the $e_8$ [28]. Therefore, taking a cue from the existing CMS $eej$ search [21], we use the following selection cuts:
1. two oppositely charged electrons ($e^\pm$) with transverse momentum $p_T^e > 45$ GeV and pseudorapidity $|\eta^e| <$ [...],
2. [...]
3. separation between any electron and the hardest jet in the $\eta$-$\phi$ plane, $\Delta R_{ej_1} > 0.3$.
To suppress the inclusive-$Z$ background, we apply a strong cut on
4. the invariant mass of the electron pair, $M_{e_1 e_2} > 400$ GeV.
In addition, we also apply some cuts optimized for the different ($M_{e_8}$, $\Lambda$) combinations,
5. the scalar sum of the $p_T$ of the two electrons and the hardest jet, with the optimized cut value $S_T^{\rm opt}$,
6. the maximum of the two electron-jet invariant mass combinations, with the optimized cut value $M_{ej}^{\rm opt}$.
The values of $S_T^{\rm opt}$ and $M_{ej}^{\rm opt}$ for some benchmark parameters are shown in Table II. The strong cut on $M_{e_1 e_2}$ suppresses the inclusive $Z$ (+$n$ jets) contribution, which is the most dominant background. The other significant backgrounds are the inclusive top-pair production, the inclusive diboson ($ZZ$, $ZW$, $WW$) productions etc. [28].
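As a rough illustration of how the kinematic quantities in these cuts could be evaluated for a single event, here is a minimal, self-contained Python sketch. It covers only the cuts whose values are legible above ($p_T^e > 45$ GeV, $\Delta R_{ej_1} > 0.3$, $M_{e_1 e_2} > 400$ GeV); the electron pseudorapidity and jet requirements are not applied, the thresholds `st_min` and `mej_min` stand in for $S_T^{\rm opt}$ and $M_{ej}^{\rm opt}$ and must be supplied by the user, and the assumption that both optimized cuts are lower bounds is ours. All function names are hypothetical, and particles are treated as massless.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) in GeV from collider coordinates."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz + mass * mass), px, py, pz)


def invariant_mass(*vectors):
    """Invariant mass of the sum of the given four-vectors."""
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def delta_r(eta1, phi1, eta2, phi2):
    """Separation in the eta-phi plane, with phi wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def passes_selection(e1, e2, j1, st_min, mej_min,
                     pt_e_min=45.0, dr_min=0.3, mee_min=400.0):
    """e1, e2, j1 are (pt, eta, phi) tuples; charges assumed opposite."""
    if min(e1[0], e2[0]) <= pt_e_min:
        return False
    if any(delta_r(e[1], e[2], j1[1], j1[2]) <= dr_min for e in (e1, e2)):
        return False
    v1, v2, vj = four_vector(*e1), four_vector(*e2), four_vector(*j1)
    if invariant_mass(v1, v2) <= mee_min:
        return False
    s_t = e1[0] + e2[0] + j1[0]  # scalar pT sum of the three objects
    mej_max = max(invariant_mass(v1, vj), invariant_mass(v2, vj))
    return s_t > st_min and mej_max > mej_min
```

In a real analysis these quantities would come from the detector simulation output rather than hand-built tuples; the sketch only makes the definitions of $S_T$ and the maximal electron-jet invariant mass explicit.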
To [...] Here, $L_5$ is the luminosity required to attain a $5\sigma$ statistical significance for $\text{Sig.}/\sqrt{\text{Backg.}}$ and $L_{10}$ is the luminosity required to observe 10 signal events. In Table III, we display the 'after-cut' cross sections of the combined signal and the dominant $Z + nj$ background (including the contributions from $ZV$) for the benchmark points of Table II. Though we show only the dominant background in the table, we include other sub-dominant contributions [28] (like the inclusive top-pair production etc.) while estimating $L_D$.
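The two luminosities defined above follow directly from the after-cut cross sections: with $S = \sigma_S L$ and $B = \sigma_B L$, requiring $S/\sqrt{B} = 5$ gives $L_5 = 25\,\sigma_B/\sigma_S^2$, and requiring $S = 10$ gives $L_{10} = 10/\sigma_S$. The sketch below computes both. How the two are combined into $L_D$ is not legible in the text above, so taking the larger of the two is an assumption made here; the function names are hypothetical.

```python
def lumi_for_significance(sigma_sig, sigma_bkg, z=5.0):
    """Luminosity at which S/sqrt(B) reaches z.

    Cross sections in fb give a luminosity in fb^-1.
    """
    if sigma_sig <= 0:
        raise ValueError("signal cross section must be positive")
    return z * z * sigma_bkg / sigma_sig ** 2


def lumi_for_events(sigma_sig, n_events=10):
    """Luminosity needed to expect n_events signal events."""
    if sigma_sig <= 0:
        raise ValueError("signal cross section must be positive")
    return n_events / sigma_sig


def discovery_lumi(sigma_sig, sigma_bkg):
    """Assumed combination: both conditions must hold, so take the max."""
    return max(lumi_for_significance(sigma_sig, sigma_bkg),
               lumi_for_events(sigma_sig))


if __name__ == "__main__":
    # Placeholder cross sections in fb (not the Table III values).
    print(discovery_lumi(sigma_sig=1.0, sigma_bkg=4.0))
```

For a nearly background-free selection the event-count condition dominates, while for larger backgrounds the significance condition sets the required luminosity.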
We show two $L_D$ contours estimated for the 13 TeV LHC in Fig. 5. To obtain these, we use a constant $K_{\rm NLO} = 2$ for $M_{e_8}$ beyond 1.5 TeV, as in the recast analysis in section III. With 300 fb$^{-1}$ of integrated luminosity, the mass reach goes from about 2.5 TeV to about 3.5 TeV as $\Lambda$ decreases to about 3.5 TeV ($\Lambda \approx M_{e_8}$) from very large values. Obviously, this increase in reach with decreasing $\Lambda$ happens because of the single productions, whose cross sections go like $1/\Lambda^2$.
Before closing this section, we make a note. Even though we have reinterpreted the CMS LQ data in terms of the $e_8$, it is also possible to distinguish the two at the LHC. Suppose a significant excess is found in the $eej$ data in the future. In Fig. 6, we show the $\eta$ distribution of the second hardest electron (as an example), which can be used to distinguish a spin-0 LQ from a spin-1/2 LG. Obviously, there are other possibilities as well. However, we do not pursue this issue further in this letter.

V. DISCUSSIONS AND CONCLUSIONS
The quark-lepton compositeness scenario is one of the well-known BSM scenarios which can accommodate LQs. In this letter, we have used the CMS first generation scalar LQ data in the $eejj$ and the $eej$ channels to probe this scenario. In these models, there exist other exotic composite particles that can also decay to lepton-jet final states. We have recast the CMS data in terms of such a particle, the color octet partner of the SM electron. An $e_8$ decays to an electron and a gluon via a dimension five interaction, suppressed by the compositeness scale. This opens up the possibility of probing the compositeness scale with the $eejj$ and the $eej$ data.
In a recent paper [31], we argued that at the LHC, a search for the pair production of a colored particle (generally, model independent) can get 'contaminated' from the model dependent single productions and vice versa. There, we used the examples of the CMS LQ searches to demonstrate how the pair and the single productions can be combined systematically in the signal simulations. As a result, even a search for the pair production can give information on the model parameters that control the single productions. In this letter too, we have adopted the same strategy, i.e., we have recast both the eejj and the eej data with signals that are combinations of the pair and the single productions for different values of Λ, the compositeness scale that controls the single productions. Hence, the analysis in this letter stands as yet another demonstration of our arguments in Ref. [31].
From the combined signal, we extract the exclusion limits in the $M_{e_8}$-$\Lambda$ plane. The limits obtained by our analysis are not very precise as they are obtained by simple rescaling instead of a full statistical analysis. However, one can conclude that the $eejj$ data disfavor $e_8$'s with mass below $\sim 1.5$ TeV for any value of $\Lambda$ (footnote 7). Beyond this mass range, the limit becomes a function of $\Lambda$. As the mass increases, the single productions dominate the combined signals in both the $eejj$ and the $eej$ channels, giving almost overlapping limits that can also be interpreted as limits on $\Lambda$. Data in both channels indicate that $\Lambda \gtrsim 2$ TeV for 1.5 TeV $\lesssim M_{e_8} \lesssim 2$ TeV. Beyond this mass range, where the exclusion limits enter the region with $M_{e_8} > \Lambda$, our effective theory approach becomes unreliable. We clearly mark this region in all the relevant plots. This is an inherent limitation present in any effective theory approach. It might also happen that, in nature, the $e_8$ is actually heavier than the compositeness scale. In that case, none of our limits or predictions would be reliable except in the parameter region dominated by the (QCD mediated) pair production. For example, let us suppose that, in nature, $\Lambda$ is actually smaller than 1.5 TeV, the mass range disfavored by the pair production data. In that case, we will still be able to say that the $e_8$ cannot exist below 1.5 TeV, but we would not be able to conclude anything definitive about $\Lambda$ from our analysis. Notice that there are other higher dimensional operators (like $O_{ggee}$ or $O_{qqee}$ for contact interactions) that, in principle, could also connect $\Lambda$ with the $eejj$/$eej$ data irrespective of the value of $M_{e_8}$. However, two points go against them: first, the signal selection criteria are not designed to favor them, and second, these operators are of dimensions higher than five (so unless $\Lambda$ is very small, in which case the whole effective theory approach might break down, these terms are expected to be highly suppressed). Hence, despite the inherent limitation, our approach gives the best available limits on $\Lambda$ and $M_{e_8}$ from the CMS 8 TeV $eejj$ and $eej$ data within the domain of validity of the effective theory (compare the limit on $M_{e_8}$ with the limit quoted in the Particle Data Book [30], $M_{e_8} > 86$ GeV from old Tevatron data [41]). Finally, we note that one can also analyze the second generation $\mu\mu jj$/$\mu\mu j$ data in terms of a color octet muon. However, it will be a very similar exercise and we do not expect that it will provide very different limits on $\Lambda$ from what we have obtained. In the case of the LQ, production of the second generation is reduced compared to the first generation because of the relative suppression of the second generation quark PDFs. However, since the LG productions at the LHC are mainly gluon mediated, they remain roughly the same for any generation.

Footnote 7: If additional sources of the LG pair production (like the higher dimensional operators in footnote 2 or the LO electroweak gauge mediated pair production etc.) are considered, this limit would receive corrections and could even acquire some $\Lambda$-dependence. However, it is normal to expect these corrections to be smaller than the QCD mediated LO pair production.