On the chirality of the SM and the fermion content of GUTs

The Standard Model (SM) is a chiral theory, where right- and left-handed fermion fields transform differently under the gauge group. Extra fermions, if they do exist, need to be heavy otherwise they would have already been observed. With no complex mechanisms at work, such as confining interactions or extra-dimensions, this can only be achieved if every extra right-handed fermion comes paired with a left-handed one transforming in the same way under the Standard Model gauge group, otherwise the new states would only get a mass after electroweak symmetry breaking, which would necessarily be small ($\sim100\textrm{ GeV}$). Such a simple requirement severely constrains the fermion content of Grand Unified Theories (GUTs). It is known for example that three copies of the representations $\mathbf{\overline{5}}+\mathbf{10}$ of $SU(5)$ or three copies of the $\mathbf{16}$ of $SO(10)$ can reproduce the Standard Model's chirality, but how unique are these arrangements? In a systematic way, this paper looks at the possibility of having non-standard mixtures of fermion GUT representations yielding the correct Standard Model chirality. Family unification is possible with large special unitary groups --- for example, the $\mathbf{171}$ representation of $SU(19)$ may decompose as $3\left(\mathbf{16}\right)+\mathbf{120}+3\left(\mathbf{1}\right)$ under $SO(10)$.


Introduction
There is currently no explanation for the flavor structure of the Standard Model (SM), and the Grand Unified Theories (GUTs) developed over the past decades have failed to shed light on this issue, since particles with different flavors are usually assigned to distinct copies of a single representation of the enlarged gauge group. For example, with $SU(5)$ one considers three copies (one for each flavor) of the representations $\overline{\mathbf{5}}$ and $\mathbf{10}$, containing exactly the SM fermions [1]: three replicas of $Q=\left(\mathbf{3},\mathbf{2},\tfrac{1}{6}\right)$, $u^{c}=\left(\overline{\mathbf{3}},\mathbf{1},-\tfrac{2}{3}\right)$, $d^{c}=\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}\right)$, $L=\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}\right)$, and $e^{c}=\left(\mathbf{1},\mathbf{1},1\right)$. In $SO(10)$ models, three $\mathbf{16}$'s contain all SM fermions plus three right-handed neutrinos $N^{c}=\left(\mathbf{1},\mathbf{1},0\right)$ [2,3]. Once the $SO(10)$ symmetry is broken, a vector (Majorana) mass $m\,N^{c}N^{c}$ is allowed for each of these extra fermion states, explaining why they have yet to be (directly) observed. Increasing further the size of the group, there is also the well-known possibility of having three copies of the $\mathbf{27}$ in $E_{6}$-based models [4], which contain 11 additional vector particles per generation.
In order to completely explain flavor with GUTs it would be necessary to place the SM fermions in a single representation of the gauge group.$^{1}$ This idea goes by the name of family unification and it was attempted in the past with a variety of groups [5][6][7][8][9][10][11][12][13][14][15][16][17][18]. For instance, the spinor representation of $SO(10+2N)$ can be broken into $2^{N-1}$ copies of the $\mathbf{16}$ of $SO(10)$, yet it also contains an equal number of $\overline{\mathbf{16}}$'s. Therefore, without confining interactions [5,6,19,20], extra dimensions [21][22][23][24][25][26][27][28][29][30][31][32][33][34][35], or some other elaborate mechanism, one cannot give a big mass to all these mirror families without making all the families super-heavy as well.$^{2}$ In fact, mirror families are just part of a larger problem: in general it is necessary to justify why all types of exotic fermions are heavy. Take as an example the representation $\mathbf{560}$ of $SO(10)$, which is the smallest one containing all SM fermions. On top of the fact that the excess of fermions ($5Q+4u^{c}+4d^{c}+3L+3e^{c}$) over mirror fermions ($1Q+1u^{c}+1d^{c}+1L+1e^{c}$) is not the correct one, there are also fermions in exotic SM representations such as $\left(\mathbf{15},\mathbf{1},\tfrac{1}{3}\right)$, and none in matching conjugate representations. Such states could only acquire an electroweak (EW) scale mass and therefore would have already been seen at the Tevatron and the LHC.
Perhaps the idea of unifying the three families in a single GUT fermion representation is too ambitious. One should then also consider models where the observed fermion states are distributed over various GUT representations [36][37][38][39][40][41][42][43][44][45][46][47][48][49][50][51]. Such models might still be quite interesting: if the GUT representations are not just mere copies of one another, the gauge symmetry alone might explain the existence of non-trivial flavor structures at low energies.
GUTs with an exotic fermion content may also have unusual features which go against what is usually taken for granted. One of them is a non-standard normalization of the hypercharge operator, which we shall now discuss. In order to see if the gauge couplings unify at a high scale in a given model, one usually takes the values of $g_{1}=\sqrt{5/3}\,g'$, $g_{2}=g$ and $g_{3}=g_{s}$ at roughly the Z-boson mass scale and runs them with the renormalization group equations up to high energies. The explanation for the numerical factor $\sqrt{5/3}$ is simple: the Lagrangian depends on the product of the gauge coupling constant $g'$ times the hypercharge operator $Y$, so the change $(g',Y)\to(n^{-1}g',nY)$ for some $n$ is of no consequence in the SM. Comparing the three gauge coupling constants is then pointless. However, if $SU(3)_{C}\times SU(2)_{L}\times U(1)_{Y}$ is a remnant of a larger simple gauge group, then suddenly there is a natural value for this $n$ parameter: one would like $Y$ to be one of the generators of this enlarged group, in which case its normalization must be the same as that of the rest of the generators: $\mathrm{Tr}\,T_{a}^{2}=\textrm{constant}$. As such, if we identify the components of the $\overline{\mathbf{5}}$ in an $SU(5)$ theory with those of the SM representations $d^{c}=\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}n\right)$ and $L=\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}n\right)$, then $Y=n\times\mathrm{diag}\left(\tfrac{1}{3},\tfrac{1}{3},\tfrac{1}{3},-\tfrac{1}{2},-\tfrac{1}{2}\right)$ and the third generator of $SU(2)_{L}$ is given by the matrix $T_{3L}=\mathrm{diag}\left(0,0,0,\tfrac{1}{2},-\tfrac{1}{2}\right)$, hence $|n|=\sqrt{3/5}$. This hypercharge normalization factor is often mentioned as being specific to $SU(5)$ models, even though it is actually very generic. Nevertheless, it is conceivable that $n$ might differ from $\sqrt{3/5}$: for instance, in $SU(5)$ one might try to identify $d^{c}$ with the representation $\left(\overline{\mathbf{3}},\mathbf{1},-\tfrac{2}{3}\sqrt{3/5}\right)$ inside the $\mathbf{10}$, $u^{c}$ with $\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{4}{3}\sqrt{3/5}\right)$ in the $\mathbf{45}$, and so on, in which case $g_{1}=-\tfrac{1}{2}\sqrt{5/3}\,g'$ would be the correct relation.$^{3}$ Another possibility would be to have, for example, an $SU(7)$ model with the fundamental representation breaking into $X\equiv\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}n\right)+\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}n\right)+\left(\mathbf{1},\mathbf{1},m\right)+\left(\mathbf{1},\mathbf{1},-m\right)$ with a non-zero $m$, in which case it is clear that $n$ will not be equal to $\sqrt{3/5}$, so $g_{1}\neq\sqrt{5/3}\,g'$ if we were to identify $d^{c}$ and $L$ with the first two SM representations. Note that $X$ forms a unitary 7-dimensional (reducible) representation of the SM group, so it is certainly possible to make the fundamental representation of $SU(7)$ break in this way.
$^{1}$ Rigorously speaking, when referring to a group we have in mind its algebra.
$^{2}$ The presence of confining gauge interactions could be an elegant solution to this problem, as pointed out in [5,6]. For example, if $SO(18)$ breaks into $SO(10)\times SO(5)$ such that $\mathbf{256}\to3\left(\mathbf{16},\mathbf{1}\right)+\left(\mathbf{16},\mathbf{5}\right)+2\left(\overline{\mathbf{16}},\mathbf{4}\right)$ and if $SO(5)$ becomes non-perturbative at some high scale, then one would expect that the only fermions which would remain light would be the three $\mathbf{16}$'s which are $SO(5)$ singlets. However, it seems difficult to drive $SO(5)$ into a non-perturbative regime at high energies.
What about having $\mathbf{7}\to\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}n\right)+\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}n\right)+\left(\mathbf{1},\mathbf{2},0\right)$? Such a scenario is even more interesting. The fundamental representation of $SU(7)$ branches into a single color anti-triplet, as usual, but now there are two doublets, which means that one should take $1/(2\sqrt{2})$ times the Pauli matrices as the generators of $SU(2)_{L}$, and not half the Pauli matrices. In such a model, the correctly normalized gauge couplings would then be $g_{1}=\sqrt{5/3}\,g'$, $g_{2}=\sqrt{2}\,g$ and $g_{3}=g_{s}$.$^{4}$ So, despite the widespread belief that this issue only affects abelian groups, there might clearly be interesting normalization corrections to any of the gauge coupling constants in GUTs with non-standard fermion assignments.
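The normalization argument above can be checked with a few lines of exact arithmetic. The following is a minimal sketch, not the procedure used in the paper: the function name `normalization` and the input format (a list of `(dim_SU3, dim_SU2, y)` triples, with `y` in units of $n$) are assumptions made for illustration. It requires $\mathrm{Tr}\,T^{2}$ to be the same for a color generator, for $T_{3L}$ and for $Y$, taking $\lambda_a/2$ and $\sigma_a/2$ as reference generators, and returns $n^{2}$ together with the squared rescaling of the $SU(2)_L$ generators.

```python
from fractions import Fraction

def normalization(components):
    """Return (n^2, s2^2) for a fundamental representation branching into
    the given SM components (hypothetical helper, for illustration only).

    components: list of (dim_su3, dim_su2, y), y = hypercharge in units of n.
    Only SU(3) singlets and (anti)triplets are supported in this sketch.
    s2 multiplies sigma_a/2 so that Tr T_a^2 is the same for all generators.
    """
    # Tr T_color^2 with lambda_a/2: 1/2 per (anti)triplet, times SU(2) multiplicity
    tr_color = sum(Fraction(d2, 2) for d3, d2, _ in components if d3 == 3)
    # Tr T_3L^2 with sigma_a/2: d(d^2-1)/12 per SU(2) multiplet of dimension d
    tr_weak = sum(Fraction(d3 * d2 * (d2 * d2 - 1), 12) for d3, d2, _ in components)
    # Tr Y^2 in units of n^2
    tr_y = sum(d3 * d2 * Fraction(y) ** 2 for d3, d2, y in components)
    return tr_color / tr_y, tr_color / tr_weak

third, half = Fraction(1, 3), Fraction(1, 2)
SU5_FIVE_BAR = [(3, 1, third), (1, 2, -half)]          # d^c + L
SU7_TWO_DOUBLETS = SU5_FIVE_BAR + [(1, 2, 0)]          # extra (1, 2, 0)
SU9_FOOTNOTE = SU5_FIVE_BAR + [(1, 2, 1), (1, 2, -1)]  # extra (1, 2, +-n)

if __name__ == "__main__":
    for name, rep in [("SU(5)", SU5_FIVE_BAR),
                      ("SU(7)", SU7_TWO_DOUBLETS),
                      ("SU(9)", SU9_FOOTNOTE)]:
        print(name, normalization(rep))
```

For the $SU(5)$ input this gives $n^{2}=3/5$ with no rescaling of the weak generators. For the two-doublet $SU(7)$ case it gives $n^{2}=3/5$ and a squared rescaling of $1/2$, which corresponds to generators $\sigma_a/(2\sqrt{2})$ and $g_{2}=\sqrt{2}\,g$. For the $SU(9)$ branching mentioned in footnote 4 it gives $n^{2}=3/29$.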
In summary, with or without family unification, it seems appropriate to systematically study the possible ways of arranging the fermions in Grand Unified Theories. The requirement that the SM chirality must be reproduced is a simple yet very stringent constraint which can be readily used to narrow down the list of possibilities. The aim of this paper is precisely to analyze, in a comprehensive way, the fermion sector of GUTs based on different groups, checking whether or not it is possible to obtain only the observed three families of fermions plus vector particles. Importantly, and in contrast to what is almost universally done in the literature, the fact that the SM group can be embedded in more than one way in a given GUT group will not be overlooked. The aim of the present work is therefore somewhat similar to that of the papers [37,39,52,53], but it is substantially broader in scope. For example, comparing with the interesting paper [53], we do not require (a) asymptotic freedom of the GUT (which severely restricts the number of fermion components allowed and consequently the group), nor (b) absence of gauge anomalies (although they get canceled automatically in almost all cases) and, above all, we do not make the (c) "bold assumption" that the embedding of the SM group is as trivial as possible.
$^{3}$ As explained later on, this particular example fails because one would not find anywhere the representation $\left(\mathbf{3},\mathbf{2},-\tfrac{1}{3}\sqrt{3/5}\right)$ needed for the left-handed quarks.
$^{4}$ It is amusing to consider the possibility of (almost) unifying the three gauge couplings at low energies exclusively in this way (although baryon number violation would be a concern). Conceptually, it is not very complicated: for example, if the fundamental representation of $SU(9)$ is broken into $\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}n\right)+\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}n\right)+\left(\mathbf{1},\mathbf{2},n\right)+\left(\mathbf{1},\mathbf{2},-n\right)$ with $n$ necessarily equal to $\sqrt{3/29}$, successfully associating the $d^{c}$ and $L$ fermions of the SM with the first two representations would imply that $g_{1}$, $g_{2}$ and $g_{3}$ have almost the same value at the EW scale (to within around 10%).
We shall first provide some generic considerations about the method used to scan over the various GUT groups, representations and embeddings (section 2) and, following that, the results for each group are presented and discussed (section 3 supplemented by an appendix). The main conclusions are summarized at the end (section 4).

The chirality of GUT models and representations
Let us briefly discuss and settle on a precise definition of (SM) chirality. Consider some embedding of the SM group $G_{SM}$ in a bigger group $G$. We shall be interested in tracking the representations $R_{SM}^{i}$ of $G_{SM}$ contained in some fermion representation $R$ of $G$ (the so-called branching rules of $R$). Yet, since pairs of SM vector fermions are irrelevant for the present analysis (as they can be made very heavy), we may define the chirality of $R$ to be the vector $\chi(R)$ with component $i$ given by the number of SM representations $R_{SM}^{i}$ contained in $R$ minus the number of SM representations $R_{SM}^{i*}$ in $R$:
$$\chi_{i}(R)\equiv\#\left(R_{SM}^{i}\textrm{ in }R\right)-\#\left(R_{SM}^{i*}\textrm{ in }R\right).\qquad(1)$$
For any real$^{5}$ SM representation ($R_{SM}^{i}=R_{SM}^{i*}$) we always get $\chi_{i}(R)=0$. On the other hand, we have the relation $\chi(R^{*})=-\chi(R)$, which implies that $\chi(R)$ is the null vector for a real $R$ ($R=R^{*}$). As such, SM (or GUT) real representations can be ignored completely and furthermore, concerning complex representations, the effect of having $n$ copies of $R_{SM}^{i}$ (or $R$) in a model is the same as subtracting from it $n$ copies of $R_{SM}^{i*}$ (or $R^{*}$) as far as chirality is concerned. For this reason, in this work we take $-R_{SM}^{i}$ (or $-R$) to be exactly the same as $R_{SM}^{i*}$ (or $R^{*}$). In the case of sums of representations of $G$, chirality is taken to be simply the sum of the chirality of each representation,
$$\chi\left(R_{1}+R_{2}+\cdots\right)=\chi(R_{1})+\chi(R_{2})+\cdots.\qquad(2)$$
This definition of chirality encodes in a precise way the intuitive notion associated with this word. It counts the number of each type of SM representation, factoring out real and conjugate pairs of representations. In the following, we shall see how it allows us to turn the problem of finding GUTs with the SM chirality into solving a system of linear equations.
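The definition of chirality can be made concrete with a small sketch. The encoding below is an assumption chosen for illustration (it is not taken from the paper or from Susyno): an SM representation is a tuple `((p, q), dim_su2, y)` with $SU(3)$ Dynkin labels `(p, q)`, and a branching rule is a dictionary mapping such tuples to multiplicities. The function `chi` then simply counts representations minus their conjugates.

```python
from fractions import Fraction as Fr

def conj(rep):
    """Conjugate of an SM representation ((p, q), dim_su2, y).
    SU(3) Dynkin labels are swapped, SU(2) reps are self-conjugate, y flips sign."""
    (p, q), d2, y = rep
    return ((q, p), d2, -y)

def chi(branching, basis):
    """Chirality vector: (# of rep) - (# of conj(rep)) for each rep in `basis`."""
    return [branching.get(r, 0) - branching.get(conj(r), 0) for r in basis]

# SM fermions of one family plus a right-handed neutrino
Q  = ((1, 0), 2, Fr(1, 6))
uc = ((0, 1), 1, Fr(-2, 3))
dc = ((0, 1), 1, Fr(1, 3))
L  = ((0, 0), 2, Fr(-1, 2))
ec = ((0, 0), 1, Fr(1))
Nc = ((0, 0), 1, Fr(0))   # real representation: never contributes
BASIS = [Q, uc, dc, L, ec, Nc]

# Standard branching rules of the SU(5) 5bar and 10, and of the SO(10) 16
five_bar = {dc: 1, L: 1}
ten = {Q: 1, uc: 1, ec: 1}
sixteen = {Q: 1, uc: 1, dc: 1, L: 1, ec: 1, Nc: 1}

if __name__ == "__main__":
    print(chi(sixteen, BASIS))
```

In this encoding the $\mathbf{16}$ has the same chirality as $\overline{\mathbf{5}}+\mathbf{10}$, the conjugate of a representation has the opposite chirality, and a vector pair such as $d^{c}$ plus its conjugate has none.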

GUTs with the correct chirality
With the above definition, finding the chirality of a representation of a group $G\supset G_{SM}$ is a matter of decomposing it into SM representations. In order to do so, one must first know how $G_{SM}$ is embedded in $G$, and it turns out that figuring out all the possible ways of doing so is a complicated problem, which we shall discuss later. For now, we may assume that this embedding information is known and fixed. If so, one can use computer programs such as Susyno [54]$^{6}$ or LieART [55] to decompose in a systematic way any representation of the group $G$ into those of $G_{SM}$ (for this work, the former was used).
In turn, with the branching rules of a list of representations $R_{i}$ of $G$, it is a rather simple exercise of linear algebra to find all integer linear combinations $\sum_{i}c_{i}R_{i}$ of the $R_{i}$ with the SM chirality. Indeed, defining $M$ to be the matrix with entries $M_{ij}=\chi_{i}(R_{j})$, the vector $c=\left(c_{1},\cdots,c_{n}\right)^{T}$ whose components are the integers $c_{i}$ we seek is the solution to the linear system
$$M\cdot c=\chi(SM),\qquad(3)$$
where $\chi(SM)$ is a vector with the SM chirality, as mentioned previously. From $\chi(SM)$ and $M$, one can extract $c$. As is well known, the general solution of this equation is of the form
$$c=\overline{c}+\sum_{i}\alpha_{i}n_{i},\qquad(4)$$
where $\overline{c}$ is any particular solution of equation (3) and the vectors $n_{i}$ are a basis of the nullspace of $M$ (i.e., $M\cdot n_{i}=0$). The $\alpha_{i}$ are plain numbers which can take any value, as long as the components of the $c$ vector are integer numbers. One can understand this generic form of $c$ as follows: the vector $\overline{c}$ describes a particular combination of the $R_{j}$ ($\sum_{j}\overline{c}_{j}R_{j}$) possessing the correct chirality, and each of the $n_{i}$ describes an independent, non-trivial combination of the $R_{j}$ ($\sum_{j}(n_{i})_{j}R_{j}$) with no chirality; therefore an arbitrary number of $n_{i}$'s can be added to or subtracted from $\overline{c}$.
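To clarify this approach, consider the following straightforward example. Take $SU(5)$ as the grand unified group, and its complex representations up to size 35 (we only need to consider one member of each conjugate pair): $\overline{\mathbf{5}}$, $\mathbf{10}$, $\mathbf{15}$, $\mathbf{35}$. They decompose into the following eleven $SU(3)_{C}\times SU(2)_{L}\times U(1)_{Y}$ representations:
$$\overline{\mathbf{5}}\to\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}\right)+\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}\right),$$
$$\mathbf{10}\to\left(\mathbf{3},\mathbf{2},\tfrac{1}{6}\right)+\left(\overline{\mathbf{3}},\mathbf{1},-\tfrac{2}{3}\right)+\left(\mathbf{1},\mathbf{1},1\right),$$
$$\mathbf{15}\to\left(\mathbf{6},\mathbf{1},-\tfrac{2}{3}\right)+\left(\mathbf{3},\mathbf{2},\tfrac{1}{6}\right)+\left(\mathbf{1},\mathbf{3},1\right),$$
$$\mathbf{35}\to\left(\overline{\mathbf{10}},\mathbf{1},1\right)+\left(\overline{\mathbf{6}},\mathbf{2},\tfrac{1}{6}\right)+\left(\overline{\mathbf{3}},\mathbf{3},-\tfrac{2}{3}\right)+\left(\mathbf{1},\mathbf{4},-\tfrac{3}{2}\right).$$
The chirality of the SM itself is the vector $\chi(SM)$ with an entry equal to 3 for each of the representations of $Q$, $u^{c}$, $d^{c}$, $L$ and $e^{c}$, and null entries for the remaining six. So, how many copies of each of the four $SU(5)$ representations are needed in order to obtain the SM chirality? If the fermions of the GUT model are $c_{1}\left(\overline{\mathbf{5}}\right)+c_{2}\left(\mathbf{10}\right)+c_{3}\left(\mathbf{15}\right)+c_{4}\left(\mathbf{35}\right)$, solving the system in equation (3) yields the unique solution $c_{1}=c_{2}=3$ and $c_{3}=c_{4}=0$. Needless to say, the conclusion reached with this example is unremarkable given that the representations considered were just $R=\overline{\mathbf{5}},\mathbf{10},\mathbf{15},\mathbf{35}$. The analysis gets more interesting when bigger representations are considered. If we do so, how unique is the standard fermion content $3\left(\overline{\mathbf{5}}\right)+3\left(\mathbf{10}\right)$ in $SU(5)$ GUTs? This is an important question which we address in this work, noting that the normalization of the SM hypercharge (usually given by a factor $\sqrt{3/5}$) depends on its answer.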
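The small $SU(5)$ example can be reproduced with exact rational arithmetic. The sketch below is an illustration only: the solver `solve` is a plain Gauss-Jordan elimination written for this purpose (the paper does not say how the system was solved), and the chirality matrix is filled in by hand from the standard branching rules of the $\overline{\mathbf{5}}$, $\mathbf{10}$, $\mathbf{15}$ and $\mathbf{35}$, with the $\mathbf{35}$ taken to be the symmetric cube of the $\overline{\mathbf{5}}$ (an assumed labeling convention).

```python
from fractions import Fraction

def solve(M, b):
    """Solve M c = b over the rationals.
    Returns (particular solution, nullspace basis), or (None, []) if inconsistent."""
    rows, cols = len(M), len(M[0])
    A = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue                      # free column
        A[r], A[p] = A[p], A[r]
        piv = A[r][c]
        A[r] = [x / piv for x in A[r]]    # normalize the pivot row
        for i in range(rows):             # clear the column elsewhere
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in A):
        return None, []
    particular = [Fraction(0)] * cols
    for i, c in enumerate(pivots):
        particular[c] = A[i][-1]
    null = []
    for free in (c for c in range(cols) if c not in pivots):
        v = [Fraction(0)] * cols
        v[free] = Fraction(1)
        for i, c in enumerate(pivots):
            v[c] = -A[i][free]
        null.append(v)
    return particular, null

# Rows: the eleven SM representations; columns: 5bar, 10, 15, 35
M = [
    [1, 0, 0, 0],  # d^c = (3bar, 1, 1/3)
    [1, 0, 0, 0],  # L   = (1, 2, -1/2)
    [0, 1, 1, 0],  # Q   = (3, 2, 1/6)
    [0, 1, 0, 0],  # u^c = (3bar, 1, -2/3)
    [0, 1, 0, 0],  # e^c = (1, 1, 1)
    [0, 0, 1, 0],  # (6, 1, -2/3)
    [0, 0, 1, 0],  # (1, 3, 1)
    [0, 0, 0, 1],  # (10bar, 1, 1)
    [0, 0, 0, 1],  # (6bar, 2, 1/6)
    [0, 0, 0, 1],  # (3bar, 3, -2/3)
    [0, 0, 0, 1],  # (1, 4, -3/2)
]
CHI_SM = [3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    print(solve(M, CHI_SM))
```

For this input the particular solution is three copies of the $\overline{\mathbf{5}}$ and of the $\mathbf{10}$ with no $\mathbf{15}$ or $\mathbf{35}$, and the nullspace is empty, so the solution is unique within this set of representations.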
Unfortunately, the type of simple analysis just presented is complicated by the fact that the SM group may be embedded in a GUT group G in more than one way. In particular, it is not known a priori what are the valid ways of combining the multiple U (1) factors inside G in order to form the SM's U (1) Y .

Different ways of embedding G SM in a group G ⊃ G SM
A systematic study of the different ways in which the SM chirality can be achieved in a GUT based on a group $G$ must necessarily take into account the distinct ways in which $G_{SM}$ can be embedded in $G$. (In fact, we only need to care about branching rules, so we shall be pragmatic and equate different embeddings to different branching rules.) Regardless of the actual symmetry breaking chain, we can view it as being made of two symbolic steps:
$$G\to SU(3)_{C}\times SU(2)_{L}\times U(1)^{m}\to SU(3)_{C}\times SU(2)_{L}\times U(1)_{Y}.$$
In the first step, $G$ is reduced to $SU(3)_{C}\times SU(2)_{L}$ times a maximal number of $U(1)$ factors, while in a second step this abelian part of the group is reduced to $U(1)_{Y}$. There is only a finite number of ways in which the first symmetry breaking step can be carried out (see table 1). Indeed, with the information in [56,57] one can break any semi-simple Lie algebra in a step-wise manner, $G\to G'\to G''\to\cdots\to SU(3)_{C}\times SU(2)_{L}\times U(1)^{m}$, such that the algebra of each group in this sequence is a maximal subalgebra of the preceding one,$^{7}$ discarding none of the $U(1)$ factors at this stage.
There are two important points concerning this first symmetry breaking step. The first one is that the number of $U(1)$ factors in the end result ($SU(3)_{C}\times SU(2)_{L}\times U(1)^{m}$) will depend on the chosen sequence of maximal subalgebras, as the rank of the groups may shrink. The second point is that the step-wise procedure of breaking $G$ into $SU(3)_{C}\times SU(2)_{L}\times U(1)^{m}$ subgroups will in general produce a large number of repetitions. In order to verify whether two embeddings are indeed different, it suffices to check that the branching rules for the fundamental representation of $G$ are distinct, with the exception of the $SO(2n)$ groups, which also require the branching rules of the spinor representation to be distinct [57][58][59].
To illustrate these two remarks, consider the case of $G=SO(10)$. Its maximal subgroups are [...], $SP(4)\times SP(4)$, although only the first two correspond to chiral embeddings.$^{8}$ In any case, consider for example the breaking chains (A), (B) and (C) of $SO(10)$ [...]. The fundamental ($\mathbf{10}$) and spinor ($\mathbf{16}$) representations of $SO(10)$ branch as follows [...], using unnormalized $U(1)$ charges (the separation of the two $U(1)$'s is irrelevant; the $U(1)^{2}$ group as a whole should be the same). Paths (A) and (B) lead to the same branching rules: there are two of them, which are related to one another by conjugation of the color quantum number. This is a trivial variation which exists for all chiral embeddings, so we may refer to 'pairs of chiral embeddings', although in the next section we shall simply focus on one member of each such pair of embeddings. So overall we could say that there are three possible embeddings of $G_{SM}$ in $SO(10)$ (having in mind step one of the symmetry breaking only), including one pair which is chiral. In the second symmetry breaking step, the two $U(1)$'s can be combined in any way to form $U(1)_{Y}$, at least from a purely group theoretical perspective. Therefore, the branching rules of the natural and spinor representations under the symmetry breaking $SO(10)\to G_{SM}$ take the forms of equations (12)-(14) [...] (the normalization of the $U(1)$'s is irrelevant at the moment) for some $\alpha$, $\beta$, $\gamma$ factors. It is worth stressing that the last case is not a special case of either of the first two. Also, while we did only consider three maximal subgroups of $SO(10)$ (chains A, B, C), it can be checked that all other cases lead to an embedding of the form (14). Doing so by hand is tedious (and even more so for bigger groups), and for that reason the Susyno program was used to automatically check for these repetitions.

Table 1 (columns: Group, # Embeddings).

Introducing physical considerations, not all embeddings of the forms in equations (12)-(14) can be used to embed the SM in an $SO(10)$ GUT. The one in equation (14) would lead to a vector theory, so it can be excluded. As for the chiral embedding described by equation (12) ((13) is similar), if we are to obtain the SM fermions from the $\mathbf{16}$, then $\alpha$ and $\beta$, which describe the [...].
Unfortunately, with larger GUT groups the situation becomes even more complicated if we do not make assumptions about the GUT representations where the SM fields are embedded, since there are more $U(1)$ factors to consider. This situation is not insurmountable, but it does require adaptations to the analysis suggested in section 2, since it cannot be carried out unless we know the hypercharge $y$ of the representations (all that is known is that $y=\sum_{i}\alpha_{i}y_{i}$, where the $y_{i}$ are the charges under $U(1)^{m}$, and the $\alpha_{i}$ are to be determined).
There seems to be no easy way to tackle this issue, and as a consequence the scans over GUT representations were smaller for the bigger groups. One way is to just look at the first two quantum numbers and try to match in all possible ways the SM representations with the ones of SU (3) C × SU (2) L × U (1) m obtained from some list of GUT representations -this is the intuitive approach which works very well for the 16 of SO (10). Whenever this approach proved to be too demanding computationally, we used instead a modification of the analysis in section 2, where the c i (encoding the unknown combination of the GUT representations) and the α i (encoding the unknown combination of the U (1)'s) are found simultaneously as the solution of complicated equations where the α i do not appear linearly.

GUTs with the SM chirality
The GUTs we wish to consider should be based on a group with complex representations, otherwise they would not give rise to an effective chiral theory. The simple Lie groups with this property and which contain G SM as a subgroup are SU (N ≥ 5), SO(4N + 2) for N ≥ 2, and E 6 . As such, in the following we shall analyze the fermion sector of GUTs based on one of these simple groups, investigating also models with the SU (3) × SU (3) gauge group (possibly with an extra U (1)).

SU (5)
We shall start by assuming that the hypercharges of the SM particles are normalized in the usual way: $y(e^{c})=\sqrt{3/5}$, for example. All 2048 (pairs of) complex representations of $SU(5)$ with size no larger than 1 million were decomposed with the Susyno program. A total of 29037 SM complex representations appear in these decompositions, therefore one obtains the system of linear equations in (3) where $\chi(SM)$ is a 29037-dimensional vector (with null components everywhere except for five $\pm3$ entries), $c$ is a 2048-dimensional vector of unknown coefficients $c_{i}$ (describing the number of copies of each $SU(5)$ representation), and $M$ is a 29037 by 2048 matrix. Both $\chi(SM)$ and $M$ are known, so it is possible to solve for the vector $c$ as explained in section 2. It turns out that the matrix $M$ has a trivial nullspace. As such, there is a single solution to equation (3), and it corresponds to the standard, well-known one: three copies of $\overline{\mathbf{5}}+\mathbf{10}$.$^{9,10}$ This simple but effective analysis shows that the $\overline{\mathbf{5}}$ and $\mathbf{10}$ fermion representations of $SU(5)$ are extremely special. However, we did assume the standard GUT hypercharge normalization factor $n_{strd}\equiv\sqrt{3/5}$. The usual justification for this factor is tied to identifying the components of the $\overline{\mathbf{5}}$ with those of the SM representations $L=\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}n\right)$ and $d^{c}=\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}n\right)$, as discussed in the introduction. Since we do not want to assume that the SM fermions are necessarily in the $\overline{\mathbf{5}}$ and $\mathbf{10}$ representations, we must admit other values for $n$. Which other values can it take? Looking for $SU(5)$ representations where the left-handed quarks might be embedded, we conclude that the $G_{SM}$ representations of the form $\left(\mathbf{3},\mathbf{2},\tfrac{1}{6}n\right)$ must have $n=(1+6k)\,n_{strd}$ for some integer $k$.
This can be easily shown analytically with the weight projection method (the reader may wish to see for example [60]) and indeed, probing the $SU(5)$ representations of size smaller than or equal to a million, one encounters all the SM fermion representations $\{Q,u^{c},d^{c},L,e^{c}\}$ with the hypercharge normalizations $n/n_{strd}=-17,-11,-5,1,7,13,19$. Crucially, each of these choices yields a different chirality vector $\chi(SM)$ in equation (3), and it turns out that there are no solutions except for the standard hypercharge normalization ($n/n_{strd}=1$).

SO(10)
We now repeat for $SO(10)$ the same analysis which was done for $SU(5)$. According to the discussion of subsection 2.3, there is only one chiral embedding in $SO(10)$, which is the one in equation (12) (equation (13) is similar) yet, since $m=2$, we do have to probe all possible values of $\alpha$ and $\beta$, which encode the relation between $U(1)_{Y}$ and the two $U(1)$'s contained in $SO(10)$. A list of possibilities can be computed by breaking all $SO(10)$ representations up to some size into those of $G'=SU(3)\times SU(2)\times U(1)^{2}$ and then assigning the SM fermions (at least two) to any $G'$ representations with the correct $SU(3)\times SU(2)$ quantum numbers. Such a procedure should be compared with the one used for $SU(5)$ (see above) where, instead of two, there was only one unknown parameter (a normalization factor).
This method produces an exhaustive list of (α,β) values for which SO(10) breaks into SU (3)× SU (2) × U (1) in such a way that all the SM fermions (Q, u c , d c , L, e c ) can be found inside some SO(10) representation, with the correct ratio of hypercharges. For each value of (α,β) it is then possible to compute an M matrix and solve equation (3) using it. This was done for all SO (10) representations up to size 1 million, and two conclusions became clear.
Firstly, concerning the embedding of $G_{SM}$ in $SO(10)$ and the normalization of the hypercharge, it turns out that equation (3) admits solutions only for the values $(\alpha,\beta)=\left(\pm\tfrac{1}{2},-\tfrac{1}{6}\right)$, corresponding to
$$\mathbf{10}\to\left(\mathbf{3},\mathbf{1},-\tfrac{1}{3}\right)+\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}\right)+\left(\mathbf{1},\mathbf{2},\tfrac{1}{2}\right)+\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}\right),$$
$$\mathbf{16}\to\left(\mathbf{3},\mathbf{2},\tfrac{1}{6}\right)+\left(\overline{\mathbf{3}},\mathbf{1},-\tfrac{2}{3}\right)+\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}\right)+\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}\right)+\left(\mathbf{1},\mathbf{1},1\right)+\left(\mathbf{1},\mathbf{1},0\right),$$
with the standard hypercharge normalization. It is worth mentioning that even though $SO(10)$ contains $SU(5)$, which in turn contains $G_{SM}=SU(3)\times SU(2)\times U(1)$ as a subgroup, there are more such $G_{SM}$ subgroups in $SO(10)$, and that is why the branching rules may vary depending on which one is picked. With this in mind, it is interesting to note that the only branching rule which can reproduce the SM chirality (shown above) matches the one of the $G_{SM}$ subgroup found inside $SU(5)$. We point out this feature now because in the remainder of this work we shall see that it is not specific to $SO(10)$: for all simple GUT groups which were tested, in order to recover the SM chirality, the embedding $G_{SM}\subset G_{GUT}$ must be such that the branching rules match those for a $G_{SM}$ inside a particular $SU(5)$ subgroup of $G_{GUT}$.$^{11}$
The second conclusion concerns the valid combinations of $SO(10)$ fermion fields: unlike $SU(5)$, we do find non-standard solutions with the SM chirality, although they involve very large representations. To be precise, referring to the framework set forth in section 2, we have the solution $3\left(\mathbf{16}\right)$, to which we can add an arbitrary number of the non-trivial combinations of $SO(10)$ representations which have no chirality (associated to the $n_{i}$ vectors of equation (4)) [...]. We recall here that $-R$ should be interpreted whenever necessary as $\overline{R}$. The field combinations $n_{1}$ and $n_{2}$ are just two out of many with no chirality; there are more such (independent) mixtures of fields, involving even bigger $SO(10)$ representations, which we will not write down here.$^{12}$ We simply point out here that the $n_{1}$ combination does not involve the $\mathbf{16}$, so any solution of the form $3(\mathbf{16})+k\,n_{1}$ for some $k\in\mathbb{Z}$ will still involve 3 copies of the spinor representation.
However, n 2 is a linear combination of representations which does contain the 16: a GUT theory with the fermion content 3(16) + 3n 2 would have no fermions in the spinor representation and still its chirality would be correct. Therefore, it is possible to build a GUT model based on the SO(10) group without spinors, even though its matter content would need to be extremely large 13 and, on the other hand, the flavor problem would persist since 3 copies (or more) of each fermion representation would still be needed.

E 6
The group $E_{6}$ has 38 pairs of complex representations with size at most 1 million. There are a total of 12 distinct ways of embedding $SU(3)\times SU(2)\times U(1)^{m}$ in $E_{6}$, which includes 5 pairs of chiral embeddings. For each of these, we have allowed $U(1)_{Y}$ to be any combination of the $m$ $U(1)$'s. Remarkably, once this variety of representations and embeddings is fully explored, it turns out that there is a unique solution with the correct chirality. In other words, there is both a unique embedding and a unique fermion field configuration which yield the SM chirality: it is 3 copies of the $\mathbf{27}$ representation with the embedding
$$\mathbf{27}\to\left(\mathbf{3},\mathbf{2},\tfrac{1}{6}\right)+\left(\overline{\mathbf{3}},\mathbf{1},-\tfrac{2}{3}\right)+\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}\right)+\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}\right)+\left(\mathbf{1},\mathbf{1},1\right)+\left(\mathbf{3},\mathbf{1},-\tfrac{1}{3}\right)+\left(\overline{\mathbf{3}},\mathbf{1},\tfrac{1}{3}\right)+\left(\mathbf{1},\mathbf{2},\tfrac{1}{2}\right)+\left(\mathbf{1},\mathbf{2},-\tfrac{1}{2}\right)+2\left(\mathbf{1},\mathbf{1},0\right).$$
It is well known that this branching rule matches the one for the $G_{SM}$ subgroup of the $SO(10)$ which is inside $E_{6}$, with the $\mathbf{27}$ breaking into $\mathbf{1}+\mathbf{10}+\mathbf{16}$ of $SO(10)$, so it is clear that the hypercharge normalization factor must be the standard one ($\sqrt{3/5}$).
$^{12}$ A total of 6 independent $n_{i}$ combinations with no chirality exist involving only $SO(10)$ representations with size smaller than or equal to 1 million.
$^{13}$ Quantum gravity effects are expected to become relevant in such a scenario [63,64].

SU (N > 5)
The step-wise procedure of subsection 2.3 for finding the embeddings of a subgroup $H$ in a group $G$ involves probing all sequences of maximal subgroups $G\to G'\to G''\to\cdots\to H$, which can be a very time-consuming process that does not provide much insight into the direct relationship between $H$ and $G$. It is therefore important to realize that there is a simpler way to list all these embeddings when $G$ is a special unitary group. For a given $G=SU(N)$, we start by picking all possible combinations of the irreducible representations of $H$ in order to obtain a complete list of the $N$-dimensional (potentially reducible) representations of the subgroup $H$. Since the representation matrices must be unitary (otherwise the kinetic terms would not be gauge invariant), it is obvious that there must be an embedding under which the fundamental of $G$ breaks into any of the $N$-dimensional (unitary) representations of $H$. One only needs to ensure that none of these $H$ representations has null (trivial) generators $T_{a}$, as this would imply that $\mathrm{Tr}\,T_{a}^{2}=0$. For example, consider the embedding of $SU(5)$ in $SU(6)$. There are three inequivalent reducible 6-dimensional representations of $SU(5)$: $\mathbf{1}+\mathbf{1}+\mathbf{1}+\mathbf{1}+\mathbf{1}+\mathbf{1}$, $\mathbf{1}+\overline{\mathbf{5}}$, and $\mathbf{1}+\mathbf{5}$. However, the generators of the algebra in the first representation are null, so there are just two embeddings of $SU(5)$ in $SU(6)$: $\mathbf{6}\to\mathbf{1}+\overline{\mathbf{5}}$ and $\mathbf{6}\to\mathbf{1}+\mathbf{5}$. Next, take the $SU(5)\to G_{SM}$ example. The fundamental representation of $SU(5)$ must break into at least one non-trivial representation of [...] $\left(\mathbf{1},\mathbf{1},0\right)$ representation of $G_{SM}$). It is easy to check that the irreducible representation of size $N(N-1)/2$ obtained from the anti-symmetric product of two $F$'s (let us call it $K$ here) decomposes as
$$K\to\left(\mathbf{3},\mathbf{2},\tfrac{1}{6}\right)+\left(\overline{\mathbf{3}},\mathbf{1},-\tfrac{2}{3}\right)+\left(\mathbf{1},\mathbf{1},1\right)+(N-5)\left[\left(\mathbf{3},\mathbf{1},-\tfrac{1}{3}\right)+\left(\mathbf{1},\mathbf{2},\tfrac{1}{2}\right)\right]+\tfrac{(N-5)(N-6)}{2}\left(\mathbf{1},\mathbf{1},0\right),$$
so the combination $-3(N-4)F+3K$ has the correct chirality. In fact, for $N$ between 6 and 12, it turns out that this embedding is the only one which can reproduce the SM chirality; this is the conclusion of a scan over all combinations of fermion representations up to some limiting size (at least 1000), considering as well in the process all possible ways of forming the hypercharge operator from the available $U(1)$'s. In the appendix there are various tables with the fermion content allowed by the chirality constraint in these $SU(N)$ GUTs (the combination $-3(N-4)F+3K$ is not unique).
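The counting of candidate embeddings for special unitary groups can be sketched as a small enumeration. The code below is only an illustration under stated assumptions: the function `candidate_branchings` and the string labels for the irreducible representations are hypothetical, the list of irreducible representations of $H$ must be supplied by hand, and the "no null generators" condition is implemented in its simplest form, valid for a simple $H$ (at least one non-trivial irreducible representation must appear). For a product group such as $G_{SM}$ each factor would have to act non-trivially, which this sketch does not check.

```python
def candidate_branchings(irreps, n, trivial="1"):
    """List the multisets of irreps of H whose dimensions add up to n and which
    contain at least one non-trivial irrep.

    irreps: dict mapping a label to the dimension of the irrep.
    Each result is a tuple of labels, a candidate branching of the fundamental of SU(n).
    """
    names = sorted(irreps)

    def rec(start, remaining):
        # Build non-decreasing sequences of labels to avoid counting permutations
        if remaining == 0:
            yield ()
            return
        for i in range(start, len(names)):
            d = irreps[names[i]]
            if d <= remaining:
                for rest in rec(i, remaining - d):
                    yield (names[i],) + rest

    return [b for b in rec(0, n) if any(x != trivial for x in b)]

# Irreps of SU(5) with dimension at most 6 ("5b" stands for the anti-fundamental)
SU5_SMALL_IRREPS = {"1": 1, "5": 5, "5b": 5}

if __name__ == "__main__":
    print(candidate_branchings(SU5_SMALL_IRREPS, 6))
```

For $SU(5)\subset SU(6)$ this leaves the two branchings $\mathbf{6}\to\mathbf{1}+\mathbf{5}$ and $\mathbf{6}\to\mathbf{1}+\overline{\mathbf{5}}$, in agreement with the text.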
As the size of the $SU(N)$ groups increases, it becomes harder to manage the problem of testing all the possible ways of forming the SM hypercharge group. Yet, we should recall that for all groups $G_{GUT}$ where such an analysis was carried out ($SO(10)$, $E_{6}$ or $SU(N)$ with $5\leq N\leq12$) it turned out that there was at most one branching rule that worked for each $G_{GUT}$, and it matched the one obtained by considering the $G_{SM}$ inside one of the $SU(5)$ subgroups of $G_{GUT}$. So, with hindsight, we could have simply looked for the possible ways of embedding $SU(5)$ in the unification groups, and then broken $SU(5)$ to $G_{SM}$; no important embedding of $G_{SM}$ in $G_{GUT}$ would have been missed. Or, better yet, since an $SU(5)$ theory will only reproduce the SM's chirality with the field combination $-3\left(\mathbf{5}\right)+3\left(\mathbf{10}\right)$, we would just need to see if this $SU(5)$ chirality$^{15}$ is attainable with a given unification group.
Inspired by this observation, we have analyzed the different ways of embedding $SU(5)$ in $SU(N\geq13)$. From an $SU(5)$ perspective, the embedding in equation (19) corresponds to the following branching rules,
$$F\to\mathbf{5}+(N-5)\,\mathbf{1},\qquad(20)$$
$$K\to\mathbf{10}+(N-5)\,\mathbf{5}+\tfrac{(N-5)(N-6)}{2}\,\mathbf{1},\qquad(21)$$
where $F$ is the fundamental representation of $SU(N)$, and $K$ is the one in the anti-symmetric part of the product $F\times F$ as before. The field combination $-3(N-4)F+3K$ works from the chirality point of view, and one might add that it does not lead to gauge anomalies [65,66]: indeed, for this embedding, the chirality condition implies that the $SU(N)$ gauge anomalies cancel, so all the configurations one can build from the tables in the appendix are fine. Going through the $SU(N)$ family of groups, we have checked that the embedding in equations (20) and (21) is the only one that works until $SU(15)$ is reached. For this and bigger groups, two new remarkable $SU(5)$ embeddings become possible. The first one corresponds to
$$F\to2\left(\mathbf{5}\right)+\overline{\mathbf{5}}+(N-15)\,\mathbf{1},\qquad(22)$$
which means that $K\equiv(F\times F)_{Ant.}$ and $L\equiv(F\times F)_{Sym.}$ will break as follows:
$$K\to2\left(\mathbf{24}\right)+\mathbf{15}+3\left(\mathbf{10}\right)+\overline{\mathbf{10}}+2(N-15)\,\mathbf{5}+(N-15)\,\overline{\mathbf{5}}+\tfrac{244-31N+N^{2}}{2}\,\mathbf{1},\qquad(23)$$
$$L\to2\left(\mathbf{24}\right)+3\left(\mathbf{15}\right)+\overline{\mathbf{15}}+\mathbf{10}+2(N-15)\,\mathbf{5}+(N-15)\,\overline{\mathbf{5}}+\tfrac{214-29N+N^{2}}{2}\,\mathbf{1}.\qquad(24)$$
As such, the field combination −(N − 12)F + 2K − L will have the correct chirality and it is anomaly free. This is not the only configuration that works for a given N ≥ 15; there are non-trivial combinations of the representations of SU(N) which have no chirality, just as for the embedding in equations (20)-(21). Nevertheless, we shall not print them here (they are fairly elaborate).
Footnote 15: By 'SU(5) chirality' we are referring to the concept of chirality as discussed in section 2, but based on the analysis of SU(5) representations instead of those of SU(3) × SU(2) × U(1).
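The two chirality statements above can be checked with a few lines of bookkeeping. The sketch below is only illustrative: it assumes that the standard embedding acts as F → 5 + (N − 5)1, takes F → 2(5) + $\overline{\mathbf{5}}$ + (N − 15)1 for the first new embedding, and hard-codes the handful of SU(5) products needed (5 × 5 = 15 + 10, 5 × $\overline{\mathbf{5}}$ = 24 + 1). The function names are hypothetical and not taken from any package.

```python
from collections import Counter
from math import comb

# SU(5) irreps are labelled by strings; a trailing "b" marks a conjugate.
CONJ = {"5": "5b", "10": "10b", "15": "15b"}
SYM2 = {"1": {"1": 1}, "5": {"15": 1}, "5b": {"15b": 1}}   # symmetric squares
ANT2 = {"1": {}, "5": {"10": 1}, "5b": {"10b": 1}}         # anti-symmetric squares

def tensor(a, b):
    """Product of two different irreps taken from {1, 5, 5b}."""
    if a == "1":
        return {b: 1}
    if b == "1":
        return {a: 1}
    return {"24": 1, "1": 1}          # 5 x 5b = 24 + 1

def square(F, symmetric):
    """(Anti-)symmetric part of F x F for F = {irrep: multiplicity}."""
    same, other = (SYM2, ANT2) if symmetric else (ANT2, SYM2)
    out = Counter()
    irreps = sorted(F)
    for i, a in enumerate(irreps):
        m = F[a]
        for r, k in same[a].items():      # m copies of the same irrep
            out[r] += comb(m + 1, 2) * k
        for r, k in other[a].items():
            out[r] += comb(m, 2) * k
        for b in irreps[i + 1:]:          # cross terms between distinct irreps
            for r, k in tensor(a, b).items():
                out[r] += m * F[b] * k
    return out

def chirality(terms):
    """Net (R minus conjugate R) content of sum_i c_i * rep_i."""
    net = {r: 0 for r in CONJ}
    for c, rep in terms:
        for r in CONJ:
            net[r] += c * (rep.get(r, 0) - rep.get(CONJ[r], 0))
    return net

SM = {"5": -3, "10": 3, "15": 0}          # i.e. -3(5) + 3(10)

def standard_ok(N):
    F = {"5": 1, "1": N - 5}              # assumed standard embedding
    K = square(F, symmetric=False)
    return chirality([(-3 * (N - 4), F), (3, K)]) == SM

def second_ok(N):
    F = {"5": 2, "5b": 1, "1": N - 15}    # first new embedding, N >= 15
    K, L = square(F, False), square(F, True)
    return chirality([(-(N - 12), F), (2, K), (-1, L)]) == SM

print(all(standard_ok(N) for N in range(5, 21)))
print(all(second_ok(N) for N in range(15, 21)))
```

Only the net numbers of 5, 10 and 15 are tracked, since real representations (1 and 24) never contribute to the chirality.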
The second new noteworthy embedding encountered for N ≥ 15 is the one under which the anti-fundamental representation of SU(N) breaks into exactly one SM family plus singlets:
$\overline{F} \to \overline{\mathbf{5}} + \mathbf{10} + (N-15)\,\mathbf{1}$. (25)
Obviously, the field configuration −3F will have the correct chirality. However, the reader will immediately notice that it leads to a theory with an SU(N)^3 anomaly, which should be seen as a warning that the chirality condition does not always imply the absence of anomalies (see [67]). For this embedding, there are non-trivial combinations n_i of the SU(N) representations with no chirality which contribute to the SU(N)^3 anomaly, so one could have hoped that for some c_i, the combination $-3F + \sum_i c_i n_i$ would be anomaly free. Unfortunately, no such c_i exist for the SU(N) groups tested. How does one judge such models then? They can be seen as non-renormalizable effective models [68]; or maybe there is some way to cancel the anomaly (perhaps string inspired [69]). In any case, we shall not worry about these anomalies: we simply assume that these models (or variations of them) can be made part of a consistent quantum field theory. In this spirit, we shall say a few more words about the curious embedding in equation (25). Under it, the K representation of SU(16 + N) contains precisely N SM families plus vector particles. In the particular case of SU(19), looking through its SO(10) subgroup makes things even clearer: K → 3(16) + 120 + 3(1).
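The anomaly statements made for the three embeddings can be illustrated with the standard cubic anomaly coefficients of SU(N), normalized so that the fundamental has A(F) = 1, the anti-symmetric square has A(K) = N − 4 and the symmetric square has A(L) = N + 4, with conjugation flipping the sign. The helper below is a hypothetical sketch using only these textbook values.

```python
def su_n_anomaly(N, combo):
    """Cubic anomaly of sum_i c_i R_i, normalized so that A(F) = 1.

    combo maps "F", "K" (anti-symmetric square) or "L" (symmetric square)
    to an integer multiplicity; negative values stand for conjugates.
    """
    coeff = {"F": 1, "K": N - 4, "L": N + 4}
    return sum(c * coeff[name] for name, c in combo.items())

for N in range(15, 21):
    first = su_n_anomaly(N, {"F": -3 * (N - 4), "K": 3})         # -3(N-4)F + 3K
    second = su_n_anomaly(N, {"F": -(N - 12), "K": 2, "L": -1})  # -(N-12)F + 2K - L
    third = su_n_anomaly(N, {"F": -3})                           # -3F
    print(N, first, second, third)
```

The first two combinations give zero for every N, while −3F always leaves a net anomaly of −3; similarly, a single K of SU(19) carries an anomaly of 15 in this normalization.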
In other words, it is possible to unify the three SM families in K of SU(19) (and more generally N families in SU(16 + N)) although the model will have gauge anomalies. The three embeddings of SU(5) in SU(N) which were presented above are the only ones that work for N ≤ 20 (representations of size up to one million were considered). For N > 20 one will certainly find new embeddings: for example, starting with the SU(45) group it is certainly possible to fit the three families (plus vector particles) in the fundamental representation (although leading to gauge anomalies once again). Yet, with increasingly big unification groups the possibilities of embedding SU(3) × SU(2) × U(1)^m in them grow in a seemingly exponential way (see figure 1). For this reason, one might argue that models based on very large gauge groups are not as attractive as those based on smaller ones: they contain many subgroups, therefore a significant tuning of the scalar sector parameters would likely be needed in order to have the correct symmetry breaking.

SO(10 + 4N) with N > 0
In four dimensions and without confining interactions, is it possible to embed the SM in a theory based on a gauge group of the family SO(10 + 4N) for N > 0? These groups do have complex representations and, furthermore, it is possible to chirally embed the SM group in them (although such embeddings are not plentiful, see table 1). However, despite these promising features, after performing computer scans, it seems impossible to reproduce the SM chirality in SO(14) and SO(18), at least not with fermion representations with a size smaller than 2 million. Perhaps it has to do with the fact that, unlike SO(10), it is not possible to chirally embed SU(5) in SO(14) nor SO(18). Interestingly, this last statement no longer holds for bigger groups of the SO(10 + 4N) family.
As with the special unitary groups, it becomes hard to analyze all possible ways of embedding G_SM, so in the following we shall focus instead on the SU(5) subgroups of SO(10 + 4N) (see footnote 16). Table 2 contains a curious piece of information: while there are two ways of embedding SU(5) in SO(10), a pair of chiral embeddings, there is just one for SO(14) and SO(18), and under it the complex representations of these groups break into a mixture of real and pairs of complex conjugate representations of SU(5). The number of embeddings is bigger for SO(22) and SO(26) but they too are all vector embeddings. However, in this case persistence pays off, as it is possible to chirally embed SU(5) in SO(30).
The trouble with the SO(10 + 4N) group family is that the complex representations become exponentially large with N: the spinor representation, which is the smallest one, has 4^(2+N) components. In the case of SO(30), we have considered the complex representations smaller than the 132562944 (there are only 7) and all the chiral embeddings of SU(5) in SO(30); it turns out that it is impossible to obtain three families of $\overline{\mathbf{5}}+\mathbf{10}$ plus vector particles.

Table 2: Number of ways of embedding SU(5) in SO(10 + 4N) groups. Only some of these correspond to chiral embeddings (the integers in parentheses indicate the number of pairs of chiral embeddings in each case).
Groups            # Embeddings
SO(14), SO(18)    1 (0)
SO(22)            3 (0)
SO(30)            10 (3)

Footnote 16: The algorithm mentioned previously to find quickly and easily the embeddings of a subgroup H in the special unitary groups cannot be readily adapted to the special orthogonal groups. The method consisted essentially in building all the N-dimensional representations of some H ⊂ SU(N). With SO(N), one would have to consider only the strictly real N-dimensional representations R of H ⊂ SO(N) (i.e., exclude the pseudo-real and complex ones): for every such R there is an embedding under which the fundamental representation F of SO(N) breaks into R. But the branching rules of the spinor representation S of SO(N) would be missing. One can only speculate that perhaps one can find these, up to conjugation, from the branching rules of the fundamental representation F, since F and S are related. For example, using the shorthand notation X{m} and X[m] to denote the completely symmetric and anti-symmetric parts of the product of m copies of a representation X, there is the relation [...]
As for bigger groups in the SO(10 + 4N) family, they were not tested so one can only speculate about the possibility of embedding the three SM families in such models. The complex representations will be even bigger than those of SO(30), each of them potentially breaking into thousands of distinct complex SU(5) representations, so it seems unlikely that one could match all these sub-representations in pairs $R,\overline{R}$ leaving only a small excess of $\overline{\mathbf{5}}$'s and $\mathbf{10}$'s over $\mathbf{5}$'s and $\overline{\mathbf{10}}$'s. Even if it is possible, it would almost inevitably require millions of new vector particle components (see footnote 17).
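To make the growth of the smallest complex representation concrete, the following short sketch evaluates the dimension of a chiral spinor of SO(10 + 4N), using the standard result that a Weyl spinor of SO(2n) has 2^(n−1) components. The function name is a hypothetical choice made for this illustration.

```python
def spinor_dim(N):
    """Dimension of a chiral (Weyl) spinor of SO(10 + 4N)."""
    n = (10 + 4 * N) // 2        # SO(2n) has rank n
    return 2 ** (n - 1)          # equals 4 ** (2 + N)

for N in range(6):
    print(f"SO({10 + 4 * N}): spinor of dimension {spinor_dim(N)}")
```

For N = 0 this gives the familiar 16 of SO(10), while for SO(30) the spinor already has 16384 components.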

SU(3) × SU(3) and SU(3) × SU(3) × U(1)
We shall consider SU(3) × SU(3), even though it is not a simple group, because it contains G_SM as a subgroup, it has a minimal number of diagonal generators, and it is a group with complex representations. Besides SU(5), the only other semi-simple group with these properties is SU(3) × SU(2) × SU(2), which clearly will not yield the correct chirality as one would always have pairs of representations with opposite hypercharges. Are there SU(3) × SU(3) models with the correct chirality? Models with an extra U(1) can successfully embed the SM, achieving the correct chirality, but without this extra abelian factor we shall see that this is not possible.
One of the two factors is the color group SU(3)_C, which must not be broken, while the other (let us call it henceforth SU(3)_L) has to break into the EW group SU(2)_L × U(1)_Y. We recall here that the representations of SU(3)_L can be uniquely labeled by two non-negative integers {a,b} (the Dynkin coefficients), which in terms of Young tableaux can be identified with the representation with a columns with a single row and b columns with two rows. The conjugate representation of {a,b} is {b,a}, and their dimension is (a + 1)(b + 1)(a + b + 2)/2; for example, {0,0}, {1,0} and {0,1} are the singlet, triplet (by convention), and anti-triplet representations, respectively. On the other hand, the SU(2)_L × U(1)_Y representations can be labeled by their spin j and (unnormalized) hypercharge y.
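As a quick illustration of the labeling just described, the dimension formula and the conjugation rule can be written down directly. This is a minimal sketch with hypothetical function names.

```python
def su3_dim(a, b):
    """Dimension of the SU(3) representation with Dynkin coefficients {a,b}."""
    return (a + 1) * (b + 1) * (a + b + 2) // 2

def su3_conjugate(a, b):
    """Dynkin coefficients of the conjugate representation."""
    return (b, a)

def su3_is_complex(a, b):
    """A representation is complex when it differs from its conjugate."""
    return (a, b) != su3_conjugate(a, b)

for label in [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (3, 1)]:
    print(label, su3_dim(*label), su3_is_complex(*label))
```

For instance, {1,1} is the real octet, whereas {3,1} is a complex representation of dimension 24.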
Crucially, it is possible to derive the branching rules of a generic representation of SU(3)_L into those of SU(2)_L × U(1)_Y: {a,b} decomposes into representations with hypercharge y = a − b + 3n (n ∈ Z, −a ≤ n ≤ b) whose SU(2)_L spins are j = (a + b − |n|)/2, (a + b − |n|)/2 − 1, ..., |a − b + n|/2.

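In particular, the state with n = 0 and the highest spin, (j,y) = ((a + b)/2, a − b), is complex whenever the SU(3)_L representation from which it originates is complex as well (a ≠ b).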
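The decomposition of {a,b} under SU(2)_L × U(1)_Y can be enumerated explicitly. The sketch below is an assumption-laden illustration rather than the procedure used in the paper: it relies on the standard Gelfand-Tsetlin labeling of SU(3) states, with the hypercharge normalized as y = a − b + 3n so that the triplet contains a doublet with y = 1 and a singlet with y = −2. The overall sign of y is a convention.

```python
from fractions import Fraction

def su3_dim(a, b):
    """Dimension of the SU(3) representation {a,b}."""
    return (a + 1) * (b + 1) * (a + b + 2) // 2

def branch(a, b):
    """SU(2) x U(1) content of {a,b} as a sorted list of (spin j, hypercharge y)."""
    states = []
    for s in range(a + 1):            # Gelfand-Tsetlin entry k1 = b + s
        for t in range(b + 1):        # Gelfand-Tsetlin entry k2 = t
            n = s + t - a             # so that -a <= n <= b
            j = Fraction(b + s - t, 2)
            y = a - b + 3 * n
            states.append((j, y))
    return sorted(states)

for label in [(1, 0), (0, 1), (1, 1), (2, 1)]:
    content = branch(*label)
    total = sum(int(2 * j + 1) for j, _ in content)
    print(label, [(str(j), y) for j, y in content], total == su3_dim(*label))
```

The highest-spin state ((a + b)/2, a − b) always appears in the list, which is the state singled out in the argument of this section.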
On the other hand, if a + b > 1 this state will be an SU(2) triplet or higher-dimensional representation which must have a vector mass (since it is not seen at low energies). As such, in a model where there is an {a,b} complex representation of SU(3)_L with a + b > 1, one needs to ensure the presence of at least another SU(3)_L representation which contains the state (j,y) = ((a + b)/2, b − a). Obviously this can be achieved with the representation {a,b}* = {b,a}, but having both {a,b} and its conjugate would not affect a model's chirality (see section 2) and for that reason we have stated previously that such configurations should be excluded from the analysis. Therefore, in order to obtain the SU(2)_L × U(1)_Y representation (j,y) = ((a + b)/2, b − a) one must have an SU(3)_L complex representation {a′,b′} with a′ + b′ > a + b. But such an {a′,b′} multiplet would decompose into complex states with spin even higher than (a + b)/2, presenting a renewed problem. And thus, this circular argument shows that no combination of SU(3)_C × SU(3)_L representations can yield the SM chirality.
With an extra U(1)_X group factor, the hypercharge Y receives a contribution from X, for some κ factor which controls the relative weight of X in Y. A known possibility is to take κ = 1, with the leptons in the representations $3(\mathbf{1},\mathbf{3},-\frac{1}{3}) + 3(\mathbf{1},\mathbf{1},1)$ and the quarks in $2(\mathbf{3},\overline{\mathbf{3}},0) + (\mathbf{3},\mathbf{3},\frac{1}{3}) + 4(\overline{\mathbf{3}},\mathbf{1},-\frac{2}{3}) + 5(\overline{\mathbf{3}},\mathbf{1},\frac{1}{3})$ [70]; an alternative is to take κ = −3, placing the leptons in the representations $3(\mathbf{1},\mathbf{3},0)$ and the quarks in $2(\mathbf{3},\overline{\mathbf{3}},-\frac{1}{3}) + (\mathbf{3},\mathbf{3},\frac{2}{3}) + 3(\overline{\mathbf{3}},\mathbf{1},-\frac{2}{3}) + 3(\overline{\mathbf{3}},\mathbf{1},\frac{1}{3}) + 2(\overline{\mathbf{3}},\mathbf{1},\frac{4}{3}) + (\overline{\mathbf{3}},\mathbf{1},-\frac{5}{3})$ [71][72][73]. Trivial variations to the fermion content, namely adding real or conjugate pairs of representations of the SU(3)_C × SU(3)_L × U(1)_X gauge group, are always possible and might even have interesting motivations (see for example [74]). But what about non-trivial variations? Just as with simple unification groups, the need to reproduce the SM chirality poses a very significant constraint.
We do note that, because of the extra U(1)_X group factor, the chirality condition does not automatically imply the absence of gauge anomalies, so these should be seen as complementary conditions on the possible fermion content of a given model.
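As an illustration of these complementary conditions, the anomalies involving U(1)_X can be evaluated for the two known field contents quoted above (κ = 1 and κ = −3). The sketch below is a hedged check with hypothetical names; it only uses the multiplicities, dimensions and X charges of the fields, so it covers the mixed gravitational, X^3, SU(3)_C^2 X and SU(3)_L^2 X anomalies, none of which depends on whether a triplet or an anti-triplet is used.

```python
from fractions import Fraction as Fr

# Each field: (multiplicity, dimension under SU(3)_C, dimension under SU(3)_L, X)
MODEL_K1 = [                                   # kappa = 1
    (3, 1, 3, Fr(-1, 3)), (3, 1, 1, Fr(1)),    # leptons
    (2, 3, 3, Fr(0)), (1, 3, 3, Fr(1, 3)),     # quarks in SU(3)_L (anti-)triplets
    (4, 3, 1, Fr(-2, 3)), (5, 3, 1, Fr(1, 3)), # quarks in SU(3)_L singlets
]
MODEL_KM3 = [                                  # kappa = -3
    (3, 1, 3, Fr(0)),
    (2, 3, 3, Fr(-1, 3)), (1, 3, 3, Fr(2, 3)),
    (3, 3, 1, Fr(-2, 3)), (3, 3, 1, Fr(1, 3)),
    (2, 3, 1, Fr(4, 3)), (1, 3, 1, Fr(-5, 3)),
]

def x_anomalies(fields):
    """Anomaly sums involving X, each up to an irrelevant overall factor."""
    grav = sum(m * dc * dl * x for m, dc, dl, x in fields)       # gravity^2 X
    cubic = sum(m * dc * dl * x ** 3 for m, dc, dl, x in fields) # X^3
    color = sum(m * dl * x for m, dc, dl, x in fields if dc == 3)  # SU(3)_C^2 X
    weak = sum(m * dc * x for m, dc, dl, x in fields if dl == 3)   # SU(3)_L^2 X
    return grav, cubic, color, weak

print(x_anomalies(MODEL_K1))
print(x_anomalies(MODEL_KM3))
```

All four sums vanish for both field contents, as expected for anomaly-free models.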
We shall state here what are the simplest non-trivial modifications which can be made to the known SU(3)_C × SU(3)_L × U(1)_X models (see footnote 19). If R is a complex representation of SU(3)_C, then it turns out that the simplest combination of fields with no chirality and no gauge anomalies is of the form (see footnote 20)
−4(R,1,z) + 5(R,3, ...) + ... (30)
for some arbitrary number z (κ was introduced earlier). Remarkably, this expression introduces many new SU(3)_L representations going all the way up to size 24. So, for example, in the model [70], if we were to replace the quarks in four copies of the representation $(\overline{\mathbf{3}},\mathbf{1},-\frac{2}{3})$ by something else, the simplest possibility would be the mixture of representations which follows from equation (30). The complexity of this mixture of representations can be interpreted as saying that the quark assignment in SU(3)_C × SU(3)_L × U(1)_X models (for a fixed κ) is, for practical purposes, unique. Note that gauge anomaly cancellation alone would allow the replacement of $4(\overline{\mathbf{3}},\mathbf{1},-\frac{2}{3})$ with much simpler combinations.
If R is a real representation of SU(3)_C, the combination in equation (30) is still valid, but it is not the simplest one. That distinction goes to a combination of fields depending on two numbers, x and y, where x should be different from y and −κ/3 − y, otherwise this would be a self-conjugate combination of fields. Note also that we can drop one of the last four representations if it is made real by choosing an appropriate value for x or y. As such, referring once more to the model of [70] where κ = 1, one could replace the leptons in the three copies of the representation $(\mathbf{1},\mathbf{3},-\frac{1}{3})$ by three copies of a rather more complex combination, which nevertheless does not involve SU(3)_L representations bigger than (anti-)triplets.
In summary then, for a fixed embedding of the SM group in SU(3)_C × SU(3)_L × U(1)_X, it is fairly complicated to assign the leptons to other representations, and even more so for the quarks.

Concluding remarks
Pairs of right- and left-handed fermions transforming in the same way under the SM gauge group (vector fermions) can have a very high mass and therefore escape direct observation. On the other hand, unpaired ones (chiral fermions) may only get a small mass after electroweak symmetry breaking. Therefore, in Grand Unified Theories one must avoid introducing chiral fermions beyond those present in the Standard Model.
Motivated by this observation, we have analyzed in a systematic way the fermion sector of GUTs containing the SM fermions plus vector particles only. This very simple requirement on the fermion content of GUTs turns out to be a very constraining one (implying in most cases, but not all, the cancellation of all gauge anomalies). The analysis carried out assumes that there are no extra dimensions nor confining gauge interactions at high energies.
A thorough computer scan was performed over all simple groups with rank smaller than 12 (excluding SO(22)) and over all their representations up to some size (from a few thousand up to millions, depending on the group). A very significant part of the work consisted in tracking down and looking at all possible ways of embedding G_SM in each unification group: this required cataloging the different ways that SU(3) × SU(2) × U(1)^m can be embedded in G_GUT and, for each of them, considering that the SM's U(1)_Y can be a priori any linear combination of the m available U(1)'s.
With these simple groups, how exotic can GUTs be (the group, the embedding, and the field content)? Concerning the group, SO(14) and SO(18) were found not to work, leaving SO(10), SU(5 ≤ N ≤ 12) and E_6 as viable unification groups. Surprisingly, it was found that the SM gauge group must be embedded in each of these groups such that the GUT representations decompose in a unique way into SM fields. This uniqueness is far from obvious, even though model builders have been working with these groups and these embeddings for a long time. In every case, it turns out that the SM group is embedded in the GUT group in such a way that it can be viewed as going through SU(5) for the calculation of the representation branching rules, G_GUT → SU(5) → G_SM, so an important consequence of this result is that the hypercharge normalization factor 3/5, usually associated with SU(5) GUTs, is in fact universal (see footnote 21). Indeed, the relations $g_1=\sqrt{5/3}\,g'$, $g_2=g$ and $g_3=g_s$ are the correct ones for all the tested cases.
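The factor 3/5 quoted above can be recovered from one SM family alone, by demanding that the hypercharge generator be normalized like the other generators of a simple group, so that its trace of squares over a complete multiplet matches that of the third weak isospin component. The following sketch uses the SM hypercharges listed in the introduction; the function names are hypothetical.

```python
from fractions import Fraction as Fr

# One SM family: (name, colour dimension, SU(2) dimension, hypercharge Y)
FAMILY = [("Q", 3, 2, Fr(1, 6)), ("uc", 3, 1, Fr(-2, 3)), ("dc", 3, 1, Fr(1, 3)),
          ("L", 1, 2, Fr(-1, 2)), ("ec", 1, 1, Fr(1))]

def trace_y2(fields):
    """Sum of Y^2 over all components."""
    return sum(dc * dw * y ** 2 for _, dc, dw, y in fields)

def trace_t3_sq(fields):
    """Sum of T3^2 over all components.

    For an SU(2) multiplet of dimension d, the sum of m^2 is d (d^2 - 1) / 12.
    """
    return sum(dc * Fr(dw * (dw ** 2 - 1), 12) for _, dc, dw, _ in fields)

ratio = trace_t3_sq(FAMILY) / trace_y2(FAMILY)
print(ratio)   # normalization factor multiplying Y^2
```

The ratio is the factor by which Y^2 must be rescaled, which is equivalent to the relation between g_1 and g′ quoted in the text.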
As far as the field content is concerned, we dismissed the introduction of real fermion GUT representations or pairs of complex conjugate ones as trivial variations to the fermion sector of a model. It is also inconsequential to exchange the GUT representations $\sum_i R_i$ by their conjugates $\sum_i \overline{R}_i$, since one will recover the same SM representations by considering a different embedding of G_SM in G_GUT. Factoring out such variations, the standard fermion assignments $3(\overline{\mathbf{5}})+3(\mathbf{10})$ in SU(5) and 3(27) in E_6 appear to be unique, while 3(16) in SO(10) is not. In this last case, it is possible to get rid of the spinor representation, but that requires the introduction of complicated mixtures of very large representations. On the other hand, SU(5 < N ≤ 12) models may have a rich variety of different fermion representations, some of which have already been explored in the literature. For example, it is possible to have an SU(11) model with just three fermion representations, $2(\overline{\mathbf{55}})+\mathbf{462}$, matching in this sense the minimality of SO(10) and E_6 GUTs, and perhaps exhibiting an interesting flavor structure.
Two non-simple groups were looked at as well. It was shown that no model based on the gauge group SU(3) × SU(3) can yield the SM chirality. On the other hand, viable models are known to exist with an extra U(1), and in this work we commented that non-trivial changes to their fermion sector are possible, although they do need to be very elaborate.
Bigger unification groups were also considered, assuming that G_SM is in an SU(5) subgroup of G_GUT. This analysis strongly suggests that SO(10 < N ≤ 30) are not suitable grand unified groups, in contrast to the SU(N)'s which, for N ≥ 15, can actually embed G_SM in multiple valid ways. Interestingly, under one of these embeddings it is possible to unify N SM families in a single representation of SU(16 + N). In particular, the 171 of SU(19) contains the SM fermions plus vector particles only. However, one should keep in mind that family unification with special unitary groups leads to gauge anomalies which, in a fundamental theory, need to be dealt with.

Appendix
As mentioned in the text, the embedding of the SM group in SU(5 < N ≤ 12) characterized by the branching rule in equation (20) was the only one found to be capable of reproducing the SM chirality. The fermion content c ≡ −3(N − 4)F + 3K is a particular example of a valid one (F being the fundamental representation of SU(N) and K ≡ (F × F)_Ant.). Other equally valid combinations of fields can be built by adding non-trivial combinations n_i of the SU(N) representations with no chirality (as discussed in section 2). This appendix contains the tables (3)-(9) which list all such independent n_i, for 5 < N ≤ 12. The biggest representation considered for the elaboration of each table was determined mainly by space considerations, given that more solutions seem to always appear if one includes bigger ones. For each group, one or more "interesting solutions" were picked in a somewhat subjective manner: they are notable for having either a reduced number of types of representations or a small number of total representations (including multiplicity).
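Several of the solutions quoted in this work only involve anti-symmetric powers of the fundamental representation, for which the SU(5) content is easy to count. The sketch below assumes the branching F → 5 + (N − 5)1 for the embedding of equation (20) and identifies the 35 of SU(7) and the 462 of SU(11) with the third and fifth anti-symmetric powers of F, respectively; these identifications and the function names are assumptions of this illustration.

```python
from math import comb

def wedge_chirality(N, k):
    """Net numbers (n5, n10) of 5's and 10's in the k-th anti-symmetric power of F.

    Assumes F -> 5 + (N - 5) 1, so that the k-th power contains
    comb(N - 5, k - j) copies of the j-th anti-symmetric power of the 5.
    """
    m = N - 5

    def mult(j):
        return comb(m, k - j) if 0 <= k - j <= m else 0

    n5 = mult(1) - mult(4)     # powers 1 and 4 of the 5 are the 5 and its conjugate
    n10 = mult(2) - mult(3)    # powers 2 and 3 of the 5 are the 10 and its conjugate
    return n5, n10

def combination(N, terms):
    """Net (n5, n10) of sum_i c_i * (k_i-th anti-symmetric power of F)."""
    n5 = sum(c * wedge_chirality(N, k)[0] for c, k in terms)
    n10 = sum(c * wedge_chirality(N, k)[1] for c, k in terms)
    return n5, n10

print(combination(6, [(-6, 1), (3, 2)]))    # c = -6(6) + 3(15)
print(combination(7, [(-9, 1), (3, 2)]))    # c = -9(7) + 3(21)
print(combination(7, [(-6, 1), (3, 3)]))    # c - 3 n_1 = -6(7) + 3(35)
print(combination(11, [(-2, 2), (1, 5)]))   # -2(55) + 462
```

Each line should reproduce the SU(5) chirality −3(5) + 3(10), that is the pair (−3, 3), while the difference between the two SU(7) solutions has no chirality.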

Table 3: Information on the combinations of SU(6) representations yielding the correct chirality. Maximum size of representations considered: 1000. Interesting solutions with the SM chirality: c = −6(6) + 3(15). For example, the simplest solution, c, together with more vector particles is discussed in [44,46,49].
Table 4: Information on the combinations of SU(7) representations yielding the correct chirality. Interesting solutions with the SM chirality: c = −9(7) + 3(21), c − 3n_1 = −6(7) + 3(35). For example, [45] uses c and [36] mentions both c and c − 3n_1.
Table 5: Information on the combinations of SU(8) representations yielding the correct chirality. In [47] the author considers the combination c − n_1, which is readily seen to be the one involving the least amount of fermion components. Reference [43] considers instead the combination c − 2n_1.
Table 6: Information on the combinations of SU(9) representations yielding the correct chirality. For example, the solutions c − n_1 and c − n_1 + n_2 are mentioned in [39] and [48] respectively.
Table 8: Information on the combinations of SU(11) representations yielding the correct chirality. The solution c + n_1 − n_2 is mentioned in [37].
Table 9: Information on the combinations of SU(12) representations yielding the correct chirality. The solution c + 4n_1 − 6n_2 + 4n_3 is mentioned in [50].