Data Mining for Three-Dimensional Organic Dirac Materials: Focus on Space Group 19

We combined the group theory and data mining approach within the Organic Materials Database that leads to the prediction of stable Dirac-point nodes within the electronic band structure of three-dimensional organic crystals. We find a particular space group P212121 (#19) that is conducive to the Dirac nodes formation. We prove that nodes are a consequence of the orthorhombic crystal structure. Within the electronic band structure, two different kinds of nodes can be distinguished: 8-fold degenerate Dirac nodes protected by the crystalline symmetry and 4-fold degenerate Dirac nodes protected by band topology. Mining the Organic Materials Database, we present band structure calculations and symmetry analysis for 6 previously synthesized organic materials. In all these materials, the Dirac nodes are well separated within the energy and located near the Fermi surface, which opens up a possibility for their direct experimental observation.

Recently, we have witnessed growing interest in the research community in the Dirac materials where the low-energy excitations behave as massless Dirac fermions 1 . Among the most prominent examples are the two-dimensional material graphene 2 , the surface of bulk topological insulators 3 like Pb x Sn 1−x Te 4, 5 , Dirac-line materials [6][7][8] and Weyl semimetals like TaAs 9 . To date, the strong focus within electronic Dirac materials lies in the inorganic crystals. The class of organic crystals remains rather unexplored and only a few organic Dirac materials are known. One prominent example is the quasi two-dimensional charge transfer salt α-(BEDT-TTF) 2 I 3 which shows a tilted Dirac cone located at the Fermi energy under high pressure 10 . At the same time, organic crystals offer a high potential for technological applications due to low production costs, elastic properties (flexible electronics) and the opportunity to build large-area devices 11 . We therefore will focus on exploring the space of organic materials with the specific goal of identifying Dirac materials.
Since the crystal structure plays a crucial role for hosting Dirac nodes, attempts of identifying organic Dirac materials so far are mainly based on variations of already known Dirac materials. In two dimensions, this can be done by starting with the graphene structure and replacing the carbon atoms by more complex organic molecules 12 . A similar strategy was also discussed with respect to α-(BEDT-TTF) 2 I 3 13,14 . To go beyond this approach, we adopt a new strategy for the search for Dirac nodes in the class of three-dimensional organic crystals. We performed a data mining study on the basis of 5217 electronic Kohn-Sham band structures calculated using Density Functional Theory (DFT) 15,16 and stored within the Organic Materials Database (OMDB) 17 . We looked for isolated linear crossings, i.e., where no other bands can be found besides the crossing within the corresponding energy range. Although all considered organic materials were previously synthesized, little attention has been paid to the electronic structure for most of them so far.
To achieve stable Dirac points within the electronic structure, symmetry or topological protection needs to be present. In this connection, crystals with non-symmorphic space groups have been widely discussed [18][19][20] where the key role is played by high-dimensional irreducible representations at the Brillouin zone boundary. As a consequence of the Wigner-Eckart theorem, the degeneracy of an electronic state is equal to the dimension of an irreducible representation. In the context of crystals, these degeneracies were discussed in great detail during the 1960s 21 . Recently, with a reinterpretation, degenerate electronic states in crystals again attracted attention as a host to unconventional fermions as low-energy excitation 22 . There is also a second opportunity of hosting Dirac crossing as accidental crossings 23 , which are protected by band topology. Such crossings can be found for instance in crystals with the monoclinic space group P2 1 /c (#14) 6,24 . In crystals with this space group, electronic energy bands are sticking together in groups of four bands. As reported in ref. 6, for each of these groups, three topologically different orderings of electronic states can be found at the Γ-point within the Brillouin zone-a trivial phase and two different line-node phases. Within the space group P2 1 2 1 2 1 (#19), an almost similar situation is present, with the difference that at least one crossing has to occur along one of the paths ΓX, ΓY or ΓZ within the Brillouin zone. This group will be discussed below in Results.
In this paper, we report results of a combined study using abstract group theory and data mining within the Organic Materials Database (OMDB) 17 BrNO 30 . It will be shown that the found Dirac nodes are a consequence of the orthorhombic crystal structure of the space group P2 1 2 1 2 1 (#19). Within the band structure of the materials, two different kinds of nodes can be distinguished: 8-fold degenerate Dirac nodes protected by the crystalline symmetry and 4-fold degenerate tilted Dirac nodes protected by the band topology.

Results
Data mining and electronic structure calculations. As reported in ref. 17, most three-dimensional organic crystals are insulating. However, the doping of organic materials is extensively studied, opening the opportunity of shifting the Fermi level into the valence band (p-doping) or conduction band (n-doping) 11,31 . Even though it is chemically more difficult to achieve in comparison to inorganics, for example, due to purification effects of organics, it was successfully implemented within OLEDs 32, 33 , solar cells 34,35 or thermoelectric materials 36 . Therefore, we searched for isolated linear crossings in a neighborhood of 0.5 eV above the lowest unoccupied electronic state and below the highest occupied electronic state within the Kohn-Sham band structures stored in the OMDB by explicitly focusing our attention to materials with the space group P2 1 2 1 2 1 (#19). For the 6 most promising structures tabulated in Table 1, additional refined DFT calculations were performed (see Methods for more details).
In the following, we concentrate on the discussion of the material C 6 H 7 ClO 3 (the electronic and molecular structures for the other 5 materials can be found in the supplementary material). Its orthorhombic crystallographic unit cell is build up of four copies of C 6 H 7 ClO 3 molecules as shown in Fig. 1(a). The calculated band structure along several high symmetry paths within the Brillouin zone ( Fig. 1(b)) is plotted in Fig. 1(d). The isolated linear crossings occur at the high-symmetry point R (green dashed circles in Fig. 1(d)) as well as along the high-symmetry path ΓX (red dotted circles in Fig. 1(d)). A picture of the linear energy dispersion within the → b 2 -→ b 3 -plane within reciprocal space in the vicinity of the R-point can be seen in Fig. 1(c). The linear crossings are well separated within the energy and as a consequence the electronic density of states grows quadratically with the energy in the vicinity of the three-dimensional Dirac crossing ( as can be seen in Fig. 1(d).
However, to justify the claim of the found materials being Dirac materials, a protection of the crossings by crystalline symmetry is necessary. Indeed, we find that the nature of the crossings forming Dirac nodes in the spectra can be explained within the framework of group theory below.
Group theory analysis. The space group P2 1 2 1 2 1 (#19) itself (hereinafter denoted by G G) is an infinite group having the group of pure translations T as an infinite, normal and Abelian subgroup. The point group of the lattice G G 0 , i.e., the group of all rotational parts of the space group elements is given by 222 (D 2 ). The factor group G G T / is isomorphic to 222 and the coset representatives are given by Here, E denotes the identity element and C 2x , C 2y and C 2z denote two-fold rotations (rotations by 180°) about the Cartesian x-, y-and z-axis, respectively. In general, since G G is an infinite group, it has infinitely many irreducible representations. However, due to the special structure of space groups they can be indexed by the combined index → k p ( , ), where → k denotes a vector in reciprocal space and p denotes an additional index running over all the allowed representations at → k . A degeneracy of a state with energy → E k ( ) can be expected when the associated irreducible representation Γ→ k p has dimension d > 1 or when pairs of complex conjugate representations are present. Such a degeneracy is denoted as "protected by the crystalline symmetry". At the R point within the Brillouin zone, the physically-irreducible representation is four dimensional 37 . Additional spin-degeneracy leads to an 8-fold degenerate linear crossing. Such crossings were recently referred to as double Dirac crossings 38 . Within the band structure in Fig. 1(d), these crossings are highlighted by green dashed circles.
At the Γ point, the group of the → k -vector is given by the whole space group. Each of the eigenstates is one-fold degenerate and belongs to one of the irreducible representations listed in Table 2 (spin degeneracy is omitted for the moment). As soon as one moves slightly away from the Γ point, for instance on the path ΓX, the little group of the → k -vector only contains the elements T 1 and T 2 . Hence, according to Table 3, the bands will be classified by their transformation behavior with respect to the two-fold rotation, namely even (Γ 1 ) or odd (Γ 2 ). Every state at  X transforms as the 2-dimensional irreducible representation E illustrated in Table 4. Moving towards X and coming from ΓX bands of character Γ 1 and Γ 2 have to merge pairwise. In general, bands can only cross (accidental crossings) when they belong to different irreducible representations 23 . Otherwise they will hybridize and form a spectral gap. Clearly, a similar consideration holds along ΓY and ΓZ. Taking into account all possible permutations of the four irreducible representations at Γ as well as the possible connections to X, Y and Z, it can be verified that at least one crossing can be found along one of the 3 paths. Taking into account the C 2 rotation symmetry, a second copy of the crossing along the path Γ −X ( ) can be also found within the Brillouin zone. Furthermore, as soon as one slightly departs from one of the 3 paths towards the interior of the Brillouin zone, none of the symmetries is kept and only the identity element T 1 is present. Thus, there is no reason to protect the crossing at any point that is not lying on one of the 3 paths and the crossing itself is a Dirac point. Hence, by including spin-degeneracy, this crossing is 4-fold degenerate.
In the present case of C 6 H 7 ClO 3 , the topologically protected crossing can be found along the path ΓX, as can be verified from Fig. 1(d,e). In Fig. 1(d), the topologically protected crossing is highlighted by a red dotted circle. As can be seen, the four interesting bands below the Fermi level have the ordering B 3 , A, B 2 , B 1 . Bands originating from A and B 1 and from B 3 and B 2 are merging pairwise at X for the above discussed reason. As a consequence, a crossing of bands originating from A and B 2 can be observed (see Fig. 1(e)).
Without spin-orbit coupling the system can be split into two identical spin-polarized components. One simple (spin-polarized) point-node is naturally characterized through with the topological charge given by the Chern number, |C 1 | = 1. By C 2 symmetry, the two simple point-nodes (located here on the ΓX -axis) must carry the same charge, say C 1 = −1. According to the Nielsen-Ninomiya theorem 39,40 , the total charge over the whole Brillouin zone must be zero. We then deduce that the double point-node at R must carry a charge C 1 = +2. This can be confirmed both numerically and algebraically 41 .
Interestingly, C 6 H 7 ClO 3 is close to a Lifschitz transition between two inequivalent global band topologies, controlled by a band inversion at Γ. Here, A and B 1 are almost degenerate having an energy of ≈−0.39 eV (see Fig. 1(e)). Indeed, the energy ordering of the irreducible representations at Γ, i.e., {B 2 , B 1 , B 3 , A, B 1 , B 2 , A, B 3 }, guarantees to have two disconnected blocks of four bands (without counting the spin degree of freedom) over the whole Brillouin zone. Alternatively, if we exchange A and B 1 , the global band topology guarantees to have a single block of eight bands being nontrivially connected 41 . Supporting this, our data mining has revealed a clear candidate of a nontrivially connected eight-band subspace which is characterized by two pairs of topologically stable simple point-nodes (one pair along ΓX and one pair along ΓY ) shown in the supplementary material (see Fig. 2(b)). This is the first realistic band structure reported with this new type of global band topology.

Discussion
We presented the first 6 predicted compounds for three-dimensional organic crystals hosting isolated Dirac crossings. All 6 materials were synthesized before, so we encourage direct experimental verification of the results presented here. Within the respective band structures, we identified 8-fold degenerate Dirac nodes at the R point in the Brillouin zone together with 4-fold degenerate topologically protected Dirac points along the high-symmetry path ΓX. The crossings are well separated in energy due to the flat electronic bands of organic crystals and potentially accessible via doping or gating. In comparison to inorganic materials, the spatially sparse unit cells of organic materials characterized by van der Waals bonding between large molecules leads to an electronic structure characterized by blocks of well separated flat bands. This particular property also increases chances of finding topologically protected crossings to be isolated within the energy. We expect that more organic Dirac materials will be reported elsewhere with growth of the OMDB database. Furthermore, the slope of the crossings is usually much smaller then for similar inorganic crystals leading to potential applications as slow Dirac materials 42 .

Methods
Materials data and data mining. We analyzed 5217 Kohn-Sham band structures of the three-dimensional organic crystals stored within the Organic Materials Database (OMDB) 17 (http://omdb.diracmaterials.org). The search algorithm for isolated Dirac crossings consisted of two major steps. On the first step, illustrated in Fig. 2, the algorithm selected all materials with either zero or tiny direct energy gaps (less than 1 meV) located up to 0.5 eV above the minimum energy of the lowest conductance band or 0.5 eV below the maximum energy of the highest valence band (for metals, the reference point was the calculated Fermi energy). The purpose of this gap is to introduce numerical tolerance since the band structure calculations were performed along a discrete mesh and the algorithm could miss the crossing point if it occurs between two mesh points. The algorithm also checked Table 3. Character table of the point group 2 (C 2 ). that no other bands can be found within the corresponding energy range of the gap, which is a necessary (but not sufficient) criterion to find an isolated Dirac crossing. The first step allowed us to significantly reduce the search space as it found only 45 gaps (45 materials or 0.9% of the initial dataset) near the lowest conductance band and 92 gaps (91 materials or 1.8% of the initial dataset) near the highest valence band, making manual inspection of the search results feasible. The statistics suggest that this simple criterion performs well in filtering out irrelevant materials and can be used for the automated search for isolated Dirac crossings.
In the second step, we applied a nearest neighbor search algorithm to arrange the selected materials according to their similarity to a pattern of two crossing straight lines. With this aim, we considered the two nearest bands within a momentum window around a direct gap detected in the previous step. We empirically set the size of the momentum window to be 0.4 units wide as an expected characteristic scale of the Dirac crossing. Finally, we used the average Euclidean distance (the root mean square error) between the pattern and these two bands (within the momentum window) linearly scaled to the same bounding box to arrange the selected materials. The second step becomes important to prioritize search results when the number of materials is large.
Electronic structure calculations. Having found a subset of perspective materials, we performed refined electronic structure calculations in the framework of the density functional theory 15 by applying a pseudopotential projector augmented-wave method [43][44][45] , as implemented in the Vienna Ab initio Simulation Package (VASP) [46][47][48] and the Quantum ESPRESSO code 49 . The exchange-correlation functional was approximated by the generalized gradient approximation according to Perdew, Burke and Ernzerhof 50 . The structural information were taken from the Crystallography Open Database (COD) [51][52][53] and transformed into DFT input files by applying the Pymatgen package 54 .
Within VASP, the precision flag was set to "normal" meaning that the energy cut-off is given by the maximum of the specified maxima for the cut-off energies within the POTCAR files (for example, for carbon this value is given by 400 eV). The calculations were performed spin-polarized but without spin-orbit coupling. For the integration in → k -space, a 6 × 6 × 6 Γ-centred mesh according to Monkhorst and Pack 55 was chosen during the self-consistent cycle. A structural optimization was performed by allowing the ionic positions, the cell shape and the cell volume to change (ISIF = 3). Optimized structures from the VASP calculations were used. Quantum ESPRESSO was applied to estimate the associated irreducible representations of the energy levels within the band structure. The cut-off energy for the wave function was chosen to be 48 Ry and the cut-off energy for the charge density and the potentials was chosen to be 316 Ry. The calculated band structures using VASP and Quantum ESPRESSO are in perfect agreement. Figure 2. Illustration of the search criteria for isolated Dirac crossings near the Fermi level. The algorithm selected all materials with either zero or tiny direct energy gaps (less than 1 meV) located up to 0.5 eV above the minimum energy of the lowest conductance band or 0.5 eV below the maximum energy of the highest valence band, where no other bands can be found within the corresponding energy range of the gap. The purpose of this gap is to introduce numerical tolerance since the band structure calculations were performed along a discrete mesh.