Conditions for electronic hybridization between transition-metal dichalcogenide monolayers and physisorbed carbon-conjugated molecules

Hybridization effects play a crucial role in determining the electronic properties of hybrid inorganic/organic interfaces. To gain insight into these important interactions, we perform a first-principles study based on hybrid density-functional theory including spin-orbit coupling, focusing on eight representative systems formed by two carbon-conjugated molecules-pyrene and perylene-physisorbed on the transition-metal dichalcogenide monolayers (TMDCs) MoS$_2$, MoSe$_2$, WS$_2$, and WSe$_2$. By means of band unfolding techniques, we analyze the band structures of the considered materials, identifying the contributions of the individual constituents as well as the signatures of their hybridization. Based on symmetry and energetic arguments, we derive general conditions for electronic hybridization between conjugated molecules and underlying TMDCs even when the former do not lie planar on the latter, thus providing the key to predict how their mutual arrangement affects their electronic interactions.


I. INTRODUCTION
Hybrid interfaces formed by organic semiconductors deposited on inorganic substrates have received considerable attention in the last couple of decades: Combining light-absorbing molecules with inorganic crystals characterized by enhanced charge-carrier mobility is particularly promising in view of novel opto-electronic applications [1][2][3][4][5][6]. Since the first attempts at the beginning of this century to decorate oxide surfaces with light-harvesting molecules [7][8][9], countless hybrid materials and interfaces have been explored in order to identify optimal combinations for the desired targets [10][11][12][13][14][15][16][17]. In parallel, the rise of low-dimensional materials [18][19][20] has opened new horizons in this field. Among others, the unique electronic and optical properties discovered for transition metal dichalcogenide (TMDC) monolayers [20][21][22][23] have unveiled new perspectives to realize novel hybrid materials with unprecedented opto-electronic performance .
The electronic coupling between their constituents represents one of the crucial aspects ruling the characteristics of inorganic/organic interfaces. When molecules are physisorbed onto inorganic semiconducting substrates, chemical interactions are typically weak, especially if they do not induce charge transfer in the ground state [46,47]. However, even in this scenario, the band structure of the hybrid system is rarely the mere superposition of the features of its building blocks [27,[48][49][50]. Very often, hybridization effects are large enough to generate new electronic states that are unique of the new structure, and that are responsible for the peculiar types of excitations emerging at hybrid inorganic/organic interfaces [51][52][53][54][55][56].
Ab initio studies have played a decisive role in understanding electronic interactions [31,40,[57][58][59][60][61] and spectroscopic signatures [33,62,63] of hybrid materials. In particular, manybody perturbation theory methods applied on top of density-functional theory (DFT) are capable of providing a quantitative description of the electronic and optical proprieties of hybrid materials in excellent agreement with experiments [6,62]. Unfortunately, these calculations are extremely expensive when performed on systems approaching 100 atoms in their unit cells. Employing DFT with range-separated hybrid functionals represents a reliable and yet numerically sustainable strategy to obtain an accurate description of the electronic structure of hybrid materials [40,58,64]. However, even with the reduced computational costs provided by this approach, the use of large supercells that is needed to simulate such complex systems represents a serious limitation for the interpretation of the results, as the computed electronic bands are folded with respect to those obtained in the unit cell of the inorganic substrate [65,66]. This inhibits an immediate, visual identification of the electronic states of the constituents as well as of hybridization signatures.
A way to overcome these limitations is offered by band unfolding techniques, which enable mapping band structures computed within supercells in the Brillouin zone of the reference system simulated in its primitive cell [67][68][69][70][71][72][73]. These schemes have been successfully employed on top of first-principles calculations to decipher the electronic structure of several complex materials [74][75][76][77][78]. Adopting them in the context of hybrid inorganic/organic interfaces is not only useful to obtain a more accessible representation of the band structures of these systems. Most importantly, band unfolding methods can be used to identify and rationalize the signatures of electronic interactions between organic and inorganic constituents.
In this paper, we present a detailed first-principles study, based on hybrid DFT, of the electronic properties of hybrid inorganic/organic interfaces formed by two representative polycyclic aromatic hydrocarbons (PAHs) physisorbed on the four monolayer TMDCs, MoS 2 , MoSe 2 , WS 2 , and WSe 2 (see figures 1a-b). The considered molecules, pyrene and perylene, are of particular interest in the context of hybrid materials. The former is frequently adopted as an luminescent probe on several inorganic substrates [79,80]; the latter and its derivatives have been extensively used in hybrid interfaces [81][82][83] as they absorb visible radiation. In order to achieve a rigorous description of the band structures of the considered systems, spinorbit coupling (SOC) effects, which are known to be crucial in the above-mentioned TMDCs, are explicitly accounted for. By applying appropriate band unfolding techniques, developed in an in-house implemented post-processing script, we identify the electronic contributions of the individual constituents as well as the signatures of their hybridization. In this analysis, we discuss the role of the metal and of the chalcogen atoms in the electronic structure of the TMDCs, and highlight how the planar anisotropy of the physisorbed molecules affects their interactions with the substrate. We finally derive conditions for electronic hybridization between conjugated molecules and underlying TMDCs even when the former do not lie planar on the latter.

A. Theoretical Background
The electronic properties of the systems considered in this work are calculated using DFT [84] in the Kohn-Sham (KS) scheme [85]. This implies self-consistently solving the KS equation, for Bloch states |nk with band index n, wave vector k, and energy E n (k). In equation (1), m is the electron mass, p is the single-particle momentum operator, and V KS = V ext +V H +V xc is the KS potential, which contains the nuclear (pseudo)potential (V ext ), the Hartree potential (V H ), and the exchange-correlation term (V xc ), the exact form of which is not known and thus requires approximations. The solution of equation (1) also gives us access to the total energy of the system, E tot , which can be used to determine molecular adsorption energies, E ads . In the context of the investigated hybrid interfaces constituted by a TMDC monolayer 4 and a PAH, this is achieved by computing where E (PAH+TMDC) is evaluated for the hybrid system in its relaxed geometry, and E (PAH) and E (TMDC) are calculated in the same simulation cell as the hybrid system after the removal of the TMDC and of the PAH atoms, respectively, without performing any additional structural optimization.
The analysis of the electronic structure computed from DFT is conducted using unfolding techniques, i.e., by mapping Bloch-vector-dependent quantities defined in the supercell (SC) calculations into the primitive cell (PC). This transformation can be performed even if the PC translational invariance is broken in the SC, e.g., by a lattice impurity or an adsorbed molecule. Using the band energies E = E n (K), with band index n and SC wave vector K, we define the spectral function as [73] W with W n (k) = W n (K + G 0 ) = σ=↑,↓ g∈r where k is the wave vector in the Brillouin zone (BZ) of the PC, which is folded into K by means the SC reciprocal lattice vector G 0 , such that k = K + G 0 . The sum over g encompasses the PC reciprocal lattice (r), which is a subset of the SC one (R) (Fig. 1c).
The coefficients being thus selectively summed originate from the plane-wave representation of the spinor |nK associated with the band index n and the SC k-point K: where r|G = Ω −3/2 e iG·r is the plane wave with wave vector G normalized to the volume Ω of the SC, and |σ = | ↑ or | ↓ represent spin-up and spin-down channels, which are mixed through SOC. whereas low values, thinly spread throughout the BZ, indicate that the state must be represented by a superposition of many plane waves. This could be for example the case of localized defect states, but not necessarily of molecular orbitals. Indeed, some of them can come quite close to plane waves, as we will see in the following. Although not relevant for the purpose of the present analysis, we note in passing the analogy between mapping the molecular orbitals in the Brillouin zone of the substrate and analyzing exciton contributions in reciprocal-space [48,86].
The ionization potential of the building blocks of the hybrid system is of chief importance in the context of heterostructures, as it is a key factor determining the mutual level alignment of their components. While this is generally not an easily accessible quantity after imposing periodic boundary conditions, it can be estimated here as since in all considered cases there is at least one dimension, marked as z in equation (6), along which the system is non periodic. Inspection of the electrostatic potential (V ext + V H ) along z shows that it converges rapidly in the vacuum layer separating replicas, such that the the vacuum level, which is the asymptotic value in equation (6), can be well approximated by the electrostatic potential assumed halfway between periodic replicas along z.

B. Computational Details
DFT calculations are performed with the Quantum Espresso suite [87,88], version 6.7.
The wave-function cutoff is set to 30 Ry and 40 Ry for Mo-and W-containing systems, respectively, while the cutoff for the density is four times as high. The c parameter of the hexagonal lattice is set to 20Å, providing a sufficient amount of vacuum to decouple periodic replicas along the z direction. Geometries are optimized using the Perdew-Burke-Ernzerhof (PBE) approximation [89] for V xc , together with norm-conserving, scalar-relativistic SG15 pseudopotentials [90] for V ext ; the pairwise Tkatschenko-Scheffler (TS) scheme [91] is applied to account for van der Waals interactions. Band structures are computed with the Heyd-Scuseria-Ernzerhof (HSE06) functional [92] in conjunction with fully-relativistic versions of the SG15 pseudopotentials [93] including SOC. Band structures computed in the PC are interpolated with the Wannier90 code [94]; the character of the electronic states is determined by projection of the corresponding wave functions onto the Wannier functions [95] localized at the chalcogen atoms. In the SC, we instead add the k-points along the desired path with zero weight to the mesh of the self-consistent calculation, thus gaining direct access to the plane-wave representations of the wave functions needed for unfolding. In the PC calculations, we use a 8×8×1 k-mesh to sample the BZ and a 4×4×1 q-mesh for the evaluation of the Fock exchange; in the SC ones, corresponding 2×2×1 and 1×1×1 grids are employed, respectively. For molecule-only calculations in the SC of the hybrid system, we use PBE employing a 8×8×1 k-mesh in the non-self-consistent calculation, which allows us to map the orbitals to the whole BZ with satisfactory resolution. We deem PBE to be sufficient in this context, as we are mainly interested in the character of the orbitals, which is insensitive to the choice of the exchange-correlation potential.

A. Structural properties of the hybrid systems
The initial geometry of the hybrid systems is constructed by expanding the optimized PC of the considered TMDC monolayers into a 4×4 SC and placing the molecule above the transition-metal layer at a distance of 1.4 times the in-plane lattice constant of the TMDC. The long molecular axis is aligned with the long diagonal of the hexagonal SC, maximizing the distance between molecular replicas in neighboring cells (see figure 1a-b).
This symmetric alignment is maintained throughout the structural optimization. Monolayer TMDCs have D 3h symmetry; neighboring high-symmetry points K and K' as well as M and M' along the boundary of the hexagonal Brillouin zone are inequivalent. However, as a consequence of the 3-fold rotation axis and time-reversal symmetry, the energy dispersion along the Γ-M-K-Γ and Γ-M'-K'-Γ paths is identical, although there are differences in the spin structure [96,97]. For our purposes, we can neglect these details and represent the band structures of the TMDCs as if the systems had hexagonal D 6h symmetry in reciprocal space.
However, belonging to the D 2h point group, the adsorbed molecule lowers this symmetry, giving rise to different energy dispersions along the directions of the long and short molecular axis. We refer to corresponding k-paths as Γ-M-K-Γ (along long axis) and Γ-M * -K * -Γ (along short axis, see figure 1d). We emphasize that K * and M * do not coincide with the previously mentioned K' and M'. To gain a full understanding of the band structure of the hybrid system, both paths should be considered. In this work, we mainly focus on the dispersion along the Γ-M-K-Γ path, but dedicate Section III E to exploring and rationalizing the differences emerging for the two paths in two selected systems. Among the four TMDCs, the diselenides have larger lattice constants than the disulfides, as known from previous experimental and ab initio studies [98,99], while the transitionmetal species barely affect these values (see table I heterostructures [100,101]. In general, such distances are larger for the Se-containing hybrid systems than for the S-based ones, due to the larger size of the Se atoms. Inspecting the adsorption energies (table I) with the size of the PAHs. This is the first hint at the presence of interface-specific and non-trivial interactions between the organic and inorganic components, going beyond the common picture of a system held together purely by van der Waals forces.

B. Electronic properties
In the next step of our analysis, we investigate the electronic properties of the hybrid systems described in Section III A. To set the stage, we first inspect the band structures of the isolated monolayer TMDCs computed in their unit cells (Section III B 1) and then move on considering the effects of pyrene (Section III B 2) and perylene adsorption (Section III B 3)

Monolayer TMDCs
We first review the electronic band structure of the four considered TMDC monolayers, focusing in particular on the character of the electronic states (see figure 2). Computed band gaps and band dispersions are in good agreement with previous results obtained at analogous level of theory [102][103][104]. In all systems, the fundamental gap is direct and located at the high-symmetry point K, with the corresponding values ranging from 1.6 eV to 2.1 eV, inversely related to the atomic masses of the involved species (see table II). The ionization potential is strongly influenced by the chalcogen species, with diselenides having values 0.75 eV smaller than their disulfide relatives. Thus, the bands of the former are shifted up in energy, such that heterojunctions between disulfides and diselenides tend to exhibit a staggered (type-II) level alignment [102]. This shift is somewhat smaller for the conduction bands than for the valence bands, resulting in smaller bandgaps for the diselenides.
Apart from these overall shifts, the chalcogen species affects also the energetic dispersion. This is evident upon comparison of the valence bands, but to lesser extent also in the conduction bands, which appear vertically compressed in the diselenide monolayers. This results in a smaller conduction bandwidth, which we define as the difference of the highest and the lowest energies within the first manifold of conduction bands. The opposite is true when replacing Mo with W, which instead increases this value. Furthermore, the transition-metal species affects the size of the indirect gap between K and Q, as the local conduction band minimum at Q is subject to a stronger spin-orbit split in the W-containing monolayers compared to the Mo-based ones. We note that in WSe 2 , the local conduction band minimum (CBm) at Q is almost degenerate with the global CBm at K. In fact, experimental results [105,106] as well as many-body perturbation theory studies [104] suggest that WSe 2 actually has an indirect bandgap. Generally, however, the effects of SOC in the W monolayers are most evident around K, where corresponding fingerprints are easily identified in higher conduction bands upon comparison to the Mo-containing counterparts, while the CBm is always unaffected. The highest valence bands in the W-based monolayers are parted by up to 0.6 eV due to SOC and thus significantly more than in the Mo-containing TMDCs, which is the root of the smaller bandgaps of the W-based siblings. The character of the electronic bands is expected to play a key role in the interactions with adsorbed molecules, as it is correlated to the localization of the wave functions within the TMDC. Due to the structural characteristics of the monolayers, states dominated by atomic orbitals of the chalcogen species are situated on the two outer atomic layers of the TMDC, while those with transition-metal character lie inside and moreover are strongly confined to the nuclei, since they correspond to d electrons. Hence, wave functions localized on the chalcogen atoms will more likely interact with the orbitals of the adsorbed molecule due to the enhanced overlap. This also implies that the interaction between the organic and inorganic parts are unlikely to be ruled by strong correlations typically associated with d or f electrons. We note in passing that, likewise, phenomena like surface reconstruction, which lead to reduced coordination at the surface, can be excluded from this discussion: in TMDCs, they appear at very high temperatures [107] when, however, molecules would desorb from the substrate. Regarding the molecular part, only π orbitals can be expected to

Pyrene.
We start the analysis of the electronic structure of the hybrid systems considering pyrene (Py) as an adsorbant. Physisorbed on MoS 2 , this molecule gives rise to a heterostructure with type-II level alignment, with the highest-occupied molecular orbital (HOMO) being ∼0.5 eV above the valence band maximum (VBM) of the TMDC, and localized around the high-symmetry point K (see figure 3a). Conversely, the HOMO-1, found at approximately -1.7 eV, is evidently hybridized with the MoS 2 bands halfway between M and K. Considering even deeper molecular states, we find strong interactions with TMDC bands close to - When Py is adsorbed onto MoSe 2 (see figure 3b), the resulting hybrid system exhibits a type-I level alignment, which is a consequence of the decreased ionization potential in the MoSe 2 monolayer compared to its disulfide counterpart (table II) From the previous analysis, we can derive three conditions that have to be met for hybridization between molecular orbitals and TMDC bands to occur: 1. the energies of the PAH orbital and the TMDC wave function have to be similar; 2. the nodal structure of the PAH orbital has to be compatible with the plane-wave part of the TMDC wave function: this corresponds to their k-points matching in the unfolded band structure; 3. the TMDC wave function must have a significant chalcogen contribution, and the PAH orbital must have π character with a locally high spectral weight.
Point (i) can be rationalized in terms of perturbation theory, which shows that the degree of mixture of states caused by their interaction is inversely proportional to their energy difference. Regarding points (ii) and (iii), we note that chemical interactions occur if there is wave function overlap [108], which, in turn, requires significant probability density overlap, as well as non-orthogonality between ψ PAH and ψ TMDC . This implies an at least partial match in the phases of these wave functions, thus enabling constructive interference. These two requirements are met when points (iii) and (ii) are fulfilled, respectively.
As a consequence of the proposed conditions, no strong hybridization can be expected to occur at the VBM and at the CBm of the TMDC, as the corresponding bands bear mainly transition-metal character (see figure 2). Yet, the remaining chalcogen fraction might lead to weak mixing when the weight of the molecular orbital is strongly concentrated at K, as is the case for Pe@WSe 2 (see figure 4d). As far as we can extrapolate from the examined Also the HOMO and HOMO-2 appear quite different along the two paths. For Pe, such anisotropies seem less pronounced, as there is strong mixing along both paths (figure 5b). However, it is clear that there are substantial differences in the details: Along the Γ-M-K-Γ path, a molecular orbital induces a clean split of the highest valence band (left panel), whereas along the Γ-M * -K * -Γ path, the band structure is strongly distorted slightly higher in energy. In any case, the perturbing influence of the orbitals on the TMDC bands is remarkable. This is partly a consequence of the non-negligible chalcogen character of the highest valence band close to M. However, it also comes down to the specific electronic structure of the molecular orbitals at play, as analyzed in Section III F below.

F. Molecular orbitals
In order to rationalize how the character of the molecular orbitals affects their hybridization with TMDC bands, we map the orbitals of the isolated molecules into the TMDC unit cells. To do so, the molecule is placed in a hexagonal unit cell with a lattice constant that is a multiple of the lattice constant of the primitive cell of a TMDC. In the following, we consider Py and Pe in a 4×4 MoS 2 SC. The orbital energies do not change across the BZ as there are negligible interactions between the replicas. Thus, we can consider the statespecific W n (k) instead of the whole W (k, E) (compare equations 3 and 4), with n being the orbital index, and plot W n (k) as a function of k x and k y across the whole BZ. The spectral weight of all considered orbitals for Py, ranging from the HOMO-2 to the LUMO+2 (see figure 6), is localized at the zone edges. The orbitals can be grouped into two main classes. The first one, encompassing the HOMO-2, the LUMO, and the LUMO+2, is characterized by states that are fairly delocalized over the edge of the BZ, such that spectral weights are rather low (maximum ∼20%). This implies that either the orbitals lack any plane-wave character, or the orientation of the molecule prohibits a stronger k-localization.
The latter possibility will be addressed towards the end of this section. The other three orbitals, i.e., the HOMO-1, the HOMO, and the LUMO+1, show instead distinct areas of weight accumulation. Among them, we highlight the HOMO-1 and LUMO+1, whose We note that the distribution of spectral weight in the BZ is not an intrinsic property of the orbital, but also depends on the underlying substrate through its lattice constant as well as via the orientation of the molecule with respect to it. Contrary to a normal, continuous

20
While cofacial arrangements of planar molecules and TMDCs cover the most common and intuitive cases, some molecules and particularly organic crystalline thin films have been observed to be adsorbed on-edge or in a herringbone manner [109][110][111][112][113]. In order to get an idea to which extent the spectral weights are influenced by a variation of the polar angle, we consider the HOMO of perylene. When the molecule is rotated around its long axis, the orbital spectral weight is redistributed from K to M (figure 8b). A qualitatively similar observation has been made for the HOMO of acenes [114,115], which have a similar nodal structure, suggesting that this is a general feature of PAHs with D 2h symmetry. Although this particular orbital does not interact with the TMDC in either orientation, this weight localization shows that also the polar angle can determine which PAH and TMDC states interact with each other. In turn, this means that the distribution of spectral weights in k-space can be viewed as a determining factor for the polar orientation of the molecule with respect to the substrate. Other aspects to consider in this context include the interfacial area, which is responsible for the wave function overlap and decreases when departing from the cofacial arrangement, as well as competing intermolecular interactions in the cases of dense coverage or crystalline thin films.

IV. SUMMARY AND CONCLUSIONS
In summary, we have investigated the electronic properties of eight hybrid inorganic/organic interfaces formed by two representative PAHs (pyrene and perylene) physisorbed on four TMDC monolayers (MoS 2 , MoSe 2 , WS 2 , and WSe 2 ), focusing on the conditions that induce electronic hybridization between the constituents. To this end, we have employed a state-ofthe-art first-principles methodology based on hybrid DFT explicitly accounting for spin-orbit coupling. The latter effects play a role especially in the presence of W-based TMDCs, where the induced splittings are as large as a few hundreds meV. To gain quantitative insight into the computed band structures, we developed and applied a band unfolding technique to map the results obtained for the hybrid interfaces into the unit cell of the TMDC monolayers.
In this way, we could estimate the level alignment between the organic and inorganic components and identify interaction patterns between the electronic states within the hybrid systems. A staggered lineup is found in all the considered interfaces including the disulfide monolayers, while WSe 2 forms a straddling heterostructure with both pyrene and perylene; in the borderline case of MoSe 2 , the valence band maximum is found slightly above (below) the HOMO of pyrene (perylene), giving rise to a type-I (type-II) level alignment. Although the relative energies of the electronic levels of the constituents naturally play a role in the electronic properties of the heterostructure, the hybridization conditions identified in the examined systems are general. Specifically, we found that hybridization occurs primarily close to M, involving the highest valence band of the former and non-frontier occupied orbitals of the latter. The derived conditions are general to this class of materials: they are based not only on symmetry arguments and on the energetic proximity between the involved states, but also on the spatial distribution of the TMDC bands and of the molecular orbitals in the unit cell of the substrate. By inspecting the distribution of the molecular orbitals across the Brillouin zone, we could explain the interaction mechanisms with the underlying TMDCs. We extended this analysis also to molecules which are rotated with respect to the equilibrium adsorption geometry, finding a sensitive dependence of the weights on both azimuthal and polar rotation angles.
By unraveling the microscopic mechanisms leading to hybridization in inorganic/organic interfaces, our study provides the keys to understand the electronic structure of these materials and to predict their optical response in the linear regime. In particular, the analysis of the k-projected molecular orbitals represents a computationally efficient and yet reliable tool to predict whether electronic hybridization is likely to occur with the sole knowledge of the molecular orbital wave function and the crystal structure of the substrate, thereby avoiding, at least for a first screening, costly simulations of the whole interface. As such, this approach is complementary to a recent development of an effective treatment of substrate-induced electrostatic interactions [116]. Possible case-studies for further investigations include functionalized counterparts of the considered PAHs, larger carbon-conjugated molecules absorbing visible light, as well as organic thin films.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available at the following DOI: 10.5281/zenodo.5362454 (record number 5362454).