Modifying the Power and Performance of 2-Dimensional MoS2 Field Effect Transistors

Over the past 60 years, the semiconductor industry has been the core driver of information technology, contributing to the birth of integrated circuits, the Internet, artificial intelligence, and the Internet of Things. Semiconductor technology has evolved in structure and material, with co-optimization of performance–power–area–cost, up to the state-of-the-art sub-5-nm node. Two-dimensional (2D) semiconductors are recognized by industry and academia as a promising solution to the limits imposed by quantum confinement at future technology nodes. Over the past 10 years, the key issues of 2D semiconductors regarding material, processing, and integration have been overcome one after another, bringing 2D semiconductors to the verge of application. In this paper, the evolution of transistors is reviewed, outlining the potential of 2D semiconductors as a technological option beyond scaled metal oxide semiconductor field-effect transistors. We mainly focus on the optimization strategies for mobility (μ), equivalent oxide thickness (EOT), and contact resistance (R_C), which enable high ON current (I_on) at reduced driving voltage (V_dd). Finally, we give an outlook on the semiconductor technology roadmap by summarizing the technological development of 2D semiconductors over the past decade.


Introduction
In 1965, Gordon Moore, the cofounder of Intel, extrapolated from the historical trend that the number of transistors in a dense integrated circuit (IC) doubles about every 2 years, a projection that has become an enduring economic driver for the exponential growth of the semiconductor industry. It is important to note that Moore's Law [1] is not a law of nature, but a guideline that the semiconductor industry uses as a goal. In a sense, this law will fail one day as scaling approaches atomic dimensions. For the first 50 years of Moore's Law, progress came from miniaturizing the device, with improved performance and power consumption following naturally; this is the so-called "happy scaling era". After that, the benefits of scaling down were no longer enough to meet the demand for performance improvement, and the industry turned to more aggressive interconnect, mobility, and dielectric engineering, and even abandoned the bulk/planar structure, to improve performance and reduce power consumption. The characteristic scaling length of a field-effect transistor (FET) is $\lambda = \sqrt{\mathrm{EOT}\cdot\varepsilon_s\cdot t_s/\varepsilon_{ox}}$, where EOT, ε_ox, ε_s, and t_s are the equivalent oxide thickness of the gate oxide, the dielectric constant of the gate oxide, the dielectric constant of the channel material, and the thickness of the channel material, respectively. Therefore, transistors demand larger gate capacitances (higher dielectric constant and thinner oxide) and thinner channels to maintain ideal switching characteristics at sufficiently small dimensions. Figure 1 reviews the development of semiconductor technology since the 1990s. The early device characteristics roughly follow the Dennard scaling principle; that is, the channel length and the driving voltage V_dd of each new device generation are reduced by 30% to keep the electric field constant, which, in turn, leads to a 30% lower circuit delay τ.
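As a rough numerical illustration of the scaling-length expression, the sketch below evaluates λ = sqrt(EOT · ε_s · t_s / ε_ox) for two cases. The function name and all input values are assumptions chosen for illustration (ε_ox is taken as 3.9, the SiO2 value, and ε_s = 4 for monolayer MoS2 is only a placeholder); only the 0.7-nm monolayer thickness is quoted in this review.

```python
import math

def scaling_length_nm(eot_nm, eps_s, t_s_nm, eps_ox=3.9):
    """Characteristic scaling length sqrt(EOT * eps_s * t_s / eps_ox), in nm."""
    return math.sqrt(eot_nm * eps_s * t_s_nm / eps_ox)

# Illustrative inputs (assumptions, not measured data)
si_body = scaling_length_nm(eot_nm=1.0, eps_s=11.7, t_s_nm=5.0)
mos2_monolayer = scaling_length_nm(eot_nm=1.0, eps_s=4.0, t_s_nm=0.7)

print(f"5-nm Si body:   lambda = {si_body:.2f} nm")
print(f"MoS2 monolayer: lambda = {mos2_monolayer:.2f} nm")
```

Under these assumed inputs, the thinner channel gives a markedly shorter scaling length at the same EOT, which is the argument for atomically thin channels.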
Crucially, after the 130-nm node, Dennard scaling [2] broke down, prompting a surge of new technologies beyond simple scaling. Before the 65-nm node, aluminum was replaced by copper for interconnects, resulting in lower series resistance [3]. In addition, SiGe increased silicon mobility by 35% through lattice strain [4,5]. During this period, the reduction of power consumption was just a windfall of the reduced V_dd that comes with shrinking channel dimensions. At the following 45-nm node, the severe short-channel effect caused leakage current and degradation of the subthreshold swing (SS), given by $SS = \ln 10\cdot\frac{k_B T}{e}\left(1+\frac{C_s + e^2 D_{it}}{C_{ox}}\right)$, where k_B, T, e, C_s, C_ox, and D_it are the Boltzmann constant, the absolute temperature, the elementary charge, the areal channel capacitance, the areal gate oxide capacitance, and the density of interface traps, respectively. In fact, the industry has employed oxynitride instead of pure SiO2 as the gate dielectric since the 1990s to raise the dielectric constant and to offer other advantages against dopant diffusion. By the 65-nm node, the thickness of the oxynitride had been reduced to 1.2 nm [6], which approaches the quantum tunneling limit, and the next-generation devices (45-nm node) had to introduce a gate dielectric with a higher dielectric constant (high-κ metal gate, abbreviated as HKMG) to enhance gate modulation while suppressing tunneling current [7]. An even bigger technological leap occurred at 22 nm; since then, semiconductor devices have embraced the biggest structural revolution since the introduction of complementary metal oxide semiconductor technology, with Fin-FET and ultrathin body silicon on insulator (UTB-SOI) replacing planar devices and bulk silicon technologies, respectively [8]. For more advanced nodes, SOI requires a wafer-scale fully depleted nanosheet with a thickness below 3 nm, which is a huge challenge for present technology. As shown in the inset of Fig. 1, Fin-FET has evolved through six generations since its commercialization, up to the state-of-the-art 3-nm node.
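To make the role of C_ox and D_it in the subthreshold swing concrete, the following sketch evaluates the commonly used expression SS = ln(10) · (k_B T / e) · (1 + (C_s + C_it) / C_ox), with C_it = e² D_it. It is a minimal illustration, not the authors' calculation: the function name, the 1-nm EOT, the neglect of C_s, and the trap density of 10¹² cm⁻² eV⁻¹ are all assumptions.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
E = 1.602176634e-19       # elementary charge, C
EPS0 = 8.8541878128e-14   # vacuum permittivity, F/cm

def subthreshold_swing_mv(temp_k, c_s, c_ox, d_it):
    """SS in mV/dec; capacitances in F/cm^2, d_it in cm^-2 eV^-1."""
    # e^2 * D_it with D_it quoted per eV reduces numerically to E * d_it (F/cm^2)
    c_it = E * d_it
    thermal_mv = 1e3 * math.log(10) * K_B * temp_k / E
    return thermal_mv * (1.0 + (c_s + c_it) / c_ox)

# Assumed gate stack: EOT of 1 nm (1e-7 cm), SiO2-equivalent permittivity 3.9
c_ox = 3.9 * EPS0 / 1e-7
ideal = subthreshold_swing_mv(300.0, 0.0, c_ox, 0.0)
trapped = subthreshold_swing_mv(300.0, 0.0, c_ox, 1e12)

print(f"ideal SS at 300 K: {ideal:.1f} mV/dec")
print(f"with D_it = 1e12:  {trapped:.1f} mV/dec")
```

The first value is the familiar room-temperature thermionic limit of about 60 mV/dec; interface traps or a smaller C_ox push SS above it.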
It is worth noting that, to obtain more effective gate control, the fin thickness has been continually decreased while the fin height has been increased to compensate for the loss of current. For the 3-nm node, the aspect ratio exceeds 12:1 (64 nm vs. 5 nm) [9], while a smooth interface must be maintained. Manufacturing such an aspect ratio is extremely challenging in both cost and processing; therefore, the gate-all-around (GAA) FET has been proposed to enhance gate control, thanks to its more comprehensive gate coverage and thinner channel compared with Fin-FETs. Remarkable performance and SS improvements have been achieved by the vertically stacked multibridge channel (MBC) FET [8].
In this paper, we review the development of transition metal dichalcogenide (TMD) logic devices, especially the key technologies for achieving high-performance, low-power transistors through systematic optimization of mobility, dielectric, and contact. First, we describe the characteristics of ideal 2D metal oxide semiconductor field-effect transistors. Next, we start from the key device metrics and summarize the most important advances published in the literature. Finally, we discuss the best device performance data reported to date and look ahead to the requirements for future industrial applications.

The Ideal 2D Transistor
Benefiting from the van der Waals (vdW) stacking of 2D semiconductors, MBC-FET and complementary FET (CFET) structures are more feasible to realize than with bulk materials, and superior device characteristics can thus be obtained. Figure 2A schematically illustrates a typical 2D MBC-FET, consisting of multiple vertically stacked 2D semiconductor channels. Connecting multiple channels in parallel provides a larger drive current while saving footprint on the wafer. The total resistance arises mainly from the contact resistance (R_C) and the channel resistance (R_ch). The latter is determined by L, W, N, C_ox, μ, V_gs, and V_ds, which are the effective channel length and width, the number of stacked layers, the areal capacitance of the gate oxide, the mobility, the gate voltage, and the drain voltage, respectively. Therefore, for a specific node, lower R_C, higher mobility, and greater areal capacitance are critical for better drive capability. Figure 2B and C plot representative transfer and output curves, with some key metrics marked. Note that a small SS and a low off-state current are essential for low power consumption (dynamic and static). On the other hand, a large on-state current is also highly desired, as seen in both the output and transfer curves, which points to good contact and high mobility. Furthermore, a well-controlled threshold voltage (V_th), which can serve as an indicator of process variation, is especially important for large-scale integration. Next, we systematically discuss the principal performance metrics at the device level.
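One common way to write such a channel resistance, assuming a long-channel device in the linear region with N identical channels in parallel, is sketched below. This is the textbook square-law form, stated here as an assumption and not necessarily the exact expression intended by the authors; V_th is the threshold voltage.

```latex
R_{ch} \;=\; \frac{V_{ds}}{I_{ds}}
\;=\; \frac{L}{N\,W\,\mu\,C_{ox}\left(V_{gs}-V_{th}-\tfrac{1}{2}V_{ds}\right)}
```

In this form, R_ch falls in inverse proportion to N, μ, and C_ox, which matches the qualitative conclusion that higher mobility and greater areal capacitance improve drive capability.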

Scattering and Mobility Booster
Carrier mobility represents the drift velocity per unit electric field and is a key parameter for characterizing charge transport. Mobility can be expressed as $\mu = e\tau_m/m^*$, and is thus closely related to the mean free time (τ_m) and the effective mass (m*). For a specific semiconductor, scattering determines τ_m and thus its mobility. Compared with that in Si, charge transport in TMDs is more susceptible to scattering (intrinsic and extrinsic). According to Matthiessen's rule, in a TMD with multiple scattering mechanisms, the free carrier mobility can be evaluated by [33,34] $\frac{1}{\mu} = \frac{1}{\mu_{ph}} + \frac{1}{\mu_{SO}} + \frac{1}{\mu_{CI}} + \frac{1}{\mu_{DE}}$, where μ_ph, μ_SO, μ_CI, and μ_DE are the mobilities limited by the intrinsic phonons, the surface optical phonons (SOs), the Coulomb impurities (CIs), and the defects, respectively. Quantitatively determining the contributions of the various scattering mechanisms to the resistivity therefore poses an enormous challenge, which calls for combining experimental studies under precisely controlled parameters with numerical modeling to interpret how the mobility varies with scattering.
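Matthiessen's rule adds the scattering rates, so the inverse mobilities of independent mechanisms sum and the total mobility is always below the smallest component. The short sketch below illustrates this; the function name and the component values other than the 410 cm² V⁻¹ s⁻¹ phonon limit are hypothetical.

```python
def matthiessen(*mobilities):
    """Combine independent scattering-limited mobilities: 1/mu = sum(1/mu_i)."""
    return 1.0 / sum(1.0 / m for m in mobilities)

# Component mobilities in cm^2 V^-1 s^-1: phonon, SO, CI, defect.
# Only 410 (intrinsic phonon limit) is quoted in the text; the rest are made up.
mu_total = matthiessen(410.0, 600.0, 100.0, 300.0)
print(f"total mobility: {mu_total:.1f} cm^2/Vs")
```

With these illustrative numbers, the weakest channel (here the assumed CI-limited 100 cm² V⁻¹ s⁻¹) dominates the result, which is why suppressing the strongest extrinsic scattering source yields the largest gain.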
Taking MoS2 as an example, the μ–T⁻¹ relationship under different scattering mechanisms is qualitatively plotted in Fig. 3A. The highest, theoretical mobility of ~410 cm² V⁻¹ s⁻¹ for monolayer MoS2 at room temperature corresponds to transport dominated by intrinsic phonon scattering, which consists of longitudinal acoustic (Fig. 3A (I)) and optical (Fig. 3A (II)) phonons. Acoustic phonon scattering dominates at low temperatures below 100 K, and its scattering rate depends linearly on the temperature (T), so the intrinsic phonon-limited mobility varies as $\mu \propto T^{-1}$. Above 100 K, optical phonons become the major contributor, and the intrinsic phonon-limited mobility varies as $\mu \propto T^{-1.69}$ [35]. Apart from intrinsic electron-phonon scattering, remote interaction with the polar optical phonons at the dielectric surface (Fig. 3A (III)), named SO scattering, gives rise to an even stronger temperature dependence, in particular at high temperatures [36]. SO scattering scales with the ionicity of the chemical bonds in the dielectric, and the metal-oxygen (M-O) bonds of oxides with high dielectric constants are more ionic, which can lead to more severe SO scattering [37]. Unlike in traditional Si transistors, SO scattering can significantly degrade the mobility of 2D-FETs, especially for GAA devices with a high-κ dielectric on both sides [36]. Under the combined influence of intrinsic phonons and SOs, the room-temperature mobility of monolayer MoS2 FETs using a high-κ dielectric may be as low as 200 cm² V⁻¹ s⁻¹ [38,39]. Even so, the experimental mobility of monolayer MoS2 can be much lower than the theoretical value. The nature of charge transport in MoS2, however, remains poorly understood despite the extensive theoretical and experimental studies performed to date.
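The power-law temperature dependence in the optical-phonon regime can be sketched as follows. Anchoring the curve at 410 cm² V⁻¹ s⁻¹ at exactly 300 K is an assumption ("room temperature" in the text), as is the function name; the exponent 1.69 is the value quoted above and applies only above roughly 100 K.

```python
def phonon_limited_mobility(temp_k, mu_300=410.0, gamma=1.69):
    """Power-law scaling mu(T) = mu_300 * (T / 300)^(-gamma), intended for T > 100 K."""
    return mu_300 * (temp_k / 300.0) ** (-gamma)

for temp in (150.0, 200.0, 300.0, 400.0):
    print(f"T = {temp:5.0f} K -> mu ~ {phonon_limited_mobility(temp):7.1f} cm^2/Vs")
```

This covers only the intrinsic phonon term; SO, CI, and defect scattering would have to be combined with it through Matthiessen's rule to approach measured values.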
In the early days, the mobility of monolayer MoS2 was around 10 cm² V⁻¹ s⁻¹, and the μ–T⁻¹ relationship exhibited obvious insulating behavior [40–42] (Fig. 3A (VII)). Combining electrical transport measurements with atomic-resolution transmission electron microscopy, Qiu et al. found a large number of intrinsic sulfur vacancies in the MoS2 lattice. These vacancies act as localized states, giving rise to hopping transport at low carrier concentration (n_2D) [43]. Besides, a particularly important source of scattering is the CIs at the semiconductor/dielectric interface, which are believed to be the limiting factor in present MoS2 devices [44]. The intensity of CI scattering depends on the CI density. CI scattering is, in essence, an interaction with the electric field of the impurities, which can be screened. Accordingly, the screened potential [45,46] is the bare potential reduced by the screening of the substrate and the free electrons; ε_box and ε_0 are the permittivities of the substrate and vacuum, and ε_el(q) is the electronic part of the dielectric function, which depends on the carrier density n.
As n and ε_box increase, screening becomes stronger, reducing the scattering potential and thus increasing the CI-limited mobility.
To obtain higher mobility, numerous interface and defect engineering designs have been proposed for TMDs. Yu et al. [34] developed thiol chemistry to achieve in situ defect modification, which effectively passivates defects and traps and significantly increases mobility at room temperature. Furthermore, they introduced screening through a high-κ dielectric combined with a high carrier concentration to suppress CI scattering and realized a phonon-limited mobility of ~150 cm² V⁻¹ s⁻¹ at room temperature in monolayer MoS2 [33]. It was also demonstrated that, by sandwiching a monolayer MoS2 channel between boron nitride (BN) layers, CI scattering can be considerably suppressed, leading to high mobility at both room temperature and low temperatures [41,47,48]. Moreover, optimized chemical vapor deposition (CVD), especially single-crystal growth, is an effective and industry-compatible technique to reduce defects and increase mobility [29,30,32,…]. Currently, for most 2D semiconductors, CVD films exhibit properties that far exceed those of mechanically exfoliated flakes. Figure 3B summarizes the thickness-dependent mobility of Si and 2D semiconductors, including MoS2, MoSe2, WS2, and WSe2. Conventional Si undergoes repeated thinning and etching, and its mobility drops sharply once the SOI thickness falls below 3 nm. This will severely restrict device performance at the advanced nodes. By contrast, the thickness of monolayer TMDs is only 0.7 nm. Although the experimental mobility is far lower than the theoretical prediction, it still holds a distinct advantage over SOI, providing mobility an order of magnitude higher at similar thickness. In the future, an ultraclean, nondestructive manufacturing environment and careful design of device structures will be vital to the development of 2D-FETs.
The layer-by-layer growth in atomic layer deposition (ALD) relies on precursor adsorption and nucleation at interfacial dangling bonds. Therefore, a surface rich in dangling bonds is the key to a high-quality ALD dielectric. For 2D materials, however, the inherent absence of out-of-plane dangling bonds, which is responsible for vdW stacking, poses a huge obstacle to ALD-based dielectric integration. When a dielectric is deposited directly on 2D materials by ALD, the precursors preferentially nucleate at lattice defects and grow as islands, which degrades the interface (Fig. 4B). In fact, the ideal integration of dielectrics for 2D devices would be direct vdW stacking of a high-κ dielectric on the 2D material (Fig. 4C). However, such methods are neither compatible with industrial technology nor applicable to the top-gate architecture. Therefore, dielectric integration in manufacturing top-gate 2D-FETs must involve interfacial modification.
In the past decade, several strategies have been developed for the interfacial modification of 2D materials, especially 2D semiconductors (Fig. 4D): (I) surface treatment, (II) a self-oxidative metal interface layer, (III) a native oxide interface layer, and (IV) a vdW heterogeneous interface layer. Regardless of the method used, the effective EOT is the sum of the EOTs of the interface layer and the ALD oxide. Thus, the optimum interface layer should have the lowest possible EOT while maintaining an extremely low D_it.
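Because the interface layer and the ALD oxide act as capacitors in series, their EOT contributions add, with each layer contributing t · 3.9/κ. The sketch below illustrates the bookkeeping; the function name and the stack values (a 0.3-nm interface layer with κ = 3 under 2 nm of a κ = 20 oxide) are hypothetical and chosen only to show how a low-κ interface layer consumes EOT budget.

```python
K_SIO2 = 3.9  # dielectric constant of SiO2, the EOT reference

def eot_nm(layers):
    """Total EOT of series dielectric layers given as (thickness_nm, kappa) pairs."""
    return sum(t * K_SIO2 / k for t, k in layers)

# Hypothetical stack: thin low-kappa interface layer + ALD high-kappa oxide
stack = [(0.3, 3.0), (2.0, 20.0)]
print(f"total EOT = {eot_nm(stack):.2f} nm")
```

In this made-up example, the 0.3-nm interface layer costs as much EOT as the 2-nm high-κ film, which is why both the thickness and the dielectric constant of the interface layer matter.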
The purpose of surface treatment is to create uniform point defects and dangling bonds on the 2D surface, which assist the adsorption and nucleation of ALD precursors. Typical treatment methods include plasma [78–80], ozone [81,82], and electron beam irradiation [83,84]. Obviously, surface treatment sacrifices interface quality and sizably raises D_it. Furthermore, these interfacial defects add sources of CI scattering, resulting in irreversible degradation of mobility. Huang et al. [78] achieved ~1.5 nm of Al2O3 through water plasma treatment, and the corresponding D_it reached 2.1 × 10¹² cm⁻² eV⁻¹, which is more than 2 orders of magnitude higher than that at SiO2/Si [76].
Another widely used approach is to evaporate a thin layer of chemically active metal (such as aluminum [85–87], magnesium [85], or yttrium [85,88,89]). The self-oxidized product can serve as a dielectric or as an ALD interface layer. This methodology is versatile and also applies to other systems such as carbon nanotube [90] and organic transistors [91]. The resultant interface layer maintains a relatively high dielectric constant, but the evaporation process inevitably damages the channel [92]. The fatal drawback is that, to ensure interface uniformity and controllable leakage, the thickness of the self-oxidized layer usually exceeds a few nanometers, which makes it difficult to meet the EOT requirements of advanced nodes.
Combining the advantages of the aforementioned methods, a more advanced interface layer, the 2D native oxide, was introduced recently. Typical examples are HfS2 [93] and Bi2O2Se [94]. Lai et al. [93] fabricated a top-gate FET with a low D_it of 6 × 10¹¹ cm⁻² eV⁻¹ between the HfS2 channel and the converted HfO2 dielectric. Li et al. [94] achieved Bi2SeO5 dielectric layers with an EOT of 0.9 nm through layer-by-layer oxidation of the underlying 2D Bi2O2Se semiconductor. The corresponding leakage current was lower than 1 × 10⁻⁷ A·cm⁻² under an external field strength of 1 MV·cm⁻¹. Under precise control, this method is expected to meet the requirements for both interface quality and EOT. However, the problem is also obvious: only a few specific 2D semiconductors can form a dense native oxide, and the methodology is completely infeasible for the Mo-based and W-based TMDs that have the greatest application potential.
Hence, another strategy, the vdW heterogeneous interface layer, is friendlier to 2D materials owing to its ultrasmooth interface and damage-free vdW interaction. Wang et al. [95] used the organic molecule perylene tetracarboxylic acid (PTCA) to treat graphene and realized noncovalent functionalization, assisting the deposition of a high-quality ALD dielectric. Afterwards, Li et al. attained an organic crystal interface monolayer of only 0.3 nm by choosing a new organic molecule, perylenetetracarboxylic dianhydride (PTCDA), and optimizing the molecular deposition process. This enabled an EOT of 1 nm on 2D materials [96]. In further statistical studies, Yu et al. [97] showed that the PTCDA interface layer could enable high dielectric reliability comparable to that of SiO2/Si. The noncovalent organic crystal monolayer offers excellent interface quality, versatility, and reliability for 2D devices, but is restricted by its intrinsically low dielectric constant. Therefore, it is still very challenging to achieve a more aggressive sub-1-nm EOT with this technique. More recently, Liu et al. employed a 2D inorganic molecular crystal, Sb2O3, as an ALD interface layer. Benefiting from its high dielectric constant of 11.5 and an interfacial quality as good as that of PTCDA, inorganic molecular crystals show great potential as a new candidate for the interfacial layer [98].
In addition to traditional ALD oxides, 2D layered dielectrics have also been considered a unique solution owing to their vdW nature. The representative, first-generation layered dielectric is hexagonal BN (h-BN). Yet, its dielectric constant is only 4 [99]. Moreover, the serious leakage current [100] and the layer-by-layer transfer make h-BN almost impossible to apply to large-scale integration. A new generation of layered dielectrics has been reported in recent years. For example, layered calcium fluoride [101,102] and strontium titanate [103] permitted vdW integration, realizing a record EOT down to 0.6 nm with an excellent interface between the high-κ dielectric and the 2D semiconductor. However, such technologies have so far been demonstrated only for back-gate devices. Future development of top-gate integration will be essential for CFETs and MBC-FETs.
In principle, the EOT, gate leakage current (I_g), and D_it of a high-quality dielectric/semiconductor system must be strictly controlled to be lower than 1 nm, 10⁻² A·cm⁻² [104], and 10¹⁰ cm⁻² eV⁻¹ [96,105], respectively. The quality of the 2D channel and the gate dielectric layer, as well as the breakdown voltage, are also critical for top-gate or GAA 2D-FETs. Currently, vdW integration (as shown in Fig. 4D (IV)) is a promising route to the damage-free manufacturing of high-κ dielectrics on 2D semiconductors while satisfying the requirements for EOT, I_g, and D_it. Of course, the premise is to find a high-κ 2D dielectric material suitable for 2D semiconductors.

Ohmic Contact for 2D-FET
At the advanced nodes, the physical channel length is only about a dozen nanometers, and (quasi-)ballistic transport within the mean free path dominates [106]. Consequently, the influence of mobility on the on-state current of metal oxide semiconductor field-effect transistors is greatly attenuated. For ultrashort channels, R_C contributes most of the total resistance [107]. The Schottky barrier (SB), which is the energy difference between the valence (or conduction) band edge of the semiconductor and the work function (WF) of the contact metal, blocks carrier injection from the source into the semiconductor and is the most important origin of R_C. According to the Schottky-Mott limit, the SB height of a metal-semiconductor (M-S) contact can be manipulated by adjusting the metal's WF. In reality, however, the SB height is usually insensitive to the WF because the Fermi level is pinned by interface states, which is known as the Bardeen limit [108,109]. Fermi level pinning (FLP) influences charge injection from the metal into the semiconductor through 2 mechanisms: (a) metal-induced gap states (MIGS), which are ubiquitous in M-S contacts [110–113], and (b) surface states resulting from contact interface traps or defects. The FLP effect is present in almost all M-S junctions and has never been fully resolved. Yet, in commercial devices, this issue can be effectively circumvented by using degenerate doping to form an ultrathin tunneling barrier [114–116]. Owing to the atomic-scale thickness and the doping-free nature of 2D-FETs, their SB is dictated largely by the contact characteristics. Taking conventional evaporated metal on MoS2 as an example (see Fig. 5A and E for details), the intrinsic and extrinsic defects (the latter created by bombardment with high-energy metal atoms and clusters during metal deposition) introduce a large density of states, which, in turn, pins the Fermi level near the midgap states [92,117–120].
In this case, no matter how the metal's WF changes, the carriers always experience a similar injection barrier, manifesting almost the same transport behavior [121–126].
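The transition between the Schottky-Mott and Bardeen limits is often described with a pinning factor S that interpolates between the two: Φ_SB,n = S · (W_m − E_CNL) + (E_CNL − χ), where E_CNL is the charge neutrality level measured from vacuum. The sketch below uses this standard interpolation, which is brought in here for illustration and is not taken from the text; the function name and all energy values are hypothetical placeholders, not material data.

```python
def sb_height_n(work_function, affinity, cnl, pinning_factor):
    """n-type Schottky barrier height in eV.

    pinning_factor = 1 gives the Schottky-Mott limit (W_m - chi);
    pinning_factor = 0 gives the Bardeen limit (barrier fixed at CNL - chi).
    """
    return pinning_factor * (work_function - cnl) + (cnl - affinity)

CHI, CNL = 4.2, 4.4          # hypothetical electron affinity and CNL, eV
for s in (1.0, 0.1, 0.0):
    low = sb_height_n(4.1, CHI, CNL, s)    # hypothetical low-WF metal
    high = sb_height_n(5.1, CHI, CNL, s)   # hypothetical high-WF metal
    print(f"S = {s:.1f}: barrier spans {low:+.2f} to {high:+.2f} eV")
```

With S close to 0, a 1-eV change in metal WF barely moves the barrier, which corresponds to the WF-insensitive transport described above; a negative value simply indicates no barrier in this simplified picture.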
Therefore, to obtain a satisfactory R_C, 2 strategies can be used: (a) contact doping to form a metallic contact or a thin tunneling barrier and (b) MIGS suppression to minimize the SB height. The mainstream strategies for R_C optimization are summarized in Fig. 5I. Inherited from silicon technology, contact doping is an important approach to reducing the injection barrier, and thus R_C, for both n-type and p-type devices [127–134]. Yang et al. [129] reported that, by employing chloride molecular doping to reduce the SB width, the R_C of few-layer WS2 and MoS2 decreased to 0.7 kΩ·μm and 0.5 kΩ·μm, respectively. Gao et al. [130] achieved degenerate p-type doping and observed hole transport in the MoS2 channel for the first time, with the MoS2 treated by electronegative materials such as MoO3 or MoO2. Using metallic-phase 2D materials is another effective strategy [135–138]. The metallic 1T phase of MoS2 can be locally induced on the semiconducting 2H phase by alkali metal intercalation, decreasing R_C to 200 to 300 Ω·μm at zero gate voltage [135]. The metallic 1T phase of vdW materials can also be locally induced on the semiconducting 2H phase by strain [136], laser beam irradiation [137], and argon plasma bombardment [138]. Recently, Wu et al. attained an R_C as low as 250 Ω·μm through atomically ultraclean vdW vanadium diselenide contacts [139]. A schematic of the structure and the energy band diagram of phase-engineered contacts are provided in Fig. 5C and G, respectively. For 2D semiconductors, however, the metallic phase is thermodynamically metastable, which causes serious reliability and stability issues for potential electronic applications. The vdW interaction of 2D semiconductors is thus expected to fundamentally eliminate the FLP effect and achieve ohmic contacts, as predicted by the Schottky-Mott limit.
Yet, in early works, a tunneling interface layer was the mainstream solution for reducing the defects caused by evaporation. Researchers employed ALD-deposited oxide or transferred h-BN as an intercalated interface to form metal-insulator-semiconductor contacts, effectively reducing the SB height to around 35 meV (see Fig. 5B) [122,140–143]. In addition, a graphene interface layer offers better conductivity and a tunable Fermi level, opening more possibilities for new contact structures [144,145]. Nevertheless, the interface layer widens the vertical tunnel barrier (eΦ_T) between the metal and the 2D semiconductor (see Fig. 5F), raising R_C again. On this basis, Liu et al. proposed a laminated metal contact, with a weak-dipole polymer as the channel capping layer, to create a substantially disorder-free interface. The SB height was dictated by the metal's WF and was thus highly tunable, approaching the Schottky-Mott limit [92]. Other studies also showed that depositing indium and platinum at low temperature and with low diffusion energy can produce an ultraclean vdW interface, which led to R_C as low as 0.8 kΩ·μm and 3.3 kΩ·μm for n-type and p-type transistors, respectively [146,147]. Compared with transferred electrodes, this method is more industrially compatible. Recently, Shen et al. [110] reported a new approach to making ohmic contacts to monolayer MoS2 using semimetallic bismuth (Bi), in which MIGS were considerably suppressed and degenerate states were spontaneously generated in the TMD (Fig. 5D and H). As a result, they achieved zero SB height and a record-low R_C of 123 Ω·μm, which paved a new way toward ultralow R_C for 2D ohmic contacts. Following that, a new semimetal contact, antimony (Sb), was reported by Intel, TSMC, and others [148], showing contact characteristics similar to those of Bi but with better stability and reliability.
Fig. 5. (E to H) Energy band diagrams of different metal-2D semiconductor contacts, corresponding to the strategies of (A) to (D), respectively. There are 3 different injection mechanisms in (E) and (F): thermionic emission (I), thermionic field emission (II), and field emission (tunneling) (III). In (E), thermionic emission dominates. In (F), the tunneling current dominates. The yellow, dense short lines represent FLP. E_F, E_C, and E_V represent the Fermi level of the metal and the conduction and valence bands of the 2D semiconductor, respectively. (I) Benchmark R_C for monolayer 2D semiconductors with different contacts. The dashed lines exemplify the theoretical R_C at different thicknesses of the vdW gap (d). The green dashed line (d = 0 nm) denotes the quantum limit for R_C approached by mature semiconductor technologies. The R_C values of WSe2, MoS2, and WS2 are reproduced from Refs. [146,165], Refs. [110,121,125,129,133,134,147,148,166–173], and Refs. [129,147,174], respectively.
Finally, one can ask which factors are critical to R_C in a vdW system. Theoretically, the quantum limit of R_C can be expressed [149] in terms of W_m, n_s, h, g_v, d, and χ_2D, where W_m is the metal's WF, n_s is the 2D sheet carrier density, h is the Planck constant, g_v is the valley degeneracy (g_v = 2 for monolayer vdW materials), d is the effective tunneling gap, and χ_2D is the electron affinity of the 2D semiconductor. According to this expression (Eq. 3), the theoretical quantum limit of R_C is linearly dependent on the tunneling gap induced by the vdW gap. As depicted in Fig. 5I, the dashed lines that represent R_C at different vdW gaps (d = 0 nm, 0.1 nm, 0.2 nm, and 0.3 nm) all display a strong dependence on the channel's carrier concentration. This implies that, in addition to designing a perfect vdW interface with a matched WF, enhancing the coupling between the 2D semiconductor and the contact metal is crucial for reducing R_C further toward the quantum limit.

Driving Current Improvement of 2D Transistors
Having gone through the optimizations discussed above, we now focus on the most important performance parameter for IC applications, I_on, which can be approximated as I_on = V_ds/R_tot. Figure 6 shows the measured I_on as a function of the channel length (L_ch) for monolayer 2D-FETs reported in the literature, where I_on is extracted at V_ds = 1 V with maximum V_gs. The dashed line illustrates the prediction using R_tot = L_ch·R_sh + 2R_C, where the average sheet resistance of the channel is R_sh = (eμn_2D)⁻¹ ≈ 1,524 Ω·sq⁻¹ with n_2D = 10¹³ cm⁻², μ = 410 cm² V⁻¹ s⁻¹, V_ds = 0.75 V (IRDS 2021), and R_C = 100 Ω·μm. Apparently, I_on is significantly limited by the channel mobility for long-channel devices, but becomes strongly dependent on R_C for short-channel devices. As shown in Fig. 6, the theoretical prediction saturates for L_ch < 100 nm. In the past decade, the I_on of short-channel 2D-FETs has been successively improved through contact and material optimizations. To date, n-type MoS2-FETs have demonstrated I_on up to 1,135 μA/μm at L_ch = 35 nm for a monolayer [110] and 1,270 μA/μm at L_ch = 50 nm for a bilayer [67]. In addition, p-type 2D-FETs have also achieved several important breakthroughs recently. By modifying the vdW gap at the contact interface and employing platinum electrodes, the contact resistance of a p-type few-layer WSe2-FET has been reduced to 3.3 kΩ·μm [146]. Besides, with a 20-nm-long p-type bilayer WSe2-FET, an I_on of 1,720 μA/μm and a contact resistance of 250 Ω·μm have been achieved by employing a vdW-epitaxy VSe2 contact [139], showing performance comparable to that of the n-type counterparts. The red star symbols in Fig. 6 indicate the future requirements for high-performance and high-density devices at technology nodes below 7 nm.
The highest current density in an advanced process is about 952 μA/μm at L_ch = 18 nm [9], which means that the device performance of both p-type and n-type 2D semiconductors has to meet or exceed the performance requirements of IRDS. Table summarizes the representative works with detailed metrics of performance, processes, and optimization strategies.
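The simple resistance model behind the dashed line in Fig. 6 can be reproduced in a few lines. The sketch below is an illustration under the stated model only (I_on = V_ds / (L_ch · R_sh + 2 R_C), width-normalized), with function names that are assumptions; note that the quoted R_sh ≈ 1,524 Ω·sq⁻¹ is recovered with μ = 410 cm² V⁻¹ s⁻¹ at n_2D = 10¹³ cm⁻².

```python
E = 1.602176634e-19  # elementary charge, C

def sheet_resistance(mu_cm2, n2d_cm2):
    """R_sh = 1 / (e * mu * n_2D), in ohm per square."""
    return 1.0 / (E * mu_cm2 * n2d_cm2)

def on_current_ua_per_um(l_ch_um, r_sh, r_c_ohm_um, v_ds):
    """Width-normalized I_on in uA/um from R_tot = L_ch * R_sh + 2 * R_C."""
    r_tot = l_ch_um * r_sh + 2.0 * r_c_ohm_um   # ohm*um
    return 1e6 * v_ds / r_tot

r_sh = sheet_resistance(410.0, 1e13)
print(f"R_sh = {r_sh:.0f} ohm/sq")
for l_ch in (1.0, 0.1, 0.035, 0.018):
    i_on = on_current_ua_per_um(l_ch, r_sh, 100.0, 0.75)
    print(f"L_ch = {l_ch * 1e3:6.0f} nm -> I_on ~ {i_on:6.0f} uA/um")
```

As L_ch shrinks, the channel term becomes small compared with 2R_C, so the predicted current approaches V_ds/(2R_C); this is the saturation seen below about 100 nm and the reason contact resistance dominates short-channel devices.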

Summary and Prospect
To extend the relentless downscaling, the Fin-FET and nanosheet FET have been developed, and the Fork-sheet FET and CFET are envisaged (see the top panel of Fig. 7). As the device scale approaches the quantum limit, issues such as the quantum confinement effect arise. The advantages that silicon-based technology has gained through node evolution, such as energy efficiency and cost, will no longer hold. Thus, research on new semiconductors suited for industrial application is inevitable. 2D-FETs attract tremendous interest because their atomically thin channels mitigate short-channel effects and can extend "Moore's Law". In this review, we summarize the key technologies for achieving high-performance, low-power 2D-FETs through optimization of mobility, dielectric, and contact. Finally, we show a roadmap of 2D-FET technology in Fig. 7. Mobility is determined by both the material itself (intrinsic mobility) and impurity scattering. Wafer-scale growth of 2D single crystals is expected to break through the material limitations. Meanwhile, interface engineering is essential to reduce the number of defects and to suppress CI scattering. For dielectric engineering, the main aim is to construct a high-quality high-κ/2D semiconductor interface and thereby realize low EOT. The surface modification strategy or the choice of interfacial layer should take low $D_{it}$, low EOT, and low $I_g$ all into account. For contact engineering, the key is to reduce the height and width of the SB and to suppress MIGS, and thereby realize stable ohmic contacts with ultralow $R_C$. Controllable contact-localized doping and suitable contact (semi)metal selection are 2 major strategies. At present, the semimetal (Sb and Bi) contacts show great potential in this regard [110,148].
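To make the low-EOT aim of dielectric engineering concrete, the following minimal Python sketch computes the EOT of a series dielectric stack from the standard definition, in which each layer contributes its physical thickness scaled by 3.9/κ (3.9 being the relative permittivity of SiO2). The function name `eot_nm` and the layer thicknesses and permittivities are hypothetical values chosen for illustration, not data from this review.

```python
# Relative permittivity of SiO2, the reference dielectric in the EOT definition.
K_SIO2 = 3.9


def eot_nm(layers):
    """Return the equivalent oxide thickness (nm) of a series dielectric stack.

    layers: iterable of (physical_thickness_nm, relative_permittivity) pairs.
    Each layer contributes thickness * (3.9 / kappa); contributions add in series.
    """
    return sum(t * K_SIO2 / k for t, k in layers)


# Hypothetical stack (assumed values): a 1 nm interfacial layer with kappa = 5
# underneath a 3 nm high-kappa film with kappa = 20.
stack = [(1.0, 5.0), (3.0, 20.0)]
print(f"EOT = {eot_nm(stack):.3f} nm")
```

For this assumed stack, the thinner but lower-κ interfacial layer contributes more than half of the total EOT (about 1.37 nm in all), which illustrates why the interfacial layer has to be selected with EOT in mind and not only for interface quality.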
The following are several prospects of 2D transistors for future development. Although material-level progress is the earliest and most accessible, a major breakthrough is still urgently needed in stable growth methods and industry-compatible CVD reactors for producing high-quality 2D semiconductor single crystals over a large area. Owing to good electrostatic control, together with a smaller bandgap and higher mobility than monolayers, precise layer-controlled epitaxy of TMD single crystals (including bilayers and trilayers) with large-area homogeneity would further boost the performance of 2D TMD devices [67]. The synthesis of high-κ 2D dielectrics is another vital task that may easily be overlooked. In addition, present methods for preparing 2D materials mainly use solid-phase precursors as raw materials, whose sublimation and diffusion are complex and hard to control, which limits controllability and reproducibility. Thus, growth methods based on all-gaseous precursors deserve more exploration.
At the device level, as discussed before, the comprehensive optimization of mobility, dielectric, and contact should continue until the expected targets for 2D-FETs are satisfied, i.e., $\mu_e = 410$ cm$^2$ V$^{-1}$ s$^{-1}$, $\mu_h = 219$ cm$^2$ V$^{-1}$ s$^{-1}$, $R_C < 100$ Ω·μm at $n_{2D} = 10^{13}$ cm$^{-2}$, $D_{it} < 10^{10}$ cm$^{-2}$ eV$^{-1}$, EOT < 1 nm, and $I_g < 10^{-2}$ A·cm$^{-2}$, where $\mu_e$ ($\mu_h$) is the electron (hole) mobility of n-type (p-type) 2D transistors. Achieving these figures of merit would indicate the maturity of 2D-FET technology. From another point of view, the operating frequency of 2D transistors is also a recognized indicator and should be considered when systematically evaluating device performance. At present, the oscillation frequency of reported ring oscillators based on 2D transistors is still far below 1 GHz, which hardly meets the requirement for commercial ICs. This means that, despite the impressive progress made in the laboratory, more effort should be dedicated to device arrays and integration, not just to single transistors. Hence, more attention should be paid to uniformity, reliability, and yield. Moreover, the CFET is the mainstream development direction for 2D transistors. Although the performance of p-type 2D-FETs has seen great breakthroughs, their process compatibility with n-type devices remains to be solved.
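The device-level targets listed above can be treated as a simple checklist. The sketch below, in Python, compares a set of measured metrics against those targets and reports which ones are not yet met. The dictionary keys, the function name `unmet_targets`, and the example device values are assumptions introduced for illustration; only the numerical targets come from the text, and the mobility targets are interpreted here as lower bounds.

```python
# Targets quoted in the text. "min" means the measured value should be at least
# the target; "max" means it should be strictly below the target.
# Key names are illustrative, not an established convention.
TARGETS = {
    "mobility_e_cm2_per_Vs": ("min", 410.0),   # electron mobility, n-type
    "mobility_h_cm2_per_Vs": ("min", 219.0),   # hole mobility, p-type
    "Rc_ohm_um": ("max", 100.0),               # contact resistance at n_2D = 1e13 cm^-2
    "Dit_per_cm2_eV": ("max", 1e10),           # interface trap density
    "EOT_nm": ("max", 1.0),                    # equivalent oxide thickness
    "Ig_A_per_cm2": ("max", 1e-2),             # gate leakage current density
}


def unmet_targets(device):
    """Return the names of reported metrics that do not satisfy their target.

    Metrics absent from `device` are skipped, since an n-type device, for
    example, has no hole-mobility entry.
    """
    unmet = []
    for name, (kind, limit) in TARGETS.items():
        if name not in device:
            continue
        value = device[name]
        ok = value >= limit if kind == "min" else value < limit
        if not ok:
            unmet.append(name)
    return unmet


# Hypothetical n-type device (values assumed for illustration only).
device = {"mobility_e_cm2_per_Vs": 120.0, "Rc_ohm_um": 123.0, "EOT_nm": 0.9}
print(unmet_targets(device))
```

A device counts as meeting the targets only when the returned list is empty and every relevant metric has actually been reported, which is one reason complete metric tables matter when comparing published devices.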
At the application level, although 2D-FET technology is considered to be one of the most promising candidates in the post-Moore era, it is still implausible that silicon will be fully replaced by 2D semiconductors in the foreseeable future [156] because of the enormous investment required and the immaturity of the technology for commercializing 2D transistors. For sustained development, a specific "killer application" is necessary. As transferring vdW materials to arbitrary substrates is feasible under room-temperature conditions, monolithic 3D integration of 2D transistors with mainstream semiconductor technologies is a promising route, and could even substitute for the existing amorphous silicon or oxide thin-film transistor technologies. A recent demonstration of fully 3D monolithic, 1270-PPI active-matrix micro-LED displays driven by monolayer MoS$_2$ transistors offered a persuasive example [69]. Moreover, owing to the excellent flexibility, low integration temperature, and outstanding optical transparency of 2D semiconductors, flexible electronics are expected to become another essential application scenario for 2D transistors. In addition, many potential applications of 2D transistors in sensing, nonvolatile memory, and neuromorphic computing are expected to be achieved soon. Though 2D transistor technology is young, it is very appealing to bring 2D transistors into production by combining continuous process optimization and application exploration.

Data Availability
All data used in the analysis within this paper and other findings of this study are available from the corresponding author upon reasonable request.

Fig. 7. Schematic visualization of transistor development and 2D-FET technology roadmap. The transistor has been structurally innovated step by step, evolving in accordance with Moore's Law and bringing huge benefits to the semiconductor industry. To be brought into production, 2D-FETs must overcome a series of challenges from laboratory to manufacturing infrastructure, which are outlined in the diagram.