Energetic mapping of oxide traps in MoS2 field-effect transistors

The performance of MoS2 transistors is strongly affected by charge trapping in oxide traps with very broad distributions of time constants. These defects degrade the mobility and additionally lead to the hysteresis of the gate transfer characteristics, which presents a crucial performance and reliability issue for these new technologies. Here we perform a detailed study of the hysteresis in double-gated MoS2 FETs and show that this issue is nothing else than a combination of threshold voltage shifts resulting from positive and negative bias-temperature instabilities. While these instabilities are well known from silicon devices, they are even more important in 2D devices given the considerably larger defect densities. Most importantly, the magnitudes of these threshold voltage shifts depend strongly on the density and energetic alignment of the active oxide traps. Based on this, we introduce the incremental hysteresis sweep method which allows for an accurate mapping of these defects and extract their energy distributions from simulations. By applying our method to analyze the impact of oxide traps situated in the Al2O3 top gate of several devices, we confirm its versatility. Since all 2D devices investigated so far suffer from a similar hysteresis behavior, the incremental hysteresis sweep method provides a unique and powerful way for the detailed characterization of their defect bands.


Introduction
Molybdenum disulfide (MoS 2 ) is a next-generation semiconductor from a wide range of transition metal dichalcogenides which is now considered for applications in beyond-CMOS electronic devices. Owing to a direct electronic bandgap of around 2.6 eV in the single-layer limit [1,2], MoS 2 has attracted considerable attention for digital device applications. In particular, numerous successful attempts at fabricating MoS 2 field-effect transistors (FETs) have been undertaken recently [3][4][5][6][7][8][9][10]. However, these studies mostly deal with the analysis of the performance characteristics of these devices, such as mobilities and on/off current ratios, as well as the exploration of fabrication techniques allowing them to realize their theoretical performance potential predicted by simulations [11]. In addition, there is some understanding that MoS 2 FETs are suitable for circuit integration [12][13][14] and high-frequency applications [15]. However, one of the most important performance limitations is due to defects. Even though at the moment these defects severely reduce the potential for industrial integration of these new technologies, they have not yet received the attention they deserve.
The most critical component in terms of hysteresis and reliability is the insulator materials itself and in particular its interface to the semiconducting channel. Many of these materials and interfaces have already received a lot of attention in Si transistors [16] and other electronic devices, such as flash memory cells [17]. These studies have shown that every insulator studied up to now contains some preexisting defects [18] with widely distributed time constants [19], which can act as trapping sites for the charge carriers in the channel, and thus affect the device performance and reliability. In particular, the most obvious consequence of oxide traps is the ubiquitous charge trapping, which can lead to long-term drifts of the transfer characteristics during device operation or under the presence of gate bias stress. These issues are known as biastemper ature instabilities (BTI) and are a serious reliability concern in conventional Si technologies [20][21][22][23][24][25]. During the last decade BTI in Si FETs has been thoroughly characterized [22,24,[26][27][28]. In particular, charge trapping was identified to be a crucial contributor to the degradation [19,29]. At the same time, the reliability of next-generation 2D devices, which is known to be far below the standards of modern industrial FETs [30][31][32][33], is still poorly understood. As such, commercialization of these new technologies requires systematic reliability assessment studies, which would allow to understand and minimize charge trapping in 2D FETs.
Although the reliability of MoS 2 FETs and other 2D devices is not properly understood yet, the similarities to available data on Si technologies clearly suggest that charge trapping in oxide traps is at the heart of the problem [34][35][36][37][38][39]. As the dynamics appear to be very similar to Si FETs [40], we expect that they can be described using the models previously developed for Si technologies [25,29,41]. However, the most ubiquitous issue, which is typically observed in addition to partially recoverable BTI shifts [30,31,42], is the hysteresis of the gate transfer characteristics [34][35][36][37]. While in mature Si techologies the hysterisis is negligible, in MoS 2 devices this issue can still lead to considerable instabilities of the device characteristics. In our previous work [40] we have performed a combined study of the hysteresis and BTI in MoS 2 /SiO 2 FETs and speculated that both the hysteresis and BTI are due to the same type of oxide traps, with the faster ones leading to the hysteresis and slower ones to BTI. However, the oxide traps are known to be localized within certain defect bands which are unique for every insulator [18]. As such, information about the density and energetic alignment of the defects would allow for a considerable advancement in our general understanding of the performance and reliability of MoS 2 FETs.
Here we perform a detailed study of the hysteresis behavior in double-gated MoS 2 FETs and show that the observed instability of the threshold voltage is a consequence of what is typically referred to as positive and negative BTI (PBTI and NBTI, respectively). Based on this, we introduce the incremental hysteresis sweep method which allows us to perform an accurate mapping of oxide traps with different time constants. The main ingredients of this method are the experimental techique, which gives the density of active oxide traps at different gate voltages, and the technology computer aided design (TCAD) simulations which allow to link the applied gate voltage with the trap level alignment in a gate insulator. Thus, the results of our method are the energy distributions of the density of oxide traps with different time constants. In order to demonstrate the versatility of our approach, we apply the method to extract the distributions of oxide traps in the Al 2 O 3 top gate insulator of our MoS 2 FETs. Although we demonstrate that the density of oxide traps in Al 2 O 3 layers is relatively large, our results are instrumental for understanding the charge trapping dynamics in this high-k oxide, which is typically used either as a top gate insulator [7,32,43] or encapsulation layer [33,44] in many 2D devices.
Note that our method targets exactly those oxide traps which are responsible for the hysteresis and BTI in MoS 2 FETs. These traps are typically localized within certain defect bands which are a fundamental property of any insulator. This is in contrast to some other techniques based on capacitive measurements [45] or on the analysis of the subthreshold swing [46]. For example, the authors of [45] extract the density of band tail trapping states in MoS 2 FETs, which result from the impact of the defects in MoS 2 and interface states. At the same time, the method of [46] allows to extract the density of localized states in the semiconductor bandgap using the subthreshold swing model. However, this approach does not suggest any distinction between semiconductor defects and oxide traps. Furthermore, the analysis is done in a simplistic manner ignoring the impact of oxide traps, which is not suitable for 2D FETs.

Devices
Our devices are single-layer and bilayer double-gated MoS 2 FETs. After chemical vapor deposition (CVD) of the MoS 2 film, it was transferred onto a 280 nm thick oxide coated silicon wafer, which was used as a backgate dielectric. Following fabrication of source and drain contacts, the devices were coated with a 23.5 nm thick Al 2 O 3 top-gate dielectric using atomic layer deposition (ALD). The channel length L of the devices varies from 500 nm to 3 µm, while the channel width W is kept constant at 3 µm. Top gate electrodes were designed to be slightly smaller than L (350 nm to 2 µm) to avoid electrical break-through. A schematic layout and microscope image of the devices are shown in figures 1((a) and (b)). Electrical characterization shows that the devices exhibit a relatively high on/off current ratio, which can reach 10 6 -10 7 for single-layer devices in both top gate and back gate operation modes (figures 1((c) and (d)). Single-layer devices with = L 500 nm were selected for the detailed study.

Charge trapping in MoS 2 FETs: basics
Charge trapping in preexisting oxide traps is one of the major issues known to affect the reliability of Si technologies [20][21][22][23][24][25]. These defects are energetically localized within certain defect bands and present a fundamental property of every insulator [18]. During device operation, traps within several nanometers from the oxide/channel interface, known as border traps [47], can exchange charge with the channel by means of carrier capture or emission via tunneling process. These charge trapping events are well described by nonradiative multiphonon processes [25,48,49] and their dynamics depend on the capture and emission time constants τ c and τ e , respectively. The essential aspect of these processes is that the time constants are dominated by structural relaxation at the defect sites rather than the tunneling probabilities. Due to the amorphous nature of most oxides, these time constants are widely distributed for different defects and present the time which is required for each particular defect to capture or emit a carrier under favorable bias conditions. Note that the most important characteristic of a non-radiative multiphonon process is its strong temperature-and bias-dependence.
As has been found recently, next-generation 2D devices [32,33,50,51], in particular MoS 2 FETs [30, 31, 34-37, 40, 42], also suffer from the charge trapping at preexisting oxide defects. The dynamics of the underlying processes are surprisingly similar to Si technologies [32,33,40]. Depending on the microscopic structure of the defect, studied in detail for SiO 2 based on Si/ SiO 2 FETs, one usually speaks of electron or hole traps respectively. The charge transfer process itself is the same in both cases. The two processes only differ in the change of the charge of the trapping defect in the oxide. This defect either goes from positive to neutral (hole trap) or from neutral to negative (electron trap). Therefore, the difference between electron and hole traps is only visible in an offset of the transfer characteristic, as it only changes the balance of fixed charges. From the overall charge balance required for our TCAD simulations, we conclude that the defect band in Al 2 O 3 , which dominates the charge capture and emission processes causing the hysteresis, is an electron trapping band. In figure 2 we schematically illustrate the charge trapping in our devices operated in the top gate mode. In equilibrium, which corresponds to the flat-band voltage V fb , the defects localized below the Fermi level E F are negatively charged, while the ones above E F are neutral. At the same time, the threshold voltage of the device strongly depends on the concentration of charged defects and can be given as with q being the elementary charge, C tg the top gate oxide capacitance, N ot the concentration of charged defects, V th eq the equilibrium threshold voltage and N ot eq the concentration of charged defects. In particular, if < V V tg fb is applied, band-bending shifts most defects above E F . As such, charged defects, except those with very large emission time constants, can emit an electron into the channel and become neutralized (i.e. discharged). Thus N ot becomes smaller, which makes the threshold voltage V th of the device more negative. This issue is known as NBTI. Conversely, if > V V tg fb is applied, a considerable number of defects is below E F , which is close to the conduction band of MoS 2 . Thus, neutal defects, except those with very large capture time constants, can capture an electron from the channel and become charged. As a result, N ot becomes larger and V th is more positive, which is known as PBTI. The most obvious consequence of both issues on the performance of MoS 2 FETs is the hysteresis of the gate transfer characteristics [34][35][36][37]. Namely, V th measured using a + V sweep (from V tgmin to V tgmax ) is typically more negative than the one measured using a − V sweep (from V tgmax to V tgmin ), which is because N ot at V tgmin is smaller than at V tgmax . Furthermore, those oxide traps which are too slow to follow the hysteresis sweeps remain charged and can cause long-term NBTI and PBTI drifts which appear if a constant gate bias stress is applied for a considerable time and can be recovered if the device is returned back to the equilibrium (for more details see our previous work [40]). Below we will show that the charge trapping behavior can be used as an efficient instrument for accurate mapping of oxide traps in MoS 2 FETs.

Results and discussions
We performed all our measurements in complete darkness and in a vacuum ( × − 5 10 6 -10 −5 torr, = T 27 C). The latter was necessary to avoid the detrimental impact of the ambient [36]. The hysteresis was investigated by measuring the gate transfer characteristics at V d = 0.5 V using both + V and − V sweep directions. In order to capture the impact of oxide traps with widely distributed time constants, we varied the sweep rate / = S V t step step between 0.02 and 5000 V s −1 by adjusting the step voltage V step and the sampling time t step . While using smaller V step allowed us to access more oxide traps, a larger t step allowed to increase the amount of slower traps which are able to contribute to the hysteresis. Also, we mostly focused on the analysis of the hysteresis on the top gate transfer (I d -V tg ) characteristics and used different sweep ranges V tgmin to V tgmax . The hysteresis width ∆V H was measured around V th which was extracted using a constant current method at = I 10 d nA. In figure 3 we compare the I d -V tg characteristics measured using both sweep directions and different sweep rates for the V tg sweep range from −10 to −3 V. The threshold voltages measured using the + V sweep mode become more negative as S is decreased. As shown by the schematic band diagrams (figure 3(b)), this is because around V tgmin most defects are above the Fermi level E F , which allows their efficient discharging by means of electron emission. As a result, an NBTI degradation is observed, which is more pronounced for slower sweeps, i.e. larger stress times. Conversely, around V tgmax we are dealing with PBTI degradation, which is associated with charging of some defects. Thus, the threshold voltages measured using − V sweeps are more positive compared to their + V sweep counterparts, and a clockwise, i.e. PBTI-like, hysteresis is observed. Since the magnitude of PBTI degradation is strongly dependent on S, which determines the stress time, for slower sweeps the hysteresis becomes larger. However, if the sweep rate is as fast as 5000 V s −1 , we do not see any considerable hysteresis, while V th is more positive than even for = S 100 V s −1 . This means that the time constants of most oxide traps accessible within this narrow sweep range are larger than the corre sponding sweep time, which is around 1 ms. As such, we assume that the I d -V tg characteristic measured using = S 5000 V s −1 is weakly affected by charge trapping and take it as a reference curve (see more details in figure S1 in the supporting information (SI) (stacks. iop.org/TDM/4/025108/mmedia)). The latter allows us to split the total hysteresis width ∆V H into the threshold voltage shifts ∆ + V th and ∆ − V th obtained for the I d -V tg characteristics measured using the + V and − V sweep modes, respectively. As shown in figure 3(c), both shifts are of NBTI-like nature, while being larger for smaller measurement frequencies / = f Nt 1 step with N being the number of V tg steps of duration t step [40]. This is because for the narrow sweep range the major fraction of the total sweep time is spent at V tg corresponding to an NBTI bias condition, i.e. V tg is below the equilibrium voltage. Nevertheless, ∆ − V th is smaller than ∆ + V th due to PBTI degradation, which occurs more close to V tgmax and becomes stronger for smaller f. The latter leads to the observed hysteresis.
The results corresponding to the V tg sweep range from −10 to 14 V are shown in figure 4. While the I d -V tg curve measured using the + V sweep mode is shifted in an NBTI-like manner with respect to = S 5000 V s −1 curve, a larger V tgmax leads to a PBTI-like shift of the − V sweep characteristics. Thus, a considerable hysteresis is observed, which is a result of both NBTI and PBTI degradation. While the NBTI magnitude remains comparable to the case of narrow sweep ranges, the PBTI contribution is dramatically increased. The latter is because the larger V tgmax increases both the number of oxide traps which can be charged ( figure 4(b)) and the total time spent at a PBTI bias condition, when most of the defects are shifted below E F due to the band bending. Since both NBTI and PBTI shifts become larger for slower sweeps, the hysteresis width also increases. However, for very slow sweep frequencies ∆V H tends to saturate, which is consistent with the universal behavior reported in our previous work [40]. The results obtained on the back gate can be found in figure S2 in the SI.
The results above show that the number of oxide traps which are able to contribute to the hysteresis depends strongly on V tgmax , which determines the magnitude of the PBTI contribution and hence the total hysteresis width. In addition, the number of traps able to contribute strongly depends on the sweep rate, as the traps with capture/emission times larger than the sweep time will not be able to react. Thus, aiming to map these defects with their widely distributed time  The reason for this is a larger V tgmax , which leads to a larger fraction of sweep time spent at V tg corresponding to PBTI condition and also increases the number of accessible defects. (c) Owing to a dramatically increased PBTI contribution, the hysteresis width is considerably larger than it was for the narrower sweep range. constants and different energy levels, we employed the following experimental technique which presents the main ingredient of our incremental hysteresis sweep method. An elementary loop consists of measurements of the I d -V tg characteristics using both + V and − V sweep directions with a fixed sweep range and different V step and t step [40]. As shown in figure 5(a), the full measurement procedure consists of repeated loops of this kind for different sweep ranges using a fixed = − V 10 tgmin V and V tgmax varied from −3 to 14 V in 1 V steps. This allows us to obtain a set of ∆V H ( f ) characteristics which strongly depend on V tgmax and thus contain the information about the energy distribution of the density of charged oxide traps with different time constants (see figure S3 in the SI). Next we follow the approach of figures 3-4 and evaluate the PBTI and NBTI contributions into the total hysteresis widths by splitting the ∆ V H ( f ) characteristics into ∆ + V th ( f ) and ∆ − V th ( f ) parts. The results measured using = − V 10 tgmin V and V tgmax between −3 and 14 V for single-layer devices are shown in figure 5(b). Since we have used the same V tgmin for all these measurements, the ∆ + V th ( f ) characteristics, which are associated with the NBTI contribution into the total ∆V H , are nicely reproducible. Some negligible variations of ∆ + V th ( f ) originate from a slight drift of the device in between the measurement loops with different V tgmax . Conversely, the ∆ − V th ( f ) curves, which contain the fingerprint of the PBTI contribution, follow an increase of V tgmax . The latter is because the defect band of Al 2 O 3 is bent by applying a top gate voltage, which shifts the traps below the Fermi level. Thus, an increase in the maximum of the sweep range by an interval ∆V tgmax bends the defect band downwards a bit stronger. In this way, traps which formerly have been situated above the Fermi level during the whole sweep can now contribute as well. In other words, if higher top gate voltages are applied, defects with a higher energy level can be accessed. At the same time, the high carrier density in the accumulation regime assures that the band bending affects first and foremost the Al 2 O 3 layer. As such, after polynomial smoothening of the obtained ∆ − V th ( f ) characteristics (see figure S4 in the SI) we calculate the concentration of oxide traps which come into play between V i tgmax and + V i tgmax 1 as (2) with C tg being the top gate oxide capacitance and q the elementary charge. In order to be able to contribute to the charge trapping processes, these traps should be able to capture the electrons which are tunneling from the MoS 2 channel through the top gate dielectric. As such, the active oxide traps should be situated not farther than a maximum distance ≈ d max 2.6 nm from the MoS 2 /Al 2 O 3 interface (see the detailed evaluation in the SI). Therefore, the oxide trap density within the device operation range can be estimated as , . 1.35 eV have been used for the electronic bandgap and the electron affinity of single-layer MoS 2 and ALDgrown Al 2 O 3 , respectively. Furthermore, the four-state non-radiative multiphonon (NMP) model known from Si technologies [25] has been implemented to describe the charge trapping by oxide traps which leads to the hysteresis. In figures 6(a) and (b) we show that both the shape of the I d -V tg characteristics and the hysteresis dynamics measured using different sweep rates and sweep ranges can be reasonably well matched by our TCAD simulations. Furthermore, simulations of a large number of I d -V tg characteristics with different sweep parameters allowed us to extract the hysteresis widths and reproduce the experimental set of the ∆V H ( f ) characteristics for different V tgmax ( figure 6(c)). This The band diagrams for the SiO 2 /MoS 2 /Al 2 O 3 system underlying the results of figure 6 are shown in figure 7(a). For SiO 2 we are using two distinct defect bands which have been identified in our previous works [33,54]. The upper defect band is located at eV below the SiO 2 conduction band edge [33], which almost exactly matches the value previously reported for Si technologies (∼2.6 eV) [55]. The lower defect band, also known from Si technologies [54], is at = ± E 4.56 0.35 l T eV below the SiO 2 conduction band edge. However, the charge trapping issues in MoS 2 n-FETs can be only due to the upper defect band in SiO 2 , which is located close to the conduction band of MoS 2 . As for the Al 2 O 3 , we found that there is one defect band at = ± E 2.55 0.3 T eV below the Al 2 O 3 conduction band. This value is also Figure 6. The I d -V tg characteristics measured using the sweep ranges from -10 to -3 V (a) and from -10 to 14 V (b) and different sweep rates can be reasonably well reproduced by our TCAD simulations. This allows us to obtain reasonable fits of the ∆V H ( f ) characteristics for different sweep ranges (c), thus validating our experimental approach. in reasonable agreement with the one previously identified for Si technologies (∼2.0 eV) [55], which further confirms that every insulator has its unique defect bands. The distributions of oxide traps for the defect bands in SiO 2 and Al 2 O 3 used in our TCAD simulations are shown in figure 7(b). While our mapping range from = − V 3 tg V to 14 V corresponds to accumulation, the Fermi level is pinned close to the conduction band of MoS 2 , which agrees with previous literature reports [56][57][58]. Thus, in agreement with our qualitative interpretation above, for more positive V tg the number of Al 2 O 3 defects in the active region, i.e. below the Fermi level E F , increases due to band-bending which changes the shape of the Al 2 O 3 defect band and brings more defects downwards. This leads to their efficient charging and, consequently, a larger hysteresis for wider sweep ranges.
The results of figures 7(a) and (b) allow us to recalculate the applied V tg into the trap level E T by considering the band-bending of the Al 2 O 3 defect band within ≈ d max 2.6 nm from the interface. In figure 7(c) we show the obtained ( ) E V T tg dependence with E T given in the units of electronic energy E used in the band diagrams. We notice that there is a saturation at larger V tg , which is an artifact related to the discrete defect bands used in our TCAD simulations (see figure S5 in the SI). Therefore, in order to obtain a more physical ( ) E V T tg dependence, we use a linear approximation. With this mapping we can convert our experimental ( ) D V ot tg dependences into the differential energy distributions ( ) D E ot . In figure 7(d) we show the ( ) D E ot curves obtained for the measurement frequency spaced logarithmically between 10 and × − 1 10 3 Hz, which corresponds to capture times / τ ∼ f 1 c from 10 −1 to 10 3 s. The typical defect densities determined for the Al 2 O 3 defect band are similar to those previously obtained for the same oxide in Si devices (∼10 20 cm −3 eV −1 ) [55]. At the same time, the ( ) D E ot distributions are broadly consistent with the simple Gaussian shape used in our TCAD simulations (see figure S6 in the SI), although additional peaks, likely associated with some imperfections of real devices, are present. Furthermore, the dependence of D ot versus the measurement frequency along the defect band is also consistent with the shift of the Gaussian peak which follows from our TCAD simulations ( figure S6). In particular, we can experimentally resolve three regions with different D ot ( f ) behavior. In the lower (L) region, i.e. below the D ot (E) peak, we observe a monotonous increase of D ot as f is decreased. This means that the capture times are distributed within the whole interval [10 −1 s ... 10 3 s], while the number of slower traps is larger ( figure 7(e)). In the middle (M) region, i.e. around the D ot (E) peak, the D ot ( f ) dependence is weak, which suggests that the fractions of slower and faster traps are comparable. Finally, in the upper (U) region D ot decreases for smaller f, i.e. faster traps dominate. Thus, at very slow sweeps most defects have enough time to become charged, which leads to some saturation in the magnitude of the PBTI contribution and, consequently, in the total hysteresis width. The latter is fully consistent with the results of our TCAD simulations and some experimental observations reported in the previous work [40], which suggest that the maximum of ∆V H can be reached at a comparably small measurement frequency.
Finally, we have performed a detailed verification of our method by comparing the ( ) D E ot distributions extracted from the simulated ( ) ∆V f H curves (figure 6(c)) with the total trap density which has been originally used in the TCAD simulator (see the details in figures S7-8 in the SI). While a reasonable agreement has been achieved, we found that the peak D ot values extracted using our approach are smaller than those for the input trap density (e.g. figure 7(b)). This observation confirms that our incremental hysteresis sweep method is sensitive exactly to those oxide traps which contribute to the hysteresis, while the number of traps which can be captured depends on the ratio between 1/f and the time constants. At the same time, the typical ∼ D ot 10 20 cm −3 eV −1 obtained for MoS 2

Conclusions
In summary, we have performed a detailed study of the hysteresis dynamics in double-gated MoS 2 FETs. We found that this issue is a consequence of device degradation due to positive and negative biastemperature instabilities, which are related to charging and discharging of oxide traps, respectively. Based on this finding, we have developed the incremental hysteresis sweep method which allows to perform an accurate mapping of oxide traps with widely distributed time constants. By using the experimental technique of our method and TCAD simulations to convert the applied top gate voltage into the electronic energy, we have extracted the differential energy distributions of oxide traps in single-layer MoS 2 FETs and confirmed the validity of our approach. Taking into account that the hysteresis appears to have the same origin in all new 2D technologies, we are confident that the reported method is universal.

Device fabrication
Single-layer MoS 2 was grown by chemical vapor deposition (CVD) on c-plane sapphire similarly to the method suggested in [62]. After cleaving the sapphire substrates and consecutive cleaning by ultrasonication in acetone and 2-propanol, they have been placed facedown over an alumina crucible containing ∼5 mg MoO 3 (99.998%, Alfa Aesar) and loaded into a quartz tube (diameter 20 mm) of a three-zone split-tube CVD furnace. A second alumina crucible containing 30 mg of sulfur (99.9%, Sigma-Aldrich) was loaded upstream of the substrate in a colder region of the furnace. The tube was flushed several times with ultra-high purity argon at room temperature. After heating up the furnace (ramp rate: 50 o C/min, Ar flow: 10 sscm, atmospheric pressure), the temperature was kept constant at 700 o C for 15 minutes to grow the MoS 2 film. Subsequently, the furnace was left to cool down to ambient temperature. The synthesized MoS 2 film was transferred from the sapphire substrate onto a 280 nm thick oxide coated silicon wafer for electrical characterization similarly to the method suggested in [63]. To perfrom the transfer, the MoS 2 film was spin-coated with a thin polystyrene film, which was lifted-off in water. Subsequently the carrier film was transferred onto the target wafer and dissolved in toluene. Transistor devices were fabricated by e-beam lithography and dry etching of the MoS 2 film in rectangular shape ( µ × 20 3 m 2 ) to create single devices. Source and drain contacts were defined by e-beam lithography, e-beam evaporation of Ti/Au (5 nm/40 nm) and lift-off. The channel length varied from 500 nm to 3 μm, while the channel width was kept constant at 3 μm. For electrical isolation of the top-gate, a thin layer (23.5 nm) of Al 2 O 3 was deposited on the whole wafer by atomic-layer deposition. Topgate electrodes were fabricated by e-beam lithography (gate length from 350 nm to 2 μm, depending on the channel length), e-beam evaporation of Ti/Au (5 nm/40 nm) and lift-off in acetone.

Experimental technique
All our measurements have been performed using a Keithley-2636A in a chamber of a Lakeshore vacuum probestation ( × − 5 10 6 -10 −5 torr). We measured the I d -V tg characteristics of our double-gated MoS 2 FETs in both sweep directions using step voltages V step in the range [1 V…0.01 V] and a sampling time t step varied between 0.2 ms and 500 ms. This allowed us to vary the sweep rate / = S V t step step between 0.02 and 5000 V s −1 . An elementary loop of our experimental technique consists of measurements using a fixed sweep range V tgmin to V tgmax and different V step and t step . By loops using a fixed = − V 10 tgmin V and V tgmax between −3 to 14 V in 1 V steps, we obtain a set of ∆V H ( f ) characteristics which contain the information about the density of charged oxide traps with different time constants. The measurement frequency is given as f = 1/(Nt step ) with N = 2((V gmax -V gmin )/V step + 1) being the number of voltage step points.

Modeling
The modeling has been done using the drift-diffusion based TCAD simulator Minimos-NT [64]. First a twodimensional model of the device cross-section was implemented using parameters taken from the literature [1,2,53] and validated against measured I d -V tg and I d -V bg characteristics. Then the modeling of oxide traps was performed based on our previously developed four-state non-radiative multiphonon (NMP) model [25]. This model has already been successfully applied to capture various aspects of charge trapping by oxide traps in Si technologies [54,65,66] and back-gated FETs with MoS 2 [40] and black phosphorus [33]. To simulate the hysteresis widths and offsets using the four-state NMP model, a set of microscopic defects was generated while assuming normally distributed model parameters. Finally, the model parameters were calibrated to the comprehensive experimental data set, which covers different sweep rates and sweep ranges corresponding to different time constants and energy level ranges of the traps involved.