An Assessment of the Statistical Distribution of Random Telegraph Noise Time Constants

As transistor sizes are downscaled, a single trapped charge has a larger impact on smaller devices and Random Telegraph Noise (RTN) becomes increasingly important. To optimize circuit design, one needs to assess the impact of RTN on the circuit, and this can only be accomplished with an accurate statistical model of RTN. Dynamic Monte Carlo modelling requires the statistical distribution functions of both the amplitude and the capture/emission time (CET) of traps. Early works focused on the amplitude distribution, and the experimental CET data were typically too limited to establish their statistical distribution reliably. In particular, the time window used has often been small, e.g. 10 sec or less, so that there are few data on slow traps. It is not known whether the CET distribution extracted from such a limited time window can be used to predict the RTN beyond the test time window. The objectives of this work are three-fold: to provide long term RTN data and use them to test the CET distributions proposed by early works; to propose a methodology for characterizing the CET distribution of a fabrication process efficiently; and, for the first time, to verify the long term prediction capability of a CET distribution beyond the time window used for its extraction.


I. INTRODUCTION
As the downscaling of transistor size continues, random telegraph noise (RTN) is becoming increasingly important [1]-[5], for three reasons. First, a single trapped charge has a larger impact on smaller devices. Second, the RTN-induced malfunction of a system is mainly caused by the devices in the tail of the amplitude statistical distribution, and more transistors per chip increase the number of devices in the tail. Third, low power requires a smaller overdrive voltage, (Vdd-Vth), so that there is less room to tolerate the RTN-induced jitter of the threshold voltage, Vth.
To take RTN into account when optimizing circuit design, substantial efforts have been made to model RTN [6]-[11]. For dynamic Monte Carlo modelling, one needs the statistical distributions of the number of traps per device, the amplitude of RTN per trap, and the capture/emission time (CET) of traps [3], [11], [12]. Early works [9], [13] focused on the amplitude distributions, and the CET distribution has rarely been reported based on test data [1], [14]-[17]. This is because it is difficult to obtain a sufficient amount of experimental CET data to establish a convincing statistical distribution.
The associate editor coordinating the review of this manuscript and approving it for publication was Qiangqiang Yuan.
The difficulties arise because, when CET is measured directly from the two discrete states of drain current, the device must have only one trap within the test time window [14]. This limits the number of CETs available. The Hidden Markov Model (HMM) [17], [18] has been used to extract trap properties. To analyze the RTN of multiple traps, the Factorial HMM (FHMM) was proposed, where the measured signal is assumed to be a superposition of a number of independent two-level RTNs, each from one trap and modeled by a Markov chain [19], [20]. Although this raises the number of traps analyzable from one device, it becomes increasingly difficult to apply as the number of traps in a device increases with the time window. Although it is generally believed that there is no clear upper limit for CETs [1], [3], [12], [21], the time window used in early works is often limited, e.g. 10 sec or less [14], [17], partially to control the number of active traps in one device and partially for test convenience. RTNs were measured for longer time windows [22]-[25], but statistical CET distributions were not established based on these test data.
Based on the limited data, two cumulative distribution functions (CDF) have been proposed for CET: Log-uniform [1], [3] and Log-normal [15], [16]. A Log-uniform distribution means that CET is statistically uniformly distributed against logarithmic time. As shown in Fig. 1, the two distributions are very different, especially if they are used to predict the long term RTN outside the time window used for their extraction. The Log-uniform CDF predicts that the number of active traps increases linearly against logarithmic time without saturation, while the Log-normal CDF predicts that there are fewer traps with long CETs and the CDF approaches saturation. As a result, long term RTN modelling cannot be trusted unless one has a trustworthy CET distribution. The objectives of this work are three-fold: to obtain long term RTN data experimentally and, based on them, to assess whether either of these two distributions, or any other, is correct for CETs; to propose a methodology for characterizing the CET distribution; and to address how accurately a distribution can make long term RTN predictions. As the practical time window for statistical tests is ∼1 day, it is important to assess how accurately these data can be used to predict the RTN years ahead.
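The contrast between the two CDFs can be illustrated numerically. The following is a minimal sketch only: the function names and all parameter values (one trap per decade, a total of six traps, a median of 0.1 sec and a spread of two decades) are illustrative assumptions, not values extracted in this work.

```python
import math

def log_uniform_count(t, t_min=1e-4, traps_per_decade=1.0):
    """Expected number of active traps at time t for a Log-uniform CET
    distribution: linear in log10(t), with no saturation.
    t_min and traps_per_decade are assumed, illustrative values."""
    if t <= t_min:
        return 0.0
    return traps_per_decade * math.log10(t / t_min)

def log_normal_count(t, total_traps=6.0, median_s=1e-1, sigma_decades=2.0):
    """Expected number of active traps at time t for a Log-normal CET
    distribution: saturates at total_traps for long times.
    All three parameters are assumed, illustrative values."""
    z = (math.log10(t) - math.log10(median_s)) / sigma_decades
    return total_traps * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# 3.15e8 sec is roughly 10 years.
for t in (1e-2, 1.0, 1e2, 1e4, 1e6, 3.15e8):
    print(f"t = {t:8.1e} s   log-uniform = {log_uniform_count(t):5.2f}   "
          f"log-normal = {log_normal_count(t):5.2f}")
```

With these assumed parameters the two curves are of similar size inside a short window but separate at long times, since only the Log-normal count levels off.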

A. METHODOLOGY
Early works used two approaches to obtain the statistical CET distribution: extracting CET directly [14]-[17] or inferring the CET distribution from indirect measurements [1], [12]. As mentioned earlier, the difficulties in extracting CET directly often led to inadequate data for establishing the CET distribution unambiguously [1], [14]. Based on the measured CETs, some researchers proposed a Log-normal CET distribution [15], [16].
The 1/f noise spectrum was used to infer the CET distribution [1]. It has been shown theoretically that a Log-uniform CET distribution will produce the commonly observed linear relation between power spectrum density and 1/f [1]. There are, however, two concerns with this inference. First, Kirton and Uren [1] showed that the 1/f spectrum is insensitive to the CET distribution and that different CET distributions can produce similar spectra. Second, the 1/f spectrum typically has a low frequency limit of ∼1 Hz, corresponding to an upper limit in the time domain of ∼1 sec. There is therefore a lack of data for the long term distribution.
A Log-uniform CET distribution is also inferred from the negative bias temperature instability (NBTI) tests [12]. It has been shown that the Vth grows linearly against logarithmic time within the first ∼1 sec [12]. Unfortunately, the charging kinetics starts deviating from this linear relationship [12], as new traps are generated [26], [27].
As the approaches adopted by early works did not give long term data for establishing the CET distribution, we will not follow them here. Instead, we carried out overnight RTN tests. Fig. 2 shows the result of an overnight noise measurement. Although the noise amplitude may appear insensitive to time when plotted on a linear scale in Fig. 2a, the plot against logarithmic time in Fig. 2b shows that the noise amplitude clearly increases at longer times. It is difficult to extract CETs from such data unambiguously. Instead, the increase of noise amplitude with time in Fig. 2b can be used to uncover the underlying CET distribution, and the methodology is given below.
For a time window of tw, traps with CETs less than or close to tw are covered by the measurement. An increase in tw brings slower traps into the measurement, leading to a higher cumulative RTN amplitude, as shown in Fig. 2b. The build-up of RTN amplitude with time can therefore be used to uncover the cumulative distribution function of CETs.
To illustrate this methodology, a case study is given in Fig. 3. Fig. 3a shows the combined simulation results of 5 traps, with their amplitudes and capture and emission times listed in Table 1. The envelope of the complex multi-level RTN, Env, is extracted from this combined signal. The RTN of each trap is given in Figs. 3b-3f. When the fastest trap makes a capture, it causes the first step-up of the envelope in Fig. 3a, as marked by '(1)' in Figs. 3a and 3b. As the amplitude of this trap is fixed, the envelope remains the same when this trap goes through subsequent RTN events. When the second fastest trap becomes active, it causes the second step-up of the envelope, as marked by '(2)' in Figs. 3a and 3c. As time increases further, slower traps progressively become active, resulting in more step-ups in the envelope, as marked by the corresponding numbers in Figs. 3a and 3d-3f. The evolution of the envelope with time in Fig. 3a therefore originates from a distribution of time constants of the underlying traps.
To support this methodology, dynamic Monte Carlo simulations were carried out. We assume that the CET is either Log-normal or Log-uniform distributed, as shown in Fig. 1, and 2,000 traps are then Poisson distributed into 400 devices. Each grey line represents the Env of one device, in Fig. 4a for the Log-uniform and in Fig. 4b for the Log-normal distribution. The black lines are the average results. Although the envelope of an individual device increases in steps, the average rises smoothly with time. A comparison with the CDF of CETs in Fig. 1 clearly shows that the average Env correctly uncovers the underlying cumulative distribution of CETs. We can therefore use the experimental Env of RTN to extract the CET distribution.
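The Monte Carlo argument above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the simulator used in this work: Env(t) is approximated as the summed amplitude of all traps that have captured at least once by time t, the amplitude per trap is assumed exponentially distributed with a mean of 1 mV, and the capture times are assumed Log-uniform between 10^-4 and 10^4 sec. Only the mean of 5 traps per device (2,000 traps in 400 devices) is taken from the text.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_device(rng, mean_traps, t_min, t_max, mean_amp_mv):
    """Return (first_capture_time, amplitude) pairs for one device.
    ASSUMPTIONS: Log-uniform capture time constants, exponential amplitudes."""
    traps = []
    for _ in range(poisson(rng, mean_traps)):
        tau_c = 10 ** rng.uniform(math.log10(t_min), math.log10(t_max))
        first_capture = rng.expovariate(1.0 / tau_c)  # mean dwell time = tau_c
        traps.append((first_capture, rng.expovariate(1.0 / mean_amp_mv)))
    return traps

def envelope(traps, t):
    """Simplified Env(t): summed amplitude of traps that have captured by t."""
    return sum(amp for first_capture, amp in traps if first_capture <= t)

rng = random.Random(0)
devices = [simulate_device(rng, 5.0, 1e-4, 1e4, 1.0) for _ in range(400)]
times = [10 ** e for e in range(-3, 4)]
avg_env = [sum(envelope(d, t) for d in devices) / len(devices) for t in times]
for t, env in zip(times, avg_env):
    print(f"t = {t:7.0e} s   average Env = {env:.2f} mV")
```

Under these assumptions each device envelope is a staircase, while the average over 400 devices rises close to linearly against log10(t), at roughly (5 traps / 8 decades) × 1 mV per decade.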

B. DEVICES AND MEASUREMENT
nMOSFETs with a channel length and width of 27 × 90 nm were used. The high-k/SiON stack has an equivalent oxide thickness of 1.2 nm and the gate is metal. Tests start by measuring a pulse Id∼Vg with Vd = 0.1 V and a pulse edge time of 3 µs. Vg is then stepped from zero to 0.5 V and Id is monitored against time under Vd = 0.1 V. The average threshold voltage of the nMOSFETs used here is 0.45 V and Vg is chosen to be Vth+0.05 V, as the requirement of low power is driving Vdd towards Vth and near-threshold computing suffers acutely from RTN [28]. The temperature is between 28 °C and 125 °C.
It has been reported that both as-grown traps and traps generated by stresses can induce RTN [29]-[31]. The generation process, however, follows a power law [32], which is different from the Log-uniform [1], [3] or Log-normal [15], [16] distributions of time constants for charging-discharging as-grown traps. This work focuses on the distribution of time constants for charging-discharging as-grown traps, and a low Vg = 0.5 V is chosen for the tests to minimize the interference from the trap generation process [33]. Moreover, metastable and anomalous RTNs have been reported [3], [4] and their effects are included in the experimental data.
The Id fluctuation, ΔId, is calculated from Id-Iref, where Iref is evaluated from the average Id between 1 and 10 µs. As Vg is close to Vth, ΔVth can be evaluated from −ΔId/gm [5], where gm is the transconductance obtained from the pulse Id∼Vg. The system noise is below ±1 mV.
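The conversion from drain current to threshold voltage shift can be written out as a short sketch. The function name, argument layout and the numerical values in the example are hypothetical; only the relations ΔId = Id − Iref, ΔVth = −ΔId/gm and the 1 to 10 µs reference window come from the text.

```python
def delta_vth_mv(times_s, id_a, gm_s, ref_window_s=(1e-6, 10e-6)):
    """Convert a drain-current trace (A) into a threshold-voltage shift (mV).

    Iref is the average Id inside ref_window_s, dId = Id - Iref,
    and dVth = -dId / gm, with gm in siemens."""
    t_lo, t_hi = ref_window_s
    ref = [i for t, i in zip(times_s, id_a) if t_lo <= t <= t_hi]
    i_ref = sum(ref) / len(ref)
    return [-(i - i_ref) / gm_s * 1e3 for i in id_a]

# Hypothetical trace: Id drops by 0.1 uA at 1 ms, as if a trap captured a carrier.
times_s = [1e-6, 5e-6, 10e-6, 1e-3, 1e-2]
id_a = [1.0e-6, 1.0e-6, 1.0e-6, 0.9e-6, 1.0e-6]
shift = delta_vth_mv(times_s, id_a, gm_s=1e-5)
print([round(v, 3) for v in shift])
```

A drop in Id maps to a positive ΔVth, which is the sign convention implied by the minus sign in −ΔId/gm.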
The extraction of the envelope from experimental data is illustrated in Fig. 5. The sampling rate used here is 1 MS/sec, where 'MS' is 'Mega-Sample points'. Although there are only a limited number of steps in the Env, this does not mean that a low sampling rate can be used to extract the Env. Fig. 6 plots the Env obtained from test data taken at different sampling rates. A slower sampling rate leads to a lower Env, as it fails to capture the fast traps [34].
With 1 MS/sec, the size of the dataset for one measurement is 10 MS for a time window of 10 sec. For a time window of 10^5 sec (∼1 day), the data size rises to 100 GS, which is beyond the memory depth of modern oscilloscopes. To overcome this difficulty, we used different rates for data sampling and data recording. As shown in Fig. 4, the number of steps in Env is limited and Env remains constant most of the time. Env can therefore be recorded at a much slower rate, although it is measured at 1 MS/sec.
In this work, we used two oscilloscopes to monitor Id. One of them has a time window of 10 sec and every data point is recorded. The other has a time window of 10^5 sec and monitors Env at 1 MS/sec, but the result is only recorded every 20 sec for the overnight test. The Env measured by this set-up is given in Fig. 7 for 51 different devices. Each grey line represents one individual device and the red line is their average. The Env curves measured by the two oscilloscopes join together smoothly.
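The difference between sampling slowly and recording slowly can be shown with a small sketch. It assumes, for illustration only, that the envelope is the running maximum minus the running minimum of the sampled signal; the function names and the ten-point trace are hypothetical.

```python
def samples_needed(window_s, rate_hz=1_000_000):
    """Number of sample points for a time window at the given sampling rate."""
    return round(window_s * rate_hz)

def record_envelope(samples, record_every):
    """Update the envelope at every sample but store only every
    record_every-th value. ASSUMPTION: envelope = running max - running min."""
    lo, hi, recorded = float("inf"), float("-inf"), []
    for n, value in enumerate(samples, start=1):
        lo, hi = min(lo, value), max(hi, value)
        if n % record_every == 0:
            recorded.append(hi - lo)
    return recorded

print(samples_needed(10))    # 10 sec window at 1 MS/sec
print(samples_needed(1e5))   # 10^5 sec window at 1 MS/sec
print(round(1e5 / 20))       # points stored when recording every 20 sec

# Hypothetical trace (mV): a fast 2 mV event at index 3, a slower 1 mV event later.
trace = [0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
slow_recording = record_envelope(trace, record_every=5)  # full-rate sampling
slow_sampling = record_envelope(trace[::2], record_every=1)  # half-rate sampling
print(slow_recording[-1], slow_sampling[-1])
```

In this toy trace the slow recording still reports the 2 mV envelope because every sample updates it, whereas halving the sampling rate misses the fast event and reports only 1 mV.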
The statistical tests require repeating the same test many times on different devices. For an overnight time window, the test becomes costly in terms of test time and it is desirable to minimize the number of devices under test (DUTs). For a time window of 10 sec, up to 402 DUTs were used and the average Env at 10 sec is plotted against the number of DUTs in Fig. 8. Initially, the average is sensitive to the number of DUTs, but it settles to within 2% when the number is over 50. We can therefore use 50 DUTs to extract the average Env for the overnight tests.
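One way to quantify how many DUTs are needed is to track the running average and find where it stays within a tolerance of its final value. The sketch below is an assumed procedure for illustration, with hypothetical function names and toy input values; it is not the analysis script used for Fig. 8.

```python
def running_mean(values):
    """Average of the first n values, for n = 1 .. len(values)."""
    total, means = 0.0, []
    for n, value in enumerate(values, start=1):
        total += value
        means.append(total / n)
    return means

def settle_count(values, tolerance=0.02):
    """Smallest n such that the running mean stays within `tolerance`
    (as a fraction of the final mean) for every count from n upwards."""
    means = running_mean(values)
    final = means[-1]
    n_settle = len(means)
    for n in range(len(means), 0, -1):
        if abs(means[n - 1] - final) <= tolerance * abs(final):
            n_settle = n
        else:
            break
    return n_settle

# Toy Env values at 10 sec (mV) for six hypothetical devices.
env_at_10s = [10.0, 0.0, 5.0, 5.0, 5.0, 5.0]
print(running_mean(env_at_10s))
print(settle_count(env_at_10s))
```

The reference here is the mean over all available devices, so the result depends on how many devices were measured in total.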
It should be clarified that, in addition to RTN, the measurement can also include other sources contributing to the 1/f spectrum. By using the measured data to characterize RTN, we effectively treated the other sources as additional RTN through a higher RTN amplitude. For nanoscale MOSFETs, RTN plays a dominant role. This can be seen from the step-like changes of the envelope in Figs. 5 and 7. Fig. 5 also shows that, when the step-like changes are small, the total noise is much lower (the green trace).

III. RESULTS AND DISCUSSIONS
A. STATISTICAL DISTRIBUTIONS OF CETS
For the first time, we use the overnight RTN experimental results in Fig. 7 to assess the statistical distribution of CETs. Non-saturating behavior is widely observed for device ageing, which typically follows a power law [35]-[38]. To test whether the Env also follows a power law, we attempted to fit it with one. Fig. 9a shows that the agreement with the power law is not good. Figs. 9b and 9c show that the experimental data fit reasonably well with the Log-uniform and Log-normal distributions, respectively. As two different distributions both fit the data, this demonstrates that a good fit to experimental data is not a sufficient criterion for qualifying a model [27], [33], [35]. As the mission of modelling is to predict the device performance where experimental data are not available for model extraction, we will test the predictive capability of these models next.

B. PREDICTION OF THE LONG TERM CETS
Although Figs. 9b and 9c show that the CETs within a time window of ∼1 day can be fitted reasonably by the Log-uniform and Log-normal distributions, respectively, most electronic products require a lifetime of years, rather than days. To optimize a design, one needs to model the impact of RTN over the whole device lifetime. As it is impractical to carry out the repetitive statistical tests with a time window of years, one relies on the models extracted from tests of ∼1 day being usable to predict three orders of magnitude ahead, reaching ∼years [27], [33], [35]. The question is how to verify this long term prediction capability of a model.
[Figure caption fragment] The obtained CDFs were then used to make predictions beyond 10 sec by extrapolation, as shown by the dashed black lines. The Log-uniform CDF has the best agreement between prediction and the experimental data (red symbols).
VOLUME 8, 2020
As we do not have test data of ∼years, a direct verification is impossible. What we do have is the test data up to 2 × 10^4 sec in Fig. 7. Reducing it by three orders of magnitude gives a time window of ∼10 sec. We can extract the model based on the data in a time window of 10 sec and then use it to predict the RTN three orders of magnitude ahead to reach ∼10^4 sec. As we have the test data for ∼10^4 sec, we can verify this prediction.
The solid black lines in Figs. 10a-c represent the models extracted from the data with a time window of 10 sec for the power law, Log-uniform, and Log-normal distributions, respectively. The dashed lines are obtained by extrapolating the solid lines according to the extracted models. When compared with the experimental data that have not been used to fit the models (red symbols), the Log-uniform CDF in Fig. 10b is the clear winner. It predicts that Env reaches 18.5 mV at 10 years. The power law in Fig. 10a overestimates Env and gives a value of 47.5 mV at 10 years. On the other hand, the Log-normal CDF in Fig. 10c underestimates Env and gives a value of 12.8 mV at 10 years. The Log-normal CDF approaches saturation at longer times, which was not observed in the test data. As a result, the experimental data support the Log-uniform distribution of CETs.
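The fit-then-extrapolate test can be sketched as follows. The data here are synthetic placeholders generated from an assumed log-linear law (4.0 mV offset, 0.8 mV per decade), not the measurements of Fig. 7, and the helper names are hypothetical. The sketch only shows the mechanics: fit inside the 10 sec window, extrapolate, and compare with points that were held out.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_log_uniform(times_s, env_mv):
    """Log-uniform CDF: Env = intercept + slope * log10(t)."""
    slope, intercept = linear_fit([math.log10(t) for t in times_s], env_mv)
    return lambda t: intercept + slope * math.log10(t)

def fit_power_law(times_s, env_mv):
    """Power law: Env = A * t**n, fitted as a straight line in log-log."""
    n, log_a = linear_fit([math.log10(t) for t in times_s],
                          [math.log10(e) for e in env_mv])
    return lambda t: (10 ** log_a) * t ** n

# Synthetic placeholder data from 1e-4 to 1e4 sec (NOT measured values).
times_s = [10 ** (e / 2) for e in range(-8, 9)]
env_mv = [4.0 + 0.8 * math.log10(t) for t in times_s]

# Fit only inside the 10 sec window, then extrapolate to the held-out range.
window = [(t, e) for t, e in zip(times_s, env_mv) if t <= 10.0 * (1 + 1e-9)]
train_t, train_e = zip(*window)
predict_lu = fit_log_uniform(train_t, train_e)
predict_pl = fit_power_law(train_t, train_e)

ten_years_s = 10 * 365 * 24 * 3600
for label, t in (("1e4 sec", 1e4), ("10 years", ten_years_s)):
    print(f"{label}: log-uniform = {predict_lu(t):.2f} mV, "
          f"power law = {predict_pl(t):.2f} mV")
```

On log-linear placeholder data the Log-uniform fit recovers the held-out point by construction, while a power law fitted over the same window extrapolates above it; the experimental comparison is the one reported in Fig. 10.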
As the model extracted from the test data over five orders of magnitude of time, between 10^−4 and 10 sec, can be used to predict three orders of magnitude ahead, it gives us confidence that the model extracted over eight orders of magnitude, from 10^−4 to 2×10^4 sec, can also be used to predict three orders of magnitude ahead, reaching ∼years.

C. CHARACTERIZING LOG-UNIFORM CDF
The Log-uniform CDF of CETs has only one parameter to be characterized: the number of traps per decade of time, Nt. We propose the following procedure to extract Nt:
• measure the RTN of multiple devices;
• extract the Env of each device, as shown in Fig. 5;
• obtain the average envelope, as shown in Fig. 7, fit it with a straight line against logarithmic time, as shown in Fig. 9b, and obtain the Slope;
• measure the amplitude of RTN per trap and determine the average value, µ;
• evaluate Nt by: $N_t = \mathrm{Slope} / \mu$.
For the process used in this work, the experimental results give Nt = 0.75/decade. Using this Nt, the Log-uniform CDF for CETs and a Poisson distribution for traps per device, 400 hypothetical devices were generated for dynamic Monte Carlo simulation. Fig. 11 shows that the simulated average Env agrees well with the measured one.
In principle, the Log-uniform distribution can be explained by two possible mechanisms: trapping-detrapping through elastic carrier tunneling and inelastic multi-phonon trapping-detrapping.
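The last two steps of the procedure can be sketched as below, assuming that Nt is the fitted slope of the average envelope (mV per decade) divided by the average amplitude per trap µ (mV). The function name and the input numbers are hypothetical; they were chosen only so that the example returns the 0.75/decade quoted in the text.

```python
import math

def traps_per_decade(times_s, avg_env_mv, mean_amp_mv):
    """Fit the average Env against log10(t) by least squares and return
    Nt = Slope / mu (ASSUMED relation between slope and trap density)."""
    xs = [math.log10(t) for t in times_s]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(avg_env_mv) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, avg_env_mv))
    slope_mv_per_decade = sxy / sxx
    return slope_mv_per_decade / mean_amp_mv

# Hypothetical average envelope: 1.5 mV per decade, with mu = 2.0 mV per trap.
times_s = [10.0 ** e for e in range(-3, 4)]
avg_env_mv = [6.0 + 1.5 * math.log10(t) for t in times_s]
nt = traps_per_decade(times_s, avg_env_mv, mean_amp_mv=2.0)
print(f"Nt = {nt:.2f} traps per decade")
```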
It is well known that the carrier tunneling probability decreases exponentially with the tunneling distance [39], [40], resulting in an exponential increase of capture time with distance when moving from the dielectric/Si interface into dielectric. An assumption of a spatially uniform distribution of traps in gate dielectric can explain the Log-uniform CET distribution, therefore. Recent work [14], however, has reported that the CETs are not well correlated with the spatial position of traps. For the thin dielectric used in modern devices, carriers can readily tunnel through the whole dielectric in short time [41], so that the depth into the dielectric typically does not control CETs.
For inelastic multi-phonon trapping-detrapping, carriers from the channel have to overcome an energy barrier, E, to charge a trap. The capture time, τc, increases exponentially with E [1], [14],
$\tau_c = \tau_o \exp(E / kT)$,
where τo is a constant, k the Boltzmann constant, and T the temperature. A statistically uniform distribution of traps in E will result in a Log-uniform distribution of CETs.
We now discuss the advantages and disadvantages of our 'envelope approach', when compared with the conventional method, for extracting the statistical distribution of CETs. Conventionally, a bottom-up method was used: the time constant of each trap is measured first and then used to establish statistical distributions [15], [16]. The advantage of this approach is that one knows the time constant of each trap, and we have learnt a lot about the properties of individual traps from the early works [14]-[17]. The disadvantage of this approach is that the number of traps and time constants obtained through experiments is too limited to establish the statistical distribution convincingly, especially for slow traps. We do not believe that we could do better than these early works if we followed the same bottom-up approach.
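Returning to the multi-phonon picture, the link between a uniform distribution in E and a Log-uniform CET can be checked numerically: if τc grows exponentially with E/kT, equal steps in E give equal steps in log10(τc). In the sketch below τo = 10^-10 sec and the 0.3 to 0.7 eV energy range are assumed purely for illustration; only the exponential dependence and the 125 °C test temperature come from the text.

```python
import math

K_B_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K

def capture_time_s(barrier_ev, temp_k, tau_o_s=1e-10):
    """tau_c = tau_o * exp(E / kT); tau_o_s is an ASSUMED prefactor."""
    return tau_o_s * math.exp(barrier_ev / (K_B_EV_PER_K * temp_k))

temp_k = 398.15  # 125 degC
energies_ev = [0.3 + 0.05 * i for i in range(9)]  # evenly spaced barriers
log_tau = [math.log10(capture_time_s(e, temp_k)) for e in energies_ev]
steps = [b - a for a, b in zip(log_tau, log_tau[1:])]

# Every 0.05 eV step in E shifts log10(tau_c) by the same amount, so a
# uniform density of traps in E is a uniform density in log10(tau_c).
ev_per_decade = K_B_EV_PER_K * temp_k * math.log(10)
print(f"decades per 0.05 eV step: {steps[0]:.3f}")
print(f"barrier energy per decade of tau_c: {ev_per_decade * 1e3:.1f} meV")
```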
The envelope approach developed in this work can be considered as a top-down or integrated method: the results of multiple traps from multiple devices were combined and analyzed together to extract the statistical distribution without knowing the precise time constant of each trap first. The advantage of this approach is that it allows extracting the statistical distribution of time constants efficiently based on the long term RTN data, as shown in Figs. 7 and 9. The disadvantage of this method is that the precise time constant of each trap is not known and this precludes any quantitative comparison of simulation with test data for individual devices. As the precise time constant of each trap is not known, the time constant of each trap has to be statistically assigned according to the distribution for the simulation. Fig. 11, however, shows that the simulation agrees well with test data statistically.
Finally, we investigate whether the Log-uniform CDF is applicable to RTN under different test conditions. As RTN is sensitive to temperature, overnight RTN was measured at 28 °C in Fig. 12a, while the results in Fig. 7 were measured at 125 °C. Fig. 12b shows that the Log-uniform CDF again fits the experimental data at 28 °C well.

IV. CONCLUSION
In this work, we investigated the statistical distribution of the capture and emission times of traps responsible for RTN by developing a top-down methodology. We started by using dynamic Monte Carlo simulation to confirm that the average envelope of RTN, resulting from multiple devices and many traps, can uncover the underlying cumulative distribution of CETs. Overnight RTN tests were then carried out to extract the experimental envelopes of RTN. Based on these long term RTN data, the CDFs proposed by early works for CETs were assessed. The power law, widely used for ageing, does not agree well with the test data and overestimates the long term RTN. On the other hand, the Log-normal CDF underestimates the long term RTN. The overnight experimental data endorse the Log-uniform CDF for CETs. A methodology is proposed to extract the CDF of CETs efficiently. For the first time, the long term prediction capability of the extracted Log-uniform CDF is verified, allowing RTN over years to be assessed based on experimental data taken over days.