Does the Threshold Voltage Extraction Method Affect Device Variability?

The gate-all-around nanowire FET (GAA NW FET) is one of the most promising architectures for the next generation of transistors as it provides better performance than current mass-produced FinFETs, but it has been proven to be strongly affected by variability. For this reason, it is essential to be able to characterize device performance which is done by extracting the figures of merit (FoM) using data from the IV curve. In this work, we use numerical simulations to evaluate the effect of the threshold voltage (<inline-formula> <tex-math notation="LaTeX">$\mathrm {V_{TH}}$ </tex-math></inline-formula>) extraction method on the variability estimation for a gate-all-around nanowire FET. For that, we analyse the impact of four sources of variability: gate edge roughness (GER), line edge roughness (LER), metal grain granularity (MGG) and random discrete dopants (RDD). We have considered five different extraction methods: the second derivative (SD), constant current (CC), linear extrapolation (LE), third derivative (TD) and transconductance-to-current-ratio (TCR). For the ideal non-deformed device at high drain bias, the effect of the extraction technique can lead to a 137 mV difference in <inline-formula> <tex-math notation="LaTeX">$\mathrm {V_{TH}}$ </tex-math></inline-formula> and an 89 mV/V difference in the drain-induced-barrier-lowering (DIBL), and when considering GER and LER variability, the influence of the extraction method leads to differences in the standard deviation values of the <inline-formula> <tex-math notation="LaTeX">$\mathrm {V_{TH}}$ </tex-math></inline-formula> distribution (<inline-formula> <tex-math notation="LaTeX">$\sigma \mathrm {V_{TH}}$ </tex-math></inline-formula>) of up to 2.3 and 3.7 mV respectively, values comparable to intrinsic parameter variations. 
Therefore, the <inline-formula> <tex-math notation="LaTeX">$\mathrm {V_{TH}}$ </tex-math></inline-formula> extraction technique presents itself as an additional parameter that should be included in performance comparisons as it can heavily impact the results.

In physically-based simulation studies, the performance of different device architectures is most commonly compared through the figures of merit (FoM) that characterize each device [14], [15].
Previous works showed that some of these FoM can be obtained with different extraction methods, as explained in [16], and that the choice of method may directly affect the results and influence the outcome [17]. For this reason, it is necessary to develop a consistent methodology that enables researchers in academia and industry to compare device performance between different architectures without being affected by the FoM extraction method. Several studies have compared the influence of the extraction method in architectures such as MOSFETs [16], [18], [19], TFETs [20] or FinFETs [17], but work remains to be done both for recent devices like the GAA NW FET and for devices affected by variability. In this work, we present a study using five different extraction techniques (SD, CC, LE, TD and TCR) to calculate the $\mathrm{V_{TH}}$ values of a state-of-the-art GAA NW FET affected by variability (GER, LER, MGG and RDD), and we assess the effect on the results. The paper is structured as follows: Section II contains a detailed description of the device dimensions, the different extraction methods that are used and an explanation of the simulation methodology. In Section III, the simulation results with the extraction method comparison are presented. Finally, the conclusions of this work are summarised in Section IV.

II. DEVICE DESCRIPTION AND SIMULATION METHODOLOGY
The benchmark architecture used in this work is a 10 nm gate length Si GAA NW FET. This is a state-of-the-art device characterized from experimental data [21] and later scaled as explained in [22], following the ITRS guidelines [23]. The main dimensions and doping values are listed in Table 1. The device channel is uniformly doped, whereas the source/drain (S/D) regions have Gaussian doping profiles. These profiles, reverse engineered from experimental data and scaled as explained in [22], are characterized by the slope ($\sigma_x$) and the peak ($x_p$) of the Gaussian profile (see Table 1).
We use VENDES, an in-house-built simulator [24], to obtain the GAA NW FET transfer characteristics. VENDES includes two transport schemes, drift-diffusion (DD) and Monte Carlo (MC), and two types of quantum corrections: anisotropic Schrödinger equation based (SCH) and density gradient (DG). The simulator uses a finite element (FE) mesh that accurately describes complex three-dimensional geometries. Detailed information about the simulator functionalities and the models it is based on can be found in [24]. The DD-DG simulations used for this study require calibration against self-consistent schemes. In the sub-threshold region, SCH quantum-corrected DD simulations (SCH-DD) are used to fit the DD-DG electron effective masses, which mimic source-to-drain tunneling and quantum confinement effects [25], allowing for a perfect agreement in the I-V transfer characteristics. SCH quantum-corrected Monte Carlo simulations (SCH-MC) fail to capture these effects in the sub-threshold region because noise interferes with the results at low gate voltages. However, at gate biases above $\mathrm{V_{TH}}$, SCH-MC produces more accurate predictions as it considers scattering events (see the calibration curves in Fig. 1).
The DD-DG simulation method has been chosen for this study because, if thoroughly calibrated against self-consistent schemes like SCH-DD and SCH-MC, it produces sound results in the sub-threshold region at a fraction of the simulation time, which is especially important when dealing with the massive number of simulations required in variability studies.
VENDES implements several variability sources that have been proven to play a critical role in performance, limiting the scaling of the device and, as a consequence, hindering the reduction of supply voltage and power dissipation [26]. GER and LER are modelled similarly: the device gate or the edge of the nanowire, respectively, is deformed according to a given roughness profile characterised by the root mean square (RMS) height, which defines the amplitude, and the correlation length (CL), which accounts for the spatial correlation. MGG is modelled by generating a work-function map of the device gate that matches realistic metal grain distributions generated using the Voronoi approach. RDDs are introduced in the n-type doped S/D regions using a rejection technique applied to the doping profile (shown in Table 1) of the ideal non-deformed device, and the charge of each dopant is later distributed through the mesh using the cloud-in-cell approach. A full description of the generation of all the aforementioned variability sources can be found in [24]. For each variability source, ensembles of 300 device configurations were simulated in order to statistically assess the impact on device performance. Figure 2 shows examples of the aforementioned sources on the 10 nm GAA NW FET structure.
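As an illustration of a roughness profile defined by an RMS height and a correlation length, the following minimal sketch filters white noise in Fourier space. It is not the VENDES implementation described in [24]: the Gaussian autocorrelation, the function name `rough_profile` and the sampling values are assumptions made only for this example, while the RMS of 1.0 nm and CL of 20 nm are the LER values quoted in Section III.

```python
import numpy as np

def rough_profile(n_points, dx_nm, rms_nm, corr_len_nm, rng):
    """Random 1D edge-displacement profile with a target RMS and correlation length."""
    # Angular spatial frequencies of the real-FFT grid (rad/nm).
    k = 2.0 * np.pi * np.fft.rfftfreq(n_points, d=dx_nm)
    # Assumption: Gaussian autocorrelation exp(-x^2 / CL^2), whose power
    # spectrum is proportional to exp(-(k * CL)^2 / 4).
    psd = np.exp(-((k * corr_len_nm) ** 2) / 4.0)
    # Filter white noise with the square root of the power spectrum.
    noise = rng.standard_normal(n_points)
    profile = np.fft.irfft(np.fft.rfft(noise) * np.sqrt(psd), n=n_points)
    # Remove the mean offset and rescale so the RMS height equals rms_nm.
    profile -= profile.mean()
    profile *= rms_nm / profile.std()
    return profile

rng = np.random.default_rng(seed=0)
# Hypothetical sampling: 256 points spaced 0.25 nm apart.
profile = rough_profile(n_points=256, dx_nm=0.25, rms_nm=1.0, corr_len_nm=20.0, rng=rng)
print(f"RMS = {profile.std():.3f} nm over {profile.size} points")
```

Each call with a different random seed gives one member of an ensemble, which is how a set of 300 deformed device configurations could in principle be generated.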
To perform this study in a time-efficient manner, we developed an in-house-built Python library called FoMPy, available to download at [27]. FoMPy implements some of the most common FoM extraction techniques used by the semiconductor community. This software can import large datasets, as required in variability studies, allowing the user to automatically obtain relevant FoM ($\mathrm{V_{TH}}$, SS, $\mathrm{I_{OFF}}$, $\mathrm{I_{ON}}$ and DIBL) for each one of thousands of IV curves. Fig. 3 shows the overall capabilities of this library. It enables users to easily import their data, preprocess it with filters or interpolation mechanisms, and extract and plot the main FoMs with the several implemented extraction techniques.
Fig. 3. General capabilities of the FoMPy library. FoMPy is able to import your data into a dataset and, after optional conditioning (data filtering or interpolation), is able to extract and plot some of the most commonly studied FoMs.
In this work, we employed FoMPy to analyze the influence that the chosen $\mathrm{V_{TH}}$ extraction method has on the results. The study was performed both at low ($\mathrm{V_{D,lin}}$ = 0.05 V) and high ($\mathrm{V_{D,sat}}$ = 0.70 V) drain biases. This library includes the following $\mathrm{V_{TH}}$ extraction methods: the second derivative (SD), constant current (CC), linear extrapolation (LE), third derivative (TD) and transconductance-to-current-ratio (TCR). See Fig. 4 for an illustrative explanation of the five aforementioned extraction methods.
The SD method evaluates $\mathrm{V_{TH}}$ at the $\mathrm{V_G}$ value where the derivative of the transconductance ($\mathrm{g_m = dI_D/dV_G}$) is maximum, i.e. at the maximum of $\mathrm{d^2 I_D/dV_G^2}$. For the saturation region, $\mathrm{V_{TH}}$ is determined at the maximum point of the function $\mathrm{d^2 I_D^{0.5}/dV_G^2}$, as explained in [16], where $\mathrm{I_D^{0.5}}$ is the square root of the drain current. Although this method is user-independent and relies on a physical parameter, the transconductance of the device is subject to error and noise, as differentiation acts as a high-pass filter on the data [16].
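The SD criterion can be sketched numerically as follows. This is a minimal illustration and not the FoMPy implementation: the function names and the smooth synthetic transfer curve (`synthetic_id`, with hypothetical parameters) are assumptions introduced only so that the example runs without simulation data.

```python
import numpy as np

def vth_second_derivative(vg, i_d, saturation=False):
    """V_G at the maximum of d2(I_D)/dV_G2, or of d2(sqrt(I_D))/dV_G2 in saturation."""
    vg = np.asarray(vg, dtype=float)
    y = np.sqrt(i_d) if saturation else np.asarray(i_d, dtype=float)
    gm = np.gradient(y, vg)       # first derivative (g_m when y = I_D)
    d2 = np.gradient(gm, vg)      # derivative of the transconductance
    k = np.argmax(d2[2:-2]) + 2   # skip the one-sided estimates at the edges
    return float(vg[k])

def synthetic_id(vg, vt0=0.30, n=1.0, i0=1e-6, v_thermal=0.02585):
    """Hypothetical smooth transfer curve (illustration only, not simulator output)."""
    u = (np.asarray(vg, dtype=float) - vt0) / (2.0 * n * v_thermal)
    return i0 * np.logaddexp(0.0, u) ** 2   # i0 * ln(1 + exp(u))^2

vg = np.linspace(0.0, 1.0, 1001)            # 1 mV gate-voltage step
i_d = synthetic_id(vg, vt0=0.30)
vth_lin = vth_second_derivative(vg, i_d)
# The same synthetic curve is reused to exercise the square-root form.
vth_sat = vth_second_derivative(vg, i_d, saturation=True)
print(f"SD on I_D: {vth_lin:.3f} V, SD on sqrt(I_D): {vth_sat:.3f} V")
```

On noisy data the double differentiation amplifies the noise, so some filtering or interpolation of the curve would normally precede this step.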
The LE method obtains $\mathrm{V_{TH}}$ as the $\mathrm{V_G}$ axis intercept of the tangent to the IV characteristic at its maximum first derivative point. For the saturation region, $\mathrm{V_{TH}}$ is determined at the $\mathrm{V_G}$ axis intercept of the linear extrapolation of the $\mathrm{I_D^{0.5}}$ versus $\mathrm{V_G}$ curve. As it is based on finding the maximum slope of the curve, this method can be strongly affected by mobility degradation and by the source and drain parasitic series resistances [16].
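The tangent construction used by the LE method can be sketched as below. This is an illustrative example under stated assumptions, not the FoMPy code: the synthetic curve includes a hypothetical mobility-degradation factor `theta` so that the slope has an interior maximum, and all names and parameter values are invented for the example.

```python
import numpy as np

def vth_linear_extrapolation(vg, i_d, saturation=False):
    """V_G-axis intercept of the tangent at the point of maximum slope."""
    vg = np.asarray(vg, dtype=float)
    y = np.sqrt(i_d) if saturation else np.asarray(i_d, dtype=float)
    slope = np.gradient(y, vg)
    k = np.argmax(slope[1:-1]) + 1   # skip the one-sided estimates at the edges
    # Tangent: y = y[k] + slope[k] * (V - vg[k]); it crosses y = 0 at:
    return float(vg[k] - y[k] / slope[k])

def synthetic_id(vg, vt0=0.30, n=1.0, k_lin=1e-5, theta=1.0, v_thermal=0.02585):
    """Hypothetical linear-region curve with mobility degradation (illustration only)."""
    a = n * v_thermal
    v_ov = a * np.logaddexp(0.0, (np.asarray(vg, dtype=float) - vt0) / a)  # smooth overdrive
    return k_lin * v_ov / (1.0 + theta * v_ov)

vg = np.linspace(0.0, 1.0, 1001)
vth_le = vth_linear_extrapolation(vg, synthetic_id(vg, vt0=0.30))
print(f"LE threshold voltage: {vth_le:.3f} V")
```

Changing `theta` moves the point of maximum slope and therefore the intercept, which is the sensitivity to mobility degradation mentioned in the text.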
The constant current criterion is one of the most commonly used $\mathrm{V_{TH}}$ extraction methods [16], [26] because of its simplicity. It determines $\mathrm{V_{TH}}$ at a critical user-defined value of the drain current ($\mathrm{I_{Dcc}}$). In this article we use the criterion initially proposed by Tsuno et al. [28] for MOSFETs and later adapted for NWs by Tiwari et al. [29], where the constant drain current is set to $100\,\mathrm{nA} \times (W_m/L_m)$; $L_m$ and $W_m$ refer to the channel length and channel perimeter of the device, respectively. We have also used a similar criterion by Wu and Su [30], where the constant drain current is $300\,\mathrm{nA} \times (W_m/L_m)$. This method has the serious disadvantage that all the outcomes depend on an arbitrary criterion, which might lead to inconsistent results. For example, when studying ensembles of curves with great excursions, it has been shown that the CC criterion fails to capture this behaviour [17]. To overcome this flaw, Bazigos et al. [31] and Zhou et al. [32] proposed to pair the CC method with other physically-based mechanisms like the SD method. In the same manner, we have used a constant current criterion that matches the gate voltage extracted using the LE method. Therefore, three different drain current values are taken into consideration to compare the effect that the arbitrary choice of a criterion may have on the results.
The TD method chooses the $\mathrm{V_{TH}}$ where the third derivative of the current ($\mathrm{d^3 I_D/dV_G^3}$) has a maximum, inherently disagreeing with the SD results. As in the SD method, another degree of differentiation introduces even greater noise, making this method the least reliable of all, as shown in [17]. For simulations at high drain bias, $\mathrm{V_{TH}}$ is chosen at the maximum point of $\mathrm{d^3 I_D^{0.5}/dV_G^3}$.
Finally, the TCR method [16], [33], [19] is based on calculating the following ratio between the transconductance and the current:
$$\mathrm{TCR} = \frac{\mathrm{g_m}}{\mathrm{I_D}} = \frac{1}{\mathrm{I_D}} \frac{\mathrm{dI_D}}{\mathrm{dV_G}}$$
$\mathrm{V_{TH}}$ is then taken as the $\mathrm{V_G}$ value at which the TCR has its maximum negative slope, which can be found from the derivative of the ratio ($\mathrm{dTCR/dV_G}$).
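The CC, TD and TCR criteria described above can be sketched in a few lines. This is a hedged illustration, not the FoMPy implementation: the function names, the synthetic transfer curve and the perimeter and length values (`w_m`, `l_m`) are hypothetical, since the device dimensions of Table 1 are not reproduced here. The TD function uses the low drain bias form only.

```python
import numpy as np

def vth_constant_current(vg, i_d, i_cc):
    """V_G at which I_D crosses i_cc, interpolated on a log10(I_D) scale."""
    # np.interp needs increasing sample points, so I_D must rise monotonically.
    return float(np.interp(np.log10(i_cc), np.log10(i_d), vg))

def vth_third_derivative(vg, i_d):
    """V_G at the maximum of d3(I_D)/dV_G3."""
    d3 = np.gradient(np.gradient(np.gradient(i_d, vg), vg), vg)
    return float(vg[np.argmax(d3[3:-3]) + 3])   # drop edge-contaminated points

def vth_tcr(vg, i_d):
    """V_G at the most negative slope of TCR = g_m / I_D."""
    tcr = np.gradient(i_d, vg) / i_d
    dtcr = np.gradient(tcr, vg)
    return float(vg[np.argmin(dtcr[2:-2]) + 2])

def synthetic_id(vg, vt0=0.30, n=1.0, i0=1e-6, v_thermal=0.02585):
    """Hypothetical smooth transfer curve (illustration only, not simulator output)."""
    u = (np.asarray(vg, dtype=float) - vt0) / (2.0 * n * v_thermal)
    return i0 * np.logaddexp(0.0, u) ** 2

vg = np.linspace(0.0, 1.0, 1001)
i_d = synthetic_id(vg)

w_m, l_m = 20e-9, 10e-9            # hypothetical channel perimeter and length (m)
i_cc_tsuno = 100e-9 * w_m / l_m    # 100 nA x (W_m / L_m)
i_cc_wu = 300e-9 * w_m / l_m       # 300 nA x (W_m / L_m)

vth_tsuno = vth_constant_current(vg, i_d, i_cc_tsuno)
vth_wu = vth_constant_current(vg, i_d, i_cc_wu)
vth_td = vth_third_derivative(vg, i_d)
vth_ratio = vth_tcr(vg, i_d)
print(f"CC Tsuno: {vth_tsuno:.3f} V, CC Wu: {vth_wu:.3f} V")
print(f"TD: {vth_td:.3f} V, TCR: {vth_ratio:.3f} V")
```

Even on this single noise-free curve the criteria return different gate voltages, which is the effect the study quantifies on simulated devices.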

III. SIMULATION RESULTS
In order to study the variability distributions of the aforementioned sources, we have employed two estimators: the standard deviation ($\sigma \mathrm{V_{TH}}$) and the threshold voltage shift ($\Delta \mathrm{V_{TH}}$), defined as the difference between the $\mathrm{V_{TH}}$ of the ideal non-deformed device and the mean value of the $\mathrm{V_{TH}}$ distribution due to a particular source of variability ($\Delta \mathrm{V_{TH}} = \mathrm{V_{TH,ideal}} - \langle \mathrm{V_{TH}} \rangle$). Tables 2 and 3 show $\sigma \mathrm{V_{TH}}$, $\mathrm{V_{TH,ideal}}$ and $\Delta \mathrm{V_{TH}}$ obtained using the SD, CC, LE, TD and TCR extraction methods for the GER, RDD, MGG and LER variability distributions at low and high drain biases, respectively. Note that the CC criterion has been calculated following Tsuno et al. [28], Wu and Su [30] and using the drain current value corresponding to the $\mathrm{V_{TH}}$ value of the LE method for the ideal device. For the ideal non-deformed device, we observe maximum differences of 23% and 39% in the $\mathrm{V_{TH}}$ between the different extraction methods, for $\mathrm{V_{D,lin}}$ and $\mathrm{V_{D,sat}}$, respectively. Also, the DIBL is a FoM that heavily depends on the $\mathrm{V_{TH}}$ extraction method, as shown in Fig. 5. The SD and TD methods provide values higher than 130 mV/V, the LE, $\mathrm{CC_{Wu}}$ and $\mathrm{CC_{LE}}$ in
Table 2 clearly shows that, at low drain bias, the $\sigma \mathrm{V_{TH}}$ due to a particular source of variability depends on the extraction technique. The difference between the $\sigma \mathrm{V_{TH}}$ of the different methods ranges from 59%, in the GER case, to 10%, in the RDD one. Also, the highest $\sigma \mathrm{V_{TH}}$ values are not always extracted with the same method. For LER and GER, the $\mathrm{CC_{LE}}$ method outputs the highest $\sigma \mathrm{V_{TH}}$, whereas for RDD and MGG the maximum is obtained with the TCR and SD methods, respectively. Similarly, at high drain bias, $\mathrm{V_{TH}}$ values are shown, and we obtain an analogous behaviour. In Table 3, the difference in $\sigma \mathrm{V_{TH}}$ for LER (RMS 1.0 nm and CL 20 nm) depending on the extraction method is up to 9.3 mV. As shown in [2], the difference in $\sigma \mathrm{V_{TH}}$ due to LER for two given RMS values (0.7 and 0.85 nm) was 10 mV, using the CC criterion in both cases.
Additionally, for GER (RMS of 1.0 nm and CL of 11 nm), a maximum difference of 2.3 mV was found between the different extraction techniques, whereas in [6] a variation of RMS from 0.8 to 1.0 nm produced a change in $\sigma \mathrm{V_{TH}}$ of 1 mV. These results show that the effect of the extraction method can even be comparable to the variation of the intrinsic variability parameters. On the contrary, for other variability sources like MGG (with a grain size of 2.5 nm), the $\mathrm{V_{TH}}$ extraction method modifies the results by up to 0.5 mV, a value much lower than the data provided in [2], where the MGG variability (for a grain size of 3.0 nm) produced a $\sigma \mathrm{V_{TH}}$ of approximately 20 mV. Also, in the case of RDD, a maximum difference between extraction methods of 16% was found.
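The two estimators defined at the start of this section, together with the DIBL, can be computed as in the following sketch. The DIBL expression is the usual definition, $(\mathrm{V_{TH,lin}} - \mathrm{V_{TH,sat}})/(\mathrm{V_{D,sat}} - \mathrm{V_{D,lin}})$, which the text does not state explicitly and is therefore an assumption, as are the function names, the use of the sample standard deviation and the randomly generated ensemble.

```python
import numpy as np

def variability_estimators(vth_ensemble, vth_ideal):
    """Return (sigma V_TH, Delta V_TH) for an ensemble of extracted V_TH values."""
    vth = np.asarray(vth_ensemble, dtype=float)
    sigma = vth.std(ddof=1)           # assumption: sample standard deviation
    shift = vth_ideal - vth.mean()    # Delta V_TH = V_TH,ideal - <V_TH>
    return float(sigma), float(shift)

def dibl(vth_lin, vth_sat, vd_lin=0.05, vd_sat=0.70):
    """DIBL in V/V from the thresholds at low and high drain bias (assumed definition)."""
    return (vth_lin - vth_sat) / (vd_sat - vd_lin)

# Hypothetical ensemble of 300 threshold voltages (not simulation data).
rng = np.random.default_rng(seed=1)
ensemble = 0.300 + 0.006 * rng.standard_normal(300)
sigma_vth, delta_vth = variability_estimators(ensemble, vth_ideal=0.305)
print(f"sigma V_TH = {1e3 * sigma_vth:.2f} mV, Delta V_TH = {1e3 * delta_vth:.2f} mV")
print(f"DIBL = {1e3 * dibl(0.300, 0.235):.0f} mV/V")
```

Because both thresholds enter the DIBL directly, any method-dependent offset in $\mathrm{V_{TH}}$ at either bias propagates into this figure of merit.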
In Fig. 6 (top), $\sigma \mathrm{V_{TH}}$ for the SD, LE and TD methods is presented normalized by the $\sigma \mathrm{V_{TH}}$ extracted with the $\mathrm{CC_{Tsuno}}$ criterion, in order to compare both the influence of the variability source and that of the extraction method at high drain bias ($\mathrm{V_{D,sat}}$ = 0.70 V). It shows that $\sigma \mathrm{V_{TH}}$ depends on the extraction method for GER, RDD and LER. For instance, for GER a maximum difference of 30% can be seen between the TD (7.9 mV) and the LE (5.6 mV) methods. However, for MGG all methods except the TCR produce a very similar $\sigma \mathrm{V_{TH}}$, as this variability source only induces a $\mathrm{V_{TH}}$ shift in the IV curves without modifying the subthreshold slope.
Similarly, $\Delta \mathrm{V_{TH}}$ has been plotted in Fig. 6 (bottom) at high drain bias. $\Delta \mathrm{V_{TH}}$ can be positive or negative, depending on the extraction method and on the variability source. Also, the extraction method can either have a small effect on $\Delta \mathrm{V_{TH}}$, as in LER and MGG, or influence it considerably, as in RDD, where the CC methods yield a maximum shift of −51 mV, four times larger than that of any other method. The SD, TD, TCR and LE methods are based on the change of the transconductance for each individual device; the CC methods, however, set the same fixed drain current value ($\mathrm{I_{Dcc}}$), chosen for the ideal non-deformed device, for all the variability-affected devices. RDD induces not only a $\mathrm{V_{TH}}$ shift but also a large change in the slope of the IV curve. Therefore, choosing the same $\mathrm{I_{Dcc}}$ for both the nominal and the variability-affected devices may lead to the overestimation of $\Delta \mathrm{V_{TH}}$.
Fig. 7 and Fig. 8 show, at low and high drain biases respectively, $\sigma \mathrm{V_{TH}}$ normalized by $\sigma \mathrm{V_{TH}^{CC,LE}}$ for the other two CC criteria used in this work, those proposed by Tsuno et al. [28] and Wu and Su [30]. On the one hand, we observe changes of up to 40% and 20% for low and high drain biases, respectively, depending on the CC criterion. On the other hand, the behaviour that these methods predict may differ. For instance, in Fig. 7, when using $\mathrm{CC_{Tsuno}}$, the $\sigma \mathrm{V_{TH}}$ obtained is higher in MGG than in the GER, RDD or LER cases, but if $\mathrm{CC_{Wu}}$ is used, the results are the opposite. Similarly, in Fig. 8, when using $\mathrm{CC_{Tsuno}}$, $\sigma \mathrm{V_{TH}}$ is higher in an RDD-affected device than in the MGG case, whereas when using $\mathrm{CC_{Wu}}$ the opposite behaviour can be seen.
Depending on the user-defined drain current, the transfer characteristics are studied at different points, which may result in misinterpretations. This is exemplified in Fig. 9, where five voltages extracted using the TD, SD, TCR, $\mathrm{CC_{Wu}}$ and LE methods are used to simulate the nominal device at $\mathrm{V_{D,sat}}$.
A clear change in the electron concentration in the channel leads to different extraction results, making the $\mathrm{V_{TH}}$ extraction method an essential parameter to be taken into account when studying device performance.
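The sensitivity of a fixed $\mathrm{I_{Dcc}}$ to slope changes can be illustrated with two synthetic curves that share the same threshold parameter but differ in sub-threshold slope. This is a toy example and not a reproduction of the RDD simulations: the curve model, the slope factors `n` and the current level `i_cc` are hypothetical values chosen for illustration.

```python
import numpy as np

V_THERMAL = 0.02585   # thermal voltage near 300 K, in volts

def synthetic_id(vg, vt0, n, i0=1e-6):
    """Hypothetical smooth transfer curve; n sets the sub-threshold slope."""
    u = (vg - vt0) / (2.0 * n * V_THERMAL)
    return i0 * np.logaddexp(0.0, u) ** 2

def vth_cc(vg, i_d, i_cc):
    """Constant-current V_TH, interpolated on a log10(I_D) scale."""
    return float(np.interp(np.log10(i_cc), np.log10(i_d), vg))

def vth_sd_sat(vg, i_d):
    """Second-derivative V_TH computed on sqrt(I_D) (saturation form)."""
    d2 = np.gradient(np.gradient(np.sqrt(i_d), vg), vg)
    return float(vg[np.argmax(d2[2:-2]) + 2])

vg = np.linspace(0.0, 1.0, 1001)
nominal = synthetic_id(vg, vt0=0.30, n=1.0)
degraded = synthetic_id(vg, vt0=0.30, n=1.5)   # same vt0, softer slope

i_cc = 2e-7   # the same fixed current is applied to both curves
shift_cc = vth_cc(vg, nominal, i_cc) - vth_cc(vg, degraded, i_cc)
shift_sd = vth_sd_sat(vg, nominal) - vth_sd_sat(vg, degraded)
print(f"shift seen by CC: {1e3 * shift_cc:.1f} mV, by SD: {1e3 * shift_sd:.1f} mV")
```

In this toy case the CC criterion reports a shift of several millivolts that comes purely from the slope change, while the derivative-based method reports none, which mirrors the argument made above for RDD.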

IV. CONCLUSION
In this work, we tested the effect of the $\mathrm{V_{TH}}$ extraction method on variability studies. In order to do so, 3D quantum-corrected FE DD-DG simulations of a 10 nm gate length GAA NW FET were performed, and GER, RDD, MGG and LER variations were applied to the benchmark device. The results were extracted using five different extraction techniques: SD, CC, LE, TD and TCR.
For $\mathrm{V_{TH}}$ in the ideal non-deformed case, maximum variations of 39% and 23% between different extraction methods have been found for simulations at high drain bias ($\mathrm{V_{D,sat}}$) and low drain bias ($\mathrm{V_{D,lin}}$), respectively. Also, comparing the extracted DIBL values, a difference of up to 89 mV/V has been found between the methods considered. In variability, the dependence of the results on the extraction method can even be comparable to that on intrinsic variability parameters like the RMS in GER and LER, with variations in $\sigma \mathrm{V_{TH}}$ of up to 59% and 30% in GER between the different methods for $\mathrm{V_{D,lin}}$ and $\mathrm{V_{D,sat}}$, respectively. Also, even though the CC criterion is one of the most commonly used extraction methods because of its simplicity, the arbitrary value of the drain current that the user has to set might lead to misinterpretations. Changes of up to 40% and 20%, as well as opposite behaviours, have been found when extracting $\sigma \mathrm{V_{TH}}$ using different CC criteria for low and high drain biases, respectively.
In summary, we have demonstrated that the $\mathrm{V_{TH}}$ extraction method may play a significant role in variability studies, becoming an additional factor that has to be taken into account and used consistently in performance comparisons.