Dynamic Power Consumption and Delay Analysis for Ultra-Low Power 2 to 1 Multiplexer Designs

This paper presents a comparative analysis of eight techniques for implementing a 2 to 1 multiplexer. The functionality is identical, but significant differences in dynamic power consumption and propagation delay are observed. The aim is to enable designers to select the structure best suited to a specific application and its design requirements. The multiplexers are designed at the 90 nm technology node and simulated at a supply voltage of 1 V.


Introduction
Technology is advancing at an exponential rate. Devices are remodelled and improved within short spans of time, and each component of a device is analysed to improve its performance. Designers continually try to improve the performance of existing devices or to devise new design approaches [1]. A multiplexer is one of the basic building blocks of digital systems [2] and [3]. In this paper, different design techniques for the 2 to 1 multiplexer are compared.
Advances in the design automation of ASICs have made it possible to place millions of transistors on a single chip for robust circuit implementation. Consequently, the transistor density for a given area has increased significantly, which in turn has introduced formidable design issues [4]. The solutions recommended so far involve reducing the transistor supply voltage, switching frequency, and capacitance [5], [6] and [7]. Depending on the application, different types of circuits and design methodologies have been proposed, which has precluded the formulation of uniform rules for the optimum logic type.
The smallest multiplexer that can be designed is a 2 to 1 multiplexer. It forms the building block for larger multiplexer modules [8], and optimising its configuration enhances its stability [9]. Therefore, to optimise the performance of the 2 to 1 multiplexer, different configurations, including Gate Diffusion Input (GDI), Pass Transistor (PT), Multiplexer Single with Level Restoration (MSL), Transmission Gate (TG), Static CMOS, Complementary Pass Logic (CPL), Cascode Voltage Switch Logic (CVSL) and Multi Threshold CMOS CVSL (MTCMOS CVSL), are analysed in this paper. All the multiplexers are designed using complementary MOSFETs at the 90 nm technology node. Their performance is analysed and compared in terms of output response and dynamic power dissipation using Cadence Virtuoso software. Additionally, power and delay are analysed to identify the best multiplexer among all the configurations.
This paper is arranged into five sections, including this introductory section. Sec. 2 presents the schematics and operation of the different multiplexer configurations. The output response of the different multiplexer designs is presented and discussed in Sec. 3. Dynamic power dissipation and delay are elaborated in Sec. 4. Finally, the properties of the evaluated structures are summarized in Sec. 5.

CMOS Structures of Multiplexers
Diverse structures of the 2 to 1 multiplexer are simulated and analysed. Designs such as the GDI and PT multiplexers have simpler schematics and occupy less area with low power dissipation, but their output response is degraded. Other techniques, such as the MSL, Static CMOS, CPL, CVSL and MTCMOS based multiplexers, produce non-degraded output but consume more power and introduce larger delay.
The transmission gate based 2 to 1 multiplexer is designed using a pair of Transmission Gates (TGs). Each TG is a pair of NMOS and PMOS transistors whose source and drain terminals are connected in parallel, as illustrated in Fig. 1(a). Both the NMOS and the PMOS pass the same input simultaneously, so it is transferred to the output node through the TG without deterioration [10]. For a high input signal, the NMOS gives a weak 1 at the output, but the PMOS provides a strong 1 at the same time, thereby maintaining the output level. Similarly, for a low input, the PMOS produces a weak 0 but the NMOS supplies a strong 0 at the output [11]. The TG configuration is also used to isolate components and to prevent signals/data from being transmitted to other nodes without any additional hardware.
Gate Diffusion Input (GDI) logic allows complex logic circuits to be designed with a smaller number of transistors. The basic structure of GDI resembles a CMOS inverter and inherits characteristics of both CMOS and PTL logic [12]. It is an appropriate technique for designing fast, low-power circuits using fewer transistors than the CMOS and PTL techniques [13] and [14]. The schematic of the GDI based 2 to 1 multiplexer is given in Fig. 1(b). When Sel = 1, the NMOS transistor is ON and input signal B passes to the output O. When Sel is held at 0, the output O receives input signal A.
The GDI based multiplexer takes A and B as its data inputs, while the Sel signal acts as the select line and determines which input is transmitted to the output. The logical function implemented by the GDI based multiplexer is nSel.A + Sel.B. The advantages of the GDI technique are minimal transistor count, low power dissipation and fast operation. However, this structure has two limitations: (1) if A = 0, the PMOS, which passes only a weak 0, does not deliver a perfect 0 to the output; (2) conversely, if B = 1, the NMOS delivers a weak 1 to the output.
Pass Transistor Logic (PTL) uses two NMOS transistors in a pass transistor configuration. This logic differs from CMOS design in that the source side of the logic network is connected to the input signal instead of the power supply [15], [16], [17] and [18], as depicted in Fig. 1(c). When Sel = 0, the MN2 transistor is turned ON, so the value at A appears at the output terminal. For Sel = 1, the value at B is transferred to the output. The advantages of PTL are high speed, low power consumption and low interconnect effect. However, the factor that limits the use of the PTL technique is reduced signal integrity, because an NMOS cannot pass a strong 1 [19]. Therefore, for conditions such as Sel = 1, B = 1 or Sel = 0, A = 1, the output driving capability is weak.
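The select behaviour described for the GDI and PT multiplexers (A is passed for Sel = 0, B for Sel = 1) can be captured in a purely behavioural sketch. The function name `mux2to1` is hypothetical, and the model deliberately ignores the weak-level effects discussed above; it only checks the Boolean function nSel.A + Sel.B.

```python
def mux2to1(a: int, b: int, sel: int) -> int:
    """Behavioural 2 to 1 multiplexer: O = nSel.A + Sel.B (logic levels only)."""
    n_sel = 1 - sel                      # complement of the select line
    return (n_sel & a) | (sel & b)


def truth_table():
    """Enumerate (Sel, A, B, O) for all eight input combinations."""
    return [(sel, a, b, mux2to1(a, b, sel))
            for sel in (0, 1) for a in (0, 1) for b in (0, 1)]


if __name__ == "__main__":
    for sel, a, b, o in truth_table():
        print(f"Sel={sel} A={a} B={b} -> O={o}")
```

All eight structures compared in this paper are expected to realise this same function; they differ only in electrical behaviour.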
The MSL based multiplexer is created by modifying the PT based multiplexer to overcome its shortcoming. In this design, an additional PMOS transistor is used to restore the output level. The gate of the PMOS (MP1) is driven by an inverter (INV1) controlled by the output of the PT based multiplexer [20]. The gate and drain of the PMOS (MP1) are connected to the output and input of the second inverter (INV2), respectively, as indicated in Fig. 1(d). Similarly, the gate terminals of the two NMOS transistors (MN1 and MN2) are connected to the input and the output of the first inverter (INV1). The purpose of the MSL based multiplexer is to overcome the output deterioration and the complex synthesis methodology of the PT multiplexer [21]. When Sel = 1 and B = 1, the NMOS produces a weak 1, as explained above. To rectify this, the MSL based multiplexer converts this 1 to a 0 through the inverter [22]. The inverter output then drives the PMOS into the ON state, thereby making the output signal a strong 1. The other advantage of the MSL based multiplexer is its maximum output swing (nearly VDD and 0).
The next implementation analysed is the Static CMOS based multiplexer. This multiplexer uses eight transistors, of which four PMOS transistors form the pull-up network and four NMOS transistors form the pull-down network. It uses inverted inputs and delivers the inverted output OB along with the actual output O. The schematic of the static CMOS based multiplexer is illustrated in Fig. 2(a) [23]. When Sel = 0, B = 0 or nSel = 1, A = 1/0, transistors MP2, MP4 and MN2 are ON and MN1, MN3 and MP1 are OFF, so the output becomes 0. There is some leakage through MN2 and MN4 of the pull-down network. Similarly, for nSel = B = 1 or Sel = 0, A = 0/1, a 1 is produced at the output O. Likewise, a 0 is produced at the output for nSel = A = 0 or Sel = 1, B = 0/1; in this condition, leakage occurs through MN2 and MN4 of the pull-down network. This analysis shows that the input B/A is produced at the output for Sel = 0/1. The inverters are used to invert the select signal and the output.
In the Complementary Pass Logic (CPL) based multiplexer, half of the transistors are used to pull up and the other half to pull down the logic, thereby providing both the actual output (O) and its complement (nO), as depicted in Fig. 2(b). When Sel = 0, transistors MN2-MN3 are turned ON and pass input signal B and nB (the complement of input B) to O and nO, respectively. If input B = 0 (i.e. nB = 1), MP2 is ON. Therefore, VDD is connected to the output of MN3 and restores the logic level 1. Hence, 0 is produced at the output of the inverter. If B = 1 (nB = 0), transistor MP1 is ON, VDD is connected to the output of MN2 and the logic level 1 is restored. Accordingly, the inverted output (nO) is 0. This technique offers advantages such as the availability of both the output and the inverted output, fast operation and restoration of the output level. However, this structure dissipates significantly more power than the other structures, which limits its application.
The Cascode Voltage Switch Logic (CVSL) multiplexer is shown in Fig. 2(c). The NMOS logic forms the pull-down network and generates the complementary logic. CVSL is composed of a differential latching circuit and a cascaded complementary logic array [24]. This structure is therefore also known as Differential Cascode Voltage Switch Logic (DCVS or DCVSL) [20] and [23]. When Sel = 0, A = 0, transistors MN3, MN5 and MN6 are ON; hence, the output and the gate of MP2 are connected to the ground terminal through transistors MN5 and MN6. Thus, the output obtained is O = 0 (nO = 1). For Sel = 0, A = 1, transistors MN3, MN4 and MN5 are ON and the ground is connected to nO through MN3 and MN4, making nO = 0, while O = 1 because it is connected to VDD through MP1, which is ON. This configuration has the advantage of requiring fewer PMOS transistors for each logic function, which leads to a significant area reduction.
The functionality of the Multi-Threshold CMOS Cascode Voltage Switch Logic (MTCMOS CVSL) based multiplexer is similar to that of the CVSL multiplexer. This technique is used to reduce circuit leakage in static conditions [24]. In this configuration, one PMOS-NMOS pair (MP3-MN9) is used, wherein MP3 isolates the logic circuit from VDD and MN9 isolates it from ground, as shown in Fig. 2(d). The MTCMOS technique separates the circuit from the power supply and ground to prevent power dissipation in the static state [25]. Two complementary sleep signals, Sleep and nSleep (the complement of the sleep signal), control the gates of the PMOS (MP3) and the NMOS (MN9), respectively. When Sleep is low, the circuit works as a standard CVSL based multiplexer. During the sleep state, both outputs are in the High Z (high impedance) state and the circuit is non-operational.
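The relation between the CVSL and MTCMOS CVSL multiplexers can be sketched behaviourally as follows. This is an illustrative model under stated assumptions, not the simulated netlist: the function names are hypothetical, the select convention (A for Sel = 0) follows the CVSL description above, and the high impedance state is represented by `None`.

```python
from typing import Optional, Tuple


def cvsl_mux(a: int, b: int, sel: int) -> Tuple[int, int]:
    """Differential outputs (O, nO) of a behavioural CVSL 2 to 1 multiplexer."""
    o = b if sel else a          # assumed convention: Sel = 0 selects A
    return o, 1 - o              # nO is always the complement of O


def mtcmos_cvsl_mux(a: int, b: int, sel: int,
                    sleep: int) -> Tuple[Optional[int], Optional[int]]:
    """Same function gated by a sleep signal; None models a High Z output."""
    if sleep:
        # Sleep transistors cut the core off from supply and ground,
        # so neither output is driven.
        return None, None
    return cvsl_mux(a, b, sel)
```

With `sleep = 0` the two functions return identical results, which mirrors the statement that the circuit then works as a standard CVSL based multiplexer.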

Output Response and Analysis
In this section, the output responses of the multiplexer structures discussed in the previous section are analyzed. The rise and fall times of A, B and Sel are set to 0.45 ns, 0.35 ns and 1 ns, respectively. To analyze the realistic performance of the multiplexers, a load capacitor of 1 fF [26] is used at the output.
In an ideal scenario, the generated output waveform will attain 0 V and 1 V level as per the input combination and select line configuration. If the output waveform is unable to attain a perfect 0 V or 1 V level, it is a non-ideal waveform. The output waveforms generated for all multiplexer structures discussed are ideal except for GDI, PT and MTCMOS CVSL. The output waveforms for ideal and non-ideal response for different multiplexer techniques are illustrated in Fig. 3.
The TG, MSL, Static CMOS, CPL and CVSL structures can transfer a perfect 0 V or 1 V to the output whenever the select line permits. Therefore, their output waveforms are ideal. The ideal output waveform is given in Fig. 3(a). The ideal output is not attained by the GDI based multiplexer, as shown in Fig. 3(b). The GDI based multiplexer is designed using one NMOS and one PMOS transistor. When the input at the PMOS is 0, the output is not a perfect 0, as the PMOS delivers a weak 0. Similarly, when the input at the NMOS is 1, the output does not reach 1, as the NMOS provides a weak 1. The other structure unable to produce an ideal output is the PT based multiplexer, because its schematic depends purely on NMOS transistors to generate the output. As a result, the structure can never pass a perfect 1, since an NMOS passes only a weak 1. This technique produces a perfect 0 V level, but the 1 V output level always falls short of its ideal value, as can be observed in Fig. 3(c).
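The weak and strong levels described above can be illustrated with a first-order threshold-drop model. This is only a sketch: the threshold voltages `VTN` and `VTP` are assumed values chosen for illustration (they are not taken from the 90 nm process used in the paper), body effect and leakage are ignored, and the restorer uses an idealised inverter switching point of VDD/2.

```python
VDD = 1.0   # supply voltage used in the paper (V)
VTN = 0.3   # assumed NMOS threshold voltage (V), illustrative only
VTP = 0.3   # assumed magnitude of PMOS threshold voltage (V), illustrative only


def nmos_pass(v_in: float) -> float:
    """NMOS with gate at VDD: passes 0 V fully, clips a high level at VDD - VTN."""
    return min(v_in, VDD - VTN)


def pmos_pass(v_in: float) -> float:
    """PMOS with gate at 0 V: passes VDD fully, clips a low level at |VTP|."""
    return max(v_in, VTP)


def tg_pass(v_in: float) -> float:
    """Transmission gate: the parallel NMOS/PMOS pair passes both rails."""
    return v_in


def level_restore(v_node: float) -> float:
    """MSL-style restorer: a degraded 1 flips the inverter low, which turns
    the PMOS ON and pulls the node up to VDD; a 0 is left unchanged."""
    inverter_output_low = v_node > VDD / 2   # idealised switching point
    return VDD if inverter_output_low else v_node
```

Under these assumptions, `nmos_pass(VDD)` stays one threshold below the supply (the PT case), `pmos_pass(0.0)` stays one threshold above ground (the GDI case with A = 0), and `level_restore(nmos_pass(VDD))` returns the full supply level, as in the MSL structure.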
Lastly, the output obtained for the MTCMOS CVSL structure differs from that of all the other multiplexer structures, as depicted in Fig. 3(d). This technique relies on an additional PMOS-NMOS pair, in which the PMOS transistor isolates the circuit from VDD and the NMOS transistor isolates it from ground. This is done to reduce the power dissipation of the circuit, but it has implications for the output waveform. When sleep = 1, the multiplexer core is disconnected from the supply voltage and ground, so the output is put in the high-impedance state and loses its driving capability. When sleep = 0, the connections to the supply voltage and ground are re-established for the multiplexer core, and the circuit produces the output expected of a multiplexer.

Dynamic Power Dissipation Analysis
Power dissipation is a key characteristic of all the considered structures. With the increasing popularity of low-power devices, it has become imperative to study the power dissipation of all the multiplexer techniques as well. In Fig. 4, the dynamic power dissipation pulses corresponding to the output waveform are presented for the different multiplexer techniques. Dynamic power consumption is caused by the current that flows in the circuit while its transistors switch from one logic state to another. The switching frequency of the device, the rise and fall times of the input signal, and the number of internal nodes on the critical path of the device all have a direct effect on the duration of the current spike [27].
To measure the power dissipation of the multiplexers, the same test waveform was applied to all of them and the power dissipation was measured from the waveform generated at the output node. The maximum and minimum power dissipation of the TG, GDI, PT, MSL, Static CMOS, CPL, CVSL and MTCMOS CVSL based multiplexers are given in Tab. 1. The maximum power dissipation of the MSL based multiplexer is 463.59 µW, which is the lowest maximum among all the multiplexers, as shown in Tab. 1. On the other hand, the minimum power dissipation is lowest for the GDI based multiplexer at 9.71 fW, which is about five orders of magnitude lower than that of the TG, PT and MTCMOS CVSL based multiplexers and about seven orders of magnitude lower than that of the MSL, Static CMOS, CPL and CVSL based multiplexers. The average power dissipation of the PT based multiplexer is 537.1 nW, which is the lowest among all the multiplexers, as shown in Tab. 1.
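As an illustration of how maximum, minimum and average power figures of this kind can be extracted from a transient simulation, the sketch below post-processes sampled waveforms. It assumes, hypothetically, that the supply voltage and supply current have been exported as equally long sample lists; the function name `power_stats` and the sample values in the usage note are invented for illustration and are not data from this work.

```python
def power_stats(t, v, i):
    """Return (max, min, average) instantaneous power from sampled waveforms.

    t, v, i are equally long sequences of time (s), voltage (V) and current (A).
    The average is the trapezoidal integral of p(t) divided by the time span.
    """
    if not (len(t) == len(v) == len(i)) or len(t) < 2:
        raise ValueError("need at least two samples of equal length")
    p = [vk * ik for vk, ik in zip(v, i)]            # instantaneous power
    energy = sum(0.5 * (p[k] + p[k + 1]) * (t[k + 1] - t[k])
                 for k in range(len(t) - 1))          # joules
    return max(p), min(p), energy / (t[-1] - t[0])
```

For example, `power_stats([0, 1e-9, 2e-9, 3e-9], [1.0] * 4, [0, 2e-6, 2e-6, 0])` describes a single trapezoidal current pulse at a 1 V supply and yields a peak of 2 µW and an average of about 1.33 µW.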
For ease of comparison between techniques, the maximum, minimum and average power consumption of the different configurations are tabulated in Tab. 1 and Tab. 2. As can be seen from Tab. 1, the maximum power consumption is lowest for the MSL technique at 463.59 µW. However, the minimum power consumption of this technique is significantly high. The lowest minimum power consumption is observed for the GDI based technique at 9.71 fW, but its maximum power consumption is the highest of all.
Another parameter, average power dissipation, is used to characterize the multiplexer structures. The inputs switch at different time instants, so the current differs from one instant to the next, which leads to different power dissipation values [28]. The average power is proportional to the energy required to charge and discharge the circuit capacitance. This power dissipation parameter depends on the switching frequency but is independent of the device parameters.
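The proportionality stated above is commonly written as the first-order switching power relation P = α · C · VDD² · f. The sketch below evaluates it for the 1 fF load and 1 V supply used in this work; the switching frequency and the activity factor α are assumed values chosen only for illustration, and internal node capacitances are ignored.

```python
def dynamic_power(c_load_f: float, vdd_v: float, freq_hz: float,
                  activity: float = 1.0) -> float:
    """First-order switching power in watts: activity * C * VDD^2 * f."""
    return activity * c_load_f * vdd_v ** 2 * freq_hz


C_LOAD = 1e-15    # 1 fF output load, as used in the paper
VDD = 1.0         # 1 V supply, as used in the paper
FREQ = 100e6      # assumed switching frequency (Hz), not from the paper
ACTIVITY = 0.5    # assumed fraction of cycles with a 0 -> 1 output transition

# Energy drawn from the supply for one 0 -> 1 transition of the load.
energy_per_transition = C_LOAD * VDD ** 2

# Load-only switching power under the assumptions above.
p_dyn = dynamic_power(C_LOAD, VDD, FREQ, ACTIVITY)
```

In this model the power scales linearly with frequency and contains no transistor parameters, which is consistent with the statement that the average power depends on the switching frequency but not on the device parameters.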
The average power dissipation is found least for PT based structure at 537.1 nW. The interesting thing to observe about the PT structure is its moderate maximum and minimum power. They are neither too high nor too low, making it the ideal technique in terms of power dissipation performance.

Delay and Power Analysis
A well-performing multiplexer requires a short propagation delay and low total power consumption, while static power dissipation ought to be minimal. The choice of multiplexer depends on whether propagation delay or power consumption is the primary requirement. Table 3 presents the Delay, Average Power, Power Delay Product (PDP) and Static Power Dissipation. The delay and power analysis shows that each multiplexer performs differently, and that two of them perform better than the others in terms of propagation delay, power consumption and leakage current. Table 3 reveals that the propagation delay of the TG based multiplexer is the shortest. Under static conditions, the GDI based multiplexer outperforms the others, with the lowest minimum power dissipation and the lowest static power dissipation of 689 pW. The PT based multiplexer is slower than the TG multiplexer by 0.49 ps, but it shows the lowest average power dissipation of 537.1 nW. In some applications, the PDP of the circuit is an important characteristic that ought to be minimal. The PDP of the PT based multiplexer is the lowest, with a magnitude of 0.00796 aJ; the PT based multiplexer therefore shows the optimum performance.
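The power delay product is the product of average power and propagation delay. As a rough sanity check, the sketch below multiplies the figures quoted in the text for the PT based multiplexer; it assumes that the PT delay equals the TG delay of 14.34 ps (quoted in the conclusion) plus the stated 0.49 ps difference, which may not match exactly how the tabulated PDP was obtained.

```python
def power_delay_product(avg_power_w: float, delay_s: float) -> float:
    """Power delay product in joules."""
    return avg_power_w * delay_s


TG_DELAY = 14.34e-12                 # s, TG delay quoted in the text
PT_DELAY = TG_DELAY + 0.49e-12       # s, assumed: PT is 0.49 ps slower than TG
PT_AVG_POWER = 537.1e-9              # W, PT average power quoted in the text

pt_pdp = power_delay_product(PT_AVG_POWER, PT_DELAY)   # joules
pt_pdp_aj = pt_pdp / 1e-18                              # attojoules
pt_pdp_fj = pt_pdp / 1e-15                              # femtojoules
```

Under this assumption the product is roughly 7.97 × 10⁻¹⁸ J, that is, about 7.97 aJ or 0.00797 fJ, so the reported PDP figure agrees with this product if it is read in femtojoules.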

Conclusion
Eight multiplexer structures have been analysed for dynamic power consumption, delay, static power and power delay product. The maximum and minimum power dissipation are given in Tab. 1. In terms of average power dissipation, the PT based multiplexer achieves the best performance at 537.1 nW. The delay analysis identifies the TG based multiplexer as the fastest, with the lowest recorded delay of 14.34 ps. Additionally, the GDI based multiplexer registers the minimum static power dissipation of 689 pW. The PT based multiplexer outperforms the TG based multiplexer in average power dissipation, but it is slower than the TG multiplexer by 0.49 ps. These results show that each multiplexer technique has its own merits: some techniques demonstrate excellent power performance, while others operate faster or perform better under static conditions. We believe that this comparison will help in choosing the right multiplexer for a design according to its switching requirements, power consumption and occupied area.