A Fast-Lock Low-Jitter DLL with Double Edge Synchronization in 0.18µm CMOS Technology

Aims: This paper describes a fast-lock, low-power, low-jitter delay locked loop (DLL) with double edge synchronization and good duty-cycle correction capability, intended mainly for clock alignment. A clock aligner's task is to phase-align a chip's internal clock with a reference clock. The main advantage of a delay locked loop over a phase locked loop is its better jitter performance. The double edge synchronization method is generally expected to increase power consumption as well as rms and peak-to-peak jitter. Therefore, in this work rms jitter, peak-to-peak jitter and power consumption are evaluated to determine whether this statement always holds, which is one of the aims of this work.

supply voltage. The HSPICE simulation results show that the proposed delay locked loop circuit generates clock signals ranging from 750MHz to 1GHz. The maximum power consumption of the DLL circuit, reached at 1GHz, is 3.4mW. The maximum and minimum rms jitter are 9.12ps and 0.463ps, and the maximum and minimum peak-to-peak jitter are 124.89ps and 2.52ps, respectively. The locking time of the proposed delay locked loop is less than 20ns within the operating frequency band. Another feature of this architecture is its good duty cycle correction capability (50±0.9%). It should be noted that in a double edge DLL it is important to find a balance between duty cycle (which should be close to 50%), jitter and power consumption. Rms jitter, peak-to-peak jitter, power consumption and duty cycle error are all obtained from the HSPICE simulator (the Cosmosscope program in the HSPICE simulator can be used for these measurements).

Conclusion: Designing a delay locked loop with double edge synchronization is challenging, and it takes more area than a single edge delay locked loop (this is commonly cited as the main disadvantage of double edge delay locked loops, and we agree with it). Nevertheless, with suitably chosen blocks it can be used without jitter or power consumption penalties. In other words, the results of this paper show that the important figures of the introduced double edge delay locked loop (such as power consumption, rms jitter and peak-to-peak jitter) are comparable to those of the single edge delay locked loops reported in most articles. So, when a double edge delay locked loop suits the application better than a single edge one, these items should not be a concern.

INTRODUCTION
In the past decades, PLLs and DLLs have been widely used in high-speed applications, such as memory ICs, communication ICs, microprocessors, network processors, etc.
Ordinarily, if no frequency multiplication is required, a DLL is a better choice than a PLL for signal synchronization. Because a DLL is a first order control system, it is more stable and easier to design. Moreover, a PLL suffers from a longer locking time and from jitter accumulation due to its closed loop Voltage Controlled Oscillator (VCO). A DLL, on the other hand, uses a Voltage Controlled Delay Line (VCDL) instead of a VCO, so jitter does not accumulate over many clock cycles and the DLL exhibits better jitter performance than the PLL. In addition, DLLs have smaller area and faster locking time than PLLs [1]. Low power, wide lock range, short locking time and low jitter are the main goals of DLL design. To achieve low jitter operation, a DLL requires a delay stage with low sensitivity to supply and substrate noise, and good matching between the up and down CP currents [2].
In this work, a DLL structure with double edge synchronization, capable of aligning both the rising and falling clock edges, is proposed. The rest of the paper is organized as follows. Section 2 describes the architecture of the DLL with clock alignment capability for both rising and falling output pulse edges, and also covers the implementation of the proposed structure. Section 3 presents the HSPICE simulation results, and conclusions are given in Section 4.

MATERIALS AND METHODS
The building blocks of a conventional analog delay locked loop with double edge synchronization are shown in Fig. 1. This DLL structure is capable of aligning both the leading and trailing edges of the output pulse. A clock aligner's task is to phase align a chip internal clock with a reference clock, effectively removing the variable buffer delay and reducing uncertainty in clock phase between communicating VLSI IC constituents [3]. The constituents of Fig. 1 are: a VCDL, two differential Charge Pumps (CPs), CP1 and CP2, two first order low pass filters (LPFs), LPF1 and LPF2, and a multistage clock buffer (MCB). The reference clock, CLKref, is propagated through the VCDL and the MCB. The output signal, CLKout, is compared with the reference input. If the delay deviates from an integer multiple of the clock period, the closed loop automatically corrects it by changing the delay time of the VCDL [4].
As shown in Fig. 1, the first stage contains two Phase Frequency Detectors (PFDs). Their function is to compare the phase and frequency of the rising or falling edges of the input (CLKref) and the output (CLKout). Note that PFD1 is sensitive to a rising pulse edge and PFD2 is sensitive to a falling pulse edge. These PFDs are high speed and have a small dead zone. Hence, the DLL circuit locks very quickly compared to designs based on other dynamic Phase Detectors (PDs) [3,5]. In the next stage, two ideal CPs and two LPFs are sketched in Fig. 1. The UP1 and UP2 pulses cause Ip to add charge to the capacitors of LPF1 and LPF2, while the DN1 and DN2 pulses discharge the capacitors. The outputs of CP1 and CP2 are Vctrl1 and Vctrl2, which are connected to the VCDL control inputs (Vbn and Vbp). The last stage of the figure shows the VCDL and MCB blocks. The control input voltages of the VCDL (Vbn and Vbp) regulate the timing of the rising or falling clock pulse edge.
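The locking behaviour described above can be illustrated with a minimal behavioural sketch of a single loop (for example the rising-edge loop). This is not the transistor-level circuit simulated in HSPICE: the ideal PFD, the linear VCDL gain `k_vcdl`, the initial delay `d0` and the function name are assumptions made for illustration only. The charge pump current and filter capacitor default to the values used in this work.

```python
def simulate_lock(t_ref=1.0e-9, d0=1.3e-9, k_vcdl=2.0e-9,
                  i_cp=100e-6, c_lpf=1.88e-12, cycles=60):
    """Return the delay error (in seconds) seen at each reference cycle.

    t_ref       : reference clock period (1 GHz here)
    d0          : assumed initial VCDL + MCB delay (hypothetical value)
    k_vcdl      : assumed linear delay gain in s/V (hypothetical value)
    i_cp, c_lpf : charge pump current and filter capacitor from this work
    """
    delay = d0
    errors = []
    for _ in range(cycles):
        # Error with respect to one full period; > 0 means CLKout lags CLKref.
        error = delay - t_ref
        errors.append(error)
        # Ideal PFD + CP: a pulse as wide as the error changes the control
        # voltage by i_cp * error / c_lpf, with negative-feedback polarity.
        dv = -i_cp * error / c_lpf
        # Assumed linear VCDL: the delay follows the control voltage.
        delay += k_vcdl * dv
    return errors


if __name__ == "__main__":
    errs = simulate_lock()
    print(f"initial error: {errs[0] * 1e12:.1f} ps")
    print(f"final error:   {errs[-1] * 1e12:.3f} ps")
```

Under these assumptions the error shrinks by a constant factor every cycle, which is the geometric settling expected from a first order loop. The actual settling time depends on the real VCDL gain, which this sketch does not attempt to reproduce.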

Fig. 1. DLL structure with double edge synchronization
The circuit structure is discussed in Sections 2.1, 2.2 and 2.3.

Phase-Frequency Detector
The function of the PD is to detect the phase difference between the reference clock signal and the feedback clock. A PD can detect the skew of the clock, and it can be analog or digital. Nowadays digital phase detectors have become more popular. As the name indicates, PDs are sensitive to the phase difference between two signals, but not to their frequency. In practice a PD can work as a frequency detector, but only over a limited range. Thus, it is preferable to replace the PD with a PFD. On the other hand, many PFDs have a large dead zone. The dead zone is the range of small phase errors to which the loop does not respond. The width of the dead zone translates directly into jitter in the DLL and should be minimized. Hence, PFDs with a large dead zone cannot be used at high frequencies.
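To illustrate why the dead zone matters, the following minimal sketch models an idealised PFD whose outputs stay inactive when the phase error is smaller than the dead-zone width. The function name and the 20 ps dead-zone value are hypothetical and are not taken from the circuits proposed in this work.

```python
def pfd_pulses(phase_error_ps, dead_zone_ps=20.0):
    """Return (up_width_ps, dn_width_ps) for an idealised PFD.

    phase_error_ps > 0 means the reference edge leads the output edge.
    dead_zone_ps is an assumed, illustrative dead-zone width.
    """
    if abs(phase_error_ps) <= dead_zone_ps:
        # Error too small to be detected: neither output is asserted,
        # so the loop applies no correction.
        return 0.0, 0.0
    if phase_error_ps > 0:
        # Reference leads: assert UP for a time equal to the error.
        return phase_error_ps, 0.0
    # Output leads: assert DN for a time equal to the error.
    return 0.0, -phase_error_ps


if __name__ == "__main__":
    for err in (50.0, 10.0, -50.0):
        print(err, pfd_pulses(err))
```

In this model any phase error inside the dead zone goes uncorrected, so the output edge can wander by up to that amount, which appears at the output as jitter.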
To overcome the speed limitation and reduce the dead zone, we propose the high speed PFDs sketched in Fig. 2. These circuits enable fast locking of the loop. In this work, PFD1 [6] is used to detect a rising pulse edge. Since another PFD is needed for the falling edge, we propose a new PFD (PFD2) to detect a falling pulse edge. Hence, PFD1 (Fig. 2(a)) is sensitive to the rising clock pulse edge, while PFD2 (Fig. 2(b)) is sensitive to the falling edge. The widths of the UP and DN signals are proportional to the phase difference of the input signals [7]. PFD1 and PFD2 have three states. Compared to both PDs in reference [3], the proposed PFDs for the rising and falling edges have a smaller dead zone, and they can also work as frequency detectors since they can operate at higher frequencies. In Fig. 3(a), (b) and 4(a), (b) the reference and output have the same frequency but different phases, while in Fig. 5(a), (b) and 6(a), (b) the two signals differ in both frequency and phase. The remaining state occurs when both signals have the same frequency and phase (Fig. 7). As can be seen in Fig. 3

Charge Pump and Loop Filter
CP design is one of the most complicated parts of the DLL structure. The CP controls the charging and discharging current according to the UP and DN signals from the PFD, and thereby converts the phase error encoded in those signals into a current. The loop filter then converts this current into the control voltage Vctrl by charging or discharging its capacitor, and Vctrl sets the VCDL delay. Two differential CPs (CP1 and CP2) are used in this work because they are the more suitable choice here: their current steering switches improve the switching time. CP1, depicted in Fig. 8(a), can be used for the rising edge [8]. Since another CP is needed for the falling edge, we propose CP2 (Fig. 8(b)) for the falling pulse edge.
One possible filter is an RC low-pass filter, such as the filter described in [9]. In this work, however, two simple capacitors are used as LPFs, with C1=C2=1.88pF. As can be seen in Fig. 1, the equivalent model of the CP and LPF consists of a source current, a sink current and two switches controlled by the PFD outputs. When the output phase of the DLL circuit leads the reference phase, the current source switch opens and the current sink switch closes, so the voltage on the capacitor decreases. The voltage on the capacitor increases if the reference phase leads the output phase. In this work, the source and sink currents are set to ICP-up=ICP-down=100µA. This value was chosen through HSPICE simulation and works well in both CPs. Each CP can charge or discharge its filter capacitor. Vctrl1 and Vctrl2 (Vbn and Vbp) are the voltages on capacitors C1 and C2 respectively, and set the propagation delay of the VCDL stages.
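As a rough worked example of the charge pump and capacitor model above, the change in control voltage produced by a single UP or DN pulse can be estimated as ΔV = I·Δt/C. The sketch below assumes an ideal charge pump that delivers the full current for the whole pulse width into the filter capacitor alone; the function name is hypothetical, while the current and capacitance are the values stated in this work.

```python
I_CP = 100e-6     # charge pump source/sink current in A (from this work)
C_LPF = 1.88e-12  # loop filter capacitor in F (from this work)


def control_voltage_step(pulse_width_s, i_cp=I_CP, c_lpf=C_LPF):
    """Voltage change on the filter capacitor for one ideal CP pulse.

    Assumes the full current flows for the whole pulse (no switching
    transients, leakage or parasitic capacitance).
    """
    return i_cp * pulse_width_s / c_lpf


if __name__ == "__main__":
    dv = control_voltage_step(100e-12)  # 100 ps phase error
    print(f"{dv * 1e3:.2f} mV")
```

With these values a 100 ps pulse moves the control voltage by about 5.3 mV, that is, roughly 53 µV per picosecond of phase error.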

Voltage Controlled Delay Line and Multistage Clock Buffer
The VCDL is the most critical component for the performance of the DLL, because it directly influences the stability and jitter of the loop. A VCDL is implemented as several variable delay elements connected in series. There are several kinds of delay element [10], such as the cascade delay cell, the differential delay cell, the shunt capacitor delay cell, etc.
The VCDL used in this work (Fig. 9) [9] consists of cascaded variable delay stages. It is driven by the reference input clock, and its output is CLKout1, which is the input of the MCB circuit. In this circuit, Vbp drives the gates of transistors M1 and M5, while Vbn drives the gates of M4 and M8. Any number of cascaded delay stages can be used in the proposed VCDL. Note that the load capacitance seen by each stage is the input capacitance of the following inverter.
In high-speed design, a multistage clock buffer implemented as a long inverter chain is often needed to drive a heavy capacitive load. In such designs, and in applications where the timing of both clock edges is critical, it is difficult to keep the clock duty cycle at its ideal value of 50%, primarily because of asymmetries in the signal paths and imbalance between the p and n transistors in the long buffer. As a consequence, the clock duty cycle deviates from 50%, and in the worst case the clock pulse may disappear inside the clock buffer as the pulse width becomes too narrow or too wide [3]. As can be seen in Fig. 1, the output of the VCDL (CLKout1) is the input of the MCB, and the output of the MCB (CLK-out) is fed to PFD1 and PFD2. The VCDL and MCB were implemented as a chain of ten delay elements (five for the VCDL and five for the MCB). The jitter performance of the proposed VCDL and MCB was verified by HSPICE simulation. Compared to reference [3], the selected VCDL and MCB take much less area, since they have fewer delay cells.
The time delay of the leading and trailing pulse edges as a function of the control voltage Vctrl (Vbn and Vbp) is presented in Fig. 10. If the control voltage Vctrl decreases, the time delay of the trailing edge increases and the time delay of the leading edge decreases, and vice versa.
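The link between the two edge delays and the output duty cycle can be made explicit with a short sketch. It assumes an active-high pulse, so the output high time equals the input high time plus the difference between the trailing-edge and leading-edge delays. The function and parameter names are hypothetical.

```python
def output_duty_cycle(period_ps, input_duty, lead_delay_ps, trail_delay_ps):
    """Duty cycle after a path that delays the two edges differently.

    period_ps      : clock period in ps
    input_duty     : input duty cycle as a fraction (0.5 for 50%)
    lead_delay_ps  : delay applied to the leading (rising) edge
    trail_delay_ps : delay applied to the trailing (falling) edge
    """
    high_in = input_duty * period_ps
    # Delaying the trailing edge more than the leading edge widens the pulse.
    high_out = high_in + (trail_delay_ps - lead_delay_ps)
    return high_out / period_ps


if __name__ == "__main__":
    # 1 GHz clock, trailing edge delayed 9 ps more than the leading edge.
    print(output_duty_cycle(1000.0, 0.5, 500.0, 509.0))
```

Under this model, a duty cycle error of 0.9% at 1GHz (a 1000 ps period) corresponds to a mismatch of 9 ps between the two edge delays, which gives a sense of how tightly the two loops must match the edges.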

RESULTS AND DISCUSSION
The proposed DLL structure with double edge synchronization is implemented in 0.18µm CMOS technology with a supply voltage of 1.8V. The operating frequency range is from 750MHz to 1GHz. Fig. 11 shows the result of DLL operation at (a) 750MHz and (b) 1GHz. It illustrates the behaviour of CLK-ref and CLK-out and the waveforms of UP1 (UP2) and DN1 (DN2), as well as the behaviour of the CP outputs (Vctrl1 and Vctrl2). As can be seen, the UP1 and DN1 (UP2 and DN2) signals define the control voltage Vctrl1 (Vctrl2). The locking time of the DLL is less than 20ns within the operating frequency band. The circuit also has good duty cycle correction capability: as shown in Fig. 12, the measured duty cycle error is less than ±0.9% over the full operating range. Fig. 14 shows the output rms jitter and peak-to-peak jitter versus operating frequency. As can be seen, both jitter values decrease as the frequency increases.
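For readers who want to reproduce such jitter figures from simulated waveforms, the sketch below shows one common definition (period jitter) computed from a list of output edge times. The paper does not state which jitter definition was used for its measurements, so this definition, the function name and the sample edge times are assumptions for illustration.

```python
import math


def period_jitter(edge_times_ps):
    """Return (rms_jitter_ps, peak_to_peak_jitter_ps) of the clock period.

    edge_times_ps: times of consecutive same-polarity clock edges in ps.
    rms is the standard deviation of the periods; peak-to-peak is the
    difference between the longest and shortest period.
    """
    periods = [b - a for a, b in zip(edge_times_ps, edge_times_ps[1:])]
    mean = sum(periods) / len(periods)
    variance = sum((p - mean) ** 2 for p in periods) / len(periods)
    return math.sqrt(variance), max(periods) - min(periods)


if __name__ == "__main__":
    # Hypothetical rising-edge times of a nominal 1 GHz clock.
    edges = [0.0, 1000.0, 2002.0, 3000.0, 4000.0]
    rms, pk_pk = period_jitter(edges)
    print(f"rms = {rms:.3f} ps, peak-to-peak = {pk_pk:.1f} ps")
```

The same function can be applied separately to the rising and falling edges, which is relevant for a double edge design where both edges are aligned.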
The maximum and minimum rms jitter are 9.12ps and 0.463ps, and the maximum and minimum peak-to-peak jitter are 124.89ps and 2.52ps, respectively (these results would be acceptable even for single edge DLLs). Fig. 15 shows the variation of power consumption over the frequency range of the DLL. Power consumption increases with frequency across the whole operating range (as the figure shows, it was evaluated at 750, 775, ... up to 1000MHz). The minimum and maximum power consumption, at 750MHz and 1GHz, are 2.2mW and 3.4mW, respectively. Comparing this range with the power consumption reported in other articles shows that it is typical of single edge DLLs (although some single edge DLLs achieve lower power consumption than our work). Table 1 gives the performance summary of the proposed DLL and the characteristics of other published DLLs. As can be seen, reference [3] is also a DLL with double edge synchronization. This work uses an architecture similar to the one proposed in that reference, but with more suitable blocks and a different process, in order to improve the important figures of the DLL. Compared to that reference, a different technology is used, the locking time is shorter and a wider frequency range is achieved. In addition, rms jitter, peak-to-peak jitter and power consumption were computed here; they are not mentioned in reference [3] at all, even though jitter and power consumption are among the most important figures of a DLL. The results show that good jitter performance and low power consumption are achieved in our work as well. In the rest of the table, our work is compared with other references [11,12,13,14]. It should be mentioned that our previous work [15] was also used to improve the results of this work (especially the locking time, rms jitter and peak-to-peak jitter).
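As a small worked example based on the two power figures quoted above, the energy drawn per clock cycle can be estimated as E = P/f. This is a derived figure that the paper does not report; the function name is hypothetical.

```python
def energy_per_cycle_pj(power_mw, freq_mhz):
    """Energy per clock cycle in pJ for a power in mW and frequency in MHz.

    1 mW / 1 MHz = 1 nJ, so the ratio is multiplied by 1000 to obtain pJ.
    """
    return power_mw / freq_mhz * 1e3


if __name__ == "__main__":
    print(f"{energy_per_cycle_pj(2.2, 750.0):.2f} pJ at 750 MHz")
    print(f"{energy_per_cycle_pj(3.4, 1000.0):.2f} pJ at 1 GHz")
```

With the reported values this gives about 2.93 pJ per cycle at 750MHz and 3.4 pJ per cycle at 1GHz.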
It must also be pointed out that the results reported for this work are extracted from simulation, whereas some of the previously reported works give experimental results. Consequently, effects such as parasitic elements, impedance mismatch and calibration errors have been ignored here, and they could influence the performance of the proposed system once it is fabricated and tested.

CONCLUSION
In this paper, a DLL architecture with double edge synchronization, based on 0.18μm CMOS technology with a 1.8V power supply, is proposed. The operating frequency range is from 750MHz to 1GHz. Fast locking of the double edge synchronization DLL (at most 20ns) is achieved by using the high speed PFD1 and PFD2. The proposed PFDs also have a small dead zone. Differential CPs (CP1 and CP2) are used because they are the more suitable choice in this work: their current steering switches improve the switching time. Another feature of this structure is its good duty cycle correction capability (50±0.9%). On the other hand, the double edge synchronization method is expected to increase power consumption as well as rms and peak-to-peak jitter, because it uses two PFDs, two CPs and two LPFs instead of one. Therefore, rms jitter, peak-to-peak jitter and power consumption are also analysed in this work. The minimum and maximum power consumption, at 750MHz and 1GHz, are 2.2mW and 3.4mW, respectively. The maximum and minimum rms jitter are 9.12ps and 0.463ps, and the maximum and minimum peak-to-peak jitter are 124.89ps and 2.52ps, respectively. Designing a DLL with double edge synchronization is challenging, and it takes more area than a single edge DLL (this is cited as the main disadvantage of double edge DLLs, and we are working on minimizing the chip area in our next DLL project). Nevertheless, these results show that double edge DLLs can be used without jitter or power consumption penalties. In other words, the results of this paper show that most of the important figures of the introduced double edge DLL are comparable to those of the single edge DLLs reported in most articles. So, when a double edge DLL suits the application better than a single edge DLL, these items should not be a concern.