

Master Radiation and its Effects on MicroElectronics and Photonics Technologies (RADMEP)



## PRECISE DELAY GENERATION USING DIFFERENTIAL INPUT DELAY CELLS USED IN DELAY-LOCKED-LOOP

Master Thesis Report

Presented by

MD Nahid Hasan

and defended at

University Jean Monnet

11.09.2023

Academic Supervisors: Prof. Dr. P. Leroux

Prof. Dr. J. Prinzie

Jury Committee:

Prof. Dr. F. Saigné, University of Montpellier

Prof. Dr. S. Girard, University Jean Monnet

Prof. Dr. P. Leroux, KU Leuven

Prof. Dr. A. Javanainen, University of Jyväskylä







# Precise Delay Generation Using Differential-Input Delay Cells Used In Delay-Locked-Loop

**MD Nahid HASAN** 

Academic Supervisors: Prof. Dr. P. Leroux

Prof. Dr. J. Prinzie

This thesis was submitted as a partial fulfillment of the requirements for the Master in Science (MSc) degree.

Jury Committee:

| Jury member               | Affiliation               |
|---------------------------|---------------------------|
| Prof. Dr. Frédéric Saigné | University of Montpellier |
| Prof. Dr. Sylvain Girard  | University Jean Monnet    |
| Prof. Dr. Paul Leroux     | KU Leuven                 |
| Prof. Dr. Arto Javanainen | University of Jyväskylä   |

August, 2023

# Acknowledgment

I want to acknowledge a number of people who were always encouraging and supportive during my thesis. I am very thankful to my project supervisors, Professor Paul Leroux and Professor Jeffrey Prinzie, for facilitating the thesis and helping me complete the project despite the very limited time available. Their valuable guidance and expertise helped me explore the topic in depth. I am also thankful to the researchers of the ADVISE group at KU Leuven, who were always helpful whenever I needed support.

I also want to express my heartfelt gratitude to Professor Frédéric Saigné for always keeping an eye on my progress. I want to thank all four universities in the RADMEP program, and all the personnel involved in this program for helping me throughout the last two years in facilitating this amazing journey.

Lastly, my deepest gratitude goes to my family and my close friends for their constant encouragement and unwavering backing.

## Abstract

With the advancement of technology, integrated circuits have become smaller and faster with ever-shrinking technology nodes. Downscaling has reduced the voltage headroom available to analog circuits, which has paved the way for processing analog signals in the time domain. A time-to-digital converter (TDC) is a device that takes advantage of time-domain processing and can achieve picosecond resolution. A delay-locked-loop (DLL) based TDC uses a delay line as its timing generator. Like a phase-locked loop (PLL), a DLL consists of a phase detector, a charge pump, and a voltage-controlled delay line. The number of delay cells in the delay line depends on the total time that needs to be measured, although an additional counter can act as a coarse converter to increase the dynamic range. Every cell produces the same delay, which defines the resolution of the TDC.

A delay cell can be single-ended, typically a current-starved structure, or a differential-input structure that produces picosecond delays. The delay of each cell is set externally by a control voltage. A single-ended buffer has the advantages of lower power consumption and smaller area, but a differential stage performs better in terms of noise rejection. High immunity to noise and environmental variation is the primary concern for the delay line, because these disturbances cause output jitter in the timing generator.

This thesis compares the two differential-input buffer architectures most commonly used for timing generation in DLL-based TDCs: the Maneatis cell and the Lee-Kim cell. The Maneatis cell is a fully differential architecture in which the NMOS tail current source is biased with a self-biasing technique. This self-biasing structure removes the need for a separate process-independent bandgap reference voltage for the biasing. The Lee-Kim cell, on the other hand, is a pseudo-differential architecture with two cross-coupled inverters in which both PMOS transistors are current starved. Because it has no tail current source, the Lee-Kim cell requires less voltage headroom and offers a larger output swing, which can make it the more attractive choice; however, the performance of both cells under process, voltage, and temperature (PVT) variations must also be considered.

This thesis presents how these two delay cell architectures perform under PVT variations and random mismatch. A DLL also needs an external reference clock in the GHz range, which in turn requires a PLL; this aspect is left for future work.

# **Table Of Contents**

| Section   | Title                                            |
|-----------|--------------------------------------------------|
|           | Acknowledgment                                   |
|           | Abstract                                         |
|           | Table Of Contents                                |
|           | List Of Figures                                  |
|           | List Of Tables                                   |
|           | List of Abbreviations                            |
| Chapter 1 | Introduction                                     |
| 1.1       | Why in time domain                               |
| 1.2       | History of data converters                       |
| 1.3       | Conclusion                                       |
| Chapter 2 | Basics Of Time-To-Digital Converter              |
| 2.1       | Working principle of TDC                         |
| 2.2       | Delay line based TDC                             |
| 2.3       | Locked loop TDC                                  |
| 2.3.1     | Delay locked loop based TDC                      |
| 2.3.2     | Phase locked loop based TDC                      |
| 2.4       | Performance figures of TDC                       |
| 2.4.1     | Quantization error                               |
| 2.4.2     | Linear imperfections in TDC                      |
| 2.4.3     | Non-linear imperfections in TDC                  |
| 2.5       | Conclusion                                       |
| Chapter 3 | Buffer Design                                    |
| 3.1       | Noise sensitivity and delay variations           |
| 3.2       | Maneatis delay cell                              |
| 3.2.1     | Self-biasing circuit for Maneatis cell           |
| 3.3       | Lee-Kim cell                                     |
| 3.4       | Conclusion                                       |
| Chapter 4 | Simulation Results                               |
| 4.1       | Delay line implemented with the Maneatis cell    |
| 4.1.1     | PVT variations and mismatch in the Maneatis cell |
| 4.2       | Delay line implemented with the Lee-Kim delay cell |
| 4.2.1     | PVT variations and mismatch in the Lee-Kim cell  |
| 4.3       | Delay-locked loop results                        |
| 4.3.1     | DLL with delay line constituted by Maneatis cell |
| 4.3.2     | DLL with delay line constituted by Lee-Kim cell  |
| 4.4       | Conclusion                                       |
| Chapter 5 | Conclusion and Future Work                       |
|           | References                                       |

# List Of Figures

| Figure    | Caption |
|-----------|---------|
| Figure 1  | Block diagram of Alec Harley Reeves's proposed ADC [5] |
| Figure 2  | Basic flash TDC architecture |
| Figure 3  | Working principle of delay line based TDC |
| Figure 4  | Looped TDC architecture |
| Figure 5  | DLL based TDC |
| Figure 6  | (i) Phase frequency detector (PFD) with D flip-flops (ii) PFD output (iii) D flip-flop architecture with two cross-coupled SR latches |
| Figure 7  | (i) Circuit diagram of a charge pump (ii) Charge pump output showing its behavior as an integrator |
| Figure 8  | Diagram of a voltage controlled delay line |
| Figure 9  | Closed loop model of a DLL |
| Figure 10 | Type-II PLL |
| Figure 11 | Closed loop model of a PLL |
| Figure 12 | Ideal input/output behavior of a TDC |
| Figure 13 | Offset error in input/output characteristics of a TDC |
| Figure 14 | Gain error in input/output characteristics of a TDC |
| Figure 15 | DNL and INL in input/output characteristics of a TDC |
| Figure 16 | Delay variation in open-loop and closed-loop delay line |
| Figure 17 | Generalized version of a differential buffer stage |
| Figure 18 | Maneatis cell with symmetric load and dynamically |
| Figure 19 | Complete implementation of the self-biasing circuit of the Maneatis cell |
| Figure 20 | Lee-Kim delay cell |
| Figure 21 | Optimization of the width of the devices in Maneatis cell for minimum delay |
| Figure 22 | (i) Static supply voltage variation in buffer delay in Maneatis cell (ii) Supply variation in two extreme cases in Maneatis cell |
| Figure 23 | (i) Temperature variation in Maneatis cell at different control voltages (ii) Temperature variation at two extreme cases for Maneatis cell |
| Figure 24 | Process corner variation in Maneatis cell |
| Figure 25 | Monte Carlo analysis of delay line constructed with Maneatis cell at $V_{ctrl} = 600 \text{ mV}$ |
| Figure 26 | DNL and INL measurement of a Maneatis cell based delay line |
| Figure 27 | Width optimization for Lee-Kim delay cell producing minimum delay |
| Figure 28 | (i) Static supply voltage variation in buffer delay in Maneatis cell (ii) Supply variation at two extreme cases |
| Figure 29 | (i) Temperature variation in Lee-Kim cell at different control voltages (ii) Temperature variation in Lee-Kim at two extreme cases |
| Figure 30 | Process corner analysis in Lee-Kim delay cell |
| Figure 31 | Monte Carlo analysis of delay line constructed with Lee-Kim cell at $V_{ctrl} = 600 \text{ mV}$ |
| Figure 32 | DNL and INL measurement of a Lee-Kim cell based delay line |
| Figure 33 | Buffer delay for various control voltage in delay line constituted by Lee-Kim cell |
| Figure 34 | Diagram of the DLL constituted with Maneatis cell based delay line |
| Figure 35 | DLL output with delay line made with Maneatis cell |
| Figure 36 | DLL output with delay line made with Lee-Kim cell |

# List Of Tables

| Table   | Caption |
|---------|---------|
| Table 1 | Width of the devices in Maneatis cell for minimum delay |
| Table 2 | Results of Monte Carlo analysis of Maneatis cell under nominal conditions |
| Table 3 | Width of the devices in Lee-Kim cell for minimum delay |
| Table 4 | Results of Monte Carlo analysis of Lee-Kim cell |
| Table 5 | Minimum delay produced by the Maneatis and Lee-Kim cell at different PVT corners |
| Table 6 | Comparison between delay line implemented with the Maneatis cell and the Lee-Kim cell |

# List of Abbreviations

| Abbreviation | Meaning                                 |
|--------------|-----------------------------------------|
| ADC          | Analog-to-Digital Converter             |
| TDC          | Time-to-Digital Converter               |
| ADPLL        | All-Digital Phase-Locked Loop           |
| ASIC         | Application-Specific Integrated Circuit |
| ToF          | Time of Flight                          |
| PD           | Phase Detector                          |
| CMOS         | Complementary Metal Oxide Semiconductor |
| CP           | Charge Pump                             |
| CPPLL        | Charge Pump PLL                         |
| FF           | Flip-Flop                               |
| DAC          | Digital-to-Analog Converter             |
| DLL          | Delay-Locked Loop                       |
| DNL          | Differential Non-Linearity              |
| INL          | Integral Non-Linearity                  |
| ENOB         | Effective Number of Bits                |
| FPGA         | Field Programmable Gate Array           |
| RO           | Ring Oscillator                         |
| HEP          | High Energy Physics                     |
| IC           | Integrated Circuit                      |
| IF           | Interpolation Factor                    |
| LF           | Loop Filter                             |
| VCO          | Voltage-Controlled Oscillator           |
| LSB          | Least-Significant Bit                   |
| MGRO         | Multi-Path Gated Ring Oscillator        |
| MSB          | Most-Significant Bit                    |
| PFD          | Phase Frequency Detector                |
| PLL          | Phase-Locked Loop                       |
| PVT          | Process, Voltage and Temperature        |
| PWM          | Pulse Width Modulation                  |
| VCDL         | Voltage-Controlled Delay Line           |
| SNR          | Signal-to-Noise Ratio                   |
| SPO          | Static Phase Offset                     |
| SSP          | Single-Shot Precision                   |
| SSR          | Single-Shot Resolution                  |
| UGB          | Unity Gain Bandwidth                    |
| LDO          | Low Drop-out Voltage                    |
| LIDAR        | Light Detection And Ranging             |
| TID          | Total Ionizing Dose                     |
| SEE          | Single Event Effects                    |

## **Chapter 1. Introduction**

Buffer delay lines and ring oscillators are used in numerous integrated-circuit applications to generate highly accurate delays at very high frequencies [1]. They are the fundamental building blocks of a delay-locked loop (DLL) or phase-locked loop (PLL) based time-to-digital converter (TDC). Their performance is often characterized by delay step, dynamic range, jitter and, most importantly, their uncertainty under process and environmental variability. Their applications also include clock synchronization [14], clock recovery [2], and time digitization [3]. Numerous buffer delay cell architectures have been proposed over the years to produce precise delays along the delay line and hence high resolution, but how they perform under process, voltage, and temperature (PVT) variations often remains an open question. This thesis compares DLL performance using the two most used buffer architectures: the differential Maneatis cell [1] and the pseudo-differential Lee-Kim cell [4].

#### **1.1** Why in time domain

Microelectronics is a rapidly growing field driven by technology scaling. The aim is higher performance at lower manufacturing cost. Systems built on digital circuits benefit from scaling through reduced chip area and faster switching. Analog circuits, on the other hand, rely not on speed but on the actual characteristics of the transistors. As transistors are downscaled, parameters such as gain, output resistance, noise, and distortion, which are key to analog and mixed-signal circuits, often degrade. Because the supply voltage also scales down, analog circuits are becoming increasingly hard to design to specification. This is why it is becoming difficult to design a high-performance analog-to-digital converter (ADC), which transforms a voltage-domain signal into the digital domain. To keep systems robust under technology downscaling, many analog blocks in data converters are being replaced by digital circuits. This is where the time-to-digital converter (TDC) comes into the picture: its working principle shifts from the analog voltage domain to the analog time domain. Because the TDC operates in the time domain, its resolution depends on the available time and not on the available supply voltage.

Time-to-Digital Converters (TDCs) have been used in many applications for the last 60 years, but they have recently regained attention because of their extensive use in time-of-flight (ToF) measurement, for example in LIDAR (Light Detection and Ranging) and in high energy physics (HEP) detectors.

#### **1.2** History of data converters

In 1921, Paul M. Rainey invented the first data converter to transmit an image. His technique, later known as Pulse Code Modulation (PCM), was so far ahead of its time that the patent was forgotten. Several years later, when other PCM patents came into use, his patent was rediscovered [5]. With the invention of the transistor in 1947, data converters became widely used and grew steadily smaller and faster.

Many years later, when the telecommunication industry started to grow rapidly, they used a technology named Frequency Division Multiplexing to transmit multiple telephone channels using vacuum tube technology. That technology suffered from noise and distortion and could not cope with the requirement.

The first analog-to-digital converter (ADC) and digital-to-analog converter (DAC) were invented by the British scientist Alec Harley Reeves. He studied analog-to-time conversion for many years and is widely regarded as the inventor of PCM technology. His converter, shown in Figure 1, performed 5-bit conversion. It sampled the analog voice input with a 6 kHz sampling pulse, using a 5-bit counter enabled by a pulse width modulated (PWM) signal [9].



Figure 1. Block diagram of Alec Harley Reeves's proposed ADC [5]

During that time, time measurement techniques drew the attention of many scientists in the field of high energy particles. The Italian scientist Bruno Rossi was one of many who used such a technique, in his case to study the decay of the muon. The voltage on a capacitor ramps as it charges or discharges, and he used this behavior to measure time. This was a major achievement at the time, but the method could only measure short time intervals, and the result was represented in analog form [6].

At that time, a new approach was taken to modify the ADC recently developed by Reeves. It integrated a counter into the PCM technique to quantify the time interval directly. However, in high energy particle physics, where time resolution is a critical factor, the vacuum tube, then the most popular technology in telecommunications, fell short of the requirement. Vacuum tube circuits could not achieve the resolution of a few nanoseconds needed for particle physics. As a consequence, time-to-amplitude conversion gained attention [7].

About 20 years later, during the 1950s and 1960s, extensive research in high energy particle physics paved the way for time-of-flight analysis and pulse-height systems. At the same time, the earliest form of the TDC came into wide use.

#### **1.3** Conclusion


To sum up, there are two types of data converters, distinguished by what they convert: one converts an analog voltage (ADC) and the other converts an analog time interval (TDC). This thesis considers only the second type (TDC), since the main work concerns the timing generator circuit.

# **Chapter 2. Basics Of Time-To-Digital Converter**

Before going into the design of the delay-locked loop (DLL) and the buffer architectures, which are the primary objective of this thesis, it is important to understand the basic concepts of the TDC. This chapter covers the working principle, some basic architectures, and the non-linearities of the TDC. Of the many TDC architectures, only the basic DLL-based and PLL-based TDCs are discussed.

### 2.1 Working principle of TDC

A TDC measures and digitizes continuous or discrete signals in the time domain. These are digital signals whose transitions between two voltage levels are very sharp. A continuous signal is converted sample by sample, and the time between samples depends on the conversion time. The digital output of the TDC is determined by the time difference between signal edges, whether rising or falling, and can be processed afterward. The input signals can be a reference clock or a start-stop pair, depending on the application.

TDCs can be divided into two categories. The first is the single-shot TDC, where the required accuracy is achieved in a single measurement, taking into account the noise associated with it. It is similar to a flash ADC, which requires only one voltage sample per measurement. The second category is the oversampling TDC, which closely resembles an oversampled (sigma-delta) ADC, where resolution is enhanced through oversampling and noise shaping. The same strategy can be applied in the time domain when the time interval is repetitive. In high energy physics, single-shot TDCs are used because each particle is encountered only once. For LIDAR applications, where many identical measurements are made, oversampling TDCs are mostly preferred.

### 2.2 Delay line based TDC

A TDC based on a delay line is often known as a flash TDC. It uses a basic delay line architecture to measure the time between two events. In a DLL, the delay line is locked to the frequency of an external reference clock. The main architecture of a DLL-based TDC consists of two parts. The first part, containing the delay line, generates the timing. The second part contains the start and stop registers and an array of flip-flops that sample the state of the delay line. In normal operation, a start signal at the input causes the sampling registers to save the present state of the delay line.



Figure 2. Basic flash TDC architecture

A delay line consists of $n$ individual buffer cells that generate $n$ equally spaced versions of the reference clock signal. The resolution of the TDC can be increased by increasing the number of delay cells that form the chain. When the stop signal arrives, the delayed versions of the start signal along the delay line are sampled in parallel by the registers, capturing how far the start signal has already propagated. Sampling freezes the state of the delay line at the instant the stop signal occurs.



Figure 3. Working principle of delay line based TDC

The result is a thermometer code: the sampling elements output HIGH for the delay stages the start signal has already traversed and LOW for the stages it has not yet reached. The position of the HIGH-LOW transition in the thermometer code indicates how far the start signal has propagated through the delay line during the time span between the start and stop signals. Hence, it measures the time between the two events.
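The sampling and decoding steps can be illustrated with a small behavioral sketch. It assumes ideal, identical cells and instantaneous sampling; the function names and the numeric values (16 cells of 20 ps, a 137 ps interval) are hypothetical and are not taken from the designs in this thesis.

```python
from typing import List


def sample_delay_line(interval_ps: float, cell_delay_ps: float, n_cells: int) -> List[int]:
    """Return the thermometer code captured when the stop signal arrives.

    Tap i (0-based) reads HIGH if the start edge has already passed cell i,
    i.e. if (i + 1) * cell_delay_ps <= interval_ps. Ideal cells are assumed.
    """
    return [1 if (i + 1) * cell_delay_ps <= interval_ps else 0 for i in range(n_cells)]


def decode_thermometer(code: List[int]) -> int:
    """Locate the HIGH-LOW transition by counting the leading ones."""
    count = 0
    for bit in code:
        if bit == 0:
            break
        count += 1
    return count


# Hypothetical example: 16 cells of 20 ps each, stop arrives 137 ps after start.
code = sample_delay_line(interval_ps=137.0, cell_delay_ps=20.0, n_cells=16)
stages = decode_thermometer(code)   # number of cells traversed by the start edge
measured_ps = stages * 20.0         # quantized estimate of the interval
```

In this example the start edge has passed six cells when the stop arrives, so the interval is reported as 120 ps and the remaining 17 ps is lost to quantization.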

### 2.3 Locked loop TDC

The TDC discussed so far is a so-called linear TDC, which consists of one or more feed-forward delay lines without any feedback loop. As the maximum time interval to be measured increases, the number of buffers in the delay line increases and consequently the overall area of the TDC grows. To avoid such long delay lines and their large area, looped TDC architectures (Figure 4) were introduced. In this architecture, a short delay line is bent into a loop, which allows the start signal to traverse it multiple times. This process is sometimes called "reference recycling" [8].

When the start signal reaches the end of the delay line, it is fed back to the beginning and traverses the line again. Usually, a counter keeps track of how many times the start event passes through the loop before the TDC stops.



Figure 4. Looped TDC architecture

The counter value $B_{count}$ is a coarse quantization of the measured time interval. It provides a wide measurement range but lacks fine resolution. The one-zero transition in the thermometer code generated by the short delay line, on the other hand, describes the position within one counter interval and provides the fine resolution, denoted $B_{tdc}$. Combining these two components gives an accurate representation of the measured time interval with both coarse and fine quantization. We can define the overall measurement value *B* as follows:

$$B = N \cdot B_{count} + B_{tdc} \tag{2.1}$$

Here, *N* defines the number of the delay cells within the short delay line. The combination of coarse and fine quantization allows us to represent the measured time interval of a TDC in a very efficient and accurate way.
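As a minimal sketch of Equation 2.1, the following snippet splits an interval into a coarse counter value and a fine delay-line position and recombines them. It assumes ideal cells and a fine code in the range 0 to N-1; the function names and numbers (16 cells of 20 ps, a 1370 ps interval) are hypothetical.

```python
from typing import Tuple


def looped_tdc_output(b_count: int, b_tdc: int, n_cells: int) -> int:
    """Equation 2.1: B = N * B_count + B_tdc."""
    if not 0 <= b_tdc < n_cells:
        raise ValueError("fine code must lie within one pass of the delay line")
    return n_cells * b_count + b_tdc


def measure(interval_ps: int, cell_delay_ps: int, n_cells: int) -> Tuple[int, int, int]:
    """Ideal looped TDC: return (B_count, B_tdc, B) for a given interval."""
    cells_traversed = interval_ps // cell_delay_ps       # total cells passed, all laps included
    b_count, b_tdc = divmod(cells_traversed, n_cells)    # completed laps and position in current lap
    return b_count, b_tdc, looped_tdc_output(b_count, b_tdc, n_cells)


# Hypothetical example: 16 cells of 20 ps, interval of 1370 ps.
b_count, b_tdc, b = measure(interval_ps=1370, cell_delay_ps=20, n_cells=16)
```

Here the start signal completes four full laps and reaches the fourth tap of the fifth lap, so the combined output is 68 LSBs, the same value a linear 68-cell delay line would have produced.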

#### 2.3.1 Delay locked loop based TDC

As we already discussed in the previous section, a DLL-based TDC has a delay line which acts as a timing generator. To sample the states of the delay cells, a series of start and stop registers produces a thermometer code. An encoder then transforms the thermometer code into a binary output.

The performance of the TDC depends on the DLL, which controls the timing of the overall system. The DLL produces delayed versions of the clock signal supplied to it as the reference. These signals have a phase difference that is proportional to the delay of each individual delay cell. The DLL continuously tracks the total delay of the delay line and keeps it equal to one period of the input clock.

A phase frequency detector (PFD) determines the phase difference between the input reference clock and the output of the voltage-controlled delay line (VCDL). This phase difference is delivered to the charge pump, which integrates the average phase difference by charging or discharging a capacitor. The resulting control voltage $V_{ctrl}$, produced at the loop filter, is applied to the VCDL to adjust the delay of the buffers so that the $(i+1)^{th}$ cycle of the reference clock aligns with the $i^{th}$ cycle at the VCDL output. Figure 5 shows the basic architecture of a DLL-based TDC.



Figure 5. DLL based TDC

Before discussing the closed-loop response of the VCDL, it is useful to give an overview of every component that makes up a DLL.

The phase frequency detector is the first block of a DLL and determines the phase difference between the reference clock and the output of the VCDL. A PFD can be built from two D flip-flops whose D inputs are tied high to the supply. The two inputs A and B, the reference clock and the VCDL output respectively, clock the flip-flops. If the rising edge of A arrives before that of B, $Q_A$ goes high for a duration equal to the phase difference between the two. When the rising edge of B arrives, $Q_B$ also goes high, so both $Q_A$ and $Q_B$ are high for a brief moment and reset the flip-flops through an AND gate. $Q_A$ and $Q_B$ are simultaneously high for a duration determined by the delay of the AND gate and the reset path of the flip-flops.



Figure 6. (i) Phase frequency detector (PFD) with D flip-flops (ii) PFD output (iii) D flip-flop architecture with two cross-coupled SR latches
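The pulse behavior of the PFD for a single pair of rising edges can be captured in a short behavioral sketch. It is an idealized model, not a description of the implemented circuit: the function name, the lumped reset delay `t_reset`, and the example times are assumptions made for illustration.

```python
from typing import Tuple


def pfd_pulse_widths(t_a: float, t_b: float, t_reset: float) -> Tuple[float, float]:
    """Return (Q_A high time, Q_B high time) for one pair of rising edges.

    t_a, t_b : arrival times of the rising edges on A (reference) and B (VCDL output)
    t_reset  : assumed lumped delay of the AND gate plus the flip-flop reset path
    """
    lead = abs(t_b - t_a)  # phase difference expressed as a time
    # The output of the earlier input stays high for the lead time plus the reset
    # delay; the later one is high only while the reset propagates.
    q_a = t_reset + (lead if t_a <= t_b else 0.0)
    q_b = t_reset + (lead if t_b < t_a else 0.0)
    return q_a, q_b


# Hypothetical example in picoseconds: A leads B by 50 ps, reset path takes 10 ps.
q_a, q_b = pfd_pulse_widths(t_a=0.0, t_b=50.0, t_reset=10.0)
net_up_time = q_a - q_b  # the overlap cancels, leaving only the phase difference
```

The difference between the two pulse widths equals the phase difference regardless of the reset delay, which is why the overlap interval does not contribute to the net phase information in this ideal model.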

The charge pump, together with the loop filter, is the second stage of the DLL and follows the PFD. It is a relatively simple integrating circuit in which two switches, S1 and S2, are controlled by the PFD outputs $Q_A$ and $Q_B$. When $Q_A$ is high for a duration $\Delta T$ within a period $T$, S1 turns on and charges the capacitor $C_L$. The capacitor voltage $V_{ctrl}$ then rises by $I \cdot \Delta T / C_L$. Similarly, when $Q_B$ is high, $C_L$ discharges to ground and $V_{ctrl}$ drops. Recall that $Q_A$ and $Q_B$ are both high for a moment while the PFD resets; during that time both S1 and S2 are on and current flows directly from the supply to ground. For a phase difference of $\Delta\Phi$ rad, which corresponds to $(\Delta\Phi/2\pi) \cdot T$ seconds, the average current charging the capacitor is $I \cdot \Delta\Phi/2\pi$. The average slope of $V_{ctrl}$ is therefore $I \cdot \Delta\Phi/(2\pi C_L)$.



Figure 7. (i) Circuit diagram of a charge pump (ii) Charge pump output showing its behavior as an integrator

The average slope of the capacitor voltage  $V_{ctrl}$ ,

$$\left(\frac{dV_{ctrl}}{dt}\right)_{\text{average}} = \frac{I \, \Delta \Phi}{2\pi \, C_L} \tag{2.2}$$

$$\Delta V_{ctrl} = \frac{\Delta \Phi \cdot I}{2\pi \cdot C_L} \cdot T \tag{2.3}$$

$$V_{ctrl}(t) = \frac{\Delta \Phi \cdot I}{2\pi \cdot C_L} \cdot t \, u(t) \tag{2.4}$$

$$\frac{V_{ctrl}}{\Delta\Phi}(s) = \frac{I}{2\pi \cdot C_L} \cdot \frac{1}{s} \tag{2.5}$$

From equation 2.5 we can see that the charge pump acts as an integrator which takes the phase difference between the reference clock and the VCDL output and integrates it into a voltage  $V_{ctrl}$ . This control voltage is then fed to all the buffers in the delay line to have a total delay of one period of the reference clock.
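A short numerical check can make the relation between the pulse-width view ($I \cdot \Delta T / C_L$) and the phase view (Equation 2.3) concrete. The component values below (10 µA pump current, 1 pF capacitor, 1 ns reference period, 50 ps phase error) are hypothetical and chosen only for illustration, and the function names are not from the thesis.

```python
import math


def delta_vctrl_from_pulse(delta_t: float, i_cp: float, c_l: float) -> float:
    """Voltage step on C_L when the pump current flows for delta_t: I * dT / C_L."""
    return i_cp * delta_t / c_l


def delta_vctrl_per_period(delta_phi: float, i_cp: float, c_l: float, t_ref: float) -> float:
    """Equation 2.3: change of V_ctrl over one reference period for a phase error delta_phi."""
    return delta_phi * i_cp * t_ref / (2.0 * math.pi * c_l)


# Hypothetical values, not taken from the thesis design.
I_CP = 10e-6      # charge pump current in A
C_L = 1e-12       # loop capacitor in F
T_REF = 1e-9      # reference period in s
DELTA_T = 50e-12  # time the UP switch is closed in s

delta_phi = 2.0 * math.pi * DELTA_T / T_REF  # the same error expressed in radians
dv_pulse = delta_vctrl_from_pulse(DELTA_T, I_CP, C_L)
dv_phase = delta_vctrl_per_period(delta_phi, I_CP, C_L, T_REF)
```

Both expressions give the same step of about 0.5 mV per reference cycle for these values, which confirms that Equation 2.3 is the pulse-width expression rewritten in terms of phase.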

The final segment of a DLL is a delay line which consists of an array of buffers. The delay line delays a reference signal by a predefined time constant that can be externally controlled [15]. The control voltage  $V_{ctrl}$  generated by the charge pump and the loop filter is provided to every buffer to match the delay so that the total delay becomes equal to the period of the reference clock. Every buffer produces the same amount of delay generating a total delay of the measured time interval. The total duration of time that can be measured by the TDC can be calculated by

$$\Delta T = N \cdot T_{LSB} + \epsilon \tag{2.6}$$

Here, *N* is the number of delay cells and $T_{LSB}$ is the least significant bit of the TDC, which determines the resolution of the TDC. $T_{LSB}$ is the minimum delay that every buffer generates in the delay line. $\epsilon$ is the quantization error, which we will discuss with other non-idealities in the next section.



Figure 8. Diagram of a voltage controlled delay line

To describe the overall system performance of a DLL-based TDC, we need to look at the closed-loop characteristics of the DLL. It is necessary to look at the open-loop and closed-loop system model of a DLL to understand their performance.

To determine the open-loop gain of the DLL, we can model the system using the gain of each block (Figure 9). The gain of the phase-frequency detector is $K_{pd}$, which relates its output to the phase difference between the reference clock and the VCDL output. The transfer function of the charge pump is $I/(2\pi \cdot C_L \cdot s)$, which is an integrator. Finally, the gain of the voltage-controlled delay line (VCDL) is $K_{VCDL}$, which relates the propagation delay produced by the delay line to the control voltage.



Figure 9. Closed loop model of a DLL

The open-loop gain of the DLL,

$$G(s) = K_{pd} \cdot \frac{I}{2\pi \cdot C_L \cdot s} \cdot K_{VCDL} \tag{2.7}$$

From the open-loop gain G(s), we can also calculate the closed-loop transfer function as:

$$H(s) = \frac{G(s)}{1+G(s)} = \frac{K_{pd} \cdot \frac{I}{2\pi \cdot C_L} \cdot \frac{1}{s} \cdot K_{VCDL}}{1 + K_{pd} \cdot \frac{I}{2\pi \cdot C_L} \cdot \frac{1}{s} \cdot K_{VCDL}} \tag{2.8}$$

Equation 2.8 shows that the DLL is a single-pole system, with the pole generated by the charge pump integrator. That is why the DLL is inherently stable, which is not the case for the type-II PLL discussed in the next section.
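As a worked step, the closed-loop transfer function can be rearranged to make the single pole explicit. This is only a sketch; the symbol $\omega_c$ is introduced here for convenience and is not used elsewhere in the text. Multiplying the numerator and denominator of $H(s) = G(s)/(1+G(s))$ by $s$ gives

$$H(s) = \frac{\omega_c}{s + \omega_c} = \frac{1}{1 + s/\omega_c}, \qquad \omega_c = \frac{K_{pd} \cdot I \cdot K_{VCDL}}{2\pi \cdot C_L}$$

For positive gains, the only pole lies at $s = -\omega_c$ in the left half-plane, and $\omega_c$ is the closed-loop bandwidth of this first-order model. In this model, a larger pump current or a smaller $C_L$ widens the bandwidth and shortens the settling time.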

#### 2.3.2 Phase locked loop based TDC

A PLL-based TDC is very similar to the DLL-based TDC discussed in the previous section. The main distinction is that instead of a voltage-controlled delay line, it uses a voltage-controlled ring oscillator (VCRO). Previously, the control voltage $V_{ctrl}$ from the loop filter controlled only the delay while the frequency was fixed; now it also controls the oscillation frequency of the ring oscillator.

Figure 10 shows the block diagram of a typical type-II PLL. The VCRO acts as the second integrator in the loop, integrating the control voltage into output phase.



Figure 10. Type-II PLL

As we have two integrators in the loop, the system is inherently unstable because the phase margin is zero. That is why an additional resistance ($R_1$) is placed in series with the loop filter capacitor $C_1$: it adds a left-half-plane (LHP) zero that counteracts one of the poles and stabilizes the system. To control the VCRO we want a stable DC control voltage $V_{ctrl}$; otherwise the frequency fluctuates and the loop cannot lock. However, the resistance $R_1$ causes positive and negative ripple pulses of amplitude $I \cdot R_1$ on the control voltage. To counteract this ripple, another capacitor $C_2$ is added in parallel with the low-pass filter. It adds another pole to the system in addition to the two poles at the origin that come from the two integrators (the charge pump and the VCRO).



Figure 11. Closed loop model of a PLL

The frequency of the voltage-controlled oscillator is determined by the control voltage $V_{ctrl}$ coming from the low-pass filter and by the number of delay cells. Usually, the frequency of the oscillator is very high (GHz), and generating a reference clock at such a high frequency is often not feasible. That is why the output frequency is divided by an integer ratio (M) and compared with a MHz-range reference clock. The VCRO itself is an integrator: it integrates the control voltage $V_{ctrl}$, and the loop keeps the phase difference $\Delta \Phi$ between the input and the output constant.

$$\Phi_{out}(t) - \Phi_{ref}(t) = \text{constant} \longrightarrow \frac{d\Phi_{out}}{dt} = \frac{d\Phi_{ref}}{dt} \qquad \omega_{ref} = \omega_{out} \tag{2.9}$$

To analyze the PLL architecture, we can have a look at the closed loop system model of a type-II PLL. The open-loop gain of the system, G(s) can be calculated by putting the gains of all subsystems:

$$G(s) = \frac{K_{pd} \cdot I \cdot Z(s) \cdot K_{VCRO}}{M \cdot s} \tag{2.10}$$

Here, Z(s) is the transfer function of the low-pass filter,

$$Z(s) = R_1 + \frac{1}{s C_1} = \frac{1 + s \cdot R_1 \cdot C_1}{s \cdot C_1} \tag{2.11}$$

The closed-loop gain of the system,

$$\begin{aligned}
H(s) = \frac{G(s) \cdot M}{1 + G(s)} &= \frac{\frac{I \cdot K_{VCRO}}{2\pi \cdot C_1} (1 + s \cdot R_1 \cdot C_1)}{s^2 + \frac{I}{2\pi} \cdot \frac{K_{VCRO}}{M} \cdot R_1 \cdot s + \frac{I}{2\pi \cdot C_1} \cdot \frac{K_{VCRO}}{M}} \\
&= M \cdot \frac{\omega_n^2 \left(1 + \frac{s}{\omega_z}\right)}{s^2 + 2\zeta \omega_n s + \omega_n^2}
\end{aligned} \tag{2.12}$$

Equation 2.12 helps us find the natural frequency ($\omega_n$), the zero ($\omega_z$), and the damping coefficient ($\zeta$) of the system. Comparing the two forms of Equation 2.12, in which the phase-detector gain is written out as $K_{pd} = 1/2\pi$, gives

$$\omega_n = \sqrt{\frac{K_{pd} \cdot K_{VCRO} \cdot I}{M \cdot C_1}} \tag{2.13}$$

$$\zeta = \frac{R_1}{2} \sqrt{\frac{K_{pd} \cdot K_{VCRO} \cdot I \cdot C_1}{M}} \tag{2.14}$$

$$\omega_z = \frac{\omega_n}{2\zeta} = \frac{1}{R_1 \cdot C_1} \tag{2.15}$$
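The loop parameters of Equations 2.13 to 2.15 are easy to evaluate for a candidate loop filter. The sketch below assumes a phase-detector gain of $1/2\pi$; all component values and the VCRO gain are hypothetical and serve only to show how the three quantities relate.

```python
import math


def pll_loop_parameters(k_pd, k_vcro, i_cp, m, r1, c1):
    """Return (omega_n, zeta, omega_z) following Equations 2.13 to 2.15.

    k_pd: phase detector gain, k_vcro: VCRO gain in rad/s/V,
    i_cp: charge pump current in A, m: divider ratio,
    r1, c1: loop filter resistor (ohm) and capacitor (F).
    """
    omega_n = math.sqrt(k_pd * k_vcro * i_cp / (m * c1))
    zeta = (r1 / 2) * math.sqrt(k_pd * k_vcro * i_cp * c1 / m)
    omega_z = 1 / (r1 * c1)
    return omega_n, zeta, omega_z


# Hypothetical example values (not taken from the thesis design).
omega_n, zeta, omega_z = pll_loop_parameters(
    k_pd=1 / (2 * math.pi),
    k_vcro=2 * math.pi * 1e9,
    i_cp=100e-6,
    m=32,
    r1=10e3,
    c1=50e-12,
)
print("omega_n [rad/s]:", omega_n)
print("zeta:", zeta)
print("omega_z [rad/s]:", omega_z)
# Consistency check of Equation 2.15: omega_z equals omega_n / (2 * zeta).
print("omega_n / (2 zeta):", omega_n / (2 * zeta))
```

Increasing $R_1$ raises the damping and moves the zero to a lower frequency, at the price of a larger ripple on the control voltage.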

Selecting the bandwidth of the loop filter for a PLL-based TDC plays a crucial role in determining its performance. A smaller loop-filter bandwidth reduces reference spurs and improves the phase noise level, at the cost of a longer PLL locking time [9]. The VCRO generates multiple phases but often produces about 30 dB more phase noise than the LC-based oscillators that are very popular in RF systems [10].

#### 2.4 Performance figures of TDC

A TDC quantizes a continuous time interval into a discrete digital output. Figure 12 shows the quantization function of a TDC, with the input time interval to be measured on the x-axis and the corresponding digital output on the y-axis. Note that a whole span of input time maps to the same output code. The width of this span defines the resolution of the TDC ($T_{LSB}$): advancing the input by $T_{LSB}$ increments the output by 1 LSB. The input-output characteristic exhibits a quantization error $\varepsilon$ ($0 \le \varepsilon < T_{LSB}$), which is a random signal and contributes to the noise floor of the measurement. The actual characteristic can be described as:

$$T_{in} = B_{out} \cdot T_{LSB} + \varepsilon \tag{2.16}$$

The transfer characteristic of a TDC is very similar to an ADC except it quantizes a continuous time instead of a continuous voltage.
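The ideal transfer characteristic can be sketched in a few lines. The function name and the example numbers below are illustrative assumptions; times are given in picoseconds.

```python
def tdc_quantize(t_in, t_lsb):
    """Ideal TDC: split t_in into an output code and a quantization error.

    Returns (b_out, eps) such that t_in = b_out * t_lsb + eps
    with 0 <= eps < t_lsb.
    """
    b_out = int(t_in // t_lsb)
    eps = t_in - b_out * t_lsb
    return b_out, eps


# Illustrative example: a 100 ps interval measured with a 13.63 ps step.
code, error = tdc_quantize(100.0, 13.63)
print("output code:", code)
print("quantization error [ps]:", error)
```

Every input between $7 \cdot T_{LSB}$ and $8 \cdot T_{LSB}$ returns the same code, which is the staircase behavior of Figure 12.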



Figure 12. Ideal input/output behavior of a TDC

#### 2.4.1 Quantization error

Quantization error arises when a continuous signal is quantized into a discrete domain. Equation 2.16 shows how the quantization error $\varepsilon$ is associated with time-to-digital conversion. Unlike in an ADC, where the quantization error is typically symmetrical around zero and ranges from $-\tfrac{1}{2}V_{LSB}$ to $+\tfrac{1}{2}V_{LSB}$, the quantization error in a TDC is not mean-free ($0 \leq \varepsilon < T_{LSB}$). The mean of the quantization error,

$$\overline{\epsilon} = \frac{1}{T_{LSB}} \int_0^{T_{LSB}} \epsilon \, d\epsilon = \frac{1}{2} T_{LSB} \tag{2.17}$$

The power of the quantization noise is the mean of the squared error,

$$\overline{\epsilon^2} = \frac{1}{T_{LSB}} \int_0^{T_{LSB}} \epsilon^2 \, d\epsilon = \frac{1}{3} T_{LSB}^2 \tag{2.18}$$
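The two moments can be checked numerically. The sketch below assumes that the quantization error is uniformly distributed over $[0, T_{LSB})$ and evaluates both integrals with a simple midpoint rule; the function name is hypothetical.

```python
def uniform_error_moments(t_lsb, steps=100_000):
    """Mean and mean square of an error uniformly distributed on [0, t_lsb).

    Both integrals are approximated with the midpoint rule.
    """
    width = t_lsb / steps
    mean = 0.0
    mean_sq = 0.0
    for k in range(steps):
        eps = (k + 0.5) * width
        mean += eps * width / t_lsb
        mean_sq += eps * eps * width / t_lsb
    return mean, mean_sq


mean, mean_sq = uniform_error_moments(1.0)
print("mean / T_LSB:", mean)
print("mean square / T_LSB^2:", mean_sq)
print("variance / T_LSB^2:", mean_sq - mean ** 2)
```

For a normalized $T_{LSB} = 1$ the mean is one half and the mean square is one third, so the variance around the mean is $T_{LSB}^2/12$, the same value as for a zero-centered ADC quantizer.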

Quantization error can be described in terms of the signal-to-noise ratio (SNR). If all non-idealities other than quantization noise are neglected, the SNR of a TDC can be given as [8],

$$SNR = 6.02\,\text{dB} \cdot N + 1.76\,\text{dB} \tag{2.19}$$

where 2<sup>N</sup>-1 is the total number of quantization steps. Equation 2.19 shows that for each additional bit, the SNR of an ideal quantizer increases by around 6 dB. In practice this is not achieved because of other non-linearities, and the Effective Number of Bits (ENOB) is used to describe the true picture.
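As a small illustration, Equation 2.19 can be evaluated and inverted. Note that the inversion used for ENOB below is a common convention and an assumption here; the text does not give an explicit ENOB formula.

```python
def ideal_snr_db(n_bits):
    """SNR of an ideal N-bit quantizer in dB (Equation 2.19)."""
    return 6.02 * n_bits + 1.76


def enob(measured_snr_db):
    """Effective number of bits, obtained by inverting Equation 2.19.

    Assumption: the measured SNR includes all noise and distortion.
    """
    return (measured_snr_db - 1.76) / 6.02


print("ideal SNR of a 7-bit TDC [dB]:", ideal_snr_db(7))
# Hypothetical measurement: 38 dB instead of the ideal value.
print("ENOB at 38 dB:", enob(38.0))
```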

#### 2.4.2 Linear imperfections in TDC

The offset and gain errors are referred to as linear imperfections because they do not cause any non-linear distortion [11]. They are also termed linear because the offset can be represented by adding a term to the input and the gain error by a multiplication factor.

In an ideal scenario, the first step of a TDC takes place at the position $T_{000...01} = T_{LSB}$. If this first conversion happens before that, so that the entire converter characteristic is displaced along the time axis, the result is an offset error (Figure 13), which is denoted by $E_{offset}$ and expressed in LSB as:

$$E_{offset} = \frac{T_{000...01} - T_{LSB}}{T_{LSB}} \tag{2.20}$$



Figure 13. Offset error in input/output characteristics of a TDC

The gain of the TDC represents the steepness of the input/output characteristics and can be expressed as:

$$k_{TDC} = \frac{\Delta B}{\Delta T} \tag{2.21}$$

Gain can be defined in several ways depending on how the deltas are defined. In a looped TDC, whose converter characteristic is periodic, it is more meaningful to distinguish the core gain $k_{core-TDC}$, taken over one periodic part, from the overall gain, taken over the entire converter characteristic. If the periodic parts do not fit together perfectly, these two gains differ.

$$k_{TDC} = \frac{1}{T_{LSB}} \tag{2.22}$$

In an ideal TDC, the gain $k_{TDC}$ is $1/T_{LSB}$, so the output changes by one bit for every time difference of $T_{LSB}$. Any variation in the gain can be quantified as the gain error ($E_{gain}$), which represents the deviation of the last step position from its ideal value in terms of LSB after the offset error is omitted (Figure 14):

$$E_{gain} = \frac{1}{T_{LSB}} \left( T_{111...11} - T_{000...01} \right) - \left( 2^N - 2 \right) \tag{2.23}$$



Figure 14. Gain error in input/output characteristics of a TDC
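The gain error of Equation 2.23 can be computed directly from the positions of the first and last steps. The sketch below uses a hypothetical 3-bit converter with a 10 ps step whose characteristic is stretched by 5%; none of these numbers come from the simulations.

```python
def gain_error_lsb(t_first, t_last, t_lsb, n_bits):
    """Gain error in LSB following Equation 2.23.

    t_first: position of the first step (code 00...01),
    t_last: position of the last step (code 11...11),
    both in the same time unit as t_lsb.
    """
    return (t_last - t_first) / t_lsb - (2 ** n_bits - 2)


# Ideal 3-bit TDC with T_LSB = 10 ps: steps at 10 ps ... 70 ps.
print("ideal:", gain_error_lsb(10.0, 70.0, 10.0, 3))
# Hypothetical characteristic stretched by 5%: steps at 10.5 ps ... 73.5 ps.
print("stretched:", gain_error_lsb(10.5, 73.5, 10.0, 3))
```

Because the first step position is subtracted, a pure offset shifts both positions equally and leaves the gain error unchanged.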

#### 2.4.3 Non-linear imperfections in TDC

Local variations amongst the delay cells, such as device mismatch and process variation, produce mismatches in the individual delays, which cause non-linear imperfections in the TDC. They are called non-linear because they cause non-linear distortion and make the TDC characteristic deviate from its expected shape. Differential non-linearity (DNL) is caused by local variation in the delay elements and is the deviation of each individual step from its ideal value $T_{LSB}$ (Figure 15). Integral non-linearity (INL), on the other hand, results from the global variation across the delay elements. INL is the higher-level description of how the converter characteristic bends away from the ideal one.



Figure 15. DNL and INL in input/output characteristics of a TDC

To measure the INL, we can use as reference either a straight line connecting the first and last steps or a best-fit line. The INL is then either the maximum deviation or the root-mean-square (rms) deviation from this straight line, which represents the ideal characteristic of the TDC.

The delay of a delay line after N stages is given by [16],

$$T_N = \sum_{i=1}^N T_i = N \cdot T_{LSB} + \sum_{i=1}^N \epsilon_i \tag{2.24}$$

Here, $T_{LSB}$ is the nominal delay of a stage and $\varepsilon_i$ is the random error associated with each individual cell. If there is no correlation between the stages, the standard deviation of the delay after n stages is given by,

$$\sigma_{T_n} = \sqrt{n} \cdot \sigma_{\epsilon} \tag{2.25}$$

Equation 2.25 shows that the uncertainty of the accumulated delay grows as we progress along the delay line. A longer delay line is therefore more susceptible to INL error than a shorter one. The DNL can be computed by,

$$DNL_i = \frac{T_{i+1} - T_i - T_{LSB}}{T_{LSB}} = \frac{\epsilon_i}{T_{LSB}} \tag{2.26}$$
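A minimal sketch of how DNL and INL can be extracted from the tap delays of a delay line is given below. It assumes that `tap_times` holds the accumulated delay at each tap and that the INL is taken as the running sum of the DNL, which corresponds to a reference line with slope $T_{LSB}$ through the first tap. The example delays are invented.

```python
def dnl_inl(tap_times, t_lsb):
    """Return (dnl, inl) in LSB for a list of accumulated tap delays.

    tap_times[i] is the total delay at tap i; tap 0 is the line input.
    DNL follows the step-width definition; INL is its running sum.
    """
    dnl = [
        (tap_times[i + 1] - tap_times[i] - t_lsb) / t_lsb
        for i in range(len(tap_times) - 1)
    ]
    inl = []
    accumulated = 0.0
    for step_error in dnl:
        accumulated += step_error
        inl.append(accumulated)
    return dnl, inl


# Invented example: four stages with a nominal 10 ps step, delays in ps.
dnl, inl = dnl_inl([0.0, 10.0, 21.0, 30.0, 40.0], 10.0)
print("DNL [LSB]:", dnl)
print("INL [LSB]:", inl)
print("max |DNL|:", max(abs(d) for d in dnl))
print("max |INL|:", max(abs(v) for v in inl))
```

In the example the second stage is 1 ps too slow and the third 1 ps too fast, so the DNL shows two opposite errors while the INL returns to zero after the third stage.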

In a closed-loop system with a voltage-controlled delay line (VCDL), the end point of the delay line is synchronized to the reference clock, so the accumulated error is forced to zero at the end of the line. That is why the maximum accumulated delay error, and hence the maximum INL, is found in the middle of the delay line, which can be described as,

$$\sigma_{\varepsilon_{DLL}}(n) = \sigma_{\varepsilon} \cdot \sqrt{\frac{n (N-n)}{N}} \tag{2.27}$$

Figure 16. Delay variation in open-loop and closed-loop delay line
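The difference between the open-loop and the closed-loop delay line can be sketched by evaluating the accumulated standard deviation along the line. The open-loop function assumes uncorrelated stage errors, so the spread grows with $\sqrt{n}$; the function names and the normalized per-stage spread of 1 are illustrative assumptions.

```python
import math


def sigma_open_loop(n, sigma_eps):
    """Accumulated delay spread after n uncorrelated stages (open loop)."""
    return math.sqrt(n) * sigma_eps


def sigma_closed_loop(n, n_total, sigma_eps):
    """Accumulated delay spread at tap n of a locked DLL (Equation 2.27)."""
    return sigma_eps * math.sqrt(n * (n_total - n) / n_total)


N_CELLS = 128      # delay line of a 7-bit TDC
SIGMA_EPS = 1.0    # normalized per-stage spread (assumption)

for tap in (1, 32, 64, 96, 128):
    print(
        tap,
        round(sigma_open_loop(tap, SIGMA_EPS), 3),
        round(sigma_closed_loop(tap, N_CELLS, SIGMA_EPS), 3),
    )

# Tap with the largest closed-loop spread.
worst_tap = max(range(N_CELLS + 1),
                key=lambda n: sigma_closed_loop(n, N_CELLS, SIGMA_EPS))
print("worst closed-loop tap:", worst_tap)
```

At the midpoint $n = N/2$ the closed-loop spread equals $\sigma_\varepsilon \sqrt{N}/2$, which is half of the open-loop spread at the end of the line, and it falls back to zero at $n = N$.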

#### 2.5 Conclusion

In this chapter, we tried to outline the fundamental concepts of a TDC. As the main work is done on the timing generator circuit, the flash TDC was the focal point. The main blocks and system model of a DLL-based and a PLL-based TDC were explained. Finally, the linear and non-linear imperfections of a TDC were explained.

# **Chapter 3. Buffer Design**

The tunable buffer delay stage is the fundamental component of a delay line. It is responsible for delivering a consistent and well-defined delay across all stages, tunable by the control voltage. Ideally, this control voltage should be the sole factor governing the buffer stage delay. However, as in all integrated circuits, process and environmental variations can influence the buffer delays and make them inconsistent.

The timing generator of a TDC demands high resolution and accuracy, making it crucial to limit the delay variation across the buffer stages, since such variation reduces the accuracy of the TDC. A key goal in the buffer stage is therefore to minimize its sensitivity to supply, temperature, and process variation. This is often challenging because a significant amount of supply and noise rejection is required across the delay line stages.

Besides strong supply and noise rejection, the buffer design is subject to additional constraints. First, it must be compatible with a low supply voltage, because in recent technology nodes the supply voltage is as low as 1 V. Second, the buffer delay must be tunable over a wide range of control voltage. The unit delay should be controlled linearly, because the stability of the DLL relies on the loop gain, which makes linear control crucial for optimal performance.

This chapter begins with how PVT variations can impact delays in the buffer stage. We then discuss the two most widely used differential buffer architectures, which we used to implement delay lines. The first design is the well-known Maneatis cell [1], built on a source-coupled pair and employing load elements with symmetrical I-V characteristics. Together with its self-biasing implementation, this results in exceptional noise rejection. The second is a pseudo-differential implementation without a tail current source, broadly known as the Lee-Kim delay cell [4], implemented with positive feedback through a cross-coupled PMOS pair.

### 3.1 Noise sensitivity and delay variations

Process, voltage, and temperature are the main factors contributing to delay variations across the delay line. We need to analyze how a static variation in the supply voltage impacts the buffer delays, which is known as static supply noise sensitivity. It is measured as the percentage difference between the unit delays produced by the buffer divided by the difference between the supply voltages. This supply-induced change can be corrected by the loop, but the correction depends on the loop bandwidth. According to the industrial standard, all integrated circuits should operate between $-45^{\circ}$ C and $+85^{\circ}$ C. This temperature variation can also change the delay of the buffer stage. Another cause of random variation across the buffer stages is the manufacturing process, which can cause non-linearities in the delay because of small differences in the transistor characteristics.
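The static supply noise sensitivity defined above can be written as a one-line calculation. The sketch below references the percentage change to the delay at the first supply voltage, which is an assumption, and the example delays are hypothetical.

```python
def static_supply_sensitivity(delay_a, delay_b, vdd_a, vdd_b):
    """Static supply noise sensitivity in percent per volt.

    delay_a, delay_b: unit delays at the supply voltages vdd_a and vdd_b.
    The percentage change is taken relative to delay_a (assumption).
    """
    percent_change = 100.0 * (delay_b - delay_a) / delay_a
    return percent_change / (vdd_b - vdd_a)


# Hypothetical example: unit delay of 10 ps at 1.20 V and 9 ps at 1.32 V.
print("sensitivity [%/V]:", static_supply_sensitivity(10.0, 9.0, 1.20, 1.32))
```

A negative value means that the delay shrinks as the supply rises; a well-designed buffer keeps the magnitude small.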

The chain of buffers forms the delay line, and every stage produces the same delay, controllable by the control voltage established by the loop filter. With random process variation amongst the buffers, however, the delays can differ; even small differences may lead to a misreading of the TDC output. Differential non-linearity (DNL) and integral non-linearity (INL) come into the picture when we talk about these random variations amongst the delay cells.

#### 3.2 Maneatis delay cell

A differential buffer stage is constructed from an NMOS source-coupled pair driving two resistive load elements, together with an NMOS tail current source. It acts as a differential amplifier; in the large-signal region, the NMOS source-coupled pair acts as switches. Its primary function is to invert and amplify the input signal while providing a unit delay. The output swing depends mostly on the resistive load and the NMOS current source. The delay is mostly determined by the resistance of the load element and the parasitic capacitance at the outputs.



Figure 17. Generalized version of a differential buffer stage

As discussed before, one of the crucial concerns in designing the buffer stage is the rejection of static supply noise, which otherwise causes inconsistent delays along the delay line. The differential buffer shown in Figure 17 would achieve high supply noise rejection if the load elements at the top were linear, because a variation in the supply cannot change their resistance. However, we often want a load that can be adjusted externally, which will inevitably be non-linear. This adjustable load often depends on the amount of current in the differential pair. Overall, the level of supply noise rejection depends on the sensitivity of the differential pair currents to the supply variation. The diffusion capacitances at the output nodes, which form the other component of the delay generation, are often far less affected by the supply variation [1]. Note that adjustable load elements implemented with MOS devices often have non-linear current-voltage (I-V) characteristics.

The differential buffer stage shown in Figure 17 has limited supply noise rejection due to the finite output impedance of the NMOS tail current source. As the output voltage swing depends on the top supply $V_{DD}$, any change in the supply changes the drain voltage of the simple NMOS current source. That effectively changes the current through the differential pair and hence the buffer delay. That is why a simple NMOS current source, with its limited output resistance, is not well suited to the buffer design in terms of supply rejection. To enhance the supply noise rejection, a cascode implementation of the current source can be used, which has a much higher output impedance [12]. However, a cascode stage often runs into voltage headroom problems when the supply voltage is very low. Given the inadequate supply voltage for a cascode current source, a simple NMOS current source with dynamic biasing is used in the Maneatis cell (Figure 18). The bias voltage $V_{BN}$ is dynamically adjusted to compensate for the limited output impedance. This approach provides supply isolation, measured by the dependence of the branch currents on the supply, comparable to what a cascode implementation could attain.



Figure 18. Maneatis cell with symmetric load and dynamically biased NMOS tail current source

The Maneatis cell is built around an NMOS source-coupled pair with symmetric load elements and a dynamically biased NMOS tail current source. The bias voltage $V_{BN}$ of the tail current source is dynamically adjusted so that the current is independent of the supply and substrate voltages. Each symmetric load consists of a diode-connected PMOS device in parallel with an equally sized PMOS device biased with the voltage $V_{BP}$. $V_{BP}$ is equal to the externally applied control voltage $V_{ctrl}$ (Figure 19), which alone sets the overall biasing of the Maneatis cell and thereby controls the buffer delay. The device sizes are chosen to obtain the minimum delay and a wide range of control voltage, so that every buffer stage produces the same delay over the delay line.

As discussed before, linear load elements provide high supply noise rejection because a supply variation couples equally to both outputs and changes only the output common-mode level, while the differential-mode resistance remains unaffected. That is why the buffer delay is unaffected by supply variation when the buffer is designed with linear load elements. However, it is very difficult to implement adjustable load elements with MOS transistors that provide linear I-V characteristics and can be controlled over a wide range of control voltage. Nonetheless, there is a category of non-linear MOS loads that exhibit very good supply noise rejection for small variations, just like linear resistors [1]. These are called symmetric loads; they have the unique property that their I-V characteristics are symmetric about the midpoint of the voltage swing over the entire extent of the swing.

#### 3.2.1 Self-biasing circuit for Maneatis cell

Figure 19 depicts the complete current source biasing circuit of the Maneatis cell, which serves two essential purposes. First, it establishes the appropriate symmetric load swing, which sets the current through the NMOS current source. Second, it dynamically adjusts the bias voltage of the NMOS current source to counteract the finite output resistance of the simple NMOS current source. This dynamic arrangement keeps the current constant and independent of supply variation.

The current source bias circuit is mainly composed of a half-buffer replica and a single-stage amplifier (gain $\geq$ 40 dB). The amplifier adjusts the current of the NMOS current source so that the symmetric load swing of the replica buffer becomes equal to the control voltage $V_{ctrl}$. Consequently, the current of the NMOS tail current source is determined by the load element and remains unaffected by changes in the supply voltage. As the supply voltage changes, the drain voltage of the current source changes, but the amplifier adjusts the gate voltage to maintain a constant output current, counteracting the finite output impedance [13]. Because the amplifier forms a negative feedback loop, frequency compensation may be needed if the loop is not stable. The loop has two poles, one at the output of the symmetric load and another at the output of the amplifier. Often, the pole at the amplifier's output is dominant because of the much higher output impedance at that node. For frequency compensation, we can employ dominant-pole compensation, which makes the dominant pole even more dominant. This is done by adding a capacitor at the amplifier's output node, which lowers the dominant pole frequency and hence the unity-gain frequency (f<sub>ugb</sub>), providing a much better phase margin.



Figure 19. Complete implementation of the self-biasing circuit of the Maneatis cell
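The effect of dominant-pole compensation on the bias loop can be illustrated with a simple two-pole model. The sketch below is not the actual bias loop: the DC loop gain of 100 corresponds to the 40 dB mentioned above, while both pole frequencies are hypothetical values chosen only to show the trend.

```python
import math


def unity_gain_and_phase_margin(a0, p1_hz, p2_hz):
    """Unity-gain frequency (Hz) and phase margin (degrees) of a two-pole loop.

    Loop gain model: A(jf) = a0 / ((1 + j f/p1) (1 + j f/p2)), with a0 > 1.
    Setting |A(jf)| = 1 gives a quadratic equation in x = f^2.
    """
    a = 1.0 / (p1_hz ** 2 * p2_hz ** 2)
    b = 1.0 / p1_hz ** 2 + 1.0 / p2_hz ** 2
    c = 1.0 - a0 ** 2
    x = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    f_unity = math.sqrt(x)
    phase_lag = math.atan(f_unity / p1_hz) + math.atan(f_unity / p2_hz)
    return f_unity, 180.0 - math.degrees(phase_lag)


A0 = 100.0   # 40 dB DC loop gain
P2 = 100e6   # hypothetical non-dominant pole at the symmetric load output

# Hypothetical dominant pole before and after adding the compensation capacitor.
for p1 in (1e6, 100e3):
    f_u, pm = unity_gain_and_phase_margin(A0, p1, P2)
    print(f"p1 = {p1:.0e} Hz: f_ugb = {f_u:.3e} Hz, phase margin = {pm:.1f} deg")
```

Lowering the dominant pole by a factor of ten reduces the unity-gain frequency by roughly the same factor in this model and moves it well below the second pole, which is where the improvement in phase margin comes from.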

With the self-biasing circuit, no external bias generator such as a bandgap reference is needed. Even though the current source is not cascoded, the cell can achieve supply voltage rejection equivalent to that of a buffer with a cascoded tail current source [2].

In the case of a very long delay line, which is often required for a higher-resolution TDC, the control voltage generated by the charge pump has to be distributed to a large number of buffer stages. As a consequence, the effective input capacitance on the control voltage node will be very high, inevitably including a significant amount of parasitic capacitance. That is why we need a much larger loop capacitor, and the control voltage may be so high that the loop cannot even lock. In such circumstances, it is advantageous to buffer the control voltage, which allows a much smaller loop filter capacitor than before. Applying the control voltage externally to the bias circuit may also introduce other sources of noise coupling. That is why a control voltage buffer circuit is added to the self-biasing circuit in addition to the half-buffer replica. The buffer circuit consists of a diode-connected load element in a half-buffer replica, with the gate of the biased PMOS device connected to the load element's output. It generates an internal control voltage from the NMOS current source bias, to be used as the PMOS bias voltage as shown in Figure 19. Eventually the control voltage $V_{ctrl}$ becomes equal to the output voltage $V_o$ and is buffered as $V_{BP}$.

### 3.3 Lee-Kim cell

The second delay cell architecture we discuss is much simpler and is known as the Lee-Kim cell [4]. It is a pseudo-differential implementation comprising an NMOS pseudo-differential input pair and two cross-coupled PMOS devices in parallel with voltage-controlled PMOS devices in saturation (Figure 20). A single-ended delay cell built from a simple current-starved inverter cannot exploit the full potential of the technology, as its delay is quite large and it suffers from the pulse-shrinking effect when used in a long delay line [9]. The Lee-Kim cell is a modified version of the single-ended current-starved implementation that provides differential outputs and a much smaller propagation delay. The asymmetrical current starvation, which affects only the rising edge, is compensated by the cross-coupling within the delay cell and the alternating connection between delay cells in the delay line.



Figure 20. Lee-Kim delay cell

## 3.4 Conclusion

In this chapter, we explained the two delay cell architectures used to implement the delay line. The Maneatis cell is a fully differential architecture with a tail current source, biased by the self-biasing circuit, which was also explained in detail. The second delay cell architecture is the Lee-Kim cell, which features a pseudo-differential setup.

# **Chapter 4. Simulation Results**

In this chapter, we discuss the simulation results of the delay lines implemented with the Maneatis cell and the Lee-Kim cell introduced in Chapter 3. Choosing the right device dimensions is essential to obtain the minimum delay and a wide range of controllability. As discussed before, environmental and process variation can change the delays amongst the stages significantly. Most of this chapter shows how the delay changes with PVT variation and random mismatch. Both architectures are implemented in the gpdk90 nm technology with a nominal supply voltage of 1.20 V. The delay of an inverter with the minimum length and width provided in this technology is 5.5 ps. The goal is to optimize the device sizes so that the minimum buffer delay comes as close as possible to this inverter delay.

## 4.1 Delay line implemented with the Maneatis cell

We optimized the widths of the PMOS devices in the symmetric load and of the NMOS tail current source to obtain the minimum delay and a wide range of controllability without excessive power consumption. Every buffer stage should produce the minimum delay for each control voltage. The length of all devices is kept at 100 nm, the minimum length provided, except for the tail current source, where $L_{tail} = 200$ nm. Increasing the length of the tail current source increases its drain resistance and thus the power supply rejection. It also decreases the current variation between the delay cells and hence the DNL. That is why the length of the NMOS tail current source is kept slightly larger than that of the other devices.

First and foremost we need to optimize the size of the devices to have the minimum delay and make sure the devices operate in the saturation region. The width of the PMOS devices in the symmetric load (Wp) and NMOS tail current source (W<sub>tail</sub>) are chosen by doing iterative simulations providing the minimal buffer delay.





Finally, device sizes are chosen that produce the minimum delay (Figure 21) and also give a good range of controllability over the delay, from 9.54 ps to 45.13 ps under nominal conditions with the control voltage between 450 mV and 800 mV.

| Name of the device                  | Width (µm) | Length (nm) |
|-------------------------------------|------------|-------------|
| PMOS <sub>symmetric-load</sub>      | 0.7        | 100         |
| NMOS <sub>differential pair</sub>   | 0.5        | 100         |
| NMOS <sub>tail current source</sub> | 1.2        | 200         |

Table 1. Width of the devices in Maneatis cell for minimum delay

#### 4.1.1 PVT variations and mismatch in the Maneatis cell

In 90 nm technology, the nominal supply voltage is 1.2 V. As the supply voltage usually comes from a low-dropout voltage regulator (LDO), there might be a  $\pm 10\%$  variation in the supply (1.08 V – 1.32 V). Considering that aspect, the variations in the buffer delay at different control voltages are simulated.



Figure 22. (i) Static supply voltage variation in buffer delay in Maneatis cell (ii) Supply variation in two extreme cases in Maneatis cell

Due to static supply variation in the Maneatis cell, we observed a 15-30% change in the delay, except at a control voltage of 800 mV (Figure 22). Most of this variation occurs in the extreme cases where the supply goes down to 1.08 V or up to 1.32 V, as shown in Figure 22 (ii).

To look at the temperature variation in the buffer delay, we simulated the delay line at different temperatures ranging from  $-45^{\circ}$  C to  $+85^{\circ}$  C.



Figure 23. (i) Temperature variation in Maneatis cell at different control voltages (ii) Temperature variation at two extreme cases for Maneatis cell

In terms of temperature variation, the Maneatis cell shows only 8-20% variation from the delay at the nominal temperature of $27^{\circ}$ C. This variation occurs mostly at the extreme temperatures (Figure 23 (ii)).

An integrated circuit should perform properly despite the various imperfections of the manufacturing process. Along with the nominal corner (NN), the circuit is simulated in the FF, FS, SF, and SS corners to see the variations in the buffer delay.



Figure 24. Process corner variation in Maneatis cell

Figure 24 shows the buffer delay at various control voltages simulated at the different corners. The largest variations occur at the SS and FF corners, where both NMOS and PMOS are slow or fast. In those extreme cases, the buffer delay varies by around 30-40% from the nominal corner (NN).

A TDC with 7-bit resolution is built from a delay line with $2^7 = 128$ delay cells. Every cell should produce the same step delay $T_{LSB}$, which defines the resolution, but random mismatch in the buffers can change the delay. A Monte Carlo analysis can predict how much deviation to expect from the random variations amongst the delay cells. In Figure 25, a histogram shows the mean and the standard deviation of the buffer delay due to random mismatch amongst the delay cells along the Maneatis cell based delay line.



Figure 25. Monte Carlo analysis of delay line constructed with Maneatis cell at $V_{ctrl}$ = 600 mV

| Control Voltage<br>(mV) | Mean delay (ps) | Standard deviation, σ<br>(fs) |
|-------------------------|-----------------|-------------------------------|
| 450                     | 9.55            | 451.17                        |
| 500                     | 10.53           | 457.30                        |
| 550                     | 11.81           | 490.58                        |
| 600                     | 13.63           | 583.78                        |
| 650                     | 16.46           | 793.60                        |
| 700                     | 21.02           | 1170                          |
| 750                     | 29.28           | 1931                          |
| 800                     | 46.71           | 3918                          |

Table 2. Results of Monte Carlo analysis of the Maneatis cell under nominal conditions
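The spread in Table 2 is easier to compare across control voltages when it is expressed relative to the mean delay. The following minimal sketch does this with the tabulated values. It also estimates the spread of the total delay over 128 cells under the assumption, not verified in this work, that the mismatch of each cell is independent, so that the standard deviations add in quadrature. The function and variable names are illustrative only.

```python
import math

# (V_ctrl in mV, mean delay in ps, sigma in fs), copied from Table 2
MANEATIS_MC = [
    (450, 9.55, 451.17),
    (500, 10.53, 457.30),
    (550, 11.81, 490.58),
    (600, 13.63, 583.78),
    (650, 16.46, 793.60),
    (700, 21.02, 1170.0),
    (750, 29.28, 1931.0),
    (800, 46.71, 3918.0),
]

N_CELLS = 128  # 2**7 cells for a 7-bit delay line


def relative_sigma_percent(mean_ps, sigma_fs):
    """Standard deviation expressed as a percentage of the mean delay."""
    return 100.0 * (sigma_fs / 1000.0) / mean_ps


def accumulated_sigma_ps(sigma_fs, n_cells=N_CELLS):
    """Spread of the total line delay, ASSUMING independent cell mismatch."""
    return (sigma_fs / 1000.0) * math.sqrt(n_cells)


for v_ctrl, mean_ps, sigma_fs in MANEATIS_MC:
    rel = relative_sigma_percent(mean_ps, sigma_fs)
    acc = accumulated_sigma_ps(sigma_fs)
    print(f"{v_ctrl} mV: sigma/mean = {rel:.2f} %, accumulated sigma = {acc:.2f} ps")
```

With the values of Table 2, the relative spread stays between roughly 4.2% and 4.8% up to 650 mV and rises to about 8.4% at 800 mV.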



At 600 mV, the maximum DNL and INL of the open-loop delay line based on the Maneatis cell are 0.072 LSB and 0.103 LSB respectively.

Figure 26. DNL and INL measurement of a Maneatis cell based delay line
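As an illustration of how such figures can be obtained from per-cell delays, the sketch below uses the common definitions: the DNL of a cell is its deviation from the average step, normalized to that step, and the INL is the running sum of the DNL. Whether Figure 26 was extracted in exactly this way is an assumption, and the example delays are hypothetical values, not simulation data.

```python
def dnl_inl(cell_delays_ps):
    """Return (DNL, INL) lists in LSB for a list of per-cell delays.

    The LSB is taken as the average cell delay, so the DNL values sum to
    zero and the INL returns to zero at the end of the line.
    """
    t_lsb = sum(cell_delays_ps) / len(cell_delays_ps)
    dnl = [(d - t_lsb) / t_lsb for d in cell_delays_ps]
    inl = []
    running = 0.0
    for step in dnl:
        running += step          # INL is the cumulative DNL
        inl.append(running)
    return dnl, inl


# Hypothetical per-cell delays in ps, for illustration only
example_delays = [13.6, 13.9, 13.2, 13.7, 13.5, 13.4, 13.8, 13.7]
dnl, inl = dnl_inl(example_delays)
print("max |DNL| =", round(max(abs(x) for x in dnl), 3), "LSB")
print("max |INL| =", round(max(abs(x) for x in inl), 3), "LSB")
```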

### 4.2 Delay line implemented with the Lee-Kim delay cell

As with the Maneatis cell, the first task is to optimize the device sizes in the Lee-Kim cell to obtain the minimum buffer delay and a good range of controllability over the delay line while keeping the devices in the saturation region.



Figure 27. Width optimization for Lee-Kim delay cell producing minimum delay

The widths of the top PMOS devices ( $M_3$ ,  $M_4$ ) and of the feedback PMOS devices ( $W_P$ ), denoted as  $M_5$ ,  $M_6$  in Figure 20, are chosen to produce the minimum delay for various control voltages. The length of all devices is taken as the minimum length provided by the technology node, and the widths are chosen by iterative simulations as shown in Figure 27.

| Name of the devices                 | Width (µm) | Length (nm) |
|-------------------------------------|------------|-------------|
| PMOS<sub>top</sub>                  | 2.50       | 100         |
| PMOS<sub>feedback</sub>             | 0.95       | 100         |
| NMOS<sub>switch</sub>               | 1.50       | 100         |

Table 3. Width of the devices in Lee-Kim cell for minimum delay

With the chosen device sizes, we achieved a tuning range from 11.6 ps to 31.8 ps under nominal conditions, with the control voltage ranging from 400 mV to 900 mV, over which the devices operate in the intended region.

#### 4.2.1 PVT variations and mismatch in the Lee-Kim cell

The circuit is simulated by varying the supply voltage ( $\pm 10\%$  of nominal voltage) at different control voltages.



Figure 28. (i) Static supply voltage variation in buffer delay in the Lee-Kim cell (ii) Supply variation at two extreme cases

The supply variation causes a 20-35% change in the buffer delay, with the largest change occurring at the low supply extreme (1.08 V).

Temperature variation in the Lee-Kim delay cell causes a change of around 20% in the step delay relative to the nominal temperature of  $27^{\circ}$  C.



Figure 29. (i) Temperature variation in the Lee-Kim cell at different control voltages (ii) Temperature variation in the Lee-Kim cell at two extreme cases

At higher control voltages, in the FS corner, the Lee-Kim cell produced a delay significantly higher than in the NN corner. The SF corner behaved very similarly to NN, while the FF and SS corners produced 15-35% variation in the step delay.



Figure 30. Process corner analysis in the Lee-Kim delay cell

A Monte Carlo analysis of the random mismatch between the Lee-Kim cells along the delay line is shown in Figure 31. The histogram shows the standard deviation and other statistical parameters, which indicate how much the delay varies from the expected result.



Figure 31. Monte Carlo analysis of the delay line constructed with the Lee-Kim cell at  $V_{ctrl}$  = 600 mV

| Control Voltage (mV) | Mean delay (ps) | Standard deviation, $\sigma$ (fs) |
|----------------------|-----------------|-----------------------------------|
| 400                  | 11.35           | 116.39                            |
| 500                  | 13.54           | 127.15                            |
| 600                  | 16.52           | 134.87                            |
| 700                  | 20.53           | 143.05                            |
| 800                  | 25.96           | 163.98                            |
| 900                  | 31.80           | 174.72                            |

Table 4. Results of Monte Carlo analysis of the Lee-Kim cell

At 600 mV, the maximum DNL and INL of the open-loop delay line based on the Lee-Kim cell are 0.062 LSB and 0.087 LSB respectively.
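The mismatch results of the two cells can be put side by side at the control voltages that appear in both Table 2 and Table 4. The short sketch below computes the standard deviation as a fraction of the mean delay for each cell; the dictionary layout and function names are illustrative choices.

```python
# (mean delay in ps, sigma in fs) at the control voltages listed in both tables
MANEATIS = {500: (10.53, 457.30), 600: (13.63, 583.78),
            700: (21.02, 1170.0), 800: (46.71, 3918.0)}   # from Table 2
LEE_KIM = {500: (13.54, 127.15), 600: (16.52, 134.87),
           700: (20.53, 143.05), 800: (25.96, 163.98)}    # from Table 4


def sigma_over_mean_percent(mean_ps, sigma_fs):
    """Standard deviation as a percentage of the mean delay."""
    return 100.0 * sigma_fs / (mean_ps * 1000.0)


def compare(v_ctrl_mv):
    """Return (Maneatis %, Lee-Kim %) relative spread at one control voltage."""
    return (sigma_over_mean_percent(*MANEATIS[v_ctrl_mv]),
            sigma_over_mean_percent(*LEE_KIM[v_ctrl_mv]))


for v_ctrl in sorted(MANEATIS):
    maneatis_pct, lee_kim_pct = compare(v_ctrl)
    print(f"{v_ctrl} mV: Maneatis {maneatis_pct:.2f} %, Lee-Kim {lee_kim_pct:.2f} %")
```

For the tabulated values, the relative spread of the Lee-Kim cell stays below 1% at these control voltages, whereas that of the Maneatis cell is above 4%.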



Figure 32. DNL and INL measurement of a Lee-Kim cell based delay line

### 4.3 Delay-locked loop results

After appropriate sizing of the devices and analysis of their PVT variations and other nonlinearities, the delay line is put into a locked-loop configuration along with the phase-frequency detector (PFD), charge pump, and loop filter discussed in Chapter 2.

For a 7-bit DLL-based TDC, a delay line consisting of 2<sup>7</sup> = 128 delay cells is needed. In an ideal scenario, at every control voltage, all the buffers in the delay line should produce the same delay. In most cases, however, the timing generator settles only after 4-5 delay cells because of the rise and fall times of the reference clock.



Figure 33. Buffer delay for various control voltage in delay line constituted by Lee-Kim cell

Figure 33 shows that the buffer delays settle after 5 cells. The last 5 cells also show some inconsistencies because of the lighter loading at the end of the delay line. That is why, instead of 128 cells, the delay line is built with 138 cells: the first five and last five cells are left out, the 128 cells in between act as the timing generator in the TDC, and the output is taken from the 133<sup>rd</sup> stage (Figure 34).
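The cell count and the position of the output tap follow directly from the resolution and the number of cells discarded at each end. A minimal sketch of that bookkeeping, with stages counted from 1 and with illustrative names, is:

```python
N_BITS = 7          # TDC resolution in bits
GUARD_CELLS = 5     # cells discarded at each end of the delay line


def delay_line_layout(n_bits=N_BITS, guard=GUARD_CELLS):
    """Return (total cells, first usable stage, last usable stage), 1-indexed."""
    usable = 2 ** n_bits            # cells that act as the timing generator
    total = usable + 2 * guard      # guard cells at the start and at the end
    first = guard + 1
    last = guard + usable
    return total, first, last


total, first, last = delay_line_layout()
print(f"{total} cells in total, usable stages {first} to {last}")
```

For 7 bits and five guard cells per side this gives 138 cells with the last usable stage at position 133, which matches the structure described above.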



Figure 34. Diagram of the DLL constituted with the Maneatis cell based delay line

#### 4.3.1 DLL with delay line constituted by Maneatis cell

As discussed in section 2.3.1, the delay line is finally placed into a locked-loop configuration. Usually, a DLL locks when the output of the delay line is delayed by one period of the reference clock. Unlike a phase-locked loop (PLL), where the output frequency is a multiple of the input, a DLL delays an input signal by an amount set by the control voltage without changing the reference frequency.



Figure 35. DLL output with delay line made with the Maneatis cell

In Figure 35, we can see that the DLL settles at a control voltage ( $V_{ctrl}$ ) of 600 mV, which produces the buffer delay of 13.5 ps at every delay cell along the delay line that we have seen earlier. As the output signal is taken from the 133<sup>rd</sup> stage, the output should ideally settle after one cycle of the reference clock and the total delay should be 13.5 ps × 133 = 1.796 ns. However, the output settles at 1.785 ns, a difference of about 11 ps between the ideal and simulated results. This happens because of the dead time of the DLL. The calculated average power consumption is 19.12 mW. The dead time of the DLL can be improved by interlocked interpolation techniques, which can also reduce the delay step below the minimum buffer delay.
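The lock arithmetic quoted above can be reproduced with a few lines. The sketch assumes that the loop locks with a total line delay of one reference period, so that the reference frequency is the reciprocal of the settled delay; the function name is illustrative.

```python
def dll_lock_summary(buffer_delay_ps, n_stages, settled_delay_ns):
    """Compare the ideal line delay with the simulated settled delay.

    Returns (ideal delay in ns, ideal minus settled in ps, reference
    frequency in GHz), ASSUMING lock at exactly one reference period.
    """
    ideal_ns = buffer_delay_ps * n_stages / 1000.0
    error_ps = (ideal_ns - settled_delay_ns) * 1000.0
    f_ref_ghz = 1.0 / settled_delay_ns
    return ideal_ns, error_ps, f_ref_ghz


ideal_ns, error_ps, f_ref_ghz = dll_lock_summary(13.5, 133, 1.785)
print(f"ideal {ideal_ns:.4f} ns, difference {error_ps:.1f} ps, "
      f"reference clock {f_ref_ghz:.2f} GHz")
```

Before rounding, the product 13.5 ps × 133 is 1.7955 ns, so the difference to the settled value of 1.785 ns evaluates to 10.5 ps, and the corresponding reference frequency is about 0.56 GHz.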

#### 4.3.2 DLL with delay line constituted by Lee-Kim cell

A delay line built with the Lee-Kim cell is also placed in a locked-loop configuration, but the control voltage must first be buffered. Otherwise, the diffusion capacitance of all of the delay cells will effectively increase the control voltage and the loop will never be able to lock. For the Maneatis cell based delay line, the control voltage is already buffered by the self-biasing circuit. For the DLL built with the Lee-Kim cell based delay line, however, the control voltage has to be buffered after the loop filter.



Figure 36. DLL output with delay line made with the Lee-Kim cell

The control voltage settles around 471 mV, which produces a 12.81 ps buffer delay. Theoretically, the total delay should be 1.703 ns, which is very close to what the simulation produced (1.701 ns), as shown in Figure 36. The simulated average power consumption is 73.29 mW. The average power consumption of the DLL implemented with the Maneatis cell based delay line is lower than that of the DLL with the Lee-Kim cell based delay line. This is because the devices in the Maneatis cell are smaller than those in the Lee-Kim cell. In addition, a source follower is used in the Lee-Kim cell based DLL to buffer the control voltage that is supplied to the VCRO, which also adds to the power consumption. The clock frequency in the Lee-Kim cell based DLL is 0.59 GHz, which is higher than that of the Maneatis cell based DLL (0.56 GHz). That is another reason why the power consumption of the Lee-Kim cell based DLL is higher.
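To see how much of the power difference can be attributed to the clock frequency, the power and frequency figures quoted above can be normalized to energy per reference cycle. This is only a rough sketch: it treats the quoted average power as scaling with the clock frequency, which is an assumption and not a result of this work.

```python
def energy_per_cycle_pj(power_mw, f_ref_ghz):
    """Average energy drawn per reference clock cycle, in picojoules."""
    # mW / GHz = 1e-3 W / 1e9 Hz = 1e-12 J = 1 pJ
    return power_mw / f_ref_ghz


maneatis_pj = energy_per_cycle_pj(19.12, 0.56)   # Maneatis cell based DLL
lee_kim_pj = energy_per_cycle_pj(73.29, 0.59)    # Lee-Kim cell based DLL
power_ratio = 73.29 / 19.12
freq_ratio = 0.59 / 0.56

print(f"Maneatis: {maneatis_pj:.1f} pJ per cycle, Lee-Kim: {lee_kim_pj:.1f} pJ per cycle")
print(f"power ratio {power_ratio:.2f}, frequency ratio {freq_ratio:.3f}")
```

With these numbers the frequency differs by about 5%, while the power differs by a factor of roughly 3.8. Under the stated assumption, most of the gap would therefore come from the device sizing and the source follower rather than from the faster clock.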

### 4.4 Conclusion

The minimum delay at the different PVT corners differed from the nominal value for both architectures. The minimum delays simulated at the different corners for both architectures are shown in Table 5.

| Conditions                     | Delay with Maneatis cell<br>(ps) | Delay with Lee-Kim cell<br>(ps) |
|--------------------------------|----------------------------------|---------------------------------|
| Nominal condition<br>(NN)      | 9.54                             | 11.60                           |
| $V_{DD}$ = 1.08 V (-10%)       | 10.75                            | 13.68                           |
| $V_{DD}$ = 1.32 V (+10%)       | 8.66                             | 10.19                           |
| At T = $-45^{\circ}$ C         | 7.63                             | 9.03                            |
| At T = $+85^{\circ}$ C         | 11.81                            | 13.43                           |
| FF corner                      | 7.55                             | 9.15                            |
| FS corner                      | 11.70                            | 12.80                           |
| SF corner                      | 12.28                            | 11.34                           |
| SS corner                      | 13.41                            | 15.60                           |

Table 5. Minimum delay produced by the Maneatis and Lee-Kim cell at different PVT corners
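The deviations from the nominal case can be read off Table 5 more easily when they are tabulated as absolute and relative shifts. The sketch below does so with the values of Table 5; the names are illustrative.

```python
# Minimum delay in ps as (Maneatis, Lee-Kim), copied from Table 5
NOMINAL = (9.54, 11.60)
CORNERS = {
    "VDD = 1.08 V": (10.75, 13.68),
    "VDD = 1.32 V": (8.66, 10.19),
    "T = -45 C": (7.63, 9.03),
    "T = +85 C": (11.81, 13.43),
    "FF": (7.55, 9.15),
    "FS": (11.70, 12.80),
    "SF": (12.28, 11.34),
    "SS": (13.41, 15.60),
}


def deviation(value_ps, nominal_ps):
    """Return (absolute shift in ps, relative shift in %) from nominal."""
    shift = value_ps - nominal_ps
    return shift, 100.0 * shift / nominal_ps


for name, (maneatis_ps, lee_kim_ps) in CORNERS.items():
    m_shift, m_pct = deviation(maneatis_ps, NOMINAL[0])
    lk_shift, lk_pct = deviation(lee_kim_ps, NOMINAL[1])
    print(f"{name:13s} Maneatis {m_shift:+.2f} ps ({m_pct:+.1f} %), "
          f"Lee-Kim {lk_shift:+.2f} ps ({lk_pct:+.1f} %)")
```

These shifts refer to the minimum delay only; the percentage ranges quoted elsewhere in this chapter cover the whole control voltage range and are therefore not directly comparable.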

The minimum delay achieved by the Maneatis cell under nominal conditions was slightly lower than that of the Lee-Kim cell. Under supply variation, the delay of the Maneatis cell varied much less than that of the Lee-Kim cell. At the two extreme temperatures ( $-45^{\circ}$  C and  $+85^{\circ}$  C), the minimum delay varied by around 2 ps for both delay cells. In the SS corner, the Lee-Kim cell performed worst, with a deviation of around 4 ps from the nominal condition.

The power consumption at  $V_{ctrl} = 600 \text{ mV}$  of the open-loop delay lines made with the Maneatis cell and the Lee-Kim cell is 19.05 mW and 46.32 mW respectively. At the same control voltage, the Lee-Kim cell based delay line consumed much more power because its devices are wider than those of the Maneatis cell. A detailed comparison is given in the next chapter.

# **Chapter 5. Conclusion and Future Work**

Based on the simulation results, the Maneatis cell showed a clear advantage over the Lee-Kim cell in rejecting static supply variation. The self-biasing circuit in the Maneatis cell rejects a significant amount of static supply variation. Even at very high and very low temperatures, and across process corners, the variation in the delay generated by the Maneatis cell is smaller than that of the Lee-Kim cell.

| Parameters                      | Delay line with Maneatis cell | Delay line with Lee-Kim cell                 |
|---------------------------------|-------------------------------|----------------------------------------------|
| Minimum delay (ps)              | 9.54                          | 11.60                                        |
| Static supply variation         | 15-30%                        | 20-35%                                       |
| Temperature variation           | 8-20%                         | 19-22%                                       |
| Process corner variation        | 20-40%                        | 19-41% (except FS at high V<sub>ctrl</sub>)  |
| Maximum DNL (LSB), open loop    | 0.072                         | 0.062                                        |
| Maximum INL (LSB), open loop    | 0.103                         | 0.087                                        |
| Power consumption (closed loop) | 19.12 mW at 0.56 GHz          | 73.29 mW at 0.59 GHz                         |

Table 6. Comparison between delay line implemented with the Maneatis cell and the Lee-Kim cell

These results suggest that the Maneatis cell performs relatively better under PVT variations than the Lee-Kim cell. The average power consumption of the Lee-Kim cell based DLL is higher than that of the Maneatis cell based DLL because of the wider devices, the faster reference clock, and the additional source-follower circuit used to buffer the control voltage. Non-linearities in both architectures are low as the delay line only produces thermometer code. In a TDC, this timing generator is connected to the flip-flops through buffers, which will further increase the DNL and INL errors of the overall TDC. Additionally, the minimum delay will increase because of the larger capacitive load.

The minimum delays achieved by both delay cells are higher than the inverter delay in the same technology (5.5 ps). The resolution can be further improved by local passive interpolation (LPI) techniques, where a chain of resistors is placed between two delay cells. This can introduce a different time constant because of the additional capacitance associated with it.
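As a rough indication of how much interpolation would be needed, the sketch below computes the smallest interpolation factor that brings the step below the quoted inverter delay. It assumes ideal interpolation that divides one buffer delay into equal steps and ignores the RC loading of the resistor chain mentioned above, so the result is an optimistic lower bound. The function name and the factor `k` are illustrative.

```python
import math

INVERTER_DELAY_PS = 5.5   # inverter delay quoted in the text


def min_interpolation_factor(buffer_delay_ps, target_ps=INVERTER_DELAY_PS):
    """Smallest integer k with buffer_delay / k strictly below the target.

    ASSUMES ideal interpolation that splits one buffer delay into k equal
    steps; the RC loading of the resistor chain is ignored.
    """
    return math.floor(buffer_delay_ps / target_ps) + 1


for name, delay_ps in (("Maneatis", 9.54), ("Lee-Kim", 11.60)):
    k = min_interpolation_factor(delay_ps)
    print(f"{name}: k = {k}, ideal step = {delay_ps / k:.2f} ps")
```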

Jitter performance is one of the aspects that can be further investigated. The reference clock of the DLL in both architectures has a frequency in the GHz range. Generating this high-frequency clock requires a PLL, which can also be implemented with these two delay elements. The frequency tuning of the voltage-controlled oscillator in the PLL will depend on the control voltage just like the delay in the VCDL.

Radiation effects are another important aspect that was not investigated in this thesis. The impact of total ionizing dose (TID) and single event effects (SEE) is important for TDCs used in high energy physics applications. TID can change the threshold voltage of the devices, which will change the delay.

To make the DLL lock faster, modifications can be made to the D flip-flop based PFD. A hybrid phase detector composed of a bang-bang phase detector and a linear PFD, as proposed in [9,16], can be explored.

## References

[1] J. G. Maneatis and M. Horowitz (1993), *"Precise delay generation using coupled oscillators"*, IEEE Journal of Solid-State Circuits, 28(12), pp. 1273–1282, doi:https://doi.org/10.1109/4.262000.

[2] Kim, et al., "*A 30-MHz Hybrid Analog/Digital Clock Recovery Circuit in 2-um CMOS*," IEEE J. Solid-State Circuits, vol. 25, no. 6, pp. 1385-1394, Dec. 1990.

[3] M. Loinaz, and B. Wooley, *"A BiCMOS Time Interval Digitizer for High-Energy Physics Instrumentation,"* Proc. of the IEEE 1993 CICC, May 1993, pp. 28.6.1-28.6.4.

[4] J. Lee and B. Kim (2000), "A low-noise fast-lock phase-locked loop with adaptive bandwidth control", *IEEE Journal of Solid-State Circuits*, 35(8), pp. 1137–1145 doi:10.1109/4.859502.

[5] W. Kester and Analog Devices Inc. Engineering, *Data Conversion Handbook*. Newnes, 2005.

[6] G. W. Clark, "The contributions of Bruno B. Rossi to particle physics and astrophysics," *Review of Modern Physics*, vol. 21, p. 1, 1949.

[7] National Academy of Sciences, "Bruno Benedetto Rossi," in *Biographical Memoirs*, ch. 16, pp. 310–341, Washington, DC: The National Academies Press, 1998.

[8] S. Henzler, (2010), *"Time-to-Digital Converters. Springer Series in Advanced Microelectronics"*, Springer Netherlands, doi:10.1007/978-90-481-8628-0.

[9] B. Van Bockel, J. Prinzie, P. Leroux, *"Radiation Assessment of a 15.6ps Single-Shot Time-to-Digital Converter in Terms of TID"*, Electronics, vol. 8, no. 5, 2019, p. 558, doi:10.3390/electronics8050558.

[10] D. Monda, G. Ciarpi, and S. Saponara, "Analysis and comparison of rad-hard ring and lc-tank controlled oscillators in 65 nm for spacefibre applications," Sensors, vol. 20, no. 16, 2020.

[11] P. Andreani, F. Bigongiari, R. Roncella, R. Saletti, P. Terreni, A. Bigongiari, M. Lippi, "*Multibit multichannel time-to-digital converter with 1*", IEEE Journal of Solid-State Circuits, 33(4), pp. 650–656 (1998), doi:10.1109/4.663573.

[12] I. Young, et al., "A PLL Clock Generator with 5 to 110 MHz of Lock Range for Microprocessors," IEEE J. Solid-State Circuits, vol. 27, no. 11, pp. 1599-1607, Nov. 1992.

[13] Maneatis, J.G. (1996), "*Low-jitter process-independent DLL and PLL based on self-biased techniques*" IEEE Journal of Solid-State Circuits, 31(11), pp.1723–1732. doi:https://doi.org/10.1109/jssc.1996.542317.

[14] M. Johnson and E. Hudson, "A Variable Delay Line PLL for CPU-Coprocessor Synchronization," IEEE J. Solid-State Circuits, vol. SC-23, no. 5, pp. 1218-1223, Oct. 1988.

[15] B. I. Abdulrazzaq, I. A. Halin, "*A review on high-resolution CMOS delay lines towards sub-picosecond jitter performance*", 2016, SpringerPlus, 5(1), doi:https://doi.org/10.1186/s40064-016-2090-z.

[16] J. Prinzie, M. Steyaert, and P. Leroux, *"Radiation Hardened CMOS Integrated Circuits for Time-Based Signal Processing"*, Cham, Switzerland: Springer, 2018.