IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II EXPRESS BRIEFS, VOL. XX, NO. XX, SEPT. 2017

# A Slew Rate Variation Compensated 2×VDD I/O Buffer Using Deterministic P/N-PVT Variation Detection Method

Tzung-Je Lee, Member, IEEE, Tsung-Yi Tsai, Wei Lin, U-Fat Chio, Member, IEEE, and Chua-Chin Wang\*, Senior Member, IEEE

Abstract—A 2×VDD I/O buffer that uses deterministic PVT variation detection to achieve slew rate compensation is proposed in this paper. By using P-PVT and N-PVT Variation Detectors, each consisting of an inverter and a capacitor, the slew rate variation caused by PVT variation is significantly reduced. Besides, the source-drain leakage current is reduced by turning off the auxiliary current paths after the charging and discharging transients are completed. The proposed design is implemented using a typical 40 nm CMOS process. The area of the I/O buffer is 0.216  $\times$  0.052 mm<sup>2</sup>. Based on post-layout simulations, the slew rate variation is reduced by 38.29% in the worst case after the PVTL (process, voltage, temperature, and leakage) compensation.

*Index Terms*—I/O buffer, mixed-voltage tolerant, PVT variation, leakage, slew rate compensation.

## I. INTRODUCTION

MIXED-voltage I/O buffers are widely used to transmit and receive digital signals at different voltage levels between separate chips, as shown in Fig. 1 [1]-[4]. As devices are scaled down to the nanometer regime, the slew rate (SR) becomes highly sensitive to PVT variations. Take Fig. 1 as an example, where a frequency higher than 533 MHz and an SR variation confined to 1.5-5.0 V/ns are required to meet the DDR2 specifications.

Recently, several PVT compensation methods have been developed for mixed-voltage I/O buffers to adjust the SR according to the PVT corner [5]-[9]. However, these methods detect only 3 process corners (TT, FF, and SS) [5]-[8] or take multiple clock cycles [9], which may result in a long settling time and missing codes due to poor SR at high frequency.

In the authors' prior works [5] and [9], 18 and 46 MOS transistors, respectively, are used for PVT detection. By contrast, the PVT detection in this work is achieved within one cycle by using only 4 MOS transistors and 2 capacitors. It results in an SR variation reduction of 38.29% in the worst case. Besides, the source-drain leakage current of the large driving MOS transistors is reduced by 50.9% without affecting the SR performance.

This investigation was partially supported by grants NSC-102-2221-E-110-083-MY3 and NSC-101-2632-E-230-001-MY3 of the National Science Council, and MOST 104-2622-E-006-040-CC2 of the Ministry of Science and Technology, Taiwan. The authors would like to express their deepest appreciation to CIC (Chip Implementation Center) in NARL (National Applied Research Laboratories), Taiwan, for the assistance with chip fabrication.

T.-J. Lee is an Adjunct Assistant Professor of the Department of Electrical Engineering, National Sun Yat-Sen University. He is also the Co-Advisor of Tsung-Yi Tsai. Besides, Prof. Lee is an Assistant Professor of the Department of Computer Science and Information Engineering, Cheng Shiu University, Kaohsiung, Taiwan 83347. (email: tjlee@gcloud.csu.edu.tw).

C.-C. Wang, T.-Y. Tsai and W. Lin are with the Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424. Prof. C.-C. Wang is the contact author. (email: ccwang@ee.nsysu.edu.tw).

U.-F. Chio is with the State Key Laboratory of Analog and Mixed Signal VLSI, University of Macau, Macao, China.




Fig. 1. Application example of mixed-voltage I/O buffer.

## II. 2×VDD I/O BUFFER WITH SR COMPENSATION

Fig. 2 (a) shows the block diagram of the proposed  $2 \times VDD$  I/O buffer, which is composed of a PVTL (process, voltage, temperature, and leakage) compensation circuit and an I/O buffer. The digital corner signals, Pcode[3:1] and Ncode[3:1], are generated in one cycle to drive the output PMOS and NMOS transistors in the Output Stage, respectively. DOUT and DIN are the output and input data signals, respectively. To avoid reliability problems when controlling the large driving PMOS and NMOS transistors, their gate voltages must be set appropriately, as shown in Fig. 2 (a) [5].

### A. System Consideration

Fig. 2. (a) Block diagram; (b) the SR compensation with the width selection of the proposed  $2 \times VDD$  I/O Buffer.

TABLE I

Function table of the control of the SR compensation.

| Mode | Corners | PS<br>NS | PF<br>NF | L                 | Pcode[3:1]<br>Ncode[3:1] | DOUT | $P_{\rm a}$ | $P_{\rm b}$         | $P_{\rm c}$         | $Vg4_{\rm a}$ | $Vg4_{\rm b}$       | $Vg4_{\rm c}$       |
|------|---------|----------|----------|-------------------|--------------------------|------|-------------|---------------------|---------------------|---------------|---------------------|---------------------|
| Rx   | -       | -        | -        | -                 | 000                      | -    | 1           | 1                   | 1                   | 0             | 0                   | 0                   |
| Tx   | Fast    | 0        | 0        | $0 \rightarrow 1$ | 001                      | 0/1  | 1/0         | 1                   | 1                   | 1/0           | 0                   | 0                   |
| Tx   | Typical | 0        | 1        | $0 \rightarrow 1$ | $011 \rightarrow 001$    | 0/1  | 1/0         | $1/0 \rightarrow 1$ | 1                   | 1/0           | $1/0 \rightarrow 0$ | 0                   |
| Tx   | Slow    | 1        | 1        | $0 \rightarrow 1$ | $111 \rightarrow 001$    | 0/1  | 1/0         | $1/0 \rightarrow 1$ | $1/0 \rightarrow 1$ | 1/0           | $1/0 \rightarrow 0$ | $1/0 \rightarrow 0$ |

1) SR variation compensation: To compensate the SR variation,  $P301_a$ - $P301_c$  with widths  $W_{pa}$ - $W_{pc}$ , respectively, are used in the Output Stage to provide three current paths controlled by  $P_a$ - $P_c$  according to the operation mode and the PVT corner, as shown in Table I. Referring to Fig. 2 (b), we assume that the 9 circles denote the SR at all PVT corners, called corner 1 to corner 9, respectively. They are equally separated into three groups, called the fast, typical, and slow corners, respectively, by the reference voltages,  $V_{P1}$  and  $V_{P2}$ , in the P-PVT Variation Detector. If the typical corners are detected,  $W_{pa} + W_{pb}$  is used. When the fast corners are detected, only  $W_{pa}$  is used for the Output Stage. The SR of corner 1 to corner 3 is then shifted to the typical region. When the slow corners are detected,  $W_{pa} + W_{pb} + W_{pc}$  is used to increase the driving current such that the SR of corner 7 to corner 9 is pulled up to the typical region. Thus, the SR variation could ideally be reduced by 66.67%.
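The ideal 66.67% figure quoted for Fig. 2 (b) can be checked with a one-line calculation. This is only a sketch under the assumption that the three corner groups span equal SR ranges and that the fast and slow groups are shifted exactly onto the typical region; the symbols $\Delta SR_{before}$ and $\Delta SR_{after}$ are introduced here for illustration and do not appear in the original text.

$$\frac{\Delta SR_{before} - \Delta SR_{after}}{\Delta SR_{before}} = 1 - \frac{\Delta SR_{after}}{\Delta SR_{before}} = 1 - \frac{1}{3} \approx 66.67\%$$

Under these assumptions the compensated spread equals the width of a single group, i.e. one third of the original spread.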

To accomplish the SR variation compensation shown in Fig. 2 (b), the width of  $P301_{a-c}$  and the SR variation must satisfy the following equation.

$$SR_{rise,avg} : SR_{rise,max} : SR_{rise,min} = \frac{1}{W_{pa} + W_{pb}} : \frac{1}{W_{pa}} : \frac{1}{W_{pa} + W_{pb} + W_{pc}} \tag{1}$$

where  $SR_{rise,max}$  and  $SR_{rise,min}$  denote the maximum and minimum values of  $SR_{rise}$  in all PVT corners, respectively.  $SR_{rise,avg}$  is the averaged SR.

Thus, the optimal ratio of the  $W_{pa}$ ,  $W_{pb}$ , and  $W_{pc}$  is derived as follows to determine the sizes of those large transistors in Output Stage.

$$W_{pa} : W_{pb} : W_{pc} = 1 : (\alpha_{rise,max} - 1) : \left(\frac{\alpha_{rise,max}}{\alpha_{rise,min}} - \alpha_{rise,max}\right) \tag{2}$$

where  $\alpha_{rise,max} = \frac{SR_{rise,max}}{SR_{rise,avg}}$  and  $\alpha_{rise,min} = \frac{SR_{rise,min}}{SR_{rise,avg}}$ .
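As an illustration of how Eqn. (2) might be applied, the following minimal Python sketch computes the width ratio from three slew rate values. The function name `width_ratio` and the numeric slew rates are hypothetical and are not taken from the paper; the final assertions only check that the result is consistent with the proportionality assumed in Eqn. (1).

```python
def width_ratio(sr_max, sr_min, sr_avg):
    """Return (Wpa, Wpb, Wpc) normalized to Wpa = 1, following Eqn. (2)."""
    a_max = sr_max / sr_avg  # alpha_rise,max
    a_min = sr_min / sr_avg  # alpha_rise,min
    w_pa = 1.0
    w_pb = a_max - 1.0
    w_pc = a_max / a_min - a_max
    return w_pa, w_pb, w_pc


# Hypothetical uncompensated rising-edge slew rates in V/ns (not from the paper).
w_pa, w_pb, w_pc = width_ratio(sr_max=3.0, sr_min=1.5, sr_avg=2.0)
print(w_pa, w_pb, w_pc)

# Consistency with Eqn. (1): the SR ratios equal the inverse width ratios.
assert abs((w_pa + w_pb) / w_pa - 3.0 / 2.0) < 1e-12
assert abs((w_pa + w_pb) / (w_pa + w_pb + w_pc) - 1.5 / 2.0) < 1e-12
```

Only the ratio is fixed by Eqn. (2); the absolute value of $W_{pa}$ would still have to be chosen from the required driving strength.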

2) PVT corner detection: To detect the process variation of the PMOS and NMOS transistors, the P-PVT and N-PVT Variation Detectors are used, respectively. The digital signals, PS, PF, NS, and NF, are generated to represent the PVT corners, as shown in Table I.

3) Output stage control: Referring to Fig. 2 (a) and Table I,  $P_a$ - $P_c$  and  $Vg4_a$ - $Vg4_c$  are control signals generated by the Pre-driver for the three branches of the large driving PMOS and NMOS transistors, respectively. In the receiving (Rx) mode, Pcode[3:1] is 000 such that  $P_a$ - $P_c$  are logic 1 to turn off  $P301_a$ - $P301_c$ . In the Tx mode, the states of  $P_a$ - $P_c$  and  $Vg4_a$ - $Vg4_c$  are determined by the output data, DOUT, and Pcode[3:1]. At the fast corner, Pcode[3:1] is 001 such that only one branch in the Output Stage is turned on. Similarly, Pcode[3:1] is 011 at the typical corner, so that two branches are turned on. At the slow corner, Pcode[3:1] becomes 111 to activate all three branches,  $P_a$ - $P_c$ .  $Vg4_a$ - $Vg4_c$  are determined by DOUT and Ncode[3:1] in a similar way for  $N302_a$ - $N302_c$ .

4) Leakage reduction: To reduce the source-drain leakage current, the Input Stage detects the output signal, VPAD, and generates the digital output signal, L, to turn off the auxiliary branches of the Output Stage. Referring to Table I, when VPAD reaches its final steady-state voltage, L becomes logic 1 such that Pcode[3:1] changes to 001 to turn off the corresponding branches. Besides, the delay of the Vg1 Generator would cause a large unwanted DC current in the Output Stage. Thus, a delay buffer is employed to avoid this current.
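The control behaviour listed in Table I can be summarized as a small behavioural model. The Python sketch below is an interpretation of the table at the logic level only, not the authors' Pre-driver circuit: the function names are hypothetical, and the Boolean expressions are inferred from the table rows (including the return to code 001 once L rises).

```python
def corner_code(tx, s, f, l):
    """Pcode[3:1] (or Ncode[3:1]) as a 3-character string, per Table I.

    tx: True in Tx mode; s, f: detector bits PS/PF (or NS/NF); l: leakage flag L.
    """
    if not tx:
        return "000"  # Rx mode: all branches disabled
    bit3 = bool(s) and not l  # third branch: slow corner only, until L rises
    bit2 = bool(f) and not l  # second branch: typical and slow corners, until L rises
    return f"{int(bit3)}{int(bit2)}1"


def pmos_gates(pcode, dout):
    """Active-low logic levels (P_a, P_b, P_c): 0 turns the PMOS branch on."""
    bits = pcode[::-1]  # bits[0] is Pcode[1], which controls branch a
    return tuple(0 if (b == "1" and dout == 1) else 1 for b in bits)


def nmos_gates(ncode, dout):
    """Active-high logic levels (Vg4_a, Vg4_b, Vg4_c): 1 turns the NMOS branch on."""
    bits = ncode[::-1]
    return tuple(1 if (b == "1" and dout == 0) else 0 for b in bits)


for name, s, f in [("fast", 0, 0), ("typical", 0, 1), ("slow", 1, 1)]:
    before = corner_code(True, s, f, l=0)  # during the output transient
    after = corner_code(True, s, f, l=1)   # after VPAD has settled
    print(name, before, "->", after, pmos_gates(before, dout=1))
```

The model returns logic levels only; the actual gate voltages in the 2×VDD mode are level-shifted by the Vg1 Generator, which is not modelled here.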

### B. Circuit Design

Fig. 3. (a) Schematic and (b) waveform of P-PVT Variation Detector.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCSII.2018.2837110, IEEE Transactions on Circuits and Systems II: Express Briefs

1) P-PVT Variation Detector: Fig. 3 (a) shows the schematic of the P-PVT Variation Detector. The PMOS Sensor is composed of a low-skew inverter, P341 and N341, and an on-chip capacitor,  $C_P$ . Referring to Fig. 3 (b),  $V_P$  is charged very slowly in 5 ns (half of the clock cycle) because of the low-skew inverter. If the transistor is in a fast PVT corner,  $V_P$  is charged faster than in the other corners such that its peak value is larger than both reference voltages,  $V_{P1}$  and  $V_{P2}$ . Therefore, the PVT corner codes resulting from the PMOS, namely PF and PS, are both logic 0, as shown in Table I. Because  $V_{P1}$  and  $V_{P2}$  are used to separate the entire range of  $V_P$  into 3 equal regions, they are chosen as 1/3 and 2/3 of the final peak value of  $V_P$ , respectively. Moreover, the offset and the input-referred noise of the comparator are 0.27 mV and 28.47 $\mu$V/ $\sqrt{Hz}$ , respectively.
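The comparison performed by the P-PVT Variation Detector can be sketched as a simple threshold test on the peak of $V_P$. The sketch below is a behavioural assumption, not the comparator circuit itself: the function name, the generic names `v_ref_low` and `v_ref_high` for the two reference voltages, and the numeric voltages are all hypothetical, and the assignment of the lower reference to PS and the upper one to PF is inferred from the code values in Table I.

```python
def classify_corner(v_peak, v_ref_low, v_ref_high):
    """Return (PS, PF, label) from the peak of V_P reached in half a clock cycle.

    A code bit of 0 means V_P exceeded the corresponding reference voltage.
    """
    if not v_ref_low < v_ref_high:
        raise ValueError("v_ref_low must be smaller than v_ref_high")
    ps = 0 if v_peak > v_ref_low else 1   # cleared in the typical and fast corners
    pf = 0 if v_peak > v_ref_high else 1  # cleared only in the fast corner
    label = {(0, 0): "fast", (0, 1): "typical", (1, 1): "slow"}[(ps, pf)]
    return ps, pf, label


# Hypothetical numbers: references at 1/3 and 2/3 of an assumed 0.9 V final peak.
for v_peak in (0.8, 0.45, 0.2):
    print(v_peak, classify_corner(v_peak, v_ref_low=0.3, v_ref_high=0.6))
```

The same model would apply to the N-PVT Variation Detector with NS and NF in place of PS and PF.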

To make sure that  $V_P$  is consistent with Fig. 3 (b) in different PVT corners, the charging durations must satisfy Eqns. (3)-(5).

$$T_{PF(F)} < \frac{T_{Clk}}{2} \quad \text{(Fast)} \tag{3}$$

$$T_{PS(T)} < \frac{T_{Clk}}{2} < T_{PF(T)} \quad \text{(Typical)} \tag{4}$$

$$\frac{T_{Clk}}{2} < T_{PS(S)} \quad \text{(Slow)} \tag{5}$$

where  $T_{Clk}/2$  is half of the clock period. The bounds of  $T_{Clk}$  follow from Eqns. (3) to (5), as given by the inequalities in Eqn. (6).

$$T_{PF(F)} < \frac{T_{Clk}}{2} < T_{PS(S)} \tag{6}$$

Besides, the duration is proportional to the charging capacitance and the reference voltage, and is inversely proportional to the charging current, as shown in Eqn. (7).

$$T_{PS} = \frac{C_P \times V_{P2}}{I_{P341}}, \text{ and } T_{PF} = \frac{C_P \times V_{P1}}{I_{P341}} \tag{7}$$

Thus, the boundary equation of the charging capacitance,  $C_P$ , could be derived as follows.

$$\frac{T_{Clk} \times I_{P341} \times \alpha_{P(S)}}{2 \times V_{P2}} < C_P < \frac{T_{Clk} \times I_{P341} \times \alpha_{P(F)}}{2 \times V_{P1}} \tag{8}$$

where  $\alpha_{P(F)} = \frac{\min[SR_{(F)}]}{SR_{(T)}}$  and  $\alpha_{P(S)} = \frac{\max[SR_{(S)}]}{SR_{(T)}}$ .
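To show how Eqn. (8) could be used when sizing  $C_P$ , the following Python sketch evaluates both bounds. The function name `cp_bounds` is hypothetical. The 10 ns clock period follows from the 5 ns half cycle mentioned for Fig. 3 (b); the charging current, the reference voltages, and the scaling factors are illustrative assumptions and not values from the paper.

```python
import math


def cp_bounds(t_clk, i_p341, v_p1, v_p2, alpha_fast, alpha_slow):
    """Lower and upper bounds on C_P in farads, following Eqn. (8)."""
    lower = t_clk * i_p341 * alpha_slow / (2.0 * v_p2)
    upper = t_clk * i_p341 * alpha_fast / (2.0 * v_p1)
    return lower, upper


# Assumed values: 10 ns clock, 1 uA typical charging current,
# references of 0.3 V and 0.6 V, and corner scaling factors of 1.5 and 0.6.
lower, upper = cp_bounds(t_clk=10e-9, i_p341=1e-6, v_p1=0.3, v_p2=0.6,
                         alpha_fast=1.5, alpha_slow=0.6)
print(f"{lower * 1e15:.1f} fF < C_P < {upper * 1e15:.1f} fF")

# A usable design window exists only if the lower bound is below the upper bound.
assert lower < upper
assert math.isclose(lower, 5e-15) and math.isclose(upper, 2.5e-14)
```

If the lower bound exceeds the upper bound for a given process, no single capacitor value satisfies Eqns. (3)-(5) at that clock period, and the reference voltages or the charging current would need to be revisited.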



Fig. 4. Schematic of the N-PVT Variation Detector.

2) N-PVT Variation Detector: The N-PVT Variation Detector is similar to the P-PVT Variation Detector, as shown in Fig. 4. Notably, the NMOS Sensor is composed of two NMOS transistors, N341 and N342, with low-skew design such that the charging duration of  $V_N$  is only related to the process variation caused by NMOS transistors.

For instance, the area penalties of the capacitors  $C_P$  and  $C_N$  are only 0.905 $\mu$m  $\times$  0.905 $\mu$m and 2.18 $\mu$m  $\times$  2.18 $\mu$m, respectively, when the SR requirements of DDR2 are given.



Fig. 5. Schematic of Input Stage.

*3) Input Stage and Leakage Compensation:* Referring to Fig. 5, P327 and N327 constitute the input driving inverter, which is indirectly controlled by the PAD signal, VPAD, and the mode signal, OE. P321 and N325 act as an inverter whose input signal is VPAD. P322, P323, and N321 protect the internal devices from the HV (high voltage) signal at VDDIO. Similarly, N322-N324 protect the devices from the HV level of VPAD. N326 is added to prevent the HV signal from being directly coupled to V326. When VPAD is at 1.8 V, V326 is discharged to 0 V such that DIN and L are 0.9 V.

4) Vg2 Generator: Fig. 6 shows the schematic of the Vg2 Generator, which generates a bias voltage, Vg2, according to the mode of VDDIO, as shown in Fig. 2 (a). P211-P216 and N211 form a string of diode-connected MOS transistors that generates two reference voltages, V1 and V2.

1549-7747 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.




Fig. 6. Schematic of Vg2 Generator.

5) Vg1 Generator: Fig. 7 shows the schematic of the Vg1 Generator [6]. It generates the gate drives for  $P301_a$ - $P301_c$  according to the mode of VDDIO and  $P_a$ - $P_c$ , as shown in Fig. 2 (a). Notably,  $V_{g,P316}$  and  $V_{g,P315}$  are clamped at 1.8 V or 0.9 V +  $V_{thn}$  by P317 or N315, respectively, to avoid the reliability problem caused by a VDDIO of 1.8 V.



Fig. 7. Schematic of Vg1 Generator.



Fig. 8. (a) Layout and (b) die photo of the proposed 2xVDD I/O Buffer.

## III. IMPLEMENTATION, SIMULATION AND MEASUREMENT

This work is implemented in a typical 40 nm CMOS process. Fig. 8 shows the layout and die photo of the proposed design, where a single I/O buffer circuit occupies only 0.216  $\times$  0.052 mm<sup>2</sup>, and the overall chip size is 0.579  $\times$  0.704 mm<sup>2</sup>. The PVTL detector occupies 0.039  $\times$  0.014 mm<sup>2</sup>, which is an area overhead of only 4.86%. Two I/O buffers are included to compare the cases with and without PVTL compensation. Notably, several additional top layers cover the chip such that the I/O buffers are invisible in the die photo.
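The 4.86% overhead can be reproduced from the stated dimensions, taking the single I/O buffer area as the reference:

$$\frac{0.039 \times 0.014}{0.216 \times 0.052} = \frac{0.000546}{0.011232} \approx 4.86\%$$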


Fig. 9 shows the output signal, VPAD, in all corners. After PVTL compensation, the slew rate variation of the rising edge ( $\Delta$ SR<sub>rise</sub>) is improved by 46.58%. For the falling edge, the improvement is 38.29%. For VDDIO at 1.8 V,  $\Delta$ SR<sub>rise</sub> and  $\Delta$ SR<sub>fall</sub> are 48.09% and 55.87%, respectively. Besides, the source-drain leakage current is reduced by 50.9%.



Fig. 9. Post-layout simulated SR variations for VDDIO at 0.9 V with PVT corners of TT, FF, SS, FS, and SF, the 10% voltage variation and the temperatures from  $0^{\circ}$ C to  $100^{\circ}$ C with 20 pF load.



Fig. 10. Monte Carlo simulation results of SR at the worst case for (a) VDDIO at 0.9 V and (b) VDDIO at 1.8 V.

Fig. 10 shows the statistical histograms based on Monte Carlo simulations. The standard deviation of SR is reduced by 6.77% and 8.72% after PVTL compensation for VDDIO at 0.9 V and 1.8 V, respectively. The improvement is degraded due to the layout mismatch.

The chip is measured on a PCB with 50  $\Omega$  connectors to the Tester 81250. The measured eye diagrams are shown in Fig. 11 and Fig. 12. The eye height is 0.710 V and 0.538 V for VDDIO at 1.8 V and 0.9 V, respectively. The signal is attenuated compared to the simulation results due to the impedance mismatch. However, the performance improvement after PVTL compensation is verified. Fig. 13 shows the measured SR and standard deviation. After PVTL compensation, the standard deviation from -20°C to 100°C is improved for all 3 chip samples. The averaged  $\Delta$ SR reduction is 21.12%.

Table II tabulates the comparison with several prior works. The proposed design uses the fewest devices and achieves the best  $\Delta$ SR reduction with 5 process corners in one cycle.


|                                     | [7]                | [8]                   | [6]                 | [5]                | This work          |
|-------------------------------------|--------------------|-----------------------|---------------------|--------------------|--------------------|
| Year                                | 2003               | 2010                  | 2013                | 2012               | 2017               |
| Publication                         | JSSC               | TCAS-II               | TCAS-I              | ICICDT             | TCAS-II            |
| Process (nm)                        | 180                | 180                   | 90                  | 90                 | 40                 |
| VDD (V)                             | 3.3                | 1.8                   | 1.2                 | 1.2                | 0.9                |
| VDDIO (V)                           | 3.3                | 1.8                   | 2.5                 | 2.5                | 0.9/1.8            |
| Process Corners                     | TT, FF, SS         | TT, FF, SS            | TT, FF, SS          | TT, FF, SS, FS, SF | TT, FF, SS, FS, SF |
| Lock Time                           | Hundreds of cycles | One cycle             | One cycle           | Tens of cycles     | One cycle          |
| Max. Data Rate (MHz)                | 25                 | 1000                  | 125                 | 330                | 700/650            |
| Simulated SR (V/ns)                 | 0.403-0.986        | 2.1-3.58              | 2.2-3.4             | 0.65-1.65          | 1.616-2.246        |
| Simulated $\Delta$ SR (V/ns)        | 0.583              | 1.48                  | 1.2                 | 1.0                | 0.63               |
| Simulated $\Delta$ SR reduction (%) | 23                 | N/A                   | 37.5                | N/A                | 38.29 (Worst case) |
| Measured $\Delta$ SR reduction (%)  | N/A                | N/A                   | N/A                 | N/A                | 21.12 (Averaged)   |
| Measured STD (V/ns)                 | N/A                | N/A                   | N/A                 | N/A                | 0.199 (Worst case) |
| Core area (mm <sup>2</sup> )        | 0.162              | 0.009                 | N/A                 | 0.651              | 0.011              |
| Normalized area #                   | 5                  | 0.28                  | N/A                 | 20.9               | 0.34               |
| Power (mW)                          | N/A                | 13.7 (@ 1000 MHz)     | N/A                 | 2.2 (@ 330 MHz)    | 3.92 (@ 650 MHz)   |
| Detection devices                   | 1 PLL              | 20-stage Delay cells  | 8-stage Delay cells | 18 MOS             | 4 MOS, 2 Cap       |

TABLE II

Performance comparison of output buffers

Note: # Normalized area = (Core area)/(Process scale)<sup>2</sup>.



Fig. 11. Measured eye diagram of VPAD for VDDIO at 1.8 V with 50  $\Omega$  load.



Fig. 12. Measured eye diagram of VPAD for VDDIO at 0.9 V with 50  $\Omega$  load.



Fig. 13. Measured SR and standard deviation (STD) from  $-20^{\circ}$ C to  $100^{\circ}$ C for three chip samples.

## IV. CONCLUSION

This investigation proposes PVT variation detection circuits with a very simple structure based on only one inverter and one capacitor. Besides, the analysis used to determine the design parameters is derived. The simulation and measurement results verify that the proposed design reduces the SR variation and improves the robustness against process and temperature variations. Furthermore, the SR compensation branches are turned off automatically to reduce the leakage.

## REFERENCES

- [1] M. Takahashi, T. Sakurai, K. Sawada, K. Nogami, M. Ichida, and K. Matsuda, "3.3V-5V compatible I/O circuit without thick gate oxide," in *Proc. IEEE Custom Integrated Circuits Conference*, pp. 23.3.1-23.3.4, May 1992.
- [2] M.-D. Ker and C.-S. Tsai, "Design of 2.5 V/5 V mixed-voltage CMOS I/O buffer with only thin oxide device and dynamic N-well bias circuit," in *Proc. IEEE Int. Symposium on Circuits and Systems (ISCAS)*, pp. 97-100, May 2003.
- [3] A.-J. Annema, G. J. G. M. Geelen, and P. C. de Jong, "5.5-V I/O in a 2.5-V 0.25-$\mu$m CMOS technology," *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 36, no. 3, pp. 528-538, Mar. 2001.
- [4] B. Serneels, T. Piessens, M. Steyaert, and W. Dehaene, "A high-voltage output driver in a 2.5-V 0.25-$\mu$m CMOS technology," *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 40, no. 3, pp. 576-583, Mar. 2005.
- [5] C.-L. Chen, H.-Y. Tseng, R.-C. Kuo, and C.-C. Wang, "On-chip MOS PVT variation monitor for slew rate self-adjusting 2×VDD output buffers," in *Proc. Int. Conf. on IC Design and Technology (ICICDT)*, pp. 1-4, Jun. 2012.
- [6] M.-D. Ker and P.-Y. Chiu, "Design of 2×VDD-tolerant I/O buffer with PVT compensation realized by only 1×VDD thin-oxide devices," *IEEE Trans. Circuits and Systems I (TCAS-I)*, vol. 60, no. 10, pp. 2549-2560, Oct. 2013.
- [7] S.-K. Shin, S.-M. Jung, J.-H. Seo, M.-L. Ko, and J.-W. Kim, "A slew-rate controlled output driver using PLL as compensation circuit," *IEEE J. Solid-State Circuits (JSSC)*, vol. 38, no. 7, pp. 1227-1233, Jul. 2003.
- [8] Y.-H. Kwak, I.-H. Jung, and C.-W. Kim, "A Gb/s+ slew-rate/impedance-controlled output driver with single-cycle compensation time," *IEEE Trans. Circuits Syst. II Exp. Briefs*, vol. 57, no. 2, pp. 120-125, Feb. 2010.
- [9] C.-C. Wang, W.-J. Lu, K.-W. Juan, W. Lin, H.-Y. Tseng, and C.-Y. Juan, "Process corner detection by skew inverters for 500 MHz 2×VDD output buffer using 40-nm CMOS technology," *Microelectron. J.*, vol. 46, no. 1, pp. 1-11, Jan. 2015.