## Low-cost Video Decoder with 2D2L Comb Filter for NTSC Digital TVs

Chua-Chin Wang, Senior Member, IEEE, Ching-Li Lee, and Ming-Kai Chang

**Abstract** —*A* low-cost video decoder for NTSC signals is described in this paper. The proposed NTSC video decoder employs a 2D2L comb filter and a DDFS (direct digital frequency synthesizer)-based DCO (digitally controlled oscillator), built on a trigonometric quadruple-angle formula, in a digital PLL to track and lock the demodulation clocks. The complexity of the digital video decoder is consequently drastically reduced. The overall cost of the proposed design is 7.22 mm<sup>2</sup> (26 K gates). The maximum power dissipation is 109.2 mW at the highest clock rate, which is 21.48 MHz<sup>1</sup>.

Index Terms — color burst, comb filter, DDFS, line delay, NTSC, video decoder.

#### I. INTRODUCTION

Video decoders (VDs) play a very important role in the design of the most popular consumer electronic devices: TVs. This is especially true for NTSC-based TVs [1]. The tasks of the VD include Y/C separation, sync separation, and color demodulation. Many prior attempts [8]-[13] at a digital version of NTSC TVs have been reported. The difficulty in recovering the color information results from the poor quality of the received signal, which in turn introduces serious jitter into the color burst clock as well as the H-sync and V-sync. We have therefore analyzed the PSNR (peak signal-to-noise ratio) performance of different comb filters in order to find the lowest-cost feasible solution. A novel NTSC video decoder with a DDFS-based DCO (digitally controlled oscillator) and a 2D2L comb filter [4] is described in this paper.

#### II. DIGITAL VIDEO DECODER DESIGN

An overview of a generic digital NTSC video decoder system is shown in Fig. 1, which is used to extract the Y (luminance) and C (chrominance) signals of an NTSC signal given in Fig. 2.

#### A. Comb Filter Selection

Three widely adopted comb filters are 2D1L, 2D2L, and 3D2F (D: delay, L: line, F: frame). The larger the numbers of lines and frames, the larger the memory required. In particular, the 3D2F comb filter requires a much larger memory to store two frames of video data. The transfer functions of the three types of comb filters are as follows.

2D1L:

$$C: 0.5 - 0.5 \cdot z^{-1}, \quad Y: 0.5 + 0.5 \cdot z^{-1} \tag{1}$$

2D2L:

$$C: -0.25 + 0.5 \cdot z^{-1} - 0.25 \cdot z^{-2}, \quad Y: +0.25 + 0.5 \cdot z^{-1} + 0.25 \cdot z^{-2} \tag{2}$$

3D2F:

$$C: 0.5 + 0.5 \cdot z^{-1}, \quad Y: 0.5 - 0.5 \cdot z^{-1} \tag{3}$$
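As an illustration of (1) and (2), the following minimal sketch applies the 2D1L and 2D2L taps across successive scan lines. It assumes that $z^{-1}$ denotes a one-line delay and uses an idealized test signal in which the luminance is identical on adjacent lines while the chrominance inverts from line to line. The function name `comb`, the `TAPS` table, and the sample values are illustrative and are not taken from the chip.

```python
def comb(lines, taps):
    """Apply vertical FIR taps; taps[k] weights the line delayed by k lines."""
    n = len(taps) - 1
    out = []
    for i in range(n, len(lines)):
        row = [sum(t * lines[i - k][x] for k, t in enumerate(taps))
               for x in range(len(lines[i]))]
        out.append(row)
    return out

# Tap sets copied from (1) and (2).
TAPS = {
    "2D1L": {"C": (0.5, -0.5), "Y": (0.5, 0.5)},
    "2D2L": {"C": (-0.25, 0.5, -0.25), "Y": (0.25, 0.5, 0.25)},
}

# Idealized composite signal (assumption): luma 100 everywhere,
# chroma +/-20 whose sign flips on every other line.
luma, chroma = 100.0, [20.0, -20.0, 20.0, -20.0]
lines = [[luma + (c if n % 2 == 0 else -c) for c in chroma] for n in range(4)]

y_2d1l = comb(lines, TAPS["2D1L"]["Y"])
c_2d1l = comb(lines, TAPS["2D1L"]["C"])
y_2d2l = comb(lines, TAPS["2D2L"]["Y"])
c_2d2l = comb(lines, TAPS["2D2L"]["C"])
```

Under these assumptions both filters return the luminance exactly and the chrominance of the most recent line (2D1L) or the middle line (2D2L). Real pictures violate the line-to-line assumption, which is why the filters differ in PSNR.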



Fig. 1. Overview of a digital NTSC video decoder.



Fig. 2. NTSC signal in time domain and frequency domain.

<sup>1</sup> This work was supported in part by the National Science Council under Grants NSC 92-2220-E-110-001 and 92-2220-E-110-004 and by HiMax Corp. under Grant 92A20703.

C.-C. Wang, C.-L. Lee, and M.-K. Chang are with the Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, 80424, Taiwan. (e-mail: ccwang@ee.nsysu.edu.tw).



Fig. 3. Two of the test patterns for comb filters.



Fig. 4. Block diagram and schematic of the 2D2L comb filter.

Many different video patterns are used to measure the PSNR of the aforementioned comb filters, e.g., patterns in Fig. 3. The definition of PSNR is as follows.

$$PSNR_{dB} = 20 \log_{10} \frac{2^n - 1}{RMSE} \tag{4}$$

where RMSE denotes the root mean square error and $n$ is the number of bits per sample. Notably, every line of the video signal occupies $328 \times 3 \times 8 \approx 8$ K bits. A frame is composed of 525 lines, so storing a whole frame takes $525 \times 328 \times 3 \times 8 \approx 4$ M bits. Although the 3D2F comb filter has the best PSNR, it has to store two frames of video data, which is far more memory than the other two comb filters need. Hence, the 2D2L comb filter [4] is employed in the final chip implementation. The block diagram and the schematic of the 2D2L comb filter are shown in Fig. 4.
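The PSNR definition in (4) and the memory arithmetic above can be checked with a short sketch. The function name `psnr_db` and the constant names are illustrative. The reading of the factor 3 as three components per sample is an assumption, since the text gives only the product.

```python
import math

def psnr_db(reference, decoded, bits=8):
    """PSNR per (4): 20 * log10((2**bits - 1) / RMSE)."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no error at all
    return 20 * math.log10((2 ** bits - 1) / math.sqrt(mse))

# Figures quoted in the text; COMPONENTS = 3 is an assumed interpretation.
SAMPLES_PER_LINE, COMPONENTS, BITS, LINES_PER_FRAME = 328, 3, 8, 525

line_bits = SAMPLES_PER_LINE * COMPONENTS * BITS   # 7872, about 8 K bits
frame_bits = LINES_PER_FRAME * line_bits           # 4132800, about 4 M bits

two_line_delays = 2 * line_bits     # storage for two delayed lines
two_frame_delays = 2 * frame_bits   # storage for two delayed frames
```

With these figures, two frame delays need 525 times the storage of two line delays, which is the cost argument behind choosing the 2D2L filter.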

#### B. Clock recovery by DDFS-based DCO

Prior research proposed many complicated comb filter designs to enhance the quality of the decoded images. Even worse, a high-resolution ADC was thought to be required. The underlying cause of the problem is the line length variation of the received NTSC signal, which makes the H-sync edges difficult to synchronize. We adopt three methods besides the weighting window [3] to overcome this problem.



Fig. 5. Phase error in the integrator of the PFD due to the accumulation of non-fixed periods' errors.

#### 1) Digital PLL

Jitter in the burst clock makes the clock difficult to lock. The received NTSC signal has neither a constant swing nor a constant amplitude. A sophisticated DDFS-based DCO is used to replace the common VCO in the proposed video decoder. Notably, the large and slow cos and sin ROMs can be removed by using a modified $4\theta$-based DDFS, which will be described later in the text. The sampling frequency is selected to be an integer multiple of the sub-carrier's frequency to minimize the phase error. If not, a significant variation of phase error will be produced. For instance, since there are 18 samples in a cycle, the integrator in the PFD (phase-frequency detector) will accumulate the non-fixed periods' error to generate the large fluctuations in Fig. 5. With the same phase errors of 7% and 10%, the PFD cannot lock the clock given a 20 MHz sampling frequency. By contrast, if the sampling frequency is identical to the system clock, which is 21.48 MHz, these phase errors are clearly detected. A digital version of the loop filter (LF), which is a first-order IIR filter, is included in the digital PLL. Fig. 6 shows the architecture of the LF. The equivalent loop constants, $C_1$ and $C_2$ [6], are determined by the following equations,

$$\omega_n = \frac{1}{T} \sqrt{\frac{4C_2 K_o K_d}{4 - (2C_1 + C_2)K_o K_d}}, \tag{5}$$



$$T_n = \frac{2\pi}{\omega_n},\tag{7}$$

where *T* is the sampling period, $K_o$ and $K_d$ denote the gains of the PFD and the DCO, $\omega_n$ is the natural frequency, and $T_n$ is the lock time. The architecture of the DDFS-based DCO is shown in Fig. 7. The *p* in the phase accumulator of the DDFS controls the center frequency of the DCO, and is determined by the following equation,

$$\frac{F_{sc}}{F_s} = \frac{p}{2^k - 1},\tag{8}$$

where  $F_{sc}$  is the output frequency,  $F_s$  is the sampling clock frequency of the DDFS, and k denotes the bit length of the phase accumulator. In the proposed video decoder, k is given as 28, and p is equal to 44739242 [4].
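The quoted value of *p* can be cross-checked against (8) with a few lines of arithmetic. The sketch below takes the frequency ratio as exactly 1/6, which is an inference from the 21.48 MHz system clock and the 3.58 MHz sub-carrier mentioned in the text. The function name `tuning_word` is illustrative.

```python
import math
from fractions import Fraction

K = 28              # bit length of the phase accumulator (from the text)
P = 44739242        # value of p quoted in the text
F_S = 21.48e6       # sampling clock of the DDFS in Hz (from the text)

def tuning_word(ratio, k):
    """Largest integer p with p / (2**k - 1) <= ratio, following (8)."""
    return math.floor(ratio * (2 ** k - 1))

# Assumption: F_sc / F_s is exactly 1/6 (3.58 MHz out of 21.48 MHz).
p_check = tuning_word(Fraction(1, 6), K)

# Output frequency actually produced by the quoted p, again from (8).
f_out = F_S * P / (2 ** K - 1)

# Smallest frequency step available when p changes by one.
f_step = F_S / (2 ** K - 1)
```

Under this assumption the computed word equals the quoted 44739242, the resulting output lies within a fraction of a hertz of 3.58 MHz, and a unit change in *p* moves the output by about 0.08 Hz.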



Fig. 7. Architecture of the DDFS-based DCO.

#### 2) ROM-less DDFS

A modified $4\theta$-based DDFS [3] is employed to carry out the function of the required DCO such that the slow and large ROMs can be removed. The center frequency is set to 3.58 MHz. The contents of the sin and cos ROMs are pre-computed as in [3] and hard-wired with combinational logic in order to eliminate real embedded and slow ROMs.
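The idea behind a $4\theta$-based generator can be illustrated with the standard quadruple-angle identities $\sin 4\theta = 4 \sin\theta \cos\theta (1 - 2\sin^2\theta)$ and $\cos 4\theta = 1 - 8 \sin^2\theta \cos^2\theta$. Values over a full period follow from $\sin\theta$ and $\cos\theta$ evaluated on a quarter of the range. The sketch below is not the circuit of [3]: the short polynomials merely stand in for whatever hard-wired logic produces the quarter-range values, and all function names are hypothetical.

```python
import math

def sin_cos_small(theta):
    """Truncated Taylor polynomials for sin and cos, meant for 0 <= theta < pi/2."""
    t2 = theta * theta
    s = theta * (1 - t2 / 6 * (1 - t2 / 20 * (1 - t2 / 42 * (1 - t2 / 72))))
    c = 1 - t2 / 2 * (1 - t2 / 12 * (1 - t2 / 30 * (1 - t2 / 56)))
    return s, c

def sin_cos_4theta(phase):
    """Return (sin(phase), cos(phase)) for 0 <= phase < 2*pi using theta = phase / 4."""
    s, c = sin_cos_small(phase / 4)
    sin4 = 4 * s * c * (1 - 2 * s * s)   # sin(4*theta)
    cos4 = 1 - 8 * s * s * c * c         # cos(4*theta)
    return sin4, cos4

# Worst-case deviation from the library functions over one full period.
worst = max(
    max(abs(sin_cos_4theta(p)[0] - math.sin(p)),
        abs(sin_cos_4theta(p)[1] - math.cos(p)))
    for p in (2 * math.pi * n / 1024 for n in range(1024))
)
```

Because only a quarter of the angular range has to be generated directly, the lookup portion shrinks accordingly, which is consistent with the motivation of removing the large ROMs.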

#### 3) Synchronization circuitry for color burst and sub-carrier signals

Prior digital PLLs had a tendency to lock the color burst and the sub-carrier in an out-of-phase state when their phase difference exceeded 90°. As a result, the chrominance demodulator could not demodulate the correct color. An out-of-phase detecting PLL is used to resolve this problem between the color burst and the sub-carrier. Fig. 8 shows the block diagram of the circuitry. The peak detector detects the peak of the periodic sub-carrier in the steady state. The signal delay comparator compares the peak of the color burst with the peak of the sub-carrier. Fig. 9 shows the signal delay comparator circuit. When the color burst and the sub-carrier are out of phase, the MSBs (most significant bits) of the two signals differ, and the MSB comparator outputs 0. The multiplexer then selects the delayed color burst signal as its output. Otherwise, the color burst and the sub-carrier are in phase, and the multiplexer passes the color burst signal directly to the PFD. Fig. 10 shows the variation of signals when the color burst and the sub-carrier are out of phase. Fig. 10 (a) shows the variation of signals from the 1st line to the 10th line, and Fig. 10 (b) shows the signals from the 11th line to the 20th line. In the 11th and 12th lines, the color burst is delayed so that it locks to the sub-carrier in phase.
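The MSB comparison and the multiplexer selection described above can be sketched behaviorally as follows. The sketch assumes 8-bit unsigned samples centered on 128, so that the MSB indicates which half-cycle a sample lies in. The function names and the idea of passing the delayed burst sample as an argument are hypothetical, since the text does not state the delay length.

```python
def msb(sample, bits=8):
    """Most significant bit of an unsigned sample of the given width."""
    return (sample >> (bits - 1)) & 1

def msb_comparator(burst, subcarrier, bits=8):
    """Return 1 when the MSBs agree (in phase) and 0 when they differ."""
    return 1 if msb(burst, bits) == msb(subcarrier, bits) else 0

def select_burst(burst, delayed_burst, subcarrier, bits=8):
    """Multiplexer: pass the burst when in phase, otherwise the delayed burst."""
    if msb_comparator(burst, subcarrier, bits):
        return burst
    return delayed_burst

# Burst sample above mid-scale, sub-carrier sample below: MSBs differ.
out_of_phase = select_burst(burst=200, delayed_burst=56, subcarrier=50)

# Both samples above mid-scale: MSBs agree, the burst passes unchanged.
in_phase = select_burst(burst=200, delayed_burst=56, subcarrier=180)
```

In the first call the comparator outputs 0 and the delayed sample is forwarded. In the second call the burst sample itself reaches the PFD.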



Fig. 8. Block diagram of the out-of-phase detecting PLL.



#### C. Chrominance Demodulator

The C is derived to be [4], [5]

$$C = (C_b - 128) \cdot 0.504 \cdot \sin \omega t + (C_r - 128) \cdot 0.711 \cdot \cos \omega t \tag{9}$$

$C_b$ and $C_r$ are obtained by multiplying C by the outputs of the digital PLL and low-pass filtering the products, as follows.

$$2 \cdot 1.406 \cdot \cos \omega t \cdot C \to C_r - 128 \tag{10}$$

$$2 \cdot 1.984 \cdot \sin \omega t \cdot C \to C_b - 128 \tag{11}$$



Fig. 10. Variation of signals when the color burst and the sub-carrier are out-of-phase.

The LPF that performs the filtering is a 20-tap transposed FIR filter [7]. Notably, the 128 in (10) and (11) is the DC offset of the 8-bit ADC. Hence, the correct values of $C_b$ and $C_r$ are obtained by adding the 128 DC offset to the outputs of (10) and (11).
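A numerical round trip through (9), (10), and (11) shows why the gains 1.406 and 1.984 undo the factors 0.711 and 0.504. The sketch makes several assumptions that are not in the text: six samples per sub-carrier cycle (inferred from 21.48 MHz and 3.58 MHz), a perfectly locked carrier, and a plain average over whole cycles standing in for the 20-tap FIR low-pass filter. The function names are illustrative.

```python
import math

SAMPLES_PER_CYCLE = 6   # assumption: sampling clock is 6x the sub-carrier

def modulate(cb, cr, n):
    """Chrominance sample n according to (9)."""
    wt = 2 * math.pi * n / SAMPLES_PER_CYCLE
    return (cb - 128) * 0.504 * math.sin(wt) + (cr - 128) * 0.711 * math.cos(wt)

def demodulate(c_samples):
    """Apply (10) and (11), average over whole cycles, then restore the offset."""
    total = len(c_samples)
    cb_acc = cr_acc = 0.0
    for n, c in enumerate(c_samples):
        wt = 2 * math.pi * n / SAMPLES_PER_CYCLE
        cr_acc += 2 * 1.406 * math.cos(wt) * c   # product in (10)
        cb_acc += 2 * 1.984 * math.sin(wt) * c   # product in (11)
    # The average removes the double-frequency terms (stand-in for the FIR LPF).
    return cb_acc / total + 128, cr_acc / total + 128

# Two full sub-carrier cycles of a constant color, chosen arbitrarily.
samples = [modulate(90, 200, n) for n in range(2 * SAMPLES_PER_CYCLE)]
cb_rec, cr_rec = demodulate(samples)
```

The recovered values differ from the originals only by the small mismatch between the products $1.406 \cdot 0.711$ and $1.984 \cdot 0.504$ and unity.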

#### III. SIMULATION AND IMPLEMENTATION

The TSMC (Taiwan Semiconductor Manufacturing Company) 0.35 $\mu$m 2P4M CMOS technology cell library is adopted to implement the proposed design. Fig. 11 shows the die photo of the proposed video decoder. Results of the proposed video decoder are shown in Fig. 12. The overall characteristics of the proposed video decoder as well as the comparison with prior works are summarized in Table I.



Fig. 11. Die photo of the proposed video decoder.



TABLE I **PERFORMANCE COMPARISON**

| | [2] | [3]<sup>‡</sup> | Ours |
|---|---|---|---|
| Area (mm<sup>2</sup>) | 67.76<sup>†</sup> | 6.02 | 7.22 |
| Power | 980 mW | 86 mW | 109.2 mW |
| Gate # | N/A | 39K + 2×8 Mb SRAM | 26K + 3×8 Mb SRAM |
| Process | 1.2 µm 2P2M | 0.35 µm 1P4M | 0.35 µm 2P4M |
| Avg. PSNR | N/A | 15.59 dB | 18.36 dB |

<sup>†</sup>: Estimated from the die photo.

<sup>‡</sup>: Only provided simulation results, not on silicon.

#### IV. CONCLUSION

We have proposed a novel NTSC video decoder design in this work. The traditional VCO is replaced with a ROM-less $4\theta$-based DDFS DCO to alleviate the phase tracking problem. The complexity of the digital video decoder with the 2D2L comb filter is drastically reduced and the PSNR is enhanced.

#### REFERENCES

- [1] Y. C. Faroudja, "NTSC and Beyond," *IEEE Trans. on Consumer Electronics*, vol. 34, no. 1, pp. 166-178, Feb. 1988.
- [2] M. Ohta, K. Kohiyama, N. Tahara, K. Sugihara, F. Asami, O. Kobayashi, Y. Hino, and T. Akiba, "A Single-Chip CMOS Analog/Digital Mixed NTSC Decoder," *IEEE J. of Solid-State Circuits*, vol. 25, no. 6, pp. 1464-1469, Dec. 1990.
- [3] C.-C. Wang, Y.-L. Tseng, C.-C. Chen, and C.-S. Chen, "Low-cost NTSC Digital Video Decoder Using 4θ-based DDFS," 2003 Workshop on Consumer Electronics (WCE2003), pp. 41 (CD-ROM version), Nov. 2003.
- [4] K. Jack, "Video Demystified," 3rd ed., Eagle Rock, VA: LLH Technology Publishing, 2001.
- [5] B. Grob and C. Herndon, "Basic Television and Video Systems," 6th ed., New York: Glencoe/McGraw-Hill, 1998.
- [6] J.-S. Wu, M.-L. Liou, H.-P. Ma, and T.-D. Chiueh, "A 2.6-V, 44-MHz All-Digital QPSK Direct-Sequence Spread-Spectrum Transceiver IC," *IEEE J. of Solid-State Circuits*, vol. 32, no. 10, pp. 1499-1510, Oct. 1997.
- [7] V. Pasham, A. Miller, and K. Chapman, Application Note: Transposed Form FIR Filter, Xilinx Inc., Oct. 2001. [Online]. Available: http://www.xilinx.com/bvdocs/appnotes/xapp219.pdf.
- [8] Philips, "PAL/NTSC/SECAM Video Decoder with Adaptive PAL/NTSC comb filter, VBI-Data Slicer and High Performance Scaler," Data Sheet: SAA7114H, Mar. 15, 2000.
- [9] Philips, "A Single-Chip Multimedia Engine," Data Sheet: TriMedia TM-1100, 1998.
- [10] Philips, "Multistandard Vision and Sound-IF PLL with DVB-IF Processing," Data Sheet: TDA9819, July 14, 1998.
- [11] Micronas, "Digital Receiver Front-End," Data Sheet: DRX 3960A, Oct. 2000.
- [12] Broadcom, "HDTV/CATV Receiver," Data Sheet: BCM 3510, 2001.
- [13] Techwell, "Enhanced Video Decoder with Component Inputs," Data Sheet: TW99, Sep. 28, 2000.



**Chua-Chin Wang** (M'90-SM'04) was born in Taiwan in 1962. He received the B.S. degree in Electrical Engineering from National Taiwan University in 1984, and the M.S. and Ph.D. degrees in Electrical Engineering from the State University of New York at Stony Brook in 1988 and 1992, respectively. He is currently a Professor in the Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan. His recent research interests include low power and high speed logic circuit design, VLSI design, neural networks, and interfacing I/O circuits.



**Ching-Li Lee** was born in Taiwan in 1960. He received the B.S. degree in Electronic Engineering from National Taiwan University of Science and Technology in 1987, and the M.S. degree in Electrical Engineering from National Sun Yat-Sen University in 1991. He is currently working toward the Ph.D. degree in Electrical Engineering at National Sun Yat-Sen University. His current research interest is VLSI design.

**Ming-Kai Chang** was born in Taiwan in 1979. He received the B.S. degree in Information Management from Kun Shan University of Technology, Taiwan, in 2001. He is currently working toward the M.S. degree in Electrical Engineering at National Sun Yat-Sen University. His current research interests are VLSI design and video decoders.