# A 13-Bit Resolution ROM-Less Direct Digital Frequency Synthesizer Based on a Trigonometric Quadruple Angle Formula

Chua-Chin Wang, Member, IEEE, Yih-Long Tseng, Hsien-Chih She, Chih-Chen Li, and Ron Hu

Abstract—A ROM-less direct digital frequency synthesizer employing the trigonometric quadruple angle formula is presented in this paper. The worst-case spectral purity is better than -130 dBc. The amplitude resolution is up to 13 bits, while the phase resolution is 12 bits. Neither scaling tables nor error correction tables are required. The maximum error is mathematically analyzed. The word length of each multiplier is carefully selected in the digital implementation such that the error range is circumscribed and the resolution is preserved.

*Index Terms*—Direct digital frequency synthesizer (DDFS), frequency synthesizer, ROM-less.

#### I. INTRODUCTION

EVER since low-cost RF CMOS technology became a challenger to its conventional discrete counterpart, frequency synthesizers in single-chip solutions have been required to deliver better spectral purity. Direct digital frequency synthesizers (DDFSs) are strongly preferred in some modern communication systems owing to their advantages over phase-locked loop (PLL)-based solutions, e.g., fast settling time, sub-Hertz frequency resolution, continuous-phase frequency switching, and low phase noise [1].

A conventional DDFS usually consists of a phase accumulator, a sine–cosine generator, a digital-to-analog converter (DAC), and a low-pass filter (LPF), as shown in Fig. 1. The sine–cosine generator is a look-up table in such a ROM-based DDFS. By contrast, the sine–cosine value is computed in real time in a ROM-less DDFS, as shown in Fig. 2.  $f_{\rm ctrl}$  sets the accumulation step of the phase accumulator.  $f_{\rm clk}$  is the operating frequency of the DDFS.  $\sin \theta / \cos \theta$  is the output of the DDFS. The output frequency is changed by tuning  $f_{\rm ctrl}$ . Each phase accumulating interval is decided by  $f_{\rm clk}$ .

Manuscript received July 2, 2003; revised February 18, 2004. This work was supported in part by the National Science Council under Grant NSC 92-2220-E-110-001 and National Health Research Institute under Grant NHRI-EX93-9319EI.

C.-C. Wang and Y.-L. Tseng are with the Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung 80424, Taiwan, R.O.C. (e-mail: ccwang@ee.nsysu.edu.tw).

H.-C. She is with VastView Technology, Inc., Hsin-Chu 300, Taiwan, R.O.C.

C.-C. Li is with The Analog Department, Terax Communication Technologies, Hsin-Chu 300, Taiwan, R.O.C.

R. Hu is with Asuka Microelectronics Inc., Hsin-Chu 300, Taiwan, R.O.C.

Digital Object Identifier 10.1109/TVLSI.2004.833664

Fig. 1. Conventional architecture of the ROM-based DDFSs.

Fig. 2. Conventional architecture of the ROM-less DDFSs.

Assume the phase accumulator is *M* bits wide. The relation of  $f_{\rm ctrl}$ ,  $f_{\rm clk}$ , and the output frequency is

$$\text{output frequency} = \frac{f_{\text{ctrl}}}{2^M} \cdot f_{\text{clk}}. \tag{1}$$
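As a quick illustration of (1), the following Python sketch models an *M*-bit phase accumulator and evaluates the resulting output frequency. The word length `M_BITS = 12`, the 100-MHz clock, and the control word `F_CTRL = 64` are illustrative assumptions, and the function names are hypothetical.

```python
def output_frequency(f_ctrl, f_clk, m_bits):
    """Equation (1): output frequency = f_ctrl / 2**M * f_clk."""
    return f_ctrl / (1 << m_bits) * f_clk


def accumulate(f_ctrl, m_bits, n_steps):
    """Return the phase-accumulator contents over n_steps clock cycles."""
    mask = (1 << m_bits) - 1
    phase, trace = 0, []
    for _ in range(n_steps):
        trace.append(phase)
        phase = (phase + f_ctrl) & mask  # overflow wraps modulo 2**M
    return trace


M_BITS = 12    # assumed accumulator width
F_CLK = 100e6  # assumed clock frequency in Hz
F_CTRL = 64    # assumed frequency control word

f_out = output_frequency(F_CTRL, F_CLK, M_BITS)
# Two full output periods: each period takes 2**M / F_CTRL clock cycles.
trace = accumulate(F_CTRL, M_BITS, 2 * (1 << M_BITS) // F_CTRL)
print(f"f_out = {f_out} Hz, period = {(1 << M_BITS) // F_CTRL} clocks")
```

A larger `F_CTRL` makes the accumulator wrap sooner, which is how tuning $f_{\rm ctrl}$ raises the output frequency.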

The bottleneck of the DDFS method is the spurious noise caused by amplitude quantization error, phase truncation, and DAC nonlinearities. Many prior works have addressed these problems, using ROM-based lookup tables [1]-[4], [7] or scaling and error correction tables [5], [6]. All of the ROM-based solutions suffer from the intrinsic drawbacks of ROMs: slow speed, large area, and high power consumption. Sodagar et al. proposed a ROM-less DDFS using a second-order parabolic approximation [5], [6]. However, the amplitude resolution of Sodagar's approach is limited by the second-order parabolic approximation error. Caro et al. proposed a ROM-less DDFS using a polynomial interpolation technique [8], but its resolution is also limited by the approximation error. In this paper, we propose a novel ROM-less DDFS architecture, which utilizes a 12-bit trigonometric  $4\theta$  formula and attains a 13-bit amplitude resolution.

#### II. $4\theta$ APPROXIMATION

We propose realizing the ROM-less DDFS with the trigonometric quadruple angle formula, which removes the irregular scaling and the error correction difficulties of [5]. In addition, the upper bound of the error range can be solved analytically.

Fig. 3. Comparison of cosine and (5).

# A. Trigonometric First-Order $4\theta$ Approximation

The double angle equality is well known as

$$\cos 2\theta = 2\cos^2 \theta - 1 = 1 - 2\sin^2 \theta. \tag{2}$$

Replacing  $\theta$  with  $2\theta$  in (2), and then applying (2) once more to  $\cos 2\theta$ , gives

$$\cos 4\theta = 2\cos^2 2\theta - 1 \tag{3}$$

$$= 1 - 8\sin^2\theta (1 - \sin^2\theta). \tag{4}$$

Since the range of  $4\theta$  is limited to  $[0, \pi/2]$  according to [1], the range of  $\theta$  is  $[0, \pi/8]$ . In this range  $\sin \theta \approx \theta$ , so (4) becomes

$$\cos 4\theta \approx 1 - 8\theta^2 (1 - \theta^2), \quad 0 \le \theta \le \frac{\pi}{8}. \tag{5}$$
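The quality of the small-angle substitution in (5) can be checked numerically. The sketch below is only an illustration: the one-sample-per-degree grid and the function name are assumptions. It compares (5) with the true $\cos 4\theta$ over $[0, \pi/8]$ and reports where the largest deviation occurs.

```python
import math


def cos4_small_angle(theta):
    """Equation (5): cos(4*theta) with sin(theta) replaced by theta."""
    return 1.0 - 8.0 * theta**2 * (1.0 - theta**2)


N = 90  # assumed grid: one sample per degree of 4*theta
thetas = [math.pi / 8 * i / N for i in range(N + 1)]
errors = [abs(cos4_small_angle(t) - math.cos(4 * t)) for t in thetas]

# Index of the largest deviation; with this grid the index equals 4*theta in degrees.
worst = max(range(N + 1), key=errors.__getitem__)
print(f"largest |error| = {errors[worst]:.4f} at 4*theta = {worst} degrees")
```

The deviation grows monotonically with $\theta$ in this range, so the worst case sits at the upper end of the interval.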

Fig. 3 shows the comparison of the true cosine function and (5). Notably, the maximum error occurs at 90°, i.e., at  $4\theta = \pi/2$ . In order to minimize the phase quantization error and the amplitude approximation error, the upper bound must be chosen to be smaller than and close to  $\pi/8 \approx 0.3927$ . This bound should also be easy to convert into a digital representation, which makes the physical implementation more feasible. Simulink in MATLAB is used to find a bound that meets the requirement of at least 12-bit output amplitude resolution. The simulation results suggest 3135/8192 as a suitable choice, with a phase quantization error  $\leq 2.4 \times 10^{-4} \approx 1/2^{12}$ . Hence, we redefine our first-order approximation method, called TA1(x) (first-order trigonometric approximation), as follows:

$$TA1(x) = 1 - 8x^2(1 - x^2), \quad 0 \le x \le \frac{3135}{8192}. \tag{6}$$

Fig. 4 illustrates the actual cosine function and TA1(x), while the difference of these two functions, TA1(x) - cosine, is given in Fig. 5. The maximum error read from the graph is  $13 \times 10^{-3}$ , which is smaller than  $15.625 \times 10^{-3} = 1/2^6$ . This indicates that the first-order approximation has at least 6-bit resolution. Fig. 6, produced by MATLAB simulation, shows that the worst-case spurious level is -84 dBc when the TA1(x) approximation is used alone, without any error correction.

Fig. 4. Comparison of cosine and TA1(x).

Fig. 5. TA1(x) - cosine.
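The 6-bit figure for TA1(x) can be reproduced with a few lines of Python. This is a minimal sketch rather than the Simulink model used in the paper: it assumes the linear phase map $\theta = (8192\pi/6270)\,x$ that appears in (12), and the grid size `N` and the function names are illustrative choices.

```python
import math

X_MAX = 3135 / 8192  # upper bound of x in (6)


def ta1(x):
    """Equation (6): first-order trigonometric approximation."""
    return 1.0 - 8.0 * x * x * (1.0 - x * x)


def theta_of(x):
    """Assumed linear phase map, so that x = X_MAX corresponds to theta = pi/2."""
    return 8192.0 * math.pi / 6270.0 * x


N = 4096  # assumed grid density
xs = [X_MAX * i / N for i in range(N + 1)]
err = [ta1(x) - math.cos(theta_of(x)) for x in xs]
max_err = max(abs(e) for e in err)

print(f"max |TA1 - cos| = {max_err:.4f}  (1/2^6 = {2**-6:.6f})")
```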

Since the error function  $err(x) = TA1(x) - \cos \theta$  is too complex to be implemented digitally, we propose to fit it with a polynomial function. The steps are summarized as follows.

- 1) Keep dividing TA1(x)(1 - TA1(x)) by 2 until the maximum of TA1(x)(1 - TA1(x)) is close to the maximum of err(x).
- 2) Choose a scaling factor K to further reduce the difference between TA1(x)(1 - TA1(x)) and err(x). K must be digitally representable. In addition, the final error must be less than  $2.4 \times 10^{-4} \approx 1/2^{12}$  to ensure the resolution.











The optimization procedure is carried out with Simulink in MATLAB. The final optimized error function is as follows:

$$\operatorname{err1}(x) = K \cdot (0.5)^{4} \operatorname{TA1}(x) \cdot (1 - \operatorname{TA1}(x)) \approx \operatorname{TA1}(x) - \cos\theta \tag{7}$$

where  $K = (0.84375)_{10} = (0.11011)_2, \ 0 \le \theta \le \pi/2$ , and  $0 \le x \le 3135/8192$ . Fig. 7 illustrates the optimally tuned  $\operatorname{err1}(x)$ , which is very close to the  $\operatorname{err}(x)$  function, i.e.,  $TA1(x) - \cos \theta$ .
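The two fitting steps can be imitated with a simple peak-matching heuristic. The sketch below is not the Simulink optimization used in the paper: it assumes that "close" in step 1 means halving as long as the peak of TA1(x)(1 - TA1(x)) stays at or above the peak of err(x), that K is rounded to 5 fractional bits (`K_BITS`), and that $\theta = (8192\pi/6270)\,x$. Under these assumptions it arrives at the same four halvings and the same K as in (7).

```python
import math

X_MAX = 3135 / 8192
N = 4096    # assumed grid density
K_BITS = 5  # assumed number of fractional bits for K


def ta1(x):
    return 1.0 - 8.0 * x * x * (1.0 - x * x)


xs = [X_MAX * i / N for i in range(N + 1)]
t1 = [ta1(x) for x in xs]
err = [t - math.cos(math.pi / 2 * i / N) for i, t in enumerate(t1)]
shape = [t * (1.0 - t) for t in t1]  # correction shape TA1(x)(1 - TA1(x))

max_err, max_shape = max(err), max(shape)

# Step 1: halve while the halved peak does not drop below the peak of err(x).
shifts = 0
while max_shape / 2 ** (shifts + 1) >= max_err:
    shifts += 1

# Step 2: match the peaks and round K to a K_BITS-bit binary fraction.
k = round(max_err / (max_shape / 2**shifts) * 2**K_BITS) / 2**K_BITS

# Largest remaining deviation between err(x) and the fitted err1(x).
residual = max(abs(e - k * s / 2**shifts) for e, s in zip(err, shape))
print(f"shifts = {shifts}, K = {k}, residual = {residual:.2e}")
```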

# B. Second-Order Approximation

A simple way to further reduce the error between the cosine function and the approximation is to subtract the fitted error term from the first-order result, which gives the following second-order difference method:

$$TA2(x) = TA1(x) - err1(x), \quad 0 \le x \le \frac{3135}{8192}. \tag{8}$$





Fig. 9. Graphical solutions for maximum error in err2(x).

Fig. 8 shows (TA2(x) - cosine) graphically. The maximum error read from the figure is  $0.8 \times 10^{-4} < 1.22 \times 10^{-4} = 1/2^{13}$ . Hence, we conclude that the output resolution of our proposed method is guaranteed to be 13 bits, which is finer than that of the prior works listed in Table I. In other words, a trigonometric  $4\theta$  approximation with error correction for sinusoidal output is attained.
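The 13-bit claim can be cross-checked in floating point. The following sketch is an illustration under stated assumptions, not the authors' simulation: it uses $K = 0.84375$ from (7), assumes that x and $\theta$ sweep their ranges proportionally, and picks an arbitrary grid size `N`. Finite word-length effects are not modeled.

```python
import math

X_MAX = 3135 / 8192
A = 0.84375 * 0.5**4  # K * (0.5)^4 from (7)


def ta1(x):
    return 1.0 - 8.0 * x * x * (1.0 - x * x)


def ta2(x):
    """Equation (8): TA1(x) minus the fitted error term err1(x)."""
    t1 = ta1(x)
    return t1 - A * t1 * (1.0 - t1)


N = 4096  # assumed grid density
worst = 0.0
for i in range(N + 1):
    x = X_MAX * i / N            # x sweeps [0, 3135/8192]
    theta = math.pi / 2 * i / N  # theta sweeps [0, pi/2] in step with x
    worst = max(worst, abs(ta2(x) - math.cos(theta)))

bits = math.floor(-math.log2(worst))
print(f"max |TA2 - cos| = {worst:.2e}, i.e. at least {bits} bits")
```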

# C. Analytic Solutions

The difference between TA2(x) and the cosine function is denoted err2(x):

$$\operatorname{err2}(x) = \operatorname{TA2}(x) - \cos\theta, \quad 0 \le x \le \frac{3135}{8192}, \quad 0 \le \theta \le \frac{\pi}{2} \tag{9}$$

$$\operatorname{TA2}(x) = \operatorname{TA1}(x) - A \cdot \operatorname{TA1}(x)(1 - \operatorname{TA1}(x)), \quad A = K \cdot (0.5)^4 = 0.84375 \cdot (0.5)^4 \tag{10}$$

$$\operatorname{TA1}(x) = 1 - 8x^2(1 - x^2). \tag{11}$$



Fig. 10. Proposed ROM-less DDFS architecture.



Fig. 11. Verilog simulation results [Note: "MSB" is the sign bit of TA2(x)].

By substituting (10) and (11) into (9), we obtain the complete expression of  $\operatorname{err2}(x)$ . We then take its first-order derivative and solve  $\operatorname{err2}'(x) = 0$  to obtain the following:

$$\begin{aligned}
\operatorname{err2}'(x) &= \operatorname{TA2}'(x) - (\cos \theta)' = 0 \\
0 &= (32x^3 - 16x)(16Ax^4 - 16Ax^2 + A + 1) - (\cos \theta)' \\
0 &= (32x^3 - 16x)(16Ax^4 - 16Ax^2 + A + 1) + \frac{8192\pi}{6270}\sin\theta, \quad \text{where } \theta = \frac{8192\pi}{6270}x.
\end{aligned} \tag{12}$$

Equation (12) is solved graphically by plotting its two terms, as shown in Fig. 9, where the solid line denotes the first term and the dashed line denotes the second term. The two curves intersect at two points, whose locations exactly match the maximum and the minimum of the curve in Fig. 8, respectively. This agreement confirms that the error of the proposed method stays within the bound read from Fig. 8.
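The stationary points of err2(x) can also be located numerically instead of graphically. The sketch below evaluates the last line of (12) on a grid and refines each sign change by bisection. The grid size, the bisection helper, and the function names are assumptions made for illustration. The point x = 0, where (12) holds trivially, is skipped.

```python
import math

X_MAX = 3135 / 8192
A = 0.84375 * 0.5**4
SCALE = 8192.0 * math.pi / 6270.0  # d(theta)/dx in (12)


def d_err2(x):
    """Last line of (12): TA2'(x) + SCALE * sin(theta)."""
    poly = (32 * x**3 - 16 * x) * (16 * A * x**4 - 16 * A * x**2 + A + 1)
    return poly + SCALE * math.sin(SCALE * x)


def err2(x):
    """Equation (9) with (10) and (11) substituted."""
    t1 = 1 - 8 * x * x * (1 - x * x)
    return t1 - A * t1 * (1 - t1) - math.cos(SCALE * x)


def bisect(f, lo, hi, iters=60):
    """Plain bisection on a bracket [lo, hi] where f changes sign."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


N = 1024  # assumed grid density
grid = [X_MAX * i / N for i in range(1, N + 1)]
roots = [bisect(d_err2, a, b)
         for a, b in zip(grid, grid[1:]) if d_err2(a) * d_err2(b) < 0]

for r in roots:
    print(f"theta = {r / X_MAX * 90:.1f} deg, err2 = {err2(r):+.2e}")
```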

# D. System Implementation

Fig. 10 shows our architecture, which is based on the proposed  $4\theta$  approximation method.  $f_{ctrl}$  decides the accumulating step of the phase accumulator and controls the DDFS output frequency. The TA1 approximation block calculates TA1(x), while the error approximation block calculates err1(x). TA2(x) is generated by subtracting err1(x) from TA1(x). TA2(x) is the digital output of the proposed DDFS, while  $f_{DDFS}$  is the analog output.
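The data flow described above can be mimicked by a short behavioral model. The sketch below is a floating-point illustration only, not the fabricated circuit: the multiplier word lengths are not modeled, the quadrant folding that extends TA2(x) from $[0, \pi/2]$ to a full cosine period is an assumption, and all names are hypothetical. The 12-bit phase word follows the phase resolution stated in the abstract.

```python
import math

PHASE_BITS = 12  # phase resolution stated in the abstract
X_MAX = 3135 / 8192
A = 0.84375 * 0.5**4


def ta2(x):
    """TA1 block followed by the error-approximation block and the subtractor."""
    t1 = 1 - 8 * x * x * (1 - x * x)  # TA1(x)
    return t1 - A * t1 * (1 - t1)     # TA1(x) - err1(x)


def ddfs_cos(phase):
    """Cosine sample for one phase word (assumed quadrant folding)."""
    quarter = 1 << (PHASE_BITS - 2)
    quadrant, offset = divmod(phase, quarter)
    if quadrant in (1, 3):  # mirror the phase in the odd quadrants
        offset = quarter - offset
    magnitude = ta2(X_MAX * offset / quarter)
    return -magnitude if quadrant in (1, 2) else magnitude


def run(f_ctrl, n_samples):
    """Phase accumulator driving the approximation blocks."""
    phase, out = 0, []
    for _ in range(n_samples):
        out.append(ddfs_cos(phase))
        phase = (phase + f_ctrl) % (1 << PHASE_BITS)
    return out


samples = run(f_ctrl=64, n_samples=64)  # one output period for this control word
print(samples[:4])
```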

#### **III. SIMULATION AND SYSTEM IMPLEMENTATION**

# A. System-Level Simulation

ModelSim (Mentor Graphics) and MATLAB (MathWorks) are the software tools used for the system-level simulations. The steps that we adopted are summarized as follows.

1) The design in Fig. 10 is coded in Verilog and simulated with ModelSim. The decimal output data in a 12-bit format are collected. Fig. 11 shows the result of this step.

2) The collected data are fed into MATLAB, and the fast Fourier transform (FFT) command is executed to obtain the spectrum.

Fig. 12 shows that the spurious performance of the proposed method reaches -130 dBc, which is far better than that of the prior works listed in Table I. Table I summarizes the performance of the proposed work and prior methods.

|                       | [1]            | [2]            | [6]          | [7]           | [8]            | [9]          | ours             |
|-----------------------|----------------|----------------|--------------|---------------|----------------|--------------|------------------|
| resolution (bit)      | 11             | 10             | 10           | 12            | 11             | 9            | 13               |
| spurious (dBc)        | -55            | -55            | -62.8        | -90.3         | -60            | -60          | -130             |
| process               | GaAs HBT       | GaAs HBT       | 0.6 µm CMOS  | 1.25 µm CMOS  | 0.35 µm CMOS   | 0.8 µm CMOS  | 0.35 µm CMOS     |
| power (mW)            | 5000 @ 500 MHz | 5800 @ 500 MHz | 780 @ 10 MHz | 950 @ 100 MHz | 15.52 @ 83 MHz | 9.5 @ 30 MHz | 13.53 @ 100 MHz  |
| area (mm<sup>2</sup>) | 2.8            | 15.58          | N/A          | 21.09         | 0.18           | 0.9          | 0.31             |
| DAC                   | Yes            | Yes            | No           | No            | No             | No           | No               |
| ROM                   | Yes            | Yes            | No           | Yes           | No             | Yes          | No               |

TABLE I Performance Comparison



Fig. 12. Spurious performance of the proposed DDFS.



Fig. 13. Die photo of the proposed DDFS.

#### B. Implementation and Testing

In order to verify the correctness and performance of the proposed design, we use the Taiwan Semiconductor Manufacturing Company (TSMC) 0.35-$\mu$m 1P4M CMOS process to implement the entire circuit. Fig. 13 shows the die photo of the proposed design. The core area is  $559 \times 557\ \mu\text{m}^2$ . Fig. 14 shows the test system. The logic analyzer–pattern generator (HP 1660CP) sends the clock and the control signals to the DDFS chip and monitors the DDFS digital outputs. The DAC (TI DAC2904) converts the digital cosine outputs of the DDFS chip into an analog sinusoidal signal, which is displayed on the oscilloscope (HP 54616C). The system operation clock is up to 100 MHz without any pipelining. The power consumption of the DDFS chip and the DAC is 13.53 mW at a 100-MHz system clock. With a high-order LPF design, good noise suppression, and clock synchronization, the measured spectrum is expected to come much closer to the ideal performance.

Fig. 14. Test system.

#### **IV. CONCLUSION**

In this paper, we have presented a novel method utilizing the trigonometric quadruple angle formula to reduce the spurious tones of DDFSs. The second-order approximation demonstrates that the noise power of the harmonics can be suppressed. We physically fabricated the proposed DDFS, which shows a 13-bit digital amplitude resolution at a 100-MHz system operation clock.

#### REFERENCES

- [1] G. W. Kent and N.-H. Sheng, "A high purity, high speed direct digital synthesizer," in *Proc. 1995 49th IEEE Int. Frequency Control Symp.*, 1995, pp. 207–211.
- [2] G. Van Andrews, J. B. Delaney, M. A. Vernon, M. P. Harris, C. T. M. Chang, T. C. Eiland, C. E. Hastings, V. I. DiPerna, M. C. Brown, and W. A. White, "Recent progress in wideband monolithic direct digital synthesizers," in *IEEE MTT-S Int. Microwave Symp. Dig.*, vol. 3, 1996, pp. 1347–1350.
- [3] M. J. Flanagan and G. A. Zimmerman, "Spur-reduced digital sinusoid synthesis," *IEEE Trans. Commun.*, vol. 43, pp. 2254–2262, July 1995.

- [4] V. F. Kroupa, V. Cizek, J. Stursa, and H. Svandova, "Spurious signals in direct digital frequency synthesizers due to the phase truncation," *IEEE Trans. Ultrason., Ferroelect., Freq. Contr.*, vol. 47, pp. 1166–1172, Sept. 2000.
- [5] A. M. Sodagar and G. R. Lahiji, "A novel architecture for ROM-less sine-output direct digital frequency synthesizers by using the 2nd-order parabolic approximation," in *Proc. 2000 IEEE Int. Frequency Control Symp. Exhibition*, 2000, pp. 284–289.
- [6] —, "Mapping from phase to sine-amplitude in direct digital frequency synthesizer using parabolic approximation," *IEEE Trans. Circuits Syst. II*, vol. 47, pp. 1452–1457, Dec. 2000.
- [7] H. T. Nicholas III and H. Samueli, "A 150-MHz direct digital frequency synthesizer in 1.25-μm CMOS with -90-dBc spurious performance," *IEEE J. Solid-State Circuits*, vol. 26, pp. 1959–1969, Dec. 1991.
- [8] D. D. Caro, E. Napoli, and A. G. M. Strollo, "ROM-less direct digital frequency synthesizers exploiting polynomial approximation," in *Proc. 9th Int. Conf. Electronics, Circuits Systems*, vol. 2, Sept. 2002, pp. 15–18.
- [9] A. Bellaouar, M. S. O'brecht, A. M. Fahim, and M. I. Elmasry, "Low-power direct digital frequency synthesis for wireless communication," *IEEE J. Solid-State Circuits*, vol. 35, pp. 385–390, Mar. 2000.



**Chua-Chin Wang** (M'97) was born in Taiwan, R.O.C., in 1962. He received the B.S. degree in electrical engineering from National Taiwan University, Taipei, Taiwan, in 1984 and the M.S. and Ph.D. degrees in electrical engineering from the State University of New York at Stony Brook in 1988 and 1992, respectively.

In 1992, he joined the Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan, where he is currently a Professor. His recent research interests include VLSI design, low-power and high-speed logic circuit design, neural networks, and wireless communication.



Yih-Long Tseng was born in Taiwan, R.O.C., in 1975. He received the B.S. and M.S. degrees in electrical engineering from National Sun Yat-Sen University, Kaohsiung, Taiwan, in 1997 and 1999, respectively. He is currently working toward the Ph.D. degree in electrical engineering at National Sun Yat-Sen University.

His recent research interests include VLSI design, wireless communication, and video decoding systems.



Hsien-Chih She was born in Taiwan, R.O.C., in 1977. He received the B.S. degree in electrical engineering from National Chung-Hsing University, Taichung, Taiwan, in 2000 and the M.S. degree in electrical engineering from National Sun Yat-Sen University, Kaohsiung, Taiwan, in 2002.

He is currently an Analog IC Design Engineer with the VastView Technology Inc., Hsin-Chu, Taiwan. His research interests include clock generator and clock synthesizer.



**Chih-Chen Li** was born in Taiwan, R.O.C., in 1979. He received the M.S. degree in electrical engineering from National Sun Yat-Sen University, Kaohsiung, Taiwan, in 2003.

Since 2003, he has been with the Analog Department of Terax Communication Technologies, Hsin-Chu, Taiwan. His recent research interests include CMOS mixed-signal design, and delta-sigma A/D or D/A converters for audio and RF applications.



**Ron Hu** was born in Tainan, Taiwan, R.O.C., in 1962. He received the B.S. degree from National Taiwan Institute of Technology, Taipei, Taiwan, in 1987, the M.S. degree from Utah State University, Logan, in 1990, and the Ph.D. degree from the State University of New York at Stony Brook, in 1994, all in electrical engineering.

In 1994, he joined Holtek Semiconductor Inc., Hsin-Chu, Taiwan. Since 2001, he has been General Manager of Asuka Semiconductor Inc., Hsin-Chu, Taiwan. His research interests include consumer product circuit design and wireless communication.