# FPGA Based Efficient WCDMA DUC for Software Radios

Rajesh Mehra

Abstract— This paper presents an efficient method to design and implement a WCDMA based digital up converter (DUC) for software radios. The Parks-McClellan algorithm is used to obtain an optimal filter length and so reduce hardware requirements. A computationally efficient polyphase decomposition structure is also proposed to optimize both speed and area. The embedded multipliers and LUTs of the target FPGA are efficiently utilized to enhance system performance. The proposed DUC has been designed with Matlab, synthesized with the Xilinx Synthesis Tool (XST) and implemented on a Virtex-II Pro based xc2vp30-7ff896 FPGA device. The developed DUC can operate at a maximum frequency of 136.37 MHz while consuming 0.10313 W of power at 25°C junction temperature. The proposed design has been implemented on a multiplier based target FPGA to provide a cost effective solution for SDR based wireless applications.

*Index Terms*—3G Mobile Communication, Digital Filters, Software Radio, Radio Frequency.

## I. INTRODUCTION

There are many applications where the sampling rate must be changed. Interpolators and decimators are used to increase or decrease the sampling rate, with the up sampler and down sampler changing the sampling rate of the digital signal in multirate DSP systems. This rate conversion produces undesired signals associated with aliasing and imaging errors [1], so a filter must be placed to attenuate these errors. The filter is the key element of the interpolator. Single-rate digital filters have been used to perform the sampling rate conversion process; however, they proved to be slow in terms of processing time because of the many filter taps that must be used. Multirate filters were therefore developed to perform sampling rate conversion more efficiently, resulting in fewer filter taps compared to single-rate filters [2,3]. Using variable sampling rates in a digital signal processing application can improve the flexibility of a software defined radio. It reduces the need for expensive analog anti-aliasing filters and enables processing of different types of signals with different sampling rates. It also allows the high-speed processing to be partitioned into multiple parallel lower-speed processing tasks, which can lead to a significant saving in computational power and cost.

Due to a growing demand for such complex DSP applications, high performance, low-cost SoC implementations of DSP algorithms are receiving increased attention among researchers and design engineers. ASICs and DSP chips have been the traditional solution for high performance applications, but the technology and the market demands are now looking for changes. On one hand, the high development costs and time-to-market factors associated with ASICs can be prohibitive for certain applications while, on the other hand, programmable DSP processors can be unable to meet the desired performance due to their sequential-execution architecture. In this context, embedded FPGAs offer a very attractive solution that balances high flexibility, time-to-market, cost and performance. The digital up converter (DUC) is a core module of the wireless communication transmitter. One significant advantage of this converter is that the requirements on the analog components are relaxed.

An ideal SDR base station would perform all signal processing tasks in the digital domain. However, current-generation wideband data converters cannot support the processing bandwidth and dynamic range required across different wireless standards. As a result, the analog-to-digital converter (ADC) and the digital-to-analog converter (DAC) are usually operated at an intermediate frequency (IF), and separate wideband analog front ends are used for the subsequent signal processing up to the radio frequency (RF) stages. Digital IF extends the scope of digital signal processing (DSP) beyond the baseband domain, out toward the antenna and the RF domain. This increases the flexibility of the system while reducing manufacturing costs. Moreover, digital frequency conversion provides greater flexibility and higher performance in terms of attenuation and selectivity than traditional analog techniques.

Manuscript received on December 9, 2010. This work was supported by National Institute of Technical Teachers' Training & Research, Sector-26, Chandigarh, U.T. India-160019

Rajesh Mehra is presently Faculty Member in Electronics & Communication Engineering Department, National Institute of Technical Teachers' Training & Research, Chandigarh, India; e-mail: rajeshmehra@yahoo.com.

The DUC section of an SDR mainly consists of a root raised cosine (RRC) filter, an interpolator and a numerically controlled oscillator (NCO). In digital up conversion, the input data is baseband filtered and interpolated before it is quadrature modulated with a tunable carrier frequency. The interpolating baseband filter is implemented as a finite impulse response (FIR) filter, which can use an optimal fixed or an adaptive filter architecture. The NCO can be built from a wide range of oscillator architectures that offer very high performance, with spurious-free dynamic range in excess of 115 dB. Depending on the number of frequency assignments to be supported, digital up converters can be implemented in an FPGA. 3G code-division multiple access (CDMA)-based systems and multi-carrier systems such as orthogonal frequency division multiplexing (OFDM) exhibit signals with high crest factors (peak-to-average ratios). Such signals drastically reduce the efficiency of the power amplifiers (PAs) used in the base stations, so crest factor reduction (CFR) is used to overcome this problem.
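The quadrature modulation step mentioned in this section can be written out explicitly. As a sketch, assume the interpolated in-phase and quadrature samples are denoted $I[n]$ and $Q[n]$, the NCO carrier frequency is $f_c$ and the IF sample rate is $f_s$. These symbols are introduced here for illustration and do not appear in the paper. The real IF output is then commonly expressed as:

$$s[n] = I[n]\cos\left(\frac{2\pi f_c n}{f_s}\right) - Q[n]\sin\left(\frac{2\pi f_c n}{f_s}\right)$$

The sign of the second term is a convention; the NCO supplies the cosine and sine sequences, and changing $f_c$ retunes the carrier without touching the filter chain.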

In the DUC, pulse shaping, interpolation and frequency translation are the processes that are executed. Pulse shaping filters are at the heart of many modern data transmission systems such as mobile phones, HDTV and SDR, where they keep a signal within its allotted bandwidth, maximize its data transmission rate and minimize transmission errors. The ideal pulse shaping filter has two properties:

- i. A high stop band attenuation to reduce the inter channel interference as much as possible.
- ii. Minimized inter symbol interferences (ISI) to achieve a bit error rate as low as possible.

The RRC filters are required to avoid inter-symbol interference and to constrain the amount of bandwidth required for transmission [4]. The root raised cosine (RRC) filter is a favorable choice for pulse shaping because its transition band is shaped like a cosine curve and, combined with the matched RRC filter at the receiver, its response meets the Nyquist criterion. The first Nyquist criterion states that, in order to achieve ISI-free transmission, the impulse response of the shaping filter should have zero crossings at multiples of the symbol period [5]. A time-domain sinc pulse meets these requirements since its frequency response is a brick wall, but this filter is not realizable.

We can, however, approximate it by sampling the impulse response of the ideal continuous filter. The sampling rate must be at least twice the symbol rate of the message to be transmitted. That is, the filter must interpolate the data by at least a factor of two, and often more, to simplify the analog circuitry. In its simplest system configuration, a pulse shaping interpolator at the transmitter is associated with a simple down sampler at the receiver. The linear phase FIR structure is efficient because it takes advantage of the symmetrical coefficients and uses half the required multiplications and additions [6].
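The zero-crossing property behind the first Nyquist criterion can be checked numerically. The following is a minimal sketch, not the paper's Matlab design: it samples an ideal raised-cosine pulse (the combined response of two matched RRC filters) at two samples per chip, using the roll-off of 0.22 that Section III adopts. The function name, the 6-chip span and the variable names are assumptions made for illustration.

```python
import math


def raised_cosine(t, beta):
    """Raised-cosine pulse value at time t, where t is in symbol (chip) periods."""
    if t == 0.0:
        return 1.0
    sinc = math.sin(math.pi * t) / (math.pi * t)
    denom = 1.0 - (2.0 * beta * t) ** 2
    if abs(denom) < 1e-12:
        # Removable singularity at |t| = 1 / (2 * beta): use the limit value.
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * beta * t) / denom


BETA = 0.22            # roll-off factor used for the RRC filter in Section III
SAMPLES_PER_CHIP = 2   # minimum oversampling mentioned in the text
SPAN_CHIPS = 6         # hypothetical truncation length on each side

indices = range(-SPAN_CHIPS * SAMPLES_PER_CHIP, SPAN_CHIPS * SAMPLES_PER_CHIP + 1)
taps = [raised_cosine(k / SAMPLES_PER_CHIP, BETA) for k in indices]

# Samples that fall on non-zero multiples of the chip period should vanish.
isi_samples = [v for k, v in zip(indices, taps) if k != 0 and k % SAMPLES_PER_CHIP == 0]
max_isi = max(abs(v) for v in isi_samples)

print("number of taps:", len(taps))
print("largest sample at a chip instant other than the centre:", max_isi)
```

The samples that land between chip instants are non-zero, which is expected: only the values at the decision instants have to vanish for ISI-free transmission.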

The rectangular pulse, by definition, meets criterion number one because it is zero at all points outside of the present pulse interval. It clearly cannot cause interference during the sampling time of other pulses. The trouble with the rectangular pulse, however, is that it has significant energy over a fairly large bandwidth, which makes it unsuitable for modern transmission systems. This is where pulse shaping filters come into play. If the rectangular pulse is not the best choice for band-limited data transmission, then what pulse shape will limit bandwidth, decay quickly, and provide zero crossings at the pulse sampling times? The raised cosine pulse is used to solve this problem in a wide variety of modern data transmission systems [7]. The up sampler is the basic sampling rate alteration device used to increase the sampling rate by an integer factor. The DUC of an SDR can be implemented on different hardware platforms; in particular, it is widely recognized that DSPs and FPGAs are the most suitable. However, when high data rates are involved, FPGAs better fit the system requirements [8]. This paper therefore focuses on the efficient design of a DUC on an FPGA target device.

## II. DUC DESIGN REQUIREMENTS

The DUC provides pulse shaping, interpolation, and frequency translation of the single-carrier baseband WCDMA signal from 0 Hz to a set of specified centre frequencies to meet the 3rd Generation Partnership Project (3GPP) TS 25.104 specification, which defines the transmission and reception requirements for the base station radio [9]. The WCDMA DUC design parameters are shown in Table 1. The designed DUC should be optimal in both system performance and hardware resource usage. The interpolation filter chain in the DUC needs to pulse-shape and up sample the baseband data by a factor of 61.44/3.84 = 16.
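The rate arithmetic above can be reproduced in a few lines. This sketch takes the chip rate and IF sample rate from the paper; the split into four cascaded interpolate-by-2 stages is one possible decomposition, chosen here as an assumption because it matches the multistage approach discussed later in this section.

```python
CHIP_RATE_MSPS = 3.84   # baseband chip rate
IF_RATE_MSPS = 61.44    # IF sample rate

# Overall interpolation factor required from the filter chain.
total_factor = round(IF_RATE_MSPS / CHIP_RATE_MSPS)

# Assumed decomposition: an RRC stage followed by three more x2 stages.
stage_factors = [2, 2, 2, 2]

stage_rates = []
rate = CHIP_RATE_MSPS
for factor in stage_factors:
    rate *= factor
    stage_rates.append(rate)

print("overall interpolation factor:", total_factor)
print("sample rate after each stage (MSPS):", stage_rates)
```

Each stage only has to double the rate, so every filter in the chain runs at the lowest sample rate that its position allows.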

TABLE 1

WCDMA DUC DESIGN PARAMETERS

| No. | Parameter                  | Value                                                              |
|-----|----------------------------|--------------------------------------------------------------------|
| 1.  | Carrier Bandwidth          | 5.0 MHz                                                            |
| 2.  | Number of Carriers         | 1 carrier                                                          |
| 3.  | Baseband Chip Rate         | 3.84 MCPS                                                          |
| 4.  | IF Sample Rate             | 61.44 MSPS (16 × 3.84 MSPS)                                        |
| 5.  | Input Signal Quantization  | 16-bit I and Q (Complex)                                           |
| 6.  | Output Signal Quantization | 16-bit I and Q (Complex)                                           |
| 7.  | Mixer Properties           | Tunability: Variable<br>Resolution: ~0.25 Hz<br>SFDR: up to 115 dB |
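The mixer resolution of about 0.25 Hz in the table can be related to an NCO word length. The paper does not describe the NCO architecture, so this sketch assumes a conventional phase-accumulator NCO clocked at the IF sample rate, whose frequency resolution is the clock rate divided by 2 raised to the accumulator width. The 10 MHz carrier and the `tuning_word` helper are hypothetical examples.

```python
import math

FS_HZ = 61.44e6           # IF sample rate from the design parameters
TARGET_RES_HZ = 0.25      # approximate mixer resolution from the design parameters

# Smallest accumulator width (in bits) that meets the target resolution,
# assuming resolution = FS_HZ / 2**bits for a phase-accumulator NCO.
acc_bits = math.ceil(math.log2(FS_HZ / TARGET_RES_HZ))
resolution_hz = FS_HZ / 2 ** acc_bits


def tuning_word(carrier_hz):
    """Phase increment per clock for the requested carrier (hypothetical helper)."""
    return round(carrier_hz / FS_HZ * 2 ** acc_bits) % 2 ** acc_bits


example_carrier_hz = 10e6  # hypothetical carrier frequency
word = tuning_word(example_carrier_hz)
actual_hz = word * resolution_hz

print("accumulator bits:", acc_bits)
print("frequency resolution (Hz):", resolution_hz)
print("tuning word:", word, "-> actual carrier (Hz):", actual_hz)
```

Under this assumption the achieved carrier is never more than half a resolution step away from the requested one.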

There are several options to perform the specified rate change. The first option is to design an interpolator with an up sampling factor of 16 and pulse shaping capability in a single stage. This is impractical because it is often difficult to design such a filter to meet the required spectral mask within a reasonable filter length [10]-[12]. Even if achievable, such a design results in extremely high computational complexity. The second option is to decompose the rate conversion into multiple interpolation stages. It is practical to design an RRC channel filter with an up sampling factor of 2 and a lower order which can meet the system performance requirement. After the channel filter, the signal still needs to be up sampled by 8, with the aliasing effects removed in this process. Four possible configurations are shown in Table 2.



Fig.1 Hardware Efficient DUC

TABLE 2

FILTER CONFIGURATIONS

| Anti-Aliasing Filter | Filter Length & Sample Rate for 1st Filter | Filter Length & Sample Rate for 2nd Filter (if any) | Filter Length & Sample Rate for 3rd Filter (if any) |
|----------------------|--------------------------------------------|-----------------------------------------------------|-----------------------------------------------------|
| Configuration 1      | 91 taps (↑8)<br>61.44 MSPS                 | -                                                   | -                                                   |
| Configuration 2      | 47 taps (↑4)<br>30.72 MSPS                 | 11 taps (↑2)<br>61.44 MSPS                          | -                                                   |
| Configuration 3      | 23 taps (↑2)<br>15.36 MSPS                 | 25 taps (↑4)<br>61.44 MSPS                          | -                                                   |
| Configuration 4      | 23 taps (↑2)<br>15.36 MSPS                 | 11 taps (↑2)<br>30.72 MSPS                          | 11 taps (↑2)<br>61.44 MSPS                          |

The last configuration results in the most cost effective solution because its hardware implementation requires fewer multipliers than the others; its block diagram is shown in Fig. 1.
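A rough way to compare the four configurations is to add up their tap counts. The sketch below uses the tap counts and interpolation factors listed for each configuration; treating the total tap count as a proxy for multiplier cost is an assumption made here, since the real multiplier count also depends on coefficient symmetry, half band zeros and resource sharing.

```python
# (taps, interpolation factor) for each anti-aliasing stage of a configuration
configurations = {
    "Configuration 1": [(91, 8)],
    "Configuration 2": [(47, 4), (11, 2)],
    "Configuration 3": [(23, 2), (25, 4)],
    "Configuration 4": [(23, 2), (11, 2), (11, 2)],
}

total_taps = {}
for name, stages in configurations.items():
    overall_factor = 1
    for _, factor in stages:
        overall_factor *= factor
    # Every configuration must still up sample by 8 after the RRC filter.
    assert overall_factor == 8
    total_taps[name] = sum(taps for taps, _ in stages)

fewest = min(total_taps, key=total_taps.get)

for name, taps in total_taps.items():
    print(name, "->", taps, "taps in total")
print("fewest taps:", fewest)
```

With this simple measure the three-stage cascade comes out lowest, which is consistent with the choice made in the text.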

## III. PROPOSED WCDMA DUC DESIGN

An optimized digital up converter (DUC) is designed to meet the WCDMA specifications using a multistage half band interpolator [13]. The RRC filter is designed with a roll-off factor of $\alpha = 0.22$. When it is convolved with the matched RRC filter, the overall raised-cosine response has no inter-chip interference (ICI) because the zero crossings occur at chip intervals [14]. A 45-tap symmetric RRC filter with a Chebyshev window and 27 dB side lobe attenuation meets all of the requirements.



Fig 2. RRC Response

Chebyshev windowing provides better side lobe suppression than a rectangular window at the expense of some widening of the main lobe. This window results in a lower filter order for similar performance compared to other windows such as Hann, Blackman and Hamming. The frequency response of the RRC filter with an interpolation factor of two is shown in Fig. 2. The input, output and coefficient precision has been selected as 16 bits each to meet the WCDMA requirements. After the channel filter, a cascade of interpolation filters follows to remove the aliasing effect produced by up sampling. The signal is first up sampled by two using a half band interpolator. Half band filters are a type of FIR filter whose transition region is centred at one quarter of the sampling rate, Fs/4. The end of the pass band and the beginning of the stop band are equally spaced on either side of Fs/4 [15].

The Parks-McClellan algorithm based equiripple technique is used to limit the length of the filter. For an interpolation filter with a rate change of two, the half band filter is more suitable because it requires much less hardware, which in turn also reduces the power consumption. The hardware reduction results from the fact that every odd indexed coefficient in the time domain is zero except the centre tap, and the remaining even indexed coefficients are symmetric.
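The zero pattern that makes half band filters cheap can be seen with a small numerical experiment. Note that this sketch does not use the Parks-McClellan algorithm: to stay self-contained it swaps in a Hamming-windowed ideal half band response, which has the same zero and symmetry structure. The 23-tap length corresponds to the order-22 first half band stage; the function name and the 1e-12 threshold are assumptions.

```python
import math


def halfband_windowed_sinc(num_taps):
    """Hamming-windowed ideal half band low-pass (cutoff at Fs/4)."""
    centre = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - centre
        # Ideal half band impulse response: 0.5 at the centre, sin(pi*k/2)/(pi*k) elsewhere.
        ideal = 0.5 if k == 0 else math.sin(math.pi * k / 2) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


NUM_TAPS = 23  # order 22, as for the first half band interpolator
taps = halfband_windowed_sinc(NUM_TAPS)
centre = (NUM_TAPS - 1) // 2

zero_idx = [n for n, c in enumerate(taps) if abs(c) < 1e-12]
nonzero_idx = [n for n, c in enumerate(taps) if abs(c) >= 1e-12]

# Symmetric pairs share one multiplier; the centre tap needs one of its own.
distinct_multiplications = (len(nonzero_idx) + 1) // 2

print("zero-valued tap indices:", zero_idx)
print("non-zero taps:", len(nonzero_idx), "of", NUM_TAPS)
print("distinct multiplications using symmetry:", distinct_multiplications)
```

With 0-based indexing and the centre tap at index 11, the zero taps all sit at odd indices, so a 23-tap half band filter needs far fewer multiplications than its length suggests.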



Fig 3. First Half Band Interpolator Response



Fig 4. Second Half Band Interpolator Response

The output sample rate of the filter is four times the chip rate, i.e. Fs = 3.84 × 4 = 15.36 MSPS. The pass band was set to 3.84 × 1.22/2 = 2.34 MHz and the pass band ripple is chosen to be 0.002 dB. An equiripple half band filter was obtained with order of 22. The magnitude response of the filter is shown in Fig. 3. In the next stage, the signal is again up sampled by a factor of two using a half band interpolator. The output sample rate of the filter is eight times the chip rate, i.e. 3.84 × 8 = 30.72 MSPS. The pass band is set to 2.34 MHz along with a pass band ripple of 0.002 dB. A filter with order of 10 has been obtained, whose magnitude response is shown in Fig. 4.

The last half band interpolator provides an output sample rate of 61.44 MSPS. The pass band has been set to 2.34 MHz with a pass band ripple of 0.001 dB. A filter order of 10 is obtained, whose magnitude response is shown in Fig. 5.
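The band edges of the three half band stages follow from the pass band edge and the half band symmetry about Fs/4. This sketch uses the chip rate, roll-off, output sample rates and filter orders reported in the text; the stop band edge formula Fs/2 minus the pass band edge is the standard half band relation and is assumed here rather than quoted from the paper.

```python
CHIP_RATE_MHZ = 3.84
ROLL_OFF = 0.22

# Pass band edge of the shaped WCDMA signal: 3.84 * 1.22 / 2
passband_edge = CHIP_RATE_MHZ * (1 + ROLL_OFF) / 2

# (stage name, output sample rate in MSPS, reported filter order)
stages = [("first", 15.36, 22), ("second", 30.72, 10), ("third", 61.44, 10)]

results = []
for name, fs, order in stages:
    stopband_edge = fs / 2 - passband_edge       # mirror of the pass band edge about Fs/4
    transition = stopband_edge - passband_edge   # width of the transition band
    results.append({
        "stage": name,
        "stopband_edge": stopband_edge,
        "normalized_transition": transition / fs,
        "order": order,
    })

for r in results:
    print(r["stage"], "stage: stop band from", round(r["stopband_edge"], 4), "MHz,",
          "transition/Fs =", round(r["normalized_transition"], 4),
          "reported order", r["order"])
```

The relative transition band is narrowest for the first stage, which is consistent with that stage needing the highest order of the three.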



Fig 5. Third Half Band Interpolator Response

A 16 bit precision has been used for the input, output and coefficients. Finally, all filters are cascaded to obtain the overall response of the WCDMA digital up converter, which is shown in Fig. 6.



Fig 6. WCDMA DUC Response

## IV. HARDWARE IMPLEMENTATION RESULTS

Polyphase decomposition and pipeline registers are used to enhance the computational efficiency of the proposed DUC implementation. In this technique, the required coefficients are divided into two parts using a 2-branch polyphase decomposition. The generalized two branch polyphase DUC structure is shown in Fig. 7 and can be expressed as:

$$H(z) = E_0(z^2) + z^{-1}E_1(z^2) \tag{1}$$



Fig 7. General Poly Phase Structure

The proposed computationally efficient polyphase structure is shown in Fig. 8.



Fig 8. Computationally Efficient Poly Phase Structure

In this structure the polyphase sub-filters process the signal at the lower input sample rate, before the rate is increased, which reduces the number of multiplications required per output sample to implement the desired DUC. This reduction in computation in turn reduces the area consumption and improves the speed of the proposed design.
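The two-branch decomposition of equation (1) can be sketched in software to show that it gives the same output as up sampling followed by the full filter. This is an illustrative model only, not the paper's VHDL: the function names, the integer test signal and the 7-tap example filter are assumptions.

```python
def upsample_then_filter(x, h):
    """Reference path: insert a zero after every sample, then run the full FIR."""
    up = []
    for sample in x:
        up.extend([sample, 0])
    return [
        sum(h[k] * up[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(up))
    ]


def fir(coeffs, signal):
    """Plain causal FIR filter producing one output per input sample."""
    return [
        sum(coeffs[k] * signal[n - k] for k in range(len(coeffs)) if n - k >= 0)
        for n in range(len(signal))
    ]


def polyphase_interpolate_by_2(x, h):
    """Two-branch polyphase path: H(z) = E0(z^2) + z^-1 * E1(z^2)."""
    e0 = h[0::2]  # even-indexed coefficients form branch E0
    e1 = h[1::2]  # odd-indexed coefficients form branch E1
    even_outputs = fir(e0, x)  # both branches run at the input rate
    odd_outputs = fir(e1, x)
    y = []
    for a, b in zip(even_outputs, odd_outputs):
        y.extend([a, b])  # commutator interleaves the two branch outputs
    return y


x = [1, 2, 3, 4, 5]            # hypothetical input samples
h = [1, 2, 3, 4, 5, 6, 7]      # hypothetical filter coefficients

direct = upsample_then_filter(x, h)
poly = polyphase_interpolate_by_2(x, h)

print("direct   :", direct)
print("polyphase:", poly)
print("identical:", direct == poly)
```

In the reference path half of the products are multiplications by the inserted zeros; the polyphase form never computes them, because each branch holds only its own share of the coefficients and is clocked at the input rate.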



Fig 9. ISE Simulator Based DUC Response

VHDL code has then been developed and synthesized for the Virtex-II Pro based xc2vp30-7ff896 target device using the Xilinx Integrated Software Environment (ISE). The developed VHDL is simulated using ISE Simulator, whose output response is shown in Fig. 9. The proposed DUC can operate at a maximum frequency of 136.37 MHz. The resource consumption of the proposed design on the specified target device is shown in Table 3. The proposed design consumes 0.10313 W of power at 25°C junction temperature, as shown in Table 4.

Timing Summary (Speed Grade: -7):

- Minimum period: 7.333 ns (Maximum Frequency: 136.374 MHz)
- Minimum input arrival time before clock: 3.902 ns
- Maximum output required time after clock: 3.293 ns
- Maximum combinational path delay: No path found
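The reported maximum frequency follows directly from the minimum clock period. The sketch below redoes that conversion and, as an additional assumption not stated in the paper, compares the clock limit with the 61.44 MSPS output rate to see how many clock cycles could be available per output sample.

```python
MIN_PERIOD_NS = 7.333       # minimum period from the timing summary
OUTPUT_RATE_MSPS = 61.44    # DUC output sample rate

# Maximum clock frequency in MHz is the reciprocal of the period in ns, times 1000.
fmax_mhz = 1000.0 / MIN_PERIOD_NS

# Assumed interpretation: ratio of the clock limit to the output sample rate.
clocks_per_output_sample = fmax_mhz / OUTPUT_RATE_MSPS

print("maximum frequency (MHz):", round(fmax_mhz, 2))
print("clock limit / output sample rate:", round(clocks_per_output_sample, 2))
```

The small difference from the tool's 136.374 MHz comes from the period being rounded to three decimal places in the report.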

TABLE 3

RESOURCE UTILIZATION

| Logic Utilization          | Used | Available | Utilization |
|----------------------------|------|-----------|-------------|
| Number of Slices           | 1996 | 13696     | 14%         |
| Number of Slice Flip Flops | 3465 | 27392     | 12%         |
| Number of 4 input LUTs     | 2267 | 27392     | 8%          |
| Number of bonded IOBs      | 33   | 556       | 5%          |
| Number of MULT18X18s       | 65   | 136       | 47%         |
| Number of GCLKs            | 1    | 16        | 6%          |
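The utilization percentages can be recomputed from the used and available counts. In this sketch the counts are copied from the resource utilization table; the observation that the reported figures match the ratios rounded down to a whole percent is an inference from the numbers, not something the paper states.

```python
# resource name -> (used, available, reported utilization in percent)
resources = {
    "Slices": (1996, 13696, 14),
    "Slice Flip Flops": (3465, 27392, 12),
    "4 input LUTs": (2267, 27392, 8),
    "Bonded IOBs": (33, 556, 5),
    "MULT18X18s": (65, 136, 47),
    "GCLKs": (1, 16, 6),
}

computed = {}
for name, (used, available, reported) in resources.items():
    # Integer division truncates toward zero, i.e. rounds the percentage down.
    computed[name] = used * 100 // available
    print(f"{name}: {used}/{available} = {computed[name]}% (reported {reported}%)")

all_match = all(computed[name] == reported for name, (_, _, reported) in resources.items())
print("all reported percentages reproduced:", all_match)
```

The embedded multipliers are the most heavily used resource at just under half of the device, while slice usage stays near 14%.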

TABLE 4

POWER CONSUMPTION

| Name                  | Value            | Used | Total Available |
|-----------------------|------------------|------|-----------------|
| Clocks                | 0.00000 (W)      | 1    |                 |
| Logic                 | 0.00000 (W)      | 2272 | 27392           |
| Signals               | 0.00000 (W)      | 8231 |                 |
| IOs                   | 0.00000 (W)      | 33   | 588             |
| MULTs                 | 0.00000 (W)      | 65   | 136             |
| Total Quiescent Power | 0.10313 (W)      |      |                 |
| Total Dynamic Power   | 0.00000 (W)      |      |                 |
| Total Power           | 0.10313 (W)      |      |                 |
| Junction Temp         | 25.0 (degrees C) |      |                 |

The proposed WCDMA DUC is compared with the existing designs of [11] and [12], where System Generator based DSP48E blocks are used for implementation on a Virtex-5 based XC5VSX50T FPGA device. Implementing the design with the built-in DSP48E blocks of System Generator can improve the time-to-market factor but cannot provide a cost effective solution, because FPGAs that contain DSP48E blocks are costly compared to multiplier based FPGAs. In order to provide a cost effective solution, the proposed design has been implemented using optimized m-code on a Virtex-II Pro based xc2vp30-7ff896 target device, which contains embedded multipliers and is less costly than a Virtex-5 FPGA. The Parks-McClellan algorithm is used to obtain an optimal filter length and reduce the hardware requirement, which is further supported by the use of half band filters to reduce the computational complexity for enhanced speed. Finally, the polyphase decomposition technique is used in the hardware implementation of the proposed WCDMA DUC to optimize both speed and area together by introducing a partially serial architecture.

## V. CONCLUSION

In this paper, an optimized hardware efficient technique is presented to implement a WCDMA based digital up converter for software defined radios. The Parks-McClellan algorithm is used to obtain an optimal filter length and reduce the hardware requirement, which is further supported by the use of half band filters to reduce the computational complexity for enhanced speed. Finally, the polyphase decomposition technique is used in the hardware implementation to optimize both speed and area together by introducing a partially serial architecture. The proposed WCDMA DUC is compared with existing designs where System Generator based DSP48E blocks are used for implementation. FPGAs with DSP48E blocks are costly compared to multiplier based FPGAs. To overcome this problem, the proposed design has been implemented on a Virtex-II Pro based xc2vp30-7ff896 target device, which contains built-in multipliers. The proposed design can operate at a maximum frequency of 136.37 MHz while consuming 0.10313 W of power at 25°C junction temperature. The efficient utilization of the embedded multipliers and LUTs available on the specified target device results in enhanced speed and area efficiency, providing a cost effective solution for SDR based wireless applications.

## ACKNOWLEDGMENT

The author would like to thank Dr. Swapna Devi, Associate Professor, Electronics & Communication Engineering Department, Dr. S. Chatterji, Professor and Head, Electronics & Communication Engineering Department and Dr. S.S.Pattnaik, Professor & Head, ETV Department, NITTTR, Chandigarh for constant encouragement, and guidance during this research work.

### REFERENCES

- [1] ShyhJye Jou, Kai-Yuan Jheng, Hsiao-Yun Chen and An-Yeu Wu, "Multiplierless Multirate Decimator/Interpolator Module Generator", IEEE Asia-Pacific Conference on Advanced System Integrated Circuits, pp. 58-61, Aug-2004.
- [2] Haipeng Kuang, Dejiang Wang, Gang Zhoul, Zhengping XU, "A Multi-Channel, Area-Efficient, Audio Sampling Rate Interpolator", IEEE 8<sup>th</sup> International Conference on ASIC, pp.-21-24, ASICON-2009.
- [3] Binming Luo, Yuanfu Zhao, and Zongmin Wang, "An Area-efficient Interpolator Applied in Audio Σ-Δ DAC" Third International IEEE Conference on Signal-Image Technologies and Internet-Based System, pp.538-541, 2008
- [4] K Macpherson, I Stirling, D Garcia, G Rice, R Stewart "Arithmetic Implementation Techniques and Methodologies for 3G Uplink Reception in Xilinx FPGAs" IEE Conference on 3G Mobile Communication Technologies, pp. 191-195, IEE-2002.
- [5] G. Mazzini, G. Setti, and R. Rovatti, "Chip pulse shaping in asynchronous chaos-based DS-CDMA," IEEE Trans. Circuits Syst. I, vol. 54, no. 10, pp. 2299–2314, Oct. 2007.
- [6] J. Chandran, R. Kaluri, J. Singh, V. Owall and R. Veljanovski "Xilinx Virtex II Pro Implementation of a Reconfigurable UMTS Digital Channel Filter" IEEE Workshop on Electronic Design, Test and Applications, pp.77-82, DELTA-2004.
- [7] Ali Al-Haj, "An Efficient Configurable Hardware Implementation of Fundamental Multirate Filter Banks", 5th International Multi-Conference on Systems, Signals and Devices, pp.1-5, IEEE SSD 2008.
- [8] Rajesh Mehra, Dr.Swapna Devi, "Reconfigurable Design of an Area Efficient Digital up Converter for SDR Based Wireless Communication Systems" Journal of Communication and Computer, USA, Volume 7, No.7, Serial No.68, pp. 28-32, July 2010.
- [9] Xilinx Corp., Application notes: Virtex-5 Spartan DSP FPGAs, XAPP1018 (v1.0), pp.8-16, October 22, 2007.
- [10] N.M.Zawawi, M.F.Ain, S.I.S.Hassan, M.A.Zakariya, C.Y.Hui and R.Hussin, "Implementing WCDMA Digital Up Converter In FPGA" IEEE International RF and Microwave Conference, pp. 91-95, RFM-2008.

- [11] Wang Wei, Zeng Yifang, Yan Yang, "Efficient Wireless Digital Up Converters Design Using System Generator" IEEE 9<sup>th</sup> International Conference on Signal Processing, pp.443-446, ICSP-2008.
- [12] Lin Fei-yu, Qiao Wei-ming, Jiao Xi-xiang, Jing Lan; Ma Yun-hai "Efficient Design of Digital Up Converter for WCDMA In FPGA Using System Generator" IEEE International Conference on Information Engineering and Computer Science, pp. 1-4, ICIECS 2009.
- [13] Mathworks, "Users Guide Filter Design Toolbox", March-2007.
- [14] Rajesh Mehra, Dr. Swapna Devi, "Area Efficient & Cost Effective Pulse Shaping Filter for Software Radios" International Journal of Ad hoc, Sensor & Ubiquitous Computing (IJASUC) Vol.1, No.3, pp. 85-91, September 2010.
- [15] S K Mitra, Digital Signal Processing, Tata Mc Graw Hill, Third Edition, 2006.

Author:



Rajesh Mehra: Mr. Rajesh Mehra is currently Assistant Professor at National Institute of Technical Teachers' Training & Research, Chandigarh, India. He is pursuing his PhD from Panjab University, Chandigarh, India. He has completed his M.E. from NITTTR, Chandigarh, India and B.Tech. from NIT, Jalandhar, India. Mr. Mehra has 14 years of academic experience. He has authored more than 35 research papers in national, international conferences and reputed journals. Mr. Mehra's interest areas are VLSI Design, Embedded System Design, Advanced Digital Signal Processing, Wireless & Mobile Communication and Digital System Design. Mr. Mehra is life member of ISTE.