Design of Digital IF Based on Parallel Processing Based on FPGA

The intermediate frequency (IF), as the name suggests, is a signal at a frequency that sits between the baseband signal and the radio frequency (RF) signal. A system can have one or more IF stages, and the IF acts as the bridge between baseband and RF.

Digital IF


As shown in Figure 1, when the intermediate frequency stage is implemented digitally, it is called a digital intermediate frequency (digital IF). Digital IF techniques typically include digital up-conversion and down-conversion (DUC/DDC), crest factor reduction (CFR), and digital pre-distortion (DPD).


The DUC converts a complex baseband signal into a real passband signal. The incoming complex baseband signal has a relatively low sampling rate, usually the symbol rate of the digital modulation. The baseband signal is filtered, converted to a higher sample rate, and then modulated onto the IF carrier frequency generated by the NCO.

The DUC usually performs pulse shaping first and then modulates the result onto the IF carrier, so that the DAC can drive the analog up-converter that follows.


In Figure 2, the channel filter performs spectral shaping of the baseband signal and is usually implemented as an FIR filter. The interpolation stage performs sample rate conversion and filtering, and can be implemented with either a CIC or an FIR filter. For a narrowband signal that needs a large sample rate change, a CIC filter is a very good fit and is better than an FIR filter in terms of performance and resource usage.
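The following is a minimal sketch of a CIC interpolator, assuming a differential delay of 1 and integer samples. The function name `cic_interpolate` and its parameters are illustrative and not taken from any Altera IP. It shows why the structure is cheap in hardware: it needs only adders and delays, with no multipliers.

```python
def cic_interpolate(samples, rate, stages):
    """Interpolate integer samples by `rate` using a `stages`-stage CIC filter."""
    data = list(samples)
    # Comb section, running at the low input rate: y[n] = x[n] - x[n-1].
    for _ in range(stages):
        previous = 0
        combed = []
        for x in data:
            combed.append(x - previous)
            previous = x
        data = combed
    # Rate change: insert (rate - 1) zeros after every sample.
    upsampled = []
    for x in data:
        upsampled.append(x)
        upsampled.extend([0] * (rate - 1))
    # Integrator section, running at the high output rate: y[n] = y[n-1] + x[n].
    for _ in range(stages):
        accumulator = 0
        integrated = []
        for x in upsampled:
            accumulator += x
            integrated.append(accumulator)
        upsampled = integrated
    return upsampled


held = cic_interpolate([1, 2, 3], rate=2, stages=1)
```

With a single stage the filter behaves as a sample-and-hold. With more stages the passband gets smoother and the DC gain grows, so a real design has to budget extra accumulator bits.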

The NCO is a numerically controlled oscillator, also known as a DDS (direct digital synthesizer). It generates a pair of mutually orthogonal sine and cosine carriers, which are mixed with the interpolated (higher sample rate) baseband signal to shift the spectrum up.
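As an illustration, an NCO can be modeled as a phase accumulator whose value is mapped to cosine and sine outputs. The sketch below assumes a 32-bit accumulator and uses `math.cos`/`math.sin` in place of the look-up table or CORDIC that a hardware implementation would typically use. The function name `nco` and its parameters are hypothetical.

```python
import math


def nco(freq_hz, sample_rate_hz, count, acc_bits=32):
    """Return `count` (cos, sin) pairs from a phase-accumulator oscillator."""
    modulus = 1 << acc_bits
    # Frequency tuning word: phase increment per sample, in accumulator units.
    step = round(freq_hz / sample_rate_hz * modulus) % modulus
    phase = 0
    pairs = []
    for _ in range(count):
        angle = 2.0 * math.pi * phase / modulus
        pairs.append((math.cos(angle), math.sin(angle)))
        # The accumulator wraps around naturally, like a hardware register.
        phase = (phase + step) % modulus
    return pairs


carrier = nco(freq_hz=1.0, sample_rate_hz=4.0, count=4)
```

The frequency resolution is the sample rate divided by 2 to the power of the accumulator width, which is why wide accumulators are common.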

In contrast to the DUC, the DDC does the following:

1. Spectrum down-shift: the useful spectrum of the digital signal delivered by the ADC is moved from IF to baseband.

2. Sample rate reduction: after the spectrum shift, the data is reduced by decimation from the high sampling rate of the ADC to a suitable lower rate.

3. Channel filtering: the I/Q signal is filtered again before it is passed to baseband processing.
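The first two steps above can be sketched as follows. This is a simplified model, not a production DDC: it assumes a boxcar (moving average) low-pass filter whose length equals the decimation factor, and it omits the final channel filter. The function name `ddc` and the test tone parameters are illustrative.

```python
import cmath
import math


def ddc(real_samples, if_freq_hz, sample_rate_hz, decimation):
    """Mix a real IF signal down to baseband, then average and decimate."""
    # Step 1: spectrum down-shift by multiplying with a complex exponential.
    mixed = [
        x * cmath.exp(-2j * math.pi * if_freq_hz * n / sample_rate_hz)
        for n, x in enumerate(real_samples)
    ]
    # Step 2: low-pass with a boxcar filter and keep one output per block.
    baseband = []
    for start in range(0, len(mixed) - decimation + 1, decimation):
        block = mixed[start:start + decimation]
        baseband.append(sum(block) / decimation)
    return baseband


fs, f_if = 8.0, 2.0
tone = [math.cos(2.0 * math.pi * f_if * n / fs) for n in range(16)]
iq = ddc(tone, f_if, fs, decimation=4)
```

For a unit-amplitude cosine exactly at the IF, each output is the complex value 0.5, because mixing a real tone splits its energy between the baseband term and an image term that the low-pass filter removes.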

In fact, digital up-conversion and down-conversion are widely used. They are indispensable functions in wireless communication, cable TV networks (cable modems), digital TV broadcasting (DVB), medical imaging equipment (ultrasound), and military applications.


At present, in many wireless communication systems, such as WCDMA and WiMAX, the intermediate frequency signal is usually formed by adding several independent baseband signals. The combined IF signal has a large peak-to-average ratio (PAR) and an approximately Gaussian amplitude distribution. However, the linear region of the power amplifier (PA) is limited, so a signal with a larger PAR forces the PA to operate with more back-off, which reduces PA efficiency. It is therefore very important to reduce the PAR of the IF signal before the PA. Crest factor reduction (CFR) performs this function: it helps keep the PA output linear, reduces out-of-band radiation, and improves PA efficiency.
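To make the effect concrete, the sketch below measures the PAR of one complex carrier and of the sum of four carriers. The carrier frequencies, the block length of 64 samples, and the helper names `papr_db` and `carrier` are assumptions chosen for illustration only.

```python
import cmath
import math


def papr_db(samples):
    """Peak-to-average power ratio of a sample block, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


def carrier(cycles, length):
    """One complex carrier with `cycles` full periods in `length` samples."""
    return [cmath.exp(2j * math.pi * cycles * n / length) for n in range(length)]


N = 64
single = carrier(1, N)
# Sum four carriers sample by sample, as a multi-carrier transmitter would.
combined = [sum(values) for values in zip(*(carrier(k, N) for k in (1, 2, 3, 4)))]

single_par = papr_db(single)
combined_par = papr_db(combined)
```

A single constant-envelope carrier has a PAR of 0 dB, while four equal carriers that align in phase reach 10·log10(4), about 6 dB. Real modulated carriers behave statistically rather than deterministically, but the trend is the same.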

At present, the CFR algorithms used at IF are hard clipping, peak windowing, and peak cancellation. Of these, peak windowing offers a moderate balance between performance and implementation complexity. Compared with peak windowing, peak cancellation has better out-of-band characteristics but consumes more FPGA resources.
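The simplest of these methods, hard clipping, can be sketched as below. The function name `clip_peaks` and the threshold value are hypothetical. The sketch limits the magnitude of each complex sample while preserving its phase; it does nothing about the out-of-band distortion that clipping creates, which is the problem that peak windowing and peak cancellation are designed to reduce.

```python
def clip_peaks(samples, threshold):
    """Limit the magnitude of complex samples to `threshold`, keeping the phase."""
    clipped = []
    for s in samples:
        magnitude = abs(s)
        if magnitude > threshold:
            # Scale the sample back onto the circle of radius `threshold`.
            clipped.append(s * (threshold / magnitude))
        else:
            clipped.append(s)
    return clipped


limited = clip_peaks([3 + 4j, 1 + 0j], threshold=2.5)
```

Here the first sample has magnitude 5 and is scaled by one half, while the second sample is below the threshold and passes through unchanged.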


In wireless communication systems, the PA output must often be highly linear to meet the stringent requirements of air interface standards, and linear power amplifiers are very expensive. To improve PA efficiency and keep cost as low as possible, the nonlinear characteristics of the PA must be corrected, and predistorting the PA input signal is a good way to do so.

DPD implementations fall into two categories: look-up table (LUT) based and polynomial based. The advantages and disadvantages of the two approaches are shown in Table 1.
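As a rough illustration of the LUT approach, the sketch below indexes a table of complex gains by the input magnitude and multiplies each sample by the selected gain. The table contents, the `full_scale` parameter, and the function name `lut_predistort` are made up for the example; a real table would be trained from measurements of the PA.

```python
def lut_predistort(samples, gain_table, full_scale):
    """Apply a magnitude-indexed complex gain to every sample."""
    last = len(gain_table) - 1
    output = []
    for s in samples:
        # Quantize the magnitude to the nearest table entry, saturating at the end.
        index = min(int(abs(s) / full_scale * last + 0.5), last)
        output.append(s * gain_table[index])
    return output


# Hypothetical table: unity gain for small signals, extra gain near full scale
# to compensate for PA compression.
table = [1.0 + 0j, 1.0 + 0j, 1.2 + 0j]
result = lut_predistort([0.1 + 0j, 1.0 + 0j], table, full_scale=1.0)
```

The LUT form needs only one complex multiplication per sample plus memory, whereas the polynomial form trades memory for arithmetic.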


FPGA Implementation Advantages

FPGA Realization of Digital IF

As broadband wireless technologies such as WiMAX and LTE mature, the digital IF bandwidth required of wireless equipment keeps growing. At the same time, multi-antenna technologies such as MIMO are widely used, so the number of digital IF channels is also increasing rapidly.

Many DSP processors struggle to meet such large bandwidth requirements in practice, while application-specific standard products (ASSPs) lack the necessary flexibility. Implementing the digital IF in an FPGA resolves this conflict between processing capability and flexibility. In addition, Altera has developed a large number of digital IF reference designs and IP cores for 3G/4G and other applications, which makes development easier and shortens the design cycle.

An FPGA is a hardware device, and it is best suited to data paths that run at high speed but have relatively simple logic.

From the earlier analysis of the DDC and DUC functions, the modules and operations needed to implement them are mainly CIC/FIR filtering, the NCO, interpolation/decimation, and mixing. These are algorithmically simple but computationally fast operations, which makes them very well suited to FPGA implementation.

From another point of view, the advantage of an FPGA over a DSP processor is its parallel architecture. Once one DDC/DUC module is complete, it can be extended to multiple DDC/DUC channels simply by duplicating it. In addition, one ADC/DAC device can be connected to several DDC/DUC channels, which makes it easy to support multi-carrier systems.


When the internal resources of the FPGA are limited, multiple DDC/DUC channels can be time-division multiplexed onto a single DDC/DUC circuit. The circuit clock must then be raised by the corresponding multiple, which is possible as long as the FPGA performance allows it. Altera has reference designs that support WCDMA, TD-SCDMA, and WiMAX.

CFR is computationally intensive. For TD-SCDMA, for example, the sampling rate ranges from 61.44 MHz to 92.16 MHz, which FPGA-based parallel processing can handle easily.

The polynomial DPD is divided into a forward module and a reverse module. The forward module is a predistorter composed of multiple FIR filters, which is very well suited to FPGA hardware, and Altera's IP cores provide complete FIR support. The reverse module runs a convergence algorithm such as LMS or RLS, for which Altera provides reference designs. For RLS, Altera's reference design uses QR decomposition, which shortens the convergence time and improves the stability of the algorithm.
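One common structure that matches the description of a predistorter built from several FIR filters is the memory polynomial, in which each nonlinear basis term x·|x|^k is passed through its own FIR filter and the results are summed. Whether Altera's design uses exactly this form is an assumption. The coefficient layout `coeffs[k][m]` and the function name are illustrative.

```python
def memory_polynomial(samples, coeffs):
    """Predistort with y[n] = sum over k, m of coeffs[k][m] * x[n-m] * |x[n-m]|**k."""
    output = []
    for n in range(len(samples)):
        acc = 0j
        for k, taps in enumerate(coeffs):
            # Each nonlinear order k has its own FIR filter `taps`.
            for m, c in enumerate(taps):
                if n - m >= 0:
                    x = samples[n - m]
                    acc += c * x * abs(x) ** k
        output.append(acc)
    return output


# Hypothetical coefficients: a linear term plus a small third-order correction.
y = memory_polynomial([2.0 + 0j], [[1.0], [0.0], [0.1]])
```

In hardware the FIR branches run in parallel, and the reverse module only has to update the coefficient values.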

Resources provided by Altera

In addition to taking digital IF applications into account in its device design, Altera has done a great deal of work on IP cores, control and glue logic, interface logic, design tools and flows, and reference designs.

In terms of FPGA device resources, Altera's latest Cyclone and Stratix series offer considerably more and faster embedded memory and multiply-accumulate blocks.

In terms of DSP IP cores, Altera provides functional components including FIR, NCO, CIC, and CORDIC. To simplify system integration, it also provides a unified interface for connecting these modules: the Avalon Streaming (Avalon-ST) interface. In addition, for multiplexing and demultiplexing multiple channels, Altera provides the Avalon-ST Packet Format Converter (PFC), which converts between one or more input Avalon-ST channels and one or more output Avalon-ST channels, giving multi-channel multiplexing and demultiplexing in both time and space.

In areas that require flexibility, such as DPD, Altera's Nios II embedded processor can play a role. For example, in the DPD feedback path it lets users flexibly add their own interpolation routines. The Nios II processor can also handle management tasks for the system, such as collecting statistics and reconfiguring parameters.

In terms of design and verification tools and flows, Altera strongly promotes the integrated design flow of MATLAB/Simulink + DSP Builder + Quartus II, as shown in Figure 3.


Simulink can also integrate ModelSim and the FPGA embedded logic analyzer SignalTap II to assist with functional simulation and debugging. In addition, the hardware-in-the-loop (HIL) function lets users verify the algorithm on actual hardware and also speeds up verification.

Reference Design


Altera's WiMAX DDC/DUC reference design targets an OFDM system based on a 1024-point FFT with an operating bandwidth of 10 MHz. The baseband sampling rate is 11.424 MSps, which is the symbol rate, and the IF sampling rate is 91.392 MSps. Going from baseband to IF therefore requires a total sample rate change of 8x.

As mentioned earlier, a CIC filter suits narrowband signals that need a large rate-change factor. Here only an 8x change is needed and the useful signal bandwidth is 10 MHz, so FIR filters are the better choice for decimation and interpolation filtering.


As shown in Figure 4, the functions are partitioned with implementation resources and efficiency in mind: the shaping filter and the decimation/interpolation filters are split into three FIR filters. G(z) is responsible for spectrum shaping and is usually a root raised cosine (RRC) filter; Q(z) is responsible for 2x decimation or interpolation filtering; P(z) is responsible for 4x decimation or interpolation filtering.

To save FPGA resources and improve performance, G(z), which runs at the lowest rate, is designed as a 111-order FIR with the narrowest transition band; Q(z) comes next at order 79; and P(z), which runs at the highest rate, is only order 39. The combined response of the three filters is shown in Figure 5 and fully meets the spectral mask required by WiMAX.
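The 2x and 4x interpolation stages can be modeled as zero-stuffing followed by an FIR filter. The sketch below cascades two such stages to get the overall 8x rate change. The tap values are trivial hold filters chosen so the result is easy to check by hand; they are not the Q(z) and P(z) coefficients of the reference design, and the function name is illustrative.

```python
def fir_interpolate(samples, factor, taps):
    """Zero-stuff by `factor`, then filter with the FIR coefficients `taps`."""
    upsampled = []
    for x in samples:
        upsampled.append(x)
        upsampled.extend([0.0] * (factor - 1))
    output = []
    for n in range(len(upsampled)):
        acc = 0.0
        # Direct-form convolution; samples before the start are treated as zero.
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * upsampled[n - k]
        output.append(acc)
    return output


stage_2x = fir_interpolate([1.0, 2.0], 2, [1.0, 1.0])
stage_8x = fir_interpolate(stage_2x, 4, [1.0, 1.0, 1.0, 1.0])
```

Splitting the rate change into stages lets the filter with the sharpest transition band run at the lowest rate, where each tap costs the fewest multiplications per second.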


In the FPGA implementation, the filtering characteristics of the I and Q channels are exactly the same, so the three FIR stages are shared between the I and Q channels to save device resources, as shown in Figure 6.

On the DDC side, the 91.392 MSps IF signal is first oversampled so that each sample is held for two consecutive clock cycles at 182.784 MSps, and the two copies are mixed with the NCO outputs respectively. After the three FIR stages, we obtain the I and Q signals at 11.424 MSps each.

On the DUC side, the three FIR stages run at 22.848 MSps, 45.696 MSps, and 182.784 MSps, respectively. Finally, the two mixed I/Q signals are added to obtain a real band-pass signal with a sampling rate of 91.392 MSps.
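One way to read these rates is sketched below. It assumes, as an interpretation rather than something the text states explicitly, that each FIR stage is time-shared between I and Q and therefore processes twice its per-channel output rate. Rates are kept as integers in kSps to avoid floating-point rounding.

```python
BASEBAND_KSPS = 11_424  # 11.424 MSps baseband rate
# (stage name, rate-change factor) in DUC order; G(z) only shapes the spectrum.
STAGES = [("G(z)", 1), ("Q(z)", 2), ("P(z)", 4)]
IQ_CHANNELS = 2  # assumption: I and Q share each filter in time


def stage_rates(baseband_ksps, stages, channels):
    """Return (name, per-channel output rate, time-shared rate) per stage."""
    rate = baseband_ksps
    rates = []
    for name, factor in stages:
        rate *= factor
        rates.append((name, rate, rate * channels))
    return rates


rates = stage_rates(BASEBAND_KSPS, STAGES, IQ_CHANNELS)
for name, per_channel, shared in rates:
    print(f"{name}: {per_channel} kSps per channel, {shared} kSps shared")
```

Under this assumption the shared rates come out as 22848, 45696, and 182784 kSps, matching the three figures quoted for the DUC, and the per-channel rate after P(z) is the 91392 kSps IF rate.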

For multi-channel multiplexing/demultiplexing, we use Altera’s Avalon-ST packet format conversion module (PFC) for module interconnection.

A typical WiMAX base station requires 2 transmit antennas and 4 receive antennas, and this reference design supports that configuration.

Simulation of the reference design shows that the relative constellation error (RCE) of the DUC is much better than the specified value. For example, at 64QAM with a 3/4 code rate, the measured RCE is -55.29 dB. The receiver sensitivity and adjacent channel rejection of the DDC are also much better than the required values.


WiMAX places higher demands on CFR. Because of the 64QAM modulation, the error vector magnitude (EVM) must be below 3%, and the requirements on peak-to-average ratio (PAR) and adjacent channel leakage ratio (ACLR) are also stricter. Altera's WiMAX CFR solution adopts the Constrained Clipping algorithm from the Georgia Institute of Technology. It keeps the EVM below 3% with a PAR reduction of about 5 dB, and the out-of-band spectral spread is extremely small, as shown in Figure 7.
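For reference, EVM can be computed as the RMS error between measured and ideal constellation points, normalized by the RMS of the ideal points. The sketch below also converts between percent and dB, treating an RCE figure in dB as 20·log10 of the RMS EVM ratio, which is a common convention and is assumed here. The function names and the sample constellation points are illustrative.

```python
import math


def evm_percent(measured, reference):
    """RMS error vector magnitude, as a percentage of the RMS reference level."""
    error_power = sum(abs(m - r) ** 2 for m, r in zip(measured, reference))
    reference_power = sum(abs(r) ** 2 for r in reference)
    return 100.0 * math.sqrt(error_power / reference_power)


def evm_percent_to_db(percent):
    """Convert an EVM percentage to dB."""
    return 20.0 * math.log10(percent / 100.0)


def db_to_evm_percent(db):
    """Convert an EVM or RCE value in dB to a percentage."""
    return 100.0 * 10.0 ** (db / 20.0)


evm = evm_percent([1.03 + 0j, -1.03 + 0j], [1.0 + 0j, -1.0 + 0j])
limit_db = evm_percent_to_db(3.0)
```

Under this convention a 3% EVM limit corresponds to roughly -30.5 dB, and a value of -55.29 dB corresponds to an EVM of well under 1%.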



The IF bandwidth of WiMAX exceeds 10 MHz, and adaptive algorithms such as LMS or RLS are also needed, which places high demands on both the DSP processing capability and the flexibility of the DPD module. Altera's approach of combining the on-chip Nios II processor with FPGA hardware co-processing units meets these design requirements well.


As shown in Figure 8, the forward module is a predistorter that consists of multiple FIR filters. In the reverse path, a set of 64 samples is collected in a sample buffer, the Nios II embedded processor computes the inputs to the CORDIC block, and the CORDIC accelerator performs the QR decomposition. Nios II then performs back substitution and updates the coefficients of the FIR filters in the forward path. Using the Nios II soft processor together with the CORDIC accelerator to build the upper triangular matrix for QRD-RLS gives good flexibility, and the number of CORDIC accelerators can be adjusted to raise the throughput of the reverse module.
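The QR decomposition and back substitution steps can be sketched as follows. This is a simplified real-valued model without a forgetting factor, not the reference design itself: each new observation is rotated into an upper triangular matrix with Givens rotations (the operation a CORDIC block performs in hardware), and the coefficients are then recovered by back substitution. The function names and the test data are hypothetical.

```python
import math


def qrd_update(R, z, row, d):
    """Rotate one observation (row, d) into upper-triangular R and vector z, in place."""
    row = list(row)
    n = len(row)
    for i in range(n):
        r = math.hypot(R[i][i], row[i])
        if r == 0.0:
            continue
        # Givens rotation that zeroes row[i] against the diagonal element R[i][i].
        c, s = R[i][i] / r, row[i] / r
        for j in range(i, n):
            R[i][j], row[j] = c * R[i][j] + s * row[j], -s * R[i][j] + c * row[j]
        z[i], d = c * z[i] + s * d, -s * z[i] + c * d


def back_substitute(R, z):
    """Solve R w = z for w, with R upper triangular and a nonzero diagonal."""
    n = len(z)
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = z[i] - sum(R[i][j] * w[j] for j in range(i + 1, n))
        w[i] = acc / R[i][i]
    return w


# Observations generated from the hypothetical coefficients w = [2, -1].
observations = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
                ([1.0, 1.0], 1.0), ([1.0, -1.0], 3.0)]
R = [[0.0, 0.0], [0.0, 0.0]]
z = [0.0, 0.0]
for row, d in observations:
    qrd_update(R, z, row, d)
weights = back_substitute(R, z)
```

Because the rotations are independent for each column, several CORDIC units can work in parallel, which is consistent with the note above about adjusting the number of accelerators.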
