1. Introduction
2. Literature Review
3. Problem Formulation
4. Objectives
5. Methodology
6. Plan of Work
7. Expected Outcome
8. References

Fixed-angle rotation of vectors is widely used in signal processing, robotics, and graphics.
Various optimized coordinate rotation digital computer (CORDIC) designs have been proposed for
uniform rotation of vectors through known and specified angles. CORDIC is an iterative
algorithm used to calculate mathematical functions such as trigonometric, hyperbolic, and
exponential functions. A fully parameterized hardware design is presented that allows for
extensive exploration of the resources-accuracy design space, from which we generate optimal
realizations.

The coordinate rotation digital computer (CORDIC) algorithm involves a simple shift-add
iterative procedure to perform several computing tasks by operating in either rotation-mode or
vectoring-mode following any one among linear, hyperbolic, and circular trajectories [1].
Applications such as singular value decomposition, eigenvalue estimations, QR decomposition,
phase and frequency estimations, synchronization in digital receivers, 3-D graphics processor,
and interpolators require the CORDIC to operate in both rotation and vectoring-modes. The 3-D
structures such as hyperboloids, paraboloids, and ellipsoids require the CORDIC to be operated
in both circular and hyperbolic trajectories. The hardware implementation of these applications
requires more than one CORDIC processor operating in different modes and different
trajectories. A reconfigurable CORDIC, which can operate in rotation and vectoring-modes, for
both circular and hyperbolic trajectories can replace multiple CORDIC processors, and would be
highly useful for such applications. A reconfigurable CORDIC can be utilized for a variety of
applications in communication systems, signal processing, 3-D graphics, and robotics, apart from
general scientific calculation and waveform generation.
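
The unified shift-add iteration described above can be sketched in software as follows. This is a minimal floating-point illustration, not the proposed hardware: the function names `cordic` and `shift_sequence`, the parameter `m` (1 for circular, 0 for linear, -1 for hyperbolic), and the default of 32 iterations are assumptions made for this example. The gain is divided out at the end for clarity, whereas hardware would normally fold it into a constant.

```python
import math

def shift_sequence(m, n):
    """Shift amounts per iteration (assumed helper).

    Circular and linear modes use 0, 1, 2, ...; hyperbolic mode starts at 1
    and repeats shifts 4, 13, 40, ... so that the iteration converges.
    """
    if m != -1:
        return list(range(n))
    shifts, i, repeat = [], 1, 4
    while len(shifts) < n:
        shifts.append(i)
        if i == repeat:
            shifts.append(i)          # repeated hyperbolic iteration
            repeat = 3 * repeat + 1
        i += 1
    return shifts[:n]

def cordic(x, y, z, m, mode, n=32):
    """Unified CORDIC sketch: mode is 'rotation' (z -> 0) or 'vectoring' (y -> 0)."""
    gain = 1.0
    for s in shift_sequence(m, n):
        t = 2.0 ** -s                 # a right shift by s bits in hardware
        if m == 1:
            e = math.atan(t)          # circular elementary angle
        elif m == 0:
            e = t                     # linear elementary step
        else:
            e = math.atanh(t)         # hyperbolic elementary angle
        if mode == "rotation":
            d = 1.0 if z >= 0 else -1.0
        else:
            d = -1.0 if y >= 0 else 1.0
        x, y, z = x - m * d * y * t, y + d * x * t, z - d * e
        gain *= math.sqrt(1.0 + m * t * t)
    return x / gain, y / gain, z
```

Under these assumptions, `cordic(1.0, 0.0, theta, 1, "rotation")` approximates (cos theta, sin theta), the same call with `m=-1` approximates (cosh theta, sinh theta) for theta within the hyperbolic convergence range, and `cordic(x, y, 0.0, 1, "vectoring")` returns the magnitude and angle of (x, y) for x > 0. A single routine covering both modes and several trajectories is the software analogue of the reconfigurable datapath discussed above.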

In the last five decades, several algorithms have been proposed for area-delay-efficient and
power-efficient implementation of CORDIC algorithms, either for circular trajectory [2]-[7] or
for hyperbolic trajectory [8]-[10]. However, we do not find any systematic study on design and
implementation of reconfigurable CORDIC in the existing literature. A basic design of
reconfigurable CORDIC based on a unified CORDIC algorithm [11] has been proposed recently
[12]. The reconfigurable design of [12] is found to involve high reconfiguration overhead and
results in low hardware utilization efficiency.

In general, the architectures can be broadly classified as folded and unfolded as shown in Figure
4, based upon the realization of the three iterative equations (6). Folded architectures are
obtained by duplicating each of the difference equations of the CORDIC algorithm into hardware
and time multiplexing all the iterations into a single functional unit. Folding provides a means for
trading area for time in signal processing architectures. The folded architectures can be
categorized into bit-serial and word-serial architectures depending on whether the functional unit
implements the logic for one bit or one word of each iteration of the CORDIC algorithm. The
CORDIC algorithm has traditionally been implemented using bit serial architecture with all
iterations executed in the same hardware [3]. This slows down the computational device and
hence, is not suitable for high speed implementation. The word serial architecture [7, 48] is an
iterative CORDIC architecture obtained by realizing the iteration equations (6). In this
architecture, the shifters are modified in each iteration to cause the desired shift for the iteration.
The appropriate elementary angles, α_i, are accessed from a lookup table. The most dominating
speed factors during the iterations of word serial architecture are carry/borrow propagate
addition/subtraction and variable shifting operations, rendering the conventional CORDIC [7]
implementation slow for high speed applications. These drawbacks were overcome by unfolding
the iteration process [1], so that each of the processing elements always performs the same
iteration as shown in Figure 5. The main advantage of the unfolded pipelined architecture
compared to folded architecture is high throughput due to the hard-wired shifts rather than time
and area consuming barrel shifters and elimination of ROM. It may be noted that the pipelined
architecture offers throughput improvement by a factor of n for n-bit precision at the expense of
increasing the hardware by a factor less than n.
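
A word-serial (folded) datapath of the kind described above can be modelled with integer shifts and additions and a small angle table. The sketch below is illustrative only: the 16-bit fractional word format, the 16 iterations, and the names `ANGLE_ROM`, `X_INIT`, and `cordic_word_serial` are assumptions, not values taken from any cited design.

```python
import math

FRAC_BITS = 16                 # assumed fixed-point format: 16 fractional bits
N_ITER = 16                    # assumed number of iterations
SCALE = 1 << FRAC_BITS

# Lookup table of elementary angles atan(2^-i), as the ROM would hold them.
ANGLE_ROM = [round(math.atan(2.0 ** -i) * SCALE) for i in range(N_ITER)]

# Pre-scaling x by 1/gain removes the need for a final multiplication.
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N_ITER))
X_INIT = round(SCALE / GAIN)

def cordic_word_serial(theta):
    """Rotation-mode CORDIC using one shared shift-add unit per iteration.

    theta is in radians and is assumed to lie within the basic convergence
    range of about +/-1.74 rad. Returns (cos(theta), sin(theta)) approximations.
    """
    x, y, z = X_INIT, 0, round(theta * SCALE)
    for i in range(N_ITER):
        # Variable shift by i: this is the barrel shifter of the folded design.
        x_shift, y_shift = x >> i, y >> i
        if z >= 0:
            x, y, z = x - y_shift, y + x_shift, z - ANGLE_ROM[i]
        else:
            x, y, z = x + y_shift, y - x_shift, z + ANGLE_ROM[i]
    return x / SCALE, y / SCALE
```

Each pass of the loop corresponds to one clock cycle of the folded architecture, with the shift amount and the ROM address both changing per iteration. In an unfolded design the loop would be unrolled so that each stage has a fixed shift and a fixed angle constant.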

The implementation of CORDIC algorithm has evolved over the years to suit varying
requirements of applications from conventional non-redundant to redundant nature. The unfolded
implementation with redundant arithmetic initiated the efforts to address high latency in
conventional CORDIC. Subsequently, several modifications have been proposed for redundant
CORDIC algorithm to achieve reduction in iteration delay, latency, area and power. The
evolution of the unfolded rotational CORDIC algorithms is shown in Figure 6. As this taxonomy
is fairly rich, the remainder of the review presents taxonomy in top-down approach.

CORDIC is broadly classified as non-redundant CORDIC and redundant CORDIC based on the
number system being employed. The major drawback of the conventional CORDIC algorithm [3,
7] was low throughput and high latency due to the carry propagate adder used for the
implementation of iterative equations. This contradicted the simplicity and novelty of the
CORDIC algorithm attracting the attention of several researchers to devise methods to increase
the speed of execution. The obvious solution is to reduce the time for each iteration or the
number of iterations or both. The redundant arithmetic has been employed to reduce the time for
each iteration of the conventional CORDIC. We have analyzed and presented in the following
Sections, features of different pipelined and non-pipelined unfolded implementations of the
rotational CORDIC.
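
One common redundant representation is carry-save, in which a value is held as a pair of words whose sum is the true value, so that adding a third operand needs no carry propagation. The sketch below illustrates only that idea on non-negative Python integers; the function name `carry_save_add` is an assumption for this example, and it is not a model of any specific redundant CORDIC from the literature.

```python
def carry_save_add(a, b, c):
    """3:2 compression: reduce three operands to a (sum, carry) pair.

    Each bit position is handled independently (a full adder per bit), so the
    delay does not grow with word length. The true result is sum + carry,
    which requires one carry-propagate addition only at the very end.
    """
    sum_bits = a ^ b ^ c                              # per-bit sum
    carry_bits = ((a & b) | (b & c) | (a & c)) << 1   # per-bit majority, weighted by 2
    return sum_bits, carry_bits
```

The trade-off that motivates much of the redundant CORDIC work is visible here: the sign of a value held as a (sum, carry) pair is not directly available without resolving the carries, which complicates the choice of rotation direction in each iteration.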

Concept, Design, and Implementation of Reconfigurable CORDIC, Supriya Aggarwal, Pramod
K. Meher, and Kavita Khare, IEEE Transactions on Very Large Scale Integration (VLSI) Systems

In this brief, for the first time a systematic design method for reconfigurable CORDIC is
proposed to let a CORDIC function in different modes and different trajectories of operations.
The proposed reconfigurable CORDIC architectures can be used in a variety of applications,
such as synchronizers, waveform generators, low-cost scientific calculators, and so on.
Approximately 60% of the area is saved by the proposed rotation or vectoring-mode
reconfigurable CORDIC designs over the reference recursive reconfigurable CORDIC, without
any effect on the maximum operating frequency. On the other hand, the proposed pipelined
rotation and vectoring-mode reconfigurable CORDIC designs save 30%-50% area compared
with the reference reconfigurable design, with nearly the same maximum operating frequency.

A Hybrid Adaptive CORDIC in 65nm SOTB CMOS Process Hong-Thu Nguyen, Xuan-Thuan
Nguyen, Cong-Kha Pham Trong-Thuc Hoang, Duc-Hung Le, 978-1-4799-5341-7/16/ 2016 IEEE

In this paper, the HA-CORDIC was verified in SOTB CMOS 65nm technology. The circuit
retained all advantages of the HA-CORDIC implementation, such as low resource usage, low
latency, and high precision. In the worst case, HA-CORDIC is still 5X and 1.4X faster than the
designs of [8] and [9], respectively. The core size of the HA-CORDIC implementation in 65nm
SOTB CMOS technology is 0.058 mm². In the operation mode, this design can operate at 50
MHz with 0.5 V supply voltage, and the current in operation mode is about 0.36 mA. Moreover,
its power consumption is about 0.251 mW, three times lower than that of HA-CORDIC in
conventional CMOS. In standby mode, the leakage current of the HA-CORDIC implementation in
SOTB CMOS technology can be reduced by applying a bias voltage to the N well and P well of
SOTB CMOS. When the supply voltage VDD is 0.4 V and the bias voltage VBB is -1.5 V, its
leakage current is 0.492 A, about four times smaller than the leakage current of the HA-CORDIC
in conventional CMOS.

Implementation of a Fast Hybrid CORDIC Architecture Bhawna Tiwari, Nidhi Goel 2016
Second International Conference on Computational Intelligence & Communication Technology
IEEE 2016

This paper presents an HDL implementation of the hybrid CORDIC algorithm. Although this
architecture executes faster, the speed comes at the cost of accuracy and power. In addition,
the hybrid architecture requires fewer resources than the parallel architecture at synthesis,
which is an added advantage.

CORDIC-based FFT Real-time Processing Design and FPGA Implementation, Aimei Tang, Li
Yu, Fangjian Han, Zhiqiang Zhang, 2016 IEEE 12th International Colloquium on Signal
Processing & its Applications (CSPA2016), 4 - 6 March 2016, Melaka, Malaysia

This paper presents a design scheme for a high-speed real-time serial pipelined Fast Fourier
Transform (FFT) processor on FPGA, based on the Coordinate Rotation Digital Computer
(CORDIC) algorithm. The CORDIC algorithm reduces the hardware complexity compared to
the direct implementation of the butterflies using complex multipliers. Moreover, the design uses
the butterflies of the radix-2 Decimation-In-Time (DIT) algorithm, dual-port RAM, and a
pipelined structure, which increase the performance of the FFT processor. The
simulation results show that, compared with the same type of real-time FFT processor, the
scheme presented in this paper reduces the hardware resource requirements of Adaptive Look-up
Tables (ALUTs) and increases the Signal-to-Noise Ratio (SNR) by about 25 dB.

CORDIC II: A New Improved CORDIC Algorithm, Mario Garrido, Member, IEEE, Petter
Källström, Martin Kumm and Oscar Gustafsson, Senior Member, IEEE, IEEE Transactions on
Circuits and Systems II: Express Briefs, 2016

The CORDIC II is a new algorithm that substitutes the CORDIC micro-rotation by a new angle
set. This involves three new types of rotators: friend angles, USR CORDIC and nano-rotations.
By using the proposed micro-rotations, the CORDIC II requires the minimum number of adders
among CORDIC algorithms so far.

Scale-Free Hyperbolic CORDIC Processor and its Application to Waveform Generation Supriya
Aggarwal, Pramod K. Meher Senior Member and Kavita Khare IEEE Transactions On Circuits
And Systems 2011

A scale-free hyperbolic CORDIC algorithm is proposed and used for arbitrary waveform
generation. The key features of the proposed algorithm are that it is completely scaling-free,
provides greater RoC, and reduces the number of iterations. A generalized micro-rotation scheme
based on most-significant-1 detector with single direction micro-rotations is used to eliminate the
redundant CORDIC iterations. Using the proposed CORDIC algorithm, we have suggested a low
complexity waveform generator using a single DDS by random phase modulation. The proposed
AWG requires on an average 36% less area, and less latency compared with existing CORDIC-
based designs with nearly the same throughput rate.

CORDIC Architectures: A Survey, B. Lakshmi and A. S. Dhar, Hindawi Publishing Corporation,
VLSI Design, Volume 2010, doi:10.1155/2010/79489

In this paper, we have surveyed the algorithms for unfolded implementation of 2D rotational
CORDIC algorithms. Special attention has been devoted to the systematic and comprehensive
classification of solutions proposed in the literature. In addition to the pipelined implementation
of nonredundant radix-2 CORDIC algorithm that has received wide attention in the past, we have
discussed the importance of redundant and higher radix algorithms. We have also stressed the
importance of prediction algorithms to precompute the directions of rotations and parallelization
of x/y path. It is worth noting that the considered algorithms should not be implemented as
alternatives over the others, rather they should be integrated depending on the design constraints
of a specific application.

The dynamic power consumption of a digital circuit is given by the expression:

$$P = a \cdot C \cdot V_{DD}^{2} \cdot f$$

Thus, the power consumption of the design depends upon:

a: activity factor
C: switched load capacitance
VDD: supply voltage
f: operating frequency

A pipelined approach results in better hardware utilization and increases the operating frequency
of the design due to shorter combinational paths. By appropriate selection of operating
frequency, we can improve the power consumption of the modified design so that it becomes
more power efficient.
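
The effect of frequency and supply voltage on power can be checked with a few lines of arithmetic. The sketch below assumes the standard first-order model of dynamic switching power, in which power is proportional to the activity factor, the switched capacitance, the square of the supply voltage, and the clock frequency. The function name `dynamic_power` and the numeric values in the usage note are hypothetical and are not measurements of any design.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order dynamic switching power in watts.

    activity:      average fraction of the capacitance switched per cycle
    capacitance_f: total switched load capacitance in farads
    vdd_v:         supply voltage in volts
    freq_hz:       clock frequency in hertz
    """
    return activity * capacitance_f * vdd_v ** 2 * freq_hz
```

For example, with the hypothetical values `dynamic_power(0.2, 1e-9, 1.0, 100e6)` and `dynamic_power(0.2, 1e-9, 1.0, 50e6)`, halving the clock frequency halves the dynamic power. A pipelined design that meets its throughput target with timing margin to spare can therefore be clocked lower to save power, and any accompanying reduction in supply voltage has a quadratic effect.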

1. Study of CORDIC algorithm

2. Implementation of CORDIC computation unit.

3. Design of pipelined CORDIC architecture

4. Estimation of area, speed, and power of HDL design

5. Comparative study with other designs present in literature.


If a digital signal processing algorithm is implemented with FPGAs and the algorithm uses a
nontrivial (transcendental) algebraic function, like √x or arctan y/x, we can always use the Taylor
series to approximate this function. The problem is then reduced to a sequence of multiply and
add operations. A more efficient alternative approach, based on the Coordinate Rotation Digital
Computer (CORDIC) algorithm, can also be considered. The CORDIC algorithm is found in
numerous applications, such as pocket calculators [5], and in mainstream DSP objects, such as
adaptive filters, FFTs, DCTs [6], demodulators [7], and neural networks [5]. The basic CORDIC
algorithm can be found in two classic papers by Volder [8] and Walther [7]. Some theoretical
extensions have been made, such as the extension of range in the hyperbolic mode, or the
quantization error analysis by Hu et al. [8], and Meyer-Bäse et al. [7]. VLSI implementations
have been discussed in Ph.D. theses, such as those by Timmermann [8] and Hahn [82]. The first
FPGA implementations were investigated by Meyer-Bäse et al. [4, 7]. The realization of the
CORDIC algorithm in distributed arithmetic was investigated by Ma [3]. A very detailed
overview, including details of several applications, was provided by Hu [6] in a 1992 IEEE
Signal Processing Magazine review paper.

CORDIC state machine

Two basic structures are used to implement a CORDIC architecture: the more compact state
machine or the high-speed, fully pipelined processor. If computation time is not critical, then a
state machine is applicable. In each cycle, exactly one iteration will be computed. The most
complex part of this design is the two barrel shifters. The two barrel shifters can be replaced by a
single barrel shifter, using a multiplexer or a serial (right, or right/left) shifter.

The iterations can be unrolled, which leads to a pipelined and/or parallel implementation of the
CORDIC arithmetic unit as shown in the figure below. The unrolled implementation is faster,
at the expense of more area and power. Pipelining shortens the critical path, so the design can
be clocked faster and can accept a new operand in every cycle.
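
The behaviour of an unrolled, pipelined CORDIC can be illustrated with a cycle-by-cycle software model. The sketch below is a floating-point model under stated assumptions: 16 stages, one register after each stage, and the names `stage` and `run_pipeline` are chosen for this example. It is not RTL and says nothing about achievable clock frequency.

```python
import math

N_STAGES = 16                                     # assumed pipeline depth
ANGLES = [math.atan(2.0 ** -i) for i in range(N_STAGES)]
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N_STAGES))

def stage(i, state):
    """Stage i always performs iteration i, so its shift and angle are fixed."""
    x, y, z = state
    d = 1.0 if z >= 0 else -1.0
    t = 2.0 ** -i                                 # hard-wired shift in hardware
    return (x - d * y * t, y + d * x * t, z - d * ANGLES[i])

def run_pipeline(thetas):
    """Feed one angle per clock cycle; return (cycle, cos, sin) per result."""
    regs = [None] * N_STAGES                      # register after each stage
    outputs = []
    stream = list(thetas) + [None] * N_STAGES     # trailing bubbles flush the pipe
    for cycle, theta in enumerate(stream, start=1):
        new_in = None if theta is None else (1.0 / GAIN, 0.0, theta)
        prev = [new_in] + regs[:-1]               # each stage reads the register before it
        regs = [None if s is None else stage(i, s) for i, s in enumerate(prev)]
        if regs[-1] is not None:
            x, y, _ = regs[-1]
            outputs.append((cycle, x, y))
    return outputs
```

In this model the first result appears after `N_STAGES` clock cycles and every later result follows one cycle after the previous one, which is the throughput benefit of unrolling. The cost is one adder/subtractor set and one register set per stage instead of a single shared unit.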

1. RTL description of CORDIC processor

2. Design verification using testbench based simulation results
3. Area, speed, and power results
4. Modification of original design by pipelining.
5. Frequency adjustment to achieve improved power consumption

Research Planning

[Figure: Gantt chart of the work plan, duration in months, August 2017 to May 2018. Tasks:
Literature Survey; Analysis & Detailed Study; Implementation; Verification of Intermediate
Result & Final Result; Thesis Writing & Paper Publications]

The CORDIC architecture is an efficient hardware implementation for conversion of rectangular
to polar coordinates. The usual technique to perform this conversion requires the use of
squaring, addition, square root, and arctangent functions implemented in hardware. CORDIC
allows us to implement the same functionality using only an ADD/SUB and a SHIFT register,
thus greatly reducing the hardware complexity of the implementation. At the end of this study,
the following results are to be obtained.

1. VHDL implementation of CORDIC architecture

2. Test-bench simulation results
3. Area, power, and speed results
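
The rectangular-to-polar conversion mentioned above corresponds to CORDIC in vectoring mode, and a small software reference of this kind can serve as a golden model when checking test-bench results. The sketch below is a floating-point illustration under assumptions: the function name `rect_to_polar` and the 24 iterations are chosen for this example, and the input is assumed to lie in the right half-plane (x > 0).

```python
import math

def rect_to_polar(x, y, n=24):
    """Vectoring-mode CORDIC: drive y to zero while accumulating the angle.

    Assumes x > 0. Returns (magnitude, angle in radians). Only additions,
    subtractions, and scalings by powers of two appear inside the loop; the
    gain correction is a constant for a fixed n and is divided out at the end.
    """
    z = 0.0
    gain = 1.0
    for i in range(n):
        t = 2.0 ** -i
        d = -1.0 if y >= 0 else 1.0               # rotate towards the x axis
        x, y, z = x - d * y * t, y + d * x * t, z - d * math.atan(t)
        gain *= math.sqrt(1.0 + t * t)
    return x / gain, z
```

The result can be compared against `math.hypot(x, y)` and `math.atan2(y, x)`, which play the role of the conventional square-root and arctangent computation that CORDIC replaces.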

[1] J. E. Volder, "The CORDIC trigonometric computing technique," IRE Trans. Electronic
Computing, vol. EC-8, pp. 330-334, Sep. 1959.
[2] M. Garrido, F. Qureshi, and O. Gustafsson, "Low-complexity multiplier-less constant
rotators based on combined coefficient selection and shift-and-add implementation (CCSSI),"
IEEE Trans. Circuits Syst. I, vol. 61, no. 7, pp. 2002-2012, Jul. 2014.
[3] C.-S. Wu, A.-Y. Wu, and C.-H. Lin, "A high-performance/low-latency vector rotational
CORDIC architecture based on extended elementary angle set and trellis-based searching
schemes," IEEE Trans. Circuits Syst. II, vol. 50, no. 9, pp. 589-601, Sep. 2003.
[4] R. Shukla and K. Ray, "Low latency hybrid CORDIC algorithm," IEEE Trans. Comput., vol.
63, no. 12, pp. 3066-3078, Dec. 2014.
[5] C.-S. Wu and A.-Y. Wu, "Modified vector rotational CORDIC (MVR-CORDIC) algorithm
and architecture," IEEE Trans. Circuits Syst. II, vol. 48, no. 6, pp. 548-561, Jun. 2001.
[6] S. Aggarwal, P. K. Meher, and K. Khare, "Area-time efficient scaling-free CORDIC using
generalized micro-rotation selection," IEEE Trans. VLSI Syst., vol. 20, no. 8, pp. 1542-1546,
Aug. 2012.
[7] F. Jaime, M. Sánchez, J. Hormigo, J. Villalba, and E. Zapata, "Enhanced scaling-free
CORDIC," IEEE Trans. Circuits Syst. I, vol. 57, no. 7, pp. 1654-1662, Jul. 2010.
[8] Y. Liu, L. Fan, and T. Ma, "A modified CORDIC FPGA implementation for wave
generation," Circuits Syst. Signal Process., vol. 33, no. 1, pp. 321-329, 2014.
[9] S. Aggarwal, P. K. Meher, and K. Khare, "Concept, design, and implementation of
reconfigurable CORDIC," IEEE Trans. Very Large Scale Integr. (VLSI) Syst.
[10] B. Tiwari and N. Goel, "Implementation of a fast hybrid CORDIC architecture," in Proc.
Second International Conference on Computational Intelligence & Communication Technology,
IEEE, 2016.
[11] A. Tang, L. Yu, F. Han, and Z. Zhang, "CORDIC-based FFT real-time processing design
and FPGA implementation," in Proc. 2016 IEEE 12th International Colloquium on Signal
Processing & its Applications (CSPA2016), Melaka, Malaysia, 4-6 Mar. 2016.
[12] M. Garrido, P. Källström, M. Kumm, and O. Gustafsson, "CORDIC II: A new improved
CORDIC algorithm," IEEE Trans. Circuits Syst. II, Express Briefs, 2016.