Anda di halaman 1dari 38

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,

VOL. 25,

NO. 3,

MARCH 2014

633

External Gradient Time Synchronization


in Wireless Sensor Networks
Kasim Sinan Yildirim, Member, IEEE, and Aylin Kantarci
AbstractSynchronization to an external time source such as Coordinated Universal Time (UTC), i.e., external synchronization, while
preserving tight synchronization among neighboring sensor nodes may be crucial for applications such as determining the speed of a
moving object in wireless sensor networks. However, existing time synchronization protocols in the literature, which can be used for
external synchronization, poorly synchronize neighboring nodes. On the other hand, the only protocol that aims at optimizing the
synchronization error between neighboring nodes is lack of a mechanism which synchronizes sensor nodes to a reference node, and
hence, it cannot provide external synchronization. Therefore, there is a lack in the literature of a time synchronization protocol, which
can be used by applications demanding both external synchronization and tight synchronization among neighboring nodes. In this
paper, we answer the question of whether it is possible for sensor nodes to synchronize to a reference node while they optimize the
clock skew between their neighboring nodes at the same time. Within this context, we present a novel time synchronization protocol,
namely External Gradient Time Synchronization Protocol (EGSync). In EGSync, each sensor node synchronizes to a reference node
by using time information flooded by this node, as well as synchronizes to its neighboring nodes by employing the agreement
algorithm. We implemented EGSync on the MICAz platform using TinyOS and evaluated it on a testbed setup including 20 sensor
nodes. We present the experimental results on our testbed and the simulation results for networks with larger diameters and densities.
Index TermsDistributed algorithms, external time synchronization, gradient time synchronization

INTRODUCTION

notion of time is vital for the correct and


efficient operation of sensor nodes forming wireless
sensor networks (WSNs). This notion cannot be achieved by
using built-in hardware clocks alone since they frequently
drift apart. Hence, each sensor node is required to
participate in time synchronization and calculate a software
clock which represents the common time inside the network.
The objective of time synchronization is to minimize the
differences between the software clocks of the nodes, i.e.,
clock skew, at any time instant [1], [2], [3].
A majority of the time synchronization protocols for
WSNs in the literature [1], [4], [5], [6], [7], [8], [9], [10] aim at
optimizing the clock skew between arbitrary sensor nodes,
i.e., global skew, regardless of the distance1 between them.
However, sensor nodes may be poorly synchronized to
their neighboring nodes with these protocols since they do
not take into account the time information of all of their
neighbors. Optimizing the synchronization error between
neighboring nodes, i.e., local skew, may be crucial for
applications such as target tracking in WSNs [11].
Consider a tracking application which requires sensor
nodes to report a target objects velocity in units meters/
second to a base station. To achieve this goal, each sensor
OMMON

1. i.e., number of hops.

. The authors are with the Department of Computer Engineering, Ege


_
University, Bornova, Izmir
35100, Turkey.
E-mail: {sinan.yildirim, aylin.kantarci}@ege.edu.tr.
Manuscript received 6 Aug. 2012; revised 10 Jan. 2013; accepted 8 Feb. 2013;
published online 27 Feb. 2013.
Recommended for acceptance by S. Olariu.
For information on obtaining reprints of this article, please send e-mail to:
tpds@computer.org, and reference IEEECS Log Number TPDS-2012-08-0708.
Digital Object Identifier no. 10.1109/TPDS.2013.58.
1045-9219/14/$31.00 2014 IEEE

node is required to record the time at which it detected the


target object and exchanged this information with its
neighboring nodes. The estimated velocity of the target
object can be calculated by dividing the difference of the
detection times 4 by the distance between the sensor
nodes. To calculate 4 in terms of real-time passed in units
of second, sensor nodes are required to be synchronized
with respect to an external time sources such as Coordinated Universal Time (UTC) [12], [13]. As can be observed,
the smaller the synchronization error is between the
neighboring nodes, the more accurate 4 and hence the
velocity is estimated.
Gradient Time Synchronization Protocol (GTSP) [14] is
the first and only protocol which aims at optimizing local
skew in WSNs. GTSP is a completely decentralized protocol
since each sensor node synchronizes to its neighboring
nodes and there is not any special node which acts as a
time reference. Unfortunately, this property leads to
inability for GTSP to provide synchronization to an
external time source, i.e., external synchronization [9]. In
WSNs, a reference node, which is synchronized to an
external time source, is required to achieve external
synchronization. By internally synchronizing the remaining
sensor nodes to this reference node, network-wide external
synchronization is established. Therefore, there is a lack in
the literature of a time synchronization protocol which can
be used by applications demanding both external synchronization and local skew optimization, such as the tracking
application we mentioned.
The main contribution of this paper is to answer the
question of whether it is possible for sensor nodes to
synchronize to a reference node while they optimize
the clock skew between their neighboring nodes at
the same time. Within this context, we present a novel time
synchronization protocol, namely External Gradient Time
Published by the IEEE Computer Society

634

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,

Synchronization Protocol (EGSync). In EGSync, each sensor


node synchronizes to a reference node by using time
information flooded by this node, as well as synchronizes
to its neighboring nodes by employing an agreement
algorithm. Hence, EGSync is able to achieve external
synchronization by optimizing the synchronization error
among the neighboring nodes.
The remainder of this paper is organized as follows:
Section 2 describes the related work on time synchronization in WSNs. In Section 3, we describe our system model.
We present and describe EGSync in detail in Section 4.
Section 5 presents our implementation of EGSync, which
has been done in TinyOS2 and its evaluation on a testbed
setup including 20 MICAz sensor nodes. We also present
simulation results in this section. Finally, we present our
conclusions in Section 6.

RELATED WORK

The fundamental problem of time synchronization has been


extensively studied for traditional distributed systems.
There are several theoretical studies in the literature that
focus on bounding the skew between arbitrary nodes, i.e.,
global skew, in any network [15], [16], [17], [18], [19]. There
are also considerable amount of protocol-based studies in
the literature that focus on optimizing global skew in WSNs
[1], [4], [5], [6], [7], [20], [21], [8], [9], [10]. The fundamental
shortcoming of these practical studies is that their executions may introduce large synchronization errors between
neighboring nodes. As an example, Flooding Time Synchronization Protocol (FTSP) [7], the de facto time synchronization protocol in WSNs, is designed to optimize global
skew. FTSP synchronizes the whole network by electing a
reference node based on the smallest node identifier which
serves as a time source. The reference node periodically
floods its clock through the network. With flooding, an
ad hoc tree is constructed on the fly. However, neighboring
nodes that are on different paths of the tree may not be
synchronized well because errors propagate down differently on these paths.
Fan and Lynch [11] introduced gradient clock synchronization, where the clock skew between nodes is a function of
their distance and they emphasized the skew between
neighboring nodes, i.e., local skew. There are considerable
amount of studies that focus on the development and
analysis of theoretical algorithms achieving optimal local
skew [22], [23], [24], [25], [26], [27], [28], [29]. However, less
attention has been paid to the development of practical
protocols which aim at optimizing the local skew. Up to our
knowledge, GTSP [14] is the only practical study in WSNs
with this aim. In GTSP, synchronization messages received
from neighboring nodes are used to adjust clocks. Using a
simple algorithm based on averaging the time information
received from neighbors (which has previously been
introduced in [30], [31]), the sensor nodes agree on a
common clock speed and clock value. Although GTSP
achieves a better local skew than FTSP, it leaves the
question open whether it is possible for sensor nodes to
synchronize to a reference time source while they optimize
the clock skew between their neighbors at the same time.
2. http://www.tinyos.net.

VOL. 25,

NO. 3,

MARCH 2014

SYSTEM MODEL

In this section, we introduce our system model which we


use in the remainder of the paper. A WSN can be modeled
as a graph G V ; E which consists of a set of n sensor
nodes represented by vertex set V f1; . . . ; ng and bidirectional communication links between these nodes represented by E  V  V , respectively. Any node u 2 V can
only communicate with its neighboring nodes, which are
defined as the nodes to which it is directly connected. N u
fv 2 V j fu; vg 2 Eg and jN u j represent the set of neighbors
and the number of neighbors of node u, respectively.
The distance between any nodes u; v 2 V is defined as the
number of edges on the shortest path between those two
nodes in the graph G. The diameter of the graph G is the
maximum distance between any two nodes. The adjacency
matrix A and the Laplacian matrix L of the graph G are
defined as follows:

1; fi; jg 2 E;
Ai; j
1
0; otherwise;
L D  A;

such that D diagd1 ; d2 ; . . . ; dn is the degree matrix of G,


i.e., di jN i j. The eigenvalues of L are represented by
1 L  2 L      n L.
Each node u is assumed to be equipped with an
unmodifiable hardware clock, which is denoted by Hu .
We define the reading of the hardware clock of the node u
at real time t as
Z t
hu d;
3
Hu t
0

where hu  represents the rate (speed) of the hardware


clock at time . Clock drift occurs where the hardware clock
does not progress at the exact speed of real time. The
frequencies of crystal oscillators in WSNs exhibit a drift
between 30 and 100 ppm.3 Therefore, we assume that
hardware clocks have bounded drifts such that for all times t,
it holds that
1  "  hu t  1 ";

where 0 < "  1.


Since the hardware clocks of the nodes drift apart and
the nodes cannot modify their hardware clocks, alternatively each node u maintains a logical clock, which is denoted
by Lu , to acquire synchronized notion of time. The value
of the logical clock at time t is calculated as follows:
Z t
Lu t
hu lu d u t;
5
0

where lu t is called the rate multiplier and u t is called the


offset of the logical clock.
The objective of time synchronization is to minimize the
differences of the logical clock values in the system, i.e.,
clock skew. The global skew at any time t is defined as the
largest clock skew between any two arbitrary nodes, i.e.,
maxu;v2V fjLu t  Lv tjg. Similarly, the local skew at any
3. ppm = parts per million, i.e., 106 .

YILDIRIM AND KANTARCI: EXTERNAL GRADIENT TIME SYNCHRONIZATION IN WIRELESS SENSOR NETWORKS

Fig. 1. The general strategy of EGSync: Nodes agree on a common


clock value and speed by executing an agreement protocol with their
neighbors. Additionally, a reference node floods its rate multiplier and
time offset in order for the other nodes to agree on its clock speed
and value.

time t is defined as the largest clock skew between any two


neighboring nodes, i.e., maxu2V ;v2N u fjLu t  Lv tjg.

EXTERNAL GRADIENT TIME SYNCHRONIZATION


PROTOCOL

In this section, we present EGSync which synchronizes


sensor nodes to the clock of a reference node while
optimizing synchronization error between neighboring
nodes at the same time. To achieve this goal, EGSync
executes an agreement protocol among the neighboring
sensor nodes and forces them to agree on a common clock
value and clock speed. Additionally, a reference node
floods its time information into the network. By using the
flooded information together with the common clock value
and clock speed, the other nodes agree on the clock speed
and value of the reference node. Fig. 1 presents the general
strategy of EGSync on a sample network.
The pseudocode of the EGSync protocol is presented in
Algorithm 1. Sensor nodes executing this algorithm agree
on the hardware clock speed and hardware clock value of a
predefined reference node ref, whose node identifier
equals to ROOT . Each sensor node v 2 V maintains a
neighbor repository to keep track of time information of its
neighbors. By using this repository, node v can estimate at
any time the relative hardware clock rate (with respect to its
hardware clock) and the logical clock value of each of its
neighboring nodes. Due to the memory constraints of the
sensor nodes, the amount of memory dedicated to this
repository must be specified in advance. Hence, the
maximum number of neighbors which a sensor node keeps
track of is limited. To keep track of the latest time
information received from the reference node ref, node v
holds the latest rate
maintains three extra variables: lref
v
holds
the
latest
difference
between the
multiplier, ref
v
hardware clock and the logical clock, i.e., time offset, and
seqv holds the largest sequence number received from the
reference node, respectively.
Algorithm 1. EGSync pseudocode for node v with a fixed
reference node whose node identifier is ROOT.
1: Initialization
1, luv
1, Luv
0
2: 8u 2 N v set huv
3: lv
1, v
0
1, ref
0, seqv
0
4: lref
v
v
5: set periodic timer with period B
6:
ref
7: t
u Upon receiving <Hu ; Lu ; lu ; lref
u ; u ; sequ >

635

8: store Hv ; Hu pair and estimate huv


9:
luv
lu
10: Luv
Lu
11: update lv and v
12: if seqv < sequ
13:
lref
lref
v
u
ref
14:
v
ref
u
15:
seqv
sequ
16: endif
17:
18: u
t Upon timer timeout
19: if node identifier ROOT
lv
20:
lref
v
21:
ref
Hv  Lv
v
22:
seqv
seqv 1
23: endif
ref
24: broadcast <Hv ; Lv ; lv ; lref
v ; v ; seqv >
When node v is powered on, the time information kept
for each neighboring node u 2 N v are initialized: the
relative hardware clock rate huv is initialized to 1, the rate
multiplier luv is initialized to 1, and the estimated logical
clock value Luv is initialized to 0 (Line 2). Additionally, node
v initializes its rate multiplier lv to 1 and its logical clock
offset v to zero (Line 3). Hence, the value of the logical
clock of node v at any time is equal to its hardware clock
reading until the values of the parameters lv and v are
changed upon receiving synchronization messages from
neighboring nodes. Moreover, lref
is initialized to 1, and
v
ref
and seqv are initialized to zero (Line 4). Last, node v
v
starts a periodic timer which will fire each time its
hardware clock progresses B units (Line 5).
Each time a synchronization message of the form
ref
<Hu ; Lu ; lu ; lref
u ; u ; sequ > is received from any neighboring node u 2 N v (Line 7), node v stores the received
hardware clock reading Hu and its hardware clock reading
Hv at the receipt time of this message as a pair Hv ; Hu in
the regression table assigned for that neighbor. This table
has a limited capacity to hold the most recent N pairs.
Node v assumes a linear relationship between its hardware
clock and the hardware clock of node u and performs
least-squares regression on the received pairs to calculate
the estimated regression line (Line 8). It should be noted that
the slope of this line is an estimate for the relative
hardware clock rate hu =hv , which we denote by huv . The
received rate multiplier and logical clock value are also
stored for that neighbor (Lines 9 and 10). Using these
information, node v is able to calculate the estimated
logical clock value of node u at any time t, i.e., Luv t, by
progressing the received logical clock value Lu at the
speed huv :luv with respect to its hardware clock.
Using the values of huv and luv stored for each neighbor
u 2 N v , node v updates its rate multiplier and logical clock
offset in (5) as follows (Line 11):
P
lv t u2N v huv t:luv t
lv t
;
6
jN v j 1
v t v t

u2N v

 u

Lv t  Lv t
;
jN v j 1

636

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,

such that t denotes the time just after the update operation.
It can be proven that sensor nodes agree on a common logical
clock speed and clock value by applying these equations.4
If the received synchronization message carries a higher
sequence number (Line 12), this situation indicates that the
reference node has recently flooded its current rate multiplier and time offset into the network. Hence, v updates lref
v ,
ref
,
and
seq
to
the
corresponding
received
values
v
v
(Lines 13-16).
For all times t after the clock speed and clock value
agreement is achieved, it theoretically holds for the reference
node ref and for the node v that href t:lref t hv t:lv t.
Thus, it follows that
href t hv t:

lv t
:
lref t

Hence, to progress its logical clock at the hardware clock


rate of the reference node, node v calculates its logical clock
using the following equation instead of (5):
Z t
lv 
Lv t
hv  ref
d v t:
9
lv 
0
When a timeout event is generated by the system (Line 18),
lref
is set to the rate multiplier, ref
is recalculated by
v
v
subtracting the value of the logical clock from the value of the
hardware clock, and the sequence number is incremented
if that node is the reference node (Lines 19-23). Otherwise,
these parameters remain unchanged. It should be noted that
with the increment of the sequence number, a new flooding
round is initiated by the reference node. Each node broadcasts a message which carries the value of its hardware clock,
logical clock, rate multiplier, rate multiplier, and time offset
of the reference node and the sequence number (Line 24).
By using the synchronized logical clock values, node v is
also able to output the hardware clock value of the reference
node, which we denote by Hvref , as follows:
Hvref t Lv t ref
v t:

10

Hence, the time offset received from the reference node is


added to the logical clock value to calculate this value, as
presented by getReferenceTime() interface of Algorithm 2
which returns Hvref .5
Algorithm 2. getReferenceClock interface for node v in
EGSync, which returns the estimated clock value of the
reference node.
1: getReferenceClock()
2: return Lv ref
v
Due to the nondeterministic delays occurred during
the exchange of time information [7], [1], [5], [14], [32],
the agreement mechanism employed by EGSync is unable
to provide perfect agreement. It can be shown that the
error of the agreement is bounded and it essentially
depends on several values related to network parameters.
4. Please see Section B of the Appendix, which can be found on the
Computer Society Digital Library at http://doi.ieeecomputersociety.org/
10.1109/TPDS.2013.58.
5. EGSync is able to synchronize sensor nodes in the network to an
external time source. Please see Section C of the Appendix, available in the
online supplemental material, for the details.

VOL. 25,

NO. 3,

MARCH 2014

Consequently, Lref t  Lv t is bounded for all time


instants t after the agreement is established.
Theorem 4.1. If the graph G representing the communication
network remains strongly connected, then 8v 2 V :
limt!1 Lref t  Lv t   holds such that  depends on
the total degree of G and the eigenvalues of L and A2 .
Proof. Please see Section B of the Appendix, available in the
online supplemental material, for the details.
u
t

TESTBED EXPERIMENTS AND SIMULATIONS

In this section, we evaluate the performance of EGSync


through the experiments performed on a real hardware
platform and the simulations performed in our WSN
simulator. For performance comparison, we considered
GTSP since its objective is to provide tight synchronization
among the neighboring nodes, as in EGSync. We also
considered FTSP since it requires a reference node which
acts as a time source for the other nodes in the network, it
has a publicly available implementation in TinyOS, and it is
used as a benchmark by most of the studies in the literature
[14], [8], [9], [10].
For the evaluation of these protocols, we focused on the
instantaneous global skew and local skew between sensor
nodes. We also considered average global skew which is
defined as the instantaneous average of the global skew and
average local skew which is defined as the instantaneous
average of the local skew by considering all nodes.
Moreover, we made a comparison of these protocols in
terms of their computation and memory requirements and
message complexities.

5.1 Experimental Results


We used a testbed of 20 MICAz sensor nodes in our
experiments.6 We compiled the same application code with
FTSP, GTSP, and EGSync protocols, and executed these
applications on the identical testbed. The beacon period of
these protocols was 30 seconds and the number of entries in
each regression table was eight. The duration of the
experiments was approximately 20,000 seconds and we
powered on sensor nodes randomly in the first 3 minutes.
The placement of sensor nodes during the experiments is
shown in Fig. 2. The experiments on the ring topology
allowed us to observe the synchronization error between
neighboring nodes on the different paths of the ad hoc tree
constructed by flooding. The experiments on the line
topology, with which a larger diameter is obtained when
compared to the ring topology, allowed us to observe the
synchronization error between neighboring nodes as the
diameter of the network increases.
5.1.1 Synchronization Performance
We did not take into account the skew values during
approximately the first 2,500 seconds of the experiments
until the initial synchronization process completes. Table 1
summarizes the maximum skew values observed during
our experiments.
6. Please see Section A of the Appendix, available in the online
supplemental material, for details on MICAz hardware platform,
implementation of the protocols and testbed setup.

YILDIRIM AND KANTARCI: EXTERNAL GRADIENT TIME SYNCHRONIZATION IN WIRELESS SENSOR NETWORKS

637

TABLE 1
Summary of the Measured Skew Values Observed with
FTSP, GTSP, and EGSync during the Experiments

Fig. 2. The placement of sensor nodes to construct the line and ring
topologies for the experiments.

Figs. 3, 4, 5, and 6 show global, average global, local,


and average local skews measured for FTSP, GTSP, and
EGSync on the line and on the ring topologies, respectively. On the line topology, EGSync outperformed FTSP
dramatically in terms of local and global skew. Moreover,
the performances of EGSync and GTSP were quite similar.
The same situation can also be observed on the ring
topology. When compared to FTSP and GTSP, it can be
concluded that EGSync eliminated the deficiency of

inability to provide synchronization to a reference node


with tight local skew, as well as tight global skew.
Figs. 7 and 8 present the maximum synchronization
error between the reference node and the other nodes on
the line and ring topologies for FTSP and EGSync. As can
be observed, there is an exponential growth of the
synchronization error between the reference node and
other nodes as the distance to the reference node gets
larger on the line topology. This fact has been previously
introduced in [8], which is consistent with our experimental

Fig. 3. Global and average global skews measured for FTSP (left), GTSP (middle), and EGSync (right) on the line topology, respectively.

Fig. 4. Local and average local skews measured for FTSP (left), GTSP (middle), and EGSync (right) on the line topology, respectively.

Fig. 5. Global and average global skews measured for FTSP (left), GTSP (middle), and EGSync (right) on the ring topology, respectively.

Fig. 6. Local and average local skews measured for FTSP (left), GTSP (middle), and EGSync (right) on the ring topology, respectively.

638

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,

VOL. 25,

NO. 3,

MARCH 2014

Fig. 7. Maximum skew to the reference node (node with identifier 1) per node measured for FTSP (left column) and EGSync (right column) on the
line topology.

Fig. 8. Maximum skew to the reference node (node with identifier 1) per node measured for FTSP (left column) and EGSync (right column) on the
ring topology.

Fig. 9. The synchronization error between nodes 19 and 20 for FTSP (left), GTSP (middle), and EGSync (right) on the line topology, respectively.

Fig. 10. The maximum local skew per node for FTSP (left), GTSP (middle), and EGSync (right) on the line topology, respectively.

results with FTSP. However, with EGSync, since all nodes


agree on the clock value and the clock speed of the
reference node, the synchronization errors of faraway and
nearby nodes to the reference are quite similar. Hence, we
conclude that the serious scalability problem for FTSP does
not hold for EGSync. As shown in Fig. 2, two subtrees
rooted at node 1 are formed due to flooding, and hence,
nodes 8 to 12 have maximum distance to the reference node
when compared to the other nodes on the ring topology. In
FTSP, these nodes exhibit the largest synchronization error
to the reference node. However, in EGSync, the synchronization error between the reference node and the other
nodes is quite stable, as those on the line topology.
The main objective of EGSync is to provide tight
synchronization between neighboring nodes while synchronizing them to the clock of the reference node. Figs. 9 and 10
present the synchronization error between the nodes 19 and
20, which are at the end of the line topology and the
maximum local skew per node, respectively. With FTSP, as
the distance from the reference node increases, the skew
between neighboring nodes also grows. We observed a
maximum local skew more than 400 s for node with

identifier 20 during the experiments. However, the maximum local skews of sensor nodes observed with GTSP and
EGSync are quite close to each other and they do not
increase drastically as the distance to the reference node
increases. This situation can also be observed from the
synchronization error between nodes 19 and 20, which are
at the end of the line topology. In FTSP, since nodes
consider only the clock values flooded by the reference
node and they do not consider the clock of their neighbors,
the synchronization error between nodes 19 and 20 can be
quite large during the execution. However, the maximum
synchronization error between nodes 19 and 20 is quite
small in GTSP and EGSync when compared to FTSP. This is
due to the fact that sensor nodes executing GTSP and
EGSync synchronize to their neighbors. Although EGSync
additionally synchronizes sensor nodes to a reference time
source in contrary to GTSP, the synchronization error
between the neighboring nodes is quite comparable to that
in GTSP.
Figs. 11 and 12 present the synchronization error
between nodes 9 and 10 on the ring topology and the

YILDIRIM AND KANTARCI: EXTERNAL GRADIENT TIME SYNCHRONIZATION IN WIRELESS SENSOR NETWORKS

639

Fig. 11. The synchronization error between nodes 9 and 10 for FTSP (left), GTSP (middle), and EGSync on the ring topology, (right), respectively.

Fig. 12. The maximum local skew per node for FTSP (left), GTSP (middle), and EGSync (right) on the ring topology, respectively.

maximum local skew per node, respectively. It can be


observed from this figure that the maximum local skew
increases in FTSP as the distance from the reference node
increases. However, this case does not hold for GTSP and
EGSync. If we consider nodes 9 and 10, although they are
neighbors, a local skew of 25 s is observed between them
in FTSP. However, a skew of 5 and 8 s are observed
between them in GTSP and EGSync, respectively. The large
skew in FTSP is due to the fact that nodes 9 and 10 are on
different paths of the tree formed by flooding and the
synchronization errors propagate differently on these
paths. We conclude that neighboring nodes are more
tightly synchronized in EGSync when compared to FTSP,
and the maximum local skews of the nodes are quite close
to each other.
Our experiments showed that EGSync synchronizes
sensor nodes to a reference node quite tightly by also
preserving tight synchronization among the neighboring
nodes.

5.1.2 Evaluation of the Flooding Mechanism


As can be observed from the experiments, the local and
global skew values of EGSync are very slightly worse than
those of GTSP. The reason behind this observation can be
explained by examining the periodic time offset flooding
mechanism initiated by the reference node. Upon receiving
the flooded time offset, nodes update their time information
about the reference node. However, nodes propagate the
received time offset after waiting for a given period of time.
This slow-flooding strategy introduces larger instantaneous
local and global skews between the nodes which collected
the recent time information of the reference node and the
nodes which have not collected this time information yet.
The negative effect of the slow propagation of the
reference time information can be reduced by increasing
the speed of the flood, i.e., with rapid-flooding. However,
rapid-flooding can also be slow due to neighborhood
contention since the nodes cannot propagate the flood until
their neighbors have finished their transmissions and it is
difficult in WSNs [33], [10].

5.1.3 Energy Consumption and Memory Requirements


Apart from the synchronization quality, we also compared
the performances of EGSync, GTSP, and FTSP in terms of
memory allocation and communication and CPU overhead.
Table 2 summarizes our measurements. The amount of
memory which is allocated to store collected time information determines the major memory requirements of the
protocols. In publicly available implementation of FTSP,
40 bytes of memory is allocated for the least-squares table
which is used to store the time information of the reference
node. In our implementation of GTSP, a least-squares table
is allocated for each neighbor, and hence, together with
the additional stored information, a total of 64 bytes of
memory is allocated to keep track of a neighboring node.
EGSync requires an additional 8 bytes for each neighbor
(to store information of a neighboring node about the
reference node) and, hence, allocates 72 bytes per neighbor.
It can be concluded that GTSP and EGSync increase the
memory requirements of time synchronization.
Normally, the longer the synchronization messages,
the more the time is required to transmit and receive these
messages which increases the energy consumption. The
length of the synchronization messages is 13 and 14 bytes in
GTSP and FTSP, respectively. Since the time information of
the reference node is also propagated in EGSync, this
increased the length of the synchronization messages to
23 bytes. However, the payload of the messages is fixed in
most platforms, for example, it is 28 bytes in TinyOS.
Hence, synchronization messages of EGSync do not exceed
TABLE 2
Memory Requirements, CPU Overhead, and Synchronization
Message Length of FTSP, GTSP, and EGSync during
the Experiments

jN jis used to denote maximum neighborhood cardinality.

640

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,

VOL. 25,

NO. 3,

MARCH 2014

Fig. 13. EGSync simulation results for different size of networks with line (left) and ring (middle) topologies. On the right, simulation results of
networks consisting 100 nodes which have different neighbor densities are presented.

the fixed message length in TinyOS and do not increase the


duration of the communication. Moreover, the communication frequency in all of these three protocols is fixed (the
beacon period was 30 seconds in our experiments), i.e., their
communication complexities are equal. Consequently, it can
be concluded that their overhead in terms of communication are equal.
From the time at which a synchronization message is
received to the time at which the received message is
processed and the time information is updated determines
the major computation requirements. Upon receiving a
synchronization message, FTSP, GTSP, and EGSync consumed at most 5,500, 6,800, and 6,950 microseconds of CPU
time, respectively, for processing the corresponding message. Hence, EGSync has approximately 23 and 2 percent
more CPU overhead when compared to FTSP and GTSP,
respectively.
By considering the overall results, it can be concluded
that EGSync slightly increases the energy consumption and
memory allocation of GTSP.

5.2

Simulation Results for Longer and Denser


Networks
In addition to the real-world experiments, we implemented
EGSync in our WSN simulator using Java language to
observe how synchronization errors between arbitrary
nodes and between neighboring nodes change as the
diameter of the network increases. During our simulations,
we modeled the hardware clocks of nodes in software with a
random drift of 50 ppm. We applied our evaluation metrics
for the networks which is constructed as line and ring
topologies which have different number of sensors
(and hence diameters). For each network, we performed
10 simulation runs and averaged the calculated average
global and average local skews for these runs. Fig. 13 shows
the simulation results from which we conclude that the local
and global skews of EGSync grow substantially slowly as
the diameter of the network increases. EGSync exhibits quite
tight synchronization on longer and larger networks as well.
To observe the synchronization accuracy of EGSync on
denser networks, we constructed five different networks
consisting of 100 sensor nodes and having an average
neighborhood cardinality of 10, 20, 30, 40, and 50,
respectively. These networks may introduce more packet
collisions due to their high density when compared to the
line and ring topologies. On the other hand, sensor nodes
interact with more neighbors and we observed that this
situation improves the performance of the agreement
protocol. Moreover, since there are much more alternative
paths and the synchronization messages from the reference

node may follow shorter paths to reach faraway nodes,


these nodes may collect time information more quickly. As
can be observed from Fig. 13, the synchronization performance of EGSync is affected positively from the increase of
the network density.

CONCLUSION

Most of the time synchronization protocols in WSNs suffer


from exhibiting large synchronization errors between
neighboring nodes, although they provide tight synchronization between arbitrary nodes. For instance in FTSP,
neighboring nodes are poorly synchronized since each
sensor node neglects the time information of its neighboring
nodes while synchronizing to a reference node. GTSP
prevents this shortcoming by synchronizing each sensor
node to its neighboring nodes. However, it is quite open
how to anchor the sensor network to an external time
source since there is not any reference node in GTSP.
External synchronization is crucial for the correct operation
of the applications such as target tracking which also
require tightly synchronized neighboring nodes.
In this study, we addressed this problem and designed a
time synchronization protocol, namely EGSync, which aims
at providing tight synchronization between neighboring
nodes while synchronizing them to the clock of a reference
node at the same time. In EGSync, all of the sensor nodes
agree on the clock speed and the clock value of the reference
node by using the time information flooded by this node
and by applying an agreement algorithm which is based on
averaging the time information received from neighboring
nodes. Our experiments and simulations showed that the
synchronization accuracy of EGSync is quite tight in terms
of local and global skews. Moreover, we observed that
EGSync has similar overhead in terms of computation,
communication, and memory requirements when compared to GTSP.
A major disadvantage of EGSync is that our current
implementation requires a fixed reference node and it is
predefined before deployment of the sensor network.
Hence, EGSync fails to maintain synchronization to the
reference node when this node crashes. However, EGSync
still continues to synchronize the logical clocks of the sensor
nodes by executing the agreement algorithm and by using
the time information flooded by the reference node just
before the time it has crashed. When using EGSync to
synchronize the network to UTC time, more than one
reference nodes which have access to UTC time can be used
to improve the robustness of EGSync in case of reference
node failures.

YILDIRIM AND KANTARCI: EXTERNAL GRADIENT TIME SYNCHRONIZATION IN WIRELESS SENSOR NETWORKS

ACKNOWLEDGMENTS
Kasim Sinan Yildirim acknowledges the Turkish Scientific
_
BITAK)
and Technical Research Council (TU
for supporting
this work through a domestic PhD scholarship program
(BAYG-2211).

REFERENCES
[1]
[2]
[3]
[4]

[5]
[6]

[7]
[8]
[9]
[10]

[11]
[12]
[13]

[14]

[15]
[16]
[17]
[18]
[19]
[20]

[21]

J. Elson, L. Girod, and D. Estrin, Fine-Grained Network Time


Synchronization Using Reference Broadcasts, SIGOPS Operating
System Rev., vol. 36, no. SI, pp. 147-163, 2002.
F. Sivrikaya and B. Yener, Time Synchronization in Sensor
Networks: A Survey, IEEE Network, vol. 18, no. 4, pp. 45-50, July/
Aug. 2004.
I.F. Akyildiz and M.C. Vuran, Wireless Sensor Networks. John Wiley
& Sons, 2010.
J. van Greunen and J. Rabaey, Lightweight Time Synchronization for Sensor Networks, Proc. Second ACM Intl Conf.
Wireless Sensor Networks and Applications (WSNA 03), pp. 1119, 2003.
S. Ganeriwal, R. Kumar, and M.B. Srivastava, Timing-Sync
Protocol for Sensor Networks, Proc. First Intl Conf. Embedded
Networked Sensor Systems (SenSys 03), pp. 138-149, 2003.
H. Dai and R. Han, TSync: A Lightweight Bidirectional Time
Synchronization Service for Wireless Sensor Networks, ACM
SIGMOBILE Mobile Computing and Comm. Rev., vol. 8, pp. 125-139,
2004.
M. Maroti, B. Kusy, G. Simon, and A. Ledeczi, The Flooding
Time Synchronization Protocol, Proc. Second Intl Conf. Embedded
Networked Sensor Systems (SenSys 04), pp. 39-49, 2004.
C. Lenzen, P. Sommer, and R. Wattenhofer, Optimal Clock
Synchronization in Networks, Proc. Seventh ACM Conf. Embedded
Networked Sensor Systems (SenSys 09), Nov. 2009.
T. Schmid, Z. Charbiwala, R. Shea, and M. Srivastava, Temperature Compensated Time Synchronization, IEEE Embedded Systems
Letters, vol. 1, no. 2, pp. 37-41, Aug. 2009.
T. Schmid, Z. Charbiwala, Z. Anagnostopoulou, M.B. Srivastava,
and P. Dutta, A Case against Routing-Integrated Time Synchronization, Proc. Eighth ACM Conf. Embedded Networked Sensor
Systems, pp. 267-280, http://doi.acm.org/10.1145/1869983.
1870010, 2010.
R. Fan and N. Lynch, Gradient Clock Synchronization,
Distributed Computing, vol. 18, no. 4, pp. 255-266, 2006.
J.E. Elson, Time Synchronization in Wireless Sensor Networks,
PhD dissertation, Univ. of California, Los Angeles, 2003.
D. Tulone, On the Feasibility of Time Estimation under Isolation
Conditions in Wireless Sensor Networks, Algorithmica, vol. 49,
no. 4, pp. 386-411, http://dx.doi.org/10.1007/s00453-007-9099-1,
2007.
P. Sommer and R. Wattenhofer, Gradient Clock Synchronization in Wireless Sensor Networks, Proc. Eighth ACM/IEEE Intl
Conf. Information Processing in Sensor Networks (IPSN 09), Apr.
2009.
J. Lundelius and N.A. Lynch, An Upper and Lower Bound for
Clock Synchronization, Information and Control, vol. 62, no. 2/3,
pp. 190-204, 1984.
J.Y. Halpern, N. Megiddo, and A.A. Munshi, Optimal Precision
in the Presence of Uncertainty, Proc. Seventeenth Ann. ACM Symp.
Theory of Computing (STOC 85), pp. 346-355, 1985.
S. Biaz and J.L. Welch, Closed Form Bounds for Clock
Synchronization under Simple Uncertainty Assumptions, Information Processing Letters, vol. 80, pp. 151-157, 2001.
B. Patt-Shamir and S. Rajsbaum, A Theory of Clock Synchronization (Extended Abstract), Proc. 26th Ann. ACM Symp. Theory of
Computing (STOC 94), pp. 810-819, 1994.
T.K. Srikanth and S. Toueg, Optimal Clock Synchronization,
J. ACM, vol. 34, no. 3, pp. 626-645, 1987.
K. Sun, P. Ning, and C. Wang, TinySeRSync: Secure and Resilient
Time Synchronization in Wireless Sensor Networks, Proc. 13th
ACM Conf. Computer and Comm. Security (CCS 06), pp. 264-277,
2006.
B. Kusy, P. Dutta, P. Levis, M. Maroti, A. Ledeczi, and D. Culler,
Elapsed Time on Arrival: A Simple and Versatile Primitive for
Canonical Time Synchronisation Services, Intl J. Ad Hoc
Ubiquitous Computing, vol. 1, no. 4, pp. 239-251, 2006.

641

[22] R. Fan, I. Chakraborty, and N.A. Lynch, Clock Synchronization


for Wireless Networks, Proc. Eighth Intl Conf. Principles of
Distributed Systems, pp. 400-414, 2005.
[23] T. Locher and R. Wattenhofer, Oblivious Gradient Clock
Synchronization, Proc. 20th Intl Symp. Distributed Computing
(DISC 06), Sept. 2006.
[24] C. Lenzen, T. Locher, and R. Wattenhofer, Clock Synchronization
with Bounded Global and Local Skew, Proc. 49th Ann. IEEE
Symp. Foundations of Computer Science (FOCS 08), Oct. 2008.
[25] R.M. Pussente and V.C. Barbosa, An Algorithm for Clock
Synchronization with the Gradient Property in Sensor Networks,
J. Parallel Distributed Computing, vol. 69, no. 3, pp. 261-265, 2009.
[26] F. Kuhn and R. Oshman, Gradient Clock Synchronization Using
Reference Broadcasts, Principles of Distributed Systems, vol. 5923,
pp. 1878-1886, 2009.
[27] C. Lenzen, T. Locher, and R. Wattenhofer, Tight Bounds for
Clock Synchronization, J. ACM, , vol. 57, no. 2, article 8, Jan. 2010.
[28] F. Kuhn, T. Locher, and R. Oshman, Gradient Clock Synchronization in Dynamic Networks, Proc. 21st ACM Symp. Parallelism in
Algorithms and Architectures (SPAA 09), Aug. 2009.
[29] F. Kuhn, C. Lenzen, T. Locher, and R. Oshman, Optimal Gradient
Clock Synchronization in Dynamic Networks, Proc. 29th Symp.
Principles of Distributed Computing (PODC 10), July 2010.
[30] L. Schenato and G. Gamba, A Distributed Consensus Protocol for
Clock Synchronization in Wireless Sensor Network, Proc. IEEE
Conf. Decision and Control (CDC 07), 2007.
[31] L. Schenato and F. Fiorentin, Average TimeSynch: A ConsensusBased Protocol for Time Synchronization in Wireless Sensor
Networks, Automatica, vol. 47, no. 9, pp. 1878-1886, 2011.
[32] K. Romer, P. Blum, and L. Meier, Time Synchronization and
Calibration in Wireless Sensor Networks, Handbook of Sensor
Networks: Algorithms and Architectures, I. Stojmenovic, ed., John
Wiley & Sons, pp. 199-237, 2005.
[33] J. Lu and K. Whitehouse, Flash Flooding: Exploiting the Capture
Effect for Rapid Flooding in Wireless Sensor Networks, IEEE
INFOCOM 09, pp. 2491-2499, 2009.
Kasim Sinan Yildirim received the BSc, MSc,
_
and PhD degrees from Ege University, Izmir,
Turkey, in 2003, 2006, and 2012, respectively.
Since 2007, he has been a research assistant at
the Department of Computer Engineering, Ege
University. His research interests include embedded systems, distributed systems, distributed algorithms, and wireless sensor networks.
He is a member of the IEEE.

Aylin Kantarci received the BSc, MSc, and PhD


degrees from Ege University, _Izmir, Turkey, in
1992, 1994, and 2000, respectively. She is an
associate professor at the Department of Computer Engineering, Ege University. Her current
research interests include distributed systems,
wireless sensor networks, and multimedia.

. For more information on this or any other computing topic,


please visit our Digital Library at www.computer.org/publications/dlib.

1232

IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, VOL. 62, NO. 3, MARCH 2014

A Low-Profile Unidirectional Printed Antenna for


Millimeter-Wave Applications
Mingjian Li, Graduate Student Member, IEEE, and Kwai-Man Luk, Fellow, IEEE

AbstractBy combining a printed reflector-backed one-wavelength bowtie antenna and a printed double loop antenna, a new
wideband high-gain antenna element is demonstrated. Due to its
low-profile structure of
, it is attractive to be used in
an array environment. Experimentally, a series of antenna arrays
operated at 60 GHz are designed and fabricated. They achieve
wide impedance bandwidths covering the 5764 GHz unlicensed
frequency band. Over this band, the measured antenna gains are
ranged from 14.515.5 dBi, 18.320.1 dBi and 22.525.2 dBi for
4-element, 14-element and 50-element antenna arrays, respectively. The measured radiation pattern has low cross-polarization
level and low back radiation. The proposed low-profile antenna
and arrays are low-cost in fabrication as they are simply made on
a single-layer printed circuit board.
Index TermsAntenna array, low-profile antenna, wideband
antenna, 60 GHz radio.

I. INTRODUCTION

ITH the advancement of millimeter-wave (MMW)


technologies, various applications have been proposed,
such as the 39-GHz band for licensed high-speed data links[1],
60-GHz band for unlicensed short range data links [2], 77-GHz
band for automotive radar [3], 94-GHz band for imaging radar
[4], and 7176, 8186, 9295 GHz bands for point-to-point
high bandwidth communication links. For these applications,
the high free space loss and strong atmospheric absorption limit
the distance of communications. Thus, high-gain antennas are
normally required to mitigate the attenuation of such high-frequency electromagnetic radiation.
Recently, many antennas and arrays with good directional
property were proposed in [5][15] for the unlicensed 5764
GHz frequency band due to many attractive applications including high definition multimedia interface, high definition
video streaming, high-speed internet, and wireless gigabit
Ethernet. Most of these designs are made of multi-layer substrates which increase antenna complexity and manufacturing
costs, regardless of using any fabrication techniques such as
printed circuit board (PCB), liquid crystal polymer (LCP)
and even low-temperature co-fired ceramic (LTCC). Among
those antennas, Yagi antenna is a good solution for high gain
Manuscript received May 25, 2013; revised November 02, 2013; accepted
December 17, 2013. Date of publication December 23, 2013; date of current
version February 27, 2014. This work was supported in part by the Research
Grants Council of the Hong Kong SAR and CityU Strategic Research Grant
[Project No. CityU 9041677 and SRG 7008113].
The authors are with the Department of Electronic Engineering and the State
Key Laboratory of Millimeter Waves, City University of Hong Kong, Hong
Kong (e-mail: mingjiali2-c@my.cityu.edu.hk).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TAP.2013.2295832

applications [5], [6]. In [6], the method of placing stacked Yagi


directive elements on a 4
4 patch antenna array fed by a
substrate integrated waveguide (SIW) network realized a high
gain and low side-lobe performance. Slot antenna array also
exhibits high-gain property [7][9]. The use of backed cavity
as reflector and radiating element enhancing antenna gain and
impedance bandwidth was reported in [7]. The plate-laminated slot array excited by a hollow-waveguide corporate-feed
achieved a high gain of over 32 dBi [8]. In 2011, Grid array
antennas at 60-GHz band were first proposed by Zhang, et al.
[11]. The antenna achieved a maximum gain of 17.7 dBi [11]. In
addition, based on cavity array, horn antenna, magneto-electric
dipole and tapered slot, many millimeter-wave antennas with
good performances were designed [12][15].
For practical use, antennas that are low in fabrication cost and
low in profile are highly desirable. Microstrip patch antenna is a
competitive choice. Several designs on metamaterial structures
or superstrate enable a patch element to achieve high gain at
60-GHz band [16][18]. However, these techniques need multilayer fabrication, and therefore lose the low-cost advantage of
microstrip antenna. Array of patch elements can be used for
realizing high gain. The microstrip line fed microstrip antenna
array is a single-layer structure, and hence low in fabrication
cost, but suffers from narrow bandwidth [19]. Techniques are
available to enhance the bandwidth of microstrip antennas but
multi-layer structures are unavoidable, including using L-probe
feed [20], stacked patches [21], U-slot patch [22] and aperture
coupled feed [23].
In this paper, we present a new unidirectional radiating
element consisting of a planar reflector-backed one-wavelength
bowtie and a double-loop antenna. Design of high gain antenna
arrays based on this antenna element is also investigated. These
antenna array prototypes are fabricated by using a single-layer
printed circuit board technology and designed to operate at
( is the
60-GHz band. They have a low profile of
wavelength referring to 60 GHz) and achieve wide impedance
bandwidths and high gains. The design process, broadband
characteristic and radiation pattern for single element are
elaborated in Section II. Three antenna arrays of 4-element,
14-element and 50-element are investigated in Section III.
II. ANTENNA ELEMENT
A. Design Process
Practical applications at millimeter wave usually require
high-gain unidirectional radiation antennas. If we design a
millimeter-wave antenna array in a single-layer structure, the
dielectric substrate must be properly chosen for minimizing the
insertion loss of the feeding network and assuring the validity
of using microstrip line at millimeter wave. Too thin substrate

0018-926X 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

LI AND LUK: A LOW-PROFILE UNIDIRECTIONAL PRINTED ANTENNA FOR MILLIMETER-WAVE APPLICATIONS

1233

TABLE I
DIMENSIONS FOR THE PROPOSED ANTENNA

is one electrical wavelength in Duroid 5880 substrate at 60 GHz.

Fig. 1. Antenna design process. (a) Perspective view of the trapezoidal


microstrip patch antenna, (b) perspective view of the reflector-backed
one-wavelength bowtie antenna, (c) top view of the loop-loaded reflectorbacked one-wavelength bowtie antenna.

gives large conductor loss and too thick substrate introduces


large radiation loss [24]. In this paper, Duroid 5880 substrate
is chosen as its
(
,
is the
wavelength at 60 GHz in free space), dielectric permittivity
, metal thickness
oz and loss tangent
at 60 GHz [25]. On this substrate, 46.5170
ohm microstrip lines can be achieved practically (line width
0.0350.89 mm) at 60 GHz. Hence, we need an antenna which
possesses wideband and low-profile characteristics at the
same time. Microstrip line fed microstrip antenna has a low
profile but suffers from a narrow bandwidth. A conventional
half-wavelength dipole antenna has a wide bandwidth, but it
should be placed above a reflector with a distance of about
for achieving a reasonable performance.
Fig. 1(a) shows a trapezoidal microstrip patch antenna which
is excited at its edge. This antenna is very narrow in bandwidth
as no bandwidth enhancement technique is employed [20][23].
Fig. 1(b) shows a reflector-backed one-wavelength bowtie antenna which consists of two portions, the upper dipole arm and
the lower dipole arm. The upper arm is operated as a microstrip
line fed trapezoidal microstrip patch antenna which is identical
to the antenna shown in Fig. 1(a). The lower arm, placed opposite to the upper arm, is operated as another trapezoidal microstrip patch antenna. This arm is excited by a microstripline (MSL) to coplanar-waveguide (CPW) transition which also
feeds the upper arm at its output end. This transition, composed
of two sections including a tapered section and a straight section
,
), ensures the
bowtie an(
tenna with a relatively wide impedance bandwidth. As shown in
Fig. 1(c), the reflector-backed one-wavelength bowtie antenna

Fig. 2. Input impedances of the trapezoidal microstrip patch antenna, reflectorbacked one-wavelength bowtie antenna and loop-loaded reflector-backed onewavelength bowtie antenna.

Fig. 3. Radiation pattern of the loop-loaded reflector-backed one-wavelength


bowtie antenna at 60 GHz.

is loaded with two


loops. The two loops, working as the conventional double-quad or bi-quad antenna [26], are connected
in parallel to the bowtie antenna for broadening the antenna
impedance bandwidth further. The dimensions of the proposed
antenna element are summarized in Table I.
B. Broadband Characteristic
Fig. 2 shows the input impedances of the three antennas
shown in Fig. 1. The trapezoidal microstrip patch antenna
resonates at around 50 GHz with a very large Re
. It can
be easily matched to 50 ohm but will be narrowband owing

1234

IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, VOL. 62, NO. 3, MARCH 2014

Fig. 4. Loop-loaded reflector-backed one-wavelength bowtie antenna excited


at its center and its radiation pattern at 60 GHz.

to its single resonance. The reflector-backed one-wavelength


bowtie antenna has two electrical resonances at around 51 GHz
and 56.5 GHz which are caused by the upper and lower dipole
arms, respectively. The two arms, operated as two trapezoidal
patches, have different resonant frequencies due to their different feeding methods. From 50 to 65 GHz, the Re
and
Im
of the bowtie antenna vary substantially from 18 to
187 ohm and 60 to 86 ohm, respectively. The loop-loaded
reflector-backed one-wavelength bowtie antenna possesses
three resonances at around 52 GHz, 56 GHz and 63 GHz. Like
the bowtie antenna, the first two resonances are from the bowtie
arms and the third resonance is introduced by the loaded
loops. The loops, paralleled with the bowtie, reduce the magnitudes of the Re
and Im
below 59 GHz. Meanwhile,
the input impedance above 59 GHz is increased because of the
third resonance introduced. Hence, compared with the bowtie
antenna, this loop-loaded bowtie antenna element has more
stable input impedance with Re
of 2695 ohm and Im
of 2550 ohm from 50 to 65 GHz.

Fig. 5. Top and side views of the proposed antenna arrays. (a) 4-element array,
(b) 14-element array, (c) 50-element array.

C. Radiation Pattern
Fig. 3 shows a simulated radiation pattern of the loop-loaded
dipole antenna at 60 GHz. The maximum gain is about 10 dBi
in the broadside direction. The radiation pattern in the E-plane
is slightly asymmetrical and the cross-polarization level in the
H-plane is high, over 15 dB, which are due to the separation
in one arm of the bowtie to make space for the microstrip-line
to coplanar-waveguide transition. To justify this argument, a
loop-loaded electric dipole antenna excited at its center (without
the MSL to CPW transition) is examined as shown in Fig. 4. It
demonstrates that symmetrical radiation patterns in both principle planes are achieved. Since the cross-polarization levels are
lower than 40 dB in both planes, the
curves cannot be
shown in the figure.
III. ANTENNA ARRAYS
A. Array Geometries
,
Fig. 5(a) shows the 4-element antenna array (
) which adopts the parallel feeding method. This
method suffers from an asymmetrical radiation pattern with a
slightly high cross-polarization level because of the radiation
loss from the feed lines and the element asymmetrical geometry
caused by the MSL to CPW transition.

Fig. 5(b) shows the 14-element antenna array (


,
,
and
). It
adopts the hybrid series/parallel feeding method. This method
has smaller frequency sensitivity of the radiation pattern than
a purely series feeding method and therefore achieves a wider
gain bandwidth. The reason is that nearly each frequency dependent microstrip line segment in the network can find a counterpart in which the current is in 180 phase difference. Meanwhile, compared with the parallel feeding method, the series/
parallel feeding method reduces microstrip lines used in the network when exciting an array with a given number of elements
[24], [27]. Thus, this method has lower insertion loss.
Fig. 5(c) shows the 50-element antenna array which also
adopts the hybrid series/parallel feeding method. Each element
)
is through a 100 ohm impedance transformer (
to increase the antenna output impedance. And then it is connected to its opposite neighboring elements through
71
lines (Line 1, 2, 3 & 4), so that the signals at [1] and [2]
reference planes are 180 apart, which is vital for achieving a
broadside radiation beam and suppressing the radiation losses
due to the element asymmetry geometry, as mentioned in
Section II-C. This also means that the adjacent elements are
lines (Line 1, 2, 3 & 4), which makes the
joined through
( is the wavelength at 60
element spacing equals to

LI AND LUK: A LOW-PROFILE UNIDIRECTIONAL PRINTED ANTENNA FOR MILLIMETER-WAVE APPLICATIONS

1235

Fig. 7. Simulated and measured gains of the 4-element, 14 element and 50-element antenna arrays.

B. Array Performances
Fig. 6. Simulated and measured SWRs. (a) 4-element array, (b) 14-element
array, (c) 50-element array.

GHz in free space). The Line 2 and Line 3 are responsible for
feeding the middle 28 elements. And the Line 1 and Line 4 feed
the upper and lower 22 elements. Notice that the Line 1, 2, 3 &
4 do not function as impedance transformers because of
lines between opposite neighboring elements [25]. The Line 5
and Line 6 with
are responsible for connecting the Line 1
& 2 and Line 3 & 4, respectively. The junctions in the middle
of the Line 2 and Line 3 are shifted upwards and downwards
for the Line 5 & 6, so
respectively for keeping a length of
that all elements can be excited in phase. The junctions of the
50 impedance
Line 2 & 3 joined to the feed point through
transformers
and tapered lines for impedance
transformation. The feed point is positioned at the lower side of
the substrate for the sake of confirming 180 phase difference
at the junctions of the Line 2 & 3. It can be observed that most
of the elements are connected together at the edges of the loops
. The connection
for reducing the element spacing to
between adjacent elements has no influence on the element
operation. The reason is that the main electromagnetic radiation
is from the radiating slots of the bowtie arms. Although some
part of the radiation is from the loops, especially at higher
frequencies, the adjacent elements are in phase and therefore
the connection between them will not affect the element radiation. The unconnected elements are not joined to their adjacent
elements for leaving enough space occupied by the vertical
lines (such as Line 5 & 6). This 50-element array, printed on the
single-layer Duroid 5880 substrate, is mounted on an aluminum
by using four metal
antenna fixture
screws (M1). A V-band connector (SHF: KPC185F302) and a
glass bead (SHF: GB185) are assembled together and launched
at the back of the antenna fixture. The inner conductor of the
connector is connected to the array feed network at the feed
point, as shown in Fig. 5. In practice, the three proposed arrays
may be integrated into a 60-GHz MMIC or silicon CMOS
system directly so that the antenna fixture is not a compulsory
part.

All simulations were implemented by EM simulation software Ansoft HFSS. In simulation, the setups for the substrate
and metal are referred to [25] (
,
,
. And the measurements on
impedance bandwidth, gain and radiation patterns were accomplished by a millimeter wave band Agilent E8361A Network
Analyzer and an in-house far-field millimeter-wave antenna
measurement system which was also employed in [14]. It is
noted that this system has been updated so as to broaden the
. The details on the
measurable elevation angle range to
measurement setup have been introduced in [14].
Fig. 6 shows the simulated and measured SWRs of the three
proposed arrays. The measured impedance bandwidths of the
are
4-element, 14-element and 50-element arrays
24.2% (51.8 to 66.1 GHz), 24% (55 to 70 GHz) and 16.7%
(56.7 to 67 GHz), respectively, which all cover 5764 GHz
band. Fig. 7 shows the simulated and measured broadside gains
of the proposed arrays. The measured 3-dB gain bandwidths
of the 4-element, 14-element and 50-element arrays are 20.9%
(54.3 to 67 GHz), 18.9% (56.4 to 68.2 GHz) and 12.5% (57
to 64.6 GHz), respectively. For the sake of the series/parallel
feeding method, the gain bandwidth tends to decrease, as the
number of antenna elements increases [27]. The measured peak
gains for the three arrays are 15.5 dBi at 60.2 GHz, 20.1 dBi
at 61 GHz and 25.2 dBi at 58.4 GHz. The overlapped 3 dB
gain bandwidth and the impedance bandwidth for the 4-element,
14-element and 50-element arrays is 19.6% (54.3 to 66.1 GHz),
18.9% (56.4 to 68.2 GHz) and 12.5% (57 to 64.6 GHz), respectively. Simulations are in relatively good agreement with results obtained from experiments. Fig. 8 shows the simulated and
measured radiation patterns of the 50-element antenna array.
Over the whole operating frequency band, the measured radiation pattern exhibits a cross-polarization level less than 30
dB, front-to-back ratio larger than 31 dB and the first side-lobe
level less than 9 dB. Both the measured and simulated results
agree very well. By applying the Tai and Pereira equation [28],
the antenna directivity can be evaluated. The approximated directivity at 60 GHz is 26.5 dBi, and the corresponding radiation
and aperture efficiencies are 63.7% and 72.5%, respectively.

1236

IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, VOL. 62, NO. 3, MARCH 2014

TABLE II
KEY DATA OF ANTENNA ARRAYS FOR 60 GHz APPLICATIONS

The aperture efficiency and radiation efficiency are the measured values at the operation frequency of each antenna (60 GHz for our works).
Aperture efficiency was calculated from simulated 90% radiation efficiency.

Fig. 9. Photos of the antenna array prototypes. (a) Top view of all proposed arrays, (b) backside view of the fabricated PCB for 50-element array, (c) backside
view of the 50-element array with antenna fixture, (d) side view of the 50-element array with antenna fixture.

achieve wide impedance bandwidth, high gain, and high radiation and aperture efficiencies.
IV. CONCLUSION

Fig. 8. Simulated and measured radiation patterns of the 50-element antenna


array at 57 GHz, 60 GHz, and 64 GHz.

Photos of the array prototypes with 4 elements, 14 elements


and 50 elements are shown in Fig. 9. The key data of our
works and other arrays for 60-GHz applications is summarized
in Table II. It is noted that our proposed low-cost low-profile
printed antenna arrays, fabricated by using single-layer PCBs,

A new printed loop-loaded reflector-backed one-wavelength


bowtie antenna has been presented. This element has a low profile of
and stable input impedance at the same time.
Based on this element, a series of low-cost single-layer antenna
arrays were designed and fabricated at 60 GHz. They have
wide impedance bandwidths covering 5764 GHz and high
gain properties. The unidirectional radiation patterns exhibit
low cross-polarization and back radiation levels. The measured
results of the fabricated prototypes are in good agreement
with simulated results. In addition, the proposed antenna array
made of single-layer printed circuit board is very low cost in
fabrication. They are excellent candidates for various low-cost
millimeter-wave systems.

LI AND LUK: A LOW-PROFILE UNIDIRECTIONAL PRINTED ANTENNA FOR MILLIMETER-WAVE APPLICATIONS

ACKNOWLEDGMENT
The authors would like to thank K. B. Ng for helping with the
antenna measurements.
REFERENCES
[1] T. Sehm, A. Lehto, and A. V. Raisanen, A large planar 39-GHz antenna array of waveguide-fed horns, IEEE Trans. Antennas Propag.,
vol. 46, no. 8, pp. 11891193, Aug. 1998.
[2] P. Smulders, Exploiting the 60 GHz band for local wireless multimedia access: Prospects and future directions, IEEE Commun. Mag.,
vol. 40, no. 1, pp. 140147, Jan. 2002.
[3] B. G. Porter, L. L. Rauth, J. R. Mura, and S. S. Gearhart, Dual-polarized slot-coupled patch antennas on Duroid with Teflon lenses for
76.5-GHz automotive radar systems, IEEE Trans. Antennas Propag.,
vol. 47, no. 12, pp. 18361842, Dec. 1999.
[4] G. P. Gauthier, J.-P. Raskin, L. P. B. Katehi, and G. M. Rebeiz, A
94-GHz aperture-coupled micromachined microstrip antenna, IEEE
Trans. Antennas Propag., vol. 47, no. 12, pp. 17611766, Dec. 1999.
[5] S.-S. Hsu, K.-C. Wei, C.-Y. Hsu, and H. Ru-Chuang, A 60-GHz millimeter-wave CPW-fed Yagi antenna fabricated by using 0.18CMOS technology, IEEE Electron Device Letters, vol. 29, no. 6, pp.
625627, Jun. 2008.
[6] O. Kramer, T. Djerafi, and K. Wu, Very small footprint 60 GHz
stacked yagi antenna array, IEEE Trans. Antennas Propag., vol. 59,
no. 9, pp. 32043210, Sep. 2011.
[7] K. Gong, Z. N. Chen, X. Qing, P. Chen, and W. Hong, Substrate integrated waveguide cavity-backed wide slot antenna for 60-GHz bands,
IEEE Trans. Antennas Propag., vol. 60, no. 12, pp. 60236026, Dec.
2012.
[8] T. Tomura, Y. Miura, M. Zhang, J. Hirokawa, and M. Ando, A 45
linearly polarized hollow-waveguide corporate-feed slot array antenna
in the 60-GHz band, IEEE Trans. Antennas Propag., vol. 60, no. 8,
pp. 36403646, Aug. 2012.
[9] X.-P. Chen, K. Wu, L. Han, and F. He, Low-cost high gain planar
antenna array for 60-GHz band applications, IEEE Trans. Antennas
Propag., vol. 58, no. 6, pp. 21262129, Jun. 2010.
[10] T. Sehm, A. Lehto, and A. V. Risnen, A high-gain 58 GHz box-horn
array antenna with suppressed grating lobes, IEEE Trans. Antennas
Propag., vol. 47, no. 7, pp. 11251130, Jul. 1999.
[11] B. Zhang and Y. P. Zhang, Grid array antennas with subarrays and
multiple feeds for 60-GHz radios, IEEE Trans. Antennas Propag., vol.
60, no. 5, pp. 22702275, May 2012.
[12] J. Xu, Z. N. Chen, X. Qing, and W. Hong, Bandwidth enhancement
for a 60 GHz substrate integrated waveguide fed cavity array antenna
on LTCC, IEEE Trans. Antennas Propag., vol. 59, no. 3, pp. 826832,
Mar. 2011.
[13] B. Pan, Y. Li, G. E. Ponchak, J. Papapolymerou, and M. M. Tentzeris,
A 60-GHz CPW-fed high-gain and broadband integrated horn antenna, IEEE Trans. Antennas Propag., vol. 57, no. 4, pp. 10501056,
Apr. 2009.
[14] K. B. Ng, H. Wong, K. K. So, C. H. Chan, and K. M. Luk, 60 GHz
plated through hole printed magneto-electric dipole antenna, IEEE
Trans. Antennas Propag., vol. 60, no. 7, pp. 31293136, Apr. 2012.
[15] L. Pazin and Y. Leviatan, A compact 60-GHz tapered slot antenna
printed on LCP substrate for WPAN applications, IEEE Antennas
Wireless Propag. Lett., vol. 9, pp. 272275, 2010.
[16] S. J. Franson and R. W. Ziolkowski, Gigabit per second data transfer
in high-gain metamaterial structures at 60 GHz, IEEE Trans. Antennas
Propag., vol. 57, no. 10, pp. 29132925, Oct. 2009.
[17] A. E. I. Lamminen, A. R. Vimpari, and J. Sily, UC-EBG on LTCC for
60-GHz frequency band antenna applications, IEEE Trans. Antennas
Propag., vol. 57, no. 10, pp. 29042912, Oct. 2009.
[18] H. Vettikalladi, O. Lafond, and M. Himdi, High-efficient and highgain superstrate antenna for 60-GHz indoor communication, IEEE
Antennas Wireless Propag. Lett., vol. 8, pp. 14221425, 2009.
[19] B. Biglarbegian, M. Fakharzadeh, D. Busuioc, M.-R. N. Ahmadi, and
S. S. Naeini, Optimized microstrip antenna arrays for emerging millimeter-wave wireless applications, IEEE Trans. Antennas Propag.,
vol. 59, no. 5, pp. 17421747, May 2011.
[20] L. Wang, Y.-X. Guo, and W.-X. Sheng, Wideband high-gain 60-GHz
LTCC L-probe patch antenna array with a soft surface, IEEE Trans.
Antennas Propag., vol. 61, no. 4, pp. 18021809, Apr. 2013.

1237

[21] T. Seki, N. Honma, K. Nishikawa, and K. Tsunekawa, Millimeterwave high-efficiency multilayer parasitic microstrip antenna array on
Teflon substrate, IEEE Trans. Antennas Propag., vol. 53, no. 6, pp.
21012106, Jun. 2005.
[22] H. Sun, Y.-X. Guo, and Z. Wang, 60-GHz circularly polarized U-slot
patch antenna array on LTCC, IEEE Trans. Antennas Propag., vol.
61, no. 1, pp. 430435, Jan. 2013.
[23] D. Liu, J. A. G. Akkermans, H.-C. Chen, and B. Floyd, Packages with
integrated 60-GHz aperture-coupled patch antennas, IEEE Trans. Antennas Propag., vol. 59, no. 10, pp. 36073616, Oct. 2011.
[24] E. Levine, G. Malamud, S. Shtrikman, and D. Treves, A study of microstrip array antennas with the feed network, IEEE Trans. Antennas
Propag., vol. 37, no. 4, pp. 426434, Apr. 1989.
[25] D. Liu, B. Gaucher, U. Pfeiffer, and J. Grzyb, Advanced Millimeterwave Technologies, Antennas, Packaging and Circuits. New York,
NY, USA: Wiley, 2009.
[26] S. Ahmed and W. Menzel, A novel planar four-quad antenna, in
Proc. Antennas and Propagation (EUCAP), Prague, Mar. 2012, pp.
19461949.
[27] S. Demir, Efficiency calculation of feed structures and optimum
number of antenna elements in a subarray for highest G/T, IEEE
Trans. Antennas Propag., vol. 52, no. 4, pp. 12041029, Apr. 2004.
[28] C. A. Balanis, Antenna Theory: Analysis and Design. New York, NY,
USA: Wiley, 2005.
Mingjian Li (S10) received the B.Sc. (Eng.) degrees
in electronic and communication engineering from
City University of Hong Kong in 2010, where he is
currently working toward the Ph.D. degree.
His recent research interests include wideband
antennas, millimeter-wave antennas and arrays, base
station antennas, circularly-polarized antennas and
small antennas.
Mr. Li received the Honorable Mention at the student contest of 2011 IEEE APS-URSI Conference
and Exhibition held in Spokane, US. He was awarded
the Best Student Paper Award (Second Prize) in the 2012 IEEE International
Workshop on Electromagnetics (IEEE iWEM2012) held in Chengdu, China.
Kwai-Man Luk (M79SM94F03) was born
and educated in Hong Kong. He received the B.Sc.
(Eng.) and Ph.D. degrees in electrical engineering
from The University of Hong Kong, in 1981 and
1985, respectively.
He joined the Department of Electronic Engineering, City University of Hong Kong, in 1985
as a Lecturer. Two years later, he moved to the
Department of Electronic Engineering, Chinese
University of Hong Kong, where he spent four years.
He returned to the City University of Hong Kong in
1992, and is currently Chair Professor of Electronic Engineering and Director
of State Key Laboratory in Millimeter waves (Hong Kong). He is the author
of three books, nine research book chapters, over 290 journal papers and 220
conference papers. He has received five US and more than 10 PRC patents. His
recent research interests include design of patch, planar and dielectric resonator
antennas, and microwave measurements.
Prof. Luk is a Fellow of the Chinese Institute of Electronics, PRC, a
Fellow of the Institution of Engineering and Technology, UK, a Fellow of the
Institute of Electrical and Electronics Engineers, USA and a Fellow of the
Electromagnetics Academy, USA. He is Deputy Editor-in-Chief of PIERS
journals. He was a Chief Guest Editor for a special issue on Antennas in
Wireless Communications published in the PROCEEDINGS OF THE IEEE in
July 2012. He was Technical Program Chairperson of the 1997 Progress in
Electromagnetics Research Symposium (PIERS), General Vice-Chairperson
of the 1997 and 2008 Asia-Pacific Microwave Conference (APMC), General
Chairman of the 2006 IEEE Region Ten Conference (TENCON), Technical
Program Co-chairperson of 2008 International Symposium on Antennas and
Propagation (ISAP), and General Co-chairperson of 2011 IEEE International
Workshop on Antenna Technology (IWAT). He received the Japan Microwave
Prize at the 1994 Asia Pacific Microwave Conference held in Chiba in December 1994, and the Best Paper Award at the 2008 International Symposium
on Antennas and Propagation held in Taipei in October 2008. He was awarded
the very competitive 2000 Croucher Foundation Senior Research Fellow in
Hong Kong and the 2011 State Technological Invention Award (2nd Honor)
of China.

220?M

         


       
 
/9yX.xQU9?9k`9x/9}@`ts "k9`>@s>3<s

 !(#!( $ !( (

1QL`9X3M@gk9`?g[[x`Q>9tQg`s @h9kt[@`t
4- 9k>@Xg`93@>M
@[9QX H9`>@s>
t9kk@sxh>
@?x

/@}@ks@!`LS`@@kS`L#kgxh
%)*)1%
@[9QX s9X}9?gk
MQ?9XLg>sQ>
@s

+-)- 5 0*,5 & &* &5 ,5 .5 (*',,5 .'5 &#345 &5
&.*.5  */ .5 .'5 '. &5

&'*$. '&5 '/.5

.,5 , &5

$.* #,5 #' 5  */ .*35 /&. '&# .35 (*'*$&5 &5 '.*5
*#0&.5 ./*,5 5

&*, &#35 '$(#2 .35 '5 $ *' (,5

/, &5 5 *.*5 &/$*5 '5 #3*,5 &5 #' 5 .,5 $",5 . ,5


(*',,5 /&'*#5 1&5 /, &5 .* . '&#5 $.',5 ..5 *#35 '&5
/$&5

&,(. '&5

&5

&#3, ,5

*'*5

  .#5

$5

(*',, &5 ,5 (*,&.5 ,5 5 */ ./#5 #5 '*5 /.'$. '&5 &5 . ,5
((*5 5 ,3,.$5 '*5 .5  */ .*35 2.*. '&5 &#3, ,5 &5
(*,&.. '&5 ,5 ,* 5 .5 ,5  0 5 &5 .*5 #'",5 2 $5
 # &5 ' 5
.,5 '# 4. '&5 2 '& . '&5 &5 *' (5
0 .'*5 5 ((*5 (*,&.,5 &5 '0*0 15 '5 .5 '$(#.5
,3,.$5 &5 ,5 $ &#35 ,5 '&5 .5 ,* (. '&5 '5 .5 $5
(*',, &5 #'* .$,5 ..5 *5 ((# 5 .'5 .5  *&.5 #'",5
,/5 ,5

$5 ,. . &5 /,.'$ 45 #5 &0* &.5 ./*5

*&,'*$5  5 &5 #' 5 .5 #'# 4. '&5 2 *'& . '&5

0-.2 )%+ $5 ,. . &5 $',  &5 '!.5 *'& . '&5
5 *0*,5 & &* &5

%*'&+*&%2

%` tM@ k@>@`t @9ks /@}@ks@ !`LR`@@kS`L 0! M9s L9S`@?


hghxX9kQt Rc 9>tQ}QtQ@s sx>M 9s h@kBgk[9`>@ 9`? s@>xkQt
=@`>N[9kVR`L >g`tkgXjx9XQt >@ktQE>9tQg`s9`?sxhhgkvh9t@`t
XQ>@`sS`L7 8
3M@[9RcQ?@9gB 0!Qstg9a9X@9AZ9Xhkg?x>t
tg g=t9R` S`Bgk[9tQg` >g`>@pS`L Qts ?@sQL` Qts >g`stQtx@`t
[9t@kQ9Xs 9`? >g[hg`@`ts J`>tQg`9XQt h@kBgk[9`>@ 9`?
gtM@k k@X@}9`t B@9txk@s
>>gk?R`L tg tM@ st9t@gBtM@9kv R`
%`t@Lk9t@?Rk>xQt/@}@ks@!`LS`@@kS`L7 &8 tMQs hk9>tQ>@>9` =@
Lkgxh@? R` s@}@k9X sh@>Q9XQ9tQg` 9k@9s hkg?x>t t@9k?g~`s
sst@[ X@}@X 9`9XsQs hkg>@ss 9`9XsQs 9`? >Rk>xQt @k9>tQg`

+`XtM@X9tt@kQstk@9t@?R`tMQsh9h@k

3M@[9S`Q?@9gB 0!Qstg9`9X@9AZ9Xhkg?x>ttgg=t9S`
RcBgk[9tQg` >g`>@pS`L Qts ?@sQL` tM@ [9t@kQ9Xs 9`?
>g[hg`@`ts tM9t >g`stQtxt@s Qt J`>tQg`9XQt h@kBgk[9a>@ 9`?
gtM@k k@X@}9`t B@9txk@s
>>gk?R`L tg tM@ st9t@gBtM@9kv R`
%`t@Lk9t@?Rk>xQt/@}@ks@!`LR`@@kS`L7 8 tMQshk9>tQ>@>9`=@
Lkgxh@? R` s@}@k9X sh@>Q9XQ9tQg` 9k@9s hkg?x>t t@9k?g~`s
sst@[ X@}@X 9`9XsQs hkg>@ss 9`9XsQs 9`? >Rk>xQt @vk9>tQg`

+`XtM@X9tt@kQstk@9t@?R`tMQsh9h@k

Rk>xQt@tk9>tQg`QstM@%k@}@ks@@`LS`@@kS`Lh9kvBg>xs@?
g` tM@ [Q>kg>MQh s>M@[9tQ> 9a? `@tXQst k@tkQ@}9X
!ss@`tQ9XX
tM@9`9XsQs hkg>@ssS`}gX}@s9 ?@>9hsxX9tQg` gBtM@ [Q>kg>MQh
tg 9hhX 9?@X9@kS`L[@tMg? tM9t >g`sQsts S` stkQhhS`L gBB tM@
[Q>kg>MQhk@L9k?R`LQts[@t9XX9@ks
"gk@9>MX9@k9`S[9LS`L

  (%


( 2

hkg>@ss Qs h@kBgk[@? tM9t k@jxRk@s X9kL@ [9L`QE>9tQg`s tg


?QstS`LxQsM 9a? Q?@`tQK tM@ >g[hg`@`ts @
L
 XgLQ> L9t@s
[@[gk >@XXs @t>
3M@`@t st@h >gkk@shg`?s tgtM@;cgt9tQg`
gB tM@ >g[hg`@`ts 9a? tM@Rk R`t@k>gbc@>tQg`s
"S`9XX tM@ `@t
XQst 9`? s>M@[9tQ> >9` =@ g=t9S`@?
3M@k@ 9k@ gtM@k R]hXQ>Qt
gh@k9tQg`s sx>M 9s }@kQE>9tQg`9`? >Rk>xQt sS^zX9tQg`tM9t 9k@
`@>@ss9ktg}9XQ?9t@tM@hkg>@ss sS`>@S[9LS`L9`?9d`gt9tQg`
hkg>@ss@s9k@hkg`@tg@rgks

QLQt9X R]9L@hkg>@ssS`L Qs hk@s@`t@?9s9 HxQtJX E@X? Bgk


9xtg[9tQg` t9sVs k@X9t@? ~QtM tM@ S[9LS`L 9`? 9``gt9tQg`
st9L@s
3M@gxthxt gB tM@ S[9LS`L hkg>@ss >g`t9R`s tMgxs9`?s
gBsS`LX@S[9L@stM9tM9}@tg=@stQt>M@?S`tg9sR`LX@S[9L@h@k
X9@k
>>xk9t@? stQt>MS`L k@jxRk@s 9xtg[9tQ> 9`? k@XQ9=X@
AZ?S`L gB s9XQ@`> hgS`ts S` tM@ S`?Q}Q?x9X S[9L@s
(9t@k
g=U@>tk@>gL`QtQg`Qsxs@?tgXg>9t@9`?Q?@`tQwtM@XgLQ>L9t@s

3M@ >g`tkQ=xtQg` gB tMQs h9h@k Qs tM@ hk@s@`t9tQg` gB 9


sst@[ ~Mgs@hxkhgs@Qs tM@9xtg[9tQg`gBsg[@st9L@s gBtM@
>Rk>xQt @tk9>tQg` hkg>@ss S` h9ktQ>xX9k S[9LS`L 9`?
9bcgt9tQg`
$@`>@ tM@sst@[ st9kvs Hg[tM@hQ>t{n@s g=t9R`@?
Hg[ tM@ R]9LRcL hkg>@ss 9`? gxthxts 9 Bgk[9tt@? `@tXQst R`
5@kQXgL tg =@ [9`9L@? = >g[[@k>Q9X sgI~9k@ Rc gk?@k tg
L@`@k9t@ tM@ s>M@[9tQ>
3MQs sst@[ M9s =@@` ?Q}Q?@? S` tOl@@
=Xg>Vs  %[9L@ 3QXS`L (gLQ> #9t@s (g>9XS9tQg` 2
/@>gL`QtQg` 9`? )Q>kg>MQh *9}QL9tgk
3M@ Fkst =Xg>V ?@9Xs
~QtM tM@ [gs9Q>RcL k@>g`stkx>tQg` g=t9R`@? Hg[ tM@ R]9LRcL
st9L@
 >xstg[S@? }@ksQg` gB tM@ 1>9X@ %`}9kQ9`t "@9txk@
3k9`sBgk[ 1%"3 7 8 Qs xs@? Bgk AZ?RcL >gr@shg`?@`>@s
=@t~@@`R]9L@s ~M@k@9stM@k@>g`stkx>tQg`Qs h@kBgk[@? = 9
k@LQg`Lkg~RcL 9XLgkQtO]
3M@ `@t =Xg>V B9>@s >g[hg`@`t
Xg>9XQ9tQg` 2 Q?@`tQE>9tQg` xsRcL [gkhMgXgL Bgk R]9L@
@bP9`>@[@`t 9a?-@9ksg`>gkk@X9tQg`9ssS[QX9kQt[@9sxk@Bgk
tM@ k@>gL`QtQg`
3M@ X9st =Xg>V Qs 9??k@ss@? tg ;cgt9t@ tM@
S`t@k>gbc@>tQg`s =@t~@@` tM@ >g[hg`@`ts 9`? tg 9XQL` S[9L@s
=@t~@@`>g`s@>xtQ}@X9@ks

3M@ gkL9`S9tQg` gB tM@ h9h@k Qs ?Q}Q?@? Rc tM@ BgXXg~RcL


s@>tQg`s
3M@?@s>kQhtQg`gBtM@R]hX@[@`t@?sst@[9s~@XX9s
tM@@hX9`9tQg`gBtM@xs@?9XLgkQtM[9k@?@AZ@?S`1@>tQg`%%

3M@k@sxXts9m@hk@s@`t@?S`1@>tQg`%%% 9`?AZ9XX1@>tQg`%5
>g`t9R`stM@>g`>XxsQg`s9`?Jtxk@~gkV

(M $>:BD<F;@M%9E8D<CF<BAM

%%
1613!) !10'-3%+*

3M@st9L@sgB  %[9L@3QXRcLsst@[9k@XQst@?=@Xg~

g[hxt@tM@tk9`sX9tQg`}@>tgks=@t~@@`S[9L@sxsS`L
9` 9XLgkQtM[ Bgk 9xtg[9tQ> B@9txk@ ?@t@>tQg` 9`?
[9t>MR`L =@t~@@` `@QLM=gkR`L S[9L@s
3M@ 9XLgkQtO]
Qs9>xstg[S@?}@ksQg`gB1%"37 8


Xxst@k tM@tk9`sX9tQg` }@>tgks Bgk @9>M h9Rk gB S[9L@s
xsR`L 9 MQ@k9k>MQ>9X tk@@ =9s@? g` tM@ !x>XQ?@9`
?Qst9`>@


1>gk@tM@[9t>MS`L k@XQ9=QXQt =9s@? g`tM@h@k>@`t9L@
gBtk9`sX9tQg`}@>tgksHg[tM@X9kL@st>Xxst@k


hhX 9 k@LQg` Lkg~R`L 9XLgkQtM[ tg k@>g`stkx>t tM@
[gs9Q> ~M@k@tM@[gk@k@XQ9=X@ [9t>MR`Ls9k@[@kL@?
Fkst

3M@ BgXXg~S`L sx=s@>tQg`s 9k@ Bg>xs@? g` tM@ hkg=X@[


st9t@[@`t9`?tM@t@>OcgXgLQ@sxs@?RctM@?QBB@k@`t=Xg>VstM9t
>g[hgs@tM@>g[hX@t@sst@[

$M %M'@7:9M3<><A:M
3MQs =Xg>V ?@9Xs ~QtM tM@ stQt>MR`L gB tM@ S[9L@s g=t9R`@?
Hg[ tM@ R]9LRcL hkg>@ss
%ts hxqgs@ Qs tg k@>g`stk|>t 9
[gs9Q>Bgk@9>MX9@kgB tM@[Q>kg>MQh

3M@ [9S` hkg=X@[ 9kQs@s Hg[ tM@ ?@X9@kS`L 9a? S[9LRcL


hkg>@?xk@s
%` gxk >9s@ 9[@>M9`Q>9X hgXQsMS`L Bgk ?@X9@kRcL
9`? ghtQ>9X Q[9LRcL xsS`L [9`x9X sMggtR`L M9s =@@` xs@?

g_[g` S`>g`}@`Q@`>@s gB sx>M [@tMg?gXgL 9k@ Rkk@LxX9k


hgXQsMS`L sg[@ k@LQg`s 9k@ [gk@ hgXQsM@? tM9a gtM@ks 9`?
`g`x`QBgk[ g}@kX9hhS`L 9[g`L S[9L@s S`>Xx?RcL L9hs ?x@
>9[@k9 hgsQtQg`S`L @kkgks
+tM@k B9>tgks [9 =@ ghtQ>9X
?QstgkvQg`tM9t[9L`QE@stM@gxt@kk@LQg`sgB tM@R]9L@

#9t@9kk9h9kts [9 9Xsg Tukg?x>@s@kQgxs ?QBE>xXtQ@s Bgk


k@>g[hgsS`L sS`>@ S[9L@s hk@s@`t MQLM s[[@tk ~MQ>M 9k@
[9S`X Bgk[@? = `g?@s >g`t9>t hgTus 9`? Mg[gL@`@gxs
k@LQg`ss@@"QL
9`?"QL


9

1%"3 Qs9 [@tMg? tM9t AZ?s B@9txk@sS`tM@g=U@>ts gB9s@t


gB S[9L@s 9`? s@9k>M@s Bgk >gr@shg`?@`>@s tg Q?@`tQK tM@s@
g=U@>ts tg~:m?s >M9`L@s gB s>9X@ kgt9tQg` tk9`sX9tQg` @t>
2
9?9ht9tQg` gB tM@ 9XLgkQtM[ [9 =@ xs@? tg F`? B@9txk@s R`
`@QLM=gkR`L R]9L@s 9`? AZ? tM@ tk9`sX9tQg` }@>tgks Bgk
stQt>MS`L tM@ [gs9Q> 7 8
%ts kg=xst`@ss Rc s>9X@ 9`? kgt9tQg`
[9V@s Qt 9`9ttk9>tQ}@ sgXxtQg` tg sgX}@ tM@ ghtQ>9X ?QstgkvQg`s

@sQ?@s Qts B@9txk@ ?@s>kQhtgk hk@s@`ts 9 k@XQ9=X@ h@kBgk[9`>@


9L9S`st }9kQ9tQg`s S` QXXx[S`9tQg` 9`? >Olg[9>Qt
$g~@}@k
tM@ >M9k9>t@kQstQ>s gB g{n s>@`9kQg s@@ "QL
 ?@>k@9s@ tM@
h@kBgef9`>@ gB tM@ L@`@kQ> 1%"3 }@ksQg` M@`>@ tM@
[@tMg?gXgLM9stg=@9?9ht@?tgtM@s@hkg=X@[s

+xk 1%"3 }@ksQg` 9?ghts tM@ Q?@9 hkghgs@? R` 7 8 ~M@`


hk@s@`tR`LtM@4hkQLMt1h@@?4h/g=xst"@9txk@9XLgkQtN[4
140"
140" Qs >g`sQ?@k@? 9a @}gXxtQg` gB 1%"3 ~M@k@ Qts
>g[hxt9tQg`9X>gstQssQL`QE>9`tX?@>k@9s@? @tQtskg=xst`@ss
Qs9Y[gst sS[QX9k gk sXQLMtX S`B@kQgk R` sg[@ t@sts
4140" Qs
tM@`tM@140"}@ksQg`~M@`kgt9tQg`Qs `gt >g`sQ?@k@?
%`gxk
>9s@ tM@ kgt9tQg` >9` =@ `@LX@>t@? tM@k@= tM@ @BE>Q@`> >9`
=@ R]hkg}@? =?Qs9=XS`L1%"3kgt9tQg`S`}9mQ9`t [@tMg?
3MQs
9?9ht9tQg` >9` =@ x`?@kstgg? 9s tM@ 4hkQLMt1%"3 }@ksQg`

`gtM@k [9S` ?QBB@k@`>@ =@t~@@` g{n }@ksQg` 9a? tM@ L@`@kQ>


1%"3 Qs tM@ B@9txk@ ?@s>kQhtgk ~M@k@g{n ?@s>kQhtgk Qs Bg>xs@?
g` tM@ ?QstR`>tQ}@`@ss L@ttR`L 9?}9`t9L@ tM9t `g kgt9tQg` Qs
9hhk@>Q9=X@
3M@ ?@s>kQhtgk Qs g=t9Rc@? tOlgxLM gkQ@`t9tQg`
MQstgLk9[ ~QtM 9 ?gx=X@ hk@>QsQg` ~M@k@  =S`s 9k@ xs@?
S`st@9?gB 2 +`tM@gtM@kM9`? tM@L@`@kQ> }@ksQg` hkghgs@s9
sh9tQ9XSbu@khgX9tQg`tgsx=hR@X9>>xk9>tgR`>k@9s@?@s>kQhtgk
kg=xst`@ss 9L9Rcst kgt9tQg` =xt k@?x>S`L ?QstR`>tQ}@`@ss
>g`s@jx@`tX tMQs Sbu@khgX9tQg` Qs ?Qs[Qss@? sS`>@ kgt9tQg` Qs
`@LXQLQ=X@

%tQsR]hgkv9`ttg`gt@tM9ttM@k@h@9t9=QXQtRcsx>Ms>@`9kQg
[xst =@ Rc>k@9s@? ?x@ tg tM@ MQLM sQ[QX9kQt9`? Mg[gL@`@Qt
gB tM@S[9L@s

=

"QLxk@ /
2 %[9L@s 9 9`? = k@hk@s@`t 9 [9t>MR`L @9[hX@
tM9t >g`t9S`s Mg[gL@`@Qt 9k@9s
!2 Qs S[hgkv9`t tg `gt@ tM@
`g`x`QBgk[ hgXQsMS`L k@hk@s@`t@? = tM@ ~MQt@ shgt Hg[ tM@
X@Ik@LQg`gBtM@R]9L@9

 2 -B:<8M&7F9EM-B87><H7F<BAM 2 198B:A<F<BAM

"QLxk@
%[9L@s>9`?? sMg~s9[9t>MR`L@9[hX@~QtM
9sQL`QE>9`tsS[QX9kQt

3M@ >xr@`t >g[hX@Qt gB tM@ S`t@Lk9t@? >Rk>xQts @`t9QXs


t@`s gB tMgxs9`?s gB XgLQ> L9t@s tg =@ ?@t@>t@? Xg>9t@? 9`?
Q?@`tQE@?
%?@`tQKRcL L9t@ stk|>t{n@s S` sQXQ>g` Qs `gt ?QD>xXt
Bgk 9` @h@kt R` k@}@ks@ @`LRc@@kRcL =xt tM@ ~gkVXg9? Qs sg
MxL@ tM9t 9xtg[9tQg` [9 hX9 9 V@ kgX@ Rc ?k9stQ>9XX
k@?x>R`LtM@9`9XsQstS[@

3MQs st9L@ >g`sQsts R` tM@ ?@t@>tQg` Xg>9XQ9tQg` 2


Q?@`tQE>9tQg`gBtM@XgLQ>L9t@s
3M@[9S`hkg=X@[@`>gx`t@k@?
st@[s Hg[ tM@ S[9L@s g=t9Rc@? Hg[ tM@ hgXQsMS`L st@h
%ts
`g`x`QBgk[Qt =@M9}Qgk 9Xt@ks Xg>9XX tM@ R]9L@ Sbu@`sQtQ@s
tMxsQt>g[hXQ>9t@stM@k@>gL`QtQg`hkg>@sss@@"QL
&

 $>:BD<F;@M%9E8D=F<BAM
3M@ stk9t@L 9hhXQ@? tg sgX}@ tMQs hkg=X@[ Qs ?Q}Q?@? Q`
Xg>9XQ9tQg`2 k@>gL`QtQg`

(g>9XS9tQg` Qs t9kL@t@? tg ?@t@>t 9`? Xg>9t@ XgLQ> L9t@s


~QtMS`tM@>@XXX9@k
3M@9XLgkQtO]Qs=9s@?g`tM@@`@kLX@}@X
gBtM@R]9L@Lk9?Q@`t ~M@k@9tOl@sMgX?Qs@[hRkQ>9XX ?@Fc@?
tg ?Qs>@p k@LQg`s tM9t XRV@X >g`t9R` XgLQ> L9t@s
s tM@ >@XX
X9@k =9>VLkgx`? Qs jxQt@ Mg[gL@`@gxs Qts @`@kL X@}@X
k@hk@s@`ts Xg~ }9Xx@s tM@k@= XgLQ> L9t@s =@Xg`Ls tg tMgs@
k@LQg`stM9tsxkh9ss@stM@tOl@sMgX?s@@"QL


"QLxk@ 
"kg[ X@I tg kQLMt 9 XgLQ> L9t@ h9tt@p S[9L@
MQLMXQLMtS`L9>g`t9>thgTu9`?Qts[9sV

2 .<8DB8;<CM/7G<:7FBDM

3M@ X9kL@ [9L`QE>9tQg` `@@?@? R` tM@ R]9LS`L hkg>@ss tg


Q?@`tQK tM@ [Q>kg>MQh >g[hg`@`ts @`t9QXs 9 k@[9kV9=X@
`x[=@k gB S[9L@s tg =@ stQt>M@?
x@ tg tM@Rk ?S[@`sQg`s Qt
k@sxXts R` MxL@ R]9L@s ~MQ>M 9k@ ?QBE>xXt tg =@ M9`?X@? =
hMgtg}Q@~@k hkgLk9[s
)gk@g}@k tM@ S`Bgk[9tQg` [@t9
S`Bgk[9tQg` gB tM@ Q?@`tQE@? >g[hg`@`ts 9s ~@XX 9s tM@Rk
S`t@k>gbc@>tQg` [xst =@ 9Xsg k@hk@s@`t@? ~MQ>M sxhhgs@s 9`
S`>k@9s@ S` >g[hxt9tQg`9X >gst
XX tM@s@ Rc>g`}@`Q@`>@s
k@jxRk@ tM@ ?@}@Xgh[@`t gB 9 sh@>QE> Lk9hMQ> xs@k Sbu@kB9>@
9?9ht@?tgtMQshkg=X@[

3M@)Q>kg>MQh*9}QL9tgkQshk@s@`t@?9s9hkgLk9[9=X@tg
[9`9L@ X9kL@ S[9L@s k@hk@s@`t tM@ [@t9R`Bgk[9tQg` gB tM@
[Q>kg>MQh >g[hg`@`ts 9`? Qt Qs 9 >g[hX@t@ Lk9hMQ>9X xs@k
Sbu@kB9>@tg?@Fc@tM@`g?@stgTu@k>gbc@>ttMgs@>g[hg`@`ts

@sQ?@s QtS[hX@[@`ts  S[9L@9XQL`[@`t J`>tQg` tg >gkk@>t


Xg>9X[Qs9XQLd[@`tstM9t[9M9}@=@@`hkg?x>@?S`tM@R]9L@
>9ht{nRcL

 $>:BD<F;@M%9E8D<CF<BAM

"QLxk@
@XXX9@kk@LQg`~QtMXgLQ>L9t@s9`?Qts@`@kL

3M@ )Q>kg>MQh *9}QL9tgk >9` =@ ?Q}Q?@? S` t~g


?QstS`LxQsM9=X@ [g?xX@s tM@ `9}QL9tQg` 9`? tM@ `g?@
?@AZQtQg`

*9}QL9tQg`?@9Xs~QtMtM@[9`9L@[@`tgBX9kL@S[9L@s tM@
k@hk@s@`t9tQg` gB tM@ [@t9R`Bgk[9tQg` gB tM@ hk@}QgxsX
Q?@`tQE@? >g[hg`@`ts  s@@ "QL 9`? tM@ S[9L@ 9XQLd[@`t
9[g`LtM@>xkk@`t9`?tM@9?U9>@`t [Q>kg>MQh X9@kS[9L@s
%`
gk?@k tg sgX}@ tM@ >g[hxt9tQg`9X XQ[Qt9tQg`s tg M9`?X@ X9kL@s
S[9L@s tM@ hkghgs@? sgXxtQg` Qs sS[QX9m tg #ggLX@ )9hs tM@
[gs9Q>s 9k@ ?Q}Q?@? S` tQX@s gB hQ@Xs 9`? g`X tMgs@
tQX@s k@jx@st@? = tM@ xs@k >g`>@pS`L MQs M@k Xg>9tQg` ~QtMRc
tM@ X9@k 9k@ Xg9?@? S` k@9X tR]@
3M@ `x[=@k gB tQX@s tg =@
Xg9?@? ?@h@`?s g` tM@ >g[hxt@k s>k@@` k@sgXxtQg` 9`? sS@

3M@`9}QL9tQg` Qs h@kBgk[@? S` sS`>@tM@X9@ks9k@st9>V@?


tMxs tM@ xs@k >9` `9}QL9t@ ?g~` gk xh tOkgxLM tM@ [Q>kg>MQh
X9@ks
%t Qs R]hgkv9`t tg MQLMXQLMt tM9t tM@ `9}QL9tgk
>g[hxt9tQg`9X>gst?g@s`gt?@h@`?g`tM@tgt9X`x[=@kgB tQX@s
gB tM@ [Q>kg>MQh tM@ k@jxRk@? tQX@s 9m@ Xg9?@? Rc k@9X tS[@
9}gQ?S`LM9k?~9k@>g`stk9Tus

3M@ R]9L@ 9XQL`[@`t =@>g[@s 9` Q[hgkv9`t J`>tQg` tg


?@AZ@ tM@ Rct@k>gbc@>tQg` =@t~@@` >g[hg`@`ts Rc tM@
9bcgt9tQg` st9L@
 [Qs9XQL`[@`t =@t~@@` X9@ks [9 X@9? tg
@kkgks R` tM@ `g?@ tk9>VS`L hkg>@ss
3M@ R]9L@ 9XQL`[@`t
J`>tQg` R]hX@[@`t@? R` tM@ `9}QL9tgk xs@s 9`>Mgk hgSbus
hk@}QgxsX ?@AZ@? = tM@ xs@k tg sMQI X9@k k@LQg`s Xg>9XX
9>>gk?R`LtgtM@k@sxXtR`Ltk9`sX9tQg`}@>tgk9[g`LX9@ks

3M@k@>gL`QtQg`st9L@QsBg>xs@?tgF`?S`st9`>@sgBs@}@k9X
XgLQ> L9t@ h9tt@ps tOlgxLM tM@ hk@}Qgxs ?@AZ@? k@LQg`s
s
tM@k@9m@`g >M9`L@s R` s>9X@ 9`? kgt9tQg`S` tMQs s>@`9kQg tM@
-@9ksg` >gkk@X9tQg` Qs 9 sxQt9=X@ [@9sxk@ tg >g[hxt@ tM@
sS[QX9kQt =@t~@@` 9 h9tt@k` 9`? tM@Rk S`st9`>@s
$g~@}@k tM@
>g[hX@Qt k@sxXts Hg[ tM@ Rkk@LxX9k hgXQsMRcL tM9t 9Xt@ks tM@
R]9L@ Rct@`sQtQ@s 9`? >g`s@jx@`tX tM@ Fkst 9`? s@>g`?
st9tQstQ>9X [g[@`ts >g[hxt@? = tM@ -@9ksg` >gr@X9tQg`
3g
k@?x>@ tMQs S`>g`}@`Q@`>@ 9 =S`9k [9sV Qs 9hhXQ@? tg @9>M
>gr@X9tQg` ~QtM tM@ 9S[ tg MQLMXQLMt tM@ [gk@ k@XQ9=X@ XgLQ>
L9t@ h9kvs
3Mgs@ h9kvs xsx9XX =@Xg`L tg tM@ sQXQ>g` k@LQg` gB
tM@ XgLQ> L9t@s 9s >9` =@ g=s@k}@? R` "QL k@hk@s@`t@? = tM@
~MQt@L9hs

-@9ksg`>gr@X9tQg`Qs9k@XQ9=X@ sS[QX9kQt [@9sxk@=xtQt Qs


>g[hxt9tQg`9XX @h@`sQ}@
$g~@}@k Qt >9` =@ sQL`QBQ>9`tX
k@?x>@?QBtM@>gkk@X9tQg`Qs g`X >g[hxt@?S`9 B@~ hgTusBgk
@9>M XgLQ> L9t@
3Mgs@ hgR`ts [xst =@ kg=xst 9L9Rcst hgXQsMRcL
R]h@kB@>tQg`s 9s Bgk S`st9`>@ tM@ >g`t9>t hgR`ts gB tM@ XgLQ>
L9t@ss@@"QL
3M@k@Bgk@[gkhMgXgLQs9hhXQ@?tg?@t@>ttM@
>g`t9>t hgR`ts
%t Qs R]hgkv9`t tg MQLMXQLMt tM9t tM@ XgLQ> L9t@
h9tt@p 9`? Qts S`st9`>@ [xst =@ 9XQL`@? 9>>gk?Q`L tg tM@
?@t@>t@? >g`t9>t hgR`t =@Bgk@ 9hhXS`L tM@ >gkk@X9tQg`

3M@k@Bgk@ sg[@ >g`t9>t hgSbus [xst =@ hk@}QgxsX Q?@`tQE@?


BgktM@XgLQ>L9t@sh9tt@ps

3M@`g?@?@AZQtQg`[g?xX@xs@sMQ@k9k>MQ>9Xtk@@stg?@AZ@
tM@Tu@k>gbc@>tQg`9[g`L [Q>kg>MQh >g[hg`@`ts
3M@s@tk@@s
9k@ Bgk[@? = Bgxk ?QBB@k@`t th@s gB `g?@ hgR`ts =@LSd`S`L
>g\@k ?Q}@kL@`t 9`? t@k[Rc9X
@LSd`S`L hgSbus xs@ tg =@ 9`
gxthxt >g`t9>t hgR`t gB 9 XgLQ> L9t@ >gp@ks 9k@ xs@? tgUgRc
t~g tk9>Vs ?Q}@kL@`t 9k@ XRV@ >g\@ks =xt Qt k@X9t@s [gk@tM9`
t~g tk9>Vs 9`? t@k[S`9X Qs ~M@k@ tM@ `g?@ FcQsM@? ~MQ>M
xsx9XX=@Xg`LstgtM@S`hxt>g`t9>thgRctgB9`gtM@kXgLQ>L9t@

3MQs[g?xX@Qs>xkk@`tX 9ssQst@? =tM@xs@k~MgM9stg?@F`@


tM@`g?@hgR`tsgB tM@tk@@

+`>@9XXtM@S`t@k>gbc@>tQg`sM9}@=@@`?@AZ@? tM@`@tXQst
>9` =@9xtg[9tQ>9XX L@`@k9t@?S`9hkgB@ssQg`9X Bgk[9t 9sBgk
Rcst9`>@5@kQXgL

k@>gL`QtQg` h@kBgk[9`>@ ~M@k@9s hk@>QsQg` g`X h@`9XQ@s tM@


>g[hxt9tQg`9X>gst

3M@ XgLQ> L9t@ Xg>9XQ9tQg` 2 k@>gL`QtQg` h@kBgo9a>@ Qs


[gk@ ?QD>xXt tg @}9Xx9t@ sS`>@ 9 M@Xh [g?xX@ ~9s
S[hX@[@`t@? tg 9ssQst tM@ xs@k Bgk tM@ >gkk@>t ?@t@>tQg`
}9XQ?9tQg`s ?x@ tg tM@ MQLM >g[hX@Qt gB k@>gL`QtQg` st9L@

3M@k@Bgk@ tM@ k@>9XX Qs gxt gB Sbu@k@st 9`? tM@ 9tt@`tQg` Qs


Bg>xs@? tg tM@ hk@>QsQg` g=t9R`S`L 9 }9Xx@ gB 
 Bgk tM@
@}9Xx9t@?[Q>kg>MQh

3M@>g[hxt9tQg`9X>gst?@h@`?sgBtM@`x[=@k gB?QBB@k@`t
XgLQ>L9t@sh9tt@psgBtM@@}9Xx9t@?[Q>kg>MQh9`?tM@`x[=@k
gBS`st9`>@stg=@Xg>9t@?
%`tM@@}9Xx9t@?[Q>kg>MQhtM@k@9k@
9kgx`?  ?QBB@k@`t XgLQ>L9t@h9tt@k`s9`? [gk@gkX@ss 
XgLQ> S`st9a>@s
2 9}@k9L@tS[@ gB  [S`xt@s Bgk @9>M s@t gB
 XgLQ> L9t@s Qs @stS[9t@? Bgk tM@ k@>gL`QtQg` hkg>@ss Rc 9
X9htgh
3MQs tS[@ S`>Xx?@s tM@ hkg>@ssS`L tR]@ 9s ~@XX 9s tM@
xs@kTu@k9>tQg`

%%%
0!14(31
3M@ @`tRk@ sst@[ M9s =@@` t@st@? sx>>@ssJXX g}@k 9
[Q>kg>MQh Bgk[@? = s@}@k9X X9@ks ~QtM 9 tgt9X `x[=@k gB
hQ@Xs h@k [gs9Q> sxh@kQgk tg    ~M@k@9s tM@
`x[=@k gB XgLQ> L9t@s k@9>M@s tM@ gk?@k gB t@`s gB tMgxs9a?s

3M@ k@sxXts 9k@ hk@s@`t@? 9>>gk?R`L tg tM@ ?QBB@k@`t =Xg>Vs


~M@k@ tM@ >g[hxt9tQg`9X >gst Qs tk@9t@? ~QtM sh@>Q9X9tt@`tQg`
sS`>@QtQs9`S`t@k@stR`Lh9k9[@t@ktgjx9`tQwtM@h@kBgk[9`>@
gB tM@ 9xtg[9tQg`
*g >g[hxt9tQg`9X >gst Qs hk@s@`t@? Bgk tM@
`9}QL9tgk sRc>@QtQs9ssQst@?=tM@xs@k

%5
+*(41%+*1
%` tMQs h9h@k 9 sst@[ tg 9xtg[9t@ tM@ [gs9Q>
k@>g`stkx>tQg` tM@ XgLQ> L9t@s Xg>9XQ9tQg` 2 k@>gL`QtQg` 9`?
tM@ ?@AZQtQg` gB tM@ [Q>kg>MQh >g[hg`@`ts R`t@k>gbc@>tQg` Qs
hk@s@`t@?
3M@ sst@[ Qs ?Q}Q?@? S` tOl@@ =Xg>Vs  %[9L@
3QXS`L (gLQ>#9t@s(g>9XS9tQg` 2 /@>gL`QtQg`9`?)Q>kg>MQh
*9}QL9tgk
"Rkst =Xg>V xs@s gxk 1%"3 }@ksQg` tg stQt>M [gs9Q>
S[9L@s
3M@ Q?@`tQE>9tQg` gB tM@ XgLQ> L9t@s Qs 9>MQ@}@? =
[@9`sgBtM@-@9ksg`>gr@X9tQg`9ssS[QX9kQt[@9sxk@J`>tQg`
S` tM@ s@>g`? =Xg>V
3g k@?x>@ tM@ >g[hxt9tQg`9X >g[hX@Qt
tM@>gr@X9tQg` Qs g`X 9hhXQ@? tg9 }@k s[9XX s@t gB>9`?Q?9t@
hgSbus
"S`9XX tM@)Q>kg>MQh*9}QL9tgkQs9kQ>Mxs@kSbu@kB9>@
9=X@ tg k@hk@s@`t 9`? 9XQL` [Q>kg>MQh [gs9Q>s 9s ~@XX 9s tg
?@F`@Qts>g[hg`@`tsTu@k>gbc@>tQg`

"xtxk@k@s@9k>MXS`@s9k@Bg>xs@?g`tM@9xtg[9tQ> tk9>VR`L
gB tM@ `g?@s tM9t >gbc@>t tM@ [Q>kg>MQh >g[hg`@`ts

)gk@g}@k tM@ R]9LRcL hkg>@ss Qs 9Xsg =@S`L R]hkg}@? xsS`L


1!) t@>OcgXgL 9Q?@? = 9` @X@>tkg`Q>9XX LxQ?@? >9[@k9
hgsQtQg`S`L

$M %M'@7:9M3<><A:M
 Fkst S^iX@[@`t9tQg` ~9s t@st@? xsS`L -@9ksg`
>gr@X9tQg` =xtQts>g[hxt9tQg`9X>gst9`?@BE>Q@`>X@9?xstg
?@}@Xgh tM@ >xstg[S@? 1%"3 R]hX@[@`t9tQg`
/@L9k?R`L tM@
>g[hxt9tQg`9X >gst tMQs Q[hX@[@`t9tQg` gxth@kBgk[s tM@
-@9ksg` >gkk@X9tQg` }@ksQg` = 9 B9>tgk gB   ~M@k@9s tM@
?@s>kQhtgk kg=xst`@ss h@`9XQ@s tM@ >g[hxt9tQg`9X >gst = 9
B9>tgk
 >g`>@pR`L tM@ L@`@kQ> 1%"3 }@ksQg`
"gk R`st9`>@
tM@ 9}@k9L@ tR]@
k@jxRk@? Bgk tM@ >xstg[Q@? 1%"3
R]hX@[@`t9tQg` Bgk S[9L@s gB 2,, hR@Xs Qs 9kgx`? 
s@>g`?s ~M@k@9s tM@ L@`@kQ> }@ksQg` Qs [gk@ gk X@ss ,
2
s@>g`?s

3M@[9t>MS`Lh@kBgk[9`>@Qs>9X>xX9t@?Bgk9sx=k@LQg`gB
g`@ gB tM@ [@t9X X9@ks Bgk[@? =  S[9L@s
3M@ >gkk@>t
[9t>MR`L h@k>@`t9L@ R` tM@ MgkSg`t9X ?Rk@>tQg` Qs 9=gxt

 ~M@k@9sR`tM@}@ktQ>9X?Rk@>tQg`Qs

$g~@}@k
tM@ =@st >g[=S`9tQg` gB =gtM ?Rk@>tQg`s ?@F`@? = tM@ k@LQg`
Lkg~R`L9XLgkQtN[g=t9R`s9


"%&,#$%*2

3M@9xtMgksgB tMQsh9h@ks~9`ttgtM9bWtM@>g`tkQ=xtQg`gB
Lk9?x9t@stx?@`tX=@kt)9kvG

('%)2

 2 -B:<8M&7F9EM-B87><H7F<BAM 2 198B:A<F<BAM

' (? $688)5+-? "? )4-9? ? $0-? 9:):-6.:0-)8:? 15? 9-41+65,;+:68?


8-<-89-? -5/15--815/? -91/5? ;:64):165? 65.-8-5+-? ?

3M@ k@sxXts gB tMQs [g?xX@>g`t9R`s tM@ h@kBgk[9`>@ gBtM@


>g`t9>t hgTus ?@t@>tQg` 9XLgkQtM[ 9`? tM@ XgLQ> L9t@s
Xg>9XQ9tQg` 2 k@>gL`QtQg`

 k@>9XX gB 
 :a?9 hk@>QsQg` 
 Qs g=t9R`@? Bgk
tM@ >g`t9>t hgTus ?@t@>tQg` 9XLgkQtO] xsS`L 9 Lkgx`? tkxtM gB
 >g`t9>t hgR`ts
%` tMQs >9s@ tM@ k@>9XX Qs hk@B@k9=X@ k9tM@k
tM9a hk@>QsQg`
 Xg~ k@>9XX 9C@>ts tM@ XgLQ> L9t@ Xg>9tQg` 2


? :0? 

 ? ? <62? 56? 77  ? ? ;5-?


?
' (? 6=-? ? 19:15+:1<-? 4)/-? -):;8-9? .864? #+)2-5<)81)5:?
->7615:9? 5:-85):165)2? 6;85)2? 6.? 647;:-8? &19165? <62? 
?
56  77  


?
' (? )>? 2 $;>:-2))89? $? )5,? &)5? 662? ? #;8.? #7--,-,? ;7?
86*;9:? .-):;8-9? ;867-)5? 65.-8-5+-? 65? 647;:-8? &19165?

?
'(?

!

? #? "-<-89-? 5/15--815/? 86;7? 14*+54+91+-93?

1686

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

High-Power High-Efficiency Class-E-Like


Stacked mmWave PAs in SOI and Bulk
CMOS: Theory and Implementation
Anandaroop Chakrabarti and Harish Krishnaswamy

AbstractSeries stacking of multiple devices is a promising


technique that can help overcome some of the fundamental limitations of CMOS technology in order to improve the output power
and efficiency of CMOS power amplifiers (PAs), particularly at
millimeter-wave (mmWave) frequencies. This paper investigates
the concept of device stacking in the context of the Class-E family
of nonlinear switching PAs at mmWave frequencies. Fundamental
limits on achievable performance of a stacked configuration are
presented along with design guidelines for a practical implementation. In order to demonstrate the utility of stacking, three
prototypes have been implemented: two fully integrated 45-GHz
single-ended Class-E-like PAs with two- and four-stacked devices
in IBMs 45-nm silicon-on-insulator (SOI) CMOS technology,
and a 45-GHz differential Class-E-like PA with two devices
stacked in IBMs 65-nm low-power CMOS process. Measurement
results yield a peak power-added efficiency (PAE) of 34.6% for
the two-stacked 45-nm SOI CMOS PA with a saturated output
power of 17.6 dBm. The measurement results also indicate true
Class-E-like switching PA behavior. A peak PAE of 19.4% is measured for the four-stacked PA with a saturated output power of
20.3 dBm. The two-stacked PA exhibits the highest PAE reported
for CMOS mmWave PAs, and the four-stacked PA achieves the
highest output power from a fully integrated CMOS mmWave PA
including those that employ power combining. The 65-nm CMOS
differential two-stacked PA exhibits a peak PAE of 28.3% with a
saturated differential output power of 18.2 dBm, despite the poor
ON-resistance of the 65-nm low-power nMOS devices. This paper
also describes the modeling of active devices for mmWave CMOS
PAs for good model-hardware correlation.
Index TermsClass-E, CMOS, high efficiency, millimeter wave
(mmWave), power-added efficiency (PAE), power amplifier (PA),
power device modeling, stacking.

I. INTRODUCTION
A. Millimeter-Wave (mmWave) CMOS Power
Generation Challenges

HE advent of scaled CMOS technologies with transistor


GHz has generated significant interest in
using the mmWave bands above 30 GHz in applications such

Manuscript received November 09, 2013; revised February 09, 2014 and
March 13, 2014; accepted May 11, 2014. Date of publication June 12, 2014;
date of current version August 04, 2014.
The authors are with the Department of Electrical Engineering, Columbia
University, New York, NY 10027 USA (e-mail: ac3215@columbia.edu;
harish@ee.columbia.edu).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMTT.2014.2327919

Fig. 1. (a)
of deep-submicrometer CMOS technology nodes from the
literature. (b) Supply voltage of deep-submicrometer CMOS technology nodes.
(c) Survey of the saturated output power achieved by reported RF and mmWave
CMOS PAs. (d) PAE achieved by reported RF and mmWave CMOS PAs.

as wideband commercial wireless communication, satellite


communication, automotive radar, and biomedical imaging.
This, in turn, has driven research focus on the development of
efficient mmWave power amplifiers (PAs). The frequency band
around 45 GHz (upper end of the -band, which extends from
33 to 50 GHz) is well suited for satellite communication owing
to a low atmospheric attenuation of 0.2 dB/Km. The requirement of high output power for such long range communication
necessitates a high-power PA, in addition to energy-efficiency.
Traditionally, IIIV compound semiconductor technologies
were the preferred choice for implementing such amplifiers.
This is because implementing high-power high-efficiency PAs
in CMOS at mmWave frequencies has proven to be a challenging task owing to the limited breakdown voltage of highly
scaled CMOS technologies, low available gain of devices, and
poor quality of on-chip passives. The low breakdown voltage
limits the output swing, and consequently, the output power that
can be delivered to a 50- load. The load may be transformed to
a lower impedance to enable higher output power, but the poor
quality of on-chip passive components limits the efficiency of
the transformation. The low available gain results in large input
power requirements for mmWave PAs, limiting power-added

0018-9480 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

1687

Fig. 2. (a) Stacked CMOS Class-E-like PA concept with voltage swings annotated in volts and (b) loss-aware Class-E design methodology for stacked CMOS
PAs.

efficiency (PAE). A possible solution is to power combine the


outputs of several PAs. However, the efficiency and scalability
of conventional on-chip power-combining schemes such as
transformer-based series combining [1][5], current combining
[6], and Wilkinson combining are limited by the characteristics
of the back-end-of-the-line, steepness of inherent impedance
transformation, and combiner asymmetry.
These design tradeoffs can be clearly appreciated from
a survey of RF and mmWave PAs reported in the literature
(Fig. 1). Fig. 1(a) depicts the scaling of
of CMOS technology nodes based on prior reports [7][10], while Fig. 1(b)
depicts the scaling of supply voltage
. Experimentally,
it would seem that as technology scales,
remains
approximately constant and equal to 250 GHz-V , although
explicitly deriving such a scaling law for constant-field scaling
remains challenging due to the complex and layout-dependent
nature of loss mechanisms within nanoscale CMOS devices.
This observation is, however, consistent with the known
fundamental tradeoff (referred to as the Johnson limit [11])
between speed of operation and the breakdown voltage of
a technology. If one assumes that a PA designer chooses a
technology with an
that is three times the operating
frequency to ensure approximately 10-dB gain, that the PA
is designed to directly drive a 50- load with no impedance
transformation to maintain efficiency, and that the output node
sustains a peak-to-peak swing that is twice the
(i.e., no
harmonic shaping), the output power of such a PA would be
GHz-V
mW GHz
. This
scaling law indicates that achieving watt-class output power
at frequencies around 1 GHz is feasible, but typical output
powers at, say, 60 GHz, would be in the range of 1015 mW
if impedance transformation or power combining are not exploited. Fig. 1(c) largely bears out this trend, with some efforts
achieving higher output powers through either impedance
transformation, power combining, or a combination of the two.
Fig. 1(d) indicates that efficiency also degrades significantly as
frequency increases, although an explicit scaling trend for efficiency is more complicated because of the complex nature of
loss mechanisms in active and passive devices. State-of-the-art

PAEs at mmWave frequencies are generally below 20%, except


for a few outliers.
B. Device Stacking in CMOS PAs
Series stacking of multiple devices [e.g., Fig. 2(a)] is a
potential technique that breaks these tradeoffs associated with
CMOS PA design. Stacking of multiple devices increases the
voltage swing at the load, as the increased voltage stress can
be shared by the various devices in the stack. Thus, for a stack
of devices, the output voltage swing can be times higher
than that of a single device (provided that design techniques
are incorporated to ensure that each individual device sees
,
, and
swings that lie within permissible breakdown
limits). Stacking, however, does not alleviate the drainbulk
and sourcebulk stress of the individual stacked devices. In
particular, the topmost device of the stack sees a drainbulk
swing that is equal to the -times increased output swing of
the stacked PA. Consequently, in bulk CMOS, the junction
breakdown voltage limits the maximum number of devices
that can be stacked to three or four devices. However, in
silicon-on-insulator (SOI) CMOS, the presence of an isolated
floating body for each device eliminates this limitation. The
number of devices that can be stacked in SOI CMOS is only
limited by the breakdown of the buried oxide (BOX) below
each device. This voltage is, however, higher than 10 V in
IBMs 45-nm SOI CMOS process [12], enabling five or more
devices to be stacked (assuming 2-V peak RF voltage swing
per device for long-term reliability [13]).
Prior studies on stacked PAs at RF frequencies have explored
the cases where input power is provided only to the bottommost
device in the stacked configuration [14], [15], as well as to all the
devices in the stack through transformer coupling [16]. These
works have demonstrated the characteristics of stacking in a
variety of technologies such as GaAs MESFET [14], [16] and
SOI [15]. More recently, stacking for mmWave PAs has been investigated in nanoscale SOI CMOS [12], [17][21]. However,
while stacking has been predominantly explored in the context of linear/quasi-linear PAs, nonlinear switching-type stacked
PAs remain an interesting topic of research.

1688

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

Switching PAs are extensively utilized at RF frequencies


owing to their (ideally) lossless operation. The Class-E PA
[22] has been of particular interest because of its relatively
simple output network. The design of switching PAs in CMOS
at mmWave frequencies [12], [17], [19], [20] is challenging
due to the lack of ideal square-wave drives (resulting in soft
switching), impracticality of harmonic shaping of voltages and
currents, low PAE due to the high input drive levels required
to switch the devices, and high loss levels in the device/switch.
Thus, at mmWave frequencies, one can practically implement
a switch-like PA at best. In this paper, we explore stacked
Class-E-like PAs in SOI CMOS and low-power bulk CMOS
technologies.
In order to determine if device stacking overcomes the
speed-breakdown voltage tradeoff of CMOS technology
scaling (quantified earlier using the
product), it is
important to determine if stacked Class-E-like PAs are able
to increase output power without substantial degradation in
efficiency (the PA metric that is significantly impacted by
transistor speed). A loss-aware Class-E design methodology
that accounts for high device loss [23] is applied to stacked
Class-E-like mmWave CMOS PAs to understand the design
tradeoffs and determine the fundamental limits of performance.
These theoretical results are further explained by means of
an analysis that identifies technology-dependent metrics that
govern the performance of stacked Class-E-like CMOS PAs.
Specifically, the analysis introduces a technology constant,
referred to as the switch time constant, which is an important
technology metric for switching PAs in addition to
. It is
shown theoretically that mmWave stacked Class-E-like PAs
implemented based on the loss-aware design methodology
achieve better efficiency than PAs using conventional power
enhancement techniques, such as impedance transformation
and power combining, for the same output power level. Two
single-ended 45-GHz Class-E-like prototypes with two- and
four-stacked devices have been fabricated in IBMs 45-nm SOI
CMOS technology based on this design methodology. Another
45-GHz differential Class-E-like PA with two devices stacked
has been implemented in IBMs 65-nm low-power CMOS
technology. These PAs provide experimental validation for the
theoretical results.
II. STACKED CMOS CLASS-E-LIKE PAs
A. Concepts
Fig. 2(a) depicts the concept of a stacked CMOS Class-E-like
PA. The stacked configuration consists of multiple series devices, which might be of equal or different size. In order to
preserve input power and improve PAE, only the bottom device
is driven by the input signal. The devices higher up in the
stack turn on and off due to the swing of the intermediary
nodes. The topmost drain is loaded with an output network
that is designed based on Class-E principles, and consequently
sustains a Class-E-like voltage waveform. The intermediary
drain nodes must also sustain Class-E-like voltage swings
with appropriately scaled amplitudes so that the voltage stress
is shared equally among all devices. In the 45-nm SOI and

65-nm CMOS technologies employed, the nominal


of the
high-speed thin-oxide devices is 1 V and for long-term reliability, the maximum swing across any two transistor junctions
under large-signal operation is limited to
V [13].
Consequently, for a PA with stacked devices, the peak output
swing is
V, as marked on Fig. 2(a), and the appropriate
intermediary node swings are also noted. Appropriate voltage
swing may be induced at the intermediary nodes through
techniques such as inductive tuning [13], capacitive charging
acceleration [24], and placement of Class-E load networks at
intermediary nodes [19]. In this paper, we employ inductive
tuning at select intermediary nodes, which, for simplicity, is
not shown in Fig. 2(a). The tradeoffs associated with inductive
tuning and capacitive charging acceleration for Class-E-like
PAs at mmWave frequencies will be discussed later in this
paper. In order to conform to the peak ac swing limit across the
gatesource junction in the on half-cycle and the gatedrain
junction in the off half-cycle, the gates of the devices in the
stack must swing, as shown in Fig. 2(a). The swing at each gate
is induced through capacitive coupling from the corresponding
source and drain node via
and
, respectively, and is
controlled through the gate capacitor
. The dc biases of all
gates are applied through large resistors.
From Fig. 2(a), it can be seen that for a two-stacked switching
PA, the gate of the top device is connected to signal ground via
a large capacitor and experiences no signal swing. However,
this does not reduce a two-stacked switching PA to a regular
cascode configuration because of the following reasons. The
main objective of stacking is to allow operation off a higher
supply voltage by distributing the overall voltage stress equally
amongst the transistors. This is accomplished by engineering the
drain and gate nodes voltage profiles [as depicted in Fig. 2(a)] to
ensure that all devices have the same
,
, and
swings,
which results in a linear increase in the supply voltage with the
number of devices stacked. In stacked switching PAs, the nature of the voltage swings requires a constant gate bias only for
the second device. Conventional cascode PAs can operate off a
higher supply voltage, as well, but a linear scaling in supply
voltage cannot be achieved. This is because the gate of the
top device is usually connected to the supply voltage to maximize small-signal power gain. The ensuing unequal voltage
stress across the devices compromises long-term reliability and
can even enforce operation off a single-device supply voltage
[25][27]. The simulated drainsource and gatesource waveforms for the two-stacked and four-stacked Class-E-like PAs
(Fig. 9(c) and (d) respectively, presented in Section IV) demonstrate that voltage swings are indeed equal for the prototypes
implemented in this work and serve to distinguish a two-stacked
switch-like PA from a conventional cascode PA. This claim is
further validated by comparing the measured performance of the
two-stacked PA implemented in this work with a prior mmWave
cascode PA [27] in Section V-B.
B. Theoretical Analysis and Fundamental Limits
To facilitate a theoretical analysis, the improved loss-aware
Class-E design methodology described in [23] is employed. The basic Class-E design methodology of ensuring
zero-voltage switching (ZVS) and zero-derivative-of-voltage

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

switching (ZdVS) [22] is no longer optimal for achieving


high-efficiency operation in the presence of high loss levels in
the switching device and/or passive components. The improved
loss-aware Class-E design methodology formally takes switch
loss and passive loss into account. The methodology also
incorporates the input power required to drive the switch and
enables optimization of the PAE rather than drain efficiency.
In essence, the methodology is an analytical loadpull for
optimizing PAE in the presence of high loss levels and input
power requirements.
The devices (taken to be equal in size) in a stacked switching
PA are assumed to behave as a single switch with linearly increased breakdown voltage and ON-resistance. As far as theoretical results are concerned, only the total ON-resistance of
the stacked configuration and output capacitance of the top device are pertinent. The output capacitance of a stacked configuration should ideally scale down linearly with number of devices
stacked. However, wiring parasitics are significant at mmWave
frequencies and there will be parasitic capacitance to ground
from the intermediate drain/source and gate nodes, which will
prevent linear scaling of output capacitance with stacking. As a
worst case estimate, the overall output capacitance is taken to
be the same as that of a single device (
, where
since the high body resistance in SOI technology causes
and
to appear in series). It is indeed this mechanism that prevents efficiency from
remaining constant as we stack more devices, as will be shown
later in this paper [graphically in Fig. 3(a) and theoretically in
(9)]. The devices can be of different sizes and there could potentially be some benefit in tapering device sizes as well [17], [28]
since the gate capacitors conduct a portion of the device current. Thus, progressive device size reduction up the stack would
reduce parasitic capacitances and prevent capacitive discharge
loss at intermediate nodes. However, device size tapering has
not been pursued in this work.
The time-domain equations and corresponding design procedure for a stacked Class-E PA are described in the Appendix.
For various levels of stacking
, the design methodology is
used to analytically vary device-size and dc-feed inductance to
find the design point(s) with optimal PAE under the constraint
of a 50- load impedance to avoid impedance transformation
losses. As an example, for a stack of four devices
, we
start with an initial device size of 100 m and set the tuning
parameter
. The design methodology then determines the optimal load impedance for highest PAE and the
corresponding output power. The load impedance is then scaled
(along with device size, input, and output powers) to have a
real part of 50 . The procedure is repeated by changing the
tuning parameter . Finally, amongst all these design points
for a stack of four devices driving a 50- load, the one with the
highest PAE is chosen. This yields a device size of 204 m for
the four-stacked PA, with theoretical output power and PAE of
145 mW and 48%, respectively (as shown in Fig. 3). The procedure can similarly be used to determine the corresponding metrics for other levels of stacking. Device ON-resistance, output
capacitance, and input-drive-power as functions of device size
are determined from post-layout device simulations and are validated through device measurements (discussed in Section III).

1689

Fig. 3. (a) Theoretical and simulated (post-layout) output power and PAE and
(b) device size and theoretical device stress for the optimal design as a function
of number of devices stacked based on the loss-aware Class-E design methodology at 45 GHz in 45-nm SOI CMOS. Loss in dc-feed inductance is included
for theoretical results. Output power and PAE for a switch capacitor-based
model for the four-stacked configuration are also annotated.

The values for these parameters are shown in Table I, where


is the operating frequency,
,
,
, and
are technology parameters
normalized to the device width (
,
,
, and
being,
respectively, the ON-resistance, output capacitance, input capacitance, and input power corresponding to a device of width
). At mmWave frequencies, the constants of proportionality
in the input power functions take into account the power lost
in the gate resistance, and are consequently frequency dependent. The values of those constants reported in Table I are based
on 45-GHz simulations. In order to incorporate the loss of the
dc-feed inductance, a quality factor of 15 is assumed at 45 GHz
based on measurements [20].
Fig. 3(a) depicts the optimal output power and PAE for different levels of stacking in 45-nm SOI CMOS at 45 GHz. The
optimal size of each stacked device and the associated device
stress (defined as the ratio of the average current drawn from
the power supply to the device width) are shown in Fig. 3(b).
It is clear that due to the increasing achievable output voltage
swing, stacking in Class-E-like CMOS PAs enables dramatic
increases in output power (near-quadratic due to linear increase
in output swing). The PAE reduces with increased stacking due
to increasing total switch loss. However, the methodology ensures that the PAE degradation is gradual. In order to do this,
the design methodology requires the size of each stacked device to increase with to reduce the individual (and hence,
overall) ON-resistance. Consequently, careful device layout is
required for high levels of stacking as it is challenging to layout
large devices while maintaining a high
. Another important consideration for device stacking is the current stress for
the stacked devices, which increases with the level of stacking.
Current stress (or large-signal current density) is the ratio of the

1690

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

TABLE I
NORMALIZED DEVICE PARAMETERS USED IN LOSS-AWARE CLASS-E ANALYSIS

Note:
,
,
,
, and
correspond to ON-resistance, gatesource, gatedrain, and drainsource capacitances, and
input power (to switch a device between the triode and cutoff regions), respectively, for a device of width
(including layout parasitics).
is the operating
frequency.
and
refer, respectively, to the high and low amplitude levels of a 45-GHz square-wave input signal.
Estimated in triode region of operation.
Estimated in cutoff region of operation.
Estimated as average of capacitance values in cutoff and triode regions of operation.

average current drawn from the supply under large-signal operation


to the device width. Note that
is different
from the supply current
drawn with no input power (i.e.,
under small-signal operation), the latter being used to determine the current density for operating at highest
in linear
PAs. The current drawn under large-signal operation is typically 1.52 times higher than the small-signal bias current in
our implementations. This implies that in a practical implementation, the metallization of the source and drain fingers of
the MOS devices must be augmented with additional metal
layers, if required, so that they can support the required currents while satisfying electromigration rules for the technology.
While Fig. 3(a) shows an increasing trend for output power
until five devices, at much higher levels of stacking, the assumption in the theoretical analysis that the switch ON resistance is much smaller than the impedance of its output capacitance [23] would be violated. Furthermore, there would
be diminishing returns in output power owing to increased
losses with stacking. In practice, the maximum practical device size, the maximum current stress that can be tolerated
as per electromigration requirements, and drainbulk/buriedoxide breakdown mechanisms would determine the maximum
number of devices that can be stacked. The post-layout simulated results for output power and PAE for a two-stacked
and a four-stacked Class-E-like PA have been annotated on
Fig. 3(a), as well and show excellent agreement with the theoretical output power. The post-layout simulated efficiency is
lower by 20% owing to various implementation losses and
soft-switching at mmWave, as well as power loss at intermediate nodes, which are not accounted for in theory. However,
the theoretical and simulated trends in PAE are in agreement.
Later in this paper, a switch capacitor-based model for the device is constructed for simulation-based investigation of power
loss at intermediate nodes for a four-stacked configuration. The
resulting output power and PAE [see Fig. 3(a)] show excellent
agreement with post-layout device-based simulations and reaffirm the utility of a simplified theoretical analysis.
C. Interpretation Using Waveform Figures of Merit
An analysis using the unique properties of switching PAs facilitates a better understanding of the underlying phenomena associated with device stacking and an interpretation of the results

of the loss-aware Class-E design methodology. An excellent description of the characteristics of switching PAs can be found in
[29]. We have
(1)
where
and
are input power to the PA and the dc power
consumption, respectively. The loss in the PA
is given
by
(2)
(3)
(4)
where is the number of devices stacked in series,
is
the ON-resistance of each device in the -stacked PA,
is
the width of each device/switch, and
is the root mean
square (rms) value of the current flowing through the stack of
switching devices (excluding the output capacitance).
is the switching loss associated with the output capacitance of
the PA and is dependent on the topmost drain voltage value at the
switching instant. In general, at mmWave frequencies, the capacitive discharge loss is negligible compared to the conduction
loss in the switching device(s). This is evident from Table II,
where the conduction loss in the switch and the capacitive discharge loss have been tabulated for the optimal designs at different levels of stacking described in Fig. 3. Indeed, this reinforces our earlier assertion that the conventional ZVS/ZdVSbased Class-E design methodology is not applicable at mmWave
frequencies. Therefore, for the purpose of simplifying our analysis, we shall ignore the contribution of the term
to the
overall loss. In a switching PA, the average current drawn from
the supply is always proportional to
. The proportionality
constant depends on the tuning of the load network [29]. Tuning
of a Class-E load network is determined by the dc-feed inductance
and the load impedance
in relation to the device output capacitance. Since we are in a regime where conduction loss is significant, the proportionality constant will also depend on the value of the total switch ON-resistance
relative to the output capacitance
. Since
is a technology constant, specifying ,
, , and
completely characterizes the tuning of the stacked Class-E PA.

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

1691

TABLE II
CONDUCTION LOSS AND CAPACITIVE DISCHARGE LOSS FOR THE OPTIMAL DESIGNS AT DIFFERENT LEVELS OF STACKING DESCRIBED IN FIG. 3.
VALUES FOR THE WAVEFORM FIGURES OF MERIT
AND
FOR LOSS-AWARE AND ZVS-BASED DESIGNS ARE ALSO TABULATED

It is also therefore clear that the optimal tuning is likely to vary


for different levels of stacking due to the increasing total switch
loss. Ignoring capacitive discharge loss, (4) becomes

(5)
where
is a waveform figure of merit (FOM)
defined in [29] and
is the average supply current with
devices stacked.
For a stack of devices, the supply voltage scales linearly
with . On the other hand, in a switching PA, average supply
current is proportional to the product of output capacitance,
supply voltage, and the operating frequency
, the constant
of proportionality being dependent on the tuning of the circuit. The linear dependence on output capacitance is simply
an artifact of circuit scaling properties, while the linear scaling
with supply voltage arises from the fact that switching PAs are
linear with respect to excitations at the drain node (e.g., supply
voltage) [29]. Denoting the impedance of the device output capacitance at the fundamental frequency by

the waveform FOM

is defined in [29] as
(6)

For an -stacked PA,


, where
supply voltage for a single device PA. Substituting

is the

in (6), we get
(7)
Consequently,

(8)

(9)
where is a technology- and frequency-dependent constant
of proportionality that results from the input power functions
shown in Table I.
Table II lists the values of the waveform figures of merit for
designs based on the loss-aware and ZVS methodologies for
different levels of stacking. While the waveform metric
is
comparable for both, the loss-aware methodology shapes the
waveforms to minimize
, thereby yielding optimal designs
with highest possible PAE.
The foregoing expression captures the variation in PAE in
terms of technology constants and number of devices stacked.
The only design-related variables in this expression are the waveform figures of merit. The third term captures the PAE benefit of
stacking since output power is increased quadratically, but input
power is only provided to the bottom device in the stack. This explains why PAE improves when one goes from a single-device
Class-E-like PA to a two-stacked Class-E-like PA in Fig. 3(a).
However, as stacking is increased beyond two devices, the benefits from the third term wear off and the second term causes a
reduction in PAE due to a reduction in drain efficiency. It is well
known that
is the technology constant (which we
shall refer to as the switch time constant) that determines the
drain efficiency of a Class-E PA [29] for a given operating frequency and this constant degrades linearly for an -stacked device since the ON-resistances add in series while the output capacitance remains that of a single device. However, the loss-aware
Class-E design methodology optimizes the output network tuning
to ensure that the PAE degradation is gradual as stacking is increased by minimizing the
product. The benefits of
the loss-aware Class-E tuning methodology over the ZVS/ZdVSbased tuning methodology can be appreciated in Fig. 4, where the
product for the loss-aware design technique can be
observed to be lower than that corresponding to the ZVS design
methodology by a factor of 23 depending on the number of devices stacked.
The preceding analysis also highlights the importance of the
switch time constant as a technology metric that determines the
efficiency of switching PAs. For linear-type PAs,
is a sufficient metric to gauge the PA efficiency. For switching-type PAs,
determines the input power requirements (via the technology- and frequency-dependent constant ) while the switch
time constant determines the drain efficiency.
As the levels of stacking are increased, the switch time
constant becomes more significant than
. As can be seen in

1692

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

Fig. 4. Product of waveform figures of merit


and
for stacked
Class-E-like PAs in 45-nm SOI CMOS at 45 GHz based on the loss-aware and
ZVS based design methodologies.

Table I, and later in this paper, 65-nm low-power bulk CMOS


and 45-nm SOI CMOS have similar
, but significantly different switch time constants. Consequently, it can be expected
that switching-type PAs in 45-nm SOI CMOS will achieve
higher efficiencies than those in 65-nm low-power bulk CMOS.
This is validated by our experimental results in Section V.
The loss-aware Class-E analysis takes into account several
mmWave nonidealities, and therefore, the results in Fig. 3
represent the fundamental limits on achievable performance
in stacked CMOS Class-E-like PAs. The main nonideality
that causes deviation from these limits in practice is soft
switching of the stacked devices due to the lack of square-wave
drives. Nevertheless, the optimal design points predicted by
the analysis are excellent starting points for simulation-based
optimization.

Fig. 5. Comparison of device stacking in Class-E-like PAs (based on


loss-aware Class-E design methodology) at 45 GHz in 45-nm SOI CMOS with:
(a) two-, four-, and eight-way Wilkinson-tree-based power combining and
transformer-based series power combining (the two-stacked and one-stacked
Class-E-like PAs obtained from the loss-aware Class-E design methodology
are used with both the -way Wilkinson-tree-based and transformer power
combiners) and (b) impedance transformation at 45 GHz in 45-nm SOI
CMOS (the two-stacked and one-stacked Class-E-like PAs obtained from the
loss-aware Class-E design methodology are scaled to increase output power
and a two-element LC network is used to transform the 50- load to the
optimal load impedance for the scaled PAs. Quality factors for the inductor and
capacitor are assumed to be 15 and 10, respectively, at 45 GHz).

D. Stacking Versus Power Combining


In order to appreciate the benefits of device stacking using
the loss-aware Class-E design methodology, it is imperative
to contrast this approach to conventional impedance transformation and power-combining techniques. To evaluate the
performance of power combining, a 45-nm SOI two-stacked
Class-E-like PA (resulting from the loss-aware design methodology) with a theoretical output power of 34 mW and a
corresponding theoretical PAE of 54% at 45 GHz is chosen
since it has reasonable output power as well as the highest
efficiency [see Fig. 3(a)]. Since the two-stacked PA is designed
for an optimal load impedance of 50 , a cascaded-tree of
two-way 50- Wilkinson power combiners is chosen. A 2-way
50 Wilkinson power combiner with 70.7transmission
lines in 45-nm SOI CMOS technology has an electromagnetic
(EM)-simulated efficiency
at 45 GHz. An -way
cascaded-Wilkinson-tree power combiner (where is an even
multiple of two) will therefore have an overall efficiency of
. Fig. 5(a) compares the theoretical PAE, as a function of
output power, for different levels of device stacking with that
of two, four-, and eight-way Wilkinson-tree power combining.
For a given output power, stacked Class-E-like PAs implemented using the loss-aware Class E design methodology offer
10%20% higher efficiency compared to Wilkinson power
combining (using two-stacked PAs). Power combining using
transformers is a better alternative at mmWave frequencies
since ideally transformer-based series power combining has
a constant efficiency with the number of elements combined.

However, interwinding and self-resonant capacitances introduce asymmetry in transformer power combiners, degrade
efficiency, and cause stability problems, usually permitting a
maximum of two transformer sections to be combined in series
[4]. Ignoring the effect of parasitic capacitances, a two-section
series transformer combiner is used to power-combine two
two-stacked PAs. The secondary inductance is chosen for
maximum efficiency subject to a 50- load, and the PAs are appropriately scaled to drive the load impedance presented by the
primary of the transformer. As shown in Fig. 5(a), transformer
power combining utilizing two-stacked PAs can yield results
similar to stacking only under ideal conditions and is fundamentally limited to two-way combining. The corresponding
results for Wilkinson and transformer-based power combining
using one-stacked (single device) Class-E-like PAs obtained
from the loss-aware design methodology are also included to
emphasize the inefficacy of the traditional design technique of
using single-device PAs for high-power amplification.
E. Stacking Versus Impedance Transformation
The efficiency of the alternative technique of impedance
transformation is dependent on the steepness of transformation as well as the topology of the impedance transformation
network. The two- and one-stacked Class-E-like PAs in 45-nm
SOI obtained from the loss-aware Class-E design methodology
at 45 GHz are again employed for the purpose of comparison.
In order to achieve output power comparable to those obtained

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

from device stacking, the Class-E-like PAs are scaled appropriately while an impedance transformation network is used
to transform the 50- load to the corresponding lower load
impedance for the scaled PAs. A two-element LC impedance
transformation network is designed and used in each case.
The quality factors of the inductor and capacitor are assumed
to be 15 and 10, respectively, at 45 GHz, based on measured
characterizations of inductors, capacitors, and transmission
lines in the 45-nm SOI CMOS technology [20]. A comparison
of the PAEs of impedance transformation and device stacking
is summarized in Fig. 5(b). Device stacking results in designs
with 1030% higher efficiency for the same output power
compared to impedance transformation.
Once device stacking is exploited to the limit as dictated by
secondary breakdown mechanisms (e.g., that of the BOX in
the SOI), it is interesting to consider the combination of device
stacking with impedance transformation and/or power combining to achieve watt-class output power levels at mmWave
frequencies.

1693

Fig. 6. Close-up of: (a) devices and (b) devices with connections to gate capacitors in four-stacked PA layout implemented in 45-nm SOI CMOS.

III. IBM 45-nm SOI AND 65-nm CMOS MODELING


In the 45-nm SOI CMOS, an accurate high-frequency model
for the device, which accounts for intrinsic input resistance
(IIR), as well as layout-related wiring resistances, capacitances,
and inductances of the gate, drain, and source fingers and vias is
nonexistent. The model provided in the design kit is augmented
to incorporate the impact of IIR, which models the distributed
characteristics of the channel in a MOSFET [30]. While IIR
is controlled by two parameters (XRCRG1 and XRCRG2)
in BSIMSOI modeling [31] (which default to 0 in the PDK),
we have found that transient simulations in SPECTRE fail to
converge when these parameters are assigned values based on
our device measurements. Consequently, a bias-independent
resistance is added in series with the gate to account for IIR.
The bias independence of this resistor, along with its location
(outside the PDK device model), is a source of inaccuracy in
our transient simulations. Wiring resistances and capacitances
are extracted using Calibre PEX. High-frequency models for
the gate and drain vias are simulated in the IE3D field solver
[32].
The layout of a fabricated floating-body power device test
structure in 45-nm SOI technology employs a continuous array
of gate fingers (4070) with a finger width of 2.793 m. We
make use of a doubly contacted gate with a symmetric gate via
on both sides to reduce gate resistance [30]. Wiring resistances
and capacitances are extracted for the entire stacked-device
layout configuration and high-frequency models for the vias
are added to this overall RC extracted model. The layout
for the four-stacked configuration is shown in Fig. 6, while
the corresponding high-frequency model with parasitics is
illustrated in Fig. 7. The source and drain fingers of the devices
consist of metal layers
strapped so as to conform to
electromigration requirements. The connection from the source
of the bottom device to the ground node supports a large current
under large-signal operation. Consequently, this connection
is augmented with thick metal strips in metal layers
and
strapped together, which also helps minimize the source
inductance.

Fig. 7. Augmented schematic of four-stacked power device in 45-nm SOI


CMOS with capacitive and inductive layout parasitics.

The measured
and
of the test structures of the
individual power devices used in the implemented 45-nm
SOI two-stacked and four-stacked PAs (distinct from the
custom stacked layout as discussed earlier) are shown in
Fig. 8(a) and (b), respectively. Device measurements were
conducted up to 65 GHz using a pair of coaxial 1.85-mm
(dc65 GHz) groundsignalground (GSG) probes, calibrated
at the probe tip planes. The industry-standard open-short
de-embedding was performed to a reference plane at the top of
the gate and drain vias. The measured
and
were obtained by extrapolating the measured Masons unilateral power
gain
and
at 20 dB/decade. The measured is observed
to have 20-dB/decade slope up to 65 GHz and the modeled
exhibits the same slope up to
. Peak
of 180 GHz
and 190 GHz are achieved for these power devices.
It is difficult to achieve
for power devices that is similar to that of smaller devices due to layout challenges [33]. For
reference, our measurements reveal that a 1 m 10 40 nm
device achieves an
of 250 GHz in this technology [30].
The use of a compact device layout with a continuous array of
large number of gate fingers with large finger width reduces the
parasitic capacitance and causes the 204- m device to have a
higher
compared to the 115- m device However, the layout

1694

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

Fig. 8. Measured (extrapolated): (a)


and (b)
of 2.793 m
41 40 nm and 2.793 m
73 40 nm power devices in IBM 45-nm
SOI CMOS across current density. These devices are used in designing the
two-stacked and four-stacked PAs, respectively. (c) Measured and simulated
for a 3 m 50 60 nm 65-nm low-power bulk-CMOS power device
across current density.

also results in an increased gate resistance and lower


for
the larger device. This prevents the devices of the four-stacked
PA from being driven into a hard-switching condition, as will
be discussed later in this paper. Splitting the overall device into
several smaller devices (each with reduced finger width and
small number of gate fingers) wired appropriately in parallel
should further improve the
to approach 250 GHz and available gain [33]. It should be noted that such a multiplicity-based
layout approach might compromise
due to increased wiring
capacitance. In a switch-like PA, a good balance between
and
must be maintained.
A similar device layout approach and modeling strategy
is employed for power devices in IBMs 65-nm low-power
bulk-CMOS technology. A key point of difference, however,
is the presence of IIR modeling within the PDK, eliminating
the need for an external IIR resistance. A 3 m 50 60 nm
power device test structure is measured using the same approach as mentioned earlier [see Fig. 8(c)]. A peak
of approximately 180 GHz is observed in measurement. It
should, however, be noted that while power devices in 65-nm
low-power bulk CMOS are able to achieve similar
to
power devices in 45-nm SOI CMOS, the width-normalized
ON-resistance (quantified as
in Table I) is almost three
times higher for the same gate drive level due to the high
threshold voltage of the low-power process (
mV
at the PA bias point). As will be demonstrated experimentally,
this leads to inferior performance for mmWave Class-E-like
PAs in 65-nm low-power bulk CMOS.
IV. IMPLEMENTATION DETAILS
The schematics in Fig. 9(a) and (b) depict the Class-E-like
PAs implemented by stacking two and four floating-body
devices in 45-nm SOI CMOS technology. Device sizes and
dc-feed inductance values are chosen based on the theoretical
analysis, while supply and gate bias voltages and gate capacitor
values are selected based on the considerations described

Fig. 9. Schematics of 45-nm SOI CMOS -band Class-E-like PAs with:


(a) two devices stacked and (b) four devices stacked. Simulated drainsource
and gatesource voltage waveforms of the -band (c) two-stacked Class-E-like
PA (
V,
V,
V), and (d) four-stacked
Class-E-like PA in 45-nm SOI CMOS (
V,
V,
V,
V, and
V).

earlier. For the first stacked device (


in both designs),
the gate voltage must be held to a constant bias as discussed
previously. This can be accomplished through a large bypass
capacitor placed as close as possible to the gate to mitigate
stray inductance that can result in oscillations. DGNCAPs
(which are device capacitors) are suitable for this purpose
since their wiring is in the lowest metal layer and they provide
higher capacitance density than VNCAPs (interdigitated finger
capacitors). All other capacitors, including gate capacitors for
the higher stacked devices, which are not large in value, are
implemented using VNCAPs. For both the designs, the output
harmonic filter is eliminated to avoid passive loss with minimal
impact on performance.
As was mentioned earlier, a tuning inductor may be placed
at intermediary nodes to improve their voltage swing and make
them more Class-E-like. Simulation results indicate that the improvement in swing for the two-stacked PA is offset by an increase in the conduction loss of the top device. This can be explained as follows. The voltage swing at the intermediate node
controls the turn-on and turn-off of the top device. As shown
in Fig. 10(a), in the absence of the tuning inductor, the intermediate node voltage gets clipped to
once the top device
turns off during the OFF half-cycle [13]. The voltage remains

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

1695

Fig. 10. Simulated voltage profiles for two-stacked Class-E-like PA: (a)
without tuning inductor and (b) with tuning inductor. (c) Close-up of voltage
profiles with (bottom) and without (top) tuning inductor.
Fig. 11. Schematic of the two-stage 45-nm SOI CMOS -band Class-E-like
PA with a two-stacked driver stage and a four-stacked main PA.

unchanged at
until the end of the OFF half-cycle,
when the drain voltage of the top device reduces to
and the top and bottom node voltages roll-off in tandem thereafter. Introducing an inductor at the intermediate node results in
a Class-E-like voltage profile [see Fig. 10(b)], which causes the
top device to turn back on earlier during the latter part of the
OFF half-cycle, as shown in Fig. 10(b). This leads to additional
power loss in the top device. Consequently, no tuning inductor
is used in designing the two-stacked PA. For the four-stacked
PA, a tuning inductor at
is seen to provide benefit. Intuitively, a four-stacked configuration can be viewed as a stack of
two two-stacked PAs with the inductor serving as an inter-stage
tuning element. Fig. 9(d) shows the drain waveforms for the
four-stacked PA. As is evident, drainsource voltage swings
are almost equally shared across all four devices. The lack of
a tuning inductor at
results in a relatively flat-topped waveform. This is to be expected in view of the foregoing discussion
for the two-stacked PA. The situation is somewhat different for
node
. Despite the absence of a tuning inductor, we can observe a Class-E-like waveform even when device
is off. This
is a consequence of capacitive coupling through
and
of
(in conjunction with capacitive voltage division due to presence of the 80-fF gate capacitor), which induces voltage swing
at
when
is not conducting. This eliminates the need for
a tuning inductor at
. A similar voltage coupling does occur
for
as well. However, in that case, the coupling is through
two levels of devices and the resulting series connection of intrinsic capacitances reduces the strength of the voltage coupled
to
. Since
and
can be viewed as a two-stacked PA
with the tuning inductor serving as the choke inductance in large
signal, a tuning inductor is not required at
(as discussed earlier).
Another technique for inducing voltage swing at the intermediary nodes in a stacked configuration is through the use of capacitive charging acceleration. The work in [24] describes two
methods for accomplishing this. The first is by placing an explicit capacitor between every pair of intermediary nodes and
the second is using the inherent drainbulk capacitance of a device by connecting the bulk and source nodes of stacked devices along with appropriate device sizing. The first method is

less desirable at mmWave frequencies owing to the poor quality


factor of on-chip capacitors. The second method is applicable
only when the body terminal of the device is explicitly available
to the designer. Furthermore, the efficacy of such an approach
would depend on accurate modeling of the characteristics of the
sourcebulk junction. For the 45-nm SOI implementations, the
body of the floating-body devices is not accessible. Inductors,
on the other hand, have better quality factor than capacitors at
mmWave frequencies. Therefore, in the implemented PAs, inductive tuning is preferred to the capacitive feed-forward technique.
The lack of square-wave drive at mmWave frequencies results in soft switching, which increases the input power required
to drive the PAs into a hard-switching state. Thus, to ensure that
the PAs are driven into saturation, it is imperative to include a
driver stage when delivering high output power. A third prototype (Fig. 11) was designed in 45-nm SOI CMOS by cascading
the two- and four-stacked designs discussed previously. The
two-stacked PA thus serves as the driver for the four-stacked
main PA, with an inter-stage matching network transforming
the input impedance of the main PA to the optimal 50- load
desired by the driver stage.
In order to demonstrate the benefit of scaled SOI technology
over bulk CMOS for implementing stacked PAs, a prototype
two-stacked PA was implemented in IBM 65-nm CMOS technology. The schematic of the pseudo-differential two-stacked
Class-E-like PA is shown in Fig. 12. The design strategy is similar to that of the single-ended two-stacked PA in 45-nm SOI
CMOS discussed previously. The differential input and output
terminals are routed directly to groundsignalsignalground
(GSSG) pads for probing. A pseudo-differential structure was
chosen to facilitate an increase in the overall output power.
An important characteristic of switching PAs, which sets
them apart from the linear classes, is the nonoverlapping nature
of switch voltage and switch current waveforms and the high
harmonic content of these waveforms compared to linear PAs.
In a device-based implementation, it is difficult to isolate the
current flowing through the device capacitances from that

1696

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

Fig. 12. Schematic of differential two-stacked Class-E-like PA implemented


in 65-nm low-power bulk CMOS.
Fig. 13. Post-layout simulated drainsource voltages and corresponding switch
currents for: (a) two-stacked PA and (b) four-stacked PA in 45-nm SOI CMOS.

flowing through the switch. As a first-order approximation,


the currents through the external wiring parasitic capacitances
,
,
, and
are scaled in proportion to the ratio
of the intrinsic to external wiring parasitic capacitance, and
their sum is subtracted from the total device current to arrive
at the switch-current in simulation. Fig. 13 shows the
and
the corresponding
for the various devices in the twoand four-stacked PAs implemented in 45-nm SOI CMOS from
which the nonoverlapping characteristic of voltage and current
waveforms is clearly evident. Figs. 14 and 15 compare the
switch voltage and switch current waveforms for devices
and
of the two- and four-stacked PAs, respectively, with
theory. Aside from the sharp current spikes in the theoretical
waveforms at switch turn-on, there is excellent correspondence
between theory and simulation. The current spikes arise from
the assumption of hard switching, which is not possible at
mmWave. However, the soft switching in simulation does not
compromise the shaping of voltages and currents and their
harmonic content for the rest of the switching cycle. These
results clearly indicate the feasibility of switching operation at
mmWave frequencies.
As mentioned before, the theoretical loss-aware Class-E design methodology approximates the stacked configuration as
a series connection of switches and assumes that appropriate
voltage swings are somehow ensured at the intermediate nodes.
A more realistic model for the circuit is a stack of switches, each
accompanied by the corresponding intrinsic device capacitances
(
,
,
, and
) and gate capacitors (except for
the bottom switch). An Elmore network of RC delays is encountered in stacked linear PAs (owing to the simultaneous presence of capacitances and finite device output resistances), which
can cause phase shift in the voltages and currents as one moves
up the stack. It is unclear that there would be a similar Elmore
delay in stacked switching PAs since during the OFF cycle the
switch devices have very high OFF resistance. However, the capacitive discharge loss at intermediate nodes might be nonnegligible and can have a considerable influence on overall efficient

operation of the PA. Ignoring these losses in the theoretical analysis results in PAE higher than what is obtained from actual device-based simulations.
For linear stacked PAs, the impact of Elmore delay can
be accounted (and compensated) for theoretically [28] since a
small-signal model for the devices is used for the preliminary
analysis and design procedure. A similar endeavor for stacked
switching PAs is challenging owing to nonlinear operation of the
circuit. Consequently, we adopt a simulation-based approach to
investigate this effect for a four-stacked configuration (without
the tuning inductor) using a switch capacitor-based model for
the devices. The resulting circuit resembles that in Fig. 9(b)
(sans the tuning inductor, the input matching network, and with
a square-wave drive instead of a sinusoidal input) with each device modeled as a switch augmented with intrinsic device capacitances. Layout parasitics (capacitors, resistors, and inductances) based on Fig. 7 are also incorporated in the circuit to
facilitate better correlation with device-based results. The delay
in the switch-voltage waveforms exhibit close correspondence
with those obtained from device-based simulations as well, as
shown in Fig. 16. Since the voltage profiles confirm no significant delay, the phenomenon of Elmore delay is not a concern
in stacked switching PAs. The resulting output power and PAE
are reported in Fig. 3(a). The reduction in efficiency ( 20%) for
the switch capacitor-based model indicates that capacitive discharge loss at intermediate nodes is a more important practical
consideration. These results, in conjunction with the comparison presented in Fig. 15 indicate that ignoring Elmore delay
and the additional loss mechanisms in the theoretical analysis is
not a crippling limitation since it does not significantly alter the
waveform characteristics, and hence, the impact on switching
behavior of the stacked configuration at mmWave frequencies.
One would therefore obtain output power similar to that predicted by theory, but at a lower PAE, which follows the theoretical trend [see Fig. 3(a)]. This also demonstrates the efficacy of

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

Fig. 14. Comparison of post-layout simulated waveforms for device

of the two-stacked PA in 45-nm SOI CMOS with theory.

Fig. 15. Comparison of post-layout simulated waveforms for device

of the four-stacked PA in 45-nm SOI CMOS with theory.

Fig. 16. (a) Post-layout simulated device voltages for the four-stacked PA prototype without tuning inductor in 45-nm SOI CMOS. (b) Simulated switch voltages for the same four-stacked configuration without tuning inductor, using a
switch capacitor-based model for the devices and layout parasitics from Fig. 7.

a simple switch capacitor-based model to predict the performance of a practical implementation.


V. EXPERIMENTAL RESULTS
The chip microphotographs of the PAs are shown in Fig. 17.
The PAs are tested in chip-on-board configuration through
on-chip probing using two coaxial 1.85 mm (dc65 GHz) GSG
probes.

1697

Fig. 17. Chip microphotographs of the mmWave stacked Class-E-like PAs


with: (a) two devices stacked in 45-nm SOI CMOS, (b) four devices stacked
in 45-nm SOI CMOS, (c) a two-stage cascade of a main PA with four devices
stacked with a two-stacked driver stage in 45-nm SOI CMOS, and (d) two
devices stacked in 65-nm low-power bulk CMOS.

A. Small-Signal Measurements
The small-signal measurement setup is calibrated at the probe
tip planes. The small-signal measurements are conducted up to
65 GHz using an Anritsu 37397E Lightning vector network analyzer (VNA). Figs. 18 and 19 illustrate the simulated and measured small-signal -parameters of the two-stacked PA and the
four-stacked PA implemented in 45-nm SOI CMOS. The measured peak gain of the two-stacked PA is 13.5 dB at 46 GHz

1698

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

Fig. 18. Small-signal -parameters of 45-nm SOI two-stacked Class-E-like PA


(
V,
V,
V). Power consumption
mW.

Fig. 19. Small-signal


PA (
V,
Power consumption

-parameters of 45-nm SOI four-stacked Class-E-like


V,
V,
V,
V).
mW.

with a 3-dB bandwidth extending from 32 to 59 GHz. The


1-dB bandwidth extends from 42 to 52 GHz, making it suitable for wideband applications. The measured peak gain of the
four-stacked PA is 12.3 dB at 48.5 GHz with a 3-dB bandwidth
extending from 37 to 56 GHz. The measured 1-dB bandwidth
spans a wide frequency range from 43.5 to 52.5 GHz. A frequency shift of 35 GHz is observed between measured and
simulated curves for both PAs in both
and
. This can
probably be attributed to overestimation of capacitive parasitics
at design time. The fact that the PAs have a significant smallsignal gain goes against the concept of conventional Class-E PA
design, but is simply an outcome of the Class-E-like design
methodology described in this paper. The PA is designed for
optimum performance at a Class-E input drive level, at which
point the devices can be regarded as hard switching. However,

Fig. 20. Small-signal -parameters of 45-nm SOI four-stacked Class-E-like


PA with the tuning inductor eliminated using laser trimming (
V,
V,
V,
V,
V).
Power consumption
mW.

Fig. 21. Small-signal -parameters (single ended) of the 65-nm differential


two-stacked Class-E-like PA (
V,
V,
V).
Power consumption
mW under small-signal operation.

at the dc bias point, the devices are biased somewhat above


the threshold voltage, imparting the circuit with small-signal
gain. Of course, this gain is less than the maximum gain available from the stacked devices as the output load is designed
based on Class-E principles. A modified version of the 45-nm
SOI four-stacked PA, obtained by laser-trimming the tuning inductor, was also characterized, and its simulated and measured
small-signal -parameters are reported in Fig. 20. The measured
peak gain is 11.6 dB at 45 GHz with a 3-dB bandwidth extending from 30 to 55 GHz. The 1-dB bandwidth extends from
36 to 50 GHz.
The measured and simulated small-signal -parameters of
the two-stacked differential PA implemented in 65-nm bulk
CMOS are illustrated in Fig. 21. The measurement setup and

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

Fig. 22. Large-signal

1699

-band measurement setup for the fabricated PAs.

Fig. 24. Measured gain, drain efficiency and PAE as a function of output power
for: (a) the 45-nm SOI four-stacked Class-E-like PA with the tuning inductor
eliminated through laser trimming at 42.5 GHz and (b) the 45-nm SOI fourV,
V,
stacked Class-E-like PA at 47.5 GHz (
V,
V, and
V for both designs).

Fig. 23. Measured gain, drain efficiency, and PAE as a function of output power
for: (a) the 45-nm SOI two-stacked Class-E-like PA at 47 GHz (
V,
V,
V) and (b) the 65-nm differential two-stacked
Class-E-like PA at 47.5 GHz (
V,
V,
V).

calibration procedure are the same as discussed before with


the exception that coaxial 1.85-mm coplanar wave GSSG
probes are used for the measurements. However, one probe of
each differential pair is terminated with 50 so, in essence,
single-ended measurements are being performed. This stems
from the practical challenges in creating a differential mmWave
measurement setup. The measured peak gain is 9.5 dB at
47 GHz with a 1-dB bandwidth extending from 44 to
50 GHz.
B. Large-Signal Measurements
The large-signal measurement setup for the aforementioned
PAs is shown in Fig. 22. The large-signal characteristics of the
45-nm SOI PAs are shown in Figs. 23(a) and 24. Measurement results yield a peak PAE of 34.6% for the two-stacked
PA with a saturated output power of 17.6 dBm at 47 GHz.
Compared to the cascode PA in [27] operating at a similar
frequency at supply voltages close to the nominal single-device
of the technology, the two-stacked prototype achieves
37-dB higher output power along with 10%12% higher PAE.
The four-stacked PA has measured saturated output power of

20.3 dBm at 47.5 GHz at a peak PAE of 19.4%. For the trimmed
version of the four-stacked PA without the tuning inductor, a
peak PAE of 18.3% was achieved at 42.5 GHz along with a
saturated output power of 20.3 dBm. Unlike the two-stacked
PA, the measured performance metrics of the four-stacked
PAs (particularly efficiency) are somewhat lower than those
predicted by simulations. This is an indication of unmodeled
active losses, as there is good correspondence between the
measured and simulated characteristics of the various passive
components [20]. The loss in the active components depends on
a proper choice of device layout, as well as accurate modeling,
as discussed in Section III.
Large-signal measurements were also conducted for the twoand four-stacked PAs across frequency (at the optimal bias
point) and for different supply voltages (at a fixed frequency,
keeping gate biases constant). The results are depicted in
Figs. 25 and 26. Large-signal measurement beyond 48 GHz
was limited by the characteristics of the measurement equipment (specifically, the Quinstar PA used to drive the PAs
under test, as well as the isolator, dual-directional coupler,
and the power sensors used in the measurement setup). Unlike
the two-stacked PA, the output power does not increase with
increasing supply voltage for the four-stacked prototype. Once
again, this can probably be attributed to the device layout
discussed previously that results in lower
.
This hypothesis is tested in measurement. As discussed previously, an important characteristic of switching PAs is linearity
with respect to supply voltage, which causes the average supply
current and the output power to scale linearly and quadratically
with supply voltage, respectively. This unique feature distinguishes switching PAs from the class of linear PAs. At mmWave

1700

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

Fig. 27. Measured and expected: (a) average supply current versus
and (b) saturated output power versus
for two-stacked Class-E-like
PA in 45-nm SOI CMOS. The profiles display the linearity with respect to
supply voltage associated with switching Class-E PAs, thereby establishing the
Class-E-like characteristics of the PA even at mmWave frequencies.

Fig. 25. Measured gain, saturated output power, drain efficiency, and PAE:
(a) across frequency (
V,
V,
V) and
(b) across supply voltage at 47 GHz of the 45-nm SOI two-stacked Class-E-like
V,
V).
PA (

Fig. 26. Measured gain, saturated output power, drain efficiency, and PAE:
(a) across frequency (
V,
V,
V,
V) and (b) across supply voltage at 47 GHz of the 45-nm SOI four-stacked
Class-E-like PA.

frequencies, the various sources of nonidealities result in deviation from ideal Class-E characteristics. Thus, the scaling of
supply current and output power with supply voltage can be utilized as a useful metric to determine the extent of switching
characteristics of a PA in the mmWave regime. Fig. 27 illustrates the measured average supply current and saturated output
power of the two-stacked PA in 45-nm SOI CMOS as a function of
and
, respectively. The respective linear trends

Fig. 28. Measured and expected: (a) average supply current versus
and
(b) saturated output power versus
for four-stacked Class-E-like PA in
45-nm SOI CMOS. The profiles do not display the linearity with respect to
supply voltage characteristic of switching Class-E PAs, owing to layout-induced
increased gate resistance, which prevents hard switching at mmWave frequencies.

Fig. 29. (a) Measured small-signal -parameters and (b) measured gain, drain
efficiency, and PAE as a function of output power for the two-stage 45-nm SOI
PA comprising a four-stacked main PA and a two-stacked driver stage at 47 GHz
V,
V,
V,
V,
V,
(
V,
V, and
V). Power consumption
mW under small-signal operation.

can be clearly observed, thereby corroborating the Class-E-like


nature of the design. This also proves that switch-like PAs can
indeed be implemented at mmWave frequencies with appropriate design methodology. The corresponding results for the

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

1701

TABLE III
COMPARISON OF FABRICATED CLASS-E-LIKE PAs WITH STATE-OF-THE-ART CMOS AND SiGe mmWAVE PAs
(REFERENCES ARE ORGANIZED IN ORDER OF DECREASING PAE, FOR CMOS AND SiGe PAs SEPARATELY)

Defined as
dBm
Gain dB
Ideal external lossless output balun assumed.
Uses off-chip bias-T for providing power supply.

GHz

four-stacked PA with the tuning inductor are shown in Fig. 28.


The four-stacked PAs measured characteristics deviate from
expected trends. This indicates that the devices are not being
driven to a hard-switching condition, likely due to reduced device
. It should be noted that [18] and [17] have realized
large power devices at mmWave with high
, and hence, the
foregoing results for the four-stacked PA should not be taken to
mean that Class-E operation is not possible for high levels of
stacking.
The small-signal -parameters and large-signal performance
metrics of the two-stage PA implemented in 45-nm SOI CMOS
are summarized in Fig. 29. The measured peak gain is 24.9 dB
at 51 GHz while a peak PAE of 15.4% was achieved at 47 GHz
along with a saturated output power of 20.1 dBm.
Large-signal measurements of the two-stacked differential
PA implemented in 65-nm bulk CMOS yield a peak PAE of
28.3% with a saturated output power of 15.2 dBm at 47.5 GHz
[see Fig. 23(b)], implying a saturated differential output power
of 18.2 dBm. The lower efficiency of this PA compared to the
two-stacked 45-nm SOI PA stems from the higher ON-resistance of the 65-nm devices.
C. Comparison With State-of-the-Art
Table III depicts a comparison of these PAs to state-of-the-art
mmWave CMOS and SiGe PAs. The references are arranged
in order of decreasing PAE. The 65-nm PA is comparable to
state-of-the-art implementations in efficiency, despite the poor
ON-resistance characteristics of the technology. This is a direct
consequence of the loss-aware Class-E design methodology. On
the other hand, the two-stacked PA in 45-nm SOI CMOS exhibits the highest PAE reported for a CMOS mmWave PA. The

PA reported in [21] exhibits similar PAE and output power, and


also employs device stacking in 45-nm SOI CMOS, albeit in the
context of Class-AB operation. The four-stacked PA in 45-nm
SOI CMOS exhibits the highest output power achieved from
a fully integrated CMOS mmWave PA. The work in [17] uses
an off-chip bias-T to provide the supply voltage, and consequently does not integrate the mmWave dc-feed inductor. Furthermore, it is a differential implementation with a differential
output, and assumes ideal 3-dB external differential-to-singleended conversion. It is important to study the output power delivered to a single-ended output pad when comparing works. An
on-chip dc-feed inductor is seen in simulation to introduce approximately 1-dB output-side loss based on the quality-factor
achievable in this technology. When these are factored in, the
work in [17] achieves comparable output power to our fourstacked PA, with an associated PAE that is lower than our fourstack and comparable to our cascade PA. Other prior fully integrated CMOS mmWave PAs with comparable output power
[19], [36], [38] rely on power combining. Since most of the
works reported in Table III operate at higher frequencies, it is
important to use a FOM to ensure fair comparison. The ITRS
FOM, defined as
dBm

Gain(dB)
(10)

where is the operating frequency in gigahertz, takes into account four important performance metrics of a PA. The implemented single-stage prototypes achieve ITRS FOM comparable
to current state-of-the-art mmWave CMOS PAs and the highest

1702

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

amongst fully integrated PAs, which do not employ power combining. In particular, the two-stage cascade PA in 45-nm SOI
CMOS achieves the highest ITRS FOM amongst PAs, which
do not employ power combining, and second highest overall.

at the switching instant


forms as follows:

and the periodicity of wave-

(14)
VI. CONCLUSION

(15)

This work indicates that stacked switching CMOS PAs potentially take us one step closer to implementing efficient PAs
in CMOS with watt-level output power at mmWave frequencies
for the first time. Topics for future research include large-scale
low-loss power-combining techniques so that multiple such PAs
may be combined to approach watt-class output power, and linearizing architectures that enable such mmWave switching-type
PAs to be used with complex modulation formats with high average efficiency.

The capacitive discharge loss at the switching instant can be


estimated as

(16)
The expressions for

can be derived using

(17)
APPENDIX

and

Referring to Fig. 2(b) and utilizing the loss-aware Class-E


design methodology described in [23], the analytical equations
describing the switch voltage
and
during the ON
and OFF
half-cycles,
respectively, for a stacked configuration can be derived to be

(18)
The loss in the switch and the dc-feed inductance are given by
(19)

(11)
and
(20)
respectively, while the input power
required to switch a
device between the triode and cutoff regions is approximated as
(21)
(12)
where

is the switching frequency,

where
is the input capacitance of the
bottom device in the triode region,
is the input drive level
in the ON half-cycle and is a fitting parameter determined
from schematic simulations. The foregoing lead to a complete
expression for PAE,
(22)
where
(23)
and
(24)
(25)

(13)
,
and
are constants dewhile
termined by imposing continuity of dc-feed inductor current

A MATLAB code subsequently sweeps the magnitude and


phase of the load current to arrive at a design point with optimal PAE for a given device size, input drive level
, the
tuning parameter , and the number of devices stacked .

CHAKRABARTI AND KRISHNASWAMY: HIGH-POWER HIGH-EFFICIENCY CLASS-E-LIKE STACKED mmWAVE PAs IN SOI AND BULK CMOS

REFERENCES
[1] P. Haldi, G. Liu, and A. Niknejad, CMOS compatible transformer
power combiner, Electron. Lett., vol. 42, no. 19, pp. 10911092, Sep.
2006.
[2] Y. Zhao, J. Long, and M. Spirito, Compact transformer power combiners for millimeter-wave wireless applications, in IEEE Radio Freq.
Integr. Circuits Symp., May 2010, pp. 223226.
[3] D. Chowdhury, C. Hull, O. Degani, Y. Wang, and A. Niknejad, A fully
integrated dual-mode highly linear 2.4 GHz CMOS power amplifier for
4G WiMax applications, IEEE J. Solid-State Circuits, vol. 44, no. 12,
pp. 33933402, Dec. 2009.
[4] J. W. Lai and A. Valdes-Garcia, A 1 V 17.9 dBm 60 GHz power amplifier in standard 65 nm CMOS, in IEEE Int. Solid-State Circuits Conf.
Tech. Dig., Feb. 2010, pp. 424425.
[5] D. Zhao, S. Kulkarni, and P. Reynaert, A 60 GHz dual-mode power
amplifier with 17.4 dBm output power and 29.3% PAE in 40-nm
CMOS, in Proc. ESSCIRC, Sep. 2012, pp. 337340.
[6] M. Bohsali and A. Niknejad, Current combining 60 GHz CMOS
power amplifiers, in IEEE Radio Freq. Integr. Circuits Symp., Jun.
2009, pp. 3134.
[7] T. Dickson, K. H. K. Yau, T. Chalvatzis, A. Mangan, E. Laskin, R.
Beerkens, and P. Westergaard et al., The invariance of characteristic
current densities in nanoscale MOSFETs and its impact on algorithmic
design methodologies and design porting of Si(Ge) (Bi)CMOS highspeed building blocks, IEEE J. Solid-State Circuits, vol. 41, no. 8, pp.
18301845, Aug. 2006.
[8] A. Niknejad, S. Emami, B. Heydari, M. Bohsali, and E. Adabi,
Nanoscale CMOS for mm-wave applications, in IEEE Compound
Semicond. Integr. Circuits Symp., Oct. 2007, pp. 14.
[9] B. Heydari, M. Bohsali, E. Adabi, and A. Niknejad, Millimeter-wave
devices and circuit blocks up to 104 GHz in 90 nm CMOS, IEEE J.
Solid-State Circuits, vol. 42, no. 12, pp. 28932903, Dec. 2007.
[10] S. Nicolson, A. Tomkins, K. Tang, A. Cathelin, D. Belot, and S.
Voinigescu, A 1.2 V, 140 GHz receiver with on-die antenna in 65 nm
CMOS, in IEEE Radio Freq. Integr. Circuits Symp., Jun. 2008, pp.
229232.
[11] E. Johnson, Physical limitations on frequency and power parameters
of transistors, in IRE Int. Convention Rec., Mar. 1965, vol. 13, pp.
2734.
[12] I. Sarkas, A. Balteanu, E. Dacquay, A. Tomkins, and S. Voinigescu, A
45 nm SOI CMOS class-D mm-wave PA with 10 Vpp differential
swing, in IEEE Int. Solid-State Circuits Conf. Tech. Dig., Feb. 2012,
pp. 8890.
[13] A. Mazzanti, L. Larcher, R. Brama, and F. Svelto, Analysis of reliability and power efficiency in cascode class-E PAs, IEEE J. Solid-State
Circuits, vol. 41, no. 5, pp. 12221229, May 2006.
[14] A. Ezzeddine and H. Huang, The high voltage/high power FET
(HiVP), in IEEE Radio Freq. Integr. Circuits Symp., Jun. 2003, pp.
215218.
[15] S. Pornpromlikit, J. Jeong, C. Presti, A. Scuderi, and P. Asbeck, A
33-dBm 1.9-GHz silicon-on-insulator CMOS stacked-FET power
amplifier, in IEEE MTT-S Int. Microw. Symp. Dig., Jun. 2009, pp.
533536.
[16] J. McRory, G. Rabjohn, and R. Johnston, Transformer coupled
stacked FET power amplifiers, IEEE J. Solid-State Circuits, vol. 34,
no. 2, pp. 157161, Feb. 1999.
[17] A. Balteanu, I. Sarkas, E. Dacquay, A. Tomkins, and S. Voinigescu,
A 45-GHz, 2-bit power DAC with 24.3 dBm output power,
14 Vpp differential swing, and 22% peak PAE in 45-nm
SOI CMOS, in IEEE Radio Freq. Integr. Circuits Symp., Jun.
2012, pp. 319322.
[18] S. Pornpromlikit, H.-T. Dabag, B. Hanafi, J. Kim, L. Larson, J. Buckwalter, and P. Asbeck, A -band amplifier implemented with stacked
45-nm CMOS FETs, in IEEE Compound Semicond. Integr. Circuit
Symp., Oct. 2011, pp. 14.
[19] A. Chakrabarti, J. Sharma, and H. Krishnaswamy, Dual-output
stacked class-EE power amplifiers in 45 nm SOI CMOS for -band
applications, in IEEE Compound Semicond. Integr. Circuit Symp.,
Oct. 2012, pp. 14.
[20] A. Chakrabarti and H. Krishnaswamy, High power, high efficiency
stacked mmWave class-E-like power amplifiers in 45 nm SOI CMOS,
in IEEE Custom Integr. Circuits Conf., Sep. 2012, pp. 14.

1703

[21] A. Agah, H. Dabag, B. Hanafi, P. Asbeck, L. Larson, and J. Buckwalter,


A 34% PAE, 18.6 dBm 4245 GHz stacked power amplifier in 45 nm
SOI CMOS, in IEEE Radio Freq. Integr. Circuits Symp., Jun. 2012,
pp. 5760.
[22] N. Sokal and A. Sokal, Class E-A new class of high-efficiency tuned
single-ended switching power amplifiers, IEEE J. Solid-State Circuits, vol. 10, no. 3, pp. 168176, Jun. 1975.
[23] A. Chakrabarti and H. Krishnaswamy, An improved analysis and design methodology for RF class-E power amplifiers with finite DC-feed
inductance and switch on-resistance, in IEEE Int. Circuits Syst. Symp.,
May 2012, pp. 17631766.
[24] O. Lee, J. Han, K. H. An, D. H. Lee, K.-S. Lee, S. Hong, and C.-H. Lee,
A charging acceleration technique for highly efficient cascode class-E
CMOS power amplifiers, IEEE J. Solid-State Circuits, vol. 45, no. 10,
pp. 21842197, Oct. 2010.
[25] D. Sandstrom, B. Martineau, M. Varonen, M. Karkkainen, A.
Cathelin, and K. A. I. Halonen, 94 GHz power-combining
power amplifier with
13 dBm saturated output power in 65
nm CMOS, in IEEE Radio Freq. Integr. Circuits Symp., Jun.
2011, pp. 14.
[26] S. Ko and J. Lin, A linearized cascode cmos power amplifier, in IEEE
Annu. Wireless Microw. Technol. Conf., 2006, pp. 14.
[27] A. Siligaris et al., A 60 GHz power amplifier with 14.5 dBm
saturation power and 25% peak PAE in CMOS 65 nm SOI,
IEEE J. Solid-State Circuits, vol. 45, no. 7, pp. 12861294,
Jul. 2010.
[28] H. Dabag, B. Hanafi, F. Golcuk, A. Agah, J. Buckwalter, and P. Asbeck, Analysis and design of stacked-FET millimeter-wave power
amplifiers, IEEE Trans. Microw. Theory Techn., vol. 61, no. 4, pp.
15431556, Apr. 2013.
[29] S. Kee, The class E/F family of harmonic-tuned switching power amplifiers Ph.D. dissertation, Dept. Elect. Eng., California Inst. Technol.,
Pasadena, CA, USA, 2001. [Online]. Available: http://resolver.caltech.
edu/CaltechETD:etd-04262005-152703
[30] J. Sharma and H. Krishnaswamy, 216- and 316-GHz 45-nm SOI
CMOS signal sources based on a maximum-gain ring oscillator
topology, IEEE Trans. Microw. Theory Techn., vol. 61, no. 1, pp.
492504, Jan. 2013.
[31] BSIM SOI Manual, BSIM Group, Univ. California at Berkeley,
Berkeley, CA, USA.
[32] IE3D User Manual, Mentor Graphics Corporation, Wilsonville, OR,
USA.
[33] U. Gogineni, J. del Alamo, and C. Putnam, RF power potential of 45
nm CMOS technology, in Silicon Monolithic Integr. Circuits RF Syst.
Top. Meeting, Jan. 2010, pp. 204207.
[34] O. Ogunnika and A. Valdes-Garcia, A 60 GHz class-E tuned power
amplifier with PAE 25% in 32 nm SOI CMOS, in IEEE Radio Freq.
Integr. Circuits Symp., Jun. 2012, pp. 6568.
[35] D. Zhao, S. Kulkarni, and P. Reynaert, A 60 GHz outphasing
transmitter in 40 nm CMOS with 15.6 dBm output power,
in IEEE Int. Solid-State Circuits Conf. Tech. Dig., Feb. 2012,
pp. 170172.
[36] K.-Y. Wang, T.-Y. Chang, and C.-K. Wang, A 1 V 19.3 dBm 79 GHz
power amplifier in 65 nm CMOS, in IEEE Int. Solid-State Circuits
Conf. Tech. Dig., Feb. 2012, pp. 260262.
[37] J. Chen and A. Niknejad, A compact 1 V 18.6 dBm 60 GHz power
amplifier in 65 nm CMOS, in IEEE Int. Solid-State Circuits Conf.
Tech. Dig., Feb. 2011, pp. 432433.
[38] C. Law and A.-V. Pham, A high-gain 60 GHz power amplifier with 20
dBm output power in 90 nm CMOS, in IEEE Int. Solid-State Circuits
Conf. Tech. Dig., Feb. 2010, pp. 426427.
[39] K. Datta, J. Roderick, and H. Hashemi, Analysis, design and
implementation of mm-wave SiGe stacked class-E power amplifiers,
in IEEE Radio Freq. Integr. Circuits Symp., Jun. 2013, pp.
275278.
[40] K. Datta, J. Roderick, and H. Hashemi, A 22.4 dBm two-way
Wilkinson power-combined -band SiGe class-E power amplifier
with 23% peak PAE, in IEEE Compound Semicond. Integr. Circuit
Symp., Oct. 2012, pp. 14.
[41] N. Kalantari and J. Buckwalter, A 19.4 dBm, -band class-E
power amplifier in a 0.12 m SiGe BiCMOS process, IEEE
Microw. Wireless Compon. Lett., vol. 20, no. 5, pp. 283285,
May 2010.

1704

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 62, NO. 8, AUGUST 2014

Anandaroop Chakrabarti received the B.Tech.


degree in electronics and electrical communication
engineering from the Indian Institute of Technology,
Kharagpur, India, in 2010, the M.S. degree in electrical engineering from Columbia University, New
York, NY, USA, in 2011, and is currently working
toward the Ph.D. degree at Columbia University,
New York, NY, USA.
In Summer 2013, he was with the IBM T. J.
Watson Research Center, Yorktown Heights, NY,
USA, on a three-month internship. His research
interests include mmWave and RF circuits and systems in silicon, massive
mmWave multi-input-multi-output (MIMO) systems and related applications.

Harish Krishnaswamy received the B.Tech. degree


in electrical engineering from the Indian Institute of
Technology, Madras, India, in 2001, and the M.S. and
Ph.D. degrees in electrical engineering from the University of Southern California (USC), Los Angeles,
CA, USA, in 2003 and 2009, respectively.
In 2009, he joined the Electrical Engineering
Department, Columbia University, New York,
NY, USA, as an Assistant Professor. His research
interests broadly span integrated devices, circuits,
and systems for a variety of RF and mmWave applications. His current research efforts are focused on silicon-based mmWave
PAs, sub-mmWave circuits and systems, and reconfigurable broadband RF
transceivers for cognitive and software-defined radio.
Dr. Krishnaswamy serves as a member of the Technical Program Committee
(TPC) of several conferences, including the IEEE RFIC Symposium and IEEE
VLSI-D. He was the recipient of the IEEE International Solid-State Circuits
Conference (ISSCC) Lewis Winner Award for Outstanding Paper in 2007, the
Best Thesis in Experimental Research Award from the USC Viterbi School
of Engineering in 2009, and the Defense Advanced Research Projects Agency
(DARPA) Young Faculty Award in 2011.

Anda mungkin juga menyukai