Editors
Pieter Harpe
Department of Electrical Engineering
Eindhoven University of Technology
Eindhoven, Noord-Brabant, The Netherlands

Kofi A. A. Makinwa
Department of Microelectronics
Delft University of Technology
Delft, Zuid-Holland, The Netherlands

Andrea Baschirotto
Department of Physics G. Occhialini
University of Milan
Milano, Italy
This book is part of the Analog Circuit Design series and contains contributions
from all 18 speakers of the 26th workshop on Advances in Analog Circuit Design
(AACD). The local organizers were Kathleen Philips, Yao-Hong Liu, and Steffie van
de Vorstenbosch from Holst Centre/imec, Eindhoven, the Netherlands. The sponsors
of the workshop this year were NXP Semiconductors (platinum sponsor), Dialog
Semiconductor (gold sponsor), and silver sponsors Analog Devices, AnSem, Catena,
Huawei, ICsense, ItoM, and Philips. The workshop was held in Eindhoven, the
Netherlands, from March 28 to 30, 2017.
The book comprises three parts, covering advanced analog and mixed-signal
circuit design topics that are considered highly important by the circuit design
community:
Hybrid data converters
Smart sensors for the IoT
Sub-1V and advanced-node analog circuit design
Each part is set up with six papers from experts in the field.
The aim of the AACD workshop is to bring together a group of expert designers
to discuss new developments and future options. Each workshop is followed by
the publication of a book by Springer in their successful series of Analog Circuit
Design. This book is the 26th in this series. The book series can be seen as a
reference for all people involved in analog and mixed-signal design. The full list
of the previous books and topics in the series is given next.
We are confident that this book, like its predecessors, will prove to be a valuable
contribution to our analog and mixed-signal circuit design community.
The Topics Covered Before in This Series
The first part of this book is dedicated to recent developments in the field of hybrid
data converters. While hybrid architectures, algorithms and circuits have been used
for a long time, hybrid converters are much more in the picture recently, thanks to
their ability to deal with technology limitations and achieve better performance in
terms of speed, accuracy, efficiency and chip area. The first chapter in this part of the
book gives an overview of the scope of hybrid, while the remaining five chapters
introduce hybrid converter examples aiming for different performance directions.
The first chapter, by Kostas Doris, discusses the different dimensions of hybrid
converter concepts and shows a classification of these. Combined with concrete
examples, it exemplifies the broad nature of hybrid concepts in data converters.
In the second chapter, by Alessandro Venca, Nicola Ghittori, Alessandro Bosi
and Claudio Nani, a time-interleaved SAR-ΔΣ ADC is proposed. The DAC
implementation also uses a hybrid charge-redistribution/charge-sharing
topology. It is shown that this enables excellent performance and also reduces the chip
area of the integrated system.
The third chapter, by Ewout Martens, discusses several time-interleaved
pipelined SAR ADCs, aiming to move the speed-precision-efficiency envelope.
The various circuit implementations being used give insight into their respective
limitations and advantages.
The fourth chapter, by Arindam Sanyal, Wenjuan Guo and Nan Sun, combines
a SAR ADC with a ΔΣ modulator and a VCO-based converter, resulting in noise
shaping and higher precision than achievable by a typical SAR ADC.
In the fifth chapter, Yun-Shiang Shu, Liang-Ting Kuo and Tien-Yu Lo describe
a hybrid architecture for a reconfigurable SAR ADC, including SAR, ΔΣ,
subranging and flash converter techniques. Noise and mismatch shaping techniques
enable state-of-the-art efficiency, up to 101 dB SNDR, and flexibility in speed and
precision.
The sixth chapter, by Burak Gönen, Fabio Sebastiano, Robert van Veldhoven,
and Kofi Makinwa, presents a dynamic zoom ADC, which is a combination of a
SAR and a ΔΣ ADC. This enables very high resolution while maintaining excellent
efficiency in energy and area.
Chapter 1
Hybrid Data Converters
Kostas Doris
1.1 Introduction
K. Doris
NXP Semiconductors, Eindhoven, The Netherlands
e-mail: kostas.doris@nxp.com
[Figure: scatter plot of published ADCs versus sampling frequency (Fsampling, 1e3 to 1e12); vertical axis ticks from 120 to 160; legend: ISSCC 1997-2007, VLSI 1997-2008, ISSCC 2008-2017, VLSI 2008-2014, ISSCC 2017, 28 nm and below]
Figure 1.2 shows the receiver part of a communication system represented with a
direct conversion architecture without loss of generality.
Physical signals serving as carrier waves propagate information across the
propagation medium (the channel) prior to being received at the antenna interface.
These information carrier signals are analog and are optimally matched to the
physics of the propagation medium for the purpose of carrying information. They
are minimally restricted in the amplitude and time domains, which means they have
limited redundancy and as a consequence they are sensitive to noise, interference,
distortions, etc. Abstract bits on the right side of the figure are maximally restricted
in the amplitude and time domains. This restriction introduces redundancy, which translates
to robustness against the imperfections of CMOS technology. In this way they become optimally
matched to CMOS technology for the purpose of computation of information.
This leaves us with two signal types and media optimally matched to each other
for purposes of propagation and computation of information and one analog mixed-signal
domain in the middle, where signals are not optimally matched to their CMOS
implementation carrier.
The conversion of information from one medium to the other is done at the
analog to digital interface spanning all the way from the antenna to the bit slicer,
the exact point at which conversion from analog to digital fully takes place. For
simplicity, and without loss of generality, we neglect for now the conversion from
the electromagnetic to the electrical domain at the antenna. Please note the distinction
between the overall conversion from analog to digital and the data converter block,
which is a boundary that designers draw in practice.
The primary role of this interface is to translate information from one matched
domain to the other with minimum loss, from analog (no restrictions in time and
amplitude) to digital (amplitude restricted to 0 and 1, sequences replacing time). As
the analog signal reaches the boundary of the antenna interface it encounters a large
discontinuity. Inversely, as signals go out of the bit slicer, they reach the level of
best matching to CMOS technology. By restricting the signals in a stepwise manner
across signal domains (continuous/discrete amplitude and time), we introduce
redundancy in the signal and enable abstraction from hardware between device,
circuit, architecture, algorithm and signal layers.
Abstraction and redundancy in the (mixed) signal allow a gradual functional
decoupling across the signal chain. This decoupling, typical of digital systems
but not of analog ones, opens the door to a plethora of transformations
(modulation, scheduling, hardware redundancy, averaging, etc.). The hybrid combination
of these techniques across the hardware abstraction hierarchy is what
provides the optimal match of the conversion function (not just the converter block)
to the evolving CMOS technology. This hybrid process goes stepwise along the
signal path. It starts at the front-end circuits (e.g. LNA and PA), which deal with
the largest discontinuity and where, especially at higher frequencies (e.g. mm-wave),
matching to technology means physical design approaches (e.g. resonating all
nodes to overcome signal losses), and it reaches full physical abstraction at the digital
calibrations of the ADC.
As can be seen, the conversion from analog to digital is not restricted to the
data converter block and it is hybrid by definition, as already indicated by its dual
name.
[Figure: conversion concepts (modulation, redundancy, parallelism/sequencing, averaging) mapped onto abstraction layers (conversion, algorithm, architecture, circuit), with examples such as sigma-delta, pipeline/cyclic, asynchronous SAR, flash/SAR, interleaved ADCs, chopping and DEM]
Fig. 1.3 Abstraction and concepts in analog/digital conversion (non-exhaustive list of examples)
• Bit pipelining in SAR [5, 27] or pipelined ADCs [28]. Not only does this offer the
benefit of parallelism for speed, but it also enables a range-scaling transformation
in the amplitude domain with interstage amplification, which reduces the
impact of noise in the LSB conversions.
• Sample-to-sample parallelism (interleaving).
• Hierarchical approaches in sampling, interleaving, demultiplexing and
pipelining [18, 23, 29–35].
• Bit multiplexing with separate comparators [36, 37] or interleaved comparators [32].
• DACs with current steering flash and delta-sigma for MSB and LSBs [38],
hybrid encoding with binary thermometer segmentation and combinations of
flash and time interleave [39] and even functional co-integration across the
receive/transmit path, e.g. mixing DACs [40].
Hybrid circuit block implementations
• DACs based on capacitors [24, 41, 42], resistors [5, 8, 20] or current [23, 43],
and current steering combined with resistors (e.g. R2R) [44].
• Class AB amplifiers reconfigured in and out of positive feedback [45] during
operation, cascaded class A [11, 23] or class AB dynamic amplifications [27,
28, 46], charge re-use [25, 46], etc.
• Charge sharing combined with charge redistribution DACs [4].
• LDOs based on series-parallel concepts [47].
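The range-scaling benefit of interstage amplification mentioned in the bit-pipelining item above can be illustrated with a small numerical sketch. It is not taken from any of the cited designs: the function name, the noise values and the gains are hypothetical, and the two noise sources are assumed to be uncorrelated so that their powers add.

```python
import math


def input_referred_noise(stage1_noise_uv, stage2_noise_uv, interstage_gain):
    """RMS noise referred to the ADC input for a two-stage pipeline.

    Assumes uncorrelated noise sources, so noise powers add. The noise of
    the second (LSB) stage is divided by the interstage gain when it is
    referred back to the input.
    """
    return math.hypot(stage1_noise_uv, stage2_noise_uv / interstage_gain)


# Hypothetical numbers: each stage alone contributes 100 uV rms.
for gain in (1, 4, 16):
    total = input_referred_noise(100.0, 100.0, gain)
    print(f"interstage gain {gain:2d}: {total:6.1f} uV rms at the input")
```

As the gain increases, the total approaches the first-stage noise alone, which is why the LSB stage can be designed with relaxed noise requirements.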
chip detection (e.g. on-chip monitors) and covers user, application, ambience and
technology domains. It is exploited in digital or analog form, continuous time or
discrete. Converters today use a multitude of information types and apply hybrid
calibration techniques to deal with errors such as time interleave artefacts and transfer
function errors, or to program the chip for given temperature profiles. One
example is time interleave error detection in the digital domain combined with
correction in the analog domain.
Such hybrid converters need to be made adaptive (reconfigurable architecture,
adaptable parameters, test signals, etc.) to exploit the combination of information
types available. They also need hardware redundancy. For example, in a DAC,
redundancy can be put in the switched hardware units [51, 52], while reconfiguration
is done at the decoder. The dual ADC reported in [53] analyses the input data
and chooses during operation between two ADCs on a sample-by-sample basis,
reconfiguring the gain applied to the signal to reduce the impact of clipping in
multi-carrier signals.
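The idea of selecting a signal path on a sample-by-sample basis can be sketched with an idealized model. This is not the circuit of [53]: the function name, the gain of 4, the 10-bit resolution and the selection rule are all assumptions chosen for illustration. Small samples are taken from a high-gain path (finer effective resolution), large samples from a unity-gain path (no clipping).

```python
def dual_path_sample(x, full_scale=1.0, high_gain=4.0, n_bits=10):
    """Quantize x through a unity-gain and a high-gain path, pick one per sample.

    Returns (reconstructed_value, selected_path). Hypothetical model with an
    ideal quantizer in each path.
    """
    lsb = 2.0 * full_scale / (2 ** n_bits)

    def quantize(v):
        # Ideal quantizer that clips at the edges of the input range.
        clipped = max(-full_scale, min(full_scale - lsb, v))
        return round(clipped / lsb) * lsb

    boosted = x * high_gain
    if abs(boosted) < full_scale:
        # Small sample: the high-gain path does not clip; undo the gain digitally.
        return quantize(boosted) / high_gain, "high"
    # Large sample: fall back to the unity-gain path to avoid clipping.
    return quantize(x), "low"


for sample in (0.1, 0.9):
    value, path = dual_path_sample(sample)
    print(f"input {sample}: path={path}, error={abs(value - sample):.2e}")
```

In this model the quantization error of small samples shrinks by the path gain, while large peaks are still converted without clipping.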
So far we have been talking mainly about hybrid data converters looking inside the
converter block. Another hybrid dimension occurs along the signal path in a
receiver and transmitter. Most notable examples here include the functional
co-integration between the upconverter mixer and the DAC [52] for wireless systems,
the power amplifier and DAC in wireline systems directly driving a 50 Ω cable load,
the combination of equalization and conversion in wireline systems, etc.
In the same line of thinking, references [18, 23, 43] integrate the buffer driving
the large load of the ADC into the SAR loop, eliminating its nonlinear behaviour.
When combined with demultiplexing, this further reduces the impact of
the wire interconnect load of the interleave array [23].
Finally, the combination of different technologies can lead also to hybrid data
converters, e.g. optical and electronic ADCs.
The type of hybrid technique is strongly associated with the problem that needs to be
addressed in the corresponding resolution/speed regime.
At very high sampling rates, combinations of scheduling operations with time
interleave error correction techniques dominate hybrid ADCs. Achieving tens of
GS/s sampling rate and 10–20 GHz input bandwidth with low clock jitter noise
overshadows the low-resolution noise requirements of the unit converters. The main
issue is how to deal with the large interconnect capacitance stemming from the
large converter interleaved array that is required to get to the required sampling
rate (mainly clock and signal interconnect, not sampling capacitors). A large array
enables high sampling rate, but limits input bandwidth and clocking performance
and translates to high power dissipation. Bandwidth also brings in additional
constraints in the packaging due to input signal losses.
Hybrid concepts help to reduce the size of the interleave array, to divide and
reduce the interconnect capacitance that is present at the input or at clock nodes.
We observe in both signal and clocking paths hierarchical interleaving forms,
multiplexing, resampling, power splitting, etc. [18, 23, 30–33]. Typically, flash or SAR
ADCs are preferred for their speed or their small area, both of which help to scale
down the array size and complexity. This is achieved thanks to hybrid calibration
techniques correcting time and amplitude dimension interleave errors with analog
and digital correction techniques. The units need to be as fast as possible, and
multiplexing is applied even at the comparator level, e.g. separate comparators [36]
or interleaving [32]. Frequency domain multiplexing was also reported [54].
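An amplitude-dimension interleave error and its digital correction can be illustrated with a minimal model. This is a sketch under stated assumptions, not the calibration of any cited design: the sub-ADCs are ideal apart from a made-up offset per channel, samples are assigned round-robin, and the offsets are estimated as per-channel means, which is only valid when the input is zero-mean over the record seen by each channel.

```python
import math


def interleave_sample(signal, offsets):
    """Round-robin sampling: sample n goes to sub-ADC n % M, which adds its offset."""
    m = len(offsets)
    return [s + offsets[n % m] for n, s in enumerate(signal)]


def calibrate_offsets(samples, m):
    """Estimate each channel offset as the mean of its own samples and remove it.

    Assumes the input is zero-mean over the record seen by each channel.
    """
    estimates = []
    for ch in range(m):
        channel = samples[ch::m]
        estimates.append(sum(channel) / len(channel))
    corrected = [s - estimates[n % m] for n, s in enumerate(samples)]
    return corrected, estimates


# Hypothetical 4-way array with made-up offsets and a coherent test tone.
N, M = 4096, 4
offsets = [0.010, -0.020, 0.005, 0.015]
signal = [math.sin(2 * math.pi * 37 * n / N) for n in range(N)]
raw = interleave_sample(signal, offsets)
corrected, estimates = calibrate_offsets(raw, M)
print([round(e, 4) for e in estimates])
```

Without correction, the periodic offset pattern would appear as spurious tones at multiples of the sampling rate divided by M; after subtraction the corrected record matches the input.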
High-resolution levels, e.g. 10–16 b with sampling rates between 1 and 10 GS/s,
require dealing with thermal noise, matching, large sampling and interconnect
capacitance. Here there is a lot of emphasis on hybrid concepts that apply at
the converter unit and the associated use of on-chip information to calibrate it. Time
interleaving is present but reduces drastically as the resolution increases: noise
imposes restrictions due to the large sampling capacitors and clock buffers loading
the input and clock nodes.
We observe hierarchical combinations of multi-bit MDAC sampling front ends
with SAR sub-ADCs [55] but also pipelined multi-bit SARs [37] exploiting dynamic
amplifiers and SARs with separate comparators per bit becoming alternatives
to conventional pipelined ADCs using flash sub-ADCs [56]. Digital and analog
corrections are implemented for nearly everything that can be calibrated in the
converter for both pipeline and interleave artefacts such as comparator offsets, gains,
signal transfer functions, bandwidth mismatches, interleave errors, track-and-hold errors,
clock injections, using DEM, LMS algorithms, dithering, etc. [57].
At speeds below 1 GHz [4, 28] with high-resolution levels, we see combinations
of various concepts diverging from the conventional pipelined ADC. The main idea
is partitioning the conversion cycle into MSB and LSB parts. A suitable architecture
can be chosen for the LSB part that determines noise performance and power
dissipation, and a faster architecture can be used to remove speed limitations in
the MSB part. Interleaving is a degree of freedom coming on top. The work
of [4] introduces an incremental delta-sigma modulator with reconfigurability,
whereas [28] uses a pipeline of SARs with interstage amplification based on dynamic
amplifiers. In [10] an asynchronous digital slope converter is combined with a
SAR, also exploiting continuous time comparator techniques. The 14b ADC in [58]
utilizes multi-bit SARs with time interleaving of the single comparator being shared
between all DACs. Analog-to-digital converters with even lower speed but high
resolutions follow and extend the same trend. A recent example can be seen in the
audio domain sub-ranging delta-sigma ADC presented in [59].
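The partitioning of one conversion into an MSB part and an LSB part can be sketched with an ideal two-step model. The function name, the 4 + 8 bit split and the unipolar input range are assumptions made for this illustration; a real design would add redundancy between the two steps and would use different circuit architectures for each part, which this arithmetic model does not capture.

```python
def two_step_convert(vin, coarse_bits=4, fine_bits=8, full_scale=1.0):
    """Ideal two-step conversion of vin, assumed to lie in [0, full_scale).

    A coarse stage resolves the MSBs, then a fine stage resolves the
    residue. Returns (coarse_code, fine_code, combined_code).
    """
    coarse_levels = 2 ** coarse_bits
    fine_levels = 2 ** fine_bits
    coarse_lsb = full_scale / coarse_levels

    # MSB part: find the coarse segment that contains the input.
    coarse_code = min(int(vin / coarse_lsb), coarse_levels - 1)
    # Residue handed to the LSB part, within [0, coarse_lsb).
    residue = vin - coarse_code * coarse_lsb
    fine_code = min(int(residue / coarse_lsb * fine_levels), fine_levels - 1)

    return coarse_code, fine_code, coarse_code * fine_levels + fine_code


coarse, fine, code = two_step_convert(0.3)
print(coarse, fine, code)
```

The combined code equals that of a single 12-bit ideal quantizer, while only the fine step has to meet the full noise requirement over a reduced range.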
These examples illustrate how multiple principles and techniques can be used to
exploit CMOS technology optimally to deal with fundamental limitations such
as noise. However, the solution space is not restricted to these techniques only. As
we move to even higher levels of abstraction, more degrees of freedom will become
available. At the transmitter side, one can use spatial signal processing techniques
in the analog and digital domains. At the circuit layer, for example, power combining
allows generating higher signal power compared to what a single amplifier unit
can deliver. At the architectural level, spatial processing can be implemented with
analog and digital beam forming either with phase rotation or time delays in analog
and digital domains, respectively. The trend will continue at application level. For
example, in future autonomous vehicles, multiple sensors based on different
principles (ultrasound, radar, lidar, camera, etc.) will be combined into one sensor
architecture to enable the car to process a broad spectrum of signals from low
frequencies up to optical wavelengths with high accuracy and reliability.
1.7 Conclusions
To appreciate properly the nature of hybrid converters, one needs to see them from
multiple angles and abstraction layers and appreciate the whole conversion from
antenna to bits.
In the past, hybrid converters were mostly all about the circuit layer of abstraction.
Nowadays, hybrid data converters expand across all abstraction layers, especially
with regard to scheduling, algorithmic conversions, handling of information
for corrections and the transceiver signal path.
The hybrid analog to digital or digital to analog converter has a dual name
to begin with, analog and digital. It operates in multiple signal amplitude and time
domains and utilizes combinations of conversion techniques and concepts such as
redundancy, modulation, parallelism, sequencing and sub-ranging across the hardware
abstraction hierarchy. Through reconfigurable and adaptable hardware, it exploits
multiple forms of information to deal with nm CMOS imperfections. The ideal
hybrid converter completely breaks the boundaries of conventional converter blocks
from antenna to bits and becomes a converter truly matched to the application and
the properties of silicon technology and propagation channel.
References
1. Doris, K.: Time interleaved analog-to-digital converters: an algorithmic melting pot. In: 2009
IEEE International Solid-State Circuits Conference (ISSCC), Jan 2009
2. Murmann, B.: ADC performance survey. http://www.stanford.edu/~murmann/adcsurvey.html
3. van Roermund, A.: Shifting the frontiers of analog and mixed-signal electronics. Advances in
Electronics, vol. 2014 (2014)
4. Venca, A., et al.: A 0.076 mm² 12 b 26.5 mW 600 MS/s 4-way interleaved sub-ranging SAR-ΔΣ
ADC with on-chip buffer in 28 nm CMOS. IEEE J. Solid State Circuits 51(12), 2951–2962
(2016)
5. Louwsma, S.M., et al.: A 1.35 GS/s, 10 b, 175 mW time-interleaved AD converter in 0.13 μm
CMOS. IEEE J. Solid State Circuits 43(4), 778–786 (2008)
6. Verbruggen, B., et al.: A 2.6 mW 6 bit 2.2 GS/s fully dynamic pipeline ADC in 40 nm digital
CMOS. IEEE J. Solid State Circuits 45(10), 2080–2090 (2010)
7. Cao, Z., et al.: A 32 mW 1.25 GS/s 6b 2b/Step SAR ADC in 0.13 μm CMOS. In: IEEE
International Solid-State Circuits Conference, Digest of Technical Papers, pp. 542–543 (2008)
8. Wei, H., et al.: A 0.024 mm² 8b 400MS/s SAR ADC with 2b/Cycle and resistive DAC in 65 nm
CMOS. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 188–190 (2011)
9. Ding, M., et al.: A 5bit 1GS/s 2.7 mW 0.05 mm² asynchronous digital slope ADC in 90 nm
CMOS for IR UWB radio. In: 2012 IEEE Radio Frequency Integrated Circuits Symposium,
pp. 487–490, June 2012
10. Liu, C.C., et al.: A 12 bit 100 MS/s SAR-assisted digital-slope ADC. IEEE J. Solid State
Circuits 51(12), 2941–2950 (2016)
11. Hesener, M., et al.: A 14b 40 MS/s redundant SAR ADC with 480 MHz clock in 0.13 μm
CMOS. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 248–249 (2007)
12. Hurrell, C., et al.: An 18b 12.5 MHz ADC with 93 dB SNR. In: IEEE International Solid-State
Circuits Conference, Digest of Technical Papers, pp. 378–379 (2010)
13. Fredenburg, J., Flynn, M.: A 90MS/s 11MHz bandwidth 62dB SNDR noise-shaping SAR
ADC. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 468–471 (2012)
14. Liu, C.-C., et al.: A 0.46 mW 5MHz-BW 79.7 dB SNDR noise-shaping SAR ADC with
dynamic-amplifier-based FIR-IIR Filter. In: 2017 IEEE International Solid-State Circuits
Conference (ISSCC), pp. 466–467, Jan 2017
15. Shu, Y.S., et al.: An oversampling SAR ADC with DAC mismatch error shaping achieving
105 dB SFDR and 101 dB SNDR over 1 kHz BW in 55 nm CMOS. IEEE J. Solid State Circuits
51(12), 2928–2940 (2016)
16. Xu, H., et al.: A 78.5dB-SNDR radiation- and metastability-tolerant two-step split SAR ADC
operating up to 75MS/s with 24.9 mW power consumption in 65 nm CMOS. In: 2017 IEEE
International Solid-State Circuits Conference (ISSCC), pp. 477–477, Jan 2017
17. Ginsburg, B.P., Chandrakasan, A.P.: Highly interleaved 5-bit, 250-MSample/s, 1.2-mW ADC
with redundant channels in 65-nm CMOS. IEEE J. Solid State Circuits 43(12), 2641–2650
(2008)
18. Janssen, E., et al.: An 11b 3.6GS/s time-interleaved SAR ADC in 65 nm CMOS. In: 2013
IEEE International Solid-State Circuits Conference Digest of Technical Papers, pp. 464–465,
Feb 2013
19. Kuttner, F.: A 1.2V 10b 20 MSample/s non-binary successive approximation ADC in 0.13 μm
CMOS. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 176–177 (2002)
20. Boyacigiller, Z., et al.: An error-correcting 14b/20μs CMOS A/D converter. In: IEEE International
Solid-State Circuits Conference, Digest of Technical Papers, pp. 62–63 (1981)
21. Draxelmayr, D.: A self calibration technique for redundant A/D converters providing 16b
accuracy. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 204–205 (1988)
22. Liu, W., et al.: A 12b 22.5/45MS/s 3.0 mW 0.059 mm² CMOS SAR ADC achieving over 90 dB
SFDR. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 380–381 (2010)
23. Doris, K., et al.: A 480 mW 2.6GS/s 10b 65 nm time-interleaved ADC with 48.5 dB SNDR up
to Nyquist. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 180–182 (2011)
24. Chen, S.-W.M., Brodersen, R.W.: A 6-bit 600-MS/s 5.3-mW asynchronous ADC in 0.13-μm
CMOS. IEEE J. Solid State Circuits 41(12), 731–740 (2006)
25. Craninckx, J., Van der Plas, G.: A 65fJ/conversion-step 0-to-50MS/s 0-to-0.7 mW 9b
charge-sharing SAR ADC in 90 nm digital CMOS. In: IEEE International Solid-State Circuits
Conference, Digest of Technical Papers, pp. 246–247 (2007)
26. Harpe, P., et al.: A 30fJ/Conversion-Step 8b 0-to-10MS/s asynchronous SAR ADC in 90 nm
CMOS. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 387–389 (2010)
27. Verbruggen, B., Iriguchi, M., Craninckx, J.: A 1.7 mW 11b 250MS/s 2x interleaved fully
dynamic pipelined SAR ADC in 40 nm Digital CMOS. In: IEEE International Solid-State
Circuits Conference, Digest of Technical Papers, pp. 466–469 (2012)
28. van der Goes, F., et al.: A 1.5 mW 68 dB SNDR 80 MS/s 2x interleaved pipelined SAR ADC
in 28 nm CMOS. IEEE J. Solid State Circuits 49(12), 2835–2845 (2014)
29. Gupta, S., et al.: A 1GS/s 11b time-interleaved ADC with 55-dB SNDR, 250 mW power
realized by a high bandwidth scalable time-interleaved architecture. IEEE J. Solid State
Circuits 41, 2650–2657 (2006)
30. Greshishchev, Y., et al.: A 40GS/s 6b ADC in 65 nm CMOS. In: IEEE International Solid-State
Circuits Conference, Digest of Technical Papers, pp. 390–391 (2010)
31. Doris, K., et al.: Interleaving of SAR ADCs in deep submicron CMOS technology. Advances
in Analog and RF IC Design for Wireless Communication Systems, 1st edn. Elsevier.
ISBN:9780123983268
32. Kull, L., et al.: 22.1 A 90GS/s 8b 667 mW 64× interleaved SAR ADC in 32 nm digital
SOI CMOS. In: 2014 IEEE International Solid-State Circuits Conference Digest of Technical
Papers (ISSCC), pp. 378–379, Feb 2014
33. Duan, Y., Alon, E.: A 12.8 GS/s time-interleaved ADC with 25 GHz effective resolution
bandwidth and 4.6 ENOB. IEEE J. Solid State Circuits 49(8), 1725–1738 (2014)
34. Wu, J., et al.: A 4GS/s 13b pipelined ADC with capacitor and amplifier sharing in 16nm
CMOS. In: 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 466–467,
Jan 2016
35. Wu, J., et al.: A 5.4GS/s 12b 500 mW pipeline ADC in 28 nm CMOS. In: 2013 Symposium on
VLSI Circuits, pp. C92–C93, June 2013
36. Kull, L., et al.: A 10b 1.5GS/s pipelined-SAR ADC with background second-stage
common-mode regulation and offset calibration in 14 nm CMOS FinFET. In: 2017 IEEE International
Solid-State Circuits Conference (ISSCC), pp. 474–476, Jan 2017
37. Vaz, B., et al.: A 13b 4GS/s digital assisted dynamic 3-stage asynchronous pipelined-SAR
ADC. In: 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 476–477, Jan
2017
38. Su, S., Chen, M.S.W.: A 12b 2GS/s dual-rate hybrid DAC with pulsed timing-error
pre-distortion and in-band noise cancellation achieving >74dBc SFDR up to 1 GHz in 65 nm
CMOS. In: 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 456–457,
Jan 2016
39. Olieman, E., et al.: A 110 mW, 0.04 mm², 11 GS/s 9-bit interleaved DAC in 28 nm FDSOI with
50 dB SFDR across Nyquist. In: 2014 Symposium on VLSI Circuits Digest of Technical Papers,
pp. 1–2, June 2014
40. Bechthum, E., et al.: A wideband RF mixing-DAC achieving IMD < −82 dBc up to 1.9 GHz.
IEEE J. Solid State Circuits 51(6), 1374–1384 (2016)
41. McCreary, J., Gray, P.: All-MOS charge redistribution analog-to-digital conversion techniques.
IEEE J. Solid State Circuits 10(6), 371–379 (1975)
42. Alpman, E., et al.: A 1.1V 50 mW 2.5GS/s 7b time-interleaved C-2C SAR ADC in 45 nm
LP digital CMOS. In: IEEE International Solid-State Circuits Conference, Digest of Technical
Papers, pp. 76–77, 77a (2009)
43. Kramer, M.J., et al.: A 14 b 35 MS/s SAR ADC achieving 75 dB SNDR and 99 dB SFDR with
loop-embedded input buffer in 40 nm CMOS. IEEE J. Solid State Circuits 50(12), 2891–2900
(2015)
44. Poulton, K., et al.: A 7.2-GSa/s, 14-bit or 12-GSa/s, 12-bit DAC in a 165-GHz fT BiCMOS
process. In: 2011 Symposium on VLSI Circuits Digest of Technical Papers, pp. 62–63, June
2011
45. Hershberg, B., et al.: Ring amplifiers for switched capacitor circuits. IEEE J. Solid State
Circuits 47(12), 2928–2942 (2012)
46. van Elzakker, M., et al.: A 10-bit charge-redistribution ADC consuming 1.9 μW at 1MS/s.
IEEE J. Solid State Circuits 45(5), 1007–1015 (2010)
47. Breems, L., et al.: A 2.2 GHz continuous-time ΔΣ ADC with −102 dBc THD and 25 MHz
bandwidth. IEEE J. Solid State Circuits 51(12), 2906–2916 (2016)
48. Harpe, P., et al.: A 7-to-10b 0-to-4MS/s flexible SAR ADC with 6.5-to-16fJ/conversion-step.
In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 472–475 (2012)
49. Kramer, M.J., et al.: A 14-bit 30-MS/s 38-mW SAR ADC using noise filter gear shifting. IEEE
Trans. Circuits Syst. II Express Briefs 64(2), 116–120 (2017)
50. Yip, M., Chandrakasan, A.: A resolution-reconfigurable 5-to-10b 0.4-to-1V power scalable
SAR ADC. In: IEEE International Solid-State Circuits Conference, Digest of Technical Papers,
pp. 190–191 (2011)
51. Tang, Y., et al.: A 14 bit 200 MS/s DAC with SFDR > 78 dBc, IM3 < −83 dBc and NSD
< −163 dBm/Hz across the whole Nyquist band enabled by dynamic-mismatch mapping. IEEE
J. Solid State Circuits 46(6), 1371–1381 (2011)
52. de Vel, H.V., et al.: 11.7 A 240 mW 16b 3.2 GS/s DAC in 65 nm CMOS with <−80 dBc IM3 up
to 600 MHz. In: 2014 IEEE International Solid-State Circuits Conference Digest of Technical
Papers (ISSCC), pp. 206–207, Feb 2014
53. Lin, Y., et al.: An 11b 1GS/s ADC with parallel sampling architecture to enhance SNDR for
multi-carrier signals. In: 2013 Proceedings of the ESSCIRC (ESSCIRC), pp. 121–124, Sept
2013
54. Greshishchev, Y.: CMOS ADCs for optical communications. In: Proceedings of the 20th
Workshop on Advances in Analog Circuit Design (AACD), Apr 2012
55. Brandolini, M., et al.: 26.6 A 5 GS/s 150 mW 10b SHA-less pipelined/SAR hybrid ADC in
28 nm CMOS. In: 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest
of Technical Papers, pp. 1–3, Feb 2015
56. Devarajan, S., et al.: A 12b 10GS/s interleaved pipeline ADC in 28 nm CMOS technology.
In: 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 288–289, Jan 2017
57. Ali, A.M.A., et al.: A 14-bit 2.5GS/s and 5GS/s RF sampling ADC with background calibration
and dither. In: 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), pp. 1–2, June 2016
58. Kapusta, R., et al.: A 14b 80 MS/s SAR ADC with 73.6 dB SNDR in 65 nm CMOS. IEEE J.
Solid State Circuits 48(12), 3059–3066 (2013)
59. Gönen, B., et al.: 15.7 A 1.65 mW 0.16 mm² dynamic zoom-ADC with 107.5 dB DR in 20 kHz
BW. In: 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 282–283, Jan
2016
Chapter 2
Hybrid and Segmented ADC Techniques
to Optimize Power Efficiency and Area: The
Case of a 0.076 mm² 600 MS/s 12b SAR-ΔΣ ADC
2.1 Introduction
signal amplitude range where they perform at their maximum power efficiency. This
approach has become so widespread that the resulting ADC architectures have been
denoted as hybrid ADCs.
At the same time, emerging wireline standards like the G.hn Gen 2 [3] that
employ MIMO strategies often require the integration of a large number of analog
front ends (AFE) in a single SoC. Such AFEs should therefore require little or no
external components as this may have a significant impact on the complete system
bill of materials (BoM). In order to tackle these application demands, the design
of wide-bandwidth high-resolution ADCs with extremely low power and area is
clearly of key importance.
With respect to ADC area as well, the adoption of the SAR architecture and the
increasing usage of digital calibrations have allowed data converters to reap the benefits
of technology scaling. As an example, the introduction of digital DAC linearity
calibrations [12] resulted in a significant reduction of matching requirements,
allowing the DAC capacitance to be scaled down to the kT/C limit.
In noise-limited SAR ADCs, for every extra bit of resolution, the sampling
and DAC capacitance grows by a factor of four, while, at the same time, the ripple on
the converter reference voltage has to be halved to preserve linearity.
Such a trend sets very tough requirements for the reference generator of
high-resolution charge-redistribution DAC (CR-DAC) SAR ADCs, especially when no
external components can be used, hence requiring the integration of very large
on-chip capacitors [9]. Other approaches in the literature address this issue with
DAC switching schemes that optimize the current drawn from the reference [8, 9]
or with DAC topologies that are more immune to reference ripple, such as current-steering [10]
or charge-sharing DACs (CS-DAC) [11].
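The factor of four in capacitance per extra bit follows from the kT/C noise power having to track the square of the LSB size. The short sketch below makes this concrete. It rests on assumptions that are not from the text: the sizing criterion (kT/C noise power set equal to the ideal quantization noise power LSB²/12), a 1 V full scale and a temperature of 300 K; the function name is hypothetical.

```python
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def ktc_limited_capacitance(n_bits, v_fullscale=1.0, temperature=300.0):
    """Capacitance for which kT/C equals the quantization noise power LSB^2/12.

    Assumed sizing criterion for illustration; real designs choose their
    own noise budget.
    """
    lsb = v_fullscale / 2 ** n_bits
    quantization_noise_power = lsb ** 2 / 12.0
    return K_BOLTZMANN * temperature / quantization_noise_power


for n in (10, 11, 12):
    c = ktc_limited_capacitance(n)
    lsb_mv = 1e3 / 2 ** n
    print(f"{n} b: C = {c * 1e15:7.1f} fF, LSB = {lsb_mv:.3f} mV")
```

Each extra bit halves the LSB, which is the scale against which reference ripple must be judged, and quadruples the capacitance that the reference generator has to drive.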
In this paper, a 12b four-way interleaved 600 MS/s ADC with on-chip input signal
and reference buffer is presented. The energy efficiency challenge is addressed
with a hybrid ADC architecture that employs a SAR as a coarse ADC and an incremental
Delta-Sigma as a fine ADC. A significant area reduction is achieved with
a segmented charge-sharing charge-redistribution DAC architecture that significantly
relaxes the accuracy requirements on the reference generator and can be scaled
down to the kT/C limit like a conventional CR-DAC.
The 28 nm CMOS ADC prototype delivers 58 dB SNDR at Nyquist for 26.5 mW
of power with a total area of only 0.076 mm2 and does not require any external
component.
Section 2.2 discusses the thermal noise performance of incremental ADCs
and compares it with the SAR architecture. The subrange SAR-ΔΣ ADC architecture
is then introduced in Sect. 2.3. Sections 2.4 and 2.5 review the requirements
for the reference generator of high-resolution charge-redistribution DACs and the
scalability limits of the charge-sharing DAC architecture. The segmented
charge-sharing charge-redistribution DAC architecture is then presented in Sect. 2.6. The
overall ADC architecture and calibrations are presented in Sect. 2.7, while in
Sect. 2.8 the circuit level implementation is presented. Section 2.9 presents the
measurement results, and finally conclusions are drawn in Sect. 2.10.
2 Hybrid and Segmented ADC Techniques to Optimize Power Efficiency. . . 19
where HFIR(f) is the FIR frequency response and Bwn is the equivalent analog
noise bandwidth of the FIR filter. The parameter φ depends on the actual FIR filter
coefficients, and it can be proven that it is maximized (φ = 2) when all the FIR
coefficients are equal (dumped integrator). If we implement the ΔΣ loop filter using a
simple open-loop integrator (first-order ΔΣ modulator), this result is similar to the
one in [8]; however, in this scheme, the integrator, being inside a ΔΣ loop, needs
neither gain calibration nor stringent linearity performance.
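The dependence of the noise bandwidth on the FIR coefficients can be explored numerically. The following is a minimal sketch under a modelling assumption that is not stated in the text: white input noise is weighted by a piecewise-constant window whose value during clock period k equals the k-th FIR coefficient. In that model the factor relating Bwn to the conversion time T, written here as Bwn = 1/(factor·T), equals 2(Σh)²/(N·Σh²), which reaches 2 only for equal coefficients. The function name and the tapered coefficient set are hypothetical.

```python
def noise_bandwidth_factor(coeffs):
    """Return the factor f such that Bwn = 1 / (f * T) for a window of length T.

    Assumes white input noise weighted by a piecewise-constant window whose
    value during clock period k is coeffs[k] (a modelling assumption).
    """
    n = len(coeffs)
    total = sum(coeffs)
    energy = sum(c * c for c in coeffs)
    return 2.0 * total * total / (n * energy)


if __name__ == "__main__":
    print(noise_bandwidth_factor([1] * 8))                   # equal taps
    print(noise_bandwidth_factor([1, 2, 3, 4, 4, 3, 2, 1]))  # hypothetical taper
```

By the Cauchy-Schwarz inequality, any unequal coefficient set gives a factor below 2, that is, a wider noise bandwidth than the dumped integrator.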
We can compare the incremental ΔΣ and SAR thermal noise performance by
considering the ΔΣ integrator and the SAR dynamic comparator as the only noise
(and power) limiting blocks (see Fig. 2.2). The latter is usually implemented as a
dynamic integrator followed by a CMOS latch (see Fig. 2.2a), and its input-referred
thermal noise is dominated by the contribution of the input differential pair, which can
be written [16] as
\[ P_{n,\mathrm{SAR}} = S_{Vn}\,B_{wn} \approx \frac{8kT}{g_{m,\mathrm{cmp}}} \cdot \frac{1}{2T_{int}} \qquad (2.2) \]
where Tint is the duration of the dynamic integrator integration phase. If a simple
open-loop integrator is used in an incremental ΔΣ ADC (assuming no linearity
constraints, see Fig. 2.2b), the resulting noise power using Eq. (2.1) is
\[ P_{n,\Delta\Sigma} = S_{Vn}\,B_{wn} \approx \frac{8kT}{g_{m,\mathrm{int}}} \cdot \frac{1}{\varphi\,T_{\Delta\Sigma}} \qquad (2.3) \]
Equations (2.2) and (2.3) have a very similar structure and show that for both
architectures the thermal noise is proportional to the inverse of the product gm·T.
If we assume the same input pair overdrive (Vov), the product gm·T = (2Id/Vov)·T is
proportional to the total charge drawn by the integrators.
While in an incremental ΔΣ ADC this charge (gm,int·TΔΣ) is drawn only once
during the complete conversion time (TΔΣ, see Fig. 2.3b), in a SAR approximately
the same charge (gm,cmp·Tint) must be drawn every time the comparator
is activated (see Fig. 2.3a). That means that in an N-bit SAR converter, the total
charge drawn by the SAR comparator is N times the one used by an incremental
ΔΣ integrator for the same level of thermal noise. In a redundant SAR [17], the
comparator power and noise can be optimized during conversion, but also in this
case, more than one conversion cycle with the final comparator noise level is
required when the quantization noise approaches the comparator thermal noise.
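The comparison between Eqs. (2.2) and (2.3) can be made concrete with a small calculation. This sketch assumes a temperature of 300 K and treats the FIR-dependent factor of Eq. (2.3) as a parameter named `factor` (2 for the dumped-integrator case); the function names are illustrative and not from the paper.

```python
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def sar_comparator_noise(gm_cmp, t_int, temp_k=300.0):
    """Eq. (2.2): input-referred noise power (V^2) of the dynamic comparator."""
    return 8.0 * K_BOLTZMANN * temp_k / gm_cmp / (2.0 * t_int)


def incremental_integrator_noise(gm_int, t_conv, factor=2.0, temp_k=300.0):
    """Eq. (2.3): noise power (V^2) of an open-loop integrator.

    factor=2 corresponds to the dumped-integrator (equal coefficients) case.
    """
    return 8.0 * K_BOLTZMANN * temp_k / gm_int / (factor * t_conv)


def sar_to_incremental_charge_ratio(n_bits, factor=2.0):
    """Ratio of total gm*T (charge at equal overdrive) for equal thermal noise.

    Equal noise requires gm_cmp * 2 * t_int == gm_int * factor * t_conv, and
    the SAR comparator is fired n_bits times per conversion.
    """
    return n_bits * factor / 2.0
```

With `factor=2`, the ratio reduces to the number of SAR cycles, which is the N-times charge penalty stated in the text.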
Fig. 2.3 Timing diagram with grey highlights when gm is used to suppress thermal noise (a) for
SAR ADC and (b) for a first-order ΔΣ ADC
Fig. 2.4 SAR-ΔΣ subrange lane ADC conceptual block diagram and timing
In the presented 12b design, the coarse non-binary SAR ADC provides 9b of
equivalent resolution, and the fine ADC (ΔΣ-subADC), implemented as a 1-bit first-order
CT-incremental ADC, resolves the remaining 3b to get to the final 12b
resolution. 1b overrange is added between the coarse SAR ADC and the ΔΣ-subADC
in order to correct for the coarse SAR ADC error induced by the SAR
loop thermal noise, leading to a final ΔΣ-subADC resolution of 4b after filtering
and decimation.
The operation phases are shown in Fig. 2.4. After the sampling and the coarse
SAR conversion phase (ten clock cycles), the SAR residual error is converted by
the fine ΔΣ-subADC in eight clock cycles. The eight ΔΣ-subADC comparator
decisions are filtered using an 8-tap low-pass FIR filter and decimated by eight
before being recombined with the SAR output to get a 12b final code.
Once the ΔΣ-subADC and coarse SAR outputs are recombined, the only residual
noise is the sampling kT/C noise and the ΔΣ-subADC noise, i.e., quantization
noise, latch thermal noise, and loop filter thermal noise. All these components are
low-pass filtered by the ΔΣ-subADC FIR filter with the equivalent bandwidth of the
FIR (approximately 1/TΔΣ, where TΔΣ is the ΔΣ-subADC conversion time) as shown in
Sect. 2.2. Since both quantization and latch thermal noise are first-order shaped, they
are strongly suppressed by the FIR, leaving as dominant terms the sampling (kT/C)
noise and the input-referred noise of the integrator (with integration time TΔΣ).
Artifacts due to first-order shaping are dithered by the thermal noise added by the
latch.
The ΔΣ-subADC is implemented with minimum hardware overhead by merging
the ΔΣ-DAC with the SAR-DAC and reusing the comparator latch. Reconfiguring
the SAR comparator static preamp stage (Gm,int) into the ΔΣ integrator only requires
stopping the reset of its load capacitor (Cint) after the last SAR cycle (Fig. 2.5).
A consequence of this choice is that the transconductance of the resetting
integrator in the SAR phase and of the loop integrator in the ΔΣ phase are the
same. The resetting integrator is designed so that its 3σ thermal noise is less than
the coarse LSB (LSBSAR), which guarantees that the SAR noise (quantization + thermal)
stays within the input range of the fine ADC. The ΔΣ integrator thermal noise power
at the output can then be derived from Eqs. (2.2) and (2.3) as
\[ P_{n,\Delta\Sigma} = P_{n,\mathrm{SAR}} \cdot \frac{2T_{int}}{\varphi\,T_{\Delta\Sigma}} \qquad (2.4) \]
where the suppression of the integrator thermal noise can be easily appreciated since
TΔΣ = N·TCLK, where N is the number of ΔΣ clock cycles, whereas Tint is a fraction
of TCLK.
Since a first-order ΔΣ suppresses the quantization noise with the inverse of the
oversampling ratio (OSR ≈ TΔΣ/TCLK) to the power of 3, it can be calculated that
a first-order modulator is enough to suppress the quantization noise in N = 8
cycles to roughly the same level as the thermal noise.
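The two suppression mechanisms discussed above can be compared in decibels. The sketch below follows Eq. (2.4) for the thermal noise and the stated inverse-cubic dependence on the oversampling ratio for the quantization noise, ignoring constant factors. The parameter `t_int_over_tclk` (the fraction of a clock period used for the comparator integration) and the `factor` argument are assumptions introduced for illustration; the paper does not quote a value for the former.

```python
import math


def thermal_noise_suppression_db(n_cycles, t_int_over_tclk, factor=2.0):
    """Eq. (2.4) in dB: Pn_SAR / Pn_fine = factor * T_fine / (2 * T_int).

    T_fine is taken as n_cycles * T_CLK and T_int as t_int_over_tclk * T_CLK.
    """
    return 10.0 * math.log10(factor * n_cycles / (2.0 * t_int_over_tclk))


def first_order_quantization_suppression_db(osr):
    """Quantization noise power scaling as OSR**-3 (constant factors ignored)."""
    return 30.0 * math.log10(osr)


if __name__ == "__main__":
    # Hypothetical case: 8 cycles, T_int equal to half a clock period.
    print(thermal_noise_suppression_db(8, 0.5, factor=1.8))
    print(first_order_quantization_suppression_db(8))
```

The numbers are only indicative of the orders of magnitude involved, since the absolute noise levels also depend on the coarse LSB and on the comparator design.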
The FIR filter frequency response has been designed to maximize the
final ADC ENOB performance in the presence of all noise sources (including latch
thermal noise) as well as quantization noise. The resulting filter shows a φ factor
of approximately 1.8. Given the limited number of ΔΣ-subADC conversion cycles,
the digital FIR filter can be implemented in a simple direct form as a weighted sum
of the eight ΔΣ-subADC comparator decisions.
In the presented implementation, all timing signals required for the conversion
are generated internally using a self-timed loop approach [18]. Jitter on the internal
clock during the ΔΣ phase is minimized by allocating a fixed time to comparator
decisions.
In the previous section, we have seen that it is possible to improve ADC power
efficiency by combining conversion algorithms, each of them used in
the resolution range where it performs at its peak energy efficiency. A similar approach
can also be applied to minimize the converter area: by implementing blocks
with a combination of different architectures, it is possible to achieve significant
area savings and/or remove the need for external components. In the specific case
of the SAR-ΔΣ ADC, we focus on the DAC and its reference generator blocks,
which are often the largest contributors to the total converter area.
The D/A converter of SAR ADCs is commonly a capacitive-based topology due
to the inherently high matching of metal capacitors in deep-submicron technology.
Among them, the charge-redistribution DAC (CR-DAC) (Fig. 2.6) [19] is extensively
used, mainly for its small C-array area. As the DAC switches are on the bottom
plate (the side opposite to the comparator), the CR-DAC is inherently insensitive to
switch parasitics (C'n, C'n−1, ..., C'1), since these undesired capacitances are always
connected to the reference generator. This property allows scaling down the DAC
capacitance eventually to the kT/C noise limit, and it represents the main reason for
the small C-array area achieved by the CR-DAC topology.
A key limitation of CR-DAC is that it requires a very accurate reference voltage
not to impact the linearity performance. In a CR-DAC, in fact, the reference voltages
are connected through the DAC switches and capacitors to the comparator inputs
during the SAR phase of the ADC. The transfer function from the reference voltage
node to the input of the comparator is not linearly code dependent, ranging from
a unity transfer function when the DAC has all the capacitors connected between
24 A. Venca et al.
the comparator input node and the reference voltage to virtually infinite attenuation
when the DAC has all the capacitors connected between the comparator input node
and the ground voltage. Therefore, ripples on the reference voltage greater than
1 LSB during the SAR phase can cause errors in the comparator decisions that, if not
corrected, may produce distortion in the converted signal. Moreover, the net charge
absorbed by the reference generator during one conversion, due to the switching of
the DAC capacitors, is also not linearly code dependent, as described in [19].
The net switching charge per conversion is absorbed by the finite impedance of the
reference generator and produces a low-frequency code-dependent voltage ripple
with a major spectral component at the ADC input signal second harmonic (in
differential structures). This ripple, when mixed with the input signal fundamental,
produces a third harmonic at the ADC output.
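The origin of the third harmonic can be seen from a product-to-sum identity. As a simplified model (an assumption made here for illustration, not a derivation from the paper), the reference ripple at twice the input frequency acts as a small multiplicative gain modulation of the input tone at angular frequency ω:

```latex
\[
\cos(\omega t)\,\cos(2\omega t)
  = \tfrac{1}{2}\cos(\omega t) + \tfrac{1}{2}\cos(3\omega t)
\]
```

One half of the mixing product falls back on the fundamental, while the other half appears at 3ω, which is the third-harmonic term mentioned above.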
The two nonlinear mechanisms discussed above translate into very tough require-
ments for the reference generator of high-resolution charge-redistribution SAR
ADCs. To mitigate the effects of the nonlinear reference-to-comparator transfer
function, a large decoupling capacitance is usually required to reduce the high-
frequency ripple, while low output peak impedance of the reference buffer is needed
to attenuate the effects of the low-frequency nonlinear current drawn from the
references by the switching DAC activity.
In a SAR ADC using a binary-weighted CR-DAC, the reference voltage is required
to settle within 1 LSB accuracy at every conversion cycle. Non-binary-weighted CR-DAC
implementations tolerate larger reference errors in the redundant part of the
DAC, but they still require 1 LSB settling accuracy during the last few conversion
cycles, where errors cannot be recovered by redundancy. The requirements on the
reference voltage accuracy in a generic non-binary CR-DAC can be evaluated from
Fig. 2.7a, where the reference voltage ripple is sketched in steady state (ripple
recovered within TS). In this qualitative example, the reference generator has a
bandwidth much smaller than the ADC conversion frequency fS, and the decoupling
capacitance is equal to CREF. In this condition, by approximating the ripple peak
amplitude generated by the first few MSB transitions as εMAX = α·CDAC/CREF and
assuming a linear reference voltage recovery, the required CREF can be estimated as
Fig. 2.7 Qualitative reference voltage ripple (a) in a non-binary CR-DAC and (b) in a non-binary
SAR-ΔΣ architecture using a CR-DAC
\[ C_{REF} \geq \alpha\,\frac{T_{bin}}{T_S}\,2^{N-1}\,C_{DAC} \qquad (2.5) \]
where CDAC is the total DAC capacitance and the factor α accounts for the
switching scheme adopted in the DAC. In [20], for example, a switching scheme
with α equal to 0.25 has been presented. This factor can be reduced further, but this often
results in a code-dependent DAC output common mode [21].
In the case of a noise-limited charge-redistribution SAR ADC, it is interesting to
analyze how the size of CREF scales as the required resolution increases, by
using Eq. (2.5). Fig. 2.8 shows an example for 1 bit of additional resolution.
Here CDAC has been multiplied by a factor of 4 to meet the kT/C noise
requirement (Fig. 2.8b). To recover the original ripple, CREF must be increased
by a factor of 4 as well (Fig. 2.8c). However, since the new LSB is half the original
one (N is increased by 1), the ripple must be reduced by another factor of 2 to
preserve the linearity specification, hence requiring CREF to be multiplied by a factor of 8
(Fig. 2.8d). As a consequence, in a noise-limited charge-redistribution SAR
Fig. 2.8 Reference ripple and requirement on CREF for one additional bit of resolution (T/H phase
omitted to simplify the visualization). (a) Original ripple (black), (b) ripple (blue) with CDAC
increased four times to meet kT/C requirements, (c) ripple (blue) with CREF increased four times
to obtain the original ripple, and (d) ripple (red) with CREF increased eight times to meet linearity
requirements
ADC, as the resolution increases, the size of the reference capacitance grows at
twice the rate of its core DAC capacitance (a factor of 8 per bit versus a factor of 4),
eventually becoming the main integrated area contributor for resolutions beyond 9 to 10 bit ENOB.
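The factor-of-8 growth per bit follows directly from Eq. (2.5) once CDAC is quadrupled for kT/C. A minimal sketch, with hypothetical default values for the switching factor and for the ratio Tbin/TS (neither default is a design value from the paper):

```python
def required_cref(c_dac, n_bits, switching_factor, t_ratio):
    """Eq. (2.5): minimum C_REF in farads; t_ratio is T_bin / T_S."""
    return switching_factor * t_ratio * 2 ** (n_bits - 1) * c_dac


def cref_growth_per_bit(c_dac, n_bits, switching_factor=0.25, t_ratio=0.3):
    """C_REF ratio when one bit is added and C_DAC is quadrupled for kT/C."""
    before = required_cref(c_dac, n_bits, switching_factor, t_ratio)
    after = required_cref(4.0 * c_dac, n_bits + 1, switching_factor, t_ratio)
    return after / before


if __name__ == "__main__":
    print(cref_growth_per_bit(c_dac=100e-15, n_bits=10))
```

The ratio is independent of the assumed switching factor and timing ratio, since both cancel between the two resolutions.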
For the specific case of the 12b SAR-ΔΣ architecture using a CR-DAC, the
reference ripple can be tolerated to a certain extent in the SAR phase, thanks to
the DAC redundancy and to the ΔΣ overrange, but during the fine ΔΣ phase, the
reference error must be kept within 1 LSB as in a non-binary-weighted SAR ADC
(Fig. 2.7b). The required CREF can then be calculated using Eq. (2.5) as
\[ C_{REF} \geq \alpha\,\frac{T_{\Delta\Sigma}}{T_{LANE}}\,2^{N-1}\,C_{DAC} \approx 50\,\mathrm{pF} \qquad (2.6) \]
where N = 12, CDAC = 300 fF, α = 0.25, TΔΣ = 2.22 ns, and TLANE = 6.66 ns.
Integrating such a large capacitance may result in exceeding the area budget
as it will most likely dominate the converter area [9]. This example clearly shows
that while the CR-DAC is a very area-efficient topology to implement DACs with
very small LSB units and/or total DAC capacitance, it becomes less area effective
to implement large DACs where total capacitance is dominated by kT/C noise
requirements.
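The value of roughly 50 pF in Eq. (2.6) can be reproduced from the numbers quoted in the text. The sketch below simply evaluates the expression; the function and argument names are illustrative.

```python
def required_cref(c_dac, n_bits, switching_factor, t_fine, t_lane):
    """Eq. (2.6): minimum reference decoupling capacitance in farads."""
    return switching_factor * (t_fine / t_lane) * 2 ** (n_bits - 1) * c_dac


# Values quoted in the text for the 12b design.
c_ref = required_cref(c_dac=300e-15, n_bits=12, switching_factor=0.25,
                      t_fine=2.22e-9, t_lane=6.66e-9)
print(f"C_REF >= {c_ref * 1e12:.1f} pF")
```

The lane period of 6.66 ns is consistent with four interleaved lanes at an aggregate rate of 600 MS/s.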
An effective way to reduce the area of the reference generator is the charge-sharing
DAC (CS-DAC) topology [11]. A CS-DAC is depicted in Fig. 2.9 together with the
T/H capacitance on which the input signal is sampled at the end of the T/H phase.
Simultaneously the reference voltage VREF is sampled on CDAC . After sampling, the
VREF is disconnected from CDAC for the entire SAR conversion phase. In this way the
comparator decisions are virtually insensitive to the noise/ripple on the references.
Moreover the CDAC is reset to zero at the end of the conversion phase by the SAR
algorithm itself making the net charge absorbed by the reference generator during
one conversion and the associated reference voltage ripple code independent. The
combination of these two properties relaxes the noise/ripple requirements on the
CS-DAC reference generator allowing a drastic reduction of the CREF and easing the
specifications on the reference buffer output peak impedance even for kT/C noise-
dominated DACs.
The CS-DAC, however, is not suitable to implement DACs with very small LSB
unit capacitance. Each CS-DAC element requires, in fact, a top plate switch (that is,
a switch on the comparator side) in order to share the charge between CDAC and CTH
in an additive/subtractive way during the conversion phase. The parasitic capacitances
associated with the top plate switches (C'x, C''x) affect the DAC element effective
the CR-DAC (VREFCR ) and for the CS-DAC (VREFCS ) segments, the former reference
capacitance can be sized according to
T CCR
CREFCR 2N1 CCR 1:6pF with D (2.7)
TLANE CCS
The overall 600 MS/s ADC architecture and the calibration scheme are shown in
Fig. 2.11. The ADC uses four interleaved lanes driven with a 25% duty-cycle clock
generated from a 600 MHz PLL input clock. The VREFCR and VREFCS buffers are
shared among the four time-interleaved lanes for a total ADC reference capacitance
of 20 pF. Since they have relaxed specifications on the peak output impedance, each
of them consumes only 100 μA from a 1.5 V power supply. The ADC uses both
foreground and background calibrations. At start-up the offset and gain mismatches
among lanes are calibrated as well as SAR-DAC mismatches. The latch offset
calibration is performed both in foreground and in background. This is required
to prevent ΔΣ-subADC saturation.
The static integrator (Fig. 2.12) driving the dynamic latch uses a complementary
input transconductor and a folding stage to increase the output impedance.
Fig. 2.11 Four-way interleaved SAR-ΔΣ ADC block diagram and calibrations
The input unity gain signal buffer receives a 1.5 Vppd differential signal and drives
the T/H stages of the four interleaved ADC lanes. The 25% duty-cycle T/H sampling
clock has been chosen such that exactly one lane is connected to the input signal
buffer at any given time for best power efficiency. The T/H stage (see Fig. 2.10)
consists of a pair of sampling capacitors (CTH = 300 fF) in bottom plate sampling
configuration. Due to charge-sharing operation, the CTH are reset to a common mode
voltage at the end of the conversion phase before being switched to the buffer output
for tracking. The buffer consists of an OTA in differential inverting configuration
with a resistive (R = 700 Ω) feedback network. The OTA (Fig. 2.13) is a two-stage
single-slope topology with 2 GHz unity gain bandwidth and a class AB cascode-compensated
output stage.
2.9 Measurements
jitter is measured between 300 kHz and 20 MHz frequency offset using an auxiliary
clock output and a phase noise meter. Below 300 kHz, phase noise is considered as
part of the input signal power, while above 20 MHz, the measurement is dominated
by white noise of the auxiliary path buffer chain. This jitter figure is sufficient for
the target application, as the ADC input signal has a peak-to-average power ratio
(PAPR) much larger than that of a sine wave. A spectrum with a 265 MHz
input signal and PLL jitter included is shown in Fig. 2.17. The effect of the PLL
jitter is visible in the skirts around the input tone. The SNR performance as a
function of the signal amplitude for a 2 MHz and a 265 MHz input frequency
(see Fig. 2.18) shows the typical jitter-induced roll-off of the SNR when the input
signal approaches full scale at high frequency. We can further verify this jitter
measurement result by evaluating the ADC performance as a function of the input
signal frequency (Fig. 2.19). The SNDR roll-off with the input signal frequency
matches very well with a 3 ps-rms sampling time jitter-induced SNR depicted in
Fig. 2.19 as red curve. Given the good agreement between the jitter measurement
techniques, that is, the phase noise measurement and the SNDR roll-off with input
signal amplitude and frequency, we have extrapolated the ADC SNR performance
from a small signal SNR measurement. The resulting SNDR performance has then
been calculated by adding to the extrapolated SNR all the tones in the original
spectrum. The results of this operation are depicted in Fig. 2.19 as dashed lines.
After de-embedding the PLL jitter, the final SNDR performance at high frequency
is 58 dB. The extrapolated SNDR degradation with the input signal frequency can
be mainly ascribed to sampling time interleaving errors which are not calibrated in
this design and to an increase in THD.
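The jitter-induced SNR limit referred to above follows the standard relation for a full-scale sine wave sampled with rms timing jitter, SNR = −20·log10(2π·fin·σt). The short sketch below evaluates it for the 3 ps-rms figure quoted in the text; it models jitter only and ignores every other noise source, so it is not a reproduction of the measured curves.

```python
import math


def jitter_limited_snr_db(f_in_hz, jitter_rms_s):
    """SNR (dB) of a full-scale sine limited only by rms sampling jitter."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)


if __name__ == "__main__":
    for f_in in (2e6, 100e6, 265e6):
        snr = jitter_limited_snr_db(f_in, 3e-12)
        print(f"fin = {f_in / 1e6:5.0f} MHz -> {snr:.1f} dB")
```

At low input frequency the jitter limit lies far above the converter noise floor, which is why the roll-off only becomes visible near full scale at high input frequency.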
Fig. 2.15 ADC output spectrum at fin = 2 MHz. (a) Before calibration with ΔΣ off and (b) after
calibration with ΔΣ on
Fig. 2.16 Noise contributors of the SAR-ΔΣ ADC including input buffer at low frequency.
SNR = 61.13 dB
Fig. 2.17 ADC output spectrum at fin = 265 MHz (PLL jitter dominated)
Fig. 2.18 ADC performance at 600 MS/s vs. input amplitude at fin = 265 MHz
Figure 2.20 shows the Schreier FoM plot as a function of the Nyquist frequency
for all converters published between 1997 and 2015 with Nyquist signal frequency
and sampling speed larger than 10 MHz and SNDR at Nyquist higher than 57 dB
[25]. The presented work delivers 58 dB of SNDR with 26.5 mW of total power
including input signal buffer, reference generator, and biasing without requiring any
additional BoM. The resulting HF Schreier FoM is 158.5 dB. The table in Fig. 2.21
compares this design with other state-of-the-art ADCs with on-chip
input buffers in the same speed and SNDR range. This work gives a significant
improvement in both FoMs and a drastic reduction in area.
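The quoted figure of merit can be checked with the usual Schreier definition, FoM = SNDR + 10·log10(BW/P), taking the bandwidth as the Nyquist frequency of the 600 MS/s converter. This is a quick consistency check using the numbers given in the text.

```python
import math


def schreier_fom_db(sndr_db, bandwidth_hz, power_w):
    """Schreier figure of merit in dB: SNDR + 10*log10(BW / P)."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)


fom = schreier_fom_db(sndr_db=58.0, bandwidth_hz=600e6 / 2, power_w=26.5e-3)
print(f"FoM_S = {fom:.1f} dB")
```

The power used here is the total 26.5 mW, which according to the text includes the input signal buffer, reference generator, and biasing.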
Fig. 2.19 600MS/s ADC performance vs. input frequency with and without PLL jitter
Fig. 2.20 Nyquist Schreier FoM vs. Nyquist sampling frequency for ADCs published at ISSCC
and VLSI between 1997 and 2015 with SNDR at Nyquist larger than 57 dB [25]
2.10 Conclusions
In this paper we have shown how hybrid ADC techniques and segmented DAC
architectures can be used to improve ADC power efficiency and area. The presented
SAR-ΔΣ hybrid ADC architecture achieves better power efficiency compared to
a conventional SAR, while the usage of the segmented charge-sharing charge-redistribution
DAC drastically reduces the total area compared to a conventional
charge-redistribution DAC. The 12b 600 MS/s CMOS prototype ADC implementing
Fig. 2.21 Performance summary and comparison with state-of-the-art ADCs (SNDR HF > 57 dB,
Fs > 300MS/s) including on-chip buffering
References
1. Janssen, E., et al.: A direct sampling multi-channel receiver for DOCSIS 3.0 in 65nm CMOS.
In: VLSI Circuits (VLSIC), 2011 Symposium on, Honolulu, pp. 292–293 (2011)
2. Wu, J., et al.: 27.6 A 4GS/s 13b pipelined ADC with capacitor and amplifier sharing in 16nm
CMOS. In: 2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco,
pp. 466–467 (2016)
3. Oksman, V., Galli, S.: G.hn: the new ITU-T home networking standard. IEEE Commun. Mag.
47(10), 138–145 (2009)
4. Morie, T., et al.: A 71dB-SNDR 50MS/s 4.2mW CMOS SAR ADC by SNR enhancement techniques
utilizing noise. Solid-State Circuits Conference Digest of Technical Papers (ISSCC),
2013 IEEE International, San Francisco, pp. 272–273 (2013)
5. Fredenburg, J., Flynn, M.: A 90MS/s 11MHz Bandwidth 62dB SNDR Noise-Shaping SAR
ADC. ISSCC Dig. Tech. Papers, pp. 468–469, Feb. (2012)
6. Liu, C.C.: 27.4 A 0.35mW 12b 100MS/s SAR-assisted digital slope ADC in 28nm CMOS.
2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco,
pp. 462–463 (2016)
7. Shu, Y.S., et al.: 27.2 An oversampling SAR ADC with DAC mismatch error shaping achieving
105dB SFDR and 101dB SNDR over 1kHz BW in 55nm CMOS. 2016 IEEE International
Solid-State Circuits Conference (ISSCC), San Francisco, pp. 458–459 (2016)
8. van der Goes, F., et al.: A 1.5 mW 68 dB SNDR 80 Ms/s 2× Interleaved Pipelined SAR ADC
in 28 nm CMOS. IEEE J. Solid-State Circuits 49(12), 2835–2845, Dec. (2014)
9. Verbruggen, B., et al.: A 70 dB SNDR 200 MS/s 2.3 mW dynamic pipelined SAR ADC in 28
nm digital CMOS. VLSI circuits digest of technical papers, 2014 Symposium on, Honolulu,
pp. 1–2 (2014)
10. Doris, K., et al.: A 480mW 2.6GS/s 10b 65nm CMOS time-interleaved ADC with 48.5dB
SNDR up to Nyquist. Solid-state circuits conference digest of technical papers (ISSCC), 2011
IEEE International, San Francisco, pp. 180–182 (2011)
11. Craninckx, J., van der Plas, G.: A 65fJ/Conversion-Step 0-to-50MS/s 0-to-0.7mW 9b Charge-Sharing
SAR ADC in 90nm Digital CMOS. Solid-state circuits conference, 2007. ISSCC 2007.
Digest of technical papers. IEEE International, San Francisco, pp. 246–600 (2007)
12. Liu, W., et al.: A 600MS/s 30mW 0.13μm CMOS ADC Array Achieving over 60dB SFDR
with Adaptive Digital Equalization. IEEE International Solid-State Circuits Conference, Digest
of Technical Papers, pp. 82–83, Feb. (2009)
13. Quiquempoix, V., et al.: A low-power 22-bit incremental ADC. IEEE J. Solid State Circuits.
41(7), 1562–1571 (July 2006)
14. Markus, J., et al.: Theory and applications of incremental ΔΣ converters. In: IEEE Transactions
on Circuits and Systems I: Regular Papers, vol. 51, no. 4, pp. 678–690, April 2004.
15. Robert, J., et al.: A 16-bit low-voltage CMOS A/D converter. IEEE J. Solid State Circuits.
22(2), 157–163 (Apr 1987)
16. van Elzakker, M., et al.: A 10-bit charge-redistribution ADC consuming 1.9 μW at 1 MS/s.
IEEE J. Solid State Circuits. 45(5), 1007–1015 (May 2010)
17. Harpe, P., et al.: A 7-to-10b 0-to-4MS/s flexible SAR ADC with 6.5-to-16fJ/conversion-step.
2012 IEEE international solid-state circuits conference, San Francisco, pp. 472–474 (2012)
18. Chen, S.W.M., et al.: A 6-bit 600-MS/s 5.3-mW asynchronous ADC in 0.13-μm CMOS. IEEE
J. Solid State Circuits. 41(12), 2669–2680 (Dec. 2006)
19. Ginsburg, B.P., Chandrakasan, A.P.: An energy-efficient charge recycling approach for a SAR
converter with capacitive DAC. IEEE International Symposium on, Circuits and Systems,
2005. ISCAS 2005, vol. 1, pp. 184–187 (2005)
20. Hariprasath, V., et al.: Merged capacitor switching based SAR ADC with highest switching
energy-efficiency. Electron. Lett. 46(9), 620–621, April 29 (2010)
21. Liu, C.C., et al.: A 10-bit 50-MS/s SAR ADC with a monotonic capacitor switching procedure.
IEEE J. Solid State Circuits. 45(4), 731–740 (April 2010)
22. Mulder, J., et al.: An 800MS/S 10b/13b receiver for 10GBASE-T Ethernet in 28nm CMOS.
ISSCC Digest of Technical Papers, pp. 462–463, Feb. 2015
23. Mulder, J., et al.: An 800MS/s dual-residue pipeline ADC in 40nm CMOS. ISSCC Digest of
Technical Papers, pp. 184–185, Feb. 2011
24. El-Chammas, M., et al.: A 90dB-SFDR 14b 500MS/S BiCMOS switched-current pipelined
ADC. ISSCC Digest of Technical Papers, pp. 286–287, Feb. 2015
25. Murmann, B.: ADC performance survey 1997–2015. [Online]. Available:
http://web.stanford.edu/~murmann/adcsurvey.html
Chapter 3
Interleaved Pipelined SAR ADCs: Combined
Power for Efficient Accurate High-Speed
Conversion
Ewout Martens
3.1 Introduction
\[ \mathrm{FoM}_{\mathrm{Walden}} = \frac{\mathrm{Power}}{2^{\mathrm{ENOB}+1} \cdot \mathrm{Bandwidth}}. \]
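The Walden figure of merit defined above can be evaluated with a one-line function. The example values below are hypothetical and do not refer to any converter in the survey.

```python
def walden_fom(power_w, enob_bits, bandwidth_hz):
    """Walden FoM in joules per conversion step: P / (2**(ENOB + 1) * BW)."""
    return power_w / (2.0 ** (enob_bits + 1) * bandwidth_hz)


if __name__ == "__main__":
    # Hypothetical converter: 1 mW, 10 ENOB, 50 MHz bandwidth.
    print(f"{walden_fom(1e-3, 10.0, 50e6) * 1e15:.2f} fJ/conv-step")
```

Since the denominator uses the bandwidth rather than the sampling rate, the extra factor of two in the exponent accounts for Nyquist sampling at twice the bandwidth.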
Flash converters offer the highest conversion speeds [4], but they require a lot
of overhead in terms of both area and power and suffer from high input loads. The
number of comparators increases exponentially with the required number of bits,
and techniques like interpolating [5], folding [6], and asynchronous binary search
algorithms [7] are used to reduce the required number of comparisons. However,
E. Martens
imec, Kapeldreef 75, Leuven, Belgium
e-mail: Ewout.Martens@imec.be
[Figure 3.1: two panels plotted against bandwidth [Hz] from 1M to 10G, the upper one showing ENOB [bit], for Flash, SAR, Pipeline, Pipelined SAR, TI Flash, TI SAR, TI Pipeline, and TI Pipelined SAR converters]
Fig. 3.1 Flash, SAR and pipelined ADCs, and combinations of them published at ISSCC and
VLSI between 1997 and 2016 [2]
for a high number of bits, less noise can be tolerated in the comparators resulting in
a rapid increase of the energy per comparator and limiting the practical number of
bits.
Analog-to-digital architectures based on successive approximation (SAR) [8]
are the most energy-efficient choice, since they require only a small number
of operations, and all of them can be made fully dynamic avoiding any static
current. For N-bit accuracy, only N comparisons are required, and N-1 residues
need to be generated, for example, by switching a capacitive digital-to-analog
3 Interleaved Pipelined SAR ADCs: Combined Power for Efficient Accurate. . . 41
3.2 Architecture
Figure 3.2 shows a generic diagram of an interleaved pipelined SAR ADC. The
architectural choices include the number of interleaved channels, the number of
pipelined stages per channel, and the number of bits resolved by the SAR converters
in each stage.
Each stage of the pipeline should have enough time to receive the signal from the
previous stage, do N comparisons (Tcmp ), generate N residues (Tres ), and amplify
42 E. Martens
the last residue to the next stage (Tamp ). These requirements result in the following
timing constraints for the first, intermediate, and last stages:
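\[
\begin{aligned}
T_{track} + \sum_{i=1}^{N_1} \left(T_{cmp,1,i} + T_{res,1,i}\right) + T_{amp} + T_{reset} &< C\,T_{sample},\\
\sum_{i=1}^{N_k} \left(T_{cmp,k,i} + T_{res,k,i}\right) + 2T_{amp} + T_{reset} &< C\,T_{sample},\\
\sum_{i=1}^{N_S-1} \left(T_{cmp,S,i} + T_{res,S,i}\right) + T_{cmp,S,N_S} + T_{amp} + T_{reset} &< C\,T_{sample},
\end{aligned}
\]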
with the track time Ttrack usually between Tsample /2 and Tsample . Based on these
constraints, it becomes clear that a pipeline of only a small number of stages
is usually preferred. Indeed, the available time in the intermediate stages for
comparisons and residue generation is rather limited since twice the amplification
time needs to be allocated in these stages. Reducing the amplification time Tamp or
increasing the interleaving factor C leaves more time for the SAR conversions. That
gives also extra room to do extra conversions in the first and last stages. Hence,
for moderate quantization levels, two stages are usually sufficient. On the other
hand, multiple stages cannot be avoided to achieve a high resolution. Also, when
a slightly higher channel speed is desired, removing one or two conversions from the
first and last stages and adding an extra intermediate stage becomes more efficient
than increasing the number of channels.
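The three timing constraints can be checked programmatically for a candidate stage partitioning. The sketch below is an illustration only: the function names are hypothetical, and any time values passed to it are design assumptions rather than figures from the text.

```python
def stage_time(position, t_cmp, t_res, t_amp, t_reset, t_track=0.0):
    """Time used by one pipeline stage, following the three constraints above.

    position: "first", "intermediate" or "last"
    t_cmp:    list of comparison times, one per comparison in the stage
    t_res:    list of residue-generation times (one fewer than t_cmp in the
              last stage, where the final comparison needs no residue)
    """
    conversions = sum(t_cmp) + sum(t_res)
    if position == "first":
        return t_track + conversions + t_amp + t_reset
    if position == "intermediate":
        return conversions + 2.0 * t_amp + t_reset
    if position == "last":
        return conversions + t_amp + t_reset
    raise ValueError(f"unknown stage position: {position}")


def meets_timing(stage_times, n_channels, t_sample):
    """True if every stage finishes within C * T_sample."""
    return all(t < n_channels * t_sample for t in stage_times)
```

For instance, with hypothetical values in nanoseconds, a first stage with six 0.2 ns comparisons and residues and a last stage with ten comparisons and nine residues can be passed to `meets_timing` to see how many interleaved channels are needed at a given sample period.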
An example of an architecture and its timing diagram is shown in Fig. 3.3. With
only two stages, a quantization level of 14 bits is achieved for channels which are
designed for 200 MHz. To increase the robustness of the ADC, redundancy between
the different stages is used resulting in 6 bits to be resolved in the first stage and 10
bits in the second one. Further, the last two comparisons are low-noise comparisons
with a 1-bit redundancy to correct for settling and comparator errors.
[Figure 3.3: per channel, Stage 1 (6-bit digital output) performs TRACK, six comparison and residue-generation cycles (C1/R1 to C6/R6), AMPLIFY, and RESET; Stage 2 performs AMPLIFY, comparisons C1 to C10 with residue generation R1 to R9, and RESET; the channel period is Tchannel = C Tsample]
Fig. 3.3 Architecture and timing diagram of interleaved pipelined SAR with two stages and 14-bit
quantization level targeting 200 MHz per channel
Doubling the number of interleaved channels also doubles the available time per
stage, so a higher number of lanes seems to be attractive. However, mismatches
between the channels degrade the signal-to-noise ratio [15–17]:
Channel offset mismatches result in spurs at multiples of fs/C. For C > 2, they end
up within the Nyquist band.
Channel gain mismatches act as amplitude modulation and cause two images of
the signal to appear around multiples of fs /C.
Clock skew results in sample-time errors and hence phase modulation. It causes
similar images as gain mismatches, but their magnitude becomes frequency
dependent.
Bandwidth mismatches between the channels also cause phase modulation and
hence images around multiples of fs /C.
Offset errors are readily corrected in the digital domain [20]. The pipelined SAR
ADC allows the gain of a channel to be tuned easily via tunable capacitors to ground
in the first stage, resulting in a fine analog compensation of the gain mismatch
errors [19]. Skew mismatches can be reduced by using a full-rate front-end sampler,
which is easily combined with the top-plate sampling of the first stage [18]. Finally,
bandwidth mismatches can be tuned via tunable boost voltages in the sampling
switch [21].
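The spur locations listed above can be computed for a given sampling rate, interleaving factor, and input frequency. The sketch below folds each spur into the first Nyquist zone; it assumes integer frequencies in hertz with fs divisible by C, and the example values are hypothetical.

```python
def fold_to_nyquist(freq_hz, fs_hz):
    """Alias a frequency into the first Nyquist zone [0, fs/2]."""
    f = freq_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f


def offset_spurs(fs_hz, n_channels):
    """Offset-mismatch spurs at multiples of fs/C (DC excluded)."""
    return sorted({fold_to_nyquist(k * fs_hz // n_channels, fs_hz)
                   for k in range(1, n_channels)})


def mismatch_images(fs_hz, n_channels, f_in_hz):
    """Gain, skew and bandwidth mismatch images at k*fs/C +/- f_in."""
    images = set()
    for k in range(1, n_channels):
        centre = k * fs_hz // n_channels
        images.add(fold_to_nyquist(centre + f_in_hz, fs_hz))
        images.add(fold_to_nyquist(centre - f_in_hz, fs_hz))
    return sorted(images)


if __name__ == "__main__":
    # Hypothetical example: 600 MS/s, four channels, 2 MHz input tone.
    print(offset_spurs(600_000_000, 4))
    print(mismatch_images(600_000_000, 4, 2_000_000))
```

With two channels the only offset spur lies at fs/2, at the edge of the band, whereas with more channels additional spurs fall inside it.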
The total SNDR of an interleaved pipelined SAR ADC is degraded by several error
sources listed in Table 3.1. Gain and offset errors are easily calibrated out using
either foreground and/or background calibration techniques [22, 23]. Mismatch in
the DAC is caused by capacitor mismatch and can also be calibrated for using a
one-time effort offline calibration. Other noise sources end up in the overall noise
budget.
A typical example of a noise budget is shown in Fig. 3.4 which corresponds to
a 12-bit ENOB pipelined SAR ADC with 200 MHz channels as shown in Fig. 3.3.
There are four major noise sources which are given an equal weight: the sampling
noise, the amplifier noise, the comparator noise, and the rest including quantization
noise and jitter. Consequently, the power budget usually follows a similar pattern.
The next section discusses the contributions of the different blocks in more detail.
This section focuses on the four main building blocks of the interleaved pipelined
SAR ADC: the sampling stage, the DAC, the comparator, and the residue amplifier.
Although it also has a major impact on the performance of high-accuracy high-speed
designs, low-jitter clock buffering and distribution are not discussed here.
[Pie chart: kT/C 22%, amplifier 21%, low-noise comparators 15%, other 15%, sampling jitter 11%, comparators 2.1–2.8 9%, quantization 7%]
Fig. 3.4 Example of noise budget split-up over different noise sources in interleaved pipelined
SAR ADC of Fig. 3.3
High-speed, high-accuracy sampling requires low values for the resistance in the
sampling network. This poses strict requirements on the output impedance of the
input buffer and the switch resistance. When the tracking phase starts, the voltage
on the first-stage sampling capacitance Cdac should settle to the value of the input
signal. Assuming a maximum error of LSB/$\alpha$ is tolerable, the required settling time is then
$$T_{\mathrm{settling}} = \frac{1}{\beta f_{\mathrm{sample}}} > R\,C_{\mathrm{dac}}\,(N \log 2 + \log \alpha),$$
with R the total resistance of the sampling network and N the number of bits, and 1/$\beta$ indicates the tracking time fraction of the sampling period. This means that the bandwidth of the sampling network scales as follows:
$$\frac{BW}{f_{\mathrm{sample}}} > \frac{\beta}{2\pi}\,(N \log 2 + \log \alpha).$$
Further, assuming the kT/C noise should be smaller than LSB/$\gamma$, the maximum resistance can be computed as follows:
$$R < \frac{V_{\mathrm{ptp}}^2}{\beta f_{\mathrm{sample}}\,kT\,2^{2N}\,\gamma^2\,(N \log 2 + \log \alpha)},$$
where the input range Vptp is proportional to the supply voltage. Hence, with smaller technology nodes and lower supply voltages, the requirement on the resistance continuously becomes tougher.

Fig. 3.5 Example of sampling network for a two-time interleaved pipelined SAR
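The settling and noise bounds can be evaluated with a short sketch. It assumes single-pole settling of the sampling network, natural logarithms, and an rms kT/C noise of sqrt(kT/C) bounded by a fraction of the LSB. The parameter names alpha (settling-error fraction), beta (inverse of the tracking-time fraction) and gamma (noise fraction), as well as the example numbers, are assumptions made for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def sampling_requirements(n_bits, f_sample, v_ptp, alpha, beta, gamma, temp=300.0):
    """Return (time constants needed, minimum bandwidth, minimum C, maximum R)."""
    # Settling to within LSB/alpha of full scale needs this many time constants
    settle_factor = n_bits * math.log(2) + math.log(alpha)
    # Bandwidth of the RC sampling network, BW = 1 / (2*pi*R*C)
    bw_min = f_sample * beta * settle_factor / (2 * math.pi)
    # Capacitance for which sqrt(kT/C) equals LSB/gamma
    lsb = v_ptp / 2 ** n_bits
    c_min = K_BOLTZMANN * temp / (lsb / gamma) ** 2
    # Largest resistance that still settles within the tracking time 1/(beta*f_sample)
    r_max = 1.0 / (beta * f_sample * c_min * settle_factor)
    return settle_factor, bw_min, c_min, r_max


# Hypothetical design point: 14 bit, 200 MS/s channel, 1.6 V range, 50% tracking
settle, bw_min, c_min, r_max = sampling_requirements(14, 200e6, 1.6, 2.0, 2.0, 2.0)
print(f"time constants: {settle:.2f}")
print(f"minimum bandwidth: {bw_min / 1e6:.0f} MHz")
print(f"minimum capacitance: {c_min * 1e12:.2f} pF")
print(f"maximum resistance: {r_max:.0f} ohm")
```

Because the minimum capacitance grows as 1/Vptp squared, halving the input range in this model reduces the allowed resistance by a factor of four.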
High accuracy also means low distortion. A smaller switch resistance reduces
the signal dependency of the resistance. Since most pipelined SAR ADCs use a
capacitive DAC and switch the bottom plates of the first-stage DAC, they cannot
easily use bottom-plate sampling which makes them more vulnerable to non-
idealities such as charge injection and clock feedthrough.
To meet these requirements, careful design of a sampling network with boot-
strapped switches is needed [24]. Figure 3.5 shows an example of the sampling
network of a high-speed high-accuracy two-time interleaved pipelined SAR ADC
[21]. To minimize the skewing errors between the channels, a front-end sampler
operating at the full clock rate clkFR is added. To increase the linearity, the bootstrap
circuits use the original input rather than the signal at the source of the sampling
switch inside the channels. The bootstrap voltage used inside the channels can be
tuned to calibrate mismatches between the bandwidths of the sampling networks of
the channels.
Since a small switch resistance is required, the sampling transistors become quite large, and they consequently have large parasitic capacitances. To avoid coupling over the switches when the SAR algorithm is executed, dummy switches are added to cancel the coupling to first order and avoid cross talk from one channel to another.

Fig. 3.6 Interleaved pipelined SAR ADC embedded in a discrete-time system with integrated filtering function
When the ADC is part of a discrete-time system, it can easily be driven by an
amplifier which acts as a Gm stage that puts some charge on the DAC of the first
stage, in the same way as the residue amplifier between the stages operates. This
configuration offers an easy way to integrate a filtering function with the interleaved
pipelined SAR ADC [25]. As shown in Fig. 3.6, an FIR function is realized when a
high-rate clock adds the charge of some samples onto the DAC capacitor after which
the ADC starts the conversion.
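As a rough illustration of this filtering function, accumulating several consecutive charge packets with equal weight behaves like a boxcar FIR filter. The sketch below evaluates the magnitude response of such a filter; the tap values and the clock frequency are hypothetical and do not describe the implementation of [25].

```python
import cmath
import math


def fir_magnitude(taps, f, f_clk):
    """Magnitude of the FIR response at frequency f for taps clocked at f_clk."""
    w = 2 * math.pi * f / f_clk
    return abs(sum(h * cmath.exp(-1j * w * m) for m, h in enumerate(taps)))


taps = [1, 1, 1, 1]   # hypothetical: four equal charge packets per conversion
f_clk = 1e9           # hypothetical high-rate clock in Hz
for f in (0.0, 125e6, 250e6, 500e6):
    print(f"{f / 1e6:6.1f} MHz  |H| = {fir_magnitude(taps, f, f_clk):.3f}")
```

With M equal taps the notches fall at multiples of f_clk/M, so adding taps adds notches, at the cost of more high-rate clock cycles per conversion.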
3.3.2 DAC
Most pipelined SAR ADCs use capacitive DACs to generate the residue. The
total capacitance is dictated by the noise and mismatch requirements. In Fig. 3.7,
a comparison is made for a 6-bit DAC in 28 nm technology between the kT/C
noise and the accuracy due to mismatch (based on Monte Carlo simulations). A
differential input range of 2 VDD is assumed. Mismatch dominates the SNDR
degradation, and for more than 10-bit accuracy, one or more MSB units of the DAC
can be calibrated in an offline calibration step. The improvement is also shown in
Fig. 3.7 for a different number of MSB units that are calibrated with a resolution of
200 aF. A common-centroid layout is adopted to improve the matching of the DAC.
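The kT/C side of this comparison can be estimated with a few lines of code. The sketch assumes a full-scale sine wave over a differential range of 2 VDD and a sampled noise power of kT/C on each side of the differential DAC; the supply voltage and the capacitance values are hypothetical, and mismatch is not modeled.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def ktc_limited_snr_db(c_dac, vdd, temp=300.0):
    """SNR limit set by sampling noise alone for a given DAC capacitance per side."""
    v_ptp = 2.0 * vdd                                # differential input range of 2 VDD
    signal_power = (v_ptp / 2.0) ** 2 / 2.0          # full-scale sine, amplitude v_ptp / 2
    noise_power = 2.0 * K_BOLTZMANN * temp / c_dac   # assumption: kT/C on each side
    return 10.0 * math.log10(signal_power / noise_power)


for c_pf in (0.5, 1.0, 2.0, 4.0):
    print(f"{c_pf:4.1f} pF -> {ktc_limited_snr_db(c_pf * 1e-12, vdd=0.9):.1f} dB")
```

In this model each doubling of the capacitance gains about 3 dB, which is why the capacitance needed for noise alone rises steeply at high resolution.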
For high-accuracy designs, the switching energy of the DAC significantly
contributes to the total conversion energy (corresponding to the portion of the kT/C
noise in Fig. 3.4). With a different switching scheme, some energy can be saved.
Several switching schemes have been developed, and four of them are compared in
Fig. 3.8. These schemes do not need extra voltages and use simple logic. The total
DAC size is always the same, and the DACs are reset after the conversion.
[Fig. 3.7 plot; horizontal axis: DAC capacitance [pF], 0 to 10]
[Two plots for the monotonic, switchback, alternate, and symmetrical switching schemes: one versus DAC code (0 to 60) and one versus SAR step (0 to 6)]
Fig. 3.8 Switching energy and common-mode level of 6-bit DAC used in the first stage of a
pipelined SAR ADC with different switching schemes
[Two plots; horizontal axis of both: number of time constants of settling (1 to 5)]
Fig. 3.9 Input-referred noise of ADC due to finite DAC settling in interleaved pipelined SAR
ADC of Fig. 3.3
The bottom-plate drivers of the DAC can be simple inverters. Each time the
DAC switches, it draws some signal-dependent charge from the reference, which
can be just the supply voltage in a pipelined SAR ADC. This ripple on the reference
voltage introduces nonlinearities. To compensate for this effect at high accuracy and
high speed, a large decoupling capacitance can be added [21]. A more area-efficient
approach is to use a reference buffer, which however needs a large bandwidth and
hence becomes quite power hungry [29].
The finite settling time of the DAC causes errors in the top-plate voltage, which lead to wrong decisions by the comparators. In the pipelined SAR architecture, the redundancy between the stages makes these errors less critical. Although the effect is deterministic, the subsequent DAC feedbacks and the redundancy in the ADC make the finite-settling-time errors act as random noise rather than as distortion, somewhat similar to the conversion of quantization errors to quantization noise. Figure 3.9 shows the input-referred noise created by finite DAC settling for the first and second stage of a two-stage pipelined SAR ADC. The settling time before the final residue generation in front of the amplifier is most critical. Hence, extra settling time is usually provided, as also shown in the timing diagram of Fig. 3.3 (R6 in stage 1 and R9 in stage 2).
Another error source introduced by the DAC is noise from the drivers. This noise
gets filtered on the top plate of the DAC:
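$$V_{n,\mathrm{top}}^2 = 4kT \sum_i \left| \frac{1}{1 + \left(R_i + \frac{1}{j\omega C_i}\right) \sum_{k \neq i} \frac{1}{R_k + \frac{1}{j\omega C_k}}} \right|^2 \rho_i,$$
with $C_i$ the units of the DAC, $R_i$ the output resistance of the driver, and $\rho_i$ its equivalent noise resistance. When the stage is followed by a residue amplifier like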
the one of Fig. 3.12, the noise is further filtered by a sinc characteristic because of
the integrating functionality of this dynamic amplifier.
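The filtering of the driver noise on the top plate can be checked numerically. The sketch below assumes that each DAC branch is the driver output resistance in series with its unit capacitor, that the top plate is loaded only by the other branches, and that each driver contributes a white noise source set by its equivalent noise resistance. The function name and the component values are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def top_plate_noise_psd(f, caps, r_out, r_noise, temp=300.0):
    """Noise power spectral density (V^2/Hz) on the DAC top plate at frequency f > 0."""
    w = 2 * math.pi * f
    # Series impedance of each branch: driver resistance plus unit capacitor
    z = [r + 1 / (1j * w * c) for r, c in zip(r_out, caps)]
    psd = 0.0
    for i, z_i in enumerate(z):
        # Admittance of all other branches, which load the top plate
        y_others = sum(1 / z_k for k, z_k in enumerate(z) if k != i)
        h = 1 / (1 + z_i * y_others)  # voltage divider from driver i to the top plate
        psd += 4 * K_BOLTZMANN * temp * r_noise[i] * abs(h) ** 2
    return psd


# Hypothetical two-unit DAC with identical 100 fF units and 100 ohm drivers
psd = top_plate_noise_psd(1e9, [1e-13, 1e-13], [100.0, 100.0], [100.0, 100.0])
print(f"{psd:.3e} V^2/Hz")
```

For two identical branches each driver sees a divider of one half, so the total equals 2kT times the noise resistance, which is a convenient sanity check.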
3.3.3 Comparators
In the comparators, a trade-off between noise, comparison time, and power con-
sumption should be made. To avoid static power consumption, a dynamic compara-
tor [30] is often used. When the noise of the input pair is dominant, the input-referred
noise is in first order given by
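$$V_{n,\mathrm{in}} \approx \sqrt{\frac{kT}{C_D\,\frac{g_m}{I_{DS}}\,V_{tl}}},$$
with $C_D$ the capacitance on the drain terminals and $V_{tl}$ the threshold of the latch [31]. Reducing the noise by half then consumes four times more comparison energy. Further, the decision time consists of a part needed to build up $V_{tl}$ on $C_D$ ($\approx C_D V_{tl}/I_{DS}$) and a part due to the regeneration time of the latch:
$$t_{\mathrm{comp}} = t_{1\,\mathrm{mV}} - \tau \log \frac{V_{in}}{1\ \mathrm{mV}},$$
with $V_{in}$ the input signal, $t_{1\,\mathrm{mV}}$ the delay with a 1 mV input signal, and $\tau$ the regeneration time constant of the latch.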
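The delay model and the noise-energy scaling can be sketched as follows. The regeneration time constant tau is an assumed parameter of the model, the delay is clamped at zero for very large inputs, and the energy is taken to scale with the drain capacitance. All numbers are hypothetical.

```python
import math


def comparator_delay(v_in, t_1mv, tau):
    """Decision time for input |v_in| in volts; t_1mv is the delay at 1 mV."""
    return max(0.0, t_1mv - tau * math.log(abs(v_in) / 1e-3))


def relative_comparison_energy(noise_ratio):
    """Energy multiplier when the input-referred noise is scaled by noise_ratio.

    Assumes noise power is inversely proportional to the drain capacitance
    and energy is proportional to it.
    """
    return 1.0 / noise_ratio ** 2


# Hypothetical comparator: 100 ps at 1 mV, 10 ps regeneration time constant
for v in (0.1e-3, 1e-3, 10e-3, 100e-3):
    print(f"{v * 1e3:6.1f} mV -> {comparator_delay(v, 100e-12, 10e-12) * 1e12:.1f} ps")
print(relative_comparison_energy(0.5))
```

Small inputs dominate the conversion time in this model, which is why the gain of the residue amplifier helps the comparators that follow it.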
The pipelined SAR architecture helps in relaxing the specifications for noise and
time so that energy can be saved. First, there is redundancy between the different
stages so that errors made in the front section of the pipeline become less important.
Second, the gain of the residue amplifiers relaxes the noise requirements of the
comparators near the end. Also, the larger the input signals of the comparator, the
lower the decision time.
For example, Fig. 3.10 shows the input-referred noise of the different comparators in the two-stage pipelined SAR ADC of Fig. 3.3. For each comparator, a Gaussian noise model is assumed, which allows the variation of the ADC output value to be computed and translated to the input to find the input-referred noise. As expected, the comparators of the first stage (left) can be fast
and noisy without too much overall SNDR degradation. The input-referred noise
of the comparators of the second stage is suppressed by the gain of the amplifier for
both the high-noise (middle) and low-noise comparators (right). These low-noise
comparators only resolve the last bits of the 14-bit quantization level so that the
maximum error settles to about 1 LSB.
With asynchronous operation of the SAR algorithm [32], each comparator gets
as much time as it needs, depending on the input signal it sees. However, with an
asynchronous operation, there is always a certain probability that there is not enough
time available for the complete conversion. Based on the delay model described
above, this probability has been computed in Fig. 3.11 for various amplifier gains
and comparator delays (with a 1 mV input signal). The gain of the amplifier relaxes
the requirement on the delay of the comparator enabling higher-speed operation.
Each channel of the interleaved pipelined SAR can be driven completely asynchronously: the comparators and the amplifier give a ready signal which starts the next part of the conversion. However, due to the finite probability of time-outs, a
[Three plots of input-referred noise [LSB] versus comparator noise [w.r.t. LSB], each with curves for Gain = 2, 4, 6, and 8: Comparators Stage 1 (left), High-Noise Comparators Stage 2 (middle), Low-Noise Comparators Stage 2 (right)]
Fig. 3.10 Input-referred noise of comparators in different stages corresponding to the timing
diagram of Fig. 3.3
[Plot: time-out probability on a logarithmic axis (10^-60 to 10^-20) versus delay of comparator 3 [w.r.t. Tchannel], with curves for Gain = 2, 4, 6, and 8]
Fig. 3.11 Example of computation of probability on time-out in low-noise comparators of Fig. 3.3
for different nominal comparator delays and gain factors
stage S in the pipeline cannot be sure that stage S + 1 has finished and that the amplification between the two stages can start. To resolve this issue, stage S needs to set an interrupt signal to stage S + 1.
The offset of a comparator depends on its common-mode input signal. When one of the nonsymmetrical switching schemes of Fig. 3.8 is used, the common-mode voltage changes during the SAR algorithm, which increases the DNL of the ADC. Therefore, a different comparator can be used for each decision, with its offset calibrated out at the correct common-mode level. This scheme leads to a
comparator-driven controller [7]: the valid signal of the comparator can directly be
used as clock for the next one. Furthermore, with one of the switching schemes
of Fig. 3.8, the output of the comparators can directly drive the bottom plate of
the corresponding unit in the DAC. Also, all comparators can be reset in parallel
ensuring complete reset for all of them independent of the time needed for the DAC
to settle.
$$A = \frac{g_m T_{\mathrm{amp}}}{C_{\mathrm{dac}}},$$
where Tamp is the amplification time. This value can be determined intrinsically by observing the common-mode level of the outputs. More flexibility is offered with a tunable delay, which allows the gain of the amplifier to be set.
The input-referred noise of the amplifier is directly translated to the input of the
ADC. In first order, it is given by
$$V_{n,\mathrm{in}} \approx \sqrt{\frac{4kT}{g_m T_{\mathrm{amp}}}}.$$
For a certain gm /Id biasing, the required energy per amplification then becomes
showing the trade-off between power consumption, amplification time, gain, and
noise.
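A minimal numerical sketch of the gain, noise, and energy relations follows. It assumes that the amplifier draws a constant bias current Id = gm/(gm/Id) from a supply VDD during Tamp, so that the energy per amplification is VDD · Id · Tamp. This energy model and all component values are illustrative assumptions rather than the author's expression.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def dynamic_amplifier(gm, t_amp, c_dac, gm_over_id, vdd, temp=300.0):
    """Return (gain, input-referred rms noise in V, energy per amplification in J)."""
    gain = gm * t_amp / c_dac                                 # integrating amplifier gain
    v_noise = math.sqrt(4 * K_BOLTZMANN * temp / (gm * t_amp))
    i_bias = gm / gm_over_id                                  # assumed constant bias current
    energy = vdd * i_bias * t_amp                             # assumed energy model
    return gain, v_noise, energy


# Hypothetical design point: 4 mS, 0.5 ns, 0.5 pF load, gm/Id = 15 /V, 0.9 V supply
gain, v_noise, energy = dynamic_amplifier(4e-3, 0.5e-9, 0.5e-12, 15.0, 0.9)
print(f"gain = {gain:.2f}")
print(f"noise = {v_noise * 1e6:.1f} uV rms")
print(f"energy = {energy * 1e15:.1f} fJ")
```

Under these assumptions the product of energy and noise power equals 4kT · VDD/(gm/Id), independent of gm and Tamp, so lower noise can only be bought with proportionally more energy at a fixed biasing point.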
The complementary dynamic amplifier shown in Fig. 3.12 improves the effi-
ciency via current and charge reuse. After finishing the SAR algorithm and resetting
the DAC of the next stage, the previous values are back at the top plates. The first
switches of the amplifier then use charge sharing to restore the common-mode level
before the next amplification starts. A local common-mode feedback loop provides
some common-mode rejection. When the amplifier is not active, it loads the DACs
of the previous and next stages. To avoid large nonlinear capacitance, the terminals
of the input transistors are reset, and the gates of the transistors for the CM loop are
connected to an internal node rather than directly to the outputs.
The SAR algorithm reduces the input of the amplifier, which improves its linearity. Hence, the linearity requirement on the amplifier is relaxed. This is shown in Fig. 3.13, where the THD of the ADC is plotted as a function of the HD2 and HD3 of the amplifier (measured with a 0.75 LSB1 input for the amplifier).
Since the amplifier is not used during most of the time, it can be shared among
different channels to save some area [40]. However, this requires some multiplexing
between the channels (e.g., at the output via all switches driven by amp or its
inverse) which also increases the cross talk between the channels.
Fig. 3.13 Impact of amplifier nonlinearity on THD of ADC of Fig. 3.3 (THD of ADC [dB] due to amplifier nonlinearity versus HD of amplifier [dB], with curves for HD2 and HD3)
Table 3.2 lists the specifications and architectural details of the three interleaved
pipelined SAR ADCs shown in Fig. 3.14. They all have been designed in a 1P9M
28 nm CMOS process. An architecture with two interleaved channels and two stages
per channel has been chosen for all three designs. They differ in the design point:
The first example [19] has the highest sampling speed of 410 MS/s with an 11-
bit resolution. Digital calibration is put on chip to calibrate comparator offsets,
amplifier gain, and MSB mismatch errors. This calibration engine occupies more area than the actual ADC, as shown in the left part of Fig. 3.14.
ADC 2 [25] consumes only 1 mW at 250 MS/s which is the lowest power
consumption of the three designs. It is part of a larger discrete-time system with
an integrated filtering function like the system of Fig. 3.6. Therefore, its input
range is much smaller than for the other two ADCs.
The third ADC [21] demonstrates the feasibility of 14-bit resolution with an interleaved pipelined SAR architecture at a fairly high speed of 280 MS/s. These high-speed, high-resolution specifications result in a large decoupling capacitance, which is also indicated in the right part of Fig. 3.14.
ADC 2 resolves only 3 bits in the first stage, whereas the others use a 6-bit SAR
conversion. Since its input range is also small, the residue to be amplified is of the
same order, and an open-loop dynamic amplifier can be used without increasing
the nonlinearity too much. Nevertheless, the harmonic distortion of the amplifier is
about 8 dB higher, and spurs are clearly observed in the output spectrum shown in
Fig. 3.15.
Table 3.2 Overview of properties of the three interleaved pipelined SAR ADCs

                                 ADC 1 [19]       ADC 2 [25]       ADC 3 [21]
Technology                       CMOS 28 nm       CMOS 28 nm       CMOS 28 nm
Channels                         2                2                2
Sampling speed                   410 MS/s         250 MS/s         280 MS/s
Resolution                       11 bit           10 bit           14 bit
ENOBlf                           10 bit           7.4 bit          11.8 bit
ENOBhf                           9.5 bit          N/A              9.8 bit
SFDR                             70 dB            57 dB            70 dB
Power                            1.9 mW           1.0 mW           3.2 mW
Input range                      1.2 Vptp         200 mVptp        1.6 Vptp
Channel time Tchannel            4.9 ns           8 ns             7.1 ns
Stage 1     Bits                 6                3                6
            DAC size             860 fF           1.6 pF           3.2 pF
            DAC type             Monotonic        Symmetrical      Switchback
            Comp. 1 noise        1.3 LSB          2.0 LSB          6.5 LSB
            Comp. 1 delay        0.02 Tchannel    0.02 Tchannel    0.02 Tchannel
Amplifier   Type                 Two-stage        Complementary    Complementary
            Gain                 4                6                4
            Tamp                 0.11 Tchannel    0.1 Tchannel     0.09 Tchannel
            Noise                0.2 LSB          0.2 LSB          0.4 LSB
            THD                  55 dB            47 dB            56 dB
            Energy               0.38 Econv       0.25 Econv       0.27 Econv
Stage 2     Bits                 5+2              7+2              8+2
            DAC size             500 fF           830 fF           3.1 pF
            DAC type             Monotonic        Symmetrical      Monotonic
            Comp. 2 noise        0.8 LSB          2.0 LSB          3.4 LSB
            Comp. 2 delay        0.03 Tchannel    0.02 Tchannel    0.02 Tchannel
            Comp. 3 noise        0.6 LSB          1.3 LSB          1.5 LSB
            Comp. 3 delay        0.04 Tchannel    0.02 Tchannel    0.08 Tchannel
Fig. 3.14 Chip photographs of the three interleaved pipelined SAR ADCs
The three ADCs have similar timing diagrams with 25–50% tracking time, about 10% amplification time, and the rest for SAR conversion and reset. To realize the 14-bit quantization with only two stages, the high-noise comparators of the second stage in ADC 3 are made a bit faster, while the low-noise comparators are made much slower, and a longer channel time is needed to fit ten comparisons in a stage, resulting in a slower ADC. Its timing diagram is shown in Fig. 3.3.
Different switching schemes are used in the three designs. ADC 2 uses the symmetrical scheme with a constant common mode (see Fig. 3.8). This configuration allows the use of only one comparator in the first stage and only two in the second stage, whereas the other ADCs use the comparator-based controller architecture with a different comparator per step. Since the DAC is also quite small and the resolution is only 10 bit, ADC 2 has the most compact area. However, more control is needed, which takes up a large part of the total conversion energy.
The high resolution of ADC 3 requires a large DAC in the first stage to limit kT/C
noise. The linearity is enhanced by calibration of the first three MSBs. The switching
energy of this first-stage DAC takes up a larger part of the total conversion energy
compared to the other two ADCs.
Fig. 3.15 shows measurement results of ADC 2 at 250 MS/s for a 10 MHz input.
The spectrum shows an SNDR of almost 47 dB which is mainly limited by the SNR.
The right part shows the FIR filtering profile. The FIR can create notches at specific
frequencies which can be used in a receiver to filter out unwanted interferers. More
notches can be added by increasing the number of filter taps.
For ADCs 1 and 3, the trade-off between bandwidth, accuracy, and power
consumption is depicted in Fig. 3.16. ADC 1 achieves 9.64-bit ENOB at 410 MS/s
with a low-frequency input which drops to 8.85 bit with a Nyquist input. The peak
efficiency is 5.5 fJ/c.s. which remains less than 12 fJ/c.s. at 410 MS/s with a Nyquist
input signal, including the power for the (slow) background calibration. For ADC 3,
the peak ENOB with low-frequency input is 11.8 bit at 280 MS/s. It remains 9.8 bit
with Nyquist input in part degraded by jitter from the measurement setup. With a
high-frequency Walden FoM of 12.8 fJ/c.s., it realizes a higher accuracy at the cost
of a slower speed but with comparable power consumption. All examples compare
favorably to the state of the art shown in Fig. 3.1.
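The Walden figure of merit quoted here is the power divided by the product of 2^ENOB and the sampling rate. The short sketch below reproduces the high-frequency value for ADC 3 from the numbers in Table 3.2, assuming that the listed power applies at the quoted sampling rate.

```python
def walden_fom(power_w, enob_bits, f_sample_hz):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob_bits * f_sample_hz)


# ADC 3 with Nyquist input: 3.2 mW, 9.8 bit ENOB, 280 MS/s (values from Table 3.2)
fom_adc3_hf = walden_fom(3.2e-3, 9.8, 280e6)
print(f"ADC 3, Nyquist input: {fom_adc3_hf * 1e15:.1f} fJ/c.s.")

# ADC 1 with Nyquist input: 1.9 mW, 8.85 bit ENOB, 410 MS/s
fom_adc1_hf = walden_fom(1.9e-3, 8.85, 410e6)
print(f"ADC 1, Nyquist input: {fom_adc1_hf * 1e15:.1f} fJ/c.s.")
```

Since each lost bit of ENOB doubles the figure of merit, the drop in ENOB at Nyquist input explains most of the gap between the low-frequency and high-frequency values.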
[Fig. 3.16 plots: ENOB [bit] and Walden FoM [fJ/c.s.] versus sampling frequency [MS/s] for ADC 1 and ADC 3, each with a low-frequency curve (lf, 10 MHz input) and a high-frequency curve (hf, Nyquist input)]
3.5 Conclusions
Acknowledgment The author would like to thank Bob Verbruggen, Badr Malki, Masao Iriguchi,
Kazuaki Deguchi, and Jan Craninckx for their contributions to this work.
References
20. Le Dortz, N. et al.: A 1.62GS/s time-interleaved SAR ADC with digital background mismatch calibration achieving interleaving spurs below 70dBFS. 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), San Francisco, pp. 386–388 (2014)
21. Verbruggen, B., Deguchi, K., Malki, B., Craninckx, J.: A 70 dB SNDR 200 MS/s 2.3 mW dynamic pipelined SAR ADC in 28 nm digital CMOS. 2014 Symposium on VLSI Circuits Digest of Technical Papers, Honolulu, pp. 1–2 (2014)
22. Murmann, B.: Digitally assisted data converter design. 2013 Proceedings of the ESSCIRC (ESSCIRC), Bucharest, pp. 24–31 (2013)
23. Verbruggen, B.: Digitally assisted analog to digital converters. 2014 Workshop on Advances in Analog Circuit Design, Lisbon (2014)
24. Abo, A.M., Gray, P.R.: A 1.5-V, 10-bit, 14.3-MS/s CMOS pipeline analog-to-digital converter. IEEE J. Solid State Circuits. 34(5), 599–606 (1999)
25. Malki, B., Verbruggen, B., Martens, E., Wambacq, P., Craninckx, J.: A 150 kHz–80 MHz BW discrete-time analog baseband for software-defined-radio receivers using a 5th-order IIR LPF, active FIR and a 10 bit 300 MS/s ADC in 28 nm CMOS. IEEE J. Solid State Circuits. 51(7), 1593–1606 (2016)
26. Liu, C.C., Chang, S.J., Huang, G.Y., Lin, Y.Z.: A 10-bit 50-MS/s SAR ADC with a monotonic capacitor switching procedure. IEEE J. Solid State Circuits. 45(4), 731–740 (2010)
27. Huang, G.Y., Chang, S.J., Liu, C.C., Lin, Y.Z.: 10-bit 30-MS/s SAR ADC using a switchback switching method. IEEE Trans. Very Large Scale Integr. VLSI Syst. 21(3), 584–588 (2013)
28. Lin, K.T., Cheng, Y.W., Tang, K.T.: A 0.5 V 1.28-MS/s 4.68-fJ/conversion-step SAR ADC with energy-efficient DAC and trilevel switching scheme. IEEE Trans. Very Large Scale Integr. VLSI Syst. 24(4), 1441–1449 (2016)
29. Kuppambatti, J., Kinget, P.R.: Current reference pre-charging techniques for low-power zero-crossing pipeline-SAR ADCs. IEEE J. Solid State Circuits. 49(3), 683–694 (2014)
30. Miyahara, M., Asada, Y., Paik, D., Matsuzawa, A.: A low-noise self-calibrating dynamic comparator for high-speed ADCs. 2008 IEEE Asian Solid-State Circuits Conference, Fukuoka, pp. 269–272 (2008)
31. Nuzzo, P., De Bernardinis, F., Terreni, P., Van der Plas, G.: Noise analysis of regenerative comparators for reconfigurable ADC architectures. IEEE Trans. Circuits Syst. I, Reg. Papers. 55(6), 1441–1454 (2008)
32. Chen, S.-W.M., Brodersen, R.W.: A 6b 600MS/s 5.3mW Asynchronous ADC in 0.13 μm CMOS. 2006 IEEE International Solid State Circuits Conference - Digest of Technical Papers, San Francisco, pp. 2350–2359 (2006)
33. Murmann, B., Boser, B.E.: A 12 b 75 MS/s pipelined ADC using open-loop residue amplification. 2003 IEEE International Solid-State Circuits Conference, 2003. Digest of Technical Papers. ISSCC., San Francisco, vol. 1, pp. 328–497 (2003)
34. Anthony, M., Kohler, E., Kurtze, J., Kushner, L., Sollner, G.: A process-scalable low-power charge-domain 13-bit pipeline ADC. 2008 IEEE Symposium on VLSI Circuits, Honolulu, pp. 222–223 (2008)
35. Hershberg, B., Weaver, S., Sobue, K., Takeuchi, S., Hamashita, K., Moon, U.K.: Ring amplifiers for switched-capacitor circuits. 2012 IEEE International Solid-State Circuits Conference, San Francisco, pp. 460–462 (2012)
36. Chiang, S.H.W., Sun, H., Razavi, B.: A 10-Bit 800-MHz 19-mW CMOS ADC. 2013 Symposium on VLSI Circuits, Kyoto, pp. C100–C101 (2013)
37. Brooks, L., Lee, H.S.: A zero-crossing-based 8-bit 200 MS/s pipelined ADC. IEEE J. Solid State Circuits. 42(12), 2677–2687 (2007)
38. Lim, Y., Flynn, M.P.: 26.1 A 1mW 71.5dB SNDR 50MS/S 13b fully differential ring-amplifier-based SAR-assisted pipeline ADC. 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, pp. 1–3 (2015)
39. Malki, B., Verbruggen, B., Wambacq, P., Deguchi, K., Iriguchi, M., Craninckx, J.: A complementary dynamic residue amplifier for a 67 dB SNDR 1.36 mW 170 MS/s pipelined SAR ADC. ESSCIRC 2014 40th European Solid State Circuits Conference (ESSCIRC), Venice Lido, pp. 215–218 (2014)
40. Huang, Y.C., Lee, T.C.: A 10b 100MS/s 4.5mW pipelined ADC with a time sharing technique. 2010 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, pp. 300–301 (2010)
Chapter 4
Hybrid VCO Based 0-1 MASH and Hybrid
SAR
4.1 Introduction
A. Sanyal
State University of New York at Buffalo, Buffalo, NY, USA
e-mail: arindams@buffalo.edu
W. Guo
Intel Corporation, Austin, TX, USA
e-mail: wenjuan.guo@intel.com
N. Sun
The University of Texas at Austin, Austin, TX, USA
e-mail: nansun@mail.utexas.edu
VCO ADC can also provide intrinsic first-order quantization noise shaping, and thus a ring VCO can be used as a first-order ΔΣ ADC. However, the frequency tuning gain of a ring VCO is highly nonlinear and is sensitive to variations in process, voltage, and temperature (PVT), which seriously undermines the accuracy and robustness of VCO-based ADCs. In addition, existing VCO-based ADCs usually have energy efficiency around 100 fJ/conversion-step [5–11]. Recent research has tried to address the PVT sensitivity of VCO-based ADCs by using a pipelined, hybrid VCO ADC with digital background nonlinearity calibration [12, 13] and embedding a first-order VCO inside a loop [14–16].
This work presents a scaling-friendly and energy-efficient hybrid 0-1 MASH ADC [17]. A coarse 8-bit first-stage SAR ADC is combined with a fine VCO which acts as the second stage. The VCO is effective at quantizing small voltages in the time domain. Since the VCO sees only a small SAR residue, VCO nonlinearity
is greatly suppressed, and no nonlinearity correction is needed. The VCO cancels
out SAR quantization noise as well as the comparator thermal noise. Thus, the
quantization noise at the ADC output comes only from the VCO and is first-order
shaped. The PVT variation of VCO tuning gain can still cause SAR quantization
noise leakage into the output and degrade SNDR. To address this issue, a simple
digital background calibration technique is developed. The proposed background
calibration technique enables precise tracking of the VCO gain and results in 12-bit
ADC linearity.
While the proposed SAR + VCO architecture can be used as a high-resolution ADC, there have been efforts to develop highly digital ADCs which do not require calibration. The first noise-shaping (NS) SAR ADC was published in [18], but it still needs an OTA to realize a first-order noise transfer function (NTF) zero at z = 0.64. It also requires a finite impulse response (FIR) DAC that introduces extra noise and increases chip area. Later, a fully passive first-order NS SAR ADC was published in [19]. It obviates the need for any OTA, but its noise-shaping performance is very limited, as its NTF zero is located at 0.5 rather than 1. Moreover, its input signal is attenuated by two times during normal conversion, leading to a 6-dB penalty in SNR or quadrupled analog power for the same SNR. In addition, it requires double capacitance, increasing chip area. We propose a novel NS SAR architecture [20] that is simple, robust, and low power. It gets rid of OTA-based active integrators and realizes an NTF zero at 0.75 with a passive integrator. The passive integrator only requires one switch and two capacitors. The zero location is fully determined by the capacitor ratio, which is insensitive to PVT variations and ensures the robustness of the architecture. Compared to [18] (z = 0.64) and [19] (z = 0.5), the proposed NS SAR ADC achieves the best noise-shaping performance with a zero closest to 1. Figure 4.1 compares the NTF magnitude with zeros from [18, 19] and this work. As can be seen, this work achieves around 3 dB more in-band attenuation than [18] and 6 dB more in-band attenuation than [19]. Furthermore, the proposed architecture does not cause any signal attenuation and requires less capacitance than [19]. With minimum modification to the original SAR ADC architecture, the proposed NS SAR ADC altogether shapes the quantization noise, comparator noise, and DAC noise with an NTF of $(1 - 0.75 z^{-1})$. It allows the use of a low-resolution DAC and relaxes the requirement on comparator noise, making it possible to reach high resolution and high power efficiency simultaneously.
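The in-band attenuation figures can be checked by evaluating a first-order NTF of the form 1 − a·z^-1 for the three zero locations. The sketch below is a minimal check at low frequency. The function name and the normalized frequency argument (frequency divided by the sampling rate) are choices made for this illustration.

```python
import cmath
import math


def ntf_magnitude_db(zero, f_norm):
    """Magnitude in dB of the NTF (1 - zero * z^-1) at normalized frequency f/fs."""
    z_inv = cmath.exp(-2j * math.pi * f_norm)
    return 20 * math.log10(abs(1 - zero * z_inv))


for label, zero in (("[18]", 0.64), ("[19]", 0.5), ("this work", 0.75)):
    print(f"{label:10s} zero = {zero:4.2f}  NTF at DC = {ntf_magnitude_db(zero, 0.0):6.2f} dB")
```

At DC the magnitudes are 1 − a, so a zero at 0.75 gives 20·log10(0.36/0.25), about 3.2 dB more attenuation than a zero at 0.64, and 20·log10(0.5/0.25), about 6 dB more than a zero at 0.5.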
Fig. 4.1 NTF magnitude comparisons with zeros from [18, 19] and this work
The rest of the paper is organized as follows. Section 4.2 discusses the proposed SAR + VCO hybrid ADC in detail along with chip measurement results. The noise-shaping hybrid SAR ADC is presented in Sect. 4.3 along with chip measurement results. Conclusions are drawn in Sect. 4.4.
The proposed SAR + VCO ADC is shown in Fig. 4.2 along with the timing diagram. An 8-bit SAR forms the first stage of the ADC. During the phase φ1, the input is sampled differentially on the bottom plates of the capacitive DAC array. The bottom-plate sampling switches are closed during φ1e and are opened slightly before the input sampling switches are opened. The sampled input is quantized by the SAR during the phase φ2. After quantization, the first-stage residue is fed to a pseudo-differential dual VCO during the φ3 phase. The dual VCOs perform phase-domain integration of the residue from the SAR stage. The VCO differential output is differentiated digitally before being combined with SAR output to generate the final ADC output.
To reduce switching power in the capacitive DAC, the bidirectional single-sided
switching scheme of [2] has been adopted. The switching scheme is illustrated for a
3-bit capacitive DAC in Fig. 4.3. The SAR DAC has no redundancy as the second-
stage VCO can absorb decision errors in the SAR stage as long as the error is not
so large as to cause phase overflow in the VCO stage. The VCO stage relaxes the
precision requirement of the SAR stage and thus a low power comparator can be
used in the SAR stage. The SAR in turn reduces the VCO swing and obviates the
need for any VCO nonlinearity calibration.
The circuit diagram of the second stage is shown in Fig. 4.4. Each VCO consists of a seven-stage pseudo-differential ring inverter chain. The VCO performs phase-domain integration during the clock phase φ3, but the VCO is not stopped during the clock phases φ1 and φ2. This is to prevent charge leakage which can corrupt the phase information held by the VCOs. Instead, the VCOs are biased with Icm and run at a fixed frequency when φ3 is low. Icm is kept low to save power and reduce phase noise during this VCO idle phase. The digital logic for the second stage runs at the ADC sampling frequency. There is no phase-overflow counter running at high VCO frequency, which significantly lowers the power consumption of the second stage compared to [11].
The ADC model is shown in Fig. 4.5. G represents the SAR residue voltage
attenuation due to parasitic capacitors, Kvco is the VCO tuning gain, and Gd is the
digital gain used to scale the VCO output d2 before combining with SAR output d1.
Rn is the pseudo-random sequence used to dither the VCO stage. The ADC output
can be written as
\[ d_{out} = V_{in} + (q_1 - R_n)\left(1 - \frac{G K_{vco}}{G_d}\right) + \frac{q_2\,(1 - z^{-1})}{G_d} \qquad (4.1) \]
If the digital gain Gd is set equal to the analog interstage gain GA = GKvco, both Rn
and SAR conversion error q1 can be canceled at the output. The SAR comparator
thermal noise is also removed at the output. Thus, the second stage allows the use
of a low power comparator in the first stage. SAR quantization and comparator
thermal noise as well as Rn can only be canceled at the output if the VCO stage
has sufficient linearity. The final quantization noise at the ADC output comes solely
from the VCO, q2, and is first-order shaped. Thermal noise at the ADC output comes
from VCO phase noise and kT/C noise, which is much smaller than the VCO
phase noise. Thus, as long as the VCO is linear, the ADC resolution depends only
on the VCO stage and is independent of the SAR resolution. An 8-bit SAR is chosen
to ensure that the VCO is sufficiently linear.
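The cancellation described by (4.1) can be checked with a short behavioral sketch. The model below is an illustration only, not the authors' simulation: the function name `sar_vco_mash`, the step sizes `sar_lsb` and `vco_lsb`, and the ideal rounding and floor quantizers are assumptions. Only the signal flow (SAR quantization with dither Rn, phase-domain integration of the residue, digital differentiation, and scaling by Gd) follows the text.

```python
import numpy as np

def sar_vco_mash(vin, rn, g_kvco, g_d, sar_lsb=2.0**-7, vco_lsb=2.0**-4):
    """Behavioral 0-1 MASH sketch: returns d_out = d1 + d2 / g_d."""
    vin = np.asarray(vin, dtype=float)
    rn = np.asarray(rn, dtype=float)
    # First stage: ideal SAR quantizer with dither, so d1 = vin + q1 - rn.
    d1 = np.round((vin - rn) / sar_lsb) * sar_lsb
    # Residue handed to the VCO stage (equals rn - q1).
    residue = vin - d1
    # VCO integrates the scaled residue in the phase domain.
    phase = np.cumsum(g_kvco * residue)
    # Phase quantizer adds the second-stage error q2.
    phase_q = np.floor(phase / vco_lsb) * vco_lsb
    # Digital differentiation gives d2 = g_kvco * residue + q2 * (1 - z^-1).
    d2 = np.diff(phase_q, prepend=0.0)
    return d1 + d2 / g_d
```

With `g_d` equal to `g_kvco`, the output error reduces to the differentiated phase-quantizer error, so its running sum stays within one `vco_lsb`. With a gain mismatch, the dither and the SAR error leak through and the running sum drifts.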
Any mismatch between GA and GD will result in SAR quantization noise,
comparator thermal noise, and Rn leaking to the output, which will raise the in-band
noise floor and increase distortion. To ensure GA = GD, we digitally adjust GD such that
where
The speed of convergence of the calibration algorithm depends on the IIR filter
bandwidth. It can be seen from (4.2) and (4.3) that the dominant noise source
at the input to the IIR filter is the SAR quantization noise q1. Rn is a deterministic
signal and does not affect the IIR filter bandwidth. If the SAR resolution is increased
by 1 bit, its quantization noise power reduces by a factor of 4, and the IIR filter
bandwidth can be increased by four times, resulting in four times faster calibration
convergence. This can also be seen from Fig. 4.7. As the SAR resolution is increased
from 6 to 7 bits, the IIR filter bandwidth increases by a factor of 4, and this reduces
the calibration convergence time by a factor of 4.
The SAR + VCO ADC is implemented in 40 nm CMOS. The chip photo is shown
in Fig. 4.8. The core circuit occupies an area of 0.03 mm². The prototype consumes
350 µW from a 1.1 V supply while operating at a frequency of 36 MHz. Out of the
350 µW total power, 100 µW is consumed by the VCO, SAR comparator, and
capacitive DAC switching, while the remaining 250 µW is consumed by the
digital circuitry including the pseudo-random number generator, SAR logic, clock
generator, and VCO digital logic.
The measured spectrum with a 2.2 V differential input at 500 kHz frequency is
shown in Fig. 4.9. The ADC quantization noise comes from the VCO stage and is
first-order shaped. At an OSR of 9, the ADC SNDR without calibration is 64.5 dB,
while calibration improves the SNDR to 74.3 dB. Background calibration also
improves the SFDR from 68 to 81 dB. Even-order distortion in the ADC spectrum
after background calibration comes from mismatch in the capacitive DAC in the
first-stage SAR.
[Fig. 4.9 plot: amplitude (dBFS) versus frequency (f/fs), without and with calibration; BW = 2 MHz, SNDR = 64.5 dB without calibration; 20 dB/dec noise slope]
Figure 4.10 shows the SNDR and SNR of the ADC versus input amplitude. The
ADC has a dynamic range of 75.7 dB. The measured histogram of d2 is shown in
Fig. 4.11. The shift in d2 distribution for Rn = 1 to Rn = 0 can be clearly seen from
Fig. 4.11. The difference between d2(Rn = 1) and d2(Rn = 0) gives the interstage
gain GKvco as 1.3.
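The gain estimate read off the histogram can be mimicked numerically. The sketch below is an illustration under stated assumptions, not the measurement procedure used for the prototype: it uses a simplified stage model (ideal rounding SAR, floor phase quantizer), assumes the dither amplitude `dither` is known and equal to one SAR LSB, and all function names are hypothetical. It estimates GKvco as the difference between the mean of d2 for Rn = 1 and for Rn = 0, divided by the dither amplitude.

```python
import numpy as np

def simulate_d2(vin, rn_bits, g_kvco, dither, sar_lsb, vco_lsb):
    """Second-stage output d2 of a simplified SAR + VCO model (assumed)."""
    # SAR quantizes the dithered input; the residue is vin - d1.
    d1 = np.round((vin - rn_bits * dither) / sar_lsb) * sar_lsb
    residue = vin - d1
    # Phase integration, phase quantization, digital differentiation.
    phase_q = np.floor(np.cumsum(g_kvco * residue) / vco_lsb) * vco_lsb
    return np.diff(phase_q, prepend=0.0)

def estimate_interstage_gain(d2, rn_bits, dither):
    """Shift of the d2 mean between Rn = 1 and Rn = 0, per unit of dither."""
    shift = d2[rn_bits == 1].mean() - d2[rn_bits == 0].mean()
    return shift / dither
```

The estimate is statistical: it averages out the SAR quantization error, so its accuracy improves with the number of samples.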
Figure 4.12 shows the calibration convergence speed of the proposed
SAR + VCO ADC. The proposed background calibration has a very fast
convergence and requires only 10³ samples (or 25 µs) to converge. This is because
the SAR quantization noise, which is a primary source of perturbation in the
background calibration loop, is substantially attenuated by the first-stage 8-bit
SAR [21].
[Fig. 4.12 plot: SNDR (dB) versus number of samples]
Figure 4.13 shows the architecture of the proposed first-order NS SAR ADC.
Compared to conventional SAR operation, two more clock cycles, φns0 and φns1,
are added. Before the φns0 cycle, the SAR ADC performs the normal conversion.
Different from [19], there is no capacitor connected to the Vres node during normal
conversion, and, thus, the signal attenuation problem is avoided. To realize
first-order noise shaping, the key is to integrate the residual voltage Vres and feed
it back to the comparator input. During the φns0 cycle, a small capacitor, C2 = C/3,
is merged with the DAC capacitor, C1 = C, to get the residue voltage, Vres. At the
end of the φns0 cycle, C2 will carry 0.75 Vres. In the following φns1 cycle, C2 dumps
its charge onto another capacitor, C3 = C, effectively realizing a passive integration.
The voltage integrated on C3 is labeled as Vint, which is fed back to the comparator
input. Now the comparator has two input paths, one of which is connected to Vres,
while the other is connected to Vint. However, there is a limitation with passive
integration that only a fraction of Vres is integrated, which degrades the
noise-shaping performance. It seems that OTAs are still required to provide a gain
to compensate the attenuation of Vres. Fortunately, as the comparator's result is a
1-bit sign, what is required here is only a relative gain between Vint and Vres, which
can be realized by simply sizing the comparator input transistors correspondingly.
As shown in Fig. 4.13, to provide a gain of 4 on the Vint path for a proper NTF, we
size its corresponding input transistors 4 times larger than those of the Vres path.
After the φns1 cycle, the charge on C2 is cleared in the next φs cycle to be ready for
getting the new residual voltage. In the real implementation, a mode signal is used
to pull down Vint to ground so that the SAR ADC can be easily reconfigured to the
conventional mode in case of Nyquist-rate applications. Additionally, foreground
calibration of DAC mismatch can also be conducted in the Nyquist mode.
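The two charge-sharing steps can be traced with a few lines of arithmetic. The following is a minimal sketch that assumes ideal capacitors and switches and ignores parasitics and noise; the helper names `share` and `passive_integrate` are hypothetical, and capacitances are expressed in units of C.

```python
def share(c_a, v_a, c_b, v_b):
    """Voltage after two capacitors are connected in parallel (charge is conserved)."""
    return (c_a * v_a + c_b * v_b) / (c_a + c_b)

def passive_integrate(v_res, v_int_prev, c1=1.0, c2=1.0 / 3.0, c3=1.0):
    """One noise-shaping update: returns (voltage on C2, new Vint)."""
    # First extra cycle: the cleared C2 is merged with the DAC capacitor C1.
    v_c2 = share(c1, v_res, c2, 0.0)
    # Second extra cycle: C2 dumps its charge onto the integration capacitor C3.
    v_int = share(c2, v_c2, c3, v_int_prev)
    return v_c2, v_int
```

With C2 = C/3, a unit residue leaves 0.75 on C2 and adds 0.1875 to Vint, while a previously stored Vint is retained with a factor of 0.75. This retained fraction is what makes the integration lossy and motivates the relative gain of 4 on the Vint path.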
4.3.2 Analysis
Fig. 4.14 General signal flow diagram of the proposed NS SAR ADC assuming C1 = C3 = C,
C2 = a/(1 − a)·C, and the integration path gain of g
Fig. 4.15 Nonideal effects in the proposed NS SAR ADC with the integration path gain g = 1/a
and thus, is insensitive to PVT variations. To ensure stability, the pole needs to be
within the unit circle. The stability condition is shown in Fig. 4.14. Given that the
current stability condition is −4/3 < g < 28/3, g = 4 determined by the comparator
input transistor ratio is very far from the unstable boundary. Therefore, the proposed
NS SAR architecture is highly robust.
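The quoted stability range can be reproduced from a simplified linear loop model. The sketch below is an assumption, not the analysis of Fig. 4.14: it supposes that the comparator sees Vres plus g·Vint and that the passive integrator updates as Vint(n) = (1 − a)·Vint(n − 1) + a(1 − a)·Vres(n − 1), which leads to NTF(z) = (1 − (1 − a)z⁻¹)/(1 − (1 − a)(1 − a·g)z⁻¹). The function names are hypothetical.

```python
from fractions import Fraction

def ntf_zero(a):
    """Zero of the assumed NTF, set only by the capacitor ratio a."""
    return 1 - a

def ntf_pole(a, g):
    """Pole of the assumed NTF for integration-path gain g."""
    return (1 - a) * (1 - a * g)

def stable_gain_range(a):
    """Open interval of g that keeps |pole| < 1."""
    bound = 1 / (1 - a)
    return (1 - bound) / a, (1 + bound) / a
```

For a = 1/4 (C2 = C/3), this model places the zero at 0.75, puts the pole at the origin for g = 4, and returns the interval −4/3 < g < 28/3, consistent with the values quoted in the text.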
With g = 1/a, Fig. 4.15 further investigates the nonideal effects including
thermal noises and DAC mismatch errors in the flow. n1 is the kT/C sampling noise
which directly adds to the input signal. n2 is the noise voltage on C2 at the end of
Fig. 4.17 Measured output spectrum with a 95.37 kHz, −2 dBFS sinusoidal input
Fig. 4.19a, with OSR doubled, SNDR increases by 6 dB, which matches the NTF
of (1 − 0.75z⁻¹). Therefore, according to FoMS = SNDR + 10 log10(BW/Power),
the FoMS increases by 3 dB with OSR doubled. As shown in Fig. 4.19b, when OSR
is 8, the chip achieves a FoMS of 167 dB.
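The 3 dB step follows directly from the FoM definition. The helper below is a small sketch of that arithmetic; the function name and the example numbers in the usage note are illustrative assumptions, not measured values.

```python
import math

def schreier_fom_db(sndr_db, bw_hz, power_w):
    """Schreier figure of merit: SNDR + 10*log10(BW / Power), in dB."""
    return sndr_db + 10.0 * math.log10(bw_hz / power_w)
```

Doubling the OSR at a fixed sampling rate and power halves the bandwidth, which costs about 3 dB in the logarithmic term, while the SNDR gains 6 dB. For any starting point, for example `schreier_fom_db(70.0, 1e5, 50e-6)` versus `schreier_fom_db(76.0, 5e4, 50e-6)`, the net change is therefore about +3 dB.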
Fig. 4.19 With different OSRs: (a) measured SNDR and (b) Schreier FoM
4.4 Conclusions
Two highly digital, hybrid ADCs are presented in this work. The SAR + VCO
0-1 MASH architecture performs time-domain quantization of the analog input
signal and uses a simple, digital background calibration technique to address the PVT
sensitivity of the VCO's tuning gain. The SAR + VCO achieves an energy efficiency of
18.5 fJ/conversion-step (Schreier FoM of 171 dB), which is the best among
VCO-based ADCs. The hybrid NS SAR ADC achieves first-order noise shaping by
using passive integrators. The NTF zero location is determined by the ratio of
capacitors, thus making the proposed architecture immune to PVT variations and
highly robust. Compared to prior NS SAR ADC works, it gives the best
noise-shaping performance with a zero closest to 1 and achieves a Schreier FoM of 167 dB.
The SAR + VCO and NS SAR architectures achieve high resolution at low power
by either canceling or first-order shaping the quantization and thermal noise of the
SAR stage. The SAR + VCO architecture cancels noise from the SAR stage by
using a fine-resolution VCO ADC but requires background calibration for accurate
estimation of the interstage gain. The NS ADC does not use a second quantizer but uses
a passive integrator to first-order shape noise from the SAR stage. Thus, the NS
SAR ADC has the advantage of much better PVT insensitivity over the SAR + VCO
architecture and does not require any background calibration. Compared to the
SAR + VCO architecture, the NS SAR ADC introduces additional kT/C noise due
to the capacitors C2 and C3 in the passive integrator. The order of noise shaping in
the NS SAR ADC can be easily increased by increasing the order of the passive
integrator and adding more paths to the comparator at the cost of increased kT/C
noise and comparator thermal noise.
References
1. Sanyal, A., Sun, N.: An energy-efficient low frequency-dependence switching technique for
SAR ADCs. IEEE Trans. Circuits Syst. II Express Briefs 61(5), 294–298 (2014)
2. Chen, L., Sanyal, A., Ma, J., Sun, N.: A 24-µW 11-bit 1-MS/s SAR ADC with a bidirectional
single-side switching technique. In: IEEE European Solid-State Circuits Conference, Venice
Lido, pp. 219–222 (2014)
3. Chen, L., Tang, X., Sanyal, A., Yoon, Y., Cong, J., Sun, N.: A 10.5-b ENOB 645 nW 100kS/s
SAR ADC with statistical estimation based noise reduction. In: IEEE Custom Integrated
Circuits Conference, San Jose, pp. 1–4 (2015)
4. Chen, L., Tang, X., Sanyal, A., Yoon, Y., Cong, J., Sun, N.: A 0.7V 0.6µW 100kS/s low-power
SAR ADC with statistical estimation based noise reduction. IEEE J. Solid State Circuits 52(5),
1388–1398 (2017)
5. Park, M., Perrott, M.: A 78 dB SNDR 87 mW 20 MHz bandwidth continuous-time ΔΣ ADC
with VCO-based integrator and quantizer implemented in 0.13 µm CMOS. IEEE J. Solid State
Circuits 44(12), 3344–3358 (2009)
6. Taylor, G., Galton, I.: A mostly-digital variable-rate continuous-time delta-sigma modulator
ADC. IEEE J. Solid State Circuits 45(12), 2634–2646 (2010)
7. Straayer, M.Z., Perrott, M.H.: A 12-bit, 10-MHz bandwidth, continuous-time ΣΔ ADC with a
5-bit, 950-MS/s VCO-based quantizer. IEEE J. Solid State Circuits 43(4), 805–814 (2008)
8. Reddy, K., Rao, S., Inti, R., Young, B., Elshazly, A., Talegaonkar, M., Hanumolu, P.K.:
A 16-mW 78-dB SNDR 10-MHz BW CT ΔΣ ADC using residue-cancelling VCO-based quantizer.
IEEE J. Solid State Circuits 47(12), 2916–2927 (2012)
9. Reddy, K., Dey, S., Rao, S., Young, B., Prabha, P., Hanumolu, P.K.: A 54mW 1.2 GS/s 71.5 dB
SNDR 50MHz BW VCO-based CT ΔΣ ADC using dual phase/frequency feedback in 65 nm
CMOS. In: IEEE Symposium on VLSI Circuits, Kyoto, pp. C256–C257 (2015)
10. Rao, S., Young, B., Elshazly, A., Yin, W., Sasidhar, N., Hanumolu, P.K.: A 71dB SFDR
open loop VCO-based ADC using 2-level PWM modulation. In: IEEE Symposium on VLSI
Circuits, Kyoto (2011)
11. Sanyal, A., Ragab, K., Chen, L., Viswanathan, T., Yan, S., Sun, N.: A hybrid SAR-VCO ΔΣ
ADC with first-order noise shaping. In: IEEE Custom Integrated Circuits Conference, San Jose,
pp. 1–4 (2014)
12. Ragab, K., Sun, N.: A 12b ENOB, 2.5 MHz-BW, 4.8 mW VCO-based 0-1 MASH ADC
with direct digital background nonlinearity calibration. In: IEEE Custom Integrated Circuits
Conference, San Jose, pp. 1–4 (2015)
13. Ragab, K., Sun, N.: A 12b ENOB, 2.5 MHz-BW, 4.8 mW VCO-based 0-1 MASH ADC with
direct digital background nonlinearity calibration. IEEE J. Solid State Circuits 52(2), 433–447
(2017)
14. Lee, K., Yoon, Y., Sun, N.: A scaling-friendly low-power small-area ΔΣ ADC with
VCO-based integrator and intrinsic mismatch shaping capability. IEEE J. Emerg. Sel. Top. Circuits
Syst. 5(4), 561–573 (2015)
15. Yoon, Y., Lee, K., Hong, S., Tang, X., Chen, L., Sun, N.: A 0.04-mm² 0.9-mW 71-dB SNDR
distributed modular ΔΣ ADC with VCO-based integrator and digital DAC calibration. In: IEEE
Custom Integrated Circuits Conference, San Jose, pp. 1–4 (2015)
16. Li, S., Sun, N.: A 174.3 dB FoM VCO-based CT ΔΣ modulator with a fully digital phase
extended quantizer and tri-level resistor DAC in 130nm CMOS. In: IEEE European Solid-State
Circuits Conference, Lausanne, pp. 241–244 (2016)
17. Sanyal, A., Sun, N.: A 18.5-fJ/step VCO-based 0-1 MASH ADC with digital background
calibration. In: IEEE Symposium on VLSI Circuits, Honolulu, pp. 1–2 (2016)
18. Fredenburg, J., Flynn, M.: A 90-MS/s 11-MHz-bandwidth 62-dB SNDR noise-shaping SAR
ADC. In: IEEE ISSCC Digest of Technical Papers, San Francisco, pp. 468–470 (2012)
19. Chen, Z., Miyahara, M., Matsuzawa, A.: A 9.35-ENOB, 14.8 fJ/conv.-step fully-passive
noise-shaping SAR ADC. In: IEEE Symposium on VLSI Circuits Digest, Kyoto, pp. C64–C65 (2015)
20. Guo, W., Sun, N.: A 12b-ENOB 61µW noise-shaping SAR ADC with a passive integrator.
In: IEEE European Solid-State Circuits Conference, Lausanne, pp. 405–408 (2016)
21. Sun, N., Lee, H.-S., Ham, D.: A 2.9-mW 11-b 20-MS/s pipelined ADC with dual-mode-based
digital background calibration. In: IEEE European Solid State Circuits Conference, Bordeaux,
pp. 269–272 (2012)
22. Kauffman, J., Witte, P., Lehmann, M., Becker, J., Manoli, Y., Ortmanns, M.: A 72 dB DR, CT
ΔΣ modulator using digitally estimated, auxiliary DAC linearization achieving 88 fJ/conv-step
in a 25 MHz BW. IEEE J. Solid State Circuits 49(2), 392–404 (2014)
23. Lee, H., Hodges, D., Gray, P.: A self-calibrating 15 bit CMOS A/D converter. IEEE J. Solid
State Circuits 19(6), 813–819 (1984)
Chapter 5
A Hybrid Architecture for a Reconfigurable
SAR ADC
5.1 Introduction
quantization. With the interstage gain, GA, the quantization step in the fine ADC
can be enlarged to relax the comparator design. Therefore, only an accurate DAC
(DC = VC) and an accurate gain stage (VR = DR) are required to keep DO = VI.
This implies that the design requirement of the fine ADC is further shifted to the
gain stage, where the analog gain, GA, needs to match well with the ideal digital
gain, GD.
Once the circuit noise and device matching requirements are both shifted to
the DAC arrays and gain stages, the ADC performance is now determined by the
mismatch of the passive devices and the performance of the operational amplifiers
in these blocks. If there is no strict power and speed constraint on the operational
amplifiers, device matching eventually limits the ADC resolution to 10–12 bits, as
exemplified by the regime of pipelined and SAR ADCs shown in Fig. 5.2.
A ΔΣ modulator with a 1-bit quantizer is the only architecture which can achieve
high resolution regardless of device mismatch. It is based on the concept of a
negative feedback loop, where the feedback signal is equivalent to the input signal
as long as the loop gain is high. If an ADC is put inside the loop, also shown in
Fig. 5.3, its quantization and comparator errors will be suppressed by the high loop
gain and can be ignored if the gain is high enough. This feature allows the use of a
low-resolution ADC to digitize the signal, and therefore, either an inherently linear
1-bit DAC or a low-resolution DAC with dynamic element matching (DEM) can be
adopted in the feedback path to realize an accurate DAC such that the digital output
is equivalent to the feedback signal (DO = VO) and also the input signal, VI. The
high loop gain in ΔΣ modulators is usually realized by loop filters with poles at the
band of interest. Since it is challenging to maintain a high loop gain over a wide
bandwidth, the effective conversion rate of ΔΣ modulators is usually limited by the
sampling rate achievable by the operational amplifiers.
Through this review of the fundamental ADC architectures, it can be observed
that the requirements of device mismatch (SFDR) and circuit noise (SNR) on
comparators are gradually shifted to DAC arrays and gain stages. Therefore, in
modern ADC designs, comparator noise (in subranging ADCs), DAC linearity (in
subranging and pipelined ADCs), and operational amplifier power consumption (in
pipelined and ΔΣ ADCs) are the most critical challenges to be overcome by design
innovations.
technology and design techniques, such as asynchronous timing control [5], SAR
ADCs have been able to resolve more bits than flash ADCs with much lower power
and area cost at an acceptable transport delay.
Figure 5.4 lists several examples to demonstrate the advantages of SAR-assisted
hybrid architectures. For the subranging ADCs, resolving more bits in the coarse
ADCs reduces the residue swing and enables more options to realize the fine
ADCs. The zoom ADC [6] uses a 1-bit ΔΣ modulator as the fine ADC with
relaxed requirement on operational amplifier bandwidth to achieve an ultralow
power consumption. The SAR-VCO ADC [7] takes advantage of the small
quantization steps of a VCO ADC without suffering from its limited linear input
range. The SAR-assisted digital slope ADC [8] is able to achieve 100 MS/s since
the time period needed to quantize the small residue in the linear slope stage is
considerably reduced. In these subranging-based hybrid architectures, the demands
for low-noise comparators and high-performance operational amplifiers are avoided.
For the pipelined architecture, additional bits in the coarse SAR ADC reduce
the input swing of the gain stage. In the SAR-assisted pipelined ADC [10], the
operational amplifier bandwidth and linearity requirements in the gain stage are
greatly relaxed, and SAR ADCs can also be used as the fine ADC without strict
comparator noise constraint. As to the ΔΣ architecture, a multi-bit SAR quantizer
[11] can be used to reduce the quantization error and further lower the required loop
gain and sampling rate, leading to relaxed operational amplifier design constraints.
Figure 5.4 also includes two examples of hybrid ADCs [9, 12], which are not SAR-
assisted, to demonstrate the idea of choosing different types of sub-ADCs depending
on the particular design target.
A hybrid ADC can also be a SAR-based design. For example, the most significant
bits (MSBs) of a SAR ADC can be resolved by a flash ADC [13] to enhance
the conversion rate. In a noise-shaping SAR ADC [14], the residue after the SAR
operation is sampled and used to shape both the comparator and quantization noise
out of the band of interest, like ΔΣ modulators. This architecture relaxes the
requirements on the comparator noise and reduces the number of required SAR
cycles. It is especially interesting because the ADC can be switched between SAR
and ΔΣ modes by disabling and enabling the noise-shaping function.
Although hybrid architectures can effectively overcome the challenges of
comparator noise and operational amplifier design, the issue of DAC nonlinearity
remains. Therefore, for high-resolution applications, ΔΣ-based designs are still
required to allow the use of low-resolution DACs. For example, the ΔΣ modulator
with SAR quantizer [11] uses a digital ΔΣ modulator to reduce the number of bits for
the feedback DAC. Also, although a noise-shaping SAR ADC can behave like a ΔΣ
modulator, its achievable resolution is still limited by the DAC linearity.
cost with the number of bits, DEM is seldom used for DACs with >6-bit resolution.
To extend the application of DEM, the segmented noise-shaped scrambling scheme
[16], as shown in Fig. 5.5, is the only solution reported so far capable of linearizing
high-resolution DACs without calibration. It uses digital ΔΣ modulation to separate
the (M + N)-bit digital input, DI, into the M-bit MSB of DI + (1 − z⁻¹)Q and the
(N + 1)-bit least significant bit (LSB) of −(1 − z⁻¹)Q, where Q is the quantization
error of the digital ΔΣ modulator. After conversion by two thermometer-coded DAC
arrays with DEM, the first-order shaped quantization error is cancelled in the analog
domain. By doing so, the mismatch between the two DAC arrays only results in a
small residue of (1 − z⁻¹)Q within the band of interest.
The segmented noise-shaped scrambling scheme is very effective for
high-resolution DACs. However, the digital ΔΣ modulator prohibits its application in
SAR ADCs, since the modulator needs to receive all the digital bits before DAC
switching, which is inconsistent with SAR operation. The transport delay of the
digital ΔΣ modulator is also not suitable for the feedback loop in ΔΣ modulators.
Therefore, this technique is seldom used in ADC designs, and how to linearize
high-resolution DACs in ADCs (besides calibration) is still one of the fundamental
challenges.
In this section, a new technique, DAC mismatch error shaping (MES) [17], is
introduced in detail to directly address the challenge of DAC linearity in SAR ADCs
and to enable an SFDR of over 100 dB without calibration. The idea of MES originates
from the definition of DAC mismatch error. DAC mismatch error generally refers to
the mismatch between the DAC cells, but more exactly, it is the mismatch between
the DAC cells and their corresponding digital weights. Taking the subranging
architecture in Fig. 5.3 as an example, DAC mismatch errors occur if the analog
value, VC, subtracted from the analog input, does not agree with the digital weights,
DC, added in the digital domain.
Fig. 5.6 Definition of DAC mismatch errors with MSB cell as reference
To precisely define DAC mismatch error, the relationship between the digital
weights and the analog values should first be set by a reference. For example, in
digital calibration, the digital weights of the MSB cells are measured and redefined
by the following backend ADC [18]. That is, the LSB DAC is considered to have an
ideal digital weight and is taken as a reference to define the digital weights of the
MSBs. In DEM, the digital weight of a DAC cell corresponds to the average value
of all the DAC cells, so that the summation of all the DAC mismatch errors is zero,
leading to a notch at DC in the output spectrum.
In the MES technique, the MSB cell is taken as the reference to define the
mismatch error, as illustrated in Fig. 5.6, where the 10-bit SAR ADC consists of
a binary-weighted DAC array, 2⁹C–2⁰C, with corresponding digital weights, 2⁹–2⁰.
In practice, each capacitor deviates from its ideal value due to process variation, as
noted by the error terms e9–e0. To further define the mismatch error, the MSB cell is
used as a reference to set C′ = (512C + e9)/512. Once C′ is applied to all the DAC
cells, the new error terms, e8′–e0′, are the mismatches between the MSB cell and
the LSB cells. That is, the MSB cell is considered perfectly matched with its digital
weight, i.e., 512C′ (analog value) = 512 (digital weight), and only the LSB cells contain
mismatch errors relative to their digital weights.
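The change of reference can be made concrete with a small numerical sketch. The helper below is illustrative only: the function name `mes_reference` and the example capacitor values are assumptions, and capacitances are expressed in units of the ideal unit capacitor C.

```python
import numpy as np

def mes_reference(caps):
    """Re-reference DAC errors to the MSB cell.

    caps: actual capacitor values ordered from MSB to LSB.
    Returns (c_ref, e_prime), where c_ref is the unit capacitor implied by the
    MSB cell and e_prime holds the errors relative to weight * c_ref.
    """
    caps = np.asarray(caps, dtype=float)
    # Binary digital weights 2^(N-1) ... 2^0.
    weights = 2.0 ** np.arange(len(caps) - 1, -1, -1)
    # The MSB cell defines the unit: c_ref = (2^(N-1) * C + e_msb) / 2^(N-1).
    c_ref = caps[0] / weights[0]
    # Remaining mismatch of every cell against its digital weight.
    e_prime = caps - weights * c_ref
    return c_ref, e_prime
```

By construction the first entry of `e_prime` is zero, so all of the mismatch is attributed to the LSB cells, which is the property the MES technique relies on.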
88 Y.-S. Shu et al.
Fig. 5.7 Behavioral model of SAR ADC operation with DAC mismatch error shaping in gray area
Based on this definition, the SAR ADC operation can be modeled mathematically
as depicted in Fig. 5.7, excluding the areas in gray for the moment. Ideally, the
operations in the analog and digital domains should be perfectly matched such that
the digital output equals the analog input exactly except for a small quantization
error. In Fig. 5.7, the value of the MSB cell switching, DACMSB(n), is first subtracted
from the analog input. Then, the sum of the LSB cell switching, DACLSBs(n),
along with the LSB mismatch error, E(n), is subtracted from the signal, where
E(n) represents a combination of e8′–e0′ depending on the LSB codes. After the
corresponding digital weights of the LSBs, DLSBs(n) (= DACLSBs(n)), and the digital
weight of the MSB, DMSB(n) (= DACMSB(n)), are added in the digital domain to
reconstruct the signal, the LSB code-dependent error sequence, E(n), appears at
the digital output. This signal-dependent error results in harmonic distortions in the
output spectrum.
To reduce the harmonic distortions, the principle of the MES technique is to
duplicate the error incurred in the previous cycle, E(n − 1), and use it to cancel
the error in the present cycle, E(n). The cancellation is not exact but does generate
a (1 − z⁻¹) high-pass filtering effect on the error sequence. In other words, if the
error sequence, E(n), is subtracted from itself with a one-cycle delay, E(n − 1), the
low-frequency components remain the same and are removed after subtraction. The
gray area of Fig. 5.7 shows that an intuitive way to duplicate the previous error with
opposite polarity is to reverse the operation of the LSBs in the previous cycle. By adding
back the previous LSB value to the analog input and then removing it at the digital
output, the error in the previous cycle, E(n − 1), is generated at the digital output to
cancel E(n).
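The high-pass effect of subtracting the delayed error can be illustrated on an arbitrary sequence. This is a minimal sketch of the (1 − z⁻¹) operation only, not a model of the SAR ADC itself; the function name `mes_shape` and the zero initial condition are assumptions.

```python
import numpy as np

def mes_shape(e):
    """Return E(n) - E(n-1), taking E(-1) = 0."""
    e = np.asarray(e, dtype=float)
    delayed = np.concatenate(([0.0], e[:-1]))  # one-cycle delay
    return e - delayed
```

A constant error is removed completely after the first sample, and a slow tone at normalized frequency f is scaled by 2·sin(πf). For f = 1/64 that is roughly a tenfold reduction in amplitude, while components near half the sampling rate are amplified by up to a factor of 2.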
The LSB reversion in the analog domain can be easily realized by a modified
DAC switching scheme. Conventionally, all the DAC cells in a SAR ADC are
reset in the sampling phase and switched to the references sequentially in the
conversion phase. During this DAC switching process, the LSB mismatch error
is injected. Figure 5.8 illustrates the operation of the modified switching scheme.
In the sampling phase, the LSB cells are held on the references, which are set by
the previous LSB code, instead of being reset to 0. After the sampling switch is
disconnected, the LSB cells are switched from the references back to 0 to inject the
previous LSB value, DACLSBs(n − 1), along with the LSB mismatch error, E(n − 1),
into the sampled input. Once all the DAC cells are reset to 0, the normal conversion
phase continues. The only difference in the timing control is an additional LSB
reset phase. The time slot for this phase can be short since the incomplete settling
of resetting the LSBs can be corrected for by redundancy in the LSBs [19].
Fig. 5.8 SAR ADC operation with DAC mismatch error shaping
Figure 5.9 shows the simulated output spectra of a 12-bit SAR ADC before
and after MES is applied to the 11-bit LSBs. It can be seen that low-frequency
harmonics are greatly reduced and translated into high-frequency noise when MES is
enabled. Once compared to an ideal first-order high-pass transfer function depicted
in dashed line, it is interesting to observe that the harmonic suppression is actually
better than the first-order shaping effect, as shown by the arrows. This phenomenon
comes from the implicit dithering effect due to the LSB reversion because the LSB
values added to the analog input effectively represent the quantization noise after
the MSB quantization. This dither-like signal is able to break large harmonics into
different frequencies before shaping. Similar to most other dithering techniques,
the injected dither occupies part of the signal range and sacrifices system dynamic
range. In the example shown in Fig. 5.9, the signal must be reduced by 6 dB in order
to allow the previous 11-bit LSBs to be injected at the ADC input.
An efficient way to mitigate the loss of dynamic range is to reduce the number
of bits assigned to the LSB section. For example, if a 3-bit MSB is linearized using
conventional data-weighted averaging (DWA) and MES is applied only to the LSB
section to shape the MSB-to-LSB mismatch error, then the LSB section is reduced
to one-eighth of full scale, resulting in less dynamic range loss due to MES. In
this manner, the average value of the MSB cells becomes the reference to define the
MSB-to-LSB mismatch error. This hybrid DAC linearization scheme comes with an
additional benefit of mutual randomization effect because the MSB DWA sequence
is randomized by the LSB dithering, and the MSB mismatch error shaped by DWA
also randomizes the LSB codes. This effect results in fewer spurious tones before the
first-order shaping takes effect.
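The trade-off between the size of the LSB section and the lost signal range can be estimated with a one-line calculation. The sketch below rests on an assumption that is not spelled out in the text: the input must shrink to the fraction of full scale left over after reserving room for the injected LSB section, which spans 2^(−M) of full scale when M bits sit in the MSB section. The function name is hypothetical.

```python
import math

def mes_dynamic_range_loss_db(msb_bits):
    """Estimated signal-range loss when the LSB section is re-injected."""
    lsb_fraction = 2.0 ** -msb_bits          # share of full scale taken by the LSBs
    return -20.0 * math.log10(1.0 - lsb_fraction)
```

Under this assumption a 1-bit MSB section (11-bit LSBs in a 12-bit ADC) costs about 6 dB, matching the example above, whereas a 3-bit MSB section costs a little over 1 dB.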
A behavioral simulation with estimated circuit kT/C and comparator noise is
used to examine the effectiveness of the 3-bit DWA with MES and to compare
it with the conventional DWA technique. Figure 5.10a is a histogram of SFDR
from a 100-point Monte Carlo simulation with a −2-dBFS input and random DAC
mismatch at 64x oversampling ratio (OSR). It shows that DWA improves linearity
as more MSB bits are included, as expected. Once MES is applied, SFDR with
only 3-bit DWA becomes greater than 105 dB. The highest SFDR is limited by
the noise floor due to the finite number of samples for FFT analysis. Figure 5.10b
is another comparison at different input amplitudes. It shows that 8-bit DWA can
achieve a high SFDR with a near full-scale input, but the linearity decreases and
results in SNDR degradation around the 8-bit boundary because the MSB-to-LSB
DAC mismatch error is not random enough in those conditions. On the other hand,
the simulated results with the MES technique exhibit a constant SFDR/SNDR
performance with typical circuit noise. The drop around 0 dB reflects the reduced
dynamic range because of the LSB reversion.
The MES technique can be further extended to different types of error shaping other
than the first-order high-pass effect of (1 − z⁻¹). For example, if the previous LSB codes
are inverted during the sampling phase, it results in a (1 + z⁻¹) filtering effect with
a notch at FS/2, where FS is the sampling frequency. Similarly, if the inverted LSB
codes from two cycles ago are applied, it generates a (1 + z⁻²) transfer function
with a pair of notches at FS/4 [17]. This feature is especially useful in band-pass
ADCs, such as band-pass ΔΣ modulators.
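The notch positions of these variants can be verified by evaluating the filter magnitude on the unit circle. The helper below is a small sketch; the function name and its `delay`/`sign` parameters are illustrative choices rather than anything defined in [17].

```python
import numpy as np

def error_filter_gain(f, delay=1, sign=-1):
    """Magnitude of 1 + sign * z^(-delay) at normalized frequency f = F / FS."""
    z_inv = np.exp(-2j * np.pi * f)
    return np.abs(1 + sign * z_inv ** delay)
```

Here `error_filter_gain(0.0)` evaluates (1 − z⁻¹) at DC, `error_filter_gain(0.5, sign=+1)` evaluates (1 + z⁻¹) at FS/2, and `error_filter_gain(0.25, delay=2, sign=+1)` evaluates (1 + z⁻²) at FS/4. All three are zero to within floating-point precision, which is where each variant places its notch.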
The hybrid DAC linearization scheme with DEM and MES can also be applied to
typical high-resolution DACs, as shown in Fig. 5.11. Compared to the conventional
segmented noise-shaped scrambling scheme in Fig. 5.5, this technique avoids the
use of the digital ΔΣ modulator and the thermometer-coded LSB DAC. Similar to the
DAC in the SAR ADC prototype, the (M + N)-bit digital input is separated into
an M-bit thermometer-coded MSB DAC with DEM and an N-bit binary-weighted
LSB DAC with MES. The LSB DAC needs to convert the present LSB codes and
also performs the reverse operation of the previous LSB codes in the same cycle.
This can be realized in switched-capacitor DACs, as in the SAR ADC example. For
non-return-to-zero (NRZ) current DACs in continuous-time ΔΣ modulators, where
each cell can only process one code in one cycle, two LSB DACs can be used to
perform 1 and z⁻¹ in alternate cycles to generate the (1 − z⁻¹) effect on the same
DAC error. To compensate for the additional z⁻¹ LSBs in the analog domain, the
z⁻¹ LSBs are added to the digital input to keep the DAC output level, VO, equivalent
to the digital input value, DI.
The simulation results shown in Fig. 5.10 have demonstrated that a SAR ADC can
achieve high linearity without calibration; together with noise shaping [14], both
high SFDR and high SNR can be realized with a SAR ADC simultaneously.
Figure 5.12 shows the architecture of a 12-bit SAR ADC silicon prototype
with MES and the first-order noise-shaping function. It is based on a coarse-fine
architecture [20] with the concepts of hybrid ADC and hybrid DAC linearization.
In this prototype, the three MSBs are resolved by a coarse flash ADC, whose
thermometer-coded output passes through DWA logic and switches all the trilevel
DAC cells simultaneously to avoid redundant charge transfers during the binary
search process. The 11-bit LSB section, including 2-bit redundancy, is a nonbinary-
weighted DAC array resolved by SAR operation with MES logic. The noise-shaping
filter samples the residue after the SAR operation. The filtered residue is then used
to shift the threshold voltage of the comparator to shape the comparator noise along
with quantization noise.
Fig. 5.12 Architecture of 12-bit SAR prototype with hybrid ADC and DAC techniques
Figure 5.12 also indicates that this prototype includes the principles of three
fundamental ADC architectures (flash, SAR, and ΔΣ modulation) as well as three
DAC linearization techniques (DWA, MES, and dithering). Since the MES and
noise-shaping functions can be enabled/disabled without interfering with the ADC's
normal operation, this prototype can be switched from the conventional SAR mode
to the oversampling modes by enabling the DWA + MES function and the
noise-shaping filter. It is noted that this architecture can also be used as a simple
flash ADC by disabling the DAC and the fine ADC functions.
The DAC resolution of the prototype is chosen to be 12 bits in order to minimize
the residue at the noise-shaping filter input and to further save power. Since
the comparator noise is at around the 10–11-bit level, the residue is mainly dominated
by circuit noise instead of quantization error or large signal. Consequently, the
switched-capacitor noise-shaping filter only needs to process small circuit noise
with greatly relaxed gain, bandwidth, and linearity requirements on the operational
amplifiers, resulting in an extremely power-efficient design.
The 12-bit SAR prototype is fabricated in a 55-nm CMOS process. The oversampling
mode operates at 1 MS/s and consumes 15.7 µW from a 1.2-V supply.
When configured in the conventional SAR mode, the ADC consumes 22 µW at
5 MS/s. A chip photo is shown in Fig. 5.13. The ADC occupies an active area
of 0.072 mm² including decoupling capacitors. The noise-shaping filter is placed
between the MSB and LSB DACs in order to exacerbate the MSB-to-LSB mismatch
so as to verify the MES technique.
Figure 5.14 compares the measured output spectra in the conventional SAR
mode and the oversampling mode with a 60-Hz input. The spectrum of the
conventional SAR mode shows a high noise floor with a clear trend of flicker noise and
large harmonic distortions. The SNDR within the 1-kHz bandwidth is 57.3 dB,
limited by the harmonic distortions. Once the ADC is configured into the oversampling
mode, all the in-band noise and the harmonics are significantly reduced with
increased out-of-band noise, and the SNDR within the 1-kHz bandwidth is improved
to 101 dB. The 105-dB SFDR is limited by the third-order harmonic, which comes
from the nonlinear parasitic capacitor on the top plates of the sampling DAC and
is also observed in circuit simulation without DAC mismatch. The small bump
around 20–30 kHz in the out-of-band noise is the remaining signal-dependent
pattern of the DAC mismatch error, which exhibits much less tonal behavior due
to the mutual randomization effects between DWA for MSBs and MES for LSBs.
Table 5.1 Summary of measured performance and comparison with state-of-the-art oversampling ADCs

| | This ADC [17] | Harpe, P., ISSCC 2014 [15] | Sukumaran, A., JSSC 2014 [21] | Perez, A.P., ISSCC 2011 [22] |
|---|---|---|---|---|
| Architecture | Oversampling SAR | Oversampling SAR | CT ΔΣ | CT ΔΣ |
| Nyquist mode | 9.5 bits up to 5 MS/s | 11.3 bits up to 32 kS/s | – | – |
| Process | 55 nm | 65 nm | 0.18 µm | 0.18 µm |
| Supply | 1.2 V | 0.8 V | 1.8 V | 1.5 V |
| Active area | 0.072 mm² | 0.18 mm² | 0.24 mm² | 0.492 mm² |
| Sampling rate | 1 MHz | 128 kHz | 6.144 MHz | 3.2 MHz |
| Order, bit | First-order, 12 bits | 14 bits | Third-order, 1 bit | Third-order, 5 bits |
| Power | 15.7 µW | 1.37 µW | 280 µW | 140 µW |

Table 5.1 summarizes and compares the performance with state-of-the-art
oversampling ADCs, including an oversampling SAR ADC [15], a continuous-time
(CT) ΔΣ modulator [21], and a discrete-time (DT) ΔΣ modulator [22]. The ADC
prototype achieves a 9.5-bit effective number of bits (ENOB) up to 5 MS/s in the
Nyquist mode. In the oversampling mode, the MES technique allows the 12-bit
DAC to achieve 105-dB SFDR, which is comparable to the SFDR of the inherently
linear 1-bit FIR DAC in the CT ΔΣ modulator [21]. This low distortion leads to over
100-dB SNDR at a 1.2-V supply, while higher voltages are used in the designs with
high SNDR [21, 22]. The signal bandwidth of the prototype is the lowest because of
the target applications. This ADC has the smallest area and is potentially more area
efficient than CT ΔΣ modulators since large RC time constants and large devices for
flicker noise are usually required in low-frequency designs. This architecture is also
potentially more power efficient than DT ΔΣ modulators because of the relaxed
operational amplifier design with small internal swings. Based on the measured
SNDR over 4-kHz bandwidth, the prototype achieves the highest Schreier FoM
of 180 dB compared with the state-of-the-art designs in [2] with >20-Hz signal
bandwidth.
5.9 Conclusions
In this paper, the three fundamental challenges in modern ADC designs are revisited.
While the emerging hybrid architectures can effectively relax the comparator noise
and operational amplifier bandwidth requirements, a DAC mismatch error shaping
(MES) technique is introduced to overcome the remaining challenge of DAC
nonlinearity. A hybrid DAC linearization technique consisting of DEM, MES, and
dithering is demonstrated in a SAR ADC prototype and achieves an SFDR of over
100 dB without calibration. Once combined with noise shaping, the ADC can be
configured between SAR and ΔΣ modes while maintaining the advantages of high
power efficiency and flexible sampling rate in SAR ADCs. This hybrid architecture
allows an ADC to be configured between flash, SAR, and ΔΣ-type performance
and further blurs the boundaries between different ADC architectures.
References
6. Chae, Y., Souri, K., Makinwa, K.A.A.: A 6.3 µW 20 bit incremental zoom-ADC with 6 ppm
INL and 1 µV offset. IEEE JSSC. 48(12), 3019–3027 (2013)
7. Sanyal, A., et al.: A hybrid SAR-VCO ΔΣ ADC with first-order noise shaping. Proceedings of
the Custom Integrated Circuits Conference, IEEE, pp. 1–4 (2014)
8. Liu, C.C., Huang, M.C., Tu, Y.H.: A 12 bit 100 MS/s SAR-assisted digital-slope ADC. IEEE
JSSC. 51(12), 2941–2950 (2016)
9. Dong, Y., et al.: A continuous-time 0–3 MASH ADC achieving 88 dB DR with 53 MHz BW
in 28 nm CMOS. IEEE JSSC. 49(12), 2868–2877 (2014)
10. Lee, C.C., Flynn, M.P.: A SAR-assisted two-stage pipeline ADC. IEEE JSSC. 46(4), 859–869
(2011)
11. Tsai, H.-C., et al.: A 64-fJ/conv.-step continuous-time ΔΣ modulator in 40-nm CMOS using
asynchronous SAR quantizer and digital truncator. IEEE JSSC. 48(11), 2637–2648 (2013)
12. Straayer, M.Z., Perrott, M.H.: A 12-bit, 10-MHz bandwidth, continuous-time ΔΣ ADC with a
5-bit, 950-MS/s VCO-based quantizer. IEEE JSSC. 43(4), 805–814 (2008)
13. Lin, Y.Z., et al.: A 9-bit 150-MS/s 1.53-mW subranged SAR ADC in 90-nm CMOS. IEEE
Symp. VLSI Circuits Digest of Technical Papers, pp. 243–244 (2010)
14. Fredenburg, J.A., Flynn, M.P.: A 90-MS/s 11-MHz-bandwidth 62-dB SNDR noise-shaping
SAR ADC. IEEE JSSC. 47(12), 2898–2904 (2012)
15. Harpe, P., Cantatore, E., van Roermund, A.: An oversampled 12/14b SAR ADC with noise
reduction and linearity enhancements achieving up to 79.1 dB SNDR. IEEE ISSCC Digest of
Technical Papers, pp. 194–195 (2014)
16. Adams, R., Nguyen, K.Q.: A 113-dB SNR oversampling DAC with segmented noise-shaped
scrambling. IEEE JSSC. 33(12), 1871–1878 (1998)
17. Shu, Y.-S., Kuo, L.-T., Lo, T.-Y.: An oversampling SAR ADC with DAC mismatch error
shaping achieving 105 dB SFDR and 101 dB SNDR over 1 kHz BW in 55 nm CMOS. IEEE
JSSC. 51(12), 2928–2940 (2016)
18. Karanicolas, A.N., Lee, H.-S., Bacrania, K.L.: A 15-b 1-Msample/s digitally self-calibrated
pipeline ADC. IEEE JSSC. 28(12), 1207–1215 (1993)
19. Kuttner, F.: A 1.2 V 10b 20 MSample/s non-binary successive approximation ADC in 0.13 µm
CMOS. IEEE ISSCC Digest of Technical Papers, pp. 176–177 (2002)
20. Tai, H.-Y., et al.: A 0.85 fJ/conversion-step 10b 200 kS/s subranging SAR ADC in 40 nm CMOS.
IEEE ISSCC Digest of Technical Papers, pp. 196–197 (2014)
21. Sukumaran, A., Pavan, S.: Low power design techniques for single-bit audio continuous-time
delta sigma ADCs using FIR feedback. IEEE JSSC. 49(11), 2515–2525 (2014)
22. Perez, A.P., Bonizzoni, E., Maloberti, F.: A 84 dB SNDR 100 kHz bandwidth low-power single
op-amp third-order ΔΣ modulator consuming 140 µW. IEEE ISSCC Digest of Technical
Papers, pp. 478–479 (2011)
Chapter 6
A Hybrid ADC for High Resolution:
The Zoom ADC
6.1 Introduction
In this paper, we describe a dynamic zoom ADC [10, 11], i.e., a hybrid ADC
that consists of a compact and efficient coarse SAR ADC and an accurate and
high-resolution fine discrete-time SDM (DT-SDM). The hybrid ADC achieves 109-dB
dynamic range (DR), 106-dB signal-to-noise ratio (SNR), and 103-dB
signal-to-noise-and-distortion ratio (SNDR) in a 20-kHz bandwidth, while dissipating
1.12 mW and occupying only 0.16 mm² in a 0.16-µm CMOS process.
The paper is organized as follows: first, the energy and area efficiency of high-
resolution high-linearity ADCs is discussed (Sects. 6.2 and 6.3). This is followed by
an overview of hybrid ADC architectures (Sect. 6.4). The zoom ADC and its system-
level design are then introduced (Sect. 6.5), followed by its circuit design (Sect. 6.6).
Finally, experimental results are presented (Sect. 6.7), followed by conclusions.
The energy efficiency of an ADC is measured in terms of its energy per conversion
Econv, i.e., the energy spent by the ADC to produce an output sample. Figure 6.1
shows Econv for ADCs published in recent years [14]. For low-resolution ADCs
with N output bits, the energy per conversion is often limited by the energy required
to compute the N output bits, thus Econv scales with the number of conversion steps,
i.e., Econv ∝ 2^N. For high-resolution ADCs, i.e., ADCs with >75-dB DR, the energy
per conversion is limited by the need to achieve sufficiently low thermal noise,
which requires an energy that grows quadratically with the number of quantization
steps (Econv ∝ 2^(2N)) [13]. Consequently, ADC power consumption will scale with DR. This
consideration leads to the definition of the Schreier figure of merit (FoMS) [12]:

$$\mathrm{FoM}_S = \mathrm{DR} + 10\log_{10}\left(\frac{f_{bw}}{P}\right)\ \mathrm{dB} \qquad (6.1)$$

where DR is the dynamic range in dB, fbw is the ADC bandwidth, and P is the
ADC power consumption. Sometimes SNDRmax is used instead of DR, e.g., as in
[14], because the former is usually worse than the latter. Energy efficiency is also
difficult to combine with high speed, as shown in Fig. 6.2, which reports FoMS
vs input bandwidth for ADCs published in recent years [14]. FoMS is higher for
low and moderate bandwidths (0–10 MHz). It can be shown that a 198-dB FoMS is
theoretically achievable [15], which means that there is still an approximately 20-dB
gap between this limit and the current state of the art (Fig. 6.1).

[Fig. 6.1: Econv [pJ] vs SNDR [dB]; legend: zoom ADCs, hybrids, FoMS = 175 dB, FoMS = 198 dB]
[Fig. 6.2: FoMS [dB] vs 2BW [Hz]; legend: zoom ADCs, hybrids]
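As a quick numerical check of (6.1), the following sketch evaluates FoMS for the figures quoted in the introduction of this chapter (109-dB DR, 20-kHz bandwidth, 1.12 mW). The function name is an assumption made for illustration.

```python
import math

def schreier_fom(dr_db, bandwidth_hz, power_w):
    """Schreier figure of merit, Eq. (6.1): DR + 10*log10(f_bw / P), in dB."""
    return dr_db + 10.0 * math.log10(bandwidth_hz / power_w)

# Values quoted in Sect. 6.1 for the dynamic zoom ADC prototype.
fom = schreier_fom(dr_db=109.0, bandwidth_hz=20e3, power_w=1.12e-3)
print(f"FoMS = {fom:.1f} dB")
```

The result, about 181.5 dB, agrees with the value listed for this work in Table 6.1 and sits roughly 16 to 17 dB below the 198-dB theoretical limit mentioned above.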
The reason for this gap lies in the implicit assumption behind the definition of
FoMS, i.e., that most of an ADC's power consumption is used in its input stage to
reduce thermal noise. In practice, however, this is not the case. First, other ADC
subblocks consume a non-negligible amount of power. In a DT-SDM, for example,
such subblocks will include the integrators that follow the input stage, the quantizer,
the biasing circuits, and the digital back end. Second, even the power consumption
of the input stage is often not limited by thermal noise but by other requirements
such as linearity, slew rate, and settling time.
Bearing this in mind, it is then clear that to maximize FoMS, a number of
different design strategies can be adopted. First, the power consumption of all ADC
subblocks, especially that of the critical input stage, should be reduced. Several
recent works have targeted improvements in the efficiency of the amplifiers used in
the SDM loop filters. Various inverter-based amplifiers have been proposed [1, 10,
16–22] which double efficiency by summing the transconductances of NMOS and
PMOS transistors biased by the same current. A further improvement is achieved in
B. Gönen et al.
A similar approach can be used to analyze ADCs from an area efficiency perspec-
tive. Most of their silicon area should then be used to ensure low enough thermal
noise. However, since the matching of integrated components scales with the square
root of their area, i.e., two times more accuracy requires four times larger area,
the accuracy requirements on active and passive components also impose a lower
limit on the silicon area [20]. This implies that in a DT-SDM the total area should
ideally be dominated by thermal-noise-critical and matching-critical components,
such as sampling capacitors, the first integrator, and the DAC.¹ It should also be
noted that over-sampling effectively reduces the in-band thermal noise in an ADC,
hence relaxing the area requirement of the noise-critical capacitors for the same DR.
Although good for area efficiency, over-sampling comes at the expense of increased
power consumption in the quantizer and in the digital back end.
However, components not limiting noise or accuracy will also occupy a
non-negligible chip area. For example, SDMs with multi-bit quantizers usually suffer an
area penalty due to the quantizer's exponentially increasing area [20]. Furthermore,
not all passives are sized for thermal noise or accuracy requirements. For example,
the size of the integration capacitors in the switched-capacitor (SC) integrators of
a DT-SDM is determined by the choice of loop-filter coefficients and the desired
integrator output swing.
Figure 6.3 shows an area vs DR comparison of state-of-the-art audio ADCs
[11]. It shows that higher DR indeed corresponds to higher chip area. It should
be noted that both device matching and capacitor density (capacitance per unit area,
in F/m²) and, hence, the resulting chip area are strongly technology dependent.
Technology scaling also helps to reduce the power consumption of the digital logic
and the quantizer, thus facilitating, for example, the use of multi-bit SDMs. System-
level design should then include a careful choice of the technology in order to exploit
the possible presence of high-density passives.
¹ While it is trivial that lower thermal noise requires larger capacitors in DT circuits, this is also
true for continuous-time circuits: lower thermal noise implies lower resistances and, consequently,
larger capacitors for the same total bandwidth.
To improve the area efficiency of DT-SDMs for a given technology, the following
design flow should be followed. First, the over-sampling ratio (OSR) should be
increased until the power consumed by the digital back end and the quantizer
becomes significant. Second, the architecture should be chosen to reduce the area
of system blocks that do not directly determine resolution and accuracy, such as the
integration capacitors not in the first integrator, the quantizer, and the digital logic.
As mentioned before, most ADC architectures are not energy efficient when
high resolution and high linearity are both required. Even conventional SAR
ADCs, known for their excellent energy efficiency, suffer under high-resolution
requirements due to the increased power consumption of the comparator. Further-
more, some ADC architectures, such as VCO-based converters, exhibit excessive
nonlinearity for large input signals, thus limiting their DR [8]. Thus, it is often
beneficial to divide the input dynamic range into manageable subranges, i.e.,
coarse and fine ranges, to decouple the problems associated with large signals, low
noise, and high accuracy levels. In this way, the challenges of each design space,
i.e., subrange, can be addressed with an appropriately tailored ADC architecture.
ADCs based on this approach are called hybrid ADCs. It is beneficial to have a
close look into a subset of the hybrid ADCs, subranging ADCs, to understand their
architectural motivation. Subranging ADCs were originally used to improve the
efficiency of flash ADCs for high resolution by dividing the input range into multiple
subranges, most commonly into two coarse and fine ranges. However, they suffered
from interstage matching, i.e., the coarse converter and the fine converter should
have a perfectly matched range. This results in tough requirements on the thermal
noise and accuracy of the coarse converter, which, in turn, lead to degraded energy
efficiency. For this reason, back-end correction techniques such as redundancy
(over-ranging) and/or calibration are often employed to relax the coarse converter's
accuracy requirements [15], leading to very efficient designs.
By dividing the full input range into two or more subranges, hybrid two-step
architectures in the form of SAR + SAR pipeline [9], SAR + single slope [7],
SAR + SDM [1–4, 8], and flash + SDM [5, 6] achieve state-of-the-art energy
efficiency, as shown in Figs. 6.1 and 6.2. In addition, their linearity is often improved
thanks to the reduction of the signal swing at the input of the linearity-critical fine
converter. It is observed that the architectures of the coarse and fine converters are
tailored to the desired performance. For low-to-moderate input bandwidths, i.e., not
close to the speed limits of the technology used, the coarse converter is often a SAR
ADC [1–3, 7–9] due to its compactness and superior energy efficiency. When the
speed of the coarse conversion is important, a flash ADC is preferred [5, 6]. In very
efficient high-resolution (DR > 75 dB) hybrids, the fine converter is either a SDM
[1, 2, 5], an over-sampling SAR [4], or a single-slope ADC [7], to achieve high
resolution with maximum efficiency.
The zoom ADC architecture has been proposed for high-resolution and
high-linearity applications, in which it simultaneously achieves excellent energy
efficiency and small die area [1]. The system block diagram of a zoom ADC is shown
in Fig. 6.4. It consists of a coarse ADC and a fine SDM. The coarse ADC's output
(k) corresponds to an analog range k·VLSB,C < Vin < (k + 1)·VLSB,C, where VLSB,C is
its quantization step or least significant bit (LSB). The digital value k is then used to
adjust, i.e., zoom in, the references of the SDM's DAC such that VREF− = k·VLSB,C
and VREF+ = (k + 1)·VLSB,C. These reference voltages straddle the input signal Vin,
thus ensuring that it lies in the input range of the fine SDM. In contrast to other
Nyquist-rate ADC + SDM hybrids, there is no computation of an analog residue
signal resulting from the coarse conversion. Instead, only the digital result of the
coarse conversion is used to zoom in on the signal level. By using a wider fine
input range (i.e., over-ranging), the coarse converter's linearity and accuracy
requirements can be considerably relaxed. The overall linearity is determined by the fine SDM, in
particular by its DAC, whose linearity is then improved by using dynamic element
matching techniques.
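The reference selection described above can be sketched in a few lines. This is a behavioral illustration only: it assumes an ideal coarse quantizer and a unipolar input between 0 and a full-scale voltage, and the function name, the 1.8-V full scale, and the 0.7-V input are hypothetical.

```python
def zoom_references(v_in, v_full_scale, n_bits):
    """Return (k, VREF-, VREF+) for an ideal N-bit coarse quantizer (no over-ranging)."""
    v_lsb_c = v_full_scale / (2 ** n_bits)       # coarse LSB, VLSB,C
    k = int(v_in // v_lsb_c)                     # coarse code
    k = max(0, min(k, 2 ** n_bits - 1))          # clamp to the valid code range
    return k, k * v_lsb_c, (k + 1) * v_lsb_c

# Hypothetical example: 5-bit coarse ADC, 1.8-V full scale, 0.7-V input.
k, v_ref_minus, v_ref_plus = zoom_references(v_in=0.7, v_full_scale=1.8, n_bits=5)
print(k, v_ref_minus, v_ref_plus)
```

Only the digital code k crosses from the coarse to the fine converter; no analog residue is computed, which is the property that distinguishes zooming from a classic two-step residue architecture.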
Like a multi-bit SDM, zooming reduces the signal swing at the input of the SDM,
thus relaxing the slewing requirements of the first SDM stage. Its performance will
also be similar to that of a multi-bit SDM with the same OSR, despite the fact that
its multi-bit quantizer is outside the SDM loop. Consequently, higher quantizer
delays can be tolerated, which means that the coarse converter can be implemented
as a compact and efficient SAR ADC.
The first zoom ADCs were implemented as incremental converters, in which the
coarse and fine conversions were performed sequentially [1, 2, 23]. The time-
domain operation of an incremental zoom ADC is shown in Fig. 6.5. The conversion
starts with a SAR phase to quickly determine the correct zoom range, followed
by a fine 1-bit SDM phase that uses the nearest two reference levels to accurately
determine the final digital value. This approach works well for quasi-static signals,
such as those encountered in sensor readout [2, 23], or instrumentation applications
[1], but it does not work for dynamic signals.
The time-domain operation of an incremental zoom ADC with a dynamic
(time-varying) signal is shown in Fig. 6.6. After the SAR period, the ADC will assume
that the chosen reference values are valid throughout the whole fine conversion.
However, this is not true for dynamic signals, leading to modulator overload. Thus,
assuming a fine input range of 1 LSBC (one coarse LSB), i.e., no over-ranging,
the maximum input signal frequency is limited to:

$$f_{in,max} < \frac{f_S}{2\pi \cdot \mathrm{OSR} \cdot 2^{N}} \qquad (6.2)$$

where fS is the sampling frequency, OSR is the over-sampling ratio of the zoom
ADC, and N is the coarse ADC's number of bits. Thus, this architecture is only well
suited to the conversion of quasi-static signals.

Fig. 6.5 Time-domain operation of an incremental zoom ADC with a static input, showing the
SAR ADC's comparator output during the coarse period and the SDM's bitstream during the fine
period
Fig. 6.6 Time-domain operation of an incremental zoom ADC with a dynamic input, showing the
SAR ADC's comparator output during the coarse period and the SDM's bitstream during the fine
period
$$f_{in,max} < \frac{f_{coarse}}{2\pi \cdot 2^{N}} \qquad (6.3)$$

where fcoarse is the coarse ADC sampling frequency, which is an integer fraction
of fS, i.e., fS/N for an N-bit SAR ADC or fS for a flash ADC. Compared to its
incremental counterpart, the maximum input frequency of the dynamic zoom ADC
is not a function of the SDM's OSR, thus allowing the use of a large OSR with the
associated benefits in terms of resolution and area occupation.
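The difference between the incremental limit (6.2) and the dynamic limit (6.3) can be made concrete with a short sketch. It assumes that the factor in both denominators is 2π and that the coarse ADC is an N-bit SAR ADC with fcoarse = fS/N; the sampling frequency, OSR, and coarse resolution are the values used later in this chapter, and the function names are hypothetical.

```python
import math

def fin_max_incremental(f_s, osr, n_bits):
    """Eq. (6.2): incremental zoom ADC, no over-ranging (2*pi factor assumed)."""
    return f_s / (2 * math.pi * osr * 2 ** n_bits)

def fin_max_dynamic(f_coarse, n_bits):
    """Eq. (6.3): dynamic zoom ADC, no over-ranging (2*pi factor assumed)."""
    return f_coarse / (2 * math.pi * 2 ** n_bits)

F_S = 11.2896e6   # SDM sampling frequency used in this design
OSR = 282         # over-sampling ratio used in this design
N_BITS = 5        # coarse SAR ADC resolution used in this design

f_inc = fin_max_incremental(F_S, OSR, N_BITS)
f_dyn = fin_max_dynamic(F_S / N_BITS, N_BITS)   # assumes f_coarse = f_S / N

print(f"incremental limit: {f_inc:.1f} Hz")
print(f"dynamic limit:     {f_dyn:.1f} Hz")
print(f"improvement:       {f_dyn / f_inc:.1f}x (equals OSR / N)")
```

Under these assumptions the ratio of the two limits reduces to OSR/N, independent of the constant in the denominator, which is why a large OSR penalizes the incremental architecture but not the dynamic one.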
A prototype dynamic zoom ADC for digital audio has been designed as a proof of
concept. The targeted specifications are 106-dB SNR and SNDR higher than 100 dB
in the 20-kHz audio bandwidth with 1.25 Vrms input range. The chosen process
technology is 0.16-µm CMOS. The system-level design starts with architectural
choices. For a dynamic zoom ADC, these include the choice of the following
parameters: fS (OSR), coarse ADC resolution, coarse ADC redundancy
(over-ranging), SDM loop-filter structure and order, and SDM quantizer resolution.
The efficiency of a SDM is, to first order, independent of its OSR. Increasing OSR
is desirable to increase the signal-to-quantization-noise ratio (SQNR), reduce the chip
area, and increase fin,max in (6.3). However, the power consumption of the digital
sections (DEM, SAR controller, clocking) increases proportionally with the SDM
sampling frequency fS and, consequently, proportionally with OSR. For the chosen
0.16-µm CMOS technology, system simulations revealed that 11.2896 MHz (audio
standard), corresponding to OSR = 282, is a good compromise between chip area,
fin,max, and digital power consumption.
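The quoted OSR follows directly from the sampling frequency and the 20-kHz signal bandwidth. A minimal sketch, with a hypothetical function name:

```python
def oversampling_ratio(f_s, bandwidth_hz):
    """OSR = f_S / (2 * BW): sampling rate relative to the Nyquist rate of the band."""
    return f_s / (2.0 * bandwidth_hz)

F_S = 11.2896e6   # Hz, equal to 256 x 44.1 kHz
BW = 20e3         # Hz, audio bandwidth

osr = oversampling_ratio(F_S, BW)
print(f"OSR = {osr:.2f}")   # rounds down to the 282 quoted in the text
```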
For high energy efficiency, a thermal-noise limited SNR is desired, i.e., the
quantization noise should be much less than the thermal noise. To achieve the
targeted thermal-noise limited 110-dB SNR, SQNR = 130 dB is chosen. The zoom
ADC's total SQNR is determined by the coarse resolution and the SDM's SQNR.
The latter depends on its loop-filter order, the quantizer resolution, and the OSR.
Zooming relaxes the SQNR requirement of the SDM by reducing its input range.
Thus, more than 1-bit quantization in the loop is not necessary.
To determine the loop-filter order of the SDM, Fig. 6.8 shows the SQNRmax (for
an ideal loop filter) as a function of the OSR for a zoom ADC with a 1-bit SDM
quantizer and a coarse SAR ADC with 3–5 bits, for different loop-filter orders [12].
It is observed that for each additional bit in the coarse ADC, SQNRmax increases by
6.02 dB, similar to multi-bit SDMs. For the chosen OSR = 282, a second-order loop
filter would be sufficient. However, for a robust design, a third-order SDM is chosen.
Thanks to the noise scaling of the third stage, the power consumption of the third
stage is expected to account for only 15% of the whole loop filter (simulated). For
the implementation, a switched-capacitor (SC) loop filter is chosen for its robustness
to clock jitter. The SC loop filter is chosen as a cascade of integrators with
feed-forward (CIFF) for its superior linearity.
As mentioned before, the fine DAC in Fig. 6.4 uses the digital result (k) of the SAR
ADC to dynamically adjust its references. If according to the coarse converter the
input signal satisfies k·VLSB,C < Vin < (k + 1)·VLSB,C, where VLSB,C is the coarse
converter quantization step or least significant bit (LSB), the references of the fine
DAC are set to VREF− = k·VLSB,C and VREF+ = (k + 1)·VLSB,C. However, using
the zoomed-in references has several problems. Foremost, SDMs are not stable
over the full range of their DACs. So, if Vin is close to VREF+ or VREF−, the SDM
could be overloaded. Furthermore, any error in the coarse ADC due to mismatch
or the coarse converter's thermal noise can lead to an error in k, causing Vin to
fall outside the SDM's input range. In that case, the SDM overloads and the fine
conversion becomes totally invalid, similar to the interstage mismatch problem of
subranging converters [15]. To address this issue, the input range of the SDM can
be widened by using over-ranging, so that the SDM DAC references are chosen
as VREF+ = (k + 1 + M/2)·LSBC and VREF− = (k − M/2)·LSBC, where M is the
over-ranging factor. Since the SDM DAC range is widened both at the low and
at the high side, the DAC references symmetrically straddle the signal, i.e., Vin is
approximately in the center of the SDM input range. Thus, even in the presence
of a coarse conversion error smaller than M/2 LSBC, the SDM can operate without
overloading. This allows for larger errors in the coarse ADC.
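The effect of over-ranging on coarse-error tolerance can be checked with a small sketch. Voltages are normalized to one coarse LSB, the input value and error magnitudes are hypothetical, and the check only tests whether the input lies between the two references; it does not model the reduced stable input range of a real SDM.

```python
def overranged_references(k, v_lsb_c, m):
    """References with over-ranging factor M: VREF- = (k - M/2)*LSBC, VREF+ = (k + 1 + M/2)*LSBC."""
    return (k - m / 2) * v_lsb_c, (k + 1 + m / 2) * v_lsb_c

def in_fine_range(v_in, k, v_lsb_c, m):
    """True if v_in lies strictly between the two fine-DAC references selected by code k."""
    v_ref_minus, v_ref_plus = overranged_references(k, v_lsb_c, m)
    return v_ref_minus < v_in < v_ref_plus

V_LSB_C = 1.0      # normalized coarse LSB
V_IN = 12.4        # hypothetical input, so the correct coarse code is k = 12
K_TRUE = 12

for m in (0, 4):
    for error in (-1, 0, 1):   # coarse conversion error in coarse LSBs
        ok = in_fine_range(V_IN, K_TRUE + error, V_LSB_C, m)
        print(f"M = {m}, coarse error = {error:+d} LSBC -> input in range: {ok}")
```

With M = 0 any nonzero coarse error pushes the input outside the reference window, whereas with M = 4 an error of one coarse LSB still leaves the input inside it, in line with the M/2 tolerance stated above.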
Figure 6.9 shows the simulated maximum acceptable SAR ADC INL vs M for zoom
ADCs with 4–6-bit SAR ADCs and a third-order SDM with 1-bit quantization. For
each data point, a 100-point Monte Carlo simulation has been run, and the maximum
INL which causes less than 10-dB SQNR deviation is reported. The offset of
the SAR ADC is not included for the sake of simplicity. The maximum tolerable
INL (in LSBC) is found to be independent of the coarse resolution; however, the
required relative matching of the unit elements increases quadratically for each
coarse bit, e.g., from 5 bits to 6 bits, due to the smaller size of LSBC. As seen from
Fig. 6.9, the maximum acceptable INL error increases proportionally with M, thus
dramatically relaxing the SAR ADC's accuracy requirements. Even missing codes are
tolerated for M ≥ 3.

Fig. 6.9 Maximum simulated tolerable SAR ADC INL (LSBC) vs M for zoom ADCs with a
third-order SDM
Over-ranging comes at the cost of a lower SQNR, since doubling M results in a
1-bit less coarse resolution, i.e., 6-dB less SQNR, but it greatly simplifies the design
of the SAR ADC and, consequently, reduces its power consumption and area occupation.
Thus, it makes an energy-efficient two-step conversion possible while keeping the
SDM input range small (= [M + 1]·LSBC) and avoiding both strict matching
requirements between the two converters and overloading in the SDM. Over-ranging
also helps in increasing the maximum input signal bandwidth by modifying (6.3) into

$$f_{in,max} < \frac{(M + 1)\, f_{coarse}}{2\pi \cdot 2^{N}} \qquad (6.4)$$
Figure 6.10 shows fin,max vs M for SAR ADCs with 4–6 bits and with
fS = 11.29 MHz. To allow for the 20-kHz signal bandwidth, viable options are
both a 4-bit SAR ADC with M = 2 and a 5-bit SAR ADC with M = 4. However,
the latter provides a better coarse resolution, i.e., a more precise reference range
estimation, with negligible additional power and area. So, in this work a 5-bit SAR
ADC with 4-LSBC over-ranging is used.
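The trade-off between coarse resolution N and over-ranging factor M in (6.4) can be explored with the sketch below. It assumes a 2π factor in the denominator and fcoarse = fS/N; the absolute values depend on both assumptions and are not meant to reproduce Fig. 6.10, so only the trends (linear growth with M + 1, reduction with each extra coarse bit) should be read from it. The function name is hypothetical.

```python
import math

def fin_max_overranged(f_s, n_bits, m):
    """Eq. (6.4) with the assumed f_coarse = f_s / n_bits and an assumed 2*pi factor."""
    f_coarse = f_s / n_bits
    return (m + 1) * f_coarse / (2 * math.pi * 2 ** n_bits)

F_S = 11.29e6  # Hz, sampling frequency quoted for Fig. 6.10

for n_bits in (4, 5, 6):
    row = [fin_max_overranged(F_S, n_bits, m) / 1e3 for m in (1, 2, 3, 4)]
    print(f"N = {n_bits}: " + ", ".join(f"{f:.1f} kHz" for f in row))
```

Each additional coarse bit lowers the limit twice over, once through the 2^N term and once through the slower coarse conversion, which is why a higher coarse resolution has to be paired with a larger M to keep the same signal bandwidth.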
Fig. 6.10 fin,max (kHz) vs M for 4–6-bit SAR ADCs (N = 4, 5, 6) clocked at fS = 11.29 MHz
ADC total full-scale input. A larger a1 can be exploited to save a considerable silicon
area, as explained in the following. Coefficient a1 in the proposed implementation
can be expressed as:

$$a_1 = \frac{C_S}{C_{int1}} \qquad (6.5)$$

where Cint1 is the first integration capacitor and CS is the sampling capacitor. The
integrator is assumed to be non-inverting for the sake of simplicity. CS is determined
by the kT/C noise requirement, and it is fixed for a given OSR, thus resulting in
Cint1 being inversely proportional to a1. Since the area of a DT-SDM is dominated by the
first-stage capacitors, a1 > 1 allows for a large area saving. As a drawback, this
increases the output swing of the first integrator, which however is quite small in a
zoom ADC, so increasing it does not constitute an issue. Hence, a1 = 1.5 is chosen
(Fig. 6.11), corresponding to a first-integrator output swing of 27% of the full-scale
output range. Such an output swing is within the linear range of the inverter-based
class-AB OTA used to implement the first integrator (see Sect. 6.5.5).
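Rearranging (6.5) gives Cint1 = CS/a1, so the integration capacitor shrinks as a1 grows. The sketch below quantifies this for the chosen a1 = 1.5 relative to a1 = 1; the normalized sampling capacitor and the function name are assumptions for illustration.

```python
def integration_capacitor(c_s, a1):
    """First integration capacitor from Eq. (6.5): Cint1 = CS / a1."""
    return c_s / a1

C_S = 1.0  # sampling capacitor, normalized (its absolute value is set by kT/C noise)

c_int_unity = integration_capacitor(C_S, 1.0)
c_int_chosen = integration_capacitor(C_S, 1.5)
saving = 1.0 - c_int_chosen / c_int_unity

print(f"Cint1 at a1 = 1.5 is {c_int_chosen / c_int_unity:.2f} of its a1 = 1 value")
print(f"integration-capacitor saving: {saving:.0%}")
```

The integration capacitor is one third smaller than with a1 = 1, while CS itself is unchanged because it is fixed by the noise requirement.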
Fig. 6.12 A simplified circuit schematic of the proposed dynamic zoom ADC
Fig. 6.13 (a) Proposed dynamic-biased inverter-based OTA. (b) Current-reuse OTA
The prototype dynamic zoom ADC has been fabricated in a 0.16-µm CMOS
technology [11]. It occupies an area of 0.16 mm² as shown in the chip micrograph
(Fig. 6.15). Its total power consumption is 1.12 mW with the digital circuitry
consuming 29% of the power (including DWA, SAR logic, and the nonoverlapping
clock generator and excluding the digital decimator). The analog power
consumption is dominated by the first integrator (56%, simulated). In contrast, the SAR
ADC's analog section draws only 7 µW (measured).
The digital outputs of the ADC were the SAR ADC's comparator output, the
SDM bit stream, and a clock synchronized to the data. Since the outputs were
single-ended and full-CMOS level (0 V–1.8 V), their interference with the external voltage
reference on the test PCB limited the measured SNDR to 98.3 dB in 20-kHz BW in
the first experimental characterization [10]. After lowering the supply of the digital
output drivers from 1.8 V to 0.9 V, the interference is reduced (Figs. 6.16 and 6.17)
so that the maximum measured SNDR is 103 dB.
The ADC's peak SNR and DR were 106 dB and 109 dB, respectively, with DWA
active (Fig. 6.17). Peak SNDR is limited to 72 dB with DWA off due to the fine DAC
mismatch. Thanks to the input common-mode cancellation scheme, the CMRR is
greater than 62 dB from DC up to 1 MHz for full-scale common-mode inputs. The
ADC's 1/f corner was measured to be below 20 Hz, proving the effectiveness of the
auto-zeroing employed in the first OTA.
Fig. 6.16 Measured output spectra for DWA off, DWA on, and no input. Inputs are connected to
VCM for no input case, with DWA on
[Fig. 6.17: SNR and SNDR [dB] vs input amplitude [dBFS] at 1 kHz; SNRmax = 106 dB, SNDRmax = 103 dB, DR = 109 dB]
In order to test the overloading of the SDM with full-scale out-of-band signals,
a full-scale sine wave is applied to the ADCs input, and its frequency is swept
from 10 Hz to 100 kHz. In-band noise is measured for each point to predict the
achievable DR as shown in Fig. 6.18. The degradation of the DR is observed with
full-scale signals above 27 kHz, as predicted in system-level simulations. A first-
order RC low-pass filter (LPF) with 30-kHz corner frequency is inserted before the
Fig. 6.18 DR in 20-kHz BW [dB] in the presence of in- and out-of-band full-scale inputs with and
without an LPF at the input with 30-kHz corner frequency (DWA on)
Table 6.1 Performance summary and comparison with state-of-the-art audio ADCs
                    Unit   This work   [24]    [25]    [26]    [27]    [18]
Year                       2016        2016    2016    2016    2011    2016
Loop-filter type           DT          CT      CT      CT      CT+DT   DT
Technology          nm     160         160     130     65      40      130
Die area            mm²    0.16        0.21    1.33    0.256   0.05    0.31
Power consumption   mW     1.12        0.39    0.28    0.8     0.5     0.3
Sampling frequency  MHz    11.29       3       6.144   6.4     6.5     6.1
Signal bandwidth    kHz    20          20      24      25      24      20
Peak SNR            dB     106         93.4    99.3    100.1   93.6    -
Peak SNDR           dB     103         91.3    98.5    95.2    90      97.7
DR                  dB     109         103.1   103.6   103     102     100.5
FoMS (a)            dB     181.5       180.2   182.9   177.9   179     178.7
(a) FoMS = DR + 10 log(signal bandwidth/power)
ADC input, which ensures that the DR is constant up to at least 100 kHz (the max.
measurement frequency is limited by the low-noise audio signal generator).
A performance comparison with ADCs of similar resolution (>100-dB DR) and
bandwidth is presented in Table 6.1. Although the proposed zoom
ADC is a discrete-time design, it achieves a state-of-the-art FoMS of 181.5 dB. It is
also considerably more area efficient than the previous designs implemented in
similar technology nodes. As discussed before, the ADC's area is dominated by
the capacitors, whose size is set by the kT/C noise required to obtain the 109-dB DR, so the
area is used efficiently.
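The FoMS row of Table 6.1 can be reproduced from the DR, signal bandwidth, and power rows with the formula in the table footnote. The Python sketch below does this; the dictionary layout and the function name fom_s are illustrative choices, and the numbers are copied from the table.

```python
import math

# (DR in dB, signal bandwidth in Hz, power in W), copied from Table 6.1
designs = {
    "This work": (109.0, 20e3, 1.12e-3),
    "[24]": (103.1, 20e3, 0.39e-3),
    "[25]": (103.6, 24e3, 0.28e-3),
    "[26]": (103.0, 25e3, 0.80e-3),
    "[27]": (102.0, 24e3, 0.50e-3),
    "[18]": (100.5, 20e3, 0.30e-3),
}

def fom_s(dr_db, bandwidth_hz, power_w):
    """FoMS = DR + 10*log10(signal bandwidth / power), in dB."""
    return dr_db + 10.0 * math.log10(bandwidth_hz / power_w)

foms = {name: fom_s(*values) for name, values in designs.items()}
for name, value in foms.items():
    print(f"{name:>9}: {value:.1f} dB")
```

Because the bandwidth and power enter only through their ratio, halving the power at constant DR raises FoMS by about 3 dB.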
6.8 Conclusions
The dynamic zoom ADC is presented as a hybrid ADC suitable for high-resolution
and high-linearity digital audio applications. The proposed zoom ADC employs a
5-bit SAR ADC working in parallel to assist a third-order SDM. This improves
the overall energy efficiency by reducing the signal swing of the SDM and relaxes
its non-thermal-noise-related power consumption. A 0.16-mm² prototype chip is
implemented in 0.16-µm CMOS technology, achieving 109-dB DR, 106-dB peak
SNR, and 103-dB peak SNDR while having an excellent FoMS of 181.5 dB.
References
1. Chae, Y., Souri, K., Makinwa, K.A.A.: A 6.3 µW 20 bit incremental zoom ADC with 6 ppm
INL and 1 µV offset. IEEE J. Solid State Circuits. 48(12), 3019–3027 (2013)
2. Sechang, Oh., Jung, W., Yang, K., Blaauw, D., Sylvester, D.: 15.4b incremental sigma-delta
capacitance-to-digital converter with zoom-in 9b asynchronous SAR. 2014 Symposium on
VLSI Circuits Digest of Technical Papers, Honolulu, pp. 1–2 (2014)
3. Venca, A., Ghittori, N., Bosi, A., Nani, C.: A 0.076 mm² 12 b 26.5 mW 600 MS/s 4-way
interleaved subranging SAR-ΔΣ ADC with on-chip buffer in 28 nm CMOS. IEEE J. Solid
State Circuits. 51(12), 2951–2962 (2016)
4. Shu, Y.S., Kuo, L.T., Lo, T.Y.: An oversampling SAR ADC with DAC mismatch error shaping
achieving 105-dB SFDR and 101-dB SNDR over 1 kHz BW in 55 nm CMOS. IEEE J. Solid
State Circuits. 51(12), 2928–2940 (2016)
5. Dong, Y., Yang, W., Schreier, R., Sheikholeslami, A., Korrapati, S.: A continuous-time 0–3
MASH ADC achieving 88-dB DR with 53 MHz BW in 28 nm CMOS. IEEE J. Solid State
Circuits. 49(12), 2868–2877 (2014)
6. Gharbiya, A., Johns, D.A.: A 12-bit 3.125 MHz bandwidth 0–3 MASH delta-sigma modulator.
IEEE J. Solid State Circuits. 44(7), 2010–2018 (2009)
7. Liu, C.C., Huang, M.C., Tu, Y.H.: A 12 bit 100 MS/s SAR-assisted digital-slope ADC. IEEE
J. Solid State Circuits. 51(12), 2941–2950 (2016)
8. Sanyal, A., Sun, N.: A 18.5-fJ/step VCO-based 0–1 MASH ΔΣ ADC with digital background
calibration. 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), Honolulu, pp. 1–2
(2016)
9. van der Goes, F., Ward, C., Astgimath, S., Yan, H., Riley, J., Mulder, J., Wang, S., Bult, K.:
11.4 A 1.5-mW 68-dB SNDR 80-MS/s 2x interleaved SAR assisted pipelined ADC in 28nm
CMOS. Proc. IEEE International Solid-State Circuits Conference, San Francisco, pp. 200–201
(2014)
10. Gönen, B., Sebastiano, F., van Veldhoven, R., Makinwa, K.A.A.: A 1.65mW 0.16mm² dynamic
zoom ADC with 107.5dB DR in 20kHz BW. 2016 IEEE International Solid-State Circuits
Conference (ISSCC), San Francisco, pp. 282–283 (2016)
11. Gönen, B., Sebastiano, F., Quan, R., van Veldhoven, R., Makinwa, K.A.A.: A dynamic zoom
ADC with 109-dB DR for audio applications. IEEE J. Solid-State Circuits, (accepted for
publication)
12. Schreier, R., Temes, G.C.: Understanding Delta-Sigma Data Converters. John Wiley and Sons,
Hoboken (2005)
13. Murmann, B.: A/D converter trends: power dissipation, scaling and digitally assisted
architectures. 2008 IEEE Custom Integrated Circuits Conference, San Jose, pp. 105–112 (2008)
14. Murmann, B.: ADC Performance survey 1997–2016. [Online]. Available:
http://web.stanford.edu/~murmann/adcsurvey.html
15. Pelgrom, M.J.M.: Analog-to-Digital Conversion. Springer, Cham (2017)
16. Chae, Y., Han, G.: Low voltage, low power, inverter-based switched-capacitor delta-sigma
modulator. IEEE J. Solid State Circuits. 44(2), 458–472 (2009)
17. Christen, T.: A 15-bit 140-µW scalable-bandwidth inverter-based ΔΣ modulator for a MEMS
microphone with digital output. IEEE J. Solid-State Circuits. 48(7), 1605–1614 (2013)
18. Lee, S., Jo, W., Song, S., Chae, Y.: A 300-µW audio ΔΣ modulator with 100.5-dB DR using
dynamic bias inverter. IEEE Trans. Circuits Syst. I, Reg. Papers. 63(11), 1866–1875 (Nov.
2016)
19. Steiner, M., Greer, N.: 15.8 A 22.3b 1kHz 12.7mW switched-capacitor ΔΣ modulator with
stacked split-steering amplifiers. 2016 IEEE International Solid-State Circuits Conference
(ISSCC), San Francisco, pp. 284–286 (2016)
20. van Veldhoven, R.H.M., van Roermund, A.H.M.: Robust Sigma Delta Converters. Springer,
Dordrecht (2017)
21. van Veldhoven, R.H.M., Rutten, R., Breems, L.J.: An inverter-based hybrid ΔΣ modulator.
2008 IEEE International Solid-State Circuits Conference Digest of Technical Papers, San
Francisco, pp. 492–630 (2008)
22. van Veldhoven, R.H.M., Nizza, N., Breems, L.J.: Technology portable, 0.04mm², GHz-rate
ΣΔ modulators in 65nm and 45nm CMOS. 2009 Symposium on VLSI Circuits, Kyoto, pp. 72–73
(2009)
23. Souri, K., Chae, Y., Makinwa, K.A.A.: A CMOS temperature sensor with a voltage-calibrated
inaccuracy of ±0.15 °C (3σ) from −55 °C to 125 °C. IEEE J. Solid State Circuits. 48(1),
292–301 (2013)
24. De Berti, C., Malcovati, P., Crespi, L., Baschirotto, A.: A 106-dB A-weighted DR low-power
continuous-time ΣΔ modulator for MEMS microphones. IEEE J. Solid State Circuits. 51(7),
1607–1618 (2016)
25. Billa, S., Sukumaran, A., Pavan, S.: A 280-µW 24-kHz-BW 98.5dB-SNDR chopped
single-bit CT ΔΣM achieving <10-Hz 1/f noise corner without chopping artifacts. 2016 IEEE
International Solid-State Circuits Conference (ISSCC), San Francisco, pp. 276–277 (2016)
26. Leow, Y.H., Tang, H., Sun, Z.C., Siek, L.: A 1 V 103-dB 3rd-order audio continuous-time
ΔΣ ADC with enhanced noise shaping in 65 nm CMOS. IEEE J. Solid State Circuits. 51(11),
2625–2638 (2016)
27. Lo, T.Y.: A 102dB dynamic range audio sigma-delta modulator in 40nm CMOS. 2011 IEEE
Asian Solid State Circuits Conference (A-SSCC), Jeju, pp. 257–260 (2011)
Part II
Smart Sensors for the IoT
The second part of this book is dedicated to recent advances in the field of sensors,
interfaces, and references intended for use in wearable and IoT applications. Since
such applications are usually battery powered, their key requirement is for high
energy efficiency, which can then be combined with aggressive duty cycling to
achieve extremely low (nW) levels of average power.
The first chapter, by Nick van Helleputte et al., discusses advancements in the
design of analog circuits intended for use in wearable healthcare applications. A
number of general trends, e.g., toward multimodal sensing, are discussed. Circuit
topologies for the most relevant sensing modalities, e.g., ExG, bio-impedance, and
photoplethysmogram (PPG), are presented, as well as some recent state-of-the-art
implementations.
The second chapter, by Rajesh Pamula, Chris van Hoof, and Marian Verhelst,
presents an ultra-low power PPG readout circuit that exploits various mixed-signal
processing techniques. In particular, the use of compressive sampling (CS)
allows the power consumption of its LED driver to be reduced by 30×. Heart rate
information is then extracted in the compressed domain, thus avoiding the use of
complex signal reconstruction techniques.
The third chapter, by David Ruffieux et al., describes an ultra-low power (240 nA)
real-time clock module that achieves a typical accuracy of 1 ppm at 1 Hz over
the industrial temperature range (−40 to 85 °C). It combines a miniature 32 kHz
quartz crystal and an ASIC in a miniature 8-pin ceramic package. An all-digital
interpolation scheme allows its 1 Hz output to be trimmed with a resolution of
0.1 ppm, resulting in significant savings in both circuit area and power consumption.
The fourth chapter, by Sining Pan and Kofi Makinwa, presents two resistor-based
CMOS temperature sensors. One is based on a Wien bridge RC filter, which has a
temperature-dependent phase shift, while the other is based on a Wheatstone bridge,
which outputs a temperature-dependent current. In both cases, the bridge outputs are
digitized by continuous-time delta-sigma modulators. This results in sensors with
state-of-the-art energy efficiency as well as low power dissipation (<200 µW).
In the fifth chapter, by Javier Perez Sanjurjo et al., an integrating dual-slope
(DS) capacitance-to-digital converter (CDC) is presented. This is used to digitize
Nick Van Helleputte, Jiawei Xu, Hyunsoo Ha, Roland Van Wegberg,
Shuang Song, Stefano Stanzione, Samira Zaliasl, Richard van den Hoven,
Wenting Qiu, Haoming Xin, Chris Van Hoof, and Mario Konijnenburg
7.1 Introduction
Global demographic trends, like increased access to healthcare and aging popu-
lation, impose tremendous pressure on the traditional healthcare system. There is
a clear ongoing paradigm shift toward preventive healthcare, which aims to avoid
people getting sick in the first place or detect the onset of illness as soon as possible.
The motivation is obvious, as such a preventive system will increase the quality
of life, while reducing the cost of medical care. A critical requirement for such
a preventive healthcare system is the ability to monitor the health status in an
unobtrusive and permanent manner. Most medically relevant diagnostic equipment
today is still expensive and bulky and requires trained professionals to operate
it and interpret the data. Recent years have seen a significant rise in research
toward bringing automated, reliable, high-quality diagnostic health assessment to a
wearable platform.
Figure 7.1 shows a general block diagram for a wearable healthcare device. There
is an analog front-end (AFE) which records (multiple) physiological signals and
converts them to digital. There are numerous parameters that can be monitored
that have clinical relevance for preventive healthcare systems. These include the
ExG (ECG, EMG, EEG) signals are measured on the surface of the skin by
electrodes connecting to a high input impedance instrumentation amplifier (IA).
This IA usually dominates the overall performance, such as noise, input impedance,
CMRR, and power.
Sub-µW ExG readout ICs with low supply voltages not only increase the continuous
operation time of biomedical sensors but also reduce the size of the battery.
This makes the ULP ExG readouts very popular for wireless biosensor nodes.
However, minimizing power dissipation by scaling down the supply voltage is not
Fig. 7.2 Multi-voltage IA using local 0.2 V supply in its first stage [2]
To improve the input impedance, [6, 7] proposed auxiliary buffers to drive the input
coupling capacitance before the main input chopper switches (see Fig. 7.3).
Thus, the current to charge the input capacitance (Cin) is provided by the buffers
instead of the input signal source. This technique ensures 300 MΩ input impedance
over the ExG bandwidth [6] compared to the IAs utilizing a positive feedback
Fig. 7.3 Input impedance boosting by using two auxiliary buffers [6]
loop for impedance boosting (bootstrapping). The auxiliary buffers provide better
stability and less dependence on the IA's bandwidth. On the other hand, the analog
buffers are power hungry, and their 1/f noise can degrade the IA's low-frequency
noise performance, even if the buffers are duty cycled to be active only for a short
period.
Apart from using the capacitively coupled IA, a low-supply voltage IA can be
implemented in the time domain. A time-domain ECG amplifier (Fig. 7.4) is proposed
in [8], in which the input biopotential signal is first converted into a 2b digital output
through a time-domain ADC and then decimated by a CIC digital filter. The large
dynamic range requirement is moved to the time domain and thus decoupled from VDD.
In addition, the electrode offset is compensated, as in [9], via a digitally
assisted servo loop. Nevertheless, time-domain IAs suffer from quantization noise
of the comparator. Although the noise in the ExG bandwidth can be reduced by using a
higher-frequency clock, this comes at the cost of higher power consumption.
Table 7.1 shows a performance overview of some recently published instrumen-
tation amplifiers. As is clear, the field spans various design nodes and architectures
that focus on very high performance or ultralow power.
7.3.1 Introduction
It is clear that the measured signal VIN also contains the electrode impedance
(ZELEC), which is usually much higher than the signal of interest (ZBODY), and this
limits its application to monitoring of only relative variation of BIOZ. Furthermore,
Fig. 7.5 Two (a) and four (b) electrode measurements of BIOZ
noise from the current generator ($i_{n,CG}^2$) becomes significant due to the large ZELEC
(a few kΩ). For better accuracy, BIOZ can be measured with a tetrapolar method
(Fig. 7.5b), where two electrodes are injecting current, while the remaining two are
sensing the voltage. In this case, the equations become
The input signal now only contains ZBODY between the two sensing electrodes
because there is no current flowing through the electrodes for voltage sensing. Of
course this assumes an infinite input impedance, but in reality the input impedance
tends to be high enough to make this approximation. However, it is important to
understand how a finite input impedance might cause ZELEC to appear partially
in the formula as well. The effect of the current noise in this case is negligible, thanks
to the small ZBODY (<200 Ω) (Eq. 7.4). Therefore, four-electrode measurements are
preferred to achieve high accuracy as well as to obtain the absolute value of ZBODY
[12–16].
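The difference between the two measurement methods can be illustrated numerically. The sketch below is not taken from the chapter's equations; it assumes simple series models: the two-electrode readout sees ZBODY plus two electrode impedances, while the four-electrode readout sees ZBODY attenuated by a divider formed by the two sensing electrodes and a finite differential input impedance z_in. The current and impedance values are hypothetical, chosen to match the orders of magnitude quoted in the text.

```python
# Idealized BIOZ measurement models (assumptions for illustration only):
#   two-electrode:  V_IN = I_CG * (Z_BODY + 2 * Z_ELEC)
#   four-electrode: V_IN = I_CG * Z_BODY * Z_IN / (Z_IN + 2 * Z_ELEC)
# Z_IN is the differential input impedance of the readout front-end.

def two_electrode_vin(i_cg, z_body, z_elec):
    """Voltage seen when the same electrodes inject current and sense voltage."""
    return i_cg * (z_body + 2.0 * z_elec)

def four_electrode_vin(i_cg, z_body, z_elec, z_in=float("inf")):
    """Voltage seen by separate sensing electrodes loaded by z_in."""
    if z_in == float("inf"):
        return i_cg * z_body          # no current in the sensing electrodes
    return i_cg * z_body * z_in / (z_in + 2.0 * z_elec)

I_CG = 100e-6     # hypothetical injected current, A
Z_BODY = 100.0    # ohm (text: < 200 ohm)
Z_ELEC = 2e3      # ohm (text: a few kilo-ohm)

z2 = two_electrode_vin(I_CG, Z_BODY, Z_ELEC) / I_CG
z4_ideal = four_electrode_vin(I_CG, Z_BODY, Z_ELEC) / I_CG
z4_finite = four_electrode_vin(I_CG, Z_BODY, Z_ELEC, z_in=10e6) / I_CG

print(f"two-electrode apparent impedance : {z2:.1f} ohm")
print(f"four-electrode, ideal input      : {z4_ideal:.2f} ohm")
print(f"four-electrode, 10 Mohm input    : {z4_finite:.2f} ohm")
```

Under these assumptions the two-electrode reading is dominated by the electrodes, whereas the four-electrode reading deviates from ZBODY only by the small divider error, which still depends on ZELEC.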
The bio-impedance depends on frequency because it has capacitive
as well as resistive characteristics. Depending on the application (e.g., vital signs
recording, respiration measurement, fluid retention analysis), one might be
interested in either the resistive or the capacitive part, and one might want to
characterize these at different frequencies. For continuous monitoring of respiration
and heartbeat, BIOZ is often measured at a fixed single frequency (SF) of less than
100 kHz [12, 13, 17]. On the other hand, multifrequency (MF) measurement is preferred
for in-depth analysis of body fluid, such as electrical impedance tomography
(EIT), lung fluid accumulation, and de-/overhydration detection, requiring a wide
range of frequencies from 1 kHz to 1 MHz [14, 15]. In the following section, the
state-of-the-art BIOZ readout circuits for SF and MF measurement will be shown.
For BIOZ MF measurement, the readout front-end typically must handle larger
BWs. The traditional BIOZ readout, where the modulated input signal is amplified
first and demodulated back to baseband, requires power-hungry wide-BW IAs
(BW > 1 MHz). A digital-intensive broadband approach using a maximum length
sequence (MLS), which has an equally distributed power spectrum, was introduced,
and it achieved a fast measurement (100 ms) with fine resolution (100 mΩ) (Fig. 7.9)
[14]. However, this work also suffers from wide-BW requirements resulting in a
limited frequency range up to 125 kHz.
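The "equally distributed power spectrum" of an MLS can be checked with a short Python sketch. It generates one period of a 7-bit sequence with a linear-feedback shift register and verifies that all nonzero frequency bins carry the same power. The register length and tap positions (7, 6) are illustrative assumptions; the sequence actually used in [14] is not specified in this text.

```python
import cmath

def mls_bits(n_bits=7, taps=(7, 6)):
    """One period (2**n_bits - 1 bits) of a Fibonacci LFSR output."""
    state = [1] * n_bits                  # any nonzero seed works
    out = []
    for _ in range(2 ** n_bits - 1):
        out.append(state[-1])             # output is the last stage
        feedback = 0
        for t in taps:                    # XOR of the tapped stages (1-indexed)
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]   # shift by one stage
    return out

def circular_autocorr(seq, lag):
    n = len(seq)
    return sum(seq[k] * seq[(k + lag) % n] for k in range(n))

def bin_power(seq, k):
    """Squared magnitude of DFT bin k."""
    n = len(seq)
    total = sum(v * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, v in enumerate(seq))
    return abs(total) ** 2

bits = mls_bits()
x = [2 * b - 1 for b in bits]             # map {0, 1} to {-1, +1}
n = len(x)

off_peak = {circular_autocorr(x, lag) for lag in range(1, n)}
powers = [bin_power(x, k) for k in range(1, n)]

print("period:", n)
print("off-peak autocorrelation values:", off_peak)
print("min/max bin power:", min(powers), max(powers))
```

The two-valued autocorrelation (n at lag 0, a single constant elsewhere) is what makes the spectrum flat, so one excitation probes all frequencies in the band at once.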
A pre-modulation to an intermediate frequency was introduced for low power
consumption [15, 16] (Fig. 7.10). The high-frequency input signal is first demodulated
to baseband and then modulated to an intermediate frequency (kHz) which is low
enough to be handled by low-power IAs (BW 30 kHz [15]) and also high enough
to mitigate 1/f noise. This method also enables quadrature-phase measurement by
using a 90° shifted clock for demodulation. This work achieved a wide frequency range
up to 1.24 MHz while consuming only 52 µW.
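A minimal sketch of quadrature demodulation follows, to show how in-phase and 90°-shifted references separate the resistive and reactive parts of an impedance. It is a simplification: it uses sinusoidal references on sampled data rather than the chopper clocks of [15], and the excitation current, frequencies, and impedance value are hypothetical.

```python
import math

def iq_demodulate(samples, f_sig, f_s):
    """Correlate samples with in-phase (sin) and quadrature (cos) references.

    Assumes the record spans an integer number of signal periods.
    """
    n = len(samples)
    i_sum = 0.0
    q_sum = 0.0
    for k, v in enumerate(samples):
        phase = 2.0 * math.pi * f_sig * k / f_s
        i_sum += v * math.sin(phase)
        q_sum += v * math.cos(phase)
    return 2.0 * i_sum / n, 2.0 * q_sum / n

I0 = 100e-6                  # hypothetical excitation amplitude, A
F_SIG = 50e3                 # excitation frequency, Hz
F_S = 1.6e6                  # sample rate, Hz (32 samples per period)
N = 320                      # 10 full periods
R_TRUE, X_TRUE = 50.0, -20.0 # hypothetical impedance, ohm (capacitive)

mag = math.hypot(R_TRUE, X_TRUE)
phi = math.atan2(X_TRUE, R_TRUE)
voltage = [I0 * mag * math.sin(2.0 * math.pi * F_SIG * k / F_S + phi)
           for k in range(N)]

i_comp, q_comp = iq_demodulate(voltage, F_SIG, F_S)
r_est = i_comp / I0          # resistive part
x_est = q_comp / I0          # reactive part
print(f"R = {r_est:.3f} ohm, X = {x_est:.3f} ohm")
```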
To conclude our BIOZ recording overview chapter, Table 7.2 shows a state-of-
the-art comparison.
7.4.1 Introduction
A PPG signal is recorded by illuminating the skin and measuring the transmitted or
reflected light that is modulated by the blood flow (heartbeat) [18]. While clinical
PPG is a well-established field, reliable ambulatory PPG recording remains chal-
lenging because of motion artifacts, ambient light interference, and physiological
differences among people. This necessitates the use of high dynamic range readouts
with ambient light cancellation techniques. Besides, the power consumption of a
PPG system (including the LED drivers and readout channels) is usually much
higher than, for example, ECG, primarily because of the power required for the
LED drivers. Low-power circuit solutions are thus paramount for wearable devices
to reduce the battery size and improve user comfort.
A typical PPG recording circuit is shown in Fig. 7.11. It consists of an LED driver,
an LED, and a photodetector together with an appropriate readout channel. In the rest
of this chapter, three design examples are discussed. The first example [12] supports
various LEDs with different wavelengths together with multiple photo sensors for
dynamically finding an optimized placement of LED/PD that functions well for
all people. It also provides an ambient light cancellation technique which helps to
partially remove the motion artifact and increase the dynamic range of the readout
channel. The second example [20] provides one of the highest reported dynamic
ranges, which improves the robustness of operation during ambulatory recording.
The third example [19] focuses on power minimization, in particular the power
of the LED driver.
Figure 7.12 shows the low-power PPG readout channel with integrated ambient
interferer removal from [12]. It consists of a TIA and an integrator, which amplifies
the signal and removes the ambient component. Its operation is synchronized to the
LEDON (timing diagram). There are two phases of one integration cycle (PD&INT
enabled period in timing diagram): in the first phase, the ambient light signal
is integrated on CINT; in the second phase, CINT is swapped, and then the
ambient light together with the LED-modulated PPG signal is integrated on the
same capacitor. The pulse repetition frequency (PRF) is 4 kHz, and the pulse width
is typically 10–20 µs. Because the LED pulse width is narrow (10 µs),
the ambient light can be regarded as constant during the whole integration period;
thus, the signal from the ambient light is effectively cancelled. This can also be
done by double sampling at the TIA output without the integrator and subtraction
in the digital domain [21]. Alternatively, coarse subtraction of the ambient/DC current
can be achieved via an IDAC at the input of the TIA, to reduce the input DR
requirements of the TIA.
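The two-phase cancellation can be modeled as a charge balance on CINT. The Python sketch below is a behavioral illustration only, assuming an ideal integrator and an ideal capacitor swap that inverts the stored charge; the current levels and the 10-pF capacitor are hypothetical values, not taken from [12].

```python
def integrate(current_fn, t_start, t_stop, steps=1000):
    """Midpoint-rule integral of a current (A) over time (s): charge in C."""
    dt = (t_stop - t_start) / steps
    return sum(current_fn(t_start + (k + 0.5) * dt) for k in range(steps)) * dt

def two_phase_output(i_ambient, i_led, t_pulse, c_int):
    """Integrator output voltage after one ambient phase and one LED phase."""
    q = integrate(i_ambient, 0.0, t_pulse)   # phase 1: ambient light only
    q = -q                                   # swapping C_INT inverts the charge
    q += integrate(lambda t: i_ambient(t) + i_led, t_pulse, 2.0 * t_pulse)
    return q / c_int                         # phase 2 adds ambient + LED signal

T_PULSE = 10e-6     # s, LED pulse width quoted in the text
C_INT = 10e-12      # F, hypothetical
I_LED = 50e-9       # A, hypothetical LED-induced photocurrent
I_AMB = 1e-6        # A, hypothetical ambient photocurrent

v_const = two_phase_output(lambda t: I_AMB, I_LED, T_PULSE, C_INT)
# Ambient light drifting by 1% over the two phases leaves a small residue.
v_drift = two_phase_output(lambda t: I_AMB * (1.0 + 0.01 * t / (2.0 * T_PULSE)),
                           I_LED, T_PULSE, C_INT)

print(f"constant ambient: {v_const * 1e3:.3f} mV")
print(f"drifting ambient: {v_drift * 1e3:.3f} mV")
```

With a constant ambient current only the LED term I_LED * T_PULSE / C_INT survives, which is why a short pulse, over which the ambient light barely changes, gives good cancellation.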
The LED drivers for pulse generation (Fig. 7.13) are organized in an 8 × 8
matrix with eight current drivers and eight driving voltage selectors, to control
up to 64 LEDs in an orthogonal fashion. The LED drivers are controlled by an
LED sequence table implemented in the digital controller, and the readout channel
is kept synchronous. This feature provides the possibility for various settings, in
terms of color, biasing current and voltage of the LEDs, to cope with physiological
differences among people.
Figure 7.14 shows a high-DR PPG readout channel where there are two amplification
stages (TIA + PGA) [20]. An ambient light measurement (obtained by reading
out the PD without pulsing the LED) is inserted between LED pulse phases to enable
system-level correlated double sampling. A fully differential topology is used for the
TIA/PGA and the ADC, improving the DR over a single-ended solution by 6 dB.
Moreover, an offset DAC is used to remove the DC component in the signal to allow
further amplification. In a PPG signal, the AC component is usually less than 1% of
the DC component; therefore, effective DC cancellation helps to increase dynamic
range significantly. The obtained DR is 97 dB, which is among the highest reported.
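The benefit of DC cancellation can be estimated with simple arithmetic. The sketch below assumes a fixed full-scale range and ideal DC removal, so the extra gain that becomes available equals the ratio of the peak signal before and after cancellation; the function name and the normalized values are illustrative.

```python
import math

def extra_gain_db(dc, ac_peak):
    """Extra gain (dB) available once the DC component is ideally removed.

    Before cancellation the channel must accommodate dc + ac_peak;
    afterwards it only needs to accommodate ac_peak.
    """
    return 20.0 * math.log10((dc + ac_peak) / ac_peak)

gain_db = extra_gain_db(dc=1.0, ac_peak=0.01)    # AC component at 1% of DC
differential_db = 20.0 * math.log10(2.0)         # doubled swing, fully differential

print(f"extra gain after DC cancellation: {gain_db:.1f} dB")
print(f"differential vs single-ended    : {differential_db:.2f} dB")
```

For an AC component at 1% of the DC level this gives roughly 40 dB of additional amplification headroom, and the doubled signal swing of a fully differential path corresponds to the 6 dB quoted above.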
Table 7.3 shows an overview of PPG systems. The input-referred current noise is of
significant importance, since it determines the sensitivity of the readout channel. An rms
noise of <1 nA is typically achieved, while the noise can be reduced by increasing
the gain of the channel at the cost of dynamic range. Providing a low-power readout
channel with sub-100 pArms noise and high dynamic range at the same time is still
challenging.
Biomedical signals are characterized by medium to high dynamic ranges but fairly
low bandwidths. SAR ADCs (8b–12b) have been very popular in this field because
of their minimal use of high-accuracy analog circuits [10, 24, 25]. However, their
resolution is limited to about 12 bits due to matching requirements for the capacitors
in the DAC. To accommodate higher dynamic ranges, ΔΣ ADCs are an attractive
alternative, making it possible to shift most of the processing to the digital domain
and relax the requirements on the analog front-end. This also fits nicely with the
trend toward more digital signal processing being integrated.
Konijnenburg et al. [12] propose a ΔΣ ADC for biomedical applications. To
minimize power consumption, the ADC operates on a 32 kHz clock, avoiding
power-hungry generation of high-speed accurate low-jitter sampling clocks. The
ADC is a single-loop second-order feed-forward SC ΔΣ ADC with 5-bit
successive approximation (SA) in-loop quantization for the required ENOB of 15
bits. Figure 7.15 shows the measured output spectrum achieving an SNDR of 85.4 dB
and an SNR of 87 dB with 3 µA current consumption.
It is worth noting that with a lot of ΔΣ ADCs, the modulator is not the
power-limiting block, but the driver and reference generation are. In [12] the driver,
which doubles as programmable gain amplifier (PGA) and anti-alias filter, consumes
10 µA. The input stage of the PGA consists of a differential difference amplifier
(DDA) with two fully differential inputs [26, 27] and is implemented as a
Miller-compensated two-stage amplifier. The amplifier is chopper compensated to reduce
flicker noise [4]. The first-stage amplifier is a symmetric amplifier with degenerated
input differential pairs for handling the high signal swing at the input. The class
A/AB output stage [28] helps to drive the ΔΣ ADC by enhancing the slew rate
(SR).
While ΔΣ ADCs can achieve the required resolution, they do not specifically
address multimodality, where multiple signals must be converted simultaneously.
Indeed, [12] uses dedicated ADCs for different channels, resulting in large silicon
area. A ΔΣ ADC cannot easily be multiplexed due to the memory effect of the loop
Table 7.3 Benchmarking of PPG systems
                  [12]          [19]          [20]             [21]          [22]             [23]
Technology        0.18 µm       0.18 µm       0.13 µm          0.18 µm       N.A.             N.A.
Supply voltage    1.2 V         1.2 V         1.5 V            1.8 V         2.0 V            1.8 V
Input noise       15.4 pVrms    486 pArms     4 pArms          600 pArms     36 pArms         15 nArms
                  (1–64 Hz)     (1–64 Hz)     (0.1–20 Hz)      (0.1–10 Hz)   (0.1–20 Hz)      (N.A.)
DC cancellation   10 µA         10 µA         12 µA and VDAC   100 µA        14 µA and VDAC   No
Dynamic range     87 dB         N.A.          97 dB            91 dB         99 dB            72.3 dB
LED current       5–160 mA      5–160 mA      12 mA            0.1–25.6 mA   1–100 mA         8–250 mA
Channel power     130 µW        172 µW        69 µW            216 µW        400 µW           200 µW
[Figure: output spectrum, magnitude (dB) versus frequency (Hz)]
[Figure: block diagram with bootstrap switch, thermometer-coded 3b MSB DAC, non-binary-weighted (9+2)b LSB DAC, noise shaping, tri-level and monotonic switching, data weighted averaging (DWA), mismatch error shaping (MES), and 12b Dout at fs/M]
Fig. 7.17 Concept for an incremental SDM with an ER ADC [35, 36]
Fig. 7.18 Concept for an incremental SDM with a zoom ADC [37]
7.6 Conclusions
Wearable healthcare devices have already entered our daily lives in a very tangible
way. They have the potential for even more disruptive fundamental societal impact
by enabling true preventive healthcare. While research in these domains is advancing
and great breakthrough results are being presented on a regular basis, power
consumption remains prohibitively high for true, long-term monitoring. Hence, a
continued quest for ever-lower power consumption remains a research challenge.
An important general trend is multimodal readouts [12]. On the one hand, more
signal modalities simply mean that more relevant health parameters can
be observed. On the other hand, they also have the potential to increase the robustness
of the recording or improve the power efficiency. For example, biopotential (ECG),
bio-impedance, and optical (PPG) recordings can all be used to measure heart rate. While all
of these will suffer from motion artifacts, the underlying mechanisms are different.
Hence, through sensor fusion algorithms, several recordings, each potentially of
low quality and disrupted by heavy artifacts, can be combined to improve the
overall quality. This chapter focused on analog circuits for a number of common
signal modalities for wearable healthcare and discussed a number of
state-of-the-art implementations.
References
1. Song, S., et al.: A low-voltage chopper-stabilized amplifier for fetal ECG monitoring with a
1.41 power efficiency factor. IEEE Trans. Biomed. Circuits Syst. 9(2), 237–247 (2015)
2. Yaul, F.M., Chandrakasan, A.P.: A sub-µW 36nV/√Hz chopper amplifier for sensors using a
noise-efficient inverter-based 0.2V-supply input stage. IEEE ISSCC. 94–95 (2016)
3. Harpe, P., Gao, H., van Dommele, A.R., Cantatore, E., van Roermund, A.: 0.20 mm² 3 nW
signal acquisition IC for miniature sensor nodes in 65 nm CMOS. IEEE J. Solid State Circuits.
51(1), 240–248 (2016)
4. Enz, C.C., Temes, G.C.: Circuit techniques for reducing the effects of op-amp imperfections:
autozeroing, correlated double sampling, and chopper stabilization. Proc. IEEE. 84(11),
1584–1614 (1996)
5. Harrison, R.R., Charles, C.: A low-power low-noise CMOS amplifier for neural recording
applications. IEEE J. Solid State Circuits. 38(6), 958–965 (2003)
6. Chandrakumar, H., Markovic, D.: 5.5 A 2 µW 40 mVpp linear-input-range chopper-stabilized
bio-signal amplifier with boosted input impedance of 300 MΩ and electrode-offset filtering.
IEEE ISSCC. 59, 96–97 (2016)
7. Birk, C., Mora-Puchalt, G.: A 60V capacitive gain 27nV/√Hz 137dB CMRR PGA with ±10V
inputs. IEEE ISSCC. 376–377 (2012)
8. Mohan, R., Zaliasl, S., Gielen, G., Van Hoof, C., Van Helleputte, N., Yazicioglu, R.F.: A 0.6V
0.015mm² time-based biomedical readout for ambulatory applications in 40nm CMOS. IEEE
ISSCC. 482–483 (2016)
9. Muller, R., Gambini, S., Rabaey, J.M.: A 0.013mm² 5µW, DC-coupled neural signal
acquisition IC with 0.5 V supply. IEEE J. Solid State Circuits. 47(1), 232–243 (2012)
10. Van Helleputte, N., et al.: A 345 µW multi-sensor biomedical SoC with bio-impedance,
3-channel ECG, motion artifact reduction, and integrated DSP. IEEE J. Solid State Circuits.
50(1), 230–244 (2015)
11. Bin Altaf, M.A., Zhang, C., Yoo, J.: A 16-channel patient-specific seizure onset and termination
detection SoC with impedance-adaptive transcranial electrical stimulator. IEEE J. Solid State
Circuits. 50(11), 2728–2740 (2015)
12. Konijnenburg, M., et al.: A multi(bio)sensor acquisition system with integrated processor,
power management, 8 × 8 LED drivers, and simultaneously synchronized ECG, BIO-Z, GSR,
and two PPG readouts. IEEE J. Solid-State Circuits. 51(11), 2584–2595 (2016)
13. Yan, L., et al.: A 13 µA analog signal processing IC for accurate recognition of multiple
intra-cardiac signals. IEEE Trans. Biomed. Circuits Syst. 7(6), 785–795 (2013)
14. Xu, J., et al.: A low power configurable Bio-Impedance Spectroscopy (BIS) ASIC with
simultaneous ECG and respiration recording functionality. Proc. IEEE ESSCIRC, pp. 396–399
(2015)
15. Ko, H., et al.: Ultra low power Bioimpedance IC with intermediate frequency shifting chopper.
IEEE Trans. Circuits Syst. II. 63(3), 259–263 (2016)
16. Kassanos, P., et al.: An integrated analog readout for multi-frequency Bioimpedance
measurements. IEEE Sensors J. 14(8), 2792–2800 (2014)
17. TI AFE4300 datasheet
18. Reddy, K.A., George, B., Mohan, N.M., Kumar, V.J.: A novel calibration-free method of
measurement of oxygen saturation in arterial blood. IEEE Trans. Instrum. Meas. 58(5),
1699–1705 (2009)
19. Rajesh, P.V., et al.: 22.4 A 172µW compressive sampling photoplethysmographic readout
with embedded direct heart-rate and variability extraction from compressively sampled data.
2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, pp. 386–387
(2016)
20. Sharma, A., et al.: Multi-modal smart bio-sensing SoC platform with >80dB SNR 35µA
PPG RX chain. 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), Honolulu, pp. 1–2
(2016)
21. Winokur, E.S., O'Dwyer, T., Sodini, C.G.: A low-power, dual-wavelength
Photoplethysmogram (PPG) SoC with static and time-varying interferer removal. IEEE Trans.
Biomed. Circuits Syst. 9(4), 581–589 (2015)
22. TI AFE4404 datasheet
23. ADI ADPD103 datasheet
24. Yazicioglu, R.F., et al.: 200 µW eight-channel EEG acquisition ASIC for ambulatory EEG
systems. IEEE ISSCC Digest of Technical Papers, pp. 164–165 (2008)
25. Xu, J., Yazicioglu, R.F., Harpe, P., Makinwa, K.A.A., Van Hoof, C.: A 160 µW 8-channel
active electrode system for EEG monitoring. IEEE ISSCC Digest of Technical Papers, pp.
300–301 (2011)
26. Sackinger, D., Guggenbuhl, W.: A versatile building block: the CMOS differential difference
amplifier. IEEE J. Solid State Circuits. 22(2), 287–294 (1987)
27. Alzaher, H., Ismail, M.: A CMOS fully balanced differential difference amplifier and its
applications. IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process. 48(6), 614–619
(2001)
28. Rabii, S., Wooley, B.A.: A 1.8-V digital-audio sigma-delta modulator in 0.8-µm CMOS. IEEE
J. Solid State Circuits. 32(6), 783–796 (1997)
29. Sebastiano, F., van Veldhoven, R.H.M.: A 0.1-mm² 3-channel area-optimized ADC in 0.16-µm
CMOS with 20-kHz BW and 86-dB DR. 2013 Proceedings of the ESSCIRC (ESSCIRC)
30. Harpe, P., Cantatore, E., van Roermund, A.: An oversampled 12/14b SAR ADC with noise
reduction and linearity enhancements achieving up to 79.1dB SNDR. ISSCC Dig. Tech. Papers.
194–195 (2014)
31. Shu, Y., Kuo, L., Lo, T.: 27.2 An oversampling SAR ADC with DAC mismatch error shaping
achieving 105dB SFDR and 101dB SNDR over 1kHz BW in 55nm CMOS. pp. 458–459
32. Markus, J., Silva, J., Temes, G.C.: Theory and applications of incremental delta sigma
converters. IEEE Trans. Circuits Syst. I. 51(4), 678–690 (2004)
33. Quiquempoix, V., Deval, P., Barreto, A., Bellini, G., Markus, J., Silva, J., Temes, G.C.: A
low-power 22-bit incremental ADC. IEEE J. Solid State Circuits. 41(7), 1562–1571 (2006)
7 Advances in Biomedical Sensor Systems for Wearable Health 143
34. Agnes, A., Bonizzoni, E., Maloberti, F.: High-resolution multi-bit second-order incremental
converter with 1.5- V residual offset and 94-dB SFDR. Analog Integr. Circ. Sig. Process. 72(3),
531539 (2011)
35. Rombouts, P., De Wilde, W., Weyten, L.: A 13.5-b 1.2-V micropower extended counting A/D
converter. IEEE J. Solid State Circuits. 36(2), 176183 (2001)
36. Agah, A., Vleugels, K., Griffin, P.B., Ronaghi, M., Plummer, J.D., Wooley, B.A.: A high-
resolution low-power incremental ADC with extended range for biosensor arrays. IEEE J. Solid
State Circuits. 45(6), 10991110 (2010)
37. Chae, Y., Souri, K., Makinwa, K.: A 6.3 W 20b incremental zoom-ADC with 6ppm INL and
1 V offset. IEEE ISSCC Dig. Tech. Papers, pp. 276277 (2013)
38. Kim, H., et al.: A configurable and low-power mixed signal SoC for portable ECG monitoring
applications. IEEE Trans. Biomed. Circuits Syst. 8(2), 257267 (2014)
Chapter 8
An Ultra-low Power, Robust
Photoplethysmographic Readout Exploiting
Compressive Sampling, Artifact Reduction,
and Sensor Fusion
8.1 Introduction
based methods. The power consumption of a typical PPG acquisition system ranges
from a few mW to tens of mW and is dominated by the power consumption of the
LED driver. Moreover, PPG acquisition is highly susceptible to motion artifacts,
which degrade its robustness and reliability.
In this chapter, a compressive sampling (CS)-based PPG readout is presented,
which reduces the relative LED driver power consumption by up to a factor
of 30. The ASIC also integrates a digital back-end, which performs direct feature
extraction from the CS signal to estimate the average HR, without requiring complex
reconstruction techniques. The possibility of artifact reduction, leveraging sensor
fusion and a spectral subtraction technique, is also presented.
Fig. 8.2 Conventional PPG acquisition system employing uniform LED stimulation and sampling
$$Y = \Phi X \tag{8.1}$$

$$\mathrm{CR} = \frac{N}{M} \tag{8.2}$$
The equivalent (partial) measurement matrix for random subsampling is shown in
Fig. 8.3. It is an M × N reduced-order identity matrix, formed by choosing M rows
from the N × N identity matrix at random. The M rows chosen at random correspond
to the M sampling instants in the time domain (with the row index corresponding to the
sample index). In practice, pseudorandom subsampling schemes are used, showing
Fig. 8.3 Sampling sequence for CS PPG acquisition system and its equivalent measurement
matrix structure
performance on par with fully random sampling. The same pseudorandom sequence
can be reused for every discrete window of length $T_{acq}$ seconds (Fig. 8.3). Compared
to conventional PPG acquisition based on uniform sampling, CS-based PPG
acquisition acquires the signal at an average sampling rate $f_{s,CS}$ given by
$$f_{s,CS} = \frac{f_{s,N}}{\mathrm{CR}} \tag{8.3}$$
where $f_{s,N}$ is the uniform sampling rate.
Therefore, a CS-based PPG acquisition system has an LED driver duty cycle
of $T_{ON} f_{s,CS}$, compared to $T_{ON} f_{s,N}$ for a conventional uniform
sampling PPG acquisition system, which reduces the LED driver
power consumption by a factor of CR.
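The relations above can be illustrated with a minimal sketch, assuming a hypothetical window length, uniform sampling rate, and LED on-time (none of these values come from the chapter). It builds the reduced-order identity matrix from a seeded pseudorandom index set, applies it to a stand-in window, and checks that the LED duty cycle drops by the factor CR.

```python
import random

def subsampling_instants(n, cr, seed=0):
    """Pick M = N // CR distinct sample indices out of N, in increasing order."""
    m = n // cr
    rng = random.Random(seed)  # fixed seed: the same sequence can be reused per window
    return sorted(rng.sample(range(n), m))

def measurement_matrix(n, rows):
    """M x N reduced-order identity: row k has a single 1 at column rows[k]."""
    return [[1 if col == r else 0 for col in range(n)] for r in rows]

def led_duty_cycle(t_on, fs):
    """LED duty cycle for an on-time t_on (s) per sample at a sampling rate fs (Hz)."""
    return t_on * fs

# Hypothetical values, chosen only for illustration.
N, CR = 120, 10
FS_N, T_ON = 125.0, 100e-6

rows = subsampling_instants(N, CR)
phi = measurement_matrix(N, rows)
x = [float(i) for i in range(N)]  # stand-in for one uniformly sampled window
y = [sum(p * xi for p, xi in zip(row, x)) for row in phi]  # Y = Phi X

fs_cs = FS_N / CR  # average CS sampling rate, as in (8.3)
duty_ratio = led_duty_cycle(T_ON, FS_N) / led_duty_cycle(T_ON, fs_cs)
```

Because each row of the matrix contains a single 1, the product simply picks the samples at the chosen instants, which is why no multiplier hardware is implied by this form of CS encoding.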
Fig. 8.4 Various possible CS-based acquisition systems in the context of a BAN. (a) Signal acquisition
is performed at the sensor node, while reconstruction and feature extraction are performed at the
base station. (b) Both CS encoding and decoding are performed on the sensor node, followed by
feature extraction. (c) Feature extraction is performed on the sensor node directly from the CS data
compressed data rather than performing the reconstruction on the sensor node and
then transmitting the data is efficient in terms of power consumption. This approach,
therefore, shifts the problem of signal analysis and extraction of the parameters of
clinical interest to the base station. However, the need for analyzing the signal and
extracting the relevant features locally on the sensor node is becoming important,
particularly for privacy- and latency-sensitive biomedical applications [13]. Yet,
locally reconstructing the signal on the sensor node (Fig. 8.4b) would consume
power in the mW range [9], rendering feature extraction from the reconstructed
signal on the sensor node infeasible for low-power sensing applications.¹
Alternatively, the requirement of reconstructing the signal can be circumvented
if the features of interest can be extracted directly from the CS data (Fig. 8.4c).
This approach enables rapid signal analysis on energy-scarce BAN/WSN platforms
directly from the CS data, without requiring a complex reconstruction process. In
this work, the use of least-squares spectral fitting techniques is explored for power
¹ The benefits of CS encoding and decoding, followed by feature extraction, all on the sensor node,
over the conventional approach of performing feature extraction on the Nyquist-rate sampled signal
might not be obvious. A CS-based approach can be useful in cases where high-power stimulation is
involved, as in PPG acquisition, as well as in cases where the maximum achievable
sampling frequency of the ADC is limited [15].
Fig. 8.5 Estimation of average HR from the frequency spectrum of PPG signal
where $HR_{avg}$ is the average HR in beats per minute (bpm) (Fig. 8.5). Once the
average HR is estimated, HRV can be readily inferred from the variation of the average
HR across successive time intervals over which the spectrum is estimated.
The implementation details of a single-channel PPG readout ASIC [16], leveraging
the concepts of CS and feature extraction from the CS domain discussed in this
section, are presented in Sect. 8.4.
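As an illustration of extracting the average HR directly from randomly subsampled data, the sketch below assumes that the least-squares spectral fit is a classical Lomb-Scargle periodogram and that the HR follows from the spectral peak as 60 times the peak frequency (the usual Hz-to-bpm conversion). Both are assumptions made for this example, as are the sampling rate, window length, test tone, and frequency grid. It is a floating-point software model, not a description of the on-chip implementation.

```python
import math
import random

def lomb_scargle(t, y, freqs):
    """Classical Lomb-Scargle periodogram of samples y taken at arbitrary times t."""
    mean = sum(y) / len(y)
    yc = [v - mean for v in y]
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # The time offset tau makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2.0 * w * ti) for ti in t)
        c2 = sum(math.cos(2.0 * w * ti) for ti in t)
        tau = math.atan2(s2, c2) / (2.0 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        yc_c = sum(a * b for a, b in zip(yc, c))
        yc_s = sum(a * b for a, b in zip(yc, s))
        cc = sum(a * a for a in c)
        ss = sum(a * a for a in s)
        power.append(0.5 * (yc_c ** 2 / cc + yc_s ** 2 / ss))
    return power

# Hypothetical acquisition settings and test tone (1.2 Hz corresponds to 72 bpm).
FS_N, T_ACQ, CR, F_TRUE = 125.0, 8.0, 10, 1.2
n = int(FS_N * T_ACQ)
rng = random.Random(1)
idx = sorted(rng.sample(range(n), n // CR))  # pseudorandom subsampling instants
t = [i / FS_N for i in idx]
y = [math.sin(2.0 * math.pi * F_TRUE * ti) for ti in t]

freqs = [0.5 + 0.01 * k for k in range(301)]  # search grid from 0.5 to 3.5 Hz
psd = lomb_scargle(t, y, freqs)
f_pk = freqs[psd.index(max(psd))]
hr_avg = 60.0 * f_pk  # assumed conversion from peak frequency (Hz) to bpm
```

The periodogram is evaluated only on the retained samples, so no reconstruction of the uniformly sampled signal is involved.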
Fig. 8.6 The architecture of a single-channel CS PPG acquisition ASIC which embeds a DBE for
feature extraction
on-chip bias and reference signals. The DBE comprises a control unit (CU)
that generates the control signals required for the LED driver, the AFE, and
the ADC, as well as the required internal timing and synchronization signals. A direct
memory access (DMA) unit integrated into the DBE transfers the incoming data
from the ADC into one of the data memory (DMEM) banks. The feature extraction
unit (FEU), also part of the DBE, accelerates the process of LSP to enable extraction
of HR directly from the CS PPG signal. The DBE is clocked by an external
clock at 32 kHz. The ASIC also provides wide-scale programmability of both the
gain and bandwidth settings of the AFE and of the CR, thereby extending its utility across
a wide range of photocurrent amplitudes.
The first stage of the readout channel is a TIA that is interfaced to an off-chip
photodiode (PD). The TIA converts the PPG signal, which is acquired as a current
signal at the output of the PD, into a voltage signal that is further processed by
the signal processing chain in the voltage domain. The TIA is realized by employing
resistive feedback ($R_f$) around a two-stage Miller-compensated OTA. The large
reverse-bias junction capacitance of the PD, which manifests itself as a parasitic
capacitance ($C_p$) at the inverting node, poses issues for the stability of the TIA.
Hence, a compensation capacitor ($C_f$) is added in parallel with $R_f$ to improve the
stability margin of the TIA. As mentioned in Sect. 8.1, the relatively large DC
component of the photocurrent necessitates a large DR for the readout.
The channel DR requirements can, however, be relaxed if the DC component of
the current is rejected early in the signal processing chain. This is achieved by
interfacing a 5-bit current DAC (IDAC), capable of sourcing up to 10 µA of current,
at the input of the TIA.
The output of the TIA is fed into a switched integrator (SI), which is realized by
incorporating a switched capacitor (SC) in feedback around the OTA. The output
of the TIA is converted into a current signal through $R_{int}$, which is then integrated
onto $C_{int}$ for a duration of $T_{int}$, thereby providing additional voltage amplification.
The SI stage, apart from providing additional gain, also acts as a noise-limiting
filter [17]. This is particularly important in pulsed PPG acquisition systems, where
the thermal noise originating from the OTA of the TIA exhibits noise peaking at
high frequencies. A mixed-signal feedback loop, comprising an SC low-pass filter
(SC LPF), comparators, and an up-down counter, tracks the output DC level of the
SI. The feedback loop generates a 5-bit control code for the LED drive/IDAC current
such that the DC output of the SI stays within the threshold values
($V_{refmin}$ and $V_{refmax}$), ensuring proper utilization of the available channel DR.
The output of the SI is then digitized using a 12-bit SAR ADC, which uses
a split-capacitor DAC to reduce the area requirements, with a unit capacitance
($C_u$) of 800 fF. The pseudorandom subsampling instants of the ADC are controlled
by the CU that forms part of the DBE. The digitized data at the output of the ADC
is fed into the DBE for further processing to extract the HR. Interested readers
are referred to [18], where a detailed description of the DBE architecture and
implementation is presented.
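The DC-tracking behaviour of the mixed-signal loop can be pictured with a behavioural sketch. It assumes a purely linear, hypothetical map from residual DC photocurrent to the SI output level, hypothetical threshold voltages, and one counter step per iteration. Only the 5-bit code width, the 10 µA IDAC range, and the 1.6 µA DC photocurrent are taken from the text; everything else is illustrative.

```python
# Values from the text: 5-bit code, IDAC full scale of 10 uA, 1.6 uA DC photocurrent.
CODE_MAX = 31
I_LSB = 10e-6 / CODE_MAX
I_DC = 1.6e-6

# Hypothetical model parameters, chosen only for illustration.
R_EQ = 1.0e6                     # assumed DC transresistance from input to SI output (ohm)
V_MID = 0.6                      # assumed SI output level for zero residual DC (V)
V_REF_MIN, V_REF_MAX = 0.4, 0.8  # assumed comparator thresholds (V)

def si_dc_output(i_dc, code):
    """DC level at the SI output after the IDAC subtracts code * I_LSB."""
    return V_MID + R_EQ * (i_dc - code * I_LSB)

def track_dc(i_dc, steps=64):
    """Up-down counter: step the code until the SI DC level sits inside the window."""
    code = 0
    for _ in range(steps):
        v = si_dc_output(i_dc, code)
        if v > V_REF_MAX and code < CODE_MAX:
            code += 1  # too much DC left: subtract more current
        elif v < V_REF_MIN and code > 0:
            code -= 1  # over-compensated: subtract less current
    return code, si_dc_output(i_dc, code)

code, v_final = track_dc(I_DC)
```

In this model the window between the two thresholds is wider than one IDAC step, so the counter settles on a single code instead of toggling between neighbours.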
Fig. 8.7 (Left) Signal acquisition with CRs 8x and 30x when LED is stimulated with a sinusoidal
current at 1.2 Hz. (Right) In vivo acquired PPG signal through the ASIC with a CR of 10x
154 V.R. Pamula et al.
Fig. 8.8 Measured frequency corresponding to the peak in the PSD ($f_{pk}$) from the ASIC with the LED
modulated with a sinusoidal current whose frequency is swept from 0.5 to 3.4 Hz
² Standard database [19] PPG signals lack annotations, and hence sinusoidal modulation is chosen.
³ The LED driver power consumption is measured while acquiring the PPG signal of a healthy
individual. At the reported power levels, the resulting photocurrent is measured to have an AC
component of 45 nApp, while the DC component is measured to be 1.6 µA.
Fig. 8.9 The ASIC chip micrograph and measured power consumption breakdown of the ASIC
and the off-chip LED driver for different CRs
Fig. 8.10 In vivo acquired PPG signals under different SNR conditions. The corresponding values
of acquired photocurrent and LED driver current are indicated in Table 8.1
314 mA. The HR, estimated from the uniformly sampled PPG signal using an FFT,
serves as the reference. The PPG signal is then compressively acquired at a CR of
10x, and the average HR estimated by the ASIC is compared against the reference.
As can be seen in Table 8.1, the error in the average HR estimated at a CR of 10x is
within 2 bpm under varying SNR conditions. The LED driver power consumption,
on the other hand, scales inversely with the CR, from 6.1 mW to 615 µW for an
acquired AC component of photocurrent of 12 nApp.
Finally, the performance of the CS PPG ASIC is summarized and compared
against state-of-the-art PPG acquisition systems in Table 8.2. Compared to the
state of the art, CS-based PPG acquisition enables up to a 30x reduction in the power
consumption of the LED driver, thanks to the DBE, which accelerates LSP to enable
feature extraction directly from CS data and thus accurately estimates HR with a minimal
power penalty.
Off-chip LED driver. LED power consumption is subject to the SNR, skin tone of the subject and the efficiency of the LED used in the setup
To demonstrate the efficacy of the proposed technique, PPG signals (Fig. 8.12a)
are acquired using an internal PPG acquisition platform built from commercial
off-the-shelf (COTS) components, from a subject under normal office working
conditions. Simultaneously, accelerometer signals (Fig. 8.12c) are acquired using
the same platform. The PPG and accelerometer signals are then randomly subsampled
at a CR of 10x, and LSP is performed on both subsampled signals in
MATLAB. Spectral subtraction is finally performed on the normalized LSPs of the PPG
and accelerometer signals, and the result is rescaled. The rescaling process uses a scale factor
that renormalizes the PSD of the spectrally subtracted PPG signal, thereby restoring
the amplitude of the peak in the PSD of the PPG signal. Figure 8.12e shows the
spectrally subtracted PSD of the PPG signal. As can be seen, the spurious peak
in the frequency range [2.7–3.2 Hz] that is correlated with the motion is significantly
suppressed by spectral subtraction.
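The normalize, subtract, and rescale sequence described above can be sketched as follows. The clamping of negative differences to zero and the choice of rescaling to a unit peak are assumptions of this example, and the two spectra are synthetic Gaussian bumps invented for illustration, not measured data.

```python
import math

def normalize(psd):
    """Scale a PSD so that its largest bin equals 1."""
    peak = max(psd)
    return [p / peak for p in psd]

def spectral_subtract(psd_ppg, psd_acc):
    """Subtract the normalized accelerometer PSD from the normalized PPG PSD."""
    diff = [max(p - a, 0.0)  # negative bins clamped to zero (assumption)
            for p, a in zip(normalize(psd_ppg), normalize(psd_acc))]
    peak = max(diff)
    return [d / peak for d in diff] if peak > 0 else diff  # rescale to unit peak

def bump(f, f0, amp, width=0.15):
    """Synthetic spectral peak centred at f0 (illustrative shape only)."""
    return amp * math.exp(-((f - f0) / width) ** 2)

freqs = [0.5 + 0.1 * k for k in range(31)]  # 0.5 to 3.5 Hz
# Hypothetical spectra: HR peak at 1.2 Hz, stronger motion peak at 3.0 Hz.
psd_ppg = [bump(f, 1.2, 0.8) + bump(f, 3.0, 1.0) for f in freqs]
psd_acc = [bump(f, 3.0, 1.0) for f in freqs]

clean = spectral_subtract(psd_ppg, psd_acc)
f_before = freqs[psd_ppg.index(max(psd_ppg))]
f_after = freqs[clean.index(max(clean))]
```

In this toy case the dominant peak moves from the motion frequency to the HR frequency after subtraction, which is the effect the technique aims for when the motion component is common to both sensors.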
It must however be noted that while simulation results on a limited data set show
promising results, extensive characterization of the technique is required under a
variety of use case scenarios to arrive at a concrete conclusion regarding its efficacy
under different motion artifact scenarios. Moreover, this technique is only a post-
processing step and does not mitigate the requirement of a high channel DR. While
an adaptive filter-based approach, presented in [25] for ECG, is promising to relax
the DR requirements, it is challenging to design adaptive filters that work with
randomly subsampled data.
Cuffless blood pressure (BP) monitoring using a combination of ECG and PPG has
been demonstrated in [23, 26]. The determination of BP is based on the relative
timing between peaks in the ECG and PPG signals. Figure 8.13 shows the relevant
timing information required for the BP estimation. Of interest is the pulse arrival
time (PAT), which is the temporal difference between the peak in the ECG and
Fig. 8.12 (a) PPG signal acquired from a subject under normal office working conditions. (b) PSD
of the PPG signal estimated using LSP after 10x random subsampling. (c) Accelerometer signal
acquired simultaneously with the PPG signal. (d) PSD of the accelerometer signal estimated using
LSP after 10x random subsampling. (e) PSD of the PPG signal after spectral subtraction
the subsequent peak in the PPG signal. Once PAT is determined, BP is estimated
using (8.5).
$$\begin{aligned}
\mathrm{SBP} &= a_1\,\mathrm{PAT} + b_1\,\mathrm{HR} + c_1 \\
\mathrm{DBP} &= a_2\,\mathrm{PAT} + b_2\,\mathrm{HR} + c_2
\end{aligned} \tag{8.5}$$

where SBP and DBP are the systolic and diastolic blood pressure, respectively, while
$a_i$, $b_i$, and $c_i$, for $i = 1, 2$, are the calibration coefficients obtained through linear
regression.
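A minimal sketch of how (8.5) could be evaluated is given below. The peak times and calibration coefficients are hypothetical placeholders with no physiological validity. In practice the coefficients would come from a per-subject regression against a reference BP measurement.

```python
def pulse_arrival_times(ecg_peaks, ppg_peaks):
    """PAT per beat: time from each ECG peak to the next PPG peak (s)."""
    pats = []
    for r in ecg_peaks:
        later = [p for p in ppg_peaks if p > r]
        if later:
            pats.append(min(later) - r)
    return pats

def blood_pressure(pat, hr, coeffs):
    """Evaluate one line of (8.5): a * PAT + b * HR + c."""
    a, b, c = coeffs
    return a * pat + b * hr + c

# Hypothetical peak times (s) and calibration coefficients, for illustration only.
ecg_peaks = [0.10, 0.93]
ppg_peaks = [0.35, 1.18]
SBP_COEFFS = (-100.0, 0.2, 130.0)
DBP_COEFFS = (-60.0, 0.1, 88.0)

pats = pulse_arrival_times(ecg_peaks, ppg_peaks)
pat_avg = sum(pats) / len(pats)
hr = 72.0  # bpm, assumed to be known from the HR estimate
sbp = blood_pressure(pat_avg, hr, SBP_COEFFS)
dbp = blood_pressure(pat_avg, hr, DBP_COEFFS)
```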
While the implementations in [23, 26] report achieving sufficient accuracy in
determining BP for wearable applications, their power consumption is dominated by
the PPG system, owing to the uniform stimulation and sampling. [12] demonstrated
the use of CS-based PPG for cuffless BP estimation. However, [12] employs a full
signal reconstruction process to perform BP determination from the reconstructed
PPG signal, with the assumption of the availability of a powerful base station. As
discussed in Sect. 8.3.2, the overhead in the reconstruction process can potentially
cancel all power savings obtained from CS acquisition of PPG.
Alternatively, an event-driven approach that relies on assistance from the ECG
to acquire the PPG can be explored. Since the peak in the PPG signal is an
aftereffect of the heart pumping blood through the vessels, one can
use the occurrence of the QRS complex to trigger the capture of the PPG signal.
The acquisition can be stopped once a sufficient number of samples have been acquired
around the peak of the PPG signal. This approach is shown in Fig. 8.14. The
presence of QRS complexes in the ECG can easily be detected using the activity
detection process outlined in [27]. While a wide range of stopping criteria can be
used for the PPG sampling, ranging from simple thresholding to more complex
learning-based approaches, in this work a sum of slopes followed by thresholding
is employed. Figure 8.15 shows a 10 s simultaneous ECG and PPG recording
obtained through the COTS platform. The PPG signal is then adaptively resampled
in ECG-assisted sampling mode, with the ECG signal acting as the trigger for PPG
acquisition. For the recordings shown in Fig. 8.15, only 446 samples of the PPG signal
are acquired in the ECG-assisted acquisition mode, as against 1280 in the uniform
sampling mode, leading to an average stimulation and sampling frequency reduction
Fig. 8.15 A 10 s simultaneous ECG and PPG recording obtained through the COTS platform
(average values are equalized for better representation). Both signals are sampled at 128 Hz
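The ECG-assisted acquisition idea can be sketched on a synthetic signal. The example assumes a cosine-shaped PPG with a fixed delay after each QRS trigger, and one possible reading of the sum-of-slopes criterion: stop once the sum of the last K first differences falls below a negative threshold, i.e., shortly after the peak. The heart rate, delay, K, and threshold are hypothetical. Only the 128 Hz rate and the 10 s duration are taken from Fig. 8.15.

```python
import math

FS = 128.0           # Hz, sampling rate stated for Fig. 8.15
DURATION = 10.0      # s
HR_HZ = 1.2          # hypothetical heart rate (72 bpm)
PAT = 0.25           # hypothetical delay from QRS to PPG peak (s)
N_BEATS = 12         # beats within the 10 s window at 1.2 Hz
K, THRESH = 4, 0.05  # hypothetical slope window and threshold

n_total = int(FS * DURATION)
ppg = [math.cos(2.0 * math.pi * HR_HZ * (i / FS - PAT)) for i in range(n_total)]
qrs_idx = [round(k / HR_HZ * FS) for k in range(N_BEATS)]  # trigger instants

def ecg_assisted_indices(signal, triggers, k, thresh):
    """Sample from each trigger until the sum of the last k slopes drops below -thresh."""
    acquired = []
    for start in triggers:
        i = start
        while i < len(signal):
            acquired.append(i)
            # The sum of the last k first differences telescopes to signal[i] - signal[i - k].
            if i - start >= k and signal[i] - signal[i - k] < -thresh:
                break
            i += 1
    return acquired

acquired = ecg_assisted_indices(ppg, qrs_idx, K, THRESH)
reduction = n_total / len(acquired)  # average sampling-rate reduction factor
```

For the measured recording quoted in the text, the same ratio would be 1280/446, or roughly 2.9.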
8.6 Conclusions
of power from a 1.2 V supply, with the digital back-end (DBE) consuming only
7.2 µW, thus avoiding the energy penalties of wireless/wireline transmission
and/or embedded signal reconstruction. In addition, a digital-domain motion artifact
suppression technique leveraging multisensor fusion is presented. The proposed
technique relies on the spectral subtraction of the power spectral densities (PSDs)
estimated for PPG signals and accelerometer signals. The efficacy of the technique
is demonstrated through simulations. Finally, an electrocardiogram (ECG)-assisted
PPG acquisition system is described for cuffless blood pressure (BP) monitoring.
The proposed approach retains the relevant relative timing information between
the ECG and the PPG signals, facilitating accurate BP estimation while reducing the
average stimulation and sampling rate by a factor of 1.8 across ten records obtained
using a COTS platform.
References
1. Wijsman, J., Grundlehner, B., Liu, H., Hermens, H., Penders, J.: Towards mental stress
detection using wearable physiological sensors. In: 2011 Annual International Conference of
the IEEE Engineering in Medicine and Biology Society, pp. 1798–1801, Aug 2011
2. Allen, J.: Photoplethysmography and its application in clinical physiological measurement.
Physiol. Meas. 28(3), R1 (2007)
3. Webster, J.G.: Design of Pulse Oximeters. Taylor & Francis Group, New York (1997)
4. Rhee, S., Yang, B.-H., Asada, H.: Artifact-resistant power-efficient design of finger-ring
plethysmographic sensors. IEEE Trans. Biomed. Eng. 48(7), 795–805 (2001)
5. Alhawari, M., Albelooshi, N., Perrott, M.H.: A 0.5 V <4 µW CMOS photoplethysmographic
heart-rate sensor IC based on a non-uniform quantizer. In: 2013 IEEE International Solid-State
Circuits Conference Digest of Technical Papers, pp. 384–385 (2013)
6. Candès, E.J., Wakin, M.B.: An introduction to compressive sampling. IEEE Signal Process.
Mag. 25(2), 21–30 (2008)
7. Pamula, V.R., Verhelst, M., Van Hoof, C., Yazicioglu, R.F.: Computationally-efficient
compressive sampling for low-power pulse oximeter system. In: 2014 IEEE Biomedical Circuits
and Systems Conference (BioCAS) Proceedings, pp. 69–72 (2014)
8. Dixon, A.M., Allstot, E.G., Gangopadhyay, D., Allstot, D.J.: Compressed sensing system
considerations for ECG and EMG wireless biosensors. IEEE Trans. Biomed. Circuits Syst.
6(2), 156–166 (2012)
9. Ren, F., Markovic, D.: A configurable 12–237 kS/s 12.8 mW sparse-approximation engine for
mobile data aggregation of compressively sampled physiological signals. IEEE J. Solid-State
Circuits 51(1), 68–78 (2016)
10. Maechler, P., Studer, C., Bellasi, D.E., Maleki, A., Burg, A., Felber, N., Kaeslin, H., Baraniuk,
R.G.: VLSI design of approximate message passing for signal restoration and compressive
sensing. IEEE J. Emerging Sel. Top. Circuits Syst. 2(3), 579–590 (2012)
11. Maechler, P., Greisen, P., Sporrer, B., Steiner, S., Felber, N., Burg, A.: Implementation of
greedy algorithms for LTE sparse channel estimation. In: 2010 Conference Record of the Forty
Fourth Asilomar Conference on Signals, Systems and Computers, Nov 2010
12. Baheti, P.K., Garudadri, H.: An ultra low power pulse oximeter sensor based on compressed
sensing. In: 2009 Sixth International Workshop on Wearable and Implantable Body Sensor
Networks, Jun 2009
13. Csavoy, A., Molnar, G., Denison, T.: Creating support circuits for the nervous system:
Considerations for brain-machine interfacing. In: 2009 Symposium on VLSI Circuits, Jun 2009
14. Pamula, V.R., Verhelst, M., Van Hoof, C., Yazicioglu, R.F.: A novel feature extraction
algorithm for on the sensor node processing of compressive sampled photoplethysmography
signals. In: 2015 IEEE SENSORS, pp. 1–4. IEEE (2015)
15. Yoo, J., Turnes, C., Nakamura, E.B., Le, C.K., Becker, S., Sovero, E.A., Wakin, M.B., Grant,
M.C., Romberg, J., Emami-Neyestanak, A., Candès, E.: A compressed sensing parameter
extraction platform for radar pulse signal acquisition. IEEE J. Emerging Sel. Top.
Circuits Syst. 2(3), 626–638 (2012)
16. Rajesh, P.V., Valero-Sarmiento, J.M., Yan, L., Bozkurt, A., Van Hoof, C., Van Helleputte,
N., Yazicioglu, R.F., Verhelst, M.: A 172 µW compressive sampling photoplethysmographic
readout with embedded direct heart-rate and variability extraction from compressively sampled
data. In: 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 386–387.
IEEE, Piscataway (2016)
17. Glaros, K.N., Drakakis, E.M.: A sub-mW fully-integrated pulse oximeter front-end. IEEE
Trans. Biomed. Circuits Syst. 7(3), 363–375 (2013)
18. Pamula, V.R., Valero-Sarmiento, J.M., Yan, L., Bozkurt, A., Van Hoof, C., Van Helleputte,
N., Yazicioglu, R.F., Verhelst, M.: A 172 µW compressively sampled photoplethysmographic
(PPG) readout ASIC with heart rate estimation directly from compressively sampled data.
IEEE Trans. Biomed. Circuits Syst. 11(3), 487–496 (2017). Available online at IEEE Xplore
19. Goldberger, A.L., Amaral, L.A., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., Mietus,
J.E., Moody, G.B., Peng, C.-K., Stanley, H.E.: PhysioBank, PhysioToolkit, and PhysioNet:
components of a new research resource for complex physiologic signals. Circulation 101(23),
e215–e220 (2000)
20. ANSI/AAMI-EC13: American National Standards for cardiac monitors, heart rate meters and
alarms (2002)
21. Tavakoli, M., Turicchia, L., Sarpeshkar, R.: An ultra-low-power pulse oximeter implemented
with an energy-efficient transimpedance amplifier. IEEE Trans. Biomed. Circuits Syst. 4(1),
27–38 (2010)
22. Wong, A.K., Pun, K.-P., Zhang, Y.-T., Leung, K.N.: A low-power CMOS front-end for
photoplethysmographic signal acquisition with robust DC photocurrent rejection. IEEE Trans.
Biomed. Circuits Syst. 2(4), 280–288 (2008)
23. Winokur, E.S., O'Dwyer, T., Sodini, C.G.: A low-power, dual-wavelength
photoplethysmogram (PPG) SoC with static and time-varying interferer removal. IEEE Trans. Biomed. Circuits
Syst. 9(4), 581–589 (2015)
24. Boll, S.: Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans.
Acoust. Speech Signal Process. 27(2), 113–120 (1979)
25. Helleputte, N.V., Kim, S., Kim, H., Kim, J.P., Hoof, C.V., Yazicioglu, R.F.: A 160 µA
biopotential acquisition IC with fully integrated IA and motion artifact suppression. IEEE
Trans. Biomed. Circuits Syst. 6(6), 552–561 (2012)
26. Poon, C., Zhang, Y.: Cuff-less and noninvasive measurements of arterial blood pressure
by pulse transit time. In: 2005 IEEE Engineering in Medicine and Biology 27th Annual
Conference (2005)
27. Pamula, V.R., Verhelst, M., Hoof, C.V., Yazicioglu, R.F.: A 17 nA, 47.2 dB dynamic range,
adaptive sampling controller for online data rate reduction in low power ECG systems. In:
2016 IEEE Biomedical Circuits and Systems Conference (BioCAS), pp. 272–275, Oct 2016
Chapter 9
A 32 kHz DTCXO RTC Module with an Overall
Accuracy of 1 ppm and an All-Digital 0.1 ppm
Compensation-Resolution Scheme
9.1 Introduction
Timekeeping based on a 32 kHz XTAL remains the most popular, cost-effective,
low-power, accurate solution for low-power portable applications. The simplest
solutions, with overall accuracies of a few hundred ppm, combine
a through-hole or SMD XTAL with an oscillator implemented as
part of the application SoC (microcontroller, cell phone). As a single ppm of error
represents a deviation of about 30 s/year (nearly an hour/year at 100 ppm!),
temperature-compensated XTAL or MEMS oscillators (TCXO, TCMO) used for timekeeping
applications have received significant research attention over the last decade, driven
by the need for further miniaturization, tighter accuracy, and lower power consumption
[14]. Combining the resonator and the oscillator intimately or, even better, in a single
package leads to superior stability, improved robustness, and lower consumption
by minimizing environmental effects (moisture, temperature gradients) and stray
capacitance. Real-time clock (RTC) modules that additionally integrate time, timer,
calendar, timestamping, and alarm functions become a key power management block
capable of scheduling precise wake-ups at user-defined or predefined intervals so that
a more complex, energy-constrained application can be heavily duty cycled and
left mostly hibernating (e.g. a wireless sensor node). They are found in a variety
of consumer, metering, medical, wearable, automotive, communication, outdoor,
safety, and automation applications and are a key component of the upcoming IoT
revolution.
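As a quick check of the drift figures quoted above, taking one year as 365.25 days:

```latex
\begin{aligned}
1\ \text{ppm}:&\quad 10^{-6} \times 365.25 \times 86\,400\ \text{s} \approx 31.6\ \text{s/year} \\
100\ \text{ppm}:&\quad 100 \times 31.6\ \text{s} \approx 3156\ \text{s} \approx 52.6\ \text{min/year}
\end{aligned}
```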
168 D. Ruffieux et al.
Fig. 9.3 Time domain illustration of the digital temperature compensation scheme known as
inhibition
whose coefficients have been previously calibrated. The floor value of the integer
part determines, according to its sign, how many clock periods should be swallowed
(>0) or injected (<0). The fractional part, $dN_{FRAC}$, should be accumulated as in a
first-order ΔΣ modulator. Upon overflow, an additional cycle is inhibited so as to
guarantee that the maximum error is bounded to a single 32 kHz clock period. The
coarse adjustment yields a resolution limited to 1/32 kHz, or 31 µs, which corresponds
to 31 ppm over a 1 s averaging time. To reach a measurement resolution
of 0.1 ppm, the associated integration time is 5 min, making it almost
impossible for XO manufacturers or their customers to verify the timing accuracy of
individual modules.
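The inhibition mechanism can be modelled in a few lines. The sketch below assumes that the correction `d_n` is expressed in 32 kHz cycles per second and that it stays constant over the run. The function name and the example value are hypothetical. It also reproduces the 5 min figure from the one-cycle resolution.

```python
import math

F_NOM = 32768  # nominal XTAL frequency (Hz)

def inhibition_counts(d_n, seconds):
    """XTAL cycles counted per PPS period when the XTAL runs d_n cycles/s too fast."""
    n_int = math.floor(d_n)  # integer part: cycles swallowed (>0) or injected (<0)
    frac = d_n - n_int       # fractional part, fed to a first-order accumulator
    acc = 0.0
    counts = []
    for _ in range(seconds):
        acc += frac
        extra = 0
        if acc >= 1.0:       # overflow: inhibit one additional cycle
            acc -= 1.0
            extra = 1
        counts.append(F_NOM + n_int + extra)
    return counts

d_n = 2.25                   # hypothetical correction (cycles per second)
f_actual = F_NOM + d_n       # actual XTAL frequency implied by d_n
counts = inhibition_counts(d_n, 300)

elapsed, worst = 0, 0.0
for k, c in enumerate(counts, start=1):
    elapsed += c
    err_cycles = elapsed - k * f_actual  # cumulative PPS timing error in XTAL cycles
    worst = max(worst, abs(err_cycles))

resolution_s = 1.0 / F_NOM               # one 32 kHz period, about 31 us
t_verify = resolution_s / 0.1e-6         # averaging time needed to resolve 0.1 ppm
```

The cumulative error never reaches one full cycle, and the time needed to resolve 0.1 ppm with a one-cycle resolution comes out at about 305 s, consistent with the 5 min quoted in the text.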
Fig. 9.4 Time domain illustration of the digital temperature compensation scheme using fractional
interpolation
(7 MHz RCO) over a gate time defined by N periods of the signal to be measured
(XO). The frequency ratio ($f_{RCO}/f_{XO} = M/N$) needed for interpolation is obtained
by a mere right shift if N is taken as a power of 2. N = 32 is chosen (1 ms
conversion time) so as to duty cycle the RCO at 1 Hz with a 0.1% ratio. In addition
to the conventional inhibition mechanism presented earlier, the interpolating edge
is delayed by a number of RCO pulses corresponding to the accumulated fraction
multiplied by the M/N ratio. The resulting error is now bounded to two clock
periods of the RCO, as the phases of the two clocks are uncorrelated. The concept
is illustrated in Fig. 9.4 with N = 4. Calculating the compensation requires an
extra four XTAL cycles, during which the RCO is kept running. The power overhead
is hence 15% compared to that used for the temperature measurement. An
additional benefit of this direct temperature-to-frequency conversion scheme is the
elimination of an ADC, resulting in low hardware complexity and area.
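The shift-based ratio and the edge delay can be sketched in integer arithmetic. The 8-bit fixed-point width of the accumulated fraction and the function names are assumptions of this example. The 7 MHz RCO, the 32 kHz XO, and N = 32 are taken from the text.

```python
F_XO = 32768         # XO frequency (Hz)
F_RCO = 7_000_000    # RCO frequency (Hz)
N_SHIFT = 5          # gate time of N = 2**5 = 32 XO periods
FRAC_BITS = 8        # hypothetical width of the accumulated fraction

def rco_per_xo(m_count):
    """Integer estimate of f_RCO / f_XO = M / N, obtained with a right shift."""
    return m_count >> N_SHIFT

def edge_delay_pulses(frac_acc, m_count):
    """RCO pulses by which the PPS edge is delayed.

    frac_acc is the accumulated fraction of one 32 kHz cycle, stored as an
    unsigned fixed-point number with FRAC_BITS fractional bits.
    """
    return (frac_acc * m_count) >> (N_SHIFT + FRAC_BITS)

# M: RCO periods counted during a gate of 32 XO periods (about 1 ms).
m_count = (F_RCO * (1 << N_SHIFT)) // F_XO
ratio = rco_per_xo(m_count)

frac_acc = 128       # 0.5 of a 32 kHz cycle in this fixed-point format
delay = edge_delay_pulses(frac_acc, m_count)
ideal_s = 0.5 / F_XO
error_s = abs(delay / F_RCO - ideal_s)
```

Keeping the full count M in the product and shifting once at the end preserves more precision than multiplying by the already truncated ratio.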
Figure 9.5 shows an illustration of how the DTCXO should be calibrated. Here,
a three-temperature trim is illustrated, at each point of which both the XO and RCO frequencies
should be measured. The temperatures at which the calibration is performed need
not be precisely known but should span the compensation range. Any thermal
gradient between the XO and RCO should be avoided for accurate compensation.
Knowing $f_{XO}(T)$ and $f_{RCO}(T)$ and substituting for T yields $f_{XO}(f_{RCO})$. With a
three-point trim, and provided the temperatures are spaced evenly, a second-order fit will
lead to an exact compensation at the three calibration points and to the elimination
Fig. 9.5 Illustration of the calibration procedure and resulting compensated curve after a three-
temperature trim
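The three-point trim can be illustrated with a parabola through three measured frequency pairs. The temperature models of the XO and RCO below, including all coefficients, are hypothetical and serve only to generate calibration data. Note that the fit itself uses only the measured frequencies, never the temperatures.

```python
def quadratic_through(points):
    """Return f(x) for the unique parabola through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def f(x):
        l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
        l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
        l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        return y0 * l0 + y1 * l1 + y2 * l2

    return f

# Hypothetical device models, used only to generate calibration data.
T0 = 25.0
F0_XO, C_XO = 32768.0, 0.035e-6            # parabolic XTAL characteristic
F0_RC, A_RC, B_RC = 7.0e6, 1.0e-3, 1.0e-6  # mostly linear RCO characteristic

def f_xo(temp):
    return F0_XO * (1.0 - C_XO * (temp - T0) ** 2)

def f_rco(temp):
    d = temp - T0
    return F0_RC * (1.0 + A_RC * d + B_RC * d * d)

CAL_TEMPS = (-25.0, 25.0, 75.0)            # evenly spaced trim temperatures
fit = quadratic_through([(f_rco(t), f_xo(t)) for t in CAL_TEMPS])

def residual_ppm(temp):
    """Error of the fitted f_XO(f_RCO) relative to the true XO frequency, in ppm."""
    return (fit(f_rco(temp)) - f_xo(temp)) / f_xo(temp) * 1e6

uncompensated_ppm = (f_xo(75.0) - F0_XO) / F0_XO * 1e6
```

Between the calibration points the residual depends on how far the real characteristics depart from the second-order model, which this sketch does not attempt to quantify.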
The system architecture of the DTCXO RTC circuit, depicted in Fig. 9.6,
incorporates, in addition to the XO and RCO blocks mentioned in the previous
section, a fully integrated LDO so as to increase the power supply rejection and
permit the circuit to operate over a wide 1.2–5.5 V range. A built-in POR and
brownout detection (BOD) circuit, together with an on-chip NVM used to store the
calibration parameters, completes the analog functions of the ASIC. In addition to
the temperature compensation and PPS generation finite state machines, the circuit
features all conventional RTC functions such as clock, calendar, timer, and alarms,
as well as a timestamping unit triggered by an external event. All functions may
generate interrupts that can be handled through an I²C interface. Many different
clocks are furthermore available on an output pad that can be gated externally so as
The XO, whose simplified block diagram is shown in Fig. 9.8, is based on a
standard Pierce topology ($M_1$, $C_1$, $C_2$) but features a non-linear amplitude limitation
mechanism ($M_2$–$M_4$, $R_1$) so that it can operate reliably over a very large range of
XTAL ESR (50 kΩ < $R_m$ < 1 MΩ), both in vacuum and at 1 bar during the frequency
trimming step that is performed before the module is sealed under vacuum. This is
illustrated in the top right part of Fig. 9.8, which shows the XO $V_{OSC}(I_{MEAN})$ behaviour
both in vacuum and at 1 bar, as well as the AGC regulator characteristic. At power-up
($V_{OSC} = 0$), a large current determined by $R_1$ and the $M_1$–$M_4$ mirror ratios ensures
a rapid and robust start-up whatever $R_m$. Steady state is reached once the AGC and
XO curves intersect at $I_{MEAN} = 2I_{CRIT}$, minimizing the power consumption while
guaranteeing a sufficient amplitude to drive the comparator generating a rail-to-rail
clock signal. The critical current, $I_{CRIT}$, leading to oscillation build-up depends
non-linearly on the loading capacitance, $C_L$, as shown in the equation in Fig. 9.8, which is valid
in the subthreshold regime. The capacitance should, however, be made big enough to
ensure that a sufficient negative resistance can be generated at start-up to overcome the
XTAL losses. This is governed by the circuit impedance, $Z_C$, defined as the impedance
seen from the quartz terminals while excluding the motional elements ($R_m$, $L_m$,
$C_m$). As a function of the transconductance, $g_m$, it describes a circle whose radius
depends on $C_L$ (given by $C_1$//$C_2$ and any foot stray capacitance) and $C_O$, the
XTAL dielectric capacitance (including any parasitic capacitance in parallel with the
XTAL), which should, respectively, be maximized and minimized to obtain the biggest
circle radius. This is illustrated at the bottom right of Fig. 9.8. With too small a
$C_L$ and too large a $C_O$, the circuit would ultimately fail to start in air (or even in
vacuum) whatever the current flowing through the gain transistor. For a small $C_L$,
the frequency pulling (proportional to the distance from the $Z_C$, $R_m$ intersect point
to the x-axis) becomes $R_m$ dependent, leading to a degraded frequency stability.
The choice of $C_L$ hence dictates a very careful optimization and modelling of the
package parasitics. It should also be noted that the sensitivity of the frequency to a
variation of the loading capacitance, as determined by the equation shown in Fig. 9.1,
increases for small values of $C_L$.
The differential comparator generating the rail-to-rail signal is connected
between the gate of the main gain transistor and its DC-filtered version used to
set the AGC time constant ($R_2$, $C_3$). Avoiding the combined use of the differential
drain signal, which sounds attractive at first glance, ensures superior PSRR, a clock
duty cycle close to 50%, and more freedom to bias the gain transistor smartly using a
low-loss non-linear MOS conductance instead of an area-consuming resistor. Most
32 kHz XOs necessitate large DC-blocking series capacitors to prevent the leakage
current of ESD clamps or any moisture-induced board conductance from altering
the DC operating point of the XO and eventually stopping it. The most sensitive pin
is the one connected to the gate of the gain transistor, which can only be fed with
some current through a high-impedance path ($R_B$) for proper biasing. In this design,
such series capacitors could be avoided owing to an assisting circuit that turns
on a lower-impedance path across the XTAL should the gate voltage of the gain
transistor drop below its set point. The XO features additional non-linear current
comparators that assist its initial biasing and detect when sufficient amplitude has
been reached to turn on the comparator, which could otherwise prevent proper XO start-up due
to its Miller capacitance.
174 D. Ruffieux et al.
[Fig. 9.9: RCO topology, annotated with I = V/RTH, I = fCV, and f = 1/(RTH·C)]
The RCO topology, pictured in Fig. 9.9, is such that similar currents and
voltages bias both a ring oscillator of total gate capacitance C and a thermistor
RTH. With I = fCV and V = RTH·I, one gets f = 1/(RTH·C). An HR poly resistor
is used as the thermistor, yielding an RCO with a large, mostly linear, positive
TCF. A 1 MHz seven-stage ring oscillator formed with large-area inverters is
used, and the signals from the different stages are combined to generate a 7×
frequency, reducing the quantization noise, which scales inversely with
the integration time, τ, by a similar factor. The temperature sensor resolution is
equal to 1/αRC, the inverse of the RCO first-order TCF, multiplied by the RCO Allan
deviation, σY(τ), which is a statistical measurement of Δf/f over a time τ. The latter
is linked to the accumulated jitter, σACC(τ) ∝ τ·σY(τ), which is itself related to the
phase noise of the RCO. Both low thermal noise and a low 1/f noise corner frequency
are desired so that the accumulated jitter remains white-noise dominated for longer
time intervals, improving the resolution of the sensor as τ^−0.5. It should, however,
be mentioned that while the proposed circuit offers a good PSRR, the thermal
noise contribution of M5–M6 largely exceeds that of the thermistor, especially in
low-voltage designs where limited overdrive voltage can be applied to the gates of
the current source transistors. This degrades the sensitivity significantly. 1/f noise
minimization was achieved by increasing the transistor area and optimizing their
biasing and geometry using standard simulation tools.
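The relations in this paragraph can be illustrated with a minimal Python sketch. The component values (1 MΩ, 1 pF) and the Allan deviation of 4.6e-6 are hypothetical, chosen only to give round numbers; the 0.46%/°C TCF is the measured value reported later in the chapter, and the function names are illustrative.

```python
import math

def rco_frequency(r_th_ohm, c_farad):
    """Oscillation frequency f = 1/(R_TH * C), from I = f*C*V and V = R_TH * I."""
    return 1.0 / (r_th_ohm * c_farad)

def temperature_resolution(allan_deviation, tcf_per_degc):
    """Sensor resolution in degC: Allan deviation divided by the first-order TCF."""
    return allan_deviation / tcf_per_degc

def scale_thermal_limited(res_at_t0, t0_s, t_s):
    """White-noise-limited resolution improves as tau^-0.5."""
    return res_at_t0 * math.sqrt(t0_s / t_s)

def scale_quantization_limited(res_at_t0, t0_s, t_s):
    """Quantization-limited resolution improves as tau^-1."""
    return res_at_t0 * (t0_s / t_s)

# Hypothetical values: 1 Mohm thermistor, 1 pF total gate capacitance.
f_rco = rco_frequency(1e6, 1e-12)
# Hypothetical Allan deviation of 4.6e-6 with a TCF of 0.46 %/degC.
res_degc = temperature_resolution(4.6e-6, 0.0046)
```

With these assumed numbers the sketch gives a 1 MHz oscillator and a 1 mK resolution. The two scaling helpers show why a tenfold shorter integration time costs a factor of about 3.2 when thermal noise dominates but a factor of 10 when quantization noise dominates.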
The importance of the temperature sensor linearity was evidenced in a previous
section as it introduces higher-order polynomial compensation terms. This is a
particularly sensitive design point as models of the different components, such as
for the resistors, might lack sufficient accuracy to obtain predictable performances
(higher-order TCR terms and their spreads). One should mention that the core
ring oscillator contribution to the overall temperature sensor behaviour is far from
negligible.
9 A 32 kHz DTCXO RTC Module with an Overall Accuracy of 1 ppm. . . 175
Fig. 9.10 Circuit micrograph with detailed floor plan and picture of the DTCXO module after
assembly in the ceramic package
The circuit die photo highlighting the floor plan is shown in Fig. 9.10 together
with a package assembly view. It is integrated in a 0.35 µm CMOS process and
measures 1.8 × 0.72 mm². It is flip-chip bonded using Au studs on the bottom layer
of a miniature 3.2 × 1.5 mm² ceramic package containing eight IOs in addition
to the two internal XTAL leads. The latter is assembled overhanging attached on
its side on the second level of the package. Eventually, after laser trimming the
XTAL, a metal lid is reflow soldered atop the third level to seal the package at low
pressure (<0.1 mbar) to maximize the resonator Q factor. Parts are then individually
calibrated and verified over the industrial temperature range in large batches before
the compensation parameters are stored in the on-chip NVM.
Figure 9.11 shows measurement data of the LDO output voltage and PTAT current
over the complete supply range from 1.5 to 5.5 V obtained over 50 samples. A
DC PSRR of 38 dB can be extracted. The PTAT current varies by 7.5% over the
Fig. 9.11 PTAT current and LDO voltage spread at several supply voltages (axes: PTAT current [nA] versus regulator voltage [V]; supply voltages 1.5 V, 3 V, and 5.5 V)
[Fig. 9.12: df/f [ppm] versus temperature [°C], and histograms (number of DUT) of the relative deviation from 32,768 Hz [ppm], the turnover temperature [°C], and the second-order TCF [ppb/°C²]]
Fig. 9.13 Allan deviation of different clock sources derived from the XO (axes: Allan deviation versus time [s]; curves: 32 kHz XO, PPS with inhibition, PPS interpolated)
previously. The ToT (or similarly the first-order TCF) of the XO is reduced to a mean
value of 18 °C compared to 23 °C for the XTAL, most likely due to the effect of
the temperature dependency of the XO circuit capacitance. The ToT variance is 2 °C,
mostly affected by variations in the etching of the tines. The parabolic coefficient reaches
−37 ppb/°C², with a standard deviation over mean value of 3.8%.
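To give a feel for these numbers, the uncompensated tuning-fork parabola can be sketched in Python. This is a minimal illustration that assumes a purely second-order model around the turnover temperature, with the coefficient taken as negative; the function name and defaults are illustrative.

```python
def xtal_deviation_ppm(temp_c, turnover_c=18.0, beta_ppb_per_c2=-37.0):
    """Frequency deviation in ppm for a purely parabolic model:
    df/f = beta * (T - T0)^2, with beta given in ppb/degC^2."""
    return beta_ppb_per_c2 * (temp_c - turnover_c) ** 2 / 1000.0

# Deviation at the edges of the industrial range, before compensation.
dev_cold = xtal_deviation_ppm(-40.0)
dev_hot = xtal_deviation_ppm(85.0)
```

Under this simple model the uncompensated XO is off by roughly −124 ppm at −40 °C and −166 ppm at 85 °C, which is the error the digital compensation has to remove.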
Figure 9.13 shows the Allan deviation measurement for the 32 kHz XO and for the
generated 1PPS signal, both with the usual integer inhibition and with the newly
proposed fractional inhibition using higher-frequency interpolation. The measurement
is performed at ambient temperature with the compensation disabled. The intrinsic 1 s
short-term stability of the XO, reaching 1 ppb, is degraded by a factor of 30 at the PPS
due to interpolation quantization at fRCO. Regarding the newly proposed scheme, a 300×
improvement can be seen compared to classical inhibition, corresponding to the
frequency ratio. The Allan variance of the latter improves as τ^−1 over time, as
expected.
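The quantization steps behind the two inhibition schemes can be estimated with a short Python sketch. It assumes that one correction step equals one clock period of the relevant clock per observation window; the interpolation frequency is passed as a parameter because the effective value depends on the RCO, and the 7 MHz used below is the nominal mean RCO frequency quoted in the chapter.

```python
F_XO_HZ = 32768.0  # nominal XO frequency

def integer_inhibition_step_ppm(window_s=1.0):
    """Smallest correction when whole 32.768 kHz periods are added or removed."""
    return 1e6 / (F_XO_HZ * window_s)

def interpolated_step_ppm(f_interp_hz, window_s=1.0):
    """Smallest correction when edges of a faster clock interpolate the timing."""
    return 1e6 / (f_interp_hz * window_s)

step_integer = integer_inhibition_step_ppm()        # about 30.5 ppm over 1 s
step_fractional = interpolated_step_ppm(7e6)        # about 0.14 ppm over 1 s
```

The fractional step of about 0.14 ppm over 1 s is in line with the 0.1 ppm resolution at 1 s stated for the scheme, whereas integer inhibition alone would need a much longer window to reach the same granularity.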
Figure 9.14 shows RCO calibration data obtained over 1000 samples during
the calibration procedure. Temperature gradients among parts in the oven are
not compensated, which makes the reported figures pessimistic. The mean RCO
frequency is 7 MHz with a relative deviation of 3.6%. The first TCF is positive
and rather large at 0.46%/°C with a relative variation of 2.1%. The temperature
error without trim would span 25 °C at 3σ at ambient, but after a single trim,
it reduces to 5 °C. The second-order TCF is a factor of 1000 below the first one
but has a larger relative deviation of some 30%. A three-point trim improves the
temperature sensor inaccuracy down to a few degrees, but again, the fact that
measurements are obtained over production data in a large oven should be taken
into account.
[Fig. 9.14: RCO calibration data versus temperature [°C]: df/f [‰]; temperature error [°C] after a 1-point trim; temperature error [°C] after a 3-point trim]
Fig. 9.15 Allan deviation and sensitivity of the RCO temperature sensor (left axis: Allan deviation; right axis: sensitivity [mK]; x-axis: time [s]; the flicker noise limit is marked)
Figure 9.15 shows the Allan deviation of the RCO at ambient temperature, measured after
eliminating any airflow by placing the sample in a closed box.
After division by the RCO first TCF, the temperature sensor resolution is obtained.
It is shown on the right axis and reaches a floor of 1 mK limited by 1/f noise. The
thermal noise limit, improving as expected as τ^−0.5 over time, reaches the level of
the 1/f noise for an integration time of 10 ms. Quantization noise decaying as τ^−1 is
superimposed on the same plot. It also reaches the noise floor limit at 10 ms. Operating
the temperature sensor at a duty cycle of 1% would, however, represent as much
power as the overall RTC including the LDO. At the cost of a lower resolution, the
sensor is operated for only 1 ms, reaching a 20 mK quantization-noise-limited resolution
that should be compared to 5 mK when considering the thermal noise limit.
[Figs. 9.16 and 9.17: DTCXO stability versus temperature [°C]; histogram of the overall accuracy over −40 to 85 °C [ppm]]
Figure 9.16 shows the temperature stability of 1000 DTCXOs over −40 to
85 °C. Most parts reach 1 ppm stability over the complete industrial temperature
range, as evidenced by the histogram shown in Fig. 9.17 plotting the maximum error.
Owing to the LDO and avoidance of PMOS ESD clamps, there is no noticeable
influence of the supply voltage on the stability across the full 1.25–5.5 V range.
Figure 9.18 shows the RTC module current consumption breakdown. The XO
consumes 16 and 30 nA for the core and buffer, respectively (note that the buffer
has been oversized for production testing, where a 300 kHz clock is fed from one of
the XTAL pins). The consumption of the RCO, which is duty cycled with a 1/700 ratio,
reaches 7 and 28 nA for the core and decimation part, respectively. The digital part
consumes about 32% of the overall budget at 75 nA, while the LDO, including the
current and voltage references, accounts for a similar fraction.
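The breakdown can be cross-checked with a few lines of Python. This is only an arithmetic sketch: the 240 nA total is the module consumption quoted in the text, and the LDO share is inferred here as the remainder rather than taken from a stated figure.

```python
# Block currents in nA, as quoted in the text.
budget_na = {
    "xo_core": 16,
    "xo_buffer": 30,
    "rco_core": 7,
    "rco_decimation": 28,
    "digital": 75,
}
TOTAL_NA = 240  # overall module consumption quoted in the text

listed_na = sum(budget_na.values())
ldo_and_refs_na = TOTAL_NA - listed_na          # inferred remainder
digital_share = budget_na["digital"] / TOTAL_NA
ldo_share = ldo_and_refs_na / TOTAL_NA
```

The listed blocks add up to 156 nA, which leaves about 84 nA (35%) for the LDO and references, next to roughly 31% for the digital part. Both are consistent with the "similar fraction" described above.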
The presented solution offers the widest voltage supply range, the best temperature
stability, and, at 240 nA, a 4× power reduction compared to any other COTS
TCXO/RTC state-of-the-art products, as evidenced in Fig. 9.19.
9.6 Conclusions
This paper has presented a novel high-performance DTCXO module that achieves, at
240 nA, a 4× power dissipation reduction compared to other COTS products while being the
most accurate on the market with a typical inaccuracy of 1 ppm (3 ppm max). It
introduces a novel fractional inhibition all-digital temperature compensation scheme
achieving a resolution of 0.1 ppm at 1 s. To perform the required fine-grained
timing interpolation, it reuses edges from a heavily duty-cycled 7 MHz RCO that
operates mainly as a 20 mK resolution temperature sensor. Consequently, a very
compact circuit eliminating the need for an ADC is obtained. The DTCXO RTC
module is available commercially in large volumes.
References
1. Ruffieux, D., et al.: Silicon resonator based 3.2 µW real time clock with 10 ppm frequency
accuracy. J. Solid-State Circuits. 45(1), 224–234 (2010)
2. Asl, S.Z., et al.: A 3 ppm 1.5 × 0.8 mm² 1.0 µA 32.768 kHz MEMS-based oscillator. J. Solid-State
Circuits. 50(1), 1–12 (2015)
3. Ruffieux, D., et al.: A versatile timing microsystem based on wafer-level packaged XTAL/BAW
resonators with sub-µW RTC mode and programmable HF clocks. J. Solid-State Circuits. 49(1),
212–222 (2014)
4. Park, P., et al.: A thermistor-based temperature sensor for a real-time clock with 2 ppm
frequency stability. J. Solid-State Circuits. 50(7), 1571–1580 (2015)
Chapter 10
Energy-Efficient High-Resolution
Resistor-Based Temperature Sensors
10.1 Introduction
Integrated temperature sensors are often used for the temperature compensation
of frequency references [1–3]. This is a demanding application, as it requires
sensors that can achieve both high resolution and high energy efficiency. High
resolution is essential for minimizing jitter in the compensated output frequency,
while high energy efficiency is necessary to minimize the sensor's contribution
to the reference's total energy budget. Furthermore, the sensor should be CMOS
compatible, i.e., based on the generic devices or features of baseline CMOS
technologies, so that it can be co-integrated in a low-cost manner with the rest of the
electronics of the frequency reference.
The temperature dependencies of various CMOS-compatible devices, such as
BJTs [5–7], MOSFETs [8–9], thermistors (temperature-dependent resistors) [1–3,
13–17], and electrothermal filters [10–11], have been used to realize temperature
sensors. MEMS resonators have also been used to realize temperature sensors
with excellent resolution and energy efficiency [12]. However, the resonators are
fabricated in a dedicated process on a separate die and so are not CMOS compatible.
A survey of smart temperature sensors [4] shows that resistor-based temperature
sensors are the most energy-efficient class of CMOS temperature sensors. Their
resolution figure of merit (FoM), defined as energy/conversion × resolution² [4], is
about an order of magnitude less than that of conventional BJT-based sensors, and
they can also achieve much higher (sub-mK) resolution [1].
Several high-resolution resistor-based temperature sensors have been reported
[1–3, 15]. In [2, 3], temperature sensors based on Wien bridge RC filters are
Compared to BJTs, on-chip resistors can operate over a wider temperature and
voltage range. However, compared to the base-emitter voltages of BJTs, whose
spread is typically in the order of a few millivolts, and so much less than 1%
[6], on-chip resistors exhibit much larger spread and nonlinearity [2–3, 15]. These
two disadvantages, however, can be compensated by trimming and systematic
nonlinearity removal.
In CMOS processes, many resistors are available: metal resistors, diffusion
resistors, poly-silicon (poly) resistors, N-well resistors, and silicided resistors. Their
characteristics are summarized in Table 10.1.
The structure of a Wien bridge (WB) sensor [2] is shown in Fig. 10.1a. It is a second-
order band-pass RC filter, and its frequency domain voltage amplitude and phase
transfer functions can be written as
186 S. Pan and K.A.A. Makinwa
Fig. 10.1 (a) Wien bridge sensor, voltage readout scheme (b) Bode plots (c) Wien bridge sensor,
current readout scheme
$$H(j\omega) = \frac{j\omega RC}{1 - R^2C^2\omega^2 + 3j\omega RC} \tag{10.1}$$

$$\phi_{WB}(\omega) = \tan^{-1}\!\left(\frac{1 - R^2C^2\omega^2}{3RC\omega}\right) \tag{10.2}$$

Its Bode plot is shown in Fig. 10.1b, where the frequency is normalized to
f0 = 1/(2πRC).
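As a quick numerical check of Eqs. (10.1) and (10.2), the following Python sketch evaluates the transfer function at the center frequency. The values R = 32 kΩ and C = 10 pF are those given later in the chapter for the actual design; the function names are illustrative.

```python
import cmath
import math

def wien_bridge_h(f_hz, r_ohm, c_farad):
    """Complex transfer function of Eq. (10.1), with x = omega*R*C."""
    x = 2.0 * math.pi * f_hz * r_ohm * c_farad
    return 1j * x / (1.0 - x * x + 3j * x)

def wien_bridge_phase_deg(f_hz, r_ohm, c_farad):
    """Phase of Eq. (10.2) in degrees."""
    x = 2.0 * math.pi * f_hz * r_ohm * c_farad
    return math.degrees(math.atan((1.0 - x * x) / (3.0 * x)))

R_OHM, C_FARAD = 32e3, 10e-12
f0_hz = 1.0 / (2.0 * math.pi * R_OHM * C_FARAD)   # about 497 kHz
h_at_f0 = wien_bridge_h(f0_hz, R_OHM, C_FARAD)
```

At f0 the gain is 1/3 and the phase is zero, so a 500 kHz drive sits almost exactly at the center of the phase characteristic. Because x scales with R, a change in resistance with temperature moves the operating point along the phase curve.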
Given a fixed driving frequency fdrive , the phase output of the WB is determined
by its resistors and capacitors. If capacitors with a negligibly low TC are used,
such as metal-insulator-metal (MIM) capacitors, the bridge's phase shift is mainly
a function of the resistors' temperature dependency. This phase shift can be
determined either by measuring the voltage across the output resistor 2R(T) or the
current flowing through it, as shown in Fig. 10.1c. In this paper, the former method
is referred to as the voltage readout scheme and the latter as the current readout
scheme.
An ideal phase detection model based on synchronous demodulation, shown in
Fig. 10.2, can be used to estimate the achievable rms temperature resolution and the
resolution FoM of the WB sensor. To simplify the analysis, both the driving signal
and the demodulating signal are modeled as sine waves with the same frequency,
but with a phase difference: Vin = A·sin(2πf0·t) and Vdemod = sin(2πf0·t + φdemod),
where f0 = 1/(2πRC), φdemod = 90°, and A is the amplitude of the driving signal.
The signal of interest is the DC output after the low-pass filter.
Assuming that the resistors in the WB are the only noise contributors, the rms
temperature resolution of a differential WB sensor for a voltage readout scheme can
be expressed as
s
3 3kTR
Tv;WB D ; (10.3)
A 2 tconv
where α is the temperature coefficient of the resistors in the WB and tconv is the
conversion time.
In the current readout scheme, however, the noise contribution of the resistor
2R(T) is less attenuated than that in the voltage readout scheme. This results in a
lower temperature-sensing resolution, which can be expressed as
$$\Delta T_{i,WB} = \frac{3}{A\alpha}\sqrt{\frac{3kTR}{2t_{conv}}}. \tag{10.4}$$
As stated in Sect. 10.2, the silicided poly resistor is the chosen resistor type. In
the actual design, the WB sensor is implemented with R = 32 kΩ, C = 10 pF,
fdrive = 500 kHz, and A = 0.9 V. A current readout scheme is chosen for its
low swing and thus relaxed readout linearity [2]. With tconv = 5 ms, this results in
a temperature-sensing resolution of 230 µK (rms) when driven by a sine wave.
Within the target industrial temperature range of −40 to 85 °C, the output phase of the
WB will then vary from about −7° to 10°. For comparison purposes, a WB sensor
based on a non-silicided n-poly resistor (largest negative TC) has also been realized.
With the same component values, this results in a resolution of 380 µK (rms). The
corresponding FoMi,WB values are 2.3 fJ·K² and 8.1 fJ·K² for the silicided p-poly
sensor and the non-silicided n-poly sensor, respectively.
To achieve high energy efficiency, the phase output of the WB sensors is digitized
by a 1-bit second-order phase-domain CTDSM (PD-CTDSM). A simplified block
diagram of the modulator is shown in Fig. 10.3. The input phase (at f0 = 500 kHz)
is first down-converted to DC by multiplying it by a phase reference at the
same frequency (fdemod = f0). Depending on the chosen phase reference, φ1 or φ2,
the multiplier's DC output will be either positive or negative [11]. The result is
integrated by the loop filter and then quantized. Due to the negative feedback, the
quantizer will toggle the reference phases in such a way that the loop filter's average
DC input will be zero. The output bitstream will then be a digital representation of
the input phase. To leave enough margin for the spread of the resistors and capacitors
of the WB sensor, the phase difference between the two references was designed
to be 45° (+22.5° and −22.5°). These references are generated digitally from an
8 MHz master clock, which is 16× higher than the WB's driving frequency.
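The link between the master clock and the reference phases can be made explicit with a small Python sketch. It assumes the references are obtained by shifting the drive signal by whole master-clock periods, which is one straightforward way to generate them digitally; the function names are illustrative.

```python
def phase_step_deg(f_master_hz, f_drive_hz):
    """Phase shift, in degrees of the drive period, of one master-clock period."""
    return 360.0 * f_drive_hz / f_master_hz

def reference_phases_deg(f_master_hz, f_drive_hz, steps=1):
    """Two references placed symmetrically, `steps` clock periods either side."""
    step = phase_step_deg(f_master_hz, f_drive_hz)
    return (+steps * step, -steps * step)

step_deg = phase_step_deg(8e6, 500e3)
phi1_deg, phi2_deg = reference_phases_deg(8e6, 500e3)
```

With a 16× clock ratio, one clock period corresponds to 22.5°, so references one period early and one period late give the ±22.5° pair and the 45° span quoted above.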
The circuit diagram of the phase-domain CTDSM is shown in Fig. 10.4. In the
current domain, the analog input of the PD-CTDSM is the current output signal
in Fig. 10.1c. The phase DAC is realized by controlling the direction of the current
flow, i.e., by chopping the current input with a reference signal with different phases
φ1 and φ2. The feedforward coefficient c1 (Fig. 10.3) is achieved by Rff at the output
of the second stage.
To suppress the 1/f noise from the first integrating stage, the opamp is chopped.
By choosing the chopping frequency fchop the same as fdemod , the input chopper of
the opamp can be merged with the input demodulator and becomes a single chopper
in front of the integration capacitors, as shown in Fig. 10.5.
Fig. 10.5 Circuit diagram of the phase-domain CTDSM, with merged chopper
The gm -C second stage uses an efficient telescopic OTA with source degenerated
NMOS input pairs to improve its linearity, and it draws 4 µA, which is less than
10% of that of the first stage.
The WB sensor chip [19] is fabricated in a 0.18 µm technology, and the chip
micrograph is shown in Fig. 10.8. There are two sensors fabricated side by side:
a silicided p-poly (s-p-poly) sensor and a non-silicided n-poly sensor. They share
the same constant-gm biasing and phase generation circuits. Each sensor occupies
about 0.72 mm², about 40% of which is occupied by the first integrator's capacitors
(2 × 180 pF). Including the readout circuits, each WB sensor draws 87 µA from a
1.8 V power supply.
To determine the resolution of the proposed WB sensor design, the sensors are
driven by a low-jitter (1 ps rms) frequency reference, which contributes less than
0.5% to the readout circuit's total noise power. The sensors are mounted in good
thermal contact with a large aluminum block to minimize ambient temperature
drift. Spectra of the bitstream outputs of both sensors are shown in Fig. 10.9a.
The sensors' noise floor is dominated by the bridge's thermal noise. Figure 10.9b
shows the temperature resolution vs. conversion time plot of both sensors obtained
after a sinc² decimation of their bitstream outputs. In a 5 ms conversion time (2500
samples), the n-poly sensor achieves 880 µK rms resolution, while the s-p-poly sensor
achieves 410 µK rms resolution, due to its higher TC. The n-poly sensor exhibits
a 1/f corner of about 10 Hz, while that of the p-poly sensor is below 1 Hz. These
results demonstrate the effectiveness of the 1/f noise cancelation techniques of the
readout circuit. The remaining 1/f noise, however, comes from the sensing resistors.
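A sinc² decimation of a bitstream amounts to a weighted average with a triangular window. The Python sketch below is a generic illustration, not the authors' implementation; the window length parameter n and the function names are assumptions.

```python
def sinc2_weights(n):
    """Triangular (sinc^2) window of length 2n-1, i.e. two length-n boxcars
    convolved together, normalised so that the weights sum to 1."""
    length = 2 * n - 1
    raw = [min(k + 1, length - k) for k in range(length)]
    total = sum(raw)  # equals n * n
    return [w / total for w in raw]

def decimate(bitstream, n):
    """Single decimated output value from the first 2n-1 bitstream samples."""
    weights = sinc2_weights(n)
    if len(bitstream) < len(weights):
        raise ValueError("bitstream shorter than the decimation window")
    return sum(w * b for w, b in zip(weights, bitstream))

# Number of modulator samples in one conversion: 5 ms at 500 kHz.
samples_per_conversion = round(5e-3 * 500e3)
```

Here `samples_per_conversion` evaluates to 2500, matching the figure in the text. A window with n = 1250 spans 2499 samples, so it fits within one conversion.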
Fig. 10.9 Bitstream FFT (a) and temperature resolution versus conversion time (b) of both Wien
bridge sensors
Twenty samples from one wafer were characterized in ceramic DIL packages.
At room temperature, supply sensitivities of 0.17 °C/V (s-p-poly bridge) and
0.34 °C/V (n-poly bridge) were observed for supply voltages ranging from 1.6
to 2 V. To correct the inherent nonlinearity of the phase-domain CTDSM, a
cosine nonlinearity compensation [10] is applied to the decimated bitstream before
trimming, resulting in the phase output vs. temperature plots shown in Fig. 10.10.
After a first-order polynomial fitting followed by a fixed correction of the
systematic nonlinearity on the phase shift vs. temperature plot, the silicided p-poly
sensor achieves a 3σ inaccuracy of 0.07 °C over the industrial temperature range
of −45 to 85 °C. The non-silicided n-poly sensor is less accurate. Its inaccuracy is
0.25 °C over the same range, as shown in Fig. 10.11.
Fig. 10.10 Phase output vs. temperature plots for (a) silicided p-poly sensor (b) non-silicided
n-poly sensor
Fig. 10.11 Temperature error after first-order fit and systematic error removal for (a) silicided
p-poly sensor (b) non-silicided n-poly sensor
$$\Delta T_{WhB} = \frac{1}{A(\alpha_P - \alpha_N)}\sqrt{\frac{2kTR}{t_{conv}}}, \tag{10.5}$$

where αP and αN are the TCs of RP and RN, respectively, and A = Vdd/2 is the
voltage across one single resistor. Note that the resolution is independent of the
nature of its output, i.e., whether it is a voltage or a current. For α = αP − αN, the FoM
of a WhB temperature sensor can be expressed as FoMWhB = 8kT/α², excluding the
readout electronics' power consumption and noise.
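The step from Eq. (10.5) to the FoM expression can be checked numerically with a short Python sketch. It assumes a bridge of four equal resistors R across Vdd = 2A, so that the bridge dissipates 4A²/R. The effective TC of 0.43%/K and R = 100 kΩ are hypothetical values picked for illustration, not figures from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def whb_resolution_k(a_volt, alpha_eff, r_ohm, t_conv_s, temp_k=300.0):
    """Eq. (10.5): rms resolution of the bridge alone, in kelvin."""
    return math.sqrt(2.0 * K_B * temp_k * r_ohm / t_conv_s) / (a_volt * alpha_eff)

def whb_fom(alpha_eff, temp_k=300.0):
    """FoM = 8kT/alpha^2, in J*K^2, excluding the readout electronics."""
    return 8.0 * K_B * temp_k / alpha_eff ** 2

# Hypothetical example values (alpha_eff and R are assumptions).
A_VOLT, ALPHA_EFF, R_OHM, T_CONV = 0.9, 0.0043, 100e3, 10e-3
res_k = whb_resolution_k(A_VOLT, ALPHA_EFF, R_OHM, T_CONV)

# Energy per conversion of the bridge: P = (2A)^2 / R, E = P * t_conv.
energy_j = (2.0 * A_VOLT) ** 2 / R_OHM * T_CONV
fom_from_energy = energy_j * res_k ** 2
```

With these assumed values the sketch gives a resolution of about 74 µK and a FoM of about 1.8 fJ·K². The energy-times-resolution² product equals 8kT/α² regardless of A, R, or tconv, which is why only the effective TC appears in the FoM.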
To maximize the resolution and the energy efficiency of WhB sensors, the TCs of
RP and RN should have opposite polarities. In the target process, however, only the
non-silicided poly resistors have negative TCs. To compare their performance, two
bridges, based on non-silicided n- and p-poly resistors as the negative TC resistors
(RN ), respectively, were realized. Both bridges used silicided p-poly resistors as the
positive TC resistors (RP ). The two Wheatstone bridges, i.e., s-p/n-poly and s-p/p-
poly bridge, achieve FoM values of 1.7 fJ·K² and 3.4 fJ·K², respectively (excluding
the readout circuits).
The s-p/n-poly WhB sensor was designed to be slightly unbalanced at room
temperature to leave a similar margin at the two extremes of the targeted temperature
range (−40 to 85 °C). With RP = 105 kΩ, RN = 95 kΩ, and a supply voltage
Vdd = 1.8 V, the sensor's resolution is 74 µK within a 10 ms conversion time. The
current output range of the WhB sensor is 2.0–3.0 µA. To achieve a similar current
output range over the targeted temperature range, the resistors in the s-p/p-poly
sensor were chosen to be RP = 67.5 kΩ and RN = 64 kΩ. These values result
in a slightly worse resolution of 85 µK.
As for the WB sensor, a second-order 1-bit CTDSM is employed to read out the
WhB sensor. Since the output of the WhB sensor is a DC current signal, a current-
domain CTDSM is used as depicted in Fig. 10.13. The simplified circuit diagram of
the designed CTDSM is shown in Fig. 10.14. The DC performance of the CTDSM
is critical, so chopping is applied on the first stage opamp to suppress its offset
Fig. 10.14 Circuit diagram of the readout circuit of the Wheatstone bridge sensor
and 1/f noise. The feedforward coefficient is accomplished by the two feedforward
resistors (Rff ) at the output of the gm stage. Because of the symmetry of the WhB,
the common-mode input voltage of the readout circuit is Vdd/2 (0.9 V). A resistive
DAC is chosen to implement the current feedback. To avoid introducing additional
spread, the four DAC resistors (RDAC) are made of the same material as that of RN ,
which has the smaller TC among the two types of sensing resistors. In this design,
RDAC = 140 kΩ for both the s-p/n-poly sensor and the s-p/p-poly sensor.
To avoid aliasing high frequency quantization noise at the chopping transitions
[20], the chopping frequency is set to be equal to the sampling frequency (500 kHz).
Since no phase generation circuits are required, the readout electronics of the WhB
sensor employs a lower reference clock frequency than that of the WB sensor, i.e.,
4 MHz instead of 8 MHz, which lowers the power consumption of its digital circuits.
Fig. 10.16 Bitstream FFT (a) and temperature resolution over conversion time (b) of both
Wheatstone bridge sensors
The WhB sensor chip is fabricated in the same 0.18 µm CMOS technology, and
the chip micrograph is shown in Fig. 10.15. There are two sensors fabricated side
by side: a silicided p-poly/non-silicided n-poly (s-p/n-poly) sensor and a silicided
p-poly/non-silicided p-poly (s-p/p-poly) sensor. They share the same constant-gm
biasing and clock generation circuits. Under a 1.8 V power supply, the current
consumption is 101 µA for each sensor, including the readout circuits. The chip
area is the same as the WB sensor.
With similar measurement setup as that for the WB sensor (but without the
low-noise frequency reference), the bitstream spectra and the conversion time vs.
resolution plots of the two WhB bridge sensors are shown in Fig. 10.16. The 1/f
corner of both sensors is around 10 Hz. After a sinc² decimation of their bitstream
Fig. 10.17 Decimated bitstream vs. temperature plots for (a) s-p/n-poly sensor (b) s-p/p-poly
sensor
Fig. 10.18 Temperature error after first-order fit and systematic error removal for (a) s-p/n-poly
sensor (b) s-p/p-poly sensor
The two sensors represent different ways of realizing the reference impedance
required to digitize a temperature-sensing resistance. The WB sensor's reference is
a capacitive reactance, i.e., the impedance of a capacitor driven by a fixed frequency.
In contrast, the reference of the WhB sensor is simply another resistance.
The realization of a reference resistance is more straightforward, as it does not
require a stable frequency reference (although this is not a problem in a frequency
reference). Also, the associated readout electronics can be more efficient, because
its input signals are at DC and do not need to be down-modulated. Moreover, the
use of two different types of resistors in the WhB sensor results in a higher effective
TC than in the WB sensor. As a result, the FoM of the WhB sensor is 2.7× better
than that of the WB sensor.
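The measured FoM values can be reproduced from the reported supply currents, conversion times, and resolutions. The Python sketch below uses the chapter's own figures (87 µA and 410 µK in 5 ms for the WB sensor; 101 µA and 160 µK in 10 ms for the WhB sensor, both at 1.8 V); the function name is illustrative.

```python
def resolution_fom(current_a, supply_v, t_conv_s, resolution_k):
    """Resolution FoM = energy per conversion * resolution^2, in J*K^2."""
    return current_a * supply_v * t_conv_s * resolution_k ** 2

fom_wb = resolution_fom(87e-6, 1.8, 5e-3, 410e-6)     # about 0.13 pJ*K^2
fom_whb = resolution_fom(101e-6, 1.8, 10e-3, 160e-6)  # about 0.047 pJ*K^2
ratio = fom_wb / fom_whb
```

From these rounded inputs the ratio comes out slightly below 3, in line with the roughly 2.7× improvement obtained from the quoted FoM values of 0.13 and 0.049 pJ·K².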
However, apart from FoM, other important performance criteria for temperature
sensors are their accuracy and stability. The temperature dependency of a WhB
sensor is determined by two different types of resistors with different (non-linear)
TCs and spread, while a WB employs only a single type. As a result, a WhB sensor
can be expected to be somewhat less accurate than a WB sensor, especially over
corners. Moreover, the observed 1/f noise of silicided poly resistors is much better
than that of normal poly resistors, which also makes WB sensors better in this
respect.
10.6 Conclusion
Table 10.2 Performance summary of the Wien bridge (WB) sensor and the Wheatstone bridge
(WhB) sensor compared to previous high-resolution energy-efficient temperature sensors
WB sensor WhB sensor JSSC13 [1] JSSC15 [15] JSSC17 [12] TIE17 [5]
Sensor type Resistor Resistor Resistor Resistor MEMS BJT
Tech (µm) 0.18 0.18 0.18 0.18 0.18 0.7
Area (mm²) 0.72 0.72 0.18 0.43 0.54 1.5
Power (mW) 0.16 0.18 13 0.065 19 0.16
Temp. range (°C) −40–85 −40–85 −40–85 −40–125 −40–105 −40–130
Resolution (mK) 0.41 0.16 0.1 10 0.02 3
Tconv (ms) 5 10 100 0.1 5 1.8
Trim point 2a 2a 6 2b 1
Inaccuracy (3σ) 70 mK 100 mK 15 mKc 400 mKc 300 mK
Res. FoM (pJ·K²) 0.13 0.049 13 0.65 0.04 3.2
a First-order fit
b One-point trim with first-order curve fitting
c Min or max
Fig. 10.19 Energy efficiency of the two measured sensors compared to other CMOS temperature
sensors [4]
low 1/f noise, is employed in both bridges. Energy-efficient CTDSMs are adopted
for the readout circuits of both sensors, and chopping is applied to suppress the
offset and 1/f noise of the readout electronics. The Wien bridge sensor achieves a
410 µK resolution in a 5 ms measurement time and a resolution FoM of 0.13 pJ·K².
The Wheatstone bridge sensor, however, achieves a 160 µK resolution in a 10 ms
measurement time and a resolution FoM of 0.049 pJ·K². These results clearly show
that resistor-based temperature sensors can be used to realize the high-resolution and
energy-efficient sensors needed for the temperature compensation of frequency references.
Acknowledgments The authors would like to thank Yanquan Luo and Saleh Heidary Shalmany
for their contributions to the Wien bridge sensor design and Hui Jiang for his work on the
Wheatstone bridge sensor. The authors would also like to thank Burak Gönen, Vincent van Hoek,
and Said Hussaini for proofreading and suggestions.
References
14. Wu, C.K., Chan, W.S., Lin, T.H.: A 80kS/s 36µW resistor-based temperature sensor using
BGR-free SAR ADC with a unevenly-weighted resistor string in 0.18 µm CMOS. In: 2011
Symposium on VLSI Circuits Digest of Technical Papers, Honolulu, HI, pp. 222–223 (2011)
15. Weng, C.H., Wu, C.K., Lin, T.H.: A CMOS thermistor-embedded continuous-time delta-sigma
temperature sensor with a resolution FoM of 0.65 pJ·°C². IEEE J. Solid-State Circuits.
50(11), 2491–2500 (2015)
16. Sankaragomathi, K.A., Koo, J., Ruby, R., Otis, B.P.: A 3ppm 1.1mW FBAR frequency
reference with 750MHz output and 750mV supply. In: 2015 IEEE International Solid-State
Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, pp. 1–3 (2015)
17. Tang, X., Pun, K.P., Ng, W.T.: A 0.9V 5kS/s resistor-based time-domain temperature sensor in
90 nm CMOS with calibrated inaccuracy of −0.6 °C/0.8 °C from −40 °C to 125 °C. In: 2013
IEEE Asian Solid-State Circuits Conference (A-SSCC), Singapore, pp. 169–172 (2013)
18. De Graaff, H.C., Huybers, M.T.M.: 1/f noise in polycrystalline silicon resistors. J. Appl. Phys.
54(5), 2504–2507 (1983)
19. Pan, S., Luo, Y., Shalmany, S.H., Makinwa, K.A.A.: 9.1 A resistor-based temperature sensor
with a 0.13pJ·K² resolution FOM. In: 2017 IEEE International Solid-State Circuits Conference
(ISSCC), San Francisco, CA, pp. 158–159 (2017)
20. Billa, S., Sukumaran, A., Pavan, S.: A 280µW 24kHz-BW 98.5 dB-SNDR chopped single-bit
CTΔΣM achieving < 10Hz 1/f noise corner without chopping artifacts. In: 2016 IEEE
International Solid-State Circuits Conference (ISSCC), San Francisco, CA, pp. 276–277 (2016)
Chapter 11
A High-Resolution Self-Oscillating Integrating
Dual-Slope CDC for MEMS Sensors
11.1 Introduction
Fig. 11.1 Block diagram of the proposed architecture. Sensors are connected in a Wheatstone
bridge to a preamplifier, whose output is digitized by an ADC
have been proposed to reduce area and power consumption [5, 6]. However, it has
proven quite challenging to match the high resolution of multi-bit SC ADCs.
In order to save power and area without reducing the performance of the CDC,
this paper presents a topology that is able to detect small capacitance changes while
reducing ADC complexity, which is a key determinant of CDC area and power. In
order to be area and energy efficient, this work is based on an integrating dual-slope
topology [7–9]. The proposed converter maps amplitude information into the time
domain. In addition, like traditional converters [10], it uses quantization-noise
shaping to reduce measurement time. However, unlike these, it does not require flash
quantizers or n-bit DACs to generate a multi-bit digital output. Instead, it achieves
high resolution by using single-bit circuitry and operating in the time domain.
The main strengths of the proposed CDC are (1) intrinsically small sensitivity to
temperature and process variations, (2) ease of trimming offset and gain to correct
the sensor parameter spread, and (3) area and energy-efficient implementation while
maintaining the performance of traditional approaches. A block diagram of the
proposed solution is shown in Fig. 11.1. Among the main blocks are an excitation
circuit that drives a MEMS bridge and which also provides a reference signal for
synchronous demodulation, an analog front-end and an integrating DS ADC.
MEMS are microscale systems made of both moving mechanical parts and electronics.
Such structures can sense variations of mechanical quantities or act as actuators. Most
MEMS devices consist of a mass that is free to move in one or more directions in
3D space with respect to the substrate, to which it is anchored by springs. Different
methods are commonly used to sense the displacement of the moving mass:
capacitive, piezoresistive, optical, and resonant sensing. Capacitive-readout MEMS
are based on measuring the capacitance variation caused by the displacement of a
suspended microscopic structure in the presence of an externally applied force. Moving
electrodes (also called rotors, borrowing mechanical terminology) are mechanically
anchored to the moving structure, and fixed electrodes (consequently called
stators) are anchored to the substrate. Figure 11.2a shows a differential capacitive
sensing cell with a moving electrode anchored to a suspended shuttle (on the right)
and forming a pair of capacitors with stators A and B.
Fig. 11.2 (a) Sketch of a typical differential capacitive sensing cell for a MEMS structure. Stators
A and B are anchored to the chip substrate and form a differential capacitor with the rotor. (b)
Diagram of the forces acting on a suspended mass
A microelectromechanical system can be modeled as a lumped-parameter spring-
mass-damper system [11], as shown in Fig. 11.2b: a mass is connected via a spring
to a fixed support and is pulled by an external force $F_{ext}$. A dashpot represents
the mechanical damping. All three elements share the same
displacement x with respect to a rest position. For the sake of simplicity, only a one-
axis model is considered here, neglecting secondary vibration modes; the analysis
can be easily extended to a 3-DOF system in an inertial frame of reference. Applying
Newton's second law of motion, which states that the net force on a body equals the
product of its mass and acceleration, $F = m a$, the classical equation of
motion describing the dynamics of a suspended micromachined structure can be
derived:
$$m\ddot{x} + b\dot{x} + kx = F_{ext} \qquad (11.1)$$
where m is the mass, b the damping coefficient, and k the spring stiffness; the elastic
force is proportional to the displacement x and the viscous force to the velocity $\dot{x}$.
By applying the Laplace transform to Eq. (11.1), the frequency behavior of the MEMS
can be studied, and two main parameters can be
highlighted to describe the behavior of the mechanical element.
204 J.P. Sanjurjo et al.
A heavier moving mass resonates at a lower frequency; the stiffer the spring, the
higher the resonance frequency.
The quality factor Q is a dimensionless parameter that characterizes how
over- or under-damped a MEMS resonator is. Equivalently, for large values, Q also
characterizes the resonator bandwidth $\Delta f$ relative to its center frequency $f_r$, and it is
related to the MEMS parameters according to
$$Q = \frac{f_r}{\Delta f} = \frac{\omega_r m}{b} = \frac{\sqrt{k m}}{b} \qquad (11.3)$$
where T is the absolute temperature, $k_B$ the Boltzmann constant, and b the damping
coefficient previously introduced. It should not be surprising that the resulting
expression for mechanical noise is very similar to that of Johnson noise in resistors,
$S_{Vn} = 4 k_B T R$, as both have the same physical origin: dissipation.
In gas-damped systems, such as MEMS working either at ambient pressure or in a
package at a lower pressure, mechanical noise is mainly due to the random motion of
molecules that hit the suspended structure. The result of this statistical process is an
unwanted random displacement of the moving mass, which is nevertheless detected
by the position-sensing interface.
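The lumped model above can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes the standard textbook results for the spring-mass-damper system, namely an undamped resonance at ω_r = sqrt(k/m), the quality factor of Eq. (11.3), and a thermomechanical force-noise density of sqrt(4·k_B·T·b), the mechanical analogue of Johnson noise. The device values are hypothetical and chosen only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def mems_parameters(m, k, b, temperature=300.0):
    """Return (f_r, Q, force-noise density) of a spring-mass-damper model.

    m: mass in kg, k: stiffness in N/m, b: damping coefficient in N*s/m.
    """
    omega_r = math.sqrt(k / m)           # undamped resonance, rad/s
    f_r = omega_r / (2.0 * math.pi)      # resonance frequency, Hz
    q_factor = math.sqrt(k * m) / b      # quality factor, Eq. (11.3)
    # Assumed textbook form, analogous to 4*k_B*T*R for a resistor
    force_noise = math.sqrt(4.0 * K_B * temperature * b)  # N/sqrt(Hz)
    return f_r, q_factor, force_noise


# Hypothetical device values, for illustration only
f_r, q_factor, f_n = mems_parameters(m=1e-9, k=1.0, b=1e-6)
print(f"f_r = {f_r:.0f} Hz, Q = {q_factor:.1f}, F_n = {f_n:.2e} N/sqrt(Hz)")
```

Increasing the mass lowers f_r, and increasing the damping coefficient b lowers Q while raising the force noise, in line with the qualitative statements above.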
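[Fig. 11.3: schematic of the readout chain. The MEMS bridge (Csen, Cref), excited by VEX, drives the
preamplifier (PreAmp, with Coffset and CGain), followed by the demodulator (Dem., driven by Vref) and
the integrating dual-slope ADC: input Vin, Rin, integrator INTI (OA, Cin) with output VDS, comparator
clocked at fclk with output Vcomp, feedback IDAC, and counter SUM producing the 4-bit output DOUT at fs]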
in Csen. Cref is a reference capacitor, whose capacitance does not depend on pressure
and which has the same temperature variation as Csen. The bridge is differentially
modulated (VEX+ and VEX−) by a square wave, smoothed in order not to excite
higher-order mechanical resonances, which results in a signal at the output of the bridge
whose amplitude is proportional to the capacitance imbalance of the bridge.
leaving the demodulated signal without any offset or low-frequency noise. This
effect is illustrated in Fig. 11.4. This procedure is referred to as chopping [13].
Note that the PreAmp is not chopped separately, but rather intrinsically chopped
by the demodulation stage in the signal chain. In this way, the DS ADC only
digitizes the voltage difference generated by Csen and Cref, as output by the PreAmp.
The ideal relation between the bridge capacitors and the DS converter input voltage
Vin (see Fig. 11.3) is given by Eq. (11.5):
$$V_{in} = V_{EX}\,\frac{C_{sen} - C_{ref}}{C_{Gain}} \qquad (11.5)$$
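As a quick illustration of Eq. (11.5), the sketch below evaluates the ideal converter input voltage for a small bridge imbalance. The component values (1.5 V excitation, 1 pF gain capacitor, 2 pF reference capacitor, 5.4 aF imbalance) are hypothetical and are not taken from the design.

```python
def bridge_input_voltage(v_ex, c_sen, c_ref, c_gain):
    """Ideal DS-converter input voltage from Eq. (11.5), in volts."""
    return v_ex * (c_sen - c_ref) / c_gain


# Hypothetical values, for illustration only
c_ref = 2.0e-12
v_in = bridge_input_voltage(v_ex=1.5, c_sen=c_ref + 5.4e-18,
                            c_ref=c_ref, c_gain=1.0e-12)
print(f"Vin = {v_in * 1e6:.2f} uV")
```

A balanced bridge (Csen = Cref) gives zero input voltage, so only the imbalance is digitized.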
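[Fig. 11.5: integrator output (V) versus time for a classical dual-slope conversion: Phase I
(duration T1, slope K1) and Phase II (duration T2, slope K2), each conversion followed by a reset,
with clock period TCLK]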
obtained in Phase I with a fixed slope until the comparator detects a zero crossing. Then
the conversion is over, and the integrator is reset to remove
the charge in the integrating capacitor. In classical DS converters, T1 is always
fixed; in Fig. 11.5, it is four clock periods (N = 4). The time
needed to discharge the voltage accumulated at the end of Phase I (T2 in Fig.
11.5) then depends on the charge accumulated in the integrating capacitor during
Phase I (in the example of Fig. 11.5, this time is equal to six clock cycles, M = 6).
The resolution of this method is then proportional to the number of clock cycles
needed to discharge the voltage integrated when the input voltage is at its full-scale value
($M_{MAX}$). Equations (11.6) and (11.7) give the LSB size and the equivalent number of bits:
$$\mathrm{LSB} = \frac{V_{FS}}{M_{MAX}} \qquad (11.6)$$
Fig. 11.6 (a) Standard integrating dual-slope and (b) proposed self-oscillation approach time
diagrams
integrating capacitor (instead of resetting its voltage) at the end of Phase II, as can
be seen in Fig. 11.6a. Compared with standard ΔΣ modulators, the integrating DS
converter does not require multi-bit circuits (i.e., flash quantizers or n-bit DACs)
to reach the same resolution and performance. In [7, 9], it was demonstrated that the
maximum signal-to-noise ratio (SNR) that can be achieved using this architecture is
similar to that of a first-order multi-bit ΔΣ modulator:
where Nbits is given by Eq. (11.7) and OSR is the oversampling ratio, i.e.,
the sampling frequency of the converter (fs) divided by twice the signal bandwidth (fBW):
OSR = fs/(2fBW). According to Eq. (11.8), the resolution of the integrating DS
converter can be set through these two factors, as in a ΔΣ modulator. However,
this requires a fixed sampling time (Ts = T1 + T2) and therefore a slight
modification of the DS ADC equations. In this case, the variables of the
converter must be selected such that the system is able to completely discharge
the integrating capacitor Cin (see Fig. 11.6) in Phase II when the input signal is at its
maximum value. The control of the discharge in Phase II depends on the following
two equations:
$$K_2 = \frac{I_{feedback}}{C_{int}} \qquad (11.10)$$
where two main clocks must be defined: fclk = 1/Tclk and fs = 1/Ts. Defining
the ratio R = fclk/fs and assuming N = M, it follows that $R = 2^{N_{bits}-1}$. In this way, Eq. (11.8)
can be rewritten as follows:
$$\mathrm{SNR(dB)} = 6.02\left(\frac{\log R}{\log 2} + 1\right) + 30\log\left(\frac{f_s}{2 f_{BW}}\right) - 5.17 \qquad (11.11)$$
Using Eq. (11.11), it is easy to scale the integrating DS converter in the same way
as a multi-bit ΔΣ modulator. However, some circuit-design issues
must still be solved.
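Equation (11.11) lends itself to a quick design-space calculation. The sketch below is illustrative only; it assumes that the logarithms in Eq. (11.11) are base 10 (with log R / log 2 being the base-2 logarithm of R) and uses the clock, sampling, and bandwidth figures reported in the measurement section (1.28 MHz, 160 kHz, and 10 Hz) as inputs.

```python
import math


def snr_db(f_clk, f_s, f_bw):
    """Peak SNR of the integrating DS converter per Eq. (11.11), in dB."""
    ratio = f_clk / f_s                  # R = f_clk / f_s
    n_bits = math.log2(ratio) + 1.0      # equivalent quantizer resolution
    osr = f_s / (2.0 * f_bw)             # oversampling ratio
    return 6.02 * n_bits + 30.0 * math.log10(osr) - 5.17


snr = snr_db(f_clk=1.28e6, f_s=160e3, f_bw=10.0)
print(f"quantization-limited SNR = {snr:.1f} dB")
```

With R = 8 (a 4-bit equivalent quantizer) and an OSR of 8000, the expression evaluates to roughly 136 dB. This figure accounts for quantization noise only; thermal and flicker noise of the circuits are not modeled, so a measured converter would be expected to fall below it.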
First, the integrating capacitor must keep its charge constant after the zero
crossing until the next sampling period. However, circuit parasitics remove
a significant amount of charge from the capacitor (leakage). As a result,
the quantization error at the end of the sampling period differs from its value
just after the zero crossing.
Second, the transfer function is not linear. The possible digital output values are in
the range from −Dn to Dn. However, there are two different zeros, 0+ and 0−.
To compensate for this, extra digital logic is needed (two different approaches have
already been shown in [7, 9]), increasing the area and power consumption of the
converter.
In this work, the integrating DS of [7, 9] has been modified in order to solve some
of the issues mentioned above. The main difference is shown in Fig. 11.6b. In this
case, the output of the DS integrator is kept toggling after the zero crossing in the
discharging phase of the integrator. With this modification, the quantization
error is still preserved for the next sampling period, as shown in [8]. Since
the zero crossing is now resolved intrinsically by the operation of the DS converter,
the digital control complexity and comparator precision can be greatly relaxed,
saving power and area. This modification also introduces a zero self-compensation,
which removes the digital logic otherwise needed to correct the nonlinearity of the transfer
function. In addition, the proposed solution significantly relaxes the leakage issue
mentioned above: the integrating capacitor no longer needs to hold the
quantization error for a long period of time, which reduces the associated
error.
This section describes the solution adopted in this work to connect the proposed
self-oscillating integrating DS converter with the MEMS and CV stage. The DS
converter needs to be connected to the voltage amplifier and the demodulation
block in such a way that the information is not disturbed. As mentioned before,
the readout circuit modulates the input signal to high frequencies to remove
offset and flicker noise in the first part of the chain. Figure 11.7 shows a timing diagram
of the different signals along the full readout chain. VEX represents the excitation
signal of the bridge. It is shaped as a pseudo-trapezoidal wave in order to reduce
the high-frequency stimulation of the MEMS sensor and therefore reduce ringing in
the output voltage of the bridge. The excitation of the MEMS together with the
demodulation block implements the chopping of the first part of the CDC. The signals
that drive these blocks (VEX and Vref) are synchronized and have the same frequency
(fchopping = fs/2). The chopping frequency has to be carefully chosen: it has to be high
enough to modulate flicker noise above the bandwidth of interest, but low enough to allow
the signal to settle after the ringing. The output of the demodulator is the input of
the integrating DS ADC (Vin in Figs. 11.1 and 11.3). As explained, the signal
is affected by the ringing of the MEMS but has a stable interval (due to the
chopping frequency and the pseudo-trapezoidal waveform). This interval
is assigned to Phase I of the DS ADC (like the track phase of a track-and-hold).
The signals φ1 and φ2 represent the two phases of the DS ADC sampling period. In
order to achieve high resolution with the lowest clock frequency (to reduce power
consumption), N and M are chosen to be unequal: N is equal to two clock periods, which
matches the stable time of Vin, and M is equal to six.
[Fig. 11.7: timing diagram of the readout chain: VEX (period T(VEX) = T(VREF) = 2TS), Vin, the phase
signals φ1 (Phase I, TI = N·Tclk) and φ2 (Phase II, TII = M·Tclk) with TS = (N + M)·Tclk, the
integrator output VDS, and the comparator output Vcomp, whose Phase II sequence
+1 +1 +1 +1 −1 +1 yields DOUT = +4]
VDS (Fig. 11.3) is the output of INTI (Fig. 11.1). During Phase I, VDS is
charged in proportion to (Vin·TI)/(RIN·CIN). Then, at the beginning of Phase II (signal
φ2 high), the input of the DS converter is connected to the feedback DAC (IDAC in
Fig. 11.3), and VDS is discharged with a constant slope proportional to (IDAC·TII)/CIN.
Due to the self-oscillating behavior, when the loop sees a change
of polarity at the output of the comparator during Phase II, the system starts to toggle until the end
of the phase, keeping the quantization error for the next sampling period (φ1 high).
Signal Vcomp represents the output of the clocked comparator. This signal is
used to drive the IDAC and to generate the multi-bit digital output of the CDC (DOUT in
Fig. 11.3). A simple counter (SUM in Fig. 11.3) can be used to generate this signal.
This block accumulates the comparator output only during Phase
II (a high level adds 1 and a low level subtracts 1) and delivers one result per sampling period Ts. DOUT
is then proportional to the input amplitude of the DS converter (Vin) and therefore
to the input pressure of the CDC. In this system, the sampling period is defined as
Ts = TI + TII, where TI = N·Tclk, TII = M·Tclk, and Tclk is the clock period of
the comparator.
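The operation described above can be captured in a few lines of behavioral code. The following is a simplified sketch under stated assumptions, not the authors' model: the integrator is ideal, the Phase I gain per clock cycle (`k1`, standing for Tclk/(RIN·CIN)) and the DAC step per clock cycle (`q`, standing for IDAC·Tclk/CIN) are normalized, hypothetical parameters, and the function name is illustrative.

```python
def self_oscillating_dual_slope(vin, n_periods, N=2, M=6, k1=1.0, q=1.0):
    """Behavioral model returning one D_OUT code per sampling period."""
    v = 0.0        # integrator output V_DS; the residue is kept between periods
    codes = []
    for _ in range(n_periods):
        # Phase I: integrate the input for N clock cycles
        v += N * k1 * vin
        # Phase II: single-bit feedback for M clock cycles; after the zero
        # crossing the loop keeps toggling instead of resetting
        d_out = 0
        for _ in range(M):
            bit = 1 if v >= 0 else -1   # clocked comparator decision
            v -= bit * q                # IDAC removes or adds one step
            d_out += bit                # counter SUM: +1 for high, -1 for low
        codes.append(d_out)
    return codes


codes = self_oscillating_dual_slope(vin=1.65, n_periods=1000)
mean_code = sum(codes) / len(codes)
print(codes[:4], mean_code)
```

With these normalized values the first period produces the comparator sequence +1 +1 +1 +1 −1 +1, i.e., a code of +4. Because the residue left on the integrator is carried into the next period, the quantization error is first-order shaped, and the average code converges to N·k1·vin/q (3.3 here) as more periods are averaged.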
The analog front end is based on a closed-loop capacitive voltage amplifier (preAmp
in Fig. 11.3) built with a telescopic gain-boosted OTA in order to be power efficient.
This amplifier does not need any specific technique to reduce low-frequency noise,
as it is already chopped by the modulation/demodulation scheme of this
architecture.
The power consumption of the DS ADC is set mainly by the RC integrator (INTI
in Fig. 11.3). The OTA used in the integrator is a Miller-compensated two-stage class A/AB
pull-up-down topology. A simplified schematic is shown in
Fig. 11.8. This architecture allows a better power trade-off than folded-
cascode and two-stage class A topologies. However, in the class A/AB topology,
each gain stage requires a separate common-mode feedback (CMFB) circuit. This
is a consequence of using current mirrors in the second stage: the common-mode
output voltage of the first stage affects the bias condition of the second stage but
does not affect the second-stage output voltage. Moreover, to cope with the 1/f noise
introduced by the OTA input differential pair, chopping has
been adopted. The signal demodulation is applied at low-impedance nodes (cascode
nodes; see Fig. 11.8). This chopping configuration does not suffer from the limited
bandwidth of the opamp or from the distortion introduced by the chopping switches,
because the demodulation is performed before the dominant
pole and not at the output of the OTA. Signal Vref in Fig. 11.3 is also used to
chop this OTA, which aligns it with the timing of the system. The
gain-bandwidth product of this OTA has been set to four times the clock frequency,
GBW = 4·fclk (fclk = 1/Tclk), in order to deal with the DAC pulses.
Fig. 11.8 Simplified schematic of OTA used in the dual-slope integrator INTI
The switches that control the different phases in the DS ADC do not require
special circuit techniques, because their distortion is shaped by the noise-shaping
behavior of the converter. The current DAC uses a non-return-to-zero topology. In
order to deal with the 1/f noise at the output of the IDAC, the current mirrors that
drive the current cells are designed with large PMOS (W = 9 μm, L = 94 μm)
and NMOS (W = 9 μm, L = 150 μm) transistors. A two-stage regenerative low-
power clocked comparator is used for the single-bit conversion. Its output is a PWM
waveform that can be directly connected to the IDAC and to the counter SUM to
generate DOUT.
In order to compute the final CDC output measurement, the generated multi-
bit output DOUT must be interpreted appropriately. Since the signal information
is contained in the DC component of the DOUT data stream, an efficient digital
11.5 Measurements
The designed CDC was fabricated in a standard digital 0.13 μm CMOS technology.
It was bonded together with a MEMS pressure sensor in the same carrier to
minimize the interconnection and parasitic capacitances between them. Figure 11.9
shows a die photograph of both the MEMS sensor and the CDC. The CDC core, with
an area of 0.317 mm² and clocked at fclk = 1.28 MHz (equivalent to a sampling
frequency of fs = 160 kHz), consumes 146 μA from a single 1.5 V power supply.
This current includes the analog, digital, and excitation-signal generator blocks. The
bridge of the capacitive MEMS is excited with an 80 kHz signal, which is also
used to demodulate the output voltage of the preAmp and to chop the integrator
INTI. The spectrum of the digital output (DOUT) of the CDC is shown in Fig. 11.10.
It corresponds to an input pressure of 1050 mbar, equivalent to a voltage of −16 dBFS at
the input of the DS converter, where the full scale is VFS = 1 V.
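The operating frequencies quoted above follow directly from the timing relations Ts = (N + M)·Tclk and fchopping = fs/2 given earlier. The short check below reproduces them from the reported clock frequency and the phase lengths N = 2 and M = 6, and also computes the supply power from the reported current and voltage; the variable names are illustrative.

```python
F_CLK = 1.28e6       # comparator clock frequency, Hz
N, M = 2, 6          # Phase I / Phase II lengths in clock cycles
I_SUPPLY = 146e-6    # reported supply current, A
V_SUPPLY = 1.5       # reported supply voltage, V

f_s = F_CLK / (N + M)          # sampling frequency, since Ts = (N + M) * Tclk
f_chop = f_s / 2.0             # chopping (bridge excitation) frequency
power = I_SUPPLY * V_SUPPLY    # power drawn from the supply, W

print(f"fs = {f_s / 1e3:.0f} kHz, fchop = {f_chop / 1e3:.0f} kHz, "
      f"P = {power * 1e6:.0f} uW")
```

The results, 160 kHz, 80 kHz, and 219 μW, agree with the sampling and excitation frequencies stated in the text.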
The measured equivalent integrated noise over a bandwidth of 10 Hz is
2.9 μVrms, leading to an ENOB of 17 bits. Figure 11.10 also shows the first-order
noise shaping of the integrating DS converter as well as the DC tone representing
the input pressure. In addition, tones caused by the modulation between the DC
signal and the clock are present. This behavior is well known in first-order noise-
shaping ADCs, but these high-frequency tones do not corrupt the in-band noise
floor. In order to measure the pressure resolution of the CDC more precisely,
another experiment was set up. A special high-precision pressure generator
was used in the lab: the prototype (MEMS + CDC) was inserted in this
equipment, where the air pressure can be controlled with a precision of 0.1 Pa.
Fig. 11.9 Die photo of the MEMS sensor and the CDC in the same package
Fig. 11.10 FFT of the CDC DOUT for an input pressure of 1050 mbar
The output of the CDC, DOUT, integrated and averaged over 20 ms, is plotted
versus the input pressure in Fig. 11.11. Three different pressure values within the
specified range were used to calculate the resolution of the CDC. For
each of these three points, 100 measurements were taken, and the standard
deviation was calculated for each group of data. The
equivalent resolution of each measurement corresponds to a digital standard deviation of
$\sigma_D$ = 0.003 digital codes or, equivalently, a voltage standard deviation of
$\sigma_V$ = 3.12 μVrms
(assuming a full scale at the input of the DS converter of VFS = 1 V). In order to
calculate the CDC resolution from the standard deviation of each measurement, the
following definition can be used:
$$\mathrm{ENOB} = \log_2\left(\frac{2 V_{FS}}{\sqrt{2}\,\sigma_v}\right) \qquad (11.12)$$
Using Eq. (11.12), the CDC achieves an ENOB of 17.5 bits. This result is in
agreement with the spectral density shown in the FFT of Fig. 11.10. Based on these
results, the CDC is able to resolve a capacitance difference of ΔCsen = 5.4 aF (for
a maximum capacitive range in the sensor of ΔCsen = 1 pF), which corresponds to a pressure
resolution of 0.8 Pa.
The demand for low-power, small-area CDCs has increased significantly
in recent years due to the fast development of technology for the
Internet of Things (IoT). To optimize the trade-off between performance and power
consumption, different approaches are being used: ΔΣ modulation, SAR,
incremental, and DS converters are the main topologies. ΔΣ ADCs can achieve
higher resolution than other topologies, but often at the expense of area and
power. SAR converters, in contrast, cannot achieve as much resolution
as ΔΣ ADCs, but they are more efficient in terms of power and area.
Incremental converters are also becoming popular for these applications.
They are able to achieve moderate to high resolutions with some advantages: a
simpler decimation filter, ease of multiplexing, low latency, and the absence of idle
tones. However, the circuit complexity of these converters is still high. The new family
of converters (integrating DS) presented in this work is able to achieve the same
performance as an incremental converter (in the same range of power consumption)
with a simpler configuration. As mentioned before, this configuration includes an
analog front end and uses a simple decimation filter. Also, thanks to the digital control,
it is able to multiplex different inputs and change the resolution of the converter
efficiently. Figure 11.12 compares state-of-the-art converters with this
work, using the data from Table 11.1 and the following definitions:
$$\mathrm{SNR_{cap}(dB)} = 20\log\left(\frac{\mathrm{In.\,Range}/(2\sqrt{2})}{\mathrm{Resolution}}\right), \qquad \mathrm{FoM} = \frac{\mathrm{Power}\times\mathrm{Meas.\,Time}}{2^{(\mathrm{SNR_{cap}}-1.76)/6.02}}$$
As shown in Table 11.1, the topology presented in this work has an SNR and
FoM similar to those of state-of-the-art CDCs for the same capacitance resolution. As mentioned
before, and shown in Table 11.1, ΔΣ modulators are able to process the sensor
information with higher resolution at the expense of higher power consumption.
On the other hand, incremental and SAR converters are power efficient, but their
resolution is moderate. Finally, classical DS topologies are not efficient enough for
these applications. In summary, Fig. 11.12 shows that there is a clear
trade-off between measurement time, power, and resolution in all the converters.
In this scenario, the proposed work offers a power-efficient CDC using a very
simple and robust implementation. In addition, because it trades amplitude resolution
for time resolution, the proposed converter can offer a more efficient trade-off
between measurement time, power, and resolution than other CDCs when
implemented in low-voltage technologies.
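The two comparison metrics can be evaluated with a short script. The sketch below is illustrative only: it uses the capacitance range (1 pF) and resolution (5.4 aF) quoted in the measurement section, assumes base-10 logarithms, and further assumes a 20 ms measurement time and a supply power of 146 μA × 1.5 V. These inputs are assumptions made here for the example and are not necessarily the values entered in Table 11.1.

```python
import math


def snr_cap_db(input_range, resolution):
    """Capacitive SNR: rms of a full-range sine divided by the rms resolution."""
    return 20.0 * math.log10((input_range / (2.0 * math.sqrt(2.0))) / resolution)


def fom(power, meas_time, snr_db):
    """Energy per conversion step, in joules."""
    return power * meas_time / 2.0 ** ((snr_db - 1.76) / 6.02)


snr = snr_cap_db(input_range=1e-12, resolution=5.4e-18)
# Assumed inputs: 20 ms measurement time, 146 uA from a 1.5 V supply
energy_per_step = fom(power=146e-6 * 1.5, meas_time=20e-3, snr_db=snr)
print(f"SNRcap = {snr:.1f} dB, FoM = {energy_per_step * 1e12:.0f} pJ/step")
```

The exponent (SNRcap − 1.76)/6.02 is the effective number of bits, so the FoM expresses the energy spent per resolvable step; halving the measurement time at constant power and resolution halves the FoM.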
11.7 Conclusions
Acknowledgments This work has been funded by Marie Curie project SIMIC, Grant Agreement
No. 610484, and funded by grants from the European Union (Research Executive Agency) and
TEC2014-56879-R of CICYT, Spain.
References
1. Langfelder, G., Buffa, C., Frangi, A., Tocchio, A., Lasalandra, E., Longoni, A.: Z-Axis
magnetometers for MEMS inertial measurement units using an industrial process. IEEE Trans.
Ind. Electron. 60(9), 3983–3990 (2013)
2. Xia, S., Makinwa, K., Nihtianov, S.: A capacitance-to-digital converter for displacement
sensing with 17b resolution and 20 μs conversion time. In: International Solid-State Circuits
Conference Digest of Technical Papers, San Francisco, CA, pp. 198–200 (2012)
3. Oh, S., Jung, W., Yang, K., Blaauw, D., Sylvester, D.: 15.4b incremental sigma-delta
capacitance-to-digital converter with zoom-in 9b asynchronous SAR. In: Symposium on VLSI
Circuits Digest of Technical Papers, Honolulu, HI, pp. 1–2 (2014)
4. Tan, Z., Chae, Y., Daamen, R., Humbert, A., Ponomarev, Y.V., Pertijs, M.A.P.: A 1.2V 8.3nJ
energy-efficient CMOS humidity sensor for RFID applications. In: Symposium on VLSI
Circuits Digest of Technical Papers, Honolulu, HI, pp. 24–25 (2012)
5. Jung, W., Jeong, S., Oh, S., Sylvester, D., Blaauw, D.: 27.6 A 0.7pF-to-10nF fully digital
capacitance-to-digital converter using iterative delay-chain discharge. In: International Solid-
State Circuits Conference Digest of Technical Papers, San Francisco, CA, pp. 1–3 (2015)
6. Tan, Z., Shalmany, S.H., Meijer, G.C.M., Pertijs, M.A.P.: An energy-efficient 15-bit capacitive-
sensor interface based on period modulation. J. Solid-State Circuits. 47(7), 1703–1711 (2012)
7. Maghari, N., Temes, G.C., Moon, U.: Noise-shaped integrating quantisers in ΔΣ modulators.
Electron. Lett. 45(12), 612–613 (2009)
8. Hernández, L., Prefasi, E.: Multistage ADC based on integrating quantiser and gated ring
oscillator. Electron. Lett. 49(8), 526–527 (2013)
9. Cannillo, F., Prefasi, E., Hernández, L., Pun, E., Yazicioglu, F., Van Hoof, C.: 1.4V 13 μW
83dB DR CT-ΔΣ modulator with dual-slope quantizer and PWM DAC for biopotential signal
acquisition. In: Proceedings of ESSCIRC, Helsinki, pp. 267–270 (2011)
10. Oh, S., et al.: A dual-slope capacitance-to-digital converter integrated in an implantable
pressure-sensing system. J. Solid-State Circuits. 50(7), 1581–1591 (2015)
12.1 Introduction
consumption in the nano-watt regime. The area of the voltage reference also needs to
be very small to keep the size of the IoT device small. Further, the voltage reference
circuit should become operational at a low supply voltage. This is critical for ULP IoT
devices for the following reasons. The voltage at which the voltage reference
becomes operational is typically the system start-up voltage. The start-up voltage
determines the voltage at which all the power supplies are active, and its value
determines the lifetime of an IoT device: a lower start-up voltage means
a longer lifetime. In this paper, we present a charge-pump-based voltage reference
circuit which has a start-up voltage below 400 mV and a power consumption below
20 nW in simulation. We measure the operation of the bandgap reference from a 500
mV supply at a power consumption of 32 nW.
Temperature-compensated Zener diodes were used to generate reference
voltages in the very beginning, with the breakdown voltage of the Zener determining
the reference level. However, Zeners did not meet many requirements of a typical
reference: they consumed more power and were noisy, apart from being discrete
components. The bandgap voltage reference circuit meets all of the above requirements and
is the primary voltage reference in most of the ICs that are sold today. The
bandgap reference circuit produces a temperature-independent reference voltage
which is approximately equal to the silicon bandgap voltage. The temperature independence is achieved
by adding a voltage that is proportional to absolute temperature (PTAT) to a
voltage that is complementary to absolute temperature (CTAT). This concept was
first proposed by Widlar [2], providing a reference at 5 V. Other bandgap reference
circuits used a similar concept to provide voltage reference ICs producing 10
V [3] and 2.5 V [4].
While the architecture of the bandgap reference has undergone changes, the
concept has remained the same. Various voltage reference circuits have
been proposed as alternatives to the bandgap reference; some recent publications
in this area are [5–7]. These circuits are MOS-transistor
based and achieve good temperature stability. However, the bandgap reference provides
better performance in one or more critical parameters, so the alternative circuits have
found only limited use. On the other hand, the alternative circuits can
provide lower-voltage operation at ULP levels. There is therefore a need to achieve ultra-
low power consumption with the bandgap reference. In this paper, we present a
bandgap voltage reference circuit with a new architecture suitable for IoT devices.
The proposed circuit is designed to meet the power, area, and voltage requirements
of a ULP system.
12 Ultra-low Power Charge-Pump-Based Bandgap References 221
Figure 12.1 shows the concept of operation of the bandgap reference circuit. The
voltage VBE, which is obtained from a BJT in diode configuration, is complementary
to absolute temperature (CTAT) with a slope of −2.2 mV/°C. The thermal voltage,
Vt, on the other hand, is proportional to absolute temperature (PTAT) with a slope of
0.085 mV/°C. The PTAT voltage is scaled by a constant K and added to the CTAT
voltage. The addition cancels the first-order temperature variation and yields a
constant voltage which is close to the silicon bandgap. The value of K is carefully
chosen so that the temperature dependences of the CTAT and PTAT voltages cancel, and
VREF becomes a temperature-independent voltage reference. This circuit can
achieve a temperature stability on the order of 10 ppm/°C.
[Fig. 12.1: concept of the bandgap reference: the CTAT voltage VBE (−2.2 mV/°C) is added to the
PTAT voltage Vt = kT/q (0.085 mV/°C) scaled by K, giving VREF = VBE + K·Vt (about 10 ppm/°C)]
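The first-order cancellation can be checked numerically. The sketch below takes the two slopes quoted in the text and computes the scale factor K that makes them cancel; the 25 °C values of VBE and Vt (0.6 V and 25.7 mV) are assumed, typical numbers used only to produce an absolute voltage, and the model is purely linear, so it ignores the curvature that limits a real bandgap reference.

```python
CTAT_SLOPE = -2.2e-3     # dVBE/dT in V/degC (value quoted in the text)
PTAT_SLOPE = 0.085e-3    # dVt/dT in V/degC (value quoted in the text)

K = -CTAT_SLOPE / PTAT_SLOPE   # scale factor that cancels the two slopes


def v_ref(temp_c, vbe_25=0.6, vt_25=0.0257):
    """First-order VREF = VBE + K*Vt; vbe_25 and vt_25 are assumed values."""
    d_t = temp_c - 25.0
    v_be = vbe_25 + CTAT_SLOPE * d_t     # CTAT component
    v_t = vt_25 + PTAT_SLOPE * d_t       # PTAT component
    return v_be + K * v_t


print(f"K = {K:.2f}")
print(f"VREF(-40 C) = {v_ref(-40.0):.4f} V, VREF(125 C) = {v_ref(125.0):.4f} V")
```

With these slopes K is close to 26, and the linear model returns the same VREF at both temperature extremes, which is the cancellation the paragraph describes.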
Figure 12.2 shows the CTAT and PTAT voltage generation circuit used for the bandgap.
Bipolar junction transistors (BJTs) are used to generate the PTAT and
CTAT voltages. The CTAT voltage is generated simply by connecting the BJT
Q in a diode configuration, and it is given by the base-
emitter voltage, VBE, of the transistor. As temperature increases, the number of
carriers increases, which increases the conductivity of the transistor and therefore
decreases VBE.
The dependence of VBE on temperature is given by [8, p. 155],
222 S. Tewari and A. Shrivastava
$$V_{BE} = V_{G0}\left(1 - \frac{T}{T_0}\right) + V_{BE0}\,\frac{T}{T_0} - \frac{\gamma k T}{q}\ln\frac{T}{T_0} + \frac{kT}{q}\ln\frac{J_C}{J_{C0}} \qquad (12.1)$$
where $V_{G0}$ is the bandgap voltage of silicon, T is the temperature, $T_0$ is the room
temperature, $V_{BE0}$ is the VBE voltage at room temperature, $J_C$ and $J_{C0}$ are the current
densities, and γ, k, q, and α are physical constants. The equation shows
that the value of VBE decreases with temperature. The difference between the voltages
of two BJTs with different sizes but biased with the same current
provides the PTAT voltage, as shown in Fig. 12.2. The transistor Q2 is larger than Q1
by a multiplicity factor M. The difference between VBE1 and VBE2 is called ΔVBE,
which is obtained from Eq. (12.1) as
$$\Delta V_{BE} = V_{BE1} - V_{BE2} = \frac{kT}{q}\ln\frac{J_{C1}}{J_{C2}} = V_t \ln\frac{J_{C1}}{J_{C2}} \qquad (12.2)$$
The ΔVBE voltage increases linearly with temperature, making it a PTAT
voltage. Thus, Eqs. (12.1) and (12.2) provide the expressions of the CTAT and PTAT
voltages that are generated by the circuit shown in Fig. 12.2. The VREF voltage, which is
independent of temperature, can be obtained by adding the CTAT and PTAT voltages,
The value of K is chosen in such a way that it cancels the temperature variation.
The optimal value of K is given by
$$K = \frac{V_{G0} - V_{BE0} + (\gamma - \alpha)\,V_{t0}}{V_{t0}} \qquad (12.4)$$
The reference voltage VREF comes out very close to the silicon bandgap and
hence is known as the bandgap reference. The circuit implementation of the bandgap
reference is conventionally realized using BJTs and resistors in a feedback network
with an opamp [4]. The ratio K is typically obtained using resistors, although
capacitors are also used to realize the ratio [9].
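For two transistors carrying equal currents, the current-density ratio in Eq. (12.2) equals the emitter-area ratio M, so the PTAT voltage is simply Vt·ln(M). The sketch below evaluates this relation; the area ratio M = 8 is a hypothetical choice, not a value taken from the design.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def delta_vbe(temp_k, area_ratio):
    """PTAT voltage of Eq. (12.2) for equal currents: JC1/JC2 = area_ratio."""
    v_t = K_B * temp_k / Q_E             # thermal voltage kT/q
    return v_t * math.log(area_ratio)


dv_300 = delta_vbe(300.0, area_ratio=8)   # M = 8 is a hypothetical choice
print(f"dVBE(300 K) = {dv_300 * 1e3:.2f} mV")
```

At 300 K this gives roughly 54 mV, and the result scales linearly with absolute temperature, which is what makes ΔVBE a PTAT quantity. Because ln(M) grows slowly, a large scale factor K is needed to balance the CTAT slope of VBE.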
The bandgap reference circuit in [4], which uses resistors and an opamp and provides
the voltage given by Eq. (12.5), is an almost ideal on-chip voltage reference
that performs very well over voltage, temperature, and
process variations. However, the circuit has a couple of limitations when used in the ULP space. The
output voltage VREF is approximately 1.2 V, close to the silicon bandgap,
which requires the power supply to be higher than 1.2 V; the circuit cannot work
from a VDD below 1.2 V. Further, a reference voltage
of 1.2 V is higher than the power supply level of modern designs. Some
recent architectures that overcome these limitations were presented in [9–11].
In the next section, we present circuit architectures that are used to provide a
reference voltage at lower VDD.
[Figs. 12.2 and 12.3: circuit schematics (captions not recoverable): CTAT/PTAT generation with
diode-connected BJTs Q1 and Q2 (area ratio 1:M, voltages VBE1 and VBE2, bias current I), and the
low-voltage bandgap reference with an amplifier equalizing nets a and b, resistors R1, R2 (voltage VR),
and R3, and output VREF]
A bandgap reference architecture that can operate from a lower supply voltage
and provide a lower reference level was proposed in [10], overcoming the limitations
of a conventional bandgap. Figure 12.3 shows the architecture of the low-voltage
bandgap reference. It works in the following manner. Nets a and b are set at the
same voltage by the amplifier:
$$V_a = V_b = V_{BE} \tag{12.6}$$
$$I = \frac{V_b}{R_1} + \frac{V_b - V_{BE2}}{R_2} \tag{12.7}$$
While the minimum operating voltage of this circuit can be brought down to
750 mV, an even lower VMIN is needed for ultra-low power applications. Further, a
lower area solution is desired, which is not possible with this architecture while
also achieving lower power consumption. In the next section, we present a bandgap
voltage reference circuit that starts operating at 400 mV. Further, the power
consumption of the proposed circuit is 20 nW in simulation. The proposed circuit
is also smaller in area.
One of the key limitations of the bandgap reference circuit reported in [10] and
of other bandgap reference circuits is that they cannot operate below 750 mV because
the BJT diode needs to be biased with a current source. A charge-pump-based
bandgap reference [12] overcomes this limitation. The BJT diode in a
charge-pump-based bandgap reference is biased through the charge-pump
circuit. The use of a charge pump provides the following advantages to the bandgap
reference. It enables operation from a lower input voltage, which enables ultra-low
power consumption. Further, the use of a switched capacitor network also enables a
lower power solution at a lower area cost.
Figure 12.4 shows the use of voltage doubler charge-pump circuit for our
bandgap circuit. The output of the charge pump is connected to the transistor Q1.
In the absence of Q1, the output will go to 2VDD . However, connecting Q1 at the
output will restrict the voltage to VBE . The transistor Q1 is connected in a voltage
clamp configuration. It sinks the additional current from the voltage doubler circuit
and restricts the output at VBE . A key advantage of this circuit is that the voltage
VDD needed for generating VBE is smaller than VBE . The minimum voltage for the
bandgap to be operational is given by
$$V_{MIN} > \frac{V_{BE}}{2} \tag{12.11}$$
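The bound in Eq. (12.11) can be checked numerically with a minimal sketch. It assumes an ideal N-times charge pump whose unloaded output is N times VDD, so the clamp at VBE is reachable only when VDD exceeds VBE/N. The function name `min_supply` and the value of VBE are hypothetical and chosen only for illustration.

```python
def min_supply(v_be: float, stages: int = 2) -> float:
    """Lower bound on VDD for an ideal N-times charge pump clamped at v_be.

    Assumption: the unloaded pump output is stages * VDD, so the BJT clamp
    can only be reached when stages * VDD > v_be.
    """
    if stages < 1:
        raise ValueError("stages must be at least 1")
    return v_be / stages


if __name__ == "__main__":
    v_be = 0.7  # volts, illustrative value (not taken from the chapter)
    for n in (2, 3, 4):
        bound_mv = min_supply(v_be, n) * 1e3
        print(f"{n}x pump: VDD must exceed {bound_mv:.0f} mV")
```

With `stages=2` the function reduces to Eq. (12.11). Larger values of `stages` only illustrate the trend toward lower VMIN for higher-order pumps and ignore switch and parasitic losses.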
It is easy to see that if a voltage tripler or higher-order charge pumps are used,
then an even lower VMIN can be achieved using this technique. Figure 12.4 shows
the circuit used to generate VBE1. For generating VBE2, the same circuit is used with
transistor Q2, which is M times bigger than Q1. Figure 12.5 shows the simulation
result of VBE and ΔVBE generated from the proposed circuit: the CTAT behavior
of VBE1 and VBE2 and the PTAT behavior of ΔVBE. These
voltages are generated using a VDD of 0.4 V. Weighted versions of VBE1 and
ΔVBE are added to generate the bandgap voltage. We propose the generation of
bandgap voltage in the form of
where constants a and b are needed to generate the weights for VBE and ΔVBE
to generate VREF. These constants are obtained by employing switched capacitor
circuit techniques as opposed to the use of resistors in [10]. The use of resistors
increases the area of the circuit for low power systems. The power consumption
of a resistor-based bandgap depends on the value of the resistors used in the
design and can be brought down by using large resistors. For a
200 nW bandgap reference circuit, 14 MΩ resistors are needed, which can be very
large in area. A switched capacitor circuit, on the other hand, can generate these
constants with lower area. One disadvantage of the switched capacitor scheme is
that it increases the noise on the reference voltage, which can be reduced by using
bigger capacitors.
While the simulation results in Fig. 12.5 show CTAT and PTAT characteristics for
VBE and ΔVBE, we also study the underlying theory of the proposed circuit. Figure
12.4 shows the charge-pump biasing circuit. The transistor Q1 maintains the output
voltage VBE with a bias current that can be obtained from the charge transferred
during a switching cycle. The charge across the plates of capacitor Cf of the
2x charge-pump circuit in phase φ1 is given by
226 S. Tewari and A. Shrivastava
Fig. 12.5 Simulation result of VBE and ΔVBE generated from the proposed circuit (left: VBE1 and VBE2; right: ΔVBE; voltage in mV versus temperature in °C)
$$Q_{\phi 1} = C_f (2V_{DD} - V_{DD}) \tag{12.13}$$
However, the voltage gets clamped to VEB by the diode. The charge that remains
across the flying capacitor Cf is
$$Q_{\phi 2} = C_f (V_{EB} - V_{DD}) \tag{12.14}$$
$$\frac{\Delta Q}{T} = \frac{Q_{\phi 1} - Q_{\phi 2}}{T} \;\Rightarrow\; I_C = C_f f (2V_{DD} - V_{EB}) \tag{12.15}$$
where f (= 1/T) is the clock frequency.
Further, the current flowing through the BJT transistor is defined as,
$$I = I_S \exp\left(\frac{V_{EB}}{V_t}\right) \tag{12.16}$$
where C is a constant, n is the temperature dependency order, and VG0 is the silicon
bandgap voltage. Comparing currents in Eqs. (12.17) and (12.16), the VEB can be
written as
$$V_{EB} = V_{EB0}\frac{T}{T_0} + V_{G0}\left(1 - \frac{T}{T_0}\right) - nV_t \ln\frac{T}{T_0} + V_t \ln\frac{2V_{DD} - V_{EB}}{2V_{DD} - V_{EB0}} \tag{12.18}$$
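The bias point implied by Eqs. (12.15) and (12.16) has no closed form, because VEB appears both in the charge-pump current and in the exponential. A minimal numerical sketch is given below. It equates the two currents and solves for VEB by bisection. The values of Cf, IS, and Vt are assumptions chosen only for illustration, while the 30 kHz clock and the 0.4 V supply are the values quoted in this chapter.

```python
import math


def charge_pump_current(v_eb, v_dd, c_f, f_clk):
    """Average current delivered by the voltage doubler, Eq. (12.15)."""
    return c_f * f_clk * (2.0 * v_dd - v_eb)


def diode_current(v_eb, i_s, v_t):
    """Current through the BJT clamp, Eq. (12.16)."""
    return i_s * math.exp(v_eb / v_t)


def solve_v_eb(v_dd, c_f, f_clk, i_s, v_t, tol=1e-9):
    """Find V_EB where pump current equals diode current, by bisection.

    The pump current falls and the diode current rises with V_EB, so the
    difference is monotonic and has a single root in (0, 2*VDD).
    """
    lo, hi = 0.0, 2.0 * v_dd
    if charge_pump_current(lo, v_dd, c_f, f_clk) <= diode_current(lo, i_s, v_t):
        raise ValueError("pump cannot forward-bias the diode with these values")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if charge_pump_current(mid, v_dd, c_f, f_clk) > diode_current(mid, i_s, v_t):
            lo = mid  # diode still conducts less than the pump supplies
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Assumed illustration values: C_f = 1 pF, I_S = 1e-16 A, V_t = 25.85 mV.
    v_eb = solve_v_eb(v_dd=0.4, c_f=1e-12, f_clk=30e3, i_s=1e-16, v_t=0.02585)
    i_c = charge_pump_current(v_eb, 0.4, 1e-12, 30e3)
    print(f"V_EB = {v_eb * 1e3:.1f} mV, I_C = {i_c * 1e9:.2f} nA")
```

Bisection is used here only because the current balance is monotonic in VEB; any root finder would serve equally well.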
Figure 12.7 shows the circuit for generating constants for ΔVBE. Figure 12.7(i)
shows the circuit that generates ΔVBE. The capacitor Cb is connected between VBE1
and VBE2 generated from the charge-pump circuit shown in Fig. 12.7. Therefore, the
voltage across Cb is ΔVBE. For generating different bandgap reference voltages,
ΔVBE needs to be multiplied by different constants. In this circuit, we present ways
to generate three constants for ΔVBE, namely, one, two, and three. Figure 12.7
shows the circuits that can generate 2ΔVBE and 3ΔVBE. Figure 12.7(ii) shows the
circuit to generate 2ΔVBE. It uses the two nonoverlapping clock phases φ1 and φ2.
In phase φ2, the voltages VBE1 and VBE2 are connected across the capacitors Cb1
and Cb2. In phase φ1, the connections for the capacitors are rearranged, and the top
plate of Cb1 is connected to the bottom plate of Cb2 as shown in Fig. 12.7(ii). So, the
voltage appearing at the top plate of Cb2 is 2ΔVBE. This essentially is the voltage
doubler scheme. Similarly, a voltage tripler scheme is presented in Fig. 12.7(iii)
to generate 3ΔVBE. In design we can choose from ΔVBE, 2ΔVBE, and 3ΔVBE,
which can be used to generate different reference voltages.
Figure: bandgap architecture. Two charge pumps generate VBE1 (with Q1) and VBE2 (with Q2, M times larger); the ΔVBE circuit and the summing circuit produce VREF
Fig. 12.7 (i) ΔVBE circuit; (ii) 2*ΔVBE circuit, Vx = 2*ΔVBE; (iii) 3*ΔVBE circuit, Vx = 3*ΔVBE (each shown in phase φ2 and phase φ1)
Fig. 12.8 Circuit for generating VBE constant (shown in phase φ2 and phase φ1)
Figure 12.8 shows the circuit that is used to generate the rational number constant
for VBE. It also uses a switched capacitor circuit with nonoverlapping clock phases
φ1 and φ2. In phase φ2, capacitor C2 is connected to VBE, while C1 is connected
to ground. In phase φ1, the capacitors C1 and C2 are connected together as shown
in Fig. 12.8. The total charge on the capacitors remains the same. Therefore, VX is
given by
$$V_X = \frac{C_2}{C_1 + C_2} V_{BE} \tag{12.19}$$
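The charge-sharing step behind Eq. (12.19) can be reproduced with a short sketch. It assumes ideal capacitors with no parasitics: the sampling capacitor holds a charge of C2 times VBE, the other capacitor starts discharged, and connecting them conserves the total charge. The function name and the numerical values are hypothetical.

```python
def share_charge(v_in: float, c_sample: float, c_other: float) -> float:
    """Voltage after an ideal charge share between two capacitors.

    c_sample is first charged to v_in while c_other is discharged to ground.
    Connecting them in parallel keeps the total charge, as in Eq. (12.19).
    """
    total_charge = c_sample * v_in         # charge stored before sharing
    return total_charge / (c_sample + c_other)


if __name__ == "__main__":
    v_be = 0.6  # volts, illustrative value
    for c2, c1 in ((1e-12, 1e-12), (1e-12, 2e-12), (3e-12, 1e-12)):
        v_x = share_charge(v_be, c_sample=c2, c_other=c1)
        print(f"C2 = {c2 * 1e12:.0f} pF, C1 = {c1 * 1e12:.0f} pF -> "
              f"VX = {v_x * 1e3:.0f} mV")
```

Only the capacitor ratio matters for the resulting fraction, which is why the constant can be set by capacitor sizing rather than by large resistors.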
Figure 12.9 shows the summing circuit of the bandgap. It consists of the circuits
used for generating the constants for VBE and ΔVBE and uses a switched capacitor
scheme to generate the sum. It also uses the nonoverlapping clock phases φ1 and
φ2. Figure 12.9a shows the summing circuit with all the signals. In phase φ2, the
switches driven by φ2 are closed, and the circuit is configured as shown
in Fig. 12.9b. The capacitor Ca1 is discharged to ground, while the top plates of Ca2,
Cb1, Cb2, and Cb3 are connected to VBE1. The bottom plate of Ca2 is connected to
ground, while the bottom plates of Cb1, Cb2, and Cb3 are connected to VBE2. So the
voltage across Ca2 is VBE1, while the voltage across each of Cb1, Cb2, and Cb3 is ΔVBE.
In phase φ1, the switches are reconfigured and the circuit is arranged as shown
in Fig. 12.9c. First, the capacitors Ca1 and Ca2 are connected and share charge
to generate the VBE component of the bandgap. The capacitors Cb1, Cb2, and Cb3 are
then stacked on top of this voltage, as in the tripler scheme of Fig. 12.7(iii),
which adds 3ΔVBE. The reference voltage generated is given by
$$V_{REF} = \frac{C_{a2}}{C_{a1} + C_{a2}} V_{BE1} + 3\Delta V_{BE} \tag{12.20}$$
Fig. 12.9 Summing circuit of the bandgap: (b) configuration in phase φ2; (c) configuration in phase φ1
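As a worked illustration of Eq. (12.20), the sketch below evaluates the reference voltage for a given capacitor pair and also inverts the expression to find the capacitor ratio Ca1/Ca2 that hits a target output. The values of VBE1 and ΔVBE are assumptions picked to give round numbers, not design values from this chapter, and both function names are hypothetical.

```python
def v_ref(v_be1, dv_be, c_a1, c_a2, multiple=3):
    """Reference voltage of Eq. (12.20): weighted VBE1 plus multiple * dVBE."""
    return c_a2 / (c_a1 + c_a2) * v_be1 + multiple * dv_be


def cap_ratio_for_target(target, v_be1, dv_be, multiple=3):
    """Capacitor ratio Ca1/Ca2 that makes v_ref equal to target."""
    fraction = (target - multiple * dv_be) / v_be1  # required Ca2/(Ca1+Ca2)
    if not 0.0 < fraction <= 1.0:
        raise ValueError("target not reachable with a passive charge share")
    return 1.0 / fraction - 1.0


if __name__ == "__main__":
    # Assumed illustration values: VBE1 = 600 mV, dVBE = 100 mV.
    ratio = cap_ratio_for_target(target=0.5, v_be1=0.6, dv_be=0.1)
    out = v_ref(v_be1=0.6, dv_be=0.1, c_a1=ratio * 1e-12, c_a2=1e-12)
    print(f"Ca1/Ca2 = {ratio:.2f} gives VREF = {out * 1e3:.0f} mV")
```

The inversion makes explicit that, once the integer multiple of ΔVBE is chosen, a single capacitor ratio sets the output level.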
The bandgap reference uses switched capacitor circuits driven by two nonoverlapping
clock phases, so a clock source is needed for this circuit to work. The power
consumption of the clock source can be minimized by operating it at a very low
frequency. The clock frequency only needs to be high enough to maintain the bias
voltage of transistors Q1 and Q2, so a low-frequency, low-power clock source suits
the bandgap circuit. Further, the switches used in the bandgap need to pass VBE,
which is a voltage higher than VDD. Therefore, the clock phases φ1 and φ2 need to
swing from 0 to 2VDD. Figure 12.10 shows the block diagram of the clock generation
scheme of the bandgap circuit. A PTAT current source biases a current-controlled
ring oscillator. The ring oscillator produces a clock of 30 kHz at a VDD of 0.4 V
and consumes 2 nW of power. Further, a clock doubler circuit doubles the swing of
the output clock to drive the switches that pass the VBE voltage level. The details
of the circuit are covered in [12].
Fig. 12.10 Block diagram of the clock generation scheme: PTAT current source, oscillator, and clock doubler generating φ1 and φ2 for the bandgap reference
Fig. 12.11 Annotated chip with area breakdown. Labeled blocks: 2x charge pumps, BJTs (Q1 and Q2), PTAT current source (80 x 60 µm²), switched capacitor network (SCN, 60 x 60 µm²), clock oscillator, and clock doubler
The proposed bandgap circuit was implemented in a 130 nm CMOS process. Figure
12.11 shows the annotated chip. It has an area of 0.0264 mm². The capacitors are
implemented using nMOS capacitors and MIM capacitors. The load capacitors for
the VBE generation circuit and the fraction generation switched capacitor circuit
were implemented using nMOS capacitors, whereas the load capacitors for the
bandgap output and the ΔVBE tripler circuit were implemented using MIM
capacitors to avoid bottom plate parasitics. The total area of the proposed circuit is
much smaller than that of a conventional low power bandgap circuit because it does
not use big resistors. It consumes 19.2 nW at a VDD of 0.4 V in simulation. We were
able to measure the performance of the circuit only from a VDD of 0.5 V upward, as
measurement below 0.5 V was not possible due to limitations of the measurement circuit.
The circuit was verified for a temperature range of 0–80 °C. While this range is
quite large for the intended ULP applications, the performance of the circuit in this
range is crucial for comparing it with state-of-the-art circuits. Figure 12.12
shows the variation of the bandgap output for a temperature change of 0–80 °C.
The proposed bandgap circuit is designed to provide an output voltage of 500 mV,
and the output voltage achieves a temperature stability of 75 ppm/°C. The
performance of the bandgap circuit with temperature is in line with the reported
work. A better performance can be achieved at higher output power.
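To give a feel for what 75 ppm/°C means at a 500 mV output, the sketch below converts between an output voltage spread and a temperature coefficient. It assumes the common box method (total spread divided by nominal voltage and temperature span); the chapter does not state which definition was used for the measurement, and the function names are hypothetical.

```python
def tc_ppm_per_degc(v_max, v_min, v_nom, temp_span_degc):
    """Box-method temperature coefficient in ppm/degC (assumed definition)."""
    return (v_max - v_min) / (v_nom * temp_span_degc) * 1e6


def spread_for_tc(tc_ppm, v_nom, temp_span_degc):
    """Output spread in volts that corresponds to a given box-method TC."""
    return tc_ppm * 1e-6 * v_nom * temp_span_degc


if __name__ == "__main__":
    # 75 ppm/degC, 500 mV output, and the 0 to 80 degC range quoted in the text.
    spread = spread_for_tc(tc_ppm=75.0, v_nom=0.5, temp_span_degc=80.0)
    print(f"Allowed spread: {spread * 1e3:.1f} mV over 80 degC")
    print(f"Back-computed TC: "
          f"{tc_ppm_per_degc(0.5015, 0.4985, 0.5, 80.0):.1f} ppm/degC")
```

Under this definition, 75 ppm/°C over an 80 °C span corresponds to a total output spread of about 3 mV around 500 mV.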
Figure 12.13 shows the output of the bandgap circuit with respect to process and
mismatch variation and with input voltage variation. The circuit achieves a 3σ
variation of 2%. The variation of the output voltage can be reduced by trimming the
bandgap output using the capacitors in the switched capacitor circuits that generate
the constants for the bandgap. Figure 12.13 also shows the variation of the bandgap
with VDD. The output varies by 1% for a 1 V variation of VDD. The power supply
variation of the bandgap reference is quite high, which we address in the next
version of the design.
Fig. 12.12 Measurement of the bandgap reference with temperature variation [12]. Left: measured (untrimmed) BGR output voltage across 6 chips from 0 to 80 °C at 0.5 V Vin varies from 492 mV to 504 mV. Right: measured (trimmed) BGR output voltage across 6 chips from 0 to 80 °C at 0.5 V Vin with best stability of 75 ppm/°C
Fig. 12.13 Power supply and process variation of VREF voltage [12]. Left: measured (untrimmed) BGR output voltage across 6 chips from 0.5 to 1.5 V VDD at 20 °C shows variation of 2%/V. Right: simulation result of 500-point Monte Carlo at 0.5 V VDD and 20 °C (histogram of VREF, µ = 498 mV, σ = 3.3 mV) shows a 3σ process variation of 2% (untrimmed)
Fig. 12.15 Simulation result of the reference voltage with the variation of VDD (VREF in V versus VDD from 0.5 to 1.5 V; annotated variations of 1.09 mV and 1.8 mV)
Fig. 12.16 Overall variation of the bandgap reference circuit with temperature and power supply (VREF in V versus temperature from -20 to 100 °C for Vdd = 0.5, 0.55, 0.75, 1, 1.25, and 1.5 V)
Table 12.1 Comparison of the bandgap circuit with the state of the art

                           [14]        [15]        [10]      [11]      [9]       [12]      This work
Power consumption          300 nW      52.5 nW     1.85 µW   20 µW     170 nW    32 nW     48 nW
Area (mm²)                 0.055       0.0246      0.1       0.4       0.07      0.0264    NA
3σ variation (%)           21          4.8         5.8       1.5       3         2         2
Temp variation (ppm/°C)    7           114         119       11        40        75        45
PSRR (dB)                  45 @100Hz   56 @100Hz   57 @DC    86 @DC    93 @DC    42 @DC    60 @DC
Min voltage (V)            1.4         0.7         0.84      1.1       0.75      0.5       0.5
Type                       CS          CS          CS        CS        CS        CP        CP
Process                    350 nm      180 nm      400 nm    500 nm    130 nm    130 nm    130 nm
voltage achieves lower power consumption, lower area, and lower voltage operation
compared to the state-of-the-art bandgap references that are current source (CS)
based.
12.5 Conclusions
Table 12.1 compares this work with previously reported state-of-the-art low-power
bandgap circuits. Our work reports operation from a minimum input voltage of 0.5 V,
a 1.5× improvement over the lowest previously reported operating voltage in
[9]. Our power consumption is 32 and 48 nW, which is among the lowest reported
for a bandgap reference without duty cycling. The work in [9] achieves
a power of 170 nW by sampling the reference voltage on a capacitor and periodically
turning the reference on and off. Even lower power can be achieved in our work if
duty cycling is employed. The power supply variation is higher in the charge-pump
circuit of [12], which is improved using a new PSRR improvement circuit presented
in this paper. The lower power consumption, lower voltage operation, and lower area
of the charge-pump-based bandgap reference make it ideally suited for IoT devices.
References
1. Roy, A., Klinefelter, A., Yahya, F.B., Chen, X., Gonzalez-Guerrero, L.P., Lukas, C.J.,
Kamakshi, D.A., Boley, J., Craig, K., Faisal, M., Oh, S., Roberts, N.E., Shakhsheer, Y.,
Shrivastava, A., Vasudevan, D.P., Wentzloff, D.D., Calhoun, B.H.: A 6.45 µW self-powered SoC
with integrated energy-harvesting power management and ULP asymmetric radios for portable
biomedical systems. IEEE Trans. Biomed. Circuits Syst. 9(6), 862–874 (2015)
2. Widlar, R.: New developments in IC voltage regulators. In: 1970 IEEE International Solid-State
Circuits Conference. Digest of Technical Papers, vol. XIII, pp. 158–159, Feb 1970
3. Kuijk, K.E.: A precision reference voltage source. IEEE J. Solid-State Circuits 8(3), 222–226
(1973)
4. Brokaw, A.: A simple three-terminal IC bandgap reference. In: 1974 IEEE International Solid-
State Circuits Conference. Digest of Technical Papers, vol. XVII, pp. 188–189, Feb 1974
5. Kinget, P., Vezyrtzis, C., Chiang, E., Hung, B., Li, T.L.: Voltage references for ultra-low supply
voltages. In: 2008 IEEE Custom Integrated Circuits Conference, pp. 715–720, Sept 2008
7. Leung, K.N., Mok, P.K.T.: A CMOS voltage reference based on weighted ΔVGS for
CMOS low-dropout linear regulators. IEEE J. Solid-State Circuits 38(1), 146–150 (2003)
8. Allen, P.E., Holberg, D.R.: CMOS Analog Circuit Design. Oxford University Press, New York
(2002)
9. Ivanov, V., Brederlow, R., Gerber, J.: An ultra low power bandgap operational at supply from
0.75 V. IEEE J. Solid-State Circuits 47(7), 1515–1523 (2012)
10. Banba, H., Shiga, H., Umezawa, A., Miyaba, T., Tanzawa, T., Atsumi, S., Sakui, K.: A CMOS
bandgap reference circuit with sub-1-V operation. IEEE J. Solid-State Circuits 34(5), 670–674
(1999)
11. Sanborn, K., Ma, D., Ivanov, V.: A sub-1-V low-noise bandgap voltage reference. IEEE J.
Solid-State Circuits 42(11), 2466–2481 (2007)
12. Shrivastava, A., Craig, K., Roberts, N.E., Wentzloff, D.D., Calhoun, B.H.: 5.4 A 32 nW bandgap
reference voltage operational from 0.5 V supply for ultra-low power systems. In: 2015 IEEE
International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, pp. 1–3,
Feb 2015
13. Roberts, N.E., Craig, K., Shrivastava, A., Wooters, S.N., Shakhsheer, Y., Calhoun, B.H.,
Wentzloff, D.D.: 26.8 A 236 nW −56.5 dBm-sensitivity Bluetooth low-energy wakeup receiver
with energy harvesting in 65 nm CMOS. In: 2016 IEEE International Solid-State Circuits
Conference (ISSCC), pp. 450–451, Jan 2016
14. Ueno, K., Hirose, T., Asai, T., Amemiya, Y.: A 300 nW, 15 ppm/°C, 20 ppm/V CMOS
voltage reference circuit consisting of subthreshold MOSFETs. IEEE J. Solid-State Circuits 44(7),
2047–2054 (2009)
15. Osaki, Y., Hirose, T., Kuroki, N., Numa, M.: 1.2-V supply, 100-nW, 1.09-V bandgap and 0.7-V
supply, 52.5-nW, 0.55-V subbandgap reference circuits for nanowatt CMOS LSIs. IEEE J. Solid-
State Circuits 48(6), 1530–1538 (2013)
Part III
Sub-1V and Advanced-Node Analog
Circuit Design
The third part of this book is dedicated to recent developments in the field of
ultra-low-voltage (sub-1 V) design and/or design in advanced nodes. The first three
papers deal with technology innovations, the fourth one discusses future A/D
converter developments, and the fifth and sixth ones report two low-voltage design cases.
The first paper from Andreia Cathelin (STM) presents the planar 28 nm FD-SOI
Technology, which appears advantageous for analog/RF/millimeter-wave and high-
speed mixed-signal circuits, by taking full advantage of ultra-wide voltage range
body biasing tuning. Concrete design examples are given in order to highlight the
main FD-SOI design features.
As a comparison, the second paper from Alvin L.S. Loke (Qualcomm) deals with
analog/mixed-signal design in 16/14 nm FinFET technologies. The compact 3-D
FinFET structure offers superior short-channel control that achieves digital power
reduction and adequate device performance, while analog/mixed-signal designers
must adapt to new design constraints. The challenges and considerations faced when
porting analog/mixed-signal designs to FinFET are given.
The third paper from Lukas Dörrer (Intel) presents Intel Tri-Gate transistors
(FinFET) that further shrink MOSFET technologies and have been a disruptive
semiconductor innovation offering lower area, lower supply voltage, and lower
power consumption. This paper presents and compares measurements and designs
implemented in the 14 nm FinFET and in a planar 28 nm technology.
The fourth paper from Michael Flynn (University of Michigan) discusses
future developments of SAR ADCs, which have received a strong impulse in recent years.
A new charge injection-based SAR architecture shrinks the die area and achieves
outstanding energy efficiency. The SAR-assisted pipeline structure enables record
efficiency for >12 bit pipeline ADCs. The ring amplifier replaces the OTA in a
pipeline so that pipeline ADCs can achieve outstanding efficiency in advanced
nodes.
In the fifth paper, Rachit Mohan (imec) presents a low-voltage design case for
biomedical applications, where low-power and low-cost sensor readout devices are
demanded. The proposed solution uses time-domain operation to overcome the
trade-off between cost, power consumption, and accuracy or the dynamic range
capability.
In the sixth paper, Masoud Babaie (Delft University) proposes as a second design
case a 4.4 mW-TX, 3.6 mW-RX fully integrated Bluetooth Low-Energy Transceiver
for IoT Applications optimized for 28 nm CMOS.
Chapter 13
FDSOI Technology, Advantages for Analog/RF
and Mixed-Signal Designs
Andreia Cathelin
The race on the More Moore integration scale has brought several major limitations
for efficient process integration in CMOS planar solutions, starting from the 40 nm
technology node. It appeared that the transistor channel was more and more
difficult to control in terms of electrostatics, and a lot of process engineering
methods (e.g., silicon strain) were needed in order to provide transistors with
good carrier speed and decent electrical characteristics. Starting from the 28 nm
node, the obvious solution for transistors with increased electrical performance was
the use of fully depleted channel devices. Two integration paths have been chosen
in the semiconductor industry for these fully depleted devices: fully depleted silicon
on insulator CMOS (FDSOI) and FinFET CMOS devices. The fundamental carrier
semiconductor equations are similar; nevertheless, the process integration is very
different. This paper focuses on the planar FDSOI CMOS technology as
integrated by STMicroelectronics [1, 2] and its specific features for analog, RF, and
mmW integrations.
Figure 13.1 gives a generic cross section of an FDSOI CMOS device. We call
this technology Ultrathin Body and BOX (UTBB) FDSOI CMOS, as the active
device is integrated atop an ultrathin layer of buried oxide (BOX). The transistor
active conduction area (also called silicon film) is very thin as well. In the 28 nm
UTBB FDSOI technology from ST, the BOX thickness is 25 nm and the film layer
is 7 nm. This planar topology's direct implications are the following: thanks to the
A. Cathelin ()
STMicroelectronics, Crolles, France
e-mail: andreia.cathelin@st.com
SOI BOX layer, the transistor gets total dielectric isolation. No channel doping is
needed as, thanks to the thin silicon film, the channel is fully depleted. Also enabled
by this topology, no pocket implants are needed for the source and drain, which
naturally enhances the transistor's analog/RF behavior. Another implication of the
thin BOX layer is the fact that the front side transistor's electrostatics
can be controlled through the area underneath the BOX, also called the transistor body.
By applying a voltage on the transistor body (hence underneath the BOX), one can
change or modulate the threshold voltage of the main (front side) transistor. We can
also see this device as a planar dual-gate device: the front gate is the regular one
(like in bulk technology), and the second one comes from the body tie, with the
BOX as the backside gate oxide. As the thickness ratio of the back and front gate
oxides is about 10, we can state that the front side transistor's transconductance Gm is
ten times bigger than that of the backside gate.
In terms of manufacturing, the 28 nm FDSOI CMOS process from STMicroelectronics
shares most of its process steps with the equivalent 28 nm bulk technology
from ST. It is a modified bulk 28 nm high-K metal gate LP process using the
same back end of the line and the same gate module. Several process steps, specifically
channel implants, halo implants, and masking levels, are removed compared to the
traditional 28 nm bulk technology because of the undoped FDSOI channel. There
is less than 20% change with respect to a classical CMOS bulk flow; the two extra
specific steps are related to the hybridization and raised source/drain epitaxy. At this
node, more than 10% of the process steps and seven masks are saved, resulting in
an overall manufacturing process cost saving of 10% [3].
Fig. 13.3 Cross section of 28 nm UTBB FDSOI CMOS transistors. Top: low VT (LVT)
transistors; bottom: regular VT (RVT) transistors
Let's now focus on the low VT (LVT) devices in the upper part of Fig. 13.3.
These devices are also called flipped-well devices because, in order to obtain the lower
VT characteristics, process engineering has proposed a solution in which the NMOS
devices lie on an NWell body and the PMOS devices on a PWell body. In an
equivalent way, we can apply forward body biasing (FBB) on these devices, with
an equivalent body voltage variation from roughly 0 to 3 V (modulus).
The body factor for both families of devices (RVT and LVT) is much larger
in FDSOI than in an equivalent bulk node; here in 28 nm it is 85 mV/V. This,
together with the very wide body biasing range, results in an unprecedentedly
wide variation of the threshold voltage (VT) of around 250 mV, as depicted in
Fig. 13.4.
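The quoted VT range follows from simple arithmetic, sketched below. The sketch assumes that the threshold voltage shifts linearly with body bias at the 85 mV/V body factor over the whole 0 to 3 V range; the function name is hypothetical and the linear model is only a first-order approximation.

```python
BODY_FACTOR_MV_PER_V = 85.0  # body factor quoted for the 28 nm FDSOI node


def vt_shift_mv(body_bias_v, body_factor=BODY_FACTOR_MV_PER_V):
    """First-order threshold-voltage shift (mV) for a body bias magnitude (V).

    Assumption: the shift is linear in the body bias over the full range.
    """
    return body_factor * abs(body_bias_v)


if __name__ == "__main__":
    for vb in (0.0, 1.0, 2.0, 3.0):
        print(f"|Vbody| = {vb:.0f} V -> VT shift of about {vt_shift_mv(vb):.0f} mV")
```

At the 3 V end of the range the linear estimate gives 255 mV, consistent with the variation of around 250 mV stated above.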
When following the historical nanometer downscaling of the CMOS bulk road
map, analog designers had to get used to the fact that the analog behavior
of the transistors was getting worse and worse with each downscaled
technology node. This was inherent to the planar CMOS bulk transistor topology,
in the race for faster and faster digital behavior and/or low power. Some foundries,
like ST, had solved that in the 65 nm node by introducing a specific analog transistor
called HPA (high-performance analog), which overcame this problem by
eliminating the transistor pockets. In the 28 nm FDSOI planar technology, thanks to
the thin film structure, we do not need transistor pockets, and hence we can recover
nice native analog behavior.
FDSOI hence brings several advantages to analog designers in terms of efficient
short channel devices, improved analog performance, and lower noise variability.
In the following figures, we compare the
28 nm FDSOI technology with its equivalent 28 nm LP bulk technology, both from
STMicroelectronics.
Figure 13.5 shows major improvements of FDSOI over the bulk solution regarding
analog gain and the VT matching parameter. For example, in 28 nm FDSOI, an LVT
NMOS device of size 1 µm/100 nm can show an excellent analog performance, with a
DC gain of 80 and a sigma (VT) of only 6 mV.
Figure 13.6 shows that FDSOI provides higher Gm for a given current, with
respect to the equivalent 28 nm LP bulk node. This, combined with the lower
parasitic capacitances coming inherently from the SOI insulation, makes it possible
to achieve higher operation bandwidths for a given current consumption or, if
working at constant bandwidth, lower power consumption.
The variability in planar FDSOI technologies is improved with respect to an
equivalent LP bulk node, thanks to the simpler manufacturing process steps. This
also helps a lot for the noise behavior, as can be seen in Fig. 13.7. As an
example, for an LVT NMOS of size 1 µm/120 nm biased at a 1 µA drain current,
we get 1.5 dB lower 1/f noise in FDSOI than in bulk. This is a typical value of the
main branch current for an LNA at low GHz frequencies.
Fig. 13.5 Analog gain (Gm/Gds) and matching (AVT) for NMOS LVT devices in 28 nm FDSOI
technology (red) and comparison with 28 nm LP bulk (blue). Top: DC gain (Gm/Gds, linear)
versus gate length (m); bottom: AVT (mV.µm) versus gate length (m), curves for W = 1 µm
Thanks to the deep submicron lithography, this technology node provides very fast
transistors. The intrinsic devices (FEOL plus Metal1 contact) in the LVT flavor, for
example, here NMOS, show fT and fmax above 300 GHz (Fig. 13.8).
We can hence distinguish two types of dimensioning and biasing strategies,
depending on the operation frequency.
For RF operation frequencies below 10 GHz, we can work with
a transistor length of 100 nm. Performances such as a maximum available gain
MAG = 12 dB and NFmin ≈ 0.5 dB can be obtained for a current density of 125 µA/µm
(here W = 1 µm).
Fig. 13.6 Improved analog performance (Gm/Id and total gate capacitance Cgg) for NMOS LVT
devices in 28 nm FDSOI technology (red) and comparison with 28 nm LP bulk (blue). Top: Gm/Id
(1/V) versus gate length (m); bottom: Cgg (fF/µm) versus gate length (m)
Going higher in frequency requires working with the minimum transistor
length of 30 nm. Here, for example, for a 60 GHz operation frequency, a
MAG = 12 dB and NFmin ≈ 1.3 dB can be obtained when working at a current
density of 200 µA/µm. This value is 33% lower than in 28LP bulk.
Deep submicron CMOS has the drawback of a very low and dense back end
of line, with a large number of metal layers. This can be seen as a limiting point
for mmW design; nevertheless, the eight metal layers of the discussed technology
make it possible to obtain very decent values for the integrated passive devices. This is
enabled by the operation in a low parasitics environment that comes with SOI
technologies. Several examples can be cited here: an inductor of L = 0.5 nH with a
Q factor of 18 at 10 GHz, a varactor of C = 50 fF with a Q factor of 20 at 20 GHz,
and a 50 Ohm transmission line with 0.8 dB/mm losses at 60 GHz.
Fig. 13.7 Noise behavior for NMOS LVT devices in 28 nm FDSOI technology (red) and
comparison with 28 nm LP bulk (blue), plotted versus Idrain/W (mA/mm)
Fig. 13.8 High-frequency behavior (fT and fmax) of LVT NMOS 0.5 µm/30 nm in 28 nm FDSOI
CMOS
Fig. 13.9 VT process corners (slow, typical, fast) for LVT NMOS devices, comparison between
28 nm FDSOI CMOS and 28 nm LP bulk; Vth (mV) versus gate length (m)
Fig. 13.11 28 nm FDSOI LVT CMOS 1 µm/30 nm transistor measured Gm for different drain
currents, Vbody varies from 0 to 2 V, Vds = 1.1 V (plot title: gm vs. id, W=1e-06 L=3e-08
MODEL=lvtnfet)
The semiconductor physics of FDSOI predicts that the main transistor design
parameters (such as Gm or fT) do not depend on the body biasing variation for
operation at constant current.
The following two figures illustrate this aspect by providing measurement curves
for different devices (Figs. 13.11 and 13.12).
Fig. 13.12 Measured gm (in S, left) and fT (in Hz, right) versus drain current id (in A). Plot titles: gm vs. id (vds=1 WFING=1e-06 L=3e-08 MODEL=lvtnfet_rf); Ft vs. id (vds=1 WFING=1e-06 L=3e-08 MODEL=lvtnfet_rf)
This section presents a typical block used in most wireless communication ICs
and highlights the main benefits that such a design can take from a 28 nm FDSOI
integration. We hence discuss in this section the integration of analog filters with
bandwidths of several hundred MHz. In the real life of a SoC, they suffer from process,
voltage, temperature (PVT), and aging variations, which affect system operation
and directly influence overall application behavior. All such blocks need to be tuned
or trimmed inside the SoC, in order to strictly and independently control several of
their parameters: cutoff frequency, linearity, and noise, all for an optimal power
consumption.
For more than 10 years now, in deep submicron CMOS processes, it has been very
practical and straightforward to implement inverter-based analog functions: they
yield simple and compact solutions, which scale nicely with technology nodes.
The example here discusses an analog low-pass Gm-C filter with cutoff
frequencies in the range of hundreds of MHz. The typical implementation of such a
topology is realized with fixed capacitors; the filter parameters are then varied
by tuning the different filter Gms (see Fig. 13.13). Each filter transconductor Gm
is composed of several inverters operated in the gain region (analog operation
with the biasing point in the middle of the out/in transfer function), as, for example,
presented in [5].
In traditional bulk CMOS technologies, the only way to tune such inverter-based
topologies is to use the local Vdd of the analog-operated inverters as the single
tuning knob. For example, a dedicated LDO with a controlled output voltage is
placed between the global system Vdd and the local Vdd of the block to generate
this tuning value. This adds power consumption and reduces the voltage headroom
(hence the linearity) of the block to be controlled. In general, with only one tuning
knob for such blocks and several parameters to tune (cutoff frequency, linearity, etc.),
250 A. Cathelin
Fig. 13.13 Typical tuning methods in bulk and FDSOI CMOS integration for analog Gm-C filters
the designers have to take a very large power consumption margin in order to
satisfy the system-level specifications under all operating conditions.
In FDSOI CMOS technologies, as presented in the previous section, two new and
independent tuning knobs are available in the system: the local bodies of the
NMOS and PMOS transistors, respectively. Thanks to the very wide tuning voltage
range of these individual body-bias knobs, the threshold voltages of the respective
transistors can be varied independently, which in turn varies the system-level
parameters over a very wide range. Moreover, this also permits independent tuning
of the different system-level parameters (e.g., cutoff frequency and linearity).
Another consideration concerns the parasitic influence of a tuning control loop
on the main signal path of a system. In a classical bulk CMOS implementation, the
loop control signal is either on the signal path or has a direct parasitic influence
on it. In these FDSOI tuning systems, the tuning control signals are on the body
ties, which are isolated from the main signal path by a buried oxide layer; hence,
any parasitic signals on these control lines have much less influence on the main
signal path operation.
All of this is illustrated in a simple, intuitive way in the following paragraphs.
In a bulk CMOS implementation, the inverter transconductance variation (hence
the global filter transconductor variation) is obtained by varying the local Vdd, as
depicted in Fig. 13.14. In this example, the capacitor values are fixed, meaning that
the filter cutoff frequency varies in direct proportion to the global transconductance.
Unfortunately, this transconductance variation also implies a variation of the
linearity; hence, a larger design margin has to be taken in order to cover these
two variations.
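The proportionality between transconductance and cutoff frequency can be made concrete with a minimal sketch. It assumes a single first-order Gm-C section, for which f_c = Gm / (2 pi C); the third-order filter of the chapter has its own pole positions, and the capacitor and Gm values below are hypothetical.

```python
import math

def gm_c_cutoff_hz(gm_siemens, c_farads):
    """Cutoff frequency of a first-order Gm-C section: f_c = Gm / (2*pi*C)."""
    return gm_siemens / (2.0 * math.pi * c_farads)

C_FIXED = 1e-12   # hypothetical fixed capacitor [F]

# Hypothetical transconductance settings [S]; C stays fixed, only Gm is tuned.
for gm in (1e-3, 2e-3, 3e-3):
    f_c = gm_c_cutoff_hz(gm, C_FIXED)
    print(f"Gm = {gm * 1e3:.1f} mS -> f_c = {f_c / 1e6:.0f} MHz")
```

With the capacitor fixed, doubling Gm doubles the cutoff frequency, which is why any knob that moves Gm also moves the filter bandwidth.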
Fig. 13.14 Bulk CMOS operation: inverter-based design tuning through local Vdd variation
Fig. 13.15 FDSOI CMOS operation: inverter-based design tuning through independent Vbody_N
and respective Vbody_P variation
In Fig. 13.15, we show a typical FDSOI CMOS implementation of this filter
tuning strategy. The inverters constituting the Gm-C filter all share the global
(system) Vdd, which is in no way used for tuning. The two independent tuning
knobs are the body ties of the NMOS and PMOS transistors, respectively. Without
body biasing, any variation of the global Vdd induces a variation in the
transconductance and the linearity of the inverters, and hence of the filter.
Thanks to these new independent tuning knobs, a combination of the two tuning
voltages can always be found that gives the desired values for both
transconductance and linearity. The very wide tuning voltage range available
on these body ties (see Sect. 13.2.1) makes it possible to lower the design margins
needed in such a design, and hence the typical power consumption.
This FDSOI tuning concept has been implemented in a third-order Gm-C low-pass
filter, as presented in [6]. The measured performance and the comparison
with the state of the art are given in Fig. 13.16.
This RF low-pass Gm-C filter using CMOS inverters has been successfully tuned
through the back gate instead of the supply, with no signal path interference. This
supply-regulator-free operation is energy efficient and also capable of low-voltage
operation (down to VDD = 0.7 V).
There are two main categories of analog filters, each with advantages and
drawbacks, considered here for an equivalent input-referred noise level: the
active-RC filters generally show excellent linearity, at the cost of large power
Fig. 13.16 28 nm FDSOI Gm-C low-pass filter, chip photomicrograph, measured performances,
and comparison with the state of the art
consumption and limited high-frequency operation; the Gm-C filters show
excellent frequency behavior and straightforward implementation from passive LC
prototypes but have traditionally had limited linearity. This FDSOI Gm-C
implementation nevertheless shows competitive linearity. Compared to a similar
circuit in 65 nm bulk [7], at the same noise level, we get twice the linearity for a
power level divided by four. Compared to best-in-class filters (active-RC) [8],
at the same noise level and cutoff frequency, we get competitive linearity for a
power level divided by 14.
This 28 nm FDSOI CMOS solution, implemented in ST's technology, exhibits a
best-in-class trade-off between noise, linearity, and power, thanks to the excellent
intrinsic analog/RF features of the process.
Fig. 13.17 28 nm Classical and FDSOI revisited Doherty power amplification topology
implies also passive power adaptation structures at the input and output, which
inherently bring passive losses, hence natural lower efficiency. Such structures are
typically implemented in classical CMOS integration.
In a specific FDSOI CMOS integration, we have decided to revisit this classical
Doherty PA architecture, as depicted in Fig. 13.17 [9]. Two power amplifiers of
different classes are again connected in parallel, but here they are realized as two
differential pairs. Paralleling such structures is straightforward and requires no
lossy passive elements; hence, the starting point for energy efficiency is already
improved. Each PA cell (i.e., differential pair) is biased in a different class (here
class AB and class C), and each individual biasing point is changed by gradually
varying the body voltage of each differential pair. We hence get the ability to
gradually change the overall class of the PA (a mix of class AB and class C), thanks
to the wide range of the forward body biasing voltage. Moreover, we get a new
equivalent class of PA, which at any instant is composed of x% class AB operation
and y% class C. At the two extremes of operation, we get either Class-C at zero
body bias or Class-A at maximum forward bias.
The compression of the Class-AB transistors is compensated by the gain
expansion of the Class-C transistors, increasing the PA compression point without
DC current penalty. The Class-AB devices are sized to carry the RMS power
of the modulated signal and determine the average power consumption, while
Class-C devices pass the peaks. Static body bias and dynamic modulated signals
induce drain-gate nonlinear capacitance variations, tracked by MOS neutralization
Fig. 13.18 28 nm FDSOI 10 ML CMOS implementation of the WiGig 60 GHz Power Amplifier
(schematic labels: RFout, NRPC, TRF3, TRF4, CL)
capacitors across process, temperature, bias, and signal conditions. This robustness
makes it safe to use the full flexibility of the neutralized power cell to trade power
gain and consumption for linearity, between high gain mode (all forward bias,
Class-A) and high linearity mode (dual body bias, optimized Class-AB/Class-C
combination).
This new design topology, uniquely enabled by the wide-range body biasing
capabilities of FDSOI, makes it possible to optimize power efficiency and linearity
at the same time in such power amplification cells.
The total PA consists of three stages of transformer-based power amplification
cells, with each active cell as depicted before. This PA has been implemented in a
10 ML process option of ST's 28 nm FDSOI CMOS technology; see Fig. 13.18. The
VLSI-like dense and low BEOL still permits very competitive passive designs,
thanks to the SOI substrate.
At the output, a parallel-series eight-way power combiner TRF1 sums four
differential reconfigurable power cells and performs differential to single-ended
conversion. While the distributed active transformer (DAT) transforms the 50 Ω
output to four 7 Ω input ports with a coupling factor close to 0.87, the access lines
are used to create eight ports presenting the optimal large-signal load impedance
for the power devices. This compact topology achieves an insertion loss of only
1 dB at 60 GHz. The 1:2 differential TRF2 and 1:4 TRF4 power splitters are based
on a DAT structure and implement wideband matching networks. The two secondary
windings are placed orthogonally to reduce parasitic magnetic coupling. The 1:2
transformer TRF3 performs impedance matching between the two driver stages.
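As a quick arithmetic check on what a 1 dB insertion loss means for delivered power, the sketch below converts a loss in dB into the fraction of power transmitted. Only the standard decibel definition is assumed; the function name is hypothetical.

```python
def power_fraction_from_loss_db(loss_db):
    """Fraction of input power that survives an insertion loss given in dB."""
    return 10.0 ** (-loss_db / 10.0)

fraction = power_fraction_from_loss_db(1.0)
print(f"1 dB insertion loss passes about {fraction * 100:.0f}% of the power")
```

A 1 dB combiner therefore passes roughly 79% of the power produced by the cells, which is why output-network loss weighs so heavily on PA efficiency.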
Fig. 13.19 28 nm FDSOI 10 ML CMOS WiGig PA, chip photomicrograph, measured perfor-
mance, and comparison with the state of the art
Figure 13.19 gives the measured performance of this mmW PA for three extreme
operation cases, knowing that an unlimited number of operating conditions can be
reached when the two body bias voltages are varied from 0 to 2 V (FBB conditions).
This power amplifier is fully WiGig compliant in terms of linearity and frequency
operation range (all four bands of the standard). It introduces a new PA
architecture in which the power cells can be reconfigured continuously. This
continuous operation class tuning is enabled by the very wide body bias
voltage range.
Moreover, in the high gain mode, it exhibits the highest ITRS FOM, improving
on the previous state of the art by a factor of 10. In the high linearity
mode, it breaks the linearity/consumption trade-off. It also permits low-voltage,
high-efficiency operation (Vdd_min = 0.8 V).
Before concluding, this section gives an overview of the body-biasing-enabled
tuning and trimming methods identified as of today.
By taking advantage of the uniquely wide body biasing (BB) voltage range
available in FDSOI technologies, the state of the art proposes several unique
techniques that bring undisputed chip energy savings and revisit system
performance.
The first method consists of generating and making available on chip a body bias
voltage that varies over time and over process, voltage, and temperature (PVT)
variations.
This permits to:
This paper has presented a short overview of planar UTBB FDSOI technologies and
their application for analog, RF, mmW, and mixed-signal designs.
In summary, here are the major arguments for using such technologies in
analog/RF designs:
Make extensive use of body biasing techniques, which turn the transistor VT
into a tuning knob with an unprecedentedly wide tuning range.
Take advantage of the very good analog performance, which permits designs with
lower power consumption that can safely operate at L > Lmin for design
margin.
For RF to mmW design, on top of the previously mentioned aspects, we should take
into consideration the deep-submicron technology features of the active devices
(excellent fT, fmax). The back end of line in an FDSOI environment makes it
possible to obtain high-performance passive devices, despite the very dense VLSI
constraints.
For mixed-signal and high-speed designs, the key parameters are the
improved variability, the remarkable performance of CMOS switches, and the
reduced parasitic capacitances.
UTBB FDSOI technologies also open a new era of innovative
energy-efficient circuits and systems. They find excellent application in IoT
implementations. Ultralow-power SoCs benefit from efficient ultralow-voltage
digital performance [16]; full mixed-signal integration is then enabled by
efficient analog, RF [11], and mmW operation. Finally, FDSOI implementations
exhibit power and performance flexibility, thanks to the new tuning knobs brought
by very wide voltage range body biasing.
References
1. Planes, N., Weber, O., Barral, V., Haendler, S., Noblet, D., Croain, D., Bocat, M., Sassoulas, P.,
Federspiel, X., Cros, A., Bajolet, A., Richard, E., Dumont, B., Perreau, P., Petit, D., Golanski,
D., Fenouillet-Beranger, C., Guillot, N., Rafik, M., Huard, V., Puget, S., Montagner, X., Jaud,
M.-A., Rozeau, O., Saxod, O., Wacquant, F., Monsieur, F., Barge, D., Pinzelli, L., Mellier,
M., Boeuf, F., Arnaud, F., Haond, M.: 28 nm FD-SOI technology platform for high-speed
low-voltage digital applications. In: Proceedings of Symposium VLSI Technology (VLSIT),
pp. 133–134 (2012)
2. Arnaud, F., Planes, N., Weber, O., Barral, V., Haendler, S., Flatresse, P., Nyer, F.: Switching
energy efficiency optimization for advanced CPU thanks to UTBB technology. In: IEEE
International Electron Devices Meeting (IEDM) Dig., pp. 3.2.1–3.2.4 (2012)
3. Jacquet, D., Hasbani, F., Flatresse, P., Wilson, R., Arnaud, F., Cesana, G., Di Gilio, T., Lecocq,
C., Roy, T., Chhabra, A., Grover, C., Minez, O., Uginet, J., Durieu, G., Adobati, C., Casalotto,
D., Nyer, F., Menut, P., Cathelin, A., Vongsavady, I., Magarshack, P.: A 3 GHz Dual Core
Processor ARM Cortex™-A9 in 28 nm UTBB FD-SOI CMOS With Ultra-Wide Voltage
Range and Energy Efficiency Optimization. IEEE J. Solid-State Circuits. 49(4), (2014)
4. Kumar, A., Debnath, C., Narayan Singh, P., Bhatia, V., Chaudhary, S., Jain, V., Le Tual, S.,
Malik, R.: A 0.065 mm2 19.8 mW single channel calibration-free 12b 600 MS/s ADC in 28 nm
UTBB FDSOI using FBB. In: ESSCIRC Conference 2016: 42nd European Solid-State Circuits
Conference, pp. 165–168 (2016)
5. Nauta, B.: A CMOS transconductance-C filter technique for very high frequencies. IEEE J.
Solid-State Circuits. 27(2), 142–153 (1992)
6. Lechevallier, J., Struiksma, R., Sherry, H., Cathelin, A., Klumperink, E., Nauta, B.: A forward-
body-bias tuned 450 MHz Gm-C 3rd-order low-pass filter in 28 nm UTBB FD-SOI with
>1 dBVp IIP3 over a 0.7-to-1 V supply. In: 2015 IEEE International Solid-State Circuits
Conference (ISSCC) Digest of Technical Papers, pp. 1–3 (2015)
7. Houfaf, F., et al.: A 65 nm CMOS 1-to-10 GHz Tunable Continuous-Time Lowpass Filter for
High-Data-Rate Communications. In: IEEE ISSCC Digest of Technical Papers, pp. 362–364
(2012)
8. Kwon, K., et al.: A 50–300-MHz highly linear and low-noise CMOS gm-C filter adopting
multiple gated transistors for digital TV tuner ICs. IEEE Trans. Microwave Theory Techn.
57(2), 306–313 (2009)
9. Larie, A., Kerhervé, E., Martineau, B., Vogt, L., Belot, L.: A 60 GHz 28 nm UTBB FD-SOI
CMOS reconfigurable power amplifier with 21% PAE, 18.2 dBm P1dB and 74 mW PDC.
In: 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical
Papers, pp. 1–3 (2015)
10. Danilovic, D., Milovanovic, V., Cathelin, A., Vladimirescu, A., Nikolic, B.: Low-power
inductorless RF receiver front-end with IIP2 calibration through body bias control in 28
nm UTBB FDSOI. In: 2016 IEEE Radio Frequency Integrated Circuits Symposium (RFIC),
pp. 87–90 (2016)
11. de Streel, G., Stas, F., Gurné, T., Durant, F., Frenkel, C., Cathelin, A., Bol, D.: Sleep Talker:
A ULV 802.15.4a IR-UWB Transmitter SoC in 28-nm FDSOI Achieving 14 pJ/b at 27 Mb/s
With Channel Selection Based on Adaptive FBB and Digitally Programmable Pulse Shaping.
IEEE J. Solid-State Circuits. PP(99), 1–15 (2017)
12. Sourikopoulos, I., Frappé, A., Cathelin, A., Clavier, L., Kaiser, A.: A digital delay line with
coarse/fine tuning through gate/body biasing in 28 nm FDSOI. In: ESSCIRC Conference 2016:
42nd European Solid-State Circuits Conference, pp. 145–148 (2016)
13. Fanori, L., Mahmoud, A., Mattsson, T., Caputa, P., Rämö, S., Andreani, P.: A 2.8-to-5.8 GHz
harmonic VCO in a 28 nm UTBB FD-SOI CMOS process. In: 2015 IEEE Radio Frequency
Integrated Circuits Symposium (RFIC), pp. 195–198 (2015)
14. Lahiri, A., Gupta, N.: A 0.0175 mm2 600 µW 32 kHz input 307 MHz output PLL with 190
psrms jitter in 28 nm FD-SOI. In: ESSCIRC Conference 2016: 42nd European Solid-State
Circuits Conference, pp. 339–342 (2016)
15. Le Tual, S., Narayan Singh, P., Curis, C., Dautriche, P.: A 20 GHz-BW 6b 10 GS/s 32
mW time-interleaved SAR ADC with Master T&H in 28 nm UTBB FDSOI technology. In:
2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC),
pp. 382–383 (2014)
16. Zimmer, B., Lee, Y., Puggelli, A., Kwak, J., Jevtic, R., Keller, B., Bailey, S., Blagojevic,
M., Chiu, P.-F., Le, H.-P., Chen, P.-H., Sutardja, N., Avizienis, R., Waterman, A., Richards,
B., Flatresse, P., Alon, E., Asanovic, K., Nikolic, B.: A RISC-V vector processor with
simultaneous-switching switched-capacitor DC–DC converters in 28 nm FDSOI. IEEE J.
Solid-State Circuits. 51(4), 930–942 (2016)
Chapter 14
Analog/Mixed-Signal Design in FinFET
Technologies
Alvin L. S. Loke, Esin Terzioglu, Albert A. Kumar, Tin Tin Wee, Kern Rim,
Da Yang, Bo Yu, Lixin Ge, Li Sun, Jonathan L. Holland, Chulkyu Lee,
Deqiang Song, Sam Yang, John Zhu, Jihong Choi, Hasnain Lakdawala,
Zhiqin Chen, Wilson J. Chen, Sreeker Dundigal, Stephen R. Knol,
Chiew-Guan Tan, Stanley S. C. Song, Hai Dang, Patrick G. Drennan,
Jun Yuan, P. R. Chidambaram, Reza Jalilizeinali, Steven J. Dillen,
Xiaohua Kong, and Burton M. Leary
14.1 Introduction
The mobile system-on-chip (SoC) has emerged as the principal economic driver
to extend CMOS scaling. Since the finFET debuted in manufacturing at the 22-
nm node [1], most advanced fabless designs have bypassed planar 20 nm in favor
of foundry finFET offerings introduced at 16 and 14 nm [2]. With its superior
short-channel control, the fully depleted tri-gate finFET enables a smaller device
footprint with dynamic and leakage power savings and competitive performance.
Qualcomm's first finFET product (Snapdragon 820) was built in 14-nm CMOS
and has been in high-volume production since 2015. Its next-generation mobile
processor (Snapdragon 835) recently became the world's first product built in
10-nm CMOS [3, 4]. Continuing at this aggressive pace, 7-nm finFET products are
expected to be commercialized by late 2018.
Analog/mixed-signal (AMS) subsystems are essential in SoCs. Functions
such as clocking, I/O connectivity, and core voltage/frequency scaling require a
smorgasbord of AMS components including phase-locked loops (PLLs), wireline
transceivers, data converters, regulators, thermal sensors, and bandgap references.
We address the general considerations faced as we port AMS designs to a finFET
node. We also summarize the key scaling innovations preceding the finFET to
demonstrate how 16-/14-nm design is complicated by far more than just the new
device structure. These are process-induced mechanical strain, high-permittivity
gate dielectric and metal gate (HKMG), multiple and spacer-based patterning, and
a substantially more complex middle-end-of-line (MEOL). As with previous SoC
nodes, AMS designers continue to adapt to process technology prioritized to the
scaling needs of logic and SRAM, which dominate the die area and thus dictate the
wafer cost.
A.L.S. Loke (corresponding author), E. Terzioglu, A.A. Kumar, T.T. Wee, K. Rim, D. Yang, B. Yu, L. Ge,
L. Sun, J.L. Holland, C. Lee, D. Song, S. Yang, J. Zhu, J. Choi, H. Lakdawala, Z. Chen,
W.J. Chen, S. Dundigal, S.R. Knol, C.-G. Tan, S.S.C. Song, H. Dang, P.G. Drennan, J. Yuan,
P.R. Chidambaram, R. Jalilizeinali, S.J. Dillen, X. Kong, B.M. Leary
Qualcomm Technologies, Inc., 5775 Morehouse Drive, San Diego, CA 92121-1714, USA
e-mail: alvin.loke@ieee.org
With channel lengths approaching 20 nm, short-channel effects in the planar bulk
silicon MOSFET have become ever so severe. At a given supply voltage (VDD),
building a smaller device with threshold voltage (VT) low enough for good on-state
drive current while preserving a low off-state leakage remains the perennial
challenge, constrained by subthreshold operation. Here, the MOSFET behaves as
a BJT whose base voltage, analogously the source-side surface potential (φs), is
induced from coupling to the gate (Cox), body (CB), and drain (CD) as depicted in
Fig. 14.1 [5]. Leakage is governed by the subthreshold swing SS (change in VGS
required for a decade change in subthreshold current) and degraded by DIBL (VT
reduction per VDS of 1 V). Good gate control of the silicon surface below a channel
length of 1 µm has largely been maintained through gate dielectric thinning for
higher Cox but also through channel engineering. Here, higher well doping levels
and retrograded well doping profiles, localized halo implants under the gate edge,
and shallow source/drain extensions helped to preserve gate control of the surface
depletion by suppressing the impact of the source/drain junction depletion regions.
φs can then closely track the gate voltage to modulate the source-to-channel
potential barrier and device subthreshold diffusion current. Unfortunately, φs
coupling to the body and drain has been increasing with channel optimization.
The resulting tug-of-war
[Fig. 14.1 diagram labels: φs couples to the gate through Cox (gate control via VGS, what we want), to the body through CB (body effect via VBS, depletion region edge), and to the drain through CD (DIBL via VDS)]
A fully depleted (FD) MOSFET enables VDD and VT reduction for uncompromised
lower-power operation. Conceptually, when the well doping profile of a bulk planar
MOSFET is extremely retrograded [7], the undoped body surface becomes fully
depleted as it is devoid of ionized dopant fixed charge. This makes the device
resemble a parallel plate capacitor with source and drain attached to the plate
dielectric as shown in Fig. 14.2. In subthreshold, gate charge is balanced by opposite
polarity charge in the heavily doped well region to establish a vertical electric field
through the undoped body. This field subsequently induces energy band bending
beneath the gate dielectric. At higher VGS , more band bending will eventually
create a conductive surface channel to support the on-state drift current. In fact,
the channel would be formed in a similar fashion should the body be uniformly
doped, a simplifying condition typically assumed in introductory device textbooks
[8]. So, fundamentally, a distribution of body dopants at or near the silicon surface
is not essential for field-effect action in a MOS system.
Because electric fields from the gate and drain cannot terminate in the charge-free
fully depleted region but must instead terminate in the heavily doped region, φs
is weakly coupled to the body and drain, i.e., CB and CD are substantially smaller
than Cox. The resulting stronger gate control reduces both SS and DIBL [9, 10]. VT
and VDD can therefore be lowered for a given performance (IDsat or Ieff [11]) and
leakage (Ioff) target to reduce the VDD²-dependent dynamic switching power. See
Fig. 14.3. As an example, the 22-nm tri-gate finFET in [1] demonstrates an SS of
70 mV/decade and DIBL of 50 mV/V.
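The link between the coupling capacitances and the two figures of merit can be sketched numerically. The sketch assumes the first-order capacitive-divider picture described in the text, in which the surface potential is set by Cox, CB, and CD; this gives SS = ln(10) (kT/q) (1 + (CB + CD)/Cox) and DIBL of roughly CD/Cox. The capacitance ratios below are hypothetical values chosen only to land near the quoted 70 mV/decade and 50 mV/V; they are not extracted from [1].

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q_E = 1.602176634e-19   # elementary charge [C]

def subthreshold_swing_mv_per_dec(cb_over_cox, cd_over_cox, temp_k=300.0):
    """SS from a capacitive divider: ln(10) * kT/q * (1 + (CB + CD) / Cox)."""
    v_thermal = K_B * temp_k / Q_E
    return 1e3 * math.log(10.0) * v_thermal * (1.0 + cb_over_cox + cd_over_cox)

def dibl_mv_per_v(cd_over_cox):
    """First-order DIBL: gate shift needed to offset 1 V on the drain, CD / Cox."""
    return 1e3 * cd_over_cox

# Ideal limit (no body or drain coupling) and a hypothetical fully depleted device.
print(f"ideal SS  = {subthreshold_swing_mv_per_dec(0.0, 0.0):.1f} mV/decade")
print(f"example SS = {subthreshold_swing_mv_per_dec(0.126, 0.05):.1f} mV/decade")
print(f"example DIBL = {dibl_mv_per_v(0.05):.0f} mV/V")
```

The closer CB/Cox and CD/Cox are to zero, the closer SS gets to its room-temperature limit of about 60 mV/decade and the smaller the DIBL.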
The fully depleted structure of Fig. 14.2 can also be realized in a silicon-on-
insulator (SOI) substrate [12, 13]. Here, the undoped body is simply replaced by
Fig. 14.4 Electric fields in (a) partially depleted and (b) fully depleted MOSFET
thin undoped SOI and buried oxide layers. Gate charge is balanced by mirror charge
in the substrate beneath the buried oxide. Unlike partially-depleted SOI, FD-SOI is
not prone to VT hysteresis because there is no quasi-neutral floating body in the SOI
layer. At the 22-nm node, FD-SOI can achieve SS of 80–85 mV/decade and DIBL
of 85–90 mV/V as demonstrated in [13]. FD-SOI does require a more expensive
wafer substrate but circumvents the bulk substrate challenge of growing an undoped
surface epitaxially at high temperature while suppressing dopant outdiffusion from
the heavily doped well region [14].
With the surface ideally undoped, the fully depleted structure enjoys better
channel mobility given no ionized impurity scattering. In addition, VT variation
due to random dopant fluctuation (RDF) [15] is eliminated [16]. In a conventional
partially depleted MOSFET (Fig. 14.4a), dopants in the body vary not only
in quantity but also in location due to the stochastic nature of ion implantation.
Consequently, there will be variation in the lengths of the electric field lines from the
gate to the ionized dopants which integrates to variation in surface band bending and
VT . However, in a fully depleted MOSFET (Fig. 14.4b), the lengths of the electric
field lines exhibit less variation to realize less VT variation.
The sensitivity to electric field profile does, however, make fully depleted
structures especially sensitive to variations in geometry. For instance, VT depends
on variation in the undoped body thickness in FD bulk, the SOI and buried oxide
thicknesses in FD-SOI, and the width, height, and shape of a bulk finFET.
A fully depleted device can be realized as a two- or three-dimensional structure
built in bulk or SOI [1, 13, 14, 17]. The bulk tri-gate finFET [1] of Fig. 14.5 has by
far become the dominant flavor in high-volume production.
Fig. 14.6 Layout-dependent effects due to stress of surrounding isolation and devices
Fig. 14.7 Mobility variation in (a) NMOS and (b) PMOS resulting from die attachment to
package [26] (scales: (a) 0% to -10%, (b) 0% to +10%)
Fig. 14.8 HKMG integration variants: (a) HK-first (bottom only) and (b) HK-last (bottom + sides)
HK gate dielectric
M metals may experience some VT shift. MBE can be mitigated by increasing the
device separation and even eliminated by unsharing the gate, both at the expense of
layout bloat.
Fig. 14.10 Lithography innovations: (a) cut mask [20], (b) pitch splitting [34], and (c) spacer-based
patterning [1] (labels: Mask A, Mask B, mandrel, sacrificial spacer)
Tighter CPP has resulted in a complex and far costlier MEOL to contact the devices
in 16/14 nm [26, 36]. See Fig. 14.11a. This contrasts with the 28-nm MEOL which
consists of only a single mask module to form both diffusion and gate contacts. The
MEOL is typically the most problematic module to yield. As such, despite a higher
RG penalty especially as the gate traverses over the fins (Fig. 14.11b) [37], starting
at the 22 nm node, self-aligned source/drain contacts (SACs) with full dielectric
encapsulation of the gate are built to eliminate overlay-related contact-to-gate shorts
[1]. See Fig. 14.12. Furthermore, diffusion contacts and gate contacts are formed
independently to overcome process difficulties associated with a single module for
both sets of contacts. Finally, an additional via level is inserted to bridge these
contacts to Metal-1. The combination of finer geometries and additional contact
interfaces (whose quality always dominates the overall contact resistance) has
added substantially more resistance in the MEOL [26].
Fig. 14.11 (a) Complex MEOL to contact finFET [26], (b) corresponding high RG
(labels: Metal-1, via, gate contact, diffusion contact, gate spacer, fins, well; MG very resistive in gate trench over fin)
Fig. 14.12 (a) Self-aligned contact and (b) its resilience to misalignment [1]
(labels: nitride cap, contact; plot of dies passing vs. misalignment for SAC and non-SAC)
Porting a design to finFET requires some design re-optimization to address the new
device structure and aforementioned technology constraints. To little surprise, these
process complexities have spawned many more restrictive layout design rules which
have significantly increased the design closure effort.
14.3.1 General
14.3.1.1 Channel Width Quantization
Fin dimensions (height Hfin , width Wfin , and pitch) are generally uniform throughout
the die to reduce lithography and etch load effects, i.e., for tight process control,
forcing the tri-gate channel width Weff to be multiples of 2Hfin + Wfin. This quantity
is an estimate for design convenience because in reality, current density actually
varies along the electrical width of the device, peaking at the top of the fin where the
onset of inversion occurs and decreasing along the sides of the fin. This phenomenon
is expected given that the top of the fin is most weakly coupled to the device body
node (undepleted portion of the fin not wrapped by the gate). It is important to
realize that Hfin is not the physical height of the entire fin but only the amount of fin
protrusion above the STI oxide.
Weff quantization is a challenge for logic and SRAM design and has, for
instance, led to the growing use of SRAM assist techniques [38] where variable
gate and bitline voltages are used to modulate the strength of bit cell devices
with finer resolution. However, its impact on analog design is minimal as the
gate transconductance, gm , under typical biasing conditions is granular enough
(10–100 µA/V per fin) for sufficient design flexibility.
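Width quantization can be sketched in a few lines: the designer picks an integer fin count, and the realized width is that count times 2Hfin + Wfin. The fin dimensions and the target width below are hypothetical, chosen only to show the rounding; real values come from the foundry design manual.

```python
import math

def w_eff_per_fin(h_fin, w_fin):
    """Estimated electrical width of one tri-gate fin: two sidewalls plus the top."""
    return 2.0 * h_fin + w_fin

def fins_for_target_width(target_w, h_fin, w_fin):
    """Smallest fin count whose total Weff meets target_w, and the realized Weff."""
    per_fin = w_eff_per_fin(h_fin, w_fin)
    n_fins = max(1, math.ceil(target_w / per_fin))
    return n_fins, n_fins * per_fin

H_FIN_NM = 42.0   # hypothetical fin height above STI [nm]
W_FIN_NM = 8.0    # hypothetical fin width [nm]

n, realized = fins_for_target_width(500.0, H_FIN_NM, W_FIN_NM)
print(f"target 500 nm -> {n} fins, realized Weff = {realized:.0f} nm")
```

With these assumed dimensions each fin contributes 92 nm, so a 500 nm target rounds up to 6 fins and 552 nm; the designer cannot hit the target exactly.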
Compared to planar CMOS, device density is the clear winner in a finFET node with
Weff being 1.5–3 times the width projected onto the wafer plane. Given the superior
gate control in a fully depleted structure, the body effect in a finFET is practically
nonexistent (ΔVT < 10 mV for |VBS| = VDD) [39]. Hence, there is no stack penalty
such as in the NMOS pulldown network of a NAND gate. Furthermore, reduced
DIBL means more effective drive current (Ieff) for better digital performance [9] and
more ideal analog behavior with 3× better intrinsic gain [40]. Defined as the average
of drain current (1) at VGS = VDD and VDS = VDD/2 and (2) at VGS = VDD/2 and
VDS = VDD, Ieff is an increasingly preferred metric for logic performance because it
is a more accurate average than IDsat of the switching current in an inverter. Without
RDF, VT variation is also reduced by about 30% [40], much to the benefit of the
SRAM minimum supply voltage Vmin. These benefits may reopen opportunities for
precision analog approaches that became unfeasible in recent planar CMOS nodes.
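The Ieff definition above translates directly into code. In the sketch below, `i_eff` implements the two-point average as stated in the text, while `toy_drain_current` is a hypothetical alpha-power-law model with a DIBL term, included only so the example runs; its parameters are not taken from any real process.

```python
def i_eff(drain_current, vdd):
    """Average of Id at (VGS=VDD, VDS=VDD/2) and at (VGS=VDD/2, VDS=VDD)."""
    i_high = drain_current(vdd, vdd / 2.0)
    i_low = drain_current(vdd / 2.0, vdd)
    return 0.5 * (i_high + i_low)

def toy_drain_current(vgs, vds, vt_lin=0.35, dibl=0.05, k=1e-3, alpha=1.3):
    """Hypothetical saturation-current model: Id = k * (VGS - VT)^alpha,
    with VT lowered by dibl volts per volt of VDS. Illustration only."""
    vt = vt_lin - dibl * vds
    v_ov = max(vgs - vt, 0.0)
    return k * v_ov ** alpha

VDD = 0.8  # hypothetical supply voltage [V]
print(f"IDsat = {toy_drain_current(VDD, VDD) * 1e6:.0f} uA")
print(f"Ieff  = {i_eff(toy_drain_current, VDD) * 1e6:.0f} uA")
```

Because both sampling points sit below the (VDD, VDD) corner, Ieff comes out lower than IDsat, reflecting the current actually available during an inverter transition.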
The compact finFET structure comes at the cost of more parasitics. Source/drain
resistance is dramatically higher as currents must funnel from wide trench contacts
into the narrow fins. The extrinsic source/drain resistance is particularly high given
Fig. 14.13 (a) PNP BJT and (b) STI ESD diode in finFET technology
limited silicidation. CGS and CGD are also higher due to gate coupling to the trench
diffusion contacts and to the epitaxial source/drain fill between fins. As Wfin is
a small fraction of the fin pitch, junction capacitance to the wells is reduced,
but the vertical well resistance is much higher. This enforces stricter well-tie and
latch-up layout rules and increases series resistance in analog and ESD diodes
(see Fig. 14.13). Higher RG especially in short-channel gates also exacerbates non-
quasistatic behavior, making the design of circuits such as RF low-noise amplifiers
especially challenging. Finally, self-heating is worse given the higher area density
of device currents, elevating both device and metallization reliability concerns.
14.3.2 Analog/Mixed-Signal
14.3.2.1 Transistor Parasitics
AMS designs, for the most part, can be ported to 16/14 nm with expected node-to-
node adjustments. However, some designs are increasingly difficult due to growing
parasitics and accumulating layout constraints from continued scaling.
Techniques to reduce resistance are becoming vitally important even at the
expense of increased capacitance. For example, the double-source layout of Fig.
14.14 is now a common technique to combat high contact resistance and resulting
supply droop especially in circuits such as I/O transmitters and clock drivers that
need to drive large loads.
Worse parasitic capacitance is also troublesome. For example, in Fig. 14.15a,
higher CGD (Miller) coupling in a low-dropout (LDO) regulator with a PMOS
pass element translates to worse high-frequency supply-noise rejection. In another
example (Fig. 14.15b), higher CGS can inject substantial Vref kickback noise in
say an LPDDR receiver commonly implemented as a PMOS differential amplifier
with one input tied to the Vref threshold. Such non-idealities can be mitigated by
incorporating more capacitive clamping which obviously costs more area as well
as longer circuit start-up time. Slower circuit wake-up is increasingly undesirable,
Fig. 14.15 (a) LDO regulator with PMOS pass element and (b) LPDDR receiver
especially in mobile ICs, as many AMS subsystems support on-demand burst modes
for power saving. As designers become more familiar with finFETs though, clever
solutions such as anti-kickback circuits [41] are emerging.
The stacked FET of Fig. 14.16 has become ubiquitous for building current sources
as Lmax continues to shrink. The desired higher rout is realized through resistive
source degeneration of the top device in the stack operating in saturation. Area bloat
is incurred by the intermediate diffusions, and rout into the gigahertz range is also
worse because the capacitance of these diffusions effectively shorts the intermediate
nodes to ground at high frequency, degrading analog metrics such as intrinsic gain
and common-mode noise rejection.
$$\Delta V_{BE,N} = \frac{nkT}{q}\,\ln N + (N-1)\,I_0 R_D \qquad (14.1)$$
$$\Delta V_{BE,M} = \frac{nkT}{q}\,\ln M + (M-1)\,I_0 R_D \qquad (14.2)$$
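As a hedged illustration of how Eqs. (14.1) and (14.2) can be used together, the sketch below eliminates the series-resistance term I0·RD from two ΔVBE measurements taken at current ratios N and M and solves for temperature. It assumes the ΔVBE values are differences between bias currents N·I0 (or M·I0) and I0, and that the ideality factor n is known. The function names and the numerical values in the example are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def delta_vbe(ratio, temp_k, i0, r_d, n=1.0):
    """Right-hand side of Eq. (14.1)/(14.2) for a given current ratio."""
    return n * K_B * temp_k / Q_E * math.log(ratio) + (ratio - 1) * i0 * r_d


def temperature_without_rd(dv_n, dv_m, n_ratio, m_ratio, n=1.0):
    """Combine two measurements so that the I0*RD term cancels.

    (M-1)*dV_N - (N-1)*dV_M = (n*k*T/q) * [(M-1)*ln N - (N-1)*ln M]
    Requires N != M so that the bracketed factor is nonzero.
    """
    num = (m_ratio - 1) * dv_n - (n_ratio - 1) * dv_m
    den = (m_ratio - 1) * math.log(n_ratio) - (n_ratio - 1) * math.log(m_ratio)
    return Q_E * num / (n * K_B * den)


if __name__ == "__main__":
    # Hypothetical operating point: 350 K, 10 uA unit current, 200 ohm series resistance.
    dv4 = delta_vbe(4, 350.0, 10e-6, 200.0)
    dv16 = delta_vbe(16, 350.0, 10e-6, 200.0)
    print(temperature_without_rd(dv4, dv16, 4, 16))
```

The recovered temperature is independent of RD in this idealized model. Any mismatch in the current ratios or in n would reappear as a temperature error.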
14.3.2.4 Varactor
Fig. 14.20 Resistor options: (a) MEOL resistor and (b) gate resistor
narrower voltage window of useful tuning. In fact, for this very reason, quarter-gap
as opposed to band-edge gate work function materials are required in fully depleted
devices to keep VT from being too low [45]. Lastly, as CGS and CGD are
higher in a finFET-based varactor, tuning range will inevitably be compromised.
In a finFET node, the precision resistor of choice for AMS applications is the thin-
film MEOL resistor shown in Fig. 14.20a. It replaced the polysilicon resistor in
the transition to HK-last integration. The MEOL resistor is composed of a thin
refractory metal compound that is deposited and subtractively etched prior to the
metallization module. It is specifically built for analog/mixed-signal usage, so its
integration is decoupled from the finFET. The MEOL resistor is necessarily thin
to obtain a small material grain structure that is electrically dominated by surface
and grain boundary (as opposed to bulk) scattering in order to realize a lower
temperature coefficient of resistance. Unlike the polysilicon resistor which has well-
defined, low-resistance silicided resistor ends, the MEOL resistor is more prone
to current spreading near its contacts. The finFET gate of Fig. 14.20b can also
be leveraged as a resistor. Unfortunately, it is not well controlled and is plagued
by many sources of variation including RMG CMP and the SAC gate recess etch.
Moreover, its width is limited by the transistor Lmax .
Linear capacitors are still implemented using metal-oxide-metal (MOM) fingers
in the interconnect stack. The area density of MOM capacitors is one of only a
handful of scaling consequences that benefit analog design. Designers have to be
cautious though of technology-imposed modeling subtleties. For example, corner
models of the lowest metal layers may account for double-patterning misalignment
in a nonphysical way shown in Fig. 14.21 where Mask B has been misaligned to the
left relative to Mask A.
Planar metal-insulator-metal (MIM) capacitors built with a high-permittivity
dielectric like hafnium oxide are sometimes available. As they require additional
processing, MIM capacitors are usually only justified in more expensive ICs such
as high-performance servers [1] that can command a premium profit margin.
The migration to finFET impacts inductor design minimally as the highest thick
layers of interconnect metal used for forming planar spirals generally do not scale so
as to preserve low enough resistance to mitigate voltage droop in supply distribution.
Subtle degradation of inductor Q will occur as interconnect scaling at lower metal
layers requires more dummy fill surrounding the inductor for tighter CMP pattern
density control to minimize accumulation of topography.
Although VDD scaling has helped to reduce core power tremendously in finFET
nodes, the supply voltage for sub-gigahertz general-purpose I/Os (GPIOs) remains
mostly at 1.8 V which imposes several technology and design challenges. FinFETs
with thicker gate dielectrics are increasingly difficult to build because a tighter fin
pitch requires more aggressive ALD MG fill capability for the thicker oxide [46].
Also, the larger separation between core and I/O voltages leads to more complex
voltage level shifter designs. Historically, the GPIO voltage has scaled from 5.0 V
to 3.3 V and eventually 1.8 V. System ecosystem consensus is needed to lower
the GPIO supply voltage yet again, but little cost motivation exists for designs
staying in cheaper legacy nodes. However, high-performance memory interfaces
like LPDDR4X, which require a higher DRAM die supply, are migrating to lower
and more SoC-friendly signaling voltages to enable the SoC I/O supply to scale [41].
Supply voltages have scaled to the point where traditional VT -based analog design
principles can at best be loosely applied, finFET nodes being no exception. Given
limited voltage headroom, transistors operating in saturation have been biased with
gate overdrive (VGS − VT) and saturation margin (VDS − VDSAT) as low as 50 mV
going as far back as the 45-nm node. These voltage levels are basically drowned
in the VGS required to transition from subthreshold to weak inversion, making
impossible a clear demarcation between off and on regions of operation. This
ambiguity has spawned a host of practical though less convenient design metrics
such as current efficiency (gm/ID) [47], inversion coefficient [48], and rout-based
saturation margin [49] for optimum analog biasing.
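As a small illustration of the current-efficiency metric, gm/ID can be estimated directly from a swept ID versus VGS curve, since gm/ID = d(ln ID)/dVGS. The sketch below uses a central difference on sampled data. The function name and the purely exponential test curve are assumptions for illustration, not data from this chapter.

```python
import math


def gm_over_id(vgs, i_d):
    """Central-difference estimate of gm/ID = d(ln ID)/dVGS at interior sweep points."""
    result = []
    for k in range(1, len(vgs) - 1):
        dlog = math.log(i_d[k + 1]) - math.log(i_d[k - 1])
        result.append(dlog / (vgs[k + 1] - vgs[k - 1]))
    return result


if __name__ == "__main__":
    # Hypothetical subthreshold-like curve: ID = 1 nA * exp(VGS / (n * UT)).
    n_slope, u_t = 1.2, 0.02585
    vgs = [0.01 * k for k in range(21)]
    i_d = [1e-9 * math.exp(v / (n_slope * u_t)) for v in vgs]
    print(gm_over_id(vgs, i_d)[0], "1/V; ideal value:", 1.0 / (n_slope * u_t))
```

For an exponential characteristic the estimate equals 1/(n·UT) at every point, which is the upper bound on gm/ID. In a real sweep the value falls as the device moves toward strong inversion.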
VT is a cumbersome quantity to define. Based on the inversion condition
φs = −φb where φb is the bulk potential, the traditional VT definition is not only
electrically immeasurable but more fundamentally inapplicable to a fully depleted
device where φb vanishes if the body is undoped. The BSIM-CMG φs-based model
defines VT as VGS at which the superthreshold drift current matches the subthreshold
diffusion current as the channel forms [50]. Although this definition is theoretically
unique, this condition also cannot be measured. As a result, foundries typically
measure and report the constant-current VT [51], defined as VGS corresponding to
the threshold current.
$$I_T = I_0\,\frac{W_{eff}}{L_{eff}} \qquad (14.4)$$
14.4 Conclusion
With SoC finFET technologies already in production for several years, AMS designs
have clearly migrated to finFET nodes without showstoppers. AMS designers
are pressed to understand process technology even more than ever in order to
anticipate its impact on design. HKMG, MEOL, and finFET parasitics as well
as their layout-related effects and constraints have already increased AMS design
effort significantly. As we march toward the remaining few CMOS nodes, this
design landscape will stay on course as logic and SRAM needs continue to dictate
technology priorities.
Acknowledgments We wish to extend sincere thanks to Yanxiang Liu, Michael Brunolli, Ray
Stephany, Andy Wei, Bich-Yen Nguyen, Jung-Suk Goo, Shawn Searles, Dennis Fischette, Larry
Bair, Jia Feng, Simon Wong, and Marcel Pelgrom for valuable discussions. The first author is
especially indebted to Tin Tin, Theo, and Josephene for their love, support, and encouragement.
References
1. Auth, C., et al.: A 22nm high performance and low-power CMOS technology featuring fully-
depleted tri-gate transistors, self-aligned contacts and high density MIM capacitors. In: IEEE
Symposium on VLSI Technology Technical Digest, Honolulu, HI, June 2012, pp. 131–132
2. Wu, S.-Y., et al.: A 16nm finFET CMOS technology for mobile SoC and computing
applications. In: IEEE International Electron Devices Meeting Technical Digest, Washington,
DC, Dec. 2013, pp. 224–227
3. Cho, H.-J., et al.: Si finFET based 10nm technology with multi Vt gate stack for low power and
high performance applications. In: IEEE Symposium on VLSI Technology Technical Digest,
Honolulu, HI, June 2016, pp. 12–13
4. Yang, S., et al.: 10 nm high performance mobile SoC design and technology co-developed for
performance, power, and area scaling. In: IEEE Symposium on VLSI Technology Technical
Digest, Kyoto, Japan, June 2017, pp. 70–71
5. King Liu, T.-J.: Bulk CMOS scaling to the end of the roadmap. In: IEEE Symposium on VLSI
Circuits, Short Course, Honolulu, HI, June 2012
6. Packan, P., et al.: High performance 32nm logic technology featuring 2nd generation high-k
+ metal gate transistors. In: IEEE International Electron Devices Meeting Technical Digest,
Baltimore, MD, Dec. 2009, pp. 659–662
7. Yan, R.-H., Ourmazd, A., Lee, K.F.: Scaling the Si MOSFET: From bulk to SOI to bulk. IEEE
Transactions on Electron Devices. 39(7), 1704–1710 (July 1992)
8. Muller, R.S., Kamins, T.I.: Device electronics for integrated circuits, 2nd edn. Wiley, New York
(1996)
9. Wei, L., Boeuf, F., Antoniadis, D., Skotnicki, T., Wong, H.-S.P.: Exploration of device design
space to meet circuit speed targeting 22nm and beyond. In: Proceedings of the International
Conference on Solid State Devices and Materials, Miyagi, Japan, Sep. 2009, pp. 808–809
10. Skotnicki, T.: CMOS technologies trends, scaling and issues. In: IEEE International Electron
Devices Meeting, Short Course, San Francisco, CA, Dec. 2010
11. Na, M.H., Nowak, E.J., Haensch, W., Cai, J.: The effective drive current in CMOS inverters.
In: IEEE International Electron Devices Meeting Technical Digest, San Francisco, CA, Dec.
2002, pp. 121–124
12. Colinge, J.-P.: Silicon-On-Insulator Technology: Materials to VLSI. Kluwer Academic
Publishers, Norwell (1991)
13. Cheng, K., et al.: Fully depleted extremely thin SOI technology fabricated by a novel integration
scheme featuring implant-free, zero-silicon-loss, and faceted raised source/drain. In: IEEE
Symposium on VLSI Technology Technical Digest, Kyoto, Japan, June 2009, pp. 212–213
14. Fujita, K., et al.: Advanced channel engineering achieving aggressive reduction of VT variation
for ultra-low power applications. In: IEEE International Electron Devices Meeting Technical
Digest, Dec. 2011, pp. 749–752
15. Stolk, P.A., Widdershoven, F.P., Klaassen, D.B.M.: Modeling statistical dopant fluctuations in
MOS transistors. IEEE Transactions on Electron Devices. 45(9), 1960–1971 (Sep. 1998)
16. Asenov, A.: Suppression of random dopant-induced threshold voltage fluctuations in
sub-0.1-μm MOSFETs with epitaxial and doped channels. IEEE Transactions on Electron Devices.
46(8), 1718–1724 (Aug. 1999)
17. Seo, K.-I., et al.: A 10nm platform technology for low power and high performance application
featuring FINFET devices with multi work function gate stack on bulk and SOI. In: IEEE
Symposium on VLSI Technology Technical Digest, Honolulu, HI, June 2014
18. Chan, V., Rim, K., Ieong, M., Yang, S., Malik, R., Teh, Y.W., Yang, M., Ouyang, Q.: Strain
for CMOS performance improvement. In: Proceedings of the IEEE Custom Integrated Circuits
Conference, San Jose, CA, Sep. 2005, pp. 667–674
19. Liu, Y., et al.: NFET effective work function improvement via stress memorization technique
in replacement metal gate technology. In: IEEE Symposium on VLSI Technology Technical
Digest, Kyoto, Japan, June 2013, pp. 198–199
20. Auth, C., et al.: 45nm high-k + metal-gate strain-enhanced transistors. In: IEEE Symposium
on VLSI Technology Technical Digest, Honolulu, HI, June 2008, pp. 128–129
21. Faricelli, J.: Layout-dependent proximity effects in deep nanoscale CMOS. In: Proceedings of
the IEEE Custom Integrated Circuits Conference, San Jose, CA, Sep. 2010
22. Garcia Bardon, M., et al.: Layout-induced stress effects in 14nm & 10nm finFETs and their
impact on performance. In: IEEE Symposium on VLSI Technology Technical Digest, Kyoto,
Japan, June 2013, pp. 114–115
23. Lee, C., et al.: Layout-induced stress effects on the performance and variation of finFETs.
In: IEEE International Conference on Simulation of Semiconductor Processes and Devices,
Washington, DC, Sep. 2015, pp. 369–372
24. Sato, F., et al.: Process and local layout effect interaction on a high performance planar 20nm
CMOS. In: IEEE Symposium on VLSI Technology Technical Digest, Kyoto, Japan, June 2013,
pp. 116–117
25. Xi, X., et al.: BSIM4.3.0 MOSFET Model User's Manual. Regents University California,
Berkeley (2003)
26. Terzioglu, E.: Design and technology co-optimization for mobile SoCs. In: International
Conference on IC Design & Technology, Leuven, Belgium, June 2015
27. McPherson, J.: Reliability trends with advanced CMOS scaling and the implications on design.
In: Proceedings of the IEEE Custom Integrated Circuits Conference, Sep. 2007, pp. 405–412
28. Wong, P.: Beyond the conventional transistor. IBM Journal of Research and Development.
23(46), 133–168 (Mar. 2002)
29. Kang, C.Y., et al.: The impact of La-doping on the reliability of low Vth high-k/metal gate
nMOSFETs under various gate stress conditions. In: IEEE International Electron Devices
Meeting Technical Digest, San Francisco, CA, Dec. 2008
30. Hou, C.: A smart design paradigm for smart chips. In: IEEE International Solid-State Circuits
Conference Technical Digest, San Francisco, CA, Feb. 2017, pp. 8–13
31. Yang, S., et al.: High-performance mobile SoC design and technology co-optimization to
mitigate high-K metal gate process variations. In: IEEE Symposium on VLSI Technology
Technical Digest, Honolulu, HI, June 2017
32. Dadgour, H.F., Endo, K., De, V.K., Banerjee, K.: Modeling and analysis of grain-orientation
effects in emerging metal-gate devices and implications for SRAM reliability. In: IEEE
International Electron Devices Meeting Technical Digest, San Francisco, CA, Dec. 2008
33. Hamaguchi, M., et al.: New layout dependency in high-K/metal gate MOSFETs. In: IEEE
International Electron Devices Meeting Technical Digest, Washington, DC, Dec. 2011, pp.
579–582
34. Dorsch, J.: Changes and challenges abound in multi-patterning lithography. Semiconductor
Manufacturing & Design Community, www.semi.org/en/node/54491, Feb. 2015
35. Woo, Y., Ichihashi, M., Parihar, S., Yuan, L., Banna, S., Kye, J.: Design and process technology
co-optimization with SADP BEOL in sub-10nm SRAM bitcell. In: IEEE International Electron
Devices Meeting Technical Digest, Washington, DC, Dec. 2015, pp. 273–276
36. Rashed, M., et al.: Innovations in special constructs for standard cell libraries in sub 28nm
technologies. In: IEEE International Electron Devices Meeting Technical Digest, Washington,
DC, Dec. 2013, pp. 248–251
37. Wu, W., Chan, M.: Gate resistance modeling of multifin MOS devices. IEEE Electron Device
Letters. 27(1), 68–70 (Jan. 2006)
38. Wang, Y.: Embedded memory design in CMOS finFET technology. In: IEEE Symposium on
VLSI Circuits, Short Course, Honolulu, HI, June 2016
39. Sheu, B.: Circuit design using finFETs. In: IEEE International Solid-State Circuits Conference,
Tutorial T4, San Francisco, CA, Feb. 2013
40. Hsueh, F.-L., et al.: Analog/RF wonderland: circuit and technology co-optimization in
advanced finFET technology. In: IEEE Symposium on VLSI Technology Technical Digest,
Honolulu, HI, June 2016, pp. 114–115
41. Lee, C.-K., et al.: A 5Gb/s/pin 8Gb LPDDR4X SDRAM with power-isolated LVSTL and split-
die architecture with 2-die ZQ calibration scheme. In: IEEE International Solid-State Circuits
Conference Technical Digest, San Francisco, CA, Feb. 2017, pp. 390–391
42. Banba, H., Shiga, H., Miyaba, T., Tanzawa, T., Atsumi, S., Sakui, K.: A CMOS bandgap
reference circuit with sub-1-V operation. IEEE Journal of Solid-State Circuits. 34(5), 670–673
(May 1999)
43. Pertijs, M.A.P., Huijsing, J.H.: Precision temperature sensors in CMOS technology. Springer,
Dordrecht (2006)
44. ADT7461 ±1 °C temperature monitor with series resistance cancellation. ON Semiconductor
Publication No. ADT7461/D, Mar. 2014
45. Chang, L., Tang, S., King, T.-J., Bokor, J., Hu, C.: Gate length scaling and threshold voltage
control of double-gate MOSFETs. In: IEEE International Electron Devices Meeting Technical
Digest, San Francisco, CA, Dec. 2000, pp. 719–722
46. Wei, A., et al.: Challenges of analog and I/O scaling in 10nm SoC technology and beyond.
In: IEEE International Electron Devices Meeting Technical Digest, San Francisco, CA, Dec.
2014, pp. 462–465
47. Silveira, F., Flandre, D., Jespers, P.G.A.: A gm/ID based methodology for the design of CMOS
analog circuits and its application to the synthesis of a silicon-on-insulator OTA. IEEE Journal
of Solid State Circuits. 31(9), 1314–1319 (Sep. 1996)
48. Sansen, W.: Analog CMOS from 5 micrometer to 5 nanometer. In: IEEE International Solid-
State Circuits Conference Technical Digest, San Francisco, CA, Feb. 2015, pp. 22–27
49. Feng, J., et al.: Bridging design and manufacture of analog/mixed-signal circuits in advanced
CMOS. In: IEEE Symposium on VLSI Technology Technical Digest, Kyoto, Japan, June 2011,
pp. 226–227
50. Khandelwal, S., et al.: BSIM-CMG 110.0.0 Multi-Gate MOSFET Compact Model Technical
Manual. Regents University California, Berkeley (2015)
51. Loke, A.L.S., Wu, Z.-Y., Moallemi, R., Cabler, C.D., Lackey, C.O., Wee, T.T., Doyle, B.A.:
Constant-current threshold voltage extraction in HSPICE for nanoscale CMOS analog design.
In: Synopsys Users Group (SNUG) Conference, San Jose, CA, Mar. 2010
Chapter 15
Analog Circuits in 28 nm and 14 nm FinFET
15.1 Introduction
FinFETs are fabricated with a gate that surrounds the channel on three sides,
as shown in Fig. 15.1. The improved lateral electrostatic control of the Tri-Gate
structure leads to the outstanding performance of FinFETs in terms of saturation
current, leakage current, and other analog parameters, as well as improved short
channel effects [1] (Fig. 15.2).
Fig. 15.1 A planar transistor (left) and a FinFET transistor (right) [1]
Designing circuits for a FinFET technology is similar to circuit design for a
planar technology. The differences are the parallel connection of fins to obtain
the desired transistor width and the stacking of devices to increase the effective
length. A microscope picture of some transistors implemented in the planar and in
the FinFET technologies is shown in Fig. 15.3. Although up to a hundred transistors
have to be stacked for a current source, the parasitic capacitance is small enough
not to degrade the performance in comparison to planar CMOS (Fig. 15.4).
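The sizing procedure described above (quantized width from parallel fins, long effective length from stacked devices) can be sketched as a small helper. This is a minimal illustration only: the per-fin effective width and the maximum drawn gate length are hypothetical placeholder values, not parameters of the technologies discussed here.

```python
import math


def fin_array(target_width_nm, target_length_nm, w_eff_per_fin_nm, l_max_nm):
    """Return (fins in parallel, devices in series) approximating a planar W/L.

    Width is quantized in whole fins; length beyond l_max_nm is approximated
    by stacking devices with a common gate.
    """
    n_fins = math.ceil(target_width_nm / w_eff_per_fin_nm)
    n_stack = math.ceil(target_length_nm / l_max_nm)
    return n_fins, n_stack


if __name__ == "__main__":
    # Hypothetical numbers: 100 nm effective width per fin, 20 nm maximum gate length.
    fins, stack = fin_array(1000, 2000, w_eff_per_fin_nm=100, l_max_nm=20)
    print(fins, "fins in parallel,", stack, "devices stacked")
```

With these placeholder values, a 2 μm effective length already requires a stack of 100 devices, which is in line with the "up to a hundred transistors" mentioned in the text for a current source.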
The main issue for these arrays of fins is the layout generation, because all the
fins must be connected.
Fig. 15.3 Microscope picture of some transistors in planar (left) and in FinFET (right)
technology [1]
Special care has to be taken at high current or for high frequency circuits. FinFET
self-heating (FISH) can be a serious issue because the thermal resistance from the
channel of the FinFET to the substrate is much higher than in planar CMOS resulting
in increased maximal temperature within the fins [2]. Moreover, the resistances as
well as the parasitic capacitances of the lower metal layers are increased because of
the small metal pitch.
284 L. Dörrer et al.
Fig. 15.5 Double patterning requires two different masks for the same layer
In this section we compare the planar and FinFET designs of a CDAC, a two-step
flash ADC and a Sigma Delta ADC. We analyze designs targeting similar constraints
and describe how the implemented circuit solutions can be enhanced or limited by
the technology.
Total sine wave output powers of 14.2 dBm and 13.5 dBm were measured at
1850 MHz and 850 MHz, respectively. The measurement results in 14 nm and 28 nm
are similar. In 14 nm, a 25% power reduction was possible, which is a consequence
of the digital architecture of the DAC. Although the power delivered to the output
has to be the same, losses of switched digital signals are drastically reduced. The
area of the DAC is reduced by more than 50%, but the overall area of the transmitter
is determined by the matching network at the output. The transformer in the matching
network is the main area contributor, and its size is given by the number of turns for
a chosen frequency. Therefore, the total area of the whole transmitter shrinks by
only approximately 25% (Fig. 15.8).
In IQ mode, an excellent noise floor of −153 dBc/Hz has been measured in LTE5
mode (Fig. 15.9). Low error vector magnitudes (EVM) of 3% and 1.5% have also
been measured in LTE5 mode at 1850 MHz and 850 MHz, respectively. With
digital predistortion, the EVM can be reduced to 0.8% in LTE20 mode. The periodic
spurs in Fig. 15.9 are an artifact caused by the testchip RAM in the digital path and
thus can be easily avoided. It is expected that the noise floor can be improved by
better layout, because power connections were not fully optimized.
For 5G applications, fast A/D converters with bandwidths in the GHz range are
necessary. The basic architecture is described in the literature [4, 5]. The resolution
of the A/D converter is mainly limited by the power dissipation of the system. For
Fig. 15.8 The transformer depends on the matching network and in 28 nm and 14 nm does not
shrink substantially
Fig. 15.10 Concept of the subranging converter, coarse and fine stage
The comparators are very simple differential stages followed by a latch which is
tuned for low kickback. Figure 15.11 shows the comparator of the coarse stage.
The numbers indicate the number of fins or capacitors used. For the fine stage,
a slice-based design was used to simplify the layout. Instead of sizing every
comparator individually, a common slice size was used, and slices were switched in
parallel to provide the capacitance and current demanded by the noise requirements
(Fig. 15.12).
In the 14 nm preamplifier, it is possible to stack more devices due to the better ratio
of VDD to threshold voltage. Exploiting the scaled technologies made it possible to
digitally calibrate the offset of each comparator (Figs. 15.13 and 15.14).
The comparators of the coarse stage are first offset calibrated [4]. The remaining
errors are corrected by monitoring the difference between the coarse and the fine
results: an error has evidently occurred whenever an underrange or overrange signal
is generated. The implementation in 14 nm is more compact and faster. Simulations
showed that for the same power dissipation, the sampling rate can be doubled in
14 nm. For 15 mW, the ADC can be operated at up to 2 GS/s in 28 nm, whereas
4 GS/s is possible in 14 nm. The area shrinks by 50% in 14 nm. Measurements are
available in 28 nm, showing an ENOB of 7.2 up to the Nyquist frequency.
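To put these numbers in perspective, one commonly used yardstick is the Walden figure of merit, FoM = P / (2^ENOB · fs). The text itself does not use this metric. The sketch below applies it under the assumption that the 15 mW power figure and the measured ENOB of 7.2 both refer to the 2 GS/s operating point in 28 nm, which the text does not state explicitly. The 14 nm line additionally assumes an unchanged ENOB, which has not been measured.

```python
def walden_fom(power_w, enob, f_sample_hz):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob * f_sample_hz)


if __name__ == "__main__":
    fom_28 = walden_fom(15e-3, 7.2, 2e9)   # assumed: same operating point for P and ENOB
    fom_14 = walden_fom(15e-3, 7.2, 4e9)   # hypothetical: ENOB assumed unchanged in 14 nm
    print("28 nm: %.1f fJ/conversion-step" % (fom_28 * 1e15))
    print("14 nm: %.1f fJ/conversion-step" % (fom_14 * 1e15))
```

Under these assumptions, doubling the sampling rate at constant power halves the energy per conversion step.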
As a last comparison example, a continuous time Sigma Delta ADC has also been
developed and measured in the two technologies. This ADC is a good test vehicle to
scout and compare the improved performance or the limitations of a technology
because its building blocks relate to pure analog, mixed mode, and pure digital
circuits. Some literature about similar ADC design can be found in [6, 7].
In this work we compare two similar ADC designs in feedback configuration.
The ADC architecture is shown in Fig. 15.15. The ADC can be reconfigured
by changing the filter coefficients to cover two bandwidth modes, 9 MHz and
47 MHz. A low-resolution 5-bit flash quantizer and three current
steering feedback DACs are used to close the loop and digitize the analog input
Fig. 15.15 The third-order Sigma Delta ADC architecture used for the technology comparison
Fig. 15.16 The tuned resistor implementation. The parallel design (left) and the more compact
R2R implementation (right). The latter reduces the area and the parasitic capacitance to ground
Fig. 15.18 Metal stack section comparison between two Intel FinFET technologies 22 nm and
14 nm [8]
The digitized output of the quantizer is collected into a memory in the digital
VLSI domain. As expected, the digital VLSI block shown in Fig. 15.15 greatly
outperforms the planar design in all aspects when implemented in 14 nm FinFET.
A comparison of the measured performance is shown in Figs. 15.21 and 15.22
and in Table 15.1.
Although the two ADCs are not identical, they show similar performance and can
thus be compared. An even more important fact is that in both designs the simulation
results fit well with the measurements. The accurate simulation of the extracted top
level view in 14 nm is much more time consuming due to the larger amount of netlist
elements that cannot be neglected or simplified/reduced.
15.4 Conclusions
The critical high performance A/D and D/A building blocks for an RF transceiver
have been designed in Intel 14 nm Tri-Gate FinFET and 28 nm planar technologies.
The competitive advantage in performance has been determined, and hurdles for a
commercialization with a competitive execution schedule have been identified.
References
1. Bohr, M., Mistry, K.: Intel's revolutionary 22nm transistor technology. Intel
Corporation. www.intel.com/content/dam/www/public/us/en/documents/presentation/
revolutionary-22nm-transistor-technology-presentation.pdf, pp. 4–9, May 2011
2. Prasad, C., et al.: Transistor reliability characterization and comparisons for a 14 nm tri-
gate technology optimized for system-on-chip and foundry platforms. 2016 IEEE International
Reliability Physics Symposium (IRPS), Pasadena, CA, 2016, pp. 4B-5-1–4B-5-8
3. Fulde, M., Kuttner, F., et al.: Digital multimode polar transmitter supporting 40MHz LTE carrier
aggregation in 28nm CMOS. ISSCC2017 13.2
4. Sandner, C., Clara, M., Santner, A., Hartig, T., Kuttner, F.: A 6-bit 1.2-GS/s low-power flash-
ADC in 0.13-μm digital CMOS. IEEE J. Solid-State Circuits. 40(7), 1499–1505 (2005)
5. Clara, M., Wiesbauer, A., Kuttner, F.: A 1.8 V fully embedded 10 b 160 MS/s two-step ADC in
0.18 μm CMOS. Proceedings of the IEEE 2002 Custom Integrated Circuits Conference (Cat.
No.02CH37285), 2002, pp. 437–440
6. Dorrer, L., Kuttner, F., Greco, P., Torta, P., Hartig, T.: A 3-mW 74-dB SNR 2-MHz continuous-
time delta-sigma ADC with a tracking ADC quantizer in 0.13-μm CMOS. IEEE J. Solid State
Circuits. 40(12), 2416–2427 (2005)
7. Dorrer, L., et al.: A 2.2mW, Continuous-Time Sigma-Delta ADC for Voice Coding with 95dB
Dynamic Range in a 65nm CMOS Process. 2006 Proceedings of the 32nd European Solid-State
Circuits Conference, Montreux, 2006, pp. 195–198
Additional Reference
8. Borkar, R., Bohr, M., Jourdan, S.: Advancing Moore's Law: The Road to 14 nm. Intel
Corporation, www.intel.com/content/www/us/en/silicon-innovations/advancing-moores-law-in-
2014-presentation.html, pp. 14–39, August 2014
Chapter 16
Pipeline and SAR ADCs for Advanced Nodes
16.1 Introduction
The energy efficiency of ADCs has improved by orders of magnitude over the past
two decades. Even though process scaling degrades the analog characteristics of
transistors, by exploiting scaling, the energy efficiency of recently reported ADCs
is approaching fundamental limits [1]. These improvements have been achieved
through innovative circuit ideas and through the evolution of ADC architectures.
In particular, the SAR ADC architecture has benefitted tremendously from scaling.
SAR ADCs are among the most efficient stand-alone converters and form excellent
building blocks for more complex architectures including pipeline and highly
interleaved ADCs. This chapter builds on material that first appeared in [2–4].
Section 2 presents a stand-alone SAR ADC that achieves outstanding efficiency
and also shrinks the area needed for a SAR ADC. This architecture uses a charge-
injection cell-based DAC to avoid problems associated with residual settling in
conventional SAR ADCs. Further, reuse of the charge-injection cells allows a very
small die area. The small size and efficiency make the charge-injection cell SAR
ADC an ideal building block for hybrid and interleaved ADCs.
We concentrate on pipeline ADCs for the remainder of the chapter. These
combine sub-ADCs of moderate resolution with high-performance amplifiers to
construct a high-resolution pipeline. In Sect. 3, we present a simple argument that
in a two-stage pipeline the first stage should have higher resolution. However, this
high-resolution first stage is difficult to achieve with flash-based sub-ADCs. The
SAR-assisted pipeline ADC allows a high-resolution first stage that enables a very
efficient two-stage pipeline.
We focus on the amplifier part of the pipeline for the remainder of the chapter.
In Sect. 4, we argue that a ring amplifier can supersede the workhorse-cascoded
telescopic OTA in the switched-capacitor residue amplifier of a SAR-assisted
pipeline ADC. In Sect. 5, we present a SAR-assisted pipeline ADC that uses a ring
amplifier to achieve outstanding energy efficiency.
SAR ADCs are not only highly effective by themselves but also form critical
building blocks of pipeline ADCs and interleaved SAR ADC arrays. Interleaving
of SAR ADCs delivers very high sampling speeds and good energy efficiency.
However, interleaving of multiple SAR ADCs poses significant challenges due
to the large area needed. A particular problem is that interleaving artifacts are
exacerbated by die size [5]. Compact and efficient SAR ADCs facilitate highly
interleaved ADCs and also serve as efficient building blocks for pipeline and
ADCs. One approach to improving SAR ADC performance is the multiple-bit-per-
cycle SAR ADC, but this has the disadvantage of significant extra complexity since
extra quantizers and capacitor DACs are needed [6, 7]. Furthermore, multiple-bit-
per-cycle SAR ADCs need increased die area. The charge-injection cell-based DAC
SAR ADC (ciSAR ADC) [3] is a very compact SAR ADC architecture and achieves
excellent energy efficiency.
Interrupted settling makes the ciSAR ADC faster, simpler, and more linear
for high-speed operation. This is because the ciSAR architecture avoids the
distortion suffered by conventional fast SAR ADCs due to insufficient DAC
settling. Figure 16.1 illustrates how residual settling compromises the linearity of
a conventional SAR ADC. Settling from a previous step continues into the present
SAR step, leading to distortion in the conversion. As shown in the example, the
residual background settling skews trip points downward. Redundancy in the SAR
algorithm can alleviate this problem, but redundancy requires extra SAR steps and
more complicated SAR logic. On the other hand, in ciSAR, thanks to interrupted
settling, settling for any given SAR step stops completely at the end of that step.
Returning to the example in Fig. 16.1, we see that with interrupted settling, there is
no longer distortion of the trip points.
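The effect of residual settling can be illustrated with a toy behavioral model. The sketch below is not taken from [3]; it assumes a single-pole DAC, a hypothetical `settle_ratio` parameter equal to the step time divided by the DAC time constant, and an input expressed in LSBs. Passing `settle_ratio=None` stands in for the case where nothing carries over from one SAR step to the next.

```python
import math

def sar_convert(x, n_bits=6, settle_ratio=None):
    """Binary-search x, given in LSBs within [-2**(n_bits-1), 2**(n_bits-1)).

    settle_ratio is T/tau of a single-pole DAC (hypothetical model).
    None means no unsettled remainder carries into the next step.
    """
    residual = 0.0 if settle_ratio is None else math.exp(-settle_ratio)
    target = 0.0   # ideal DAC level for the current trial
    actual = 0.0   # DAC level that the comparator really sees
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        decision = 1 if x > actual else 0
        code |= decision << bit
        if bit == 0:
            break
        step = 2.0 ** (bit - 1)
        target += step if decision else -step
        # The unsettled remainder of earlier steps carries into this one.
        actual = target + (actual - target) * residual
    return code

print(sar_convert(12.5))                     # no carry-over between steps
print(sar_convert(12.5, settle_ratio=1.0))   # heavily under-settled DAC
```

In this model the second call returns a different code from the first because the comparator makes its second decision while the DAC is still far from its target level.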
Modular charge-injection cells (CICs), as shown in Fig. 16.2, are the key to
interrupted settling. Initially, the input signal is sampled onto two differential
integration capacitors, Cint+ and Cint-. During the SAR operation, the DAC+ and DAC-
nodes of these capacitors are connected to the CICs and to a comparator. During
the binary search, the CICs subtract fixed quanta of charge from the Cint+ and Cint-
capacitors. The binary search of the SAR ADC is based on the set-and-down method
[8], since the CICs can only subtract charge. A unique advantage is that CICs
can be reused for different SAR steps. Thanks to this reuse, the number of CICs
can be far smaller than the number of levels in the SAR ADC. For example, in [3]
only eight identical CICs are needed for a prototype 6-bit ADC.
16 Pipeline and SAR ADCs for Advanced Nodes 299
Fig. 16.1 Interrupted settling (right) avoids distortion due to limited bandwidth [3]
Fig. 16.2 ciSAR architecture with eight CIC cells for a 6-bit ADC [3]
Fig. 16.3 Charge transfer cell, timing diagram, and transfer current profile [3]
The profile of the charge transfer (Fig. 16.3) also facilitates interrupted settling.
At the beginning of the charge transfer cycle, one of the charge transfer switches
(i.e., either M1 or M2) is strongly on. However, during the transfer, the voltage on
the reservoir node rises, reducing the gate-source voltage of the conducting NMOS
switch (M1 or M2) and causing the current to fall. The current continues to fall until
it drops to the level of the small bias current supplied by M3. This falling current
profile greatly reduces the sensitivity to jitter in the timing control signal, since the
current flow is always small when charge transfer is halted. As CIC cells are only
active for a short duration, we use the remaining time to prepare for the next charge
transfer.
A prototype ciSAR needs only eight CICs [3]. CIC cells in a ciSAR ADC are
reused multiple times in a SAR conversion, both to save area and to improve linearity.
During the MSB cycle, these eight CICs are used twice in two successive phases
to deliver 16 units of charge. CIC cell reuse not only halves the DAC area but also
halves the driver power for the control signals. CIC cell reuse also improves ADC
linearity because the same cells are reused. CIC cell reuse in the prototype [3]
slows down the overall ADC sampling rate by only 15%.
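The set-and-down search with cell reuse can be sketched as follows. This is an illustrative model rather than the implementation in [3]: voltages are expressed in units of one CIC charge quantum, the step weights 16, 8, 4, 2, 1 are assumed for a 6-bit conversion, and the helper `cic_phases` assumes that steps smaller than eight units simply enable fewer cells.

```python
import math

N_CELLS = 8  # eight identical CICs, as in the prototype described above

def cic_phases(weight, n_cells=N_CELLS):
    """Return (phases, cells_per_phase) needed to deliver `weight` units.

    Assumption: a step smaller than n_cells enables only some of the cells.
    """
    phases = math.ceil(weight / n_cells)
    return phases, weight // phases

def cisar_convert(v_plus, v_minus, weights=(16, 8, 4, 2, 1)):
    """Set-and-down search: charge is only ever subtracted."""
    code = 0
    for weight in weights:
        bit = 1 if v_plus > v_minus else 0
        code = (code << 1) | bit
        # Subtract charge from whichever side is currently larger.
        if bit:
            v_plus -= weight
        else:
            v_minus -= weight
    # The final comparison resolves the LSB without any charge transfer.
    return (code << 1) | (1 if v_plus > v_minus else 0)

print(cic_phases(16))            # MSB step: all eight cells, fired twice
print(cisar_convert(0.5, 0.0))   # small positive differential input
```

Because both nodes only ever move downward, the common-mode level falls during the conversion while the differential value converges toward zero.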
transconductance Gm, and CL,tot is the total output load of the opamp. If we assume
a first-order step response, then the output of the first-stage MDAC at the end of
the hold phase is

$$V_{res} = V_{ideal} + V_{err} \quad \text{and} \quad V_{err} = -\left(V_{ideal} - V_{initial}\right) e^{-\beta T G_m / C_{L,tot}} \tag{1}$$

where T is the available time for settling and β is the feedback factor.
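As a rough numerical illustration of Eq. (1), the sketch below evaluates the settling error of a single-pole response and inverts the expression to find the transconductance needed for a given relative error. The function names and all numerical values are hypothetical and are chosen only to show the orders of magnitude involved.

```python
import math

def settling_error(v_ideal, v_initial, beta, t_settle, gm, c_load):
    """V_err of Eq. (1) for a first-order (single-pole) step response."""
    return -(v_ideal - v_initial) * math.exp(-beta * t_settle * gm / c_load)

def required_gm(rel_error, beta, t_settle, c_load):
    """Smallest Gm for which |V_err| <= rel_error * |V_ideal - V_initial|."""
    return c_load * math.log(1.0 / rel_error) / (beta * t_settle)

# Hypothetical numbers, chosen only for illustration.
gm = required_gm(rel_error=2**-9, beta=1 / 32, t_settle=5e-9, c_load=1e-12)
v_err = settling_error(0.3, 0.0, 1 / 32, 5e-9, gm, 1e-12)
print(f"Gm = {gm * 1e3:.1f} mS, V_err = {v_err * 1e6:.0f} uV")
```

The required Gm grows linearly with the load capacitance but only logarithmically with the accuracy target, which is why the load capacitance features so prominently in the power argument.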
A simple argument shows the power advantages of an increased first-stage gain.
The feedback factor ($2^{1-M}$) is approximately halved with every 1-bit increase in
resolution, M, of the first-stage MDAC. A 1-bit increase in the resolution of the first
stage also indicates a 1-bit decrease in the required resolution of the subsequent
stages. Therefore, the worsened feedback factor, β, is approximately offset by
the increased tolerance for settling error, Verr . On the other hand, a 1-bit decrease
in the required resolution of the subsequent stages also approximately halves the
output load capacitance, CL,tot . This reduction in CL,tot decreases the required opamp
transconductance, Gm , which in turn directly translates into a reduction in the opamp
power consumption. However, this power improvement with increasing first-stage
MDAC resolution ceases when output self-parasitics of the opamp dominate CL,tot .
The linearity of a pipeline ADC improves as the first-stage resolution increases
[9]. This is because an increased first-stage resolution lowers nonlinearity due to
capacitor mismatch. Furthermore, the large gain of a high-resolution stage decreases
the nonlinearity and noise contributions of the subsequent stages.
The cascoded telescopic OTA-based SC residue amplifier has been the workhorse
of conventional pipeline and SAR-assisted pipeline ADCs [2, 11]. However, the
conventional OTA structure consumes a lot of power and suffers from a limited
output swing. This restricted output swing forces a stage gain smaller than suggested
by the first-stage resolution and the redundancy of the pipeline. Because of this,
SAR-assisted pipelines often need to reduce the reference voltages to the second
stage [2], but this consumes extra power. Another alternative is to use an R-2R
DAC in the second-stage SAR ADC [11]. Dynamic amplifiers are a lower-power
alternative to OTAs in a pipeline ADC [12, 13, 15, 16]. Through time-domain
integration, a dynamic amplifier offers low-power amplification of the residue. An
advantage is that this integration filters noise [14]; however, the inaccurate open-
loop gain with dynamic amplification requires gain calibration in the pipeline.
Not only does calibration increase both the complexity of design and test cost,
it also reduces robustness to changes in process, supply voltage, and temperature
(PVT) [13].
The ring amplifier [17, 18] is an energy-efficient alternative to an OTA that
intrinsically has a high output swing. The high gain of the ring amplifier allows
closed-loop operation without the need for gain calibration. Reference [4] introduced a fully
differential ring amplifier enabling a fully differential switched-capacitor stage.
Slew-based charging makes ring amplifiers energy efficient. Recent ring amplifiers
are robust to PVT variation [4, 18] because they do not need external biasing.
The original ring amplifier [17], Fig. 16.6a, is a three-stage inverter-based
amplifier with an offset-canceled first stage. The ring amplifier is stabilized by
the last stage moving into the subthreshold region as the ring amplifier virtual ground
(i.e., VIN in Fig. 16.6) approaches the desired common-mode voltage. This is done
with the help of split second-stage inverter amplifiers with separate floating input
offset voltages. During auto-zero, the floating input offsets of the second-stage
inverters are applied to capacitors C2 and C3 via an external bias voltage, VOS .
304 M.P. Flynn et al.
Fig. 16.6 (a) Original ring amplifier [17] and (b) the self-biased ring amplifier [18]
Operating the third stage in subthreshold results in a high output resistance, thereby
forming a dominant pole at the output and stabilizing the amplifier. Ring amplifiers
[4, 17, 18] have several intrinsic advantages compared to OTAs. First, even with
a low supply voltage, a ring amplifier easily produces high gain from its three
cascaded gain stages. Second, as mentioned earlier, slew-based charging is very
energy efficient. Third, because the last stage is a simple inverter, ultimately
operating in subthreshold, ring amplifiers can handle a near rail-to-rail output signal
swing.
The self-biased single-ended ring amplifier introduced in [18] and shown in Fig.
16.6b is more robust to PVT changes and uses less power than the original ring
amplifier circuit. The improved robustness and the removal of external biases make
this ring amplifier more practical. One of the innovations in [18] is the use of high
threshold voltage NMOS and PMOS transistors in the last stage, which extends
the stable range since high threshold voltage FETs have an order-of-magnitude
higher output resistance for a given gate-source voltage. Another technique that
helps stabilize the design is the addition of a resistor, RB, between the gates of the
third-stage NMOS and PMOS transistors, as shown in Fig. 16.6b. The voltage drop
caused by the second-stage inverter current flowing through RB dynamically applies
different voltages to the gates of the last inverter stage, as VIN approaches the desired
common-mode voltage. On the other hand, the gates of the NMOS and PMOS
transistors of the last stage are still driven rail to rail when VIN is away from the
common-mode voltage, ensuring a high slew rate. An advantage compared to the
ring amplifier in [17] is that the combined three stages are auto-zeroed for improved
PVT tolerance.
The ring amplifiers in [17, 18] are single-ended circuits and therefore inherit
the drawbacks of single-ended structures. The well-known disadvantages of single-
ended circuits include limited common-mode and supply rejection. Furthermore,
single-ended circuits do not reject even-order harmonics as differential circuits
do. As shown in Fig. 16.7, a pseudo-differential structure along with a common-
mode feedback (CMFB) circuit [17, 18] to some extent alleviates these problems.
The switched-capacitor CMFB in Fig. 16.7 consists of the common-mode sensing
Fig. 16.7 Pseudo-differential MDAC gain stage with two ring amplifiers in [18]
capacitors CS+ and CS- and feedback capacitors, CF. VCM is the common-mode
voltage reference. A limitation is that this pseudo-differential CMFB reduces the
effective gain of the ring amplifier because CF forms a capacitive divider at the input
of the ring amplifier. The effective gain is reduced from the nominal ring amplifier
gain, AV, to AV·CC/(CC + CF + CIN), where CC is the auto-zero offset storage
capacitor and CIN is the input parasitic capacitance of the ring amplifier.
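The size of this gain penalty is easy to estimate. The following sketch simply evaluates the divider expression above; the nominal gain and the capacitor values are hypothetical and are not taken from [18].

```python
import math

def effective_gain(a_v, c_c, c_f, c_in):
    """Ring amplifier gain after the capacitive divider at its input."""
    return a_v * c_c / (c_c + c_f + c_in)

# Hypothetical values: nominal gain of 10,000 (80 dB), capacitances in farads.
a_nominal = 10_000.0
a_eff = effective_gain(a_nominal, c_c=200e-15, c_f=150e-15, c_in=50e-15)
loss_db = 20 * math.log10(a_eff / a_nominal)
print(f"effective gain = {a_eff:.0f}, change = {loss_db:.1f} dB")
```

With these example values the divider ratio is one half, so CF and CIN together cost about 6 dB of loop gain.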
As shown in Fig. 16.8, [4] introduces a fully differential ring amplifier that avoids
the problems of the single-ended ring amplifier structure. In the fully differential
ring amplifier, a single differential pair replaces the first stages of a pair of single-
ended ring amplifiers [18]. Reuse of current by the NMOS and PMOS differential
pairs increases the transconductance, thereby reducing the dominant thermal noise
of the ring amplifier. To further save power, when not needed, the first stage is
powered down via an enable signal EN .
Effective biasing and CMFB are important for reliable operation of the
ring amplifier. Biasing and CMFB are shown in Fig. 16.8. The auto-zero forces
the ring amplifier input and output voltages to be close to values that lead to
the highest amplifier gain. There are separate CMFB loops to set the common mode
of the first stage and the common mode of the overall ring amplifier. During the
auto-zero phase, a CMFB loop, consisting of PMOS devices M4, M5, and M6
Fig. 16.8 Fully differential ring amplifier, along with bias and CMFB [4]
operating in triode [19], coarsely regulates the output common mode of the first
stage. A separate switched-capacitor CMFB circuit forces the output common mode
of the entire ring amplifier to VCM during the amplification phase.
The second and third stages of the ring amplifier are based on inverters. Similar
to the single-ended self-biased ring amplifier [18], resistors RB (Fig. 16.8) apply
dynamic offset voltages to the PMOS and NMOS gates of the third stage.
Furthermore, high-threshold-voltage devices in the second stage increase gain. This
is needed because dynamic biasing can cause the second-stage transistors to operate
in the triode region. Triode operation greatly reduces both the second-stage gain and
the gain of the entire ring amplifier. As with the third stage, the use of high
threshold voltage transistors extends the output voltage range for which the second-
stage transistors are operating in saturation. The simulated small-signal gain for a
65 nm CMOS prototype ring amplifier is greater than 80 dB for an output swing
range from 0.1 V to 1.1 V and a 1.2 V supply.
Fig. 16.9 50MS/s 13-bit SAR-assisted pipeline ADC with ring amplifier [4]
pipeline to tolerate first-stage sub-ADC errors. The output range of the residue is
0.3 V to 0.9 V for an ideal first-stage CDAC and comparator. The additional output range
of the amplifier facilitates the 1-bit redundancy.
To reduce the switching energy of the first stage, the SAR CDAC is split into two
separate capacitor DAC arrays, Big DAC and Small DAC, as shown in Fig. 16.9.
Splitting the CDAC into two separate capacitor arrays also reduces the INL and
DNL errors due to the CDAC capacitor mismatch. The total differential sampling
capacitance of the first-stage CDAC is 4 pF to satisfy the 13-bit kT/C noise
requirement. Taking advantage of the fact that the 6-bit first-stage SAR sub-ADC
needs only to meet 6-bit kT/C noise performance, Small DAC, which is part of the
first-stage SAR ADC, uses only a quarter of the sampling capacitance, to reduce
the SAR DAC power consumption. Merged capacitor switching (MCS) [20] further
reduces the energy consumption of the SAR DAC. Asynchronous SAR operation
[21] eliminates the need for a high-frequency ADC clock and reduces errors due to
comparator metastability.
Both Big DAC and Small DAC sample the same input signal. Big DAC contains
the remaining three quarters of the sampling capacitance and is only needed during
residue generation. Based on the decision of the SAR, energy-efficient switching
of Big DAC is achieved with the floated detect-and-skip (FDAS) CDAC switching
technique, derived from [22]. Once the first-stage SAR conversion is complete, the
residues of the Big and Small DACs are merged together and passed to the 32×
residue amplifier.
Figure 16.10 shows a simplified single-ended depiction of the residue gain structure;
the actual implementation is fully differential. A controls the amplification
phase, and S and S are sampling/auto-zero phase control signals. Auto-zeroing
ensures that the output swing of the ring amplifier is fully utilized. A relatively large
(4 pF) offset storage capacitor, CAZ , minimizes folding of the auto-zero noise [23].
However, the fact that the sampled voltage on CAZ stays constant means that the
large CAZ capacitance does not have a detrimental effect on power consumption.
Furthermore, this large CAZ capacitance has the advantage of stabilizing the ring
amplifier during the auto-zero. This is because CAZ presents a large load to the ring
amplifier during auto-zero, thereby reducing both the dominant pole frequency and
the slew rate.
Fig. 16.10 Simplified single-ended depiction of residue gain stage structure [4]
Fig. 16.11 Simplified timing for SAR-assisted pipeline with ring amplifier [4]
Figure 16.11 shows a simplified timing diagram for the entire SAR-assisted
pipeline ADC. To save power, the ring amplifier is powered down during the
operation of the first-stage SAR ADC. Amplification begins after the completion
of the first-stage SAR ADC conversion; this maximizes the time for residue
amplification. In the prototype, an 8-bit second-stage SAR sub-ADC digitizes
the amplified residue. The second stage, like the first-stage sub-ADC, uses MCS,
bottom-plate input sampling, and asynchronous SAR logic. The second-stage 8-bit
CDAC is reset to VCM after the sub-ADC is finished so that residue amplification
always starts from VCM . This reset improves efficiency by halving the maximum
slew rate required from the ring amplifier [18].
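The figures quoted in this section fit together through simple arithmetic, sketched below. The function name `pipeline_budget` is hypothetical; the inputs are the 6-bit first stage, the 8-bit second stage, the 1-bit redundancy, and the 4 pF total sampling capacitance stated above.

```python
def pipeline_budget(first_bits, second_bits, redundancy_bits, c_total_pf):
    """Return (total bits, residue gain, Small DAC pF, Big DAC pF)."""
    total_bits = first_bits + second_bits - redundancy_bits
    residue_gain = 2 ** (first_bits - redundancy_bits)
    small_dac_pf = c_total_pf / 4          # a quarter of the sampling capacitance
    big_dac_pf = c_total_pf - small_dac_pf  # the remaining three quarters
    return total_bits, residue_gain, small_dac_pf, big_dac_pf

print(pipeline_budget(6, 8, 1, 4.0))
```

The result reproduces the 13-bit overall resolution and the residue gain of 32, with 1 pF in Small DAC and 3 pF in Big DAC.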
16.6 Conclusions
The last decade has seen a near three order-of-magnitude improvement in the energy
efficiency of ADCs. Much of this can be attributed to the scaling-friendly nature of
the SAR architecture. Furthermore, the SAR-assisted pipeline architecture enables
SAR ADCs to dramatically improve the energy efficiency of moderately high-
resolution pipeline ADCs. At the same time, the ring amplifier avoids the problems
associated with OTAs in advanced CMOS nodes.
References
1. Murmann, B.: Energy limits in current A/D converter architectures, ISSCC Short Course, Feb.
2012
2. Lee, C.C., Flynn, M.P.: A SAR-assisted two-stage pipeline ADC. IEEE JSSC. 46(4), 859–869
(2011)
3. Choo, K.D., Bell, J., Flynn, M.P.: Area-efficient 1GS/s 6b SAR ADC with charge-injection-
cell-based DAC. ISSCC Digest Technical Papers. 460–461 (Feb. 2016)
4. Lim, Y., Flynn, M.P.: A 1 mW 71.5 dB SNDR 50 MS/s 13 bit fully differential ring amplifier
based SAR-assisted pipeline ADC. IEEE JSSC. 50(12), 2901–2911 (2015)
5. Le Dortz, N., et al.: A 1.62GS/s time-interleaved SAR ADC with digital background mismatch
calibration achieving interleaving spurs below 70dBFS. ISSCC Digest Technical Papers.
386–388 (Feb. 2014)
6. Chan, C.-H., et al.: A 5.5mW 6b 5GS/S 4×-interleaved 3b/cycle SAR ADC in 65nm CMOS.
In ISSCC Digest Technical Papers, pp. 1, 3, Feb. 2015
7. Hong, H-K., et al.: An 8.6 ENOB 900MS/s time-interleaved 2b/cycle SAR ADC with a
1b/cycle reconfiguration for resolution enhancement. In ISSCC Digest Technical Papers,
pp. 470, 471, Feb. 2013
8. Liu, C.-C., et al.: A 0.92mW 10-bit 50-MS/s SAR ADC in 0.13μm CMOS process. IEEE
Symposium VLSI Circuits, Digest Technical Papers, pp. 236–237, June 2009
9. Yang, W.H., Kelly, D., Mehr, I., Sayuk, M.T., Singer, L.: A 3-V 340-mW 14-b 75-Msample/s
CMOS ADC with 85-dB SFDR at Nyquist input. IEEE JSSC. 36, 1931–1936 (2001)
10. Devarajan, S., Singer, L., Kelly, D., Decker, S., Kamath, A., Wilkins, P.: A 16b 125MS/s
385mW 78.7dB SNR CMOS pipeline ADC. IEEE ISSCC Digest Technical Papers. 86–87
(Feb. 2009)
11. Lee, H.-Y., Lee, B., Moon, U.-K.: A 31.3fJ/conversion-step 70.4dB SNDR 30MS/s 1.2V two-
step pipelined ADC in 0.13μm CMOS. In IEEE ISSCC Digest Technical Papers, pp. 474–475,
Feb. 2012
12. Verbruggen, B., Iriguchi, M., Craninckx, J.: A 1.7 mW 11b 250 MS/s 2-times interleaved fully
dynamic pipelined SAR ADC in 40 nm digital CMOS. IEEE JSSC. 47(12), 2880–2887 (2012)
13. Verbruggen, B. et al.: A 2.1 mW 11b 410 MS/s dynamic pipelined SAR ADC with background
calibration in 28nm digital CMOS. IEEE Symposium VLSI Circuits, Digest Technical Papers,
pp. 268–269, June 2013
14. van der Goes, F., et al.: A 1.5 mW 68 dB SNDR 80 Ms/s 2× interleaved pipelined SAR ADC
in 28 nm CMOS. IEEE JSSC. 49(12), 2835–2845 (2014)
15. Verbruggen, B., Deguchi, K., Malki, B., Craninckx, J.: A 70 dB SNDR 200 MS/s 2.3 mW
dynamic pipelined SAR ADC in 28nm digital CMOS. In: IEEE Symp. VLSI Circuits, Digest
Technical Papers, pp. 268–269, June 2014
16. Malki, B., Verbruggen, B., Wambacq, P., Deguchi, K., Iriguchi, M., Craninckx, J.: A
complementary dynamic residue amplifier for a 67 dB SNDR 1.36 mW 170 MS/s pipelined
SAR ADC. In: ESSCIRC Digest Technical Papers, pp. 215–218, Sep. 2014
17. Hershberg, B., Weaver, S., Sobue, K., Takeuchi, S., Hamashita, K., Moon, U.-K.: Ring
amplifiers for switched capacitor circuits. IEEE JSSC. 47(12), 2928–2942 (2012)
18. Lim, Y., Flynn, M.P.: A 100 MS/s, 10.5 bit, 2.46 mW comparator-less pipeline ADC using
self-biased ring amplifiers. IEEE JSSC. 50(10), 2331–2341 (2015)
19. Razavi, B.: Operational amplifiers. In: Design of Analog CMOS Integrated Circuits,
pp. 319–324. McGraw-Hill, Boston (2001)
20. Hariprasath, V., Guerber, J., Lee, S.-H., Moon, U.-K.: Merged capacitor switching based SAR
ADC with highest switching energy-efficiency. Electron. Lett. 46(9), 620–621 (2010)
21. Chen, S.-W., Brodersen, R.W.: A 6b 600MS/s 5.3mW Asynchronous ADC in 0.13μm CMOS.
IEEE ISSCC Digest Technical Papers. 2350–2351 (Feb. 2006)
22. Tai, H.-Y., Hu, Y.-S., Chen, H.-W., Chen, H.-S.: A 0.85fJ/conversion-step 10b 200kS/s
subranging SAR ADC in 40nm CMOS. IEEE ISSCC Digest Technical Papers. 196–197 (Feb.
2014)
23. Enz, C.C., Temes, G.C.: Circuit techniques for reducing the effects of op-amp imperfections:
autozeroing, correlated double sampling, and chopper stabilization. Proc. IEEE. 84(11),
1584–1614 (1996)
Chapter 17
Time-Based Biomedical Readout
in Ultra-Low-Voltage, Small-Scale
CMOS Technology
Rachit Mohan, Samira Zaliasl, Chris Van Hoof, and Nick Van Helleputte
17.1 Introduction
Fig. 17.2 Power and area breakdown of a sensor SoC in 180 nm CMOS technology for wearable
biomedical readout application [8]
To realize power- and area-efficient SoCs in this application, both analog and
digital design challenges need to be considered. To understand the challenge of
obtaining a low-power, low-area sensor SoC without compromising on accuracy
and large-signal handling capabilities, consider Fig. 17.2. It shows the power and
area breakdown of a sensor SoC [8] for a wearable ECG readout with on-node
digital signal processing, including motion artefact reduction and beat detection. As
can be seen in the figure, for a design in 180 nm, even fairly modest digital signal
processing tasks result in sizeable power and area consumption, comparable to
that of the AFE. The figure also plots the area of the AFE and the digital logic in the 180 nm
technology, together with an extrapolation to 40 nm technology, to show the
magnitude of the problem. As can be seen, the digital area is hardly visible in an
advanced node such as 40 nm. However, the area of the AFE does not scale so
easily with technology and thereby leads to an increase in cost.
The reasons why the AFE does not scale well with technology and voltage supply
are well known and have been discussed extensively in the literature [9–14]. In the next
section, we will briefly discuss the reasons that are relevant to biomedical sensor
readout.
One of the dominant reasons that the AFE does not scale well is the reduction
in available voltage headroom for signal swing due to the reduction in supply voltage.
To overcome this challenge, as in other design communities such as ADC design, we
look towards time-based operation. We do so because it offers the advantage, among
others, of an available dynamic range that is decoupled from VDD. This potentially
allows lower-voltage operation and hence lower power consumption. Moreover,
this available dynamic range even increases as we scale down in technology, as
opposed to that of a voltage-based design.
However, the existing time-based implementations are not quite suitable for
low-power, low-cost biomedical sensor readout. They face challenges in meeting
the low-power and low-area constraints, often the same ones as a voltage-based
implementation. The reasons for this are also discussed later in this paper.
We propose a time-based circuit that overcomes these challenges and meets the
requirements of a sensor SoC.
This paper is structured as follows. In Sect. 17.2, we discuss the
challenges of technology and voltage scaling of traditional AFEs. In Sect. 17.3,
we present a historical overview of existing time-based approaches. In Sect.
17.4, we discuss the specific implementation of the proposed time-based circuit.
Finally, Sect. 17.5 presents the conclusions.
In this section, we will discuss the challenges of AFE design from a biomedical
perspective while also discussing some existing solutions in literature.
A summary of the main challenges is listed below:
• Voltage scaling: A lower VDD leads to a lower available dynamic range.
• Intrinsic gain: As the channel length reduces, the intrinsic gain of the transistor
  reduces. This is especially problematic for the design of instrumentation amplifiers
  (IAs), which require high DC gain for power- and noise-efficient readouts.
• Flicker noise: Since biomedical signals are at low frequencies (<1 kHz), the impact
  of flicker noise is large, and it needs to be reduced for low-noise applications.
  Since flicker noise is inversely proportional to the area of the transistor, scaling
  down the transistor size becomes challenging.
• Gate-leakage current: As technologies scale, the gate oxide thickness reduces,
  leading to an increase in gate-leakage current. This is problematic for IAs, which
  need to have extremely high input impedances (>10 MΩ).
• On-chip passives: On-chip passives are required for implementing time constants
  for the required signal conditioning and AC coupling. However, their area does
  not scale down with technology. Since biomedical signals are at low frequencies,
  they require large on-chip time constants, and hence, biomedical designs are
  typically dominated by on-chip passive area. Therefore, this is one of the main
  reasons why biomedical readouts do not scale down in area with the technology.
Table 17.1 lists the rough specifications that we would like to target for an AFE
design to meet the application requirements.
To understand the effect of VDD scaling on a circuit's power and area consumption,
consider Fig. 17.3 [15]. It shows a gm-C circuit with an input signal with a
peak-to-peak voltage swing (Vpp) of Vsig and an output Vpp of VDD. The
gm-C circuit is a representative conceptual model for most analog blocks.1 The
capacitance C can be any capacitance: a MIM cap, a moscap or a parasitic cap. It
essentially represents the area of the circuit. For the purposes of this analysis,
we assume that the gm is ideal. The only nonideality that we assume to be present
is the kT/C thermal noise of the capacitor. This is because thermal noise is
important for biomedical designs and often forms the fundamental limit to circuit
performance. The signal-to-noise ratio (SNR), bandwidth (BW) and power consumption
(P), in this case, are given by
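$$\mathrm{SNR} = \frac{V_{DD}^2 / 8}{kT / C} \;\Rightarrow\; C \propto \frac{\mathrm{SNR}}{V_{DD}^2}$$

$$\mathrm{BW} = \frac{g_m}{2 \pi C} \propto \frac{I_{DD}}{C}$$

$$P = V_{DD} \cdot I_{DD}$$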
1 To be exact, the model and the following analysis hold true for sampled systems such as
sample-and-hold circuits and ADCs. They do not hold true for some blocks such as instrumentation
amplifiers (IA) and low-noise amplifiers (LNA). However, since almost any analog signal chain will
contain ADCs, the overall conclusion of the analysis will still be applicable and relevant to our
discussion in this paper.
Fig. 17.3 Conceptual model to understand the effect of decrease in VDD [15]
So now, if VDD scales down by a factor of s, then C (and
consequently the area) must increase by $s^2$, and the power consumption of the
circuit must increase by s, to keep the SNR and the BW constant.
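The scaling argument can be checked numerically. The sketch below applies the three proportionalities above to a hypothetical starting point (the capacitance, current, and supply values are arbitrary) and reports the resulting area and power ratios.

```python
def scale_for_constant_snr_bw(c, i_dd, v_dd, s):
    """Scale VDD down by s while holding SNR and BW constant."""
    v_dd_new = v_dd / s
    c_new = c * s**2        # C is proportional to SNR / VDD^2
    i_dd_new = i_dd * s**2  # BW is proportional to IDD / C, so IDD tracks C
    return c_new, i_dd_new, v_dd_new

c0, i0, v0 = 1e-12, 1e-6, 1.2   # hypothetical starting point
c1, i1, v1 = scale_for_constant_snr_bw(c0, i0, v0, s=2.0)
area_ratio = c1 / c0                  # grows as s^2
power_ratio = (v1 * i1) / (v0 * i0)   # grows as s
print(f"area x{area_ratio:.1f}, power x{power_ratio:.1f}")
```

Halving VDD in this model therefore quadruples the capacitor area and doubles the power.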
A similar analysis can be performed for flicker noise and mismatch with the
same result. Basically, this result means that if VDD scales down, the noise will
also have to scale down to obtain the same dynamic range (or accuracy), which
requires an increase in power and area consumption.
One way to combat this is by representing the signal in another domain
independent of VDD, that is, time. Such solutions are discussed in Sect. 17.3.
It is important to note that in practice, the results are not so dramatic [16]. This
is because the parameter that is important is not VDD but (VDD − Vth), where Vth is
the threshold voltage of the transistor. A low Vth transistor can be used at critical
places in the circuit to offset the impact of a low VDD. The increase of kT/C noise
due to smaller C can be overcome by oversampling techniques to some extent.
Nevertheless, the general conclusion still holds true.
To understand the impact of a shorter channel length on the intrinsic gain of the
transistor, consider the representative differential transistor circuit shown in Fig. 17.4
with the given design parameters. The input transistors are biased in the weak-inversion
region, as is generally the case for low-VDD designs and biomedical circuits.
The intrinsic gain of the input transistors is given by
$$\text{Intrinsic gain} = \frac{g_m}{g_{ds}} \propto L \exp\left(\frac{dV_{th}}{n U_t}\right)$$
Fig. 17.4 Intrinsic gain of a representative differential opamp with different input transistor
lengths
L is the length of the transistor, Vth is the threshold voltage, n is the subthreshold
constant and Ut is the thermal voltage. Figure 17.4 also plots the intrinsic gain
vs L, assuming all the other parameters remain the same. As can be seen, the intrinsic
gain reduces considerably when L is changed from 1 μm (which is fairly standard for
biomedical designs) to 40 nm.
In practice, the exact results will vary due to secondary effects of change in Vth
and n due to change in the drain voltage and VDD. For a more comprehensive
analysis, the readers are referred to [10].
To overcome this problem, cascoding and gain-boosting techniques are typically
applied to increase the intrinsic gain [17, 18]. However, for VDD < 1 V, these
techniques are not so useful. Hence, existing solutions in the literature apply one or
more of the following techniques to increase gain:
• Cascading of gain stages.
• Increasing the transconductance, gm, of the transistor by biasing the body at a
  voltage other than ground or source, or even using the body as an input [19–21].
  The possibility of latch-up in such low VDD designs is very low as VDD Vth .
• Using weak-positive feedback to enhance the gain [19, 22].
Figure 17.5 shows an example of a 0.5 V opamp which uses all three
techniques listed above [19]. Alternatively, it is also possible to use the phase
domain [23, 24] or the time domain [25] to obtain a large gain. This will be discussed
in Sect. 17.3.
Fig. 17.5 Opamp circuit that uses cascading, body biasing and weak-positive feedback to increase
the gain. It operates at 0.5 V VDD
Fig. 17.6 Normalized flicker noise vs transistor channel length (a), normalized Kf vs technology (b)
where Kf is the flicker noise coefficient, Id is the drain current of the input-stage
transistor, Cox is the unit oxide capacitance of a MOSFET, f is the frequency, and W
and L are the width and length of the input-stage transistors, respectively.
Figure 17.6a plots the flicker noise vs L to show the impact of decrease
in transistor length. The values have been normalized to the value at 180 nm
technology node. As can be seen, the flicker noise increases by more than 10×
when the transistor length is reduced to 40 nm.
It is also of interest to know the effect of the choice of technology on the flicker noise
of the transistor for a given L. Figure 17.6b plots the normalized Kf vs technology.2
The technologies chosen are sample TSMC technologies from 180 nm down to
28 nm. Interestingly, there seems to be no pattern or correlation that we can discern,
making it impossible for us to predict the impact of technology on flicker noise.
Typical solutions in the literature to overcome the problem of flicker noise, apart from
increasing the size of the transistor, are [26]:
• Autozeroing technique
• Chopping technique
The underlying strategy for both these techniques is to have separate paths for
the signal and the flicker noise and thereby be able to filter out the flicker noise after
signal readout. The advantage of the chopping technique over the autozeroing is that
there is no significant increase in the baseband thermal noise floor. However, the
main drawback is that there can be a large output ripple in the chopping technique,
which can limit its efficacy and can also be problematic for low VDD designs. This
ripple voltage can be reduced by the use of a ripple feedback loop [27]. This loop
can be a mixed-signal feedback loop as well [28]. Alternatively, references [29, 30]
show a readout architecture that uses both autozeroing and chopping to retain the
advantages of both the techniques while getting rid of flicker noise.
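The principle of chopping, and the origin of its output ripple, can be illustrated with a minimal discrete-time sketch. This is not a circuit from the cited works: a constant amplifier offset stands in for slowly varying flicker noise, the chopper is an ideal ±1 square wave, and plain averaging plays the role of the output low-pass filter. All parameter values are hypothetical.

```python
def chopped_readout(signal, offset, gain, n_samples=1000, chop_period=10):
    """Chop, amplify with an offset, demodulate, then average.

    Returns (mean, ripple_pp) of the demodulated output.
    """
    out = []
    for k in range(n_samples):
        chop = 1.0 if (k // (chop_period // 2)) % 2 == 0 else -1.0
        amp_out = gain * (signal * chop + offset)  # the offset is not chopped
        out.append(amp_out * chop)                 # demodulate
    mean = sum(out) / len(out)
    return mean, max(out) - min(out)

mean, ripple = chopped_readout(signal=1e-3, offset=5e-3, gain=100.0)
print(f"mean = {mean:.3f} V, ripple = {ripple:.3f} V peak-to-peak")
```

The signal is modulated and demodulated, so it returns to baseband, whereas the offset is modulated only once and appears as a square-wave ripple at the chopping frequency that the averaging removes.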
It is to be noted that both these techniques require the circuit to operate at a
higher bandwidth than the baseband. Although at small-scale technologies this is not
necessarily problematic, in older technologies with large transistor designs, this can
lead to a higher power consumption. At low VDD, we also need to pay attention to
the impedance of the switch transistors, which can limit the maximum frequency of
operation. With low-Vth transistors and operation at frequencies <100 kHz,
this is not a major problem yet [31]. Where it is, techniques such as clock
boosting can be used to improve the speed performance [32, 33].
2 Kf has been extracted from the BSIM4 SPICE model parameters.
Figure 17.7 plots the normalized (to value at 180 nm technology node) gate-leakage
current of the input stage for the representative differential amplifier in Fig. 17.4, to
get an idea of the increase in gate-leakage current with technology scaling. We use
the following BSIM4 model to generate the plot [34]:
$$I_g \propto WL \cdot \frac{1}{t_{ox}^{3}\, e^{t_{ox}}}$$
where Ig is the gate current, tox is the thickness of the oxide, and W and L are the
width and length of the input-stage transistors, respectively. As can be seen, as we
scale down to 40 nm and 28 nm technology, the gate-leakage current increases by
10 times as compared to the 180 nm technology, which is still a favourite node for
biomedical designs.
The presence of gate-leakage current is problematic not just because of wasteful
power consumption, but it can reduce the input impedance of the readout. It can
also lead to an increase in noise, especially for sensor readouts with high source
impedance.
A simple solution to overcome the challenge of gate-leakage current is to use either
thick-oxide transistors or small-area transistors. The drawback of thick-oxide
transistors is that they have a higher Vth than transistors with the typical oxide
thickness, which can be a problem in low-VDD designs. Small-area
transistors, on the other hand, will be limited by their flicker noise performance. In case neither of
these solutions is possible, we will need to look to our bipolar counterparts for
circuits that compensate for the gate current. Additionally, it is also possible to use overall
positive DC feedback loops to compensate for it. Such feedback loops already exist
in literature to compensate for input currents arising due to other effects such as
chopping. Figure 17.8 shows one such architecture that incorporates such a feedback
mechanism [35].
320 R. Mohan et al.
Fig. 17.8 IA architecture with positive feedback loop (bootstrap loop) to compensate for input-
stage current [35]. Such feedback loops can also be utilized to compensate for gate-leakage current
due to technology scaling
Although the size of the transistor scales down with technology, the size and density
of on-chip capacitances and resistances do not. For example, the MIM (metal-insulator-metal)
capacitance remains a few fF/µm² [2, 12], and the sheet resistance
of a hi-poly layer is on the order of a few kΩ/sq. This means that the area
occupied by passives will remain the same as we scale down, even if we decrease
the size of the transistors. Given that the area of biomedical readouts is generally dominated
by passives [36], we cannot scale down the area with technology.
To overcome this challenge, it is possible to implement the passives with
active transistors [37]. However, such designs often face a power and dynamic
range trade-off [38]. For example, reference [39] implements a large time constant
by way of a switched-capacitor technique, at the expense of increase in noise.
Alternatively, in certain applications, instead of an active implementation, passive
elements in an unconventional configuration can be used. Reference [40] uses
capacitors in a T-network configuration in the feedback to reduce the effective
feedback capacitance of an IA. This allows them to reduce the input capacitance
while keeping the overall gain the same. This architecture is shown in Fig. 17.9a.
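The effect of a capacitive T-network can be checked with a short calculation. The sketch below assumes an ideal amplifier with a virtual ground at its input and uses hypothetical capacitor values. The names c1, c2 and c3 are ours and are not taken from [40].

```python
def t_network_ceff(c1, c2, c3):
    """Effective feedback capacitance of a capacitive T-network.

    c1: output to middle node, c2: middle node to the amplifier input
    (assumed to be a virtual ground), c3: middle node to ground.
    """
    return c1 * c2 / (c1 + c2 + c3)

def closed_loop_gain(c_in, c_fb):
    """Mid-band gain magnitude of a capacitive-feedback amplifier."""
    return c_in / c_fb

# Hypothetical values: two 100 fF series capacitors and an 800 fF shunt capacitor.
c_eff = t_network_ceff(100e-15, 100e-15, 800e-15)
print("effective feedback capacitance [F]:", c_eff)
print("gain with 1 pF input capacitor    :", closed_loop_gain(1e-12, c_eff))
```

With these values the network behaves like a feedback capacitor ten times smaller than either series capacitor, so the same gain is reached with a ten times smaller input capacitor than a plain 100 fF feedback capacitor would need.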
It is also possible to implement the required on-chip time constants in the digital
domain [36, 41], with the information fed back to the analog domain through DACs. Figure
17.9b shows one such architecture [36]. It implements a DC-coupled mixed-signal
feedback loop to get rid of all the capacitors required for implementing a time
constant and for AC coupling. For large array readouts, it is also possible to share
a common/reference terminal or even a complete block such as an ADC to save area
[42, 43].
In the previous section, we saw that one of the existing critical challenges is the
reduction in the available signal swing due to the reducing VDD. To overcome this
challenge, we need to represent the signal in a domain other than the voltage domain.
We specifically look towards the time domain for the following reasons:
A large dynamic range available: As the size of the transistor scales down, a
higher speed or bandwidth is available to represent the signal as opposed to the
voltage domain, wherein the dynamic range reduces.
This dynamic range is decoupled from the supply voltage (at least to a first order).
The signal is represented in a digital-like representation. Given the reality that
the digital circuits drive the semiconductor technology, by moving to a time-
based operation, we basically ensure that any changes in technology that will be
beneficial to the digital circuits will be beneficial to time-based circuits as well.
An important corollary of this point is that it is easy to convert a time-domain
signal to a digital representation. Simple digital counters are usually sufficient as
opposed to complicated voltage-based ADCs. Not only that, it will potentially
allow using digital calibration and compensation techniques to correct for the
errors in the analog domain.
The idea of time-domain-based operation is not new. It has been explored in the literature
by different design communities, for example in ADCs [44–47], filters [23, 48], DC-DC
converters [49], pixel readout [50] and resistive/capacitive sensor readouts [51, 52].
In fact, time-based operation has existed in the literature for almost as long as
the semiconductor industry itself. One of the first time-interval-based ADC patents
was published in the 1940s [44, 53]. By the 1960s–1980s, with the advent of digital
circuits, it was clear that time-based circuits would become more advantageous as
technology improved [54]. Most of the time-based circuits
discussed later in this section can find their origins in studies done during this period.
Despite this history, until very recently almost no studies had been done
on time-based biomedical readout.
The reasons for this can be explained by looking at the challenges of a time-
domain-based approach:
Real-world biomedical signals are, often, voltage signals and not time signals.
Thus, to implement the time-based operation, we will require a block in the signal
chain to convert the information from voltage to time. This block will face the
same challenges of a voltage-based design in an ultra-low-voltage, small-scale
technology environment.
In addition to the previous point, although the time-domain signals are digital-
like or pseudo-digital, they are, in essence, still analog signals. This means that
they are plagued by the same concerns of mismatch, kT/C noise, flicker noise and
technology limitations such as effect of gate-leakage currents and short-channel
effects such as low intrinsic gain of the transistor. Hence, although it is possible
to operate at low VDD, scaling down in area with technology will be challenging.
Since it is challenging to reduce the area or the capacitance, even though a larger
dynamic range is available in scaled technologies, the power reduction due to
low VDD operation might be minimal or, worse, the power consumption might be significantly higher.
For the same reasons, although converting to the digital domain is indeed simpler,
it can also be power consuming, as the digital counters may require a high
clock frequency (>100 MHz) for sufficient resolution. This, however, may not be
a problem in advanced technology nodes and at low VDD.
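The counter-frequency concern in the last point can be made concrete with a small sketch. It assumes that a time interval of at most window_s seconds is digitized by counting clock edges, so that N bits of resolution need 2^N clock periods within the window. The function names and the 10-bit, 10 µs example are hypothetical.

```python
import math

def required_counter_clock(bits, window_s):
    """Clock frequency needed to resolve a time window into 2**bits steps."""
    return (2 ** bits) / window_s

def resolution_bits(f_clk, window_s):
    """Resolution (in bits) obtained when counting f_clk edges over the window."""
    return math.log2(f_clk * window_s)

# Hypothetical example: 10-bit resolution of a pulse width within a 10 us window.
print("required clock [Hz]:", required_counter_clock(10, 10e-6))
print("bits at 200 MHz    :", resolution_bits(200e6, 10e-6))
```

Under these assumptions a modest 10-bit resolution already calls for a clock slightly above 100 MHz, and each additional bit doubles the requirement.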
With these pros and cons in mind, we will analyse some of the existing ideas in
time-based implementations.
Although biomedical signals are in voltage domain, many physical sensors
output the information in the frequency domain, or the information can be easily
converted to frequency domain by integrating them with voltage-controlled oscil-
lators (VCOs) [51, 55, 56]. Time-domain-based operation can enable large DR
readout [51, 55] or alternatively ultra-low-voltage readout [56] in such applications.
Figure 17.10 shows the architecture of two such capacitive readouts. The underlying
principle in each case is that a change in the capacitor value changes the
frequency of the oscillator. This change is detected by a mixer and a low-pass filter.
Finally, the filtered value is converted into a digital value via a counter.
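The oscillator, mixer and counter chain described above can be sketched numerically. This is a sketch under stated assumptions rather than a model of the cited designs: the oscillator is taken to be an ideal LC oscillator, the mixer and low-pass filter are idealized as a frequency difference, and the inductance, capacitances and gate time are hypothetical.

```python
import math

def lc_frequency(l_h, c_f):
    """Resonance frequency of an ideal LC oscillator (assumed oscillator type)."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))

def beat_count(c_sense, c_ref, l_h=10e-6, gate_s=10e-3):
    """Digital output: beat cycles between sensing and reference oscillators."""
    f_sense = lc_frequency(l_h, c_sense)
    f_ref = lc_frequency(l_h, c_ref)
    f_beat = abs(f_sense - f_ref)  # ideal mixer + low-pass filter keep the difference term
    return int(f_beat * gate_s)    # counter: whole beat cycles within the gate time

# Hypothetical sensor: 100 pF reference, sensing capacitor 1 % and 2 % larger.
for c_sense in (100e-12, 101e-12, 102e-12):
    print(c_sense, beat_count(c_sense, 100e-12))
```

The count grows with the capacitance change, and a longer gate time gives a finer digital resolution at the cost of conversion time.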
Time-domain readout can also bring advantages of large dynamic range to sensor
readout for applications which are not area sensitive [57]. Figure 17.11 shows the
architecture of a resistive readout via a relaxation oscillator-based loop. The working
principle is that the resistor value is sensed by the current through the resistor, which
is mirrored and converted into time edges and then to a digital value via a ramp and a
threshold comparator.
Figure 17.12 shows a resistive/capacitive sensor readout design [58]. This circuit
too, like the previous one, uses a threshold comparator to detect the rate of discharge
of the RC circuit and thereby the resistance value for a known capacitor value. This
design was proposed to be used in conjunction with an off-chip microcontroller.
Hence, the frequency of the digital counter (200 MHz in this case) was not an issue.
By implementing a time-based readout, the design could achieve a large dynamic
range readout at only 1 V supply.
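The discharge-time principle can be illustrated as follows. The sketch assumes an ideal first-order RC discharge from v0 to the comparator threshold v_th and a free-running counter. The 200 MHz clock is the value quoted above, while the capacitor, voltages and resistance are hypothetical.

```python
import math

def discharge_count(r_ohm, c_f, v0, v_th, f_clk):
    """Counter value when an RC node discharging from v0 crosses v_th."""
    t_cross = r_ohm * c_f * math.log(v0 / v_th)  # ideal exponential discharge
    return int(t_cross * f_clk)                  # whole clock periods counted

def resistance_from_count(count, c_f, v0, v_th, f_clk):
    """Invert the relation above to estimate the sensor resistance."""
    return count / (f_clk * c_f * math.log(v0 / v_th))

# Hypothetical sensor: 1 Mohm read out with a known 10 pF capacitor.
count = discharge_count(1e6, 10e-12, 1.0, 0.5, 200e6)
print("count:", count)
print("estimated resistance [ohm]:", resistance_from_count(count, 10e-12, 1.0, 0.5, 200e6))
```

Because the count is proportional to the resistance, the dynamic range is set by the counter length and the clock rate rather than by the supply voltage.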
Figure 17.13a and b show two VCO-based architectures in a sigma-delta loop
for two different applications: a high-PSRR resistive readout [59] and an 800 MHz
ADC [60]. The main idea of both designs is the same: both use the
high voltage-to-phase gain of the VCOs to implement the integrator in the sigma-delta
loop instead of an analog integrator. In both cases this leads to low-VDD operation
and lower power consumption.
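The integrating behaviour of a VCO can be seen in a simple open-loop sketch. It is not a model of the closed loops in [59, 60]: it only assumes an ideal VCO whose frequency is f0 + kvco * v, counts the whole output cycles in each sampling period, and uses hypothetical parameter values.

```python
def vco_counts(v_samples, f0, kvco, fs):
    """Count VCO cycles per sampling period; the phase acts as the integrator."""
    phase = 0.0       # accumulated VCO phase, in cycles
    prev_edges = 0
    counts = []
    for v in v_samples:
        phase += (f0 + kvco * v) / fs      # phase integrates the input voltage
        edges = int(phase)                 # whole cycles seen so far (quantizer)
        counts.append(edges - prev_edges)  # difference of quantized phase
        prev_edges = edges
    return counts

# Hypothetical VCO: 1 MHz centre frequency, 1 MHz/V gain, sampled at 100 kHz.
counts = vco_counts([0.123] * 1000, f0=1e6, kvco=1e6, fs=100e3)
print("mean count per sample:", sum(counts) / len(counts))
```

Each individual count is a coarse integer, but the quantization error of one sample is carried over in the phase to the next, so the average converges to the input-dependent value.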
Figure 17.14a and b show examples wherein time-based operation is used for
ease of conversion to digital [44, 61]. The digital domain is then used to calibrate
the errors (nonlinearity errors in both cases) of the analog domain.
[Figure residue, capacitive readouts: (a) detecting crystal and detecting oscillator, reference crystal and reference oscillator, mixer, low-pass filter, counter; (b) CSENSE precharged to VHIGH, reference delay, counter, output DOUT]
Fig. 17.11 Resistive sensor readout via a relaxation oscillator-based mechanism [57]
[Figure residue, resistive/capacitive sensor readout: resistive sensor RX, capacitive sensor CX, crystal oscillator (fcrys), input buffer, comparator, inverter, XOR, decimator, reset transistor, output buffer]
Fig. 17.13 Example architectures of VCO being used as high-gain blocks in a sigma-delta loop
for (a) high PSRR resistive readout [52] (b) 800 MHz ADC [60]
Figure 17.15 shows the overview of the proposed time-based AFE architecture.
The differential input signal (an ECG signal in this case) is converted into a 2-bit digital
signal by a time-based ADC (T-ADC). This 2-bit signal is filtered off-chip via a CIC
filter. The same 2-bit signal is also used to extract the DC information, which is then
fed back to the input via a DAC.
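For readers unfamiliar with CIC filtering, the following minimal sketch shows a generic CIC decimator applied to a coarse digital stream. The order, the decimation rate and the test stream are hypothetical and are not the parameters of the off-chip filter used with the T-ADC.

```python
def cic_decimate(samples, rate, order=2):
    """Cascaded integrator-comb decimator with unity DC gain.

    Integrators run at the input rate, combs at the decimated rate.
    """
    integrators = [0] * order
    comb_prev = [0] * order
    out = []
    for n, x in enumerate(samples):
        acc = x
        for i in range(order):       # integrator cascade
            integrators[i] += acc
            acc = integrators[i]
        if (n + 1) % rate == 0:      # keep every rate-th sample
            y = acc
            for i in range(order):   # comb cascade: y[m] - y[m-1]
                y, comb_prev[i] = y - comb_prev[i], y
            out.append(y / rate ** order)  # normalize the DC gain rate**order
    return out

# Hypothetical coarse stream whose average value is 0.25.
stream = [1, 0, 0, 0] * 64
print(cic_decimate(stream, rate=4, order=2)[:5])
```

The filter needs only additions, subtractions and one final scaling, which is why it is a common choice for decimating low-resolution, oversampled streams.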
Figure 17.16 shows the schematic of the T-ADC. It comprises pseudo-differential
stages, each consisting of a comparator and a charge-pump integrator
in feedback. These stages are chopped to get rid of the flicker noise. The reasons for
the 1 MΩ resistance will be discussed later. The DC filter in the DC reduction loop
is implemented using a reset counter, whose output is downsampled, quantized to
7 bits and fed back to the feedback node using a current DAC.
The comparator and the charge-pump integrator in feedback form a time-based
loop. It is essentially an asynchronous delta modulator [62]. Due to the negative
feedback, the input of the comparator is essentially a virtual ground, whereas the
comparator output is in time domain. Hence, we eliminate any large voltage swings,
allowing us to operate at low VDD and thereby consume low power. The virtual
ground also allows us to implement a power- and area-efficient dynamic comparator.
This is because one of the drawbacks of a dynamic comparator is noise fold-over.
As will be seen later, we can suppress this by implementing a nominal gain stage
and a simple anti-alias filter. This would not have been possible had the input signal
swing been large. Finally, since the integrator is in feedback, and not in feedforward
as seen in many time-based implementations in the previous section, its gain need not
be particularly high.
Fig. 17.14 Sample time-based architectures that also implement digital nonlinearity correction for
(a) ADC to be used in a sensor node [44]. (b) Neural AFE [61]
The 1 MΩ resistance is implemented to reduce the noise due to the feedback
integrator by shunting its current noise.
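The behaviour of the comparator and charge-pump integrator loop can be sketched with a clocked, idealized delta modulator. This is a simplified illustration and not the implemented circuit: the comparator is noiseless, the integrator moves the feedback voltage by a fixed step per clock, and the clock rate and step size are hypothetical (scaled down so that the example runs quickly).

```python
import math

def delta_modulate(vin, step):
    """Clocked comparator with an integrator in feedback (delta modulator)."""
    vfb = 0.0
    bits, errors = [], []
    for v in vin:
        err = v - vfb                # comparator input: the virtual-ground node
        bit = 1 if err >= 0 else -1  # comparator decision
        vfb += bit * step            # integrator steps the feedback toward the input
        bits.append(bit)
        errors.append(err)
    return bits, errors

# Hypothetical test: one cycle of a 5 mVpp, 11 Hz sine sampled at 100 kHz.
fs, step = 100e3, 5e-6
vin = [2.5e-3 * math.sin(2 * math.pi * 11 * n / fs) for n in range(9091)]
bits, errors = delta_modulate(vin, step)
print("input peak [V]           :", max(abs(v) for v in vin))
print("comparator-input peak [V]:", max(abs(e) for e in errors))
```

As long as the step exceeds the input change per clock period, the voltage at the comparator input stays within a couple of steps of zero while the information is carried by the bit stream. This is the property that removes large voltage swings from the loop.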
One of the drawbacks of using a delta modulator time-based loop for biomedical
readout is that, since it is a nonlinear loop, its analysis is very challenging. This
is especially problematic if we need to estimate the thermal noise and
the quantization noise floor, which is important for biomedical readouts, and to
optimize the loop parameters for maximum power and area efficiency.
Historically, the analysis has been done with the describing function (DF) method
[62, 63]. However, this method is quite intensive, yields very little design intuition and makes
a noise analysis quite challenging. We instead propose using a pseudo-continuous
analysis [64] for this time-based loop. For more details regarding this
analysis and design optimization, we refer the reader to [31].
Figure 17.17 shows the transistor-level schematic of the implemented dynamic
comparator. It is a standard architecture that comprises a preamplifier followed
by a latch stage. The input-stage transistors are implemented using thick-oxide
transistors to eliminate gate-leakage current. The anti-alias filter is inherently
implemented by using the parasitic capacitances (Cpar). The clock frequency of the
dynamic comparator is 25 MHz. This frequency is chosen to keep the quantization
noise lower than the thermal noise. The dynamic power consumption will be small
due to low VDD and small parasitic capacitances.
Figures 17.18 and 17.19 show a few measurement results of the proposed T-AFE.
Figure 17.18a plots the input-referred noise floor of the T-AFE. It achieves a
0.6 µVrms/√Hz thermal noise floor, or 7.2 µVrms total noise in a bandwidth of 150 Hz.
As expected, there is no flicker noise. Figure 17.18b plots the FFT of a 5 mVpp
input signal at 11 Hz. We choose this frequency because most of the
ECG signal energy is concentrated in this range. The T-AFE achieves an SFDR of 56 dB.
Figure 17.19a plots a sample ECG signal measured by the T-AFE. To validate the
large-signal capabilities, Fig. 17.19b shows the readout signal in the presence of motion
noise as high as 40 mVpp. As can be seen, the readout does not saturate and even maintains
decent beat detection.
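The two noise figures quoted above can be related with a one-line calculation, shown here as a sketch. It reads the 0.6 figure as a white spectral density in µVrms/√Hz, which is an interpretation on our part, and integrates it over the 150 Hz bandwidth.

```python
import math

def integrated_noise(density_per_rt_hz, bandwidth_hz):
    """RMS noise of a white spectral density integrated over a bandwidth."""
    return density_per_rt_hz * math.sqrt(bandwidth_hz)

# Assumed reading of the quoted floor: 0.6 uVrms/sqrt(Hz) over 150 Hz.
print("integrated noise [uVrms]:", integrated_noise(0.6, 150.0))
```

Under this assumption the integrated value is close to the quoted 7.2 µVrms total noise, which is consistent with a flat noise floor across the band.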
Fig. 17.18 Input-referred noise of the T-AFE (a) and FFT of the PWM comparator output (b)
Fig. 17.19 ECG with the subject at rest (a), with motion noise added [65] (b)
17.5 Conclusions
Personalized healthcare applications are pushing the boundaries for small, low-
cost, low-power and accurate devices. Existing sensor SoC designs require further
improvements in terms of power consumption, accuracy and cost. One of the main
reasons that this improvement is challenging is that the traditional voltage-mode
AFE design techniques are not well suited for low-voltage supplies and small-scale
technologies.
In this paper, we presented the challenges and existing solutions in literature
for design of a low-power, low-area biomedical AFE readout. In particular, we
focused on time-based operation to overcome the challenge of design with a low-
voltage supply. We further discussed the reasons why the existing time-based
implementations are not suitable to meet our target specifications.
To overcome these challenges, we proposed a time-based readout architecture
that eliminates voltage swing right at the electrodes through the use of
negative feedback and focuses on scalable design techniques. By doing so, we gain
significant benefits in terms of power and area consumption.
Thus, we not only come a step closer to realizing a low-power, low-cost and
accurate sensor SoC, but we open further avenues for analog design in ultra-low-
voltage and small-scale technology.
References
1. Patel, S., Hyung, P., Bonato, P., Chan, L., Rodgers, M.: A review of wearable sensors and
systems with application in rehabilitation. J. NeuroEng. Rehabil. 9(1), 1 (2012)
2. Blaauw, D., et al.: IoT Design Space Challenges: Circuits and Systems. VLSI-Technology,
Honolulu (2014)
3. Penders, J., van Hoof, C., Gyselinckx, B.: Bio-Medical Application of WBAN: Trends and
Examples. pp. 279–302. Bio-Medical CMOS ICs, Boston (2011)
4. Islam, A.B., Islam, S.K., Rahman, T.: A vertically aligned carbon nanofiber (VACNF) based
amperometric glucose sensor. 2009 International Semiconductor Device Research Symposium,
College Park (2009)
5. Islam, M.Z., Haider, M.R., Huque, M.A., Adeeb, M.A., Rahman, S., Islam, S.K.: A low power
sensor signal processing circuit for implantable biosensor applications. Smart Mater. Struct.
16(2), 525 (2007)
6. Baker, D.A., Gough, D.A.: A continuous, implantable lactate sensor. Anal. Chem. 67(9),
1536–1540 (1995)
7. Jeevarajan, A.S., Vani, S., Taylor, T.D., Anderson, M.M.: Continuous pH monitoring in a
perfused bioreactor system using an optical pH sensor. Biotechnol. Bioeng. 78(4), 467–472
(2002)
8. Helleputte, N.V., Konijnenburg, M., Hyejung, K., Pettine, J., Jee, D.-W., Breeschoten, A.,
Morgado, A., Torfs, T., Groot, H.D., Hoof, C.V., Yazicioglu, R.F.: A multi-parameter
signal-acquisition SoC for. 2014 IEEE International Solid-State Circuits Conference Digest of
Technical Papers (ISSCC), San Francisco (2014)
9. Bult, K.: The effect of technology scaling on power dissipation in analog circuits. Analog
Circuit Design: RF Circuits: Wide band, Front-Ends, DACs, Design Methodology and
Verification for RF and Mixed-Signal Systems, Low Power and Low Voltage, pp. 51294.
Springer, Dordrecht (2006)
10. Annema, A.J., Nauta, B., Langevelde, R.V., Tuinhout, H.: Analog circuits in
ultra-deep-submicron CMOS. IEEE J. Solid State Circuits. 40(1), 132–143 (2005)
11. Annema, A.J.: Analog circuit performance and process scaling. IEEE Trans. Circuits Syst. II,
Analog. Digit. Signal Process. 46(6), 711–725 (1999)
12. Lee, K., et al.: The impact of semiconductor technology scaling on CMOS RF and digital
circuits for wireless application. IEEE Trans. Electron Devices. 52(7), 1415–1422 (2005)
13. Baschirotto, A., Chironi, V., Cocciolo, G., D'Amico, S., De Matteis, M., Delizia, P.: Low
power analog design in scaled technologies. Proc. Topical Workshop Electron. Particle
Phys., pp. 103–110 (2009)
14. Murmann, B.: A/D converter trends: power dissipation, scaling and digitally assisted
architectures. 2008 IEEE Custom Integrated Circuits Conference, San Jose (2008)
15. Enz, C.C., Vittoz, E.A.: Chapter 1.2 CMOS Low-Power Analog Circuit Design. (2001)
16. Murmann, B., Nikaeen, P., Connelly, D., Dutton, R.: Impact of scaling on analog performance
and associated modeling needs. IEEE Trans. Electron Devices. 53(9), 1602167 (2006)
17. Razavi, B.: Design of Analog CMOS Integrated Circuits. McGraw-Hill, New York (2001)
18. Bult, K., Geelen, G.J.: The CMOS gain-boosting technique. Analog Integr. Circ. Sig. Process.
1(2), 118–135 (1991)
19. Chatterjee, S., Tsividis, Y., Kinget, P.: 0.5-V analog circuit techniques and their application in
OTA and filter design. IEEE J. Solid State Circuits. 40(12), 2373–2387 (2005)
20. Lehmann, T., Cassia, M.: 1-V power supply CMOS cascode amplifier. IEEE J. Solid State
Circuits. 36(7), 1082–1086 (2001)
21. Stockstad, T., Yoshizawa, H.: A 0.9 V, 0.5 µA rail-to-rail CMOS operational amplifier. IEEE
J. Solid State Circuits. 37(3), 286–292 (2002)
22. Wang, R., Harjani, R.: Partial positive feedback for gain enhancement of low-power CMOS
OTAs. Low-Voltage Low-Power Analog Integrated Circuits: A Special Issue of Analog
Integrated Circuits and Signal Processing An International Journal 8(1) (1995), Boston,
Springer US, pp. 21–35 (1995)
23. Drost, B., Talegaonkar, M., Hanumolu, P.K.: Analog filter design using ring oscillator
integrators. IEEE J. Solid State Circuits. 47(12), 3120–3129 (2012)
24. Park, M., Perrott, M.H.: IEEE J. Solid State Circuits. 44(12), 3344–3358 (2009)
25. Mohan, R., Yan, L.L., Gielen, G., Hoof, C., Yazicioglu, R.: 0.35 V time-domain-based
instrumentation amplifier. Electron. Lett. 50(21), 1513–1514 (2014)
26. Enz, C., Temes, G.: Circuit techniques for reducing the effects of op-amp imperfections:
autozeroing, correlated double sampling, and chopper stabilization. Proc. IEEE. 84(11),
1584–1614 (1996)
27. Wu, R., Makinwa, K.A.A., Huijsing, J.: A chopper current-feedback instrumentation
amplifier with a 1 mHz 1/f noise corner and an AC-coupled ripple reduction loop. IEEE J.
Solid State Circuits. 44(12), 3232–3243 (2009)
28. Akita, I., Ishida, M.: A chopper-stabilized instrumentation amplifier using area-efficient
self-trimming technique. Analog Integr. Circ. Sig. Process. 81(3), 571–582 (2014)
29. Tang, A.: A 3 µV-offset operational amplifier with 20 nV/√Hz input noise PSD
at DC employing both chopping and autozeroing. 2002 IEEE International Solid-State Circuits
Conference, San Francisco (2002)
30. Pertijs, M., Kindt, W.: A 140 dB-CMRR current-feedback instrumentation amplifier employing
ping-pong auto-zeroing and chopping. IEEE J. Solid State Circuits. 45(10), 2044–2056 (2010)
31. Mohan, R., Zaliasl, S.G.G.G., Hoof, C., Yazicioglu, R., Helleputte, N.: A 0.6-V, 0.015-mm²,
time-based ECG readout for ambulatory applications in 40-nm CMOS. IEEE J. Solid State
Circuits. 52(1), 298–308 (2017)
32. Mesgarani, A., Alam, M., Nelson, F., Ay, S.: Supply boosting technique for designing very
low-voltage mixed-signal circuits in standard CMOS. 2010 53rd IEEE International Midwest
Symposium on Circuits and Systems, Seattle (2010)
33. Abo, A., Gray, P.R.: A 1.5-V, 10-bit, 14.3-MS/s CMOS pipeline analog-to-digital converter.
IEEE J. Solid State Circuits. 34(5), 599–606 (1999)
34. Ranuarez, J., Deen, M., Chen, C.-H.: A review of gate tunneling current in MOS devices.
Microelectron. Reliab. 46(12), 1939–1956 (2006)
35. Helleputte, N.V., Kim, S., Kim, H., Kim, J.P., Hoof, C.V., Yazicioglu, R.F.: A 160 µA
biopotential acquisition IC with fully integrated IA and motion artifact suppression. IEEE
Trans. Biomed. Circuits Syst. 6(6), 552–561 (2012)
36. Muller, R., Gambini, S.S., Rabaey, J.: A 0.013 mm², 5 µW, DC-coupled neural signal
acquisition IC with 0.5 V supply. IEEE J. Solid State Circuits. 47(1), 232–243 (2012)
37. Tsividis, Y., Banu, M., Khoury, J.: Continuous-time MOSFET-C filters in VLSI. IEEE J. Solid
State Circuits. SC-1(1), 15–30 (1986)
38. Helleputte, N.V., et al.: A 345 µW multi-sensor biomedical SoC with bio-impedance,
3-channel ECG, motion artifact reduction, and integrated DSP. IEEE J. Solid State Circuits.
50(1), 230–244 (2015)
39. Fan, Q., Sebastiano, F., Huijsing, J.H., Makinwa, K.A.: A 1.8 µW 60 nV/√Hz capacitively-coupled
chopper instrumentation amplifier in 65nm CMOS for wireless sensor nodes. IEEE J. Solid
State Circuits. 46(7), 1534–1543 (2011)
40. Ng, K., Xu, Y.: A compact, low input capacitance neural recording amplifier with Cin/Gain
of 20fF.V/V. 2012 IEEE Biomedical Circuits and Systems Conference (BioCAS), Hsinchu
(2012)
41. Bohorquez, J.L., Yip, M., Chandrakasan, A.P., Dawson, J.L.: A biomedical sensor interface
with a sinc filter and interference cancellation. IEEE J. Solid State Circuits. 46(4), 746–756
(2011)
42. Zou, X., Liew, S., Yao, L., Lian, Y.: A 1V 22µW 32-channel implantable EEG recording IC.
2010 IEEE International Solid-State Circuits Conference, San Francisco (2010)
43. Majidzadeh, V., Schmid, A., Leblebici, Y.: Energy efficient low-noise neural recording
amplifier with enhanced noise efficiency factor. IEEE Trans. Biomed. Circuits Syst. 5(3),
262–271 (2011)
44. Naraghi, S., Courcy, M., Flynn, M.P.: A 9-bit, 14 µW and 0.06 mm² pulse position modulation
ADC in 90nm digital CMOS. IEEE J. Solid-State Circuits. 45(9), 1870 (2010)
45. Hernandez, L., Prefasi, E.: Analog-to-digital conversion using noise shaping and time
encoding. IEEE Trans. Circuits Syst. I, Reg. Papers. 55(7), 2026–2037 (2008)
46. Dhanasekaran, V., et al.: A 20MHz BW 68dB DR CT ΔΣ ADC based on a multi-bit
time-domain quantizer and feedback element. 2009 IEEE International Solid-State Circuits
Conference, San Francisco (2009)
47. Iwata, A., Sakimura, N., Nagata, M., Morie, T.: The architecture of delta sigma
analog-to-digital converters using a voltage-controlled oscillator as a multibit quantizer. IEEE Trans.
Circuits Syst. II, Analog Digit. Signal Process. 46(7), 941–945 (1999)
48. Kuppambatti, J., Vigraham, B., Kinget, P.: 17.9 A 0.6V 70MHz 4th-order continuous-time
Butterworth filter with 55.8dB SNR, −60dB THD at +2.8dBm output signal power. 2014
IEEE International Solid-State Circuits Conference Digest of Technical Papers, San Francisco
(2014)
49. Cuk, S.M.: Modeling, Analysis and Design. Thesis, California Institute of Technology,
Pasadena (1977)
50. Hanson, S., Foo, Z., Blaauw, D., Sylvester, D.: 0.5 V sub-microwatt CMOS image sensor with
pulse-width modulation read-out. IEEE J. Solid State Circuits. 45(4), 759–767 (2010)
51. Jung, W., Jeong, S., Oh, S., Sylvester, D., Blaauw, D.: 27.6 A 0.7pF-to-10nF fully digital
capacitance-to-digital converter using iterative delay-chain discharge. 2015 IEEE International
Solid-State Circuits Conference, San Francisco (2015)
52. Rethy, J., Smedt, V., Dehaene, W., Gielen, G.: Supply-noise-resilient design of a BBPLL-based
force-balanced Wheatstone bridge interface in 130-nm CMOS. IEEE J. Solid State Circuits.
48(11), 618–627 (2013)
53. Yurish, S.: Sensors and transducers: frequency output vs voltage output. Sens. Transducers
Mag. 49(11), 302–305 (2004)
54. Middlehoek, S., French, P., Huijsing, J., Lian, W.: Sensors with digital or frequency output.
Sensors Actuators. 15(2), 119–133 (1988)
55. Kindlund, A., Sundgren, H., Lundstrom, I.: Quartz crystal gas monitor with a gas concentrating
stage. Sensors Actuators. 6(1), 1–17 (1984)
56. Danneels, H., Coddens, K., Gielen, G.: A fully-digital, 0.3V, 270 nW capacitive sensor
interface without external references. 2011 Proceedings of the ESSCIRC (ESSCIRC), Helsinki
(2011)
57. Grassi, M., Malcovati, P., Baschirotto, A.: A 141-dB dynamic range CMOS gas-sensor
interface circuit without calibration with 16-bit digital output word. IEEE J. Solid State
Circuits. 42(7), 1543–1554 (2007)
58. Lu, J.H.-L., Inerowicz, M., Joo, S., Kwon, J.-K., Jung, B.: A low-power, wide-dynamic-range
semi-digital universal sensor readout circuit using pulsewidth modulation. IEEE Sensors J.
11(5), 1134–1144 (2011)
59. Rethy, J.V., Danneels, H., Smedt, V.D., Dehaene, W., Gielen, G.E.: Supply-noise-resilient
design of a BBPLL-based force-balanced Wheatstone bridge interface in 130nm CMOS. IEEE
J. Solid State Circuits. 48(11), 2618–2627 (2013)
60. Park, M., Perrott, M.: A VCO-based analog-to-digital converter with second-order
Sigma-Delta noise shaping. 2009 IEEE International Symposium on Circuits and Systems, Taipei
(2009)
61. Jiang, W., Hokhikyan, V., Chandrakumar, H., Karkare, V., Markovic, D.: 28.6 A ±50mV
linear input-range VCO-based neural-recording front-end with digital non-linearity correction.
2016 IEEE International Solid-State Circuits Conference, San Francisco (2016)
62. Booton, R.: Nonlinear control systems with random inputs. IRE Trans. Circ. Theory. 1(1), 9–18
(1954)
63. Ouzounov, S., Hegt, H., Roermund, A.V.: Sigma-delta modulators operating at a limit cycle.
IEEE Trans. Circuits Syst. II. 53(5), 399–403 (2006)
64. Perrott, M., Trott, M., Sodini, C.: A modeling approach for ΔΣ fractional-N frequency
synthesizers allowing straightforward noise analysis. IEEE J. Solid State Circuits. 37(8),
1028–1038 (2002)
65. Goldberger, A., et al.: PhysioBank, PhysioToolkit, and PhysioNet: components of a new
research resource for complex physiologic signals. Circulation. 101, e215–e220 (2000)
Chapter 18
A 4.4 mW-TX, 3.6 mW-RX Fully Integrated
Bluetooth Low Energy Transceiver for IoT
Applications
18.1 Introduction
M. Babaie
Delft University of Technology, Delft, The Netherlands
e-mail: m.masoud.babaie@ieee.org
S.B. Ferreira
Federal University of Rio Grande Do Sul (UFRGS), Porto Alegre, Brazil
F.-W. Kuo
Taiwan Semiconductor Manufacturing Company (TSMC), Hsinchu, Taiwan
R.B. Staszewski
Delft University of Technology, Delft, The Netherlands
University College Dublin (UCD), Dublin, Ireland
Fig. 18.1 Block diagram of the proposed Bluetooth Low Energy transceiver
[Block labels recovered from the figure: ADPLL+TX with DCO (4.1–5.1 GHz), TDC, class-E/F2 DPA, matching network, T/R switch, RX]
system and circuit techniques are exploited here to enhance the power and area
efficiency of an ULP transceiver: First, the most energy-hungry circuitry, such as
a digitally controlled oscillator (DCO) and an output stage of a power amplifier
(PA), can operate directly at the low voltage of harvesters [12, 13]. Second, a
new switching current source oscillator reduces power and supply voltage without
compromising the robustness of its start-up [14]. Third, thanks to the low wander
of the DCO, digital power consumption of the rest of all-digital PLL (ADPLL)
is saved by scaling the rate of a sampling clock to the point of its complete
stillness [12, 13]. Fourth, a fully integrated differential class-E/F2 switching PA
is utilized to optimize high power-added efficiency (PAE) at a low output power
of 0–3 dBm [13, 15]. Fifth, the RX architecture was derived from realizing that
the best devices and basic building blocks in low-voltage deep-nanoscale CMOS
are logic gates, transistor switches, inverter-like gm transconductors, and metal-
oxide-metal (MOM) capacitors. Hence, the most logical topology would be a
charge-domain switched-capacitor network operating in discrete time. However, to
maximally reduce power consumption, MOS devices would need to be remarkably
small, which would invariably increase their flicker noise corner. To mitigate that,
we propose to increase the RX intermediate frequency (IF) to just beyond the flicker
corner frequency [9, 16]. Sixth, new multistage multi-rate charge-sharing band-pass
filters are adapted to achieve high out-of-band linearity, low noise, and low power
consumption [9, 17]. Last, an integrated on-chip matching network serves both PA
and low-noise transconductance amplifier (LNTA), thus allowing a one-pin direct
antenna connection with no external band-selection filters [9].
The paper is organized as follows: Sect. 18.2 introduces a new RF oscillator
topology that is suitable for ultra-low-voltage/power applications. The ADPLL-based
TX architecture is discussed in Sect. 18.3. The trade-offs between the output
power, matching network insertion loss, drain efficiency, and power-added efficiency (PAE)
of the class-E/F2 PA are investigated in Sect. 18.4. Section 18.5 describes the RF
input/output matching and switching. Section 18.6 details the discrete-time receiver
implementation. In Sect. 18.7, the experimental results are discussed.
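The phase noise (PN) of the traditional oscillator (i.e., class B) with an ideal
current source at an offset frequency Δω from its resonating frequency ω0 can be
expressed as

$$\mathcal{L}(\Delta\omega) = 10\log_{10}\left(\frac{KT}{2\,Q_t^{2}\,\alpha_I\,\alpha_V\,P_{DC}}\,(1+\gamma)\left(\frac{\omega_0}{\Delta\omega}\right)^{2}\right) \qquad (18.1)$$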
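As a rough numerical illustration of (18.1), the sketch below evaluates the expression for a set of hypothetical operating values. The function name, the default noise factor gamma = 1, the 300 K temperature, and all numbers in the usage lines are assumptions chosen for illustration, not values taken from this chapter.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def phase_noise_dbc_hz(q_tank, alpha_i, alpha_v, p_dc_w, f0_hz, offset_hz,
                       gamma=1.0, temp_k=300.0):
    """Evaluate (18.1) in dBc/Hz.

    alpha_i and alpha_v are the current and voltage efficiencies,
    p_dc_w is the DC power in watts. gamma and temp_k are assumed defaults.
    """
    thermal = BOLTZMANN * temp_k / (2.0 * q_tank**2 * alpha_i * alpha_v * p_dc_w)
    return 10.0 * math.log10(thermal * (1.0 + gamma) * (f0_hz / offset_hz) ** 2)


# Hypothetical example: doubling alpha_I at the same P_DC lowers the PN by 3 dB.
base = phase_noise_dbc_hz(10, 2 / math.pi, 0.66, 0.35e-3, 5e9, 1e6)
doubled = phase_noise_dbc_hz(10, 4 / math.pi, 0.66, 0.35e-3, 5e9, 1e6)
print(round(base - doubled, 2))
```

The comparison shows why the current efficiency matters: at a fixed power budget, every doubling of α_I buys about 3 dB of phase noise, or equivalently allows P_DC to be halved for the same phase noise.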
Fig. 18.2 VDDmin, αI, and αV parameters for (a) cross-coupled NMOS and (b) complementary
push-pull oscillators
Figure 18.3 shows an evolution toward the switching current source oscillator. The
OscN topology is chosen as a starting point due to its low VDD capability. To
reduce PDC further, it is desired to switch the direction of the LC tank current
in each half period, which will double αI. Consequently, we propose to split the
fixed current source M1 in Fig. 18.3a into two switchable current sources, M1
and M2, as suggested in Fig. 18.3b. This allows the tank to be disconnected
from the VDD feed and moved in between the upper and lower NMOS transistor
pairs to give rise to an H-bridge configuration. In the next step, the passive voltage
gain blocks, A0, are added to the NMOS gates, as shown in Fig. 18.3c. The upper
and lower NMOS pairs should each individually demonstrate synchronized positive
feedback to realize the switching of the tank current direction. The "master" positive
feedback enforces the differential-mode operation and is realized by the lower-pair
transistors configured in a conventional cross-coupled manner. Since the lower pair
is voltage biased, its negative conductance seen by the tank may be estimated as
$G_{nd} = -0.25\,A_0\,\big(g_{m1}(\phi) + g_{m2}(\phi)\big)$.
On the upper side, the differential-mode oscillation of the tank is reinforced
by the M3,4 devices, which realize the second positive feedback.1 The negative
conductance seen by the tank into the upper pair can be calculated as
$G_{nu} = -0.25\,(A_0 - 1)\,\big(g_{m3}(\phi) + g_{m4}(\phi)\big)$, which clearly indicates that the voltage gain
block is necessary and A0 must be safely larger than 1 to be able to present a
negative conductance to the tank, thus enabling the H-bridge switching. By merging
the redundant voltage gain blocks, the proposed switching current source oscillator
is arrived at in Fig. 18.3d.
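The two conductance estimates can be checked numerically. The sketch below is a minimal illustration that assumes the transconductances are supplied as single effective values in siemens; the function names and the numbers are hypothetical.

```python
def lower_pair_conductance(a0, gm1, gm2):
    """Estimate G_nd seen by the tank from the cross-coupled lower pair."""
    return -0.25 * a0 * (gm1 + gm2)


def upper_pair_conductance(a0, gm3, gm4):
    """Estimate G_nu seen by the tank from the upper pair."""
    return -0.25 * (a0 - 1.0) * (gm3 + gm4)


# Hypothetical effective transconductance of 5 mS per device.
for a0 in (0.8, 1.0, 2.0):
    print(a0, lower_pair_conductance(a0, 5e-3, 5e-3),
          upper_pair_conductance(a0, 5e-3, 5e-3))
```

For A0 below 1 the upper pair presents a positive (lossy) conductance, at A0 = 1 it contributes nothing, and only for A0 above 1 does it help sustain the oscillation, which is the condition stated in the text.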
Figure 18.4 illustrates the proposed oscillator schematic and simulated waveforms
indicating various operational regions of the M1-4 transistors. The two-port

1 It should be noted that the master/slave view is mainly valid from a small-signal standpoint.
Both are equally important when considering the large-signal switching operation.
Figure 18.1 shows a block diagram of the proposed ultra-low-power (ULP) all-
digital PLL (ADPLL), whose architecture is adapted from a high-performance
cellular 4G ADPLL disclosed in [19]. Due to the relaxed PN requirements of BLE,
the DCO dithering [20] was removed thanks to the fine switchable capacitance
of the tracking bank varactors, which produces a fine step size of 4 kHz. The DCO features
two separate tracking banks (TB): (1) phase-error correction and (2) direct FM
modulation. Each bank is segmented with LSB (i.e., 1× 4 kHz) and MSB (i.e.,
8×) unit weights. Each TB range is 4 kHz × (8 + 8 × 64) = 2.08 MHz.
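The tracking-bank range arithmetic can be reproduced in a few lines. This sketch assumes the expression reads as 8 LSB units of weight 1 plus 64 MSB units of weight 8; the function and parameter names are illustrative only.

```python
def tracking_bank_range_hz(lsb_step_hz=4e3, n_lsb_units=8,
                           n_msb_units=64, msb_weight=8):
    """Total tuning range of one tracking bank for the assumed segmentation."""
    return lsb_step_hz * (n_lsb_units + msb_weight * n_msb_units)


print(tracking_bank_range_hz() / 1e6, "MHz")
```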
The DCO clock is divided by two to generate four phases of a variable carrier
clock, CKV0-3, in the Bluetooth frequency range of fV = 2402-2478 MHz. Two of
its phases, CKV0,2, are fed as differential clock signals to the digital PA (DPA). The
four CKV0-3 phases are routed to the phase detection circuitry, which selects the
phase whose rising clock edge is expected to be the closest to the rising clock edge
of a frequency reference (FREF) clock. This prediction is based on the two MSB bits
of the fractional part of the reference phase, RR[k], which is an accumulated frequency
command word (FCW). By means of this prediction, the selected TDC input clock
CKV spans a quarter of the original required TDC range, i.e., TV/4, where TV is
the CKV clock period. This way, the long string of 417 ps/12 ps > 35 TDC inverters
is shortened by 4×, improving the INL and the power consumption by the same
factor.
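The phase prediction and the resulting TDC shortening can be sketched as follows. This is a behavioral illustration only: the mapping from the two MSBs to a particular CKV phase is not given in the text, so the code returns just the quadrant index, and the FCW value in the usage lines (a 2410 MHz carrier with a 40 MHz reference) is a hypothetical example.

```python
import math


def predicted_quadrant(fcw, k):
    """Two MSBs of the fractional part of R_R[k], taken here as k * FCW."""
    frac = (k * fcw) % 1.0
    return int(frac * 4)  # 0..3, i.e., which quarter of T_V the FREF edge falls in


def tdc_inverters(span_ps, inverter_delay_ps=12.0):
    """Number of inverter stages needed to cover a time span."""
    return math.ceil(span_ps / inverter_delay_ps)


fcw = 2410.0 / 40.0  # hypothetical channel and reference, FCW = 60.25
print([predicted_quadrant(fcw, k) for k in range(1, 5)])
print(tdc_inverters(417.0), tdc_inverters(417.0 / 4))
```

With the 12 ps inverter delay quoted above, covering a full 417 ps period takes about 35 stages, whereas covering a quarter period takes about 9, which is where the roughly fourfold saving comes from.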
The TDC output, after decoding, is normalized to TV by the ΔTDC/TV multiplier,
and the quadrant estimation, normalized to TV/4, is added to produce the phase
error φE. The DCO tuning word is updated based on φE. The φE[k] is fed to the
type-II loop filter (LF) with a fourth-order IIR. The LF is dynamically switched during
frequency acquisition to minimize the settling time while keeping phase noise (PN)
at optimum. The built-in DCO gain, KDCO, and TDC gain, KTDC, calibrations are
autonomously performed to ensure the wideband FM response.
The following architectural innovations allow the ADPLL to support ULP
operation (highlighted in blue): The effective sampling rate of the phase detector
and its related DCO update is dynamically controlled by scaling down the FREF
clock and simultaneously adjusting the LF coefficients in order to keep the same
bandwidth and LF transfer function characteristics. During the ADPLL settling,
the full FREF rate is used, but afterward, its rate can be substantially reduced
(e.g., 8×) or completely shut down, thus saving the power consumption of the digital
circuitry. The resulting in-band PN degradation is tolerable due to the low PN of the
DCO. In fact, freezing FREF incurs a sufficiently low frequency drift during
the BLE 376 μs packets while keeping in operation only the bare minimum of circuitry
highlighted in red.
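How the loop-filter coefficients might be adjusted when FREF is scaled down can be illustrated with a simplified model. The sketch below assumes a textbook type-II ADPLL approximation with a proportional gain alpha and an integral gain rho, for which the natural frequency is sqrt(rho)·f_R and the damping factor is alpha/(2·sqrt(rho)). This model, the function names, and the coefficient values are assumptions for illustration; the chapter does not give the actual coefficients, and the fourth-order IIR is ignored here.

```python
import math


def loop_params(alpha, rho, f_ref_hz):
    """Natural frequency (rad/s) and damping factor of the assumed type-II loop."""
    omega_n = math.sqrt(rho) * f_ref_hz
    zeta = alpha / (2.0 * math.sqrt(rho))
    return omega_n, zeta


def rescale_for_fref_division(alpha, rho, n):
    """Coefficients that keep omega_n and zeta unchanged when FREF is divided by n."""
    return alpha * n, rho * n * n


alpha, rho = 2**-5, 2**-12           # hypothetical settled-loop coefficients
print(loop_params(alpha, rho, 40e6))  # full-rate reference
a8, r8 = rescale_for_fref_division(alpha, rho, 8)
print(loop_params(a8, r8, 40e6 / 8))  # reference scaled down by 8x
```

In this simplified view, the proportional gain scales with the division ratio and the integral gain with its square, so the scaled coefficients grow quickly; the achievable division ratio is therefore bounded by the point where the sampled loop is no longer well approximated by the continuous-time model.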
Designing a fully integrated PA optimized for low output power (Pout < 3 dBm)
with PAE > 40% is very challenging, especially when a differential structure is
needed to satisfy the stringent second-harmonic emission requirements. In this work, a fully
integrated differential class-E/F2 PA is exploited to address the aforementioned
challenges.
Figure 18.5a illustrates a general schematic of a transformer-based matching
network of a switched-mode PA, which simultaneously performs m-series (i.e.,
voltage) and p-parallel (i.e., current) combining [21, 22]. As proven in [13], the
equivalent resistance rL seen by the PA switching transistors may be estimated by

$$r_L \approx \frac{p}{m}\left(\frac{k_m}{n}\right)^{2}\frac{Q_s + Q_L/\gamma}{2Q_L + Q_s + Q_L^{2}\,Q_s\,(\gamma-1)^{2}}\;R_L \qquad (18.4)$$
Fig. 18.5 (a) Transformer-based matching network with m-way voltage and p-way current
summation; (b) schematic of the proposed class-E/F2 PA. Equivalent circuits of the PA's matching
network for (c) differential and (d) common-mode excitations
Fig. 18.6 Behavior of a 2:1 step-down transformer in (a) differential-mode and (b) common-mode
excitations
As illustrated in Fig. 18.6, the step-down 2:1 transformer responds differently to
common-mode (CM) and differential-mode (DM) input signals. When the transformer's
primary is excited by a CM signal (Fig. 18.6b), the magnetic flux within
the primary's two turns cancels itself out [24]. Consequently, the transformer's Lp is
negligible, and no current is induced at the transformer's secondary (km-CM ≈ 0).
Hence, RL, Ls, and CL cannot be seen by the even harmonics of the drain current.
Furthermore, the CM inductance, 2LCM, seen by the switching transistors is
mainly determined by the dimensions of the trace between the transformer's center
tap and the decoupling capacitors at the VDD node. Together with Cs, 2LCM realizes a
CM resonance, ωCM. Note that Pout of the class-E PA can be reduced by 2 dB at
the same rL and VDD by means of an additional open circuit acting as the switches'
effective load at 2ω0 (i.e., class-E/F2 operation [25]), as supported by the power
factor, Kp, column in Table 18.2. Consequently, this PA needs a smaller impedance
transformation ratio for Pout < 3 dBm, which results in a lower insertion loss for its
matching network and thus a higher system efficiency. However, in practice, the limited
value of the equivalent parallel resistance of the CM resonance, RCM, leads to a power
loss at the second harmonic and thus a penalty on the PA's efficiency if ωCM is set
at precisely 2ω0. Consequently, in this design, we adjust the CM resonance slightly
lower (i.e., at 1.8ω0) to benefit from the lower Kp of semi-class-E/F2 operation
while avoiding the additional power loss at even harmonics.
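To get a feel for the trace inductance implied by placing the CM resonance at 1.8ω0, the sketch below assumes the simplest possible model, in which 2LCM resonates with a single capacitance Cs, i.e., ωCM = 1/sqrt(2·LCM·Cs). That model, the function names, and the 1 pF value are assumptions for illustration; the chapter does not state the component values.

```python
import math


def cm_resonance_hz(l_cm_h, c_s_f):
    """CM resonant frequency for the assumed model w_CM = 1/sqrt(2*L_CM*C_s)."""
    return 1.0 / (2.0 * math.pi * math.sqrt(2.0 * l_cm_h * c_s_f))


def cm_inductance_for_resonance(c_s_f, f0_hz, ratio=1.8):
    """L_CM that places the CM resonance at ratio * f0 under the same model."""
    w_cm = 2.0 * math.pi * ratio * f0_hz
    return 1.0 / (2.0 * w_cm**2 * c_s_f)


l_cm = cm_inductance_for_resonance(1e-12, 2.44e9)  # hypothetical C_s = 1 pF
print(l_cm, cm_resonance_hz(l_cm, 1e-12) / 2.44e9)
```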
Fig. 18.7 (a) RF input/output block (RFIO) including the first stage of LNTA and the last stage of
class-E/F2 PA; (b) RFIO in the RX mode; (c) RFIO in the TX mode
$$F = 1 + \frac{R_S}{4KT}\,\frac{\overline{V_{n,RX}^{2}}}{|Z_{RX}|^{2}} + \frac{R_S}{4KT}\,\frac{\overline{V_{n,TX}^{2}}}{|Z_{PA,RX}|^{2}} \qquad (18.5)$$

where $\overline{V_{n,RX}^{2}}$ and $Z_{RX}$ are, respectively, the equivalent input noise and input
impedance of the LNTA at the operating frequency ω0. Furthermore, $Z_{PA,RX}$ and $\overline{V_{n,TX}^{2}}$
are, respectively, the output impedance and equivalent output noise of the PA's
matching network. As shown in [9], the contribution of TXMN to the system noise
figure may be estimated by
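$$\frac{\overline{V_{n,TX}^{2}}}{|Z_{PA,RX}|^{2}} = \frac{4KT}{L_s\,\omega\,Q_s}\cdot\frac{\dfrac{Q_s}{Q_p}\,k_m^{2}\,L_p^{2}C_1^{2}\omega^{4} + \left(1 - L_pC_1\omega^{2}\right)^{2}}{\dfrac{1}{Q_s^{2}}\left[1 - L_pC_1\omega^{2}\left(1 + \dfrac{Q_s}{Q_p}\right)\right]^{2} + \left[1 - L_pC_1\omega^{2}\left(1 - k_m^{2}\right)\right]^{2}} \qquad (18.6)$$

It can be shown that (18.6) reaches its minimum at

$$L_pC_1 \approx \frac{1}{\omega_0^{2}}\cdot\frac{Q_p}{Q_p + Q_s} \qquad (18.7)$$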
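To achieve the minimum noise figure penalty, one should tune the C1 switchable
capacitor to roughly satisfy (18.7). The optimum $\overline{V_{n,TX}^{2}}/|Z_{PA,RX}|^{2}$ is then obtained
by inserting (18.7) into (18.6):

$$\left(\frac{\overline{V_{n,TX}^{2}}}{|Z_{PA,RX}|^{2}}\right)_{min} = \frac{4KT}{L_s\,\omega\,Q_s}\cdot\frac{k_m^{2}\,Q_p\,Q_s + Q_s^{2}}{\left(Q_s + k_m^{2}\,Q_p\right)^{2}} \;\xrightarrow{\;Q_p = Q_s\;}\; \left(\frac{\overline{V_{n,TX}^{2}}}{|Z_{PA,RX}|^{2}}\right)_{min} = \frac{4KT}{L_s\,\omega\,Q_s\left(1 + k_m^{2}\right)} \qquad (18.8)$$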
As a result, the noise factor penalty reduces with increasing Qs and km, which
fortunately coincides with efforts to optimize the efficiency of the PA's matching
network [13]. However, a step-down transformer must be employed for the PA's
matching network to scale up the load resistance seen by the PA's transistors in order to
achieve the highest possible efficiency at the relatively low output power of 3 dBm.
This works against the noise factor optimization, as evident from (18.8), and clearly
demonstrates a trade-off between TX efficiency and RX noise factor. The total noise
factor may be estimated by

$$F = 1 + \frac{r_{loss}}{R_s} + \gamma\,g_m R_s\left(\frac{L_1\,\omega_0}{R_s}\right)^{2} + \frac{R_s}{L_s\,\omega\,Q_s\left(1 + k_m^{2}\right)} \qquad (18.9)$$

By considering Ls = 880 pH, Qs = 11, and km = 0.75, the noise factor penalty in
(18.9) can be as low as 0.22.
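The chain from (18.6) to the quoted 0.22 penalty can be verified numerically. The sketch below evaluates the dimensionless part of (18.6) as a function of x = Lp·C1·ω², checks that its minimum sits at the value given by (18.7) and matches (18.8), and then evaluates the last term of (18.9). The operating frequency of 2.44 GHz and the choice Qp = Qs are assumptions; Rs = 50 Ω is read from Fig. 18.7. Function names are illustrative.

```python
import math


def tx_noise_ratio(x, q_p, q_s, k_m):
    """Dimensionless part of (18.6); x = Lp*C1*w^2, prefactor 4KT/(Ls*w*Qs) omitted."""
    num = (q_s / q_p) * k_m**2 * x**2 + (1.0 - x) ** 2
    den = ((1.0 - x * (1.0 + q_s / q_p)) ** 2 / q_s**2
           + (1.0 - x * (1.0 - k_m**2)) ** 2)
    return num / den


def optimum_x(q_p, q_s):
    """Tuning condition (18.7), expressed as x = Lp*C1*w0^2."""
    return q_p / (q_p + q_s)


def min_ratio_closed_form(q_p, q_s, k_m):
    """Dimensionless part of (18.8) before setting Qp = Qs."""
    return (k_m**2 * q_p * q_s + q_s**2) / (q_s + k_m**2 * q_p) ** 2


def nf_penalty(r_s, l_s, f0_hz, q_s, k_m):
    """Last term of (18.9): TX matching network contribution with Qp = Qs."""
    return r_s / (l_s * 2.0 * math.pi * f0_hz * q_s * (1.0 + k_m**2))


q, k = 11.0, 0.75
grid = [i / 10000 for i in range(1000, 9001)]
x_best = min(grid, key=lambda x: tx_noise_ratio(x, q, q, k))
print(x_best, optimum_x(q, q))
print(tx_noise_ratio(optimum_x(q, q), q, q, k), min_ratio_closed_form(q, q, k))
print(round(nf_penalty(50.0, 880e-12, 2.44e9, q, k), 3))
```

The grid search and the closed forms agree, and the penalty evaluates to roughly 0.22 for the stated Ls, Qs, and km, consistent with the figure quoted in the text.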
Now, moving attention to the TX mode, the LNTA's transistor is off, and
consequently, the RX path can be simplified to a series RLC network (RXMN)
as shown in Fig. 18.7d. In this mode, the ultimate goal is to alleviate the side
effects of RXMN on the efficiency of the PA. To analyze this efficiency drop, it
is more convenient to replace the series RLC network with its equivalent parallel
capacitance (CRX) and resistance (RRX), as illustrated in Fig. 18.7d. It can be
shown that

$$R_{RX} = r_{loss}\left[1 + Q_{RX}^{2}\left(\frac{\omega_{RX}}{\omega_0} - \frac{\omega_0}{\omega_{RX}}\right)^{2}\right] \qquad (18.10)$$

and

$$C_{RX} = C_{gs}\,\frac{Q_{RX}^{2}\left[\left(\dfrac{\omega_{RX}}{\omega_0}\right)^{2} - 1\right]}{1 + Q_{RX}^{2}\left(\dfrac{\omega_{RX}}{\omega_0} - \dfrac{\omega_0}{\omega_{RX}}\right)^{2}} \qquad (18.11)$$

where QRX and ωRX are, respectively, the RXMN's quality factor and its resonant
frequency, $1/\sqrt{(L_1 + L_G)\,C_{gs}}$. Due to the power dissipated in RRX, the PA's efficiency
scales down with

$$\eta_{RX} = \frac{R_{RX}}{R_L + R_{RX}}. \qquad (18.12)$$
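The series-to-parallel conversion behind (18.10) and (18.11) can be cross-checked against a direct admittance calculation. The sketch below assumes QRX is defined at the RXMN resonance as ωRX·(L1 + LG)/rloss; the component values are hypothetical and only serve to exercise the formulas.

```python
import math


def rxmn_parallel_equivalent(l_total_h, c_gs_f, r_loss_ohm, f0_hz):
    """Return (R_RX, C_RX) from (18.10) and (18.11); l_total_h = L1 + LG."""
    w0 = 2.0 * math.pi * f0_hz
    w_rx = 1.0 / math.sqrt(l_total_h * c_gs_f)
    q_rx = w_rx * l_total_h / r_loss_ohm  # assumed definition of Q_RX
    detune = w_rx / w0 - w0 / w_rx
    common = 1.0 + q_rx**2 * detune**2
    r_rx = r_loss_ohm * common
    c_rx = c_gs_f * q_rx**2 * ((w_rx / w0) ** 2 - 1.0) / common
    return r_rx, c_rx


def rx_efficiency_factor(r_rx_ohm, r_l_ohm):
    """Efficiency scaling factor of (18.12)."""
    return r_rx_ohm / (r_l_ohm + r_rx_ohm)


# Hypothetical values: L1 + LG = 3 nH, Cgs = 100 fF, r_loss = 5 ohm, f0 = 2.44 GHz.
r_rx, c_rx = rxmn_parallel_equivalent(3e-9, 100e-15, 5.0, 2.44e9)
print(r_rx, c_rx, rx_efficiency_factor(r_rx, 50.0))
```

Because RRX grows with the square of the detuning between ωRX and ω0, pushing the RXMN resonance away from the operating frequency in TX mode raises RRX and brings the efficiency factor closer to 1.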
Recent ULP receivers for BLE achieve significant power reduction [2, 26] and a
higher level of integration [4, 7, 10], primarily using sliding intermediate frequency
(IF) and low-IF continuous-time (CT) architectures. To reduce the RX power
consumption beyond the state of the art, we propose a discrete-time (DT) high-IF or
superheterodyne RX architecture with complex-signaling band-pass filters (BPF)
and a progressively reduced sampling rate.
The front-end section of DT-RX is presented in Fig. 18.8a. It consists of the
narrow-band LNTA, a single-ended-to-differential quadrature sampling mixer, and
a DT 4/4 CS-BPF [16]. The LNTA is composed of two stages (see Fig. 18.8c):
a single-input/single-output common-source cascode LNA and a common-source
transconductance (gm ) amplifier. Both stages operate in moderate inversion as
opposed to a strong inversion operation in prior reports, in order to reduce power
consumption with Id = 400 and 100 μA, respectively. Capacitors Cg and Cd are
4-bit programmable to tune the LNTA input matching network and its tank load
over process, voltage, and temperature (PVT), as well as over package parasitics.
Fig. 18.8 (a, b) Full-rate receiver strip; (c) LNTA schematic; (d) transfer function of 4/4 CS-BPF
also showing windowed integration sampling (WIS)
The LNTA is connected to a 25% quadrature passive mixer, which implicitly
acts as a balun, converting its single-ended input to differential output signals (see
Fig. 18.8b). The passive mixer works in the current mode, which results in a low
quadrature imbalance, low noise, and high linearity [27]. The sampling mixer is
then cascaded with the first complex BPF as shown in Fig. 18.8b. The transfer function
(TF) of the 4/4 CS-BPF in the z-domain, from the charge input Qin(z) to the voltage
output Vout(z), is given by
In this design, the center frequency of the BPFs is adjusted just beyond the
flicker noise corner of active devices (i.e., IF = 5 MHz). Equation (18.14) clearly
demonstrates a trade-off between capacitor size (area), sampling frequency (power),
and linearity. The gain of the first stage is given simply by the product of the effective
transconductance of the LNTA and Req. In this low-power application, the strategy is
to reduce CR as much as possible in order to increase Req and, consequently, the
gain of the first stage. This strategy enables high gain with lower gm values, hence with
a lower current in the transconductors. The increase of the input impedances also
allows for the use of smaller switches (with higher resistances) both in the mixer
and in the filters, with a consequent reduction in the power consumption of the
clock generation.
Figure 18.8d plots the transfer function (TF) of the infinite impulse response (IIR)
filter of the 4/4 CS-BPF (18.13). As expected of any DT filter, the TF reveals repetition
peaks (replicas) at multiples of fs (≈ 9.8 GHz). Repetition peaks are folded to DC,
but not before being attenuated by the windowed integration sampling (WIS) effect of
the current-mode sampling, which creates an inbuilt sinc filter response, also shown
on the plot. The combination of these two effects determines the filtering shape of the
4/4 CS-BPF [28, 29].
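The protection that the WIS sinc response gives against the replicas can be estimated with a short calculation. The sketch below assumes an ideal rectangular integration window whose length equals one sample period, 1/fs, so that the sinc nulls fall on multiples of fs; the window length and the function name are assumptions, while fs and the IF are the values quoted above.

```python
import math


def wis_gain_db(f_hz, window_s):
    """Gain in dB (normalized to DC) of an ideal integrate-over-window sampler."""
    x = math.pi * f_hz * window_s
    if x == 0.0:
        return 0.0
    mag = abs(math.sin(x) / x)
    return -math.inf if mag == 0.0 else 20.0 * math.log10(mag)


F_S = 9.8e9          # full sampling rate quoted in the text
F_IF = 5e6           # intermediate frequency quoted in the text
WINDOW = 1.0 / F_S   # assumption: window equals one sample period

for k in (1, 2, 3):
    # Attenuation of a signal sitting at the k-th replica of the IF passband.
    print(k, round(wis_gain_db(k * F_S + F_IF, WINDOW), 1))
```

Since the replicas sit only an IF away from the sinc nulls, they are attenuated by several tens of dB before folding, while the wanted signal at the IF sees practically no droop.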
The next stage of the receiver is a 4/8 CS-BPF. Its schematic is shown in
Fig. 18.9. It is based on eight rotating capacitors sampled at eight phases with D =
12.5%, which results in a quadrature filter with a higher quality factor (Q = 1.14)
[17]. Its z-domain TF of charge input to voltage output is
Fig. 18.9 4/8 CS-BPF filter schematics and its clock waveforms
The third stage of the receiver is also a 4/8 CS-BPF. However, there is no
further decimation between the second and third filter stages. This avoids
any additional clock generation circuitry, since the power consumption of these
blocks is already very low (around 160 μA in simulations for both 4/8 CS-BPF
clock generation, including buffers). The sufficient front-end filtering provided by
the three-stage CS-BPF allows the IF signal to be digitized directly using a low-power
ADC and the second mixer and baseband filtering to be moved into the digital domain
[17]. Based on BLE requirements, two 9-bit 20 MS/s ADCs would be sufficient to
digitize the IF output signals of the proposed receiver [31].
Figure 18.10 shows the die photo of the proposed transceiver implemented in TSMC
1P9M 28 nm digital CMOS. The total core area, including empty space between the
subblocks, is merely 1.9 mm².
Figure 18.11 displays the phase noise of the proposed oscillator at the lowest and
highest tuning frequencies for VDD = 0.5 and 0.8 V. The measured PN is -111 dBc/Hz
at 1 MHz offset from a 5.1 GHz carrier while consuming 0.35 mW at 0.5 V. As
justified in [13, 32, 33], the 1/f³ PN corner of the oscillator is extremely low (i.e.,
100 kHz) across the tuning range (TR) of 22% (i.e., from 4.1 to 5.1 GHz). Its
average FoM is 189 dBc/Hz and varies by 1 dB across the TR.
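The quoted figures can be cross-checked with the commonly used oscillator figure of merit, FoM = |PN| + 20·log10(f0/Δf) − 10·log10(PDC/1 mW), and the usual definition of the tuning range relative to the center frequency. The chapter does not spell out these definitions, so they are assumptions here, as are the function names.

```python
import math


def oscillator_fom_db(pn_dbc_hz, f0_hz, offset_hz, p_dc_mw):
    """Standard oscillator FoM from phase noise, carrier, offset, and DC power."""
    return (-pn_dbc_hz + 20.0 * math.log10(f0_hz / offset_hz)
            - 10.0 * math.log10(p_dc_mw))


def tuning_range_percent(f_min_hz, f_max_hz):
    """Tuning range relative to the center of the tuning band."""
    return 200.0 * (f_max_hz - f_min_hz) / (f_max_hz + f_min_hz)


print(round(oscillator_fom_db(-111.0, 5.1e9, 1e6, 0.35), 1))
print(round(tuning_range_percent(4.1e9, 5.1e9), 1))
```

The single measurement point at 5.1 GHz lands within about 1 dB of the reported average FoM, and the 4.1 to 5.1 GHz span reproduces the stated 22% tuning range.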
Figure 18.12 plots the measured phase noise at different configurations for
both integer-N and fractional-N BLE channels. When used as an LO at undi-
Fig. 18.10 (a) Die micrograph of the proposed BLE transceiver; (b) its layout with breakdown of
subblock areas
Fig. 18.11 Measured phase noise of the proposed oscillator at (a) the lowest and (b) the highest
frequency
Figure 18.15c shows the BLE receiver packet error rate (PER) versus the input
signal power. The sensitivity is -95 dBm at 30.8% PER. For the OOB blocking
measurement shown in Fig. 18.15d, the desired BLE signal is fixed at channel 12
with an input power of -67 dBm. Both the desired signal and the out-of-band CW
blocker are injected into the receiver. The OOB blocker power is recorded when the
PER reaches 30.8%. The results corroborate the proposed full-rate DT-RX strategy
and show that the receiver is able to tolerate the OOB BLE blocker mask.
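The 30.8% PER threshold can be related to a bit error rate under the assumption of independent bit errors, for which PER = 1 − (1 − BER)^N. The sketch below uses a BER of 0.1% and N = 368 bits (a 37-byte payload packet counted without the preamble); both numbers are assumptions brought in for illustration and are not stated in this chapter.

```python
def packet_error_rate(ber, n_bits):
    """PER for independent bit errors: a packet fails if any bit is wrong."""
    return 1.0 - (1.0 - ber) ** n_bits


# Assumed reference point: BER = 0.1% over 368 bits.
print(round(100.0 * packet_error_rate(1e-3, 368), 1), "%")
```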
Fig. 18.12 Measured transmitter PN in open-loop and different closed-loop configurations for
(a) integer-N and (b) fractional-N channels
Fig. 18.13 Bluetooth GFSK modulation spectrum for modulation index of (a) m = 0.25,
(b) m = 0.5, and (c) burst-mode modulation accuracy
Fig. 18.14 (a) ADPLL settling, (b) oscillator frequency drift, and (c) demodulated TX frequency
for 425 μs BLE packet in the open-loop operation
Fig. 18.15 Summary of the RX measurements: (a) noise figure of the RX at various IF
frequencies; (b) RX filtering characteristics; (c) RX packet error rate across input power; (d) OOB
blocking performance
Table 18.3 summarizes the proposed transceiver and compares it with recent
state-of-the-art BLE designs. It is the first implemented in the 28-nm CMOS node.
It reaches similar RX performance (NF, linearity, and sensitivity) and better TX
performance (max Pout, PLL PN) but at a much lower power consumption, even
better than [5], which uses an off-chip matching network and T/R switch. When
compared with the other two designs with a fully integrated on-chip T/R switch [4, 7],
the power efficiency is over 2× better for both the TX and RX.
Table 18.3 Performance summary and comparison with state-of-the-art transceivers

| Parameter | This work | ISSCC15 IMEC [5] | JSSC15 Dialog [7] | ISSCC15 Renesas [4] | TMTT2013 [10] | CC2640 TI |
|---|---|---|---|---|---|---|
| CMOS technology | 28 nm | 40 nm | 55 nm | 40 nm | 130 nm | N/A |
| Osc PN @ 1 MHz (dBc/Hz) | -116 to -117 | -110 | -111.5 | N/A | -110 | -109 |
| Osc FoM (dB) | 188-189 | 183 | 179 | N/A | 185 | N/A |
| Osc tuning range | 2.05-2.55 GHz (22%) | 25% | 20% | N/A | N/A | N/A |
| PLL in-band PN (dBc/Hz) | -92 @ FREF = 5 MHz; -101 @ FREF = 40 MHz | -90 | N/A | N/A | -100 | N/A |
| Integrated PN (degree) | 1.08 @ FREF = 5 MHz; 0.87 @ FREF = 40 MHz | 1.5 | N/A | N/A | N/A | N/A |
| PLL FoM^a (dB) | -238.65 | -236 | N/A | N/A | N/A | N/A |
| PLL settling time (μs) | 15 | 15 | 15 | N/A | N/A | N/A |
| Reference/fractional spurs (dBc) | -80/-60 | -70/-38 | N/A | N/A | -75/-37 | N/A |
| TX modulation error | 2.70% | 5% | N/A | N/A | N/A | N/A |
| Output power (dBm) | -5 to +3 | -2/1 | -20 to 0 | 0 | 1.6 | -21 to +5 |
| Total PA efficiency | 41% | 25% | 30% | <30% | 26.80% | N/A |
| Strongest harmonic emission | HD3/-47 dBc | HD2/-49 dBc | HD2/-50 dBc | HD3/-48 dBc | HD3/-34 dBc | -46 dBc |
18.8 Conclusions
References
9. Kuo, F.-W., et al.: A Bluetooth Low-Energy (BLE) transceiver with 3.7 mW all-digital
transmitter, 2.75 mW high-IF discrete-time receiver, and TX/RX switchable on-chip matching
network. IEEE J. Solid-State Circuits 52(4), 1144-1162 (2017)
10. Masuch, J., et al.: A 1.1-mW-RX -81.4 dBm sensitivity CMOS transceiver for Bluetooth Low
Energy. IEEE Trans. Microw. Theory Tech. 61(4), 1660-1673 (2013)
11. Bluetooth specification version 4.2. Available: http://www.bluetooth.com (2014)
12. Kuo, F.-W., et al.: A fully integrated 28 nm Bluetooth Low-Energy transmitter with 36% system
efficiency at 3 dBm. In: IEEE European Solid State Circuits Conference, pp. 356-359, Sept
2015
13. Babaie, M., et al.: A fully integrated Bluetooth Low-Energy transmitter in 28-nm CMOS with
36% system efficiency at 3 dBm. IEEE J. Solid-State Circuits 51(7), 1547-1565 (2016)
14. Babaie, M., Shahmohammadi, M., Staszewski, R.B.: A 0.5 V 0.5 mW switching current source
oscillator. In: IEEE RFIC Symposium, pp. 183-186, May 2015
15. Babaie, M., Staszewski, R.B., Galatro, L., Spirito, M.: A wideband 60 GHz class-E/F2 power
amplifier in 40 nm CMOS. In: IEEE RFIC Symposium, pp. 215-218, May 2015
16. Tohidian, M., Madadi, I., Staszewski, R.B.: A fully integrated discrete-time superheterodyne
receiver. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 25(2), 635-647 (2017)
17. Madadi, I., Tohidian, M., Staszewski, R.B.: A high IIP2 SAW-less superheterodyne receiver
with multistage harmonic rejection. IEEE J. Solid-State Circuits 51(2), 332-347 (2016)
18. Babaie, M., Staszewski, R.B.: An ultra-low phase noise class-F2 CMOS oscillator with
191 dBc/Hz FoM and long-term reliability. IEEE J. Solid-State Circuits 50(3), 679-692 (2015)
19. Kuo, F.-W., et al.: A 12 mW all-digital PLL based on class-F DCO for 4G phones in 28 nm
CMOS. In: Proceedings of IEEE Symposium on VLSI Circuits, pp. 1-2, June 2014
20. Staszewski, R.B., et al.: All-digital PLL and transmitter for mobile phones. IEEE J. Solid-State
Circuits 40(12), 2469-2482 (2005)
21. Aoki, I., Kee, S.D., Rutledge, D.B., Hajimiri, A.: Distributed active transformer: a new power-
combining and impedance-transformation technique. IEEE Trans. Microw. Theory Tech. 50(1),
316-331 (2002)
22. Kim, J., et al.: A fully-integrated high-power linear CMOS power amplifier with a parallel-
series combining transformer. IEEE J. Solid-State Circuits 47(3), 599-614 (2012)
23. Babaie, M., Staszewski, R.B.: A study of RF oscillator reliability in nanoscale CMOS. In:
Proceedings of IEEE 21st European Conference on Circuit Theory and Design, pp. 243-246,
Sept 2013
24. Chen, J., et al.: A digitally modulated mm-Wave Cartesian beamforming transmitter with
quadrature spatial combining. In: IEEE International Solid-State Circuits Conference Digest
of Technical Papers, pp. 232-233, Feb 2013
25. Kee, S., Aoki, I., Hajimiri, A., Rutledge, D.: The class-E/F family of switching amplifiers.
IEEE Trans. Microw. Theory Tech. 51(6), 1677-1690 (2003)
26. Selvakumar, A., Zargham, M., Liscidini, A.: Sub-mW current re-use receiver front-end for
wireless sensor network applications. IEEE J. Solid-State Circuits 50(12), 2965-2974 (2015)
27. Mirzaei, A., Darabi, H., Leete, J.C., Chang, Y.: Analysis and optimization of direct-conversion
receivers with 25% duty-cycle current-driven passive mixers. IEEE Trans. Circuits Syst. I:
Regul. Pap. 57(9), 2353-2366 (2010)
28. Staszewski, R.B., et al.: All-digital TX frequency synthesizer and discrete-time receiver for
Bluetooth radio in 130-nm CMOS. IEEE J. Solid-State Circuits 39(12), 2278-2291 (2004)
29. Karvonen, S., Riley, T.A.D., Kurtti, S., Kostamovaara, J.: A quadrature charge-domain sampler
with embedded FIR and IIR filtering functions. IEEE J. Solid-State Circuits 41(2), 507-515
(2006)
30. Bagheri, R., et al.: An 800 MHz to 6 GHz software-defined wireless receiver in 90 nm CMOS.
IEEE J. Solid-State Circuits 41(12), 2860-2876 (2006)
31. Ferreira, S.B., Kuo, F.-W., Babaie, M., Bampi, S., Staszewski, R.B.: System design of a
2.75 mW discrete-time superheterodyne receiver for Bluetooth Low Energy. Submitted for
review to a special issue in IEEE Trans. Microw. Theory Tech. 65(5), 1904-1913 (2017)
32. Shahmohammadi, M., Babaie, M., Staszewski, R.B.: A 1/f noise upconversion reduction
technique applied to class-D and class-F oscillators. In: IEEE International Solid-State Circuits
Conference Digest of Technical Papers, pp. 444-445, Feb 2015
33. Shahmohammadi, M., Babaie, M., Staszewski, R.B.: A 1/f noise upconversion reduction
technique for voltage-biased RF CMOS oscillators. IEEE J. Solid-State Circuits 51(11),
2610-2624 (2016)