
A CMOS imager for embedded systems with integrated, real-time, motion-detection


capabilities and digital output

Conference Paper · October 2008


DOI: 10.1109/ICSENS.2008.4716434





A CMOS imager for embedded systems with
integrated, real-time, motion-detection capabilities
and digital output
Gian Nicola Angotzi and Massimo Barbaro
Dept. of Electrical and Electronic Engineering
University of Cagliari
Cagliari, Italy
Email: g.angotzi,barbaro@diee.unica.it

Abstract: A smart, low-power CMOS imager with pre-processing capabilities, suitable for embedded systems, was realized and successfully tested. The chip hosts an array of 64x64 active pixels for image acquisition and frame storage, a programmable and reconfigurable analog row-processor for parallel spatial and temporal filtering of one image row at a time, and a fully digital communication block for chip configuration and frame grabbing. The row processor is capable of implementing programmable and tunable 2D spatial IIR filters and programmable temporal FIR filters with up to 8 taps. The two computations may be cascaded on-chip in order to extract motion information in real time. The same row-processor can be reconfigured into a parallel set of 64 8-bit A/D converters to achieve fully digital read-out. The chip was fabricated in a 0.35 µm process by AMI Semiconductors, has a size of 6 mm², hosts 120,000 transistors with a static power consumption of 4.7 mW and is capable of a frame rate of 50 frames/sec.

I. INTRODUCTION
A growing interest in embedded systems integrating image processing capabilities (e.g. face/gesture recognition, motion detection, target tracking) is being shown in different application fields such as surveillance, automotive, biometrics and robotics. In such systems, the traditional approach based on acquisition with CCD cameras and hardware/software processing on digital platforms may be ineffective in meeting the constraints of low power, real-time operation and integration. A number of different CMOS imagers have thus been proposed in the last decade, since CMOS technology allows image acquisition capabilities and low-level processing circuits, either digital or analog, to be incorporated on the same device [1]-[3].
In this work we present a reconfigurable CMOS imager for real-time image processing which is intended to be part of a portable system, capable of performing different kinds of spatio-temporal computations on the acquired image. Spatial processing is based on the convolution of the image with a programmable Gabor-like function (a decaying exponential modulated by a cosine), useful for low-level tasks needed for stereo depth estimation [4], texture analysis [5] and segmentation [6]. The temporal filter, which is a programmable FIR filter with up to 8 taps (high-pass, band-pass, low-pass), coupled to the Gabor filter is suited for implementation of motion detection [7] and estimation of motion-in-depth [8]. The chip architecture is flexible enough to allow computation of other algorithms provided they can be described in terms of weighted sums. In all these algorithms, the key point is the possibility of interactively changing the parameters of the kernel (frequency, envelope, gain, phase). A very fast output rate is required to be able to perform multi-scale and multi-frequency filtering of the same image.

II. CHIP ARCHITECTURE
The chip architecture shown in Figure 1 reflects the choice of a semi-parallel processing approach, where a full portion of the image (an entire row) is processed simultaneously, making it possible to meet both real-time and low-area constraints. Acquisition of the image is implemented in the Array, made up of 64x64 active pixel sensors (Figure 1a), each incorporating a photodiode, 2 long-term memories (LTM) needed to store 2 complete frames or 8 reduced frames, 2 output buffers (LineCol and LineRow) and 1 input line (LineIn). Frame storage is needed to compute temporal correlation. All the processing is done in the analog row processor (AP), a vectorial structure made up of 64 identical elements (APU, Analog Processing Units, Figure 1b). A row of the new image or of any previously stored frame can be written in a bank of 64 S&H circuits (short-term memories, STM) and fed to a set of 64 programmable, switched-capacitor weighted adders which implement a feed-forward FIR filter. The weighted adders can also be interconnected with one another to implement a recursive spatial 1D IIR filter. The results of any computation can be directly A/D converted (by reconfiguring the adders into a successive approximation A/D converter) or stored back in the S&H or even in the frame memories.
Notice that all the pixels of the same column, as well as all those of the same row, share the same output channel so that the Array is addressable by rows and by columns. This is needed when 2D spatial filters are required: such filters are implemented by cascading a 1D horizontal filter and a 1D vertical filter as follows: a) a row of the image is read through the LineCol bus, b) the row is stored in the STMs, c) a 1D Gabor-like filter is applied, d) the result is stored back in the LTMs of

1-4244-2581-5/08/$20.00 ©2008 IEEE 274 IEEE SENSORS 2008 Conference


the Array, e) the Array is accessed by columns (LineRow bus) and finally f) the columns are filtered again.
The Digital Control Unit (DCU) is used to produce the correct sequence of operations, to program the shapes of the spatial or temporal filters, to control the A/D conversion, to rearrange the connections in the Switch Matrix so as to properly reconfigure the Analog Processor, and to transmit the digitally converted output off-chip.

Fig. 1. Building blocks of the reconfigurable imager

III. CIRCUITS
A. Pixel
The design of the pixel (whose schematic is depicted in Figure 2) is critical since a number of constraints must be fulfilled. In particular, it is necessary to find a compromise between the area (the Array area grows with N²) and the sensitivity of the photodiode (which increases with its area). The image is acquired by integrating the photocurrent produced by a photodiode (nplus on well) which is cascoded to reduce the parasitic capacitance of the integration node. The cascode transistor acts as an electronic shutter, by modulation of its gate (signal shutt). Since the integration time is programmable, the device can be used in different lighting conditions without compromising real-time operation (as reported in Table I). The need to minimize the area of the pixel imposes the use of minimum-size transistors for the read-out circuitry, causing a reduction of the precision. The Fixed Pattern Noise that affects the images during read-out is easily removed by fully differential processing units, while short-channel effects are reduced using two feedback signals (fbRow and fbCol). The fbRow channel also provides an input path for signals that must be stored in the Long Term Memories. In order to use the same column buffer for the read-out by rows and by columns (see section III-B), signals selRow and selCol as well as signals outRow and outCol are shorted on the diagonal of the Array. For the same reason the signals fbRow and fbCol are shorted too. The Long Term Memories can be accessed in read mode by rows (signals readRowM and readRowP) and by columns (signals readColM and readColP), while in write mode they can be accessed only by rows (signals writeM and writeP). Due to non-idealities (charge redistribution and clock feedthrough), each time an LTM is accessed in read mode its data is affected by an error inversely proportional to the size of the memory capacitor. To save area the LTMs are implemented by gate capacitors (with a saving of around 25% with respect to a poly-poly2 implementation). The area of the Pixel is 24.7 × 24.7 µm² with a fill factor of about 30%.

TABLE I
INTEGRATION TIME FOR DIFFERENT CONDITIONS OF LIGHTING

Lighting condition     Lux       Photocurrent   Integration time
Strong sunlight        100000    200 pA         300 µs
Full daylight          10000     20 pA          150 µs
Normal office light    500       1 pA           3 ms

Fig. 2. Schematic of the Pixel

B. Column buffer
The schematic of the Column Buffer is depicted in Figure 3. It is made up of a level shifter insensitive to the body effect. To reduce the area, only the portion inside the dashed square is embedded inside the pixel, while the rest is shared by the pixels of the same column (see section III-A). To improve the precision, the Column Buffer is locally biased (transistors M0 to M4) and a unity-gain buffer is used to match the drain-source voltages of transistors M7 and M8. The unity-gain buffer is a Miller OpAmp with a DC gain of 72 dB, GBW = 10 MHz and SR = 1 V/µs. The transistors M9 to M11 are used to reset the high parasitic capacitances at the nodes
X and Y in order to reduce the time needed for each reading.

Fig. 3. Column buffer and feedback signal

C. Switch Matrix
The reconfigurability of the device, i.e. the capability of performing different algorithms, is delegated to the Switch Matrix, whose schematic is depicted in Figure 4. A number of signal distribution lines allow interaction among neighbouring columns, while a network of switches (Topology selection) provides the right connection scheme depending on the desired algorithm. Thanks to the fully differential architecture, the sign of each input can be digitally programmed simply by swapping the inputs via a multiplexer (Sign selection). Moreover, the Switch Matrix must correctly route the data from/to the Array to/from the Analog Processor.

Fig. 4. Switch Matrix

D. Programmable Adder
In order to save area and power consumption, the programmable adder and the ADC are integrated into the same switched-capacitor circuit, whose schematic is depicted in Figure 5. The adoption of a fully differential architecture is motivated by power-supply noise rejection. The operational amplifier is a two-stage Miller OpAmp. The gain-bandwidth product of the OpAmp is GBW = 10 MHz while the DC open-loop gain is A0 = 70 dB.

Fig. 5. Integrated programmable adder and ADC

The operating mode of this module is set by signals selSum (adder mode) and selADC (converter mode). When the circuit works in adder mode, two phases are needed to compute simultaneously the weighted sum of five inputs. During the reset phase (F1 and F1a high and F2 low) the OpAmp is connected in unity-gain feedback and the capacitors are put in parallel. During the amplification phase (F1 and F1a low and F2 high) the weighted sum of the inputs is computed, the weights being set by the ratios Ci/Cf (i = 0...4). Since each capacitor is a 7-bit Programmable Capacitor Array (PCA), the weighted sum is programmable too and the user can digitally set the shape of the kernel. When the circuit works in converter mode (signal selADC), two phases are needed to produce the whole 8-bit digital word. During the first phase the OpAmp is connected in unity-gain feedback (signal F1a high) and the total feedback capacitor (bits bf[6:0] all equal to one) is used to sample (signal sample) and hold (signal hold) the input signal that comes from the STM. During the second phase the well-known bit-cycling algorithm is implemented. In this phase the OpAmp is in the open-loop configuration, working as a comparator; the Miller capacitor is disabled and only the first stage is used, to speed up settling of the output. The bits produced at successive approximations are stored in a bank of dynamic flip-flops and are used to digitally program the feedback PCA. Once the whole digital word has been produced, it remains stored in the bank of registers and can be read out while the circuit processes a new row/column.
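The two operating modes of the programmable adder can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the authors' circuit or code: the function names weighted_sum and sar_convert are hypothetical, the model is ideal (no charge injection, finite OpAmp gain or capacitor mismatch), and the mapping of a differential input range [-vref, +vref) onto an unsigned 8-bit code is an assumption, since the paper does not specify the output coding.

```python
from typing import Sequence

PCA_BITS = 7                    # each capacitor is a 7-bit programmable array
PCA_MAX = (1 << PCA_BITS) - 1   # largest capacitor code (127 unit capacitors)


def weighted_sum(vin: Sequence[float], c_codes: Sequence[int],
                 signs: Sequence[int], cf_code: int = PCA_MAX) -> float:
    """Ideal adder-mode output: sum over i of sign_i * (C_i / C_f) * vin_i."""
    if not (len(vin) == len(c_codes) == len(signs) == 5):
        raise ValueError("the adder combines exactly five inputs")
    if not 1 <= cf_code <= PCA_MAX:
        raise ValueError("feedback capacitor code must be in 1..127")
    if any(not 0 <= c <= PCA_MAX for c in c_codes):
        raise ValueError("input capacitor codes must be in 0..127")
    if any(s not in (-1, 1) for s in signs):
        raise ValueError("signs must be +1 or -1 (differential input swap)")
    # Each weight is the ratio of an input capacitor to the feedback capacitor.
    return sum(s * c / cf_code * v for v, c, s in zip(vin, c_codes, signs))


def sar_convert(v: float, vref: float = 1.0, bits: int = 8) -> int:
    """Ideal bit-cycling (successive approximation) conversion.

    Assumption: v in [-vref, +vref) maps to an unsigned code in 0..2**bits - 1.
    Inputs outside that range saturate at the end codes.
    """
    lsb = 2.0 * vref / (1 << bits)
    code = 0
    for bit in reversed(range(bits)):        # MSB first
        trial = code | (1 << bit)
        # Comparator decision: keep the bit if the trial level
        # does not exceed the held input.
        if -vref + trial * lsb <= v:
            code = trial
    return code


if __name__ == "__main__":
    # Symmetric 5-tap kernel applied to a flat input, then digitized.
    y = weighted_sum([0.5] * 5, [32] * 5, [1, -1, 1, -1, 1], cf_code=64)
    print(y, sar_convert(y))
```

With the default arguments, sar_convert(0.0) returns the mid-scale code 128, and an input at or below -vref returns 0. In this model the kernel shape is changed only by rewriting the capacitor codes and signs, which mirrors the digital programmability described in the text.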
IV. EXPERIMENTAL RESULTS
The chip was designed and realized in a 0.35 µm process from AMI Semiconductors and its layout is shown in Figure 6 (due to metal fill, the microphotograph does not show many details). The pitch of the columns is given by the size of the pixel, which is 24.7 µm, and determines the size of each unit (APU) in the Analog Processor.

Fig. 6. Layout of the chip (sensor array and scanning circuitry, Row Processor, Digital Control)

Figure 7 shows the outputs of the Analog Processor: 7(a) original image injected for test purposes in the STMs, 7(b) read-out of the image through the embedded A/D converters, 7(c) test image filtered with the spatial Gabor-like filter. It should be noted that there is a background noise (the vertical stripes), probably introduced by the mismatch of the two capacitors of the column S&H (the signal is differential so 2 capacitors are needed). This is due to the fact that the size of such capacitors is probably too small. By contrast, the convolution does not add further background noise.

Fig. 7. Test of the Analog Processor with an injected image: (a) test image, (b) readout and A/D conversion, (c) spatial filtering

Figure 8 shows the output of the chip with internal acquisition. In this case, all the blocks of the device were exercised, from the Array to the Analog Processor. Figures 8(a) and 8(b) show the acquisition and convolution of a simple image with 2 vertical lines on a white background, while Figures 8(c) and 8(d) show a human face. In this case, the noise is higher and includes the portion of noise introduced by the pixel. It should be noted, again, that these are preliminary results obtained from a small number of prototypes and the device is still under test, since a number of parameters can be optimized (and digitally set) to reduce the effect of such imperfections.

Fig. 8. Images acquired and processed by the chip: (a) test image, (b) readout and A/D conversion, (c) spatial filtering, (d) spatial filtering

The spatial Gabor filter is computed in less than 3 µs, while the temporal filter requires more time since the long-term memories in the Array have to be accessed several times to get the previous samples. Static power consumption is 6 mW at 3.3 V for the 64 × 64 version.

V. CONCLUSION
In conclusion, we conceived, designed and successfully realized a reconfigurable CMOS imager for spatio-temporal image processing with fully digital output. The versatility and programmability of the Analog Processor allow different vision algorithms to be implemented on the same chip, provided they can be accomplished by means of weighted sums of different pixels (convolutions). The device is currently under further tests to increase its sensitivity by proper choice of the programmable parameters.

ACKNOWLEDGMENT
The authors would like to thank Giovanni Busonera, Daniela Loi, Alessandra Caboni, Paolo Meloni and Giamarco Angius for helping with the test setup and Xilinx for hardware donation.

REFERENCES
[1] K. Boahen and A. Andreou, “A 590000 transistor, 48000 pixel contrast sensitive, edge-enhancing CMOS imager silicon retina,” Proceedings of the 16th Conference on Advanced Research in VLSI, pp. 225–240, 1995.
[2] M. Barbaro, P.-Y. Burgi, A. Mortara, P. Nussbaum, and F. Heitger, “A 100×100 pixel silicon retina for gradient extraction with steering filter capabilities and temporal output coding,” IEEE J. Solid-State Circuits, vol. 37, no. 2, pp. 160–172, 2002.
[3] L. Massari, M. Gottardi, L. Gonzo, D. Stoppa, and A. Simoni, “A CMOS image sensor with programmable pixel-level analog processing,” IEEE Trans. Neural Networks, vol. 16, no. 6, pp. 1673–1684, 2005.
[4] T. Sanger, “Stereo disparity computation using Gabor filters,” Biol. Cybern., vol. 59, pp. 405–418, 1988.
[5] D. Dunn and W. Higgins, “Optimal Gabor filters for texture segmentation,” IEEE Transactions on Image Processing, vol. 4, pp. 947–964, July 1995.
[6] J. Van Demeeter and J. Du Buf, “Simultaneous detection of lines and edges using compound Gabor filters,” International Journal of Pattern recognition and analysis, vol. 14, p. 757, 2000.
[7] D. Clausi and M. Jernigan, “A general model for visual motion detection,” Proceedings of the 9th International Conference on Neural Information Processing (ICONIP-’02), vol. 1, 2002.
[8] S. Sabatini, F. Solari, P. Cavalleri, and G. Bisio, “Phase-based binocular perception of motion-in-depth: cortical-like operators and analog VLSI architectures,” EURASIP Journal of Applied Signal Processing, vol. 7, pp. 690–702, 2003.
