
International Conference on Innovations in Power and Advanced Computing Technologies [i-PACT2017]

Human Detector and Counter Using Raspberry Pi Microcontroller
Shubham Mathur1, Balaji Subramanian2, Sanyam Jain3, Kajal Choudhary4
EEE1, 2, 3-SELECT, CSE-SCOPE4
VIT University, Vellore, Tamil Nadu.
Email: Shubham.mathur15@gmail.com1, balajis95@gmail.com2, jainsanyam2@gmail.com3, choudhary.kajal19@gmail.com4

Dr. Rama Prabha D
Associate Professor-SELECT,
VIT University, Vellore, Tamil Nadu
Email: dramaprabha@vit.ac.in

Abstract: This paper presents a digital image processing technique that applies the histogram of oriented gradients (HOG) feature descriptor through the OpenCV library, coded in the high-level programming language Python and run on a Raspberry Pi microcontroller fitted with a RaspiCam that captures images of moving objects passing under it. The project uses image samples in top-view to set predefined models containing a great number of motion variations for the identification of humans entering a room through a door or gate. A pair of Passive Infra-Red (PIR) sensors instructs the system to capture images of incoming or outgoing objects that cross it. This method of image detection combined with sensor feedback is used along with the ability to send data via Bluetooth to local servers for security or record purposes.

Keywords: HOG, Human Detection, Image Processing, PIR, Raspberry Pi (RPi), Support Vector Machine (SVM), Single Board Computer (SBC)

I. INTRODUCTION

The detection of humans by automated systems using image processing has presented numerous promising avenues in the field of technology [1-2]. Counting of humans using image processing has found applications in security, crowd monitoring and automated attendance systems in multinational firms. Image processing has proven advantageous over many other methods such as multi-sensing, thermal sensing and heart-beat sensing. Simple sensors cannot differentiate between humans and other objects without an accurate and dedicated data acquisition system, which gives image processing its relative advantage in detecting objects and humans.

The HOG feature descriptor, in contrast to other descriptors, involves a comparison of local intensity gradients or edge directions [3-5]. This reduces the complexity of pointing out the positions of the object on the grid, as it only compares the normalized forms of the gradients of the image with the normalized forms of the gradients of the samples uploaded in the system. The descriptor is used in OpenCV, which guides the process of image comparison [6-7]. Image processing is done in top-view because the variety of human gesture patterns that can be present is greatly reduced compared to frontal-view [8]. Some of the most popular sensors used in detection applications, such as thermal sensors, suffer a penalty while differentiating between humans and objects that have temperatures close or similar to that of a human body. The latest 3-D depth technology sensors such as the Kinect sensor [9], though foolproof, are very expensive, with an MRP in the range of Rs. 9,490 to Rs. 11,490 (USD 141.59 to 171.43).

To counter these disadvantages, Passive Infra-Red (PIR) sensors have been used in the detection of humans. They are connected as an input to an RPi-2 microcontroller unit which processes the images of the object entering the room, captured by a RaspiCam connected to the RPi-2.

A pair of PIR sensors is used to send instructions to the Raspberry Pi to start taking pictures of the object that has just tripped it. A PIR sensor comes under the category of pyroelectric devices [10]. This type of sensor detects infrared radiation and senses motion that occurs in zones lateral to the sensor. A PIR sensor can detect precisely up to 40 feet, which is sufficient for detecting motion through a door. As its name suggests, the PIR sensor detects infrared radiation, which has a wavelength in the range 700 nm to 300 µm. It generates a temporary potential when there is an increase or decrease in infrared radiation. The major advantage of the PIR sensor is that it is shielded from false triggers caused by sudden changes in air speed, which gives it an

978-1-5090-5682-8 /17/$31.00 ©2017 IEEE

obvious edge over ultrasonic and other sensors [11]. For this purpose a Fresnel lens is placed over the sensor element, which additionally provides uniform sensitivity and extends its field of view. PIR sensors are suitable for standalone systems because of their low power consumption [12]. Human detection and counting is presented as the primary objective of this research paper.

II. PREVIOUS WORK AND COMPARISON

Studies and work conducted on human detection techniques using electronic systems have proven useful on many occasions whenever the segregation of human beings from other objects became essential, not only for security purposes and entry authorization into a meeting hall, workplace or building, but also for purposes such as research to determine the density of humans in a particular area or region. This section mentions a few of the papers relevant to the topic at hand [13-22]. Viola et al. [13] presented an optimal and fast method of face detection using a classifier built with the AdaBoost learning algorithm, which concentrates on a small number of critical visual features of a human face for detection. Jesorsky et al. [14] implemented an edge detection technique on gray scale images of human faces involving a face localization technique which is adaptive in all textures and backgrounds, thereby ensuring an improved accuracy in detecting human faces. Another method of human detection, which uses the spatial layout of a body part's appearance and trains on these images using AdaBoost, is presented in Mikolajczyk et al. [15]. Detection of a human body part by training on a thousand images to locate specific features present in an image for localization and identification might not necessarily provide the target accuracy, because the varying shapes and sizes of human body parts make it unsuitable for such an endeavor. Face detection in particular has its own disadvantages pertaining to varying face shapes, different feature orientations and image occlusion. A more viable option is to use still images in top-view and perform image localization using feature descriptors such as HOG. Object details in top-view are very specific, referring to the head and shoulder alignment of individuals, which presents a more suitable and proportionate method for detecting human beings. Moreover, the large reduction in the number of features exhibited by the human body in this orientation makes it a better solution for the problems observed in [13-15]. This project has been conducted in top-view using a RaspiCam to take pictures of objects that pass underneath and simultaneously process the images captured by relaying them to the Raspberry Pi microprocessor.

Numerous feature descriptors have been created for image processing applications. Detection of human faces using two variants of the Discriminative Local Binary Pattern (LBP) descriptor has been described by Mu, Yadong et al. in [16]. In yet another work on a trainable system for object detection, Papageorgiou et al. [17] use intensity differences between adjacent cells in an image representation efficiently computable as a Haar wavelet transform. Use of Haar classifiers for object detection can be found in many works such as Reinius, Staffan [18], in which its application has been extended to iOS applications. The major disadvantage of the Haar wavelet transform is that it is not a continuous waveform and also not differentiable, and hence cannot technically be used for optimal object detection techniques. The LBP descriptor takes into consideration the pixels in 8 directions surrounding a single cell and calculates their gradients in each of these respective directions, assigning each gradient value as positive or negative, i.e. 1 and 0. This method of converting gradient values into 1s and 0s suffers a major disadvantage, as it does not consider the increase or decrease in gradient values surrounding a particular cell, leading to arbitrary conclusions on the nature of the image. The HOG classifier used in this project considers the magnitudes of the gradients instead of using binary digits, which gives a clear edge and definition to the image captured and hence overcomes the disadvantage of the LBP classifier. The Haar cascade, on the other hand, is primarily a texture based classifier and hence not useful for the detection of orientations of the gradients. A comparison between the results rendered by the three classifiers on highways has been studied in Cruz et al. [19]. An efficient method of human detection in image and video processing using HOG with SVM is presented in Pang, Yanwei et al. [20].

Works related to human detection and counting have been published using various systems suiting the need and environment. A related work on the counting of crowded moving objects using a parallelized version of the KLT tracker has been published by Rabaud et al. [21]. The cited work also presents the relative problems involved in detection and tracking of moving objects. These disadvantages are tackled with the use of top-view image capturing and subsequent processing by the RPi. A similar approach using the RPi for an image processing application can be found in Fernandes et al. [22], which also throws light on the cost efficiency and low power consumption of the device.

III. IMAGE PROCESSING USING OPENCV IN PYTHON

OpenCV (Open Source Computer Vision) is a library that can be used from several programming languages such as Python, C and Java. It contains optimized image processing tools. One of

the most important aspects of OpenCV is that it is platform independent, which enables its use on a large variety of hardware such as the RPi. Using OpenCV in Python boosts its abilities by incorporating NumPy (Numerical Python). In image processing, images are handled as large 3-D arrays, and NumPy serves as a robust tool for numerical array computations [23-24]. For human detection, we used the HOG algorithm. HOG is used for feature description in computer vision and image processing for the purpose of object detection. The principle of the algorithm is that it counts occurrences of gradient orientation in localized zones of an image. It is computed on an image treated as a dense grid divided into uniformly spaced cells, and it uses overlapping local contrast normalization for improved accuracy. In HOG, a shape and its appearance in an image are defined by the distribution of intensity gradients or edge directions. The image is divided into small connected regions or zones called cells, and a histogram of intensity gradients is computed for each of the cells and stored in the form of a matrix. The descriptor is then formed by combining (concatenating) these histograms. For better accuracy, the local histograms can be contrast-normalized by calculating a measure of the intensity across a connected cluster of cells, called a block, and then using this value to normalize all cells within the block. Normalization of the cells removes noise from the images to some extent. In some cases it may decrease the sharpness of the images. It usually results in better invariance to changes in illumination and shadowing. After the histograms of the divided cells are calculated, an SVM is trained on them. The task of the SVM is to test various images for the presence of the desired object and declare the occurrence of the object in the test image. The SVM feeds on sample images classified as positive samples and negative samples. Positive samples, Fig. 1 (C, D), are those that have the desired object in them, whereas negative samples, Fig. 1 (A, B), do not contain the object. The SVM works on the comparison of the two types of samples. Collected samples should be diverse in complexity and number of objects present.

Fig. 1. Samples for HOG feature descriptor: (A, B) Negative; (C, D) Positive

Other algorithms also exist for object detection, such as Haar cascades and background subtraction. Haar cascades are well suited to cases where features are well defined and specific, such as face detection. In real time situations, the features of a human figure are not well specified; they tend to vary a lot according to the pose or exposure of the human body to the camera. In such cases HOG helps best. HOG was applied to the top-view (alignment of head and shoulders) of the human body as shown in Fig. 2.

Fig. 2. Images taken by RaspiCam and detection of Humans by HOG algorithm run by a python script in Raspberry Pi

Fig. 3. An Illustration of the Human Counter detecting a Person passing under it.
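The cell histogram and block normalization steps described in this section can be sketched in plain Python. This is only an illustrative sketch and not the OpenCV implementation used in the project: the function names, the choice of 9 unsigned orientation bins, the central-difference gradient and the L2 block norm are assumptions made for the example, and votes go to a single bin without interpolation.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram of one cell given as a 2-D list of gray values.

    Assumed for illustration: unsigned gradients (0 to 180 degrees) and
    central differences on the interior pixels of the cell only.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    rows, cols = len(cell), len(cell[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]  # horizontal difference
            gy = cell[r + 1][c] - cell[r - 1][c]  # vertical difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for one bin, weighted by gradient magnitude.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def normalize_block(histograms, eps=1e-6):
    """Concatenate the cell histograms of one block and L2-normalize them."""
    block = [value for hist in histograms for value in hist]
    norm = math.sqrt(sum(value * value for value in block) + eps * eps)
    return [value / norm for value in block]


# Two synthetic 4x4 cells: intensity rising left to right, and top to bottom.
vertical_edge = [[10 * c for c in range(4)] for _ in range(4)]
horizontal_edge = [[10 * r for _ in range(4)] for r in range(4)]

descriptor = normalize_block(
    [cell_histogram(vertical_edge), cell_histogram(horizontal_edge)]
)
```

Weighting each vote by gradient magnitude, rather than reducing it to a 1 or 0, is the property that Section II contrasts with the LBP descriptor.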


Fig. 4. Block Diagram Describing the Process Flow of the project.

IV. REAL TIME IMPLEMENTATION

As per the process flow, when a person or thing enters the field of the sensors as in Fig. 3, it triggers the DIP unit programmed in the RPi to take images from a camera mounted on top of the door and process them to detect human presence [25-27]. If a human is detected by the DIP unit, it checks the status of the other sensor. Triggering of the other sensor signifies the entrance or exit of a person, and the counter is updated accordingly. If a human is not detected by the DIP algorithm or the other sensor is not triggered, the process is directed to the beginning as given in Fig. 4. All the data of the image processing that is ongoing in the RPi is retrieved on another system using serial communication and a Bluetooth module (HC-05) interfaced with the RPi serial port. This information was very useful for performance assessment of the Python script and the DIP algorithm.

The schematic diagram of the hardware connections is shown in Fig. 5. The USART pins are connected to the Bluetooth module [28-29]. A universal synchronous and asynchronous receiver and transmitter (USART) is used to communicate between devices over a serial communication protocol. Relays can be used in the circuit for controlling the power supply of the room as per the counter [30]. The status LED denotes the state of the DIP algorithm. If the LED is glowing, it signifies that the DIP algorithm is currently non-operational. Apart from the pins currently used in the circuit schematic, other pins can be programmed to control a larger number of relay switches.

Fig. 5. Schematic Diagram of the Pin Connections of RPi
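The decision step of the process flow in Fig. 4 can be sketched as a small Python function. This is a hedged illustration rather than the project's script: the name `update_count`, the sensor labels "outer" and "inner", the rule that the first-triggered sensor decides the direction, and the clamp at zero are all assumptions introduced for the example.

```python
def update_count(count, first_sensor, human_detected, other_sensor_triggered):
    """Return the new head count after one trigger event.

    first_sensor is "outer" or "inner", hypothetical labels for the PIR pair
    that say which sensor tripped first.
    """
    if not human_detected or not other_sensor_triggered:
        # No human in the image, or the second sensor never fired:
        # the flow returns to the start and the count is unchanged.
        return count
    if first_sensor == "outer":
        return count + 1          # assumed: outer then inner means entry
    if first_sensor == "inner":
        return max(count - 1, 0)  # assumed: inner then outer means exit
    raise ValueError("unknown sensor label: " + repr(first_sensor))


# Example run over a few made-up events.
events = [
    ("outer", True, True),   # person enters
    ("outer", False, True),  # non-human object, ignored
    ("inner", True, True),   # person leaves
    ("outer", True, False),  # second sensor not triggered, ignored
]
count = 0
for first_sensor, human, other in events:
    count = update_count(count, first_sensor, human, other)
```

Keeping the decision in a pure function separates it from the GPIO and camera code, so the counting rule can be checked without hardware attached.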


For expanding the number of relays, other boards such as the Arduino Mega 2560 can be used with the RPi over the I2C communication protocol in a master-slave association. The Mega 2560 has an edge over other available microcontroller boards as it has 52 digital pins that can be directly interfaced with magnetic relays. It also has 3 serial communication ports that can be used to further extend the length of the master-slave chain.

A small setup with a box and two sensors is placed on a gateway. The box contains the RPi with the RPi camera, a Bluetooth module (HC-05) and a power bank (to power the RPi). This box is mounted at a height of approximately 8-10 feet. The sensors placed on either side of the gate should be 1-1.5 meters apart.

V. INNOVATION INVOLVED

The major innovation involved is the incorporation of image processing in the human detection algorithm, which is already discussed in the introduction. The second major innovation is the use of top-view in HOG.

Top-view gives more accuracy in human detection because the possible alignment combinations of the head with the shoulders are fewer than the alignment combinations of the full body seen from the front or back. Face detection is far better than human body detection in terms of accuracy, but it poses a situational drawback in the placement of the camera. If face detection is used, the camera module has to be placed in front of the door, but places for mounting a camera are not always available in front of a door. Mounting the camera in this case requires a separate stand or hanger from the roof placed in front of the door, which will disrupt the gateway. If the camera is placed away from the gate, then a high power scope has to be used, which will incur further cost. Face detection imparts a delay and uncertainty to the system, as the person entering the room has to face the camera in-line for proper detection. Face detection also has the limitation that the gate should be transparent for faces to be visible. Top-view eliminates all such limitations with ease.

The third major advantage of this project is that it is cost efficient, as it costs only 6500/- INR along with an NRE cost of 1200/-. The low cost design enables this project to be implemented at a larger scale.

VI. EXPERIMENTAL RESULTS

The project was implemented on a lab gate and 100 people were passed through it. The total number of people detected was recorded and plotted against the total number of people passed. This resulted in a curve lagging slightly behind a straight line, where the straight line signifies an ideal system that is fully efficient. The recorded data was analyzed by linear regression using the least squares method to determine the overall efficiency of the system, which came to 83%, this being just a scaled-down model.

Fig. 6. Graph of an analysis performed on the no. of people detected while entering the Room.

In the least squares analysis, the number of people passed (X) and the corresponding values of the number of people detected (Y) are tabulated in Fig. 6. An approximate straight line is derived from the data using (1) to (3) as:

$$\hat{Y} = b_0 + b_1 X \tag{1}$$

$$b_1 = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} \tag{2}$$
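The least squares fit used in this section, with the slope as in (2) and the intercept as in (3), can be sketched in Python as follows. The function name `least_squares_fit` is hypothetical, and the sample values are made up purely to exercise the code; they are not the measurements recorded in the experiment.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1) of the line Y = b0 + b1 * X fitted by least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denominator = sum_xx - sum_x ** 2 / n
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b1 = (sum_xy - sum_x * sum_y / n) / denominator  # slope, as in (2)
    b0 = (sum_y - b1 * sum_x) / n                    # intercept, as in (3)
    return b0, b1


# Illustrative values only (not the paper's data): people passed vs detected.
passed = [10, 20, 30, 40]
detected = [8, 17, 25, 33]
intercept, slope = least_squares_fit(passed, detected)
```

Under this reading, a slope close to 1 would correspond to the ideal straight line of a fully efficient system.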


$$b_0 = \frac{\sum y - b_1 \sum x}{n} \tag{3}$$

The slope ($b_1$), as in (2), of the approximated line signifies the overall efficiency of the system. Fig. 7 shows the efficiency curve, which was plotted from the data obtained experimentally.

Fig. 7. The Efficiency Curve Plotted by Real-Time Experimentation

VII. CONCLUSION AND FUTURE ASPECTS

This project is a new approach to human detection and counting. Combining image processing with conventional methods helped in avoiding false detections (non-human detection) and led to an efficiency of 83%, as this was just a scaled-down model. This efficiency can be further increased to a level of nearly 96%. Using the RPi for soft computation averted the use of desktop CPUs and reduced the power consumption and cost of the system. The very simple design results in easy installation. Internal processing of the machine is observable through serial communication by the Bluetooth module. Future research directions involve the replacement of the PIR sensors by high range LOS (Line of Sight) sensors, such as IR sensors with lasers, or proximity sensors, to eliminate solid angle triggering and decrease the triggering area. To boost the performance and speed of the system, more powerful hardware can be used, such as the RPi-3 (released in February 2016), which has a 4×ARM Cortex-A53, 1.2 GHz processor. Further improvement can be made by using a higher resolution camera to increase the clarity and quality of the images captured. To increase the efficiency of the HOG algorithm, a larger number of samples can be used to train the SVM. Alerting systems, such as message alerts or alarms if someone enters a facility during a restricted time period, can also be incorporated with the supply automation.

VIII. REFERENCES

[1] Hou, Ya-Li, and Grantham KH Pang. "People counting and human detection in a challenging situation." Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions, vol. 41.1, 2011, pp. 24-33.

[2] Zhou, Jianpeng, and Jack Hoang. "Real time robust human detection and tracking system." Computer Vision and Pattern Recognition-Workshops, 2005. CVPR Workshops. IEEE Computer Society Conference on. IEEE, 2005.

[3] Dalal, Navneet, Bill Triggs, and Cordelia Schmid. "Human detection using oriented histograms of flow and appearance." Computer Vision-ECCV 2006. Springer Berlin Heidelberg, 2006, pp. 428-441.

[4] Zhu, Qiang, et al. "Fast human detection using a cascade of histograms of oriented gradients." Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference, IEEE vol. 2, 2006.

[5] Zhang, Junge, et al. "Boosted local structured hog-lbp for object localization." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.

[6] http://docs.opencv.org/3.0.0/d6/d00/tutorial_py_root.html

[7] Bradski, Gary, and Adrian Kaehler. Learning OpenCV: Computer vision with the OpenCV library. O'Reilly Media, Inc., 2008.

[8] Rauter, Michael. "Reliable human detection and tracking in top-view depth images." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2013.

[9] Xia, Lu, Chia-Chih Chen, and Jake K. Aggarwal. "Human detection using depth information by kinect." Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on. IEEE, 2011.

[10] Leslie Hodges, "Ultrasonic and Passive Infrared Sensor integration for dual technology user detection sensors", unpublished.

[11] Gearhart, Chris, et al. "Use of ultrasonic sensors in the development of an Electronic Travel Aid." Sensors Applications Symposium, 2009. SAS 2009. IEEE. IEEE, 2009.

[12] Tsai, Cheng-Hung, et al. "PIR-sensor-based lighting device with ultra-low standby power consumption." Consumer Electronics, IEEE Transactions, vol. 57.3, 2011, pp. 1157-1164.

[13] Viola, Paul, and Michael J. Jones. "Robust real-time face detection." International journal of computer vision, vol. 57.2, 2004, pp. 137-154.

[14] Jesorsky, Oliver, Klaus J. Kirchberg, and Robert W. Frischholz. "Robust face detection using the hausdorff distance." International Conference on Audio- and Video-Based Biometric Person Authentication. Springer Berlin Heidelberg, 2001.

[15] Mikolajczyk, Krystian, Cordelia Schmid, and Andrew Zisserman. "Human detection based on a probabilistic assembly of robust part detectors." European Conference on Computer Vision. Springer Berlin Heidelberg, 2004.

[16] Mu, Yadong, et al. "Discriminative local binary patterns for human detection in personal album." Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on. IEEE, 2008.

[17] Papageorgiou, Constantine, and Tomaso Poggio. "A trainable system for object detection." International Journal of Computer Vision, vol. 38.1, 2000, pp. 15-33.

[18] Reinius, Staffan. "Object recognition using the OpenCV Haar cascade-classifier on the iOS platform", 2013.

[19] Cruz, Juliano EC, Elcio H. Shiguemori, and Lamartine NF Guimarães. "Concrete and Asphalt Runway Detection in High Resolution Images Using LBP Cascade Classifier." 2013 BRICS Congress on Computational Intelligence and 11th Brazilian Congress on Computational Intelligence. IEEE, 2013.

[20] Pang, Yanwei, et al. "Efficient HOG human detection." Signal Processing vol. 91.4, 2011, pp. 773-781.

[21] Rabaud, Vincent, and Serge Belongie. "Counting crowded moving objects." 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), IEEE, vol. 1, 2006.

[22] Fernandes, Steven Lawrence, and Josemin G. Bala. "Low Power Affordable and Efficient Face Detection in the Presence of Various Noises and Blurring Effects on a Single-Board Computer." Emerging ICT for Bridging the Future-Proceedings of the 49th Annual Convention of the Computer Society of India (CSI), Springer International Publishing vol. 1, 2015.

[23] Van Der Walt, Stefan, S. Chris Colbert, and Gael Varoquaux. "The NumPy array: a structure for efficient numerical computation." Computing in Science & Engineering, vol. 13.2, 2011, pp. 22-30.

[24] Oliphant, Travis E. A guide to NumPy. vol. 1, USA, Trelgol Publishing, 2006.

[25] Moghavvemi, M., and Lu Chin Seng. "Pyroelectric infrared sensor for intruder detection." TENCON 2004. 2004 IEEE Region 10 Conference, IEEE, vol. 500, 2004.

[26] Song, Byunghun, Haksoo Choi, and Hyung Su Lee. "Surveillance tracking system using passive infrared motion sensors in wireless sensor network." Information Networking, 2008. ICOIN 2008. International Conference on. IEEE, 2008.

[27] Benezeth, Yannick, et al. "Towards a sensor for detecting human presence and characterizing activity." Energy and Buildings, vol. 43.2, 2011, pp. 305-314.

[28] Agarwal, Nidhi, and S. R. N. Reddy. "Design & development of daughter board for Raspberry Pi to support Bluetooth communication using UART." Computing, Communication & Automation (ICCCA), 2015 International Conference on. IEEE, 2015.

[29] Dnyanoba, Birajdar Amar, B. Nagajayanthi, and Prakash Ramachandran. "Development of an Embedded System to Track the Movement of Bluetooth Devices based on RSSI." Indian Journal of Science and Technology, vol. 8.19, 2015.

[30] Pott, Vincent, et al. "Mechanical computing redux: relays for integrated circuit applications." Proceedings of the IEEE, vol. 98.12, 2010, pp. 2076-2094.
