by
Andres Felipe Echeverri Guevara
2011
Committee:
Acknowledgments
I would like to thank all the people who helped me in the development
of this thesis, and especially Dr. Bishop, who gave me the opportunity to work
on this amazing project and learn from him. I would also like to thank
Marquette University, which made me feel welcome.
Escuela de Ingeniería de Antioquia, 2011
Table of Contents

Acknowledgments
Abstract
List of Tables
List of Figures

Chapter 1. Introduction
Chapter 2. Preliminaries
  2.1 Problem Context and Characterization
  2.2 Problem Definition
  2.3 Project's Objectives
    2.3.1 General Objective
    2.3.2 Specific Objectives
Chapter 3. Methodological Design
Chapter 4. Project's Development
Chapter 5. Conclusions and Results
Appendices
  Appendix A. IT XSP Block Diagram
  Appendix B. Connectors Diagram
  Appendix C. Mechanical Diagram
List of Tables
List of Figures
Chapter 1
Introduction
Robots allow computers to interact with the physical world in order to fulfill certain tasks. These tasks depend on
the purpose for which the robots were created; for example, surveillance, medical
applications, or exploration. To carry out these tasks, robots need to
be equipped with accurate actuators and sensors. In most cases, it is essential
to build a map using the sensors. In addition, the robot needs to be able
to estimate its location. Finally, it should be able to reason
about the task it is following and its perception of the environment, in order
to accomplish the task in the most efficient and safe way.
Mapping, localization and path planning have become actively discussed topics in the robotics field. In 1983, at the CMU Mobile Robot
Laboratory, the first approach to map-building theory was formulated. The map
was built with several ultrasound sensors, an inaccurate kind of sensor
with a narrow field of view. However, it was the most used sensor at the time
and worked surprisingly well in the first implementation of sonar navigation
in crowded rooms. Currently, sensors such as LIDARs and stereoscopic
cameras are quite accurate, have a wide field of view, and are capable of measuring
long distances. However, noise is still present in these sensors and they
are not affordable for the general public, which is why this research is being conducted: to build a map with affordable sensors. It is still quite
difficult to represent a map in 3D due to the computational expense, and it
would not be useful for navigation purposes. Nonetheless, the development
of a 3D map is on the way.
Let us define and take a careful look at the mapping, localization and path planning processes.

Mapping: Mapping is the task of collecting data and measurements
from one or several sensors to build the map representation. There
are many ways to represent a map. Two of the main types are topological maps, which are based on features of the environment, and
metric maps, which geometrically represent the environment and its
surroundings. The mapping technique can be applied in 3D or in 2D,
which is used more often. However, all of these representations deal
with odometry noise and uncertainty from the sensors. As the
robot moves on the map, it should estimate its pose and correct the
error.

Localization: Addresses the use of sensors to estimate the robot's
position in the environment. As with the mapping process, localization
suffers from the same problem: it deals with odometric uncertainty. Sometimes, the robot has trouble with localization because it is hard
Chapter 2
Preliminaries
2.1 Problem Context and Characterization

camera uses an IEEE 1394 port. Another company, VIDERE Design, has
been developing a system which integrates monocular and binocular lenses
for use in industrial systems and robotics.
Microsoft developed the Kinect sensor with the company RARE, which
provides the software, and PRIME SENSE, an Israeli company in
charge of the hardware.

The Kinect sensor's features include an RGB camera and a depth sensor. The depth sensor consists of an infrared laser projector combined with
a monochrome CMOS sensor, which captures 3D video data even in the
absence of ambient light.

It took ten years to develop the Kinect's technology. The initial use
was for video game consoles, but over the past year there have been developments in the field of robotics, and it has been integrated with other sensors
to obtain data and further develop control.
2.2
Problem Definition
Is it possible to use the inexpensive Kinect sensor in applications like
2.3
2.3.1
Project's Objectives
General objective:
2.3.2
Specific Objectives:
To choose the correct drivers for the Kinect, which make it possible to
hook it up to a computer.
To convert the raw data into data that is understandable for the processor
and the user.
To measure whether the behavior of the Kinect is reliable.
Chapter 3
Methodological Design
3.1
Type of investigation
3.2
Research Method
3.3 Information Recollection Techniques and Instruments
Recollection, systematization and study of information from the Internet, as well as from databases, journals and books.
Implementation of the USB communication using the drivers.
Convert the pixels of the image into distances for the navigation.
Compare the distance obtained with the sensor and the real distance.
Development of the Occupancy Grid Map algorithm.
Development of bibliographic work: Write the process which was followed and the results from the research.
Activities:
- Look for information and related works.
- Look for examples of use and different libraries which integrate the sensor.
- Look for the specifications of the sensor.
- Test and run some examples with different programming consoles and languages.
- Select one of the proven programming languages.
- Transform the data obtained from the sensor into understandable information (distance).
- Distance testing.
- Convert the distance into (x, y) coordinates.
- Implement the Occupancy Grid Map algorithm.
- Try the algorithm in several environments.
- Make adjustments to the algorithm.
- Research and select equipment which supports the chosen language and the Kinect sensor.
- Attach and try the Kinect with the selected embedded system.
- Build several maps and test whether they look reliable.

Each activity was carried out under one of three methodologies: recompilation and analysis of information, development of data-acquisition software, and reliability experiments.
Chapter 4
Project's Development
4.1
0.6 meters and the maximum range is between 4 and 5 meters; it depends
only on how much error the application can handle. The Hokuyo URG-04LX-UG01 works between 0.06 and 4 meters with a 1% error,
the UTM-30LX works between 0.1 and 30 meters, and the XV-11 LIDAR
works between 0.2 and 6 meters. Thus, the Kinect will have the same
problem that even the more expensive LIDARs on the market have in terms of
seeing the end of a long hallway. But the biggest problem
to consider is that the Kinect has a very close blind spot. However, robots
rarely carry only one sensor, and this problem is easy to solve with other
kinds of range sensors, such as ultrasound or infrared sensors.

Table 4.1 shows a benchmark of different sensors on the market.

Since one of the subjects considered in this thesis is the use of an
inexpensive sensor, the Kinect was chosen as the main sensor of the
robot; however, the algorithm presented in this thesis can be used
11
Table 4.1: Benchmark of range sensors

                 Kinect        URG-04LX         UTM-30LX                SICK LMS100
Power Source     12VDC         5VDC             5VDC                    10.8VDC-30VDC
Detection Range  0.5m to 6m    0.02m to 4m      0.1m to 30m (max 60m)   20m
Scan Angle       57°           240°             360°                    270°
Interface        USB 2.0       USB 2.0, RS232   USB 2.0                 Ethernet, RS232
Frequency        30Hz          10Hz             40Hz                    50Hz-20Hz
Cost             $150          $2375            $5590                   $5600
with any sensor on the market that provides depth information.
4.1.1
Stereoscopic System
The stereo vision technique is mainly inspired by the human visual
\[ z = \frac{bf}{d} \tag{4.1} \]
The Kinect returns a raw disparity that is not normalized in this way;
that is to say, a zero Kinect disparity does not correspond to infinite distance. The Kinect disparity is related to a normalized disparity by the relation:

\[ d = \frac{1}{8}\left(d_{off} - r_{dis}\right) \tag{4.2} \]

\[ z = \frac{bf}{\frac{1}{8}\left(d_{off} - r_{dis}\right)} \tag{4.3} \]

Typical values for these terms are b = 580, f = 0.075 and d_{off} = 1090,
where r_{dis} is the raw disparity returned by the Kinect, in 1/8 pixel units.
Figure 4.4: Plotting some data from the Kinect, with equation 4.3
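The conversion in equations 4.2 and 4.3 can be sketched in code. The thesis implementation is a LabVIEW VI; the Python function below is only an illustrative sketch, and its name and default parameter values are taken from the typical calibration values quoted above.

```python
def kinect_depth(rdis, b=580.0, f=0.075, doff=1090.0):
    """Convert a raw Kinect disparity value to a depth z in meters.

    Implements z = b*f / ((1/8) * (doff - rdis)), i.e. equations 4.2
    and 4.3, with b, f and doff set to the typical values quoted above.
    """
    d = (doff - rdis) / 8.0          # normalized disparity, equation 4.2
    if d <= 0:
        return float("inf")          # rdis >= doff: no valid return
    return b * f / d                 # equation 4.3
```

Larger raw disparities map to larger depths, and a raw disparity equal to d_{off} corresponds to an out-of-range reading.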
(4.4)
With the fitting tool it was possible to find the sum of squares due to error
(SSE) and the R-square value (R²): SSE = 0.05007, which is very close to 0,
and R² = 0.9995, which is close to 1, meaning that the curve fits the data
almost exactly.

Figure 4.6: The black curve is the original and the red curve is the approximation; it almost overlaps the entire curve
To find the angle of each pixel it is important to know the field of
view and how many pixels are in the perceptual range. The Kinect has
a horizontal field of view of 57° and a vertical field of view of 43°. The
depth sensor works in the range of 0.5m to around 6m. The image size is
640 × 480 at 30 frames per second.
It is good to keep in mind that the origin is in the left corner. Since
we just want to work with one slice of the image, row number 240 was
taken. After this we are working with just one row: each position in the
row represents an angle in the image, and each position carries the
depth information. First of all, we need to convert the raw data into distance;
then the angle of each pixel follows from the field of view (fov) and the
number of pixels in the row (nop):

\[ \theta = \mathrm{pixel}(i)\, \frac{fov}{nop} \tag{4.5} \]
\[ x = z \cos(\theta) \tag{4.6} \]

\[ y = z \sin(\theta) \tag{4.7} \]

where z is the distance obtained from transforming the raw data, and
\theta is the angle at which each pixel is positioned.
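Equations 4.5 through 4.7, applied to the selected image row, can be sketched as follows. This is an illustrative Python version (the thesis implementation is a LabVIEW VI, and the names below are assumptions), using the 57° field of view and 640-pixel row width given above.

```python
import math

FOV = math.radians(57.0)   # horizontal field of view of the Kinect
NOP = 640                  # number of pixels in one image row

def row_to_cartesian(depths):
    """Convert one row of depth values (meters) into (x, y) points.

    Pixel 0 is at the left edge of the image, so pixel i sees the
    angle theta = i * FOV / NOP (equation 4.5); x and y then follow
    from equations 4.6 and 4.7.
    """
    points = []
    for i, z in enumerate(depths):
        theta = i * FOV / NOP          # equation 4.5
        x = z * math.cos(theta)        # equation 4.6
        y = z * math.sin(theta)        # equation 4.7
        points.append((x, y))
    return points
```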
4.1.2 Occupancy Grid Map
For all of these grid cells mx,y, the problem becomes a binary
problem with a static state. Binary problems are addressed using the odds
ratio. The odds ratio of a state is defined as the ratio of the probability of an
event to the probability of its negation.
Let m_{x,y} be each cell in the map m, with < x, y >
the position of the cell in the map. Let z_1 \dots z_T be the distance or range
measurements from time 1 until time T, along with the position (which is assumed to be
known). The main objective of the occupancy grid map is to determine the
probability of each cell m_{x,y} given the measurements z.
Applying Bayes' rule, with the static-state assumption
p(z_t \mid m_{x,y}, z_1,\dots,z_{t-1}) = p(z_t \mid m_{x,y}), and then applying
Bayes' rule once more to p(z_t \mid m_{x,y}):

\[ p(m_{x,y} \mid z_1,\dots,z_t) = \frac{p(z_t \mid m_{x,y})\, p(m_{x,y} \mid z_1,\dots,z_{t-1})}{p(z_t \mid z_1,\dots,z_{t-1})} \tag{4.8} \]

\[ p(m_{x,y} \mid z_1,\dots,z_t) = \frac{p(m_{x,y} \mid z_t)\, p(z_t)\, p(m_{x,y} \mid z_1,\dots,z_{t-1})}{p(m_{x,y})\, p(z_t \mid z_1,\dots,z_{t-1})} \tag{4.9} \]

This equation gives us the probability that the cell m_{x,y} is occupied
given a distance z.

This result leads us in a similar way to its negation, for when a
cell is free:

\[ p(\neg m_{x,y} \mid z_1,\dots,z_t) = \frac{p(\neg m_{x,y} \mid z_t)\, p(z_t)\, p(\neg m_{x,y} \mid z_1,\dots,z_{t-1})}{p(\neg m_{x,y})\, p(z_t \mid z_1,\dots,z_{t-1})} \tag{4.10} \]

This equation gives us the probability that the cell m_{x,y} is free given a
distance z.

In order to avoid some terms that are difficult to compute, it is better to
divide equation (4.9) by equation (4.10); that way it is easier to
get the odds ratio:
\[ \frac{p(m_{x,y} \mid z_1,\dots,z_t)}{1 - p(m_{x,y} \mid z_1,\dots,z_t)} = \frac{p(m_{x,y} \mid z_t)}{1 - p(m_{x,y} \mid z_t)} \cdot \frac{1 - p(m_{x,y})}{p(m_{x,y})} \cdot \frac{p(m_{x,y} \mid z_1,\dots,z_{t-1})}{1 - p(m_{x,y} \mid z_1,\dots,z_{t-1})} \tag{4.11} \]

Taking the logarithm of both sides:

\[ \log \frac{p(m_{x,y} \mid z_1,\dots,z_t)}{1 - p(m_{x,y} \mid z_1,\dots,z_t)} = \log\!\left( \frac{p(m_{x,y} \mid z_t)}{1 - p(m_{x,y} \mid z_t)} \cdot \frac{1 - p(m_{x,y})}{p(m_{x,y})} \cdot \frac{p(m_{x,y} \mid z_1,\dots,z_{t-1})}{1 - p(m_{x,y} \mid z_1,\dots,z_{t-1})} \right) \tag{4.13} \]
Expanding the logarithm into a sum:

\[ \log \frac{p(m_{x,y} \mid z_1,\dots,z_t)}{1 - p(m_{x,y} \mid z_1,\dots,z_t)} = \log \frac{p(m_{x,y} \mid z_t)}{1 - p(m_{x,y} \mid z_t)} + \log \frac{1 - p(m_{x,y})}{p(m_{x,y})} + \log \frac{p(m_{x,y} \mid z_1,\dots,z_{t-1})}{1 - p(m_{x,y} \mid z_1,\dots,z_{t-1})} \tag{4.14} \]
Writing \ell^t_{x,y} for the log odds of cell m_{x,y} at time t, the recursion becomes

\[ \ell^t_{x,y} = \underbrace{\log \frac{p(m_{x,y} \mid z_t)}{1 - p(m_{x,y} \mid z_t)} + \log \frac{1 - p(m_{x,y})}{p(m_{x,y})}}_{\text{Sensor Model}} + \ell^{t-1}_{x,y} \tag{4.15} \]

with the prior

\[ \ell^0_{x,y} = \log \frac{p(m_{x,y})}{1 - p(m_{x,y})} \tag{4.16} \]

\[ \ell^t_{x,y} = \underbrace{\log p(m_{x,y} \mid z_t) - \log[1 - p(m_{x,y} \mid z_t)] + \log[1 - p(m_{x,y})] - \log p(m_{x,y})}_{\text{Sensor Model}} + \ell^{t-1}_{x,y} \tag{4.17} \]
Now we have the value \ell^t_{x,y} for each cell in the field of view of the
sensor, but we need to express the cell as a binary value.
Since we know

\[ \ell^t_{x,y} = \log \frac{p(m_{x,y} \mid z_1,\dots,z_t)}{1 - p(m_{x,y} \mid z_1,\dots,z_t)} \tag{4.18} \]

\[ e^{\ell^t_{x,y}} = \frac{p(m_{x,y} \mid z_1,\dots,z_t)}{1 - p(m_{x,y} \mid z_1,\dots,z_t)} \tag{4.19} \]

solving for the probability step by step:

\[ e^{\ell^t_{x,y}} \left[1 - p(m_{x,y} \mid z_1,\dots,z_t)\right] = p(m_{x,y} \mid z_1,\dots,z_t) \tag{4.20} \]

\[ e^{\ell^t_{x,y}} - e^{\ell^t_{x,y}}\, p(m_{x,y} \mid z_1,\dots,z_t) = p(m_{x,y} \mid z_1,\dots,z_t) \tag{4.21} \]

\[ e^{\ell^t_{x,y}} = p(m_{x,y} \mid z_1,\dots,z_t) + p(m_{x,y} \mid z_1,\dots,z_t)\, e^{\ell^t_{x,y}} \tag{4.22} \]

\[ e^{\ell^t_{x,y}} = p(m_{x,y} \mid z_1,\dots,z_t)\left(1 + e^{\ell^t_{x,y}}\right) \tag{4.23} \]

\[ p(m_{x,y} \mid z_1,\dots,z_t) = \frac{e^{\ell^t_{x,y}}}{1 + e^{\ell^t_{x,y}}} \tag{4.24} \]

\[ p(m_{x,y} \mid z_1,\dots,z_t) = 1 - \frac{1}{1 + e^{\ell^t_{x,y}}} \tag{4.25} \]
The log odds representation is brilliant: \ell^t_{x,y} can take any value
between -\infty and \infty, avoiding numerical truncation for probabilities
close to 0 and 1. No matter how high or how low the value of \ell^t_{x,y}
becomes, the recovered probability remains between 0 and 1, in most cases
very close to one of them, giving us an essentially binary state.

Figure 4.9: Recovering the probability from the log odds ratio

If we look carefully at Figure 4.9, the recovery of the probability from
the log odds ratio is asymptotic at 0 and 1. To limit the computational
expense, it is better to establish a threshold at around 7 and -7, since
every sensor measurement that hits an obstacle increases or decreases the
value of \ell^t_{x,y}, making it ever larger or smaller. The occupancy grid map
algorithm is used mostly in static environments, but if we set this constraint
it can also be used in dynamic environments: it will take less time for
p(m_{x,y} \mid z_t) to converge to a value of 0 or 1 when an obstacle is
removed from, or introduced into, the map.
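The per-cell update of equation 4.17, together with the ±7 clamping threshold just described and the probability recovery of equation 4.24, can be sketched for a single cell as follows. This is an illustrative Python sketch, not the thesis' LabVIEW code; the prior is taken as 0.5 so that its log-odds term vanishes.

```python
import math

L0 = 0.0                     # prior log odds, log(0.5 / 0.5)
L_MIN, L_MAX = -7.0, 7.0     # clamping thresholds discussed above

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))

def update_cell(l_prev, p_occ_given_z):
    """One step of equation 4.17 for a single cell: add the inverse
    sensor model in log-odds form, subtract the prior, and clamp the
    result to [-7, 7] so dynamic obstacles can still be forgotten."""
    l = logit(p_occ_given_z) - L0 + l_prev
    return max(L_MIN, min(L_MAX, l))

def probability(l):
    """Recover p(m | z_1..t) from the log odds, equation 4.24."""
    return math.exp(l) / (1.0 + math.exp(l))
```

Repeated hits on the same cell drive the log odds toward the clamp rather than toward infinity, which is exactly the constraint that makes the map usable in dynamic environments.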
4.2
Sensor Model
Equation 4.17 is one of the most important equations in the entire
algorithm. It suggests that only two probabilities are needed to
implement occupancy grid maps. The first is the probability p(m_{x,y}),
the prior value for occupancy. It is usually set to a value between 0.2
and 0.5, depending on how crowded the environment is. It helps the cells
converge to 0 or 1, but it is most often set to 0.5, so that
the term \log \frac{1 - p(m_{x,y})}{p(m_{x,y})} is equal to zero.
The value which needs special attention is the term p(m_{x,y}|z). This
probability is conditioned on the range measurement of the sensor; this
term relies on the Sensor Model. Studying the term carefully, it is easy
to infer that the occupancy grid map does not address the neighboring-cell
problem. This leads to an overlapping-cells problem, which is discussed in
section 4.4.

The Sensor Model is usually independent of the current map and
can be stored in tables, but there are several sensor models that can be
used for different purposes.

In the literature there are numerous versions of this probability, for
sensors such as LIDARs, ultrasound sensors, cameras and infrared sensors. Usually, the sensor model for a range sensor assigns a high
probability to a grid cell where an obstacle is hit, and a low probability to a grid
cell in its absence. The sensor model can be built by hand
or learned from sensor data.
To insert the values into the grid for a particular measurement z, each grid
cell that falls within the field of view of the sensor needs to be updated with
the value of the sensor model at its position. To update each grid cell, the
common principle applied is the ray tracing or ray casting method.
4.2.1
Ray Casting
The ray casting method was originally conceived for image
rendering. The method consists of casting a ray, or line, traced
from a known initial point to a known end point. The end point is the obstacle
at distance z; knowing these two points, it is possible to draw a
straight line.

The equation of a line for ray tracing is shown below:

\[ y(z_1,\dots,z_t) = mz + b \tag{4.26} \]

\[ m = \frac{y_2 - y_1}{x_2 - x_1} \tag{4.27} \]

\[ b = y(z_1,\dots,z_t) - mz \tag{4.28} \]
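A common way to discretize this line over grid cells is Bresenham's line algorithm, which visits every cell the ray crosses between the sensor cell and the obstacle cell. The sketch below is illustrative Python, not the thesis' LabVIEW implementation.

```python
def cast_ray(x0, y0, x1, y1):
    """Bresenham's line between grid cells (x0, y0) and (x1, y1).

    Returns the list of cells the ray crosses, in order; every cell
    but the last is observed free, and the last cell holds the
    obstacle measured at range z."""
    cells = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        cells.append((x, y))
        if (x, y) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:            # step in x
            err -= dy
            x += sx
        if e2 < dx:             # step in y
            err += dx
            y += sy
    return cells
```

Each returned cell can then be updated with the sensor-model value for its position along the ray.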
from the sensor, and it only considers the values of 0 (for free space) and 1
(for occupied space); this Sensor Model is represented in Figure 2.2.
4.2.3 Real Sensor Model

account the uncertainty of the sensor due to the distance; however, this
sensor model can make objects appear thicker.
(a) Curve of the New Real Sensor Model. (b) Ray Casting of the New Real Sensor Model.
4.3
Why Probabilities
All the algorithms used in the literature have a common feature: almost
all of them are probabilistic. They employ probabilistic methods
to model the robot and its surroundings, and almost all of them turn sensor
measurements into maps. Even some techniques in the literature that do not
appear probabilistic at first sight can be interpreted probabilistically under
the correct assumptions.

But what makes probability such an important tool in robotics? The
key to this answer is sensor noise. The noise from the sensors is complex
and is not easy to accommodate. Probability provides models for the different
sources of noise and their effect on the measurement.
4.4
The Occupancy Grid Map algorithm does not take into account neighboring
cells. The problem presents itself at most of the corners of an environment.

This method was tested with the Ideal Sensor Model; since this
model only takes the values free or occupied, it is easier to compare
against the prior value of the cell. One solution to this problem is shown in this
thesis.
(a) Good. (b) Bad.
The main goal of the code is to avoid overlap problems with grid
cells that are already occupied, by not overwriting them with a white (free)
one. If a cell is not occupied and it is in the field of view, it is because the
cell lies between the obstacle and the sensor, so it takes a low probability or
a free value; and if the cell is at the end of the line, it takes the occupied
value.
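That rule can be sketched as a small update routine. This is an illustrative Python sketch, not the thesis code: the grid is assumed to be a dictionary from cell coordinates to the Ideal Sensor Model values 0 and 1.

```python
FREE, OCCUPIED = 0, 1

def apply_ray(grid, ray_cells):
    """Write one Ideal Sensor Model ray into the grid while avoiding
    the overlap problem: a cell already marked OCCUPIED is never
    overwritten with FREE, and only the end point of the ray (the
    obstacle) is marked OCCUPIED."""
    *free_cells, end = ray_cells
    for cell in free_cells:
        if grid.get(cell) != OCCUPIED:   # keep previously seen obstacles
            grid[cell] = FREE
        # else: leave the occupied cell alone instead of whitening it
    grid[end] = OCCUPIED
    return grid
```

With this guard, a longer ray passing through a previously detected corner cell no longer erases it.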
4.5 Choosing an Embedded Computer

An embedded computer needs to be chosen to attach to the robot. The
computer chosen is from the IT XSP family, which best suits the requirements.
Table 4.2: Different IT XSP variants and features

Variant    Clock    P-ATA  S-ATA  USB    SDIO
Plus       1.1GHz          X      X      X
Basic      1.1GHz   X             X(1)   X(2)
Standard   1.1GHz          X      X
Plus       1.6GHz          X      X      X
Basic      1.6GHz   X      X      X(1)   X(2)
Standard   1.6GHz          X      X

Notes:
1. Only two USB ports available
2. Only one microSD socket available
4.5.1 Getting Started with the IT XSP
Chapter 5
Conclusions and Results
With the data acquisition of the Kinect's depth, it was shown that the
noise in the sensor does not follow a Gaussian curve, using the Lilliefors
test in Matlab: the null hypothesis that the samples come from a normal
distribution was rejected at a randomly selected distance. However, the
noise is assumed to be Gaussian for simplicity of representation. In addition,
the algorithm will eventually need to be fused with the Extended Kalman
Filter, which assumes Gaussian noise from the sensors.

It is clear in Figure 5.1 that the distance data does not have a normal
distribution shape.
Even though noise is present, the measurements are quite consistent. A data acquisition run was recorded every 0.1m, from 0.5m to
5.5m, with 500 samples taken at each distance. This interval was chosen
at the beginning to model the distance from the raw data. The data
analyses were made every 1m. The mean μ, the standard deviation σ, and
some other statistical data were computed and are shown in Table 5.1.
Table 5.1: Statistical Data

Distance   μ        σ        Var           Error (Distance − μ)
1m         0.9976   0.0022   4.7562×10^-6   0.0024
2m         1.9975   0.0077   5.988×10^-5    0.0025
3m         3.0090   0.0154   2.3826×10^-4  -0.009
4m         3.9770   0.0229   5.2501×10^-4   0.023
5m         5.0015   0.0355   0.0013        -0.0015
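The quantities in Table 5.1 can be reproduced from the raw samples with a short routine. The thesis computed them in Matlab; the Python function below is an illustrative sketch with assumed names.

```python
import math

def depth_statistics(samples, true_distance):
    """Summary statistics used in Table 5.1 for one test distance:
    mean, standard deviation, variance, and error = distance - mean."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n   # population variance
    sigma = math.sqrt(var)
    return mean, sigma, var, true_distance - mean
```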
Figure 5.2: Depth Image, the Null Band is on the right side
the accuracy were taken into account. The Ideal Model was selected, since
it works more efficiently when used on the embedded computer and
generates better results; additionally, once the map is built there is no
significant difference compared to the Real Sensor Model.
Figure 5.6: For loop positioned outside of the main loop; it is used to find
the angles

The performance and memory tools allow one to find the bottleneck
in the VI by checking which of the Sub-VIs take the largest amount
of time to run. Improving the code in these Sub-VIs will speed up the
program.

Loading viewers in the Front Panel will increase memory use. Do
not load unnecessary viewers: the dataflow between the Front Panel and the
Block Diagram will increase significantly, consuming memory resources.

Using the In Place Element structure will increase the VI's memory
efficiency. The In Place Element structure operates on data elements
Figure 5.7: Using the In Place Element structure to rotate the Cartesian coordinates

in the same memory location and returns those elements to the same location in the array, cluster, variant, or waveform.
Use real-time loops if possible; they will improve the efficiency and
timing of your program. Dataflow programming does not execute as
sequential lines of text: the order of execution is determined by the flow of
data between nodes, which makes some operations parallel. Additionally,
LabVIEW makes it easy to assign thread priorities with the Timed Loop
structure.
Currently, the map is built and stored in a 100 by 100 matrix; in this
space it can hold a map of an area of 100 m². There is a way of building
bigger maps, which is not considered in this thesis.

The code was tested on the IT XSP, where it took 500 milliseconds
to process the entire field of view; after reducing the size of the matrix to
50 cells it reached a time of 200 ms. The computer runs Windows XP and
has LabVIEW 2010 installed. One advantage of having Windows on this
5.1
Future Work
The development of a 3D map is underway; however, it will be
computationally expensive. For a laptop this is not a problem, but the robot will
need a good on-board computer to run all the algorithms. The map can
be used for localization and path planning purposes, and with
any kind of mobile robot. The area restriction was not considered a
problem, since the map could become part of a submap in future work.
Appendices
Appendix A
IT XSP Block Diagram
Appendix B
Connectors Diagram
Appendix C
Mechanical Diagram