
Copyright

by
Andrés Felipe Echeverri Guevara
2011

USING THE SENSOR KINECT FOR LANDMARK
RECOGNITION AT ESCUELA DE INGENIERÍA DE
ANTIOQUIA

Committee:

Robert H. Bishop, Supervisor

USING THE SENSOR KINECT FOR LANDMARK
RECOGNITION AT ESCUELA DE INGENIERÍA DE
ANTIOQUIA

ANDRÉS FELIPE ECHEVERRI GUEVARA

Degree Work In Partial Fulfillment Of


The Requirements For The Degree Of
MECHATRONIC ENGINEER

ESCUELA DE INGENIERÍA DE ANTIOQUIA


MECHATRONIC ENGINEERING
ENVIGADO
2011

Dedicated to my parents, family, and friends, and those who always supported me.

Acknowledgments

I would like to thank all the people who helped me in the development
of this thesis, and especially Dr. Bishop, who gave me the opportunity to work
on this amazing project and learn from him. I would also like to thank
Marquette University, which made me feel welcome.

USING THE SENSOR KINECT FOR LANDMARK
RECOGNITION AT ESCUELA DE INGENIERÍA DE
ANTIOQUIA

Publication No.
Andrés Felipe Echeverri Guevara
Escuela de Ingeniería de Antioquia, 2011

Supervisor: Robert H. Bishop

This thesis presents an approach to one of the key processes that
can be applied to autonomous systems for obstacle avoidance and path
planning. The three core processes of robotics are discussed at the beginning
of this thesis: mapping, localization and path planning. Afterwards, the
solution to the mapping problem is addressed with the Occupancy Grid Map
algorithm. The algorithm is presented together with the neighboring-cell
improvement and the two main sensor models that can be used with it.


Table of Contents

Acknowledgments                                                          v
Abstract                                                                vi
List of Tables                                                          ix
List of Figures
Chapter 1.  Introduction                                                 1
Chapter 2.  Preliminaries                                                5
    2.1   Problem Context and Characterization                           5
    2.2   Problem Definition                                             6
    2.3   Project's objectives                                           7
          2.3.1  General objective                                       7
          2.3.2  Specific Objectives                                     7
Chapter 3.  Methodological Design                                        8
    3.1   Type of investigation                                          8
    3.2   Research Method                                                8
    3.3   Information recollection techniques and instruments            9
Chapter 4.  Project's Development                                       11
    4.1   Selecting the sensor                                          11
          4.1.1  Stereoscopic System                                    12
          4.1.2  Occupancy Grid Map                                     19
    4.2   Sensor Model                                                  26
          4.2.1  Ray Casting                                            27
          4.2.2  Ideal Sensor Model                                     29
          4.2.3  Real Sensor Model                                      30
    4.3   Why Probabilities                                             31
    4.4   Dealing with Overlapping problems                             32
    4.5   Choosing an embedded computer                                 33
          4.5.1  Getting started with the pITX-SP                       35
Chapter 5.  Conclusions and Results                                     37
    5.1   Future Work                                                   43
Appendices                                                              45
Appendix A.  pITX-SP Block Diagram                                      46
Appendix B.  Connectors Diagram                                         47
Appendix C.  Mechanical Diagram                                         50
Bibliography                                                            51
Index                                                                   54

List of Tables

3.1   Steps and Methodology                                             10
4.1   Sensors Benchmarking                                              12
4.2   Different pITX-SP with features                                   33
5.1   Statistical Data                                                  38
B.1   Different pITX-SP Connectors                                      49

List of Figures

1.1   The Field of Robotics                                              3
4.1   Knowing the Kinect                                                13
4.2   Stereoscopic representation                                       14
4.3   Stereo Image, from the Kinect Sensor                              14
4.4   Plotting some data from the Kinect, with equation 4.3             15
4.5   Raw Data vs. Real Distance                                        16
4.6   The black curve is the original and the red curve is the
      approximation, it almost overlaps the entire curve                17
4.7   Converting the Position into an Angle                             18
4.8   Learning some features of the Kinect                              19
4.9   Recovering the probability from the log odds ratio                25
4.10  Occupancy Grid Map code in LabVIEW                                26
4.11  Representation of a line casted in the grid                       28
4.12  Code for Ray Casting in LabVIEW                                   29
4.13  Ideal Sensor Model                                                29
4.14  Real Sensor Model                                                 30
4.15  New Sensor Model                                                  31
4.16  The red circle shows the conflict cell                            32
4.17  The Overlaps for loop, implemented in LabVIEW                     33
5.1   Fitting the data in a Normal Distribution                         37
5.2   Depth Image, the Null Band is on the right side                   39
5.3   Ideal Sensor Model Plotted In Matlab                              39
5.4   Real Sensor Model Plotted In Matlab                               40
5.5   Unchecking Allow debugging                                        40
5.6   For Loop positioned outside of the main Loop, it is used to
      find the angles                                                   41
5.7   Using the Element In Place Structure to rotate the Cartesian
      Coordinates                                                       42
5.8   Occupancy Grid Map of a Hallway                                   43
5.9   The Mobile Robot and the SRIO                                     44
A.1   pITX-SP Block Diagram                                             46
B.1   pITX-SP Top Side                                                  47
B.2   pITX-SP Bottom Side                                               48
C.1   pITX-SP Mechanical Diagram, all dimensions are in mm              50

Chapter 1
Introduction

Robots allow communication between computers and the physical world in order to fulfill certain tasks. These tasks depend on
the purpose for which the robots were created; for example, surveillance, medical
purposes, or exploration. In order to carry out these tasks, robots need to
be equipped with accurate actuators and sensors. In most cases, it is essential
to build a map using the sensors. In addition, the robot needs to be able
to estimate its location. Finally, it should be able to reason
about the task it is following and its perception of the environment, in order
to accomplish the work in the most efficient and safe way.
Mapping, localization and path planning have become actively discussed topics in the robotics field. In 1983, at the CMU Mobile Robot Laboratory,
the first approach to map-building theory was formulated; the map
was built with several ultrasound sensors, an inaccurate kind of sensor that
has a narrow field of view. However, it was the most used sensor at the time
and worked surprisingly well in the first implementations of sonar navigation
in crowded rooms. Currently, sensors such as LIDARs and stereoscopic
cameras are quite accurate, have a wide field of view, and are capable of measuring
long distances. However, noise is still present in these sensors and they
are not affordable for the general public, which is why this research is being conducted: to build a map with affordable sensors. It is still quite
difficult to represent a map in 3D due to the computational expense, and it
would not be as useful for navigation purposes. Nonetheless, the development
of a 3D map is on the way.
Let us define and take a careful look at the mapping, localization and path planning processes.
Mapping: Mapping is the task of collecting data and measurements
from one or several sensors to build the map representation. There
are many ways to represent a map. Two of the main types are topological maps, which are based on features of the environment, and
metric maps, which geometrically represent the environment and its
surroundings. The mapping technique can be applied in 3D or in 2D,
the latter being used more often. However, all of these representations deal
with odometry noise and the uncertainty from the sensors. As the
robot moves through the map, it should estimate its pose and correct the
error.
Localization: Addresses the use of sensors to estimate the robot's
position in the environment. As with the mapping process, localization
suffers from the same problem: it deals with odometric uncertainty. Sometimes the robot has trouble with localization because it is hard
to differentiate between two or more plausible positions on the map,
especially if it is initialized without an initial estimate or a known location point.
Path Planning: Addresses the issue of reaching waypoints and goals
in an efficient, safe, and accurate way.

Figure 1.1: The Field of Robotics


These problems are closely related and are combined in different ways under different names.
The relationship between mapping and localization is clear, and it is
commonly called Simultaneous Localization and Mapping, or SLAM (1). In
contrast, the relationship between planning and mapping is called exploration (2), but this relationship does not address the localization problem.
Similarly, the relationship between localization and planning is called
Active Localization (3); there, a map built a priori is used off-line to address the
mapping problem. The relationship between the three processes is known
as Active SLAM, which consists of selecting actions that
reduce the uncertainty in the mapping and localization. Active SLAM addresses the exploration part, when the robot builds the map, and the operational part, when the robot uses the map to achieve its task.
The mapping process is considered one of the most important anchors in the robotics field. Robots use maps to localize themselves and
at the same time to choose the best path. The main goal of this work lies
in map construction and map-building techniques with an uncommon
sensor, the Kinect, using the LabVIEW platform.

Chapter 2
Preliminaries

2.1 Problem Context and Characterization


Artificial vision systems have become the most widely used sensors
in robotic applications, especially in mobile wheeled robotic applications,
due to the amount of information that they can offer about the environment.
Autonomous robots make use of stereo vision to detect and avoid obstacles, measure the range and calculate the path that they need to follow.
This method is easy to apply and inexpensive. While laser scanners can
cost tens of thousands of dollars, stereo vision requires only two cameras
which need to be aligned.
It is quite intuitive that visual information in 3D view possesses more
information about the object in the scene than an image in 2D view. During
the image formation process of the camera, explicit depth information about
the scene is lost.
In the market, there are many companies researching and
developing in the field of stereoscopic vision, such as Point Grey and its
stereo camera BumbleBee2. The company provides software for tracking
people and the possibility of integrating the camera with other cameras. The
camera uses an IEEE 1394 port. Another company, VIDERE Design, has
been developing a system which integrates monocular and binocular lenses
for use in industrial systems and robotics.
Microsoft developed the Kinect sensor with Rare, the company that
provides the software, and PrimeSense, an Israeli company that is in
charge of the hardware.
The Kinect sensor features include an RGB camera and a depth sensor. The depth sensor consists of an infrared laser projector combined with
a monochrome CMOS sensor, which captures video data in 3D without the
presence of light.
It took ten years to develop the Kinect's technology. The initial use
was for video game consoles, but over the past year there have been developments in the field of robotics, and it has been integrated with other sensors
to obtain data and further develop control.

2.2 Problem Definition
Is it possible to use the inexpensive Kinect sensor in applications like
robotics and autonomous navigation as a replacement for the LIDAR sensor,
obtaining its data and processing it for autonomous navigation?

2.3 Project's objectives

2.3.1 General objective

To obtain data from the Kinect sensor for implementation in autonomous navigation on a wheeled mobile robot.

2.3.2 Specific Objectives

To choose the correct drivers for the Kinect which make it possible to
hook it up to a computer.
To convert the raw data into understandable data for the processor
and the user.
To measure if the behavior of the Kinect is reliable.

Chapter 3
Methodological Design

3.1 Type of investigation

The type of investigation is experimental development.

3.2 Research Method

Compilation and analysis of the information: Bibliographic compilation


about different and related works, and about the mathematical tools
and different models that were used before.
Establish communication: The Kinect sensor works with USB protocol.
It is necessary to choose the correct drivers for the Kinect to attach it
to the computer.
Developing software of data acquisition: Extract the data from the
Kinect.
Working with the Data: Convert the Kinect's data into understandable
data to implement the Occupancy Grid Map algorithm.
Reliability experiments: Since the technique of building a map cannot
be simulated, it is necessary to prove that the sensors work. The
reliability of the system needs to be proven.

3.3 Information recollection techniques and instruments

Recollection, systematization and study of information from the Internet, as well as databases, journals and books.
Implementation of the USB communication using the drivers.
Convert the pixels of the image into distances for the navigation.
Compare the distance obtained with the sensor and the real distance.
Development of the Occupancy Grid Map algorithm.
Development of bibliographic work: Write the process which was followed and the results from the research.

Table 3.1: Steps and Methodology

Phase: To identify requirements of the Kinect sensor
    Activities: Look for information and related works. Look for examples of use and different libraries which integrate the sensor. Look for the specifications of the sensor.
    Methodology: Recompilation and analysis of information.

Phase: To obtain data from the Kinect sensor
    Activities: Test and run some examples with different programming consoles and languages. Select one of the proven programming languages. Transform the data obtained from the sensor into understandable information (distance). Distance testing.
    Methodology: Developing software of data acquisition; reliability experiments.

Phase: To research and implement the algorithm
    Activities: Convert the distance into (x, y) coordinates. Implement the Occupancy Grid Map algorithm. Try the algorithm in several environments. Make adjustments to the algorithm.
    Methodology: Developing software of data acquisition; reliability experiments.

Phase: To attach the Kinect to an embedded computer
    Activities: Research and select equipment which supports the chosen language and the Kinect sensor. Attach and try the Kinect with the selected embedded system.
    Methodology: Developing software of data acquisition; recompilation and analysis of information.

Phase: To observe and measure if the system is reliable
    Activities: Build several maps and test whether the maps look reliable.
    Methodology: Developing software of data acquisition; reliability experiments.

Chapter 4
Project's Development

4.1 Selecting the sensor

The minimum range of the Kinect is approximately between 0.5 and
0.6 meters and the maximum range is between 4 and 5 meters; it depends
only on how much error the application can handle. The Hokuyo URG-04LX-UG01 works between 0.06 and 4 meters with a 1% chance of error,
the UTM-30LX works between 0.1 and 30 meters, and the XV-11 LIDAR
works between 0.2 meters and 6 meters. Thus, the Kinect will have the same
problems that the more expensive LIDARs on the market have in terms of
being capable of seeing the end of a long hallway. But the biggest problem
to consider is that the Kinect has a very close blind spot. However, robots
do not have only one sensor; this problem is easy to solve with different
kinds of range-measuring sensors such as ultrasound or infrared sensors.
Table 4.1 shows a benchmark of different sensors in the market.
Since one of the subjects considered in this thesis is using an
inexpensive sensor, the Kinect was chosen to be the main sensor of the
robot; however, the algorithm considered in this thesis can be used with
any sensor in the market that provides depth information.

Table 4.1: Sensors Benchmarking

Device             Kinect        URG-04LX          UTM-30LX                 LMS100 SICK
Power Source       12 VDC        5 VDC             5 VDC                    10.8-30 VDC
Detection Range    0.5m to 6m    0.02m to 4m       0.1m to 30m (max. 60m)   20m
Scan Angle         57°           240°              360°                     270°
Interface          USB 2.0       USB 2.0, RS232    USB 2.0                  Ethernet, RS232
Frequency          30 Hz         10 Hz             40 Hz                    50 Hz-20 Hz
Cost               $150          $2375             $5590                    $5600

Note: The currency used for the cost is US dollars.


4.1.1 Stereoscopic System

The stereo vision technique is mainly inspired by the human visual
system, which leads to the phenomenon of depth perception. The process of
obtaining depth consists of using two aligned cameras with a separation between them. Each camera captures an image, and these images
are analyzed to estimate the pixel correspondences of similar features in
the stereo perspective views originated from the same 3D scene. However, finding correct corresponding points continues to be a problem due
to several factors such as obstruction and photometric, radial, and geometric
distortion.

Figure 4.1: Knowing the Kinect


For a normal stereo system, the cameras are calibrated, and the rectified
images are parallel and have corresponding horizontal lines.
Triangulation requires knowing the following terms:
f = the focal length of the camera.
b = the baseline distance between the two cameras.
d = the disparity, i.e., the difference between the lateral positions of the pixels (V2 and V1)
on each image measured from their respective centers.
IP1 and IP2 = the image planes.
Using the geometric concept of similar triangles in Figure 4.2, the
distance z is calculated as follows:

z = b f / d    (4.1)

Where z is the depth (in meters), b is the horizontal baseline between
the cameras (in meters), f is the focal length of the cameras (in pixels), and

Figure 4.2: Stereoscopic representation


d is the disparity (in pixels). At zero disparity, the rays from each camera
are parallel, and the depth is infinite. Larger values for the disparity mean
shorter distances.
The result of processing this information is an intensity image. Each
pixel in the image has an intensity which represents a distance from the
camera. A darker pixel means that it is closer to the camera, and a lighter
pixel means that it is farther from the camera.
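As a quick worked example (the numbers below are chosen only for illustration and are not taken from the thesis), equation (4.1) can be evaluated directly:

# Illustrative numbers only: a 7.5 cm baseline, a 580-pixel focal length and a
# 25-pixel disparity give the depth predicted by equation (4.1).
b = 0.075   # baseline, meters
f = 580.0   # focal length, pixels
d = 25.0    # disparity, pixels
z = b * f / d
print(z)    # 1.74 m: larger disparities give shorter distances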

Figure 4.3: Stereo Image, from the Kinect Sensor

The Kinect returns a raw disparity that is not normalized in this way,
which is to say, a zero Kinect disparity does not correspond to an infinite distance. The Kinect disparity is related to a normalized disparity by the relation:

d = (1/8) (doff − rdis)    (4.2)

Where d is the normalized disparity, rdis is the Kinect disparity, and
doff is an offset value particular to a given Kinect device. The 1/8 factor
appears because the values of rdis are in 1/8 pixel units. Substituting into equation (4.1):

z = b f / [(1/8) (doff − rdis)]    (4.3)

Typical values for these terms are b = 0.075 m, f = 580 pixels, and doff = 1090,
where rdis is the raw disparity from the Kinect.
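The conversion can be written out in a few lines of code. The following Python sketch (an illustration, not the thesis' LabVIEW implementation; the function name is only illustrative) applies equations (4.2) and (4.3) with the typical calibration values quoted above.

# Illustrative sketch (not the thesis' LabVIEW code): raw Kinect disparity to
# metric depth using equations (4.2) and (4.3). The constants are the "typical
# values" quoted in the text and can differ from device to device.
B = 0.075       # baseline b, meters
F = 580.0       # focal length f, pixels
D_OFF = 1090.0  # disparity offset, device dependent

def raw_disparity_to_depth(rdis):
    """Return the depth z in meters for a raw Kinect disparity reading."""
    d = (D_OFF - rdis) / 8.0     # normalized disparity, equation (4.2)
    if d <= 0:
        return float('inf')      # zero or negative disparity: out of range
    return B * F / d             # equation (4.3)

print(raw_disparity_to_depth(600))   # roughly 0.7 m for this example reading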

Figure 4.4: Plotting some data from the Kinect, with equation 4.3

Another way to find a good approximate measure of the distance is
to perform a data acquisition with the Kinect, comparing the raw data with the
real distance. The distance was measured with a measuring tape, and the
results are displayed in the following figure.

Figure 4.5: Raw Data vs. Real Distance


Using the Curve Fitting Tool from MATLAB, it was possible to find several
curves which fit the data, the best of which was the exponential approximation.
The equation for the approximation is shown below:

z = 0.1397 e^(0.002708·rawdata) + 1.146×10⁻⁹ e^(0.02118·rawdata)    (4.4)

With the fitting tool it was possible to find the sum of squares due to error
(SSE) and the R-square value (R²): SSE = 0.05007, which is very close to 0,
and R² = 0.9995, which is close to 1, meaning that the curve fits the data almost exactly.

Figure 4.6: The black curve is the original and the red curve is the approximation; it almost overlaps the entire curve
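For reference, the fitted curve can be evaluated directly; the short Python sketch below (an illustration, not part of the thesis code) computes equation (4.4) for one raw reading and compares it with the disparity model of equation (4.3). The raw value used is an arbitrary example.

# Sanity-check sketch (not from the thesis): evaluate the fitted curve of
# equation (4.4) and compare it with the disparity model of equation (4.3)
# for the same arbitrary raw reading.
import math

def depth_fit(raw):
    # equation (4.4), coefficients from the MATLAB curve fit
    return 0.1397 * math.exp(0.002708 * raw) + 1.146e-9 * math.exp(0.02118 * raw)

def depth_model(raw, b=0.075, f=580.0, d_off=1090.0):
    # equation (4.3) with the typical calibration values quoted earlier
    return b * f / ((d_off - raw) / 8.0)

print(depth_fit(700.0), depth_model(700.0))   # about 0.93 m vs about 0.89 m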
To find the angle of each pixel it is important to know the field of
view and how many pixels are in the perceptual range. The Kinect has
a horizontal field of view of 57° and a vertical field of view of 43°. The
depth sensor works in a range from 0.5 m until around 6 m. The image size is
640 × 480 at 30 frames per second.

It is good to keep in mind that the origin is in the left corner. Since
we just want to work with a slice of the image, row number 240 was
taken. After this, we are working with just one row: each position in the
row represents an angle in the image, and each position carries the
depth information. First of all, we need to convert the raw data into distance

Figure 4.7: Converting the Position into an Angle


with the equation 4.3.
Each pixel in the image represents an angle, so by knowing the field
of view of the Kinect, and the position of each pixel, we can represent each
position with an angle, with the following equation.

θ = pixel(i) · fov / nop    (4.5)

Where θ is the angle of the pixel, i represents the position of the
pixel in the row (in this case it takes values between 0 and 640), fov is the
field of view (since we are working with the horizontal slice, fov = 57°), and
nop is the number of pixels in the row, which is equal to 640.
Since we are looking for an x and y position, and we have the distance and the angle of each pixel, it is necessary to change the polar
coordinates into Cartesian coordinates.

x = z cos(θ)    (4.6)

y = z sin(θ)    (4.7)

Where z is the distance obtained from transforming the raw data, and
θ is the angle at which each pixel is positioned.

Figure 4.8: Learning some features of the Kinect
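A compact Python sketch of this conversion is shown below; it follows equations (4.5) through (4.7) literally (angle measured from the left edge of the image), and the helper names are illustrative rather than taken from the LabVIEW VI.

# Illustrative sketch of equations (4.5)-(4.7): one row of depth values becomes
# a set of (x, y) points. Helper names are hypothetical; the thesis version is
# a LabVIEW VI.
import math

FOV_H = 57.0     # horizontal field of view, degrees
N_PIXELS = 640   # pixels in the selected row

def pixel_to_angle(i):
    """Equation (4.5): theta = pixel(i) * fov / nop, measured from the left edge."""
    return math.radians(i * FOV_H / N_PIXELS)

def pixel_to_xy(i, z):
    """Equations (4.6)-(4.7): depth z (meters) at pixel i into Cartesian coordinates."""
    theta = pixel_to_angle(i)
    return z * math.cos(theta), z * math.sin(theta)

# Example: the center pixel of the row at a depth of 2 m
print(pixel_to_xy(320, 2.0))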

4.1.2 Occupancy Grid Map

Where am I? Where have I been? Like humans, robots ask
themselves these questions frequently. Knowing the current location, and
being able to go to other locations, are important tasks in autonomous navigation. So, how can a robot localize itself? The smartest way is to use
a map, combined with data from the sensors and the robot's movements.
This brings us to the next question: which map should it use? We could
build a map by hand, but it would be hard and error-prone, so it
is preferred that the robot build its own map.


But how can a robot build a map? The most common way in robotics
is the occupancy grid map technique. It is necessary to divide the map into
cells; each cell representing a portion of the floor space, and is marked as
either empty or full. The sensor readings can easily be used to build a grid
map.
The occupancy grid map addresses the problem of estimating the
map by calculating the probability of each cell independently, sometimes
assuming overlapping problems with the neighboring cells.
To build the map, it might be helpful to keep in mind the next few
factors:
Size: If the field of view of the robot was wide, it would be harder to
build a map.
Noise in the sensors and actuators: If the sensors and actuators
were noise free, the task of building a map would be simple, it would
not be necessary to use filters.
Perceptual ambiguity: Some places looks almost the same to the
robot; it would be harder to establish correspondences in different locations with time.
Cycles: If the map was just a hallway, it would be easy to correct the
odometry incrementally by just going up and coming back through the
20

hallway. However, a cycle is referred to when the robot returns to its


position by a different path, which can leave a lot of room for error.

For all of these grid cells m_x,y, the problem becomes a binary
problem with a static state. Binary problems are addressed using the odds
ratio. The odds ratio of a state is defined as the probability of an
event divided by the probability of its negation.

Let m_x,y be each cell in the map, with m the map and <x, y>
the position of the cell in the map. Let z_1, ..., z_T be the distance or range
measurements from time 1 until time T, along with the position (which is assumed to be
known). The main objective of the occupancy grid map is determining the
probability of each cell m given the measurements z.

p(m_x,y | z_1, ..., z_t) = p(z_t | m_x,y) p(m_x,y | z_1, ..., z_{t-1}) / p(z_t | z_1, ..., z_{t-1})    (4.8)

Now applying Bayes' rule to the term p(z_t | m_x,y):

p(m_x,y | z_1, ..., z_t) = p(m_x,y | z_t) p(z_t) p(m_x,y | z_1, ..., z_{t-1}) / [p(m_x,y) p(z_t | z_1, ..., z_{t-1})]    (4.9)

This equation gives us the probability that the cell m_x,y is occupied
given a distance z.
In a similar way we obtain the result for its negation, that is, when a
cell is free:

p(¬m_x,y | z_1, ..., z_t) = p(¬m_x,y | z_t) p(z_t) p(¬m_x,y | z_1, ..., z_{t-1}) / [p(¬m_x,y) p(z_t | z_1, ..., z_{t-1})]    (4.10)

This equation gives us the probability that the cell m_x,y is free given a
distance z.
In order to avoid terms that are difficult to compute, equation (4.9) is divided by equation (4.10), and in that way it is easier to
get the odds ratio:

p(m_x,y | z_1,...,z_t) / p(¬m_x,y | z_1,...,z_t) = [p(m_x,y | z_t) / p(¬m_x,y | z_t)] · [p(¬m_x,y) / p(m_x,y)] · [p(m_x,y | z_1,...,z_{t-1}) / p(¬m_x,y | z_1,...,z_{t-1})]    (4.11)

Since p(¬m) = 1 − p(m), all the negated terms can be replaced:

p(m_x,y | z_1,...,z_t) / [1 − p(m_x,y | z_1,...,z_t)] = [p(m_x,y | z_t) / (1 − p(m_x,y | z_t))] · [(1 − p(m_x,y)) / p(m_x,y)] · [p(m_x,y | z_1,...,z_{t-1}) / (1 − p(m_x,y | z_1,...,z_{t-1}))]    (4.12)

The logarithmic function has enormous advantages. It converts
multiplication into addition and division into subtraction, which are
computationally faster. Applying the log function to both sides:

log { p(m_x,y | z_1,...,z_t) / [1 − p(m_x,y | z_1,...,z_t)] } = log { [p(m_x,y | z_t) / (1 − p(m_x,y | z_t))] · [(1 − p(m_x,y)) / p(m_x,y)] · [p(m_x,y | z_1,...,z_{t-1}) / (1 − p(m_x,y | z_1,...,z_{t-1}))] }    (4.13)

log { p(m_x,y | z_1,...,z_t) / [1 − p(m_x,y | z_1,...,z_t)] } = log [p(m_x,y | z_t) / (1 − p(m_x,y | z_t))] + log [(1 − p(m_x,y)) / p(m_x,y)] + log [p(m_x,y | z_1,...,z_{t-1}) / (1 − p(m_x,y | z_1,...,z_{t-1}))]    (4.14)

Let us replace the term log { p(m_x,y | z_1,...,z_t) / [1 − p(m_x,y | z_1,...,z_t)] } by ℓ_x,y^t:

ℓ_x,y^t = log [p(m_x,y | z_t) / (1 − p(m_x,y | z_t))] + log [(1 − p(m_x,y)) / p(m_x,y)] + ℓ_x,y^(t−1)    (4.15)

where the first term is the Sensor Model. The initial value at t = 0 is:

ℓ_x,y^0 = log [p(m_x,y) / (1 − p(m_x,y))]    (4.16)

The Sensor Model will be explained in section 4.2.
Separating terms we get:

ℓ_x,y^t = log p(m_x,y | z_t) − log[1 − p(m_x,y | z_t)] + log[1 − p(m_x,y)] − log p(m_x,y) + ℓ_x,y^(t−1)    (4.17)

where the first two terms correspond to the Sensor Model.
Now we have the value ℓ_x,y^t for each cell in the field of view of the
sensor, but we need to express the cell as a binary value.
Since we know:

ℓ_x,y^t = log { p(m_x,y | z_1,...,z_t) / [1 − p(m_x,y | z_1,...,z_t)] }    (4.18)

applying the logarithm property log_b(x) = n ⟺ x = b^n to equation
(4.18) and factoring:

e^(ℓ_x,y^t) = p(m_x,y | z_1,...,z_t) / [1 − p(m_x,y | z_1,...,z_t)]    (4.19)

e^(ℓ_x,y^t) [1 − p(m_x,y | z_1,...,z_t)] = p(m_x,y | z_1,...,z_t)    (4.20)

e^(ℓ_x,y^t) − e^(ℓ_x,y^t) p(m_x,y | z_1,...,z_t) = p(m_x,y | z_1,...,z_t)    (4.21)

e^(ℓ_x,y^t) = p(m_x,y | z_1,...,z_t) + e^(ℓ_x,y^t) p(m_x,y | z_1,...,z_t)    (4.22)

e^(ℓ_x,y^t) = p(m_x,y | z_1,...,z_t) (1 + e^(ℓ_x,y^t))    (4.23)

p(m_x,y | z_1,...,z_t) = e^(ℓ_x,y^t) / (1 + e^(ℓ_x,y^t))    (4.24)

p(m_x,y | z_1,...,z_t) = 1 − 1 / (1 + e^(ℓ_x,y^t))    (4.25)

This algorithm is additive, since it increases or decreases the value
of ℓ_x,y^t every time a measurement indicates the presence or absence of an obstacle. Any
algorithm that increases or decreases a variable in response to measurements can be interpreted as a Bayes filter in log odds form.

The log odds representation is convenient: ℓ_x,y^t can assume values between −∞ and +∞ and avoids numerical truncation for probabilities close to 0 and 1. No
matter how high or how low the value of ℓ_x,y^t is, recovering the probability
always gives a value between 0 and 1, approaching one of them, which gives us
an essentially binary state.

Figure 4.9: Recovering the probability from the log odds ratio

If we look carefully at Figure 4.9, the recovered probability
is asymptotic at 0 and 1. To avoid computational expense, it is better if a
threshold is established around 7 and -7, since every time the sensor measurement hits the obstacle, it will increase or decrease the value of ℓ_x,y^t,
making it larger or smaller each time. The occupancy grid map algorithm is
mostly used in static environments, but if we set this constraint, it can also be used
in dynamic environments: it will take less time for p(m_x,y | z_1,...,z_t) to converge
to a value of 0 or 1 if an obstacle is removed from or introduced into the map.
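The complete update can be summarized in a few lines. The following Python sketch (an illustration of the algorithm, not the thesis' LabVIEW code of Figure 4.10) applies the log odds update of equation (4.17), clamps the value to the ±7 threshold discussed above, and recovers the probability with equation (4.25).

# Illustration of the algorithm (not the thesis' LabVIEW code): log odds update
# of equation (4.17) with the +/-7 clamp, and probability recovery with (4.25).
import math

L_MIN, L_MAX = -7.0, 7.0

def log_odds(p):
    return math.log(p / (1.0 - p))

class OccupancyGrid:
    def __init__(self, size=100, p_prior=0.5):
        self.l0 = log_odds(p_prior)                        # equation (4.16)
        self.l = [[self.l0] * size for _ in range(size)]   # log odds per cell

    def update(self, x, y, p_meas):
        """Apply the inverse sensor model value p_meas = p(m_xy | z_t) to cell (x, y)."""
        l_new = self.l[x][y] + log_odds(p_meas) - self.l0  # equation (4.17)
        self.l[x][y] = max(L_MIN, min(L_MAX, l_new))       # clamp so the map stays dynamic

    def probability(self, x, y):
        """Recover p(m_xy | z_1..t) from the log odds, equation (4.25)."""
        return 1.0 - 1.0 / (1.0 + math.exp(self.l[x][y]))

grid = OccupancyGrid()
grid.update(10, 20, 0.9)   # beam endpoint: likely occupied
grid.update(10, 19, 0.2)   # cell along the beam: likely free
print(grid.probability(10, 20), grid.probability(10, 19))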


Figure 4.10: Occupancy Grid Map code in LabVIEW

4.2 Sensor Model

Equation 4.17 is one of the most important equations in the entire
algorithm. It suggests that only two probabilities are needed to
implement occupancy grid maps. The first is the probability p(m_x,y), which is
the prior value for the occupancy. It is usually set to a value between 0.2
and 0.5, depending on how crowded the environment is. It helps the cells
converge easily to 0 or 1, but it is most often set to a value of 0.5; in this way
the term log[(1 − p(m_x,y)) / p(m_x,y)] will be equal to zero.

The value which needs special attention is the term p(m_x,y | z). This
probability is conditioned on the range measured by the sensor, and for this
term we rely on the Sensor Model. Studying the term carefully, it is easy
to infer that the occupancy grid map does not address the neighboring cell
problem. This leads us to an overlapping cells problem, which will be discussed in section 4.4.

The Sensor Model is usually independent of the current map and
can be stored in tables, but there are several sensor models that can be
used for different purposes.
In the literature, there are numerous versions of this probability for
sensors such as LIDARs, ultrasound sensors, cameras and infrared sensors. Usually, the sensor model for a range sensor assigns a high
probability to a grid cell where the beam hits an obstacle, and a low probability to the grid
cells where it does not. The sensor model can be built by hand,
or learned from sensor data.

To insert the value given by a particular measurement z, each grid
cell that falls within the field of view of the sensor needs to be updated with
the value of the sensor model at that position. To update each grid cell, the
common principle applied is the ray tracing or ray casting method.

4.2.1 Ray Casting

The ray casting method was originally conceived for image
rendering. The method consists of casting a ray, or line; the line is traced
knowing the initial point and the end point. The end point will be the obstacle
at a distance z, and knowing these two points it is possible to draw a
straight line.
The equation of a line for ray tracing is shown below:

y(z_1, ..., z_t) = m z + b    (4.26)

Figure 4.11: Representation of a line casted in the grid


The equation to find the slope of the line is shown below:

m = (y_2 − y_1) / (x_2 − x_1)    (4.27)

And solving equation (4.26) for b:

b = y(z_1, ..., z_t) − m z    (4.28)

Where z_1, ..., z_t denotes the measurements from the sensor to the obstacle.


The code was implemented in LabVIEW and it is shown in Figure
4.12.
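For readers without LabVIEW, the same idea can be sketched in Python; the snippet below (a hypothetical helper, not the code of Figure 4.12) walks the grid cells between the sensor and the beam endpoint by stepping along the line.

# Hypothetical Python sketch of the line casting of equations (4.26)-(4.28):
# it lists the grid cells crossed by a beam from the sensor to the endpoint by
# stepping along the line in half-cell increments.
import math

def cells_along_ray(x0, y0, x1, y1, cell_size=0.1):
    """Grid indices touched by a ray from (x0, y0) to (x1, y1), coordinates in meters."""
    length = math.hypot(x1 - x0, y1 - y0)
    steps = max(1, int(length / (cell_size / 2.0)))
    cells = []
    for k in range(steps + 1):
        t = k / steps
        idx = (int((x0 + t * (x1 - x0)) / cell_size),
               int((y0 + t * (y1 - y0)) / cell_size))
        if not cells or cells[-1] != idx:
            cells.append(idx)
    return cells

# Every cell but the last would be updated as free, the last one as occupied.
print(cells_along_ray(0.0, 0.0, 1.0, 0.5))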
Now, it is necessary to update each cell along the beam. In order to
update each cell it is necessary to use a sensor model, two of which will be
discussed primarily; the Ideal Sensor Model and the Real Sensor Model.


Figure 4.12: Code for Ray Casting in LabVIEW


4.2.2 Ideal Sensor Model

This type of sensor model does not consider the distance uncertainty
of the sensor; it only takes the values 0 (for free space) and 1
(for occupied space). This Sensor Model is represented in Figure 4.13.

(a) Curve of the Ideal Sensor Model

(b) Ray Casting of the Ideal Sensor Model

Figure 4.13: Ideal Sensor Model


This Sensor Model is very easy to compute, and it does not make
objects appear thicker. It does not consider distance uncertainty, but it is easy to
build by hand.

4.2.3 Real Sensor Model

The Real Sensor Model is based on a Gaussian curve. It takes into
account the uncertainty of the sensor due to the distance; however, this sensor model can make objects appear thicker.

(a) Curve of the Real Sensor Model

(b) Ray Casting of the Real Sensor Model

Figure 4.14: Real Sensor Model


It was decided to remove part of the curve and make a combination
between the Ideal Sensor Model and the Real Sensor Model, since we only
want to take into account the uncertainty before, and not after, the obstacle.
The new Sensor Model is represented in Figure 4.15. The darker cell
is the one most likely to be occupied by the obstacle.
The thickness of an obstacle cannot be derived from a single view,
so it depends on the depth uncertainty. Information about the true size and
the area behind obstacles is included when the vehicle moves there.
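A small sketch of the two inverse sensor models compared here is given below, written in Python for illustration; the Gaussian width and the probability levels are assumed values, not numbers taken from the thesis.

# Illustrative sketch of the two inverse sensor models discussed above; the
# Gaussian width sigma and the probability levels are assumed values, not
# numbers taken from the thesis.
import math

def ideal_model(r, z, tol=0.05):
    """Ideal model: 1 at the cell that contains the obstacle, 0 (free) elsewhere."""
    return 1.0 if abs(r - z) < tol else 0.0

def new_model(r, z, sigma=0.05, p_free=0.2, p_occ=0.9):
    """Combined model of Figure 4.15: uncertainty before the obstacle, none after it."""
    if r > z + 2.0 * sigma:
        return 0.5                                   # behind the obstacle: unknown
    bump = math.exp(-0.5 * ((r - z) / sigma) ** 2)   # Gaussian rise around the hit
    return p_free + (p_occ - p_free) * bump

for r in (0.5, 1.9, 2.0, 2.3):                       # cells along a beam that measured z = 2 m
    print(r, ideal_model(r, 2.0), round(new_model(r, 2.0), 2))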

(a) Curve of the New Real Sensor Model    (b) Ray Casting of the New Real Sensor Model

Figure 4.15: New Sensor Model

4.3 Why Probabilities

All the algorithms used in the literature have a common feature: almost all of them are probabilistic. They employ probabilistic methods
to model the robot and its surroundings, and almost all of them turn sensor
measurements into maps. Even some techniques in the literature that do
not appear probabilistic at first sight can be
interpreted with probabilistic methods under the correct assumptions.

But what makes probability such an important tool in robotics? The
key to this answer is sensor noise. The noise from the sensors is complex and not easy to accommodate. Probability provides models for
the different sources of noise and their effect on the measurements.

4.4 Dealing with Overlapping problems

Sometimes the map is not accurate or not reliable, since the
occupancy grid map algorithm does not take into account the neighboring
cells. The problem presents itself mostly in the corners of an environment.

This method was tested with the Ideal Sensor Model. Since this
model only takes the values of free or occupied, it is easier to compare
against the prior value of the cell. One solution to this problem is shown in this
thesis.

(a) Good

(b) Bad

Figure 4.16: The red circle shows the conflict cell

The main goal of the code is to avoid overwriting grid cells that are
already occupied with a free (white) value. If a cell is not occupied and it is in the field of view, it is because the cell lies
between the obstacle and the sensor, so it takes a low probability or a free
value; if the cell is at the end of the line, it takes the occupied value.
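The rule can be sketched as follows (a Python illustration of the idea in Algorithm 2, with hypothetical data structures, not the LabVIEW for loop of Figure 4.17):

# Python illustration of the overlap rule (the idea of Algorithm 2, with
# hypothetical data structures, not the LabVIEW for loop of Figure 4.17).
OCCUPIED, FREE, UNKNOWN = 1.0, 0.0, 0.5

def apply_beam(grid, ray_cells):
    """grid: dict (x, y) -> state; ray_cells: cells from the sensor to the endpoint."""
    for cell in ray_cells[:-1]:
        if grid.get(cell, UNKNOWN) == OCCUPIED:
            continue                 # keep the conflict cell occupied (Figure 4.16a)
        grid[cell] = FREE            # cell between the sensor and the obstacle
    grid[ray_cells[-1]] = OCCUPIED   # cell at the end of the line
    return grid

grid = {(3, 3): OCCUPIED}            # a cell another beam already marked occupied
apply_beam(grid, [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)])
print(grid)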


Figure 4.17: The Overlaps for loop, implemented in LabVIEW

4.5 Choosing an embedded computer

Since the algorithm is planned to be used on a mobile robot, a computer needs to be chosen to attach to the robot. The computer which was
chosen belongs to the pITX-SP family, which best suits the requirements.
Table 4.2: Different pITX-SP with features

Variant
Plus
Basic
Standard
Plus
Basic
Standard

Clock
1.1GHz
1.1GHz
1.1GHz
1.6GHz
1.6GHz
1.6GHz

P-ATA
X
X

S-ATA
X
X
X

X
X

USB
X
x1.
X
X
x1.
X

SDIO
X
x2.
X
x2.

Note:
1. Only two USB ports available
2. Only one microSD socket available

For further information, the Block Diagram is in Appendix A, the Connectors Diagram is in Appendix B, and the Mechanical Diagram is in Appendix C.
The computer chosen was the pITX-SP Plus 1.6 GHz with the following features:
CPU: Intel Atom Z510 - 1.6 GHz
Chipset: Intel System Controller Hub US15W
Ethernet: Intel 82574L Gigabit Ethernet
Graphics: DirectX 9.0e, OpenGL 2.0, shader-based 2D and 3D dual independent graphics
Hard Disk: Single or Dual SATA II (chipset option), 1 x PATA 44 Master/Slave option
Main Memory: 1 x DDR2 SO-DIMM, 2GB
Power Consumption: 7-9 W typical
Power Supply: 5VDC
Resolution: DVI up to 1920 x 1080 @ 60Hz, LVDS up to 1280 x 1024 @ 85Hz
Special Features: 1 x microSD socket
Temperature: Operating 0 °C to 60 °C (32 °F to 140 °F)
USB Connections: 6 x USB 2.0 (2 x at front panel, 4 x on board)
4.5.1 Getting started with the pITX-SP

The following steps need to be followed in order to run the computer:

1. Plug a suitable DDR2-SDRAM memory module into the RAM socket.


2. Connect a DVI monitor to the DVI connector.
3. Plug a keyboard and/or mouse into the USB connector(s).
4. Plug a data cable into the hard disk interface. Attach the hard disk to
the connector at the opposite end of the cable. If necessary, connect
the power supply to the hard disk's power connector.
5. Make sure all the connections have been made properly. Connect the
power supply to the pITX-SP power supply connector.
6. Turn on the board by shorting the power button pins on the power
front panel header, or use the autostart jumper.
7. Enter the BIOS by pressing the Delete key during boot-up. Make all
changes in the BIOS Setup.

Algorithm 1 Occupancy Grid Cell

Initialization
for all grid cells <x, y> do
    ℓ_x,y = log p(m_x,y) − log[1 − p(m_x,y)]
end for

Grid Calculation
for all time steps t from 1 to T do
    for all grid cells in the field of view of the sensor do
        ℓ_x,y = ℓ_x,y + log p(m_x,y | z_t) − log[1 − p(m_x,y | z_t)] + log[1 − p(m_x,y)] − log p(m_x,y)
        if ℓ_x,y ≥ 7 then
            ℓ_x,y = 7
        end if
        if ℓ_x,y ≤ −7 then
            ℓ_x,y = −7
        end if
    end for
end for

for all grid cells which are in the field of view do
    p(m_x,y | z_1, ..., z_t) = 1 − 1 / (1 + e^(ℓ_x,y))
end for

Algorithm 2 Overlapping Cell

for all grid cells m_x,y from t = 1 to t = T do
    if p(m_x,y | z_{t−1}) = p(m_x,y | z_t) then
        m_(x_occ, y_occ) = Occupied
    else
        m_(x_occ, y_occ) = Occupied
        m_(x_free, y_free) = Free
    end if
end for

Chapter 5
Conclusions and Results

With the data acquisition of the Kinect's depth, it was shown, using the Lilliefors
test in MATLAB, that the noise in the sensor does not follow a Gaussian curve. The null hypothesis that the samples come from a normal
distribution was rejected at a randomly selected distance. However, the
noise is assumed to be Gaussian for simplicity of representation. In addition, the
algorithm will later need to be fused with the Extended Kalman Filter, which
assumes Gaussian noise from the sensors.
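For illustration, the same kind of normality check can be reproduced outside MATLAB; the sketch below uses the Lilliefors test from statsmodels on a made-up sample array, and is not the thesis' data or script.

# Illustration only: a Lilliefors normality check like the one described above,
# done with statsmodels instead of MATLAB; the sample array is made up.
import numpy as np
from statsmodels.stats.diagnostic import lilliefors

samples = np.array([0.998, 1.002, 0.997, 1.004, 0.999, 0.996, 1.001, 1.003])  # example depth readings (m)
stat, p_value = lilliefors(samples, dist='norm')
print(stat, p_value)   # a small p-value rejects the normal hypothesis, as reported in the thesis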

Figure 5.1: Fitting the data in a Normal Distribution

It is clear in Figure 5.1 that the distance data does not have a normal
distribution shape.

Even though the noise is present, the measurements are quite consistent. A data acquisition was recorded every 0.1 m, from 0.5 m until
5.5 m; 500 samples were taken at each distance. This interval was considered at the beginning to model the distance from the raw data. The data
analyses were done every 1 m. The mean (μ) and the standard deviation (σ), along with some other statistical
data, were found and are shown in Table 5.1.
Table 5.1: Statistical Data

Distance    μ        σ        Var            Error (Distance − μ)
1m          0.9976   0.0022   4.7562×10⁻⁶    0.0024
2m          1.9975   0.0077   5.988×10⁻⁵     0.0025
3m          3.0090   0.0154   2.3826×10⁻⁴    -0.009
4m          3.9770   0.0229   5.2501×10⁻⁴    0.023
5m          5.0015   0.0355   0.0013         -0.0015

In each of the data acquisitions, it was observed that the distance
jumped between around 3 different values, one of them being the real distance. Proof
of this is that μ is really close to the real distance and σ is really close
to 0, showing that the values are not far from the mean.
The Kinect was not designed for robotic purposes; nevertheless, it
works really well for educational purposes. But it has a problem: since the
cameras have different resolutions, the Kinect has a null band with a size of
8 pixels, which reduces the field of view just a little. The null band is
shown in Figure 5.2.
To choose the best model to be used, the computational cost and
the accuracy were taken into account. The Ideal Model was selected since
it works more efficiently on the embedded computer and
generates good results. Additionally, once the map is built, there is no significant
difference in comparison to the Real Sensor Model.

Figure 5.2: Depth Image, the Null Band is on the right side

(a) Lateral View of the Sensor Model

(b) 3D View of the Sensor Model

Figure 5.3: Ideal Sensor Model Plotted In Matlab


The main difference between both sensor models is the smooth curve
which is present in the Real Sensor Model before the obstacle.
(a) Lateral View of the Sensor Model    (b) 3D View of the Sensor Model

Figure 5.4: Real Sensor Model Plotted In Matlab

Since the program was made in LabVIEW, a few strategies were taken into account to improve the performance of the VI:
Unchecking Allow debugging in the VI properties will slightly increase the VI performance by reducing memory requirements. This property causes the VI to be recompiled.

Figure 5.5: Unchecking Allow debugging


Another way to decrease memory leaks is to close every reference
that is created in the VI. Every image and every port has a reference that
needs to be closed at the end; if the reference is not closed, it will take
up more memory resources every time the VI runs.


To speed up the program, it is necessary to have knowledge about
programming efficiently. It is good to know that For Loops and While loops
can make a difference every time that they run. Sometimes, a For loop just
needs to be run one time, not every time. Positioning the For loops outside
of the main program can significantly increase the performance of the code.

Figure 5.6: For Loop positioned outside of the main Loop, it is used to find
the angles
The performance and memory tools allow one to know where the
bottleneck is in the VI, by checking which of the Sub-VIs take a larger amount
of time to run. Improving the programming in these Sub-VIs will speed up the
program.
Loading viewers in the Front Panel will increase the memory use. Do
not load unnecessary viewers; the dataflow between the Front Panel and the
Block Diagram will increase significantly, consuming memory resources.
Using the Element In Place Structure will increase the VI and memory efficiency. The Element In Place structure operates on data elements
in the same memory location and returns those elements to the same location in the array, cluster, variant, or waveform.

Figure 5.7: Using the Element In Place Structure to rotate the Cartesian Coordinates
Use real-time loops if possible; they will increase the efficiency and improve the timing of your program. Dataflow programming does not
work like sequential lines of text: the flow of data is determined between
nodes, making some operations parallel. Additionally, LabVIEW makes it
easy to assign thread priorities with the Timed Loop structure.
Currently, the map is built and stored in a 100 by 100 matrix; in this
space it can represent a map of an area of 100 m². There is a way of building
bigger maps, which is not considered in this thesis.
The code was tested on the pITX-SP and it took 500 milliseconds
to process the entire field of view; reducing the size of the matrix by 50 cells,
it reached a time of 200 ms. The computer is running Windows XP and
has LabVIEW 2010 installed. One advantage of having Windows on this
computer is that the data transmission between the SRIO and the computer
does not need to be programmed.

(a) Depth Image    (b) RGB Image    (c) Map

Figure 5.8: Occupancy Grid Map of a Hallway

5.1 Future Work

The development of a 3D map is underway; however, it will consume
a lot of computational resources. For a laptop this is not a problem, but the robot will
need a good computer on board to perform all the algorithms. The map can
be used for localization and path planning purposes and it can be used with
any kind of mobile robot. The area restriction was not considered to be a
problem, since the map could become part of a submap in future works.

The map will be attached to a mobile robot that is under construction
by a Marquette student; this robot has an SRIO (Single-Board RIO) with other
sensors, such as ultrasound sensors, IR sensors, and an IMU (Inertial Measurement
Unit). The Kinect could not be connected to the SRIO, in part because the
SRIO does not support USB connections.

Figure 5.9: The Mobile Robot and the SRIO


Ryan Gordon's drivers for the Kinect were used, and Microsoft has
recently released the official drivers. To connect the Kinect and work
in the LabVIEW environment it is necessary to use a wrapper; the wrapper
calls the DLLs from the C environment. Work on LabVIEW drivers has begun,
with good results connecting the Kinect. The laptop recognizes the
drivers as LabVIEW drivers. However, interpreting the protocol is not a
trivial task and it will take time to make a good program.

Appendices


Appendix A
pITX-SP Block Diagram

Figure A.1: pITX-SP Block Diagram


Appendix B
Connectors Diagram

Figure B.1: pITX-SP Top Side


Figure B.2: pITX-SP Bottom Side


Table B.1: Different pITX-SP Connectors


Number
J1
J2
J710
J1100
J1200
J1800
J1801
J1802
J1803
J1804
J2000
J2300
J2900
J2901
J3100
J3200
J3201
J3202
J3203
J3300
J3301
J3302
J3303
J3304
J3400
J3401
J3402

Part

Reset BIOS jumper


DDR2-SDRAM socket
Autostart jumper
Battery connector
BackLight connector
Flat panel connector
Checking jumper for panel power
Power header
LAN connector
Power Supply connector
Analog audio interface connector
Digital audio
DVI-D connector
MicroSD card socket
Pin strip
Digital I/O interface
Fan connector
USB ports
USB connector
USB connector
USB connector
USB connector
P-ATA connector
S-ATA connector
S-ATA connector


Appendix C
Mechanical Diagram

Figure C.1: pITX-SP Mechanical Diagram, all dimensions are in mm


Bibliography

[1] Alexei A. Makarenko, Stefan B. Williams, Frederic Bourgault, and Hugh F. Durrant-Whyte. An experiment in integrated exploration. IEEE-RSJ International Conference on Intelligent Robots and Systems, 1:534-539, December 2002.
[2] Franz Andert and Lukas Goormann. Combining occupancy grids with a polygonal obstacle world model for autonomous flights. Intech, edited by Thanh Mung Lam, January 2009.
[3] Alberto Elfes. Using occupancy grids for mobile robot perception and navigation. Computer, 22:46-57, June 1989.
[4] Nathaniel Fairfield. Localization, Mapping, and Planning in 3D Environments. PhD thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, January 2009.
[5] National Instruments. NI Developer Zone - Robotics Fundamentals Series: Stereo Vision. Retrieved July 2011 at http://zone.ni.com/devzone/cda/tut/p/id/8176, December 2008.
[6] Ivan Dryanovski, William Morris, and Jizhong Xiao. Multi-volume occupancy grids: an efficient probabilistic 3d mapping model for micro aerial vehicles. IEEE-RSJ International Conference on Intelligent Robots and Systems, October 2010.
[7] Kontron. pITX-SP Manuals. Retrieved July 2011 at http://emea.kontron.com/products/boards+and+mezzanines/embedded+sbc/pitx+25+sbc/pitxsp.html, December 2010.
[8] Martin C. Martin and Hans P. Moravec. Robot evidence grids. CMU Robotics Institute, March 1996.
[9] Adam Milstein. Occupancy grid maps for localization and mapping. Intech, edited by Xing-Jian Jing, June 2008.
[10] Hans P. Moravec and Alberto Elfes. High resolution maps from wide angle sonar. Robotics and Automation, Proceedings, IEEE International Conference, March 1985.
[11] Don Murray and James J. Little. Using real-time stereo vision for mobile robot navigation. Auton. Robots, 8:161-171, April 2000.
[12] OpenKinect. OpenKinect Main Page. Retrieved January 2011 at http://openkinect.org/wiki/Main_Page, 2011.
[13] Philipp Robbel, David Demirdjian, and Cynthia Breazeal. Simultaneous localization and mapping with people. 2011.
[14] ROS.ORG. Kinect Calibration. Retrieved January 2011 at http://www.ros.org/wiki/kinect_calibration/technical, 2010.
[15] Sebastian Thrun, Wolfram Burgard, and Dieter Fox. Probabilistic Robotics (Intelligent Robotics and Autonomous Agents). The MIT Press, 2005.
[16] Sebastian Thrun. Exploration and model building in mobile robot domains. Neural Networks, IEEE International Conference, 1993.
[17] Sebastian Thrun. Robotic mapping: A survey. Exploring Artificial Intelligence in the New Millennium, 2002.
[18] Sebastian Thrun. Learning occupancy grid maps with forward sensor models. Auton. Robots, 15:111-127, September 2003.
[19] Sebastian Thrun, M. Beetz, Maren Bennewitz, Wolfram Burgard, A.B. Creemers, Frank Dellaert, Dieter Fox, Dirk Hahnel, Chuck Rosenberg, Nicholas Roy, Jamieson Schulte, and Dirk Schulz. Probabilistic algorithms and the interactive museum tour-guide robot Minerva. International Journal of Robotics Research, 19(1):972-999, November 2000.
[20] Sebastian Thrun and A. Buecken. Learning maps for indoor mobile robot navigation. CMU, Computer Science Department, 1996.
[21] Yan Ma, Hehua Ju, and Pingyuan Cui. Research on localization and mapping for lunar rover based on rbpf-slam. IEEE Computer Society, Proceedings of the 2009 International Conference on Intelligent Human-Machine Systems and Cybernetics, 2, 2009.

Index

Abstract, vi
Acknowledgments, v
Appendices, 45
Bibliography, 53
Choosing an embedded computer, 33
commands
    environments
        table, 9, 11, 12, 33, 38, 47
Conclusions and Results, 37
Connectors Diagram, 47
Dealing with Overlapping problems, 32
Dedication, iv
Future Work, 43
General objective, 7
Getting started with the pITX-SP, 35
Ideal Sensor Model, 29
Information recollection techniques and instruments, 9
Introduction, 1
Mechanical Diagram, 50
Methodological Design, 8
Occupancy Grid Map, 19
Preliminaries, 5
Problem Context and Characterization, 5
Problem Definition, 6
Project's Development, 11
Project's objectives, 7
Ray Casting, 27
Real Sensor Model, 30
Research Method, 8
Sensor Model, 26
Specific Objectives, 7
Stereoscopic System, 12
The Kinect Sensor, 11
Type of investigation, 8
Why Probabilities, 31
