
ICIIBMS 2017, Track1: Signal Processing, Computer Networks and Telecommunications, Okinawa, Japan

Autonomous Flight Drone for Infrastructure (Transmission line) Inspection (3)
Michinari Morita*, Hironobu Kinjo, Shido Sato,
Tansuriyavong Sulyyon, Takashi Anezaki

We are developing a system for “automatic inspection of infrastructures using drones” especially aimed at transmission power lines.
Basically, drones fly using GPS information. GPS errors, however, occur near high voltage lines. In non-GPS zones where drones fly
without using GPS, they fly using v-SLAM. Determining the absolute position is indispensable for transmission line inspection. v-SLAM,
however, only provides relative localization. Therefore, to obtain the drone’s absolute position, we used feature points on the image as

(Keywords: inspection of transmission power lines, drone, position estimation, SFM, SLAM, 5-point algorithm,
Perspective-n-Point, bundle adjustment)


978-1-5090-6664-3/17/$31.00 ©2017 IEEE

Drones are currently being used for a wide range of applications, such as the i-Construction initiative and public surveys [1] conducted by the Ministry of Land, Infrastructure, Transport and Tourism, as well as surveillance and crop-dusting operations [2]. For these purposes, drones are either controlled by a pilot through an FPV, or fly autonomously by estimating self-position using GPS. GPS data, however, can become faulty under bridges, inside tunnels, or near high-voltage power lines [3], which could lead to drone flight errors. To address this issue, we are developing a system for conducting infrastructure inspections using drones that basically use GPS for autonomous flight control, but can also estimate self-position through image processing when GPS cannot be used in situations such as those mentioned above. In particular, we are developing a drone system aimed at performing automatic inspections of transmission lines.

According to the National Tax Agency of Japan, transmission lines have a lifespan of 40 years [4], and the transmission lines built during the postwar rapid economic growth period require regular inspections. However, while the demand for overhead power line engineers to perform maintenance and inspection of transmission lines is increasing, the number of young power line engineers is decreasing as a whole, as shown in Figure 1 [5]. These issues highlight the importance of using drones, which do not rely on power line engineers, for transmission line inspection.

Figure 1. Number of power line engineers

When there is a clear target point, such as a power transmission line, the drone position needs to be in the same coordinates as the transmission line, i.e. its position must be absolute. However, v-SLAM (the major algorithm used for real-time self-positioning based on image processing) estimates self-position by tracking feature points on moving images [6], so that only relative position and attitude can be inferred if no criteria are defined. Thus, finding the absolute position and attitude requires the use of landmarks (reference points).

Landmarks that serve as reference points can be defined either by installing artificial landmarks near the target point that are detected in the camera images and serve as a basis for navigation, or by detecting feature points on the image and using them as landmarks [7]. Installing artificial landmarks is time consuming because they must be installed manually. On the other hand, it is possible to use
feature points as landmarks simply by taking multiple images before performing the inspection.

Therefore in this study, we estimated absolute position and attitude through a method that used feature points as landmarks. In this method, we used a technique called Structure from Motion (SfM), which performs 3D reconstruction from multi-perspective images, to determine 3D coordinates of feature points from multiple, previously prepared images. By using the feature points as landmarks, we were able to estimate absolute position and attitude.

A previous study reported on a 3D reconstruction method called Parallel Tracking and Mapping (PTAM), where tracking of feature points is carried out in parallel with the construction of the environment map in 3D space. This enables an AR system to infer absolute position without using markers [9]. In this study, unlike PTAM, 3D mapping is not done in parallel with feature tracking; instead, the camera's absolute position and attitude are estimated by matching the previously detected feature points in 3D space against the images input frame by frame.

4. Detection of 3D position of feature points and estimation of camera position and attitude

<4.1> Calibration between two images

Calculating the 3D coordinates of a feature point first requires calculating the essential matrix between images. The essential matrix can be calculated using the five-point algorithm if there are matching feature points between two images and the inner camera parameters, such as focal distance, are known [10]. This essential matrix is a 3×4 matrix combining the camera rotation matrix and the motion vector. 3D coordinates of the feature point can be obtained using the triangulation principle based on the essential matrix and a matching pair of feature points. To perform this for an "N" number of images, feature point matches between every two images with overlapping visual fields are sought and are all associated with each other.

<4.2> Estimating the camera's self-position and attitude

The 3D position and the coordinates of the feature point on the image obtained from the drone can be expressed in the following formula (1),

$$ s\,\tilde{m} = A\,[R \mid t]\,\tilde{M} \tag{1} $$

where $\tilde{M} = [X\ Y\ Z\ 1]^T$ represents the 3D coordinates of the feature point; $\tilde{m} = [u\ v\ 1]^T$ represents the coordinates on the matching images; $A$ is the inner parameter of the camera that includes the focal distance in pixels, $f$, and the principal point, $c = (c_x, c_y)$; $[R \mid t]$ is the essential matrix representing both the camera rotation matrix and motion vector; and $s$ is the scale factor of the coordinate on the image. From these components, (1) can be expressed in the following formula (2):

$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \tag{2} $$

If the 3D coordinates of multiple feature points are known, the camera's position and attitude can be estimated by resolving the PnP problem [11]. The PnP problem can be resolved through either a linear or a nonlinear solution method. The linear solution method requires at least six pairs of matching feature points [12].

If 3D coordinates of feature points have been obtained beforehand in the calibration process mentioned above, the position and attitude of a flying drone can be estimated by resolving the PnP problem. Since the matching points that serve as reference are the previously obtained feature points (landmarks), it is possible to obtain the absolute position and attitude for all images taken at any time during the flight. Figure 2 illustrates how the camera position and attitude are estimated in this study. In Figure 2, Image 1 and Image 2 are used as images for obtaining, in advance, the 3D coordinates of feature points that serve as landmarks, while Image N is the image obtained by the flying drone's camera. Solving the PnP problem gives the camera position and attitude in Image N.

Figure 2. Estimation of camera position and attitude
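As a minimal sketch of the projection in formula (2), the Python snippet below maps a 3D landmark to pixel coordinates through A [R | t]. The intrinsic values (f = 800 pixels, principal point (320, 240)), the identity pose, and the landmark coordinates are illustrative assumptions and are not taken from this study; the function name `project` is likewise hypothetical.

```python
def project(f, cx, cy, R, t, X):
    """Project a 3D point X to pixel coordinates (u, v) via s*m = A [R|t] M."""
    # Camera-frame coordinates: Xc = R X + t
    xc = [sum(R[i][k] * X[k] for k in range(3)) + t[i] for i in range(3)]
    s = xc[2]  # the scale factor s is the depth along the optical axis
    if s <= 0:
        raise ValueError("point is not in front of the camera")
    # Apply the intrinsic matrix A and divide by s
    u = f * xc[0] / s + cx
    v = f * xc[1] / s + cy
    return u, v


# Assumed intrinsics and pose (illustrative values only)
f, cx, cy = 800.0, 320.0, 240.0
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_zero = [0.0, 0.0, 0.0]

# Hypothetical landmark 2 units in front of the camera
u, v = project(f, cx, cy, R_identity, t_zero, [0.5, -0.25, 2.0])
print(u, v)  # 520.0 140.0
```

The PnP problem is the inverse of this mapping: given the intrinsic parameters and several pairs of a landmark X and its observed pixel (u, v), it looks for the R and t that make the projected points agree with the observations.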

5. BUNDLE ADJUSTMENT

After obtaining the feature point 3D coordinates, inner camera parameters, and the essential matrix, the 2D coordinates on the image can be reconstructed using formula (2) above. Slight errors, however, may arise between the reprojected coordinates and the actual coordinates. This error is called the "reprojection error," and is defined in the cost function below (3) using the sum of squares. In the following formula, $p_1, \dots, p_m$ represent the unknown parameters combining the inner camera parameters and the essential matrix; $X_1, \dots, X_n$ represent the 3D coordinates of the feature points; while $(u, v)$ and $(\bar{u}, \bar{v})$ represent the true coordinates and the coordinates projected on the image based on the parameters, respectively.

$$ E(p_1, \dots, p_m, X_1, \dots, X_n) = \sum_{i=1}^{m} \sum_{j=1}^{n} \left\{ \left( u_{ij} - \bar{u}(p_i, X_j) \right)^2 + \left( v_{ij} - \bar{v}(p_i, X_j) \right)^2 \right\} \tag{3} $$

In particular, when tracking is done continuously, errors accumulate, leading to aberrations in camera self-positioning. Bundle adjustment minimizes the reprojection error in order to obtain an accurate estimate of position and attitude [13].

Minimizing the reprojection error is done by partially differentiating the projected coordinates $(\bar{u}, \bar{v})$ with respect to the rotation and translation parameters. Then, from an appropriately defined increment (Δx), the update (x → x + Δx) is carried out using the Gauss-Newton algorithm, iterating until convergence is reached [14].

In this study, we calculated the essential matrix and detected the 3D coordinates of feature points based on three images. For the image and camera inner parameters, we used the online dataset published by the University of Illinois at Urbana–Champaign, UIUC [15]. The images we used are shown below (from left, Image 1, Image 2, and Image 3).

Figure 3. Experiment image

In this study, we used the C++ programming language and a PC with an Intel Core i7-6500U CPU, 8 GB of memory, and the Windows 10 Home OS.

Table 1 shows the number of matching feature points initially detected, and the number of feature points for which 3D coordinates were actually calculated and used as landmarks after removing false matches using RANSAC.

TABLE 1. Compared image pairs and number of landmarks

Compared image pairs   Matching feature points   Landmarks used
A: Images 1, 2         631                       21
B: Images 1, 3         631                       29

Next, the rotation matrix and the motion vector for pairs A and B were expressed using the formulas below, (4) and (5), respectively. $R_{12}$ and $t_{12}$ represent the rotation matrix and motion vector, respectively, between Images 1 and 2, while $R_{13}$ and $t_{13}$ represent those between Images 1 and 3, with values rounded off to 3 decimal places.

$$ R_{12} = \begin{bmatrix} 0.986 & -0.165 & -0.021 \\ 0.165 & 0.986 & 0.017 \\ 0.018 & -0.02 & 0.999 \end{bmatrix}, \quad t_{12} = \begin{bmatrix} -0.098 \\ -0.079 \\ 0.992 \end{bmatrix} \tag{4} $$

$$ R_{13} = \begin{bmatrix} 0.991 & 0.134 & 0.019 \\ -0.134 & 0.990 & -0.018 \\ -0.021 & 0.016 & 0.999 \end{bmatrix}, \quad t_{13} = \begin{bmatrix} -0.086 \\ -0.091 \\ \ldots \end{bmatrix} \tag{5} $$

Figure 4 shows the 3D position of the camera rendered using OpenGL (right image) based on the essential matrix, and the camera position estimated using the 3D reconstruction software Visual SFM [16] (left image). The white, blue, and green objects in the right image, and the blue, purple, and red objects in the left image represent the positions of cameras 1, 2, and 3, respectively. Comparing these two images, we can see that there is a large error in the camera positions obtained.

Figure 4. Result of estimation of camera position
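One way to sanity-check the poses in (4) and (5) is to confirm that each rotation matrix is close to a proper rotation and to convert a pose into a camera center. The sketch below does this in plain Python using the rounded values reported above. It assumes the convention x_cam = R x_world + t with camera 1 at the origin, which the paper does not state explicitly; under that assumption the center of the second camera is C = -R^T t. All helper names are illustrative.

```python
import math

# Rounded values from (4) and (5)
R12 = [[0.986, -0.165, -0.021],
       [0.165, 0.986, 0.017],
       [0.018, -0.020, 0.999]]
t12 = [-0.098, -0.079, 0.992]
R13 = [[0.991, 0.134, 0.019],
       [-0.134, 0.990, -0.018],
       [-0.021, 0.016, 0.999]]


def orthogonality_error(R):
    """Largest absolute entry of R^T R - I (0 for an exact rotation)."""
    err = 0.0
    for i in range(3):
        for j in range(3):
            dot = sum(R[k][i] * R[k][j] for k in range(3))
            err = max(err, abs(dot - (1.0 if i == j else 0.0)))
    return err


def determinant(R):
    """Determinant of a 3x3 matrix (+1 for a proper rotation)."""
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))


def rotation_angle_deg(R):
    """Rotation angle in degrees, from trace(R) = 1 + 2 cos(theta)."""
    c = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def camera_center(R, t):
    """Camera center C = -R^T t, assuming x_cam = R x_world + t."""
    return [-sum(R[k][i] * t[k] for k in range(3)) for i in range(3)]


for name, R in (("R12", R12), ("R13", R13)):
    print(name, orthogonality_error(R), determinant(R), rotation_angle_deg(R))

C2 = camera_center(R12, t12)
print("camera 2 center:", C2, "norm:", math.sqrt(sum(c * c for c in C2)))
```

With these values both matrices are orthonormal to within rounding, and the motion vector in (4) has approximately unit length. A pose decomposed from an essential matrix fixes the translation only up to scale, so the scale and coordinate frame may need to be aligned before the rendered positions are compared with those from another tool.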

7. SUMMARY

This paper describes a method for estimating a drone's absolute position and attitude based on feature points as reference, with the aim of developing a drone system for the inspection of infrastructures (transmission lines). We calculated the essential matrix and the 3D coordinates of feature points based on three images. Although we were able to calculate the essential matrix and obtain the feature point 3D coordinates, there were large errors in the camera positions rendered using OpenGL. Going forward, first, we need to determine the cause of these errors. Next, we will calculate 3D coordinates of feature points using a larger number of images, and aim to estimate camera position and attitude from moving images.

8. Acknowledgement

Part of this research was funded through the Ministry of Internal Affairs and Communications SCOPE budget

(1) Geospatial Information Authority of Japan (GSI): "Manual for public surveys using UAV (Proposal)" (2017) [in Japanese]
(2) Nihon Keizai Shimbun: "Use of drones for crop-dusting" (July 25, 2017) [in Japanese]
(3) Atsushi Tsuchiya, Hiromichi Tsuji: "New and Easy GPS Surveying" Japan Association of Surveyors (2017) [in Japanese]
(4) National Tax Agency: "Appendix on handling notifications regarding the applications of service life" obetsu/sonota/700525/fuhyou/04.htm) [in Japanese]
(5) Statistics Bureau: "Population and average age of employed persons 15 years old and over by employment (sub-classification), age (five-year age group) and sex (total and employee) (specially for one-person households and independent spouses of residents of Self-Defense Forces barracks) – Nationwide" (2013) [in Japanese]
(6) C. Forster, M. Pizzoli, D. Scaramuzza: Fast Semi-Direct Monocular Visual Odometry, ICRA (2014)
(7) Wataru Toishita: "Performance Improvement of Camera Tracking Method Using the Feature Landmark Database," IEICE Technical Report. Technical Group on Media Experience and Virtual Environment. pp. 255-260 (2009) [in Japanese]
(8) Changchang Wu: Towards Linear-time Incremental Structure from Motion, IEEE Conf. on 3D Vision (2013)
(9) Georg Klein: Parallel Tracking and Mapping for Small AR Workspaces, ISMAR (2007)
(10) David Nistér: An Efficient Solution to the Five-Point Relative Pose Problem, IEEE Trans. PAMI, 26, 6, p. 756-777 (2004)
(11) Richard Szeliski: Computer Vision: Algorithms and Applications, Springer (2011)
(12) R. Hartley, A. Zisserman: Multiple View Geometry in Computer Vision (Second Edition), Cambridge University Press (2003)
(13) Takayuki Okatani: "Bundle Adjustment," IPSJ Technical Report 2009-CVIM-167-37 (2009) [in Japanese]
(14) The University of Illinois at Urbana–Champaign, UIUC: 3D Photography Dataset
(15) Changchang Wu: VisualSFM: A Visual Structure from Motion System