You are on page 1of 4

2015 4th International Conference on Computer Science and Network Technology (ICCSNT 2015)

A RGBD SLAM Algorithm Combining ORB with


PROSAC for Indoor Mobile Robot

Guan-xi XIN, Xu-tang ZHANG, Xi Wang and Jin Song


School of Mechatronics Engineering
Harbin Institute of Technology, Harbin 150001, China
E-mail: xinguanxi1992@hotmail.com

Abstract—In order to enhance the instantaneity of SLAM for development in robot, and a typical example is Kinect which
indoor mobile robot, a RGBD SLAM method based on Kinect has been categorised under RGBD cameras.[4] Kinect has a
was proposed. In the method, oriented FAST and rotated BRIE- prominent advantage that it can obtain depth image and color
F(ORB) algorithm was combined with progressive sample image simultaneously which can lower data association and
consensus(PROSAC) algorithm to execute feature extracting and computation complexity. In addition, it has an acceptable price.
matching. More specifically, ORB algorithm which has better Based on the Kinect sensor, RGBD SLAM was coined by P.
property than many other feature descriptors was used for Henry in 2010.[5]
extracting feature. At the same time, ICP algorithm was adopted
for coarse registration of the point clouds, and PROSAC As for vision based SLAM, in order to build a 3D map of
algorithm which is superior than RANSAC in outlier removal the surrounding environment, the information of environment
was employed to eliminate incorrect matching. To make the must be extracted from images. Therefore, it’s very important
result more accurate, pose-graph optimization was achieved to select suitable feature descriptors. Scale Invariant Feature
based on g2o framework. In the end, a 3D volumetric map which Transform (SIFT) or Speeded Up Robust Features (SURF)
can be directly used to the navigation of robots was created. which is a variant of SIFT is commonly regarded as feature
descriptors in visual SLAM[6][7]. However, there is no doubt
Keywords—SLAM; KINECT; RGB-D SLAM; ORB; PROSAC that whatever SIFT or SURF. They can’t meet the needs of the
instantaneity well which is an important parameter for SLAM.
I. INTRODUCTION Aiming at the instantaneity problem of SLAM, this paper
Mobile robot is a complexly integrated system including proposed a RGBD SLAM algorithm combining ORB[8] with
computer technology, sensor technology, information PROSAC algorithm [9] based on Kinect. ORB algorithm was
processing, electronic engineering, automation, control used to extract feature, and PROSAC algorithm which is
engineering, artificial intelligence and so on. At the same time, superior than RANSAC in outlier removal was employed to
it is one of the most active and promising fields at present. eliminate incorrect matching. To make the result more accurate,
There are many research directions in the field of mobile robot pose-graph optimization was achieved based on g2o
in which autonomous navigation is one of the most challenging framework[10]. In the end, a 3D volumetric map which can be
abilities. Autonomous navigation problem of mobile robot can directly used to the navigation of robots was created .
be divided into three parts: localization, mapping and path
planning[1]. After years of research, scholars have realized II. THE DESCRIPTION OF PROPOSED METHOD
Simultaneous Localization and Mapping(SLAM) is the crux to
realize autonomous navigation of mobile robot in the true A. System Framework
sense[2].
In this section, the system framework of proposed method
With the development of SLAM algorithm, there are is shown in Fig.1. Like most methods of RGB-D SLAM, the
mainly three kinds of sensors to be used in SLAM domain. proposed method is a graph-based SLAM. There are five
They are range lasers, binocular and monocular. Many scholars primary modules in the method, including feature detection,
have done a lot of research about SLAM based on the three transformation estimation, ICP algorithm, graph optimization
sensors. However, they all have their own shortcomings. Laser and map representation. The main process is as follows: Firstly,
sensors can obtain detailed information of the surrounding depth images and RGB images were obtained by Kinect sensor.
environment, but it has the following problems: not useful in Then, feature detection was executed using ORB algorithm, at
highly chaotic environments or for object identification; costly, the same time, The PROSAC algorithm is utilized to get more
heavy, making their use difficult for aerial robotics or accurate inlier matching points in motion transformation
humanoid robots[3]. With regard to monocular or binocular, estimation. Dense point clouds were registered with the
they both obtain depth information at a very heavy cost, by addition of the depth information. To obtain globally optimal
means of structure from motion or stereo matching, consuming poses of the robot and the map of the environment, a graph-
a lot of time. In recent years, one of the most recent based optimization method by g2o frame was applied to the
developments in sensing technology has led to important system.Thus, the trajectory of the robot can be obtained. In the

978-1-4673-8173-4/15/$31.00 ©2015 IEEE 71 Harbin, China


meantime, a colored 3D voxel map of the environment can be Then, BRIEF operator was carried out in the section of
generated. feature description. BRIEF descriptor has the advantages of
simple description, small storage space and fast speed.
kinect However, BRIEF has defect that it does not have rotation
invariance, which rBRIEF operator in ORB descriptor can
Depth RGB solve. To give orientation information to BRIEF, point sets
images images need to be modified into a matrix like the equation (4)

Transformation
Feature matching(ORB)  x , , xn 
estimation(PROSAC)
S = 1 (4)

 y1 , , yn 
Graph Robot Point
optimization(g2o) trajectory clouds Corresponding rotation matrix R𝜭𝜭 can be calculated by the
direction angle 𝜭𝜭 of FAST. Thus the test point sets including
orientation property can be calculated S𝜭𝜭= R𝜭𝜭S. The new
3D volumetric descriptor can be represented as follows:
Pose graph
map

Fig. 1. System framework g n ( p, q ) f n ( P) ( xi , yi ) ∈ Sθ


= (5)

B. ORB Algorithm for Feature Detection


The result of ORB matching can be shown in Fig.2.
It’s well known that ORB which is a combination of FAST
detector [11] and BRIEF descriptor[12] is much faster than
SIFT and SURF. ORB is a kind of locally invariant feature
descriptor which possesses invariance of translation, rotation
and scaling for photos and images. Fast detector does not
produce multi-scale features. However, the ORB applies a
scale pyramid of the image, and generates FAST features at
every level of the pyramid. FAST operator was applied in the
section of feature point detection, and ORB has implemented
some improvement aiming at the problem that FAST operator
does not have the property of orientation. On account of the
problem that FAST detector doesn’t provide the method of
describing the nature of the angular point and has a non- Fig. 2. The ORB features of the two images
ignorable response of edge, oFAST detector employed Harris
corner detector in the process of sorting of FAST detector. In
comparison with FAST detector, the major improvement of C. PROSAC Algorithm for Transformation Estimation
oFAST is the addition of orientation property which makes Chances are that wrong matchings which are also termed
feature description operator possess rotation invariance. outliers may appear when it comes to feature matching.
Although these outliers are not worth mentioning compared to
The orientation is calculated based on the intensity centroid. the whole feature points, they will give rise to a lot of mistakes
Firstly, a moment of image region was defined as follows: in the global motion estimation. So some measures which can
distinguish inliers from outliers and eliminate outliers must be
taken to choose the correct points in parameter estimation.
m pq = ∑ x p yq I ( x, y ) (1)
x, y As a general rule, there are a majority of valid data and a
little invalid data in the dataset. Least square method or similar
The center of mass of the image can be found by the methods can be applied to determine the model parameter and
moment error. However, least square method can’t be used when there
are a great number of invalid data. So PROSAC algorithm
which is a variant of RANSAC is adopted to eliminate the error
m10 m01 matching. The PROSAC algorithm eliminates outliers by
C =( , ) repeated iterative operation and employs least square method to
m00 m00 (2) estimate parameter matrix .
Thus, a vector can be obtained, and the direction of the PROSAC algorithm is superior to RANSAC algorithm in
vector can represent the main orientation of FAST the matter of robustness and computational efficiency.
RANSAC algorithm randomly draws samples without
considering the differences between samples. In contrast to
θ = a tan 2( m01 , m10 ) (3) RANSAC, PROSAC algorithm sorts the correlation between

72
sampling points and the model in the first place, then iterates 1 N 2

from the correlation of sampling points, finally the optimal E ( R, t )


=
N
∑ RPr + t − Pl (6)
solution can be obtained by continuous hypothesis and i =1

validation. The flow chart of PROSAC algorithm is shown in


Fig.3 , and the result of outlier removal is shown in Fig.4. To calculate rotation and translation matrix R and t which
make the equation minimal, quaternion algorithm is applied
instead of SVD method which may generate singular matrix.
Start
E. Graph Optimization
Iterative setting In general , a graph is made up of nodes and edges. A node
of related item is on behalf of the pose of robot , and a edge represents the
pose transformation of two adjacent moment. Once a graph is
built completely, the pose of robot should be adjusted to meet
Y the boundary constraints. A graph-based SLAM system can be
Iterations> Maximum iterations divided into two modules: front-end and back-end. In the front-
N end module, geometric constraints need to be extracted from
Sorting and the sensor data, and the problem of data association needs to be
no solution choose high- Iterations+1 solved. Finally , nodes and edges of the graph structure should
quality data be constructed. In the back-end module, the assignment is to
look for the state vector which can make likelihood function of
geometric constraints maximum. A classical graph-based
model parameters SLAM is shown as Fig.5
and inliers of the
calculations Optimized poses
Sensor
data
The number of N
inliers>pre-set threshold
Pose and
Y Front-end constraints Back-end
Return parameters (graph construction) (optimization)
Of inliers and
model
Fig. 5. A graph-based SLAM system

To solve the problem of pose graph optimization, we make


start
use of g2o framework like Fig.6. The solution of SLAM can be
regarded as the minimum value of the function as follows:
Fig. 3. PROSAC algorithm


T
=F ( x) e( xi , x j , zij ) Ωij e( xi , x j , zij ) (7)
<i , j >∈C

x* = arg min x F ( x) (8)

x* is the result of final optimization, x=(x1T, x2T, …, xnT) is


a vector of camera poses. Zij and Ωij indicate the mean and the
information matrix of a constraint with respect to the camera
poses xi and xj. e(xi,xj,zij) is a vector error function which
Fig. 4. Matching results with PROSAC
measures how well the camera poses xi and xj satisfies the
constraint zij. If the poses xi and xj perfectly satisfies the
D. ICP Algorithm constraint zij ,the error is zero[13].
The Iterative Closest Point (lCP) algorithm is often used to
align two groups of point clouds. Assuming that there are two
groups of point clouds named Pr and Pl, then we can obtain the
rigid transformation of the two groups of point clouds by
calculating the minimum of the equation as follows:

73
[2] H. Durrant-Whyte and T. Bailey, "Simultaneous localization and
e45 x5 mapping: Part I," IEEE Robotics and Automation Magazine, vol. 13, pp.
99-108, 2006.
x4 [3] J. Fuentes-Pacheco, J. Ruiz-Ascencio, and J. M. Rendon-Mancha,
"Visual simultaneous localization and mapping: a survey," Artificial
Intelligence Review, vol. 43, pp. 55-81, 2012.
x1 e42 e34
[4] G. Hu, S. Huang, L. Zhao, A. Alempijevic, and G. Dissanayake, "A
e12 robust RGB-D SLAM algorithm," in 25th IEEE/RSJ International
Conference on Robotics and Intelligent Systems, IROS 2012, October 7,
e23 x3 2012 - October 12, 2012, Vilamoura, Algarve, Portugal, 2012, pp. 1714-
x2 1719.
[5] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, "RGB-D mapping:
Using depth cameras for dense 3D modeling of indoor environments," in
Fig. 6. Graph structure of g2o 12th International Symposium on Experimental Robotics, ISER 2010,
December 18, 2010 - December 21, 2010, New Delhi, Agra, India, 2014,
pp. 477-491.
F. Map Representation [6] P.-C. Su, J. Shen, and S.-C. S. Cheung, "A robust RGB-D SLAM system
So far, a globally consistent trajectory of robot has been for 3D environment with planar surfaces," in 2013 20th IEEE
obtained. Then we can project all point measurements in a International Conference on Image Processing, ICIP 2013, September 15,
2013 - September 18, 2013, Melbourne, VIC, Australia, 2013, pp. 275-
global 3D coordinate system by means of the trajetory, thus a 279.
point cloud map of the environment can be generated. However,
[7] F. Endres, J. Hess, J. Sturm, D. Cremers, and W. Burgard, "3-D
the map is redundant and demands large calculational and Mapping with an RGB-D camera," IEEE Transactions on Robotics, vol.
memory sources. 30, pp. 177-187, 2014.
[8] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, "ORB: An efficient
To solve the problem, the OctoMap[14] is introduced to the alternative to SIFT or SURF," in 2011 IEEE International Conference on
system to indicate the 3D colored map of the environment. 3D Computer Vision, ICCV 2011, November 6, 2011 - November 13, 2011,
map of lab is shown as Fig. 7 Barcelona, Spain, 2011, pp. 2564-2571.
[9] O. Chum and J. Matas, "Matching with PROSAC - Progressive sample
consensus," in 2005 IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, CVPR 2005, June 20, 2005 - June 25,
2005, San Diego, CA, United states, 2005, pp. 220-226.
[10] R. Kummerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard,
"G2o: A general framework for graph optimization," in 2011 IEEE
International Conference on Robotics and Automation, ICRA 2011, May
9, 2011 - May 13, 2011, Shanghai, China, 2011, pp. 3607-3613.
[11] E. Rosten and T. Drummond, "Machine learning for high-speed corner
detection," in 9th European Conference on Computer Vision, ECCV
2006, May 7, 2006 - May 13, 2006, Graz, Austria, 2006, pp. 430-443.
[12] M. Calonder, V. Lepetit, C. Strecha, and P. Fua, "BRIEF: Binary robust
independent elementary features," in 11th European Conference on
Computer Vision, ECCV 2010, September 5, 2010 - September 11, 2010,
Heraklion, Crete, Greece, 2010, pp. 778-792.
[13] L. Meng, C. W. De Silva, and J. Zhang, "3D visual SLAM for an
assistive robot in indoor environments using RGB-D cameras," in 9th
Fig. 7. 3D colored map of the lab. International Conference on Computer Science and Education, ICCCSE
2014, August 22, 2014 - August 24, 2014, Vancouver, BC, Canada,
2014, pp. 32-37.
III. CONCLUSIONS [14] A. Hornung, K. M. Wurm, M. Bennewitz, C. Stachniss, and W. Burgard,
This paper proposed a RGB-D SLAM algorithm combining "OctoMap: An efficient probabilistic 3D mapping framework based on
ORB with PROSAC. It was applied to an indoor environment octrees," Autonomous Robots, vol. 34, pp. 189-206, 2013.
for mobile robot. ORB algorithm which has better property
than many other feature descriptors was used for extracting
feature. At the same time, ICP algorithm was adopted for
coarse registration of the point clouds, and PROSAC algorithm
which is superior than RANSAC in outlier removal was
employed to eliminate incorrect matching. To make the result
more accurate, Pose-graph optimization was achieved based on
g2o framework. Finally, a 3D volumetric map was created.

REFERENCES
[1] C. Stachniss, "Robotic Mapping and Exploration," in Robotic Mapping
and Exploration, Tiergartenstrasse 17, Heidelberg, D-69121, Germany,
2009, pp. 1-217.

74

You might also like