
2008 IEEE International Conference on Signal Image Technology and Internet Based Systems

3D Panoramic Reconstruction with an Uncalibrated System of Stereovision using Evolutionary Algorithms


Alain Koch, Albert Dipanda, Claire Bourgeois-République
Université de Bourgogne, LE2I, 9 Avenue Alain Savary, BP 47870, 21000 Dijon, France
alain.koch@u-bourgogne.fr

Abstract

In this paper, a new method to reconstruct a 3D panoramic shape is introduced. It is based on an uncalibrated stereovision system composed of five cameras circularly located around the object to be analysed. The method relies on the detection of interesting points and their matching from one image to another. An evolutionary algorithm provides two basic elements: on the one hand, the interesting point depth values and, on the other hand, the transformation matrix between the two images. Experimental results validate the proposed method.

Index Terms: 3D Panoramic Reconstruction, Evolutionary Algorithm, Uncalibrated System

1. Introduction

3D panoramic reconstruction [13] is an area of research which started more than 10 years ago. It is used in various domains, namely medicine [1], security [2], virtual reality and robotics [3]. Two broad classes of systems can be distinguished for 3D panoramic reconstruction. In the first class, only one camera is used, while the second class relies on several cameras. The double-lobed mirror system [4] belongs to the first class. It is composed of a double hemispherical mirror with two different diameters, the camera being located above the double mirror. The 3D panoramic reconstruction is made from the images obtained through the two mirrors, which provide two different points of view of the scene. In the same class, we can include the cylindrical camera [5]. It takes a 2D panoramic image of the scene, and the 3D panoramic reconstruction requires at least five images taken from different locations in the scene. In the second class, the stereovision systems are constituted of two or more cameras, which may or may not be calibrated. Some uncalibrated systems use either object or camera movements [6][12] to obtain the 3D panoramic reconstruction. Six predefined movements are executed, usually two for each axis (x, y and z), in order to calculate the transformation between the real and observed movements. Three transformations are calculated, one for each axis, and together they yield the global transformation between the real world system and the camera system. Other uncalibrated systems use interesting points, obtained in different images, to calculate the transformation between the different cameras. These points can be determined either with an object whose geometry is known, for example a grid [7], or by placing some markers on the studied object.

In this study, we use an uncalibrated system composed of five cameras located on an arc of a circle around the studied object. Some markers are placed on the object. We simultaneously take five images of the object with the different cameras. Interesting points are extracted from the markers in all the images. The interesting points detected in images acquired by successive cameras are matched [14]. The 3D coordinates of those points are then calculated. Since the system is uncalibrated, we do not have some or all of the parameters needed to calculate the 3D transformation between a pair of images, so a classic linear method cannot compute this transformation. We propose to solve this problem by using an evolutionary algorithm.

This paper is organized as follows. In section 2, we briefly describe the classical calibration method based on the epipolar geometry. In section 3, we detail the proposed 3D reconstruction method: first, interesting points are detected and matched; secondly, the 3D reconstruction is performed using an evolutionary algorithm. Section 4 describes experimental results. Finally, in section 5, we conclude and outline further work.

2. Background on the epipolar geometry



Usually, the 3D reconstruction process requires a stereovision system composed of two or more cameras (see Figure 1). The 3D coordinates are calculated by triangulation, which requires knowing both the positions of the cameras in the scene and the coordinates of the points in each image. Usually, the pinhole camera model is used to calibrate the cameras [15]. For this purpose, the intrinsic and the extrinsic parameters must be calculated [8][9]. The extrinsic parameters give the transformation between a real point in the world system and its corresponding point in the camera system, while the transformation between the camera system and the image plane is provided by the intrinsic parameters. Both sets of parameters must be calculated for each camera. For a given camera, the projection through the intrinsic parameter matrix Ic is written:

$$[p] = [Ic][P] \quad (1)$$

that is,

$$\begin{pmatrix} su \\ sv \\ s \end{pmatrix} = \begin{pmatrix} \alpha_u & 0 & u_0 & 0 \\ 0 & \alpha_v & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$

where (u, v) are the coordinates of the 2D point p in the image plane, (x, y, z) are the coordinates of the 3D point P on the object in the camera system, s is a scale factor (generally equal to one) and (u_0, v_0) are the coordinates of the optical centre in the image plane. The values of the parameters \alpha_u and \alpha_v are calculated by:

$$\alpha_u = k_u f \quad (2)$$

$$\alpha_v = k_v f \quad (3)$$

where k_u (resp. k_v) is the vertical (resp. horizontal) scale factor of the camera and f is the focal length. The four intrinsic parameters in the matrix Ic are \alpha_u, \alpha_v, u_0 and v_0. The extrinsic parameters are given by:

$$[A] = \begin{pmatrix} R & T \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad (4)$$

Figure 1. Theoretical stereovision acquisition system (figure omitted: it shows a world point Pw projected into the image planes of two camera systems).

The matrix A includes a rotation R and a translation T between the world and the camera systems. These elements are known. From the matrices Ic and A we obtain the matrix M, which makes it possible to calculate the 2D coordinates of a world point in the image:

$$[M] = [Ic][A] = \begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{pmatrix} \quad (5)$$

$$p = [M] P \quad (6)$$

In order to calculate the respective left and right camera matrices M and M', the same test pattern is used. For each camera, at least six points must be detected in the image to solve the previous equation system. Then, from the extrinsic matrices A and A', the matrix As is calculated:

$$[As] = [A'][A]^{-1} = \begin{pmatrix} r_{11} & r_{12} & r_{13} & b_x \\ r_{21} & r_{22} & r_{23} & b_y \\ r_{31} & r_{32} & r_{33} & b_z \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} r_1 & b_x \\ r_2 & b_y \\ r_3 & b_z \\ 0 & 1 \end{pmatrix} \quad (7)$$

where r_1, r_2 and r_3 denote the rows of the rotation part and (b_x, b_y, b_z) the translation. Let us consider a given world point viewed respectively as the point P(X, Y, Z) in the left camera system and P'(X', Y', Z') in the right one. The following equation can then be written:

$$P' = [As] P \quad (8)$$

To match a point p(x, y) in the left image with its corresponding point p'(x', y') in the right image, the following equation is used to find the epipolar line on which p' is located:

$$(b_z\, r_2 p - b_y\, r_3 p)\, x' + (b_x\, r_3 p - b_z\, r_1 p)\, y' = b_x\, r_2 p - b_y\, r_1 p \quad (9)$$

The exact point on this line can then be found by using a block matching method or any other matching method.
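To make the projection pipeline of equations (1)-(6) concrete, the following Python sketch builds Ic and A and projects a world point into an image. The numeric values (focal length, scale factors, pose) are illustrative assumptions, not calibration data from the paper.

```python
import numpy as np

def intrinsic_matrix(f, ku, kv, u0, v0):
    """Ic as in equation (1): alpha_u = ku*f, alpha_v = kv*f."""
    return np.array([[ku * f, 0.0,    u0, 0.0],
                     [0.0,    kv * f, v0, 0.0],
                     [0.0,    0.0,   1.0, 0.0]])

def extrinsic_matrix(R, T):
    """A as in equation (4): the 4x4 matrix [R T; 0 1]."""
    A = np.eye(4)
    A[:3, :3] = R
    A[:3, 3] = T
    return A

def project(M, Pw):
    """Equation (6): p = M P, then divide by the scale factor s."""
    su, sv, s = M @ np.append(Pw, 1.0)
    return np.array([su / s, sv / s])

# Illustrative values only (not the paper's calibration data).
Ic = intrinsic_matrix(f=2.18, ku=300.0, kv=300.0, u0=320.0, v0=240.0)
A = extrinsic_matrix(R=np.eye(3), T=np.array([0.0, 0.0, 500.0]))
M = Ic @ A                      # equation (5)
print(project(M, np.array([10.0, 5.0, 100.0])))
```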

3. The proposed 3D reconstruction method


In this section, we describe the proposed 3D reconstruction method. First, we present the image acquisition system that we have designed; next, we develop the different steps of the proposed method, which is based on the calculation of the transformation matrix between two different images of the analysed scene by means of evolutionary algorithms.

3.1. Description of our acquisition system

Our system is composed of five web cameras positioned circularly around the object to be analysed. Two successive cameras are separated by an angle of 30°. The analysed object is placed on a lazy Susan (see Figure 2), which can be rotated by an angle of 11.5° in either direction. At each position of the lazy Susan, five images are acquired, which makes it possible to cover more than half of the complete view of the object. The partial 3D reconstruction, i.e. the 3D reconstruction of the part of the object that is viewed by a single camera, is performed from two images acquired by that camera at two successive positions of the lazy Susan. This reconstruction process requires matching the two images thus acquired. Since the objects to be studied may have no texture and no distinctive painting, the standard interesting point detectors are not very efficient in these cases and can lead to wrong results. Thus, some markers are placed on the object, which makes it possible to locate interesting points more accurately.

Figure 2. A view of our acquisition system. The analysed object in this image is a yellow car.

3.2. Depth calculation using an Evolutionary Algorithm

Our goal is to achieve a partial 3D reconstruction of the object from two images. For this purpose, we propose the following two steps: first, we detect couples of interesting points that are viewed in both images; secondly, we determine the respective depth values of these points by using an evolutionary algorithm (EA). The main idea is the following: as illustrated in Figure 1, two corresponding interesting points, detected respectively in the two images, represent the same 3D point. Thus, in order to transform an interesting point in one image into its corresponding point in the second image, we have to determine the depth of their common 3D point in the world system.

3.2.1. Interesting point detection and matching. The principal step on which the proposed method relies is the determination of particular points in one image, called interesting points, and of their corresponding points in the other image. As mentioned previously, some markers are placed on the object to be analysed in order to facilitate the determination of these interesting points. First, the markers are detected in both images by using the colour threshold method introduced by Fan Yang et al. in [2]. Secondly, we determine the highest and lowest points of each marker; we have chosen these two points because they are the least dependent on the image point of view. Finally, these points are matched across the two images by executing the following steps (a minimal sketch of the detection step is given after the list):
1. All the possible matches between markers of the same colour are created.
2. The impossible cases are eliminated, i.e. those implying a permutation of two or more markers between the images.
3. Only one possibility is left: the correct matching.
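As a rough illustration of the detection step, the sketch below thresholds an image in HSV space and keeps the highest and lowest pixel of each marker blob. The colour ranges and the HSV representation are assumptions for illustration; the paper only specifies the colour threshold method of [2].

```python
import numpy as np

def marker_extremes(mask):
    """Return the highest and lowest points (smallest and largest row
    index) of a binary marker mask, as (x, y) image coordinates."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    top, bottom = ys.argmin(), ys.argmax()
    return (xs[top], ys[top]), (xs[bottom], ys[bottom])

def detect_interesting_points(hsv_image, colour_ranges):
    """colour_ranges maps a marker colour name to (low, high) HSV bounds
    (hypothetical values; the paper uses the thresholding of [2])."""
    points = {}
    for name, (low, high) in colour_ranges.items():
        mask = np.all((hsv_image >= low) & (hsv_image <= high), axis=2)
        extremes = marker_extremes(mask)
        if extremes is not None:
            points[name] = extremes
    return points
```

Matching then reduces to pairing, for each colour, the marker points of the left image with those of the right image while rejecting pairings that would permute the marker order between the two views.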

3.2.2. Evolutionary Algorithm. Evolutionary Algorithms (EAs) [10] are adaptive procedures that find solutions to problems through an evolutionary process based on natural selection. An EA uses a fixed- or variable-size, finite population of potential solutions to a problem. Each individual solution is encoded as a chromosome made up of a string of genes, which take values in either a binary or a non-binary alphabet. An EA comprises three main stages: evaluation, selection and mating. They are applied cyclically and iteratively until saturation or some other stopping condition is satisfied. At the evaluation stage, each chromosome is assigned a fitness value, which represents its ability to solve the problem; the fitness function directly relates to the objective function to be minimized or maximized. At the selection stage, chromosomes are chosen based on their fitness, in such a way that better chromosomes are more likely to be selected. Fitness selection models the reproductive success of adapted organisms in their environment. At the mating stage, crossover and mutation operations are performed. The crossover operation recombines pairs of selected chromosomes, also called parent chromosomes, to form two new offspring; it models the exchange of genetic information during sexual reproduction. The mutation operation creates new offspring by modifying one or more genes of chromosomes chosen randomly from the mating pool; it models the occasional alteration of genetic information. From generation to generation, this process leads to increasingly better chromosomes and to near-optimal solutions.

3.2.3. Chromosome encoding. The input of our reconstruction problem is a set of interesting points, namely the highest and lowest points of the detected markers. We assume that, for a given marker, these two points have the same depth value, so in the following we take the marker depth to be the depth of either of its two interesting points. A chromosome of our EA represents the depth values of the markers. Thus, a chromosome is composed of N genes, where N is the number of markers, and a gene is a real number lying between 0 and the maximum length of the object.

3.2.4. Fitness function. Our goal is to find the depth values of the different markers. Recall that, at this stage, the known parameters are, on the one hand, the rotation angle between the cameras and, on the other hand, the focal length of each camera. In order to evaluate the correctness of the depth values, we use a transformation that projects the interesting points of the left image onto their corresponding matched points in the right image. This transformation requires four steps for each interesting point. Let us consider the interesting point Pli(Xli, Yli) in the left image and Pri(Xri, Yri) its matched point in the right image. The point Pli is transformed as follows:
1. A perspective back-projection T1 is applied to Pli, which gives the 3D point Plci(Xlci, Ylci, Zlci) in the left camera coordinate system. Note that the coordinate Zlci is the ith gene value of the chromosome.
2. A 3D rotation R2 is applied to the point Plci(Xlci, Ylci, Zlci) to obtain the point Prci(Xrci, Yrci, Zrci) in the right camera coordinate system.
3. The point Prci is transformed into the 2D point Pri(Xri, Yri) in the right image by a perspective projection T3.
4. A last transformation T4, which accounts for the translation, the residual rotation and the distortion between the two cameras, is applied. The matrix T4 is computed from the matrix P built by stacking all the points Pri obtained from all the interesting points of the left image. Since the matrix P is not square, we use the pseudo-inverse method to invert it. For a given non-square matrix M, we have:

$$Id = (M^T M)^{-1} M^T M \quad (10)$$

where Id is the identity matrix. Thus, from equation (10), we deduce that T4 is given by:

$$T4 = (P^T P)^{-1} P^T \quad (11)$$

The final point in the right image, obtained from Pri by applying the transformation T4, is denoted Prfi(Xrfi, Yrfi). If the depth values in the chromosome are correct, then the two points Prfi and Pri are equal for every couple of matched interesting points in the left and right images. The fitness fi of the ith chromosome is the sum, on the one hand, of all the error distances di between the corresponding points Pri and Prfi and, on the other hand, of the maximum error distance found in the chromosome. This maximal error value is used in order to penalize chromosomes that present a low mean error while one of the gene errors remains very large. Finally, the fitness value fi is given by:

$$f_i = \left( \sum_{j=1}^{2N} d_j \right) \times \max_j(d_j) \quad (12)$$

where the sum runs over the 2N matched interesting points.
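A minimal sketch of this fitness evaluation is given below. The geometry helpers follow the steps described above, under two simplifying assumptions made for illustration: the rotation between the two camera frames is a pure rotation about the vertical axis by the known inter-camera angle, one interesting point is paired with each gene (the paper uses two points per marker), and the residual correction T4 is omitted for brevity.

```python
import numpy as np

def back_project(pt2d, depth, f):
    """Step 1 (T1): lift an image point to 3D in the left camera frame,
    using the candidate depth carried by the chromosome gene."""
    x, y = pt2d
    return np.array([x * depth / f, y * depth / f, depth])

def rotation_y(angle_deg):
    """Step 2 (R2): rotation between the two camera frames (assumed here
    to be about the vertical axis by the known inter-camera angle)."""
    a = np.radians(angle_deg)
    return np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0,       1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

def project(pt3d, f):
    """Step 3 (T3): perspective projection into the right image."""
    return f * pt3d[:2] / pt3d[2]

def fitness(chromosome, left_pts, right_pts, f, angle_deg):
    """Equation (12): (sum of reprojection errors) * (max error).
    The residual pseudo-inverse correction T4 is omitted here."""
    R2 = rotation_y(angle_deg)
    errors = []
    for depth, pl, pr in zip(chromosome, left_pts, right_pts):
        predicted = project(R2 @ back_project(pl, depth, f), f)
        errors.append(np.linalg.norm(predicted - np.asarray(pr)))
    return sum(errors) * max(errors)
```

Lower values are better: a chromosome whose depths reproject every left point exactly onto its matched right point gets a fitness of zero.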

It can be noticed that the optimisation problem at hand is a minimization problem: the smaller the fitness value, the better the chromosome.

3.2.5. Evolutionary operators

Selection: To select the chromosomes for the evolutionary operators, we use rank selection [11] over the entire population. First, the chromosomes are sorted from best to worst according to their fitness. Then, each chromosome is given a selection probability that depends on its rank, calculated as follows:

$$P_i = \frac{2(N + 1 - i)}{N(N + 1)} \quad (13)$$

where P_i is the probability of selecting the chromosome of rank i and N is the population size.

Crossover: Two crossover operations are applied after selection in order to diversify the offspring: a single point crossover [16] and an algebraic crossover. The single point crossover is a well-known operation in EAs: a gene position is selected randomly in the chromosome; the first child is obtained by taking the beginning of the first parent, i.e. from its first gene up to the selected gene, followed by the end of the second parent; the second child is built symmetrically, taking the beginning of the second parent and the end of the first, as shown in Figure 3. For the algebraic crossover, a linear combination is applied gene by gene to two chromosomes C1 and C2 whose ranks, found at the evaluation step, are r1 and r2 respectively. The coefficients used for their children are calculated as:

$$\alpha_1 = \max(r_1, r_2)/100, \quad \beta_1 = 1 - \alpha_1, \quad \alpha_2 = \beta_1, \quad \beta_2 = \alpha_1 \quad (14)$$

The single point crossover produces two offspring by combining the genes of two parents in order to improve the next generation, while the algebraic crossover performs a search between the two hyperplanes defined by the two parents.

Mutation: The mutation operation randomly selects one of the genes and replaces its value by a random value chosen in the interval defined for the depth.

Figure 3. Single point crossover operator (figure omitted: it shows parent chromosomes V1..Vn and W1..Wn exchanging their tails after a selected gene position p to form two children).
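The following sketch shows one possible implementation of these three operators. The algebraic crossover coefficients follow equation (14) as reconstructed above, which should be treated as an assumption where the original typesetting is ambiguous.

```python
import random

def rank_select(sorted_pop):
    """Rank selection, equation (13): sorted_pop is ordered best-first."""
    n = len(sorted_pop)
    weights = [2 * (n + 1 - i) / (n * (n + 1)) for i in range(1, n + 1)]
    return random.choices(sorted_pop, weights=weights, k=1)[0]

def single_point_crossover(p1, p2):
    """Swap the tails of the two parents after a random gene position."""
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def algebraic_crossover(c1, c2, r1, r2):
    """Gene-wise linear combination with coefficients from eq. (14)."""
    a1 = max(r1, r2) / 100.0
    b1 = 1.0 - a1
    child1 = [a1 * g1 + b1 * g2 for g1, g2 in zip(c1, c2)]
    child2 = [b1 * g1 + a1 * g2 for g1, g2 in zip(c1, c2)]
    return child1, child2

def mutate(chromosome, max_depth):
    """Replace one randomly chosen gene by a random depth value."""
    out = list(chromosome)
    out[random.randrange(len(out))] = random.uniform(0.0, max_depth)
    return out
```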

4. Experimental results

In this section we describe the experiments that were conducted to assess the feasibility and the validity of the proposed approach. Experiments were run on different objects; however, due to space limitations, we only illustrate the results obtained on one object in the following. It is a yellow car on which ten markers of different colours were placed as follows: two markers on the front and the back, and three markers on each side (see Figure 4). As mentioned previously, the system is composed of five web cameras separated by an angle of 30°, and the focal length of all the cameras is 2.18 mm.

The first experiments concern the interesting point detection and matching processes. Figures 4a and 4b illustrate an example of the interesting point detection results on two images acquired by two successive cameras separated by an angle of 30°. It can be observed that the highest and lowest points are well detected on most of the markers; however, small location errors can be noted for the interesting points of the last marker on the right in Figure 4b. This marker is not well segmented because of the quality of the images. Figure 4c shows the result of the interesting point matching process between the two successive images of the previous figures. It can be observed that all the interesting points of one image are correctly matched with their corresponding points in the other image.

Figure 4. a) Initial image with the matching. b) Final image. c) Matching result between the two images.

Figure 5d shows a 2D panoramic reconstruction obtained from images acquired by three successive cameras. The images are connected based on the interesting points detected in the different images: two successive images are linked by superimposing two corresponding interesting points detected in both of them. Note that the different image parts that constitute the final panorama are not otherwise transformed.

Figure 5. (a, b and c) Three different views of the front of the car obtained by three successive cameras. (d) The 2D panoramic reconstruction from the three images.

The second experiments were conducted in order to determine the EA parameters. More than 100 trials were executed for each parameter, and for each run the optimization process started from a randomly generated initial population. Since EAs are stochastic, the process differed from one run to another. All the runs were carried out until convergence. After these trials, the parameter values were set as follows: fixed population size = 100 individuals, maximum number of generations = 100, single point crossover rate = 0.5, algebraic crossover probability = 0.3, mutation rate = 0.05. For this set of parameters, convergence occurs in an average time of 10 seconds. Figure 6 illustrates an example of the fitness curves obtained from the different runs; this curve is representative of most of the runs. It can be observed that the process quickly converges to the global minimum.

Figure 6. Example of the fitness evolution of the best chromosome versus the generation number.

Table 1 and Figure 7 show the estimated depth values of the different markers. Since EAs are stochastic, the results are not exactly the same from one run to another; the results shown here average about 50 runs. Note that the origin of the world system is located on the fourth marker, which gives the value zero for this marker. It can be observed that the proposed method yields very good results despite the interesting point location errors mentioned for Figure 4b.

Table 1. Comparison between the real and the estimated depth values

Marker   Exact value   Average estimated value   Mean deviation
1        100           99                        5
2        70            72                        6
3        30            32                        5
4        0             0                         0
5        20            18                        4

Figure 7. Curves of the real and estimated marker depth values (figure omitted: it plots the real and estimated depths against the marker index).
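To tie the operators together, here is a hedged sketch of a generational loop using the parameter values quoted above (population 100, 100 generations, crossover rates 0.5 and 0.3, mutation rate 0.05). It reuses the functions sketched in sections 3.2.4 and 3.2.5; the elitism of the two best individuals is an assumption, not a detail given in the paper.

```python
import random

def run_ea(eval_fitness, n_genes, max_depth,
           pop_size=100, generations=100,
           p_single=0.5, p_algebraic=0.3, p_mutation=0.05):
    """Generational EA loop with the parameter values reported above."""
    pop = [[random.uniform(0.0, max_depth) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=eval_fitness)      # minimization: best chromosome first
        nxt = pop[:2]                   # keep the two best (elitism, assumed)
        while len(nxt) < pop_size:
            p1, p2 = rank_select(pop), rank_select(pop)
            r = random.random()
            if r < p_single:
                c1, c2 = single_point_crossover(p1, p2)
            elif r < p_single + p_algebraic:
                c1, c2 = algebraic_crossover(p1, p2, pop.index(p1) + 1,
                                             pop.index(p2) + 1)
            else:
                c1, c2 = list(p1), list(p2)
            nxt += [mutate(c, max_depth) if random.random() < p_mutation else c
                    for c in (c1, c2)]
        pop = nxt[:pop_size]
    return min(pop, key=eval_fitness)
```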

Figure 8 illustrates a 3D representation of the whole set of results obtained from all the runs. As can be seen, the different results are very similar, since they are concentrated around the same values, which confirms that the process converges well towards the global minimum.

Figure 8. 3D representation of the whole set of results (figure omitted: it plots the estimated marker positions from all the runs in the (X, Z) plane).

Figure 9 shows two examples of the estimated transform results. For each example, the third image is obtained by superimposing the second image and the transform result of the first image. As can be seen, in both cases the superimposition results are quite precise on the parts of the car where the markers are located.

Figure 9. Two examples of the transform results. a) and b) (resp. d) and e)): two successive images. c) (resp. f)): superimposition of the image in b) (resp. e)) and the transformed image in a) (resp. in d)).

Figure 10 shows two views of a 3D reconstruction of the object. This result is obtained by combining the depths given by the EA with the 2D panoramic reconstruction, as follows (a sketch of step 3 is given after the captions):
1. The EA gives the depth of all the markers seen by the cameras for the left and right sides of the object.
2. The panoramic reconstruction gives the corresponding points from the left to the right images.
3. The transformation between the left and right points matched in the previous step, composed of a rotation and a translation, is calculated with the pseudo-inverse matrix method.

Figure 10. Two views of the 3D reconstruction of the car. a) View from above the car. b) View from the right side of the car.
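Step 3 can be sketched as a least-squares fit of a rotation-plus-translation map via the pseudo-inverse, in the spirit of equations (10)-(11). The formulation below, which stacks homogeneous point coordinates and does not enforce orthonormality of the rotation part, is an assumption for illustration, as are the sample point values.

```python
import numpy as np

def fit_rigid_transform(left_pts, right_pts):
    """Least-squares [R|T] such that right ~ R*left + T, solved with the
    pseudo-inverse (P^T P)^{-1} P^T of equation (11)."""
    P = np.hstack([np.asarray(left_pts, float),
                   np.ones((len(left_pts), 1))])
    Q = np.asarray(right_pts, float)
    # Solve P X = Q in the least-squares sense; X is 4x3, i.e. [R|T]^T.
    X = np.linalg.pinv(P) @ Q
    return X[:3, :].T, X[3, :]   # rotation part, translation part

# Hypothetical matched 3D marker points from the left and right views.
left = [[0, 0, 100], [10, 0, 100], [0, 10, 100], [10, 10, 90]]
right = [[5, 0, 99], [15, 0, 99], [5, 10, 99], [15, 10, 89]]
R, T = fit_rigid_transform(left, right)
```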

5. Conclusion

In this paper we have presented a global image analysis system for reconstructing 2D and 3D panoramic images. The system is composed of five uncalibrated cameras positioned circularly around the object to be analysed. The proposed method consists of two basic phases: first, the detection and matching of interesting points in two successive images; next, the determination of the interesting point depth values and of the transformation matrix between the two images. We have defined evolutionary operators and an original fitness function well suited to calculating the elements of this second phase. Experimental results validate the effectiveness and the correctness of the proposed method. Although the study presented in this paper provides very promising results, it remains the first step of a larger work with applications in static and dynamic domains. Our ongoing work aims, first, at improving the interesting point detection method by performing a colour calibration at the beginning of the process; second, at increasing the number of interesting points in order to obtain a complete 3D object reconstruction, for instance by adding a structured light projector to the system; and finally, at improving the final 2D panoramic image by exploiting the estimated transformation matrix between two images.

REFERENCES
[1] J. Zhang, Y. Ge, S.H. Ong, C.K. Chui, S.H. Teoh, and C.H. Yan, "Rapid surface registration of 3D volumes using a neural network approach", Image and Vision Computing, Vol. 23 (2008), pp. 201-210.
[2] F. Yang, M. Paindavoine, H. Abdi, and D. Arnoult, "Fast Image Mosaicing for Panoramic Face Recognition", Journal of Multimedia, Vol. 1(2) (2006), pp. 14-20.
[3] N. Ayache, Vision stéréoscopique et perception multisensorielle: applications à la robotique mobile, Paris, 1989.
[4] M. Fiala and A. Basu, "Panoramic stereo reconstruction using non-SVP optics", Computer Vision and Image Understanding, Vol. 98(3), pp. 363-397.
[5] R. Bunschoten and B. Kröse, "3D scene reconstruction from cylindrical panoramic images", Robotics and Autonomous Systems, Vol. 41(2-3), pp. 111-118.
[6] H.C. Longuet-Higgins, "A computer algorithm for reconstructing a scene from two projections", Nature, Vol. 293 (1981), pp. 133-135.
[7] P.R.S. Mendonça and R. Cipolla, "A simple technique for self-calibration", IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1 (1999).
[8] X. Armangué and J. Salvi, "Overall view regarding fundamental matrix estimation", Image and Vision Computing, Vol. 21 (2003), pp. 205-220.
[9] O. Faugeras and G. Toscani, "Camera Calibration for 3D Computer Vision", Proc. International Workshop on Machine Vision and Machine Intelligence, Tokyo, Japan, Feb. 1987.
[10] P.J. Fleming and R.C. Purshouse, "Evolutionary algorithms in control systems engineering: a survey", Control Engineering Practice, Vol. 10 (2002), pp. 1223-1241.
[11] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Boston, MA, 1989.
[12] Y. Meng and H. Zhuang, "Autonomous robot calibration using vision technology", Robotics and Computer-Integrated Manufacturing, Vol. 23 (2007), pp. 436-446.
[13] D. Gledhill, G.Y. Tian, D. Taylor, and D. Clarke, "Panoramic imaging: a review", Computers and Graphics, Vol. 27 (2003), pp. 435-445.
[14] J. Park and S. Inoue, "Hierarchical depth mapping from multiple cameras", Proc. of ICIAP '97, Vol. 1, Florence, Italy, September 1997, pp. 685-692.
[15] R.D. Morris, V.N. Smelyanskiy, and P.C. Cheeseman, "Matching Images to Models: Camera Calibration for 3-D Surface Reconstruction", Proc. of Energy Minimization Methods in Computer Vision and Pattern Recognition, Vol. 1, Sophia Antipolis, France, September 2001, pp. 105-117.
[16] H. Schwefel, Advances in Computational Intelligence: Theory and Practice, Springer-Verlag, New York, 2002.

