
Machine Vision and Applications (2010) 21:403-412
DOI 10.1007/s00138-008-0171-x

ORIGINAL ARTICLE

Interactive color image segmentation with linear programming


Hongdong Li Chunhua Shen

Received: 18 January 2008 / Accepted: 15 September 2008 / Published online: 24 October 2008
© Springer-Verlag 2008

Abstract Image segmentation is an important and fundamental task for image and vision understanding. This paper describes a linear programming (LP) approach for segmenting a color image into multiple regions. Compared with the recently proposed semi-definite programming (SDP)-based approach, our approach has a simpler mathematical formulation and a far lower computational complexity. In particular, to segment an image of $M \times N$ pixels into $k$ classes, our method requires only $O((MNk)^m)$ complexity, in sharp contrast to the $O((MNk)^{2n})$ complexity of the SDP method, where $m$ and $n$ are the polynomial degrees of the complexity of the corresponding LP solver and SDP solver, respectively (in general we have $m \ll n$). Such a significant reduction in computation readily enables our algorithm to process color images of reasonable sizes. For example, while the existing SDP relaxation algorithm is only able to segment a toy-size image of, e.g., $10 \times 10$ to $30 \times 30$ pixels in hours, our algorithm can process larger color images of, say, $100 \times 100$ to $500 \times 500$ pixels in much shorter time.

Keywords Interactive image segmentation · Linear programming · Object cutout
H. Li (✉) · C. Shen
Australian National University, Canberra, ACT 0200, Australia
e-mail: hongdong.li@anu.edu.au
C. Shen
e-mail: cs@rsise.anu.edu.au
H. Li · C. Shen
NICTA (National ICT Australia), Canberra Research Lab, Canberra, ACT 2601, Australia

1 Introduction

The task of segmenting a color image into multiple meaningful regions is of central importance to image processing and computer vision. It has many practical applications in real-world problems. For example, in visual tracking or video surveillance the input images are generally required to be segmented into two regions: foreground object and background. This is basically a binary segmentation problem. Many high-level image understanding tasks (e.g., object recognition) rely on partitioning the image into meaningful and color-homogeneous regions.

There has been a large body of work published on the subject of image segmentation. Various algorithms have been proposed, with the normalized cut [2], mean shift [1], graph cut [4,7], belief propagation [8], and convex optimization [9-11] being some of the most popular methods of choice. Different algorithms have different motivations and mathematical origins. The work presented in this paper belongs to the convex optimization family. In particular, the algorithm we are going to describe is largely inspired by the recently proposed semi-definite programming (SDP) algorithm, which appeared in a series of papers such as [10,11].

Our principal motivation is to improve significantly the computational efficiency of the SDP algorithm. To this end, we propose a much simpler convex formulation, which is in fact a linear programming (LP) formulation, of the color image segmentation problem. Besides being faster, such an LP approach also has the benefit of higher flexibility in that prior knowledge (e.g., user interactions), if it is linearly representable, can be easily incorporated into the computation. In this paper we will demonstrate this by experiments on interactive image segmentation.


1.1 Previous work on SDP-based image segmentation

Modern convex programming (e.g., SDP or LP) is a powerful tool for solving various optimization problems in science and engineering. It has also attracted much attention from the computer vision community. Paper [11] shows that the problem of multi-class image labeling (i.e., segmentation or partition), after certain approximation/relaxation, can be formulated as a convex SDP problem. By solving the resulting SDP, one can easily obtain an approximate globally optimal segmentation. Experiments on both 1D signals and 2D images have shown convincing success.

However, the SDP-based approach suffers from a serious practical issue: compared with modern LP solvers, general off-the-shelf SDP solvers usually have a much higher computational complexity. This issue manifests itself in two equally important aspects. The first is the polynomial degree of the complexity, and the second is the largest problem size that is numerically solvable.

Firstly, while in theory both LP and SDP have polynomial complexities, to solve two problems of roughly the same size, an LP solver is typically much faster than an SDP solver. Formally, let us assume that $N$ is the problem size (e.g., the number of unknown variables) of both problems. Suppose that the polynomial degrees of the complexities of an LP solver and an SDP solver are $m$ and $n$, respectively; then the LP's complexity and the SDP's complexity are $O(N^m)$ and $O(N^n)$. In general, we have $N^m \ll N^n$.

Secondly, LP solvers are much more mature than SDP solvers. Even state-of-the-art SDP solvers (e.g., SeDuMi, CSDP) are only able to solve relatively small problems (with a few thousand variables and constraints). This has largely hampered the wider application of SDP algorithms. In contrast, modern LP solvers (e.g., CPLEX, MOSEK) are able to solve problems with millions of variables.
In the context of image segmentation, there is a third issue that makes the high complexity more problematic. For segmenting an image of $M \times N$ pixels into $k$ classes, the SDP formulation in [11] results in an SDP problem of size $(MNk)^2$, i.e., quadratic in the number of pixel labels. For instance, if one wants to segment a very small image of $10 \times 10$ pixels into $k = 3$ classes, the SDP formulation in [11] generates $(10 \cdot 10 \cdot 3)^2 = 90{,}000$ variables. Solving such an SDP problem is already beyond the capability of the current best general SDP solvers such as CSDP or SeDuMi.

Some remedies have been proposed to salvage the SDP-based approach. One popular solution is to simply sub-sample the image, i.e., only process 1% of all pixels, see [10]. Another solution is to pre- and over-segment the image using some other algorithm to get a much-reduced number of so-called super-pixels, and then apply the SDP to the set of super-pixels. Both approaches result in a much smaller SDP problem. However, neither of them has truly overcome the computational complexity problem: what they do is simply reduce the problem size. By doing so the effective image resolution is also lost.

Even with the above remedies, the reported computational result is far from satisfactory. By using a special SDP solver (e.g., PENNON SDP) with the sub-sampling pre-processing technique, [11] reported that it managed to segment a $32 \times 32$ image into three classes in about 4 h. Clearly, processing such a toy-size problem in such a long time is not practical for most real applications.

1.2 Other related work on color image segmentation

Colors play a significant role in human visual perception. This paper uses color information exclusively as the cue for image segmentation. Extending it to other features (e.g., texture) should be easy.

There are several classical algorithms that are popularly adopted for color image segmentation. One simple idea is to apply $k$-means clustering to all the pixel values in a proper color space (e.g., RGB or CIE-Lab). An improved version of the clustering idea is Gaussian Mixture Model (GMM) estimation. The EM algorithm is commonly used for finding the unknown parameters of a GMM.

The Normalized-Cut (N-cut) method is an important image segmentation algorithm [2]. It performs well, has sound theoretical foundations and a simple implementation, and hence has received much attention. Yet another commonly adopted algorithm is the Mean-Shift method [1]. Mean-shift proves to be very efficient in detecting multiple modes existing in a color feature space, each mode corresponding to a cluster of color pixels.

Both the N-cut and Mean-Shift algorithms have common drawbacks. They both ignore the local coherency among neighboring pixels, which is believed to be crucial in image analysis. These algorithms belong to the so-called global approach, which means that they operate directly on a bag of orderless color feature vectors.
None of them takes into account the local consistency issue, hence they often yield erroneous (e.g., over-segmented) results.

To exploit such local coherence information, sophisticated algorithms using delicate graph structures have been used for image segmentation. For example, Graph-cut (G-cut) and Belief-Propagation (BP), based on the Markov Random Field (MRF) image model, have been applied to the problem of image segmentation. Very successful results have been obtained [4,7,8].

These two algorithms (Graph-cut and Belief-Propagation) represent the state-of-the-art methods for image segmentation. However, both algorithms involve complicated optimization procedures, which are often non-convex. For example, in Graph-cut, ad hoc local swap operations are used, while in the belief propagation algorithm, iterative local message passing is necessary. The Graph-cut algorithm is able to converge to the true optimum when the energy function is sub-modular [4]. In addition, graph-cut works remarkably fast; to process a $512 \times 512$ image it only needs about one second on a moderate PC. Compared with graph cut, our LP algorithm is slower. However, we gain the advantage of more flexibility. While offering (approximate) global optimality, we do not require the objective function to be sub-modular. Moreover, additional linear constraints may be easily incorporated.

Our method requires the user to pre-specify some scribbles (or seed points) as prior knowledge. In this sense, our method is an instance of the semi-supervised learning or transductive learning technique in the machine learning field. In fact, this learning technique has also been applied to the interactive image segmentation problem [15,16]. However, the computational devices used by the transduction techniques are different from our LP.

Image segmentation is also closely related to natural image matting [13]. Compared with segmentation, matting is a more general image labeling task where the pixel labels to be determined can take any real values between 0 and 1 (instead of discrete class labels). Again, the computational device used by matting, e.g., the spectral (eigen-) technique in [13], is different from ours.

1.3 Our contribution

In view of the aforementioned aspects that hampered the practical application of the SDP algorithm, in this paper we develop a globally optimal, full-resolution, pixel-wise color image segmentation algorithm. We propose a much simpler LP-based algorithm where the image segmentation task is naturally formulated as linear optimization under linear constraints. The solution can be found easily by an off-the-shelf LP solver.
Linear programming is a mature mathematical technique (more mature than SDP) and is widely adopted by researchers from both academia and industry. The formulation of our LP algorithm is much simpler than the SDP counterpart [9,11]. Solving an LP problem is usually much faster than solving an SDP problem of the same size.

With our formulation, the resulting problem size is only linear in the image size. That is, to segment an image of size $M \times N$ into $k$ classes we only need to solve an LP problem of size $MNk$, in sharp contrast to the quadratic size of the SDP formulation. As a numerical example of the resulting linear size, segmenting a $200 \times 150$ image into three classes with our algorithm only requires solving an LP problem of $200 \cdot 150 \cdot 3 = 90{,}000$ variables, which is a trivial task for any modern LP solver. Currently available industry-strength LP solvers such as CPLEX are able to solve a linear system with a million variables and constraints in realistic time.

In addition to the above benefits, we also provide an LP algorithm specifically for solving the 2-class segmentation problem (i.e., $k = 2$, foreground/background separation or object cutout). By substituting the sum-to-one condition in advance, we further reduce the problem size to $MN$, instead of $2MN$. Details are given below. This has significant practical implications, since two-class segmentation is a common task in computer vision. In order to segment a $512 \times 512$ color image, we only need to solve an LP of about 260,000 variables. As an example, in this paper we only illustrate experimental results for the two-class segmentation situation.

Our LP formulation is natural and simple. It allows the user to easily incorporate different prior knowledge about the image (and the segmentation) into the optimization procedure. For example, both the local pair-wise pixel Markov Random Field (MRF) relationship and a global histogram constraint can be easily embedded.
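The problem sizes quoted in this section follow directly from the counts $MNk$ (LP) and $(MNk)^2$ (SDP). The short sketch below reproduces that arithmetic; the function names are hypothetical, and only the label variables mentioned in the text are counted (any auxiliary variables a particular solver formulation may add are ignored).

```python
def lp_size(M, N, k, two_class_reduction=False):
    """Number of label variables in the LP for an M x N image and k classes.

    With two_class_reduction=True (only meaningful for k == 2), the
    sum-to-one condition is substituted in advance, leaving M*N variables.
    """
    if two_class_reduction:
        if k != 2:
            raise ValueError("the reduction applies to k == 2 only")
        return M * N
    return M * N * k


def sdp_size(M, N, k):
    """Number of unknown entries in the lifted SDP matrix: (M*N*k)^2."""
    return (M * N * k) ** 2


if __name__ == "__main__":
    print(sdp_size(10, 10, 3))        # toy 10 x 10 image, 3 classes
    print(lp_size(200, 150, 3))       # 200 x 150 image, 3 classes
    print(lp_size(512, 512, 2, True)) # 512 x 512 image, 2-class reduction
```

The 10 x 10, three-class SDP and the 200 x 150, three-class LP both come out at 90,000 unknowns, which is the comparison the text draws.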

2 Mathematical formulation: k-class segmentation

In this section, we derive our LP formulation for the $k$-class image segmentation problem. The goal is to segment an input RGB color image (denoted by $I$) with $M \times N$ pixels into $k$ classes. Color image segmentation is usually based on a certain optimization criterion, defined according to the color measurements and the spatial continuity of neighboring pixels' labels. In this work, these two properties are captured by a two-term MRF energy which serves as the objective function to be minimized:

$$E(I) = \sum_{i} C_i(I_i) + \sum_{i,j} V_{ij}(I_i, I_j), \tag{1}$$

where the first term, known as the data term, sums over all pixels and captures the color consistency, and the second term, known as the neighborhood (or separation) term, sums over all pair-wise neighborhoods and describes the MRF neighborhood dependency. $I_i$ is a feature vector of pixel $i$, to be explained below. This type of energy function is popularly adopted by many vision algorithms, such as the graph cut algorithm [3,7].

We convert the input RGB color image into the CIE-Lab space. The reason for this conversion is that distance in the Lab space is closer to human color perception. Denote the color value (i.e., $[L, a, b]$) at pixel $i$ as $I_i$. We then compute for each pixel an associated feature vector. This feature vector is used to characterize local properties of the pixel, e.g., intensity, chromaticity, and texture. These properties can be computed by applying some low-level image filters, such as a Gabor texton filter. In this paper, for simplicity we directly use the color values of the central pixel under consideration and of its 4-neighboring pixels, which form a 15-dimensional feature vector (5 pixels with 3 channels each). We use $I_i$ to denote the feature vector.

We introduce a $k$-dimensional $\{0,1\}$ binary indicator vector $x_i$ to express the class label of pixel $i$, namely, $x_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$, where $x_{ia}$ are $\{0,1\}$ binary variables with $x_{ia} = 1$ if pixel $i$ belongs to class $a$, and $x_{ia} = 0$ otherwise. Clearly, we must have $\sum_a x_{ia} = 1$, i.e., the sum-to-one constraint. We stack all $x_i$ together to form a tall vector $X$. With this notation, the energy function is expressed as:
$$E(I) = \sum_{i=1}^{MN} \sum_{a=1}^{k} C_i(a)\,x_{ia} + \sum_{i,j} \sum_{a=1}^{k} \sum_{b=1}^{k} V_{ij}(a,b)\,x_{ia}\,x_{jb}. \tag{2}$$

Under some general conditions the above energy function can be reduced to a (very neat) form of constrained trace minimization:

$$\min_{X} \ \mathrm{Trace}(C X^T + P X D X^T) \tag{3}$$

$$\text{s.t.} \quad X e = e, \tag{4}$$

$$x_{ia} \in \{0, 1\}, \quad 0 < i \le MN, \ 1 \le a \le k, \tag{5}$$

where $e$ is a vector with all-one elements, $P$ and $D$ are some (known) coefficient matrices (cf. [11]), and the task is to find the best $X$. The problem is a quadratic, non-convex, integer programming problem. Due to the integer constraints, solving it exactly is very hard. Recent progress in mathematical optimization theory has offered a very powerful approach, convex relaxation, to approximately solve integer programming problems. This approach is very promising, and forms the foundation of the present work.

2.1 A brief review of the SDP approach

As we mentioned before, this paper is largely inspired by recent research on SDP-based image segmentation, for example [11]. In this SDP-based method, the product $X X^T$ of Eq. 3 is replaced by a single positive semi-definite (PSD) matrix $Y$. The trace optimization problem then becomes a linear optimization in the augmented unknown matrix $Y$, subject to the PSD constraint. Mathematically, this leads to

$$\min_{Y} \ \mathrm{Trace}(LY) \quad \text{s.t.} \quad Y \succeq 0 \ \text{and} \ \mathrm{rank}(Y) = 1. \tag{6}$$

Solving the above optimization problem exactly under the rank-1 constraint proves to be extremely difficult. Therefore, a relaxation trick is suggested, which simply abandons the rank-1 constraint (i.e., an example of convex relaxation). After relaxation, the above problem becomes a typical convex SDP. Solving the SDP, one obtains the image segmentation.

We now examine the complexity of the resulting SDP. As there are $MNk$ dimensions in vector $X$, there will be $(MNk)^2$ unknown entries in $Y$ that are to be computed. Therefore, the problem size is quadratic, and the overall computational complexity is polynomial in the problem size, i.e., $O((MNk)^{2n})$. This complexity is very high even when the image size is only moderately large. For example, with the state-of-the-art SDP solvers it is impossible to segment a $512 \times 512$ image into $k = 2$ classes, because that would involve solving an SDP of $(512 \times 512 \times 2)^2 \approx 2.75 \times 10^{11}$ variables.

3 Our LP formulation: k-class

To reach a purely linear formulation, our key observation is that, under mild conditions, we can replace the second term in Eq. 2, which is quadratic in $X$, with a linear term. As such, we obtain the following energy function:

$$E(I) = \sum_{i=1}^{MN} \sum_{a=1}^{k} C_i(a)\,x_{ia} + \sum_{i,j} \sum_{a=1}^{k} w_{ij}\,|x_{ia} - x_{ja}|, \tag{7}$$

Fig. 1 Interactively building the GMM models for foreground and background from some user-specified strokes


where $w_{ij}$ is a coefficient depending on the colors of pixels $i$ and $j$. Possible choices of $w_{ij}$ include, e.g., $w_{ij} = \exp\left(-\|I_i - I_j\|^2 / (2\sigma^2)\right)$ or $w_{ij} = \|I_i - I_j\|^{-1}$. In essence, this formula is obtained by replacing the quadratic $L_2$ norm by a linear $L_1$ norm. Note that our approach is different from the linearization approach used by the SDP relaxation, where an unknown matrix $Y = X X^T$ is introduced. Such an $L_1$ approximation is justified by the Potts model in deriving the MRF model for images, and is adopted by the graph cut algorithm [7] too. We will show by experiments that the $L_1$ approximation does not noticeably compromise the final segmentation quality. Using the Markovian condition, we only need to consider the pair-wise interaction among the 4-neighboring pixels. In addition, we assume that we already have $k$ prototype color feature vectors (denoted by $\mu_1, \ldots, \mu_k$) to represent each pixel class $a \in \{1, \ldots, k\}$. Now the energy minimization can be written as:

$$\min_{x} \ \sum_{i=1}^{MN} \sum_{a=1}^{k} \|I_i - \mu_a\|\,x_{ia} + \sum_{i,j} \sum_{a=1}^{k} w_{ij}\,|x_{ia} - x_{ja}|, \tag{8}$$

$$\text{s.t.} \quad \forall i: \ \sum_{a} x_{ia} = 1, \tag{9}$$

$$x_{ia} \in \{0, 1\}. \tag{10}$$

This is a linear program with $MNk$ integer variables $x_{ia}$. One can efficiently solve it with any available LP solver (e.g., Matlab's linprog, MOSEK or CPLEX) by simply dropping the integrality constraints, i.e., relaxing $x_{ia} \in \{0, 1\}$ to $0 \le x_{ia} \le 1$. After the LP computation, the integer solution can be recovered by applying a simple rounding process.

Fig. 2 Convergence curves of the primal and dual objective functions (objective function value vs. number of iterations)

Fig. 3 Computation time of the LP solver (MOSEK's linprog, in seconds) versus image size (number of pixels)

4 The 2-class LP formulation

The two-class segmentation problem is of particular interest in real applications, for example, to segment an object of interest from its background in the video surveillance scenario. In this section we will show that for this special problem setting we can further reduce the size (complexity) of the resulting LP to $MN$, i.e., the image size. This enables our algorithm to process images of larger size. For instance, with a modern LP solver our algorithm is capable of segmenting a one-million-pixel image in realistic time. Below we briefly describe our two-class LP-based segmentation algorithm. Following the derivation of our general $k$-class formulation, it is easy to understand how it works.

4.1 Foreground and background modeling

For a given image we use the Gaussian Mixture Model (GMM) to describe the object and the background regions. Denote the $k$th Gauss mode (component) of the object region $f$ by $f_k$, with mean $\mu_k^f$ and covariance $\Sigma_{f,k}$. In experiments we fix the number of Gaussian modes to 5, i.e., $k = 1, \ldots, 5$. Similarly, the $k$th background GMM mode is denoted $b_k$, with $(\mu_k^b, \Sigma_{b,k})$. These two GMM models can be estimated by, for example, an EM algorithm, from two sets of user-specified scribbles in the foreground region and background region, respectively. We use scribbles in our experiments, hence our algorithm is a semi-supervised color image segmentation algorithm.
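The scribble-based GMM estimation of Section 4.1 can be illustrated with a minimal EM sketch. It is not the authors' implementation: it assumes diagonal covariances (the paper uses full covariance matrices), a simple deterministic initialisation, and pure Python for self-containment. The function name `fit_diag_gmm` and all parameters are hypothetical.

```python
import math


def fit_diag_gmm(points, n_modes, n_iter=50, var_floor=1e-6):
    """Fit a diagonal-covariance GMM to `points` (equal-length tuples) by EM.

    Returns a list of (weight, mean, variance) triples, one per mode.
    Assumption: diagonal covariances, chosen here only for brevity.
    """
    n, dim = len(points), len(points[0])
    # Deterministic initialisation: spread the means over the sorted samples.
    ordered = sorted(points)
    step = n / n_modes
    means = [list(ordered[int((m + 0.5) * step)]) for m in range(n_modes)]
    gmean = [sum(p[d] for p in points) / n for d in range(dim)]
    gvar = [max(sum((p[d] - gmean[d]) ** 2 for p in points) / n, var_floor)
            for d in range(dim)]
    variances = [list(gvar) for _ in range(n_modes)]
    weights = [1.0 / n_modes] * n_modes

    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        resp = []
        for p in points:
            logs = []
            for w, mu, var in zip(weights, means, variances):
                ll = math.log(w)
                for d in range(dim):
                    ll -= 0.5 * (math.log(2 * math.pi * var[d])
                                 + (p[d] - mu[d]) ** 2 / var[d])
                logs.append(ll)
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([q / total for q in probs])
        # M-step: re-estimate weight, mean and variance of every mode.
        for m in range(n_modes):
            nm = sum(r[m] for r in resp)
            if nm < 1e-12:
                continue  # leave an (almost) empty mode unchanged
            weights[m] = nm / n
            for d in range(dim):
                mu = sum(r[m] * p[d] for r, p in zip(resp, points)) / nm
                var = sum(r[m] * (p[d] - mu) ** 2
                          for r, p in zip(resp, points)) / nm
                means[m][d] = mu
                variances[m][d] = max(var, var_floor)
    return list(zip(weights, means, variances))


if __name__ == "__main__":
    # Toy 1-D "scribble" samples drawn around two well-separated colors.
    samples = [(0.0,), (0.1,), (-0.1,), (5.0,), (5.1,), (4.9,)]
    for weight, mean, var in fit_diag_gmm(samples, n_modes=2):
        print(weight, mean, var)
```

In practice one would call this once on the foreground scribble pixels and once on the background scribble pixels (with five modes each, following the text), or substitute a library GMM implementation with full covariances.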


4.2 Compute the δ-distances

To obtain the GMM models for foreground and background, one can either use a supervised learning process or rely on user interaction. We choose the second approach, hence an interactive image segmentation method. Our method requires the user to simply draw a few strokes on the image to be segmented, indicating the intended foreground and background regions. Figure 1 illustrates an example. Such an interactive approach has been adopted by many methods, such as the Grab-cut algorithm [7] and some image matting algorithms [8].

Fig. 4 Some of our interactive color image segmentation results (from left to right: input image, user-specified strokes, segmented foreground region, segmented background region)

A potential difficulty with using only a few strokes is that the estimated GMM models may not faithfully represent the true color distributions of the whole image regions. To overcome this, in this paper we do not directly use the GMM probability functions. Instead, Mahalanobis distances between a given color and each of the Gauss modes are used. This approach increases the robustness of the segmentation, because it allows for visual occlusions and illumination changes. We define the foreground distance of a pixel $i$, denoted $d_f(I_i)$, as the minimum Mahalanobis distance from the pixel value to each of the foreground Gauss modes:

$$d_f(I_i) = \min_{k} \left( (I_i - \mu_k^f)^T\, \Sigma_{f,k}^{-1}\, (I_i - \mu_k^f) \right)^{1/2}. \tag{11}$$

Similarly, we define the pixel's background distance as:

$$d_b(I_i) = \min_{k} \left( (I_i - \mu_k^b)^T\, \Sigma_{b,k}^{-1}\, (I_i - \mu_k^b) \right)^{1/2}. \tag{12}$$

Combining these two, we define the so-called δ-distance as

$$\delta(I_i) = d_b(I_i) - d_f(I_i). \tag{13}$$

Note that while both $d_f(I_i)$ and $d_b(I_i)$ are non-negative scalars, $\delta(I_i)$ may take non-positive values.

4.3 Form an LP problem

For the two-class segmentation problem, the variables to be determined are binary pixel labels $x_i$, with $x_i = 1$ if pixel $i$ is a foreground pixel and $x_i = 0$ otherwise. The problem is now a binary 0-1 programming problem: find the best 0-1 variables $x_i$, $i = 1, 2, \ldots, MN$, such that the energy function is minimized. Solving a general 0-1 integer programming problem exactly is hard (in fact, it has been proved to be NP-hard). If, rather than seeking the exact global optimal solution, we settle for an approximate sub-optimal solution, then there exist many efficient algorithms for finding such solutions. Relaxation is one of these efficient approaches, and is adopted in this paper. Specifically, we relax the 0-1 constraints into bound constraints $0 \le x_i \le 1$. Consequently, this leads to a simple LP with $MN$ variables:

$$\min_{x} \ \sum_{i=1}^{MN} \delta_i\,x_i + \lambda \sum_{i=1}^{MN} \sum_{j \in N(i)} w_{ij}\,|x_i - x_j|, \tag{14}$$

$$\text{s.t.} \quad \forall i, \ i = 1, \ldots, MN; \quad j \in N(i); \quad \forall x_i \in L: \ x_i = b_i; \quad 1 \ge x_i \ge 0,$$

where $N(i)$ is the set of 4-neighbors of pixel $i$, $\lambda$ is a user-specified regularization parameter (we use 0.2 in our experiments), $L$ is the set of labeled pixels (i.e., scribbles), and $b_i$ are the corresponding labels. The linear cost function (to be minimized) consists of two linear terms. The first term is the data term. The second term is the sum of weighted label inconsistencies. From above we know that the weight in the local term must be positive. Therefore we can effectively minimize the energy by minimizing its upper bounds.

In our algorithm the weight $w_{ij}$ is computed by

$$w_{ij} = \exp\left( -\frac{\|I_i - I_j\|^2}{2\sigma^2} \right), \tag{15}$$

where $\sigma$ is a user-specified parameter. Such a weight formula is inspired by the colorization algorithm [5]. It embodies the belief that two neighboring pixels should have consistent labels, unless their difference (in feature/color) is too large. To choose the value of $\sigma$ we adopt the method of [3], namely, $\sigma = 2\langle \|I_i - I_j\|^2 \rangle$, where $\langle \cdot \rangle$ denotes the expectation over all pixels. Now we have established a standard LP problem. With the LP there is no need for a good initialization, and no risk of local minima. Having a similar formulation, our 2-class algorithm can be seen as a concrete realization of the excellent purely theoretical work of [12].

4.4 Algorithm description

The proposed LP-based image segmentation algorithm (for the $k = 2$ case) is summarized in Algorithm 1.
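As a concrete illustration of how the objective of Eq. 14 becomes a standard LP, the sketch below uses the usual auxiliary-variable trick: each term $|x_i - x_j|$ is replaced by a new variable $t_{ij}$ with $t_{ij} \ge x_i - x_j$ and $t_{ij} \ge x_j - x_i$, so that minimizing the objective minimizes an upper bound on the energy. This is one common way to linearize the absolute value and is offered as an assumption, not as the authors' implementation. All function names are hypothetical. The per-pixel data coefficient is passed in by the caller, so no sign convention for the δ-distance is assumed here, and each unordered neighbor pair is counted once (a constant factor relative to summing over ordered pairs, which can be absorbed into λ).

```python
def grid_edges(rows, cols):
    """Return each unordered 4-neighbor pair (i, j) of a rows x cols grid once."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1))     # right neighbor
            if r + 1 < rows:
                edges.append((i, i + cols))  # lower neighbor
    return edges


def build_two_class_lp(cost, edges, weights, lam=0.2, scribbles=None):
    """Assemble  min c^T z  s.t.  A_ub z <= b_ub,  lo <= z <= hi.

    z = [x_0..x_{n-1}, t_0..t_{m-1}], with one auxiliary t_e per edge e.
    `scribbles` maps a pixel index to its fixed label (0 or 1).
    """
    n, m = len(cost), len(edges)
    scribbles = scribbles or {}
    # Objective: data term on x, weighted smoothness term on t.
    c = list(cost) + [lam * w for w in weights]
    A_ub, b_ub = [], []
    for e, (i, j) in enumerate(edges):
        for sign in (1.0, -1.0):
            # sign * (x_i - x_j) - t_e <= 0, i.e. t_e >= |x_i - x_j|
            row = [0.0] * (n + m)
            row[i], row[j], row[n + e] = sign, -sign, -1.0
            A_ub.append(row)
            b_ub.append(0.0)
    bounds = []
    for i in range(n):
        if i in scribbles:
            b = float(scribbles[i])
            bounds.append((b, b))        # scribbled pixel: x_i = b_i
        else:
            bounds.append((0.0, 1.0))    # relaxed 0-1 constraint
    bounds += [(0.0, 1.0)] * m
    return c, A_ub, b_ub, bounds


def energy(cost, edges, weights, x, lam=0.2):
    """Evaluate the (pairs-counted-once) energy for a labeling x."""
    data = sum(ci * xi for ci, xi in zip(cost, x))
    smooth = sum(w * abs(x[i] - x[j]) for (i, j), w in zip(edges, weights))
    return data + lam * smooth


if __name__ == "__main__":
    edges = grid_edges(2, 2)
    c, A_ub, b_ub, bounds = build_two_class_lp(
        cost=[-1.0, 0.5, -0.2, 2.0], edges=edges,
        weights=[1.0] * len(edges), scribbles={0: 1, 3: 0})
    print(len(c), len(A_ub), bounds[0], bounds[3])
```

The returned quadruple is in the inequality form accepted by most LP solvers; for instance, SciPy's `scipy.optimize.linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)` takes arguments of this shape. Dense rows are used only to keep the sketch short; a sparse representation would be needed for real image sizes.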

Fig. 5 More segmentation results: input and segmentation


4.5 Remarks

Unlike many other optimization-based image segmentation methods such as Graph-cut or belief propagation, using the proposed LP form we are guaranteed to find the true global optimum (up to an approximation because of the relaxation). There is no need for an initial guess, and no risk of local minima.

Fig. 6 Comparison with normalized-cut: input (left column), the N-cut results (middle column), and our LP segmentation results (right column). The numbers in the brackets are execution time

1. Convert the image to CIE-Lab space. Construct the feature vector $I_i$ at each pixel.
2. Use user-specified scribbles to estimate the foreground GMM and background GMM.
3. Compute the δ-distance $\delta_i$ at every pixel $i$, and the weights $w_{ij}$ for each of its 4-neighbor pixels.
4. Establish the LP formulation based on Eq. 14.
5. Solve this LP problem using any LP solver, and round the results to 0-1 variables.
6. Output the rounded labels as the segmentation result.

Algorithm 1: Linear programming image segmentation
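Step 3 of Algorithm 1 can be sketched as follows. The code assumes that each Gauss mode is supplied as a (mean, inverse covariance) pair, that Eq. 13 is read as the background distance minus the foreground distance (the order in which the two terms appear; treat the sign as an assumption), and that Eq. 15 has the form $\exp(-\|I_i - I_j\|^2 / (2\sigma^2))$. All function names are hypothetical.

```python
import math


def mahalanobis(x, mean, precision):
    """sqrt((x - mean)^T P (x - mean)), with P the inverse covariance matrix."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    q = sum(d[r] * precision[r][c] * d[c]
            for r in range(len(d)) for c in range(len(d)))
    return math.sqrt(max(q, 0.0))  # guard against tiny negative round-off


def min_mode_distance(x, modes):
    """Eqs. 11/12: smallest Mahalanobis distance from x to any Gauss mode."""
    return min(mahalanobis(x, mean, prec) for mean, prec in modes)


def delta_distance(x, fg_modes, bg_modes):
    """Eq. 13 under the assumed reading: d_b(x) - d_f(x)."""
    return min_mode_distance(x, bg_modes) - min_mode_distance(x, fg_modes)


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def edge_weights(features, edges, sigma):
    """Eq. 15 under the assumed form: exp(-||I_i - I_j||^2 / (2 sigma^2))."""
    return [math.exp(-squared_distance(features[i], features[j])
                     / (2.0 * sigma ** 2))
            for i, j in edges]


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    fg = [((0.0, 0.0, 0.0), identity)]   # one toy foreground mode
    bg = [((3.0, 4.0, 0.0), identity)]   # one toy background mode
    print(delta_distance((0.0, 0.0, 0.0), fg, bg))
    print(edge_weights([(0.0,), (0.0,), (2.0,)], [(0, 1), (1, 2)], sigma=1.0))
```

Passing precomputed inverse covariances keeps the sketch free of matrix inversion; in a full pipeline they would come from the GMMs estimated in step 2.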

Since polynomial-time algorithms exist and are widely adopted for solving LP problems, our algorithm is also computationally efficient. Due to the LP formulation, we can easily incorporate prior knowledge about the scene into the segmentation process. For example, by adding the linear constraint $x_i = x_j$ we can specify that two pixels $i$ and $j$ have the same label.
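A same-label prior of this kind is a single linear equality row, and the final labels are obtained from the relaxed solution by rounding. The sketch below shows both; the 0.5 threshold is an assumption (the text only speaks of a simple rounding process), and the function names are hypothetical.

```python
def same_label_constraint(i, j, n_vars):
    """Row a and right-hand side b of the equality a^T x = b encoding x_i = x_j."""
    row = [0.0] * n_vars
    row[i], row[j] = 1.0, -1.0
    return row, 0.0


def round_labels(x_relaxed, threshold=0.5):
    """Map a relaxed solution in [0, 1] to 0-1 labels (threshold is assumed)."""
    return [1 if v >= threshold else 0 for v in x_relaxed]


if __name__ == "__main__":
    print(same_label_constraint(0, 2, n_vars=4))
    print(round_labels([0.2, 0.5, 0.9]))
```

Such rows can be appended to the equality constraints handed to the LP solver without changing the rest of the formulation.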

5 Experiments

To verify the effectiveness of our new LP algorithm we have conducted image segmentation experiments on real images. These test images are taken from the Berkeley Segmentation Benchmark Dataset [6]. For each input image we specify the foreground region and background region by a few strokes. Our LP-based algorithm is implemented in Matlab. We use MOSEK's linprog function as the LP solver. This solver is in fact a primal-dual interior-point method. We found that our algorithm converges very quickly for images of moderate size. Usually, the LP converges in about 10-12 iterations. Figure 2 gives an example of convergence curves (i.e., primal and (negative) dual objective functions vs. iteration).
Fig. 7 Sample frames of video cutout (http://www.freefoto.com)

To test the timing performance we resize the input image to different sizes. For segmenting an image of $100 \times 100$ pixels, the LP solver takes only 2.9 s on a moderate PC (Intel P4, 2.8 GHz, 1 GB RAM). The computation times for a $30 \times 30$ image and a $200 \times 200$ image are 0.7 and 22.0 s, respectively. In contrast, it takes the SDP algorithm [11] 4 h to process a $30 \times 30$ image. Running our algorithm repeatedly on images of different sizes we obtain the timing curve shown in Fig. 3. To segment an image of size $300 \times 300$ the LP solver takes only about 10 minutes. An image of such a size is certainly beyond the capability of the best SDP solver available. For this reason, we do not provide any comparison with SDP, as the latter simply does not work for big images.

To visually evaluate the segmentation performance, we run our algorithm on different images. Figure 4 shows some of our results. More results are shown in Fig. 5. Here the user wanted to extract a zebra or panda from a cluttered background. Clearly, the user's intention regarding foreground and background has been well captured, and the segmentation results are satisfactory.

We also compared our LP algorithm with the original normalized cut algorithm [2]. Such a comparison is somewhat unfair to the N-cut algorithm, because the original version of the N-cut algorithm is not able to incorporate the user's prior knowledge (e.g., the scribbles) into the computation (note that there have been later modifications that take this into account, for example [14]). To compensate for this, we allow the N-cut algorithm to segment the image into multiple regions (we used $k = 6$ in the experiments below), and evaluate the accuracy of the segmentation boundaries. Figure 6 gives some comparison results. All the test images are normalized to size $160 \times 160$. The Matlab code for the normalized-cut algorithm is adapted from [2] but modified to deal with color images. It is clear that our new algorithm outperforms the N-cut algorithm in terms of both accuracy and computation time.


5.1 Video object cutout

The proposed still-image object cutout algorithm can be easily extended to video object cutout. The user is only required to specify some scribbles in key frame(s). The segmentation result of the key frame is then propagated to other frames without further human interaction. Some sample resulting frames are given in Fig. 7.

6 Conclusion

This paper has described a simple approach for color image segmentation. The problem is formulated naturally as a linear program. Such a formulation is easy to understand and easy to implement. The computation is efficient thanks to our novel and simpler LP formulation.

At present we have used only an off-the-shelf LP solver (based on the interior-point method). It would be worthwhile trying other LP solvers to compare their computational performance. Moreover, due to the special structure of the problem, e.g., every pixel is using at most two neighboring constraints, there should be other, more specific options that take this into account and offer more efficient computation for the task.

Another promising future research direction is to directly optimize the solver using some fast algorithms for special linear programming problems. In fact, the best-performing graph-cut algorithm is a special algorithm for solving max-flow LP problems. In addition, in future work we will consider whether the same LP framework allows a partial update scheme, similar to the concept of dynamic graph-cut [17]. If so, then our algorithm can be made even faster for dynamic video cutout tasks.

Acknowledgment NICTA is funded through the Australian Government's Backing Australia's Ability initiative, in part through the Australian Research Council. We wish to thank the anonymous reviewers for their very helpful comments.

References

1. Georgescu, B., Shimshoni, I., Meer, P.: Mean Shift Based Clustering in High Dimensions: A Texture Classification Example. ICCV (2003)
2. Shi, J., Malik, J.: Normalized cuts and image segmentation. PAMI (2000)
3. Blake, A., Rother, C., Brown, M., Perez, P., Torr, P.: Interactive image segmentation using an adaptive GMMRF model. Proc. ECCV-2004 2, 428-441 (2004)
4. Boykov, Y., Jolly, M.: Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. Proc. ICCV-2001 1(July), 105-112 (2001)
5. Levin, A., Lischinski, D., Weiss, Y.: Colorization using optimization. ACM Transactions on Graphics, SIGGRAPH 2004, pp. 689-694 (2004)
6. Martin, D. et al.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. Proc. ICCV-2001 2, 416-423 (2001)
7. Rother, C., Kolmogorov, V., Blake, A.: GrabCut: interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. (SIGGRAPH) 23(3), 309-314 (2004)
8. Wang, J., Cohen, M.: An iterative optimization approach for unified image segmentation and matting. Proc. ICCV 1, 936-943 (2005)
9. Keuchel, J., Schnorr, J. et al.: Binary partitioning, perceptual grouping, and restoration with semidefinite programming. IEEE Trans. PAMI 25, 1364-1379 (2003)
10. Keuchel, J. et al.: Hierarchical Image Segmentation based on Semidefinite Programming. In: Pattern Recognition, Proc. of 26th DAGM Symposium, LNCS, August. Springer, Berlin (2004)
11. Keuchel, J.: Multiclass Image Labeling with Semidefinite Programming. In: Proc. ECCV06, Graz, pp. 454-467 (2006)
12. Kleinberg, J., Tardos, E.: Approximation algorithms for classification problems with pairwise relationships: Metric labeling and Markov random fields. J. ACM 49, 616-639 (2002)
13. Levin, A., Rav-Acha, A., Lischinski, D.: Spectral Matting. IEEE Trans. PAMI (2008)
14. Yu, S.X., Shi, J.: Segmentation given partial grouping constraints. IEEE Trans. PAMI 26(2), 173-183 (2004)
15. Duchenne, O., Audibert, J., Keriven, R., Ponce, J., Segonne, F.: Segmentation by transduction. In: Proc. CVPR (2008)
16. Cui, J., Yang, Q., Wen, F., Wu, Q., Zhang, C., Van Gool, L., Tang, X.: Transductive Object Cutout. In: Proc. CVPR (2008)
17. Kohli, P., Torr, P.: Dynamic graph cuts for efficient inference in Markov random fields. IEEE Trans. PAMI 29(12), 2079-2088 (2007)

Author biographies

Hongdong Li is an academic staff member with the Computer Vision Group, Research School of Information Sciences and Engineering (RSISE) of ANU (The Australian National University). He is also a Senior Researcher with NICTA (National ICT Australia), Canberra Labs. His main research interests are computer vision, pattern recognition and image processing.

Chunhua Shen received the B.Sc. and M.Sc. degrees from Nanjing University, China, in 1999 and 2002, respectively, and the Ph.D. degree from the School of Computer Science, University of Adelaide, Australia, in 2005. He is currently a Researcher with NICTA, Canberra Labs. He is also an Adjunct Research Fellow at the Australian National University and an Adjunct Lecturer at the University of Adelaide. His main research interests include statistical pattern analysis and its application in computer vision.
