910 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE .

3y ago
18 Views
2 Downloads
4.26 MB
15 Pages
Last View : 1m ago
Last Download : 3m ago
Upload by : Javier Atchley
Transcription

910IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,VOL. 32, NO. 5,MAY 2010Unsupervised Object Segmentationwith a Hybrid Graph Model (HGM)Guangcan Liu, Zhouchen Lin, Senior Member, IEEE, Yong Yu, and Xiaoou Tang, Fellow, IEEEAbstract—In this work, we address the problem of performing class-specific unsupervised object segmentation, i.e., automaticsegmentation without annotated training images. Object segmentation can be regarded as a special data clustering problem whereboth class-specific information and local texture/color similarities have to be considered. To this end, we propose a hybrid graph model(HGM) that can make effective use of both symmetric and asymmetric relationship among samples. The vertices of a hybrid graphrepresent the samples and are connected by directed edges and/or undirected ones, which represent the asymmetric and/orsymmetric relationship between them, respectively. When applied to object segmentation, vertices are superpixels, the asymmetricrelationship is the conditional dependence of occurrence, and the symmetric relationship is the color/texture similarity. By combiningthe Markov chain formed by the directed subgraph and the minimal cut of the undirected subgraph, the object boundaries can bedetermined for each image. Using the HGM, we can conveniently achieve simultaneous segmentation and recognition by integratingboth top-down and bottom-up information into a unified process. Experiments on 42 object classes (9,415 images in total) showpromising results.Index Terms—Segmentation, graph-theoretic methods, spectral clustering.Ç1INTRODUCTIONCLASS-SPECIFIC (or category-level) object segmentation isone of the fundamental problems in computer vision.Its goal is to segment an image into regions, with eachregion solely containing object(s) of a class. As objectsegmentation requires that each segmented region be asemantic object, it is much more challenging than traditional image segmentation [2], [3], [4], [5], [6] (we shall callthis oversegmentation instead throughout this paper),which only requires that each region is a homogeneoustexture. To achieve object segmentation, object classesshould be represented appropriately. As the variance ofobject shape and color/texture within an object class can belarge, it is very difficult to obtain class-specific features thatcan describe the object class accurately. In this regard, objectsegmentation is a difficult problem. However, objectsegmentation is feasible due to the recent development ofrecognition and oversegmentation techniques in computervision. On the one hand, recently established shape models,such as the implicit shape model [7] and the sparserepresentation model [8], provide us with effective waysto extract the approximate shape priors of an object class.On the other hand, current oversegmentation techniques. G. Liu and Y. Yu are with the Department of Computer Science andEngineering, Shanghai Jiao Tong University, Shanghai 200240, P.R. China.E-mail: roth@sjtu.edu.cn, yyu@apex.sjtu.edu.cn. Z. Lin is with the Visual Computing Group, Microsoft Research Asia,Beijing 100080, P.R. China. E-mail: zhoulin@microsoft.com. X. Tang is with the Multimedia Lab, Department of InformationEngineering, The Chinese University of Hong Kong, Shatin, NewTerritories, Hong Kong SAR, P.R. China. E-mail: xtang@ie.cuhk.edu.hk.Manuscript received 27 July 2007; revised 3 July 2008; accepted 20 Jan. 2009;published online 10 Feb. 2009.Recommended for acceptance by R. Zabih.For information on obtaining reprints of this article, please send e-mail to:tpami@computer.org, and reference IEEECS Log NumberTPAMI-2007-07-0458.Digital Object Identifier no. 10.1109/TPAMI.2009.40.0162-8828/10/ 26.00 ß 2010 IEEE[2], [3], [4], [6] can well handle the low-level image featuresand produce reliable texture segmentation results. Thisgreatly reduces the difficulty of object segmentationbecause we only have to consider how to group theobtained subregions. So, accurate object segmentation ispossible if we can combine both class-specific (high-level)and low-level priors effectively. To achieve this, in thispaper we introduce a novel spectral method called thehybrid graph model (HGM).1According to whether the learning of object class priorsrequires human interaction or not, the existing objectsegmentation algorithms are broken into two categories:supervised and unsupervised. Supervised algorithms require either manually segmented masks in training images[7], [9], [10], the specification of shape templates [9], [11],[12], [13], [14], or other kinds of priors (e.g., object partconfiguration [15] or class fragments [16]). These algorithmsmay be applicable to a particular object class [13], a range ofobjects [10], [12], or object classes [7], [9], [11], [14], [15], [16],[17] provided that the class dependent priors are available.However, as a practical object recognition system needs tohandle a large number of classes of objects and most classesmay require many training samples due to significantintraclass shape and appearance variances, it is importantthat the learning does not involve any human interaction.This makes unsupervised algorithms more appealing. Therehas been sparse research in this direction. Borenstein andUllman [18] used the overlap between automaticallyextracted object parts (or fragments) to determine theforeground and the background. As individual parts areconsidered independently, the approach is prone towrongly judge background clutters as foreground parts.Winn and Jojic [19] combined all images together to find aconsistent segmentation based on the assumption that the1. The preliminary version of this paper [1] was accepted by ICCV 2007.Published by the IEEE Computer Society

LIU ET AL.: UNSUPERVISED OBJECT SEGMENTATION WITH A HYBRID GRAPH MODEL (HGM)Fig. 1. Our HGM-based single-class object segmentation. (a) Inputs: Aset of images each consisting of objects (foreground) of a class anddifferent backgrounds. (b) Outputs: Regions solely containing objects ofthe class. The whole process is fully automatic.object shape and color distribution pattern are consistentwithin class and that the color/texture variability within asingle object of a class is limited. Moreover, each imageshould only contain one object of the class. Rother et al. [20]showed that it is possible to use only two images to segmenttheir common parts simultaneously. They required thecommon parts to have similar shape and color/texture.Russell et al. [21] segmented images in multiple ways andthen borrowed techniques from document analysis todiscover multiple object classes. Their assumption was thatsome regions in some of the segmentations are correct foreach object. As segmentation precedes class discovery, it isusually hard to have accurate segmentation. Due to thelimitations of these existing methods, we aim at proposing anovel unsupervised algorithm that can produce moreaccurate object boundaries, where the assumption on thevariance of object shape and color/texture is much weakerand images can contain multiple objects.To ensure robustness, we follow the doctrine that objectsegmentation should be handled in parallel to objectrecognition [7], [9], [14], [15], [19], [22] as they are stronglycoupled problems. So, the top-down (for recognition) andthe bottom-up information (for segmentation) should beutilized simultaneously. Although no annotated trainingimages are available, as long as there are enough images,the common patterns of the object class will appearfrequently and the effect of the background will fade outas it is much less structured compared to the objects. So, ourtarget is to segment a large number of images simultaneously(Fig. 1). As we will not assume small intraclass shapevariance (e.g., Fig. 1a), unlike [7], [9], [14], we do not expectthat there will be a global shape prior for recognition.Therefore, we adopt local shape priors based on the work ofAgarwal and Roth [8]. We first extract the object parts usingan interest point detector [23]. The concurrence of objectparts and the weak spatial relationship among them formour shape priors. The local shape priors provide very weaktop-down constraints on the object shape, as the object partsare only sparsely distributed across the objects, and veryoften they also reside in the background. So, it is unlikely toobtain accurate segmentation by using such priors only. Onthe other hand, we break the images into superpixels [24]and group homogeneous superpixels into relatively largesubregions [6]. Although a subregion may not exactlycorrespond to an object, it is nonetheless homogeneous intexture. Such trustworthy oversegmentation results providegood basis for object segmentation. For example, Cao andFei-Fei [25] suggested to further group such subregions intoobjects. In contrast, we view the oversegmentation results as911Fig. 2. An illustration of the hybrid graph. A vertex denotes a datasample (superpixel). A directed edge represents the relation ofconditional dependence between a pair of samples, while an undirectededge represents the relation of homogeneous association. Betweeneach pair of vertices, there are at most three edges: two directed edgesand one undirected edge. In some scenarios, it is possible that somevertices are isolated.trustworthy bottom-up constraints on the object appearance.Both top-down and bottom-up constraints can be represented by our HGM naturally.The vertices of a hybrid graph (Fig. 2) represent thesamples, e.g., superpixels of an image. The vertices areconnected by directed edges and/or undirected ones. Adirected edge represents the dependence between thevertices that it connects (which is asymmetric), while anundirected edge represents the similarity between thevertices (which is symmetric). The directed and the undirected subgraphs represent our weak top-down and trustworthy bottom-up priors, respectively. Then, an image issegmented by classifying the vertices of the hybrid graph intotwo clusters, with one cluster being the foreground containing object(s) of the class and another being the background.The classification is based on computing a score vector thatassigns each vertex its “probability” of belonging to theunderlying class. The score vector is the solution to anoptimization problem that combines a random walk on thedirected subgraph and a minimal cut (Min-Cut) of theundirected subgraph. The optimization problem can also bededuced from the well-established manifold regularization [26]framework, where the random walk defines the loss functionand the Min-Cut defines the regularization function.Compared to the previous unsupervised object segmentation algorithms [18], [19], [20], [21], the main advantagesof our HGM-based method include:Larger variation in shape (including position, size,pose and profile) is allowed within a class. Larger variation in color/texture is allowed not onlywithin class but also within object. Multiple objects of the same class are allowed ineach image. More accurate output of object boundaries. It is fully automatic for single-class object segmentation.The remainder of this paper is organized as follows:Section 2 introduces the general formulation of an HGM fortwo-class clustering. Section 3 details our HGM-basedsingle-class object segmentation approach. Section 4 showsthe experimental results. Section 5 extends our HGM toaddress multiclass clustering and presents the results onmulticlass object segmentation. Finally, Section 6 concludesour paper.

9122IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,THE HYBRID GRAPH MODELCLUSTERINGFORTWO-CLASSIn this section, we present the HGM for two-class clusteringin an abstract sense. The extension to multiclass clusteringis deferred until Section 5. Let the set of n data samples (asample denotes a superpixel when applying HGM to objectsegmentation) be V ¼ fv1 ; . . . ; vn g, with both symmetric andasymmetric relationships among them. We want to classify Instead of directlythe samples into two classes C and C.assigning definite class labels, we aim at calculating a scorevector ! ¼ ð 1 ; . . . ; n ÞT and a threshold value t, where i isthe “probability” of vi belonging to C. With an appropriate t,C can be determined as the set Vþ ¼ fvi j i tg and C consists of the remaining samples V ¼ fvi j i tg.VOL. 32, NO. 5,MAY 2010Markov Chain. Ideally, like PageRank [27], this results in astationary distribution ! : ¼! ;PT!ð1Þwhich is also the solution tomin kP T ! ! k2 :! ð2ÞOn the other hand, from the undirected subgraph, if twoentities vi and vj are strongly associated, they are morelikely to belong to the same class. So, the score vectorshould minimize the cut costXaij ð i j Þ2 :ð3Þi;j2.1 Representing Data by a Hybrid GraphIn a hybrid graph, the symmetric relationship is represented by undirected edges, while the asymmetric relationship is represented by directed edges. The asymmetricrelationship is usually the conditional dependence betweenthe samples (which incorporates weak class-specificinformation), while the symmetric relationship oftenmeasures the homogeneity (e.g., similarity) among thesamples of a class. As a result, a hybrid graph incorporatestwo matrices that are associated to the directed subgraphand the undirected subgraph, respectively:1.Conditional Dependence Matrix P :P ¼ ½pij n n ;2.where pij measures the conditional dependence ofvj on vi . This matrix is usually asymmetric. In ourobject segmentation task, it represents the shapeconfiguration between superpixels, where theshape priors are first acquired from the concurrence of object parts and then transformed to thesuperpixels.Homogeneous Association Matrix A:A ¼ ½aij n n ;where aij measures the homogeneity between viand vj . This matrix is symmetric. In our problem, itrepresents the color/texture similarity and thespatial adjacency among superpixels, which mainlycome from the oversegmentation results.Therefore, in a hybrid graph (Fig. 2) there are at mostthree edges between each pair of vertices: Two are directedand one is undirected. The weights assigned to directededges and undirected ones correspond to matrix P andmatrix A, respectively. So, it is convenient to denote thehybrid graph as G ¼ ðV ; P ; AÞ.2.2 Computing the Score VectorAs mentioned before, i is the “probability” of sample vibelonging to the class C. Let us consider the directedsubgraph and undirected subgraph one by one.From the directed subgraph, as the samples are interdependent, the probability of sample vi should depend onthe probabilities of samples that point to it. So, theinterdependence among the samples naturally forms aPutting the above two criteria together, we have anoptimization problem to calculate the score vector ! :T ¼ 1;min EðG; ! Þ; subject to ! !! ð4ÞwhereEðG; ! Þ ¼ kP T ! ! k2 þ Xaij ð i j Þ2 ;ð5Þi;jG ¼ ðV ; P ; AÞ is the hybrid graph, and is a positiveparameter used to balance the effects of the two criteria. Inour experiments, we fix ¼ 1. The solution to problem (4)is the eigenvector associated to the minimum eigenvalue ofthe following matrix:MðGÞ ¼ ðI P ÞðI P T Þ þ LA ;ð6Þwhere LA is the Laplacian matrix of the undirected subgraph:PPLA ¼ DA A with DA ¼ diagf nj¼1 a1j ; . . . ; nj¼1 anj g, and Iis the identity matrix. Note that, if ! is the optimal solution to(4), so is ! . However, as i is the “probability” of sample vibelonging to C, we have to choose the ! such thatPn 0.ii¼12.3 Determining the ThresholdWith the score vector ! , the samples can be classified byusing a threshold t. It is natural that this threshold t bechosen as the mean of ! . However, some computed valuesin ! may be negative.2 Such negative scores make thethreshold underestimated. So, we further estimate thethreshold t by the geometric mean of ! . So, a reasonableestimate of t is tl t th , �ffinn1X1Xtl ¼ i and th ¼ 2 :n i¼1n i¼1 iAs we also want the classes to be compact and isolatedfrom each other, we borrow the normalized cut (N-Cut) [5]of the undirected subgraph as the criterion to helpdetermine the final threshold t:2. The negativity problem is not critical for our purpose because,actually, it is the relative order of the scores that matters. For ease ofoptimization, we do not enforce nonnegativity of the score vector ! in (4).

LIU ET AL.: UNSUPERVISED OBJECT SEGMENTATION WITH A HYBRID GRAPH MODEL (HGM)t ¼ arg min NCutðVþ ; V Þt2½tl ;th ð7ÞcutðVþ ; V ÞcutðVþ ; V Þþ;assocðV ; V Þt2½tl ;th assocðVþ ; V Þ¼ arg minwherecutðVþ ; V Þ ¼Xaij ; assocðVþ ; V Þ ¼vi 2Vþ ;vj 2V aij ;vi 2Vþ ;vj 2VXand assocðV ; V Þ ¼XThe optimal threshold can be found by testing t with ! . So,solving the above optimization problem just takesOðnÞ operations, where n is the number of samples. Wehave found that this configuration works well in ourexperiments.2.4 Discussions2.4.1 Connection to Manifold RegularizationAlthough HGM is essentially a heuristic approach toassign “probabilities” of samples belonging to the targetclass C, it can be fit into the well-established manifoldregularization [26] framework. Let us consider the problemof learning the conditional distribution P rðy j xÞ from datasamples. There is an unknown probability distributionP rðx; yÞ on X Y , where X is the input sample space andY ¼ f1; 0g with Y ¼ 1 if a sample belongs to C and Y ¼ 0 ifotherwise. Labeled samples are ðxi ; yi Þ pairs drawn fromP rðy j xÞ. Unlabeled samples are simply xi 2 X drawnfrom the marginal distribution P rX ðxÞ of P rðx; yÞ. To learnthe conditional distribution P rðy j xÞ with few or even nolabeled samples, the prior knowledge of the marginaldistribution P rX ðxÞ can be exploited:P rðyjxÞnXLðy0i ; P rðy j xi ÞÞi¼1þ RP rX ðP rðy j xÞÞ;where !y0 ¼ ½y01 ; . . . ; y0n T is a label vector that is possiblyknown a priori and L is some loss function, such as thesquared loss ðy0i P rðy j xi ÞÞ2 for MSE and the hinge lossfunction max½0; 1 y0i P rðy j xi Þ for SVM. The regularization term RP rX ðP rðy j xÞÞ is defined according to theconnection between the marginal and the conditionaldistributions. It is often assumed that if two points x1 andx2 are close to each other with respect to the intrinsicgeometry prescribed by P rX ðxÞ, then the conditionaldistributions P rðy j x1 Þ and P rðy j x2 Þ are similar. So, theregularization term can be defined asRP rX ðP rðy j xÞÞ ¼nX! ¼ arg min! nXðy0i i Þ2 þ i¼1aij ðP rðy j xi Þ P rðy j xj ÞÞ2 ;i;j¼1where aij measures the similarity between xi and xj .There are two kinds of methods to learn the conditionaldistribution: inductive and transductive. Inductive methodslearn a function fðxi Þ to fit P rðy j xi Þ, while transductivemethods learn a label vector directly. HGM is a kind oftransductive method. It learns a label vector ! , where i ¼ P rðy ¼ 1 j xi Þ. Considering the squared loss functionand integrating all the above ingredients, the label vectorcan be computed by minimizingnXaij ð i j Þ2 :i;j¼1The definition of the loss function usually needs somelabeled data. For example, Zhou et al. [28] used partiallylabeled samples and defined y0k ¼ 0 if xk is unlabeled. InHGM, it could be viewed that the labels come from therandom walk on the directed graphaij :vi 2V ;vj 2VP rðy j xÞ ¼ arg min913y0i ¼ P rðy ¼ 1 j xi Þ nXP rðy ¼ 1 j xj ÞP rðxi j xj Þj¼1¼nX j pji :j¼1It is easy to see that, in this way,nnnXXXðy0i i Þ2 j pji ii¼1i¼1!2¼ kP T ! ! k2 :j¼1Therefore, (5) is naturally deduced.So, our HGM is a kind of the regularization approach[26]. A key difference is that in HGM there is no labeledsample and we use a conditional dependence matrix P toincorporate the class-specific priors instead. It is worthnoting that the Laplacian matrix LA in (6) can be replacedby the normalized Laplacian (i.e., N-Cut of the undirectedsubgraph). However, we have experimentally found thatusing normalized Laplacian tends to cut off the highcurvature details of objects (e.g., the legs of horses). Incompari

LIU ET AL.: UNSUPERVISED OBJECT SEGMENTATION WITH A HYBRID GRAPH MODEL (HGM) 911 Fig. 1. Our HGM-based single-class object segmentation. (a) Inputs: A set of images each consisting of objects (foreground) of a class and different backgrounds. (b) Outputs: Regions solely containing objects of the class. The whole process is fully automatic. Fig. 2.

Related Documents:

IEEE 3 Park Avenue New York, NY 10016-5997 USA 28 December 2012 IEEE Power and Energy Society IEEE Std 81 -2012 (Revision of IEEE Std 81-1983) Authorized licensed use limited to: Australian National University. Downloaded on July 27,2018 at 14:57:43 UTC from IEEE Xplore. Restrictions apply.File Size: 2MBPage Count: 86Explore furtherIEEE 81-2012 - IEEE Guide for Measuring Earth Resistivity .standards.ieee.org81-2012 - IEEE Guide for Measuring Earth Resistivity .ieeexplore.ieee.orgAn Overview Of The IEEE Standard 81 Fall-Of-Potential .www.agiusa.com(PDF) IEEE Std 80-2000 IEEE Guide for Safety in AC .www.academia.eduTesting and Evaluation of Grounding . - IEEE Web Hostingwww.ewh.ieee.orgRecommended to you b

Dual Machined Brass Flywheels RP-25 Metal Wheels Constant & Directional Lights Proto MAX Metal Knuckle Couplers HO Lehigh Valley 910-9020 #214 910-9021 #218 Ontario Northland 910-9022 #1300 910-9023 #1301 Rock Island 910-9024 #450 910-9025 #454 Union Pacific† 910-9026 #D.S. 1192 910-9027 #D.S. 1195 NEW SCHEMES .

Signal Processing, IEEE Transactions on IEEE Trans. Signal Process. IEEE Trans. Acoust., Speech, Signal Process.*(1975-1990) IEEE Trans. Audio Electroacoust.* (until 1974) Smart Grid, IEEE Transactions on IEEE Trans. Smart Grid Software Engineering, IEEE Transactions on IEEE Trans. Softw. Eng.

IEEE TRANSACTIONS ON IMAGE PROCESSING, TO APPEAR 1 Quality-Aware Images Zhou Wang, Member, IEEE, Guixing Wu, Student Member, IEEE, Hamid R. Sheikh, Member, IEEE, Eero P. Simoncelli, Senior Member, IEEE, En-Hui Yang, Senior Member, IEEE, and Alan C. Bovik, Fellow, IEEE Abstract— We propose the concept of quality-aware image, in which certain extracted features of the original (high-

IEEE Robotics and Automation Society IEEE Signal Processing Society IEEE Society on Social Implications of Technology IEEE Solid-State Circuits Society IEEE Systems, Man, and Cybernetics Society . IEEE Communications Standards Magazine IEEE Journal of Electromagnetics, RF and Microwaves in Medicine and Biology IEEE Transactions on Emerging .

Standards IEEE 802.1D-2004 for Spanning Tree Protocol IEEE 802.1p for Class of Service IEEE 802.1Q for VLAN Tagging IEEE 802.1s for Multiple Spanning Tree Protocol IEEE 802.1w for Rapid Spanning Tree Protocol IEEE 802.1X for authentication IEEE 802.3 for 10BaseT IEEE 802.3ab for 1000BaseT(X) IEEE 802.3ad for Port Trunk with LACP IEEE 802.3u for .

Service Instruction Soloy Aviation Solutions Section I No. 910-01 Page: 4 of 40 Service Instruction No. 910-01 Revision: 3 September 9, 2014 (1) Install forward clamp assemblies (910-1315-1) and spacers (910-1315-5) on the existing Cessna second-row seat-rails per drawing 910-1000-1, Detail C. Leave hardware loose to

LAFS.910.RI.1.AP.2a LAFS.910.RI.1.AP.2b LAFS.910.RI.1.AP.2c LAFS.910.RI.1.AP.2d LAFS.910.RI.1.3 Analyze how the author unfolds an analysis or series of ideas or events, including the order in which the points are made, how they are introduced and developed and the connections that are drawn betw