Normalized Cuts Without Eigenvectors: A Multilevel Approach

Normalized Cuts Without Eigenvectors: A Multilevel Approach
Inderjit S. Dhillon, The University of Texas at Austin
SIAM Conference on Parallel Processing, February 24, 2006
Joint work with Yuqiang Guan and Brian Kulis

Clustering
Partitioning data into clusters arises in various applications in data mining and machine learning. Examples:
- Bioinformatics: identifying similar genes
- Text mining: organizing document collections
- Image/audio analysis: image and speech segmentation
- Web search: clustering web search results
- Social network analysis: identifying social groups
- Other: load balancing and circuit partitioning

Graph Partitioning/Clustering
In many applications, the goal is to partition/cluster the nodes of a graph.
Example: high school friendship network [James Moody, American Journal of Sociology, 2001]

Graph Partitioning/Clustering
In many applications, the goal is to partition/cluster the nodes of a graph.
Example: the Internet [The Internet Mapping Project, Hal Burch and Bill Cheswick, Lumeta Corp, 1999]

Graph Clustering Objectives
How do we measure the quality of a graph clustering?
- Could simply minimize the edge-cut in the graph. This can lead to clusters that are highly unbalanced in size.
- Could minimize the edge-cut while constraining the clusters to be equal in size. This is not a natural restriction in data analysis.
- Popular objectives include normalized cut, ratio cut and ratio association:

Normalized Cut: minimize sum_{i=1}^{c} links(V_i, V \ V_i) / degree(V_i)

Ratio Cut: minimize sum_{i=1}^{c} links(V_i, V \ V_i) / |V_i|

[Shi & Malik, IEEE Pattern Analysis & Machine Intelligence, 2000]
[Chan, Schlag & Zien, IEEE Integrated Circuits & Systems, 1994]
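To make the objective concrete, the normalized cut of a given partition can be evaluated directly from the affinity matrix. The following sketch is illustrative (the `normalized_cut` helper and the two-triangle example graph are not from the talk):

```python
import numpy as np

def normalized_cut(A, labels):
    """Normalized cut: sum over clusters of
    links(V_i, V \\ V_i) / degree(V_i), where degree(V_i) is the
    total degree of the nodes in cluster V_i."""
    deg = A.sum(axis=1)
    total = 0.0
    for c in np.unique(labels):
        in_c = labels == c
        links_out = A[np.ix_(in_c, ~in_c)].sum()  # edge weight leaving V_i
        degree_c = deg[in_c].sum()                # total degree of V_i
        total += links_out / degree_c
    return total

# Two triangles joined by a bridge edge (2-3): the natural 2-way cut.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

labels = np.array([0, 0, 0, 1, 1, 1])
print(normalized_cut(A, labels))  # 2/7: cut = 1, each side has degree 7
```

Splitting at the bridge gives the smallest value; any partition that breaks a triangle pays a larger cut against a similar degree.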

Examples
[Figure: example partitions illustrating Normalized Cut and Ratio Cut]

Spectral Clustering
Take a real relaxation of the clustering objective. The globally optimal solution of the relaxed problem is given by eigenvectors:
- For ratio cut: compute the smallest eigenvectors of the Laplacian L = D - A
- For normalized cut: compute the smallest eigenvectors of the normalized Laplacian I - D^{-1/2} A D^{-1/2}
Post-process the eigenvectors to obtain a discrete clustering.
Problem: this can be expensive if many eigenvectors of a very large graph are to be computed.
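The embedding step for normalized cut can be sketched as follows (a dense-eigensolver sketch for illustration; real large graphs would need a sparse solver, and the function name and example graph are assumptions, not from the talk):

```python
import numpy as np

def spectral_embedding(A, k):
    """Spectral step for normalized cut: form the normalized Laplacian
    I - D^{-1/2} A D^{-1/2} and take its k smallest eigenvectors.
    The rows of the result are then post-processed (e.g. with k-means)
    to obtain a discrete clustering."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_norm = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_norm)  # ascending eigenvalues
    return eigvecs[:, :k]                      # n x k embedding

# Two triangles joined by a bridge edge (2-3): the second eigenvector
# separates the two triangles by sign.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
emb = spectral_embedding(A, 2)
```

Storing the full n x k eigenvector matrix is exactly the memory cost the talk objects to for large n and k.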

The k-means Algorithm
Given a set of vectors and an initial clustering, alternate between computing cluster means and assigning points to the closest mean:
1. Initialize clusters π_c and cluster means m_c for all clusters c.
2. For every vector a_i and all clusters c, compute
   d(a_i, c) = ||a_i - m_c||^2  and  c*(a_i) = argmin_c d(a_i, c).
3. Update clusters: π_c = {a_i : c*(a_i) = c}.
4. Update means: m_c = (1/|π_c|) sum_{a_i in π_c} a_i.
5. If not converged, go to Step 2. Otherwise, output the final clustering.
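The five steps above can be sketched in a few lines of NumPy (a minimal illustration; the function name and toy data are mine, not from the talk):

```python
import numpy as np

def kmeans(points, labels, n_iter=100):
    """Plain k-means: alternate between computing cluster means (Step 4)
    and assigning points to the closest mean (Steps 2-3), starting from
    an initial clustering. Assumes no cluster becomes empty."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    for _ in range(n_iter):
        # Step 4: m_c = (1/|pi_c|) * sum of points in cluster c
        means = np.array([points[labels == c].mean(axis=0)
                          for c in np.unique(labels)])
        # Steps 2-3: d(a_i, c) = ||a_i - m_c||^2, assign to the argmin
        dists = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):  # Step 5: converged
            break
        labels = new_labels
    return labels, means

pts = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels, means = kmeans(pts, [0, 1, 0, 1])  # deliberately bad start
```

Even from the bad initialization, one reassignment pass recovers the two tight pairs.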

From k-means to Weighted Kernel k-means
Introduce weights w_i for each point a_i, and use the weighted mean instead. Expanding the distance computation yields:

||a_i - m_c||^2 = a_i · a_i
                - 2 (sum_{a_j in π_c} w_j a_i · a_j) / (sum_{a_j in π_c} w_j)
                + (sum_{a_j, a_l in π_c} w_j w_l a_j · a_l) / (sum_{a_j in π_c} w_j)^2

The computation can be done using only inner products of the data points. Given a kernel matrix K that gives inner products in feature space, we can compute distances using the above formula.
Objective function for weighted kernel k-means:

Minimize D({π_c}_{c=1}^{k}) = sum_{c=1}^{k} sum_{a_i in π_c} w_i ||φ(a_i) - m_c||^2,
where m_c = (sum_{a_i in π_c} w_i φ(a_i)) / (sum_{a_i in π_c} w_i).

The Weighted Kernel k-means Algorithm
Given a kernel matrix (a positive semi-definite similarity matrix), run k-means in the feature space:
1. Initialize clusters π_c.
2. For every vector a_i and all clusters c, compute
   d(a_i, c) = K_ii - 2 (sum_{a_j in π_c} w_j K_ij) / (sum_{a_j in π_c} w_j)
                    + (sum_{a_j, a_l in π_c} w_j w_l K_jl) / (sum_{a_j in π_c} w_j)^2
   and c*(a_i) = argmin_c d(a_i, c).
3. Update clusters: π_c = {a_i : c*(a_i) = c}.
4. If not converged, go to Step 2. Otherwise, output the final clustering.
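The algorithm works directly from the kernel matrix, never forming feature-space means explicitly. An illustrative NumPy version (names and toy data are mine; with a linear kernel and unit weights it reduces to ordinary k-means):

```python
import numpy as np

def weighted_kernel_kmeans(K, w, labels, n_iter=100):
    """Weighted kernel k-means. Distances use only kernel entries:
    d(a_i, c) = K_ii - 2*(sum_{j in c} w_j K_ij)/s_c
                     + (sum_{j,l in c} w_j w_l K_jl)/s_c^2,
    where s_c = sum_{j in c} w_j. Assumes no cluster becomes empty."""
    K = np.asarray(K, dtype=float)
    w = np.asarray(w, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    for _ in range(n_iter):
        dists = np.empty((len(K), len(clusters)))
        for idx, c in enumerate(clusters):
            in_c = labels == c
            s = w[in_c].sum()
            # second term, as a vector over all points i
            second = -2.0 * (K[:, in_c] @ w[in_c]) / s
            # third term is the same for every i
            third = (w[in_c] @ K[np.ix_(in_c, in_c)] @ w[in_c]) / s**2
            dists[:, idx] = np.diag(K) + second + third
        new_labels = clusters[dists.argmin(axis=1)]
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
K = X @ X.T  # linear kernel: recovers plain k-means behavior
labels = weighted_kernel_kmeans(K, np.ones(4), np.array([0, 1, 0, 1]))
```

Replacing K with a graph-derived kernel, as in the table below in the source, is what turns this routine into a graph-cut minimizer.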

Equivalence to Graph Clustering
Surprising theoretical equivalence: the weighted graph clustering objective is mathematically identical to the weighted kernel k-means objective. This follows by rewriting both objectives as trace maximization problems.
Popular graph clustering objectives and the corresponding weights and kernels for weighted kernel k-means, given affinity matrix A:

Objective           Node Weight          Kernel
Ratio Association   1 for each node      K = σI + A
Ratio Cut           1 for each node      K = σI - L
Kernighan-Lin       1 for each node      K = σI - L
Normalized Cut      degree of the node   K = σD^{-1} + D^{-1} A D^{-1}

Implication: we can minimize graph cuts such as normalized cut and ratio cut without any eigenvector computation.
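The normalized-cut row of the table can be instantiated as follows (a sketch: the two-triangle example graph and the choice σ = 2 are assumptions; any σ large enough that σD + A is positive semi-definite makes K a valid kernel):

```python
import numpy as np

def normalized_cut_kernel(A, sigma):
    """Weights and kernel for weighted kernel k-means corresponding to
    the normalized cut objective: node weights are the degrees, and
    K = sigma*D^{-1} + D^{-1} A D^{-1}. Since K = D^{-1}(sigma*D + A)D^{-1},
    K is PSD whenever sigma*D + A is; sigma >= 1 suffices here because
    sigma*D + A is then diagonally dominant."""
    d = A.sum(axis=1)
    D_inv = np.diag(1.0 / d)
    K = sigma * D_inv + D_inv @ A @ D_inv
    return K, d  # kernel matrix and node weights

# Two triangles joined by a bridge edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

K, w = normalized_cut_kernel(A, sigma=2.0)
```

Running weighted kernel k-means with this K and these weights then monotonically decreases the normalized cut objective, with no eigenvector computation.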

The Multilevel Approach
Overview of the approach: the input graph is coarsened, an initial clustering is computed on the small graph, and the clustering is refined as the graph is uncoarsened, yielding the final clustering.
[CHACO, Hendrickson & Leland, 1994]
[METIS, Karypis & Kumar, 1999]

The Multilevel Approach
Phase I: Coarsening
- Coarsen the graph by merging nodes together to form smaller and smaller graphs.
- Use a simple greedy heuristic specialized to each graph cut objective function.
Phase II: Base Clustering
- Once the graph is small enough, perform a base clustering.
- A variety of techniques is possible for this step.
Phase III: Refining
- Uncoarsen the graph, level by level.
- Use weighted kernel k-means to refine the clusterings at each level.
- The input clustering to weighted kernel k-means is the clustering from the previous level.
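One level of Phase I can be illustrated with a generic heavy-edge matching pass, a common multilevel heuristic. This is only a sketch of the idea, not the talk's objective-specific coarsening rules:

```python
import numpy as np

def coarsen(A):
    """One coarsening level: greedily match each unmatched node with its
    heaviest-edge unmatched neighbor, merge matched pairs into
    supernodes, and sum edge weights between supernodes. Repeating this
    yields the smaller and smaller graphs of the multilevel scheme."""
    n = len(A)
    match = -np.ones(n, dtype=int)
    for i in range(n):
        if match[i] != -1:
            continue
        nbrs = [j for j in range(n)
                if j != i and A[i, j] > 0 and match[j] == -1]
        if nbrs:
            j = max(nbrs, key=lambda j: A[i, j])  # heaviest edge
            match[i], match[j] = j, i
        else:
            match[i] = i  # stays a singleton supernode
    # Map each node to its supernode index.
    super_id, mapping = {}, np.empty(n, dtype=int)
    for i in range(n):
        key = (min(i, match[i]), max(i, match[i]))
        if key not in super_id:
            super_id[key] = len(super_id)
        mapping[i] = super_id[key]
    # Coarse adjacency: sum edge weights between distinct supernodes.
    A_coarse = np.zeros((len(super_id), len(super_id)))
    for i in range(n):
        for j in range(n):
            if mapping[i] != mapping[j]:
                A_coarse[mapping[i], mapping[j]] += A[i, j]
    return A_coarse, mapping

# Two triangles joined by a bridge edge: 6 nodes coarsen to 3 supernodes.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
A_coarse, mapping = coarsen(A)
```

The `mapping` array is what Phase III walks back through: a clustering of the coarse graph is projected onto the finer graph and then refined there.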

Experiments: gene network
Mycobacterium tuberculosis gene network: 1381 genes and 9766 functional linkages.
[Table: normalized cut values generated by Graclus and the spectral method for varying numbers of clusters]

Experiments: gene network
Spy plots of the functional linkage matrix before and after clustering (128 clusters); each dot indicates a non-zero entry.
[Figure: spy plots before and after clustering]

Experiments: gene network
Two example clusters: the histidine biosynthesis pathway and the ATP synthase multiprotein complex.
[Figure: clusters (A) and (B)]

Experiments: IMDB movie data set
The IMDB contains 1.4 million nodes and 4.3 million edges.
[Table: normalized cut values (lower is better) and computation time in seconds for a varied number of clusters, using Graclus and the spectral method]

Experiments: IMDB movie data set
We generate 5000 clusters using Graclus, which takes 12 minutes. If we used the spectral method, we would have to store 5000 eigenvectors of length 1.4M; that is 24 GB of main memory.
An example cluster:
Movies: Harry Potter and the Sorcerer's Stone; Harry Potter and the Chamber of Secrets; Harry Potter and the Prisoner of Azkaban; Harry Potter and the Goblet of Fire; Harry Potter and the Order of the Phoenix; Harry Potter: Behind the Magic; Harry Potter und die Kammer des Schreckens: Das große RTL Special zum Film; J.K. Rowling: Harry Potter and Me
Actors: Daniel Radcliffe, Rupert Grint, Emma Watson, Peter Best, Joshua Herdman, Harry Melling, Robert Pattinson, James Phelps, Tom Felton, Devon Murray, Jamie Waylett, Shefali Chowdhury, Stanislav Ianevski, Jamie Yeates, Bonnie Wright, Alfred Enoch, Scott Fern, Chris Rankin, Matthew Lewis, Katie Leung, Sean Biggerstaff, Oliver Phelps

Experiments: Image segmentation
The leftmost plot is the original image; each of the 3 plots to its right is a component (cluster): body, tail and background.
The normalized cut value for this multilevel clustering is 0.022138, smaller than the 0.023944 obtained by the spectral method.

Experiments: Benchmark graph clustering
Test graphs:

Graph name   No. of nodes   No. of edges   Application
copter2      55476          352238         helicopter mesh
memplus      17758          54196          memory circuit
pcrystk02    13965          477309         structural engineering
ramage02     16830          1424761        Navier-Stokes and continuity equations

Experiments: Benchmark graph clustering
[Figure: computation time (in seconds) for ratio association and for normalized cut on copter2, memplus, pcrystk02 and ramage02]

Experiments: Benchmark graph clustering
Quality (normalized cut and ratio association):
[Figure: normalized cut and ratio association values, scaled by those generated using the spectral method, on copter2, memplus, pcrystk02 and ramage02]

Experiments: Benchmark graph clustering
[Figure: computation time in seconds vs. number of clusters, comparing Graclus and Metis]

Conclusions
- Minimizing graph cuts such as the normalized cut is useful in many applications.
- There is a mathematical equivalence between spectral graph clustering objectives and the weighted kernel k-means objective.
- The multilevel algorithm uses kernel k-means in its refinement phase.
- Experimental results show that the multilevel algorithm, compared to a state-of-the-art spectral clustering algorithm:
  - mostly outperforms the spectral algorithm in terms of quality,
  - is significantly faster, and
  - requires much less memory.
