Today

6 Views

0 Downloads

3.27 MB

13 Pages

Transcription

What is a Network? Network graph Informally a graph is a set of nodesjoined by a set of lines or arrows.Network/GraphTheory Representing a problem as a graph canprovide a different point of viewRepresenting a problem as a graph canmake a problem much simpler More accurately, it can provide theappropriate tools for solving the problemWhat makes a problem graph-like? There are two components to a graph Nodes and edgesIn graph-like problems, these componentshave natural correspondences to problemelementsEntities are nodes and interactions betweenentities are edgesMost complex systems are graph-like23456123456What is network theory?Graph-based representations 1 Network theory provides a set oftechniques for analysing graphsComplex systems network theory providestechniques for analysing structure in asystem of interacting agents, representedas a networkApplying network theory to a systemmeans using a graph-theoreticrepresentationFriendship Network

Scientific collaboration networkGenetic interaction networkTransportation NetworksBusiness ties in US biotechindustryProtein-Protein InteractionNetworksInternet

Graph Theory - HistoryEcological NetworksLeonhard Euler's paperon “Seven Bridges ofKönigsberg” ,published in 1736.Graph Theory - HistoryCycles in PolyhedraThomas P. KirkmanGraph Theory - HistoryTrees in Electric CircuitsWilliam R. HamiltonGustav KirchhoffHamiltonian cycles in Platonic graphsGraph Theory - HistoryEnumeration of Chemical IsomersArthur CayleyJames J. SylvesterGraph Theory - HistoryFour Colors of MapsGeorge PolyaFrancis Guthrie Auguste DeMorgan

Definition: GraphDefinitions G is an ordered triple G: (V, E, f)– V is a set of nodes, points, or vertices.– E is a set, whose elements are known asedges or lines.– f is a function Vertex– Basic Element– Drawn as a node or a dot.– Vertex set of G is usually denoted by V(G), or V Edge– A set of two elements– Drawn as a line connecting two vertices, calledend vertices, or endpoints.– The edge set of G is usually denoted by E(G), orE. maps each element of E to an unordered pair of vertices in V.Simple GraphsExampleSimple graphs are graphs without multipleedges or self-loops. V: {1,2,3,4,5,6} E: d Graph (digraph)Weighted graphs is a graph for which each edge has anassociated weight, usually given by a weightfunction w: E R. Edges have directions– An edge is an ordered pair of nodesloopmultiple arcarc1node1.223.2.3.541.55.516221435536

Structures and structuralmetrics Graph structures are used to isolateinteresting or important sections of agraphStructural metrics provide a measurementof a structural property of a graph Global metrics refer to a whole graphLocal metrics refer to a single node in a graphConnectivity a graph is connected if– you can get from any node to any other byfollowing a sequence of edges OR– any two nodes are connected by a path.Graph structures Identify interesting sections of a graph Interesting because they form a significantdomain-specific structure, or because theysignificantly contribute to graph propertiesA subset of the nodes and edges in agraph that possess certain characteristics,or relate to each other in particular waysComponent Every disconnected graph can be splitup into a number of connectedcomponents. A directed graph is strongly connected ifthere is a directed path from any node to anyother node.DegreeDegree (Directed Graphs) Number of edges incident on a node In-degree: Number of edges entering Out-degree: Number of edges leaving Degree indeg outdegoutdeg(1) 2indeg(1) 0outdeg(2) 2indeg(2) 2The degree of 5 is 3outdeg(3) 1indeg(3) 4

WalksDegree: Simple Facts If G is a graph with m edges, thenΣ deg(v) 2m 2 E If G is a digraph thenΣ indeg(v) Σ outdeg(v) E Number of Odd degree Nodes is evenA walk of length k in a graph is a succession of k(not necessarily different) edges of the formuv,vw,wx, ,yz.This walk is denote by uvwx xz, and is referred toas a walk between u and z.A walk is closed is u z.PathCycle A path is a walk in which all the edges and allthe nodes are different.1,2,5,2,3,4walk of length 5Walks and Paths1,2,5,2,3,2,1CW of length 6 A cycle is a closed path in which all theedges are different.1,2,3,4,6path of length 4Special Types of Graphs Empty Graph / Edgeless graph1,2,5,13-cycle2,3,4,5,24-cycleTrees Connected Acyclic Graph– No edge Two nodes have exactlyone path between them Null graph– No nodes– Obviously no edge

Special TreesRegularConnected GraphPathsAll nodes have the samedegreeStarsBipartite graphSpecial Regular Graphs: Cycles V can be partitionedinto 2 sets V1 and V2such that (u,v) EimpliesC3C4C5Complete Graph Every pair of vertices are adjacent Has n(n-1)/2 edges– either u V1 and v V 2– OR v V 1 and u V2.Complete Bipartite Graph Bipartite Variation of Complete Graph Every node of one set is connected toevery other node on the other setStars

Planar Graphs Can be drawn on a plane such that no two edgesintersect K4 is the largest complete graph that is planarSpecial Subgraphs:Subgraphs: CliquesA clique is a maximum completeconnected subgraph.ABCDEFGHISpanning tree Let G be a connected graph. Then aspanning tree in G is a subgraph of Gthat includes every node and is also atree.Subgraph Vertex and edge sets are subsets ofthose of G– a supergraph of a graph G is a graph thatcontains G as a subgraph.Spanning subgraph Subgraph H has the same vertex set asG.– Possibly not all the edges– “H spans G”.Isomorphism Bijection, i.e., a one-to-one mapping:f : V(G) - V(H)u and v from G are adjacent if and onlyif f(u) and f(v) are adjacent in H. If an isomorphism can be constructedbetween two graphs, then we say thosegraphs are isomorphic.

Isomorphism Problem Determining whether twographs are isomorphic Although these graphs lookvery different, they areisomorphic; one isomorphismbetween them isRepresentation (Matrix) Incidence Matrix–VxE– [vertex, edges] contains the edge's data Adjacency Matrix–VxV– Boolean values (adjacent or not)– Or Edge Weightsf(a) 1 f(b) 6 f(c) 8 f(d) 3f(g) 5 f(h) 2 f(i) 4 f(j) 7Representation (List)Matrices1234561,2 1,5 2,3 2,5 3,4 4,5 4,61 1000001 0110000 0101000 0001110 1010100 00000112345610100102101010301010040010115110100 Edge List– pairs (ordered if directed) of vertices– Optionally weight and other data Adjacency List (node list)6000100Implementation of a Graph. Adjacency-list representation– an array of V lists, one for each vertex inV.– For each u V , ADJ [ u ] points to all itsadjacent vertices.Edge and Node ListsEdge List121223253343455354Node List12223533435534

Edge Lists for WeightedGraphsEdge List1 2 1.22 4 0.24 5 0.34 1 0.55 4 0.56 3 1.5Topological DistanceA shortest path is the minimum pathconnecting two nodes.The number of edges in the shortest pathconnecting p and q is the topologicaldistance between these two nodes, d p,qRandom GraphsDistance MatrixN 12Erdős and Renyi (1959)p 0.0 ; k 0 V x V V matrix D ( dij ) such thatdij is the topological distance between i and j .1 2 3 4 5 61 0 1 2 2 1 32 1 0 1 2 1 33 2 1 0 1 2 24 2 2 1 0 1 15 1 1 2 1 0 26 3 3 2 1 2 0N nodesA pair of nodes hasprobability p of beingconnected.What interesting things canbe said for different valuesof p or k ?(that are true as N )p 1.0 ; k ½N 2Random GraphsRandom GraphsErdős and Renyi (1959)p 0.09 ; k 1Average degree, k pNErdős and Renyi (1959)p 0.0 ; k 0p 0.09 ; k 1p 0.045 ; k 0.5Let’s look at Size of the largest connected clusterp 1.0 ; k ½N 2Diameter (maximum path length between nodes) of the largest clusterAverage path length between nodes (if a path exists)p 0.0 ; k 0p 0.045 ; k 0.5p 0.09 ; k 1p 1.0 ; k ½N 2511124712.04.21.0Size of largest component1Diameter of largest component0Average path length between nodes0.0

Random GraphsRandom GraphsErdős and Renyi (1959)If k 1:– small, isolated clusters– small diameters– short path lengthsAt k 1:– a giant component appears– diameter peaks– path lengths are highFor k 1:– almost all nodes connected– diameter shrinks– path lengths shortenPercentage of nodes in largest componentDiameter of largest component (not to scale)Erdős and Renyi (1959)KentaroToyama If connections between people can be modeled as arandom graph, then 1.0– Because the average person easily knows more than oneperson (k 1),0k1.0– We live in a “small world” where within a few links, we areconnected to anyone in the world.– Erdős and Renyi showed that averagepath length between connected nodes isphase transitionDavidMumfordFanChungWhat does this mean?PeterBelhumeurWhat does this mean?The Alpha ModelRandom GraphsErdős and Renyi amaBIG “IF”!!! If connections between people can be modeled as arandom graph, then – Because the average person easily knows more than oneperson (k 1),– We live in a “small world” where within a few links, we areconnected to anyone in the world.– Erdős and Renyi computed averagepath length between connected nodes to be:Watts (1999)The people you know aren’trandomly chosen.People tend to get to know thosewho are two links away(Rapoport *, 1957).The real world exhibits a lot ofclustering.The Personal Mapby MSR Redmond’s Social Computing Group* Same Anatol Rapoport, known for TIT FOR TAT!The Alpha Modelα model: Add edges to nodes, asin random graphs, but makeslinks more likely when twonodes have a common friend.For a range of α values:Probability of linkage as a functionof number of mutual friends(α is 0 in upper left,1 in diagonal,and in bottom right curves.)The Alpha ModelWatts (1999)– The world is small (averagepath length is short), and– Groups tend to form (highclustering coefficient).α model: Add edges to nodes, asin random graphs, but makeslinks more likely when twonodes have a common friend.Clustering coefficient /Normalized path lengthWatts (1999)For a range of α values:Clustering coefficient (C) andaverage path length (L)plotted against α– The world is small (averagepath length is short), and– Groups tend to form (highclustering coefficient).α

The Beta ModelWatts and Strogatz (1998)Watts and Strogatz (1998)β 0β 0.125β 1People knowtheir neighbors,and a few distant people.People knowothers atrandom.Clustered, butnot a “small world”Clustered and“small world”Not clustered,but “small world”Power LawsAlbert and Barabasi (1999)Degree distribution of a random graph,N 10,000 p 0.0015 k 15.(Curve is a Poisson curve, for comparison.)Both α and β models reproduceshort-path results of randomgraphs, but also allow forclustering.Small-world phenomena occur atthreshold between order andchaos.Clustering coefficient (C) and averagepath length (L) plotted against βPower LawsAlbert and Barabasi (1999)What’s the degree (number ofedges) distribution over a graph,for real-world graphs?What’s the degree (number ofedges) distribution over a graph,for real-world graphs?Random-graph model results inPoisson distribution.Random-graph model results inPoisson distribution.Typical shape of a power-law distribution.But, many real-world networksexhibit a power-law distribution.But, many real-world networksexhibit a power-law distribution.Power LawsPower LawsAlbert and Barabasi (1999)Albert and Barabasi (1999)Power-law distributions are straightlines in log-log space.AnandanJenniferChayesKentaroToyama“The rich get richer!”Power-law distribution of nodedistribution arises ifHow should random graphs begenerated to create a power-lawdistribution of node degrees?Hint:Pareto’s* Law: Wealthdistribution follows a power law.KentaroToyamaNobuyukiHanakiFirst five random links reduce theaverage path length of thenetwork by half, regardless of N!People knowtheir neighbors.JonathanDonnerClustering coefficient /Normalized path lengthThe Beta Model– Number of nodes grow;– Edges are added in proportion tothe number of edges a nodealready has.Power laws in real networks:(a) WWW hyperlinks(b) co-starring in movies(c) co-authorship of physicists(d) co-authorship of neuroscientists* Same Velfredo Pareto, who defined Pareto optimality in game theory.“Map of the Internet” posterAdditional variable fitness coefficientallows for some nodes to growfaster than others.

Searchable Networks Searchable NetworksKleinberg (2000)Kleinberg (2000)Just because a short path exists,doesn’t mean you can easilyfind it.You don’t know all of the peoplewhom your friends know.a)Variation of Watts’s β model:–––b)For d 2, dip in time-to-search at α 2––Under what conditions is a networksearchable?Searchable NetworksKleinberg (2000)RaminZabihKentaroToyamaWatts, Dodds, Newman (2002) showthat for d 2 or 3, real networksare quite searchable.Killworth and Bernard (1978) foundthat people tended to search theirnetworks by d 2: geography andprofession.c)Lattice is d-dimensional (d 2).One random link per node.Parameter α controls probability of random link– greater for closer nodes.For low α , random graph; no “geographic”correlation in linksFor high α, not a small world; no short paths tobe found.Searchability dips at α 2, in simulationReferencesldous & Wilson, Graphs and Applications. AnIntroductory Approach, Springer, 2000.Wasserman & Faust, Social Network Analysis,Cambridge University Press, 2008.The Watts-Dodds-Newman modelclosely fitting a real-world experiment

Hamiltonian cycles in Platonic graphs Graph Theory - History Gustav Kirchhoff Trees in Electric Circuits Graph Theory - History ... Walks and Paths 1,2,5,2,3,4 1,2,5,2,3,2,1 1,2,3,4,6 walk of length 5 CW of length 6 path of length 4 Cycle •A cycle is a closed ...