Snappy - Stanford University

2y ago
31 Views
2 Downloads
2.42 MB
46 Pages
Last View : 14d ago
Last Download : 2m ago
Upload by : Gannon Casey
Transcription

http://snap.stanford.edu/snappyCS224W, Fall 2019

¡¡¡Introduction to SNAPSnap.py for PythonNetwork analyticsCS224W, Fall 2019

¡Stanford Network Analysis Platform (SNAP)is a general purpose, high-performancesystem for analysis and manipulation oflarge networks§ http://snap.stanford.edu§ Scales to massive networks with hundreds ofmillions of nodes and billions of edges¡SNAP software§ Snap.py for Python, SNAP C ¡SNAP datasets§ Over 70 network datasetsCS224W, Fall 2019

¡Prebuilt packages available for Mac OS X, Windows, Linux¡Snap.py ml§ Quick Introduction, Tutorial, Reference Manual¡SNAP user mailing eveloper resources§ Software available as open source under BSD license§ GitHub thonCS224W, Fall 2019

¡Source code available for Mac OS X, Windows, Linux¡SNAP documentation¡SNAP user mailing p://snap.stanford.edu/snap/doc.html§ Quick Introduction, User Reference Manual§ Source code, see ss¡Developer resources§ Software available as open source under BSD license§ GitHub repositoryhttps://github.com/snap-stanford/snap§ SNAP C Programming GuideCS224W, Fall 2019

Collection of over 70 social network datasets:http://snap.stanford.edu/dataMailing list: http://groups.google.com/group/snap-datasets§ Social networks: online social networks, edges representinteractions between people§ Twitter and Memetracker : Memetracker phrases, links and 467million Tweets§ Citation networks: nodes represent papers, edges representcitations§ Collaboration networks: nodes represent scientists, edgesrepresent collaborations (co-authoring a paper)§ Amazon networks : nodes represent products and edges linkcommonly co-purchased productsCS224W, Fall 2019

¡Snap.py (pronounced “snappy”):SNAP for Pythonhttp://snap.stanford.edu/snappyUser CodeSnap.pyC Fast Execution Easy to use, interactiveüüPythonSnap.py (C , Python)CS224W, Fall 2019PythonC SNAPSolutionPythonüü

¡Installation:§ Follow instructions on the Snap.py webpagepip install snap-stanfordIf you encounter problems, please report them on PiazzaCS224W, Fall 2019

XfLUCAY3eYvdcBA98TUMMusVZkwmpdaI/edit?usp sharingCS224W, Fall 2019

¡The most important step for using Snap.py:Import the snap module! python import snapCS224W, Fall 2019

¡On the ndex-tut.html¡We will cover:§ Basic Snap.py data types§ Vectors, hash tables and pairs§ Graphs and networks§ Graph creation§ Adding and traversing nodes and edges§ Saving and loading graphs§ Plotting and visualizationCS224W, Fall 2019

Variable types/names:¡¡¡.Int: an integer operation, variable: GetValInt().Flt: a floating point operation, variable; GetValFlt().Str: a string operation, variable; GetDateStr()Classes vs. Graph Objects:¡¡T.: a class type; TUNGraphP.: type of a graph object; PUNGraphData Structures:¡¡.V: a vector, variable TIntV InNIdV.VV: a vector of vectors (i.e., a matrix), variable FltVVTFltVV a matrix of floating point elements¡.H: a hash table, variable NodeHTIntStrH a hash table with TInt keys, TStr values¡.HH: a hash of hashes, variable NodeHHTIntIntHH a hash table with TInt key 1 and TInt key 2¡.Pr: a pair; type TIntPrCS224W, Fall 2019

¡¡¡¡¡¡¡¡¡¡Get.: an access method, GetDeg()Set.: a set method, SetXYLabel().I: an iterator, NodeIId: an identifier, GetUId()NId: a node identifier, GetNId()EId: an edge identifier, GetEId()Nbr: a neighbor, GetNbrNId()Deg: a node degree, GetOutDeg()Src: a source node, GetSrcNId()Dst: a destination node, GetDstNId()CS224W, Fall 2019

¡¡¡TInt: IntegerTFlt: FloatTStr: String¡¡Used primarily for constructing composite typesIn general no need to deal with the basic types explicitly§ Data types are automatically converted between C andPython§ An illustration of explicit manipulation: i snap.TInt(10) print i.Val10¡Note: do not use an empty string “” in TStr parametersCS224W, Fall 2019

For more information check out Snap.py Reference e/index-ref.htmlCS224W, Fall 2019

SNAP User Reference , Fall 2019

¡Sequences of values of the same type§ New values can be added the end§ Existing values can be accessed or changed¡Naming convention: T type name V§ Examples: TIntV, TFltV, TStrV¡Common operations:§§§§Add( value ): add a valueLen(): vector size[ index ]: get or set a value of an existing elementfor i in V: iteration over the elementsCS224W, Fall 2019

v 5)print v.Len()print v[3]v[3] 2*v[2]print v[3]for item in v:print itemfor i in range(0, v.Len()):print i, v[i]CS224W, Fall 2019Create an empty vectorAdd elementsPrint vector sizeGet and set element valuePrint vector elements

¡A set of (key, value) pairs§ Keys must be of the same types, values must be of the sametype (could be different from the key type)§ New (key, value) pairs can be added§ Existing values can be accessed or changed via a key¡Naming: T key type value type H§ Examples: TIntStrH, TIntFltH, TStrIntH¡Common operations:§§§§§§[ key ]: add a new or get or set an existing valueLen(): hash table sizefor k in H: iteration over keysBegI(), IsEnd(), Next(): element iteratorsGetKey( i ): get i-th keyGetDat( key ): get value associated with a keyCS224W, Fall 2019

h snap.TIntStrH()h[5]h[3]h[9]h[6]h[1] Create an empty tableAdd elementsprint h.Len()Print table sizeprint "h[3] ", h[3]Get element valueh[3] “peach"print "h[3] ", h[3]Set element valuefor key in h:print key, h[key]Print table elementsCS224W, Fall 2019

¡T key type value type H§ Key: item key, provided by the caller§ Value: item value, provided by the caller§ KeyId: integer, unique slot in the table,calculated by Jason”CS224W, Fall 2019

¡A pair of (value1, value2)§ Two values, type of value1 could be different fromthe value2 type§ Existing values can be accessed¡Naming: T type1 type2 Pr§ Examples: TIntStrPr, TIntFltPr, TStrIntPr¡Common operations:§ GetVal1: get value1§ GetVal2: get value2CS224W, Fall 2019

p snap.TIntStrPr(1,"one") print p.GetVal1()1 print p.GetVal2()one¡¡¡Create a pairPrint pair valuesTIntStrPrV: a vector of (integer, string) pairsTIntPrV: a vector of (integer, integer) pairsTIntPrFltH: a hash table with (integer,integer) pair keys and float valuesCS224W, Fall 2019

¡Graphs vs. Networks Classes:§ TUNGraph: undirected graph§ TNGraph: directed graph§ TNEANet: multigraph with attributes on nodes andedges¡Object types start with P , since they usewrapper classes for garbage collection§ PUNGraph, PNGraph, PNEANet¡Guideline§ For class methods (functions) use T§ For object instances (variables) use PCS224W, Fall 2019

G1 ,12)G2 snap.TUNGraph.New()N1 snap.TNEANet.New()CS224W, Fall 2019Create directedgraphAdd nodesbefore addingedgesCreate undirected graph,directed network

Traverse nodesfor NI in G1.Nodes():print "node id %d, out-degree %d, in-degree %d"% (NI.GetId(), NI.GetOutDeg(), NI.GetInDeg())Traverse edgesfor EI in G1.Edges():print "(%d, %d)" % (EI.GetSrcNId(), EI.GetDstNId())Traverse edges by nodesfor NI in G1.Nodes():for DstNId in NI.GetOutEdges():print "edge (%d %d)" % (NI.GetId(), DstNId)CS224W, Fall 2019

Save textsnap.SaveEdgeList(G4, "test.txt", “List of edges")Load textG5 snap.LoadEdgeList(snap.PNGraph,"test.txt",0,1)FOut ave binaryFIn snap.TFIn("test.graph")G4 snap.TNGraph.Load(FIn)Load binaryCS224W, Fall 2019

¡Example file: wiki-Vote.txt§ Download from http://snap.stanford.edu/data# Directed graph: wiki-Vote.txt# Nodes: 7115 Edges: 103689# FromNodeIdToNodeId010203040526 Load textG5 4W, Fall 2019

¡Plotting graph properties§ Gnuplot: http://www.gnuplot.info¡Visualizing graphs§ Graphviz: http://www.graphviz.org¡Other options§ Matplotlib: http://www.matplotlib.orgCS224W, Fall 2019

¡Install Gnuplot:http://www.gnuplot.info/¡Make sure that the directory containingwgnuplot.exe (for Windows) or gnuplot (forLinux, Mac OS X) is in your environmentalvariable PATHCS224W, Fall 2019

G snap.LoadEdgeList(snap.PNGraph, “stackoverflow-Java.txt", 0, 1)snap.PlotInDegDistr(G, "Stack-Java", "Stack-Java In Degree")Graph of Java QA onStackOverflow:in-degree distributionCS224W, Fall 2019

¡Snap.py generates three files:§ .png is the plot§ .tab file contains the data (tab separated file)§ .plt file contains the plotting commandsCS224W, Fall 2019

¡Install GraphViz:http://www.graphviz.org/¡Make sure that the directory containingGraphViz is in your environmental variable PATHCS224W, Fall 2019

G1 snap.TNGraph.New()Create dEdge(1,5)G1.AddEdge(5,1)G1.AddEdge(5,12)NIdName snap.TIntStrH()NIdName[1] "1"NIdName[5] "5"NIdName[12] "12"Set node labelsDrawsnap.DrawGViz(G1, snap.gvlDot, "G1.png", "G1", NIdName)CS224W, Fall 2019

G snap.LoadEdgeList(snap.PNGraph, "qa.txt", 1, 5)snap.PrintInfo(G, "QA Stats", "qa-info.txt", False)Output:QA Stats: DirectedNodes:Edges:Zero Deg Nodes:Zero InDeg Nodes:Zero OutDeg Nodes:NonZero In-Out Deg Nodes:Unique directed edges:Unique undirected edges:Self Edges:BiDir Edges:Closed triangles:Open triangles:Frac. of closed triads:Connected component size:Strong conn. comp. size:Approx. full diameter:90% effective diameter:CS224W, Fall 9

¡Complete, circle, grid, star, tree graphsGG snap.GenGrid(snap.PUNGraph, 4, 3)GT snap.GenTree(snap.PUNGraph, 4, 2)CS224W, Fall 2019

¡¡¡Erdos-Renyi, Preferential attachmentForest Fire, Small-world, Configuration modelKronecker, RMat, Graph rewiringGPA snap.GenPrefAttach(30, 3, snap.TRnd())CS224W, Fall 2019

¡Extract subgraphs, convert from one graphtype to anotherGet an induced subgraph on aset of nodes NIdV:NIdV snap.TIntV()for i in range(1,9): NIdV.Add(i)SubGPA snap.GetSubGraph(GPA, NIdV)CS224W, Fall 2019

¡Analyze graph connectedness§ Strongly and Weakly connected components§ Test connectivity, get sizes, get components, get largest§ Articulation points, bridges§ Bi-connected, 1-connectedMxWcc snap.GetMxWcc(G)Get largest WCCprint "max wcc nodes %d, edges %d" %(MxWcc.GetNodes(), MxWcc.GetEdges())WccV snap.TIntPrV()snap.GetWccSzCnt(G, WccV)Get WCC sizesprint "# of connected component sizes", WccV.Len()for comp in WccV:print "size %d, number of components %d" %(comp.GetVal1(), comp.GetVal2())CS224W, Fall 2019

¡Analyze node connectivity§ Find node degrees, maximum degree, degreedistribution§ In-degree, out-degree, combined degreeNId snap.GetMxDegNId(GPA)print "max degree node", NIdDegToCntV snap.TIntPrV()snap.GetDegCnt(GPA, DegToCntV)for item in DegToCntV:print "%d nodes with degree %d" % (item.GetVal2(), item.GetVal1())max degree node 113 nodes with degree 34 nodes with degree 43 nodes with degree 52 nodes with degree 61 nodes with degree 71 nodes with degree 92 nodes with degree 102 nodes with degree 111 nodes with degree 131 nodes with degree 15CS224W, Fall 2019Get node with max degreeGet degree distribution

¡Find “importance” of nodes in a graph§ PageRank, Hubs and Authorities§ Degree-, betweenness-, closeness-, farness-, andeigen- centralityPRankH snap.TIntFltH()snap.GetPageRank(G, PRankH)Calculate nodePageRank scoresPrint them outfor item in PRankH:print item, PRankH[item]CS224W, Fall 2019

¡Analyze connectivity among the neighbors§ # of triads, fraction of closed triads§ Fraction of connected neighbor pairs§ Graph-based, node-basedTriads snap.GetTriads(GPA)print "triads", TriadsCount triadsCalculate clusteringcoefficientCC snap.GetClustCf(GPA)print "clustering coefficient", CCCS224W, Fall 2019

¡Distances between nodes§ Diameter, Effective diameter§ Shortest path, Neighbors at distance d§ Approximate neighborhood (not BFS based)D snap.GetBfsFullDiam(G, 100)print "diameter", DED snap.GetBfsEffDiam(G, 100)print "effective diameter", EDCS224W, Fall 2019Calculate diameterCalculate effectivediameter

¡Identify communities of nodes§ Clauset-Newman-Moore, Girvan-Newman§ Can be compute time intensive§ BigClam, CODA, Cesna (C only)Clauset-Newman-MooreCmtyV snap.TCnComV()modularity snap.CommunityCNM(UGraph, CmtyV)for Cmty in CmtyV:print "Community: "for NI in Cmty:print NIprint "The modularity of the network is %f" % modularityCS224W, Fall 2019

¡Calculations based on graph adjacency matrix§ Get Eigenvalues, Eigenvectors§ Get Singular values, leading singular vectorsEigV snap.TFltV()snap.GetEigVec(G, EigV)nr 0for f in EigV:nr 1print "%d: %.6f" % (nr, f)CS224W, Fall 2019Get leadingeigenvector

¡Repeatedly remove nodeswith low degrees§ Calculate K-coreCore3 snap.GetKCore(G, 3)CS224W, Fall 2019Calculate 3-core

¡Introduction to SNAP ¡Snap.py for Python ¡Network analytics CS224W, Fall 2019 ¡ Stanford Network Analysis Platform (SNAP)

Related Documents:

SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity Qingyuan Zhao Stanford University qyzhao@stanford.edu Murat A. Erdogdu Stanford University erdogdu@stanford.edu Hera Y. He Stanford University yhe1@stanford.edu Anand Rajaraman Stanford University anand@cs.stanford.edu Jure Leskovec Stanford University jure@cs.stanford .

dimensions in 4', 5' and 8' lengths. All duct sizes and lengths are produced on the Snappy revolutionary Automatic Duct Line, insuring the highest quality duct available in todays industry. RECTANGULAR DUCT FITTINGS- Die punched parts and riveted construction make Snappy duct fittings the best in the market.

Computer Science Stanford University ymaniyar@stanford.edu Madhu Karra Computer Science Stanford University mkarra@stanford.edu Arvind Subramanian Computer Science Stanford University arvindvs@stanford.edu 1 Problem Description Most existing COVID-19 tests use nasal swabs and a polymerase chain reaction to detect the virus in a sample. We aim to

Domain Adversarial Training for QA Systems Stanford CS224N Default Project Mentor: Gita Krishna Danny Schwartz Brynne Hurst Grace Wang Stanford University Stanford University Stanford University deschwa2@stanford.edu brynnemh@stanford.edu gracenol@stanford.edu Abstract In this project, we exa

Stanford University Stanford, CA 94305 bowang@stanford.edu Min Liu Department of Statistics Stanford University Stanford, CA 94305 liumin@stanford.edu Abstract Sentiment analysis is an important task in natural language understanding and has a wide range of real-world applications. The typical sentiment analysis focus on

4 Writer’s Choice: Composition Practice, Grade 9, Unit 1 A. Writing a Snappy Beginning Your attitude toward what you write shows in your very first word. So begin with confidence and style! Follow the directions to try some snappy openers.

Stanford Health Care Organizational Overview 3 Contract Administration is a Shared Service of Stanford Health Care to Eight Other Stanford Medicine Entities Stanford Health are ("SH")is the flagship academic medical center associated with the Stanford University School of Medicine. SHC has 15,232 employees and volunteers, 613 licensed

Mar 16, 2021 · undergraduate and graduate students, faculty, staff, and members of the community. Anyone interested in auditioning for the Stanford Philharmonia, Stanford Symphony Orchestra, or Stanford Summer Symphony should contact Orchestra Administrator Adriana Ramírez Mirabal at orchestra@stanford.edu. For further information, visit orchestra.stanford.edu.