Geo-Friends Recommendation In GPS-Based Cyber-Physical .

3y ago
12 Views
2 Downloads
355.04 KB
8 Pages
Last View : 1m ago
Last Download : 3m ago
Upload by : Hayden Brunner
Transcription

Geo-Friends Recommendation in GPS-BasedCyber-Physical Social NetworkXiao Yu, Ang Pan, Lu-An Tang, Zhenhui Li, Jiawei HanComputer Science DepartmentUniversity of Illinois, at Urbana Champaign{xiaoyu1, angpan1, tang18, zli28, hanj}@illinois.eduAbstract—The popularization of GPS-enabled mobile devicesprovides social network researchers a taste of cyber-physicalsocial network in advance. Traditional link prediction methodsare designed to find friends solely relying on social networkinformation. With location and trajectory data available, we cangenerate more accurate and geographically related results, andhelp web-based social service users find more friends in the realworld. Aiming to recommend geographically related friends insocial network, a three-step statistical recommendation approachis proposed for GPS-enabled cyber-physical social network. Bycombining GPS information and social network structures, webuild a pattern-based heterogeneous information network. Linksinside this network reflect both people’s geographical information, and their social relationships. Our approach estimateslink relevance and finds promising geo-friends by employinga random walk process on the heterogeneous information network. Empirical studies from both synthetic datasets and reallife dataset demonstrate the power of merging GPS data andsocial graph structure, and suggest our method outperformsother methods for friends recommendation in GPS-based cyberphysical social network.I. I NTRODUCTIONPopular social network services like Facebook and Twitterallow users to store and share both digital information, e.g.,web content, and also locations and trajectories collected fromthe real world. Analyzing these extra data from physical worldcan help us better understand people’s daily activities, socialareas and life patterns. Social network with data collected fromsensors is usually refered as Cyber-Physical Social Network[3]. In this paper, we study friend recommendation problemin cyber-physical social network. With location and trajectoryinformation available, we improve the accuracy of the resultsand make on-line social services much closer to users’ reallife.One major difference between virtual web-based social network and real life social network is, new friends in real worldtend to be geographically related. Geographical similarity ishiding in users’ recently GPS data. To help web-based socialnetwork users find more friends in their real life, we definepotential real life friends, who have both social similaritiesand geographical correlation as Geo-Friends, and denote GeoFriends Finding Problem as real life friends discovery on webbased social network.The reason why we want to isolate geo-friends from generalweb-based social network friends is intuitive. Geo-friends playan important role in off-line social events, e.g., holiday party,football game, or book club. Geo-friends have a much higherprobability to participate these real life events than otherfriends from virtual social network.Following example demonstrate the idea of recommendinggeo-friends in web-based social network.Example 1: Alex wants to find some new geo-friends tojoin him in a local charity event. There are three candidates:Bob who shares a large number of friends with Alex, but livesin another country; Carlos who works in the same companywith Alex, but shares no similarity in terms of social networkstructure; David who shares couple of common friends, andalso go to the same gym, same comic book store as Alex doesevery week.After analyzing both social structure and recently collectedGPS data, social network services should recommend Davidas Alex’s geo-friends, since he has a higher probability toparticipate the local event with Alex.Previous approaches of link prediction which usually onlyrely on social network structure would recommend Bob. Butapparently, Bob is not a good candidate for Alex’s socialevents, since he lives in another country. Also, solely relyingon location or trajectory information for geo-friend findingdoes not work as well. Carlos who has a very high positivegeographical correlation with Alex shares no social interestswith Alex. Recommending Carlos to Alex is pointless as well.In this paper, we propose a a three-step approach, namedGEo-Friends Recommendation framework (a.k.a., GEFR).First, interesting and discriminative GPS patterns are extractedfrom a large amount of raw GPS data. Then we combinesboth geo-information and social network in a pattern-basedheterogeneous information network. By applying random walkto reproduce friends making process on the network, we caneffectively identify potential geo-friends for a specific user.The contributions of this paper are summarized as follows: Propose geo-friends recommendation problem, and discuss the differences from previously studied link prediction problem. Define and generate a set of GPS patterns to describepeople’s real life social interaction and correlation. Propose a random walk-based statistical framework forgeo-friend recommendation (GEFR). Design and conduct a series of experiments on bothsynthetic and real-world datasets. Demonstrate the powerof GEFR in various situations.

II. BACKGROUNDSANDP RELIMINARIESWe briefly introduce the related data model, and define geofriends finding problem in context.A. Data ModelGPS data are continuous in both spacial and temporaldimensions. However, different devices have different datasample rate, which leads to shifting time intervals. Withoutlosing generality, we assume constant sample rate within oneapplication.Definition 1 (GPS Trajectory): A GPS trajectory can begenerated by sequentially connecting GPS records of a specificuser following the ascending order of timestamps. We denote(n)(1) (2)GPS trajectory for a person j as Sj hsj , sj , ., sj i,(i)where sj is a GPS location record.Definition 2 (GPS-CPN): A GPS-Based Cyber-PhysicalSocial Network can be defined as G(S, V, E), similarly tosocial network, V is the set of vertices, represents all thepeople in the network. E is the set of edges, represents allthe links between people. S is the set of GPS trajectories, andeach trajectory in S is associated with a specific person in V ,represents this person’s movements.UserLayerUserLayerPattern AGPSPatternsGPSDataPattern CPattern B(a) Cyber Physical Social Network (b) Pattern-Based Heterogeneous Information NetworkFig. 1.GPS NetworksNotice that, people who carry GPS devices is a subset ofvertices V in this network, so S V . An example of GPSCPN can be found in Figure 1(a).B. Problem DefinitionIntuitively, geo-friends recommendation is trying to findpotential real life friends on web-based social network. Withdata model defined above, we formally define this problem as:Given a GPS-CPN G(S, V, E), and a specific query person v ,one method should return a ranked list of people nodes in Vand also for each element v ′ in the list, hv ′ , v i / E. What’smore, the ranking score in the process should consider bothGPS trajectory S and social network (V, E).III. G EO F RIENDS F INDING F RAMEWORKThis section describes the three-step geo-friends recommendation framework (GEFR) in a given cyber-physical socialnetwork, including GPS pattern extraction, pattern-based heterogeneous information network building and random walkon the network. Details of the three-step approach will bepresented in the following subsections.A. GPS Pattern ExtractionMost GPS applications use raw GPS data directly, e.g.,storage or visualization. However, raw GPS data are in hugesize and hard to provide people with a semantic understandingof human behavior, In order to better understand the hiddeninformation, we first present four different heuristics on geographically correlation based on empirical observations.Common Location: GPS locations can reflect people’s interests, and people tend to go to their interests related locationsmore often. If two people share common locations, whichsuggests they might share common interests, the probabilitythat they become friends would be higher.Common Routine: GPS trajectory segments indicate people’s habits and routines. People who share similar routines,tend to become friends.Meeting: If two people share same locations at the sametimestamps in their GPS trajectory, they should be geographically related.Hanging Out: Two people share same routine in a specifictime period, which indicates they are hanging out in thattime period. If two people hang out, the probability of theybecoming geo-friends would be higher.Based on above empirical observations and heuristics, wepropose four different GPS patterns to capture these information. We first convert raw GPS trajectory dataset S tocategorical dataset Scat , and sequential dataset Sseq 1 . In Scat ,we simply discard temporal information and keep discretizedlocations in a unordered manner. While in Sseq , locations arestill sequentially connected by the order of timestamps. Withcategorical dataset Scat , sequential dataset Sseq and originalGPS dataset S, we define GPS patterns as follow.Definition 3 (FL-Pattern): Closed frequent patterns withsupport 2 in Scat is defined as Frequent Location Patterns(a.k.a., FL-Patterns), following Common Location heuristic.Definition 4 (FT-Pattern): Closed sequential pattern withsupport 2 and length 2 in Sseq is defined as FrequentTrajectory Pattern (a.k.a., FT-Pattern), following CommonRoutine heuristic.Definition 5 (FLT-Pattern): For each FL-Pattern, if locations share the same timestamp in all corresponding GPStrajectories, and no super-pattern with the same support canbe generated by adding another time constrained location,this pattern can be defined as Frequent Location with TimeConstraint Patterns (a.k.a., FLT-Patterns), following Meetingheuristic.Definition 6 (FTT-Pattern): Similarly to FLT-Pattern, Frequent Trajectory with Time Constraint Pattern (a.k.a., FTTPattern) can be defined as closed sequential pattern withsupport 2 and length 2 in Sseq and it shares thesame time period in corresponding GPS trajectories, followingHanging Out heuristic.1 GPS data discretization process are related to GPS error analysis and actualGPS coordinates. GPS datasets in the experiments of this paper have alreadybeen discretized and preprocessed. For detailed methods, please refer to [4]and [5].

can be found in Equation 1.argmaxi By combining GPS patterns generated in last subsection,and the given social network, we can build a pattern-basedheterogeneous information network H(P, V, E ′ ) as follows.Given GPS-CPS G(S, V, E), we first discard raw GPS trajectory set S. For each GPS pattern, we create an additional nodep, and link corresponding person node v with p if this GPSpattern exists in person v’s GPS trajectory history. And thencreate a new edge hv, pi, and add it to E ′ . Notice that, edgeset E ′ in heterogeneous information network contains threetypes of edges, which are edges between people, edges fromperson nodes to pattern nodes, as well as edges from patternnodes to person nodes.An example of GPS pattern-based heterogeneous information network is presented in Figure 1(b).One needs to notice that, adding a large number of GPSpatterns without selection, can decrease the performance ofour method badly. This can be explained from different perspectives. 1) High frequency patterns usually indicate commonpublic locations people can all related to, e.g., a bus station,or a hospital. These locations do not carry any discriminativeGPS information, and they do not follow any heuristics wementioned above. 2) The number of low length and supportpatterns are gigantic. Adding such patterns into the networkwill lead to a substantial increase of link search space forrandom walk process, which will reduce both the efficiencyand precision.Instead of manually refine patterns, we employ an entropybased thresholding measure similar to [7] to filter out patternwith high or low support and length. Threshold calculationpj log(pj ) k LXpk log(pk )(1)k i 1PPb iwhere pj nj / a ia 0 na , pk nk /b 0 nb and n is thepattern frequency.We apply this measure to MIT Reality Mining dataset [2].As presented in Figure 2, after calculation of entropy-basedthresholds, we first select patterns in support histogram withthreshold 13 and 18, and then filter out patterns with lengthlower than 4. This pattern selection strategy provides similarresults as our manual parameter tuning experiments.5350103004Num of 40(a) Support Count HistogramFig. 2.B. Building Pattern-Based Information Networkj iXj 0Num of PatternsTo mine FL-Patterns, we apply FP-Growth [6] on Scat andgenerate closed frequent patterns with support 2. AfterFL-Pattern generation, there are two methods to generate FTPatterns from Sseq . 1) Directly apply PrefixSpan [10] on Sseq ,and extract all closed sequential frequent patterns, or 2) firstcalculate the permutation set of each FL-Pattern, generatecombination set for each permutation, and then simply checkeach combination against Sseq , to make sure ascending timestamp order still hold. By collecting combinations in ascendingtimestamp order, FT-Pattern set could be generated as well.Method 2 could be more efficient if FL-Pattern set is small,and we used Method 2 in our experiments.FLT-Patterns can be generated based on FL-Patterns. Wefirst calculate the combination set for each FL-Pattern, andcheck the same timestamp constraint in each combinationin the raw GPS dataset. By collecting combinations holdingsame timestamp constraint, FLT-Pattern set can be generatedefficiently. Notice that, closed frequent pattern in Scat are notthe sufficient candidate set for FLT-Pattern mining. It is verylikely that only partial FL-Pattern meet the same timestampconstraint, so it is necessary to calculate the combinationset before check the time constraint. FTT-Patterns can begenerated from FT-Patterns in a similar way.00246810Length(b) Length Count HistogramGPS Pattern Selection using Entropy MeasureAfter the construction of the heterogeneous informationnetwork, edge weights between nodes need to be definedbefore random walk process. By adding filtered pattern nodesinto social network, edge set E ′ now contains three types ofedges, including, edges from GPS pattern node p to personnode v, edges from person node v to pattern node p, and alsoedges from person node v to person node v ′ . No edges aredefined between GPS patterns. We first define edge weightsfrom different types of GPS pattern nodes to person nodes.We use w( v, p) to represent raw edge weights, which will benormalized before random walk process.wF L (v, p) log (1 length(p))·α N bp (v) (2)wF T (v, p) log (1 length(p))·β N bp (v) (3)wF LT (v, p) log (1 length(p))·γ N bp (v) (4)wF T T (v, p) log (1 length(p) · timespan(p))·θ N bp (v) (5)where N bp (v) denotes the set of pattern nodes connecting toperson node v, length(p) denotes the length of pattern p, andtimespan(p) denotes time span of a time constraint patternp, in terms of number of timestamps. Parameter α, β, γ andθ controls pattern importance, and setting will be discussed innext section.We define edge weights starting from pattern nodes toperson nodes as follows.

w(p, v) 1 N bv (p) (6)where N bv (p) denotes the set of person nodes connecting topattern node p.The third type of edge are from person nodes to personnodes, we define the edge weight from person node v to itsneighbor v ′ as:w(v, v ′ ) 1 N bv (v) Output: A ranked list of recommended geo-friends(7)where N bv (v) denotes the set of person nodes connected toperson node v.From above weight definitions for edges from person nodesto pattern nodes, one can notice that, edges weights fromperson nodes to pattern nodes are various based on patterntypes, and pattern attributes, including length and timespan.Parameters α, β, γ and θ are designed to adjust the importance of different types of pattern in different scenarios.Edge weights from pattern nodes to person nodes, and fromperson nodes to person nodes are solely based on number ofneighbors. If number of people neighbor is smaller, the edgeweight will be larger.C. Random Walk on Heterogeneous Information NetworkRandom walk process on graph has been widely used insocial network analysis [15] [11] [17]. To apply random walkon GPS pattern-based information network, we first need todefine a transition probability matrix to describe all transitionprobability on the edge set of H. To represent all possibletransitions on H, the size of the matrix should be ( V P ) · ( V P ). We sort all person nodes based on their IDin GPS trajectory dataset, and then append a sorted patternnodes following the order they were extracted. By combiningequations 2 to 7, we can generate a transition weight matrix P r(H).One thing we need to notice is that, by adding additionalpattern attributes into edge weights definitions, includinglength and timespan, patterns are more semantic but the sumof weights on out going edges of person nodes is no longerequals to 1, we need to normalize edge weights of pattern to get transitionnodes in transition weight matrix P r(H)probability matrix P r(H) of the heterogeneous informationnetwork, following Equation 8.w(v, n)m N b(v) w(v, m)pr(v, n) PAlgorithm 1: Geo Friends Recommendation FrameworkInput: A GPS-based cyber-physical social networkG(S, V, E), where S is the GPS trajectorydataset, V is person node set, and E is edge set.A recommending target person v . Number ofrecommended friends K. Parameters α, β, γ andθ for edge weights definitions, and λ for randomwalk process.(8)where n is a node connected to person node v, and N bv denoteall nodes connected to person node v. No normalization isrequired for pattern nodes, because the weight on out goingedges of pattern nodes only depend on the number of personnode neighbors, the sum of which is already 1. We denotenormalized matrix as transition probability matrix P r(H) .For ease of presentation, we simplify the representation ofP r(H) as1, Generate FL-Patterns, FT-Patterns, FLT-Patterns andFTT-Patterns based on Definitions 4 to 7;2, Filter out biased patterns and shrink link search spaceby using Equation 1;3, Construct heterogeneous information networkH(P, N, E ′ ) by adding pattern nodes into GPS-basedcyber-physical social network, link pattern nodes withcorresponding person nodes, and define edge weithsfollowing Equations from 2 to 7;4, Generate transition probability matrix and normalizeeach column following Equation 8;(t)5, Iteratively update RN using Equations 10 or 11regarding input query node v until it converges;6, Rank link relevance based on the results from laststep, and return the top-k nodes as the final results; P r(H) P r(V )P r(B)P r(A)0 (9)where P r(V ) is an V V matrix representing the transition probability between person nodes to person nodes, asdefined in Equation 7. P r(V ) is not a symmetric matrix, sincetransition probability between two people nodes are definedbased on number of neighbors of the starting node. P r(A) isa P V matrix representing the transition probability fromGPS pattern nodes to person nodes, as defined in Equation6. P r(B) is a V P matrix representing the transitionprobability from person nodes to GPS pattern nodes, as definedin Equation 2, 3, 4, 5 and 8.We choose random walk with restart process to simulategeo-friends finding process, and estimate link relevance for aspecific query person v in heterogeneous information networkfor several reasons. This statistical approach is stable androbust in different datasets, and also random walk with restartprocess satisfies some basic heuristics for identifying geofriends.First we present a list of GPS pattern utilizing heuristics,which random walk process can simulate:(1) If a GPS pattern contains more geographical informationor has a stronger semantic meaning, the in-coming probabilityfrom person nodes to this pattern shoul

work. Empirical studies from both synthetic datasets and real-life dataset demonstrate the power of merging GPS data and social graph structure, and suggest our method outperforms other methods for friends recommendation in GPS-based cyber-physical social network. I. INTRODUCTION Popular social network services like Facebook and Twitter

Related Documents:

Nexo Geo M6 450 Watt 6,5" 1" / 80 (120 ) 20 CHF 70.00 Nexo Geo M6B 450 Watt 1 6,5" CHF 64.60 Nexo Geo S805 600 Watt 8" 1,4" / 80 5 CHF 75.40 Nexo Geo S830 600 Watt 8" 1,4" / 120 5 CHF 75.40 Nexo Geo S1210, Bi-Amp 1550 Watt 12" 1,4" / 80 10 CHF 96.95 Nexo Geo

3 Geo.513 Structural Geology and Geology of Nepal 4 100 4 Geo.514 Sedimentology 2 50 5 Geo.515 Practical of Geo.511 2 50 6 Geo.516 Practical of Geo.512 2 50 . Cambridge University Press, 536 p. 4. Adolf Seilacher, Trace Fossil Analysis, 2007, Springer Verlag, 238 p. 5. Moore, P. D., Webb, J.

GEO 827 -Digital Image Processing and Analysis DIPA Jiaguo Qi, Jiquan Chen and Ranjeet John Email: qi@msu.edu, jqchen@msu.edu, ranjeetj@msu.edu Tel. 517-353-8736 517-214-6675 Office: GEO 206 or MM 218 Lecture: T & Th: 5:20 -6:10pm; RM. GEO 201 Lab: T & Th: 7:00-8:50pm; RM GEO 201 GEO 827 -Digital Image Processing and Analysis Fall 2015

3. Overview of the Bible 2. How did the Bible come into being? 4. The First process of the Bible GPS is Understanding. 5. The Second process of the Bible GPS is Application. The Third process of the Bible GPS is Communication. 6. The Bible GPS on Galatians 5: 16-26 7. The Bible GPS on Ephesians 5: 8-20 8. The Bible GPS on Romans 3: 21-26

Development – GL ECO 4703 International Trade Theory and Policy ECO 4733 Multinational Corporation ANT 4352 4). Geography courses GEO 3001 Geographies of Global Change – GL GEO 3471 Political Geography GEO 3502 Economic Geography – GL GEO 4354 Geography of the Global Food System – GL GEO 4476 Political Ecology CPO 3304 5). History courses

Nexo Geo M6 450 Watt 6,5" 1" / 80 (120 ) 20 SFr. 70.20 Nexo Geo M6B 450 Watt 1 6,5" SFr. 64.80 Nexo Geo S805 600 Watt 8" 1,4" / 80 5 SFr. 75.60 Nexo Geo S830 600 Watt 8" 1,4" / 120 5 SFr. 75.60 Nexo Geo

Nexo Geo S805 7x L, 7x R. Nexo Geo S830 1x L, 1x R. Nexo Geo sub d12 2x L, 2x R. Nexo Geo PS8 1x L, 1x R (infill) Nexo Geo ID24 2x (infill) Mogelijkheid tot los aansturen van sub. Verstekers 2x Nexo NXAMP 4x4 1x Camco Tecton 384 3x QSC Audio RMX 2450 MENGTAFEL Midas Venice 24

Banking standards, requiring the largest UK banks (the ‘CMA9’) as ASPSPs3 to develop ‘Open APIs’ to provide access to Third Party Providers (TPPs) for retail and SME4 customer accounts. The Open Banking Implementation Entity (OBIE) was created as a Special Purpose Vehicle to instruct and