Dynamic Knowledge Graph Based Multi-Event Forecasting

2y ago
40 Views
2 Downloads
2.85 MB
11 Pages
Last View : 10d ago
Last Download : 3m ago
Upload by : Tia Newell
Transcription

Dynamic Knowledge Graph based Multi-Event ForecastingSonggaojun DengHuzefa RangwalaYue NingStevens Institute of TechnologyHoboken, New Jerseysdeng4@stevens.eduGeorge Mason UniversityFairfax, Virginiarangwala@cs.gmu.eduStevens Institute of TechnologyHoboken, New ng concurrent events of multiple types and their involved actors from open-source social sensors is an important task for manydomains such as health care, disaster relief, and financial analysis.Forecasting events in the future can help human analysts betterunderstand global social dynamics and make quick and accuratedecisions. Anticipating participants or actors who may be involvedin these activities can also help stakeholders to better respond to unexpected events. However, achieving these goals is challenging dueto several factors: (i) it is hard to filter relevant information fromlarge-scale input, (ii) the input data is usually high dimensional,unstructured, and Non-IID (Non-independent and identically distributed) and (iii) associated text features are dynamic and varyover time. Recently, graph neural networks have demonstratedstrengths in learning complex and relational data. In this paper,we study a temporal graph learning method with heterogeneousdata fusion for predicting concurrent events of multiple types andinferring multiple candidate actors simultaneously. In order to capture temporal information from historical data, we propose Glean,a graph learning framework based on event knowledge graphsto incorporate both relational and word contexts. We present acontext-aware embedding fusion module to enrich hidden featuresfor event actors. We conducted extensive experiments on multiplereal-world datasets and show that the proposed method is competitive against various state-of-the-art methods for social eventprediction and also provides much-need interpretation capabilities.CCS CONCEPTS Information systems Data mining; Computing methodologies Knowledge representation and reasoning; Temporal reasoning.KEYWORDSMulti-Event Forecasting; Knowledge Graphs; Word GraphsACM Reference Format:Songgaojun Deng, Huzefa Rangwala, and Yue Ning. 2020. Dynamic Knowledge Graph based Multi-Event Forecasting. In Proceedings of the 26th ACMSIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’20),August 23–27, 2020, Virtual Event, CA, USA. ACM, New York, NY, USA,11 pages. https://doi.org/10.1145/3394486.3403209Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for components of this work owned by others than ACMmust be honored. Abstracting with credit is permitted. To copy otherwise, or republish,to post on servers or to redistribute to lists, requires prior specific permission and/or afee. Request permissions from permissions@acm.org.KDD ’20, August 23–27, 2020, Virtual Event, CA, USA 2020 Association for Computing Machinery.ACM ISBN 978-1-4503-7998-4/20/08. . . e ressase intepr nt t02/23 opert ventforecasting0.40.2Accuse02/27/15ProtestConsult 0.40.3Visit0.20.1Event Summary: A Politicianattacked the state governmenton various fronts such as fertilizercrunch and land acquisition Figure 1: A motivating example of event graphs for multievent and multi-actor forecasting. Nodes denote entities andedges denote event types and timestamps.1INTRODUCTIONEvents such as organized crime, civil unrest movements, and disease outbreaks have a major impact on society. Previous eventforecasting approaches mainly focus on predicting the occurrenceof a given event type, e.g., protest [8, 24] or event subtypes, e.g. airpollution of CO vs. PM2.5 [11]. These methods are unable to identifyconcurrent events of multiple types. Meanwhile, existing methodsneglect to predict the potential actors in an event, which is crucialfor decision making. As shown in Figure 1, the entities/actors involved in social events are usually those who initiated an event orwere targeted in an event (e.g. “farm worker” plays multiple rolesin two events). Multi-event forecasting involves inferring multipleconcurrent events of different types and participants in the futureusing historical input data.Currently, events are often extracted from formal reports ornews articles and structured as temporal knowledge graphs withadditional textual features. An event graph, as shown in Figure 1,consists of entities (as nodes) and event types (as edges), whereeach edge has a timestamp indicating when the event occurred.Each event has a synopsis describing the content of the event.For instance, an event that occurred on Feb. 26, 2015, was CitizenCriticizes Government. The summary of the event is “A Politicianattacked the state government on various fronts such as fertilizercrunch and land acquisition act”. These previous events among farmworkers, the government, the agriculture ministry, and citizensprovide indicators for the prediction of “protest” events and “farmworker” participants in the future.Identifying related actors and past events in the prediction models provides backgrounds and clues for understanding the development of events and their hidden factors. Prior work [8, 24, 35] inthis area has sought to identify supporting evidence in the form ofprecursor documents or context graphs while forecasting futureevents. However, the identified evidence in the existing work isoften difficult to understand or requires further manual inspection

when predicting multiple concurrent events of different types. Recognizing the correct set of actors and clues for each event typeis difficult given the complex connections under past events. Thetask of multi-event and multi-actor forecasting presents severalchallenges:We evaluate the proposed method with other state-of-the-art models on real-world datasets with large-scale entity sets and event typesets. With quantitative and qualitative experiments, we demonstratethe strengths of the proposed method in multi-type and multi-actorevent forecasting. Structured and unstructured features Event data is usually amixture of structured records (time, actors, types, etc.) and unstructured textual information (e.g., the event summary shownon an edge in Figure 1). Utilizing both forms of data for eventforecasting is important given that relational records provide keyelements of events while textual features provide enriched background information. However, few studies have performed thefusion of heterogeneous data for concurrent event forecasting. Beyond link prediction: Knowledge graph completion models relational information by predicting links between entities.However, it is difficult in practice to apply this line of researchto predict both relations (event types) and entities (actors). Mostknowledge graph completion methods only model the inherentstructure of relational data and fail to exploit global historicaldata for future event predictions. Beyond event forecasting: Prior research on event modelingmainly focuses on forecasting event occurrences or counts inthe future using either pre-defined features [39] or pre-trainedembeddings [24]. Graph features constructed from text data haveproven to be advantageous in deep learning methods for eventprediction [8]. However, it is difficult to automatically extractevent actors from graph-based text features.2We address the above challenges by considering two interconnected sub-problems: forecasting multiple co-occurring events ofdifferent types and inferring multiple actors in each event type.We propose a new methodology to fuse relational (event graphs)and semantic (text) information from heterogeneous historical datato predict concurrent events of multiple types and multiple actorsin the future. Given this enriched information, we also provideinterpretations for the prediction results. The model is able to automatically select salient actors, words, and relevant events from thehistorical input data based on learned weights. Our contributionsare summarized as follows: We design a novel multi-event multi-actor forecasting framework that (1) utilizes global event information from entities, eventtypes, and event descriptions, to predict concurrent events of multiple types; and (2) predicts potential participants (actors/entities)in these events with temporal and intrinsic inference modulesbased on historical events and the inherent association betweenentities and event types. We introduce a new encoding method for integrating both dynamic graph data (event graphs) and text data (documents andsummaries) into graph-based relational features. We also providea graph sampling method to obtain specific features in historyfor a given event-type, thereby eliminating the unwanted noise. Given the entities and event types in event graphs, we identify related words from event summaries and propose a context-awareembedding fusion method. We incorporate an attention mechanism to learn the importance of each identified word and capturelocal contextual semantics.RELATED WORKOur study is closely related to a large body of literature on eventforecasting and knowledge graph completion.Event Forecasting. There has been extensive research on eventforecasting with spatio-temporal correlations, including many realworld applications such as stock market movements [2], diseaseoutbreaks [29], and criminal activities [34]. Most existing machinelearning methods only work on euclidean or grid-like data. For instance, linear regression models use tweet frequency (and quantity)to predict the occurrence time of future events [2]. More sophisticated features (e.g., topic-related keywords [34]) and models (e.g.multi-task learning[25, 39]) have also been investigated. Predictive methods have begun to confront the complex structure ofsocial events and their underlying relations by considering thetemporal evolution of event-related indicators or utilizing both temporal and geographical graph information [25]. Recently, GraphConvolutional Networks (GCN) have been proposed to addressnon-euclidean data in many domains such as social networks[26],natural language processing [20] and bio-medicine [10]. A dynamicgraph convolutional network [8] was proposed to encode temporal text features into graphs for forecasting societal events andidentifying their context graphs.Most of these aforementioned models focus on the predictionof a single type of events or do not recognize the sub-types ofevents. Thus they are not flexible enough given that different modelsare trained for different types of events, which is computationallyexpensive and cannot guarantee good performance.Event subtypes have also been studied. A multi-task multi-classdeep learning model has been proposed for predicting the futureoccurrence of subtype events [11]. Different types of events occursimultaneously and it is necessary to model concurrent events.However, many of the existing approaches have neglected hidden relational knowledge among entities. In this case, they cannotpredict the actors of events who have significant impacts on socialmovements. Our proposed approach aims to predict multiple eventswhere each event includes elements such as actors and event type.Events are augmented with textual descriptive summaries. Thisgives us the benefit of discovering the impact of hidden connectionsamong entities for predicting future events and actors/entities.Knowledge Graph Completion Although knowledge Graphs(KGs) have been recognized in many domains, most KGs are farfrom complete and are growing rapidly. Thus, KG completion (orlink prediction) has been proposed to improve KGs by filling themissing connections.Extensive studies have been done on modeling static, multirelation graph data for link prediction. These methods mainly embed entities and relations into low-dimensional spaces [3, 9, 32, 36].Among these methods, Relational Graph Convolutional Networks(RGCN) [28] is developed to generalize the GCN model [16] by dealing with direct, multi-relational graphs such as knowledge graphs.

Recently, CompGCN [33] has been proposed to jointly embed bothnodes and relations in a relational graph. These methods achievehigh accuracy on modeling static knowledge graphs.There are recent attempts to incorporate temporal informationin modeling dynamic knowledge graphs where facts may evolveover time. Know-Evolve [31] models the occurrence of a fact (edge)as a temporal point process. Embedding-based methods have beenproposed to model temporal information, such as relation embeddings [12], time embedding [17], and temporal hyperplane [7].These methods cannot generalize to unseen timestamps. Beyondlink prediction, Jin et al. studied an autoregressive architecture formodeling temporal sequences of multi-relational graphs [14]. Weemploy temporal knowledge graphs in multi-type event forecasting.The difference between our work and temporal KG completion isthat we are predicting events (and then actors) at future times. Weformulate the problem into a multi-label classification problem. Weconsider both unstructured text data (event summaries) and graphdata (event graphs) with temporal and relational information.3PROBLEM FORMULATIONThe objective of this study is to forecast multiple co-occurringfuture events along with their event types (e.g., consult, provideaid, etc) and identification of involved key actors (e.g., politicians,organizations, etc). We formulate the knowledge-graph-based eventforecasting task as two multi-label classification problems. We firstintroduce the formulation of the problems and then describe thenecessary definitions used in our proposed method. The importantmathematical notations are in Table 1.Problem 1. Multi-Event Forecasting. We model the probabilities of a set of event types occurring at a future timestamp t basedon historical input data X 1:t 1 : P yt X 1:t 1 .(1)Here yt R R is a vector of event types and X 1:t 1 can beencoded from a set of graphs, documents, or word frequencies.Problem 2. Multi-Actor Forecasting. In this problem, giventhe relevant historical data of an event, we model the probabilitiesof a set of actors (subject or object entities) being involved in agiven type of event yt : P at yt , X 1:t 1 ,(2)where at R E is a vector of entities. In this work, we considerboth relational information and textual event summaries as enhanced information in history to address these two problems. Next,we introduce the necessary definitions used in this paper.Definition 3.1 Temporal Event Graph. A temporal event graphis a multi-relational, directed graph with time-stamped edges (eventtypes) between nodes (entities). Let E be a finite set of entities and Rbe a finite set of event types. An event is defined as a time-stampededge (i.e., subject entity (s), event type (r ), object entity (o) and timet) succinctly represented by a quadruple (s, r, o, t) or (st , r t , ot ),where s, o E and r R. We denote a set of events at time t asGt {(s, r, o, t)}t . Temporal event graphs are built upon a sequenceof event sets in ascending time order {G1, ., GT }, where each timestamped edge has a direction pointing from the subject entity tothe object entity. The same triple (s, r, o) may occur multiple timesin different timestamps, yielding different event quadruples. EachTable 1: Important notations and descriptionsNotationsE, R, VGt , DtGtr , Dtrhu , orbωHt Ht , OtH t , O tdDescriptionsentity, event type, and vocabulary setsevent and word graphs at time tevent and word subgraphs of event type r at time tlearnable embedding of entity u and event type rpre-trained word2vec embedding of word ωsemantic embedding matrix of words at time trelational embedding matrices of entities and event typesat time tfused embedding matrices of entities and event types attime tfeature dimensionunique entity u (s or o) and unique event type r will be initializedwith an embedding vector as hu Rd , or Rd .Definition 3.2 Temporal Word Graph. Each event is attachedwith a paragraph of text or summary. We collect all the event summaries at each timestamp. Formally, we denote a set of summariesat time t as Dt {c}t . For each timestamp t, we construct anundirected word graph Dt where each node denotes a unique wordfrom Dt . The edges are formed based on word co-occurrence inthe summary set Dt . Specifically, we employ point-wise mutualinformation (PMI) [6] to measure the semantic relevance of twowords, that is, the edge weight. In the document set Dt , we definean edge between word ω and word φ if PMI(ω, φ) 0, notated as(ω, φ) Dt , that is Dt {(ω, φ)}t . Vt contains all words at timet and V is the vocabulary set. Each word ω is initialized with apre-trained embedding vector bω Rd . Temporal word graphs arebuilt from a series of summary sets, sorted in ascending time order.In the following section, we use over a letter to representrelational embeddings in event graphs, to denote semantic embeddings in word graphs, and to denote fused embeddings.4METHODOLOGYFigure 2 provides an overview of the multi-event forecasting framework that consists of three parts: (1) graph aggregation module toobtain the relational and semantic embeddings from event and wordgraphs at each historical timestamp, respectively; (2) context-awareembedding fusion module to enhance representations of entitiesand event types by blending contextual features from words; and(3) recurrent encoder module to learn temporal global embeddingsacross all historical times. For the multi-actor forecasting problem, we model node-level and edge-level representations instead ofglobal graph features. We propose a temporal inference module tomodel dynamic temporal features. An additional intrinsic inferencemodule is designed to model the inherent association of entitiesand event types. The pseudocode of multi-event forecasting andmulti-actor forecasting is presented in the supplemental material.4.1Multi-Event ForecastingAssuming that the occurrences of events at time t depend on thehistorical events that happened in the previous time window of mtime steps, we model the probabilities of events of multiple typesP yt at time t using the temporal graphs of events (Gt m:t 1 ) and

Input TimeT-1Model at time t EventPredictionGCNEmbeddingFusionEmbedding FusionPrediction at TT-1RecurrentEncodercourtMemberαt1 htT-ktGraphSampling CompGCNRecurrentEncoderT-1IntrinsicInference TemporalInferencelegalJudiciary arrest htMembers ofJudiciaryActorPredictionchargeactdetainαt2ot ott 1arrest, detain,or chargeT-12: An overviewof the proposedmodel. CompGCNaggregatesentitytype embeddinginFigure Figure2: An overviewof the proposedmodel. CompGCNaggregatesthe entitytheandeventandtypeeventembeddingin event graphsatevent graphsat oneachbasedtheonsemanticEq. ?. GCNlearns thesemanticembeddingin eachtemporalwordeach timestampbasedEq.timestamp4. GCN learnsembeddingin eachtemporalword graphusingEq. 6. EmbeddingusingEq. ?. andEmbeddingfusionappliedwordsto entitiesand eventtypeswith relatedwordsthe wordfusion graphis appliedto entitiesevent typeswithisrelatedin the wordgraphsaccordingto Eq. 7-8.The inrecurrentencodergraphs according to Eq. ?-?. The recurrent encoder models temporal information for final predictions. Inmodels temporal information for final predictions. In the multi-event forecasting task, we model the global (graph-level)the multi-event forecasting task, we model the global (graph-level) temporal information for predicting thetemporalinformationpredictingcandidate predictionevent types.For wethe modelmulti-actorprediction task,temporalwe modelfeaturethe individualcand

rangwala@cs.gmu.edu Yue Ning Stevens Institute of Technology Hoboken, New Jersey yue.ning@stevens.edu ABSTRACT Modeling concurrent events of multiple types and their involved ac-tors from open-source social sensors is an important task for many domains such

Related Documents:

The totality of these behaviors is the graph schema. Drawing a graph schema . The best way to represent a graph schema is, of course, a graph. This is how the graph schema looks for the classic Tinkerpop graph. Figure 2: Example graph schema shown as a property graph . The graph schema is pretty much a property-graph.

W e present a nov el model called the Dynamic Pose Graph (DPG) to address the problem of long-term mapping in dynamic en vironments. The DPG is an extension of the traditional pose graph model. A Dynamic Pose Graph is a connected graph, denoted DPG hN ;E i,with nodes n i 2 N and edges ei;j 2 E and is dened as follo ws: Dynamic P ose Graph .

Oracle Database Spatial and Graph In-memory parallel graph analytics server (PGX) Load graph into memory for analysis . Command-line submission of graph queries Graph visualization tool APIs to update graph store : Graph Store In-Memory Graph Graph Analytics : Oracle Database Application : Shell, Zeppelin : Viz .

a graph database framework. The proposed approach targets two kinds of graph models: - IFC Meta Graph (IMG) based on the IFC EXPRESS schema - IFC Objects Graph (IOG) based on STEP Physical File (SPF) format. Beside the automatic conversation of IFC models into graph database this paper aims to demonstrate the potential of using graph databases .

ture node by passing messages on the graph G. Different from existing message passing neural networks consider-ing a fully- or locally-connected static graph [34, 12], we propose a dynamic graph network model with two dynamic properties, i.e. dynamic sampling of graph nodes to approx-

1.14 About Oracle Graph Server and Client Accessibility 1-57. 2 . Quick Starts for Using Oracle Property Graph. 2.1 Using Sample Data for Graph Analysis 2-1 2.1.1 Importing Data from CSV Files 2-1 2.2 Quick Start: Interactively Analyze Graph Data 2-3 2.2.1 Quick Start: Create and Query a Graph in the Database, Load into Graph Server (PGX) for .

1.14 About Oracle Graph Server and Client Accessibility 1-46. 2 . Quick Starts for Using Oracle Property Graph. 2.1 Using Sample Data for Graph Analysis 2-1 2.1.1 Importing Data from CSV Files 2-1 2.2 Quick Start: Interactively Analyze Graph Data 2-3 2.2.1 Quick Start: Create and Query a Graph in the Database, Load into Graph Server (PGX) for .

2.1 Recent graph database systems Graph database systems are based on a graph data model representing data by graph structures and providing graph-based operators such as neighborhood traversal and pattern matching [22]. Table 1 provides an overview about re-cent graph database systems including supported data models, their application