SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity

Qingyuan Zhao, Stanford University, qyzhao@stanford.edu
Murat A. Erdogdu, Stanford University, erdogdu@stanford.edu
Hera Y. He, Stanford University, yhe1@stanford.edu
Anand Rajaraman, Stanford University, anand@cs.stanford.edu
Jure Leskovec, Stanford University, jure@cs.stanford .

ABSTRACT

Social networking websites allow users to create and share content. Big information cascades of post resharing can form as users of these sites reshare others' posts with their friends and followers. One of the central challenges in understanding such cascading behaviors is in forecasting information outbreaks, where a single post becomes widely popular by being reshared by many users.

In this paper, we focus on predicting the final number of reshares of a given post. We build on the theory of self-exciting point processes to develop a statistical model that allows us to make accurate predictions. Our model requires no training or expensive feature engineering. It results in a simple and efficiently computable formula that allows us to answer questions, in real-time, such as: Given a post's resharing history so far, what is our current estimate of its final number of reshares? Is the post resharing cascade past the initial stage of explosive growth? And, which posts will be the most reshared in the future?

We validate our model using one month of complete Twitter data and demonstrate a strong improvement in predictive accuracy over existing approaches. Our model gives only 15% relative error in predicting final size of an average information cascade after observing it for just one hour.

Categories and Subject Descriptors: H.2.8 [Database Management]: Database applications - Data mining
General Terms: Algorithms; Experimentation.
Keywords: information diffusion; cascade prediction; self-exciting point process; contagion; social media.

1. INTRODUCTION

Online social networking services, such as Facebook, Youtube, and Twitter, allow their users to post and share content in the form of posts, images, and videos [9, 17, 21, 30]. As a user is exposed to posts of others she follows, the user may in turn reshare a post with her own followers, who may further reshare it with their respective sets of followers. This way large information cascades of post resharing spread through the network.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
KDD'15, August 10-13, 2015, Sydney, NSW, Australia.
© 2015 ACM. ISBN 978-1-4503-3664-2/15/08 $15.00.
DOI: http://dx.doi.org/10.1145/2783258.2783401

A fundamental question in modeling information cascades is to predict their future evolution. Arguably the most direct way to formulate this question is to consider predicting the final size of an information cascade. That is, to predict how many reshares a given post will ultimately receive.

Predicting the ultimate popularity of a post is important for content ranking and aggregation. For instance, Twitter is overflowing with posts and users have a hard time keeping up with all of them. Thus, much of the content gets missed and eventually lost. Accurate prediction would allow Twitter to rank content better, discover trending posts faster, and improve its content-delivery networks. Moreover, predicting information cascades allows us to gain fundamental insights into predictability of collective behaviors where uncoordinated actions of many individuals lead to spontaneous outcomes, for example, large information outbreaks.

Most research on predicting information cascades involves extracting an exhaustive set of features describing the past evolution of a cascade and then using these features in a simple machine learning classifier to make a prediction about future growth [4, 6, 17, 20, 26, 30]. However, feature extraction can be expensive and cumbersome, and one is never sure if more effective features could be extracted. The question remains how to design a simple and principled bottom-up model of cascading behavior. The challenge lies in defining a model for an individual's behavior and then aggregating the effects of the individuals in order to make an accurate global prediction.

Present work. Here we focus on predicting the final size of an information cascade spreading through a network. We develop a statistical model based on the theory of self-exciting point processes. A point process indexed by time is called a counting process when it counts the number of instances (reshares, in our case) over time. In contrast to homogeneous Poisson processes which assume constant intensity over time, self-exciting processes assume that all the previous instances (i.e., reshares) influence the future evolution of the process. Self-exciting point processes are frequently used to model "rich get richer" phenomena [22, 23, 33, 36]. They are ideal for modeling information cascades in networks because every new reshare of a post not only increases its cumulative reshare count by one, but also exposes new followers who may further reshare the post.

We develop SEISMIC (Self-Exciting Model of Information Cascades) for predicting the total number of reshares of a given post. In our model, each post is fully characterized by its infectiousness which measures the reshare probability. We allow the infectiousness to vary freely over time in agreement with the observation that the infectiousness can drop as the content gets stale (see Figure 1).
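The idea that each reshare exposes new followers who may reshare in turn can be illustrated with a toy branching simulation. This is only a sketch under simplifying assumptions that are not the paper's model: every poster has the same hypothetical number of followers `n_followers`, the infectiousness `p` is held constant over time, and reaction times are ignored. The function names are made up for illustration. A standard branching-process fact is used for comparison: when the mean number of child reshares per post, `p * n_followers`, is below 1, the expected total number of reshares is finite and equals `mu / (1 - mu)`.

```python
import random


def simulate_cascade(p, n_followers, rng, max_size=100_000):
    """Return the total number of reshares in one simulated cascade.

    Each post (the original or a reshare) is shown to `n_followers` users,
    and each of them reshares independently with probability `p`.
    The simulation stops early once `max_size` reshares are reached.
    """
    size = 0        # reshares so far (the original post is not counted)
    pending = 1     # posts whose followers have not reacted yet
    while pending > 0 and size < max_size:
        pending -= 1
        children = sum(rng.random() < p for _ in range(n_followers))
        size += children
        pending += children
    return size


def expected_reshares(p, n_followers):
    """Expected total reshares when the mean offspring count is below 1."""
    mu = p * n_followers
    if mu >= 1.0:
        return float("inf")  # the cascade can keep growing without bound
    return mu / (1.0 - mu)


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [simulate_cascade(0.005, 100, rng) for _ in range(4000)]
    print("simulated mean:", sum(runs) / len(runs))
    print("closed form   :", expected_reshares(0.005, 100))
```

With `p * n_followers` below 1 the simulated mean settles near the closed form, while at or above 1 some runs only stop because of the `max_size` cap. That contrast is the intuition behind treating infectiousness as the quantity that decides whether a cascade is still exploding.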

[Figure 1 panels: histogram of retweet times (retweet count), top; infectiousness estimated by SEISMIC, middle; retweets against time since original tweet in hours, with series Truth, SEISMIC, LR, and Observed, bottom.]

Figure 1: First 6 hours of retweeting activity of a popular tweet [1] (top). The controversial tweet is about the fresh death of dictator Muammar Gaddafi and mentions singer Justin Bieber. Interestingly, the Twitter account of car manufacturer Chevrolet inappropriately retweeted the tweet about 30 minutes after the original tweet, which possibly led to the tweet's sustained popularity. Tweet infectiousness against time as estimated by SEISMIC (middle). Predictions of the tweet's final retweet count (denoted as "Truth") as a function of time (bottom). We compare SEISMIC with time series linear regression (LR); "Observed" plots the cumulative number of observed retweets by a given time. Notice that SEISMIC quickly finds an accurate estimate of the tweet's final retweet count.

Moreover, our model is able to identify at each time point whether the cascade is in the supercritical or subcritical state, based on whether its infectiousness is above or below a critical threshold. A cascade in the supercritical state is going through an "explosion" period and its final size cannot be predicted accurately at the current time. On the contrary, a cascade is tractable if it is in the subcritical state. In this case, we are able to predict its ultimate popularity accurately by modeling the future cascading behavior by a Galton-Watson tree.

Our SEISMIC approach makes several contributions:

• Generative model: SEISMIC imposes no parametric assumptions and requires no expensive feature engineering. Moreover, as complete social network structure may be hard to obtain, SEISMIC assumes minimal knowledge of the network: The only required input is the time history of reshares and the degrees of the resharing nodes.
• Scalable computation: Making a prediction using SEISMIC only requires computational time linear in the number of observed reshares. Since predictions for individual posts can be made independently, our algorithm can also be easily parallelized.
• Ease of interpretation: For an individual cascade, the model synthesizes all its past history into a single infectiousness parameter. This infectiousness parameter holds a clear meaning, and can serve as input to other applications.

We evaluate SEISMIC on one month of complete Twitter data, where users post tweets which others can then reshare by retweeting them. We demonstrate that SEISMIC is able to predict the final retweet count of a given tweet with 30% better accuracy than the state-of-the-art approaches (e.g., [12]). For reasonably popular tweets, our model achieves 15% relative error in predicting the final retweet count after observing the tweet for 1 hour, and 25% error after observing the tweet for just 10 minutes. Moreover, we also demonstrate how SEISMIC is able to identify tweets that will go "viral" and be among the most popular tweets in the future. By maintaining a dynamic list of 500 tweets over time, we are able to identify 78 of the 100 most reshared tweets and 281 of the 500 most reshared tweets in just 10 minutes after they are posted.

The rest of the paper is organized as follows: Section 2 surveys the related work. Section 3 describes SEISMIC, and Section 4 shows how the model can be used to predict the final size of an information cascade. We evaluate our method and compare its performance with a number of baselines as well as state-of-the-art approaches in Section 5.
Last, in Section 6, we conclude and discuss future research directions.

2. RELATED WORK

The study of information cascades is a rich and active field [27]. Recent models for predicting size of information cascades are generally characterized by two types of approaches, feature based methods and point process based methods.

Feature based methods first extract an exhaustive list of potentially relevant features, including content features, original poster features, network structural features, and temporal features [6]. Then different learning algorithms are applied, such as simple regression models [2, 6], probabilistic collaborative filtering [35], regression trees [3], content-based models [24], and passive-aggressive algorithms [26]. There are several issues with such approaches: laborious feature engineering and extensive training are crucial for their success, and the performance is highly sensitive to the quality of the features [4, 30]. Such approaches also have limited applicability because they cannot be used in real-time online settings: given the massive amount of posts being produced every second, it is practically impossible to extract all the necessary features for every post and then apply complicated prediction rules. In contrast, SEISMIC requires no feature engineering and results in an efficiently computable formula that allows it to predict the final popularity of millions of posts as they are spreading through the network.

The second type of approach is based on point processes, which directly models the formation of an information cascade in a network. Such models were mostly developed for the complementary problem of network inference, where one observes a number of information cascades and tries to infer the structure of the underlying network over which the cascades propagated [8, 10, 13, 14, 15, 18, 33, 36]. These methods have been successfully applied to study the spread of memes on the web [10, 14, 32, 33] as well as hashtags on Twitter [36]. In contrast, our goal is not to infer the network but to predict the ultimate size of a cascade in an observed network.

A major distinction between our model and existing methods based on Hawkes processes (e.g., [22, 23, 33, 34, 36]) is that we assume the process intensity λ_t depends on another stochastic process p_t, the post infectiousness. In other words, we allow the infectiousness to change over time. Moreover, some of these methods [34] rely on computationally expensive Bayesian inference, while our method has linear time complexity. Another recently proposed related work is [12], which also takes the point process approach and directly aims to predict tweet popularity. However, their method makes restrictive parametric assumptions and does not consider the network struct

Table 1: Table of symbols.

Symbol                 Description
$w$                    Post/information cascade
$p_t$                  Infectiousness of $w$ at time $t$ (Section 3.2)
$\phi(s)$              Memory kernel (Section 3.1)
$i$                    Node that contributed the $i$th reshare; $i = 0$ corresponds to the originator of the post
$t_i$                  Time of the $i$th reshare relative to the original post
$n_i$                  Out-degree of the $i$th node
$R_t$                  Cumulative popularity by time $t$: $|\{i > 0;\ t_i \le t\}|$
$R_\infty$             Final popularity (final number of reshares): $|\{i > 0\}|$
$N_t$                  Cumulative degree of resharers by time $t$: $\sum_{i:\, t_i \le t} n_i$
$N_t^e$                Effective cumulative degree of resharers by time $t$: $\sum_{i=0}^{R_t} n_i \int_{t_i}^{t} \phi(s - t_i)\, ds$
$\lambda_t$            Intensity of cumulative popularity $R_t$
$\hat{p}_t$            Model's estimate of infectiousness $p_t$ at time $t$
$\hat{R}_\infty(t)$    Model's estimate at time $t$ of final popularity $R_\infty$

... the network. We consider that the time s between the arrival of a post in a user's timeline and a reshare of the post by the user is distributed with density φ(s). The probability density φ(s) is also called a memory kernel because it measures a physical/social system's memory of stimuli [7].

The distribution of human response time φ(s) has been shown to be heavy-tailed in social networks [5]. Usually the tail of φ(s) is assumed to follow a power-law with exponent between 1 and 2 or a log-normal distribution [7, 34]. However, due to the rapid nature of information sharing on Twitter, it is also natural to expect many instant reaction times. In fact, our exploratory data analysis in Section 5.2 confirms that in Twitter, φ(s) is approximately constant for the first 5 minutes and then followed by a power-law decay. Different social networks may have different distributions of human reaction times. However, φ(s) only needs to be estimated once per network and thus we can safely assume it is given. We describe a detailed estimation procedure of φ(s) in Section 5.2.
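The memory kernel described above (roughly constant for the first 5 minutes, then power-law decay) and the cumulative quantities listed in Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the tail parameter `THETA` is a hypothetical placeholder rather than a value estimated in the paper, the constant `C` is simply chosen so that the density integrates to 1, and the function names are invented for this example.

```python
import bisect

S0 = 300.0     # seconds: the kernel is roughly constant for the first 5 minutes
THETA = 0.25   # hypothetical tail parameter (tail exponent 1 + THETA), not from the paper
C = 1.0 / (S0 * (1.0 + 1.0 / THETA))  # normalising constant so phi integrates to 1


def phi(s):
    """Memory kernel: constant up to S0, then power-law decay."""
    if s <= 0:
        return 0.0
    if s <= S0:
        return C
    return C * (s / S0) ** (-(1.0 + THETA))


def phi_cdf(s):
    """Integral of phi from 0 to s (closed form of the piecewise density)."""
    if s <= 0:
        return 0.0
    if s <= S0:
        return C * s
    return C * S0 + (C * S0 / THETA) * (1.0 - (s / S0) ** (-THETA))


def cascade_stats(times, degrees, t):
    """Return (R_t, N_t, N_t^e) as defined in Table 1.

    times[0] == 0 is the original post and times must be sorted ascending;
    degrees[i] is the out-degree n_i of the node behind event i.
    """
    k = bisect.bisect_right(times, t)   # events (including the original) with t_i <= t
    r_t = max(k - 1, 0)                 # reshares only, so the original is excluded
    n_t = sum(degrees[:k])
    # The integral of phi(s - t_i) over [t_i, t] equals phi_cdf(t - t_i).
    n_e = sum(n * phi_cdf(t - ti) for ti, n in zip(times[:k], degrees[:k]))
    return r_t, n_t, n_e


if __name__ == "__main__":
    times = [0.0, 60.0, 600.0]     # seconds since the original post
    degrees = [1000, 50, 200]      # follower counts of the posting nodes
    print(cascade_stats(times, degrees, 900.0))
```

Because `phi_cdf` has a closed form, each call to `cascade_stats` does work proportional to the number of observed reshares, which matches the linear-time claim made for the model. The effective degree `N_t^e` is always at most `N_t`, since it discounts followers who have not yet had time to react.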

