Twitter Data Analysis - Doc.ic.ac.uk


TWITTER DATA ANALYSIS TOUR
22 Jan 2015
Piyawat L Kumjorn & Panida Nimnual

Contents
- Processing Twitter Data
  - Visualization (Exploratory Data Analysis)
  - In-depth Analysis
- Tools for analysis
- Interesting sources

Processing Twitter Data
Visualization (Exploratory Data Analysis)
Find more at ers

Twitter Data Visualization
- Visualization is for storytelling, exploratory data analysis, and result illustration
- Extracts from Twitter data: User, Text, Geo, Time, and Amount (aggregate data), which answer questions such as when, how, and how much
- To visualize, combine these extracts together

User Time
An interactive timeline based on when your friends started using our-twitter-conversations

Time Amount
A graph of the Tweet activity on the evening of Sunday May 1, 681263084

Geo Amount
Twitter heat map of “f*ck you” and “Good morning”
/twitter-heatmap-good-morning-fckyou n 1811065.html

User Text
While Twitter brings many users together, we typically connect with like-minded souls
nyt.pdf

User Amount

Text Amount
Word Cloud and Word Tree

Geo Text
Real-time Tweet Maps
http://trendsmap.com

Text Time Amount
UEFA Champions League
https://uclfinal.twitter.com/

Geo Time Amount
Tweet dly-evolving-user-interests

Text Time Geo Amount
State of the Union 2014
http://twitter.github.io/interactive/sotu2014/

Processing Twitter Data
In-depth Analysis
Find more in:
- Social Media Mining: An Introduction, by Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu
- Twitter Data Analytics, by Shamanth Kumar, Fred Morstatter, and Huan Liu

Social Media Mining (1)
There are three groups of questions we want to answer.
Group 1: General Activities
- Who are the most important people in a social network?
- How do people befriend others?
- How can we find interesting patterns in user-generated content?

Social Media Mining (2)
Group 2: Communities and Interactions
- How can we identify communities in a social network?
- When someone posts an interesting article on a social network, how far can the article be transmitted in that network?
Group 3: Real-world problems
- How can we measure the influence of individuals in a social network?
- How can we recommend content or friends to individuals online?
- How can we analyze the behavior of individuals online?

Twitter Analysis
- Text Measures: Trending Topics, Sentiment Analysis
- Network Measures: User Influence, User Behavior

Trending Topics
Count occurrences of Specific html
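A minimal sketch of trend detection by counting occurrences. The slide text is truncated, so counting hashtags is an assumption made here for illustration, and the sample tweets are made up.

```python
import re
from collections import Counter

def count_hashtags(tweets):
    """Count how often each hashtag appears across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        # Lowercase so that #OccupyWallStreet and #occupywallstreet match.
        counts.update(tag.lower() for tag in re.findall(r"#\w+", text))
    return counts

# Made-up sample tweets, for illustration only.
sample_tweets = [
    "No more media blackout hiding #OCCUPYWALLSTREET! :)",
    "Marching downtown today #occupywallstreet #nyc",
    "Coffee first #nyc",
]
hashtag_counts = count_hashtags(sample_tweets)
print(hashtag_counts.most_common(2))
```

Ranking the counts within a time window, rather than over the whole dataset, is what would turn this into a trend rather than an all-time popularity list.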

Latent Dirichlet Allocation (LDA)
- Every topic in LDA is a collection of words.
- Each topic contains all of the words in the corpus, each with a probability of the word belonging to that topic.
- For example:
  Sports: 40% “basketball”, 35% “football”, 15% “baseball”, ..., 0.02% “congress”, and 0.01% “Obama”
  Politics: 35% “congress”, 30% “Obama”, ..., 1% “football”, 0.1% “baseball”, 0.1% “basketball”
- LDA finds the most probable words for a topic; associating each topic with a theme is left to the user.
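The sketch below does not run LDA itself. It only illustrates the output format described here: each topic is a word-probability table, and a reader inspects the most probable words to name the theme. The probabilities are the ones quoted in the example; the helper name `top_words` is an assumption for illustration.

```python
# Word probabilities per topic, copied from the example; the elided
# words (the "..." in the example) are simply left out.
topics = {
    "Sports": {"basketball": 0.40, "football": 0.35, "baseball": 0.15,
               "congress": 0.0002, "Obama": 0.0001},
    "Politics": {"congress": 0.35, "Obama": 0.30, "football": 0.01,
                 "baseball": 0.001, "basketball": 0.001},
}

def top_words(word_probs, n=2):
    """Return the n most probable words of one topic."""
    ranked = sorted(word_probs.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:n]]

for name, word_probs in topics.items():
    print(name, top_words(word_probs))
```

In a real LDA run the topics come back unlabeled (topic 0, topic 1, and so on); the names "Sports" and "Politics" are what a person would assign after reading the top words.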

Preprocessing before LDA
To use the MALLET library in Java for LDA, we have to preprocess the data with these five steps:
Step                   Product
0. Raw Data            No more media blackout hiding #OCCUPYWALLSTREET! :)
1. Lowercase           no more media blackout hiding #occupywallstreet! :)
2. Tokenize            [no, more, media, blackout, hiding, #occupywallstreet]
3. Stopword Removal    [no, media, blackout, hiding, #occupywallstreet]
4. Stemming            [no, media, blackout, hide, #occupywallstreet]
5. Vectorization       a vector that contains a sequence of numbers for each word in the vocabulary
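The five steps can be sketched in Python as follows. This is not MALLET's own pipeline: the stopword set and the stem table are tiny hypothetical stand-ins for a full stopword list and a real stemmer such as Porter's, and the count vector is one assumed reading of the vectorization step.

```python
import re
from collections import Counter

# Hypothetical stopword list and stem table, just large enough for the example.
STOPWORDS = {"more", "the", "a", "an", "of", "and"}
STEMS = {"hiding": "hide"}

def preprocess(text):
    """Lowercase, tokenize, remove stopwords, and stem one tweet."""
    lowered = text.lower()                              # step 1
    tokens = re.findall(r"#?\w+", lowered)              # step 2, keeps hashtags
    kept = [t for t in tokens if t not in STOPWORDS]    # step 3
    return [STEMS.get(t, t) for t in kept]              # step 4

def vectorize(tokens, vocabulary):
    """Step 5: one count per vocabulary word, in vocabulary order."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

tokens = preprocess("No more media blackout hiding #OCCUPYWALLSTREET! :)")
vocabulary = sorted(set(tokens))
print(tokens)
print(vectorize(tokens, vocabulary))
```

In practice the vocabulary is built once over the whole corpus, so every tweet is mapped to a vector of the same length.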

Typology of Trending Topics [1]
- News
- Ongoing events: real-time information sharing. E.g. a soccer game, a keynote presentation by Apple
- Memes: triggered by viral ideas initiated by either an individual or an organization. E.g. Ice Bucket Challenge
- Commemoratives: the commemoration of a certain person or event that is being remembered on a given day. E.g. New Year, Father's Day, Princess Diana
[1] Arkaitz Zubiaga et al., Real-Time Classification of Twitter Trends, Journal of the American Society for Information Science and Technology, 2013

Sentiment Analysis
- “Sentiment analysis” seeks to automatically associate a piece of text with a “sentiment score”, a positive or negative emotional score.
- It uses natural language processing, text analysis, and computational linguistics to identify and extract subjective information in source materials.

Sentiment Analysis Approaches
Existing approaches to sentiment analysis can be grouped into four main categories [1]:
- Keyword spotting: based on the presence of unambiguous affect words such as happy, sad, afraid, and bored
- Lexical affinity: not only detects obvious affect words, but also assigns arbitrary words a probable “affinity” to particular emotions
- Statistical methods: leverage elements from machine learning such as latent semantic analysis, support vector machines, "bag of words", and Semantic Orientation
- Concept-level techniques: leverage elements from knowledge representation such as ontologies and semantic networks
[1] http://en.wikipedia.org/wiki/Sentiment_analysis

Dictionary-based Approach
Sentiment analysis framework using a dictionary-based approach:
- The dictionary lists words together with their sentiment scores
- Apply the Porter stemmer to dictionary terms and tweets
- Each word's value lies in [1, 9]; subtract 5 to center it
- Words not contained in the dictionary are neutral
- Total Score = sum of the scores of each word in each metric
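A minimal sketch of this scoring scheme for a single metric. The dictionary entries and their values are hypothetical, and the `stem` parameter defaults to the identity function as a placeholder for a Porter stemmer.

```python
# Hypothetical dictionary: word -> score on a 1-to-9 scale.
VALENCE = {"happy": 8.0, "love": 9.0, "sad": 2.0, "war": 1.0}

def sentiment_score(tokens, dictionary=VALENCE, stem=lambda word: word):
    """Sum the centered scores of all tokens; unknown words count as 0."""
    # Stem the dictionary terms so they match the stemmed tokens.
    stemmed = {stem(word): value for word, value in dictionary.items()}
    total = 0.0
    for token in tokens:
        value = stemmed.get(stem(token))
        if value is not None:
            total += value - 5   # shift the [1, 9] scale to [-4, 4]
    return total

print(sentiment_score(["i", "love", "happy", "people"]))
print(sentiment_score(["war", "is", "sad"]))
```

Centering at 5 is what makes out-of-dictionary words neutral: they contribute 0, the same as a word scored exactly in the middle of the scale.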

Naïve Bayes Approach (1)
Sentiment analysis framework using Naïve Bayes classification:
- Enumerate each Tweet in the dataset
- Build a lexicon from the Tweets that use an emoticon
- Calculate a sentiment score for each Tweet that does not have an emoticon
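One way these three steps could look in code, as a sketch under stated assumptions: only ":)" and ":(" are treated as labels, the class priors are taken as equal, add-one smoothing is used, and the training tweets are made up. None of these choices come from the slides.

```python
import math
import re
from collections import Counter

POSITIVE, NEGATIVE = ":)", ":("

def tokenize(text):
    return re.findall(r"#?\w+", text.lower())

def build_lexicon(tweets):
    """Count words per class, using the emoticon in each tweet as its label."""
    counts = {POSITIVE: Counter(), NEGATIVE: Counter()}
    for text in tweets:
        for label in counts:
            if label in text:
                counts[label].update(tokenize(text))
    return counts

def score(text, counts):
    """Log-odds of positive vs. negative; above 0 leans positive."""
    vocab = set(counts[POSITIVE]) | set(counts[NEGATIVE])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    log_odds = 0.0
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen with an emoticon carry no evidence
        p_pos = (counts[POSITIVE][word] + 1) / (totals[POSITIVE] + len(vocab))
        p_neg = (counts[NEGATIVE][word] + 1) / (totals[NEGATIVE] + len(vocab))
        log_odds += math.log(p_pos / p_neg)
    return log_odds

labeled = [
    "great game tonight :)",
    "love this great team :)",
    "terrible traffic again :(",
    "this game was terrible :(",
]
lexicon = build_lexicon(labeled)
print(score("what a great team", lexicon))
print(score("terrible day", lexicon))
```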

Naïve Bayes Approach (2)

Sentiment Scale Visualization
A graph showing the sentiment tendency of tweets containing a word
eet viz/tweet app/

Sentiment twitter.html? r 0

Tie Strength in Twitter
012/05/24/tie-strength-in-twitter/

Networks from Twitter Data
- Interest Graph (Twitter Social Graph): friend – follower
- Conversation Graph: mention (reply)
- Retweet Graph: retweet
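As a rough illustration, the conversation and retweet graphs can be derived from tweet records as edge lists, while the interest graph comes directly from follow relations. The record fields (`user`, `text`, `retweet_of`) and the sample data are hypothetical simplifications, not the Twitter API's actual payload.

```python
import re

# Hypothetical, simplified tweet records; real API payloads are richer.
tweets = [
    {"user": "alice", "text": "@bob did you see this?", "retweet_of": None},
    {"user": "bob", "text": "RT @carol: new paper out", "retweet_of": "carol"},
    {"user": "carol", "text": "new paper out", "retweet_of": None},
]
# Interest graph edges, given directly as (follower, followed) pairs.
follows = [("alice", "bob"), ("bob", "carol")]

def conversation_edges(tweets):
    """(author, mentioned user) pairs taken from non-retweet tweets."""
    edges = []
    for t in tweets:
        if t["retweet_of"] is None:
            edges += [(t["user"], m.lower()) for m in re.findall(r"@(\w+)", t["text"])]
    return edges

def retweet_edges(tweets):
    """(retweeter, original author) pairs."""
    return [(t["user"], t["retweet_of"]) for t in tweets if t["retweet_of"]]

print(conversation_edges(tweets))
print(retweet_edges(tweets))
```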

Twitter Social Graph
Try to find independent communities within a graph; assign a modularity score based on connections from individual nodes to “hub” nodes (Gephi)

Conversation Graph
From 3000 tweets for 4 rappers (Drake, Kendrick Lamar, J Cole, and Big Sean)
Created by Achal Soni (Gephi)

Retweet Graph
One can only identify the original source of the information and not the intermediate users along the information propagation path.
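To make the modularity score concrete, here is a sketch of the standard Newman modularity of a given partition, Q = sum over communities c of (L_c / m - (d_c / 2m)^2), where m is the number of edges, L_c the number of edges inside c, and d_c the total degree of c. This is the textbook definition for an undirected, unweighted graph; it is assumed rather than confirmed to be the score the slide refers to, and the example graph is made up.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    """
    m = len(edges)
    internal = {}   # number of edges with both ends in the community
    degree = {}     # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in "abcdef"}
print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

Community detection tools search for the partition that makes this score as high as possible; the function above only evaluates a partition that is already given.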

Network Measures
- Centrality: how important a node is within a network (User Influence)
- Transitivity and Reciprocity: how links (edges) are formed in a social graph (Link Prediction)
- Similarity (Structural, Regular): compute similarity between two nodes in a network (Community Analysis, Behavior Prediction)

Degree Centrality
Count the number of links attached to the node.
The key question was “how many people retweeted this node?”
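For a retweet graph this amounts to an in-degree count. A small sketch, assuming edges are given as hypothetical (retweeter, original author) pairs and that repeated retweets by the same person count once:

```python
from collections import Counter

def retweet_in_degree(retweet_edges):
    """Number of distinct users who retweeted each author.

    retweet_edges: iterable of (retweeter, original_author) pairs.
    """
    # set() drops repeated retweets of the same author by the same user.
    return Counter(author for _, author in set(retweet_edges))

edges = [("bob", "alice"), ("carol", "alice"), ("bob", "alice"), ("alice", "carol")]
in_degree = retweet_in_degree(edges)
print(in_degree.most_common())
```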

Eigenvector Centrality
Eigenvector Centrality builds upon degree centrality to ask “how important are these retweeters?”
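A sketch of eigenvector centrality by power iteration. For simplicity it treats links as undirected, which is an assumption; a retweet graph is naturally directed, and directed graphs need extra care (Katz centrality and PageRank exist partly for that reason). The example graph is made up.

```python
import math

def eigenvector_centrality(edges, iterations=100, tol=1e-9):
    """Power iteration on an undirected graph given as (u, v) pairs."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    score = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        # Adding the node's own score (iterating A + I) keeps the same
        # eigenvectors but avoids oscillation on bipartite graphs.
        new = {n: score[n] + sum(score[nb] for nb in neighbors[n]) for n in neighbors}
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {n: value / norm for n, value in new.items()}
        if max(abs(new[n] - score[n]) for n in new) < tol:
            return new
        score = new
    return score

# "a" and "d" both have exactly one link, but "a" is linked to the
# better-connected "hub", so it should score higher than "d".
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
scores = eigenvector_centrality(edges)
print(sorted(scores, key=scores.get, reverse=True))
```

Degree centrality cannot tell "a" and "d" apart, since both have one link; eigenvector centrality can, because it weights each link by the importance of the node at the other end.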

Centrality Measures
- Degree Centrality
- Eigenvector Centrality
- Katz Centrality
- PageRank
- Betweenness Centrality
- Closeness Centrality
- Group Centrality

Collaborative ive filtering

Memory-based Approach
- User-based Filtering
- Item-based Filtering
E.g. Movie Ratings
[Table: ratings of the movies The Piano, Pulp Fiction, Clueless, Cliffhanger, and Fargo by the users Amy, Jeff, Mike, Chris, and Ken]
tering/

Memory-based Approach
A prediction is normally based on the weighted average of the recommendations of several people.
Find Similarity, then Weighted Prediction
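A minimal sketch of user-based filtering along these lines: compute a similarity between users, then predict a rating as the similarity-weighted average of other users' ratings. Cosine similarity is one common choice and is an assumption here, as are the user names, item names, and rating values.

```python
import math

# Hypothetical ratings (user -> {item: rating}); not the values from the slides.
ratings = {
    "alice": {"m1": 5, "m2": 1, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 5, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m4": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted_sum = weight_total = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        weight = cosine_similarity(ratings[user], their)
        weighted_sum += weight * their[item]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None

prediction = predict("alice", "m4", ratings)
print(round(prediction, 2))
```

Because alice's ratings resemble bob's far more than carol's, the prediction for "m4" lands closer to bob's rating of 4 than to carol's rating of 2.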

Tools for analysis
Find more at: http://en.wikipedia.org/wiki/Social_network_analysis_software

Mining Twitter with R (1)
Package “twitteR” (R based Twitter client) provides an interface to the Twitter web API
Function               Short Description
decode_short_url       A function to decode shortened URLs
favorites              A function to get favorite tweets
friendships            A function to detail relations between yourself & other users
getCurRateLimitInfo    A function to retrieve current rate limit information
getTrends              Functions to view Twitter trends
registerTwitterOAuth   Register OAuth credentials to twitter R session
twListToDF             A function to convert twitteR lists to
twitteR/twitteR.pdf

Mining Twitter with R (2)
Examples of other useful packages for text mining using R
http://onepager.togaware.com/TextMiningO.pdf

NodeXL (1)
- Network Overview Discovery Exploration for Excel
- A free and open-source network analysis and visualization software package for Microsoft Excel 2007/2010
- Intended for users with little or no programming experience to allow them to collect, analyze, and visualize a variety of networks

NodeXL (2)

Gephi (https://gephi.github.io/)
An open-source network analysis and visualization software package written in Java on the NetBeans platform
See video: http://vimeo.com/9726202

Graphviz (www.graphviz.org)
- An open source graph visualization software
- Turns graph descriptions written in a simple text language into diagrams
- Output formats include images and SVG for web pages, PDF or PostScript for inclusion in other documents, or display in an interactive graph browser
- Useful features for concrete diagrams, such as options for colors, fonts, tabular node layouts, line styles, hyperlinks, and custom shapes

Graphviz (www.graphviz.org)

Interesting sources

Moments.twitter.com/uki/

Analytics.twitter.com

ytics

Blog.twitter.com

Blog.twitter.com

Interactive.twitter.com
b.io/interactive/

Interactive.twitter.com

Analyzing Big Data With Twitter
- Special course in Fall 2012 from UC Berkeley School of Information by Marti Hearst
- Cooperating with Twitter Inc.
- Taught Topics:
  - Twitter Philosophy; Twitter Software Ecosystem
  - Using Hadoop and Pig at Twitter
  - The Twitter API
  - Trend Detection in Twitter’s Streams
  - Real-time Twitter Search
  - Correlating Twitter Data with Other Data
  - Graph Algorithms for the Twitter Social

Analyzing Big Data With Twitter
- Taught Topics (Cont.):
  - GraphLab: Big Learning with Graphs
  - Large-scale Anomaly Detection at Twitter
  - Recommendation Algorithms at Twitter
  - Security at Twitter
  - Information Diffusion and Outbreak Detection at Twitter
  - Etc.
- Find more on the course -s12/
- YouTube playlist of the lectures: https://www.youtube.com/playlist?list=PLE8C1256A28C1487F

Bibliography of Research on Twitter & Microblogging
http://www.danah.org/researchBibs/twitter.php

Thank You
Hope you enjoy this Twitter data analysis tour
