Frequent Pattern Mining - Charu Aggarwal

1y ago

10 Views

2 Downloads

2.39 MB

85 Pages

Last View : 15d ago

Last Download : 3m ago

Upload by : Brady Himes

Report this link

Download PDF

Transcription

Frequent Pattern Mining

Charu C. Aggarwal Jiawei HanEditorsFrequent Pattern Mining2123

EditorsCharu C. AggarwalIBM T. J. Watson Research CenterYorktown HeightsNew YorkUSAJiawei HanUniversity of Illinois at Urbana-ChampaignUrbanaIllinoisUSAISBN 978-3-319-07820-5ISBN 978-3-319-07821-2 (eBook)DOI 10.1007/978-3-319-07821-2Springer Cham Heidelberg New York Dordrecht LondonLibrary of Congress Control Number: 2014944536 Springer International Publishing Switzerland 2014This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of thematerial is concerned, speciﬁcally the rights of translation, reprinting, reuse of illustrations, recitation,broadcasting, reproduction on microﬁlms or in any other physical way, and transmission or informationstorage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodologynow known or hereafter developed. Exempted from this legal reservation are brief excerpts in connectionwith reviews or scholarly analysis or material supplied speciﬁcally for the purpose of being entered andexecuted on a computer system, for exclusive use by the purchaser of the work. Duplication of thispublication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’slocation, in its current version, and permission for use must always be obtained from Springer. Permissionsfor use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable toprosecution under the respective Copyright Law.The use of general descriptive names, registered names, trademarks, service marks, etc. in this publicationdoes not imply, even in the absence of a speciﬁc statement, that such names are exempt from the relevantprotective laws and regulations and therefore free for general use.While the advice and information in this book are believed to be true and accurate at the date of publication,neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors oromissions that may be made. The publisher makes no warranty, express or implied, with respect to thematerial contained herein.Printed on acid-free paperSpringer is part of Springer Science Business Media (www.springer.com)

PrefaceThe ﬁeld of data mining has four main “super-problems” corresponding to clustering,classiﬁcation, outlier analysis, and frequent pattern mining. Compared to the otherthree problems, the frequent pattern mining model for formulated relatively recently.In spite of its shorter history, frequent pattern mining is considered the marqueeproblem of data mining. The reason for this is that interest in the data mining ﬁeldincreased rapidly soon after the seminal paper on association rule mining by Agrawal,Imielinski, and Swami. The earlier data mining conferences were often dominatedby a large number of frequent pattern mining papers. This is one of the reasons thatfrequent pattern mining has a very special place in the data mining community. Atthis point, the ﬁeld of frequent pattern mining is considered a mature one.While the ﬁeld has reached a relative level of maturity, very few books coverdifferent aspects of frequent pattern mining. Most of the existing books are eithertoo generic or do not cover frequent pattern mining in an exhaustive way. A needexists for an exhaustive book on the topic that can cover the different nuances in anexhaustive way.This book provides comprehensive surveys in the ﬁeld of frequent pattern mining.Each chapter is designed as a survey that covers the key aspects of the ﬁeld of frequentpattern mining. The chapters are typically of the following types: Algorithms: In these cases, the key algorithms for frequent pattern mining areexplored. These include join-based methods such as Apriori, and pattern-growthmethods. Variations: Many variations of frequent pattern mining such as interesting patterns, negative patterns, constrained pattern mining, or compressed patterns areexplored in these chapters. Scalability: The large sizes of data in recent years has led to the need for big dataand streaming frameworks for frequent pattern mining. Frequent pattern miningalgorithms need to be modiﬁed to work with these advanced scenarios. Data Types: Different data types lead to different challenges for frequent patternmining algorithms. Frequent pattern mining algorithms need to be able to workwith complex data types, such as temporal or graph data.v

viPreface Applications: In these chapters, different applications of frequent pattern miningare explored. These includes the application of frequent pattern mining methodsto problems such as clustering and classiﬁcation. Other more complex algorithmsare also explored.This book is, therefore, intended to provide an overview of the ﬁeld of frequentpattern mining, as it currently stands. It is hoped that the book will serve as a usefulguide for students, researchers, and practitioners.

Contents12An Introduction to Frequent Pattern Mining . . . . . . . . . . . . . . . . . . . . .Charu C. Aggarwal1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Frequent Pattern Mining Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Frequent Pattern Mining with the Traditional SupportFramework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.2 Interesting and Negative Frequent Patterns . . . . . . . . . . . . . . . .2.3 Constrained Frequent Pattern Mining . . . . . . . . . . . . . . . . . . . . .2.4 Compressed Representations of Frequent Patterns . . . . . . . . . .3 Scalability Issues in Frequent Pattern Mining . . . . . . . . . . . . . . . . . . .3.1 Frequent Pattern Mining in Data Streams . . . . . . . . . . . . . . . . .3.2 Frequent Pattern Mining with Big Data . . . . . . . . . . . . . . . . . . .4 Frequent Pattern Mining with Advanced Data Types . . . . . . . . . . . . .4.1 Sequential Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Spatiotemporal Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . . . .4.3 Frequent Patterns in Graphs and Structured Data . . . . . . . . . . .4.4 Frequent Pattern Mining with Uncertain Data . . . . . . . . . . . . . .5 Privacy Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6 Applications of Frequent Pattern Mining . . . . . . . . . . . . . . . . . . . . . . .6.1 Applications to Major Data Mining Problems . . . . . . . . . . . . . .6.2 Generic Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7 Conclusions and Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Frequent Pattern Mining Algorithms: A Survey . . . . . . . . . . . . . . . . . .Charu C. Aggarwal, Mansurul A. Bhuiyan and Mohammad Al Hasan1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1.1 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Join-Based Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Apriori Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.2 DHP Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.3 Special Tricks for 2-Itemset Counting . . . . . . . . . . . . . . . . . . . .113467788991010111112131313141419192223242728vii

viii34Contents2.4 Pruning by Support Lower Bounding . . . . . . . . . . . . . . . . . . . . .2.5 Hypercube Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Tree-Based Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.1 AIS Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.2 TreeProjection Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.3 Vertical Mining Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Recursive Sufﬁx-Based Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 The FP-Growth Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5 Maximal and Closed Frequent Itemsets . . . . . . . . . . . . . . . . . . . . . . . .5.1 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.2 Frequent Maximal Itemset Mining Algorithms . . . . . . . . . . . . .5.3 Frequent Closed Itemset Mining Algorithms . . . . . . . . . . . . . . .6 Other Optimizations and Variations . . . . . . . . . . . . . . . . . . . . . . . . . . .6.1 Row Enumeration Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6.2 Other Exploration Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . .7 Reducing the Number of Passes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7.1 Combining Passes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7.2 Sampling Tricks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7.3 Online Association Rule Mining . . . . . . . . . . . . . . . . . . . . . . . . .8 Conclusions and Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . rn-Growth Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Jiawei Han and Jian Pei1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 FP-Growth: Pattern Growth for Mining Frequent Itemsets . . . . . . . .3 Pushing More Constraints in Pattern-Growth Mining . . . . . . . . . . . .4 PreﬁxSpan: Mining Sequential Patterns by Pattern Growth . . . . . . .5 Further Development of Pattern Growth-Based Pattern MiningMethodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .65Mining Long Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Feida Zhu1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 A Pattern Lattice Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Pattern Enumeration Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 Breadth-First Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Depth-First Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5 Row Enumeration Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6 Pattern Merge Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6.1 Piece-wise Pattern Merge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6668727477787983838486878788899293

Contentsix6.2 Fusion-style Pattern Merge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 987 Pattern Traversal Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1018 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103567Interesting Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Jilles Vreeken and Nikolaj Tatti1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Absolute Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Frequent Itemsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.2 Tiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.3 Low Entropy Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Advanced Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Static Background Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 Independence Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Beyond Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.3 Maximum Entropy Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.4 Randomization Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5 Dynamic Background Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.1 The General Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.2 Maximum Entropy Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.3 Tile-based Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.4 Swap Randomization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6 Pattern Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6.1 Itemsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6.2 Tiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6.3 Swap Randomization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Negative Association Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Luiza Antonie, Jundong Li and Osmar Zaiane1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Negative Patterns and Negative Association Rules . . . . . . . . . . . . . . .3 Current Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Associative Classiﬁcation and Negative Association Rules . . . . . . . .5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Constraint-Based Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Siegfried Nijssen and Albrecht Zimmermann1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Problem Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Level-Wise Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149152

xContents3.1 Generic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Depth-First Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 Basic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Constraint-based Itemset Mining . . . . . . . . . . . . . . . . . . . . . . . .4.3 Generic Frameworks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.4 Implementation Considerations . . . . . . . . . . . . . . . . . . . . . . . . . .5 Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .153154154155158159159162162Mining and Using Sets of Patterns through Compression . . . . . . . . . .Matthijs van Leeuwen and Jilles Vreeken1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Foundations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Kolmogorov Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.2 MDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.3 MDL in Data Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Compression-based Pattern Models . . . . . . . . . . . . . . . . . . . . . . . . . . .3.1 Pattern Models for MDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.2 Code Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.3 Instances of Compression-based Models . . . . . . . . . . . . . . . . . .4 Algorithmic Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 Candidate Set Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Direct Mining of Patterns that Compress . . . . . . . . . . . . . . . . . .5 MDL for Data Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.1 Classiﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.2 A Dissimilarity Measure for Datasets . . . . . . . . . . . . . . . . . . . . .5.3 Identifying and Characterizing Components . . . . . . . . . . . . . . .5.4 Other Data Mining Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.5 The Advantage of Pattern-based Models . . . . . . . . . . . . . . . . . .6 Challenges Ahead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6.1 Toward Mining Structured Data . . . . . . . . . . . . . . . . . . . . . . . . .6.2 Generalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6.3 Task- and/or User-speciﬁc Usefulness . . . . . . . . . . . . . . . . . . . .7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .165489Frequent Pattern Mining in Data Streams . . . . . . . . . . . . . . . . . . . . . . .Victor E. Lee, Ruoming Jin and Gagan Agrawal1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Frequent Pattern Mining: Deﬁnition . . . . . . . . . . . . . . . . . . . . . .2.2 Data Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.3 Frequent Item Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Frequent Itemset Mining Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 91192193193194194195196199200201201202203204

Contents1011xi3.1 Mining the Full Data Stream . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.2 Recently Frequent Itemsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.3 Closed and Maximal Itemsets . . . . . . . . . . . . . . . . . . . . . . . . . . .3.4 Mining Data Streams with Uncertain Data . . . . . . . . . . . . . . . .4 Mining Patterns Other than Itemsets . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 Subsequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Subtrees and Semistructured Data . . . . . . . . . . . . . . . . . . . . . . . .4.3 Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .206209214216216217218219219220Big Data Frequent Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .David C. Anastasiu, Jeremy Iverson, Shaden Smith and GeorgeKarypis1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Frequent Pattern Mining: Overview . . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.2 Basic Mining Methodologies . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Paradigms for Big Data Computation . . . . . . . . . . . . . . . . . . . . . . . . .3.1 Principles of Parallel Algorithms . . . . . . . . . . . . . . . . . . . . . . . .3.2 Shared Memory Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.3 Distributed Memory Systems . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Frequent Itemset Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 Memory Scalability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Work Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.3 Dynamic Load Balancing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.4 Further Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5 Frequent Sequence Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.1 Serial Frequent Sequence Mining . . . . . . . . . . . . . . . . . . . . . . . .5.2 Parallel Frequent Sequence Mining . . . . . . . . . . . . . . . . . . . . . .6 Frequent Graph Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6.1 Serial Frequent Graph Mining . . . . . . . . . . . . . . . . . . . . . . . . . . .6.2 Parallel Frequent Graph Mining . . . . . . . . . . . . . . . . . . . . . . . . .7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .225Sequential Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Wei Shen, JianyongWang and Jiawei Han1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Problem Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Apriori-based Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.1 Horizontal Data Format Algorithms . . . . . . . . . . . . . . . . . . . . . .3.2 Vertical Data Format Algorithms . . . . . . . . . . . . . . . . . . . . . . . . .4 Pattern Growth Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 FreeSpan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 PreﬁxSpan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50250252255256261261263264264268271271272

xii1213Contents5Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.1 Closed Sequential Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . .5.2 Multi-level, Multi-dimensional Sequential Pattern Mining . . .5.3 Incremental Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.4 Hybrid Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.5 Approximate Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.6 Top-k Closed Sequential Pattern Mining . . . . . . . . . . . . . . . . . .5.7 Frequent Episode Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6 Conclusions and Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .274274276277278279279280281281Spatiotemporal Pattern Mining: Algorithms and Applications . . . . . .Zhenhui Li1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Basic Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Spatiotemporal Data Collection . . . . . . . . . . . . . . . . . . . . . . . . .2.2 Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.3 Background Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Individual Periodic Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.1 Automatic Discovery of Periodicity in Movements . . . . . . . . . .3.2 Frequent Periodic Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . .3.3 Using Periodic Pattern for Location Prediction . . . . . . . . . . . . .4 Pairwise Movement Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.1 Similarity Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Generic Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.3 Behavioral Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.4 Semantic Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5 Aggregate Patterns over Multiple Trajectories . . . . . . . . . . . . . . . . . .5.1 Frequent Trajectory Pattern Mining . . . . . . . . . . . . . . . . . . . . . .5.2 Detection of Moving Object Cluster . . . . . . . . . . . . . . . . . . . . . .5.3 Trajectory Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .283Mining Graph Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Hong Cheng, Xifeng Yan and Jiawei Han1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Frequent Subgraph Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.1 Problem Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.2 Apriori-Based Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.3 Pattern-Growth Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2.4 Closed and Maximal Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . .2.5 Mining Subgraphs in a Single Graph . . . . . . . . . . . . . . . . . . . . .2.6 The Computational Bottleneck . . . . . . . . . . . . . . . . . . . . . . . . . 00302304304307307308308309310311311313

Contents14xiii3Mining Signiﬁcant Graph Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.1 Problem Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3.2 gboost: A Branch-and-Bound Approach . . . . . . . . . . . . . . . . . . .3.3 gPLS: A Partial Least Squares Regression Approach . . . . . . . .3.4 LEAP: A Structural Leap Search Approach . . . . . . . . . . . . . . . .3.5 GraphSig: A Feature Representation Approach . . . . . . . . . . . . .4 Mining Representative Orthogonal Graphs . . . . . . . . . . . . . . . . . . . . .4.1 Problem Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4.2 Randomized Maximal Subgraph Mining . . . . . . . . . . . . . . . . . .4.3 Orthogonal Representative Set Generation . . . . . . . . . . . . . . . .5 Mining Dense Graph Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.1 Cliques and Quasi-Cliques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.2 K-Core and K-Truss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.3 Other Dense Subgraph Patterns . . . . . . . . . . . . . . . . . . . . . . . . . .6 Mining Graph Patterns in Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . .7 Mining Graph Patterns in Uncertain Graphs . . . . . . . . . . . . . . . . . . . .8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36336Uncertain Frequent Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Carson Kai-Sang Leung1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 The Probabilistic Model for Mining Expected Support-BasedFrequent Patterns from Uncertain Data . . . . . . . . . . . . . . . . . . . . . . . .3 Candidate Generate-and-Test Based Uncertain Frequent PatternMining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Hyperlinked Structure-Based Uncertain Frequent Pattern Mining . .5 Tree-Based Uncertain Frequent Pattern Mining . . . . . . . . . . . . . . . . .5.1 UF-growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5.2 UFP-growth . . . . . . . . . . . . . . . . . .

Imielinski, and Swami. The earlier data mining conferences were often dominated by a large number of frequent pattern mining papers. This is one of the reasons that frequent pattern mining has a very special place in the data mining community. At this point, the ﬁeld of frequent pattern mining is considered a mature one.

Related Documents:

Outlier Analysis Second Edition - Charu Aggarwal

Outlier Analysis Second Edition Charu C. Aggarwal IBM T. J. Watson Research Center Yorktown Heights, New York November 25, 2016 PDF Downloadable from http://rd .

7 Views

1y ago

D.A.V. PUBLIC SCHOOL PUSHPANJALI ENCLAVE, PITAMPURA ...

1 200001 yuvansh mangla dr manoj kumar 80 10002 . 58 200058 sankalp agarwal samir agarwal 30 10112 59 200059 amaya jitender kathuria 70 10099 60 200060 ridhvi aggarwal manu aggarwal 80 10172 . 97 200097 nityam aggarwal nand kishore aggarwal 50 10490 98 200098 vivika gupta rohit gupta 60 10501

74 Views

3y ago

FDSMO: Frequent DNA Sequence Mining Using FBSB and Optimization

Step 5: Mine frequent items whose backings are not less than the Min-sup and push the frequent items into the frequent item-set L1. Step 6:Produce the frequent1-itemset L1and bitmap Pos. Algorithm2. Frequent k length sequence mining and (k 1) length sequence generation Input: Bitmap Pos for k length sequences, minimal support min-sup

7 Views

1y ago

S.a Engineering College, Chennai 77 (An Autonomous Institution ...

CS2208A PRINCIPLES OF PROGRAMMING LANGUAGES L T P C 3 0 0 3 OBJECTIVES: . Charu C. Aggarwal, ―Social Network Data Analytics‖, Springer; 2014 4. Giles, Mark Smith, John Yen, ―Advances in Social Network Mining and Analysis‖, Springer, 2010. 5. Guandong Xu , Yanchun Zhang and Lin Li, ―Web Mining and Social Networking - Techniques

5 Views

9m ago

An Introduction to Graph Mining

Graph Mining and Graph Kernels An Introduction to Graph Mining Graph Pattern Explosion Problem ! If a graph is frequent, all of its subgraphs are frequent the Apriori property! An n-edge frequent graph may have 2n subgraphs!! In the AIDS antiviral screen dataset with 400 compounds, at the su

21 Views

2y ago

CS570 Introduction to Data Mining - Emory University

Frequent Itemset Mining Frequent itemset mining: frequent set of items in a transaction data set First proposed by Agrawal, Imielinski, and Swami in SIGMOD 1993 SIGMOD Test of Time Award 2003 “This paper started a field of research. In addition to containing an innovative algorithm, its subject matter bro

14 Views

2y ago

Data Mining in Bioinformatics - UQAM

DATA MINING What is data mining? [Fayyad 1996]: "Data mining is the application of specific algorithms for extracting patterns from data". [Han&Kamber 2006]: "data mining refers to extracting or mining knowledge from large amounts of data". [Zaki and Meira 2014]: "Data mining comprises the core algorithms that enable one to gain fundamental in

43 Views

2y ago

Introduction An Introduction to Modal Logic

An Introduction to Modal Logic 2009 Formosan Summer School on Logic, Language, and Computation 29 June-10 July, 2009 ; 9 9 B . : The Agenda Introduction Basic Modal Logic Normal Systems of Modal Logic Meta-theorems of Normal Systems Variants of Modal Logic Conclusion ; 9 9 B . ; Introduction Let me tell you the story ; 9 9 B . Introduction Historical overview .

116 Views

3y ago

Recent Views

Grammar as a Foreign Language - List of Proceedings

Grammar as a Foreign Language Oriol Vinyals Google vinyals@google.com Lukasz Kaiser Google lukaszkaiser@google.com Terry Koo Google terrykoo@google.com Slav Petrov Google slav@google.com Ilya Sutskever Google ilyasu@google.com Geoffrey Hinton Google geoffhinton@google.com Abstract Synta

2y ago

445 Views

Attention is All you Need - NIPS

Google Brain avaswani@google.com Noam Shazeer Google Brain noam@google.com Niki Parmar Google Research nikip@google.com Jakob Uszkoreit Google Research usz@google.com Llion Jones Google Research llion@google.com Aidan N. Gomezy University of Toronto aidan@cs.toronto.edu Łukasz Kaiser Google Brain lukaszkaiser@google.com Illia Polosukhinz illia .

1y ago

303 Views

GSA Implementation of Google (G) Suite

Google Meet Classic Hangouts Google Chat Google Calendar Google Drive and Shared Drive Google Docs Google Sheets Google Slides Google Forms Google Sites Google Keep Apps Script D

2y ago

316 Views

Google Drive (Google Docs, Google Sheets, Google Slides)

Google Drive (Google Docs, Google Sheets, Google Slides) Employees are automatically issued a Kyrene Google account. Navigate to drive.google.com. Use Kyrene email address and network password to login. Launch in Chrome browser for best experience. Google Drive is a cloud storage sys

2y ago

388 Views

Quick Guide of Using Google Home to Control Smart Devices

Configuration needs Google Home app. Search "Google Home" in App Store or Google Play to install the app. 3.1 Set up Google Home with Google Home app You can skip this part if your Google Home is already set up. 1. Make sure your Google Home is energized. 2. Open the Google Home app by tapping the app icon on your mobile device. 3.

1y ago

326 Views

Elaboração de Provas Online usando o Formulário Google Docs

2 Após o login acesse o Google Drive ou o Google Docs e selecione a ferramenta Google Forms (Formulários). Clique na caixa de Ferramentas do Google, localizada no canto direito superior da tela e selecione o Google Drive. Na tela do Google Drive clique em New , opção More e selecione Google Forms. OBS: É possível acessar o google

11m ago

123 Views

ACS WASC Templates

File upload, Folder upload, Google Docs, Google Sheets, or Google Slides. You can also create Google Forms, Google Drawings, Google My Maps, etc. Share with exactly who you want — without email attachments. Search or sort your list of files, folders, and Google Docs. Preview files and Google Docs.

2y ago

366 Views

Google Drive - San Bernardino City Unified School District

Google Apps All of the Google applications that are available upon logging into Google.com (G , Gmail, Gphotos, Gdrive, etc.). Google Suite Google’s online cloud based office companion applications (Docs, Sheets, Slides). Google Drive Google’s online cloud storage and file sharing/collaboration application.

2y ago

378 Views

Single Sign On for Google Apps with NetScaler Unified Gateway

Google Apps for Work is a suite of cloud computing productivity and collaboration applications provided by Google on a subscription basis. It includes Google’s popular web applications including Gmail, Google Drive, Google Hangouts, Google Calendar and Google

2y ago

295 Views

Serviceteil

Google 84, 87, 124 Google 110 Google AdWords 101, 103 Google Alerts 127 Google Analytics 89 Google Maps 100, 110, 173 Google-Maps 63 Google Places 100, 103, 124 Graphiken 66 H Haftung 170 Haftungsausschluss 72 Hausfarbe 11 Headline 35 Heilmittelwerbegesetz 14, 69, 163 Heilversprechen 164 HONcode 78 HTML 58 HWG 31 I Imagefilm 31

2y ago

336 Views

Best practices for managing identities when you move to Google Cloud

Google Cloud. To provide t he informat ion an organizat ion would ne e d to transfer data and ownership from one Google Account to anot her for s ome of t he noncore Google s er vice s, such as Google Ads, Google Analyt ics, or DV360. Intende d audience Organizat ion administrators. Sta planning Google Cloud / Google Wor kspace migrat ion. Key .

1y ago

481 Views

Introduction - Google Earth User Guide

Google Earth Community: Learn from other Google Earth users by asking questions and sharing answers on the Google Earth Community forums. Using Google Earth: This blog describes how you can use some of the interesting features of Google Earth. Selecting a Server Note: This section is relevant to Google Earth Pro and EC users.

3y ago

288 Views

Using Google Forms to Manage Officials Signups

Google Sheets, deleting a response from the form or sheet will not affect the other. Once the Google Form is linked to a Google Sheet, clicking on the spreadsheet icon will open the linked Google Sheet. Google Responses Sheet Google automatically creates and populates the sp

2y ago

276 Views

Google Cheat Sheets - Shake Up Learning

Google Slides Cheat Sheet p. 15-18 Google Sheets Cheat Sheet p. 19-22 Google Drawings Cheat Sheet p. 23-26 Google Drive for iOS Cheat Sheet p. 27-29 Google Chrome Cheat Sheet p. 30-32 ShakeUpLearning.com Google Cheat Sheets - By Kasey Bell 3

2y ago

296 Views

ChromeBox CXI (McQueen) UM (date) EN

Create a new Google Account. You can create a new Google Account if you don’t already have one. Click . Create a Google Account. on the right to set up a new account. A Google Account gives you access to useful web services developed by Google, such as Gmail, Google Docs, and Google Calendar. Browse as a guest

2y ago

177 Views

Frequent Pattern Mining - Charu Aggarwal

It looks like you're using an ad-blocker