Visual Data Mining - Stony Brook University


Visual Data Mining
Chidroop Madhavarapu
CSE 591: Visual Analytics

Motivation

Why Visual Data Mining

Visualization for data mining faces two constraints: huge amounts of information and the limited display capacity of output devices. Visual Data Mining (VDM) is a new approach for exploring very large data sets, combining traditional mining methods with information visualization techniques.

Integration of visualization and data mining

Visual Data Mining approaches fall into three categories:
1. Data mining process visualization.
2. Data mining result visualization.
3. Interactive visual data mining.

Data Mining process visualization

Visualization techniques are used to support data mining, for example when large amounts of multidimensional data must be handled in the form of data tables or relational databases (parallel coordinates, scatter plots, etc.).

Data Mining result visualization

Visualization is used to convey the results of mining tasks, such as clustering or classification, and so to enhance user interpretation. Examples include scatter plots, box plots, the BLOB and H-BLOB clustering algorithms, decision trees, and association rules.

Interactive Visual Data Mining

Rather than using visual data exploration and analytical mining algorithms as separate tools, a stronger DM strategy is to couple the visualizations and analytical processes tightly into one DM tool. Visualization tools are used within the data mining process to help users make smart data mining decisions. Examples include the CONTROL project, OptiGrid, and PBC (Perception Based Classification).

V-Miner: Using Enhanced Parallel Coordinates to Mine Product Design and Test Data
Kaidi Zhao, Bing Liu, Thomas Tirpak, Andreas Schaller
University of Illinois, Chicago; Motorola Labs

INTRODUCTION

V-Miner is a multivariable visualization tool designed for mining product design and test data. It is a new technique based on parallel coordinate visualization. The goal is to discover useful knowledge from mobile phone testing data that can be fed back to the design engineers.

Design process for consumer electronics:
1. Engineers design specific sections of the phone based on previous successful designs, new product specs, design simulations, etc.
2. Prototypes are built.
3. Functional tests are performed on the prototypes.
4. If the requirements are not met, the next design cycle starts (from step 1).

These steps are repeated until the design meets the specification. The phone is then released to the NPI team for volume manufacturing.

THE DATA

For a new product, the iterations of design revisions have to be coordinated. Hundreds of variables are involved, and they are changed and tested across the different revisions. V-Miner is used to reduce engineering costs and design defects by mining useful knowledge from the test data.

After each design change, all test variables are measured. Each variable takes numerical values and has the following properties:
1. It has an upper limit and a lower limit. A value outside this range is unacceptable.
2. It has an ideal value, called the target value.

SAMPLE TEST DATA

Each change is a new design. The data is a sequential set: subsequent changes are based on earlier changes.

With the testing data, designers are interested in:
1. Significant changes in variables following a design change.
2. The cause of these changes.
3. Stable variables whose values are not affected by design changes.

Traditional mining algorithms are not adequate here, because:
1. Owing to the large number of variables, association rule mining generates too many rules.
2. Decision trees do not find all interesting patterns, only a subset of them.

To solve the problem, we can use parallel coordinates, which give an intuitive view of the underlying data.

Parallel Coordinates Overview

Problems with the traditional parallel coordinates technique:
1. It does not consider the sequence in which the data was generated. Solution: add a sequence component to the traditional parallel coordinate visualization by adding trend figures.
2. It does not consider the ordering of the variables. Solution: a querying and sorting tool lets users issue queries and rearranges the axes accordingly.

The result is an Enhanced Parallel Coordinates system.

TREND FIGURES

The order of the data records is highly significant, as it may reveal sequence-dependent relations. The existing system is extended by adding an additional graph for each variable above its coordinate. Comparing the trend figures makes it possible to see quickly which variables change in similar ways.

Need for Data Mining

The goal of the application is to enable engineers at Motorola to identify the following:
1. Variables that show prominent changes in their values after some design changes.
2. Stable variables that are not affected by the design changes.
3. Failure patterns of variables that failed after certain design changes.
4. Variables that have similar value change patterns.

QUERYING AND SORTING

The tool allows the user to query shapes based on approximate pattern matching. There are two main types of pattern: the value change pattern and the failure pattern.

The value change pattern indicates how a variable's value changes over successive design changes. Each change is encoded as up: 3, down: 1, stable: 2. Example: 3312.

The failure pattern indicates whether the value falls within the upper and lower limits after each design modification. Each value is encoded as F: failure, O: ok. Example: OOOFF.

String comparison is more convenient and intuitive for human users. The variables in the parallel coordinate visualization are ordered according to the comparison results.
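The two pattern encodings described above (3/1/2 for up/down/stable, and F/O for failure/ok) can be illustrated with a small Python sketch. It is not V-Miner's implementation: the function names, the `tol` tolerance used to decide "stable", and the position-wise mismatch count used to rank variables against a query are all assumptions made for illustration, since the slides only say that matching is approximate.

```python
def value_change_pattern(values, tol=0.0):
    """Encode each consecutive change as '3' (up), '1' (down) or '2' (stable).

    `tol` is a hypothetical tolerance: changes no larger than it count as stable.
    """
    codes = []
    for prev, cur in zip(values, values[1:]):
        diff = cur - prev
        if diff > tol:
            codes.append("3")
        elif diff < -tol:
            codes.append("1")
        else:
            codes.append("2")
    return "".join(codes)


def failure_pattern(values, lower, upper):
    """Encode each measurement as 'O' (within limits) or 'F' (out of limits)."""
    return "".join("O" if lower <= v <= upper else "F" for v in values)


def mismatches(pattern, query):
    """Count differing positions; a length difference counts as extra mismatches.

    This simple distance is an assumed stand-in for approximate matching.
    """
    differing = sum(a != b for a, b in zip(pattern, query))
    return differing + abs(len(pattern) - len(query))


def rank_variables(patterns, query):
    """Order variable names so the best matches to the query come first."""
    return sorted(patterns, key=lambda name: mismatches(patterns[name], query))


if __name__ == "__main__":
    print(value_change_pattern([1.0, 2.0, 3.0, 2.0, 2.0]))
    print(failure_pattern([5, 6, 7, 12, 13], lower=0, upper=10))
    print(rank_variables({"a": "3312", "b": "2222", "c": "2232"}, "2222"))
```

Ranking by a string distance mirrors the slide's point that variables are reordered according to the comparison results; a real tool could substitute any other similarity measure.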

DATA NORMALIZATION

Variables whose values are out of range are normalized to values either larger than 1 or less than -1. Normalized values close to 0 are the ones close to the target value.

Procedure normalization(value, min, max, target)
// return value is stored in: normalized_value
    if ((value >= min) && (value <= max)) then
        normalized_value := (value - target) / (max - min);
    else if (value > max) then
        normalized_value := (value - target) / (max - min) + 1;
    else // value < min
        normalized_value := (value - target) / (max - min) - 1;
    end-if
    end-if

KEY FEATURES

Data from different designs are visualized using different colors. For each variable, a trend figure is drawn at the top of the screen. The user can identify significant characteristics from the visualization:
1. Which variables are out of range or within range (e.g. variables 19 and 20).
2. Which variables behave similarly, judging from the trend figures (e.g. variables 33 and 34).
3. Which variables have stable values over all design changes (e.g. variable 15).

In classical parallel coordinate visualization, overlapping lines significantly hinder reading the plot. Trend figures mitigate the problem.

STABLE VARIABLES

A user interested in identifying stable variables can issue a '222' query on the value change pattern. The variables are then ordered with the stable variables appearing first.
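Returning to the normalization procedure in the DATA NORMALIZATION slide, here is a minimal Python sketch of it. It assumes the comparison operators and the +1 / -1 offsets implied by the stated behaviour (in-range values scale around 0, out-of-range values land above 1 or below -1). The name `normalize` and the parameters `lower` and `upper` are illustrative choices, not names from V-Miner.

```python
def normalize(value, lower, upper, target):
    """Scale a measurement relative to its target and limit range.

    In-range values map close to 0 when they are close to the target.
    Out-of-range values are pushed beyond +1 or -1 (assumed offsets).
    """
    scaled = (value - target) / (upper - lower)
    if lower <= value <= upper:
        return scaled
    if value > upper:
        return scaled + 1  # above the upper limit: result is larger than 1
    return scaled - 1      # below the lower limit: result is less than -1


if __name__ == "__main__":
    for v in (5, 9, 12, -2):
        print(v, normalize(v, lower=0, upper=10, target=5))
```

Because the target lies between the limits, the scaled term of an over-limit value is already non-negative, so adding 1 is enough to place it above every in-range value; the same reasoning applies symmetrically below the lower limit.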

[Figures: result of an 'FO' pattern query; test variables that failed after the first design change; consistent failures; variables affected by many components.]

V-Miner can be used with existing data mining tools. Previously mined rules can be used to filter the data in the visualization. V-Miner does not act simply as a tool that filters data; instead, it gives the user the opportunity to interact with the data visually. When engineers have a set of rules from a data mining tool, the rules are loaded into V-Miner and the user can select among them.

[Figures: initial visualization of a data set; visualization of the data after filtering using rules.]

CONCLUSIONS

This visualization system significantly speeds up the data mining process. V-Miner is able to find knowledge that other tools cannot, such as correlations between variables and failure patterns in sets of multiple variables. Engineers can use V-Miner and their favorite mining tools together, and recursively, to mine for finer details.

Interactive Data Analysis: The CONTROL Project
Joseph M. Hellerstein, Ron Avnur, Andy Chou, Chris Olston, Vijayshankar Raman
University of California, Berkeley

Data Analysis

The objective of data analysis is to obtain unknown information. It is an iterative and complex process involving multiple, time-consuming steps. The slides contrast:
1. Batch processing (current systems) vs. online processing (CONTROL: Continuous Output and Navigation Technology with Refinement Online).
2. Black box vs. crystal ball.
3. Quality and accuracy vs. interactive response times.

Online Aggregation

Relational databases vs. online query processing: a relational DB handles an aggregate query as partition, calculate, return. Example aggregate query:

    SELECT college, AVG(grade) FROM enroll GROUP BY college;

Online Data Visualization: CLOUDS

Goal: make visualization more interactive by quickly displaying an accurate approximation of the final image.

[Figure: partially completed visualization of US cities without CLOUDS.]
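The online aggregation idea behind the AVG query above can be sketched in Python as a generator that reads rows in random order and reports running per-group averages as it goes. This is only an illustration of the batch vs. online contrast, not the CONTROL implementation: the function name `online_avg`, the `report_every` interval, and the fixed random seed are assumptions, and a real system would also report a confidence interval with each estimate.

```python
import random
from collections import defaultdict


def online_avg(rows, report_every=2, seed=0):
    """Yield (rows_seen, {college: running average grade}) snapshots.

    Rows are visited in random order so that early averages are estimates
    based on a random sample; the last snapshot is the exact answer.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seen, (college, grade) in enumerate(rows, start=1):
        sums[college] += grade
        counts[college] += 1
        if seen % report_every == 0 or seen == len(rows):
            yield seen, {c: sums[c] / counts[c] for c in sums}


if __name__ == "__main__":
    enroll = [("A", 3.0), ("A", 4.0), ("B", 2.0), ("B", 3.0), ("A", 2.0)]
    for seen, estimate in online_avg(enroll):
        print(seen, estimate)
```

The user can stop consuming the generator as soon as the estimates look stable, which is the interactive behaviour that a batch query cannot offer.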

[Figures: partially completed visualization of US cities with CLOUDS; US cities with the conventional algorithm after 25 and 65 seconds; US cities with the CLOUDS algorithm after 25 and 65 seconds.]

[Graph: mean squared error over time. CLOUDS has lower error than the conventional (non-CLOUDS) algorithm.]

Sampling from Multiple Joins: Ripple Joins

Classical join algorithms scan a large portion of their input before they return any records. A ripple join can start returning output immediately upon invocation.

Ripple Join: Operation

Assume a ripple join of relations R and S. At each step:
1. Select a random tuple r from R.
2. Join r with the previously selected S tuples.
3. Select a random tuple s from S.
4. Join s with the previously selected R tuples.
5. Join r and s.

Ripple Join: Square Two-Table Join

In each matrix in the figure, the R axis represents tuples of R, the S axis represents tuples of S, and each position (r, s) represents the corresponding tuple in R x S. The "square" version of the ripple join samples from R and S at the same rate.

[Figure: R x S matrix after the first step, with a single sampled position marked X.]
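The square two-table ripple join described above can be sketched as a Python generator. This is a simplified in-memory illustration under stated assumptions, not the CONTROL code: the name `ripple_join`, the `predicate` callback, and the fixed seed are hypothetical, and random selection is modelled by shuffling each relation once and then consuming one tuple from each per step.

```python
import random


def ripple_join(R, S, predicate, seed=0):
    """Yield matching (r, s) pairs incrementally, one new r and s per step."""
    rng = random.Random(seed)
    R, S = list(R), list(S)
    rng.shuffle(R)
    rng.shuffle(S)
    seen_r, seen_s = [], []
    for k in range(max(len(R), len(S))):
        new_r = R[k:k + 1]  # empty once R is exhausted
        new_s = S[k:k + 1]  # empty once S is exhausted
        for r in new_r:     # join the new r with previously selected S tuples
            for s in seen_s:
                if predicate(r, s):
                    yield r, s
        for s in new_s:     # join the new s with previously selected R tuples
            for r in seen_r:
                if predicate(r, s):
                    yield r, s
        for r in new_r:     # join the new r with the new s
            for s in new_s:
                if predicate(r, s):
                    yield r, s
        seen_r.extend(new_r)
        seen_s.extend(new_s)


if __name__ == "__main__":
    R = [(1, "a"), (2, "b"), (3, "c")]
    S = [(1, "x"), (3, "y"), (3, "z"), (4, "w")]
    for pair in ripple_join(R, S, lambda r, s: r[0] == s[0]):
        print(pair)
```

Each step covers one more row and one more column of the R x S matrix, so every position is examined exactly once and the full join result is produced if the generator is run to completion.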

Ripple Join: Square Two-Table Join

[Figure: successive R x S matrices in which the sampled square region grows to 2 x 2, 3 x 3, and 4 x 4 as tuples are drawn from R and S.]

CONTROL Today

CONTROL algorithms are used in several freeware and commercial systems. Online aggregation techniques are integrated into the DB2 Universal Database. CLOUDS is implemented in Berkeley's Tioga DataSplash visualization system.
