Exploratory Data Analysis: A Bird’s Eye View

Exploratory Data Analysis
Module I: A Bird’s Eye View
Dr. Mark Williamson
DaCCoTA
University of North Dakota

Introduction
Exploratory data analysis: the approach that explores datasets to summarize their main characteristics using summary statistics and graphical methods.
It is the first step for successful statistical analyses.

Landscape
Exploratory Data Analysis
View data: Observation number, Variable number, Variable type, Variable category, Structure
Summary statistics: Mean, Median, Mode, Range, Variance, Standard deviation, Outliers, Missing data
Basic Graphs: Histogram, Bar plot, Boxplot, Scatter-plot, QQ-plot
Basic Tests: Check assumptions, T-tests, Correlations, ANOVA, Linear model

Structures and Uses
View data
Observation number: number of rows; number of samples
Variable number: number of columns
Variable type: numerical discrete, numerical continuous, categorical nominal, categorical ordinal
Variable category: nuisance/bookkeeping, dependent variable, independent variable
Structure: long or wide; paired data; repeated measures
[Example data table; contents not recoverable from the transcription]
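The viewing step above can be sketched in a few lines of Python. This is a minimal illustration, not the module's own method: the dataset, the column names, and the helper names `variable_type` and `describe_structure` are all made up for this example, and missing values are assumed to be stored as `None`. Telling nominal from ordinal categories needs subject knowledge, so the sketch only reports "categorical".

```python
def variable_type(values):
    """Guess a variable type from the Python types of the non-missing values."""
    present = [v for v in values if v is not None]

    def is_int(v):
        return isinstance(v, int) and not isinstance(v, bool)

    def is_number(v):
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if present and all(is_int(v) for v in present):
        return "numerical discrete"
    if present and all(is_number(v) for v in present):
        return "numerical continuous"
    return "categorical"


def describe_structure(rows):
    """Report observation number (rows), variable number (columns), and types."""
    columns = list(rows[0].keys())
    return {
        "observations": len(rows),
        "variables": len(columns),
        "types": {c: variable_type([r.get(c) for r in rows]) for c in columns},
    }


# Hypothetical long-format data: one dict per observation (values are invented).
rows = [
    {"id": 1, "body_length": 9.2, "major": "Biology", "year": "Freshman"},
    {"id": 2, "body_length": 10.4, "major": "Chemistry", "year": "Senior"},
    {"id": 3, "body_length": None, "major": "Biology", "year": "Junior"},
]

summary = describe_structure(rows)
print(summary)
```

Here `id` would be a nuisance/bookkeeping variable; deciding which of the remaining columns is dependent or independent is an analysis choice the code cannot make.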

Structures and Uses
Summary statistics
Mean: also known as the average; the sum of all values divided by the number of values
Median: also known as the middle value; less influenced by outliers than the mean
Mode: the most common value among observations
Range: the distance between the highest and lowest value among observations
Variance: how much the values are spread out; the average of the squared differences from the mean
Standard deviation: another measure of how much the values are spread out; the square root of the variance
Outliers: very high or very low values; may be true values or some sort of measurement/writing error
Missing data: data that should be there but is not; may be an error or simply not measured
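As a rough sketch, the summary statistics above can be computed with Python's standard `statistics` module. The values are invented, `None` is assumed to mark missing data, and the 1.5 x IQR rule used to flag outliers is an assumption added here (the slides do not name an outlier rule). Note that `statistics.variance` and `statistics.stdev` are the sample versions, which divide by n - 1.

```python
import statistics

# Made-up measurements; None marks a missing value.
values = [4.0, 5.0, 5.0, 6.0, 7.0, None, 30.0]

present = [v for v in values if v is not None]
n_missing = len(values) - len(present)

mean = statistics.mean(present)
median = statistics.median(present)
mode = statistics.mode(present)
value_range = max(present) - min(present)
variance = statistics.variance(present)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(present)       # square root of the sample variance

# Assumed outlier rule: flag values more than 1.5 * IQR beyond the quartiles.
q1, _, q3 = statistics.quantiles(present, n=4)
iqr = q3 - q1
outliers = [v for v in present if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(f"missing={n_missing} mean={mean} median={median} mode={mode}")
print(f"range={value_range} variance={variance:.2f} sd={std_dev:.2f}")
print(f"outliers={outliers}")
```

In this toy data the single large value pulls the mean well above the median, which is the sense in which the median is less influenced by outliers.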

Structures and Uses
Basic Graphs
Histogram: distribution of values
Boxplot: median, quartiles, outliers across groups
QQ-plot: test for normality
Bar-plot: mean across groups
Scatter-plot: relationship between two numerical variables
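To show what a histogram actually does, here is a small text-only sketch that counts values per bin and prints one bar per bin. The data, the bin edges, and the helper name `histogram_counts` are assumptions for illustration; in practice a plotting library would draw the figure.

```python
def histogram_counts(values, start, bin_width, n_bins):
    """Count values in each half-open bin [start + k*w, start + (k+1)*w)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // bin_width)
        if 0 <= k < n_bins:          # values outside the bins are ignored
            counts[k] += 1
    return counts


# Made-up measurements.
values = [1.2, 1.9, 2.1, 2.4, 2.8, 3.3, 3.7, 4.5]
start, bin_width, n_bins = 1.0, 1.0, 4

counts = histogram_counts(values, start, bin_width, n_bins)
for k, count in enumerate(counts):
    low = start + k * bin_width
    print(f"[{low:.1f}, {low + bin_width:.1f}) " + "#" * count)
```

The bar lengths give the distribution of values at a glance; changing `bin_width` changes how coarse or fine that picture is.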

Structures and Uses
Basic Tests
Check assumptions:
Is the data normally distributed? (histogram, QQ-plot)
How large is the variance? (summary statistics)
Are there equal observations across groups? (summary statistics)
T-tests: Is there a difference between one group and a set value, or between two groups? (boxplot, bar-plot)
Correlations: Is there a relationship between two numerical variables? (scatter plot)
ANOVA: Is there a difference between three or more groups? (boxplot, bar-plot)
Linear model: Can the dependent variable be predicted from one or more independent variables?
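Two of the basic tests above reduce to short formulas, sketched here with the standard library only. The groups and the function names `welch_t` and `pearson_r` are illustrative assumptions. The t-test shown is Welch's version, which does not assume equal variances; that choice is made here and is not stated in the slides. Only the test statistic is computed; a p-value would additionally need the t distribution from a statistics package.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's two-sample t statistic: difference in means over its standard error."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up measurements for two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 5.0]
group_b = [6.2, 6.8, 6.1, 7.0, 6.5]
t_stat = welch_t(group_a, group_b)
print(f"t = {t_stat:.2f}")

# Made-up paired numerical variables with a perfectly linear relationship.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

A t statistic far from zero suggests a difference between group means, and an r near 1 or -1 suggests a strong linear relationship; both are starting points for the formal tests rather than conclusions.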

Examples

Quick Assessment
1. What are the four basic parts of exploratory data analysis?
View Data, Summary Statistics, Simple Graphs, Simple Tests
2. What graph would you use to see if there might be a relationship between two numerical variables?
Scatter Plot
3. What exploration part helps determine the variable number, type, and category?
Viewing the data
4. What summary statistic is also called the average?
Mean
5. If you ran an ANOVA, what sort of graph(s) would you use to display the results?
Bar-plot and/or boxplot
6. What type of graph is this?
Histogram

Summary and Conclusion
Exploratory data analysis is the first step in analyzing data.
Viewing data helps determine the type and structure of the data.
Summary statistics help summarize the data numerically.
Simple graphs help visualize the structure and relationships of the data.
Simple tests provide a guide for further data analysis.
Tune in next time for a stroll through the core components of Exploratory Data Analysis in Module II: Leaves and Trees.
