More Descriptive Statistics: Percentiles, Boxplots, and z-scores

Upload by : Ryan Jay

More descriptive statistics: percentiles, boxplots, and z-scores

Outline for today
- Better know a player: Wade Boggs
- Review: central tendency and measures of variation
- More descriptive statistics: percentiles, 5 number summaries, boxplots, z-scores

Better know a player: Wade Boggs
Any questions about worksheet 2?

Descriptive statistics
What is a statistic?
A statistic is a numerical summary (function) of a sample.

The mean
$$\text{Mean} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$$
Sample mean (x̄) vs. population mean (μ)

The median
The median is the value in the middle of your data: ½ of the values are greater than the median and ½ are less.
The median is resistant to outliers, whereas the mean is not.
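As a minimal sketch of what "resistant" means here, the R snippet below uses a small made-up vector (the values are hypothetical, not from the slides) and compares the mean and median before and after the largest value is replaced by an extreme one.

```r
# Hypothetical data, chosen only for illustration
x <- c(2, 4, 5, 7, 9)
x_outlier <- c(2, 4, 5, 7, 90)  # same data with the largest value made extreme

mean_before <- mean(x)              # sum(x) / length(x)
mean_after <- mean(x_outlier)
median_before <- median(x)          # middle value of the sorted data
median_after <- median(x_outlier)

print(c(mean_before = mean_before, mean_after = mean_after))
print(c(median_before = median_before, median_after = median_after))
```

The mean is pulled toward the extreme value, while the median stays at the middle observation.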

The standard deviation
[Figure: plot labeled with the mean and the mean + stdev]

Large vs. small standard deviations
Same mean, different standard deviation. Which has the largest standard deviation?
A) Green
B) Red
C) Blue
Same standard deviation, different mean. Which has the largest mean?
A) Red
B) Blue

The 95% rule (of thumb)
If a distribution of data is approximately symmetric and bell-shaped, about 95% of the data should fall within two standard deviations of the mean.
i.e., 95% of the data is in the interval: x̄ - 2s to x̄ + 2s
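One way to check the rule of thumb is on simulated bell-shaped data. The sketch below is only an illustration: the sample size, the mean of 100, the standard deviation of 15, and the seed are arbitrary assumptions, not values from the slides.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable
x <- rnorm(10000, mean = 100, sd = 15)  # simulated bell-shaped data (hypothetical)

x_bar <- mean(x)  # sample mean
s <- sd(x)        # sample standard deviation

lower <- x_bar - 2 * s
upper <- x_bar + 2 * s

# Proportion of values inside the interval from x_bar - 2s to x_bar + 2s
prop_within <- mean(x >= lower & x <= upper)
print(prop_within)
```

For data like these the printed proportion should land close to 0.95; for skewed data the rule can be noticeably off.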

Percentiles
The pth percentile is the value of a quantitative variable which is greater than p percent of the data.

Percentiles/quantiles
https://emeyers.shinyapps.io/baseball_stat_percentiles/

What is a good statistic for ?
Use the website to determine what “good” values are for the following statistics:
- Home runs (HR)
- On base percentage (OBP)
- Batting average (BA)
- Strikeouts (SO)
https://emeyers.shinyapps.io/baseball_stat_percentiles/

Putting statistics in context
90th percentile

Calculating percentiles
The pth percentile is the value of a quantitative variable which is greater than p percent of the data.
David Ortiz's home run data:

Order  Sorted data  Percentile
1      23           0
2      23           10
3      28           20
4      29           30
5      30           40
6      32           50
7      35           60
8      35           70
9      37           80
10     38           90
11     54           100

R: quantile(x, probs)

Calculating percentiles
Typically we ask for the value that is at the pth percentile rather than calculate the percentiles for our data
- e.g., the 25th percentile value is 28.5
- (weighted mean of the 20th and 30th percentiles)
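The percentile calculation can be sketched in R with the home run values from the slides. The mapping from sorted position to percentile, (order - 1) / (n - 1) * 100, is inferred from the 0, 50, and 100 entries shown in the slide table; R's default quantile() method (type 7) interpolates on that same scale.

```r
# David Ortiz's home run data from the slides
x <- c(54, 29, 38, 35, 23, 23, 30, 28, 35, 32, 37)

sorted_x <- sort(x)
n <- length(sorted_x)

# Percentile attached to each sorted position: 0, 10, ..., 100
percentile <- (seq_len(n) - 1) / (n - 1) * 100
print(data.frame(order = seq_len(n), sorted = sorted_x, percentile = percentile))

# Value at the 25th percentile: halfway between the 20th (28) and 30th (29)
q25 <- unname(quantile(x, probs = 0.25))
print(q25)  # 28.5
```

Other definitions of a percentile exist (quantile() has a type argument for them), so results on small data sets can differ slightly between tools.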

Five Number Summary
Five Number Summary = (min, Q1, median, Q3, max)
Q1 = 25th percentile (also called 1st quartile)
Q3 = 75th percentile (also called 3rd quartile)
Roughly divides the data into fourths
R: fivenum(x)

Range and Interquartile Range
Range = maximum - minimum
Interquartile range (IQR) = Q3 - Q1
R: IQR(x)

Compute: 5 number summary, range, and IQR for David Ortiz home runs
1. Five Number Summary = (min, Q1, median, Q3, max)
2. Range = maximum - minimum
3. Interquartile range (IQR) = Q3 - Q1
Data: 54 29 38 35 23 23 30 28 35 32 37
Also use the percentile app to find the 5 number summary for HRs for all player-seasons with over 500 PA: https://emeyers.shinyapps.io/baseball_stat_percentiles/

5 number summary, range, and IQR for David Ortiz home runs
1. Five Number Summary: (23, 28.5, 32, 36, 54)
2. Range: 31
3. Interquartile range (IQR) = 7.5
Data: 54 29 38 35 23 23 30 28 35 32 37
The 5 number summary for HRs for all player-seasons with over 500 PA is: (0, 4, 10, 20, 73)
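The answers above can be reproduced with the R functions named on the earlier slides. This is a short sketch; the variable names are illustrative. Note that fivenum() and IQR() use slightly different quartile conventions in general, although they agree on these 11 values.

```r
# David Ortiz's home run data from the slides
x <- c(54, 29, 38, 35, 23, 23, 30, 28, 35, 32, 37)

five <- fivenum(x)             # min, Q1, median, Q3, max
data_range <- max(x) - min(x)  # range = maximum - minimum
iqr <- IQR(x)                  # interquartile range = Q3 - Q1

print(five)        # 23.0 28.5 32.0 36.0 54.0
print(data_range)  # 31
print(iqr)         # 7.5
```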

Detecting outliers
As a rule of thumb, we call a data value an outlier if it is:
- Smaller than: Q1 - 1.5 * IQR
- Larger than: Q3 + 1.5 * IQR
Are there any outliers in David Ortiz's home run numbers?
1. Five Number Summary: (23, 28.5, 32, 36, 54)
2. Range: 31
3. Interquartile range (IQR) = 7.5
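A minimal sketch of the rule of thumb applied to these numbers follows. Q1 and Q3 are taken from the five number summary on the slide; the names lower_fence and upper_fence are illustrative labels for the two cutoffs, not terms used in the slides.

```r
# David Ortiz's home run data from the slides
x <- c(54, 29, 38, 35, 23, 23, 30, 28, 35, 32, 37)

q1 <- 28.5  # from the five number summary on the slide
q3 <- 36
iqr <- q3 - q1

lower_fence <- q1 - 1.5 * iqr  # values below this count as outliers
upper_fence <- q3 + 1.5 * iqr  # values above this count as outliers

outliers <- x[x < lower_fence | x > upper_fence]
print(c(lower_fence, upper_fence))  # 17.25 47.25
print(outliers)                     # 54
```

Under this rule the 54 home run season is the only outlier, since 54 exceeds 47.25 and no value falls below 17.25.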

Boxplots
A boxplot is a graphical display of the 5 number summary and consists of:
1. Drawing a box from Q1 to Q3
2. Dividing the box with a line drawn at the median
3. Drawing a line from each quartile to the most extreme data value that is not an outlier
4. Drawing a dot/asterisk for each outlier data point

Box plot of David Ortiz home runs
[Figure: box plot; axis: Home runs]
R: boxplot(x)
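To see the numbers behind the plot without drawing it, one option is boxplot.stats(), which returns the values that boxplot() uses. The sketch below assumes the default coef = 1.5, which corresponds to the 1.5 * IQR outlier rule.

```r
# David Ortiz's home run data from the slides
x <- c(54, 29, 38, 35, 23, 23, 30, 28, 35, 32, 37)

# boxplot.stats() computes what boxplot() draws, without opening a plot
bs <- boxplot.stats(x)

print(bs$stats)  # lower whisker, Q1, median, Q3, upper whisker
print(bs$out)    # points drawn individually as outliers
```

Here the upper whisker stops at 38, the most extreme value that is not an outlier, and 54 is plotted as a separate point.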

Box plot quiz
[Figure: box plot of home runs with parts labeled A through F]
What is:
- Q1?
- Q3?
- The median?
- Most extreme values that are not outliers?
- Outliers?

Two current players: who is best?
Miguel Cabrera: HR in 2014 = 25
David Ortiz: HR in 2014 = 35

Comparing players with side-by-side box plots
[Figure: side-by-side box plots for players A and B]
How would you describe the differences between these two players in terms of HRs? Who is better?

Let's compare two more players
Career best seasons
1985: Wade Boggs: BA = .368
1941: Ted Williams: BA = .406
Who is better?

Who is best here?
Is Ted Williams better than Wade Boggs?

Ted Williams hit .406 in 1941
Plenty of people hit over .400 before him, but no one has since

Have the best players gotten worse at hitting over the past 140 years?
[Figure: max batting average by year]

Comparing players across time periods
Problem: baseball has changed from 1871 to now
We can't simply compare statistics to judge how great a baseball player is when comparing across decades
Useful to judge the 'greatness' of players relative to their peers

Histograms of batting average, 1941 vs. 1985
Do the batting averages look similar in these years?

Density of batting average, 1941 vs. 1985
Do the batting averages look similar in these years?

z-scores
The z-score tells how many standard deviations a value x is from the mean (x̄), in a way that is independent of the units of measurement.
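The definition above is usually written as the following formula. It does not appear in the transcription, so it is given here as the standard textbook form, with x̄ the sample mean and s the sample standard deviation as in the rest of the slides.

$$z = \frac{x - \bar{x}}{s}$$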

z-scores for comparing players across eras
When comparing players across eras, we will use the mean (x̄) and standard deviation (s) from each era separately.
This will give a measure of player performance relative to their peers in the same era.

Comparing Ted and Wade to their peers
In 1941:
- Mean batting average was: .276
- Standard deviation in batting average was: .033
- Ted Williams's batting average was: .406
In 1985:
- Mean batting average was: .266
- Standard deviation in batting average was: .027
- Wade Boggs's batting average was: .368
Calculate z-scores for Ted and Wade's batting averages.
Who was better relative to their peers?

Comparing Ted and Wade to their peers
Wade's batting average z-score: 3.82
Ted's batting average z-score: 3.97
Who is the better hitter?
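The z-score calculation can be sketched in R using the rounded means and standard deviations quoted on the previous slide; z_score is an illustrative helper name, not a built-in function. With those rounded inputs the results come out near 3.94 for Ted and 3.78 for Wade, slightly below the 3.97 and 3.82 shown above. A plausible explanation, though it is only an assumption, is that the slide values were computed from unrounded means and standard deviations. The ordering is the same either way.

```r
# z-score: number of standard deviations a value is from the mean
z_score <- function(x, x_bar, s) (x - x_bar) / s

z_ted <- z_score(0.406, x_bar = 0.276, s = 0.033)   # Ted Williams, 1941
z_wade <- z_score(0.368, x_bar = 0.266, s = 0.027)  # Wade Boggs, 1985

print(round(c(Ted = z_ted, Wade = z_wade), 2))
```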

Career z-scored batting averages

What about home runs?

Next class: correlation!
Q and R: Big Data Baseball chapter 4
