Advanced Statistics - Universität Innsbruck


Janette Walde
janette.walde@uibk.ac.at
Department of Statistics
University of Innsbruck

Contents

- Introduction
- Basics/Descriptive Statistics
  - Scales of measurement
  - Graphical exploration of data
  - Descriptive characteristics for a variable
- Estimation
  - Characteristics of an estimator
  - Confidence interval
- Statistical hypothesis testing
  - Statistical testing principle
  - Testing errors
  - Power analysis
- Why multivariate analysis?

Introduction

"We are pattern-seeking story-telling animals." (Edward Leamer)

"Statistics does not hand truth to the user on a silver platter. However, statistics confines arbitrariness and provides comprehensible conclusions."

"There are no facts, only interpretations." (Friedrich Nietzsche)

Preliminary comments

1. You will learn to apply statistical tools correctly, to interpret the findings appropriately, and to get an idea of the possibilities for analyzing research questions with statistics.
2. It is neither possible nor worthwhile to learn all statistical methods in a course like this. However, the course is successful if it enables you to improve your knowledge of statistical methods on your own. It therefore gives you profound knowledge of some statistical analysis tools and shows you how to apply them correctly.
3. Even with the most sophisticated analytical instruments, one may run into limits in obtaining results, finding appropriate interpretations, or applying tools within the given framework. This has to be accepted ("If we torture the data long enough, they will confess.").
4. Be aware: never confuse statistical significance with biological significance.

Basics/Descriptive Statistics

Scales of measurement

1. Nominal Scale. Nominal data are attributes like sex or species, and represent measurement at its weakest level. We can determine if one object is different from another, and the only formal property of nominal scale data is equivalence.
2. Ranking Scale. Some biological variables cannot be measured on a numerical scale, but individuals can be ranked in relation to one another. Two formal properties occur in ranking data: equivalence and greater than.
3. Interval and Ratio Scales. Interval and ratio scales have all the characteristics of the ranking scale, but we also know the distances between the classes. If we have a true zero point, we have a ratio scale of measurement.

Graphical exploration of data

Histogram

[Figure: histograms of a normal distribution (variable X) and a skewed distribution (variable Y); vertical axis: frequency (density).]

Box Plot

[Figure: box plots of the normal distribution (variable X) and the skewed distribution (variable Y).]

Q-Q Plot

- Many statistical methods make some assumptions about the distribution of the data (e.g. normality).
- The quantile-quantile plot provides a way to visually investigate such an assumption.
- The Q-Q plot shows the theoretical quantiles versus the empirical quantiles. If the assumed (theoretical) distribution is indeed the correct one, we should observe a straight line.

[Figure: two normal Q-Q plots (theoretical quantiles versus sample quantiles), shown with accompanying density plots.]
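The idea behind the Q-Q plot can be sketched numerically. The following minimal Python sketch is not part of the original slides: the plotting positions (i - 0.5)/n, the simulated data, and the helper names `normal_qq_points` and `pearson_r` are assumptions made for illustration. It pairs theoretical standard normal quantiles with the ordered sample values and uses the correlation of the pairs as a rough measure of how straight the resulting line is.

```python
import math
import random
from statistics import NormalDist


def normal_qq_points(sample):
    """Pair theoretical standard normal quantiles with the ordered sample."""
    n = len(sample)
    ordered = sorted(sample)  # empirical quantiles
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def pearson_r(pairs):
    """Pearson correlation of (x, y) pairs; close to 1 means a nearly straight line."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
normal_sample = [rng.gauss(0, 1) for _ in range(500)]        # simulated normal data
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]   # simulated skewed data

r_normal = pearson_r(normal_qq_points(normal_sample))
r_skewed = pearson_r(normal_qq_points(skewed_sample))
print(f"normal data: r = {r_normal:.3f}")
print(f"skewed data: r = {r_skewed:.3f}")
```

For the normal sample the points lie almost on a straight line, while the skewed sample bends away from it, which shows up as a lower correlation.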

Descriptive characteristics for a variable

Summary Statistic

- Mean, median
- Percentiles, interquartile range
- Minimum, maximum, range
- Standard deviation, variance
- Coefficient of variation
- Median absolute deviation, mean absolute deviation
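The characteristics listed above could be computed as follows. This is a minimal sketch using only Python's standard library; the data values and the function name `summarize` are hypothetical, and the quartiles follow the default "exclusive" method of `statistics.quantiles`, which is only one of several conventions.

```python
import statistics


def summarize(values):
    """Descriptive characteristics of a numeric variable."""
    data = sorted(values)
    mean = statistics.mean(data)
    median = statistics.median(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles, "exclusive" method
    sd = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    return {
        "mean": mean,
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,  # interquartile range
        "min": data[0],
        "max": data[-1],
        "range": data[-1] - data[0],
        "sd": sd,
        "variance": statistics.variance(data),
        "cv": sd / mean,  # coefficient of variation
        "median_abs_dev": statistics.median([abs(x - median) for x in data]),
        "mean_abs_dev": statistics.mean([abs(x - mean) for x in data]),
    }


observations = [2, 4, 4, 5, 7, 9, 11]  # hypothetical data, for illustration only
summary = summarize(observations)
for name, value in summary.items():
    print(f"{name:>15}: {value:.3f}")
```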

Estimation

Fundamental concepts

Populations must be defined at the start of any study, and this definition should include the spatial and temporal limits to the inference. The formal statistical inference is restricted to these limits.

Possibility of drawing samples randomly.

Population parameters are considered to be fixed but unknown values (in contrast to the Bayesian approach).

Characteristics of an estimator

A good estimator of a population parameter should have the following characteristics:

- The estimator should be unbiased, meaning that the expected value of the sample statistic (the mean of its probability distribution) should equal the parameter.
- It should be consistent, so that as the sample size increases the estimator gets closer to the population parameter.
- It should be efficient, meaning it has the lowest variance among all competing estimators.

Unbiasedness of the sample mean as estimator for the population mean

[Figure: mean of each sample plotted against the number of the sample (1 to 10), n = 50.]

Consistency of the sample mean as estimator for the population mean

[Figure: sample means of ten samples each for n = 10, n = 100 and n = 10,000.]
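The behaviour shown in the two figures above can be imitated with a small simulation. The sketch below is an illustration, not the code behind the slides: the normal population with mean 0 and standard deviation 10, the seed, and the name `sample_means` are assumptions. For each sample size it draws ten samples and reports how far the sample means stray from the population mean.

```python
import random
import statistics

POP_MEAN, POP_SD = 0.0, 10.0  # assumed normal population, for illustration
rng = random.Random(42)       # fixed seed so the sketch is reproducible


def sample_means(n, n_samples=10):
    """Draw n_samples random samples of size n and return their means."""
    return [
        statistics.fmean([rng.gauss(POP_MEAN, POP_SD) for _ in range(n)])
        for _ in range(n_samples)
    ]


max_abs_error = {}
for n in (10, 100, 10_000):
    means = sample_means(n)
    # Largest distance between a sample mean and the population mean.
    max_abs_error[n] = max(abs(m - POP_MEAN) for m in means)
    print(f"n = {n:>6}: largest |mean - mu| over 10 samples = {max_abs_error[n]:.3f}")
```

The sample means scatter around the population mean in both directions (unbiasedness), and the scatter shrinks as n grows (consistency).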

Efficiency of the sample mean and of the median as estimators for the population central tendency

1,000 samples with n = 100; the variable is normally distributed with population mean zero and standard deviation ten.

[Figure: distribution of the estimates for the two estimators (mean and median).]
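The comparison in this figure can be reproduced approximately with a simulation. The sketch below uses the settings stated on the slide (1,000 samples, n = 100, normal population with mean 0 and standard deviation 10); the seed and variable names are assumptions, so the exact numbers will differ from the original figure.

```python
import random
import statistics

N_SAMPLES, N = 1_000, 100      # settings from the slide
POP_MEAN, POP_SD = 0.0, 10.0   # settings from the slide
rng = random.Random(7)         # arbitrary seed, for reproducibility

means, medians = [], []
for _ in range(N_SAMPLES):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# The spread of each estimator across samples measures its efficiency.
sd_of_means = statistics.stdev(means)
sd_of_medians = statistics.stdev(medians)
print(f"sd of sample means:   {sd_of_means:.3f}")
print(f"sd of sample medians: {sd_of_medians:.3f}")
```

For normally distributed data the sample mean varies less from sample to sample than the median, so it is the more efficient of the two estimators.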

Confidence interval

Confidence interval for the population mean

Consider a population of N observations of the variable X. We take a random sample of n observations {x1, x2, ..., xn} from the population.

- Median versus sample mean (x̄).
- Having an estimate of a parameter is only the first step in estimation. We also need to know how precise our estimate is: standard error.
  Standard error of the mean: $se_{\bar{x}} = \hat{\sigma} / \sqrt{n}$
- Confidence interval for the population mean:
  $CI_{(1-\alpha)}: [\bar{x} - t_{df=n-1,\,1-\alpha/2}\, se_{\bar{x}} \;;\; \bar{x} + t_{df=n-1,\,1-\alpha/2}\, se_{\bar{x}}]$
  where $t_{df=n-1,\,1-\alpha/2}$ is the $1-\alpha/2$ quantile of the t distribution with n - 1 degrees of freedom.

95% confidence interval for the population mean

[Figure: 95% confidence intervals from ten samples each for n = 10, n = 100 and n = 10,000.]
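A confidence interval for the mean could be computed along the following lines. This is a sketch under stated assumptions: the measurements are invented, the function name `mean_confidence_interval` is hypothetical, and the t quantile for 9 degrees of freedom is hard-coded from a standard t table because Python's standard library does not provide t quantiles.

```python
import math
import statistics

# Hypothetical measurements (n = 10), invented for illustration only.
sample = [4.1, 5.3, 4.8, 5.9, 5.1, 4.6, 5.5, 4.9, 5.2, 4.6]

# 0.975 quantile of the t distribution with 9 degrees of freedom,
# taken from a standard t table (needed for a 95% interval with n = 10).
T_CRIT_DF9 = 2.262


def mean_confidence_interval(values, t_crit):
    """Return (lower, upper) bounds of the confidence interval for the mean."""
    n = len(values)
    x_bar = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return x_bar - t_crit * se, x_bar + t_crit * se


lower, upper = mean_confidence_interval(sample, T_CRIT_DF9)
print(f"95% CI for the population mean: [{lower:.2f}, {upper:.2f}]")
```

Because the standard error shrinks with the square root of n, the interval narrows as the sample size increases.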

Statistical hypothesis testing

Statistical testing principle

Statistical tests and scientific hypotheses

A statistical test is a confrontation of the real world (observations) with a theory (model), with the aim of falsifying the model.

Model: H0: µ = 0 and Ha: µ ≠ 0
Real world: x̄, s

As such, the statistical test (as a scientific method) fits directly into the philosophy of science described by the Austrian-British philosopher Karl Popper (1902–1994) (see e.g. The Logic of Scientific Discovery, 1972). Basically, the philosophy says that 1) theories cannot be empirically verified but only falsified, and 2) scientific progress happens by holding a theory until it is falsified. That is, if we observe a phenomenon (data) which is very unlikely under the model (theory), then we reject the model (theory).

"No amount of experimentation can ever prove me right; a single experiment can prove me wrong." (Albert Einstein)

In other words, experiments can mainly be used for falsifying a scientific hypothesis, never for proving it! When we have a scientific theory, we conduct an experiment in order to falsify it. Therefore, the strong conclusion arising from an experiment is the rejection of a hypothesis. Accepting (more precisely: not rejecting) a hypothesis is not a very strong conclusion; the hypothesis may have survived simply because the experiment was too small.

Example

Suppose we have a coin, and that our hypothesis is that the coin is fair, i.e. that P(head) = P(tail) = 1/2. Suppose we toss the coin n = 25 times and observe 21 heads. The probability of actually observing these data under the model is P(21 heads, 4 tails) ≈ 0.0004. It is a very unlikely (but possible) event to see such data if the model is true. In this falsification process we employ the interpretation principle of statistics:

Unlikely events do not occur.
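The probability quoted in the example follows from the binomial distribution. The short sketch below recomputes it; the function name `binomial_pmf` is an illustrative choice, not something defined in the slides.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# 21 heads in 25 tosses of a fair coin.
prob_21_heads = binomial_pmf(21, 25, 0.5)
print(f"P(21 heads, 4 tails) = {prob_21_heads:.6f}")
```

Rounded to four decimals this gives the 0.0004 stated in the example.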

If we do not employ this principle, we can never say anything at all on the basis of statistics (observations): an opponent can always claim that the present observations are just "an unfortunate outcome" which, no matter how unlikely, is possible.

In practice the statistical interpretation principle needs more structure:

- In a large sample space, all possible outcomes will have a very small probability, so it will be unlikely to have the data one has.
- In addition, there is also the question of how small a probability is needed in order to classify data as being unlikely.
- Concepts of p-value and significance level α.
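To illustrate how the p-value adds this structure, the sketch below returns to the coin example (25 tosses, 21 heads). Instead of the probability of the single observed outcome, it sums the probabilities of all outcomes at least as extreme as the one observed and compares the result with a significance level. The choice of α = 0.05, the two-sided doubling for the symmetric fair-coin case, and the function names are assumptions made for illustration.

```python
from math import comb

ALPHA = 0.05  # assumed significance level


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k_observed, n, p):
    """P(X >= k_observed): outcomes at least as extreme in the upper direction."""
    return sum(binomial_pmf(k, n, p) for k in range(k_observed, n + 1))


one_sided_p = upper_tail(21, 25, 0.5)
# For a fair coin the distribution is symmetric, so the two-sided p-value
# is twice the one-sided tail probability (capped at 1).
two_sided_p = min(1.0, 2 * one_sided_p)
reject_h0 = two_sided_p < ALPHA

print(f"one-sided p-value: {one_sided_p:.6f}")
print(f"two-sided p-value: {two_sided_p:.6f}")
print("reject H0 (fair coin)" if reject_h0 else "do not reject H0")
```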

Testing errors

Two Types of Errors

Recall that the following four outcomes are possible when conducting a test:

                         Our decision
Reality    H0                           Ha
H0         Correct (Prob = 1 − α)       Type I error (Prob = α)
Ha         Type II error (Prob = β)     Correct (Prob = 1 − β)

The significance level α of any fixed level test is the probability of a Type I error.

Acceptable levels of errors

- Type I error (α)
  - Typically α = 0.05 (this convention is due to R.A. Fisher)
  - For more stringent tests α = 0.01 or α = 0.001
  - Exploratory or preliminary experiments α = 0.10
- Type II error (β)
  - Typically 0.20
  - Often unspecified and much less than 0.20
  - Statistical power (1 − β)
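The two error rates can be made concrete with a simulation. The sketch below is an illustration under assumed settings that are not taken from this slide: a one-sided z test of H0: µ = 0 with known σ = 2, n = 25, α = 0.05, and a hypothetical true mean of 1 under the alternative. The name `rejection_rate` is likewise an assumption.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

ALPHA, N, SIGMA = 0.05, 25, 2.0            # assumed design, for illustration
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA)   # one-sided critical value, about 1.645
rng = random.Random(2024)                  # fixed seed for reproducibility


def rejection_rate(true_mu, n_experiments=10_000):
    """Share of simulated experiments in which H0: mu = 0 is rejected."""
    rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(true_mu, SIGMA) for _ in range(N)]
        z = fmean(sample) / (SIGMA / sqrt(N))  # z statistic with mu0 = 0
        if z >= Z_CRIT:
            rejections += 1
    return rejections / n_experiments


# H0 true: every rejection is a Type I error.
type_i_rate = rejection_rate(true_mu=0.0)
# Ha true (hypothetical mu = 1): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mu=1.0)

print(f"simulated Type I error rate:  {type_i_rate:.3f}")
print(f"simulated Type II error rate: {type_ii_rate:.3f}")
```

The simulated Type I error rate stays close to the chosen α, while the Type II error rate depends on the true mean, σ and n.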

Power analysis

The power of a statistical test

The power of a significance test measures its ability to detect an alternative hypothesis.

The power against a specific alternative is calculated as the probability that the test will reject H0 when that specific alternative is true.

Example: Computing statistical power

Does exercise make strong bones?

Can a 6-month exercise program increase the total body bone mineral content (TBBMC) of young women? A team of researchers is planning a study to examine this question. Based on the results of a previous study, they are willing to assume that σ = 2 for the percent change in TBBMC over the 6-month period. A change in TBBMC of 1% would be considered important, and the researchers would like to have a reasonable chance of detecting a change this large or larger. Are 25 subjects a large enough sample for this project?

1. State the hypotheses: let µ denote the mean percent change:
   H0: µ = 0
   Ha: µ > 0
2. Calculate the rejection region: the z test rejects H0 at the α = 0.05 level whenever
   $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x}}{2/\sqrt{25}} \geq 1.645$
   That is, we reject H0 when x̄ ≥ 0.658.

3. Compute the power at a specific alternative: the power of the test at the alternative µ = 1 is
   $P(\bar{x} \geq 0.658 \mid \mu_a = 1) = 0.8$
   Plot graph.
4. Statistical power is the probability of rejecting H0 given the population effect size (ES), α and the sample size (n). This calculation also requires knowledge of the sampling distribution of the test statistic under the alternative hypothesis: power curve.
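The numbers in steps 2 and 3 can be checked with a few lines of Python. The sketch below uses the values from the example (σ = 2, n = 25, α = 0.05, µ0 = 0, µa = 1); the variable names are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, alpha = 2.0, 25, 0.05   # values from the example
mu_0, mu_a = 0.0, 1.0             # null value and specific alternative

se = sigma / sqrt(n)                       # standard error of the mean: 0.4
z_crit = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value, about 1.645

# Step 2: rejection region expressed in terms of the sample mean.
x_crit = mu_0 + z_crit * se

# Step 3: probability that the sample mean exceeds x_crit when mu = mu_a.
power = 1 - NormalDist(mu_a, se).cdf(x_crit)

print(f"reject H0 when the sample mean is at least {x_crit:.3f}")
print(f"power at mu_a = {mu_a}: {power:.2f}")
```

With 25 subjects the test has a power of about 0.8 against the alternative µ = 1.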

Power function in dependence of the effect size

[Figure: power curve; horizontal axis: µa − µ0 (0 to 2), vertical axis: power = 1 − β (0 to 1).]
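A power curve of this kind can be tabulated as follows. The sketch assumes the same one-sided z test as in the example (σ = 2, n = 25, α = 0.05) and evaluates the power on a grid of effect sizes from 0 to 2; the grid spacing and the function name `power_at` are assumptions, so the values need not match the original figure exactly.

```python
from math import sqrt
from statistics import NormalDist

SIGMA, N, ALPHA = 2.0, 25, 0.05   # values from the example
SE = SIGMA / sqrt(N)
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA)


def power_at(effect):
    """Power of the one-sided z test when mu_a - mu_0 equals `effect`."""
    # Under the alternative, the z statistic is shifted by effect / SE.
    return 1 - NormalDist().cdf(Z_CRIT - effect / SE)


# Effect sizes 0.0, 0.2, ..., 2.0 as on the horizontal axis of the figure.
curve = [(round(0.2 * i, 1), power_at(0.2 * i)) for i in range(11)]
for effect, power in curve:
    print(f"mu_a - mu_0 = {effect:.1f}: power = {power:.3f}")
```

At an effect size of zero the power equals α, and it rises towards 1 as the true mean moves away from µ0.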

Ways to increase power

- Increase α. A 5% test of significance will have a greater chance of rejecting H0 than a 1% test, because the strength of evidence required for rejection is less.
- Consider a particular alternative that is farther away from µ0. Values of µ that are in Ha but lie close to the hypothesized value µ0 are harder to detect than values of µ that are far from µ0.
- Increase the sample size. More data will provide more information about x̄, so we have a better chance of distinguishing values of µ.
- Decrease σ. This has the same effect as increasing the sample size: it provides more information about µ. Improving the measurement process and restricting attention to a subpopulation are two common ways to decrease σ.
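The four levers can be compared numerically. The sketch below takes the bone-density example as a baseline (µa = 1, σ = 2, n = 25, α = 0.05, one-sided z test) and changes one quantity at a time; the changed values (α = 0.10, µa = 1.5, n = 50, σ = 1.5) and the function name are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu_a, sigma, n, alpha, mu_0=0.0):
    """Power of the one-sided z test of H0: mu = mu_0 against Ha: mu > mu_0."""
    se = sigma / sqrt(n)
    x_crit = mu_0 + NormalDist().inv_cdf(1 - alpha) * se
    return 1 - NormalDist(mu_a, se).cdf(x_crit)


baseline = {"mu_a": 1.0, "sigma": 2.0, "n": 25, "alpha": 0.05}
scenarios = {
    "baseline": baseline,
    "increase alpha to 0.10": {**baseline, "alpha": 0.10},
    "alternative farther away (mu_a = 1.5)": {**baseline, "mu_a": 1.5},
    "increase sample size to 50": {**baseline, "n": 50},
    "decrease sigma to 1.5": {**baseline, "sigma": 1.5},
}

powers = {name: power_one_sided_z(**settings) for name, settings in scenarios.items()}
for name, power in powers.items():
    print(f"{name:<40} power = {power:.3f}")
```

Each change raises the power above the baseline value of about 0.8.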

How many samples are needed to achieve a power of 0.8 in a t-test?

Effect size index for the t-test for a difference between two independent means:

$d = \frac{\mu_1 - \mu_2}{\sigma}$

where d is the effect size index, µ1 and µ2 are the means, and σ is the common standard deviation of the two populations.

Effect size indices are available for many statistical tests.

Effect size                α = 0.10   α = 0.05   α = 0.01
Large effect (d = 0.8)           20         26         38
Medium effect (d = 0.5)          50         64         95
Small effect (d = 0.2)          310        393        586

Source: Cohen (1992), p. 158.

Recommendation: Use estimates of statistical power as a guide to planning experiments (a priori power analysis).
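The order of magnitude of these sample sizes can be reproduced with a normal approximation. The sketch below is not Cohen's procedure: it assumes the table entries are sample sizes per group for a two-sided test, and it replaces the t distribution by the normal distribution, using n = 2((z_{1-α/2} + z_{power}) / d)² per group. The function name `n_per_group` is hypothetical. Under these assumptions the approximation lands on or slightly below the tabulated values.

```python
from math import ceil
from statistics import NormalDist

# Values from the table above, keyed by (d, alpha).
COHEN_1992 = {
    (0.8, 0.10): 20, (0.8, 0.05): 26, (0.8, 0.01): 38,
    (0.5, 0.10): 50, (0.5, 0.05): 64, (0.5, 0.01): 95,
    (0.2, 0.10): 310, (0.2, 0.05): 393, (0.2, 0.01): 586,
}


def n_per_group(d, alpha, power=0.80):
    """Normal-approximation sample size per group (two-sided, two-sample test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


approx = {key: n_per_group(*key) for key in COHEN_1992}
for (d, alpha), n_table in COHEN_1992.items():
    print(f"d = {d}, alpha = {alpha}: approximation {approx[(d, alpha)]}, table {n_table}")
```

The small gaps arise because the t-test needs a few more observations than the normal approximation suggests, most visibly at α = 0.01.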

Is lack of statistical power a widespread problem?

"We estimated the statistical power of the first and last statistical test presented in 697 papers from 10 behavioral journals. On average statistical power was 13-16% to detect a small effect and 40-47% to detect a medium effect. This is far lower than the general recommendation of a power of 80%. By this criterion, only 2-3%, 13-21%, and 37-50% of the tests examined had a requisite power to detect a small, medium, or large effect, respectively."

Jennions, M.D., and A.P. Moeller 2003. Behavioral Ecology 14, 438-455.

Further readings

- Cohen, J. 1992. A power primer. Psychological Bulletin 112: 155-159.
- Jennions, M.D., and A.P. Moeller 2003. A survey of the statistical power of research in behavioral ecology and animal behavior. Behavioral Ecology 14: 438-455.
- Hoenig, J.M., and D.M. Heisey 2001. The abuse of power: the pervasive fallacy of power calculations for data analysis. American Statistician 55: 19-24.

Why multivariate analysis?

                Male   Female
Accept            35       20
Refuse entry      45       40
Total             80       60

- Example: 44% of male applicants are admitted by a university, but only 33% of female applicants.
- Does this mean there is unfair discrimination?
- University investigates and breaks down figures for Engineering and English programmes.

Simpson's Paradox

Engineering     Male   Female
Accept            30       10
Refuse entry      30       10
Total             60       20

English         Male   Female
Accept             5       10
Refuse entry      15       30
Total             20       40

- No relationship between sex and acceptance for either programme. So no evidence of discrimination. Why?
- More females apply for the English programme, but it is hard to get into. More males applied to Engineering, which has a higher acceptance rate than English. Must look deeper than a single cross-tab to find this out!
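The paradox can be verified directly from the counts in the tables. The sketch below computes the acceptance rates per programme and for both programmes combined; the data structure and names are illustrative choices.

```python
# (accepted, refused) counts taken from the two programme tables.
counts = {
    "Engineering": {"Male": (30, 30), "Female": (10, 10)},
    "English": {"Male": (5, 15), "Female": (10, 30)},
}


def acceptance_rate(accepted, refused):
    """Share of applicants who were accepted."""
    return accepted / (accepted + refused)


# Acceptance rate within each programme, by sex.
by_programme = {
    programme: {sex: acceptance_rate(*cell) for sex, cell in sexes.items()}
    for programme, sexes in counts.items()
}

# Acceptance rate after pooling both programmes (the single cross-tab).
overall = {}
for sex in ("Male", "Female"):
    accepted = sum(counts[programme][sex][0] for programme in counts)
    refused = sum(counts[programme][sex][1] for programme in counts)
    overall[sex] = acceptance_rate(accepted, refused)

for programme, rates in by_programme.items():
    print(f"{programme}: male {rates['Male']:.0%}, female {rates['Female']:.0%}")
print(f"Overall: male {overall['Male']:.0%}, female {overall['Female']:.0%}")
```

Within each programme the rates are identical for both sexes, yet the pooled rates differ because the two sexes apply to the programmes in different proportions.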
