Hypothesis Testing With One-Way ANOVA - University Of Michigan

Upload by : Ryan Jay

Hypothesis Testing with One-Way ANOVA
Statistics
Arlo Clark-Foos

Conceptual Refresher
1. Standardized z distribution of scores and of means can be represented as percentile rankings.
2. t distribution of means, mean differences, and differences between means can all be standardized, allowing us to analyze differences between 2 means.
3. Numerator of test statistic is always some difference (between scores, means, mean differences, or differences between means).
4. Denominator represents some measure of variability (or form of standard deviation).

Calculating Refresher
- Test Statistics
  - Numerator: Differences between groups
    - Example: Men are taller than women
  - Denominator: Variability within groups
    - Example: Not all men/women are the same height
* There is overlap between these distributions.
$z = \frac{M - \mu_M}{\sigma_M}$
$t = \frac{M - \mu_M}{s_M}$
$t = \frac{(M_X - M_Y) - (\mu_X - \mu_Y)}{s_{Difference}} = \frac{M_X - M_Y}{s_{Difference}}$

Analysis of Variance (ANOVA)
- Hypothesis test typically used with one or more nominal IVs (with at least 3 groups overall) and an interval DV.
- t test: Distance between two distributions
- F ratio: Uses two measures of variability

F Ratio (Sir Ronald Fisher)
$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$
- Between-Groups Variance: An estimate of the population variance based on the differences among the means of the samples
- Within-Groups Variance: An estimate of the population variance based on the differences within each of the three or more sample distributions

More than two groups
- Example: Speech rates in America, Japan, & Wales
- t test? t test? t test? (one for each pair of countries)
- Two Sources of Variance: Between & Within

Problem of Too Many Tests
p(A) AND p(B) = p(A) × p(B) (independent events)
p(A) OR p(B) = p(A) + p(B) (mutually exclusive events)
- The probability of a Type I error (rejecting the null when the null is true) greatly increases with the number of comparisons.
Fishing Expedition
"If you torture the data long enough, the numbers will prove anything you want" (Bernstein, 1996)
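To make the inflation concrete, here is a minimal Python sketch. It assumes the tests are independent and each uses the same p level (alpha), in which case the chance of at least one Type I error is 1 - (1 - alpha)^k. The function names are illustrative, not from the slides.

```python
from math import comb


def pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct two-group comparisons among n_groups groups."""
    return comb(n_groups, 2)


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n_tests tests.

    Assumption: the tests are independent and each uses the same alpha.
    """
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    for groups in (2, 3, 4, 5):
        k = pairwise_comparisons(groups)
        rate = familywise_error_rate(0.05, k)
        print(f"{groups} groups -> {k} t tests -> P(at least one Type I error) = {rate:.3f}")
```

With three groups (America, Japan, Wales) there are three possible t tests, and under these assumptions the overall Type I error risk is already about .14 rather than .05.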


Types of ANOVA
- Always preceded by two adjectives
  1. Number of Independent Variables
  2. Experimental Design
- One-Way ANOVA: Hypothesis test that includes one nominal IV with more than two levels and an interval DV.
- Within-Groups One-Way ANOVA: ANOVA where each sample is composed of the same participants (AKA repeated measures ANOVA).
- Between-Groups One-Way ANOVA: ANOVA where each sample is composed of different participants.

Assumptions of ANOVA (figure from 1st edition of textbook)

Assumption of Homoscedasticity
- Homoscedastic populations have the same variance
- Heteroscedastic populations have different variances

to the Six Steps
- Research Question:
  - What influences foreign students to choose an American graduate program? In particular, how important are financial aspects to students in Arts & Sciences, Education, Law, & Business?
- Data Source:
  - Survey of 17 graduate students from foreign countries currently enrolled in universities in the U.S.
Importance Scores
Arts & Sciences: 4, 5, 4, 3, 4
Education: 4, 3, 4, 4
Law: 3, 3, 2, 3
Business: 4, 4, 4, 3

1. Identify
- Populations: All foreign graduate students enrolled in programs in the U.S.
- Comparison Distribution: F distribution
- Test: One-Way Between-Subjects ANOVA
- Assumptions:
  - Participants not randomly selected
    - Be careful generalizing results
  - Not clear if population dist. are normal. Data are not skewed.
  - Homoscedasticity
    - We will return to this later during calculations. Don't forget!

2. Hypotheses
- Null: Foreign graduate students in Arts & Sciences, Education, Law, and Business all rate financial factors the same, on average.
  $\mu_1 = \mu_2 = \mu_3 = \mu_4$
- Research: Foreign graduate students in Arts & Sciences, Education, Law, and Business do not all rate financial factors the same, on average.
  $\mu_1 \neq \mu_2 \neq \mu_3 \neq \mu_4$ (that is, at least one mean differs)

3. Determine characteristics
- More than 2 groups and interval DV: F distribution
- df for each sample: $df = N_{Sample} - 1$
  - Arts & Sciences: $df_1 = 5 - 1 = 4$
  - Education: $df_2 = 4 - 1 = 3$
  - Law: $df_3 = 4 - 1 = 3$
  - Business: $df_4 = 4 - 1 = 3$
- $df_{Between} = N_{Groups} - 1 = 4 - 1 = 3$
  - Numerator df
- $df_{Within} = df_1 + df_2 + df_3 + df_4 = 4 + 3 + 3 + 3 = 13$
  - Denominator df
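The degrees-of-freedom bookkeeping above can be sketched in a few lines of Python. The group sizes come from the slide; the variable names are illustrative.

```python
# Group sizes for the foreign graduate student example (from the slide).
group_sizes = {"Arts & Sciences": 5, "Education": 4, "Law": 4, "Business": 4}

# df for each sample: N_Sample - 1
df_per_group = {name: n - 1 for name, n in group_sizes.items()}

# Numerator df: number of groups minus 1
df_between = len(group_sizes) - 1

# Denominator df: sum of the per-sample df
df_within = sum(df_per_group.values())

# Total df: total N minus 1, which equals df_between + df_within
df_total = sum(group_sizes.values()) - 1

print(df_per_group)
print("df_between =", df_between, "df_within =", df_within, "df_total =", df_total)
```

The total df (N_Total - 1) is a convenient check, since it should equal df_Between + df_Within.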

4. Determine Critical Values
p = .05
dfBetween = 3
dfWithin = 13
FCritical = 3.41

5. Calculate the Test Statistic
- In order to do this, we need 2 measures of variance
  - Between-Groups Variance
  - Within-Groups Variance
- We will do this shortly

6. Make a Decision
- If our calculated test statistic exceeds our cutoff, we reject the null hypothesis and can say the following:
  "Foreign graduate students studying in the U.S. rate financial factors differently depending on the type of program in which they are enrolled."
- ANOVA does not tell us where our differences are!
- We just know that there is a difference somewhere.

Logic of ANOVA: Quantifying Overlap
$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$
- Whenever differences between sample means are large and differences between scores within each sample are small, the F statistic will be large.
- Remember that large test statistics indicate statistically significant results

Logic of ANOVA: Quantifying Overlap
a) Large within-groups variability & small between-groups variability
b) Large within-groups variability & large between-groups variability
c) Small within-groups variability & small between-groups variability
(Figure annotation: Less Overlap!)

Logic of ANOVA: Quantifying Overlap
$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$
- If between-groups = within-groups, F = 1
- Null hypothesis predicts F = 1
  - No differences between groups
- Within-groups variance based on scores, between-groups variance based on means.
  - Need correction.

Calculating the F Statistic: The Source Table
- Source Table: Presents the important calculations and final results of an ANOVA in a consistent and easy-to-read format.

Calculating the F Statistic: The Source Table
Col. 1: The sources of variability
Col. 2: Sum of Squares
Col. 3: Degrees of freedom
Col. 4: Mean Square: arithmetic average of squared deviations
Col. 5: Value of test statistic, F ratio
$MS_{Between} = \frac{SS_{Between}}{df_{Between}}$
$MS_{Within} = \frac{SS_{Within}}{df_{Within}}$
$F = \frac{MS_{Between}}{MS_{Within}}$

Sums of Squared Deviations
Put all of your scores in one column, with samples denoted in another column. (Table from 1st edition of textbook.)
$SS_{Total} = \Sigma (X - GM)^2$
Grand Mean: Refers to the mean of all scores in a study, regardless of their sample.
$GM = \frac{\Sigma(X)}{N_{Total}}$

Sums of Squared Deviations
$SS_{Within} = \Sigma (X - M)^2$
Calculate the squared deviation of each score from its own particular sample mean. (Table from 1st edition of textbook.)

Sums of Squared Deviations
$SS_{Between} = \Sigma (M - GM)^2$
Calculate the squared deviation of each sample mean from the grand mean. (Table from 1st edition of textbook.)
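A minimal Python sketch of the three sums of squares, applied to the importance scores from the foreign graduate student example. The score lists are an assumption: they are read off the flattened table in the transcript, with five scores assigned to Arts & Sciences to match df1 = 5 - 1 = 4. Variable names are illustrative.

```python
# Importance scores by program (reconstructed from the transcript; an assumption).
scores = {
    "Arts & Sciences": [4, 5, 4, 3, 4],
    "Education": [4, 3, 4, 4],
    "Law": [3, 3, 2, 3],
    "Business": [4, 4, 4, 3],
}

all_scores = [x for group in scores.values() for x in group]
grand_mean = sum(all_scores) / len(all_scores)            # GM = sum(X) / N_Total
group_means = {name: sum(g) / len(g) for name, g in scores.items()}

# SS_Total: every score against the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# SS_Within: every score against its own sample mean
ss_within = sum((x - group_means[name]) ** 2
                for name, g in scores.items() for x in g)

# SS_Between: the sample mean against the grand mean, once per score in that sample
ss_between = sum((group_means[name] - grand_mean) ** 2
                 for name, g in scores.items() for _ in g)

print(f"GM = {grand_mean:.3f}")
print(f"SS_Total = {ss_total:.3f}, SS_Within = {ss_within:.3f}, SS_Between = {ss_between:.3f}")
```

A useful sanity check is that SS_Total should equal SS_Within + SS_Between, up to rounding.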

Sums of Squared Deviations (table from 1st edition of textbook)

Source Table for our Example (table from 1st edition of textbook)

What is our decision?
- Back to Step 1.
  - Homoscedasticity (table from 1st edition of textbook)
  - Because the largest variance (.500) is not more than twice (unequal sample sizes) the smallest variance (.251), we have met this assumption.

What is our decision?
- Step 6. Make a decision
  F = 3.94 > Fcrit = 3.41
- We can reject the null hypothesis. There is (are) a difference somewhere.
- Where?
  - post-hoc test: Statistical procedure frequently carried out after we reject the null hypothesis in an ANOVA; it allows us to make multiple comparisons among several means.
  - post-hoc: Latin for "after this"
  - Examples: Tukey's HSD, Scheffé, Dunnett, Duncan, Bonferroni
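The whole calculation for this example can be sketched as one small Python function that builds the source table and compares F with the cutoff. The score lists are an assumption (reconstructed from the flattened table, with five Arts & Sciences scores to match df1 = 4), the function name is illustrative, and the critical value 3.41 is taken from the slide rather than computed.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the source-table entries for a between-groups one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "SS_between": ss_between, "SS_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "MS_between": ms_between, "MS_within": ms_within,
        "F": ms_between / ms_within,
    }


# Arts & Sciences, Education, Law, Business (reconstructed scores; an assumption)
groups = [[4, 5, 4, 3, 4], [4, 3, 4, 4], [3, 3, 2, 3], [4, 4, 4, 3]]
table = one_way_anova(groups)
f_crit = 3.41  # from the slide: p = .05, df = (3, 13)

print(f"F({table['df_between']}, {table['df_within']}) = {table['F']:.2f}")
print("Reject the null hypothesis" if table["F"] > f_crit else "Fail to reject the null hypothesis")
```

Under these assumptions the sketch gives F of about 3.94, which exceeds the cutoff of 3.41, in line with the decision on the slide.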

Reporting ANOVA in APA Style
1. Italic letter F: F
2. Open parenthesis: F(
3. Between-groups df, then comma: F(dfBetween,
4. Within-groups df: F(dfBetween, dfWithin
5. Close parenthesis, equal sign: F(dfBetween, dfWithin) =
6. F statistic, then comma: F(dfBetween, dfWithin) = 1.23,
7. Lowercase, italic letter p: F(dfBetween, dfWithin) = 1.23, p
8. Significant, less than .05: F(dfBetween, dfWithin) = 1.23, p < .05
   - OR non significant: F(dfBetween, dfWithin) = 1.23, p > .05
   - OR exact p value (another example): F(dfBetween, dfWithin) = 1.23, p = .02
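As a rough illustration of the eight steps, the Python sketch below assembles the report string. The function name and its parameters are hypothetical, the p values passed in are made up for the demonstration, and plain text cannot show the italics that APA style requires for F and p.

```python
def format_apa_f(f_value, df_between, df_within, p_value, alpha=0.05, exact=False):
    """Build an APA-style ANOVA report such as 'F(3, 13) = 3.94, p < .05'."""

    def no_leading_zero(x):
        # APA style drops the leading zero for values that cannot exceed 1.
        return f"{x:.2f}".lstrip("0")

    if exact:
        p_part = f"p = {no_leading_zero(p_value)}"
    elif p_value < alpha:
        p_part = f"p < {no_leading_zero(alpha)}"
    else:
        p_part = f"p > {no_leading_zero(alpha)}"
    return f"F({df_between}, {df_within}) = {f_value:.2f}, {p_part}"


# The p values below are illustrative placeholders, not results from the slides.
print(format_apa_f(1.23, 2, 14, 0.02))              # significant
print(format_apa_f(1.23, 2, 14, 0.30))              # non significant
print(format_apa_f(1.23, 2, 14, 0.02, exact=True))  # exact p value
```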

Between-Subjects One-Way ANOVA
Example: Memory for Emotional Stimuli

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
- Do you have differences in memory for emotional vs. neutral events?
- Do others have the same differences or is it something unique to you?
- Let's find out.
- Research Question: Will people asked to study pure lists of either positive, negative, or neutral pictures have differences in recall of those pure lists?
- Research Design: We asked 17 participants to study one single list of either 30 positive, 30 negative, or 30 neutral pictures (from IAPS). Following a brief delay, all participants were asked to recall as many of the 30 studied photos as they could. These data are on the following slide.

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
Already Stated: NTotal = 17, one IV with 3 levels (Emotion) is between-subjects.
Below are the proportions of pictures on their studied lists that each participant successfully recalled (100% = perfect).
Negative   Neutral   Positive
0.69       0.59      0.64
0.84       0.64      0.73
0.93       0.62      0.51
0.91       0.71      0.68
0.89       0.50      0.61
0.90       0.60
M = .86    M = .61   M = .634

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
Already Stated/Calculated
NTotal = 17; NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4; dfBetween = 2, dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634
- Six Steps to Hypothesis Testing again!
1. Population: All memories for negative, neutral, and positive events.
   Comparison Distribution: F distribution
   Test: One-Way Between-Subjects ANOVA
   - Assumptions:
     - Participants were randomly selected from subject pool
     - Not clear if population dist. are normal. Data are not skewed.
     - Homoscedasticity

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
2. Hypotheses
Null: On average, memories for negative, neutral, and positive pictures will not differ.
$\mu_{Neg} = \mu_{Neut} = \mu_{Pos}$
Research: On average, memories for negative, neutral, and positive pictures will be different.
$\mu_{Neg} \neq \mu_{Neut} \neq \mu_{Pos}$

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
3. Determine characteristics
- More than 2 groups and interval DV: F distribution
Group statistics from the recall data:
Negative: M = .86, s2 = .00784
Neutral: M = .61, s2 = .00472
Positive: M = .634, s2 = .00683

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
Digression: Test for Homoscedasticity
Rule: If sample sizes differ across conditions, largest variance must not be more than twice (2x) the smallest variance.
Negative: M = .86, s2 = .00784
Neutral: M = .61, s2 = .00472
Positive: M = .634, s2 = .00683
.00472 * 2 = .00944
.00784 < .00944, so this assumption is met.
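The rule of thumb above is easy to express in Python. This is a sketch of that rule only, not a formal test of equal variances; the function name and the max_ratio parameter are illustrative, and the variances are the ones stated on the slide.

```python
def variances_similar(variances, max_ratio=2.0):
    """Rule of thumb: the largest variance is not more than max_ratio times the smallest."""
    return max(variances) <= max_ratio * min(variances)


# Sample variances for the memory example (from the slide).
memory_variances = {"Negative": 0.00784, "Neutral": 0.00472, "Positive": 0.00683}

largest = max(memory_variances.values())
smallest = min(memory_variances.values())
print(f"largest = {largest}, twice the smallest = {2 * smallest:.5f}")
print("Assumption met" if variances_similar(memory_variances.values()) else "Assumption violated")
```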

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
4. Determine critical values
Already Stated/Calculated
NTotal = 17; NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4; dfBetween = 2, dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634
s2 = .00784 (Negative), s2 = .00472 (Neutral), s2 = .00683 (Positive)
Fcrit = 3.74

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic
$GM = \frac{\Sigma(X)}{N_{Total}}$
$SS_{Total} = \Sigma (X - GM)^2$
$SS_{Within} = \Sigma (X - M)^2$
$SS_{Between} = \Sigma (M - GM)^2$
Source     SS    df    MS    F
Between          2
Within           14
Total            16

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic
$SS_{Total} = \Sigma (X - GM)^2$, with $GM = \frac{\Sigma(X)}{N_{Total}} = .7053$
X       (X - GM)   (X - GM)^2
0.69    -0.02      0.0002
0.84    0.135      0.0181
…       …          …
0.61    -0.1       0.0091
SSTotal = .3135

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic
$SS_{Within} = \Sigma (X - M)^2$, with MNeg = .86, MNeut = .61, MPos = .634
X       (X - M)    (X - M)^2
0.69    -0.17      0.0289
0.84    -0.02      0.0004
…       …          …
0.51    -0.124     0.0154
0.68    0.046      0.0021
0.61    -0.024     0.0006
SSWithin = .0901

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic
$SS_{Between} = \Sigma (M - GM)^2$, with GM = .7053
X       M       (M - GM)   (M - GM)^2
0.69    0.86    0.155      0.024
0.84    0.86    0.155      0.024
…       …       …          …
0.51    0.634   …          …
0.68    0.634   …          …
0.61    0.634   …          …
SSBetween = .2232

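Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic
Source table entries that survive in the transcript: MSWithin = .0064; Total: SS = .3135, df = 16
$MS_{Between} = \frac{SS_{Between}}{df_{Between}}$
$MS_{Within} = \frac{SS_{Within}}{df_{Within}}$
$F = \frac{MS_{Between}}{MS_{Within}}$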

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
6. Make a decision
Source table entries that survive in the transcript: Within: SS = .0901, df = 14, MS = .0064; Total: SS = .3135, df = 16
Fcrit = 3.74

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
6. Make a decision
F = 17.97 > Fcrit = 3.74
Recall of negative, neutral, and positive pictures was different, F(2, 14) = 19.97, p < .05.
But which pictures were remembered best? Worst?

A Priori & Post-Hoc Tests

Hindsight is 20-20
- Although your data may suggest a new relationship, and thus new analyses…
- Theory should guide research, and thus comparisons should be decided on before you conduct your experiment.

Planned & A Priori Comparisons
- Based on literature review
- Theoretical
- Planned comparisons
  - A test that is conducted when there are multiple groups of scores, but specific comparisons have been specified prior to data collection.
- A Priori Comparisons

Planned & A Priori Comparisons
- If you have planned comparisons…
  - Just run t tests
  - Subjective decision about p value
    - p = .05?
    - p = .01?
    - Bonferroni Correction?

Post-Hoc: Tukey HSD
- Tukey Honestly Significant Difference
  - Determines differences between means in terms of standard error
  - 'Honest' because we adjust for making multiple comparisons
  - The HSD is compared to a critical value
- Overview
  1. Calculate differences between a pair of means
  2. Divide this difference by the standard error
  * Basically this is a variant of a t test *
Oh no, that means the six steps again… sort of.

Tukey HSD
$HSD = \frac{M_1 - M_2}{s_M}$
$t = \frac{M_1 - M_2}{s_{Difference}}$
- For Tukey HSD, standard error is calculated differently depending on whether your sample sizes are equal or not.

Tukey HSD
- Equal Sample Sizes
  $s_M = \sqrt{\frac{MS_{Within}}{N}}$, where N = sample size within each group
- Unequal Sample Sizes
  $s_M = \sqrt{\frac{MS_{Within}}{N'}}$, where $N' = \frac{N_{Groups}}{\Sigma(1/N)}$

Tukey HSD
- Determine Critical Value from Table
- Make a Decision
- Let's go back to our memory for emotional pictures example

Tukey HSD: Example
- Memory for Emotional Pictures Example: Between-Subjects One-Way ANOVA
- Decision: Recall of negative, neutral, and positive pictures was different, F(2, 14) = 19.97, p < .05
- Where are our differences?
- Let's get our qcrit first

Tukey HSD: Example
Already Stated/Calculated
NTotal = 17; NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4; dfBetween = 2 (k = 3), dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634
qcrit = 3.70

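Tukey HSD: Example
Already Stated/Calculated
NTotal = 17
Negative   Neutral   Positive
0.69       0.59      0.64
0.84       0.64      0.73
0.93       0.62      0.51
0.91       0.71      0.68
0.89       0.50      0.61
0.90       0.60
NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4; dfBetween = 2 (k = 3), dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634
qcrit = 3.70
Source table entries that survive in the transcript: Within: df = 14, MS = .0064; Total: SS = .3135, df = 16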

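Tukey HSD: Example
- Standard Error: Unequal Sample Sizes
$N' = \frac{N_{Groups}}{\Sigma(1/N)} = \frac{3}{\frac{1}{6} + \frac{1}{6} + \frac{1}{5}} = \frac{3}{.533} = 5.625$
$s_M = \sqrt{\frac{MS_{Within}}{N'}} = \sqrt{\frac{.0064}{5.625}} = \sqrt{.0011378} = 0.034$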
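The unequal-sample-size standard error can be checked with a short Python sketch. It uses the group sizes (6, 6, 5) and MSWithin = .0064 from the slides; the function names are illustrative.

```python
from math import sqrt


def adjusted_n(sample_sizes):
    """N' = N_Groups / sum(1 / N): the harmonic mean of the sample sizes."""
    return len(sample_sizes) / sum(1 / n for n in sample_sizes)


def tukey_standard_error(ms_within, sample_sizes):
    """s_M = sqrt(MS_Within / N'). With equal sample sizes, N' is just N."""
    return sqrt(ms_within / adjusted_n(sample_sizes))


sizes = [6, 6, 5]      # Negative, Neutral, Positive
ms_within = 0.0064     # from the source table

n_prime = adjusted_n(sizes)
s_m = tukey_standard_error(ms_within, sizes)
print(f"N' = {n_prime:.3f}, s_M = {s_m:.3f}")
```

Because the harmonic mean of equal sample sizes is that common size, the same function also covers the equal-N case.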

Tukey HSD: Example
- Negative (M = 0.86) vs. Neutral (M = 0.61)
  $HSD = \frac{M_1 - M_2}{s_M} = \frac{.86 - .61}{.034} = 7.35$
- Negative (M = 0.86) vs. Positive (M = 0.634)
  $HSD = \frac{M_1 - M_2}{s_M} = \frac{.86 - .634}{.034} = 6.65$
- Neutral (M = 0.61) vs. Positive (M = 0.634)
  $HSD = \frac{M_1 - M_2}{s_M} = \frac{.61 - .634}{.034} = -0.71$
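The three comparisons can be reproduced with the Python sketch below. It uses the means, the rounded standard error (.034), and qcrit = 3.70 from the slides, and compares the absolute value of each HSD with the cutoff. Variable names are illustrative.

```python
from itertools import combinations

means = {"Negative": 0.86, "Neutral": 0.61, "Positive": 0.634}
s_m = 0.034      # standard error from the slide (rounded)
q_crit = 3.70    # critical value from the slide (k = 3, df_Within = 14)

results = {}
for (name_1, m_1), (name_2, m_2) in combinations(means.items(), 2):
    hsd = (m_1 - m_2) / s_m
    results[(name_1, name_2)] = hsd
    verdict = "significant" if abs(hsd) > q_crit else "not significant"
    print(f"{name_1} vs. {name_2}: HSD = {hsd:.2f} ({verdict})")
```

Only the sign of the HSD depends on which mean is listed first, so the decision rests on its absolute value.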

Tukey HSD: Example
- Make a Decision
  - Post hoc comparisons using the Tukey HSD test revealed that negative pictures were better remembered (M = .86) than either positive (M = .634) or neutral (M = .61) pictures, with no differences between the latter two.

Bonferroni Correction
An alternative post-hoc strategy

Bonferroni Correction
Fishing Expedition
- Remember the problem of too many tests?
  - Inflates the risk of a Type I error.
  - False positives
- Is there a way to address that without a new test?
  - We've hinted at it already

Bonferroni Correction
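The content of this slide did not survive in the transcript. As a hedged illustration of the standard Bonferroni approach (not necessarily the slide's own wording), the overall p level is divided by the number of comparisons, and each individual test is held to that stricter cutoff. The function name and the example p values are hypothetical.

```python
def bonferroni_alpha(overall_alpha: float, n_comparisons: int) -> float:
    """Per-comparison p level that keeps the overall Type I error risk near overall_alpha."""
    return overall_alpha / n_comparisons


# Three groups give three pairwise comparisons.
per_test_alpha = bonferroni_alpha(0.05, 3)
print(f"Per-comparison p level: {per_test_alpha:.4f}")

# Hypothetical p values for three pairwise t tests (made up for illustration).
p_values = {"A vs. B": 0.004, "A vs. C": 0.030, "B vs. C": 0.450}
for pair, p in p_values.items():
    verdict = "significant" if p < per_test_alpha else "not significant"
    print(f"{pair}: p = {p:.3f} -> {verdict}")
```

Note that a comparison with p = .030 would pass an uncorrected .05 cutoff but not the corrected cutoff of about .0167.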

Summary
- Between-Subjects One-Way ANOVA
- Two Sources of Variance
- New Sums of Squares
- New df
- Homoscedasticity
- The problem of too many tests
- Source Table
- Post-Hoc tests
  - Tukey's HSD
  - Bonferroni
  - LSD
  - etc.

