# ECE531 Lecture 4a: Neyman-Pearson Hypothesis Testing


D. Richard Brown III
Worcester Polytechnic Institute
12-February-2009

## Hypothesis Testing: What We Know

- Bayesian decision rules: minimize the expected/weighted risk for a particular prior distribution $\pi$.
- Minimax decision rules: minimize the worst-case risk exposure over all possible prior distributions.

Example: To approve a new flu test, the FDA requires the test to have a false positive rate of no worse than 10%.

- Should we use the Bayes criterion?
- Should we use the minimax criterion?
- How do we assign a risk structure to this sort of problem?

In many hypothesis testing problems, there is a fundamental asymmetry between the consequences of a "false positive" (decide $H_1$ when the true state is $x_0$) and a "miss / false negative" (decide $H_0$ when the true state is $x_1$).

## The Neyman-Pearson Criterion and Terminology

For now, we will focus on simple binary hypothesis testing under the uniform cost assignment (UCA).

$$R_0(\rho) = \mathrm{Prob}(\text{decide } H_1 \mid \text{state is } x_0) =: P_{\mathrm{fp}}$$

is the probability of "false positive" or probability of "false alarm", and

$$R_1(\rho) = \mathrm{Prob}(\text{decide } H_0 \mid \text{state is } x_1) =: P_{\mathrm{fn}}$$

is the probability of "false negative" or "missed detection".

**Definition.** The Neyman-Pearson criterion decision rule is given as

$$\rho_{\mathrm{NP}} = \arg\min_{\rho} P_{\mathrm{fn}}(\rho) \quad \text{subject to} \quad P_{\mathrm{fp}}(\rho) \le \alpha$$

where $\alpha \in [0,1]$ is called the "significance level" of the test.

## The N-P Criterion: 3 Coin Flips ($q_0 = 0.5$, $q_1 = 0.8$, $\alpha = 0.1$)

[Figure: conditional risk vectors $(R_0, R_1)$ for the three-coin-flip problem, showing the N-P conditional risk vector at $R_0 = \alpha = 0.1$, the minimax CRV, the Bayes CRV for $\lambda = 0.6$, and the deterministic rules $D_8$, $D_{12}$, $D_{14}$, $D_{15}$, $D_0$, $D_1$.]

## Neyman-Pearson Hypothesis Testing Example

Coin flipping problem with a probability of heads of either $q_0 = 0.5$ or $q_1 = 0.8$. We observe three flips of the coin and count the number of heads. We can form our conditional probability matrix

$$P = \begin{bmatrix} 0.125 & 0.008 \\ 0.375 & 0.096 \\ 0.375 & 0.384 \\ 0.125 & 0.512 \end{bmatrix}$$

where $P_{\ell j} = \mathrm{Prob}(\text{observe } \ell \text{ heads} \mid \text{state is } x_j)$.

Suppose we need a test with a significance level of $\alpha = 0.125$.

- What is the N-P decision rule in this case?
- What is the probability of correct detection if we use this N-P decision rule?
- What happens if we relax the significance level to $\alpha = 0.5$?
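The questions above can be checked numerically. The following sketch (my own illustration, not from the slides) rebuilds $P$ from the binomial pmf and evaluates the deterministic rule "decide $H_1$ only when all three flips are heads", whose false-positive probability happens to equal $\alpha = 0.125$ exactly:

```python
from math import comb

# Conditional probabilities: P0[l], P1[l] = Prob(l heads | state x_j)
# for n = 3 flips with heads probabilities q0 = 0.5, q1 = 0.8.
n, q0, q1 = 3, 0.5, 0.8
P0 = [comb(n, l) * q0**l * (1 - q0)**(n - l) for l in range(n + 1)]
P1 = [comb(n, l) * q1**l * (1 - q1)**(n - l) for l in range(n + 1)]

# Rule for alpha = 0.125: decide H1 only when l = 3 heads are observed.
pfp = P0[3]   # false-positive probability = 0.125, exactly alpha
pd = P1[3]    # probability of correct detection = 0.512

# Relaxing to alpha = 0.5 lets us also decide H1 when l = 2.
pfp_relaxed = P0[3] + P0[2]   # 0.125 + 0.375 = 0.5
pd_relaxed = P1[3] + P1[2]    # 0.512 + 0.384 = 0.896
```

No randomization is needed at either significance level here, because the accumulated false-positive mass lands exactly on $\alpha$ in both cases.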

## Intuition: The Hiker

You are going on a hike and you have a budget of $5 to buy food for the hike. The general store has the following food items for sale:

- One box of crackers: $1 and 60 calories
- One candy bar: $2 and 200 calories
- One bag of potato chips: $2 and 160 calories
- One bag of nuts: $3 and 270 calories

You would like to purchase the maximum calories subject to your $5 budget. What should you buy? What if there were two candy bars available?

The idea here is to rank the items by decreasing value (calories per dollar) and then purchase items with the most value until all the money is spent. The final purchase may only need to be a fraction of an item.
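The ranking idea above is a greedy fractional-knapsack procedure. A minimal sketch (my own, using the slide's prices and calories):

```python
# Greedy fractional knapsack: (name, price in dollars, calories).
items = [("crackers", 1, 60), ("candy bar", 2, 200),
         ("chips", 2, 160), ("nuts", 3, 270)]
budget = 5.0

# Rank by value (calories per dollar), descending.
items.sort(key=lambda it: it[2] / it[1], reverse=True)

calories = 0.0
for name, price, cal in items:
    if budget <= 0:
        break
    fraction = min(1.0, budget / price)  # final purchase may be fractional
    calories += fraction * cal
    budget -= fraction * price
# Best buy: candy bar ($2) + nuts ($3) for 470 calories.
```

With a second candy bar on the shelf, the greedy order would buy both candy bars ($4, 400 calories) and then a 1/3 fraction of the nuts, for 490 calories. The same "spend on the highest value first, possibly fractionally" logic is what drives the discrete N-P construction on the next slide, with $P_{\ell,1}$ playing the role of calories and $P_{\ell,0}$ the role of price.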

## N-P Hypothesis Testing With Discrete Observations

Basic idea:

- Sort the likelihood ratios $L_\ell = \frac{P_{\ell,1}}{P_{\ell,0}}$ by observation index in descending order. The order of $L$'s with the same value doesn't matter.
- Now pick $v$ to be the smallest value such that
$$P_{\mathrm{fp}} = \sum_{\ell\,:\,L_\ell > v} P_{\ell,0} \le \alpha$$
- This defines a deterministic decision rule
$$\delta_v(y_\ell) = \begin{cases} 1 & L_\ell > v \\ 0 & \text{otherwise} \end{cases}$$

If we can find a value of $v$ such that $P_{\mathrm{fp}} = \sum_{\ell : L_\ell > v} P_{\ell,0} = \alpha$, then we are done. The probability of detection is then

$$P_D = \sum_{\ell\,:\,L_\ell > v} P_{\ell,1} = \beta.$$

## N-P Hypothesis Testing With Discrete Observations

If we cannot find a value of $v$ such that $P_{\mathrm{fp}} = \sum_{\ell : L_\ell > v} P_{\ell,0} = \alpha$, then it must be the case that, for any $\epsilon > 0$,

$$P_{\mathrm{fp}}(\delta_v) = \sum_{\ell\,:\,L_\ell > v} P_{\ell,0} < \alpha \quad \text{and} \quad P_{\mathrm{fp}}(\delta_{v-\epsilon}) = \sum_{\ell\,:\,L_\ell > v-\epsilon} P_{\ell,0} > \alpha$$

[Figure: $P_{\mathrm{fp}}$ as a function of the threshold, showing the jump at $v$ with $P_{\mathrm{fp}}(\delta_{v-\epsilon}) > \alpha > P_{\mathrm{fp}}(\delta_v)$.]

In this case, we must randomize between $\delta_v$ and $\delta_{v-\epsilon}$.

## N-P Randomization

We form the usual convex combination between $\delta_v$ and $\delta_{v-\epsilon}$ as

$$\rho = (1-\gamma)\,\delta_v + \gamma\,\delta_{v-\epsilon} \quad \text{for } \gamma \in [0,1].$$

The false positive probability is then

$$P_{\mathrm{fp}} = (1-\gamma)\,P_{\mathrm{fp}}(\delta_v) + \gamma\,P_{\mathrm{fp}}(\delta_{v-\epsilon}).$$

Setting this equal to $\alpha$ and solving for $\gamma$ yields

$$\gamma = \frac{\alpha - P_{\mathrm{fp}}(\delta_v)}{P_{\mathrm{fp}}(\delta_{v-\epsilon}) - P_{\mathrm{fp}}(\delta_v)} = \frac{\alpha - \sum_{\ell : L_\ell > v} P_{\ell,0}}{\sum_{\ell : L_\ell = v} P_{\ell,0}}.$$
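The sorting, threshold selection, and randomization steps can be put together in one routine. This is a sketch of my own (the helper name `np_rule` is not from the slides); it scans candidate thresholds from the largest likelihood ratio downward and randomizes on the boundary set $\{L(y) = v\}$ exactly as in the $\gamma$ formula above:

```python
def np_rule(P0, P1, alpha):
    """Discrete N-P rule: return (v, gamma, Pfp, PD) at significance level alpha."""
    L = [p1 / p0 for p0, p1 in zip(P0, P1)]
    for v in sorted(set(L), reverse=True):
        above = sum(P0[l] for l in range(len(L)) if L[l] > v)   # mass strictly above v
        at = sum(P0[l] for l in range(len(L)) if L[l] == v)     # mass exactly at v
        if above + at >= alpha:
            gamma = (alpha - above) / at          # randomize on {L(y) = v}
            pfp = above + gamma * at              # equals alpha by construction
            pd = (sum(P1[l] for l in range(len(L)) if L[l] > v)
                  + gamma * sum(P1[l] for l in range(len(L)) if L[l] == v))
            return v, gamma, pfp, pd
    return 0.0, 1.0, sum(P0), sum(P1)             # alpha >= 1: always decide H1

# Three-coin-flip example with alpha = 0.1: randomization is required,
# since no deterministic rule hits Pfp = 0.1 exactly.
P0 = [0.125, 0.375, 0.375, 0.125]
P1 = [0.008, 0.096, 0.384, 0.512]
v, gamma, pfp, pd = np_rule(P0, P1, 0.1)  # gamma = 0.8 on the all-heads outcome
```

For $\alpha = 0.1$ the rule decides $H_1$ with probability $\gamma = 0.1/0.125 = 0.8$ when three heads are observed, giving $P_D = 0.8 \times 0.512 = 0.4096$, which matches the N-P conditional risk vector at $R_0 = 0.1$ shown on the earlier plot.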

## N-P Decision Rule With Discrete Observations

The Neyman-Pearson decision rule for simple binary hypothesis testing with discrete observations is then

$$\rho_{\mathrm{NP}}(y) = \begin{cases} 1 & \text{if } L(y) > v \\ \gamma & \text{if } L(y) = v \\ 0 & \text{if } L(y) < v \end{cases}$$

where

$$L(y) := \frac{\mathrm{Prob}(\text{observe } y \mid \text{state is } x_1)}{\mathrm{Prob}(\text{observe } y \mid \text{state is } x_0)} = \frac{P_{\ell,1}}{P_{\ell,0}}$$

and $v \ge 0$ is the minimum value such that

$$P_{\mathrm{fp}} = \sum_{\ell\,:\,L_\ell > v} P_{\ell,0} \le \alpha.$$

## Example: 10 Coin Flips

Coin flipping problem with a probability of heads of either $q_0 = 0.5$ or $q_1 = 0.8$. We observe ten flips of the coin and count the number of heads.

| $\ell$ | $P_{\ell,0}$ | $P_{\ell,1}$ | $L_\ell$ |
|---|---|---|---|
| 0 | 0.0010 | 0.0000 | 0.0001 |
| 1 | 0.0098 | 0.0000 | 0.0004 |
| 2 | 0.0439 | 0.0001 | 0.0017 |
| 3 | 0.1172 | 0.0008 | 0.0067 |
| 4 | 0.2051 | 0.0055 | 0.0268 |
| 5 | 0.2461 | 0.0264 | 0.1074 |
| 6 | 0.2051 | 0.0881 | 0.4295 |
| 7 | 0.1172 | 0.2013 | 1.7180 |
| 8 | 0.0439 | 0.3020 | 6.8719 |
| 9 | 0.0098 | 0.2684 | 27.4878 |
| 10 | 0.0010 | 0.1074 | 109.9512 |

What is $v$, $\rho_{\mathrm{NP}}(y)$, and $\beta$ when $\alpha = 0.001$, $\alpha = 0.01$, $\alpha = 0.1$?
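The columns of the table are just the two binomial pmfs and their ratio, so they can be regenerated directly (a quick check of my own, not code from the slides):

```python
from math import comb

# Reproduce P and L for n = 10 flips, q0 = 0.5, q1 = 0.8.
n, q0, q1 = 10, 0.5, 0.8
P0 = [comb(n, l) * q0**l * (1 - q0)**(n - l) for l in range(n + 1)]
P1 = [comb(n, l) * q1**l * (1 - q1)**(n - l) for l in range(n + 1)]
L = [p1 / p0 for p0, p1 in zip(P0, P1)]
# Note L_l = (q1/q0)**l * ((1-q1)/(1-q0))**(n-l) = 1.6**l * 0.4**(10-l),
# so the likelihood ratio is strictly increasing in the head count l.
```

Because $L_\ell$ is monotone in $\ell$, the N-P rule reduces to a threshold on the number of heads, with randomization on the boundary count when the cumulative $P_{\ell,0}$ mass overshoots $\alpha$.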

## Example: Randomized vs. Deterministic Decision Rules

[Figure: two panels versus the significance level $\alpha$ on a log scale from $10^{-4}$ to $10^0$: the threshold $v$ (top) and the power $\beta$ (bottom), comparing randomized and deterministic decision rules.]

## Example: Same Results Except Linear Scale

[Figure: the power $\beta$ versus $\alpha$ on a linear scale, comparing randomized and deterministic decision rules.]

## Remarks: 1 of 3

The blue line on the previous slide is called the Receiver Operating Characteristic (ROC). An ROC plot shows the probability of detection $P_D = 1 - R_1$ as a function of $\alpha = R_0$. The ROC plot is directly related to our conditional risk vector plot.

[Figure: the conditional risk vectors, $R_1 = \mathrm{Pr}(\text{false negative})$ versus $R_0 = \mathrm{Pr}(\text{false positive})$ (left), and the corresponding ROC curve, $1 - R_1 = \mathrm{Pr}(\text{true positive})$ versus $R_0$ (right).]

## Remarks: 2 of 3

The N-P criterion seeks a decision rule that maximizes the probability of detection subject to the constraint that the probability of false alarm must be no greater than $\alpha$:

$$\rho_{\mathrm{NP}} = \arg\max_{\rho} P_D(\rho) \quad \text{s.t.} \quad P_{\mathrm{fp}}(\rho) \le \alpha$$

- The term **power** is often used instead of "probability of detection".
- The N-P decision rule is sometimes called the "most powerful test of significance level $\alpha$".
- Intuitively, we can expect that the power of a test will increase with the significance level of the test.

## Remarks: 3 of 3

Like Bayes and minimax, the N-P decision rule for simple binary hypothesis testing problems is just a likelihood ratio comparison (possibly with randomization).

Can the same intuition that you developed for the discrete observation case be applied in the continuous observation case?

- Form $L(y) = \frac{p_1(y)}{p_0(y)}$.
- Find the smallest $v$ such that the decision rule
$$\rho_{\mathrm{NP}}(y) = \begin{cases} 1 & \text{if } L(y) > v \\ \gamma & \text{if } L(y) = v \\ 0 & \text{if } L(y) < v \end{cases}$$
has $P_{\mathrm{fp}} = \alpha$.

The answer is "yes", but we need to formalize this claim by understanding the fundamental Neyman-Pearson lemma.

## The Neyman-Pearson Lemma, Part 1 of 3: Optimality

Recall that $p_j(y)$ for $j \in \{0,1\}$ and $y \in \mathcal{Y}$ is the conditional pmf or pdf of the observation $y$ given that the state is $x_j$.

**Lemma.** Let $\rho$ be any decision rule satisfying $P_{\mathrm{fp}}(\rho) \le \alpha$ and let $\rho'$ be any decision rule of the form

$$\rho'(y) = \begin{cases} 1 & \text{if } p_1(y) > v\,p_0(y) \\ \gamma(y) & \text{if } p_1(y) = v\,p_0(y) \\ 0 & \text{if } p_1(y) < v\,p_0(y) \end{cases}$$

where $v \ge 0$ and $0 \le \gamma(y) \le 1$ are such that $P_{\mathrm{fp}}(\rho') = \alpha$. Then $P_D(\rho') \ge P_D(\rho)$.

## The Neyman-Pearson Lemma, Part 1 of 3: Optimality

**Proof.** By the definitions of $\rho$ and $\rho'$, we always have $[\rho'(y) - \rho(y)][p_1(y) - v\,p_0(y)] \ge 0$. Hence

$$\int_{\mathcal{Y}} [\rho'(y) - \rho(y)][p_1(y) - v\,p_0(y)]\,dy \ge 0.$$

Rearranging terms, we can write

$$\int_{\mathcal{Y}} \rho'(y)\,p_1(y)\,dy - \int_{\mathcal{Y}} \rho(y)\,p_1(y)\,dy \ge v\left[\int_{\mathcal{Y}} \rho'(y)\,p_0(y)\,dy - \int_{\mathcal{Y}} \rho(y)\,p_0(y)\,dy\right]$$

and therefore

$$P_D(\rho') - P_D(\rho) \ge v\left[P_{\mathrm{fp}}(\rho') - P_{\mathrm{fp}}(\rho)\right] = v\left[\alpha - P_{\mathrm{fp}}(\rho)\right].$$

But $v \ge 0$ and $P_{\mathrm{fp}}(\rho) \le \alpha$ imply that the RHS is non-negative. Hence $P_D(\rho') \ge P_D(\rho)$. $\blacksquare$

## The Neyman-Pearson Lemma, Part 2 of 3: Existence

**Lemma.** For every $\alpha \in [0,1]$ there exists a decision rule $\rho_{\mathrm{NP}}$ of the form

$$\rho_{\mathrm{NP}}(y) = \begin{cases} 1 & \text{if } p_1(y) > v\,p_0(y) \\ \gamma & \text{if } p_1(y) = v\,p_0(y) \\ 0 & \text{if } p_1(y) < v\,p_0(y) \end{cases}$$

where $v \ge 0$ and $\gamma(y) = \gamma \in [0,1]$ (a constant) such that $P_{\mathrm{fp}}(\rho_{\mathrm{NP}}) = \alpha$.

## The Neyman-Pearson Lemma, Part 2 of 3: Existence

**Proof by construction.** Let $\nu \ge 0$, $\mathcal{Y}_\nu = \{y \in \mathcal{Y} : p_1(y) > \nu\,p_0(y)\}$ and $\mathcal{Z}_\nu = \{y \in \mathcal{Y} : p_1(y) = \nu\,p_0(y)\}$. For $\nu_2 \ge \nu_1$, $\mathcal{Y}_{\nu_2} \subseteq \mathcal{Y}_{\nu_1}$ and $\int_{\mathcal{Y}_{\nu_2}} p_0(y)\,dy \le \int_{\mathcal{Y}_{\nu_1}} p_0(y)\,dy$.

Let $v$ be the smallest value of $\nu$ such that

$$\int_{\mathcal{Y}_v} p_0(y)\,dy \le \alpha.$$

Choose

$$\gamma = \begin{cases} \dfrac{\alpha - \int_{\mathcal{Y}_v} p_0(y)\,dy}{\int_{\mathcal{Z}_v} p_0(y)\,dy} & \text{if } \int_{\mathcal{Z}_v} p_0(y)\,dy > 0 \\[2ex] \text{any arbitrary number in } [0,1] & \text{otherwise.} \end{cases}$$

Then

$$P_{\mathrm{fp}}(\rho_{\mathrm{NP}}) = \int_{\mathcal{Y}_v} p_0(y)\,dy + \gamma \int_{\mathcal{Z}_v} p_0(y)\,dy = \alpha. \;\blacksquare$$

## The Neyman-Pearson Lemma, Part 3 of 3: Uniqueness

**Lemma.** Suppose that $\rho''(y)$ is any N-P decision rule for $H_0$ versus $H_1$ with significance level $\alpha$. Then $\rho''(y)$ must be of the same form as $\rho_{\mathrm{NP}}(y)$, except possibly on a subset of $\mathcal{Y}$ having zero probability under $H_0$ and $H_1$.

## The Neyman-Pearson Lemma, Part 3 of 3: Uniqueness

**Proof.** If $\rho''$ is an N-P decision rule with significance level $\alpha$, then it must be true that $P_D(\rho'') = P_D(\rho_{\mathrm{NP}})$. From part 1 of the lemma, we know that

$$0 = P_D(\rho_{\mathrm{NP}}) - P_D(\rho'') \ge v\left[\alpha - P_{\mathrm{fp}}(\rho'')\right]$$

which implies that $P_{\mathrm{fp}}(\rho'') = \alpha$ since the LHS of the inequality is zero. So $P_D(\rho'') = P_D(\rho_{\mathrm{NP}})$ and $P_{\mathrm{fp}}(\rho'') = P_{\mathrm{fp}}(\rho_{\mathrm{NP}})$. We can work the proof of part 1 of the lemma back to write

$$\int_{\mathcal{Y}} [\rho_{\mathrm{NP}}(y) - \rho''(y)][p_1(y) - v\,p_0(y)]\,dy = 0.$$

Note that the integrand here is non-negative. This implies that $\rho_{\mathrm{NP}}(y)$ and $\rho''(y)$ can differ only on the set $\mathcal{Z}_v = \{y \in \mathcal{Y} : p_1(y) = v\,p_0(y)\}$. This then implies that $\rho_{\mathrm{NP}}(y)$ and $\rho''(y)$ must have the same form and can differ only in the choice of $\gamma$.

From part 2 of the lemma, we know that $\gamma$ is arbitrary when $\int_{\mathcal{Z}_v} p_0(y)\,dy = 0$. Otherwise, if $\int_{\mathcal{Z}_v} p_0(y)\,dy > 0$, $\rho_{\mathrm{NP}}(y)$ and $\rho''(y)$ must share the same value of $\gamma$. $\blacksquare$

## Example: Coherent Detection of BPSK

Suppose a transmitter sends one of two scalar signals $a_0$ or $a_1$ and the signals arrive at a receiver corrupted by zero-mean additive white Gaussian noise (AWGN) with variance $\sigma^2$.

We want to use N-P hypothesis testing to maximize

$$P_D = \mathrm{Prob}(\text{decide } H_1 \mid a_1 \text{ was sent})$$

subject to the constraint

$$P_{\mathrm{fp}} = \mathrm{Prob}(\text{decide } H_1 \mid a_0 \text{ was sent}) \le \alpha.$$

Signal model conditioned on state $x_j$:

$$Y = a_j + \eta$$

where $a_j$ is the scalar signal and $\eta \sim \mathcal{N}(0, \sigma^2)$. Hence

$$p_j(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y - a_j)^2}{2\sigma^2}\right).$$

## Example: Coherent Detection of BPSK

How should we approach this problem? We know from the N-P lemma that the optimum decision rule will be of the form

$$\rho_{\mathrm{NP}}(y) = \begin{cases} 1 & \text{if } p_1(y) > v\,p_0(y) \\ \gamma(y) & \text{if } p_1(y) = v\,p_0(y) \\ 0 & \text{if } p_1(y) < v\,p_0(y) \end{cases}$$

where $v \ge 0$ and $0 \le \gamma(y) \le 1$ are such that $P_{\mathrm{fp}}(\rho_{\mathrm{NP}}) = \alpha$. How should we choose our threshold $v$?

We need to find the smallest $v$ such that

$$\int_{\mathcal{Y}_v} p_0(y)\,dy \le \alpha$$

where $\mathcal{Y}_v = \{y \in \mathcal{Y} : p_1(y) > v\,p_0(y)\}$.

## Example: Likelihood Ratio for $a_0 = 0$, $a_1 = 1$, $\sigma^2 = 1$

[Figure: the likelihood ratio $L(y) = p_1(y)/p_0(y)$ versus the observation $y$; the ratio is monotonically increasing in $y$.]

## Example: Coherent Detection of BPSK

Note that, since $a_1 > a_0$, the likelihood ratio $L(y) = \frac{p_1(y)}{p_0(y)}$ is monotonically increasing. This means that finding $v$ is equivalent to finding a threshold $\tau$ so that

$$\int_\tau^\infty p_0(y)\,dy = Q\!\left(\frac{\tau - a_0}{\sigma}\right) = \alpha.$$

[Figure: the densities $p_0(y)$ and $p_1(y)$ centered at $a_0$ and $a_1$, with the threshold $\tau$ between them.]

How are $\tau$ and $v$ related? Once we find $\tau$, we can determine $v$ by computing

$$v = L(\tau) = \frac{p_1(\tau)}{p_0(\tau)}.$$

## Example: Coherent Detection of BPSK: Finding τ

Unfortunately, no closed-form expression exists for the inverse of the Q function. We can use the fact that $Q(x) = \frac{1}{2}\mathrm{erfc}\!\left(\frac{x}{\sqrt{2}}\right)$ to write

$$Q\!\left(\frac{\tau - a_0}{\sigma}\right) = \frac{1}{2}\mathrm{erfc}\!\left(\frac{\tau - a_0}{\sqrt{2}\,\sigma}\right) = \alpha.$$

This can be rewritten as

$$\tau = \sqrt{2}\,\sigma\,\mathrm{erfc}^{-1}(2\alpha) + a_0$$

and we can use Matlab's handy erfcinv function to compute $\tau$. It turns out that we are always able to find a value of $\tau$ such that

$$Q\!\left(\frac{\tau - a_0}{\sigma}\right) = \alpha$$

so we won't have to worry about randomization here.
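The same computation can be sketched in Python using the standard library's `statistics.NormalDist` in place of Matlab's `erfcinv` (my own substitution, not from the slides): since $Q(x) = 1 - \Phi(x)$, we have $\tau = a_0 + \sigma\,\Phi^{-1}(1-\alpha)$, which is numerically identical to the erfcinv formula above.

```python
from statistics import NormalDist
from math import exp

std_normal = NormalDist()  # standard normal, gives Phi and its inverse

def Q(x):
    """Gaussian tail probability Q(x) = 1 - Phi(x)."""
    return 1.0 - std_normal.cdf(x)

def np_bpsk_threshold(a0, a1, sigma, alpha):
    """Return (tau, v) with Q((tau - a0)/sigma) = alpha and v = L(tau)."""
    tau = a0 + sigma * std_normal.inv_cdf(1.0 - alpha)  # Q^{-1}(alpha)
    # v = p1(tau)/p0(tau): the ratio of the two Gaussian densities at tau
    v = exp(((tau - a0) ** 2 - (tau - a1) ** 2) / (2.0 * sigma ** 2))
    return tau, v

tau, v = np_bpsk_threshold(a0=0.0, a1=1.0, sigma=1.0, alpha=0.05)
# tau is the familiar one-sided 5% Gaussian threshold, about 1.645
pd = Q(tau - 1.0)  # power of the test: Q((tau - a1)/sigma)
```

For $a_0 = 0$, $a_1 = 1$, $\sigma^2 = 1$, and $\alpha = 0.05$, this gives $\tau \approx 1.645$ and a power of roughly $0.26$, consistent with the $P_{\mathrm{fp}}$ and $P_D$ curves on the next slide.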

## Example: Coherent Detection of BPSK

[Figure: $P_{\mathrm{fp}}$ and $P_D$ versus the threshold $\tau$ on a logarithmic probability scale, for $\tau$ ranging from about $-1$ to $3$.]

## Example: Coherent Detection of BPSK

[Figure: the threshold $\tau$ and the corresponding likelihood ratio threshold $v$ versus the significance level $\alpha$ on a log scale from $10^{-3}$ to $10^0$.]

## Final Comments on Neyman-Pearson Hypothesis Testing

1. N-P decision rules are useful in asymmetric risk scenarios, or in scenarios where one has to guarantee a certain probability of false alarm.
2. N-P decision rules are simply likelihood ratio comparisons, just like Bayes and minimax. The comparison threshold in this case is chosen to satisfy the significance level constraint.
3. Like minimax, randomization is often necessary for N-P decision rules. Without randomization, the power of the test may not be maximized for the significance level constraint.
4. The original N-P paper: J. Neyman and E. S. Pearson, "On the Problem of the Most Efficient Tests of Statistical Hypotheses," Philosophical Transactions of the Royal Society of London, Series A, Containing Papers of a Mathematical or Physical Character, Vol. 231 (1933), pp. 289-337. Available on jstor.org.

