Introduction to Probability – Selected Textbook Summary


Introduction to Probability
SECOND EDITION

Dimitri P. Bertsekas and John N. Tsitsiklis
Massachusetts Institute of Technology

Selected Summary Material – All Rights Reserved
WWW site for book information and orders: http://www.athenasc.com
Athena Scientific, Belmont, Massachusetts

1  Sample Space and Probability

Excerpts from Introduction to Probability: Second Edition
by Dimitri P. Bertsekas and John N. Tsitsiklis
© Massachusetts Institute of Technology

Contents
1.1. Sets                                          p. 3
1.2. Probabilistic Models                          p. 6
1.3. Conditional Probability                       p. 18
1.4. Total Probability Theorem and Bayes' Rule     p. 28
1.5. Independence                                  p. 34
1.6. Counting                                      p. 44
1.7. Summary and Discussion                        p. 51
     Problems                                      p. 53

1.1 SETS

1.2 PROBABILISTIC MODELS

Elements of a Probabilistic Model

- The sample space $\Omega$, which is the set of all possible outcomes of an experiment.

- The probability law, which assigns to a set $A$ of possible outcomes (also called an event) a nonnegative number $P(A)$ (called the probability of $A$) that encodes our knowledge or belief about the collective "likelihood" of the elements of $A$. The probability law must satisfy certain properties to be introduced shortly.

Probability Axioms

1. (Nonnegativity) $P(A) \geq 0$, for every event $A$.

2. (Additivity) If $A$ and $B$ are two disjoint events, then the probability of their union satisfies
   $$P(A \cup B) = P(A) + P(B).$$
   More generally, if the sample space has an infinite number of elements and $A_1, A_2, \ldots$ is a sequence of disjoint events, then the probability of their union satisfies
   $$P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots.$$

3. (Normalization) The probability of the entire sample space $\Omega$ is equal to 1, that is, $P(\Omega) = 1$.
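As a concrete illustration of the axioms, here is a minimal Python sketch (an assumed example, not from the text) that builds a discrete probability law for a fair six-sided die and checks normalization and additivity for two disjoint events.

```python
from fractions import Fraction

# Discrete probability law for a fair six-sided die: each outcome has probability 1/6.
prob = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

def P(event):
    """Probability of an event (a set of outcomes): sum of the outcome probabilities."""
    return sum(prob[s] for s in event)

A = {1, 2}   # roll is 1 or 2
B = {4, 6}   # roll is 4 or 6 (disjoint from A)

assert P(set(prob)) == 1            # normalization: P(Omega) = 1
assert P(A | B) == P(A) + P(B)      # additivity for disjoint events
print(P(A), P(B), P(A | B))         # 1/3 1/3 2/3
```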

Discrete Probability Law

If the sample space consists of a finite number of possible outcomes, then the probability law is specified by the probabilities of the events that consist of a single element. In particular, the probability of any event $\{s_1, s_2, \ldots, s_n\}$ is the sum of the probabilities of its elements:
$$P(\{s_1, s_2, \ldots, s_n\}) = P(s_1) + P(s_2) + \cdots + P(s_n).$$

Discrete Uniform Probability Law

If the sample space consists of $n$ possible outcomes which are equally likely (i.e., all single-element events have the same probability), then the probability of any event $A$ is given by
$$P(A) = \frac{\text{number of elements of } A}{n}.$$

Some Properties of Probability Laws

Consider a probability law, and let $A$, $B$, and $C$ be events.
(a) If $A \subset B$, then $P(A) \leq P(B)$.
(b) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
(c) $P(A \cup B) \leq P(A) + P(B)$.
(d) $P(A \cup B \cup C) = P(A) + P(A^c \cap B) + P(A^c \cap B^c \cap C)$.
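The discrete uniform law reduces probability calculation to counting. The following sketch, assuming the classical two-dice experiment (not an example taken from this excerpt), counts outcomes to find the probability that the sum of two fair dice equals 7.

```python
from fractions import Fraction
from itertools import product

# Discrete uniform law: 36 equally likely outcomes for a pair of fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def P(event):
    """P(A) = (number of elements of A) / n under the uniform law."""
    return Fraction(len(event), len(outcomes))

sum_is_7 = {o for o in outcomes if sum(o) == 7}
print(P(sum_is_7))   # 1/6
```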

1.3 CONDITIONAL PROBABILITY

Properties of Conditional Probability

- The conditional probability of an event $A$, given an event $B$ with $P(B) > 0$, is defined by
  $$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$
  and specifies a new (conditional) probability law on the same sample space $\Omega$. In particular, all properties of probability laws remain valid for conditional probability laws.

- Conditional probabilities can also be viewed as a probability law on a new universe $B$, because all of the conditional probability is concentrated on $B$.

- If the possible outcomes are finitely many and equally likely, then
  $$P(A \mid B) = \frac{\text{number of elements of } A \cap B}{\text{number of elements of } B}.$$

1.4 TOTAL PROBABILITY THEOREM AND BAYES' RULE

Total Probability Theorem

Let $A_1, \ldots, A_n$ be disjoint events that form a partition of the sample space (each possible outcome is included in exactly one of the events $A_1, \ldots, A_n$) and assume that $P(A_i) > 0$, for all $i$. Then, for any event $B$, we have
$$P(B) = P(A_1 \cap B) + \cdots + P(A_n \cap B) = P(A_1)P(B \mid A_1) + \cdots + P(A_n)P(B \mid A_n).$$
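The total probability theorem combines conditional probabilities over a partition, and Bayes' rule then follows from the definition of conditional probability. The sketch below uses entirely hypothetical numbers (three machines producing 50%, 30%, and 20% of items, with defect rates 1%, 2%, and 3%); the figures are illustrative only.

```python
from fractions import Fraction

# Hypothetical partition A1, A2, A3 (which machine produced an item) and the event
# B = "item is defective", with assumed conditional probabilities.
P_A = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
P_B_given_A = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]

# Total probability theorem: P(B) = sum_i P(A_i) P(B | A_i)
P_B = sum(pa * pb for pa, pb in zip(P_A, P_B_given_A))

# Bayes' rule, via the definition of conditional probability:
# P(A_1 | B) = P(A_1) P(B | A_1) / P(B)
P_A1_given_B = P_A[0] * P_B_given_A[0] / P_B
print(P_B, P_A1_given_B)   # 17/1000 5/17
```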

1.5 INDEPENDENCE

Independence

- Two events $A$ and $B$ are said to be independent if
  $$P(A \cap B) = P(A)P(B).$$
  If in addition, $P(B) > 0$, independence is equivalent to the condition
  $$P(A \mid B) = P(A).$$

- If $A$ and $B$ are independent, so are $A$ and $B^c$.

- Two events $A$ and $B$ are said to be conditionally independent, given another event $C$ with $P(C) > 0$, if
  $$P(A \cap B \mid C) = P(A \mid C)P(B \mid C).$$
  If in addition, $P(B \cap C) > 0$, conditional independence is equivalent to the condition
  $$P(A \mid B \cap C) = P(A \mid C).$$

- Independence does not imply conditional independence, and vice versa.

Definition of Independence of Several Events

We say that the events $A_1, A_2, \ldots, A_n$ are independent if
$$P\Bigl(\bigcap_{i \in S} A_i\Bigr) = \prod_{i \in S} P(A_i), \quad \text{for every subset } S \text{ of } \{1, 2, \ldots, n\}.$$

1.6 COUNTING

The Counting Principle

Consider a process that consists of $r$ stages. Suppose that:
(a) There are $n_1$ possible results at the first stage.
(b) For every possible result at the first stage, there are $n_2$ possible results at the second stage.
(c) More generally, for any sequence of possible results at the first $i - 1$ stages, there are $n_i$ possible results at the $i$th stage.
Then, the total number of possible results of the $r$-stage process is
$$n_1 n_2 \cdots n_r.$$

Summary of Counting Results

- Permutations of $n$ objects: $n!$.
- $k$-permutations of $n$ objects: $n!/(n - k)!$.
- Combinations of $k$ out of $n$ objects:
  $$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}.$$
- Partitions of $n$ objects into $r$ groups, with the $i$th group having $n_i$ objects:
  $$\binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!}.$$

1.7 SUMMARY AND DISCUSSION
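These counting formulas map directly onto Python's standard library. A minimal sketch, using illustrative values n = 6, k = 2 and group sizes 2, 2, 2 (assumptions made only for this example):

```python
from math import factorial, comb, perm

n, k = 6, 2
print(factorial(n))   # permutations of 6 objects: 720
print(perm(n, k))     # 2-permutations of 6 objects: 6!/4! = 30
print(comb(n, k))     # combinations of 2 out of 6: 15

def multinomial(*groups):
    """Partitions of n objects into groups of the given sizes: n! / (n1! n2! ... nr!)."""
    total = factorial(sum(groups))
    for g in groups:
        total //= factorial(g)
    return total

print(multinomial(2, 2, 2))   # partitions of 6 objects into three groups of 2: 90
```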

2  Discrete Random Variables

Excerpts from Introduction to Probability: Second Edition
by Dimitri P. Bertsekas and John N. Tsitsiklis
© Massachusetts Institute of Technology

Contents
2.1. Basic Concepts                               p. 72
2.2. Probability Mass Functions                   p. 74
2.3. Functions of Random Variables                p. 80
2.4. Expectation, Mean, and Variance              p. 81
2.5. Joint PMFs of Multiple Random Variables      p. 92
2.6. Conditioning                                 p. 97
2.7. Independence                                 p. 109
2.8. Summary and Discussion                       p. 115
     Problems                                     p. 119

2.1 BASIC CONCEPTS

Main Concepts Related to Random Variables

Starting with a probabilistic model of an experiment:
- A random variable is a real-valued function of the outcome of the experiment.
- A function of a random variable defines another random variable.
- We can associate with each random variable certain "averages" of interest, such as the mean and the variance.
- A random variable can be conditioned on an event or on another random variable.
- There is a notion of independence of a random variable from an event or from another random variable.

Concepts Related to Discrete Random Variables

Starting with a probabilistic model of an experiment:
- A discrete random variable is a real-valued function of the outcome of the experiment that can take a finite or countably infinite number of values.
- A discrete random variable has an associated probability mass function (PMF), which gives the probability of each numerical value that the random variable can take.
- A function of a discrete random variable defines another discrete random variable, whose PMF can be obtained from the PMF of the original random variable.

2.2 PROBABILITY MASS FUNCTIONS

Calculation of the PMF of a Random Variable X

For each possible value $x$ of $X$:
1. Collect all the possible outcomes that give rise to the event $\{X = x\}$.
2. Add their probabilities to obtain $p_X(x)$.

2.3 FUNCTIONS OF RANDOM VARIABLES

2.4 EXPECTATION, MEAN, AND VARIANCE

Expectation

We define the expected value (also called the expectation or the mean) of a random variable $X$, with PMF $p_X$, by
$$E[X] = \sum_x x\, p_X(x).$$

Expected Value Rule for Functions of Random Variables

Let $X$ be a random variable with PMF $p_X$, and let $g(X)$ be a function of $X$. Then, the expected value of the random variable $g(X)$ is given by
$$E[g(X)] = \sum_x g(x)\, p_X(x).$$
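A short sketch of the two-step PMF calculation and the expected value rule, assuming a hypothetical experiment of two fair coin tosses with X = number of heads (the example is not from the text):

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Hypothetical experiment: two fair coin tosses; X = number of heads.
outcomes = list(product("HT", repeat=2))
p_outcome = Fraction(1, len(outcomes))

# Build the PMF of X: for each value x, collect the outcomes in {X = x} and add their probabilities.
pmf = defaultdict(Fraction)
for o in outcomes:
    pmf[o.count("H")] += p_outcome           # p_X(0) = 1/4, p_X(1) = 1/2, p_X(2) = 1/4

E_X = sum(x * p for x, p in pmf.items())      # E[X] = sum_x x p_X(x)
E_X2 = sum(x**2 * p for x, p in pmf.items())  # expected value rule with g(x) = x^2
print(E_X, E_X2)                              # 1 3/2
```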

Variance

The variance $\mathrm{var}(X)$ of a random variable $X$ is defined by
$$\mathrm{var}(X) = E\bigl[(X - E[X])^2\bigr]$$
and can be calculated as
$$\mathrm{var}(X) = \sum_x (x - E[X])^2\, p_X(x).$$
It is always nonnegative. Its square root is denoted by $\sigma_X$ and is called the standard deviation.

Mean and Variance of a Linear Function of a Random Variable

Let $X$ be a random variable and let
$$Y = aX + b,$$
where $a$ and $b$ are given scalars. Then,
$$E[Y] = aE[X] + b, \qquad \mathrm{var}(Y) = a^2\, \mathrm{var}(X).$$

Variance in Terms of Moments Expression

$$\mathrm{var}(X) = E[X^2] - (E[X])^2.$$
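A minimal sketch checking that the two variance formulas agree, and illustrating the linear-function rules, for an assumed fair six-sided die and illustrative scalars a = 3, b = -2:

```python
from fractions import Fraction

# Fair six-sided die: p_X(x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

E_X = sum(x * p for x, p in pmf.items())
var_def = sum((x - E_X) ** 2 * p for x, p in pmf.items())      # E[(X - E[X])^2]
var_moments = sum(x**2 * p for x, p in pmf.items()) - E_X**2   # E[X^2] - (E[X])^2
assert var_def == var_moments
print(E_X, var_def)                  # 7/2 35/12

# Linear function Y = aX + b: E[Y] = aE[X] + b, var(Y) = a^2 var(X).
a, b = 3, -2
print(a * E_X + b, a**2 * var_def)   # 17/2 105/4
```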

2.5 JOINT PMFS OF MULTIPLE RANDOM VARIABLES

Summary of Facts About Joint PMFs

Let $X$ and $Y$ be random variables associated with the same experiment.

- The joint PMF $p_{X,Y}$ of $X$ and $Y$ is defined by
  $$p_{X,Y}(x, y) = P(X = x, Y = y).$$

- The marginal PMFs of $X$ and $Y$ can be obtained from the joint PMF, using the formulas
  $$p_X(x) = \sum_y p_{X,Y}(x, y), \qquad p_Y(y) = \sum_x p_{X,Y}(x, y).$$

- A function $g(X, Y)$ of $X$ and $Y$ defines another random variable, and
  $$E[g(X, Y)] = \sum_x \sum_y g(x, y)\, p_{X,Y}(x, y).$$
  If $g$ is linear, of the form $aX + bY + c$, we have
  $$E[aX + bY + c] = aE[X] + bE[Y] + c.$$

- The above have natural extensions to the case where more than two random variables are involved.
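The sketch below computes marginals and applies the expected value rule for a small joint PMF; the joint probabilities are invented purely for illustration.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical joint PMF p_{X,Y}(x, y) on a 2x2 grid (values chosen only for illustration).
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
         (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4)}

# Marginals: p_X(x) = sum_y p_{X,Y}(x, y), and similarly for p_Y.
p_X, p_Y = defaultdict(Fraction), defaultdict(Fraction)
for (x, y), p in joint.items():
    p_X[x] += p
    p_Y[y] += p

# Expected value rule for the linear g(X, Y) = X + 2Y + 1, which also equals E[X] + 2E[Y] + 1.
E_g = sum((x + 2 * y + 1) * p for (x, y), p in joint.items())
E_X = sum(x * p for x, p in p_X.items())
E_Y = sum(y * p for y, p in p_Y.items())
assert E_g == E_X + 2 * E_Y + 1
print(E_X, E_Y, E_g)   # 1/2 5/8 11/4
```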

2.6 CONDITIONING

Summary of Facts About Conditional PMFs

Let $X$ and $Y$ be random variables associated with the same experiment.

- Conditional PMFs are similar to ordinary PMFs, but pertain to a universe where the conditioning event is known to have occurred.

- The conditional PMF of $X$ given an event $A$ with $P(A) > 0$, is defined by
  $$p_{X|A}(x) = P(X = x \mid A)$$
  and satisfies
  $$\sum_x p_{X|A}(x) = 1.$$

- If $A_1, \ldots, A_n$ are disjoint events that form a partition of the sample space, with $P(A_i) > 0$ for all $i$, then
  $$p_X(x) = \sum_{i=1}^n P(A_i)\, p_{X|A_i}(x).$$
  (This is a special case of the total probability theorem.) Furthermore, for any event $B$, with $P(A_i \cap B) > 0$ for all $i$, we have
  $$p_{X|B}(x) = \sum_{i=1}^n P(A_i \mid B)\, p_{X|A_i \cap B}(x).$$

- The conditional PMF of $X$ given $Y = y$ is related to the joint PMF by
  $$p_{X,Y}(x, y) = p_Y(y)\, p_{X|Y}(x \mid y).$$

- The conditional PMF of $X$ given $Y$ can be used to calculate the marginal PMF of $X$ through the formula
  $$p_X(x) = \sum_y p_Y(y)\, p_{X|Y}(x \mid y).$$

- There are natural extensions of the above involving more than two random variables.

Summary of Facts About Conditional Expectations

Let $X$ and $Y$ be random variables associated with the same experiment.

- The conditional expectation of $X$ given an event $A$ with $P(A) > 0$, is defined by
  $$E[X \mid A] = \sum_x x\, p_{X|A}(x).$$
  For a function $g(X)$, we have
  $$E[g(X) \mid A] = \sum_x g(x)\, p_{X|A}(x).$$

- The conditional expectation of $X$ given a value $y$ of $Y$ is defined by
  $$E[X \mid Y = y] = \sum_x x\, p_{X|Y}(x \mid y).$$

- If $A_1, \ldots, A_n$ are disjoint events that form a partition of the sample space, with $P(A_i) > 0$ for all $i$, then
  $$E[X] = \sum_{i=1}^n P(A_i)\, E[X \mid A_i].$$
  Furthermore, for any event $B$ with $P(A_i \cap B) > 0$ for all $i$, we have
  $$E[X \mid B] = \sum_{i=1}^n P(A_i \mid B)\, E[X \mid A_i \cap B].$$

- We have
  $$E[X] = \sum_y p_Y(y)\, E[X \mid Y = y].$$
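A tiny numerical sketch of the last formula (total expectation over the values of Y), with assumed illustrative values for the PMF of Y and the conditional expectations:

```python
from fractions import Fraction

# Hypothetical setup: Y takes values 0 and 1 with probabilities 1/3 and 2/3, and the
# conditional expectations E[X | Y = y] are assumed to be 4 and 10 (illustrative numbers only).
p_Y = {0: Fraction(1, 3), 1: Fraction(2, 3)}
E_X_given_Y = {0: Fraction(4), 1: Fraction(10)}

# E[X] = sum_y p_Y(y) E[X | Y = y]
E_X = sum(p_Y[y] * E_X_given_Y[y] for y in p_Y)
print(E_X)   # 4/3 + 20/3 = 8
```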

2.7 INDEPENDENCE

Summary of Facts About Independent Random Variables

Let $A$ be an event, with $P(A) > 0$, and let $X$ and $Y$ be random variables associated with the same experiment.

- $X$ is independent of the event $A$ if
  $$p_{X|A}(x) = p_X(x), \quad \text{for all } x,$$
  that is, if for all $x$, the events $\{X = x\}$ and $A$ are independent.

- $X$ and $Y$ are independent if for all pairs $(x, y)$, the events $\{X = x\}$ and $\{Y = y\}$ are independent, or equivalently
  $$p_{X,Y}(x, y) = p_X(x)\, p_Y(y), \quad \text{for all } x, y.$$

- If $X$ and $Y$ are independent random variables, then
  $$E[XY] = E[X]\, E[Y].$$
  Furthermore, for any functions $g$ and $h$, the random variables $g(X)$ and $h(Y)$ are independent, and we have
  $$E[g(X)h(Y)] = E[g(X)]\, E[h(Y)].$$

- If $X$ and $Y$ are independent, then
  $$\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y).$$

2.8 SUMMARY AND DISCUSSION

Summary of Results for Special Random Variables

Discrete Uniform over [a, b]:
$$p_X(k) = \begin{cases} \dfrac{1}{b - a + 1}, & \text{if } k = a, a + 1, \ldots, b, \\ 0, & \text{otherwise}, \end{cases}$$
$$E[X] = \frac{a + b}{2}, \qquad \mathrm{var}(X) = \frac{(b - a)(b - a + 2)}{12}.$$

Bernoulli with Parameter p: (Describes the success or failure in a single trial.)
$$p_X(k) = \begin{cases} p, & \text{if } k = 1, \\ 1 - p, & \text{if } k = 0, \end{cases} \qquad E[X] = p, \qquad \mathrm{var}(X) = p(1 - p).$$

Binomial with Parameters p and n: (Describes the number of successes in n independent Bernoulli trials.)
$$p_X(k) = \binom{n}{k} p^k (1 - p)^{n - k}, \quad k = 0, 1, \ldots, n,$$
$$E[X] = np, \qquad \mathrm{var}(X) = np(1 - p).$$

Geometric with Parameter p: (Describes the number of trials until the first success, in a sequence of independent Bernoulli trials.)
$$p_X(k) = (1 - p)^{k - 1} p, \quad k = 1, 2, \ldots,$$
$$E[X] = \frac{1}{p}, \qquad \mathrm{var}(X) = \frac{1 - p}{p^2}.$$

Poisson with Parameter λ: (Approximates the binomial PMF when n is large, p is small, and λ = np.)
$$p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, \ldots,$$
$$E[X] = \lambda, \qquad \mathrm{var}(X) = \lambda.$$
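To connect the binomial and Poisson entries, the following sketch compares the two PMFs and checks the binomial mean np and variance np(1 - p) numerically; the parameters n = 100 and p = 0.03 are illustrative assumptions, not from the text.

```python
from math import comb, exp, factorial

n, p = 100, 0.03
lam = n * p   # Poisson parameter used for the approximation

def binomial_pmf(k):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k):
    return exp(-lam) * lam**k / factorial(k)

for k in range(6):
    print(k, round(binomial_pmf(k), 4), round(poisson_pmf(k), 4))

# Mean and variance of the binomial: np and np(1 - p).
mean = sum(k * binomial_pmf(k) for k in range(n + 1))
var = sum(k**2 * binomial_pmf(k) for k in range(n + 1)) - mean**2
print(round(mean, 6), round(var, 6))   # approximately 3.0 and 2.91
```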

3  General Random Variables

Excerpts from Introduction to Probability: Second Edition
by Dimitri P. Bertsekas and John N. Tsitsiklis
© Massachusetts Institute of Technology

Contents
3.1. Continuous Random Variables and PDFs         p. 140
3.2. Cumulative Distribution Functions            p. 148
3.3. Normal Random Variables                      p. 153
3.4. Joint PDFs of Multiple Random Variables      p. 158
3.5. Conditioning                                 p. 164
3.6. The Continuous Bayes' Rule                   p. 178
3.7. Summary and Discussion                       p. 182
     Problems                                     p. 184

3.1 CONTINUOUS RANDOM VARIABLES AND PDFS

Summary of PDF Properties

Let $X$ be a continuous random variable with PDF $f_X$.

- $f_X(x) \geq 0$ for all $x$.
- $\displaystyle\int_{-\infty}^{\infty} f_X(x)\, dx = 1$.
- If $\delta$ is very small, then $P\bigl([x, x + \delta]\bigr) \approx f_X(x) \cdot \delta$.
- For any subset $B$ of the real line,
  $$P(X \in B) = \int_B f_X(x)\, dx.$$

Expectation of a Continuous Random Variable and its Properties

Let $X$ be a continuous random variable with PDF $f_X$.

- The expectation of $X$ is defined by
  $$E[X] = \int_{-\infty}^{\infty} x f_X(x)\, dx.$$
- The expected value rule for a function $g(X)$ has the form
  $$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\, dx.$$
- The variance of $X$ is defined by
  $$\mathrm{var}(X) = E\bigl[(X - E[X])^2\bigr] = \int_{-\infty}^{\infty} (x - E[X])^2 f_X(x)\, dx.$$
- We have
  $$0 \leq \mathrm{var}(X) = E[X^2] - (E[X])^2.$$
- If $Y = aX + b$, where $a$ and $b$ are given scalars, then
  $$E[Y] = aE[X] + b, \qquad \mathrm{var}(Y) = a^2\, \mathrm{var}(X).$$

3.2 CUMULATIVE DISTRIBUTION FUNCTIONS

Properties of a CDF

The CDF $F_X$ of a random variable $X$ is defined by
$$F_X(x) = P(X \leq x), \quad \text{for all } x,$$
and has the following properties.

- $F_X$ is monotonically nondecreasing:
  if $x \leq y$, then $F_X(x) \leq F_X(y)$.
- $F_X(x)$ tends to 0 as $x \to -\infty$, and to 1 as $x \to \infty$.
- If $X$ is discrete, then $F_X(x)$ is a piecewise constant function of $x$.
- If $X$ is continuous, then $F_X(x)$ is a continuous function of $x$.
- If $X$ is discrete and takes integer values, the PMF and the CDF can be obtained from each other by summing or differencing:
  $$F_X(k) = \sum_{i=-\infty}^{k} p_X(i),$$
  $$p_X(k) = P(X \leq k) - P(X \leq k - 1) = F_X(k) - F_X(k - 1),$$
  for all integers $k$.
- If $X$ is continuous, the PDF and the CDF can be obtained from each other by integration or differentiation:
  $$F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt, \qquad f_X(x) = \frac{dF_X}{dx}(x).$$
  (The second equality is valid for those $x$ at which the PDF is continuous.)

3.3 NORMAL RANDOM VARIABLES

Normality is Preserved by Linear Transformations

If $X$ is a normal random variable with mean $\mu$ and variance $\sigma^2$, and if $a \neq 0$, $b$ are scalars, then the random variable
$$Y = aX + b$$
is also normal, with mean and variance
$$E[Y] = a\mu + b, \qquad \mathrm{var}(Y) = a^2 \sigma^2.$$

CDF Calculation for a Normal Random Variable

For a normal random variable $X$ with mean $\mu$ and variance $\sigma^2$, we use a two-step procedure.
(a) "Standardize" $X$, i.e., subtract $\mu$ and divide by $\sigma$ to obtain a standard normal random variable $Y$.
(b) Read the CDF value from the standard normal table:
$$P(X \leq x) = P\!\left(\frac{X - \mu}{\sigma} \leq \frac{x - \mu}{\sigma}\right) = P\!\left(Y \leq \frac{x - \mu}{\sigma}\right) = \Phi\!\left(\frac{x - \mu}{\sigma}\right).$$
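Since the printed table is only summarized below, this sketch evaluates Φ with the error function from Python's math module and applies the two-step standardization to an assumed example (μ = 10, σ = 2, x = 13); the numbers are illustrative, not from the text.

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF, computed from the error function instead of a printed table."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical example: X is normal with mean mu = 10 and variance sigma^2 = 4; find P(X <= 13).
mu, sigma, x = 10.0, 2.0, 13.0
z = (x - mu) / sigma          # step (a): standardize
print(z, round(Phi(z), 4))    # step (b): 1.5 and Phi(1.5) ~ 0.9332
```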

[Standard normal table omitted.] The entries in this table provide the numerical values of $\Phi(y) = P(Y \leq y)$, where $Y$ is a standard normal random variable, for $y$ between 0 and 3.49. For example, to find $\Phi(1.71)$, we look at the row corresponding to 1.7 and the column corresponding to 0.01, so that $\Phi(1.71) = 0.9564$. When $y$ is negative, the value of $\Phi(y)$ can be found using the formula $\Phi(y) = 1 - \Phi(-y)$.

3.4 JOINT PDFS OF MULTIPLE RANDOM VARIABLES

Summary of Facts about Joint PDFs

Let $X$ and $Y$ be jointly continuous random variables with joint PDF $f_{X,Y}$.

- The joint PDF is used to calculate probabilities:
  $$P\bigl((X, Y) \in B\bigr) = \iint_{(x, y) \in B} f_{X,Y}(x, y)\, dx\, dy.$$

- The marginal PDFs of $X$ and $Y$ can be obtained from the joint PDF, using the formulas
  $$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx.$$

- The joint CDF is defined by $F_{X,Y}(x, y) = P(X \leq x, Y \leq y)$, and determines the joint PDF through the formula
  $$f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}}{\partial x\, \partial y}(x, y),$$
  for every $(x, y)$ at which the joint PDF is continuous.

- A function $g(X, Y)$ of $X$ and $Y$ defines a new random variable, and
  $$E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f_{X,Y}(x, y)\, dx\, dy.$$
  If $g$ is linear, of the form $aX + bY + c$, we have
  $$E[aX + bY + c] = aE[X] + bE[Y] + c.$$

- The above have natural extensions to the case where more than two random variables are involved.

3.5 CONDITIONING

Conditional PDF Given an Event

- The conditional PDF $f_{X|A}$ of a continuous random variable $X$, given an event $A$ with $P(A) > 0$, satisfies
  $$P(X \in B \mid A) = \int_B f_{X|A}(x)\, dx.$$

- If $A$ is a subset of the real line with $P(X \in A) > 0$, then
  $$f_{X|\{X \in A\}}(x) = \begin{cases} \dfrac{f_X(x)}{P(X \in A)}, & \text{if } x \in A, \\ 0, & \text{otherwise}. \end{cases}$$

- Let $A_1, A_2, \ldots, A_n$ be disjoint events that form a partition of the sample space, and assume that $P(A_i) > 0$ for all $i$. Then,
  $$f_X(x) = \sum_{i=1}^n P(A_i)\, f_{X|A_i}(x)$$
  (a version of the total probability theorem).

Conditional PDF Given a Random Variable

Let $X$ and $Y$ be jointly continuous random variables with joint PDF $f_{X,Y}$.

- The joint, marginal, and conditional PDFs are related to each other by the formulas
  $$f_{X,Y}(x, y) = f_Y(y)\, f_{X|Y}(x \mid y), \qquad f_X(x) = \int_{-\infty}^{\infty} f_Y(y)\, f_{X|Y}(x \mid y)\, dy.$$
  The conditional PDF $f_{X|Y}(x \mid y)$ is defined only for those $y$ for which $f_Y(y) > 0$.

- We have
  $$P(X \in A \mid Y = y) = \int_A f_{X|Y}(x \mid y)\, dx.$$

Summary of Facts About Conditional Expectations

Let $X$ and $Y$ be jointly continuous random variables, and let $A$ be an event with $P(A) > 0$.

- Definitions: The conditional expectation of $X$ given the event $A$ is defined by
  $$E[X \mid A] = \int_{-\infty}^{\infty} x f_{X|A}(x)\, dx.$$
  The conditional expectation of $X$ given that $Y = y$ is defined by
  $$E[X \mid Y = y] = \int_{-\infty}^{\infty} x f_{X|Y}(x \mid y)\, dx.$$

- The expected value rule: For a function $g(X)$, we have
  $$E[g(X) \mid A] = \int_{-\infty}^{\infty} g(x) f_{X|A}(x)\, dx,$$
  and
  $$E[g(X) \mid Y = y] = \int_{-\infty}^{\infty} g(x) f_{X|Y}(x \mid y)\, dx.$$

- Total expectation theorem: Let $A_1, A_2, \ldots, A_n$ be disjoint events that form a partition of the sample space, and assume that $P(A_i) > 0$ for all $i$. Then,
  $$E[X] = \sum_{i=1}^n P(A_i)\, E[X \mid A_i].$$
  Similarly,
  $$E[X] = \int_{-\infty}^{\infty} E[X \mid Y = y]\, f_Y(y)\, dy.$$

- There are natural analogs for the case of functions of several random variables. For example,
  $$E[g(X, Y) \mid Y = y] = \int_{-\infty}^{\infty} g(x, y)\, f_{X|Y}(x \mid y)\, dx,$$
  and
  $$E[g(X, Y)] = \int_{-\infty}^{\infty} E[g(X, Y) \mid Y = y]\, f_Y(y)\, dy.$$

Independence of Continuous Random Variables

Let $X$ and $Y$ be jointly continuous random variables.

- $X$ and $Y$ are independent if
  $$f_{X,Y}(x, y) = f_X(x)\, f_Y(y), \quad \text{for all } x, y.$$

- If $X$ and $Y$ are independent, then
  $$E[XY] = E[X]\, E[Y].$$
  Furthermore, for any functions $g$ and $h$, the random variables $g(X)$ and $h(Y)$ are independent, and we have
  $$E[g(X)h(Y)] = E[g(X)]\, E[h(Y)].$$

- If $X$ and $Y$ are independent, then
  $$\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y).$$

3.6 BAYES' RULE AND APPLICATIONS IN INFERENCE

Bayes' Rule Relations for Random Variables

Let $X$ and $Y$ be two random variables.

- If $X$ and $Y$ are discrete, we have for all $x, y$ with $p_X(x) \neq 0$, $p_Y(y) \neq 0$,
  $$p_X(x)\, p_{Y|X}(y \mid x) = p_Y(y)\, p_{X|Y}(x \mid y),$$
  and the terms on the two sides in this relation are both equal to $p_{X,Y}(x, y)$.

- If $X$ is discrete and $Y$ is continuous, we have for all $x, y$ with $p_X(x) \neq 0$, $f_Y(y) \neq 0$,
  $$p_X(x)\, f_{Y|X}(y \mid x) = f_Y(y)\, p_{X|Y}(x \mid y),$$
  and the terms on the two sides in this relation are both equal to
  $$\lim_{\delta \to 0} \frac{P(X = x,\ y \leq Y \leq y + \delta)}{\delta}.$$

- If $X$ and $Y$ are continuous, we have for all $x, y$ with $f_X(x) \neq 0$, $f_Y(y) \neq 0$,
  $$f_X(x)\, f_{Y|X}(y \mid x) = f_Y(y)\, f_{X|Y}(x \mid y),$$
  and the terms on the two sides in this relation are both equal to
  $$\lim_{\delta \to 0} \frac{P(x \leq X \leq x + \delta,\ y \leq Y \leq y + \delta)}{\delta^2}.$$

3.7 SUMMARY AND DISCUSSION

Summary of Results for Special Random Variables

Continuous Uniform Over [a, b]:
$$f_X(x) = \begin{cases} \dfrac{1}{b - a}, & \text{if } a \leq x \leq b, \\ 0, & \text{otherwise}, \end{cases}$$
$$E[X] = \frac{a + b}{2}, \qquad \mathrm{var}(X) = \frac{(b - a)^2}{12}.$$

Exponential with Parameter λ:
$$f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \geq 0, \\ 0, & \text{otherwise}, \end{cases} \qquad
F_X(x) = \begin{cases} 1 - e^{-\lambda x}, & \text{if } x \geq 0, \\ 0, & \text{otherwise}, \end{cases}$$
$$E[X] = \frac{1}{\lambda}, \qquad \mathrm{var}(X) = \frac{1}{\lambda^2}.$$

Normal with Parameters μ and σ² > 0:
$$f_X(x) = \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-(x - \mu)^2 / 2\sigma^2},$$
$$E[X] = \mu, \qquad \mathrm{var}(X) = \sigma^2.$$
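A quick Monte Carlo check of the exponential entries; the rate λ = 2 and the simulation itself are assumptions for this sketch and not part of the text.

```python
import random

# Check E[X] = 1/lambda and var(X) = 1/lambda^2 by simulation, with an illustrative rate lambda = 2.
lam, n = 2.0, 200_000
random.seed(0)
samples = [random.expovariate(lam) for _ in range(n)]

mean = sum(samples) / n
var = sum((x - mean) ** 2 for x in samples) / n
print(round(mean, 3), round(var, 3))   # close to 0.5 and 0.25
```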

4  Further Topics on Random Variables

Excerpts from Introduction to Probability: Second Edition
by Dimitri P. Bertsekas and John N. Tsitsiklis
© Massachusetts Institute of Technology

Contents
4.1. Derived Distributions                                   p. 202
4.2. Covariance and Correlation                              p. 217
4.3. Conditional Expectation and Variance Revisited          p. 222
4.4. Transforms                                              p. 229
4.5. Sum of a Random Number of Independent Random Variables  p. 240
4.6. Summary and Discussion                                  p. 244
     Problems                                                p. 246

4.1 DERIVED DISTRIBUTIONS

Calculation of the PDF of a Function Y = g(X) of a Continuous Random Variable X

1. Calculate the CDF $F_Y$ of $Y$ using the formula
   $$F_Y(y) = P\bigl(g(X) \leq y\bigr) = \int_{\{x \,\mid\, g(x) \leq y\}} f_X(x)\, dx.$$
2. Differentiate to obtain the PDF of $Y$:
   $$f_Y(y) = \frac{dF_Y}{dy}(y).$$

The PDF of a Linear Function of a Random Variable

Let $X$ be a continuous random variable with PDF $f_X$, and let
$$Y = aX + b,$$
where $a$ and $b$ are scalars, with $a \neq 0$. Then,
$$f_Y(y) = \frac{1}{|a|}\, f_X\!\left(\frac{y - b}{a}\right).$$

PDF Formula for a Strictly Monotonic Function of a Continuous Random Variable

Suppose that $g$ is strictly monotonic and that for some function $h$ and all $x$ in the range of $X$ we have
$$y = g(x) \quad \text{if and only if} \quad x = h(y).$$
Assume that $h$ is differentiable. Then, the PDF of $Y$ in the region where $f_Y(y) > 0$ is given by
$$f_Y(y) = f_X\bigl(h(y)\bigr) \left|\frac{dh}{dy}(y)\right|.$$

4.2 COVARIANCE AND CORRELATION

Covariance and Correlation

- The covariance of $X$ and $Y$ is given by
  $$\mathrm{cov}(X, Y) = E\Bigl[\bigl(X - E[X]\bigr)\bigl(Y - E[Y]\bigr)\Bigr] = E[XY] - E[X]\, E[Y].$$

- If $\mathrm{cov}(X, Y) = 0$, we say that $X$ and $Y$ are uncorrelated.

- If $X$ and $Y$ are independent, they are uncorrelated. The converse is not always true.

- We have
  $$\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y) + 2\,\mathrm{cov}(X, Y).$$

- The correlation coefficient $\rho(X, Y)$ of two random variables $X$ and $Y$ with positive variances is defined by
  $$\rho(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}},$$
  and satisfies
  $$-1 \leq \rho(X, Y) \leq 1.$$
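The sketch below evaluates the covariance and the correlation coefficient for a small joint PMF whose probabilities are invented purely for illustration.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint PMF of (X, Y), chosen only to illustrate the covariance formulas.
joint = {(0, 0): Fraction(2, 5), (1, 1): Fraction(2, 5),
         (0, 1): Fraction(1, 10), (1, 0): Fraction(1, 10)}

def E(g):
    """Expected value of g(X, Y) under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

E_X, E_Y, E_XY = E(lambda x, y: x), E(lambda x, y: y), E(lambda x, y: x * y)
cov = E_XY - E_X * E_Y                         # cov(X, Y) = E[XY] - E[X]E[Y]
var_X = E(lambda x, y: x**2) - E_X**2
var_Y = E(lambda x, y: y**2) - E_Y**2
rho = float(cov) / sqrt(float(var_X * var_Y))  # correlation coefficient
print(cov, rho)                                # 3/20 0.6
```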

4.3 CONDITIONAL EXPECTATION AND VARIANCE REVISITED

Law of Iterated Expectations:
$$E\bigl[E[X \mid Y]\bigr] = E[X].$$

Law of Total Variance:
$$\mathrm{var}(X) = E\bigl[\mathrm{var}(X \mid Y)\bigr] + \mathrm{var}\bigl(E[X \mid Y]\bigr).$$

Properties of the Conditional Expectation and Variance

- $E[X \mid Y = y]$ is a number whose value depends on $y$.
- $E[X \mid Y]$ is a function of the random variable $Y$, hence a random variable. Its value is $E[X \mid Y = y]$ whenever the value of $Y$ is $y$.
- $E\bigl[E[X \mid Y]\bigr] = E[X]$ (law of iterated expectations).
- $E[X \mid Y = y]$ may be viewed as an estimate of $X$ given $Y = y$. The corresponding error $E[X \mid Y] - X$ is a zero mean random variable that is uncorrelated with $E[X \mid Y]$.
- $\mathrm{var}(X \mid Y)$ is a random variable whose value is $\mathrm{var}(X \mid Y = y)$ whenever the value of $Y$ is $y$.
- $\mathrm{var}(X) = E\bigl[\mathrm{var}(X \mid Y)\bigr] + \mathrm{var}\bigl(E[X \mid Y]\bigr)$ (law of total variance).
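A small discrete check of the laws of iterated expectations and total variance; the joint PMF of (X, Y) is an illustrative assumption, not from the text.

```python
from fractions import Fraction

# Illustrative joint PMF of (X, Y): four equally likely pairs.
joint = {(0, 0): Fraction(1, 4), (2, 0): Fraction(1, 4),
         (1, 1): Fraction(1, 4), (3, 1): Fraction(1, 4)}

p_Y = {}
for (x, y), p in joint.items():
    p_Y[y] = p_Y.get(y, Fraction(0)) + p

def cond_moment(y, k):
    """E[X^k | Y = y] computed from the joint PMF."""
    return sum(x**k * p for (x, yy), p in joint.items() if yy == y) / p_Y[y]

E_X = sum(x * p for (x, _), p in joint.items())
var_X = sum(x**2 * p for (x, _), p in joint.items()) - E_X**2

# Law of iterated expectations: E[E[X | Y]] = E[X].
iterated = sum(p_Y[y] * cond_moment(y, 1) for y in p_Y)

# Law of total variance: var(X) = E[var(X | Y)] + var(E[X | Y]).
total_var = (sum(p_Y[y] * (cond_moment(y, 2) - cond_moment(y, 1) ** 2) for y in p_Y)
             + sum(p_Y[y] * cond_moment(y, 1) ** 2 for y in p_Y) - iterated**2)

assert iterated == E_X and total_var == var_X
print(E_X, var_X)   # 3/2 5/4
```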

4.4 TRANSFORMS

Summary of Transforms and their Properties

- The transform associated with a random variable $X$ is given by
  $$M_X(s) = E[e^{sX}] = \begin{cases} \displaystyle\sum_x e^{sx}\, p_X(x), & X \text{ discrete}, \\[2ex] \displaystyle\int_{-\infty}^{\infty} e^{sx} f_X(x)\, dx, & X \text{ continuous}. \end{cases}$$

- The distribution of a random variable is completely determined by the corresponding transform.

- Moment generating properties:
  $$M_X(0) = 1, \qquad \left.\frac{d}{ds} M_X(s)\right|_{s=0} = E[X], \qquad \left.\frac{d^n}{ds^n} M_X(s)\right|_{s=0} = E[X^n].$$

- If $Y = aX + b$, then $M_Y(s) = e^{sb} M_X(as)$.

- If $X$ and $Y$ are independent, then $M_{X+Y}(s) = M_X(s)\, M_Y(s)$.

Transforms for Common Discrete Random Variables

Bernoulli(p) (k = 0, 1):
$$p_X(k) = \begin{cases} p, & \text{if } k = 1, \\ 1 - p, & \text{if } k = 0, \end{cases} \qquad M_X(s) = 1 - p + p e^s.$$

Binomial(n, p) (k = 0, 1, ..., n):
$$p_X(k) = \binom{n}{k} p^k (1 - p)^{n - k}, \qquad M_X(s) = (1 - p + p e^s)^n.$$

Geometric(p) (k = 1, 2, ...):
$$p_X(k) = p(1 - p)^{k - 1}, \qquad M_X(s) = \frac{p e^s}{1 - (1 - p) e^s}.$$

Poisson(λ) (k = 0, 1, ...):
$$p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \qquad M_X(s) = e^{\lambda(e^s - 1)}.$$

Uniform(a, b) (k = a, a + 1, ..., b):
$$p_X(k) = \frac{1}{b - a + 1}, \qquad M_X(s) = \frac{e^{sa}\bigl(e^{s(b - a + 1)} - 1\bigr)}{(b - a + 1)(e^s - 1)}.$$

Transforms for Common Continuous Random Variables

Uniform(a, b) (a ≤ x ≤ b):
$$f_X(x) = \frac{1}{b - a}, \qquad M_X(s) = \frac{e^{sb} - e^{sa}}{s(b - a)}.$$

Exponential(λ) (x ≥ 0):
$$f_X(x) = \lambda e^{-\lambda x}, \qquad M_X(s) = \frac{\lambda}{\lambda - s} \quad (s < \lambda).$$

Normal(μ, σ²) (−∞ < x < ∞):
$$f_X(x) = \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-(x - \mu)^2 / 2\sigma^2}, \qquad M_X(s) = e^{(\sigma^2 s^2 / 2) + \mu s}.$$

4.5 SUM OF A RANDOM NUMBER OF INDEPENDENT RANDOM VARIABLES

Properties of the Sum of a Random Number of Independent Random Variables

Let $X_1, X_2, \ldots$ be identically distributed random variables with mean $E[X]$ and variance $\mathrm{var}(X)$. Let $N$ be a random variable that takes nonnegative integer values. We assume that all of these random variables are independent, and we consider the sum
$$Y = X_1 + \cdots + X_N.$$
Then:

- $E[Y] = E[N]\, E[X]$.
- $\mathrm{var}(Y) = E[N]\, \mathrm{var}(X) + (E[X])^2\, \mathrm{var}(N)$.
- We have
  $$M_Y(s) = M_N\bigl(\log M_X(s)\bigr).$$
  Equivalently, the transform $M_Y(s)$ is found by starting with the transform $M_N(s)$ and replacing each occurrence of $e^s$ with $M_X(s)$.

4.6 SUMMARY AND DISCUSSION

5  Limit Theorems

Excerpts from Introduction to Probability: Second Edition
by Dimitri P. Bertsekas and John N. Tsitsiklis
© Massachusetts Institute of Technology

Contents
5.1. Markov and Chebyshev Inequalities            p. 265
5.2. The Weak Law of Large Numbers                p. 269
5.3. Convergence in Probability                   p. 271
5.4. The Central Limit Theorem                    p. 273
5.5. The Strong Law of Large Numbers              p. 280
5.6. Summary and Discussion                       p. 282
     Problems                                     p. 284

5.1 MARKOV AND CHEBYSHEV INEQUALITIES

Markov Inequality

If a random variable $X$ can only take nonnegative values, then
$$P(X \geq a) \leq \frac{E[X]}{a}, \quad \text{for all } a > 0.$$

Chebyshev Inequality

If $X$ is a random variable with mean $\mu$ and variance $\sigma^2$, then
$$P\bigl(|X - \mu| \geq c\bigr) \leq \frac{\sigma^2}{c^2}, \quad \text{for all } c > 0.$$

5.2 THE WEAK LAW OF LARGE NUMBERS

The Weak Law of Large Numbers

Let $X_1, X_2, \ldots$ be independent identically distributed random variables with mean $\mu$. For every $\epsilon > 0$, we have
$$P\bigl(|M_n - \mu| \geq \epsilon\bigr) = P\left(\left|\frac{X_1 + \cdots + X_n}{n} - \mu\right| \geq \epsilon\right) \to 0, \quad \text{as } n \to \infty.$$
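The Chebyshev bound can be quite loose. The simulation below makes this concrete for a fair die (μ = 3.5, σ² = 35/12) with an illustrative threshold c = 2; the experiment is a sketch under these assumptions, not taken from the text.

```python
import random

# Compare the empirical frequency of |X - mu| >= c with the Chebyshev bound sigma^2 / c^2.
mu, sigma2, c = 3.5, 35 / 12, 2.0
random.seed(0)
n = 100_000
hits = sum(1 for _ in range(n) if abs(random.randint(1, 6) - mu) >= c)

print(round(hits / n, 3), round(sigma2 / c**2, 3))   # about 0.333 vs. bound 0.729
```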

5.3 CONVERGENCE IN PROBABILITY

Convergence of a Deterministic Sequence

Let $a_1, a_2, \ldots$ be a sequence of real numbers, and let $a$ be another real number. We say that the sequence $a_n$ converges to $a$, or $\lim_{n \to \infty} a_n = a$, if for every $\epsilon > 0$ there exists some $n_0$ such that
$$|a_n - a| \leq \epsilon, \quad \text{for all } n \geq n_0.$$

Convergence in Probability

Let $Y_1, Y_2, \ldots$ be a sequence of random variables (not necessarily independent), and let $a$ be a real number. We say that the sequence $Y_n$ converges to $a$ in probability, if for every $\epsilon > 0$, we have
$$\lim_{n \to \infty} P\bigl(|Y_n - a| \geq \epsilon\bigr) = 0.$$

5.4 THE CENTRAL LIMIT THEOREM

The Central Limit Theorem

Let $X_1, X_2, \ldots$ be a sequence of independent identically distributed random variables with common mean $\mu$ and variance $\sigma^2$, and define
$$Z_n = \frac{X_1 + \cdots + X_n - n\mu}{\sigma \sqrt{n}}.$$
Then, the CDF of $Z_n$ converges to the standard normal CDF
$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-x^2/2}\, dx,$$
in the sense that
$$\lim_{n \to \infty} P(Z_n \leq z) = \Phi(z), \quad \text{for every } z.$$

Normal Approximation Based on the Central Limit Theorem

Let $S_n = X_1 + \cdots + X_n$, where the $X_i$ are independent identically distributed random variables with mean $\mu$ and variance $\sigma^2$. If $n$ is large, the probability $P(S_n \leq c)$ can be approximated by treating $S_n$ as if it were normal, according to the following procedure.
1. Calculate the mean $n\mu$ and the variance $n\sigma^2$ of $S_n$.
2. Calculate the normalized value $z = (c - n\mu)/(\sigma\sqrt{n})$.
3. Use the approximation $P(S_n \leq c) \approx \Phi(z)$.
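A sketch of the normal-approximation procedure, assuming S_n is the sum of n = 36 fair die rolls and the threshold c = 130 (illustrative choices only); the result is compared against a simulated frequency.

```python
import random
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical example: S_n = sum of 36 fair die rolls, so mu = 3.5 and sigma^2 = 35/12.
n, mu, sigma2, c = 36, 3.5, 35 / 12, 130
z = (c - n * mu) / sqrt(n * sigma2)   # steps 1 and 2 of the procedure
approx = Phi(z)                       # step 3: P(S_n <= c) ~ Phi(z)

random.seed(0)
trials = 20_000
freq = sum(1 for _ in range(trials)
           if sum(random.randint(1, 6) for _ in range(n)) <= c) / trials
print(round(approx, 4), round(freq, 4))   # normal approximation vs. simulated frequency
```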
