
SCHAUM'S outlines
INTRODUCTION TO PROBABILITY AND STATISTICS

Covers all probability fundamentals
Statistics with latest applications
No calculus needed
3fJ7 fully solved problems
Perfect aid for better grades

Use with these courses: Introduction to Probability and Statistics; Probability; Statistics; Introduction to Statistics

SCHAUM'S OUTLINE OF THEORY AND PROBLEMS OF INTRODUCTION TO PROBABILITY AND STATISTICS

SEYMOUR LIPSCHUTZ, Ph.D.
Professor of Mathematics
Temple University

JOHN J. SCHILLER, Jr., Ph.D.
Associate Professor of Mathematics
Temple University

SCHAUM'S OUTLINE SERIES
McGRAW-HILL
New York San Francisco Washington, D.C. Auckland Bogotá Caracas Lisbon London Madrid Mexico City Milan Montreal New Delhi San Juan Singapore Sydney Tokyo Toronto

SEYMOUR LIPSCHUTZ, who is presently on the mathematics faculty of Temple University, formerly taught at the Polytechnic Institute of Brooklyn and was visiting professor in the Computer Science Department of Brooklyn College. He received his Ph.D. in 1960 at the Courant Institute of Mathematical Sciences of New York University. Some of his other books in the Schaum's Outline Series are Beginning Linear Algebra; Discrete Mathematics, 2nd ed.; Probability; and Linear Algebra, 2nd ed.

JOHN J. SCHILLER is an Associate Professor of Mathematics at Temple University. He received his Ph.D. at the University of Pennsylvania and has published research papers in the areas of Riemann surfaces, discrete mathematics, and mathematical biology. He has also coauthored texts in finite mathematics, precalculus, and calculus.

Schaum's Outline of Theory and Problems of
INTRODUCTION TO PROBABILITY AND STATISTICS

Copyright © 1998 by The McGraw-Hill Companies, Inc. All rights reserved. Printed in the United States of America. Except as permitted under the Copyright Act of 1976, no part of this publication may be reproduced or distributed in any forms or by any means, or stored in a data base or retrieval system, without the prior written permission of the publisher.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 PRS PRS 9 0 2 1 0 9 8

ISBN 0-07-038084-8

Sponsoring Editor: Barbara Gilson
Production Supervisor: Tina Cameron
Editing Supervisor: Maureen B. Walker
Project Supervision: Keyword Publishing Services

Library of Congress Cataloging-in-Publication Data

McGraw-Hill
A Division of The McGraw-Hill Companies

Preface

Probability and statistics appear explicitly or implicitly in many disciplines, including computer and information science, physics, chemistry, geology, biology, medicine, psychology, sociology, political science, education, economics, business, operations research, and all branches of engineering.

The purpose of this book is to present an introduction to principles and methods of probability and statistics which would be useful to all individuals regardless of their fields of specialization. It is designed for use as a supplement to all current standard texts, or as a textbook in a beginning course in probability and statistics with high school algebra as the only prerequisite.

The material is divided into two parts, since the logical development is not disturbed by the division while the usefulness as a text and reference book is increased.

Part I covers descriptive statistics and elements of probability. The first chapter treats descriptive statistics which motivates various concepts appearing in the chapters on probability, and the second chapter covers sets and counting which are needed for a modern treatment of probability. Part I also includes a chapter on random variables where we define expectation, variance, and standard deviation of random variables, and where we discuss and prove Chebyshev's inequality and the law of large numbers. This is followed by a separate chapter on the binomial and normal distributions, where the central limit theorem is discussed in the context of the normal approximation to the binomial distribution.

Part II treats inferential statistics. It begins with a chapter on sampling distributions for sampling with and without replacement and for small and large samples. Then there are chapters on estimation (confidence intervals) and hypothesis testing for a single population, and then a separate chapter covering these topics for two populations. Lastly, there is a chapter on chi-square tests and analysis of variance.

Each chapter begins with clear statements of pertinent definitions, principles, and theorems together with illustrative and other descriptive material. This is followed by graded sets of solved and supplementary problems. The solved problems serve to illustrate and amplify the material, and provide the repetition of basic principles so vital to effective learning. The supplementary problems serve as a complete review of the material in the chapter.

We wish to thank many friends and colleagues for invaluable suggestions and critical review of the manuscript. We also wish to express our gratitude to the staff of McGraw-Hill, particularly to Barbara Gilson and Mary Loebig Giles, for their excellent cooperation.

SEYMOUR LIPSCHUTZ
JOHN J. SCHILLER
Temple University

Contents

PART I Descriptive Statistics and Probability

Chapter 1 PRELIMINARY: DESCRIPTIVE STATISTICS
1.1 Introduction. 1.2 Frequency Tables, Histograms. 1.3 Measures of Central Tendency: Mean and Median. 1.4 Measures of Dispersion: Variance and Standard Deviation. 1.5 Measures of Position: Quartiles and Percentiles. 1.6 Measures of Comparison: Standard Units and Coefficient of Variation. 1.7 Additional Descriptions of Data. 1.8 Bivariate Data, Scatterplots. 1.9 Correlation Coefficient. 1.10 Methods of Least Squares, Regression Line, Curve Fitting.

Chapter 2 SETS AND COUNTING
2.1 Introduction. 2.2 Sets and Elements, Subsets. 2.3 Venn Diagrams. 2.4 Set Operations. 2.5 Finite and Countable Sets. 2.6 Counting Elements in Finite Sets, Inclusion-Exclusion Principle. 2.7 Product Sets. 2.8 Classes of Sets, Power Sets, Partitions. 2.9 Mathematical Induction. 2.10 Counting Principles. 2.11 Factorial Notation, Binomial Coefficients. 2.12 Permutations. 2.13 Combinations. 2.14 Tree Diagrams.

Chapter 3 BASIC PROBABILITY
3.1 Introduction. 3.2 Sample Space and Events. 3.3 Axioms of Probability. 3.4 Finite Probability Spaces. 3.5 Infinite Sample Spaces. 3.6 Classical Birthday Problem. 3.7 Expectation.

Chapter 4 CONDITIONAL PROBABILITY AND INDEPENDENCE
4.1 Introduction. 4.2 Conditional Probability. 4.3 Finite Stochastic Processes and Tree Diagrams. 4.4 Total Probability and Bayes' Formula. 4.5 Independent Events. 4.6 Independent Repeated Trials.

Chapter 5 RANDOM VARIABLES
5.1 Introduction. 5.2 Random Variables. 5.3 Probability Distribution of a Finite Random Variable. 5.4 Expectation of a Finite Random Variable. 5.5 Variance and Standard Deviation. 5.6 Joint Distribution of Random Variables. 5.7 Independent Random Variables. 5.8 Functions of a Random Variable. 5.9 Discrete Random Variables in General. 5.10 Continuous Random Variables. 5.11 Cumulative Distribution Function. 5.12 Chebyshev's Inequality and the Law of Large Numbers.

Chapter 6 BINOMIAL AND NORMAL DISTRIBUTIONS
6.1 Introduction. 6.2 Bernoulli Trials, Binomial Distribution. 6.3 Normal Distribution. 6.4 Evaluating Normal Probabilities. 6.5 Normal Approximation of the Binomial Distribution. 6.6 Poisson Distribution. 6.7 Multinomial Distribution.

PART II Inferential Statistics

Chapter 7 SAMPLING DISTRIBUTIONS
7.1 Introduction: Sampling With and Without Replacement. 7.2 Sample Mean. 7.3 Sample Proportion. 7.4 Sample Variance.

Chapter 8 CONFIDENCE INTERVALS FOR A SINGLE POPULATION
8.1 Parameters and Statistics. 8.2 The Notion of a Confidence Interval. 8.3 Confidence Intervals for Means. 8.4 Confidence Intervals for Proportions. 8.5 Confidence Intervals for Variances.

Chapter 9 HYPOTHESES TESTS FOR A SINGLE POPULATION
9.1 Introduction: Testing Hypotheses About Parameters. 9.2 Hypotheses Tests for Means. 9.3 Hypotheses Tests for Proportions. 9.4 Hypotheses Tests for Variances.

Chapter 10 INFERENCE FOR TWO POPULATIONS
10.1 Confidence Intervals for the Difference of Means. 10.2 Hypotheses Tests for the Difference of Means. 10.3 Confidence Intervals for Differences of Proportions. 10.4 Hypotheses Tests for Differences of Proportions. 10.5 Confidence Intervals for Ratios of Variances. 10.6 Hypotheses Tests for Ratios of Variances.

Chapter 11 CHI-SQUARE TESTS AND ANALYSIS OF VARIANCE
11.1 Chi-Square Goodness-of-Fit Test. 11.2 Chi-Square Test for Equal Distributions. 11.3 Chi-Square Test for Independent Attributes. 11.4 One-Way Analysis of Variance. 11.5 Two-Way Analysis of Variance.

APPENDIX

INDEX

PART I: Descriptive Statistics and Probability

Chapter 1
Preliminary: Descriptive Statistics

1.1 INTRODUCTION

Statistics, on the one hand, means lists of numerical values; for example, the salaries of the employees of a company, or the SAT scores of the incoming students of a university. Statistics as a science, on the other hand, is the branch of mathematics which organizes, analyzes, and interprets such raw data. Statistical methods are applicable to any area of human endeavor where numerical data are collected for some type of decision-making process.

This preliminary chapter simply covers topics related to gathering and describing data called Descriptive Statistics. It will be used in both the first part of the text, which mainly treats Probability Theory, and the second part of the text, which mainly treats Inferential Statistics.

Real Line R

The notation R will be used to denote the set of real numbers, which are the numbers we use for numerical data. We assume the reader is familiar with the graphical representation of R as points on a straight line, as pictured in Fig. 1-1. We refer to such a line as the real line or the real line R.

[Fig. 1-1: The real line R]

Frequently we will deal with sets of numbers called intervals. Specifically, for any real numbers a and b, with $a < b$, we denote and define intervals from a to b as follows:

$(a, b) = \{x : a < x < b\}$, open interval
$[a, b] = \{x : a \le x \le b\}$, closed interval
$[a, b) = \{x : a \le x < b\}$, closed-open interval
$(a, b] = \{x : a < x \le b\}$, open-closed interval

That is, each interval consists of all the points between a and b; the term "closed" and a bracket are used to indicate that the endpoint belongs to the interval and the term "open" and a parenthesis are used to indicate that an endpoint does not belong to the interval.

Subscript Notation, Summation Symbol

Consider a list of numerical data, say the weights of eight students. They may all be denoted by:

$$w_1, w_2, w_3, w_4, w_5, w_6, w_7, w_8$$

The numbers 1, 2, ..., 8 written below the w's are called subscripts. An arbitrary element in the list will be denoted by $w_j$. The subscript j is called an index because it gives the position of the element in the list. (The letters i and k are also frequently used as index symbols.)
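To connect the subscript notation with how such a list might be stored in a program, here is a minimal Python sketch. The eight weights are hypothetical values invented for illustration (the text gives none), and the helper name `w` is likewise an assumption. Python lists are indexed from 0, so the element with subscript j sits at list position j - 1.

```python
# Hypothetical weights (in pounds) of eight students; not data from the text.
weights = [134, 146, 104, 119, 124, 161, 107, 83]


def w(j):
    """Return the element with the text's 1-based subscript j."""
    if not 1 <= j <= len(weights):
        raise IndexError(f"subscript {j} is outside 1..{len(weights)}")
    return weights[j - 1]  # shift from 1-based subscript to 0-based index


# Let the index j run over the subscripts 1, 2, ..., 8.
for j in range(1, len(weights) + 1):
    print(f"w_{j} = {w(j)}")
```

The explicit range check is a design choice: without it, a subscript of 0 would silently return the last element, because Python treats index -1 as counting from the end of the list.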

The sum of the eight weights of the students may be expressed in the form

$$w_1 + w_2 + w_3 + w_4 + w_5 + w_6 + w_7 + w_8$$

Clearly, this expression for the sum would be very long and awkward to use if there were many more numbers in the list. Mathematics has developed a shorthand for such sums which is independent of the number of items in the list.

Summation notation uses the summation symbol $\sum$ (the Greek letter sigma). Specifically, given a list $x_1, x_2, \ldots, x_n$ of n numbers, its sum may be denoted by

$$\sum\limits_{j=1}^{n} x_j \quad \text{or} \quad \sum\nolimits_{j=1}^{n} x_j$$

which is read: The sum of the x-sub-j's as j goes from 1 to n. If the number n of items is understood we may simply write

$$\sum x_j$$

More generally, suppose $f(k)$ is an algebraic expression involving the variable k, and $n_1$ and $n_2$ are integers for which $n_1 \le n_2$. Then we define

$$\sum_{k=n_1}^{n_2} f(k) = f(n_1) + f(n_1 + 1) + f(n_1 + 2) + \cdots + f(n_2)$$

Thus we have, for example,

$$\sum_{j=1}^{8} w_j = w_1 + w_2 + w_3 + w_4 + w_5 + w_6 + w_7 + w_8$$

$$\sum_{k=3}^{5} k^2 = 3^2 + 4^2 + 5^2 = 9 + 16 + 25 = 50$$

$$\sum a_k b_k = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$

$$\sum (x_j - \bar{x}) = (x_1 - \bar{x}) + (x_2 - \bar{x}) + \cdots + (x_n - \bar{x})$$

(We assume the index goes from 1 to n in the last two sums.)

1.2 FREQUENCY TABLES, HISTOGRAMS

One of the first things one usually does with a large list of numerical data is to form some type of frequency table, where the table shows the number of times an individual item occurs or the number of items that fall within a given interval. These frequency distributions may be pictured using histograms. We illustrate this technique with two examples.

EXAMPLE 1.1 An apartment house has 45 apartments, with the following number of tenants:

23 5 2 2 24 2 6 2 4 32 4 34 4 2 4 4 2 2 34 235 2 43 2 4 4 2 53 4

Observe that the only numbers which appear in the list are 1, 2, 3, 4, 5, and 6. The frequency distribution of these numbers appears in Fig. 1-2. Specifically, column 1 lists the given numbers and column 2 gives the frequency

of each number. (These frequencies can be obtained by some sort of "tally count" as in Problem 1.2.) Figure 1-2 also gives the cumulative frequency distribution. Specifically, column 3 gives the cumulative frequency of each number, which is the number of tenant numbers not exceeding the given number. The cumulative frequency is obtained by simply adding up the frequencies until the given frequency. Clearly, the last cumulative frequency number 45 is the same as the sum of all frequencies, that is, the number of apartments.

The frequency distribution in Fig. 1-2 may be pictured by a histogram shown in Fig. 1-3. A histogram is simply a bar graph where the height of the bar gives the number of times the given number appears in the list. Similarly, the cumulative frequency distribution could be presented as a histogram; the heights of the bars would be 8, 22, 29, ..., 45.

[Fig. 1-2: frequency table; the cumulative frequency column reads 8, 22, 29, 41, 44, 45]
[Fig. 1-3: histogram of the frequency distribution]

EXAMPLE 1.2 Suppose the 6:00 P.M. temperatures (in degrees Fahrenheit) for a 35-day period are as follows:

72 78 86 93 106 107 98 82 81 77 87 82
91 95 92 83 76 78 73 81 86 92 93 84
107 99 94 86 81 77 73 76 80 88 91

Rather than find the frequency distribution of each individual data item, it is more useful to construct a frequency table which counts the number of times the observed temperature falls in a given class, i.e. an interval with certain limits. This is done in Fig. 1-4.

The numbers 70, 75, 80, ... are called the class boundaries or class limits. If a data item falls on a class boundary, it is usually assigned to the higher class; for example, the number 95 was placed in the 95-100 class. Sometimes a frequency table also lists each class value, i.e. the midpoint of the class interval which serves as an approximation to the values in the interval.

Figure 1-5 shows the histogram which corresponds to the frequency distribution in Fig. 1-4.
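The class-counting procedure of Example 1.2 can be sketched in Python as follows. The temperature list is the one given in the example, with the fifth entry read as 106; the function name `class_frequencies`, its parameters, and the choice of eight classes of width 5 starting at 70 are assumptions inferred from the class boundaries 70, 75, 80, ... mentioned in the text. Integer division places a value that falls on a boundary into the higher class, which is the convention the text describes.

```python
# Temperatures from Example 1.2 (fifth entry read as 106, an assumption).
temperatures = [
    72, 78, 86, 93, 106, 107, 98, 82, 81, 77, 87, 82,
    91, 95, 92, 83, 76, 78, 73, 81, 86, 92, 93, 84,
    107, 99, 94, 86, 81, 77, 73, 76, 80, 88, 91,
]


def class_frequencies(data, start=70, width=5, classes=8):
    """Count how many values fall in each class [start + i*width, start + (i+1)*width)."""
    counts = [0] * classes
    for x in data:
        i = (x - start) // width  # a boundary value lands in the higher class
        if not 0 <= i < classes:
            raise ValueError(f"{x} lies outside the class range")
        counts[i] += 1
    return counts


freq = class_frequencies(temperatures)

# Cumulative frequency: add up the frequencies up to and including each class.
cumulative = []
running = 0
for f in freq:
    running += f
    cumulative.append(running)

for i, (f, c) in enumerate(zip(freq, cumulative)):
    lower = 70 + 5 * i
    print(f"[{lower}, {lower + 5})  class value {lower + 2.5}  frequency {f}  cumulative {c}")
```

As a sanity check, the last cumulative frequency printed should equal 35, the number of days in the sample.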
Figure 1-5 also shows the frequency polygon, which is a line graph obtained by connecting the midpoints of the tops of the rectangles in the histogram. Observe that the line graph is extended to the class value 67.5 on the left and to 112.5 on the right. In such a case, the sum of the areas of the rectangles equals the area bounded by the frequency polygon and the x-axis.

Interval Notation, Number of Classes

The entries forming a class can be denoted using interval notation. Since a bracket indicates that a class boundary belongs to an interval, but a parenthesis means that it does not, the classes in Fig. 1-4 can be denoted by

[70, 75), [75, 80), ..., [105, 110)

respectively. Also, there is no fixed rule for the number of classes that should be formed for data. The fewer the number of classes, the less specific is the information displayed by the histogram, but a larger

number of classes may defeat the purpose of grouping the data (see Problem 1.1). The rule of thumb is that the number of classes should lie somewhere between 5 and 10.

[Fig. 1-4: frequency table listing class boundaries, class value, frequency, and cumulative frequency]
[Fig. 1-5: histogram and frequency polygon]

Qualitative Data, Bar and Circular Graphs

Most data in this text will be numerical unless otherwise stated or implied. However, sometimes we do come into contact with nonnumerical data, called qualitative data, such as gender (male or female), major subject (English, Mathematics, Philosophy, ...), place of birth, and so on. Clearly, a frequency table can be formed for such data (but a cumulative frequency table would have no meaning). Instead of a histogram, such data may be pictured as (a) a bar graph and/or (b) a circular graph (also called a pie graph or pie chart).

EXAMPLE 1.3 Suppose the students at a small Community College in Philadelphia are partitioned into five groups according to their home address: (1) Philadelphia, (2) suburbs of Philadelphia, (3) Pennsylvania (outside Philadelphia and its suburbs), (4) New Jersey, and (5) elsewhere; and suppose the following is the frequency distribution for the college during some semester:

Group                 1     2     3     4     5     SUM
Number of students    225   100   60    75    40    500

Draw (a) the bar graph, and (b) the circular graph of the data.

(a) Figure 1-6 shows a bar graph for the data. The length of each bar is proportional to the number of students living in the area. The bar graph is not a horizontal histogram. Specifically, the order of the data can be interchanged in the bar graph, e.g. putting New Jersey before Pennsylvania, without essentially changing the graph. This cannot be done with a histogram, since the data is numerical and has a given order. (A histogram may be viewed as a special kind of bar graph.)

(b) Figure 1-7 shows a circular graph for the data.
If $d$ is the number of degrees in a "slice" (sector) corresponding to a group with $n$ items out of SUM items, then

$$d = \left(\frac{n}{\text{SUM}}\right)(360)$$

For example, Philadelphia is assigned a slice with

$$\left[\frac{225}{500}\right](360) = 162 \text{ degrees}$$

Clearly, the sum of the degrees assigned to the data must equal 360 degrees.
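The slice formula can be checked with a short Python sketch. Philadelphia's count of 225 and the total of 500 come from the worked example above; the other four counts are an assumed reading of the example's frequency table, chosen so that the five counts add up to 500. The function name `slice_degrees` is hypothetical.

```python
# Student counts by home address. Only Philadelphia's 225 and the total 500
# are stated in the worked example; the other counts are an assumed reading.
counts = {
    "Philadelphia": 225,
    "Suburbs of Philadelphia": 100,
    "Pennsylvania": 60,
    "New Jersey": 75,
    "Elsewhere": 40,
}


def slice_degrees(group_counts):
    """Return the sector angle (n / SUM) * 360 for each group."""
    total = sum(group_counts.values())
    return {group: 360 * n / total for group, n in group_counts.items()}


degrees = slice_degrees(counts)
for group, d in degrees.items():
    print(f"{group}: {d:.1f} degrees")

# The angles of all slices together should make up the full circle.
print(f"Total: {sum(degrees.values()):.1f} degrees")
```

Computing `360 * n / total` rather than `(n / total) * 360` keeps the intermediate product an integer, which avoids a small rounding error in cases such as 225 out of 500.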

[Fig. 1-6: bar graph of the student data]
[Fig. 1-7: circular graph of the student data]

1.3 MEASURES OF CENTRAL TENDENCY: MEAN AND MEDIAN

There are various ways of giving an overview of data. One way is by the graphical descriptions discussed above. There are also numerical descriptions of data. Numbers such as the mean and median give, in some sense, the central or middle value of the data. Other numbers, such as variance and standard deviation, measure the dispersion or spread of the data about the mean. The central tendency of data is discussed in this section and dispersion in the following section.

The data we discuss will come either from a random sample of a larger population or from the larger population itself. We distinguish these two cases using different notation as follows:

$n$ = number of items in the sample, $N$ = number of elements in the population
$\bar{x}$ (read: x-bar) = sample mean, $\mu$ (read: mu) = population mean
$s^2$ = sample variance, $\sigma^2$ = population variance

Note: Greek letters are used with the population and are called parameters. Latin letters are used with the samples and are called statistics.

Mean

Suppose a sample consists of the eight numbers:

7, 11, 11, 8, 12, 7, 6, 6

The sample mean $\bar{x}$ is defined to be the sum of the values divided by the number of values; that is,

$$\bar{x} = \frac{7 + 11 + 11 + 8 + 12 + 7 + 6 + 6}{8} = \frac{68}{8} = 8.5$$

Generally speaking, suppose $x_1, x_2, \ldots, x_n$ are n numerical values of some sample. Then:

$$\text{Sample mean:} \quad \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x_i}{n} \qquad (1.1)$$

Now suppose that the data are organized into a frequency table; let there be k distinct numeri
