Probability and Computing: Randomized Algorithms and Probabilistic Analysis, by Michael Mitzenmacher and Eli Upfal


Probability and Computing
Randomized Algorithms and Probabilistic Analysis
Michael Mitzenmacher and Eli Upfal

Probability and Computing

Randomization and probabilistic techniques play an important role in modern computer science, with applications ranging from combinatorial optimization and machine learning to communication networks and secure protocols. This textbook is designed to accompany a one- or two-semester course for advanced undergraduates or beginning graduate students in computer science and applied mathematics. It gives an excellent introduction to the probabilistic techniques and paradigms used in the development of probabilistic algorithms and analyses. It assumes only an elementary background in discrete mathematics and gives a rigorous yet accessible treatment of the material, with numerous examples and applications.

The first half of the book covers core material, including random sampling, expectations, Markov's inequality, Chebyshev's inequality, Chernoff bounds, balls-and-bins models, the probabilistic method, and Markov chains. In the second half, the authors delve into more advanced topics such as continuous probability, applications of limited independence, entropy, Markov chain Monte Carlo methods, coupling, martingales, and balanced allocations. With its comprehensive selection of topics, along with many examples and exercises, this book is an indispensable teaching tool.

Michael Mitzenmacher is John L. Loeb Associate Professor in Computer Science at Harvard University. He received his Ph.D. from the University of California, Berkeley, in 1996. Prior to joining Harvard in 1999, he was a research staff member at Digital Systems Research Laboratory in Palo Alto. He has received an NSF CAREER Award and an Alfred P. Sloan Research Fellowship. In 2002, he shared the IEEE Information Theory Society "Best Paper" Award for his work on error-correcting codes.

Eli Upfal is Professor and Chair of Computer Science at Brown University. He received his Ph.D. from the Hebrew University, Jerusalem, Israel. Prior to joining Brown in 1997, he was a research staff member at the IBM research division and a professor at the Weizmann Institute of Science in Israel. His main research interests are randomized computation and probabilistic analysis of algorithms, with applications to optimization algorithms, communication networks, parallel and distributed computing, and computational biology.

Probability and Computing
Randomized Algorithms and Probabilistic Analysis

Michael Mitzenmacher, Harvard University
Eli Upfal, Brown University

Cambridge University Press

PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge, United Kingdom

CAMBRIDGE UNIVERSITY PRESS
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011-4211, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa
http://www.cambridge.org

© Michael Mitzenmacher and Eli Upfal 2005

This book is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2005
Printed in the United States of America
Typeface Times 10.5/13 pt. System AMS-TEX [FH]

A catalog record for this book is available from the British Library.

Library of Congress Cataloging in Publication data
Mitzenmacher, Michael, 1969-
Probability and computing: randomized algorithms and probabilistic analysis / Michael Mitzenmacher, Eli Upfal.
p. cm.
Includes index.
ISBN 0-521-83540-2 (alk. paper)
1. Algorithms. 2. Probabilities. 3. Stochastic analysis. I. Upfal, Eli, 1954-. II. Title.
QA274.M574 2005
51W.1 - dc22
2004054540

ISBN 0 521 83540 2 hardback

Contents

Preface

1  Events and Probability
   1.1  Application: Verifying Polynomial Identities
   1.2  Axioms of Probability
   1.3  Application: Verifying Matrix Multiplication
   1.4  Application: A Randomized Min-Cut Algorithm
   1.5  Exercises

2  Discrete Random Variables and Expectation
   2.1  Random Variables and Expectation
        2.1.1  Linearity of Expectations
        2.1.2  Jensen's Inequality
   2.2  The Bernoulli and Binomial Random Variables
   2.3  Conditional Expectation
   2.4  The Geometric Distribution
        2.4.1  Example: Coupon Collector's Problem
   2.5  Application: The Expected Run-Time of Quicksort
   2.6  Exercises

3  Moments and Deviations
   3.1  Markov's Inequality
   3.2  Variance and Moments of a Random Variable
        3.2.1  Example: Variance of a Binomial Random Variable
   3.3  Chebyshev's Inequality
        3.3.1  Example: Coupon Collector's Problem
   3.4  Application: A Randomized Algorithm for Computing the Median
        3.4.1  The Algorithm
        3.4.2  Analysis of the Algorithm
   3.5  Exercises

4  Chernoff Bounds
   4.1  Moment Generating Functions
   4.2  Deriving and Applying Chernoff Bounds
        4.2.1  Chernoff Bounds for the Sum of Poisson Trials
        4.2.2  Example: Coin Flips
        4.2.3  Application: Estimating a Parameter
   4.3  Better Bounds for Some Special Cases
   4.4  Application: Set Balancing
   4.5* Application: Packet Routing in Sparse Networks
        4.5.1  Permutation Routing on the Hypercube
        4.5.2  Permutation Routing on the Butterfly
   4.6  Exercises

5  Balls, Bins, and Random Graphs
   5.1  Example: The Birthday Paradox
   5.2  Balls into Bins
        5.2.1  The Balls-and-Bins Model
        5.2.2  Application: Bucket Sort
   5.3  The Poisson Distribution
        5.3.1  Limit of the Binomial Distribution
   5.4  The Poisson Approximation
        5.4.1* Example: Coupon Collector's Problem, Revisited
   5.5  Application: Hashing
        5.5.1  Chain Hashing
        5.5.2  Hashing: Bit Strings
        5.5.3  Bloom Filters
        5.5.4  Breaking Symmetry
   5.6  Random Graphs
        5.6.1  Random Graph Models
        5.6.2  Application: Hamiltonian Cycles in Random Graphs
   5.7  Exercises
   5.8  An Exploratory Assignment

6  The Probabilistic Method
   6.1  The Basic Counting Argument
   6.2  The Expectation Argument
        6.2.1  Application: Finding a Large Cut
        6.2.2  Application: Maximum Satisfiability
   6.3  Derandomization Using Conditional Expectations
   6.4  Sample and Modify
        6.4.1  Application: Independent Sets
        6.4.2  Application: Graphs with Large Girth
   6.5  The Second Moment Method
        6.5.1  Application: Threshold Behavior in Random Graphs
   6.6  The Conditional Expectation Inequality
   6.7  The Lovász Local Lemma
        6.7.1  Application: Edge-Disjoint Paths
        6.7.2  Application: Satisfiability
   6.8* Explicit Constructions Using the Local Lemma
        6.8.1  Application: A Satisfiability Algorithm
   6.9  Lovász Local Lemma: The General Case
   6.10 Exercises

7  Markov Chains and Random Walks
   7.1  Markov Chains: Definitions and Representations
        7.1.1  Application: A Randomized Algorithm for 2-Satisfiability
        7.1.2  Application: A Randomized Algorithm for 3-Satisfiability
   7.2  Classification of States
        7.2.1  Example: The Gambler's Ruin
   7.3  Stationary Distributions
        7.3.1  Example: A Simple Queue
   7.4  Random Walks on Undirected Graphs
        7.4.1  Application: An s-t Connectivity Algorithm
   7.5  Parrondo's Paradox
   7.6  Exercises

8  Continuous Distributions and the Poisson Process
   8.1  Continuous Random Variables
        8.1.1  Probability Distributions in R
        8.1.2  Joint Distributions and Conditional Probability
   8.2  The Uniform Distribution
        8.2.1  Additional Properties of the Uniform Distribution
   8.3  The Exponential Distribution
        8.3.1  Additional Properties of the Exponential Distribution
        8.3.2* Example: Balls and Bins with Feedback
   8.4  The Poisson Process
        8.4.1  Interarrival Distribution
        8.4.2  Combining and Splitting Poisson Processes
        8.4.3  Conditional Arrival Time Distribution
   8.5  Continuous Time Markov Processes
   8.6  Example: Markovian Queues
        8.6.1  M/M/1 Queue in Equilibrium
        8.6.2  M/M/1/K Queue in Equilibrium
        8.6.3  The Number of Customers in an M/M/∞ Queue
   8.7  Exercises

9  Entropy, Randomness, and Information
   9.1  The Entropy Function
   9.2  Entropy and Binomial Coefficients
   9.3  Entropy: A Measure of Randomness
   9.4  Compression
   9.5* Coding: Shannon's Theorem
   9.6  Exercises

10 The Monte Carlo Method
   10.1 The Monte Carlo Method
   10.2 Application: The DNF Counting Problem
        10.2.1 The Naive Approach
        10.2.2 A Fully Polynomial Randomized Scheme for DNF Counting
   10.3 From Approximate Sampling to Approximate Counting
   10.4 The Markov Chain Monte Carlo Method
        10.4.1 The Metropolis Algorithm
   10.5 Exercises
   10.6 An Exploratory Assignment on Minimum Spanning Trees

11* Coupling of Markov Chains
   11.1 Variation Distance and Mixing Time
   11.2 Coupling
        11.2.1 Example: Shuffling Cards
        11.2.2 Example: Random Walks on the Hypercube
        11.2.3 Example: Independent Sets of Fixed Size
   11.3 Application: Variation Distance Is Nonincreasing
   11.4 Geometric Convergence
   11.5 Application: Approximately Sampling Proper Colorings
   11.6 Path Coupling
   11.7 Exercises

12 Martingales
   12.1 Martingales
   12.2 Stopping Times
        12.2.1 Example: A Ballot Theorem
   12.3 Wald's Equation
   12.4 Tail Inequalities for Martingales
   12.5 Applications of the Azuma-Hoeffding Inequality
        12.5.1 General Formalization
        12.5.2 Application: Pattern Matching
        12.5.3 Application: Balls and Bins
        12.5.4 Application: Chromatic Number
   12.6 Exercises

13 Pairwise Independence and Universal Hash Functions
   13.1 Pairwise Independence
        13.1.1 Example: A Construction of Pairwise Independent Bits
        13.1.2 Application: Derandomizing an Algorithm for Large Cuts
        13.1.3 Example: Constructing Pairwise Independent Values Modulo a Prime
   13.2 Chebyshev's Inequality for Pairwise Independent Variables
        13.2.1 Application: Sampling Using Fewer Random Bits
   13.3 Families of Universal Hash Functions
        13.3.1 Example: A 2-Universal Family of Hash Functions
        13.3.2 Example: A Strongly 2-Universal Family of Hash Functions
        13.3.3 Application: Perfect Hashing
   13.4 Application: Finding Heavy Hitters in Data Streams
   13.5 Exercises

14* Balanced Allocations
   14.1 The Power of Two Choices
        14.1.1 The Upper Bound
   14.2 Two Choices: The Lower Bound
   14.3 Applications of the Power of Two Choices
        14.3.1 Hashing
        14.3.2 Dynamic Resource Allocation
   14.4 Exercises

Further Reading
Index

Note: Asterisks indicate advanced material.

Preface

Why Randomness?

Why should computer scientists study and use randomness? Computers appear to behave far too unpredictably as it is! Adding randomness would seemingly be a disadvantage, adding further complications to the already challenging task of efficiently utilizing computers.

Science has learned in the last century to accept randomness as an essential component in modeling and analyzing nature. In physics, for example, Newton's laws led people to believe that the universe was a deterministic place; given a big enough calculator and the appropriate initial conditions, one could determine the location of planets years from now. The development of quantum theory suggests a rather different view; the universe still behaves according to laws, but the backbone of these laws is probabilistic. "God does not play dice with the universe" was Einstein's anecdotal objection to modern quantum mechanics. Nevertheless, the prevailing theory today for subparticle physics is based on random behavior and statistical laws, and randomness plays a significant role in almost every other field of science, ranging from genetics and evolution in biology to modeling price fluctuations in a free-market economy.

Computer science is no exception. From the highly theoretical notion of probabilistic theorem proving to the very practical design of PC Ethernet cards, randomness and probabilistic methods play a key role in modern computer science. The last two decades have witnessed a tremendous growth in the use of probability theory in computing. Increasingly more advanced and sophisticated probabilistic techniques have been developed for use within broader and more challenging computer science applications. In this book, we study the fundamental ways in which randomness comes to bear on computer science: randomized algorithms and the probabilistic analysis of algorithms.

Randomized algorithms: Randomized algorithms are algorithms that make random choices during their execution. In practice, a randomized program would use values generated by a random number generator to decide the next step at several branches of its execution. For example, the protocol implemented in an Ethernet card uses random numbers to decide when it next tries to access the shared Ethernet communication medium. The randomness is useful for breaking symmetry, preventing different cards from repeatedly accessing the medium at the same time. Other commonly used applications of randomized algorithms include Monte Carlo simulations and primality testing in cryptography. In these and many other important applications, randomized algorithms are significantly more efficient than the best known deterministic solutions. Furthermore, in most cases the randomized algorithms are also simpler and easier to program.

These gains come at a price; the answer may have some probability of being incorrect, or the efficiency is guaranteed only with some probability. Although it may seem unusual to design an algorithm that may be incorrect, if the probability of error is sufficiently small then the improvement in speed or memory requirements may well be worthwhile.

Probabilistic analysis of algorithms: Complexity theory tries to classify computation problems according to their computational complexity, in particular distinguishing between easy and hard problems. For example, complexity theory shows that the Traveling Salesman problem is NP-hard. It is therefore very unlikely that there is an algorithm that can solve any instance of the Traveling Salesman problem in time that is subexponential in the number of cities. An embarrassing phenomenon for the classical worst-case complexity theory is that the problems it classifies as hard to compute are often easy to solve in practice. Probabilistic analysis gives a theoretical explanation for this phenomenon. Although these problems may be hard to solve on some set of pathological inputs, on most inputs (in particular, those that occur in real-life applications) the problem is actually easy to solve. More precisely, if we think of the input as being randomly selected according to some probability distribution on the collection of all possible inputs, we are very likely to obtain a problem instance that is easy to solve, and instances that are hard to solve appear with relatively small probability. Probabilistic analysis of algorithms is the method of studying how algorithms perform when the input is taken from a well-defined probabilistic space. As we will see, even NP-hard problems might have algorithms that are extremely efficient on almost all inputs.

The Book

This textbook is designed to accompany one- or two-semester courses for advanced undergraduate or beginning graduate students in computer science and applied mathematics. The study of randomized and probabilistic techniques in most leading universities has moved from being the subject of an advanced graduate seminar meant for theoreticians to being a regular course geared generally to advanced undergraduate and beginning graduate students. There are a number of excellent advanced, research-oriented books on this subject, but there is a clear need for an introductory textbook. We hope that our book satisfies this need.

The textbook has developed from courses on probabilistic methods in computer science taught at Brown (CS 155) and Harvard (CS 223) in recent years. The emphasis in these courses and in this textbook is on the probabilistic techniques and paradigms, not on particular applications. Each chapter of the book is devoted to one such method or technique. Techniques are clarified through examples based on analyzing randomized algorithms or developing probabilistic analysis of algorithms on random inputs. Many of these examples are derived from problems in networking, reflecting a prominent trend in the networking field (and the taste of the authors).

The book contains fourteen chapters. We may view the book as being divided into two parts, where the first part (Chapters 1-7) comprises what we believe is core material. The book assumes only a basic familiarity with probability theory, equivalent to what is covered in a standard course on discrete mathematics for computer scientists. Chapters 1-3 review this elementary probability theory while introducing some interesting applications. Topics covered include random sampling, expectation, Markov's inequality, variance, and Chebyshev's inequality. If the class has sufficient background in probability, then these chapters can be taught quickly. We do not suggest skipping them, however, because they introduce the concepts of randomized algorithms and probabilistic analysis of algorithms and also contain several examples that are used throughout the text.

Chapters 4-7 cover more advanced topics, including Chernoff bounds, balls-and-bins models, the probabilistic method, and Markov chains. The material in these chapters is more challenging than in the initial chapters. Sections that are particularly challenging (and hence that the instructor may want to consider skipping) are marked with an asterisk. The core material in the first seven chapters may constitute the bulk of a quarter- or semester-long course, depending on the pace.

The second part of the book (Chapters 8-14) covers additional advanced material that can be used either to fill out the basic course as necessary or for a more advanced second course. These chapters are largely self-contained, so the instructor can choose the topics best suited to the class. The chapters on continuous probability and entropy are perhaps the most appropriate for incorporating into the basic course. Our introduction to continuous probability (Chapter 8) focuses on uniform and exponential distributions, including examples from queueing theory. Our examination of entropy (Chapter 9) shows how randomness can be measured and how entropy arises naturally in the context of randomness extraction, compression, and coding.

Chapters 10 and 11 cover the Monte Carlo method and coupling, respectively; these chapters are closely related and are best taught together. Chapter 12, on martingales, covers important issues on dealing with dependent random variables, a theme that continues in a different vein in Chapter 13's development of pairwise independence and derandomization. Finally, the chapter on balanced allocations (Chapter 14) covers a topic close to the authors' hearts and ties in nicely with Chapter 5's analysis of balls-and-bins problems.

The order of the subjects, especially in the first part of the book, corresponds to their relative importance in the algorithmic literature. Thus, for example, the study of Chernoff bounds precedes more fundamental probability concepts such as Markov chains. However, instructors may choose to teach the chapters in a different order. A course with more emphasis on general stochastic processes, for example, may teach Markov chains (Chapter 7) immediately after Chapters 1-3, following with the chapter on balls, bins, and random graphs (Chapter 5, omitting the Hamiltonian cycle example). Chapter 6 on the probabilistic method could then be skipped, following instead with continuous probability and the Poisson process (Chapter 8). The material from Chapter 4 on Chernoff bounds, however, is needed for most of the remaining material.

Most of the exercises in the book are theoretical, but we have included some programming exercises, including two more extensive exploratory assignments that require some programming. We have found that occasional programming exercises are often helpful in reinforcing the book's ideas and in adding some variety to the course.

We have decided to restrict the material in this book to methods and techniques based on
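As a small, self-contained illustration of the style of algorithm this preface describes, consider randomized verification of matrix multiplication, one of the applications listed in Chapter 1 of the table of contents. The sketch below is our own simplified illustration, not code from the book: it trades a tiny one-sided error probability for speed, checking whether AB = C using a few O(n^2) matrix-vector products per trial rather than recomputing the product in O(n^3) time.

```python
import random

def freivalds_verify(A, B, C, trials=10):
    """Check whether A @ B == C for n x n matrices, with one-sided error.

    If A @ B == C, every trial accepts, so the test always returns True.
    If A @ B != C, a random 0/1 vector r satisfies (A @ B) r == C r with
    probability at most 1/2, so each trial exposes the discrepancy with
    probability at least 1/2 and the overall error probability after
    `trials` independent repetitions is at most 2 ** -trials.
    """
    n = len(A)
    for _ in range(trials):
        # Pick a uniformly random 0/1 vector r.
        r = [random.randrange(2) for _ in range(n)]
        # Three matrix-vector products: B r, then A (B r), then C r.
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # Witness found: definitely A @ B != C.
    return True  # No witness found: A @ B == C with high probability.
```

For example, with A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], the product is [[19, 22], [43, 50]]; the verifier always accepts that matrix, while a single wrong entry is caught with probability at least 1 - 2^-trials.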

