Bayesian Networks - University Of Wisconsin–Madison

2y ago

25 Views

2 Downloads

1.31 MB

19 Pages

Last View : 1d ago

Last Download : 3m ago

Upload by : Bria Koontz

Report this link

Download PDF

Transcription

IntroductionBayesian Networks (aka Bayes Nets, Belief Nets,Directed Graphical Models) Chapter 14.1, 14.2, and 14.4plus optional paper “Bayesiannetworks without tears” [based on slides by Jerry Zhu and Andrew Moore]12Full Joint Probability DistributionUsing the Full Joint Distribution Making a joint distribution of N variables:1. List all combinations of values (if each variablehas k values, there are kN combinations)2. Assign each combination a probability 3. They should sum to 13Probabilistic models allow us to useprobabilistic inference (e.g., Bayes’s rule) tocompute the probability distribution over a setof unobserved (“hypothesis”) variables givena set of observed variablesFull joint probability distribution table is greatfor inference in an uncertain world, but isterrible to obtain and storeBayesian Networks allow us to represent jointdistributions in manageable chunks using§ Independence, conditional independenceBayesian Networks can do any inferenceOnce you have the joint distribution, you can doanything, e.g. marginalization:P(E) rows matching E P(row)e.g., P(Sunny or Hot) (150 50 40 5)/365Convince yourself this is the same as P(sunny) P(hot) - P(sunny and inyCold60/365RainyCold60/36541

Using the Joint Distribution The Bad NewsYou can also do inference: rows matching Q AND E P(row) P(Q E) rows matching E P(row)P(Hot 365RainyCold60/3655 For N variables, each taking k values, the jointdistribution has kN numbers (and kN – 1 degrees offreedom)It would be nice to use fewer numbers Bayesian Networks to the rescue!§ Provides a decomposed / factorizedrepresentation of the FJPD§ Encodes a collection of conditionalindependence relations6Bayesian Networks 14Full Joint distribution requires a lot of storage spaceExampleRepresent (direct) dependencies graphically§ A: your alarm soundsDirected, acylic graphs (DAGs)Nodes random variables§ J: your neighbor John calls you§ M: your other neighbor Mary calls you§ “CPT” stored at each node quantifies conditionalprobability of node’s r.v. given all its parentsDirected arc from A to B means A “has a directinfluence on” or “causes” B§ Evidence for A increases likelihood of B(deductive influence from causes to effects) § John and Mary do not communicate (they promisedto call you whenever they hear the alarm)What kind of independence do we have?What does the Bayes Net look like?§ Evidence for B increases likelihood of A(abductive influence from effects to causes)Encodes conditional independence assumptions152

Conditional IndependenceConditional Independence Random variables can be dependent, butconditionally independentExample: Your house has an alarm§ Neighbor John will call when he hears the alarm§ Neighbor Mary will call when she hears the alarm § Assume John and Mary don’t talk to each other Is JohnCall independent of MaryCall?§ No – If John called, it is likely the alarm went off,which increases the probability of Mary calling§ P(MaryCall JohnCall) P(MaryCall)16But, if we know the status of the alarm, JohnCallwill not affect whether or not Mary callsP(MaryCall Alarm, JohnCall) P(MaryCall Alarm)We say JohnCall and MaryCall areconditionally independent given AlarmIn general, “A and B are conditionallyindependent given C” means:P(A B, C) P(A C)P(B A, C) P(B C)P(A, B C) P(A C) P(B C)17ExampleExamples§ A: your alarm sounds§ J: your neighbor John calls you§ M: your other neighbor Mary calls you § John and Mary do not communicate (they promisedto call you whenever they hear the alarm)What kind of independence do we have? Conditional independence: P(J,M A) P(J A)P(M A)What does the Bayes Net look like? Our BN: P(A,J,M) P(A) P(J A) P(M A)§ A: yoursoundsChainrule:alarmP(A,J,M) P(A) P(J A) P(M A,J)§ J: your neighbor John calls youOurassumesconditionalindependence,§ M:BNyourother neighborMarycalls youso P(M A,J) P(M A)§ John and Mary do not communicate (they promisedto call you whenever they hear the alarm)What kind of independence do we have? Conditional independence P(J,M A) P(J A)P(M A)What does the Bayes Net look like?AJ18AMJM193

A Simple Bayesian NetworkS Î {no , light , heav y}Smoking0.80P(S no)P(S light) 0.15P(S heavy) 0.05CancerC Î {none , benign , m alignant }noDon’t needto storeA Bayesian Networklight0.96 0.88P(C benign S ) 0.03 0.08P(C malig S ) 0.01 0.04P(C none S )21GenderExposureto ngTumor27Applications 28AgePathfinder Pathfinder was one of the first BN systems It performed diagnosis of lymph-nodeMedical diagnosis systemsManufacturing system diagnosisComputer systems diagnosisNetwork systems diagnosisHelpdesk troubleshootingInformation retrievalCustomer modelingdiseases It dealt with over 60 diseases and 100symptoms and test results 14,000 probabilities Commercialized and applied to about 20tissue types324

PathfinderBayesNet448 nodes,906 arcs3335Conditional Independence in Bayes NetsConditional Independence§ A node is conditionally independent of itsnon-descendants, given its parents§ A node is conditionally independent of all othernodes, given its “Markov blanket” (i.e., its parents,children, and children’s parents)AgeGenderSmokingCancer is conditionallyindependent of Ageand Gender givenSmokingCancer36375

Interpreting Bayesian NetsConditional Independence CancerSerumCalcium2 nodes are (unconditionally) independent ifthere’s no undirected path between themIf there is an undirected path between 2 nodes, thenwhether or not they are independent or dependentdepends on what other evidence is knownSerum Calcium isconditionally independent ofLung Tumor, given CancerLungTumorAA and B are independentgiven nothing else, but aredependent given CCP(L SC, C) P(L C)38B39Example with 5 VariablesCreating a Bayes Net § B: there’s burglary in your house§ E: there’s an earthquake§ A: your alarm soundsStep 1: Add variables. Choose the variables youwant to include in the Bayes NetB§ J: your neighbor John calls you§ M: your other neighbor Mary calls you EAB, E are independentJJ is directly influenced by only A (i.e., J isconditionally independent of B, E, M, given A)MB: there’s burglary in your houseM is directly influenced by only A (i.e., M isconditionally independent of B, E, J, given A)E: there’s an earthquakeA: your alarm soundsJ: your neighbor John calls youM: your other neighbor Mary calls you40416

Creating a Bayes Net Creating a Bayes Net Step 2: Add directed edges The graph must be acyclic If node X is given parents Q1, , Qm, you areThe table at node X must list P(X Parent values) forall combinations of parent valuese.g., you must specifyP(J A) AND P(J A)P(B) 0.001BEsince they don’t haveP(E) 0.002to sum to 1!saying that any variable that’s not a descendant ofX is conditionally independent of X given Q 1, , Q mBStep 3: Add CPTs (Conditional Probability Tables)P(A P(A P(A P(A EB, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJB: there’s burglary in your houseAME: there’s an earthquakeJME: there’s an earthquakeP(J A) 0.9P(J A) 0.05A: your alarm soundsJ: your neighbor John calls youM: your other neighbor Mary calls you42B: there’s burglary in your houseP(M A) 0.7P(M A) 0.01A: your alarm soundsJ: your neighbor John calls youM: your other neighbor Mary calls you43The Bayesian Network Created from aDifferent Variable OrderingCreating a Bayes Net1. Choose a set of relevant variables2. Choose an ordering of them, call them X1, , XN3. for i 1 to N:1. Add node Xi to the graph2. Set parents(Xi) to be the minimal subset of{X1 Xi-1}, such that xi is conditionallyindependent of all other members of {X1 Xi-1}given parents(Xi)3. Define the CPTs forP(Xi assignments of parents(Xi)) Different ordering leads to different graph, in general Best ordering when each variable is considered after allvariables that directly influence it44457

The Bayesian Network Created from aDifferent Variable OrderingCompactness of Bayes Nets A Bayesian Network is a graph structure forrepresenting conditional independence relations ina compact wayA Bayes net encodes the full joint distribution (FJPD),often with far less parameters (i.e., numbers)A full joint table needs kN parameters (N variables, kvalues per variable)§ grows exponentially with NIf the Bayes net is sparse, e.g., each node has at mostM parents (M N), only needs O(NkM) parameters§ grows linearly with N§ can’t have too many parents, though4647Variable Dependencies Directed arc from one variable to another variableA Causal ChainBAA Is A guaranteed to be independent of B?§ No – Information can be transmitted over 1 arc (ineither direction) Example: My knowing the Alarm went off,increases my belief there has been a Burglary,and, similarly, my knowing there has been aBurglary increases my belief the Alarm went off48This local configuration is called a “causal chain:”BBCIs A guaranteed to be independent of C?§ No – Information can be transmitted between A andC through B if B is not observed Example: B à A à MNot knowing Alarm means that my knowing thata Burglary has occurred increases my belief thatMary calls, and similarly, knowing that MaryCalls increases my belief that there has been aBurglary498

Causal Chain This local configuration is called a “causal chain:”AA Common CauseBBThis configuration is called “common cause:”ACIs A independent of C given B? § Yes – Once B is observed, information cannot betransmitted between A and C through B; B“blocks” the information path; “C is conditionallyindependent of A given B” Example: B à A à M CIs it guaranteed that B and C are independent?§ No – Information can be transmitted through A to thechildren of A if A is not observedIs it guaranteed that B and C are independent given A?§ Yes – Observing the cause, A, blocks the influencebetween effects B and C; “B is conditionallyindependent of C given A”Knowing that the Alarm went off means thatalso knowing that a Burglary has taken placewill not increase my belief that Mary Calls50B51Common Effect This configuration is called “common effect:”A Common EffectBACAre A and B independent? Are A and B independent given C?§ No – Information can be transmitted through Camong the parents of C if C is observed Example: B à A ß EIf I already know that the Alarm went off, my further knowingthat there has been an Earthquake, decreases my belief thatthere has been a Burglary. Called “explaining away.”§ Proof: P(a,b) Σc P(a,b,c) by marginalization Σc P(a) P(b a) P(c a,b) by chain rule Σc P(a) P(b) P(c a,b) by cond. indep. P(a) P(b) Σc P(c a,b)52BC§ Yes Example: B à A ß EBurglary and Earthquake cause the Alarm togo off, but they are not correlated P(a) P(b)This configuration is called “common effect:”§ Similarly, if C has descendant D and D is given,then A and B are not independentsince last term 1539

Computing a Joint Entry from a Bayes NetD-SeparationHow to compute an entry in the joint distribution (FJPD)?Determining if two variables in a BayesianNetwork are independent or conditionallyindependent given a set of observed evidencevariables, is determined using “d-separation”E.g., what is P(S, M, L, R, T)?P(S) 0.3P(L M,S) 0.05P(L M, S) 0.1P(L M,S) 0.1P(L M, S) 0.2P(R M) 0.3P(R M) 0.6LP(T L) 0.3P(T L) 0.8R55Computing with Bayes NetP(S) 0.3P(L M,S) 0.05P(L M, S) 0.1P(L M,S) 0.1P(L M, S) 0.2MSVariable OrderingP(M) 0.6P(R M) 0.3P(R M) 0.6LP(T L) 0.3P(T L) 0.8Before applying chain rule, best to reorder all ofthe variables, listing first the leaf nodes, then allthe parents of the leaves, etc. Last variableslisted are those that have no parents, i.e., theroot nodes.RTApply the Chain Rule conditional independence!P(T, R, L, M, S) P(T R, L, M, S) * P( R, L, M, S) P(T L) * P( R, L, M, S) P(T L) * P( R L, M, S) * P(L, M, S) P(T L) * P( R M) * P(L, M, S) P(T L) * P( R M) * P(L M, S) * P( M, S) P(T L) * P( R M) * P(L M, S) * P( M S) * P(S) P(T L) * P( R M) * P(L M, S) * P( M) * P(S)57P(M) 0.6TD-separation is covered in CS 76054MSSo, for previous example,P(S,L,M,T,R) P(T,R,L,S,M)5810

The General CaseComputing Joint Probabilitiesusing a Bayesian Network AP(X1 x1 , X2 x2 , ., Xn-1 xn-1 , Xn xn) P(Xn xn , Xn-1 xn-1 , ., X2 x2 , X1 x1) P(Xn xn Xn-1 xn-1 , ., X2 x2 , X1 x1) * P(Xn-1 xn-1 , , X2 x2 , X1 x1)How is any marginal probability computed?Sum the relevant joint probabilities: P(Xn xn Xn-1 xn-1 , ., X2 x2 , X1 x1) * P(Xn-1 xn-1 X2 x2 , X1 x1) *P(Xn-2 xn-2 , ., X2 x2 , X1 x1) P (( Xn i 1n(i xi )(( Xi 1 xi 1 ) , ,( X1 x1 ))) P ( Xi xi ) Assignments of Parents ( Xi )i 1 A BN can answer any query (i.e., marginal probability)about the domain by marginalization (“summing out”) overthe relevant joint probabilities60Inference by EnumerationWhere Are We Now? 61DCompute: P(c) P(a,b,c,d) P(a, b,c,d) P( a,b,c,d) P( a, b,c,d) P(a,b,c, d) P(a, b,c, d) P( a,b,c, d) P( a, b,c, d))59CCompute: P(a,b) P(a,b,c,d) P(a,b,c, d) P(a,b, c,d) P(a,b, c, d):B joint matching Q AND E P(joint)We defined a Bayes net, using a small numberof parameters, to describe the joint probabilityAny joint probability can be computed asP(x1, , xN) i P(xi parents(xi))The above joint probability can be computed intime linear in the number of nodes, NWith this joint distribution, we can compute anyconditional probability, P(Q E); thus we canperform any inferenceHow?by def. of cond. prob.P(Q E) joint matching E P(joint)P(E) 0.002For example: P(B J, M)1. Compute P(B,J, M)2. Compute P(J, M)P(B) 0.001P(A P(A P(A P(A BB, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.0013. Return P(B,J, M)/P(J, M)JP(J A) 0.9P(J A) 0.05EAMP(M A) 0.7P(M A) 0.016211

Inference byenumerationCompute the joints (4 of them)Inference by enumerationCompute the joints (8 of them)P(B, J, M, A, E) joint matching Q AND E P(joint)P(Q E) joint matching EFor example: P(B J, M)1. Compute P(B,J, M)2. Compute P(J, M)P(B, J, M, A, E)P(B, J, M, A, E)P(joint)P(B, J, M, A, E)P(E) 0.002Each is O(N) for sparse graphP(x1, xN) P(B) 0.001 i P(xi parents(xi))Sum them upP(A P(A P(A P(A BB, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.0013. Return P(B,J, M)/P(J, M)E joint matching Q AND E P(joint)P(J, M,B,A,E)P(Q E) joint matching EFor example: P(B J, M)1. Compute P(B,J, M)A2. Compute P(J, M)JP(J A) 0.9P(J A) 0.05MP(AP(x B,, x E) 0.0011N) i P(xi parents(xi))3. Return P(B,J, M)/P(J, M)P(M A) 0.7P(M A) 0.0163P(J, M,B,A, E)P(J, M,B, A,E)P(joint)P(J, M,B, A, E)P(E) 0.002P(J, M, B,A,E)P(B) 0.001P(J, M, B,A, E)BEP(J, M, B, A,E)P(A B, P(J, M, B, A, E)E) 0.95P(A B, E) 0.94A graphEachisE) 0.29O(N) for sparseP(A B,Sum them upJMP(J A) 0.9P(J A) 0.05P(M A) 0.7P(M A) 0.0164Another ExampleInference by Enumeration joint matching Q AND E P(joint)Sum up 4 jointsP(Q E) joint matching E P(joint)Sum up 8 jointsFor example: P(B J, M)1. Compute P(B,J, M)2. Compute P(J, M)3. Return P(B,J, M)/P(J, M)P(E) 0.002P(B) 0.001BEIn general, if there areBoolean variables, jP(A B,NE) 0.95P(A B,query E) 0.94vars, and kAP(A B, E) 0.29P(A B,evidence E) 0.001 vars, howmany joints to sum up?JMP(J A) 0.9P(J A) 0.0565Compute P(R T, S) from the following Bayes NetP(S) 0.3MSP(M) 0.6P(R M) 0.3P(R M) 0.6P(LP(LP(LP(LP(M A) 0.7P(M A) 0.01 M, S) 0.05M, S) 0.1 M, S) 0.1 M, S) 0.2LP(T L) 0.3P(T L) 0.8RT6612

Another ExampleAnother ExampleStep 1: Compute P(R, T, S)Step 1: Compute P(R, T, S)Step 2: Compute P(T, S)Step 2: Compute P(T, S)Step 3: ReturnStep 3: ReturnP(R, T, S)-----------------P(T, S)P(s) 0.3P(LP(LP(LP(L Sum of all the rows in theJoint that match T SP(R, T, S)-----------------P(T, S)MSM, S) 0.05M, S) 0.1 M, S) 0.1 M, S) 0.2Sum of all the rows in theJoint that match R T SP(M) 0.6P(s) 0.3P(R M) 0.3P(R M) 0.6LP(T L) 0.3P(T L) 0.8P(LP(LP(LP(LRT M, S) 0.05M, S) 0.1 M, S) 0.1 M, S) 0.2MSP(M) 0.6P(R M) 0.3P(R M) 0.6LP(T L) 0.3P(T L) 0.8RTCompute P(R T, S)Compute P(R T, S)6869Another ExampleStep 1: Compute P(R, T, S)Sum of all the rows in theJoint that match R T SStep 2: Compute P(T, S)Step 3: ReturnP(L P(L P(L P(L M, S) 0.05M, S) 0.1 M, S) 0.1 M, S) 0.2S P(T, S)Note: Inference through a Bayes Net can goboth “forward” and “backward” across arcsSSum of all the rows in theJoint that match T SP(R, T, S)------------------------------------P(R, T, S) P( R, T, S)P(s) 0.34 joint computes 8 joint computesMEach of these obtained bythe “computing a jointP(M) 0.6probability entry” method inthe earlierslidesP(R½M) 0.3 P(R M) 0.6LP(T½L) 0.3P(T L) 0.8RTMCausal (top-down) inferenceL§ Given a cause, infer its effectsT§ E.g., P(T S)RDiagnostic (bottom-up) inference§ Given effects/symptoms, infer a cause§ E.g., P(S T)Compute P(R T, S)707113

The Good NewsThe Bad News We can do inference. That is, we can computeany conditional probability:P( Some variables Some other variable values )P( E1 Ù E2 )P( E1 E2 ) P ( E2 ) å P( joint entry)joint entries matching E1 and E2å P( joint entry) joint entries matching E2 “Inference by Enumeration” Algorithm 72In general if there are N variables and evidence contains jvariables, and each variable has k values, how manyjoints to sum up? k(N-j)It is this summation that makes inference byenumeration inefficient Computing conditional probabilities by enumerating allmatching entries in the joint is expensive:Exponential in the number of variablesSome computation can be saved by carefully ordering theterms and re-using intermediate results (variableelimination algorithm)A more complex algorithm called a join tree (junction tree)can save even more computationBut, even so, exact inference with an arbitrary BayesNet is NP-Complete77Compute P(d)Variable Elimination AlgorithmVTGeneral idea: Write query in the formxkx3x2XBDInitial factors:P(v, s, t, l, a, b, x, d) P(v)P(s)P(t v)P(l s)P(b s)P(a t,l)P(x a)P(d a,b)i IterativelyInference by Enumeration (i.e., brute force) approach:§ Move all irrelevant terms outside of innermost sum§ Perform innermost sum, getting a new termP(d) P(v, s,t,l,a,b, x,d)§ Insert the new term into the product79LANeed to eliminate: v, s, x, t, l, a, bP(xn , e ) ! P(xi pai )Sxbaltsv8014

VWe want to compute P(d)Need to eliminate: v, s, x, t, l, a, bSTAInitial factorsParameter (CPT) Learning for BN L§ Ask domain experts, or§ Learn from dataBXWhere do you get these CPT numbers?DP(v)P(s)P(t v)P(l s)P(b s)P(a t,l)P(x a)P(d a,b)Eliminate: vCompute:P(B) 0.001f v (t) P(v)P(t v)P(A P(A P(A P(A v f v (t)P(s)P(l s)P(b s)P(a t,l)P(x a)P(d a,b)BEP(E) 0.002B, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJNote: fv(t) P(t)P(J A) 0.9P(J A) 0.05Idea behind VariableElimination Algorithm81Parameter (CPT) Learning for BN§ Learn from a data set like this:( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)94P(M A) 0.7P(M A) 0.0193Parameter (CPT) Learning for BN M§ Learn from a data set like this:How to learn this CPT?P(B) 0.001P(A P(A P(A P(A BP(E) 0.002B, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJP(J A) 0.9P(J A) 0.05EMP(M A) 0.7P(M A) 0.01( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M) Count #(B) and #( B) in dataset.P(B) #(B) / [#(B) #( B)]P(B) 0.001P(A P(A P(A P(A BP(E) 0.002B, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJP(J A) 0.9P(J A) 0.05EMP(M A) 0.7P(M A) 0.019515

Parameter (CPT) Learning for BNParameter (CPT) Learning for BN§ Learn from a data set like this:( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)§ Learn from a data set like this:Count #(E) and #( E) in dataset.P(E) #(E) / [#(E) #( E)]P(B) 0.001P(A P(A P(A P(A BEP(E) 0.002B, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJMP(J A) 0.9P(J A) 0.05 P(M A) 0.7P(M A) 0.0196( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)P(A P(A P(A P(A BEP(E) 0.002B, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJMP(J A) 0.9P(J A) 0.05P(M A) 0.7P(M A) 0.0197Parameter (CPT) Learning for BN§ Learn from a data set like this:( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)98P(B) 0.001 Parameter (CPT) Learning for BN Count #(A) and #( A) in datasetwhere B true and E true.P(A B,E) #(A) / [#(A) #( A)]§ Learn from a data set like this:Count #(A) and #( A) in datasetwhere B true and E false.P(A B, E) #(A) / [#(A) #( A)]P(B) 0.001P(A P(A P(A P(A BP(E) 0.002B, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJP(J A) 0.9P(J A) 0.05EMP(M A) 0.7P(M A) 0.01( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M) Count #(A) and #( A) in datasetwhere B false and E true.P(A B,E) #(A) / [#(A) #( A)]P(B) 0.001P(A P(A P(A P(A BP(E) 0.002B, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJP(J A) 0.9P(J A) 0.05EMP(M A) 0.7P(M A) 0.019916

Parameter (CPT) Learning for BNParameter (CPT) Learning for BN§ Learn from a data set like this:( B, E, A, J, M)( B, E, A, J, M)( B,p E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B,p E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B,p E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M) p§ ‘Unseen event’ problemCount #(A) and #( A) in datasetwhere B false and E false.P(A B, E) #(A) / [#(A) #( A)]P(B) 0.001P(A P(A P(A P(A BEP(E) 0.002B, E) 0.95B, E) 0.94 B, E) 0.29 B, E) 0.001AJP(J A) 0.9P(J A) 0.05MP(M A) 0.7P(M A) 0.01100( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)( B, E, A, J, M)(B, E, A, J, M)( B, E, A, J, M)Count #(A) and #( A) in datasetwhere B true and E true.P(A B,E) #(A) / [#(A) #( A)]What if there’s no row with(B, E, A, *, *) in the dataset?Do you want to setP(A B,E) 1P( A B,E) 0?Why or why not? 101Smoothing CPTsNaive Bayes Classifier Testing Phase§ “Add-1” smoothing: add 1 to all counts § In the previous example, count #(A) and #( A) indataset where B true and E true§ P(A B,E) [#(A) 1] / [#(A) 1 #( A) 1]For a given test instance defined by X1 v1, , Xn vn,compute𝑎𝑟𝑔𝑚𝑎𝑥( P(Y c) -* , 𝑃(𝑋* 𝑣* Y c)§ If #(A) 1, #( A) 0:§ without smoothing P(A B,E) 1, P( A B,E) 0Class variable§ with smoothing P(A B,E) 0.67, P( A B,E) 0.33§ If #(A) 100, #( A) 0: § without smoothing P(A B,E) 1, P( A B,E) 0§ with smoothing P(A B,E) 0.99, P( A B,E) 0.01§ Smoothing saves you when you don’t have enoughdata, and hides away when you do Evidence variableAssumes all evidence variables are conditionallyindependent of each other given the class variableRobust because it gives the right answer as long as thecorrect class is more likely than all others§ It’s a form of Maximum a posteriori (MAP) estimation10310417

A Special BN: Naïve Bayes ClassifiersBN Special Case: Naïve Bayes A special Bayes Net structure:J§ a ‘class’ variable Y at root, compute P(Y X1, , XN)§ evidence nodes Xi (observed features) are allleavesC§ conditional independence between all evidenceassumed. Usually not valid, but often empiricallyOK ZHJPerson is JuniorCBrought coat to classZLives in zipcode 53706HSaw “Hunger Games 1” morethan onceWhat’s stored in the CPTs?Y X1X2XN105106A Special BN: Naïve Bayes ClassifiersP(J) CP(C J) P(C J) 107JZHJPerson is JuniorCBrought coat to classZLives in zipcode 53706HSaw “Hunger Games 1” morethan onceBayesian Network PropertiesP(Z J) P(H J) P(Z J) P(H J) Bayesian Networks compactly encode jointdistributions Topology of a Bayesian Network is onlyguaranteed to encode conditionalindependencies§ Arcs do not necessarily represent causalrelations12918

What You Should Know Inference with full joint distributionProblems of full joint distributionBayesian Networks: representation (nodes,arcs, CPT) and meaningCompute joint probabilities from Bayes netInference by enumerationNaïve Bayes classifierYou are NOT responsible for the VariableElimination Algorithm13019

1.List all combinations of values (if each variable has kvalues, there are kNcombinations) 2.Assign each combination a probability 3.They should sum to 1 Weather Temperature Prob. Sunny Hot 150/365 Sunny Cold 50/365 Cloudy Hot 40/365 Cloudy Cold 60/365 Rainy Hot 5/365 Rainy Cold 60/365 3 Using the Full Joint Distribution

Related Documents:

Learning Bayesian Networks and Causal Discovery - GitHub Pages

Learning Bayesian Networks and Causal Discovery Reasoning in Bayesian networks The most important type of reasoning in Bayesian networks is updating the probability of a hypothesis (e.g., a diagnosis) given new evidence (e.g., medical findings, test results). Example: What is the probability of Chronic Hepatitis in an alcoholic patient with

10 Views

1y ago

Bayesian Networks - Boston University

Bayesian networks can also be used as inﬂuence diagramsinstead of decision trees. . Bayesian networks do not necessarily imply inﬂuence by Bayesian uentists’methodstoestimatethe . comprehensible theoretical introduction into the method illustrated with various examples. As

16 Views

2y ago

Using Bayesian networks to model watershed management ...

Key words Bayesian networks, water quality modeling, watershed decision support INTRODUCTION Bayesian networks A Bayesian network (BN) is a directed acyclic graph that graphically shows the causal structure of variables in a problem, and uses conditional probability distributions to deﬁne relationships between variables (see Pearl 1988, 1999;

62 Views

3y ago

Probabilistic Representation and Reasoning

Alessandro Panella (CS Dept. - UIC) Probabilistic Representation and Reasoning May 4, 2010 14 / 21. Bayesian Networks Bayesian Networks Bayesian Networks A Bayesian (or belief) Network (BN) is a direct acyclic graph where: nodes P i are r.v.s

18 Views

2y ago

Computational Bayesian Statistics

Computational Bayesian Statistics An Introduction M. Antónia Amaral Turkman Carlos Daniel Paulino Peter Müller. Contents Preface to the English Version viii Preface ix 1 Bayesian Inference 1 1.1 The Classical Paradigm 2 1.2 The Bayesian Paradigm 5 1.3 Bayesian Inference 8 1.3.1 Parametric Inference 8

25 Views

2y ago

Lecture Notes on Bayesian Nonparametrics Peter Orbanz

value of the parameter remains uncertain given a nite number of observations, and Bayesian statistics uses the posterior distribution to express this uncertainty. A nonparametric Bayesian model is a Bayesian model whose parameter space has in nite dimension. To de ne a nonparametric Bayesian model, we have

21 Views

1y ago

Building Bayesian Networks: Elicitation, Evaluation, and Learning

evaluation of performance robustness, i.e., sensitivity, of Bayesian networks, d) the sensi-tivity inequivalent characteristic of Markov equivalent networks, and the appropriateness of using sensitivity for model selection in learning Bayesian networks, e) selective reﬁnement for

10 Views

1y ago

AP District of the Year Awards - College Board

Wisconsin Arrowhead Union High School District Wisconsin School District of eloit‡ Wisconsin D C Everest School District Wisconsin Diocese of Green Bay Wisconsin Green Bay Area Public School District Wisconsin Greendale School District Wisconsin Hamilton School District

38 Views

2y ago

Recent Views

Quotes within Quotes: When Single (') and Double (") Quotes . - SAS

Here the outside double quotes are replaced by a single quote and the apostrophe is replaced by two single quotes. This works because when the parser sees two single (or double) quotes immediately following each other, the parser resolves them into one quote mark after the closing quote has been determined.

1y ago

237 Views

IN THIS ISSUE CAR WASH INSIGHT Recent, Notable M&A Transactions .

9/8/2022 Club Car Wash Sites of Tidal Wave Express Car Wash 8 8/29/2022 Take 5 Car Wash Soft Touch Car Wash, Auto Oasis Car Wash, Clearwater Car Wash and Birdie's Car Wash 5 8/25/2022 WhiteWater Express Geaux Clean Car Wash 7 8/19/2022 ModWash Home Team Car Wash 3 8/18/2022 Splash In ECO Car Wash (Wills Group) Blue Hen Car Wash 2

9m ago

100 Views

Personal insurance - Car & Business insurance King Price Insurance

The king's insurance options 5 Things you need to know 7 The stuff you need to do 14 How to claim 16 Our commitment to you 20 Car insurance 22 Car warranty 37 Shortfall cover 45 Scratch and dent 46 Tyre and rim 48 Motorbike insurance 53 Trailer and caravan insurance 64 Watercraft insurance 68 Home contents insurance 77 Buildings insurance 89

1y ago

673 Views

What These Inspirational Quotes Say

Self Motivation Quotes Success Quotes Teacher Quotes And after reading all of these inspirational quotes you’d like to share which quotation is . -- Brian Tracy "You must constantly ask yourself these questions: Who am I around? What are they doing to me? Wha

2y ago

302 Views

ESSENTIAL PLAN - Discovery

Car insurance only Car and home insurance Car insurance only Car and home insurance 12.5% 25% 5% 10% YOUR FUEL CASH BACK PERCENTAGE GET TO THE HIGHEST CASH BACK PERCENTAGE Add at least R250 000 of home insurance (household contents, buildings or both) Take your car to Tiger Wheel & Tyre and pass the Annual MultiPoint check

1y ago

269 Views

CAR INSURANCE EVERYTHING EXPLAINED - RSA Insurance Group

CAR INSURANCE 93013821.indd 1 15/03/2018 10:46. 2 WELCOME TO µ CAR INSURANCE Thank you for choosing µ to protect you and your car. This booklet is intended to help you check your cover and to reassure you that µ will give you the protection you need for the year ahead. First of all, to help you understand your car insurance policy we want to .

1y ago

274 Views

Describe types and purposes of insurance.

D.O. CAPS Consumer Skills: Insurance—10E 3 Your car - The car you drive can also affect your insurance rates. Insurance companies place certain kinds of cars in special risk categories. You should ask your insurance agent before making a car purchase to make sure you aren't getting a car that will cost you extra for your liability insurance.

1y ago

233 Views

Money Online Price Comparison - WordPress

you to compare car insurance quotes. You'll notice at the top of the screen is a warning regarding telling the truth when completing any form of car insurance quote as something withheld, which later becomes known, can void an insurance claim. 7 The process of completing a car insurance price comparison is broken down into 4

1y ago

174 Views

Quotations - Free Website Builder: Create free websites

cards, but sometimes, playing a poor hand well." . 50th Birthday Quotes 60th Birthday Quotes And there are more. Funny Birthday Quotes Cute Birthday Quotes . it a try, itʼs free. Triumph over failure can be a

2y ago

267 Views

The Top 100 Motivational & Inspirational Quotes for 2015

I've spent hours crawling through the web trying to find the best quotes to keep me motivated and inspired all throughout the New Year. I've saved hundreds of quotes on my laptop and figured that words alone could motivate and inspire me. but if I couple the quotes

2y ago

329 Views

Inspirational Quotes - Guideposts

Inspirational Quotes Inspiring quotes are like vitamins for the soul. From the heartfelt to the humorous, the words of wisdom you’ll find here will strengthen your faith, lift your spirits, and even spark a positive change in your life. This collection of some our favorite inspirational quotes from religious figures, world leaders, authors,

2y ago

553 Views

Contours Options Infant Car Seat Adapter Instruction Sheet

your Infant Car Seat, as described in the instruction manual provided by the Infant Car Seat manufacturer. † WHEN USING ONLY ONE INFANT CAR SEAT ADAPTER OR TWO FOR TWINS, THE FOLLOWING INFANT CAR SEATS CAN BE USED: † If your Infant Car Seat is not one of the models listed above, DO NOT use your infant car seat with this car seat adapter.

2y ago

564 Views

Microsoft Advertising Travel Update

last minute cruise deals -58.50% Car Rental Queries WoW Change car rental -43.80% rental cars -46.30% car rentals -40.60% cheap car rentals -48.00% car rentals cheapest rates -52.20% rent a car- 40.30% cheap rental cars -45.60% rental car -41.80% car rental deals -49.30% rental cars lowest price -53.90% Flight Queries WoW Change cheap flights .

1y ago

337 Views

Design and development of lift for an automatic car parking system

1. Stacker type car parking system 2. Puzzle type car parking system 3. Level type car parking system 4. Chess type car parking system 5. Rotary type car parking system 6. Tower type car parking system But lift is used only in tower type car parking system. Objectives:-

6m ago

172 Views

Gold Tier - MAPFRE Insurance

Foy Insurance of MA, LLC 198 Frank Consolati Insurance Agency, Inc. 198 County Insurance Agency, Inc. 198 Woodrow W Cross Agency 214 Woodland Insurance Agency, Inc. 214 Tegeler Insurance Services of CT, Inc. 214 Pantano/VonKahle Insurance Agency, Inc. 214 . Hanson Insurance Agency, Inc. 287 J.H. Slattery Insurance Agency, Inc. 287

1y ago

565 Views

Bayesian Networks - University Of Wisconsin–Madison

It looks like you're using an ad-blocker