Variational Mean Field for Graphical Models


Variational Mean Field for Graphical Models
CS/CNS/EE 155
Baback Moghaddam, Machine Learning Group
baback@jpl.nasa.gov

Approximate Inference
- Consider general UGs (i.e., not tree-structured)
- All basic computations are intractable (for large G):
  - likelihoods & partition function
  - marginals & conditionals
  - finding modes

Taxonomy of Inference
[diagram: inference methods split into Stochastic (Gibbs, M-H, (MC)MC, SA) and Deterministic (Cluster MP, LBP, EP, Variational)]

Approximate Inference
- Stochastic (Sampling)
  - Metropolis-Hastings, Gibbs, (Markov Chain) Monte Carlo, etc.
  - Computationally expensive, but "exact" (in the limit)
- Deterministic (Optimization)
  - Mean Field (MF), Loopy Belief Propagation (LBP)
  - Variational Bayes (VB), Expectation Propagation (EP)
  - Computationally cheaper, but not exact (gives bounds)

Mean Field : Overview
- General idea
  - approximate p(x) by a simpler factored distribution q(x)
  - minimize a "distance" D(p || q), e.g., the Kullback-Leibler divergence
- (Naïve) MF: the original G is replaced by a fully disconnected H_0, so q(x) = Π_i q_i(x_i)
- Structured MF: a tractable substructure H_s is kept, e.g., q(x) = q_A(x_A) q_B(x_B)

Mean Field : Overview
- Naïve MF has roots in Statistical Mechanics (1890s)
  - physics of spin glasses (Ising), ferromagnetism, etc.
  - why is it called "Mean Field"? with full factorization: E[x_i x_j] = E[x_i] E[x_j]
- Structured MF is more "modern"
  - [figure: a Coupled HMM and its structured MF approximation with tractable chains]

KL Projection D(Q || P)
- Infer hidden h given visible v (clamp the v nodes with δ's)
- Variational: optimize KL globally
  - the right density form for Q "falls out"
  - this KL is easier since we're taking E[.] wrt the simpler Q
- Q seeks the mode with the largest mass (not height), so it will tend to underestimate the support of P
  - P → 0 forces Q → 0

KL Projection D(P || Q)
- Infer hidden h given visible v (clamp the v nodes with δ's)
- Expectation Propagation (EP): optimize KL locally
  - this KL is harder since we're taking E[.] wrt P
  - no nice global solution for Q "falls out"
  - must sequentially tweak each q_c (match moments)
- Q covers all modes, so it overestimates the support of P
  - P > 0 forces Q > 0

α-divergences
- The 2 basic KL divergences are special cases of the α-divergence D_α(p || q)
- D_α(p || q) is non-negative, and 0 iff p = q
  - when α → -1 we get KL(P || Q)
  - when α → +1 we get KL(Q || P)
  - when α = 0, D_0(P || Q) is proportional to Hellinger's distance (a metric)
- So many variational approximations must exist, one for each α!
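
As a concrete check of these limits, here is a minimal numerical sketch (mine, not from the slides) of Amari's α-divergence for discrete distributions, using the convention of this slide (α → -1 gives KL(P||Q), α → +1 gives KL(Q||P)); function and variable names are assumptions.

```python
import numpy as np

def alpha_div(p, q, alpha):
    """Amari alpha-divergence (this slide's convention) for discrete p, q."""
    if np.isclose(alpha, 1.0):           # limit: KL(Q || P)
        return np.sum(q * np.log(q / p))
    if np.isclose(alpha, -1.0):          # limit: KL(P || Q)
        return np.sum(p * np.log(p / q))
    c = 4.0 / (1.0 - alpha ** 2)
    return c * (1.0 - np.sum(p ** ((1 - alpha) / 2) * q ** ((1 + alpha) / 2)))

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.4, 0.4, 0.2])

# The two KLs emerge as limits of one continuous family:
print(alpha_div(p, q, -0.999), np.sum(p * np.log(p / q)))   # ~ KL(P || Q)
print(alpha_div(p, q,  0.999), np.sum(q * np.log(q / p)))   # ~ KL(Q || P)
print(alpha_div(p, q,  0.0))   # proportional to squared Hellinger distance
```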

for more on α-divergences, see the work of Shun-ichi Amari

for specific examples of α = ±1, see Chapter 10:
- Variational Single Gaussian
- Variational Linear Regression
- Variational Mixture of Gaussians
- Variational Logistic Regression
- Expectation Propagation (α = -1)

Hierarchy of Algorithms (based on α and structuring), by Tom Minka:
- Power EP: exp family, D_α(p || q)
- EP: exp family, KL(p || q)
- Structured MF: exp family, KL(q || p)
- FBP: fully factorized, D_α(p || q)
- BP: fully factorized, KL(p || q)
- MF: fully factorized, KL(q || p)
- TRW: fully factorized, D_α(p || q) with α > 1

Variational MF
- p(x) = (1/Z) Π_c γ_c(x_c) = (1/Z) e^ψ(x), where ψ(x) = Σ_c log γ_c(x_c)
- log Z = log ∫ e^ψ(x) dx = log ∫ Q(x) [ e^ψ(x) / Q(x) ] dx = log E_Q[ e^ψ(x) / Q(x) ]
- by Jensen's inequality: log E_Q[ e^ψ(x) / Q(x) ] ≥ E_Q[ log( e^ψ(x) / Q(x) ) ]
- hence log Z ≥ sup_Q { E_Q[ψ(x)] + H[Q(x)] }

Variational MF
- log Z = sup_Q { E_Q[ψ(x)] + H[Q(x)] }    (when all Q are admissible)
- Equality is obtained for Q(x) = P(x)
- Using any other Q yields a lower bound on log Z
- The slack in this bound is the KL-divergence D(Q || P)
- Goal: restrict Q to a tractable subclass Q, then optimize with sup_Q to tighten this bound
- note we're (also) maximizing the entropy H[Q]
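
A small sketch (mine; a 3-node binary chain, small enough to enumerate) confirming both claims on this slide: any fully factored Q gives a lower bound on log Z, and the slack is exactly D(Q || P). The toy potentials are arbitrary assumptions.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n = 3
theta_s = rng.normal(size=n)            # singleton potentials (toy values)
theta_st = rng.normal(size=n - 1)       # chain edge potentials (toy values)

def psi(x):                             # psi(x) = sum of log clique potentials
    return sum(theta_s[i] * x[i] for i in range(n)) + \
           sum(theta_st[i] * x[i] * x[i + 1] for i in range(n - 1))

states = list(product([0, 1], repeat=n))
logZ = np.log(sum(np.exp(psi(x)) for x in states))   # exact, by enumeration

mu = rng.uniform(0.2, 0.8, size=n)      # an arbitrary fully factored Q
def Q(x):
    return np.prod([mu[i] if x[i] else 1 - mu[i] for i in range(n)])

E_psi = sum(Q(x) * psi(x) for x in states)
H_Q = -sum(Q(x) * np.log(Q(x)) for x in states)
bound = E_psi + H_Q                     # the Jensen lower bound

kl = sum(Q(x) * (np.log(Q(x)) - (psi(x) - logZ)) for x in states)
print(logZ, bound, logZ - bound, kl)    # slack equals KL(Q || P)
```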

Variational MF
- log Z = sup_Q { E_Q[ψ(x)] + H[Q(x)] }
- Most common specialized family: "log-linear models"
  - ψ(x) = Σ_c θ_c φ_c(x_c) = θ^T φ(x), linear in the parameters θ
  - θ : natural parameters of EFs
  - φ(x) : sufficient statistics of EFs (clique potentials)
- Fertile ground for plowing: Convex Analysis

Convex Analysis
[slide showing two reference texts, dubbed "The Old Testament" and "The New Testament"]

Variational MF for EF
- log Z = sup_Q { E_Q[ψ(x)] + H[Q(x)] } = sup_Q { E_Q[θ^T φ(x)] + H[Q(x)] } = sup_Q { θ^T E_Q[φ(x)] + H[Q(x)] }
- in EF notation: A(θ) = sup_{µ ∈ M} { θ^T µ - A*(µ) }
- M = the set of all moment parameters µ = E_Q[φ(x)] realizable under the subclass Q

Variational MF for EF
- So it looks like we are just optimizing a concave function (a linear term plus the entropy -A*(µ)) over a convex set. Yet it is hard. Why?
- 1. graph probability (being a measure) requires a very large number of marginalization constraints for consistency (leading to a typically beastly marginal polytope M in the discrete case)
  - e.g., a complete 7-node graph's polytope has over 10^8 facets!
  - in fact, optimizing just the linear term alone can be hard
- 2. exact computation of the entropy -A*(µ) is highly non-trivial (hence the famed Bethe & Kikuchi approximations)

Gibbs Sampling for Ising
- Binary MRF G = (V,E) with pairwise clique potentials
- 1. pick a node s at random
- 2. sample u ~ Uniform(0,1)
- 3. update node s: set x_s = 1 if u < σ(θ_s + Σ_{t ∈ N(s)} θ_st x_t), else x_s = 0
- 4. goto step 1
- a slower stochastic version of ICM
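
The update equation on the original slide did not survive transcription; step 3 above is the standard Gibbs conditional for the 0/1 parameterization p(x) ∝ exp(Σ_s θ_s x_s + Σ_(s,t) θ_st x_s x_t). Below is a minimal sketch of the sampler under that assumption; function and variable names are mine.

```python
import numpy as np

def gibbs_ising(theta_s, edges, theta_st, n_sweeps=100, rng=None):
    """Gibbs sampling for a binary MRF with pairwise potentials:
    p(x) ∝ exp( sum_s theta_s[s] x_s + sum_(s,t) theta_st x_s x_t ), x_s in {0,1}."""
    rng = rng or np.random.default_rng(0)
    n = len(theta_s)
    nbrs = {s: [] for s in range(n)}
    for (s, t), w in zip(edges, theta_st):
        nbrs[s].append((t, w)); nbrs[t].append((s, w))
    x = rng.integers(0, 2, size=n)                 # random initial configuration
    for _ in range(n_sweeps * n):
        s = rng.integers(n)                        # 1. pick a node s at random
        u = rng.uniform()                          # 2. sample u ~ Uniform(0,1)
        field = theta_s[s] + sum(w * x[t] for t, w in nbrs[s])
        x[s] = int(u < 1.0 / (1.0 + np.exp(-field)))   # 3. update node s
    return x
```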

Naive MF for Ising
- use a variational mean parameter µ_s at each site
- 1. pick a node s at random
- 2. update its parameter: µ_s ← σ(θ_s + Σ_{t ∈ N(s)} θ_st µ_t)
- 3. goto step 1
- a deterministic "loopy" message-passing scheme
- how well does it work? depends on θ
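
For comparison with the Gibbs sketch above, here is the corresponding deterministic loop (again a sketch with my names, under the same 0/1 parameterization): each site is updated by the same sigmoid rule, with neighbors' samples replaced by their mean parameters.

```python
import numpy as np

def naive_mf_ising(theta_s, edges, theta_st, n_sweeps=100):
    """Naive mean field for the binary MRF above:
    iterate mu_s <- sigmoid( theta_s + sum_t theta_st * mu_t ) to a fixed point."""
    n = len(theta_s)
    nbrs = {s: [] for s in range(n)}
    for (s, t), w in zip(edges, theta_st):
        nbrs[s].append((t, w)); nbrs[t].append((s, w))
    mu = np.full(n, 0.5)                           # mean parameter at each site
    for _ in range(n_sweeps):
        for s in range(n):                         # deterministic "message-passing"
            field = theta_s[s] + sum(w * mu[t] for t, w in nbrs[s])
            mu[s] = 1.0 / (1.0 + np.exp(-field))
    return mu
```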

Graphical Models as EF
- G = (V,E) with binary nodes (Ising)
- sufficient stats: the clique potentials x_s (s ∈ V) and x_s x_t ((s,t) ∈ E)
- probability: p(x | θ) = exp{ Σ_s θ_s x_s + Σ_{(s,t)} θ_st x_s x_t - A(θ) }
- log-partition: A(θ) = log Σ_x exp{ Σ_s θ_s x_s + Σ_{(s,t)} θ_st x_s x_t }
- mean parameters: µ_s = E[x_s] = P(x_s = 1), and likewise µ_st = E[x_s x_t] = P(x_s = 1, x_t = 1) for θ_st

Variational Theorem for EF
- For any mean parameter µ in the relative interior of M: A*(µ) = -H(p_{θ(µ)}), where θ(µ) is the corresponding natural parameter
- For any µ not in the closure of M: A*(µ) = +∞
- The log-partition function has this variational representation: A(θ) = sup_{µ ∈ M} { θ^T µ - A*(µ) }
- This supremum is achieved at the moment-matching value of µ, i.e., µ = E_θ[φ(x)]

Legendre-Fenchel Duality
- Main idea: (convex) functions can be "supported" (lower-bounded) by a continuum of lines (hyperplanes) whose intercepts create a conjugate dual of the original function (and vice versa)
- conjugate dual of A: A*(µ) = sup_θ { µ^T θ - A(θ) }
- conjugate dual of A*: A**(θ) = sup_µ { θ^T µ - A*(µ) }
- Note that A** = A (iff A is convex)

Dual Map for EF
- Two equivalent parameterizations of the EF: natural parameters θ ∈ Ω and mean parameters µ ∈ M
- Bijective mapping between Ω and the interior of M
- The mapping is defined by the gradients of A and its dual A*: ∇A maps Ω to int(M), and ∇A* is its inverse
- Shape & complexity of M depend on X and on the size and structure of G

Marginal Polytope
- G = (V,E) graph with discrete nodes
- Then M = convex hull of all φ(x)
- equivalent to intersecting half-spaces a^T µ ≥ b
- difficult to characterize for large G, hence difficult to optimize over
- interior of M is 1-to-1 with Ω

The Simplest Graph
- G = (V,E) = a single Bernoulli node x, with φ(x) = x
- density: p(x | θ) = exp{ θ x - A(θ) }
- log-partition: A(θ) = log(1 + e^θ)   (of course we knew this)
- we know A* too, but let's solve for it variationally: A*(µ) = sup_θ { µ θ - A(θ) }
  - differentiate: µ - e^θ / (1 + e^θ) = 0
  - rearrange to get the stationary point: θ = log( µ / (1 - µ) )
  - substitute into A*: A*(µ) = µ log µ + (1 - µ) log(1 - µ)
- Note: we found both the mean parameter and the lower bound using the variational method
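
The same derivation can be checked numerically. A brute-force sketch (mine) of the conjugate dual for the single Bernoulli node: it confirms A*(µ) = µ log µ + (1-µ) log(1-µ), and then recovers A(θ) from the dual direction A(θ) = sup_µ { θµ - A*(µ) }.

```python
import numpy as np

A = lambda th: np.log1p(np.exp(th))          # log-partition of one Bernoulli node
thetas = np.linspace(-30, 30, 200001)        # brute-force the sup over theta

def A_star(mu):                              # A*(mu) = sup_theta { mu*theta - A(theta) }
    return np.max(mu * thetas - A(thetas))

mu = 0.3
closed_form = mu * np.log(mu) + (1 - mu) * np.log(1 - mu)   # negative entropy
print(A_star(mu), closed_form)               # the two should agree

# Reverse direction: A(theta) = sup_mu { theta*mu - A*(mu) }
theta = 0.7
mus = np.linspace(1e-6, 1 - 1e-6, 200001)
neg_ent = mus * np.log(mus) + (1 - mus) * np.log(1 - mus)
print(np.max(theta * mus - neg_ent), A(theta))   # recover A(theta)
```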

The 2nd Simplest Graph
- G = (V,E) = 2 connected Bernoulli nodes x_1, x_2, with φ(x) = (x_1, x_2, x_1 x_2)
- moments: µ_1, µ_2, µ_12
- variational problem: A(θ) = sup_{µ ∈ M} { θ_1 µ_1 + θ_2 µ_2 + θ_12 µ_12 - A*(µ) }
- solve (it's still easy!)
- moment constraints: µ_12 ≥ 0, µ_12 ≤ µ_1, µ_12 ≤ µ_2, 1 - µ_1 - µ_2 + µ_12 ≥ 0

The 3rd Simplest Graph
- 3 nodes (x_1, x_2, x_3): 16 constraints
- the # of constraints blows up real fast: 7 nodes → over 200,000,000 constraints
- hard to keep track of valid µ's (i.e., the full shape and extent of M)
- no more checking our results against closed-form expressions that we already knew in advance!
- unless G remains a tree, the entropy -A*(µ) will not decompose nicely, etc.

Variational MF for Ising
- tractable subgraph H = (V, ∅) (no edges)
- fully-factored distribution: Q(x) = Π_s µ_s^{x_s} (1 - µ_s)^{1 - x_s}
- moment space: M_tr = { µ | 0 ≤ µ_s ≤ 1 for s ∈ V, µ_st = µ_s µ_t for (s,t) ∈ E }
- entropy is additive: H[Q] = -Σ_s [ µ_s log µ_s + (1 - µ_s) log(1 - µ_s) ]
- variational problem for A(θ), using coordinate ascent: µ_s ← σ(θ_s + Σ_{t ∈ N(s)} θ_st µ_t)
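
Putting the pieces together, here is a sketch (mine; it reuses naive_mf_ising from the earlier sketch) that evaluates the MF objective θ^T µ + H[Q] at the coordinate-ascent fixed point and compares it with the exact log-partition function on a small cycle, illustrating the lower bound discussed on the next slide. The toy model is an assumption.

```python
import numpy as np
from itertools import product

# Tiny loopy Ising model: a 4-cycle (small enough to enumerate exactly).
theta_s = np.array([0.5, -0.3, 0.2, 0.1])
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
theta_st = [0.8, 0.8, 0.8, 0.8]

def psi(x):
    return theta_s @ x + sum(w * x[s] * x[t] for (s, t), w in zip(edges, theta_st))

logZ = np.log(sum(np.exp(psi(np.array(x))) for x in product([0, 1], repeat=4)))

mu = naive_mf_ising(theta_s, edges, theta_st)    # fixed point from the earlier sketch

# MF objective: theta^T mu (with mu_st = mu_s mu_t) plus the additive entropy
energy = theta_s @ mu + sum(w * mu[s] * mu[t] for (s, t), w in zip(edges, theta_st))
entropy = -np.sum(mu * np.log(mu) + (1 - mu) * np.log(1 - mu))
print(energy + entropy, logZ)                    # MF bound <= exact log Z
```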

Variational MF for Ising
- M_tr is a non-convex inner approximation: M_tr ⊂ M
- what causes this funky curvature? the product constraint µ_st = µ_s µ_t is nonlinear in µ
- optimizing over M_tr must then yield a lower bound

Factorization with Trees
- suppose we have a tree G = (V,T)
- useful factorization for trees: p(x) = Π_s µ_s(x_s) Π_{(s,t) ∈ T} [ µ_st(x_s, x_t) / ( µ_s(x_s) µ_t(x_t) ) ]
- the entropy becomes H = Σ_s H_s(µ_s) - Σ_{(s,t) ∈ T} I_st(µ_st)
  - singleton terms: node entropies H_s
  - pairwise terms: edge-wise Mutual Information I_st
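
A quick sketch (mine; toy potentials are assumptions) verifying this decomposition on a 3-node chain, which is a tree: the exact entropy equals the sum of node entropies minus the sum of edge-wise mutual informations.

```python
import numpy as np
from itertools import product

theta_s = np.array([0.2, -0.5, 0.4])
edges = [(0, 1), (1, 2)]
theta_st = [0.7, -0.9]

states = [np.array(x) for x in product([0, 1], repeat=3)]
p = np.array([np.exp(theta_s @ x + sum(w * x[s] * x[t]
              for (s, t), w in zip(edges, theta_st))) for x in states])
p /= p.sum()                                    # exact joint by enumeration

H_exact = -np.sum(p * np.log(p))

def marg(nodes):                                # exact marginal table over a node subset
    tbl = {}
    for prob, x in zip(p, states):
        key = tuple(x[list(nodes)])
        tbl[key] = tbl.get(key, 0.0) + prob
    return tbl

H = lambda tbl: -sum(v * np.log(v) for v in tbl.values() if v > 0)
singletons = sum(H(marg([s])) for s in range(3))
mutual = sum(H(marg([s])) + H(marg([t])) - H(marg([s, t])) for s, t in edges)
print(H_exact, singletons - mutual)             # the two should agree on a tree
```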

Variational MF for Loopy Graphs
- pretend the entropy factorizes like a tree (Bethe approximation)
- define pseudo-marginals τ_s and τ_st
- must impose these normalization and marginalization constraints: Σ_{x_s} τ_s(x_s) = 1 and Σ_{x_t} τ_st(x_s, x_t) = τ_s(x_s)
- define the local polytope L(G) obeying these constraints
- note that M(G) ⊆ L(G) for any G, with equality only for trees: M(G) = L(G)

Variational MF for Loopy Graphs
- L(G) is an outer polyhedral approximation
- solving this Bethe Variational Problem we get the LBP equations!
- so fixed points of LBP are the stationary points of the BVP
- this not only illuminates what was originally an educated "hack" (LBP), but suggests new convergence conditions and improved algorithms (TRW)
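
To make the connection concrete, here is a compact sketch (mine, not the lecture's code) of the sum-product LBP equations for the binary pairwise MRF used throughout, with synchronous updates and no damping; its fixed-point beliefs are pseudo-marginals τ_s in L(G), i.e., stationary points of the BVP.

```python
import numpy as np

def loopy_bp(theta_s, edges, theta_st, n_iters=50):
    """Sum-product loopy BP for a binary pairwise MRF (0/1 parameterization)."""
    n = len(theta_s)
    node_pot = np.stack([np.ones(n), np.exp(theta_s)], axis=1)   # psi_s(0), psi_s(1)
    pair_pot = {e: np.array([[1.0, 1.0], [1.0, np.exp(w)]])      # psi_st[x_s, x_t]
                for e, w in zip(edges, theta_st)}
    msgs = {}
    nbrs = {u: [] for u in range(n)}
    for s, t in edges:
        msgs[(s, t)] = np.ones(2); msgs[(t, s)] = np.ones(2)
        nbrs[s].append(t); nbrs[t].append(s)
    for _ in range(n_iters):
        new = {}
        for (t, s) in msgs:                       # message from t to s
            pot = pair_pot[(t, s)] if (t, s) in pair_pot else pair_pot[(s, t)].T
            inc = node_pot[t] * np.prod([msgs[(u, t)] for u in nbrs[t] if u != s], axis=0)
            m = pot.T @ inc                       # sum over x_t
            new[(t, s)] = m / m.sum()             # normalize for stability
        msgs = new
    beliefs = np.empty((n, 2))
    for s in range(n):
        b = node_pot[s] * np.prod([msgs[(t, s)] for t in nbrs[s]], axis=0)
        beliefs[s] = b / b.sum()
    return beliefs                                # pseudo-marginals tau_s(x_s)
```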

see ICML’2008 Tutorial

Summary
- SMF can also be cast in terms of "Free Energy", etc.
- Tightening the variational bound = minimizing the KL divergence
- Other schemes (e.g., "Variational Bayes") = SMF with additional conditioning (hidden, visible, parameter)
- Solving the variational problem gives both µ and A(θ)
- It helps to see problems through the lens of Variational Analysis

Matrix of Inference Methods (by Kevin Murphy)
Rows = variable type; columns = graph structure. Methods are tagged exact, deterministic approximation (det.), or stochastic approximation (stoch.).
- Discrete / Chain (online): BP = forwards (exact); Boyen-Koller (ADF), beam search (det.)
- Discrete / Low treewidth: VarElim, Jtree, recursive conditioning (exact)
- Discrete / High treewidth: Loopy BP, mean field, structured variational, EP, graph-cuts (det.); Gibbs (stoch.)
- Gaussian / Chain (online): BP = Kalman filter (exact)
- Gaussian / Low treewidth: Jtree = sparse linear algebra (exact)
- Gaussian / High treewidth: Loopy BP (det.); Gibbs (stoch.)
- Other / Chain (online): EKF, UKF, moment matching (ADF) (det.); Particle filter (stoch.)
- Other / Low treewidth: EP, EM, VB (det.); NBP, Gibbs (stoch.)
- Other / High treewidth: EP, variational EM, VB (det.); NBP, Gibbs (stoch.)
Abbreviations: BP = Belief Propagation, EP = Expectation Propagation, ADF = Assumed Density Filtering, EKF = Extended Kalman Filter, UKF = Unscented Kalman Filter, VarElim = Variable Elimination, Jtree = Junction Tree, EM = Expectation Maximization, VB = Variational Bayes, NBP = Non-parametric BP.

THE END

