10-708: Probabilistic Graphical Models, Spring 2015

28 : Approximate Inference - Distributed MCMC

Lecturer: Avinava Dubey          Scribes: Hakim Sidahmed, Aman Gupta

1 Introduction

For many interesting problems, drawing samples from a complicated probability distribution is difficult, and approximate methods have to be used. A popular class of sampling algorithms is the "Markov Chain Monte Carlo" (MCMC) methods. These methods use a proposal distribution $Q(x'|x)$ to sample from the true distribution $P(x)$ using a Markov chain. If certain conditions are met, samples from the chain can be used as surrogate samples from the true distribution. Gibbs sampling is a particular type of MCMC algorithm in which each random variable $x_i$ is sampled given all the other variables. In this note, we briefly review MCMC methods. We then describe strategies for scaling these methods to large datasets and models.

2 Review of MCMC methods [6]

We briefly review MCMC methods in this section. Monte Carlo methods are used to draw samples from a probability density function that is hard to compute. This situation can occur in Markov Random Fields (MRFs), where the normalization constant can be too hard to compute, or in high-dimensional models. MCMC methods are used when it is impossible to sample directly from the original distribution. They use a proposal distribution $Q(x'|x)$ which proposes which point to sample next.

2.1 The Metropolis-Hastings algorithm

The Metropolis-Hastings algorithm is a popular MCMC method and is a generalization of the Metropolis algorithm. The algorithm is described in Algorithm 1. In Metropolis-Hastings, the proposal distribution is conditioned on the present state. A common choice of family for the proposal distribution is the isotropic Gaussian. The choice of the variance of this distribution is critical for the performance of the sampler: a smaller variance will increase the number of accepted samples, but will reduce the speed at which the space is explored. Thus, choosing a proposal distribution forces us to trade off exploration speed against acceptance rate, and requires care.

Algorithm 1 The Metropolis-Hastings Algorithm [6]
1: Input: Conditional proposal $Q(x'|x)$, target $P(x)$
2: Output: Sampled set of N correlated samples drawn from $P(x)$
3: Initialize state $x^{(0)}$
4: Initialize sampled set as Null
5: for $t = 1 \ldots N$ do
6:   Sample proposal $Q(x'|x^{(t)})$ to obtain candidate $x'$
7:   Accept this state with probability $\alpha = \min\left(1, \frac{P(x')\,Q(x^{(t)}|x')}{P(x^{(t)})\,Q(x'|x^{(t)})}\right)$
8:   if the state is accepted then
9:     Set $x^{(t+1)} = x'$
10:  else
11:    $x^{(t+1)} = x^{(t)}$
12:  end if
13:  Add $x^{(t+1)}$ to the sampled set
14: end for
15: return the sampled set
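
As a concrete illustration (an editor-added sketch, not part of the original notes), the following minimal Python program implements random-walk Metropolis-Hastings with an isotropic Gaussian proposal. The target log_p (a toy one-dimensional Gaussian mixture) and the proposal scale sigma are placeholder choices.

import numpy as np

def metropolis_hastings(log_p, x0, sigma=1.0, n_samples=10000, seed=0):
    """Random-walk Metropolis-Hastings with an isotropic Gaussian proposal.
    Because this proposal is symmetric, Q(x'|x) = Q(x|x') and the acceptance
    ratio alpha reduces to P(x') / P(x^(t))."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n_samples):
        x_prop = x + sigma * rng.standard_normal(x.shape)  # candidate x'
        # accept with probability min(1, P(x')/P(x)), computed in log space
        if np.log(rng.uniform()) < log_p(x_prop) - log_p(x):
            x = x_prop
        samples.append(x.copy())  # on rejection the chain repeats x^(t)
    return np.array(samples)

def log_p(x):
    # unnormalized log-density of a two-component Gaussian mixture
    return np.logaddexp(-0.5 * np.sum((x - 2.0) ** 2),
                        -0.5 * np.sum((x + 2.0) ** 2))

samples = metropolis_hastings(log_p, x0=np.zeros(1))
print(samples.mean(), samples.std())

The variance tradeoff described above is visible here: shrinking sigma raises the acceptance rate but makes consecutive samples nearly identical, slowing exploration.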

2.2 Gibbs Sampling - Special case of Metropolis-Hastings

Gibbs sampling is another popular MCMC method. In Gibbs sampling, each random variable $x_i$ is sampled separately, with all the other variables held fixed. In this particular case, the proposal distribution $Q$ is chosen in the following manner:

$$Q(x'|x) = P(x'_i \mid x_{-i})$$

This is a special case of Metropolis-Hastings with an acceptance rate of 1. The details of Gibbs sampling are given in Algorithm 2.

Algorithm 2 The Gibbs Sampling Algorithm [6]
1: Input: Conditional proposals $P(x_i \mid x_{-i})$
2: Output: Sampled set of N correlated samples drawn from $P(x)$
3: Initialize state $x_1^{(0)}, \ldots, x_K^{(0)}$
4: Initialize sampled set as Null
5: for $t = 1 \ldots N$ do
6:   for $i = 1 \ldots K$ do
7:     Sample $x_i^{(t+1)} \sim P(x_i \mid x_1^{(t+1)}, \ldots, x_{i-1}^{(t+1)}, x_{i+1}^{(t)}, \ldots, x_K^{(t)})$
8:   end for
9:   Add $x^{(t+1)}$ to the sampled set
10: end for
11: return the sampled set
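
To see why the acceptance rate is 1, substitute the Gibbs proposal into the Metropolis-Hastings acceptance ratio (a short derivation added here for clarity). A Gibbs move resamples only coordinate $i$, so $x'_{-i} = x^{(t)}_{-i} = x_{-i}$, and $P(x) = P(x_i \mid x_{-i})\,P(x_{-i})$:

$$\alpha = \frac{P(x')\,Q(x^{(t)} \mid x')}{P(x^{(t)})\,Q(x' \mid x^{(t)})} = \frac{P(x'_i \mid x_{-i})\,P(x_{-i})\,P(x^{(t)}_i \mid x_{-i})}{P(x^{(t)}_i \mid x_{-i})\,P(x_{-i})\,P(x'_i \mid x_{-i})} = 1.$$

Every proposal is therefore accepted.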

3 Gibbs sampling

3.1 Taking multiple chains

There is no fixed method that can detect the convergence of a Markov chain. However, certain heuristics can be used to detect the convergence of MCMC methods. A common method consists of running multiple chains created with different seeds in parallel (one chain per CPU). When all the chains have close log-likelihoods, we can reasonably assume that they have converged. This heuristic is illustrated in figure 1. Each chain is run in parallel on a different core. While the chains on the left plot have not converged, we can reasonably assume the chains on the right plot have converged.

Figure 1: Number of iterations vs log-likelihood for multiple chains - left: no convergence - right: convergence

However, this strategy of running multiple chains in parallel does not solve all the issues. Indeed, if the chains have a long burn-in period (years, for instance), then this strategy will not reduce that duration. Hence, it is important to be able to sample individual chains faster.
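
The multiple-chains heuristic can be sketched in a few lines of Python (an editor-added illustration with a toy one-dimensional target; the window size and tolerance are arbitrary): run several seeded chains, in practice one per core, and declare convergence only when their recent log-likelihood averages agree.

import numpy as np

def run_chain(seed, n_iters=5000, sigma=1.0):
    """One random-walk MH chain on a toy 1-D Gaussian target;
    returns the trace of log P(x) (up to a constant)."""
    rng = np.random.default_rng(seed)
    log_p = lambda x: -0.5 * x ** 2
    x, trace = 5.0 * rng.standard_normal(), []
    for _ in range(n_iters):
        x_prop = x + sigma * rng.standard_normal()
        if np.log(rng.uniform()) < log_p(x_prop) - log_p(x):
            x = x_prop
        trace.append(log_p(x))
    return np.array(trace)

def chains_converged(traces, window=500, tol=0.1):
    """Heuristic from the text: the chains' mean log-likelihoods over the
    last `window` iterations should all lie within `tol` of each other."""
    recent = [t[-window:].mean() for t in traces]
    return max(recent) - min(recent) < tol

traces = [run_chain(seed) for seed in range(4)]  # one chain per core in practice
print("converged:", chains_converged(traces))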

4 Naive parallel Gibbs sampling

A naive strategy for speeding up Gibbs sampling is synchronous Gibbs sampling. This method consists of sampling all the variables of a problem in parallel, each on a different processor. The method is illustrated on the alarm network example (figure 2).

Figure 2: Naive parallel Gibbs sampling for the alarm network

This approach works as follows:

1. Assign each variable to a processor
2. Sample each variable at time t using the values of the other variables at time t-1
3. Send all the updated variables to all the processors

The main difference with the sequential Gibbs sampler is that each variable is sampled using the *old* values of the other variables, regardless of whether newer values have been sampled for them.

Algorithm 3 The Synchronous Gibbs Sampler [3]
1: for all variables $X_j$ do in parallel
2:   Execute Gibbs update: $X_j^{(t+1)} \sim \pi(X_j \mid x_{N_j}^{(t)})$
3: barrier end

In this example, consider that all the variables are set to "false" at $t = 0$. Variable B will be assigned to worker 0, E to worker 1, A to worker 2, J to worker 3 and M to worker 4. At time $t = 1$, worker 0 will sample variable B using all the other variables at time $t = 0$ (figure 3).

Figure 3: Variable B is sampled at time t = 1 using the values of the other variables at time t = 0

Worker 1 will sample variable E in parallel using all the variables at time $t = 0$ (figure 4).

Figure 4: Variable E is sampled at time t = 1 using the values of the other variables at time t = 0

Worker 2 will sample variable A in parallel using all the variables at time $t = 0$ (figure 5).

Figure 5: Variable A is sampled at time t = 1 using the values of the other variables at time t = 0

One possible disadvantage of this method is that it can require a lot of communication (to send the variable updates to all the workers). This problem can be limited by assigning blocks of variables to workers instead of single variables. Even though this method works well for many models, like collapsed Gibbs sampling for LDA, for many other models the synchronous Gibbs sampler is not ergodic and hence does not converge to the correct stationary distribution.

Let us see an example where the synchronous Gibbs sampler does not converge to the stationary distribution. Consider the Bayes network defined in figure 6. This network essentially encodes an XOR relation between (A,B) and (A,C): if A is true, then with high probability B and C will be false. On the other hand, if A is false, then with high probability B and C will be true.

Figure 6: Example of a Bayes network where naive parallel Gibbs sampling fails

Using the naive parallel Gibbs sampling strategy, we sample each of the variables A, B, and C using the old values. Starting from a state at $t = 0$ where all the variables are set to false (figure 7), we sample A from

$$P(A \mid B, C) \propto P(A)\,P(B \mid A)\,P(C \mid A).$$

Given that B and C are false (and with a uniform prior on A), A will, with high probability, be true:

$$P(A = T \mid B = F, C = F) \propto P(B = F \mid A = T)\,P(C = F \mid A = T) = 0.999 \times 0.999 \approx 1$$
$$P(A = F \mid B = F, C = F) \propto P(B = F \mid A = F)\,P(C = F \mid A = F) = 0.001 \times 0.001 \approx 0$$

Likewise, the naive parallel Gibbs sampler draws "true" for variable B with high probability:

$$P(B = T \mid A = F) = 0.999 \approx 1$$
$$P(B = F \mid A = F) = 0.001 \approx 0$$

and variable C will, with high probability, be drawn "true" as well (figures 8, 9, and 10). At time $t = 1$, all the variables will therefore be true with high probability. At the following iteration, the variables will likely all be chosen "false" again, and so on (figure 11). In fact, this algorithm will only flip into a state of the stationary distribution with probability of about $1/10^4$. In sequential Gibbs sampling, by contrast, once A has flipped, neither B nor C would flip. In general, the naive parallel Gibbs sampler performs poorly on near-deterministic distributions. There are, however, ways to perform good-quality Gibbs sampling in parallel.

Figure 7: At time t = 0, all the variables are chosen false
Figure 8: At time t = 1, variable A will, with high probability, be drawn true
Figure 9: At time t = 1, variable B will, with high probability, be drawn true
Figure 10: At time t = 1, variable C will, with high probability, be drawn true
Figure 11: At time t = 2, all the variables will, with high probability, be drawn false
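
The oscillation is easy to reproduce in simulation. The sketch below (an added illustration, using the 0.999/0.001 probabilities from the example and an assumed uniform prior on A) contrasts one synchronous step, where every variable reads the old state, with one sequential sweep, where each update sees the newest values. The synchronous chain flips between all-true and all-false almost every iteration.

import numpy as np

rng = np.random.default_rng(0)
P_CHILD_TRUE = {True: 0.001, False: 0.999}  # P(B=T | A); C is identical
P_A = 0.5                                   # assumed uniform prior on A

def sample_A(b, c):
    """Sample A from P(A | B, C), proportional to P(A) P(B|A) P(C|A)."""
    def score(a):
        pb = P_CHILD_TRUE[a] if b else 1 - P_CHILD_TRUE[a]
        pc = P_CHILD_TRUE[a] if c else 1 - P_CHILD_TRUE[a]
        return (P_A if a else 1 - P_A) * pb * pc
    p_true = score(True) / (score(True) + score(False))
    return bool(rng.uniform() < p_true)

def sample_child(a):
    """Sample B (or C) from P(. | A); B and C are independent given A."""
    return bool(rng.uniform() < P_CHILD_TRUE[a])

def synchronous_step(a, b, c):
    # all three workers read the *old* state from time t-1
    return sample_A(b, c), sample_child(a), sample_child(a)

def sequential_sweep(a, b, c):
    # each update immediately sees the newest values of the others
    a = sample_A(b, c)
    b = sample_child(a)
    c = sample_child(a)
    return a, b, c

state = (False, False, False)
for t in range(6):
    state = synchronous_step(*state)
    print(t + 1, state)  # oscillates: (T,T,T), (F,F,F), (T,T,T), ...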

5 The Chromatic sampler

An interesting implementation of parallel Gibbs sampling has been proposed by Gonzalez et al. (2011) [3], who introduced the Chromatic sampler. This method is based on two important concepts: graph coloring and the Markov blanket. In Markov Random Fields, the Markov blanket of a node is its set of immediate neighbors, and a node is independent of all the other nodes of the graph given its Markov blanket. Consider the graph in figure 12. Because each node of a given color is separated from the other nodes of the same color, we can sample it independently of the other nodes of that color. The Chromatic sampler will hence:

1. Find subsets of nodes such that no two nodes in a given subset are in each other's Markov blankets, and the subsets cover the whole graph (i.e., find a graph coloring; a greedy heuristic is sketched below)
2. Sample each colored subset in parallel

Figure 12: Example of graph coloring

Since each color class can be sampled in parallel, we want to find a graph coloring whose color classes are as large as possible (equivalently, one that uses as few colors as possible). In the two-color example graph, the Chromatic sampler samples all the green nodes in parallel, sends the updates to all the nodes, then samples all the red nodes in parallel, and so on.
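
As a concrete sketch of step 1 (an editor-added illustration, not the specific heuristic of [3]), a greedy pass that gives each node the smallest color not already used in its Markov blanket produces a valid, though not necessarily minimal, coloring:

def greedy_coloring(neighbors):
    """neighbors: dict mapping node -> set of adjacent nodes (its Markov
    blanket in an MRF). Returns a dict node -> color index."""
    color = {}
    for v in neighbors:                   # any fixed visiting order
        used = {color[u] for u in neighbors[v] if u in color}
        c = 0
        while c in used:                  # smallest color unused by neighbors
            c += 1
        color[v] = c
    return color

# A 4-cycle is bipartite; greedy finds a valid 2-coloring here.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(greedy_coloring(cycle))             # {0: 0, 1: 1, 2: 0, 3: 1}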

One important particular case is given by trees: these graphs always admit a 2-coloring. More generally, bipartite graphs are always 2-colorable. This strategy does not work well for all graphs, however. For full cliques, for instance, n nodes require n colors. Besides, the problem of determining whether a graph is k-colorable is NP-hard, so heuristics have to be used to find a good coloring.

Algorithm 4 The Chromatic Sampler [3]
1: Input: k-colored MRF
2: for each of the k colors $\kappa_i$ : $i \in \{1, \ldots, k\}$ do
3:   for all variables $X_j \in \kappa_i$ in the $i$th color do in parallel
4:     Execute Gibbs update: $X_j^{(t+1)} \sim \pi(X_j \mid x_{N_j \cap \kappa_{<i}}^{(t+1)}, x_{N_j \cap \kappa_{>i}}^{(t)})$
5:   barrier end
6: end for
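
The following sketch runs chromatic sweeps on a small binary pairwise MRF (an Ising-style model invented for illustration; the coupling J is arbitrary). Nodes within a color class share no edges, so their updates are conditionally independent and could run in parallel; here they run in a plain loop for simplicity.

import numpy as np

def chromatic_sweep(x, neighbors, color, J=0.5, seed=None):
    """One full sweep of the Chromatic sampler on a binary (+1/-1) pairwise
    MRF with P(x) proportional to exp(J * sum over edges of x_i x_j).
    Within each color class the updates are conditionally independent."""
    rng = np.random.default_rng(seed)
    k = max(color.values()) + 1
    for c in range(k):
        for v in [u for u in neighbors if color[u] == c]:
            field = J * sum(x[u] for u in neighbors[v])
            p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))  # P(x_v = +1 | blanket)
            x[v] = 1 if rng.uniform() < p_plus else -1
    return x

cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
coloring = {0: 0, 1: 1, 2: 0, 3: 1}      # the 2-coloring from the sketch above
state = {v: 1 for v in cycle}
for sweep in range(5):
    state = chromatic_sweep(state, cycle, coloring, seed=sweep)
print(state)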

6 Online parallel MCMC

For many problems, the data is received as a stream. Examples include news or Twitter feeds. We need to process new data points one at a time. In addition, we need to be able to "forget" old data points, because the amount of memory available is limited. Because of the rate at which this data is received, we need to process each new data point in constant time. And because of the amount of data to be processed, it is important to employ methods that can be parallelized in order to scale them up.

6.1 Sequential Monte Carlo

Sequential Monte Carlo (SMC) is a method that generalizes particle filters. Particle filters incrementally sample latent variables X at time t given the observations Y up to t: $P(X_t \mid Y_{1:t})$. While particle filters assume the graphical model is a state-space model, SMC does not assume the model has any particular structure at all: given n random variables $x_1, \ldots, x_n$, SMC constructs proposal distributions such that, given samples from $P(x_{1:k})$, we can sample from $P(x_{1:k+1})$ in constant time. This method does not require the (intractable) computation of the normalization constant, as the distribution $P(x_{1:n})$ need only be evaluated up to a multiplicative constant.

6.2 Sequential Importance Sampling

Sequential importance sampling is the foundation of sequential Monte Carlo. In order to get samples in constant time, we assume the proposal distribution factorizes in the following form:

$$q_n(x_{1:n}) = q_{n-1}(x_{1:n-1})\,q_n(x_n \mid x_{1:n-1}) = q_1(x_1) \prod_{k=2}^{n} q_k(x_k \mid x_{1:k-1})$$

The advantage of this method is that once we have samples of $x_1$ through $x_k$, we do not need to redraw them. Note that the variables $x_i$ can represent more than single variables. For instance, if the observed variable $y_i$ represents a document, then $x_i$ can represent its distribution over topics.

In normalized importance sampling, the sample weights are defined by

$$w_i = \frac{r_i}{\sum_j r_j}, \qquad \text{where} \quad r_i = \frac{P'(x^i)}{Q(x^i)} \qquad (1)$$

That is, the weights are proportional to the (unnormalized) probability of each sample divided by the proposal density of that sample. They quantify the importance of each sample in estimating the marginal distribution. We saw previously that the samples can be drawn in constant time. We can also get the weights in constant time. In sequential importance sampling, the unnormalized weights $r$ can be written as:

$$r(x_{1:n}) = \frac{P'_n(x_{1:n})}{q_n(x_{1:n})}$$

$$r(x_{1:n}) = \frac{P'_{n-1}(x_{1:n-1})}{q_{n-1}(x_{1:n-1})} \cdot \frac{P'_n(x_{1:n})}{P'_{n-1}(x_{1:n-1})\,q_n(x_n \mid x_{1:n-1})}$$

$$r(x_{1:n}) = r_{n-1}(x_{1:n-1})\,\alpha_n(x_{1:n})$$

$$r(x_{1:n}) = r_1(x_1) \prod_{k=2}^{n} \alpha_k(x_{1:k})$$

where

$$\alpha_n(x_{1:n}) = \frac{P'_n(x_{1:n})}{P'_{n-1}(x_{1:n-1})\,q_n(x_n \mid x_{1:n-1})}$$

This formula shows that the unnormalized weights can be computed incrementally: we compute $\alpha_n$ and use it to update $r(x_{1:n-1})$ to $r(x_{1:n})$. For this update to be constant time, $P'_n(x_{1:n})$ must be computable from $P'_{n-1}(x_{1:n-1})$ in constant time. We save the unnormalized weights $r$ at each iteration, and compute the normalized weights $w$ as needed from $r$ using (1). Hence, for each new variable $x_n$, we can sample $x$ and compute the normalized weights $w$ in constant time. And, since the samples do not depend on each other, sequential importance sampling can easily be parallelized. The algorithm is described in Algorithm 5. Note that the algorithm is very fast, even though the computation of the normalized weights is sequential. Although in theory we should synchronize this step, these updates can be done asynchronously [5].
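
Before the full algorithm listing, here is a small numerical sanity check of the telescoping identity above (an added illustration with toy choices: the unnormalized targets $P'_n$ are products of Gaussian factors and the proposals $q_n$ are standard normals). Computing $r(x_{1:n})$ directly and via the product of the $\alpha_k$ gives the same value.

import numpy as np

def log_P(x):
    # unnormalized target over x_{1:n}: a toy product of Gaussian factors
    x = np.asarray(x)
    return float(-0.5 * np.sum((x - 1.0) ** 2))

def log_q_cond(x_n, x_prev):
    # proposal q_n(x_n | x_{1:n-1}): a standard normal, ignoring the history
    return float(-0.5 * x_n ** 2 - 0.5 * np.log(2 * np.pi))

rng = np.random.default_rng(1)
x = rng.standard_normal(5)

# direct: r(x_{1:n}) = P'_n(x_{1:n}) / q_n(x_{1:n})
log_r_direct = log_P(x) - sum(log_q_cond(x[k], x[:k]) for k in range(len(x)))

# incremental: r = r_1 * product of alpha_k, with
# alpha_k = P'_k(x_{1:k}) / (P'_{k-1}(x_{1:k-1}) q_k(x_k | x_{1:k-1}))
log_r = log_P(x[:1]) - log_q_cond(x[0], x[:0])
for k in range(1, len(x)):
    log_alpha = log_P(x[:k + 1]) - log_P(x[:k]) - log_q_cond(x[k], x[:k])
    log_r += log_alpha

print(np.isclose(log_r, log_r_direct))   # True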

Algorithm 5 Sequential Importance Sampling
for n = 1 do
  In parallel, draw samples $x_1^i \sim q_1(x_1)$
  In parallel, compute unnormalized weights $r_1^i = \frac{P'_1(x_1^i)}{q_1(x_1^i)}$
  Compute normalized weights $w_1^i$ by normalizing $r_1^i$
end for
for n = 2, 3, ... do
  In parallel, draw samples $x_n^i \sim q_n(x_n \mid x_{1:n-1}^i)$
  In parallel, compute unnormalized weights $r_n^i = r_{n-1}^i\,\alpha_n(x_{1:n}^i) = r_{n-1}^i\,\frac{P'_n(x_{1:n}^i)}{P'_{n-1}(x_{1:n-1}^i)\,q_n(x_n^i \mid x_{1:n-1}^i)}$
  Compute the normalized weights $w_n^i$ by normalizing $r_n^i$
end for

One major drawback of sequential importance sampling is that the variance of the samples increases exponentially with n [4], and the accuracy can decrease significantly. One solution to this problem is to resample the particles at every iteration. Beyond decreasing the variance, resampling:

1. removes samples $x^k$ with low weights, which come from low-probability regions, whereas we want to focus on the high-probability regions of $P(x)$. Since each sample gets an equal amount of computation regardless of its weight $w^k$, resampling makes sure we spend the weight computations on high-probability samples.

2. prevents a small number of samples $x^k$ from dominating the empirical distribution: every time we resample, all the weights are reset to $\frac{1}{N}$, thus preventing some sample weights from growing too large.

The sequential Monte Carlo algorithm is then just sequential importance sampling with resampling at every iteration:

Algorithm 6 Sequential Monte Carlo
for n = 1 do
  In parallel, draw samples $x_1^i \sim q_1(x_1)$
  In parallel, compute unnormalized weights $r_1^i = \frac{P'_1(x_1^i)}{q_1(x_1^i)}$
  Compute normalized weights $w_1^i$ by normalizing $r_1^i$
  In parallel, resample $\{w_1^i, x_1^i\}$ into N equally weighted particles $x_1^i$
end for
for n = 2, 3, ... do
  In parallel, draw samples $x_n^i \sim q_n(x_n \mid x_{1:n-1}^i)$
  In parallel, compute unnormalized weights $r_n^i = r_{n-1}^i\,\alpha_n(x_{1:n}^i) = r_{n-1}^i\,\frac{P'_n(x_{1:n}^i)}{P'_{n-1}(x_{1:n-1}^i)\,q_n(x_n^i \mid x_{1:n-1}^i)}$
  Compute the normalized weights $w_n^i$ by normalizing $r_n^i$
  In parallel, resample $\{w_n^i, x_n^i\}$ into N equally weighted particles $x_n^i$
end for
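
A compact runnable sketch of Algorithm 6 follows (with illustrative assumptions: the model is a toy Gaussian random-walk state-space model, the transition is used as the proposal, which makes $\alpha_n$ reduce to the observation likelihood $P(y_n \mid x_n)$, and multinomial resampling stands in for the "resample in parallel" step).

import numpy as np

def smc(y, n_particles=1000, obs_std=1.0, trans_std=1.0, seed=None):
    """SMC for a Gaussian random-walk state-space model:
    x_n ~ N(x_{n-1}, trans_std^2), y_n ~ N(x_n, obs_std^2).
    With the transition as proposal, the incremental weight alpha_n is just
    the likelihood P(y_n | x_n), computable in constant time per step."""
    rng = np.random.default_rng(seed)
    means = []
    # n = 1: draw from the initial proposal and weight by the likelihood
    x = rng.standard_normal(n_particles)
    logw = -0.5 * ((y[0] - x) / obs_std) ** 2
    for y_n in y[1:]:
        # resample into N equally weighted particles
        w = np.exp(logw - logw.max())
        x = rng.choice(x, size=n_particles, p=w / w.sum())
        # propagate (parallel across particles) and reweight incrementally
        x = x + trans_std * rng.standard_normal(n_particles)
        logw = -0.5 * ((y_n - x) / obs_std) ** 2
        w = np.exp(logw - logw.max())
        means.append(np.sum(x * w) / np.sum(w))
    return np.array(means)

y = np.cumsum(np.random.default_rng(2).standard_normal(50))  # synthetic data
print(smc(y, seed=0)[-5:])   # filtered posterior means for the last few steps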

7 Parallel inference for Bayesian Nonparametrics

In this section, we review parallel inference methods for a popular Bayesian nonparametric model, viz. the Dirichlet Process Mixture Model (DPMM). We first recap the model and then discuss standard and parallel inference techniques for DPMMs. Finally, we discuss some comparative results.

7.1 The Dirichlet Process Mixture Model

Figure 13: Infinite mixture models: the restaurant perspective

Let us begin with a finite representation. Assume a restaurant with K tables. Each table has one and only one dish (represented by a color in figure 13). Specifically, the items in this picture represent the following:

Table : cluster
People : items to be clustered
Dish/color on each table : center of each cluster
Hidden variables : assignment of people to tables

The estimation problem is to find out what dish is on each table (parameter estimation) and the assignments of people to the different tables (cluster assignments). Now, let us take different situations into account. If each person sits at the table with their most preferred dish/color, this algorithm simulates hard k-means, with people (items) being assigned to tables (clusters). In another scenario, people sit at a table with probability proportional to their appreciation of its dish/color. This is a representation of the soft k-means algorithm.

Figure 14: Soft k-means as a generative process

In general, soft k-means can be described as a generative process. Each dish (parameter setting) is represented by a distribution from which items can be drawn. Specifically, we draw K cluster centers from a distribution H. Then, we draw $Z_i$ from a uniform distribution over the K clusters. Using $Z_i$ and the cluster information, we then proceed to generate item $X_i$. This is depicted as a graphical model in figure 14.

Figure 15: Finite mixture model: generative model

Yet another "flavor" of the above problem is obtained by adding the "rich get richer" property. In that case, people sit at a table with probability proportional both to their appreciation of its dish/color and to the number of people already sitting at the table. Thus, tables with more people will attract new people towards them. This is the finite version of the Dirichlet Process Mixture Model. Its generative process is very similar to that of soft k-means, except that cluster assignments are now drawn from a multinomial (whose weights are given a Dirichlet prior) instead of from the uniform distribution. The graphical model associated with this process is described in figure 15.
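
The finite generative process just described can be sketched in a few lines (an added illustration; all parameter values are arbitrary). The uneven Dirichlet-distributed mixing weights produce the "rich get richer" effect; replacing pi with uniform weights recovers the soft k-means version.

import numpy as np

rng = np.random.default_rng(3)
K, N, alpha = 5, 200, 1.0

centers = 10.0 * rng.standard_normal(K)   # theta_k ~ H (toy base measure)
pi = rng.dirichlet(alpha * np.ones(K))    # mixing weights ~ Dirichlet(alpha)
z = rng.choice(K, size=N, p=pi)           # z_i ~ Multinomial(pi)
x = centers[z] + rng.standard_normal(N)   # x_i ~ N(theta_{z_i}, 1)

print(np.bincount(z, minlength=K))        # heavier tables attract more people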

7.1.1 Infinite representation of the Dirichlet Process Mixture Model

Figure 16: DPMM generative process
Figure 17: Description of the generative process

Now, assume that the number of tables is infinite. People either sit at an existing table with probability proportional to their appreciation of its dish/color and the number of people sitting at it, or, with a certain probability, choose a new table and place a new dish on it. The posterior distribution of such a setting is a Dirichlet process, and hence this model is referred to as a Dirichlet process mixture model.

Samples from a Dirichlet process can be generated using the stick-breaking construction. The probabilities of choosing a table must sum to one. In the stick-breaking construction, we break a stick into two parts. The first part is the probability of choosing the first table (cluster). We then take the remaining part of the stick and break it into two parts again, and again take the first part as the probability of choosing the next table. Repeating this process infinitely many times yields a distribution over an infinite number of tables (clusters). The generative process of the DPMM is described in figures 16 and 17. This process involves a multinomial distribution over an infinite number of clusters. Each component $V_k$ of the multinomial is drawn from a Beta distribution to simulate the stick-breaking process. Each $Z_n$ is drawn from the resulting infinite multinomial. Using $Z_n$ and the cluster information, samples can be drawn from the distribution.
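
A minimal truncated stick-breaking sketch (the truncation level T and concentration alpha are arbitrary illustrative choices; the true construction continues infinitely):

import numpy as np

def stick_breaking(alpha=1.0, T=100, seed=None):
    """Truncated stick-breaking: V_k ~ Beta(1, alpha) and
    pi_k = V_k * prod_{j<k} (1 - V_j); the pi_k sum to 1 as T grows."""
    rng = np.random.default_rng(seed)
    V = rng.beta(1.0, alpha, size=T)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - V)[:-1]])
    return V * remaining                  # probability of choosing table k

pi = stick_breaking(alpha=2.0, T=100, seed=0)
print(pi[:5], pi.sum())                   # first few table probabilities, total near 1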

7.2 Inference for the Dirichlet Process Mixture Model

A simple Gibbs sampling algorithm would involve sampling each of the variables given fixed values of the rest. The variables to sample are the table proportions $V_k$, the table assignment of each customer ($Z$), and the dish at each table ($\eta_k$). This is easily parallelizable, and several different parallelization schemes are described subsequently.

7.3 Parallel Inference for the Dirichlet Process Mixture Model

An easy way to parallelize Gibbs sampling for DPMMs is to use the observation that, given $V_k$ and $\eta_k$, all the $Z$s are independent of each other. Thus, we can parallelize the inference of all the $Z$s, but are forced to sample all $V_k$ and $\eta_k$ together. Empirical results show that this process leads to poor mixing, and it is thus not of great practical utility. An interesting way to improve this process was introduced recently [7]; it involves sampling all the $Z$s on different processors while maintaining a local copy of all $V_k$ and $\eta_k$ on each processor, obtained from noisy samples.

Collapsed Gibbs sampling involves integrating out all $V_k$ and $\eta_k$. This leads to good mixing, but it couples all the $Z$s and makes parallel inference hard. Thus, collapsed Gibbs sampling suffers from a large computational cost.

Yet another inference technique is variational inference. In variational inference, we approximate the posterior with a distribution belonging to a more manageable family of distributions. This typically involves making simplifying assumptions about the original model and relaxing many dependencies. Parallel inference is easy in variational methods, but the search space of "good" models becomes limited, and these methods are typically less accurate than MCMC methods.

Sequential Monte Carlo (SMC) methods keep a pool of particles and approximate the distribution using a weighted combination of the pool. Parallel inference is easy, but SMC methods suffer from high variance. To alleviate this, it is sometimes necessary to resample via MCMC methods.

7.3.1 Parallel MCMC for DPMMs

A recent idea for parallel inference in DPMMs uses a very simple trick: running local collapsed Gibbs samplers on different cores/processors and then performing a simple/approximate recombination of the results produced by each core. This generally does not work very well, because two newly discovered clusters on two different processors may not be the same, and recombining the results is itself a problem. However, this scheme has been shown to give good results empirically for some models.

Another parallel inference scheme uses the idea that a Dirichlet mixture of Dirichlet processes is itself a Dirichlet process. This can be understood using a nice analogy. Each restaurant can be reached via a key $\pi_i$ (drawn from a multinomial distribution) earmarked for that restaurant, so choosing the restaurant can itself be seen as part of a Dirichlet process. Given the keys of all items, each processor is independent of the other processors: conditioned on the restaurant allocation, the data are distributed according to P independent Dirichlet processes. We can then perform local collapsed Gibbs sampling on the independent DPs on each processor, and combine the results. We can also propose to move a cluster c to a processor p; the acceptance ratio depends on the cluster size. This can be done asynchronously without affecting the performance of the algorithm. The generative process is described in figure 18, which shows the equivalence between the auxiliary variable method and the original method.
By introducing some auxiliary variables, we are able to add the power of parallelism to the process.

Figure 18: Generative process of the auxiliary variable method, showing its equivalence to the original DPMM
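
A structural sketch of this scheme follows (a heavily simplified, editor-added illustration rather than the algorithm from the lecture: the local sampler is a crude collapsed CRP-style Gibbs step with a unit-variance Gaussian likelihood, and the cross-processor cluster-move step is omitted). Data points are first routed to P processors by multinomial keys; conditioned on the keys, each shard follows an independent DP and can be sampled locally.

import numpy as np

rng = np.random.default_rng(4)
P, N, alpha = 4, 400, 1.0

x = np.concatenate([rng.normal(-4, 1, N // 2), rng.normal(4, 1, N // 2)])
keys = rng.choice(P, size=N)          # auxiliary keys: a processor per point

def local_crp_gibbs(x_shard, alpha=1.0, n_sweeps=20):
    """One processor's collapsed Gibbs on its shard, under a CRP prior and a
    unit-variance Gaussian likelihood centered on each table's current mean
    (a crude stand-in for a full conjugate predictive update)."""
    z = np.zeros(len(x_shard), dtype=int)     # start with everyone at table 0
    for _ in range(n_sweeps):
        for i in range(len(x_shard)):
            z[i] = -1                          # remove point i from its table
            tables = np.unique(z[z >= 0])
            probs = []
            for t in tables:
                members = x_shard[z == t]
                # CRP term (table size) times a Gaussian predictive score
                probs.append(len(members) * np.exp(-0.5 * (x_shard[i] - members.mean()) ** 2))
            probs.append(alpha * np.exp(-0.5 * (x_shard[i] / 10.0) ** 2))  # new table
            probs = np.array(probs) / np.sum(probs)
            choice = rng.choice(len(probs), p=probs)
            z[i] = tables[choice] if choice < len(tables) else z.max() + 1
        _, z = np.unique(z, return_inverse=True)   # relabel tables compactly
    return z

# conditioned on the keys, the P shards are independent DPs: sample locally
local_assignments = [local_crp_gibbs(x[keys == p]) for p in range(P)]
print([len(np.unique(za)) for za in local_assignments])   # clusters per shard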

7.4 Results

Figure 19 compares the performance of the algorithms described above on the same dataset. Gibbs sampling suffers from its computational complexity. Variational methods lead to poor accuracy. SMC methods improve in accuracy as the number of particles increases, but suffer from high variance. It is clear from the figure that the auxiliary variable method outperforms the naive parallel MCMC method and gives the best performance in terms of both time and accuracy.

Figure 19: Comparison of many inference methods for DPMMs

Figure 20 compares the perplexity of different models for Hierarchical Dirichlet Processes. It is clear that the AV-parallel method outperforms the naive parallel MCMC method, achieving the lowest perplexity.

Figure 20: Comparison of inference methods for Hierarchical DPs

8 Conclusion

We discussed several parallel inference schemes for graphical models. One conclusion is that naive parallel inference schemes do not always work. It is a good idea to exploit the structure of the problem to obtain conditional independence, which increases the chances of successful parallelization. It is also important to either perform exact parallel inference or obtain a bound on the error of approximate methods, to ensure good-quality inference results.

References

[1] Eric P. Xing. Probabilistic Graphical Models, lecture notes. http://www.cs.cmu.edu/~epxing/Class/10708/

[2] MacKay, David J.C. Introduction to Monte Carlo methods. In Learning in Graphical Models, Springer Netherlands, 1998, pp. 175-204.

[3] Gonzalez, Joseph, Low, Yucheng, Gretton, Arthur, and Guestrin, Carlos. Parallel Gibbs sampling: From colored fields to thin junction trees. In International Conference on Artificial Intelligence and Statistics, pp. 324-332, 2011.

[4] Doucet, Arnaud, De Freitas, Nando, and Gordon, Neil. An Introduction to Sequential Monte Carlo Methods. Springer New York, 2001.

[5] Paige, Brooks, et al. Asynchronous Anytime Sequential Monte Carlo. In Advances in Neural Information Processing Systems, 2014.

[6] Xie, Luu. Scribe notes, 2014.

[7] Newman, David, Asuncion, Arthur, Smyth, Padhraic, and Welling, Max. Distributed algorithms for topic models. Journal of Machine Learning Research, 2009.
