Bayesian Parameter Inference for Stochastic Biochemical Network Models Using Particle Markov Chain Monte Carlo


Downloaded from rsfs.royalsocietypublishing.org on November 1, 2012.

Golightly, A. and Wilkinson, D. J. Bayesian parameter inference for stochastic biochemical network models using particle Markov chain Monte Carlo. Interface Focus, published online 29 September 2011. doi:10.1098/rsfs.2011.0047.

Bayesian parameter inference for stochastic biochemical network models using particle Markov chain Monte Carlo

Andrew Golightly and Darren J. Wilkinson*

School of Mathematics and Statistics, Newcastle University, Merz Court, Newcastle upon Tyne NE1 7RU, UK

*Author for correspondence (d.j.wilkinson@ncl.ac.uk). One contribution to a Theme Issue 'Inference in complex systems'. Received 19 May 2011; accepted 6 September 2011. This journal is © 2011 The Royal Society.

Computational systems biology is concerned with the development of detailed mechanistic models of biological processes. Such models are often stochastic and analytically intractable, containing uncertain parameters that must be estimated from time course data. In this article, we consider the task of inferring the parameters of a stochastic kinetic model defined as a Markov (jump) process. Inference for the parameters of complex nonlinear multivariate stochastic process models is a challenging problem, but we find here that algorithms based on particle Markov chain Monte Carlo turn out to be a very effective computationally intensive approach to the problem. Approximations to the inferential model based on stochastic differential equations (SDEs) are considered, as well as improvements to the inference scheme that exploit the SDE structure. We apply the methodology to a Lotka–Volterra system and a prokaryotic auto-regulatory network.

Keywords: chemical Langevin equation; Markov jump process; pseudo-marginal approach; sequential Monte Carlo; stochastic differential equation; stochastic kinetic model

1. INTRODUCTION

Computational systems biology [1] is concerned with developing dynamic simulation models of complex biological processes. Such models are useful for developing a quantitative understanding of the process, for testing current understanding of the mechanisms, and to allow in silico experimentation that would be difficult or time-consuming to carry out on the real system in the laboratory. The dynamics of biochemical networks at the level of the single cell are well known to exhibit stochastic behaviour [2]. A major component of the stochasticity is intrinsic to the system, arising from the discreteness of molecular processes [3]. The theory of stochastic chemical kinetics allows the development of Markov process models for network dynamics [4], but such models typically contain rate parameters that must be estimated from imperfect time course data [5]. Inference for such partially observed nonlinear multivariate Markov process models is an extremely challenging problem [6]. However, several strategies for rendering the inference problems more tractable have been employed in recent years, and new methodological approaches have recently been developed that offer additional possibilities. This paper will review these methods, and show how they may be applied in practice to some low-dimensional but nevertheless challenging problems.

In §2, a review of the basic structure of the problem is presented, showing how the Markov process representation of a biochemical network is constructed, and introducing a diffusion approximation that greatly improves computational tractability. In §3, a Bayesian approach to the inferential problem is given, together with an introduction to methods of solution that are 'likelihood-free' in the sense that they do not require evaluation of the discrete time transition kernel of the Markov process. Unfortunately, most obvious inferential algorithms suffer from scalability problems as either the number of parameters or time points increases. In §4, it is shown how the recently proposed particle Markov chain Monte Carlo (pMCMC) algorithms [7] may be applied to this class of models, as these do not suffer from scalability problems in the same way as more naive approaches. It is also shown how the structure of stochastic differential equation (SDE) models may be exploited in order to adapt the basic pMCMC approach, departing from the likelihood-free paradigm, but leading to algorithms that are (relatively) computationally efficient, even in low/no measurement error scenarios where likelihood-free approaches tend to break down.

2. STOCHASTIC CHEMICAL KINETICS

For mass-action stochastic kinetic models (SKMs) [3], it is assumed that the state of the system at a given time is represented by the number of molecules of each reacting chemical 'species' present in the system at that time,

and that the state of the system is changed at discrete times according to one or more reaction 'channels'. So consider a biochemical reaction network involving u species X_1, X_2, ..., X_u and v reactions R_1, R_2, ..., R_v, written using standard chemical reaction notation as

    R_1: p_{11} X_1 + p_{12} X_2 + ... + p_{1u} X_u → q_{11} X_1 + q_{12} X_2 + ... + q_{1u} X_u
    R_2: p_{21} X_1 + p_{22} X_2 + ... + p_{2u} X_u → q_{21} X_1 + q_{22} X_2 + ... + q_{2u} X_u
        ⋮
    R_v: p_{v1} X_1 + p_{v2} X_2 + ... + p_{vu} X_u → q_{v1} X_1 + q_{v2} X_2 + ... + q_{vu} X_u.

Let X_{j,t} denote the number of molecules of species X_j at time t, and let X_t be the u-vector X_t = (X_{1,t}, X_{2,t}, ..., X_{u,t})'. The v × u matrix P consists of the coefficients p_{ij}, and Q is defined similarly. The u × v stoichiometry matrix S is defined by

    S = (Q − P)'.

The matrices P, Q and S will typically be sparse. On the occurrence of a reaction of type i, the system state, X_t, is updated by adding the ith column of S. Consequently, if ΔR is a v-vector containing the number of reaction events of each type in a given time interval, then the system state should be updated by ΔX, where

    ΔX = S ΔR.

The stoichiometry matrix therefore encodes important structural information about the reaction network. In particular, vectors in the left null-space of S correspond to conservation laws in the network. That is, any u-vector a that satisfies a'S = 0 has the property (clear from the above equation) that a'X_t remains constant for all t.

Under the standard assumption of mass-action stochastic kinetics, each reaction R_i is assumed to have an associated rate constant, c_i, and a propensity function, h_i(X_t, c_i), giving the overall hazard of a type i reaction occurring. That is, the system is a Markov jump process (MJP), and for an infinitesimal time increment dt, the probability of a type i reaction occurring in the time interval (t, t + dt] is h_i(X_t, c_i) dt.
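The stoichiometry construction above can be made concrete. The following minimal sketch builds P, Q and S and applies the update ΔX = SΔR, assuming the Lotka–Volterra network that the paper later uses as a test case (prey birth, predation, predator death); the state and reaction counts are purely illustrative:

```python
import numpy as np

# Lotka-Volterra network: prey X1, predator X2 (illustrative example).
# R1: X1 -> 2 X1,  R2: X1 + X2 -> 2 X2,  R3: X2 -> 0
P = np.array([[1, 0],
              [1, 1],
              [0, 1]])        # v x u matrix of reactant coefficients p_ij
Q = np.array([[2, 0],
              [0, 2],
              [0, 0]])        # v x u matrix of product coefficients q_ij
S = (Q - P).T                 # u x v stoichiometry matrix, S = (Q - P)'

x = np.array([50, 100])       # current state X_t (molecule counts)
dR = np.array([3, 1, 2])      # number of events of each reaction type
x_new = x + S @ dR            # state update: delta X = S delta R

# A left null-space vector a (with a'S = 0) would encode a conservation
# law; this particular network has none, as S has full row rank.
print(x_new)                  # [52 99]
```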
The hazard function for a particular reaction of type i takes the form of the rate constant multiplied by a product of binomial coefficients expressing the number of ways in which the reaction can occur, that is,

    h_i(X_t, c_i) = c_i \prod_{j=1}^{u} \binom{X_{j,t}}{p_{ij}}.

It should be noted that this hazard function differs slightly from the standard mass-action rate laws used in continuous deterministic modelling, but is consistent (up to a constant of proportionality in the rate constant) asymptotically in the high concentration limit.

Let c = (c_1, c_2, ..., c_v)' and h(X_t, c) = (h_1(X_t, c_1), h_2(X_t, c_2), ..., h_v(X_t, c_v))'. Values for c and the initial system state X_0 = x_0 complete specification of the Markov process. Although this process is rarely analytically tractable for interesting models, it is straightforward to forward-simulate exact realizations of this Markov process using a discrete event simulation method. This is due to the fact that if the current time and state of the system are t and X_t, respectively, then the time to the next event will be exponential with rate parameter

    h_0(X_t, c) = \sum_{i=1}^{v} h_i(X_t, c_i),

and the event will be a reaction of type R_i with probability h_i(X_t, c_i)/h_0(X_t, c), independently of the waiting time. Forwards simulation of process realizations in this way is typically referred to as Gillespie's direct method in the stochastic kinetics literature, after Gillespie [4]. See Wilkinson [3] for further background on stochastic kinetic modelling.

In fact, the assumptions of mass-action kinetics, as well as the one-to-one correspondence between reactions and rate constants, may both be relaxed. All of what follows is applicable to essentially arbitrary v-dimensional hazard functions h(X_t, c).

The central problem considered in this paper is that of inference for the stochastic rate constants, c, given some time course data on the system state, X_t. It is therefore most natural to first consider inference for the earlier-mentioned MJP SKM. As demonstrated by Boys et al.
[6], exact Bayesian inference in this setting is theoretically possible. However, the problem appears to be computationally intractable for models of realistic size and complexity, due primarily to the difficulty of efficiently exploring large integer lattice state space trajectories. It turns out to be more tractable (though by no means straightforward) to conduct inference for a continuous state Markov process approximation to the MJP model. Construction of this diffusion approximation, known as the chemical Langevin equation (CLE), is the subject of the next section.

2.1. The diffusion approximation

The diffusion approximation to the MJP can be constructed in a number of more or less formal ways. We will present here an informal intuitive construction, and then provide brief references to more rigorous approaches.

Consider an infinitesimal time interval, (t, t + dt]. Over this time, the reaction hazards will remain constant almost surely. The occurrence of reaction events can therefore be regarded as the occurrence of events of a Poisson process with independent realizations for each reaction type. Therefore, if we write dR_t for the v-vector of the number of reaction events of each type in the time increment, it is clear that the elements are independent of one another and that the ith element is a Po(h_i(X_t, c_i) dt) random quantity. From this, we have that E(dR_t) = h(X_t, c) dt and Var(dR_t) = \mathrm{diag}\{h(X_t, c)\} dt. It is therefore clear that

    dR_t = h(X_t, c)\,dt + \mathrm{diag}\{\sqrt{h(X_t, c)}\}\,dW_t

is the Itô SDE that has the same infinitesimal mean and variance as the true MJP (where dW_t is the increment of

a v-dimensional Brownian motion). Now since dX_t = S dR_t, we obtain

    dX_t = S\,h(X_t, c)\,dt + \sqrt{S\,\mathrm{diag}\{h(X_t, c)\}\,S'}\,dW_t,    (2.1)

where now X_t and W_t are both u-vectors. Equation (2.1) is the SDE most commonly referred to as the CLE, and represents the diffusion process that most closely matches the dynamics of the associated MJP; it can be shown to approximate the SKM increasingly well in high concentration scenarios [8]. In particular, while it relaxes the assumption of discrete states, it keeps all of the stochasticity associated with the discreteness of state in its noise term. It also preserves many of the important structural properties of the MJP. For example, equation (2.1) has the same conservation laws as the original SKM. However, it should be noted that the approximation breaks down in low-concentration scenarios, and therefore should not be expected to work well for models involving species with very low copy-number. This is quite typical for many SKMs; yet the approximation often turns out to be adequate for inferential purposes in practice.

More formal approaches to the construction of the CLE usually revolve around the Kolmogorov forward equations for the Markov processes. The Kolmogorov forward equation for the MJP is usually referred to in this context as the chemical master equation. A second-order Taylor approximation to this system of differential equations can be constructed, and compared with the corresponding forward equation for an SDE model (known in this context as the Fokker–Planck equation). Matching the second-order approximation to the Fokker–Planck equation leads to the same CLE (2.1), as presented earlier. See Gillespie [8,9] for further details.

3. BAYESIAN INFERENCE

Suppose that the MJP X = {X_t | 1 ≤ t ≤ T} is not observed directly, but observations (on a regular grid) y = {y_t | t = 1, 2, ...
, T} are available and assumed conditionally independent (given X), with conditional probability distribution obtained via the observation equation

    Y_t = F' X_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \Sigma).    (3.1)

Here, we take Y_t to be a length-p vector, F is a constant matrix of dimension u × p and \varepsilon_t is a length-p Gaussian random vector. This flexible setup allows for observing only a subset of components of X_t, and taking F to be the u × u identity matrix corresponds to the case of observing all components of X_t (subject to error). Note that in the case of unknown measurement error variance, the parameter vector c can be augmented to include the elements of \Sigma. Bayesian inference may then proceed through the posterior density

    p(c, x \mid y) \propto p(c)\,p(x \mid c)\,\prod_{t=1}^{T} p(y_t \mid x_t, c),    (3.2)

where p(c) is the prior density ascribed to c, p(x | c) is the probability of the MJP and p(y_t | x_t, c) is the observation density constructed from equation (3.1), which we let depend explicitly on c for the purposes of generality. Since the posterior in equation (3.2) will typically be unavailable in closed form, samples are usually generated from p(c, x | y) through a suitable MCMC scheme.

3.1. Likelihood-free/plug-and-play methods

One of the problems with standard approaches to using MCMC for inference in realistic data-poor scenarios is the difficulty of developing algorithms to explore a huge (often discrete) state space with a complex likelihood structure that makes conditional simulation difficult. Such problems arise frequently, and in recent years, interest has increasingly turned to methods that avoid some of the complexity of the problem by exploiting the fact that we are easily able to forward-simulate realizations of the process of interest. Methods such as likelihood-free MCMC (LF-MCMC) [10] and approximate Bayesian computation [11] are now commonly used to tackle Bayesian inference problems which would be extremely difficult to solve otherwise.
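All of these schemes rely on being able to forward-simulate realizations of the process. A minimal sketch combining Gillespie's direct method (§2) with the Gaussian observation equation (3.1), assuming the Lotka–Volterra hazards and purely illustrative rate constants and noise level:

```python
import numpy as np

rng = np.random.default_rng(42)

S = np.array([[1, -1, 0],
              [0,  1, -1]])              # Lotka-Volterra stoichiometry (u=2, v=3)

def hazards(x, c):
    # mass-action hazards h_i(X_t, c_i) for the three LV reactions
    return np.array([c[0] * x[0], c[1] * x[0] * x[1], c[2] * x[1]])

def gillespie(x0, c, t_end):
    """Gillespie's direct method: exact forward simulation of the MJP."""
    t, x = 0.0, np.array(x0, dtype=float)
    while True:
        h = hazards(x, c)
        h0 = h.sum()
        if h0 <= 0:
            return x                      # absorbing state reached
        t += rng.exponential(1.0 / h0)    # waiting time ~ Exp(h0)
        if t > t_end:
            return x                      # next event falls beyond the interval
        i = rng.choice(len(h), p=h / h0)  # reaction type i w.p. h_i / h0
        x += S[:, i]                      # add ith column of S

def observe(x, F, sigma):
    # observation equation (3.1): y_t = F' x_t + eps_t, eps_t ~ N(0, sigma^2 I)
    return F.T @ x + rng.normal(0.0, sigma, size=F.shape[1])

c = np.array([0.5, 0.0025, 0.3])          # illustrative rate constants
F = np.eye(2)                             # observe both species, with error
x = np.array([100.0, 100.0])
data = []
for t in range(1, 11):                    # regular observation grid t = 1..10
    x = gillespie(x, c, 1.0)              # simulate one inter-observation interval
    data.append(observe(x, F, sigma=2.0))
```

Proposing latent paths with `gillespie` and scoring them only through the observation density is exactly what makes the approaches below "likelihood-free": the MJP transition kernel is never evaluated.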
Although not the main focus of this paper, it is of course possible to develop similar computationally intensive algorithms from a non-Bayesian standpoint, where such likelihood-free approaches are sometimes termed 'plug-and-play'; see Ionides et al. [12], He et al. [13] and Wood [14] for further details.

A likelihood-free approach to this problem can be constructed as follows. Suppose that interest lies in the posterior distribution p(c, x | y). A Metropolis–Hastings (MH) scheme can be constructed by proposing a joint update for c and x as follows. Supposing that the current state of the Markov chain is (c, x), first sample a proposed new value for c, c*, by sampling from some (essentially) arbitrary proposal distribution q(c* | c). Then, conditional on this newly proposed value, sample a proposed new sample path, x*, by forwards simulation from the model p(x* | c*). Together, the newly proposed pair (c*, x*) is accepted with probability min{1, A}, where

    A = \frac{p(c^*)\,q(c \mid c^*)\,p(y \mid x^*, c^*)}{p(c)\,q(c^* \mid c)\,p(y \mid x, c)}.

Crucially, the potentially problematic likelihood term, p(x | c), does not occur in the acceptance probability, owing to the fact that a sample from it was used in the construction of the proposal. Note that choosing an independence proposal of the form q(c* | c) = p(c*) leads to the simpler acceptance ratio

    A = \frac{p(y \mid x^*, c^*)}{p(y \mid x, c)}.

This 'canonical' choice of proposal will not be 'optimal', but lends itself to more elaborate schemes, as we will consider shortly.

3.2. Sequential likelihood-free Markov chain Monte Carlo

The basic LF-MCMC scheme discussed earlier might perform reasonably well, provided that y is not high-dimensional, and there is sufficient 'noise' in the measurement process to make the probability of

acceptance non-negligible. However, in practice y is often of sufficiently large dimension that the overall acceptance rate of the scheme is intolerably low. In this case, it is natural to try and 'bridge' between the prior and the posterior with a sequence of intermediate distributions. There are several ways to do this, but here it is most natural to exploit the Markovian nature of the process and consider the sequence of posterior distributions obtained as each additional time point is observed.

Define the data up to time t as y_t = {y_1, ..., y_t}. Also, define sample paths x_t ≡ {x_s | t − 1 < s ≤ t}, t = 1, 2, ..., so that x = {x_1, x_2, ...}. The posterior at time t can then be computed inductively as follows.

1. Assume at time t that we have a (large) sample from p(c, x_t | y_t) (for t = 0, initialize with a sample from the prior, p(c)p(x_0 | c)).
2. Run an MCMC algorithm that constructs a proposal in two stages:
   (a) First sample (c*, x_t*) ~ p(c, x_t | y_t) by picking at random and perturbing c* slightly (sampling from a kernel density estimate of the distribution).
   (b) Next sample x*_{t+1} by forward simulation from p(x_{t+1} | c*, x_t*).
   (c) Accept/reject (c*, x*_{t+1}) with probability min{1, A}, where

           A = \frac{p(y_{t+1} \mid x^*_{t+1}, c^*)}{p(y_{t+1} \mid x_{t+1}, c)}.

3. Output the sample from p(c, x_{t+1} | y_{t+1}), put t := t + 1, and return to step 2.

Consequently, for each observation y_t, an MCMC algorithm is run that takes as input the current posterior distribution prior to observation of y_t and outputs the posterior distribution given all observations up to y_t. As y_t is typically low-dimensional, this strategy usually leads to good acceptance rates.

This algorithm has been applied successfully to biochemical network inference [15], but suffers from two different problems as the problem size increases. First, as
First, asthe number of parameters (the dimension of c) increases,the algorithm suffers from the usual ‘curse of dimensionality’ as it becomes increasingly difficult to effectively coverthe parameter space with a Monte Carlo sample. Second,as the number of time points increases, the methodsuffers from the ‘particle degeneracy’ problem that iswell known to affect sequential Monte Carlo (SMC) algorithms targeting fixed model parameters [16]. The jitteringapproach used in step 2(a) can alleviate the problemsomewhat [17] but does not completely overcome it.Both of these problems can be addressed by using particleMCMC methods [18], and by the particle marginal Metropolis–Hastings (PMMH) algorithm, in particular.4. PARTICLE MARGINAL METROPOLIS –HASTINGSSuppose that interest is in the marginal parameterposteriorðpðcjyÞ ¼ pðc; xjyÞdx;Interface Focuswhere p(c,xjy) i

