Application Of Multivariate Density Estimation (MDE) To .


Application of Multivariate Density Estimation (MDE) to Facies Simulation with Transition Probability Matrices (TPM)

Yupeng Li, Sahyun Hong and Clayton V. Deutsch

In this study, the process of constructing a multivariate probability distribution from specified marginal distributions is addressed. The univariate and bivariate marginal distributions can be obtained from transition probability matrices built from vertical profiles of well data. The 3-D conditioning data configurations are transformed to vertical transition probability data configurations during the multivariate probability distribution estimation, which makes it possible to apply the vertical transition probability in 3-D space.

Introduction

In geostatistics, it is very common to build models of a categorical variable that represents facies or rock types. Although deterministic models based on professionals' experience are sometimes used in practice, geostatistical simulation realizations are increasingly used for uncertainty quantification. Stochastic modeling algorithms such as sequential indicator simulation (SIS) are widely used to construct these multiple realizations.

Based on the sequential simulation algorithm, SIS is commonly used for categorical variables. The classical SIS algorithm is fast and straightforward because modeling the conditional probability distribution at each unsampled location requires the solution of only a single (co)kriging system for each category. Because of the complex features of geological variables, some non-linear geostatistical algorithms have been developed, including multiple point geostatistics (Strebelle, 2002). Both traditional geostatistics (SIS) and multiple point geostatistics are based on Bayes' law. A multivariate joint probability is obtained (directly or indirectly), and the conditional probability for the unsampled location given the available data can then be obtained using Bayes' law. Sequential simulation decomposes the multivariate joint probability by recursive application of Bayes' law, while multiple point statistics set up the multivariate joint probability by scanning a training image (Deutsch, 2002; Caers and Zhang, 2004).

In this paper, a new multivariate probability distribution estimation scheme based on the facies transition probability matrix (TPM) is proposed. The TPM provides the bivariate and univariate marginal distributions for any combination of two categories between any two locations at any lag distance. From this information, specific multivariate probability distributions that satisfy the bivariate and univariate constraints built from the n conditioning data can be inferred. The estimation process is iterative and is based on the transition probability calculated directly from the vertical profile of the conditioning data and transformed to the 3-D data configuration.

The first section of this paper introduces transition probability and defines the TPM. Instead of indicator variograms, transition probabilities are used as the tools to characterize spatial relationships. In the second section, the construction of a transition probability matrix is introduced. In the third section, the mathematical relationships between the multivariate probability distribution and the bivariate marginal distributions are illustrated. Based on that, the multivariate probability distribution estimation (MDE) process is presented. The MDE process presented in this section is a non-linear approach compared with kriging. The fourth section explains the vertical-horizontal transform approach, which facilitates 3-D estimation and simulation with this new approach. The final section presents some preliminary results of the MDE approach and future work on this new method.

TPM definition

In stochastic theory, a Markov chain is a sequence of random variables $(X_1, X_2, \ldots)$ with the Markov property, that is, given the present state, the future and past states are independent, which can be written as:

$$P(X_{n+1} = k_{n+1} \mid X_n = k_n, \ldots, X_1 = k_1) = P(X_{n+1} = k_{n+1} \mid X_n = k_n).$$

The possible values of $X_t$ form a countable set $S = \{k_i;\ i = 1, \ldots, m\}$ called the state space of the chain, where the $k_i$ denote mutually exclusive, exhaustively defined states of a stochastic process. If a sequence of states has the Markov property, then every future state is conditionally independent of every prior state given the present state. The changes of state are called transitions. The conditional probability of a state given the previous state, $P(X_{n+1} = k_{n+1} \mid X_n = k_n)$, is called the transition probability. The complete description of the transition probabilities of a finite state space $S = \{k_i;\ i = 1, \ldots, m\}$ going from state $k_i$ to state $k_j$ in $n$ steps forms a matrix $T(x, h = n)$ whose element in the $i$th row and $j$th column is

$$t^{(n)}_{i,j} = t^{h=n}_{i,j} = P(X_n = k_j \mid X_0 = k_i).$$

This matrix is called the transition probability matrix (TPM).

If the lithological type at location $X_t$ in the vertical profile of a well is assumed to depend only upon the lithological type at the preceding location $X_{t-1}$, the profile is recognized as a Markov chain process. Much work has been done on the use of the Markov chain transition matrix in geology and geostatistics. Most often, the Markov chain transition matrix is used in vertical profile interpretation, sedimentary evolution analysis and simulation of stratigraphic sequences. Krumbein modeled transgressive-regressive strand-line deposits using a time-discrete transition matrix to control the lateral shifting (Krumbein, 1968). The transition matrix has also been used in soil science to describe the spatial order of different soil cases and the vertical spatial change of texture (Li, 1997). Carle and Fogg integrated transition probabilities into the framework of indicator geostatistics for litho-facies simulation (Carle and Fogg, 1996; Carle and Fogg, 1997; Weissmann and Fogg, 1999; Carle, 2000). Elfeki used a coupled Markov chain to characterize heterogeneity as a non-Gaussian field through multi-dimensional transition probabilities (Elfeki and Dekking, 2001; Elfeki, 2006).

TPM construction

The vertical profile is mainly structured as a discrete-state Markov chain in one of two ways. In the first, observations are spaced equally along a vertical profile to yield transition probability matrices. The transitions of the equally spaced rock types at discrete points are counted. Because the same rock type may be observed at successive points, the transition matrix that gives the probability of going from one rock type to another generally has nonzero elements on the main diagonal. The second approach considers only the succession of distinct rock types, and because each transition is to a different rock type within the system, the diagonal elements are all zero. In this approach the whole successive thickness of one rock type, which is one state of the Markov chain, may be recognized from a log curve or from outcrop. It is also called an embedded Markov chain.

In this study, the first approach is used to build the TPM. Suppose the whole vertical profile H is divided into n segments of equal length. The state space of this Markov chain is the facies category set $\{k_i;\ k_i = 1, 2, \ldots, K\}$. Each segment then defines a state of the Markov chain, and the transition probabilities of the whole profile form a TPM.

For example, the total observed number of times state $K_i$ is followed by state $K_j$ at the observation interval $h = n$ is denoted as $n_{i,j}$, while $n_i$ is the counted total number of $K_i$. When the interval is 1, the transition probability from state $K_i$ to state $K_j$ will be:

$$t^{h=1}_{i,j} = \frac{n_{i,j}}{n_i}$$

The probabilities of a transition from $K_1$ to $K_1, K_2, K_3, \ldots, K_m$ are given by $t^{h=1}_{1,j}$ $(j = 1, 2, \ldots, m)$ in the first row, and so on, and the matrix is denoted as:

$$T^{h=1} = \begin{bmatrix} t^{h=1}_{1,1} & \cdots & t^{h=1}_{1,j} & \cdots & t^{h=1}_{1,m} \\ t^{h=1}_{2,1} & \cdots & t^{h=1}_{2,j} & \cdots & t^{h=1}_{2,m} \\ \vdots & & \vdots & & \vdots \\ t^{h=1}_{m,1} & \cdots & t^{h=1}_{m,j} & \cdots & t^{h=1}_{m,m} \end{bmatrix}$$

Generally, from the counting approach, the elements of the TPM at lag h will be:

$$t_{i,j}(x, h) = \frac{n_{i,j}}{n_i} = \frac{n_{i,j} / \sum_{i,j} n_{i,j}}{n_i / \sum_{i,j} n_{i,j}} = \frac{p_{i,j}(x, h)}{p_i(x, h)}$$

Where:
$t_{i,j}(x, h)$ is the transition probability from state $K_i$ to state $K_j$;
$n_{i,j}$ is the number of times state $K_i$ is followed by $K_j$ after h steps;
$n_i$ is the row sum of the $n_{i,j}$, $n_i = \sum_j n_{i,j}$;
$\sum_{i,j} n_{i,j}$ is the sum of all tally matrix entries;
$p_{i,j}(x, h)$ is the joint probability of the two states $K_i$ and $K_j$;
$p_i(x, h)$ is the univariate marginal probability of state $K_i$.

The TPM has to fulfill specific properties:
(1) Its elements are non-negative and no greater than one, $0 \le t_{i,j} \le 1$;
(2) The elements of each row sum to one, $\sum_{j=1}^{m} t_{i,j}(x, h) = 1$.

Generally,

$$t_{i,j}(x, h) = p\{K_j \text{ occurs at } (x + h) \mid K_i \text{ occurs at } x\} = \frac{p\{K_j \text{ occurs at } (x + h) \text{ AND } K_i \text{ occurs at } x\}}{p\{K_i \text{ occurs at } x\}}$$

When the process is stationary or homogeneous, the transition probability is independent of position x, and the transition probability $t_{i,j}(h)$ and the bivariate joint probability $p(h; k_i, k_j)$ depend only on the lag vector h. The bivariate joint probability

$$p(h; k_i, k_j), \quad \forall h;\ k_i, k_j;\ i, j = 1, \ldots, K$$

can then be calculated from the transition probability as:

$$p(h; k_i, k_j) = p\{k_i \text{ occurs at } x \text{ AND } k_j \text{ occurs at } (x + h)\} = t_{i,j}(h) \, p_i(h)$$

The bivariate probability matrix for a particular lag h is therefore fully defined by the corresponding transition probability matrix. The sum of all $K^2$ bivariate joint probabilities should be 1. A strong assumption of symmetry would entail that $p(h; k_i, k_j) = p(h; k_j, k_i)$.

TPM curves

For a stationary process, the transition probability depends on the lag between observation positions. The transition probability $t_{i,j}(h)$ traces a curve as the lag h increases from zero. When $i = j$, the state changes to itself and the curve is called a direct-transition; when $i \ne j$, it is called a cross-transition, which reflects the cross-correlation or inter-state relationship between different states. For example, suppose there are 3 rock types in a research well profile. The transition probability curves of rock type 1 changing to rock types 1, 2 and 3 are shown in Figure 1.
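The counting approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the zero-based facies codes and the short example profile are all assumptions made for the example. It tallies the pairs separated by a given lag, converts the tally to a TPM and to a bivariate joint probability matrix, and lets the relation $p_{i,j} = t_{i,j} \, p_i$ be checked numerically.

```python
def tally_matrix(profile, n_states, lag=1):
    """Count pairs (state at x, state at x + lag) along one vertical profile."""
    counts = [[0] * n_states for _ in range(n_states)]
    for start, end in zip(profile, profile[lag:]):
        counts[start][end] += 1
    return counts


def transition_matrix(counts):
    """t_ij = n_ij / n_i; rows of states never seen as a start stay all-zero."""
    matrix = []
    for row in counts:
        n_i = sum(row)
        matrix.append([n_ij / n_i if n_i else 0.0 for n_ij in row])
    return matrix


def joint_matrix(counts):
    """p_ij = n_ij / (sum of all tally entries)."""
    total = sum(sum(row) for row in counts)
    return [[n_ij / total for n_ij in row] for row in counts]


# Hypothetical equally spaced profile with three facies coded 0, 1, 2.
profile = [0, 0, 1, 1, 1, 2, 2, 0, 0, 1, 2, 2]
counts = tally_matrix(profile, n_states=3, lag=1)
T = transition_matrix(counts)      # rows sum to one
P = joint_matrix(counts)           # all entries sum to one
p = [sum(row) for row in P]        # univariate marginal of the starting state
```

Calling `tally_matrix` with increasing values of `lag` and plotting one entry of the resulting TPM against the lag gives one transition probability curve of the kind discussed in this section.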

Figure 1: Transition probability matrix curves. Red: direct-transition of rock type 1 to 1; Green: cross-transition of rock type 1 to 2; Blue: cross-transition of rock type 1 to 3.

As the calculation distance increases, the transition probability of rock type 1 changing to itself decreases, while the probabilities of changing to rock types 2 and 3 increase. The same curves can be calculated for the other rock types, which together compose the whole set of transition probability matrix curves shown in Figure 2.

Figure 2: Transition probability matrix curves.

From the plots in Figure 2, given any known rock type and the distance interval between two locations, the transition probability can be read off. These curves reveal some geological and geostatistical information: (1) As the distance increases, the transition curves flatten as they reach their sills. At that point the TPM reflects the global univariate probability, which can be identified as the percentage of the rock type within the whole section. (2) The TPM also reflects the corresponding bivariate distribution for a particular lag h; the univariate probabilities at this lag are embedded in this bivariate distribution. (3) The transition probability matrix curves also reflect spatial juxtaposition information. In Figure 1, as the curve of rock type 1 decreases toward its sill, the curves of the other two rock types increase toward their sills with different probabilities. Rock type 2 has a higher probability of occurring than rock type 3 as facies 1 decreases.

Multivariate probability distribution Estimation (MDE) Based on TPM

One of the important problems in geostatistics is finding a proper means to describe spatial dependence. Bivariate probability matrices inferred from the TPM can be used as an alternative to the variogram to characterize spatial dependence. Using this full matrix of bivariate joint probabilities in multivariate probability distribution estimation is the purpose of this research. Inference from well data leads to a complete specification of the bivariate joint probability matrices for all distance and direction vectors in the vertical profile:

$$p(h; k_i, k_j), \quad \forall h;\ i, j = 1, \ldots, K$$

Where:
$p(h; k_i, k_j)$ is the bivariate joint probability of rock types $k_i$ and $k_j$;
$k_i$ and $k_j$ are two rock types at two different locations;
h is the observed interval between the two locations.

After the bivariate joint probabilities are inferred, the main concern is the construction of a multivariate joint distribution given these sets of bivariate joint probabilities. Consider the following schematic situation, where the data-data vectors and data-unsampled vectors are all expressed as bivariate joint probabilities. Of course, the data could be distributed in 3-D space.

Figure 3: Schematic horizontal transition probability data configuration.

Assume there are n locations $(u_1, \ldots, u_n)$, each of which can take one of K categories. The n random variables form a multivariate probability distribution function (pdf) $P_{MV}(u_1, u_2, \ldots, u_n)$, which is defined below and represents the probability of a specific configuration of categories $k_i$ $(i = 1, \ldots, K)$ existing at locations $u_1, u_2, \ldots, u_n$:

$$P_{MV}(u_1, u_2, \ldots, u_n) = \text{prob}(u_1 = k_1, u_2 = k_2, \ldots, u_n = k_n); \quad k_i = 1, 2, \ldots, K$$

In this distribution, there are $K^n$ possible values in total. Each of these values occurs with a given frequency and is identified by an index $I_{mv}$:

$$I_{mv} = 1 + \sum_{n=1}^{N} (u_n - 1) \, K^{n-1}$$

Where: $I_{mv} = 1, 2, \ldots, K^n$; $u_n$ is the code at the $n$th location of the data configuration that identifies its category; K is the total number of categories.

After obtaining this multivariate probability distribution, it is straightforward to calculate the conditional probability using Bayes' law as below:

$$p(u_0 \mid u_1, u_2, \ldots, u_n) = \frac{p_{MV}(u_0, u_1, \ldots, u_n)}{p_{MV}(u_1, \ldots, u_n)} = \frac{P_{MV}(u_0, u_j = k_j)}{\sum_{k_i = 1}^{K} P_{MV}(u_0 = k_i, u_j = k_j)}$$

where: $u_j \in \{u_0, u_1, \ldots, u_n\}$; $k_i, k_j = 1, \ldots, K$; $i, j = 1, \ldots, n$.

The denominator is the sum of the several index entries that are specified by the category values of the conditioning data; the numerator is the multivariate probability of the specific category. Our challenge is to estimate this multivariate distribution $p_{MV}(u_1, \ldots, u_n)$ based on the full set of bivariate joint distributions.

Constraints of the multivariate distribution

As stated previously, from the transition probability matrices we can get the bivariate joint probability $p_{aim}(h; k', k'')$ of any two facies $k'$ and $k''$. Conversely, if the multivariate probability distribution is known, the bivariate joint probability can be calculated as:

$$p_{cal}(u_{j_1}, u_{j_2}, k', k'') = \underset{k_{j_1} = k',\ k_{j_2} = k''}{\sum_{k_1 = 1}^{K} \sum_{k_2 = 1}^{K} \cdots \sum_{k_n = 1}^{K}} P_{MV}(u_1, \ldots, u_n) = P_{k', k''}$$

Where: $k_i, k', k'' = 1, 2, \ldots, K$; $i = 1, \ldots, n$; $j_1, j_2 = 1, \ldots, n$; $j_1 \ne j_2$.

The bivariate joint probability $p_{cal}(u_{j_1}, u_{j_2}, k', k'')$ calculated from the multivariate probability distribution should be equal to that obtained from the transition probability matrix, $p_{aim}(h; k', k'')$. This imposes constraints on the multivariate probability distribution.

The order relation of the multivariate distribution constitutes another constraint on this multivariate probability distribution:

$$\sum_{i = 1}^{K^n} p^{i}_{MV}(u_1, \ldots, u_n) = 1; \quad u_1, \ldots, u_n = 1, \ldots, K$$

Based on those two constraints, an iterative approach is adopted to modify an initial multivariate probability distribution until it satisfies both constraints.
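The index, the bivariate constraint and the Bayes step above can be illustrated with a small Python sketch. Everything here is hypothetical scaffolding for illustration: the function names, the convention that tuple position 0 holds the unsampled location, and the numeric targets are assumptions. In particular, the update rule used in `fit_to_bivariates` (multiplying each entry by the ratio of the target to the calculated bivariate probability) is one plausible reading of the iterative modification; it has the same form as iterative proportional fitting and is not claimed to be the authors' exact procedure.

```python
from itertools import product


def multivariate_index(codes, n_categories):
    """I_mv = 1 + sum_n (u_n - 1) * K**(n - 1), with 1-based category codes."""
    return 1 + sum((u - 1) * n_categories ** pos for pos, u in enumerate(codes))


def independent_joint(proportions, n_locations):
    """Initial table: product of univariate proportions (independence)."""
    K = len(proportions)
    table = {}
    for codes in product(range(1, K + 1), repeat=n_locations):
        prob = 1.0
        for u in codes:
            prob *= proportions[u - 1]
        table[codes] = prob
    return table


def bivariate_marginal(table, j1, j2, K):
    """p_cal: sum the joint table over every location except j1 and j2."""
    marginal = [[0.0] * K for _ in range(K)]
    for codes, prob in table.items():
        marginal[codes[j1] - 1][codes[j2] - 1] += prob
    return marginal


def fit_to_bivariates(table, targets, K, n_iter=50):
    """Assumed update: rescale entries by p_aim / p_cal, one location pair at a time."""
    table = dict(table)
    for _ in range(n_iter):
        for (j1, j2), aim in targets.items():
            cal = bivariate_marginal(table, j1, j2, K)
            for codes in table:
                a, b = codes[j1] - 1, codes[j2] - 1
                if cal[a][b] > 0.0:
                    table[codes] *= aim[a][b] / cal[a][b]
    return table


def conditional(table, known, K):
    """P(u_0 = k | data); `known` maps tuple position (>= 1) to its category code."""
    weights = []
    for k0 in range(1, K + 1):
        weights.append(sum(prob for codes, prob in table.items()
                           if codes[0] == k0
                           and all(codes[j] == k for j, k in known.items())))
    total = sum(weights)  # assumed non-zero for the conditioning data supplied
    return [w / total for w in weights]


# Hypothetical case: 2 facies, one unsampled location (position 0), two data.
K = 2
proportions = [0.6, 0.4]
aim = [[0.45, 0.15], [0.15, 0.25]]          # made-up target p_aim, sums to 1
targets = {(0, 1): aim, (0, 2): aim}
initial = independent_joint(proportions, n_locations=3)
fitted = fit_to_bivariates(initial, targets, K)
posterior = conditional(fitted, known={1: 1, 2: 1}, K=K)
```

The dictionary keyed by category tuples keeps the sketch readable for small K and n; because the table holds $K^n$ entries, a real application would need a more compact representation or a small conditioning neighbourhood.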
The steps are:

Step 1: The initial multivariate probability values come from the assumption that the facies at each location are independent. Under the independence assumption, the multivariate distribution constrained only by the univariate probabilities can be written as follows:

$$p_{MV}(u_1 = k_1, \ldots, u_n = k_n) = \prod_{j = 1}^{n} p_{k_j}; \quad k_1, \ldots, k_n = 1, \ldots, K$$

Step 2: The initial distribution is modified by the transition probability constraints, which are expressed as a bivariate joint probability between every two locations. After modification by the bivariate joint probabilities, the new multivariate probability distribution is:

$$p^{*}_{MV}(u_0 = k_0, u_1 = k_1, \ldots, u_n = k_n)$$

Step 3: Return to the bivariate joint probability constraints using $p^{*}_{MV}$ as the new initial value for the distribution, and repeat until there are no notable changes in the conditional probability calculated from the multivariate probability distribution.

The iterative modifying process is shown in Figure 4.

Figure 4: The modifying process of MDE. The flowchart proceeds as follows: (1) Set the initial multivariate probability based on the independence assumption and the univariate marginal proportions, $P_{mv} = \prod_{j=1}^{n} p_{k_j}$. (2) Calculate the bivariate joint probability $p_{cal}(u_{j_1}, u_{j_2}, k', k'')$ from the initial multivariate probability. (3) Calculate the modifying factors for the multivariate probability from the calculated and aimed bivariate joint probabilities, $F_{biv} = p_{aim}(h; k', k'') / p_{cal}(h; k', k'')$. (4) Modify the multivariate probability and calculate the bivariate joint probability from the new $P^{*}_{mv}$. (5) If there is a mismatch between the calculated and aimed bivariate joint probabilities, return to (3); if not, stop and output the modified multivariate probability.

Transformation of TPM to any Spatial Direction

Usually, the data along the vertical profile of a well or outcrop are dense, which makes the transition probability matrices easier to build in that direction. For estimation and simulation, however, the grid needs to be simulated in 3-D space.

Geological research has clearly shown that sedimentary facies exhibit vertical sequence superposition and that the vertical progression of facies reflects lateral facies changes. Sedimentary environments that started out side-by-side will end up overlapping one another over time because of transgressions and regressions. The result is that a vertical sequence of facies mirrors the original lateral distribution of sedimentary environments ([…]les.html). This is called Walther's law, named after a German geologist. Later, more advanced theories such as sequence stratigraphy were used to correlate the vertical profile with the lateral shifts. They all provide clues to explain, from the vertical facies stacking pattern, how the facies and surrounding deposits change and shift laterally as the earth's surface undergoes changes. As an example, in a single transgression cycle (a relative rise in sea level resulting in deposition of marine strata over terrestrial strata), the vertical facies stacking pattern is related to the horizontal pattern as shown in Figure 5.

Figure 5: Walther's Law in a single transgression cycle, which shows that facies vary in an analogous manner both horizontally and vertically.

After the transition probabilities are calculated in the vertical profile, a horizontal-to-vertical transformation ratio based on Walther's Law can be used to provide the transition probability information for any lateral direction. Now consider a basin margin as shown in Figure 6. The direction from location a to location b is the direction toward the shoreface, where the juxtaposition becomes more sandy. The direction from location a to location f leaves the shoreface and heads offshore. The juxtaposition in this direction becomes more muddy. The shoreline is nearly parallel to the direction from a to d, and we believe those locations are all in a single sedimentary sequence cycle.

Distance vector a-c is any distance vector, which can be decomposed into three vectors: along the direction of a-b, along the direction of a-d, and along the vertical direction. Given the anisotropy ratios along those three directions, the effective distance of a-c in the vertical transition probability matrix will be:

$$h_{effect}(c) = \sqrt{\left(\frac{h_{main}}{a_{main}}\right)^2 + \left(\frac{h_{min}}{a_{min}}\right)^2 + \left(\frac{h_{vert}}{a_{vert}}\right)^2}$$

Where:
$a_{main}$: anisotropy ratio along the main difference direction;
Distance by which the unsampled location departs from the sample

