Using Bayesian Networks To Model Watershed Management .


© IWA Publishing 2005, Journal of Hydroinformatics, 07.4, 2005, p. 267
D. P. Ames et al. Bayesian network modeling of watershed management decisions

Using Bayesian networks to model watershed management decisions: an East Canyon Creek case study

Daniel P. Ames, Bethany T. Neilson, David K. Stevens and Upmanu Lall

Daniel P. Ames (corresponding author)
Department of Geosciences, Idaho State University, Pocatello, Idaho 83209-8150, USA
Tel: 1 208 282 7851; E-mail: amesdani@isu.edu

Bethany T. Neilson, David K. Stevens
Utah Water Research Laboratory, Utah State University, Logan, UT, 84322-8200, USA

Upmanu Lall
Department of Earth and Environmental Engineering, 918 Seeley Mudd Building, Columbia University, 500 West 120th St, New York, NY, 10027, USA

ABSTRACT

An approach to developing and using Bayesian networks to model watershed management decisions is presented with a case study application to phosphorus management in the East Canyon watershed in Northern Utah, USA. The Bayesian network analysis includes a graphical model of the key variables in the system and conditional and marginal probability distributions derived from a variety of data and information sources. The resulting model is used to 1) estimate the probability of meeting legal water quality requirements for phosphorus in East Canyon Creek under several management scenarios and 2) estimate the probability of increased recreational use of East Canyon Reservoir and subsequent revenue under these scenarios.

Key words: Bayesian networks, water quality modeling, watershed decision support

INTRODUCTION

Bayesian networks

A Bayesian network (BN) is a directed acyclic graph that graphically shows the causal structure of variables in a problem, and uses conditional probability distributions to define relationships between variables (see Pearl 1988, 1999; Jensen 1996). A simple 3-node BN is shown in Figure 1, including the variables A, B and C. The graph structure indicates that A and B are independent of each other and that C is conditionally dependent on both A and B. To transform this graph into a BN, the marginal probability distributions P(A) and P(B) as well as the conditional probability distribution P(C|A,B) (read as "the probability of C given A and B") need to be estimated.

Figure 1. Simple 3-node BN showing conditional dependence of C on A and B. A and B are independent of each other.

To simplify estimating and using these quantities in the BN, the variables are discretized into distinct states, allowing one to characterize the continuous probability distributions through a discretized conditional probability table (CPT). Discretization of variables is not a requirement of BNs in general (see Pearl 1988) but is a convention used here to ease computation, elicitation of probabilities from experts and communication of results to stakeholders. The disadvantage of discretization is in the potential loss of information; however, it can be particularly useful in the case of variables with a distinct breakpoint significant to management.

For example, if A and B are each discretized into two states, then the BN model would require estimates of the marginal probabilities P(A = a1), P(A = a2), P(B = b1) and P(B = b2). To complete the BN, and assuming two states for C, the CPT representing the following conditional probabilities would also be required:

\[
\begin{array}{ll}
P(C = c_1 \mid A = a_1, B = b_1) & P(C = c_2 \mid A = a_1, B = b_1) \\
P(C = c_1 \mid A = a_1, B = b_2) & P(C = c_2 \mid A = a_1, B = b_2) \\
P(C = c_1 \mid A = a_2, B = b_1) & P(C = c_2 \mid A = a_2, B = b_1) \\
P(C = c_1 \mid A = a_2, B = b_2) & P(C = c_2 \mid A = a_2, B = b_2).
\end{array}
\]

Once these probabilities are estimated, then the BN is complete and propagation of information through the BN can be used to view how decisions and observed conditions (called "evidence") at one node affect the probable conditions at other nodes. Downward propagation of evidence through the BN is based on the law of total probability:

\[
P(c_1) = P(c_1 \mid a_1, b_1)\,P(a_1, b_1) + P(c_1 \mid a_1, b_2)\,P(a_1, b_2) + P(c_1 \mid a_2, b_1)\,P(a_2, b_1) + P(c_1 \mid a_2, b_2)\,P(a_2, b_2).
\]

Upward propagation of evidence through the BN is based on Bayes' Rule:

\[
P(a_1, b_1 \mid c_1) = \frac{P(c_1 \mid a_1, b_1)\,P(a_1, b_1)}{P(c_1)}.
\]

Since their inception, BNs have been used extensively in medicine and computer science (Heckerman 1997). In recent years, BNs have been applied in environmental management studies, including the Neuse Estuary Bayesian ecological response network (Borsuk & Reckhow 2000), Baltic salmon management (Varis & Kuikka 1997), climate change impacts on Finnish watersheds (Kuikka & Varis 1997), the Interior Columbia Basin Ecosystem Management Project (Lee & Bradshaw 1998) and waterbody eutrophication (Haas 1998). As collectively illustrated in these studies, a BN graph structures a problem such that it is visually interpretable by stakeholders and decision-makers while serving as an efficient means for evaluating the probable outcomes of management decisions on selected variables.

Constructing Bayesian networks for watershed management

A generalized approach for developing BNs specifically for watershed management problems is proposed, followed by a case study implementation of the approach. Key elements of the approach are shown in Figure 2 and are described in greater detail as follows.

Figure 2. Generalized approach to developing a BDN for watershed management decision problems. The figure lists three stages:

Problem Definition
1. Identify management endpoints
2. Identify management alternatives
3. Identify critical intermediate and exogenous variables
4. Establish discretization states for variables
5. Plan for use of probabilistic results

Model Inference
1. Observed data
2. Dynamic simulation model
3. Expert elicitation
4. Stakeholder surveys
5. Uninformed equal odds

Model Validation
1. Independent information
2. Sensitivity analysis
3. Adaptive management

Problem definition

The watershed management problem must be formulated as a BN graph, providing an opportunity for stakeholders and decision-makers to produce a first-cut assessment of the important variables, decisions, outcomes and relationships in the problem. Variables in a BN are represented by nodes in the graph. The three types of BN graph nodes include: decision nodes (representing sets of distinct management alternatives), utility nodes (representing costs and other value measures) and state nodes (representing variables that can exist in any of several separate states with a certain probability). The BN graph serves as a reference for later data analysis and information gathering used to refine the graph structure and infer probability distributions. The following steps are proposed for building the BN graph.

Identify management endpoints. Selection of endpoints at the outset helps keep the BN focused only on variables significant to the decision problem under investigation. If this is done with the direct input of stakeholders, it has the effect of bringing different interests together to agree on a set of endpoints for evaluation. Additionally, geographic locations (control points) at which the endpoints will be evaluated must also be selected.
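The downward and upward propagation rules given in the introduction can be illustrated for the 3-node network of Figure 1. The following is a minimal sketch only: all probability values and the function names (`joint_ab`, `prob_c1`, `posterior_ab_given_c1`) are hypothetical and are not taken from the case study. It assumes that A and B are independent, so that P(a, b) = P(a)·P(b).

```python
from itertools import product

# Hypothetical marginal probabilities for the parent nodes A and B
# (illustrative values only, not from the paper).
p_a = {"a1": 0.3, "a2": 0.7}
p_b = {"b1": 0.6, "b2": 0.4}

# Hypothetical CPT entries P(C = c1 | A, B); P(C = c2 | A, B) is the complement.
p_c1_given = {
    ("a1", "b1"): 0.9,
    ("a1", "b2"): 0.5,
    ("a2", "b1"): 0.4,
    ("a2", "b2"): 0.1,
}


def joint_ab(a, b):
    """P(a, b), assuming A and B are independent (no arc joins them)."""
    return p_a[a] * p_b[b]


def prob_c1():
    """Downward propagation: law of total probability over all parent states."""
    return sum(p_c1_given[(a, b)] * joint_ab(a, b) for a, b in product(p_a, p_b))


def posterior_ab_given_c1(a, b):
    """Upward propagation: Bayes' Rule for P(a, b | c1)."""
    return p_c1_given[(a, b)] * joint_ab(a, b) / prob_c1()


print(f"P(c1) = {prob_c1():.3f}")
for a, b in product(p_a, p_b):
    print(f"P({a}, {b} | c1) = {posterior_ab_given_c1(a, b):.3f}")
```

With these illustrative inputs the marginal P(c1) evaluates to 0.418, and the four posterior probabilities sum to one, as they must for a complete set of parent state combinations.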

Identify management alternatives. Decision alternatives may include, but are not limited to, long-term planning options as well as day-to-day management activities. A separate decision node is added to the BN graph for each set of mutually exclusive alternatives.

Identify critical intermediate and exogenous variables. A minimum number of intermediate state nodes should be selected to define the relationship between management options and endpoints while capturing all variables that decision-makers and stakeholders consider important. In a group setting, this process can iterate until all parties involved agree on a single BN graph structure. Exogenous variables that drive the system, but are not managed (e.g. precipitation), are also identified at this stage.

Establish discretization states for variables. Terciles or quartiles of the data can be convenient discretization states when all that is needed is a distinction between "high", "medium" and "low." Alternatively, it may be more meaningful to stakeholders and decision-makers if the discretization is based on values critical to the management problem. For example, a stream segment might have a cold water fishery beneficial use criterion of 22°C and a salmonid spawning beneficial use criterion of 15°C, making these useful breakpoints for three states of temperature. Likewise, a low 7-day average streamflow with a 10-year return period (7Q10) can be a meaningful breakpoint for streamflow.

Identify data sources. It is important to identify data sources at the outset to ensure that all available and relevant information is used in the BN model. This activity may help one to refine the BN model by eliminating graph nodes where no information is available and adding nodes where information is available. This activity will also help identify data gaps where expert judgment may be needed to characterize the relationship between variables.

Plan for use of probabilistic results. For some environmental problems, results may be required or expected to be "true" or "false" (e.g. "the lake is impaired"). However, by definition, results from a BN analysis are probabilistic (e.g. "there is a high probability that the lake is impaired"). Because of this, it is important to establish early on how probabilistic results will be used to address the problem. For example, the plan may be to convert probabilistic results into binary results using some threshold (e.g. if the probability of impairment is over 70% then the lake is reported as impaired). It is also useful to report results in terms of risk (e.g. "under management scenario one, there is a 20% chance that the temperature requirement for salmonid spawning will be exceeded"). Presenting results in this way shifts the burden of assessing risk acceptability from technical analysts to regulators and policy personnel.

Model inference

CPTs that define the probabilistic relationship between variables in the BN can be inferred from a variety of information sources, including observed data, model simulation results and expert judgment. Additionally, economic analyses, stakeholder surveys or expert judgment can be used to estimate cost–benefit utility functions and CPTs where hard data is not available.

Observed data. Observed data can include water quality or streamflow monitoring data, ecological measures, sediment loads, riparian vegetation, recreational use records and other relevant information. Cheng et al. (1997) present an algorithm for inferring both the BN graph structure and CPTs from observed data. When the BN graph structure is known, the following steps can be used to infer CPTs from data:

(1) Simultaneous observations of each variable are tabulated and sorted by parent variable.
(2) Observations are converted into categories (High, Medium, Low or True, False, etc) based on the node discretization defined previously.
(3) For every combination of states of the parent nodes, the number of occurrences of states of the child is counted.
(4) Probabilities are calculated as the number of occurrences of a child state divided by the total number of observations for that combination of parent states.
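The four counting steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names, the paired observations and the choice of a streamflow state as the parent of temperature are all hypothetical. Only the 15°C and 22°C breakpoints come from the text, and treating them as inclusive upper bounds is an assumption.

```python
from collections import Counter, defaultdict


def discretize_temperature(temp_c):
    """Step 2: map a temperature (deg C) to one of three states.

    The 15 and 22 deg C breakpoints come from the text; treating each as an
    inclusive upper bound is an assumption made for this sketch.
    """
    if temp_c <= 15.0:
        return "low"
    if temp_c <= 22.0:
        return "medium"
    return "high"


def estimate_cpt(records):
    """Steps 3 and 4: count child states per parent combination, then normalize.

    records is an iterable of (parent_states_tuple, child_state) pairs.
    """
    counts = defaultdict(Counter)
    for parents, child in records:
        counts[parents][child] += 1
    cpt = {}
    for parents, child_counts in counts.items():
        total = sum(child_counts.values())
        cpt[parents] = {state: n / total for state, n in child_counts.items()}
    return cpt


# Step 1: hypothetical simultaneous observations of (streamflow state, temperature).
observations = [
    ("low", 24.1), ("low", 23.0), ("low", 19.5),
    ("high", 14.2), ("high", 16.8), ("high", 13.9), ("high", 12.5),
]
records = [((flow,), discretize_temperature(t)) for flow, t in observations]

for parents, row in estimate_cpt(records).items():
    print(parents, row)
```

Parent state combinations that never occur in the data simply do not appear in the resulting table; such gaps would have to be filled from another information source.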

Dynamic simulation model. A dynamic simulation model can also be used to estimate BN CPTs (for example, see Borsuk & Reckhow 2000). This is particularly useful in cases where there is little or no observed data available to characterize a particular relationship in the BN. In this way, model results are integrated in a single BN with data and expert judgment used to characterize other relationships. The following steps are used to estimate a CPT using a dynamic simulation model.

(1) Construct and calibrate the simulation model.
(2) Identify model input variables corresponding to parent nodes in the BN and model output variables corresponding to child nodes in the BN.
(3) Run the model using an uncertainty analysis technique such as Monte Carlo simulation, varying the selected input variables and calibration parameters about an appropriate distribution.
(4) Tabulate simulation output with corresponding sets of input variable conditions.
(5) Discretize the input and output data, tabulate the results and use them to generate a conditional probability table using the same method described for observed data.

There are several important issues to consider when using a model to generate simulations for use in a CPT. Uncertainty associated with the formulation of the model will not be explicitly accounted for in the BN. Results generated by the model for conditions outside of the model calibration and validation data range will add to unreliability in the BN. Often, large amounts of data are necessary to accurately calibrate a deterministic model. In this case, it may be more appropriate to generate probability distributions directly from the data, rather than to use a simulation model.

Other sources of information. In cases where data are sparse and no appropriate models are available, CPTs can be inferred from information obtained from experts and stakeholders. For example, expert judgment may be needed to estimate the probability of increased recreational use of a stream reach given improved fish habitat or the probability of degradation of surrounding areas given increased recreation. See Cooke (1991) and Meyer & Booker (1991) for methodologies for eliciting probabilistic information from individuals. When no expert judgment is available for the needed CPT then equal odds are used (e.g. P(a1|B) = 0.50, P(a2|B) = 0.50).

Depending on its number of nodes, a single BN may require estimates of several CPTs. Each of these may be derived using any one of the approaches presented here. For example, in a dam management BN, a CPT for streamflow given different dam release plans could be estimated from observed data; a CPT for flooding given different states of streamflow might be derived from a flooding model; and a CPT for economic impact given flooding might be estimated through expert judgment. In this way a BN provides a framework for integrating all relevant variables in the system using the best available information for each inter-variable relationship.

Model validation

A completed BN should be validated using independent information when available. However, this can be a challenge when the BN CPTs were derived from sources other than observed data or when no new data becomes available for assessing the BN model. Marcot et al. (2001) created BNs using expert judgment and validated the BN models using independent assessment of probability distributions by third-party experts. In some cases, the best or only available validation option may be to make decisions in the watershed and compare the results to those predicted by the BN model. This would be a suitable approach in cases where adaptive management is prescribed. At a minimum, a sensitivity analysis that considers the uncertainties in the BN model should be conducted.

This generic approach to developing and using a BN for watershed management should be applicable to a variety of problem types such as total maximum daily load (TMDL) implementation, integrated watershed planning and management, pollutant trading and assessing the impact of river management on endangered species. In the remainder of this paper, a case study on the application of the approach is presented with a BN analysis and results.
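The five simulation steps listed under "Dynamic simulation model" can be sketched as follows. Everything specific in this example is an assumption made for illustration: `toy_model` is a stand-in flow-weighted mixing calculation, not the calibrated model used in the case study, and the sampling distributions, units and breakpoints are hypothetical.

```python
import random
from collections import Counter, defaultdict


def toy_model(effluent_conc, upstream_flow):
    """Stand-in for steps 1 and 2 (an assumption, not the paper's model).

    Flow-weighted mixing of effluent with phosphorus-free upstream water,
    with a hypothetical constant effluent flow of 1.0 flow units.
    """
    effluent_flow = 1.0
    return effluent_conc * effluent_flow / (effluent_flow + upstream_flow)


def discretize(value, breakpoint):
    """Two-state discretization about a single hypothetical breakpoint."""
    return "high" if value > breakpoint else "low"


def monte_carlo_cpt(n_runs, seed=0):
    """Steps 3 to 5: sample inputs, run the model, discretize and count."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    for _ in range(n_runs):
        # Step 3: vary the inputs about assumed (uniform) distributions.
        effluent_conc = rng.uniform(0.0, 2.0)
        upstream_flow = rng.uniform(0.1, 5.0)
        # Step 4: pair the simulated output with its input conditions.
        instream_conc = toy_model(effluent_conc, upstream_flow)
        # Step 5: discretize parents and child, then tally the occurrence.
        parents = (discretize(effluent_conc, 1.0), discretize(upstream_flow, 1.0))
        child = discretize(instream_conc, 0.3)
        counts[parents][child] += 1
    return {
        parents: {state: n / sum(row.values()) for state, n in row.items()}
        for parents, row in counts.items()
    }


cpt = monte_carlo_cpt(5000)
for parents in sorted(cpt):
    print(parents, cpt[parents])
```

Fixing the random seed makes the estimated table reproducible, which is convenient when the CPT is later subjected to a sensitivity analysis. As the text cautions, a table produced this way carries no information about uncertainty in the formulation of the model itself.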

EAST CANYON RESERVOIR CASE STUDY

Case study overview

East Canyon Reservoir (ECR) in northern Utah, USA has experienced a dramatic decrease in recreational use over the past several years due to reductions in fish populations. The State of Utah Department of Environmental Quality (DEQ) has identified one of the causes of this problem as excess phosphorus entering the reservoir through East Canyon Creek, resulting in increased algal growth and subsequent eutrophication (low levels of dissolved oxygen) (Judd 1999). Phosphorus concentrations in East Canyon Creek have been determined to be in violation of the legal limit for streams, placing this water body on the state's list of impaired waters (Utah DEQ 1998). The challenge faced by DEQ is to identify sources of phosphorus in East Canyon Creek and select management alternatives to control these sources in an economical manner.

Figure 3 shows the East Canyon drainage, dominated by East Canyon Creek, which flows north approximately 26 km (16 miles) from Kimball Junction into ECR. ECR is the sink for surface and ground water flows from Park City, Kimball Junction, Jeremy Ranch and rural areas in the Wasatch Mountains, east of Salt Lake City and Bountiful, UT, USA. The Snyderville Basin wastewater treatment plant (WWTP) is the only major phosphorus point source in the drainage. ECR hosts a state park and has historically supported a high quality cold water fishery. In recent years water quality and fish habitat in ECR have deteriorated and recreational visitation has decreased from 300 000 visitor-days/yr to about 80 000 visitor-days/yr.

Analysis of WWTP releases and streamflow data reveals that, during late summer, the WWTP contributes a large percentage (as high as 80%) of the flow in the creek. Also, the WWTP is the only major phosphorus point source. At the time of this study, the Utah Department of Environmental Quality was exploring new limits on phosphorus loadings from the WWTP. The available physical and chemical phosphorus removal technologies that would have to be implemented to attain these limits are very costly. As a result, the superintendents of the WWTP have challenged the State of Utah's position that restricting phosphorus in the plant's effluent will improve conditions in the stream and reservoir.

In addition to the WWTP, phosphorus also enters East Canyon Creek from non-point sources in the watershed headwaters. For the purpose of this case study, headwaters are considered to be in the Kimball Junction area near the intersection of the highways Interstate 80 and Utah 65. Non-point sources of phosphorus in the drainage include septic systems, grazing lands, camp grounds, golf courses, residential development and recreational reservoir use.

Figure 3. East Canyon Creek watershed in northern Utah. (Map labels: East Canyon Reservoir, East Canyon Creek, Salt Lake City, Jeremy Ranch, Snyderville Basin WWTP, Kimball Junction, Park City, Utah.)

ECR BN MODEL DEVELOPMENT

Problem definition

Identify management endpoints

The goals of the management of point and non-point source phosphorus are: 1) decrease the risk of not meeting the legally required phosphorus limit in East Canyon Creek and 2) increase the probability of improved recreational use and revenue at ECR State Park. BN nodes associated with these management endpoints are shown in Figure 4 as REV RS (revenue generated at the reservoir) and PH ST (phosphorus concentrations in East Canyon Creek).

Identify management alternatives

The number of potentially acceptable and viable management [...] phosphorus concentrations in the WWTP effluent. In Table 1, OP TP Alternative B ("Status quo") represents current conditions. Conditions at the WWTP prior to the use of biological treatment of wastes are represented by OP TP Alternative A ("No bio. treatment"). Alternatives C and D represent two specific treatment technologies that can be installed at the WWTP. The first (Alternative C) is targeted to reduce effluent phosphorus to 0.10 mg/L and the second (Alternative D) is targeted to reduce effluent phosphorus to 0.05 mg/L. Management alternatives in the watershed headwaters (OP HW) include "Status quo" (Alternative A) and "Reduce non-point" (Alternative B).

Identify critical intermediate and exogenous variables

Table 2 shows a list of all of the key variables for this study with a brief description and the variable type. The [...]
