An Accessible Method For Implementing Hierarchical Models .

3y ago
30 Views
2 Downloads
483.59 KB
8 Pages
Last View : 11d ago
Last Download : 3m ago
Upload by : Jenson Heredia
Transcription

An Accessible Method for Implementing HierarchicalModels with Spatio-Temporal Abundance DataBeth E. Ross1*, Mevin B. Hooten2,3,4, David N. Koons51 Department of Wildland Resources, Utah State University, Logan, Utah, United States of America, 2 U. S. Geological Survey, Colorado Cooperative Fish and WildlifeResearch Unit, Fort Collins, Colorado, United States of America, 3 Department of Fish, Wildlife, & Conservation Biology, Colorado State University, Fort Collins, Colorado,United States of America, 4 Department of Statistics, Colorado State University, Fort Collins, Colorado, United States of America, 5 Department of Wildland Resources andthe Ecology Center, Utah State University, Logan, Utah, United States of AmericaAbstractA common goal in ecology and wildlife management is to determine the causes of variation in population dynamics overlong periods of time and across large spatial scales. Many assumptions must nevertheless be overcome to make appropriateinference about spatio-temporal variation in population dynamics, such as autocorrelation among data points, excess zeros,and observation error in count data. To address these issues, many scientists and statisticians have recommended the use ofBayesian hierarchical models. Unfortunately, hierarchical statistical models remain somewhat difficult to use because of thenecessary quantitative background needed to implement them, or because of the computational demands of using MarkovChain Monte Carlo algorithms to estimate parameters. Fortunately, new tools have recently been developed that make itmore feasible for wildlife biologists to fit sophisticated hierarchical Bayesian models (i.e., Integrated Nested LaplaceApproximation, ‘INLA’). We present a case study using two important game species in North America, the lesser and greaterscaup, to demonstrate how INLA can be used to estimate the parameters in a hierarchical model that decouples observationerror from process variation, and accounts for unknown sources of excess zeros as well as spatial and temporal dependencein the data. Ultimately, our goal was to make unbiased inference about spatial variation in population trends over time.Citation: Ross BE, Hooten MB, Koons DN (2012) An Accessible Method for Implementing Hierarchical Models with Spatio-Temporal Abundance Data. PLoSONE 7(11): e49395. doi:10.1371/journal.pone.0049395Editor: Martin Krkosek, University of Otago, New ZealandReceived June 1, 2012; Accepted October 11, 2012; Published November 16, 2012This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone forany lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.Funding: Delta Waterfowl (www.deltawaterfowl.org) and the Ecology Center and Wildland Resources Department at Utah State University provided funding forthis study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.Competing Interests: The authors have declared that no competing interests exist.* E-mail: beth.ross@aggiemail.usu.eduthe data are known without error [3,11,12]. Ignoring observationerror can lead to inaccurate estimation of focal parameters, such asvital rates and trends in abundance [3,13]. In addition, ignoringuncertainty in ecological processes themselves can result inspurious conclusions [10]. For example, if spatial autocorrelationis present in the data, but not accounted for, the varianceassociated with the parameter will be estimated to be smaller thanit should be, potentially causing the researcher to conclude thata result is statistically significant when, in reality, it is not [14].While extremely useful for ecological applications, a drawbackof the hierarchical modeling approach is the required backgroundin statistics and computer programming that is necessary toimplement sophisticated models incorporating spatio-temporaldynamics. Markov Chain Monte Carlo (MCMC) is a powerfulmethod used to sample from the hierarchical Bayesian posteriordistributions of parameters, or determine Maximum LikelihoodEstimates via data cloning [15]. Unfortunately, MCMC involveshigh computational time and the challenge of correctly interpreting output (e.g., convergence diagnostics). Estimation of statevariables in hierarchical models with the Kalman Filter, a set ofrecursive equations, is also limited by modeling assumptions [6].Recently, a statistical algorithm has been developed for obtainingposterior distributions from Bayesian hierarchical models that doesnot involve the use of MCMC. Based on Laplace’s work withintegral transforms [16], this new method is referred to asIntroductionMonitoring and detecting changes in population abundance,and determining why changes vary over space and time, areconcepts that are central to ecology, conservation, and management [1,2]. Obtaining accurate estimates of trends and otherchanges in population abundance is more difficult than it mightappear given complicating issues such as density-dependence andstochasticity. Various sources of uncertainty in data and underlying ecological processes (e.g., temporal or spatial autocorrelation and observation error) can confound inference aboutchanges in population abundance [3].Statistical models that have been used to assess changes inpopulation abundance have ranged in complexity from linearregression to generalized linear models (e.g., Poisson regression),generalized additive models, and time-series models [4,5]. Whileincreasing complexity in these models helps meet assumptions,none can simultaneously decouple process error from observationerror, and account for both temporal and spatial autocorrelation.Hierarchical models may present the best statistical approach forassessing changes in population abundance across large spatialareas [6–8]. Hierarchical models are ideal for handling observational data because they allow for the explicit separation ofobservation and process error [6,9,10]. This is a key distinction, ashierarchical models do not require the assumption that either thedata are collected without error, or that the processes underlyingPLOS ONE www.plosone.org1November 2012 Volume 7 Issue 11 e49395

Hierarchical Models with Spatio-Temporal Datain size, and are based on geographical and political boundaries.Observers count duck species, and whether or not the ducks arepaired (with a mate), single drakes, or in mixed-sex groups.Two particular species of interest that are surveyed during theBPS are the lesser and greater scaup, which are countedcollectively as ‘scaup’ due to their similar appearance. Scaup arethe most abundant and widespread diving duck in North America,and are important game species [27]. Since 1978, however, thecontinental population of scaup has declined to levels that are 16%below the 1955–2010 average and *34% below the NorthAmerican Waterfowl Management Plan goal [25]. This declinehas sparked concern amongst hunters, management agencies, andconservation groups alike [28]. The greatest decline in abundanceof scaup appears to be occurring in the western boreal forest,where populations may have depressed rates of reproductivesuccess, survival, or both [28–31]. However, the specific vital-ratepathways responsible for the decline are not known [32]; nor isthere a consenus on the underlying mechanisms that may havecaused the population decline. Leading hypotheses include:decreased food availablilty during spring migration, and consequent arrival on the breeding grounds in poor body condition thatcould inhibit reproductive success (i.e., the Spring ConditionHypothesis [28,33–35]); and toxins (particularly selenium [36,37])acquired on the Great Lakes and other migratory stopoverlocations that could inhibit both reproduction and survival [38].However, DeVink et al. [39] and others [36,37] concluded thatselenium and mercury levels are low in boreal scaup, and not likelyresponsible for the population decline in this important breedingregion. Moreover, toxins do not seem to be inhibiting scaup vitalrates [40]. Studies examining the Spring Condition Hypothesishave produced conflicting results across spatial regions [39]. Otherhypothesized drivers include anthropogenic development of theboreal forest (e.g., hydropower dams, logging, oil and gasextraction, and mining) and climate change. Recently, Dreveret al. [41] found that decreasing snow pack on the breedinggrounds in the boreal forest (and thus subsequent pond conditions)is negatively correlated with regional population growth rates inscaup.To better understand the causes of the decline, a high level ofimportance has been placed on retrospective analyses thataccurately identify the spatial and temporal changes in populationabundance [42]. A spatio-temporal analysis of population trendswould thus help identify the regions where scaup abundance hasdeclined most severely, where no change has occurred over thelong-term, and even identify areas where abundance may beincreasing. Such information is useful for directing managementand conservation actions toward the most important areas, andwould allow researchers to gain clearer insight into the mechanisms that might be causing declines in some regions and increasesin others.integrated nested Laplace approximation (INLA), which usesa three-step process involving Laplace approximations andnumerical integration to derive posterior distributions for theparameters of interest [17]. Although INLA is only capable ofhandling hierarchical specifications with generalized linear processmodels and a latent Gaussian random field, it has the addedadvantage of being more accurate than MCMC within reasonablecomputation times [17]. Additionally, the development of a readilyavailable R package [18] makes INLA accessible to users witha reasonable background in parametric statistics [17].Hierarchical models have been used to assess trends inabundance while accounting for process error and observationerror [19], incorporate mechanisms of density-dependence[6,12,20], and estimate second-order spatial autocorrelation toobtain unbiased estimates of precision in population dynamicsacross a landscape [8]. However, fully spatio-temporal models ofpopulation dynamics can still be difficult to visualize andimplement [21]. This is not altogether surprising, given thestatistical and computational requirements that have, until now,been needed to estimate parameters in sophisticated hierarchicalmodels that account for first- and second-order autocorrelationprocesses across both space and time.Fortunately, INLA methodology and the associated R package[18] developed by Rue et al. [17] make the implementation ofhierarchical spatio-temporal models relatively easy. This packageallows for researchers to easily incorporate various structures intothe process model that are relevant to modeling populationabundance. Latent models such as seasonal variations [22], spatialeffects [23], and autoregressive models (i.e., AR(1) [6]) can bespecified using INLA. Additionally, these process models can becompared using deviance information criterion (DIC [24]) inINLA, thus allowing the researcher to determine which modelsbest describe the given population. The same comparison usingDIC can be done when selecting among error models for the data.In addition to latent (or unobservable) variables, covariates ofinterest and temporal trends can be specified directly in theprocess model to account for additional sources of variation in theprocess. The ability to account for first- and second-orderprocesses in turn allows the researcher to make inferencepertaining to the covariates of interest while also learning aboutthe latent variables.In this paper, we utilize a long-term, continental-scale dataset ofabundance counts of the lesser and greater scaup (Aythya affinis andA. marila, respectively) to illustrate how INLA can be used to fitmodels with underlying spatio-temporal mechanisms and autocorrelation to better understand the processes governing population dynamics. We then discuss how additional processes can easilybe incorporated into our case study to address an array ofecological questions.Case StudyEvery year, the U.S. Fish and Wildlife Service and CanadianWildlife Service conduct the North American May Breeding PairSurvey (BPS), which provides a rich source of demographic dataon 10 focal duck species (as well as others). The BPS includes areasin the north-central United States, much of western Canada, andAlaska; purposefully covering a large portion of each species’breeding range [25]. This survey has been conducted every Maythrough June since 1955 using aerial transects [26]. Surveys areflown at approximately 193 kilometers per hour at an altitude of27–30 meters. Within each stratum, the largest sampling unit ofthe survey, pilots fly multiple transects, each comprised of 28.8 kmstrip-segments. The number of segments sampled in a transectranges from 1 to 35, and has changed over time. Strata units varyPLOS ONE www.plosone.orgHierarchical ModelTo account for process and observation error, we used Bayesianhierarchical models to examine change in scaup abundance foreach stratum between 1957 and 2009 (scaup data were scarce in1955–1956). Here, our focus is on the delineation of scauprecorded in breeding pairs, rather than total scaup abundance, asthese pairs best represent the breeding potential of the population.We did not utilize data regarding single drakes due to the skewedsex ratio in scaup [28]. Further, counting methods for the mixedsex non-breeding groups changed in 1975 [26]; thus, use of thesedata would confound long-term analysis from 1957 onward. Ourbasal unit of data was the number of pairs, yi,j,t , observed on each2November 2012 Volume 7 Issue 11 e49395

Hierarchical Models with Spatio-Temporal DataXX{12et (e1,t , . . . ,em,t )0 * N(0,fore ), ande :se (D{W)all time t 1, ,T representing spatially correlated errors andgj (gj,1 , . . . ,gj,T )0 * N(0,Sg ) for all strata j 1, ,m representingtemporally correlated errors. The e and g terms incorporated intothe model (2) are intended to account for variation in both thespatial and temporal dimensions of the system, respectively, W isa proximity matrix describing the neighborhood structure, and Dis a diagonal matrix with the row sums of W as diagonal elements[48]. Again, if these sources of uncertainty are not accounted for,erroneous inference could be made because model assumptionswould not be met [49].Rather than standardize the survey area to some sort of grid aspast studies have done (e.g., Gardner et al. [50]), we instead usedthe existing spatial structure present in the survey, and based ourspatial field on the strata units as areal regions. This provided thebasis for a conditional autoregressive structure (CAR [51,52]),while also yielding results that were directly useful for managementof the species. The CAR model is a commonly used spatial modelfor areal processes, and incorporates dependence among areallocations through the neighborhood structure of the strata. Inconstructing the proximity matrix, W, we calculated the Euclideandistance between the center of each stratum, and then denoted allpairs of strata as neighbors if they were within a threshold distanceof 7.75 decimal degrees of each other. This distance was chosenfor this case study because it was the shortest distance at which allstrata had at least one neighbor, and constructing the proximitymatrix in such a way accounted for the more spatially isolatedstrata in the north and the highly connected strata in the south.Thus, in our case, W was a 52 52 matrix (dimensionsdetermined by the number of strata) with elements Wx,y 1indicating stratum x and stratum y were neighbors, and Wx,y 0 ifotherwise.In addition to spatial structure, we also accounted for latenttemporal dependence in the system. We let gj,t * N (agj,t{1 ,s2g ),where a is the autocorrelation parameter, which introducesa temporal correlation structure on gj , the vectors of residualsabout the Pmodeled log counts in stratum j over time, such thatgj * N (0, g ). Given these parameterizations, the entire latent(i.e., random-effect) process is a Gaussian Markov random field,which allows for the use of INLA to approximate posteriordistributions. Also, in a biologically meaningful context, the modelfor gj,t implies latent Gompertz growth, since the population attime t is dependent on the population at time t-1 on the log scale.segment i, in stratum j, in year t. The hierarchical approachallowed us to incorporate spatial autocorrelation among strata intothe process model while also gaining insight into the changeswithin each stratum. We also incorporated terms in our processmodel to account for temporal autocorrelation in the data, whichoften occurs in temporally dynamic systems. We also selectedbetween data models to determine which models best fit the data.Bayesian hierarchical models are usually comprised of three‘submodels’: a data model describing the distribution of the data,a process model specifying the underlying mechanisms that giverise to the data, and the parameter model indicating thedistributions of the parameters in the process model [43]. Below,we outline each of these model components for our case study.Data ModelBased on preliminary analysis, the BPS scaup data appeared tobe overdispersed, containing a disproportionately high number ofzeros along with a high variance relative to the mean [44,45].Thus, we considered two potential data models, a negativebinomial model where yi,j,t * NegBinom(mj,t ,w), and a zeroinflated negative binomial model where(yi,j,t *0,NegBinom (mj,t ,w),with probabilitywith probabilityy(1{y)ð1Þwhere mj,t denotes the average abundance of counted pairs forsegments i 1, ,nj in stratum j 1, ,m during observationperiod t 1, ,T (i.e., years 1957–2009). Models were thencompared using DIC. We considered the zero-inflated modelbecause it accounts for excess zeros, which can arise from morezero counts in a dataset than would be well described by a typicaldata model, and can occur from either ‘false’ or ‘true’ zeros. Falsezeros may arise due to birds that were present during the survey,but unobserved, or not present during the survey (e.g., temporaryemigration to a loafing pond). Unlike these false zeros, true zeroscould occur when scaup do not fully occupy their suitable habitat,or when the habitat is simply not suitable [46]. One of the benefitsof hierarchical modeling is that one can specify a distribution thatproperly supports the data (i.e., the range of values y can take on),alleviating the need for direct data transformations (e.g., log orarcsine) that are commonly used to specify linear process models.This is a valuable property of generalized linear models, given thatmodels based on negative binomial distributions can performbetter at modeling count data than transformed data withGaussian model assumptions [47].Parameter ModelThe parameter model consists of all the prior distributionsassigned to the unknown parameters in the process model. Theoverdispersion parameter for the negative binomial distributionwas specified as w log(n), where n is the original negativebinomial size parameter, and then modeled as w* N(0,100). Theregression parameters were specified with conjugate Gaussianpriors so that b0,j ,b1,j * N(0,1000), for j 1, ,m. The variancecomponent was specified in terms of precision [17], and was givena conjugate gamma prior, s{2e * Gamma(1,1 20000). Lastly,given the reparameterization of h1 (1{a2 ) s2g andh1wasassignedthepriorh2 (1za) (1{a),h1 * Gamma(1,1 20000) and h2 was assigned the priorh2 * N(0,5). These reparameterizations are suggested for implementation of INLA for ease of processing [17].Process ModelWe incorporated each data model into a simple log-linearregression model with hierarchical terms (i.e., random eff

error, and account for both temporal and spatial autocorrelation. Hierarchical models may present the best statistical approach for assessing changes in population abundance across large spatial areas [6–8]. Hierarchical models are ideal for handling observa-tional data because they allow for the explicit separation of

Related Documents:

Bruksanvisning för bilstereo . Bruksanvisning for bilstereo . Instrukcja obsługi samochodowego odtwarzacza stereo . Operating Instructions for Car Stereo . 610-104 . SV . Bruksanvisning i original

10 tips och tricks för att lyckas med ert sap-projekt 20 SAPSANYTT 2/2015 De flesta projektledare känner säkert till Cobb’s paradox. Martin Cobb verkade som CIO för sekretariatet för Treasury Board of Canada 1995 då han ställde frågan

service i Norge och Finland drivs inom ramen för ett enskilt företag (NRK. 1 och Yleisradio), fin ns det i Sverige tre: Ett för tv (Sveriges Television , SVT ), ett för radio (Sveriges Radio , SR ) och ett för utbildnings program (Sveriges Utbildningsradio, UR, vilket till följd av sin begränsade storlek inte återfinns bland de 25 största

Hotell För hotell anges de tre klasserna A/B, C och D. Det betyder att den "normala" standarden C är acceptabel men att motiven för en högre standard är starka. Ljudklass C motsvarar de tidigare normkraven för hotell, ljudklass A/B motsvarar kraven för moderna hotell med hög standard och ljudklass D kan användas vid

LÄS NOGGRANT FÖLJANDE VILLKOR FÖR APPLE DEVELOPER PROGRAM LICENCE . Apple Developer Program License Agreement Syfte Du vill använda Apple-mjukvara (enligt definitionen nedan) för att utveckla en eller flera Applikationer (enligt definitionen nedan) för Apple-märkta produkter. . Applikationer som utvecklas för iOS-produkter, Apple .

EPA Test Method 1: EPA Test Method 2 EPA Test Method 3A. EPA Test Method 4 . Method 3A Oxygen & Carbon Dioxide . EPA Test Method 3A. Method 6C SO. 2. EPA Test Method 6C . Method 7E NOx . EPA Test Method 7E. Method 10 CO . EPA Test Method 10 . Method 25A Hydrocarbons (THC) EPA Test Method 25A. Method 30B Mercury (sorbent trap) EPA Test Method .

Accessible Light Rail Features NJ TRANSIT operates 3 Light Rail Systems (58 of 62 stations are accessible): Hudson-Bergen Light Rail (100% accessible) Newark City Subway(most of the stations are accessible) The River Line (100% accessible). All accessible stations are shown on our system map with the international symbol of

och krav. Maskinerna skriver ut upp till fyra tum breda etiketter med direkt termoteknik och termotransferteknik och är lämpliga för en lång rad användningsområden på vertikala marknader. TD-seriens professionella etikettskrivare för . skrivbordet. Brothers nya avancerade 4-tums etikettskrivare för skrivbordet är effektiva och enkla att