Bayesian Generalized Linear Models in R


As published in Benchmarks RSS Matters (January issue).

Jon Starkweather, PhD
jonathan.starkweather@unt.edu
Consultant, Research and Statistical Support
http://www.unt.edu
http://www.unt.edu/rss

RSS hosts a number of "Short Courses". A list of them is available at: http://www.unt.edu/rss/Instructional.htm. The programming scripts contained in this article can also be found at: http://www.unt.edu/rss/class/Jon/R_SC

Bayesian statistical analysis has benefited from the explosion of cheap and powerful desktop computing over the last two decades or so. Bayesian techniques can now be applied to complex modeling problems where they could not have been applied previously. It seems likely that the Bayesian perspective will continue to challenge, and perhaps supplant, traditional frequentist statistical methods which have dominated many disciplines of science for so long. The current article will review one function which allows the user to conduct linear regression, general linear modeling, and generalized linear modeling (i.e., non-Gaussian; e.g., Poisson, binomial, etc.). A fairly simple model is specified, then modeled using traditional techniques, and then modeled with a Bayesian approach. Do not implement these methods unless you understand the core principles of the Bayesian perspective (i.e., priors, likelihoods, posteriors, etc., and all they entail).

A complete treatment of the Bayesian perspective is beyond the scope of this article and could fill several books; and has. Interested readers can consult a number of introductory texts focusing on the Bayesian perspective (e.g., Berry, 1996; Bolstad, 2004; Gelman, Carlin, Stern, & Rubin, 2004; Hoff, 2009). Very generally speaking, the Bayesian approach to statistical inference differs from traditional frequentist inference by assuming that the data are fixed and model parameters are random, which sets up problems of the form: what is the probability of a hypothesis (or parameter), given the data at hand? These types of problems can be stated in symbols as p(H | D). Traditional frequentist inference assumes that the model parameters are fixed (though unknown) and the data are essentially random; for instance, if the null hypothesis is true, what is the probability of this data? These types of problems take the general form: what is the probability of the data, given a hypothesis? In symbols, this translates to p(D | H).

Bayesian methods focus on five essential elements. First, the incorporation of prior information (e.g., expert opinion, a thorough literature review of the same or similar variables, and/or prior data). Prior information is generally specified quantitatively in the form of a distribution (e.g., normal/Gaussian, Poisson, binomial, etc.) and represents a probability distribution for a coefficient; meaning, the distribution of probable values for a coefficient we are attempting to model (e.g., a β weight). It may help to think of the prior as an educated best guess. Second, the prior is combined with a likelihood function. The likelihood function represents the data (i.e., the distribution of the estimate produced by the data). Third, the combination of the prior with the likelihood function results in the creation of a posterior distribution of coefficient values. Fourth, simulates are drawn from the posterior distribution to create an empirical distribution of likely values for the population parameter. Fifth, basic statistics are used to summarize the empirical distribution of simulates from the posterior. The mode (or median or mean) of this empirical distribution represents the maximum likelihood estimate of the true coefficient's population value (i.e., the population parameter), and credible intervals can capture the true population value with a probability attached.
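Stated compactly, the second and third elements are just Bayes' theorem: the posterior is proportional to the prior times the likelihood. Writing θ for the coefficient(s) being modeled and D for the data:

    p(θ | D) = p(D | θ) p(θ) / p(D)  ∝  p(D | θ) p(θ)

The denominator p(D) does not involve θ, which is why the posterior can be treated as simply a reweighting of the prior by the likelihood.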
Keep in mind, priors should be rationally and honestly derived. They can be weak or strong. These terms refer to the strength of belief we have in the prior(s). Weak priors result when we do not have a great deal of evidence or prior information on which to base the prior(s). When the prior is weak, the prior distribution will be wide, reflecting a great many possible values, and the likelihood will be more influential in creating the posterior distribution. Strong priors, conversely, result when we have a great deal of evidence on which to base the prior(s). When the prior is strong, the prior distribution will be narrow, reflecting a smaller range of possible values, and the likelihood will be less influential in creating the posterior (strong priors will influence the posterior more than the likelihood). It should be clear that one key feature of the prior is the ability to quantify our uncertainty. The posterior can be thought of as a compromise between the prior and the likelihood: if the prior is weak, it will be less influential in creating the posterior; if the prior is strong, it will be more influential in creating the posterior.
creating the posterior; if the prior is strong, then it will be more influential in creating the posterior.The example used here is a simple linear regression model with one interval/ratio outcome (extro) andthree interval/ratio predictors (open, agree, social). The simulated data set contains 500 cases, each withcomplete data (i.e. no missing values).Import the data from the web, get a summary of the data, and take a look at the correlations. We see verylittle multicollinearity here.Confirm / take a look at the core Linear Model (lm) – traditional Ordinary Least Squares (OLS) regression.Notice in the output, the intercept is approximately -5.0. The unstandardized coefficients for each predic4
Confirm / take a look at the core Linear Model (lm), i.e., traditional Ordinary Least Squares (OLS) regression, as sketched below.
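A sketch of the traditional OLS fit, using the outcome and predictors named above:

    ols.fit <- lm(extro ~ open + agree + social, data = data.df)
    summary(ols.fit)   # unstandardized coefficients and the intercept
    confint(ols.fit)   # traditional 95% confidence intervals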
Notice in the output that the intercept is approximately -5.0 and the unstandardized coefficients for each predictor are listed: the open coefficient is approximately 1.2, the agree coefficient approximately 0.90, and the social coefficient approximately 0.40. We could calculate a 95% confidence interval (CI95) for each predictor's coefficient; for instance, the CI95 for open is 1.187 to 1.212. But what does this really tell us? It is interpreted as: if an infinite number of samples were taken from this population, 95% of the open coefficient values would be between 1.187 and 1.212. It does not tell us the range which contains the true population value.

The same results can be generated with the Generalized Linear Model (glm) function (a sketch follows), specifying the default Gaussian (normal) family distribution. The primary benefit of the glm function is the ability to specify error distributions other than normal.
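A sketch of the same model fit with glm; the Gaussian family is the default and is written out only for clarity:

    glm.fit <- glm(extro ~ open + agree + social, data = data.df,
                   family = gaussian(link = "identity"))
    summary(glm.fit)   # coefficients match the OLS results above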

To conduct the Bayesian GLM, load the package 'arm', which contains the bayesglm function (Gelman, et al., 2010); you will notice it has several dependencies. Conduct the Bayesian generalized linear model (here, family Gaussian) and get the summary of the output, as sketched below. Notice the specification of the prior mean, scale, and degrees of freedom. Each 'family' of distributions requires specific prior specifications (e.g., a binomial distribution would have a slightly different prior specification); see the package documentation for details (http://cran.r-project.org/web/packages/arm/arm.pdf) or simply type help(bayesglm) at the R console.
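A sketch of the Bayesian fit. The prior settings shown (a flat specification, so that the estimates line up with the traditional fits on this large sample) are illustrative assumptions rather than the article's exact values; help(bayesglm) documents the defaults for each family.

    library(arm)   # provides bayesglm; loads several dependencies
    bayes.fit <- bayesglm(extro ~ open + agree + social, data = data.df,
                          family = gaussian(link = "identity"),
                          prior.mean = 0,      # prior centered at zero
                          prior.scale = Inf,   # infinite scale = flat prior
                          prior.df = Inf)      # normal rather than t prior
    summary(bayes.fit)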

We can see the output matches up with the traditional linear model (OLS regression) as well as the traditional GLM; as sample sizes increase, the results should converge to the same values.

One of the benefits of the Bayesian perspective (for any analysis) is that it allows us to make credible interval statements. Credible intervals are similar to confidence intervals, but in the Bayesian framework the interval REALLY IS believed to contain the true population parameter. For instance, a 95% credible interval for a parameter being estimated is interpreted as: there is a 95% probability that the actual parameter is included in that interval. This is because the interval is based on information from the posterior distribution of, for instance, one of the predictors' coefficients (e.g., the open variable's coefficient posterior distribution).

The bayesglm function represents a kind of shortcut of the Bayesian approach to inference. Typically, the posterior is not used directly for making inferences. Instead, an empirical distribution is constructed based on draws from the posterior, and that empirical distribution is what informs the inference(s). Here, we are using bayesglm as a proxy for that added empirical distribution: with bayesglm we get a distribution of 'simulates' which are used in place of an actual empirical distribution (covered further below).

Retrieve the posterior distributions of the coefficients for the intercept and all three predictors. The head function simply lists the first 10 rows of the object on which it is run (the default for head is the first 6).
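A sketch using the arm package's sim function; n.sims = 1000 is an arbitrary choice here:

    bayes.sim <- sim(bayes.fit, n.sims = 1000)   # draw simulates from the posterior
    posteriors <- bayes.sim@coef                 # 1000 x 4 matrix: intercept plus the
                                                 # three predictors' coefficients
    head(posteriors, 10)                         # first 10 rows of simulates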

Extract just the posterior distribution of the 'open' variable's coefficient; again, the head function simply lists the first 10 items of the object. Then take a look at the posterior distribution of the open variable's coefficient (normally a histogram would not be used; it is used here simply as a graphical reference).
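A sketch, continuing from the posteriors matrix above; column 2 is assumed to hold the open coefficient (the intercept comes first, then the predictors in formula order):

    open.post <- posteriors[, 2]   # the open coefficient's simulates
    head(open.post, 10)
    hist(open.post)                # rough graphical reference only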

The 'density' plot is normally used to display a posterior. The density plot can be thought of as a highly modified histogram: imagine a histogram with 100 bins (instead of the 7 displayed in the histogram above), then imagine plotting a line from the x-axis through each bin's midpoint at the top of each bin, and back down to the x-axis; the result would be the density plot.
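A one-line sketch, reusing open.post from above:

    plot(density(open.post))   # kernel density of the open coefficient's posterior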

Now we can retrieve the 95% credible interval for the open variable's coefficient.
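A sketch, applying quantile to the simulates:

    quantile(open.post, probs = c(0.025, 0.975))   # 95% credible interval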
Recall, this credible interval is interpreted as: there is a 95% probability that the true population value of the open coefficient is between 1.186 and 1.210. Keep in mind, these numbers will fluctuate slightly given the iterative nature of the function.

To make truly Bayesian inferences about our coefficients, we need to take the extra step of creating the empirical distribution(s) mentioned above.
Going further entails actually creating an empirical distribution based on iterative draws from the posterior. The MCMCregress function in the package 'MCMCpack' (Martin, Quinn, & Park, 2010) provides the Markov Chain Monte Carlo simulation method of creating the empirical distribution, which in turn allows us to compute the descriptive statistics used for inference. Meaning, the mode, median, or mean of the empirical MCMC simulates' distribution is the maximum likelihood estimate (i.e., the top of a density function) of the population parameter. The MCMCregress function also gives us the credible interval which includes the actual population parameter value. (The package 'MCMCpack' contains functions for many other types of models, as well as other ancillary functions for working with MCMC objects.)

First, load the 'MCMCpack' library. Next, apply the MCMCregress function, as sketched below. Notice the model formula is the same, but here we have some new options. The 'burnin' argument is used because MCMC iterates are sensitive to their initial start values, so the first few (here, 3000) iterations are discarded. The 'mcmc' argument sets how many (post-burnin) iterations will be used to build the empirical distribution. The 'thin' argument defaults to 1 and represents a control on convergence, such that once approximate convergence has been reached it can be beneficial to keep only some of the simulates and discard the rest to conserve computer resources (Gelman, Carlin, Stern, & Rubin, 2004). The verbose option (off by default) controls whether the iteration history is printed as the function runs. The seed argument allows the user to set the random number generator seed, and the 'beta.start' argument allows the user to set a start value for the beta vector.
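A sketch; burnin = 3000 follows the description above, while mcmc = 10000, thin = 1, seed = NA, and beta.start = NA are the function's documented defaults, written out for illustration:

    library(MCMCpack)
    mcmc.fit <- MCMCregress(extro ~ open + agree + social, data = data.df,
                            burnin = 3000,     # discard the first 3000 iterations
                            mcmc = 10000,      # post-burnin iterations kept
                            thin = 1,          # keep every simulate
                            verbose = 0,       # do not print iteration history
                            seed = NA,         # random number generator seed
                            beta.start = NA)   # start values for the beta vector
    summary(mcmc.fit)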
Notice in the summary, we get the coefficient estimates ("1. Empirical mean and standard deviation for each variable ...") and credible intervals ("2. Quantiles for each variable ..."). So, we can say there is a 95% probability that the true population value of the open coefficient is between 1.1869 and 1.2120.
References & Resources

Albert, J. (2007). Bayesian Computation with R. New York: Springer Science+Business Media, LLC.

Berry, D. A. (1996). Statistics: A Bayesian Perspective. Belmont, CA: Wadsworth Publishing Company.

Bolker, B. M. (2008). Ecological Models and Data in R. Princeton, NJ: Princeton University Press.

Bolstad, W. M. (2004). Introduction to Bayesian Statistics. Hoboken, NJ: John Wiley & Sons, Inc.

Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian Data Analysis (2nd ed.). Boca Raton, FL: Chapman & Hall/CRC.

Gelman, A., Jakulin, A., Pittau, M. G., & Su, Y. (2008). A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4), 1360-1383.

Gelman, A., Su, Y., Yajima, M., Hill, J., Pittau, M. G., Kerman, J., & Zheng, T. (2010). Package 'arm'. Available at: http://cran.r-project.org/web/packages/arm

Hoff, P. D. (2009). A First Course in Bayesian Statistical Methods. New York: Springer Science+Business Media, LLC.

Martin, A. D., Quinn, K. M., & Park, J. H. (2010). Package 'MCMCpack'. Available at: http://cran.r-project.org/web/packages/MCMCpack

Until next time, "Stop; hey, what's that sound."

This article was last updated on January 25, 2016. This document was created using LaTeX.
