# Categorical Data Analysis Using SAS And Stata


Hsueh-Sheng Wu, Center for Family and Demographic Research, Mar 3, 2014

## Outline

- Why do we need to learn categorical data analyses?
- A summary of different categorical data analyses
  - Analyses of contingency tables
  - Regression models
    - Logistic regression
    - Ordered logistic regression
    - Multinomial logistic regression
- Stata commands
- SAS commands
- Interpreting the results
- Predicted probability
- Conclusions

## Why Do We Need to Learn Categorical Data Analysis?

- Four measurement levels
  - Nominal (e.g., gender, race)
  - Ordinal (e.g., attitude toward cohabitation)
  - Interval (e.g., temperature)
  - Ratio (e.g., income)
- Categorical variables are those measured at nominal and ordinal levels.
- Interval or ratio variables can be transformed into nominal or ordinal variables, but not the other way around.

## What Is Special about Categorical Variables?

- The distribution of a categorical variable is described by its frequencies and proportions rather than by its mean and variance.
- Statistical methods designed for continuous dependent variables (e.g., t-test, correlation, OLS regression) are not adequate for analyzing categorical dependent variables.
- The decision on how to analyze categorical variables is often based on:
  - The measurement level and number of categories of the dependent variable
  - The measurement level and number of categories of the independent variables
  - Sample size
  - Number of independent variables

## Different Models for Categorical Dependent Variables

Categorical models address three types of questions:

- Examination of contingency tables
  - Proportions
  - Relative risks
  - Odds ratio
- How the characteristics of individuals affect the choice
  - Binary logistic regression
  - Ordered logistic regression
  - Multinomial logistic regression

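## Analyzing a Two-way Contingency Table

Analyzing a 2x2 table

Table. Gender and Employment (counts)

| | Employed | Unemployed |
|---|---|---|
| Male | 200 | 200 |
| Female | 200 | 400 |

Table. Gender and Employment (proportions)

| | Employed | Unemployed |
|---|---|---|
| Male | ρ1 | 1 - ρ1 |
| Female | ρ2 | 1 - ρ2 |

Difference of Two Proportions

$$\rho_1 - \rho_2, \qquad SE = \sqrt{\frac{\rho_1(1-\rho_1)}{n_1} + \frac{\rho_2(1-\rho_2)}{n_2}}$$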

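## Analyzing a Two-way Contingency Table (Cont.)

Relative Risk

$$\text{Relative Risk} = \frac{\rho_1}{\rho_2}$$

Odds Ratio

$$\text{Odds Ratio} = \frac{\text{Odds}_1}{\text{Odds}_2} = \frac{\rho_1/(1-\rho_1)}{\rho_2/(1-\rho_2)} = \frac{n_{11}\,n_{22}}{n_{12}\,n_{21}}$$

$$SE(\log \text{Odds Ratio}) = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}$$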

## Example

Data

- P1 = 200/400 = 0.5
- P2 = 200/600 = 0.33

Measures

- Difference of two proportions: P1 - P2 = 0.17
- Relative risk: P1/P2 = 1.5 (1.51 if P2 is first rounded to 0.33)
- Odds ratio: (200*400)/(200*200) = 2
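The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: it assumes the rows of the 2x2 table are male and female and that the first column holds the employed counts. The variable names are illustrative. The standard-error formulas are the usual large-sample ones for a difference of proportions and for the log odds ratio.

```python
import math

# Cell counts of the example 2x2 table (assumed layout: rows = male, female;
# columns = employed, unemployed).
n11, n12 = 200, 200   # male
n21, n22 = 200, 400   # female

n1, n2 = n11 + n12, n21 + n22      # row totals
p1, p2 = n11 / n1, n21 / n2        # row proportions

# Difference of two proportions and its standard error
diff = p1 - p2
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Relative risk
relative_risk = p1 / p2

# Odds ratio and the standard error of its logarithm
odds_ratio = (n11 * n22) / (n12 * n21)
se_log_or = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

print(f"difference of proportions = {diff:.3f} (SE {se_diff:.3f})")
print(f"relative risk             = {relative_risk:.3f}")
print(f"odds ratio                = {odds_ratio:.3f} (SE of log OR {se_log_or:.3f})")
```

Working from the unrounded proportions gives a relative risk of exactly 1.5, which is why the result differs slightly from one computed with P2 rounded to 0.33.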

## Analyzing a Three-way Contingency Table

- A three-way contingency table can be viewed as multiple two-way contingency tables created at different levels of a third variable.
- Example:

Table. Relations among Country, Gender, and Employment

| | Country A: Employed | Country A: Unemployed | Country B: Employed | Country B: Unemployed |
|---|---|---|---|---|
| Male | 180 | 120 | 20 | 80 |
| Female | 120 | 80 | 80 | 320 |

## Example

- Difference of proportions
  - Country A: (180/300) - (120/200) = 0
  - Country B: (20/100) - (80/400) = 0
- Relative risk
  - Country A: (180/300)/(120/200) = 0.6/0.6 = 1
  - Country B: (20/100)/(80/400) = 0.2/0.2 = 1
- Odds ratio
  - Country A: (180*80)/(120*120) = 1
  - Country B: (20*320)/(80*80) = 1
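As a rough check of the per-country figures, the sketch below recomputes the three measures for each country's 2x2 table and then for the table obtained by summing over country. The cell order (male employed, male unemployed, female employed, female unemployed) and the helper name `two_by_two_measures` are assumptions made for illustration.

```python
def two_by_two_measures(n11, n12, n21, n22):
    """Return (difference of proportions, relative risk, odds ratio) for a 2x2 table."""
    p1 = n11 / (n11 + n12)
    p2 = n21 / (n21 + n22)
    return p1 - p2, p1 / p2, (n11 * n22) / (n12 * n21)

# Assumed cell order: male employed, male unemployed, female employed, female unemployed
tables = {
    "Country A": (180, 120, 120, 80),
    "Country B": (20, 80, 80, 320),
}

# Measures within each country (the conditional two-way tables)
results = {name: two_by_two_measures(*cells) for name, cells in tables.items()}

# Collapse over country by summing matching cells, then recompute
pooled_cells = tuple(sum(cells) for cells in zip(*tables.values()))
pooled = two_by_two_measures(*pooled_cells)

for name, (diff, rr, odds_ratio) in results.items():
    print(f"{name}: difference={diff:.2f}, relative risk={rr:.2f}, odds ratio={odds_ratio:.2f}")
print(f"Pooled {pooled_cells}: difference={pooled[0]:.2f}, "
      f"relative risk={pooled[1]:.2f}, odds ratio={pooled[2]:.2f}")
```

With these counts the pooled cells are 200, 200, 200, and 400, which gives an odds ratio of 2 even though the odds ratio is 1 within each country. That contrast is one reason to examine the two-way tables at each level of the third variable.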

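## Models for Examining How Characteristics of Individuals Affect Choices

Logistic Regression

$$\operatorname{logit}[\pi(x)] = \log\frac{\pi(x)}{1-\pi(x)} = \alpha + \beta x$$

$$\pi(x) = \frac{\exp(\alpha + \beta x)}{1 + \exp(\alpha + \beta x)} = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$$

Ordered Logistic Regression

$$P(Y \le j) = \pi_1 + \cdots + \pi_j, \qquad j = 1, \ldots, J$$

$$\operatorname{logit}[P(Y \le j)] = \log\frac{P(Y \le j)}{1 - P(Y \le j)} = \log\frac{\pi_1 + \cdots + \pi_j}{\pi_{j+1} + \cdots + \pi_J}, \qquad j = 1, \ldots, J-1$$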

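## Models for Examining How Characteristics of Individuals Affect Choices (Cont.)

Multinomial Logistic Regression

$$\log\frac{\pi_j}{\pi_J} = \alpha_j + \beta_j x, \qquad j = 1, \ldots, J-1$$

$$\log\frac{\pi_a}{\pi_b} = \log\frac{\pi_a}{\pi_J} - \log\frac{\pi_b}{\pi_J} = (\alpha_a - \alpha_b) + (\beta_a - \beta_b)\,x$$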
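To make the multinomial model concrete, the following sketch turns baseline-category logits into category probabilities. It assumes the usual form log(π_j/π_J) = α_j + β_j x with the last category J as the baseline. The parameter values are hypothetical and are not estimates from this presentation.

```python
import math

def multinomial_probabilities(alphas, betas, x):
    """Category probabilities implied by baseline-category logits.

    alphas[j] and betas[j] define log(pi_j / pi_J) = alpha_j + beta_j * x for the
    J - 1 non-baseline categories; the baseline category J has logit 0.
    """
    logits = [a + b * x for a, b in zip(alphas, betas)] + [0.0]
    denom = sum(math.exp(z) for z in logits)
    return [math.exp(z) / denom for z in logits]

# Hypothetical parameters for a three-category outcome (illustration only)
alphas = [0.5, -1.0]
betas = [0.2, 0.4]
x = 1.0

probs = multinomial_probabilities(alphas, betas, x)

# The log ratio of any two categories is the difference of their baseline logits
log_ratio = math.log(probs[0] / probs[1])
expected = (alphas[0] - alphas[1]) + (betas[0] - betas[1]) * x

print("probabilities:", [round(p, 4) for p in probs])
print("log(pi_1/pi_2):", round(log_ratio, 4), "expected:", round(expected, 4))
```

The last two lines illustrate why the choice of baseline category does not change the fitted probabilities: any pairwise comparison can be recovered from the baseline logits.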

## Relations among These Three Models

- Ordered logistic regression and multinomial logistic regression are extensions of logistic regression.
- Both ordered and multinomial logistic regression can be treated as models that simultaneously estimate a series of logistic regressions.
- Ordered logistic regression assumes different intercepts but the same slope for different categories, while multinomial logistic regression assumes different intercept and slope parameters for different categories.
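The "different intercepts, same slope" structure of ordered logistic regression can be sketched as follows. The code assumes the cumulative-logit form logit[P(Y ≤ j)] = α_j + βx with increasing intercepts. The numbers are hypothetical, and sign conventions for the slope differ across software, so this should be read as an illustration and not as the parameterization of any particular package.

```python
import math

def inv_logit(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probabilities(intercepts, beta, x):
    """Category probabilities from cumulative logits that share one slope.

    Assumes logit[P(Y <= j)] = intercepts[j] + beta * x for j = 1..J-1,
    with intercepts in increasing order.
    """
    cumulative = [inv_logit(a + beta * x) for a in intercepts] + [1.0]
    probs = [cumulative[0]]
    probs += [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]
    return probs

intercepts = [-1.0, 0.5, 2.0]   # hypothetical: one intercept per cumulative split
beta = 0.3                      # hypothetical: a single slope shared by all splits

probs = ordered_logit_probabilities(intercepts, beta, x=1.0)
print([round(p, 4) for p in probs])
```

A four-category outcome needs three intercepts but only one slope here, whereas the multinomial model would use three intercepts and three slopes.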

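## A List of Variables in the Data

| Variable name | Variable label | Value | Value label |
|---|---|---|---|
| aid | ID | 57101310-99719978 | |
| married | Marital Status | 0 | Not married |
| | | 1 | Married |
| educ | Education | 1 | Less than High School |
| | | 2 | High School |
| | | 3 | Some college |
| | | 4 | College or more |
| union | Union | | |
| female | Female | | |
| age | Age | 24-33 | |
| agesq | Age squared | 576-1089 | |
| femaleage | Interaction term of female and age | 0-33 | |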

## Data for Logistic Regression, Ordered Logistic Regression, and Multinomial Logistic Regression

[Data listing not recoverable from the transcription.]

## Stata Commands

Logistic Regression

    logit married female age femaleage
    logit married female age femaleage, or

Ordered Logistic Regression

    ologit educ female age femaleage
    ologit educ female age femaleage, or

Multinomial Logistic Regression

    mlogit union female age femaleage, base(0)

## SAS Commands

Logistic Regression

    Proc Logistic data=in.annotated 3 2;
    Format married marriedf. educ educf.;
    Model married = educ female age femaleage;
    run;

## SAS and Stata Commands

Ordered Logistic Regression (SAS)

    Proc Logistic data=in.annotated descending;
    Format educ educf. female femalef.;
    Model educ = female age femaleage;
    run;

    PROC QLIM data=in.annotated;
    MODEL educ = female age femaleage / DISCRETE (DIST=LOGISTIC);
    RUN;

Multinomial Logistic Regression (Stata)

    mlogit union educ female age femaleage, base(0)

## SAS and Stata Commands

Multinomial Logistic Regression (SAS)

    proc logistic data=in.annotated 3 2;
    class union (ref="0");
    model union = educ female age femaleage / link=glogit;
    run;

## Interpreting the Results

- The sample size
- The reference category
- The regression coefficients
- The odds ratio

## Predicted Probability

- Predicted probability is useful for describing the results.
- Odds = Exp(the sum of coefficients, each multiplied by the value of its variable)
- Predicted Probability = Odds/(1 + Odds)
- You can present predicted probabilities with graphs.
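The two formulas above can be sketched in Python as follows. The coefficients are hypothetical placeholders for a model with an intercept, female, age, and the female-by-age interaction. They are not the estimates from this presentation, and the helper names are illustrative.

```python
import math

def predicted_probability(coefficients, values):
    """Odds = exp(sum of coefficient * value); probability = odds / (1 + odds)."""
    linear_sum = sum(c * v for c, v in zip(coefficients, values))
    odds = math.exp(linear_sum)
    return odds / (1 + odds)

# Hypothetical coefficients, in the order: intercept, female, age, female*age
coefficients = [-3.0, 0.5, 0.1, -0.01]

def profile(female, age):
    """Variable values for one respondent profile (the intercept's value is 1)."""
    return [1, female, age, female * age]

p_male_30 = predicted_probability(coefficients, profile(0, 30))
p_female_30 = predicted_probability(coefficients, profile(1, 30))

print(f"male, age 30:   {p_male_30:.3f}")
print(f"female, age 30: {p_female_30:.3f}")
```

Evaluating the function over a range of ages for each gender would give the points for a predicted-probability graph.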

## Predicted Probability (continued)

Table 3. Predicted Probability for Male and Female Respondents

[Table values not recoverable from the transcription. Column groups: Female, Intercept, Age, Age*Female (with coefficient and value sub-columns), Sum of coefficients, Odds Ratio, Predicted Probability.]

## Predicted Probability (continued)

[Slide content not recoverable from the transcription.]

## Conclusions

- If you have categorical dependent variables, you need to choose adequate methods to analyze them.
- You need to choose the regression models that fit your data and research questions.
- If you have event counts (e.g., the number of accidents), you need to use other models, such as Poisson regression, log-linear models, or negative binomial regression.
- For additional help with categorical data analysis, feel free to contact me at wuh@bgsu.edu or 372-3119.
