Data Analysis with Stata Cheat Sheet

Transcription

Data Analysis with Stata Cheat Sheet
For more info, see Stata's reference manual (stata.com)

Results are stored as either r-class or e-class. See the Programming Cheat Sheet.
Examples use auto.dta (sysuse auto, clear) unless otherwise noted.

Summarize Data

univar price mpg, boxplot        (ssc install univar)
    calculate univariate summary with box-and-whiskers plot
stem mpg
    return stem-and-leaf display of mpg
summarize price mpg, detail
    calculate a variety of univariate summary statistics
ci mean mpg price, level(99)     (for Stata 13: ci mpg price, level(99))
    compute standard errors and confidence intervals
correlate mpg price
    return correlation or covariance matrix
pwcorr price mpg weight, star(0.05)
    return all pairwise correlation coefficients with sig. levels
mean price mpg
    estimates of means, including standard errors
proportion rep78 foreign
    estimates of proportions, including standard errors for categories identified in varlist
ratio
    estimates of ratio, including standard errors
total price
    estimates of totals, including standard errors

Statistical Tests

tabulate foreign rep78, chi2 exact expected
    tabulate foreign and repair record and return chi2 and Fisher's exact statistic alongside the expected values
ttest mpg, by(foreign)
    estimate t test on equality of means for mpg by foreign
prtest foreign == 0.5
    one-sample test of proportions
ksmirnov mpg, by(foreign) exact
    Kolmogorov–Smirnov equality-of-distributions test
ranksum mpg, by(foreign)
    equality tests on unmatched data (independent samples)
anova systolic drug              (webuse systolic, clear)
    analysis of variance and covariance
pwmean mpg, over(rep78) pveffects mcompare(tukey)
    estimate pairwise comparisons of means with equal variances; include multiple comparison adjustment

Declare Data

By declaring data type, you enable Stata to apply data munging and analysis functions specific to certain data types.

TIME SERIES
webuse sunspot, clear
tsset time, yearly
    declare sunspot data to be yearly time series
tsreport
    report time-series aspects of a dataset
generate lag_spot = L1.spot
    create a new variable of annual lags of sunspots
tsline spot
    plot time series of sunspots
arima spot, ar(1/2)
    estimate an autoregressive model with 2 lags

TIME-SERIES OPERATORS
L.     lag                               x(t-1)
F.     lead                              x(t+1)
D.     difference                        x(t) - x(t-1)
S.     seasonal difference               x(t) - x(t-1)
L2.    2-period lag                      x(t-2)
F2.    2-period lead                     x(t+2)
D2.    difference of difference          x(t) - x(t-1) - (x(t-1) - x(t-2))
S2.    lag-2 (seasonal difference)       x(t) - x(t-2)

USEFUL ADD-INS
tscollap        compact time series into means, sums, and end-of-period values
carryforward    carry nonmissing values forward from one obs. to the next
tsspell         identify spells or runs in time series

SURVEY DATA
webuse nhanes2b, clear
svyset psuid [pweight = finalwgt], strata(stratid)
    declare survey design for a dataset
svydescribe
    report survey-data details
svy: mean age, over(sex)
    estimate a population mean for each subpopulation
svy, subpop(rural): mean age
    estimate a population mean for rural areas
svy: tabulate sex heartatk
    report two-way table with tests of independence
svy: reg zinc c.age##c.age female weight rural
    estimate a regression using survey weights

SURVIVAL ANALYSIS
webuse drugtr, clear
stset studytime, failure(died)
    declare survival-time data
stsum
    summarize survival-time data
stcox drug age
    estimate a Cox proportional hazard model

PANEL / LONGITUDINAL
webuse nlswork, clear
xtset id year
    declare national longitudinal data to be a panel
xtdescribe
    report panel aspects of a dataset
xtsum hours
    summarize hours worked, decomposing standard deviation into between and within components
xtline ln_wage if id <= 22, tlabel(#3)
    plot panel data as a line plot (xtline plot: wage relative to inflation)
xtreg ln_w c.age##c.age ttl_exp, fe vce(robust)
    estimate a fixed-effects model with robust standard errors

1 Estimate Models (stores results as e-class)

regress price mpg weight, vce(robust)
    estimate ordinary least-squares (OLS) model of price on mpg and weight, apply robust standard errors
regress price mpg weight if foreign == 0, vce(cluster rep78)
    regress price only on domestic cars, cluster standard errors
rreg price mpg weight, genwt(reg_wt)
    estimate robust regression to eliminate outliers
probit foreign turn price, vce(robust)
    estimate probit regression with robust standard errors
logit foreign headroom mpg, or
    estimate logistic regression and report odds ratios
bootstrap, reps(100): regress mpg /*
    */ weight gear foreign
    estimate regression with bootstrapping
jackknife r(mean), double: sum mpg
    jackknife standard error of sample mean

ADDITIONAL MODELS (built-in Stata commands and user-written commands, e.g. ssc install ivreg2)
pca                   principal components analysis
factor                factor analysis
poisson, nbreg        count outcomes
tobit                 censored data
ivregress, ivreg2     instrumental variables
diff                  difference-in-difference
rd                    regression discontinuity
xtabond, xtdpdsys     dynamic panel estimator
teffects psmatch      propensity score matching
synth                 synthetic control analysis
oaxaca                Blinder–Oaxaca decomposition

2 Diagnostics (some are inappropriate with robust SEs)

estat hettest         test for heteroskedasticity
ovtest                test for omitted variable bias
vif                   report variance inflation factor
dfbeta(length)        calculate measure of influence
rvfplot, yline(0)     plot residuals against fitted values
avplots               plot all partial-regression leverage plots in one graph
Type help regress postestimation plots for additional diagnostic plots.

3 Postestimation (commands that use a fitted model)

regress price headroom length
    used in all postestimation examples
display _b[length]
display _se[length]
    return coefficient estimate or standard error for length from most recent regression model
margins, dydx(length)
    return the estimated marginal effect for length (returns e-class information when post option is used)
margins, eyex(length)
    return the estimated elasticity for price
predict yhat if e(sample)
    create predictions for sample on which model was fit
predict double resid, residuals
    calculate residuals based on last fit model
test headroom = 0
    test linear hypotheses that headroom estimate equals zero
lincom headroom - length
    test linear combination of estimates (headroom = length)

Estimation with Categorical & Factor Variables
more details at http://www.stata.com/manuals/u25.pdf

CONTINUOUS VARIABLES     measure something
CATEGORICAL VARIABLES    identify a group to which an observation belongs
INDICATOR VARIABLES      denote whether something is true or false

OPERATOR   DESCRIPTION                     EXAMPLE
i.         specify indicators              regress price i.rep78
               specify rep78 variable to be an indicator variable
ib.        specify base indicator          regress price ib(3).rep78
               set the third category of rep78 to be the base category
fvset      command to change base          fvset base frequent rep78
               set the base to most frequently occurring category for rep78
c.         treat variable as continuous    regress price i.foreign#c.mpg i.foreign
               treat mpg as a continuous variable and specify an interaction between foreign and mpg
o.         omit a variable or indicator    regress price io(2).rep78
               set rep78 as an indicator; omit observations with rep78 == 2
#          specify interactions            regress price mpg c.mpg#c.mpg
               create a squared mpg term to be used in regression
##         specify factorial interactions  regress price c.mpg##c.mpg
               create all possible interactions with mpg (mpg and mpg^2)

Tim Essam (tessam@usaid.gov), Laura Hughes (lhughes@usaid.gov)
follow us @StataRGIS and @flaneuseks
inspired by RStudio's awesome Cheat Sheets
b.io/StataTraining
Disclaimer: we are not affiliated with Stata. But we like it.
updated July 2019
CC BY 4.0
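The time-series operators listed in this sheet can be checked directly against their definitions. The following is a minimal sketch, assuming the sunspot example data is reachable through webuse; the variable names diff_spot and diff2_spot are illustrative and do not come from the cheat sheet.

```stata
* Sketch: declare a yearly series, then compare L., D. and D2. with their definitions
webuse sunspot, clear
tsset time, yearly                       // tsset also sorts the data by time

generate double lag_spot   = L.spot      // x(t-1)
generate double diff_spot  = D.spot      // x(t) - x(t-1)
generate double diff2_spot = D2.spot     // difference of the difference

* D. equals the value minus its own lag wherever the lag exists
assert diff_spot == spot - lag_spot if !missing(lag_spot)

* D2. equals the first difference minus the lagged first difference
assert diff2_spot == diff_spot - L.diff_spot if !missing(L.diff_spot)

* the first observation has no predecessor, so its lag is missing
assert missing(lag_spot) in 1

list time spot lag_spot diff_spot diff2_spot in 1/5
```

The operators only work once the data has been declared with tsset (or xtset), which is why the declaration comes first.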

Programming with Stata Cheat Sheet
For more info, see Stata's reference manual (stata.com)

Building Blocks: basic components of programming

R- AND E-CLASS: Stata stores calculation results in two* main classes:
r    return results from general commands such as summarize or tabulate
e    return results from estimation commands such as regress or mean
* there's also s- and n-class

To assign values to individual variables use:
1 SCALARS     individual numbers or strings
2 MATRICES    rectangular array of quantities or expressions
3 MACROS      pointers that store text (global or local)

1 Scalars (both r- and e-class results contain scalars)

scalar x1 = 3
    create a scalar x1 storing the number 3
scalar a1 = "I am a string scalar"
    create a scalar a1 storing a string
Scalars can hold numeric values or arbitrarily long strings.

2 Matrices (e-class results are stored as matrices)

matrix a = (4\ 5\ 6)
    create a 3 x 1 matrix
matrix b = (7, 8, 9)
    create a 1 x 3 matrix
matrix d = b'
    transpose matrix b; store in d
matrix ad1 = a \ d
    row bind matrices
matrix ad2 = a , d
    column bind matrices
matselrc b x, c(1 3)             (findit matselrc)
    select columns 1 & 3 of matrix b & store in new matrix x
mat2txt, matrix(ad1) saving(textfile.txt) replace      (ssc install mat2txt)
    export a matrix to a text file

DISPLAYING & DELETING BUILDING BLOCKS
[scalar | matrix | macro | estimates] [list | drop] b
    list contents of object b or drop (delete) object b
[scalar | matrix | macro | estimates] dir
    list all defined objects for that class
matrix list b     list contents of matrix b
matrix dir        list all matrices
scalar drop x1    delete scalar x1

3 Macros (public or private variables storing text)

GLOBALS (PUBLIC): available through Stata sessions
global pathdata "C:/Users/SantasLittleHelper/Stata"
    define a global variable called pathdata
cd $pathdata
    change working directory by calling global macro (add a $ before calling a global macro)
global myGlobal price mpg length
summarize $myGlobal
    summarize price mpg length using global

LOCALS (PRIVATE): available only in programs, loops, or do-files
local myLocal price mpg length
    create local variable called myLocal with the strings price mpg and length
summarize `myLocal'
    summarize contents of local myLocal (add a ` before and a ' after local macro name to call)
levelsof rep78, local(levels)
    create a sorted list of distinct values of rep78, store results in a local macro called levels
local varLab: variable label foreign
    store the variable label for foreign in the local varLab (can also do with value labels)

TEMPVARS & TEMPFILES: special locals for loops/programs
tempvar temp1
    initialize a new temporary variable called temp1
generate `temp1' = mpg^2
    save squared mpg values in temp1
summarize `temp1'
    summarize the temporary variable temp1
tempfile myAuto
save `myAuto'
    create a temporary file to be used within a program
see also tempname

4 Access & Save Stored r- and e-class Objects

Many Stata commands store results in types of lists. To access these, use return or ereturn commands. Stored results can be scalars, macros, matrices, or functions.

summarize price, detail          (r-class)
return list
    returns a list of scalars
    scalars: r(N) = 74, r(mean) = 6165.25..., r(Var) = 8699525.97..., r(sd) = 2949.49...
generate p_mean = r(mean)
    create a new variable equal to average of price

mean price                       (e-class)
ereturn list
    returns list of scalars, macros, matrices, and functions
    scalars: e(df_r) = 73, e(N_over) = 1, e(N) = 74, e(k_eq) = 1, e(rank) = 1
generate meanN = e(N)
    create a new variable equal to obs. in estimation command

Results are replaced each time an r-class / e-class command is called.

preserve    create a temporary copy of active dataframe
restore     restore temporary copy to point last preserved
Set restore points to test code that changes data.

ACCESSING ESTIMATION RESULTS
After you run any estimation command, the results of the estimates are stored in a structure that you can save, view, compare, and export.

regress price weight
estimates store est1
    store previous estimation results est1 in memory for later use (use estimates store to compile results)
eststo est2: regress price weight mpg                  (ssc install estout)
eststo est3: regress price weight mpg foreign
    estimate two regression models and store estimation results
estimates table est1 est2 est3
    print a table of the three estimation results est1, est2, and est3

EXPORTING RESULTS
The estout and outreg2 packages provide numerous flexible options for making tables after estimation commands. See also putexcel and putdocx commands.

esttab est1 est2, se star(* 0.10 ** 0.05 *** 0.01) label
    create summary table with standard errors and labels
esttab using "auto_reg.txt", replace plain se
    export summary table to a text file, include standard errors
outreg2 [est1 est2] using "auto_reg2.txt", see replace
    export summary table to a text file using outreg2 syntax

Loops: Automate Repetitive Tasks

ANATOMY OF A LOOP (see also while)
Stata has three options for repeating commands over lists or values: foreach, forvalues, and while. Though each has a different first line, the syntax is consistent:

foreach x of varlist var1 var2 var3 {
    command `x', option
}

x                   temporary variable used only within the loop
var1 var2 var3      objects to repeat over
`x'                 requires local macro notation
{                   open brace must appear on first line
command ...         command(s) you want to repeat; can be one line or many
}                   close brace must appear on final line by itself

FOREACH: REPEAT COMMANDS OVER STRINGS, LISTS, OR VARIABLES
foreach x in|of [ local, global, varlist, newlist, numlist ] {
    Stata commands referring to `x'
}
list types: objects over which the commands will be repeated
Loops repeat the same command over different arguments.

STRINGS
foreach x in auto.dta auto2.dta {
    sysuse "`x'", clear
    tab rep78, missing
}
same as:
sysuse "auto.dta", clear
tab rep78, missing
sysuse "auto2.dta", clear
tab rep78, missing

LISTS
foreach x in "Dr. Nick" "Dr. Hibbert" {
    display length("`x'")
}
same as:
display length("Dr. Nick")
display length("Dr. Hibbert")
When calling a command that takes a string, surround the macro name with quotes.

VARIABLES
foreach x in mpg weight {
    summarize `x'
}
foreach in takes any list as an argument with elements separated by spaces.

foreach x of varlist mpg weight {
    summarize `x'
}
foreach of requires you to state the list type, which makes it faster.

Both are the same as:
summarize mpg
summarize weight

FORVALUES: REPEAT COMMANDS OVER LISTS OF NUMBERS
forvalues i = 10(10)50 {
    display `i'
}
i is the iterator; 10(10)50 gives the numeric values over which the loop will run.
same as:
display 10
display 20
...

ITERATORS
i = 10/50          10, 11, 12, ...
i = 10(10)50       10, 20, 30, ...
i = 10 20 to 50    10, 20, 30, ...

DEBUGGING CODE
Use the display command to show the iterator value at each step in the loop.
set trace on (off)
    trace the execution of programs for error checking
see also capture and scalar _rc

PUTTING IT ALL TOGETHER
sysuse auto, clear
generate car_make = word(make, 1)
    pull out the first word from the make variable
levelsof car_make, local(cmake)
    calculate unique groups of car_make and store in local cmake
local i = 1
    define the local i to be an iterator
local cmake_len : word count `cmake'
    store the length of local cmake in local cmake_len
foreach x of local cmake {
    display in yellow "Make group `i' is `x'"
    if `i' == `cmake_len' {
        display "The total number of groups is `i'"
    }
    local i = `++i'
}
The if statement tests the position of the iterator and executes the contents in brackets when the condition is true. The last line of the loop increments the iterator by one.

Additional Programming Resources
bit.ly/statacode
    download all examples from this cheat sheet in a do-file
adoupdate
    update user-written ado-files
adolist                          (ssc install adolist)
    list/copy user-written ado-files
net install package, from …aster)
    install a package from a Github repository
…anced
    configure Sublime Text for Stata 11–15
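Because r-class results are replaced each time a new command runs, a loop has to copy them somewhere before the next iteration. The following is a minimal sketch of that pattern using a matrix as the container; the names vars, means, and j are illustrative choices, not part of the cheat sheet.

```stata
* Sketch: loop over variables and save each r(mean) before it is overwritten
sysuse auto, clear

local vars price mpg weight              // illustrative list of variables
local n : word count `vars'              // number of elements in the list
matrix means = J(1, `n', .)              // 1 x n matrix filled with missing values
matrix colnames means = `vars'

local j = 1
foreach v of local vars {
    quietly summarize `v'
    matrix means[1, `j'] = r(mean)       // copy now; the next summarize replaces r()
    local ++j                            // increment the column counter
}

matrix list means
```

Stating the list type (foreach v of local vars) follows the sheet's advice that foreach ... of is the faster form.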

Data Processing with Stata Cheat Sheet
For more info, see Stata's reference manual (stata.com)

Useful Shortcuts

keyboard buttons
F2           describe data
Ctrl + 8     open the data editor
Ctrl + 9     open a new do-file
Ctrl + D     highlight text in do-file, then Ctrl + D executes it in the command line
clear        delete data in memory

AT COMMAND PROMPT
PgUp PgDn    scroll through previous commands
Tab          autocompletes variable name after typing part
cls          clear the console (where results are displayed)

Set up

pwd
    print current (working) directory
cd "C:\Program Files\Stata16"
    change working directory
dir
    display filenames in working directory
dir *.dta
    list all Stata data in working directory
capture log close
    close the log on any existing do-files (underlined parts are shortcuts: use "capture" or "cap")
log using "myDoFile.txt", replace
    create a new log file to record your work and results
search mdesc
    find the package mdesc to install (packages contain extra commands that expand Stata's toolkit)
ssc install mdesc
    install the package mdesc; needs to be done once

Import Data

sysuse auto, clear
    load system data (auto data); for many examples, we use the auto dataset
use "yourStataFile.dta", clear
    load a dataset from the current directory
import excel "yourSpreadsheet.xlsx", /*
    */ sheet("Sheet1") cellrange(A2:H11) firstrow
import delimited "yourFile.csv", /*
    */ rowrange(2:11) colrange(1:8) varnames(2)
import sas "yourSASfile.sas7bdat", bcat("value labels file")
import spss "yourSPSSfile.sav"
see help import for more options
webuse set …ster/Day2/Data"
webuse "wb_indicators_long"
    set web-based directory and load data from the web

Basic Syntax

All Stata commands have the same format (syntax):

[by varlist1:] command [varlist2] [=exp] [if exp] [in range] [weight] [using filename] [, options]

by varlist1:       apply the command across each unique combination of variables in varlist1
command            function: what are you going to do to varlists?
varlist2           column to apply command to
=exp               save output as a new variable
if exp             condition: only apply the function if something is true
in range           apply to specific rows
weight             apply weights
using filename     pull data from a file (if not loaded)
, options          special options for command

In this example, we want a detailed summary with stats like kurtosis, plus mean and median:

bysort rep78 : summarize price if foreign == 0 & price <= 9000, detail

To find out more about any command, like what options it takes, type help command.

Basic Data Operations

Arithmetic
+     add (numbers), combine (strings)
-     subtract
*     multiply
/     divide
^     raise to a power

Logic
&           and
! or ~      not
|           or
==          tests if something is equal
=           assigns a value to a variable
!= or ~=    not equal
<           less than
<=          less than or equal to
>           greater than
>=          greater or equal to

make             foreign    price
Chevy Colt       0          3,984
Buick Riviera    0          10,372
Honda Civic      1          4,499
Volvo 260        1          11,995

if foreign != 1 & price >= 10000     selects Buick Riviera
if foreign != 1 | price >= 10000     selects Chevy Colt, Buick Riviera, and Volvo 260

Explore Data

VIEW DATA ORGANIZATION
describe make price
    display variable type, format, and any value/variable labels
count
count if price > 5000
    number of rows (observations); can be combined with logic
ds, has(type string)
lookfor "in."
    search for variable types, variable name, or variable label
isid mpg
    check if mpg uniquely identifies the data

SEE DATA DISTRIBUTION
codebook make price
    overview of variable type, stats, number of missing/unique values
summarize make price mpg
    print summary statistics (mean, stdev, min, max) for variables
inspect mpg
    show histogram of data and number of missing or zero observations
histogram mpg, frequency
    plot a histogram of the distribution of a variable

BROWSE OBSERVATIONS WITHIN THE DATA
browse or Ctrl + 8
    open the data editor
Missing values are treated as the largest positive number. To exclude missing values, ask whether the value is less than ".".
list make price if price > 10000 & !missing(price)
clist ... (compact form)
    list the make and price for observations with price > $10,000
display price[4]
    display the 4th observation in price; only works on single values
gsort price mpg        (ascending)
gsort -price -mpg      (descending)
    sort in order, first by price then miles per gallon
duplicates report
    finds all duplicate values in each variable
assert price != .
    verify truth of claim
levelsof rep78
    display the unique values for rep78

Change Data Types

Stata has 6 data types, and data can also be missing:
missing                    no data
byte                       true/false
int long float double      numbers
string                     words

To convert between numbers & strings:
gen foreignString = string(foreign)        1 becomes "1"
tostring foreign, gen(foreignString)       1 becomes "1"
decode foreign, gen(foreignString)         1 becomes "foreign"
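The note that missing values are treated as the largest positive number is easy to overlook in if conditions. The following is a minimal sketch, assuming auto.dta, where rep78 contains some missing values; the local names n_naive, n_clean, and n_miss are illustrative.

```stata
* Sketch: a "greater than" condition silently includes missing values
sysuse auto, clear

quietly count if rep78 > 4                      // missing (.) also counts as > 4
local n_naive = r(N)

quietly count if rep78 > 4 & !missing(rep78)    // exclude missing explicitly
local n_clean = r(N)

quietly count if missing(rep78)
local n_miss = r(N)

* the gap between the two counts is exactly the number of missing values
assert `n_naive' == `n_clean' + `n_miss'

display "rep78 > 4, naive count: `n_naive'; excluding missing: `n_clean'"
```

Writing the condition as rep78 > 4 & rep78 < . is an equivalent way to exclude missing values.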

