Stata Tutorial 14 Final - Princeton University


STATA 14 Tutorial
by Manfred W. Keil
to Accompany
Introduction to Econometrics, 4th Edition (2018)
by James H. Stock and Mark W. Watson

1. STATA: INTRODUCTION
2. CROSS-SECTIONAL DATA
   Interactive Use: Data Input and Simple Data Analysis
      a) The Easy and Tedious Way: Manual Data Entry
      b) Summary Statistics
      c) Graphical Presentations
      d) Simple Regression
      e) Entering Data from a Spreadsheet
      f) Importing Data Files directly into STATA
      g) Multiple Regression Model
      h) Data Transformations
   Batch (Do-Files)
3. SUMMARY OF FREQUENTLY USED STATA COMMANDS
4. FINAL

1. STATA: INTRODUCTION

This tutorial will introduce you to a statistical and econometric software package called STATA. It covers some of the most commonly used features in STATA. These features were used by the authors of your textbook to generate the statistical analysis reported in Chapters 3-9 (Stock and Watson, 2018). The tutorial provides the necessary background to reproduce the results of Chapters 3-9 and to carry out related exercises. It does not cover panel data (Chapter 10), binary dependent variables (Chapter 11), instrumental variable analysis (Chapter 12), time-series analysis (Chapters 15-17), or the estimates presented in Big Data (Chapter 14).

The most current professional version is STATA 15. STATA 13 and STATA 14 are sufficiently similar that those who only have access to STATA 13 can also use this tutorial. As with many statistical packages, newer versions of a program allow you to use more advanced and recently developed techniques that you, as a first-time user, most likely will not encounter in a first course of statistics or econometrics. There are several versions of STATA 14, such as STATA/IC, STATA/SE, and STATA/MP. They differ mainly in the number of variables STATA can handle and the speed at which information is processed. Most users will probably work with the “Intercooled” (IC) version.

STATA runs on Windows, Mac, and Unix platforms. I assume most of you will be using STATA on Windows computers. It is produced by StataCorp in College Station, TX. You can read about various product information at the firm’s Web site, www.stata.com. There are 21 subject-specific statistics reference manuals in addition to the general reference manuals (User’s Guide, Base, Data Management, Graphics, Functions), all of which can be downloaded with STATA 15 (STATA 14 is not that different as far as you, as a beginner, are concerned).
Perhaps the most useful of these are the User’s Guide and the Base Reference Manuals. You can order STATA by calling (800) 782-8272 or writing to service@stata-press.com. In addition, if you purchase the Student Version, you can acquire STATA at a steep discount. Prices vary, but you could get a “perpetual license” for STATA/IC for $198, or a six-month license for as low as $45.

Econometrics deals with three types of data: cross-sectional data, time series data, and panel (longitudinal) data (see Chapter 1 of Stock and Watson (2018)). In a cross-section you analyze data from multiple entities at a single point in time. In a time series you observe the behavior of a single entity over multiple time periods. This can range from high frequency data such as financial data (hours, days); to data observed at somewhat lower (monthly) frequencies, such as industrial production, inflation, and unemployment rates; to quarterly data (GDP) or annual (historical) data. One big difference between cross-sectional and time series analysis is that the order of the observation numbers does not matter in cross-sections. With time series, you would lose some of the most interesting features of the data if you shuffled the observations. Finally, panel data can be viewed as a combination of cross-sectional and time series data, since multiple entities are observed at multiple time periods. STATA allows you to work with all three types of data.

STATA is most commonly used for cross-sectional and panel data in academics, business, and government, but you can work with it relatively easily when you analyze time-series data.

STATA allows you to store results within a program and to “retrieve” these results for further calculations later. Remember how you calculated confidence intervals in statistics, say for a population mean? Basically you needed the sample mean, the standard error, and some value from a statistical table. In STATA, you can calculate the mean and standard deviation of a sample and then temporarily “store” these. You then work with these numbers in a standard formula for confidence intervals. In addition, STATA provides the required numbers from the relevant distribution (normal, $\chi^2$, F, etc.).

While STATA is truly “interactive,” you can also run a program in “batch” mode.

- Interactive use: you type a STATA command in the STATA Command Window (see below) and hit the Return/Enter key on your keyboard. STATA executes the command and the results are displayed in the STATA Results Window. Then you enter the next command, STATA executes it, and so forth, until the analysis is complete. Even the simplest statistical analysis typically will involve several STATA commands.
- Batch mode: all of the commands for the analysis are listed in a file, and STATA is told to read the file and execute all of the commands. These files are called Do-Files and are saved using a .do suffix.

In the good old days, the equivalent of writing a Do-File was to submit a “batch” of cards, each card containing a single command (now line), to a technician, who would use a card reader to enter these into the computer.
The computer would then execute the sequence of statements. (You typically stored this batch of cards in a filing cabinet, and the deck was referred to as a “file.”) While you will work at first in interactive mode by clicking on buttons or writing single-line commands, you will very soon discover the advantage of running your regressions in batch mode. This method allows you to see the history of commands, and you can also analyze where exactly things went wrong if there are problems (“errors”) with any of your commands. This tutorial will initially explain the interactive use of STATA since it is more intuitive. However, we will switch into batch mode as soon as it makes sense, and you should seriously try to do your research/class work using this mode (“Do-Files”).

STATA produces highly professional looking graphs and charts. However, it requires some practice to generate these. A separate manual (Graphics) is devoted to this topic alone. Since STATA works in a Windows format, it allows you to cut and paste the data into other Windows-based programs, such as Word or WordPerfect.

Finally, there is a warning about the limitations of this tutorial. The purpose is to help you gain an initial understanding of how to work with STATA. I hope that the tutorial looks less daunting than the manuals. However, it cannot replace the accompanying manuals, which you will have to consult for more detailed questions (alternatively use “Help” within the program). Feel free to provide me with feedback on how the tutorial can be improved for future generations of students (mkeil@cmc.edu). Colleagues of mine and I have decided to set up a “Wiki” run by students but supervised by faculty at my academic institution. We have found that the “wisdom of crowds” often produces valuable information for those who follow. This is, of course, just a suggestion.

Finally, you may want to think about working with statistical software as learning a new language: practicing it routinely will result in improvement. If you set it aside for too long, you will only remember the most important lines but will forget the important details. Another danger of tutorials like this is that you simply follow the instructions and, when you are done, you do not remember the commands. It is therefore a good idea to keep a separate sheet and to write down commands and examples of them if you think you will use them later. I will give you short exercises so that you can practice the commands on your own. At the end of this tutorial, I have provided a summary of selected STATA commands.

2. CROSS-SECTIONAL DATA

Interactive Use: Data Input and Simple Data Analysis

Let’s get started. Click on the STATA icon to begin your session, or choose STATA 14 from your START window. Once you have started STATA, you will see a large window containing several smaller windows. At this point you can load a data set or enter data (described below) and begin the statistical analysis.

The results of your various operations will be displayed in the so-called Results Window. On the bottom right, there is a Variables Window, which shows the names of variables currently active in the datafile. Above it is the Review Window, which lets you view previously used STATA commands. In interactive use, STATA allows you to execute commands either by clicking on command buttons or by typing the equivalent command into the Command Window on the bottom of the initial page.

In this tutorial, we will work with two cross-sectional data applications: the California Test Score Data Set used in Chapters 4-9 and, as an exercise, the Current Population Survey Data Set used in Chapters 3 and 8.

a) The Easy and Tedious Way: Manual Data Entry

In Chapters 4 to 9 you will work with the California Test Score Data Set. These are cross-sectional data. There are 420 observations from K-6 and K-8 school districts for the years 1998 and 1999. You will not want to enter a large amount of data manually, since it is tedious and leaves room for human error. As a result, it is generally not a recommended method of inputting data. However, there are occasions when you have collected data by yourself (something that economists are doing more and more). The alternative is to enter the data into a spreadsheet (Excel) and then to cut and paste the data (see below).

Entering data manually is used here for pedagogical purposes since it gives you an initial understanding of how to work with data in STATA. In other words, it is useful to become familiar with entering and editing data in the program. Here I will use a sub-sample of 10 observations from the California Test Score Data Set.

To start, click on the Data Editor (Edit) button on the toolbar, or type the command edit into the Command Window. This will open the following screen:

To enter data manually, start typing in the observations (no need to label the columns now; you will name the variables subsequently). Here I have chosen 10 observations of test scores (testscr) and the student-teacher ratio (str) from the data set you will use in Chapter 4 of the textbook (type in the numbers for all three columns).

[Table: the 10 observations of obs, testscr, and str to be typed into the three columns]

After entering the data, double-click the grey box at the top of the first column (the box directly above the blue one in the above picture). This will make the following box appear:

In the Name box, replace var1 (school) with the name of the first column variable, here obs. Do a similar operation for the second column, that is, rename var2 as testscr. In the Label box, you may want to enter information that helps you remember how the data was created originally, or that informs others who may subsequently work with your data. I suggest you enter here

    Avg test score (= (read_scr + math_scr)/2)

Similarly, you could enter for the third variable, str,

    Student teacher ratio (enrl_tot/teachers)

After completing this task, the Data Editor screen should look as follows:
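The same entry, renaming, and labeling steps can also be typed as commands instead of being clicked through in the Data Editor. The following is only a minimal sketch: the three rows of numbers are hypothetical placeholders (they are not the tutorial's 10-district sub-sample), and the label for obs is borrowed from the graphing section later in the tutorial.

```stata
* Minimal sketch: manual data entry from the Command Window or a do-file.
* The values below are HYPOTHETICAL placeholders, not the tutorial's data.
clear
input var1 var2 var3
1 650.5 19.5
2 640.2 20.1
3 655.0 18.7
end

* Rename the default column names, as done in the Name box
rename var1 obs
rename var2 testscr
rename var3 str

* Attach descriptive labels, as done in the Label box
label variable obs "School District No."
label variable testscr "Avg test score (= (read_scr + math_scr)/2)"
label variable str "Student teacher ratio (enrl_tot/teachers)"

* Check the result
describe
list
```

Typing the data this way leaves a record of exactly what was entered, which makes data entry errors easier to trace than edits made by hand in the Data Editor.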

Next close the box. Note that your commands to edit the data now appear in the Results Box, your command to edit is listed in the Command Box, and your newly created variables are shown in the variable list on the upper right-hand side:

Entering data in this way is very tedious, and you will make data input errors frequently. You will see below how to enter data directly from a spreadsheet or an ASCII file, which are the most common forms of data you will receive in the future.

In general, you can look at variables that already exist by typing in the command

    list varname1 varname2

where varnamei refers to a variable that exists in your workfile. Try it here by typing

    list testscr str

This command will list, one screen at a time, the data on the variables for every observation in the data set. (Missing values are denoted by a period or “.” in STATA.) Later on, you will work with large data sets, and you will probably not want to see all observations. You can imagine how long this may take with 5,000 observations or more. Failing to look at the data observation by observation of course takes away the ability to spot errors in the data set, perhaps generated by others during data entry. However, there are other methods to spot such problems, such as summarizing the data.

You can always stop the listing by hitting the break button on the toolbar (it looks like a red pentagon with a white “x” in the middle). This button can be used to stop the execution of any command in STATA.

You should see the following:

b) Summary Statistics

For the moment, let’s just see if we are working with the same data set. Type in the following command

    sum testscr str, detail

sum stands for “summarize” and the option detail gives you a more extensive list of summary statistics for each of the variables you have entered. These include the median and certain percentiles of the frequency distribution. You will learn later that you can also obtain summary statistics for a subset of your data by adding an if or in qualifier following the variable name.
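Section 1 mentioned that STATA stores results so that you can retrieve them for later calculations, using a confidence interval for a population mean as the example. A hedged sketch of that idea follows. So that it runs on any installation, it uses the auto data set shipped with STATA rather than the tutorial's data (an assumption made for illustration; substitute testscr or str once your own data are loaded). The scalar names are made up for this example, and the critical value is taken from the t distribution, whereas the text mentions a statistical table more generally.

```stata
* Sketch: reuse the results that summarize stores in r() to build a 95% CI.
* Uses the auto data shipped with Stata (an assumption for illustration only).
sysuse auto, clear

summarize mpg, detail

* summarize leaves its results behind in r(): r(N), r(mean), r(sd), ...
scalar n_obs   = r(N)
scalar xbar    = r(mean)
scalar se_mean = r(sd) / sqrt(r(N))

* Critical value from the t distribution with n-1 degrees of freedom
scalar tcrit = invttail(n_obs - 1, 0.025)

scalar ci_lo = xbar - tcrit * se_mean
scalar ci_hi = xbar + tcrit * se_mean
display "95% CI for the mean: [" ci_lo ", " ci_hi "]"

* Summary statistics for a subset of the data, using if and in
summarize mpg if foreign == 1
summarize mpg in 1/10
```

Copying r() results into scalars right away matters because the next command that stores results (here the later summarize calls) overwrites them.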

The summary statistics are explained in Chapter 2 of your textbook (for example, Kurtosis is defined in equation (2.15) on page 22 in Stock and Watson (2018)).

If your summary statistics differ, then check the data again. To return to the data observations, edit the data using the Data Editor or simply return to the other open window in the STATA program. Once you have located the data problem, click on the observation and change it. After correcting the problem, press the preserve button again.

Once you have entered the data, there are various things you can do with it. You may want to keep a hard copy of what you just entered. If so, click on the Print button. This will print the entire output of what you have produced so far.

In general, it is a good idea to save the data and your work frequently in some form. Many of us have learned through painful experiences how easy it is to lose hours of work by not backing up data/results in some fashion. To save the data set you created, either press the Save button or click on File and then Save As. Follow the usual Windows format for saving files (drives, directories, file type, etc.). If you save datasets in STATA readable format, then you should use the extension “.dta.” Once you have saved your work, you can call it up the next time you intend to use it by clicking on File and then Open. Try these operations by saving the current workfile under the name “SW14smpl.dta.”
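Saving and reopening a data set can also be done with commands rather than through the File menu. The sketch below assumes the auto data shipped with STATA and a hypothetical file name, SW14smpl_demo.dta, written to the current working directory; with your own data in memory you would skip the sysuse line and use the file name suggested above.

```stata
* Sketch: command equivalents of File > Save As and File > Open.
* The data set and the file name are placeholders for illustration.
sysuse auto, clear

* Save the data in memory in Stata format; replace overwrites an old copy
save "SW14smpl_demo.dta", replace

* Empty memory, then load the saved file again
clear
use "SW14smpl_demo.dta"
describe, short
```

The replace option is what lets you save repeatedly under the same name; without it, STATA refuses to overwrite an existing file.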

c) Graphical Presentations

Most often it is a good idea to generate graphs (“pictures”) to get some “feel” for the data. You will be able to detect outliers, which may be the result of data entry errors, or you will be able to see if the data “makes sense.” Although STATA offers many graphing options, we will only go through a few commonly used ones here.¹ There are three graphs that you will use most often:

- histograms or bar charts;
- line graphs, where one or more variables are plotted across entities (these will become more important in time series analysis when you are plotting variables over time);
- scatterplots (crossplots), where one variable is graphed against another.

The purpose of histograms is to display absolute or relative frequencies for a single variable. In general, the command is

    histogram varname, percent title( )

The ‘percent’ option produces relative frequencies, and the title option adds whatever name you place between ( ) to the top of the graph. You can either save the graph you have generated, or copy and paste it into another Windows-based document, such as Word. (Replacing ‘percent’ with ‘frequency’ would have resulted in absolute, rather than relative, frequencies being plotted; there are other options for you to explore, such as the number of classes (“bins”) to choose, etc.)

Try

    histogram testscr, percent title(Testscores)

¹ I found the following STATA site particularly useful for ics/gph/statagraphs.html

To create a line graph in a cross section, you can add a third variable in your data set which takes on the number of the observation (here: 1, 2, 3, ..., 10). Name it “obs” and label it “School District No.” Let’s plot the student-teacher ratio for the first 10 observations using the scatter command. The command is followed by the two variables you would like to see plotted, where the first one appears on the Y axis and the second one on the X axis.

    scatter varname1 varname2

plots variable 1 against variable 2. Try this with the student-teacher ratio and the just created variable obs.

    scatter str obs

The resulting graph just gives you the data points here. There are two ways to make this more informative. One is to connect the points by using the line command followed by the two variable names. Alternatively, you can use the twoway connected command to have both the points and the lines displayed. Try both here:

    line str obs
    twoway connected str obs

After the graph appears, you can edit it using the Graph Editor (either use File and then Start Graph Editor or push the Graph Editor button). Alter the graph until it looks like the one below.
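Titles and axis labels can also be set directly in the graph command, which is convenient once you move to Do-Files because the graph is then reproduced exactly each time. A minimal sketch, assuming the auto data shipped with STATA in place of the tutorial's data; the variable obs is created here as the observation number, and the titles and the file name connected_demo.gph are made up for illustration.

```stata
* Sketch: a connected line graph with titles set by options rather than
* in the Graph Editor. Data, titles, and file name are placeholders.
sysuse auto, clear

* Observation counter, playing the role of the tutorial's obs variable
generate obs = _n
label variable obs "Observation No."

* The /// continuation works in a do-file; in the Command Window,
* type the whole command on one line instead.
twoway connected mpg obs in 1/10, ///
    title("Mileage across the first 10 observations") ///
    ytitle("Mileage (mpg)") xtitle("Observation No.")

* Save the graph in Stata's own format so it can be reopened and edited
graph save "connected_demo.gph", replace
```

Anything set this way can still be adjusted afterwards in the Graph Editor.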

Let me help you get started and then you do the rest. We will begin with the x-axis. You can edit specific axis labels or numbers by first clicking what you would like to change. Click on the x-axis and a red box should surround the numbers. Then click the various options, such as tick numbers, labels, and grid lines. Some of the alterations can be made in the resulting dialog boxes.

Frequently you will be interested either in causal relationships between variables or in the ability of one variable to forecast another. As a result, it is a good idea to plot two variables in the same graph.

The first way to look for a relationship is to plot the observations of both variables. This can be done by generalizing the command twoway connected to include more than two variable names (the last one is plotted on the X axis, the others on the Y axis). Try this here with

    twoway connected str testscr obs

The resulting graph is pretty uninformative, since test scores and student-teacher ratios are on different scales. You can allow for two (or more) scales by entering the following command:

    twoway (scatter str obs, c(l) yaxis(1)) (scatter testscr obs, c(l) yaxis(2))

This command instructs STATA to use two Y axes, one for the student-teacher ratio on the left side of the graph, and the other for test scores on the right side of the graph. The option c(l), short for connect(l) with a lowercase letter l, connects the points with straight lines. You may want to “beautify” the resulting graph by using the graph editor. See if you can produce something like the graph below:

[Graph 2: Test Scores and Student-Teacher Ratio Across 10 School Districts. Left axis: Student teacher ratio; right axis: Avg test score; horizontal axis: School District; series: Student-Teacher Ratio, Avg Test Score]

To get an even better idea about the relationship, you can display a two-dimensional relationship in a scatterplot (see page 85-6 of your Stock and Watson (2018) textbook). Given our discussion above, you could simply use the command scatter testscr str. However, you may want to see what a fitted line through that scatter plot would look like, in which case you have to modify the command slightly:

    scatter testscr str || lfit testscr str

where ‘||’ is the key ‘|’ typed twice.

This will result in the following graph (after beautification):

[Graph 3: Scatterplot of Test Scores vs Student-Teacher Ratio. Vertical axis: Test Scores; horizontal axis: Student-Teacher Ratio; line: Fitted values]

(Not to worry about the positive slope here. Remember, this is a sample, and a very small one at that. After all, you may get 10 heads in 10 flips of a coin.)

d) Simple Regression

There is a commonly held belief among many parents that lower student-teacher ratios will result in better student performance. Consequently, in California, for example, all K-3 classes were reduced to a maximum student-teacher ratio of 20 (“Class Size Reduction Act,” CSR) in the late ’90s. This comes at a cost, of course. Initially, it was $1.8 billion a year. At such a high cost, the natural question arises whether or not it is worth it. That is why you are analyzing the effect of reducing student-teacher ratios in Chapters 4-9 of the Stock and Watson textbook.

For the 10 school districts in our sample, we seem to have found a positive relationship between class size and student performance. Not to worry: we will soon work with all 420 observations from the California School Data Set, and we will then find the negative relationship you have seen in the textbook. For now, we are more concerned about learning techniques in STATA.

In the previous section, we included a regression line in the scatterplot, something that you should have encountered towards the end of your statistics course. However, the graph of the regression line does not allow you to make quantitative statements about the relationship; you want to know the exact values of the slope and the intercept. For example, in general applications, you may want to predict the effect of an increase by one in the explanatory variable (here the student-teacher ratio) on the dependent variable (here the test scores).

To answer the questions relating to the more precise nature of the relationship between class size and student performance, you need to estimate the regression intercept and slope. A regression line is little else than fitting a line through the observations in the scatterplot according to some principle.
You could, for example, draw a line from the test score for the lowest student-teacher ratio to the test score for the highest student-teacher ratio, ignoring all the observations in between. Or you could sort the data by student-teacher ratio and split the sample in half so that the observations with the lowest five student-teacher ratios are in one set, and the observations with the highest five student-teacher ratios are in the other set. For each of the two sets you could calculate the average student-teacher ratio and the corresponding average test score, and then connect the two resulting points. Or you could just eyeball the relationship. Some of these principles have better properties than others to infer the true underlying (population) relationship from the given sample. The principle of ordinary least squares (OLS), for example, will give you desirable properties under certain restrictive assumptions that are discussed in Chapter 4 of the Stock/Watson textbook.

Back to computing. If the dependent variable, Y, is only determined by a single explanatory variable X in a linear fashion of the type

$$Y_i = \beta_0 + \beta_1 X_i + u_i, \qquad i = 1, 2, \ldots, N$$

with “u” representing the error, or random disturbance, not accounted for by the linear equation, then the task is to find a value for $\beta_0$ and $\beta_1$. If you had values for these coefficients, then $\beta_1$ describes the effect of a unit increase in X on Y. Often a regression line is a linear approximation to an underlying relationship and the intercept $\beta_0$ only has a useful meaning if observations around $X = 0$ occur in the data. As we have seen in the scatterplot above, there are no observations around the student-teacher ratio of zero, and it is therefore better not to interpret the numerical value of the intercept at all. Your professor most likely will give you a serious penalty in the exam for interpreting the intercept here because with no students present, there is no score to record. (What would be the function of the teacher in that case?)

There are various ways to estimate the regression line. The command for regressing a variable Y on a constant (intercept) and another variable X is:

    reg Y X

where “reg” stands for least squares regression. For the current application, type

    reg testscr str, r

where the “r” following the comma indicates that you are using heteroskedasticity-robust standard errors. (Even though you have not requested an intercept to be included, STATA will automatically do so. There is an option for you to suppress the intercept, but you will most likely never use it.)

The output appears as follows:

According to these results, lowering the student-teacher ratio by one student per class results in a decrease of 0.6 points, on average, in the district-wide test score. Using the notation of your textbook, you should display the results as follows:

$$\widehat{TestScore} = \underset{(51.1)}{618.9} + \underset{(2.33)}{0.61}\, STR, \qquad R^2 = 0.007, \qquad SER = 9.8$$
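After a regression, STATA keeps the estimates in memory, so the numbers needed for the textbook-style display do not have to be copied from the output by hand. The following sketch assumes the auto data shipped with STATA, with mpg and weight standing in for testscr and str; the variable name mpg_hat is made up for this example.

```stata
* Sketch: retrieve the stored estimates after a robust OLS regression.
* Uses the auto data shipped with Stata (an assumption for illustration).
sysuse auto, clear

regress mpg weight, robust

* Coefficients and standard errors are available as _b[] and _se[]
display "intercept = " _b[_cons]  "  (SE " _se[_cons]  ")"
display "slope     = " _b[weight] "  (SE " _se[weight] ")"

* Fit statistics are stored in e(); e(rmse) is the Root MSE, i.e. the SER
display "R-squared = " e(r2)
display "SER       = " e(rmse)
display "N         = " e(N)

* Effect on Y of a one-unit increase in X is the slope itself;
* for a 1,000-unit increase, scale it accordingly
display "effect of +1,000 lbs on mpg: " 1000 * _b[weight]

* Fitted values from the estimated line
predict mpg_hat, xb
```

With the tutorial's data in memory, replacing mpg and weight by testscr and str should give the numbers needed for the equation above.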

Note that the result for the 10 chosen school districts is quite different from the sample of all 420 school districts. However, this is a rather small sample and the regression $R^2$ is quite low. As a matter of fact, in Chapter 5 of your textbook, you will learn that the above slope coefficient is not statistically significant.

e) Entering Data from a Spreadsheet

So far you entered data manually. Most often you will work with larger data sets that are external to the STATA program, i.e., they will not be included in, or be part of, the program itself. This makes sense as data sets either become very large or are generated by another program, such as a spreadsheet.

Stock and Watson present the California Test Score Data Set in Chapter 4 of the textbook. Locate the corresponding Excel file caschool.xlsx on the accompanying web site (where you found this tutorial) and open it. Highlight all data and copy it. Next, start STATA and type the word “edit” into the command line. This will open the Data Editor. Make sure to select the grey box to the immediate right of “1” before pasting. Now paste the data into the new data editor, choosing the option “Treat First Row as Variable Names.” Note that STATA has now conveniently included the names of the variables in the Data Editor.

This is what you should see in STATA:

When you are done, you are ready to save the file. Name it caschool.dta.

You can now reproduce Equation (4.11) from the textbook. Use the regression command you previously learned to generate the following output.

    . reg testscr str, r

    Linear regression                               Number of obs     =        420
                                                    F(1, 418)         =      19.26
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.0512
                                                    Root MSE          =     18.581

    ------------------------------------------------------------------------------
                 |               Robust
         testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             str |  -2.279808   .5194892    -4.39   0.000    -3.300945   -1.258671
           _cons |    698.933   10.36436    67.44   0.000     678.5602    719.3057
    ------------------------------------------------------------------------------

(You can find the standard errors and the t-statistic on p. 139 of the Stock and Watson (2018) textbook. The regression $R^2$, sum of squared residuals (SSR), and standard error of the regression (SER) are presented in Section 4.3.)

f) Importing Data Files directly into STATA

Excel (Spreadsheet) Files

Even though the cut and paste method seemed straightforward enough, there is a second, more direct way to import data into STATA from Excel, which does not involve copying and pasting data points.

Start again with a new STATA file. In general, make sure your data is organized with the variable names in Row 1 of your spreadsheet, with each column representing a different variable, and the observations in the rows beneath the variable names. Then, save your data set in Excel (or an alternative spreadsheet program) as a .csv file (specifically “CSV (comma delimited)”; CSV stands for comma separated values). Next, type the following command into the command window in STATA:

    insheet using filename

where filename is the directory location of your file. (To find this, locate the file and right click, selecting the Properties button.) You must add the file name at the end of the directory location, preceded by a backslash; for example, C:\Econometrics\StockWatson\caschool.csv. If your filename has any spaces or any symbol that appears on the number keys of the keyboard, then you should put quotation marks around your filename. STATA reads spaces as denoting separations between words, and therefore will only read the filename up until the first space or symbol, and then considers the rest to be a separate command.

NOTE: In order to insheet data, there must be no data already stored in memory. To get rid of any data that is already stored, type the command

    clear

before “insheeting.”
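Since STATA 13, insheet has been superseded by import delimited; insheet continues to work, but the newer command is the one documented in current manuals. A minimal round-trip sketch follows. It first writes a CSV file from the auto data shipped with STATA so that the example is self-contained; the file name auto_demo.csv is hypothetical and stands in for a file such as caschool.csv.

```stata
* Sketch: read a comma-delimited file with import delimited.
* The CSV is created here from the auto data only so the example runs
* on its own; the file name is a placeholder.
sysuse auto, clear
export delimited using "auto_demo.csv", replace

* Empty memory, then read the file; varnames(1) takes the variable
* names from the first row, and clear replaces any data in memory
clear
import delimited "auto_demo.csv", varnames(1) clear

describe, short
```

Putting the file name in quotation marks, as here, also covers paths that contain spaces.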

Once you have insheeted your data, you should see this reflected in your Results box and yourvariables should appear in your Variables List box. You can type edit to see your data in thedata editor.To save your data as a STATA file, click on File on the upper toolbar, then select Sav
