2016 National Survey Of Children’s Health


2016 National Survey of Children’s Health
Guide to Analysis with Multiple Imputed Data
U.S. Census Bureau
01/2/2018

Multiple imputation details and purpose

In the 2016 NSCH, missing values for household income, household size, and respondent education variables were multiply imputed during the weighting process. Household income and household size were combined to form the Family Poverty Level ratio (FPL). Imputation is useful because it uses observed data to estimate a plausible response for a missing value. It preserves sample size and avoids the bias that can arise from using only observed or known values in a “complete-case” analysis, which assumes that data are missing completely at random. In particular, 18.56% of the sample (22.83% of the weighted sample) was missing household FPL, and missingness varied by other known demographic characteristics. Using only the known or reported data would severely limit sample size and bias estimates.

Multiple imputation creates several plausible responses for a missing value, using other variables in the dataset to adjust the missing response (Allison, 2001; Rubin, 1996; Schafer and Graham, 2002). These multiple imputations offer a means of accurately incorporating the uncertainty of the imputed values for missing items. More specifically, combining or averaging estimates across all six imputed values will appropriately increase the standard error to account for this uncertainty while only slightly altering the point estimates.
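The combining step described above is usually written as Rubin's rules. The following is a sketch in standard notation, where the symbols are our own labels rather than the guide's: \hat{Q}_i is the estimate from implicate i, U_i is its estimated variance (the squared standard error), and m = 6 is the number of implicates.

```latex
\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m} U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^{2}

T = \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B, \qquad
\mathrm{SE}(\bar{Q}) = \sqrt{T}
```

The point estimate \bar{Q} is a simple average, which is why it changes little. The between-implicate term B is what inflates the standard error. When no values are imputed, every implicate gives the same estimate, B = 0, and the standard error is unchanged.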
Using only a single imputation, particularly with a large amount of missing data as in the case of FPL, incorrectly assumes certainty in the imputation, as if there were no missing data at all, and will produce standard errors that are too low and test statistics that are too high (increased Type I error).

The initial public use file included the first of six multiply imputed values for FPL [FPL], household size [HHSIZE_I], and respondent education [A1_GRADE_I], as well as the single hot-deck imputed values for child sex and race/ethnicity, which had minimal missing data (<1%). All of these were used in the construction of survey weights. An additional file has now been released with all six imputed values for FPL [FPL_I1-FPL_I6]. Due to a smaller percentage of missing data and the more limited categories that were imputed for weighting, multiple imputations for household size and respondent education have limited value and are not included on the imputation file. This document includes example code to show how to analyze multiply imputed FPL data using SAS, SAS-callable SUDAAN, and Stata. These procedures or commands will appropriately combine or average the point estimates across implicates and increase standard errors so that significance levels are not overstated. The term implicate will be used in this documentation, although other sources may use imputation (StataCorp LP, 2013).

Analyzing data in a multiple imputation framework

The NSCH public use FPL implicate file contains the imputed values stored in different variables, one for each of the imputed responses. These variables contain both fully reported and imputed values. Table 1 shows an example of the wide dataset that results from merging the public use and implicate files by unique household ID (HHID), with FPL_I1 through FPL_I6. For our purposes we call each version, or plausible response, of the imputed data an implicate.
For case ID 1, FPL_I1 through FPL_I6 are not identical because there was missing data on either income or household count, and these values were imputed. For case ID 2, the poverty ratio variables are identical because there was no missing data. SAS-callable SUDAAN and Stata can accommodate the wide dataset form.

Table 1. Example of a wide dataset with an imputed observation

ID   SC_AGE   FPL_I1   FPL_I2   FPL_I3   FPL_I4   FPL_I5   FPL_I6
1    10       125      135      100      90       130      115
2    16       250      250      250      250      250      250

Table 2 shows how the dataset needs to be re-organized to do analyses using the multiple imputation variables in SAS. SAS-callable SUDAAN and Stata can use the long dataset form, but it is a less efficient form of storage that requires more computational resources. In the long dataset, the variable ‘Implicate’ appears. In SAS, the actual variable will be called ‘_Imputation_’.

Table 2. Example of a long dataset with an imputed observation

ID   SC_AGE   Implicate   FPL_I
1    10       1           125
1    10       2           135
1    10       3           100
1    10       4           90
1    10       5           130
1    10       6           115
2    16       1           250
2    16       2           250
2    16       3           250
2    16       4           250
2    16       5           250
2    16       6           250

Example

This documentation includes example code for analyzing multiply imputed data in SAS, SAS-callable SUDAAN, and Stata. The example code estimates the proportion of children living below the poverty line by children with special health care needs status (SC_CSHCN). We first create a variable named ‘poor_i’ (0 = at or above the poverty line, 1 = below the poverty line).

How to obtain estimates in SAS:

In SAS, you will need to reshape data from a wide to long format. This data step is included in the example code. In this step we copy the non-imputed variables (e.g. age) in the dataset along with a single FPL variable and FWC variable, until each respondent has six observations in the dataset, one for each implicate (see Table 2).
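The wide-to-long reshaping described above can be sketched outside SAS as well. The Python below is a minimal illustration, not the guide's code: it uses the two example records from Table 1, and the function name wide_to_long and the output column names are our own assumptions.

```python
# Wide records mirroring Table 1: one row per child, six FPL implicates each.
wide_rows = [
    {"ID": 1, "SC_AGE": 10, "FPL_I1": 125, "FPL_I2": 135, "FPL_I3": 100,
     "FPL_I4": 90, "FPL_I5": 130, "FPL_I6": 115},
    {"ID": 2, "SC_AGE": 16, "FPL_I1": 250, "FPL_I2": 250, "FPL_I3": 250,
     "FPL_I4": 250, "FPL_I5": 250, "FPL_I6": 250},
]


def wide_to_long(rows, n_implicates=6):
    """Stack each wide row into one row per implicate (hypothetical helper)."""
    stacked = []
    for row in rows:
        for m in range(1, n_implicates + 1):
            fpl = row[f"FPL_I{m}"]
            stacked.append({
                "ID": row["ID"],
                "SC_AGE": row["SC_AGE"],   # non-imputed variable copied as-is
                "Implicate": m,            # plays the role of _Imputation_ in SAS
                "FPL_I": fpl,
                "POOR_I": 1 if fpl < 100 else 0,  # below the poverty line
            })
    return stacked


long_rows = wide_to_long(wide_rows)
for r in long_rows:
    print(r)
```

Each child contributes six rows, and only the implicate-specific columns differ between them. For ID 1, the poverty indicator changes across implicates because the imputed FPL values straddle 100.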

Once the data have been reshaped, we can use a survey procedure (proc surveyfreq in the example code) to estimate the proportion of children with poor_i = 1 for each imputed dataset. The proc mianalyze procedure will then combine the estimates by averaging them across the implicates and calculate the standard error according to Rubin’s formula (Rubin, 1996; SAS Institute, 2009).

libname file "Replace with file directory";

/*************************************************************
In order to use proc mianalyze, we will need to create a long,
or stacked, data file.
*************************************************************/
data stacked;
    merge file.nsch_2016_topical file.nsch_2016_implicate;
    by hhid;
    array fpli{6} fpl_i1-fpl_i6;
    do _Imputation_ = 1 to 6;
        fpl_i = fpli{_Imputation_};
        /* Creating an indicator for whether or not the
           household is below the poverty line */
        if fpl_i < 100 then poor_i = 1;
        else poor_i = 0;
        output; * write one observation per implicate;
    end;
run;

/*************************************************************
Estimate parameter of interest for each implicate after
sorting by _Imputation_
*************************************************************/
proc sort data=stacked;
    by _Imputation_;
run;

proc surveyfreq data=stacked;
    strata stratum fipsst;             * design statements;
    cluster hhid;
    weight fwc;
    by _Imputation_;                   * identify the imputation;
    tables sc_cshcn*poor_i / row cl;   * request crosstab with row % and CIs;
    ods output crosstabs=mi_table;     * estimates stored in new dataset mi_table;
run;

/*************************************************************
Combine the implicates using proc mianalyze after sorting by
variables of interest. This applies Rubin's rules (Rubin, 1996)
to properly inflate standard errors for the imputed data.
*************************************************************/
proc sort data=mi_table;
    by sc_cshcn poor_i;
run;

proc mianalyze data=mi_table;
    by sc_cshcn poor_i;       * requests data for each combination of cshcn and poor;
    modeleffects rowpercent;  * combined percentage over all imputations;
    stderr rowstderr;         * combined standard error over all imputations;
run;
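To make the combining step concrete, the Python sketch below applies Rubin's rules to six per-implicate row percentages and their standard errors. The numbers are invented for illustration and are not NSCH results, and pool_rubin is a hypothetical helper name. Additional proc mianalyze output, such as degrees of freedom and confidence limits, is not reproduced here.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Combine per-implicate estimates with Rubin's rules (illustrative sketch)."""
    m = len(estimates)
    pooled = mean(estimates)                      # average point estimate
    within = mean([se ** 2 for se in std_errors]) # average sampling variance
    between = variance(estimates)                 # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between        # total variance
    return pooled, math.sqrt(total)


# Made-up row percentages and standard errors, one pair per implicate.
row_percent = [20.0, 21.0, 19.0, 20.5, 19.5, 20.0]
row_stderr = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

pooled_percent, pooled_se = pool_rubin(row_percent, row_stderr)
print(f"pooled estimate = {pooled_percent:.2f}, pooled SE = {pooled_se:.3f}")
```

The pooled standard error exceeds the standard error from any single implicate because the between-implicate variance is added to the average within-implicate variance.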

How to obtain estimates in SAS-callable SUDAAN:

Using SUDAAN, you can leave the data in wide form without re-shaping. A data step is needed to convert the design variables to numeric per SUDAAN requirements and to create the binary poverty variable. The sorted file can then be analyzed in any procedure using the mi_var statement to identify the implicates.

/*************************************************************
SUDAAN can analyze implicate data in two forms (one wide dataset
or separate datasets for each implicate). This example will show
the easier or more efficient option of a single wide dataset.
*************************************************************/
libname file "Replace with file directory";

data file.example;
    merge file.nsch_2016_topical file.nsch_2016_implicate;
    by hhid;

    /* Converting design variables to numeric per SUDAAN requirements */
    hhidnum    = input(hhid, 8.);
    fipsstnum  = input(fipsst, 8.);
    stratumnum = input(stratum, 8.);

    /* Creating an indicator for whether or not the
       household is below the poverty line */
    array fpl_i{6} fpl_i1-fpl_i6;
    array poor_i{6} poor_i1-poor_i6;
    do i = 1 to 6;
        if fpl_i{i} < 100 then poor_i{i} = 1;
        else poor_i{i} = 0;
    end;
    drop i;
run;

/* Data must be sorted prior to analysis */
proc sort data=file.example;
    by fipsstnum stratumnum hhidnum;
run;

/* Analyzing multiple implicate data */
proc crosstab data=file.example design=wr;
    nest fipsstnum stratumnum hhidnum / psulev=3;  * design statements;
    weight fwc;
    mi_var poor_i1-poor_i6;   * identifies implicates, called by first variable
                                listed in remainder of code;
    class sc_cshcn poor_i1;
    table sc_cshcn*poor_i1;   * requests crosstab;
    print nsum wsum rowper serow lowrow uprow
          / style=nchs nsumfmt=f10.0 wsumfmt=f10.0;  * requests row percentages;
run;
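The wide layout can be analyzed implicate by implicate without stacking, which is the idea behind the mi_var statement. The Python sketch below illustrates this for the point estimate only. The three records, their weights, and the helper weighted_poor_share are hypothetical, and design-based standard errors, which require the strata and cluster information, are not attempted.

```python
# Hypothetical wide records: FWC is the survey weight, FPL holds the six implicates.
children = [
    {"FWC": 1200.0, "FPL": [125, 135, 100, 90, 130, 115]},  # imputed FPL
    {"FWC": 800.0,  "FPL": [250, 250, 250, 250, 250, 250]}, # reported FPL
    {"FWC": 1000.0, "FPL": [60, 60, 60, 60, 60, 60]},       # reported FPL
]


def weighted_poor_share(records, implicate):
    """Weighted share of children below 100% FPL for one implicate (0-based index)."""
    total_weight = sum(r["FWC"] for r in records)
    poor_weight = sum(r["FWC"] for r in records if r["FPL"][implicate] < 100)
    return poor_weight / total_weight


# One estimate per implicate, computed directly from the wide records.
shares = [weighted_poor_share(children, m) for m in range(6)]

# The combined point estimate is the simple average across implicates.
pooled_share = sum(shares) / len(shares)
print(shares, pooled_share)
```

Only the record with imputed FPL moves the estimate from one implicate to the next. Records with reported FPL contribute identically to all six.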

How to obtain estimates in Stata:

In Stata, just as you declare the data to be svyset, you declare it to be an MI (multiple imputation) dataset. In order for Stata to recognize that a variable has been imputed, you need to use mi import and register the imputed variables. Here, you will declare the FPL variable to be imputed and derive the poverty indicator (poor_i) from it.

Stata normally identifies imputed observations from the missing (‘.’) values in the original, unimputed variable. Those original missing values are not available in the Public Use File. The work-around we advise is generating the variable FPL_I0 and then setting all of its values to ‘.’. Rubin’s (1996) formula will calculate the correct variance across implicates regardless of whether the values were imputed or reported.

Once the data have been imported and mi set, they are ready for analysis. Simply using the ‘mi est: svy:’ prefix will combine the estimates by averaging across the implicates and calculate the standard error according to Rubin’s formula (Rubin, 1996).

local file "Replace with file directory"
use "`file'\nsch_2016_topical", clear
merge 1:1 hhid using "`file'\nsch_2016_implicate"
egen statacross = group(fipsst stratum)   /* create single strata variable for svy */
gen fpl_i0 = .                            /* create missing variable for original fpl, m=0 */
save "`file'\nsch_2016_topical", replace  /* must be saved prior to declaring imputation */
mi import wide, imputed(fpl_i0=fpl_i1-fpl_i6) drop   /* declare imputed data */
mi passive: generate poor_i = 0           /* generate new variable based on imputed fpl */
mi passive: replace poor_i = 1 if fpl_i0 < 100
mi svyset hhid [pweight=fwc], strata(statacross)     /* declare survey data */
mi est: svy: proportion poor_i, over(sc_cshcn)       /* request crosstab of poor_i by sc_cshcn */

References

Allison, P.D. 2001. Missing Data. Thousand Oaks: Sage Publications.

Rubin, D.B. 1996. Multiple Imputation After 18+ Years. Journal of the American Statistical Association 91: 473-489.

Schafer, J.L. and J.W. Graham. 2002. Missing Data: Our View of the State of the Art. Psychological Methods 7(2): 147-177.

SAS Institute Inc. 2009. SAS/STAT 9.2 User’s Guide, Second Edition. Cary, NC: SAS Institute Inc.

StataCorp LP. 2013. Stata Multiple-Imputation Reference Manual: Release 13. College Station, TX: Stata Press.
