Applied Survey Data Analysis

Analysis Examples Replication-Second Edition     

The analysis examples replication materials cover Chapters 5-13 of ASDA Second Edition, but not every software package covers every chapter. The absence of a link for a given chapter indicates that the software package cannot perform that type of analysis.

 

SAS v9.4 Code and Results

Overview of SAS Commands

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

 

Stata v14 Code (Results Are Presented Throughout the Book)

Overview of Stata Commands

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

Chapter 13 Analysis Examples

 

SPSS V22 Code and Results

Overview of SPSS Commands

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

 

IVEware V0.3 Code and Results

Overview of IVEware Commands

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

Chapter 13 Analysis Examples

 

Sudaan 11.0 Code and Results

Overview of Sudaan Commands

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

 

R Survey Package v3.31-5 Code and Results

Overview of R Commands

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

 

WesVar 5.1 Code and Results

Overview of WesVar Commands

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

 

Mplus V7.4 Code and Results

Overview of Mplus Commands

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

Chapter 13 Analysis Examples

 


ASDA First Edition      

      

Analysis Examples Replication-First Edition     

The analysis examples replication materials cover Chapters 5-12 of ASDA First Edition, but not every software package covers all 8 chapters. The absence of a link for a given chapter indicates that the software package cannot perform that type of analysis.


SAS v9.2 Code and Results

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

 

Sudaan 10.0 Code and Results

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

 

SPSS/PASW V18.0 Code and Results

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

 

IVEware Code and Results

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

 

WesVar 4.3 Code and Results

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

 

R Survey 3.2 Code and Results

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

 

Mplus 5.2 Code and Results

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

 

Stata v10.1 Code and Results

Chapter 5 Analysis Examples

Chapter 6 Analysis Examples

Chapter 7 Analysis Examples

Chapter 8 Analysis Examples

Chapter 9 Analysis Examples

Chapter 10 Analysis Examples

Chapter 11 Analysis Examples

Chapter 12 Analysis Examples

 

 

Site Overview

This site contains information about the text "Applied Survey Data Analysis" (first and second editions), including author biographies, links to public release data sets and related sites, code and output for analysis examples replicated in current software packages, and information about new publications of interest to survey data analysts. Other features include an FAQ log and links to other software and statistical sites. We plan to update this site intermittently with news about ongoing statistical and software advances in the analysis of survey data.

 

Special Notes from Authors

ASDA-Second Edition is Available as of June 28, 2017!

 

Project Overview

Applied Survey Data Analysis is the product of many years of teaching applied survey data analysis classes and of practical experience analyzing survey data. We have taught various versions of this course in the ISR/SRC Summer Institute Program, as part of University of Michigan/CSCAR, and within the Survey Methodology Program at the University of Michigan and the University of Maryland. Our goal has been to integrate teaching materials and practical analysis knowledge into a textbook accessible to graduate students and working analysts with varying levels of statistical and analytic expertise. We intend to update the materials on this website as statistical and software improvements emerge, with the goal of assisting analysts and researchers performing survey data analysis.

 

Information About Authors

Patricia A. Berglund is a Senior Research Associate in the Survey Methodology Program at the Institute for Social Research.  She has extensive experience in the use of computing systems for data management and complex sample survey data analysis. She works on research projects in youth substance abuse, adult mental health, and survey methodology using data from Army STARRS, Monitoring the Future, the National Comorbidity Surveys, World Mental Health Surveys, Collaborative Psychiatric Epidemiology Surveys, and various other national and international surveys. In addition, she is involved in development, implementation, and teaching of analysis courses and computer training programs at the Survey Research Center-Institute for Social Research.  She also lectures in the SAS® Institute-Business Knowledge Series.   mailto:pberg@umich.edu

Steven G. Heeringa is a Research Scientist in the Survey Methodology Program, the Director of the Statistical and Research Design Group in the Survey Research Center, and the Director of the Summer Institute in Survey Research Techniques at the Institute for Social Research. He has over 25 years of statistical sampling experience directing the development of the SRC National Sample design, as well as sample designs for SRC's major longitudinal and cross-sectional survey programs. During this period he has been actively involved in research and publication on sample design methods and procedures such as weighting, variance estimation, and the imputation of missing data that are required in the analysis of sample survey data. He has been a teacher of survey sampling methods to U.S. and international students and has served as a sample design consultant to a wide variety of international research programs based in countries such as Russia, the Ukraine, Uzbekistan, Kazakhstan, India, Nepal, China, Egypt, Iran, and Chile. mailto:sheering@umich.edu

Brady T. West is a Research Associate Professor in the Survey Methodology Program, located within the Survey Research Center at the Institute for Social Research on the University of Michigan-Ann Arbor (U-M) campus. He also serves as a Statistical Consultant on the U-M Consulting for Statistics, Computing, and Analytics Research (CSCAR) team. He earned his PhD from the Michigan Program in Survey Methodology in 2011. Before that, he received an MA in Applied Statistics from the U-M Statistics Department in 2002, being recognized as an Outstanding First-year Applied Masters student, and a BS in Statistics with Highest Honors and Highest Distinction from the U-M Statistics Department in 2001. His current research interests include the implications of measurement error in auxiliary variables and survey paradata for survey estimation, survey nonresponse, interviewer variance, and multilevel regression models for clustered and longitudinal data. He is the lead author of a book comparing different statistical software packages in terms of their mixed-effects modeling procedures (Linear Mixed Models: A Practical Guide using Statistical Software, Second Edition, Chapman Hall/CRC Press, 2014), and he is a co-author of a second book entitled Applied Survey Data Analysis (with Steven Heeringa and Pat Berglund), which was published by Chapman Hall in April 2010 and has a second edition in press that will be available in mid-2017. Brady lives in Dexter, MI with his wife Laura, his son Carter, his daughter Everleigh, and his American Cocker Spaniel Bailey. mailto:bwest@umich.edu

 

Professional Reviews of ASDA-Second Edition

    Review/Summary from Chapman Hall Website  

  Features

- Bootstrap methods of variance estimation.

- Estimation and inference for specialized functions such as the Gini coefficient and log-linear models.

- Updated approaches to examining model diagnostics, testing goodness of fit, and estimation and display of marginal effects in linear and generalized linear models.

- State-of-the-art methods for analysis of longitudinal survey data.

- Fractional imputation methods for item missing data.

- Enhanced treatment of methods and software for fitting multilevel models, structural equation models and other latent variable models to complex sample survey data.

- Updated review of software packages for the analysis of complex sample survey data.

  Summary

Highly recommended by the Journal of Official Statistics, The American Statistician, and other journals, Applied Survey Data Analysis, Second Edition provides an up-to-date overview of state-of-the-art approaches to the analysis of complex sample survey data. Building on the wealth of material on practical approaches to descriptive analysis and regression modeling from the first edition, this second edition expands the topics covered and presents more step-by-step examples of modern approaches to the analysis of survey data using the newest statistical software.

Designed for readers working in a wide array of disciplines who use survey data in their work, this book continues to provide a useful framework for integrating more in-depth studies of the theory and methods of survey data analysis. An example-driven guide to the applied statistical analysis and interpretation of survey data, the second edition contains many new examples and practical exercises based on recent versions of real-world survey data sets. Although the authors continue to use Stata for most examples in the text, they also continue to offer SAS, SPSS, SUDAAN, R, WesVar, IVEware, and Mplus software code for replicating the examples on the book’s updated Web site.

 

Links to Data Sets for First and Second Editions

National Comorbidity Survey-Replication (Collaborative Psychiatric Epidemiology Surveys)

        http://www.icpsr.umich.edu/cpes (for online documentation tools and data download) 

        http://www.hcp.med.harvard.edu/ncs (for NCS-R specific information)

National Health and Nutrition Examination Survey (National Center for Health Statistics)    

        http://www.cdc.gov/nchs/

Health and Retirement Study (Institute for Social Research-University of Michigan)

        http://hrsonline.isr.umich.edu

European Social Survey (ESS)

        http://www.europeansocialsurvey.org/

United States Census Bureau

     http://www.census.gov/

 

Chapter Exercises Data Sets - Second Edition

   These data sets are subsets of the original data and are designed for use with the chapter exercises in ASDA Second Edition. We provide the data sets in SAS and Stata formats here; for other software, please use data transfer software or the import/export tools within your software of choice to convert them to the needed format.

     Chapter Exercises Data Sets (SAS Format) - Second Edition      Chapter Exercises Data Sets (Stata Format) - Second Edition

Chapter Exercises Data Sets - First Edition

     These data sets are subsets of the original data and are designed for use with the chapter exercises in ASDA.

      Chapter Exercises Data Sets (Stata and SAS Format) - First Edition     Chapter Exercises Data Sets (R Format) - First Edition

Analysis Example Data Sets - First Edition

     These data sets are subsets of the original data and are designed for use with the analysis examples in ASDA - First Edition.  We have included the raw variables used in the variable recodes and constructed variables used in the analysis examples. 

      Analysis Examples Data Sets (Stata and SAS Format) - First Edition     

 

Frequently Asked Questions

        This document contains frequently asked questions and brief answers.  Click here: FAQ Document  

        This working paper by Brady West addresses accounting for multi-stage sample designs in complex sample variance estimation.  Click here to download: Multi-Stage Sample Designs

 

Links to Additional Sites

Data Archive

        University of Michigan (ICPSR) Data Archive http://www.icpsr.umich.edu

Software for Survey Data Analysis

        SAS® software     http://www.sas.com

        Stata® software     http://www.stata.com

        Sudaan® software     http://www.rti.org

        SPSS® software     http://www.spss.com

        Mplus® software     http://statmodel.com

        R software     http://www.r-project.org/

        WesVar software     http://www.westat.com/westat/statistical_software/wesvar

        IVEware     http://www.isr.umich.edu/src/smp/ive

        SDA from ICPSR  http://www.icpsr.umich.edu (online analysis system with survey correction capabilities)  

        Manual for Package ‘svydiags’ from R, Linear Regression Model Diagnostics for Survey Data  Link to Manual

Software Updates

        Stata - V14 is current as of May 2017  

        IBM/SPSS-SPSS 22 is current as of May 2017  

        SAS - v9.4 is current as of May 2017 

        See software websites for additional software updates and versions

 

Supplemental Code

This section provides key updates to software for analysis of survey data.

         SAS-Example of how to use replicate weights using NHANES data: SAS Replicate Weights Example

         Stata-Example of Mediation analysis with survey data and subpopulation indicator: Stata sgmediation example    

         R-Example of Quantile Regression with Bootstrap Method: R Quantile Regression Example

         SAS-Example of use of NOMCAR option with PROC SURVEYMEANS: SAS NOMCAR Example

         Example of How to Create a Delimited Text File in SAS and Read Text File in R: Text File SAS to R Example  

         An Example of Fuller’s (1984) Method for Testing the Bias of Unweighted Estimates of Regression Parameters in a Linear Regression Model: Fuller's Method

 

         SAS code to implement Wilcoxon rank sum test for complex sample survey data: http://www.blackwellpublishing.com/rss

 

         SAS Paper with Examples of ODS Graphics and SG Procedures with Examples of Weighted Frequency Plots: SAS Paper with ODS Graphics and SG Procedures Examples

 

         Note on How SPSS handles Strata with A Single or "Lonely" PSU: http://www-01.ibm.com/support/docview.wss?uid=swg21479202

 

         Link to Stata command for calculation of Population Attributable Risk proportions (user written "punaf" command): http://www.imperial.ac.uk/nhli/r.newson/usergp/uk2012/newson_ohp1.pdf

 

         Example of using PROC EXPORT to convert SAS data set to Stata (.dta) and SPSS (.sav): SAS PROC EXPORT Example     

 

         Multiple Imputation Using the Fully Conditional Specification Method: A Comparison of SAS, Stata, IVEware, and R: Link to Presentation  

   

         Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer: Link to Presentation

 

         Link to Web Site with Information about Free Tools for Survey Data Analysis and Map Production: http://www.asdfree.com/2014/12/maps-and-art-of-survey-weighted.htm Link to full code for Map Examples: https://github.com/davidbrae/swmap

 

         SAS Repeated Replication Macro to do Design-Based Poisson Regression (with a comparison to Stata svy: poisson command): Link to Code and Results

 

         New Stata V14+ Features: 1. The "survwgt" contributed package for creating replicate weights: Link to Package. 2. The "bs4rw" modifier for performing quantile regression. Install using http://www.stata.com/users/jpitblado/bs4rw. Implement a command referring to replicate weights that have already been generated: "survwgt: bs4rw, rw(brrrwt*): qreg $depvar $demo if subpop==1 [pw=perwt5], q(.5)".

         R package for fractional hot deck imputation (FHDI) is now available from CRAN (Primary Author, Dr. Jae Kim). Link to Code and Information

         Modified Stata file, pwigls_genlin_adcv_modAV1.do for C11 for Viega Method (Author is Dr. A. Viega). Link to File

 

         Example of SAS 9.4 PROC SURVEYMEANS with DOMAIN Statement and DIFF Option for Difference of Means Test. Link to File

 

         Example of Use of R "Convey" Package for Svy GINI Coefficient. Link to File

 

         Examples of R Survey Package regTermTest Command Syntax for Tests of Interactions Only and Main Effects Plus Interactions. Link to File
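
         Several of the items above (the SAS replicate weights example and the Stata "bs4rw" command) rely on replicate weights for variance estimation. As a rough, software-neutral illustration of the idea, the Python sketch below computes a weighted mean and its replicate-weight variance. It is not taken from the linked files: the function names, the toy data, and the replicate weights are hypothetical, and the scale factor is left as an argument because it depends on the replication method (commonly 1/R for balanced repeated replication and (R-1)/R for a delete-one jackknife with R replicates).

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of `values` using survey weights `weights`."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def replicate_variance(
    values: Sequence[float],
    full_weights: Sequence[float],
    replicate_weights: Sequence[Sequence[float]],
    scale: float,
) -> float:
    """Replicate-weight variance estimate of the weighted mean.

    Computes scale * sum_r (theta_r - theta)^2, where theta is the
    full-sample estimate and theta_r is the estimate under the r-th
    set of replicate weights.
    """
    theta = weighted_mean(values, full_weights)
    squared_deviations = sum(
        (weighted_mean(values, rep) - theta) ** 2 for rep in replicate_weights
    )
    return scale * squared_deviations


# Hypothetical toy data: four respondents and two BRR-style replicates.
y = [1.0, 2.0, 3.0, 4.0]
w_full = [1.0, 1.0, 1.0, 1.0]
w_reps = [
    [2.0, 0.0, 2.0, 0.0],  # replicate 1 keeps respondents 1 and 3
    [0.0, 2.0, 0.0, 2.0],  # replicate 2 keeps respondents 2 and 4
]

# BRR-style scale factor 1/R (an assumption about the replication method).
variance = replicate_variance(y, w_full, w_reps, scale=1.0 / len(w_reps))
standard_error = variance ** 0.5
```

         In practice the replicate weights come from the data producer or from the software itself, and the survey procedures in the packages listed on this page should be preferred over hand-rolled code.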

 

Statistical Resources for Analysis of Survey Data

        University of Michigan      

        Institute for Social Research-Summer Institute     www.isr.umich.edu/src/si

        IVEware (Imputation and Variance Estimation software)     www.isr.umich.edu/src/smp/ive

        ICPSR summer institute     http://www.icpsr.umich.edu/icpsrweb/sumprog/

        Center for Statistical Consulting and Research     www.umich.edu/~cscar/

        University of California-Los Angeles

        Statistical and Survey Data Analysis     http://www.ats.ucla.edu/stat/

        University of North Carolina-Chapel Hill

        Population Center     http://www.cpc.unc.edu/

        American Statistical Association 

        Home Page     http://www.amstat.org/

 

Survey Data Analysis Publications - General Survey Data Analysis Topics (since 2015)

This section provides information about key recent publications on survey data analysis.  We will add to the list as new publications emerge.

         Mplus Notes area with many articles about survey data analysis: http://statmodel.com/resrchpap.shtml.

         Presentation on AIC and BIC for Survey Data by Thomas Lumley and Alastair Scott: Link to Presentation

         Lumley and Scott, AIC AND BIC FOR MODELING WITH COMPLEX SURVEY DATA, Journal of Survey Statistics and Methodology, 2015, Link to Paper

         Thompson, Mary E., Using Longitudinal Complex Survey Data, Annual Review of Statistics and Its Application, 2015, 2:305–20, Link to Paper

         Bridget L. Ryan, John Koval, Bradley Corbett, Amardeep Thind, M. Karen Campbell, and Moira Stewart, Assessing the impact of potentially influential observations in weighted logistic regression, The Research Data Centres Information and Technical Bulletin, Catalogue no. 12-002‑X —No. 2015001, Link to Paper

         Jianzhu Li and Richard Valliant, Linear Regression Diagnostics in Cluster Samples, Journal of Official Statistics, Vol. 31, No. 1, 2015, pp. 61–75, Link to Paper

         Miles, Andrew, Obtaining Predictions from Models Fit to Multiply Imputed Data, Sociological Methods & Research, pp. 1-11, 2015, Link to Paper

         Luchman, J.N., Determining Subgroup Difference Importance with Complex Survey Designs: An Application of Weighted Dominance Analysis, Survey Practice, Vol. 8, no 4, 2015, Link to Paper

         Oya Kalaycioglu, Andrew Copas, Michael King and Rumana Z. Omar, A comparison of multiple-imputation methods for handling missing data in repeated measurements observational studies, Journal of the Royal Statistical Society, June 2015, Link to Paper

         Natalie Dean, Marcello Pagano, EVALUATING CONFIDENCE INTERVAL METHODS FOR BINOMIAL PROPORTIONS IN CLUSTERED SURVEYS, Journal of Survey Statistics and Methodology, October 2015, Link to Paper

         Zhou, H., Elliott, M.R., Raghunathan, T.E. (2015). "Synthetic Multiple Imputation Procedure For Multi-Stage Complex Samples," to appear in Journal of Official Statistics soon.

         Zhou, H., Elliott, M.R., Raghunathan, T.E. (2015). "A Two-Step Semiparametric Method to Accommodate Sampling Weights in Multiple Imputation," in Biometrics 2015 Sep 22. Link to Paper

         Zhou, H., Elliott, M.R., Raghunathan, T.E. (2015). "Multiple Imputation In Two-Stage Cluster Samples Using The Weighted Finite Population Bayesian Bootstrap," to appear in Journal of Survey Statistics and Methodology soon.

         Stapleton, L. and Kang, Y. (2016). "Design Effects of Multilevel Estimates From National Probability Samples", Sociological Methods & Research 0049124116630563, first published on February 11, 2016 as doi:10.1177/0049124116630563, Link to Paper

         Daoying Lin, Lingxiao Wang, and Yan Li, "HAPLOTYPE-BASED STATISTICAL INFERENCE FOR POPULATION-BASED CASE–CONTROL AND CROSS-SECTIONAL STUDIES WITH COMPLEX SAMPLE DESIGNS", J Surv Stat Methodol published 25 April 2016, 10.1093/jssam/smv040. Link to Paper

         Bollen, K., Biemer, P., Karr, A., Tueller, S., Berzofsky, M., "Are Survey Weights Needed? A Review of Diagnostic Tests in Regression Analysis", Annual Review of Statistics and Its Application Vol. 3: 375-392 (Volume publication date June 2016). Link to Paper

         Hanzhi Zhou, Michael R. Elliott, and Trivellore E. Raghunathan, "Multiple Imputation in Two-stage Cluster Samples Using the Weighted Finite Population Bayesian Bootstrap", J Surv Stat Methodol 2016 4: 139-170. Link to Paper

         Minsun Kim Riddles, Jae Kwang Kim, and Jongho Im, "A Propensity-score-adjustment Method for Nonignorable Nonresponse", J Surv Stat Methodol 2016 4: 215-245. Link to Paper

         Brady T. West, Joseph W. Sakshaug, Guy Alain S. Aurelien, "How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?", Published: June 29, http://dx.doi.org/10.1371/journal.pone.0158120. Link to Paper

         Ismael Flores Cervantes and J. Michael Brick, "Nonresponse adjustments with misspecified models in stratified designs", Survey Methodology, Catalogue no. 12-001-X, Release date: June 22, 2016. Link to Paper

         Xiaying Zheng and Ji Seung Yang, "Using Sample Weights in Item Response Data Analysis Under Complex Sample Designs", L.A. van der Ark et al. (eds.), Quantitative Psychology Research, Springer, Proceedings in Mathematics & Statistics 167, DOI 10.1007/978-3-319-38759-8_10. Link to Paper

         Xing Lui, "Fitting Proportional Odds Models for Complex Sample Survey Data with SAS, IBM SPSS, Stata, and R", General Linear Model Journal, 2016, Vol. 42(2). Link to Paper

         Toth, Daniel, Bureau of Labor Statistics, "An R Package for Modeling Survey Data with Regression Trees", WSS Seminar, 2017. Link to Presentation

         Hsu HY, Lin JJH, Skidmore ST, "Analyzing individual growth with clustered longitudinal data: A comparison between model-based and design-based multilevel approaches", Behav Res Methods. 2017 Jun 20. doi: 10.3758/s13428-017-0905-7. [Epub ahead of print]. Link to Paper

         Qixuan Chen, Michael R. Elliott, David Haziza, Ye Yang, Malay Ghosh, Roderick J. A. Little, Joseph Sedransk, and Mary Thompson, "Approaches to Improving Survey-Weighted Estimates", Statist. Sci. Volume 32, Number 2 (2017), 227-248. Link to Paper

 

Survey Data Analysis Publications - Bayes Related (since 2015)

         Si, Y., Pillai, N.S., and Gelman, A., "Bayesian nonparametric weighted sampling inference" Bayesian Analysis, 2015, 10(3) 605-625. Link to Paper   Link to STAN Codes for Binary Outcome   Link to STAN Codes for Continuous Outcome  

 

Errata Second Edition

         Please check this link for corrections to ASDA Second Edition: ASDA Second Edition Errata

Errata First Edition

        Please check this link for corrections to ASDA First Edition : ASDA Errata

www.isr.umich.edu/src/smp/asda
 For problems or questions regarding this Web site contact [pberg@umich.edu].
Last updated: September 18, 2017