
Split Questionnaire Designs: collecting only the data that you need through MCAR and MAR designs

Journal Article


Abstract


  • We call a sample design that allows for different patterns, or sets, of data items to be collected from different sample units a Split Questionnaire Design (SQD). SQDs can be thought of as incorporating missing data into survey design. This paper examines the situation where data that are not collected by an SQD can be treated as Missing Completely At Random or Missing At Random, targets are regression coefficients in a generalised linear model fitted to binary variables, and targets are estimated using Maximum Likelihood. A key finding is that it can be easy to measure the relative contribution of a respondent to the accuracy of estimated model parameters before collecting all the respondent's model covariates. We show empirically and theoretically that we could achieve a significant reduction in respondent burden with a negligible impact on the accuracy of estimates by not collecting model covariates from respondents who we identify as contributing little to the accuracy of estimates. We discuss the general implications for SQDs.
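A minimal sketch of why respondents can contribute unequally to the accuracy of estimates, assuming a logistic regression model: a unit with covariate vector x and fitted probability p adds p(1 - p) x xᵀ to the Fisher information. This is a standard property of logistic regression and is not the paper's own measure, which the abstract says can be obtained before all of a respondent's covariates are collected. The coefficients, covariate values, and function names below are hypothetical.

```python
import math


def fisher_weight(eta):
    """Weight p(1 - p) for a binary observation with linear predictor eta."""
    p = 1.0 / (1.0 + math.exp(-eta))
    return p * (1.0 - p)


def information_contribution(x, beta):
    """Per-unit Fisher information matrix p(1 - p) * x x^T as a list of lists."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    w = fisher_weight(eta)
    return [[w * xi * xj for xj in x] for xi in x]


def trace(matrix):
    """Sum of the diagonal, used here as a crude scalar summary."""
    return sum(matrix[i][i] for i in range(len(matrix)))


# Hypothetical coefficients (intercept, slope) and two hypothetical units.
beta = [0.5, 1.0]
units = {"a": [1.0, -0.5], "b": [1.0, 4.0]}

traces = {name: trace(information_contribution(x, beta))
          for name, x in units.items()}

for name, value in traces.items():
    print(name, round(value, 4))
```

In this toy setting the fitted probability for unit "b" is close to 1, so its weight p(1 - p) is small, and the trace of its contribution is smaller than that of unit "a" even though its covariate value is larger. The trace is only one of several possible scalar summaries of an information matrix.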

Publication Date


  • 2017

Citation


  • Chipperfield, J. O., Barr, M. L. & Steel, D. G. (2017). Split Questionnaire Designs: collecting only the data that you need through MCAR and MAR designs. Journal of Applied Statistics, Online First 1-11.

Scopus EID


  • 2-s2.0-85029438598

RO Metadata URL


  • http://ro.uow.edu.au/eispapers1/1993

Number Of Pages


  • 10

Start Page


  • 1

End Page


  • 11

Volume


  • Online First

Place Of Publication


  • United Kingdom
