Imputation of household survey data using linear mixed models

Journal Article


Abstract


  • Mixed models are regularly used in the analysis of clustered data, but have only recently
    been used for imputation of missing data. In household surveys where multiple people are
    selected from each household, imputation of missing values should preserve the structure
    pertaining to people within households and should not artificially change the apparent
    intracluster correlation (ICC). This paper focuses on the use of multilevel models for
    imputation of missing data in household surveys. In particular, the performance of a best
    linear unbiased predictor for both stochastic and deterministic imputation using a linear
    mixed model is compared to imputation based on a single-level linear model, both with
    and without information about household respondents.

    The evaluation is carried out in the context of imputing hourly wage rate in the
    Household, Income and Labour Dynamics in Australia (HILDA) Survey. Nonresponse is generated
    under various assumptions about the missingness mechanism for persons and households,
    and with low, moderate and high intra-household correlation, to assess the benefits of the
    multilevel imputation model under different conditions. The mixed model, and the single-level
    model with information about the household respondent, lead to clear improvements when
    the ICC is moderate or high, and when there is informative missingness.
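
As an illustration of the idea summarised in the abstract, the following is a minimal sketch of imputation from a random-intercept model, y_ij = b0 + b1*x_ij + u_i + e_ij, with one household effect u_i per household. It is not the authors' implementation. The function names, the single covariate `x`, and the assumption that the coefficients and variance components are already known are all hypothetical simplifications; in practice these quantities would be estimated from the respondents. The deterministic variant imputes the predicted value using the shrunken household effect, while the stochastic variant adds one shared household draw plus an independent person-level draw, so that imputed members of the same household remain correlated.

```python
import random
from collections import defaultdict


def icc(sigma2_u, sigma2_e):
    """Intracluster correlation implied by a random-intercept model."""
    return sigma2_u / (sigma2_u + sigma2_e)


def blup_impute(records, beta, sigma2_u, sigma2_e, stochastic=False, rng=None):
    """Impute missing y under y_ij = b0 + b1 * x_ij + u_i + e_ij.

    records: list of dicts with keys 'hh' (household id), 'x', and
    'y' (None when missing). beta, sigma2_u and sigma2_e are treated
    as known, which is an assumption made to keep the sketch short.
    """
    rng = rng or random.Random(0)
    b0, b1 = beta

    # Residuals of responding members, grouped by household.
    resid = defaultdict(list)
    for r in records:
        if r["y"] is not None:
            resid[r["hh"]].append(r["y"] - (b0 + b1 * r["x"]))

    # Predicted household effect: mean residual shrunk towards zero.
    u_hat, u_var = {}, {}
    for hh in dict.fromkeys(r["hh"] for r in records):
        n = len(resid[hh])
        if n == 0:
            # No respondent in the household: no information about u_i.
            u_hat[hh], u_var[hh] = 0.0, sigma2_u
        else:
            gamma = sigma2_u / (sigma2_u + sigma2_e / n)
            u_hat[hh] = gamma * sum(resid[hh]) / n
            u_var[hh] = sigma2_u * (1.0 - gamma)

    completed, u_draw = [], {}
    for r in records:
        if r["y"] is not None:
            completed.append(r["y"])
            continue
        hh = r["hh"]
        value = b0 + b1 * r["x"] + u_hat[hh]
        if stochastic:
            # One household-level draw shared by all imputed members,
            # plus an independent person-level draw.
            if hh not in u_draw:
                u_draw[hh] = rng.gauss(0.0, u_var[hh] ** 0.5)
            value += u_draw[hh] + rng.gauss(0.0, sigma2_e ** 0.5)
        completed.append(value)
    return completed


# Toy data: household 1 has one respondent, household 2 has none.
records = [
    {"hh": 1, "x": 0.0, "y": 12.0},
    {"hh": 1, "x": 0.0, "y": None},
    {"hh": 2, "x": 0.0, "y": None},
]
deterministic = blup_impute(records, beta=(10.0, 1.0), sigma2_u=1.0, sigma2_e=1.0)
print(deterministic)  # [12.0, 11.0, 10.0]
```

In the toy data the respondent in household 1 sits 2.0 above the fixed-effects prediction; with an ICC of 0.5 and one respondent, half of that residual is carried over to the missing household member. A single-level model without household information would impute 10.0 for both missing values.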

Publication Date


  • 2015

Citation


  • Lago, L. P. & Clark, R. G. (2015). Imputation of household survey data using linear mixed models. Australian and New Zealand Journal of Statistics, 57(2), 169-187.

Scopus Eid


  • 2-s2.0-84931575348

Ro Metadata Url


  • http://ro.uow.edu.au/eispapers/4146

Number Of Pages


  • 18

Start Page


  • 169

End Page


  • 187

Volume


  • 57

Issue


  • 2

Place Of Publication


  • Australia
