
Linear regression with nested errors using probability-linked data

Journal Article


Download full-text (Open Access)

Abstract


  • Probabilistic matching of records is widely used to create linked data sets for use in health science, epidemiological, economic, demographic and sociological research. Clearly, this type of matching can lead to linkage errors, which in turn can lead to bias and increased variability when standard statistical estimation techniques are used with the linked data. In this paper we develop unbiased regression parameter estimates to be used when fitting a linear model with nested errors to probabilistically linked data. Since estimation of variance components is typically an important objective when fitting such a model, we also develop appropriate modifications to standard methods of variance components estimation in order to account for linkage error. In particular, we focus on three widely used methods of variance components estimation: analysis of variance, maximum likelihood and restricted maximum likelihood. Simulation results show that our estimators perform reasonably well when compared to standard estimation methods that ignore linkage errors. © 2014 Australian Statistical Publishing Association Inc. Published by Wiley Publishing Asia Pty Ltd.
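The abstract's claim that linkage errors bias standard estimators can be illustrated with a small simulation. The sketch below is not the authors' estimator. It generates data from a simple nested-error (random-intercept) model, mislinks a share of the response values at random, and compares the naive least-squares slope before and after mislinking. The function name, the parameter values, and the mislinking mechanism (a random permutation across the whole file rather than within blocks) are all assumptions made for illustration.

```python
import numpy as np


def simulate_linked_slopes(n_groups=200, group_size=10, error_rate=0.3, seed=0):
    """Return (slope with correct links, slope with mislinked responses).

    Hypothetical illustration only: parameter values and the mislinking
    mechanism are assumptions, not taken from the article.
    """
    rng = np.random.default_rng(seed)
    beta0, beta1 = 1.0, 2.0                      # assumed true coefficients
    n = n_groups * group_size
    group = np.repeat(np.arange(n_groups), group_size)

    # Nested-error model: y_ij = beta0 + beta1 * x_ij + u_i + e_ij
    x = rng.normal(size=n)
    u = rng.normal(scale=1.0, size=n_groups)     # group-level random effect
    e = rng.normal(scale=1.0, size=n)            # unit-level error
    y = beta0 + beta1 * x + u[group] + e

    # Mislink a random subset of records by permuting their responses
    # among themselves, so x is paired with another record's y.
    y_linked = y.copy()
    wrong = np.flatnonzero(rng.random(n) < error_rate)
    y_linked[wrong] = y[rng.permutation(wrong)]

    # Naive ordinary least squares that ignores linkage error.
    X = np.column_stack([np.ones(n), x])
    slope_correct = np.linalg.lstsq(X, y, rcond=None)[0][1]
    slope_linked = np.linalg.lstsq(X, y_linked, rcond=None)[0][1]
    return slope_correct, slope_linked


if __name__ == "__main__":
    correct, linked = simulate_linked_slopes()
    print(f"slope, correct links: {correct:.3f}")
    print(f"slope, mislinked:     {linked:.3f}")
```

Under these assumptions the mislinked slope is expected to shrink toward zero as the share of wrong links grows, which is the kind of bias the article's adjusted estimators are designed to remove.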

Publication Date


  • 2014

Citation


  • Samart, K. & Chambers, R. L. (2014). Linear regression with nested errors using probability-linked data. Australian and New Zealand Journal of Statistics, 56 (1), 27-46.

Scopus Eid


  • 2-s2.0-84899115755

Ro Full-text Url


  • http://ro.uow.edu.au/cgi/viewcontent.cgi?article=4476&context=eispapers

Ro Metadata Url


  • http://ro.uow.edu.au/eispapers/3459

Number Of Pages


  • 19

Start Page


  • 27

End Page


  • 46

Volume


  • 56

Issue


  • 1
