New Methods to Support Census Bureau Record Linkage
This project will undertake research and software development to improve Census capabilities for entity resolution and record linkage.
We will build on ongoing projects linking Census data products to themselves, to external surveys, and to administrative data. We will develop and evaluate approaches that move best practices beyond Fellegi and Sunter (1969) and its implementations, using supervised and unsupervised machine learning to estimate probabilistic record linkage models and multiple imputation to propagate linkage uncertainty. We will develop and compute meaningful diagnostics using multiple novel truth datasets, including one that relies on biometrics in administrative data and another that uses longitudinal survey self-reports spanning multiple decades. We will share our approaches in LinkageLibrary, a community resource for improving record linkage, and encourage other researchers to test these approaches on alternative truth datasets.
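For readers unfamiliar with the Fellegi and Sunter (1969) baseline that the project aims to move beyond, the following is a minimal sketch of its scoring rule, not the project's own method or any Census Bureau implementation. The field names, the m- and u-probabilities, and the two thresholds are hypothetical values chosen only for illustration; in practice they would be estimated from data.

```python
import math

# Hypothetical m- and u-probabilities per comparison field (illustrative only).
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
M_PROBS = {"last_name": 0.95, "birth_year": 0.90, "zip": 0.80}
U_PROBS = {"last_name": 0.01, "birth_year": 0.05, "zip": 0.10}


def match_weight(agreement):
    """Sum the log2 likelihood ratios over fields for one candidate record pair."""
    total = 0.0
    for field, agrees in agreement.items():
        m, u = M_PROBS[field], U_PROBS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, lower=0.0, upper=8.0):
    """Apply two (assumed) thresholds: link, non-link, or review in between."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "review"


# Usage: a pair that agrees on last name only falls between the thresholds.
pair = {"last_name": True, "birth_year": False, "zip": False}
print(classify(match_weight(pair)))
```

The rule assumes that fields agree or disagree independently given match status, and it returns a hard three-way decision per pair. Those are the kinds of constraints that estimated probabilistic models and multiple imputation are meant to relax.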
Michael G Mueller-Smith