Generating Synthetic Unit-Record Data from Published Marginal Tables

Alan Lee
University of Auckland

We survey methods for generating synthetic data sets without using unit-record data. The methods we describe create unit-record data in the form of high-dimensional tables whose marginals match publicly available marginal tables. We consider methods based on integer and quadratic programming, which construct tables that match the public tables exactly, and methods based on iterative proportional fitting, which match the public tables approximately.

We describe a set of R functions that implement these methods, and we apply them to data from the 2001 Census of Population and Dwellings.

