Assessment of Imputation Methods for Integrated Business Data

Ricardo Enrico Namay II
Statistics New Zealand

Three imputation methods for an integrated business dataset were compared. Initially, multiple imputation, mean imputation and donor imputation were tested. Because of computational limitations, the study was subsequently restricted to mean and donor imputation. The paper demonstrates how imputation methods can be compared computationally along several dimensions: order and distribution preservation, plausibility of individual values, preservation of correlations, and aggregate statistics. Using random sampling and linear programming, the paper also proposes a method for constructing a rectangular subsample that replicates the missing-value pattern of the dataset to be imputed while preserving relative imputation class sizes.
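As a rough illustration of the two methods retained in the study, the sketch below applies mean and donor imputation within imputation classes to a toy variable and scores each against values that were deliberately masked. It is not the paper's implementation: the function names, the toy data, the random choice of donor and the error measure are all assumptions made for the example, and it assumes every class has at least one observed value.

```python
import random
from statistics import mean


def observed_by_class(values, classes):
    """Group the non-missing values by imputation class."""
    groups = {}
    for v, c in zip(values, classes):
        if v is not None:
            groups.setdefault(c, []).append(v)
    return groups


def mean_impute(values, classes):
    """Replace each missing value (None) with its class mean."""
    groups = observed_by_class(values, classes)
    return [mean(groups[c]) if v is None else v
            for v, c in zip(values, classes)]


def donor_impute(values, classes, rng):
    """Replace each missing value with the value of a randomly
    chosen observed unit (a donor) from the same class.
    Random donor selection is an assumption of this sketch."""
    groups = observed_by_class(values, classes)
    return [rng.choice(groups[c]) if v is None else v
            for v, c in zip(values, classes)]


def mean_abs_error(truth, imputed, masked):
    """Average absolute gap between true and imputed values
    over the masked positions only."""
    return mean(abs(truth[i] - imputed[i]) for i in masked)


# Hypothetical toy data: a fully observed variable and two classes.
truth = [10.0, 12.0, 14.0, 100.0, 110.0, 120.0]
classes = ["small"] * 3 + ["large"] * 3

# Mask known values so the imputations can be scored.
masked = [1, 4]
values = [None if i in masked else v for i, v in enumerate(truth)]

mean_imp = mean_impute(values, classes)
donor_imp = donor_impute(values, classes, random.Random(0))

mean_err = mean_abs_error(truth, mean_imp, masked)
donor_err = mean_abs_error(truth, donor_imp, masked)

# Aggregate-level check: how far each imputed total is from the truth.
mean_total_gap = abs(sum(mean_imp) - sum(truth))
donor_total_gap = abs(sum(donor_imp) - sum(truth))
```

Masking observed values and re-imputing them is one simple way to check the plausibility of individual values and the aggregate statistics. The other dimensions named above, such as order, distribution and correlation preservation, would need their own measures.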

Session 2a, Sample Surveys: 10:50 — 11:10, Room 031
