Session 1a, Statistical Methodology

This session will be held in the Erskine Building, Room 446

10:50 — 11:10

A logical difficulty with regression analysis: the estimation of non-existent parameters

Robin Willink
Industrial Research Ltd.

‘All models are wrong’ (Box), so any model used in a regression problem can only provide an approximation to the unknown function f(x). Therefore, the parameters of the model do not all represent quantities that actually exist and the quantities ‘estimated’ by the calculated regression coefficients are not all properly defined. So ‘parameter estimation’ is a misnomer and the values of the parameters are actually ‘chosen’. Furthermore, confidence intervals and credible intervals often quoted for the non-existent quantities have no legitimate meaning.

We describe this logical problem in the context of univariate linear regression. Subsequently, we identify quantities that do actually exist and are efficiently estimated by the ordinary least-squares coefficients. The problem of genuine interest is often the estimation of f(x), not the choice of values for the parameters of some approximating function. So we also present a method of estimating f(x) that takes some account of the error incurred by choosing a model. Lastly, we identify other misleading terminology in mathematics and statistics.
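
As an illustration of a quantity that does exist when the model is wrong, the sketch below takes a hypothetical nonlinear f(x) = exp(x) on a fixed design. It checks that least-squares lines fitted to noisy data average to the least-squares line through the noise-free values f(x_i). This target is assumed here for illustration only; the abstract does not say which quantities the talk identifies. The function f, the design points, the noise level and the helper name `ols_line` are all assumptions.

```python
import math
import random


def ols_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def f(x):
    # Hypothetical "true" function; a straight line is only an approximation.
    return math.exp(x)


# Fixed design: 21 equally spaced points on [0, 2] (an assumption).
xs = [i / 10 for i in range(21)]

# Well-defined target: the least-squares line through the noise-free f(x_i).
target_intercept, target_slope = ols_line(xs, [f(x) for x in xs])

# Repeated noisy experiments; the fitted coefficients average to the target.
rng = random.Random(0)
fits = [
    ols_line(xs, [f(x) + rng.gauss(0.0, 0.5) for x in xs])
    for _ in range(2000)
]
mean_intercept = sum(a for a, _ in fits) / len(fits)
mean_slope = sum(b for _, b in fits) / len(fits)

print(target_intercept, mean_intercept)
print(target_slope, mean_slope)
```

In this reading the coefficients do not estimate parameters of f itself, which has no slope or intercept, but the coefficients of its best linear approximation on the chosen design.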

11:10 — 11:30

The fourth-root-n consistency and the efficiency of profile likelihood

Yuichi Hirose
Victoria University of Wellington

Profile likelihood is a popular method of estimation in the presence of nuisance parameters. It is especially useful for estimation in semi-parametric models, since the method reduces the infinite-dimensional estimation problem to a finite-dimensional one. In this presentation, we show the efficiency of a semi-parametric maximum likelihood estimator based on the profile likelihood. By introducing a new parameterization, we improve on the seminal work of Murphy and van der Vaart (2000) in two ways: we prove the no-bias condition in a general semi-parametric model context, and we deal with the direct quadratic expansion of the profile likelihood rather than an approximate one.
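
For readers unfamiliar with profiling, the sketch below shows the idea in a much simpler parametric setting than the semi-parametric one treated in the talk: a normal sample with the mean as the parameter of interest and the variance as the nuisance parameter. The data values and the function name `profile_loglik` are assumptions made for illustration. For each candidate mean, the nuisance parameter is replaced by its maximizing value, leaving a function of the mean alone.

```python
import math


def profile_loglik(mu, data):
    """Profile log-likelihood of the mean mu for an i.i.d. normal sample.

    For fixed mu the variance that maximizes the likelihood is
    sigma2_hat(mu) = mean((x - mu)^2); substituting it back gives
    -n/2 * (log(2 * pi * sigma2_hat(mu)) + 1).
    """
    n = len(data)
    sigma2_hat = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2_hat) + 1.0)


# Hypothetical data, used only to exercise the function.
data = [4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.7, 5.2]

# Maximize the profile log-likelihood over a grid of candidate means.
grid = [4.0 + 0.01 * k for k in range(201)]
mu_hat = max(grid, key=lambda mu: profile_loglik(mu, data))
print(mu_hat)
```

The grid maximizer agrees with the sample mean up to the grid spacing, as expected in this model.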

11:30 — 11:50

Comparison between K–Means and Hierarchical Clustering of Dependent and Independent Data Generated from Multivariate Gaussian Copula Function

Francesca Marta Lilja Di Lascio
Department of Statistics, University of Bologna, Italy

Warren J. Ewens
Department of Biology, University of Pennsylvania, USA

We use the multivariate Gaussian copula function ([3]) to evaluate the ability of the K–means algorithm ([2]) and the hierarchical method ([1]) to identify clusters corresponding to the marginal probability functions that are linked, through the copula function, by the dependence structure of their joint distribution function.

For both K–means and hierarchical clustering, we run simulations that vary (1) the sample size (small and large), (2) the value of the dependence parameter of the copula function, (3) the parameter values of the margins (well-separated, overlapping and distinct margins) and, finally, (4) the type of dispersion matrix (unstructured and exchangeable).

We evaluate the performance of the two clustering methods by means of (1) the difference between the ‘real’ value of the dependence parameter and its value post-clustering, (2) the percentage of iterations in which the number of observations in each cluster differs from the ‘real’ one, and (3) the capability to identify the exact probability model of the margins.

We find that the hierarchical method works well if the margins are clearly distinct, irrespective of cluster size, while K–means works well only if the sample is small. The performance of both clustering methods is independent of the dispersion structure.
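
The data-generating step of such a simulation can be sketched as follows, in a bivariate case: draw correlated standard normals, map them to uniforms with the normal distribution function, then apply the inverse distribution function of each margin. This is a minimal sketch, not the authors' simulation code. The function name `sample_gaussian_copula`, the choice of a normal and an exponential margin, and the value 0.8 for the dependence parameter are all assumptions.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, margins, seed=0):
    """Draw n pairs with a bivariate Gaussian copula of correlation rho.

    margins is a pair of inverse distribution functions (quantile
    functions), one per coordinate.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    eps = 1e-12  # keeps uniforms strictly inside (0, 1)
    pairs = []
    for _ in range(n):
        # Correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Probability integral transform to uniforms, then to the margins.
        u1 = min(max(std_normal.cdf(z1), eps), 1.0 - eps)
        u2 = min(max(std_normal.cdf(z2), eps), 1.0 - eps)
        pairs.append((margins[0](u1), margins[1](u2)))
    return pairs


# Hypothetical margins: standard normal and exponential with rate 2.
margins = (
    NormalDist(0.0, 1.0).inv_cdf,
    lambda u: -math.log(1.0 - u) / 2.0,
)

data = sample_gaussian_copula(2000, 0.8, margins, seed=1)
print(data[:3])
```

Varying `rho`, `n` and the margins reproduces the kind of factors listed in the simulation design; the clustering step itself is not shown.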


[1] Everitt, B. (1993). Cluster Analysis (3rd ed.). London: E. Arnold; New York: Halsted Press.
[2] Hartigan, J.A., and Wong, M.A. (1979). Algorithm AS 136: A K–Means Clustering Algorithm. Applied Statistics, vol. 28, no. 1, pp. 100–108.
[3] Nelsen, R.B. (2006). An Introduction to Copulas. New York: Springer.

11:50 — 12:10

Bias reduction for kernel estimates of density functionals

Professor Martin Hazelton
Institute of Information Sciences and Technology, Massey University

There are a number of important statistical functions that can be expressed as simple functionals of probability densities. These include the relative risk function (a ratio of typically bivariate densities used in geographical epidemiology and elsewhere) and the binary regression function. In many cases parametric models are insufficiently flexible to describe these functionals and a nonparametric approach is to be preferred.

Nonparametric estimation of such functionals can be achieved by substituting kernel estimates in place of the unknown densities. Moreover, in principle we can obtain improved performance in the functional estimates by applying a range of bias reduction techniques developed for density estimation per se. However, in practice this approach tends to lead to poor results.

In this talk I will describe a new methodology which combines local bias reduction techniques borrowed from the density estimation literature with global smoothing optimized for the particular functional to be estimated. The results are encouraging.

The methodology is illustrated through examples on binary regression for low birth weight data, and on geographical variation in the relative risk of cancer of the larynx.
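
The basic plug-in estimate mentioned in this abstract, before any bias reduction, can be sketched in one dimension as a ratio of two Gaussian kernel density estimates. This is a minimal sketch of the plain substitution approach only, not of the new methodology described in the talk. The simulated case and control samples, the common bandwidth of 0.4 and the function names are assumptions made for illustration.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return x -> Gaussian kernel density estimate built from sample."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


def relative_risk(cases, controls, bandwidth):
    """Plug-in relative risk: ratio of case and control density estimates."""
    f_cases = gaussian_kde(cases, bandwidth)
    f_controls = gaussian_kde(controls, bandwidth)

    def risk(x):
        return f_cases(x) / f_controls(x)

    return risk


# Hypothetical data: cases are shifted to the right of controls.
rng = random.Random(0)
cases = [rng.gauss(0.5, 1.0) for _ in range(400)]
controls = [rng.gauss(0.0, 1.0) for _ in range(400)]

rr = relative_risk(cases, controls, bandwidth=0.4)
print(rr(-1.5), rr(0.0), rr(1.5))
```

Because the estimate is a ratio, small errors in the denominator density are amplified where controls are sparse, which is one reason the smoothing has to be chosen with the functional in mind rather than each density separately.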
