Clustering Continuous or Categorical Data with the Forward Search

 

Andrea Cerioli, Dipartimento di Economia, Università di Parma, Italy, mriani@unipr.it
Marco Riani, Dipartimento di Economia, Università di Parma, Italy, mriani@unipr.it
Anthony C. Atkinson, The London School of Economics, London WC2A 2AE, UK, a.c.atkinson@lse.ac.uk

Abstract


The normal distribution, perhaps after data transformation, is the most widely used model for continuous multivariate data. Models for multivariate categorical data are more problematic. In this paper we give examples of the use of the forward search for clustering data that are either all continuous or all discrete.

 

The dataset used in the paper can be downloaded here

Last modified 02/06/2009 00.40.58