Many variables exhibit within-group homogeneity (similarity of values among the individual units that make up each group). Measures of within-group homogeneity are useful for the sample design and statistical analysis of datasets for populations that contain groups, such as individuals in geographical areas. Homogeneity measures can easily be defined for continuous or dichotomous variables. Here we propose a homogeneity measure for a multi-category variable and show how it can be calculated without access to individual-level data. We apply the measure to data from the UK census, show how it relates to the homogeneity of particular linear combinations of the categories, called Canonical Grouping Variables (CGVs), and explain how these are interpreted.
AMS Subject Classification: 91G70
Keywords: Group; Clustering; Homogeneity; Intra-class correlation; Categorical variables; Canonical grouping variables; Aggregate data; Census area data