In marketing, geodemographic segmentation is a multivariate statistical classification technique for discovering whether the individuals of a population fall into different groups. It makes quantitative comparisons of multiple characteristics, on the assumption that the differences within any group should be smaller than the differences between groups.
The information technologies employed in geodemographic segmentation include geographic information systems and database management software.
Geodemographic segmentation is based on two simple principles:
Different algorithms produce different results, and there is no single best approach for choosing among them, just as no algorithm comes with a theoretical guarantee that its result is correct. One of the most frequently used techniques in geodemographic segmentation is the k-means clustering algorithm; in fact, most current commercial geodemographic systems are based on it. Still, clustering techniques drawn from artificial neural networks, genetic algorithms, or fuzzy logic are more efficient within large, multidimensional databases (Brimicombe 2007).
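As a minimal sketch of how k-means groups areas, the following Python example implements the standard assignment and update steps (Lloyd's algorithm) using only the standard library. The area values, the two illustrative variables, the function names, and the `init` parameter are assumptions made for illustration; they do not describe any commercial system.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of numeric tuples."""
    rng = random.Random(seed)
    # Start from the given centroids, or from k randomly chosen points.
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical areas described by two already-standardised variables.
areas = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
centroids, labels = kmeans(areas, k=2, init=[areas[0], areas[3]])
print(labels)
```

The result depends on the starting centroids: with a different `init` (or a random start) the same data can settle into a different, poorer partition, which is one reason different runs and algorithms can disagree.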
Neural networks can handle non-linear relationships, are robust to noise, and exhibit a high degree of automation. They make no assumptions about the nature or distribution of the data, and they provide valuable assistance with geographical problems that have previously been impossible to solve. One of the best known and most efficient neural network methods for unsupervised clustering is the Self-Organizing Map (SOM). SOM has been proposed as an improvement over the k-means method because it provides a more flexible approach to census data clustering. Spielman and Thill (2008) used the SOM method to develop a geodemographic clustering of a census dataset for New York City.
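The idea behind a SOM can be sketched in a few lines of Python. The example below trains a small one-dimensional map: for each input it finds the best matching unit and pulls that node and its map neighbours toward the input, with a learning rate and neighbourhood width that shrink over time. The map size, decay schedule, parameter names, and data are illustrative assumptions, not the configuration used in any published study.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the map node whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))

def train_som(data, n_nodes=4, epochs=200, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map and return its weight vectors."""
    rng = random.Random(seed)
    # Initialise every node at a randomly chosen data point.
    weights = [list(rng.choice(data)) for _ in range(n_nodes)]
    sigma0 = n_nodes / 2.0
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs           # linear decay towards zero
        lr = lr0 * decay                       # learning rate
        sigma = max(sigma0 * decay, 0.01)      # neighbourhood width
        for x in rng.sample(data, len(data)):  # present inputs in random order
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured along the map, not in data space.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

# Hypothetical areas described by two already-standardised variables.
areas = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
weights = train_som(areas)
node_of_area = [best_matching_unit(weights, a) for a in areas]
print(node_of_area)
```

Unlike k-means, neighbouring nodes on the map are updated together, so areas assigned to adjacent nodes tend to have similar profiles.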
Another way of characterizing an individual polygon’s similarity to all the regions is based on fuzzy logic. The basic concept of fuzzy clustering is that an object may belong to more than one cluster. In binary logic, set membership is restricted to a yes-no definition, meaning that an object either belongs to a cluster or does not. Fuzzy clustering allows a spatial unit to belong to more than one cluster with varying membership values. Most studies concerning geodemographic analysis and fuzzy logic employ the Fuzzy C-Means algorithm and the Gustafson-Kessel algorithm (Feng and Flowerdew 1999).
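A minimal sketch of Fuzzy C-Means in Python illustrates how membership values arise. It alternates between computing cluster centres as membership-weighted means and recomputing each area's membership from its distances to the centres. The fuzzifier `m = 2`, the stopping tolerance, the function name, and the data are illustrative assumptions; the Gustafson-Kessel variant, which adapts the distance measure for each cluster, is not shown.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Return (centres, memberships); memberships[i][j] is the degree to which point i belongs to cluster j."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centres = []
    for _ in range(max_iter):
        # Centres: means of the points weighted by membership ** m.
        centres = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(len(points))]
            total = sum(w)
            centres.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / total
                for d in range(dim)))
        # Memberships: inversely related to relative distance from each centre.
        new_u = []
        for p in points:
            dists = [max(math.sqrt(sum((a - b) ** 2 for a, b in zip(p, ctr))), 1e-12)
                     for ctr in centres]
            new_u.append([
                1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                for dj in dists])
        shift = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break  # memberships have stabilised
    return centres, u

# Hypothetical areas described by two already-standardised variables.
areas = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
centres, memberships = fuzzy_c_means(areas, c=2)
for area, row in zip(areas, memberships):
    print(area, [round(v, 3) for v in row])
```

Each row of memberships sums to 1, so an area lying between two clusters receives intermediate values rather than being forced into a single group.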
Well-known geodemographic segmentation systems include Claritas Prizm (US), PSYTE HD (Canada), Tapestry (US), CAMEO (UK), ACORN (UK) and MOSAIC (UK). New systems targeting subgroups of the population are also emerging. For example, Segmentos examines the geodemographic lifestyles of Hispanics in the United States. Both MOSAIC and ACORN use onomastics to infer ethnicity from residents' names.