Many animals, including humans, tend to live in groups, herds, flocks, bands, packs, shoals, or colonies (hereafter: groups) of conspecific individuals. The size of these groups, expressed as the number of individuals in each group (for example, 9 individuals in each of 8 groups), is an important aspect of their social environment. Group size tends to be highly variable even within the same species, so we often need statistical measures to quantify group size and statistical tests to compare these measures between two or more samples. Group size measures are notoriously hard to handle statistically because group sizes typically follow an aggregated (right-skewed) distribution: most groups are small, few are large, and a very few are very large.
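Statistical measures of group size fall roughly into two categories: measures averaged over groups, which describe group size as seen from outside, and measures averaged over individuals, which describe the group size that group members themselves experience.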
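As Jarman (1974) pointed out, average individuals live in groups larger than average. Therefore, when we wish to characterize a typical (average) individual's social environment, we should apply non-parametric estimations of group size. Reiczigel et al. (2008) proposed the following measures: crowding, the size of the group in which a particular individual lives; mean crowding, the arithmetic mean of crowding values averaged over all individuals; and a confidence interval for mean crowding.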
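Imagine a sample with 3 groups, where group sizes are 1, 2, and 6 individuals, respectively. Then mean group size (group sizes averaged over groups) equals (1 + 2 + 6)/3 = 3, whereas mean crowding (group sizes averaged over individuals) equals (1 + 2 + 2 + 6 + 6 + 6 + 6 + 6 + 6)/9 = 41/9 ≈ 4.56.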
Generally speaking, given there are G groups with sizes n1, n2, ..., nG, mean crowding can be calculated as:
mean crowding = (n1^2 + n2^2 + ... + nG^2) / (n1 + n2 + ... + nG)
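As a minimal sketch, the two averages can be computed in Python as follows. The function names are illustrative and are not taken from Reiczigel et al. (2008) or from Flocker. The sketch assumes that mean crowding is the sum of squared group sizes divided by the total number of individuals, which is the same as averaging each individual's group size.

```python
from fractions import Fraction


def mean_group_size(sizes):
    """Group sizes averaged over groups."""
    return Fraction(sum(sizes), len(sizes))


def crowding_values(sizes):
    """One crowding value per individual: every member of a group of size n gets n."""
    return [n for n in sizes for _ in range(n)]


def mean_crowding(sizes):
    """Group sizes averaged over individuals: sum(n_i^2) / sum(n_i)."""
    return Fraction(sum(n * n for n in sizes), sum(sizes))


sizes = [1, 2, 6]                            # the three-group sample from the text
print(mean_group_size(sizes))                # 3
print(mean_crowding(sizes))                  # 41/9
print(round(float(mean_crowding(sizes)), 2)) # 4.56
```

Exact fractions are used here only to keep the arithmetic transparent; for real data, ordinary floating-point division would do.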
Because group members are distributed among groups in an aggregated (right-skewed) way, applying parametric statistics would be misleading. Another problem arises when analyzing crowding values: crowding data consist of non-independent values, or ties, which change together in response to a single biological event. (For example, the crowding values of all group members change simultaneously whenever an individual joins or leaves.)
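One general way to respect this non-independence is to resample whole groups rather than individuals when estimating uncertainty. The following sketch computes a simple percentile bootstrap interval for mean crowding. It is an illustration under stated assumptions, not necessarily the procedure implemented in Flocker: the function `bootstrap_ci`, its parameters, and the sample data are all hypothetical.

```python
import random


def mean_crowding(sizes):
    """Group sizes averaged over individuals: sum(n_i^2) / sum(n_i)."""
    return sum(n * n for n in sizes) / sum(sizes)


def bootstrap_ci(sizes, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean crowding.

    Whole groups are resampled with replacement, so all members of a
    group enter or leave a resample together.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    stats = sorted(
        mean_crowding(rng.choices(sizes, k=len(sizes)))
        for _ in range(n_boot)
    )
    alpha = (1 - level) / 2
    lower = stats[int(alpha * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha) * n_boot))]
    return lower, upper


# Hypothetical right-skewed sample: many small groups, one large group.
sizes = [1, 1, 1, 2, 2, 3, 4, 6, 12]
low, high = bootstrap_ci(sizes)
print(mean_crowding(sizes), (low, high))
```

The percentile interval is chosen here only for brevity; other bootstrap variants could be substituted without changing the group-level resampling step.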
Reiczigel et al. (2008) discuss the statistical problems associated with group size measures (calculating confidence intervals, 2-sample tests, etc.) and offer a free statistical toolset (Flocker 1.1).