Disjoint-set data structure

Disjoint-set/Union-find Forest
Type	multiway tree
Invented	1964
Invented by	Bernard A. Galler and Michael J. Fischer

Algorithm	Average	Worst Case
Space	$O(n)$	$O(n)$
Search	$O(\alpha(n))$	$O(\alpha(n))$
Merge	$O(\alpha(n))$	$O(\alpha(n))$

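In computer science, a disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It supports two useful operations:

Find: Determine which subset a particular element is in. This can be used to check whether two elements are in the same subset.
Union: Join two subsets into a single subset.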

The other important operation, MakeSet, which makes a set containing only a given element (a singleton), is generally trivial. With these three operations, many practical partitioning problems can be solved (see the Applications section).

In order to define these operations more precisely, some way of representing the sets is needed. One common approach is to select a fixed element of each set, called its representative, to represent the set as a whole. Then, Find(x) returns the representative of the set that x belongs to, and Union takes two set representatives as its arguments.

A simple disjoint-set data structure uses a linked list for each set. The element at the head of each list is chosen as its representative.

MakeSet creates a list of one element. Union concatenates the two lists, a constant-time operation if each list carries a pointer to its tail. The drawback of this implementation is that Find takes O(n) (linear) time in the worst case, since it must traverse the list backwards from the given element to the head of the list.
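
The simple linked-list scheme described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `Node` class, its `prev`, `next`, and `tail` fields, and the function names are assumptions chosen for this example. The tail pointer is stored on the head node so that Union can concatenate in constant time.

```python
class Node:
    """One element of a set (hypothetical layout for illustration)."""

    def __init__(self, value):
        self.value = value
        self.prev = None   # previous node in the list; None at the head
        self.next = None   # next node in the list; None at the tail
        self.tail = self   # only meaningful on the head node


def make_set(value):
    """Create a singleton list; its only node is head and tail."""
    return Node(value)


def find(node):
    """Walk backwards to the head, which is the representative. O(n)."""
    while node.prev is not None:
        node = node.prev
    return node


def union(head_a, head_b):
    """Append the list headed by head_b to the list headed by head_a. O(1)."""
    if head_a is head_b:
        return head_a
    tail_a = head_a.tail
    tail_a.next = head_b
    head_b.prev = tail_a
    head_a.tail = head_b.tail
    return head_a


a, b, c = make_set(1), make_set(2), make_set(3)
union(find(a), find(b))
union(find(b), find(c))
same_set = find(c) is find(a)   # both walks end at node a
```

Note that `union` expects the two representatives, so callers pass the results of `find`, matching the convention stated earlier.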

This can be avoided by including in each linked list node a pointer to the head of the list; then Find takes constant time, since this pointer refers directly to the set representative. However, Union now has to update each element of the list being appended to make it point to the head of the new combined list, requiring O(n) time.
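
A sketch of this variant, under the same caveats: the `head` field and the function names are assumptions made for illustration. Find becomes a single pointer read, while Union must walk the appended list to redirect every head pointer.

```python
class Node:
    """List node that also points directly at its representative."""

    def __init__(self, value):
        self.value = value
        self.head = self   # representative of the set containing this node
        self.next = None   # next node in the list; None at the tail
        self.tail = self   # only meaningful on the head node


def find(node):
    """Return the representative in constant time."""
    return node.head


def union(head_a, head_b):
    """Append list b to list a and redirect b's head pointers. O(len(b))."""
    if head_a is head_b:
        return head_a
    head_a.tail.next = head_b
    head_a.tail = head_b.tail
    node = head_b
    while node is not None:   # every appended node gets the new head
        node.head = head_a
        node = node.next
    return head_a


p, q, r = Node("p"), Node("q"), Node("r")
union(find(q), find(r))   # q now represents {q, r}
union(find(p), find(q))   # p now represents {p, q, r}; q and r are updated
```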

When the length of each list is tracked, the required time can be improved by always appending the shorter list to the longer one. Using this weighted-union heuristic, a sequence of m MakeSet, Union, and Find operations on n elements requires O(m + n log n) time. For asymptotically faster operations, a different data structure is needed.
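
The n log n term arises because each time an element's head pointer is updated, the element ends up in a set at least twice as large as before, so no element is updated more than log2 n times. The sketch below illustrates the heuristic and counts pointer updates. It is a simplification under stated assumptions: the class name `WeightedListSets` and its fields are hypothetical, and Python lists and dictionaries stand in for the linked lists and per-node head pointers.

```python
class WeightedListSets:
    """Weighted-union sketch; Python lists stand in for linked lists."""

    def __init__(self):
        self.head = {}      # element -> representative of its set
        self.members = {}   # representative -> elements of its set
        self.updates = 0    # number of head-pointer updates performed

    def make_set(self, x):
        self.head[x] = x
        self.members[x] = [x]

    def find(self, x):
        return self.head[x]

    def union(self, rep_a, rep_b):
        """Merge the shorter list into the longer one."""
        if rep_a == rep_b:
            return rep_a
        if len(self.members[rep_a]) < len(self.members[rep_b]):
            rep_a, rep_b = rep_b, rep_a   # rep_b now names the shorter list
        for x in self.members[rep_b]:
            self.head[x] = rep_a
            self.updates += 1
        self.members[rep_a].extend(self.members.pop(rep_b))
        return rep_a


sets = WeightedListSets()
for i in range(8):
    sets.make_set(i)
# Merge pairwise, then pairs of pairs, then the two halves.
for step in (1, 2, 4):
    for i in range(0, 8, 2 * step):
        sets.union(sets.find(i), sets.find(i + step))
total_updates = sets.updates   # 4 + 4 + 4 = 12, within 8 * log2(8) = 24
```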

...
Wikipedia