Information integration (II) is the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations. It is used in data mining and in the consolidation of data from unstructured or semi-structured resources. Typically, information integration refers to textual representations of knowledge, but it is sometimes applied to rich-media content. Information fusion, a related term, involves combining information into a new set of information with the aim of reducing redundancy and uncertainty.
Examples of technologies available to integrate information include deduplication and string metrics, which allow the detection of similar text in different data sources by fuzzy matching. A host of methods for these research areas are available, such as those presented by the International Society of Information Fusion.
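As an illustration of how a string metric can support deduplication, the following minimal Python sketch uses the Levenshtein edit distance as the metric. The function names (`levenshtein`, `similarity`, `deduplicate`), the lowercasing step and the 0.8 threshold are assumptions chosen for this example, not part of any particular integration system; real systems typically combine several metrics and tune thresholds per data source.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b: the minimum number of
    single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the edit distance to a score in [0, 1], ignoring case
    and surrounding whitespace (an assumption made for this sketch)."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def deduplicate(records, threshold=0.8):
    """Keep the first record of each group of near-duplicates.
    The threshold is a hypothetical value, not a recommended default."""
    kept = []
    for record in records:
        if all(similarity(record, other) < threshold for other in kept):
            kept.append(record)
    return kept


# Records as they might arrive from two different sources.
names = ["Jon Smith", "John Smith", "Jane Doe"]
unique_names = deduplicate(names)
```

In this sketch, "Jon Smith" and "John Smith" differ by a single edit and are treated as the same entity, while "Jane Doe" is kept as a separate record. The pairwise comparison is quadratic in the number of records, so larger data sets would usually need a blocking or indexing step first.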