Comparative linguistics (originally comparative philology) is a branch of historical linguistics that is concerned with comparing languages to establish their historical relatedness.
Genetic relatedness implies a common origin or proto-language, and comparative linguistics aims to construct language families, to reconstruct proto-languages, and to specify the changes that have resulted in the documented languages. To maintain a clear distinction between attested and reconstructed forms, comparative linguists prefix an asterisk to any form that is not found in surviving texts. A number of methods for carrying out language classification have been developed over a long period, ranging from simple inspection to computerised hypothesis testing.
The fundamental technique of comparative linguistics is to compare the phonological systems, morphological systems, syntax and lexicon of two or more languages using techniques such as the comparative method. In principle, every difference between two related languages should be explicable to a high degree of plausibility, and systematic changes, for example in phonological or morphological systems, are expected to be highly regular (i.e. consistent). In practice, the comparison may be more restricted, e.g. just to the lexicon. With some methods it may be possible to reconstruct an earlier proto-language. Although the proto-languages reconstructed by the comparative method are hypothetical, a reconstruction may have predictive power. The most notable example of this is Ferdinand de Saussure's proposal that the Indo-European consonant system contained laryngeals, a type of consonant attested in no Indo-European language known at the time. The hypothesis was vindicated with the discovery of Hittite, which proved to have exactly the consonants Saussure had hypothesised in the environments he had predicted.
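The expectation of regularity can be illustrated with a small sketch. It tallies how the initial consonants of a few well-known Latin and English cognates line up, and keeps only the pairings that recur. This is a toy under stated assumptions: spelling stands in for phonemes, only the first segment is compared, and the function names and the recurrence threshold are hypothetical choices made for illustration, not part of the comparative method as practised.

```python
from collections import Counter

# Well-known Latin / English cognate pairs.
# Assumption: orthography is used as a rough proxy for the sounds involved.
COGNATE_PAIRS = [
    ("pater", "father"), ("pes", "foot"), ("piscis", "fish"),
    ("decem", "ten"), ("duo", "two"), ("dens", "tooth"),
    ("cornu", "horn"), ("centum", "hundred"), ("cor", "heart"),
]


def initial_correspondences(pairs):
    """Count how often each (initial letter, initial letter) pairing occurs."""
    return Counter((a[0], b[0]) for a, b in pairs)


def recurrent(counts, minimum=2):
    """Keep only pairings seen at least `minimum` times (hypothetical threshold)."""
    return {pair: n for pair, n in counts.items() if n >= minimum}


counts = initial_correspondences(COGNATE_PAIRS)
regular = recurrent(counts)

for (latin, english), n in sorted(regular.items()):
    print(f"Latin {latin}- : English {english}-  ({n} pairs)")
```

A real application would align whole words segment by segment and work with phonemic transcriptions. The point here is only that a correspondence counts as evidence when it recurs across many items rather than appearing once.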
Where languages are derived from a very distant ancestor, and are thus more distantly related, the comparative method becomes impracticable. In particular, attempting to relate two reconstructed proto-languages by the comparative method has not generally produced results that have met with wide acceptance. The method has also not been very good at unambiguously identifying sub-families, and different scholars have produced conflicting results, for example in Indo-European. A number of methods based on statistical analysis of vocabulary have been developed to try to overcome this limitation, such as lexicostatistics and mass comparison. The former, like the comparative method, uses lexical cognates, whereas the latter uses only lexical similarity. The theoretical basis of such methods is that vocabulary items can be matched without a detailed language reconstruction and that comparing enough vocabulary items will cancel out individual inaccuracies. Thus they can be used to determine relatedness but not to reconstruct the proto-language.
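A minimal sketch of the lexicostatistical idea follows: given a word list in which each form has already been assigned to a cognate class, the share of meanings for which two languages have cognate forms serves as a rough measure of relatedness. The five-item list, the simplified cognate judgments and the names `WORDLIST`, `shared_cognate_share` and `pairwise_table` are assumptions made for illustration. Actual studies use standardised lists of a hundred or more meanings.

```python
from itertools import combinations

# For each meaning, forms sharing a letter are judged cognate (simplified):
#   hand:  hand / Hand / main        water: water / Wasser / eau
#   two:   two / zwei / deux         dog:   dog / Hund / chien
#   fish:  fish / Fisch / poisson
WORDLIST = {
    "hand":  {"English": "A", "German": "A", "French": "B"},
    "water": {"English": "A", "German": "A", "French": "B"},
    "two":   {"English": "A", "German": "A", "French": "A"},
    "dog":   {"English": "A", "German": "B", "French": "B"},
    "fish":  {"English": "A", "German": "A", "French": "A"},
}


def shared_cognate_share(wordlist, lang_a, lang_b):
    """Fraction of meanings for which both languages have cognate forms."""
    comparable = [
        classes for classes in wordlist.values()
        if lang_a in classes and lang_b in classes
    ]
    if not comparable:
        raise ValueError("no meanings attested in both languages")
    shared = sum(1 for c in comparable if c[lang_a] == c[lang_b])
    return shared / len(comparable)


def pairwise_table(wordlist, languages):
    """Shared-cognate share for every unordered pair of languages."""
    return {
        (a, b): shared_cognate_share(wordlist, a, b)
        for a, b in combinations(languages, 2)
    }


table = pairwise_table(WORDLIST, ["English", "German", "French"])
for (a, b), share in table.items():
    print(f"{a} / {b}: {share:.0%} shared cognates")
```

On this toy list English and German share more cognates than either does with French, which matches the accepted grouping. With so few items the figures carry no statistical weight. Note also that the sketch takes the cognate judgments as given, and yields a similarity score, not a reconstructed proto-form.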