Computational lexicology


Computational lexicology is a branch of computational linguistics concerned with the use of computers in the study of the lexicon. Some scholars (Amsler, 1980) have described it more narrowly as the use of computers in the study of machine-readable dictionaries. It is distinguished from computational lexicography, which more properly refers to the use of computers in the construction of dictionaries, though some researchers have used the two terms synonymously.

Computational lexicology emerged as a separate discipline within computational linguistics with the appearance of machine-readable dictionaries, starting with the creation of the machine-readable tapes of the Merriam-Webster Seventh Collegiate Dictionary and the Merriam-Webster New Pocket Dictionary in the 1960s by John Olney et al. at System Development Corporation. Today, computational lexicology is best known through the creation and applications of WordNet. As the computing power available to researchers increased over time, computational lexicology came to be applied widely in text analysis. In 1987, Byrd, Calzolari, and Chodorow, among others, developed computational tools for text analysis. In particular, their model was designed to coordinate the associations involving the senses of polysemous words.

Computational lexicology has contributed to the understanding of the content and limitations of print dictionaries for computational purposes (that is, it made clear that earlier lexicographic work was not sufficient for the needs of computational linguistics). Through the work of computational lexicologists, almost every portion of a print dictionary entry has been studied.

Many computational linguists became disenchanted with print dictionaries as a resource for computational linguistics because the dictionaries lacked sufficient syntactic and semantic information for computer programs. Work on computational lexicology quickly led to efforts in two additional directions.

First, collaborative activities between computational linguists and lexicographers led to an understanding of the role that corpora play in creating dictionaries. Most computational lexicologists moved on to build large corpora in order to gather the basic data that lexicographers had used to create dictionaries. The ACL/DCI (Data Collection Initiative) and the LDC (Linguistic Data Consortium) followed this path. The advent of markup languages led to the creation of tagged corpora that could be more easily analyzed to create computational linguistic systems. Part-of-speech tagged corpora and semantically tagged corpora were created in order to test and develop POS taggers and word-sense disambiguation technology.
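
As a rough illustration of how a part-of-speech tagged corpus can be used to develop and test a tagger, the following Python sketch trains a most-frequent-tag baseline on a few hand-tagged sentences and measures its accuracy on held-out ones. The sentences, the tag names (DET, NOUN, VERB, and so on), and the function names are all invented for this example; they are assumptions and do not come from the ACL/DCI or LDC corpora or from any particular tagger.

```python
from collections import Counter, defaultdict

# Toy hand-tagged corpus: each sentence is a list of (word, tag) pairs.
# The sentences and tag names are hypothetical, chosen only for illustration.
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("bank", "NOUN"), ("closed", "VERB")],
    [("they", "PRON"), ("bank", "VERB"), ("online", "ADV")],
    [("a", "DET"), ("bank", "NOUN"), ("opened", "VERB")],
]
TEST = [
    [("the", "DET"), ("bank", "NOUN"), ("barks", "VERB")],
    [("they", "PRON"), ("bank", "VERB"), ("the", "DET"), ("dog", "NOUN")],
]


def train_baseline(sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_sentence(model, words, default="NOUN"):
    """Tag a list of words; unseen words fall back to the default tag."""
    return [model.get(word.lower(), default) for word in words]


def accuracy(model, sentences):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sentence in sentences:
        words = [word for word, _ in sentence]
        gold = [tag for _, tag in sentence]
        predicted = tag_sentence(model, words)
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    return correct / total if total else 0.0


model = train_baseline(TRAIN)
score = accuracy(model, TEST)
print(f"token accuracy: {score:.3f}")
```

The baseline mis-tags the verb use of "bank" in the second test sentence because the noun reading is more frequent in the training data. Errors of this kind, where an ambiguous word needs context to be resolved, are what tagged corpora make visible and measurable.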


...
Wikipedia

...