*** Welcome to piglix ***

Textual analysis


Content analysis is a research method for studying communication artifacts. Social scientists use content analysis to quantify patterns in communication. Practices and philosophies of content analysis vary between scholarly communities. They all involve systematic reading or observation of texts or artifacts which are assigned labels (sometimes called codes) to indicate the presence of interesting, meaningful patterns . After labeling a large set of texts, a researcher is able to statistically estimate the proportions of patterns in the texts, as well as correlations between patterns. Computers are increasingly used in content analysis. Popular qualitative data analysis programs provide efficient work-flow and data management tools for labeling. Simple computational techniques can provide descriptive data such as word frequencies and document lengths. Machine learning classifiers can greatly increase the number of texts which can be labeled, but the scientific utility of doing so is a matter of debate.

Content analysis is best understood as a broad family of techniques. Effective researchers choose techniques that best help them answer their substantive questions. That said, according to Klaus Krippendorff, six questions must be addressed in every content analysis:

The simplest and most objective form of content analysis considers unambiguous characteristics of the text such as word frequencies, the page area taken by a newspaper column, or the duration of a radio or television program. Analysis of simple word frequencies is limited because the meaning of a word depends on surrounding text. Keyword In Context routines address this by placing words in their textual context. This helps resolve ambiguities such as those introduced by synonyms and homonyms.

A further step in analysis is the distinction between dictionary-based (quantitative) approaches and qualitative approaches. Dictionary-based approaches set up a list of categories derived from the frequency list of words and control the distribution of words and their respective categories over the texts. While methods in quantitative content analysis in this way transform observations of found categories into quantitative statistical data, the qualitative content analysis focuses more on the intentionality and its implications. There are strong parallels between qualitative content analysis and thematic analysis.


More generally, content analysis is research using the categorization and classification of speech, written text, interviews, images, or other forms of communication. In its beginnings, using the first newspapers at the end of the 19th century, analysis was done manually by measuring the number of lines and amount of space given a subject. With the rise of common computing facilities like PCs, computer-based methods of analysis are growing in popularity. Answers to open ended questions, newspaper articles, political party manifestoes, medical records or systematic observations in experiments can all be subject to systematic analysis of textual data.


...
Wikipedia

...