Data-driven journalism


Data-driven journalism, often shortened to "ddj", is a term in use since 2009 to describe a journalistic process based on analyzing and filtering large data sets for the purpose of creating a news story. The main drivers of this process are newly available resources such as open source software, open access publishing and open data. This approach to journalism builds on older practices, most notably computer-assisted reporting (CAR), a label used mainly in the US for decades. Another label for a partially similar approach is "precision journalism", which comes from a book by Philip Meyer, published in 1972, in which he advocated the use of techniques from the social sciences in researching stories.

Data-driven journalism takes an even wider approach. At its core, the process builds on the growing availability of open data that is freely available online and can be analyzed with open source tools. Data-driven journalism strives to reach new levels of service for the public, helping consumers, managers and politicians understand patterns and make decisions based on the findings. As such, data-driven journalism might help give journalists a newly relevant role in society.

As projects such as the MPs' expenses scandal (2009) and the 2013 release of the "Offshore Leaks" demonstrate, data-driven journalism can also assume an investigative role, on occasion dealing with "not-so-open", that is, secret, data.

According to information architect and multimedia journalist Mirko Lorenz, data-driven journalism is primarily a workflow that consists of the following elements: digging deep into data by scraping, cleansing and structuring it; filtering it by mining for specific information; visualizing it; and making a story. This process can be extended to provide results that cater to individual interests as well as the broader public.
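
As a rough illustration only, the cleansing, structuring, filtering and visualizing steps of such a workflow might look like the following Python sketch. The inline data set, the field names, the 500 threshold and the helper functions are all hypothetical and stand in for whatever a real scraping step would deliver; this is not a method described by Lorenz.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw data, standing in for the output of a scraping step.
RAW_CSV = """name,department,amount
 Alice ,Health,"1,200"
Bob,Transport,n/a
Carol,Health,800
Dan,Transport,450
"""


def clean(rows):
    """Cleanse: strip stray whitespace, parse amounts, skip unparseable rows."""
    for row in rows:
        try:
            amount = float(row["amount"].replace(",", ""))
        except ValueError:
            continue  # e.g. "n/a" cannot be read as a number
        yield {
            "name": row["name"].strip(),
            "department": row["department"].strip(),
            "amount": amount,
        }


def totals_by_department(records):
    """Structure: aggregate the cleaned records into one total per department."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["department"]] += rec["amount"]
    return dict(totals)


def bar_chart(totals, width=20):
    """Visualize: render the totals as plain-text bars, largest first."""
    peak = max(totals.values())
    lines = []
    for dept, total in sorted(totals.items(), key=lambda kv: -kv[1]):
        bar = "#" * round(width * total / peak)
        lines.append(f"{dept:<10} {bar} {total:.0f}")
    return lines


records = list(clean(csv.DictReader(io.StringIO(RAW_CSV))))
# Filter: keep only the records relevant to the story (assumed threshold).
large = [r for r in records if r["amount"] >= 500]
totals = totals_by_department(records)
chart = bar_chart(totals)
print("\n".join(chart))
```

The final step, making a story, is editorial rather than computational, which is why the sketch stops at the chart.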

Data journalism trainer and writer Paul Bradshaw describes the process of data-driven journalism in a similar manner: data must first be found, which may require specialized skills such as MySQL or Python; then interrogated, for which an understanding of jargon and statistics is necessary; and finally visualized and mashed up with the aid of open source tools.
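
To make the "interrogating" step more concrete, the sketch below queries a small table with SQL and compares two summary statistics. It uses Python's built-in `sqlite3` module as a stand-in for MySQL, and the `claims` table, its values and the "three times the median" rule are assumptions made purely for illustration; they do not come from Bradshaw.

```python
import sqlite3
import statistics

# In-memory SQLite database as a stand-in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (member TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO claims VALUES (?, ?)",
    [("A", 100.0), ("B", 120.0), ("C", 110.0), ("D", 130.0), ("E", 2000.0)],
)

# Find: pull the figures of interest out of the database.
amounts = [row[0] for row in conn.execute("SELECT amount FROM claims ORDER BY amount")]

# Interrogate: one extreme value drags the mean far above the median,
# so reporting only "the average" would mislead readers.
mean = statistics.mean(amounts)
median = statistics.median(amounts)

# Flag records far above the typical value (the 3x factor is an arbitrary choice).
outliers = conn.execute(
    "SELECT member, amount FROM claims WHERE amount > ?", (3 * median,)
).fetchall()

print(f"mean={mean}, median={median}, outliers={outliers}")
```

The gap between mean and median here is the kind of statistical detail a reporter would need to understand before drawing conclusions from the data.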


...
Wikipedia

...