Data deluge


The information explosion is the rapid increase in the amount of published information or data and the effects of this abundance. As the amount of available data grows, managing it becomes more difficult, which can lead to information overload. The Online Oxford English Dictionary indicates use of the phrase in a March 1964 New Statesman article. The New York Times first used the phrase in its editorial content in an article by Walter Sullivan on June 7, 1964, in which he described the phrase as "much discussed" (p. 11). The earliest use of the phrase seems to have been in an IBM advertising supplement to the New York Times published on April 30, 1961, and by Frank Fremont-Smith, Director of the American Institute of Biological Sciences Interdisciplinary Conference Program, in an April 1961 article in the AIBS Bulletin (p. 18).

Many sectors are seeing this rapid increase in the amount of available information, including healthcare, supermarkets, and even governments, which hold birth certificate information and immunization records. Journalism is another sector affected by this phenomenon. The profession, which in the past was responsible for disseminating information, may be suppressed by the overabundance of information sources today.

Techniques to gather knowledge from an overabundance of electronic information (e.g., data fusion may help in data mining) have existed since the 1970s. Another common technique for dealing with such volumes of information is qualitative research, which aims to organize, synthesize, categorize, and systematize the information so that it is more usable and easier to search.

A new metric used to characterize the growth in person-specific information is disk storage per person (DSP), measured in megabytes/person (where a megabyte is 10^6 bytes and is abbreviated MB). Global DSP (GDSP) is the total rigid disk drive space (in MB) of new units sold in a year divided by the world population in that year. The GDSP metric is a crude measure of how much disk storage could possibly be used to collect person-specific data on the world population. In 1983, one million fixed drives with an estimated total of 90 terabytes were sold worldwide; 30 MB drives had the largest market segment. In 1996, 105 million drives, totaling 160,623 terabytes, were sold, with 1 and 2 gigabyte drives leading the industry. By the year 2000, with 20 GB drives leading the industry, rigid drives sold for the year were projected to total 2,829,288 terabytes ("Rigid disk drive sales to top $34 billion in 1997").
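The GDSP definition above is a simple ratio, and a short Python sketch can make the arithmetic concrete. The terabyte totals are the ones quoted in the text. The world population figures are not given in the text; they are rough round-number assumptions added only for illustration, so the results should be read as order-of-magnitude estimates. The names gdsp, SALES_TB, ASSUMED_POPULATION and gdsp_by_year are hypothetical.

```python
# Decimal units, as in the text: 1 MB = 10**6 bytes, so 1 TB = 10**6 MB.
MB_PER_TB = 10**6


def gdsp(total_terabytes_sold: float, world_population: float) -> float:
    """Global disk storage per person (GDSP), in MB/person.

    Total capacity of new drives sold in a year, converted to MB,
    divided by the world population in that year.
    """
    if world_population <= 0:
        raise ValueError("world_population must be positive")
    return total_terabytes_sold * MB_PER_TB / world_population


# Terabytes of new rigid disk capacity sold per year, as quoted in the text
# (the year 2000 figure is a projection).
SALES_TB = {1983: 90, 1996: 160_623, 2000: 2_829_288}

# ASSUMPTION: approximate world populations, not taken from the text.
ASSUMED_POPULATION = {1983: 4.7e9, 1996: 5.8e9, 2000: 6.1e9}


def gdsp_by_year() -> dict:
    """GDSP for each year that has both a sales and a population figure."""
    return {
        year: gdsp(SALES_TB[year], ASSUMED_POPULATION[year])
        for year in SALES_TB
    }


if __name__ == "__main__":
    for year, value in gdsp_by_year().items():
        print(f"{year}: {value:.2f} MB/person")
```

Under these assumed populations, GDSP rises from a small fraction of a megabyte per person in 1983 to hundreds of megabytes per person in the 2000 projection, which is the growth the metric is meant to capture.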


...
Wikipedia

...