*** Welcome to piglix ***

De-anonymization


De-anonymization (also spelt as deanonymization) is a strategy in data mining in which anonymous data is cross-referenced with other sources of data to re-identify the anonymous data source. The term became popular in 2006 when Arvind Narayanan and Vitaly Shmatikov entered a contest hosted by Netflix, and applied their de-anonymization techniques to successfully identify Netflix data for a number of specific members.

More and more data are becoming publicly available over the Internet. These data are released after applying some anonymization techniques like removing personally identifiable information (PII) such as names, addresses and social security numbers to ensure the sources' privacy. This assurance of privacy allows the government to legally share limited data sets with third parties without requiring written permission. Such data has proved to be very valuable for researchers, particularly in health care. However, as the Netflix contest dramatically revealed so much of data is available, even after anonymization, that a specific individual’s identity could be re-discovered.


...
Wikipedia

...