
Data Re-Identification


Data re-identification is the practice of matching de-identified data with publicly available information, or auxiliary data, in order to discover the individual to whom the data belongs. This is a concern because companies with privacy policies, health care providers, and financial institutions may release the data they collect after it has gone through a de-identification process. De-identification involves masking, generalizing, or deleting both direct and indirect identifiers; the definition of this process is not universal, however. Information in the public domain, even when seemingly anonymized, may thus be re-identified by combining it with other available data and applying basic computer science techniques. The Common Rule Agencies, a collection of U.S. federal agencies and departments including the U.S. Department of Health and Human Services, speculate that re-identification is becoming gradually easier because of "big data": the abundance and constant collection and analysis of information, together with the evolution of technologies and advances in algorithms. However, others have claimed that de-identification is a safe and effective data liberation tool and do not view re-identification as a concern.
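The matching described above can be illustrated with a minimal sketch of a linkage on shared indirect identifiers. Everything in it is an assumption made for illustration: the field names, the fictional records, and the helper link_records are hypothetical, and real linkage work usually has to cope with missing or inconsistent values.

```python
from collections import defaultdict

# Indirect identifiers assumed to appear in both data sets (hypothetical field names).
QUASI_IDENTIFIERS = ("gender", "birthdate", "zip")


def link_records(deidentified, auxiliary, keys=QUASI_IDENTIFIERS):
    """Pair each de-identified row with a name when exactly one auxiliary row matches."""
    # Index the auxiliary (public) records by their quasi-identifier values.
    index = defaultdict(list)
    for row in auxiliary:
        index[tuple(row[k] for k in keys)].append(row)

    matches = []
    for row in deidentified:
        candidates = index.get(tuple(row[k] for k in keys), [])
        # Only an unambiguous match counts as a re-identification here.
        if len(candidates) == 1:
            matches.append((row, candidates[0]["name"]))
    return matches


# Fictional records: the released data has no names, the auxiliary data does.
deidentified = [
    {"gender": "F", "birthdate": "1970-01-15", "zip": "00001", "diagnosis": "asthma"},
    {"gender": "M", "birthdate": "1982-06-30", "zip": "00002", "diagnosis": "diabetes"},
]
auxiliary = [
    {"name": "Alice Example", "gender": "F", "birthdate": "1970-01-15", "zip": "00001"},
    {"name": "Bob Example", "gender": "M", "birthdate": "1982-06-30", "zip": "00002"},
    {"name": "Bill Example", "gender": "M", "birthdate": "1982-06-30", "zip": "00002"},
]

reidentified = link_records(deidentified, auxiliary)
```

In this toy data the first released row links to a single name, while the second matches two auxiliary records and so stays ambiguous.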

A 2000 study found that 87 percent of the U.S. population can be identified using a combination of their gender, birthdate, and zip code. Others do not think that re-identification is a serious threat and call it a "myth"; they claim that the combination of zip code, date of birth, and gender is rare or only partially complete, for example only the year and month of birth without the day, or the county name instead of the specific zip code, so the risk of such re-identification is reduced in many instances.
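Both sides of this argument turn on how many records are unique for a given combination of attributes. The sketch below measures that share on a handful of fictional records, before and after truncating the birthdate to year and month. The helper names unique_fraction and generalize_birthdate are hypothetical, and the numbers say nothing about any real population.

```python
from collections import Counter


def unique_fraction(records, keys):
    """Share of records whose combination of `keys` values appears exactly once."""
    if not records:
        return 0.0
    combos = [tuple(r[k] for k in keys) for r in records]
    counts = Counter(combos)
    return sum(1 for c in combos if counts[c] == 1) / len(records)


def generalize_birthdate(record):
    """Copy a record, keeping only year and month of an ISO birthdate (YYYY-MM)."""
    out = dict(record)
    out["birthdate"] = record["birthdate"][:7]
    return out


# Fictional records for illustration only.
records = [
    {"gender": "F", "birthdate": "1970-01-15", "zip": "00001"},
    {"gender": "F", "birthdate": "1970-01-28", "zip": "00001"},
    {"gender": "M", "birthdate": "1982-06-30", "zip": "00002"},
    {"gender": "M", "birthdate": "1982-06-03", "zip": "00002"},
    {"gender": "F", "birthdate": "1990-11-09", "zip": "00003"},
]
keys = ("gender", "birthdate", "zip")

full = unique_fraction(records, keys)
coarse = unique_fraction([generalize_birthdate(r) for r in records], keys)
```

With full birthdates every record in this toy set is unique; once the day is dropped, only one of the five remains unique, which is the effect the skeptics' argument relies on.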

Existing privacy regulations typically protect information that has been modified so that the data is deemed anonymized, or de-identified. For financial information, the Federal Trade Commission permits its circulation if it is de-identified and aggregated. The Gramm-Leach-Bliley Act (GLBA), which mandates that financial institutions give consumers the opportunity to opt out of having their information shared with third parties, does not cover de-identified data if the information is aggregated and does not contain personal identifiers, since such data is not treated as personally identifiable information.


...
Wikipedia

...