In information science, profiling refers to the process of constructing and applying user profiles generated by computerized data analysis.
This involves the use of algorithms or other mathematical techniques that allow the discovery of patterns or correlations in large quantities of data aggregated in databases. When these patterns or correlations are used to identify or represent people, they can be called profiles. Unlike discussions of profiling technologies or population profiling, the notion of profiling in this sense concerns not only the construction of profiles but also the application of group profiles to individuals, e.g., in credit scoring, price discrimination, or the identification of security risks (Hildebrandt & Gutwirth 2008; Elmer 2004).
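The distinction between constructing a group profile and applying it to an individual can be illustrated with a minimal Python sketch. It is not drawn from the cited sources: the records, the attribute names (`age_band`, `area`), the 0.5 threshold, and the function names are all hypothetical, chosen only to show that the individual is labelled from the aggregate of the group they fall into rather than from their own behaviour.

```python
from collections import defaultdict

# Hypothetical records: (age_band, area, defaulted_on_loan). Illustrative only.
RECORDS = [
    ("18-25", "north", True),
    ("18-25", "north", False),
    ("18-25", "north", True),
    ("26-40", "south", False),
    ("26-40", "south", False),
    ("26-40", "south", True),
    ("26-40", "south", False),
]


def build_group_profiles(records):
    """Construction step: aggregate records into a default rate per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [defaults, total]
    for age_band, area, defaulted in records:
        counts[(age_band, area)][0] += int(defaulted)
        counts[(age_band, area)][1] += 1
    return {group: d / n for group, (d, n) in counts.items()}


def apply_profile(profiles, age_band, area, threshold=0.5):
    """Application step: label a person using only their group's rate.

    The threshold is an assumed cut-off, not a value from any real scheme.
    """
    rate = profiles.get((age_band, area))
    if rate is None:
        return None  # no group profile covers this person
    return "high risk" if rate > threshold else "low risk"


profiles = build_group_profiles(RECORDS)
applicant_label = apply_profile(profiles, "18-25", "north")
```

In this toy data an applicant in the first group is labelled from a group rate of two defaults in three records, regardless of that applicant's own history, which is the sense in which a group profile is applied to an individual.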
Profiling is not simply a matter of computerized pattern recognition; it enables refined price discrimination, targeted servicing, fraud detection, and extensive social sorting. Real-time machine profiling is the precondition for the emerging socio-technical infrastructures envisioned by advocates of ambient intelligence, autonomic computing (Kephart & Chess 2003), and ubiquitous computing (Weiser 1991).
One of the most challenging problems of the information society is dealing with growing data overload. With the digitization of all sorts of content and the improvement and falling cost of recording technologies, the amount of available information has become enormous and increases exponentially. It has thus become important for companies, governments, and individuals to distinguish information from noise and to detect useful or interesting data. The development of profiling technologies must be seen against this background. These technologies are thought to collect and analyse data efficiently in order to find or test knowledge in the form of statistical patterns in the data. This process, called Knowledge Discovery in Databases (KDD) (Fayyad, Piatetsky-Shapiro & Smyth 1996), provides the profiler with sets of correlated data that can be used as "profiles".
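As a rough illustration of what finding statistical patterns can mean in practice, the following Python sketch computes the Pearson correlation for every pair of attributes and keeps only the strongly correlated pairs. It covers a single pattern-finding step, not the full KDD process described by Fayyad et al., and the attribute names, data values, and the 0.8 cut-off are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt

# Hypothetical attribute columns, one value per observed user. Illustrative only.
DATA = {
    "night_logins": [1, 2, 3, 4, 5],
    "purchases": [2, 4, 6, 8, 10],
    "page_views": [5, 3, 6, 2, 4],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def find_patterns(data, min_abs_r=0.8):
    """Return (attribute_a, attribute_b, r) for strongly correlated pairs.

    min_abs_r is an assumed cut-off for what counts as a pattern.
    """
    patterns = []
    for a, b in combinations(data, 2):
        r = pearson(data[a], data[b])
        if abs(r) >= min_abs_r:
            patterns.append((a, b, r))
    return patterns


patterns = find_patterns(DATA)
```

With these values only the pair `night_logins` and `purchases` passes the cut-off. A correlation of this kind says nothing about causes or reasons; it is simply a set of correlated data that a profiler could go on to use as a profile.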