*** Welcome to piglix ***

Auditory scene analysis


In psychophysics, auditory scene analysis (ASA) is a proposed model for the basis of auditory perception. This is understood as the process by which the human auditory system organizes sound into perceptually meaningful elements. The term was coined by psychologist Albert Bregman. The related concept in machine perception is computational auditory scene analysis (CASA), which is closely related to source separation and blind signal separation.

The three key aspects of Bregman's ASA model are: segmentation, integration, and segregation.

Sound reaches the ear and the eardrum vibrates as a whole. This signal has to be analyzed (in some way). The model proposes that sounds will either be heard as "integrated" (heard as a whole -- much like harmony in music), or "segregated" into individual components (which leads to counterpoint). For example, a bell can be heard as a 'single' sound (integrated), or some people are able to hear the individual components -- they are able to segregate the sound. This can be done with chords where it can be heard as a 'color', or as the individual notes.

In many circumstances the segregated elements can be linked together in time, producing an auditory stream. This ability of auditory streaming can be demonstrated by the so-called cocktail party effect. Up to a point, with a number of voices speaking at the same time or with background sounds, one is able to follow a particular voice even though other voices and background sounds are present. In this example, the ear is segregating this voice from other sounds (which are integrated), and the mind "streams" these segregated sounds into an auditory stream. This is a skill which is highly developed by musicians, notably conductors who are able to listen to one, two, three or more instruments at the same time (segregating them), and following each as an independent line through auditory streaming. Organists also develop this skill having to stream up to five or more voices [parts] at a time.

Natural sounds, such as the human voice, musical instruments, or cars passing in the street, are made up of many frequencies, which contribute to the perceived quality (or timbre) of the sounds. When two or more natural sounds occur at once, all the components of the simultaneously active sounds are received at the same time, or overlapped in time, by the ears of listeners. This presents their auditory systems with a problem: Which parts of the sound should be grouped together and treated as parts of the same source or object? Grouping them incorrectly can cause the listener to hear non-existent sounds built from the wrong combinations of the original components.


...
Wikipedia

...