Auditory masking

Auditory masking occurs when the perception of one sound is affected by the presence of another sound.

Auditory masking in the frequency domain is known as simultaneous masking, frequency masking or spectral masking. Auditory masking in the time domain is known as temporal masking or non-simultaneous masking.

The unmasked threshold is the quietest level of the signal which can be perceived without a masking signal present. The masked threshold is the quietest level of the signal perceived when combined with a specific masking noise. The amount of masking is the difference between the masked and unmasked thresholds.

Gelfand provides a basic example. Let us say that for a given individual, the sound of a cat scratching a post in an otherwise quiet environment is first audible at a level of 10 dB SPL. However, in the presence of a masking noise (for example, a vacuum cleaner that is running simultaneously) that same individual cannot detect the sound of the cat scratching unless the level of the scratching sound is at least 26 dB SPL. We would say that the unmasked threshold for that individual for the target sound (i.e., the cat scratching) is 10 dB SPL, while the masked threshold is 26 dB SPL. The amount of masking is simply the difference between these two thresholds: 16 dB.

The amount of masking will vary depending on the characteristics of both the target signal and the masker, and will also be specific to an individual listener. While the person in the example above was able to detect the cat scratching at 26 dB SPL, another person may not be able to hear the cat scratching while the vacuum was on until the sound level of the cat scratching was increased to 30 dB SPL (thereby making the amount of masking for the second listener 20 dB).

Simultaneous masking occurs when a sound is made inaudible by a noise or unwanted sound of the same duration as the original sound. For example, a powerful spike at 1 kHz will tend to mask out a lower-level tone at 1.1 kHz. Also, two sine tones at 440 and 450 Hz can be perceived clearly when separated. They cannot be perceived clearly when presented simultaneously.

If two sounds of two different frequencies are played at the same time, two separate sounds can often be heard rather than a combination tone. The ability to hear frequencies separately is known as frequency resolution or frequency selectivity. When signals are perceived as a combination tone, they are said to reside in the same critical bandwidth. This effect is thought to occur due to filtering within the cochlea, the hearing organ in the inner ear. A complex sound is split into different frequency components and these components cause a peak in the pattern of vibration at a specific place on the cilia inside the basilar membrane within the cochlea. These components are then coded independently on the auditory nerve which transmits sound information to the brain. This individual coding only occurs if the frequency components are different enough in frequency, otherwise they are in the same critical band and are coded at the same place and are perceived as one sound instead of two.

...
Wikipedia