Code-excited linear prediction (CELP) is a speech coding algorithm originally proposed by M. R. Schroeder and B. S. Atal in 1985. At the time, it provided significantly better quality than existing low bit-rate algorithms, such as residual-excited linear prediction and linear predictive coding vocoders (e.g., FS-1015). Along with its variants, such as algebraic CELP, relaxed CELP, low-delay CELP and vector sum excited linear prediction, it is currently the most widely used speech coding algorithm. It is also used in MPEG-4 Audio speech coding. CELP is commonly used as a generic term for a class of algorithms and not for a particular codec.
The CELP algorithm is based on four main ideas:
The original algorithm as simulated in 1983 by Schroeder and Atal required 150 seconds to encode 1 second of speech when run on a Cray-1 supercomputer. Since then, more efficient ways of implementing the codebooks and improvements in computing capabilities have made it possible to run the algorithm in embedded devices, such as mobile phones.
Before exploring the complex encoding process of CELP we introduce the decoder here. Figure 1 describes a generic CELP decoder. The excitation is produced by summing the contributions from an adaptive (a.k.a. pitch) codebook and a stochastic (a.k.a. innovation or fixed) codebook:
where is the adaptive (pitch) codebook contribution and is the stochastic (innovation or fixed) codebook contribution. The fixed codebook is a vector quantization dictionary that is (implicitly or explicitly) hard-coded into the codec. This codebook can be algebraic (ACELP) or be stored explicitly (e.g. Speex). The entries in the adaptive codebook consist of delayed versions of the excitation. This makes it possible to efficiently code periodic signals, such as voiced sounds.