In cryptography, correlation attacks are a class of known plaintext attacks for breaking stream ciphers whose keystream is generated by combining the output of several linear feedback shift registers (called LFSRs for the rest of this article) using a Boolean function. Correlation attacks exploit a statistical weakness that arises from a poor choice of the Boolean function – it is possible to select a function which avoids correlation attacks, so this type of cipher is not inherently insecure. It is simply essential to consider susceptibility to correlation attacks when designing stream ciphers of this type.
Correlation attacks are possible when there is a significant correlation between the output state of one individual LFSR in the keystream generator and the output of the Boolean function that combines the output state of all of the LFSRs. Combined with partial knowledge of the keystream (which is easily derived from partial knowledge of the plaintext, as the two are simply XORed together), this allows an attacker to brute-force the key for that individual LFSR and the rest of the system separately. For instance, if, in a keystream generator in which four 8-bit LFSRs are combined to produce the keystream, and one of the registers is correlated to the Boolean function output, we may brute force it first and then the remaining three, for a total attack complexity of 28 + 224. Compared to the cost of launching a brute force attack on the entire system, with complexity 232, this represents an attack effort saving factor of just under 256, which is substantial. If a second register is correlated with the function, we may repeat this process and drop the attack complexity to 28 + 28 + 216 for an effort saving factor of just under 65028. In this sense, correlation attacks can be considered divide and conquer algorithms.
Correlation attacks are perhaps best explained via example. We will consider the case of the Geffe keystream generator. The Geffe generator consists of three LFSRs: LFSR-1, LFSR-2 and LFSR-3. If we denote the outputs of these registers by , and , respectively, then the Boolean function that combines the three registers to provide the generator output is given by (i.e. ( AND ) XOR (NOT AND )). There are 23 = 8 possible values for the outputs of the three registers, and the value of this combining function for each of them is shown in the table below: