Correlation attack on a simple toy stream-cipher

Let us assume a simple toy stream cipher (otherwise it could have been any of the famous GSM A5/1, Bluetooth E0… stream ciphers) but with 5 registers defined by the following primitive polynomials

p[1] = x2 +               x + 1 ;          11
p[2] = x3 +               x + 1 ;         101
p[3] = x4 +               x + 1 ;       1001
p[4] = x7 +               x + 1 ;  1000001
p[5] = x8 + x7 + x2 + x + 1 ; 11000011

and a non-linear combining function f defined below

f = x1•x2•x3•x4•x5 ⊕ x4•x5 ⊕ x5;

And let’s say we know the first 32 bytes of the clear-text and cipher-text. From these we can retrieve the key-stream used for the first 32 bytes of clear-text as follows

key-stream = clear-text ⊕ cipher-text

Now we have 256 bits of key-stream. What do we want? Well, we want to know the initial states of the 5 registers so that we will be able to break later communications after the 256th bit. How can we do that? One way is to brute force all the possible value combinations of the 5 registers. That is 28 * 27 * 24 * 23 * 22 = 224 number of checks. If the polynomials were of higher order, brute forcing wouldn’t be feasible. So we need another technique to minimize our work. Now we need to see the correlation between the registers and all possible values of combining function. We use Truth Table of five variables and a 6th column holding output of f. Then we compare the output of each register and the 6th column for matches.

  R1 

 R2

  R3

  R4 

  R5

f

0 0 0   0   0   0
0 0 0 0 1 1
0 0 0 1 0 0
0 0 0 1 1 0
0 0 1 0 0 0
0 0 1 0 1 1
0 0 1 1 0 0
0 0 1 1 1 0
0 1 0 0 0 0
0 1 0 0 1 1
0 1 0 1 0 0
0 1 0 1 1 0
0 1 1 0 0 0
0 1 1 0 1 1
0 1 1 1 0 0
0 1 1 1 1 0
1 0 0 0 0 0
1 0 0 0 1 1
1 0 0 1 0 0
1 0 0 1 1 0
1 0 1 0 0 0
1 0 1 0 1 1
1 0 1 1 0 0
1 0 1 1 1 0
1 1 0 0 0 0
1 1 0 0 1 1
1 1 0 1 0 0
1 1 0 1 1 0
1 1 1 0 0 0
1 1 1 0 1 1
1 1 1 1 0 0
1 1 1 1 1 1

We see that register 5 is correlated to f (25/32 matches). Note that we can exploit the correlation of other registers as well. If we used both register 5 and other correlated register together, our final brute force attack would be significantly fast. But for now, let’s just take register 5 only and brute force the rest of the registers. Now that we know register 5 is correlated, we try all the 256 possible values of it to find the one that results with the maximum match with the key-stream. We use the following simple C code

for (i=1; i<256; i++) {
	ivR5 = i;
	for (j=0; j<256; j++) {
		R5 = (ivR5&1)==1?1:0;
		lfsr(polyR5, &ivR5, 1, 8);
		r5[j]=R5;
	}
	
	match = 0;
	for (j=0; j<256; j++) {
		if (r5[j] == key_stream[j])
			match++;
	}
	matches[i] = match;
}

After this run, matches[] contains the number of matches of the key-stream and the output of all possible values of register 5. Note that the lfsr() function is a modified version of Linear Feedback Shift Register (LFSR) taking the polynomial, the initial value of the register, the number of bits to output and the length of the polynomial as argument and returns the output of the register (1 or 0). Now we need to select the maximum as follows

for (i=0; i<max_match; i++) {
	max_match = matches[i];
	idx_max = i;
}

// idx_max will hold the initial value of R5

At this step we have the initial value of register 5. We can proceed to 27 * 24 * 23 * 22 = 216 checks for the rest of the registers (in reality, this would be significant as we eliminated the register with the highest degree). A sample code could look like the following for brute forcing the other registers.

ivR5 = idx_max;
for (i=1; i<128; i++) {
	for (j=1; j<16; j++) {
		for (k=1; k<8; k++) {
			for (l=1; l<4; l++) {
				ivR4 = i;
				ivR3 = j;
				ivR2 = k;
				ivR1 = l;
				.
				.
				.

By now we have the initial values of the 5 registers. We can decrypt and read any communication between the two parties without knowing the symmetric key they used to generate the key-stream. Cheers!