Correlation attack on a simple toy stream cipher

Let us assume a simple toy stream cipher (it could be any of the famous GSM A5/1, Bluetooth E0… stream ciphers) but with 5 registers defined by the following primitive polynomials

p[1] = x2 +               x + 1 ;          11
p[2] = x3 +               x + 1 ;         101
p[3] = x4 +               x + 1 ;       1001
p[4] = x7 +               x + 1 ;  1000001
p[5] = x8 + x7 + x2 + x + 1 ; 11000011

and a non-linear combining function f defined below

f = x1•x2•x3•x4•x5 ⊕ x4•x5 ⊕ x5;

And let’s say we know the first 32 bytes of the clear-text and cipher-text. From these we can retrieve the key-stream used for the first 32 bytes of clear-text as follows

key-stream = clear-text ⊕ cipher-text

Now we have 256 bits of key-stream. What do we want? Well, we want to know the initial states of the 5 registers so that we will be able to break later communications after the 256th bit. How can we do that? One way is to brute force all the possible value combinations of the 5 registers. That is 28 * 27 * 24 * 23 * 22 = 224 number of checks. If the polynomials were of higher order, brute forcing wouldn’t be feasible. So we need another technique to minimize our work. Now we need to see the correlation between the registers and all possible values of combining function. We use Truth Table of five variables and a 6th column holding output of f. Then we compare the output of each register and the 6th column for matches.







0 0 0   0   0   0
0 0 0 0 1 1
0 0 0 1 0 0
0 0 0 1 1 0
0 0 1 0 0 0
0 0 1 0 1 1
0 0 1 1 0 0
0 0 1 1 1 0
0 1 0 0 0 0
0 1 0 0 1 1
0 1 0 1 0 0
0 1 0 1 1 0
0 1 1 0 0 0
0 1 1 0 1 1
0 1 1 1 0 0
0 1 1 1 1 0
1 0 0 0 0 0
1 0 0 0 1 1
1 0 0 1 0 0
1 0 0 1 1 0
1 0 1 0 0 0
1 0 1 0 1 1
1 0 1 1 0 0
1 0 1 1 1 0
1 1 0 0 0 0
1 1 0 0 1 1
1 1 0 1 0 0
1 1 0 1 1 0
1 1 1 0 0 0
1 1 1 0 1 1
1 1 1 1 0 0
1 1 1 1 1 1

We see that register 5 is correlated to f (25/32 matches). Note that we can exploit the correlation of other registers as well. If we used both register 5 and other correlated register together, our final brute force attack would be significantly fast. But for now, let’s just take register 5 only and brute force the rest of the registers. Now that we know register 5 is correlated, we try all the 256 possible values of it to find the one that results with the maximum match with the key-stream. We use the following simple C code

for (i=1; i<256; i++) {
	ivR5 = i;
	for (j=0; j<256; j++) {
		R5 = (ivR5&1)==1?1:0;
		lfsr(polyR5, &ivR5, 1, 8);
	match = 0;
	for (j=0; j<256; j++) {
		if (r5[j] == key_stream[j]) match++;
	matches[i] = match;

After this run, matches[] contains the number of matches of the key-stream and the output of all possible values of register 5. Note that the lfsr() function is a modified version of Linear Feedback Shift Register (LFSR) taking the polynomial, the initial value of the register, the number of bits to output and the length of the polynomial as argument and returns the output of the register (1 or 0). Now we need to select the maximum as follows

for (i=0; i<max_match; i++) {
		max_match = matches[i];
		idx_max = i;
// idx_max will hold the initial value of R5

At this step we have the initial value of register 5. We can proceed to 27 * 24 * 23 * 22 = 216 checks for the rest of the registers (in reality, this would be significant as we eliminated the register with the highest degree). A sample code could look like the following for brute forcing the other registers.

        ivR5 = idx_max;
	for (i=1; i<128; i++) {
	   for (j=1; j<16; j++) {
              for (k=1; k<8; k++) {
		  for (l=1; l<4; l++) {					
		     ivR4 = i;
		     ivR3 = j;
		     ivR2 = k;
		     ivR1 = l;

By now we have the initial values of the 5 registers. We can decrypt and read any communication between the two parties without knowing the symmetric key they used to generate the key-stream. Cheers!


X11 based Linux keylogger

As a challenge to the paper “Unprivileged Black-Box Detection of User-Space Keyloggers“, we were askd to write a Linux keylogger that can hide behind the tool mentioned on the paper. A friend and I came up with several ideas to hide our keylogger from being detected. We didn’t manage to include all the ideas in the keylogger code because of time constraint but the professor approved that if we had included those options, their tool wouldn’t have detected it. Before writing the ideas we came up with, let me explain how their detection method works in a nutshell.

They assumed that, by design, keyloggers capture keystrokes and save them to file. If for example, “AbCdeF” was pressed on the keyboard, somewhere, some process is writing a file with the same bytes. So they check for this correlation by sending keys to all running processes and monitoring file activities and checking the number of bytes written. If the same amount of bytes were written to file, then the process that wrote that file is a keylogger.

Our ideas to circumvent this are presented below. Obviously, these are for educational purpose only.

1. Buffering.  We have a buffer that changes its size every time it writes its content to file. Here is how it works.

Let’s say we have buf[1024]. And let the first random buffer size be 750. Then we keep buffering until 750 is reached and write it to file. Then the next buffer size would be randomly chosen. Say 900. The process continues like that.

2. If we’re on an Internet-connected computer, we directly post the logged keystrokes on a remote server. That’s it, nothing on file.

3. Being selective when we capture keys. Let’s face it, when we capture keys, usually it’s password or something related. So why would we be interested in keys that are entered on Sublime? So, we targeted web browsers: Mozilla Firefox and Google Chrome. What does it mean? Their tool sends our keylogger a key and it is ignored. Keeps logging like a boss 🙂

There were also other ideas like having a different sized output (different size than the entered keys) on file by applying cryptography but the paper says they addressed this issue very well so we didn’t bother to test it. Will post code if anybody is interested.

The C source code of X11 based Linux keylogger can be found here.

This work got us full points.


IRC Remote Controller for uTorrent in C

Few years ago, I made an IRC bot uTorrent remote controller for personal use. There was no such tool during that time (even now, I think) but I badly needed it. The ideal case where you need this tool is for example when you run your torrent client on an office/school computer and you don’t have port forwarding (or you don’t have a static private IP) and you cannot control your torrents using the remote control that ships with uTorrent but you want to control your torrents from home. Let’s say you want something downloaded and you are at home (or anywhere), so you just go to some IRC channel on some server (configurable) and talk to the bot. Current features are adding new torrent (via direct URL), pausing, stoping, deleting and viewing status. A typical command looks like the following:

!add hxxp://
[1] Some.File                                       700MB   340KB/s  68%
[2] Some.Other.File                       110MB   240KB/s  24%

Where the status is shown in different color background depending on the kind (stopped, paused, active). The bot joins the channel with random nick because when connection is interrupted for few minutes and comes back online, the bot cannot join using the old nick. This is because the server thinks the nick is in use until the PING request times out for the previous bot. In order to avoid this problem, each time the bot joins a channel, it uses a random nick. If the bot is in a bad connection, we might see many bots from the same machine but only one is active. The rest will leave the channel when they don’t respond to the server request in time. If anybody is interested, I will check if it still works on the current uTorrent client without modification and post source code (or maybe binary because the IRC bot might be used maliciously) with some explanation on GitHub. I normally prefer writing things in C so this was written in pure C.