Set 4: Stream crypto and randomness

Challenge 25: Break "random access read/write" AES CTR

Back to CTR. Encrypt the recovered plaintext from this file (the ECB exercise) under CTR with a random key (for this exercise the key should be unknown to you, but hold on to it).

Now, write the code that allows you to "seek" into the ciphertext, decrypt, and re-encrypt with different plaintext. Expose this as a function, like, edit(ciphertext, key, offset, newtext).

Imagine the "edit" function was exposed to attackers by means of an API call that didn't reveal the key or the original plaintext; the attacker has the ciphertext and controls the offset and "new text".

Recover the original plaintext.

Food for thought: A folkloric supposed benefit of CTR mode is the ability to easily "seek forward" into the ciphertext; to access byte N of the ciphertext, all you need to be able to do is generate byte N of the keystream. Imagine if you'd relied on that advice to, say, encrypt a disk.

In [19]:
from base64 import b64decode
from itertools import islice
import os

from libmatasano import (
    bxor,
    transform_aes_128_ctr,
    aes_128_ctr_keystream_generator,
    html_test
)

First let's get the data:

In [2]:
with open('data/25.txt') as f:
    data = b64decode(f.read())

I'm wondering if there's something I am doing wrong because the "data" looks meaningless while the instructions say that it is supposed to be "recovered plaintext":

In [8]:
data[:30]
Out[8]:
b'\t\x120\xaa\xde>\xb30\xdb\xaaCX\xf8\x8d*l7\xb7-\x0c\xf4\xc2,4J\xecAB\xd0\x0c'

Well it doesn't really matter for what we're doing here, as long as we are able to recover this data from the ciphertext.

We encrypt the data:

In [3]:
key = os.urandom(16)
nonce = 0

ctxt = transform_aes_128_ctr(data, key, nonce)

Here is the "edit" function. We compute the modified chunk of ciphertext by encrypting the new text, meaning by XORing the new text with the keystream at the proper index.

Then we append and prepend the parts of the ciphertext that do not have to change.

In [10]:
def edit(ctxt, key, nonce, offset, newtext):
    keystream = aes_128_ctr_keystream_generator(key, nonce)

    new_chunk = bxor(newtext,
                     islice(keystream, offset, offset+len(newtext)))

    result = ctxt[:offset] + new_chunk + ctxt[offset+len(newtext):]
    return result

Quick test:

In [15]:
edited_ctxt = edit(ctxt, key, nonce,
                  offset=10, newtext=b'LOOOOOL')

print(data[:20])
print(transform_aes_128_ctr(edited_ctxt, key, nonce)[:20])
b'\t\x120\xaa\xde>\xb30\xdb\xaaCX\xf8\x8d*l7\xb7-\x0c'
b'\t\x120\xaa\xde>\xb30\xdb\xaaLOOOOOL\xb7-\x0c'

Attack

This "edit" function allows us to get two different messages XORed against the same keystream. We already saw that this is bad (see challenges 19 and 20):

$$ ( Msg_1 \oplus Key ) \oplus ( Msg_2 \oplus Key ) = Msg_1 \oplus Msg_2 $$

Then if you know $ Msg_2 $ you can recover $ Msg_1 $.

But here it's even easier: we can choose the second message. Then we can just ask to encrypt a "message" being only null bytes, and... the ciphertext is equal to the keystream, which we can use to decrypt.

It's that simple.

In [16]:
recovered_keystream = edit(ctxt, key, nonce,
                           offset=0,
                           newtext=b'\x00'*len(ctxt))
In [17]:
recovered_plaintext = bxor(ctxt, recovered_keystream)
In [20]:
html_test(recovered_plaintext == data)
OK