Challenge 27: Recover the key from CBC with IV=Key

Take your code from the CBC exercise and modify it so that it repurposes the key for CBC encryption as the IV.

Applications sometimes use the key as an IV on the auspices that both the sender and the receiver have to know the key already, and can save some space by using it as both a key and an IV.

Using the key as an IV is insecure; an attacker that can modify ciphertext in flight can get the receiver to decrypt a value that will reveal the key.

The CBC code from exercise 16 encrypts a URL string. Verify each byte of the plaintext for ASCII compliance (ie, look for high-ASCII values). Noncompliant messages should raise an exception or return an error that includes the decrypted plaintext (this happens all the time in real systems, for what it's worth).

Use your code to encrypt a message that is at least 3 blocks long:

AES-CBC(P_1, P_2, P_3) -> C_1, C_2, C_3

Modify the message (you are now the attacker):

C_1, C_2, C_3 -> C_1, 0, C_1

Decrypt the message (you are now the receiver) and raise the appropriate error if high-ASCII is found.

As the attacker, recovering the plaintext from the error, extract the key:

P'_1 XOR P'_3

So we have access to an encryption and decryption oracle that uses the key as the IV. As a reminder, here is an illustration of CBC decryption:

CBC decryption

When you submit three blocks (C1, C2, C3) of ciphertext to our decryption oracle, the first block of plaintext you get is AES_decrypt(C1, K) XOR IV, which in our case where IV = K is equal to AES_decrypt(C1, K) XOR K. If we manage to get AES_decrypt(C1, K), we can compute K and we are done.

How do we get AES_decrypt(C1, K)? By submitting (0, C1) for decryption, where 0 represents a block of ciphertext containing only zeros. The decryption of the second block will result in AES_decrypt(C1, K) XOR 0, that is, AES_decrypt(C1, K).

This is why the instructions tell us to submit (C1, 0, C1). Let's call (P1, P2, P3) the three blocks of plaintext you get from the decryption oracle, you have P1 XOR P3 = K.

In [7]:
ZERO_BLOCK = b'\x00'*16
c1 = os.urandom(16)
PaddingError                              Traceback (most recent call last)
<ipython-input-7-35583ec3ac68> in <module>
      1 ZERO_BLOCK = b'\x00'*16
      2 c1 = os.urandom(16)
----> 3 oracle.decrypt(c1+ZERO_BLOCK+c1)

<ipython-input-1-9a7ccfd9df8c> in decrypt(self, ctxt)
     12     def decrypt(self, ctxt):
---> 13         ptxt = decrypt_aes_128_cbc(ctxt, iv=self.key, key=self.key)
     14         try:
     15             ptxt.decode(encoding='ascii')

~/repos/cryptopals/challenges/libmatasano.py in decrypt_aes_128_cbc(ctxt, iv, key)
    180         padded_msg += bxor(mask, tmp)
    181         mask = enc_block
--> 182     return pkcs7_strip(padded_msg, block_size)
    184 for _ in range(4):

~/repos/cryptopals/challenges/libmatasano.py in pkcs7_strip(x, block_size)
     89     if padding_size > block_size:
---> 90         raise PaddingError('illegal last byte (greater than block size)')
     91     if padding_size == 0:
     92         raise PaddingError('illegal last byte (zero)')

PaddingError: illegal last byte (greater than block size)

Oops. What happened? The decryption oracle applied CBC decryption, but then it applied PKCS#7 padding removal, since messages encrypted with CBC are supposed to be padded. Now PKCS#7 failed because the decrypted bytes had an invalid padding.

This is not really surprising: the ciphertext we submitted is completely made up, and we have no idea on what the last decrypted bytes will be. So the chance that it ends with a valid padding it pretty low (more precisely it's about 1/256, that is the probability that the last byte is \x01 in which case PKCS#7 accepts the padding without even looking at the previous bytes).

So how do we make sure that PKCS#7 will not raise an error? I had a solution that used a technique we already used, for instance in challenge 17: a for loop where we mess with the last byte of the second-to-last ciphertext block, until the last decrypted byte happens to be \x01 and PKCS#7 is happy. Here is what it looked like:

In [ ]:
for i in range(0xFF):
        ptxt = oracle.decrypt(c1 + b'\x00'*15 + bytes([i]) + c1)
    except PaddingError:
    print('Error: always get a padding error')

It works, but it's not ideal: first it requires us to make many calls to the decryption oracle, but even when PKCS#7 did not raise an error, we are not 100% sure to get the key by computing P1 XOR P3: there is small probability that the padding PKCS#7 accepted was another valid PKCS#7 padding than \x01: \x02\x02, \x03\x03\x03... Even in this case we would be able to get K, but it makes us wish for a simpler and more reliable solution.

At the time I first solved this challenge (November 2018) I just used this solution and moved on to the next one, and it stayed like this for a long time, until someone opened an issue on GitLab suggesting a better solution: We use the encryption oracle to get a three-block ciphertext (C1, C2, C3) that we know has valid padding (because it comes from the encryption oracle) and we send (C1, 0, C1, C2, C3) to the decryption oracle. The last block of decrypted bytes will be AES_decrypt(C3) XOR C2 which is guaranteed to have proper padding, and we can still XOR the first and third block of the output of the decryption oracle to get the key.

In [8]:
# We need anything between 16*2 (inclusive)
# and 16*3 (exclusive) bytes of plaintext
# to get 3 blocks of ciphertext
ctxt = oracle.encrypt(os.urandom(16*2 + 14))

c1, c2, c3 = split_bytes_in_blocks(ctxt, blocksize=16)

ZERO_BLOCK = b'\x00'*16

decrypted = oracle.decrypt(c1 + ZERO_BLOCK + c1 + c2 + c3)

p1, p2, p3, p4, p5 = split_bytes_in_blocks(decrypted, blocksize=16)

html_test(bxor(p1, p3) == oracle.key)