Challenge 17: The CBC padding oracle¶

This is the best-known attack on modern block-cipher cryptography.

Combine your padding code and your CBC code to write two functions.

The first function should select at random one of the following 10 strings:
MDAwMDAwTm93IHRoYXQgdGhlIHBhcnR5IGlzIGp1bXBpbmc=
MDAwMDAxV2l0aCB0aGUgYmFzcyBraWNrZWQgaW4gYW5kIHRoZSBWZWdhJ3MgYXJlIHB1bXBpbic=
MDAwMDAyUXVpY2sgdG8gdGhlIHBvaW50LCB0byB0aGUgcG9pbnQsIG5vIGZha2luZw==
MDAwMDAzQ29va2luZyBNQydzIGxpa2UgYSBwb3VuZCBvZiBiYWNvbg==
MDAwMDA0QnVybmluZyAnZW0sIGlmIHlvdSBhaW4ndCBxdWljayBhbmQgbmltYmxl
MDAwMDA1SSBnbyBjcmF6eSB3aGVuIEkgaGVhciBhIGN5bWJhbA==
MDAwMDA2QW5kIGEgaGlnaCBoYXQgd2l0aCBhIHNvdXBlZCB1cCB0ZW1wbw==
MDAwMDA3SSdtIG9uIGEgcm9sbCwgaXQncyB0aW1lIHRvIGdvIHNvbG8=
MDAwMDA4b2xsaW4nIGluIG15IGZpdmUgcG9pbnQgb2g=
MDAwMDA5aXRoIG15IHJhZy10b3AgZG93biBzbyBteSBoYWlyIGNhbiBibG93
... generate a random AES key (which it should save for all future encryptions), pad the string out to the 16-byte AES block size and CBC-encrypt it under that key, providing the caller the ciphertext and IV.

The second function should consume the ciphertext produced by the first function, decrypt it, check its padding, and return true or false depending on whether the padding is valid.

What you're doing here: This pair of functions approximates AES-CBC encryption as its deployed serverside in web applications; the second function models the server's consumption of an encrypted session token, as if it was a cookie.

It turns out that it's possible to decrypt the ciphertexts provided by the first function.

The decryption here depends on a side-channel leak by the decryption function. The leak is the error message that the padding is valid or not.

You can find 100 web pages on how this attack works, so I won't re-explain it. What I'll say is this:

The fundamental insight behind this attack is that the byte 01h is valid padding, and occur in 1/256 trials of "randomized" plaintexts produced by decrypting a tampered ciphertext.

02h in isolation is not valid padding.

02h 02h is valid padding, but is much less likely to occur randomly than 01h.

03h 03h 03h is even less likely.

So you can assume that if you corrupt a decryption AND it had valid padding, you know what that padding byte is.

It is easy to get tripped up on the fact that CBC plaintexts are "padded". Padding oracles have nothing to do with the actual padding on a CBC plaintext. It's an attack that targets a specific bit of code that handles decryption. You can mount a padding oracle on any CBC block, whether it's padded or not.

In [1]:

import os
import random
from random import randint
from libmatasano import (
    encrypt_aes_128_cbc, decrypt_aes_128_cbc,
    PaddingError, split_bytes_in_blocks, bxor,
    cbc_xor, BLOCK_SIZE, pkcs7_strip, html_test
)
from base64 import b64decode

In [2]:

class Oracle:
    messages = list(map(b64decode, [
        b'MDAwMDAwTm93IHRoYXQgdGhlIHBhcnR5IGlzIGp1bXBpbmc=',
        b'MDAwMDAxV2l0aCB0aGUgYmFzcyBraWNrZWQgaW4gYW5kIHRoZSBWZWdhJ3MgYXJlIHB1bXBpbic=',
        b'MDAwMDAyUXVpY2sgdG8gdGhlIHBvaW50LCB0byB0aGUgcG9pbnQsIG5vIGZha2luZw==',
        b'MDAwMDAzQ29va2luZyBNQydzIGxpa2UgYSBwb3VuZCBvZiBiYWNvbg==',
        b'MDAwMDA0QnVybmluZyAnZW0sIGlmIHlvdSBhaW4ndCBxdWljayBhbmQgbmltYmxl',
        b'MDAwMDA1SSBnbyBjcmF6eSB3aGVuIEkgaGVhciBhIGN5bWJhbA==',
        b'MDAwMDA2QW5kIGEgaGlnaCBoYXQgd2l0aCBhIHNvdXBlZCB1cCB0ZW1wbw==',
        b'MDAwMDA3SSdtIG9uIGEgcm9sbCwgaXQncyB0aW1lIHRvIGdvIHNvbG8=',
        b'MDAwMDA4b2xsaW4nIGluIG15IGZpdmUgcG9pbnQgb2g=',
        b'MDAwMDA5aXRoIG15IHJhZy10b3AgZG93biBzbyBteSBoYWlyIGNhbiBibG93',
    ]))
    
    def __init__(self):
        self.key = os.urandom(16)
        
    def encrypt(self):
        # here "self.messages" is a class attribute
        # see https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy
        msg = random.choice(self.messages)
        iv = os.urandom(16)
        ctxt = encrypt_aes_128_cbc(msg, iv, self.key)
        return ctxt, iv
    
    def decrypt(self, cryptogram):
        ctxt, iv = cryptogram
        decrypt_aes_128_cbc(ctxt, iv, oracle.key)
        # the oracle does not return the result of the decryption
        # but it will raise an exception if padding stripping failed.

In [3]:

oracle = Oracle()

In [4]:

ctxt, iv = oracle.encrypt()
cryptogram = {'ctxt': ctxt, 'iv': iv}

In [5]:

oracle.decrypt((cryptogram['ctxt'], cryptogram['iv']))

In [6]:

try:
    oracle.decrypt((os.urandom(16), cryptogram['iv']))
    raise Exception('A padding error was expected')
except PaddingError:
    print('Got a padding error just as expected')

Got a padding error just as expected

As we already did, we will take an example with a block size of 4 just for the sake of simplicity.

The message will be 6 characters encoded as 6 bytes (it could be the word “attack” for instance), and noted M1 M2 M3 M4 M5 M6. PKCS#7 padding for this message with a block size of 4 requires to add two bytes of padding, so the padded messa eg will be

M1 M2 M3 M4 | M5 M6 P  P

Where a "pipe" character | denotes the block limits and where the P byte is equal to 0x02

Here is how we can recover the last message byte M6 using a "padding oracle", that is some external way to know if the padding if the underlying message is correct or not.

Remember with CBC mode we are able to flip bits in the underlying plaintext by flipping bits in the ciphertext. This makes us capable of XORing the plaintext with an arbitrary value (XORing is equivalent to bit flipping) as long as we are only affecting bytes in a same block. We will do this using the cbc_xor function which we created in the previous challenges and put in our libmatasano.py file.

First of all this gives us a very simple way to tell how many bytes of PKCS#7 padding bytes there are at the end of the plaintext: You use CBC bitflipping to alter the very last byte. It is very probable that the resulting message will have an invalind padding.

    M1 M2 M3 M4 | M5 M6 P  P
⊕                          X
=   M1 M2 M3 M4 | M5 M6 P  P'

unless P' = 1, oracle will give a Padding Error

To remove the probability of not getting a padding error when you are altering a padding byte, you can just re-test with a different value X: you know the byte you are altering is part of the padding if you got a padding error with at least one value of X.

You then alter the second-to-last byte of plaintext, see if the padding oracle raises a padding error, and so on. When you don't get a padding error whatever value you use for X, it means that the byte you are altering was not part of the padding.

    M1 M2 M3 M4 | M5 M6 P  P
⊕                    X      
=   M1 M2 M3 M4 | M5 ?? P  P

no padding error, whatever the value of 'X' is

In [7]:

for i in range(BLOCK_SIZE):
    altered_cryptogram = cbc_xor(cryptogram, pad=b'\xff', index=len(ctxt)-1-i)
    
    try:
        oracle.decrypt((altered_cryptogram['ctxt'], altered_cryptogram['iv']))
        
        # trying with another value just to be sure
        altered_cryptogram = cbc_xor(cryptogram, pad=b'\x11', index=len(ctxt)-1-i)
        oracle.decrypt((altered_cryptogram['ctxt'], altered_cryptogram['iv']))
        
        # If we reach this line without any padding error
        # We are pretty sure that the byte we are messing with
        # is *not* part of the padding
        padding_length = i
        break
    except PaddingError:
        # It seems that the byte we are messing with
        # is part of the padding;
        # let's move on to the next byte
        continue
else:
    # having padding errors at all iteration
    # indicates that the last block is all padding,
    # that is, message length was a multiple of block size
    padding_length = BLOCK_SIZE

print('padding length:', padding_length)

# checking we got it right
expected = BLOCK_SIZE - len(decrypt_aes_128_cbc(ctxt, iv, oracle.key)) % BLOCK_SIZE
html_test(padding_length == expected)

padding length: 5

The technique to recover the message bytes is quite similar.

We will use bitflipping in order to perform the following operation:

    M1 M2 M3 M4 | M5 M6 P  P
⊕                    X1 X2 X3
=   M1 M2 M3 M4 | M5 P' P' P'

Such that the resulting message is properly padded, that is in our example P' = 3.

X2 and X3 are easy to compute: because we know the padding lenght we know the values of the original padding bytes P, and we know the value of the padding bytes P' we want to get after bit flipping (P' = P+1). Thus, it is easy for us to compute X2 = X3 = P ⊕ P'.

What we don't know is which value of X1 will result in a proper padding. We just have to try every single value and give the result to the padding oracle: when we don't get a padding error, it means that we found the proper byte X1.

    M1 M2 M3 M4 | M5 M6 P  P
⊕                    i  X2 X3
=   M1 M2 M3 M4 | M5 ?? P' P'

Padding Error unless M6 ⊕  i = P',
that is, i = X1

It is now easy to infer the value of M6:

M6 = P' ⊕  X1

We can then repeat the same process to learn message byte M5, trying to find new values X1 X2 X3 X4 such that

    M1 M2 M3 M4 | M5 M6 P  P
⊕                 X1 X2 X3 X4
=   M1 M2 M3 M4 | P' P' P' P'

Where P' is equal to 0x04 this time in our example. Again here we will only have to try every posibility for X1 because we already know the proper values for X2 X3 X4 (recall we know M6 now)

Then, there are a couple details to pay attention to to be able to recover the whole message in every situation: When we know a whole block of plaintext, we have to remove it because the paddig oracle will never look at bytes in the second-to-last block.

Say we already recovered M6 and M5, now we are looking for X1 such that the following operation results in a message with proper padding:

    M1 M2 M3 M4
⊕            X1
=   M1 M2 M3 P'

(that is we want to obtain P' = 1).

How do we do that ? Same as before: we try every possible value for X1 and give the result to the padding oracle. If we don't get a padding error, it can be one of the following situations:

we got a last byte of value 1: this is what we wanted
we got a last byte of value 2 and it happens that the message byte M3 has a value of 2 as well
we got a last byte of value 3 and it happens that message bytes M2 and M3 both have a value of 3 as well
etc …

What do we do to make sure we are in the first case and not in the other ones? Well first the first case is much more likely than the others, so we could just accept to have an attack that will not work every now and then.

(TODO mechanism to ensure 100% attack success rate)

In [8]:

known_bytes = bytes([padding_length])*padding_length

def recover_one_more_byte(cryptogram, known_bytes):
    ctxt = cryptogram['ctxt']
    iv = cryptogram['iv']
    
    # we drop ciphertext blocks which plaintext is known entirely
    # the index of the target byte (the one we are going to mess with)
    # is computed from the end of the remaining ciphertext blocks
    # (negative index)
    # with K for known bytes, T for target, X for the rest:
    # XXXX XXTK KKKK => target index will be -2
    nb_blocks_to_drop = len(known_bytes) // BLOCK_SIZE
    target_byte_index = - (len(known_bytes) % BLOCK_SIZE) - 1

    # I ♥ slices!
    # note the "or None" so that if "nb_blocks_to_drop" is zero
    # we get a slice with no end (goes to the end of the iterable)
    not_dropped = slice(0, -nb_blocks_to_drop*BLOCK_SIZE or None)

    l = len(known_bytes[not_dropped])

    for i in range(256):
        pad = bytes([i]) + bxor(known_bytes[not_dropped], bytes([l+1])*(l))
        #     ^^^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        #     the "X1" in      the "X2", "X3" etc... in the explanation
        #   the explanation
        cryptogram_to_alter = {'ctxt': ctxt[not_dropped], 'iv': iv}
        cryptogram_altered = cbc_xor(cryptogram_to_alter, pad, target_byte_index)
        try:
            oracle.decrypt((cryptogram_altered['ctxt'], cryptogram_altered['iv']))
            # If we reach this line padding was okay,
            # so we know exactly what value we reached through bit flipping,
            # and we can deduce the original value of the byte we were messing with
            target_byte_value = (l+1) ^ i
            return bytes([target_byte_value])
        except PaddingError:
            # We still haven't found a value for "X1"
            # which results in a valid padding
            continue
    else:
        err_msg = 'all values triggered a padding error'
        if l == 2:
            err_msg += '; probably a mistake when recovering last byte of the block'
        raise Exception(err_msg)
        
# quick sanity check
assert recover_one_more_byte(cryptogram, known_bytes) == bytes([decrypt_aes_128_cbc(ctxt, iv, oracle.key)[-1]])

while len(known_bytes) < len(ctxt):
    new_byte = recover_one_more_byte(cryptogram, known_bytes)
    known_bytes = new_byte + known_bytes

pkcs7_strip(known_bytes, BLOCK_SIZE)

Out[8]:

b'000006And a high hat with a souped up tempo'

This attack was first described by Serge Vaudenay in 2002 in the following paper:

Vaudenay S. (2002) Security Flaws Induced by CBC Padding — Applications to SSL, IPSEC, WTLS.... In: Knudsen L.R. (eds) Advances in Cryptology — EUROCRYPT 2002. EUROCRYPT 2002. Lecture Notes in Computer Science, vol 2332. Springer, Berlin, Heidelberg