Take your CTR encrypt/decrypt function and fix its nonce value to 0. Generate a random AES key.
In successive encryptions (not in one big running CTR stream), encrypt each line of the base64 decodes of the following, producing multiple independent ciphertexts:
SSBoYXZlIG1ldCB0aGVtIGF0IGNsb3NlIG9mIGRheQ== Q29taW5nIHdpdGggdml2aWQgZmFjZXM= RnJvbSBjb3VudGVyIG9yIGRlc2sgYW1vbmcgZ3JleQ== RWlnaHRlZW50aC1jZW50dXJ5IGhvdXNlcy4= SSBoYXZlIHBhc3NlZCB3aXRoIGEgbm9kIG9mIHRoZSBoZWFk T3IgcG9saXRlIG1lYW5pbmdsZXNzIHdvcmRzLA== T3IgaGF2ZSBsaW5nZXJlZCBhd2hpbGUgYW5kIHNhaWQ= UG9saXRlIG1lYW5pbmdsZXNzIHdvcmRzLA== QW5kIHRob3VnaHQgYmVmb3JlIEkgaGFkIGRvbmU= T2YgYSBtb2NraW5nIHRhbGUgb3IgYSBnaWJl VG8gcGxlYXNlIGEgY29tcGFuaW9u QXJvdW5kIHRoZSBmaXJlIGF0IHRoZSBjbHViLA== QmVpbmcgY2VydGFpbiB0aGF0IHRoZXkgYW5kIEk= QnV0IGxpdmVkIHdoZXJlIG1vdGxleSBpcyB3b3JuOg== QWxsIGNoYW5nZWQsIGNoYW5nZWQgdXR0ZXJseTo= QSB0ZXJyaWJsZSBiZWF1dHkgaXMgYm9ybi4= VGhhdCB3b21hbidzIGRheXMgd2VyZSBzcGVudA== SW4gaWdub3JhbnQgZ29vZCB3aWxsLA== SGVyIG5pZ2h0cyBpbiBhcmd1bWVudA== VW50aWwgaGVyIHZvaWNlIGdyZXcgc2hyaWxsLg== V2hhdCB2b2ljZSBtb3JlIHN3ZWV0IHRoYW4gaGVycw== V2hlbiB5b3VuZyBhbmQgYmVhdXRpZnVsLA== U2hlIHJvZGUgdG8gaGFycmllcnM/ VGhpcyBtYW4gaGFkIGtlcHQgYSBzY2hvb2w= QW5kIHJvZGUgb3VyIHdpbmdlZCBob3JzZS4= VGhpcyBvdGhlciBoaXMgaGVscGVyIGFuZCBmcmllbmQ= V2FzIGNvbWluZyBpbnRvIGhpcyBmb3JjZTs= SGUgbWlnaHQgaGF2ZSB3b24gZmFtZSBpbiB0aGUgZW5kLA== U28gc2Vuc2l0aXZlIGhpcyBuYXR1cmUgc2VlbWVkLA== U28gZGFyaW5nIGFuZCBzd2VldCBoaXMgdGhvdWdodC4= VGhpcyBvdGhlciBtYW4gSSBoYWQgZHJlYW1lZA== QSBkcnVua2VuLCB2YWluLWdsb3Jpb3VzIGxvdXQu SGUgaGFkIGRvbmUgbW9zdCBiaXR0ZXIgd3Jvbmc= VG8gc29tZSB3aG8gYXJlIG5lYXIgbXkgaGVhcnQs WWV0IEkgbnVtYmVyIGhpbSBpbiB0aGUgc29uZzs= SGUsIHRvbywgaGFzIHJlc2lnbmVkIGhpcyBwYXJ0 SW4gdGhlIGNhc3VhbCBjb21lZHk7 SGUsIHRvbywgaGFzIGJlZW4gY2hhbmdlZCBpbiBoaXMgdHVybiw= VHJhbnNmb3JtZWQgdXR0ZXJseTo= QSB0ZXJyaWJsZSBiZWF1dHkgaXMgYm9ybi4=
(This should produce 40 short CTR-encrypted ciphertexts).
Because the CTR nonce wasn't randomized for each encryption, each ciphertext has been encrypted against the same keystream. This is very bad.
Understanding that, like most stream ciphers (including RC4, and obviously any block cipher run in CTR mode), the actual "encryption" of a byte of data boils down to a single XOR operation, it should be plain that:
CIPHERTEXT-BYTE XOR PLAINTEXT-BYTE = KEYSTREAM-BYTE
And since the keystream is the same for every ciphertext:
CIPHERTEXT-BYTE XOR KEYSTREAM-BYTE = PLAINTEXT-BYTE (ie, "you don't say!")
Attack this cryptosystem piecemeal: guess letters, use expected English language frequence to validate guesses, catch common English trigrams, and so on.
Don't overthink it. Points for automating this, but part of the reason I'm having you do this is that I think this approach is suboptimal.
Okay, so I don't understand what they expect from us here.
As soon as I start thinking about it I end up solving it in the way they describe in challenge 20.
Well they say that the only purpose of this challenge is to show you how much better the other method is, so let's just go to the next challenge.
In this file find a similar set of Base64'd plaintext. Do with them exactly what you did with the first, but solve the problem differently.
Instead of making spot guesses at to known plaintext, treat the collection of ciphertexts the same way you would repeating-key XOR.
Obviously, CTR encryption appears different from repeated-key XOR, but with a fixed nonce they are effectively the same thing.
To exploit this: take your collection of ciphertexts and truncate them to a common length (the length of the smallest ciphertext will work).
Solve the resulting concatenation of ciphertexts as if for repeating- key XOR, with a key size of the length of the ciphertext you XOR'd.
import os
from base64 import b64decode
from libmatasano import transform_aes_128_ctr
with open('data/20.txt') as f:
data = [b64decode(line) for line in f]
key = os.urandom(16)
ctxts = [transform_aes_128_ctr(line, key, nonce=0)
for line in data]
Take the following example with three-letters messages
being encrypted with the same key and the same nonce.
Say the keystream generated with this key and nonce happens to start with ZJK
(that is, three bytes encoding the letters "ZJK" in ASCII encoding).
key: Z J K
msg 1: L O L
msg 2: F O O
msg 3: B A R
etc...
Here is the intuition of the attack:
we take the first byte of every ciphertext and concatenate them together.
The result is a long list of letters
all XORed against the same byte (in our example, Z
).
We can then re-use the attack we designed in challenge 3
against "single-byte XOR cipher".
Let us verify that our attack against "single-bye XOR cipher" works well against a "column" of ciphertext bytes:
# Python has a very useful "zip" builtin function
# that perfectly fits our need here
# let's just first show how it works
# note that the 'zip' function stop at the length of the shortest input
list(zip('abc', 'def', 'ghijkl'))
from libmatasano import attack_single_byte_xor
# The 'next' builtin function will take the first element outputed by the zip function
# (zip is actually an Python iterator)
# the '*' operator before ctxt will unpack the elements of the 'ctxt' variable
# as multiple arguments of the function (see how we called zip in the previous cell)
l = next(zip(*ctxts))
attack_single_byte_xor(l)
Beautiful, right ?
Of course the message we got is meaningless, but this is supposed to be a concatenation of the first letter of each message, so this is not surprising.
So we just do this for every "column" and then we recompose the first message by taking the first letter of every column and so on...
columns = [ attack_single_byte_xor(l)['message'] for l in zip(*ctxts) ]
for msg in zip(*columns):
print(bytes(msg).decode())