This repository compute generic second preimage attack for long messages for Merkle-Damgård hash functions, in the specific case where the compression function used within the hash function follows a Davies-Meyer construction.
To compile the program you can use the Makefile. Some options are available in the Makefile to compile the tests or compile the program in debug mode :
make
: compile the program such that only the function attack will be executedmake test=1
: compile the program such that the test functions will be executedmake debug=1
: compile with -g option To execute the program, run the filesecond_preim_48_fillme
.
The function test_sp48
test if the function speck48_96
is correctly implemented. speck48_96
corrrespond to the encryption function Speck48/96. The corresponding revert function speck48_96_inv
is implemented too with it's test test_sp48_inv
.
The function cs48_dm
perform a compression function using a Davies-Meyer construction such that F(h,m) = E(m, h) ⊕ h
and the function test_cs48_dm
check our implementation. And to compute a fixed point such that dm(m, fp) == fp
, the function get_cs48_dm_fp
did it with his test test_cs48_dm_fp
.
The function find_exp_mess
returns two one block messages m1
and m2
such that there exists h = cs48_dm(m1, IV) = get_cs48_dm_fp(m2)
. For the implementation we used
the meet-in-the-middle attack.
The algorithm starts by computing some hash for different values of m1
such that h = cs48_dm(m1,IV)
and store them (the hash and its associated message m1). After that, we compute the hash of
different values of m2
such that h=get_cs48_dm_fp(m2)
, and see if we have a collision with the previous
hash computed.
To store all values such that we have an efficient way to check for collisions, we use a hashtable defined
in the file hashTableForAttack.c
.
To compute random values for our 2 messages m1
and m2
we used the function in the file xoshiro256starstar.h
.
This is our algorithm :
table <− HashTable
for i in range 1 to 9000000:
m1 <− random message
h <− cs48_dm(m1, IV)
store(table, m1, h)
while true:
m2 <− random message
h <− get_cs48_dm_fp(m2)
if found(table, h):
break
In theory our aim is to find a collision and with the birthday bound we need to compute approximately
the square root of hash possibilities.
The hash is of size 248 so we need to compute approximately 224 = 16777216 values. With the meet-in-the-middle attack, we stored 224 /2 ≈ 9000000 different messages m1
.
And then we’ll need to compute approximately the same number of m2
messages.
It takes less than 1 minutes to find a collision.
We have implemented the function attack that returns a second preimage for the message given in the subject. This is our algorithm :
// First we compute the message we want to find a second preimage
// and store all hash for each part of the message in our hash table
key <− IV
table <− HashTable
for submessage in message:
key <− cs48_dm(test , key)
store(table, m1, key)
//Now we find an expendable message
m1 <− uint32_t
m2 <− uint32_t
find_exp_mess(m1, m2)
//We compute the hash of the expandable message
fp <− cs48_dm(m1, IV)
// Find the second preimage
while true:
m3 <− random message
res <− cs48_dm(m3, fp)
if found(table, res):
break
//Compute the new message ( depends of the message chose before )
This time the meet-in-the-middle attack is computed with the stored values of hash of our sub mes-
sages of our message. So if our message is big it will take less time but if it’s too big it will take a lot
of space to store it.
In our case we have a message of length 1 << 20 = 1048576 which gives 1048576/4 = 262144 sub
messages so 262144 hash in our hashtable.
We will need to compute many more values of m3
to find a message that gives a hash in our hashtable.
Our implementation takes less than 10 minutes to compute.
In the folder threadsTest we have implemented the attack with some threads to make it more effi-
cient. The compilation of the programs contained in threadsTest is done with the command make
and
the execution is done by running the file second_preim_48_fillme
generated by the previous command.
In average it seems to be more efficient but it is not always the case.
The implementation does not contain the tests from the previous part, it would have been possible
if we had passed more time on it but the aim of this part was just to try to make the attack more
efficient.