Wrapped up in chains
The cryptographer Martin Hellman, famed as co-author of the Diffie-Hellman key agreement algorithm, in 1980, proposed a practical compromise between the two extremes that uses pre-computed data to reduce the time taken for an attack, at the price of memory consumption. Such a technique is now known as a "time memory trade-off" (TMTO). The basic idea is that the dictionary is stored only partially or in compressed form and, during the search for the key, either the missing parts are computed, or the search is repeated with slight modifications until a hit is scored in the dictionary. The TMTO technique has found many applications since then, including cracking Windows passwords online and decrypting mobile phone calls [1].
Hellman had the idea of storing the dictionary in the form of long chains. A randomly chosen key, with which a specific message is encrypted, stands at the beginning of such a chain. The result of the first pass is used as a key for encrypting the original message during the next pass. The result of this pass provides the key for the next one, and so on, until the chain has reached a predefined length. In order to save space, Hellman stored only the first and last values in the chain. The dictionary thus compressed is a two-column table sorted by end value.
In order to crack a key, the attacker must first induce the target system to encrypt the message that fits the chains. The subsequent search in the dictionary is a two-step process. In the first step, the attacker performs the same chain steps on the encrypted message as when creating the dictionary, and keeps on going until he finds its intermediate state in the dictionary as a chain end value.
In the second step, he only has to recompute the chain once more, on the basis of the relevant starting value, up to the point where he finds the encrypted message. The previous value in the chain is the desired key – or at least that's the simplified theory.
However, depending on the encryption function, collisions can occur if different intermediate values lead to the same following value. Since a collision is always followed by the same series of chain links, there can be several chain starting values for a given end value. If no starting value is arrived at en-route that leads to the desired key, the search has failed.
Because of the random selection of the starting value, moreover, there are inevitably gaps in coverage – not all possible keys are contained in the chains. From a statistical point of view, halving the number of keys not covered by a chain link requires that the number of computed chains be doubled, which quickly leads to large amounts of redundant data if good coverage of the range of keys is the aim.
The maximum number of steps in a crack is determined by the length of the chains, and the necessary memory space is given by the number of chains. The fewer the chains computed, the longer they have to be and the more time it takes to find a key. With perfect tables, it would be possible to find the key to a system with N possible keys in N1/2 (the square root of N) steps in a dictionary of size 2•N1/2. But such perfect tables don't exist for current encryption methods that use long keys. Even if they did exist, they could not be searched in a realistic time. In his work, Hellman presents a trade-off with N2/3 search time, 2•N2/3 memory space, and N2/3 time for pre-computing the tables. For many years, this was the best trade-off known.
However, in many cases Hellman's trade-off is not optimal. The bottleneck in searching the dictionary today is typically not the processor, but the hard disk. After every step, the algorithm has to check whether the element generated occurs in the table, and that generally involves several disk accesses. Usually, therefore, the processor can generate data much faster than the hard disk can find them.
In order to minimize the hard disk bottleneck, in 1982 the cryptologist Ronald ("the R in RSA") Rivest, who was also influential in developing algorithms such as MD5 and RC4, suggested an optimisation of Hellman's method, in which only end values that meet a prescribed criterion, such as the last twelve bits being zeros, end up in the dictionary table. While the chains are being generated, they are only run through each time – starting from random values again – until a value meets the criterion and can be used as an end point. Rivest's approach is therefore also known as the "distinguished points" method. If during the dictionary search an intermediate value doesn't meet the distinguished point criterion, it doesn't have to be looked up on the hard disk.
Since the end point for a starting value can't be predicted before the whole chain has been computed, the chains are of differing lengths. That doesn't change the average search time and memory space required, but very probably reduces the frequency of collisions. The longer chains are more susceptible to collisions, and on average this can't be made up for by the lower probability of a collision in the shorter chains. Furthermore, going through a long chain is more time-consuming, further aggravating the collision problem.