Birthday attack

From Wikipedia, the free encyclopedia


A birthday attack is a type of cryptographic attack, so named because it exploits the mathematics behind the birthday paradox. Given a function f, the goal of the attack is to find two different inputs x_1, x_2 such that f(x_1) = f(x_2). Such a pair x_1, x_2 is called a collision. The method is simply to evaluate f for different input values, chosen randomly or pseudorandomly, until the same result is found more than once. Because of the birthday paradox, this method can be rather efficient. Specifically, if a function f(x) yields any of H different outputs with equal probability and H is sufficiently large, then we expect to obtain a pair of different arguments x_1 and x_2 with f(x_1) = f(x_2) after evaluating the function for about 1.25 \cdot \sqrt{H} different arguments on average.
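The search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `truncated_hash` is a hypothetical toy function that keeps the top 24 bits of SHA-256 so that a collision is reachable in a fraction of a second, and the inputs are simply consecutive counters, relying on the hash to make the outputs behave pseudorandomly.

```python
import hashlib
import itertools


def truncated_hash(data: bytes, bits: int) -> int:
    """Toy hash (assumption): the top `bits` bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)


def find_collision(bits: int):
    """Hash distinct inputs until two of them share a truncated hash value."""
    seen = {}  # hash value -> first input that produced it
    for counter in itertools.count():
        x = str(counter).encode()
        h = truncated_hash(x, bits)
        if h in seen:
            # Return the colliding pair and the number of evaluations used.
            return seen[h], x, counter + 1
        seen[h] = x


x1, x2, evaluations = find_collision(bits=24)
print(x1, x2, evaluations)
```

With H = 2^24 outputs, the estimate 1.25 \cdot \sqrt{H} suggests a collision after roughly five thousand evaluations, although any single run may need noticeably more or fewer.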


Main article: birthday problem

We consider the following experiment. From a set of H values we choose n values uniformly at random thereby allowing repetitions. Let p(n;H) be the probability that during this experiment at least one value is chosen more than once. This probability can be approximated as

 p(n;H) \approx 1 - e^{-n(n-1)/(2 \cdot H)} \approx 1 - e^{-n^2/(2 \cdot H)}.
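The approximation follows from the exact probability that all n values are distinct, which is the product of (1 - i/H) for i = 0, ..., n-1, together with 1 - x \approx e^{-x} for small x. A minimal sketch comparing the exact value with the first approximation is given here; the function names are illustrative and not taken from the article.

```python
import math


def collision_probability_exact(n: int, H: int) -> float:
    """Exact p(n;H): 1 minus the probability that n uniform draws are all distinct."""
    if n > H:
        return 1.0
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (H - i) / H
    return 1.0 - p_distinct


def collision_probability_approx(n: int, H: int) -> float:
    """Approximation p(n;H) ~ 1 - exp(-n(n-1) / (2H))."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * H))


# Classic birthday setting: H = 365 days, n = 23 people.
print(collision_probability_exact(23, 365))   # roughly 0.507
print(collision_probability_approx(23, 365))  # roughly 0.500
```

The gap between the two values shrinks as H grows relative to n, which is why the approximation is adequate for hash-sized output spaces.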

Let n(p;H) be the smallest number of values we have to choose such that the probability of finding a collision is at least p. Inverting the expression above, we find the following approximation

n(p;H)\approx \sqrt{2\cdot H\cdot\ln\left({1 \over 1-p}\right)},

and assigning a 0.5 probability of collision we arrive at

n(0.5;H) \approx 1.1774 \sqrt H.

Let Q(H) be the expected number of values we have to choose before finding the first collision. This number can be approximated by

Q(H)\approx \sqrt{{\pi\over 2}H}.
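The approximation for Q(H) can be checked empirically with a small Monte Carlo experiment. The sketch below assumes a modest H = 10,000 and a fixed seed so that it runs quickly and repeatably; it counts draws up to and including the one that repeats an earlier value, and the helper names are illustrative.

```python
import math
import random


def draws_until_collision(H: int, rng: random.Random) -> int:
    """Draw uniformly from range(H) until a value repeats; return the number of draws."""
    seen = set()
    while True:
        value = rng.randrange(H)
        if value in seen:
            return len(seen) + 1  # include the draw that caused the repeat
        seen.add(value)


def estimate_q(H: int, trials: int, seed: int = 0) -> float:
    """Average number of draws to the first collision over `trials` experiments."""
    rng = random.Random(seed)
    return sum(draws_until_collision(H, rng) for _ in range(trials)) / trials


H = 10_000
estimate = estimate_q(H, trials=5_000)
expected = math.sqrt(math.pi * H / 2)
print(estimate, expected)
```

The simulated average should land within a few percent of \sqrt{\pi H / 2}; a small remaining offset is expected because the formula is itself an approximation.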

As an example, if a 64-bit hash is used, there are approximately 1.8 × 10^19 different outputs. If these are all equally probable (the best case), then it would take 'only' approximately 5.1 × 10^9 attempts to generate a collision using brute force. This value is called the birthday bound, and for n-bit codes it can be computed as 2^{n/2}.[1] Other examples are as follows:

Bits  Possible outputs (H)  Desired probability of random collision (p)
                            10^−18       10^−15       10^−12       10^−9        10^−6        0.1%         1%           25%          50%          75%
32    4.3 × 10^9            <1           <1           <1           2.9          93           2.9 × 10^3   9.3 × 10^3   5.0 × 10^4   7.7 × 10^4   1.1 × 10^5
64    1.8 × 10^19           6.1          1.9 × 10^2   6.1 × 10^3   1.9 × 10^5   6.1 × 10^6   1.9 × 10^8   6.1 × 10^8   3.3 × 10^9   5.1 × 10^9   7.2 × 10^9
128   3.4 × 10^38           2.6 × 10^10  8.2 × 10^11  2.6 × 10^13  8.2 × 10^14  2.6 × 10^16  8.3 × 10^17  2.6 × 10^18  1.4 × 10^19  2.2 × 10^19  3.1 × 10^19
256   1.2 × 10^77           4.8 × 10^29  1.5 × 10^31  4.8 × 10^32  1.5 × 10^34  4.8 × 10^35  1.5 × 10^37  4.8 × 10^37  2.6 × 10^38  4.0 × 10^38  5.7 × 10^38
384   3.9 × 10^115          8.9 × 10^48  2.8 × 10^50  8.9 × 10^51  2.8 × 10^53  8.9 × 10^54  2.8 × 10^56  8.9 × 10^56  4.8 × 10^57  7.4 × 10^57  1.0 × 10^58
512   1.3 × 10^154          1.6 × 10^68  5.2 × 10^69  1.6 × 10^71  5.2 × 10^72  1.6 × 10^74  5.2 × 10^75  1.6 × 10^76  8.8 × 10^76  1.4 × 10^77  1.9 × 10^77
The table shows the number of hashes n(p) needed to achieve the given probability of success, assuming all hashes are equally likely. For comparison, 10^−18 to 10^−15 is the uncorrectable bit error rate of a typical hard disk [1]. In theory, a 128-bit hash such as MD5 should stay within that range until about 820 billion documents, even though its possible outputs are many more.
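The table entries can be reproduced, to the precision shown, from the approximation for n(p;H) given earlier. The following sketch assumes H = 2^bits and uses an illustrative function name, `hashes_needed`; only a few of the probability columns are printed.

```python
import math


def hashes_needed(p: float, bits: int) -> float:
    """Approximate n(p;H) = sqrt(2 * H * ln(1 / (1 - p))) for H = 2**bits."""
    H = 2.0 ** bits
    # -log1p(-p) equals ln(1 / (1 - p)) but stays accurate for very small p.
    return math.sqrt(2.0 * H * -math.log1p(-p))


for bits in (32, 64, 128, 256, 384, 512):
    row = [hashes_needed(p, bits) for p in (1e-18, 1e-15, 1e-6, 0.01, 0.5, 0.75)]
    print(bits, " ".join(f"{n:.1e}" for n in row))
```

For instance, `hashes_needed(1e-15, 128)` evaluates to about 8.2 × 10^11, which is the figure of roughly 820 billion documents quoted for a 128-bit hash.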

If the outputs of the function are distributed unevenly, then a collision can be found even faster. The notion of 'balance' of a hash function quantifies the resistance of the function to birthday attacks and allows the vulnerability of popular hashes such as MD and SHA to be estimated (Bellare and Kohno, 2004).

Digital signatures can be susceptible to a birthday attack. A message m is typically signed by first computing f(m), where f is a cryptographic hash function, and then using some secret key to sign f(m). Suppose Alice wants to trick Bob into signing a fraudulent contract. Alice prepares a fair contract m and a fraudulent one m'. She then finds a number of positions where m can be changed without changing the meaning, such as inserting commas or empty lines, using one versus two spaces after a sentence, or replacing words with synonyms. By combining these changes, she can create a huge number of variations on m which are all fair contracts. In a similar manner, she also creates a huge number of variations on the fraudulent contract m'. She then applies the hash function to all these variations until she finds a version of the fair contract and a version of the fraudulent contract which have the same hash value, f(m) = f(m'). She presents the fair version to Bob for signing. After Bob has signed, Alice takes the signature and attaches it to the fraudulent contract. This signature then "proves" that Bob signed the fraudulent contract.

This differs slightly from the original birthday problem, as Alice gains nothing by finding two fair or two fraudulent contracts with the same hash. Alice's optimum strategy is to generate pairs of one fair and one fraudulent contract. She compares each freshly generated pair to all other pairs; that is, she compares the new fair hash to all previous fraudulent hashes, and the new fraudulent hash to all previous fair hashes (but does not bother comparing fair hashes to fair or fraudulent to fraudulent). The birthday problem equations apply where n is the number of pairs. (The number of hashes Alice actually generates is 2n.)
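Alice's pairwise strategy can be illustrated with a toy example. Everything specific in the sketch below is an assumption made for illustration: `toy_hash` keeps only 16 bits of SHA-256 so that the search finishes instantly, the two contract texts are invented, and `variant` implements just one of the variation tricks mentioned above (one versus two spaces between words).

```python
import hashlib


def toy_hash(message: str, bits: int = 16) -> int:
    """Toy hash (assumption): the top `bits` bits of SHA-256 of the message."""
    digest = int.from_bytes(hashlib.sha256(message.encode()).digest(), "big")
    return digest >> (256 - bits)


def variant(words: list, k: int) -> str:
    """Join words with one or two spaces; bit i of k picks the width of gap i."""
    parts = [words[0]]
    for i, word in enumerate(words[1:]):
        parts.append("  " if (k >> i) & 1 else " ")
        parts.append(word)
    return "".join(parts)


def find_contract_collision(fair: str, fraud: str, bits: int = 16):
    """Generate fair/fraudulent pairs until a fair hash equals a fraudulent hash."""
    fair_words, fraud_words = fair.split(), fraud.split()
    fair_seen = {}   # hash -> a fair variant with that hash
    fraud_seen = {}  # hash -> a fraudulent variant with that hash
    max_pairs = 2 ** (min(len(fair_words), len(fraud_words)) - 1)
    for k in range(max_pairs):
        m, m_prime = variant(fair_words, k), variant(fraud_words, k)
        h_fair, h_fraud = toy_hash(m, bits), toy_hash(m_prime, bits)
        fair_seen[h_fair] = m
        fraud_seen[h_fraud] = m_prime
        # Only cross matches are useful: fair against fraudulent.
        if h_fair in fraud_seen:
            return m, fraud_seen[h_fair], k + 1
        if h_fraud in fair_seen:
            return fair_seen[h_fraud], m_prime, k + 1
    return None  # ran out of distinct variations


fair = "Bob will pay Alice one hundred dollars on the first day of each month for a period of one year"
fraud = "Bob will pay Alice ten thousand dollars on the first day of each month for a period of ten years"
result = find_contract_collision(fair, fraud)
print(result)
```

The third element of the result is the number of pairs n that were generated, so the number of hashes computed is 2n. With a real hash length the same loop would be computationally out of reach.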

To avoid this attack, the output length of the hash function used for a signature scheme can be chosen large enough so that the birthday attack becomes computationally infeasible, i.e. about twice as many bits as are needed to prevent an ordinary brute force attack.

Pollard's rho algorithm for logarithms is an example of an algorithm that uses a birthday attack to compute discrete logarithms.

  1. Jacques Patarin, Audrey Montreuil (2005). "Benes and Butterfly schemes revisited" (PostScript, PDF). Université de Versailles. Retrieved on 2007-03-15.
