Survey of Discrete Log Algorithms

In 1999 Dan Boneh published a survey paper called “Twenty years of attacks on the RSA cryptosystem”. I like to tell people that if they read the entire paper they can solve just about every CTF challenge involving RSA.

This post is my attempt to do something similar with the discrete log problem, a primitive that underlies a bunch of cryptographic protocols including the Diffie-Hellman key exchange and the ElGamal encryption system. The algorithms all have similar-sounding names, and I wanted to just sit down and describe each of them to get them straight in my head.

The Discrete Log Problem

Suppose we have a group \(G\) generated by an element \(g\) (this means that every element can be written as \(g^k\) for some integer \(k\)). The discrete log of an element \(y = g^x\) is the number \(x\). For cyclic groups of order \(n\), discrete logs are unique modulo \(n\).

A common choice for \(G\) in cryptography is \((\mathbb{Z}_p)^\times\), the multiplicative group modulo a prime \(p\). Its elements are \(1, 2, \ldots, p-1\), so its order is \(p-1\); Fermat’s little theorem, \(g^{p-1} = 1 \pmod{p}\), tells us that the order of every element divides \(p-1\).
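For a concrete feel, here’s a toy example in Python (the parameters \(p = 23\), \(g = 5\) are mine, chosen tiny for illustration; real deployments use enormous primes):

```python
p, g = 23, 5  # toy prime and generator, far too small for real use

# Enumerate the powers of g to check that it generates the whole group.
powers = [pow(g, k, p) for k in range(p - 1)]
assert sorted(powers) == list(range(1, p))  # every nonzero residue appears

# The discrete log of 8 base 5 is the exponent x with 5^x = 8 (mod 23).
print(powers.index(8))  # -> 6, since 5^6 = 15625 = 8 (mod 23)
```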

If the order of the cyclic group is not a smooth number, i.e. if \(p-1\) has at least one large prime factor, then computing discrete logs of elements of the group is believed to be hard.

This hardness assumption forms the basis for a large portion of modern cryptography. As a result, it’s important to understand the best algorithms that exist to compute discrete logs.

Approach #1: Brute Force

Given an element \(y\) in a group \(G\) with generator \(g\), we’re trying to find an integer \(x\) such that \(g^x = y\). A naive way to do this is to simply try all possible \(x\). This approach has time complexity \(O(n)\) where \(n\) is the order of the group and space complexity \(O(1)\).
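As a sketch, here’s the brute-force approach in Python for the multiplicative-group setting (the function name and signature are just illustrative):

```python
def brute_force_dlog(g, y, n, p):
    """Find x with g^x = y (mod p) by trying every exponent in [0, n)."""
    t = 1  # running value of g^x
    for x in range(n):
        if t == y:
            return x
        t = (t * g) % p
    return None  # y is not a power of g
```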

This website is secured using an elliptic curve group with order \[n = 2^{256} - 2^{224} + 2^{192} - 89188191075325690597107910205041859247.\] Using this approach to find a discrete log in this group would take around \(2^{256}\) trials, which is a little much.

Approach #2: Baby-step giant-step

There’s a relatively easy way to improve the running time of the brute-force approach using a common technique in cryptography. We’re trying to find \(x\) such that \(g^x = y\). Let’s rewrite \(x\) as \(im + j\), where \(m = \lceil \sqrt{n} \rceil\) and \(i\) and \(j\) are integers between \(0\) and \(m\). Now we can rewrite \[y = g^x = g^{im + j}\] as \[(g^{-m})^i y = g^j.\]

Next, let’s compute \(g^j\) for all possible values of \(j\) and store the pairs \((g^j, j)\) in a lookup table keyed by \(g^j\), and then run through every possible value of \(i\) and check whether \((g^{-m})^i y\) is in the table. Once we find a colliding \(i\) and \(j\), we can compute \(x = im + j\).
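Here’s a minimal sketch in Python, again for the multiplicative group mod \(p\) (it assumes Python 3.8+, where `pow(g, -m, p)` computes a modular inverse):

```python
from math import isqrt

def bsgs(g, y, n, p):
    """Baby-step giant-step: find x with g^x = y (mod p), where n is the order of g."""
    m = isqrt(n - 1) + 1          # m = ceil(sqrt(n))
    # Baby steps: a table mapping g^j -> j for j = 0 .. m-1.
    table = {}
    t = 1
    for j in range(m):
        table.setdefault(t, j)
        t = (t * g) % p
    # Giant steps: compare (g^{-m})^i * y against the table for i = 0 .. m-1.
    gm_inv = pow(g, -m, p)        # g^{-m} mod p
    t = y % p
    for i in range(m):
        if t in table:
            return i * m + table[t]
        t = (t * gm_inv) % p
    return None
```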

This is an example of a space-time tradeoff; we’ve essentially turned an \(O(n)\) time, \(O(1)\) space algorithm into an \(O(\sqrt{n})\) time, \(O(\sqrt{n})\) space algorithm.

The name stems from the fact that every increment of \(j\) is a “baby step” (multiplication by \(g\)), while every increment of \(i\) is a “giant step” (multiplication by \(g^m\)).

Approach #3: Pollard’s \(\rho\) algorithm

It turns out it’s possible to compute discrete logs in \(O(\sqrt{n})\) expected time and \(O(1)\) space. Here’s the idea:

First, partition the group into three pairwise disjoint, approximately equal-sized sets \(S_0, S_1,\) and \(S_2\) such that \(G = S_0 \cup S_1 \cup S_2\). This partition should be made at random.

Consider the sequence \(h_i = g^{a_i}y^{b_i}\), where \(a_0 = b_0 = 1\) and \(h_i = f(h_{i-1})\) with

\[f(h) = \begin{cases} gh, & h \in S_0 \\ h^2, & h \in S_1 \\ yh, & h \in S_2 \end{cases}\]

Basically, a third of the time we increase \(a_i\) by one, a third of the time we increase \(b_i\) by one, and a third of the time we double both \(a_i\) and \(b_i\). This is just a convenient way to make the sequence walk through the space of \(g^{a_i}y^{b_i}\) pseudorandomly.

Why do we want to walk through this space? Well, consider what happens if we find a collision; that is, four integers \(a, b, a', b'\) such that

\[g^a y^b = g^{a'} y^{b'}.\]

We can rewrite this as \[y = g^{(a - a') / (b' - b)}\] and then compute the discrete log of \(y\) as \((a - a') / (b' - b)\), where division means multiplying by the inverse modulo \(n\), the order of the group. (This requires \(b' - b\) to be invertible mod \(n\), which is guaranteed when \(n\) is prime and \(b' \ne b\); otherwise, restart the walk from a different starting point.)

So how do we find a collision in our sequence using \(O(1)\) space? A collision in our sequence means that our sequence contains a cycle, and there’s a common trick that frequently pops up in algorithms interviews that allows us to find cycles in linked lists using \(O(1)\) space.

The idea, known as Floyd’s cycle-finding algorithm (the “tortoise and the hare”), is to use two iterators. One iterator increments through the sequence by one at each step, while the other increments by two at each step. If the linked list has a cycle, the two iterators will, at some point, have the same value.

For our sequence, we’ll use two variables \(s\) and \(t\), both initialized to \(h_0\), along with their exponent pairs \((a, b)\). On each step of our algorithm, we’ll set \(s\) to \(f(s)\) and \(t\) to \(f(f(t))\), updating the exponents to match, and check if the two are equal. If they are, we have a collision and can solve the DLP as above. If not, continue.
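Putting together the walk, the exponent bookkeeping, and Floyd’s cycle finding, here’s a minimal sketch (it uses the residue mod 3 as a crude stand-in for a random partition, and assumes the group order \(n\) is prime so the final division is well-defined):

```python
def pollard_rho_dlog(g, y, n, p):
    """Pollard's rho: find x with g^x = y (mod p), where n is the prime order of g."""
    def step(h, a, b):
        if h % 3 == 0:
            return (g * h) % p, (a + 1) % n, b            # h in S_0: multiply by g
        elif h % 3 == 1:
            return (h * h) % p, (2 * a) % n, (2 * b) % n  # h in S_1: square
        else:
            return (y * h) % p, a, (b + 1) % n            # h in S_2: multiply by y

    # Both iterators start at h_0 = g^1 * y^1.
    h1, a1, b1 = (g * y) % p, 1, 1
    h2, a2, b2 = h1, a1, b1
    while True:
        h1, a1, b1 = step(h1, a1, b1)         # tortoise: one step
        h2, a2, b2 = step(*step(h2, a2, b2))  # hare: two steps
        if h1 == h2:
            break
    if (b2 - b1) % n == 0:
        return None  # degenerate collision; restart with a different h_0
    return (a1 - a2) * pow((b2 - b1) % n, -1, n) % n
```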

By the birthday paradox, we expect a collision after on the order of \(\sqrt{n}\) steps, so the expected running time is \(O(\sqrt{n})\).

The name “rho” stems from the fact that the sequence with a cycle, when considered as a directed graph, takes the shape of the Greek letter \(\rho\). Pollard also has an analogous algorithm for factorization.

Approach #4: Pollard’s kangaroo algorithm

In the same paper where he introduced the \(\rho\) algorithm, Pollard also discussed a different algorithm to solve the discrete log problem. This algorithm is most useful when you already know some information about \(x\). Specifically, if you know that \(a \le x \le b\), then this algorithm will take \(O(\sqrt{b - a})\) time in expectation.

Here’s how it works: once again we’ll define a pseudorandom sequence \(x_i\). Let \(y_i = g^{x_i} \bmod p\) and

\[x_{i + 1} = x_i + 2^{y_i \bmod k}\]

for some small \(k\). (In practice I’ve found that setting \(k = \lceil \log_2(b - a)/2 \rceil \) works pretty well.) Note that the corresponding sequence of group elements satisfies

\[y_{i+1} = y_i \, g^{2^{y_i \bmod k}}.\]

The point is that on each step, this sequence jumps forward by a random-looking power of two between \(2^0\) and \(2^{k-1}\). In his paper, Pollard makes an analogy to a kangaroo leaping forward by a random distance on each jump.

The first step is to launch a “tame” kangaroo starting from \(x_0 = b\) and tell it to jump \(N\) times, for some fixed \(N\). Mathematically this means computing \(x_{N}\) and \(y_{N}\) starting from \(x_0 = b\) and \(y_0 = g^b\). Note that we don’t need to store the values of the intermediate hops; we just make our \(N\) jumps and then stop.

Now, let’s try to use our tame kangaroo to catch a wild kangaroo. Define a similar sequence starting from \(y_0 = y\). Repeatedly compute the next value in the sequence and check if it is equal to \(y_N\). We don’t know the corresponding values of \(x_i\) (indeed, that’s the value we’re trying to find), but we can keep track of the total distance hopped from the initial \(x\), because each hop has length \(2^{y_i \bmod k}\).

The hope is that at some point the wild kangaroo will land on a spot previously visited by the tame kangaroo. Once the two kangaroos have been on the same point, every subsequent jump will be identical, since the sequence is generated deterministically. Thus, if the wild kangaroo ever lands on the same spot as the tame kangaroo, it will eventually end up on the tame kangaroo’s final location \(y_N\).

(Picture the two paths as arrows along a number line: blue for the tame kangaroo, red for the wild one.) If the wild kangaroo eventually hits \(y_N\) after traveling a total distance \(d\), we can compute:

\[y_0 g^{d} = y_N = g^{x_N}\] \[y = y_0 = g^{x_N - d}\]

so the discrete log of \(y\) is \(x = x_N - d\).

There are a couple of remaining problems. This algorithm is inherently probabilistic, and it’s possible that the wild kangaroo will never overlap with the tame kangaroo. We can detect when this has happened by stopping the wild kangaroo after it has traveled a distance of \(d > x_N - a\). Since we know our wild kangaroo starts somewhere between \(a\) and \(b\), if it has traveled more than a distance of \(x_N - a\) it has definitely passed our tame kangaroo sitting at \(x_N\).

If this happens, the algorithm has failed, and we’ll need to try again with a different tame kangaroo (start the tame kangaroo at \(b + 1\) instead of \(b\) for example).

Note that both Pollard’s \(\rho\) method and his kangaroo method are probabilistic, but in different ways. The \(\rho\) method takes an unbounded amount of time, but always succeeds when it terminates, while one iteration of the kangaroo method takes a bounded amount of time, but is not guaranteed to succeed. This is the difference between a Las Vegas algorithm and a Monte Carlo algorithm.

Finally, we need to choose an appropriate value of \(N\) and then prove that the algorithm runs in time \(O(\sqrt{b - a})\). We’d like to set \(N\) such that the probability of succeeding on one iteration is a constant, independent of the parameters. Suppose the mean jump length is \(\mu\). The probability of any one jump of the wild kangaroo landing on a jump of the tame kangaroo is approximately \(1/\mu\). The tame kangaroo makes \(N\) jumps, and we succeed as long as at least one of those jumps coincides with the wild kangaroo’s path. The probability of this is:

\[1 - \left (1 - \frac{1}{\mu} \right) ^ N.\]

If we set \(N = 4 \mu\), then this is approximately \[1 - e^{-4} \approx 0.98.\] The mean jump length is \(\mu \approx 2^{k}/k\), so a good choice is \(N = 4 \cdot 2^{k}/k = 2^{k+2}/k\).

The wild kangaroo starts an expected distance \((b-a)/2\) behind the tame kangaroo’s starting point, so it needs about \((b-a)/(2\mu)\) jumps to catch up, plus up to \(N\) more while crossing the tame kangaroo’s path. The total number of jumps taken by all the kangaroos in expectation is therefore

\[N + \left(\frac{b-a}{2\mu} + N\right) = 8\left(\mu + \frac{b-a}{16\mu}\right).\]

The minimum value of this is \(4 \sqrt{b-a} \) which is achieved when \(\mu = \sqrt{b-a}/4\). (Incidentally, since \(\mu = 2^{k}/k\), this justifies setting \(k \approx \log_2{(b-a)}/2\)).

Thus, the algorithm takes \(O(\sqrt{b-a})\) time when it succeeds, and each trial succeeds with probability about 0.98.
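Here’s a sketch of one trial, wiring together the jump rule and the parameter heuristics above (the helper names are mine; a failed trial returns `None`, and the caller can restart with a shifted tame start):

```python
from math import ceil, log2

def kangaroo(g, y, p, a, b):
    """Pollard's kangaroo: find x in [a, b] with g^x = y (mod p); one trial."""
    k = max(1, ceil(log2(b - a) / 2))   # k ~ log2(b-a)/2, as above
    jumps = [2 ** j for j in range(k)]  # jump lengths 2^0 .. 2^{k-1}
    mu = (2 ** k - 1) / k               # mean jump length
    N = ceil(4 * mu)                    # N = 4*mu gives ~0.98 success probability

    # Tame kangaroo: N jumps starting from x_0 = b.
    xt, yt = b, pow(g, b, p)
    for _ in range(N):
        d = jumps[yt % k]
        xt, yt = xt + d, (yt * pow(g, d, p)) % p

    # Wild kangaroo: starts at the unknown x; track the distance traveled.
    yw, dw = y % p, 0
    while dw <= xt - a:                 # beyond this, we've overshot the trap
        if yw == yt:
            return xt - dw              # x = x_N - d
        d = jumps[yw % k]
        yw, dw = (yw * pow(g, d, p)) % p, dw + d
    return None                         # trial failed; retry, e.g. from b + 1
```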

Pollard has also referred to this algorithm as his \(\lambda\) algorithm because the shape of the converging kangaroo paths resembles the Greek letter \(\lambda\).

Approach #5: The Pohlig-Hellman algorithm

Remember how we said earlier that the order of the group must not be a smooth number? Here’s what goes wrong if \(n\) has a bunch of small prime factors:

(Some group theory background is helpful for this section.) Suppose that

\[n = \prod_{i = 1}^k p_{i}^{e_i}\] where the \(p_i\)’s are distinct primes. As usual we have \(y = g^x\) and we want to find \(x\).

For each \(i\) we’ll compute \(x \bmod{p_i^{e_i}}\) and then use the Chinese Remainder Theorem to find \(x\) itself.

For each \(i\), let \(g_i = g^{n / p_i^{e_i}}\). This element has order \(p_i^{e_i}\) (raise \(g_i\) to this power to verify). Now consider

\[y_i = y^{n / p_i^{e_i}} = g^{xn / p_i^{e_i}} = g_i^x = g_i^{x \bmod p_i^{e_i}}.\]

Since \(p_i^{e_i}\) is small, we should be able to find the \(r_i\) between \(0\) and \(p_i^{e_i} - 1\) such that \(y_i = g_i^{r_i} \pmod p\) using any of the previous DLP algorithms. Once we’ve done this for all \(i\), we have a system of congruences of the form \(x = r_i \pmod{p_i^{e_i}}\). Using CRT, we can then derive \(x\).
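Here’s a sketch tying the pieces together, reusing the `bsgs` function from the baby-step giant-step section for the small subgroup instances (the trial-division factorization is fine here precisely because we’re assuming \(n\) is smooth):

```python
def pohlig_hellman(g, y, n, p, dlog=bsgs):
    """Pohlig-Hellman: find x with g^x = y (mod p) when the order n of g is smooth."""
    # Factor n by trial division: n = prod(q^e).
    factors, m, q = {}, n, 2
    while q * q <= m:
        while m % q == 0:
            factors[q] = factors.get(q, 0) + 1
            m //= q
        q += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1

    # One congruence x = r_i (mod q^e) per prime power.
    residues, moduli = [], []
    for q, e in factors.items():
        qe = q ** e
        gi = pow(g, n // qe, p)   # element of order q^e
        yi = pow(y, n // qe, p)   # equals gi^x
        residues.append(dlog(gi, yi, qe, p))
        moduli.append(qe)

    # Recombine with the Chinese Remainder Theorem (Garner-style).
    x, M = 0, 1
    for r, qe in zip(residues, moduli):
        x += M * ((r - x) * pow(M, -1, qe) % qe)
        M *= qe
    return x % n
```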

Note that \(k\), the number of distinct prime divisors of \(n\), is bounded by \(k < \log_2{n}\), so it’s insignificant. The time complexity is then dominated by the largest prime factor \(p^*\) and is \(O(\sqrt{p^*})\). The space complexity is \(O(k) = O(\log{n})\), but since we need this much space just to store the number \(n\) in the first place, we’ll call it \(O(1)\).

Comparison table

| Algorithm | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
| Brute force | \(O(n)\) | \(O(1)\) | Don’t use. |
| Baby-step giant-step | \(O(\sqrt{n})\) | \(O(\sqrt{n})\) | Easy to implement, but Pollard’s \(\rho\) is better. |
| Pollard’s \(\rho\) | \(O(\sqrt{n})\) | \(O(1)\) | Fastest known generic algorithm for prime-order groups. |
| Pollard’s kangaroo | \(O(\sqrt{b-a})\) | \(O(1)\) | Use when you already know that \(a \le x \le b\). |
| Pohlig-Hellman | \(O(\sqrt{p^*})\) | \(O(1)\) | Use when \(n\) factors into small primes. |