Section 3.6 Attacking the discrete log problem
Part of understanding cryptography is understanding how attacks on cryptographic systems are performed. Since this is a mathematical cryptography course, we'll focus on mathematical attacks. Diffie-Hellman key exchange and the related Elgamal encryption system rely on the difficulty of the DLP for security. In this section, we'll look at some of the methods used to reduce the difficulty of attacking the DLP from the brute force case.
To make our study of difficulty precise, we need a way to quantify how hard a problem is to solve. In the cryptographic setting, difficulty always scales in some way with the size of the numbers used to construct a system, and we need a way to describe how that scaling affects methods of solution. We'll use the notion of the order of a function, usually called “big O notation”, which describes relationships between quantities as they grow large. The information contained in a big O relationship is asymptotic - that is, it tells us what happens as the inputs get large, but it shouldn't be expected to have any bearing on the relationship of the quantities in small cases.
Definition 3.6.1.
Let \(f(x)\) and \(g(x)\) be positive functions. Then we say that \(f(x) = \mathcal O(g(x))\) if there exist positive constants \(c, C\) so that
\begin{equation*}
f(x) \leq c\, g(x) \text{ for all } x \geq C\text{.}
\end{equation*}

The concept of order relationships is used all over mathematics and computer science. For our purposes, we are going to relate the size of our inputs to the number of steps required to perform some algorithmic process. We will compare input size in bits to the number of steps; a worked example follows the list below.
- Polynomial time: input \(\mathcal O(k)\) bits, running time \(\mathcal O(k^A)\) steps for some \(A\text{.}\) Different values of \(A\) give different running times:
- \(A = 1:\) linear time
- \(A = 2:\) quadratic time
- \(\displaystyle \vdots\)
Polynomial time computations are “easy”.
- Exponential time: input \(\mathcal O(k)\) bits, running time \(\mathcal O(e^{ck})\) steps for some constant \(c\text{.}\) Exponential time computations are “slow”.
- Subexponential time: somewhere in between. Example: running time \(\mathcal O(e^{c\sqrt{k \log k}})\text{.}\)
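For example (the particular function here is chosen just for illustration), an algorithm that takes \(3k^2 + 10k\) steps on a \(k\)-bit input runs in quadratic time: since
\begin{equation*}
3k^2 + 10k \leq 13k^2 \text{ for all } k \geq 1\text{,}
\end{equation*}
Definition 3.6.1 is satisfied with \(c = 13\) and \(C = 1\text{,}\) so the number of steps is \(\mathcal O(k^2)\text{.}\)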
The preference in cryptography is for systems that take exponential time to attack by brute force. So how hard is the discrete logarithm problem in \(\mathbb{F}_p^*\text{?}\)
Algorithm 3.6.2. Naive.
Let \(p\) be a prime with \(2^k \lt p \lt 2^{k+1}\text{.}\) The DLP in \(\mathbb{F}_p^*\) takes as input \(p, g, h\text{,}\) a total of about \(3k\) bits, so the input size is \(\mathcal O(k)\text{.}\) The naive attack computes the successive powers \(g, g^2, g^3, \ldots\) until \(h\) appears, which determines \(x\text{.}\) This can take as many as \(p - 1\) multiplications, so the running time is \(\mathcal O(2^k)\text{,}\) exponential in the input size.

The first reduction that we can make is a collision algorithm: we make lists and look for a match that unlocks the problem. The method, due to Shanks in the 1960s, prefigures the rise of the DLP as a cryptographic theme.
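Before turning to Shanks's method, here is a minimal Python sketch of the naive search in Algorithm 3.6.2. The function name and loop structure are illustrative choices for this sketch, not part of the text.

```python
# A minimal sketch of the naive (brute force) discrete log search:
# repeatedly multiply by g until h appears.  This uses O(p) group
# multiplications, i.e. exponential in the bit length k of p.
def naive_dlp(g, h, p):
    """Return x with g**x == h (mod p), or None if no solution exists."""
    value = 1
    for x in range(p - 1):
        if value == h % p:
            return x
        value = (value * g) % p
    return None

# Example: 2^x = 9 (mod 11) has the solution x = 6.
print(naive_dlp(2, 9, 11))  # 6
```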
Algorithm 3.6.3. Babystep-Giantstep.
Let \(G\) be a group, \(g\in G\) an element of order \(N\geq 2\text{.}\) The following procedure solves the DLP \(g^x = h\) in \(\mathcal O(\sqrt{N} \log N)\) steps.
- Let \(n = 1 + \lfloor \sqrt{N} \rfloor\text{,}\) where \(\lfloor \cdot \rfloor\) is the floor function.
- Create two lists:\begin{equation*} \begin{array}{ccccccc} e \amp g \amp g^2 \amp g^3 \amp \ldots \amp g^n \amp \text{:babystep} \\ h \amp h\cdot g^{-n} \amp h\cdot g^{-2n}\amp h\cdot g^{-3n} \amp \ldots \amp h\cdot g^{-n^2} \amp \text{:giantstep} \end{array} \end{equation*}
- The list will contain a collision. Suppose that\begin{equation*} g^i = h\cdot g^{-jn} \text{ for integers } i,j. \end{equation*}
- Then \(g^{i+jn} = h\) and so set \(x = i+nj\text{.}\)
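A short Python sketch of Algorithm 3.6.3 may help. The dictionary-based collision search and the helper names are implementation choices of this sketch, not prescribed by the text; a hash table stands in for explicitly sorting the two lists, but the count of group operations is the same order.

```python
from math import isqrt

def babystep_giantstep(g, h, p):
    """Solve g**x == h (mod p) by babystep-giantstep, taking N = p - 1."""
    N = p - 1
    n = 1 + isqrt(N)                 # n = 1 + floor(sqrt(N))

    # Babystep list: e, g, g^2, ..., g^n, stored as {value: exponent}.
    baby = {}
    value = 1
    for i in range(n + 1):
        baby.setdefault(value, i)
        value = (value * g) % p

    # Giantstep list: h, h*g^{-n}, h*g^{-2n}, ..., h*g^{-n^2},
    # stopping as soon as a collision with the babystep list is found.
    factor = pow(g, -n, p)           # g^{-n} mod p (Python 3.8+)
    value = h % p
    for j in range(n + 1):
        if value in baby:
            i = baby[value]
            return i + j * n         # g^{i + jn} = h
        value = (value * factor) % p
    return None

# Example: solve 2^x = 9 in F_11; the answer is x = 6.
print(babystep_giantstep(2, 9, 11))  # 6
```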
Some obvious questions should arise on seeing this algorithm. The first is: why should we expect a match at all? The answer lies in an application of the humble yet consistently useful division algorithm. Suppose that \(x\) is the exponent with \(g^x = h\text{.}\) Divide \(x\) by \(n\) and write \(x = nq + r\text{,}\) where \(0 \leq r \lt n\text{,}\) and further note that \(0 \leq q \lt n\text{.}\) To see this, recall that \(1\leq x \lt N\) (the order of an element in a finite group divides, and thus is no larger than, the order of the group), so
\begin{equation*}
0 \leq q = \frac{x - r}{n} \leq \frac{x}{n} \lt \frac{N}{n} \lt n\text{,}
\end{equation*}
where the last inequality follows from the fact that \(n \gt \sqrt{N}\text{.}\) Then since \(g^x = h\text{,}\) we get \(g^r = h\cdot g^{-nq}\text{,}\) where \(g^r\) is in the babystep list, and \(h \cdot g^{-nq}\) is in the giantstep list.
The second question concerns the speed of the computation. Step 2 takes about \(2n\) multiplications, and sorting and comparing the lists in Step 3 to find a match can be done in about \(n \log n\) steps. Combined, then, steps 2 and 3 take \(\mathcal O(n \log n) = \mathcal O(\sqrt{N} \log N)\) steps.
Convincing yourself that the estimates above are valid will go a long way toward clarifying the concept of big O notation.
The resulting algorithm is still exponential: since \(N\) is on the order of \(2^k\text{,}\) the routine is \(\mathcal O(\sqrt{2^k} \log 2^k) = \mathcal O(k \cdot 2^{k/2})\text{.}\) However, at the sizes typically involved in cryptography, reducing the work from \(N\) to roughly \(\sqrt{N}\) steps is a significant savings.
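As a rough numerical illustration (the size here is chosen only for concreteness, not taken from the discussion above): for a prime \(p\) near \(2^{160}\text{,}\) so that \(N \approx 2^{160}\text{,}\) the naive search requires on the order of \(2^{160}\) multiplications, while babystep-giantstep requires roughly
\begin{equation*}
\sqrt{2^{160}} \cdot \log_2 2^{160} = 2^{80} \cdot 160 \approx 2^{87}
\end{equation*}
group operations. Both counts are still enormous, but the second is smaller than the first by a factor of about \(2^{73}\text{.}\)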