Project 11 - Blog - Shor’s Algorithm for Discrete Logs

You’ll find this easier to understand if you are comfortable with modular arithmetic, polynomial vs exponential complexity and some quantum computing basics.

Elliptic curve cryptography (ECC) is one of the main types of cryptography in widespread use today. In this post, we’ll show how a quantum computer can run Shor’s algorithm, which, in principle, allows one to break ECC. In particular, Shor’s algorithm solves the discrete log problem (DLP). The ability to solve this problem quickly is a key advantage of quantum computers over classical computers and allows quantum computers to compromise otherwise robust cryptography. You might care about running Shor’s algorithm and breaking ECC if you own some Bitcoin and have lost the private key that controls them, or more maliciously, if you wanted to eavesdrop on some internet traffic.

Suppose you have lost the private key for some Bitcoin. Is there any way you could reclaim the key, or calculate it based on your public key? Your options include:

Guessing: You could guess private keys at random and try them out. However, there are $2^{256}$ possible Bitcoin private keys. Even with all the computers in the world at your disposal, you would likely spend trillions upon trillions of years guessing in vain.
Classical algorithm: We could use your public key and a supercomputer to try and solve the DLP. Using the best algorithms we know, a supercomputer might take 100 million years to return your private key. Your odds here are better than guessing, but still bad.
Quantum computer: Instead of normal (classical) computers, get a beefy quantum computer*. Using your public key, you can run Shor’s algorithm to solve the DLP and reclaim your private key. A sufficiently large quantum computer should take just hours or days to run Shor’s algorithm.
*Does not exist today. It would need to be fault-tolerant and have around 2200 logical qubits, which is thousands of times more complex than today’s most advanced quantum computer. You might be able to access to one in the 2030s.

Classical computers take so much longer to solve the DLP because the classical algorithms we use for solving the DLP run in exponential time, whereas Shor’s algorithm runs in polynomial time - but can only run on a quantum computer. A polynomial time classical algorithm might exist, but we haven’t found it.

Assuming we have the requisite quantum computer, what does Shor’s algorithm involve? First, we explain the DLP.

The Discrete Log Problem

If we ignore many important details, the DLP looks similar to this problem:

Given $2^{x} = 4$ , what’s x?

The first detail to note is that we choose a prime number p, and work (mod p), meaning we only need to consider the remainders numbers have when divided by p. Let’s modify our problem to reflect this. Note we’ve added (mod 5) and have swapped 4 for 1.

Given $2^x≡1 \pmod 5$ , what’s x?”

The powers of $2 \pmod 5$ are 2, 4, 3, 1, 2, 4, 3, ... and the answer is 4. Notice that we any multiple of 4 would be a correct answer, as the powers of $2 \pmod 5$ repeat every 4th power.

Because we choose a prime to work (mod p), the list of powers of any number between 1 and p - 1 (mod p) produces a repeating pattern. For example, 4 (mod 5) gives $4, 1, 4, 1, . . .$ What pattern do the powers of 3 (mod 5) produce?

3, 4, 2, 1, 3, …

For both 2 and 3, the patterns produced by their powers (mod 5) includes 1, 2, 3 and 4. That’s every possible number except 0 (mod 5). We call 2 and 3 generators mod 5. If a number g is a generator (mod p), then $g, g^{2}, g^{3}, . . .$ $\pmod p$ form a repeating pattern that includes every number $1, 2, 3, . . ., p - 1$ .

We’ll update our problem so that we replace 5 with some arbitrary prime p, and replace 2 with a generator, g:

Given $g^x≡1 \pmod p$ and knowing g and p, what’s x?

Remember that the powers of g include every possible number 1, 2, 3, … , p - 1. Instead of having the result of raising g to the power of x be 1, the result could be any number between 1 and p - 1. If we call that result r, we have:

Given $g^x≡r \pmod p$ and knowing g, r and p, what’s x?

This is a general form of the DLP, but we are not done. There are two complications to mention, both due to us being interested in breaking elliptic curve cryptography.

To ensure that supercomputers can’t solve the DLP quickly - perhaps by brute force - and break ECC, the numbers involved must be large. In the case of Bitcoin, p is chosen to be slightly less than $2^{256}$ , which is a certified very big number (VBN). With p specified, g, r and x can be similarly large, as they must fall between 1 and p - 1. This makes it infeasible to solve a single instance of the DLP, even with all the supercomputers on earth at your disposal.

Given $g^x≡r \pmod p$ and knowing g, r and p, with $p≈2^{256}$ , what’s x?

The above is a standard DLP, and can be solved by Shor’s algorithm. We need to solve a different type of discrete log problem, one involving elliptic curves, which is also infeasible for supercomputers to solve, and solvable by Shor’s algorithm. This complication is responsible for the ‘elliptic curve’ in elliptic curve cryptography.

We start by choosing a specific elliptic curve. In the case of Bitcoin, the elliptic curve used is defined by $y^{2} = x^{3} + 7$ , and is called secp256k1. We choose p points on the curve and say that we’re working in the group of elliptic curve points of size p.

Secp256k1 is the elliptic curve used in Bitcoin.

With p points chosen, each number 0, 1, 2, … , p - 1 has a corresponding point on the elliptic curve. Adding or multiplying these numbers / points always gives a point on the elliptic curve that corresponds to one of the numbers 0, 1, 2, … , p - 1, so the points behave very similarly to integers $\pmod p$ . Addition and multiplication of the points also looks very similar to those operations on integers $\pmod p$ .

In the DLP above, we saw $g^{x}$ , meaning $g$ was multiplied by itself $x$ times. In ECC, we add $g$ to itself $x$ times rather than multiplying $g$ by itself $x$ times.

This gives us the full formulation of the elliptic curve DLP:

Given $g \cdot x = r g⋅x=r$ in the group of elliptic curve points of size p, and knowing g, r and p, what’s x?

That’s it. Reliably breaking elliptic curve cryptography requires solving the DLP, and the near-impossibility of doing this with classical computers is what makes ECC secure. We noted, however, that a large-scale, fault-tolerant quantum computer could break the DLP. It can do so using Shor’s Algorithm for the DLP.

Shor’s Discrete Log Algorithm

Peter Shor’s original paper in 1994 introduced two major quantum algorithms: One for solving DLPs and another for factoring integers into prime numbers. Both algorithms demonstrate an exponential speedup over the best known classical algorithms, but the factoring algorithm is much better known. That’s probably because factoring numbers is easier to explain to a lay audience than elliptic curves. Here, we are interested in the DLP algorithm.

Quantum computers are not uniformly superior to classical computers, but they are better at a few specific families of problems. They can calculate multiple things at the same time, but we can only extract a single one of the calculated answers before destroying all the other ones, which makes quantum computers less useful than some people imagine. Relatedly, we can often find faster quantum algorithms for problems that involve calculating some global property of a function or set of numbers, such as whether a function’s outputs are mostly positive, negative or evenly balanced.

In our case, Shor’s algorithm reduces the DLP to a period-finding problem: Given a large set of large numbers that we know to have a pattern, what’s the period of the pattern? This is something quantum-computers are good at. Period-finding is itself a problem that can be reduced to the Abelian hidden subgroup problem, and all such problems are susceptible to quantum speedups.

Designing quantum algorithms is a major challenge. One has to figure out if an interesting problem can be reduced to period-finding or a similarly quantum-friendly problem and then create a circuit of qubits and logic gates to convert the original problem into the quantum-friendly version. The result of the quantum-friendly translation must give a correct answer to the original problem.

In Shor’s algorithm, the DLP is the original problem, and period-finding is the target problem that the DLP gets reduced to. Once the period-finding problem is in place, an operation called a Quantum Fourier Transform (QFT) solves it and the result from the QFT and a little number theory and some additional calculations solve the DLP.

Shor’s algorithm for the DLP is surprisingly short. First, two lists of the numbers $0, 1, 2, . . ., p - 1$ are created. Next, a modular exponentiation is calculated and measuring this causes the two lists to be restricted to a set of pairs of numbers which obey a pattern related to $x$ . A QFT then extracts the period, which leads directly to the solution, $x$ . The algorithm does not work 100% of the time, but it works with high probability, and can be run multiple times until the correct answer, $x$ , is found.

Step 1: Setting up two lists of numbers, represented by not many qubits

We start with three sets of qubits, which we call registers. Two of the registers are each going to represent an identical list of numbers $0, 1, . . ., p - 2$ . The third register will be used later to compute and store a multiplication and addition involving $g$ and $r$ .

How many qubits do we need for the first two registers? Like in classical computing, qubits can represent binary numbers through their states 0 and 1. Because $p≈2^{256}$ , it is 256 bits long and we need 256 qubits or so to represent the largest numbers we’re dealing with. In total, the two registers require $\sim512$ qubits. A number close to $p$ would look like:

$110100100100100100100100100100100100100100100100...010101$ , with 256 bits.

These being qubits, they can be placed in a superposition of multiple states at once. A single qubit can be in a superposition between 0 and 1. When we measure that qubit, we will detect one and only one of those states, either 0 or 1. If we have two qubits, we can let their measured states represent binary numbers, with $00 \rightarrow 0,01 \rightarrow 1,10 \rightarrow 2,11 \rightarrow 3$ . If we put both qubits in superposition between 0 and 1 and measure them both, we have a chance of getting any of the 4 possible states. If the leading qubit results in state 1 and the latter qubit results in state 0, we have collapsed the superposition into state 10, or we could say the measurement’s result is 3.

With 256 qubits in each register, we can represent $2^{256}$ numbers at once. We choose to represent every number $0, 1, 2, . . ., p - 2$ . (Why $p - 2$ ? Some number theory explains). We do this by starting with each qubit in the state 0 and then implementing a Hadamard gate on each qubit. The Hadamard gate takes a qubit from state 0 and puts it in a balanced superposition between 0 and 1. Formally, this looks like $0 \rightarrow \frac{1}{\sqrt 2} ⋅ 0+ \frac{1}{\sqrt 2}⋅1$ .

Each of the registers is now in a superposition of each number (or elliptic curve point) from $0$ to $p - 2$ . The two registers have no dependency on each other. Together, the registers represent all possible pairs of numbers between $0$ and $p - 2$ . If we measure each of them, we will end up with random pair from the $(p - 1)^{2}$ possible results. For $p = 7$ (far too small!), possible pairs include $(0, 4), (3, 2), (5, 1)$ .

Step 2 - Entangle the two registers and create a pattern based on x

Our next step is to create some sort of pattern in the states of the two registers. They are in superpositions that are evenly balanced between all possible pairs $(0, 0), (0, 1), . . ., (p - 2, p - 2)$ , but that even distribution has no pattern. It’s totally uniform, random, flat. How might we create a pattern with some relation to $x$ , the DLP’s solution?

Suppose we removed every odd number from the first register’s state. That would create a pattern in the first register’s state. As a result, measuring the first register would always give an even number. The second register could still result in any number between $0$ and $p - 2$ . There’d be no correlation between the two registers though, they’d still be independent lists.

Shor’s algorithm creates a pattern related to $x$ by removing some of the numbers from each register’s state. At the same time, the two registers are made interdependent, or entangled. The states of the registers now have to satisfy some equation, and whenever one register is measured, the second register must only collapse into states that satisfy the equation, taking the first measurement into account. Taken together, the two registers are restricted to superpositions of all possible pairs that satisfy the equation linking them. By ensuring the equation involves $x$ , we have created a pattern related to $x$ .

We now use the third register, recalling that we still have two independent registers. In the third register, we compute and store the result of $g \cdot a + r \cdot b g⋅a+r⋅b$ , which is another point on the elliptic curve. $a$ and $b$ represent inputs from the first two registers, and gg and rr come from the problem statement. Since the first two registers are in superpositions and we are using a quantum computer, the third register calculates $g \cdot a + r \cdot b g⋅a+r⋅b$ for every allowed value of $a$ and $b$ and ends up in a superposition of all the results. We can recover only a single one of these results though, because once we interact with and measure the register, it collapses to exactly one state.

We measure the third register, causing it to collapse to a specific value of $g \cdot a + r \cdot b g⋅a+r⋅b$ . From this point on we have no further use for the third register and can ignore it.

From the DLP’s statement, we know $r = g \cdot x r=g⋅x$ .

$g \cdot a + r \cdot b = g \cdot a + g \cdot x \cdot b = g \cdot (a + x \cdot b) = g \cdot d g⋅a+r⋅b=g⋅a+g⋅x⋅b=g⋅(a+x⋅b)=g⋅d$

We can write the result of measure the third register as $g \cdot d g⋅d$ , and we have $d = a + x \cdot b d=a+x⋅b$ .

When we measured the third register, we took a function that had been calculated by using every possible value from the first two registers as input. Clearly, the result of the third register should agree with the first two registers. In particular, the first two registers should only occupy states, or superpositions of states, that satisfy $d = a + x \cdot b d=a+x⋅b$ .

At the end of step 1, the combined states of the first two registers included all $p - 1$ numbers $(0, 1, . . ., p - 2)$ in any pairwise combination. After measuring the third register, their states change. They are no longer independent and free to become any number between $0$ and $p - 2$ . Instead, the two registers are entangled, and we treat their possible values as pairs instead of as two unrelated values. In particular, the two registers are now in a superposition of all possible pairs $(a, b)$ that satisfy $d = a + x \cdot b d=a+x⋅b$ . If we measure one of the registers, that result will further restrict the possible outcomes of measuring the remaining register.

The pairs that the two registers can occupy must satisfy $d = a + x \cdot b d=a+x⋅b$ , where $d$ and $x$ have been fixed. Without knowing $x$ , we’ve entangled two registers in a way that depends on $x$ and created something that looks like a period-finding problem - if we were working in 2D space, the allowed pairs would be those that sit on lines with slope $x$ .

Now we’ve gotta somehow get $x$ from the entangled states of the two registers. If we could measure the registers many times, we could figure out what $x$ was, but we’re in the quantum computing realm and only get one measurement before we destroy our hard-won information.

Step 3 - Extract the period

It turns out there’s a way to extract the period. We use a Fourier Transform, which is one of the most commonly-used algorithms in all of computing. Straightforwardly enough, the quantum version of the algorithm is called the Quantum Fourier Transform (QFT). You can learn more about the Fourier Transform via this excellent interactive explainer.

A key advantage of quantum computers is that when working with large sets of numbers, the QFT is much more efficient than computing a Fourier Transform using classical computers. The classical Fourier Transform requires an amount of time proportional to the number of points in the input data, whereas the QFT requires time proportional to the logarithm of the number of points being processed. In our case, the points in question would be the allowed pairs that are in superposition and make up the states of the two entangled register. If we make a very crude approximation and assume we have about $2^{256}$ such pairs, computing a classical FT is completely impractical, whereas the time required for a quantum computer is quite reasonable.

Classical FT on $2^{256}$ points: $O(n \log n) \rightarrow O(2^{256}⋅256) \sim O(10^{79})$

QFT on $2^{256}$ points: $O((\log n)^2) \rightarrow O(256^2) \rightarrow O(65536)$

The Fourier Transform acts as a frequency analyzer, taking input data and converting it to a sum of simple periodic patterns. Extracting a frequency immediately gives us the period of any such pattern. If we apply a QFT to the two entangled registers, their resulting states are still entangled, but they will change to represent something new. Instead of representing the pairs of numbers $(a, b)$ that satisfy $d = a + x \cdot b d=a+x⋅b$ , the registers will represent the sets of frequencies $(f_{\mathbf{1}},f_{\mathbf{2}})$ , which can be combined to create the preceding set of pairs. $f_{\mathbf{1}}$ and $f_{\mathbf{2}}$ will be related - otherwise the registers would no longer be entangled. $f_{\mathbf{1}}$ is the state of the first register, while $f_{\mathbf{2}}$ is the state of the second register.

We apply a QFT to each of the two registers and measure both registers, getting two numbers, or frequencies, as our results. The superposition we’ve computed is destroyed and the two registers are no longer in superposition. Our outputs are some values $f_{\mathbf{1}}$ and $f_{\mathbf{2}}$ .

At this point, we do not have $x$ , but taken together, $f_{\mathbf{1}}$ and $f_{\mathbf{2}}$ give us some information on $x$ . The whole of Shor’s algorithm involves repeating the above quantum program multiple times, eventually having enough information to calculate $x$ in a short amount of time using classical computers and number theory. This part of the algorithm is called post-processing and does not involve a quantum computer. If you want to dive in to that part of the algorithm, see Shor’s original paper, pages 20 - 24.

To find out more, you might check out Proos & Zalka, this Youtube video, or talk about any of the above with an LLM. This piece is inspired by Scott Aaronson’s excellent explainer for Shor’s factoring algorithm.

References

Childs Lecture Notes, 2011 - https://www.cs.umd.edu/~amchilds/teaching/w11/l03.pdf
Proos, Zalka, 2004, Shor’s for discrete log - https://arxiv.org/pdf/quant-ph/0301141
Ekerå, 2024, Shor’s for discrete log - https://arxiv.org/pdf/1905.09084
Cryptography Stack Exchange Using Shor's algorithm to solve the discrete logarithm problem
Quantum Computing Stack Exchange How to build an example of Shor's algorithm for the discrete log?