Wagner's Birthday Attack - How to Break InsecureMuSig

Introduction

MuSig is a protocol which allows mutually distrustful parties to safely create aggregated digital signatures together. I highly recommend you read my MuSig article before diving into this one, as it lays down the context. If you’re already familiar with MuSig and Schnorr, carry on!

The MuSig protocol has several iterations:

  1. InsecureMuSig - An outdated and insecure 2-round version
  2. MuSig1 - The secure 3-round variant
  3. MuSig2 - The faster but slightly less secure 2-round variant

InsecureMuSig - the original form of MuSig - was proven insecure by Drijvers et al in 2018. The attack described in their paper depended on the fact that under InsecureMuSig, co-signers could manipulate the final aggregated nonce used for signing by submitting rogue nonces, computed as a function of their peers’ nonces. In response, the authors of MuSig amended their paper with a third nonce-commitment round of communication, thus preventing co-signers from providing rogue nonces.

This Blockstream article briefly summarizes the approach used by the attack, but only from the perspective an attack on an insecure implementation of MuSig1, in which nonces themselves are pre-shared before the message to sign is chosen. For a time, this article led me to mistakenly assume that it was the pre-sharing of nonces which made InsecureMuSig so insecure. However, that is not the approach used by Drijvers et al in their original paper. In the paper, Drijvers et al made no mention of nonce pre-sharing at all.

To my knowledge, there does not yet exist a full approachable explanation of how Wagner’s Attack broke InsecureMuSig. That is what I hope this article will be.

Bear in mind that I am no expert on Wagner’s Algorithm, but merely a tourist inclined to investigate algorithms which I find interesting. If you notice any mistakes, please suggest an edit on Github, or email me! I’d be very grateful for the help.

Notation

Just so we’re all on the same page:

Notation Meaning
The value is a member of set .
The sum of all from to . i.e.
is equivalent to modulo .
The base-point of the secp256k1 curve.
The order of the secp256k1 curve. There are possible valid non-zero points on the curve, plus the ‘infinity’ point (AKA zero).
A cryptographically secure namespaced hash function under the namespace . Returns an integer mod .
Sampling randomly from the set of integers modulo . Note that we exclude zero when sampling.
Sampling randomly from the set of secp256k1 curve points. Note that we exclude the point at infinity (AKA zero) when sampling.
Concatenation of the byte arrays and .
The set of all integers from to (inclusive), i.e. every such that
“For every element in set .”
The order of (AKA number of elements in) set .

The Attack

The Setup

Alice, Bob and Carol have an aggregated public key defined as per the normal MuSig1 protocol.



Carol has some evil message she wants to sign using the aggregated key, but which Alice and Bob would never agree to sign. Perhaps a signature on would steal all of Alice and Bob’s Bitcoins. Perhaps it would sign a TLS certificate which only Carol shouldn’t actually have. Perhaps it is the string “Alice sucks and must not be invited to parties”. You get the idea.

Requirements

Carol will execute her dastardly attack by opening a number of concurrent signing sessions with Alice and Bob, and providing a set of carefully chosen rogue nonces. Rogue nonces behave similarly to rogue keys, which I highlighted in my MuSig article. If Carol learns Alice’s nonce and Bob’s nonce before revealing her own nonce , then Carol can offer up a rogue nonce which fools Alice and Bob into computing the aggregated nonce to be whatever Carol desires. When Carol does this, she can cause Alice and Bob’s signatures on to change in a way she fully controls, albeit unpredictably.

The “fully controls” part is particularly important. For Wagner’s Attack to work Carol must learn Alice and Bob’s public nonces before choosing her own, otherwise she cannot control the challenge and her attack will not work.

This is why InsecureMuSig was insecure: Unlike MuSig1, InsecureMuSig didn’t have a nonce-commitment phase. Co-signers simply exchanged their public nonces at will with no additional commitment phase or scrambling involved.

The Procedure

OK enough babbling. Let’s dive right in and see how Carol attacks this system.

  1. Carol opens concurrent signing sessions with Alice and Bob, in which she proposes to sign innocuous messages which Alice and Bob would be perfectly agreeable to signing. can be any power of two - higher is generally more efficient, but riskier for Carol. should approach but not exceed the maximum number of concurrent signing sessions that Alice and Bob can safely handle.

  2. Carol waits for Alice and Bob to share their public nonces with her in every one of the signing sessions.

  3. Carol picks a nonce of her own which will protect her private key once the forgery is complete.


  1. Carol runs Wagner’s Algorithm to find different curve points which hash to challenges which sum to an evil challenge .


  • is the “evil nonce” which will eventually be used for Carol’s forged signature. It is the sum of all of Alice and Bob’s nonces across all signing sessions, plus Carol’s own nonce .
  • Each is hashed with the message to produce a challenge for signing session .
  • Through the wonder of Wagner’s Algorithm, the sum of all those hashes equals another hash which Carol has chosen based on an evil message (which Alice and Bob never see).
  1. Carol calculates a set of rogue nonces , one for each signing session.

The purpose of each rogue nonce is to fool Alice and Bob into using the point as the aggregated nonces for signing session . Alice and Bob, upon receiving , will compute the aggregate nonce as:

Alice and Bob do not know to avoid . Because InsecureMuSig lacks nonce commitments, Alice and Bob do not know whether Carol provided a genuine nonce or a rogue one.

Carol can thus convince Alice and Bob to sign with the “aggregated” nonces chosen by Carol. She can do this because she waited for Alice and Bob to expose their nonces to her, so she can easily compute her nonces as a function of theirs.

By now, perhaps you are starting to see why why Carol needed to open those signing sessions concurrently (at the same time), rather than sequentially. She needs to know all of Alice and Bob’s nonces in order to compute the points with Wagner’s Algorithm, so she had to at least start the signing sessions before she could perform step 4. However, she also can’t conclude any signing session until she knows which point to aim for with her rogue nonce. Thus, the signing sessions must be concurrent.

  1. Carol awaits the resulting pairs of partial signatures from Alice and Bob.

Let and denote Alice and Bob’s secret nonces for signing session . We can assume these are honestly random and unknown to Carol.

Alice and Bob will have computed their partial signatures as follows.




See how by selecting rogue nonces, Carol convinced her peers to sign the challenges which were selectively chosen by Carol, such that .

  1. Carol sums Alice and Bob’s partial signatures from those signing sessions to create a usable forged signature which verifies under the aggregate key on Carol’s chosen message and evil nonce . Carol must add her own partial signature to complete the forgery.


This seems a bit too easy. How could this possibly work? Let’s expand to see why it would be valid.

Group the private nonces all on the left, Alice and Bob’s private key terms in the middle, and Carol’s private key term on the right.

that doesn’t look like a valid Schnorr signature.

Not yet! But what if we…

Substitute All The Things

Here’s where the magic of Wagner’s Algorithm makes itself felt. Recall how Carol used Wagner’s Algorithm to find the the challenges so that they sum to her evil challenge .

Wondering how Carol did this? We’ll get there eventually.

This relationship allows Carol to factor out from the aggregated partial signatures, just as with a regular aggregated Schnorr signature, except this time they are super-duper aggregated - a rigorous technical term which denotes that the signatures came from many signing sessions and from multiple peers.

Oh look. Now we can merge the middle and right groups of terms by factoring out .

This is starting to look suspiciously similar to a valid Schnorr signature.

what about that first term with the sum of the secret nonces?

Remember how we defined the evil nonce as the sum of Alice and Bob’s public nonces , plus Carol’s nonce.

Although Carol doesn’t know the actual value of , she knows it must be the discrete logarithm (secret key) of the evil nonce . Let’s simplify by denoting that secret evil nonce as .

We can substitute into the signature equation to represent the hypothetical aggregated secret nonce, which is obscured but is still part of the forged signature.

We can further simplify by denoting the hypothetical aggregated private key as .

The forged signature can be verified as with any standard ol’ Schnorr signature.

If you understand everything above, then congratulations: You’ve just grasped the clever attack that forced some of the most famous cryptography experts in the Bitcoin community to backtrack and rework their original InsecureMuSig algorithm. You could close the page here, and give your brain a little “good job!” affirmation for grasping such an outlandish cryptographic attack.

But the perversely curious among you may wonder,

how did Carol find those curve points ?

how did she know they would result in challenges which sum to ?

If these questions are eating at you, read on to learn more. But beware! Here be dragons…

Wagner’s Algorithm

From an attentive reading of the attack procedure, I hope you’ll agree that the secret sauce seems to come in the form of some black box which I referred to as Wagner’s Algorithm. Carol used it to find a set of points that she used to fool Alice and Bob into signing challenges which Carol chose. This resulted in a forgery.

This black box is an algorithm originally described in David Wagner’s 2002 paper on the Generalized Birthday Problem in which he demonstrates a probabilistic search algorithm which can be used to find additively related elements in lists of random numbers.

what the hell does that mean?

Let’s back up a bit and sketch out the high-level game plan here. Wagner’s original paper is quite dense. I made several attempts at this section of the article before landing on what I believe to be a reasonably intuitive but also succinct description of Wagner’s Algorithm. It will help if I outline our approach before I describe the algorithm itself.

  • First we will review some principles of probability theory which we’ll need to make use of.
  • We’ll generate a bunch of lists of random hashes.
  • We’ll define a merging operation to merge pairs of lists together into another list of about the same size as either parent list.
  • Those merged lists will contain sums of hashes which are close to zero mod .
  • We’ll repeat this merging operation recursively in a tree-like structure until the hash-sums approach zero.
  • We’ll follow pointers back back up the tree to the original lists, and find inputs which, when hashed, should sum to any number we want mod .

Probabilities

Before we dive into a probability-driven algorithm, let’s review some probability theory!

Probability of Modular Sums

Imagine you have three D6 dice and roll them all independently. What are the odds that those three dice face values add up to a number which is divisible by six?


Assuming all the dice are evenly weighted and rolled independently, there are equally likely outcomes.

1 1 2
1 1 3
1 1 4
1 1 5
1 1 6
1 2 1
1 2 2
1 2 3
1 2 4
1 2 5
1 2 6
1 3 1
6 6 5
6 6 6

This works out such that each possible sum mod 6 is equally likely, i.e.

There’s nothing special about a modular sum of zero here. You can pick any constant and your odds of rolling a modular sum of are the same: .

Imagine rolling one die. That event has an exact probability of rolling any . Take that value mod 6, and it maps 1:1 to with exactly the same probability.

Roll a second die and add it to the first. This adds some again with uniform probability, which brings the modular sum once more to with the exact same probability distribution. Repeat ad infinitum. Thus, the probability distribution of a modular sum does not change no matter how many dice are summed.

Expected Values

The expected value of some variable is the value you can expect it to hold on average. It can be calculated by taking a weighted sum of every possible outcome value, multiplied by the probability of that outcome.

For example, if you were to roll a six-sided die , its expected value would be the weighted sum of each possible die face value. The expected value of a dice roll turns out to be .