Introduction
MuSig is a protocol which allows mutually distrustful parties to safely create aggregated digital signatures together. I highly recommend you read my MuSig article before diving into this one, as it lays down the context. If you’re already familiar with MuSig and Schnorr, carry on!
The MuSig protocol has several iterations:
 InsecureMuSig  An outdated and insecure 2round version
 MuSig1  The secure 3round variant
 MuSig2  The faster but slightly less secure 2round variant
InsecureMuSig  the original form of MuSig  was proven insecure by Drijvers et al in 2018. The attack described in their paper depended on the fact that under InsecureMuSig, cosigners could manipulate the final aggregated nonce used for signing by submitting rogue nonces, computed as a function of their peers’ nonces. In response, the authors of MuSig amended their paper with a third noncecommitment round of communication, thus preventing cosigners from providing rogue nonces.
This Blockstream article briefly summarizes the approach used by the attack, but only from the perspective an attack on an insecure implementation of MuSig1, in which nonces themselves are preshared before the message to sign is chosen. For a time, this article led me to mistakenly assume that it was the presharing of nonces which made InsecureMuSig so insecure. However, that is not the approach used by Drijvers et al in their original paper. In the paper, Drijvers et al made no mention of nonce presharing at all.
To my knowledge, there does not yet exist a full approachable explanation of how Wagner’s Attack broke InsecureMuSig. That is what I hope this article will be.
Bear in mind that I am no expert on Wagner’s Algorithm, but merely a tourist inclined to investigate algorithms which I find interesting. If you notice any mistakes, please suggest an edit on Github, or email me! I’d be very grateful for the help.
Notation
Just so we’re all on the same page:
Notation  Meaning 

The value 

The sum of all 

The basepoint of the secp256k1 curve.  
The order of the secp256k1 curve. There are 

A cryptographically secure namespaced hash function under the namespace 

Sampling 

Sampling 

Concatenation of the byte arrays 

The set of all integers from 

“For every element 

The order of (AKA number of elements in) set 
The Attack
The Setup
Alice, Bob and Carol have an aggregated public key
Carol has some evil message
Requirements
Carol will execute her dastardly attack by opening a number of concurrent signing sessions with Alice and Bob, and providing a set of carefully chosen rogue nonces. Rogue nonces behave similarly to rogue keys, which I highlighted in my MuSig article. If Carol learns Alice’s nonce
The “fully controls” part is particularly important. For Wagner’s Attack to work Carol must learn Alice and Bob’s public nonces before choosing her own, otherwise she cannot control the challenge
This is why InsecureMuSig was insecure: Unlike MuSig1, InsecureMuSig didn’t have a noncecommitment phase. Cosigners simply exchanged their public nonces at will with no additional commitment phase or scrambling involved.
The Procedure
OK enough babbling. Let’s dive right in and see how Carol attacks this system.
Carol opens
concurrent signing sessions with Alice and Bob, in which she proposes to sign innocuous messages which Alice and Bob would be perfectly agreeable to signing. can be any power of two  higher is generally more efficient, but riskier for Carol. should approach but not exceed the maximum number of concurrent signing sessions that Alice and Bob can safely handle. Carol waits for Alice and Bob to share their public nonces
with her in every one of the signing sessions. Carol picks a nonce of her own which will protect her private key once the forgery is complete.
 Carol runs Wagner’s Algorithm to find
different curve points which hash to challenges which sum to an evil challenge .
is the “evil nonce” which will eventually be used for Carol’s forged signature. It is the sum of all of Alice and Bob’s nonces across all signing sessions, plus Carol’s own nonce .  Each
is hashed with the message to produce a challenge for signing session .  Through the wonder of Wagner’s Algorithm, the sum of all those
hashes equals another hash which Carol has chosen based on an evil message (which Alice and Bob never see).
 Carol calculates a set of rogue nonces
, one for each signing session.
The purpose of each rogue nonce
Alice and Bob do not know to avoid
Carol can thus convince Alice and Bob to sign
By now, perhaps you are starting to see why why Carol needed to open those
 Carol awaits the resulting
pairs of partial signatures from Alice and Bob.
Let
Alice and Bob will have computed their partial signatures as follows.
See how by selecting rogue nonces, Carol convinced her peers to sign the challenges
 Carol sums Alice and Bob’s partial signatures from those
signing sessions to create a usable forged signature which verifies under the aggregate key on Carol’s chosen message and evil nonce . Carol must add her own partial signature to complete the forgery.
This seems a bit too easy. How could this possibly work? Let’s expand
_{Group the private nonces all on the left, Alice and Bob’s private key terms in the middle, and Carol’s private key term on the right.}
that doesn’t look like a valid Schnorr signature.
Not yet! But what if we…
Substitute All The Things
Here’s where the magic of Wagner’s Algorithm makes itself felt. Recall how Carol used Wagner’s Algorithm to find the the challenges
_{Wondering how Carol did this? We’ll get there eventually.}
This relationship allows Carol to factor out
Oh look. Now we can merge the middle and right groups of terms by factoring out
This is starting to look suspiciously similar to a valid Schnorr signature.
what about that first term with the sum of the secret nonces?
Remember how we defined the evil nonce
Although Carol doesn’t know the actual value of
We can substitute
We can further simplify by denoting the hypothetical aggregated private key as
The forged signature
If you understand everything above, then congratulations: You’ve just grasped the clever attack that forced some of the most famous cryptography experts in the Bitcoin community to backtrack and rework their original InsecureMuSig algorithm. You could close the page here, and give your brain a little “good job!” affirmation for grasping such an outlandish cryptographic attack.
But the perversely curious among you may wonder,
how did Carol find those curve points
? how did she know they would result in challenges which sum to
?
If these questions are eating at you, read on to learn more. But beware! Here be dragons…
Wagner’s Algorithm
From an attentive reading of the attack procedure, I hope you’ll agree that the secret sauce seems to come in the form of some black box which I referred to as Wagner’s Algorithm. Carol used it to find a set of points that she used to fool Alice and Bob into signing challenges which Carol chose. This resulted in a forgery.
This black box is an algorithm originally described in David Wagner’s 2002 paper on the Generalized Birthday Problem in which he demonstrates a probabilistic search algorithm which can be used to find additively related elements in lists of random numbers.
what the hell does that mean?
Let’s back up a bit and sketch out the highlevel game plan here. Wagner’s original paper is quite dense. I made several attempts at this section of the article before landing on what I believe to be a reasonably intuitive but also succinct description of Wagner’s Algorithm. It will help if I outline our approach before I describe the algorithm itself.
 First we will review some principles of probability theory which we’ll need to make use of.
 We’ll generate a bunch of lists of random hashes.
 We’ll define a merging operation to merge pairs of lists together into another list of about the same size as either parent list.
 Those merged lists will contain sums of hashes which are close to zero mod
.  We’ll repeat this merging operation recursively in a treelike structure until the hashsums approach zero.
 We’ll follow pointers back back up the tree to the original lists, and find inputs which, when hashed, should sum to any number we want mod
.
Probabilities
Before we dive into a probabilitydriven algorithm, let’s review some probability theory!
Probability of Modular Sums
Imagine you have three D6 dice
Assuming all the dice are evenly weighted and rolled independently, there are
1  1  2 
1  1  3 
1  1  4 
1  1  5 
1  1  6 
1  2  1 
1  2  2 
1  2  3 
1  2  4 
1  2  5 
1  2  6 
1  3  1 
…  …  … 
6  6  5 
6  6  6 
This works out such that each possible sum mod 6
There’s nothing special about a modular sum of zero here. You can pick any constant
Imagine rolling one die. That event has an exact
Roll a second die and add it to the first. This adds some
Expected Values
The expected value of some variable is the value you can expect it to hold on average. It can be calculated by taking a weighted sum of every possible outcome value, multiplied by the probability of that outcome.
For example, if you were to roll a sixsided die