Introduction
Many culture articles in the Bitcoin space will extol the sound byte:
Schnorr signatures will improve Bitcoin’s privacy for multisignature transactions!
…and yet most choose to omit the clever math which makes this statement true. The math isn’t mind-bogglingly complex. One needs only a basic grasp of elliptic curve cryptography, and the benefits of Schnorr Signatures quickly become very exciting.
Preliminaries
Just so we’re all on the same page:
Notation | Meaning |
---|---|
The base-point of the secp256k1 curve. | |
The message we’re trying to sign (a byte array). | |
The SHA-256 hash function. | |
The order of the secp256k1 curve. There are |
|
Sampling |
|
Concatenation of the byte arrays |
In general, upper-case variables like
Enter the Frankenstein
To understand how delicious Schnorr Signatures are for Bitcoin developers, we need to compare it to what we have now: ECDSA. I’ll forgive you if you choose to skip this part.
Show me the ugly ECDSA math!
I guess you’re really gonna make me huh? Best get this over with, then. *drinking intensifies
The old ECDSA protocol signs messages like so:
- Choose our private key,
, which is just a random integer in .
- Compute the hash of the message and interpret it as an integer
.
- Sample a random nonce
.
- Multiply
by the base point .
- Take the
coordinate from that computation modulo .
* If
- Signing time! Compute this mess.
* If
**
The final signature is the awful pair of integers
Verifying the signature goes (painfully) as follows:
- Find (or compute) the public key.
- Check that both
and are integers in .
- Compute the hash of the message that was signed, just as the signer would have.
- Compute these intermediate “u-values”.
- Multiply
and by the base point and public key respectively.
- Sum both points to get a verification point.
- Verify the X-coordinate of that point is equivalent to the signature’s
component when taken modulo .
If so, then wham you’ve got a valid signature. “That was easy wasn’t it?” he mused ironically…
but why does that work?
-I hear you ask.
I am not a masochist, so I will spare myself more work rewriting a proof for an algorithm which I’m taking such pains to paint as awful. Instead, go check out this proof of ECDSA in Doctor Nakov’s Practical Cryptography for Developers e-book.
but why is ECDSA so awful?
For starters, it requires computing modular multiplicative inverses of the nonce
Furthermore, just look at the number of steps I had to write out above. It’s less a signing algorithm than a surgical procedure which happens to involve your private keys.
Also, ECDSA signatures are malleable.
Furtherermore, ECDSA has pretty weak security proofs.
Furtherestmore, ECDSA (and its forerunner DSA) were originally published in the early nineties to get around the patent on Schnorr Signatures, which finally lapsed in 2008. Bitcoin has been using a knock-off of Schnorr since the genesis block. It’s time to finally upgrade to the brand-name.
Schnorr Signatures
Compared to ECDSA, Schnorr Signatures are a breath of fresh air for their elegant simplicity.
- Sample a random nonce
.
- Multiply
by the base-point .
- Hash the nonce point
, the signing public key , and the message to get the challenge .
* If you read about Schnorr Signatures elsewhere online, you might see the challenge computed as
- Compute the signature using the private key
.
The final signature is the tuple
Verifying Schnorr Signature is easy-peezy. Just multiply both sides of the above equation by
The verifier is assumed to know the public key
It’s pretty easy to prove that the signature is valid by simply factoring out
Thanks to the properties of cryptographically secure elliptic curves like secp256k1, factoring a curve point to find its discrete logarithm isn’t computationally feasible (yet). This is why signatures cannot be forged without knowing the private key
but why do these particular equations work?
Far be it from me to reverse-justify Schnorr’s design, but perhaps I can at least point out what each step and each variable is doing from the perspective of someone attempting to attack the scheme.
Recall the definition of a Schnorr Signature.
Note that:
and are both randomly sampled from the set of integers modulo , AKA .- The nonce
changes for every signature whereas the private key remains consistent. can be computed by anyone who knows the (presumably) public parameters .
When the signer multiplies their private key
So what’s the point of adding
This is why
If an observer knew the
Another common gotcha:
Interestingly, that definition of
Contrastingly, if we used two different nonces
What if we used different nonces to sign the same message? Would this result in any Bad News™? No, because remember that the challenge
If the nonce changes, so does
but why is this so cool?
- Schnorr is faster.
- Schnorr is simpler to implement than ECDSA.
- Schnorr signatures aren’t malleable.
- Schnorr has very solid security proofs.
- Schnorr permits linear signature aggregation.
I bolded #5 there because that one feature is incredibly important for the future potential of Bitcoin.
Aggregator or Crocodile?
Signature aggregation is not possible very difficult in ECDSA. ECDSA Threshold signatures are perfectly achievable, and these are handy but nowhere near as easy or as fast as linear signature aggregation with Schnorr. And as if it needed to flex even harder, Schnorr can do threshold signatures even better than ECDSA.
When I say Schnorr Signatures are linear, this means that the only operations needed for Schnorr are simple scalar addition and multiplication. Verification is the same: one only needs point-addition and point-multiplication to verify the signature. ECDSA on the other hand, requires modular multiplicative inversion to sign and verify.
This opaque-sounding explanation can be boiled down to one highly consequential fact:
The sum of a set of Schnorr signatures on a given message is a valid signature under the sum of the public keys which made those signatures.
In other words, if a bunch of different signers cooperatively sign the same message, those signatures can be summed up to produce an aggregated signature. If one also sums up the pubkeys of those who signed to get an aggregated pubkey, the aggregated signature will be valid under the aggregated pubkey.
Naive Example
Let’s say you have three parties with their own private and public key pairs. They each sample their own private random nonce.
Name | Public Key | Private Key | Nonce |
---|---|---|---|
Alice | |||
Bob | |||
Carol |
They all want to sign the same message
- They each compute their own public nonce points independently.
- They agree on an aggregated nonce point
.
- They agree on an aggregated public key
.
- They each hash the message with this aggregated nonce point and pubkey.
- They each compute their share of the signature.
- They send each other their
values and aggregate by summing them all up.
The final aggregated signature is
To verify, recall that a signature
And this is going to be the case for our aggregated signature. Recall how Alice, Bob and Carol aggregated
Our verification works out because by aggregating the signatures, we also aggregated the individual nonces and the private keys (multiplied by
Multiplying
HOWEVER
This example has a hidden flaw if used as a multisignature protocol under adversarial conditions (when co-signers don’t trust each other). Can you see why?
hint: the attack vector is very early in the aggregation procedure.
Tell me the answer already!
A naive protocol like this is vulnerable to a Rogue Key Attack when co-signing with untrusted peers. This attack can be performed by whomever is last to share their public key with the other signers.
Let’s return to Alice, Bob, and Carol.
- Alice tells Bob and Carol her pubkey
. - Bob tells Alice and Carol his pubkey
. - Alice and Bob now expect Carol to share her public key.
Carol has a sneaky option here. She can give Alice and Bob a phony public key
Carol doesn’t know the discrete log (private key) of
Carol has just fooled Alice and Bob into agreeing to use an apparently aggregated key which only Carol controls.
Neither Alice nor Bob would be able to distinguish between Carol’s authentic key
Carol could also collude with Bob to exclude Alice’s key from the final aggregated key, like so.
The Solution
Rogue Key Attacks can be fixed naively by requiring that each signer to prove she knows the private key for her public key. Such an affirmation is called Knowledge-of-Secret-Key (KOSK). This is flawed because not every key I want to aggregate with is going to be fully under my control all the time. Perhaps I want to aggregate a public key right now, but I can only expect to learn its secret key next week (e.g. in a Discreet Log Contract. Perhaps I want to aggregate a public key which is itself an aggregated key, whose component secret keys are owned by 3rd parties.
Instead modern Bitcoin developers use a kind of scrambling protocol to avoid the risk of rogue keys. This is what MuSig offers. I’m looking forward to discussing MuSig in another post Take a look at my article about MuSig to learn what all the fuss is about.
Benefits
In combination with Taproot, developers can embed arbitrary combinations of aggregated public keys into a single tweaked public key which encodes a highly complex array of possible spending conditions, but still looks like any old normal public key when used under normal conditions.
Bitcoin developers can use multisignature signing schemes like MuSig and threshold signing protocols like ROAST with much greater efficiency, security, and privacy than has ever been possible before with ECDSA.
Any given public key can now be an aggregation of colossal numbers of child keys and scripts and threshold public keys and much much more.