Hash-Based Signature Schemes for Post-Quantum Bitcoin

In my personal opinion, the most dangerous and unpredictable challenge facing Bitcoin is not political pressure, practical usability, scaling, or economic incentive… It is the threat of quantum computers.

Large and powerful organizations - both political and corporate - have been pouring billions of dollars into developing quantum computers for decades. Ever since Peter Shor published his (in)famous namesake algorithm in 1994, cryptographers have been preparing for “Q-Day” - the day a large, practically usable quantum computer becomes available.

If you’re reading my blog, I can assume you are at least roughly familiar with elliptic curve public/private keys. I’ll simply tell you that a quantum computer of sufficient power would be able to flip an EC public key backwards into its private key - a feat thought to be implausible with ordinary classical computers. You can see how this might be a problem. If you’d like to learn how this can be done in detail, check out Shor’s original paper.

This article won’t dissect the mechanics of how quantum computers break Bitcoin public keys, nor is it intended as a doom-and-gloom FUD-piece about how we should all be digging graves, stocking bunkers, and prepping for day zero of the post-quantum meltdown. I won’t bother to conjecture how long it will take for a powerful quantum computer to be invented - estimates of when this will occur vary widely even between professional analysts. I will simply assume it will happen, one day, eventually. When it does, we must deal with it.

My hope is that this article will show you the landscape of the quantum threats facing the Bitcoin community. Then I’ll explore various hash-based signature schemes and analyze their potential to defuse these threats, and the tradeoffs which inevitably come with them. Finally, I’ll propose a novel upgrade path for Bitcoin clients which uses hash-based cryptography as a fallback option to insure bitcoin users against future quantum adversaries, without the need for any near-term consensus changes.

All safe roads once were wild.

The Quantum Threat

In this section I’ll dive into the mechanics of how exactly a selfish quantum adversary could attack Bitcoin for financial gain. Understanding the fundamental threat of quantum computers in more detail will lead us to clues as to how we can improve our defenses.

Almost every Bitcoin transaction mined from the genesis block to today has authenticated itself using one or more asymmetric signatures. Sometimes public keys are hashed. Sometimes we use Bitcoin script opcodes to encumber coins with additional validation steps (e.g. time delays). But eventually, to spend a Bitcoin UTXO safely, its owner must supply an asymmetric signature matching some public key committed to by the UTXO itself. The signature must commit to the spending transaction in some way so that it can’t be applied to a different transaction.

If one were to encumber a transaction output with only a hashlock, or only a timelock, etc, with no asymmetric signature requirement, then that UTXO can be spent by anyone who observes it being spent (including the miners who must see the transaction to validate it).

OP_SHA256 <hash> OP_EQUAL

A simple hash-lock Bitcoin script pubkey. This script is unspendable until someone reveals the preimage of <hash>. After that, the script could be easily unlocked by anyone.

Bitcoin transactions use signatures on the secp256k1 elliptic curve to achieve this asymmetric authentication dynamic. ECDSA is the older and more well-known signing algorithm, and Schnorr is a cleaner, faster, more secure and more flexible signing algorithm which was adopted several years ago as part of Bitcoin’s Taproot upgrade. Bitcoin’s Schnorr and ECDSA signing protocols both use the same secp256k1 elliptic curve.

ECDSA and Schnorr signing protocols are equally vulnerable to a quantum adversary (QA) who knows a victim’s secp256k1 public key, because the QA can simply invert the public key into the private key, and sign any message they desire.

However, the same QA will have a much harder time computing the preimage (i.e. input) of a hash function given its output. The currently-known best-case-scenario for a QA brute-forcing a hash preimage is a quadratic speed boost, given by Grover’s algorithm. Quadratic in this context means that if a classical computer requires N operations to do something, then the QA can do it in sqrt(N) operations. The smallest hash used in Bitcoin output scripts is RMD160 which has a preimage security level of 2^160, so a quantum computer could crack a preimage to RMD160 in at most sqrt(2^160) = 2^80 operations. Not great, but still implausibly difficult according to current theory.

So not all one-way functions are dead in the water, and cryptographic hashes aren’t the only ones still kicking.

Let’s examine the different address (output script) types in more detail.

Hashed Output Types

(AKA P2PKH/P2SH/P2WPKH/P2WSH)

The most commonly used type of Bitcoin address today is a pay-to-(witness)-public-key-hash address (P2PKH/P2WPKH), in which the public key is first hashed before being given to the payer. When spending this output, the owner signs the transaction and provides the raw public key so that verifiers (Bitcoin nodes) may re-hash it and compare the hash against the output script being spent.

OP_DUP OP_HASH160 <hash> OP_EQUALVERIFY OP_CHECKSIG

The classic P2PKH locking-script.

In P2SH and P2WSH, the address instead is the hash of a Bitcoin locking script which itself almost always contains at least one ECDSA pubkey, whose signature is required for spending. For the purposes of assessing post-quantum security, script-hash type outputs have effectively the same properties as key-hash type outputs, with the exception that P2WSH outputs are encoded as SHA256 hashes whereas all others are encoded as RMD160 hashes.

OP_HASH160 <hash> OP_EQUAL

The classic P2SH locking-script.

These four types of addresses currently protect the bulk of the mined and circulating Bitcoin supply.

The threat of a quantum adversary (QA) is in its ability to invert public keys into private keys. If a key-hash or script-hash address has received but not spent any of its coins, then the preimage of that hash (the key or the script) hasn’t been revealed yet. Barring out-of-band exposure (e.g. via BIP32 xpubs leaked to servers or peers), the QA generally won’t be able to attack such addresses practically.

This protection ends the moment the owner of the key/script-hash address tries to spend any of their coins. To validate a transaction spending a pay-to-*-hash output, the Bitcoin network peers must be given the key/script preimage. A quantum adversary could easily monitor these transactions and see that preimage, which always contains a public key the QA can invert. After inverting, the QA can try to double-spend the coins with a new higher-fee transaction.

Mercifully, the QA will have only a limited time in which to mount this attack. A typical Bitcoin TX spends only 5 to 60 minutes of its public existence in the mempool before it is mined, at which point the coins will have moved permanently to a new address. If that new address is protected by a hash whose preimage the QA does not know, then the coins will be out of reach for the QA.

However, a significant fraction of key-hash and script-hash addresses are being reused on mainnet today. A reused address is one which receives one or more UTXOs, spends them, and - having revealed its public key or spending-script - continues to custody additional Bitcoin UTXOs. Some estimate the amount of bitcoin held on such reused addresses was at least 5 million BTC as recently as 2019. A quantum adversary has much more time to compute the public keys exposed by these addresses. Time is on the side of the QA here; the burden of action is squarely on the shoulders of the owner to move their coins to a fresh unused address.

There is also the possibility of a large mining pool becoming a quantum adversary. Such a pool could delay mining certain transactions intentionally, to give themselves more time to crack a newly discovered public key.

Summary

P2PKH/P2SH/P2WPKH/P2WSH output script types can be reasonably secure against a QA if used properly, provided the QA needs a decent amount of processing time (more than an hour) to invert a secp256k1 public key.

The problem is that proper usage is hard. Ava Chow describes the numerous pitfalls which plague key-hashing as a countermeasure against a quantum adversary.

P2TR Outputs

In P2TR addresses, an internal public key P is tweaked with a given hash t to produce an output public key Q = P + tG (where G is the secp256k1 generator point). The output pubkey Q is then encoded directly (unhashed) in the P2TR address which receivers distribute.

When spending, the bearer can either authenticate directly by signing as the output pubkey Q (“key-path” spend), or they can provide the internal public key P, a Bitcoin locking script, an unlocking witness stack, and a proof that the locking script was committed as a merkle tree leaf inside the tweak hash t. If the script evaluation succeeds, the spender is deemed authentic (“script-path” spend). See BIP341 for more information.

The critical distinction between P2TR and the legacy pay-to-*-hash output script types is that taproot encodes a secp256k1 point directly in the output script. This means a quantum adversary (QA) can invert any P2TR output pubkey Q to get its secret key, and claim the P2TR UTXO using a “key-path spend”, even if the address has not been spent from before.

This is not a new observation; there has been much discourse and controversy surrounding this property of the taproot upgrade.

Summary

Against a quantum adversary, P2TR addresses are comparable in security to an already-reused key-hash address. The QA has unlimited time to invert the output pubkey and can start attacking any P2TR output from the moment they learn of its output public key Q. The burden of action is on the address owner to move funds to a quantum-safe address - or to avoid using the P2TR output script type - as soon as they realize the QA exists (or could exist soon).

Note on Multisig

P2SH or P2WSH addresses which use OP_CHECKMULTISIG cause a linear increase in the processing time needed for a QA to fully compromise the address. For an m-of-n multisig address composed of n distinct pubkeys, the QA must invert at least m of the constituent public keys to wield spending power over the address.

This slowdown will probably be offset by parallel-processing across multiple quantum processors, or with parallelized quantum algorithms which invert multiple public keys at once.

P2TR multisig addresses which use key aggregation do not have this property, because the taproot output pubkey Q can always be inverted. Possessing the secret key of Q is always sufficient to spend the P2TR output under current consensus rules.

Note on BIP32

Most consumer wallets derive their addresses using BIP32 hierarchical-deterministic (HD) wallet schemes. This allows a single seed to derive basically unlimited secret keys and corresponding public addresses for any script type. Wallets will sometimes share their extended public keys to avoid the need to communicate many individual child addresses.

To a quantum adversary, a BIP32 extended public key (xpub) is equivalent to its counterpart extended private key (xpriv) given enough time to invert the xpub. Once inverted, the QA can quickly derive all child addresses, and can thereafter monitor the blockchain to steal any funds paid to them.

To comply with BIP44 standards, most wallets derive one xpub/xpriv pair at specific BIP32 paths for each Bitcoin address type.

  • m/44'/0'/n' for P2PKH
  • m/49'/0'/n' for P2SH-wrapped P2WPKH
  • m/84'/0'/n' for native P2WPKH
  • m/86'/0'/n' for P2TR

Each layer of derivation in these key paths after the master key (m) is hardened, meaning that deriving that child key involves hashing the parent secret key.

Remember QAs cannot invert a hash function as efficiently as they invert elliptic curve public keys. This means that the BIP32 master key m, as well as the xprivs at any previously unused key path, are generally safe from the eyes of the QA provided the wallet has not exposed the BIP32 master public key. We will make use of this property in some of the options we discuss later.
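To make the hardening property concrete, here is a minimal Python sketch of hardened child derivation. This is illustrative only: real BIP32 derivation additionally adds the left HMAC half into the parent key modulo the secp256k1 group order, which is omitted here.

```python
import hmac
import hashlib

def derive_hardened(parent_secret: bytes, chain_code: bytes, index: int) -> tuple[bytes, bytes]:
    # Hardened indices start at 2^31. The HMAC input includes the parent
    # SECRET key (prefixed with 0x00), so knowing only the parent public
    # key is not enough to derive hardened children.
    assert index >= 0x80000000, "hardened indices start at 2^31"
    data = b"\x00" + parent_secret + index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    # Left half feeds the child secret key; right half is the child chain code.
    return digest[:32], digest[32:]
```

Because the HMAC commits to the parent secret key, a QA who inverts only a child pubkey learns nothing about sibling keys at other hardened paths.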

Quantum-Safe Signatures

The critical question is: What cryptographic mechanism replaces elliptic curve digital signatures?

Whatever mechanism replaces EC signatures must be asymmetric, allowing a user to give payment instructions to a prospective sender, without surrendering control of their money to the sender. This way the receiver can claim that payment later without revealing their signing credentials to anybody (like one would with a password to a bank website). This rules out symmetric cryptographic schemes, such as HMAC, simple hashing, or symmetric encryption algorithms like AES or ChaCha.

We have numerous quantitative metrics by which to judge a replacement algorithm:

  • Public key size
  • Signature size
  • Secret key size
  • Key generation time
  • Signature computation time
  • Signature verification time

We also have qualitative statements we can make about a signing algorithm:

  • Statefulness: Does a signer need to maintain state between each signature?
  • Reusability: Can a signer reuse their secret key to sign multiple messages?

Today we’re going to explore one genre of signing algorithms which is almost certainly the safest and most conservative candidate, but which comes with some big practical issues: Hash-based signatures, or HBS for short.

Hash-based signature schemes are constructed from a simple one-way cryptographic hash function like SHA256. Secret keys are usually vectors of preimages. Public keys are usually hashes, or vectors of hashes. Signatures are usually vectors of preimages, paired with auxiliary data needed to verify the preimages correspond to a specific static public key. Unlike other candidates for post-quantum cryptography, hash-based signatures do not rely on new (possibly flawed) cryptographic assumptions - This is what makes them the conservative option.

I’ll now give a detailed description of several well-known hash-based signature algorithms, in increasing order of complexity. If I lose you in some of the math, don’t worry - Understanding the internals of the HBS schemes is not critical to understanding how they can be applied to Bitcoin.

Skip through to the final analysis section if you don’t care about the technical details of HBS schemes.

Before we begin, some notation:

Notation       Meaning
{0,1}^n        The set of all possible bit strings of length n
x ← S          Uniform random sampling of an element x from the set S

Let H: {0,1}* → {0,1}^n be a secure cryptographic hash function which outputs pseudorandom bit strings of length n (for SHA256, n = 256). All constructions below are typically based on this single cryptographic primitive.

Disclaimer

I began studying these hash-based signing protocols only a few weeks ago. Do not take my technical descriptions below as precise and safe for real-world use, because I have probably made typos or omitted stuff. The purpose of these descriptions is to illustrate and convey the essence of the algorithms, not to provide any concrete implementation instructions.

Roll your own crypto at your own peril.

If you are an expert in hash-based signatures and have noticed a typo, please email me or open a pull request to fix it.

Let’s start with the simplest and earliest-known example of an HBS scheme: Lamport Signatures.

Lamport Signatures

We want to sign a single bit of information - 0 or 1 - using our hash function H.

We generate two random preimages of n bits each:

    p0 ← {0,1}^n
    p1 ← {0,1}^n

The tuple (p0, p1) is a one-time secret key.

Our public key is the tuple (P0, P1) = (H(p0), H(p1)). We can safely give the public key to anybody, and later reveal p0 to sign the bit zero, or p1 to sign the bit one. Any observer with the public key can easily verify H(pb) = Pb for a given signed bit b.

We may generalize this approach to sign message strings of arbitrary bit length n by generating more preimages - one pair per message bit. Then our secret key would be:

    sk = (p0, p1)    where p0 = (p0_1, ..., p0_n) and p1 = (p1_1, ..., p1_n)

Each pb is an array of n random preimages pb_i ← {0,1}^n.

Our public key is made of hashes of every preimage:

    pk = ((H(p0_1), ..., H(p0_n)), (H(p1_1), ..., H(p1_n)))

To produce a signature on the message bits m = (m_1, ..., m_n), we reveal the appropriate preimage per bit:

    σ = (p^{m_1}_1, p^{m_2}_2, ..., p^{m_n}_n)

To verify, simply check that the hashed preimages match the public key for every bit in the message: H(σ_i) = P^{m_i}_i for all i.
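The whole scheme fits in a few lines of Python. This is a toy sketch, not production code: it signs the SHA256 hash of the message and skips the hardening details a real implementation would need.

```python
import os
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

N = 256  # bit length of the hashed message, and of each preimage

def keygen():
    # Secret key: two rows of N random preimages, one row per possible bit value.
    sk = [[os.urandom(32) for _ in range(N)] for _ in (0, 1)]
    # Public key: the hash of every preimage.
    pk = [[H(p) for p in row] for row in sk]
    return sk, pk

def bits(msg_hash: bytes):
    # Yield the 256 bits of the message hash, most significant bit first.
    for byte in msg_hash:
        for i in range(8):
            yield (byte >> (7 - i)) & 1

def sign(sk, message: bytes):
    # Reveal one preimage per message bit.
    return [sk[b][i] for i, b in enumerate(bits(H(message)))]

def verify(pk, message: bytes, sig) -> bool:
    return all(H(p) == pk[b][i]
               for (i, b), p in zip(enumerate(bits(H(message))), sig))
```

Note the one-time property in action: each call to `sign` leaks half the preimages, so reusing `sk` on a second message hands observers forgery power.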

Properties

Lamport Signatures are a one-time signature (OTS) protocol - once the signer reveals a signature for a given public key, they can never use the same signing key again for a different message. If they do, the signer gives observers the ability to forge signatures on new messages they never consented to sign.

For instance, if n = 2 and a signer reveals two signatures for the distinct messages m = 00 and m' = 11, they will have now revealed all four preimages in their secret key. Any observer can now sign the other messages 01 and 10 even though the key owner never approved those messages.

Signatures have bit length n^2 and public keys have bit length 2n^2.

Signing is simple, as it consists only of revealing secrets. Verification requires n evaluations of H.

Modifications

We can modify Lamport signatures to reduce the amount of space consumed by keys and signatures at the cost of some computation time.

Deterministic key-gen: Instead of randomly sampling our secret key preimages, we can derive them from a root secret r, e.g. p^b_i = H(r || b || i).

This way we only need to store a single n-bit root secret for each Lamport OTS key, instead of 2n preimages, but we do need to compute 4n hashes to derive our public key, and we must compute n hashes to sign a message of n bits.
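A derivation like this can be sketched in one line of Python. The domain-separation layout (root || bit || index) is illustrative, not a standard:

```python
import hashlib

def derive_preimage(root: bytes, bit: int, index: int) -> bytes:
    # Derive the one-time preimage p_i^b from a single root secret.
    # The concatenation order here is an illustrative choice.
    return hashlib.sha256(root + bytes([bit]) + index.to_bytes(4, "big")).digest()
```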

Compact public keys: We can reduce the size of public keys from 2n^2 to simply n bits, by representing the public key itself as a hash of the standard Lamport public key: pk' = H(pk).

However, we must now adjust our signature algorithm to supply not only the n preimages, but also the n complementary hashes, which are needed to reconstruct the public key hash pk'.

Signatures double in size to 2n^2 bits, whereas public keys shrink from 2n^2 bits to only n bits.

Verifiers can verify a signature against pk' by hashing each of the n preimages and then hashing those hashes (alongside the complementary hashes) to re-derive pk'.
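Here is a toy Python sketch of the compact variant, extending the earlier scheme; as before, SHA256 stands in for H and hardening details are omitted:

```python
import os
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

N = 256

def keygen():
    sk = [[os.urandom(32) for _ in range(N)] for _ in (0, 1)]
    hashes = [[H(p) for p in row] for row in sk]
    # Compact public key: a single hash over all 2N preimage hashes.
    pk = H(b"".join(hashes[0]) + b"".join(hashes[1]))
    return sk, hashes, pk

def bits(msg_hash: bytes):
    for byte in msg_hash:
        for i in range(8):
            yield (byte >> (7 - i)) & 1

def sign(sk, hashes, message: bytes):
    # For each bit, reveal the matching preimage plus the complementary hash,
    # so the verifier can rebuild the full public key hash.
    return [(sk[b][i], hashes[1 - b][i]) for i, b in enumerate(bits(H(message)))]

def verify(pk, message: bytes, sig) -> bool:
    rows = [[b""] * N, [b""] * N]
    for (i, b), (preimage, comp) in zip(enumerate(bits(H(message))), sig):
        rows[b][i] = H(preimage)   # hash the revealed preimage
        rows[1 - b][i] = comp      # complementary hash supplied directly
    return H(b"".join(rows[0]) + b"".join(rows[1])) == pk
```

The signature now carries 2N values of 32 bytes each, while the public key is a single 32-byte hash, matching the size tradeoff described above.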

Comparisons

The following table shows some concrete examples of time/space tradeoffs for Lamport signatures. In these examples, I’ll set n = 256, to emulate the results using the well-known hash function SHA256. All sizes are in bits.

Algorithm          Signature size     Public key size    Secret key size
Vanilla Lamport    n^2 = 65,536       2n^2 = 131,072     2n^2 = 131,072
Compact Lamport    2n^2 = 131,072     n = 256            n = 256 (root secret)

Below I note the time complexity of Lamport signatures in terms of the number of invocations of the hash function H.

Algorithm          Keygen time    Signing time    Verification time
Vanilla Lamport    2n             0               n
Compact Lamport    4n + 1         n               n + 1

Viability for Bitcoin

I expect the Compact Lamport signature approach would be the most practical flavor of Lamport Signatures, as public keys would be the same size or possibly smaller than a P2TR pubkey is today.

However, even the most space-efficient Lamport signatures with the lowest viable security levels (n = 160) would require at least 6400 bytes of witness space per signature before compression. That’s 100x larger than a Schnorr signature.

There’s also the matter of key reuse. Lamport signatures are a one-time signature scheme. If a Bitcoin user spends from a hypothetical “pay-to-lamport-public-key” address, they’ll never be able to spend from that address again without giving observers some power to forge signatures.

Power users would not be affected greatly, as the best-practice touted by most experienced Bitcoiners is to use addresses only once, for a single receive/spend cycle. Yet, as we’ve seen earlier, a huge fraction of bitcoiners by volume have either never heard this advice, or have disregarded it. It will be difficult to convey to these users that address reuse with Lamport signatures will be more than just a privacy no-no - It could seriously compromise their security. This would make life especially difficult for cold-storage hardware signing devices, which rely on a semi-trusted host machine to prompt them with transactions to sign.

Lamport signatures are stateful, in that a signer must remember whether a private key has already been used to sign a message, in order to know whether signing again would be dangerous.

Lamport signatures are not backwards compatible with Bitcoin’s consensus protocol as it exists today, although an outlandish scheme proposed by Ethan Heilman makes use of OP_SIZE to Lamport-sign the size of an ECDSA signature. Whether this approach is secure is still up for debate. But even with such creative invention, the size of such a signature would prohibit its usage in P2WSH addresses. At least with current knowledge, a soft-fork would be needed to add Lamport signature support.

Even after such a soft fork, existing addresses with exposed public keys would still be vulnerable to a quantum adversary. Users would need to actively migrate coins to a hypothetical pay-to-lamport-public-key address format.

Winternitz One-Time Signatures (WOTS)

Winternitz signatures rely on a chain of hashes starting from a set of secret preimages. By revealing certain intermediate hashes in that chain, the signer can sign a specific message.

In comparison to the Lamport scheme, WOTS shoulders a higher computational burden in exchange for much shorter signatures. Let’s see how they work.

First, let’s re-establish some parameters:

  • Let m be the bit-length of the message we want to sign.
  • Let w ≥ 2 be the “Winternitz parameter” (used for time/space tradeoffs).
  • Let b = log2(w) be the bit-group width into which we parse the message.
  • Let l1 = ⌈m / b⌉ be the length of the message in digits of base w.
  • Let l2 = ⌊log_w(l1 (w − 1))⌋ + 1 be the checksum length in digits of base w.
  • Let l = l1 + l2 be the overall length of the message plus the checksum in digits of base w.

As WOTS relies on hash-chains, we need a hash-chaining function H^i(x) which uses our primitive hash function H internally:

    H^0(x) = x
    H^i(x) = H(H^{i-1}(x))

So for instance, H^2(x) = H(H(x)) would hash x twice.

Now we’re ready to generate our keys. We start by sampling l random preimages of n bits each.

    sk = (p_1, ..., p_l)    where each p_i ← {0,1}^n

To compute the Winternitz public key P, we iteratively hash each preimage w − 1 times.

    P = (H^{w-1}(p_1), ..., H^{w-1}(p_l))
To sign a message M of bit-length m, we break up M into l1 digits (m_1, ..., m_{l1}) of base w (I think of them as groups of b bits).

We compute a checksum c of the message:

    c = Σ_{i=1}^{l1} (w − 1 − m_i)

Because c ≤ l1 (w − 1), the binary representation of c is at most l2 · b bits long. This means we can always represent c as an array of l2 digits in base w. In fact that’s exactly what we’ll do.

We append this checksum array (c_1, ..., c_{l2}) to the message digits, to get the final message to sign m' = (m_1, ..., m_{l1}, c_1, ..., c_{l2}), which is just an array of l digits of base w (I think of them as groups of b bits).

To sign the checksummed-message m', we go digit-by-digit through m', and for each digit m'_i, we recursively hash the preimage p_i for m'_i iterations. This array of “intermediate hashes” in each hash-chain is the Winternitz signature.

    σ = (H^{m'_1}(p_1), ..., H^{m'_l}(p_l))
A verifier given the plain message M and public key P must first reconstruct the checksummed message m' just like the signer should’ve. Then the verifier simply follows each hash chain in the signature to its expected terminal hash, and compares these terminal hashes against the public key P:

    H^{w-1-m'_i}(σ_i) = P_i

If this holds for all i ∈ {1, ..., l}, then the signature is valid.
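The sign/verify flow above can be sketched compactly in Python. This toy version fixes w = 16 (so each digit covers 4 bits) and signs the SHA256 hash of the message; parameter choices and hardening are illustrative only.

```python
import os
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

W = 16            # Winternitz base; each digit covers log2(16) = 4 bits
L1 = 256 // 4     # 64 message digits for a 256-bit message hash
L2 = 3            # checksum digits: max checksum L1*(W-1) = 960 < 16^3
L = L1 + L2

def chain(x: bytes, steps: int) -> bytes:
    # The hash-chaining function H^steps(x).
    for _ in range(steps):
        x = H(x)
    return x

def digits(message: bytes):
    # Parse the message hash into base-16 digits, then append the checksum digits.
    ds = []
    for byte in H(message):
        ds += [byte >> 4, byte & 0x0F]
    checksum = sum(W - 1 - d for d in ds)
    for _ in range(L2):
        ds.append(checksum % W)
        checksum //= W
    return ds

def keygen():
    sk = [os.urandom(32) for _ in range(L)]
    pk = [chain(p, W - 1) for p in sk]   # hash each preimage w-1 times
    return sk, pk

def sign(sk, message: bytes):
    # Walk each chain forward by the corresponding digit.
    return [chain(p, d) for p, d in zip(sk, digits(message))]

def verify(pk, message: bytes, sig) -> bool:
    # Finish each chain and compare against the public key's terminal hashes.
    return all(chain(s, W - 1 - d) == t
               for s, d, t in zip(sig, digits(message), pk))
```

Note how signer and verifier split the work of each chain between them: the signer computes the first m'_i steps, the verifier the remaining w − 1 − m'_i.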

Properties

Security. Note the special role played by the checksum. Without it, revealing one set of intermediate hashes would allow an observer to forge signatures on any message in a “higher” position in the hash chain.

For instance, consider the following diagrammed example.

If we don’t add a checksum to the two-digit message m = (m_1, m_2), the signature would simply be the two intermediate hashes (H^{m_1}(p_1), H^{m_2}(p_2)). But then anybody who learns this signature could forge a signature on any message “higher” in the chain, like (m_1 + 1, m_2) or (m_1, m_2 + 1).

Because of the checksum construction, increasing any message digit m_i implies decreasing at least one checksum digit c_j. To forge the signature on the checksum as well, the observer would need to invert the hash function (implausible). So an observer cannot forge any other valid signature as long as the signer uses their signing key only once.

Size. Signatures, public keys, and secret keys all have bit length n · l = n (l1 + l2).

In case that equation seems unintuitive to you, you’re not alone. I plotted the relationship between the time/space tradeoff parameter w on the X axis and the final WOTS signature size on the Y axis (in bits). I set m = n = 256 to generate this particular graph.

Higher values of w result in much smaller signatures and keys at first, but we get diminishing returns. In the limit as w → ∞, it seems the shortest we can make our signatures is 2n bits (a single message digit plus a single checksum digit).

Approaching anywhere near that limit would be wildly impractical though, because by increasing w we also exponentially increase the amount of work we need to do to generate public keys, sign messages, and verify signatures.

Runtime. Given a secret key containing l preimages, we must compute l (w − 1) hashes to derive its public key.

Public key generation uses the same hash-chaining procedure as signing/verifying. To sign or verify a signature, we must compute at worst l (w − 1) hashes (on average, only half that).

To avoid this exponential blowup in computation time, most real-world WOTS instantiations keep w quite low, usually at most 16, which would mean computing on the order of 16 hashes per 4 bits in the message. As you can see, pushing w much higher quickly makes performance a big issue.

X-axis is w; Y-axis is the number of invocations of H needed to generate a public key.

For reference, my weedy little computer can generate a WOTS public key with a large Winternitz parameter in around 0.4 seconds. Signing and verification together require about the same amount of work in sum, as the signer starts each hash-chain and the verifier finishes it.

Curiously, due to the zero-sum nature of the hash-chain signature construction, signers and verifiers who are able to select the message they’re signing can also influence the amount of work they and their counterparty must perform. By selecting a message digit m_i significantly lower than the base w, the signer needs to do less work to compute H^{m_i}(p_i), while the verifier must do more work to compute H^{w-1-m_i}(σ_i). This could be a hindrance for a blockchain application like Bitcoin, because one signer could disproportionately burden thousands of verifiers (nodes).

Modifications

Compact public keys: WOTS lends itself well to compacting public keys via hashing. A Winternitz signature gives the verifier everything they need to reconstruct the signer’s public key P, and so we can represent a WOTS public key succinctly with its hash, in exchange for one additional unit of work for the verifier.

This decreases the public key size from n · l bits to merely n bits.

Deterministic key-gen: Like with Lamport signatures, WOTS secret keys can be generated from a seed with only l invocations of H.

Smaller and more secure signatures: To improve performance and security, one should consider using WOTS+, a variant of WOTS which adjusts the hash-chain construction to XOR each of the intermediary hashes in every hash chain with a randomization vector. This odd choice allows the security proof to reduce forgeries to second preimage extraction: If someone can forge a WOTS+ signature, they can also break the second-preimage resistance of H. The security of WOTS+ thus relies on weaker assumptions than standard WOTS, which assumes a forging adversary cannot break the collision resistance of H (easier to do than finding a second preimage). Broadly speaking, weaker assumptions lead to better security.

This means we can get away with using WOTS+ with smaller, faster hash functions which have broken collision resistance but are still believed to be second-preimage resistant, as long as one can accept the lower security level implied by their shorter output. Or WOTS+ can be instantiated with larger hash functions, and for only a modest increase in signature size and verification complexity, we gain the added security of weaker assumptions.

WOTS+ relies on a set of n-bit randomization elements r = (r_1, ..., r_{w-1}) and an HMAC key k, which can all be generated pseudorandomly with H and a seed, just like the vanilla WOTS secret key.

Instead of computing the intermediate values of the hash-chains as H^i(x), WOTS+ uses a keyed hash function H_k (think of it as an HMAC). The chaining function XORs the output of each keyed-hash with a randomization element r_i before proceeding with the next iteration.

    c^0(x) = x
    c^i(x) = H_k(c^{i-1}(x) ⊕ r_i)

For instance, c^2(x) would be:

    c^2(x) = H_k(H_k(x ⊕ r_1) ⊕ r_2)
WOTS+ appends the tuple (r, k) to the public key, so that verifiers may also compute the hash chains. To improve space-efficiency for public and secret keys, we can instead derive (r, k) from a public seed (which is itself probably derived from a secret seed). The signer distributes the public seed as part of their signature.

The compact WOTS+ public key can then be defined as a simple n-bit hash.
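The masked chaining function can be sketched in Python. Here the keyed hash H_k is modeled as SHA256(k || ·), an illustrative stand-in rather than the construction any standard prescribes:

```python
import hashlib

def chain_plus(x: bytes, start: int, steps: int, key: bytes, masks: list) -> bytes:
    # WOTS+ style chaining: each step XORs in a per-position randomization
    # element r_i, then applies the keyed hash (modeled as SHA256(key || input)).
    for i in range(start, start + steps):
        masked = bytes(a ^ b for a, b in zip(x, masks[i]))
        x = hashlib.sha256(key + masked).digest()
    return x
```

Like the plain WOTS chain, it composes: the signer can compute the first digits' worth of steps and the verifier can pick up where the signer left off, as long as both use the same (key, masks) tuple.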

More efficient signatures using brute-force compression: This paper suggests modifying the WOTS signing algorithm so that the signer repeatedly hashes the input message along with an incrementing 32-bit salt/counter s, until the resulting WOTS message checksum equals some fixed static value. The signer appends the salt s to the signature and the verifier confirms the resulting checksum equals the expected fixed value.

This allows the signer to omit the checksum portion of the WOTS signature, because the checksum is verified to always work out to a fixed static value. Messages without this fixed checksum are not valid, and so the checksum itself does not need to be signed. This approach is called “WOTS with Compression” (WOTS+C).

The effect of this compression is to reduce the overall runtime of the scheme on average across keygen, verification, and signing, while also reducing signature size by removing the checksum hash chains from the construction completely. One would think signing WOTS+C would take more time, due to the signer’s brute-force search for the salt s. But the signer now needs to compute only l1 intermediate hashes for the signature, instead of l. These effects offset each other - at least according to the paper’s authors; I have not yet validated this myself.
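The salt search can be sketched in Python. This toy version uses w = 16, a 64-digit message, and picks the mean checksum value as the fixed target (my illustrative choice, so the search terminates quickly; the exact target and encoding in the real WOTS+C paper differ):

```python
import hashlib

W = 16
L1 = 64
TARGET = 480  # mean checksum of 64 uniform base-16 digits: 64 * 7.5

def digits(data: bytes):
    # Parse SHA256(data) into 64 base-16 digits (no checksum digits appended).
    out = []
    for byte in hashlib.sha256(data).digest():
        out += [byte >> 4, byte & 0x0F]
    return out

def find_salt(message: bytes, max_tries: int = 1_000_000) -> int:
    # Increment a 32-bit counter until H(message || salt) parses to digits
    # whose WOTS checksum hits the fixed TARGET value.
    for salt in range(max_tries):
        ds = digits(message + salt.to_bytes(4, "big"))
        if sum(W - 1 - d for d in ds) == TARGET:
            return salt
    raise RuntimeError("no salt found")
```

Since the checksum is forced to a known constant, it carries no information and its hash chains can be dropped from the signature entirely; only the salt needs to be transmitted.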

The authors of WOTS+C even propose ways for the signer to further reduce the size of their signature by removing additional chains from their signature, at the cost of additional work to find an appropriate message salt.

Comparisons

The following table shows some concrete examples of time/space tradeoffs for various kinds of Winternitz signatures. In these examples, I’ll set m = n = 256, to emulate the results using the well-known hash function SHA256. All sizes are in bits.

Algorithm                  Signature size    Public key size    Secret key size
Vanilla WOTS (w = 16)      17,152            17,152             17,152
Compact WOTS (w = 4)       34,048            256                256
Compact WOTS (w = 16)      17,152            256                256
Compact WOTS (w = 256)     8,704             256                256
Compact WOTS (w = 2^16)    4,608             256                256
Vanilla WOTS+ (w = 16)     17,408            21,248             256
Compact WOTS+ (w = 4)      34,304            256                256
Compact WOTS+ (w = 16)     17,408            256                256
Compact WOTS+ (w = 256)    8,960             256                256
Compact WOTS+ (w = 2^16)   4,864             256                256
Compact WOTS+C (w = 16)    ~16,672           256                256