Pruning Hypertrees for the Lax and Lazy

Conduition

2026-04-20

cryptography

Happy 4/20 nerds!

In a previous article, I showed we can use parallelism to accelerate the hash-based signature system SLH-DSA by several orders of magnitude, so that generating keys on a modern CPU takes only a millisecond or so, and signing takes only 12 milliseconds. As far as I know, this is the best-performing open-source CPU implementation of SLH-DSA available anywhere today.

The one problem with the techniques I used in my experiments is that the best optimizations depend on parallelism hardware features, like SIMD, multithreading, or GPUs. Without parallelism, every one of the 2.1 million hash compressions needed to sign with SLH-DSA-SHA2-128s must be done sequentially, and if our hardware can’t hash quickly, we are doomed to a much worse performance profile.

Or are we?

This article demonstrates a hacky algorithmic trick I call hypertree pruning which SLH-DSA signer implementations can use to electively reduce a key’s signature use limit in exchange for better signing and/or keygen performance.

If you prefer a powerpoint-presentation style of learning, check out this recording of a talk I gave during the Ethereum Foundation's weekly PQ discussion call.

Hypertree pruning does not affect verifier compatibility: Signatures produced by anyone using this trick will still verify correctly using the vanilla, standardized SLH-DSA verification algorithm - Although, they are statistically distinguishable, and so hypertree pruning is noticeable to anyone observing multiple signatures over time from the same keypair.

The security trade-off can be parameterized to favor signing or keygen performance improvement, and the intensity of the trade-off may be tuned based on the number of signatures each keypair is expected to make over its lifetime.

As a result, the ideal use-cases for hypertree pruning are situations where verifier behavior is fixed, but the signer lacks hardware powerful enough to produce SLH-DSA keys or signatures using the verifier’s prescribed SLH-DSA parameter set. Hypertree pruning can empower signers who otherwise wouldn’t be able to produce compatible keys or signatures in a reasonable amount of time. Examples include low-power embedded hardware systems, legacy CPUs lacking hardware acceleration, or high-throughput applications which must be able to produce many signatures from multiple keys very quickly.

More plainly, hypertree pruning allows a signer to produce valid SLH-DSA keys and signatures which standard verifiers will accept, at the cost of a smaller limit on the number of signatures that each keypair can safely produce.

Disclaimers: This concept has not been peer-reviewed for security. Pruning breaks compatibility with NIST’s recommended SLH-DSA key generation and signing algorithms. Using hypertree pruning imperils your signing key because it intentionally reduces a keypair’s safe signature query limit in exchange for better performance. While I believe I have spelled out and proven the security considerations at play in this article, readers should use this technique only with the utmost care.

SLH-DSA Review

My prior article goes more in-depth on how SLH-DSA works. I highly suggest you read that section before proceeding with this article.

For those too lazy to open a new tab, have a look at this visualization of the SLH-DSA (SPHINCS) hypertree:

SLH-DSA keys are a merkle tree of merkle trees, often called a hypertree. The root hash of the root tree is called pk_root, and it is the most critical piece of the SLH-DSA public key, because it is what verifiers compare against when validating signatures.

The bottom-level leaves of this hypertree are FORS (Forest Of Random Subsets) keypairs, which are used to sign messages.

To verifiably connect pk_root to any given FORS keypair, the signer provides a chain of XMSS (eXtended Merkle Signature Scheme) certification signatures.

Confused? Read this section of the previous article. Otherwise from here onward I will assume readers already know how SLH-DSA works.

XMSS Pruning

Let’s zoom in on XMSS for a moment.

The leaves of an XMSS tree are Winternitz one-time signature (WOTS) public keys, which are hashed into a merkle tree that can be used to succinctly prove each leaf keypair’s membership.

Each WOTS leaf node can sign at most one message. In SLH-DSA, these WOTS leaf keys are used to certify lower level XMSS trees.

The most computationally intensive part of SLH-DSA or XMSS is, by far, generating WOTS leaves. Even leaves which are not used to sign a message must still be generated so that we can reconstruct the rest of the XMSS tree’s internal nodes, and so compute the merkle authentication path needed for a valid XMSS signature.

However, if a signer doesn’t need the entire XMSS tree, she can get away with discarding the WOTS leaves she does not need, replacing them with arbitrary hashes which have no cryptographic utility but are very fast to generate compared to WOTS keys. I call this technique pruning.

A signer can effeciently prune unneeded WOTS leaf nodes by deriving pseudorandom hashes from her keypair’s sk_seed, and inserting those in the place of the pruned WOTS leaves. The replacement hashes are called surrogates. To a verifier, a surrogate hash appears indistinguishable from a real WOTS public key hash, but they are basically dead voids which are much quicker for the signer to generate.

This technique reduces the number of signatures which can be issued by the XMSS tree. If we prune leaves, we reduce the signing capacity of the XMSS tree by precisely one-time signatures. In the most extreme case, if we prune all but one WOTS leaf, the XMSS tree can issue at most one signature, and then its capacity would be exhausted. This is called maximal pruning.

Hypertree Pruning

We can apply the above pruning technique to the XMSS layers in an SLH-DSA hypertree too, and we can do so to optimize SLH-DSA keygen and/or signing at the cost of security. Let’s optimize keygen first, since that is most straightforward to analyze.

Keygen is a very expensive process for SLH-DSA, and one which any signer will need to do before they can even distribute a public key to verifiers, let alone create signatures. Sometimes the key may never be used to sign, or it may only be used a few times. In this case the effort of keygen was mostly wasted. This is one of many realistic use-cases where the signer may wish to optimize SLH-DSA keygen and may be willing to make trade-offs to do so.

SLH-DSA key generation is fundamentally the same process as XMSS key generation applied to the SLH-DSA root tree, so it’s easy to see we can apply XMSS pruning there. Any leaf WOTS key of the root tree can be replaced by a surrogate, effectively skipping the difficult work of WOTS key generation for that leaf. This can be done to many leaves, to adjust performance as desired.

Pruning the root tree reduces the overall size of the SLH-DSA hypertree though. By pruning a leaf from the root tree, we remove our ability to sign any messages using FORS leaf nodes below that branch of the hypertree. In SLH-DSA, the message is hashed to select which FORS leaf to use, so if we prune some of those FORS leaves, we may find the message hashes to a FORS leaf which no longer exists!

Thankfully, SLH-DSA signatures include a randomizer, which the signer can grind using rejection sampling until she finds a randomizer which routes the message certification path to a valid (not-pruned) FORS leaf.

Examples

In the most extreme case, we can prune a root tree of height maximally, so that it contains only 1 leaf by simply generating pseudorandom hashes, and using those as the XMSS merkle authentication path. Hashing these together iteratively, we generate our pk_root in a small fraction of the time normally required for SLH-DSA keygen - about to be exact.

This diagram shows the structure of a SPHINCS (SLH-DSA) hypertree signature, with all but one leaf of the root tree pruned. The merkle authentication path can thus be replaced by surrogate hashes.

To put this in more concrete terms, let’s see an worked-out example of SLH-DSA keygen using a standard non-pruned root tree, and then optimize it with pruning.

In the SLH-DSA-SHA2-128s parameter set, each XMSS tree has a height of , meaning it has WOTS leaf nodes.
Each WOTS key has 35 hash chains which must be iterated 16 times to produce a WOTS public key, plus 8 extra SHA256 compressions to hash the WOTS public key into an XMSS leaf node, for a total of 568 SHA256 compressions.
We’d normally need 1024 more compressions to construct the merkle tree up to the pk_root node.
In total this works out to about SHA256 compressions to generate the root XMSS tree.

Now with maximal pruning, where all but one leaf are surrogates:

Generating the one valid WOTS leaf takes 568 SHA256 compressions.
We must run compressions to generate surrogate hashes and build the merkle path.
We compute a pk_root by running additional compressions.
In total this works out to SHA256 compressions.

The maximally pruned tree is about 500x faster to generate than the full non-pruned tree. Keygen can be done in a few microseconds, instead of a few milliseconds.

To Optimize Signing

The above description prunes to optimize SLH-DSA keygen, but it also speeds up signing slightly. Since the root tree is very fast to generate, the overall workload for signing is reduced because the root tree does not have to be generated on-demand. It is as though the root tree were always cached. Mikhail Kudinov helpfully pointed out that maximal root tree pruning also makes it feasible to efficiently cache the only remaining XMSS tree in the 2nd-to top layer, which can improve signing performance even more.

However, this improvement is minor compared to what we can achieve if we prune the hypertree specifically to optimize signing.

For instance, if we prune half the leaves of every XMSS tree used when signing, this is functionally equivalent to reducing the height of each XMSS tree by 1. Since each hypertree only costs half as much work to generate, this means we only need to do half as much work to sign - not counting the minority work needed for FORS signing. See this diagram:

Security

The main drawback to pruning is that it reduces security. More specifically: By shrinking the pool of FORS leaf nodes available to sign with, we increase the probability that any one FORS leaf will be used for a given signature. If a FORS leaf is used too much, it may permit forgery.

Let be the number of layers of XMSS trees in the hypertree.
Let be the height of each XMSS tree layer in the hypertree.
Prior to any hypertree pruning, we have FORS leaves available for signing.
- Each FORS leaf would have one chance in to be used.
If we prune FORS leaves, we will have usable FORS leaves remaining.
- When signing, each unpruned FORS leaf will have 1 chance in of being used.

Thus, by pruning FORS leaves, we multiply the probability that any given unpruned FORS leaf is used by a factor of .

Recall that in SLH-DSA we have a security parameter : the signature query limit, AKA the maximum number of signatures we can safely generate for the keypair, beyond which the classical security degrades below a given threshold . SLH-DSA parameter sets are designed so that after signatures, the expected number of signatures for every FORS leaf keypair falls below some critical bound, such that the expected forgery probability for every FORS keypair remains smaller than 1 chance in .

Therefore, if we want to retain the same security level as the unpruned equivalent, we must reduce by a factor of .

Each FORS leaf keypair would then be used the same number of times on average as it would’ve been if we had not used hypertree pruning.

Thus, after signatures, any given FORS leaf of the pruned keypair is just as likely to admit a FORS forgery as any FORS leaf of the full unpruned keypair would be after signatures.

Example

Let’s draw up an example with maximal pruning on the root tree, and see how that impacts security.

By pruning all but one leaf of the root tree, we have effectively decreased the parameter (the number of layers in the hypertree) by 1, which means the hypertree now contains a small fraction as many bottom-level FORS leaf keys. In other words, we have decreased the overall height of the hypertree by .

Without pruning, the probability for a FORS leaf to be used would be 1 chance in , and so the expected number of signatures per FORS leaf after signatures would be .

But when the root tree is maximally pruned, then the FORS leaves which we did not prune are times more likely to be used to sign any given message, meaning the probability of any one signature hitting a given FORS leaf is:

To retain the same security level as the unpruned equivalent, we reduce by a factor of .

Then after signatures, each FORS leaf keypair would then be used the same number of times on average as it would’ve been if we had not used hypertree pruning.

After signatures, the pruned hypertree is just as likely to admit a FORS forgery as the full keypair would be after signatures.

Working with SLH-DSA-SHA2-128s as a concrete parameter set to illustrate:

With parameters and , we have 7 XMSS layers of height-9 trees, leading to an overall hypertree height of - a total of FORS leaves.
Each of the WOTS leaves of the XMSS root tree would normally certify a child hypertree of height , containing FORS leaves.
By maximal pruning of the root tree, we remove all but one of those height-54 child hypertrees.
This leaves us with exactly usable unpruned FORS leaf nodes which are times as likely to be used for each signature.
Consequently, the keypair’s safe signature limit should be reduced from by a factor of , down to signatures.
After signatures, we expect each FORS to have issued signatures on average.
- This is exactly the same as an unpruned SLH-DSA-SHA2-128s key, where the expected number of FORS signatures per leaf is .

Privacy

The most obvious way to implement pruning is to always prune all but the first leaves for a given XMSS layer. This naive approach leaks information.

If an implementation always prunes the same set of leaves for every keypair, this makes the implementation’s signatures easy to fingerprint and identify to any observers who see a signature. To reduce fingerprinting, leaves should be pseudorandomly selected for pruning, in a manner unique to each keypair.

For example, signers could pseudorandomly derive a fixed-size subset of leaf positions for each unique XMSS tree being pruned, by hashing the XMSS tree address with the secret sk_seed. The signer can use that subset to select which leaves of a tree will or will not be pruned. This makes it harder for an observer to recognize and fingerprint a signer’s pruning behavior based on seeing a single signature.

However, after seeing multiple signatures, an observer may notice the signatures always reuse the same hypertree paths, so as more signatures are revealed the observer can become more and more confident that the signer is using a pruned SLH-DSA key. He may not be able to prove it conclusively though, and he won’t know for sure the exact pruning structure the signer used either.

Verifier Compatibility

Hypertree pruning is essentially a way to improve SLH-DSA performance at the cost of signature count, while maintaining verifier compatibility.

Verifier compatibility is an explicit core goal of hypertree pruning. If verifier compatibility were not needed, hypertree pruning is not needed either: If the signer could tell the verifier which SLH-DSA parameter set to use, then the signer could pick whatever SLH-DSA parameters they want, optimizing the parameters for performance, security, or signature size as needed. Hypertree pruning is always a worse option than selecting a tailored parameter set, because signatures using hypertree pruning will be unnecessarily larger than those of an equivalent bespoke parameter set.

However, in many real-world scenarios, signers may run into inflexible verifiers who are fixed to one or more specific SLH-DSA parameter sets. For example, in US government applications, only certain parameter sets are standardized and approved by NIST for official use. Hypertree pruning gives low-power signers a way to comply with these fixed standards at the cost of reduced signing capacity per keypair.

Signer Incompatibility

Hypertree pruning comes at a cost to signer compatibility. Two signer implementations may import the same SLH-DSA secret key, but if one uses hypertree pruning and the other does not, then both may generate different public keys and will create incompatible signatures. In other words, because hypertree pruning modifies the keygen and signing algorithms, it is incompatible with standard SLH-DSA signer implementations.

This incompatibility is important because it leads to a security footgun.

Given an SLH-DSA secret key (sk_seed, pk_seed, sk_prf)
Let there be two signer implementations who use the same SLH-DSA secret key: one using hypertree pruning and one who uses standard baseline SLH-DSA algorithms. We call these the pruning signer and the standard signer respectively.
Assume the pruning signer prunes at least one XMSS layer other than the root XMSS tree. Then those pruned trees will have different merkle root hashes than those of the standard signer.
If both signers create signatures involving the same intermediate (non-root) XMSS trees, then it it is feasible for the signers to accidentally reuse a WOTS key to certify distinct XMSS root hashes. This would break the security of the entire SLH-DSA scheme.
A similar pitfall can occur if the same keypair is reused with different pruning techniques.

Mitigations

The first and most obvious way to mitigate is to simply encode any pruned secret key in a non-standard format, e.g. by adding a version number which identifies the key as pruned, and also identifies the specific pruning technique that key uses. Standard SLH-DSA implementations should not accept such irregular keys, and pruned signers would not import standard SLH-DSA keys, and so there is no risk of WOTS key reuse between the two. This approach has the benefit of failing cleanly with recognizable errors.

If an application derives SLH-DSA keys pseudorandomly from some higher-level seed value, then the derivation process could be adjusted to scope keypairs specifically for pruned signing with a certain pruning technique. This keeps pruned keys completely separate from unpruned keys, and proper context separation in the derivation procedure can also ensure the same key is never reused with different pruning techniques.

If secret keys must be importable between incompatible signer implementations, then the pruning signer can mitigate the risk algorithmically. Notice the footgun only exists because both signers sometimes generate the same WOTS leaf keys, but the two signers may not generate the exact same sets of XMSS leaf nodes. We can use this fact. The signer need only randomize their sk_seed with a context string unique to their specific pruning algorithm. e.g. sk_seed = H(sk_seed, "my pruning algo"). Because sk_seed is used as a seed to generate all WOTS keys in the protocol, the pruning signer’s WOTS leaf keys will then be completely disjoint to those of the standard signer, and so there is no risk of WOTS key reuse between the two implementations. This also means pk_root hashes will always be different too.

Grinding Efficiently

While all pruning setups shown above result in efficiency gains, some pruning structures may result in a less efficient signer, because they demand an unrealistic amount of randomizer grinding.

Remember, we can prune as much as we like, but for a verifier to accept our signature, we must first find a randomizer which selects a valid and usable FORS leaf on the bottom layer of the hypertree. SLH-DSA does not let the signer select this leaf explicitly though. The best she can do is to grind the signature randomizer value until she lucks out and finds one which selects an unpruned FORS leaf.

The probability of such a hit is exactly where is the number of FORS leaves we have pruned, and the expected number of tries needed to find this hit will be the inverse:

Thus it turns out the expected time needed for randomizer grinding grows exponentially as we prune more FORS leaves (the graph looks a lot like that of ).

This also means the grinding efficiency of a pruned SLH-DSA keypair is closely linked to the security of the keypair, which also depends on the number of usable FORS keypairs.

Diminishing Returns

As we prune more and more FORS leaves, the growth in grinding difficulty results in diminishing performance returns, and can even become computationally infeasible.

For instance, let’s say we maximally prune the first two XMSS layers in a SLH-DSA-SHA2-128s hypertree. This reduces the overall hypertree height by , so we have shrunken our pool of usable FORS leaves by a factor of , down to leaves.

To sign with this keypair, we must find a randomizer which routes the verifier to one of the remaining FORS leaves. Each attempt when grinding the randomizer has a success probability of , so we can expect this to take around 262144 attempts before we find a valid randomizer. Each attempt takes one SHA256 compression call.

But generating an XMSS tree takes around 291000 compressions anyway, so we have only saved about 30,000 compressions by maximally pruning the penultimate XMSS layer. Contrast this with the 290,000 compressions we saved in an earlier example by maximally pruning the root XMSS tree.

If we maximally prune the 3rd XMSS layer from the top as well, then we would expect to need SHA256 compressions before we find a valid randomizer - that’s about 134 million compressions, or around 64 times more work than running the entire stock SLH-DSA-SHA2-128s signing algorithm without any pruning.

Clearly, there is a point beyond which pruning becomes more of a hassle than it is worth in performance gains.

Optimal Pruning

Keygen efficiency gains are easy to reason about.

Let be the number of WOTS hash chains in a WOTS keypair, and let be the length of those hash chains. For each leaf of the root XMSS tree we prune, the number of hash compressions needed for keygen is reduced by at least . Therefore, to maximize keygen performance (at cost of security), we may simply use maximal pruning: prune all but one leaf of the XMSS root tree.

It is more difficult to quantify the net signing efficiency improvement from pruning, because we must balance the savings from pruning against the losses from randomizer grinding. In this section we will venture to find the optimal pruning technique to maximize SLH-DSA signing speed, assuming we are willing to sacrifice some security.

How do we do optimal pruning?
What technique offers the best tradeoff ratio of security to signing speed?

We should first define what “optimal pruning” even means.

Signing runtime is majority dominated by:

WOTS key generations, and
(asymptotically) the randomizer grinding difficulty

WOTS key generations are easy to quantify. Each WOTS keypair is composed of hash chains of length . The majority of the work needed to produce a WOTS pubkey is in iterating those hash chains, requiring at least hash compression calls per WOTS key. Therefore we define the cost function which takes in the number of WOTS pubkeys which we must generate in total across all layers of the hypertree, and returns the minimum number of SHA256 calls needed to produce all those keys.

The randomizer grinding difficulty is proportional to the reduction in the size of the FORS leaf pool relative to the default FORS leaf pool size. We define this as the randomizer cost function which takes in the number of pruned FORS leaf nodes , and returns the average expected number of SHA256 calls needed to find a valid randomizer.

Note also that represents a “security cost factor”: it is the factor by which we must reduce our key’s signature query limit in order to preserve security bounds.

Given these cost functions, our overall goal is then to minimize the net cost function

However, the two parameters and are not independent. Pruning a WOTS leaf node across a layer decreases by one and so improves performance by a fixed increment, but also increases by some amount which worsens performance. Optimal pruning thus means finding a pruning algorithm which minimizes the growth of while also maximizing the growth of , such that we find a minimum of .

In the following section, I will describe a pruning algorithm that prioritizes signing speed, and prove it is optimal.

_{Note I leave aside the possibility of pruning layers with internal selectivity, where we might prune some XMSS trees more or less than others in the same layer. Instead I assume pruning is consistent inside any given XMSS layer, not necessarily at the same leaf positions, but at least to the same degree. This means every XMSS tree inside a given layer of a hypertree always has the same number of leaves.}

Symmetry

Let’s define a new term: An -regular hypertree is a hypertree for which all XMSS layers have the same number of (unpruned) leaf nodes. For example, standard SLH-DSA hypertrees are all -regular.

I will now show that pruning a leaf across any layer of an -regular hypertree has the same effect on the cost function , i.e. the same effect on signer performance, regardless of which layer is pruned.

Towards this end, here are two key facts:

Pruning any WOTS leaf across any layer of any hypertree will always have the same effect on regardless of which layer we prune: It saves a single WOTS key generation during signing, and so is decreased by 1.

This fact is hopefully self-evident and needs no proof.

Pruning a leaf across any layer of an -regular hypertree has the same effect on regardless of which layer we prune.

Proof of (2): Let be the number of layers in the -regular hypertree. Pruning a leaf on the root tree will remove FORS leaf nodes. This is the same number of FORS leaves we’d remove if we prune 1 leaf from the 2nd-to-top layer XMSS trees, because each such leaf is the parent of bottom-level FORS leaves: . The same logic applies to any XMSS layer further down, including if we prune a single FORS leaf from each of the bottom layer XMSS trees. Thus, pruning any layer of an -regular hypertree with layers results in increasing the number of pruned FORS leaves by , and therefore results in a proportional change to .

The above facts imply layer choice doesn’t affect the cost function (doesn’t affect signer performance) when pruning an -regular hypertree. Pruning any given layer of an -regular hypertree affects signing performance and security equally as much as pruning any other layer.

The only distinction between layers in this case is that pruning the root tree also improves keygen performance as well as signing performance. Thus we derive the first rule of our optimal pruning algorithm:

If the hypertree is regular, then always prune from the root tree first.

Since SLH-DSA trees are -regular, this means pruning a single leaf from the root tree is always the first step in any optimal pruning algorithm.

After pruning one leaf from the root tree, we have pruned FORS leaves, and we’ve saved 1 WOTS key generation during signing: . Which layer should we prune next?

First notice the overall hypertree is no longer -regular, but we can recursively invoke the symmetric properties of hypertree pruning on the lower layers. If we think of the XMSS layers below the root tree as independent child hypertrees with layers, where each XMSS layer has height , then it is clear these child hypertrees are -regular. Thus, pruning any given layer of those child hypertrees would be equivalent to pruning any other layer, at least in terms of the impact on the number of FORS leaves that we prune.

Therefore, we have only two material choices for where to prune next: Do we prune another leaf from the root tree again, or do we prune a leaf from a lower-level XMSS layer?

Asymmetry

Both choices are equivalent in their effect on the WOTS keygen cost , because they both decrement by 1.

But curiously, the two choices are not equivalent in their effect on the number of pruned FORS leaves, and thus on the randomizer grinding cost (AKA the security cost factor).

If we prune another leaf from the root tree, we will remove usable FORS leaves, as before, and so increase proportionally. However, pruning a leaf from any of the lower-level XMSS layers has a different effect.

Without loss of generality, assume we prune a leaf from the 2nd-to-top XMSS layer. Since we already pruned one leaf from the root tree, there are now only such unpruned XMSS trees remaining in the 2nd-to-top layer. Pruning a leaf of one such tree will remove FORS leaves. Thus, pruning a leaf from each of the XMSS trees in the 2nd-to-top layer will result in removing FORS leaves - Exactly leaves fewer than if we had pruned the root tree again!

Therefore, the optimal choice to minimize the number of pruned FORS leaves - and therefore minimize the cost - would be to leave the root tree as-is, and instead prune one of the lower-level XMSS layers, all of which are equivalent.

If we do prune a leaf from the 2nd-to-top layer, then this same reasoning can be applied recursively to show the optimal follow-up is to prune a leaf from the 3rd-to-top XMSS layer, and then again with the 4th-to-top XMSS layer. This proceeds until we have pruned one leaf from every XMSS layer in the hypertree, whereupon every layer has the same number of leaves: .

After pruning every layer once, we will have acquired a -regular hypertree, and so we can resume pruning from the top of the tree again, since at that point layer choice has no effect on the cost functions.

More generally: The best choice of pruning to minimize for any given partially-pruned hypertree is to always prune a leaf from whichever XMSS layer has the most unpruned WOTS leaf nodes remaining.

This gives us our only other necessary algorithmic pruning rule:

If the hypertree is not regular, then prune from the layer with the most unpruned leaves, highest first.

Visual Example

Tree-based algorithms are always easier to clock with a fully diagrammed example.

This is a 4-regular hypertree with 3 layers.

You can see visually why pruning a leaf from the top layer removes the same number of bottom-level leaf nodes as pruning one leaf across any other layer (always 16 leaves).

Assume we’ve already pruned one leaf from the top-level tree. Notice what happens when we prune a second leaf from the top-level tree.

This removed an additional 16 leaf nodes. Contrast that with the effect of pruning one leaf from each of the three remaining 2nd-layer trees.

Pruning a leaf across the 2nd layer removed only 12 leaf nodes.

This pattern is recursive. Applying the same technique again, we can see pruning a leaf across each of the 9 remaining trees on the bottom layer results in the removal of only 9 leaf nodes.

We have acquired a 3-regular hypertree, so we can return to pruning from the top layer, and repeat the process of pruning from top to bottom over and over until we have pruned as many WOTS leaves as we desire. The result is the optimal pruning technique to maximize signing efficiency.

TLDR

Practically speaking, this means the pruning technique which best optimizes signing is to prune all XMSS layers in equal amounts, with a slight bias to prune the root tree first, since that will speed up keygen as well. This approach minimizes the effects on security while maximizing the signing performance improvement by pruning.

As an example, the earlier suggestion I gave to optimize signing just happened to be an instance of optimal pruning, because each XMSS layer was pruned to exactly half its ordinary size.

However, as we’re about to see, that isn’t the absolute limit on how far we can optimize. The details of how many leaves we can prune per layer depend on the SLH-DSA parameter sets we use.

Parameter Set Suitability

Now that we know how to prune optimally, we can work out the upper limits of hypertree pruning, when applied to tune the performance and security of different SLH-DSA parameter sets. It turns out, not all parameter sets are equally receptive to pruning.

Note that optimizing keygen is again straightforward to analyze here. We can achieve optimal keygen by using maximal pruning of the root tree. In this case:

keygen runtime is multiplied by
signing runtime is multiplied by
signature limit is reduced to

…recalling that is the height of each XMSS tree, is the number of XMSS layers, and is the signature use limit.

It is totally feasible to prune the root tree only partially, reserving some security at a cost to keygen performance. To this end, I leave the details as an exercise to the reader.

Instead I will focus on the far more interesting and subtle use-case of optimally pruning different parameter sets to eke out the best possible signing performance (at expense to security).

Since we know optimal pruning for signing involves pruning every XMSS layer equally, as proven earlier, I will redefine the net cost function in terms of the number of WOTS leaves to prune per XMSS layer, which I’ll call the pruning degree and denote it with . A pruning degree of means we prune WOTS leaves from every XMSS tree in the hypertree - not necessarily in the exact same leaf positions, but the always in the same quantity of leaves per XMSS tree.

This implies we will use hypertrees which are always -regular, and that has several advantages:

It makes it easier to write both the number of pruned FORS leaves and the number of WOTS keygens saved.
It allows us to define the net cost function as a univariate function , so we can graph it in the Cartesian plane.
It makes implementation easier because every layer of the hypertree uses the same number of leaves, allowing for parallelization and other software optimizations.

Here is our new definition of the net cost function:

We can also compute , the signing query limit after pruning, as follows:

Think of this as multiplying the original signing query limit with the fraction which is the fraction of FORS leaves that remain unpruned and usable.

With concrete formulas in hand, let’s apply them to some real-world SLH-DSA parameter sets and see what we can learn about the effects of hypertree pruning.

First, we’ll look at the NIST-standardized SLH-DSA-SHA2-128s parameter set: , , , , .

Fascinating! The net cost function appears to decay almost linearly, until we’ve pruned around 400 of the 512 WOTS leaves per XMSS tree, at which point the grinding cost term starts to explode exponentially as increases, quickly dwarfing the WOTS cost which decays linearly with .

The net cost minimum occurs around , while the first meaningful diminishing return in performance gain caused by grinding (i.e. where starts to diverge from the straight line defined by ) occurs beyond .

This means any pruning degree is viable for this parameter set, though as approaches 405, improvements in performance come at an increasingly severe cost to security.

Here is a table of results for how SLH-DSA-SHA2-128s signer performance and pruned signing query limit would be impacted by different choices of pruning degree :

Leaves pruned per layer	HT^* signing speedup	Signature limit ()
	1.000x
	1.002x
	1.032x
	1.142x
	1.333x
	2.000x
	2.284x
	3.144x
	3.874x
	4.209x

_{^* “HT” stands for hypertree. The speedup ratios given here do not take external factors such as FORS signing or Merkle path construction into account. For most SLH-DSA parameter sets though, WOTS keygen consumes the bulk of the signer’s runtime, so this approximation should roughly match real benchmarks.}

Now let’s look at the SHRINCS-B parameter set: , , , , .

This graph looks almost like a “sharper” copy of the SLH-DSA standard parameter set. Never mind the negative cost values on the other side of the pole beyond - these are artifacts of the graphing tool.

The minimum of the SHRINCS-B parameter set’s pruning net cost function is at exactly , where we’d prune all but 16 of the 4096 leaves from both of the two XMSS layers. The cost starts to diverge from its linear decay beyond .

Here is another efficiency tradeoff table for SHRINCS-B:

Leaves pruned per layer	HT signing speedup	Signature limit ()
	1.000x
	1.016x
	1.067x
	1.143x
	1.333x
	2.000x
	4.000x
	8.000x
	16.000x
	42.567x
	72.300x
	170.667x

Here is the cost function graph for the SHRINCS-B32 parameter set: , , , , .

Its minimum is at exactly , leaving 16 WOTS leaves unpruned per layer. The net cost function starts to diverge from linear beyond .

Here is an efficiency tradeoff table for SHRINCS-B32:

Leaves pruned per layer	HT signing speedup	Signature limit ()
	1.000x
	1.004x
	1.016x
	1.032x
	1.143x
	1.333x
	1.600x
	2.000x
	2.667x
	4.000x
	7.938x
	12.800x

Discussion

Each parameter set supports a security cost factor of at the most extreme pruning degrees - i.e. . But interestingly, the SHRINCS-B and SHRINCS-B32 parameter sets seem to permit a much greater degree of signing speedup compared to SLH-DSA-SHA2-128s. SLH-DSA maxes out at 4.2x speedup, and SHRINCS-B32 permits a maximum speedup of 12.8x, while SHRINCS-B can go far harder with a max speedup of over 170x.

Of course, the obvious caveat is that both the SHRINCS parameter sets are completely ineffective as stateless signature schemes if we prune them to such extreme degrees. SHRINCS-B with pruning degree would permit only 16 signatures before the key’s security starts to degrade below the prescribed security level. SHRINCS-B32 with is a bit better with .

Realistically, any signers who need to use pruning would likely want to compromise somewhere in the middle, and pick a pruning degree which suits their use-case.

Hybrid Pruning

In some contexts, it may be desirable to use hybrid pruning, which is where we prune the root tree and the lower-level XMSS trees with different pruning degrees. This allows us to balance keygen and signing optimization.

For example, a signer might want to maximally prune her root XMSS tree by removing WOTS leaves, so that SLH-DSA keygen is very fast. If she has extra security budget she doesn’t need, she may also elect to prune the lower-level XMSS trees by a variable amount.

In this case, the net cost function is a bivariate generalization of , where is the pruning degree for the root XMSS tree and is the pruning degree for all the other lower-level XMSS layers.

We can adjust the signing query limit for a hybrid pruning setup as follows:

Optimizing hybrid pruning is a more subtle matter than pruning for signing speed alone. I feel this article is long enough as is, and I’ve spent enough time on the subject, so I will leave further examination of hybrid pruning as a future exercise and an open problem for anyone interested in pursuing it.

Conclusion

Despite SLH-DSA’s reputation for being slow, there are tricks we can do to speed it up. Some techniques like those I showed in my previous article are rooted in the principle of maximizing hardware usage.

Hypertree pruning is a unique new addition to the optimizer’s tookit though: Even for platforms who have already maximized their available hardware, we can still improve SLH-DSA performance, albeit at the expense of signature capacity.