Secret Secure Santa

After a long pause, I'm pleased to start writing again! I chose a whimsical topic for the occasion, a gentle introduction to discuss pretty interesting concepts: factorial bases and Lehmer codes.

With winter approaching, my group of friends started organizing Secret Santa parties. An interesting question (and a proof that we are uber-nerds) is around the security properties of this protocol, in particular, whether it is possible to organize this party following some constraints (e.g., people in a relationship don't give gifts to each other, because that's lame), without revealing the full output to any participant, and convincing every participant that the setup is correct. A question to which any sensible person would answer: "easy, there's a bunch of websites that do it for you". But only fools would trust a third party with such important matters.

separator

Let's get the general idea out of the way. To run a Secret Santa, we start with a list of participants [Alice, Bob, Claire] on which participants agree, and we need to select one permutation, for instance [Bob, Claire, Alice], which indicates that Bob gives a gift to Claire, Claire to Alice, and Alice to Bob. If folks in a relationship can't gift each other, this restricts which permutations are valid. Then, all we have to do is create a process that produces a random value/permutation (more on this difference in a second), and which is 3rd-party verifiable after it ran.

Thankfully, this last part is pretty standard. It's a random oracle, which can be approximated in the real-world by taking a cryptographic hash of, say, the headline of bbc.com at a pre-agreed date in the future. At a high-level, the relevant assumptions are that no one can guess the news beforehand, and that everyone can independently verify the output afterwards.

Thus, participants agree on a list of names and a date during setup, then once this date has passed, they hash the headline at bbc.com on the correct day, then use this randomness to select one permutation. How exactly?

I like to first sketch a bad solution to understand the playing field. If the cryptographic hash function is SHA-256, its output is a 32B byte array. If the number of participants N is small enough, we could cut this byte array into N equally-sized parts, and each could represent the index of a participant in the original list. For instance, for N=4, if the four bytestrings are 0x00000002, 0x00000000, 0x00000001, 0x00000003, we could say that this denotes the permutation [2,0,1,3] or [Claire, Alice, Bob, Daniel].

In this strawman, most values do not represent a valid permutation. At first glance, it seems we could increase the value and stop on the first valid permutation - this would satisfy 3rd-party verifiability. But valid permutations are all around low natural numbers, and most of the 32B range would map to the identity permutation, which would be much more likely, breaking the requirement that the outcome shouldn't be guessable beforehand. Even if the valid permutations were distributed more uniformly, iterating until finding a permutation is not a good idea, as there are only up to N! valid permutations hidden among 2256 values.

If we could make a "denser" mapping, it would solve both the iteration problem and help reduce the non-uniformity problem. We could work in base N (so hexadecimal for N=16 participants), and treat the 32B value as a N-digit base-N number. We still have many unused values though - only N! of the NN values correspond to a valid permutation, which is an exponentially decreasing function.

An interesting observation is that the amount of bits required to represent a permutation index is not constant: log2(N) bits are needed to store the first permutation element, but only one bit is needed for the element-before-last (which can be only one of two elements), and no bit is needed for the last element (for which there's no choice). Which is precisely what is achieved by using the so-called factorial number system, or factoradic.

The factorial number system represents natural numbers as a⋅N! + b⋅(N-1)! + … + y⋅1! + z⋅0!, quite similar to other bases like decimal or binary (except we are not using powers of a base number). To avoid confusion with other bases, a number in factoradic base can be represented as (a,...y,z) with commas and parentheses, instead of joining the digits like in base 10. An interesting observation is that there is a bijective mapping between natural numbers (in any base) to the factorial system: every natural number can be uniquely decomposed, and all numbers in the factorial system have at least one preimage.

A number in factoradic base does not map to a unique permutation, as there are many encodings possible (to convince ourselves of this: from any valid encoding e, we can create a different valid encoding e' by left-shifting all permutations in e). We need to pick one encoding in particular, and the most common one is called Lehmer codes. The algorithm of Lehmer encoding is not the most interesting part of this story; what is relevant however is its properties; for one, Lehmer codes naturally follow the (lexicographical) order of permutations, so the identity permutation has code 0 = (0,...0), while the "fully reversed" permutation [N, N-1, ..., 1, 0] has the highest code N!-1 = (N-1, N-2, ...). This gives a very "natural" encoding and explains why it is often considered the default mapping between natural numbers <N!, factoradic numbers, and permutations of N elements. In fact, this is so common that the factoradic number is also called Lehmer number or code, assuming that Lehmer decoding will be used.

Let's briefly see how Lehmer decoding works with an example. Let's set N=4, the input list [Alice, Bob, Claire, Daniel] and the factoradic number (2,2,0,0). We start from the sorted list of elements; we select index 2, this indicates the first output element is Claire (indices start at 0), and the remaining list is [Alice, Bob, Daniel]. Next index is 2, this corresponds to Daniel, and the remaining list is [Alice, Bob]. Following this process, the permutation corresponding to (2,2,0,0) is [Claire, Daniel, Alice, Bob].

separator

We are now in good shape for solving the problem! Given a set of N elements and a non-negative integer smaller than N!, we now can deterministically select a valid permutation. Thus, from the 32B output value of the hash function, we first read this value as a number in any base, then apply modulo N!, map it to the factoradic system, and apply Lehmer decoding. This defines a unique, valid permutation that solely depends on the hash value. All permutations are equally likely, and once the hash input is known, anyone can independently verify the process. Pretty neat!

We haven't solved the constraints problem yet ("people in a relationship don't gift each other"). A bad idea is to decode the hash into a Lehmer number in factoradic, then decode it into a permutation, and if the output doesn't satisfy the constraints, increase the Lehmer number by one until a valid permutation is found. This makes the probability distribution non-uniform, as permutations that follow many invalid permutations become much more likely.

The correct solution is to iterate over the hash value, e.g., Hash(bbc.com headline || c) where c is a counter. This way, we sample multiple times from the distribution until a valid result is found, without disproportionately increasing the probability of some permutations.

The Secret Santa is now secured. Thanks for reading!

separator

No fun project is complete without a toy implementation:

Participants Constraint (can't gift to...)

Fictional hash of the headline at bbc.com, at a pre-agreed date in the future:

Ludovic Barman
Ludovic Barman
written on :
19 / 08 / 2025