Friday, October 7, 2011

Code Golf: Shamir's Secret Sharing

Shamir's Secret Sharing algorithm is an simple and elegant technique to split a secret S into n parts, where any k parts can be combined to retrieve the secret. The algorithm can be described as follows:
  1. Create a polynomial f(x) of degree k-1 over a finite field F (i.e. modulo some number N) where the constant term is the secret (i.e. f(0)=S).
  2. Compute n distinct points on the curve to distribute.
  3. Given any k points, we can reconstruct the secret with polynomial interpolation.
The task of the challenge on cg.se is to implement the reconstruction portion of this algorithm over GF(1928049029) (in as few characters of code as possible).
Input is done using stdin. First comes an integer k, then k lines follow. Each of these lines contains a pair of integers x y that represent a secret. In other words f(x) = y in the original polynomial that was used to construct the secrets.
The simplest way to perform polynomial interpolation is to compute the polynomial in the Lagrange form:

The value we are want to compute is the value of the polynomial at x=0:




Since we are working modulo N, the division in the above equations refers to multiplying the numerator by the modular multiplicative inverse of the denominator. There are several algorithms that can be used to do this. Let's have a look at a few:

Firstly we have brute force. This won't finish in a reasonable amount of time for the value of N we are given. Additionally, in a Golfscript style, this will also require too much memory:
{*N%1=}+N,? # 11 chars, given the value to compute the inverse of on the stack 

Secondly is the extended Euclidean algorithm. This is an efficient way to compute modular multiplicative inverses (or more generally, expressing the gcd of two numbers as a linear combination of them). The two standard ways to compute this are iteratively and recursively.

The iterative algorithm can be written in Golfscript as follows:
[N\]1 0{1$3$~/*-@.~%:R+(;\@R}do\;\; # 35 chars
Much of the verbosity of this comes from the fact that we don't have easy access to manipulate the fourth element in the stack, so we store two of the elements in an array. By using a variable to store one of the elements, we can save several characters, for example:
:|!1\N{@2$|3$/*-\|\:|%.}do;; # 28 chars

The recursive algorithm is also fairly simple:
N{.{:|1$|/@@%|\g\@2$*-}{!\!}if}:g~; # 35 chars
We can shorten this further by noting that when we have b=0, then we must have a=1 (since this value gives the gcd of the two numbers).
N{:|{.|/\|%|\g\@2$*-}and}:g~; # 29 chars

The last algorithm we have to compute modular multiplicative inverses is modular exponentiation, noting that when p is prime and gcd(a,p)=1, the following equation holds:

This follows directly from Fermat's little theorem (or if you prefer, Euler's theorem).
While modular exponentiation isn't a built-in function to Golfscript (in dc, we could do lnd2-r| to get the result), we can still compute it easily:
1N(({.1&{\2$*N%\}*2/@.*N%@@.}do;\; #34 chars

This simplifies using the built-in 'base' function.
1N((2base-1%{2$\?*\.*N%\}/\;
# 28 chars -- we can skip the N% on the result variable as we will reduce the end result anyway)



Now that we've gotten the options for the hard bit of the code, we can write the main loops to compute the value given by the Lagrange polynomial.
n%(!\:A{:c~\:x;A[c]-{~;.x- __divide__ *N%}/+N%}/
Of the three possible options for the division, the shortest will be the iterative extended Euclidean algorithm. Note that we will still need to initialize N, so we would have to add whitespace to use the other options.
n%(!\:A{:c~\:x;A[c]-{~;.x-:|!1\1928049029:N{@2$|3$/*-\|\:|%.}do;;**N%}/+N%}/
# 76 chars -- passes sample test cases


We can still golf this further. Instead of removing the current point from the list of points, at the start (6 chars overhead), we can modify the denominator inside the inner loop to be xᵢ if it is otherwise zero. This will make the term with i=j equal 1, so it will have no affect on the product.
n%(!\:A{~\:<;A{~;.<-<or:|!1\1928049029:N{@2$|3$/*-\|\:|%.}do;;**N%}/+N%}/ # 73 chars 


We can also remove the inner modulo reduction. This will of course make intermediate values get fairly large, but only about 10 digits per line of input (or 20 if the integers aren't all close to 0).
n%(!\:A{~\:<;A{~;.<-<or:|!1\1928049029:N{@2$|3$/*-\|\:|%.}do;;**}/+N%}/ # 71 chars 


Since we aren't reducing the product modulo N now, we can make one final reduction. Since our modular multiplicative inverse returns 1 for the input 0, we can simply allow the i=j case to pass and divide by xᵢ after all the multiplications are done. This brings our code down to 67 characters!
n%(!\:A{~A{~;.3$-:|!1\1928049029:N{@2$|3$/*-\|\:|%.}do;;**}/\/+N%}/

Note: If we can resort to brute force for the modular multiplicative inverse, we can have code which computes the result in 52 characters:
n%(!\:A{~A{~;.3$-{*N%2<}+1928049029:N,(;?**}/\/+N%}/

Addendum: Peter Taylor notes that we can simply use the formula derived from Fermat's little theorem for the brute force approach, and provides a 47 character implementation. We can in fact shorten this to 46 characters using the .!+ pattern:
n%(!\:A{~A{~;.3$-.!+1928049029:N((?**}/\/+N%}/

No comments:

Post a Comment