Loading
Settings

Sample reconstruction estimate

Assume nn samples (n per row and column), v validators, and each validator being able to repopulate s samples (because their rows and columns intersect at s points).

What we want: Very high probability that at least 50% of the samples on each row and column can be repopulated from orthogonal rows/columns.

Probability computation

For each given sample, the probability that one of the validators can make it available as one of their reconstructed samples is
p=1(1q)sv
where q=1n2.

Let’s look at one line (row or column). The distribution of the number of available samples is distributed binomially with Binom(n,p) (approximately; the probabilities are not independent, but given the large number of samples they are very close to being independent). This distribution can be approximated using the normal distribution Binom(n,p)Normal(np,np(1p)). We want to have a small probability that less than 50% of this line is available. We can quantify this using multiples of ασ; e.g. 6σ means a ca. 1 in 1 billion chance that a row cannot be reconstructed.

To compute the value of p where this occurs, we set
n2=npαnp(1p)

thus
n2np=2αnp(1p)2npn=2αnp(1p)n2(4p24p+1)=4α2np(1p)4(n2+α2n)p24(n2+α2n)p+n2=0p2p+n24(n2+α2n)=0

Solving the quadratic equation (we want the positive root), we get

p=12+121n2n2+α2np=12(1+1n2n2+α2n)=12+12n2+α2nn2n2+α2n=12+12α2nn2+α2n=12(1+αnn2+α2n)

So

p=1(1q)sv=12(1+αnn2+α2n)(1q)sv=12(1αnn2+α2n)

and thus

v=1slog12(1αnn2+α2n)log(1q)

For s=4, α=6 and n=512, we get v=64833 validators which are necessary for safe reconstruction.