-
-
Published
Linked with GitHub
# Sample reconstruction estimate
Assume $n \cdot n$ samples ($n$ per row and column), $v$ validators, and each validator being able to repopulate $s$ samples (because their rows and columns intersect at $s$ points).
What we want: Very high probability that at least 50% of the samples on each row and column can be repopulated from *orthogonal* rows/columns.
## Probability computation
For each given sample, the probability that one of the validators can make it available as one of their reconstructed samples is
$$
p = 1 - (1 - q) ^ {s v}
$$
where $q = \frac{1}{n^2}$.
Let's look at one line (row or column). The distribution of the number of available samples is distributed binomially with $\mathrm{Binom}(n, p)$ (approximately; the probabilities are not independent, but given the large number of samples they are very close to being independent). This distribution can be approximated using the normal distribution $\mathrm{Binom}(n, p) \approx \mathrm{Normal}(np, np(1-p))$. We want to have a small probability that less than 50% of this line is available. We can quantify this using multiples of $\alpha \sigma$; e.g. $6\sigma$ means a ca. 1 in 1 billion chance that a row cannot be reconstructed.
To compute the value of $p$ where this occurs, we set
$$
\frac{n}{2} = np - \alpha \sqrt{n p (1-p)}
$$
thus
$$
n - 2np = -2\alpha \sqrt{n p (1-p)} \\
2np - n = 2\alpha \sqrt{n p (1-p)} \\
n^2 (4 p^2 - 4p + 1) = 4\alpha^2 n p (1-p) \\
4( n^2 + \alpha^2 n) p^2 - 4(n^2 + \alpha^2 n)p + n^2 = 0 \\
p^2 - p + \frac{n^2}{4 (n^2 + \alpha^2 n)} = 0
$$
Solving the quadratic equation (we want the positive root), we get
$$
p = \frac{1}{2} + \frac{1}{2}\sqrt{1 - \frac{n^2}{ n^2 + \alpha^2 n}} \\
p = \frac{1}{2}\left(1 + \sqrt{1 - \frac{n^2}{ n^2 + \alpha^2 n}}\right)\\
= \frac{1}{2} + \frac{1}{2}\sqrt{\frac{n^2 + \alpha^2 n - n^2}{n^2 + \alpha^2 n}} \\
= \frac{1}{2} + \frac{1}{2}\sqrt{\frac{\alpha^2 n }{n^2 + \alpha^2 n}} \\
= \frac{1}{2} \left(1+ \alpha\sqrt{\frac{ n }{ n^2 + \alpha^2 n}} \right)
$$
So
$$
p = 1 - (1 - q) ^ {s v} = \frac{1}{2} \left(1+ \alpha\sqrt{\frac{ n }{n^2 + \alpha^2 n}} \right)\\
(1 - q) ^ {s v} = \frac{1}{2} \left(1- \alpha\sqrt{\frac{ n }{n^2 + \alpha^2 n}} \right)
$$
and thus
$$
v = \frac{1}{s}\frac{\log\frac{1}{2} \left(1- \alpha\sqrt{\frac{ n }{n^2 + \alpha^2 n}} \right)}{\log (1-q)}
$$
For $s=4$, $\alpha=6$ and $n=512$, we get $v=64833$ validators which are necessary for safe reconstruction.