Assume n⋅n samples (n per row and column), v validators, and each validator being able to repopulate s samples (because their rows and columns intersect at s points).
What we want: Very high probability that at least 50% of the samples on each row and column can be repopulated from orthogonal rows/columns.
For each given sample, the probability that one of the validators can make it available as one of their reconstructed samples is
p=1−(1−q)sv
where q=1n2.
Let’s look at one line (row or column). The distribution of the number of available samples is distributed binomially with Binom(n,p) (approximately; the probabilities are not independent, but given the large number of samples they are very close to being independent). This distribution can be approximated using the normal distribution Binom(n,p)≈Normal(np,np(1−p)). We want to have a small probability that less than 50% of this line is available. We can quantify this using multiples of ασ; e.g. 6σ means a ca. 1 in 1 billion chance that a row cannot be reconstructed.
To compute the value of p where this occurs, we set
n2=np−α√np(1−p)
thus
n−2np=−2α√np(1−p)2np−n=2α√np(1−p)n2(4p2−4p+1)=4α2np(1−p)4(n2+α2n)p2−4(n2+α2n)p+n2=0p2−p+n24(n2+α2n)=0
Solving the quadratic equation (we want the positive root), we get
p=12+12√1−n2n2+α2np=12(1+√1−n2n2+α2n)=12+12√n2+α2n−n2n2+α2n=12+12√α2nn2+α2n=12(1+α√nn2+α2n)
So
p=1−(1−q)sv=12(1+α√nn2+α2n)(1−q)sv=12(1−α√nn2+α2n)
and thus
v=1slog12(1−α√nn2+α2n)log(1−q)
For s=4, α=6 and n=512, we get v=64833 validators which are necessary for safe reconstruction.