# Improve PeerDAS network by chunking columns
Authors: [Pop](https://github.com/ppopth)
*TLDR: We propose to chunk the columns in PeerDAS instead of sending each of them as a single message. The result is that the propagation latency is reduced significantly. It's similar to the [blob decoupling](https://github.com/ethereum/consensus-specs/pull/3244) we did in Deneb, but applied to PeerDAS instead.*
## Chunking the columns

As you can see in the first figure, GossipSub is a **store-and-forward** protocol. That is, each message is stored first when it is received, then validated, and only then forwarded; it is not forwarded while it is still being received.
Fortunately, columns in PeerDAS are easy to chunk, even though we need to add ~300 bytes to each chunk: the overhead from a signed block header and a KZG commitment inclusion proof. Those two things are necessary to prevent DoS attacks on the network.
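For a rough sense of where the ~300 bytes comes from, here is a back-of-the-envelope breakdown; the SSZ field sizes and an inclusion proof depth of 4 are assumptions based on the current spec, not part of this proposal:

```python
# Back-of-the-envelope per-chunk overhead (assumptions: current SSZ sizes,
# KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH = 4).
BEACON_BLOCK_HEADER_SIZE = 8 + 8 + 32 + 32 + 32  # slot, proposer_index, parent_root, state_root, body_root
BLS_SIGNATURE_SIZE = 96
SIGNED_BLOCK_HEADER_SIZE = BEACON_BLOCK_HEADER_SIZE + BLS_SIGNATURE_SIZE  # 208 bytes
INCLUSION_PROOF_SIZE = 4 * 32                                             # 128 bytes
PER_CHUNK_OVERHEAD = SIGNED_BLOCK_HEADER_SIZE + INCLUSION_PROOF_SIZE      # ~336 bytes, on the order of ~300
```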
When the columns are chunked, you don't need to store the whole column before forwarding it. You can receive a chunk, store it, validate it, and forward it individually. As shown in the second figure, the column is divided into 4 chunks. Theoretically, the propagation latency should be reduced significantly.
## Specification change
The only change needed in the spec is to `DataColumnSidecar`: a new field holding the starting row index of the chunk, and a limit on the number of cells a chunk can carry. A new constant `MAX_CELLS_PER_CHUNK` determines that limit.
```python
class DataColumnSidecar(Container):
    index: ColumnIndex
    starting_row_index: RowIndex  # New
    column: List[Cell, MAX_CELLS_PER_CHUNK]  # Modified
    kzg_commitments: List[KZGCommitment, MAX_CELLS_PER_CHUNK]  # Modified
    kzg_proofs: List[KZGProof, MAX_CELLS_PER_CHUNK]  # Modified
    signed_block_header: SignedBeaconBlockHeader
    kzg_commitments_inclusion_proof: Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]
```
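To illustrate how a proposer would use the new fields, here is a minimal sketch (not spec text) that splits one full column into chunked sidecars; `build_column_chunks` and its signature are hypothetical:

```python
def build_column_chunks(column_index: ColumnIndex,
                        cells: Sequence[Cell],
                        commitments: Sequence[KZGCommitment],
                        proofs: Sequence[KZGProof],
                        signed_block_header: SignedBeaconBlockHeader,
                        inclusion_proof: Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]
                        ) -> Sequence[DataColumnSidecar]:
    # Split one full column (one cell per blob row) into chunks of at most
    # MAX_CELLS_PER_CHUNK cells, each carried by its own DataColumnSidecar.
    chunks = []
    for start in range(0, len(cells), MAX_CELLS_PER_CHUNK):
        end = min(start + MAX_CELLS_PER_CHUNK, len(cells))
        chunks.append(DataColumnSidecar(
            index=column_index,
            starting_row_index=RowIndex(start),
            column=cells[start:end],
            kzg_commitments=commitments[start:end],
            kzg_proofs=proofs[start:end],
            # Every chunk repeats the header and inclusion proof (the ~300-byte overhead).
            signed_block_header=signed_block_header,
            kzg_commitments_inclusion_proof=inclusion_proof,
        ))
    return chunks
```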
There would also be changes to the GossipSub validation rules, but we think the effort should be small.
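As a rough sketch of what the per-chunk checks could look like: the helper names below mirror the existing full-column spec functions, but the chunk-aware variants are assumptions, not a worked-out design.

```python
def verify_chunk_sidecar(sidecar: DataColumnSidecar) -> bool:
    # Shape checks implied by the new fields.
    if sidecar.index >= NUMBER_OF_COLUMNS:
        return False
    if not 0 < len(sidecar.column) <= MAX_CELLS_PER_CHUNK:
        return False
    if not len(sidecar.column) == len(sidecar.kzg_commitments) == len(sidecar.kzg_proofs):
        return False
    # The commitments must still be committed to by the signed block header
    # (assumed chunk-aware variant of the existing inclusion-proof check).
    if not verify_data_column_sidecar_inclusion_proof(sidecar):
        return False
    # Each cell must verify against its commitment at row
    # starting_row_index + offset and column sidecar.index
    # (assumed chunk-aware variant of the existing KZG check).
    return verify_data_column_sidecar_kzg_proofs(sidecar)
```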
## Simulation result
We ran a simulation using [pubsub-shadow](https://github.com/ppopth/pubsub-shadow/tree/chunked-peerdas) with 2,000 nodes, 20% of which have 1Gbps bandwidth and 80% of which have 50Mbps. The block builder always has 1Gbps. Each node subscribes to 8 random columns, while the block builder subscribes to all 128 columns and publishes all of them. We compare two scenarios: one without any chunking and one with 32KB chunks.
As you can see from the figure below, we ran simulations with various numbers of blobs. The chunked runs always perform better. Even though the propagation time is reduced by only about 10%, we think the engineering effort is small, so it's worth squeezing the juice out.
In another figure below, we ran simulations with various chunk sizes: 32KB, 64KB, and 128KB. Even though 32KB shows the best result, we would not like to go lower, because we want to keep the overhead from the signed block header and inclusion proof negligible. Currently it's ~300 bytes per chunk, so with 32KB chunks the overhead is <1%.
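Using the ~300-byte figure above, the relative overhead per chunk size works out to roughly:

```python
# Relative per-chunk overhead for the simulated chunk sizes,
# using the ~300-byte figure from above.
OVERHEAD_BYTES = 300
for chunk_size in (32 * 1024, 64 * 1024, 128 * 1024):
    print(f"{chunk_size // 1024}KB: {OVERHEAD_BYTES / chunk_size:.2%}")
# 32KB: 0.92%, 64KB: 0.46%, 128KB: 0.23%
```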

## How about one topic per chunk?
This proposal is similar to what we did in Deneb with [this PR](https://github.com/ethereum/consensus-specs/pull/3244), which decoupled the blobs from the block: instead of sending all the blobs together with the block in a single message, each blob is sent separately as an individual message. However, that PR also introduced a dedicated GossipSub topic for each blob, so we were curious whether we should have dedicated topics for each chunk as well.
It turned out that we didn't see any improvement in the simulations below, so for simplicity it's better to send all chunks of a column on a single topic instead.

## Related work
### Partial column dissemination
*Ethresearch post: https://ethresear.ch/t/a-new-design-for-das-and-sharded-blob-mempools/22537*
Partial column dissemination is an idea for distributed block building: nodes help disseminate some parts of the columns early even though they don't have every blob in the block, only some from the mempool. This requires the `getBlobs` engine API.
Our proposal is different in that it shows chunking is beneficial even without `getBlobs`, so it works with both public and private blob transactions.
Even better, once partial column dissemination is implemented, we can incorporate this proposal by chunking the column messages whenever they are larger than some threshold.
### Column propagation with cell centric erasure network coding
*Ethresearch post: https://ethresear.ch/t/improving-column-propagation-with-cell-centric-erasure-network-coding/22298*
This post uses cells, rather than columns, as the unit of propagation. It's similar to ours in that it sends partial columns rather than the full column. However, that post focuses on reducing bandwidth consumption, whereas ours focuses on propagation latency.