One of the major challenge to scale bitcoin is that a lot of the work done by nodes is reactive. Nodes receive a block and start validating. This takes time and resources and is right in the hot path. The problem gets worse the bigger the block are, to the point were consensus takes more and more time to be reached. At some point, consensus cannot be reached.
In order to improve scaling capabilities of the network, it is key that nodes do as much work as possible ahead of time. When a block is received, node just have to assert that the node effectively contains what they expected, which speeds up validation significantly.
Bitcoin Unlimited developed a very interesting technology in order to speedup block propagation.
The basic idea is that if nodes already know of transactions that are likely to be in the a block, and rely on it to not repeat work that already been done.
If node A wants to request a block from node B, the first step is to send a bloom filter of all the unconfirmed transactions it knows of. Node B can use the bloom filter to know which transactions in the block A already knows about, and which A doesn’t. B will send the transactions A do not know about, but only reference to the transactions A knows about. As a result, the amount of information that needs to be transmitted to A is greatly reduced and the block propagate faster.
Keeping the mempool in sync
Using XThin, nodes A and B synchronize their mempool at the last minute. It is the hot path. A naive way to improve this is to have A and B exchange bloom filters on a regular basis and make sure they know about the same transactions. This way, transmitting the bloom filter would not be in the hot path anymore. As a result, the block can be sent right away between nodes without waiting for the bloom filter.
While this is doable, it wouldn’t be very efficient. Nodes would require a very large amount of bandwidth to constantly be exchanging bloom filters and transactions. This doesn’t looks like it is going to be very scalable. Or does it ?
Transmitting the difference
It is a given that the set of transactions a node has in memory is not going to change dramatically in between two synchronizations and is only going to grow in between blocks. New transactions in the mempool of a node will be materialized by extra 1s in the bloom filter.
Instead of sending the whole bloom filter every single time, a node could send only the extra 1s that were added since the last bloom filter was sent. However, network are unreliable, and the remote node may have lost one of the updates, received them out of order, or failed to acknowledge reception.
Hopefully, this problem has been solved very elegantly in the Quake III Arena network code. Each node will maintain a list of snapshot of its bloom filter. Let’s assume snapshots are synchronized at a 1s granularity. Each node can keeps 32 or 64 snapshots in its history or whatever number it thinks is appropriate. It will also remember what is the last state the remote node acknowledged.
Every second, node A will look what is the last bloom filter it sent to B that B acknowledged (S). It is then going to do a xor between its latest snapshot (L) and S to create update data (U). It is then going to send to B an update message containing U, the revision number of S and the revision number of L.
When receiving such message, B will first update its copy of A’s bloom filter by doing an or between the old state one the update. B will then send a acknowledgement message to A, containing the revision number of L, and the transaction from its mempool it thinks A is missing.
Alternatively, B can also use xor and keep an history of the states sent by A, which would allow B to catch removal from A’s mempool,but ultimately, it is not clear if it is a winning strategy compared to the or one.
Because updates are sent on a regular basis, the result of the xor operation is going to be mostly zeros. The few non zeros in there should be powers of 2 for the most part. Entropy coding should help make these messages compact.
When Quake III was written, Huffman was the only fast option. Arithmetic coding, requires to do many divisions on the compression side, and literally feed entropy to the branch predictor on the decompression side. This is bound to be slow. However, since then, a third option was put on the table int he name of FSE, which combine arithmetic coding compression capabilities with Huffman’s speed. That’s what we are looking for.
For future extensibility and to ensure compactness, FSE tables can be precomputed by implementing the protocol without compression and gathering real world data using it. A field in the message can identify the table the sender is using.
Now we have a very compact and fast protocol to synchronize mempool, this is getting good !
Really, we don’t need to add much here as the proposed protocol has really nice properties when it comes to fault tolerance. If a packet is dropped, be it the message or the ack, the sender will just keep sending update from the last acknowledged snapshot. That means that if a receiver miss an update, it’ll get the 2 combined in the next message. Because of entropy coding, the message will just be bigger. The protocol naturally compensate packet loss with a temporary increase in bandwidth to keep things running.
The same thing happens in case of latency increase. In case of reception out of order, the older update can simply be discarded. The beauty of this all is that no specific handling of these scenario is needed. The protocol is remarkably robust and will handle network fault naturally as an emergent property.
As a result, UDP can be used to get the very best performances out of the system.
If a node gets too far behind as it doesn’t acknowledge updates, then it can be considered disconnected, or be sent again the whole bloom filter. The same happens with new nodes or when the bloom filter needs to be resized.
Propagating blocks eagerly using prior knowledge
When a node receives a block, if the above protocol is followed, it should have a recent version of the bloom filter used by its peer. It is possible to aggressively send the block by removing all the information already known by the peer. Alternatively, it is possible for the node to only send the block on demand, in which case only an update of the bloom filter is needed, which can be very compact using the same trick as before, removing work from the critical path.
In both cases, we shaved a bloom filter exchange out of the critical path. We also made sure that mempool are synchronized. We get side effect benefit like improving the way transaction are propagated as, if a node missed an update about a transaction, it will notice quickly as the transaction is not in its bloom filter.