Motivation
There is a class of attacks where smeshers, potentially with very small weight, use their ballots to introduce blocks that were not constructed following consensus rules.
Under the existing implementation, each block with at least one vote must be fetched by nodes as part of checking the data availability of the ballot (required for syntactic validation).
An attacker who tries to maximize the effect of such an attack can introduce many blocks with each ballot, limited only by the maximal diff list size of a ballot. Combined with other spamming vectors in which a smesher maximizes the number of ballots they can create, this acts as a force multiplier.
Mitigation
We change the conditions under which fetching a block is required. We allow smeshers to syntactically validate a ballot even when they don’t have the referenced blocks.
When a block becomes contextually valid, we must then ensure we have it and try to fetch it if we don’t.
This eliminates the problem because a block with little weight behind it no longer puts additional load on the system: no smesher fetches it or stores it. An attacker who manages to make the block appear valid needs substantial weight to do so, which means the attack is no longer cheap.
When can a block become contextually valid?
Hare
The simplest case is when the node sees Hare reach consensus on a set of proposals. When the node sees this, it can construct a block from those proposals and store it. The block is considered contextually valid at this point, but until a certificate is constructed the node can’t prove it to peers.
Tortoise
When a block crosses the positive threshold, it’s considered valid. At this point it may already be valid thanks to Hare, in which case the node should already have it stored. If the block is not locally available at this point, it should be fetched from peers.
Having to fetch the block at this point should be rare. It can happen either because of an attack, or because the node was offline during Hare and the certification round failed (which also indicates an attack).
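The Tortoise rule above can be sketched as a small helper that runs when a block crosses the positive threshold. This is a minimal sketch, not the node's implementation: BlockStore, FetchFunc, ErrNotFound and EnsureAvailable are hypothetical names, and BlockID is a simplified 20-byte stand-in.

```go
package main

import (
	"errors"
	"fmt"
)

// BlockID is a simplified 20-byte stand-in for the real identifier type.
type BlockID [20]byte

// BlockStore is a hypothetical in-memory view of locally available blocks.
type BlockStore map[BlockID][]byte

// FetchFunc is a hypothetical hook that asks peers for a block.
type FetchFunc func(BlockID) ([]byte, error)

// ErrNotFound is what a FetchFunc returns when no peer can serve the block.
var ErrNotFound = errors.New("block not found on any peer")

// EnsureAvailable is meant to run when a block crosses the positive
// Tortoise threshold. It contacts peers only if the block is not stored.
func EnsureAvailable(store BlockStore, fetch FetchFunc, id BlockID) error {
	if _, ok := store[id]; ok {
		// Common case: the block was already built locally after Hare.
		return nil
	}
	data, err := fetch(id)
	if err != nil {
		return fmt.Errorf("fetch block %x: %w", id[:4], err)
	}
	store[id] = data
	return nil
}
```

In the common case the function returns without touching the network, which matches the expectation that fetching at this point is rare.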
Sync
When a node performs sync, it only requests and stores provably valid blocks. This means the neighbors from which it syncs must either provide a Hare certificate or supporting ballots whose weight crosses the positive Tortoise threshold.
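The sync rule can be illustrated with a small acceptance check. This is a sketch under assumptions: SyncProof and ProvablyValid are hypothetical names, both proof fields are assumed to be verified elsewhere (signatures, eligibility, no ballot counted twice), and "crosses" is read as strictly greater than the threshold.

```go
package main

// SyncProof is a hypothetical bundle a peer sends alongside a block.
// Both fields are assumed to have been verified before this check runs.
type SyncProof struct {
	HasCertificate bool     // a valid Hare certificate for the block
	SupportWeights []uint64 // weights of distinct ballots voting for the block
}

// ProvablyValid reports whether the proof justifies requesting and storing
// the block: either a Hare certificate, or supporting weight strictly above
// the positive Tortoise threshold.
func ProvablyValid(p SyncProof, positiveThreshold uint64) bool {
	if p.HasCertificate {
		return true
	}
	var total uint64 // invariant: total <= positiveThreshold
	for _, w := range p.SupportWeights {
		// Equivalent to total+w > positiveThreshold, but cannot overflow.
		if w > positiveThreshold-total {
			return true
		}
		total += w
	}
	return false
}
```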
When can a block turn from valid to invalid?
Hare-valid
A block that was certified by Hare or constructed from proposals agreed upon by Hare might not end up being voted for by a majority of ballots. This can only happen when our security assumptions are violated.
When a block crosses the negative threshold in Tortoise we can safely prune it if it was previously considered valid.
Tortoise-valid
Similarly, when our security assumptions are violated, the Tortoise might validate a block and later change direction. If we already have the block stored, we should not prune it as soon as it drops below the positive threshold, but we should prune it once it drops below the negative threshold.
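The pruning rule for both cases is a form of hysteresis, sketched below. The representation is an assumption made for illustration: the vote margin is modeled as a signed number (supporting weight minus opposing weight), and Validity, Classify and ShouldPrune are hypothetical names.

```go
package main

// Validity is the node's local opinion about a block.
type Validity int

const (
	Undecided Validity = iota
	Valid
	Invalid
)

// Classify maps a signed vote margin (supporting weight minus opposing
// weight) onto an opinion. The thresholds are assumed to satisfy
// negative < positive.
func Classify(margin, positive, negative int64) Validity {
	switch {
	case margin > positive:
		return Valid
	case margin < negative:
		return Invalid
	default:
		return Undecided
	}
}

// ShouldPrune reports whether a stored block can be dropped. A block that
// only falls back below the positive threshold is kept; pruning happens
// once the margin is below the negative threshold.
func ShouldPrune(stored bool, margin, positive, negative int64) bool {
	return stored && Classify(margin, positive, negative) == Invalid
}
```

Keeping the block while it is undecided avoids refetching it if the Tortoise swings back above the positive threshold.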
Voting for Blocks in Ballots
In addition to the existing rules for when to vote for blocks, if a block we would otherwise vote for is not locally available, we vote against it.
“Locally available” here means we have it stored in the database, either because we constructed the block locally after Hare agreed on a set of proposals or because we successfully fetched it from peers.
This means that if a situation is somehow created where a valid block is not available to any honest smesher, all honest smeshers will vote against it.
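A minimal sketch of the availability rule, applied to the blocks a node would otherwise vote for. SplitVotes and the available map are hypothetical; in a real node the lookup would go to the database.

```go
package main

// BlockID is a simplified 20-byte stand-in for the real identifier type.
type BlockID [20]byte

// SplitVotes takes the blocks the existing rules would support and moves
// every block that is not locally available to the "against" side.
func SplitVotes(candidates []BlockID, available map[BlockID]bool) (support, against []BlockID) {
	for _, id := range candidates {
		if available[id] {
			support = append(support, id)
		} else {
			against = append(against, id)
		}
	}
	return support, against
}
```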
Block Availability on the Network Level
As explained in the previous section, if a block is voted for in a ballot by at least one honest smesher, it should be available to that smesher. Our assumptions say that any object available to an honest smesher should eventually be obtainable by all honest smeshers.
This means that, unless the adversary controls a majority of smeshing weight, an unavailable block will quickly cross the negative threshold, after which not having it no longer holds consensus back.
Encoding of Votes
Now that invalid blocks are not obtained, the node can’t determine the layer they belong to or their TickHeight. To fix it, we create a richer Vote structure:
type Vote struct {
	BlockID
	LayerID
	TickHeight uint64
}
Conflicting votes
A ballot is only allowed to vote for a block ID in a single layer with a single tick height. A block ID may not appear in the diff lists more than once; otherwise the ballot is syntactically invalid. If the base ballot (or recursively, a deeper parent) votes for a block in some layer or with some TickHeight and the referencing ballot has the block in the support list (with a different layer / tick height), then it implicitly votes against the previous combination.
Votes against blocks
A vote against a block can still consist of only a block ID, since each block can appear only once in the history of a syntactically valid ballot.
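The uniqueness rule for diff lists can be sketched as a syntactic check. The Diffs container and CheckDiffs are hypothetical, and Vote is written here with named fields and simplified types rather than the embedded types of the proposal.

```go
package main

import "fmt"

// BlockID is a simplified 20-byte stand-in for the real identifier type.
type BlockID [20]byte

// Vote is a support vote carrying the layer and tick height it refers to.
type Vote struct {
	ID         BlockID
	Layer      uint32
	TickHeight uint64
}

// Diffs is a hypothetical container for a ballot's diff lists. Votes
// against carry only the block ID.
type Diffs struct {
	Support []Vote
	Against []BlockID
}

// CheckDiffs rejects a ballot whose diff lists mention a block ID more
// than once, regardless of which list or which layer / tick height.
func CheckDiffs(d Diffs) error {
	seen := make(map[BlockID]struct{}, len(d.Support)+len(d.Against))
	for _, v := range d.Support {
		if _, dup := seen[v.ID]; dup {
			return fmt.Errorf("block %x appears more than once", v.ID[:4])
		}
		seen[v.ID] = struct{}{}
	}
	for _, id := range d.Against {
		if _, dup := seen[id]; dup {
			return fmt.Errorf("block %x appears more than once", id[:4])
		}
		seen[id] = struct{}{}
	}
	return nil
}
```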
Impact on mesh growth
This means that each vote is now 12 bytes bigger (LayerID is 4 bytes and TickHeight is 8 bytes). In the happy case where there’s no attack and a single vote per ballot, this translates to 63 MB/year.
An attacker could create many ballots and maximize the diff list. If we limit the diff list to 20 diffs, and the attacker can add 50 ballots per layer (e.g. by under-reporting seen ATXs) then they can add up to 1.26 GB/year of spam. This isn’t great, but it’s not a cheap attack and I don’t see a show stopper here.
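The two figures can be reproduced with a short calculation. The layer duration (5 minutes, so 105,120 layers per 365-day year) and the happy-case count of 50 ballots per layer are not stated in the text; they are assumptions inferred from the quoted totals.

```go
package main

const (
	layerIDBytes      = 4
	tickHeightBytes   = 8
	extraBytesPerVote = layerIDBytes + tickHeightBytes

	// Assumption: 5-minute layers, 365-day year.
	layersPerYear = 365 * 24 * 60 / 5
)

// ExtraBytesPerYear returns the yearly mesh growth caused only by the
// 12 extra bytes per vote.
func ExtraBytesPerYear(ballotsPerLayer, votesPerBallot uint64) uint64 {
	return extraBytesPerVote * votesPerBallot * ballotsPerLayer * layersPerYear
}
```

Under these assumptions, ExtraBytesPerYear(50, 1) is 63,072,000 bytes (about 63 MB) and ExtraBytesPerYear(50, 20) is 1,261,440,000 bytes (about 1.26 GB).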
Tallying votes
We consider each combination of BlockID, LayerID and TickHeight to be a distinct voting target. A vote for a block that specifies the wrong layer ID and/or tick height can be ignored, but before we have the actual block we must consider all combinations. If a combination reaches the positive threshold, the node tries to obtain the block and ensure that the combination is valid (LayerID and TickHeight match this block).
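A sketch of tallying per combination, assuming a simple map keyed by the full combination. Target, Tally, Header and Matches are hypothetical names, and only supporting weight is tracked to keep the example short.

```go
package main

// BlockID is a simplified 20-byte stand-in for the real identifier type.
type BlockID [20]byte

// Target is one (BlockID, LayerID, TickHeight) combination.
type Target struct {
	ID         BlockID
	Layer      uint32
	TickHeight uint64
}

// Header holds the fields of a fetched block needed to confirm a target.
type Header struct {
	ID         BlockID
	Layer      uint32
	TickHeight uint64
}

// Tally accumulates supporting weight separately for every combination.
type Tally map[Target]uint64

// Support adds a ballot's weight to exactly the combination it voted for.
func (t Tally) Support(target Target, weight uint64) { t[target] += weight }

// Crossed reports whether a combination is above the positive threshold,
// which is the point at which the node tries to obtain the block.
func (t Tally) Crossed(target Target, positiveThreshold uint64) bool {
	return t[target] > positiveThreshold
}

// Matches checks a fetched block against the combination that was voted for.
func Matches(voted Target, actual Header) bool {
	return voted.ID == actual.ID &&
		voted.Layer == actual.Layer &&
		voted.TickHeight == actual.TickHeight
}
```

Weight cast for the same block ID with different tick heights is never added together, so a wrong combination cannot borrow weight from the correct one.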
Encoding blocks in HistoryHash
HistoryHash is already divided into layers, so we only need to add the TickHeight of each block into the commitment.
type HistoryBlock struct {
	BlockID
	TickHeight uint64
}
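To illustrate what committing to the TickHeight means, here is a sketch of a per-layer commitment that chains over the previous value. SHA-256, little-endian encoding, and the LayerCommitment function are placeholders chosen for this example, not the hash or codec the node uses. HistoryBlock is written with a named ID field and a simplified BlockID.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
)

// BlockID is a simplified 20-byte stand-in for the real identifier type.
type BlockID [20]byte

// HistoryBlock pairs a block with the tick height that was voted for.
type HistoryBlock struct {
	ID         BlockID
	TickHeight uint64
}

// LayerCommitment hashes the previous commitment together with every
// block of the layer, including its tick height, so that two histories
// that differ only in a tick height produce different commitments.
func LayerCommitment(prev [32]byte, blocks []HistoryBlock) [32]byte {
	h := sha256.New()
	h.Write(prev[:])
	var buf [8]byte
	for _, b := range blocks {
		h.Write(b.ID[:])
		binary.LittleEndian.PutUint64(buf[:], b.TickHeight)
		h.Write(buf[:])
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}
```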