Deferred Block Fetching (a.k.a. Dangling Pointers)

Motivation

There is a class of attacks where smeshers, potentially with very small weight, use their ballots to introduce blocks that were not constructed following consensus rules.
Under the existing implementation each block with at least one vote must be fetched by nodes as part of checking data-availability of the ballot (required for syntactic validation).
An attacker trying to maximize the effect of such an attack can introduce many blocks with each ballot, limited only by the maximal diff list size of a ballot. Combined with other spamming vectors, where a smesher maximizes the number of ballots they can create, this becomes a force multiplier.

Mitigation

We change the conditions in which fetching a block is required. We allow smeshers to syntactically validate a ballot even when they don’t have the referenced blocks.
When a block becomes contextually valid, we must then ensure we have it and try to fetch it if we don’t.
This eliminates the problem: a block with little weight behind it puts no additional load on the system, because no smesher will fetch or store it. If an attacker manages to make the block appear valid, it is no longer a cheap attack; it requires substantial weight to pull off.

When can a block become contextually valid?

Hare

The simplest case is when the node sees Hare reach consensus on a set of proposals. When the node sees this, it can construct a block from those proposals and store it. The block is considered contextually valid at this point, but until a certificate is constructed the node can’t prove it to peers.

Tortoise

When a block crosses the positive threshold, it is considered valid. At this point it may already be valid thanks to Hare, in which case the node should already have it stored. If the block is not locally available at this point, it should be fetched from peers.
Having to fetch the block at this point should be rare. It can happen either due to an attack, or because the node was offline during Hare and the certification round failed (which also implies an attack).

Sync

When a node performs sync, it only requests and stores provably valid blocks. This means the neighbors from which it syncs must either provide a Hare certificate or supporting ballots whose weight crosses the positive Tortoise threshold.

When can a block turn from valid to invalid?

Hare-valid

A block that was certified by Hare or constructed from proposals agreed upon by Hare might not end up being voted for by a majority of ballots. This can only happen when our security assumptions are violated.
When a block crosses the negative threshold in Tortoise we can safely prune it if it was previously considered valid.

Tortoise-valid

Similarly, when our security assumptions are violated the Tortoise might validate a block and later change direction. If we already have a block stored, we should not prune it immediately once it goes below the positive threshold, but we should prune it once it goes below the negative threshold.
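
Putting the rules above together, here is a minimal sketch of the resulting fetch/prune decision. It is illustrative only; all names (blockStatus, onStatusChange, the callback signatures) are assumptions, not the actual go-spacemesh API.

// blockStatus reflects where a block stands relative to the Tortoise thresholds
// (or a Hare certificate). Names are illustrative.
type blockStatus int

const (
	statusUndecided blockStatus = iota // between the negative and positive thresholds
	statusValid                        // Hare-valid or crossed the positive threshold
	statusInvalid                      // crossed the negative threshold
)

// onStatusChange applies the rules above: fetch a valid block we don't have,
// prune only once the negative threshold is crossed, and leave an undecided
// block alone (keep it if stored, don't fetch it if not).
func onStatusChange(id BlockID, status blockStatus, have func(BlockID) bool,
	fetch func(BlockID) error, prune func(BlockID)) {
	switch status {
	case statusValid:
		if !have(id) {
			_ = fetch(id) // retried until it succeeds or the block stops being valid
		}
	case statusInvalid:
		prune(id)
	case statusUndecided:
		// no action: keep the block if we already have it, don't fetch it if not
	}
}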

Voting for Blocks in Ballots

In addition to the existing rules for when to vote for blocks: if a block we would otherwise vote for is not locally available, we vote against it.
“Locally available” here means we have it stored in the database. This can be because we constructed the block locally after Hare agreed on a set of proposals, or if we successfully fetched it from peers.
This means that if a situation is somehow created where a valid block is not available to any honest smesher, all honest smeshers will vote against it.

Block Availability on the Network Level

As explained in the previous section, if a block is voted for in a ballot by at least one honest smesher, it should be available to that smesher. Our assumptions say that any object available to one honest smesher is eventually obtainable by all honest smeshers.
This means that, unless the adversary controls a majority of smeshing weight, the unavailable block will quickly cross the negative threshold, and not having it no longer holds consensus back.

Encoding of Votes

Now that not-yet-valid blocks are not fetched, the node can no longer learn from the block content which layer a block belongs to or what its TickHeight is. To address this, we create a richer Vote structure:

type Vote struct {
	BlockID           // the block being voted on
	LayerID           // the layer the block is claimed to belong to
	TickHeight uint64 // the claimed tick height of the block
}

Conflicting votes

A ballot is only allowed to vote for a given block ID in a single layer and with a single tick height. A block ID is not allowed to appear in the diff lists more than once; otherwise the ballot is syntactically invalid. If the base ballot (or, recursively, a deeper ancestor) votes for a block in some layer or with some TickHeight, and the referencing ballot has the block in its support list with a different layer / tick height, then it implicitly votes against the previous combination.
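
A minimal sketch of that duplicate check, reusing the Vote struct above; the function name and the exact shape of the diff lists are assumptions, not the actual go-spacemesh ballot handler.

// validDiffLists checks the rule above: a block ID may appear at most once
// across a ballot's support and against lists. Names are illustrative.
func validDiffLists(support []Vote, against []BlockID) bool {
	seen := make(map[BlockID]struct{})
	mark := func(id BlockID) bool {
		if _, dup := seen[id]; dup {
			return false
		}
		seen[id] = struct{}{}
		return true
	}
	for _, v := range support {
		if !mark(v.BlockID) {
			return false // duplicate block ID: the ballot is syntactically invalid
		}
	}
	for _, id := range against {
		if !mark(id) {
			return false
		}
	}
	return true
}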

Votes against blocks

Votes against can still be only a block ID, since each block can only appear once in the history of a syntactically valid ballot.

Impact on mesh growth

This means that each vote is now 12 bytes bigger (LayerID is 4 bytes and TickHeight is 8 bytes). In the happy case where there’s no attack and a single vote per ballot, this translates to 63 MB/year.

An attacker could create many ballots and maximize the diff list. If we limit the diff list to 20 diffs, and the attacker can add 50 ballots per layer (e.g. by under-reporting seen ATXs) then they can add up to 1.26 GB/year of spam. This isn’t great, but it’s not a cheap attack and I don’t see a show stopper here.
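
For reference, the arithmetic behind both figures; the 5-minute layer time (about 105,120 layers per year) and the 50 ballots per layer in the no-attack case are assumptions used for illustration.

const (
	voteOverheadBytes = 12      // LayerID (4 bytes) + TickHeight (8 bytes)
	layersPerYear     = 105_120 // assumes 5-minute layers
	ballotsPerLayer   = 50      // assumed number of ballots per layer
	maxDiffListSize   = 20      // assumed diff list limit
)

var (
	// one extra vote per ballot: 12 * 1 * 50 * 105,120 = 63,072,000 B ≈ 63 MB/year
	happyCaseBytes = voteOverheadBytes * 1 * ballotsPerLayer * layersPerYear
	// maxed-out diff lists: 12 * 20 * 50 * 105,120 = 1,261,440,000 B ≈ 1.26 GB/year
	spamCaseBytes = voteOverheadBytes * maxDiffListSize * ballotsPerLayer * layersPerYear
)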

Tallying votes

We consider each combination of BlockID, LayerID and TickHeight to be a distinct voting target. A vote for a block that specifies the wrong layer ID and/or tick height can be ignored, but before we have the actual block we must consider all combinations. If a combination reaches the positive threshold, the node tries to obtain the block and ensures that the combination is valid (the LayerID and TickHeight match the block).
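
One way to picture the bookkeeping this implies before the block content is available; the names and the integer weight type are illustrative, not the Tortoise's actual data structures.

// voteTarget is the unit that votes are tallied against before the block
// itself has been fetched.
type voteTarget struct {
	Block  BlockID
	Layer  LayerID
	Height uint64
}

// crossedPositive returns the combinations whose accumulated weight reached
// the positive threshold; for each of these the node tries to fetch the block
// and checks that the claimed LayerID and TickHeight actually match it.
func crossedPositive(tally map[voteTarget]int64, positiveThreshold int64) []voteTarget {
	var out []voteTarget
	for target, weight := range tally {
		if weight >= positiveThreshold {
			out = append(out, target)
		}
	}
	return out
}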

Encoding blocks in HistoryHash

HistoryHash is already divided into layers, so we only need to add the TickHeight of each block into the commitment.

type HistoryBlock struct {
	BlockID           // the block included in the layer's commitment
	TickHeight uint64 // the block's tick height, now part of the commitment
}
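
Hypothetically, the per-layer commitment would then hash each block ID together with its tick height, along these lines; the hash function, the ordering and the assumption that BlockID is a fixed-size byte array are illustrative, not the actual HistoryHash construction (uses crypto/sha256 and encoding/binary).

// layerCommitment folds each block's TickHeight into the layer's hash.
func layerCommitment(blocks []HistoryBlock) [32]byte {
	h := sha256.New()
	for _, b := range blocks {
		h.Write(b.BlockID[:]) // assumes BlockID is a fixed-size byte array
		_ = binary.Write(h, binary.LittleEndian, b.TickHeight)
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}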

@countvonzero said about this:

hare certificate will be pruned from the network after hdist

That’s fine, if we passed hdist we shouldn’t keep a block if it doesn’t have enough support from ballots.

we don’t know if supporting ballots weight cross the positive tortoise threshold if these ballots are beyond the sliding window

I don’t understand this comment. For any given block in history we should be able to tell if it’s valid (has enough supporting weight behind it).

i take it that you are essentially saying, we only sync what peers told us are contextually valid blocks. within hdist, we have the certificate to prove. but outside of hdist, how does the peer prove to us that this block is contextually valid? i see two ways:

  1. the peer provides the list of Ballot IDs that voted for the block, and the node tallies up the weight locally
  2. the node’s tortoise locally verifies that enough ballot weight is behind this block

my earlier comment was assuming no. 2, where we have to load all ballots since the block’s layer to count weight.

another thing is, say the node received 10 ballots for layer N, of which 4 are malicious with phantom blocks. when the node is choosing the base ballot for layer N+1, we have to know which ballots in layer N already contain available blocks. that doesn’t seem to delay the suffering by much.

I think there might be more than one way to solve this. In general, the idea is to hold off on accepting blocks until enough evidence has piled up that at least at some point this block was valid.

If the layer we’re syncing is within hdist from now, we can accept blocks from peers if they come with a Hare certificate. Otherwise, we first sync on ballots and run Tortoise (I assume we already do this) but we will accept blocks only after we count enough votes for them.

If this is historic data, there’s no need to run Tortoise up to the present to determine whether the block is still valid. If we run Tortoise and at some point the block becomes valid, we can fetch it right away. We shouldn’t block on obtaining it, though. It may have later been pruned and our neighbors may not have it available, and that’s fine. If we finish syncing, have run the Tortoise up to the present, and the block still seems valid, but we can’t obtain it from neighbors, then our behavior is to vote against it in our own ballots. We should also keep trying to obtain it, until one of these two things happens:

  1. We manage to obtain it (if this happens we can start voting for it again).
  2. The block stops being valid (if no honest party can obtain the block, it’s expected to eventually become invalid).

It doesn’t matter. When the node constructs a ballot, it encodes its current view. It should only vote for blocks that:

are available
&& (the node constructed the block based on Hare results that it witnessed
    || the node received a valid Hare certificate for this block and successfully fetched the block
    || the node counted enough Tortoise votes to cross the positive threshold and successfully fetched the block
   )

(this is slightly simplified, but I think it aids understanding)

The node might still choose a base ballot that votes for a “phantom block” but it would add this block to the Against diff list. Ideally, selecting a base ballot would involve some optimization on the diff list - to keep it as short as possible.

i want to summarize the fetching behavior specifically:

the only way for the node to have the block locally is when the node generated the block itself based on hare consensus output.

when a block is not available locally, the only times a node will attempt to fetch it from peers are

  • when it receives a certificate for that block ID
  • when the block ID crosses the positive threshold in tortoise

upon fetch failure, the node keeps retrying until either

  • the block content is fetched
  • the block ID crosses the tortoise negative threshold

so we are basically limiting repeated fetches of a “phantom block” to the window between when the block ID crosses the positive threshold and when it crosses the negative threshold.

as opposed to the current behavior, where the node fetches the first phantom block (of many) in a ballot once, fails, and considers such a ballot invalid.

so if attackers generate N ballots that all contain M phantom blocks for S layers, where S is large enough to make those block IDs cross the positive threshold, and it takes honest nodes voting against them for another T layers for these block IDs to cross the negative threshold:

with our current approach, the node will fail to fetch N*1*S times and consider all these ballots invalid.
with this new approach, the node will fail to fetch M*(T-S) times (assuming the node tries to fetch once per layer).

presumably T > 2*S. and our protocol can limit M but not N.
so this new approach has an upper bound on failed fetches.
is this the reasoning why the new approach is superior?

i think it is important to clarify that we don’t need to implement syncer and tortoise (counting for multiple targets) changes for genesis, and we probably shouldn’t in order to keep things simpler.

for genesis we need:

  • change votes format to include layer and height (as specified above)
  • add validation to the handler that accepts ballots, verifying that the height and layer in the votes match the values specified in the block

No, there’s no need to try fetching a block while it’s undecided (between the positive and negative thresholds). We’ll try to fetch it as long as it’s above the positive threshold and stop trying when it drops below the positive threshold again (because we wouldn’t vote for it in that case anyway).

If we already have the block available, we won’t discard it when it’s undecided though - that only happens when the negative threshold is crossed (and there’s a small likelihood of then crossing the positive threshold).


I think the impact of this change is bigger than you make it seem.

Under the old model an attacker could force the network to gossip, store and sync a large number of blocks - which are large objects - for a very low cost.

Under the new model an attacker can’t do that anymore. If they want to cause the network to do many failed fetch attempts - which are not very expensive (and we can probably optimize in various ways if they become a problem) - they need to have enough Tortoise voting power to make the blocks cross the positive threshold and stay there for the duration of the attack, despite all honest miners - who can’t obtain the blocks in question - repeatedly voting against them.

Seems like a worthwhile trade off to me :slight_smile:

Hmm… You’re saying we don’t actually implement deferred fetching and leave it for post-genesis. Do I interpret that correctly?

That wasn’t my original intention. With the exception of the added complexity in sync, the rest seems pretty straightforward to me, do you agree?

If most of the complexity is in sync, then we can defer only that part to post-genesis. This gives us protection for online nodes, and indirectly also for syncing nodes, since those will be protected as long as their neighbors are honest. It would mean that the only unprotected attack vector is against nodes that are syncing and have an attacking peer. That’s a relatively small systemic risk (unlike deferring the whole thing to post-genesis).

ok. yes. that makes sense.

not exactly. in the old/current model

  • upon receiving a gossiped proposal/ballot, the node stops at the first block in the ballot it fails to fetch and declares the ballot syntactically invalid. it will not store the ballot and will not relay the gossip.

  • during sync, it will ignore block-fetching failures and move on to the next layer. it will only retry a layer that tortoise isn’t able to verify.

@countvonzero you’re talking about failed fetches, i.e. unavailable blocks. This is indeed the problem in the new model, but the current model has a bigger problem than failed fetches. The problem is successful fetches :slight_smile:

In the current model, a ballot can introduce, with a single vote, a block that wasn’t created by the Hare. If the attacker makes that self-made block available (i.e. a successful fetch) along with the ballot that votes for it, everyone will be forced to store that block forever. Each ballot can introduce multiple such blocks, depending on how we limit the size of the diff list.

So to summarize:

Failed fetches
  Current model: Not a problem - they get stopped when the first node gets a ballot and considers it syntactically invalid.
  New model: A small problem - nodes may need to retry fetching blocks for a while if the adversary is very strong.

Block spam
  Current model: Serious issue - every node must store every block referenced in a ballot. The attacker may create ballots that reference many blocks and make them available. They would also maximize the size of each block and get free perpetual storage.
  New model: Not a problem - only blocks that have enough votes to actually be valid need to be fetched and stored. Even if an adversary temporarily controls the network they can cause limited damage with this attack and it gets reversed quickly when they lose control.

ok. this totally makes sense now. i completely missed the point (not surprisingly). hahah. thanks for spending the time to help me understand.

That wasn’t my original intention. With the exception of the added complexity in sync, the rest seems pretty straightforward to me, do you agree?

tortoise changes don’t seem straightforward, in particular efficiently representing multiple targets in memory

An attacker could create many ballots and maximize the diff list. If we limit the diff list to 20 diffs, and the attacker can add 50 ballots per layer (e.g. by under-reporting seen ATXs) then they can add up to 1.26 GB/year of spam. This isn’t great, but it’s not a cheap attack and I don’t see a show stopper here.

are you sure that 20 is sufficient for an honest node to encode a difference? one extreme example is self-healing after a 50/50 split. both sides will have to vote according to the weak coin, and in case of a negative weak coin the tortoise can select a base ballot from before the partition started, but a positive weak coin will require adding exceptions for the full duration of the partition (so e.g. if the partition lasted for 2000 layers, the diff list will be 2000 block ids).

it probably doesn’t change the big picture, but anyway changes those numbers significantly

i don’t know how to efficiently implement conflicting votes and votes against blocks. the implementation either has to store a mapping from block id to layer/height for every block that a ballot votes for, or recursively find a block with the same id. neither approach is viable.

@noam please give this a thought when you are back. i think this is not implementable as it is

let’s say there is a vote for a block with id = X. a dishonest node votes that block X is supported in layer 10. later the dishonest node casts a vote for the same block X but in layer 1000000. the problem is that the conflicting vote can be long ago (time-wise and layer-wise). the same problem applies to decoding an against vote; by decoding an against vote i mean that we need to find the layer and height that were previously supported. it is obvious that we can’t keep any in-memory structure that allows fast validation. but there is the same problem with an on-disk structure: it won’t be viable if we always have to run a scan without a lower layer boundary.

for nodes that follow the protocol (honest) there will be no conflicting votes, and all against votes will be cast within a known distance from the ballot that casts them. let’s say this distance is N and it should be configured the same for every node in the cluster. this distance also determines the practical self-healing distance. if we assume that this distance is 2000, an honest node will be able to encode a difference that started ~1 week ago. in order to decode an against vote, the implementation will have to scan all supported blocks within 2000 layers before the ballot that cast the against vote. if it can’t find the target block within that distance, the ballot is not decodable and such a ballot is from a dishonest node.

proposed changes:

  1. drop synchronous validation for conflicting votes. it is practically very expensive and doesn’t affect the protocol. it can be done in the background as an optimization to cancel identities that cast conflicting votes (for the future)
  2. two alternatives for decoding against votes
  • encode with layer and height (same as support). cheap decoding (no need to scan 2000 layers into the past). more storage
  • encode only block id. more expensive decoding, less storage
    i am leaning towards the first option as it leads to simpler code, and we still have an option to optimize space later by updating the Votes structure (see the sketch below).
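
Under the first alternative, the vote encoding could look roughly like this; the Votes field names and the BallotID type are illustrative, and the point is only that against votes carry layer and height just like support votes.

// Votes is a sketch of a ballot's opinion under alternative 1: both support
// and against entries reuse the Vote struct, so an against vote can be decoded
// without scanning past layers. Names are illustrative.
type Votes struct {
	Base    BallotID // base ballot whose opinion the diff lists modify
	Support []Vote   // blocks supported, with their claimed layer and tick height
	Against []Vote   // blocks voted against, encoded the same way
}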

We just discussed this on the Sandwich call and, if we understand the explanation correctly, we (mostly I) agree with your suggestion. About 2 definitely - it’s a small cost and easy to change in the future. About 1, I wish this wasn’t the case, but it’s also not terrible, so just do it if it makes this easier.

this is implemented.