Early Base Gets the Weight?

Definitions and Notations — Recap

  • Tick count: the number of PoET ticks in an ATX’s PoST proof (i.e., it should correspond to one “epoch’s worth” of PoET work)

  • Positioning ATX: an ATX (supposedly in the previous epoch) whose ID gets included in the PoET challenge for the ATX’s PoST proof.

    Honest nodes always choose the ATX with the highest tick-height as their Positioning ATX.

    A special Golden Positioning ATX with a hardcoded ID can be used instead of a real ATX (this should only happen in the first epoch in which ATXs are generated). Thus, every ATX has a chain of Positioning ATXs that terminates in the Golden Positioning ATX.

  • Tick height: the sum of tick counts in the Positioning ATX chain, including the tick-count for the ATX itself. (The Golden Positioning ATX has a tick-count of 0).

  • Base height: the tick-height of the Positioning ATX.

  • Full ATX weight: the tick-count of the ATX multiplied by the ATX space.

  • Full ballot weight: the full ATX weight divided by the number of ballots declared in the epoch.
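For concreteness, here is a minimal sketch of these definitions in Go (the types and field names are illustrative, not the actual go-spacemesh structures):

```go
package main

import "fmt"

// Hypothetical, simplified ATX record; field names are illustrative,
// not the actual go-spacemesh types.
type ATX struct {
	ID          string
	TickCount   uint64 // PoET ticks proven by this ATX's PoST proof
	Space       uint64 // committed space units
	Positioning *ATX   // nil stands for the Golden Positioning ATX (tick count 0)
}

// TickHeight sums the tick counts along the positioning chain,
// including this ATX's own tick count.
func (a *ATX) TickHeight() uint64 {
	if a == nil {
		return 0 // Golden Positioning ATX
	}
	return a.Positioning.TickHeight() + a.TickCount
}

// BaseHeight is the tick height of the positioning ATX.
func (a *ATX) BaseHeight() uint64 { return a.Positioning.TickHeight() }

// FullWeight is the tick count multiplied by the ATX space.
func (a *ATX) FullWeight() uint64 { return a.TickCount * a.Space }

// FullBallotWeight divides the full ATX weight among the ballots declared in the epoch.
func (a *ATX) FullBallotWeight(numBallots uint64) uint64 { return a.FullWeight() / numBallots }

func main() {
	a1 := &ATX{ID: "a1", TickCount: 100, Space: 4, Positioning: nil} // positioned on the Golden ATX
	a2 := &ATX{ID: "a2", TickCount: 100, Space: 4, Positioning: a1}
	fmt.Println(a2.TickHeight(), a2.BaseHeight(), a2.FullWeight()) // 200 100 400
}
```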

Problem Summary

The problem we’re trying to solve is that an adversary can choose an “early” positioning ATX and work for an arbitrarily long time to create an ATX with an arbitrarily large tick-count (and thus arbitrarily large weight). Although the total resources used by the adversary in each epoch are small, this concentrates all the ATX weight into a single epoch, so the single ATX could have an arbitrarily large fraction of the total epoch weight.

Solution Idea

The main idea is that, for each block, we compute the block height so that it is approximately the same as the base height of honest ATXs in that epoch. We can now define:

  • Block-relative ATX tick-count: For an ATX with tick-height h_a and tick-count t_a relative to a block with block-height h_b, the block-relative ATX tick-count is

    t_{rel}=\max(0, \min(t_a, h_a - h_b))
  • Block-relative ATX weight: The block-relative tick-count \times the ATX space

  • Block-relative ballot weight: The block-relative ATX weight divided by the number of declared ballots in that epoch.

By using the block-relative ballot weight instead of the full ballot weight, ATXs with an early base don’t get extra weight.
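As a sketch, the block-relative quantities can be computed as follows (hypothetical helpers, assuming integer tick heights and weights):

```go
package sketch

// BlockRelativeTicks computes t_rel = max(0, min(t_a, h_a - h_b)) for an ATX with
// tick count tickCount and tick height tickHeight, relative to a block with
// block height blockHeight. (Illustrative helper, not the node's API.)
func BlockRelativeTicks(tickCount, tickHeight, blockHeight uint64) uint64 {
	if tickHeight <= blockHeight {
		return 0 // the ATX's chain does not reach above the block height
	}
	rel := tickHeight - blockHeight // h_a - h_b
	if rel > tickCount {
		rel = tickCount // never more than the ATX's own tick count
	}
	return rel
}

// BlockRelativeBallotWeight is the block-relative tick count times the ATX space,
// divided by the number of ballots declared in the epoch.
func BlockRelativeBallotWeight(tickCount, tickHeight, blockHeight, space, numBallots uint64) uint64 {
	return BlockRelativeTicks(tickCount, tickHeight, blockHeight) * space / numBallots
}
```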

Of course, there are still several issues that need to be addressed:

  • How do we set the block height?
  • Is there consensus on this?
  • What about the Hare?

Here’s an idea for how to compute and record the block height without getting into circular dependencies.

High-Level Overview

  • The block height will be the minimum base height of the ATXs corresponding to the proposals that were used to construct it
  • Hare participants will vote against any proposal whose tick count is more than twice their own tick count, or whose tick height is less than their base height.
    • This assumes honest nodes have PoET speeds within a factor of 2, so honest parties will never fall afoul of these rules.

The intuition is that an adversary will be able to concentrate at most two epochs’ worth of weight, since if it tries to make the block-height earlier, its tick-count will be too large, and all honest parties will vote against its proposal. Since an adversary can already concentrate two epochs’ worth of weight (by starting at the beginning of epoch i and running PoET until near the end of epoch i+1), the situation isn’t worse.

On the other hand, all honest proposals will be accepted, so they will all get their full weight (and future honest ballots won’t have base height much lower than the minimum honest proposal height).

(I think we can trade off security against the honest PoET speed difference by making the rules more stringent: e.g., if honest parties vote against proposals with tick-count more than (1+\epsilon) times their own tick-count, we’d need to assume honest PoET speeds are within a factor of (1+\epsilon), but correspondingly reduce the adversarial advantage.)
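A minimal sketch of the Hare-side filtering rule, with \epsilon as a parameter (\epsilon = 1 gives the factor-of-2 rule above; the names are illustrative, not the actual proposal type):

```go
package sketch

// ProposalInfo is a hypothetical summary of the fields a Hare participant
// inspects; not the actual proposal type.
type ProposalInfo struct {
	TickCount  uint64 // ticks in the proposal's ATX
	TickHeight uint64 // tick height of the proposal's ATX
}

// AcceptProposal is the filtering rule sketched above: reject proposals whose
// tick count is more than (1+epsilon) times my own, or whose tick height is
// below my base height.
func AcceptProposal(p ProposalInfo, myTickCount, myBaseHeight uint64, epsilon float64) bool {
	if float64(p.TickCount) > (1+epsilon)*float64(myTickCount) {
		return false // suspiciously concentrated PoET work
	}
	if p.TickHeight < myBaseHeight {
		return false // built on an earlier base than mine
	}
	return true
}
```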

Details

Hare Participation

Since we’re using the Hare to agree on the block height, it’s important to ensure that we can select Hare participants fairly without having to recursively use this solution. To do this, for the purposes of Hare selection, we assume each ATX in the Hare active set has exactly 1 unit of time (i.e., its weight is equal to its space). This means the concentration attack is ineffective against the Hare.

On the other hand, an adversary can create ATXs with a very small tick-count, but they would still count as a whole epoch’s worth of weight. I think this is ok, because the adversary can’t reuse the same identity multiple times in an epoch, and creating a new identity using the same space is more expensive than storing the data for an entire epoch.

Optional

If we want to be extra sure (or if there’s some attack I’m not thinking of at the moment), we can add an additional rule to the grading scheme for Hare Active Set selection:

  • Give grade 0 to ATXs whose tick count is less than c_1 times my own tick count (where c_1 > \frac{1}{2})
  • Give grade 1 to ATXs whose tick count is between c_1 and c_2=\sqrt{c_1} times my own tick count
  • Give grade 2 to ATXs whose tick count is at least c_2 times my own tick count

The values 1/2<c_1<c_2 imply assumptions about PoET speed differences between honest parties. We need to assume that for every two honest parties A and B, it always holds that A's tick-count is at least c_2=\sqrt{c_1} > \sqrt{1/2} > 0.70 times B's tick-count; this implies that if B gave grade 2 to an ATX, A gave that ATX grade at least 1.

The grading criterion ensures that each ATX in the active set must have used more than half an epoch’s worth of PoET ticks.
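A sketch of the optional grading rule (illustrative only; c_1 is a parameter satisfying 1/2 < c_1 < 1):

```go
package sketch

import "math"

// GradeATX implements the optional grading rule: grade 0 below c1 * myTicks,
// grade 1 between c1 and c2 = sqrt(c1) times myTicks, grade 2 at or above
// c2 * myTicks. c1 must satisfy 1/2 < c1 < 1.
func GradeATX(atxTicks, myTicks uint64, c1 float64) int {
	c2 := math.Sqrt(c1)
	t := float64(atxTicks)
	switch {
	case t < c1*float64(myTicks):
		return 0
	case t < c2*float64(myTicks):
		return 1
	default:
		return 2
	}
}
```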

Preventing Block-Injection

Our current rule is that the lowest-height valid block in each layer is the “active” block for that layer (i.e., the only one that is executed). The reason for this is that an adversary in the future can create a block in a past epoch, but with a high block-height, such that no old ballot can vote against it. Thus, a short period of dishonest majority can let the adversary create a valid block arbitrarily far in the past. The “lowest block height” rule ensures that such a block won’t change history, since it will have a higher block-height than the “real” historical block.

However, we now have to be careful that the “early base” concentration attack doesn’t allow the adversary to add their own block with a low block-height to the current layer (which would then become the active block). IIRC, this is something we’ve talked about in the past, but I’m not sure the solution is implemented: Every ballot with tick-height h is considered a vote against every block in a future layer whose block-height h_b<h.

This means that if an adversary tries to inject a block with a low block-height, it will immediately be rejected, since all ballots whose tick-height is above this block’s block-height will be voting against it, including ballots in previous layers.
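As a sketch, the implicit-vote rule is just a comparison (illustrative, not the actual tortoise code):

```go
package sketch

// ImplicitlyAgainst captures the rule above: a ballot with tick-height
// ballotTickHeight counts as a vote against any block in a later layer whose
// block-height is below the ballot's tick-height.
func ImplicitlyAgainst(ballotLayer, ballotTickHeight, blockLayer, blockHeight uint64) bool {
	return blockLayer > ballotLayer && blockHeight < ballotTickHeight
}
```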

Empty layers

IIRC, we currently signify a vote for an empty layer by voting for a special “empty-layer” block-id. Of course, in this case, we don’t have an explicit block-height and can’t encode one.

I think it’s actually ok to give the “empty layer block” block-height 0 as far as voting is concerned, but treat it as \infty for the purposes of the active block rule (and with the caveat that only ballots in future layers can vote for it). Essentially, this means every ballot votes with its full weight for the empty layer, but if both the empty layer and another block are valid, the other block will be active.
(IIRC, this is what we currently do anyway).
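A sketch of how the two heights for the empty layer could be represented under the rule as written above (illustrative, assuming the active-block rule picks the lowest height):

```go
package sketch

import "math"

// votingHeight is the height used when counting votes for the "empty layer"
// target: zero, so every ballot can support it with its full weight.
func votingHeight(isEmptyLayer bool, blockHeight uint64) uint64 {
	if isEmptyLayer {
		return 0
	}
	return blockHeight
}

// activeRuleHeight is the height used for the lowest-height active-block rule:
// effectively infinite for the empty layer, so any valid real block wins.
func activeRuleHeight(isEmptyLayer bool, blockHeight uint64) uint64 {
	if isEmptyLayer {
		return math.MaxUint64 // stands in for infinity
	}
	return blockHeight
}
```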

The intuition is that the empty layer vote is really shorthand for “vote against all blocks in this layer”. If another block is valid in the layer, its margin must be positive, so it can’t be that “vote against all blocks” is really valid; the only reason it would appear to be so is that we overestimated the vote weight (e.g., by counting the full weight of adversarial ballots that are not allowed to vote against blocks in this layer with their entire weight).

We do need to check whether this could cause problems with the verifying tortoise…

Problems with above empty-layer approach

In our last call, we realized that the empty-layer approach above is still problematic, since it doesn’t prevent the block-injection attack when the layer was previously empty.

The attack goes like this:

  • Layer i is empty for some reason.
  • n layers later (for arbitrarily large n), there’s a short-term dishonest majority (e.g., from layer i+n to layer i+n+k).
  • At layer i+n, the adversary creates a block claiming to be at layer i, but with block-height equivalent to layer i+n.
  • It then uses the dishonest majority to vote for this block in layers i+n,\dotsc,i+n+k.

According to the rules above, no ballot in a previous epoch can vote against the injected block, since the ballot tick-heights are lower than the block-height. Thus, given the dishonest majority, the injected block can pass the tortoise threshold and become valid (after that point, even honest parties will vote for the injected block).

Since we’re treating the empty layer as having height \infty for the purposes of the active block, the newly-injected block becomes active, and history is rewritten from layer i onwards.

Note that the problem with the attack isn’t that we can rewrite history but that we can rewrite history arbitrarily far into the past, without paying a resource price proportional to the length of history we’re changing. (E.g., with a dishonest majority for k layers, we can’t prevent a rewrite of O(k) past layers, but with this attack there is no limit to the number of layers that are changed)

Proposed Solution: Preemptive Block Injection

I propose the following simple solution:

  • We leave the empty-layer rules exactly as written above.
  • However, when a node is in confident consensus that (1) layer i is empty and (2) there is an active block B in layer i-1, then
    • The node will “preemptively inject” an empty block into layer i.
    • The empty block is not the empty-layer id. Instead, it is a “real” block, but its effect on the global state is identical to that of an empty layer.
    • The important difference between an empty block and an empty layer is that an empty block has an explicit block-height
    • The block-height of the injected empty block will be the height of block B.

Note that a layer that has an empty block is not considered empty (so if layer i already has an empty block in consensus, honest nodes won’t inject an additional empty block).
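A sketch of the injection condition (hypothetical types; a real node would derive these flags from tortoise/Hare confidence):

```go
package sketch

// LayerView is a hypothetical summary of what a node is confident about for a
// layer; not the actual consensus state.
type LayerView struct {
	ConfidentEmpty    bool    // confident consensus that the layer has no valid block
	ActiveBlockHeight *uint64 // height of the active block, if one is in confident consensus
	HasEmptyBlock     bool    // an "empty block" was already injected into this layer
}

// MaybeInjectEmptyBlock returns the block height of the empty block to inject
// into layer i, or false if the preconditions above are not met: layer i must
// be confidently empty (and not already contain an empty block), and layer i-1
// must have a confidently active block B; the injected block copies B's height.
func MaybeInjectEmptyBlock(layerI, layerIMinus1 LayerView) (uint64, bool) {
	if !layerI.ConfidentEmpty || layerI.HasEmptyBlock {
		return 0, false
	}
	if layerIMinus1.ActiveBlockHeight == nil {
		return 0, false
	}
	return *layerIMinus1.ActiveBlockHeight, true
}
```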

Hand-wavy Analysis

For every empty layer i, honest nodes will eventually reach confident consensus about this (due to tortoise self-healing, if not before). If it happens due to a Hare failure that is itself in consensus, the consensus about the empty layer will be immediate.

Suppose i is the first empty layer. Then at some point honest nodes will reach confident consensus about an active block B in layer i-1. At that point, they will all inject the empty block E_i into layer i with the same block-height as B, so they will all be voting for the same empty block, and hence will reach consensus about E_i, which will become the active block for layer i.

By induction, eventually a block will be injected in every layer (if both i and i+1 are empty, then i will get an empty block with some height h, following which layer i+1 will get an empty block with height h as well).

Resistance to adversarial block-injection

Since the preemptive block-injection will happen very quickly (because consensus happens quickly), we will no longer have empty layers far in the past, so the block-injection attack described above will no longer work (the empty blocks have a low block height, so will remain active if an adversarial block with future block-height is injected).

Self-healing

It’s possible that multiple empty blocks will be created in some layer i (or a combination of empty and non-empty blocks). For example, this could happen during a network split. After a rejoin, standard self-healing will ensure one of the following outcomes:

  • A single block remains valid in layer i: This is the best case.
  • No block remains valid: in this case, the parties will be in consensus that layer i is empty, and will preemptively inject a new empty block once they are in consensus about layer i-1.
  • Multiple blocks remain valid: In this case, the lowest-height valid block will be active. This could be either an empty block or a non-empty block, but in any case we’ll be in consensus about the active block in layer i.

@talm do you agree that the cleanest way to mitigate the concentration attack is to take the median of the ticks of the identities that vote for the UCB, weighted according to their space units, and use this median as their tick count? And similarly for the Hare active set: the weight of each identity uses as its tick count this same median of the ticks of all the identities in the agreed-upon Hare active set (where the median is weighted according to space units).

Yes at https://community.spacemesh.io/t/new-history-reversal-attack-and-mitigation/268/6

Let me make sure whether I understand your proposal regarding empty layers:

Since the honest miners are voting neutral while no UCB is in agreement within the hare distance (hdist), are you suggesting the following?

  • Honest miners will immediately inject and vote in favor of an empty block for layer i after hdist was reached without hare agreement, in the case that, in their view, layer i-1 is non-empty.
  • Honest miners will vote against everything in layer i (a.k.a. vote for the empty layer i) only in the case that in their view layer i-1 is still empty.

If that’s what you meant, then when there’s finally agreement on layer i-1, the honest miner’s ballot will simultaneously vote in favor of the UCB of layer i-1 and the empty block of layer i?

If you mean the median of the ticks of the identities that were included in the UCB (i.e., whose proposals were used to construct it), then possibly (it seems ok, but I’m a little less confident about it than what I proposed).

Regarding the Hare, what I proposed is just using 1 time unit for the weight, and ignoring the tick count. I don’t see why having a different number would do anything here.

No, honest miners will inject an empty block only if they are confident that the layer is empty and that the previous layer has a single block. This means they require that the vote margin for every proposed layer-i block is under the negative confidence threshold. This could happen immediately after hdist is passed, but it might take longer (e.g., if there is disagreement about whether or not there was a block in that layer).

If honest miners are confident that layer i is empty, but not yet confident about layer i-1, they will continue voting for the empty layer i (rather than injecting an empty block). And indeed, in this case when a node becomes confident about layer i-1, its ballot will simultaneously vote for the block in layer i-1 and inject an empty block (with the same block-height) into layer i.

@talm Can you say what might potentially be bad with median that’s weighted according to space units? As we discussed, the effective tick count of each individual miner will be the minimum between this median and his own tick count. So an adversary that tries to concentrate ticks will fail because he’d only get this median as his effective tick count, and any miner below this median will just stay with his own tick count.
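For concreteness, a rough sketch of the space-weighted median and the min(median, own ticks) cap (illustrative code, not an actual implementation):

```go
package sketch

import "sort"

// Identity holds the per-smesher quantities used in the weighted-median
// proposal above (illustrative struct, not a node type).
type Identity struct {
	Ticks uint64
	Space uint64
}

// WeightedMedianTicks returns the space-weighted median of the tick counts:
// the smallest tick value at which the cumulative space reaches half the total.
func WeightedMedianTicks(ids []Identity) uint64 {
	sorted := append([]Identity(nil), ids...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Ticks < sorted[j].Ticks })
	var total uint64
	for _, id := range sorted {
		total += id.Space
	}
	var cum uint64
	for _, id := range sorted {
		cum += id.Space
		if 2*cum >= total {
			return id.Ticks
		}
	}
	return 0 // empty input
}

// EffectiveTicks is the per-identity cap from the proposal: min(median, own ticks).
func EffectiveTicks(own, median uint64) uint64 {
	if own < median {
		return own
	}
	return median
}
```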

I’m not understanding: if miner1 has few ticks (i.e., stored the data for a short time) and miner2 has many ticks, then we should give miner2 relatively more weight when being selected for participation in the hare (and for earning hare rewards)?

About the empty layer:

BTW you didn’t mention here explicitly that the reason we need \infty for the ticks of the empty layer is that otherwise it defeats the protection against the history reversal attack of https://community.spacemesh.io/t/new-history-reversal-attack-and-mitigation/268 (i.e., an adversary that performs this attack would vote in favor of the empty layer and thereby invalidate the non-empty UCB of the old layer).

I think that we didn’t discuss the idea of choosing the active content block to be the block with the highest total vote count (and having extra tortoise voting rules that use a weak coin to vote in favor of only one among multiple content blocks of the same layer, to ensure convergence), instead of taking the block with the lowest tick count to be the active block.

So far it seems to me that your preemptive empty block injection is good.

is there a protocol reason why we need to support variable-tick poet proofs? i think the problem with concentrating ticks doesn’t exist if each atx needs exactly N ticks. poet worked this way, and was changed only recently; what was the reason behind that change?

tick count can be a protocol parameter, and set to be slightly less than epoch time (based on reference poet, which should not be too bad). are there any protocol concerns? i can think of several operational concerns, but they are still better than the whole tick count problem and suggested changes in the consensus.

in case we want to support variable-tick poet proofs, we can bound them by 2x (so if the reference poet has 10000 ticks, a valid poet will be in the range [1, 20000]). a dishonest node can concentrate only 2x this way, so we can handle the concentration attack a bit differently. an atx will need to commit to a beacon. if it uses the same beacon as we do and has 2x speed, then this is an honest atx with a faster poet. otherwise we delay counting ballots that use such an atx in consensus; note that we do this anyway.

in this last case changes are:

  • set max poet ticks
  • move beacon reference from first ballot to atx (beacon will have to complete before atx is submitted to poet)

@dmitry if non-variable ticks mean a centralized (“reference”) entity that decides how many ticks are needed per epoch, then we don’t want that (if this centralized entity is benign then we can give it even more responsibilities in other aspects of the protocol, otherwise it can do severe damage).
If non-variable ticks mean that it’s fixed forever as const N ticks, then when CPUs become faster it’s rational to re-initialize the storage multiple times per epoch (and we wouldn’t want the honest smeshers to re-initialize).

It’s not so easy to detect who’s honest. We discussed on slack the idea of bounding the ticks relative to the previous epoch: https://drive.google.com/file/d/1VnjF9vqJP-b4bzlAbD_-M5dp2_-QarOX/view?usp=share_link

if non-variable ticks mean a centralized (“reference”) entity that decides how many ticks are needed per epoch, then we don’t want that (if this centralized entity is benign then we can give it even more responsibilities in other aspects of the protocol, otherwise it can do severe damage).
If non-variable ticks mean that it’s fixed forever as const N ticks, then when CPUs become faster it’s rational to re-initialize the storage multiple times per epoch (and we wouldn’t want the honest smeshers to re-initialize).

i think my proposal is somewhere in the middle between the two. tick count can be updated using a hard fork. a spacemesh client user will be aware of the hard fork, and will consider whether to accept this update or not. i understand that it may seem like a crutch, but it simplifies things significantly…

It’s not so easy to detect who’s honest.

maybe honest is not the correct word here; we just delay ballots that use a beacon different from the node’s local beacon value. we already do that to mitigate another concentration attack.

That’s non-ideal to say the least :slight_smile: Let’s first confine ourselves to non-hardfork methods and see whether the best that we come up with (which so far is the preemptive empty block injection) is pleasant enough.

If there’s honest majority of ATXs then there’s agreement on the beacon value and we’d get fast consensus in the current epoch. If the same can be true for PoET ticks then delayed ballots would be good.
As you told me elsewhere, our current PoET tick value is any integer between 0 and 10000 (and can be higher than 10000), so maybe you’re suggesting that we disallow fine-grained PoET tick values and only allow jumps from 10000 to, say, 11000, 12000, …, to ensure that the honest majority agrees on the PoET tick value under supposedly reasonable assumptions regarding the fastest possible CPU speedup? Still, an adversary with a faster PoET would be able to cause disagreement on the epoch’s PoET tick value among the honest ATXs, by allowing only a subset of them to use its PoET (in other words, the attack surface of an adversary with a faster PoET now becomes larger).

have several questions to discuss

injecting empty block breaks verifying tortoise

as a reminder, the verifying tortoise maintains an opinion vector for the whole history. instead of counting every ballot separately, we check whether the opinion on the ballot matches the local opinion; if so, this ballot’s weight is counted as “good”, and once the “good” weight crosses the pessimistic threshold our local opinion is considered to be valid.

the current implementation uses an opinion hash for finding “good” ballots. the opinion hash is a recursive encoding of the opinion, where the opinion on layer X is hash(layer X-1 opinion concatenated with the supported voting targets, or an abstain mark). if a ballot votes for an empty layer, the opinion on layer X will simply be hash(layer X-1 opinion).

so changing the opinion on a historical layer will invalidate previously “good” ballots, and there won’t be enough weight to stay in verifying mode.
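rough sketch of the recursive encoding (illustrative, not the exact encoding the node uses); it shows why changing the opinion on one historical layer changes every later opinion hash:

```go
package sketch

import "crypto/sha256"

// OpinionHash sketches the recursive encoding described above: the opinion on
// layer X is a hash of the opinion on layer X-1 concatenated with the supported
// voting targets (or an abstain mark); a vote for an empty layer hashes only
// the previous opinion.
func OpinionHash(prev [32]byte, supported [][]byte, abstain bool) [32]byte {
	h := sha256.New()
	h.Write(prev[:])
	switch {
	case abstain:
		h.Write([]byte{0xff}) // abstain mark
	case len(supported) == 0:
		// empty layer: nothing appended beyond the previous opinion
	default:
		for _, target := range supported {
			h.Write(target)
		}
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}
```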

data structure to maintain weight of the atxs indexed by epoch and height

Every ballot with tick-height h is considered a vote against every block in a future layer whose block-height h_b < h.

i read it as: there should be an efficient way to look up the weight of atxs from all epochs before the block’s epoch whose height is higher than the block height. more precisely, the data structure should support two operations:

lookup(block.epoch()-1, block.height)
add(atx.target_epoch(), atx.height, atx.weight)

i will use an interval b-tree or some variant of a trie as an in-memory index for efficient lookups in the last epoch. this in-memory data structure will need to store data since genesis. additionally there will be an on-disk index for all epochs before the last.
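a naive reference version of the two operations, just to pin down the semantics (a real index would use per-epoch sorted arrays with prefix sums, or the interval b-tree above):

```go
package sketch

import "sort"

type atxEntry struct {
	height uint64
	weight uint64
}

// WeightIndex supports the two operations listed above:
// Add(atx.target_epoch(), atx.height, atx.weight) and
// Lookup(epoch, height) = total weight of ATXs in epochs <= epoch whose
// height is strictly above the given height.
type WeightIndex struct {
	byEpoch map[uint32][]atxEntry // per-epoch entries kept sorted by height
}

func NewWeightIndex() *WeightIndex {
	return &WeightIndex{byEpoch: map[uint32][]atxEntry{}}
}

func (w *WeightIndex) Add(epoch uint32, height, weight uint64) {
	entries := w.byEpoch[epoch]
	i := sort.Search(len(entries), func(i int) bool { return entries[i].height >= height })
	entries = append(entries, atxEntry{})
	copy(entries[i+1:], entries[i:])
	entries[i] = atxEntry{height: height, weight: weight}
	w.byEpoch[epoch] = entries
}

func (w *WeightIndex) Lookup(maxEpoch uint32, aboveHeight uint64) uint64 {
	var total uint64
	for epoch, entries := range w.byEpoch {
		if epoch > maxEpoch {
			continue
		}
		// entries are sorted by height; sum the suffix strictly above aboveHeight.
		i := sort.Search(len(entries), func(i int) bool { return entries[i].height > aboveHeight })
		for _, e := range entries[i:] {
			total += e.weight
		}
	}
	return total
}
```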

are there simpler ideas?

cancelling hare blocks from honest nodes

Every ballot with tick-height h is considered a vote against every block in a future layer whose block-height h_b < h.

  • The block height will be the minimum base height of the ATXs corresponding to the proposals that were used to construct it
  • Hare participants will vote against any proposal whose tick count is more than twice their own tick count, or whose tick height is less than their base height.

if i understood correctly, these two rules work only if everyone selects the best possible atx, and nothing less. otherwise there will always be some weight that implicitly votes against blocks. even if a smesher picks a slightly worse positioning atx (even by 1%), it can be selected as the block height, and then those better atxs will vote against such a block with full weight (as discussed above).

do you understand it the same way?