Sync/serve protocols

Actually, the tool for doing this already exists: it is simply the rewarded ATXs, as specified in ATX reward with a sliding window.

  • The grading (as mentioned in my last post) isn’t needed at all in the sync protocol.
  • Instead, using notation similar to that of the last post, Alice and Bob will use c_all_excluding_late_atxs := c_all XOR c_old_excluding_late_atxs, where c_old_excluding_late_atxs will exclude ATXs of the last epoch (before c_curr starts) that weren’t rewarded according to ATX reward with a sliding window.
  • Precise details:
    • The adversary can always send a never-before-seen \text{ATX}_0 before the cutoff time w.r.t. when the honest Alice and the honest Bob start their sync, so that Alice will include \text{ATX}_0 in her fingerprint and Bob won’t include \text{ATX}_0 in his fingerprint, and thus the fast detection (1st step in the last post) fails.
    • Our objective here is that the adversary won’t be able to use an \text{ATX}_0 that might be rewarded even though it caused the fast detection to fail.
      • In other words, if the adversary sends a never-before-seen \text{ATX}_0 such that \text{ATX}_0 declares itself to belong to an epoch from say 1 year ago, and only Alice (and not Bob) receives \text{ATX}_0, then indeed the fast detection should fail.
        • Because the whole purpose of the sync is that Alice will transmit \text{ATX}_0 to Bob.
        • In this case, \text{ATX}_0 is a syntactically valid ATX that the adversary expended spacetime resources to create, but the adversary won’t earn any reward for \text{ATX}_0.
    • Suppose for example that c_curr has the ATXs received in April, and c_old has the ATXs received in March.
    • Suppose that the last epoch before April is, say, E_{125}.
    • For the fast detection, Alice and Bob disregard c_curr, and for c_old_excluding_late_atxs they use all the rewarded ATXs of E_{125} together with all the ATXs that were received in March (according to the local time of Alice and Bob) that don’t declare themselves to belong to E_{125}
    • If we haven’t implemented ATX rewards yet, then we can take c_old_excluding_late_atxs to be just all the ATXs that were received in March that don’t declare themselves to belong to E_{125} (i.e., we exclude all the ATXs that declare themselves to belong to E_{125}).
    • Note that not sync’ing the late ATXs of March is fine, just like Alice and Bob don’t sync yet the ATXs in c_curr for April (because they expect to receive recent ATXs via gossip).
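
The March/April example above can be sketched in a few lines. This is only an illustration under assumptions that are not in the posts: the `Atx` record, the month labels, the 32-byte SHA-256 IDs, and the function names are all hypothetical, and the fingerprint is a plain XOR of IDs. Passing an empty `rewarded_ids` corresponds to the case where ATX rewards aren't implemented yet.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Atx:
    atx_id: bytes        # assumed to be a 32-byte hash (hypothetical)
    epoch: int           # epoch the ATX declares itself to belong to
    received_month: str  # local receive bucket, e.g. "march" (hypothetical label)


def xor_fingerprint(atx_ids):
    """XOR all IDs together; the order of the IDs does not matter."""
    acc = 0
    for atx_id in atx_ids:
        acc ^= int.from_bytes(atx_id, "big")
    return acc


def c_old_excluding_late_atxs(atxs, old_month, last_epoch, rewarded_ids=frozenset()):
    """Fingerprint of the old window, skipping unrewarded ATXs of the last epoch."""
    selected = set()
    for atx in atxs:
        if atx.epoch == last_epoch:
            # ATXs of the last epoch count only if they were rewarded.
            if atx.atx_id in rewarded_ids:
                selected.add(atx.atx_id)
        elif atx.received_month == old_month:
            # Any other ATX counts if it was received in the old window.
            selected.add(atx.atx_id)
    return xor_fingerprint(selected)


def make_atx(name, epoch, month):
    return Atx(hashlib.sha256(name.encode()).digest(), epoch, month)


shared = [make_atx("a", 124, "march"), make_atx("b", 125, "march")]
late = make_atx("late", 125, "march")  # adversarial ATX that only Alice received
rewarded = {shared[1].atx_id}          # "b" was rewarded, "late" was not

alice = shared + [late]
bob = shared

alice_fp = c_old_excluding_late_atxs(alice, "march", 125, rewarded)
bob_fp = c_old_excluding_late_atxs(bob, "march", 125, rewarded)
naive_alice = xor_fingerprint(a.atx_id for a in alice)
naive_bob = xor_fingerprint(a.atx_id for a in bob)

print("filtered fingerprints match:", alice_fp == bob_fp)
print("naive fingerprints match:", naive_alice == naive_bob)
```

In this toy run the filtered fingerprints agree while the naive ones don't, which is the property the fast detection relies on.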

Thanks for the explanation. It appears that when we have ATX rewards, the approach may be robust enough wrt attempts to degrade sync performance by publishing old ATXs.

Some points though. Maybe I still don’t completely understand some parts, but probably the sync should be aligned on epoch boundaries (perhaps 2 epochs instead of a single month). Otherwise, when the start of the month is in the middle of an epoch, it might be easier for two perfectly honest peers to disagree on which month a particular ATX belongs to due to different propagation times, which may degrade the performance. E.g. an ATX was created very close to midnight on March 1, March being the current month, so two peers may have different ideas about whether the ATX belongs to c_curr or c_old during March. c_all will be the same if they’re synced, but if they’re not, it might be somewhat of a problem. (Or maybe I’m misunderstanding some part of it?)

Without ATX rewards, I see another possible problem in temporary isolation of a node or a subnetwork of nodes. Basically, in such a subnet of nodes, some ATXs might have been generated and considered perfectly good within that subnet, yet when the nodes attempted to broadcast them, gossip somehow failed and then the nodes became isolated. When the nodes are connected to the Internet again some time later, these ATXs they considered “good” previously will not propagate further than the peers they sync with (e.g. due to bad grade); this means that they will not be in sync with any new peer they see, causing excessive sync traffic.

Other than that, it appears that this approach can only be applied after ATX merge. With current ATX counts (1.5M ATXs in epoch 16) it’s of course too expensive to send e.g. a month’s worth of ATX IDs upon sync (even though we do it now). When typical ATX counts become 5-10k per epoch after merge, that should become quite feasible, but it remains somewhat of a question whether it will remain so during further network growth (and also a question of how successful the merge will be).
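
For a rough sense of scale, here is a back-of-the-envelope calculation. It assumes 32-byte ATX IDs, which is my assumption and not something stated above, and it counts only the raw ID list with no framing or compression. The helper name is made up for illustration.

```python
ATX_ID_BYTES = 32  # assumption: each ATX ID is a 32-byte hash


def id_list_bytes(atxs_per_epoch, epochs=1, id_bytes=ATX_ID_BYTES):
    """Raw size of a plain list of ATX IDs, ignoring any protocol overhead."""
    return atxs_per_epoch * epochs * id_bytes


current = id_list_bytes(1_500_000)   # ATX count quoted for epoch 16
merged_low = id_list_bytes(5_000)    # lower post-merge estimate
merged_high = id_list_bytes(10_000)  # upper post-merge estimate

print(f"today:      {current / 1e6:.1f} MB per epoch")
print(f"post-merge: {merged_low / 1e3:.0f} to {merged_high / 1e3:.0f} KB per epoch")
```

Under these assumptions the ID list shrinks from tens of megabytes per epoch to a few hundred kilobytes; the `epochs` parameter scales this to a longer window.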

Every ATX is a self-contained data blob that declares itself to be in some specific epoch, and it’s either syntactically valid or not (also it can be syntactically valid but the miner ID is already malfeasant so it can be pruned, but for simplicity here let’s ignore this case).

In the mesh data structures we indeed need to be able to iterate over all the syntactically-valid ATXs that declare themselves to belong to (for example) epoch 101, but (as discussed above) for the purpose of the sync protocol we want another data structure (which as described above is c_1, c_2, c_3, …) that doesn’t care about epoch boundaries but instead cares only about the local timestamps of the full node at which the ATXs were received (again the reason is that it’s easy for a dishonest miner to inject ATXs that declare themselves to be in an out-of-whack epoch, and we don’t want such dishonest miners to degrade the efficacy of our sync).

So when you say “it might be easier for two perfectly honest peers to disagree on which month the particular ATX belongs to due to different propagation times”, this sounds confused. The month c_i of Alice that this ATX belongs to is according to the local timestamp at which Alice first received and validated this ATX, and the month c_j of Bob that this ATX belongs to is according to the local timestamp at which Bob first received and validated this ATX, and the epoch that this ATX belongs to is self-declared by the ATX itself (it’s immutable and publicly-verifiable by everyone).
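
To make the distinction concrete, here is a minimal sketch of the bookkeeping, assuming a hypothetical `SyncBuckets` class and made-up timestamps (none of these names or dates come from the posts). Each node files an ATX under the calendar month of its own first local receipt, while the epoch is read from the ATX itself.

```python
from datetime import datetime


class SyncBuckets:
    """Per-node index: local receive month and self-declared epoch, kept apart."""

    def __init__(self):
        self.bucket_of = {}  # atx_id -> (year, month) of first local receipt
        self.epoch_of = {}   # atx_id -> epoch declared by the ATX itself

    def on_validated(self, atx_id, declared_epoch, local_time):
        # Only the first receipt counts; repeated deliveries do not move the ATX.
        if atx_id not in self.bucket_of:
            self.bucket_of[atx_id] = (local_time.year, local_time.month)
            self.epoch_of[atx_id] = declared_epoch


alice, bob = SyncBuckets(), SyncBuckets()
atx_id = b"\x01" * 32  # placeholder ID

# The same ATX reaches Alice just before midnight and Bob just after.
alice.on_validated(atx_id, 125, datetime(2024, 3, 31, 23, 59, 58))
bob.on_validated(atx_id, 125, datetime(2024, 4, 1, 0, 0, 3))

print("Alice bucket:", alice.bucket_of[atx_id], "epoch:", alice.epoch_of[atx_id])
print("Bob bucket:  ", bob.bucket_of[atx_id], "epoch:", bob.epoch_of[atx_id])
```

The two nodes end up with different month buckets for the same ATX but the same epoch, which is exactly the situation being discussed.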

As discussed above, if an adversary sends an ATX that Alice receives in April, but due to network problems (such as a netsplit) this same ATX is received by Bob in June, then this ATX will be in c_i of Alice and c_j of Bob such that i\neq j, but when the honest Alice and the honest Bob sync for example in August then (assuming no attacks and no network problems) they won’t even know that they have this ATX in c_i and c_j for i\neq j because the accumulated (xor’d) fingerprint will be the same.
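
The April/June case can be checked with a toy example. This is a sketch under my own assumptions: IDs are SHA-256 digests of made-up names, a bucket fingerprint is the XOR of its IDs, and the accumulated fingerprint is the XOR of all bucket fingerprints. The function names are hypothetical.

```python
import hashlib


def atx_id(name):
    """Placeholder ID derived from a made-up name."""
    return hashlib.sha256(name.encode()).digest()


def bucket_fingerprint(ids):
    acc = 0
    for i in ids:
        acc ^= int.from_bytes(i, "big")
    return acc


def accumulated_fingerprint(buckets):
    """XOR of every bucket's fingerprint, i.e. XOR over all stored IDs."""
    acc = 0
    for ids in buckets.values():
        acc ^= bucket_fingerprint(ids)
    return acc


x, y, late = atx_id("x"), atx_id("y"), atx_id("late")

# Alice received `late` in April; Bob only received it in June.
alice_buckets = {"april": [x, late], "may": [y], "june": []}
bob_buckets = {"april": [x], "may": [y], "june": [late]}

same_total = accumulated_fingerprint(alice_buckets) == accumulated_fingerprint(bob_buckets)
same_april = bucket_fingerprint(alice_buckets["april"]) == bucket_fingerprint(bob_buckets["april"])

print("accumulated fingerprints match:", same_total)
print("April buckets match:", same_april)
```

Because XOR is commutative and associative, the accumulated value depends only on the set of IDs and not on which bucket each ID landed in.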

And specifically, as described in the above protocol (with “then Alice and Bob freeze their c_curr, and send the list of ATX IDs in the frozen c_curr to each other” and the rest of the details above), there’s also continuity where new ATXs are received, so the sync protocol can and should be launched at any point in time (but when Alice and Bob start the sync they can decide not to sync up to the latest checkpoint, specifically if we didn’t implement rewarded ATXs yet as described above).

@ivan4th is this clear, or do you still think that there’s an issue here that I’m missing?

I don’t understand why you say “these ATXs they considered “good” previously will not propagate further than the peers they sync with (e.g. due to bad grade)”. The outcome of the sync protocol needs to be that all the syntactically-valid ATXs (up to a recent checkpoint) are the same for Alice and Bob (bad grade can cause an ATX not to be rewarded, but it still needs to be sync’d and processed and stored by all the honest full nodes). The whole point of the above sync protocol that I described is to guarantee that if the honest Alice has some ATX and she starts a sync with the honest Bob, then at the end of the sync Bob will have this ATX too. If you think that there can be a situation in which Alice has a syntactically-valid ATX and Bob will never have this same ATX, then the sync protocol is broken. Can you explain why you think that this can happen?