Verifiable shuffle

Acronyms:
UCB = unified content block
pblock = tortoise proposal block

Purpose

  • The protocol is more solid w.r.t. rational deviations if the ordering of transactions in the UCB must be verifiably shuffled according to randomness that’s derived from all the miners of the layer.
    • Ideally, if at least one miner is honest then the shuffled order is uniform random (the protocol will make it difficult for adversarial miners to stray from this ideal).
  • Without a verifiable shuffle, the miners may ignore the hare altogether and vote for a UCB hash that was prepared by an external entity.
    • Specifically, a (supposedly) reputable entity that runs a non-trivial algorithm to extract MEV and shares the profits with the miners.
  • With a verifiable shuffle, the external entity cannot control the ordering of transactions, so it’s a much less attractive endeavor.
    • Even with a verifiable shuffle, the external entity can still decide which transactions to include in the UCB.
      • By including a small set of transaction, it’s easier to try to grind on the shuffled order, but such an endeavor is also less attractive because:
        • The grinding is done by excluding rewarded ATX identities (see next), so those identities would vote against the dishonest UCB.
        • The revenues derived from the transaction fees will shrink.
      • It may still be profitable to create a dishonest UCB with a small set of transactions and without trying to control the shuffled order.
        • For example, to win an auction because the competitors’ transactions are excluded.
        • But again, the revenues derived from transaction fees will shrink.
        • The verifiable shuffle makes it unattractive to engage is several commonplace MEV types, but not all types.
        • The overall potential for rational deviations (i.e., ignoring the hare even after the hare rewards are in place) is significantly decreased.

Spec

  • Reminder: when the miner Alice creates pblock for layer j\!+\!1 and ballot that votes on layer j, the VRF for both the pblock and the ballot is the same (the pblock is ephemeral but the ballot is permanent).
  • Reminder: if there’s rewarded ATX identity Bob in the UCB of layer j+1 for which the ballot of layer j of Bob is missing, then honest miners must always vote against this UCB (we discussed this protocol rule but iirc haven’t implemented it yet).
  • The VRFs of the rewarded ATXs in the UCB of layer j are mixed to create a randomness seed s_j
  • s_j is implicit (and verifiable in the future) because the ballots that vote on the UCB of layer j\!-\!1 are permanently available.
  • The transactions are sorted lexicographically, then shuffled according to s_j using Fisher-Yates, and then the non-fee-paying transaction are removed (optimistic filtering).
  • For the verifiable shuffle, the UCB will include extra data that’s comprised of the indices of the non-fee-paying transactions (either before or after the shuffle).
    • For example, if the UCB had 1000 transactions and the non-fee-paying transactions were the 3rd, 27th, and 788th before the shuffle, then this extra data will be the list (3,27,788) that’s encoded either via varint serialization (or 2 bytes for the length of the list and 2 bytes for each index in the list).
    • Optimization: if each transaction already has a leading type byte (or say 4 bits), then one of the types can be the non-existent type, and instead of encoding the indices of non-fee-paying transactions we will just put them as non-existent type among all the encoded transactions (after the shuffle of course), so that each non-fee-paying transaction wastes just 1 byte (or say 4 bits).
      • To take the prior example (but with non-existent placements after the shuffle). the first two transactions will be encoded normally, then the 3rd transaction will be encoded as just the non-existent byte value, then the 4th to 26th transactions will be encoded normally, then the 27th transaction will be encoded as just the non-existent byte value, then the 28th to 787th transactions will be encoded normally, and then the 788th transaction will be encoded as just the non-existent byte value, and then the 789th to 1000th transactions will be encoded normally.
  • Let L_j denote the shuffled list of the transactions in the UCB of layer j
  • Verifying the shuffle:
    • Without the optimization:
      • L_j is sorted lexiconically.
      • Then the non-existent transactions are placed in their specified indices.
      • Then the shuffle is executed with s_j
      • Then the non-existent txns are removed and the result is compared to the original L_j
    • With the optimization:
      • L_j is sorted backwards (from the last iteration to the first) using Fisher-Yates, which is easy because the randomness in each iteration k of the loop is computed as \text{hash}(k, s_j) \ \text{mod}\ n\!-\!k\!-\!1 where n is the total number of indices and k starts from zero.
      • The resulting list is checked to see whether it’s ordered lexiconically (while ignoring the non-existent transactions).