Hare data volume optimization

Problem
The data volume transmitted by Hare is surprisingly large (in the range of hundreds of megabytes, as measured on our test environment with 300 nodes). We would like to understand what is causing this and whether there is any chance to improve the situation.

Diagnosis
The root cause turns out to be the inclusion of collections of status messages in every:

  • safe value proof
  • commit certificate

Solution
The messages included inside a commit certificate and a safe value proof are status messages. Say C is a commit certificate. By the design of Hare, all status messages included in C must have been broadcast over gossip before C was created. Moreover, these status messages all belong to the round preceding the round in which C was created. Therefore, by our security assumptions, these status messages should have been delivered to all honest nodes before C was delivered.

Similar mechanics apply in the case of safe value proofs.

It feels like replacing all status messages in a commit certificate and a safe value proof with just message hashes, combined with caching of messages in each node, could significantly reduce the data volume transmitted while retaining all the properties of Hare.
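A minimal sketch of this idea, reusing the Message and CollectionOfMarbles types from the structures below (HashBasedCommitCertificate and MessageCache are hypothetical names, not the actual implementation): the certificate carries hashes, and validation resolves them against a local cache of previously gossiped messages.

```scala
// Sketch only: hypothetical types, not the actual Hare implementation.
object HashBasedCertificates {
  type MsgHash = Vector[Byte] // 32-byte hash of a serialized message

  // A commit certificate carrying hashes instead of full commit messages.
  class HashBasedCommitCertificate(
    val acceptedSet: CollectionOfMarbles,
    val iteration: Int,
    val commitMsgHashes: Array[MsgHash] // replaces Array[CommitMsg]
  )

  // Per-node cache of messages already received over gossip.
  class MessageCache {
    private val byHash = scala.collection.mutable.Map.empty[MsgHash, Message]

    def store(hash: MsgHash, msg: Message): Unit = byHash(hash) = msg

    // Resolve a hash to the cached message, if it has already arrived.
    def lookup(hash: MsgHash): Option[Message] = byHash.get(hash)
  }
}
```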

Calculations

I did a calculation of data volume for an “idealized” Hare protocol.

I assumed the following structures of messages (for the un-optimized protocol):

abstract class Message {
  val sender: NodeId
  val sig: Signature
  val eligibilityProof: Signature
}

class CommitCertificate {
  val acceptedSet: CollectionOfMarbles
  val iteration: Int
  val commitMessages: Array[CommitMsg]
}

class SafeValueProof {
  val iteration: Int
  val statusMessages: Set[StatusMsg]
  val magicMessage: StatusMsg
}

class PreroundMsg extends Message {
  val inputSet: CollectionOfMarbles
}

class StatusMsg extends Message {
  val iteration: Int
  val certifiedIteration: Int
  val acceptedSet: CollectionOfMarbles
}

class ProposalMsg extends Message {
  val iteration: Int
  val safeValueProof: SafeValueProof
}

class CommitMsg extends Message {
  val iteration: Int
  val commitCandidate: CollectionOfMarbles
}

class NotifyMsg extends Message {
  val iteration: Int
  val commitCertificate: CommitCertificate
}

In the above, the CollectionOfMarbles type represents the collections of elements we are attempting to reach consensus on (the elements being 32-byte hashes).

  val numberOfNodes: Int = 100000
  val maxNumberOfMarblesInACollection: Int = 50
  val c: Int = 800 //committee size
  val d: Int = 5 //target number of leader candidates to be elected in the proposal round
  val nodeId: Int = 32
  val marbleId: Int = 32
  val msgSignature: Int = 64
  val eligibilityProof: Int = 64
  val iterationNumber: Int = 4
  val messageHash: Int = 32
  val senderId: Int = nodeId
  val marblesCollectionTotal: Int = maxNumberOfMarblesInACollection * marbleId
  val msgMetadata: Int = senderId + iterationNumber + eligibilityProof + msgSignature

  class MsgVolumesProfile {
    var preRoundMsg: Int = 0
    var statusMsg: Int = 0
    var svp: Int = 0
    var proposalMsg: Int = 0
    var commitMsg: Int = 0
    var commitCertificate: Int = 0
    var notifyMsg: Int = 0
  }

  def originalProtocol: MsgVolumesProfile = {
    val p = new MsgVolumesProfile

    p.preRoundMsg = senderId + marblesCollectionTotal + eligibilityProof + msgSignature
    p.statusMsg = msgMetadata + iterationNumber + marblesCollectionTotal
    p.svp = iterationNumber + c * p.statusMsg + marblesCollectionTotal
    p.proposalMsg = msgMetadata + p.svp
    p.commitMsg = msgMetadata + marblesCollectionTotal
    p.commitCertificate = marblesCollectionTotal + iterationNumber + c * p.commitMsg
    p.notifyMsg = msgMetadata + p.commitCertificate

    return p
  }

  def optimizedProtocol: MsgVolumesProfile = {
    val p = new MsgVolumesProfile

    p.preRoundMsg = senderId + marblesCollectionTotal + eligibilityProof + msgSignature
    p.statusMsg = msgMetadata + iterationNumber + marblesCollectionTotal
    p.svp = iterationNumber + c * messageHash + marblesCollectionTotal
    p.proposalMsg = msgMetadata + p.svp
    p.commitMsg = msgMetadata + marblesCollectionTotal
    p.commitCertificate = marblesCollectionTotal + iterationNumber + c * messageHash
    p.notifyMsg = msgMetadata + p.commitCertificate

    return p
  }

The final results (for a given profile: MsgVolumesProfile) are obtained as:

val preRoundTotal: Int = profile.preRoundMsg * c
val statusTotal: Int = profile.statusMsg * c
val proposalTotal: Int = profile.proposalMsg * d
val commitTotal: Int = profile.commitMsg * c
val notifyTotal: Int = profile.notifyMsg * c

val downloadInIterationZero: Int = preRoundTotal + statusTotal + proposalTotal + commitTotal + notifyTotal
val downloadInIterationOne: Int = statusTotal + proposalTotal + commitTotal + notifyTotal

These are the numbers I obtained:

----------------- original protocol [all values in megabytes] -------------------
download in iteration 0: 1088.80
download in iteration 1: 1087.46

     commit certificate: 1.3473548889160156
                    svp: 1.3504066467285156

              pre round: 1.34
           status round: 1.35
         proposal round: 6.75
           commit round: 1.35
           notify round: 1078.01

----------------- optimized protocol [all values in megabytes] -------------------
download in iteration 0: 25.05
download in iteration 1: 23.71

     commit certificate: 0.025943756103515625
                    svp: 0.025943756103515625

              pre round: 1.34
           status round: 1.35
         proposal round: 0.13
           commit round: 1.35
           notify round: 20.88

Using hashes as pointers to previously received messages makes sense to me. As noted, the protocol already assumes that honest messages are received within some gossip delay, so we may use this assumption for efficiency as well.

I want to clarify the expected behavior when a message is received but the message pointed to by the hash is not available locally. We just ignore the received message, perhaps with a warning in the log, right?

But it would be good if @talm confirmed that this version doesn’t violate any assumptions used in the proofs, before we implement it.

In both cases - SVP and CommitCertificate - validity checking is based on the f+1 rule (of course, after adopting the “equivocations defense” we are talking about the changed/improved f+1 rule), so the two cases are very similar. Therefore, in explaining the recommended behaviour of the protocol I will treat the SVP case and the CommitCertificate case together. I will use the word “certificate” to refer to the master message and “justifications” to refer to the collection of hashes inside the master message. If a hash refers to an unknown message, I will call it a “dangling justification”.

Now, let me specify how I think the protocol should handle the corner case you are asking about.

Whenever a certificate C arrives at a node and C has some dangling justifications, the handling is:

  • Case 1: IF the “f+1” rule check passes after removing all dangling justifications from the list, THEN consider C valid.
  • Case 2: otherwise, put C into an “incomplete messages” buffer and wait until enough dangling justifications become filled (after the delayed messages causing the whole dangling-justification issue finally - hopefully - arrive); we do not have to wait for all dangling justifications to be filled, it is enough to have as many as are needed to pass the “f+1” rule check. For all the time C waits in the buffer, it should not be visible to the Hare protocol logic (just as if C itself were delayed).
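The two cases above can be sketched as follows (a simplified sketch: the Certificate type, the f+1 threshold, and the cache-lookup function are assumptions, not the real implementation):

```scala
// Sketch only: simplified dangling-justification handling.
class Certificate(val justifications: Seq[Vector[Byte]]) // a justification is a message hash

class IncompleteMessagesBuffer(
  isKnownLocally: Vector[Byte] => Boolean, // lookup into the local message cache
  f: Int                                   // fault threshold: f+1 justifications must resolve
) {
  private val waiting = scala.collection.mutable.Buffer.empty[Certificate]

  private def resolvedCount(c: Certificate): Int =
    c.justifications.count(isKnownLocally)

  // Case 1: C is valid immediately; Case 2: C is buffered and hidden from protocol logic.
  def onCertificate(c: Certificate): Boolean =
    if (resolvedCount(c) >= f + 1) true
    else { waiting += c; false }

  // Re-check buffered certificates, e.g. whenever a delayed message arrives.
  def drainReady(): Seq[Certificate] = {
    val (ready, still) = waiting.partition(c => resolvedCount(c) >= f + 1)
    waiting.clear()
    waiting ++= still
    ready.toSeq
  }
}
```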

> otherwise: put C into a “incomplete-messages” buffer and just wait until enough dangling justifications will become filled (after - hopefully - delayed messages causing the whole dangling justifications issue will finally arrive)

Interesting solution. There will be a bit of complexity in implementing it correctly. The gossip protocol expects a message to be broadcast or dropped within a reasonable time frame; otherwise it may drop later messages on incoming-queue overflow. So if this is implemented, we will probably have to “ignore” and not broadcast such a dangling-justification certificate at first, and then broadcast it later, after seeing the delayed message. This needs more careful analysis during implementation, though.

Hmm … I see.

First of all, there is no hard reason to strictly WAIT for all justifications to be fulfilled before forwarding a certificate. We may “optimistically” forward the message without waiting for all justifications. Yes, I am aware of the rule that “we hard-validate before forwarding”. But this rule is taken out of thin air, and in this specific case we could make a reasonable exception.

In the worst case, this relaxed validity checking could be used as a new attack vector. But this attack vector does not feel like a real danger, given that we may (and should) introduce a hard limit on the size of the justifications collection. Assuming our mean quorum size is 800, setting the hard limit to 900 would be a reasonable approach.
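The hard limit could be enforced at validation time; a trivial sketch (the 900 constant follows the suggestion above and is an assumption, not a chosen parameter):

```scala
// Sketch only: reject certificates whose justification list exceeds the hard limit.
object JustificationLimit {
  val maxJustifications: Int = 900 // mean quorum size 800 plus a safety margin

  def withinSizeLimit(justifications: Seq[Vector[Byte]]): Boolean =
    justifications.size <= maxJustifications
}
```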

BTW, because of how Hare is designed, the vast majority of delayed justifications we might want to wait for will be delayed “just a little” (within the time window of one round). In practice it might be good enough to put a hard limit on the time a message is allowed to wait in this “incomplete messages” buffer.

Assuming a round time of 30 seconds, we are probably happy with a 30-second buffering limit, and 15 seconds may turn out to be good enough in practice.
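A time-limited variant of the buffering could look like this sketch (the 30-second default and the use of the system clock are assumptions):

```scala
// Sketch only: incomplete-messages buffer with a hard wait-time limit (assumed 30 s).
class TimedBuffer[M](maxWaitMillis: Long = 30000L) {
  private case class Entry(msg: M, arrivedAt: Long)
  private val waiting = scala.collection.mutable.Buffer.empty[Entry]

  def add(msg: M, now: Long = System.currentTimeMillis()): Unit =
    waiting += Entry(msg, now)

  // Drop messages that have waited longer than the limit; return the dropped ones.
  def expire(now: Long = System.currentTimeMillis()): Seq[M] = {
    val (expired, alive) = waiting.partition(e => now - e.arrivedAt > maxWaitMillis)
    waiting.clear()
    waiting ++= alive
    expired.map(_.msg).toSeq
  }

  def pending: Int = waiting.size
}
```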

A suitable testing pattern here would be to establish (artificially) some network quality (in terms of average bandwidth and latency) - I hope we have tooling for this - and then measure Hare efficiency in a sequence of, say, one-hour-long experiments, with the buffering-time-limit fixed in each experiment. I would start with buffering-time-limit = 1 second; in the next experiment it would be set to 2 seconds, and so on, up to 1 minute. My wild guess is that we may observe that 95% of buffered messages spend less than 10 seconds in the buffer (or something like this).
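The per-experiment statistic could be computed from recorded buffer wait times, e.g. (a hypothetical helper, not existing tooling):

```scala
// Sketch only: 95th percentile of observed buffer wait times (in seconds).
def percentile95(waitTimes: Seq[Double]): Double = {
  require(waitTimes.nonEmpty)
  val sorted = waitTimes.sorted
  val idx = math.ceil(0.95 * sorted.size).toInt - 1
  sorted(math.max(idx, 0))
}
```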

How harsh is the timing boundary required by our gossip lib?

Below I present updated volume calculations after adjusting parameters to more closely reflect the current implementation of Hare.

Code:

object NetworkBandwidth {

  /*      PARAMS       */

  val numberOfNodes: Int = 100000
  val averageNumberOfMarblesInACollection: Int = 50
  val c: Int = 800 //committee size
  val d: Int = 10 //target number of leader candidates to be elected in the proposal round
  val nodeId: Int = 32
  val marbleId: Int = 20
  val msgSignature: Int = 64
  val eligibilityProof: Int = 80 //maybe we need to add here also the size of VRF "message"
  val iterationNumber: Int = 1
  val messageHash: Int = 32 //sha-256


  /*      MESSAGES       */

  val senderId: Int = nodeId
  val marblesCollectionTotal: Int = averageNumberOfMarblesInACollection * marbleId
  val msgMetadata: Int = senderId + iterationNumber + eligibilityProof + msgSignature

  class MsgVolumesProfile {
    var preRoundMsg: Int = 0
    var statusMsg: Int = 0
    var svp: Int = 0
    var proposalMsg: Int = 0
    var commitMsg: Int = 0
    var commitCertificate: Int = 0
    var notifyMsg: Int = 0
  }

  def originalProtocol: MsgVolumesProfile = {
    val p = new MsgVolumesProfile

    p.preRoundMsg = senderId + marblesCollectionTotal + eligibilityProof + msgSignature
    p.statusMsg = msgMetadata + iterationNumber + marblesCollectionTotal
    p.svp = iterationNumber + c * p.statusMsg / 2
    p.proposalMsg = msgMetadata + p.svp
    p.commitMsg = msgMetadata + marblesCollectionTotal
    p.commitCertificate = marblesCollectionTotal + iterationNumber + c * p.commitMsg / 2
    p.notifyMsg = msgMetadata + p.commitCertificate

    return p
  }

  def optimizedProtocol: MsgVolumesProfile = {
    val p = new MsgVolumesProfile

    p.preRoundMsg = senderId + marblesCollectionTotal + eligibilityProof + msgSignature
    p.statusMsg = msgMetadata + iterationNumber + marblesCollectionTotal
    p.svp = iterationNumber + c * messageHash + marblesCollectionTotal
    p.proposalMsg = msgMetadata + p.svp
    p.commitMsg = msgMetadata + marblesCollectionTotal
    p.commitCertificate = marblesCollectionTotal + iterationNumber + c * messageHash
    p.notifyMsg = msgMetadata + p.commitCertificate

    return p
  }

  def printResults(profile: MsgVolumesProfile): Unit = {
    val megabyte: Int = 1024 * 1024
    val preRoundTotal: Int = profile.preRoundMsg * c
    val statusTotal: Int = profile.statusMsg * c
    val proposalTotal: Int = profile.proposalMsg * d
    val commitTotal: Int = profile.commitMsg * c
    val notifyTotal: Int = profile.notifyMsg * c

    val downloadInIterationZero: Int = preRoundTotal + statusTotal + proposalTotal + commitTotal + notifyTotal
    val downloadInIterationOne: Int = statusTotal + proposalTotal + commitTotal + notifyTotal
    val download0AsMegabytes: Double = downloadInIterationZero.toDouble / megabyte
    val download1AsMegabytes: Double = downloadInIterationOne.toDouble / megabyte

    println(f"download in iteration 0: $download0AsMegabytes%4.2f")
    println(f"download in iteration 1: $download1AsMegabytes%4.2f")
    println()
    println(f"     commit certificate: ${profile.commitCertificate.toDouble / megabyte}")
    println(f"                    svp: ${profile.svp.toDouble / megabyte}")
    println()
    println(f"              pre round: ${preRoundTotal.toDouble / megabyte}%4.2f")
    println(f"           status round: ${statusTotal.toDouble / megabyte}%4.2f")
    println(f"         proposal round: ${proposalTotal.toDouble / megabyte}%4.2f")
    println(f"           commit round: ${commitTotal.toDouble / megabyte}%4.2f")
    println(f"           notify round: ${notifyTotal.toDouble / megabyte}%4.2f")
  }

  def main(args: Array[String]): Unit = {
    import java.util.Locale
    Locale.setDefault(new Locale("en", "US"))
    println("----------------- original protocol [all values in megabytes] -------------------")
    printResults(originalProtocol)

    println()

    println("----------------- optimized protocol [all values in megabytes] -------------------")
    printResults(optimizedProtocol)
  }

}

Results:

----------------- original protocol [all values in megabytes] -------------------
download in iteration 0: 367.28
download in iteration 1: 366.38

     commit certificate: 0.44994449615478516
                    svp: 0.4493722915649414

              pre round: 0.90
           status round: 0.90
         proposal round: 4.50
           commit round: 0.90
           notify round: 360.09

----------------- optimized protocol [all values in megabytes] -------------------
download in iteration 0: 23.38
download in iteration 1: 22.48

     commit certificate: 0.025368690490722656
                    svp: 0.025368690490722656

              pre round: 0.90
           status round: 0.90
         proposal round: 0.26
           commit round: 0.90
           notify round: 20.43