Here’s a summary of the optimized protocol.
Names and Definitions
We use the following list of ingredients.
Tools
-
A “fast” random oracle H that takes n-bit inputs and produces m-bit outputs. In practice:
- For
sha256
, n=256 (or anything we want really) and m=256 - For
aes
, n=128 and m=128 - For
siphash
, n=256 (or anything we want) and m=64. Note that both AES and Siphash are keyed functions; we put the challenge in the key instead of as a prefix to the input)
- For
-
A “slow” random oracle W that takes \log N bits and outputs 8 bits. This is the oracle used to generate the PoST Table; cell i in the table has the value W(i). (in practice W is our version of
scrypt
, with the appropriate ID prefix already included).
Below, we’ll the +
operator for concatenation. Concatenation is bitwise (e.g., if x
is a 4-bit value and y
is a 3 bit value, then x+y
is the 7-bit value consisting of the 4 bits of x followed by the 3 bits of y).
Inputs
-
Challenge
ch
. Think of the challenge as a random 64-bit string (in practice, the adversary can affect this a bit, but we’ll ignore this for now) - PoST table. Each cell of the table is a label, which in our case is 8 bits. Think of the table size as ~N=2^{38} labels (that’s 256 GB of data).
-
Label block table
T
. We partition the PoST tables into blocks of B labels. (e.g.,T[0]
consists of the concatenated PoST table labels 0,\dotsc,B-1,T[1]
is the concatenated PoST labels B,\dotsc,2B-1, etc.).
Parameter Summary
- N the number of labels in the PoST table
- B: the label block size (how many labels in a label-block)
- k: how many “good” label-blocks make up a proof.
- d: the “difficulty threshold” for a good label-block.
- |d|=\log{N} - \log{B} the number of bits required to express the difficulty threshold.
-
max_nonce
: the maximum nonce index (an integer, think of it as <100). -
num_nonces
: The number of nonces an honest party computes in a single pass over the data. - n the input size of the fast oracle H
- m the output size of the fast oracle H.
Proof and Verification
A proof consists of:
- a challenge
ch
- a nonce
nonce
- An array
P
of k label-block indices: (P[i]
is an index of size \log(N)-\log(B) bits)
To verify the proof, the verifier does the following
for all i in 0..k:
# Reconstruct label-block
start_index = B*P[i]
block = W(start_index)+W(start_index + 1) + ... + W(start_index+B-1)
# Check that it passes threshold
# compute the index of the fast-oracle query that will contain `nonce`.
nonce_block = floor( (nonce * |d|) / m ) # This is a log(max_nonce)-bit value
nonce_offset = (nonce * |d|) mod m
msb_bits = min(|d|, m - nonce_offset)
out_msbs = H(ch + block + nonce_block) # this is an m-bit value
if msb_bits < |d|
lsb_nonce_block = nonce_block+1 # this is a log(max_nonce)-bit value
out_lsbs = H(ch + block + lsb_nonce_block) # this is an m-bit value
else:
out_lsbs = 0
# Possible off-by-one errors below: caveat emptor
out = out_msbs[nonce_offset : nonce_offset + msb_bits - 1] + out_lsbs[0 : |d| - msb_bits]
# out is a |d|-bit value (the + above is concatenation).
if out > d:
reject proof
The intuition here is that for each block, we think of the output of H as a “stream” of bits, where the first m bits are given by H(ch+block+0)
, the next by H(ch+block+1)
, etc. Each |d| bits of this stream is the value for a single nonce.
For example, if N/B = 2^{32} and m=256, |d|=32, and each hash invocation produces exactly 8 nonces. So nonce 0 is the first four bytes of H(ch+block+0)
, nonce 1 is the next four bytes of H(ch+block+0)
and nonce 9 is the first four bytes of H(ch+block+1)
.
The code above also handles the case where |d| doesn’t evenly divide m; for example, if |d|=40, then nonce 0 is the first 5 bytes of H(ch+block+0)
, but for nonce 6, the last two bytes of H(ch+block+0)
are its MSBs, while the first 3 bytes of H(ch+block+1)
are the LSBs.
Verification cost
For full verification, at worst each verifier invokes W k\cdot B times and H 2\cdot k times.
Prover
# Number of calls we have to make to H to get all nonces for single pass.
num_outs = ceil(num_nonces * |d| / m)
# Initialize lists of good indices
good[0] = [], ... , good[num_outs - 1] = []
# Make a pass over entire label-block table.
for i in 0...N/B:
out = H(ch + T[i] + 0) + ... + H(ch + T[i] + num_outs - 1) # + is bit-concatenation.
for j in 0...num_outs - 1:
if out[j*|d|:j*|d|+|d|-1] < d:
add i to good[j] # found a good label-block for nonce j
if length(good[j] == k):
return (j, good[j])
# we didn't find anything in the first pass; we do it again with a second pass, starting at nonce num_nonces.
Prover cost
Assuming a single pass (which is enough with pretty high probability):
For each label-block, the prover invokes H \lceil \texttt{num\_nonces}\cdot |d|/m \rceil times. Thus, the total cost is \lceil \frac{\texttt{num\_nonces}\cdot |d|}{m} \rceil \cdot \frac{N}{B}
Parameter examples.
We’re currently thinking of N=2^{38} (for one space unit), k=800, B=16 and num_nonces=20
. This means the total number of blocks is 2^{34}, and |d|=34.
-
If we’re using AES, then m=128, so we need 6 invocations per block. This means for a single proof we’d need 6\cdot 2^{34}\approx 2^{36.6} AES invocations. For verification, we’d need 800\cdot 16=12800 scrypt invocations and 1600 AES invocations.
-
If we’re using Blake3-512, then m=512, so we need 2 invocations per block. This means for a single proof we’d need 2\cdot 2^{34}\approx 2^{35} Blake3-512 invocations. For verification, we’d need 800\cdot 16=12800 scrypt invocations and 1600 Blake3-512 invocations.
Note: It might be possible to directly get the 680 bits we need for 20 nonces from Blake’s variable-output length capability, with better efficiency than calling the function twice.
-
If we’re using Siphash, then m=64, so we need 11 invocations per block. This means for a single proof we’d need 11\cdot 2^{34}\approx 2^{37.5} Siphash invocations. For verification, we’d need 800\cdot 16=12800 scrypt invocations and 1600 Siphash invocations.
The break-even point is that Siphash must be more then 5.5 times faster than blake3-512 to be better over all, and AES must be more than 3 times faster.