Proposed Athena design

Now that we have a basic VM and we’re beginning work to integrate it into go-spacemesh, it’s time to finalize the account design and answer a few remaining questions. This is a continuation of Athena blockchain design and integration. I’m copying over the design and questions from Remaining questions and proposed design · Issue #53 · athenavm/athena · GitHub and a subsequent Slack thread on this topic so that we can continue the conversation here.

Proposed design:


VM-side

  1. How does one account send native coins to another account? Just using call? Is there a “receivable” function on the other end? Is there a way to transfer a balance without a call, i.e., a “simple send” tx type? (probably not, it doesn’t make sense in the context of AA)
  2. How do we pass input into a called function? Do we only allow calling a single entrypoint function? Do we use argc/argv or stick with IO syscalls, as we do now?
  3. How do we return a value back from a call? Do we want to support this? Add an opcode or two?
  4. Do we consume all remaining gas on failure?

Host-side

Encompasses everything the node (i.e., go-spacemesh) needs to implement.

Transactions

This design is maximally “abstracted,” i.e., it puts as little logic as possible in the host and as much as possible in the VM.

There are only two transaction types: self spawn and wallet call.

Type 1: Self spawn

Self spawn allows a funded but unspawned wallet account to be spawned, same as in the existing VM. The transaction specifies the principal address, template address, and immutable state (init params). The wallet template has a self_spawn() method. The host passes the entire tx into this method. The method can either fail, or else internally spawn the template. The spawned wallet program address must be the hash of the template address and the immutable state.

tx := (principal, nonce=0, gas_price, template_address, immutable_state, signature)
spawned_program_address := HASH(template_address, immutable_state)
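
As a rough illustration of the self spawn rule above, here is a minimal sketch. The hash is a stand-in (64-bit FNV-1a), since the post does not say which hash function is used. The struct layout, the function names, and the check that the principal equals the derived address are all assumptions made for this example.

```rust
// Stand-in for the protocol hash, which this post does not specify:
// 64-bit FNV-1a over length-prefixed fields, so field boundaries matter.
fn stand_in_hash(fields: &[&[u8]]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for field in fields {
        let len = (field.len() as u64).to_le_bytes();
        for byte in len.iter().chain(field.iter()) {
            h ^= u64::from(*byte);
            h = h.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }
    h
}

// Hypothetical, simplified self spawn tx; gas_price and signature are omitted.
struct SelfSpawnTx {
    principal: u64,
    nonce: u64,
    template_address: u64,
    immutable_state: Vec<u8>,
}

// spawned_program_address := HASH(template_address, immutable_state)
fn spawned_program_address(template_address: u64, immutable_state: &[u8]) -> u64 {
    let template = template_address.to_le_bytes();
    stand_in_hash(&[&template[..], immutable_state])
}

fn check_self_spawn(tx: &SelfSpawnTx) -> Result<u64, &'static str> {
    if tx.nonce != 0 {
        return Err("self spawn must use nonce 0");
    }
    let address = spawned_program_address(tx.template_address, &tx.immutable_state);
    // Assumption: the principal is the funded stub account being spawned.
    if address != tx.principal {
        return Err("principal does not match the derived address");
    }
    Ok(address)
}

fn main() {
    let state = b"owner public key".to_vec();
    let address = spawned_program_address(7, &state);
    let tx = SelfSpawnTx { principal: address, nonce: 0, template_address: 7, immutable_state: state };
    println!("{:?}", check_self_spawn(&tx));
}
```

In the actual design this check would live inside the template's self_spawn() method rather than in the host.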

Type 2: Wallet call

Everything else is a call into a method in a spawned wallet program. We use the term “wallet” loosely, and we don’t cleanly differentiate between wallet programs and non-wallet programs. Effectively a wallet program is any program that implements the “spend” and “proxy” methods described here (you can think of “wallet” like a Rust trait; a template may implement many traits, and a particular wallet may implement a subset of these methods, or additional convenience methods). Here are some examples of functionality wallet apps will likely provide:

Send

To send native coins from their wallet, the user generates a tx with their wallet program as principal. The tx is passed to the “spend” method on the wallet program, which either effectuates a send or fails.

tx := (principal, nonce, gas_price, ENCODE("spend", recipient, amount), signature)

Call/proxy

To use a wallet’s funds to pay gas for a call to another program, the user again generates a tx with their wallet program as principal. The tx is passed to the “proxy” method on the wallet program, which either effectuates the call to another program, or fails.

tx := (principal, nonce, gas_price, ENCODE("proxy", recipient, amount, input), signature)
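
To make the ENCODE(...) payloads for “spend” and “proxy” more concrete, here is one possible sketch. The wire format (a one-byte tag followed by little-endian fields), the u64 addresses, and all names are assumptions for illustration only; the post does not define ENCODE.

```rust
// Hypothetical wallet call payloads; addresses are shortened to u64 here.
#[derive(Debug, PartialEq)]
enum WalletCall {
    Spend { recipient: u64, amount: u64 },
    Proxy { recipient: u64, amount: u64, input: Vec<u8> },
}

// Assumed format: tag byte, recipient, amount, then (for proxy) the raw input.
fn encode(call: &WalletCall) -> Vec<u8> {
    let mut out = Vec::new();
    match call {
        WalletCall::Spend { recipient, amount } => {
            out.push(0);
            out.extend_from_slice(&recipient.to_le_bytes());
            out.extend_from_slice(&amount.to_le_bytes());
        }
        WalletCall::Proxy { recipient, amount, input } => {
            out.push(1);
            out.extend_from_slice(&recipient.to_le_bytes());
            out.extend_from_slice(&amount.to_le_bytes());
            out.extend_from_slice(input);
        }
    }
    out
}

// Reads a little-endian u64 at `at`, or None if the buffer is too short.
fn read_u64(bytes: &[u8], at: usize) -> Option<u64> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes.get(at..at + 8)?);
    Some(u64::from_le_bytes(buf))
}

// The wallet program would do this decoding itself before dispatching.
fn decode(bytes: &[u8]) -> Option<WalletCall> {
    let recipient = read_u64(bytes, 1)?;
    let amount = read_u64(bytes, 9)?;
    match bytes[0] {
        0 if bytes.len() == 17 => Some(WalletCall::Spend { recipient, amount }),
        1 => Some(WalletCall::Proxy { recipient, amount, input: bytes[17..].to_vec() }),
        _ => None,
    }
}

fn main() {
    let call = WalletCall::Proxy { recipient: 9, amount: 3, input: vec![0xde, 0xad] };
    let payload = encode(&call);
    println!("{} bytes -> {:?}", payload.len(), decode(&payload));
}
```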

Deploy

To deploy a new template, the user passes the template code to the wallet program, which uses a VM opcode to deploy a new template. The deploy opcode returns the newly-deployed template address, which is calculated as the hash of the template code. The deploy opcode fails if there’s already a template deployed to this address. There’s no verification of the template code at deploy time. The deploy opcode charges gas based on the size of the code.

tx := (principal, nonce, gas_price, ENCODE("deploy", template_code), signature)
template_address := HASH(template_code)
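
A minimal sketch of the deploy rules described above: the address is the hash of the code, a duplicate deploy fails, and gas scales with code size. The hash is a stand-in (FNV-1a), and the Host type, the gas constant, and the order of the checks are assumptions for this example, not part of the design.

```rust
use std::collections::HashMap;

// Stand-in for the protocol hash (not specified in the post): 64-bit FNV-1a.
fn stand_in_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in bytes {
        h ^= u64::from(*byte);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

// Hypothetical price; the post only says gas is based on the size of the code.
const GAS_PER_CODE_BYTE: u64 = 1;

struct Host {
    templates: HashMap<u64, Vec<u8>>,
}

impl Host {
    // Models the deploy opcode: returns the new template address or fails.
    fn deploy(&mut self, template_code: Vec<u8>, gas_left: &mut u64) -> Result<u64, &'static str> {
        // template_address := HASH(template_code)
        let address = stand_in_hash(&template_code);
        if self.templates.contains_key(&address) {
            return Err("template already deployed");
        }
        // Assumption: gas is only charged once the deploy is known to be valid.
        let cost = GAS_PER_CODE_BYTE * template_code.len() as u64;
        *gas_left = gas_left.checked_sub(cost).ok_or("out of gas")?;
        // No verification of the code happens here, matching the text above.
        self.templates.insert(address, template_code);
        Ok(address)
    }
}

fn main() {
    let mut host = Host { templates: HashMap::new() };
    let mut gas = 100;
    println!("{:?}, gas left {}", host.deploy(vec![0x13, 0, 0, 0], &mut gas), gas);
}
```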

Spawn

To spawn a program from an existing template, the user passes the template address and immutable state (init vars) to the wallet program, which uses a VM opcode to spawn the program. The spawn opcode returns the newly-spawned program address, which is calculated as the hash of the template address, the immutable state, the principal address, and the nonce. The spawn opcode fails if there’s already a program spawned to this address, or if the template init code fails to run on the input immutable state. The spawn opcode charges gas based on the template init code execution, i.e., same as a cross-contract call.

tx := (principal, nonce, gas_price, ENCODE("spawn", template_address, immutable_state), signature)
program_address := HASH(template_address, immutable_state, principal, nonce)
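
Here is a small sketch of the spawn address derivation, following the formula as written above. The hash is again a stand-in (FNV-1a), and the function names and the HashSet of spawned addresses are assumptions. Running the template init code is left out.

```rust
use std::collections::HashSet;

// Stand-in for the protocol hash: 64-bit FNV-1a over length-prefixed fields.
fn stand_in_hash(fields: &[&[u8]]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for field in fields {
        let len = (field.len() as u64).to_le_bytes();
        for byte in len.iter().chain(field.iter()) {
            h ^= u64::from(*byte);
            h = h.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }
    h
}

// program_address := HASH(template_address, immutable_state, principal, nonce)
fn program_address(template_address: u64, immutable_state: &[u8], principal: u64, nonce: u64) -> u64 {
    let t = template_address.to_le_bytes();
    let p = principal.to_le_bytes();
    let n = nonce.to_le_bytes();
    stand_in_hash(&[&t[..], immutable_state, &p[..], &n[..]])
}

// Models only the address collision rule of the spawn opcode.
fn spawn(
    spawned: &mut HashSet<u64>,
    template_address: u64,
    immutable_state: &[u8],
    principal: u64,
    nonce: u64,
) -> Result<u64, &'static str> {
    let address = program_address(template_address, immutable_state, principal, nonce);
    // HashSet::insert returns false when the address is already present.
    if !spawned.insert(address) {
        return Err("program already spawned at this address");
    }
    Ok(address)
}

fn main() {
    let mut spawned = HashSet::new();
    println!("{:?}", spawn(&mut spawned, 7, b"init", 1, 1));
    println!("{:?}", spawn(&mut spawned, 7, b"init", 1, 1)); // same inputs: fails
    println!("{:?}", spawn(&mut spawned, 7, b"init", 1, 2)); // new nonce: new address
}
```

Because the principal and nonce are part of the hash, the same template and init params can be spawned many times, unlike self spawn.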

Accounts

There are three kinds of accounts.

Stubs

A stub is simply an account that has received funds but hasn’t been spawned yet.

Templates

A template contains only code. Funds sent to a template account are effectively burned and cannot be moved. (Note that we cannot prevent funds from being sent to a template address before it’s deployed, but once we know an account is a template we can prevent additional funds from being sent there.)

Programs

A program account contains a balance, a nonce, a state tree, and a link to an associated template address.
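
The three account kinds could be modelled roughly as follows. This is only a sketch: the field types, the u64 addresses, the map standing in for the state tree, and the credit() helper are all assumptions. It also assumes the host chooses to reject transfers to known templates, which the text above only says is possible.

```rust
use std::collections::HashMap;

type Address = u64;

#[allow(dead_code)]
enum Account {
    // Received funds but not spawned yet.
    Stub { balance: u64 },
    // Code only; funds sent here would be burned.
    Template { code: Vec<u8> },
    // Balance, nonce, state (a map stands in for the state tree), template link.
    Program { balance: u64, nonce: u64, state: HashMap<Vec<u8>, Vec<u8>>, template: Address },
}

// Credits `amount` to `to`, creating a stub if the address is unknown.
fn credit(accounts: &mut HashMap<Address, Account>, to: Address, amount: u64) -> Result<(), &'static str> {
    let account = accounts.entry(to).or_insert(Account::Stub { balance: 0 });
    match account {
        Account::Stub { balance } | Account::Program { balance, .. } => {
            *balance = balance.checked_add(amount).ok_or("balance overflow")?;
            Ok(())
        }
        // Once an address is known to be a template, reject further funds.
        Account::Template { .. } => Err("templates cannot receive funds"),
    }
}

fn main() {
    let mut accounts = HashMap::new();
    accounts.insert(2, Account::Template { code: vec![0x13, 0, 0, 0] });
    println!("{:?}", credit(&mut accounts, 1, 50)); // creates a stub
    println!("{:?}", credit(&mut accounts, 2, 5)); // rejected
}
```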

Questions

  1. We need to finalize how wallets handle nonces. Note that the nonce is not fully abstracted into the VM since Spacemesh miners need to be able to read it for tx ordering.

Discussion thread to date

  1. By “native coin” do you mean L1 funds? I think we should make the native coin work like any other resource, just have a “special” predefined resource address (I posted some ideas about how resources could work to the forum: General Abstract Resources). In fact, we could make both Athena and L1 coins resources, and use the general mechanisms to transfer between them.
  2. Can you explain a bit more what you mean by this question?
  3. I think we have to support return values from cross-contract calls. The reason is that we want to keep all contract data local to the contract (i.e., only the contract’s own methods can access it directly) – so the only way to read data from a different contract is a cross-contract call.

I’m not sure spacemesh miners actually need the nonce for tx ordering. We actually have two separate ordering phases:

  1. Shuffle transactions, considering all transactions with the same principal to be identical
  2. Internally order transactions with the same principal by nonce.

I think we can have spacemesh miners just perform phase 1, while leaving phase 2 to the executors. What this means is that the UCB contains the full list of transactions and the ordering of tx principals, but doesn’t explicitly commit to the internal ordering. This would allow us a lot more flexibility in defining the nonce (e.g., potentially allowing the nonce scheme to be fully defined by the contract).
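
One way to picture the two phases is sketched below. It is an illustration only: the keyed sort stands in for whatever shuffle the miners actually use, and the Tx struct, the seed, and the function names are assumptions.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Tx {
    principal: u64,
    nonce: u64,
}

// Stand-in for the real shuffle: a keyed FNV-1a hash used as a sort key.
fn mix(seed: u64, tx: &Tx) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for word in [seed, tx.principal, tx.nonce] {
        for byte in word.to_le_bytes() {
            h ^= u64::from(byte);
            h = h.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }
    h
}

// Phase 1 (miners): shuffle the txs but commit only to the principal order.
fn principal_slots(txs: &[Tx], seed: u64) -> Vec<u64> {
    let mut shuffled: Vec<&Tx> = txs.iter().collect();
    shuffled.sort_by_key(|tx| mix(seed, tx));
    shuffled.iter().map(|tx| tx.principal).collect()
}

// Phase 2 (executors): fill each principal's slots in ascending nonce order.
fn execution_order(txs: &[Tx], slots: &[u64]) -> Vec<Tx> {
    let mut queues: HashMap<u64, Vec<Tx>> = HashMap::new();
    for tx in txs {
        queues.entry(tx.principal).or_default().push(*tx);
    }
    for queue in queues.values_mut() {
        // Highest nonce first, so pop() yields the lowest remaining nonce.
        queue.sort_by_key(|tx| std::cmp::Reverse(tx.nonce));
    }
    slots
        .iter()
        .filter_map(|principal| queues.get_mut(principal).and_then(|queue| queue.pop()))
        .collect()
}

fn main() {
    let txs = vec![
        Tx { principal: 1, nonce: 2 },
        Tx { principal: 2, nonce: 0 },
        Tx { principal: 1, nonce: 1 },
    ];
    let slots = principal_slots(&txs, 42);
    println!("{:?}", execution_order(&txs, &slots));
}
```

Only phase 2 needs to read the nonce, so a different nonce scheme would change execution_order but not principal_slots.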


  1. By “native coin” do you mean L1 funds?

yes - I’m assuming the “native coin” of Athena is the same as on the L1. keep in mind that we’re starting by launching Athena on a testnet at L1. I have no issue with making the native coin a “special” predefined resource, but this might make things like charging gas a little more complex. we won’t have resources in the first athena testnet, anyway, because I want to get a testnet up soon and it will take time to finish the R&D for the resources stuff. the only other implication I can think of is that the CALL host function, which allows cross-contract calls, also includes a value parameter that sends coins along with the call (same as in EVM). we need to consider if we want to change this behavior. as I laid out in my original question list, we also have to decide whether we want to implement some sort of “payable” or “receivable” syntax on contracts or functions that can receive coins (also an EVM idea).

  2. How do we pass input into a called function? Do we only allow calling a single entrypoint function? Do we use argc/argv or stick with using IO syscalls like now.
  2. Can you explain a bit more what you mean by this question?

different chains handle this differently. EVM allows a call to one of many methods defined on the receiving smart contract, and uses a “selector” (basically a hash of the signature of the function to call) as part of the tx calldata. solana, by contrast, only allows a single entrypoint, and dispatch has to happen inside the smart contract. so, question one is which method do we choose and why? (I’m leaning towards the solana method because it’s simpler and more “abstracted”.)

the second question is, how do we pass “calldata” into a called method? one option is to use the standard C argv/argc syntax (Command Line Arguments in C - GeeksforGeeks). in this case, the VM/runtime stores these values in memory and the running program can access them immediately. the alternative is to require the running program to make an IO host call to load these values into memory (it would pass in a pointer). right now we’re doing the latter and it seems fine to me.

note that there’s a related question here about how multiple arguments should be encoded/decoded. solana only allows a single bytearray, which the receiving program is responsible for decoding. again, i’m leaning in this direction since it’s simpler.

  3. I think we have to support return values from cross-contract calls. The reason is that we want to keep all contract data local to the contract (i.e., only the contract’s own methods can access it directly) – so the only way to read data from a different contract is a cross-contract call.

one alternative here is to allow the return data to be a resource that’s owned by the caller: the callee creates the resource, saves it, and somehow changes its ownership to the caller. I see three potential models for how to implement this – note that the first two options here are similar to the distinction discussed above about input:

  1. first-class support for return values: the callee simply calls “return” with a value, the runtime receives the value and saves it to memory and then maybe the CALL host function returns a pointer to where it’s stored.
  2. implement as independent IO host functions: callee calls a special host::return_value() host function, caller then calls a parallel host::get_return_value() host function after the CALL host function.
  3. using resources and storage, as described above
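
To make option 2 a bit more tangible, here is a toy model of the two IO host functions. Everything in it is hypothetical: the Host struct, the way call() invokes the callee, and the choice that reading the return value consumes it are assumptions for illustration, not the Athena host interface.

```rust
// Toy host holding the value most recently returned by a callee.
struct Host {
    pending_return: Option<Vec<u8>>,
}

impl Host {
    // Callee side: models host::return_value().
    fn return_value(&mut self, value: &[u8]) {
        self.pending_return = Some(value.to_vec());
    }

    // Caller side: models host::get_return_value(). Assumption: reading
    // consumes the value, so a second read yields None.
    fn get_return_value(&mut self) -> Option<Vec<u8>> {
        self.pending_return.take()
    }

    // Models the CALL host function; a plain function stands in for a program.
    fn call(&mut self, callee: fn(&mut Host, &[u8]), input: &[u8]) {
        self.pending_return = None; // drop any stale value from an earlier call
        callee(self, input);
    }
}

// Example callee: doubles every input byte and returns the result.
fn double(host: &mut Host, input: &[u8]) {
    let out: Vec<u8> = input.iter().map(|b| b.wrapping_mul(2)).collect();
    host.return_value(&out);
}

fn main() {
    let mut host = Host { pending_return: None };
    host.call(double, &[1, 2, 3]);
    // A caller that doesn't care about the result simply never asks for it.
    println!("{:?}", host.get_return_value());
}
```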

allowing the nonce scheme to be fully defined by the contract

I sense some deja vu here :slightly_smiling_face: would have to go back to old notes on this topic, but - is the idea that the template would expose some sort of get_nonce() method that the executors could call on each tx to read the nonce? I think this is probably fine, but it would definitely be less efficient than requiring the nonce to be explicitly visible in the tx without requiring interpretation inside the VM. it also sort of means we either need a very tight gas limit on this get_nonce() method, or else we need to gas meter it.

For the general resource setting, you would have a list of (resource-address,value) tuples, each of which causes an implicit call to the resource’s “transfer” method before actually executing the call itself. (For efficiency, the encoding for the addresses of the native coins – L1 and Athena – should be extremely short)

I would actually lean towards the explicit method selector. There are several reasons:

  1. We have special methods for account abstraction that are recognized by the VM. (for example “verify”). If dispatch happens internally, the VM will have a hard time telling them apart. There are similar methods for resource contracts.
  2. We can have restrictions on the method signatures that affect their gas cost, and are enforced by the VM (e.g., we might want to have “getter” methods that do a simple lookup of account state, that are cheaper to call)
  3. I think it will be easier to write general third-party parsers for transactions if their format is more well defined (e.g., you want to write an explorer or wallet client).

I prefer the pass by value option. This is more expensive up front (in terms of computational cost), since you have to allocate a bit more memory and copy stuff into it, but less expensive than an IO call. Since almost every callee will want to access its arguments, I think it makes sense to write the arguments into the memory of the callee directly.

Also, my intuition says that SNARKifying this will be easier – in general, the more context switches out of the core VM the higher the SNARK cost.

I really prefer this option. Again, I think this is something almost every caller will want to do, and it’s cheaper and simpler to write the return value into the caller’s memory.

Yes, we discussed it in depth :slight_smile:. The difference now is that we don’t need all L1 miners to understand the nonce, so it makes things easier.

I agree that a fully general nonce scheme needs to be carefully designed. To start with, I think we should fix a nonce scheme (e.g., the one we designed for L1). However, even with a fixed nonce scheme, we can still make the nonce something that only the VM executors care about, meaning we can potentially allow generalization in the future without changing miner code.

I don’t have a very strong opinion about this, though, so if separating nonce computation from the shuffling really complicates the implementation, I don’t mind just leaving it as is (assuming the currently implemented scheme is the parallel-friendly scheme we designed).

There’s a reason EVM allows you to pass a value in the cross-contract call:

  1. It “notifies” the recipient of the inbound coins and lets the recipient execute code to handle the inbound coins. Think of the “vending machine” model: the transfer triggers the vending machine to disburse the product.
  2. It allows the recipient to reject inbound coins, and protects the sender from sending coins to a contract/function that isn’t prepared to receive coins. (This is the “payable” decorator in Solidity.)

I’m a bit skeptical that we can make this work for “general resources” without a very different design. Note that, in the case of ERC-20, you lose both of these. There have been some attempts to add this back with tokens on Eth (see ERC-223 and ERC-777) but, in general, these haven’t taken off. This will require some additional thought on the UX impact.

I don’t think these are mutually exclusive. We can still have explicit methods (like verify and getNonce) that the VM is aware of and can call directly, but require that for program invocations (i.e., user transactions and cross-contract calls), dispatch happen in the main entrypoint function.

A “getter” method that doesn’t change state is always free to call. EVM supports this, too. We can add a static mode context item to the VM, like in EVM, to enforce this. Also, I see no reason this wouldn’t work:

fn main(&mut self, input: Vec<u8>) {
  // dispatch happens inside the single entrypoint
  let (selector, args) = self.parse_input(input);
  match selector {
    "readonly" => self.readonly(args),
    _ => panic!("unknown selector"),
  }
}

fn readonly(&self, args: Vec<u8>) {
  // do something in readonly mode that's free or very cheap
  // attempting to write to state would cause failure
  self.context.enter_cheap_static_mode(|| {
    // ...
  });
}

I’m unconvinced. In EVM, you need a contract’s ABI to parse its transactions. This is no different.

It depends how this is implemented. In Athena today programs need to make an IO syscall to read input. This is different than a host call because it doesn’t traverse FFI, it’s handled entirely within the running VM, so it’s very cheap. The memory has already been allocated and the input is already there. It’s just a question of the semantics around reading that input. I think it’s more of a DevEx question.

Unlike passing data in to a program with call, passing data back does have an additional cost since we’re not currently writing it to the caller’s memory. I actually think it’s pretty common to have a program that returns a value, but where many callers choose to discard the return value, so I’m not convinced we want to impose this cost on every call.

The issue with nonces is that they can by themselves invalidate a transaction, or an entire set of transactions. I’m not sure we can require even Athena executors to need to fire up a VM just to check the nonce of each tx in the Athena mempool. This is expensive and it opens us up to potential DoS vectors (e.g., craft a tx that won’t ever pay gas but that requires the executor to re-evaluate many other txs in the mempool).

What I’m proposing is a direct generalization of the Ethereum scheme.
In the transaction, instead of having an “amount” field of type “uint”, you have a “transfer” field of type “array of (resource,uint) tuples”. If you’re transferring only native coin, then this array would have the value “(athena-native, amount)”. But you could transfer several resources at once: e.g., “(athena-native,7), (myresource1,112),…”
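
As a sketch of what such a “transfer” field could look like on the host side: the resource addresses, the flat balance map, and the all-or-nothing behaviour below are assumptions for illustration. A plain balance move stands in for the implicit call to each resource’s “transfer” method.

```rust
use std::collections::HashMap;

type Resource = u64;
type Account = u64;

// Hypothetical predefined address for the native coin.
const ATHENA_NATIVE: Resource = 0;

// Applies every (resource, amount) tuple, or none of them if any one fails.
fn apply_transfers(
    balances: &mut HashMap<(Resource, Account), u64>,
    from: Account,
    to: Account,
    transfers: &[(Resource, u64)],
) -> Result<(), &'static str> {
    // Work on a copy so a failure part-way through leaves balances untouched.
    let mut next = balances.clone();
    for &(resource, amount) in transfers {
        let source = next.entry((resource, from)).or_insert(0);
        *source = source.checked_sub(amount).ok_or("insufficient balance")?;
        let target = next.entry((resource, to)).or_insert(0);
        *target = target.checked_add(amount).ok_or("balance overflow")?;
    }
    *balances = next;
    Ok(())
}

fn main() {
    let my_resource: Resource = 5; // hypothetical user-defined resource
    let mut balances = HashMap::new();
    balances.insert((ATHENA_NATIVE, 1), 10);
    balances.insert((my_resource, 1), 200);
    let transfer = [(ATHENA_NATIVE, 7), (my_resource, 112)];
    println!("{:?}", apply_transfers(&mut balances, 1, 2, &transfer));
}
```

A transfer of only native coin is then just a one-element array.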

Marking a method as “payable” in the template might make sense, but I think it’s an orthogonal issue. (Also, this is another case where explicit methods would be helpful)

This sounds like the worst of both worlds – we have the (minor) added complexity of explicit method invocations, but don’t get the additional transparency and fine-grained method controls that explicit method invocation gives us.

I don’t think getter methods should be free, because there is an added cost of loading the state of another account into memory (and depending on which state is accessed, potentially a key lookup or computation as well).

Regarding your example; of course it would work. The question is what are the costs. I think the difference is similar to compile-time vs runtime. If you declare methods explicitly, you can run compile-time checks (and the equivalent when you are doing succinct proofs). For example, you might be able to check at template deployment time that a method can’t make any write system calls, and then can forgo these checks at runtime. If not, then you need to check everything at runtime (in the case of your example, I don’t see why a runtime declaration of “read only scope” is helpful — you can just charge more gas if the code writes to state).

Again, it’s not about working vs not working. It’s about how easy it is to do things. Requiring the ABI to parse transactions is a negative. Even with explicit method signatures, to get a fully parsed transaction we’ll also need ABI information, but when you don’t have it you’ll still get more transparency than if everything is hidden behind a single method call.

Note that contract designers can declare only a single method, and then parse its input and do dispatch. You can think of explicit methods as a “precompile” for the dispatch code that will be repeated for almost every contract.

If by “syscall” you mean call a library routine written in risc-v, then that’s probably ok – although still a bit more expensive than a simple memory access. If it’s an actual risc-v syscall it might be a lot more expensive when we’re doing succinct proofs.

That makes sense, but we can still have a binary “ignore return value” parameter for the call; if it’s true, there’s no way to access the return value, and if it’s false, you get it in memory.

I think we’re basically agreeing here – this is what I meant by “needs to be carefully designed”. In any case, I don’t think we need fully general nonces for our initial deployment…

I need to give this some more thought. It seems too complicated. For one thing, I’m not sure that transferring, say, an NFT to a program should trigger the same code path as sending the native coin. And I’m not sure that a program can know at deploy + spawn time all of the types of resources it will need to be responsible for handling in the future, including those that don’t exist yet. It feels error-prone. And it could lead to a situation where a program that accepts payment in, say, multiple tokens or currencies needs to perform exchange conversions on the fly. I don’t like any of this.

I realized that, the way our VM (vCPU) is built today, we can’t actually call methods explicitly. Programs are compiled to ELF and an ELF executable has a single entrypoint. I need to look into how much work this would entail.

This is tricky. We don’t control the compilation process, since we compile using Rust. I’m not sure we can make compile-time checks beyond those already enforced by Rust, at least not without some deep changes. And I’m extremely hesitant to make any changes to Rust because I don’t want to maintain our own Rust fork/toolchain.

Checking for particular syscalls at deploy time shouldn’t be a problem. We can do some very basic static analysis on the already-compiled code.
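
A very rough sketch of what such a deploy-time scan might look like. It assumes the code is a flat sequence of little-endian 32-bit RISC-V instructions with no compressed instructions and no data mixed in, and it only finds ECALL instructions. Working out which syscall each ECALL makes would need some register tracking, which is not shown. The function name is made up for this example.

```rust
// Encoding of the RISC-V ECALL instruction.
const ECALL: u32 = 0x0000_0073;

// Returns the byte offset of every ECALL in `code`.
// Assumption: fixed-width 32-bit instructions, little-endian, code only.
fn ecall_offsets(code: &[u8]) -> Vec<usize> {
    code.chunks_exact(4)
        .enumerate()
        .filter(|(_, word)| u32::from_le_bytes([word[0], word[1], word[2], word[3]]) == ECALL)
        .map(|(index, _)| index * 4)
        .collect()
}

fn main() {
    // addi x0, x0, 0 (nop) followed by ecall
    let code = [0x13, 0x00, 0x00, 0x00, 0x73, 0x00, 0x00, 0x00];
    println!("{:?}", ecall_offsets(&code));
}
```

A template with no ECALLs at all cannot reach the host, so it certainly cannot write to state; anything finer-grained needs more analysis than this.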

Nothing is “a library routine written in RISC-V.” We interpret all of the code, including the syscalls. It’s a RISC-V ECALL. I confirmed that, with C semantics, argc and argv are already loaded onto the stack when the program starts executing. We can consider doing the same.

I don’t think this is a good idea. These calls need to be as simple as possible.

I don’t see why it’s complicated. The contract can ignore resources it doesn’t want to deal with (and the “MustAccept” trait ensures that accounts have to explicitly opt in to receiving resources that could do harm). Since resource traits are enforced by (and known to) the VM, the contract can also check whether a resource has a Data trait (i.e., is an NFT) and deal with it in a different way.

A simple contract can ignore everything except the native coin, and won’t be any more complex to write. However, an advanced contract can deal dynamically even with resources that didn’t exist at the time the contract was written. That is, if you want your contract to perform exchange conversions, you can write the code to do that. But if you don’t, you just ignore resources you don’t want to deal with, and the VM rules guarantee you can’t be harmed by sending you resources.

ELF is a linkable format – you can compile to a library rather than an executable. But in any case, I doubt we want to store ELFs on-chain – it’s a pretty verbose format, and we don’t need most of its features.

We should separate “what we’re currently doing” from the protocol-level design of the VM. At the protocol level, there is no rust – there’s only risc-v. What we’re currently doing is generating the risc-v code by compiling from rust.

I’m a lot more concerned about getting the protocol design right than I am about how we currently generate contract code from source. We should design the protocol in a way that allows compile-time and deploy-time verification (if possible), and makes it easy to generate succinct proofs about execution.

It’s fine if we don’t do compile-time checks in our first rust toolchain, but less fine if our design makes this impossible for any toolchain.

We need to keep in mind that our code will be running not just in your interpreter, but also in something like the risc0 or sp1 VM, and potentially other host types in the future. When I say “a library routine written in RISC-V”, I mean that on the logical level, this is implemented as risc-v code that is reached with a JAL opcode (the risc-v equivalent of “call”, IIRC), or even inlined by the compiler into the contract code.

There’s a big difference between this and an ECALL when running on risc0, for example; in the former case, risc0 doesn’t even know that a library routine is being called – it’s all part of the same program. In the latter, you need to deal with external I/O to the program.

Regarding the external calling convention (i.e., calling into a contract from outside), I don’t have strong opinions about whether we should use stack-based arguments (like the C calling convention) or memory mapping. All else being equal, I’d suggest using the risc-v calling conventions, which transfer arguments in registers where possible, because it’s more likely that risc-v toolchains will be compatible with them.
Perhaps @iddo or someone more familiar with risc0/sp1 internals can chime in?

Then I think we should always return the value. (We can define at the method level whether there is a return value – this won’t make the calls more complex, but will probably take care of most cases where the callers don’t need the return value.)

I also don’t think getter methods should be free. They are read-only methods that don’t change state, so they should be assigned minimal (but nonzero) gas costs, with higher gas costs assigned to state-modifying setters and complex operations. This can be enforced at the VM level, where the VM identifies the function type based on its signature or selector and applies the appropriate predefined gas cost.

To be clear, reading state, or running any other code in static mode, is always free outside the context of running an actual program (smart contract) on chain in consensus. That’s because any single node can do it on its own and it puts no burden on the network. This is what I linked to above: Contract.

This is relevant in the context of, e.g., a (web2) program that just needs to read state.

Code running on chain in consensus always has a cost.