RFC: Actor/message-passing syntax for contracts

---
tags: RFC
---

# RFC: Actor/message-passing syntax for contracts

In fe, we currently model contracts as (rust-like) structs.

```
contract MyCoin {
    total_balance: u256
    balances: Map<address, u256>

    pub fn transfer(mut self, ctx: Context, to: address, value: u256) { .. }
}
```

A contract can have some number of data fields, and public and private functions that optionally read and/or write to these data fields. A contract can call a public function of another contract, and pass arguments to that function.

This is a decent, familiar abstraction, but a contract differs enough from a normal struct that some pesky inconsistencies and limitations arise. Some of these are explained in the Motivation section below.

Perhaps we should consider an entirely new abstraction for contracts. It seems that defining a contract in terms of the actor/message-passing programming model is a good fit for the underlying evm semantics, and might give us more flexibility in how we handle the behavior of contracts that doesn't fit well into the struct model.

# Proposal

Note that this is a very rough sketch, intended as a starting point of a discussion.
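
For readers who want to poke at the actor/message-passing shape in a language that compiles today, a plain Rust analogy might look roughly like this. It is only an illustration and not fe: `Address`, `Reply`, the explicit `sender` parameter, `Coin::new`, and the insufficient-balance check are all assumptions made for this sketch.

```rust
use std::collections::HashMap;

// Stand-in for fe's `address` type (assumption for this sketch).
type Address = u64;

/// One variant per message the contract understands.
enum CoinMsg {
    BalanceOf { addr: Address },
    Transfer { to: Address, value: u128 },
    TotalSupply,
}

/// What a handler sends back to the caller (hypothetical; the RFC instead
/// attaches a return type to each message variant).
#[derive(Debug, PartialEq)]
enum Reply {
    Amount(u128),
    Success(bool),
}

struct Coin {
    total_supply: u128,
    balances: HashMap<Address, u128>,
}

impl Coin {
    /// Hypothetical constructor: the whole supply starts with `owner`.
    fn new(owner: Address, total_supply: u128) -> Coin {
        let mut balances = HashMap::new();
        balances.insert(owner, total_supply);
        Coin { total_supply, balances }
    }

    fn balance(&self, addr: Address) -> u128 {
        *self.balances.get(&addr).unwrap_or(&0)
    }

    /// The single entry point: every interaction is a message, and the
    /// handler is just a `match` on the message type.
    fn recv(&mut self, sender: Address, msg: CoinMsg) -> Reply {
        match msg {
            CoinMsg::BalanceOf { addr } => Reply::Amount(self.balance(addr)),
            CoinMsg::Transfer { to, value } => {
                let from_balance = self.balance(sender);
                // Assumption: report failure instead of reverting.
                if from_balance < value {
                    return Reply::Success(false);
                }
                self.balances.insert(sender, from_balance - value);
                let to_balance = self.balance(to);
                self.balances.insert(to, to_balance + value);
                Reply::Success(true)
            }
            CoinMsg::TotalSupply => Reply::Amount(self.total_supply),
        }
    }
}
```

The point of the analogy is that the contract exposes no callable methods at all; callers can only construct a message and hand it over, which is close to what actually happens on the evm.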

Example contract definition:

```rust
contract Coin {
    storage: CoinState
    ctx: std::Context

    const total_supply: u256 // or TOTAL_SUPPLY
    const name: String<32>
    const symbol: String<8>
    const decimals: u8

    recv CoinMsg {
        CoinMsg::BalanceOf { addr: address } => with storage {
            return storage.balances[addr]
        }
        CoinMsg::Transfer { to: address, value: u256 } => with (mut ctx, mut storage) {
            let from = ctx.msg_sender()
            storage.transfer(from, to, value)
            ctx.emit(TransferEvent { from, to, value })
        }
        CoinMsg::TotalSupply => return total_supply
    }
}

message CoinMsg {
    BalanceOf { addr: address } -> u256
    Transfer { to: address, value: u256 } -> bool
    TotalSupply -> u256
}

struct CoinState {
    balances: Map<address, u256>
    allowances: Map<address, Map<address, u256>>

    pub fn transfer(&mut self, from: address, to: address, value: u256) {
        self.balances[from] -= value
        self.balances[to] += value
    }
}

fn test() {
    let coin = Coin(0xbeef)
    let balance = call(coin, CoinMsg::BalanceOf { addr: some_address })
    // or `ctx.call(..)`, or `ctx.delegate_call(coin, ..)` or ..
}
```

## Message types

Message types are defined using a new syntax, similar to an `enum` definition, but with a return type for each message variant.

```rust=
message CoinMsg {
    BalanceOf { addr: address } -> u256
    Transfer { to: address, value: u256 } -> bool
}
```

### Message type encoding (abi)

TBD.

Simple proposal 1: Introduce a trait for each abi, and implement that abi trait for each built-in type. Allow user-defined types to implement the trait (manually, or ideally via a `derive` macro).

```rust=
trait SolAbi {
    fn encode_size(self) -> u32
    fn encode(self, mut
}

#abi(sol)
message CoinMsg {
    Transfer { to: address, value: u256 } -> bool
}
```

## `recv` statement

The message handling logic is specified in a `recv` block, which is only allowed inside of a contract definition. In this formulation, it's basically just a match statement on a user-specified message type (which must implement the `Decode` trait).

```
recv Foo {
    Foo::BalanceOf { addr: address } => with storage {
        return storage.balances[addr]
    }
    Foo::Transfer { to: address, value: u256 } => with (mut ctx, mut storage) {
        let from = ctx.msg_sender()
        storage.transfer(from, to, value)
        ctx.emit(TransferEvent { from, to, value })
    }
    Foo::TotalSupply => return total_supply
}
```

The match arms here are `with` blocks, where the required "capabilities" of the given message handler are specified. This structure makes it easy to determine the `stateMutability` (pure/view/etc) of each message (aka "external function").

## contract fields or capabilities or associated types

Instead of having some number of contract data fields, we now specify a single associated storage type. This is inspired by associated types in rust traits; it could follow rust's syntax, or use something different.

```
contract C {
    type Storage = Foo
    # OR
    storage: Foo
    # OR
    storage = Foo
```

I had initially written `cap storage = Foo`, conceptualizing this as defining a "capability" called `storage`. This capability is then used in the `recv` statement, and can be used to derive the `stateMutability` of "external functions".

The storage type might be a basic struct, or it could be some database abstraction that uses some low-level storage access primitives or whatever. Details TBD.

The `ctx` type/capability that gives access to chain context could be specified in the same fashion; maybe it defaults to `std::context::Context`?

```
contract C {
    # every contract must specify storage and context types?
    storage: Foo # must impl trait Storage?
    ctx: Context # must impl trait ChainContext?
```

Perhaps users could also define their own capabilities, to place additional restrictions on which code can do what.

## Stretch goals

### Generic associated types

If a contract's `storage` type is generic, it can act as a customization point. The contract definition could, for example, implement security best practices, while delegating some tasks to the `storage` type.

```rust
// in some library
contract Wallet<S: WalletStorage> {
    storage: S
    const OWNER: address

    recv Msg {
        Deposit => ...,
        Transfer { to: address, value: u256 } => with (mut ctx, mut storage) {
            let caller = ctx.msg_sender()
            assert!(caller == OWNER)
            assert!(to != address(0))
            assert!(value <= storage.get_transfer_limit())
            if storage.authorize_transfer(to, value) {
                ctx.send_value(to, value)
            }
            ctx.emit(TransferEvent { from: caller, to, value })
        }
    }
}

trait WalletStorage {
    fn get_transfer_limit(self) -> u256
    // takes `mut self` so that implementations can record the transfer
    fn authorize_transfer(mut self, to: address, value: u256) -> bool
}
```

```rust
// user code:
use somelib::{Wallet, WalletStorage}

type MyWallet = Wallet<MyWalletStorage>

struct MyWalletStorage {
    friends: Map<address, bool>
    totals: Map<address, u256>
}

impl WalletStorage for MyWalletStorage {
    fn get_transfer_limit(self) -> u256 {
        return 100_000
    }
    fn authorize_transfer(mut self, to: address, value: u256) -> bool {
        let ok = self.friends[to]
        if ok {
            self.totals[to] += value
        }
        return ok
    }
}
```

### message encoding customization

```
# Interface compatible with classic ERC20, and modern NuToken standards
#[derive(Decode)]
enum NuTokenMsg {
    #[abi = sol("transfer(address,uint256)")]
    TransferClassic { to: address, value: u256 }

    #[abi = nu, id = 1]
    Transfer { to: address, value: u256 }
}
```

```
#[derive(Decode)]
enum Eth2 {
    #[abi = sol("deposit(bytes,bytes,bytes,bytes32)")]
    DepositClassic {
        pubkey: Array<u8, 48>
        withdrawal_creds: Array<u8, 32>
        sig: Array<u8, 96>
        data_root: Array<u8, 32>
    }
    #[abi = nu, id = 1]
    Deposit {
        pubkey: BlsPublicKey
        withdrawal_creds: Array<u8, 32>
        sig: BlsSignature
        data_root: Sha256Hash
    }
    GetDepositRoot,
    SupportsInterface { value: u64 }
}
```

# Open questions

## contract interfaces (eg ERC20)

### How does one specify a contract interface?

For example, the ERC721 interface:

```
function balanceOf(address _owner) external view returns (uint256);
function ownerOf(uint256 _tokenId) external view returns (address);
function safeTransferFrom(address _from, address _to, uint256 _tokenId, bytes data) external payable;
function safeTransferFrom(address _from, address _to, uint256 _tokenId) external payable;
function transferFrom(address _from, address _to, uint256 _tokenId) external payable;
function approve(address _approved, uint256 _tokenId) external payable;
function setApprovalForAll(address _operator, bool _approved) external;
function getApproved(uint256 _tokenId) external view returns (address);
function isApprovedForAll(address _owner, address _operator) external view returns (bool);
```

(Note that `safeTransferFrom` is overloaded.)

A message `enum` type doesn't allow for a return type to be specified.

**Bad idea 1:** specify a recv block with return types. Message enum fields aren't included, because they're specified in the enum definition.

```
interface Erc721 {
    recv Erc721Msg {
        BalanceOf -> u256
        OwnerOf -> address
        SafeTransferFrom -> ()
    }
}

enum Erc721Msg {
    BalanceOf { owner: address }
    ...
}
```

**Bad idea 2:** The interface message type definition includes expected return types.

```
#[interface]
enum Erc721 {
    BalanceOf { owner: address } -> u256

    // `Option<Bytes>` is used here to handle the overloaded function name.
    // I guess this would result in two selectors, one with the data arg and one
    // without, which is weird. Alternatively, we could allow the same name to
    // be used more than once in a message enum
    SafeTransferFrom { from: address, to: address, id: u256, data: Option<Bytes> }
}
```

It feels weird to have special syntax in an enum block if enums are used as message types.

### How does a contract declare that it implements one or more interfaces?

Idea 1: It's implicit in the message type

## recv block restrictions

Constraints:
- we need to be able to determine `stateMutability` of each "function" (== message handler)
- ??

# Motivation

Contracts and structs are syntactically identical, but semantically different.

### instance methods vs type-associated functions

```
contract C {
    pub fn foo() { .. }
}
struct S {
    pub fn foo() { .. }
}

let s: S = ..
s.foo() # ERROR, `foo` does not take `self`, use `S::foo()` instead

let c: C = ..
c.foo() # OK
```

For structs, there's a clear distinction between instance methods and associated functions. For contracts, every function is an "instance method", even if it looks like an associated function; we're calling the function on a contract deployed at a particular address, so we write `c.foo()`, even though `foo` doesn't take `self`.

```
contract C {
    pub fn foo() { .. }
    fn g(self) {
        let c: C = ..
        c.foo()    # OK
        self.foo() # ERROR, `foo` does not take `self` use `foo()`
    }
}
```

For structs, "takes `self`" === "is an instance method".
For contracts, "takes `self`" === "accesses storage".

### contract mutability

```
contract C {
    pub fn foo(mut self) { .. }
}
struct S {
    pub fn foo(mut self) { .. }
}

let s: S = ..
s.foo() # ERROR: not mutable

let c: C = ..
c.foo() # OK
```

A struct function that takes `mut self` (or `&mut self` if we have explicit references) can only be called on a mutable struct. A contract function that takes `mut self` can currently be called on a "non-mutable" external contract. We could require that an external contract be defined as `let c: &mut C` or `let c: *mut C` to make this more consistent.

However, when calling a function on an external contract, it's important to remember that the contract can do *whatever it wants to do*. It's not constrained by the mutability of our local variable, or by whether we "pass in" the special `Context` object.
An external contract function signature might declare that it doesn't access storage or context, but it's actually free to do both. Note that the compiler could use the STATICCALL opcode when calling out to an immutable external contract to restrict its ability to mutate storage, but we don't yet do this.

### contract `self` has a special type

For structs, `self` is just an instance of the struct type, and can be passed into a function that expects that struct type, just like any other instance of that struct type.

```
struct S {
    pub fn foo(self) {}
    pub fn bar(self) {
        call_sfoo(s: self) # OK
    }
}
fn call_sfoo(s: S) {
    s.foo()
}
```

For contracts, a contract type is a "newtype" wrapper around the primitive `address` type. However, the `self` contract is something different. The `self` address is retrieved via the evm `ADDRESS` opcode. Calling a function on the `self` contract is an internal call, rather than an external call. It's (currently) not possible to pass the `self` contract into a function that expects an instance of that contract type.

```
use std::context::Context

contract C {
    pub fn foo(self) {}
    pub fn bar(self, ctx: Context) {
        let other: C = C(address(0))
        call_cfoo(c: other) # OK
        call_cfoo(c: self)  # ERROR, invalid use of contract `self`
        call_cfoo(c: C(ctx.self_address())) # OK; this makes an external call to the current contract
    }
}
fn call_cfoo(c: C) {
    c.foo()
}
```

### generics

```
struct S {
    pub fn f<T>(self, val: T) { .. } # OK
    fn g<T>(self, val: T) { .. }     # OK
}
contract C {
    pub fn f<T>(self, val: T) { .. } # Error? contract `pub` fns can't be generic
    fn g<T>(self, val: T) { .. }     # OK
}
```

If a struct contains a generic function, we can determine the complete set of concrete types used in the calls to that function, and generate code to handle all of those types (eg via monomorphization), all at compile time. If a contract contains a non-public generic function, we can do the same.
If a contract contains a public generic function, we can do the same for *internal* calls to that function, but can't do it for *external* calls to the function, as those happen after the code is compiled and deployed.

### `pub` fields

Struct fields can be marked `pub`, which means they're accessible by functions that aren't defined on the struct itself. With a mutable reference to a struct, it's possible to directly modify any pub fields of that struct.

Contract fields can't be marked `pub`. It might make sense to allow some contract fields to be *readable* by any external code; defining a `pub` field could be syntactic sugar for defining an accessor function for that field. However, it probably doesn't make sense for `pub` fields to be *writable* by any external code, without some way of restricting write access to certain addresses (see, for example, swift's "computed properties").

### Struct vs contract creation

Structs are created via rust-like struct initializer syntax, if the struct fields are visible to the creating code:

```
let p = Point { x: 10, y: 100 }
```

or via some user-defined "constructor" function that returns an instance of that struct type:

```
struct Point {
    pub x: u64
    pub y: u64

    pub fn origin() -> Point {
        return Point { x: 0, y: 0 }
    }
}
```

Contracts are created by doing something like:

```
# Not yet implemented
ctx.create<MyCon>(seed: None, value: 0, args: (10, 100))
```

which calls an optional user-defined `__init__` function:

```
contract MyCon {
    x: u8
    y: u16

    pub fn __init__(mut self, x: u8, y: u16) {
        self.x = x
        self.y = y
    }
}
```

### external contract call semantics

The evm defines four different calling opcodes. With our current contract function call syntax (`somecontract.foo()`), we could generate bytecode that uses `CALL` and `STATICCALL`, depending on mutability of the contract. I'm not sure how to support `CALLCODE` and `DELEGATECALL`, but it will presumably require new syntax.

The evm `CALL` op takes gas and value arguments, but there's currently no way to specify these arguments in fe.

Solidity: `somecontract.foo{value: 10, gas: 800}()`

### special `self` and `ctx` parameters/capabilities

If a contract function needs to access contract storage, it can be defined to take a `self` parameter. To access block/chain/message context, it can take a `ctx` parameter. If it takes neither, the function is "pure".

```
contract C {
    balances: Map<address, u256>

    pub fn transfer(mut self, mut ctx: Context, to: address, value: u256) {
        self.balances[ctx.msg_sender()] -= value
        self.balances[to] += value
        ctx.emit(Transfer { from: ctx.msg_sender(), to, value })
    }
}
```

`self` and `ctx` here are special, in that they aren't real arguments that are sent by the caller. They're more like "capabilities" that the function has; `self` gives access to the current contract storage fields, and `ctx` gives access to the block/chain/msg context. (Technically, `Context` has zero size, so it could be thought of as being passed in by the caller, and fe currently requires that calls to external functions include `ctx` where specified.)

These parameters play the same role as solidity's "mutability" function labels (pure/view/nonpayable/payable), but they make the use of storage and context very explicit in the function body.

These are fine, but they also add a lot of noise to the fn parameter line, and mixing them with real arguments that are encoded and sent as part of the message calldata from the caller feels weird. We could explore some new syntax for these "capabilities" that retains the explicitness and the overall contract fn design, but doesn't pretend that they're function parameters.

Example:

```
pub fn transfer(to: address, value: u256) uses mut storage, mut ctx {
    storage.balances[ctx.msg_sender()] -= value
```
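
One way to picture how declared capabilities could map onto a Solidity-style mutability label is the small plain-Rust model below. It is a sketch and not fe compiler code: `Access`, `Uses`, `StateMutability` and `state_mutability` are hypothetical names, and `payable` is left out because nothing above says how a handler would declare that it accepts value.

```rust
/// How a handler uses one capability (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Access {
    Unused,
    Read,
    Mutate,
}

/// The capabilities a handler declares, e.g. `uses mut storage, mut ctx`.
#[derive(Clone, Copy, Debug)]
struct Uses {
    storage: Access,
    ctx: Access,
}

/// Solidity-style labels; `payable` is omitted (see the note above).
#[derive(Debug, PartialEq, Eq)]
enum StateMutability {
    Pure,
    View,
    Nonpayable,
}

/// Derive the label purely from the declared capabilities.
fn state_mutability(uses: Uses) -> StateMutability {
    match (uses.storage, uses.ctx) {
        // Touches neither storage nor context.
        (Access::Unused, Access::Unused) => StateMutability::Pure,
        // Mutating either capability may change chain state.
        (Access::Mutate, _) | (_, Access::Mutate) => StateMutability::Nonpayable,
        // Everything else only reads.
        _ => StateMutability::View,
    }
}
```

Under this model, the `transfer` handler (`uses mut storage, mut ctx`) would come out as `Nonpayable`, a handler that only reads `storage` as `View`, and a handler that declares no capabilities as `Pure`.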
