ethereum/EIPs

EIP 101 (Serenity): Currency and Crypto Abstraction

vbuterin opened this issue · 73 comments

Title

  EIP: 101
  Title: Serenity Currency and Crypto Abstraction
  Author: Vitalik Buterin <v@buterin.com>
  Status: Active
  Type: Serenity feature
  Created: 2015-11-15

Specification

  1. Accounts now have only two fields in their RLP encoding: code and storage.
  2. Ether is no longer stored in account objects directly; instead, at address 0, we premine a contract which contains all ether holdings. The eth.getBalance command in web3 is remapped appropriately.
  3. msg.value no longer exists as an opcode, and neither does tx.gasprice.
  4. A transaction now only has four fields: to, startgas, data and code.
  5. Aside from an RLP validity check, and checking that the to field is twenty bytes long, the startgas is an integer, and code is either empty or hashes to the to address, there are no other validity constraints; anything goes. However, the block gas limit remains, so miners are disincentivized from including junk.
  6. Gas is charged for bytes in code at the same rate as data.
  7. When a transaction is sent, if the receiving account does not yet exist, the account is created, and its code is set to the code provided in the transaction; otherwise the code is ignored.
  8. A tx.gas opcode is added alongside the existing msg.gas at index 0x5c; this new opcode allows the transaction to access the original amount of gas allotted for the transaction
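
As a rough illustration of the validity check in item 5, here is a minimal Python sketch. The names code_address and is_valid_tx are hypothetical. It assumes that "code hashes to the to address" means the last 20 bytes of a hash of the code, and it uses hashlib.sha3_256 only as a stand-in, since Ethereum's sha3 is Keccak-256, which produces different digests. Rejecting a negative startgas is also an assumption.

```python
import hashlib


def code_address(code: bytes) -> bytes:
    # Assumption: the address is the last 20 bytes of a hash of the code.
    # hashlib.sha3_256 is a stand-in; Ethereum's sha3 is Keccak-256.
    return hashlib.sha3_256(code).digest()[-20:]


def is_valid_tx(to: bytes, startgas, data: bytes, code: bytes) -> bool:
    # The to field must be exactly twenty bytes long.
    if not isinstance(to, bytes) or len(to) != 20:
        return False
    # startgas must be an integer (non-negativity is assumed here).
    if isinstance(startgas, bool) or not isinstance(startgas, int) or startgas < 0:
        return False
    # code is either empty or hashes to the to address.
    # The data field is deliberately unconstrained.
    return code == b"" or code_address(code) == to


example_code = b"\x60\x00"
example_to = code_address(example_code)
print(is_valid_tx(example_to, 100000, b"", example_code))
```

Nothing here inspects signatures, nonces or balances; under this proposal those checks live in the account's own code.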

Note that ECRECOVER, sequence number/nonce incrementing and ether are now nowhere in the bottom-level spec (NOTE: ether is going to continue to have a privileged role in Casper PoS). To replicate existing functionality under the new model, we do the following.

Simple user accounts can have the following default standardized code:

# We assume that data takes the following schema:
# bytes 0-31: v (ECDSA sig)
# bytes 32-63: r (ECDSA sig)
# bytes 64-95: s (ECDSA sig)
# bytes 96-127: sequence number (formerly called "nonce")
# bytes 128-159: gasprice
# bytes 172-191: to
# bytes 192+: data

# Get the hash for transaction signing
~mstore(0, msg.gas)
~calldatacopy(32, 96, ~calldatasize() - 96)
h = sha3(96, ~calldatasize() - 96)
# Call ECRECOVER contract to get the sender
~call(5000, 3, [h, ~calldataload(0), ~calldataload(32), ~calldataload(64)], 128, ref(addr), 32)
# Check sender correctness
assert addr == 0x82a978b3f5962a5b0957d9ee9eef472ee55b42f1
# Check sequence number correctness
assert ~calldataload(96) == self.storage[-1]
# Increment sequence number
self.storage[-1] += 1
# Make the sub-call and discard output
~call(msg.gas - 50000, ~calldataload(160), 192, ~calldatasize() - 192, 0, 0)
# Pay for gas
~call(40000, 0, [SEND, block.coinbase, ~calldataload(128) * (tx.gas - msg.gas + 50000)], 96, 0, 0)

This essentially implements signature and nonce checking, and if both checks pass then it uses all remaining gas minus 50000 to send the actual desired call, and then finally pays for gas.
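
To make the calldata schema in the account code concrete, the following Python sketch unpacks it off-chain. AccountCall, word and parse_account_call are hypothetical names, and the sketch assumes the 20-byte to address sits in the low bytes of the word at offset 160, which is what bytes 172-191 in the schema implies.

```python
from typing import NamedTuple


class AccountCall(NamedTuple):
    v: int
    r: int
    s: int
    sequence_number: int
    gasprice: int
    to: bytes
    data: bytes


def word(calldata: bytes, offset: int) -> int:
    # Read one 32-byte big-endian word, like ~calldataload(offset).
    return int.from_bytes(calldata[offset:offset + 32], "big")


def parse_account_call(calldata: bytes) -> AccountCall:
    if len(calldata) < 192:
        raise ValueError("calldata shorter than the 192-byte header")
    return AccountCall(
        v=word(calldata, 0),
        r=word(calldata, 32),
        s=word(calldata, 64),
        sequence_number=word(calldata, 96),
        gasprice=word(calldata, 128),
        to=calldata[172:192],   # low 20 bytes of the word at offset 160
        data=calldata[192:],
    )


sample = (
    (27).to_bytes(32, "big")        # v
    + (1).to_bytes(32, "big")       # r
    + (2).to_bytes(32, "big")       # s
    + (7).to_bytes(32, "big")       # sequence number
    + (50).to_bytes(32, "big")      # gasprice
    + b"\x00" * 12 + b"\xaa" * 20   # to, left-padded to 32 bytes
    + b"hello"                      # data
)
call = parse_account_call(sample)
print(call.sequence_number, call.gasprice, call.to.hex())
```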

Miners can follow the following algorithm upon receiving transactions:

  1. Run the code for a maximum of 50000 gas, stopping if they see an operation or call that threatens to go over this limit
  2. Upon seeing that operation, make sure that it leaves at least 50000 gas to spare (either by checking that the static gas consumption is small enough or by checking that it is a call with msg.gas - 50000 as its gas limit parameter)
  3. Pattern-match to make sure that gas payment code at the end is exactly the same as in the code above.
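
One possible reading of the three steps above, as a toy Python model rather than real miner software. Step, miner_accepts and the string-based tail comparison are all hypothetical simplifications: a real implementation would trace EVM execution and compare bytecode.

```python
from dataclasses import dataclass
from typing import List, Optional

GAS_RESERVE = 50000


@dataclass
class Step:
    name: str
    static_gas: Optional[int]               # None: cost not known in advance (a CALL)
    forwards_all_but_reserve: bool = False  # True if the gas parameter is msg.gas - 50000


def miner_accepts(steps: List[Step], startgas: int, tail: str, expected_tail: str) -> bool:
    used = 0
    for step in steps:
        # Step 1: keep running while the next operation stays within 50000 gas.
        if step.static_gas is not None and used + step.static_gas <= GAS_RESERVE:
            used += step.static_gas
            continue
        # Step 2: the operation threatens the limit; check that it leaves the reserve.
        if step.forwards_all_but_reserve:
            leaves_reserve = True
        elif step.static_gas is not None:
            leaves_reserve = startgas - used - step.static_gas >= GAS_RESERVE
        else:
            leaves_reserve = False
        # Step 3: the payment code at the end must match the standard one.
        return leaves_reserve and tail == expected_tail
    return tail == expected_tail


standard = [Step("ecrecover", 5000), Step("sstore", 5000), Step("subcall", None, True)]
print(miner_accepts(standard, 200000, "PAY", "PAY"))
```

The point of the model is only that the miner's decision depends on a bounded prefix of execution plus a pattern match, never on running the whole sub-call.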

This process ensures that miners waste at most 50000 gas before knowing whether or not it will be worth their while to include the transaction, and is also highly general so users can experiment with new cryptography (eg. ed25519, Lamport), ring signatures, quasi-native multisig, etc. Theoretically, one can even create an account for which the valid signature type is a valid Merkle branch of a receipt, creating a quasi-native alarm clock.

If someone wants to send a transaction with nonzero value, instead of the current msg.sender approach, we compile into a three step process:

  1. In the outer scope just before calling, call the ether contract to create a cheque for the desired amount
  2. In the inner scope, if a contract uses the msg.value opcode anywhere in the function that is being called, then we have the contract cash out the cheque at the start of the function call and store the amount cashed out in a standardized address in memory
  3. In the outer scope just after calling, send a message to the ether contract to disable the cheque if it has not yet been cashed
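
The cheque flow can be sketched with a toy ether contract in Python. The class and method names are hypothetical, and escrowing the amount when the cheque is created is a design assumption that the proposal does not spell out.

```python
class EtherContract:
    """Toy model of the ether contract premined at address 0."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.cheques = {}
        self.next_id = 0

    def create_cheque(self, sender, recipient, amount):
        # Step 1 (outer scope, before the call): reserve the amount.
        if amount < 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        cheque_id = self.next_id
        self.next_id += 1
        self.cheques[cheque_id] = (sender, recipient, amount)
        return cheque_id

    def cash_cheque(self, cheque_id, caller):
        # Step 2 (inner scope): the callee cashes the cheque to learn msg.value.
        sender, recipient, amount = self.cheques[cheque_id]
        if caller != recipient:
            raise PermissionError("only the recipient may cash the cheque")
        del self.cheques[cheque_id]
        self.balances[recipient] = self.balances.get(recipient, 0) + amount
        return amount

    def disable_cheque(self, cheque_id, caller):
        # Step 3 (outer scope, after the call): refund if it was never cashed.
        if cheque_id not in self.cheques:
            return 0
        sender, recipient, amount = self.cheques[cheque_id]
        if caller != sender:
            raise PermissionError("only the sender may disable the cheque")
        del self.cheques[cheque_id]
        self.balances[sender] += amount
        return amount


eth = EtherContract({"alice": 100})
cid = eth.create_cheque("alice", "shop", 30)
print(eth.cash_cheque(cid, "shop"), eth.disable_cheque(cid, "alice"), eth.balances)
```

A callee that never uses msg.value simply never cashes the cheque, and the sender gets the amount back in step 3.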

Rationale

This allows for a large increase in generality, particularly in a few areas:

  1. Cryptographic algorithms used to secure accounts (we could reasonably say that Ethereum is quantum-safe, as one is perfectly free to secure one's account with Lamport signatures). The nonce-incrementing approach is now also open to revision on the part of account holders, allowing for experimentation in k-parallelizable nonce techniques, UTXO schemes, etc.
  2. Moving ether up a level of abstraction, with the particular benefit of allowing ether and sub-tokens to be treated similarly by contracts
  3. Reducing the level of indirection required for custom-policy accounts such as multisigs

It also substantially simplifies and purifies the underlying Ethereum protocol, reducing the minimal consensus implementation complexity.

Implementation

Coming soon.

What happens to contracts that use msg.value? Do they get automatically translated into the new abstraction?

Yep, every feature that gets removed should be auto-translateable.

However, note that this does require some care on the part of developers: particularly, anyone developing ethereum contracts now should use static jumps ONLY, not dynamic jumps (eg. PUSH <val> JUMP and PUSH <val> JUMPI are okay, PUSH 32 MLOAD JUMP is not).
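
A minimal Python sketch of how one might flag dynamic jumps in EVM bytecode, using the rule stated above: a JUMP or JUMPI counts as static only when the instruction immediately before it is a PUSH. The function name find_dynamic_jumps is hypothetical; the opcode values (JUMP 0x56, JUMPI 0x57, PUSH1 to PUSH32 0x60 to 0x7f) are the standard ones.

```python
JUMP, JUMPI = 0x56, 0x57
PUSH1, PUSH32 = 0x60, 0x7F


def find_dynamic_jumps(bytecode: bytes) -> list:
    """Return the program counters of jumps not directly preceded by a PUSH."""
    dynamic = []
    pc = 0
    prev_was_push = False
    while pc < len(bytecode):
        op = bytecode[pc]
        if op in (JUMP, JUMPI) and not prev_was_push:
            dynamic.append(pc)
        if PUSH1 <= op <= PUSH32:
            prev_was_push = True
            # Skip the immediate bytes so pushed data is never read as an opcode.
            pc += 1 + (op - PUSH1 + 1)
        else:
            prev_was_push = False
            pc += 1
    return dynamic


print(find_dynamic_jumps(bytes.fromhex("600456")))    # PUSH1 0x04, JUMP
print(find_dynamic_jumps(bytes.fromhex("60205156")))  # PUSH1 0x20, MLOAD, JUMP
```

The first call reports no dynamic jumps and the second reports the JUMP at offset 3, matching the two examples in the comment above.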

I take it this could allow for "calling collect" with sophisticated enough miners? (i.e. you just ping some random contract and it will pay for its own execution at no cost to you.) That would be awesome, considering the amount of shenanigans it takes to do the equivalent currently. See also: paying gas/mana with non-ether currencies.

Also, +1 for putting ether and subcoins on the same footing. I'm working on a simple contract (mostly for fun) to bridge that gap.

what is the target time to include it?

I take it this could allow for "calling collect" with sophisticated enough miners? (i.e. you just ping some random contract and it will pay for its own execution at no cost to you.)

Exactly. The goal with the above recommended miner software implementation is that if the miner sees a proof that they will get paid within 50000 steps, then they just go ahead and do it, so you should not even need to pre-arrange much of anything.

what is the target time to include it?

Serenity, ie. same time as Casper.

i'm in favour of bringing it forward to homestead-era, in preparation for serenity.

Exactly. The goal with the above recommended miner software implementation is that if the miner sees a proof that they will get paid within 50000 steps, then they just go ahead and do it, so you should not even need to pre-arrange much of anything.

This is a huge benefit that makes everything worth it, in my opinion.

indeed.

I have a concern about contract creation under this model. Currently, two contracts may have identical code but extremely different data, but in this case they would have to be the same contract. Think of a modern contract with the "owner" modifier, or the standard metacoin that gives the creator a zillion gizmos. Once I create an owned contract with a given code, no one else can make an identical contract with them as the owner. I don't think hardcoding the owner's signature is a good alternative, as then how do you change owners?

since we are down to code and storage in an account we could just put code in storage. For example store code at location 0. Then we would have a single merkle tree. To load code we would load program_address.concat(0) from the tree. And to load from storage index "test" we load program_address +"test" and so on.
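
A small Python sketch of the single-tree idea in the comment above, with a plain dict standing in for the merkle tree. The helper names are hypothetical, and encoding location 0 as a 32-byte zero word is an assumption; in this naive encoding a storage index of 32 zero bytes would collide with the code key.

```python
def code_key(address: bytes) -> bytes:
    # Assumption: "location 0" is encoded as a 32-byte zero word.
    return address + (0).to_bytes(32, "big")


def storage_key(address: bytes, index: bytes) -> bytes:
    # Storage index "test" lives at program_address + "test".
    return address + index


def entries_for(tree: dict, address: bytes) -> dict:
    # Everything belonging to one account shares the 20-byte address prefix.
    return {k[20:]: v for k, v in tree.items() if k.startswith(address)}


addr = bytes.fromhex("82a978b3f5962a5b0957d9ee9eef472ee55b42f1")
tree = {
    code_key(addr): b"\x60\x00",           # the account's code
    storage_key(addr, b"test"): b"\x2a",   # one storage slot
}
print(sorted(entries_for(tree, addr)))
```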

@wanderer that is a good idea in principle, but it depends on the ability to store a single code chunk of arbitrary size in storage, which would be a separate EIP (that I would support as a serenity change).

@vbuterin but we can store arbitrary sizes in the merkle tree. I'm saying that we just need one merkle tree and not a separate root for the storage. Now from within the EVM you don't have access to more than one word, which is not that nice. But the execution environment has to load the code and give it to the EVM as it stands now anyway, so being able to access the code from within the EVM is not a big concern yet.

Right, but it seems ugly to store code sequentially. Also, there are space efficiency reasons to have code be in one big chunk; that was the original reason to do it that way as I recall.

I'm going to go back on the idea of storing code at zero. We don't need that. Just store the codehash at the address. Then for storage just append the storage key to the address.

<address> = codeHash
<address> + 'test' = Storage key 'test'

Code should not be stored in storage, it has to be immutable.

It doesn't have to be immutable, but there are not many use cases for mutable code yet. Interpreters that JIT often need mutable code. But we don't have interpreters running on ethereum yet :P. And it's easier to just store the code directly at address.

@wanderer I hope that correctness and verifiability have a higher priority than speed here. Interpreters that compile just in time also do not need mutable code. It is fine for them to call newly created code, and that works perfectly with CREATE and CALLCODE.

My instinct at this point is to retain the "immutable code + mutable storage" dichotomy that we currently have.

👍 putting ether on equal footing as other tokens

I take it this could allow for "calling collect" with sophisticated enough miners?

I don't understand this. Doesn't block.coinbase.send(x) already provide everything we need for contracts that pay their own gas?

@aakilfernandes

Good point. But, as Vitalik pointed out, this would allow such a system to work easier. (Just set up your contract with the standard payment code and you're good.)

Concern: What if a contract appears to be able to pay for its own gas, but at the last moment shoves the ether it has to another contract? The miner can immediately refuse to continue the transaction, but that doesn't refund the miner or cost the contract.

I think a reasonable solution is for the miner to ensure that the actual desired, and gas limited, call is right before the payment code, no matter what.

Note that ECRECOVER, sequence number/nonce incrementing and ether are now nowhere in the bottom-level spec

So does this mean that ECRECOVER (for different curves) would need to be implemented in Solidity in a special library contract? Seems like that would be very expensive to call, no?

Ah, sorry. It would exist as a precompile at address 3.

BTW for everyone's curiosity, ECRECOVER has been implemented in Serpent already. It costs ~700k gas.

What happens to gasprice?

@vbuterin ah, so ECRECOVER for secp256k1 would be at address 3. Any plans on precompiling it for other curves (secp256r1, NIST P256, ed25519 etc)?

Hehe, NIST P-256 and secp256r1 is the same curve, doh! :)

I think that a precompile for ed25519 is reasonable; all the altcoins seem to be converging on it as an optimal curve so we should consider it. I implemented it in python here https://github.com/vbuterin/ed25519 but I haven't made any effort in making sure that it's standards-compliant yet, though at least in python it seems like its speed advantages over secp256k1 exist but are quite a bit smaller than advertised.

Small observation: while I think this is a good idea, I'd recommend postponing it as much as we can, to allow token standards discussions to have real-world usage and maturity.

@vbuterin Yeah Curve25519/Ed25519 are getting more and more popular. You should check out the NaCl library which is the canonical one for working with Curve25519 and Ed25519. I believe libnacl provides python bindings for this library.

The secp256r1 curve is interesting because it is a NIST standard, so signatures with this curve are supported on many off-the-shelf smartcards/USB keys like Yubikey. Also in iOS9 there is support for generating private keys and computing elliptic curve signatures using secp256r1 in the secure element of the iPhone, providing a very secure environment for mobile wallets.

@christianlundkvist I think NIST curves aren't very popular due to the limited evidence that their parameters are safe. See https://eprint.iacr.org/2014/571.pdf

@subtly I've never been able to figure out if there is some merit to the theory that the secp256r1 curve might be backdoored. It seems clear that Dual_EC_DRBG was indeed backdoored, but this RNG was immediately seen as suspicious and most people were reluctant to use it from the start. There are many inconclusive discussions such as this one

https://bitcointalk.org/index.php?topic=289795.200

which is mostly concerned with secp256k1. I guess that if you have a choice of other curves and there is a risk that it might be backdoored, you want to pick the other curve.

void4 commented

To extend on the code=data=state argument of @wanderer:
Wouldn't it be possible to make the EVM a tree-addressed system instead of a pure stack machine? Is anyone familiar with Urbits Nock? I admit, it is a bit esoteric, but it would make certain processes easier (e.g. formal verification). It would be an extreme modification of the original spec, but I figured these changes are easier to do now than later.

@void4 yeah I have looked into Nock. Feel free to message me on gitter if you're interested in VMs

This is a great idea, sooner rather than later, please!

janx commented

I have a concern about contract creation under this model. Currently, two contracts may have identical code but extremely different data, but in this case they would have to be the same contract. Think of a modern contract with the "owner" modifier, or the standard metacoin that gives the creator a zillion gizmos. Once I create an owned contract with a given code, no one else can make an identical contract with them as the owner. I don't think hardcoding the owner's signature is a good alternative, as then how do you change owners?

I have the same concern as @Smithgift. Any comments on this? Does it mean I have to 'tweak' the contract code so it has a different hash before creating a contract with it?

@janx: In the latest iteration of this idea (see here on the main Ethereum blog), a contract address is the hash of code and the sender's address. There's still an issue if you want to have multiple contracts of the same code with different constructor arguments, but that's a smaller issue.
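
A hedged Python sketch of the address rule described in the comment above. The function name contract_address, the concatenation order, and the truncation to the last 20 bytes are assumptions; hashlib.sha3_256 is a stand-in for Ethereum's Keccak-256 and yields different digests.

```python
import hashlib


def contract_address(sender: bytes, code: bytes) -> bytes:
    # Assumption: address = last 20 bytes of hash(sender || code).
    return hashlib.sha3_256(sender + code).digest()[-20:]


code = b"\x60\x00\x60\x00"
alice = b"\x01" * 20
bob = b"\x02" * 20

# Identical code deployed by different senders lands at different addresses.
print(contract_address(alice, code).hex())
print(contract_address(bob, code).hex())
```

The same sender deploying the same code twice still gets the same address, which is the remaining issue mentioned above for contracts that differ only in constructor arguments.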

I'm concerned about EVM errors in this system. Suppose an attacker creates a valid transaction which, several function calls down the line, makes an invalid jump and so undoes the whole transaction. The miner has spent resources to compute the transaction, but since the transaction never happened, he doesn't get paid.

One "fix" would be to put a true try-catch mechanism in the EVM, and have the outermost contract catch all from inside, so it always pays. But the additional complexity of partial transaction reversion sounds unpleasant, to say the least.

janx commented

@Smithgift thanks, I missed that. Hash with sender address is good enough to me, since I can always include my own 'nonce' in contract to generate different address.

@Smithgift the try-catch mechanism is already in place for the EVM. It does not have too much of an overhead because you can just switch back to a previously existing state root hash. Note that errors during execution do not revert the whole transaction but only the current call (Solidity has a mechanism for automatically causing an error in the outer stack frame in this case, but that is just a feature). You have to take care to clear deleted state trie nodes only at the end of the transaction and not while it is being executed.
The code above:

# Make the sub-call and discard output
~call(msg.gas - 50000, ~calldataload(160), 192, ~calldatasize() - 192, 0, 0)

calls the actual code, but reserves 50000 gas for paying the miner. If the call runs out of gas, it returns (and puts an error code on the stack, which is ignored in this example) and we still have 50000 gas left to pay the miner.
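
The gas arithmetic behind this can be checked with a small Python model. The function names are hypothetical and the model ignores the gas cost of the CALL instruction itself; the fee formula is the one in the account code, gasprice * (tx.gas - msg.gas + 50000).

```python
GAS_RESERVE = 50000


def run_subcall(gas_limit: int, gas_needed: int):
    """Return (success, gas_used); an out-of-gas call burns its whole limit."""
    if gas_needed > gas_limit:
        return False, gas_limit
    return True, gas_needed


def account_execute(gas_at_call: int, gas_needed: int):
    """Forward all gas except the reserve; return (success, gas left afterwards)."""
    ok, used = run_subcall(gas_at_call - GAS_RESERVE, gas_needed)
    return ok, gas_at_call - used


def gas_fee(gasprice: int, tx_gas: int, msg_gas: int) -> int:
    """Fee paid to block.coinbase, with msg_gas read just before the payment."""
    return gasprice * (tx_gas - msg_gas + GAS_RESERVE)


# A sub-call that needs far more gas than it was given fails,
# yet the outer frame keeps exactly the reserve to pay the miner.
ok, gas_left = account_execute(gas_at_call=200000, gas_needed=10**9)
print(ok, gas_left, gas_fee(gasprice=50, tx_gas=250000, msg_gas=gas_left))
```

In the failing case the fee works out to gasprice times the full tx.gas, so the miner is paid for all the gas the transaction was allotted.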

@chriseth: Thanks. Learn something new every day.

[BitNoCoin Proposal] Beat me to it, @vbuterin! Thanks for all of your work! I really appreciate all of your work on Ethereum, and my premine purchase is definitely worth it for your team's projects as well as the Ether!

Not sure if this belongs here, but is there interest in adding opcodes/precompiles for basic elliptic curve operations (EC addition, scalar multiplication, etc.)? I think you could do some fun on-chain crypto schemes, like quasi-homomorphic encryption, using this. I spoke to Denis Lukianov at DevCon and he mentioned that this may be on the roadmap?

@christianlundkvist i'm more interested in making the VM fast enough that we don't need precompiles

+1 for static jumps only!

Is it possible to forbid dynamic jumps in the next hardfork?

What's the inspiration for the assembly pseudocode in the EIP? I've been fiddling with building a disassembler that generates easy to read code, and that seems like a good target!

@Arachnid: I believe that's actually Serpent code.

@Arachnid: Check out here.

@Smithgift Hm, okay. Is that syntax documented anywhere? I can't find any mention of it.

@vbuterin or anyone else: Will this proposal make it possible to send someone Ether without sending their address a message, but instead just sending a message to address 0? If so, that prevents some of the complexity caused by having to account for malicious or broken recipients, but also means contracts can't "reject" ether like they can now.

That would also completely break wallets, so I doubt it.

@PeterBorah I believe it is both. The ether contract does not contact the receiving contract, but if there is common use of a cheque mechanism as described in the original post a contract can still reject ether. (Specifically, a contract would simply never accept the cheque.)

Pattern-match to make sure that gas payment code at the end is exactly the same as in the code above.

Doesn't this depend on the type of contract being used and the position the gas is sent in at? Will miners be able to pattern match for this kind of thing in general?

axic commented

What will happen to the current balance residing at address 0?

A tx.gas opcode is added alongside the existing msg.gas at index 0x5c; this new opcode allows the transaction to access the original amount of gas allotted for the transaction

Would it make sense implementing this separately, perhaps earlier? It is fixing an oversight in the current EVM.

Just as a reminder: With the release of an Ether-token some old contracts may become vulnerable. I had a look at wallet.sol (https://github.com/ethereum/dapp-bin/blob/master/wallet/wallet.sol) the multisig wallet which is used by many. wallet.sol is restricting transactions sending Ether with .value() using multisig. Using Ether-tokens those transactions could be done only using the data field. The multisig would become a shared wallet. It seems an easy auto-translation would not be possible.

axic commented

@vbuterin:

# bytes 128-159: gasprice
# bytes 172-191: to
# bytes 192+: data

While reading the code, it seems these offsets are wrong and at offset 160 should be the to.

~call(40000, 0, [SEND, block.coinbase, ~calldataload(128) * (tx.gas - msg.gas + 50000)], 96, 0, 0)

Can you also explain the format of the calls to the crypto contract? The above translates as:

bytes 0 - 31: command
bytes 32 - 63: account (token holder)
bytes 64 - 95: value
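
That three-word layout could be produced off-chain as in the Python sketch below. The numeric value of SEND is not given anywhere in this thread, so the constant here is purely a placeholder, and encode_ether_call is a hypothetical helper name.

```python
SEND = 1  # placeholder: the real command code is not specified in this thread


def encode_ether_call(command: int, account: bytes, value: int) -> bytes:
    """Pack [command, account, value] as three 32-byte big-endian words."""
    if len(account) != 20:
        raise ValueError("account must be a 20-byte address")
    return (
        command.to_bytes(32, "big")      # bytes 0-31: command
        + account.rjust(32, b"\x00")     # bytes 32-63: account, left-padded
        + value.to_bytes(32, "big")      # bytes 64-95: value
    )


coinbase = b"\xcc" * 20
payload = encode_ether_call(SEND, coinbase, 6500000)
print(len(payload))
```

The 96-byte length matches the input size argument passed to ~call in the gas payment line.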

Can we make the crypto contract conform to ERC20 (token interface) and make calls accordingly?

Also here's a rough implementation in Solidity: https://gist.github.com/axic/528017d2d67801fa669fd75577c2093c. In order to be optimised a lot more would need to be moved to inline assembly, nullifying the benefit of Solidity.

Is there a chance to have a pre-compiled ed25519 signature check contract in the near future ?

I have submitted an EIP pull request for the ed25519 addition: #665

@picostocks cool.

Here are my comments on this blog post.
Wait, ~mstore(0, ~txexecgas()) is overwritten by the next line, ~calldatacopy(32, 96, ~calldatasize() - 96), which writes the length of the input data to the first word of memory. So I don't see what the point of that first line is.

This seems outdated according to the current Yellow Paper version (equation 217), as it omits the gas limit after the gas price.

# bytes 0-31: v (ECDSA sig)
# bytes 32-63: r (ECDSA sig)
# bytes 64-95: s (ECDSA sig)
# bytes 96-127: sequence number (formerly called "nonce")
# bytes 128-159: gasprice
# There should be a line here for bytes 160-161: gaslimit
# bytes 172-191: to
# bytes 192-223: value
# bytes 224+: data

Also for:
~mstore(0, ~sha3(0, ~calldatasize() - 64))
I don't understand why there is minus 64. The SHA3 function is

μ'_s[0] ≡ Keccak(μ_m[μ_s[0] . . . (μ_s[0] + μ_s[1] − 1)])
μ'_i ≡ M(μ_i, μ_s[0], μ_s[1])

According to Appendix F in the Yellow Paper, the hash function excludes v, r and s, which makes sense. I think that line should change to have minus 96, not minus 64.
~mstore(0, ~sha3(0, ~calldatasize() - 96))

~call(5000, 3, [h, ~calldataload(0), ~calldataload(32), ~calldataload(64)], 128, ref(addr), 32)
Blog post:
~call(5000, 1, 0, 0, 128, 0, 32)
Note the difference. The blog post was posted on December 24 2015 while this issue was opened on 21 November 2015, so it's a while ago, but this issue is still open. The first line is more descriptive.

There are other differences which make things confusing.

~mstore(0, msg.gas) % above code
~mstore(0, ~txexecgas()) % blog post

txexecgas seems more like tx.gas.

h = sha3(96, ~calldatasize() - 96) 
~mstore(0, ~sha3(0, ~calldatasize() - 64)) % blog post

I suggest changing the above line to:
~mstore(0, ~sha3(0, ~calldatasize() - 96))
-64 doesn't make sense as you omit v, r and s (not just v and r) for the SHA3 opcode / Keccak function, as shown in appendix F, which can be found more easily in this version of the paper here, which has a document outline and extra features for readability. (A PR is here.)

Going back to:
~call(5000, 3, [h, ~calldataload(0), ~calldataload(32), ~calldataload(64)], 128, ref(addr), 32)
Blog post:
~call(5000, 1, 0, 0, 128, 0, 32)
"gas, to, value, in offset, in size, out offset, out size."
However, it looks like the order of CALL for ECRECOVER (which would presumably need to be the same order for all call opcodes, unless you had a separate call opcode for ECRECOVER, or I have misunderstood) is changed in the above, which I assume is like so: gas, ECRECOVER_CONTRACT_CODE_PRECOMPILE_ADDRESS, sequencenumber, in offset, to, outsize.
Unless ref(addr) is the out offset at the to address, but it is ambiguous to me. I guess you can't have in size because that depends on the data field.

Is there any chance to get a secp256r1 (aka prime256v1 or NIST P-256) pre-compiled signature verification?

I believe there is a strong case for supporting this scheme, as it's the only one implemented natively in both Android's Trusted Execution Environment [1] and iOS' Secure Enclave [2]. This would allow for stronger security in the application layer for virtually all mobile wallet apps, and other third-party cryptographic hardware (as the NIST P-256 is the preferred standard of the hardware industry, rather than secp256k1).

[1] https://source.android.com/security/keystore/implementer-ref
[2] https://www.apple.com/business/docs/iOS_Security_Guide.pdf

I think there is a concern that the NIST curve could have a backdoor, as another NIST curve was found to have a backdoor.

That's a hypothesis and I wouldn't discard it entirely.

However, assuming that secp256r1 is not backdoored by design, we get to significantly strengthen mobile wallet security against a wider array of attack vectors (from side channels to memory dumps), as the cryptographic operations would be performed inside a dedicated hardware processor. I believe the gains outweigh the risks for this one.

Sounds fair enough. I haven't done much research on this so it's hard to know for sure.

Is there any update(s) on this as to when it will be implemented?

Asone commented

@greggmojica: I guess it's going to be difficult to get an implementation date for this, as there's very little information about the implementation of EIP101.

It's not even listed on the main page of the repo, unfortunately. I'm also quite interested in being able to follow the development of this EIP, as it is a major step for the Ethereum ecosystem.

Asone commented

@cdetrio: Thanks for pointing that out! Added to my favs.

Can I suggest adding the EIP to readme.md and pointing to the new discussion, or should it stay completely outside of the readme?

This EIP is a major one, and even if it is far from being implemented, making the new location easy to find will allow many to stay up to date with the discussion.

Much datalove

There has been no activity on this issue for two months. It will be closed in a week if no further activity occurs. If you would like to move this EIP forward, please respond to any outstanding feedback or add a comment indicating that you have addressed all required feedback and are ready for a review.

This issue was closed due to inactivity. If you are still pursuing it, feel free to reopen it and respond to any feedback or request a review in a comment.