ethereum/EIPs

EIP 5: Gas costs for return values

chriseth opened this issue · 31 comments

Proposal

Change memory gas costs for return values of a call to be applied depending on the size of the written memory area, not the reserved memory area.

Current Situation

In a call, the caller has to provide an area to which the data returned by the call is written. Merely reserving this area already incurs gas costs for memory, even if the callee writes less data or none at all.

Advantage

This makes it possible to read a return value whose size is not known to the caller prior to the call: The caller just reserves a huge amount of memory.

Possible Extension

The CALL opcode (and its friends) may even return the actual size of memory written to, so that the data can be used (e.g. forwarded) even if nothing is known about its structure.

Implementation

The implementation might be a bit tricky: The CALL opcode has to be charged some gas upfront, then the call is performed without allocating memory for the return value. At the point where the return value is available, gas costs are computed, and if enough gas is still available, memory is allocated and written to.
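To make the cost difference concrete, here is a minimal Python sketch of the memory expansion charge. It assumes the Yellow Paper formula of 3 gas per 32-byte word plus a quadratic term of words squared divided by 512. The helper names (`words`, `memory_cost`, `expansion_cost`) are made up for illustration and are not taken from any client.

```python
G_MEMORY = 3          # gas per 32-byte word (assumed Yellow Paper constant)
QUAD_DIVISOR = 512    # divisor of the quadratic term (assumed Yellow Paper constant)


def words(n_bytes: int) -> int:
    """Number of 32-byte words needed to hold n_bytes."""
    return (n_bytes + 31) // 32


def memory_cost(n_words: int) -> int:
    """Total gas cost of a memory of n_words words."""
    return G_MEMORY * n_words + n_words * n_words // QUAD_DIVISOR


def expansion_cost(current_bytes: int, offset: int, size: int) -> int:
    """Gas needed to grow memory so that [offset, offset + size) is covered."""
    if size == 0:
        return 0  # a zero-length access never grows memory
    old_words = words(current_bytes)
    new_words = max(old_words, words(offset + size))
    return memory_cost(new_words) - memory_cost(old_words)


# Reserving a 1 MiB output area versus paying only for 64 bytes actually written.
reserved = expansion_cost(0, 0, 1024 * 1024)
written = expansion_cost(0, 0, 64)
print(reserved, written)  # 2195456 6
```

Under the current rules the caller pays the first number just for reserving the area; under the proposal it would pay the second.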

This would be quite useful from a contract/library author's perspective. It currently requires some major hoop jumping to get bytes from another contract.

Voting for this proposal. IMHO this will be a huge jump to multi-contract apps.

applied depending on the size of the written memory area, not the reserved memory area.

So would there be a max allocation size that the callee would specify?

I have thought about this a bit more in the last day and tried to implement some generic call forwarding stuff in Serpent. I have really come to appreciate the wisdom of this change and how much easier it would make a lot of things.

So would there be a max allocation size that the callee would specify?

Yes. However, if you trust the callee absolutely, you can just set it to 2**100; but this isn't as scary as it seems as contracts trust the callee with all of their gas by default already.

👍

Here is my proposed implementation in python:

The way it currently works: http://vitalik.ca/files/old.py
After the proposed change: http://vitalik.ca/files/new.py

lgtm

janx commented

+1

+1! This unlocks the full potential of "dynamic" contracts. Without this, you can't use Solidity to write a contract that could be updated to implement an arbitrary Solidity ABI.

What is the status of this proposal? I'm currently running into this issue. :)

@kumavis suggested that I expand on my use case for this. Basically, this would allow generic middleware and generic call-forwarding in a way that you can't really do right now.

This is particularly useful for "identity" in various forms. I want to write a contract that serves as the canonical address for an entity (a person, dapp, DAO, token, etc.), but that is capable of forwarding incoming or outgoing calls to other contracts. Right now, this is only possible if this middle contract has special knowledge about the interior workings of the destination contract, and I would like to be able to do it generically.

For an example of how it currently has to be done, check out the experiments I'm doing here:

https://github.com/PeterBorah/ether-router/blob/master/contracts/EtherRouter.sol

In line 11, I have to explicitly look up the expected return size in a separately-maintained database, so that I can allocate enough memory to pass back the return value.

There's also a simpler use case, which is that Solidity is currently unable to call a function that returns a dynamically-sized array, because it needs to know the size of the array before it makes the call.

In the meantime, another possible solution to the problem was proposed:

Charge the gas to the callee instead of the caller.

Advantages:

  • CALL opcode cannot fail (assuming we leave enough gas with the caller to pay for the slightly dynamic costs of the call itself)

Disadvantages:

  • callee pays for memory extension of caller and has to retain some gas for this (but the amount is unknown as it depends on the position in caller's memory and not only the size)

In comparison, the original proposal of paying from caller's gas after returning the gas left in the callee's context:

Advantages:

  • backwards compatible (gas costs for CALL will be distributed to pre- and post-gas, but sum will be at most as large and might be smaller)

Disadvantages:

  • actual gas costs depend on value returned (but that effect only appears because we reduce the original gas costs by a dynamic amount)

Disadvantage of original proposal:

  • CALL breaks the rule that no opcode can run OOG once the execution of the opcode has started.

To be clear, until now the formal spec for the EVM dispatch sequence is:

  1. Decode instruction.
  2. Check gas requirements (bail with OOG exception if not met).
  3. Deduct gas.
  4. Execute instruction.
  5. Refund unused gas.

This would fundamentally change to:

  1. Decode instruction.
  2. Check gas requirements (bail with OOG exception if not met).
  3. Deduct gas.
  4. Execute instruction.
  5. Check additional gas requirements (bail with OOG exception if not met).
  6. Deduct/refund additional/unused gas.

This would add a base layer of additional complexity and reduce our ability to reason about edge circumstances particularly regarding possible attack scenarios due to gas usage.

While indeed, this adds another step for call-like opcodes, you are hiding a lot of complexity that is already present, especially with regards to call-like opcodes:

  1. decode instruction
  2. check stack requirements (throw OOG if not met)
  3. calculate and check gas requirements (throw OOG if out of gas)
  4. enlarge memory if necessary
  5. check that the call depth is not too large (push 0 to stack otherwise)
  6. check that we have enough ether to send along with the call (push 0 to stack otherwise)
  7. if checks succeeded: perform the call (passing memory pointers on to a new stack frame)
  8. copy output into memory (that step might be done as the last one of the inner stack frame)
  9. refund gas

And perhaps some other steps I have missed here. With this Proposal, this turns into:

  1. decode instruction
  2. check stack requirements (throw OOG if not met)
  3. calculate and check gas requirements (throw OOG if out of gas)
  4. enlarge memory if necessary
  5. check that the call depth is not too large (push 0 to stack otherwise)
  6. check that we have enough ether to send along with the call (push 0 to stack otherwise)
  7. if checks succeeded: perform the call (passing memory pointers on to a new stack frame)
  8. calculate and check gas requirements for writing output into memory (throw OOG if out of gas)
  9. enlarge memory if necessary
  10. copy output into memory
  11. refund gas

Or, to make that even simpler: Every opcode and step that accesses memory grows memory and this growing of memory has to be paid with gas. Before this proposal, the call semantics were:

  1. access memory both at input and output area to grow it artificially
  2. perform the call, passing pointers into input and output area
  3. have the data magically appear in the output area without knowing how much has been written

With this proposal:

  1. access memory at input
  2. perform the call, passing input data via a pointer
  3. copy output into output area
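The two three-step variants above can be sketched on a toy memory model. This is only an illustration under simplifying assumptions: memory is a `bytearray` that grows in 32-byte words, gas is not modelled, the input area is left out, and the names `grow`, `call_old` and `call_new` are hypothetical.

```python
from typing import Callable


def grow(memory: bytearray, end: int) -> None:
    """Extend memory with zero bytes so it covers `end` bytes, rounded up to 32."""
    target = (end + 31) // 32 * 32
    if target > len(memory):
        memory.extend(bytes(target - len(memory)))


def call_old(memory: bytearray, out_offset: int, out_size: int,
             run_callee: Callable[[], bytes]) -> int:
    """Current semantics: the whole output area is touched before the call."""
    if out_size > 0:
        grow(memory, out_offset + out_size)
    data = run_callee()
    n = min(len(data), out_size)
    memory[out_offset:out_offset + n] = data[:n]
    return n  # not visible to the caller today; returned here for inspection


def call_new(memory: bytearray, out_offset: int, out_size: int,
             run_callee: Callable[[], bytes]) -> int:
    """Proposed semantics: memory grows only for the bytes actually written."""
    data = run_callee()
    n = min(len(data), out_size)
    if n > 0:
        grow(memory, out_offset + n)
    memory[out_offset:out_offset + n] = data[:n]
    return n


old_mem, new_mem = bytearray(), bytearray()
call_old(old_mem, 0, 4096, lambda: b"hello")
call_new(new_mem, 0, 4096, lambda: b"hello")
print(len(old_mem), len(new_mem))  # 4096 32
```

Both variants leave the same five bytes at the start of memory; only the amount of memory that was grown (and would have to be paid for) differs.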

I am sorry, but I think it is just ridiculous that this has been open for over half a year now although we have general acceptance inside the community and among most of the client developers and a gigantic need for this feature.

axic commented

@chriseth @gavofyork: An alternative solution is to charge the gas to the caller for the reserved memory area and refund (include in the refund counter) the difference of reserved and actually written bytes.

The disadvantage is that a much higher gas limit is needed as the refund counter is only processed after the transaction has finished.

@axic unfortunately, that will not solve the problem, because the calling contract does not know how much to reserve.

I would like to give a simple summary of the two proposals again:

The designated output area is not taken into account for enlarging memory before the call. At the point where the call returns, if the return value is shorter than the output area, the output area effectively shrinks to the size of the return value. Memory is enlarged if necessary.

Proposal A: remaining gas from the call is refunded to the caller; caller pays for enlarging memory for output, throws OOG if not enough gas. Otherwise, return value is written to memory.

Proposal B: memory is enlarged and paid for from the remaining gas of the call. If that is not enough, the call fails (returns zero). Otherwise, return value is written to memory and remaining gas is refunded to the caller.
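The difference between the two proposals lies in whose gas pays for the write and what happens when it is not enough. A minimal sketch, with hypothetical function names and gas reduced to plain integers. One point is an assumption rather than something stated in this thread: under Proposal B, the gas remaining in the call is treated as consumed when the call fails, as for any other failed call.

```python
class OutOfGas(Exception):
    """Raised when the caller's frame itself runs out of gas."""


def settle_a(caller_gas: int, callee_gas_left: int, write_cost: int):
    """Proposal A: refund the callee's gas first, then charge the caller."""
    caller_gas += callee_gas_left
    if caller_gas < write_cost:
        raise OutOfGas("caller cannot pay for the output memory")
    return 1, caller_gas - write_cost  # (value pushed on the stack, caller gas)


def settle_b(caller_gas: int, callee_gas_left: int, write_cost: int):
    """Proposal B: charge the remaining gas of the call; fail the call if short."""
    if callee_gas_left < write_cost:
        # Assumption: the gas left in the call is consumed on failure.
        return 0, caller_gas
    return 1, caller_gas + (callee_gas_left - write_cost)


# Same situation under both proposals: writing the output costs 12 gas.
print(settle_a(10, 5, 12))  # (1, 3)
print(settle_b(10, 5, 12))  # (0, 10)
```

In the example the caller can cover the shortfall under A, while under B the call reports failure and the caller's frame keeps running.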

Proposal B is backwards incompatible, but this case is not used by the Solidity compiler unless you manually specify gas.

Proposal B also plays nicely with #90

Can you elaborate why it would play nicer than A, or is that not what you are saying?

My general opinion is that it is much easier (and better) to add a new opcode, instead of reusing an existing one and creating complex combos from single-purpose opcodes. So in this case I would add a CALL_OUTPUT_SIZE opcode instead of tweaking MSIZE.

Proposal B is problematic with regards to existing code: Calls to e.g. the identity precompile might be more expensive than before (meaning the call will consume more gas, not the call opcode itself), but there are existing contracts that always supply a very specific amount of gas for the identity precompile.

In order to not break these contracts, we can alter the proposal so that the new rules only take effect if the specified size of the output area is MAX = 2**256 - 1 (alternative: MAX = 2**63-1). In summary:

If the size of the output area is different from MAX, the semantics of the EVM do not change, meaning that the memory is resized to accommodate both the input and the output area and the gas costs for that are paid as part of the costs for the CALL opcode.

If the size of the output area is equal to MAX, memory is resized to accommodate the input area only, the CALL opcode pays as before. At the point where the call returns, memory is resized again to fit the size of the actually returned data. The procedure is paid from the remaining gas of the call. If there is not enough gas, the call fails, memory is not resized and the return data is not written to memory (this is very similar to the code deposit for a creation).
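The opt-in rule can be sketched as a function that decides how far memory must reach before the call is made. This is an illustration only: `region_end` and `pre_call_memory_end` are hypothetical names, and MAX is taken to be the first of the two candidate values mentioned above.

```python
MAX = 2**256 - 1  # sentinel output size; 2**63 - 1 was floated as an alternative


def region_end(offset: int, size: int) -> int:
    """End of a memory region; an empty region does not touch memory."""
    return 0 if size == 0 else offset + size


def pre_call_memory_end(in_offset: int, in_size: int,
                        out_offset: int, out_size: int) -> int:
    """Byte position memory must reach (and be paid for) before the call."""
    end = region_end(in_offset, in_size)
    if out_size != MAX:
        # Old semantics: the output area is reserved and paid for upfront.
        end = max(end, region_end(out_offset, out_size))
    # With out_size == MAX only the input area counts; the output is paid
    # for after the call, based on the data actually returned.
    return end


print(pre_call_memory_end(0, 4, 64, 32))   # 96
print(pre_call_memory_end(0, 4, 64, MAX))  # 4
```

Existing contracts never pass MAX as the output size, so their behaviour and gas costs stay as they are.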

To follow up on that ^

Specifying output MAX means removing the notion of "max allocation size that the callee would specify", discussed above.
And also, with this proposal, the following is necessary to determine the size of the returned data:

At the point where the call returns, memory is resized again to fit the size of the actually returned data.

That means only resizing to increase, not decrease the memory area, right?

There currently is no way to reduce the allocation of memory, and with resize I actually meant "enlarge if needed". I don't think that it is worth the hassle to provide such a means. If you reduce the size, would that mean that you get a gas refund? Also, I don't think we should include this complication into this EIP.

so...what's the final decision?

@chriseth

If there is not enough gas, the call fails, memory is not resized and the return data is not written to memory (this is very similar to the code deposit for a creation).

Please clarify that this means it also puts a 0 on the stack, so default caller behavior will be to re-throw (hopefully via REVERT) - in other words, it still fails by "running out of gas"

See #211.

I guess this can be closed now. @Souptacular what is the process here?

axic commented

@chriseth @Souptacular I think this can be closed, given that the draft of this was merged and has since been deprecated by EIP 211 (the latter also states "replaces: 5").