sc-forks/solidity-coverage

Improve implementation - perhaps with debug_traceTransaction

area opened this issue ยท 23 comments

area commented

Given that Truffle 4 is now in beta, and comes with a snazzy debugger that we can (in theory) crib notes from, it seems like a good moment to start a discussion along these lines.

On the surface, it seems like doing no instrumentation at all and using debug_traceTransaction alongside the demonstrated methodology in the truffle debugger would save a lot of headaches that instrumentation can otherwise cause.

The obvious downside to this approach would be that we would no longer be able to trace lines of code executed with an eth_call, which would limit the maximum coverage we could display to a number below 100%. We could, however, exclude functions labelled with pure and view (formerly known as constant), though they can still obviously be called by transactions that change blockchain state, and so not showing them as covered might be confusing.

I'm considering this a blue-skies issue for now, so I'm open to suggestions, however far-fetched they might seem. I'll kick it off:

  • Create our own fork of solidity and etheruemjs-vm, and introduce another opcode for our coverage events that cost no gas.
  • Integrate something like the solidity debugger with sourcemaps etc. into the ethereumjs-vm, so that it 'knows' what code it is executing and keeps track of it.

I think this makes a lot of sense. I really like the opcode idea.

I was imagining something slightly different (probably impractical) before I read your description of how it might work. But like:

  1. Somehow 'instrument' a trace_transaction-like opcode-to-source map before running the tests.
  2. Eavesdrop on testrpc (via stdout?) as it runs to get the transactionIds
  3. Run debug_TraceTransaction to see what ops got hit.
  4. Compare 3 and 1 to generate the report.

One thing I find daunting about the opening proposal is tracking upstream solc. They have a blistering production schedule. In my fantasy there are no dependencies, lol. And solidity-coverage runs very slowly.

RE: eth_call - that's a good insight. I guess most calls don't matter although some might - for example algorithmically intricate math fns.

area commented

One thing I find daunting about the opening proposal is tracking upstream solc. They have a blistering production schedule. In my fantasy there are no dependencies, lol.

Yeah, I really don't want to track that either.

Your proposal is the sensible way to implement the functionality found in the truffle debugger, definitely. There's no need to eavesdrop on testrpc though; noting the blocknumber before and after running the tests would allow us to just iterate over all the blocks to find the transactions that ran (this requires disabling testrpc's evm_snapshot functionality, but we're going to have to do that anyway if we go down any road involving debug_traceTransaction.

And solidity-coverage runs very slowly.

I am aware #63 is floating around with my name on it still ๐Ÿ˜‰ . It's clear that performance can be greatly improved even with instrumentation. Personally, I think I am of the opinion that prioritising accurate coverage is more valuable than any of these issues; I'm open to discussions on that front though!

@area Yes in complete agreement with you about speed. I don't care about #63 especially - current performance is fine except maybe for things that are actually market simulations in disguise. Was just noting that one virtue of the opCode injection is that it's fast and what I was suggesting involves iterating over everything twice - I suspect it will always be > 2N.

Out of curiosity could you explain what we have to do to the snapshot and why that's required?

@subhodi I hope you don't mind but I'm going to delete your comment since it's unrelated to this issue. Please open a separate issue describing the errors your seeing and we'll be happy to help you. Keeping the issue topics well ordered is important because it helps other people find solutions by search when they run into difficulties. Sorry, thanks!

area commented

Out of curiosity could you explain what we have to do to the snapshot and why that's required?

When using testrpc, Truffle uses the snapshotting provided by evm_snapshot and evm_revert to quickly restore the chain to a 'pre-test' state. When that's not available, it redeploys everything and runs all the migrations prior to each test. If we started using debug_traceTransaction, if we let it revert the chain, then the transactions we would want to trace after running all the tests wouldn't exist, and so we wouldn't know which code they touched.

This isn't an issue currently, because even though the chain gets reset, we've already logged the instrumentation events.

Ah ok right! That's an important detail. Definitely goes into the 'benefits of an opcode' column. [Edit Ignore last sentence: incoherent.]

So are we looking at something like having app.js:

  • resemble truffle-core's lib/test.js.
  • rewriting truffle-core's testing/testrunner.js so that it:
    • hooks into the initialize method to generate an initial mapping.
    • hooks into the startTest method to track the block numbers.
    • hooks into the revert method to collect the tx ids and run comparison logic.

?

Create our own fork of solidity and etheruemjs-vm, and introduce another opcode for our coverage events that cost no gas.

It's avoidable to fork ethereumjs-vm. You can monkeypatch and add your own opcode, detection included. I've tested this very simple opcode (0xfd, wasn't sure where it should be) with a simple require hook:

hook([
  'ethereumjs-vm/lib/opcodes.js',
  'ethereumjs-vm/lib/opFns.js'
], function (exports, name, path) {
  if (path.indexOf('opcodes') !== -1) {
    // If it's ethereumjs-vm/lib/opcodes.js, then we wrap the exported
    // function to detect our own opcode (0xfd).
    const _super = exports
    exports = function (op, full) {
      const coverOp = {
        name: 'COVER',
        opcode: op,
        fee: 0,
        in: 0,
        out: 0,
        dynamic: false,
        async: false
      }

      return op === 0xfd ? coverOp : _super(op, full)
    }

    return exports
  } else {
    // If it's ethereumjs-vm/lib/opFns.js then we add our opcode
    // function - this simply logs cover, but it can be more complex.
    // For example, it could take values off the stack (e.g. type of coverage being done)
    Object.assign(exports, {
      COVER: function (runState) {
        console.log('Cover opcode')
      }
    })
  }

  return exports
})

Using it is like using ethereumjs-vm normally:

vm.runCode({
  code: Buffer.from(code, 'hex'),
  gasLimit: Buffer.from('ffffffff', 'hex')
}, function(err, results) {
  console.log('returned: ' + results.return.toString('hex'));
})

So, instrumentation just needs to add the appropriate opcodes in the contracts instead of events.

Obviously this approach depends on the internals of ethereumjs-vm, but it's so simple that I can't imagine it being too big of a risk doing it like this.

@onbjerg Wow! Great observation, thank you.

It may also be possible to combine this approach with listening to the vm's step event to capture coverage ops in real time. If so we won't have to worry about evm_revert and we can also cover calls?

Also worth noting that the testrpc server method passes a blockchain object in its callback. Not sure what's available on that but vm might be.

Yep, you can just use the step event instead. That approach is actually a bit cleaner.

You don't need to use testrpc, you can use ganache-core directly by using it's provider method and passing that to the truffle config instead. This method also accepts options, one of which is actually the VM, if you want to roll your own. It shouldn't be needed though, since this hook works if you're using ganache-core directly (which is what testrpc is using).

Ah right! Thanks so much @onbjerg, really helpful.

area commented

Yeah, that's brilliant. I'm not sure how I've gone so long without hearing about node-hook.

Just to be clear, the hook function is a modified version of require-in-the-middle ๐Ÿ˜Š

area commented

Just to be clear, the hook function is a modified version of require-in-the-middle ๐Ÿ˜Š

Good to know!

I definitely think there's promise to this approach, but it's going to require some thought. Ideally, we'd have support for something like

bytecode {
    0xfd
}

from Solidity, otherwise we're going to have to interact with (and 'instrument') the actual compiled code, I think? @onbjerg, in your test, did you just insert fd somewhere in the middle of some bytecode you had precompiled? Or did you have a more cunning approach?

If we figured out a way to do that, however, I think with the addition of a 'free push' opcode as well, we could pretty straightforwardly get all the information out that we wanted by prior to calling our opcode putting information on to the stack that we wanted (e.g. which branch of a conditional was being called), and then clearing those elements from the stack as part of our opcode.

To get the 'true' behaviour of the uninstrumented contracts, there's another hurdle to be overcome. Contract deployment of anything that's instrumented will cost a different amount from an uninstrumented contract. So I think we'd need to override CODECOPY to not count anything related to instrumentation against the cost of the deployment.

EDIT: Not even that is guaranteed to work; if the instrumentation stops the solidity compiler optimising something, then the gas costs could still be different.

Bonus terrible (maybe?) implementation idea

If we can't introduce custom opcodes during compilation, let's do it afterwards! We're going to have to mess with CODECOPY anyway... Let's say we want to instrument the fact that line 2 has run. We instrument the solidity with something like

assembly {
   0 0 0 0 0 0 2 add add add add add add add
}

or some other piece of code that's very unlikely to actually end up in the compiled bytecode. This compiles to 6000600060006000600060006000600001010101010101, or PUSH1 0x00 PUSH1 0x00 PUSH1 0x00 PUSH1 0x00 PUSH1 0x00 PUSH1 0x00 PUSH1 0x00 PUSH1 0x02 ADD ADD ADD ADD ADD ADD ADD (I can think of very few reasons why someone would push 0s on to the stack and then add them up!) Whenever a CODECOPY happens during deployment, we could look for these 'special' bytecode sequences, replace them with our 'special' free opcodes, and charge gas as if they weren't there at all.

Someone talk me out of this, please? ๐Ÿ˜›

@area I just inserted fd somewhere in the bytecode. However, I have another approach. One that is (potentially) so amazing that it requires no opcodes and no instrumentation. It only requires 1 monkey patch, and it's not in the VM. Yes, that's right. Rub your eyes, because this is going to be a long one.

solc has an option to compile contracts with source mappings. In Truffle 4, all contracts have source mappings by default, because they need this for truffle-debugger. The new contract schema is documented here.

The (rough) general strategy is now this:

  1. Parse the contract sources and find "instrumentation" points
  2. Get the source mapping for each of the contracts
  3. Pass an instance of ethereumjs-vm into ganache-core, and listen for step events on this VM
  4. Whenever a test is run, we get a step event. This step event includes the program counter, and we can use this value combined with the source map to see if we hit an instrumentation point
  5. If we did, then we mark that location as hit
  6. Iterate 4 & 5 until the tests are done

Now, here's the last crucial part: since this is essentially a custom provider (set in truffle-config.js), we would normally have no way of checking when the tests are done, because we are not running them ourselves (as we are now). To fix this, we do a very simple (and safe!) monkey patch on the provider from ganache-core:

const provider = ganache.provider({ ... })

// ๐Ÿ’ patch
const _super = provider.close.bind(provider)
provider.close = function (cb) {
  reporter.generateReports()
  _super(cb)
}

Basically, we're the man in the middle. Whenever the provider is closed (i.e. tests are done and the network is no longer used), we save the reports to disk. No disk I/O needed except here.

Thus, we are actually just inspecting what opcodes are run, and because we get the "instrumentation" points from the opcode offsets in the bytecode, we can calculate backwards via. the source mappings to get the actual lines of code in the Solidity source.

This is great on multiple levels

  • It's fast (theoretically)
  • There's no need to have a custom truffle-config bundled with solidity-coverage
  • There's minimal dependencies (ethereumjs-vm, ganache-core, solidity-parser)
  • There's no instrumentation needed
  • Truffle does source mappings for us, so no need for "double compilation" (i.e. this approach would also work w/o Truffle 4, but we'd have to compile the contracts ourselves just for the source mappings)
  • The code can be very clean
  • No real monkey patching needed (as the API for providers in Web3 are unlikely to change)
  • No custom forks of software
  • No custom opcodes
  • No weird alterations to CODECOPY et al.

Obvious drawbacks

  • We only support Truffle 4

Everything this approach does is essentially only observe the execution and record whenever a point of interest is run.

I'd really appreciate feedback on this approach, as I already started hacking some of these ideas together this weekend.

@onbjerg Yes, this is clearly a good design. The 'instrumentation' piece will be a big deal to figure out and get working correctly. But the mechanics seem ideal.

Still haven't checked but if the VM is exposed on the blockchain object coming from testrpc could we also remove ganache-core as dep since in this model all we need is the step?

If it's not exposed we could PR that over there. It wouldn't affect them.

A side note - if we continue to parse the code we should start using federicobond's antlr parser, which is more robust and has excellent line number / column pos reporting at the node level. That might help a little.

We still have to know when the tests are done though....

[Editing, more] - The ganache-core hook is really cool.

Another weird possibility is somehow leveraging the event hooks available in a mocha reporter? I just wrote a little eth-gas mocha reporter yesterday and there's a suite end hook available to third party reporters.

area commented

Yeah, that's a really nice approach - basically the way that it should be done. Using those source mappings is what I wanted to do, but using ganache-core and hooking in to the VM to catch even bits executed during eth_calls is the bit that I was missing.

I would definitely like to keep the ability to not require Truffle 4 available, though it does seem like Truffle is dominating at this point, so not essential for now.

True. This approach works with other frameworks, too, since it is just a provider. I think it should default to looking for source mappings etc. for Truffle, though. This could probably be configured in the provider :)

@onbjerg Sending you an org invite just in case you have any interest in experimenting with your approach in a branch here. ๐Ÿ™‚

Also going to create 'V2' tag for the issues - we might be at the point where the steps we'll need to take are a little clearer.

Great, thanks! I'll look at it this weekend if I have time, I'm in the process of moving ๐Ÿšš

area commented

I'm on tenterhooks here, @onbjerg ๐Ÿ˜‰

Implemented with 0.7.0.

Happy new year @area!

area commented