Create a more robust abiSha3 hash
tpmccallum opened this issue · 9 comments
It is possible that ABIs, which are uploaded to the search engine, could have their functions and events in different orders.
Whilst the check for ABI compatibility would still pass, the abiSha3 hash would be different.
We already strip out tabs, extra spaces, etc. to make the hashes more consistent and reliable. However, it would be a really good idea to also sort the keys in the ABI data structure so that we can produce a more robust ABI hash.
Let Atlas know when the new abiSha3 hashes are created as this will change the frontend display lookup objects.
This is now done by sorting the keys in the data structure and then hashing the clean (no tabs, extra spaces or return characters) data using sha3.
def shaAnAbiWithOrderedKeys(self, _theAbi):
    theAbiHash = str(self.web3.toHex(self.web3.sha3(text=json.dumps(_theAbi, sort_keys=True))))
    return theAbiHash
Examples of this include:
- ERC20
The official ERC20 ABI found on the Ethereum Wiki produces the following Sha3 value.
0x2b5710e2cf7eb7c9bd50bfac8e89070bdfed6eb58f0c26915f034595e5443286
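The key-sorting step can be tried without a web3 connection. The following is a minimal sketch, not the project's code: the function name `hash_abi_with_sorted_keys` and the sample ABI entries are hypothetical, and `hashlib.sha3_256` is used only as a stand-in. web3's `sha3` is Keccak-256, so the digests here will not match the abiSha3 values in this thread.

```python
import hashlib
import json


def hash_abi_with_sorted_keys(abi):
    # Serialise with dictionary keys in sorted order so that the key order
    # of the uploaded ABI cannot influence the hash.
    canonical = json.dumps(abi, sort_keys=True)
    # ASSUMPTION: hashlib.sha3_256 is a stand-in for web3's Keccak-256 based
    # sha3; the two produce different digests for the same input.
    return "0x" + hashlib.sha3_256(canonical.encode("utf-8")).hexdigest()


# Two hypothetical ABI entries with the same content but different key order.
abi_a = [{"name": "totalSupply", "type": "function", "constant": True}]
abi_b = [{"constant": True, "type": "function", "name": "totalSupply"}]

hash_a = hash_abi_with_sorted_keys(abi_a)
hash_b = hash_abi_with_sorted_keys(abi_b)
print(hash_a == hash_b)  # True
```

Only the order of dictionary keys is normalised here; the order of items inside lists is left untouched.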
After further testing, it turns out that ordering the keys is not entirely robust. Please consider the following example, where two different valid ABIs (which are also valid JSON) produce two different Sha3 hashes because the items inside a list appear in a different order.
import json
string1 = '''{
    "constant": false,
    "inputs": [{
            "name": "value",
            "type": "uint256"
        },
        {
            "name": "spender",
            "type": "address"
        }
    ]
}'''
string2 = '''{
    "constant": false,
    "inputs": [{
            "name": "spender",
            "type": "address"
        },
        {
            "name": "value",
            "type": "uint256"
        }
    ]
}'''
json1 = json.loads(string1)
json2 = json.loads(string2)
output1 = json.dumps(json1, sort_keys=True)
output2 = json.dumps(json2, sort_keys=True)
print(output1)
# {"constant": false, "inputs": [{"name": "value", "type": "uint256"}, {"name": "spender", "type": "address"}]}
print(output2)
# {"constant": false, "inputs": [{"name": "spender", "type": "address"}, {"name": "value", "type": "uint256"}]}
# Note that the items in "inputs" are still in a different order in output1 and output2.
# Both are valid JSON and valid Ethereum ABI.
# Sorting keys alone is therefore not robust enough for deterministic ABI hashes,
# given that a user can upload an out-of-order ABI as shown above.
This occurs when an internal list contains several items that each use the same key, such as "name". json.dumps does not reorder the items of a list; sort_keys=True only sorts dictionary keys.
JSON can't have duplicate keys, so one would assume that sorting by keys is robust. However, the separate items in an internal list can each use the same keys, and those items stay in whatever order the JSON was created with.
Given that the inputs of a given smart contract function cannot share the same name, we can go ahead and sort the list using the following code.
from operator import itemgetter
# "inputs" stands for any internal list of dicts that each carry a "name" key
inputs.sort(key=itemgetter("name"))
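As a small worked example of this sort, the sketch below applies it to the out-of-order "inputs" list from the earlier example. The variable name `inputs` is only illustrative.

```python
from operator import itemgetter

# The "inputs" list from string1 above, with "value" listed before "spender".
inputs = [
    {"name": "value", "type": "uint256"},
    {"name": "spender", "type": "address"},
]

# Sort the list in place by each item's "name" value.
inputs.sort(key=itemgetter("name"))

print(inputs)
# [{'name': 'spender', 'type': 'address'}, {'name': 'value', 'type': 'uint256'}]
```

Note that `itemgetter("name")` raises a `KeyError` if any item in the list lacks a "name" key, so this relies on every item carrying one.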
The following function sorts the internal lists as part of a loop over every entry in the ABI.
def sortInternalListsInJsonObject(self, _json):
    for listItem in _json:
        for k, v in listItem.items():
            if type(v) not in (str, bool, int) and len(v) > 1:
                if type(v[0]) is dict:
                    v.sort(key=itemgetter("name"))
                else:
                    v.sort()
    return _json
The overall result is what we want.
_json
# Returns
# {'constant': False, 'inputs': [{'name': 'spender', 'type': 'address'}, {'name': 'value', 'type': 'uint256'}]}
I can now confirm that the new code is able to create an ordered ABI with a deterministic hash.
Reading configuration file
Master index: allercchecker
Common index: ercchecker
Abi index: abiercchecker
Bytecode index: bytecodeercchecker
Blockchain RPC: https://mainnet.infura.io/v3/fdaf79947fba404ab08cc096f20e12ea
ElasticSearch Endpoint: search-cmtsearch-l72er2gp2gxdwazqb5wcs6tskq.ap-southeast-2.es.amazonaws.com
0xf184e89595256b4eff3b1f0a66570fb6944e04eab99156d3f7bfe5b7c082c628
0xf184e89595256b4eff3b1f0a66570fb6944e04eab99156d3f7bfe5b7c082c628
Code for the above test is here
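A comparable check can be sketched offline as follows. This is not the linked test code: the function names `sort_internal_lists` and `deterministic_abi_hash` are hypothetical, the two ABIs are the single-entry examples from earlier in this thread, and `hashlib.sha3_256` is an assumed stand-in for web3's Keccak-256 based `sha3`, so the digests will differ from the ones printed above. It also follows the thread's assumption that the order of a function's inputs should not affect the hash.

```python
import hashlib
import json
from operator import itemgetter


def sort_internal_lists(abi):
    """Sort the list values inside each ABI entry, in place."""
    for entry in abi:
        for value in entry.values():
            if isinstance(value, list) and len(value) > 1:
                if isinstance(value[0], dict):
                    # Lists of dicts (e.g. "inputs") are ordered by "name".
                    value.sort(key=itemgetter("name"))
                else:
                    value.sort()
    return abi


def deterministic_abi_hash(abi):
    # Sort internal lists first, then serialise with sorted keys.
    canonical = json.dumps(sort_internal_lists(abi), sort_keys=True)
    # ASSUMPTION: sha3_256 stands in for web3's Keccak-256 based sha3.
    return "0x" + hashlib.sha3_256(canonical.encode("utf-8")).hexdigest()


# Same content, but with different key order and different list order.
abi_1 = [{"constant": False, "inputs": [
    {"name": "value", "type": "uint256"},
    {"name": "spender", "type": "address"}]}]
abi_2 = [{"inputs": [
    {"type": "address", "name": "spender"},
    {"type": "uint256", "name": "value"}], "constant": False}]

hash_1 = deterministic_abi_hash(abi_1)
hash_2 = deterministic_abi_hash(abi_2)
print(hash_1 == hash_2)  # True
```

The sketch sorts the caller's ABI in place, as the function in this thread does; pass a copy (for example via `copy.deepcopy`) if the original order needs to be preserved.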