When debugging transactions, you will often run into unverified contracts. Even if you interact only with verified contracts, they can call contracts that are only ABI-verified or completely unverified.
This essentially creates a black box, limiting the information and insights current debuggers can provide. In these instances, you usually see raw state transitions and executed instructions, without any additional data.
This extremely limited insight into unverified contracts leaves developers guessing what’s in the black box, making many transactions hard to understand. For this reason, we decided to create new methods to enrich calls to unverified contracts with more data.
The current insights into unverified contracts
In this example transaction, you have no decoded information other than the fact that it reverted.
Since verification isn’t available, you can see only two interesting values in this transaction:
- The calldata: 0x8da5cb5b
- The output/returndata: 0x245aecd3000000000000000000000000b06a64615842cba9b3bdb7e6f726f3a5bd20dac2
What can you do with these two values? What do they represent? How do you map them to any high-level concept in Solidity without any metadata?
The starting point for decoding unverified contracts
At first glance, these values look like random hex numbers that are almost impossible to understand. So, we decided to try to find a way to decode them.
To achieve this, we relied on the fact that functions, events, and errors have a rigid and deterministic way of being encoded, allowing us to provide some context on what these numbers represent.
4-byte function selectors
To understand the values in the example above, we need to understand how function calls are encoded as defined by the ABI specification.
To call any contract written in Solidity or Vyper, or any contract that adheres to the ABI, you need to form the calldata in a very specific way. For example, to call the IERC20 interface function declared as function transfer(address to, uint256 value) external returns (bool); you need to take the keccak256 hash of the simplified signature, containing only the name and parameter types.
It would look like this: keccak256("transfer(address,uint256)"). The parameter types are separated only by a comma, with no spaces. This gives you the value of 0xa9059cbb2ab09eb219583f4a59a5d0623ade346d962bcd4e46b11da047c9049b. The ABI specifies that the first four bytes of this hash should be used as the function selector, leaving you with 0xa9059cbb.
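The hashing step can be sketched in Python. One caveat: Keccak-256 as used by the ABI is not the same as the standardized SHA3-256 in Python's hashlib (the padding byte differs), so this sketch carries its own compact, unoptimized Keccak implementation. In practice you would more likely rely on a maintained library. The helper names keccak256 and function_selector are illustrative and not part of any tool described here.

```python
MASK64 = (1 << 64) - 1
RATE = 136  # bytes absorbed per permutation when the capacity is 512 bits


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 lane matrix."""
    lfsr = 1  # generates the round constants on the fly
    for _ in range(24):
        # theta: mix each column's parity into its neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR the round constant into lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 with the original 0x01 padding byte (not SHA3-256)."""
    padded = bytearray(data)
    padded.append(0x01)
    while len(padded) % RATE:
        padded.append(0x00)
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), RATE):
        for i in range(RATE // 8):
            start = offset + 8 * i
            lanes[i % 5][i // 5] ^= int.from_bytes(padded[start:start + 8], "little")
        lanes = _keccak_f1600(lanes)
    # The digest is the first 32 bytes of the state, i.e. the first four lanes.
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def function_selector(signature: str) -> str:
    """First four bytes of the hash of a simplified signature, as hex."""
    return "0x" + keccak256(signature.encode("ascii"))[:4].hex()


print("0x" + keccak256(b"transfer(address,uint256)").hex())
print(function_selector("transfer(address,uint256)"))  # 0xa9059cbb
```

The first printed value should match the full hash quoted above, and the second is its four-byte prefix.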
The next step in encoding the calldata is adding the parameters. The ABI describes this in detail as well. You just need to append the parameters after the selector. Static values such as addresses and unsigned integers are left-padded with zeros to 32 bytes. If you use 0x1122334455112233445511223344551122334455 as the address and 0x321 as the value, you end up with this calldata:
a9059cbb00000000000000000000000011223344551122334455112233445511223344550000000000000000000000000000000000000000000000000000000000000321
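As a minimal sketch of that encoding, the Python below rebuilds the same calldata from the selector and the two parameters. It handles only the two static types used in this example (address and uint256), and the function names are illustrative. Dynamic types such as strings or arrays follow additional rules that are not covered here.

```python
# Selector for transfer(address,uint256), taken from the hash shown above.
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")


def encode_word(value: int) -> bytes:
    """Encode an unsigned integer as one 32-byte big-endian word (left-padded)."""
    return value.to_bytes(32, "big")


def encode_transfer(to: str, value: int) -> bytes:
    """Build calldata for transfer(address,uint256)."""
    address = int(to, 16)
    if address >= 1 << 160:
        raise ValueError("an address must fit in 20 bytes")
    return TRANSFER_SELECTOR + encode_word(address) + encode_word(value)


calldata = encode_transfer("0x1122334455112233445511223344551122334455", 0x321)
print(calldata.hex())
print(len(calldata))  # 68 bytes: 4-byte selector plus two 32-byte words
```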
Other selectors
There are two more selector types that help us analyze unverified contracts:
- Event selectors work the same way, except all of the 32 bytes of the hash are preserved.
- Custom error selectors work identically to function selectors. The only difference is that the data is usually found in the returndata of the transaction and not the calldata. This is because errors are not called, but are the result of a failing transaction.
The process of decoding unverified contracts
Since the process of creating selectors is deterministic and behaves the same for each contract, you can try to reverse it. For a specific calldata or returndata, you can try to decode it without any additional information.
You can parse all publicly verified contracts and take their functions, events, and custom errors. Then, calculate the selectors and put them in a database. This database is called “the 4-byte directory”.
Since function and error selectors are only four bytes long, there will be some collisions. We record every signature that generates a given selector. We also count the number of appearances, since the more often a signature appears in verified contracts, the more relevant it is to us.
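One way to picture such a directory is a mapping from each selector to a counter of the signatures seen for it. The sketch below is a hypothetical in-memory structure, not the actual database. The class and method names are made up for illustration, the selectors are precomputed rather than hashed, and the appearance counts are invented.

```python
from collections import Counter, defaultdict


class FourByteDirectory:
    """Toy selector directory: selector -> signatures with appearance counts."""

    def __init__(self):
        self._entries = defaultdict(Counter)

    def record(self, selector: str, signature: str) -> None:
        """Register one appearance of a signature in a verified contract."""
        self._entries[selector.lower()][signature] += 1

    def lookup(self, selector: str):
        """Return candidate signatures for a selector, most frequent first."""
        return self._entries.get(selector.lower(), Counter()).most_common()


directory = FourByteDirectory()
observations = [
    ("0x8da5cb5b", "owner()"),
    ("0x8da5cb5b", "owner()"),  # seen again in a second verified contract
    ("0xa9059cbb", "transfer(address,uint256)"),
    ("0x245aecd3", "NotOwner(address)"),
]
for selector, signature in observations:
    directory.record(selector, signature)

print(directory.lookup("0x8da5cb5b"))  # [('owner()', 2)]
print(directory.lookup("0xffffffff"))  # [] because nothing was recorded
```

If two different signatures collide on the same selector, both stay in the counter, and the ordering by count gives a rough ranking of which one is more plausible.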
Using this 4-byte directory, you can try to decode unverified contracts. Other than matching the selectors, you can use the rest of the calldata or returndata to further solidify the match.
In the example transaction, the entries corresponding to 0x8da5cb5b show one function called owner with no parameters, giving it the signature owner().
You can also search for the first four bytes of the returndata, 0x245aecd3, and find a matching custom error signature: NotOwner(address).
Decoding the rest of the returndata as an address gives 0xb06a64615842cba9b3bdb7e6f726f3a5bd20dac2. This consumes all of the returndata, which makes it likely that you found the right match.
From this, you can conclude that the function owner() was called and that it reverted with a custom error of NotOwner(0xb06a64615842cba9b3bdb7e6f726f3a5bd20dac2).
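The whole deduction for this example can be sketched in a few lines of Python. The two lookup tables stand in for directory results and contain only the entries discussed above. The decoder is deliberately limited to a couple of static types and treats leftover or missing bytes as a failed match. All names are illustrative.

```python
# Hypothetical directory excerpts: selector -> (name, parameter types).
FUNCTIONS = {"8da5cb5b": ("owner", [])}
ERRORS = {"245aecd3": ("NotOwner", ["address"])}


def decode_static_words(types, payload):
    """Decode 32-byte words for a few static types; None means no clean fit."""
    if len(payload) != 32 * len(types):
        return None  # leftover or missing bytes weaken the match
    values = []
    for index, type_name in enumerate(types):
        word = payload[32 * index:32 * index + 32]
        if type_name == "address":
            if any(word[:12]):
                return None  # an address must be left-padded with zeros
            values.append("0x" + word[12:].hex())
        elif type_name == "uint256":
            values.append(str(int.from_bytes(word, "big")))
        else:
            return None  # type not supported by this sketch
    return values


def decode(data_hex, table):
    """Match the selector against a table, then decode the remaining bytes."""
    raw = bytes.fromhex(data_hex[2:] if data_hex.startswith("0x") else data_hex)
    entry = table.get(raw[:4].hex())
    if entry is None:
        return None
    name, types = entry
    values = decode_static_words(types, raw[4:])
    if values is None:
        return None
    return f"{name}({', '.join(values)})"


calldata = "0x8da5cb5b"
# Returndata from the example: selector, 12 zero bytes, 20-byte address.
returndata = "0x245aecd3" + "00" * 12 + "b06a64615842cba9b3bdb7e6f726f3a5bd20dac2"

print(decode(calldata, FUNCTIONS))   # owner()
print(decode(returndata, ERRORS))    # NotOwner(0xb06a64615842cba9b3bdb7e6f726f3a5bd20dac2)
```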
The original code of the unverified contract shows the information is correct:
Note: We decoded the hex numbers without access to this code. We’re including it here to verify our analysis.
Important considerations
Using publicly verified contracts and some analysis, you can get insights into unverified contracts. However, it’s important to note that we can’t guarantee the contract doesn’t actually contain a different function that happens to have the same selector.
Also, these selectors are generated only based on the function signature, not the code of the function. This means that we can’t know what this function does, only what it’s called and which parameters it has.
This can be used as a tool in understanding unverified contracts when debugging transactions. However, it’s critical to remember that all of this is based on deduction.