EVM Object Format (EOF), or EIP-7692, is a collection of multiple EIPs aimed at improving the structure of smart contracts executed on the EVM. It has been long awaited and is finally coming to the Pectra hard-fork, with many other improvements.
The current EVM doesn’t impose a format or structure for any contract, meaning any executed code has to be validated at runtime, every time. This doesn’t only create an overhead for the execution, but it also brings challenges around introducing new or deprecating old features.
Let's delve into the individual EIPs that make up the “meta” EIP-7692, the expected changes, and overall effects on the ecosystem.
Paving the way for EOF with EIP-3541
EIP-3541 has already been adopted to support the implementation of EIP-7692. It makes it possible to differentiate contracts with legacy code from EOF-formatted contracts by disallowing the deployment of new contracts that start with the byte 0xEF, which is reserved for EOF contracts.
However, since there are already some existing contracts that start with 0xEF, the distinction between legacy and EOF contracts still isn’t clear-cut. To solve this, the identifier for EOF is a sequence of bytes starting with 0xEF that does not exist in any current contract: 0xEF00.
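The two rules above can be sketched in a few lines of Python. This is a minimal illustration, not client code: the function names are hypothetical, and the only facts taken from the text are the reserved 0xEF byte and the 0xEF00 identifier.

```python
# Minimal sketch of the two prefix rules described above.
# Function names are hypothetical; only the byte values come from the text.

EOF_MAGIC = bytes.fromhex("ef00")       # identifier for EOF containers
RESERVED_PREFIX = bytes.fromhex("ef")   # first byte reserved by EIP-3541


def eip3541_allows_deployment(code: bytes) -> bool:
    """Return True if new legacy code may be deployed under the EIP-3541 rule."""
    return not code.startswith(RESERVED_PREFIX)


def looks_like_eof(code: bytes) -> bool:
    """Return True if the code begins with the EOF identifier 0xEF00."""
    return code.startswith(EOF_MAGIC)
```

A contract deployed before EIP-3541 may still start with 0xEF, which is why the second check looks at two bytes rather than one.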
What EIPs does EVM Object Format include?
As a collection of several EIPs, EIP-7692 includes the following:
The Format: EIP-3540
This EIP can be considered as the “main” EIP around which everything else is structured. It proposes the first version of an extensible and versioned container format for EVM bytecode.
EIP-3540 proposes a format consisting of the EOF Header and one or more sections in the following order:
- The header begins with the EOF identifier 0xEF00.
- Following the identifier is the EOF version number (currently 0x01).
- Next are one or more section headers that describe various parts of the program such as functions or data regions.
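As a rough illustration of the first two header fields, the sketch below reads the identifier and the version byte and hands back the remaining bytes untouched. The function name is hypothetical, and the layout of the section headers is deliberately left unparsed, since the text above does not spell it out.

```python
# Sketch: read the start of an EOF container (identifier + version).
# The section headers that follow are returned unparsed.

EOF_MAGIC = bytes.fromhex("ef00")
SUPPORTED_VERSION = 0x01  # the current version according to the text


def parse_eof_prefix(container: bytes) -> tuple[int, bytes]:
    """Return (version, remaining_bytes) or raise ValueError if malformed."""
    if len(container) < 3:
        raise ValueError("container too short for identifier and version")
    if not container.startswith(EOF_MAGIC):
        raise ValueError("missing EOF identifier 0xEF00")
    version = container[2]
    if version != SUPPORTED_VERSION:
        raise ValueError(f"unsupported EOF version: {version:#04x}")
    return version, container[3:]
```

Because the version sits at a fixed position right after the identifier, a client can decide which rules apply before it looks at anything else in the container.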
The EVM will have two new opcodes for calling and returning from those sections. They are introduced in EIP-4750 to facilitate calling functions within the same contract. JUMP instructions were previously used to achieve this.
Calls to external contracts have also been revised by EIP-7069, which simplifies their semantics.
Validation: EIP-3670 and EIP-5450
Currently, execution clients have to validate every executed opcode for every executed call. EIP-3670 (code validation) and EIP-5450 (stack validation) aim to replace this validation with a one-time, deploy-time validation of the EOF formatted code.
Since currently the EVM code has no imposed structure, upfront validation is extremely hard to compute. Having the code in a defined structure, where each function has its own section, makes this task a lot easier.
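To make the idea of one-time validation more concrete, here is a toy sketch of a single linear pass over a code section. It assumes a deliberately tiny opcode table (STOP, ADD, POP, PUSH0, and PUSH1 through PUSH32) and checks only two things: that every opcode is defined and that no immediate data is truncated. The real rules in EIP-3670 and EIP-5450 cover far more than this.

```python
# Toy deploy-time code validation: one pass, done once, result can be trusted later.
# The opcode table is a small illustrative subset, not the full EOF instruction set.

# opcode -> number of immediate bytes that follow it
DEFINED_OPCODES = {0x00: 0, 0x01: 0, 0x50: 0, 0x5F: 0}  # STOP, ADD, POP, PUSH0
DEFINED_OPCODES.update({0x60 + i: i + 1 for i in range(32)})  # PUSH1..PUSH32


def validate_code(code: bytes) -> bool:
    """Return True if every opcode is defined and no immediate is cut off."""
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        if opcode not in DEFINED_OPCODES:
            return False  # undefined instruction: reject at deploy time
        pos += 1 + DEFINED_OPCODES[opcode]  # skip the opcode and its immediate
    # If pos overshot the end, the last instruction's immediate was truncated.
    return pos == len(code)
```

Once a contract passes a check like this at deployment, the interpreter no longer needs to repeat it on every call.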
New Instructions: EIP-7480, EIP-663, and EIP-4200
Because of the new concept of sections, the format requires the introduction of new instructions for jumping to those sections: CALLF, RETF, and JUMPF.
For jumps within sections, there are new “relative-jump” instructions RJUMP, RJUMPI, and RJUMPV. These are a relative jump, a relative conditional jump, and a relative table jump. This limits the flexibility of the EVM a bit, but allows for easier and more powerful analysis and validation.
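The reason relative jumps are easier to analyze is that the destination is written in the code itself rather than taken from the stack. The sketch below assumes the encoding described in EIP-4200: a 16-bit signed big-endian immediate, applied relative to the position of the next instruction. The function name is hypothetical.

```python
# Sketch: compute the destination of a relative jump.
# Assumes a 3-byte instruction: 1 opcode byte + 2-byte signed big-endian offset.

RJUMP_LENGTH = 3  # opcode byte plus two immediate bytes (assumption from EIP-4200)


def rjump_target(pc: int, immediate: bytes) -> int:
    """Return the position execution continues at after a relative jump at pc."""
    if len(immediate) != 2:
        raise ValueError("relative offset must be exactly two bytes")
    offset = int.from_bytes(immediate, "big", signed=True)
    return pc + RJUMP_LENGTH + offset
```

Since the offset is fixed in the bytecode, a validator can compute every jump destination once at deploy time and reject code whose jumps land outside the section or in the middle of an instruction.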
SWAPN, DUPN, and EXCHANGE instructions are finally getting included, which will eliminate stack too deep errors. The introduction of these three instructions will allow for much more control over the stack, opening up various improvements in any compiler.
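A small stack model shows why this matters. Legacy DUP and SWAP instructions only reach 16 items deep, while the new instructions take the depth as an immediate. The sketch below models the stack as a Python list with the top at the end; the helper names and the direct use of a depth argument are simplifications, and the actual immediate encoding from EIP-663 is left out.

```python
# Sketch: DUPN/SWAPN-style operations on a list-based stack (top is the last item).
# The depth n is passed directly; the real immediate encoding is omitted.

def dupn(stack: list[int], n: int) -> None:
    """Push a copy of the n-th item from the top (n=1 is the top itself)."""
    if n < 1 or n > len(stack):
        raise IndexError("stack underflow")
    stack.append(stack[-n])


def swapn(stack: list[int], n: int) -> None:
    """Swap the top item with the item n positions below it."""
    if n < 1 or n + 1 > len(stack):
        raise IndexError("stack underflow")
    stack[-1], stack[-(n + 1)] = stack[-(n + 1)], stack[-1]
```

With 20 values on the stack, `dupn(stack, 18)` reaches an item that no legacy DUP instruction could, which is exactly the situation that triggers a stack too deep error in today's compilers.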
EIP-7480 introduces instructions for reading data from data sections of contracts. The separation of code and data will also improve the analysis and validation of contracts.
Why is EIP-7692 needed?
One of the main reasons behind the EOF EIP is the introduction of “EVM Versioning”. The EVM has evolved with the whole ecosystem, changing and improving over time. This has become somewhat hard to track and is often confusing for new developers, especially if you account for the growing number of EVM networks.
EOF introduces a way to track and version the container itself and what it can hold. It prepares the EVM for future changes, making modifications easier and allowing for faster innovation and development.
The introduced code separation, easier code analysis, and opcode changes facilitate translating EVM bytecode to machine code. This would allow any EVM implementation to be a lot faster and more efficient, making clients easier and cheaper to run.
The benefits of EOF1
EOF1, as the current and first iteration of EOF, brings immediate improvements and fixes some existing pain points in the ecosystem:
- Many compiler improvements are unlocked with this upgrade.
- Stack manipulation becomes easier, allowing for more local variables and more complex calculations.
- Functions get more significance than just random sets of instructions.
- Having sections for each function allows for safer and more efficient code.
- Many static analysis tools become far more useful and applicable.
What are the immediate effects of EOF?
The implementation of EOF will introduce some immediate changes to the EVM ecosystem, including:
- Reduced gas costs of newly deployed contracts: EOF1 produces smaller contract code and smaller deployment (constructor) code. This will reduce deployment costs between 5% and 10% for an average contract.
- Decrease in gas usage for EOF code: Some early tests show a small decrease in gas usage when executing EOF code compared to legacy code. This will make EOF contracts cheaper to run.
- No stack too deep errors: The introduction of opcodes with immediates, such as SWAPN, DUPN, and EXCHANGE, will allow for more complex stack manipulation, eliminating stack too deep errors. This will enable more complex functions with many local variables and complex calls.
- Lower costs of running nodes: The deploy-time validation and simplified runtime validation will bring down costs of running nodes, allowing for cheaper node services. This will also lower the entry requirements for running nodes.
- Simplified auditing process: With the separation of code and data and the new instruction set, reasoning about the intent of the code becomes easier, making the auditing process a bit easier.
What are the long-term effects of EOF?
The EOF implementation will also shape the ecosystem in the long run, bringing some significant benefits to developers, as well as the general community.
- A conduit for innovation: The removal of on-chain code and gas observability creates an environment more suitable for upgrades and changes. By removing opcodes like GAS, CODECOPY, EXTCODEHASH, and others, contracts will behave much more consistently with changes to opcode pricing or the opcodes themselves. This will allow the EVM to evolve more rapidly by enabling easier deprecation of old features and the introduction of new ones.
- Greater EVM customizability: Likewise, the gas and code observability removal allows L2 networks to more easily adapt their version of the EVM to their use cases. This means more specialized opcodes and gas pricing adjustments, tailored to each L2 network's requirements.
- More effective tooling: The imposed structure allows for new analysis tools to be introduced, making future code safer, cheaper and more reliable. Lots of current tools used in traditional programming can be more easily adapted to smart contract development. Auditing with these new tools will be a lot better and more reliable.
- Compiler improvements: EOF1 provides a set of instructions more suitable for compiler improvements. This will allow all compilers to keep improving over time, bringing in even more gas usage reductions.
What are some important considerations for implementing EOF?
EOF introduces, removes, and changes many opcodes. It would be the biggest change to the EVM behavior yet, making it riskier compared to previous opcode changes such as EIP-6780. This can be alleviated with thorough testing and auditing to minimize the potential risk of creating consensus violations.
Code and stack validation needs to be the same across all clients, becoming part of the consensus. This means that their implementation opens up the possibility of consensus bugs. For this reason, defining and implementing them properly is very important.
The biggest downside to EOF is that legacy code will still exist even after EOF is shipped. Currently, it’s extremely difficult or even impossible to transform old code into EOF, so legacy contracts will remain for now and execution clients will have to maintain both the legacy and the EOF execution paths.
Supporting the ecosystem growth
Aside from a few important risk considerations, EOF brings many long-awaited benefits to the ecosystem. It sets the stage for innovation and faster development in the future, while simultaneously bringing immediate benefits in bytecode size, gas efficiency, gas prices, and better analysis.
Additionally, L2 networks will reap great benefits in the ease of both tracking different features and customization. EOF also makes the problem of “L1 compatibility” a lot easier to solve.
On top of these direct benefits, EOF brings indirect benefits by enabling the development of new tooling and facilitating the auditing process. EOF bytecode will be much simpler and more readable than the current bytecode, making these processes easier and more reliable.