Breaking down the EVM

What are the inner workings of the EVM?

Flavius Burca
#blockchain

What is the Ethereum Virtual Machine?

The Ethereum Virtual Machine (or EVM) is a stack-based computer responsible for executing smart contract instructions. All EVM instructions take their parameters from the stack, except for PUSHx, which takes its parameter from the code. Each instruction has stack inputs (the parameters it may need) and stack outputs (its return values).

What is a smart contract?

A smart contract is a set of instructions. Each instruction is an opcode: a one-byte value between 0 and 255, with a handy mnemonic (a text representation) for reference. When the EVM executes a smart contract, it reads and executes each instruction sequentially, except for JUMP and JUMPI instructions, which can redirect execution. If an instruction cannot be executed, for instance because there are not enough values on the stack or there is insufficient gas, the execution reverts. A revert can also be triggered with the REVERT opcode; REVERT refunds the unused gas of its call context, while other causes of revert consume it all. When a transaction reverts, any state changes made by its instructions are rolled back to their state before the transaction.

The Execution Environment

When the EVM executes a smart contract, a context is created for it. This context is made of several data regions, each with a distinct purpose, as well as variables, such as the program counter, the current caller, callee and the address of the current code.

Code

The code is the region where instructions are stored. Instruction data stored in the code is persistent as part of a contract account state field. Externally owned accounts (or EOAs) have empty code regions. Code is the bytes read, interpreted, and executed by the EVM during smart contract execution. Code is immutable, which means it cannot be modified, but it can be read with the instructions CODESIZE and CODECOPY. The code of one contract can be read by other contracts, with instructions EXTCODESIZE and EXTCODECOPY.

Program Counter

The Program Counter (PC) encodes which instruction, stored in the code, the EVM should read next. The program counter is usually incremented by one byte to point to the following instruction, with some exceptions. For instance, a PUSHx instruction is longer than a single byte, and causes the PC to skip its parameter. The JUMP instruction does not increment the PC; instead, it sets the program counter to a position specified by the top of the stack. JUMPI does this as well if its condition is true (a nonzero value on the stack); otherwise, it increments the PC like other instructions.
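The way the PC advances can be illustrated with a minimal Python sketch. The function name `walk_code` and the sample bytecode are made up for this illustration; the sketch only walks the code linearly and does not follow JUMP or JUMPI.

```python
PUSH1, PUSH32 = 0x60, 0x7F  # opcode range of PUSH1..PUSH32


def walk_code(code: bytes):
    """Return (pc, opcode, immediate) triples, skipping PUSH parameters."""
    steps = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1                    # PUSHn carries n parameter bytes
            immediate = code[pc + 1 : pc + 1 + n]
            steps.append((pc, op, immediate))
            pc += 1 + n                           # skip the opcode and its parameter
        else:
            steps.append((pc, op, b""))
            pc += 1                               # ordinary one-byte instruction
    return steps


# PUSH1 0x02, PUSH1 0x03, ADD, STOP
program = bytes.fromhex("600260030100")
for pc, op, immediate in walk_code(program):
    print(pc, hex(op), immediate.hex())
```

Here the two PUSH1 instructions occupy two bytes each, so the PC visits offsets 0, 2, 4 and 5.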

Stack

The stack is a list of 32-byte elements used to store smart contract instruction inputs and outputs. There is one stack created per call context, and it is destroyed when the call context ends. When a new value is put on the stack, it is put on top, and only the top values are used by the instructions. The stack currently has a maximum limit of 1024 values. Most instructions interact with the stack, and it can also be directly manipulated with instructions like PUSH1, POP, DUP1, or SWAP1.
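As a rough sketch of these rules (32-byte elements, a 1024-value limit, access from the top only), the stack could be modelled in Python as follows. The class and method names are illustrative, not taken from any EVM implementation.

```python
WORD_MASK = (1 << 256) - 1   # a stack element is 32 bytes (256 bits)
STACK_LIMIT = 1024


class Stack:
    def __init__(self):
        self.items = []

    def push(self, value: int) -> None:
        if len(self.items) >= STACK_LIMIT:
            raise OverflowError("stack overflow")
        self.items.append(value & WORD_MASK)      # wrap to 256 bits

    def pop(self) -> int:
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()

    def dup(self, n: int) -> None:
        """DUPn: copy the n-th element from the top onto the top."""
        if len(self.items) < n:
            raise IndexError("stack underflow")
        self.push(self.items[-n])

    def swap(self, n: int) -> None:
        """SWAPn: exchange the top element with the (n+1)-th from the top."""
        if len(self.items) < n + 1:
            raise IndexError("stack underflow")
        self.items[-1], self.items[-n - 1] = self.items[-n - 1], self.items[-1]
```

In a real EVM, an overflow or underflow makes the instruction fail and the context revert; the exceptions here stand in for that behaviour.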

Memory

EVM memory is not persistent, and is destroyed at the end of the call context. At the start of a call context, memory is initialized to 0. Reading from and writing to memory is usually done with the MLOAD and MSTORE instructions respectively, but memory can also be accessed by other instructions like CREATE or EXTCODECOPY. We discuss memory size calculations later in this document.
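A minimal Python sketch of this behaviour, ignoring gas, might look as follows. The `Memory` class is hypothetical; it assumes that memory grows in 32-byte words and that MLOAD and MSTORE operate on 32-byte big-endian values.

```python
class Memory:
    """Zero-initialized, word-granular byte array (gas is ignored here)."""

    def __init__(self):
        self.data = bytearray()

    def _touch(self, offset: int, size: int) -> None:
        if size == 0:
            return
        needed = (offset + size + 31) // 32 * 32   # round up to a whole word
        if needed > len(self.data):
            self.data.extend(bytes(needed - len(self.data)))  # new bytes are 0

    def mstore(self, offset: int, value: int) -> None:
        self._touch(offset, 32)
        self.data[offset : offset + 32] = value.to_bytes(32, "big")

    def mload(self, offset: int) -> int:
        self._touch(offset, 32)
        return int.from_bytes(self.data[offset : offset + 32], "big")

    def msize(self) -> int:
        return len(self.data)
```

Note that in this sketch a read also grows memory: loading from an untouched offset returns 0 and increases the reported size.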

Storage

Storage is a map of 32-byte slots to 32-byte values. Storage is the persistent memory of smart contracts: each value written by the contract is retained past the completion of a call, unless its value is changed to 0, or the SELFDESTRUCT instruction is executed. Reading stored bytes from an unwritten key also returns 0. Each contract has its own storage, and cannot read or modify storage from another contract. Storage is read and written with instructions SLOAD and SSTORE.
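The key-value behaviour described above can be sketched with a Python dictionary. The `Storage` class below is only an illustration: it models one contract's storage and leaves out gas, SELFDESTRUCT and persistence between transactions.

```python
class Storage:
    """Per-contract map of slot keys to values; unwritten slots read as 0."""

    def __init__(self):
        self.slots = {}

    def sload(self, key: int) -> int:
        return self.slots.get(key, 0)

    def sstore(self, key: int, value: int) -> None:
        if value == 0:
            self.slots.pop(key, None)   # writing 0 clears the slot
        else:
            self.slots[key] = value


storage = Storage()
storage.sstore(1, 42)
print(storage.sload(1), storage.sload(2))
```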

Calldata

The calldata region is the data sent along with a transaction. For example, when creating a contract, the calldata would be the constructor code of the new contract. Calldata is immutable, and can be read with the instructions CALLDATALOAD, CALLDATASIZE, and CALLDATACOPY. It is important to note that when a contract executes an xCALL instruction, it also creates an internal transaction. As a result, when executing xCALL, there is a calldata region in the new context.
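As an illustration of reading calldata, the sketch below mimics CALLDATALOAD and CALLDATASIZE in Python. The function names are made up for this example; it assumes that CALLDATALOAD returns a 32-byte word and that bytes past the end of the calldata read as 0.

```python
def calldataload(calldata: bytes, offset: int) -> int:
    """Read a 32-byte word at `offset`, zero-padded past the end."""
    chunk = calldata[offset : offset + 32]
    return int.from_bytes(chunk.ljust(32, b"\x00"), "big")


def calldatasize(calldata: bytes) -> int:
    return len(calldata)


# A 4-byte payload: the word read at offset 0 holds it in the top 4 bytes.
data = bytes.fromhex("a9059cbb")
print(hex(calldataload(data, 0)), calldatasize(data))
```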

Return Data

The return data is the way a smart contract can return a value after a call. It can be set by contract calls through the RETURN and REVERT instructions, and can be read by the calling contract with RETURNDATASIZE and RETURNDATACOPY.

Gas Costs

Each transaction on the Ethereum blockchain is vetted by a third-party validator, before it is added to the blockchain. These validators are compensated for conducting this vetting process, and adding transactions to the blockchain, with incentive fee payments. Fees vary from transaction to transaction, contingent on different variables for different forks. Some variables in calculating fees include:

  • Current price of one gas unit: Gas is the unit that measures the work done by a transaction. Its price is usually quoted in gwei, a denomination of ETH (1 gwei = 10^-9 ETH). Gas prices vary over time based on current demand for block space, and are measured in ETH per gas.
  • Calldata size: Each calldata byte costs gas, so the larger the transaction data, the higher the gas fees. Calldata costs 4 gas per byte equal to 0, and 16 gas for every other byte (68 before the Istanbul hardfork).
  • Intrinsic Gas: Each transaction has an intrinsic cost of 21000 gas. Creating a contract costs 32000 gas on top of the transaction cost. The calldata cost described above (4 gas per zero byte, 16 gas per nonzero byte) is also part of the intrinsic gas. This cost is paid by the transaction before any opcode or transfer is executed.
  • Opcode Fixed Execution Cost: Each opcode has a fixed cost to be paid upon execution, measured in gas. This cost is the same for all executions, though it is subject to change in new hardforks.
  • Opcode Dynamic Execution Cost: Some instructions do more work than others, depending on their parameters. Because of this, on top of fixed costs, some instructions have dynamic costs. These dynamic costs are dependent on several factors (which vary from hardfork to hardfork).
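The intrinsic part of the fee can be estimated with a small Python sketch that uses only the figures listed above (21000 base, 32000 for contract creation, 4 and 16 gas per calldata byte). The function name is hypothetical, and charges introduced by other protocol changes, such as access lists, are deliberately left out.

```python
def intrinsic_gas(calldata: bytes, is_contract_creation: bool = False) -> int:
    """Gas charged before any opcode runs (post-Istanbul calldata prices)."""
    zero_bytes = calldata.count(0)
    nonzero_bytes = len(calldata) - zero_bytes
    gas = 21000 + 4 * zero_bytes + 16 * nonzero_bytes
    if is_contract_creation:
        gas += 32000
    return gas


# A plain transfer with no calldata pays only the base cost.
print(intrinsic_gas(b""))
# Four nonzero bytes and one zero byte add 4 * 16 + 4 gas.
print(intrinsic_gas(bytes.fromhex("a9059cbb00")))
```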

To get a complete estimation of the gas cost for your program, with your compiler options and specific state and inputs, use a tool like Remix or Truffle.

Gas Refunds

Some opcodes can trigger gas refunds, which reduce the gas cost of a transaction. Gas refunds are applied at the end of a transaction. If a transaction has insufficient gas to reach the end of its run, its gas refund cannot be triggered, and the transaction fails. With the introduction of the London hardfork, two aspects of gas refunds changed. First, the limit on how much gas can be refunded was lowered from half of the total transaction cost to one fifth of the total transaction cost. Second, the SELFDESTRUCT opcode can no longer trigger gas refunds; only SSTORE can.
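The refund cap can be expressed as a short Python sketch. The function and parameter names are illustrative; `gas_used` stands for the total gas consumed by the transaction and `refund_counter` for the refunds accumulated during execution.

```python
def effective_refund(gas_used: int, refund_counter: int, london: bool = True) -> int:
    """Refund actually applied at the end of a transaction."""
    cap_divisor = 5 if london else 2      # one fifth after London, one half before
    return min(refund_counter, gas_used // cap_divisor)


# With 100000 gas used and 30000 gas of accumulated refunds:
print(effective_refund(100_000, 30_000))                # capped at one fifth
print(effective_refund(100_000, 30_000, london=False))  # under the one-half cap
```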

Memory Expansion

During a smart contract execution, memory can be accessed with opcodes. When an offset is first accessed (either read or write), memory may trigger an expansion, which costs gas.

Memory expansion may be triggered when the accessed byte offset, rounded up to a multiple of 32, is larger than any previously accessed offset. When this happens, the cost of reaching the higher offset is computed and deducted from the gas available in the current call context.

The total cost for a given memory size is computed as follows (divisions are integer divisions):


memory_size_word = (memory_byte_size + 31) / 32
memory_cost = (memory_size_word ** 2) / 512 + (3 * memory_size_word)

When a memory expansion is triggered, only the additional bytes of memory must be paid for. The cost of memory expansion for a specific opcode is therefore:


memory_expansion_cost = new_memory_cost - last_memory_cost

The memory_byte_size can be obtained with the opcode MSIZE. The cost of memory expansion grows quadratically with memory size, disincentivizing the overuse of memory by making higher offsets more costly. Any opcode that accesses memory may trigger an expansion (such as MLOAD, RETURN or CALLDATACOPY). Note that opcodes with a byte size parameter of 0 will not trigger memory expansion, regardless of their offset parameters.
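The formulas above translate directly into Python. This is a sketch under the assumption that all divisions are integer divisions; the helper `memory_expansion_cost` and its parameters (`current_size`, `offset`, `size`) are names chosen for this example.

```python
def memory_cost(memory_byte_size: int) -> int:
    """Total gas cost of having `memory_byte_size` bytes of memory."""
    words = (memory_byte_size + 31) // 32
    return words * words // 512 + 3 * words


def memory_expansion_cost(current_size: int, offset: int, size: int) -> int:
    """Extra gas owed when an opcode accesses `size` bytes at `offset`."""
    if size == 0:
        return 0                                   # zero-length access never expands
    new_size = max(current_size, offset + size)
    return memory_cost(new_size) - memory_cost(current_size)


print(memory_expansion_cost(0, 0, 32))     # first word
print(memory_expansion_cost(32, 0, 32))    # already paid for
print(memory_expansion_cost(32, 32, 32))   # second word
```

For small sizes the linear term dominates (3 gas per word); the quadratic term only starts to matter once memory reaches a few hundred words.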

Access Sets

Access sets are defined per external transaction, and not per call. Each transaction may be defined by some combination of its sender, calldata, or callee. Transactions can either be external or internal. External transactions are sent to the Ethereum network. Internal transactions are triggered by external transactions that have executed the xCALL instruction. As such, internal transactions are also known as calls. Access sets can be thought of as two independent types of lists: those of touched addresses, and those of touched contract storage slots.

When an address is accessed by a transaction or an instruction, or is used as caller or callee, it is put in the access set. Calling the opcode BALANCE on an address that is not present in the access set costs more than if the address were already in the set. Other opcodes that can modify the access set include EXTCODESIZE, EXTCODECOPY, EXTCODEHASH, CALL, CALLCODE, DELEGATECALL, STATICCALL, CREATE, CREATE2 and SELFDESTRUCT. Each opcode has its own cost when modifying the access set.

Touched slot lists are sets of storage slot keys accessed by contract addresses. Slot lists are initialized to empty. When an opcode accesses a slot that is not present in the set, the slot is added to the set. The opcodes that can modify the touched slot list are SLOAD and SSTORE. Again, both opcodes have their own cost when modifying the access set.

If a context is reverted, sets are reverted to their state before the context.

If an address or storage slot is present in the set, it is called 'warm'; otherwise it is 'cold'. Storage slots that are touched for the first time in a transaction change from cold to warm for the duration of the transaction. Transactions can pre-specify addresses and storage slots as warm using EIP-2930 access lists. The dynamic cost of some opcodes depends on whether an address or slot is warm or cold. After the Berlin hardfork, all precompiled contract addresses are always 'warm'.
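The warm and cold bookkeeping can be sketched in Python as two sets with a snapshot for reverts. The class below is illustrative only. The gas figures are not from this article: they are the EIP-2929 constants for a plain account access (as with BALANCE) and a storage read (SLOAD), and other opcodes add their own costs on top.

```python
COLD_ACCOUNT_ACCESS = 2600   # EIP-2929: first access to an address
COLD_SLOAD = 2100            # EIP-2929: first read of a storage slot
WARM_ACCESS = 100            # EIP-2929: address or slot already in the set


class AccessSets:
    def __init__(self, warm_addresses=(), warm_slots=()):
        # Pre-warmed entries model an EIP-2930 access list.
        self.addresses = set(warm_addresses)
        self.slots = set(warm_slots)

    def touch_address(self, address: str) -> int:
        if address in self.addresses:
            return WARM_ACCESS
        self.addresses.add(address)
        return COLD_ACCOUNT_ACCESS

    def touch_slot(self, address: str, key: int) -> int:
        if (address, key) in self.slots:
            return WARM_ACCESS
        self.slots.add((address, key))
        return COLD_SLOAD

    def snapshot(self):
        return set(self.addresses), set(self.slots)

    def revert(self, snap) -> None:
        # A reverted context restores the sets to their earlier state.
        self.addresses, self.slots = set(snap[0]), set(snap[1])
```

A typical use is to take a snapshot before entering a call context and to restore it if that context reverts, so that entries warmed inside the reverted context become cold again.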