[EthMag] Proposed scheme for encoding Ethereum state into a Verkle tree
One of the things that I want to do at the same time as moving the current Ethereum state to a Verkle tree to cut witness sizes is to move it from the current 2-layer "trie inside a trie" structure to a single-layer structure.
You may have seen diagrams of the Ethereum state like this before:
The state consists of a set of accounts, where each account contains a header. Each header in turn contains a pointer to a piece of code, and a root of another whole account-specific tree. This structure has a few really annoying problems:
I propose an alternative, which fits the entire Ethereum state into a single key:value trie. That is, every position in the state (eg.
There is no "trie of tries" design, and there is also no special serialization for a header; instead, header and code are both mapped to keys, just like storage. Values which share the first 31 bytes of their key are put into the same bottom-layer commitment; this saves witness space for common use cases (verifying many header fields or code chunks or adjacent storage slots). Note that accounts will no longer be "under one subtree" the way they are today; different parts of the storage of an account will be in different locations in the state trie.
For the sake of illustration, here's part of the code in my proposal, which shows how values inside addresses will get mapped to trie keys:
Notice in particular how values in the header get mapped to trie keys that share the first 31 bytes, so they are part of the same bottom-level commitment, and adjacent storage slots (and code too) benefit from the same treatment.
Additionally, storage slots 0…63 have the privilege of being part of the same bottom-level commitment as the header, so witness size for accessing these frequently used slots is really low. Gas costs (see next section) will be adjusted to take advantage of this; the current proposal charges a flat 200 gas per slot for these first 64 slots.
Witness gas cost reform
An important reform that needs to happen alongside this is gas cost reform, finishing the work that was started in EIP 2929 by making gas costs closely map to witness data usage in all cases. This allows us to ensure tight witness size bounds for Ethereum blocks (even including worst-case attacks on the trie that require 2**80 computing power, we're looking at ~6 MB witnesses; under more "normal" circumstances witness size will be <2 MB).
A necessary part of this is introducing gas costs for accessing code chunks (see this proto-EIP; the current plan is ~200 gas per 31-byte chunk). That proto-EIP goes further, however, and harmonizes gas costs more generally so that they align with this trie structure: 1900 gas for a new branch proof, and 200 gas for a new value in the same bottom-level commitment.
Some preliminary analysis of code chunk gas costs have shown that this may increase contract calling cost by ~10%. However, this new gas scheme will also introduce some important benefits for developers. A notable benefit is that gas costs for accessing many adjacent storage slots (eg. single persistent variables in Solidity are typically mapped to storage slots 0, 1, 2….n) will drop from ~2100 per storage slot to ~200 per storage slot after the first. Hence, well-designed contracts could easily get savings that make up for the increased cost of code access. Additionally, the per-chunk and per-subtree costs mean that contracts that call to libraries will no longer be much less efficient than large contracts that keep all their code together in one piece.