A zone-structured unified binary state tree for Ethereum with uniform 256-bit keys, structural per-account subtree boundaries, and future-proof collision resistance.
The Partitioned Binary Tree (PBT) is a unified binary state tree for Ethereum that stores all state — account headers, contract storage, and contract code — in a single non-sparse binary trie while preserving structural boundaries between semantic categories of data.
Unlike a flat unified tree where all key-value pairs are undifferentiated, PBT uses a deterministic key derivation scheme that embeds zone and account identity information directly into the key structure. This creates topological boundaries in the tree that protocols can reason about: a node at a known depth is always the root of a specific account's storage, or the root of all code, etc.
PBT uses uniform 256-bit keys across all zones, with zone prefixes encoding data category. In a non-sparse trie, unused key space costs nothing — actual tree depth adapts to content density, not key width. Storage achieves collision resistance within 256 bits by combining a 60-bit address prefix for account-local grouping with a 187-bit stem suffix derived from both the address and storage index.
PBT partitions the key space into three zones, each storing a different category of Ethereum state. The zone is encoded in the most significant bits of every key using variable-width prefixes: 1 bit for storage, 3 bits for accounts and code, and 2 bits for a reserved zone.
Storage uses a 1-bit prefix ("1"), giving it 50% of the top-level tree address space. At depth 1, the right child of the root IS the root of all storage. Accounts ("000") and code ("001") use 3-bit prefixes, each occupying 12.5%. Prefix "01" (25%) is reserved for future state categories.
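A minimal sketch of the prefix scheme (Python; the constant and function names are ours, not part of any spec):

```python
# Zone prefixes as (bits, width) pairs, following the layout described above.
ZONE_ACCOUNTS = (0b000, 3)   # 12.5% of key space
ZONE_CODE     = (0b001, 3)   # 12.5%
ZONE_RESERVED = (0b01,  2)   # 25%, reserved for future state categories
ZONE_STORAGE  = (0b1,   1)   # 50%; the right child of the root at depth 1

def zone_of(key: int) -> str:
    """Classify a 256-bit key by its most significant bits."""
    if key >> 255 == 0b1:
        return "storage"
    if key >> 254 == 0b01:
        return "reserved"
    return "code" if key >> 253 == 0b001 else "accounts"
```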
Ethereum's state is approximately 75% contract storage, 20% account headers, and 5% code. Allocating equal tree space to all data types would create an imbalanced tree where the storage subtree is significantly denser than the others.
By giving storage a 1-bit prefix (50% of tree space), the tree is more balanced at the top levels. This reduces average proof depth for storage lookups — the most common operation — and creates a natural split at depth 1: left child = accounts and code, right child = all storage.
Every key in PBT is 256 bits and encodes semantic segments: the zone prefix (which data category), address-derived hash bits (identifying the Ethereum account), a path segment (which item within the account), and a sub-index (position within a stem group of 256 leaves). Storage keys additionally include an address prefix for account locality.
Each Ethereum account occupies a single header stem in zone 000, following the EIP-7864 layout. The stem holds up to 194 populated leaves: account metadata, the first 64 storage slots, and the first 128 code chunks — all co-located under one 248-bit prefix. The 245-bit address hash provides 2¹²²·⁵ birthday resistance.
BASIC_DATA packing (32 bytes):
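A sketch of the packing, following the version/reserved/code_size/nonce/balance layout used throughout this document (big-endian field encoding assumed; the function name is illustrative):

```python
def pack_basic_data(version: int, code_size: int, nonce: int, balance: int) -> bytes:
    """version(1) | reserved(4) | code_size(3) | nonce(8) | balance(16) = 32 bytes.
    Big-endian encoding assumed; reserved bytes are zero for now."""
    assert code_size < 2**24 and nonce < 2**64 and balance < 2**128
    return (version.to_bytes(1, "big")
            + b"\x00" * 4                      # reserved: future protocol fields
            + code_size.to_bytes(3, "big")
            + nonce.to_bytes(8, "big")
            + balance.to_bytes(16, "big"))
```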
The 3-byte code_size supports contract sizes up to 16 MB — orders of magnitude beyond any foreseeable contract size limit (current mainnet: 24 KB, Glamsterdam: 40 KB). The 4 bytes of reserved space are available for future protocol fields (epoch counters, flags, etc.).
The header stem packs account metadata, hot storage, and initial code into a single stem of up to 194 leaves. All share the 248-bit prefix 000 || H(addr)[:245] and differ only in the last 8 bits (sub_index). Proving an account's nonce, balance, and first few storage slots requires opening only one stem — maximizing proof sharing for the most common access patterns.
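A hedged sketch of the header-stem key derivation (the top-bits truncation convention and byte serialization are assumptions; the sub-index constants follow the EIP-7864 layout referenced above):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()  # stand-in; PBT is hash-agnostic (see Open Questions)

def header_key(addr: bytes, sub_index: int) -> int:
    """Zone 000 key: 000 || H(addr)[:245] || sub_index.
    Sub-index layout: 0x00 BASIC_DATA, 0x01 CODE_HASH,
    0x40 + i for storage slot i < 64, 0x80 + i for code chunk i < 128."""
    assert 0 <= sub_index < 256
    stem = int.from_bytes(H(addr), "big") >> 11   # keep the top 245 bits
    return (0b000 << 253) | (stem << 8) | sub_index
```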
128 code chunks × 31 bytes = ~4 KB of per-account code. This covers the majority of deployed contracts by count. Contracts larger than ~4 KB store their remaining chunks in the content-addressed code zone (zone 001).
64 storage slots × 32 bytes = 2 KB of hot storage. Slots 0–63 are the most accessed in typical contracts (ERC-20 balances, ownership, reentrancy guards). Slots ≥64 are stored in the storage zone (zone 1).
Code chunks beyond the first 128 (which live in the account header stem) are stored in zone 001 using content-addressed keys. The stem path is derived from code_hash (keccak256 of the full bytecode) and tree_index, so contracts with identical bytecode share code zone leaves — deduplicating storage for thousands of identical ERC-20 contracts from the same factory.
tree_index = (chunk_id − 128) // 256. The header stem covers chunks 0–127. With Glamsterdam's 40 KB limit: ~1,290 total chunks. Header covers 0–127, code zone covers 128–1,289: tree_index ∈ {0, 1, 2, 3, 4}. Sequential chunks share stems: chunks 128–383 share one stem, 384–639 another.
Collision security: H(code_hash || tree_index)[:245] provides 2¹²²·⁵ birthday resistance between any two distinct (code_hash, tree_index) pairs. Since code_hash is keccak256(bytecode), a 256-bit uniform output, truncation to 245 bits preserves uniformity. A collision would require finding two distinct bytecodes whose hashed (code_hash, tree_index) pairs match in 245 bits — computationally infeasible.
Why 8 bits for sub_index? The sub_index = (chunk_id − 128) % 256 groups up to 256 consecutive chunks under the same stem node. Each stem covers ~8 KB of bytecode (256 × 31-byte chunks). This is the same 8-bit stem width used in all zones for consistency. Sequential code execution naturally reads consecutive chunks within a stem.
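Putting the pieces together, a sketch of code-zone key derivation (the 32-byte big-endian serialization of tree_index and the top-bits truncation are assumptions):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()  # stand-in; PBT is hash-agnostic

def code_zone_key(code_hash: bytes, chunk_id: int) -> int:
    """Zone 001 key for code chunk chunk_id >= 128.
    Chunks 0-127 live in the account header stem instead."""
    assert chunk_id >= 128
    tree_index = (chunk_id - 128) // 256
    sub_index = (chunk_id - 128) % 256
    stem_input = code_hash + tree_index.to_bytes(32, "big")
    stem = int.from_bytes(H(stem_input), "big") >> 11   # top 245 bits
    # 3-bit zone prefix 001 || 245-bit stem || 8-bit sub_index = 256 bits
    return (0b001 << 253) | (stem << 8) | sub_index
```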
Storage slots ≥64 are stored in zone 1 (slots 0–63 live in the account header stem in zone 000). Storage keys are 256 bits — the same width as all other zones. The key uses a 60-bit address prefix for account-local grouping, followed by a 187-bit stem suffix derived from both the address and tree_index. Storage slots are adversarially addressable (any contract can write to any slot key), so the stem suffix must resist birthday attacks — 187 bits provides 2⁹³·⁵ birthday resistance.
tree_index = slot_key // 256 (32-byte big-endian integer division). Slots 64–255 share the first zone 1 stem (tree_index = 0, sub_idx 64–255), slots 256–511 share a stem, etc. Stem co-location is deterministic — adjacent slots always group together. Slots 0–63 are accessed from the account header stem (zone 000, sub_idx 0x40–0x7F) without ever touching zone 1.
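A sketch of the zone 1 derivation under the same assumptions (stand-in hash, top-bits truncation, 32-byte big-endian tree_index):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()  # stand-in; PBT is hash-agnostic

def storage_key(addr: bytes, slot_key: int) -> int:
    """Zone 1 key for storage slot slot_key >= 64.
    Layout: 1 || H(addr)[:60] || H(addr || tree_index)[:187] || sub_index."""
    assert slot_key >= 64                 # slots 0-63 live in the header stem
    tree_index = slot_key // 256
    sub_index = slot_key % 256
    prefix = int.from_bytes(H(addr), "big") >> (256 - 60)    # 60-bit bucket
    suffix_input = addr + tree_index.to_bytes(32, "big")
    suffix = int.from_bytes(H(suffix_input), "big") >> (256 - 187)
    return (0b1 << 255) | (prefix << 195) | (suffix << 8) | sub_index
```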
In practice, most Ethereum contracts use at most a few thousand storage slots, so the useful information in the stem suffix is on the order of 10–15 bits of entropy. Yet we allocate 187 bits. Most of this key space will never contain leaves.
This is not a problem — and it is exactly how Ethereum's current Merkle Patricia Trie already works. In the MPT, storage keys are keccak256(slot_key) — a full 256-bit hash. A typical contract uses fewer than 1,000 of the 2²⁵⁶ possible slots. The hexary Patricia trie adapts: it only creates nodes where keys actually diverge, resulting in ~3–4 levels of branching (log₁₆(1000) ≈ 2.5), not the theoretical maximum of 64 levels.
PBT's non-sparse binary trie behaves identically. A contract with 1,000 storage slots produces ~10 levels of branching below the per-account bucket boundary (log₂(1000) ≈ 10). Empty regions of the key space create no nodes, no storage overhead, and no proof overhead. The wide key space exists solely to provide cryptographic collision resistance — the same function it serves in today's MPT, but made explicit.
The 60-bit address prefix creates a soft per-account bucket boundary at depth 61 in the storage zone (1-bit zone prefix + 60-bit prefix). For the vast majority of accounts, this bucket contains exactly one account's storage — providing de facto per-account grouping.
Prefix collisions are not consensus-fatal. At 10 billion accounts, the expected number of address pairs sharing a 60-bit prefix is ~43. These accounts would share a bucket in the tree, meaning their storage leaves are interleaved below depth 61. This is benign — the accounts' data remains distinguishable via the stem suffix H(addr || tree_index), which binds each stem to both the address and the slot index. Two accounts sharing a bucket simply have slightly reduced parallelism at that bucket boundary.
Why H(addr || tree_index)? The stem suffix hashes the address together with the tree_index, not just the tree_index alone. This ensures that two different accounts sharing the same 60-bit prefix produce independent stem suffixes — there is no structural correlation between their storage layouts. An attacker who finds a prefix collision gains no ability to predict or influence the victim's storage key positions within the shared bucket.
Each zone uses a different amount of address-derived bits, tailored to its threat model and structural needs:
Account zone (000): H(addr)[:245] — the full hash (minus zone prefix and sub_index) provides maximum collision resistance at 2¹²²·⁵ birthday bound. The header stem holds up to 194 leaves (BASIC_DATA, CODE_HASH, 64 storage slots, 128 code chunks), all sharing the same 248-bit prefix.
Code zone (001): H(code_hash || tree_index)[:245] — content-addressed stems with 2¹²²·⁵ birthday resistance. No per-account boundary in the code zone; per-account code locality is provided by the first 128 chunks in the header stem (zone 000). Contracts with identical bytecode share code zone leaves.
Storage zone (1): H(addr)[:60] prefix + H(addr || tree_index)[:187] suffix — a 60-bit prefix creates a soft per-account bucket at depth 61. The 187-bit suffix is bound to both address and slot index, providing 2⁹³·⁵ birthday resistance within each bucket. This design fits storage's unique requirements: many adversarially-chosen slots per account, with account-local grouping that doesn't require extended key width.
All zones use 256-bit keys. There is no variable-width complexity. Each zone achieves its security and structural goals within the same 256-bit envelope:
Storage achieves collision resistance via a 187-bit stem suffix (birthday 2⁹³·⁵) within 256 bits. The key structure uses H(addr || tree_index) to bind each stem to both the address and slot index, preventing cross-account structural correlation.
Account locality in the storage zone is achieved via a 60-bit address prefix, providing soft per-account grouping without extending key width. At 10B accounts, ~43 address pairs share a prefix — benign because prefix collisions affect only grouping, not data integrity.
Code chunks use content-addressed keys: H(code_hash || tree_index)[:245] provides 2¹²²·⁵ birthday resistance and enables bytecode deduplication. Account headers use 245 bits of address hash for maximum per-account collision resistance (birthday 2¹²²·⁵).
The EVM never sees tree keys: contracts read and write state through SSTORE/SLOAD exactly as they do today. The entire key derivation happens inside the Ethereum client, below the EVM layer. No contract, Solidity code, or Yul assembly needs to change. This is the same abstraction boundary that exists today: the MPT internally hashes every storage slot through keccak256(slot_key) to produce a 256-bit trie path, and every account address through keccak256(address) — yet no contract is aware of this transformation. PBT simply replaces what happens behind that boundary.

The key derivation creates deterministic structural boundaries at known depths. Any implementation or protocol can compute the exact depth of a per-account subtree root, a stem node, or a zone root without inspecting the tree.
Per-account boundaries differ by zone, reflecting each zone's structure:
Code zone (001): No per-account structural boundary. The code zone uses content-addressed keys — H(code_hash || tree_index)[:245] — so stems are organized by bytecode content, not by account. Per-account code locality is provided by the first 128 code chunks in the account header stem (zone 000, sub_idx 0x80–0xFF). Code zone leaves for a given bytecode are shared across all contracts with that bytecode.
Storage zone (1): Probabilistic per-account grouping at depth 61 (1-bit zone prefix + 60-bit address prefix). For the vast majority of accounts, the node at path 1 || H(addr)[:60] roots exactly one account's storage. For the ~43 collision pairs expected at 10B accounts, two accounts share a bucket — their data is interleaved below depth 61 but remains distinguishable via the stem suffix.
Account zone (000): Each account's header stem holds up to 194 leaves (BASIC_DATA, CODE_HASH, 64 storage slots, 128 code chunks). All share the 248-bit prefix 000 || H(addr)[:245] and form a single stem. This is the primary per-account structural boundary in PBT.
In Ethereum's current Merkle Patricia Trie (MPT), an account's leaf value is:
account_trie[hash(address)] = RLP(nonce, balance, storage_root, code_hash)
The storage_root field is the root hash of a separate trie containing that account's storage. This creates a sequential dependency: you cannot finalize the account trie leaf (and thus the state root) until you have computed the storage trie root. The storage root is an input to the account leaf value.
Step 1: Update A's storage trie → compute storage_root_A
↓ MUST FINISH FIRST
Step 2: Update account trie with new storage_root_A → compute state_root
↓ DEPENDS ON STEP 1

In PBT, the account header leaf in zone 000 contains:
basic_data = [version(1) | reserved(4) | code_size(3) | nonce(8) | balance(16)]
code_hash = keccak256(bytecode)
Notice what is not there: storage_root. No leaf value anywhere in the tree contains a hash computed from another part of the tree. Every leaf value is self-contained data — a nonce is just a nonce, a slot value is just a slot value.
When a block modifies account A's nonce (zone 000) and storage slot 5 (zone 1), these are two independent leaf updates in the same tree. Neither leaf's value depends on the other. Root recomputation is a single bottom-up pass:
1. Apply all leaf modifications (any order, or in parallel)
2. Recompute hashes bottom-up from modified leaves to root
3. Done — one pass, one root

The zone 000 branch and zone 1 branch propagate upward INDEPENDENTLY and meet at the root.
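A toy sketch of the single-pass recompute. For brevity this models the tree as a fixed-depth sparse Merkle tree over 256-bit paths; PBT's non-sparse trie collapses empty runs, but the one-pass, order-independent property is the same. All names are ours:

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()  # stand-in tree hash

EMPTY = b"\x00" * 32

def recompute_root(leaves: dict[int, bytes], depth: int = 256) -> bytes:
    """One bottom-up pass over a set of (path -> value) leaves.
    Missing siblings are treated as empty; a real client would merge
    cached subtree hashes for the unchanged parts of the tree."""
    level = {path: H(value) for path, value in leaves.items()}
    for _ in range(depth):
        parents: dict[int, bytes] = {}
        for path in level:
            parent = path >> 1
            if parent not in parents:
                left = level.get(parent << 1, EMPTY)
                right = level.get((parent << 1) | 1, EMPTY)
                parents[parent] = H(left + right)
        level = parents
    return level.get(0, EMPTY)
```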
Zone-level parallelism: The zone 000 (accounts) and zone 1 (storage) subtrees can be updated by separate threads. They share no data dependencies until their hashes merge near the root.
Account-level parallelism within zones: Different accounts' subtrees within the same zone are independent. Updating account A's storage and account B's storage can proceed in parallel — they occupy disjoint subtrees below the bucket boundary.
Stem-level parallelism: Within a single account's subtree, different stems are independent sub-computations that can be hashed in parallel.
A stem is a group of up to 256 leaves that share every bit of their key except the last 8 (the sub_index). In the tree, a stem corresponds to a subtree of depth 8 rooted at the node where the common prefix ends.
The stem concept enables proof compression: when proving multiple values that share a stem, the Merkle branch from root to stem node is shared. Only the individual 8-level openings within the stem differ. For k values in the same stem at effective branch depth d, a shared proof needs roughly d + 8k sibling hashes, versus k(d + 8) for k independent proofs.
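A back-of-the-envelope comparison (ignoring any sibling sharing among the k in-stem openings, so `shared` is a slight overcount):

```python
def proof_hashes(k: int, d: int) -> tuple[int, int]:
    """Sibling-hash counts for k values in one stem at effective branch depth d."""
    shared = d + 8 * k          # one branch to the stem node + k in-stem openings
    independent = k * (d + 8)   # k full branches
    return shared, independent

print(proof_hashes(4, 40))  # (72, 192): ~2.7x fewer hashes for 4 values
```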
In zone 1 (storage), stems are formed by the tree_index = slot_key // 256 derivation. This means slots 0–255 always share a stem, slots 256–511 share a stem, etc. This is deterministic, not hash-dependent.
This co-location is valuable because common contract patterns (arrays, packed structs, sequentially allocated variables) access nearby slots together. Array elements stored in consecutive slots naturally share a stem.
PBT's structural boundaries enable per-account subtree pruning as a natural state expiry mechanism, with different granularity per zone:
Code zone: Content-addressed code zone leaves are potentially shared across contracts with identical bytecode. Expiry requires reference counting (how many live accounts share this code_hash?) or permanent storage. The first 128 code chunks in the header stem (zone 000) expire naturally with the account — no reference counting needed.
Storage zone: Bucket-level expiry at depth 61 (1-bit zone prefix + 60-bit address prefix). The node at path 1 || H(addr)[:60] is the bucket boundary. For the vast majority of accounts, this bucket contains exactly one account's storage. Record the subtree hash and prune below depth 61.
Account zone: Per-account expiry at the stem level. The header stem holds up to 194 leaves (BASIC_DATA, CODE_HASH, 64 storage slots, 128 code chunks). Expiry of the entire header stem removes the account's core data, hot storage, and initial code in one operation.
To resurrect expired data, a client provides a proof consistent with the recorded commitment stub. The subtree is re-attached at its original position.
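A toy model of the expiry/resurrection flow (the bucket map, stub table, and function names are invented for illustration):

```python
# Storage zone modeled as: 61-bit bucket path (1 || H(addr)[:60]) -> subtree hash.
live_buckets: dict[int, bytes] = {}   # live bucket subtree root hashes
stubs: dict[int, bytes] = {}          # commitment stubs for expired buckets

def expire(bucket: int) -> None:
    """Record the bucket's subtree hash as a stub, then prune the subtree."""
    stubs[bucket] = live_buckets.pop(bucket)

def resurrect(bucket: int, provided_subtree_hash: bytes) -> None:
    """Re-attach a subtree only if it is consistent with the recorded stub."""
    if stubs.get(bucket) != provided_subtree_hash:
        raise ValueError("resurrection proof inconsistent with recorded stub")
    live_buckets[bucket] = stubs.pop(bucket)
```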
PBT's zone structure directly enables partial statefulness — a mode where a node stores the complete account trie (zone 000) but only syncs and maintains storage and code for a configured subset of contracts. This is the architecture behind EIP-7928 (Block Access Lists), where a partial state node reduces disk usage from ~640 GB to ~59 GB by skipping untracked contracts' storage entirely.
In PBT, this maps cleanly onto the zone topology. Zone 000 (all account headers) is the subtree at prefix "000" — a self-contained subtree that can be synced independently. A partial node downloads zone 000 in full, giving it complete account data (balances, nonces, code hashes, hot storage, and initial code) for every address. For tracked contracts, it additionally syncs their code zone stems (by code_hash) and storage subtrees in zone 1. For untracked contracts, zones 001 and 1 are simply never fetched.
Within the storage zone, a partial node can restrict scope to specific accounts by syncing only the subtrees below their prefix at depth 61. This gives fine-grained control: a node tracking 100 "hot" contracts downloads only those 100 storage subtrees, not the millions of inactive ones. For code, syncing is by code_hash — the node fetches the relevant content-addressed stems for each tracked contract's bytecode.
Block-by-block state updates for partial nodes come via Block Access Lists (BALs) — per-block diffs specifying which storage slots changed and to what values. A partial node applies BAL diffs directly to the trie for tracked contracts without re-executing transactions, then verifies the resulting state root against the block header. PBT's single-pass root computation (no sequential dependency) means this verification is straightforward: apply leaf updates, recompute bottom-up, compare root.
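A sketch of the partial-node update loop (the BAL field names are invented; `storage_key` is the zone 1 derivation sketched earlier, and `recompute_root` the single-pass sketch above):

```python
from dataclasses import dataclass

@dataclass
class SlotDiff:
    """One BAL entry (field names illustrative, not the EIP-7928 wire format)."""
    address: bytes    # contract whose storage changed
    slot_key: int     # which slot
    new_value: bytes  # 32-byte post-state value

def apply_bal(leaves: dict[int, bytes], diffs: list[SlotDiff],
              tracked: set[bytes], header_state_root: bytes) -> None:
    """Apply per-block diffs for tracked contracts, then verify the root."""
    for d in diffs:
        if d.address in tracked:          # untracked storage is never fetched
            leaves[storage_key(d.address, d.slot_key)] = d.new_value
    if recompute_root(leaves) != header_state_root:
        raise ValueError("state root mismatch after applying BAL diffs")
```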
This architecture also serves as the foundation for Valid-only Partial Statelessness (VOPS): validators that store only a portion of the state can still validate blocks by verifying witness proofs for the portions they don't hold. PBT's structural boundaries make it possible to define precisely which subtrees a validator holds (per-account, per-zone, or any combination) and to verify proofs against the subtree roots at known depths.
Archival nodes can selectively prune cold storage on a per-account basis. Identify accounts whose storage has not been accessed for N epochs. For each, record the bucket root at depth 61 in the storage zone and prune the subtree. The archival node retains zone 000 (header stems, including initial code and hot storage) plus the commitment stubs. Code zone stems are retained only for bytecodes referenced by live accounts.
This allows an archival node to serve balance/nonce/code queries for all accounts while offloading cold storage to dedicated archive providers. The commitment stubs ensure that the archival node can verify storage proofs provided by archive providers without trusting them.
EIP-7503 (zero-knowledge wormholes) proposes plausibly deniable privacy for Ethereum: users "burn" ETH by sending it to a cryptographically unspendable address, then present a zero-knowledge proof to re-mint the funds in a fresh account. The burn transaction looks like a normal transfer — no interaction with a privacy contract is visible on-chain.
The fundamental problem is Ethereum's 160-bit address space. A burn address is derived as trunc_160(H("worm" || secret)). The birthday paradox means an attacker can find two secrets s₁, s₂ producing the same 160-bit address in approximately 2⁸⁰ operations — within reach of nation-states. With such a collision, the attacker deposits once and mints twice with different nullifiers: an infinite inflation bug.
Worse, even "fixing" the nullifier scheme doesn't help. An attacker can find s₁, s₂ such that trunc_160(H("worm" || s₁)) == trunc_160(H(skToPk(s₂))), producing a burn address that is also spendable via a regular private key. This is the same 2⁸⁰ collision search. As the original analysis concludes: plausibly deniable wormholes require collision resistance that 160-bit addresses cannot provide.
PBT's 245-bit account key (in zone 000) raises the birthday bound from 2⁸⁰ to 2¹²²·⁵. The difference is categorical: 2⁸⁰ is attackable by nation-states today; 2¹²²·⁵ is computationally infeasible by a very wide margin, even accounting for foreseeable advances in hardware.
The critical question is where the wormhole identity lives. In EIP-7503's current design, the burn identity is a 160-bit Ethereum address — constrained by the transaction format, not by the tree. PBT does not change this surface constraint. However, PBT creates infrastructure that a protocol-level wormhole could leverage:
Enshrined burn at the tree level. If the protocol natively understands PBT's key structure, a burn operation could target a 245-bit account key directly — bypassing the 160-bit address bottleneck entirely. The burn identity would be H(H("worm" || secret))[:245], a 245-bit value with birthday resistance of 2¹²²·⁵. A BASIC_DATA leaf (sub_idx 0x00) would exist at this key in zone 000, holding the burned balance. The minting proof would demonstrate knowledge of a secret whose derived key maps to an existing balance in the tree.
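A hypothetical sketch of that derivation (nothing here is part of EIP-7503; the double-hash and truncation conventions follow the text above):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()  # stand-in; PBT is hash-agnostic

def burn_account_key(secret: bytes) -> int:
    """Tree-level burn identity: zone 000 key from H(H("worm" || secret))[:245],
    landing on the BASIC_DATA leaf (sub_index 0x00) holding the burned balance."""
    identity = int.from_bytes(H(H(b"worm" + secret)), "big") >> 11  # top 245 bits
    return (0b000 << 253) | (identity << 8) | 0x00
```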
This sidesteps the fundamental tension identified in the wormhole analysis: the collision resistance needed for security is provided by the wider identifier, while plausible deniability is preserved because the burn still creates an ordinary-looking account entry in the tree — indistinguishable from any other account key derived from a regular address.
What PBT enables vs. what it solves. PBT does not implement wormholes. It provides the structural precondition — a wide enough identifier space with collision resistance above the security threshold. Several open problems remain for a complete wormhole design:
Anonymity set. EIP-7503 currently exposes a beacon_block_root as public input, shrinking the anonymity set to transactions in a single block. A tree-level design could prove against a state root (committing to all accounts), potentially expanding the anonymity set to all accounts satisfying a predicate at a given block — but this requires efficient proof generation over PBT's Merkle structure.
Hash function alignment. If PBT uses a SNARK-friendly hash (e.g., Poseidon2), the wormhole proof — which must demonstrate a valid path through the tree — becomes dramatically cheaper to generate. The hash function choice in §10 (Open Questions) has direct implications for wormhole feasibility.
Beacon deposit integration. A complementary approach uses private beacon chain deposits as the wormhole mechanism. With ~33% of ETH staked and tens of millions in daily withdrawals, the anonymity set is large. PBT's structural boundaries could help here too: the deposit's tree footprint is a well-defined subtree that proofs can reference.
| Field | Bits | Birthday Bound | Threat Model | Assessment |
|---|---|---|---|---|
| account key (zone 000) | 245 | 2¹²²·⁵ | Two addresses → same key prefix | Excellent |
| addr_prefix (storage) | 60 | 2³⁰ (~43 pairs at 10B) | Two addresses → same bucket | Not consensus-fatal |
| stem_suffix (storage) | 187 | 2⁹³·⁵ | Two tree_indexes → same stem | Future-proof for decades |
| storage effective | 195 | 2⁹⁷·⁵ | Full key collision within bucket | Excellent |
| code stem (code zone) | 245 | 2¹²²·⁵ | Two (code_hash, tree_index) → same stem | Excellent |
| sub_index | 8 | n/a | Direct mapping, not hashed | Collision impossible* |
* sub_index = slot_key % 256. Two distinct slot keys map to the same stem leaf only if they share the same tree_index AND the same remainder mod 256 — meaning they are the same slot. Collision is logically impossible between distinct keys.
Code zone keys are derived as H(code_hash || tree_index)[:245], where code_hash = keccak256(bytecode). The 245-bit truncation provides 2¹²²·⁵ birthday resistance between any two distinct (code_hash, tree_index) pairs. Since code_hash is a keccak256 output (256 uniformly distributed bits), truncation preserves uniformity. Two distinct bytecodes mapping to the same stem key is computationally infeasible. Contracts with identical bytecode intentionally share stems — this is the deduplication mechanism, not a collision.
Code zone: Content-addressed — contracts with identical bytecode intentionally share code zone leaves. Contracts with different bytecodes are isolated by 245-bit H(code_hash || tree_index) with 2¹²²·⁵ birthday resistance. Per-account code isolation for the first 128 chunks is provided by the header stem in zone 000.
Storage zone: Isolation is probabilistic via the 60-bit address prefix. The ~43 expected collision pairs at 10B accounts share a bucket, meaning their storage leaves are interleaved below depth 61. This is not consensus-fatal — the stem suffix H(addr || tree_index) ensures each account's stems are independent. Two accounts sharing a bucket can coexist, with slightly reduced parallelism at the bucket boundary. An attacker cannot exploit a prefix collision to corrupt another account's storage.
P=60 provides ~43 collision pairs at 10B accounts — is this the right tradeoff? Increasing P strengthens per-account isolation but reduces stem suffix bits. P=64 would give ~2.7 collision pairs at 10B accounts but reduce the stem suffix to 183 bits (birthday 2⁹¹·⁵). P=48 would give ~178K collision pairs — likely too many. The choice depends on how much bucket sharing is tolerable.
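The arithmetic behind these figures (birthday approximation, n = 10¹⁰ accounts):

```python
def expected_collision_pairs(n: int, p: int) -> float:
    """Expected address pairs sharing a p-bit prefix: n(n-1)/2 / 2^p."""
    return n * (n - 1) / 2 / 2**p

for p in (48, 60, 64):
    print(f"P={p}: ~{expected_collision_pairs(10**10, p):,.1f} pairs")
# P=48: ~177,635.7   P=60: ~43.4   P=64: ~2.7
```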
When a storage bucket is expired to a commitment stub at depth 61, how does resurrection work? Options include: (A) full subtree re-provision with a Merkle proof against the stub, (B) incremental resurrection where individual slots are re-proved on access, (C) epoch-based resurrection with bulk re-attachment. Each has different bandwidth and latency tradeoffs.
PBT is hash-function agnostic. The choice affects proving performance (SNARK/STARK friendliness), hashing speed (native execution), and proof compactness. Candidates: BLAKE3 (fast native), Poseidon2 (SNARK-friendly), Keccak (ecosystem compatibility). The tree structure is independent of this choice.
Proving both an account header (zone 000) and a storage slot (zone 1) for the same account requires two Merkle branches. These share the first ~1 level (from root to the depth-1 split). Defining a compact multi-proof format that exploits this sharing needs specification.
When a contract is destroyed, its header stem (including first 128 code chunks) is deleted. But code zone stems (zone 001) may be shared with other contracts via content-addressing. Options: (A) reference counting per code_hash — track how many live accounts use each bytecode, (B) never delete code zone stems — since SELFDESTRUCT is deprecated, dead code stays until a state expiry sweep, (C) garbage collection during state expiry — periodically scan for unreferenced code_hash entries. Option B is simplest and may be sufficient given that most contracts fit within the 128-chunk header.
The EIP-7864 header layout embeds specific sub-index boundaries into the protocol: 0x40 for first storage slot, 0x80 for first code chunk. These constants determine how many hot storage slots (64) and initial code chunks (128) are per-account vs zone-separated. Should these be configurable per-fork, or fixed permanently? Changing them would require migrating all account header stems.
When two accounts share a 60-bit storage prefix (~43 pairs at 10B accounts), bucket-level expiry at depth 61 would expire both accounts' storage together. Should the protocol handle this explicitly (e.g., check for sharing before expiring), accept it as a rare edge case, or use finer-grained expiry within shared buckets?