Etherlink: withdrawal observer

Problem

Several times, users did not receive their withdrawn assets in a timely manner, and could even have lost them, because outbox messages were not executed.

At the moment, a rollup node is used to execute outbox messages when they are ripe (executable). It seems, however, that it misses some. This is hard to diagnose because by the time we see the problem, more than 2 weeks may have passed since the initial cause, which means the logs are lost.

Goals

  • raise alerts asap
    • if withdrawal not in outbox
    • if executable message is not executed
  • light solution
    • use existing RPC as much as possible at first
    • small component
    • no modification to kernel
  • (hope) help debugging rollup node

High Level Plan (tm)

Create a small "withdrawal observer" that monitors withdrawals executed by evm nodes and outbox messages computed by rollup nodes. It will raise an alert if a withdrawal doesn't appear in the outbox, or if an outbox message is not executed on time.

  • All data will be in a SQLite database, easily auditable
  • It communicates with a rollup node and an evm node through RPC (potentially streamed)
  • Alerts will be raised as events written to a file read by Loki and fed to Grafana
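
A minimal sketch of the alert file, assuming one JSON object per line so that whatever ships the file to Loki can parse it. The function name, the field names and the alert kinds (`withdrawal_not_in_outbox`, `message_not_executed`) are hypothetical, not an existing format.

```python
import json
import time
from pathlib import Path


def write_alert(path, kind, **fields):
    """Append one alert event to `path` as a single JSON line.

    `kind` and the extra fields are free-form here (hypothetical names);
    the only assumption is that the file is tailed line by line.
    """
    event = {"ts": time.time(), "level": "alert", "kind": kind, **fields}
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(event, sort_keys=True) + "\n")
    return event
```

Usage would look like `write_alert("alerts.log", "withdrawal_not_in_outbox", tx_hash="0x...", outbox_level=7623066)`. Appending only keeps the file auditable alongside the database.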

Why a standalone component

Possibilities discussed:

  • build a standalone component, communicating with rollup and evm node through RPC
  • add a mode to the evm node, that would look a lot like a standard observer, but track withdrawals and raise alerts

Pros and cons for standalone

Pros:

  • small, no big data dir
  • can be run independently
  • does not need to be updated at the same pace as the evm node
  • can be plugged into any evm node instance, any rollup node instance
  • only populates an audit table
  • independent testing (fewer tests to execute?)

Cons:

  • more RPC connections and messages
  • new binary
  • need a websocket client
  • more code to write

Hypothesis

  • outbox level of a message = L1 level whose kernel execution generated the message in the outbox
  • outbox level of a withdrawal = inbox level in which the blueprint was included

Available information from rollup node

  • RPC to get non-executable messages
  • RPC to get executable messages
  • RPC to know when an outbox level is committed or cemented, and to estimate the time before cementation

Available information from evm

  • the withdrawal precompiled contract publishes enough logs to detect withdrawals
  • we can capture those logs using eth_subscribe
  • table l1_l2_levels_relationships contains the outbox level
    • in a row, l1_level corresponds to the outbox level of the withdrawals in the block finalized_l2_level
    • so to obtain the outbox level corresponding to an L2 block: select l1_level from l1_l2_levels_relationships where finalized_l2_level >= <L2_BLUEPRINT> order by finalized_l2_level asc limit 1;
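
The lookup above can be tried on a toy table. This is only a sketch: the in-memory table models just the two columns used by the query (the real table may have more), the function name is hypothetical, and the levels are made up.

```python
import sqlite3

# Same query as above, with the L2 level passed as a parameter.
QUERY = """
SELECT l1_level
FROM l1_l2_levels_relationships
WHERE finalized_l2_level >= ?
ORDER BY finalized_l2_level ASC
LIMIT 1
"""


def outbox_level_of_l2_block(conn, l2_level):
    """Return the outbox level for an L2 block, or None if not finalized yet."""
    row = conn.execute(QUERY, (l2_level,)).fetchone()
    return None if row is None else row[0]


# Toy data (made-up levels): each row says "L1 level X finalized L2 up to Y".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE l1_l2_levels_relationships "
    "(l1_level INTEGER, finalized_l2_level INTEGER)"
)
conn.executemany(
    "INSERT INTO l1_l2_levels_relationships VALUES (?, ?)",
    [(100, 1000), (101, 1010), (102, 1025)],
)
```

With this data, L2 block 1005 maps to outbox level 101 (the first row whose finalized_l2_level is at least 1005), and a block above 1025 has no outbox level yet.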

Lifecycle of a withdrawal:

  • included in a blueprint by sequencer
  • sent on L1 (blueprint included) - number of L1 blocks depends on the batcher
  • finalized on L1 (in the outbox of a rollup but non-executable) - 2 blocks later (1 really)
  • committed
  • executable (~15 days later)
  • executed (1 block later)
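
The lifecycle can be tracked as an ordered set of stages. A sketch only: the stage names are derived from the list above, and `is_overdue` with its `grace_levels` tolerance is a hypothetical helper for the "executable but not executed" alert.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stages of a withdrawal, in the order listed above."""
    INCLUDED_IN_BLUEPRINT = 1
    SENT_ON_L1 = 2
    FINALIZED_ON_L1 = 3
    COMMITTED = 4
    EXECUTABLE = 5
    EXECUTED = 6


def is_overdue(stage, levels_in_stage, grace_levels):
    """True if a message has been executable for more than `grace_levels`
    L1 levels without being executed (hypothetical alert rule)."""
    return stage == Stage.EXECUTABLE and levels_in_stage > grace_levels
```

Since execution is expected one block after a message becomes executable, a small `grace_levels` should be enough to avoid false alerts.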

If we get the list of withdrawals and their outbox levels, we can establish a one-to-one correspondence with the list of outbox messages: same number, same order.
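
A sketch of that correspondence, assuming every outbox message at a level comes from a withdrawal (which "same number" implies). The function name and the tuple shapes are hypothetical.

```python
from collections import defaultdict


def match_withdrawals(withdrawals, outbox_messages):
    """Pair withdrawals with outbox messages, level by level.

    withdrawals: list of (outbox_level, tx_hash), in L2 execution order.
    outbox_messages: list of (outbox_level, message_index).
    Returns (links, mismatched_levels), where links is a list of
    (tx_hash, outbox_level, message_index).
    """
    w_by_level = defaultdict(list)
    for level, tx_hash in withdrawals:
        w_by_level[level].append(tx_hash)
    m_by_level = defaultdict(list)
    for level, index in outbox_messages:
        m_by_level[level].append(index)

    links, mismatched = [], []
    for level in sorted(set(w_by_level) | set(m_by_level)):
        txs = w_by_level.get(level, [])
        indexes = sorted(m_by_level.get(level, []))
        if len(txs) != len(indexes):
            # Different counts: the correspondence is broken, raise an alert.
            mismatched.append(level)
            continue
        # Same number, same order: the i-th withdrawal is the i-th message.
        links.extend((tx, level, idx) for tx, idx in zip(txs, indexes))
    return links, mismatched
```

A level that ends up in `mismatched_levels` is a candidate for the "withdrawal not in outbox" alert.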

Tasks

  • new RPC in evm node to get the outbox level of an L2 block
    • first: just an RPC
    • then: streamed RPC that can be bootstrapped and restarted
  • extend eth_subscribe to get history (with a starting block)
  • write an OCaml websocket client (vendor more stuff from ocaml-websocket lib)
  • read outbox in rollup state with curl
    http://localhost:8932/global/block/head/outbox/7623066/messages
  • add a query parameter to /local/outbox/pending/unexecutable to get only one level
  • create DB
    • L2 -> L1 level
    • withdraw tx from etherlink (hash, block, etc.)
    • outbox messages (indexed by outbox level and message index)
    • link between withdraw and outbox messages
    • which levels (L1 and L2) have been seen
  • write alerts in a file (events) and read them with Loki
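
A possible starting point for the DB, one table per item of the "create DB" task. Everything here is an assumption: table names, column names and constraints are hypothetical and only illustrate the shape.

```python
import sqlite3

SCHEMA = """
-- L2 -> L1 level
CREATE TABLE IF NOT EXISTS l2_to_l1_levels (
    l2_level INTEGER PRIMARY KEY,
    l1_level INTEGER NOT NULL
);
-- withdrawal logs seen on etherlink
CREATE TABLE IF NOT EXISTS withdrawals (
    id INTEGER PRIMARY KEY,
    tx_hash TEXT NOT NULL,
    log_index INTEGER NOT NULL,
    l2_level INTEGER NOT NULL,
    UNIQUE (tx_hash, log_index)
);
-- outbox messages, indexed by outbox level and message index
CREATE TABLE IF NOT EXISTS outbox_messages (
    outbox_level INTEGER NOT NULL,
    message_index INTEGER NOT NULL,
    executed INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (outbox_level, message_index)
);
-- link between a withdrawal and its outbox message (one to one)
CREATE TABLE IF NOT EXISTS withdrawal_outbox_links (
    withdrawal_id INTEGER PRIMARY KEY REFERENCES withdrawals(id),
    outbox_level INTEGER NOT NULL,
    message_index INTEGER NOT NULL,
    UNIQUE (outbox_level, message_index),
    FOREIGN KEY (outbox_level, message_index)
        REFERENCES outbox_messages (outbox_level, message_index)
);
-- which levels (L1 and L2) have been seen
CREATE TABLE IF NOT EXISTS seen_levels (
    layer TEXT NOT NULL CHECK (layer IN ('L1', 'L2')),
    level INTEGER NOT NULL,
    PRIMARY KEY (layer, level)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the observer database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(SCHEMA)
    return conn
```

The uniqueness constraints on the link table enforce the one-to-one correspondence at the database level, so a bad match fails loudly instead of being stored.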

Attribution:

Edited by Pierre-Emmanuel CORNILLEAU