Smart rollup: unable to restore the state due to "Commitment does not exist" error
I was able to reproduce several times the situation when the rollup node is not able to reindex from scratch due to the missing commitment (unpublished?). It happens when runtime error occurs and forces the node to shutdown. Once the node is restarted with a cleated context it crashes with "Commitment does not exist" error.
Here are the logs snippets that demonstrate that (lines from newest to oldest):
Rollup is progressing normally until L1 node connection issues:
Closing store
Closing context
Shutting down L1
Apr 15 11:33:46.923 - sc_rollup_batcher: worker terminated [sc_rollup_batcher]
Shutting down Batcher
Apr 15 11:33:46.923 - tx_rollup_injector: worker terminated [tx_rollup_injector]
Shutting down Injector
Exiting.
Block with hash BM4qTRYD77uFzoEHP2gTEQ9q564xzzQyR5NvSQrnBt5kD935A2d was not found on the L1 node.
"<html>\r\n<head><title>502 Bad Gateway</title></head>\r\n<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n"
- error: Unexpected error 502:
- uri: https://rpc.tzkt.io/mumbainet/chains/main/blocks/BM4qTRYD77uFzoEHP2gTEQ9q564xzzQyR5NvSQrnBt5kD935A2d?metadata=always
- meth: GET
Rpc request failed:
trace:
Block with hash BM4qTRYD77uFzoEHP2gTEQ9q564xzzQyR5NvSQrnBt5kD935A2d was not found on the L1 node.,
Shutting down RPC server
Error:
Apr 15 11:33:46.922 - main:
Apr 15 11:33:46.922 - main: </html>
Apr 15 11:33:46.922 - main: </body>
Apr 15 11:33:46.922 - main: <hr><center>nginx</center>
Apr 15 11:33:46.922 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:46.922 - main: <body>
Apr 15 11:33:46.922 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:46.922 - main: <html>
Apr 15 11:33:46.922 - main: Original body follows.
Apr 15 11:33:46.922 - main: Attempt number 10/10, will retry after 0.9 seconds.
Apr 15 11:33:45.727 - rpc_http_event: (1) received request /global/block/head/durable/wasm_2_0_0/value?key=/batches/BMKZaarzN9ZJRewPrWptXG2uXoUYboE3L2okrSu9Tpb66saQBsw
Apr 15 11:33:45.726 - rpc_http_event: (1) received request /global/block/head/durable/wasm_2_0_0/value?key=/head
Apr 15 11:33:44.237 - main:
Apr 15 11:33:44.237 - main: </html>
Apr 15 11:33:44.237 - main: </body>
Apr 15 11:33:44.237 - main: <hr><center>nginx</center>
Apr 15 11:33:44.237 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:44.237 - main: <body>
Apr 15 11:33:44.237 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:44.237 - main: <html>
Apr 15 11:33:44.237 - main: Original body follows.
Apr 15 11:33:44.237 - main: Attempt number 9/10, will retry after 0.8 seconds.
Apr 15 11:33:41.639 - main:
Apr 15 11:33:41.639 - main: </html>
Apr 15 11:33:41.639 - main: </body>
Apr 15 11:33:41.639 - main: <hr><center>nginx</center>
Apr 15 11:33:41.639 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:41.639 - main: <body>
Apr 15 11:33:41.639 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:41.639 - main: <html>
Apr 15 11:33:41.639 - main: Original body follows.
Apr 15 11:33:41.639 - main: Attempt number 8/10, will retry after 0.7 seconds.
Apr 15 11:33:40.721 - rpc_http_event: (1) received request /global/block/head/durable/wasm_2_0_0/value?key=/batches/BMKZaarzN9ZJRewPrWptXG2uXoUYboE3L2okrSu9Tpb66saQBsw
Apr 15 11:33:40.720 - rpc_http_event: (1) received request /global/block/head/durable/wasm_2_0_0/value?key=/head
Apr 15 11:33:40.160 - main:
Apr 15 11:33:40.160 - main: </html>
Apr 15 11:33:40.160 - main: </body>
Apr 15 11:33:40.160 - main: <hr><center>nginx</center>
Apr 15 11:33:40.160 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:40.160 - main: <body>
Apr 15 11:33:40.160 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:40.160 - main: <html>
Apr 15 11:33:40.160 - main: Original body follows.
Apr 15 11:33:40.160 - main: Attempt number 7/10, will retry after 0.6 seconds.
Apr 15 11:33:38.777 - main:
Apr 15 11:33:38.777 - main: </html>
Apr 15 11:33:38.777 - main: </body>
Apr 15 11:33:38.777 - main: <hr><center>nginx</center>
Apr 15 11:33:38.777 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:38.777 - main: <body>
Apr 15 11:33:38.777 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:38.777 - main: <html>
Apr 15 11:33:38.777 - main: Original body follows.
Apr 15 11:33:38.777 - main: Attempt number 6/10, will retry after 0.5 seconds.
Apr 15 11:33:37.495 - main:
Apr 15 11:33:37.495 - main: </html>
Apr 15 11:33:37.495 - main: </body>
Apr 15 11:33:37.495 - main: <hr><center>nginx</center>
Apr 15 11:33:37.495 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:37.495 - main: <body>
Apr 15 11:33:37.495 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:37.495 - main: <html>
Apr 15 11:33:37.495 - main: Original body follows.
Apr 15 11:33:37.495 - main: Attempt number 5/10, will retry after 0.4 seconds.
Apr 15 11:33:36.315 - main:
Apr 15 11:33:36.315 - main: </html>
Apr 15 11:33:36.315 - main: </body>
Apr 15 11:33:36.315 - main: <hr><center>nginx</center>
Apr 15 11:33:36.315 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:36.315 - main: <body>
Apr 15 11:33:36.315 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:36.315 - main: <html>
Apr 15 11:33:36.315 - main: Original body follows.
Apr 15 11:33:36.315 - main: Attempt number 4/10, will retry after 0.3 seconds.
Apr 15 11:33:35.237 - main:
Apr 15 11:33:35.237 - main: </html>
Apr 15 11:33:35.237 - main: </body>
Apr 15 11:33:35.237 - main: <hr><center>nginx</center>
Apr 15 11:33:35.237 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:35.237 - main: <body>
Apr 15 11:33:35.237 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:35.237 - main: <html>
Apr 15 11:33:35.237 - main: Original body follows.
Apr 15 11:33:35.237 - main: Attempt number 3/10, will retry after 0.2 seconds.
Apr 15 11:33:33.231 - main:
Apr 15 11:33:33.231 - main: </html>
Apr 15 11:33:33.231 - main: </body>
Apr 15 11:33:33.231 - main: <hr><center>nginx</center>
Apr 15 11:33:33.231 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:33.231 - main: <body>
Apr 15 11:33:33.231 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:33.231 - main: <html>
Apr 15 11:33:33.231 - main: Original body follows.
Apr 15 11:33:33.231 - main: Attempt number 2/10, will retry after 0.1 seconds.
Apr 15 11:33:31.333 - main:
Apr 15 11:33:31.333 - main: </html>
Apr 15 11:33:31.333 - main: </body>
Apr 15 11:33:31.333 - main: <hr><center>nginx</center>
Apr 15 11:33:31.333 - main: <center><h1>502 Bad Gateway</h1></center>
Apr 15 11:33:31.333 - main: <body>
Apr 15 11:33:31.333 - main: <head><title>502 Bad Gateway</title></head>
Apr 15 11:33:31.333 - main: <html>
Apr 15 11:33:31.333 - main: Original body follows.
Apr 15 11:33:31.333 - main: Attempt number 1/10, will retry after 0 seconds.
Apr 15 11:33:21.291 - sc_rollup_injector: Request pushed on 2023-04-15T11:33:21.290-00:00, treated in 42.160us, completed in 1.670us
Apr 15 11:33:21.291 - sc_rollup_injector: [tz1N96ka7u5cK: publish, add_messages, cement, refute] injection
Apr 15 11:33:21.291 - sc_rollup_node.layer_1: BM4qTRYD77uFzoEHP2gTEQ9q564xzzQyR5NvSQrnBt5kD935A2d at level 358948
Apr 15 11:33:21.291 - sc_rollup_node.layer_1: Layer 1 node has switched to head
Apr 15 11:33:21.291 - sc_rollup_node.batcher.worker: Request pushed on 2023-04-15T11:33:21.290-00:00, treated in 8.890us, completed in 3.160us
Apr 15 11:33:21.290 - sc_rollup_node.batcher.worker: batch
Apr 15 11:33:10.728 - rpc_http_event: (1) received request /global/block/358946/durable/wasm_2_0_0/value?key=/head
Apr 15 11:33:10.728 - rpc_http_event: (4) received request /global/block/358947/durable/wasm_2_0_0/value?key=/head
Apr 15 11:33:10.727 - rpc_http_event: (1) received request /global/block/head/durable/wasm_2_0_0/value?key=/batches/BLEBtf4wB1S8zt4CPpxstTmq55X2iiE8E1MRNQiM4yqb4c2zb65
Apr 15 11:33:10.726 - rpc_http_event: (4) received request /global/block/head/durable/wasm_2_0_0/value?key=/batches/BMKZaarzN9ZJRewPrWptXG2uXoUYboE3L2okrSu9Tpb66saQBsw
Apr 15 11:33:10.726 - rpc_http_event: (1) received request /global/block/358946/durable/wasm_2_0_0/value?key=/head
Apr 15 11:33:10.725 - rpc_http_event: (4) received request /global/block/358947/durable/wasm_2_0_0/value?key=/head
Apr 15 11:33:10.723 - rpc_http_event: (4) received request /global/block/head/durable/wasm_2_0_0/value?key=/batches/BMKZaarzN9ZJRewPrWptXG2uXoUYboE3L2okrSu9Tpb66saQBsw
Apr 15 11:33:10.721 - rpc_http_event: (4) received request /global/block/head/durable/wasm_2_0_0/value?key=/head
Apr 15 11:33:09.533 - sc_rollup_injector: 358945
Apr 15 11:33:09.533 - sc_rollup_injector: [tz1N96ka7u5cK: publish, add_messages, cement, refute] Confirmed Tezos level
Apr 15 11:33:09.533 - sc_rollup_node.daemon: Finished processing 1 layer 1 heads for levels 358947 to 358947
Apr 15 11:33:09.532 - sc_rollup_node.daemon: BLQjcnQN5B1uvrDrXfzwXVc6U61cGRKgLjoBaBxEGMMG4iNVCfH at level 358947
Apr 15 11:33:09.532 - sc_rollup_node.daemon: Finished processing layer 1 head
Apr 15 11:33:09.532 - sc_rollup_node.daemon: at level 358945
Apr 15 11:33:09.532 - sc_rollup_node.daemon: Processing finalized head BLyQqD4GgPJ2gsuao1MVzKQZKkmLzgnSfi4Afp6neHznR4vVVEZ
Apr 15 11:33:09.531 - sc_rollup_node.interpreter: 224257000000000 with 3 messages
Apr 15 11:33:09.531 - sc_rollup_node.interpreter: srs11VB3EdoasCjnZSkQ6jP9Eyb8bPmRo8FPbMEUujsBNGUgqgUdZd at tick
Apr 15 11:33:09.531 - sc_rollup_node.interpreter: Transitioned PVM at inbox level 358947 to
Apr 15 11:33:09.530 - smart_rollup_node: "Kernel yields"
Apr 15 11:33:09.529 - smart_rollup_node: "Batch applied: lvl: 10192, ts: 1681558375, hash: BMKZaarzN9[...]"
Apr 15 11:33:09.528 - smart_rollup_node: "L1 predecessor block: BKoGvYDBUYrSZvydjEre839zGTHJzFNvwhu62[...]"
Apr 15 11:33:09.528 - smart_rollup_node: "Kernel invoked, prev head: lvl: 10191, ts: 1681558367, hash[...]"
Apr 15 11:33:06.710 - sc_rollup_node.inbox: BLQjcnQN5B1uvrDrXfzwXVc6U61cGRKgLjoBaBxEGMMG4iNVCfH at level 358947
Apr 15 11:33:06.710 - sc_rollup_node.inbox: Fetching 0 messages from block
Apr 15 11:33:06.710 - sc_rollup_node.daemon: level 358947
Apr 15 11:33:06.710 - sc_rollup_node.daemon: Processing new head BLQjcnQN5B1uvrDrXfzwXVc6U61cGRKgLjoBaBxEGMMG4iNVCfH at
Apr 15 11:33:06.710 - sc_rollup_node.daemon: level 358947 to level 358947
Apr 15 11:33:06.710 - sc_rollup_node.daemon: A new iteration of process_heads has been triggered: processing 1 heads from
Apr 15 11:32:55.912 - sc_rollup_node.layer_1: BLQjcnQN5B1uvrDrXfzwXVc6U61cGRKgLjoBaBxEGMMG4iNVCfH at level 358947
Apr 15 11:32:55.912 - sc_rollup_node.layer_1: Layer 1 node has switched to head
After that the context was cleared and node restarted:
Closing store
Closing context
Shutting down L1
Apr 15 11:34:38.664 - sc_rollup_batcher: worker terminated [sc_rollup_batcher]
Shutting down Batcher
Apr 15 11:34:38.664 - tx_rollup_injector: worker terminated [tx_rollup_injector]
Exiting.
Shutting down Injector
Shutting down RPC server
Commitment src13rDGu88LnxzQrmknKSt3DDq6YmqxgtaM7GzX8rzwqKMXMmtsDP does not exist
Error:
Apr 15 11:34:38.619 - sc_rollup_node.daemon: level 348754
Apr 15 11:34:38.619 - sc_rollup_node.daemon: Processing new head BLr1N88nQE88PZ2kfadhrhZdYDsmGWrpiyPtSEvY1sSKEozFJXt at
Apr 15 11:34:38.619 - sc_rollup_node.daemon: from level 348754 to level 358958
Apr 15 11:34:38.619 - sc_rollup_node.daemon: A new iteration of process_heads has been triggered: processing 10205 heads
Apr 15 11:34:23.765 - sc_rollup_node.layer_1: BL9hwunwJkuz19TAo8XC8pttWxsqpoV9Mhrca6TgRJnf85QjiVK at level 358958
Apr 15 11:34:23.765 - sc_rollup_node.layer_1: Layer 1 node has switched to head
Apr 15 11:34:23.765 - smart_rollup_node: The smart rollup node is listening to 0.0.0.0:8932
Apr 15 11:34:23.760 - sc_rollup_batcher: worker started
Apr 15 11:34:23.760 - sc_rollup_injector: included_in_blocks from disk
Apr 15 11:34:23.760 - sc_rollup_injector: [tz1N96ka7u5cK: publish, add_messages, cement, refute] Loaded 0 elements in
Apr 15 11:34:23.753 - sc_rollup_injector: injected_ophs from disk
Apr 15 11:34:23.753 - sc_rollup_injector: [tz1N96ka7u5cK: publish, add_messages, cement, refute] Loaded 0 elements in
Apr 15 11:34:23.753 - sc_rollup_injector: included_operations from disk
Apr 15 11:34:23.753 - sc_rollup_injector: [tz1N96ka7u5cK: publish, add_messages, cement, refute] Loaded 0 elements in
Apr 15 11:34:23.750 - sc_rollup_injector: injected_operations from disk
Apr 15 11:34:23.750 - sc_rollup_injector: [tz1N96ka7u5cK: publish, add_messages, cement, refute] Loaded 0 elements in
Apr 15 11:34:23.749 - sc_rollup_injector: operations_queue from disk
Apr 15 11:34:23.749 - sc_rollup_injector: [tz1N96ka7u5cK: publish, add_messages, cement, refute] Loaded 0 elements in
Apr 15 11:34:23.748 - tx_rollup_injector: worker started for tz1N96ka7u5cK
Apr 15 11:34:23.739 - sc_rollup_node.commitment: Starting commitment publisher for the smart rollup node
Apr 15 11:34:23.738 - sc_rollup_node.inbox: Starting inbox tracker of the smart rollup node
Apr 15 11:34:23.738 - rpc_http_event: Server started (agent: OCaml-Resto)
Apr 15 11:34:23.683 - smart_rollup_node: sr1SgxtCbLJnV6KtBS3HxuKCZJvuRKCPgFdM of kind wasm_2_0_0
Apr 15 11:34:23.683 - smart_rollup_node: The smart rollup node is interacting with rollup
Apr 15 11:34:23.671 - sc_rollup_node.layer_1: Starting layer 1 tracker of the smart rollup node
Apr 15 11:34:23.654 - smart_rollup_node: Starting the smart rollup node