
Agnostic_baker: Mitigate errors at migration between incompatible baking binaries

What

Parent MR: !16491 (merged)

This MR is one of the many necessary steps towards consolidating the octez-experimental-agnostic-baker.

It introduces a mechanism that automatically repairs an agnostic-baker command line which would otherwise break at migration because of an incompatibility between baking binaries. For illustration, suppose protocol P+1 makes an option --example-option mandatory, while it was optional in P. A user running the octez-experimental-agnostic-baker without that option in P would see the baker break and halt at migration time: the agnostic-baker is designed to run the next protocol with the same list of arguments, so the user would have to reconfigure their command arguments by hand.

This mechanism is enabled by the command line option --auto-patch-cli (thanks @MBourgoin for the suggested feature).

Other scenarios are possible, such as the introduction of a completely new mandatory command argument that has a sensible default value. In that case too, we can fix the command line without halting the baker.

Why

My aim here is to reduce the number of cases where the agnostic-baker halts, so that it only halts when users must explicitly stop the baker and do some extra configuration (for instance, when they need to configure a DAL node before continuing to bake). I also tried to give developers an organised way to add rules, which can change from protocol to protocol. Rules are opt-in: if at some point we want the baker to actually halt, we do not need to add a recovery rule for that case, but we have the option to.

How

To achieve this, before the octez-experimental-agnostic-baker spawns a new baking binary (which can happen at startup or at migration), the baking command arguments are checked against a hard-coded list of rules to spot potential errors. When an error is found, it is fixed and the user is warned about the change:

  • if a mandatory argument is missing, it is added, provided a default value is known for it (for instance, --liquidity-baking-toggle-vote with default value pass)
  • if one of several arguments must appear, then as long as we know which one to add in the default case, we add it (for instance, from alpha onwards, it is mandatory to have either --dal-node or --without-dal, with the default being --without-dal - further reading)
  • ... the list can be extended as new protocols/situations appear
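
The two rules above can be sketched in a few lines of POSIX shell. This is only an illustration of the patching logic, not the code from this MR (the actual rules live in the agnostic baker itself). The function names `has_arg` and `patch_args` are hypothetical, and appending the defaults at the end of the argument list is an assumption made for simplicity.

```shell
#!/bin/sh
# Illustrative sketch only. has_arg and patch_args are hypothetical helper
# names, and appending defaults at the end of the list is an assumption.

# has_arg NAME ARGS...: succeed if NAME appears verbatim among ARGS.
has_arg() {
  wanted=$1
  shift
  for candidate in "$@"; do
    if [ "$candidate" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

# patch_args ARGS...: print the patched argument list, one argument per line.
# Warnings go to stderr so that stdout carries only the arguments.
patch_args() {
  # Rule 1: a mandatory argument with a known default value.
  if ! has_arg --liquidity-baking-toggle-vote "$@"; then
    echo "warning: adding --liquidity-baking-toggle-vote pass" >&2
    set -- "$@" --liquidity-baking-toggle-vote pass
  fi

  # Rule 2: one of several arguments must appear; add the default one.
  if ! has_arg --dal-node "$@" && ! has_arg --without-dal "$@"; then
    echo "warning: adding --without-dal" >&2
    set -- "$@" --without-dal
  fi

  printf '%s\n' "$@"
}
```

For example, `patch_args --base-dir /tmp/x` prints the two original arguments followed by `--liquidity-baking-toggle-vote`, `pass` and `--without-dal`, while an argument list that already satisfies both rules is printed unchanged. The sketch only matches flags given as separate words; other spellings would need extra handling.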

Manually testing the MR

make
  1. Tezt
dune exec tezt/tests/main.exe -- --file agnostic_baker_test.ml
  • the tezt tests previously hard-coded --without-dal as the default for the DAL node argument (!16426 (12fa0875)). This MR removes that default (!16240 (2885771f)), which on its own should make the tests fail. Once --auto-patch-cli is propagated to the tezt suite and set to true, the tests pass again (!16240 (8023c521)).
  2. Sandbox

You can check that running the octez-experimental-agnostic-baker on protocol Alpha without the mandatory arguments works, as it "autopatches" them:

make

DATA_DIR=~/.tezos-node-sandbox ./src/bin_node/octez-sandboxed-node.sh 1 --connections 0
eval `./src/bin_client/octez-init-sandboxed-client.sh 1`

octez-activate-alpha

type octez-client # this will give you <data-dir>/bin/octez-client and you only need `data-dir` for the next command

./octez-experimental-agnostic-baker --auto-patch-cli -- --base-dir <data-dir> --endpoint http://localhost:18731 run with local node ~/.tezos-node-sandbox

You should see the messages:

./octez-experimental-agnostic-baker --auto-patch-cli -- --base-dir /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T/tezos-tmp-client.XXXXXXXX.q8HQ8ubnHn/ --endpoint http://localhost:18731 run with local node ~/.tezos-node-sandbox
Jan 29 14:10:26.817: experimental agnostic baker started
Jan 29 14:10:26.817: [WARNING] As the name suggests, this binary is EXPERIMENTAL, therefore it is intended for testing
Jan 29 14:10:26.817:   purposes only. Please do not use it on `mainnet`.
Jan 29 14:10:26.819: The automatic CLI patching for the agnostic baker has been activated. This implies that the command
Jan 29 14:10:26.819:   line arguments will be checked against the rules of the binary corresponding to the following
Jan 29 14:10:26.819:   protocol: ProtoALphaAL
Jan 29 14:10:26.819: Missing mandatory argument: --liquidity-baking-toggle-vote
Jan 29 14:10:26.819: Adding mandatory argument `--liquidity-baking-toggle-vote pass` to baking command
Jan 29 14:10:26.819: One of [--dal-node; --without-dal] must be provided
Jan 29 14:10:26.819: Adding default argument `--without-dal` to baking command. The other options are `--dal-node; --without-dal`.
Jan 29 14:10:26.819: starting baker for protocol ProtoALphaAL with arguments:
Jan 29 14:10:26.819:   "--base-dir
Jan 29 14:10:26.819: /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T[...]"
Jan 29 14:10:26.820: baker ProtoALphaAL was started on pid 9145
Jan 29 14:10:26.821: baker for protocol ProtoALphaAL is now running
Jan 29 14:10:26.824: new block on proposal period (remaining period duration 63)
Jan 29 14:10:26.826: new block on proposal period (remaining period duration 63)
Jan 29 14:10:26.827: new block on proposal period (remaining period duration 63)
Jan 29 14:10:27.063: reading votes file: /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T/tezos-tmp-[...]
Jan 29 14:10:27.064: read liquidity baking toggle vote = pass
Jan 29 14:10:27.064: read adaptive issuance vote = pass
Jan 29 14:10:27.064: No DAL node endpoint has been provided.
Jan 29 14:10:27.064: It will soon be required to launch a DAL node before running
Jan 29 14:10:27.064:   the baker. For instructions on running a DAL node, please visit
Jan 29 14:10:27.064:   https://docs.tezos.com/tutorials/join-dal-baker.
Node is bootstrapped.
Waiting for protocol alpha to start...
pre-emptive-forge-time optimization set to 0.150000s. Operation inclusion window is ~0.850000s. Caution: Setting this too high may result in reduced block proposal rewards.
Baker 21.0~rc3+dev (f76f56ed) for ProtoALphaAL started.
Jan 29 14:10:27.068: Baker will run with the following delegates:
Jan 29 14:10:27.068:   activator (tz1TGu6TN5GSez2ndXXeDX6LgUDvLzPLqgYV)
Jan 29 14:10:27.068:   bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)bootstrap2 (tz1gjaF81ZRRvdzjobyfVNsAeSC6PScjfQwN)
Jan 29 14:10:27.068:   bootstrap3 (tz1faswCTDciRzE4oJ9jn2Vm2dvjeyA9fUzU)bootstrap4 (tz1b7tUupMgCNw2cCLpKTkSD1NZzB5TkP2sv)
Jan 29 14:10:27.068:   bootstrap5 (tz1ddb9NMYHZi5UzPdzTZMYQQZoMub195zgv)
Jan 29 14:10:27.072: initializing irmin context at /Users/gabrielmoise/.tezos-node-sandbox/context
Jan 29 14:10:27.076: successfully migrated nonces: legacy nonces are safe to delete
Jan 29 14:10:27.077: reading votes file: /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T/tezos-tmp-[...]
Jan 29 14:10:27.077: Voting pass for liquidity baking toggle vote
Jan 29 14:10:27.077: Voting pass for adaptive issuance vote
Jan 29 14:10:27.082: received new forge event:
Jan 29 14:10:27.082:   block ready for delegate: bootstrap5 (tz1ddb9NMYHZi5UzPdzTZMYQQZoMub195zgv) at level 2 (round: 6)
...

Checklist

  • Document the interface of any function added or modified (see the coding guidelines)
  • Document any change to the user interface, including configuration parameters (see node configuration)
  • Provide automatic testing (see the testing guide).
  • For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
  • Select suitable reviewers using the Reviewers field below.
  • Select as Assignee the next person who should take action on that MR
Edited by Gabriel Moise
