Agnostic_baker: Add agnostic baker plugin
What
Parent MR: !16878 (merged)
This MR:
- removes the dependency of octez-experimental-agnostic-baker on the protocol-specific baking binaries. The watchdog mechanism, which forked processes by calling the necessary binaries with the given arguments, is replaced with an Lwt-based one that directly calls the main entry point of those binaries;
- this comes at the cost of exposing the main entry point of each baker library in the plugin library lib_agnostic_baker.
Why
- improves UX
- enables us to maintain a single binary, instead of multiple ones
- decreases the risk of double baking
How
This is done by exposing most of the code from lib_delegate (the entry point being Baking_commands.baker_commands, which pulls in most files and modules). The code is exposed through the lib_plugin in lib_agnostic_baker, and this plugin is used by the bin_agnostic_baker code. The approach is similar to the one taken by the DAL team in lib_dal_node.
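The plugin idea can be illustrated with a minimal, dependency-free sketch. Every name below (BAKER_PLUGIN, registry, register, run_baker, Quebec_plugin) is hypothetical and not taken from the Octez code base, and a plain result replaces the Lwt promise that the real entry point would return. It only shows the shape: each protocol registers its baker entry point, and the agnostic baker calls it in-process instead of forking a binary.

```ocaml
(* Hypothetical sketch: none of these names come from the Octez code base. *)

(* A baker plugin exposes the protocol it targets and its main entry point.
   The real entry point runs in Lwt; a plain result keeps this sketch
   free of external dependencies. *)
module type BAKER_PLUGIN = sig
  val protocol_hash : string

  val run : string list -> (string, string) result
end

(* Registry filled by each protocol's plugin at start-up. *)
let registry : (string, (module BAKER_PLUGIN)) Hashtbl.t = Hashtbl.create 8

let register (module P : BAKER_PLUGIN) =
  Hashtbl.replace registry P.protocol_hash (module P : BAKER_PLUGIN)

(* Call the registered entry point directly instead of forking a binary. *)
let run_baker ~protocol_hash args =
  match Hashtbl.find_opt registry protocol_hash with
  | None -> Error ("no baker plugin registered for " ^ protocol_hash)
  | Some (module P : BAKER_PLUGIN) -> P.run args

(* Stand-in plugin for one protocol; it only reports what it was given. *)
module Quebec_plugin : BAKER_PLUGIN = struct
  let protocol_hash = "PsQuebecnLBy"

  let run args =
    Ok
      (Printf.sprintf
         "baker for %s started with %d argument(s)"
         protocol_hash
         (List.length args))
end

let () = register (module Quebec_plugin : BAKER_PLUGIN)
```

With this shape, supporting a new protocol amounts to registering one more module, and the binary that hosts the registry does not need to know where any protocol-specific executable lives.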
Manually testing the MR
- No regression in tezt:
dune exec tezt/tests/main.exe -- --file agnostic_baker_test.ml
This tests that:
- the agnostic baker starts and stops correctly
- the agnostic baker runs and migrates correctly (from Q to R and from R to Alpha in sandbox), potentially using a remote signer
- Testing on ghostnet:
./octez-node run --data-dir ~/.tezos-node-ghostnet --metrics-addr localhost:19091 --rpc-addr 127.0.0.1:18733 --net-addr localhost:19732
Not having the correct arguments gives an error, the same way octez-baker-PsQuebec would:
./octez-experimental-agnostic-baker -- --base-dir ~/.tezos-client-ghostnet --endpoint http://localhost:18733 run with local node ~/.tezos-node-ghostnet
Feb 12 11:35:48.067: experimental agnostic baker started
Feb 12 11:35:48.068: [WARNING] As the name suggests, this binary is EXPERIMENTAL, therefore it is intended for testing
Feb 12 11:35:48.068: purposes only. Please do not use it on `mainnet`.
Feb 12 11:35:48.074: starting baker for protocol PsQuebecnLBy with arguments:
Feb 12 11:35:48.074: "--base-dir /Users/gabrielmoise/.tezos-client-ghostnet --end[...]"
Feb 12 11:35:48.219: baker for protocol PsQuebecnLBy is now running
Error:
Missing liquidity baking toggle vote, please use either the --liquidity-baking-toggle-vote option, or the --votefile option or a votes file in the default location: per_block_votes.json in the current working directory or in the baker directory.
And having the right arguments works the same way:
./octez-experimental-agnostic-baker -- --base-dir ~/.tezos-client-ghostnet --endpoint http://localhost:18733 run with local node ~/.tezos-node-ghostnet --liquidity-baking-toggle-vote pass --without-dal
Feb 13 13:10:39.753: experimental agnostic baker started
Feb 13 13:10:39.754: [WARNING] As the name suggests, this binary is EXPERIMENTAL, therefore it is intended for testing
Feb 13 13:10:39.754: purposes only. Please do not use it on `mainnet`.
Feb 13 13:10:39.758: starting baker for protocol PsQuebecnLBy with arguments:
Feb 13 13:10:39.758: "--base-dir /Users/gabrielmoise/.tezos-client-ghostnet --end[...]"
Feb 13 13:10:39.897: baker for protocol PsQuebecnLBy is now running
Feb 13 13:10:39.900: read liquidity baking toggle vote = pass
Feb 13 13:10:39.900: read adaptive issuance vote = pass
Feb 13 13:10:39.900: No DAL node endpoint has been provided.
Feb 13 13:10:39.900: Future protocols might integrate DAL participation into
Feb 13 13:10:39.900: participation rewards.
Feb 13 13:10:39.900: Bakers are encouraged to set up a DAL attester node instead of using the
Feb 13 13:10:39.900: `--without-dal` option.
Feb 13 13:10:39.900: For instructions on how to run a DAL node, please visit
Feb 13 13:10:39.900: https://docs.tezos.com/tutorials/join-dal-baker.
Node is bootstrapped.
Waiting for protocol 021-PsQuebec to start...
Feb 13 13:10:39.901: new block on proposal period (remaining period duration 7169)
pre-emptive-forge-time optimization set to 0.600000s. Operation inclusion window is ~3.400000s. Caution: Setting this too high may result in reduced block proposal rewards.
Baker 21.0~rc3+dev (975f25d0) for PsQuebecnLBy started.
Feb 13 13:10:39.906: Baker will run with the following delegates:
Feb 13 13:10:39.906:
Feb 13 13:10:39.915: initializing irmin context at /Users/gabrielmoise/.tezos-node-ghostnet/context
Feb 13 13:10:40.021: successfully migrated nonces: legacy nonces are safe to delete
Feb 13 13:10:40.021: received new head BKqpcwY6kPtXTYiEnaZjNoFLwxWw366ENwvM7WaYd9VX9ih3i5U at level 10608639, round 0
Feb 13 13:10:43.062: received new proposal BMYJEjtL2TGGKZYdZNreyeug3MNF8UdYLq1yq8pesiv6hBdeA2o at level 10608640, round 0
Feb 13 13:10:43.073: No DAL node endpoint has been provided.
Feb 13 13:10:43.073: Future protocols might integrate DAL participation into
Feb 13 13:10:43.073: participation rewards.
Feb 13 13:10:43.073: Bakers are encouraged to set up a DAL attester node instead of using the
Feb 13 13:10:43.073: `--without-dal` option.
Feb 13 13:10:43.073: For instructions on how to run a DAL node, please visit
Feb 13 13:10:43.073: https://docs.tezos.com/tutorials/join-dal-baker.
Feb 13 13:10:43.086: received new head BMYJEjtL2TGGKZYdZNreyeug3MNF8UdYLq1yq8pesiv6hBdeA2o at level 10608640, round 0
Feb 13 13:10:43.088: new block on proposal period (remaining period duration 7168)
Feb 13 13:10:47.138: received new proposal BLci4i6WB8Jmki7BPdRPxNdTtk2ddE9F8wUqjdP2dhJEho4rZ9o at level 10608641, round 0
Feb 13 13:10:47.162: received new head BLci4i6WB8Jmki7BPdRPxNdTtk2ddE9F8wUqjdP2dhJEho4rZ9o at level 10608641, round 0
Feb 13 13:10:47.163: new block on proposal period (remaining period duration 7167)
Feb 13 13:10:51.097: received new proposal BKtJKFCGnWrsBGe77Px2e3u14Q8UuoQHkHzyhZtv7U4bpzE8ksE at level 10608642, round 0
Feb 13 13:10:51.118: received new head BKtJKFCGnWrsBGe77Px2e3u14Q8UuoQHkHzyhZtv7U4bpzE8ksE at level 10608642, round 0
Feb 13 13:10:51.119: new block on proposal period (remaining period duration 7166)
One can also see that the logs directory contains a subdirectory for octez-experimental-agnostic-baker as well (no duplicated logs):
cat ~/.tezos-client-ghostnet/logs/octez-experimental-agnostic-baker/daily-20250217.log
2025-02-17T17:41:43.260-00:00 [agnostic-baker.starting_daemon] experimental agnostic baker started
2025-02-17T17:41:43.261-00:00 [agnostic-baker.experimental_binary] [WARNING] As the name suggests, this binary is EXPERIMENTAL, therefore it is intended for testing purposes only. Please do not use it on `mainnet`.
2025-02-17T17:41:43.266-00:00 [agnostic-baker.starting_baker] starting baker for protocol PsQuebecnLBy with arguments: "--base-dir /Users/gabrielmoise/.tezos-client-ghostnet --end[...]"
2025-02-17T17:41:43.402-00:00 [agnostic-baker.baker_running] baker for protocol PsQuebecnLBy is now running
2025-02-17T17:41:43.405-00:00 [021-PsQuebec.baker.read_liquidity_baking_toggle_vote] read liquidity baking toggle vote = pass
2025-02-17T17:41:43.405-00:00 [021-PsQuebec.baker.read_adaptive_issuance_vote] read adaptive issuance vote = pass
2025-02-17T17:41:43.405-00:00 [021-PsQuebec.baker.commands.no_dal_node_provided] No DAL node endpoint has been provided.
Future protocols might integrate DAL participation into participation rewards.
Bakers are encouraged to set up a DAL attester node instead of using the `--without-dal` option.
For instructions on how to run a DAL node, please visit https://docs.tezos.com/tutorials/join-dal-baker.
2025-02-17T17:41:43.406-00:00 [agnostic-baker.period_status] new block on proposal period (remaining period duration 15335)
2025-02-17T17:41:43.409-00:00 [021-PsQuebec.baker.delegates.delegates_used] Baker will run with the following delegates:
2025-02-17T17:41:43.421-00:00 [node.context_ops.initializing_context] initializing irmin context at /Users/gabrielmoise/.tezos-node-ghostnet/context
2025-02-17T17:41:43.421-00:00 [node.context.disk.init_context] initializing context (readonly: true, index_log_size: 2500000, lru_size: 15000)
- Testing on sandbox:
DATA_DIR=~/.tezos-node-sandbox ./src/bin_node/octez-sandboxed-node.sh 1 --connections 0
eval `./src/bin_client/octez-init-sandboxed-client.sh 1`
octez-activate-alpha
type octez-client
octez-client is <client_dir>/bin/octez-client
./octez-experimental-agnostic-baker -- --base-dir <client_dir> --endpoint http://localhost:18731 run with local node ~/.tezos-node-sandbox --liquidity-baking-toggle-vote pass
Feb 17 10:28:57.243: experimental agnostic baker started
Feb 17 10:28:57.244: [WARNING] As the name suggests, this binary is EXPERIMENTAL, therefore it is intended for testing
Feb 17 10:28:57.244: purposes only. Please do not use it on `mainnet`.
Feb 17 10:28:57.248: starting baker for protocol ProtoALphaAL with arguments:
Feb 17 10:28:57.248: "--base-dir
Feb 17 10:28:57.248: /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T[...]"
Feb 17 10:28:57.380: baker for protocol ProtoALphaAL is now running
Feb 17 10:28:57.381: reading votes file: /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T/tezos-tmp-[...]
Feb 17 10:28:57.382: read liquidity baking toggle vote = pass
Feb 17 10:28:57.382: read adaptive issuance vote = pass
Error:
Please connect a running DAL node using '--dal-node <endpoint>'. If you do not want to run a DAL node, you have to opt-out using '--without-dal'.
./octez-experimental-agnostic-baker -- --base-dir /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T/tezos-tmp-client.XXXXXXXX.pioCyATaQD --endpoint http://localhost:18731 run with local node ~/.tezos-node-sandbox --liquidity-baking-toggle-vote pass --without-dal
Feb 17 10:29:31.905: experimental agnostic baker started
Feb 17 10:29:31.905: [WARNING] As the name suggests, this binary is EXPERIMENTAL, therefore it is intended for testing
Feb 17 10:29:31.905: purposes only. Please do not use it on `mainnet`.
Feb 17 10:29:31.908: starting baker for protocol ProtoALphaAL with arguments:
Feb 17 10:29:31.908: "--base-dir
Feb 17 10:29:31.908: /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T[...]"
Feb 17 10:29:32.040: baker for protocol ProtoALphaAL is now running
Feb 17 10:29:32.043: reading votes file: /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T/tezos-tmp-[...]
Feb 17 10:29:32.043: read liquidity baking toggle vote = pass
Feb 17 10:29:32.043: read adaptive issuance vote = pass
Feb 17 10:29:32.043: No DAL node endpoint has been provided.
Feb 17 10:29:32.043: Not running a DAL node might result in losing a share of the
Feb 17 10:29:32.043: participation rewards.
Feb 17 10:29:32.043: For instructions on how to run a DAL node, please visit
Feb 17 10:29:32.043: https://docs.tezos.com/tutorials/join-dal-baker.
Feb 17 10:29:32.044: new block on proposal period (remaining period duration 63)
Node is bootstrapped.
Waiting for protocol alpha to start...
Feb 17 10:29:32.046: new block on proposal period (remaining period duration 63)
pre-emptive-forge-time optimization set to 0.150000s. Operation inclusion window is ~0.850000s. Caution: Setting this too high may result in reduced block proposal rewards.
Baker 21.0~rc3+dev (07a640df) for ProtoALphaAL started.
Feb 17 10:29:32.048: Baker will run with the following delegates:
Feb 17 10:29:32.048: activator (tz1TGu6TN5GSez2ndXXeDX6LgUDvLzPLqgYV)
Feb 17 10:29:32.048: bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)bootstrap2 (tz1gjaF81ZRRvdzjobyfVNsAeSC6PScjfQwN)
Feb 17 10:29:32.048: bootstrap3 (tz1faswCTDciRzE4oJ9jn2Vm2dvjeyA9fUzU)bootstrap4 (tz1b7tUupMgCNw2cCLpKTkSD1NZzB5TkP2sv)
Feb 17 10:29:32.048: bootstrap5 (tz1ddb9NMYHZi5UzPdzTZMYQQZoMub195zgv)
Feb 17 10:29:32.048: new block on proposal period (remaining period duration 63)
Feb 17 10:29:32.052: initializing irmin context at /Users/gabrielmoise/.tezos-node-sandbox/context
Feb 17 10:29:32.057: successfully migrated nonces: legacy nonces are safe to delete
Feb 17 10:29:32.058: reading votes file: /var/folders/10/l0cp1hc50yl4rqq9m2qsx0mc0000gp/T/tezos-tmp-[...]
Feb 17 10:29:32.058: Voting pass for liquidity baking toggle vote
Feb 17 10:29:32.058: Voting pass for adaptive issuance vote
Feb 17 10:29:32.064: received new forge event:
Feb 17 10:29:32.064: block ready for delegate: bootstrap2 (tz1gjaF81ZRRvdzjobyfVNsAeSC6PScjfQwN) at level 2 (round: 15)
...
- Testing error cases
A. It was decided that the agnostic baker will stop when the underlying baker process encounters an error. This is complicated to test automatically, but it can be seen in the following way:
- test that the agnostic baker stops when the node stops: run the node and the agnostic baker, then check that stopping the node results in the agnostic baker stopping with:
...
Error:
Connection with node was lost
- test that an error in the underlying baker process gets propagated: what I did was to add
(let open Lwt_syntax in
let* () = Lwt_unix.sleep 5.0 in
Lwt.fail (Failure "Underlying baker process failed after 5 seconds"));
to the Lwt.pick of the run_baker_binary function in src/proto_021_PsQuebec/lib_agnostic_baker/agnostic_baker_plugin_registration.ml, and then ran the agnostic baker on ghostnet. After 5 seconds, the agnostic baker stopped with:
...
Error:
Underlying baker process failed after 5 seconds
Error:
Lwt.Resolution_loop.Canceled
Additionally, adding an arbitrary tzfail error in a lib_delegate module (like baking_commands.ml) produced the expected error. I added the binding let* () = tzfail (Node_version_malformatted "random") in (including the trailing in) inside run_baker and got:
$ ./octez-experimental-agnostic-baker -- --base-dir ~/.tezos-client-ghostnet --endpoint http://localhost:18733 run with local node ~/.tezos-node-ghostnet --liquidity-baking-toggle-vote pass --without-dal
Feb 18 14:59:56.039: experimental agnostic baker started
Feb 18 14:59:56.042: [WARNING] As the name suggests, this binary is EXPERIMENTAL, therefore it is intended for testing
Feb 18 14:59:56.042: purposes only. Please do not use it on `mainnet`.
Feb 18 14:59:56.052: starting baker for protocol PsQuebecnLBy with arguments:
Feb 18 14:59:56.052: "--base-dir /Users/gabrielmoise/.tezos-client-ghostnet --end[...]"
Feb 18 14:59:56.212: baker for protocol PsQuebecnLBy is now running
Error:
The node version provided in command argument: 'random' is a malformatted version
Error:
Error in the underlying baker process
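The two-part error output above suggests that the baker's own error is kept and that the agnostic baker adds its own error on top. A minimal model of that behaviour, assuming a list of strings as a stand-in for the Octez error trace (the names trace, wrapper_error, propagate and render are hypothetical, and the real code works with Lwt and the Octez error monad rather than plain results), could look like this:

```ocaml
(* Hypothetical model: a list of messages stands in for an Octez error trace. *)
type trace = string list

(* Message taken from the CLI output above. *)
let wrapper_error = "Error in the underlying baker process"

(* Keep the baker's own error and append the agnostic baker's wrapper error;
   a successful result is passed through unchanged. *)
let propagate (baker_result : ('a, trace) result) : ('a, trace) result =
  match baker_result with
  | Ok _ as ok -> ok
  | Error trace -> Error (trace @ [ wrapper_error ])

(* Render a trace as one "Error:" entry per message, in trace order. *)
let render (trace : trace) : string =
  String.concat "\n" (List.map (fun msg -> "Error:\n  " ^ msg) trace)
```

Under these assumptions, a baker failing with the malformed node version message would yield a two-element trace: the original message first, then the wrapper error.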
Checklist
- Document the interface of any function added or modified (see the coding guidelines)
- Document any change to the user interface, including configuration parameters (see node configuration)
- Provide automatic testing (see the testing guide).
- For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
- Select suitable reviewers using the Reviewers field below.
- Select as Assignee the next person who should take action on that MR