Baker: emit events on stalling RPC
What
Baker RPC calls are now wrapped in a helper that emits event(s) if the RPC has not answered within some period of time.
Why
The goal is to be able to tell from the logs, in case of an incident, whether an RPC could be the root cause. Note that this is not extremely verbose yet; we could go deeper. Since we are at the lib_delegate level, we are too high-level to get the path of the RPC, which would be really interesting to have at the Info level. An alternative would be to add pretty-printers to the stalling_rpc events to print the RPC parameters.
How
It's really not rocket science.
Manually testing the MR
I don't know whether this can be covered by an automated test. In the meantime, I manually added some sleeps in the code, then ran a baker alongside a node and got:
Nov 21 10:48:20.414 WARN │ RPC shell_header has not answered in the last 2.4 seconds
Nov 21 10:48:21.231 NOTICE │ received new proposal BLZh1dZRwCvTvEZiupz97QkcziDVkJXBU85Jz3Z7auS6nzkCvjM at level 54100, round 0
Nov 21 10:48:21.240 NOTICE │ The following delegates have no attesting rights at level 54100: 'tmp' (tz1TXhefsLkdNcoFA3pFW7t1oxs2UQrkGQQr)
Nov 21 10:48:21.257 NOTICE │ received new head BLZh1dZRwCvTvEZiupz97QkcziDVkJXBU85Jz3Z7auS6nzkCvjM at level 54100, round 0
Nov 21 10:48:23.296 WARN │ RPC shell_header has not answered in the last 5.28 seconds
Nov 21 10:48:25.215 NOTICE │ received new proposal BM6jA9cwb5Uu7CFJs7aALgEuRGqeheyibRNa9esZz8cYrmkNdxF at level 54101, round 0
Nov 21 10:48:25.223 NOTICE │ The following delegates have no attesting rights at level 54101: 'tmp' (tz1TXhefsLkdNcoFA3pFW7t1oxs2UQrkGQQr)
Nov 21 10:48:25.240 NOTICE │ received new head BM6jA9cwb5Uu7CFJs7aALgEuRGqeheyibRNa9esZz8cYrmkNdxF at level 54101, round 0
Nov 21 10:48:26.753 WARN │ RPC shell_header has not answered in the last 8.736 seconds
with:
diff --git a/src/lib_shell/block_directory.ml b/src/lib_shell/block_directory.ml
index ccef3e831c4..43ee0c8534d 100644
--- a/src/lib_shell/block_directory.ml
+++ b/src/lib_shell/block_directory.ml
@@ -93,6 +93,7 @@ let build_raw_header_rpc_directory (module Proto : Block_services.PROTO) =
register0 S.raw_header (fun (_, _, header) () () ->
return (Data_encoding.Binary.to_bytes_exn Block_header.encoding header)) ;
register0 S.Header.shell_header (fun (_, _, header) () () ->
+ let*! () = Lwt_unix.sleep 10. in
return header.shell) ;
register0 S.Header.protocol_data (fun (_, _, header) () () ->
return
@@ -268,6 +269,7 @@ let build_raw_rpc_directory_without_validator
let module S = Block_services.S in
(* block metadata *)
let block_metadata chain_store block =
+ let*! () = Lwt_unix.sleep 10. in
let* metadata = Store.Block.get_block_metadata chain_store block in
let protocol_data =
Data_encoding.Binary.of_bytes_exn
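For readers of the log above: the three shell_header warnings report 2.4, 5.28 and 8.736 seconds, which is consistent with a wait that starts at 2.4 s and is multiplied by 1.2 after each warning (2.4, then 2.88, then 3.456). These constants are inferred from the log only, not taken from the MR code, and `stall_warning_times` is a hypothetical helper written for illustration. A minimal sketch of that schedule, using only the OCaml standard library:

```ocaml
(* Hypothetical helper, not part of the MR: returns the elapsed times (in
   seconds) at which the first [count] stall warnings would be emitted, if
   the wait before each warning starts at [initial] and is multiplied by
   [factor] after every warning. *)
let stall_warning_times ~initial ~factor ~count =
  let rec loop acc elapsed delay remaining =
    if remaining <= 0 then List.rev acc
    else
      (* Total time the RPC has been pending when this warning fires. *)
      let elapsed = elapsed +. delay in
      loop (elapsed :: acc) elapsed (delay *. factor) (remaining - 1)
  in
  loop [] 0. initial count

(* The values 2.4 and 1.2 are assumptions inferred from the log above. *)
let () =
  stall_warning_times ~initial:2.4 ~factor:1.2 ~count:3
  |> List.iter (fun t -> Printf.printf "not answered in the last %g seconds\n" t)
```

With the assumed constants, the three computed values match the durations reported in the warnings, up to floating-point rounding.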
Checklist
- Document the interface of any function added or modified (see the coding guidelines)
- Document any change to the user interface, including configuration parameters (see node configuration)
- Provide automatic testing (see the testing guide).
- For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
- Select suitable reviewers using the Reviewers field below.
- Select as Assignee the next person who should take action on that MR