[go: up one dir, main page]

DAL: Add PPX profiler

What

Parent MR: !18326 (merged)

Why

Adding profiling outputs to the DAL node can help us identify bottlenecks, especially in the frequent tezt-cloud scenarios that we run.

How

Manually testing the MR

  1. Manual (running a ghostnet DAL node)
TEZOS_PPX_PROFILER=profiling make

PROFILING=Debug PROFILING_BACKEND=txt ./octez-node run --data-dir ~/.tezos-node-ghostnet --rpc-addr 127.0.0.1:18733
./octez-dal-node config init --endpoint http://127.0.0.1:18733 --attester-profiles=<baker-key>

PROFILING=Debug PROFILING_BACKEND=txt ./octez-dal-node run

And I got the following results in ~/.tezos-dal-node/dal_profiler.txt:

...
2025-06-13T16:05:38.154-00:00
BLpJtMFQUwMVHYqGVxgeFbzJUZfL4exuhwDEJjNUCfitsovLKxY ............................ 1         3989.302ms   3%
  new_finalized_head ........................................................... 1           11.428ms  54%
    try_process_block .......................................................... 1            9.628ms  42%
      process_block ............................................................ 1            9.626ms  42%
        process_block_data ..................................................... 1            8.087ms  45%
          publish_slot_data .................................................... 2            0.001ms  84%
          block_info ........................................................... 1            1.649ms  47%
          slot_headers ......................................................... 1            1.701ms  20%
          store_slot_headers ................................................... 1            0.237ms 105%
          fetch_committee ...................................................... 1                        
          get_gs_worker ........................................................ 1                        
          dal_attestation ...................................................... 1            0.001ms 105%
          update_selected_slot_headers_statuses ................................ 1            0.294ms 112%
          remove_unattested_slots_and_shards ................................... 1            1.042ms 120%
          get_attestations ..................................................... 1            0.965ms  29%
          check_attesters_attested ............................................. 1            2.177ms  19%
          inject_entrapment_evidences .......................................... 1            0.003ms  70%
2025-06-13T16:05:42.143-00:00
BLQikHQ5RHMyvMNySJoG17WRQXFTncNYQsUeCgewE7FfBh3vtzE ............................ 1         4161.257ms   3%
...
  1. TEZT-CLOUD (DAL scenario)

Command used:

dune exec tezt/tests/cloud/main.exe -- DAL --stake 1,1 --producers 32 --log-file ghost_32p_prof --proxy --proxy-localhost --network ghostnet --website --monitoring --prometheus --prometheus-export --grafana --tezt-cloud ghost-32p-prof --dockerfile-alias dal --keep-temp --process-monitoring -v --fundraiser edsk3nd3m5Zb31fbchqAvWVUbeF4jH7kdcs6mRCA7rFzYt4kopSyZP --ppx-profiling

PRODUCER result:

...
2025-06-13T16:18:28.343-00:00
BLLop3WUGHHGrQSB9LEqKdBfLGksNPNMFFUi2gFk2ESueiDMhsF ............................ 1         4552.647ms  47%
  new_finalized_head ........................................................... 1          105.757ms  85%
    try_process_block .......................................................... 1          100.857ms  85%
      process_block ............................................................ 1          100.849ms  85%
        process_block_data ..................................................... 1           94.270ms  87%
          publish_slot_data .................................................... 17          41.625ms 100%
          block_info ........................................................... 1           14.786ms  50%
          slot_headers ......................................................... 1            8.682ms  31%
          store_slot_headers ................................................... 1            0.774ms  95%
          fetch_committee ...................................................... 1            0.002ms 105%
          get_gs_worker ........................................................ 1            0.002ms 158%
          dal_attestation ...................................................... 1            0.004ms  75%
          update_selected_slot_headers_statuses ................................ 1            0.545ms 100%
          remove_unattested_slots_and_shards ................................... 1           23.703ms 106%
          get_attestations ..................................................... 1            4.036ms 104%
          check_attesters_attested ............................................. 1            0.004ms 105%
          inject_entrapment_evidences .......................................... 1            0.007ms 116%
2025-06-13T16:18:32.895-00:00
BKxwMuCz9vB4RKbFM5zfhNxGU3PWkrYZojPvCTJUVypTzps5YWA ............................ 1         3520.444ms  61%
  new_finalized_head ........................................................... 1           36.659ms  54%
    try_process_block .......................................................... 1           31.791ms  47%
      process_block ............................................................ 1           31.783ms  47%
        process_block_data ..................................................... 1           26.996ms  50%
          publish_slot_data .................................................... 17           0.206ms 100%
          block_info ........................................................... 1           10.799ms  45%
          slot_headers ......................................................... 1            8.133ms  34%
          store_slot_headers ................................................... 1            1.240ms 102%
          fetch_committee ...................................................... 1            0.002ms 105%
          get_gs_worker ........................................................ 1            0.002ms 105%
          dal_attestation ...................................................... 1            0.004ms  79%
          update_selected_slot_headers_statuses ................................ 1            0.198ms 101%
          remove_unattested_slots_and_shards ................................... 1            2.811ms 117%
          get_attestations ..................................................... 1            3.507ms  28%
          check_attesters_attested ............................................. 1            0.005ms 120%
          inject_entrapment_evidences .......................................... 1            0.010ms 100%
2025-06-13T16:18:36.416-00:00
BMaMrgsNdMWXvPZwcXeiEiYFGkUR766PK9wBDqnq1V19gRqKCS1 ............................ 1         4153.561ms  51%
...

(one block it has ~100ms time for new_finalized_head and the next block ~40ms and it repeats this behaviour)

In the /baker-dal-node-0 folder I got the following dal_profiling.txt:

...
2025-06-13T16:21:58.371-00:00
BL2HskwXQ7Ar8ZVJXPYbaa8AYoSBAsJ1ZKskyVEKzehHHsDV8JA ............................ 1         3956.804ms   1%
  new_finalized_head ........................................................... 1           72.993ms  37%
    try_process_block .......................................................... 1           67.196ms  36%
      process_block ............................................................ 1           67.189ms  36%
        process_block_data ..................................................... 1           51.052ms  44%
          publish_slot_data .................................................... 25           0.067ms 101%
          block_info ........................................................... 1           30.735ms  40%
          slot_headers ......................................................... 1           11.972ms  39%
          store_slot_headers ................................................... 1            1.019ms 100%
          fetch_committee ...................................................... 1            0.002ms 140%
          get_gs_worker ........................................................ 1            0.003ms 105%
          dal_attestation ...................................................... 1            0.005ms 120%
          update_selected_slot_headers_statuses ................................ 1            0.618ms  96%
          remove_unattested_slots_and_shards ................................... 1            1.616ms 100%
          get_attestations ..................................................... 1            4.788ms  37%
          check_attesters_attested ............................................. 1            0.088ms 101%
          inject_entrapment_evidences .......................................... 1            0.007ms  98%
2025-06-13T16:22:02.328-00:00
BLi61d7NsVz5eoxMx76m5swEAtRnNVHzNYnwHYBJUu4jXDn6BdX ............................ 1         4043.669ms   1%
...

(all blocks are around the same values)

Checklist

  • Document the interface of any function added or modified (see the coding guidelines)
  • Document any change to the user interface, including configuration parameters (see node configuration)
  • Provide automatic testing (see the testing guide).
  • For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
  • Select suitable reviewers using the Reviewers field below.
  • Select as Assignee the next person who should take action on that MR
Edited by Gabriel Moise

Merge request reports

Loading