
Octez-p2p: enable SO_KEEPALIVE flag for connections

(patch proposed by @Saroupille)

This MR enables the SO_KEEPALIVE socket option on connections established by the Octez-P2P library. The goal is to improve connection reliability in the Data Availability Layer (DAL) node by detecting stale connections.

Rationale

The decision to enable SO_KEEPALIVE stems from observed behavior: connections in the DAL node become stale over time because certain peers communicate infrequently. A socket-level keep-alive mechanism provides a robust alternative to application-level ping strategies, simplifying the architecture while achieving the desired connection reliability.
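As an illustration of what a socket-level keep-alive amounts to, here is a minimal sketch in Python using the standard `socket` module. It is not the Octez code (which is written in OCaml), and the helper names `enable_keepalive` and `keepalive_enabled` are hypothetical; only the `SO_KEEPALIVE` option itself comes from this MR.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the kernel to send keep-alive probes on this socket when it is idle."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def keepalive_enabled(sock: socket.socket) -> bool:
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    # The option can be set before the socket is connected.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_keepalive(s)
        print("SO_KEEPALIVE enabled:", keepalive_enabled(s))
```

The option is a single flag per socket, which is why it is simpler than maintaining a ping/pong protocol at the application layer.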

Impact on DAL Nodes

  • Maintain Active Connections: With SO_KEEPALIVE enabled, keep-alive probes are sent on idle connections to verify that the peer is still reachable. This is particularly beneficial for the DAL node, where some peers may send messages infrequently.

  • Prevent Stale Connections: Without a ping mechanism at the GossipSub/Application layer, SO_KEEPALIVE helps detect and close unresponsive or dead connections, thereby preventing resource leaks and enhancing overall network stability.

  • Resource Management: Frees system resources by letting dead connections be detected and closed instead of lingering indefinitely.

Impact on L1 Nodes

  • Minimal Necessity: L1 nodes typically engage in frequent data exchanges, reducing the likelihood of stale connections.

  • Non-Harmful: Enabling SO_KEEPALIVE likely offers little benefit for L1 nodes, but it should not introduce any adverse effects, so it is safe to enable across node types.

Usage Considerations

  • Performance: While SO_KEEPALIVE introduces periodic network traffic to monitor connection health, the impact is generally minimal and outweighed by the benefits of maintaining active and reliable connections.

  • Configurability: The frequency and other parameters of keep-alive probes can typically be adjusted at the system level to balance responsiveness against network overhead, allowing flexibility across deployment environments.
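To make the trade-off in the last point concrete, the sketch below estimates how long a dead peer can go unnoticed and shows a per-socket override of the probe parameters. Several things here are assumptions not stated in this MR: the figures 7200 s, 75 s and 9 probes are the usual Linux defaults (`tcp_keepalive_time`, `tcp_keepalive_intvl`, `tcp_keepalive_probes`); the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` options are platform-dependent; and the helper names are hypothetical. Nothing here implies that the MR itself tunes these values.

```python
import socket


def detection_delay(idle_s: int, interval_s: int, probes: int) -> int:
    """Rough upper bound, in seconds, before a dead peer is detected:
    idle time before the first probe, plus one interval per unanswered probe."""
    return idle_s + interval_s * probes


def tune_keepalive(sock: socket.socket, idle_s: int, interval_s: int, probes: int) -> dict:
    """Enable SO_KEEPALIVE and apply per-socket probe settings where the
    platform exposes them. Returns the settings that were actually applied."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {}
    wanted = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in wanted:
        option = getattr(socket, name, None)  # not defined on every platform
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied


if __name__ == "__main__":
    # Assumed Linux defaults: about 2 h 11 min before a dead peer is noticed.
    print(detection_delay(7200, 75, 9), "seconds with assumed defaults")
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(tune_keepalive(s, idle_s=60, interval_s=10, probes=3))
```

With the assumed defaults, detection takes on the order of two hours, so deployments that need faster cleanup of stale connections would lower these values, at the cost of more probe traffic on idle connections.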

Edited by Mohamed IGUERNLALA
