CI: Add [runner_system_failure] and [stuck_or_timeout_failure] retries handlers to the default section (!18661) · Merge requests · Tezos / tezos

Summary

This MR adds a retry policy to the default section of the CI configuration to support the transition to GCP spot instances for runners with the gcp tag. With this default in place, jobs that fail with runner_system_failure or stuck_or_timeout_failure are retried automatically; both are common failure modes on spot instances.

Changes

Default Retry Policy

Added default retry configuration that applies to all jobs:
- max: 2 retries
- Retry on: stuck_or_timeout_failure and runner_system_failure
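
As a rough illustration, the settings listed above would correspond to a block like the following in the generated YAML. This is a sketch based on standard GitLab CI syntax; the exact layout of the generated file is not shown in this MR and may differ.

```yaml
# Sketch of the default retry block (assumed layout, not copied from the MR diff).
default:
  retry:
    # Retry a failed job up to 2 times.
    max: 2
    # Only retry on these infrastructure-related failure types.
    when:
      - stuck_or_timeout_failure
      - runner_system_failure
```

Because only these two failure types are listed, ordinary script failures are not retried by this default.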

Cleanup of Redundant Retry Configurations

- Removed redundant retry policies from individual CI jobs where they duplicate the new default behavior
- Retained specific retry configurations only where jobs need different retry behavior than the default
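
For context, a job that keeps its own retry configuration might look like the sketch below. The job name and script are hypothetical and are not taken from this MR. In GitLab CI, a job-level retry replaces the default retry entirely rather than merging with it, so a job that overrides the default has to list every failure type it still wants retried.

```yaml
# Hypothetical job; name and script are placeholders, not from the Tezos CI.
example-flaky-job:
  script:
    - ./run_example.sh
  retry:
    # Differs from the default: a single retry only.
    max: 1
    when:
      # Job-specific failure type, not part of the default.
      - script_failure
      # Re-listed here because the job-level block replaces the default block.
      - runner_system_failure
      - stuck_or_timeout_failure
```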

Rationale

With the transition to GCP spot instances for cost optimization, we need to handle the increased likelihood of infrastructure-related failures gracefully. Spot instances can be preempted at any time, leading to runner_system_failure scenarios. Additionally, stuck_or_timeout_failure can occur during spot instance provisioning delays.

Setting these retry policies at the default level has the following benefits:

- Consistency: All jobs automatically get appropriate retry behavior for spot instance failures
- Maintenance: Reduces the need to manually add retry policies to each job
- Resilience: Improves pipeline reliability during the GCP spot instance transition
- Clean Code: Eliminates redundant retry configurations throughout the CI configuration

Testing

- CI pipeline generates correctly with the new default configuration
- Jobs that previously had matching retry policies now inherit from default
- Jobs with specific retry requirements maintain their custom configurations
- Generated YAML file contains the expected default retry block

Edited Jul 16, 2025 by Neo
