Use Merge Trains for tezos/tezos

The introduction of merge trains in GitLab represents a promising feature that can potentially streamline our workflow and replace the existing functionality provided by Marge-bot.

This issue describes the proposed changes to the current workflow and how they will affect developers and the merge team.

Current workflow

An MR is created.
Developers must run manual pipelines to test their code in the CI
The MR must receive 2 approvals from the merge team to be merged
All threads must be resolved
The MR is assigned to Marge-bot
- Marge-bot puts the MR on an internal queue
- Marge-bot takes the MR at the head of the queue
- Rebases it on the current master and asserts a few static checks
- Waits for a successful pipeline
- Merges it
- Takes the next MR on the queue.

This workflow merges only one MR at a time.

Proposed Merge Trains workflow

Developers must run manual pipelines to test their code in the CI
An additional pipeline of type "merge_result" is triggered and succeeds immediately
The developer must remove the label when the MR is ready.
The MR must receive 2 approvals to be merged
All threads must be resolved
A merge team member clicks the "Merge" button
The MR is added to a merge train
If the merge train is empty:
- The MR is rebased on master
- A pipeline of type "merge_train" is run with all the tests
- The MR is merged
If the merge train is not empty:
- The MR is rebased on the tail of the train
- A pipeline of type "merge_train" is run with all the tests
- The MR is merged immediately once all MRs before it in the train have been merged
- The MR is removed from the train if the rebase or the CI fails
- If an MR ahead of it is removed from the train, all MRs behind the removed one are enqueued again.
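
The train behaviour described above can be sketched as a small simulation. This is only an illustrative model, not GitLab's implementation: the MR class and its passes flag are hypothetical, and we assume that whether an MR passes does not depend on the MRs it is rebased on.

```python
from dataclasses import dataclass


@dataclass
class MR:
    name: str
    passes: bool  # hypothetical: True if both the rebase and the CI succeed


def run_train(train: list[MR]) -> tuple[list[str], list[str]]:
    """Return (merged, removed) MR names after the train has drained."""
    merged: list[str] = []
    removed: list[str] = []
    queue = list(train)
    while queue:
        # Each queued MR gets a merge_train pipeline on top of its predecessors.
        first_failure = next(
            (i for i, mr in enumerate(queue) if not mr.passes), None
        )
        if first_failure is None:
            # Everything is green: the MRs merge in train order.
            merged.extend(mr.name for mr in queue)
            break
        # MRs ahead of the failure are unaffected and merge in order.
        merged.extend(mr.name for mr in queue[:first_failure])
        # The failing MR is removed from the train.
        removed.append(queue[first_failure].name)
        # Every MR behind it is enqueued again and gets a fresh pipeline.
        queue = queue[first_failure + 1:]
    return merged, removed


merged, removed = run_train(
    [MR("A", True), MR("B", False), MR("C", True), MR("D", True)]
)
print(merged, removed)
```

In this example B is removed, while C and D are enqueued again behind A and merge once their new pipelines are green.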

With the optimistic assumption that, on average, the CI is green and the rebase conflict rate is low, this workflow merges MRs at a higher rate by running multiple pipelines in parallel.

The "optimistic assumption" is essential to avoid soaring CI costs, because a failed pipeline or a rebase conflict implies rescheduling every MR enqueued later in the merge train.
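
To give a rough idea of the cost, the sketch below counts the merge_train pipelines started for one train. It is a simplified model under stated assumptions: at most one MR fails, all pipelines have already started when the failure is detected, and each rescheduled MR needs exactly one new pipeline. The function name and parameters are hypothetical.

```python
from typing import Optional


def pipelines_run(train_length: int, failed_position: Optional[int] = None) -> int:
    """Count merge_train pipelines started for one train.

    failed_position is the 0-based index of the single failing MR,
    or None when every pipeline is green.
    """
    if failed_position is None:
        return train_length
    if not 0 <= failed_position < train_length:
        raise ValueError("failed_position must index an MR in the train")
    # Every MR enqueued after the failing one needs a second pipeline.
    rescheduled = train_length - failed_position - 1
    return train_length + rescheduled


# A train of 5 MRs: all green, failure at the tail, failure at the head.
for position in (None, 4, 0):
    print(position, pipelines_run(5, position))
```

Under these assumptions a failure at the tail costs nothing extra, while a failure at the head of a 5-MR train almost doubles the number of pipelines (9 instead of 5).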

Implementation

First, we provide a few definitions. Merge trains and merge result pipelines are coupled in GitLab's implementation of merge trains. Merge result pipelines run on the result of merging the MR's source branch into its target branch. In our case, each time we push a commit to an MR, the merge request event is of type "merge_result". To save CI costs, we do not want to run a pipeline for each commit, which is why we introduced mandatory manual pipelines a while ago. These two mechanisms are incompatible.

Merge train pipelines are triggered each time an MR is added to the train, and they run with the tail of the merge train as target. These are the pipelines we must run to ensure that only tested code is merged into master.

With this in mind, the plan is to modify the rules of our manual trigger to play well with this new system.

First, we introduce a new job, a merge-result job that succeeds immediately. We run this job only if the trigger type is merge_result, that is, every time we push or rebase.

This avoids a duplicate pipeline and saves costs.

Second, we modify the current manual trigger to run only if the MR has a label that is set by default. This lets the developer trigger a manual job as needed, exactly as now. The label must be removed when the MR is ready for merge.

Third, we add a final job that runs the entire test suite if the merge request event is of type "merge_train". This ensures that the final pipeline runs before merging.

Why three new jobs?

We need to satisfy all our constraints.

Developers cannot add MRs to the merge train
Developers can run a manual pipeline when needed
No duplicate pipelines
The final pipeline is always run and green before merging into master.

Impact on the developer

As before, developers create an MR
When the MR is ready, the developer MUST add the label MR_IS_READY.
Instead of the manual pipeline, the developer now sees a phony, always-succeeding pipeline. This is a merge_result pipeline and it always succeeds. Since the merge_result job comes at no cost, we can safely ignore it.
To get back to the normal manual pipeline, the developer can remove the MR_IS_READY label.

Impact on the merge team

The label MR_IS_READY must be set for the MR.
The CI must be green for the MR to be added to the merge train. At this stage, this is always true, as the only job in the pipeline is the phony "merge_result" job.
The MR must be approved to be added to the merge train
Instead of assigning an MR to Marge-bot, the merge team member clicks the merge button.
The MR will be removed from the merge train in case of a rebase conflict or a problem with the CI.
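
As a reading aid, the conditions a merge team member checks before clicking the merge button can be written down as a checklist. This is a hypothetical sketch, not something GitLab or our CI provides: the MergeRequest class, its fields and the blockers function are invented for illustration, and the thresholds come from the workflow described in this issue (2 approvals, all threads resolved, label MR_IS_READY set, green CI).

```python
from dataclasses import dataclass


@dataclass
class MergeRequest:
    # Hypothetical, simplified view of an MR.
    labels: set[str]
    approvals: int
    unresolved_threads: int
    pipeline_green: bool


REQUIRED_APPROVALS = 2
READY_LABEL = "MR_IS_READY"


def blockers(mr: MergeRequest) -> list[str]:
    """List the reasons why the MR cannot be added to the merge train yet."""
    reasons = []
    if READY_LABEL not in mr.labels:
        reasons.append(f"label {READY_LABEL} is not set")
    if not mr.pipeline_green:
        reasons.append("CI is not green")
    if mr.approvals < REQUIRED_APPROVALS:
        reasons.append(f"needs {REQUIRED_APPROVALS} approvals, has {mr.approvals}")
    if mr.unresolved_threads:
        reasons.append("unresolved threads remain")
    return reasons


draft = MergeRequest(labels=set(), approvals=1, unresolved_threads=0, pipeline_green=True)
print(blockers(draft))
```

An empty list means the merge button can be clicked; the MR can still be removed from the train later if the rebase or the merge_train pipeline fails.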

In this new workflow the MR can be in two states.

Not ready (label MR_IS_READY not present). In this state, created merge request pipelines will look like normal pipelines. They have a manual trigger that runs the full normal pipeline.
Ready (label MR_IS_READY present). In this state, created merge request pipelines are "phony" and consist of a single trigger-merge-results job that serves only to make the pipeline green; it does nothing by itself. Merge train pipelines, on the other hand, contain an automatic, full normal pipeline.
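
The two states can be summarised as a small decision function. This is a sketch of the intended rules, not the actual CI configuration: the Job type and the job names "trigger" and "full-pipeline" are hypothetical, and only trigger-merge-results is taken from the description above. The event types are the values GitLab documents for the CI_MERGE_REQUEST_EVENT_TYPE variable, where what this issue calls "merge_result" is spelled merged_result.

```python
from typing import NamedTuple


class Job(NamedTuple):
    name: str
    when: str  # "always" runs automatically, "manual" waits for a click


READY_LABEL = "MR_IS_READY"


def job_for(event_type: str, labels: set[str]) -> Job:
    """Pick the entry job of a pipeline from the event type and the MR labels."""
    if event_type == "merge_train":
        # Final gate before master: the full pipeline runs automatically.
        return Job("full-pipeline", "always")
    if event_type == "merged_result":
        if READY_LABEL in labels:
            # Ready state: phony job that turns the pipeline green at once.
            return Job("trigger-merge-results", "always")
        # Not ready state: the developer starts the full pipeline by hand.
        return Job("trigger", "manual")
    raise ValueError(f"unhandled merge request event type: {event_type}")


print(job_for("merged_result", set()))
print(job_for("merged_result", {READY_LABEL}))
print(job_for("merge_train", {READY_LABEL}))
```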

The purpose of the Ready state is to simplify the manual rebases that are necessary while still ensuring that after each rebase, there is a quasi-immediate green pipeline.

Developers can change states, but a new pipeline is triggered only by clicking the rebase button or by pushing a commit.

Edited Nov 27, 2023 by pietro