(OKR 2022Q3 - 4.3) - Faster CI
The goal of this milestone is to make the CI of tezos/tezos take less time to run. The metric is wall time: not the sum of job durations, but the actual time that developers have to wait. We want to do this without compromising the quality of tests, and the cost of additional CI resources should stay within reason.
Note that the goal should not be taken too literally. The actual goal is to improve quality of life for developers. If we manage to significantly improve the CI without actually reducing the time it takes to run a pipeline, that work can still be considered in scope.
This milestone is split into the sub-goals below. Some of those sub-goals are incomplete: they list only preparatory work and omit some streams of work that are needed to reach them.
Dedicated runners (22 pw)
Expected gains: unknown, possibly between 0 and 10 minutes. The goal is to answer this question.
- Stream of work 1 (4 pw) (status: not started)
- Stream of work 2 (18 pw) (status: not started) (Q4)
Auto-balance unit tests
Expected gains: between 0 and 3 minutes.
- Stream of work 1 (status: in review)
- Estimated time: ~2 weeks
- Nb of engineers needed: 1 (@romain.nl)
- Short description: split Tezt into JS / non-JS parts
(other streams of work are out of scope for now)
Don't run tests for unchanged code
Expected gains: between 0 and 15 minutes depending on the MR. Between 1 and 4 minutes for a typical MR. Alternatively, could reduce the number of runners per typical MR.
- Stream of work 2 (status: proof of concept done) (Q4)
- Estimated time: ~1-2 months
- Nb of engineers needed: 2 (@romain.nl and Nohmane)
- Short description: generate CI YAML files from an OCaml specification
(other streams of work are out of scope for now)
Add more runners (5 pw)
Estimated gains: 0% to 90% (probably closer to 30%)
- Stream of work 1 (status: done)
- Estimated time: ~1-2 weeks
- Nb of engineers needed: 1 (Nohmane)
- Short description: fix marge bot handling of manual pipelines
- Stream of work 2 (status: done)
- Estimated time: ~1 week
- Nb of engineers needed: 1 (Nohmane)
- Depends on: stream of work 1
- Short description: manual pipelines
- Stream of work 3 (status: in progress: gathering data)
- Estimated time: ~1 week
- Nb of engineers needed: 1
- Depends on: stream of work 2
- Short description: investigate whether manual pipelines give us more breathing room w.r.t. the rate limit, since they would reduce the number of CI jobs
- Stream of work 4 (status: partially done, waiting for stream of work 3 to continue)
- Estimated time: ~3 days
- Nb of engineers needed: 1
- Weakly depends on: streams of work 1, 2, 3
- Short description: double the number of runners and monitor the result (this assumes that the rate limit is no longer a problem after manual pipelines, or that we find other workarounds)