Move DORA metrics to ClickHouse
Summary
DORA metrics are the industry standard for measuring DevOps performance in a company. GitLab still has some flaws in how it measures these metrics, so I suggest reiterating on DORA metrics in GitLab to provide a better experience and more robust indicators.
Current flaws
- Lead time for changes: supposed to be a median value, but in fact it is a median of daily medians.
- Time to restore service: supposed to be a median value, but in fact it is a median of daily medians.
- Change failure rate: supposed to be the percentage of deployments with incidents, but in fact it is the average number of incidents per deployment.
- The DORA calculation logic and the storage behind it are tightly coupled to the GitLab platform itself, which makes it impossible to use DORA metrics with parts of the system outside of GitLab, e.g. when Jira is used for planning. I believe fixing this is important because it will allow us to showcase metric improvements "before" and "after" adopting GitLab as a single platform for DevSecOps.
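
To make the first three flaws concrete, here is a minimal Python sketch. The numbers are made up for illustration and the function names are hypothetical; this is not GitLab's actual calculation code, only a demonstration of how the two definitions can diverge.

```python
from statistics import mean, median

# Hypothetical lead times in hours, grouped by deployment day.
lead_times_by_day = {
    "day-1": [1, 2, 100],
    "day-2": [3],
    "day-3": [4, 5, 6, 7, 8],
}

def median_of_daily_medians(by_day):
    # What the first two bullets describe: one median per day,
    # then a median over those per-day values.
    return median(median(values) for values in by_day.values())

def true_median(by_day):
    # Median over every individual measurement, ignoring day boundaries.
    return median(x for values in by_day.values() for x in values)

# Hypothetical number of incidents linked to each of four deployments.
incidents_per_deployment = [0, 0, 3, 0]

def average_incidents_per_deployment(counts):
    # What the third bullet describes as the current behavior.
    return mean(counts)

def share_of_deployments_with_incidents(counts):
    # Fraction of deployments that have at least one incident.
    return sum(1 for c in counts if c > 0) / len(counts)

print(median_of_daily_medians(lead_times_by_day))   # 3
print(true_median(lead_times_by_day))               # 5
print(average_incidents_per_deployment(incidents_per_deployment))     # 0.75
print(share_of_deployments_with_incidents(incidents_per_deployment))  # 0.25
```

In this sample a single deployment with three incidents turns a 25% change failure rate into a reported 75%, and the daily grouping shifts the median lead time from 5 hours to 3.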
Suggestion
- Move DORA calculation data to ClickHouse (CH), which will allow us to store non-aggregated data and use CH capabilities to aggregate on the fly. That will allow us to fix the lead time for changes (LTFC) and time to restore service (TTRS) metrics so that they are true medians.
- Make the data structures in CH decoupled from GitLab-specific concepts so that it is easier to integrate DORA with third-party services.
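
As a rough illustration of both points, the sketch below models a non-aggregated, platform-agnostic row and computes a true median at query time. It is written in Python only to show the shape of the idea; the `ChangeEvent` record and all of its field names are assumptions, not an agreed schema. In ClickHouse itself the aggregation would presumably be done with an aggregate function such as `quantileExact(0.5)` over the raw rows.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Iterable, Optional

@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical raw row: one record per deployed change."""
    source: str            # originating system, e.g. "gitlab" or "jira" (assumed values)
    change_id: str         # identifier that is opaque to the DORA calculation
    started_at: datetime   # when work on the change started
    deployed_at: datetime  # when the change reached production

def median_lead_time_hours(
    events: Iterable[ChangeEvent], start: datetime, end: datetime
) -> Optional[float]:
    # Aggregate on the fly over raw rows deployed in [start, end),
    # so the result is a true median for any requested period.
    hours = [
        (e.deployed_at - e.started_at).total_seconds() / 3600
        for e in events
        if start <= e.deployed_at < end
    ]
    return median(hours) if hours else None

events = [
    ChangeEvent("gitlab", "a", datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 10)),
    ChangeEvent("jira", "b", datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 18)),
    ChangeEvent("gitlab", "c", datetime(2024, 1, 2, 8), datetime(2024, 1, 2, 12)),
]
print(median_lead_time_hours(events, datetime(2024, 1, 1), datetime(2024, 1, 3)))  # 4.0
```

Because the calculation depends only on timestamps and a `source` label, rows that originate outside GitLab can be aggregated together with GitLab rows.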