
Manual Gitaly node failover

Problem to solve

If a failure makes a Gitaly node unavailable, there needs to be a way to promote a replica so that access continues.

We should focus on making the failover process itself reliable and automatic before we begin detecting failures automatically, since there is no point in detecting a failure and then failing over unreliably.

Further details

This should be the most naive implementation possible, so that we can exercise it in different situations, observe undesirable behavior, and then address those cases iteratively.

Proposal

As an administrator

  • from the Rails console, I should be able to mark a repository's primary as unavailable
  • GitLab should then select a replica to promote to primary
  • Git access should now be served by the promoted replica (the former secondary)
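
The three steps above can be sketched in plain Ruby. This is only an illustration of the intended flow, not GitLab or Gitaly code: `Node`, `RepositoryStorage`, and `fail_over!` are hypothetical names invented for this example, the state is held in memory, and "select a replica" is assumed to mean "take the first replica that is still available", in keeping with the naive approach described earlier.

```ruby
# Hypothetical in-memory model of one repository's storage nodes.
# None of these names are real GitLab or Gitaly APIs.
Node = Struct.new(:name, :available)

class RepositoryStorage
  attr_reader :primary, :replicas

  def initialize(primary, replicas)
    @primary = primary
    @replicas = replicas
  end

  # Mark the current primary as unavailable and promote a replica.
  # Naive selection (an assumption): the first available replica wins.
  # Raises before changing any state if no replica can take over.
  def fail_over!
    candidate = @replicas.find(&:available)
    raise "no available replica to promote" if candidate.nil?

    @primary.available = false
    @replicas.delete(candidate)
    @replicas << @primary # keep the old primary around as an unavailable replica
    @primary = candidate
  end
end

# What an administrator might type in a console session:
storage = RepositoryStorage.new(
  Node.new("gitaly-1", true),
  [Node.new("gitaly-2", true), Node.new("gitaly-3", true)]
)
storage.fail_over!
puts storage.primary.name # prints "gitaly-2"
```

Git access would then be routed to whatever `storage.primary` returns. The sketch deliberately ignores replication lag and in-flight writes, which are the kinds of undesirable behavior the naive implementation is meant to surface.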

Links / references

Edited by James Ramsay (ex-GitLab)