diff --git a/doc/administration/gitaly/img/praefect_architecture_v12_10.png b/doc/administration/gitaly/img/praefect_architecture_v12_10.png index 7b8f1138b2351c286469d52850a2a3b9de744dca..024a12b0a5dd1b955fcff2101fb86830ce198e89 100644 Binary files a/doc/administration/gitaly/img/praefect_architecture_v12_10.png and b/doc/administration/gitaly/img/praefect_architecture_v12_10.png differ diff --git a/doc/administration/gitaly/praefect.md b/doc/administration/gitaly/praefect.md index 19a69dc348ccc9b8f2440d53f569c6d0cddb548a..c8be53f6c042b747dc9185fe3dfd2e2be142a284 100644 --- a/doc/administration/gitaly/praefect.md +++ b/doc/administration/gitaly/praefect.md @@ -1,34 +1,36 @@ # Praefect: High Availability -NOTE: **Note:** Praefect is an experimental service, and data loss is likely. +NOTE: **Note:** Praefect is a +[beta](https://about.gitlab.com/handbook/product/#alpha-beta-ga) component that +allows Gitaly to be run in a highly available configuration. While unexpected +data loss is not likely, Praefect is not yet ready for production environments. Praefect is an optional reverse-proxy for [Gitaly](../index.md) to manage a -cluster of Gitaly nodes for high availability. Initially, high availability -be implemented through asynchronous replication. If a Gitaly node becomes -unavailable, it will be possible to fail over to a warm Gitaly replica. +cluster of Gitaly nodes for high availability. High availability is currently +implemented through asynchronous replication. If a Gitaly node becomes +unavailable, Praefect will automatically route traffic to a warm Gitaly replica. -The first minimal version will support: +The current version supports: - Eventual consistency of the secondary replicas. - Automatic fail over from the primary to the secondary. - Reporting of possible data loss if replication queue is non empty. Follow the [HA Gitaly epic](https://gitlab.com/groups/gitlab-org/-/epics/1489) -for updates and roadmap. +for improvements including +[strong consistency](https://gitlab.com/groups/gitlab-org/-/epics/1189) and +[horizontally distributing reads](https://gitlab.com/groups/gitlab-org/-/epics/2013). ## Requirements for configuring Gitaly for High Availability -NOTE: **Note:** this reference architecture is not highly available because -Praefect is a single point of failure. +A minimum highly available configuration requires: -The minimal [alpha](https://about.gitlab.com/handbook/product/#alpha-beta-ga) -reference architecture additionally requires: - -- 1 Praefect node -- 1 PostgreSQL server (PostgreSQL 9.6 or newer) +- 1 highly available load balancer +- 1 highly available PostgreSQL server (PostgreSQL 9.6 or newer) +- 2 Praefect nodes - 3 Gitaly nodes (1 primary, 2 secondary) -![Alpha architecture diagram](img/praefect_architecture_v12_10.png) +![Architecture diagram](img/praefect_architecture_v12_10.png) See the [design document](https://gitlab.com/gitlab-org/gitaly/-/blob/master/doc/design_ha.md) @@ -64,6 +66,7 @@ GitLab](https://about.gitlab.com/install/). You will need the IP/host address for each node. +1. `LOADBALANCER_SERVER_ADDRESS`: the IP/hots address of the load balancer 1. `POSTGRESQL_SERVER_ADDRESS`: the IP/host address of the PostgreSQL server 1. `PRAEFECT_HOST`: the IP/host address of the Praefect server 1. `GITALY_HOST`: the IP/host address of each Gitaly server @@ -245,9 +248,9 @@ application server, or a Gitaly node. 1. Configure the **Praefect** cluster to connect to each Gitaly node in the cluster by editing `/etc/gitlab/gitlab.rb`. - In the example below we have configured one cluster named `praefect`. This - cluster has three Gitaly nodes `gitaly-1`, `gitaly-2`, and `gitaly-3`, which - will be replicas of each other. + In the example below we have configured one virtual storage (or shard) named + `storage-1`. This cluster has three Gitaly nodes `gitaly-1`, `gitaly-2`, and + `gitaly-3`, which will be replicas of each other. Replace `PRAEFECT_INTERNAL_TOKEN` with a strong secret, which will be used by Praefect when communicating with Gitaly nodes in the cluster. This token is @@ -266,7 +269,7 @@ application server, or a Gitaly node. # Name of storage hash must match storage name in git_data_dirs on GitLab # server ('praefect') and in git_data_dirs on Gitaly nodes ('gitaly-1') praefect['virtual_storages'] = { - 'praefect' => { + 'storage-1' => { 'gitaly-1' => { 'address' => 'tcp://GITALY_HOST:8075', 'token' => 'PRAEFECT_INTERNAL_TOKEN', @@ -312,7 +315,7 @@ demote the primary to be a secondary. # failover_enabled turns on automatic failover praefect['failover_enabled'] = true praefect['virtual_storages'] = { - 'praefect' => { + 'storage-1' => { 'gitaly-1' => { 'address' => 'tcp://GITALY_HOST:8075', 'token' => 'PRAEFECT_INTERNAL_TOKEN', @@ -555,7 +558,7 @@ Particular attention should be shown to: - the storage name added to `git_data_dirs` in this section must match the storage name under `praefect['virtual_storages']` on the Praefect node. This was set in the [Praefect](#praefect) section of this guide. This document uses - `praefect` as the Praefect storage name. + `storage-1` as the Praefect storage name. 1. SSH into the **GitLab** node and login as root: @@ -577,7 +580,7 @@ Particular attention should be shown to: "default" => { "gitaly_address" => "tcp://GITLAB_HOST:8075" }, - "praefect" => { + "storage-1" => { "gitaly_address" => "tcp://PRAEFECT_HOST:2305", "gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN' }