diff --git a/doc/administration/reference_architectures/10k_users.md b/doc/administration/reference_architectures/10k_users.md index 69e17c28e543ccce8c8b1487ce79f36c0fe1dcc6..afa53b5efa8232618e2803cd58adf840569ec91c 100644 --- a/doc/administration/reference_architectures/10k_users.md +++ b/doc/administration/reference_architectures/10k_users.md @@ -422,9 +422,9 @@ install the necessary dependencies from step 1, and add the GitLab package repository from step 2. When installing GitLab in the second step, do not supply the `EXTERNAL_URL` value. -#### PostgreSQL primary node +#### PostgreSQL nodes -1. SSH in to the PostgreSQL primary node. +1. SSH in to one of the PostgreSQL nodes. 1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default username of `gitlab` (recommended). The command will request a password and confirmation. Use the value that is output by this command in the next @@ -452,23 +452,33 @@ in the second step, do not supply the `EXTERNAL_URL` value. sudo gitlab-ctl pg-password-md5 gitlab-consul ``` -1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: +1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: ```ruby - # Disable all components except PostgreSQL and Repmgr and Consul + # Disable all components except PostgreSQL, Patroni, and Consul roles ['postgres_role'] # PostgreSQL configuration postgresql['listen_address'] = '0.0.0.0' - postgresql['hot_standby'] = 'on' - postgresql['wal_level'] = 'replica' - postgresql['shared_preload_libraries'] = 'repmgr_funcs' + + # Enable Patroni + patroni['enable'] = true + # Set `max_wal_senders` to one more than the number of database nodes in the cluster. + # This is used to prevent replication from using up all of the + # available database connections. 
+ patroni['postgresql']['max_wal_senders'] = 4 + patroni['postgresql']['max_replication_slots'] = 4 + # Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691. + patroni['postgresql']['max_connections'] = 500 # Disable automatic database migrations gitlab_rails['auto_migrate'] = false # Configure the Consul agent + consul['enable'] = true consul['services'] = %w(postgresql) + ## Enable service discovery for Prometheus + consul['monitoring_service_discovery'] = true # START user configuration # Please set the real values as explained in Required Information section @@ -477,18 +487,9 @@ in the second step, do not supply the `EXTERNAL_URL` value. postgresql['pgbouncer_user_password'] = '' # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = '' - # Set `max_wal_senders` to one more than the number of database nodes in the cluster. - # This is used to prevent replication from using up all of the - # available database connections. - postgresql['max_wal_senders'] = 4 - postgresql['max_replication_slots'] = 4 # Replace XXX.XXX.XXX.XXX/YY with Network Address postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) - repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24) - - ## Enable service discovery for Prometheus - consul['monitoring_service_discovery'] = true # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -503,70 +504,9 @@ in the second step, do not supply the `EXTERNAL_URL` value. # END user configuration ``` -1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace - the file of the same name on this server. If that file is not on this server, - add the file from your Consul server to this server. - -1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. - -
- - Back to setup components - -
- -#### PostgreSQL secondary nodes - -1. On both the secondary nodes, add the same configuration specified above for the primary node - with an additional setting (`repmgr['master_on_initialization'] = false`) that will inform `gitlab-ctl` that they are standby nodes initially - and there's no need to attempt to register them as a primary node: - - ```ruby - # Disable all components except PostgreSQL and Repmgr and Consul - roles ['postgres_role'] - - # PostgreSQL configuration - postgresql['listen_address'] = '0.0.0.0' - postgresql['hot_standby'] = 'on' - postgresql['wal_level'] = 'replica' - postgresql['shared_preload_libraries'] = 'repmgr_funcs' - - # Disable automatic database migrations - gitlab_rails['auto_migrate'] = false - - # Configure the Consul agent - consul['services'] = %w(postgresql) - - # Specify if a node should attempt to be primary on initialization. - repmgr['master_on_initialization'] = false - - # Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value - postgresql['pgbouncer_user_password'] = '' - # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value - postgresql['sql_user_password'] = '' - # Set `max_wal_senders` to one more than the number of database nodes in the cluster. - # This is used to prevent replication from using up all of the - # available database connections. 
- postgresql['max_wal_senders'] = 4 - postgresql['max_replication_slots'] = 4 - - # Replace with your network addresses - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) - repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24) - - ## Enable service discovery for Prometheus - consul['monitoring_service_discovery'] = true - - # Set the network addresses that the exporters will listen on for monitoring - node_exporter['listen_address'] = '0.0.0.0:9100' - postgres_exporter['listen_address'] = '0.0.0.0:9187' - - ## The IPs of the Consul server nodes - ## You can also use FQDNs and intermix them with IPs - consul['configuration'] = { - retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13), - } - ``` +PostgreSQL, with Patroni managing its failover, uses `pg_rewind` by default to handle conflicts. +Like most failover handling methods, this has a small chance of leading to data loss. +Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method). 1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace the file of the same name on this server. If that file is not on this server, @@ -601,84 +541,25 @@ SSH in to the **primary node**: 1. Exit the database prompt by typing `\q` and Enter. -1. Verify the cluster is initialized with one node: +1. 
Check the status of the leader and cluster: ```shell - gitlab-ctl repmgr cluster show + gitlab-ctl patroni members ``` The output should be similar to the following: ```plaintext - Role | Name | Upstream | Connection String - ----------+----------|----------|---------------------------------------- - * master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr + | Cluster | Member | Host | Role | State | TL | Lag in MB | Pending restart | + |---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------| + | postgresql-ha | | 10.6.0.21 | Leader | running | 175 | | * | + | postgresql-ha | | 10.6.0.22 | | running | 175 | 0 | * | + | postgresql-ha | | 10.6.0.23 | | running | 175 | 0 | * | ``` -1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will - refer to the hostname in the next section as ``. If the value - is not an IP address, it will need to be a resolvable name (via DNS or - `/etc/hosts`) - -SSH in to the **secondary node**: - -1. Set up the repmgr standby: - - ```shell - gitlab-ctl repmgr standby setup - ``` - - Do note that this will remove the existing data on the node. The command - has a wait time. - - The output should be similar to the following: - - ```console - Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data - If this is not what you want, hit Ctrl-C now to exit - To skip waiting, rerun with the -w option - Sleeping for 30 seconds - Stopping the database - Removing the data - Cloning the data - Starting the database - Registering the node with the cluster - ok: run: repmgrd: (pid 19068) 0s - ``` - -Before moving on, make sure the databases are configured correctly. 
Run the -following command on the **primary** node to verify that replication is working -properly and the secondary nodes appear in the cluster: - -```shell -gitlab-ctl repmgr cluster show -``` - -The output should be similar to the following: - -```plaintext -Role | Name | Upstream | Connection String -----------+---------|-----------|------------------------------------------------ -* master | MASTER | | host= user=gitlab_repmgr dbname=gitlab_repmgr - standby | STANDBY | MASTER | host= user=gitlab_repmgr dbname=gitlab_repmgr - standby | STANDBY | MASTER | host= user=gitlab_repmgr dbname=gitlab_repmgr -``` - -If the 'Role' column for any node says "FAILED", check the +If the 'State' column for any node doesn't say "running", check the [Troubleshooting section](troubleshooting.md) before proceeding. -Also, check that the `repmgr-check-master` command works successfully on each node: - -```shell -su - gitlab-consul -gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node' -``` - -This command relies on exit codes to tell Consul whether a particular node is a master -or secondary. The most important thing here is that this command does not produce errors. -If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions. -Check the [Troubleshooting section](troubleshooting.md) before proceeding. -
Back to setup components @@ -696,7 +577,7 @@ The following IPs will be used as an example: 1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace `` and `` with the - password hashes you [set up previously](#postgresql-primary-node): + password hashes you [set up previously](#postgresql-nodes): ```ruby # Disable all components except Pgbouncer and Consul agent @@ -704,15 +585,16 @@ The following IPs will be used as an example: # Configure PgBouncer pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul) - pgbouncer['users'] = { - 'gitlab-consul': { - password: '' - }, - 'pgbouncer': { - password: '' - } + 'gitlab-consul': { + password: '' + }, + 'pgbouncer': { + password: '' + } } + # Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691. + pgbouncer['max_db_connections'] = 150 # Configure Consul agent consul['watchers'] = %w(postgresql) diff --git a/doc/administration/reference_architectures/25k_users.md b/doc/administration/reference_architectures/25k_users.md index 03e340bdb88ff342f2570860651d207f903d8ed3..d96e93d4ab4465f2498ed16b75fe1e81796a4a2a 100644 --- a/doc/administration/reference_architectures/25k_users.md +++ b/doc/administration/reference_architectures/25k_users.md @@ -422,9 +422,9 @@ install the necessary dependencies from step 1, and add the GitLab package repository from step 2. When installing GitLab in the second step, do not supply the `EXTERNAL_URL` value. -#### PostgreSQL primary node +#### PostgreSQL nodes -1. SSH in to the PostgreSQL primary node. +1. SSH in to one of the PostgreSQL nodes. 1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default username of `gitlab` (recommended). The command will request a password and confirmation. Use the value that is output by this command in the next @@ -452,23 +452,33 @@ in the second step, do not supply the `EXTERNAL_URL` value. 
sudo gitlab-ctl pg-password-md5 gitlab-consul ``` -1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: +1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: ```ruby - # Disable all components except PostgreSQL and Repmgr and Consul + # Disable all components except PostgreSQL, Patroni, and Consul roles ['postgres_role'] # PostgreSQL configuration postgresql['listen_address'] = '0.0.0.0' - postgresql['hot_standby'] = 'on' - postgresql['wal_level'] = 'replica' - postgresql['shared_preload_libraries'] = 'repmgr_funcs' + + # Enable Patroni + patroni['enable'] = true + # Set `max_wal_senders` to one more than the number of database nodes in the cluster. + # This is used to prevent replication from using up all of the + # available database connections. + patroni['postgresql']['max_wal_senders'] = 4 + patroni['postgresql']['max_replication_slots'] = 4 + # Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691. + patroni['postgresql']['max_connections'] = 500 # Disable automatic database migrations gitlab_rails['auto_migrate'] = false # Configure the Consul agent + consul['enable'] = true consul['services'] = %w(postgresql) + ## Enable service discovery for Prometheus + consul['monitoring_service_discovery'] = true # START user configuration # Please set the real values as explained in Required Information section @@ -477,18 +487,9 @@ in the second step, do not supply the `EXTERNAL_URL` value. postgresql['pgbouncer_user_password'] = '' # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = '' - # Set `max_wal_senders` to one more than the number of database nodes in the cluster. - # This is used to prevent replication from using up all of the - # available database connections. 
- postgresql['max_wal_senders'] = 4 - postgresql['max_replication_slots'] = 4 # Replace XXX.XXX.XXX.XXX/YY with Network Address postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) - repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24) - - ## Enable service discovery for Prometheus - consul['monitoring_service_discovery'] = true # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -503,70 +504,9 @@ in the second step, do not supply the `EXTERNAL_URL` value. # END user configuration ``` -1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace - the file of the same name on this server. If that file is not on this server, - add the file from your Consul server to this server. - -1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. - - - -#### PostgreSQL secondary nodes - -1. On both the secondary nodes, add the same configuration specified above for the primary node - with an additional setting (`repmgr['master_on_initialization'] = false`) that will inform `gitlab-ctl` that they are standby nodes initially - and there's no need to attempt to register them as a primary node: - - ```ruby - # Disable all components except PostgreSQL and Repmgr and Consul - roles ['postgres_role'] - - # PostgreSQL configuration - postgresql['listen_address'] = '0.0.0.0' - postgresql['hot_standby'] = 'on' - postgresql['wal_level'] = 'replica' - postgresql['shared_preload_libraries'] = 'repmgr_funcs' - - # Disable automatic database migrations - gitlab_rails['auto_migrate'] = false - - # Configure the Consul agent - consul['services'] = %w(postgresql) - - # Specify if a node should attempt to be primary on initialization. 
- repmgr['master_on_initialization'] = false - - # Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value - postgresql['pgbouncer_user_password'] = '' - # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value - postgresql['sql_user_password'] = '' - # Set `max_wal_senders` to one more than the number of database nodes in the cluster. - # This is used to prevent replication from using up all of the - # available database connections. - postgresql['max_wal_senders'] = 4 - postgresql['max_replication_slots'] = 4 - - # Replace with your network addresses - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) - repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24) - - ## Enable service discovery for Prometheus - consul['monitoring_service_discovery'] = true - - # Set the network addresses that the exporters will listen on for monitoring - node_exporter['listen_address'] = '0.0.0.0:9100' - postgres_exporter['listen_address'] = '0.0.0.0:9187' - - ## The IPs of the Consul server nodes - ## You can also use FQDNs and intermix them with IPs - consul['configuration'] = { - retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13), - } - ``` +PostgreSQL, with Patroni managing its failover, uses `pg_rewind` by default to handle conflicts. +Like most failover handling methods, this has a small chance of leading to data loss. +Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method). 1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace the file of the same name on this server. If that file is not on this server, @@ -601,84 +541,25 @@ SSH in to the **primary node**: 1. Exit the database prompt by typing `\q` and Enter. -1. Verify the cluster is initialized with one node: +1. 
Check the status of the leader and cluster: ```shell - gitlab-ctl repmgr cluster show + gitlab-ctl patroni members ``` The output should be similar to the following: ```plaintext - Role | Name | Upstream | Connection String - ----------+----------|----------|---------------------------------------- - * master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr + | Cluster | Member | Host | Role | State | TL | Lag in MB | Pending restart | + |---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------| + | postgresql-ha | | 10.6.0.21 | Leader | running | 175 | | * | + | postgresql-ha | | 10.6.0.22 | | running | 175 | 0 | * | + | postgresql-ha | | 10.6.0.23 | | running | 175 | 0 | * | ``` -1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will - refer to the hostname in the next section as ``. If the value - is not an IP address, it will need to be a resolvable name (via DNS or - `/etc/hosts`) - -SSH in to the **secondary node**: - -1. Set up the repmgr standby: - - ```shell - gitlab-ctl repmgr standby setup - ``` - - Do note that this will remove the existing data on the node. The command - has a wait time. - - The output should be similar to the following: - - ```console - Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data - If this is not what you want, hit Ctrl-C now to exit - To skip waiting, rerun with the -w option - Sleeping for 30 seconds - Stopping the database - Removing the data - Cloning the data - Starting the database - Registering the node with the cluster - ok: run: repmgrd: (pid 19068) 0s - ``` - -Before moving on, make sure the databases are configured correctly. 
Run the -following command on the **primary** node to verify that replication is working -properly and the secondary nodes appear in the cluster: - -```shell -gitlab-ctl repmgr cluster show -``` - -The output should be similar to the following: - -```plaintext -Role | Name | Upstream | Connection String -----------+---------|-----------|------------------------------------------------ -* master | MASTER | | host= user=gitlab_repmgr dbname=gitlab_repmgr - standby | STANDBY | MASTER | host= user=gitlab_repmgr dbname=gitlab_repmgr - standby | STANDBY | MASTER | host= user=gitlab_repmgr dbname=gitlab_repmgr -``` - -If the 'Role' column for any node says "FAILED", check the +If the 'State' column for any node doesn't say "running", check the [Troubleshooting section](troubleshooting.md) before proceeding. -Also, check that the `repmgr-check-master` command works successfully on each node: - -```shell -su - gitlab-consul -gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node' -``` - -This command relies on exit codes to tell Consul whether a particular node is a master -or secondary. The most important thing here is that this command does not produce errors. -If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions. -Check the [Troubleshooting section](troubleshooting.md) before proceeding. -
Back to setup components @@ -696,7 +577,7 @@ The following IPs will be used as an example: 1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace `` and `` with the - password hashes you [set up previously](#postgresql-primary-node): + password hashes you [set up previously](#postgresql-nodes): ```ruby # Disable all components except Pgbouncer and Consul agent @@ -704,15 +585,16 @@ The following IPs will be used as an example: # Configure PgBouncer pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul) - pgbouncer['users'] = { - 'gitlab-consul': { - password: '' - }, - 'pgbouncer': { - password: '' - } + 'gitlab-consul': { + password: '' + }, + 'pgbouncer': { + password: '' + } } + # Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691. + pgbouncer['max_db_connections'] = 150 # Configure Consul agent consul['watchers'] = %w(postgresql) diff --git a/doc/administration/reference_architectures/2k_users.md b/doc/administration/reference_architectures/2k_users.md index 6158fabb8bd22167944b2391b484e7087a9b0ab4..e664d55fa1bbef65b370fcf3ad75c126e1fbb419 100644 --- a/doc/administration/reference_architectures/2k_users.md +++ b/doc/administration/reference_architectures/2k_users.md @@ -271,7 +271,7 @@ further configuration steps. ```ruby # Disable all components except PostgreSQL roles ['postgres_role'] - repmgr['enable'] = false + patroni['enable'] = false consul['enable'] = false prometheus['enable'] = false alertmanager['enable'] = false diff --git a/doc/administration/reference_architectures/3k_users.md b/doc/administration/reference_architectures/3k_users.md index 670b86913cb7aad25215057a993cd3633651e4f0..e7b151452a52744cc3c537a5bd0e64d59b24cfde 100644 --- a/doc/administration/reference_architectures/3k_users.md +++ b/doc/administration/reference_architectures/3k_users.md @@ -672,9 +672,9 @@ install the necessary dependencies from step 1, and add the GitLab package repository from step 2. 
When installing GitLab in the second step, do not supply the `EXTERNAL_URL` value. -#### PostgreSQL primary node +#### PostgreSQL nodes -1. SSH in to the PostgreSQL primary node. +1. SSH in to one of the PostgreSQL nodes. 1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default username of `gitlab` (recommended). The command will request a password and confirmation. Use the value that is output by this command in the next @@ -702,114 +702,33 @@ in the second step, do not supply the `EXTERNAL_URL` value. sudo gitlab-ctl pg-password-md5 gitlab-consul ``` -1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: +1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: ```ruby - # Disable all components except PostgreSQL and Repmgr and Consul + # Disable all components except PostgreSQL, Patroni, and Consul roles ['postgres_role'] # PostgreSQL configuration postgresql['listen_address'] = '0.0.0.0' - postgresql['hot_standby'] = 'on' - postgresql['wal_level'] = 'replica' - postgresql['shared_preload_libraries'] = 'repmgr_funcs' - # Disable automatic database migrations - gitlab_rails['auto_migrate'] = false - - # Configure the Consul agent - consul['services'] = %w(postgresql) - - # START user configuration - # Please set the real values as explained in Required Information section - # - # Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value - postgresql['pgbouncer_user_password'] = '' - # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value - postgresql['sql_user_password'] = '' + # Enable Patroni + patroni['enable'] = true # Set `max_wal_senders` to one more than the number of database nodes in the cluster. # This is used to prevent replication from using up all of the # available database connections. 
- postgresql['max_wal_senders'] = 4 - postgresql['max_replication_slots'] = 4 - - # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24) - repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24) - - ## Enable service discovery for Prometheus - consul['enable'] = true - consul['monitoring_service_discovery'] = true - - # Set the network addresses that the exporters will listen on for monitoring - node_exporter['listen_address'] = '0.0.0.0:9100' - postgres_exporter['listen_address'] = '0.0.0.0:9187' - postgres_exporter['dbname'] = 'gitlabhq_production' - postgres_exporter['password'] = '' - - ## The IPs of the Consul server nodes - ## You can also use FQDNs and intermix them with IPs - consul['configuration'] = { - retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13), - } - # - # END user configuration - ``` - -1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. -1. You can list the current PostgreSQL primary, secondary nodes status via: - - ```shell - sudo /opt/gitlab/bin/gitlab-ctl repmgr cluster show - ``` - -1. Verify the GitLab services are running: - - ```shell - sudo gitlab-ctl status - ``` - - The output should be similar to the following: - - ```plaintext - run: consul: (pid 30593) 77133s; run: log: (pid 29912) 77156s - run: logrotate: (pid 23449) 3341s; run: log: (pid 29794) 77175s - run: node-exporter: (pid 30613) 77133s; run: log: (pid 29824) 77170s - run: postgres-exporter: (pid 30620) 77132s; run: log: (pid 29894) 77163s - run: postgresql: (pid 30630) 77132s; run: log: (pid 29618) 77181s - run: repmgrd: (pid 30639) 77132s; run: log: (pid 29985) 77150s - ``` - - - -#### PostgreSQL secondary nodes - -1. 
On both the secondary nodes, add the same configuration specified above for the primary node - with an additional setting that will inform `gitlab-ctl` that they are standby nodes initially - and there's no need to attempt to register them as a primary node: - - ```ruby - # Disable all components except PostgreSQL and Repmgr and Consul - roles ['postgres_role'] - - # PostgreSQL configuration - postgresql['listen_address'] = '0.0.0.0' - postgresql['hot_standby'] = 'on' - postgresql['wal_level'] = 'replica' - postgresql['shared_preload_libraries'] = 'repmgr_funcs' + patroni['postgresql']['max_wal_senders'] = 4 + patroni['postgresql']['max_replication_slots'] = 4 + # Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691. + patroni['postgresql']['max_connections'] = 500 # Disable automatic database migrations gitlab_rails['auto_migrate'] = false # Configure the Consul agent + consul['enable'] = true consul['services'] = %w(postgresql) - - # Specify if a node should attempt to be primary on initialization. - repmgr['master_on_initialization'] = false + ## Enable service discovery for Prometheus + consul['monitoring_service_discovery'] = true # START user configuration # Please set the real values as explained in Required Information section @@ -818,34 +737,31 @@ in the second step, do not supply the `EXTERNAL_URL` value. postgresql['pgbouncer_user_password'] = '' # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = '' - # Set `max_wal_senders` to one more than the number of database nodes in the cluster. - # This is used to prevent replication from using up all of the - # available database connections. 
- postgresql['max_wal_senders'] = 4 - postgresql['max_replication_slots'] = 4 # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24) - repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24) - - ## Enable service discovery for Prometheus - consul['enable'] = true - consul['monitoring_service_discovery'] = true + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' postgres_exporter['listen_address'] = '0.0.0.0:9187' - postgres_exporter['dbname'] = 'gitlabhq_production' - postgres_exporter['password'] = '' ## The IPs of the Consul server nodes ## You can also use FQDNs and intermix them with IPs consul['configuration'] = { retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13), } + # # END user configuration ``` +PostgreSQL, with Patroni managing its failover, uses `pg_rewind` by default to handle conflicts. +Like most failover handling methods, this has a small chance of leading to data loss. +Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method). + +1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace + the file of the same name on this server. If that file is not on this server, + add the file from your Consul server to this server. + 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. Advanced [configuration options](https://docs.gitlab.com/omnibus/settings/database.html) @@ -876,84 +792,25 @@ SSH in to the **primary node**: 1. Exit the database prompt by typing `\q` and Enter. -1. Verify the cluster is initialized with one node: +1. 
Check the status of the leader and cluster: ```shell - gitlab-ctl repmgr cluster show + gitlab-ctl patroni members ``` The output should be similar to the following: ```plaintext - Role | Name | Upstream | Connection String - ----------+----------|----------|---------------------------------------- - * master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr + | Cluster | Member | Host | Role | State | TL | Lag in MB | Pending restart | + |---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------| + | postgresql-ha | | 10.6.0.31 | Leader | running | 175 | | * | + | postgresql-ha | | 10.6.0.32 | | running | 175 | 0 | * | + | postgresql-ha | | 10.6.0.33 | | running | 175 | 0 | * | ``` -1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will - refer to the hostname in the next section as ``. If the value - is not an IP address, it will need to be a resolvable name (via DNS or - `/etc/hosts`) - -SSH in to the **secondary node**: - -1. Set up the repmgr standby: - - ```shell - gitlab-ctl repmgr standby setup - ``` - - Do note that this will remove the existing data on the node. The command - has a wait time. - - The output should be similar to the following: - - ```console - Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data - If this is not what you want, hit Ctrl-C now to exit - To skip waiting, rerun with the -w option - Sleeping for 30 seconds - Stopping the database - Removing the data - Cloning the data - Starting the database - Registering the node with the cluster - ok: run: repmgrd: (pid 19068) 0s - ``` - -Before moving on, make sure the databases are configured correctly. 
Run the -following command on the **primary** node to verify that replication is working -properly and the secondary nodes appear in the cluster: - -```shell -gitlab-ctl repmgr cluster show -``` - -The output should be similar to the following: - -```plaintext -Role | Name | Upstream | Connection String -----------+---------|-----------|------------------------------------------------ -* master | MASTER | | host= user=gitlab_repmgr dbname=gitlab_repmgr - standby | STANDBY | MASTER | host= user=gitlab_repmgr dbname=gitlab_repmgr - standby | STANDBY | MASTER | host= user=gitlab_repmgr dbname=gitlab_repmgr -``` - -If the 'Role' column for any node says "FAILED", check the +If the 'State' column for any node doesn't say "running", check the [Troubleshooting section](troubleshooting.md) before proceeding. -Also, check that the `repmgr-check-master` command works successfully on each node: - -```shell -su - gitlab-consul -gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node' -``` - -This command relies on exit codes to tell Consul whether a particular node is a master -or secondary. The most important thing here is that this command does not produce errors. -If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions. -Check the [Troubleshooting section](troubleshooting.md) before proceeding. -
 Back to setup components

@@ -971,7 +828,7 @@ The following IPs will be used as an example:

 1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace
    `` and `` with the
-   password hashes you [set up previously](#postgresql-primary-node):
+   password hashes you [set up previously](#postgresql-nodes):

    ```ruby
    # Disable all components except Pgbouncer and Consul agent
@@ -979,15 +836,16 @@ The following IPs will be used as an example:

    # Configure PgBouncer
    pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
-
    pgbouncer['users'] = {
-           'gitlab-consul': {
-                   password: ''
-           },
-           'pgbouncer': {
-                   password: ''
-           }
+     'gitlab-consul': {
+       password: ''
+     },
+     'pgbouncer': {
+       password: ''
+     }
    }
+   # Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
+   pgbouncer['max_db_connections'] = 150

    # Configure Consul agent
    consul['watchers'] = %w(postgresql)
diff --git a/doc/administration/reference_architectures/50k_users.md b/doc/administration/reference_architectures/50k_users.md
index 91b3b022365817b185221d041b0196aca9b4ebc4..093869d331b2200dd059bd4ef5c493d0dd1a7d4e 100644
--- a/doc/administration/reference_architectures/50k_users.md
+++ b/doc/administration/reference_architectures/50k_users.md
@@ -422,9 +422,9 @@ install the necessary dependencies from step 1, and add the
 GitLab package repository from step 2. When installing GitLab
 in the second step, do not supply the `EXTERNAL_URL` value.

-#### PostgreSQL primary node
+#### PostgreSQL nodes

-1. SSH in to the PostgreSQL primary node.
+1. SSH in to one of the PostgreSQL nodes.
 1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default
    username of `gitlab` (recommended). The command will request a password
    and confirmation. Use the value that is output by this command in the next
@@ -452,23 +452,33 @@ in the second step, do not supply the `EXTERNAL_URL` value.
    sudo gitlab-ctl pg-password-md5 gitlab-consul
    ```

-1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:

    ```ruby
-   # Disable all components except PostgreSQL and Repmgr and Consul
+   # Disable all components except PostgreSQL, Patroni, and Consul
    roles ['postgres_role']

    # PostgreSQL configuration
    postgresql['listen_address'] = '0.0.0.0'
-   postgresql['hot_standby'] = 'on'
-   postgresql['wal_level'] = 'replica'
-   postgresql['shared_preload_libraries'] = 'repmgr_funcs'
+
+   # Enable Patroni
+   patroni['enable'] = true
+   # Set `max_wal_senders` to one more than the number of database nodes in the cluster.
+   # This is used to prevent replication from using up all of the
+   # available database connections.
+   patroni['postgresql']['max_wal_senders'] = 4
+   patroni['postgresql']['max_replication_slots'] = 4
+   # Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
+   patroni['postgresql']['max_connections'] = 500

    # Disable automatic database migrations
    gitlab_rails['auto_migrate'] = false

    # Configure the Consul agent
+   consul['enable'] = true
    consul['services'] = %w(postgresql)
+   ## Enable service discovery for Prometheus
+   consul['monitoring_service_discovery'] = true

    # START user configuration
    # Please set the real values as explained in Required Information section
@@ -477,18 +487,9 @@ in the second step, do not supply the `EXTERNAL_URL` value.
    postgresql['pgbouncer_user_password'] = ''
    # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
    postgresql['sql_user_password'] = ''
-   # Set `max_wal_senders` to one more than the number of database nodes in the cluster.
-   # This is used to prevent replication from using up all of the
-   # available database connections.
-   postgresql['max_wal_senders'] = 4
-   postgresql['max_replication_slots'] = 4

    # Replace XXX.XXX.XXX.XXX/YY with Network Address
    postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
-   repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
-
-   ## Enable service discovery for Prometheus
-   consul['monitoring_service_discovery'] = true

    # Set the network addresses that the exporters will listen on for monitoring
    node_exporter['listen_address'] = '0.0.0.0:9100'
@@ -503,70 +504,9 @@ in the second step, do not supply the `EXTERNAL_URL` value.
    # END user configuration
    ```

-1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
-   the file of the same name on this server. If that file is not on this server,
-   add the file from your Consul server to this server.
-
-1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-
-
-
-#### PostgreSQL secondary nodes
-
-1. On both the secondary nodes, add the same configuration specified above for the primary node
-   with an additional setting (`repmgr['master_on_initialization'] = false`) that will inform `gitlab-ctl` that they are standby nodes initially
-   and there's no need to attempt to register them as a primary node:
-
-   ```ruby
-   # Disable all components except PostgreSQL and Repmgr and Consul
-   roles ['postgres_role']
-
-   # PostgreSQL configuration
-   postgresql['listen_address'] = '0.0.0.0'
-   postgresql['hot_standby'] = 'on'
-   postgresql['wal_level'] = 'replica'
-   postgresql['shared_preload_libraries'] = 'repmgr_funcs'
-
-   # Disable automatic database migrations
-   gitlab_rails['auto_migrate'] = false
-
-   # Configure the Consul agent
-   consul['services'] = %w(postgresql)
-
-   # Specify if a node should attempt to be primary on initialization.
-   repmgr['master_on_initialization'] = false
-
-   # Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value
-   postgresql['pgbouncer_user_password'] = ''
-   # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
-   postgresql['sql_user_password'] = ''
-   # Set `max_wal_senders` to one more than the number of database nodes in the cluster.
-   # This is used to prevent replication from using up all of the
-   # available database connections.
-   postgresql['max_wal_senders'] = 4
-   postgresql['max_replication_slots'] = 4
-
-   # Replace with your network addresses
-   postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
-   repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
-
-   ## Enable service discovery for Prometheus
-   consul['monitoring_service_discovery'] = true
-
-   # Set the network addresses that the exporters will listen on for monitoring
-   node_exporter['listen_address'] = '0.0.0.0:9100'
-   postgres_exporter['listen_address'] = '0.0.0.0:9187'
-
-   ## The IPs of the Consul server nodes
-   ## You can also use FQDNs and intermix them with IPs
-   consul['configuration'] = {
-     retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
-   }
-   ```
+PostgreSQL, with Patroni managing its failover, will default to use `pg_rewind` by default to handle conflicts.
+Like most failover handling methods, this has a small chance of leading to data loss.
+Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method).

 1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
    the file of the same name on this server. If that file is not on this server,
@@ -601,84 +541,25 @@ SSH in to the **primary node**:

 1. Exit the database prompt by typing `\q` and Enter.

-1. Verify the cluster is initialized with one node:
+1. Check the status of the leader and cluster:

    ```shell
-   gitlab-ctl repmgr cluster show
+   gitlab-ctl patroni members
    ```

    The output should be similar to the following:

    ```plaintext
-   Role      | Name     | Upstream | Connection String
-   ----------+----------|----------|----------------------------------------
-   * master  | HOSTNAME |          | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
+   | Cluster       | Member                            | Host      | Role   | State   | TL  | Lag in MB | Pending restart |
+   |---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------|
+   | postgresql-ha |                                   | 10.6.0.21 | Leader | running | 175 |           | *               |
+   | postgresql-ha |                                   | 10.6.0.22 |        | running | 175 | 0         | *               |
+   | postgresql-ha |                                   | 10.6.0.23 |        | running | 175 | 0         | *               |
    ```

-1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will
-   refer to the hostname in the next section as ``. If the value
-   is not an IP address, it will need to be a resolvable name (via DNS or
-   `/etc/hosts`)
-
-SSH in to the **secondary node**:
-
-1. Set up the repmgr standby:
-
-   ```shell
-   gitlab-ctl repmgr standby setup
-   ```
-
-   Do note that this will remove the existing data on the node. The command
-   has a wait time.
-
-   The output should be similar to the following:
-
-   ```console
-   Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data
-   If this is not what you want, hit Ctrl-C now to exit
-   To skip waiting, rerun with the -w option
-   Sleeping for 30 seconds
-   Stopping the database
-   Removing the data
-   Cloning the data
-   Starting the database
-   Registering the node with the cluster
-   ok: run: repmgrd: (pid 19068) 0s
-   ```
-
-Before moving on, make sure the databases are configured correctly. Run the
-following command on the **primary** node to verify that replication is working
-properly and the secondary nodes appear in the cluster:
-
-```shell
-gitlab-ctl repmgr cluster show
-```
-
-The output should be similar to the following:
-
-```plaintext
-Role      | Name    | Upstream  | Connection String
-----------+---------|-----------|------------------------------------------------
-* master  | MASTER  |           | host= user=gitlab_repmgr dbname=gitlab_repmgr
-  standby | STANDBY | MASTER    | host= user=gitlab_repmgr dbname=gitlab_repmgr
-  standby | STANDBY | MASTER    | host= user=gitlab_repmgr dbname=gitlab_repmgr
-```
-
-If the 'Role' column for any node says "FAILED", check the
+If the 'State' column for any node doesn't say "running", check the
 [Troubleshooting section](troubleshooting.md) before proceeding.

-Also, check that the `repmgr-check-master` command works successfully on each node:
-
-```shell
-su - gitlab-consul
-gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node'
-```
-
-This command relies on exit codes to tell Consul whether a particular node is a master
-or secondary. The most important thing here is that this command does not produce errors.
-If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions.
-Check the [Troubleshooting section](troubleshooting.md) before proceeding.
-
 Back to setup components

@@ -696,7 +577,7 @@ The following IPs will be used as an example:

 1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace
    `` and `` with the
-   password hashes you [set up previously](#postgresql-primary-node):
+   password hashes you [set up previously](#postgresql-nodes):

    ```ruby
    # Disable all components except Pgbouncer and Consul agent
@@ -704,15 +585,16 @@ The following IPs will be used as an example:

    # Configure PgBouncer
    pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
-
    pgbouncer['users'] = {
-           'gitlab-consul': {
-                   password: ''
-           },
-           'pgbouncer': {
-                   password: ''
-           }
+     'gitlab-consul': {
+       password: ''
+     },
+     'pgbouncer': {
+       password: ''
+     }
    }
+   # Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
+   pgbouncer['max_db_connections'] = 150

    # Configure Consul agent
    consul['watchers'] = %w(postgresql)
diff --git a/doc/administration/reference_architectures/5k_users.md b/doc/administration/reference_architectures/5k_users.md
index 6d7511f67dbe1b012f2c9ebb144820518d0ca3f1..16ad866a108201f2a9ce0107d04753f0d063ca69 100644
--- a/doc/administration/reference_architectures/5k_users.md
+++ b/doc/administration/reference_architectures/5k_users.md
@@ -671,9 +671,9 @@ install the necessary dependencies from step 1, and add the
 GitLab package repository from step 2. When installing GitLab
 in the second step, do not supply the `EXTERNAL_URL` value.

-#### PostgreSQL primary node
+#### PostgreSQL nodes

-1. SSH in to the PostgreSQL primary node.
+1. SSH in to one of the PostgreSQL nodes.
 1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default
    username of `gitlab` (recommended). The command will request a password
    and confirmation. Use the value that is output by this command in the next
@@ -701,114 +701,33 @@ in the second step, do not supply the `EXTERNAL_URL` value.
    sudo gitlab-ctl pg-password-md5 gitlab-consul
    ```

-1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:

    ```ruby
-   # Disable all components except PostgreSQL and Repmgr and Consul
+   # Disable all components except PostgreSQL, Patroni, and Consul
    roles ['postgres_role']

    # PostgreSQL configuration
    postgresql['listen_address'] = '0.0.0.0'
-   postgresql['hot_standby'] = 'on'
-   postgresql['wal_level'] = 'replica'
-   postgresql['shared_preload_libraries'] = 'repmgr_funcs'

-   # Disable automatic database migrations
-   gitlab_rails['auto_migrate'] = false
-
-   # Configure the Consul agent
-   consul['services'] = %w(postgresql)
-
-   # START user configuration
-   # Please set the real values as explained in Required Information section
-   #
-   # Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value
-   postgresql['pgbouncer_user_password'] = ''
-   # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
-   postgresql['sql_user_password'] = ''
+   # Enable Patroni
+   patroni['enable'] = true
    # Set `max_wal_senders` to one more than the number of database nodes in the cluster.
    # This is used to prevent replication from using up all of the
    # available database connections.
-   postgresql['max_wal_senders'] = 4
-   postgresql['max_replication_slots'] = 4
-
-   # Replace XXX.XXX.XXX.XXX/YY with Network Address
-   postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
-   repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
-
-   ## Enable service discovery for Prometheus
-   consul['enable'] = true
-   consul['monitoring_service_discovery'] = true
-
-   # Set the network addresses that the exporters will listen on for monitoring
-   node_exporter['listen_address'] = '0.0.0.0:9100'
-   postgres_exporter['listen_address'] = '0.0.0.0:9187'
-   postgres_exporter['dbname'] = 'gitlabhq_production'
-   postgres_exporter['password'] = ''
-
-   ## The IPs of the Consul server nodes
-   ## You can also use FQDNs and intermix them with IPs
-   consul['configuration'] = {
-     retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
-   }
-   #
-   # END user configuration
-   ```
-
-1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-1. You can list the current PostgreSQL primary, secondary nodes status via:
-
-   ```shell
-   sudo /opt/gitlab/bin/gitlab-ctl repmgr cluster show
-   ```
-
-1. Verify the GitLab services are running:
-
-   ```shell
-   sudo gitlab-ctl status
-   ```
-
-   The output should be similar to the following:
-
-   ```plaintext
-   run: consul: (pid 30593) 77133s; run: log: (pid 29912) 77156s
-   run: logrotate: (pid 23449) 3341s; run: log: (pid 29794) 77175s
-   run: node-exporter: (pid 30613) 77133s; run: log: (pid 29824) 77170s
-   run: postgres-exporter: (pid 30620) 77132s; run: log: (pid 29894) 77163s
-   run: postgresql: (pid 30630) 77132s; run: log: (pid 29618) 77181s
-   run: repmgrd: (pid 30639) 77132s; run: log: (pid 29985) 77150s
-   ```
-
-
-
-#### PostgreSQL secondary nodes
-
-1. On both the secondary nodes, add the same configuration specified above for the primary node
-   with an additional setting that will inform `gitlab-ctl` that they are standby nodes initially
-   and there's no need to attempt to register them as a primary node:
-
-   ```ruby
-   # Disable all components except PostgreSQL and Repmgr and Consul
-   roles ['postgres_role']
-
-   # PostgreSQL configuration
-   postgresql['listen_address'] = '0.0.0.0'
-   postgresql['hot_standby'] = 'on'
-   postgresql['wal_level'] = 'replica'
-   postgresql['shared_preload_libraries'] = 'repmgr_funcs'
+   patroni['postgresql']['max_wal_senders'] = 4
+   patroni['postgresql']['max_replication_slots'] = 4
+   # Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
+   patroni['postgresql']['max_connections'] = 500

    # Disable automatic database migrations
    gitlab_rails['auto_migrate'] = false

    # Configure the Consul agent
+   consul['enable'] = true
    consul['services'] = %w(postgresql)
-
-   # Specify if a node should attempt to be primary on initialization.
-   repmgr['master_on_initialization'] = false
+   ## Enable service discovery for Prometheus
+   consul['monitoring_service_discovery'] = true

    # START user configuration
    # Please set the real values as explained in Required Information section
@@ -817,34 +736,31 @@ in the second step, do not supply the `EXTERNAL_URL` value.
    postgresql['pgbouncer_user_password'] = ''
    # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
    postgresql['sql_user_password'] = ''
-   # Set `max_wal_senders` to one more than the number of database nodes in the cluster.
-   # This is used to prevent replication from using up all of the
-   # available database connections.
-   postgresql['max_wal_senders'] = 4
-   postgresql['max_replication_slots'] = 4

    # Replace XXX.XXX.XXX.XXX/YY with Network Address
-   postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
-   repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
-
-   ## Enable service discovery for Prometheus
-   consul['enable'] = true
-   consul['monitoring_service_discovery'] = true
+   postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)

    # Set the network addresses that the exporters will listen on for monitoring
    node_exporter['listen_address'] = '0.0.0.0:9100'
    postgres_exporter['listen_address'] = '0.0.0.0:9187'
-   postgres_exporter['dbname'] = 'gitlabhq_production'
-   postgres_exporter['password'] = ''

    ## The IPs of the Consul server nodes
    ## You can also use FQDNs and intermix them with IPs
    consul['configuration'] = {
      retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
    }
+   #
    # END user configuration
    ```

+PostgreSQL, with Patroni managing its failover, will default to use `pg_rewind` by default to handle conflicts.
+Like most failover handling methods, this has a small chance of leading to data loss.
+Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method).
+
+1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
+   the file of the same name on this server. If that file is not on this server,
+   add the file from your Consul server to this server.
+
 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.

 Advanced [configuration options](https://docs.gitlab.com/omnibus/settings/database.html)
@@ -874,84 +790,25 @@ SSH in to the **primary node**:

 1. Exit the database prompt by typing `\q` and Enter.

-1. Verify the cluster is initialized with one node:
+1. Check the status of the leader and cluster:

    ```shell
-   gitlab-ctl repmgr cluster show
+   gitlab-ctl patroni members
    ```

    The output should be similar to the following:

    ```plaintext
-   Role      | Name     | Upstream | Connection String
-   ----------+----------|----------|----------------------------------------
-   * master  | HOSTNAME |          | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
+   | Cluster       | Member                            | Host      | Role   | State   | TL  | Lag in MB | Pending restart |
+   |---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------|
+   | postgresql-ha |                                   | 10.6.0.31 | Leader | running | 175 |           | *               |
+   | postgresql-ha |                                   | 10.6.0.32 |        | running | 175 | 0         | *               |
+   | postgresql-ha |                                   | 10.6.0.33 |        | running | 175 | 0         | *               |
    ```

-1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will
-   refer to the hostname in the next section as ``. If the value
-   is not an IP address, it will need to be a resolvable name (via DNS or
-   `/etc/hosts`)
-
-SSH in to the **secondary node**:
-
-1. Set up the repmgr standby:
-
-   ```shell
-   gitlab-ctl repmgr standby setup
-   ```
-
-   Do note that this will remove the existing data on the node. The command
-   has a wait time.
-
-   The output should be similar to the following:
-
-   ```console
-   Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data
-   If this is not what you want, hit Ctrl-C now to exit
-   To skip waiting, rerun with the -w option
-   Sleeping for 30 seconds
-   Stopping the database
-   Removing the data
-   Cloning the data
-   Starting the database
-   Registering the node with the cluster
-   ok: run: repmgrd: (pid 19068) 0s
-   ```
-
-Before moving on, make sure the databases are configured correctly. Run the
-following command on the **primary** node to verify that replication is working
-properly and the secondary nodes appear in the cluster:
-
-```shell
-gitlab-ctl repmgr cluster show
-```
-
-The output should be similar to the following:
-
-```plaintext
-Role      | Name    | Upstream  | Connection String
-----------+---------|-----------|------------------------------------------------
-* master  | MASTER  |           | host= user=gitlab_repmgr dbname=gitlab_repmgr
-  standby | STANDBY | MASTER    | host= user=gitlab_repmgr dbname=gitlab_repmgr
-  standby | STANDBY | MASTER    | host= user=gitlab_repmgr dbname=gitlab_repmgr
-```
-
-If the 'Role' column for any node says "FAILED", check the
+If the 'State' column for any node doesn't say "running", check the
 [Troubleshooting section](troubleshooting.md) before proceeding.

-Also, check that the `repmgr-check-master` command works successfully on each node:
-
-```shell
-su - gitlab-consul
-gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node'
-```
-
-This command relies on exit codes to tell Consul whether a particular node is a master
-or secondary. The most important thing here is that this command does not produce errors.
-If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions.
-Check the [Troubleshooting section](troubleshooting.md) before proceeding.
-
 Back to setup components

@@ -969,7 +826,7 @@ The following IPs will be used as an example:

 1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace
    `` and `` with the
-   password hashes you [set up previously](#postgresql-primary-node):
+   password hashes you [set up previously](#postgresql-nodes):

    ```ruby
    # Disable all components except Pgbouncer and Consul agent
@@ -977,15 +834,16 @@ The following IPs will be used as an example:

    # Configure PgBouncer
    pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
-
    pgbouncer['users'] = {
-           'gitlab-consul': {
-                   password: ''
-           },
-           'pgbouncer': {
-                   password: ''
-           }
+     'gitlab-consul': {
+       password: ''
+     },
+     'pgbouncer': {
+       password: ''
+     }
    }
+   # Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
+   pgbouncer['max_db_connections'] = 150

    # Configure Consul agent
    consul['watchers'] = %w(postgresql)
diff --git a/doc/administration/reference_architectures/troubleshooting.md b/doc/administration/reference_architectures/troubleshooting.md
index cd1a2bc91dfe341c253bffa36c065218af52fe23..8fcdb457d6a8ea30b6b2ea4f6407a9dc49fa94ff 100644
--- a/doc/administration/reference_architectures/troubleshooting.md
+++ b/doc/administration/reference_architectures/troubleshooting.md
@@ -514,39 +514,24 @@ See the suggested fix [in Geo documentation](../geo/replication/troubleshooting.

 See the suggested fix [in Geo documentation](../geo/replication/troubleshooting.md#message-log--invalid-ip-mask-md5-name-or-service-not-known).

-## Troubleshooting PostgreSQL
+## Troubleshooting PostgreSQL with Patroni

-In case you are experiencing any issues connecting through PgBouncer, the first place to check is always the logs:
+In case you are experiencing any issues connecting through PgBouncer, the first place to check is always the logs for PostgreSQL (which is run through Patroni):

 ```shell
-sudo gitlab-ctl tail postgresql
+sudo gitlab-ctl tail patroni
 ```

-### Consul and PostgreSQL changes not taking effect
+### Consul and PostgreSQL with Patroni changes not taking effect

 Due to the potential impacts, `gitlab-ctl reconfigure` only reloads Consul and PostgreSQL, it will not restart the services. However, not all changes can be activated by reloading.

-To restart either service, run `gitlab-ctl restart SERVICE`
+To restart either service, run `gitlab-ctl restart consul` or `gitlab-ctl restart patroni` respectively.

-For PostgreSQL, it is usually safe to restart the master node by default. Automatic failover defaults to a 1 minute timeout. Provided the database returns before then, nothing else needs to be done. To be safe, you can stop `repmgrd` on the standby nodes first with `gitlab-ctl stop repmgrd`, then start afterwards with `gitlab-ctl start repmgrd`.
+For PostgreSQL with Patroni, to prevent the primary node from being failed over automatically, it's safest to stop all secondaries first, then restart the primary and finally restart the secondaries again.

 On the Consul server nodes, it is important to restart the Consul service in a controlled fashion. Read our [Consul documentation](../consul.md#restart-consul) for instructions on how to restart the service.

-### `gitlab-ctl repmgr-check-master` command produces errors
-
-If this command displays errors about database permissions it is likely that something failed during
-install, resulting in the `gitlab-consul` database user getting incorrect permissions. Follow these
-steps to fix the problem:
-
-1. On the master database node, connect to the database prompt - `gitlab-psql -d template1`
-1. Delete the `gitlab-consul` user - `DROP USER "gitlab-consul";`
-1. Exit the database prompt - `\q`
-1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) and the user will be re-added with the proper permissions.
-1. Change to the `gitlab-consul` user - `su - gitlab-consul`
-1. Try the check command again - `gitlab-ctl repmgr-check-master`.
-
-Now there should not be errors. If errors still occur then there is another problem.
-
 ### PgBouncer error `ERROR: pgbouncer cannot connect to server`

 You may get this error when running `gitlab-rake gitlab:db:configure` or you