From 6a5cb8d3c4baa7d1e27efa0d0a1bc467a5a1837e Mon Sep 17 00:00:00 2001
From: Valery Sizov
Date: Mon, 11 Mar 2019 18:54:37 +0200
Subject: [PATCH 1/4] Clarify where 'Write to "authorized_keys"' option is

---
 doc/administration/operations/fast_ssh_key_lookup.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/administration/operations/fast_ssh_key_lookup.md b/doc/administration/operations/fast_ssh_key_lookup.md
index 6ba5768ebfdc08..07f6953fc1be3e 100644
--- a/doc/administration/operations/fast_ssh_key_lookup.md
+++ b/doc/administration/operations/fast_ssh_key_lookup.md
@@ -79,7 +79,7 @@ file will still be scanned. So git SSH performance will still be slow for many
 users as long as a large file exists.

 You can disable any more writes to the `authorized_keys` file by unchecking
-`Write to "authorized_keys" file` in the Application Settings of your GitLab
+`Write to "authorized_keys" file` in the _Admin > Settings > Network > Performance optimization_ of your GitLab
 installation.

 ![Write to authorized keys setting](img/write_to_authorized_keys_setting.png)
-- 
GitLab

From 6e8b1eb8985237305d5e6c60505500ca70e77170 Mon Sep 17 00:00:00 2001
From: Valery Sizov
Date: Tue, 12 Mar 2019 19:20:52 +0200
Subject: [PATCH 2/4] Geo: Update external database use guide

Documented Guide to using Geo in HA with RDS cross-region replicas
---
 .../geo/replication/external_database.md | 105 ++++++++++++++++--
 ...o-in-ha-with-rds-cross-region-replicas.yml | 5 +
 2 files changed, 99 insertions(+), 11 deletions(-)
 create mode 100644 ee/changelogs/unreleased/7360-documented-guide-to-using-geo-in-ha-with-rds-cross-region-replicas.yml

diff --git a/doc/administration/geo/replication/external_database.md b/doc/administration/geo/replication/external_database.md
index 18e0c75f703405..392d972a010239 100644
--- a/doc/administration/geo/replication/external_database.md
+++ b/doc/administration/geo/replication/external_database.md
@@ -11,6 +11,21 @@ developed and tested. We aim to be compatible with most external

 ## **Primary** node

+1. SSH into your GitLab **primary** server and login as root:
+
+ ```sh
+ sudo -i
+ ```
+
+1. Execute the command below to define the node as **primary** node:
+
+ ```sh
+ gitlab-ctl set-geo-primary-node
+ ```
+
+ This command will use your defined `external_url` in `/etc/gitlab/gitlab.rb`.
+
+
 ### Configure the external database to be replicated

 To set up an external database, you can either:
@@ -18,8 +33,16 @@ To set up an external database, you can either:
 - Set up streaming replication yourself (for example, in AWS RDS).
 - Perform the Omnibus configuration manually as follows.

-In an Omnibus install, the
-[geo_primary_role](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
+#### AWS RDS (as an example)
+
+Given you have primary node set up on AWS EC2 that uses RDS.
+You can now just create a read-only replica in a different region and the
+replication process will be managed by AWS. Make sure you've set Network ACL, Subnet, and
+Security Group according to your needs, so the secondary node instance can access the database.
+
+#### Any other external database with access to `postgresql.conf`
+
+The [geo_primary_role](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
 configures the **primary** node's database to be replicated by making changes to
 `pg_hba.conf` and `postgresql.conf`. Make the following configuration changes
 manually to your external database configuration:
@@ -55,6 +78,32 @@ has three main functions:
 1. Configure the tracking database.
 1. Enable the Geo Log Cursor (`geo_logcursor`) (irrelevant to this doc).

+The first two steps we will set up manually as described bellow but the Log Cursor have to
+be enabled anyway:
+
+1. SSH into a GitLab **secondary** server and login as root:
+
+ ```bash
+ sudo -i
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the following
+
+ ```ruby
+ ##
+ ## Geo Secondary role
+ ## - configure dependent flags automatically to enable Geo
+ ##
+ roles ['geo_secondary_role']
+ ```
+1. Reconfigure GitLab for the changes to take effect:
+
+ ```sh
+ gitlab-ctl reconfigure
+ ```
+
+
+
 ### Configure the external replica database

 To set up an external replica database, you can either:
@@ -62,9 +111,8 @@ To set up an external replica database, you can either:
 - Set up streaming replication yourself (for example, in AWS RDS).
 - Perform the Omnibus configuration manually as follows.

-In an Omnibus install, the `geo_secondary_role` makes configuration changes to
-`postgresql.conf`. Make the following configuration changes manually to your
-external replica database configuration:
+Make the following configuration changes manually to your `postgresql.conf`
+of external replica database:

 ```
 ##
@@ -77,6 +125,33 @@ wal_keep_segments = 10
 hot_standby = on
 ```

+### Configure an access to the external replica database
+
+1. SSH into a GitLab **secondary** server and login as root:
+
+ ```bash
+ sudo -i
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` with the connection params and credentials for
+ the machine with the PostgreSQL replica instance:
+
+ ```ruby
+ # note this is shared between both databases,
+ # make sure you define the same password in both
+ gitlab_rails['db_password'] = 'mypassword'
+
+ gitlab_rails['db_username'] = 'gitlab'
+ gitlab_rails['db_host'] = 'my-database-read-replica.dbs.com'
+ ```
+
+1. Reconfigure GitLab for the changes to take effect:
+
+ ```bash
+ gitlab-ctl reconfigure
+ ```
+
+
 ### Configure the tracking database

 **Secondary** nodes use a separate PostgreSQL installation as a tracking
@@ -89,6 +164,13 @@ connection with the **secondary** replica database for improved performance.
 If you have an external database ready to be used as the tracking database,
 follow the instructions below to use it:

+NOTE: **Note:**
+If you use AWS RDS for tracking database, make sure your tracking database has access to
+secondary database. Unfortunatelly, just assigning the same security group is not enough as
+outbound rules do not apply to RDS PostgreSQL database. So you need to explicitly add inbound
+rule for security group to allow any TCP traffic from tracking database to 5432 port.
+
+
 1. SSH into a GitLab **secondary** server and login as root:

 ```bash
@@ -99,12 +181,11 @@ follow the instructions below to use it:
 the machine with the PostgreSQL instance:

 ```ruby
- # note this is shared between both databases,
- # make sure you define the same password in both
- gitlab_rails['db_password'] = 'mypassword'
+ geo_secondary['db_username'] = 'gitlab_geo'
+ geo_secondary['db_password'] = 'my password'

- geo_secondary['db_host'] = ''
- geo_secondary['db_port'] = 5431 # change to the correct port
+ geo_secondary['db_host'] = ''
+ geo_secondary['db_port'] = 5432 # change to the correct port
 geo_secondary['db_fdw'] = true # enable FDW
 geo_postgresql['enable'] = false # don't use internal managed instance
 ```
@@ -118,6 +199,7 @@ follow the instructions below to use it:
 1. Run the tracking database migrations:

 ```bash
+ gitlab-rake geo:db:create
 gitlab-rake geo:db:migrate
 ```

@@ -135,6 +217,7 @@ follow the instructions below to use it:
 DB_HOST=""
 DB_NAME="gitlabhq_production"
 DB_USER="gitlab"
+ DB_PASS="my password"
 DB_PORT="5432"

 # Tracking Database connection params:
@@ -149,7 +232,7 @@ follow the instructions below to use it:

 query_exec "CREATE EXTENSION postgres_fdw;"
 query_exec "CREATE SERVER gitlab_secondary FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '${DB_HOST}', dbname '${DB_NAME}', port '${DB_PORT}');"
- query_exec "CREATE USER MAPPING FOR ${GEO_DB_USER} SERVER gitlab_secondary OPTIONS (user '${DB_USER}');"
+ query_exec "CREATE USER MAPPING FOR ${GEO_DB_USER} SERVER gitlab_secondary OPTIONS (user '${DB_USER}', password '${DB_PASS}');"
 query_exec "CREATE SCHEMA gitlab_secondary;"
 query_exec "GRANT USAGE ON FOREIGN SERVER gitlab_secondary TO ${GEO_DB_USER};"
 ```
diff --git a/ee/changelogs/unreleased/7360-documented-guide-to-using-geo-in-ha-with-rds-cross-region-replicas.yml b/ee/changelogs/unreleased/7360-documented-guide-to-using-geo-in-ha-with-rds-cross-region-replicas.yml
new file mode 100644
index 00000000000000..a7bdcd9cb134e9
--- /dev/null
+++ b/ee/changelogs/unreleased/7360-documented-guide-to-using-geo-in-ha-with-rds-cross-region-replicas.yml
@@ -0,0 +1,5 @@
+---
+title: Documented Guide to using Geo in HA with RDS cross-region replicas
+merge_request:
+author:
+type: changed
-- 
GitLab

From ca944f0bf5ee3b94bd5b6638ca78a4abb6131ccc Mon Sep 17 00:00:00 2001
From: Valery Sizov
Date: Wed, 13 Mar 2019 14:56:04 +0200
Subject: [PATCH 3/4] Simplify External database Geo Setup

External tracking database is an optional step
---
 .../geo/replication/database.md | 6 +-
 .../geo/replication/external_database.md | 93 ++++++++-----------
 2 files changed, 40 insertions(+), 59 deletions(-)

diff --git a/doc/administration/geo/replication/database.md b/doc/administration/geo/replication/database.md
index 10e5409124c639..fb90ea9ccd75ad 100644
--- a/doc/administration/geo/replication/database.md
+++ b/doc/administration/geo/replication/database.md
@@ -8,8 +8,8 @@ from source, follow the
 NOTE: **Note:**
 If your GitLab installation uses external (not managed by Omnibus) PostgreSQL
 instances, the Omnibus roles will not be able to perform all necessary
-configuration steps. In this case, refer to
-[additional instructions](external_database.md).
+configuration steps. In this case,
+[follow the Geo with external PostgreSQL instances document instead](external_database.md).

 NOTE: **Note:**
 The stages of the setup process must be completed in the documented order.
@@ -333,7 +333,7 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o

 ##
 ## Secondary address
- ## - replace '198.51.100.2' with the public or VPC address of your Geo secondary node
+ ## - replace '198.51.100.2' with the public or VPC address of your Geo secondary node
 ##
 postgresql['listen_address'] = '198.51.100.2'
 postgresql['md5_auth_cidr_addresses'] = ['198.51.100.2/32']
diff --git a/doc/administration/geo/replication/external_database.md b/doc/administration/geo/replication/external_database.md
index 392d972a010239..bbc9886686c4c5 100644
--- a/doc/administration/geo/replication/external_database.md
+++ b/doc/administration/geo/replication/external_database.md
@@ -33,14 +33,15 @@ To set up an external database, you can either:
 - Set up streaming replication yourself (for example, in AWS RDS).
 - Perform the Omnibus configuration manually as follows.

-#### AWS RDS (as an example)
+#### Leverage your cloud provider's tools to replicate the primary database

-Given you have primary node set up on AWS EC2 that uses RDS.
+Given you have a primary node set up on AWS EC2 that uses RDS.
 You can now just create a read-only replica in a different region and the
 replication process will be managed by AWS. Make sure you've set Network ACL, Subnet, and
-Security Group according to your needs, so the secondary node instance can access the database.
+Security Group according to your needs, so the secondary application node can access the database.
+Skip to the [Configure secondary application node](#configure-secondary-application-node) section below.

-#### Any other external database with access to `postgresql.conf`
+#### Manually configure the primary database for replication

 The [geo_primary_role](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
 configures the **primary** node's database to be replicated by making changes to
@@ -70,46 +71,7 @@ hot_standby = on

 ## **Secondary** nodes

-With Omnibus, the
-[geo_secondary_role](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
-has three main functions:
-
-1. Configure the replica database.
-1. Configure the tracking database.
-1. Enable the Geo Log Cursor (`geo_logcursor`) (irrelevant to this doc).
-
-The first two steps we will set up manually as described bellow but the Log Cursor have to
-be enabled anyway:
-
-1. SSH into a GitLab **secondary** server and login as root:
-
- ```bash
- sudo -i
- ```
-
-1. Edit `/etc/gitlab/gitlab.rb` and add the following
-
- ```ruby
- ##
- ## Geo Secondary role
- ## - configure dependent flags automatically to enable Geo
- ##
- roles ['geo_secondary_role']
- ```
-1. Reconfigure GitLab for the changes to take effect:
-
- ```sh
- gitlab-ctl reconfigure
- ```
-
-
-
-### Configure the external replica database
-
-To set up an external replica database, you can either:
-
-- Set up streaming replication yourself (for example, in AWS RDS).
-- Perform the Omnibus configuration manually as follows.
+### Manually configure the replica database

 Make the following configuration changes manually to your `postgresql.conf`
 of external replica database:
@@ -125,18 +87,33 @@ wal_keep_segments = 10
 hot_standby = on
 ```

-### Configure an access to the external replica database
+### Configure **secondary** application servers to use the external read-replica

-1. SSH into a GitLab **secondary** server and login as root:
+With Omnibus, the
+[geo_secondary_role](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
+has three main functions:
+
+1. Configure the replica database.
+1. Configure the tracking database.
+1. Enable the Geo Log Cursor (`geo_logcursor`) (irrelevant to this doc).
+
+So we need to configure the connection to the external read-replica database and enable Log Cursor:
+
+1. SSH into a GitLab **secondary** application server and login as root:

 ```bash
 sudo -i
 ```

-1. Edit `/etc/gitlab/gitlab.rb` with the connection params and credentials for
- the machine with the PostgreSQL replica instance:
+1. Edit `/etc/gitlab/gitlab.rb` and add the following

 ```ruby
+ ##
+ ## Geo Secondary role
+ ## - configure dependent flags automatically to enable Geo
+ ##
+ roles ['geo_secondary_role']
+
 # note this is shared between both databases,
 # make sure you define the same password in both
 gitlab_rails['db_password'] = 'mypassword'
@@ -144,7 +121,6 @@ hot_standby = on
 gitlab_rails['db_username'] = 'gitlab'
 gitlab_rails['db_host'] = 'my-database-read-replica.dbs.com'
 ```
-
 1. Reconfigure GitLab for the changes to take effect:

 ```bash
@@ -156,19 +132,23 @@ hot_standby = on

 **Secondary** nodes use a separate PostgreSQL installation as a tracking
 database to keep track of replication status and automatically recover from
-potential replication issues.
+potential replication issues. Omnibus automatically configures a tracking database
+when `roles ['geo_secondary_role']` is set. For high availability,
+refer to [Geo High Availability](https://docs.gitlab.com/ee/administration/high_availability).
+If you want to run this database external to Omnibus, please follow the instructions below.

-It requires an [FDW](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html)
+The tracking database requires an [FDW](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html)
 connection with the **secondary** replica database for improved performance.

 If you have an external database ready to be used as the tracking database,
 follow the instructions below to use it:

 NOTE: **Note:**
-If you use AWS RDS for tracking database, make sure your tracking database has access to
-secondary database. Unfortunatelly, just assigning the same security group is not enough as
-outbound rules do not apply to RDS PostgreSQL database. So you need to explicitly add inbound
-rule for security group to allow any TCP traffic from tracking database to 5432 port.
+If you want to use AWS RDS as a tracking database, make sure it has access to
+the secondary database. Unfortunately, just assigning the same security group is not enough as
+outbound rules do not apply to RDS PostgreSQL databases. So you need to explicitly add an inbound
+rule to the read-replica's security group allowing any TCP traffic from
+the tracking database on port 5432.


 1. SSH into a GitLab **secondary** server and login as root:
@@ -238,7 +218,8 @@ rule for security group to allow any TCP traffic from tracking database to 5432
 ```

 NOTE: **Note:** The script template above uses `gitlab-psql` as it's intended to be executed from the Geo machine,
- but you can change it to `psql` and run it from any machine that has access to the database.
+ but you can change it to `psql` and run it from any machine that has access to the database. We also recommend using
+ `psql` for AWS RDS.

 1. Restart GitLab:

-- 
GitLab

From 5136fc1fd2e28494b544edfc477126e3f34f0d3d Mon Sep 17 00:00:00 2001
From: Valery Sizov
Date: Thu, 21 Mar 2019 13:40:24 +0200
Subject: [PATCH 4/4] Address review comments

---
 .../geo/replication/external_database.md | 30 +++++--------------
 .../operations/fast_ssh_key_lookup.md | 2 +-
 ...o-in-ha-with-rds-cross-region-replicas.yml | 2 +-
 3 files changed, 10 insertions(+), 24 deletions(-)

diff --git a/doc/administration/geo/replication/external_database.md b/doc/administration/geo/replication/external_database.md
index bbc9886686c4c5..a4981f818d2b6c 100644
--- a/doc/administration/geo/replication/external_database.md
+++ b/doc/administration/geo/replication/external_database.md
@@ -11,7 +11,7 @@ developed and tested. We aim to be compatible with most external

 ## **Primary** node

-1. SSH into your GitLab **primary** server and login as root:
+1. SSH into a GitLab **primary** application server and login as root:

 ```sh
 sudo -i
@@ -87,7 +87,7 @@ wal_keep_segments = 10
 hot_standby = on
 ```

-### Configure **secondary** application servers to use the external read-replica
+### Configure **secondary** application nodes to use the external read-replica

 With Omnibus, the
 [geo_secondary_role](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
@@ -95,9 +95,9 @@ has three main functions:

 1. Configure the replica database.
 1. Configure the tracking database.
-1. Enable the Geo Log Cursor (`geo_logcursor`) (irrelevant to this doc).
+1. Enable the [Geo Log Cursor](index.md#geo-log-cursor) (not covered in this section).

-So we need to configure the connection to the external read-replica database and enable Log Cursor:
+To configure the connection to the external read-replica database and enable Log Cursor:

 1. SSH into a GitLab **secondary** application server and login as root:

@@ -121,12 +121,7 @@ So we need to configure the connection to the external read-replica database and
 gitlab_rails['db_username'] = 'gitlab'
 gitlab_rails['db_host'] = 'my-database-read-replica.dbs.com'
 ```
-1. Reconfigure GitLab for the changes to take effect:
-
- ```bash
- gitlab-ctl reconfigure
- ```
-
+1. Save the file and [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure)

 ### Configure the tracking database

@@ -146,11 +141,10 @@ follow the instructions below to use it:
 NOTE: **Note:**
 If you want to use AWS RDS as a tracking database, make sure it has access to
 the secondary database. Unfortunately, just assigning the same security group is not enough as
-outbound rules do not apply to RDS PostgreSQL databases. So you need to explicitly add an inbound
+outbound rules do not apply to RDS PostgreSQL databases. Therefore, you need to explicitly add an inbound
 rule to the read-replica's security group allowing any TCP traffic from
 the tracking database on port 5432.

-
 1. SSH into a GitLab **secondary** server and login as root:

 ```bash
@@ -170,11 +164,7 @@ the tracking database on port 5432.
 geo_postgresql['enable'] = false # don't use internal managed instance
 ```

-1. Reconfigure GitLab for the changes to take effect:
-
- ```bash
- gitlab-ctl reconfigure
- ```
+1. Save the file and [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure)

 1. Run the tracking database migrations:

@@ -221,11 +211,7 @@ the tracking database on port 5432.
 but you can change it to `psql` and run it from any machine that has access to the database. We also recommend using
 `psql` for AWS RDS.

-1. Restart GitLab:
-
- ```bash
- gitlab-ctl restart
- ```
+1. Save the file and [restart GitLab](../../restart_gitlab.md#omnibus-gitlab-restart)
 1. Populate the FDW tables:

 ```bash
diff --git a/doc/administration/operations/fast_ssh_key_lookup.md b/doc/administration/operations/fast_ssh_key_lookup.md
index 07f6953fc1be3e..9125b468c16d71 100644
--- a/doc/administration/operations/fast_ssh_key_lookup.md
+++ b/doc/administration/operations/fast_ssh_key_lookup.md
@@ -79,7 +79,7 @@ file will still be scanned. So git SSH performance will still be slow for many
 users as long as a large file exists.

 You can disable any more writes to the `authorized_keys` file by unchecking
-`Write to "authorized_keys" file` in the _Admin > Settings > Network > Performance optimization_ of your GitLab
+`Write to "authorized_keys" file` in the **Admin Area > Settings > Network > Performance optimization** of your GitLab
 installation.

 ![Write to authorized keys setting](img/write_to_authorized_keys_setting.png)
diff --git a/ee/changelogs/unreleased/7360-documented-guide-to-using-geo-in-ha-with-rds-cross-region-replicas.yml b/ee/changelogs/unreleased/7360-documented-guide-to-using-geo-in-ha-with-rds-cross-region-replicas.yml
index a7bdcd9cb134e9..5aa74d3deacc25 100644
--- a/ee/changelogs/unreleased/7360-documented-guide-to-using-geo-in-ha-with-rds-cross-region-replicas.yml
+++ b/ee/changelogs/unreleased/7360-documented-guide-to-using-geo-in-ha-with-rds-cross-region-replicas.yml
@@ -1,5 +1,5 @@
 ---
 title: Documented Guide to using Geo in HA with RDS cross-region replicas
-merge_request:
+merge_request: 9985
 author:
 type: changed
-- 
GitLab