From 6a7a15b88354e20f17245187b51104a28c73dcd6 Mon Sep 17 00:00:00 2001
From: Ben Bodenmiller
Date: Fri, 17 Mar 2023 09:20:54 +0000
Subject: [PATCH 1/4] Use helm upgrade --atomic option

---
 doc/troubleshooting/kubernetes_cheat_sheet.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/troubleshooting/kubernetes_cheat_sheet.md b/doc/troubleshooting/kubernetes_cheat_sheet.md
index 6c34793089..0623f7ccd1 100644
--- a/doc/troubleshooting/kubernetes_cheat_sheet.md
+++ b/doc/troubleshooting/kubernetes_cheat_sheet.md
@@ -198,7 +198,7 @@ and they will assist you with any issues you are having.
  helm get values > gitlab.yaml

  # run upgrade itself
- helm upgrade -f gitlab.yaml
+ helm upgrade --atomic -f gitlab.yaml
  ```

  See also [Updating GitLab by using the Helm chart](../installation/upgrade.md).
@@ -209,7 +209,7 @@ and they will assist you with any issues you are having.
 - Run the following command to apply changes:

  ```shell
- helm upgrade -f gitlab.yaml
+ helm upgrade --atomic -f gitlab.yaml
  ```

 - How to get the manifest for a release. It can be useful because it contains the information about
-- 
GitLab

From eac2e31944ac00cfb3f6c3464e6f6f5d8d9aea8f Mon Sep 17 00:00:00 2001
From: Ben Bodenmiller
Date: Fri, 17 Mar 2023 09:22:33 +0000
Subject: [PATCH 2/4] Update upgrade.md

---
 doc/installation/upgrade.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/installation/upgrade.md b/doc/installation/upgrade.md
index c08a20eee8..37f3fc77d3 100644
--- a/doc/installation/upgrade.md
+++ b/doc/installation/upgrade.md
@@ -18,13 +18,13 @@ Ongoing work to support this feature can be tracked via
 [the GitLab Operator epic](https://gitlab.com/groups/gitlab-org/cloud-native/-/epics/52).

 We also recommend that you take a [backup](../backup-restore/index.md) first. 
Also note that you
-must provide all values using `helm upgrade --set key=value` syntax or `-f values.yaml` instead of
+must provide all values using `helm upgrade --atomic --set key=value` syntax or `-f values.yaml` instead of
 using `--reuse-values`, because some of the current values might be deprecated.

 You can retrieve your previous `--set` arguments cleanly, with
 `helm get values `. If you direct this into a file
 (`helm get values > gitlab.yaml`), you can safely pass this
-file via `-f`. Thus `helm upgrade gitlab gitlab/gitlab -f gitlab.yaml`.
+file via `-f`. Thus `helm upgrade --atomic gitlab gitlab/gitlab -f gitlab.yaml`.
 This safely replaces the behavior of `--reuse-values`

 See [mappings](../installation/version_mappings.md) between chart versioning and GitLab versioning.
@@ -58,7 +58,7 @@ The following are the steps to upgrade GitLab to a newer version:
 1. Perform the upgrade, with values extracted in the previous step:

  ```shell
- helm upgrade gitlab gitlab/gitlab \
+ helm upgrade --atomic gitlab gitlab/gitlab \
  --version \
  -f gitlab.yaml \
  --set gitlab.migrations.enabled=true \
-- 
GitLab

From 0e59917df206047f7e958e0c0bde52e4b5355fde Mon Sep 17 00:00:00 2001
From: Achilleas Pipinellis
Date: Wed, 12 Apr 2023 12:31:43 +0200
Subject: [PATCH 3/4] Add new section on --atomic

---
 doc/installation/upgrade.md | 64 +++++++++++++++++--
 doc/troubleshooting/kubernetes_cheat_sheet.md | 4 +-
 2 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/doc/installation/upgrade.md b/doc/installation/upgrade.md
index 37f3fc77d3..cd8272a5a2 100644
--- a/doc/installation/upgrade.md
+++ b/doc/installation/upgrade.md
@@ -18,17 +18,73 @@ Ongoing work to support this feature can be tracked via
 [the GitLab Operator epic](https://gitlab.com/groups/gitlab-org/cloud-native/-/epics/52).

 We also recommend that you take a [backup](../backup-restore/index.md) first. 
Also note that you
-must provide all values using `helm upgrade --atomic --set key=value` syntax or `-f values.yaml` instead of
+must provide all values using `helm upgrade --set key=value` syntax or `-f values.yaml` instead of
 using `--reuse-values`, because some of the current values might be deprecated.

 You can retrieve your previous `--set` arguments cleanly, with
 `helm get values `. If you direct this into a file
 (`helm get values > gitlab.yaml`), you can safely pass this
-file via `-f`. Thus `helm upgrade --atomic gitlab gitlab/gitlab -f gitlab.yaml`.
-This safely replaces the behavior of `--reuse-values`
+file via `-f`. Thus `helm upgrade gitlab gitlab/gitlab -f gitlab.yaml`.
+This safely replaces the behavior of `--reuse-values`.

 See [mappings](../installation/version_mappings.md) between chart versioning and GitLab versioning.

+## Use `--atomic` for automatic reverts when deployments go wrong
+
+WARNING:
+Use the `--atomic` flag at your own risk. It comes with many caveats
+which you should know before upgrading. For more information, see
+the discussion in [merge request 3029](https://gitlab.com/gitlab-org/charts/gitlab/-/merge_requests/3029).
+
+From the [Helm documentation](https://helm.sh/docs/helm/helm_upgrade/#options),
+you can pass the `--atomic` flag during a `helm upgrade`, and the upgrade
+process rolls back changes made in case of failed upgrade. `--atomic` sets
+`--wait` by default, which is based on the `--timeout` flag, which is 5 minutes
+by default. So, depending your environment, you might want to tweak this, if an
+upgrade takes more than 5 minutes. There's also another potentially useful
+flag, `--wait-for-jobs`, which can be used in combination with `--atomic` and
+`--timeout`.
+
+The rollback of the atomic flag might not always bring back the release in a
+functional state. For example, when the migration job upgrades the database
+(partially), manual intervention will be required. 
However, the combination of
+`--wait-for-jobs` and `--atomic` could allow to detect these problems early.
+
+Without `--atomic` you could have a failed deployment and GitLab left in a
+non-working state. With `--atomic` there is a chance you roll back to a working
+version. Using `--atomic` is safest when you only upgrade one version at a
+time, and thus the database is guaranteed to be compatible.
+
+Timeouts, in combination with `--atomic`, can bring the application to a broken
+state . There might be many unexpected scenarios which could make a
+job/pod take much longer than expected.
+
+ By upgrading one minor version at a time, with a long `--timeout`, and
+ skipping the post-deploy migrations, you not only increase the chances of a
+ successful deploy, but you also increase the chances of a successful revert.
+
+Of course, it depends on the migration type, and the set of migrations for a
+specific release.
+
+One single regular migration (also called pre-deployment) should be less than 3
+minutes. One single post-deployment migration should be less than 5 minutes.
+However, background migrations can take days.
+
+When a rollback is performed, a new `migrations` Job is created. In this case,
+if the image is downgraded a version (for example, 15.10.0 to 15.9.3), the
+older code won't know how to revert any pre-deployment migrations that did
+complete as part of the run at the newer version.
+
+Here's the general workflow:
+
+- Configure the appropriate `ENV` variable in `gitlab.migrations.env` which will prevent
+ `rake gitlab:db:configure` from performing post-deployment migrations.
+- Deploy the Helm chart.
+- Await all migrations to complete, and the application to come up.
+- When prepared to operate post-migrations (many checks later), use the
+ `toolbox` pod to run `/scripts/db-migrate`, which then performs all
+ post-migrations (not "batched background migrations"). 
+
 ## Steps

 NOTE:
@@ -58,7 +114,7 @@ The following are the steps to upgrade GitLab to a newer version:
 1. Perform the upgrade, with values extracted in the previous step:

  ```shell
- helm upgrade --atomic gitlab gitlab/gitlab \
+ helm upgrade gitlab gitlab/gitlab \
  --version \
  -f gitlab.yaml \
  --set gitlab.migrations.enabled=true \
diff --git a/doc/troubleshooting/kubernetes_cheat_sheet.md b/doc/troubleshooting/kubernetes_cheat_sheet.md
index 0623f7ccd1..6c34793089 100644
--- a/doc/troubleshooting/kubernetes_cheat_sheet.md
+++ b/doc/troubleshooting/kubernetes_cheat_sheet.md
@@ -198,7 +198,7 @@ and they will assist you with any issues you are having.
  helm get values > gitlab.yaml

  # run upgrade itself
- helm upgrade --atomic -f gitlab.yaml
+ helm upgrade -f gitlab.yaml
  ```

  See also [Updating GitLab by using the Helm chart](../installation/upgrade.md).
@@ -209,7 +209,7 @@ and they will assist you with any issues you are having.
 - Run the following command to apply changes:

  ```shell
- helm upgrade --atomic -f gitlab.yaml
+ helm upgrade -f gitlab.yaml
  ```

 - How to get the manifest for a release. It can be useful because it contains the information about
-- 
GitLab

From ef3e1c5543eb73d6bd0e9136373beae7f0fb5568 Mon Sep 17 00:00:00 2001
From: Achilleas Pipinellis
Date: Fri, 9 Jun 2023 09:56:17 +0000
Subject: [PATCH 4/4] Apply 4 suggestion(s) to 1 file(s)

---
 doc/installation/upgrade.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/doc/installation/upgrade.md b/doc/installation/upgrade.md
index cd8272a5a2..d17baed45d 100644
--- a/doc/installation/upgrade.md
+++ b/doc/installation/upgrade.md
@@ -38,7 +38,7 @@ the discussion in [merge request 3029](https://gitlab.com/gitlab-org/charts/gitl

 From the [Helm documentation](https://helm.sh/docs/helm/helm_upgrade/#options),
 you can pass the `--atomic` flag during a `helm upgrade`, and the upgrade
-process rolls back changes made in case of failed upgrade. 
`--atomic` sets
+process rolls back changes made in case workloads do not become ready in time. `--atomic` sets
 `--wait` by default, which is based on the `--timeout` flag, which is 5 minutes
 by default. So, depending your environment, you might want to tweak this, if an
 upgrade takes more than 5 minutes. There's also another potentially useful
@@ -46,11 +46,11 @@ flag, `--wait-for-jobs`, which can be used in combination with `--atomic` and
 `--timeout`.

 The rollback of the atomic flag might not always bring back the release in a
-functional state. For example, when the migration job upgrades the database
-(partially), manual intervention will be required. However, the combination of
+functional state. For example, if the migration job upgrades the database
+partially, manual intervention may be required. However, the combination of
 `--wait-for-jobs` and `--atomic` could allow to detect these problems early.

-Without `--atomic` you could have a failed deployment and GitLab left in a
+Without `--atomic` you could have a failed deployment and be left with GitLab in a
 non-working state. With `--atomic` there is a chance you roll back to a working
 version. Using `--atomic` is safest when you only upgrade one version at a
 time, and thus the database is guaranteed to be compatible.
@@ -59,9 +59,9 @@ Timeouts, in combination with `--atomic`, can bring the application to a broken
 state . There might be many unexpected scenarios which could make a
 job/pod take much longer than expected.

- By upgrading one minor version at a time, with a long `--timeout`, and
- skipping the post-deploy migrations, you not only increase the chances of a
- successful deploy, but you also increase the chances of a successful revert.
+By upgrading one minor version at a time, with a long `--timeout`, and
+skipping the post-deploy migrations, you not only increase the chances of a
+successful deploy, but you also increase the chances of a successful revert.

 Of course, it depends on the migration type, and the set of migrations for a
 specific release.
-- 
GitLab