From 7ed9e7d2147a464724550fa3bd70ee951f0ee9a9 Mon Sep 17 00:00:00 2001 From: Mike Kozono Date: Fri, 16 Jun 2023 18:32:39 -1000 Subject: [PATCH 01/13] Move back up specific sections to back up doc And add more introductory content to "Back up GitLab". --- doc/administration/gitaly/index.md | 2 +- doc/administration/postgresql/pgbouncer.md | 4 +- .../postgresql/replication_and_failover.md | 2 +- doc/administration/raketasks/check.md | 2 +- doc/administration/raketasks/storage.md | 4 +- doc/ci/variables/index.md | 2 +- doc/install/docker.md | 2 +- doc/raketasks/backup_gitlab.md | 1056 +++++++++++++++-- doc/raketasks/backup_restore.md | 876 +------------- doc/raketasks/restore_gitlab.md | 43 +- doc/user/project/repository/mirror/index.md | 2 +- 11 files changed, 1044 insertions(+), 951 deletions(-) diff --git a/doc/administration/gitaly/index.md b/doc/administration/gitaly/index.md index 2a3c3da24b3991..8c351faa612926 100644 --- a/doc/administration/gitaly/index.md +++ b/doc/administration/gitaly/index.md @@ -175,7 +175,7 @@ best suited by using Gitaly Cluster. ### Backing up repositories -When backing up or syncing repositories using tools other than GitLab, you must [prevent writes](../../raketasks/backup_restore.md#prevent-writes-and-copy-the-git-repository-data) +When backing up or syncing repositories using tools other than GitLab, you must [prevent writes](../../raketasks/backup_gitlab.md#prevent-writes-and-copy-the-git-repository-data) while copying repository data. ## Gitaly Cluster diff --git a/doc/administration/postgresql/pgbouncer.md b/doc/administration/postgresql/pgbouncer.md index d23118705cb2d4..fec4ab47bd0313 100644 --- a/doc/administration/postgresql/pgbouncer.md +++ b/doc/administration/postgresql/pgbouncer.md @@ -88,7 +88,7 @@ This content has been moved to a [new location](replication_and_failover.md#conf Do not backup or restore GitLab through a PgBouncer connection: it causes a GitLab outage. -[Read more about this and how to reconfigure backups](../../raketasks/backup_restore.md#back-up-and-restore-for-installations-using-pgbouncer). +[Read more about this and how to reconfigure backups](../../raketasks/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer). ## Enable Monitoring @@ -178,7 +178,7 @@ ote_pid | tls Some database changes have to be done directly, and not through PgBouncer. -The main affected tasks are [database restores](../../raketasks/backup_restore.md#back-up-and-restore-for-installations-using-pgbouncer) +The main affected tasks are [database restores](../../raketasks/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer) and [GitLab upgrades with database migrations](../../update/zero_downtime.md#postgresql). 1. To find the primary node, run the following on a database node: diff --git a/doc/administration/postgresql/replication_and_failover.md b/doc/administration/postgresql/replication_and_failover.md index 392f9f2b89c67a..46b4a4f17a828e 100644 --- a/doc/administration/postgresql/replication_and_failover.md +++ b/doc/administration/postgresql/replication_and_failover.md @@ -571,7 +571,7 @@ in the Troubleshooting section before proceeding. Do not backup or restore GitLab through a PgBouncer connection: this causes a GitLab outage. -[Read more about this and how to reconfigure backups](../../raketasks/backup_restore.md#back-up-and-restore-for-installations-using-pgbouncer). +[Read more about this and how to reconfigure backups](../../raketasks/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer). 
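For example, one way to bypass PgBouncer for a backup is to point the backup task directly at the PostgreSQL primary using the `GITLAB_BACKUP_*` environment variables described in the backup documentation. This is a sketch only; replace the host and port with your primary database node's address:

```shell
# Run the backup against the PostgreSQL primary directly, bypassing PgBouncer.
# 192.168.1.10 and 5432 are placeholders for your primary's host and port.
sudo GITLAB_BACKUP_PGHOST=192.168.1.10 GITLAB_BACKUP_PGPORT=5432 /opt/gitlab/bin/gitlab-backup create
```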
### Ensure GitLab is running diff --git a/doc/administration/raketasks/check.md b/doc/administration/raketasks/check.md index e55a0f1c8a7954..4eed85abc6d428 100644 --- a/doc/administration/raketasks/check.md +++ b/doc/administration/raketasks/check.md @@ -215,7 +215,7 @@ secrets file (`gitlab-secrets.json`). Automatic resolution is not yet implemented. If you have values that cannot be decrypted, you can follow steps to reset them, see our -documentation on what to do [when the secrets file is lost](../../raketasks/backup_restore.md#when-the-secrets-file-is-lost). +documentation on what to do [when the secrets file is lost](../../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost). This can take a very long time, depending on the size of your database, as it checks all rows in all tables. diff --git a/doc/administration/raketasks/storage.md b/doc/administration/raketasks/storage.md index 3664a79bf43264..6080896ec2f864 100644 --- a/doc/administration/raketasks/storage.md +++ b/doc/administration/raketasks/storage.md @@ -244,9 +244,9 @@ If destroying the project generates a stack trace relating to encryption or the 1. [Verify your GitLab secrets](check.md#verify-database-values-can-be-decrypted-using-the-current-secrets). 1. If the affected projects have secrets that cannot be decrypted it will be necessary to remove those specific secrets. - [Our documentation for dealing with lost secrets](../../raketasks/backup_restore.md#when-the-secrets-file-is-lost) + [Our documentation for dealing with lost secrets](../../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost) is for loss of all secrets, but it's possible for specific projects to be affected. For example, - to [reset specific runner registration tokens](../../raketasks/backup_restore.md#reset-runner-registration-tokens) + to [reset specific runner registration tokens](../../raketasks/backup_gitlab.md#reset-runner-registration-tokens) for a specific project ID: ```sql diff --git a/doc/ci/variables/index.md b/doc/ci/variables/index.md index 7b6ba36e35d89f..c51aaec0a5bee1 100644 --- a/doc/ci/variables/index.md +++ b/doc/ci/variables/index.md @@ -246,7 +246,7 @@ malicious code can compromise both masked and protected variables. Variable values are encrypted using [`aes-256-cbc`](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) and stored in the database. This data can only be read and decrypted with a -valid [secrets file](../../raketasks/backup_restore.md#when-the-secrets-file-is-lost). +valid [secrets file](../../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost). ### Mask a CI/CD variable diff --git a/doc/install/docker.md b/doc/install/docker.md index ab1aec98b1670c..299de1e2fe2ba3 100644 --- a/doc/install/docker.md +++ b/doc/install/docker.md @@ -619,7 +619,7 @@ to back up the `gitlab.rb` file. WARNING: [Backing up the GitLab secrets file](../raketasks/backup_gitlab.md#storing-configuration-files) is required -to avoid [complicated steps](../raketasks/backup_restore.md#when-the-secrets-file-is-lost) when recovering +to avoid [complicated steps](../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost) when recovering GitLab from backup. The secrets file is stored at `/etc/gitlab/gitlab-secrets.json` inside the container, or `$GITLAB_HOME/config/gitlab-secrets.json` [on the container host](#set-up-the-volumes-location). 
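For example, a minimal sketch of copying the secrets file from the container host to separate, secure storage. The destination path is an assumption; use whichever off-host location you control:

```shell
# Copy the secrets file from the mounted config volume to a secure location
# outside the container host. /secure/offsite-backups is a placeholder path.
sudo cp $GITLAB_HOME/config/gitlab-secrets.json /secure/offsite-backups/gitlab-secrets.json
```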
diff --git a/doc/raketasks/backup_gitlab.md b/doc/raketasks/backup_gitlab.md index a6635c589aa94e..9ed1bc33824a51 100644 --- a/doc/raketasks/backup_gitlab.md +++ b/doc/raketasks/backup_gitlab.md @@ -6,6 +6,120 @@ info: To determine the technical writer assigned to the Stage/Group associated w # Back up GitLab **(FREE SELF)** +The exact procedure for backing up GitLab depends on many factors. Your particular deployment's usage and configuration determine what kind of data exists, where it is located, and how much there is. These factors influence your options for how to perform a back up, how to store it, and how to restore it. + +## What data needs to be backed up? + +- [PostgreSQL databases](#postgresql-databases) +- [Git repositories](#git-repositories) +- [Blobs](#blobs) +- [Storing configuration files](#storing-configuration-files) +- [Other data](#other-data) + +### PostgreSQL databases + +In the simplest case, GitLab has one PostgreSQL database in one PostgreSQL server on the same VM as all other GitLab services. But depending on configuration, GitLab may use multiple PostgreSQL databases in multiple PostgreSQL servers. + +In general, this data is the single source of truth for most user-generated content in the Web interface, such as issue and merge request content, comments, permissions, and credentials. + +PostgreSQL also holds some cached data like HTML-rendered Markdown, and by default, merge request diffs. +Although merge request diffs can also be configured to be offloaded to the file system or object storage, see [Blobs](#blobs). + +Gitaly Cluster's Praefect service uses a PostgreSQL database as a single source of truth to manage its Gitaly nodes. + +A common PostgreSQL utility, [`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html), produces a backup file which can be used to restore a PostgreSQL database. The [backup command](#backup-command) uses this utility under the hood. + +Unfortunately, `pg_dump` takes longer to execute, the larger the database. Depending on your situation, the duration becomes impractical at some point (days, for example). If your database is over 100GB, `pg_dump`, and by extension the [backup command](#backup-command), is likely not usable. + +### Git repositories + +A GitLab instance can have one or more repository shards. Each shard is a Gitaly instance or Gitaly Cluster that +is responsible for allowing access and operations on the locally stored Git repositories. Gitaly can run +on a machine: + +- With a single disk. +- With multiple disks mounted as a single mount-point (like with a RAID array). +- Using LVM. + +Gitaly may work with NFS or a mounted Storage Appliance, but it is [not officially supported](../administration/nfs.md#gitaly-with-nfs-not-supported) because Git requires low latency. + +Each project can have up to 3 different repositories: + +- A project repository, where the source code is stored. +- A wiki repository, where the wiki content is stored. +- A design repository, where design artifacts are indexed (assets are actually in LFS). + +They all live in the same shard and share the same base name with a `-wiki` and `-design` suffix +for Wiki and Design Repository cases. + +Personal and project snippets, and group wiki content, are stored in Git repositories. + +Project forks are deduplicated in live a GitLab site using pool repositories. + +The [backup command](#backup-command) produces a Git bundle for each repository and tars them all up. This duplicates pool repository data into every fork. 
For more information, see [Alternative backup strategies](#alternative-backup-strategies). + +### Blobs + +GitLab stores blobs (or files) such as issue attachments or LFS objects into either: + +- The file system in a specific location. +- An [Object Storage](../administration/object_storage.md) solution. Object Storage solutions can be: + - Cloud based like Amazon S3 Google Cloud Storage. + - Hosted by you (like MinIO). + - A Storage Appliance that exposes an Object Storage-compatible API. + +### Storing configuration files + +The [backup Rake task](#back-up-gitlab) GitLab provides does _not_ store your +configuration files. The primary reason for this is that your database contains +items including encrypted information for two-factor authentication and the +CI/CD _secure variables_. Storing encrypted information in the same location +as its key defeats the purpose of using encryption in the first place. + +WARNING: +The secrets file is essential to preserve your database encryption key. + +At the very **minimum**, you must back up: + +For Omnibus: + +- `/etc/gitlab/gitlab-secrets.json` +- `/etc/gitlab/gitlab.rb` + +For installation from source: + +- `/home/git/gitlab/config/secrets.yml` +- `/home/git/gitlab/config/gitlab.yml` + +For [Docker installations](../install/docker.md), you must +back up the volume where the configuration files are stored. If you created +the GitLab container according to the documentation, it should be in the +`/srv/gitlab/config` directory. + +For [GitLab Helm chart installations](https://gitlab.com/gitlab-org/charts/gitlab) +on a Kubernetes cluster, you must follow the +[Back up the secrets](https://docs.gitlab.com/charts/backup-restore/backup.html#back-up-the-secrets) +instructions. + +You may also want to back up any TLS keys and certificates (`/etc/gitlab/ssl`, `/etc/gitlab/trusted-certs`), and your +[SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079) +to avoid man-in-the-middle attack warnings if you have to perform a full machine restore. + +If you use Omnibus GitLab, review additional information to +[backup your configuration](https://docs.gitlab.com/omnibus/settings/backups.html). + +In the unlikely event that the secrets file is lost, see the +[troubleshooting section](#when-the-secrets-file-is-lost). + +### Other data + +GitLab uses Redis both as a cache store and to hold persistent data for our background jobs system, Sidekiq. The provided [backup command](#backup-command) does _not_ back up Redis data. This means to take a consistent backup with the [backup command](#backup-command), there must be no pending or running background jobs. It is possible to manually back up Redis. + +Elasticsearch is an optional database for advanced search. It can improve search +in both source-code level, and user generated content in issues, merge requests, and discussions. The [backup command](#backup-command) does _not_ back up Elasticsearch data. Elasticsearch data can be regenerated from PostgreSQL data after a restore. It is possible to manually back up Elasticsearch. + +## Command line interface + GitLab provides a command line interface to back up your entire instance, including: @@ -32,8 +146,31 @@ WARNING: GitLab does not back up any configuration files (`/etc/gitlab`), TLS keys and certificates, or system files. You are highly advised to read about [storing configuration files](#storing-configuration-files). 
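For example, a minimal sketch of archiving the Omnibus configuration directory (which contains `gitlab.rb` and `gitlab-secrets.json`) separately from the application backup. The destination path is an assumption; keep it apart from the application backup archives:

```shell
# Archive /etc/gitlab, including the secrets file, to a separate secure location.
# /secure/location is a placeholder; store configuration backups away from the data backup.
sudo tar -czf /secure/location/etc-gitlab-$(date +%s).tar.gz -C / etc/gitlab
```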
+### Requirements + +To be able to back up and restore, ensure that Rsync is installed on your +system. If you installed GitLab: + +- _Using the Omnibus package_, Rsync is already installed. +- _From source_, check if `rsync` is installed. If Rsync is not installed, install it. For example: + + ```shell + # Debian/Ubuntu + sudo apt-get install rsync + + # RHEL/CentOS + sudo yum install rsync + ``` + +### Backup command + WARNING: -The backup command requires [additional parameters](backup_restore.md#back-up-and-restore-for-installations-using-pgbouncer) when +GitLab doesn't back up items that aren't stored on the file system. If you're +using [object storage](../administration/object_storage.md), be sure to enable +backups with your object storage provider, if desired. + +WARNING: +The backup command requires [additional parameters](#back-up-and-restore-for-installations-using-pgbouncer) when your installation is using PgBouncer, for either performance reasons or when using it with a Patroni cluster. WARNING: @@ -41,6 +178,13 @@ Before GitLab 15.5.0, the backup command doesn't verify if another backup is alr [issue 362593](https://gitlab.com/gitlab-org/gitlab/-/issues/362593). We strongly recommend you make sure that all backups are complete before starting a new one. +An application data backup creates an archive file that contains the database, +all repositories and all attachments. + +You can only restore a backup to **exactly the same version and type (CE/EE)** +of GitLab on which it was created. The best way to +[migrate your projects from one server to another](backup_restore.md#migrate-to-a-new-server) is through a backup and restore. + Depending on your version of GitLab, use the following command if you installed GitLab using the Omnibus package: @@ -117,55 +261,23 @@ Deleting tmp directories...[DONE] Deleting old backups... [SKIPPING] ``` -## Storing configuration files - -The [backup Rake task](#back-up-gitlab) GitLab provides does _not_ store your -configuration files. The primary reason for this is that your database contains -items including encrypted information for two-factor authentication and the -CI/CD _secure variables_. Storing encrypted information in the same location -as its key defeats the purpose of using encryption in the first place. - -WARNING: -The secrets file is essential to preserve your database encryption key. - -At the very **minimum**, you must back up: - -For Omnibus: - -- `/etc/gitlab/gitlab-secrets.json` -- `/etc/gitlab/gitlab.rb` - -For installation from source: - -- `/home/git/gitlab/config/secrets.yml` -- `/home/git/gitlab/config/gitlab.yml` - -For [Docker installations](../install/docker.md), you must -back up the volume where the configuration files are stored. If you created -the GitLab container according to the documentation, it should be in the -`/srv/gitlab/config` directory. - -For [GitLab Helm chart installations](https://gitlab.com/gitlab-org/charts/gitlab) -on a Kubernetes cluster, you must follow the -[Back up the secrets](https://docs.gitlab.com/charts/backup-restore/backup.html#back-up-the-secrets) -instructions. - -You may also want to back up any TLS keys and certificates (`/etc/gitlab/ssl`, `/etc/gitlab/trusted-certs`), and your -[SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079) -to avoid man-in-the-middle attack warnings if you have to perform a full machine restore. 
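After the backup command completes, you can confirm the archive was written before relying on it. A minimal check, assuming the default Omnibus backup path of `/var/opt/gitlab/backups`:

```shell
# List backup archives; /var/opt/gitlab/backups is the default Omnibus backup_path.
sudo ls -lh /var/opt/gitlab/backups/
```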
+### Backup timestamp -If you use Omnibus GitLab, review additional information to -[backup your configuration](https://docs.gitlab.com/omnibus/settings/backups.html). +The backup archive is saved in `backup_path`, which is specified in the +`config/gitlab.yml` file. The filename is `[TIMESTAMP]_gitlab_backup.tar`, +where `TIMESTAMP` identifies the time at which each backup was created, plus +the GitLab version. The timestamp is needed if you need to restore GitLab and +multiple backups are available. -In the unlikely event that the secrets file is lost, see the -[troubleshooting section](backup_restore.md#when-the-secrets-file-is-lost). +For example, if the backup name is `1493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar`, +the timestamp is `1493107454_2018_04_25_10.6.4-ce`. -## Backup options +### Backup options The command line tool GitLab provides to backup your instance can accept more options. -### Backup strategy option +#### Backup strategy option The default backup strategy is to essentially stream data from the respective data locations to the backup using the Linux command `tar` and `gzip`. This works @@ -190,14 +302,14 @@ sudo gitlab-backup create STRATEGY=copy Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. -### Backup filename +#### Backup filename WARNING: If you use a custom backup filename, you can't [limit the lifetime of the backups](#limit-backup-lifetime-for-local-files-prune-old-backups). By default, a backup file is created according to the specification in the -previous [Backup timestamp](backup_restore.md#backup-timestamp) section. You can, however, +previous [Backup timestamp](#backup-timestamp) section. You can, however, override the `[TIMESTAMP]` portion of the filename by setting the `BACKUP` environment variable. For example: @@ -211,7 +323,7 @@ The resulting file is named `dump_gitlab_backup.tar`. This is useful for systems that make use of rsync and incremental backups, and results in considerably faster transfer speeds. -### Confirm archive can be transferred +#### Confirm archive can be transferred To ensure the generated archive is transferable by rsync, you can set the `GZIP_RSYNCABLE=yes` option. This sets the `--rsyncable` option to `gzip`, which is useful only in @@ -227,7 +339,7 @@ sudo gitlab-backup create BACKUP=dump GZIP_RSYNCABLE=yes Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. -### Excluding specific directories from the backup +#### Excluding specific directories from the backup You can exclude specific directories from the backup by adding the environment variable `SKIP`, whose values are a comma-separated list of the following options: @@ -269,7 +381,7 @@ sudo -u git -H bundle exec rake gitlab:backup:create SKIP=db,uploads RAILS_ENV=p - [Skip creation of the tar file](#skipping-tar-creation) (`SKIP=tar`). - [Skip uploading the backup to remote storage](#skip-uploading-backups-to-remote-storage) (`SKIP=remote`). -### Skipping tar creation +#### Skipping tar creation NOTE: It is not possible to skip the tar creation when using [object storage](#upload-backups-to-a-remote-cloud-storage) for backups. @@ -297,7 +409,7 @@ For installations from source: sudo -u git -H bundle exec rake gitlab:backup:create SKIP=tar RAILS_ENV=production ``` -### Back up Git repositories concurrently +#### Back up Git repositories concurrently > - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37158) in GitLab 13.3. 
> - [Concurrent restore introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/69330) in GitLab 14.3 @@ -327,7 +439,7 @@ For example, for installations from source: sudo -u git -H bundle exec rake gitlab:backup:create GITLAB_BACKUP_MAX_CONCURRENCY=4 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1 ``` -### Incremental repository backups +#### Incremental repository backups > - Introduced in GitLab 14.9 [with a flag](../administration/feature_flags.md) named `incremental_repository_backup`. Disabled by default. > - [Enabled on self-managed](https://gitlab.com/gitlab-org/gitlab/-/issues/355945) in GitLab 14.10. @@ -350,7 +462,7 @@ to create an incremental backup from: - In GitLab 14.9 and 14.10, use the `BACKUP=` option to choose the backup to use. The chosen previous backup is overwritten. - In GitLab 15.0 and later, use the `PREVIOUS_BACKUP=` option to choose the backup to use. By default, a backup file is created - as documented in the [Backup timestamp](backup_restore.md#backup-timestamp) section. You can override the `[TIMESTAMP]` portion of the filename by setting the + as documented in the [Backup timestamp](#backup-timestamp) section. You can override the `[TIMESTAMP]` portion of the filename by setting the [`BACKUP` environment variable](#backup-filename). To create an incremental backup, run: @@ -373,7 +485,7 @@ To create an [untarred](#skipping-tar-creation) incremental backup from a tarred sudo gitlab-backup create INCREMENTAL=yes SKIP=tar ``` -### Back up specific repository storages +#### Back up specific repository storages > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/86896) in GitLab 15.0. @@ -394,7 +506,7 @@ For example, for installations from source: sudo -u git -H bundle exec rake gitlab:backup:create REPOSITORIES_STORAGES=storage1,storage2 ``` -### Back up specific repositories +#### Back up specific repositories > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/88094) in GitLab 15.1. @@ -419,7 +531,7 @@ and skip the **Project D** in **Group A** (`group-a/project-d`): sudo -u git -H bundle exec rake gitlab:backup:create REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d ``` -### Upload backups to a remote (cloud) storage +#### Upload backups to a remote (cloud) storage NOTE: It is not possible to [skip the tar creation](#skipping-tar-creation) when using object storage for backups. @@ -433,7 +545,7 @@ for AWS, Google, and Aliyun. A local driver is [Read more about using object storage with GitLab](../administration/object_storage.md). -#### Using Amazon S3 +##### Using Amazon S3 For Omnibus GitLab packages: @@ -456,7 +568,7 @@ For Omnibus GitLab packages: 1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) for the changes to take effect -#### S3 Encrypted Buckets +##### S3 Encrypted Buckets > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64765) in GitLab 14.3. @@ -469,7 +581,7 @@ AWS supports these [modes for server side encryption](https://docs.aws.amazon.co Use your mode of choice with GitLab. Each mode has similar, but slightly different, configuration methods. -##### SSE-S3 +###### SSE-S3 To enable SSE-S3, in the backup storage options set the `server_side_encryption` field to `AES256`. 
For example, in Omnibus GitLab: @@ -480,7 +592,7 @@ gitlab_rails['backup_upload_storage_options'] = { } ``` -##### SSE-KMS +###### SSE-KMS To enable SSE-KMS, you need the [KMS key via its Amazon Resource Name (ARN) in the `arn:aws:kms:region:acct-id:key/key-id` format](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html). @@ -498,7 +610,7 @@ gitlab_rails['backup_upload_storage_options'] = { } ``` -##### SSE-C +###### SSE-C SSE-C requires you to set these encryption options: @@ -520,7 +632,7 @@ For example: gitlab_rails['env'] = { 'GITLAB_BACKUP_ENCRYPTION_KEY' => "\xDE\xAD\xBE\xEF" * 8 } ``` -#### Digital Ocean Spaces +##### Digital Ocean Spaces This example can be used for a bucket in Amsterdam (AMS3): @@ -545,7 +657,7 @@ the cause may be the use of backup encryption. Because Digital Ocean Spaces doesn't support encryption, remove or comment the line that contains `gitlab_rails['backup_encryption']`. -#### Other S3 Providers +##### Other S3 Providers Not all S3 providers are fully compatible with the Fog library. For example, if you see a `411 Length Required` error message after attempting to upload, @@ -646,7 +758,7 @@ with the name of your bucket: } ``` -#### Using Google Cloud Storage +##### Using Google Cloud Storage To use Google Cloud Storage to save backups, you must first create an access key from the Google console: @@ -698,7 +810,7 @@ For installations from source: 1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) for the changes to take effect -#### Using Azure Blob storage +##### Using Azure Blob storage > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/25877) in GitLab 13.4. @@ -738,7 +850,7 @@ For installations from source: For more details, see the [table of Azure parameters](../administration/object_storage.md#azure-blob-storage). -#### Specifying a custom directory for backups +##### Specifying a custom directory for backups This option works only for remote storage. If you want to group your backups, you can pass a `DIRECTORY` environment variable: @@ -750,7 +862,7 @@ sudo gitlab-backup create DIRECTORY=weekly Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. -### Skip uploading backups to remote storage +#### Skip uploading backups to remote storage If you have configured GitLab to [upload backups in a remote storage](#upload-backups-to-a-remote-cloud-storage), you can use the `SKIP=remote` option to skip uploading your backups to the remote storage. @@ -767,7 +879,7 @@ For installations from source: sudo -u git -H bundle exec rake gitlab:backup:create SKIP=remote RAILS_ENV=production ``` -### Upload to locally-mounted shares +#### Upload to locally-mounted shares You can send backups to a locally-mounted share (for example, `NFS`,`CIFS`, or `SMB`) using the Fog [`Local`](https://github.com/fog/fog-local#usage) storage provider. @@ -786,7 +898,7 @@ When mounted, the directory set in the `local_root` key must be owned by either: Because file system performance may affect overall GitLab performance, [we don't recommend using cloud-based file systems for storage](../administration/nfs.md#avoid-using-cloud-based-file-systems). -#### Avoid conflicting configuration +##### Avoid conflicting configuration Don't set the following configuration keys to the same path: @@ -800,7 +912,7 @@ If these configuration keys are set to the same location, the upload feature fai the upload location. 
This failure causes the upload feature to delete the backup because it assumes it's a residual file remaining after the failed upload attempt. -#### Configure uploads to locally-mounted shares +##### Configure uploads to locally-mounted shares For Omnibus GitLab packages: @@ -839,7 +951,7 @@ For installations from source: 1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) for the changes to take effect. -### Backup archive permissions +#### Backup archive permissions The backup archives created by GitLab (`1393513186_2014_02_27_gitlab_backup.tar`) have the owner/group `git`/`git` and 0600 permissions by default. This is @@ -870,7 +982,7 @@ For installations from source: 1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) for the changes to take effect. -### Configuring cron to make daily backups +#### Configuring cron to make daily backups WARNING: The following cron jobs do not [back up your GitLab configuration files](#storing-configuration-files) @@ -914,7 +1026,7 @@ The `CRON=1` environment setting directs the backup script to hide all progress output if there aren't any errors. This is recommended to reduce cron spam. When troubleshooting backup problems, however, replace `CRON=1` with `--trace` to log verbosely. -## Limit backup lifetime for local files (prune old backups) +#### Limit backup lifetime for local files (prune old backups) WARNING: The process described in this section doesn't work if you used a [custom filename](#backup-filename) @@ -954,3 +1066,805 @@ For installations from source: 1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) for the changes to take effect. + +#### Back up and restore for installations using PgBouncer + +Do not back up or restore GitLab through a PgBouncer connection. These +tasks must [bypass PgBouncer and connect directly to the PostgreSQL primary database node](#bypassing-pgbouncer), +or they cause a GitLab outage. + +When the GitLab backup or restore task is used with PgBouncer, the +following error message is shown: + +```ruby +ActiveRecord::StatementInvalid: PG::UndefinedTable +``` + +Each time the GitLab backup runs, GitLab starts generating 500 errors and errors about missing +tables will [be logged by PostgreSQL](../administration/logs/index.md#postgresql-logs): + +```plaintext +ERROR: relation "tablename" does not exist at character 123 +``` + +This happens because the task uses `pg_dump`, which +[sets a null search path and explicitly includes the schema in every SQL query](https://gitlab.com/gitlab-org/gitlab/-/issues/23211) +to address [CVE-2018-1058](https://www.postgresql.org/about/news/postgresql-103-968-9512-9417-and-9322-released-1834/). + +Since connections are reused with PgBouncer in transaction pooling mode, +PostgreSQL fails to search the default `public` schema. As a result, +this clearing of the search path causes tables and columns to appear +missing. + +##### Bypassing PgBouncer + +There are two ways to fix this: + +1. [Use environment variables to override the database settings](#environment-variable-overrides) for the backup task. +1. Reconfigure a node to [connect directly to the PostgreSQL primary database node](../administration/postgresql/pgbouncer.md#procedure-for-bypassing-pgbouncer). + +###### Environment variable overrides + +By default, GitLab uses the database configuration stored in a +configuration file (`database.yml`). 
However, you can override the database settings +for the backup and restore task by setting environment +variables that are prefixed with `GITLAB_BACKUP_`: + +- `GITLAB_BACKUP_PGHOST` +- `GITLAB_BACKUP_PGUSER` +- `GITLAB_BACKUP_PGPORT` +- `GITLAB_BACKUP_PGPASSWORD` +- `GITLAB_BACKUP_PGSSLMODE` +- `GITLAB_BACKUP_PGSSLKEY` +- `GITLAB_BACKUP_PGSSLCERT` +- `GITLAB_BACKUP_PGSSLROOTCERT` +- `GITLAB_BACKUP_PGSSLCRL` +- `GITLAB_BACKUP_PGSSLCOMPRESSION` + +For example, to override the database host and port to use 192.168.1.10 +and port 5432 with the Omnibus package: + +```shell +sudo GITLAB_BACKUP_PGHOST=192.168.1.10 GITLAB_BACKUP_PGPORT=5432 /opt/gitlab/bin/gitlab-backup create +``` + +See the [PostgreSQL documentation](https://www.postgresql.org/docs/12/libpq-envars.html) +for more details on what these parameters do. + +#### `gitaly-backup` for repository backup and restore + +> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.2. +> - [Deployed behind a feature flag](../user/feature_flags.md), enabled by default. +> - [Generally available](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.10. [Feature flag `gitaly_backup`](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/83254) removed. + +The `gitaly-backup` binary is used by the backup Rake task to create and restore repository backups from Gitaly. +`gitaly-backup` replaces the previous backup method that directly calls RPCs on Gitaly from GitLab. + +The backup Rake task must be able to find this executable. In most cases, you don't need to change +the path to the binary as it should work fine with the default path `/opt/gitlab/embedded/bin/gitaly-backup`. +If you have a specific reason to change the path, it can be configured in Omnibus GitLab packages: + +1. Add the following to `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['backup_gitaly_backup_path'] = '/path/to/gitaly-backup' + ``` + +1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) + for the changes to take effect. + +## Alternative backup strategies + +In the following cases, consider using file system data transfer or snapshots as part of your backup strategy: + +- Your GitLab instance contains a lot of Git repository data and the GitLab backup script is too slow. +- Your GitLab instance has a lot of forked projects and the regular backup task duplicates the Git data for all of them. +- Your GitLab instance has a problem and using the regular backup and import Rake tasks isn't possible. + +WARNING: +Gitaly Cluster [does not support snapshot backups](../administration/gitaly/index.md#snapshot-backup-and-recovery-limitations). + +When considering using file system data transfer or snapshots: + +- Don't use these methods to migrate from one operating system to another. The operating systems of the source and destination should be as similar as possible. For example, + don't use these methods to migrate from Ubuntu to Fedora. +- Data consistency is very important. We recommend stopping GitLab with `sudo gitlab-ctl stop` before taking doing a file system transfer (with rsync, for example) or taking a + snapshot. + +Example: Amazon Elastic Block Store (EBS) + +> A GitLab server using Omnibus GitLab hosted on Amazon AWS. +> An EBS drive containing an ext4 file system is mounted at `/var/opt/gitlab`. +> In this case you could make an application backup by taking an EBS snapshot. +> The backup includes all repositories, uploads and PostgreSQL data. 
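A minimal sketch of that EBS approach using the AWS CLI. The volume ID is a placeholder, and GitLab is stopped first so the snapshot is consistent, as recommended above:

```shell
# Stop GitLab so the file system is consistent, snapshot the EBS volume, then restart.
# vol-0123456789abcdef0 is a placeholder for the volume mounted at /var/opt/gitlab.
sudo gitlab-ctl stop
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "GitLab /var/opt/gitlab backup"
sudo gitlab-ctl start
```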
+ +Example: Logical Volume Manager (LVM) snapshots + rsync + +> A GitLab server using Omnibus GitLab, with an LVM logical volume mounted at `/var/opt/gitlab`. +> Replicating the `/var/opt/gitlab` directory using rsync would not be reliable because too many files would change while rsync is running. +> Instead of rsync-ing `/var/opt/gitlab`, we create a temporary LVM snapshot, which we mount as a read-only file system at `/mnt/gitlab_backup`. +> Now we can have a longer running rsync job which creates a consistent replica on the remote server. +> The replica includes all repositories, uploads and PostgreSQL data. + +If you're running GitLab on a virtualized server, you can possibly also create +VM snapshots of the entire GitLab server. It's not uncommon however for a VM +snapshot to require you to power down the server, which limits this solution's +practical use. + +### Back up repository data separately + +First, ensure you back up existing GitLab data while [skipping repositories](#excluding-specific-directories-from-the-backup): + +```shell +# for Omnibus GitLab package installations +sudo gitlab-backup create SKIP=repositories + +# for installations from source: +sudo -u git -H bundle exec rake gitlab:backup:create SKIP=repositories RAILS_ENV=production +``` + +For manually backing up the Git repository data on disk, there are multiple possible strategies: + +- Use snapshots, such as the previous examples of Amazon EBS drive snapshots, or LVM snapshots + rsync. +- Use [GitLab Geo](../administration/geo/index.md) and rely on the repository data on a Geo secondary site. +- [Prevent writes and copy the Git repository data](#prevent-writes-and-copy-the-git-repository-data). +- [Create an online backup by marking repositories as read-only (experimental)](#online-backup-through-marking-repositories-as-read-only-experimental). + +#### Prevent writes and copy the Git repository data + +Git repositories must be copied in a consistent way. They should not be copied during concurrent write +operations, as this can lead to inconsistencies or corruption issues. For more details, +[issue #270422](https://gitlab.com/gitlab-org/gitlab/-/issues/270422 "Provide documentation on preferred method of migrating Gitaly servers") +has a longer discussion explaining the potential problems. + +To prevent writes to the Git repository data, there are two possible approaches: + +- Use [maintenance mode](../administration/maintenance_mode/index.md) to place GitLab in a read-only state. +- Create explicit downtime by stopping all Gitaly services before backing up the repositories: + + ```shell + sudo gitlab-ctl stop gitaly + # execute git data copy step + sudo gitlab-ctl start gitaly + ``` + +You can copy Git repository data using any method, as long as writes are prevented on the data being copied +(to prevent inconsistencies and corruption issues). In order of preference and safety, the recommended methods are: + +1. Use `rsync` with archive-mode, delete, and checksum options, for example: + + ```shell + rsync -aR --delete --checksum source destination # be extra safe with the order as it will delete existing data if inverted + ``` + +1. Use a [`tar` pipe to copy the entire repository's directory to another server or location](../administration/operations/moving_repositories.md#tar-pipe-to-another-server). + +1. Use `sftp`, `scp`, `cp`, or any other copying method. 
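For reference, a rough sketch of the `tar` pipe approach in option 2, assuming the default Omnibus repository path and SSH access to the destination host. See the linked page for the authoritative steps:

```shell
# Stream the repository directory to another server over SSH without an intermediate archive file.
# The destination hostname is a placeholder, and writes must be prevented while this runs.
sudo -u git sh -c 'tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |
  ssh git@newserver.example.com tar -C /var/opt/gitlab/git-data/repositories -xf -'
```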
+ +#### Online backup through marking repositories as read-only (experimental) + +One way of backing up repositories without requiring instance-wide downtime +is to programmatically mark projects as read-only while copying the underlying data. + +There are a few possible downsides to this: + +- Repositories are read-only for a period of time that scales with the size of the repository. +- Backups take a longer time to complete due to marking each project as read-only, potentially leading to inconsistencies. For example, + a possible date discrepancy between the last data available for the first project that gets backed up compared to + the last project that gets backed up. +- Fork networks should be entirely read-only while the projects inside get backed up to prevent potential changes to the pool repository. + +There is an **experimental** script that attempts to automate this process in +[the Geo team Runbooks project](https://gitlab.com/gitlab-org/geo-team/runbooks/-/tree/main/experimental-online-backup-through-rsync). + +## Troubleshooting + +The following are possible problems you might encounter, along with potential +solutions. + +### When the secrets file is lost + +If you didn't [back up the secrets file](#storing-configuration-files), you +must complete several steps to get GitLab working properly again. + +The secrets file is responsible for storing the encryption key for the columns +that contain required, sensitive information. If the key is lost, GitLab can't +decrypt those columns, preventing access to the following items: + +- [CI/CD variables](../ci/variables/index.md) +- [Kubernetes / GCP integration](../user/infrastructure/clusters/index.md) +- [Custom Pages domains](../user/project/pages/custom_domains_ssl_tls_certification/index.md) +- [Project error tracking](../operations/error_tracking.md) +- [Runner authentication](../ci/runners/index.md) +- [Project mirroring](../user/project/repository/mirror/index.md) +- [Integrations](../user/project/integrations/index.md) +- [Web hooks](../user/project/integrations/webhooks.md) + +In cases like CI/CD variables and runner authentication, you can experience +unexpected behaviors, such as: + +- Stuck jobs. +- 500 errors. + +In this case, you must reset all the tokens for CI/CD variables and +runner authentication, which is described in more detail in the following +sections. After resetting the tokens, you should be able to visit your project +and the jobs begin running again. + +Use the information in the following sections at your own risk. + +#### Verify that all values can be decrypted + +You can determine if your database contains values that can't be decrypted by using a +[Rake task](../administration/raketasks/check.md#verify-database-values-can-be-decrypted-using-the-current-secrets). + +#### Take a backup + +You must directly modify GitLab data to work around your lost secrets file. + +WARNING: +Be sure to create a full database backup before attempting any changes. + +#### Disable user two-factor authentication (2FA) + +Users with 2FA enabled can't sign in to GitLab. In that case, you must +[disable 2FA for everyone](../security/two_factor_authentication.md#for-all-users), +after which users must reactivate 2FA. + +#### Reset CI/CD variables + +1. 
Enter the database console: + + For Omnibus GitLab 14.1 and earlier: + + ```shell + sudo gitlab-rails dbconsole + ``` + + For Omnibus GitLab 14.2 and later: + + ```shell + sudo gitlab-rails dbconsole --database main + ``` + + For installations from source, GitLab 14.1 and earlier: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production + ``` + + For installations from source, GitLab 14.2 and later: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production --database main + ``` + +1. Examine the `ci_group_variables` and `ci_variables` tables: + + ```sql + SELECT * FROM public."ci_group_variables"; + SELECT * FROM public."ci_variables"; + ``` + + These are the variables that you need to delete. + +1. Delete all variables: + + ```sql + DELETE FROM ci_group_variables; + DELETE FROM ci_variables; + ``` + +1. If you know the specific group or project from which you wish to delete variables, you can include a `WHERE` statement to specify that in your `DELETE`: + + ```sql + DELETE FROM ci_group_variables WHERE group_id = ; + DELETE FROM ci_variables WHERE project_id = ; + ``` + +You may need to reconfigure or restart GitLab for the changes to take effect. + +#### Reset runner registration tokens + +1. Enter the database console: + + For Omnibus GitLab 14.1 and earlier: + + ```shell + sudo gitlab-rails dbconsole + ``` + + For Omnibus GitLab 14.2 and later: + + ```shell + sudo gitlab-rails dbconsole --database main + ``` + + For installations from source, GitLab 14.1 and earlier: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production + ``` + + For installations from source, GitLab 14.2 and later: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production --database main + ``` + +1. Clear all tokens for projects, groups, and the entire instance: + + WARNING: + The final `UPDATE` operation stops the runners from being able to pick + up new jobs. You must register new runners. + + ```sql + -- Clear project tokens + UPDATE projects SET runners_token = null, runners_token_encrypted = null; + -- Clear group tokens + UPDATE namespaces SET runners_token = null, runners_token_encrypted = null; + -- Clear instance tokens + UPDATE application_settings SET runners_registration_token_encrypted = null; + -- Clear key used for JWT authentication + -- This may break the $CI_JWT_TOKEN job variable: + -- https://gitlab.com/gitlab-org/gitlab/-/issues/325965 + UPDATE application_settings SET encrypted_ci_jwt_signing_key = null; + -- Clear runner tokens + UPDATE ci_runners SET token = null, token_encrypted = null; + ``` + +#### Reset pending pipeline jobs + +1. Enter the database console: + + For Omnibus GitLab 14.1 and earlier: + + ```shell + sudo gitlab-rails dbconsole + ``` + + For Omnibus GitLab 14.2 and later: + + ```shell + sudo gitlab-rails dbconsole --database main + ``` + + For installations from source, GitLab 14.1 and earlier: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production + ``` + + For installations from source, GitLab 14.2 and later: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production --database main + ``` + +1. Clear all the tokens for pending jobs: + + For GitLab 15.3 and earlier: + + ```sql + -- Clear build tokens + UPDATE ci_builds SET token = null, token_encrypted = null; + ``` + + For GitLab 15.4 and later: + + ```sql + -- Clear build tokens + UPDATE ci_builds SET token_encrypted = null; + ``` + +A similar strategy can be employed for the remaining features. 
By removing the +data that can't be decrypted, GitLab can be returned to operation, and the +lost data can be manually replaced. + +#### Fix integrations and webhooks + +If you've lost your secrets, the [integrations settings pages](../user/project/integrations/index.md) +and [webhooks settings pages](../user/project/integrations/webhooks.md) are probably displaying `500` error messages. + +The fix is to truncate the affected tables (those containing encrypted columns). +This deletes all your configured integrations, webhooks, and related metadata. +You should verify that the secrets are the root cause before deleting any data. + +1. Enter the database console: + + For Omnibus GitLab 14.1 and earlier: + + ```shell + sudo gitlab-rails dbconsole + ``` + + For Omnibus GitLab 14.2 and later: + + ```shell + sudo gitlab-rails dbconsole --database main + ``` + + For installations from source, GitLab 14.1 and earlier: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production + ``` + + For installations from source, GitLab 14.2 and later: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production --database main + ``` + +1. Truncate the following tables: + + ```sql + -- truncate web_hooks table + TRUNCATE integrations, chat_names, issue_tracker_data, jira_tracker_data, slack_integrations, web_hooks, zentao_tracker_data, web_hook_logs; + ``` + +### Container Registry push failures after restoring from a backup + +If you use the [Container Registry](../user/packages/container_registry/index.md), +pushes to the registry may fail after restoring your backup on an Omnibus GitLab +instance after restoring the registry data. + +These failures mention permission issues in the registry logs, similar to: + +```plaintext +level=error +msg="response completed with error" +err.code=unknown +err.detail="filesystem: mkdir /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/...: permission denied" +err.message="unknown error" +``` + +This issue is caused by the restore running as the unprivileged user `git`, +which is unable to assign the correct ownership to the registry files during +the restore process ([issue #62759](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/62759 "Incorrect permissions on registry filesystem after restore")). + +To get your registry working again: + +```shell +sudo chown -R registry:registry /var/opt/gitlab/gitlab-rails/shared/registry/docker +``` + +If you changed the default file system location for the registry, run `chown` +against your custom location, instead of `/var/opt/gitlab/gitlab-rails/shared/registry/docker`. + +### Backup fails to complete with Gzip error + +When running the backup, you may receive a Gzip error message: + +```shell +sudo /opt/gitlab/bin/gitlab-backup create +... +Dumping ... +... +gzip: stdout: Input/output error + +Backup failed +``` + +If this happens, examine the following: + +- Confirm there is sufficient disk space for the Gzip operation. It's not uncommon for backups that + use the [default strategy](#backup-strategy-option) to require half the instance size + in free disk space during backup creation. +- If NFS is being used, check if the mount option `timeout` is set. The + default is `600`, and changing this to smaller values results in this error. + +### Backup fails with `File name too long` error + +During backup, you can get the `File name too long` error ([issue #354984](https://gitlab.com/gitlab-org/gitlab/-/issues/354984)). 
For example: + +```plaintext +Problem: |\r\n]+$'))[1]) > 246; + + CREATE INDEX ON uploads_with_long_filenames(row_id); + + SELECT + u.id, + u.path, + -- Current filename + (regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] AS current_filename, + -- New filename + CONCAT( + LEFT(SPLIT_PART((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), + COALESCE(SUBSTRING((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) + ) AS new_filename, + -- New path + CONCAT( + COALESCE((regexp_match(u.path, '(.*\/).*'))[1], ''), + CONCAT( + LEFT(SPLIT_PART((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), + COALESCE(SUBSTRING((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) + ) + ) AS new_path + FROM uploads_with_long_filenames AS u + WHERE u.row_id > 0 AND u.row_id <= 10000; + ``` + + Output example: + + ```postgresql + -[ RECORD 1 ]----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + id | 34 + path | public/@hashed/loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisit.txt + current_filename | loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisit.txt + new_filename | loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelits.txt + new_path | public/@hashed/loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelits.txt + ``` + + Where: + + - `current_filename`: a filename that is currently more than 246 characters long. + - `new_filename`: a filename that has been truncated to 246 characters maximum. + - `new_path`: new path considering the `new_filename` (truncated). + + Once you validate the batch results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + +1. Rename the files found in the `uploads` table from long filenames to new truncated filenames. 
The following query rolls back the update so you can check the results safely within a transaction wrapper: + + ```sql + CREATE TEMP TABLE uploads_with_long_filenames AS + SELECT ROW_NUMBER() OVER(ORDER BY id) row_id, path, id + FROM uploads AS u + WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246; + + CREATE INDEX ON uploads_with_long_filenames(row_id); + + BEGIN; + WITH updated_uploads AS ( + UPDATE uploads + SET + path = + CONCAT( + COALESCE((regexp_match(updatable_uploads.path, '(.*\/).*'))[1], ''), + CONCAT( + LEFT(SPLIT_PART((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), + COALESCE(SUBSTRING((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) + ) + ) + FROM + uploads_with_long_filenames AS updatable_uploads + WHERE + uploads.id = updatable_uploads.id + AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000 + RETURNING uploads.* + ) + SELECT id, path FROM updated_uploads; + ROLLBACK; + ``` + + Once you validate the batch update results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + +1. Validate that the new filenames from the previous query are the expected ones. If you are sure you want to truncate the records found in the previous step to 246 characters, run the following: + + WARNING: + The following action is **irreversible**. + + ```sql + CREATE TEMP TABLE uploads_with_long_filenames AS + SELECT ROW_NUMBER() OVER(ORDER BY id) row_id, path, id + FROM uploads AS u + WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246; + + CREATE INDEX ON uploads_with_long_filenames(row_id); + + UPDATE uploads + SET + path = + CONCAT( + COALESCE((regexp_match(updatable_uploads.path, '(.*\/).*'))[1], ''), + CONCAT( + LEFT(SPLIT_PART((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), + COALESCE(SUBSTRING((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) + ) + ) + FROM + uploads_with_long_filenames AS updatable_uploads + WHERE + uploads.id = updatable_uploads.id + AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000; + ``` + + Once you finish the batch update, you must change the batch size (`updatable_uploads.row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + +Truncate the filenames in the references found: + +1. Check if those records are referenced somewhere. One way to do this is to dump the database and search for the parent directory name and filename: + + 1. To dump your database, you can use the following command as an example: + + ```shell + pg_dump -h /var/opt/gitlab/postgresql/ -d gitlabhq_production > gitlab-dump.tmp + ``` + + 1. Then you can search for the references using the `grep` command. Combining the parent directory and the filename can be a good idea. For example: + + ```shell + grep public/alongfilenamehere.txt gitlab-dump.tmp + ``` + +1. Replace those long filenames using the new filenames obtained from querying the `uploads` table. + +Truncate the filenames on the file system. You must manually rename the files in your file system to the new filenames obtained from querying the `uploads` table. + +#### Re-run the backup task + +After following all the previous steps, re-run the backup task. 
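For example, using the same commands as the original backup. For Omnibus GitLab packages:

```shell
sudo gitlab-backup create
```

For installations from source:

```shell
sudo -u git -H bundle exec rake gitlab:backup:create RAILS_ENV=production
```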
+ +### Restoring database backup fails when `pg_stat_statements` was previously enabled + +The GitLab backup of the PostgreSQL database includes all SQL statements required to enable extensions that were +previously enabled in the database. + +The `pg_stat_statements` extension can only be enabled or disabled by a PostgreSQL user with `superuser` role. +As the restore process uses a database user with limited permissions, it can't execute the following SQL statements: + +```sql +DROP EXTENSION IF EXISTS pg_stat_statements; +CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public; +``` + +When trying to restore the backup in a PostgreSQL instance that doesn't have the `pg_stats_statements` extension, +the following error message is displayed: + +```plaintext +ERROR: permission denied to create extension "pg_stat_statements" +HINT: Must be superuser to create this extension. +ERROR: extension "pg_stat_statements" does not exist +``` + +When trying to restore in an instance that has the `pg_stats_statements` extension enabled, the cleaning up step +fails with an error message similar to the following: + +```plaintext +rake aborted! +ActiveRecord::StatementInvalid: PG::InsufficientPrivilege: ERROR: must be owner of view pg_stat_statements +/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:42:in `block (4 levels) in ' +/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `each' +/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `block (3 levels) in ' +/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:71:in `block (3 levels) in ' +/opt/gitlab/embedded/bin/bundle:23:in `load' +/opt/gitlab/embedded/bin/bundle:23:in `
' +Caused by: +PG::InsufficientPrivilege: ERROR: must be owner of view pg_stat_statements +/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:42:in `block (4 levels) in ' +/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `each' +/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `block (3 levels) in ' +/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:71:in `block (3 levels) in ' +/opt/gitlab/embedded/bin/bundle:23:in `load' +/opt/gitlab/embedded/bin/bundle:23:in `
'
+Tasks: TOP => gitlab:db:drop_tables
+(See full trace by running task with --trace)
+```
+
+#### Prevent the dump file from including `pg_stat_statements`
+
+To prevent the inclusion of the extension in the PostgreSQL dump file that is part of the backup bundle,
+enable the extension in any schema except the `public` schema:
+
+```sql
+CREATE SCHEMA adm;
+CREATE EXTENSION pg_stat_statements SCHEMA adm;
+```
+
+If the extension was previously enabled in the `public` schema, move it to a new one:
+
+```sql
+CREATE SCHEMA adm;
+ALTER EXTENSION pg_stat_statements SET SCHEMA adm;
+```
+
+To query the `pg_stat_statements` data after changing the schema, prefix the view name with the new schema:
+
+```sql
+SELECT * FROM adm.pg_stat_statements limit 0;
+```
+
+To make it compatible with third-party monitoring solutions that expect it to be enabled in the `public` schema,
+you need to include it in the `search_path`:
+
+```sql
+set search_path to public,adm;
+```
+
+#### Fix an existing dump file to remove references to `pg_stat_statements`
+
+To fix an existing backup file, make the following changes:
+
+1. Extract from the backup the following file: `db/database.sql.gz`.
+1. Decompress the file, or use an editor that can handle compressed files.
+1. Remove the following lines, or similar ones:
+
+   ```sql
+   CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public;
+   ```
+
+   ```sql
+   COMMENT ON EXTENSION pg_stat_statements IS 'track planning and execution statistics of all SQL statements executed';
+   ```
+
+1. Save the changes and recompress the file.
+1. Update the backup file with the modified `db/database.sql.gz`.
diff --git a/doc/raketasks/backup_restore.md b/doc/raketasks/backup_restore.md
index 1fd772c06da448..5ebeaced3cfc1a 100644
--- a/doc/raketasks/backup_restore.md
+++ b/doc/raketasks/backup_restore.md
@@ -6,68 +6,22 @@ info: To determine the technical writer assigned to the Stage/Group associated w
 
 # Back up and restore GitLab **(FREE SELF)**
 
-GitLab provides Rake tasks for backing up and restoring GitLab instances.
+
 
-WARNING:
-GitLab doesn't back up items that aren't stored on the file system. If you're
-using [object storage](../administration/object_storage.md), be sure to enable
-backups with your object storage provider, if desired.
-
-## Requirements
-
-To be able to back up and restore, ensure that Rsync is installed on your
-system. If you installed GitLab:
-
-- _Using the Omnibus package_, Rsync is already installed.
-- _From source_, check if `rsync` is installed. If Rsync is not installed, install it. For example:
-
-  ```shell
-  # Debian/Ubuntu
-  sudo apt-get install rsync
-
-  # RHEL/CentOS
-  sudo yum install rsync
-  ```
-
-### `gitaly-backup` for repository backup and restore
-
-> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.2.
-> - [Deployed behind a feature flag](../user/feature_flags.md), enabled by default.
-> - [Generally available](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.10. [Feature flag `gitaly_backup`](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/83254) removed.
-
-The `gitaly-backup` binary is used by the backup Rake task to create and restore repository backups from Gitaly.
-`gitaly-backup` replaces the previous backup method that directly calls RPCs on Gitaly from GitLab.
-
-The backup Rake task must be able to find this executable. 
In most cases, you don't need to change -the path to the binary as it should work fine with the default path `/opt/gitlab/embedded/bin/gitaly-backup`. -If you have a specific reason to change the path, it can be configured in Omnibus GitLab packages: +## Why should I care about back up and restore? -1. Add the following to `/etc/gitlab/gitlab.rb`: +You've invested time and money developing your software. What happens if: - ```ruby - gitlab_rails['backup_gitaly_backup_path'] = '/path/to/gitaly-backup' - ``` - -1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect. - -## Backup timestamp - -The backup archive is saved in `backup_path`, which is specified in the -`config/gitlab.yml` file. The filename is `[TIMESTAMP]_gitlab_backup.tar`, -where `TIMESTAMP` identifies the time at which each backup was created, plus -the GitLab version. The timestamp is needed if you need to restore GitLab and -multiple backups are available. +- your data gets corrupted? +- your data gets accidentally deleted? +- your data gets encrypted by ransomware? +- your cloud provider suffers unexpected downtime? -For example, if the backup name is `1493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar`, -the timestamp is `1493107454_2018_04_25_10.6.4-ce`. +These scenarios may be existential threats to your software or organization. You can mitigate all of these risks with a disaster recovery plan that includes backups. ## Back up GitLab @@ -77,174 +31,6 @@ For detailed information on backing up GitLab, see [Backup GitLab](backup_gitlab For detailed information on restoring GitLab, see [Restore GitLab](restore_gitlab.md). -## Alternative backup strategies - -In the following cases, consider using file system data transfer or snapshots as part of your backup strategy: - -- Your GitLab instance contains a lot of Git repository data and the GitLab backup script is too slow. -- Your GitLab instance has a lot of forked projects and the regular backup task duplicates the Git data for all of them. -- Your GitLab instance has a problem and using the regular backup and import Rake tasks isn't possible. - -WARNING: -Gitaly Cluster [does not support snapshot backups](../administration/gitaly/index.md#snapshot-backup-and-recovery-limitations). - -When considering using file system data transfer or snapshots: - -- Don't use these methods to migrate from one operating system to another. The operating systems of the source and destination should be as similar as possible. For example, - don't use these methods to migrate from Ubuntu to Fedora. -- Data consistency is very important. We recommend stopping GitLab with `sudo gitlab-ctl stop` before taking doing a file system transfer (with rsync, for example) or taking a - snapshot. - -Example: Amazon Elastic Block Store (EBS) - -> A GitLab server using Omnibus GitLab hosted on Amazon AWS. -> An EBS drive containing an ext4 file system is mounted at `/var/opt/gitlab`. -> In this case you could make an application backup by taking an EBS snapshot. -> The backup includes all repositories, uploads and PostgreSQL data. - -Example: Logical Volume Manager (LVM) snapshots + rsync - -> A GitLab server using Omnibus GitLab, with an LVM logical volume mounted at `/var/opt/gitlab`. -> Replicating the `/var/opt/gitlab` directory using rsync would not be reliable because too many files would change while rsync is running. 
-> Instead of rsync-ing `/var/opt/gitlab`, we create a temporary LVM snapshot, which we mount as a read-only file system at `/mnt/gitlab_backup`. -> Now we can have a longer running rsync job which creates a consistent replica on the remote server. -> The replica includes all repositories, uploads and PostgreSQL data. - -If you're running GitLab on a virtualized server, you can possibly also create -VM snapshots of the entire GitLab server. It's not uncommon however for a VM -snapshot to require you to power down the server, which limits this solution's -practical use. - -### Back up repository data separately - -First, ensure you back up existing GitLab data while [skipping repositories](backup_gitlab.md#excluding-specific-directories-from-the-backup): - -```shell -# for Omnibus GitLab package installations -sudo gitlab-backup create SKIP=repositories - -# for installations from source: -sudo -u git -H bundle exec rake gitlab:backup:create SKIP=repositories RAILS_ENV=production -``` - -For manually backing up the Git repository data on disk, there are multiple possible strategies: - -- Use snapshots, such as the previous examples of Amazon EBS drive snapshots, or LVM snapshots + rsync. -- Use [GitLab Geo](../administration/geo/index.md) and rely on the repository data on a Geo secondary site. -- [Prevent writes and copy the Git repository data](#prevent-writes-and-copy-the-git-repository-data). -- [Create an online backup by marking repositories as read-only (experimental)](#online-backup-through-marking-repositories-as-read-only-experimental). - -#### Prevent writes and copy the Git repository data - -Git repositories must be copied in a consistent way. They should not be copied during concurrent write -operations, as this can lead to inconsistencies or corruption issues. For more details, -[issue #270422](https://gitlab.com/gitlab-org/gitlab/-/issues/270422 "Provide documentation on preferred method of migrating Gitaly servers") -has a longer discussion explaining the potential problems. - -To prevent writes to the Git repository data, there are two possible approaches: - -- Use [maintenance mode](../administration/maintenance_mode/index.md) to place GitLab in a read-only state. -- Create explicit downtime by stopping all Gitaly services before backing up the repositories: - - ```shell - sudo gitlab-ctl stop gitaly - # execute git data copy step - sudo gitlab-ctl start gitaly - ``` - -You can copy Git repository data using any method, as long as writes are prevented on the data being copied -(to prevent inconsistencies and corruption issues). In order of preference and safety, the recommended methods are: - -1. Use `rsync` with archive-mode, delete, and checksum options, for example: - - ```shell - rsync -aR --delete --checksum source destination # be extra safe with the order as it will delete existing data if inverted - ``` - -1. Use a [`tar` pipe to copy the entire repository's directory to another server or location](../administration/operations/moving_repositories.md#tar-pipe-to-another-server). - -1. Use `sftp`, `scp`, `cp`, or any other copying method. - -#### Online backup through marking repositories as read-only (experimental) - -One way of backing up repositories without requiring instance-wide downtime -is to programmatically mark projects as read-only while copying the underlying data. - -There are a few possible downsides to this: - -- Repositories are read-only for a period of time that scales with the size of the repository. 
-- Backups take a longer time to complete due to marking each project as read-only, potentially leading to inconsistencies. For example, - a possible date discrepancy between the last data available for the first project that gets backed up compared to - the last project that gets backed up. -- Fork networks should be entirely read-only while the projects inside get backed up to prevent potential changes to the pool repository. - -There is an **experimental** script that attempts to automate this process in -[the Geo team Runbooks project](https://gitlab.com/gitlab-org/geo-team/runbooks/-/tree/main/experimental-online-backup-through-rsync). - -## Back up and restore for installations using PgBouncer - -Do not back up or restore GitLab through a PgBouncer connection. These -tasks must [bypass PgBouncer and connect directly to the PostgreSQL primary database node](#bypassing-pgbouncer), -or they cause a GitLab outage. - -When the GitLab backup or restore task is used with PgBouncer, the -following error message is shown: - -```ruby -ActiveRecord::StatementInvalid: PG::UndefinedTable -``` - -Each time the GitLab backup runs, GitLab starts generating 500 errors and errors about missing -tables will [be logged by PostgreSQL](../administration/logs/index.md#postgresql-logs): - -```plaintext -ERROR: relation "tablename" does not exist at character 123 -``` - -This happens because the task uses `pg_dump`, which -[sets a null search path and explicitly includes the schema in every SQL query](https://gitlab.com/gitlab-org/gitlab/-/issues/23211) -to address [CVE-2018-1058](https://www.postgresql.org/about/news/postgresql-103-968-9512-9417-and-9322-released-1834/). - -Since connections are reused with PgBouncer in transaction pooling mode, -PostgreSQL fails to search the default `public` schema. As a result, -this clearing of the search path causes tables and columns to appear -missing. - -### Bypassing PgBouncer - -There are two ways to fix this: - -1. [Use environment variables to override the database settings](#environment-variable-overrides) for the backup task. -1. Reconfigure a node to [connect directly to the PostgreSQL primary database node](../administration/postgresql/pgbouncer.md#procedure-for-bypassing-pgbouncer). - -#### Environment variable overrides - -By default, GitLab uses the database configuration stored in a -configuration file (`database.yml`). However, you can override the database settings -for the backup and restore task by setting environment -variables that are prefixed with `GITLAB_BACKUP_`: - -- `GITLAB_BACKUP_PGHOST` -- `GITLAB_BACKUP_PGUSER` -- `GITLAB_BACKUP_PGPORT` -- `GITLAB_BACKUP_PGPASSWORD` -- `GITLAB_BACKUP_PGSSLMODE` -- `GITLAB_BACKUP_PGSSLKEY` -- `GITLAB_BACKUP_PGSSLCERT` -- `GITLAB_BACKUP_PGSSLROOTCERT` -- `GITLAB_BACKUP_PGSSLCRL` -- `GITLAB_BACKUP_PGSSLCOMPRESSION` - -For example, to override the database host and port to use 192.168.1.10 -and port 5432 with the Omnibus package: - -```shell -sudo GITLAB_BACKUP_PGHOST=192.168.1.10 GITLAB_BACKUP_PGPORT=5432 /opt/gitlab/bin/gitlab-backup create -``` - -See the [PostgreSQL documentation](https://www.postgresql.org/docs/12/libpq-envars.html) -for more details on what these parameters do. - ## Migrate to a new server @@ -450,645 +236,3 @@ To migrate your repositories from one server to another with an up-to-date version of GitLab, use the [import Rake task](import.md) to do a mass import of the repository. 
If you do an import Rake task rather than a backup restore, you get all of your repositories, but no other data. - -## Troubleshooting - -The following are possible problems you might encounter, along with potential -solutions. - -### Restoring database backup using Omnibus packages outputs warnings - -If you're using backup restore procedures, you may encounter the following -warning messages: - -```plaintext -ERROR: must be owner of extension pg_trgm -ERROR: must be owner of extension btree_gist -ERROR: must be owner of extension plpgsql -WARNING: no privileges could be revoked for "public" (two occurrences) -WARNING: no privileges were granted for "public" (two occurrences) -``` - -Be advised that the backup is successfully restored in spite of these warning -messages. - -The Rake task runs this as the `gitlab` user, which doesn't have superuser -access to the database. When restore is initiated, it also runs as the `gitlab` -user, but it also tries to alter the objects it doesn't have access to. -Those objects have no influence on the database backup or restore, but display -a warning message. - -For more information, see: - -- PostgreSQL issue tracker: - - [Not being a superuser](https://www.postgresql.org/message-id/201110220712.30886.adrian.klaver@gmail.com). - - [Having different owners](https://www.postgresql.org/message-id/2039.1177339749@sss.pgh.pa.us). - -- Stack Overflow: [Resulting errors](https://stackoverflow.com/questions/4368789/error-must-be-owner-of-language-plpgsql). - -### When the secrets file is lost - -If you didn't [back up the secrets file](backup_gitlab.md#storing-configuration-files), you -must complete several steps to get GitLab working properly again. - -The secrets file is responsible for storing the encryption key for the columns -that contain required, sensitive information. If the key is lost, GitLab can't -decrypt those columns, preventing access to the following items: - -- [CI/CD variables](../ci/variables/index.md) -- [Kubernetes / GCP integration](../user/infrastructure/clusters/index.md) -- [Custom Pages domains](../user/project/pages/custom_domains_ssl_tls_certification/index.md) -- [Project error tracking](../operations/error_tracking.md) -- [Runner authentication](../ci/runners/index.md) -- [Project mirroring](../user/project/repository/mirror/index.md) -- [Integrations](../user/project/integrations/index.md) -- [Web hooks](../user/project/integrations/webhooks.md) - -In cases like CI/CD variables and runner authentication, you can experience -unexpected behaviors, such as: - -- Stuck jobs. -- 500 errors. - -In this case, you must reset all the tokens for CI/CD variables and -runner authentication, which is described in more detail in the following -sections. After resetting the tokens, you should be able to visit your project -and the jobs begin running again. - -Use the information in the following sections at your own risk. - -#### Verify that all values can be decrypted - -You can determine if your database contains values that can't be decrypted by using a -[Rake task](../administration/raketasks/check.md#verify-database-values-can-be-decrypted-using-the-current-secrets). - -#### Take a backup - -You must directly modify GitLab data to work around your lost secrets file. - -WARNING: -Be sure to create a full database backup before attempting any changes. - -#### Disable user two-factor authentication (2FA) - -Users with 2FA enabled can't sign in to GitLab. 
In that case, you must -[disable 2FA for everyone](../security/two_factor_authentication.md#for-all-users), -after which users must reactivate 2FA. - -#### Reset CI/CD variables - -1. Enter the database console: - - For Omnibus GitLab 14.1 and earlier: - - ```shell - sudo gitlab-rails dbconsole - ``` - - For Omnibus GitLab 14.2 and later: - - ```shell - sudo gitlab-rails dbconsole --database main - ``` - - For installations from source, GitLab 14.1 and earlier: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production - ``` - - For installations from source, GitLab 14.2 and later: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production --database main - ``` - -1. Examine the `ci_group_variables` and `ci_variables` tables: - - ```sql - SELECT * FROM public."ci_group_variables"; - SELECT * FROM public."ci_variables"; - ``` - - These are the variables that you need to delete. - -1. Delete all variables: - - ```sql - DELETE FROM ci_group_variables; - DELETE FROM ci_variables; - ``` - -1. If you know the specific group or project from which you wish to delete variables, you can include a `WHERE` statement to specify that in your `DELETE`: - - ```sql - DELETE FROM ci_group_variables WHERE group_id = ; - DELETE FROM ci_variables WHERE project_id = ; - ``` - -You may need to reconfigure or restart GitLab for the changes to take effect. - -#### Reset runner registration tokens - -1. Enter the database console: - - For Omnibus GitLab 14.1 and earlier: - - ```shell - sudo gitlab-rails dbconsole - ``` - - For Omnibus GitLab 14.2 and later: - - ```shell - sudo gitlab-rails dbconsole --database main - ``` - - For installations from source, GitLab 14.1 and earlier: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production - ``` - - For installations from source, GitLab 14.2 and later: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production --database main - ``` - -1. Clear all tokens for projects, groups, and the entire instance: - - WARNING: - The final `UPDATE` operation stops the runners from being able to pick - up new jobs. You must register new runners. - - ```sql - -- Clear project tokens - UPDATE projects SET runners_token = null, runners_token_encrypted = null; - -- Clear group tokens - UPDATE namespaces SET runners_token = null, runners_token_encrypted = null; - -- Clear instance tokens - UPDATE application_settings SET runners_registration_token_encrypted = null; - -- Clear key used for JWT authentication - -- This may break the $CI_JWT_TOKEN job variable: - -- https://gitlab.com/gitlab-org/gitlab/-/issues/325965 - UPDATE application_settings SET encrypted_ci_jwt_signing_key = null; - -- Clear runner tokens - UPDATE ci_runners SET token = null, token_encrypted = null; - ``` - -#### Reset pending pipeline jobs - -1. Enter the database console: - - For Omnibus GitLab 14.1 and earlier: - - ```shell - sudo gitlab-rails dbconsole - ``` - - For Omnibus GitLab 14.2 and later: - - ```shell - sudo gitlab-rails dbconsole --database main - ``` - - For installations from source, GitLab 14.1 and earlier: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production - ``` - - For installations from source, GitLab 14.2 and later: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production --database main - ``` - -1. 
Clear all the tokens for pending jobs: - - For GitLab 15.3 and earlier: - - ```sql - -- Clear build tokens - UPDATE ci_builds SET token = null, token_encrypted = null; - ``` - - For GitLab 15.4 and later: - - ```sql - -- Clear build tokens - UPDATE ci_builds SET token_encrypted = null; - ``` - -A similar strategy can be employed for the remaining features. By removing the -data that can't be decrypted, GitLab can be returned to operation, and the -lost data can be manually replaced. - -#### Fix integrations and webhooks - -If you've lost your secrets, the [integrations settings pages](../user/project/integrations/index.md) -and [webhooks settings pages](../user/project/integrations/webhooks.md) are probably displaying `500` error messages. - -The fix is to truncate the affected tables (those containing encrypted columns). -This deletes all your configured integrations, webhooks, and related metadata. -You should verify that the secrets are the root cause before deleting any data. - -1. Enter the database console: - - For Omnibus GitLab 14.1 and earlier: - - ```shell - sudo gitlab-rails dbconsole - ``` - - For Omnibus GitLab 14.2 and later: - - ```shell - sudo gitlab-rails dbconsole --database main - ``` - - For installations from source, GitLab 14.1 and earlier: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production - ``` - - For installations from source, GitLab 14.2 and later: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production --database main - ``` - -1. Truncate the following tables: - - ```sql - -- truncate web_hooks table - TRUNCATE integrations, chat_names, issue_tracker_data, jira_tracker_data, slack_integrations, web_hooks, zentao_tracker_data, web_hook_logs; - ``` - -### Container Registry push failures after restoring from a backup - -If you use the [Container Registry](../user/packages/container_registry/index.md), -pushes to the registry may fail after restoring your backup on an Omnibus GitLab -instance after restoring the registry data. - -These failures mention permission issues in the registry logs, similar to: - -```plaintext -level=error -msg="response completed with error" -err.code=unknown -err.detail="filesystem: mkdir /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/...: permission denied" -err.message="unknown error" -``` - -This issue is caused by the restore running as the unprivileged user `git`, -which is unable to assign the correct ownership to the registry files during -the restore process ([issue #62759](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/62759 "Incorrect permissions on registry filesystem after restore")). - -To get your registry working again: - -```shell -sudo chown -R registry:registry /var/opt/gitlab/gitlab-rails/shared/registry/docker -``` - -If you changed the default file system location for the registry, run `chown` -against your custom location, instead of `/var/opt/gitlab/gitlab-rails/shared/registry/docker`. - -### Backup fails to complete with Gzip error - -When running the backup, you may receive a Gzip error message: - -```shell -sudo /opt/gitlab/bin/gitlab-backup create -... -Dumping ... -... -gzip: stdout: Input/output error - -Backup failed -``` - -If this happens, examine the following: - -- Confirm there is sufficient disk space for the Gzip operation. It's not uncommon for backups that - use the [default strategy](backup_gitlab.md#backup-strategy-option) to require half the instance size - in free disk space during backup creation. 
-- If NFS is being used, check if the mount option `timeout` is set. The - default is `600`, and changing this to smaller values results in this error. - -### Backup fails with `File name too long` error - -During backup, you can get the `File name too long` error ([issue #354984](https://gitlab.com/gitlab-org/gitlab/-/issues/354984)). For example: - -```plaintext -Problem: |\r\n]+$'))[1]) > 246; - - CREATE INDEX ON uploads_with_long_filenames(row_id); - - SELECT - u.id, - u.path, - -- Current filename - (regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] AS current_filename, - -- New filename - CONCAT( - LEFT(SPLIT_PART((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), - COALESCE(SUBSTRING((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) - ) AS new_filename, - -- New path - CONCAT( - COALESCE((regexp_match(u.path, '(.*\/).*'))[1], ''), - CONCAT( - LEFT(SPLIT_PART((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), - COALESCE(SUBSTRING((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) - ) - ) AS new_path - FROM uploads_with_long_filenames AS u - WHERE u.row_id > 0 AND u.row_id <= 10000; - ``` - - Output example: - - ```postgresql - -[ RECORD 1 ]----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - id | 34 - path | public/@hashed/loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisit.txt - current_filename | loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisit.txt - new_filename | loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelits.txt - new_path | public/@hashed/loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelits.txt - ``` - - Where: - - - `current_filename`: a filename that is currently more than 246 characters long. - - `new_filename`: a filename that has been truncated to 246 characters maximum. - - `new_path`: new path considering the `new_filename` (truncated). - - Once you validate the batch results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. - -1. Rename the files found in the `uploads` table from long filenames to new truncated filenames. 
The following query rolls back the update so you can check the results safely within a transaction wrapper: - - ```sql - CREATE TEMP TABLE uploads_with_long_filenames AS - SELECT ROW_NUMBER() OVER(ORDER BY id) row_id, path, id - FROM uploads AS u - WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246; - - CREATE INDEX ON uploads_with_long_filenames(row_id); - - BEGIN; - WITH updated_uploads AS ( - UPDATE uploads - SET - path = - CONCAT( - COALESCE((regexp_match(updatable_uploads.path, '(.*\/).*'))[1], ''), - CONCAT( - LEFT(SPLIT_PART((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), - COALESCE(SUBSTRING((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) - ) - ) - FROM - uploads_with_long_filenames AS updatable_uploads - WHERE - uploads.id = updatable_uploads.id - AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000 - RETURNING uploads.* - ) - SELECT id, path FROM updated_uploads; - ROLLBACK; - ``` - - Once you validate the batch update results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. - -1. Validate that the new filenames from the previous query are the expected ones. If you are sure you want to truncate the records found in the previous step to 246 characters, run the following: - - WARNING: - The following action is **irreversible**. - - ```sql - CREATE TEMP TABLE uploads_with_long_filenames AS - SELECT ROW_NUMBER() OVER(ORDER BY id) row_id, path, id - FROM uploads AS u - WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246; - - CREATE INDEX ON uploads_with_long_filenames(row_id); - - UPDATE uploads - SET - path = - CONCAT( - COALESCE((regexp_match(updatable_uploads.path, '(.*\/).*'))[1], ''), - CONCAT( - LEFT(SPLIT_PART((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), - COALESCE(SUBSTRING((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) - ) - ) - FROM - uploads_with_long_filenames AS updatable_uploads - WHERE - uploads.id = updatable_uploads.id - AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000; - ``` - - Once you finish the batch update, you must change the batch size (`updatable_uploads.row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. - -Truncate the filenames in the references found: - -1. Check if those records are referenced somewhere. One way to do this is to dump the database and search for the parent directory name and filename: - - 1. To dump your database, you can use the following command as an example: - - ```shell - pg_dump -h /var/opt/gitlab/postgresql/ -d gitlabhq_production > gitlab-dump.tmp - ``` - - 1. Then you can search for the references using the `grep` command. Combining the parent directory and the filename can be a good idea. For example: - - ```shell - grep public/alongfilenamehere.txt gitlab-dump.tmp - ``` - -1. Replace those long filenames using the new filenames obtained from querying the `uploads` table. - -Truncate the filenames on the file system. You must manually rename the files in your file system to the new filenames obtained from querying the `uploads` table. - -#### Re-run the backup task - -After following all the previous steps, re-run the backup task. 
- -### Restoring database backup fails when `pg_stat_statements` was previously enabled - -The GitLab backup of the PostgreSQL database includes all SQL statements required to enable extensions that were -previously enabled in the database. - -The `pg_stat_statements` extension can only be enabled or disabled by a PostgreSQL user with `superuser` role. -As the restore process uses a database user with limited permissions, it can't execute the following SQL statements: - -```sql -DROP EXTENSION IF EXISTS pg_stat_statements; -CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public; -``` - -When trying to restore the backup in a PostgreSQL instance that doesn't have the `pg_stats_statements` extension, -the following error message is displayed: - -```plaintext -ERROR: permission denied to create extension "pg_stat_statements" -HINT: Must be superuser to create this extension. -ERROR: extension "pg_stat_statements" does not exist -``` - -When trying to restore in an instance that has the `pg_stats_statements` extension enabled, the cleaning up step -fails with an error message similar to the following: - -```plaintext -rake aborted! -ActiveRecord::StatementInvalid: PG::InsufficientPrivilege: ERROR: must be owner of view pg_stat_statements -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:42:in `block (4 levels) in ' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `each' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `block (3 levels) in ' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:71:in `block (3 levels) in ' -/opt/gitlab/embedded/bin/bundle:23:in `load' -/opt/gitlab/embedded/bin/bundle:23:in `
' -Caused by: -PG::InsufficientPrivilege: ERROR: must be owner of view pg_stat_statements -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:42:in `block (4 levels) in ' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `each' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `block (3 levels) in ' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:71:in `block (3 levels) in ' -/opt/gitlab/embedded/bin/bundle:23:in `load' -/opt/gitlab/embedded/bin/bundle:23:in `
' -Tasks: TOP => gitlab:db:drop_tables -(See full trace by running task with --trace) -``` - -#### Prevent the dump file to include `pg_stat_statements` - -To prevent the inclusion of the extension in the PostgreSQL dump file that is part of the backup bundle, -enable the extension in any schema except the `public` schema: - -```sql -CREATE SCHEMA adm; -CREATE EXTENSION pg_stat_statements SCHEMA adm; -``` - -If the extension was previously enabled in the `public` schema, move it to a new one: - -```sql -CREATE SCHEMA adm; -ALTER EXTENSION pg_stat_statements SET SCHEMA adm; -``` - -To query the `pg_stat_statements` data after changing the schema, prefix the view name with the new schema: - -```sql -SELECT * FROM adm.pg_stat_statements limit 0; -``` - -To make it compatible with third-party monitoring solutions that expect it to be enabled in the `public` schema, -you need to include it in the `search_path`: - -```sql -set search_path to public,adm; -``` - -#### Fix an existing dump file to remove references to `pg_stat_statements` - -To fix an existing backup file, do the following changes: - -1. Extract from the backup the following file: `db/database.sql.gz`. -1. Decompress the file or use an editor that is capable of handling it compressed. -1. Remove the following lines, or similar ones: - - ```sql - CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public; - ``` - - ```sql - COMMENT ON EXTENSION pg_stat_statements IS 'track planning and execution statistics of all SQL statements executed'; - ``` - -1. Save the changes and recompress the file. -1. Update the backup file with the modified `db/database.sql.gz`. diff --git a/doc/raketasks/restore_gitlab.md b/doc/raketasks/restore_gitlab.md index bbb2f2aa6480c9..ac2f9f11036d26 100644 --- a/doc/raketasks/restore_gitlab.md +++ b/doc/raketasks/restore_gitlab.md @@ -34,7 +34,7 @@ Restoring SQL data skips views owned by PostgreSQL extensions. To restore a backup, **you must also restore the GitLab secrets**. These include the database encryption key, [CI/CD variables](../ci/variables/index.md), and variables used for [two-factor authentication](../user/profile/account/two_factor_authentication.md). -Without the keys, [multiple issues occur](backup_restore.md#when-the-secrets-file-is-lost), +Without the keys, [multiple issues occur](backup_gitlab.md#when-the-secrets-file-is-lost), including loss of access by users with [two-factor authentication enabled](../user/profile/account/two_factor_authentication.md), and GitLab Runners cannot log in. @@ -54,7 +54,7 @@ Depending on your case, you might want to run the restore command with one or more of the following options: - `BACKUP=timestamp_of_backup`: Required if more than one backup exists. - Read what the [backup timestamp is about](backup_restore.md#backup-timestamp). + Read what the [backup timestamp is about](backup_gitlab.md#backup-timestamp). - `force=yes`: Doesn't ask if the `authorized_keys` file should get regenerated, and assumes 'yes' for warning about database tables being removed, enabling the `Write to authorized_keys file` setting, and updating LDAP @@ -110,7 +110,7 @@ sudo gitlab-backup restore BACKUP=11493107454_2018_04_25_10.6.4-ce ``` Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:restore` instead. -Some [known non-blocking error messages may appear](backup_restore.md#restoring-database-backup-using-omnibus-packages-outputs-warnings). 
+Some [known non-blocking error messages may appear](#restoring-database-backup-using-omnibus-packages-outputs-warnings). WARNING: `gitlab-rake gitlab:backup:restore` doesn't set the correct file system @@ -124,7 +124,7 @@ message. Install the [correct GitLab version](https://packages.gitlab.com/gitlab and then try again. WARNING: -The restore command requires [additional parameters](backup_restore.md#back-up-and-restore-for-installations-using-pgbouncer) when +The restore command requires [additional parameters](backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer) when your installation is using PgBouncer, for either performance reasons or when using it with a Patroni cluster. Next, restart and [check](../administration/raketasks/maintenance.md#check-gitlab-configuration) GitLab: @@ -403,3 +403,38 @@ For example, for installations from source: ```shell sudo -u git -H bundle exec rake gitlab:backup:restore ``` + +## Troubleshooting + +The following are possible problems you might encounter, along with potential +solutions. + +### Restoring database backup using Omnibus packages outputs warnings + +If you're using backup restore procedures, you may encounter the following +warning messages: + +```plaintext +ERROR: must be owner of extension pg_trgm +ERROR: must be owner of extension btree_gist +ERROR: must be owner of extension plpgsql +WARNING: no privileges could be revoked for "public" (two occurrences) +WARNING: no privileges were granted for "public" (two occurrences) +``` + +Be advised that the backup is successfully restored in spite of these warning +messages. + +The Rake task runs this as the `gitlab` user, which doesn't have superuser +access to the database. When restore is initiated, it also runs as the `gitlab` +user, but it also tries to alter the objects it doesn't have access to. +Those objects have no influence on the database backup or restore, but display +a warning message. + +For more information, see: + +- PostgreSQL issue tracker: + - [Not being a superuser](https://www.postgresql.org/message-id/201110220712.30886.adrian.klaver@gmail.com). + - [Having different owners](https://www.postgresql.org/message-id/2039.1177339749@sss.pgh.pa.us). + +- Stack Overflow: [Resulting errors](https://stackoverflow.com/questions/4368789/error-must-be-owner-of-language-plpgsql). diff --git a/doc/user/project/repository/mirror/index.md b/doc/user/project/repository/mirror/index.md index 733310a0b4daf1..cfb5644df6c27f 100644 --- a/doc/user/project/repository/mirror/index.md +++ b/doc/user/project/repository/mirror/index.md @@ -210,7 +210,7 @@ Older versions of SSH may require you to remove `-E md5` from the command. 
- Configure a [Pull Mirroring Interval](../../../../administration/instance_limits.md#pull-mirroring-interval) - [Disable mirrors for a project](../../../admin_area/settings/visibility_and_access_controls.md#enable-project-mirroring) -- [Secrets file and mirroring](../../../../raketasks/backup_restore.md#when-the-secrets-file-is-lost) +- [Secrets file and mirroring](../../../../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost) ## Troubleshooting -- GitLab From 6a88ca933f84f6ad96399b90564b0ff09d0db3bc Mon Sep 17 00:00:00 2001 From: Mike Kozono Date: Wed, 21 Jun 2023 13:35:10 -1000 Subject: [PATCH 02/13] Replace deprecated reference with newer relevant references --- doc/raketasks/backup_restore.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/doc/raketasks/backup_restore.md b/doc/raketasks/backup_restore.md index 5ebeaced3cfc1a..e661be0d701a7c 100644 --- a/doc/raketasks/backup_restore.md +++ b/doc/raketasks/backup_restore.md @@ -232,7 +232,9 @@ methods to export or back up your data yourself from GitLab.com. Issues are stored in the database, and can't be stored in Git itself. -To migrate your repositories from one server to another with an up-to-date -version of GitLab, use the [import Rake task](import.md) to do a mass import of -the repository. If you do an import Rake task rather than a backup restore, -you get all of your repositories, but no other data. +## Related features + +- [Geo](../administration/geo/index.md) +- [Disaster Recovery (Geo)](../administration/geo/disaster_recovery/index.md) +- [Migrating GitLab groups](../user/group/import/index.md) +- [Import and migrate projects](../user/project/import/index.md) -- GitLab From 880568c8dd0b87aa8b6f0cb8b49e3615d66d622f Mon Sep 17 00:00:00 2001 From: Mike Kozono Date: Wed, 21 Jun 2023 14:32:09 -1000 Subject: [PATCH 03/13] Move backup_restore.md to doc/administration/backup_restore/ directory. 
--- doc/administration/backup_restore/index.md | 232 +++++++++++++++++ .../geo/disaster_recovery/planned_failover.md | 4 +- doc/administration/get_started.md | 4 +- doc/administration/gitaly/index.md | 2 +- doc/administration/logs/index.md | 2 +- doc/administration/nfs.md | 4 +- doc/administration/object_storage.md | 2 +- doc/administration/operations/index.md | 2 +- .../operations/moving_repositories.md | 2 +- doc/administration/pages/index.md | 2 +- doc/administration/pages/source.md | 2 +- .../postgresql/multiple_databases.md | 4 +- doc/administration/read_only_gitlab.md | 2 +- .../repository_storage_paths.md | 2 +- .../troubleshooting/postgresql.md | 2 +- doc/index.md | 2 +- doc/install/aws/gitlab_sre_for_aws.md | 4 +- doc/install/aws/manual_install_aws.md | 4 +- doc/install/docker.md | 2 +- doc/install/installation.md | 2 +- doc/install/next_steps.md | 2 +- doc/integration/mattermost/index.md | 2 +- doc/raketasks/backup_gitlab.md | 2 +- doc/raketasks/backup_restore.md | 240 +----------------- doc/raketasks/index.md | 2 +- doc/update/background_migrations.md | 2 +- doc/update/package/downgrade.md | 2 +- doc/update/package/index.md | 4 +- doc/update/patch_versions.md | 2 +- doc/update/plan_your_upgrade.md | 6 +- .../account/two_factor_authentication.md | 2 +- doc/user/project/import/index.md | 2 +- doc/user/project/issues/managing_issues.md | 2 +- .../projects/files/user_browses_files_spec.rb | 2 +- 34 files changed, 280 insertions(+), 274 deletions(-) create mode 100644 doc/administration/backup_restore/index.md diff --git a/doc/administration/backup_restore/index.md b/doc/administration/backup_restore/index.md new file mode 100644 index 00000000000000..9e9818bcf4b1b2 --- /dev/null +++ b/doc/administration/backup_restore/index.md @@ -0,0 +1,232 @@ +--- +stage: Systems +group: Geo +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments +--- + +# Back up and restore GitLab **(FREE SELF)** + +Your software or organization depends on the data in your GitLab instance. You need to ensure this data is protected from adverse events such as: + +- Corrupted data +- Accidental deletion of data +- Ransomware attacks +- Unexpected cloud provider downtime + +You can mitigate all of these risks with a disaster recovery plan that includes backups. + +## Back up GitLab + +For detailed information on backing up GitLab, see [Backup GitLab](../raketasks/backup_gitlab.md). + +## Restore GitLab + +For detailed information on restoring GitLab, see [Restore GitLab](../raketasks/restore_gitlab.md). + +## Migrate to a new server + + + +You can use GitLab backup and restore to migrate your instance to a new server. This section outlines a typical procedure for a GitLab deployment running on a single server. +If you're running GitLab Geo, an alternative option is [Geo disaster recovery for planned failover](../geo/disaster_recovery/planned_failover.md). + +WARNING: +Avoid uncoordinated data processing by both the new and old servers, where multiple +servers could connect concurrently and process the same data. For example, when using +[incoming email](../incoming_email.md), if both GitLab instances are +processing email at the same time, then both instances miss some data. 
+This type of problem can occur with other services as well, such as a
+[non-packaged database](https://docs.gitlab.com/omnibus/settings/database.html#using-a-non-packaged-postgresql-database-management-server),
+a non-packaged Redis instance, or non-packaged Sidekiq.
+
+Prerequisites:
+
+- Some time before your migration, consider notifying your users of upcoming
+  scheduled maintenance with a [broadcast message banner](../../user/admin_area/broadcast_messages.md).
+- Ensure your backups are complete and current. Create a complete system-level backup, or
+  take a snapshot of all servers involved in the migration, in case destructive commands
+  (like `rm`) are run incorrectly.
+
+### Prepare the new server
+
+To prepare the new server:
+
+1. Copy the
+   [SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079)
+   from the old server to avoid man-in-the-middle attack warnings.
+   See [Manually replicate the primary site's SSH host keys](../geo/replication/configuration.md#step-2-manually-replicate-the-primary-sites-ssh-host-keys) for example steps.
+1. [Install and configure GitLab](https://about.gitlab.com/install/) except
+   [incoming email](../incoming_email.md):
+   1. Install GitLab.
+   1. Configure by copying `/etc/gitlab` files from the old server to the new server, and update as necessary.
+      Read the
+      [Omnibus configuration backup and restore instructions](https://docs.gitlab.com/omnibus/settings/backups.html) for more detail.
+   1. If applicable, disable [incoming email](../incoming_email.md).
+   1. Block new CI/CD jobs from starting upon initial startup after the backup and restore.
+      Edit `/etc/gitlab/gitlab.rb` and set the following:
+
+      ```ruby
+      nginx['custom_gitlab_server_config'] = "location = /api/v4/jobs/request {\n deny all;\n return 503;\n }\n"
+      ```
+
+   1. Reconfigure GitLab:
+
+      ```shell
+      sudo gitlab-ctl reconfigure
+      ```
+
+1. Stop GitLab to avoid any potential unnecessary and unintentional data processing:
+
+   ```shell
+   sudo gitlab-ctl stop
+   ```
+
+1. Configure the new server to allow receiving the Redis database and GitLab backup files:
+
+   ```shell
+   sudo rm -f /var/opt/gitlab/redis/dump.rdb
+   sudo chown <your_username> /var/opt/gitlab/redis /var/opt/gitlab/backups
+   ```
+
+### Prepare and transfer content from the old server
+
+1. Ensure you have an up-to-date system-level backup or snapshot of the old server.
+1. Enable [maintenance mode](../maintenance_mode/index.md),
+   if supported by your GitLab edition.
+1. Block new CI/CD jobs from starting:
+   1. Edit `/etc/gitlab/gitlab.rb`, and set the following:
+
+      ```ruby
+      nginx['custom_gitlab_server_config'] = "location = /api/v4/jobs/request {\n deny all;\n return 503;\n }\n"
+      ```
+
+   1. Reconfigure GitLab:
+
+      ```shell
+      sudo gitlab-ctl reconfigure
+      ```
+
+1. Disable periodic background jobs:
+   1. On the left sidebar, expand the top-most chevron (**{chevron-down}**).
+   1. Select **Admin Area**.
+   1. On the left sidebar, select **Monitoring > Background Jobs**.
+   1. Under the Sidekiq dashboard, select **Cron** tab and then
+      **Disable All**.
+1. Wait for the currently running CI/CD jobs to finish, or accept that jobs that have not completed may be lost.
+   To view jobs currently running, on the left sidebar, select **Overviews > Jobs**,
+   and then select **Running**.
+1. Wait for Sidekiq jobs to finish:
+   1. On the left sidebar, select **Monitoring > Background Jobs**.
+   1. Under the Sidekiq dashboard, select **Queues** and then **Live Poll**.
Wait for **Busy** and **Enqueued** to drop to 0.
+      These queues contain work that has been submitted by your users;
+      shutting down before these jobs complete may cause the work to be lost.
+      Make note of the numbers shown in the Sidekiq dashboard for post-migration verification.
+1. Flush the Redis database to disk, and stop GitLab other than the services needed for migration:
+
+   ```shell
+   sudo /opt/gitlab/embedded/bin/redis-cli -s /var/opt/gitlab/redis/redis.socket save && sudo gitlab-ctl stop && sudo gitlab-ctl start postgresql && sudo gitlab-ctl start gitaly
+   ```
+
+1. Create a GitLab backup:
+
+   ```shell
+   sudo gitlab-backup create
+   ```
+
+1. Disable the following GitLab services and prevent unintentional restarts by adding the following to the bottom of `/etc/gitlab/gitlab.rb`:
+
+   ```ruby
+   alertmanager['enable'] = false
+   gitlab_exporter['enable'] = false
+   gitlab_pages['enable'] = false
+   gitlab_workhorse['enable'] = false
+   grafana['enable'] = false
+   logrotate['enable'] = false
+   gitlab_rails['incoming_email_enabled'] = false
+   nginx['enable'] = false
+   node_exporter['enable'] = false
+   postgres_exporter['enable'] = false
+   postgresql['enable'] = false
+   prometheus['enable'] = false
+   puma['enable'] = false
+   redis['enable'] = false
+   redis_exporter['enable'] = false
+   registry['enable'] = false
+   sidekiq['enable'] = false
+   ```
+
+1. Reconfigure GitLab:
+
+   ```shell
+   sudo gitlab-ctl reconfigure
+   ```
+
+1. Verify everything is stopped, and confirm no services are running:
+
+   ```shell
+   sudo gitlab-ctl status
+   ```
+
+1. Transfer the Redis database and GitLab backups to the new server:
+
+   ```shell
+   sudo scp /var/opt/gitlab/redis/dump.rdb <your_username>@new-server:/var/opt/gitlab/redis
+   sudo scp /var/opt/gitlab/backups/your-backup.tar <your_username>@new-server:/var/opt/gitlab/backups
+   ```
+
+### Restore data on the new server
+
+1. Restore appropriate file system permissions:
+
+   ```shell
+   sudo chown gitlab-redis /var/opt/gitlab/redis
+   sudo chown gitlab-redis:gitlab-redis /var/opt/gitlab/redis/dump.rdb
+   sudo chown git:root /var/opt/gitlab/backups
+   sudo chown git:git /var/opt/gitlab/backups/your-backup.tar
+   ```
+
+1. [Restore the GitLab backup](#restore-gitlab).
+1. Verify that the Redis database restored correctly:
+   1. On the left sidebar, expand the top-most chevron (**{chevron-down}**).
+   1. Select **Admin Area**.
+   1. On the left sidebar, select **Monitoring > Background Jobs**.
+   1. Under the Sidekiq dashboard, verify that the numbers
+      match with what was shown on the old server.
+   1. While still under the Sidekiq dashboard, select **Cron** and then **Enable All**
+      to re-enable periodic background jobs.
+1. Test that read-only operations on the GitLab instance work as expected. For example, browse through project repository files, merge requests, and issues.
+1. Disable [Maintenance Mode](../maintenance_mode/index.md), if previously enabled.
+1. Test that the GitLab instance is working as expected.
+1. If applicable, re-enable [incoming email](../incoming_email.md) and test it is working as expected.
+1. Update your DNS or load balancer to point at the new server.
+1. Unblock new CI/CD jobs from starting by removing the custom NGINX configuration
+   you added previously:
+
+   ```ruby
+   # The following line must be removed
+   nginx['custom_gitlab_server_config'] = "location = /api/v4/jobs/request {\n deny all;\n return 503;\n }\n"
+   ```
+
+1. Reconfigure GitLab:
+
+   ```shell
+   sudo gitlab-ctl reconfigure
+   ```
+
+1. 
Remove the scheduled maintenance [broadcast message banner](../../user/admin_area/broadcast_messages.md). + +## Additional notes + +This documentation is for GitLab Community and Enterprise Edition. We back up +GitLab.com and ensure your data is secure. You can't, however, use these +methods to export or back up your data yourself from GitLab.com. + +Issues are stored in the database, and can't be stored in Git itself. + +## Related features + +- [Geo](../geo/index.md) +- [Disaster Recovery (Geo)](../geo/disaster_recovery/index.md) +- [Migrating GitLab groups](../../user/group/import/index.md) +- [Import and migrate projects](../../user/project/import/index.md) diff --git a/doc/administration/geo/disaster_recovery/planned_failover.md b/doc/administration/geo/disaster_recovery/planned_failover.md index 13e0938fa59c1d..d5bcdcc7823e68 100644 --- a/doc/administration/geo/disaster_recovery/planned_failover.md +++ b/doc/administration/geo/disaster_recovery/planned_failover.md @@ -56,7 +56,7 @@ site you are about to failover to: rsync --archive --perms --delete root@:/var/opt/gitlab/gitlab-rails/shared/registry/. /var/opt/gitlab/gitlab-rails/shared/registry ``` -Alternatively, you can [back up](../../../raketasks/backup_restore.md#back-up-gitlab) +Alternatively, you can [back up](../../../administration/backup_restore/index.md#back-up-gitlab) the container registry on the primary site and restore it onto the secondary site: @@ -71,7 +71,7 @@ site: 1. Copy the backup tarball generated from your primary site to the `/var/opt/gitlab/backups` folder on your secondary site. -1. On your secondary site, restore the registry following the [Restore GitLab](../../../raketasks/backup_restore.md#restore-gitlab) +1. On your secondary site, restore the registry following the [Restore GitLab](../../../administration/backup_restore/index.md#restore-gitlab) documentation. ## Preflight checks diff --git a/doc/administration/get_started.md b/doc/administration/get_started.md index 60291732a200ef..b4c7a7be18ca26 100644 --- a/doc/administration/get_started.md +++ b/doc/administration/get_started.md @@ -130,7 +130,7 @@ The routine differs, depending on whether you deployed with the Linux package or When backing up (single node) GitLab server installed using the Linux package, you can use a single Rake task. -Learn about [backing up Linux package or Helm variations](../raketasks/backup_restore.md). +Learn about [backing up Linux package or Helm variations](../administration/backup_restore/index.md). This process backs up your entire instance, but does not back up the configuration files. Ensure those are backed up separately. Keep your configuration files and backup archives in a separate location to ensure the encryption keys are not kept with the encrypted data. @@ -163,7 +163,7 @@ For more information about GitLab SaaS backups, see our [Backup FAQ page](https: ### Alternative backup strategies In some situations the Rake task for backups may not be the most optimal solution. Here are some -[alternatives](../raketasks/backup_restore.md) to consider if the Rake task does not work for you. +[alternatives](../administration/backup_restore/index.md) to consider if the Rake task does not work for you. 
#### Option 1: File system snapshot diff --git a/doc/administration/gitaly/index.md b/doc/administration/gitaly/index.md index 8c351faa612926..18866a3ceb05e9 100644 --- a/doc/administration/gitaly/index.md +++ b/doc/administration/gitaly/index.md @@ -78,7 +78,7 @@ the current status of these issues, refer to the referenced issues and epics. Gitaly Cluster does not support snapshot backups. Snapshot backups can cause issues where the Praefect database becomes out of sync with the disk storage. Because of how Praefect rebuilds the replication metadata of Gitaly disk information -during a restore, you should use the [official backup and restore Rake tasks](../../raketasks/backup_restore.md). +during a restore, you should use the [official backup and restore Rake tasks](../../administration/backup_restore/index.md). The [incremental backup method](../../raketasks/backup_gitlab.md#incremental-repository-backups) can be used to speed up Gitaly Cluster backups. diff --git a/doc/administration/logs/index.md b/doc/administration/logs/index.md index 8dcb25e22dfc5d..28d4a4eedc90d1 100644 --- a/doc/administration/logs/index.md +++ b/doc/administration/logs/index.md @@ -1129,7 +1129,7 @@ GitLab also tracks [Prometheus metrics for Praefect](../gitaly/monitoring.md#mon For Omnibus installations, the backup log is located at `/var/log/gitlab/gitlab-rails/backup_json.log`. -This log is populated when a [GitLab backup is created](../../raketasks/backup_restore.md). You can use this log to understand how the backup process performed. +This log is populated when a [GitLab backup is created](../../administration/backup_restore/index.md). You can use this log to understand how the backup process performed. ## Performance bar stats diff --git a/doc/administration/nfs.md b/doc/administration/nfs.md index 3e3712c96453a2..27c838ec832b7e 100644 --- a/doc/administration/nfs.md +++ b/doc/administration/nfs.md @@ -293,7 +293,7 @@ NFS mount point is `/gitlab-nfs`. Then, add the following bind mounts in Using bind mounts requires you to manually make sure the data directories are empty before attempting a restore. Read more about the -[restore prerequisites](../raketasks/backup_restore.md). +[restore prerequisites](../administration/backup_restore/index.md). ### Multiple NFS mounts @@ -315,7 +315,7 @@ provide configuration for [UDP log shipping](https://docs.gitlab.com/omnibus/set Having multiple NFS mounts requires you to manually make sure the data directories are empty before attempting a restore. Read more about the -[restore prerequisites](../raketasks/backup_restore.md). +[restore prerequisites](../administration/backup_restore/index.md). ## Testing NFS diff --git a/doc/administration/object_storage.md b/doc/administration/object_storage.md index f2b966bd180db9..b6c3c3e1b0f990 100644 --- a/doc/administration/object_storage.md +++ b/doc/administration/object_storage.md @@ -825,7 +825,7 @@ See the following additional guides: ### Objects are not included in GitLab backups -As noted in [the backup documentation](../raketasks/backup_restore.md), +As noted in [the backup documentation](../administration/backup_restore/index.md), objects are not included in GitLab backups. You can enable backups with your object storage provider instead. diff --git a/doc/administration/operations/index.md b/doc/administration/operations/index.md index 867ef3236ee39f..8870c4769abdc0 100644 --- a/doc/administration/operations/index.md +++ b/doc/administration/operations/index.md @@ -17,7 +17,7 @@ Keep your GitLab instance up and running. 
- [Use SSH certificates](../../administration/operations/ssh_certificates.md) - [Enable encrypted configuration](../../administration/encrypted_configuration.md) - [Rake tasks](../../raketasks/index.md) -- [Backup and restore](../../raketasks/backup_restore.md) +- [Backup and restore](../../administration/backup_restore/index.md) - [Inactive project deletion](../../administration/inactive_project_deletion.md) - [Move repositories](../../administration/operations/moving_repositories.md) - [Read-only state](../../administration/read_only_gitlab.md) diff --git a/doc/administration/operations/moving_repositories.md b/doc/administration/operations/moving_repositories.md index e9d829f3f08820..43db0655141c3f 100644 --- a/doc/administration/operations/moving_repositories.md +++ b/doc/administration/operations/moving_repositories.md @@ -188,7 +188,7 @@ Each of the approaches we list can or does overwrite data in the target director ### Recommended approach in all cases -For either Gitaly or Gitaly Cluster targets, the GitLab [backup and restore capability](../../raketasks/backup_restore.md) +For either Gitaly or Gitaly Cluster targets, the GitLab [backup and restore capability](../../administration/backup_restore/index.md) should be used. Git repositories are accessed, managed, and stored on GitLab servers by Gitaly as a database. Data loss can result from directly accessing and copying Gitaly files using tools like `rsync`. diff --git a/doc/administration/pages/index.md b/doc/administration/pages/index.md index 459a4dcc6a388c..5c083e8848a093 100644 --- a/doc/administration/pages/index.md +++ b/doc/administration/pages/index.md @@ -1158,7 +1158,7 @@ If you want to store the pages content locally or continue using an NFS server, ## Backup -GitLab Pages are part of the [regular backup](../../raketasks/backup_restore.md), so there is no separate backup to configure. +GitLab Pages are part of the [regular backup](../../administration/backup_restore/index.md), so there is no separate backup to configure. ## Security diff --git a/doc/administration/pages/source.md b/doc/administration/pages/source.md index 2ee9dd653f0b31..92a645ce863731 100644 --- a/doc/administration/pages/source.md +++ b/doc/administration/pages/source.md @@ -491,7 +491,7 @@ To change this value: ## Backup -Pages are part of the [regular backup](../../raketasks/backup_restore.md) so there is nothing to configure. +Pages are part of the [regular backup](../../administration/backup_restore/index.md) so there is nothing to configure. ## Security diff --git a/doc/administration/postgresql/multiple_databases.md b/doc/administration/postgresql/multiple_databases.md index f03781d0ee2dac..b6692b630e7ff5 100644 --- a/doc/administration/postgresql/multiple_databases.md +++ b/doc/administration/postgresql/multiple_databases.md @@ -103,7 +103,7 @@ the other way around. 1. For existing installations, [migrate the data](#migrate-existing-installations) first. -1. [Back up GitLab](../../raketasks/backup_restore.md) +1. [Back up GitLab](../../administration/backup_restore/index.md) in case of unforeseen issues. 1. Stop GitLab: @@ -157,7 +157,7 @@ the other way around. 1. For existing installations, [migrate the data](#migrate-existing-installations) first. -1. [Back up GitLab](../../raketasks/backup_restore.md) +1. [Back up GitLab](../../administration/backup_restore/index.md) in case of unforeseen issues. 1. 
Stop GitLab: diff --git a/doc/administration/read_only_gitlab.md b/doc/administration/read_only_gitlab.md index 3842cf0846b959..adc6f42271c80c 100644 --- a/doc/administration/read_only_gitlab.md +++ b/doc/administration/read_only_gitlab.md @@ -66,7 +66,7 @@ sudo gitlab-ctl start puma If you want to allow users to use the GitLab UI, ensure that the database is read-only: -1. Take a [GitLab backup](../raketasks/backup_restore.md) +1. Take a [GitLab backup](../administration/backup_restore/index.md) in case things don't go as expected. 1. Enter PostgreSQL on the console as an administrator user: diff --git a/doc/administration/repository_storage_paths.md b/doc/administration/repository_storage_paths.md index e5f69abad08472..9967b623773974 100644 --- a/doc/administration/repository_storage_paths.md +++ b/doc/administration/repository_storage_paths.md @@ -73,7 +73,7 @@ For repository storage paths: ### Configure for backups -For [backups](../raketasks/backup_restore.md) to work correctly: +For [backups](../administration/backup_restore/index.md) to work correctly: - The repository storage path cannot be a mount point. - The GitLab user must have correct permissions for the parent directory of the path. diff --git a/doc/administration/troubleshooting/postgresql.md b/doc/administration/troubleshooting/postgresql.md index 120d566a7e76f0..88e0d4e6c6b16a 100644 --- a/doc/administration/troubleshooting/postgresql.md +++ b/doc/administration/troubleshooting/postgresql.md @@ -152,7 +152,7 @@ without having to [reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux- which in this case would restart Puma and Sidekiq. For example, a backup may fail with the following errors in the output of the -[backup command](../../raketasks/backup_restore.md#back-up-gitlab) +[backup command](../../administration/backup_restore/index.md#back-up-gitlab) because the statement timeout was too short: ```plaintext diff --git a/doc/index.md b/doc/index.md index 367f18ec159a73..6b0422310da44a 100644 --- a/doc/index.md +++ b/doc/index.md @@ -39,7 +39,7 @@ Have a look at some of our most popular topics: | [GitLab groups](user/group/index.md) | Manage projects together. | | [Keyword reference for the `.gitlab-ci.yml` file](ci/yaml/index.md) | Available configuration options for `.gitlab-ci.yml` files. | | [Activate GitLab EE with a license](user/admin_area/license.md) | Activate GitLab Enterprise Edition functionality with a license. | -| [Back up and restore GitLab](raketasks/backup_restore.md) | Rake tasks for backing up and restoring GitLab self-managed instances. | +| [Back up and restore GitLab](administration/backup_restore/index.md) | Backing up and restoring GitLab self-managed instances. | | [GitLab release and maintenance policy](policy/maintenance.md) | Policies for version naming and cadence, and also upgrade recommendations. | | [Elasticsearch integration](integration/advanced_search/elasticsearch.md) | Integrate Elasticsearch with GitLab to enable advanced search. | | [Database settings for Linux package installations](https://docs.gitlab.com/omnibus/settings/database.html) | Database settings for self-managed instances installed using Linux packages. 
| diff --git a/doc/install/aws/gitlab_sre_for_aws.md b/doc/install/aws/gitlab_sre_for_aws.md index f957dfa8a652cd..512857b87b5261 100644 --- a/doc/install/aws/gitlab_sre_for_aws.md +++ b/doc/install/aws/gitlab_sre_for_aws.md @@ -80,11 +80,11 @@ All recommendations are for production configurations, including performance tes ### AWS Gitaly backup -Due to the nature of how Praefect tracks the replication metadata of Gitaly disk information, the best backup method is [the official backup and restore Rake tasks](../../raketasks/backup_restore.md). +Due to the nature of how Praefect tracks the replication metadata of Gitaly disk information, the best backup method is [the official backup and restore Rake tasks](../../administration/backup_restore/index.md). ### AWS Gitaly recovery -Gitaly Cluster does not support snapshot backups as these can cause issues where the Praefect database becomes out of syn with the disk storage. Due to the nature of how Praefect rebuilds the replication metadata of Gitaly disk information during a restore, the best recovery method is [the official backup and restore Rake tasks](../../raketasks/backup_restore.md). +Gitaly Cluster does not support snapshot backups as these can cause issues where the Praefect database becomes out of sync with the disk storage. Due to the nature of how Praefect rebuilds the replication metadata of Gitaly disk information during a restore, the best recovery method is [the official backup and restore Rake tasks](../../administration/backup_restore/index.md). ### Gitaly HA in EKS quick start diff --git a/doc/install/aws/manual_install_aws.md b/doc/install/aws/manual_install_aws.md index 92375fff59e910..13dac7071dedbe 100644 --- a/doc/install/aws/manual_install_aws.md +++ b/doc/install/aws/manual_install_aws.md @@ -754,7 +754,7 @@ Read more on configuring an ## Backup and restore -GitLab provides [a tool to back up](../../raketasks/backup_restore.md) +GitLab provides [a tool to back up](../../administration/backup_restore/index.md) and restore its Git data, database, attachments, LFS objects, and so on. Some important things to know: @@ -781,7 +781,7 @@ For GitLab 12.1 and earlier, use `gitlab-rake gitlab:backup:create`. ### Restoring GitLab from a backup -To restore GitLab, first review the [restore documentation](../../raketasks/backup_restore.md#restore-gitlab), +To restore GitLab, first review the [restore documentation](../../administration/backup_restore/index.md#restore-gitlab), and primarily the restore prerequisites. Then, follow the steps under the [Linux package installations section](../../raketasks/restore_gitlab.md#restore-for-omnibus-gitlab-installations). diff --git a/doc/install/docker.md b/doc/install/docker.md index 299de1e2fe2ba3..d12d1ffd7fcde6 100644 --- a/doc/install/docker.md +++ b/doc/install/docker.md @@ -609,7 +609,7 @@ You can create a GitLab backup with: docker exec -t gitlab-backup create ``` -Read more on how to [back up and restore GitLab](../raketasks/backup_restore.md). +Read more on how to [back up and restore GitLab](../administration/backup_restore/index.md).
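For example, one way to schedule this backup from the container host is a cron entry like the following sketch. The container name `gitlab` and the schedule are assumptions, not defaults of the image; adjust them to match your deployment.

```shell
# Hypothetical crontab entry on the Docker host: create a backup every day at 02:00.
# "gitlab" is the assumed container name; CRON=1 suppresses progress output in cron jobs.
0 2 * * * /usr/bin/docker exec -t gitlab gitlab-backup create CRON=1
```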
NOTE: If configuration is provided entirely via the `GITLAB_OMNIBUS_CONFIG` environment variable diff --git a/doc/install/installation.md b/doc/install/installation.md index 4d80a02c9f1fa3..de108fa15a066f 100644 --- a/doc/install/installation.md +++ b/doc/install/installation.md @@ -72,7 +72,7 @@ When following the instructions on this page, you create this directory structur - `/home/git/repositories` - Bare repositories for all projects organized by namespace. This is where the Git repositories which are pushed/pulled are maintained for all projects. **This area contains critical data for projects. - [Keep a backup](../raketasks/backup_restore.md).** + [Keep a backup](../administration/backup_restore/index.md).** The default locations for repositories can be configured in `config/gitlab.yml` of GitLab and `config.yml` of GitLab Shell. diff --git a/doc/install/next_steps.md b/doc/install/next_steps.md index 70b6101b1ebbc6..5d9ca05758146c 100644 --- a/doc/install/next_steps.md +++ b/doc/install/next_steps.md @@ -40,7 +40,7 @@ installation. ## Backup and upgrade -- [Back up and restore GitLab](../raketasks/backup_restore.md): Learn the different +- [Back up and restore GitLab](../administration/backup_restore/index.md): Learn the different ways you can back up or restore GitLab. - [Upgrade GitLab](../update/index.md): Every 22nd of the month, a new feature-rich GitLab version is released. Learn how to upgrade to it, or to an interim release that contains a security fix. diff --git a/doc/integration/mattermost/index.md b/doc/integration/mattermost/index.md index 0f9192f9a84aa6..7ca4ed8a0e8971 100644 --- a/doc/integration/mattermost/index.md +++ b/doc/integration/mattermost/index.md @@ -176,7 +176,7 @@ sudo gitlab-psql -d mattermost_production ## Back up GitLab Mattermost -GitLab Mattermost is not included in the regular [Linux package backup](../../raketasks/backup_restore.md) Rake task. +GitLab Mattermost is not included in the regular [Linux package backup](../../administration/backup_restore/index.md) Rake task. The general Mattermost [backup and disaster recovery](https://docs.mattermost.com/deploy/backup-disaster-recovery.html) documentation can be used as a guide on what needs to be backed up. diff --git a/doc/raketasks/backup_gitlab.md b/doc/raketasks/backup_gitlab.md index 9ed1bc33824a51..269e9ee7f72ab5 100644 --- a/doc/raketasks/backup_gitlab.md +++ b/doc/raketasks/backup_gitlab.md @@ -183,7 +183,7 @@ all repositories and all attachments. You can only restore a backup to **exactly the same version and type (CE/EE)** of GitLab on which it was created. The best way to -[migrate your projects from one server to another](backup_restore.md#migrate-to-a-new-server) is through a backup and restore. +[migrate your projects from one server to another](backup_restore/index.md#migrate-to-a-new-server) is through a backup and restore. 
Depending on your version of GitLab, use the following command if you installed GitLab using the Omnibus package: diff --git a/doc/raketasks/backup_restore.md b/doc/raketasks/backup_restore.md index e661be0d701a7c..8cfd8094614595 100644 --- a/doc/raketasks/backup_restore.md +++ b/doc/raketasks/backup_restore.md @@ -2,239 +2,13 @@ stage: Systems group: Geo info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments +redirect_to: '../administration/backup_restore/index.md' +remove_date: '2023-09-26' --- -# Back up and restore GitLab **(FREE SELF)** +This document was moved to [another location](../administration/backup_restore/index.md). - - -## Why should I care about back up and restore? - -You've invested time and money developing your software. What happens if: - -- your data gets corrupted? -- your data gets accidentally deleted? -- your data gets encrypted by ransomware? -- your cloud provider suffers unexpected downtime? - -These scenarios may be existential threats to your software or organization. You can mitigate all of these risks with a disaster recovery plan that includes backups. - -## Back up GitLab - -For detailed information on backing up GitLab, see [Backup GitLab](backup_gitlab.md). - -## Restore GitLab - -For detailed information on restoring GitLab, see [Restore GitLab](restore_gitlab.md). - -## Migrate to a new server - - - -You can use GitLab backup and restore to migrate your instance to a new server. This section outlines a typical procedure for a GitLab deployment running on a single server. -If you're running GitLab Geo, an alternative option is [Geo disaster recovery for planned failover](../administration/geo/disaster_recovery/planned_failover.md). - -WARNING: -Avoid uncoordinated data processing by both the new and old servers, where multiple -servers could connect concurrently and process the same data. For example, when using -[incoming email](../administration/incoming_email.md), if both GitLab instances are -processing email at the same time, then both instances miss some data. -This type of problem can occur with other services as well, such as a -[non-packaged database](https://docs.gitlab.com/omnibus/settings/database.html#using-a-non-packaged-postgresql-database-management-server), -a non-packaged Redis instance, or non-packaged Sidekiq. - -Prerequisites: - -- Some time before your migration, consider notifying your users of upcoming - scheduled maintenance with a [broadcast message banner](../user/admin_area/broadcast_messages.md). -- Ensure your backups are complete and current. Create a complete system-level backup, or - take a snapshot of all servers involved in the migration, in case destructive commands - (like `rm`) are run incorrectly. - -### Prepare the new server - -To prepare the new server: - -1. Copy the - [SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079) - from the old server to avoid man-in-the-middle attack warnings. - See [Manually replicate the primary site's SSH host keys](../administration/geo/replication/configuration.md#step-2-manually-replicate-the-primary-sites-ssh-host-keys) for example steps. -1. [Install and configure GitLab](https://about.gitlab.com/install/) except - [incoming email](../administration/incoming_email.md): - 1. Install GitLab. - 1. 
Configure by copying `/etc/gitlab` files from the old server to the new server, and update as necessary. - Read the - [Omnibus configuration backup and restore instructions](https://docs.gitlab.com/omnibus/settings/backups.html) for more detail. - 1. If applicable, disable [incoming email](../administration/incoming_email.md). - 1. Block new CI/CD jobs from starting upon initial startup after the backup and restore. - Edit `/etc/gitlab/gitlab.rb` and set the following: - - ```ruby - nginx['custom_gitlab_server_config'] = "location = /api/v4/jobs/request {\n deny all;\n return 503;\n }\n" - ``` - - 1. Reconfigure GitLab: - - ```shell - sudo gitlab-ctl reconfigure - ``` - -1. Stop GitLab to avoid any potential unnecessary and unintentional data processing: - - ```shell - sudo gitlab-ctl stop - ``` - -1. Configure the new server to allow receiving the Redis database and GitLab backup files: - - ```shell - sudo rm -f /var/opt/gitlab/redis/dump.rdb - sudo chown /var/opt/gitlab/redis /var/opt/gitlab/backups - ``` - -### Prepare and transfer content from the old server - -1. Ensure you have an up-to-date system-level backup or snapshot of the old server. -1. Enable [maintenance mode](../administration/maintenance_mode/index.md), - if supported by your GitLab edition. -1. Block new CI/CD jobs from starting: - 1. Edit `/etc/gitlab/gitlab.rb`, and set the following: - - ```ruby - nginx['custom_gitlab_server_config'] = "location = /api/v4/jobs/request {\n deny all;\n return 503;\n }\n" - ``` - - 1. Reconfigure GitLab: - - ```shell - sudo gitlab-ctl reconfigure - ``` - -1. Disable periodic background jobs: - 1. On the left sidebar, expand the top-most chevron (**{chevron-down}**). - 1. Select **Admin Area**. - 1. On the left sidebar, select **Monitoring > Background Jobs**. - 1. Under the Sidekiq dashboard, select **Cron** tab and then - **Disable All**. -1. Wait for the currently running CI/CD jobs to finish, or accept that jobs that have not completed may be lost. - To view jobs currently running, on the left sidebar, select **Overviews > Jobs**, - and then select **Running**. -1. Wait for Sidekiq jobs to finish: - 1. On the left sidebar, select **Monitoring > Background Jobs**. - 1. Under the Sidekiq dashboard, select **Queues** and then **Live Poll**. - Wait for **Busy** and **Enqueued** to drop to 0. - These queues contain work that has been submitted by your users; - shutting down before these jobs complete may cause the work to be lost. - Make note of the numbers shown in the Sidekiq dashboard for post-migration verification. -1. Flush the Redis database to disk, and stop GitLab other than the services needed for migration: - - ```shell - sudo /opt/gitlab/embedded/bin/redis-cli -s /var/opt/gitlab/redis/redis.socket save && sudo gitlab-ctl stop && sudo gitlab-ctl start postgresql && sudo gitlab-ctl start gitaly - ``` - -1. Create a GitLab backup: - - ```shell - sudo gitlab-backup create - ``` - -1. 
Disable the following GitLab services and prevent unintentional restarts by adding the following to the bottom of `/etc/gitlab/gitlab.rb`: - - ```ruby - alertmanager['enable'] = false - gitlab_exporter['enable'] = false - gitlab_pages['enable'] = false - gitlab_workhorse['enable'] = false - grafana['enable'] = false - logrotate['enable'] = false - gitlab_rails['incoming_email_enabled'] = false - nginx['enable'] = false - node_exporter['enable'] = false - postgres_exporter['enable'] = false - postgresql['enable'] = false - prometheus['enable'] = false - puma['enable'] = false - redis['enable'] = false - redis_exporter['enable'] = false - registry['enable'] = false - sidekiq['enable'] = false - ``` - -1. Reconfigure GitLab: - - ```shell - sudo gitlab-ctl reconfigure - ``` - -1. Verify everything is stopped, and confirm no services are running: - - ```shell - sudo gitlab-ctl status - ``` - -1. Transfer the Redis database and GitLab backups to the new server: - - ```shell - sudo scp /var/opt/gitlab/redis/dump.rdb @new-server:/var/opt/gitlab/redis - sudo scp /var/opt/gitlab/backups/your-backup.tar @new-server:/var/opt/gitlab/backups - ``` - -### Restore data on the new server - -1. Restore appropriate file system permissions: - - ```shell - sudo chown gitlab-redis /var/opt/gitlab/redis - sudo chown gitlab-redis:gitlab-redis /var/opt/gitlab/redis/dump.rdb - sudo chown git:root /var/opt/gitlab/backups - sudo chown git:git /var/opt/gitlab/backups/your-backup.tar - ``` - -1. [Restore the GitLab backup](#restore-gitlab). -1. Verify that the Redis database restored correctly: - 1. On the left sidebar, expand the top-most chevron (**{chevron-down}**). - 1. Select **Admin Area**. - 1. On the left sidebar, select **Monitoring > Background Jobs**. - 1. Under the Sidekiq dashboard, verify that the numbers - match with what was shown on the old server. - 1. While still under the Sidekiq dashboard, select **Cron** and then **Enable All** - to re-enable periodic background jobs. -1. Test that read-only operations on the GitLab instance work as expected. For example, browse through project repository files, merge requests, and issues. -1. Disable [Maintenance Mode](../administration/maintenance_mode/index.md), if previously enabled. -1. Test that the GitLab instance is working as expected. -1. If applicable, re-enable [incoming email](../administration/incoming_email.md) and test it is working as expected. -1. Update your DNS or load balancer to point at the new server. -1. Unblock new CI/CD jobs from starting by removing the custom NGINX configuration - you added previously: - - ```ruby - # The following line must be removed - nginx['custom_gitlab_server_config'] = "location = /api/v4/jobs/request {\n deny all;\n return 503;\n }\n" - ``` - -1. Reconfigure GitLab: - - ```shell - sudo gitlab-ctl reconfigure - ``` - -1. Remove the scheduled maintenance [broadcast message banner](../user/admin_area/broadcast_messages.md). - -## Additional notes - -This documentation is for GitLab Community and Enterprise Edition. We back up -GitLab.com and ensure your data is secure. You can't, however, use these -methods to export or back up your data yourself from GitLab.com. - -Issues are stored in the database, and can't be stored in Git itself. 
- -## Related features - -- [Geo](../administration/geo/index.md) -- [Disaster Recovery (Geo)](../administration/geo/disaster_recovery/index.md) -- [Migrating GitLab groups](../user/group/import/index.md) -- [Import and migrate projects](../user/project/import/index.md) + + + + \ No newline at end of file diff --git a/doc/raketasks/index.md b/doc/raketasks/index.md index 47fa7e855a1afb..8725c8ce8a03a9 100644 --- a/doc/raketasks/index.md +++ b/doc/raketasks/index.md @@ -20,7 +20,7 @@ The following Rake tasks are available for use with GitLab: | Tasks | Description | |:------------------------------------------------------|:------------| -| [Back up and restore](backup_restore.md) | Back up, restore, and migrate GitLab instances between servers. | +| [Back up and restore](../administration/backup_restore/index.md) | Back up, restore, and migrate GitLab instances between servers. | | [Clean up](cleanup.md) | Clean up unneeded items from GitLab instances. | | [Development](../development/rake_tasks.md) | Tasks for GitLab contributors. | | [Elasticsearch](../integration/advanced_search/elasticsearch.md#gitlab-advanced-search-rake-tasks) | Maintain Elasticsearch in a GitLab instance. | diff --git a/doc/update/background_migrations.md b/doc/update/background_migrations.md index bf9f2df9e87661..3f8b7be28917bb 100644 --- a/doc/update/background_migrations.md +++ b/doc/update/background_migrations.md @@ -263,7 +263,7 @@ version and manually ensuring that the batched migrations complete successfully. #### Roll back and follow the required upgrade path -1. [Rollback and restore the previously installed version](../raketasks/backup_restore.md) +1. [Rollback and restore the previously installed version](../administration/backup_restore/index.md) 1. Update to either 14.0.5 or 14.1 **before** updating to 14.2+ 1. [Check the status](#check-the-status-of-batched-background-migrations) of the batched background migrations and make sure they are all marked as finished before attempting to upgrade again. If any remain marked as active, diff --git a/doc/update/package/downgrade.md b/doc/update/package/downgrade.md index 7b48f1f4045696..c5b65e29ec4b21 100644 --- a/doc/update/package/downgrade.md +++ b/doc/update/package/downgrade.md @@ -12,7 +12,7 @@ of a package. WARNING: You must at least have a database backup created under the version you are downgrading to. Ideally, you should have a -[full backup archive](../../raketasks/backup_restore.md) +[full backup archive](../../administration/backup_restore/index.md) on hand. The example below demonstrates the downgrade procedure when downgrading between minor diff --git a/doc/update/package/index.md b/doc/update/package/index.md index 3e0d09eb36ee3a..5f15e5ea3a8ff6 100644 --- a/doc/update/package/index.md +++ b/doc/update/package/index.md @@ -56,7 +56,7 @@ sudo touch /etc/gitlab/skip-auto-backup ``` Nevertheless, it is highly recommended to maintain a full up-to-date -[backup](../../raketasks/backup_restore.md) on your own. +[backup](../../administration/backup_restore/index.md) on your own. 
## Upgrade using the official repositories @@ -178,7 +178,7 @@ To download and install GitLab: # Debian/Ubuntu dpkg -i - # RHEL/CentOS 6 and 7 + # RHEL/CentOS 6 and 7 rpm -Uvh # RHEL/CentOS 8 diff --git a/doc/update/patch_versions.md b/doc/update/patch_versions.md index 964c6430a166f1..3211b732f0a859 100644 --- a/doc/update/patch_versions.md +++ b/doc/update/patch_versions.md @@ -14,7 +14,7 @@ You can select the tag in the version dropdown list in the upper-left corner of ### 0. Backup -Make a backup just in case things go south. Depending on the installation method, backup commands vary. See the [backing up and restoring GitLab](../raketasks/backup_restore.md) documentation. +Make a backup just in case things go south. Depending on the installation method, backup commands vary. See the [backing up and restoring GitLab](../administration/backup_restore/index.md) documentation. ### 1. Stop server diff --git a/doc/update/plan_your_upgrade.md b/doc/update/plan_your_upgrade.md index 9378b104c8171b..a96e02f589d1ad 100644 --- a/doc/update/plan_your_upgrade.md +++ b/doc/update/plan_your_upgrade.md @@ -75,7 +75,7 @@ Create a backup of GitLab and all its data (database, repositories, uploads, bui artifacts, LFS objects, registry, pages). This is vital for making it possible to roll back GitLab to a working state if there's a problem with the upgrade: -- Create a [GitLab backup](../raketasks/backup_restore.md). +- Create a [GitLab backup](../administration/backup_restore/index.md). Make sure to follow the instructions based on your installation method. Don't forget to back up the [secrets and configuration files](../raketasks/backup_gitlab.md#storing-configuration-files). - Alternatively, create a snapshot of your instance. If this is a multi-node @@ -89,9 +89,9 @@ If you have a test environment that mimics your production one, we recommend tes To restore your GitLab backup: - Before restoring, make sure to read about the - [prerequisites](../raketasks/backup_restore.md#restore-gitlab), most importantly, + [prerequisites](../administration/backup_restore/index.md#restore-gitlab), most importantly, the versions of the backed up and the new GitLab instance must be the same. -- [Restore GitLab](../raketasks/backup_restore.md#restore-gitlab). +- [Restore GitLab](../administration/backup_restore/index.md#restore-gitlab). Make sure to follow the instructions based on your installation method. Confirm that the [secrets and configuration files](../raketasks/backup_gitlab.md#storing-configuration-files) are also restored. - If restoring from a snapshot, know the steps to do this. diff --git a/doc/user/profile/account/two_factor_authentication.md b/doc/user/profile/account/two_factor_authentication.md index b856366bb58f64..85d09c91d3bcc6 100644 --- a/doc/user/profile/account/two_factor_authentication.md +++ b/doc/user/profile/account/two_factor_authentication.md @@ -465,7 +465,7 @@ a GitLab global administrator disable 2FA for your account: ## Information for GitLab administrators **(FREE SELF)** -- Take care that 2FA keeps working after [restoring a GitLab backup](../../../raketasks/backup_restore.md). +- Take care that 2FA keeps working after [restoring a GitLab backup](../../../administration/backup_restore/index.md). - To ensure 2FA authorizes correctly with a time-based one-time password (TOTP) server, synchronize your GitLab server's time using a service like NTP. Otherwise, authorization can always fail because of time differences. 
- The GitLab WebAuthn implementation does _not_ work when the GitLab instance is accessed from multiple hostnames diff --git a/doc/user/project/import/index.md b/doc/user/project/import/index.md index 9fdb8eed5aa234..5542f16377bff5 100644 --- a/doc/user/project/import/index.md +++ b/doc/user/project/import/index.md @@ -92,7 +92,7 @@ over a series of Docker pulls and pushes. Re-run any CI pipelines to retrieve an ## Migrate between two self-managed GitLab instances To migrate from an existing self-managed GitLab instance to a new self-managed GitLab instance, -you should [back up](../../../raketasks/backup_restore.md) +you should [back up](../../../administration/backup_restore/index.md) the existing instance and restore it on the new instance. For example, you could use this method to migrate a self-managed instance from an old server to a new server. The backups produced don't depend on the operating system running GitLab. You can therefore use diff --git a/doc/user/project/issues/managing_issues.md b/doc/user/project/issues/managing_issues.md index 6f1c14b2b8031a..3f1f14f66cebf0 100644 --- a/doc/user/project/issues/managing_issues.md +++ b/doc/user/project/issues/managing_issues.md @@ -155,7 +155,7 @@ Prerequisites: To do it: -1. Optional (but recommended). [Create a backup](../../../raketasks/backup_restore.md) before +1. Optional (but recommended). [Create a backup](../../../administration/backup_restore/index.md) before attempting any changes in the console. 1. Open the [Rails console](../../../administration/operations/rails_console.md). 1. Run the following script. Make sure to change `project`, `admin_user`, and `target_project` to diff --git a/spec/features/projects/files/user_browses_files_spec.rb b/spec/features/projects/files/user_browses_files_spec.rb index bb14b9c4e312e4..3b30a620257dea 100644 --- a/spec/features/projects/files/user_browses_files_spec.rb +++ b/spec/features/projects/files/user_browses_files_spec.rb @@ -134,7 +134,7 @@ click_link("Rake tasks") expect(page).to have_current_path(project_tree_path(project, "markdown/doc/raketasks"), ignore_query: true) - expect(page).to have_content("backup_restore.md").and have_content("maintenance.md") + expect(page).to have_content("maintenance.md") click_link("maintenance.md") -- GitLab From 17c5d5a620f61af0a510abf8cc1080fab970dad6 Mon Sep 17 00:00:00 2001 From: Mike Kozono Date: Wed, 21 Jun 2023 14:54:59 -1000 Subject: [PATCH 04/13] Move backup_gitlab.md to doc/administration/backup_restore directory. 
--- .../backup_restore/backup_gitlab.md | 1870 +++++++++++++++++ doc/administration/backup_restore/index.md | 2 +- .../geo/disaster_recovery/planned_failover.md | 2 +- doc/administration/gitaly/index.md | 4 +- doc/administration/object_storage.md | 6 +- .../operations/moving_repositories.md | 4 +- doc/administration/postgresql/pgbouncer.md | 4 +- .../postgresql/replication_and_failover.md | 2 +- doc/administration/raketasks/check.md | 2 +- doc/administration/raketasks/storage.md | 4 +- .../reference_architectures/index.md | 2 +- doc/ci/variables/index.md | 2 +- doc/install/aws/manual_install_aws.md | 6 +- doc/install/docker.md | 4 +- doc/raketasks/backup_gitlab.md | 1870 +---------------- doc/update/package/convert_to_ee.md | 2 +- doc/update/plan_your_upgrade.md | 4 +- doc/user/project/repository/mirror/index.md | 2 +- 18 files changed, 1903 insertions(+), 1889 deletions(-) create mode 100644 doc/administration/backup_restore/backup_gitlab.md diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md new file mode 100644 index 00000000000000..31034126f9ec56 --- /dev/null +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -0,0 +1,1870 @@ +--- +stage: Systems +group: Geo +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments +--- + +# Back up GitLab **(FREE SELF)** + +The exact procedure for backing up GitLab depends on many factors. Your particular deployment's usage and configuration determine what kind of data exists, where it is located, and how much there is. These factors influence your options for how to perform a back up, how to store it, and how to restore it. + +## What data needs to be backed up? + +- [PostgreSQL databases](#postgresql-databases) +- [Git repositories](#git-repositories) +- [Blobs](#blobs) +- [Storing configuration files](#storing-configuration-files) +- [Other data](#other-data) + +### PostgreSQL databases + +In the simplest case, GitLab has one PostgreSQL database in one PostgreSQL server on the same VM as all other GitLab services. But depending on configuration, GitLab may use multiple PostgreSQL databases in multiple PostgreSQL servers. + +In general, this data is the single source of truth for most user-generated content in the Web interface, such as issue and merge request content, comments, permissions, and credentials. + +PostgreSQL also holds some cached data like HTML-rendered Markdown, and by default, merge request diffs. +Although merge request diffs can also be configured to be offloaded to the file system or object storage, see [Blobs](#blobs). + +Gitaly Cluster's Praefect service uses a PostgreSQL database as a single source of truth to manage its Gitaly nodes. + +A common PostgreSQL utility, [`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html), produces a backup file which can be used to restore a PostgreSQL database. The [backup command](#backup-command) uses this utility under the hood. + +Unfortunately, `pg_dump` takes longer to execute, the larger the database. Depending on your situation, the duration becomes impractical at some point (days, for example). If your database is over 100GB, `pg_dump`, and by extension the [backup command](#backup-command), is likely not usable. + +### Git repositories + +A GitLab instance can have one or more repository shards. 
Each shard is a Gitaly instance or Gitaly Cluster that +is responsible for allowing access and operations on the locally stored Git repositories. Gitaly can run +on a machine: + +- With a single disk. +- With multiple disks mounted as a single mount-point (like with a RAID array). +- Using LVM. + +Gitaly may work with NFS or a mounted Storage Appliance, but it is [not officially supported](../nfs.md#gitaly-with-nfs-not-supported) because Git requires low latency. + +Each project can have up to 3 different repositories: + +- A project repository, where the source code is stored. +- A wiki repository, where the wiki content is stored. +- A design repository, where design artifacts are indexed (assets are actually in LFS). + +They all live in the same shard and share the same base name with a `-wiki` and `-design` suffix +for Wiki and Design Repository cases. + +Personal and project snippets, and group wiki content, are stored in Git repositories. + +Project forks are deduplicated in a live GitLab site using pool repositories. + +The [backup command](#backup-command) produces a Git bundle for each repository and tars them all up. This duplicates pool repository data into every fork. For more information, see [Alternative backup strategies](#alternative-backup-strategies). + +### Blobs + +GitLab stores blobs (or files) such as issue attachments or LFS objects into either: + +- The file system in a specific location. +- An [Object Storage](../object_storage.md) solution. Object Storage solutions can be: + - Cloud based, like Amazon S3 or Google Cloud Storage. + - Hosted by you (like MinIO). + - A Storage Appliance that exposes an Object Storage-compatible API. + +### Storing configuration files + +The [backup Rake task](#back-up-gitlab) GitLab provides does _not_ store your +configuration files. The primary reason for this is that your database contains +items including encrypted information for two-factor authentication and the +CI/CD _secure variables_. Storing encrypted information in the same location +as its key defeats the purpose of using encryption in the first place. + +WARNING: +The secrets file is essential to preserve your database encryption key. + +At the very **minimum**, you must back up: + +For Omnibus: + +- `/etc/gitlab/gitlab-secrets.json` +- `/etc/gitlab/gitlab.rb` + +For installation from source: + +- `/home/git/gitlab/config/secrets.yml` +- `/home/git/gitlab/config/gitlab.yml` + +For [Docker installations](../../install/docker.md), you must +back up the volume where the configuration files are stored. If you created +the GitLab container according to the documentation, it should be in the +`/srv/gitlab/config` directory. + +For [GitLab Helm chart installations](https://gitlab.com/gitlab-org/charts/gitlab) +on a Kubernetes cluster, you must follow the +[Back up the secrets](https://docs.gitlab.com/charts/backup-restore/backup.html#back-up-the-secrets) +instructions. + +You may also want to back up any TLS keys and certificates (`/etc/gitlab/ssl`, `/etc/gitlab/trusted-certs`), and your +[SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079) +to avoid man-in-the-middle attack warnings if you have to perform a full machine restore. + +If you use Omnibus GitLab, review additional information to +[backup your configuration](https://docs.gitlab.com/omnibus/settings/backups.html). + +In the unlikely event that the secrets file is lost, see the +[troubleshooting section](#when-the-secrets-file-is-lost).
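As an illustration only, a minimal sketch of archiving the Linux package (Omnibus) configuration might look like the following. The destination directory `/secure/config-backups` is an assumption, not a GitLab convention; keep the archive separate from the application data backups so the encryption keys are not stored with the encrypted data.

```shell
# Minimal sketch, assuming a Linux package (Omnibus) installation; adjust paths for source installs.
# /secure/config-backups is a placeholder for separate, access-controlled storage.
sudo mkdir -p /secure/config-backups
sudo tar -czf /secure/config-backups/gitlab-config-$(date +%Y%m%d).tar.gz \
  /etc/gitlab/gitlab-secrets.json \
  /etc/gitlab/gitlab.rb
# Optionally add /etc/gitlab/ssl and /etc/gitlab/trusted-certs to the archive if they exist.
```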
+ +### Other data + +GitLab uses Redis both as a cache store and to hold persistent data for our background jobs system, Sidekiq. The provided [backup command](#backup-command) does _not_ back up Redis data. This means to take a consistent backup with the [backup command](#backup-command), there must be no pending or running background jobs. It is possible to manually back up Redis. + +Elasticsearch is an optional database for advanced search. It can improve search +in both source-code level, and user generated content in issues, merge requests, and discussions. The [backup command](#backup-command) does _not_ back up Elasticsearch data. Elasticsearch data can be regenerated from PostgreSQL data after a restore. It is possible to manually back up Elasticsearch. + +## Command line interface + +GitLab provides a command line interface to back up your entire instance, +including: + +- Database +- Attachments +- Git repositories data +- CI/CD job output logs +- CI/CD job artifacts +- LFS objects +- Terraform states ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/331806) in GitLab 14.7) +- Container Registry images +- GitLab Pages content +- Packages ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/332006) in GitLab 14.7) +- Snippets +- [Group wikis](../../user/project/wiki/group.md) +- Project-level Secure Files ([introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/121142) in GitLab 16.1) + +Backups do not include: + +- [Mattermost data](https://docs.mattermost.com/administration/config-settings.html#file-storage) +- Redis (and thus Sidekiq jobs) + +WARNING: +GitLab does not back up any configuration files (`/etc/gitlab`), TLS keys and certificates, or system +files. You are highly advised to read about [storing configuration files](#storing-configuration-files). + +### Requirements + +To be able to back up and restore, ensure that Rsync is installed on your +system. If you installed GitLab: + +- _Using the Omnibus package_, Rsync is already installed. +- _From source_, check if `rsync` is installed. If Rsync is not installed, install it. For example: + + ```shell + # Debian/Ubuntu + sudo apt-get install rsync + + # RHEL/CentOS + sudo yum install rsync + ``` + +### Backup command + +WARNING: +GitLab doesn't back up items that aren't stored on the file system. If you're +using [object storage](../object_storage.md), be sure to enable +backups with your object storage provider, if desired. + +WARNING: +The backup command requires [additional parameters](#back-up-and-restore-for-installations-using-pgbouncer) when +your installation is using PgBouncer, for either performance reasons or when using it with a Patroni cluster. + +WARNING: +Before GitLab 15.5.0, the backup command doesn't verify if another backup is already running, as described in +[issue 362593](https://gitlab.com/gitlab-org/gitlab/-/issues/362593). We strongly recommend +you make sure that all backups are complete before starting a new one. + +An application data backup creates an archive file that contains the database, +all repositories and all attachments. + +You can only restore a backup to **exactly the same version and type (CE/EE)** +of GitLab on which it was created. The best way to +[migrate your projects from one server to another](index.md#migrate-to-a-new-server) is through a backup and restore. 
+ +Depending on your version of GitLab, use the following command if you installed +GitLab using the Omnibus package: + +- GitLab 12.2 or later: + + ```shell + sudo gitlab-backup create + ``` + +- GitLab 12.1 and earlier: + + ```shell + gitlab-rake gitlab:backup:create + ``` + +If you installed GitLab from source, use the following command: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:create RAILS_ENV=production +``` + +If you're running GitLab from within a Docker container, run the backup from +the host, based on your installed version of GitLab: + +- GitLab 12.2 or later: + + ```shell + docker exec -t gitlab-backup create + ``` + +- GitLab 12.1 and earlier: + + ```shell + docker exec -t gitlab-rake gitlab:backup:create + ``` + +If you're using the [GitLab Helm chart](https://gitlab.com/gitlab-org/charts/gitlab) +on a Kubernetes cluster, you can run the backup task by using `kubectl` to run the `backup-utility` +script on the GitLab toolbox pod. For more details, see the +[charts backup documentation](https://docs.gitlab.com/charts/backup-restore/backup.html). + +Similar to the Kubernetes case, if you have scaled out your GitLab cluster to +use multiple application servers, you should pick a designated node (that isn't +auto-scaled away) for running the backup Rake task. Because the backup Rake +task is tightly coupled to the main Rails application, this is typically a node +on which you're also running Puma or Sidekiq. + +Example output: + +```plaintext +Dumping database tables: +- Dumping table events... [DONE] +- Dumping table issues... [DONE] +- Dumping table keys... [DONE] +- Dumping table merge_requests... [DONE] +- Dumping table milestones... [DONE] +- Dumping table namespaces... [DONE] +- Dumping table notes... [DONE] +- Dumping table projects... [DONE] +- Dumping table protected_branches... [DONE] +- Dumping table schema_migrations... [DONE] +- Dumping table services... [DONE] +- Dumping table snippets... [DONE] +- Dumping table taggings... [DONE] +- Dumping table tags... [DONE] +- Dumping table users... [DONE] +- Dumping table users_projects... [DONE] +- Dumping table web_hooks... [DONE] +- Dumping table wikis... [DONE] +Dumping repositories: +- Dumping repository abcd... [DONE] +Creating backup archive: $TIMESTAMP_gitlab_backup.tar [DONE] +Deleting tmp directories...[DONE] +Deleting old backups... [SKIPPING] +``` + +### Backup timestamp + +The backup archive is saved in `backup_path`, which is specified in the +`config/gitlab.yml` file. The filename is `[TIMESTAMP]_gitlab_backup.tar`, +where `TIMESTAMP` identifies the time at which each backup was created, plus +the GitLab version. The timestamp is needed if you need to restore GitLab and +multiple backups are available. + +For example, if the backup name is `1493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar`, +the timestamp is `1493107454_2018_04_25_10.6.4-ce`. + +### Backup options + +The command line tool GitLab provides to backup your instance can accept more +options. + +#### Backup strategy option + +The default backup strategy is to essentially stream data from the respective +data locations to the backup using the Linux command `tar` and `gzip`. This works +fine in most cases, but can cause problems when data is rapidly changing. + +When data changes while `tar` is reading it, the error `file changed as we read +it` may occur, and causes the backup process to fail. To combat this, 8.17 +introduces a new backup strategy called `copy`. 
The strategy copies data files +to a temporary location before calling `tar` and `gzip`, avoiding the error. + +A side-effect is that the backup process takes up to an additional 1X disk +space. The process does its best to clean up the temporary files at each stage +so the problem doesn't compound, but it could be a considerable change for large +installations. This is why the `copy` strategy is not the default in 8.17. + +To use the `copy` strategy instead of the default streaming strategy, specify +`STRATEGY=copy` in the Rake task command. For example: + +```shell +sudo gitlab-backup create STRATEGY=copy +``` + +Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. + +#### Backup filename + +WARNING: +If you use a custom backup filename, you can't +[limit the lifetime of the backups](#limit-backup-lifetime-for-local-files-prune-old-backups). + +By default, a backup file is created according to the specification in the +previous [Backup timestamp](#backup-timestamp) section. You can, however, +override the `[TIMESTAMP]` portion of the filename by setting the `BACKUP` +environment variable. For example: + +```shell +sudo gitlab-backup create BACKUP=dump +``` + +Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. + +The resulting file is named `dump_gitlab_backup.tar`. This is useful for +systems that make use of rsync and incremental backups, and results in +considerably faster transfer speeds. + +#### Confirm archive can be transferred + +To ensure the generated archive is transferable by rsync, you can set the `GZIP_RSYNCABLE=yes` +option. This sets the `--rsyncable` option to `gzip`, which is useful only in +combination with setting [the Backup filename option](#backup-filename). + +The `--rsyncable` option in `gzip` isn't guaranteed to be available +on all distributions. To verify that it's available in your distribution, run +`gzip --help` or consult the man pages. + +```shell +sudo gitlab-backup create BACKUP=dump GZIP_RSYNCABLE=yes +``` + +Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. + +#### Excluding specific directories from the backup + +You can exclude specific directories from the backup by adding the environment variable `SKIP`, whose values are a comma-separated list of the following options: + +- `db` (database) +- `uploads` (attachments) +- `builds` (CI job output logs) +- `artifacts` (CI job artifacts) +- `lfs` (LFS objects) +- `terraform_state` (Terraform states) +- `registry` (Container Registry images) +- `pages` (Pages content) +- `repositories` (Git repositories data) +- `packages` (Packages) +- `ci_secure_files` (Project-level Secure Files) + +NOTE: +When [backing up and restoring Helm Charts](https://docs.gitlab.com/charts/architecture/backup-restore.html), there is an additional option `packages`, which refers to any packages managed by the GitLab [package registry](../../user/packages/package_registry/index.md). +For more information see [command line arguments](https://docs.gitlab.com/charts/architecture/backup-restore.html#command-line-arguments). + +All wikis are backed up as part of the `repositories` group. Non-existent +wikis are skipped during a backup. + +For Omnibus GitLab packages: + +```shell +sudo gitlab-backup create SKIP=db,uploads +``` + +Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. 
+ +For installations from source: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:create SKIP=db,uploads RAILS_ENV=production +``` + +`SKIP=` is also used to: + +- [Skip creation of the tar file](#skipping-tar-creation) (`SKIP=tar`). +- [Skip uploading the backup to remote storage](#skip-uploading-backups-to-remote-storage) (`SKIP=remote`). + +#### Skipping tar creation + +NOTE: +It is not possible to skip the tar creation when using [object storage](#upload-backups-to-a-remote-cloud-storage) for backups. + +The last part of creating a backup is generation of a `.tar` file containing +all the parts. In some cases (for example, if the backup is picked up by other +backup software) creating a `.tar` file might be wasted effort or even directly +harmful, so you can skip this step by adding `tar` to the `SKIP` environment +variable. + +Adding `tar` to the `SKIP` variable leaves the files and directories containing the +backup in the directory used for the intermediate files. These files are +overwritten when a new backup is created, so you should make sure they are copied +elsewhere, because you can only have one backup on the system. + +For Omnibus GitLab packages: + +```shell +sudo gitlab-backup create SKIP=tar +``` + +For installations from source: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:create SKIP=tar RAILS_ENV=production +``` + +#### Back up Git repositories concurrently + +> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37158) in GitLab 13.3. +> - [Concurrent restore introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/69330) in GitLab 14.3 + +When using [multiple repository storages](../repository_storage_paths.md), +repositories can be backed up or restored concurrently to help fully use CPU time. The +following variables are available to modify the default behavior of the Rake +task: + +- `GITLAB_BACKUP_MAX_CONCURRENCY`: The maximum number of projects to back up at + the same time. Defaults to the number of logical CPUs (in GitLab 14.1 and + earlier, defaults to `1`). +- `GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY`: The maximum number of projects to + back up at the same time on each storage. This allows the repository backups + to be spread across storages. Defaults to `2` (in GitLab 14.1 and earlier, + defaults to `1`). + +For example, for Omnibus GitLab installations with 4 repository storages: + +```shell +sudo gitlab-backup create GITLAB_BACKUP_MAX_CONCURRENCY=4 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1 +``` + +For example, for installations from source: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:create GITLAB_BACKUP_MAX_CONCURRENCY=4 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1 +``` + +#### Incremental repository backups + +> - Introduced in GitLab 14.9 [with a flag](../feature_flags.md) named `incremental_repository_backup`. Disabled by default. +> - [Enabled on self-managed](https://gitlab.com/gitlab-org/gitlab/-/issues/355945) in GitLab 14.10. +> - `PREVIOUS_BACKUP` option [introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/4184) in GitLab 15.0. + +FLAG: +On self-managed GitLab, by default this feature is available. To hide the feature, ask an administrator to [disable the feature flag](../feature_flags.md) named `incremental_repository_backup`. +On GitLab.com, this feature is not available. + +NOTE: +Only repositories support incremental backups. Therefore, if you use `INCREMENTAL=yes`, the task +creates a self-contained backup tar archive. 
This is because all subtasks except repositories are +still creating full backups (they overwrite the existing full backup). +See [issue 19256](https://gitlab.com/gitlab-org/gitlab/-/issues/19256) for a feature request to +support incremental backups for all subtasks. + +Incremental repository backups can be faster than full repository backups because they only pack changes since the last backup into the backup bundle for each repository. +The incremental backup archives are not linked to each other: each archive is a self-contained backup of the instance. There must be an existing backup +to create an incremental backup from: + +- In GitLab 14.9 and 14.10, use the `BACKUP=` option to choose the backup to use. The chosen previous backup is overwritten. +- In GitLab 15.0 and later, use the `PREVIOUS_BACKUP=` option to choose the backup to use. By default, a backup file is created + as documented in the [Backup timestamp](#backup-timestamp) section. You can override the `[TIMESTAMP]` portion of the filename by setting the + [`BACKUP` environment variable](#backup-filename). + +To create an incremental backup, run: + +- In GitLab 15.0 or later: + + ```shell + sudo gitlab-backup create INCREMENTAL=yes PREVIOUS_BACKUP= + ``` + +- In GitLab 14.9 and 14.10: + + ```shell + sudo gitlab-backup create INCREMENTAL=yes BACKUP= + ``` + +To create an [untarred](#skipping-tar-creation) incremental backup from a tarred backup, use `SKIP=tar`: + +```shell +sudo gitlab-backup create INCREMENTAL=yes SKIP=tar +``` + +#### Back up specific repository storages + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/86896) in GitLab 15.0. + +When using [multiple repository storages](../repository_storage_paths.md), +repositories from specific repository storages can be backed up separately +using the `REPOSITORIES_STORAGES` option. The option accepts a comma-separated list of +storage names. + +For example, for Omnibus GitLab installations: + +```shell +sudo gitlab-backup create REPOSITORIES_STORAGES=storage1,storage2 +``` + +For example, for installations from source: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:create REPOSITORIES_STORAGES=storage1,storage2 +``` + +#### Back up specific repositories + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/88094) in GitLab 15.1. + +You can back up specific repositories using the `REPOSITORIES_PATHS` option. +Similarly, you can use `SKIP_REPOSITORIES_PATHS` to skip certain repositories. +Both options accept a comma-separated list of project or group paths. If you +specify a group path, all repositories in all projects in the group and +descendent groups are included or skipped, depending on which option you used. + +For example, to back up all repositories for all projects in **Group A** (`group-a`), the repository for **Project C** in **Group B** (`group-b/project-c`), +and skip the **Project D** in **Group A** (`group-a/project-d`): + +- Omnibus GitLab installations: + + ```shell + sudo gitlab-backup create REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d + ``` + +- Installations from source: + + ```shell + sudo -u git -H bundle exec rake gitlab:backup:create REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d + ``` + +#### Upload backups to a remote (cloud) storage + +NOTE: +It is not possible to [skip the tar creation](#skipping-tar-creation) when using object storage for backups. 
+ +You can let the backup script upload (using the [Fog library](https://fog.io/)) +the `.tar` file it creates. In the following example, we use Amazon S3 for +storage, but Fog also lets you use [other storage providers](https://fog.io/storage/). +GitLab also [imports cloud drivers](https://gitlab.com/gitlab-org/gitlab/-/blob/da46c9655962df7d49caef0e2b9f6bbe88462a02/Gemfile#L113) +for AWS, Google, and Aliyun. A local driver is +[also available](#upload-to-locally-mounted-shares). + +[Read more about using object storage with GitLab](../object_storage.md). + +##### Using Amazon S3 + +For Omnibus GitLab packages: + +1. Add the following to `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['backup_upload_connection'] = { + 'provider' => 'AWS', + 'region' => 'eu-west-1', + 'aws_access_key_id' => 'AKIAKIAKI', + 'aws_secret_access_key' => 'secret123' + # If using an IAM Profile, don't configure aws_access_key_id & aws_secret_access_key + # 'use_iam_profile' => true + } + gitlab_rails['backup_upload_remote_directory'] = 'my.s3.bucket' + # Consider using multipart uploads when file size reaches 100MB. Enter a number in bytes. + # gitlab_rails['backup_multipart_chunk_size'] = 104857600 + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) + for the changes to take effect + +##### S3 Encrypted Buckets + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64765) in GitLab 14.3. + +AWS supports these [modes for server side encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/serv-side-encryption.html): + +- Amazon S3-Managed Keys (SSE-S3) +- Customer Master Keys (CMKs) Stored in AWS Key Management Service (SSE-KMS) +- Customer-Provided Keys (SSE-C) + +Use your mode of choice with GitLab. Each mode has similar, but slightly +different, configuration methods. + +###### SSE-S3 + +To enable SSE-S3, in the backup storage options set the `server_side_encryption` +field to `AES256`. For example, in Omnibus GitLab: + +```ruby +gitlab_rails['backup_upload_storage_options'] = { + 'server_side_encryption' => 'AES256' +} +``` + +###### SSE-KMS + +To enable SSE-KMS, you need the +[KMS key via its Amazon Resource Name (ARN) in the `arn:aws:kms:region:acct-id:key/key-id` format](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html). +Under the `backup_upload_storage_options` configuration setting, set: + +- `server_side_encryption` to `aws:kms`. +- `server_side_encryption_kms_key_id` to the ARN of the key. + +For example, in Omnibus GitLab: + +```ruby +gitlab_rails['backup_upload_storage_options'] = { + 'server_side_encryption' => 'aws:kms', + 'server_side_encryption_kms_key_id' => 'arn:aws::' +} +``` + +###### SSE-C + +SSE-C requires you to set these encryption options: + +- `backup_encryption`: AES256. +- `backup_encryption_key`: Unencoded, 32-byte (256 bits) key. The upload fails if this isn't exactly 32 bytes. + +For example, in Omnibus GitLab: + +```ruby +gitlab_rails['backup_encryption'] = 'AES256' +gitlab_rails['backup_encryption_key'] = '' +``` + +If the key contains binary characters and cannot be encoded in UTF-8, +instead, specify the key with the `GITLAB_BACKUP_ENCRYPTION_KEY` environment variable. +For example: + +```ruby +gitlab_rails['env'] = { 'GITLAB_BACKUP_ENCRYPTION_KEY' => "\xDE\xAD\xBE\xEF" * 8 } +``` + +##### Digital Ocean Spaces + +This example can be used for a bucket in Amsterdam (AMS3): + +1. 
Add the following to `/etc/gitlab/gitlab.rb`:
+
+   ```ruby
+   gitlab_rails['backup_upload_connection'] = {
+     'provider' => 'AWS',
+     'region' => 'ams3',
+     'aws_access_key_id' => 'AKIAKIAKI',
+     'aws_secret_access_key' => 'secret123',
+     'endpoint' => 'https://ams3.digitaloceanspaces.com'
+   }
+   gitlab_rails['backup_upload_remote_directory'] = 'my.s3.bucket'
+   ```
+
+1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation)
+   for the changes to take effect
+
+If you see a `400 Bad Request` error message when using Digital Ocean Spaces,
+the cause may be the use of backup encryption. Because Digital Ocean Spaces
+doesn't support encryption, remove or comment the line that contains
+`gitlab_rails['backup_encryption']`.
+
+##### Other S3 Providers
+
+Not all S3 providers are fully compatible with the Fog library. For example,
+if you see a `411 Length Required` error message after attempting to upload,
+you may need to downgrade the `aws_signature_version` value from the default
+value to `2`, [due to this issue](https://github.com/fog/fog-aws/issues/428).
+
+For installations from source:
+
+1. Edit `/home/git/gitlab/config/gitlab.yml`:
+
+   ```yaml
+   backup:
+     # snip
+     upload:
+       # Fog storage connection settings, see https://fog.io/storage/ .
+       connection:
+         provider: AWS
+         region: eu-west-1
+         aws_access_key_id: AKIAKIAKI
+         aws_secret_access_key: 'secret123'
+         # If using an IAM Profile, leave aws_access_key_id & aws_secret_access_key empty
+         # ie. aws_access_key_id: ''
+         # use_iam_profile: 'true'
+       # The remote 'directory' to store your backups. For S3, this would be the bucket name.
+       remote_directory: 'my.s3.bucket'
+       # Specifies Amazon S3 storage class to use for backups, this is optional
+       # storage_class: 'STANDARD'
+       #
+       # Turns on AWS Server-Side Encryption with Amazon Customer-Provided Encryption Keys for backups, this is optional
+       # 'encryption' must be set in order for this to have any effect.
+       # 'encryption_key' should be set to the 256-bit encryption key for Amazon S3 to use to encrypt or decrypt your data.
+       # To avoid storing the key on disk, the key can also be specified via the `GITLAB_BACKUP_ENCRYPTION_KEY` environment variable.
+       # encryption: 'AES256'
+       # encryption_key: ''
+       #
+       #
+       # Turns on AWS Server-Side Encryption with Amazon S3-Managed keys (optional)
+       # https://docs.aws.amazon.com/AmazonS3/latest/userguide/serv-side-encryption.html
+       # For SSE-S3, set 'server_side_encryption' to 'AES256'.
+       # For SSE-KMS, set 'server_side_encryption' to 'aws:kms'. Set
+       # 'server_side_encryption_kms_key_id' to the ARN of the customer master key.
+       # storage_options:
+       #   server_side_encryption: 'aws:kms'
+       #   server_side_encryption_kms_key_id: 'arn:aws:kms:YOUR-KEY-ID-HERE'
+   ```
+
+1. [Restart GitLab](../restart_gitlab.md#installations-from-source)
+   for the changes to take effect
+
+If you're uploading your backups to S3, you should create a new
+IAM user with restricted access rights.
To give the upload user access only for +uploading backups create the following IAM profile, replacing `my.s3.bucket` +with the name of your bucket: + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Sid": "Stmt1412062044000", + "Effect": "Allow", + "Action": [ + "s3:AbortMultipartUpload", + "s3:GetBucketAcl", + "s3:GetBucketLocation", + "s3:GetObject", + "s3:GetObjectAcl", + "s3:ListBucketMultipartUploads", + "s3:PutObject", + "s3:PutObjectAcl" + ], + "Resource": [ + "arn:aws:s3:::my.s3.bucket/*" + ] + }, + { + "Sid": "Stmt1412062097000", + "Effect": "Allow", + "Action": [ + "s3:GetBucketLocation", + "s3:ListAllMyBuckets" + ], + "Resource": [ + "*" + ] + }, + { + "Sid": "Stmt1412062128000", + "Effect": "Allow", + "Action": [ + "s3:ListBucket" + ], + "Resource": [ + "arn:aws:s3:::my.s3.bucket" + ] + } + ] +} +``` + +##### Using Google Cloud Storage + +To use Google Cloud Storage to save backups, you must first create an +access key from the Google console: + +1. Go to the [Google storage settings page](https://console.cloud.google.com/storage/settings). +1. Select **Interoperability**, and then create an access key. +1. Make note of the **Access Key** and **Secret** and replace them in the + following configurations. +1. In the buckets advanced settings ensure the Access Control option + **Set object-level and bucket-level permissions** is selected. +1. Ensure you have already created a bucket. + +For Omnibus GitLab packages: + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['backup_upload_connection'] = { + 'provider' => 'Google', + 'google_storage_access_key_id' => 'Access Key', + 'google_storage_secret_access_key' => 'Secret', + + ## If you have CNAME buckets (foo.example.com), you might run into SSL issues + ## when uploading backups ("hostname foo.example.com.storage.googleapis.com + ## does not match the server certificate"). In that case, uncomnent the following + ## setting. See: https://github.com/fog/fog/issues/2834 + #'path_style' => true + } + gitlab_rails['backup_upload_remote_directory'] = 'my.google.bucket' + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) + for the changes to take effect + +For installations from source: + +1. Edit `home/git/gitlab/config/gitlab.yml`: + + ```yaml + backup: + upload: + connection: + provider: 'Google' + google_storage_access_key_id: 'Access Key' + google_storage_secret_access_key: 'Secret' + remote_directory: 'my.google.bucket' + ``` + +1. [Restart GitLab](../restart_gitlab.md#installations-from-source) + for the changes to take effect + +##### Using Azure Blob storage + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/25877) in GitLab 13.4. + +For Omnibus GitLab packages: + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['backup_upload_connection'] = { + 'provider' => 'AzureRM', + 'azure_storage_account_name' => '', + 'azure_storage_access_key' => '', + 'azure_storage_domain' => 'blob.core.windows.net', # Optional + } + gitlab_rails['backup_upload_remote_directory'] = '' + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) + for the changes to take effect + +For installations from source: + +1. Edit `home/git/gitlab/config/gitlab.yml`: + + ```yaml + backup: + upload: + connection: + provider: 'AzureRM' + azure_storage_account_name: '' + azure_storage_access_key: '' + remote_directory: '' + ``` + +1. 
[Restart GitLab](../restart_gitlab.md#installations-from-source) + for the changes to take effect + +For more details, see the [table of Azure parameters](../object_storage.md#azure-blob-storage). + +##### Specifying a custom directory for backups + +This option works only for remote storage. If you want to group your backups, +you can pass a `DIRECTORY` environment variable: + +```shell +sudo gitlab-backup create DIRECTORY=daily +sudo gitlab-backup create DIRECTORY=weekly +``` + +Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. + +#### Skip uploading backups to remote storage + +If you have configured GitLab to [upload backups in a remote storage](#upload-backups-to-a-remote-cloud-storage), +you can use the `SKIP=remote` option to skip uploading your backups to the remote storage. + +For Omnibus GitLab packages: + +```shell +sudo gitlab-backup create SKIP=remote +``` + +For installations from source: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:create SKIP=remote RAILS_ENV=production +``` + +#### Upload to locally-mounted shares + +You can send backups to a locally-mounted share (for example, `NFS`,`CIFS`, or `SMB`) using the Fog +[`Local`](https://github.com/fog/fog-local#usage) storage provider. + +To do this, you must set the following configuration keys: + +- `backup_upload_connection.local_root`: mounted directory that backups are copied to. +- `backup_upload_remote_directory`: subdirectory of the `backup_upload_connection.local_root` directory. It is created if it doesn't exist. + If you want to copy the tarballs to the root of your mounted directory, use `.`. + +When mounted, the directory set in the `local_root` key must be owned by either: + +- The `git` user. So, mounting with the `uid=` of the `git` user for `CIFS` and `SMB`. +- The user that you are executing the backup tasks as. For Omnibus GitLab, this is the `git` user. + +Because file system performance may affect overall GitLab performance, +[we don't recommend using cloud-based file systems for storage](../nfs.md#avoid-using-cloud-based-file-systems). + +##### Avoid conflicting configuration + +Don't set the following configuration keys to the same path: + +- `gitlab_rails['backup_path']` (`backup.path` for source installations). +- `gitlab_rails['backup_upload_connection'].local_root` (`backup.upload.connection.local_root` for source installations). + +The `backup_path` configuration key sets the local location of the backup file. The `upload` configuration key is +intended for use when the backup file is uploaded to a separate server, perhaps for archival purposes. + +If these configuration keys are set to the same location, the upload feature fails because a backup already exists at +the upload location. This failure causes the upload feature to delete the backup because it assumes it's a residual file +remaining after the failed upload attempt. + +##### Configure uploads to locally-mounted shares + +For Omnibus GitLab packages: + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['backup_upload_connection'] = { + :provider => 'Local', + :local_root => '/mnt/backups' + } + + # The directory inside the mounted folder to copy backups to + # Use '.' to store them in the root directory + gitlab_rails['backup_upload_remote_directory'] = 'gitlab_backups' + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) + for the changes to take effect. + +For installations from source: + +1. 
Edit `home/git/gitlab/config/gitlab.yml`: + + ```yaml + backup: + upload: + # Fog storage connection settings, see https://fog.io/storage/ . + connection: + provider: Local + local_root: '/mnt/backups' + # The directory inside the mounted folder to copy backups to + # Use '.' to store them in the root directory + remote_directory: 'gitlab_backups' + ``` + +1. [Restart GitLab](../restart_gitlab.md#installations-from-source) + for the changes to take effect. + +#### Backup archive permissions + +The backup archives created by GitLab (`1393513186_2014_02_27_gitlab_backup.tar`) +have the owner/group `git`/`git` and 0600 permissions by default. This is +meant to avoid other system users reading GitLab data. If you need the backup +archives to have different permissions, you can use the `archive_permissions` +setting. + +For Omnibus GitLab packages: + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['backup_archive_permissions'] = 0644 # Makes the backup archives world-readable + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) + for the changes to take effect. + +For installations from source: + +1. Edit `/home/git/gitlab/config/gitlab.yml`: + + ```yaml + backup: + archive_permissions: 0644 # Makes the backup archives world-readable + ``` + +1. [Restart GitLab](../restart_gitlab.md#installations-from-source) + for the changes to take effect. + +#### Configuring cron to make daily backups + +WARNING: +The following cron jobs do not [back up your GitLab configuration files](#storing-configuration-files) +or [SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079). + +You can schedule a cron job that backs up your repositories and GitLab metadata. + +For Omnibus GitLab packages: + +1. Edit the crontab for the `root` user: + + ```shell + sudo su - + crontab -e + ``` + +1. There, add the following line to schedule the backup for everyday at 2 AM: + + ```plaintext + 0 2 * * * /opt/gitlab/bin/gitlab-backup create CRON=1 + ``` + + Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. + +For installations from source: + +1. Edit the crontab for the `git` user: + + ```shell + sudo -u git crontab -e + ``` + +1. Add the following lines at the bottom: + + ```plaintext + # Create a full backup of the GitLab repositories and SQL database every day at 2am + 0 2 * * * cd /home/git/gitlab && PATH=/usr/local/bin:/usr/bin:/bin bundle exec rake gitlab:backup:create RAILS_ENV=production CRON=1 + ``` + +The `CRON=1` environment setting directs the backup script to hide all progress +output if there aren't any errors. This is recommended to reduce cron spam. +When troubleshooting backup problems, however, replace `CRON=1` with `--trace` to log verbosely. + +#### Limit backup lifetime for local files (prune old backups) + +WARNING: +The process described in this section doesn't work if you used a [custom filename](#backup-filename) +for your backups. + +To prevent regular backups from using all your disk space, you may want to set a limited lifetime +for backups. The next time the backup task runs, backups older than the `backup_keep_time` are +pruned. + +This configuration option manages only local files. GitLab doesn't prune old +files stored in a third-party [object storage](#upload-backups-to-a-remote-cloud-storage) +because the user may not have permission to list and delete files. 
It's +recommended that you configure the appropriate retention policy for your object +storage (for example, [AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html)). + +For Omnibus GitLab packages: + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + ## Limit backup lifetime to 7 days - 604800 seconds + gitlab_rails['backup_keep_time'] = 604800 + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) + for the changes to take effect. + +For installations from source: + +1. Edit `/home/git/gitlab/config/gitlab.yml`: + + ```yaml + backup: + ## Limit backup lifetime to 7 days - 604800 seconds + keep_time: 604800 + ``` + +1. [Restart GitLab](../restart_gitlab.md#installations-from-source) + for the changes to take effect. + +#### Back up and restore for installations using PgBouncer + +Do not back up or restore GitLab through a PgBouncer connection. These +tasks must [bypass PgBouncer and connect directly to the PostgreSQL primary database node](#bypassing-pgbouncer), +or they cause a GitLab outage. + +When the GitLab backup or restore task is used with PgBouncer, the +following error message is shown: + +```ruby +ActiveRecord::StatementInvalid: PG::UndefinedTable +``` + +Each time the GitLab backup runs, GitLab starts generating 500 errors and errors about missing +tables will [be logged by PostgreSQL](../logs/index.md#postgresql-logs): + +```plaintext +ERROR: relation "tablename" does not exist at character 123 +``` + +This happens because the task uses `pg_dump`, which +[sets a null search path and explicitly includes the schema in every SQL query](https://gitlab.com/gitlab-org/gitlab/-/issues/23211) +to address [CVE-2018-1058](https://www.postgresql.org/about/news/postgresql-103-968-9512-9417-and-9322-released-1834/). + +Since connections are reused with PgBouncer in transaction pooling mode, +PostgreSQL fails to search the default `public` schema. As a result, +this clearing of the search path causes tables and columns to appear +missing. + +##### Bypassing PgBouncer + +There are two ways to fix this: + +1. [Use environment variables to override the database settings](#environment-variable-overrides) for the backup task. +1. Reconfigure a node to [connect directly to the PostgreSQL primary database node](../postgresql/pgbouncer.md#procedure-for-bypassing-pgbouncer). + +###### Environment variable overrides + +By default, GitLab uses the database configuration stored in a +configuration file (`database.yml`). However, you can override the database settings +for the backup and restore task by setting environment +variables that are prefixed with `GITLAB_BACKUP_`: + +- `GITLAB_BACKUP_PGHOST` +- `GITLAB_BACKUP_PGUSER` +- `GITLAB_BACKUP_PGPORT` +- `GITLAB_BACKUP_PGPASSWORD` +- `GITLAB_BACKUP_PGSSLMODE` +- `GITLAB_BACKUP_PGSSLKEY` +- `GITLAB_BACKUP_PGSSLCERT` +- `GITLAB_BACKUP_PGSSLROOTCERT` +- `GITLAB_BACKUP_PGSSLCRL` +- `GITLAB_BACKUP_PGSSLCOMPRESSION` + +For example, to override the database host and port to use 192.168.1.10 +and port 5432 with the Omnibus package: + +```shell +sudo GITLAB_BACKUP_PGHOST=192.168.1.10 GITLAB_BACKUP_PGPORT=5432 /opt/gitlab/bin/gitlab-backup create +``` + +See the [PostgreSQL documentation](https://www.postgresql.org/docs/12/libpq-envars.html) +for more details on what these parameters do. + +#### `gitaly-backup` for repository backup and restore + +> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.2. 
+
+> - [Deployed behind a feature flag](../../user/feature_flags.md), enabled by default.
+> - [Generally available](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.10. [Feature flag `gitaly_backup`](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/83254) removed.
+
+The `gitaly-backup` binary is used by the backup Rake task to create and restore repository backups from Gitaly.
+`gitaly-backup` replaces the previous backup method that directly calls RPCs on Gitaly from GitLab.
+
+The backup Rake task must be able to find this executable. In most cases, you don't need to change
+the path to the binary as it should work fine with the default path `/opt/gitlab/embedded/bin/gitaly-backup`.
+If you have a specific reason to change the path, it can be configured in Omnibus GitLab packages:
+
+1. Add the following to `/etc/gitlab/gitlab.rb`:
+
+   ```ruby
+   gitlab_rails['backup_gitaly_backup_path'] = '/path/to/gitaly-backup'
+   ```
+
+1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation)
+   for the changes to take effect.
+
+## Alternative backup strategies
+
+In the following cases, consider using file system data transfer or snapshots as part of your backup strategy:
+
+- Your GitLab instance contains a lot of Git repository data and the GitLab backup script is too slow.
+- Your GitLab instance has a lot of forked projects and the regular backup task duplicates the Git data for all of them.
+- Your GitLab instance has a problem and using the regular backup and import Rake tasks isn't possible.
+
+WARNING:
+Gitaly Cluster [does not support snapshot backups](../gitaly/index.md#snapshot-backup-and-recovery-limitations).
+
+When considering using file system data transfer or snapshots:
+
+- Don't use these methods to migrate from one operating system to another. The operating systems of the source and destination should be as similar as possible. For example,
+  don't use these methods to migrate from Ubuntu to Fedora.
+- Data consistency is very important. We recommend stopping GitLab with `sudo gitlab-ctl stop` before doing a file system transfer (with rsync, for example) or taking a
+  snapshot.
+
+Example: Amazon Elastic Block Store (EBS)
+
+> A GitLab server using Omnibus GitLab hosted on Amazon AWS.
+> An EBS drive containing an ext4 file system is mounted at `/var/opt/gitlab`.
+> In this case you could make an application backup by taking an EBS snapshot.
+> The backup includes all repositories, uploads and PostgreSQL data.
+
+Example: Logical Volume Manager (LVM) snapshots + rsync
+
+> A GitLab server using Omnibus GitLab, with an LVM logical volume mounted at `/var/opt/gitlab`.
+> Replicating the `/var/opt/gitlab` directory using rsync would not be reliable because too many files would change while rsync is running.
+> Instead of rsync-ing `/var/opt/gitlab`, we create a temporary LVM snapshot, which we mount as a read-only file system at `/mnt/gitlab_backup`.
+> Now we can have a longer running rsync job which creates a consistent replica on the remote server.
+> The replica includes all repositories, uploads and PostgreSQL data.
+
+If you're running GitLab on a virtualized server, you can possibly also create
+VM snapshots of the entire GitLab server. It's not uncommon, however, for a VM
+snapshot to require you to power down the server, which limits this solution's
+practical use.
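+
+The LVM snapshot and rsync example above can be sketched as a short script. This is only an
+illustration, assuming a volume group named `vg0`, a logical volume named `gitlab-data` mounted at
+`/var/opt/gitlab`, a 10 GB snapshot allowance, and a reachable `backup-host`; adapt it to your own
+storage layout:
+
+```shell
+# Create a temporary snapshot of the volume that holds /var/opt/gitlab and mount it read-only
+sudo lvcreate --snapshot --size 10G --name gitlab-backup-snap /dev/vg0/gitlab-data
+sudo mkdir -p /mnt/gitlab_backup
+sudo mount -o ro /dev/vg0/gitlab-backup-snap /mnt/gitlab_backup
+
+# Run the long rsync job against the consistent snapshot, not the live volume
+sudo rsync -a --delete /mnt/gitlab_backup/ backup-host:/srv/gitlab-replica/
+
+# Clean up the snapshot when the copy finishes
+sudo umount /mnt/gitlab_backup
+sudo lvremove -f /dev/vg0/gitlab-backup-snap
+```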
+ +### Back up repository data separately + +First, ensure you back up existing GitLab data while [skipping repositories](#excluding-specific-directories-from-the-backup): + +```shell +# for Omnibus GitLab package installations +sudo gitlab-backup create SKIP=repositories + +# for installations from source: +sudo -u git -H bundle exec rake gitlab:backup:create SKIP=repositories RAILS_ENV=production +``` + +For manually backing up the Git repository data on disk, there are multiple possible strategies: + +- Use snapshots, such as the previous examples of Amazon EBS drive snapshots, or LVM snapshots + rsync. +- Use [GitLab Geo](../geo/index.md) and rely on the repository data on a Geo secondary site. +- [Prevent writes and copy the Git repository data](#prevent-writes-and-copy-the-git-repository-data). +- [Create an online backup by marking repositories as read-only (experimental)](#online-backup-through-marking-repositories-as-read-only-experimental). + +#### Prevent writes and copy the Git repository data + +Git repositories must be copied in a consistent way. They should not be copied during concurrent write +operations, as this can lead to inconsistencies or corruption issues. For more details, +[issue #270422](https://gitlab.com/gitlab-org/gitlab/-/issues/270422 "Provide documentation on preferred method of migrating Gitaly servers") +has a longer discussion explaining the potential problems. + +To prevent writes to the Git repository data, there are two possible approaches: + +- Use [maintenance mode](../maintenance_mode/index.md) to place GitLab in a read-only state. +- Create explicit downtime by stopping all Gitaly services before backing up the repositories: + + ```shell + sudo gitlab-ctl stop gitaly + # execute git data copy step + sudo gitlab-ctl start gitaly + ``` + +You can copy Git repository data using any method, as long as writes are prevented on the data being copied +(to prevent inconsistencies and corruption issues). In order of preference and safety, the recommended methods are: + +1. Use `rsync` with archive-mode, delete, and checksum options, for example: + + ```shell + rsync -aR --delete --checksum source destination # be extra safe with the order as it will delete existing data if inverted + ``` + +1. Use a [`tar` pipe to copy the entire repository's directory to another server or location](../operations/moving_repositories.md#tar-pipe-to-another-server). + +1. Use `sftp`, `scp`, `cp`, or any other copying method. + +#### Online backup through marking repositories as read-only (experimental) + +One way of backing up repositories without requiring instance-wide downtime +is to programmatically mark projects as read-only while copying the underlying data. + +There are a few possible downsides to this: + +- Repositories are read-only for a period of time that scales with the size of the repository. +- Backups take a longer time to complete due to marking each project as read-only, potentially leading to inconsistencies. For example, + a possible date discrepancy between the last data available for the first project that gets backed up compared to + the last project that gets backed up. +- Fork networks should be entirely read-only while the projects inside get backed up to prevent potential changes to the pool repository. + +There is an **experimental** script that attempts to automate this process in +[the Geo team Runbooks project](https://gitlab.com/gitlab-org/geo-team/runbooks/-/tree/main/experimental-online-backup-through-rsync). 
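+
+As a recap of the explicit-downtime variant described earlier in this section, a minimal sketch for a
+single-node Omnibus GitLab installation might look like the following. The repository storage path
+(`/var/opt/gitlab/git-data`) and the destination (`backup-host:/srv/gitlab-git-data/`) are assumptions;
+adjust them for your environment:
+
+```shell
+# Back up everything except the Git repositories with the regular task
+sudo gitlab-backup create SKIP=repositories
+
+# Prevent writes, copy the repository data, then resume service
+sudo gitlab-ctl stop gitaly
+sudo rsync -aR --delete --checksum /var/opt/gitlab/git-data backup-host:/srv/gitlab-git-data/
+sudo gitlab-ctl start gitaly
+```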
+ +## Troubleshooting + +The following are possible problems you might encounter, along with potential +solutions. + +### When the secrets file is lost + +If you didn't [back up the secrets file](#storing-configuration-files), you +must complete several steps to get GitLab working properly again. + +The secrets file is responsible for storing the encryption key for the columns +that contain required, sensitive information. If the key is lost, GitLab can't +decrypt those columns, preventing access to the following items: + +- [CI/CD variables](../../ci/variables/index.md) +- [Kubernetes / GCP integration](../../user/infrastructure/clusters/index.md) +- [Custom Pages domains](../../user/project/pages/custom_domains_ssl_tls_certification/index.md) +- [Project error tracking](../../operations/error_tracking.md) +- [Runner authentication](../../ci/runners/index.md) +- [Project mirroring](../../user/project/repository/mirror/index.md) +- [Integrations](../../user/project/integrations/index.md) +- [Web hooks](../../user/project/integrations/webhooks.md) + +In cases like CI/CD variables and runner authentication, you can experience +unexpected behaviors, such as: + +- Stuck jobs. +- 500 errors. + +In this case, you must reset all the tokens for CI/CD variables and +runner authentication, which is described in more detail in the following +sections. After resetting the tokens, you should be able to visit your project +and the jobs begin running again. + +Use the information in the following sections at your own risk. + +#### Verify that all values can be decrypted + +You can determine if your database contains values that can't be decrypted by using a +[Rake task](../raketasks/check.md#verify-database-values-can-be-decrypted-using-the-current-secrets). + +#### Take a backup + +You must directly modify GitLab data to work around your lost secrets file. + +WARNING: +Be sure to create a full database backup before attempting any changes. + +#### Disable user two-factor authentication (2FA) + +Users with 2FA enabled can't sign in to GitLab. In that case, you must +[disable 2FA for everyone](../../security/two_factor_authentication.md#for-all-users), +after which users must reactivate 2FA. + +#### Reset CI/CD variables + +1. Enter the database console: + + For Omnibus GitLab 14.1 and earlier: + + ```shell + sudo gitlab-rails dbconsole + ``` + + For Omnibus GitLab 14.2 and later: + + ```shell + sudo gitlab-rails dbconsole --database main + ``` + + For installations from source, GitLab 14.1 and earlier: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production + ``` + + For installations from source, GitLab 14.2 and later: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production --database main + ``` + +1. Examine the `ci_group_variables` and `ci_variables` tables: + + ```sql + SELECT * FROM public."ci_group_variables"; + SELECT * FROM public."ci_variables"; + ``` + + These are the variables that you need to delete. + +1. Delete all variables: + + ```sql + DELETE FROM ci_group_variables; + DELETE FROM ci_variables; + ``` + +1. If you know the specific group or project from which you wish to delete variables, you can include a `WHERE` statement to specify that in your `DELETE`: + + ```sql + DELETE FROM ci_group_variables WHERE group_id = ; + DELETE FROM ci_variables WHERE project_id = ; + ``` + +You may need to reconfigure or restart GitLab for the changes to take effect. + +#### Reset runner registration tokens + +1. 
Enter the database console: + + For Omnibus GitLab 14.1 and earlier: + + ```shell + sudo gitlab-rails dbconsole + ``` + + For Omnibus GitLab 14.2 and later: + + ```shell + sudo gitlab-rails dbconsole --database main + ``` + + For installations from source, GitLab 14.1 and earlier: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production + ``` + + For installations from source, GitLab 14.2 and later: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production --database main + ``` + +1. Clear all tokens for projects, groups, and the entire instance: + + WARNING: + The final `UPDATE` operation stops the runners from being able to pick + up new jobs. You must register new runners. + + ```sql + -- Clear project tokens + UPDATE projects SET runners_token = null, runners_token_encrypted = null; + -- Clear group tokens + UPDATE namespaces SET runners_token = null, runners_token_encrypted = null; + -- Clear instance tokens + UPDATE application_settings SET runners_registration_token_encrypted = null; + -- Clear key used for JWT authentication + -- This may break the $CI_JWT_TOKEN job variable: + -- https://gitlab.com/gitlab-org/gitlab/-/issues/325965 + UPDATE application_settings SET encrypted_ci_jwt_signing_key = null; + -- Clear runner tokens + UPDATE ci_runners SET token = null, token_encrypted = null; + ``` + +#### Reset pending pipeline jobs + +1. Enter the database console: + + For Omnibus GitLab 14.1 and earlier: + + ```shell + sudo gitlab-rails dbconsole + ``` + + For Omnibus GitLab 14.2 and later: + + ```shell + sudo gitlab-rails dbconsole --database main + ``` + + For installations from source, GitLab 14.1 and earlier: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production + ``` + + For installations from source, GitLab 14.2 and later: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production --database main + ``` + +1. Clear all the tokens for pending jobs: + + For GitLab 15.3 and earlier: + + ```sql + -- Clear build tokens + UPDATE ci_builds SET token = null, token_encrypted = null; + ``` + + For GitLab 15.4 and later: + + ```sql + -- Clear build tokens + UPDATE ci_builds SET token_encrypted = null; + ``` + +A similar strategy can be employed for the remaining features. By removing the +data that can't be decrypted, GitLab can be returned to operation, and the +lost data can be manually replaced. + +#### Fix integrations and webhooks + +If you've lost your secrets, the [integrations settings pages](../../user/project/integrations/index.md) +and [webhooks settings pages](../../user/project/integrations/webhooks.md) are probably displaying `500` error messages. + +The fix is to truncate the affected tables (those containing encrypted columns). +This deletes all your configured integrations, webhooks, and related metadata. +You should verify that the secrets are the root cause before deleting any data. + +1. Enter the database console: + + For Omnibus GitLab 14.1 and earlier: + + ```shell + sudo gitlab-rails dbconsole + ``` + + For Omnibus GitLab 14.2 and later: + + ```shell + sudo gitlab-rails dbconsole --database main + ``` + + For installations from source, GitLab 14.1 and earlier: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production + ``` + + For installations from source, GitLab 14.2 and later: + + ```shell + sudo -u git -H bundle exec rails dbconsole -e production --database main + ``` + +1. 
Truncate the following tables: + + ```sql + -- truncate web_hooks table + TRUNCATE integrations, chat_names, issue_tracker_data, jira_tracker_data, slack_integrations, web_hooks, zentao_tracker_data, web_hook_logs; + ``` + +### Container Registry push failures after restoring from a backup + +If you use the [Container Registry](../../user/packages/container_registry/index.md), +pushes to the registry may fail after restoring your backup on an Omnibus GitLab +instance after restoring the registry data. + +These failures mention permission issues in the registry logs, similar to: + +```plaintext +level=error +msg="response completed with error" +err.code=unknown +err.detail="filesystem: mkdir /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/...: permission denied" +err.message="unknown error" +``` + +This issue is caused by the restore running as the unprivileged user `git`, +which is unable to assign the correct ownership to the registry files during +the restore process ([issue #62759](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/62759 "Incorrect permissions on registry filesystem after restore")). + +To get your registry working again: + +```shell +sudo chown -R registry:registry /var/opt/gitlab/gitlab-rails/shared/registry/docker +``` + +If you changed the default file system location for the registry, run `chown` +against your custom location, instead of `/var/opt/gitlab/gitlab-rails/shared/registry/docker`. + +### Backup fails to complete with Gzip error + +When running the backup, you may receive a Gzip error message: + +```shell +sudo /opt/gitlab/bin/gitlab-backup create +... +Dumping ... +... +gzip: stdout: Input/output error + +Backup failed +``` + +If this happens, examine the following: + +- Confirm there is sufficient disk space for the Gzip operation. It's not uncommon for backups that + use the [default strategy](#backup-strategy-option) to require half the instance size + in free disk space during backup creation. +- If NFS is being used, check if the mount option `timeout` is set. The + default is `600`, and changing this to smaller values results in this error. + +### Backup fails with `File name too long` error + +During backup, you can get the `File name too long` error ([issue #354984](https://gitlab.com/gitlab-org/gitlab/-/issues/354984)). 
For example: + +```plaintext +Problem: |\r\n]+$'))[1]) > 246; + + CREATE INDEX ON uploads_with_long_filenames(row_id); + + SELECT + u.id, + u.path, + -- Current filename + (regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] AS current_filename, + -- New filename + CONCAT( + LEFT(SPLIT_PART((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), + COALESCE(SUBSTRING((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) + ) AS new_filename, + -- New path + CONCAT( + COALESCE((regexp_match(u.path, '(.*\/).*'))[1], ''), + CONCAT( + LEFT(SPLIT_PART((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), + COALESCE(SUBSTRING((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) + ) + ) AS new_path + FROM uploads_with_long_filenames AS u + WHERE u.row_id > 0 AND u.row_id <= 10000; + ``` + + Output example: + + ```postgresql + -[ RECORD 1 ]----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + id | 34 + path | public/@hashed/loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisit.txt + current_filename | loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisit.txt + new_filename | loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelits.txt + new_path | public/@hashed/loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelits.txt + ``` + + Where: + + - `current_filename`: a filename that is currently more than 246 characters long. + - `new_filename`: a filename that has been truncated to 246 characters maximum. + - `new_path`: new path considering the `new_filename` (truncated). + + Once you validate the batch results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + +1. Rename the files found in the `uploads` table from long filenames to new truncated filenames. 
The following query rolls back the update so you can check the results safely within a transaction wrapper: + + ```sql + CREATE TEMP TABLE uploads_with_long_filenames AS + SELECT ROW_NUMBER() OVER(ORDER BY id) row_id, path, id + FROM uploads AS u + WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246; + + CREATE INDEX ON uploads_with_long_filenames(row_id); + + BEGIN; + WITH updated_uploads AS ( + UPDATE uploads + SET + path = + CONCAT( + COALESCE((regexp_match(updatable_uploads.path, '(.*\/).*'))[1], ''), + CONCAT( + LEFT(SPLIT_PART((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), + COALESCE(SUBSTRING((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) + ) + ) + FROM + uploads_with_long_filenames AS updatable_uploads + WHERE + uploads.id = updatable_uploads.id + AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000 + RETURNING uploads.* + ) + SELECT id, path FROM updated_uploads; + ROLLBACK; + ``` + + Once you validate the batch update results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + +1. Validate that the new filenames from the previous query are the expected ones. If you are sure you want to truncate the records found in the previous step to 246 characters, run the following: + + WARNING: + The following action is **irreversible**. + + ```sql + CREATE TEMP TABLE uploads_with_long_filenames AS + SELECT ROW_NUMBER() OVER(ORDER BY id) row_id, path, id + FROM uploads AS u + WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246; + + CREATE INDEX ON uploads_with_long_filenames(row_id); + + UPDATE uploads + SET + path = + CONCAT( + COALESCE((regexp_match(updatable_uploads.path, '(.*\/).*'))[1], ''), + CONCAT( + LEFT(SPLIT_PART((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), + COALESCE(SUBSTRING((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) + ) + ) + FROM + uploads_with_long_filenames AS updatable_uploads + WHERE + uploads.id = updatable_uploads.id + AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000; + ``` + + Once you finish the batch update, you must change the batch size (`updatable_uploads.row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + +Truncate the filenames in the references found: + +1. Check if those records are referenced somewhere. One way to do this is to dump the database and search for the parent directory name and filename: + + 1. To dump your database, you can use the following command as an example: + + ```shell + pg_dump -h /var/opt/gitlab/postgresql/ -d gitlabhq_production > gitlab-dump.tmp + ``` + + 1. Then you can search for the references using the `grep` command. Combining the parent directory and the filename can be a good idea. For example: + + ```shell + grep public/alongfilenamehere.txt gitlab-dump.tmp + ``` + +1. Replace those long filenames using the new filenames obtained from querying the `uploads` table. + +Truncate the filenames on the file system. You must manually rename the files in your file system to the new filenames obtained from querying the `uploads` table. + +#### Re-run the backup task + +After following all the previous steps, re-run the backup task. 
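+
+To confirm that no over-long filenames remain before (or after) re-running the task, you can reuse the
+filename expression from the queries above. A minimal check for Omnibus GitLab installations, assuming
+the default `gitlabhq_production` database:
+
+```shell
+# Count uploads whose filename is still longer than 246 characters
+sudo gitlab-psql -d gitlabhq_production <<'SQL'
+SELECT COUNT(*) AS remaining_long_filenames
+FROM uploads AS u
+WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246;
+SQL
+```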
+
+### Restoring database backup fails when `pg_stat_statements` was previously enabled
+
+The GitLab backup of the PostgreSQL database includes all SQL statements required to enable extensions that were
+previously enabled in the database.
+
+The `pg_stat_statements` extension can only be enabled or disabled by a PostgreSQL user with `superuser` role.
+As the restore process uses a database user with limited permissions, it can't execute the following SQL statements:
+
+```sql
+DROP EXTENSION IF EXISTS pg_stat_statements;
+CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public;
+```
+
+When trying to restore the backup in a PostgreSQL instance that doesn't have the `pg_stat_statements` extension,
+the following error message is displayed:
+
+```plaintext
+ERROR: permission denied to create extension "pg_stat_statements"
+HINT: Must be superuser to create this extension.
+ERROR: extension "pg_stat_statements" does not exist
+```
+
+When trying to restore in an instance that has the `pg_stat_statements` extension enabled, the cleaning up step
+fails with an error message similar to the following:
+
+```plaintext
+rake aborted!
+ActiveRecord::StatementInvalid: PG::InsufficientPrivilege: ERROR: must be owner of view pg_stat_statements
+/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:42:in `block (4 levels) in <top (required)>'
+/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `each'
+/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `block (3 levels) in <top (required)>'
+/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:71:in `block (3 levels) in <top (required)>'
+/opt/gitlab/embedded/bin/bundle:23:in `load'
+/opt/gitlab/embedded/bin/bundle:23:in `<main>'
+Caused by:
+PG::InsufficientPrivilege: ERROR: must be owner of view pg_stat_statements
+/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:42:in `block (4 levels) in <top (required)>'
+/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `each'
+/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `block (3 levels) in <top (required)>'
+/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:71:in `block (3 levels) in <top (required)>'
+/opt/gitlab/embedded/bin/bundle:23:in `load'
+/opt/gitlab/embedded/bin/bundle:23:in `<main>
' +Tasks: TOP => gitlab:db:drop_tables +(See full trace by running task with --trace) +``` + +#### Prevent the dump file to include `pg_stat_statements` + +To prevent the inclusion of the extension in the PostgreSQL dump file that is part of the backup bundle, +enable the extension in any schema except the `public` schema: + +```sql +CREATE SCHEMA adm; +CREATE EXTENSION pg_stat_statements SCHEMA adm; +``` + +If the extension was previously enabled in the `public` schema, move it to a new one: + +```sql +CREATE SCHEMA adm; +ALTER EXTENSION pg_stat_statements SET SCHEMA adm; +``` + +To query the `pg_stat_statements` data after changing the schema, prefix the view name with the new schema: + +```sql +SELECT * FROM adm.pg_stat_statements limit 0; +``` + +To make it compatible with third-party monitoring solutions that expect it to be enabled in the `public` schema, +you need to include it in the `search_path`: + +```sql +set search_path to public,adm; +``` + +#### Fix an existing dump file to remove references to `pg_stat_statements` + +To fix an existing backup file, do the following changes: + +1. Extract from the backup the following file: `db/database.sql.gz`. +1. Decompress the file or use an editor that is capable of handling it compressed. +1. Remove the following lines, or similar ones: + + ```sql + CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public; + ``` + + ```sql + COMMENT ON EXTENSION pg_stat_statements IS 'track planning and execution statistics of all SQL statements executed'; + ``` + +1. Save the changes and recompress the file. +1. Update the backup file with the modified `db/database.sql.gz`. diff --git a/doc/administration/backup_restore/index.md b/doc/administration/backup_restore/index.md index 9e9818bcf4b1b2..b98775dc1c7f9b 100644 --- a/doc/administration/backup_restore/index.md +++ b/doc/administration/backup_restore/index.md @@ -17,7 +17,7 @@ You can mitigate all of these risks with a disaster recovery plan that includes ## Back up GitLab -For detailed information on backing up GitLab, see [Backup GitLab](../raketasks/backup_gitlab.md). +For detailed information on backing up GitLab, see [Backup GitLab](backup_gitlab.md). ## Restore GitLab diff --git a/doc/administration/geo/disaster_recovery/planned_failover.md b/doc/administration/geo/disaster_recovery/planned_failover.md index d5bcdcc7823e68..6ac67c3d21e00b 100644 --- a/doc/administration/geo/disaster_recovery/planned_failover.md +++ b/doc/administration/geo/disaster_recovery/planned_failover.md @@ -61,7 +61,7 @@ the container registry on the primary site and restore it onto the secondary site: 1. On your primary site, back up only the registry and - [exclude specific directories from the backup](../../../raketasks/backup_gitlab.md#excluding-specific-directories-from-the-backup): + [exclude specific directories from the backup](../../../administration/backup_restore/backup_gitlab.md#excluding-specific-directories-from-the-backup): ```shell # Create a backup in the /var/opt/gitlab/backups folder diff --git a/doc/administration/gitaly/index.md b/doc/administration/gitaly/index.md index 18866a3ceb05e9..811819d5897ac2 100644 --- a/doc/administration/gitaly/index.md +++ b/doc/administration/gitaly/index.md @@ -80,7 +80,7 @@ Gitaly Cluster does not support snapshot backups. Snapshot backups can cause iss out of sync with the disk storage. 
Because of how Praefect rebuilds the replication metadata of Gitaly disk information during a restore, you should use the [official backup and restore Rake tasks](../../administration/backup_restore/index.md). -The [incremental backup method](../../raketasks/backup_gitlab.md#incremental-repository-backups) +The [incremental backup method](../../administration/backup_restore/backup_gitlab.md#incremental-repository-backups) can be used to speed up Gitaly Cluster backups. If you are unable to use either method, contact customer support for restoration help. @@ -175,7 +175,7 @@ best suited by using Gitaly Cluster. ### Backing up repositories -When backing up or syncing repositories using tools other than GitLab, you must [prevent writes](../../raketasks/backup_gitlab.md#prevent-writes-and-copy-the-git-repository-data) +When backing up or syncing repositories using tools other than GitLab, you must [prevent writes](../../administration/backup_restore/backup_gitlab.md#prevent-writes-and-copy-the-git-repository-data) while copying repository data. ## Gitaly Cluster diff --git a/doc/administration/object_storage.md b/doc/administration/object_storage.md index b6c3c3e1b0f990..16e65864fd91d2 100644 --- a/doc/administration/object_storage.md +++ b/doc/administration/object_storage.md @@ -68,7 +68,7 @@ automatically. Thus, only the following providers can be used: The consolidated form configuration can't be used for backups or Mattermost. Backups can be configured with -[server side encryption](../raketasks/backup_gitlab.md#s3-encrypted-buckets) +[server side encryption](../administration/backup_restore/backup_gitlab.md#s3-encrypted-buckets) separately. See the [table for a complete list](#configure-each-object-type-to-define-its-own-storage-connection-storage-specific-form) of supported object storage types. @@ -163,7 +163,7 @@ supported by consolidated form, refer to the following guides: | Object storage type | Supported by consolidated form? | |---------------------|------------------------------------------| | [Project-level Secure Files](secure_files.md#using-object-storage) | **{dotted-circle}** No | -| [Backups](../raketasks/backup_gitlab.md#upload-backups-to-a-remote-cloud-storage) | **{dotted-circle}** No | +| [Backups](../administration/backup_restore/backup_gitlab.md#upload-backups-to-a-remote-cloud-storage) | **{dotted-circle}** No | | [Container Registry](packages/container_registry.md#use-object-storage) (optional feature) | **{dotted-circle}** No | | [Mattermost](https://docs.mattermost.com/configure/file-storage-configuration-settings.html)| **{dotted-circle}** No | | [Autoscale runner caching](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching) (optional for improved performance) | **{dotted-circle}** No | @@ -848,7 +848,7 @@ Helm-based installs require separate buckets to ### S3 API compatibility issues -Not all S3 providers [are fully compatible](../raketasks/backup_gitlab.md#other-s3-providers) +Not all S3 providers [are fully compatible](../administration/backup_restore/backup_gitlab.md#other-s3-providers) with the Fog library that GitLab uses. Symptoms include an error in `production.log`: ```plaintext diff --git a/doc/administration/operations/moving_repositories.md b/doc/administration/operations/moving_repositories.md index 43db0655141c3f..c27bedd39de691 100644 --- a/doc/administration/operations/moving_repositories.md +++ b/doc/administration/operations/moving_repositories.md @@ -193,9 +193,9 @@ should be used. 
Git repositories are accessed, managed, and stored on GitLab ser can result from directly accessing and copying Gitaly files using tools like `rsync`. - From GitLab 13.3, backup performance can be improved by - [processing multiple repositories concurrently](../../raketasks/backup_gitlab.md#back-up-git-repositories-concurrently). + [processing multiple repositories concurrently](../../administration/backup_restore/backup_gitlab.md#back-up-git-repositories-concurrently). - Backups can be created of just the repositories using the - [skip feature](../../raketasks/backup_gitlab.md#excluding-specific-directories-from-the-backup). + [skip feature](../../administration/backup_restore/backup_gitlab.md#excluding-specific-directories-from-the-backup). No other method works for Gitaly Cluster targets. diff --git a/doc/administration/postgresql/pgbouncer.md b/doc/administration/postgresql/pgbouncer.md index fec4ab47bd0313..ae2b7609641235 100644 --- a/doc/administration/postgresql/pgbouncer.md +++ b/doc/administration/postgresql/pgbouncer.md @@ -88,7 +88,7 @@ This content has been moved to a [new location](replication_and_failover.md#conf Do not backup or restore GitLab through a PgBouncer connection: it causes a GitLab outage. -[Read more about this and how to reconfigure backups](../../raketasks/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer). +[Read more about this and how to reconfigure backups](../../administration/backup_restore/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer). ## Enable Monitoring @@ -178,7 +178,7 @@ ote_pid | tls Some database changes have to be done directly, and not through PgBouncer. -The main affected tasks are [database restores](../../raketasks/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer) +The main affected tasks are [database restores](../../administration/backup_restore/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer) and [GitLab upgrades with database migrations](../../update/zero_downtime.md#postgresql). 1. To find the primary node, run the following on a database node: diff --git a/doc/administration/postgresql/replication_and_failover.md b/doc/administration/postgresql/replication_and_failover.md index 46b4a4f17a828e..9693ec7adacf83 100644 --- a/doc/administration/postgresql/replication_and_failover.md +++ b/doc/administration/postgresql/replication_and_failover.md @@ -571,7 +571,7 @@ in the Troubleshooting section before proceeding. Do not backup or restore GitLab through a PgBouncer connection: this causes a GitLab outage. -[Read more about this and how to reconfigure backups](../../raketasks/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer). +[Read more about this and how to reconfigure backups](../../administration/backup_restore/backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer). ### Ensure GitLab is running diff --git a/doc/administration/raketasks/check.md b/doc/administration/raketasks/check.md index 4eed85abc6d428..014a4fba5d22fa 100644 --- a/doc/administration/raketasks/check.md +++ b/doc/administration/raketasks/check.md @@ -215,7 +215,7 @@ secrets file (`gitlab-secrets.json`). Automatic resolution is not yet implemented. If you have values that cannot be decrypted, you can follow steps to reset them, see our -documentation on what to do [when the secrets file is lost](../../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost). 
+documentation on what to do [when the secrets file is lost](../../administration/backup_restore/backup_gitlab.md#when-the-secrets-file-is-lost). This can take a very long time, depending on the size of your database, as it checks all rows in all tables. diff --git a/doc/administration/raketasks/storage.md b/doc/administration/raketasks/storage.md index 6080896ec2f864..8cef4436f3f07a 100644 --- a/doc/administration/raketasks/storage.md +++ b/doc/administration/raketasks/storage.md @@ -244,9 +244,9 @@ If destroying the project generates a stack trace relating to encryption or the 1. [Verify your GitLab secrets](check.md#verify-database-values-can-be-decrypted-using-the-current-secrets). 1. If the affected projects have secrets that cannot be decrypted it will be necessary to remove those specific secrets. - [Our documentation for dealing with lost secrets](../../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost) + [Our documentation for dealing with lost secrets](../../administration/backup_restore/backup_gitlab.md#when-the-secrets-file-is-lost) is for loss of all secrets, but it's possible for specific projects to be affected. For example, - to [reset specific runner registration tokens](../../raketasks/backup_gitlab.md#reset-runner-registration-tokens) + to [reset specific runner registration tokens](../../administration/backup_restore/backup_gitlab.md#reset-runner-registration-tokens) for a specific project ID: ```sql diff --git a/doc/administration/reference_architectures/index.md b/doc/administration/reference_architectures/index.md index 08aeb14945466b..07eede1f4a4ea8 100644 --- a/doc/administration/reference_architectures/index.md +++ b/doc/administration/reference_architectures/index.md @@ -63,7 +63,7 @@ This section explains the designs you can choose from. It begins with the least ### Standalone (non-HA) -For environments serving 2,000 or fewer users, we generally recommend a standalone approach by deploying a non-highly available single or multi-node environment. With this approach, you can employ strategies such as [automated backups](../../raketasks/backup_gitlab.md#configuring-cron-to-make-daily-backups) for recovery to provide a good level of RPO / RTO while avoiding the complexities that come with HA. +For environments serving 2,000 or fewer users, we generally recommend a standalone approach by deploying a non-highly available single or multi-node environment. With this approach, you can employ strategies such as [automated backups](../../administration/backup_restore/backup_gitlab.md#configuring-cron-to-make-daily-backups) for recovery to provide a good level of RPO / RTO while avoiding the complexities that come with HA. *[RTO]: Recovery time objective *[RPO]: Recovery point objective diff --git a/doc/ci/variables/index.md b/doc/ci/variables/index.md index c51aaec0a5bee1..c580e5658ee19e 100644 --- a/doc/ci/variables/index.md +++ b/doc/ci/variables/index.md @@ -246,7 +246,7 @@ malicious code can compromise both masked and protected variables. Variable values are encrypted using [`aes-256-cbc`](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) and stored in the database. This data can only be read and decrypted with a -valid [secrets file](../../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost). +valid [secrets file](../../administration/backup_restore/backup_gitlab.md#when-the-secrets-file-is-lost). 
### Mask a CI/CD variable diff --git a/doc/install/aws/manual_install_aws.md b/doc/install/aws/manual_install_aws.md index 13dac7071dedbe..765269a8d44313 100644 --- a/doc/install/aws/manual_install_aws.md +++ b/doc/install/aws/manual_install_aws.md @@ -760,10 +760,10 @@ and restore its Git data, database, attachments, LFS objects, and so on. Some important things to know: - The backup/restore tool **does not** store some configuration files, like secrets; you - must [configure this yourself](../../raketasks/backup_gitlab.md#storing-configuration-files). + must [configure this yourself](../../administration/backup_restore/backup_gitlab.md#storing-configuration-files). - By default, the backup files are stored locally, but you can - [backup GitLab using S3](../../raketasks/backup_gitlab.md#using-amazon-s3). -- You can [exclude specific directories form the backup](../../raketasks/backup_gitlab.md#excluding-specific-directories-from-the-backup). + [backup GitLab using S3](../../administration/backup_restore/backup_gitlab.md#using-amazon-s3). +- You can [exclude specific directories form the backup](../../administration/backup_restore/backup_gitlab.md#excluding-specific-directories-from-the-backup). ### Backing up GitLab diff --git a/doc/install/docker.md b/doc/install/docker.md index d12d1ffd7fcde6..f1af65afe4ebb4 100644 --- a/doc/install/docker.md +++ b/doc/install/docker.md @@ -618,8 +618,8 @@ meaning no configuration is set directly in the `gitlab.rb` file, then there is to back up the `gitlab.rb` file. WARNING: -[Backing up the GitLab secrets file](../raketasks/backup_gitlab.md#storing-configuration-files) is required -to avoid [complicated steps](../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost) when recovering +[Backing up the GitLab secrets file](../administration/backup_restore/backup_gitlab.md#storing-configuration-files) is required +to avoid [complicated steps](../administration/backup_restore/backup_gitlab.md#when-the-secrets-file-is-lost) when recovering GitLab from backup. The secrets file is stored at `/etc/gitlab/gitlab-secrets.json` inside the container, or `$GITLAB_HOME/config/gitlab-secrets.json` [on the container host](#set-up-the-volumes-location). diff --git a/doc/raketasks/backup_gitlab.md b/doc/raketasks/backup_gitlab.md index 269e9ee7f72ab5..231312b3833300 100644 --- a/doc/raketasks/backup_gitlab.md +++ b/doc/raketasks/backup_gitlab.md @@ -2,1869 +2,13 @@ stage: Systems group: Geo info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments +redirect_to: '../administration/backup_restore/backup_gitlab.md' +remove_date: '2023-09-26' --- -# Back up GitLab **(FREE SELF)** +This document was moved to [another location](../administration/backup_restore/backup_gitlab.md). -The exact procedure for backing up GitLab depends on many factors. Your particular deployment's usage and configuration determine what kind of data exists, where it is located, and how much there is. These factors influence your options for how to perform a back up, how to store it, and how to restore it. - -## What data needs to be backed up? 
-
-- [PostgreSQL databases](#postgresql-databases)
-- [Git repositories](#git-repositories)
-- [Blobs](#blobs)
-- [Storing configuration files](#storing-configuration-files)
-- [Other data](#other-data)
-
-### PostgreSQL databases
-
-In the simplest case, GitLab has one PostgreSQL database in one PostgreSQL server on the same VM as all other GitLab services. But depending on configuration, GitLab may use multiple PostgreSQL databases in multiple PostgreSQL servers.
-
-In general, this data is the single source of truth for most user-generated content in the Web interface, such as issue and merge request content, comments, permissions, and credentials.
-
-PostgreSQL also holds some cached data like HTML-rendered Markdown, and by default, merge request diffs.
-However, merge request diffs can also be configured to be offloaded to the file system or object storage; see [Blobs](#blobs).
-
-Gitaly Cluster's Praefect service uses a PostgreSQL database as a single source of truth to manage its Gitaly nodes.
-
-A common PostgreSQL utility, [`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html), produces a backup file which can be used to restore a PostgreSQL database. The [backup command](#backup-command) uses this utility under the hood.
-
-Unfortunately, the larger the database, the longer `pg_dump` takes to execute. Depending on your situation, the duration becomes impractical at some point (days, for example). If your database is over 100GB, `pg_dump`, and by extension the [backup command](#backup-command), is likely not usable.
-
-### Git repositories
-
-A GitLab instance can have one or more repository shards. Each shard is a Gitaly instance or Gitaly Cluster that
-is responsible for allowing access and operations on the locally stored Git repositories. Gitaly can run
-on a machine:
-
-- With a single disk.
-- With multiple disks mounted as a single mount-point (like with a RAID array).
-- Using LVM.
-
-Gitaly may work with NFS or a mounted Storage Appliance, but it is [not officially supported](../administration/nfs.md#gitaly-with-nfs-not-supported) because Git requires low latency.
-
-Each project can have up to 3 different repositories:
-
-- A project repository, where the source code is stored.
-- A wiki repository, where the wiki content is stored.
-- A design repository, where design artifacts are indexed (assets are actually in LFS).
-
-They all live in the same shard and share the same base name with a `-wiki` and `-design` suffix
-for Wiki and Design Repository cases.
-
-Personal and project snippets, and group wiki content, are stored in Git repositories.
-
-Project forks are deduplicated in a live GitLab site using pool repositories.
-
-The [backup command](#backup-command) produces a Git bundle for each repository and tars them all up. This duplicates pool repository data into every fork. For more information, see [Alternative backup strategies](#alternative-backup-strategies).
-
-### Blobs
-
-GitLab stores blobs (or files) such as issue attachments or LFS objects in either:
-
-- The file system in a specific location.
-- An [Object Storage](../administration/object_storage.md) solution. Object Storage solutions can be:
-  - Cloud based, like Amazon S3 or Google Cloud Storage.
-  - Hosted by you (like MinIO).
-  - A Storage Appliance that exposes an Object Storage-compatible API.
-
-### Storing configuration files
-
-The [backup Rake task](#back-up-gitlab) GitLab provides does _not_ store your
-configuration files.
The primary reason for this is that your database contains -items including encrypted information for two-factor authentication and the -CI/CD _secure variables_. Storing encrypted information in the same location -as its key defeats the purpose of using encryption in the first place. - -WARNING: -The secrets file is essential to preserve your database encryption key. - -At the very **minimum**, you must back up: - -For Omnibus: - -- `/etc/gitlab/gitlab-secrets.json` -- `/etc/gitlab/gitlab.rb` - -For installation from source: - -- `/home/git/gitlab/config/secrets.yml` -- `/home/git/gitlab/config/gitlab.yml` - -For [Docker installations](../install/docker.md), you must -back up the volume where the configuration files are stored. If you created -the GitLab container according to the documentation, it should be in the -`/srv/gitlab/config` directory. - -For [GitLab Helm chart installations](https://gitlab.com/gitlab-org/charts/gitlab) -on a Kubernetes cluster, you must follow the -[Back up the secrets](https://docs.gitlab.com/charts/backup-restore/backup.html#back-up-the-secrets) -instructions. - -You may also want to back up any TLS keys and certificates (`/etc/gitlab/ssl`, `/etc/gitlab/trusted-certs`), and your -[SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079) -to avoid man-in-the-middle attack warnings if you have to perform a full machine restore. - -If you use Omnibus GitLab, review additional information to -[backup your configuration](https://docs.gitlab.com/omnibus/settings/backups.html). - -In the unlikely event that the secrets file is lost, see the -[troubleshooting section](#when-the-secrets-file-is-lost). - -### Other data - -GitLab uses Redis both as a cache store and to hold persistent data for our background jobs system, Sidekiq. The provided [backup command](#backup-command) does _not_ back up Redis data. This means to take a consistent backup with the [backup command](#backup-command), there must be no pending or running background jobs. It is possible to manually back up Redis. - -Elasticsearch is an optional database for advanced search. It can improve search -in both source-code level, and user generated content in issues, merge requests, and discussions. The [backup command](#backup-command) does _not_ back up Elasticsearch data. Elasticsearch data can be regenerated from PostgreSQL data after a restore. It is possible to manually back up Elasticsearch. - -## Command line interface - -GitLab provides a command line interface to back up your entire instance, -including: - -- Database -- Attachments -- Git repositories data -- CI/CD job output logs -- CI/CD job artifacts -- LFS objects -- Terraform states ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/331806) in GitLab 14.7) -- Container Registry images -- GitLab Pages content -- Packages ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/332006) in GitLab 14.7) -- Snippets -- [Group wikis](../user/project/wiki/group.md) -- Project-level Secure Files ([introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/121142) in GitLab 16.1) - -Backups do not include: - -- [Mattermost data](https://docs.mattermost.com/administration/config-settings.html#file-storage) -- Redis (and thus Sidekiq jobs) - -WARNING: -GitLab does not back up any configuration files (`/etc/gitlab`), TLS keys and certificates, or system -files. You are highly advised to read about [storing configuration files](#storing-configuration-files). 
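As a rough sketch of the minimum configuration backup described above (Omnibus paths; the destination directory `/secure/backups` is an assumption for illustration and should live on storage separate from the GitLab host), the configuration directory, including the secrets file, can be archived like this:

```shell
# Archive /etc/gitlab (contains gitlab-secrets.json, gitlab.rb, ssl/, trusted-certs/).
# The archive contains secrets, so keep permissions restrictive and store it off-host.
sudo tar -czf /secure/backups/etc-gitlab-$(date +%s).tar.gz -C /etc gitlab
```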
- -### Requirements - -To be able to back up and restore, ensure that Rsync is installed on your -system. If you installed GitLab: - -- _Using the Omnibus package_, Rsync is already installed. -- _From source_, check if `rsync` is installed. If Rsync is not installed, install it. For example: - - ```shell - # Debian/Ubuntu - sudo apt-get install rsync - - # RHEL/CentOS - sudo yum install rsync - ``` - -### Backup command - -WARNING: -GitLab doesn't back up items that aren't stored on the file system. If you're -using [object storage](../administration/object_storage.md), be sure to enable -backups with your object storage provider, if desired. - -WARNING: -The backup command requires [additional parameters](#back-up-and-restore-for-installations-using-pgbouncer) when -your installation is using PgBouncer, for either performance reasons or when using it with a Patroni cluster. - -WARNING: -Before GitLab 15.5.0, the backup command doesn't verify if another backup is already running, as described in -[issue 362593](https://gitlab.com/gitlab-org/gitlab/-/issues/362593). We strongly recommend -you make sure that all backups are complete before starting a new one. - -An application data backup creates an archive file that contains the database, -all repositories and all attachments. - -You can only restore a backup to **exactly the same version and type (CE/EE)** -of GitLab on which it was created. The best way to -[migrate your projects from one server to another](backup_restore/index.md#migrate-to-a-new-server) is through a backup and restore. - -Depending on your version of GitLab, use the following command if you installed -GitLab using the Omnibus package: - -- GitLab 12.2 or later: - - ```shell - sudo gitlab-backup create - ``` - -- GitLab 12.1 and earlier: - - ```shell - gitlab-rake gitlab:backup:create - ``` - -If you installed GitLab from source, use the following command: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:create RAILS_ENV=production -``` - -If you're running GitLab from within a Docker container, run the backup from -the host, based on your installed version of GitLab: - -- GitLab 12.2 or later: - - ```shell - docker exec -t gitlab-backup create - ``` - -- GitLab 12.1 and earlier: - - ```shell - docker exec -t gitlab-rake gitlab:backup:create - ``` - -If you're using the [GitLab Helm chart](https://gitlab.com/gitlab-org/charts/gitlab) -on a Kubernetes cluster, you can run the backup task by using `kubectl` to run the `backup-utility` -script on the GitLab toolbox pod. For more details, see the -[charts backup documentation](https://docs.gitlab.com/charts/backup-restore/backup.html). - -Similar to the Kubernetes case, if you have scaled out your GitLab cluster to -use multiple application servers, you should pick a designated node (that isn't -auto-scaled away) for running the backup Rake task. Because the backup Rake -task is tightly coupled to the main Rails application, this is typically a node -on which you're also running Puma or Sidekiq. - -Example output: - -```plaintext -Dumping database tables: -- Dumping table events... [DONE] -- Dumping table issues... [DONE] -- Dumping table keys... [DONE] -- Dumping table merge_requests... [DONE] -- Dumping table milestones... [DONE] -- Dumping table namespaces... [DONE] -- Dumping table notes... [DONE] -- Dumping table projects... [DONE] -- Dumping table protected_branches... [DONE] -- Dumping table schema_migrations... [DONE] -- Dumping table services... [DONE] -- Dumping table snippets... 
[DONE] -- Dumping table taggings... [DONE] -- Dumping table tags... [DONE] -- Dumping table users... [DONE] -- Dumping table users_projects... [DONE] -- Dumping table web_hooks... [DONE] -- Dumping table wikis... [DONE] -Dumping repositories: -- Dumping repository abcd... [DONE] -Creating backup archive: $TIMESTAMP_gitlab_backup.tar [DONE] -Deleting tmp directories...[DONE] -Deleting old backups... [SKIPPING] -``` - -### Backup timestamp - -The backup archive is saved in `backup_path`, which is specified in the -`config/gitlab.yml` file. The filename is `[TIMESTAMP]_gitlab_backup.tar`, -where `TIMESTAMP` identifies the time at which each backup was created, plus -the GitLab version. The timestamp is needed if you need to restore GitLab and -multiple backups are available. - -For example, if the backup name is `1493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar`, -the timestamp is `1493107454_2018_04_25_10.6.4-ce`. - -### Backup options - -The command line tool GitLab provides to backup your instance can accept more -options. - -#### Backup strategy option - -The default backup strategy is to essentially stream data from the respective -data locations to the backup using the Linux command `tar` and `gzip`. This works -fine in most cases, but can cause problems when data is rapidly changing. - -When data changes while `tar` is reading it, the error `file changed as we read -it` may occur, and causes the backup process to fail. To combat this, 8.17 -introduces a new backup strategy called `copy`. The strategy copies data files -to a temporary location before calling `tar` and `gzip`, avoiding the error. - -A side-effect is that the backup process takes up to an additional 1X disk -space. The process does its best to clean up the temporary files at each stage -so the problem doesn't compound, but it could be a considerable change for large -installations. This is why the `copy` strategy is not the default in 8.17. - -To use the `copy` strategy instead of the default streaming strategy, specify -`STRATEGY=copy` in the Rake task command. For example: - -```shell -sudo gitlab-backup create STRATEGY=copy -``` - -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - -#### Backup filename - -WARNING: -If you use a custom backup filename, you can't -[limit the lifetime of the backups](#limit-backup-lifetime-for-local-files-prune-old-backups). - -By default, a backup file is created according to the specification in the -previous [Backup timestamp](#backup-timestamp) section. You can, however, -override the `[TIMESTAMP]` portion of the filename by setting the `BACKUP` -environment variable. For example: - -```shell -sudo gitlab-backup create BACKUP=dump -``` - -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - -The resulting file is named `dump_gitlab_backup.tar`. This is useful for -systems that make use of rsync and incremental backups, and results in -considerably faster transfer speeds. - -#### Confirm archive can be transferred - -To ensure the generated archive is transferable by rsync, you can set the `GZIP_RSYNCABLE=yes` -option. This sets the `--rsyncable` option to `gzip`, which is useful only in -combination with setting [the Backup filename option](#backup-filename). - -The `--rsyncable` option in `gzip` isn't guaranteed to be available -on all distributions. To verify that it's available in your distribution, run -`gzip --help` or consult the man pages. 
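A quick way to confirm `--rsyncable` support without reading the help output is to run `gzip` against a trivial input; this sketch assumes only a POSIX shell and the system `gzip`:

```shell
# Prints "supported" only if this gzip build accepts --rsyncable; otherwise gzip exits non-zero.
echo test | gzip --rsyncable > /dev/null 2>&1 && echo "--rsyncable supported" || echo "--rsyncable NOT supported"
```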
- -```shell -sudo gitlab-backup create BACKUP=dump GZIP_RSYNCABLE=yes -``` - -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - -#### Excluding specific directories from the backup - -You can exclude specific directories from the backup by adding the environment variable `SKIP`, whose values are a comma-separated list of the following options: - -- `db` (database) -- `uploads` (attachments) -- `builds` (CI job output logs) -- `artifacts` (CI job artifacts) -- `lfs` (LFS objects) -- `terraform_state` (Terraform states) -- `registry` (Container Registry images) -- `pages` (Pages content) -- `repositories` (Git repositories data) -- `packages` (Packages) -- `ci_secure_files` (Project-level Secure Files) - -NOTE: -When [backing up and restoring Helm Charts](https://docs.gitlab.com/charts/architecture/backup-restore.html), there is an additional option `packages`, which refers to any packages managed by the GitLab [package registry](../user/packages/package_registry/index.md). -For more information see [command line arguments](https://docs.gitlab.com/charts/architecture/backup-restore.html#command-line-arguments). - -All wikis are backed up as part of the `repositories` group. Non-existent -wikis are skipped during a backup. - -For Omnibus GitLab packages: - -```shell -sudo gitlab-backup create SKIP=db,uploads -``` - -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - -For installations from source: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:create SKIP=db,uploads RAILS_ENV=production -``` - -`SKIP=` is also used to: - -- [Skip creation of the tar file](#skipping-tar-creation) (`SKIP=tar`). -- [Skip uploading the backup to remote storage](#skip-uploading-backups-to-remote-storage) (`SKIP=remote`). - -#### Skipping tar creation - -NOTE: -It is not possible to skip the tar creation when using [object storage](#upload-backups-to-a-remote-cloud-storage) for backups. - -The last part of creating a backup is generation of a `.tar` file containing -all the parts. In some cases (for example, if the backup is picked up by other -backup software) creating a `.tar` file might be wasted effort or even directly -harmful, so you can skip this step by adding `tar` to the `SKIP` environment -variable. - -Adding `tar` to the `SKIP` variable leaves the files and directories containing the -backup in the directory used for the intermediate files. These files are -overwritten when a new backup is created, so you should make sure they are copied -elsewhere, because you can only have one backup on the system. - -For Omnibus GitLab packages: - -```shell -sudo gitlab-backup create SKIP=tar -``` - -For installations from source: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:create SKIP=tar RAILS_ENV=production -``` - -#### Back up Git repositories concurrently - -> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37158) in GitLab 13.3. -> - [Concurrent restore introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/69330) in GitLab 14.3 - -When using [multiple repository storages](../administration/repository_storage_paths.md), -repositories can be backed up or restored concurrently to help fully use CPU time. The -following variables are available to modify the default behavior of the Rake -task: - -- `GITLAB_BACKUP_MAX_CONCURRENCY`: The maximum number of projects to back up at - the same time. 
Defaults to the number of logical CPUs (in GitLab 14.1 and - earlier, defaults to `1`). -- `GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY`: The maximum number of projects to - back up at the same time on each storage. This allows the repository backups - to be spread across storages. Defaults to `2` (in GitLab 14.1 and earlier, - defaults to `1`). - -For example, for Omnibus GitLab installations with 4 repository storages: - -```shell -sudo gitlab-backup create GITLAB_BACKUP_MAX_CONCURRENCY=4 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1 -``` - -For example, for installations from source: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:create GITLAB_BACKUP_MAX_CONCURRENCY=4 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1 -``` - -#### Incremental repository backups - -> - Introduced in GitLab 14.9 [with a flag](../administration/feature_flags.md) named `incremental_repository_backup`. Disabled by default. -> - [Enabled on self-managed](https://gitlab.com/gitlab-org/gitlab/-/issues/355945) in GitLab 14.10. -> - `PREVIOUS_BACKUP` option [introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/4184) in GitLab 15.0. - -FLAG: -On self-managed GitLab, by default this feature is available. To hide the feature, ask an administrator to [disable the feature flag](../administration/feature_flags.md) named `incremental_repository_backup`. -On GitLab.com, this feature is not available. - -NOTE: -Only repositories support incremental backups. Therefore, if you use `INCREMENTAL=yes`, the task -creates a self-contained backup tar archive. This is because all subtasks except repositories are -still creating full backups (they overwrite the existing full backup). -See [issue 19256](https://gitlab.com/gitlab-org/gitlab/-/issues/19256) for a feature request to -support incremental backups for all subtasks. - -Incremental repository backups can be faster than full repository backups because they only pack changes since the last backup into the backup bundle for each repository. -The incremental backup archives are not linked to each other: each archive is a self-contained backup of the instance. There must be an existing backup -to create an incremental backup from: - -- In GitLab 14.9 and 14.10, use the `BACKUP=` option to choose the backup to use. The chosen previous backup is overwritten. -- In GitLab 15.0 and later, use the `PREVIOUS_BACKUP=` option to choose the backup to use. By default, a backup file is created - as documented in the [Backup timestamp](#backup-timestamp) section. You can override the `[TIMESTAMP]` portion of the filename by setting the - [`BACKUP` environment variable](#backup-filename). - -To create an incremental backup, run: - -- In GitLab 15.0 or later: - - ```shell - sudo gitlab-backup create INCREMENTAL=yes PREVIOUS_BACKUP= - ``` - -- In GitLab 14.9 and 14.10: - - ```shell - sudo gitlab-backup create INCREMENTAL=yes BACKUP= - ``` - -To create an [untarred](#skipping-tar-creation) incremental backup from a tarred backup, use `SKIP=tar`: - -```shell -sudo gitlab-backup create INCREMENTAL=yes SKIP=tar -``` - -#### Back up specific repository storages - -> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/86896) in GitLab 15.0. - -When using [multiple repository storages](../administration/repository_storage_paths.md), -repositories from specific repository storages can be backed up separately -using the `REPOSITORIES_STORAGES` option. The option accepts a comma-separated list of -storage names. 
- -For example, for Omnibus GitLab installations: - -```shell -sudo gitlab-backup create REPOSITORIES_STORAGES=storage1,storage2 -``` - -For example, for installations from source: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:create REPOSITORIES_STORAGES=storage1,storage2 -``` - -#### Back up specific repositories - -> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/88094) in GitLab 15.1. - -You can back up specific repositories using the `REPOSITORIES_PATHS` option. -Similarly, you can use `SKIP_REPOSITORIES_PATHS` to skip certain repositories. -Both options accept a comma-separated list of project or group paths. If you -specify a group path, all repositories in all projects in the group and -descendent groups are included or skipped, depending on which option you used. - -For example, to back up all repositories for all projects in **Group A** (`group-a`), the repository for **Project C** in **Group B** (`group-b/project-c`), -and skip the **Project D** in **Group A** (`group-a/project-d`): - -- Omnibus GitLab installations: - - ```shell - sudo gitlab-backup create REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d - ``` - -- Installations from source: - - ```shell - sudo -u git -H bundle exec rake gitlab:backup:create REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d - ``` - -#### Upload backups to a remote (cloud) storage - -NOTE: -It is not possible to [skip the tar creation](#skipping-tar-creation) when using object storage for backups. - -You can let the backup script upload (using the [Fog library](https://fog.io/)) -the `.tar` file it creates. In the following example, we use Amazon S3 for -storage, but Fog also lets you use [other storage providers](https://fog.io/storage/). -GitLab also [imports cloud drivers](https://gitlab.com/gitlab-org/gitlab/-/blob/da46c9655962df7d49caef0e2b9f6bbe88462a02/Gemfile#L113) -for AWS, Google, and Aliyun. A local driver is -[also available](#upload-to-locally-mounted-shares). - -[Read more about using object storage with GitLab](../administration/object_storage.md). - -##### Using Amazon S3 - -For Omnibus GitLab packages: - -1. Add the following to `/etc/gitlab/gitlab.rb`: - - ```ruby - gitlab_rails['backup_upload_connection'] = { - 'provider' => 'AWS', - 'region' => 'eu-west-1', - 'aws_access_key_id' => 'AKIAKIAKI', - 'aws_secret_access_key' => 'secret123' - # If using an IAM Profile, don't configure aws_access_key_id & aws_secret_access_key - # 'use_iam_profile' => true - } - gitlab_rails['backup_upload_remote_directory'] = 'my.s3.bucket' - # Consider using multipart uploads when file size reaches 100MB. Enter a number in bytes. - # gitlab_rails['backup_multipart_chunk_size'] = 104857600 - ``` - -1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect - -##### S3 Encrypted Buckets - -> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64765) in GitLab 14.3. - -AWS supports these [modes for server side encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/serv-side-encryption.html): - -- Amazon S3-Managed Keys (SSE-S3) -- Customer Master Keys (CMKs) Stored in AWS Key Management Service (SSE-KMS) -- Customer-Provided Keys (SSE-C) - -Use your mode of choice with GitLab. Each mode has similar, but slightly -different, configuration methods. 
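To confirm after the fact which server-side encryption mode was applied to an uploaded backup, one option is to inspect the object metadata with the AWS CLI. This is only a sketch: the bucket matches the example configuration above, and the object key is the hypothetical timestamped archive name used elsewhere on this page:

```shell
# Prints the ServerSideEncryption value (for example "AES256" or "aws:kms") recorded on the object.
aws s3api head-object \
  --bucket my.s3.bucket \
  --key 1493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar \
  --query ServerSideEncryption
```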
- -###### SSE-S3 - -To enable SSE-S3, in the backup storage options set the `server_side_encryption` -field to `AES256`. For example, in Omnibus GitLab: - -```ruby -gitlab_rails['backup_upload_storage_options'] = { - 'server_side_encryption' => 'AES256' -} -``` - -###### SSE-KMS - -To enable SSE-KMS, you need the -[KMS key via its Amazon Resource Name (ARN) in the `arn:aws:kms:region:acct-id:key/key-id` format](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html). -Under the `backup_upload_storage_options` configuration setting, set: - -- `server_side_encryption` to `aws:kms`. -- `server_side_encryption_kms_key_id` to the ARN of the key. - -For example, in Omnibus GitLab: - -```ruby -gitlab_rails['backup_upload_storage_options'] = { - 'server_side_encryption' => 'aws:kms', - 'server_side_encryption_kms_key_id' => 'arn:aws::' -} -``` - -###### SSE-C - -SSE-C requires you to set these encryption options: - -- `backup_encryption`: AES256. -- `backup_encryption_key`: Unencoded, 32-byte (256 bits) key. The upload fails if this isn't exactly 32 bytes. - -For example, in Omnibus GitLab: - -```ruby -gitlab_rails['backup_encryption'] = 'AES256' -gitlab_rails['backup_encryption_key'] = '' -``` - -If the key contains binary characters and cannot be encoded in UTF-8, -instead, specify the key with the `GITLAB_BACKUP_ENCRYPTION_KEY` environment variable. -For example: - -```ruby -gitlab_rails['env'] = { 'GITLAB_BACKUP_ENCRYPTION_KEY' => "\xDE\xAD\xBE\xEF" * 8 } -``` - -##### Digital Ocean Spaces - -This example can be used for a bucket in Amsterdam (AMS3): - -1. Add the following to `/etc/gitlab/gitlab.rb`: - - ```ruby - gitlab_rails['backup_upload_connection'] = { - 'provider' => 'AWS', - 'region' => 'ams3', - 'aws_access_key_id' => 'AKIAKIAKI', - 'aws_secret_access_key' => 'secret123', - 'endpoint' => 'https://ams3.digitaloceanspaces.com' - } - gitlab_rails['backup_upload_remote_directory'] = 'my.s3.bucket' - ``` - -1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect - -If you see a `400 Bad Request` error message when using Digital Ocean Spaces, -the cause may be the use of backup encryption. Because Digital Ocean Spaces -doesn't support encryption, remove or comment the line that contains -`gitlab_rails['backup_encryption']`. - -##### Other S3 Providers - -Not all S3 providers are fully compatible with the Fog library. For example, -if you see a `411 Length Required` error message after attempting to upload, -you may need to downgrade the `aws_signature_version` value from the default -value to `2`, [due to this issue](https://github.com/fog/fog-aws/issues/428). - -For installations from source: - -1. Edit `home/git/gitlab/config/gitlab.yml`: - - ```yaml - backup: - # snip - upload: - # Fog storage connection settings, see https://fog.io/storage/ . - connection: - provider: AWS - region: eu-west-1 - aws_access_key_id: AKIAKIAKI - aws_secret_access_key: 'secret123' - # If using an IAM Profile, leave aws_access_key_id & aws_secret_access_key empty - # ie. aws_access_key_id: '' - # use_iam_profile: 'true' - # The remote 'directory' to store your backups. For S3, this would be the bucket name. 
- remote_directory: 'my.s3.bucket' - # Specifies Amazon S3 storage class to use for backups, this is optional - # storage_class: 'STANDARD' - # - # Turns on AWS Server-Side Encryption with Amazon Customer-Provided Encryption Keys for backups, this is optional - # 'encryption' must be set in order for this to have any effect. - # 'encryption_key' should be set to the 256-bit encryption key for Amazon S3 to use to encrypt or decrypt. - # To avoid storing the key on disk, the key can also be specified via the `GITLAB_BACKUP_ENCRYPTION_KEY` your data. - # encryption: 'AES256' - # encryption_key: '' - # - # - # Turns on AWS Server-Side Encryption with Amazon S3-Managed keys (optional) - # https://docs.aws.amazon.com/AmazonS3/latest/userguide/serv-side-encryption.html - # For SSE-S3, set 'server_side_encryption' to 'AES256'. - # For SS3-KMS, set 'server_side_encryption' to 'aws:kms'. Set - # 'server_side_encryption_kms_key_id' to the ARN of customer master key. - # storage_options: - # server_side_encryption: 'aws:kms' - # server_side_encryption_kms_key_id: 'arn:aws:kms:YOUR-KEY-ID-HERE' - ``` - -1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) - for the changes to take effect - -If you're uploading your backups to S3, you should create a new -IAM user with restricted access rights. To give the upload user access only for -uploading backups create the following IAM profile, replacing `my.s3.bucket` -with the name of your bucket: - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Sid": "Stmt1412062044000", - "Effect": "Allow", - "Action": [ - "s3:AbortMultipartUpload", - "s3:GetBucketAcl", - "s3:GetBucketLocation", - "s3:GetObject", - "s3:GetObjectAcl", - "s3:ListBucketMultipartUploads", - "s3:PutObject", - "s3:PutObjectAcl" - ], - "Resource": [ - "arn:aws:s3:::my.s3.bucket/*" - ] - }, - { - "Sid": "Stmt1412062097000", - "Effect": "Allow", - "Action": [ - "s3:GetBucketLocation", - "s3:ListAllMyBuckets" - ], - "Resource": [ - "*" - ] - }, - { - "Sid": "Stmt1412062128000", - "Effect": "Allow", - "Action": [ - "s3:ListBucket" - ], - "Resource": [ - "arn:aws:s3:::my.s3.bucket" - ] - } - ] -} -``` - -##### Using Google Cloud Storage - -To use Google Cloud Storage to save backups, you must first create an -access key from the Google console: - -1. Go to the [Google storage settings page](https://console.cloud.google.com/storage/settings). -1. Select **Interoperability**, and then create an access key. -1. Make note of the **Access Key** and **Secret** and replace them in the - following configurations. -1. In the buckets advanced settings ensure the Access Control option - **Set object-level and bucket-level permissions** is selected. -1. Ensure you have already created a bucket. - -For Omnibus GitLab packages: - -1. Edit `/etc/gitlab/gitlab.rb`: - - ```ruby - gitlab_rails['backup_upload_connection'] = { - 'provider' => 'Google', - 'google_storage_access_key_id' => 'Access Key', - 'google_storage_secret_access_key' => 'Secret', - - ## If you have CNAME buckets (foo.example.com), you might run into SSL issues - ## when uploading backups ("hostname foo.example.com.storage.googleapis.com - ## does not match the server certificate"). In that case, uncomnent the following - ## setting. See: https://github.com/fog/fog/issues/2834 - #'path_style' => true - } - gitlab_rails['backup_upload_remote_directory'] = 'my.google.bucket' - ``` - -1. 
[Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect - -For installations from source: - -1. Edit `home/git/gitlab/config/gitlab.yml`: - - ```yaml - backup: - upload: - connection: - provider: 'Google' - google_storage_access_key_id: 'Access Key' - google_storage_secret_access_key: 'Secret' - remote_directory: 'my.google.bucket' - ``` - -1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) - for the changes to take effect - -##### Using Azure Blob storage - -> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/25877) in GitLab 13.4. - -For Omnibus GitLab packages: - -1. Edit `/etc/gitlab/gitlab.rb`: - - ```ruby - gitlab_rails['backup_upload_connection'] = { - 'provider' => 'AzureRM', - 'azure_storage_account_name' => '', - 'azure_storage_access_key' => '', - 'azure_storage_domain' => 'blob.core.windows.net', # Optional - } - gitlab_rails['backup_upload_remote_directory'] = '' - ``` - -1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect - -For installations from source: - -1. Edit `home/git/gitlab/config/gitlab.yml`: - - ```yaml - backup: - upload: - connection: - provider: 'AzureRM' - azure_storage_account_name: '' - azure_storage_access_key: '' - remote_directory: '' - ``` - -1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) - for the changes to take effect - -For more details, see the [table of Azure parameters](../administration/object_storage.md#azure-blob-storage). - -##### Specifying a custom directory for backups - -This option works only for remote storage. If you want to group your backups, -you can pass a `DIRECTORY` environment variable: - -```shell -sudo gitlab-backup create DIRECTORY=daily -sudo gitlab-backup create DIRECTORY=weekly -``` - -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - -#### Skip uploading backups to remote storage - -If you have configured GitLab to [upload backups in a remote storage](#upload-backups-to-a-remote-cloud-storage), -you can use the `SKIP=remote` option to skip uploading your backups to the remote storage. - -For Omnibus GitLab packages: - -```shell -sudo gitlab-backup create SKIP=remote -``` - -For installations from source: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:create SKIP=remote RAILS_ENV=production -``` - -#### Upload to locally-mounted shares - -You can send backups to a locally-mounted share (for example, `NFS`,`CIFS`, or `SMB`) using the Fog -[`Local`](https://github.com/fog/fog-local#usage) storage provider. - -To do this, you must set the following configuration keys: - -- `backup_upload_connection.local_root`: mounted directory that backups are copied to. -- `backup_upload_remote_directory`: subdirectory of the `backup_upload_connection.local_root` directory. It is created if it doesn't exist. - If you want to copy the tarballs to the root of your mounted directory, use `.`. - -When mounted, the directory set in the `local_root` key must be owned by either: - -- The `git` user. So, mounting with the `uid=` of the `git` user for `CIFS` and `SMB`. -- The user that you are executing the backup tasks as. For Omnibus GitLab, this is the `git` user. 
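For the `CIFS`/`SMB` case in the list above, a mount sketch that maps ownership to the `git` user might look like the following; the file server share and credentials file are assumptions for illustration:

```shell
# Mount an SMB share so files copied to it are owned by the git user.
# //fileserver/gitlab-backups and /root/.smb-backup-credentials are placeholders.
sudo mkdir -p /mnt/backups
sudo mount -t cifs //fileserver/gitlab-backups /mnt/backups \
  -o uid=git,gid=git,credentials=/root/.smb-backup-credentials
```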
- -Because file system performance may affect overall GitLab performance, -[we don't recommend using cloud-based file systems for storage](../administration/nfs.md#avoid-using-cloud-based-file-systems). - -##### Avoid conflicting configuration - -Don't set the following configuration keys to the same path: - -- `gitlab_rails['backup_path']` (`backup.path` for source installations). -- `gitlab_rails['backup_upload_connection'].local_root` (`backup.upload.connection.local_root` for source installations). - -The `backup_path` configuration key sets the local location of the backup file. The `upload` configuration key is -intended for use when the backup file is uploaded to a separate server, perhaps for archival purposes. - -If these configuration keys are set to the same location, the upload feature fails because a backup already exists at -the upload location. This failure causes the upload feature to delete the backup because it assumes it's a residual file -remaining after the failed upload attempt. - -##### Configure uploads to locally-mounted shares - -For Omnibus GitLab packages: - -1. Edit `/etc/gitlab/gitlab.rb`: - - ```ruby - gitlab_rails['backup_upload_connection'] = { - :provider => 'Local', - :local_root => '/mnt/backups' - } - - # The directory inside the mounted folder to copy backups to - # Use '.' to store them in the root directory - gitlab_rails['backup_upload_remote_directory'] = 'gitlab_backups' - ``` - -1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect. - -For installations from source: - -1. Edit `home/git/gitlab/config/gitlab.yml`: - - ```yaml - backup: - upload: - # Fog storage connection settings, see https://fog.io/storage/ . - connection: - provider: Local - local_root: '/mnt/backups' - # The directory inside the mounted folder to copy backups to - # Use '.' to store them in the root directory - remote_directory: 'gitlab_backups' - ``` - -1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) - for the changes to take effect. - -#### Backup archive permissions - -The backup archives created by GitLab (`1393513186_2014_02_27_gitlab_backup.tar`) -have the owner/group `git`/`git` and 0600 permissions by default. This is -meant to avoid other system users reading GitLab data. If you need the backup -archives to have different permissions, you can use the `archive_permissions` -setting. - -For Omnibus GitLab packages: - -1. Edit `/etc/gitlab/gitlab.rb`: - - ```ruby - gitlab_rails['backup_archive_permissions'] = 0644 # Makes the backup archives world-readable - ``` - -1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect. - -For installations from source: - -1. Edit `/home/git/gitlab/config/gitlab.yml`: - - ```yaml - backup: - archive_permissions: 0644 # Makes the backup archives world-readable - ``` - -1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) - for the changes to take effect. - -#### Configuring cron to make daily backups - -WARNING: -The following cron jobs do not [back up your GitLab configuration files](#storing-configuration-files) -or [SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079). - -You can schedule a cron job that backs up your repositories and GitLab metadata. - -For Omnibus GitLab packages: - -1. 
Edit the crontab for the `root` user: - - ```shell - sudo su - - crontab -e - ``` - -1. There, add the following line to schedule the backup for everyday at 2 AM: - - ```plaintext - 0 2 * * * /opt/gitlab/bin/gitlab-backup create CRON=1 - ``` - - Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - -For installations from source: - -1. Edit the crontab for the `git` user: - - ```shell - sudo -u git crontab -e - ``` - -1. Add the following lines at the bottom: - - ```plaintext - # Create a full backup of the GitLab repositories and SQL database every day at 2am - 0 2 * * * cd /home/git/gitlab && PATH=/usr/local/bin:/usr/bin:/bin bundle exec rake gitlab:backup:create RAILS_ENV=production CRON=1 - ``` - -The `CRON=1` environment setting directs the backup script to hide all progress -output if there aren't any errors. This is recommended to reduce cron spam. -When troubleshooting backup problems, however, replace `CRON=1` with `--trace` to log verbosely. - -#### Limit backup lifetime for local files (prune old backups) - -WARNING: -The process described in this section doesn't work if you used a [custom filename](#backup-filename) -for your backups. - -To prevent regular backups from using all your disk space, you may want to set a limited lifetime -for backups. The next time the backup task runs, backups older than the `backup_keep_time` are -pruned. - -This configuration option manages only local files. GitLab doesn't prune old -files stored in a third-party [object storage](#upload-backups-to-a-remote-cloud-storage) -because the user may not have permission to list and delete files. It's -recommended that you configure the appropriate retention policy for your object -storage (for example, [AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html)). - -For Omnibus GitLab packages: - -1. Edit `/etc/gitlab/gitlab.rb`: - - ```ruby - ## Limit backup lifetime to 7 days - 604800 seconds - gitlab_rails['backup_keep_time'] = 604800 - ``` - -1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect. - -For installations from source: - -1. Edit `/home/git/gitlab/config/gitlab.yml`: - - ```yaml - backup: - ## Limit backup lifetime to 7 days - 604800 seconds - keep_time: 604800 - ``` - -1. [Restart GitLab](../administration/restart_gitlab.md#installations-from-source) - for the changes to take effect. - -#### Back up and restore for installations using PgBouncer - -Do not back up or restore GitLab through a PgBouncer connection. These -tasks must [bypass PgBouncer and connect directly to the PostgreSQL primary database node](#bypassing-pgbouncer), -or they cause a GitLab outage. - -When the GitLab backup or restore task is used with PgBouncer, the -following error message is shown: - -```ruby -ActiveRecord::StatementInvalid: PG::UndefinedTable -``` - -Each time the GitLab backup runs, GitLab starts generating 500 errors and errors about missing -tables will [be logged by PostgreSQL](../administration/logs/index.md#postgresql-logs): - -```plaintext -ERROR: relation "tablename" does not exist at character 123 -``` - -This happens because the task uses `pg_dump`, which -[sets a null search path and explicitly includes the schema in every SQL query](https://gitlab.com/gitlab-org/gitlab/-/issues/23211) -to address [CVE-2018-1058](https://www.postgresql.org/about/news/postgresql-103-968-9512-9417-and-9322-released-1834/). 
- -Since connections are reused with PgBouncer in transaction pooling mode, -PostgreSQL fails to search the default `public` schema. As a result, -this clearing of the search path causes tables and columns to appear -missing. - -##### Bypassing PgBouncer - -There are two ways to fix this: - -1. [Use environment variables to override the database settings](#environment-variable-overrides) for the backup task. -1. Reconfigure a node to [connect directly to the PostgreSQL primary database node](../administration/postgresql/pgbouncer.md#procedure-for-bypassing-pgbouncer). - -###### Environment variable overrides - -By default, GitLab uses the database configuration stored in a -configuration file (`database.yml`). However, you can override the database settings -for the backup and restore task by setting environment -variables that are prefixed with `GITLAB_BACKUP_`: - -- `GITLAB_BACKUP_PGHOST` -- `GITLAB_BACKUP_PGUSER` -- `GITLAB_BACKUP_PGPORT` -- `GITLAB_BACKUP_PGPASSWORD` -- `GITLAB_BACKUP_PGSSLMODE` -- `GITLAB_BACKUP_PGSSLKEY` -- `GITLAB_BACKUP_PGSSLCERT` -- `GITLAB_BACKUP_PGSSLROOTCERT` -- `GITLAB_BACKUP_PGSSLCRL` -- `GITLAB_BACKUP_PGSSLCOMPRESSION` - -For example, to override the database host and port to use 192.168.1.10 -and port 5432 with the Omnibus package: - -```shell -sudo GITLAB_BACKUP_PGHOST=192.168.1.10 GITLAB_BACKUP_PGPORT=5432 /opt/gitlab/bin/gitlab-backup create -``` - -See the [PostgreSQL documentation](https://www.postgresql.org/docs/12/libpq-envars.html) -for more details on what these parameters do. - -#### `gitaly-backup` for repository backup and restore - -> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.2. -> - [Deployed behind a feature flag](../user/feature_flags.md), enabled by default. -> - [Generally available](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.10. [Feature flag `gitaly_backup`](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/83254) removed. - -The `gitaly-backup` binary is used by the backup Rake task to create and restore repository backups from Gitaly. -`gitaly-backup` replaces the previous backup method that directly calls RPCs on Gitaly from GitLab. - -The backup Rake task must be able to find this executable. In most cases, you don't need to change -the path to the binary as it should work fine with the default path `/opt/gitlab/embedded/bin/gitaly-backup`. -If you have a specific reason to change the path, it can be configured in Omnibus GitLab packages: - -1. Add the following to `/etc/gitlab/gitlab.rb`: - - ```ruby - gitlab_rails['backup_gitaly_backup_path'] = '/path/to/gitaly-backup' - ``` - -1. [Reconfigure GitLab](../administration/restart_gitlab.md#reconfigure-a-linux-package-installation) - for the changes to take effect. - -## Alternative backup strategies - -In the following cases, consider using file system data transfer or snapshots as part of your backup strategy: - -- Your GitLab instance contains a lot of Git repository data and the GitLab backup script is too slow. -- Your GitLab instance has a lot of forked projects and the regular backup task duplicates the Git data for all of them. -- Your GitLab instance has a problem and using the regular backup and import Rake tasks isn't possible. - -WARNING: -Gitaly Cluster [does not support snapshot backups](../administration/gitaly/index.md#snapshot-backup-and-recovery-limitations). - -When considering using file system data transfer or snapshots: - -- Don't use these methods to migrate from one operating system to another. 
The operating systems of the source and destination should be as similar as possible. For example,
- don't use these methods to migrate from Ubuntu to Fedora.
-- Data consistency is very important. We recommend stopping GitLab with `sudo gitlab-ctl stop` before doing a file system transfer (with rsync, for example) or taking a
- snapshot.
-
-Example: Amazon Elastic Block Store (EBS)
-
-> A GitLab server using Omnibus GitLab hosted on Amazon AWS.
-> An EBS drive containing an ext4 file system is mounted at `/var/opt/gitlab`.
-> In this case you could make an application backup by taking an EBS snapshot.
-> The backup includes all repositories, uploads and PostgreSQL data.
-
-Example: Logical Volume Manager (LVM) snapshots + rsync
-
-> A GitLab server using Omnibus GitLab, with an LVM logical volume mounted at `/var/opt/gitlab`.
-> Replicating the `/var/opt/gitlab` directory using rsync would not be reliable because too many files would change while rsync is running.
-> Instead of rsync-ing `/var/opt/gitlab`, we create a temporary LVM snapshot, which we mount as a read-only file system at `/mnt/gitlab_backup`.
-> Now we can have a longer running rsync job which creates a consistent replica on the remote server.
-> The replica includes all repositories, uploads and PostgreSQL data.
-
-If you're running GitLab on a virtualized server, you can possibly also create
-VM snapshots of the entire GitLab server. It's not uncommon, however, for a VM
-snapshot to require you to power down the server, which limits this solution's
-practical use.
-
-### Back up repository data separately
-
-First, ensure you back up existing GitLab data while [skipping repositories](#excluding-specific-directories-from-the-backup):
-
-```shell
-# for Omnibus GitLab package installations
-sudo gitlab-backup create SKIP=repositories
-
-# for installations from source:
-sudo -u git -H bundle exec rake gitlab:backup:create SKIP=repositories RAILS_ENV=production
-```
-
-For manually backing up the Git repository data on disk, there are multiple possible strategies:
-
-- Use snapshots, such as the previous examples of Amazon EBS drive snapshots, or LVM snapshots + rsync.
-- Use [GitLab Geo](../administration/geo/index.md) and rely on the repository data on a Geo secondary site.
-- [Prevent writes and copy the Git repository data](#prevent-writes-and-copy-the-git-repository-data).
-- [Create an online backup by marking repositories as read-only (experimental)](#online-backup-through-marking-repositories-as-read-only-experimental).
-
-#### Prevent writes and copy the Git repository data
-
-Git repositories must be copied in a consistent way. They should not be copied during concurrent write
-operations, as this can lead to inconsistencies or corruption issues. For more details,
-[issue #270422](https://gitlab.com/gitlab-org/gitlab/-/issues/270422 "Provide documentation on preferred method of migrating Gitaly servers")
-has a longer discussion explaining the potential problems.
-
-To prevent writes to the Git repository data, there are two possible approaches:
-
-- Use [maintenance mode](../administration/maintenance_mode/index.md) to place GitLab in a read-only state.
-- Create explicit downtime by stopping all Gitaly services before backing up the repositories: - - ```shell - sudo gitlab-ctl stop gitaly - # execute git data copy step - sudo gitlab-ctl start gitaly - ``` - -You can copy Git repository data using any method, as long as writes are prevented on the data being copied -(to prevent inconsistencies and corruption issues). In order of preference and safety, the recommended methods are: - -1. Use `rsync` with archive-mode, delete, and checksum options, for example: - - ```shell - rsync -aR --delete --checksum source destination # be extra safe with the order as it will delete existing data if inverted - ``` - -1. Use a [`tar` pipe to copy the entire repository's directory to another server or location](../administration/operations/moving_repositories.md#tar-pipe-to-another-server). - -1. Use `sftp`, `scp`, `cp`, or any other copying method. - -#### Online backup through marking repositories as read-only (experimental) - -One way of backing up repositories without requiring instance-wide downtime -is to programmatically mark projects as read-only while copying the underlying data. - -There are a few possible downsides to this: - -- Repositories are read-only for a period of time that scales with the size of the repository. -- Backups take a longer time to complete due to marking each project as read-only, potentially leading to inconsistencies. For example, - a possible date discrepancy between the last data available for the first project that gets backed up compared to - the last project that gets backed up. -- Fork networks should be entirely read-only while the projects inside get backed up to prevent potential changes to the pool repository. - -There is an **experimental** script that attempts to automate this process in -[the Geo team Runbooks project](https://gitlab.com/gitlab-org/geo-team/runbooks/-/tree/main/experimental-online-backup-through-rsync). - -## Troubleshooting - -The following are possible problems you might encounter, along with potential -solutions. - -### When the secrets file is lost - -If you didn't [back up the secrets file](#storing-configuration-files), you -must complete several steps to get GitLab working properly again. - -The secrets file is responsible for storing the encryption key for the columns -that contain required, sensitive information. If the key is lost, GitLab can't -decrypt those columns, preventing access to the following items: - -- [CI/CD variables](../ci/variables/index.md) -- [Kubernetes / GCP integration](../user/infrastructure/clusters/index.md) -- [Custom Pages domains](../user/project/pages/custom_domains_ssl_tls_certification/index.md) -- [Project error tracking](../operations/error_tracking.md) -- [Runner authentication](../ci/runners/index.md) -- [Project mirroring](../user/project/repository/mirror/index.md) -- [Integrations](../user/project/integrations/index.md) -- [Web hooks](../user/project/integrations/webhooks.md) - -In cases like CI/CD variables and runner authentication, you can experience -unexpected behaviors, such as: - -- Stuck jobs. -- 500 errors. - -In this case, you must reset all the tokens for CI/CD variables and -runner authentication, which is described in more detail in the following -sections. After resetting the tokens, you should be able to visit your project -and the jobs begin running again. - -Use the information in the following sections at your own risk. 
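Before working through the sections that follow, consider taking at least a database backup so the data you are about to modify can be recovered. A minimal sketch using the backup command and the `SKIP` component names documented earlier on this page:

```shell
# Full backup:
sudo gitlab-backup create

# Or a database-focused archive that skips every other component:
sudo gitlab-backup create SKIP=uploads,builds,artifacts,lfs,terraform_state,registry,pages,repositories,packages,ci_secure_files
```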
- -#### Verify that all values can be decrypted - -You can determine if your database contains values that can't be decrypted by using a -[Rake task](../administration/raketasks/check.md#verify-database-values-can-be-decrypted-using-the-current-secrets). - -#### Take a backup - -You must directly modify GitLab data to work around your lost secrets file. - -WARNING: -Be sure to create a full database backup before attempting any changes. - -#### Disable user two-factor authentication (2FA) - -Users with 2FA enabled can't sign in to GitLab. In that case, you must -[disable 2FA for everyone](../security/two_factor_authentication.md#for-all-users), -after which users must reactivate 2FA. - -#### Reset CI/CD variables - -1. Enter the database console: - - For Omnibus GitLab 14.1 and earlier: - - ```shell - sudo gitlab-rails dbconsole - ``` - - For Omnibus GitLab 14.2 and later: - - ```shell - sudo gitlab-rails dbconsole --database main - ``` - - For installations from source, GitLab 14.1 and earlier: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production - ``` - - For installations from source, GitLab 14.2 and later: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production --database main - ``` - -1. Examine the `ci_group_variables` and `ci_variables` tables: - - ```sql - SELECT * FROM public."ci_group_variables"; - SELECT * FROM public."ci_variables"; - ``` - - These are the variables that you need to delete. - -1. Delete all variables: - - ```sql - DELETE FROM ci_group_variables; - DELETE FROM ci_variables; - ``` - -1. If you know the specific group or project from which you wish to delete variables, you can include a `WHERE` statement to specify that in your `DELETE`: - - ```sql - DELETE FROM ci_group_variables WHERE group_id = ; - DELETE FROM ci_variables WHERE project_id = ; - ``` - -You may need to reconfigure or restart GitLab for the changes to take effect. - -#### Reset runner registration tokens - -1. Enter the database console: - - For Omnibus GitLab 14.1 and earlier: - - ```shell - sudo gitlab-rails dbconsole - ``` - - For Omnibus GitLab 14.2 and later: - - ```shell - sudo gitlab-rails dbconsole --database main - ``` - - For installations from source, GitLab 14.1 and earlier: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production - ``` - - For installations from source, GitLab 14.2 and later: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production --database main - ``` - -1. Clear all tokens for projects, groups, and the entire instance: - - WARNING: - The final `UPDATE` operation stops the runners from being able to pick - up new jobs. You must register new runners. - - ```sql - -- Clear project tokens - UPDATE projects SET runners_token = null, runners_token_encrypted = null; - -- Clear group tokens - UPDATE namespaces SET runners_token = null, runners_token_encrypted = null; - -- Clear instance tokens - UPDATE application_settings SET runners_registration_token_encrypted = null; - -- Clear key used for JWT authentication - -- This may break the $CI_JWT_TOKEN job variable: - -- https://gitlab.com/gitlab-org/gitlab/-/issues/325965 - UPDATE application_settings SET encrypted_ci_jwt_signing_key = null; - -- Clear runner tokens - UPDATE ci_runners SET token = null, token_encrypted = null; - ``` - -#### Reset pending pipeline jobs - -1. 
Enter the database console: - - For Omnibus GitLab 14.1 and earlier: - - ```shell - sudo gitlab-rails dbconsole - ``` - - For Omnibus GitLab 14.2 and later: - - ```shell - sudo gitlab-rails dbconsole --database main - ``` - - For installations from source, GitLab 14.1 and earlier: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production - ``` - - For installations from source, GitLab 14.2 and later: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production --database main - ``` - -1. Clear all the tokens for pending jobs: - - For GitLab 15.3 and earlier: - - ```sql - -- Clear build tokens - UPDATE ci_builds SET token = null, token_encrypted = null; - ``` - - For GitLab 15.4 and later: - - ```sql - -- Clear build tokens - UPDATE ci_builds SET token_encrypted = null; - ``` - -A similar strategy can be employed for the remaining features. By removing the -data that can't be decrypted, GitLab can be returned to operation, and the -lost data can be manually replaced. - -#### Fix integrations and webhooks - -If you've lost your secrets, the [integrations settings pages](../user/project/integrations/index.md) -and [webhooks settings pages](../user/project/integrations/webhooks.md) are probably displaying `500` error messages. - -The fix is to truncate the affected tables (those containing encrypted columns). -This deletes all your configured integrations, webhooks, and related metadata. -You should verify that the secrets are the root cause before deleting any data. - -1. Enter the database console: - - For Omnibus GitLab 14.1 and earlier: - - ```shell - sudo gitlab-rails dbconsole - ``` - - For Omnibus GitLab 14.2 and later: - - ```shell - sudo gitlab-rails dbconsole --database main - ``` - - For installations from source, GitLab 14.1 and earlier: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production - ``` - - For installations from source, GitLab 14.2 and later: - - ```shell - sudo -u git -H bundle exec rails dbconsole -e production --database main - ``` - -1. Truncate the following tables: - - ```sql - -- truncate web_hooks table - TRUNCATE integrations, chat_names, issue_tracker_data, jira_tracker_data, slack_integrations, web_hooks, zentao_tracker_data, web_hook_logs; - ``` - -### Container Registry push failures after restoring from a backup - -If you use the [Container Registry](../user/packages/container_registry/index.md), -pushes to the registry may fail after restoring your backup on an Omnibus GitLab -instance after restoring the registry data. - -These failures mention permission issues in the registry logs, similar to: - -```plaintext -level=error -msg="response completed with error" -err.code=unknown -err.detail="filesystem: mkdir /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/...: permission denied" -err.message="unknown error" -``` - -This issue is caused by the restore running as the unprivileged user `git`, -which is unable to assign the correct ownership to the registry files during -the restore process ([issue #62759](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/62759 "Incorrect permissions on registry filesystem after restore")). - -To get your registry working again: - -```shell -sudo chown -R registry:registry /var/opt/gitlab/gitlab-rails/shared/registry/docker -``` - -If you changed the default file system location for the registry, run `chown` -against your custom location, instead of `/var/opt/gitlab/gitlab-rails/shared/registry/docker`. 
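For example, if the registry data lives in a relocated directory (the path below is purely illustrative), point `chown` at that directory instead:

```shell
# Substitute your configured registry storage path.
sudo chown -R registry:registry /srv/gitlab/registry/docker
```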
- -### Backup fails to complete with Gzip error - -When running the backup, you may receive a Gzip error message: - -```shell -sudo /opt/gitlab/bin/gitlab-backup create -... -Dumping ... -... -gzip: stdout: Input/output error - -Backup failed -``` - -If this happens, examine the following: - -- Confirm there is sufficient disk space for the Gzip operation. It's not uncommon for backups that - use the [default strategy](#backup-strategy-option) to require half the instance size - in free disk space during backup creation. -- If NFS is being used, check if the mount option `timeout` is set. The - default is `600`, and changing this to smaller values results in this error. - -### Backup fails with `File name too long` error - -During backup, you can get the `File name too long` error ([issue #354984](https://gitlab.com/gitlab-org/gitlab/-/issues/354984)). For example: - -```plaintext -Problem: |\r\n]+$'))[1]) > 246; - - CREATE INDEX ON uploads_with_long_filenames(row_id); - - SELECT - u.id, - u.path, - -- Current filename - (regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] AS current_filename, - -- New filename - CONCAT( - LEFT(SPLIT_PART((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), - COALESCE(SUBSTRING((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) - ) AS new_filename, - -- New path - CONCAT( - COALESCE((regexp_match(u.path, '(.*\/).*'))[1], ''), - CONCAT( - LEFT(SPLIT_PART((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), - COALESCE(SUBSTRING((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) - ) - ) AS new_path - FROM uploads_with_long_filenames AS u - WHERE u.row_id > 0 AND u.row_id <= 10000; - ``` - - Output example: - - ```postgresql - -[ RECORD 1 ]----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - id | 34 - path | public/@hashed/loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisit.txt - current_filename | loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisit.txt - new_filename | loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelits.txt - new_path | public/@hashed/loremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelitsedvulputatemisitloremipsumdolorsitametconsecteturadipiscingelitseddoeiusmodtemporincididuntutlaboreetdoloremagnaaliquaauctorelits.txt - ``` - - Where: - - - `current_filename`: a filename that is currently more than 246 characters long. - - `new_filename`: a filename that has been truncated to 246 characters maximum. - - `new_path`: new path considering the `new_filename` (truncated). 
- - Once you validate the batch results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. - -1. Rename the files found in the `uploads` table from long filenames to new truncated filenames. The following query rolls back the update so you can check the results safely within a transaction wrapper: - - ```sql - CREATE TEMP TABLE uploads_with_long_filenames AS - SELECT ROW_NUMBER() OVER(ORDER BY id) row_id, path, id - FROM uploads AS u - WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246; - - CREATE INDEX ON uploads_with_long_filenames(row_id); - - BEGIN; - WITH updated_uploads AS ( - UPDATE uploads - SET - path = - CONCAT( - COALESCE((regexp_match(updatable_uploads.path, '(.*\/).*'))[1], ''), - CONCAT( - LEFT(SPLIT_PART((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), - COALESCE(SUBSTRING((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) - ) - ) - FROM - uploads_with_long_filenames AS updatable_uploads - WHERE - uploads.id = updatable_uploads.id - AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000 - RETURNING uploads.* - ) - SELECT id, path FROM updated_uploads; - ROLLBACK; - ``` - - Once you validate the batch update results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. - -1. Validate that the new filenames from the previous query are the expected ones. If you are sure you want to truncate the records found in the previous step to 246 characters, run the following: - - WARNING: - The following action is **irreversible**. - - ```sql - CREATE TEMP TABLE uploads_with_long_filenames AS - SELECT ROW_NUMBER() OVER(ORDER BY id) row_id, path, id - FROM uploads AS u - WHERE LENGTH((regexp_match(u.path, '[^\\/:*?"<>|\r\n]+$'))[1]) > 246; - - CREATE INDEX ON uploads_with_long_filenames(row_id); - - UPDATE uploads - SET - path = - CONCAT( - COALESCE((regexp_match(updatable_uploads.path, '(.*\/).*'))[1], ''), - CONCAT( - LEFT(SPLIT_PART((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1], '.', 1), 242), - COALESCE(SUBSTRING((regexp_match(updatable_uploads.path, '[^\\/:*?"<>|\r\n]+$'))[1] FROM '\.(?:.(?!\.))+$')) - ) - ) - FROM - uploads_with_long_filenames AS updatable_uploads - WHERE - uploads.id = updatable_uploads.id - AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000; - ``` - - Once you finish the batch update, you must change the batch size (`updatable_uploads.row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. - -Truncate the filenames in the references found: - -1. Check if those records are referenced somewhere. One way to do this is to dump the database and search for the parent directory name and filename: - - 1. To dump your database, you can use the following command as an example: - - ```shell - pg_dump -h /var/opt/gitlab/postgresql/ -d gitlabhq_production > gitlab-dump.tmp - ``` - - 1. Then you can search for the references using the `grep` command. Combining the parent directory and the filename can be a good idea. For example: - - ```shell - grep public/alongfilenamehere.txt gitlab-dump.tmp - ``` - -1. Replace those long filenames using the new filenames obtained from querying the `uploads` table. 
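-
-For example, if the `grep` search shows that a note still references the old filename,
-you might update that row. The table, column, and filenames below are purely
-illustrative; adjust them to whatever your search actually returns:
-
-```sql
-UPDATE notes
-SET note = REPLACE(note, 'alongfilenamehere.txt', 'truncatedfilename.txt')
-WHERE note LIKE '%alongfilenamehere.txt%';
-```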
- -Truncate the filenames on the file system. You must manually rename the files in your file system to the new filenames obtained from querying the `uploads` table. - -#### Re-run the backup task - -After following all the previous steps, re-run the backup task. - -### Restoring database backup fails when `pg_stat_statements` was previously enabled - -The GitLab backup of the PostgreSQL database includes all SQL statements required to enable extensions that were -previously enabled in the database. - -The `pg_stat_statements` extension can only be enabled or disabled by a PostgreSQL user with `superuser` role. -As the restore process uses a database user with limited permissions, it can't execute the following SQL statements: - -```sql -DROP EXTENSION IF EXISTS pg_stat_statements; -CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public; -``` - -When trying to restore the backup in a PostgreSQL instance that doesn't have the `pg_stats_statements` extension, -the following error message is displayed: - -```plaintext -ERROR: permission denied to create extension "pg_stat_statements" -HINT: Must be superuser to create this extension. -ERROR: extension "pg_stat_statements" does not exist -``` - -When trying to restore in an instance that has the `pg_stats_statements` extension enabled, the cleaning up step -fails with an error message similar to the following: - -```plaintext -rake aborted! -ActiveRecord::StatementInvalid: PG::InsufficientPrivilege: ERROR: must be owner of view pg_stat_statements -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:42:in `block (4 levels) in ' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `each' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `block (3 levels) in ' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:71:in `block (3 levels) in ' -/opt/gitlab/embedded/bin/bundle:23:in `load' -/opt/gitlab/embedded/bin/bundle:23:in `
' -Caused by: -PG::InsufficientPrivilege: ERROR: must be owner of view pg_stat_statements -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:42:in `block (4 levels) in ' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `each' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:41:in `block (3 levels) in ' -/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:71:in `block (3 levels) in ' -/opt/gitlab/embedded/bin/bundle:23:in `load' -/opt/gitlab/embedded/bin/bundle:23:in `
' -Tasks: TOP => gitlab:db:drop_tables -(See full trace by running task with --trace) -``` - -#### Prevent the dump file to include `pg_stat_statements` - -To prevent the inclusion of the extension in the PostgreSQL dump file that is part of the backup bundle, -enable the extension in any schema except the `public` schema: - -```sql -CREATE SCHEMA adm; -CREATE EXTENSION pg_stat_statements SCHEMA adm; -``` - -If the extension was previously enabled in the `public` schema, move it to a new one: - -```sql -CREATE SCHEMA adm; -ALTER EXTENSION pg_stat_statements SET SCHEMA adm; -``` - -To query the `pg_stat_statements` data after changing the schema, prefix the view name with the new schema: - -```sql -SELECT * FROM adm.pg_stat_statements limit 0; -``` - -To make it compatible with third-party monitoring solutions that expect it to be enabled in the `public` schema, -you need to include it in the `search_path`: - -```sql -set search_path to public,adm; -``` - -#### Fix an existing dump file to remove references to `pg_stat_statements` - -To fix an existing backup file, do the following changes: - -1. Extract from the backup the following file: `db/database.sql.gz`. -1. Decompress the file or use an editor that is capable of handling it compressed. -1. Remove the following lines, or similar ones: - - ```sql - CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public; - ``` - - ```sql - COMMENT ON EXTENSION pg_stat_statements IS 'track planning and execution statistics of all SQL statements executed'; - ``` - -1. Save the changes and recompress the file. -1. Update the backup file with the modified `db/database.sql.gz`. + + + + diff --git a/doc/update/package/convert_to_ee.md b/doc/update/package/convert_to_ee.md index 0e8e0b1e569093..950c1b68935edf 100644 --- a/doc/update/package/convert_to_ee.md +++ b/doc/update/package/convert_to_ee.md @@ -22,7 +22,7 @@ that may require Support intervention. The steps can be summed up to: -1. Make a [GitLab backup](../../raketasks/backup_gitlab.md). +1. Make a [GitLab backup](../../administration/backup_restore/backup_gitlab.md). 1. Find the currently installed GitLab version: diff --git a/doc/update/plan_your_upgrade.md b/doc/update/plan_your_upgrade.md index a96e02f589d1ad..9c3874893707db 100644 --- a/doc/update/plan_your_upgrade.md +++ b/doc/update/plan_your_upgrade.md @@ -77,7 +77,7 @@ to roll back GitLab to a working state if there's a problem with the upgrade: - Create a [GitLab backup](../administration/backup_restore/index.md). Make sure to follow the instructions based on your installation method. - Don't forget to back up the [secrets and configuration files](../raketasks/backup_gitlab.md#storing-configuration-files). + Don't forget to back up the [secrets and configuration files](../administration/backup_restore/backup_gitlab.md#storing-configuration-files). - Alternatively, create a snapshot of your instance. If this is a multi-node installation, you must snapshot every node. **This process is out of scope for GitLab Support.** @@ -93,7 +93,7 @@ To restore your GitLab backup: the versions of the backed up and the new GitLab instance must be the same. - [Restore GitLab](../administration/backup_restore/index.md#restore-gitlab). Make sure to follow the instructions based on your installation method. - Confirm that the [secrets and configuration files](../raketasks/backup_gitlab.md#storing-configuration-files) are also restored. 
+ Confirm that the [secrets and configuration files](../administration/backup_restore/backup_gitlab.md#storing-configuration-files) are also restored. - If restoring from a snapshot, know the steps to do this. **This process is out of scope for GitLab Support.** diff --git a/doc/user/project/repository/mirror/index.md b/doc/user/project/repository/mirror/index.md index cfb5644df6c27f..51d1d5b2c4f807 100644 --- a/doc/user/project/repository/mirror/index.md +++ b/doc/user/project/repository/mirror/index.md @@ -210,7 +210,7 @@ Older versions of SSH may require you to remove `-E md5` from the command. - Configure a [Pull Mirroring Interval](../../../../administration/instance_limits.md#pull-mirroring-interval) - [Disable mirrors for a project](../../../admin_area/settings/visibility_and_access_controls.md#enable-project-mirroring) -- [Secrets file and mirroring](../../../../raketasks/backup_gitlab.md#when-the-secrets-file-is-lost) +- [Secrets file and mirroring](../../../../administration/backup_restore/backup_gitlab.md#when-the-secrets-file-is-lost) ## Troubleshooting -- GitLab From 91bf6022794f963c053f776550b568cb22f92890 Mon Sep 17 00:00:00 2001 From: Mike Kozono Date: Wed, 21 Jun 2023 14:58:18 -1000 Subject: [PATCH 05/13] Move restore_gitlab.md to doc/administration/backup_restore/ directory --- doc/administration/backup_restore/index.md | 2 +- .../backup_restore/restore_gitlab.md | 440 ++++++++++++++++++ doc/install/aws/manual_install_aws.md | 2 +- doc/install/docker.md | 2 +- doc/raketasks/restore_gitlab.md | 440 +----------------- doc/update/package/downgrade.md | 2 +- 6 files changed, 451 insertions(+), 437 deletions(-) create mode 100644 doc/administration/backup_restore/restore_gitlab.md diff --git a/doc/administration/backup_restore/index.md b/doc/administration/backup_restore/index.md index b98775dc1c7f9b..89b71a7c621acc 100644 --- a/doc/administration/backup_restore/index.md +++ b/doc/administration/backup_restore/index.md @@ -21,7 +21,7 @@ For detailed information on backing up GitLab, see [Backup GitLab](backup_gitlab ## Restore GitLab -For detailed information on restoring GitLab, see [Restore GitLab](../raketasks/restore_gitlab.md). +For detailed information on restoring GitLab, see [Restore GitLab](restore_gitlab.md). ## Migrate to a new server diff --git a/doc/administration/backup_restore/restore_gitlab.md b/doc/administration/backup_restore/restore_gitlab.md new file mode 100644 index 00000000000000..d6b5f7ad6eb04a --- /dev/null +++ b/doc/administration/backup_restore/restore_gitlab.md @@ -0,0 +1,440 @@ +--- +stage: Systems +group: Geo +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments +--- + +# Restore GitLab **(FREE SELF)** + +GitLab provides a command line interface to restore your entire installation, +and is flexible enough to fit your needs. + +The [restore prerequisites section](#restore-prerequisites) includes crucial +information. Be sure to read and test the complete restore process at least +once before attempting to perform it in a production environment. + +You can restore a backup only to _the exact same version and type (CE/EE)_ of +GitLab that you created it on (for example CE 15.1.4). 
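+
+For example, one way to compare the two versions before you restore is to check the
+version of the running instance and the version embedded in the backup filename. The
+paths below assume a Linux package installation:
+
+```shell
+# Version of the running GitLab instance
+sudo cat /opt/gitlab/embedded/service/gitlab-rails/VERSION
+
+# The backup's version is part of the archive filename, for example
+# 11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar was created on GitLab CE 10.6.4
+ls /var/opt/gitlab/backups/
+```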
+ +If your backup is a different version than the current installation, you must +[downgrade](../../update/package/downgrade.md) or [upgrade](../../update/package/index.md#upgrade-to-a-specific-version-using-the-official-repositories) your GitLab installation +before restoring the backup. + +Each backup archive contains a full self-contained backup, including those created through the [incremental repository backup procedure](backup_gitlab.md#incremental-repository-backups). To restore an incremental repository backup, use the same instructions as restoring any other regular backup archive. + +## Restore prerequisites + +You need to have a working GitLab installation before you can perform a +restore. This is because the system user performing the restore actions (`git`) +is usually not allowed to create or delete the SQL database needed to import +data into (`gitlabhq_production`). All existing data is either erased +(SQL) or moved to a separate directory (such as repositories and uploads). +Restoring SQL data skips views owned by PostgreSQL extensions. + +To restore a backup, **you must also restore the GitLab secrets**. +These include the database encryption key, [CI/CD variables](../../ci/variables/index.md), and +variables used for [two-factor authentication](../../user/profile/account/two_factor_authentication.md). +Without the keys, [multiple issues occur](backup_gitlab.md#when-the-secrets-file-is-lost), +including loss of access by users with [two-factor authentication enabled](../../user/profile/account/two_factor_authentication.md), +and GitLab Runners cannot log in. + +Restore: + +- `/etc/gitlab/gitlab-secrets.json` (Linux package) +- `/home/git/gitlab/.secret` (self-compiled installations) +- Rails secret (cloud-native GitLab) + - [This can be converted to the Linux package format](https://docs.gitlab.com/charts/installation/migration/helm_to_package.html), if required. + +You may also want to restore your previous `/etc/gitlab/gitlab.rb` (for Omnibus packages) +or `/home/git/gitlab/config/gitlab.yml` (for installations from source) and +any TLS keys, certificates (`/etc/gitlab/ssl`, `/etc/gitlab/trusted-certs`), or +[SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079). + +Depending on your case, you might want to run the restore command with one or +more of the following options: + +- `BACKUP=timestamp_of_backup`: Required if more than one backup exists. + Read what the [backup timestamp is about](backup_gitlab.md#backup-timestamp). +- `force=yes`: Doesn't ask if the `authorized_keys` file should get regenerated, + and assumes 'yes' for warning about database tables being removed, + enabling the `Write to authorized_keys file` setting, and updating LDAP + providers. + +If you're restoring into directories that are mount points, you must ensure these directories are +empty before attempting a restore. Otherwise, GitLab attempts to move these directories before +restoring the new data, which causes an error. + +Read more about [configuring NFS mounts](../nfs.md) + +Restoring a backup from an instance using local storage restores to local storage even if the target instance uses object storage. +Migrations to object storage must be done before or after restoration. + +## Restore for Omnibus GitLab installations + +This procedure assumes that: + +- You have installed the **exact same version and type (CE/EE)** of GitLab + Omnibus with which the backup was created. 
+- You have run `sudo gitlab-ctl reconfigure` at least once. +- GitLab is running. If not, start it using `sudo gitlab-ctl start`. + +First ensure your backup tar file is in the backup directory described in the +`gitlab.rb` configuration `gitlab_rails['backup_path']`. The default is +`/var/opt/gitlab/backups`. The backup file needs to be owned by the `git` user. + +```shell +sudo cp 11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar /var/opt/gitlab/backups/ +sudo chown git:git /var/opt/gitlab/backups/11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar +``` + +Stop the processes that are connected to the database. Leave the rest of GitLab +running: + +```shell +sudo gitlab-ctl stop puma +sudo gitlab-ctl stop sidekiq +# Verify +sudo gitlab-ctl status +``` + +Next, ensure you have completed the [restore prerequisites](#restore-prerequisites) steps and have run `gitlab-ctl reconfigure` +after copying over the GitLab secrets file from the original installation. + +Next, restore the backup, specifying the timestamp of the backup you wish to +restore: + +```shell +# This command will overwrite the contents of your GitLab database! +# NOTE: "_gitlab_backup.tar" is omitted from the name +sudo gitlab-backup restore BACKUP=11493107454_2018_04_25_10.6.4-ce +``` + +Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:restore` instead. +Some [known non-blocking error messages may appear](#restoring-database-backup-using-omnibus-packages-outputs-warnings). + +WARNING: +`gitlab-rake gitlab:backup:restore` doesn't set the correct file system +permissions on your Registry directory. This is a [known issue](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/62759). +In GitLab 12.2 or later, you can use `gitlab-backup restore` to avoid this +issue. + +If there's a GitLab version mismatch between your backup tar file and the +installed version of GitLab, the restore command aborts with an error +message. Install the [correct GitLab version](https://packages.gitlab.com/gitlab/), +and then try again. + +WARNING: +The restore command requires [additional parameters](backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer) when +your installation is using PgBouncer, for either performance reasons or when using it with a Patroni cluster. + +Next, restart and [check](../raketasks/maintenance.md#check-gitlab-configuration) GitLab: + +```shell +sudo gitlab-ctl restart +sudo gitlab-rake gitlab:check SANITIZE=true +``` + +In GitLab 13.1 and later, check [database values can be decrypted](../raketasks/check.md#verify-database-values-can-be-decrypted-using-the-current-secrets) +especially if `/etc/gitlab/gitlab-secrets.json` was restored, or if a different server is +the target for the restore. + +```shell +sudo gitlab-rake gitlab:doctor:secrets +``` + +For added assurance, you can perform [an integrity check on the uploaded files](../raketasks/check.md#uploaded-files-integrity): + +```shell +sudo gitlab-rake gitlab:artifacts:check +sudo gitlab-rake gitlab:lfs:check +sudo gitlab-rake gitlab:uploads:check +``` + +## Restore for Docker image and GitLab Helm chart installations + +For GitLab installations using the Docker image or the GitLab Helm chart on a +Kubernetes cluster, the restore task expects the restore directories to be +empty. However, with Docker and Kubernetes volume mounts, some system level +directories may be created at the volume roots, such as the `lost+found` +directory found in Linux operating systems. 
These directories are usually owned +by `root`, which can cause access permission errors since the restore Rake task +runs as the `git` user. To restore a GitLab installation, users have to confirm +the restore target directories are empty. + +For both these installation types, the backup tarball has to be available in +the backup location (default location is `/var/opt/gitlab/backups`). + +### Restore for Helm chart installations + +The GitLab Helm chart uses the process documented in +[restoring a GitLab Helm chart installation](https://docs.gitlab.com/charts/backup-restore/restore.html#restoring-a-gitlab-installation) + +### Restore for Docker image installations + +If you're using [Docker Swarm](../../install/docker.md#install-gitlab-using-docker-swarm-mode), +the container might restart during the restore process because Puma is shut down, +and so the container health check fails. To work around this problem, +temporarily disable the health check mechanism. + +1. Edit `docker-compose.yml`: + + ```yaml + healthcheck: + disable: true + ``` + +1. Deploy the stack: + + ```shell + docker stack deploy --compose-file docker-compose.yml mystack + ``` + +For more information, see [issue 6846](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/6846 "GitLab restore can fail owing to `gitlab-healthcheck`"). + +The restore task can be run from the host: + +```shell +# Stop the processes that are connected to the database +docker exec -it gitlab-ctl stop puma +docker exec -it gitlab-ctl stop sidekiq + +# Verify that the processes are all down before continuing +docker exec -it gitlab-ctl status + +# Run the restore. NOTE: "_gitlab_backup.tar" is omitted from the name +docker exec -it gitlab-backup restore BACKUP=11493107454_2018_04_25_10.6.4-ce + +# Restart the GitLab container +docker restart + +# Check GitLab +docker exec -it gitlab-rake gitlab:check SANITIZE=true +``` + +## Restore for installation from source + +First, ensure your backup tar file is in the backup directory described in the +`gitlab.yml` configuration: + +```yaml +## Backup settings +backup: + path: "tmp/backups" # Relative paths are relative to Rails.root (default: tmp/backups/) +``` + +The default is `/home/git/gitlab/tmp/backups`, and it needs to be owned by the `git` user. Now, you can begin the backup procedure: + +```shell +# Stop processes that are connected to the database +sudo service gitlab stop + +sudo -u git -H bundle exec rake gitlab:backup:restore RAILS_ENV=production +``` + +Example output: + +```plaintext +Unpacking backup... [DONE] +Restoring database tables: +-- create_table("events", {:force=>true}) + -> 0.2231s +[...] +- Loading fixture events...[DONE] +- Loading fixture issues...[DONE] +- Loading fixture keys...[SKIPPING] +- Loading fixture merge_requests...[DONE] +- Loading fixture milestones...[DONE] +- Loading fixture namespaces...[DONE] +- Loading fixture notes...[DONE] +- Loading fixture projects...[DONE] +- Loading fixture protected_branches...[SKIPPING] +- Loading fixture schema_migrations...[DONE] +- Loading fixture services...[SKIPPING] +- Loading fixture snippets...[SKIPPING] +- Loading fixture taggings...[SKIPPING] +- Loading fixture tags...[SKIPPING] +- Loading fixture users...[DONE] +- Loading fixture users_projects...[DONE] +- Loading fixture web_hooks...[SKIPPING] +- Loading fixture wikis...[SKIPPING] +Restoring repositories: +- Restoring repository abcd... [DONE] +- Object pool 1 ... 
+Deleting tmp directories...[DONE] +``` + +Next, restore `/home/git/gitlab/.secret` if necessary, [as previously mentioned](#restore-prerequisites). + +Restart GitLab: + +```shell +sudo service gitlab restart +``` + +## Restoring only one or a few projects or groups from a backup + +Although the Rake task used to restore a GitLab instance doesn't support +restoring a single project or group, you can use a workaround by restoring +your backup to a separate, temporary GitLab instance, and then export your +project or group from there: + +1. [Install a new GitLab](../../install/index.md) instance at the same version as + the backed-up instance from which you want to restore. +1. [Restore the backup](#restore-gitlab) into this new instance, then + export your [project](../../user/project/settings/import_export.md) + or [group](../../user/group/import/index.md#migrate-groups-by-uploading-an-export-file-deprecated). For more information about what is and isn't exported, see the export feature's documentation. +1. After the export is complete, go to the old instance and then import it. +1. After importing the projects or groups that you wanted is complete, you may + delete the new, temporary GitLab instance. + +A feature request to provide direct restore of individual projects or groups +is being discussed in [issue #17517](https://gitlab.com/gitlab-org/gitlab/-/issues/17517). + +## Restore options + +The command line tool GitLab provides to restore from backup can accept more +options. + +### Disabling prompts during restore + +During a restore from backup, the restore script may ask for confirmation before +proceeding. If you wish to disable these prompts, you can set the `GITLAB_ASSUME_YES` +environment variable to `1`. + +For Omnibus GitLab packages: + +```shell +sudo GITLAB_ASSUME_YES=1 gitlab-backup restore +``` + +For installations from source: + +```shell +sudo -u git -H GITLAB_ASSUME_YES=1 bundle exec rake gitlab:backup:restore RAILS_ENV=production +``` + +### Excluding tasks on restore + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/19347) in GitLab 14.10. + +You can exclude specific tasks on restore by adding the environment variable `SKIP`, whose values are a comma-separated list of the following options: + +- `db` (database) +- `uploads` (attachments) +- `builds` (CI job output logs) +- `artifacts` (CI job artifacts) +- `lfs` (LFS objects) +- `terraform_state` (Terraform states) +- `registry` (Container Registry images) +- `pages` (Pages content) +- `repositories` (Git repositories data) +- `packages` (Packages) + +For Omnibus GitLab packages: + +```shell +sudo gitlab-backup restore BACKUP=timestamp_of_backup SKIP=db,uploads +``` + +For installations from source: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:restore BACKUP=timestamp_of_backup SKIP=db,uploads RAILS_ENV=production +``` + +### Restore specific repository storages + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/86896) in GitLab 15.0. + +When using [multiple repository storages](../repository_storage_paths.md), +repositories from specific repository storages can be restored separately +using the `REPOSITORIES_STORAGES` option. The option accepts a comma-separated list of +storage names. 
+ +For example, for Omnibus GitLab installations: + +```shell +sudo gitlab-backup restore BACKUP=timestamp_of_backup REPOSITORIES_STORAGES=storage1,storage2 +``` + +For example, for installations from source: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:restore BACKUP=timestamp_of_backup REPOSITORIES_STORAGES=storage1,storage2 +``` + +### Restore specific repositories + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/88094) in GitLab 15.1. + +You can restore specific repositories using the `REPOSITORIES_PATHS` and the `SKIP_REPOSITORIES_PATHS` options. +Both options accept a comma-separated list of project and group paths. If you +specify a group path, all repositories in all projects in the group and +descendent groups are included or skipped, depending on which option you used. The project and group repositories must exist within the specified backup. + +For example, to restore all repositories for all projects in **Group A** (`group-a`), the repository for **Project C** in **Group B** (`group-b/project-c`), +and skip the **Project D** in **Group A** (`group-a/project-d`): + +- Omnibus GitLab installations: + + ```shell + sudo gitlab-backup restore BACKUP=timestamp_of_backup REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d + ``` + +- Installations from source: + + ```shell + sudo -u git -H bundle exec rake gitlab:backup:restore BACKUP=timestamp_of_backup REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d + ``` + +### Restore untarred backups + +If an [untarred backup](backup_gitlab.md#skipping-tar-creation) (made with `SKIP=tar`) is found, +and no backup is chosen with `BACKUP=`, the untarred backup is used. + +For example, for Omnibus GitLab installations: + +```shell +sudo gitlab-backup restore +``` + +For example, for installations from source: + +```shell +sudo -u git -H bundle exec rake gitlab:backup:restore +``` + +## Troubleshooting + +The following are possible problems you might encounter, along with potential +solutions. + +### Restoring database backup using Omnibus packages outputs warnings + +If you're using backup restore procedures, you may encounter the following +warning messages: + +```plaintext +ERROR: must be owner of extension pg_trgm +ERROR: must be owner of extension btree_gist +ERROR: must be owner of extension plpgsql +WARNING: no privileges could be revoked for "public" (two occurrences) +WARNING: no privileges were granted for "public" (two occurrences) +``` + +Be advised that the backup is successfully restored in spite of these warning +messages. + +The Rake task runs this as the `gitlab` user, which doesn't have superuser +access to the database. When restore is initiated, it also runs as the `gitlab` +user, but it also tries to alter the objects it doesn't have access to. +Those objects have no influence on the database backup or restore, but display +a warning message. + +For more information, see: + +- PostgreSQL issue tracker: + - [Not being a superuser](https://www.postgresql.org/message-id/201110220712.30886.adrian.klaver@gmail.com). + - [Having different owners](https://www.postgresql.org/message-id/2039.1177339749@sss.pgh.pa.us). + +- Stack Overflow: [Resulting errors](https://stackoverflow.com/questions/4368789/error-must-be-owner-of-language-plpgsql). 
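+
+Despite these warnings, you can reassure yourself that the restore completed by re-running
+the checks described earlier on this page:
+
+```shell
+sudo gitlab-rake gitlab:check SANITIZE=true
+sudo gitlab-rake gitlab:doctor:secrets
+```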
diff --git a/doc/install/aws/manual_install_aws.md b/doc/install/aws/manual_install_aws.md index 765269a8d44313..01d79e67d53c25 100644 --- a/doc/install/aws/manual_install_aws.md +++ b/doc/install/aws/manual_install_aws.md @@ -783,7 +783,7 @@ For GitLab 12.1 and earlier, use `gitlab-rake gitlab:backup:create`. To restore GitLab, first review the [restore documentation](../../administration/backup_restore/index.md#restore-gitlab), and primarily the restore prerequisites. Then, follow the steps under the -[Linux package installations section](../../raketasks/restore_gitlab.md#restore-for-omnibus-gitlab-installations). +[Linux package installations section](../../administration/backup_restore/restore_gitlab.md#restore-for-omnibus-gitlab-installations). ## Updating GitLab diff --git a/doc/install/docker.md b/doc/install/docker.md index f1af65afe4ebb4..27ec60627aba62 100644 --- a/doc/install/docker.md +++ b/doc/install/docker.md @@ -596,7 +596,7 @@ To downgrade GitLab after an upgrade: - Restoring is required to back out database data and schema changes (migrations) made as part of the upgrade. - GitLab backups must be restored to the exact same version and edition. - - [Follow the restore steps for Docker images](../raketasks/restore_gitlab.md#restore-for-docker-image-and-gitlab-helm-chart-installations), including + - [Follow the restore steps for Docker images](../administration/backup_restore/restore_gitlab.md#restore-for-docker-image-and-gitlab-helm-chart-installations), including stopping Puma and Sidekiq. Only the database must be restored, so add `SKIP=artifacts,repositories,registry,uploads,builds,pages,lfs,packages,terraform_state` to the `gitlab-backup restore` command line arguments. diff --git a/doc/raketasks/restore_gitlab.md b/doc/raketasks/restore_gitlab.md index ac2f9f11036d26..434e256de35130 100644 --- a/doc/raketasks/restore_gitlab.md +++ b/doc/raketasks/restore_gitlab.md @@ -2,439 +2,13 @@ stage: Systems group: Geo info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments +redirect_to: '../administration/backup_restore/restore_gitlab.md' +remove_date: '2023-09-26' --- -# Restore GitLab **(FREE SELF)** +This document was moved to [another location](../administration/backup_restore/restore_gitlab.md). -GitLab provides a command line interface to restore your entire installation, -and is flexible enough to fit your needs. - -The [restore prerequisites section](#restore-prerequisites) includes crucial -information. Be sure to read and test the complete restore process at least -once before attempting to perform it in a production environment. - -You can restore a backup only to _the exact same version and type (CE/EE)_ of -GitLab that you created it on (for example CE 15.1.4). - -If your backup is a different version than the current installation, you must -[downgrade](../update/package/downgrade.md) or [upgrade](../update/package/index.md#upgrade-to-a-specific-version-using-the-official-repositories) your GitLab installation -before restoring the backup. - -Each backup archive contains a full self-contained backup, including those created through the [incremental repository backup procedure](backup_gitlab.md#incremental-repository-backups). To restore an incremental repository backup, use the same instructions as restoring any other regular backup archive. 
- -## Restore prerequisites - -You need to have a working GitLab installation before you can perform a -restore. This is because the system user performing the restore actions (`git`) -is usually not allowed to create or delete the SQL database needed to import -data into (`gitlabhq_production`). All existing data is either erased -(SQL) or moved to a separate directory (such as repositories and uploads). -Restoring SQL data skips views owned by PostgreSQL extensions. - -To restore a backup, **you must also restore the GitLab secrets**. -These include the database encryption key, [CI/CD variables](../ci/variables/index.md), and -variables used for [two-factor authentication](../user/profile/account/two_factor_authentication.md). -Without the keys, [multiple issues occur](backup_gitlab.md#when-the-secrets-file-is-lost), -including loss of access by users with [two-factor authentication enabled](../user/profile/account/two_factor_authentication.md), -and GitLab Runners cannot log in. - -Restore: - -- `/etc/gitlab/gitlab-secrets.json` (Linux package) -- `/home/git/gitlab/.secret` (self-compiled installations) -- Rails secret (cloud-native GitLab) - - [This can be converted to the Linux package format](https://docs.gitlab.com/charts/installation/migration/helm_to_package.html), if required. - -You may also want to restore your previous `/etc/gitlab/gitlab.rb` (for Omnibus packages) -or `/home/git/gitlab/config/gitlab.yml` (for installations from source) and -any TLS keys, certificates (`/etc/gitlab/ssl`, `/etc/gitlab/trusted-certs`), or -[SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079). - -Depending on your case, you might want to run the restore command with one or -more of the following options: - -- `BACKUP=timestamp_of_backup`: Required if more than one backup exists. - Read what the [backup timestamp is about](backup_gitlab.md#backup-timestamp). -- `force=yes`: Doesn't ask if the `authorized_keys` file should get regenerated, - and assumes 'yes' for warning about database tables being removed, - enabling the `Write to authorized_keys file` setting, and updating LDAP - providers. - -If you're restoring into directories that are mount points, you must ensure these directories are -empty before attempting a restore. Otherwise, GitLab attempts to move these directories before -restoring the new data, which causes an error. - -Read more about [configuring NFS mounts](../administration/nfs.md) - -Restoring a backup from an instance using local storage restores to local storage even if the target instance uses object storage. -Migrations to object storage must be done before or after restoration. - -## Restore for Omnibus GitLab installations - -This procedure assumes that: - -- You have installed the **exact same version and type (CE/EE)** of GitLab - Omnibus with which the backup was created. -- You have run `sudo gitlab-ctl reconfigure` at least once. -- GitLab is running. If not, start it using `sudo gitlab-ctl start`. - -First ensure your backup tar file is in the backup directory described in the -`gitlab.rb` configuration `gitlab_rails['backup_path']`. The default is -`/var/opt/gitlab/backups`. The backup file needs to be owned by the `git` user. - -```shell -sudo cp 11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar /var/opt/gitlab/backups/ -sudo chown git:git /var/opt/gitlab/backups/11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar -``` - -Stop the processes that are connected to the database. 
Leave the rest of GitLab -running: - -```shell -sudo gitlab-ctl stop puma -sudo gitlab-ctl stop sidekiq -# Verify -sudo gitlab-ctl status -``` - -Next, ensure you have completed the [restore prerequisites](#restore-prerequisites) steps and have run `gitlab-ctl reconfigure` -after copying over the GitLab secrets file from the original installation. - -Next, restore the backup, specifying the timestamp of the backup you wish to -restore: - -```shell -# This command will overwrite the contents of your GitLab database! -# NOTE: "_gitlab_backup.tar" is omitted from the name -sudo gitlab-backup restore BACKUP=11493107454_2018_04_25_10.6.4-ce -``` - -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:restore` instead. -Some [known non-blocking error messages may appear](#restoring-database-backup-using-omnibus-packages-outputs-warnings). - -WARNING: -`gitlab-rake gitlab:backup:restore` doesn't set the correct file system -permissions on your Registry directory. This is a [known issue](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/62759). -In GitLab 12.2 or later, you can use `gitlab-backup restore` to avoid this -issue. - -If there's a GitLab version mismatch between your backup tar file and the -installed version of GitLab, the restore command aborts with an error -message. Install the [correct GitLab version](https://packages.gitlab.com/gitlab/), -and then try again. - -WARNING: -The restore command requires [additional parameters](backup_gitlab.md#back-up-and-restore-for-installations-using-pgbouncer) when -your installation is using PgBouncer, for either performance reasons or when using it with a Patroni cluster. - -Next, restart and [check](../administration/raketasks/maintenance.md#check-gitlab-configuration) GitLab: - -```shell -sudo gitlab-ctl restart -sudo gitlab-rake gitlab:check SANITIZE=true -``` - -In GitLab 13.1 and later, check [database values can be decrypted](../administration/raketasks/check.md#verify-database-values-can-be-decrypted-using-the-current-secrets) -especially if `/etc/gitlab/gitlab-secrets.json` was restored, or if a different server is -the target for the restore. - -```shell -sudo gitlab-rake gitlab:doctor:secrets -``` - -For added assurance, you can perform [an integrity check on the uploaded files](../administration/raketasks/check.md#uploaded-files-integrity): - -```shell -sudo gitlab-rake gitlab:artifacts:check -sudo gitlab-rake gitlab:lfs:check -sudo gitlab-rake gitlab:uploads:check -``` - -## Restore for Docker image and GitLab Helm chart installations - -For GitLab installations using the Docker image or the GitLab Helm chart on a -Kubernetes cluster, the restore task expects the restore directories to be -empty. However, with Docker and Kubernetes volume mounts, some system level -directories may be created at the volume roots, such as the `lost+found` -directory found in Linux operating systems. These directories are usually owned -by `root`, which can cause access permission errors since the restore Rake task -runs as the `git` user. To restore a GitLab installation, users have to confirm -the restore target directories are empty. - -For both these installation types, the backup tarball has to be available in -the backup location (default location is `/var/opt/gitlab/backups`). 
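-
-For example, with a Docker installation you might copy the archive into the mounted backup
-directory and fix its ownership. The container name `gitlab` and the archive name are
-only examples; substitute your own:
-
-```shell
-docker cp 11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar gitlab:/var/opt/gitlab/backups/
-docker exec -it gitlab chown git:git /var/opt/gitlab/backups/11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar
-```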
- -### Restore for Helm chart installations - -The GitLab Helm chart uses the process documented in -[restoring a GitLab Helm chart installation](https://docs.gitlab.com/charts/backup-restore/restore.html#restoring-a-gitlab-installation) - -### Restore for Docker image installations - -If you're using [Docker Swarm](../install/docker.md#install-gitlab-using-docker-swarm-mode), -the container might restart during the restore process because Puma is shut down, -and so the container health check fails. To work around this problem, -temporarily disable the health check mechanism. - -1. Edit `docker-compose.yml`: - - ```yaml - healthcheck: - disable: true - ``` - -1. Deploy the stack: - - ```shell - docker stack deploy --compose-file docker-compose.yml mystack - ``` - -For more information, see [issue 6846](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/6846 "GitLab restore can fail owing to `gitlab-healthcheck`"). - -The restore task can be run from the host: - -```shell -# Stop the processes that are connected to the database -docker exec -it gitlab-ctl stop puma -docker exec -it gitlab-ctl stop sidekiq - -# Verify that the processes are all down before continuing -docker exec -it gitlab-ctl status - -# Run the restore. NOTE: "_gitlab_backup.tar" is omitted from the name -docker exec -it gitlab-backup restore BACKUP=11493107454_2018_04_25_10.6.4-ce - -# Restart the GitLab container -docker restart - -# Check GitLab -docker exec -it gitlab-rake gitlab:check SANITIZE=true -``` - -## Restore for installation from source - -First, ensure your backup tar file is in the backup directory described in the -`gitlab.yml` configuration: - -```yaml -## Backup settings -backup: - path: "tmp/backups" # Relative paths are relative to Rails.root (default: tmp/backups/) -``` - -The default is `/home/git/gitlab/tmp/backups`, and it needs to be owned by the `git` user. Now, you can begin the backup procedure: - -```shell -# Stop processes that are connected to the database -sudo service gitlab stop - -sudo -u git -H bundle exec rake gitlab:backup:restore RAILS_ENV=production -``` - -Example output: - -```plaintext -Unpacking backup... [DONE] -Restoring database tables: --- create_table("events", {:force=>true}) - -> 0.2231s -[...] -- Loading fixture events...[DONE] -- Loading fixture issues...[DONE] -- Loading fixture keys...[SKIPPING] -- Loading fixture merge_requests...[DONE] -- Loading fixture milestones...[DONE] -- Loading fixture namespaces...[DONE] -- Loading fixture notes...[DONE] -- Loading fixture projects...[DONE] -- Loading fixture protected_branches...[SKIPPING] -- Loading fixture schema_migrations...[DONE] -- Loading fixture services...[SKIPPING] -- Loading fixture snippets...[SKIPPING] -- Loading fixture taggings...[SKIPPING] -- Loading fixture tags...[SKIPPING] -- Loading fixture users...[DONE] -- Loading fixture users_projects...[DONE] -- Loading fixture web_hooks...[SKIPPING] -- Loading fixture wikis...[SKIPPING] -Restoring repositories: -- Restoring repository abcd... [DONE] -- Object pool 1 ... -Deleting tmp directories...[DONE] -``` - -Next, restore `/home/git/gitlab/.secret` if necessary, [as previously mentioned](#restore-prerequisites). 
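-
-For example, assuming you kept a copy of the secrets file at a hypothetical
-`/home/git/gitlab.secret.backup`, you might restore it like this:
-
-```shell
-sudo cp /home/git/gitlab.secret.backup /home/git/gitlab/.secret
-sudo chown git:git /home/git/gitlab/.secret
-```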
- -Restart GitLab: - -```shell -sudo service gitlab restart -``` - -## Restoring only one or a few projects or groups from a backup - -Although the Rake task used to restore a GitLab instance doesn't support -restoring a single project or group, you can use a workaround by restoring -your backup to a separate, temporary GitLab instance, and then export your -project or group from there: - -1. [Install a new GitLab](../install/index.md) instance at the same version as - the backed-up instance from which you want to restore. -1. [Restore the backup](#restore-gitlab) into this new instance, then - export your [project](../user/project/settings/import_export.md) - or [group](../user/group/import/index.md#migrate-groups-by-uploading-an-export-file-deprecated). For more information about what is and isn't exported, see the export feature's documentation. -1. After the export is complete, go to the old instance and then import it. -1. After importing the projects or groups that you wanted is complete, you may - delete the new, temporary GitLab instance. - -A feature request to provide direct restore of individual projects or groups -is being discussed in [issue #17517](https://gitlab.com/gitlab-org/gitlab/-/issues/17517). - -## Restore options - -The command line tool GitLab provides to restore from backup can accept more -options. - -### Disabling prompts during restore - -During a restore from backup, the restore script may ask for confirmation before -proceeding. If you wish to disable these prompts, you can set the `GITLAB_ASSUME_YES` -environment variable to `1`. - -For Omnibus GitLab packages: - -```shell -sudo GITLAB_ASSUME_YES=1 gitlab-backup restore -``` - -For installations from source: - -```shell -sudo -u git -H GITLAB_ASSUME_YES=1 bundle exec rake gitlab:backup:restore RAILS_ENV=production -``` - -### Excluding tasks on restore - -> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/19347) in GitLab 14.10. - -You can exclude specific tasks on restore by adding the environment variable `SKIP`, whose values are a comma-separated list of the following options: - -- `db` (database) -- `uploads` (attachments) -- `builds` (CI job output logs) -- `artifacts` (CI job artifacts) -- `lfs` (LFS objects) -- `terraform_state` (Terraform states) -- `registry` (Container Registry images) -- `pages` (Pages content) -- `repositories` (Git repositories data) -- `packages` (Packages) - -For Omnibus GitLab packages: - -```shell -sudo gitlab-backup restore BACKUP=timestamp_of_backup SKIP=db,uploads -``` - -For installations from source: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:restore BACKUP=timestamp_of_backup SKIP=db,uploads RAILS_ENV=production -``` - -### Restore specific repository storages - -> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/86896) in GitLab 15.0. - -When using [multiple repository storages](../administration/repository_storage_paths.md), -repositories from specific repository storages can be restored separately -using the `REPOSITORIES_STORAGES` option. The option accepts a comma-separated list of -storage names. 
- -For example, for Omnibus GitLab installations: - -```shell -sudo gitlab-backup restore BACKUP=timestamp_of_backup REPOSITORIES_STORAGES=storage1,storage2 -``` - -For example, for installations from source: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:restore BACKUP=timestamp_of_backup REPOSITORIES_STORAGES=storage1,storage2 -``` - -### Restore specific repositories - -> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/88094) in GitLab 15.1. - -You can restore specific repositories using the `REPOSITORIES_PATHS` and the `SKIP_REPOSITORIES_PATHS` options. -Both options accept a comma-separated list of project and group paths. If you -specify a group path, all repositories in all projects in the group and -descendent groups are included or skipped, depending on which option you used. The project and group repositories must exist within the specified backup. - -For example, to restore all repositories for all projects in **Group A** (`group-a`), the repository for **Project C** in **Group B** (`group-b/project-c`), -and skip the **Project D** in **Group A** (`group-a/project-d`): - -- Omnibus GitLab installations: - - ```shell - sudo gitlab-backup restore BACKUP=timestamp_of_backup REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d - ``` - -- Installations from source: - - ```shell - sudo -u git -H bundle exec rake gitlab:backup:restore BACKUP=timestamp_of_backup REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d - ``` - -### Restore untarred backups - -If an [untarred backup](backup_gitlab.md#skipping-tar-creation) (made with `SKIP=tar`) is found, -and no backup is chosen with `BACKUP=`, the untarred backup is used. - -For example, for Omnibus GitLab installations: - -```shell -sudo gitlab-backup restore -``` - -For example, for installations from source: - -```shell -sudo -u git -H bundle exec rake gitlab:backup:restore -``` - -## Troubleshooting - -The following are possible problems you might encounter, along with potential -solutions. - -### Restoring database backup using Omnibus packages outputs warnings - -If you're using backup restore procedures, you may encounter the following -warning messages: - -```plaintext -ERROR: must be owner of extension pg_trgm -ERROR: must be owner of extension btree_gist -ERROR: must be owner of extension plpgsql -WARNING: no privileges could be revoked for "public" (two occurrences) -WARNING: no privileges were granted for "public" (two occurrences) -``` - -Be advised that the backup is successfully restored in spite of these warning -messages. - -The Rake task runs this as the `gitlab` user, which doesn't have superuser -access to the database. When restore is initiated, it also runs as the `gitlab` -user, but it also tries to alter the objects it doesn't have access to. -Those objects have no influence on the database backup or restore, but display -a warning message. - -For more information, see: - -- PostgreSQL issue tracker: - - [Not being a superuser](https://www.postgresql.org/message-id/201110220712.30886.adrian.klaver@gmail.com). - - [Having different owners](https://www.postgresql.org/message-id/2039.1177339749@sss.pgh.pa.us). - -- Stack Overflow: [Resulting errors](https://stackoverflow.com/questions/4368789/error-must-be-owner-of-language-plpgsql). 
+ + + + diff --git a/doc/update/package/downgrade.md b/doc/update/package/downgrade.md index c5b65e29ec4b21..14b9bd981fdaba 100644 --- a/doc/update/package/downgrade.md +++ b/doc/update/package/downgrade.md @@ -79,5 +79,5 @@ Steps: sudo gitlab-ctl reconfigure ``` -1. [Restore GitLab](../../raketasks/restore_gitlab.md#restore-for-omnibus-gitlab-installations) +1. [Restore GitLab](../../administration/backup_restore/restore_gitlab.md#restore-for-omnibus-gitlab-installations) to complete the downgrade. -- GitLab From ffdef3fc1e5d1184ff8f2ef7f87a547091171900 Mon Sep 17 00:00:00 2001 From: Mike Kozono Date: Wed, 21 Jun 2023 17:57:59 -1000 Subject: [PATCH 06/13] Add Simple procedure and Scaling backups --- .../backup_restore/backup_gitlab.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md index 31034126f9ec56..d49851cb5c3409 100644 --- a/doc/administration/backup_restore/backup_gitlab.md +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -8,6 +8,19 @@ info: To determine the technical writer assigned to the Stage/Group associated w The exact procedure for backing up GitLab depends on many factors. Your particular deployment's usage and configuration determine what kind of data exists, where it is located, and how much there is. These factors influence your options for how to perform a back up, how to store it, and how to restore it. +## Simple back up procedure + +As a rough guideline, if you are using a [1k reference architecture](../reference_architectures/1k_users.md) with less than 100GB of data, then follow these steps: + +1. Run the [backup command](#backup-command). +1. Manually back up [configuration files](#storing-configuration-files). + +## Scaling backups + +As the volume of GitLab data grows, the [backup command](#backup-command) takes longer to execute. At some point, the execution time becomes impractical. For example, it can take 24 hours or more. + +See [Alternative backup strategies](#alternative-backup-strategies) for more information. + ## What data needs to be backed up? - [PostgreSQL databases](#postgresql-databases) @@ -1155,6 +1168,10 @@ If you have a specific reason to change the path, it can be configured in Omnibu ## Alternative backup strategies +Since every deployment may have different capabilities, you should first review [What data needs to be backed up?](#what-data-needs-to-be-backed-up) to better understand if, and how, you can leverage them. + +For example, if you use Amazon RDS, you might choose to use its built-in backup and restore features to handle your GitLab [PostgreSQL data](#postgresql-databases), and simply [exclude PostgreSQL data](#excluding-specific-directories-from-the-backup) when using the [backup command](#backup-command). + In the following cases, consider using file system data transfer or snapshots as part of your backup strategy: - Your GitLab instance contains a lot of Git repository data and the GitLab backup script is too slow. 
-- GitLab From 4a98ddf168e7b85aeee95ba017cfbd188b9c7495 Mon Sep 17 00:00:00 2001 From: Sampath Ranasinghe Date: Thu, 22 Jun 2023 18:29:52 +0000 Subject: [PATCH 07/13] Fix some typos and formatting --- .../backup_restore/backup_gitlab.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md index d49851cb5c3409..37773e932e0e00 100644 --- a/doc/administration/backup_restore/backup_gitlab.md +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -77,7 +77,7 @@ GitLab stores blobs (or files) such as issue attachments or LFS objects into eit - The file system in a specific location. - An [Object Storage](../object_storage.md) solution. Object Storage solutions can be: - - Cloud based like Amazon S3 Google Cloud Storage. + - Cloud based like Amazon S3 and Google Cloud Storage. - Hosted by you (like MinIO). - A Storage Appliance that exposes an Object Storage-compatible API. @@ -94,7 +94,7 @@ The secrets file is essential to preserve your database encryption key. At the very **minimum**, you must back up: -For Omnibus: +For Linux package installations: - `/etc/gitlab/gitlab-secrets.json` - `/etc/gitlab/gitlab.rb` @@ -104,14 +104,15 @@ For installation from source: - `/home/git/gitlab/config/secrets.yml` - `/home/git/gitlab/config/gitlab.yml` -For [Docker installations](../../install/docker.md), you must -back up the volume where the configuration files are stored. If you created +For [Docker installations](../../install/docker.md): + +- Back up the volume where the configuration files are stored. If you created the GitLab container according to the documentation, it should be in the `/srv/gitlab/config` directory. -For [GitLab Helm chart installations](https://gitlab.com/gitlab-org/charts/gitlab) -on a Kubernetes cluster, you must follow the -[Back up the secrets](https://docs.gitlab.com/charts/backup-restore/backup.html#back-up-the-secrets) +For [GitLab Helm chart installations](https://gitlab.com/gitlab-org/charts/gitlab) on a Kubernetes cluster: + +- Follow the [Back up the secrets](https://docs.gitlab.com/charts/backup-restore/backup.html#back-up-the-secrets) instructions. You may also want to back up any TLS keys and certificates (`/etc/gitlab/ssl`, `/etc/gitlab/trusted-certs`), and your -- GitLab From 06c915a1561cc5108b180736e1159107c2bac014 Mon Sep 17 00:00:00 2001 From: Mike Kozono Date: Thu, 22 Jun 2023 11:40:18 -1000 Subject: [PATCH 08/13] Add Object storage step and section --- doc/administration/backup_restore/backup_gitlab.md | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md index 37773e932e0e00..628d1af8ff4dab 100644 --- a/doc/administration/backup_restore/backup_gitlab.md +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -13,6 +13,7 @@ The exact procedure for backing up GitLab depends on many factors. Your particul As a rough guideline, if you are using a [1k reference architecture](../reference_architectures/1k_users.md) with less than 100GB of data, then follow these steps: 1. Run the [backup command](#backup-command). +1. Back up [object storage](#object-storage), if applicable. 1. Manually back up [configuration files](#storing-configuration-files). ## Scaling backups @@ -81,6 +82,13 @@ GitLab stores blobs (or files) such as issue attachments or LFS objects into eit - Hosted by you (like MinIO). 
- A Storage Appliance that exposes an Object Storage-compatible API. +#### Object storage + +The [backup command](#backup-command) doesn't back up blobs that aren't stored on the file system. If you're using [object storage](../object_storage.md), be sure to enable backups with your object storage provider. For example, see: + +- [Amazon S3 backups](https://docs.aws.amazon.com/aws-backup/latest/devguide/s3-backups.html) +- [Google Cloud Storage Transfer Service](https://cloud.google.com/storage-transfer-service) and [Google Cloud Storage Object Versioning](https://cloud.google.com/storage/docs/object-versioning) + ### Storing configuration files The [backup Rake task](#back-up-gitlab) GitLab provides does _not_ store your @@ -155,6 +163,7 @@ Backups do not include: - [Mattermost data](https://docs.mattermost.com/administration/config-settings.html#file-storage) - Redis (and thus Sidekiq jobs) +- [Object storage](#object-storage) WARNING: GitLab does not back up any configuration files (`/etc/gitlab`), TLS keys and certificates, or system @@ -179,9 +188,7 @@ system. If you installed GitLab: ### Backup command WARNING: -GitLab doesn't back up items that aren't stored on the file system. If you're -using [object storage](../object_storage.md), be sure to enable -backups with your object storage provider, if desired. +The backup command does not back up items in [object storage](#object-storage). WARNING: The backup command requires [additional parameters](#back-up-and-restore-for-installations-using-pgbouncer) when -- GitLab From 3449f327d773b6890aec9009ad1b76b3748db2f0 Mon Sep 17 00:00:00 2001 From: Michael Kozono Date: Mon, 26 Jun 2023 21:56:37 +0000 Subject: [PATCH 09/13] Improve back up and restore docs - Use tabs for configuration instructions - Use consistent names for installation methods - Remove some old version instructions - Mention backup directory - Use note box for important info --- .../backup_restore/backup_gitlab.md | 250 ++++++++++-------- .../backup_restore/restore_gitlab.md | 14 +- 2 files changed, 150 insertions(+), 114 deletions(-) diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md index 628d1af8ff4dab..052627956ad224 100644 --- a/doc/administration/backup_restore/backup_gitlab.md +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -91,45 +91,45 @@ The [backup command](#backup-command) doesn't back up blobs that aren't stored o ### Storing configuration files -The [backup Rake task](#back-up-gitlab) GitLab provides does _not_ store your -configuration files. The primary reason for this is that your database contains -items including encrypted information for two-factor authentication and the -CI/CD _secure variables_. Storing encrypted information in the same location -as its key defeats the purpose of using encryption in the first place. +WARNING: +The [backup Rake task](#back-up-gitlab) GitLab provides does _not_ store your configuration files. The primary reason for this is that your database contains items including encrypted information for two-factor authentication and the CI/CD _secure variables_. Storing encrypted information in the same location as its key defeats the purpose of using encryption in the first place. For example, the secrets file contains your database encryption key. If you lose it, then the GitLab application will not be able to decrypt any encrypted values in the database. WARNING: -The secrets file is essential to preserve your database encryption key. 
+The secrets file may change after upgrades. + +It is recommended to backup the configuration directory. At the very **minimum**, you must back up: -At the very **minimum**, you must back up: +::Tabs -For Linux package installations: +:::TabTitle Linux package - `/etc/gitlab/gitlab-secrets.json` - `/etc/gitlab/gitlab.rb` -For installation from source: +For more information, see [Backup and restore Linux package (Omnibus) configuration](https://docs.gitlab.com/omnibus/settings/backups.html#backup-and-restore-omnibus-gitlab-configuration). + +:::TabTitle Self-compiled - `/home/git/gitlab/config/secrets.yml` - `/home/git/gitlab/config/gitlab.yml` -For [Docker installations](../../install/docker.md): +:::TabTitle Docker - Back up the volume where the configuration files are stored. If you created the GitLab container according to the documentation, it should be in the `/srv/gitlab/config` directory. -For [GitLab Helm chart installations](https://gitlab.com/gitlab-org/charts/gitlab) on a Kubernetes cluster: +:::TabTitle GitLab Helm chart - Follow the [Back up the secrets](https://docs.gitlab.com/charts/backup-restore/backup.html#back-up-the-secrets) instructions. +::EndTabs + You may also want to back up any TLS keys and certificates (`/etc/gitlab/ssl`, `/etc/gitlab/trusted-certs`), and your [SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one-server-to-another-server/532079#532079) to avoid man-in-the-middle attack warnings if you have to perform a full machine restore. -If you use Omnibus GitLab, review additional information to -[backup your configuration](https://docs.gitlab.com/omnibus/settings/backups.html). - In the unlikely event that the secrets file is lost, see the [troubleshooting section](#when-the-secrets-file-is-lost). @@ -174,8 +174,8 @@ files. You are highly advised to read about [storing configuration files](#stori To be able to back up and restore, ensure that Rsync is installed on your system. If you installed GitLab: -- _Using the Omnibus package_, Rsync is already installed. -- _From source_, check if `rsync` is installed. If Rsync is not installed, install it. For example: +- _Using the Linux package_, Rsync is already installed. +- _Using self-compiled_, check if `rsync` is installed. If Rsync is not installed, install it. For example: ```shell # Debian/Ubuntu @@ -199,15 +199,13 @@ Before GitLab 15.5.0, the backup command doesn't verify if another backup is alr [issue 362593](https://gitlab.com/gitlab-org/gitlab/-/issues/362593). We strongly recommend you make sure that all backups are complete before starting a new one. -An application data backup creates an archive file that contains the database, -all repositories and all attachments. - +NOTE: You can only restore a backup to **exactly the same version and type (CE/EE)** -of GitLab on which it was created. The best way to -[migrate your projects from one server to another](index.md#migrate-to-a-new-server) is through a backup and restore. +of GitLab on which it was created. 
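Because a backup can only be restored on the same GitLab version and type, it can help to record the running version next to each archive. A minimal sketch, assuming a Linux package (Omnibus) installation and the default backup directory:

```shell
# Sketch only: capture the instance version and environment details so the
# matching package can be installed before a restore. Assumes the default
# backup directory /var/opt/gitlab/backups.
sudo gitlab-rake gitlab:env:info > /var/opt/gitlab/backups/gitlab_version_$(date +%s).txt
```

The backup filename already includes the GitLab version, so this is only a convenience for operators who rename or post-process their archives.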
+ +::Tabs -Depending on your version of GitLab, use the following command if you installed -GitLab using the Omnibus package: +:::TabTitle Linux package (Omnibus) - GitLab 12.2 or later: @@ -221,14 +219,13 @@ GitLab using the Omnibus package: gitlab-rake gitlab:backup:create ``` -If you installed GitLab from source, use the following command: +:::TabTitle Helm chart (Kubernetes) -```shell -sudo -u git -H bundle exec rake gitlab:backup:create RAILS_ENV=production -``` +Run the backup task by using `kubectl` to run the `backup-utility` script on the GitLab toolbox pod. For more details, see the [charts backup documentation](https://docs.gitlab.com/charts/backup-restore/backup.html). + +:::TabTitle Docker -If you're running GitLab from within a Docker container, run the backup from -the host, based on your installed version of GitLab: +Run the backup from the host. - GitLab 12.2 or later: @@ -242,16 +239,19 @@ the host, based on your installed version of GitLab: docker exec -t gitlab-rake gitlab:backup:create ``` -If you're using the [GitLab Helm chart](https://gitlab.com/gitlab-org/charts/gitlab) -on a Kubernetes cluster, you can run the backup task by using `kubectl` to run the `backup-utility` -script on the GitLab toolbox pod. For more details, see the -[charts backup documentation](https://docs.gitlab.com/charts/backup-restore/backup.html). +:::TabTitle Self-compiled + +```shell +sudo -u git -H bundle exec rake gitlab:backup:create RAILS_ENV=production +``` + +::EndTabs + +If your GitLab deployment has multiple nodes, you need to pick a node for running the backup command. You must ensure that the designated node: -Similar to the Kubernetes case, if you have scaled out your GitLab cluster to -use multiple application servers, you should pick a designated node (that isn't -auto-scaled away) for running the backup Rake task. Because the backup Rake -task is tightly coupled to the main Rails application, this is typically a node -on which you're also running Puma or Sidekiq. +- is persistent, and not subject to auto-scaling. +- has the GitLab Rails application already installed. If Puma or Sidekiq is running, then Rails is installed. +- has sufficient storage and memory to produce the backup file. Example output: @@ -285,7 +285,7 @@ Deleting old backups... [SKIPPING] ### Backup timestamp The backup archive is saved in `backup_path`, which is specified in the -`config/gitlab.yml` file. The filename is `[TIMESTAMP]_gitlab_backup.tar`, +`config/gitlab.yml` file. The default path is `/var/opt/gitlab/backups`. The filename is `[TIMESTAMP]_gitlab_backup.tar`, where `TIMESTAMP` identifies the time at which each backup was created, plus the GitLab version. The timestamp is needed if you need to restore GitLab and multiple backups are available. @@ -321,8 +321,6 @@ To use the `copy` strategy instead of the default streaming strategy, specify sudo gitlab-backup create STRATEGY=copy ``` -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - #### Backup filename WARNING: @@ -338,8 +336,6 @@ environment variable. For example: sudo gitlab-backup create BACKUP=dump ``` -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - The resulting file is named `dump_gitlab_backup.tar`. This is useful for systems that make use of rsync and incremental backups, and results in considerably faster transfer speeds. @@ -358,8 +354,6 @@ on all distributions. 
To verify that it's available in your distribution, run sudo gitlab-backup create BACKUP=dump GZIP_RSYNCABLE=yes ``` -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - #### Excluding specific directories from the backup You can exclude specific directories from the backup by adding the environment variable `SKIP`, whose values are a comma-separated list of the following options: @@ -383,20 +377,22 @@ For more information see [command line arguments](https://docs.gitlab.com/charts All wikis are backed up as part of the `repositories` group. Non-existent wikis are skipped during a backup. -For Omnibus GitLab packages: +::Tabs + +:::TabTitle Linux package (Omnibus) ```shell sudo gitlab-backup create SKIP=db,uploads ``` -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - -For installations from source: +:::TabTitle Self-compiled ```shell sudo -u git -H bundle exec rake gitlab:backup:create SKIP=db,uploads RAILS_ENV=production ``` +::EndTabs + `SKIP=` is also used to: - [Skip creation of the tar file](#skipping-tar-creation) (`SKIP=tar`). @@ -418,18 +414,22 @@ backup in the directory used for the intermediate files. These files are overwritten when a new backup is created, so you should make sure they are copied elsewhere, because you can only have one backup on the system. -For Omnibus GitLab packages: +::Tabs + +:::TabTitle Linux package (Omnibus) ```shell sudo gitlab-backup create SKIP=tar ``` -For installations from source: +:::TabTitle Self-compiled ```shell sudo -u git -H bundle exec rake gitlab:backup:create SKIP=tar RAILS_ENV=production ``` +::EndTabs + #### Back up Git repositories concurrently > - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37158) in GitLab 13.3. @@ -448,18 +448,24 @@ task: to be spread across storages. Defaults to `2` (in GitLab 14.1 and earlier, defaults to `1`). -For example, for Omnibus GitLab installations with 4 repository storages: +For example, with 4 repository storages: + +::Tabs + +:::TabTitle Linux package (Omnibus) ```shell sudo gitlab-backup create GITLAB_BACKUP_MAX_CONCURRENCY=4 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1 ``` -For example, for installations from source: +:::TabTitle Self-compiled ```shell sudo -u git -H bundle exec rake gitlab:backup:create GITLAB_BACKUP_MAX_CONCURRENCY=4 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1 ``` +::EndTabs + #### Incremental repository backups > - Introduced in GitLab 14.9 [with a flag](../feature_flags.md) named `incremental_repository_backup`. Disabled by default. @@ -515,18 +521,24 @@ repositories from specific repository storages can be backed up separately using the `REPOSITORIES_STORAGES` option. The option accepts a comma-separated list of storage names. -For example, for Omnibus GitLab installations: +For example: + +::Tabs + +:::TabTitle Linux package (Omnibus) ```shell sudo gitlab-backup create REPOSITORIES_STORAGES=storage1,storage2 ``` -For example, for installations from source: +:::TabTitle Self-compiled ```shell sudo -u git -H bundle exec rake gitlab:backup:create REPOSITORIES_STORAGES=storage1,storage2 ``` +::EndTabs + #### Back up specific repositories > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/88094) in GitLab 15.1. @@ -540,18 +552,22 @@ descendent groups are included or skipped, depending on which option you used. 
For example, to back up all repositories for all projects in **Group A** (`group-a`), the repository for **Project C** in **Group B** (`group-b/project-c`), and skip the **Project D** in **Group A** (`group-a/project-d`): -- Omnibus GitLab installations: +::Tabs + +:::TabTitle Linux package (Omnibus) ```shell sudo gitlab-backup create REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d ``` -- Installations from source: +:::TabTitle Self-compiled ```shell sudo -u git -H bundle exec rake gitlab:backup:create REPOSITORIES_PATHS=group-a,group-b/project-c SKIP_REPOSITORIES_PATHS=group-a/project-d ``` +::EndTabs + #### Upload backups to a remote (cloud) storage NOTE: @@ -568,7 +584,7 @@ for AWS, Google, and Aliyun. A local driver is ##### Using Amazon S3 -For Omnibus GitLab packages: +For Linux package (Omnibus): 1. Add the following to `/etc/gitlab/gitlab.rb`: @@ -605,7 +621,7 @@ different, configuration methods. ###### SSE-S3 To enable SSE-S3, in the backup storage options set the `server_side_encryption` -field to `AES256`. For example, in Omnibus GitLab: +field to `AES256`. For example, in the Linux package (Omnibus): ```ruby gitlab_rails['backup_upload_storage_options'] = { @@ -622,7 +638,7 @@ Under the `backup_upload_storage_options` configuration setting, set: - `server_side_encryption` to `aws:kms`. - `server_side_encryption_kms_key_id` to the ARN of the key. -For example, in Omnibus GitLab: +For example, in the Linux package (Omnibus): ```ruby gitlab_rails['backup_upload_storage_options'] = { @@ -638,7 +654,7 @@ SSE-C requires you to set these encryption options: - `backup_encryption`: AES256. - `backup_encryption_key`: Unencoded, 32-byte (256 bits) key. The upload fails if this isn't exactly 32 bytes. -For example, in Omnibus GitLab: +For example, in the Linux package (Omnibus): ```ruby gitlab_rails['backup_encryption'] = 'AES256' @@ -792,7 +808,7 @@ access key from the Google console: **Set object-level and bucket-level permissions** is selected. 1. Ensure you have already created a bucket. -For Omnibus GitLab packages: +For the Linux package (Omnibus): 1. Edit `/etc/gitlab/gitlab.rb`: @@ -835,7 +851,9 @@ For installations from source: > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/25877) in GitLab 13.4. -For Omnibus GitLab packages: +::Tabs + +:::TabTitle Linux package (Omnibus) 1. Edit `/etc/gitlab/gitlab.rb`: @@ -852,7 +870,7 @@ For Omnibus GitLab packages: 1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) for the changes to take effect -For installations from source: +:::TabTitle Self-compiled 1. Edit `home/git/gitlab/config/gitlab.yml`: @@ -869,6 +887,8 @@ For installations from source: 1. [Restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect +::EndTabs + For more details, see the [table of Azure parameters](../object_storage.md#azure-blob-storage). ##### Specifying a custom directory for backups @@ -881,25 +901,27 @@ sudo gitlab-backup create DIRECTORY=daily sudo gitlab-backup create DIRECTORY=weekly ``` -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - #### Skip uploading backups to remote storage If you have configured GitLab to [upload backups in a remote storage](#upload-backups-to-a-remote-cloud-storage), you can use the `SKIP=remote` option to skip uploading your backups to the remote storage. 
-For Omnibus GitLab packages: +::Tabs + +:::TabTitle Linux package (Omnibus) ```shell sudo gitlab-backup create SKIP=remote ``` -For installations from source: +:::TabTitle Self-compiled ```shell sudo -u git -H bundle exec rake gitlab:backup:create SKIP=remote RAILS_ENV=production ``` +::EndTabs + #### Upload to locally-mounted shares You can send backups to a locally-mounted share (for example, `NFS`,`CIFS`, or `SMB`) using the Fog @@ -914,7 +936,7 @@ To do this, you must set the following configuration keys: When mounted, the directory set in the `local_root` key must be owned by either: - The `git` user. So, mounting with the `uid=` of the `git` user for `CIFS` and `SMB`. -- The user that you are executing the backup tasks as. For Omnibus GitLab, this is the `git` user. +- The user that you are executing the backup tasks as. For the Linux package (Omnibus), this is the `git` user. Because file system performance may affect overall GitLab performance, [we don't recommend using cloud-based file systems for storage](../nfs.md#avoid-using-cloud-based-file-systems). @@ -935,7 +957,9 @@ remaining after the failed upload attempt. ##### Configure uploads to locally-mounted shares -For Omnibus GitLab packages: +::Tabs + +:::TabTitle Linux package (Omnibus) 1. Edit `/etc/gitlab/gitlab.rb`: @@ -953,7 +977,7 @@ For Omnibus GitLab packages: 1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) for the changes to take effect. -For installations from source: +:::TabTitle Self-compiled 1. Edit `home/git/gitlab/config/gitlab.yml`: @@ -972,6 +996,8 @@ For installations from source: 1. [Restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect. +::EndTabs + #### Backup archive permissions The backup archives created by GitLab (`1393513186_2014_02_27_gitlab_backup.tar`) @@ -980,7 +1006,9 @@ meant to avoid other system users reading GitLab data. If you need the backup archives to have different permissions, you can use the `archive_permissions` setting. -For Omnibus GitLab packages: +::Tabs + +:::TabTitle Linux package (Omnibus) 1. Edit `/etc/gitlab/gitlab.rb`: @@ -991,7 +1019,7 @@ For Omnibus GitLab packages: 1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) for the changes to take effect. -For installations from source: +:::TabTitle Self-compiled 1. Edit `/home/git/gitlab/config/gitlab.yml`: @@ -1003,6 +1031,8 @@ For installations from source: 1. [Restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect. +::EndTabs + #### Configuring cron to make daily backups WARNING: @@ -1011,7 +1041,9 @@ or [SSH host keys](https://superuser.com/questions/532040/copy-ssh-keys-from-one You can schedule a cron job that backs up your repositories and GitLab metadata. -For Omnibus GitLab packages: +::Tabs + +:::TabTitle Linux package (Omnibus) 1. Edit the crontab for the `root` user: @@ -1026,9 +1058,7 @@ For Omnibus GitLab packages: 0 2 * * * /opt/gitlab/bin/gitlab-backup create CRON=1 ``` - Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:create` instead. - -For installations from source: +:::TabTitle Self-compiled 1. 
Edit the crontab for the `git` user: @@ -1043,6 +1073,8 @@ For installations from source: 0 2 * * * cd /home/git/gitlab && PATH=/usr/local/bin:/usr/bin:/bin bundle exec rake gitlab:backup:create RAILS_ENV=production CRON=1 ``` +::EndTabs + The `CRON=1` environment setting directs the backup script to hide all progress output if there aren't any errors. This is recommended to reduce cron spam. When troubleshooting backup problems, however, replace `CRON=1` with `--trace` to log verbosely. @@ -1063,7 +1095,9 @@ because the user may not have permission to list and delete files. It's recommended that you configure the appropriate retention policy for your object storage (for example, [AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html)). -For Omnibus GitLab packages: +::Tabs + +:::TabTitle Linux package (Omnibus) 1. Edit `/etc/gitlab/gitlab.rb`: @@ -1075,7 +1109,7 @@ For Omnibus GitLab packages: 1. [Reconfigure GitLab](../restart_gitlab.md#reconfigure-a-linux-package-installation) for the changes to take effect. -For installations from source: +:::TabTitle Self-compiled 1. Edit `/home/git/gitlab/config/gitlab.yml`: @@ -1088,6 +1122,8 @@ For installations from source: 1. [Restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect. +::EndTabs + #### Back up and restore for installations using PgBouncer Do not back up or restore GitLab through a PgBouncer connection. These @@ -1143,7 +1179,7 @@ variables that are prefixed with `GITLAB_BACKUP_`: - `GITLAB_BACKUP_PGSSLCOMPRESSION` For example, to override the database host and port to use 192.168.1.10 -and port 5432 with the Omnibus package: +and port 5432 with the Linux package (Omnibus): ```shell sudo GITLAB_BACKUP_PGHOST=192.168.1.10 GITLAB_BACKUP_PGPORT=5432 /opt/gitlab/bin/gitlab-backup create @@ -1163,7 +1199,7 @@ The `gitaly-backup` binary is used by the backup Rake task to create and restore The backup Rake task must be able to find this executable. In most cases, you don't need to change the path to the binary as it should work fine with the default path `/opt/gitlab/embedded/bin/gitaly-backup`. -If you have a specific reason to change the path, it can be configured in Omnibus GitLab packages: +If you have a specific reason to change the path, it can be configured in the Linux package (Omnibus): 1. Add the following to `/etc/gitlab/gitlab.rb`: @@ -1198,18 +1234,18 @@ When considering using file system data transfer or snapshots: Example: Amazon Elastic Block Store (EBS) -> A GitLab server using Omnibus GitLab hosted on Amazon AWS. -> An EBS drive containing an ext4 file system is mounted at `/var/opt/gitlab`. -> In this case you could make an application backup by taking an EBS snapshot. -> The backup includes all repositories, uploads and PostgreSQL data. +- A GitLab server using the Linux package (Omnibus) hosted on Amazon AWS. +- An EBS drive containing an ext4 file system is mounted at `/var/opt/gitlab`. +- In this case you could make an application backup by taking an EBS snapshot. +- The backup includes all repositories, uploads and PostgreSQL data. Example: Logical Volume Manager (LVM) snapshots + rsync -> A GitLab server using Omnibus GitLab, with an LVM logical volume mounted at `/var/opt/gitlab`. -> Replicating the `/var/opt/gitlab` directory using rsync would not be reliable because too many files would change while rsync is running. 
-> Instead of rsync-ing `/var/opt/gitlab`, we create a temporary LVM snapshot, which we mount as a read-only file system at `/mnt/gitlab_backup`. -> Now we can have a longer running rsync job which creates a consistent replica on the remote server. -> The replica includes all repositories, uploads and PostgreSQL data. +- A GitLab server using the Linux package (Omnibus), with an LVM logical volume mounted at `/var/opt/gitlab`. +- Replicating the `/var/opt/gitlab` directory using rsync would not be reliable because too many files would change while rsync is running. +- Instead of rsync-ing `/var/opt/gitlab`, we create a temporary LVM snapshot, which we mount as a read-only file system at `/mnt/gitlab_backup`. +- Now we can have a longer running rsync job which creates a consistent replica on the remote server. +- The replica includes all repositories, uploads and PostgreSQL data. If you're running GitLab on a virtualized server, you can possibly also create VM snapshots of the entire GitLab server. It's not uncommon however for a VM @@ -1220,14 +1256,22 @@ practical use. First, ensure you back up existing GitLab data while [skipping repositories](#excluding-specific-directories-from-the-backup): +::Tabs + +:::TabTitle Linux package (Omnibus) + ```shell -# for Omnibus GitLab package installations sudo gitlab-backup create SKIP=repositories +``` -# for installations from source: +:::TabTitle Self-compiled + +```shell sudo -u git -H bundle exec rake gitlab:backup:create SKIP=repositories RAILS_ENV=production ``` +::EndTabs + For manually backing up the Git repository data on disk, there are multiple possible strategies: - Use snapshots, such as the previous examples of Amazon EBS drive snapshots, or LVM snapshots + rsync. @@ -1340,13 +1384,13 @@ after which users must reactivate 2FA. 1. Enter the database console: - For Omnibus GitLab 14.1 and earlier: + For the Linux package (Omnibus) GitLab 14.1 and earlier: ```shell sudo gitlab-rails dbconsole ``` - For Omnibus GitLab 14.2 and later: + For the Linux package (Omnibus) GitLab 14.2 and later: ```shell sudo gitlab-rails dbconsole --database main @@ -1393,13 +1437,13 @@ You may need to reconfigure or restart GitLab for the changes to take effect. 1. Enter the database console: - For Omnibus GitLab 14.1 and earlier: + For the Linux package (Omnibus) GitLab 14.1 and earlier: ```shell sudo gitlab-rails dbconsole ``` - For Omnibus GitLab 14.2 and later: + For the Linux package (Omnibus) GitLab 14.2 and later: ```shell sudo gitlab-rails dbconsole --database main @@ -1442,13 +1486,13 @@ You may need to reconfigure or restart GitLab for the changes to take effect. 1. Enter the database console: - For Omnibus GitLab 14.1 and earlier: + For the Linux package (Omnibus) GitLab 14.1 and earlier: ```shell sudo gitlab-rails dbconsole ``` - For Omnibus GitLab 14.2 and later: + For the Linux package (Omnibus) GitLab 14.2 and later: ```shell sudo gitlab-rails dbconsole --database main @@ -1497,13 +1541,13 @@ You should verify that the secrets are the root cause before deleting any data. 1. Enter the database console: - For Omnibus GitLab 14.1 and earlier: + For the Linux package (Omnibus) GitLab 14.1 and earlier: ```shell sudo gitlab-rails dbconsole ``` - For Omnibus GitLab 14.2 and later: + For the Linux package (Omnibus) GitLab 14.2 and later: ```shell sudo gitlab-rails dbconsole --database main @@ -1531,7 +1575,7 @@ You should verify that the secrets are the root cause before deleting any data. 
### Container Registry push failures after restoring from a backup If you use the [Container Registry](../../user/packages/container_registry/index.md), -pushes to the registry may fail after restoring your backup on an Omnibus GitLab +pushes to the registry may fail after restoring your backup on a Linux package (Omnibus) instance after restoring the registry data. These failures mention permission issues in the registry logs, similar to: @@ -1631,13 +1675,13 @@ Truncate the filenames in the `uploads` table: 1. Enter the database console: - For Omnibus GitLab 14.2 and later: + For the Linux package (Omnibus) GitLab 14.2 and later: ```shell sudo gitlab-rails dbconsole --database main ``` - For Omnibus GitLab 14.1 and earlier: + For the Linux package (Omnibus) GitLab 14.1 and earlier: ```shell sudo gitlab-rails dbconsole diff --git a/doc/administration/backup_restore/restore_gitlab.md b/doc/administration/backup_restore/restore_gitlab.md index d6b5f7ad6eb04a..2cc0c68c66b587 100644 --- a/doc/administration/backup_restore/restore_gitlab.md +++ b/doc/administration/backup_restore/restore_gitlab.md @@ -13,8 +13,9 @@ The [restore prerequisites section](#restore-prerequisites) includes crucial information. Be sure to read and test the complete restore process at least once before attempting to perform it in a production environment. -You can restore a backup only to _the exact same version and type (CE/EE)_ of -GitLab that you created it on (for example CE 15.1.4). +NOTE: +You can only restore a backup to **exactly the same version and type (CE/EE)** +of GitLab on which it was created (for example CE 15.1.4). If your backup is a different version than the current installation, you must [downgrade](../../update/package/downgrade.md) or [upgrade](../../update/package/index.md#upgrade-to-a-specific-version-using-the-official-repositories) your GitLab installation @@ -109,15 +110,6 @@ restore: sudo gitlab-backup restore BACKUP=11493107454_2018_04_25_10.6.4-ce ``` -Users of GitLab 12.1 and earlier should use the command `gitlab-rake gitlab:backup:restore` instead. -Some [known non-blocking error messages may appear](#restoring-database-backup-using-omnibus-packages-outputs-warnings). - -WARNING: -`gitlab-rake gitlab:backup:restore` doesn't set the correct file system -permissions on your Registry directory. This is a [known issue](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/62759). -In GitLab 12.2 or later, you can use `gitlab-backup restore` to avoid this -issue. - If there's a GitLab version mismatch between your backup tar file and the installed version of GitLab, the restore command aborts with an error message. 
Install the [correct GitLab version](https://packages.gitlab.com/gitlab/), -- GitLab From a3d13035974c43353d6d86a9644b9d8cd8b85b56 Mon Sep 17 00:00:00 2001 From: Mike Kozono Date: Mon, 26 Jun 2023 16:03:43 -1000 Subject: [PATCH 10/13] Mention another use-case for skipping tar creation --- doc/administration/backup_restore/backup_gitlab.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md index 052627956ad224..b697c809ce39b1 100644 --- a/doc/administration/backup_restore/backup_gitlab.md +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -403,11 +403,10 @@ sudo -u git -H bundle exec rake gitlab:backup:create SKIP=db,uploads RAILS_ENV=p NOTE: It is not possible to skip the tar creation when using [object storage](#upload-backups-to-a-remote-cloud-storage) for backups. -The last part of creating a backup is generation of a `.tar` file containing -all the parts. In some cases (for example, if the backup is picked up by other -backup software) creating a `.tar` file might be wasted effort or even directly -harmful, so you can skip this step by adding `tar` to the `SKIP` environment -variable. +The last part of creating a backup is generation of a `.tar` file containing all the parts. In some cases, creating a `.tar` file might be wasted effort or even directly harmful, so you can skip this step by adding `tar` to the `SKIP` environment variable. Example use-cases: + +- When the backup is picked up by other backup software. +- To speed up incremental backups by avoiding having to extract the backup every time. (In this case, `PREVIOUS_BACKUP` and `BACKUP` must not be specified, otherwise the specified backup will be extracted, but no `.tar` file will be generated at the end.) Adding `tar` to the `SKIP` variable leaves the files and directories containing the backup in the directory used for the intermediate files. These files are -- GitLab From cbbb64d124754a2458a8a993f3544cc2c33d1ad6 Mon Sep 17 00:00:00 2001 From: Achilleas Pipinellis Date: Tue, 27 Jun 2023 23:24:06 +0000 Subject: [PATCH 11/13] Incorporate technical writing feedback --- .../backup_restore/backup_gitlab.md | 46 +++++++------------ 1 file changed, 16 insertions(+), 30 deletions(-) diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md index b697c809ce39b1..c81ab202d41d8b 100644 --- a/doc/administration/backup_restore/backup_gitlab.md +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -20,7 +20,7 @@ As a rough guideline, if you are using a [1k reference architecture](../referenc As the volume of GitLab data grows, the [backup command](#backup-command) takes longer to execute. At some point, the execution time becomes impractical. For example, it can take 24 hours or more. -See [Alternative backup strategies](#alternative-backup-strategies) for more information. +For more information, see [alternative backup strategies](#alternative-backup-strategies). ## What data needs to be backed up? @@ -37,13 +37,13 @@ In the simplest case, GitLab has one PostgreSQL database in one PostgreSQL serve In general, this data is the single source of truth for most user-generated content in the Web interface, such as issue and merge request content, comments, permissions, and credentials. PostgreSQL also holds some cached data like HTML-rendered Markdown, and by default, merge request diffs. 
-Although merge request diffs can also be configured to be offloaded to the file system or object storage, see [Blobs](#blobs). +However, merge request diffs can also be configured to be offloaded to the file system or object storage, see [Blobs](#blobs). Gitaly Cluster's Praefect service uses a PostgreSQL database as a single source of truth to manage its Gitaly nodes. A common PostgreSQL utility, [`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html), produces a backup file which can be used to restore a PostgreSQL database. The [backup command](#backup-command) uses this utility under the hood. -Unfortunately, `pg_dump` takes longer to execute, the larger the database. Depending on your situation, the duration becomes impractical at some point (days, for example). If your database is over 100GB, `pg_dump`, and by extension the [backup command](#backup-command), is likely not usable. +Unfortunately, the larger the database, the longer it takes `pg_dump` to execute. Depending on your situation, the duration becomes impractical at some point (days, for example). If your database is over 100 GB, `pg_dump`, and by extension the [backup command](#backup-command), is likely not usable. For more information, see [alternative backup strategies](#alternative-backup-strategies). ### Git repositories @@ -97,7 +97,7 @@ The [backup Rake task](#back-up-gitlab) GitLab provides does _not_ store your co WARNING: The secrets file may change after upgrades. -It is recommended to backup the configuration directory. At the very **minimum**, you must back up: +You should back up the configuration directory. At the very **minimum**, you must back up: ::Tabs @@ -135,10 +135,10 @@ In the unlikely event that the secrets file is lost, see the ### Other data -GitLab uses Redis both as a cache store and to hold persistent data for our background jobs system, Sidekiq. The provided [backup command](#backup-command) does _not_ back up Redis data. This means to take a consistent backup with the [backup command](#backup-command), there must be no pending or running background jobs. It is possible to manually back up Redis. +GitLab uses Redis both as a cache store and to hold persistent data for our background jobs system, Sidekiq. The provided [backup command](#backup-command) does _not_ back up Redis data. This means that in order to take a consistent backup with the [backup command](#backup-command), there must be no pending or running background jobs. It is possible to [manually back up Redis](https://redis.io/docs/management/persistence/#backing-up-redis-data). Elasticsearch is an optional database for advanced search. It can improve search -in both source-code level, and user generated content in issues, merge requests, and discussions. The [backup command](#backup-command) does _not_ back up Elasticsearch data. Elasticsearch data can be regenerated from PostgreSQL data after a restore. It is possible to manually back up Elasticsearch. +in both source-code level, and user generated content in issues, merge requests, and discussions. The [backup command](#backup-command) does _not_ back up Elasticsearch data. Elasticsearch data can be regenerated from PostgreSQL data after a restore. It is possible to [manually back up Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html). ## Command line interface @@ -174,8 +174,8 @@ files. 
You are highly advised to read about [storing configuration files](#stori To be able to back up and restore, ensure that Rsync is installed on your system. If you installed GitLab: -- _Using the Linux package_, Rsync is already installed. -- _Using self-compiled_, check if `rsync` is installed. If Rsync is not installed, install it. For example: +- Using the Linux package, Rsync is already installed. +- Using self-compiled, check if `rsync` is installed. If Rsync is not installed, install it. For example: ```shell # Debian/Ubuntu @@ -207,17 +207,9 @@ of GitLab on which it was created. :::TabTitle Linux package (Omnibus) -- GitLab 12.2 or later: - - ```shell - sudo gitlab-backup create - ``` - -- GitLab 12.1 and earlier: - - ```shell - gitlab-rake gitlab:backup:create - ``` +```shell +sudo gitlab-backup create +``` :::TabTitle Helm chart (Kubernetes) @@ -229,15 +221,9 @@ Run the backup from the host. - GitLab 12.2 or later: - ```shell - docker exec -t gitlab-backup create - ``` - -- GitLab 12.1 and earlier: - - ```shell - docker exec -t gitlab-rake gitlab:backup:create - ``` +```shell +docker exec -t gitlab-backup create +``` :::TabTitle Self-compiled @@ -1211,9 +1197,9 @@ If you have a specific reason to change the path, it can be configured in the Li ## Alternative backup strategies -Since every deployment may have different capabilities, you should first review [What data needs to be backed up?](#what-data-needs-to-be-backed-up) to better understand if, and how, you can leverage them. +Because every deployment may have different capabilities, you should first review [what data needs to be backed up](#what-data-needs-to-be-backed-up) to better understand if, and how, you can leverage them. -For example, if you use Amazon RDS, you might choose to use its built-in backup and restore features to handle your GitLab [PostgreSQL data](#postgresql-databases), and simply [exclude PostgreSQL data](#excluding-specific-directories-from-the-backup) when using the [backup command](#backup-command). +For example, if you use Amazon RDS, you might choose to use its built-in backup and restore features to handle your GitLab [PostgreSQL data](#postgresql-databases), and [exclude PostgreSQL data](#excluding-specific-directories-from-the-backup) when using the [backup command](#backup-command). In the following cases, consider using file system data transfer or snapshots as part of your backup strategy: -- GitLab From 6ae47fdbb66dbf9249e9e2f283efa679922a2dcb Mon Sep 17 00:00:00 2001 From: Achilleas Pipinellis Date: Tue, 27 Jun 2023 23:25:27 +0000 Subject: [PATCH 12/13] Tweak wording and formatting --- doc/administration/backup_restore/backup_gitlab.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md index c81ab202d41d8b..915b2054acc372 100644 --- a/doc/administration/backup_restore/backup_gitlab.md +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -10,7 +10,7 @@ The exact procedure for backing up GitLab depends on many factors. Your particul ## Simple back up procedure -As a rough guideline, if you are using a [1k reference architecture](../reference_architectures/1k_users.md) with less than 100GB of data, then follow these steps: +As a rough guideline, if you are using a [1k reference architecture](../reference_architectures/1k_users.md) with less than 100 GB of data, then follow these steps: 1. Run the [backup command](#backup-command). 1. 
Back up [object storage](#object-storage), if applicable. @@ -27,7 +27,7 @@ For more information, see [alternative backup strategies](#alternative-backup-st - [PostgreSQL databases](#postgresql-databases) - [Git repositories](#git-repositories) - [Blobs](#blobs) -- [Storing configuration files](#storing-configuration-files) +- [Configuration files](#storing-configuration-files) - [Other data](#other-data) ### PostgreSQL databases -- GitLab From 2b0c34f80a3afb92644b098cad50ec5ec1cde1f6 Mon Sep 17 00:00:00 2001 From: Achilleas Pipinellis Date: Wed, 28 Jun 2023 05:28:29 +0000 Subject: [PATCH 13/13] Apply 11 suggestion(s) to 1 file(s) --- .../backup_restore/backup_gitlab.md | 24 +++++++++---------- 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/doc/administration/backup_restore/backup_gitlab.md b/doc/administration/backup_restore/backup_gitlab.md index 915b2054acc372..5cdfbefff1a8a4 100644 --- a/doc/administration/backup_restore/backup_gitlab.md +++ b/doc/administration/backup_restore/backup_gitlab.md @@ -70,7 +70,7 @@ Personal and project snippets, and group wiki content, are stored in Git reposit Project forks are deduplicated in live a GitLab site using pool repositories. -The [backup command](#backup-command) produces a Git bundle for each repository and tars them all up. This duplicates pool repository data into every fork. For more information, see [Alternative backup strategies](#alternative-backup-strategies). +The [backup command](#backup-command) produces a Git bundle for each repository and tars them all up. This duplicates pool repository data into every fork. In [our testing](https://gitlab.com/gitlab-org/gitlab/-/issues/396343), 100 GB of Git repositories took a little over 2 hours to back up and upload to S3. At around 400 GB of Git data, the backup command is likely not viable for regular backups. For more information, see [alternative backup strategies](#alternative-backup-strategies). ### Blobs @@ -281,7 +281,7 @@ the timestamp is `1493107454_2018_04_25_10.6.4-ce`. ### Backup options -The command line tool GitLab provides to backup your instance can accept more +The command line tool GitLab provides to back up your instance can accept more options. #### Backup strategy option @@ -291,14 +291,14 @@ data locations to the backup using the Linux command `tar` and `gzip`. This work fine in most cases, but can cause problems when data is rapidly changing. When data changes while `tar` is reading it, the error `file changed as we read -it` may occur, and causes the backup process to fail. To combat this, 8.17 -introduces a new backup strategy called `copy`. The strategy copies data files +it` may occur, and causes the backup process to fail. In that case, you can use +the backup strategy called `copy`. The strategy copies data files to a temporary location before calling `tar` and `gzip`, avoiding the error. A side-effect is that the backup process takes up to an additional 1X disk space. The process does its best to clean up the temporary files at each stage so the problem doesn't compound, but it could be a considerable change for large -installations. This is why the `copy` strategy is not the default in 8.17. +installations. To use the `copy` strategy instead of the default streaming strategy, specify `STRATEGY=copy` in the Rake task command. For example: @@ -392,7 +392,7 @@ It is not possible to skip the tar creation when using [object storage](#upload- The last part of creating a backup is generation of a `.tar` file containing all the parts. 
In some cases, creating a `.tar` file might be wasted effort or even directly harmful, so you can skip this step by adding `tar` to the `SKIP` environment variable. Example use-cases: - When the backup is picked up by other backup software. -- To speed up incremental backups by avoiding having to extract the backup every time. (In this case, `PREVIOUS_BACKUP` and `BACKUP` must not be specified, otherwise the specified backup will be extracted, but no `.tar` file will be generated at the end.) +- To speed up incremental backups by avoiding having to extract the backup every time. (In this case, `PREVIOUS_BACKUP` and `BACKUP` must not be specified, otherwise the specified backup is extracted, but no `.tar` file is generated at the end.) Adding `tar` to the `SKIP` variable leaves the files and directories containing the backup in the directory used for the intermediate files. These files are @@ -1133,7 +1133,7 @@ This happens because the task uses `pg_dump`, which [sets a null search path and explicitly includes the schema in every SQL query](https://gitlab.com/gitlab-org/gitlab/-/issues/23211) to address [CVE-2018-1058](https://www.postgresql.org/about/news/postgresql-103-968-9512-9417-and-9322-released-1834/). -Since connections are reused with PgBouncer in transaction pooling mode, +Because connections are reused with PgBouncer in transaction pooling mode, PostgreSQL fails to search the default `public` schema. As a result, this clearing of the search path causes tables and columns to appear missing. @@ -1214,7 +1214,7 @@ When considering using file system data transfer or snapshots: - Don't use these methods to migrate from one operating system to another. The operating systems of the source and destination should be as similar as possible. For example, don't use these methods to migrate from Ubuntu to Fedora. -- Data consistency is very important. We recommend stopping GitLab with `sudo gitlab-ctl stop` before taking doing a file system transfer (with rsync, for example) or taking a +- Data consistency is very important. You should stop GitLab with `sudo gitlab-ctl stop` before taking doing a file system transfer (with `rsync`, for example) or taking a snapshot. Example: Amazon Elastic Block Store (EBS) @@ -1735,9 +1735,9 @@ Truncate the filenames in the `uploads` table: - `new_filename`: a filename that has been truncated to 246 characters maximum. - `new_path`: new path considering the `new_filename` (truncated). - Once you validate the batch results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + After you validate the batch results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. -1. Rename the files found in the `uploads` table from long filenames to new truncated filenames. The following query rolls back the update so you can check the results safely within a transaction wrapper: +1. Rename the files found in the `uploads` table from long filenames to new truncated filenames. 
The following query rolls back the update so you can check the results safely in a transaction wrapper: ```sql CREATE TEMP TABLE uploads_with_long_filenames AS @@ -1770,7 +1770,7 @@ Truncate the filenames in the `uploads` table: ROLLBACK; ``` - Once you validate the batch update results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + After you validate the batch update results, you must change the batch size (`row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. 1. Validate that the new filenames from the previous query are the expected ones. If you are sure you want to truncate the records found in the previous step to 246 characters, run the following: @@ -1802,7 +1802,7 @@ Truncate the filenames in the `uploads` table: AND updatable_uploads.row_id > 0 AND updatable_uploads.row_id <= 10000; ``` - Once you finish the batch update, you must change the batch size (`updatable_uploads.row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. + After you finish the batch update, you must change the batch size (`updatable_uploads.row_id`) using the following sequence of numbers (10000 to 20000). Repeat this process until you reach the last record in the `uploads` table. Truncate the filenames in the references found: -- GitLab