Failures restoring a backup to Gitaly Cluster
We currently have two customer tickets showing similar problems when restoring backups to Gitaly Cluster environments:
The first customer has a single-node GitLab instance in production and has configured a new GitLab environment with Gitaly Cluster. When he attempts to restore a backup from the single-node production instance to the new Gitaly Cluster instance, the repository restore hits RPC errors and, in some cases, read-only errors. The customer reports the following:
Many of the projects appear to restore okay, but several show gRPC status 14 errors, and the restore itself eventually fails with the following error:
[root@ip-172-17-21-111 backups]# rake aborted!
Gitlab::Git::CommandError: 9:repository is in read-only mode.
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/wraps_gitaly_errors.rb:15:in `rescue in wrapped_gitaly_errors'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/wraps_gitaly_errors.rb:6:in `wrapped_gitaly_errors'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/repository.rb:150:in `remove'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:218:in `restore_snippet_repository'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:61:in `block in restore_snippets'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:61:in `each'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:61:in `map'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:61:in `restore_snippets'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:44:in `restore'
/opt/gitlab/embedded/service/gitlab-rails/ee/lib/ee/backup/repositories.rb:12:in `restore'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:120:in `block (4 levels) in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:78:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Caused by:
GRPC::FailedPrecondition: 9:repository is in read-only mode. debug_error_string:{"created":"@1617041064.468415038","description":"Error received from peer ipv4:172.17.68.155:2305","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"repository is in read-only mode","grpc_status":9}
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/gitaly_client.rb:177:in `execute'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/gitaly_client/call.rb:18:in `block in call'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/gitaly_client/call.rb:55:in `recording_request'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/gitaly_client/call.rb:17:in `call'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/gitaly_client.rb:167:in `call'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/gitaly_client/repository_service.rb:359:in `remove'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/repository.rb:151:in `block in remove'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/wraps_gitaly_errors.rb:7:in `wrapped_gitaly_errors'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/repository.rb:150:in `remove'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:218:in `restore_snippet_repository'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:61:in `block in restore_snippets'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:61:in `each'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:61:in `map'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:61:in `restore_snippets'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:44:in `restore'
/opt/gitlab/embedded/service/gitlab-rails/ee/lib/ee/backup/repositories.rb:12:in `restore'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:120:in `block (4 levels) in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:78:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Tasks: TOP => gitlab:backup:repo:restore
(See full trace by running task with --trace)
The second customer reports similar behavior when restoring into what he describes as a Gitaly/Praefect environment. He does not mention the architecture of the instance the backup was taken from.
[Failed] restoring api-common/cni (@hashed/eb/0c/eb0c9cdc0862653468dacc6a876a0c40e9d642c50f798bae1162fe27f18d482c)
Error 9:repository is in read-only mode. debug_error_string:{"created":"@1616683445.784983667","description":"Error received from peer ipv4:10.232.72.17:2305","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"repository is in read-only mode","grpc_status":9}
[Failed] restoring xxiabs/xxi (@hashed/ee/9d/ee9d527a0a6108477fc5c98cf2a00f65d38c8e8508c4d17c1c11b2441c78a2ec)
Error 14:Socket closed. debug_error_string:{"created":"@1616683501.107158798","description":"Error received from peer ipv4:10.232.72.17:2305","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"Socket closed","grpc_status":14}
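To compare the two tickets, it may help to tally the failed repositories by gRPC status code. Below is a minimal Python sketch, assuming the restore output keeps the two-line shape shown above (a `[Failed] restoring ...` line followed by an `Error N:message. debug_error_string:{...}` line). The function name `summarize_restore_failures` and the `SAMPLE` constant are hypothetical helpers for triage, not part of GitLab. The status names are the standard gRPC names for codes 9 and 14.

```python
import json
import re
from collections import Counter

# Standard gRPC status names for the two codes seen in these tickets.
GRPC_STATUS_NAMES = {9: "FAILED_PRECONDITION", 14: "UNAVAILABLE"}

# Assumed line shapes, copied from the restore output in the tickets.
FAILED_RE = re.compile(r"^\[Failed\] restoring (?P<project>\S+) \((?P<disk_path>[^)]+)\)")
ERROR_RE = re.compile(
    r"^Error (?P<code>\d+):(?P<message>.*?)\. debug_error_string:(?P<debug>\{.*\})\s*$"
)

SAMPLE = r"""
[Failed] restoring api-common/cni (@hashed/eb/0c/eb0c9cdc0862653468dacc6a876a0c40e9d642c50f798bae1162fe27f18d482c)
Error 9:repository is in read-only mode. debug_error_string:{"created":"@1616683445.784983667","description":"Error received from peer ipv4:10.232.72.17:2305","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"repository is in read-only mode","grpc_status":9}
[Failed] restoring xxiabs/xxi (@hashed/ee/9d/ee9d527a0a6108477fc5c98cf2a00f65d38c8e8508c4d17c1c11b2441c78a2ec)
Error 14:Socket closed. debug_error_string:{"created":"@1616683501.107158798","description":"Error received from peer ipv4:10.232.72.17:2305","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"Socket closed","grpc_status":14}
"""


def summarize_restore_failures(lines):
    """Pair each '[Failed] restoring' line with the 'Error' line that follows it."""
    failures = []
    current = None
    for raw in lines:
        line = raw.strip()
        failed = FAILED_RE.match(line)
        if failed:
            current = failed.groupdict()
            continue
        error = ERROR_RE.match(line)
        if error and current is not None:
            code = int(error.group("code"))
            debug = json.loads(error.group("debug"))
            failures.append({
                **current,
                "code": code,
                "status": GRPC_STATUS_NAMES.get(code, "OTHER"),
                "message": error.group("message"),
                "peer": debug.get("description", ""),
            })
            current = None
    return failures


if __name__ == "__main__":
    results = summarize_restore_failures(SAMPLE.splitlines())
    for status, count in Counter(f["status"] for f in results).items():
        print(status, count)
```

Grouping by status separates the read-only failures (code 9) from the connection failures (code 14), which may have different causes even though both come from the same peer on port 2305.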
Multiple runs of the checks on all Praefect hosts always show the same result:
root@m1-devops-test-gitlab-praefect-1:~# sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss -virtual-storage default
Virtual storage: default
All repositories are writable!
root@m1-devops-test-gitlab-praefect-1:~# sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss -virtual-storage default -partially-replicated
Virtual storage: default
All repositories are up to date!
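Since the `dataloss` check has to be repeated on every Praefect host, a small wrapper could make the runs easier to compare. The Python sketch below is only an illustration: it reuses the binary path, config path, and flags exactly as they appear in the commands above, and treats the two messages seen in this output as the "no problems" markers. The helper names are hypothetical, and what the command prints when repositories are actually read-only is not shown in these tickets, so the check is deliberately conservative.

```python
import subprocess

# Paths and flags copied from the commands in the ticket.
PRAEFECT_BIN = "/opt/gitlab/embedded/bin/praefect"
PRAEFECT_CONFIG = "/var/opt/gitlab/praefect/config.toml"

# The only "healthy" messages observed in the ticket output (assumption:
# anything else deserves a manual look).
HEALTHY_MARKERS = (
    "All repositories are writable!",
    "All repositories are up to date!",
)


def dataloss_command(virtual_storage="default", partially_replicated=False):
    """Build the argument list for the praefect dataloss check."""
    cmd = [
        PRAEFECT_BIN, "-config", PRAEFECT_CONFIG,
        "dataloss", "-virtual-storage", virtual_storage,
    ]
    if partially_replicated:
        cmd.append("-partially-replicated")
    return cmd


def reports_no_problems(output):
    """Return True only if one of the known healthy messages is present."""
    return any(marker in output for marker in HEALTHY_MARKERS)


def run_dataloss(partially_replicated=False):
    """Run the check on the local Praefect host and return its stdout."""
    result = subprocess.run(
        dataloss_command(partially_replicated=partially_replicated),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Running `reports_no_problems(run_dataloss())` on each host while the restore is in progress, rather than only afterwards, might show whether repositories are briefly reported as read-only during the restore.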