Geo: Implement project repository replication for organization migration

Problem

As part of organization migration from Legacy Cell to Protocell, the Geo team needs to implement special handling for project repository replication. The project_repositories table is now marked as cell-local and won't be replicated directly, requiring custom logic to ensure project repositories are properly handled on the target cell.

Background

The projects table has a repository_storage field. In addition, GitLab Rails and Gitaly for the most part do not depend on project_repositories (this was discovered here).

Proposal

  • Classify project_repositories as cell-local.

When doing PG replication for migrating an org to a Protocell:

  • Ignore project_repositories table (as usual)
  • Do replicate projects rows (as usual)
  • Upon replicating a projects row, transform the repository_storage value to a magic string such as "reset_storage". This is just an idea to avoid making the field nullable, which seems riskier, and "reset_storage" carries more meaning than null. For example, in AWS DMS, add a transformation rule to the table mapping configuration that uses remove-column and add-column.
  • Hook into Project#repository_storage to set itself to Repository.pick_storage_shard if it currently equals "reset_storage"
  • If &17974 is done: when syncing a project Git repo, trigger Project#track_project_repository only if the project has a repo on the source cell.
  • If &17974 is not done: when syncing a project Git repo, always trigger Project#track_project_repository.
  • To do: Do we have to worry about carrying over the original object_format on project_repositories?
  • Ensure there are sufficient automated (unit? QA?) tests to meet GitLab's standards
  • Smoke test locally
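
The sentinel hook and the conditional tracking step from the proposal could look roughly like the sketch below. It is plain Ruby, not the actual GitLab models: the Project and Repository classes here are simplified stand-ins, pick_storage_shard returns a fixed value for illustration, and RESET_STORAGE, after_repository_sync, and the source_has_repo argument are hypothetical names introduced only for this example.

```ruby
# Sentinel written by the replication transform (value suggested in the proposal).
RESET_STORAGE = "reset_storage"

# Stand-in for the real Repository.pick_storage_shard, which would choose
# among the storages configured on the target cell. Assumption: fixed value.
module Repository
  def self.pick_storage_shard
    "default"
  end
end

# Simplified stand-in for the Project model (no ActiveRecord).
class Project
  attr_reader :tracked

  def initialize(repository_storage:)
    @repository_storage = repository_storage
    @tracked = false
  end

  # On read, swap the sentinel for a shard picked on the target cell.
  def repository_storage
    if @repository_storage == RESET_STORAGE
      @repository_storage = Repository.pick_storage_shard
    end
    @repository_storage
  end

  # Stand-in for Project#track_project_repository; only records the call.
  def track_project_repository
    @tracked = true
  end
end

# Hypothetical post-sync step. source_has_repo is nil when the source cell's
# state is unknown (the "&17974 is not done" case), so tracking always runs.
def after_repository_sync(project, source_has_repo: nil)
  project.track_project_repository unless source_has_repo == false
end
```

In this sketch the sentinel is resolved lazily on first read, which matches the idea of hooking into Project#repository_storage. Whether the picked shard should also be persisted at that point is left open here.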

Edited by Michael Kozono