diff --git a/doc/development/backend/create_source_code_be/pull_mirroring.md b/doc/development/backend/create_source_code_be/pull_mirroring.md
new file mode 100644
index 0000000000000000000000000000000000000000..27c5638472c40130e7d5aa95f6220815871d1683
--- /dev/null
+++ b/doc/development/backend/create_source_code_be/pull_mirroring.md
@@ -0,0 +1,82 @@
+---
+stage: Create
+group: Source Code
+info: Any user with at least the Maintainer role can merge updates to this content. For details, see https://docs.gitlab.com/ee/development/development_processes.html#development-guidelines-review.
+---
+
+# Pull Mirroring Implementation
+
+For more details, refer to the official documentation: [GitLab Pull Mirroring Documentation](../../../user/project/repository/mirror/pull.md).
+
+## Stage 1: `UpdateAllMirrorsWorker` - Fetch Eligible Projects for Pull Mirroring
+
+### Eligibility Criteria for Projects
+
+A project must meet the following criteria to be eligible for pull mirroring:
+
+- The project has a pull mirroring configuration.
+- The project belongs to a premium or ultimate namespace.
+- The project is not archived.
+- The project is not deleted.
+- The project has an import status of `empty`, `finished`, or `failed`.
+- The `next_execution_timestamp` is not in the future.
+
+The [`UpdateAllMirrorsWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/workers/update_all_mirrors_worker.rb) queries for all projects that meet the eligibility criteria and schedules them in batches.
+
+For each eligible project, it schedules a [`ProjectImportScheduleWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/workers/project_import_schedule_worker.rb) with the project's ID.
+
+## Stage 2: `ProjectImportScheduleWorker` - Pre-verification
+
+The [`ProjectImportScheduleWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/workers/project_import_schedule_worker.rb) performs several checks before scheduling the pull mirroring process:
+
+- Verifies if the project still exists.
+- Changes the project’s import status to `scheduled`.
+- Marks the project as `failed` if it is no longer eligible for pull mirroring.
+
+The `project.import_state.schedule` call triggers a series of actions, ultimately scheduling the [`RepositoryUpdateMirrorWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/workers/repository_update_mirror_worker.rb):
+
+1. [`import_state.schedule`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/workers/project_import_schedule_worker.rb#L29)
+1. [`ProjectImportState#after_transition`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/app/models/project_import_state.rb#L49-58)
+1. [`EE::Project#add_import_job`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/models/ee/project.rb#L778)
+1. [`RepositoryUpdateMirrorWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/workers/repository_update_mirror_worker.rb#L3)
+
+## Stage 3: `RepositoryUpdateMirrorWorker` - Start Processing
+
+The [`RepositoryUpdateMirrorWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/workers/repository_update_mirror_worker.rb) is responsible for starting the pull mirroring process:
+
+- It changes the project’s import status from `scheduled` to `started`.
+- It invokes the [`Projects::UpdateMirrorService`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/services/projects/update_mirror_service.rb) to perform the actual mirroring operations.
+
+## Stage 4: `Projects::UpdateMirrorService` - Processing
+
+The [`Projects::UpdateMirrorService`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/services/projects/update_mirror_service.rb) performs the core operations of pull mirroring:
+
+1. Verifies that the pull mirror configuration is correct and that the user has sufficient permissions.
+1. Requests Gitaly to execute a `git fetch` via the [`FetchRemoteRequest`](https://gitlab-org.gitlab.io/gitaly/#gitaly.FetchRemoteRequest).
+1. Creates any missing tags locally and removes tags that no longer exist remotely.
+1. Imports new LFS (Large File Storage) objects.
+1. Creates new branches that are missing locally.
+1. Performs housekeeping tasks (e.g., cleaning up stale data).
+
+## Stage 5: `RepositoryUpdateMirrorWorker` - Finalization
+
+After the mirroring process is completed, the [`RepositoryUpdateMirrorWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/ee/app/workers/repository_update_mirror_worker.rb) finalizes the process:
+
+- It changes the project’s import status from `started` to `finished`.
+- If any errors occur during the process, it changes the import status from `started` to `failed` and attaches an error message to the [`ProjectImportState`](https://gitlab.com/gitlab-org/gitlab/-/blob/f77763345e35c7c4c6a4320f1bd1633a392e3e7c/app/models/project_import_state.rb).
+
+## Execution graph
+
+```mermaid
+flowchart TB
+
+DB[(Database)]
+
+DB -- 1.regularly fetch mirrors to update --> UpdateAllMirrorsWorker
+UpdateAllMirrorsWorker --> DB
+UpdateAllMirrorsWorker -- 2.schedule workers for each project --> import_worker("ProjectImportScheduleWorker: start one per project")
+import_worker -- 3.verify mirror state and change status to scheduled --> import(" ProjectImportState#add_import_job")
+import -- 4.scheduled status triggers a new worker --> RepositoryUpdateMirrorWorker
+RepositoryUpdateMirrorWorker -- 5.set status to started and start processing --> Projects::UpdateMirrorService#execute
+Projects::UpdateMirrorService#execute --> final("6.set status to finished or failed")
+```
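
Note on the added page: the import status changes described in Stages 1 to 5 amount to a small state machine, and a plain-Ruby sketch may make the allowed transitions easier to review. This is only an illustration under stated assumptions. `MirrorStatusSketch`, `fire`, and the event names are hypothetical and are not taken from `ProjectImportState`; the statuses (`empty`, `scheduled`, `started`, `finished`, `failed`) are the ones named in the page. The Stage 2 path that marks an ineligible project as `failed` is left out because the page does not say which status it starts from.

```ruby
# Illustrative only: models the status transitions described in the page.
# Class, method, and event names are hypothetical, not GitLab's code.
class MirrorStatusSketch
  # Allowed transitions, keyed by a hypothetical event name.
  TRANSITIONS = {
    schedule: { from: %w[empty finished failed], to: 'scheduled' }, # Stages 1-2
    start:    { from: %w[scheduled],             to: 'started' },   # Stage 3
    finish:   { from: %w[started],               to: 'finished' },  # Stage 5
    fail:     { from: %w[started],               to: 'failed' }     # Stage 5
  }.freeze

  attr_reader :status, :last_error

  def initialize(status = 'empty')
    @status = status
    @last_error = nil
  end

  # Applies the event if the current status allows it.
  # Returns true when the transition happened, false otherwise.
  def fire(event, error: nil)
    rule = TRANSITIONS.fetch(event)
    return false unless rule[:from].include?(@status)

    @status = rule[:to]
    @last_error = error # nil unless an error message is attached
    true
  end
end

# A failed run: the error message is kept alongside the status.
mirror = MirrorStatusSketch.new
mirror.fire(:schedule)
mirror.fire(:start)
mirror.fire(:fail, error: 'remote unreachable')
puts "#{mirror.status}: #{mirror.last_error}"
```

Because `failed` and `finished` are both valid starting points for `schedule`, a project in either status can be picked up again on a later run, which matches the eligibility criteria in Stage 1.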