Run stages in parallel - GitHub Import

#438280 (closed) investigated how we can improve GitHub Import performance by running stages in parallel.

It seems that running stages in parallel could improve migration performance by ~36% when using a 15k access token, and by ~15% when using a 5k access token.

Proposed solution

Enable GitHub Import to run stages in parallel, similar to how it was done in the POC.

Note that !142732 (merged) updated the stage order to run collaborators before the other stages, so stages should be executed in the following order:

graph

  Stage::ImportRepositoryWorker-->Stage::ImportProtectedBranchesWorker
  Stage::ImportRepositoryWorker-->Stage::ImportLfsObjectsWorker
  Stage::ImportRepositoryWorker-->Stage::ImportCollaboratorsWorker

  Stage::ImportBaseDataWorker-->Stage::ImportReleasesAttachmentsWorker

  Stage::ImportCollaboratorsWorker-->Stage::ImportBaseDataWorker

  Stage::ImportBaseDataWorker-->Stage::ImportPullRequestsWorker
  Stage::ImportPullRequestsWorker-->Stage::ImportDiffNotesWorker
  Stage::ImportPullRequestsWorker-->Stage::ImportPullRequestsEventsWorker
  Stage::ImportPullRequestsWorker-->Stage::ImportPullRequestsAttachmentsWorker
  Stage::ImportPullRequestsEventsWorker-->Stage::ImportPullRequestNoteAttachmentsWorker

  Stage::ImportBaseDataWorker-->Stage::ImportIssuesWorker
  Stage::ImportIssuesWorker-->Stage::ImportIssueEventsWorker
  Stage::ImportIssuesWorker-->Stage::ImportIssueAttachmentsWorker
  Stage::ImportIssueEventsWorker-->Stage::ImportIssueNoteAttachmentsWorker
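
To make the ordering above easier to reason about, here is a minimal Python sketch that encodes the same edges and groups the stages into "waves" that could run concurrently. It is only an illustration, not the importer's implementation: the `EDGES` list and the `parallel_waves` helper are hypothetical names, the `Stage::` prefix is omitted for brevity, and it assumes a stage may start once all of its predecessors in the graph have finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (parent, child) pairs copied from the graph above, without the "Stage::" prefix.
EDGES = [
    ("ImportRepositoryWorker", "ImportProtectedBranchesWorker"),
    ("ImportRepositoryWorker", "ImportLfsObjectsWorker"),
    ("ImportRepositoryWorker", "ImportCollaboratorsWorker"),
    ("ImportBaseDataWorker", "ImportReleasesAttachmentsWorker"),
    ("ImportCollaboratorsWorker", "ImportBaseDataWorker"),
    ("ImportBaseDataWorker", "ImportPullRequestsWorker"),
    ("ImportPullRequestsWorker", "ImportDiffNotesWorker"),
    ("ImportPullRequestsWorker", "ImportPullRequestsEventsWorker"),
    ("ImportPullRequestsWorker", "ImportPullRequestsAttachmentsWorker"),
    ("ImportPullRequestsEventsWorker", "ImportPullRequestNoteAttachmentsWorker"),
    ("ImportBaseDataWorker", "ImportIssuesWorker"),
    ("ImportIssuesWorker", "ImportIssueEventsWorker"),
    ("ImportIssuesWorker", "ImportIssueAttachmentsWorker"),
    ("ImportIssueEventsWorker", "ImportIssueNoteAttachmentsWorker"),
]


def parallel_waves(edges):
    """Group stages into waves; every stage in a wave has all predecessors done."""
    sorter = TopologicalSorter()
    for parent, child in edges:
        sorter.add(child, parent)  # child depends on parent
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # stages whose predecessors are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished, unblocking successors
    return waves


WAVES = parallel_waves(EDGES)

if __name__ == "__main__":
    for index, wave in enumerate(WAVES):
        print(f"wave {index}: {', '.join(wave)}")
```

Grouping by wave is stricter than the graph requires: in the graph, ImportBaseDataWorker only waits for ImportCollaboratorsWorker, not for the protected branches or LFS objects stages. A scheduler that starts each stage as soon as its own predecessors finish would overlap more work than this wave view suggests.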