Run stages in parallel - GitHub Import
#438280 (closed) investigated how we can improve GitHub Import performance by running stages in parallel.
Running stages in parallel appears to improve migration performance by ~36% with a 15k access token and by ~15% with a 5k access token.
Proposed solution
Enable GitHub Import to run stages in parallel, similar to how it was done in the POC.
Remember that in !142732 (merged), the stage order was updated to run collaborators before other stages. Stages should therefore be executed in the following order:
graph
Stage::ImportRepositoryWorker-->Stage::ImportProtectedBranchesWorker
Stage::ImportRepositoryWorker-->Stage::ImportLfsObjectsWorker
Stage::ImportRepositoryWorker-->Stage::ImportCollaboratorsWorker
Stage::ImportBaseDataWorker-->Stage::ImportReleasesAttachmentsWorker
Stage::ImportCollaboratorsWorker-->Stage::ImportBaseDataWorker
Stage::ImportBaseDataWorker-->Stage::ImportPullRequestsWorker
Stage::ImportPullRequestsWorker-->Stage::ImportDiffNotesWorker
Stage::ImportPullRequestsWorker-->Stage::ImportPullRequestsEventsWorker
Stage::ImportPullRequestsWorker-->Stage::ImportPullRequestsAttachmentsWorker
Stage::ImportPullRequestsEventsWorker-->Stage::ImportPullRequestNoteAttachmentsWorker
Stage::ImportBaseDataWorker-->Stage::ImportIssuesWorker
Stage::ImportIssuesWorker-->Stage::ImportIssueEventsWorker
Stage::ImportIssuesWorker-->Stage::ImportIssueAttachmentsWorker
Stage::ImportIssueEventsWorker-->Stage::ImportIssueNoteAttachmentsWorker
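
To make the graph above easier to reason about, here is a minimal sketch that groups the stages into "waves" of stages that could run at the same time. It is illustrative Python, not the importer's actual Ruby code: `EDGES` and `parallel_waves` are hypothetical names, and the edges are copied verbatim from the graph. It assumes each stage may start once all of its parent stages have finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (parent, child) pairs copied from the graph: child may start after parent finishes.
EDGES = [
    ("Stage::ImportRepositoryWorker", "Stage::ImportProtectedBranchesWorker"),
    ("Stage::ImportRepositoryWorker", "Stage::ImportLfsObjectsWorker"),
    ("Stage::ImportRepositoryWorker", "Stage::ImportCollaboratorsWorker"),
    ("Stage::ImportBaseDataWorker", "Stage::ImportReleasesAttachmentsWorker"),
    ("Stage::ImportCollaboratorsWorker", "Stage::ImportBaseDataWorker"),
    ("Stage::ImportBaseDataWorker", "Stage::ImportPullRequestsWorker"),
    ("Stage::ImportPullRequestsWorker", "Stage::ImportDiffNotesWorker"),
    ("Stage::ImportPullRequestsWorker", "Stage::ImportPullRequestsEventsWorker"),
    ("Stage::ImportPullRequestsWorker", "Stage::ImportPullRequestsAttachmentsWorker"),
    ("Stage::ImportPullRequestsEventsWorker", "Stage::ImportPullRequestNoteAttachmentsWorker"),
    ("Stage::ImportBaseDataWorker", "Stage::ImportIssuesWorker"),
    ("Stage::ImportIssuesWorker", "Stage::ImportIssueEventsWorker"),
    ("Stage::ImportIssuesWorker", "Stage::ImportIssueAttachmentsWorker"),
    ("Stage::ImportIssueEventsWorker", "Stage::ImportIssueNoteAttachmentsWorker"),
]


def parallel_waves(edges):
    """Return a list of waves; all parents of a stage are in earlier waves."""
    sorter = TopologicalSorter()
    for parent, child in edges:
        sorter.add(child, parent)  # register that child depends on parent
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # stages whose parents are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished, unlocking children
    return waves


WAVES = parallel_waves(EDGES)
for number, wave in enumerate(WAVES, start=1):
    print(number, ", ".join(wave))
```

For this graph the sketch yields six waves: `Stage::ImportRepositoryWorker` runs alone first, and `Stage::ImportBaseDataWorker` runs alone in the third wave after collaborators. Waves are a simplification: a real scheduler could start a stage as soon as its own parents finish, without waiting for the rest of the wave.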