Consider breaking down project import/export into smaller transactions

Summary

Investigate whether we hold a single transaction open while we import (or export) an entire project. If so, consider breaking the work into smaller transactions, for example importing N issues per transaction.
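To make the "N issues per transaction" idea concrete, here is a minimal sketch in plain Ruby. `FakeDatabase` and `import_in_batches` are hypothetical names invented for illustration and are not part of the Import/Export code; in a Rails application the `db.transaction` call would correspond to something like `ActiveRecord::Base.transaction`.

```ruby
# Hypothetical stand-in for a database connection. It records how many rows
# each committed transaction contained, so the batching is observable.
class FakeDatabase
  attr_reader :rows, :transaction_sizes

  def initialize
    @rows = []
    @transaction_sizes = []
  end

  # Yields a buffer; rows are "committed" only if the block finishes
  # without raising, mimicking all-or-nothing transaction semantics.
  def transaction
    buffer = []
    yield buffer
    @rows.concat(buffer)
    @transaction_sizes << buffer.size
  end
end

# Imports records in slices of batch_size, opening one transaction per slice
# instead of one transaction for the whole collection.
def import_in_batches(db, records, batch_size:)
  records.each_slice(batch_size) do |batch|
    db.transaction do |tx|
      batch.each { |record| tx << record }
    end
  end
end

db = FakeDatabase.new
issues = (1..7).map { |i| { title: "Issue #{i}" } }
import_in_batches(db, issues, batch_size: 3)
p db.transaction_sizes # => [3, 3, 1]
```

The trade-off is that a failure partway through leaves earlier batches committed, so the import would need to be resumable or cleaned up on error.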

Additional background

@georgekoltsov (this comment):

We explicitly start DB transactions in Import/Export in 2 places (here & here), but I don't think these contribute to long-running open transactions.
I think the main reason for long transactions is how Import works: it saves each object together with all of its nested objects.
For example, if you import 1 issue that has 2000 notes and 2000 emojis associated with it, the object that gets persisted in the database would look like this:
object = Issue.new(title: ..., project_id: id, notes: [Note.new(..., award_emoji: [AwardEmoji.new(....)]), Note.new(..., award_emoji: [AwardEmoji.new(....)]), Note.new(..., award_emoji: [AwardEmoji.new(....)])])
When such an object receives save!, a transaction is started, all nested relations are processed and validated, and only then is the commit performed. At least that is what I saw when testing locally by creating an issue with a large number of new labels associated with it.
This importing behaviour is fundamental to Import, and it mirrors how relations are exported on the Export side. Top-level relations like issues/MRs/milestones have been split into separate ndjson files, with 1 line representing 1 object; however, nested relations are still nested within those objects.
To reduce how long transactions stay open, we need to split the creation of top-level relations from the creation of their nested associations. Because there can be a virtually unlimited number of nested subrelations, the problem can occur at any level of nesting (e.g. if a single note in a single issue has 2000 emojis), which makes implementing something like this even trickier. I don't currently see easy wins here that don't involve completely changing the way we export and import JSON 🤔
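One possible shape for splitting top-level relations from their nested associations is to flatten each nested object into a list of rows before saving, so that the rows can be written in bounded batches regardless of nesting depth. The sketch below is plain Ruby operating on hashes; `FlatRow` and `flatten_relation` are hypothetical names, and treating every array of hashes as a nested relation is an assumption made for this example, not how Import/Export currently works.

```ruby
# Hypothetical row produced by flattening: the relation name, the object's own
# attributes, and the index of its parent row (nil for the top-level object).
FlatRow = Struct.new(:relation, :attributes, :parent_index)

# Walks a nested hash breadth-first using an explicit queue, so any depth or
# fan-out is handled without recursion. Array values whose elements are all
# hashes are treated as nested relations; everything else is an attribute.
def flatten_relation(relation, object)
  rows = []
  queue = [[relation, object, nil]]

  until queue.empty?
    name, hash, parent_index = queue.shift
    nested, attributes = hash.partition do |_, value|
      value.is_a?(Array) && value.all? { |element| element.is_a?(Hash) }
    end

    rows << FlatRow.new(name, attributes.to_h, parent_index)
    index = rows.size - 1

    nested.each do |child_name, children|
      children.each { |child| queue << [child_name, child, index] }
    end
  end

  rows
end

issue = {
  title: "Example",
  notes: [
    { note: "first", award_emoji: [{ name: "thumbsup" }, { name: "tada" }] },
    { note: "second", award_emoji: [] }
  ]
}

rows = flatten_relation(:issues, issue)

# Each slice could be saved in its own short transaction.
rows.each_slice(2) { |batch| p batch.map(&:relation) }
```

Because the walk is breadth-first, a parent row always appears before its children, so a real implementation could persist each batch in its own transaction and resolve `parent_index` to the parent's database ID. Validations that span parent and child records would need separate handling, which is part of why this is not an easy win.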

Proposal

As a first step, focus on transactions that are critical for Group Migrations.