Ensure BulkImports::Tracker and BulkImports::Failure keep the same sharding key as BulkImports::Entity

BulkImports::Tracker and BulkImports::Failure both belong to a BulkImports::Entity and get their sharding key from it. This means, for example, that a tracker created while the entity has organization_id as its sharding key keeps that sharding key, even if the entity it belongs to is later updated to use project_id as its sharding key.

A few questions to discuss:

  • Should we ensure all these records have the same sharding key?
  • Would this mean too many tracker records will exist with organization_id as the sharding key?
    • We ideally want to keep this below 100k, so we don't cause issues with the org mover.
  • How can we keep the sharding keys in sync?
    • Perhaps a database trigger that updates the BulkImports::Tracker records when the sharding key of their BulkImports::Entity changes?
  • Should we stop changing the bulk_import_entities sharding key throughout the process, as suggested here: #499829 (comment 2329239480)?
    • That would be the simpler solution here, but it would mean more records with organization_id as the sharding key (not just entities, but also failures and trackers).
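
To make the trigger idea above concrete, here is a minimal sketch of how a database trigger could propagate a sharding key change from an entity to its trackers. It is only an illustration: it uses an in-memory SQLite database from Python rather than our PostgreSQL setup, and the table names (`entities`, `trackers`), the trigger name, and the sample IDs are simplified assumptions, not the real schema.

```python
import sqlite3


def build_demo_db():
    """Create simplified, hypothetical entity/tracker tables plus a sync trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE entities (
        id INTEGER PRIMARY KEY,
        organization_id INTEGER,
        project_id INTEGER
    );
    CREATE TABLE trackers (
        id INTEGER PRIMARY KEY,
        entity_id INTEGER NOT NULL REFERENCES entities(id),
        organization_id INTEGER,
        project_id INTEGER
    );
    -- Whenever an entity's sharding key columns change,
    -- copy the new values to every tracker that belongs to it.
    CREATE TRIGGER sync_tracker_sharding_key
    AFTER UPDATE OF organization_id, project_id ON entities
    BEGIN
        UPDATE trackers
        SET organization_id = NEW.organization_id,
            project_id = NEW.project_id
        WHERE entity_id = NEW.id;
    END;
    """)
    return conn


conn = build_demo_db()

# The entity and its tracker start out sharded by organization_id.
conn.execute("INSERT INTO entities (id, organization_id) VALUES (1, 42)")
conn.execute(
    "INSERT INTO trackers (id, entity_id, organization_id) VALUES (10, 1, 42)"
)

# The entity later switches to project_id as its sharding key.
conn.execute(
    "UPDATE entities SET organization_id = NULL, project_id = 7 WHERE id = 1"
)

# The trigger has moved the tracker to the same sharding key.
tracker_key = conn.execute(
    "SELECT organization_id, project_id FROM trackers WHERE id = 10"
).fetchone()
print(tracker_key)
```

A real implementation would presumably need the same kind of trigger for the failures table, and we would have to consider the cost of updating many tracker rows in one statement when an entity changes.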

@.luke and @jnutt I came across this concern whilst working on #499829 (closed). It's out of scope for that, but I think it's something we should discuss and decide on. I'm tagging you since you were both involved in that discussion. What do you think?

CC: @wortschi
