Ensure BulkImports::Tracker and BulkImports::Failure keep the same sharding key as BulkImports::Entity
`BulkImports::Tracker` and `BulkImports::Failure` both belong to a `BulkImports::Entity` and get their sharding key from it. This means, for example, that if a tracker is created while the entity has `organization_id` as the sharding key, it will keep that sharding key even if the entity it belongs to is later updated to use `project_id` as the sharding key.
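To make the drift concrete, here is a minimal plain-Ruby sketch. The `Entity` and `Tracker` structs and the `build_tracker` helper are hypothetical stand-ins, not the real ActiveRecord models, and they assume the tracker copies the key columns once at creation:

```ruby
# Hypothetical stand-ins for the real models; only the sharding key columns are modeled.
Entity  = Struct.new(:organization_id, :project_id, keyword_init: true)
Tracker = Struct.new(:entity, :organization_id, :project_id, keyword_init: true)

# Assumption: the tracker copies the sharding key from its entity once, at creation.
def build_tracker(entity)
  Tracker.new(entity: entity,
              organization_id: entity.organization_id,
              project_id: entity.project_id)
end

entity  = Entity.new(organization_id: 1, project_id: nil)
tracker = build_tracker(entity)

# Later in the import, the entity switches to project_id as its sharding key.
entity.organization_id = nil
entity.project_id = 42

# The tracker still carries the key it was created with.
drifted = tracker.organization_id != entity.organization_id ||
          tracker.project_id != entity.project_id
puts drifted
```

Here the tracker ends up sharded by `organization_id` while its entity is sharded by `project_id`.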
A few questions to discuss:
- Should we ensure all these records have the same sharding key?
  - Our Cells development guidelines suggest we only use `organization_id` for leaf nodes.
  - Would this mean too many tracker records will exist with `organization_id` as the sharding key?
    - We ideally want to keep this below 100k, so we don't cause issues with the org mover.
- How can we keep the sharding keys in sync?
  - Perhaps a database trigger to update the `BulkImports::Tracker` records when the sharding key of the `BulkImports::Entity` changes?
- Should we stop changing the `bulk_import_entities` sharding key throughout the process, as suggested here: #499829 (comment 2329239480)
  - That would be the simpler solution here, but would mean more records with `organization_id` as the sharding key (not just entities, but also failures and trackers).
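For the "keep in sync" option, this is a rough plain-Ruby model of what such a propagation would need to do, whether it ends up as a database trigger or application code. The structs, the `SHARDING_KEYS` list and `sync_sharding_keys` are all hypothetical names for illustration, and only the two key columns mentioned above are modeled:

```ruby
# Assumption: these are the only sharding key columns involved.
SHARDING_KEYS = %i[organization_id project_id].freeze

# Hypothetical stand-ins; Child represents a tracker or a failure record.
Entity = Struct.new(*SHARDING_KEYS, keyword_init: true)
Child  = Struct.new(:entity, *SHARDING_KEYS, keyword_init: true)

# Copy every sharding key column from the entity onto each child record,
# mimicking what a trigger on the entity table might do after an update.
def sync_sharding_keys(entity, children)
  children.each do |child|
    SHARDING_KEYS.each { |key| child[key] = entity[key] }
  end
end

entity   = Entity.new(organization_id: 1)
children = [Child.new(entity: entity, organization_id: 1)]

# The entity moves from organization_id to project_id.
entity.organization_id = nil
entity.project_id = 42
sync_sharding_keys(entity, children)

in_sync = children.all? { |c| SHARDING_KEYS.all? { |k| c[k] == entity[k] } }
puts in_sync
```

Note that the sync has to clear the old key as well as set the new one, otherwise a child record would end up with two sharding keys populated.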
@.luke and @jnutt I came across this concern whilst working on #499829 (closed). It's out of scope for that issue, but I think it's something we should discuss and decide on. Tagging you since you were both involved in that discussion. What do you think?
CC: @wortschi