Cells: Classify: Make the uploads table attributable to an org
Problem
The uploads table holds a record of every file uploaded to GitLab. This table is attached to many models (users, projects, groups, etc.).
Because of that, the table cannot be clearly classified as either cluster-wide or cell-local.
The problem was investigated earlier in [Feature] Cells 1.0 impact for file uploads (#443573 - closed).
Geo
The same applies to upload_states, which Geo uses to track uploaded records that need verification.
Dependencies
The tables backing the models that use uploads need to have their own sharding keys first, so that uploads can reuse them:
- abuse_reports (!210550 (merged))
- achievements
- ai_vectorizable_files
- alert_management_alert_metric_images
- appearances
- bulk_import_export_uploads
- dependency_list_export_parts
- dependency_list_exports
- design_management_designs_versions
- import_export_uploads
- issuable_metric_images
- namespaces
- organization_details
- project_relation_export_uploads
- topics
- projects
- snippets
- user_permission_export_uploads
- users
- vulnerability_archive_exports
- vulnerability_export_parts
- vulnerability_exports
- vulnerability_remediations
https://docs.google.com/spreadsheets/d/19CcPaUGxOaT1rwjSdRvLkhu_-91RUBOdjDFGVxOonVs/edit?usp=sharing
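
The dependency described above can be sketched as follows: an upload can only receive a sharding key if the table backing its parent model already has one. This is a minimal illustration, not GitLab code. The mapping values, the `model_table` field, and both helper functions are hypothetical and chosen only for the example.

```python
from typing import Optional, Tuple

# Hypothetical mapping: sharding key column of each parent table.
# None means the parent table has no sharding key yet (assumed state).
SHARDING_KEY_COLUMN = {
    "abuse_reports": "organization_id",
    "achievements": "namespace_id",
    "topics": None,
}


def blocked_tables(key_columns: dict) -> list:
    """Return the parent tables that still lack a sharding key, sorted by name."""
    return sorted(table for table, column in key_columns.items() if column is None)


def inherit_sharding_key(upload: dict, parent_rows: dict,
                         key_columns: dict) -> Optional[Tuple[str, int]]:
    """Copy the sharding key from the upload's parent row.

    Returns (column, value), or None when the parent table has no sharding
    key yet, the parent row is missing (orphaned upload), or the key is unset.
    """
    table = upload["model_table"]
    column = key_columns.get(table)
    if column is None:
        return None
    parent = parent_rows.get((table, upload["model_id"]))
    if parent is None:
        return None
    value = parent.get(column)
    if value is None:
        return None
    return (column, value)
```

Under these assumptions, a backfill of uploads is only safe once `blocked_tables` returns an empty list, and any upload for which `inherit_sharding_key` returns `None` after that point is an orphan.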
Solution
We should introduce a new table that is clearly either cluster-wide or cell-local, and split the existing table into two tables that each have a clear purpose.
Proposal
Based on the discussion in #398199 (comment 2101029924).
- Milestone 17.7:
  - Add new sharding key columns to uploads (!168003 (merged))
  - Update the app to populate sharding key columns for new uploads when available (!168003 (merged))
- Milestone 17.11:
  - Create new uploads_9ba88c4165 table (like uploads) partitioned by model_type, mark it as exempt_from_sharding: true (!175203 (merged))
  - Create partition for each model_type in the public schema (!175203 (merged))
  - For each partition create FK referencing the sharding key table (!175203 (merged))
  - Start syncing uploads -> uploads_9ba88c4165 (!175203 (merged))
- Milestone 18.2 (required stop):
  - Backfill uploads_9ba88c4165 when every related model has its sharding key ready (!181349 (merged))
- Milestone 18.3:
  - Finalize back-fill migration (!198033 (merged))
- Milestone %18.5:
  - Clean up note_uploads (no longer needed after !185893 (merged)) (!206764 (merged))
- Milestone %18.6:
  - NOT NULL constraint on appearance_uploads -> !209290 (merged)
- Milestone %18.7 (work on all dependencies is completed):
  - Add database triggers for all partitions to set sharding key if missing (!208858 (merged))
  - Truncate partitions (to remove orphaned uploads) and create NOT NULL constraint for each partition (!213237 (merged))
  - Re-run back-fill (updated to set new sharding keys) (!214675)
  - Shard uploads_9ba88c4165 (!215829)
- Milestone %18.9 (after a required stop):
  - Finalize back-fill
  - Switch the app to use the new partitioned table by swapping the table names
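
The overall sequence in the proposal (start syncing, backfill, swap table names) can be sketched with a small self-contained simulation. This is only an illustration under stated assumptions: SQLite stands in for PostgreSQL, so partitioning and foreign keys are not shown; the sync is assumed to be trigger-based; and the column set, the trigger names, the `uploads_archived` name, and the model_type values are all hypothetical.

```python
import sqlite3

COLUMNS = "id INTEGER PRIMARY KEY, model_type TEXT NOT NULL, path TEXT NOT NULL"


def start_sync(conn):
    """Create the new table and triggers that mirror later writes to uploads."""
    conn.executescript(f"""
        CREATE TABLE uploads_9ba88c4165 ({COLUMNS});
        CREATE TRIGGER sync_insert AFTER INSERT ON uploads BEGIN
            INSERT OR REPLACE INTO uploads_9ba88c4165 (id, model_type, path)
            VALUES (NEW.id, NEW.model_type, NEW.path);
        END;
        CREATE TRIGGER sync_update AFTER UPDATE ON uploads BEGIN
            INSERT OR REPLACE INTO uploads_9ba88c4165 (id, model_type, path)
            VALUES (NEW.id, NEW.model_type, NEW.path);
        END;
        CREATE TRIGGER sync_delete AFTER DELETE ON uploads BEGIN
            DELETE FROM uploads_9ba88c4165 WHERE id = OLD.id;
        END;
    """)


def backfill(conn, batch_size=2):
    """Copy pre-existing rows in id-ordered batches, skipping rows already synced."""
    last_id = 0
    while True:
        ids = conn.execute(
            "SELECT id FROM uploads WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not ids:
            break
        upper = ids[-1][0]
        conn.execute(
            "INSERT OR IGNORE INTO uploads_9ba88c4165 (id, model_type, path) "
            "SELECT id, model_type, path FROM uploads WHERE id > ? AND id <= ?",
            (last_id, upper),
        )
        last_id = upper
    conn.commit()


def swap(conn):
    """Drop the sync triggers and swap the table names in one transaction."""
    conn.executescript("""
        BEGIN;
        DROP TRIGGER sync_insert;
        DROP TRIGGER sync_update;
        DROP TRIGGER sync_delete;
        ALTER TABLE uploads RENAME TO uploads_archived;
        ALTER TABLE uploads_9ba88c4165 RENAME TO uploads;
        COMMIT;
    """)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE uploads ({COLUMNS})")
    # Rows that exist before syncing starts; only the backfill can copy them.
    conn.executemany(
        "INSERT INTO uploads (model_type, path) VALUES (?, ?)",
        [("Project", "a.png"), ("User", "b.png"), ("Project", "c.png")],
    )
    start_sync(conn)
    # Writes after syncing starts are mirrored by the triggers.
    conn.execute("INSERT INTO uploads (model_type, path) VALUES ('Group', 'd.png')")
    conn.execute("DELETE FROM uploads WHERE id = 2")
    backfill(conn)
    swap(conn)
    return conn.execute("SELECT id, path FROM uploads ORDER BY id").fetchall()
```

The ordering matters in this sketch: syncing starts before the backfill so that rows written during the backfill are not lost, and the backfill ignores rows the triggers have already copied. The swap happens only after both are complete, which mirrors why the proposal places it after a required stop.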