Geo: Implement LFS object replication for organization migration
Problem
As part of organization migration from Legacy Cell to Protocell, the Geo team needs to implement special handling for LFS object replication. The `lfs_objects` table is now marked as cell-local and won't be replicated directly, requiring custom logic to ensure LFS objects are properly available on the target cell.
Background
The overall LFS migration strategy involves:
- ✅ Copy `oid` from `lfs_objects` to `lfs_object_projects` (group::source code, #490482)
- ✅ Mark the `lfs_objects` table as cell-local (completed)
- 🔄 Migrate `lfs_object_projects` rows and nullify `lfs_object_id` (Geo team - this issue)
- 🔄 Ensure LFS objects exist on the target cell and link them (Geo team - this issue)
- ✅ Recycle unused `lfs_objects` on the source cell (existing functionality)
Tasks for Geo Team
Task 1: Migrate lfs_object_projects without lfs_object_id during PG data replication
When migrating an organization:
- Migrate `lfs_object_projects` rows for the organization (as normal)
- Nullify (or otherwise exclude from replication) the `lfs_object_id` reference during migration (this is the special part)
- Do NOT migrate `lfs_objects` rows (as usual for cell-local tables)
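
To make the Task 1 behavior concrete, here is a minimal Python sketch of the row transformation, not the actual replication mechanism. The helper name `rows_for_replication` and the `organization_id` and `project_id` columns are assumptions made for illustration; only `oid` and `lfs_object_id` are named in this issue.

```python
from typing import Iterable, Iterator


def rows_for_replication(rows: Iterable[dict], organization_id: int) -> Iterator[dict]:
    """Yield copies of one organization's lfs_object_projects rows
    with the cell-local lfs_object_id reference nullified."""
    for row in rows:
        # Hypothetical filter: assumes each row carries an organization_id.
        if row["organization_id"] != organization_id:
            continue
        migrated = dict(row)  # copy so the source row is left untouched
        migrated["lfs_object_id"] = None  # the special part of Task 1
        yield migrated


# Illustrative source-cell rows (made-up values).
source_rows = [
    {"id": 1, "organization_id": 10, "project_id": 100, "oid": "aaa", "lfs_object_id": 7},
    {"id": 2, "organization_id": 20, "project_id": 200, "oid": "bbb", "lfs_object_id": 8},
]

migrated_rows = list(rows_for_replication(source_rows, organization_id=10))
```

The `oid` is carried over unchanged, which is what lets the target cell re-establish the link in Task 2.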
Task 2: Implement LFS object deduplication on target cell
After migrating `lfs_object_projects` rows:
- For each row in `lfs_object_projects WHERE lfs_object_id IS NULL`:
  - Upsert an `lfs_objects` row with the `oid`, returning the ID (atomic to avoid race conditions)
  - Set the `lfs_object_projects` `lfs_object_id` field
  - Insert `lfs_object_registry` to make the rest of Geo do its thing
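
The Task 2 loop can be sketched as follows, using Python's built-in `sqlite3` module and an in-memory database so that it runs standalone. This is an illustration under stated assumptions, not the Geo implementation: the table schemas are simplified guesses, the function name `link_lfs_objects` is hypothetical, and `INSERT OR IGNORE` followed by a `SELECT` stands in for the single-statement upsert that PostgreSQL allows with `INSERT ... ON CONFLICT ... RETURNING id`.

```python
import sqlite3


def link_lfs_objects(conn: sqlite3.Connection) -> int:
    """Link every lfs_object_projects row that has no lfs_object_id
    to an lfs_objects row with the same oid. Returns the rows linked."""
    pending = conn.execute(
        "SELECT id, oid FROM lfs_object_projects WHERE lfs_object_id IS NULL"
    ).fetchall()
    for row_id, oid in pending:
        with conn:  # one transaction per row: commit on success, roll back on error
            # Upsert stand-in: the UNIQUE constraint on oid makes this a no-op
            # when another row (or worker) already created the object.
            conn.execute("INSERT OR IGNORE INTO lfs_objects (oid) VALUES (?)", (oid,))
            (lfs_object_id,) = conn.execute(
                "SELECT id FROM lfs_objects WHERE oid = ?", (oid,)
            ).fetchone()
            conn.execute(
                "UPDATE lfs_object_projects SET lfs_object_id = ? WHERE id = ?",
                (lfs_object_id, row_id),
            )
            # Registry row so the object gets picked up for syncing
            # (assumed schema: one registry row per lfs_object_id).
            conn.execute(
                "INSERT OR IGNORE INTO lfs_object_registry (lfs_object_id) VALUES (?)",
                (lfs_object_id,),
            )
    return len(pending)


# Simplified, assumed schemas and made-up rows for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lfs_objects (id INTEGER PRIMARY KEY, oid TEXT NOT NULL UNIQUE);
CREATE TABLE lfs_object_projects (
    id INTEGER PRIMARY KEY, project_id INTEGER, oid TEXT NOT NULL, lfs_object_id INTEGER);
CREATE TABLE lfs_object_registry (
    id INTEGER PRIMARY KEY, lfs_object_id INTEGER NOT NULL UNIQUE);
INSERT INTO lfs_object_projects (project_id, oid)
VALUES (100, 'aaa'), (101, 'aaa'), (102, 'bbb');
""")

linked_count = link_lfs_objects(conn)
```

Two projects sharing the `oid` `'aaa'` end up pointing at one `lfs_objects` row, which is the deduplication this task asks for. The uniqueness constraint on `oid` is what makes the upsert safe against concurrent inserts.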
Related
- Blocked by Add sharding key for `lfs_objects` (#490482) because `lfs_object_projects.oid` needs to be backfilled and maintained in order for us to do this
- Somewhat blocked by Organization migration: Replicate PostgreSQL data (&18462), or at least some kind of Postgres replication for cells in GDK POC of Geo Protocell Mode (#571916), because we need to do Task 1 on top of Postgres replication.
Edited by Michael Kozono