[go: up one dir, main page]

Skip to content

DeleteDiffFilesWorker doesn't work as expected for diffs in external storage

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

Summary

This worker wipes MergeRequestDiffFile records from a MergeRequestDiff in some circumstances. However, it was introduced prior to external merge request diff storage.

When it is given a merge request diff that is stored externally, it removes the MergeRequestDiffFile rows, but it should also ensure that the file associated with MergeRequestDiff#external_diff is deleted, and that stored_externally and external_diff_store are set to false and LOCAL, respectively.

Steps to reproduce

  • Run DeleteDiffFilesWorker against any merge request diff sat in object storage
  • Note that the external diff file is still accessible

What is the current bug behavior?

Data attached to MergeRequestDiff is not cleaned as expected

What is the expected correct behavior?

Data should be removed

Output of checks

This bug happens on GitLab.com

Possible fixes

The actual fix isn't too bad - we should move the logic inside this worker to the clean! state transition for the MergeRequestDiff, and have it take account of the

What to do about pre-existing data is a harder question. When the MergeRequestDiff record is removed (say, along with the project), the associated file is destroyed, so it's not technically a resource leak. However, I'd feel better if we scanned through all MergeRequestDiff records with state: without_file, stored_externally: true and cleaned things up eagerly, to free up object storage space.

Edited by 🤖 GitLab Bot 🤖