After supporting compression with zip: read individual exported files directly from ObjectStorage

As discussed in POC #376261 (closed), GitLab Migration should be able to read individual files of the exported .zip relations directly from Object Storage instead of downloading the entire zip file.

This allows us to decrease disk usage. If a zip file contains a collection of large files (e.g. LFS objects), we can import them one by one instead of downloading all of them at once.

Proposed solutions

  1. Use gitlab-zip-cat to read individual files from the .zip stored in Object Storage. This solution has a few unknowns:
     • Is the gitlab-zip-cat Workhorse binary available at the same location in all types of GitLab distributions?
     • Using the gitlab-zip-cat binary requires making a system call from Ruby, which is not ideal since it increases the attack surface available to a malicious user.
  2. Implement the gitlab-zip-cat functionality in Ruby: use httpio to seek to the specified file and decompress its contents on the fly.
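
To make the pure-Ruby option more concrete, here is a minimal sketch of reading one entry from a zip through ranged reads. It is an illustration under stated assumptions, not the proposed implementation: the class name RemoteZipEntryReader and the read_range callable are hypothetical, and read_range stands in for whatever httpio (or a similar HTTP Range client) would provide. The sketch ignores ZIP64, encryption and multi-disk archives, and only handles the stored and deflate compression methods.

```ruby
require "zlib"

# Hypothetical sketch, not GitLab code. `read_range` is any callable taking
# (offset, length) and returning that byte range of the archive as a String;
# against Object Storage it would issue an HTTP Range request.
# Assumptions: no ZIP64, no encryption, single-disk archive.
class RemoteZipEntryReader
  EOCD_SIGNATURE = [0x06054b50].pack("V") # end of central directory record
  CDH_SIGNATURE  = [0x02014b50].pack("V") # central directory file header
  EOCD_MAX_SIZE  = 22 + 65_535            # fixed part plus longest comment

  def initialize(archive_size, read_range)
    @archive_size = archive_size
    @read_range = read_range
  end

  # Yields the decompressed contents of entry `name` chunk by chunk.
  def each_chunk(name, chunk_size: 64 * 1024)
    entry = entries.fetch(name) # raises KeyError for unknown entries
    # Name/extra lengths in the local header may differ from the central
    # directory, so re-read them to find where the data starts.
    name_len, extra_len = bytes(entry[:offset] + 26, 4).unpack("vv")
    position = entry[:offset] + 30 + name_len + extra_len
    remaining = entry[:compressed_size]

    inflater = case entry[:method]
               when 0 then nil # stored, no compression
               when 8 then Zlib::Inflate.new(-Zlib::MAX_WBITS) # raw deflate
               else raise NotImplementedError, "method #{entry[:method]}"
               end

    while remaining > 0
      length = [chunk_size, remaining].min
      raw = bytes(position, length)
      out = inflater ? inflater.inflate(raw) : raw
      yield out unless out.empty?
      position += length
      remaining -= length
    end
  ensure
    inflater&.close
  end

  private

  # Always work on binary strings so indexes are byte offsets.
  def bytes(offset, length)
    @read_range.call(offset, length).b
  end

  # Maps entry name => { method, compressed_size, offset }, read lazily
  # from the central directory at the end of the archive.
  def entries
    @entries ||= begin
      tail_size = [@archive_size, EOCD_MAX_SIZE].min
      tail = bytes(@archive_size - tail_size, tail_size)
      eocd = tail.rindex(EOCD_SIGNATURE)
      raise ArgumentError, "not a zip archive" unless eocd

      count, cd_size, cd_offset = tail[eocd + 10, 10].unpack("vVV")
      parse_central_directory(bytes(cd_offset, cd_size), count)
    end
  end

  def parse_central_directory(cd, count)
    pos = 0
    count.times.each_with_object({}) do |_, result|
      raise ArgumentError, "bad central directory" unless cd[pos, 4] == CDH_SIGNATURE

      name_len, extra_len, comment_len = cd[pos + 28, 6].unpack("vvv")
      name = cd[pos + 46, name_len].force_encoding(Encoding::UTF_8)
      result[name] = {
        method: cd[pos + 10, 2].unpack1("v"),
        compressed_size: cd[pos + 20, 4].unpack1("V"),
        offset: cd[pos + 42, 4].unpack1("V")
      }
      pos += 46 + name_len + extra_len + comment_len
    end
  end
end
```

With this shape, only the tail of the archive, the central directory and the byte range of the requested entry are fetched, and the decompressed data is handed to the caller in chunks rather than written to disk as a whole. A real implementation would also need to decide how to handle ZIP64 archives, CRC verification and retries of failed range requests.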

All the zip work needs to be done before this, and the zip work itself should wait for the batches.