[go: up one dir, main page]

Skip to content

SQL timeout on uploads table in file-based export

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

This is a follow-on from #504039 (closed), a catch-all issue to fix instances of query timeouts during project import/export.

SELECT
    "uploads".*
FROM
    "uploads"
WHERE
    "uploads"."model_id" = $1
    AND "uploads"."model_type" = $2
ORDER BY
    "uploads"."id" ASC
LIMIT $3

Backtrace excerpt:

[lib/gitlab/import_export/uploads_manager.rb:75:in `each_uploader',
lib/gitlab/import_export/uploads_manager.rb:52:in `copy_project_uploads',
lib/gitlab/import_export/uploads_manager.rb:20:in `save',
lib/gitlab/import_export/uploads_saver.rb:18:in `block in save',
lib/gitlab/import_export/duration_measuring.rb:15:in `block in with_duration_measuring',
benchmark (0.4.0) lib/benchmark.rb:323:in `realtime',
lib/gitlab/import_export/duration_measuring.rb:14:in `with_duration_measuring',
lib/gitlab/import_export/uploads_saver.rb:14:in `save',
app/services/projects/import_export/relation_export_service.rb:22:in `execute',
app/workers/projects/import_export/relation_export_worker.rb:46:in `perform']

This is another case of ignoring any suitable index because the ORDER_BY imposed by batching a costly pkey scan.

Limit  (cost=0.57..1000.03 rows=100 width=320)
  ->  Index Scan using uploads_pkey on uploads  (cost=0.57..14850140.30 rows=1485810 width=320)
        Filter: ((model_id = 64662867) AND ((model_type)::text = 'Project'::text))
CREATE INDEX idx_uploads_model_model_type_upload_id ON uploads(model_id, model_type, id);

The above index helps dramatically. However, the uploads table is listed in the "large tables" section of https://gitlab.com/gitlab-org/gitlab/blob/56587219c9511e965536bfc238210c9cb068ae03/rubocop/rubocop-migrations.yml#L66, so we're discouraged from adding a new index to it.

This is happening regularly for one customer who seems to have a scheduled export, and seemingly only them. Looking closer, the project in question has 1.9 million uploads, which seems like a lot! I ran some queries to see if it was something like 1.9 million orphaned export files, but it looks like they're pretty much all from the FileUploader.

Given the edge-casiness of this scenario, it's hard to justify making an exception to the policy forbidding extra indexes on the uploads table.

Edited by 🤖 GitLab Bot 🤖