SQL timeout on uploads
table in file-based export
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
This is a follow-on from #504039 (closed), a catch-all issue to fix instances of query timeouts during project import/export.
SELECT
"uploads".*
FROM
"uploads"
WHERE
"uploads"."model_id" = $1
AND "uploads"."model_type" = $2
ORDER BY
"uploads"."id" ASC
LIMIT $3
Backtrace excerpt:
[lib/gitlab/import_export/uploads_manager.rb:75:in `each_uploader',
lib/gitlab/import_export/uploads_manager.rb:52:in `copy_project_uploads',
lib/gitlab/import_export/uploads_manager.rb:20:in `save',
lib/gitlab/import_export/uploads_saver.rb:18:in `block in save',
lib/gitlab/import_export/duration_measuring.rb:15:in `block in with_duration_measuring',
benchmark (0.4.0) lib/benchmark.rb:323:in `realtime',
lib/gitlab/import_export/duration_measuring.rb:14:in `with_duration_measuring',
lib/gitlab/import_export/uploads_saver.rb:14:in `save',
app/services/projects/import_export/relation_export_service.rb:22:in `execute',
app/workers/projects/import_export/relation_export_worker.rb:46:in `perform']
This is another case of ignoring any suitable index because the ORDER_BY
imposed by batching a costly pkey scan.
Limit (cost=0.57..1000.03 rows=100 width=320)
-> Index Scan using uploads_pkey on uploads (cost=0.57..14850140.30 rows=1485810 width=320)
Filter: ((model_id = 64662867) AND ((model_type)::text = 'Project'::text))
CREATE INDEX idx_uploads_model_model_type_upload_id ON uploads(model_id, model_type, id);
The above index helps dramatically. However, the uploads
table is listed in the "large tables" section of https://gitlab.com/gitlab-org/gitlab/blob/56587219c9511e965536bfc238210c9cb068ae03/rubocop/rubocop-migrations.yml#L66, so we're discouraged from adding a new index to it.
This is happening regularly for one customer who seems to have a scheduled export, and seemingly only them. Looking closer, the project in question has 1.9 million uploads, which seems like a lot! I ran some queries to see if it was something like 1.9 million orphaned export files, but it looks like they're pretty much all from the FileUploader
.
Given the edge-casiness of this scenario, it's hard to justify making an exception to the policy forbidding extra indexes on the uploads
table.