Manage packages_size statistic with a counter attribute
⏳ Context
In the usage quota page, one metric taken into account is the size of the Package Registry, that is, how much storage each package is using.
This statistic is linked to the size attribute of ::Packages::PackageFile.
On master, the statistic is updated in the following way:
1. Rails callbacks are registered so that after_save or after_destroy, we take action.
2. Those callbacks will call ProjectStatistics.increment_statistic with the proper amount (which can be negative).
3. After chaining several functions, we end up using .update_counters from Rails.
Step 3 is done outside of a database transaction. The reason is the following:
@fabiopitino said:
The reason why we moved the increment in a separate transaction is because project_statistics is a highly contended table and before it was causing a lot of statement timeout errors on many concurrent updates. By moving it to a different transaction (after_commit) we separated the main transaction (add/remove model) from the side-effect (update statistics) and made the main transaction more resilient. See !20852 (merged) for context.
The problem is that, by running outside a transaction, we introduced a risk of race conditions.
Add lease to update project statistics row and ... (!97912 - merged) improved the monitoring around those statistics updates. Among other things, we now detect concurrent updates. Guess who is the main culprit here? Yes, packages_size.
Because of those concurrent updates, we are noticing a loss of accuracy in the usage quota page, where the packages_size metric is no longer the sum of all package file sizes. This is issue #363010 (closed).
🌬 The solution
The solution used in this MR is to avoid, or at least reduce, those concurrent updates. We already have a tool in place for this: CounterAttribute.
In short, a counter attribute will "stack" the counter updates in Redis and enqueue a job that will run in 10 minutes. That job will "simply" flush the stacked counter updates from Redis to the database.
This works because in Redis, we have the means to guarantee that we will never have concurrent updates.
This MR is thus as simple as moving the packages_size updates to a CounterAttribute.
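To make the "stack, then flush" idea concrete, here is a minimal sketch in plain Ruby. It is not the CounterAttribute implementation: BufferedCounter and all of its methods are hypothetical names, an instance variable stands in for the Redis key, and a Mutex stands in for the guarantee that increments never interleave.

```ruby
# Hypothetical in-memory illustration of a buffered counter.
# The real CounterAttribute concern uses Redis and a background job instead.
class BufferedCounter
  attr_reader :db_value, :db_writes

  def initialize(initial_value = 0)
    @db_value = initial_value # stands in for project_statistics.packages_size
    @db_writes = 0            # number of simulated UPDATE statements
    @pending = 0              # stands in for the Redis key
    @mutex = Mutex.new        # serializes access to the pending amount
  end

  # Stack an increment (which can be negative) without touching the "database".
  def increment(amount)
    @mutex.synchronize { @pending += amount }
  end

  def pending
    @mutex.synchronize { @pending }
  end

  # What the delayed job would do: apply the pending amount in a single
  # write, then reset the buffer.
  def flush
    amount = @mutex.synchronize do
      value = @pending
      @pending = 0
      value
    end
    return if amount.zero?

    @db_value += amount
    @db_writes += 1
  end
end

counter = BufferedCounter.new(40)
5.times { counter.increment(-8) } # five deletions of 8 bytes each
counter.flush                     # one simulated write instead of five
puts counter.db_value             # 0
puts counter.db_writes            # 1
```

The point of the sketch is only the shape of the approach: many small increments are absorbed by the buffer, and the contended row sees one write per flush.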
🤔 What does this MR do and why?
- Declare packages_size in ProjectStatistics as a counter_attribute.
- Add feature flag support when updating packages_size so we can still decide if the update is sync (old approach) or async/delayed (new approach).
  - Rollout issue: #381287 (closed)
- Update/Create related specs.
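The feature flag switch described in the list above can be pictured roughly as follows. This is only a sketch under assumptions: FLAGS is a hypothetical stand-in for the feature flag store, and StatisticsSketch with its increment_packages_size method is made up for illustration and does not mirror the actual code paths of this MR.

```ruby
# Hypothetical stand-in for the feature flag store.
FLAGS = { packages_size_counter_attribute: false }

# Hypothetical stand-in for a project statistics row.
class StatisticsSketch
  attr_reader :packages_size, :pending

  def initialize
    @packages_size = 0 # value stored in the "database"
    @pending = 0       # amount waiting for the delayed flush
  end

  def increment_packages_size(amount)
    if FLAGS[:packages_size_counter_attribute]
      @pending += amount       # async/delayed (new approach)
    else
      @packages_size += amount # sync (old approach)
    end
  end
end

stats = StatisticsSketch.new
stats.increment_packages_size(8)              # flag disabled: applied right away
FLAGS[:packages_size_counter_attribute] = true
stats.increment_packages_size(8)              # flag enabled: only buffered
puts stats.packages_size                      # 8
puts stats.pending                            # 8
```

Keeping both branches behind a flag means the old behavior stays one toggle away during the rollout.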
📺 Screenshots or screen recordings
None
⚗ How to set up and validate locally
- Have GDK ready with one project and a Personal Access Token.
- To keep things simple, we're going to use the generic package registry. With a terminal, let's create 5 packages:
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic1/1.1.2/file.txt"
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic2/1.1.2/file.txt"
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic3/1.1.2/file.txt"
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic4/1.1.2/file.txt"
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic5/1.1.2/file.txt"
- Now check the project usage quota page http://gdk.test:8000/<project full path>/-/usage_quotas.
Everything is set up properly.
Let's have a run without the feature flag enabled.
- Delete all packages in a rails console:
Project.last.packages.destroy_all
- In the rails console, you should see these SQL queries:
ProjectStatistics Update All (1.1ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
ProjectStatistics Update All (0.8ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
ProjectStatistics Update All (0.5ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
ProjectStatistics Update All (0.5ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
ProjectStatistics Update All (0.5ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
- Check the usage quota page again, it's down to 0 bytes.
Ok, that's the "synchronous" packages_size updates.
Let's re-upload 5 packages (step 2 from our setup above). Let's enable the feature flag now:
Feature.enable(:packages_size_counter_attribute)
- (Make sure that you have background jobs running!)
- Delete all packages in a rails console:
Project.last.packages.destroy_all
- This time around, no SQL updates on project_statistics.
- While we wait for the flush job to kick in (10 minutes), we can check that we have the updates in Redis:
ps = Project.last.statistics
key = ps.counter_key(:packages_size)
Gitlab::Redis::SharedState.with { |r| r.get(key) }
=> "-40"
  - This means that our -40 update to packages_size is waiting for the job. Note that while in this waiting state, any update on packages_size will affect this Redis key (e.g. if we upload a file of 100 bytes, that key will get updated to 60 (-40 + 100)).
- Also while waiting, check the usage quota page. It still shows 40 bytes.
- (After waiting 10 minutes) The job runs and the packages_size is updated accordingly. The Redis key is gone (nil) and the usage quota page is updated accordingly.
  - Even if this is a very small example, we just combined 5 UPDATE statements into a single one, which lowers the chances of having concurrent updates.
Async packages_size updates are working properly!
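The numbers from the walkthrough above can be traced with a few lines of plain Ruby. This is a sketch only: a Hash plays the role of Redis, an Integer plays the role of the packages_size column, and the key name is made up (the real key comes from counter_key).

```ruby
# Hypothetical stand-ins for Redis and the database column.
redis = {}
key = "hypothetical:packages_size"
packages_size = 40 # 5 files of 8 bytes each

# Deleting the 5 packages stacks five -8 increments in "Redis".
5.times { redis[key] = redis.fetch(key, 0) - 8 }
after_deletes = redis[key] # -40, while the page still shows 40 bytes

# Uploading a 100-byte file before the flush moves the same key.
redis[key] = redis.fetch(key, 0) + 100
after_upload = redis[key] # 60, i.e. -40 + 100

# The flush applies the pending amount once and removes the key.
packages_size += redis.delete(key)
puts packages_size      # 100
puts redis[key].inspect # nil
```

In this variant of the scenario, six logical changes reach the column as a single write of +60.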
🚦 MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
