Improve PostReceive performance by routing more queries to replicas

In #402254 (closed), we enabled data_consistency: sticky for PostReceive, which directs all read queries to replicas until a write is made, at which point queries switch to the primary.

This gave a noticeable performance boost; however, most queries are still going to the primary.

In the spike #498229 (closed), we confirmed that this is because a write is made relatively early in PostReceive, so we aren't taking advantage of replicas as much as we could.
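To make the effect of an early write concrete, here is a minimal toy model of sticky routing. It is not GitLab's load balancer: StickySession, run_job, and the step names are all invented for illustration, and the only behaviour assumed is the one described above (reads go to a replica until the first write, then everything goes to the primary).

```ruby
# Toy model of sticky routing. This is NOT GitLab's load balancer;
# StickySession, run_job and the step names are hypothetical.
class StickySession
  attr_reader :log

  def initialize
    @stuck_to_primary = false
    @log = []
  end

  # Reads use a replica until the session has performed a write.
  def read(name)
    @log << [name, @stuck_to_primary ? :primary : :replica]
  end

  # The first write sticks the rest of the session to the primary.
  def write(name)
    @stuck_to_primary = true
    @log << [name, :primary]
  end

  def replica_reads
    @log.count { |_name, host| host == :replica }
  end
end

# Replays a list of [:read | :write, name] steps against a fresh session.
def run_job(steps)
  session = StickySession.new
  steps.each { |kind, name| session.public_send(kind, name) }
  session
end

early_write = [
  [:read,  'load project'],
  [:write, 'create push event'],
  [:read,  'load branch'],
  [:read,  'load merge requests']
]

session = run_job(early_write)
session.log.each { |name, host| puts "#{name} -> #{host}" }
```

In this model only the first read reaches a replica; every read after the write is served by the primary, which matches the pattern the spike found.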

Here are a few options we could implement to further improve PostReceive's performance.

Move events and pipelines writes to the end of PostReceive

Moving writes towards the end of the job means more reads can go to the replicas before the job sticks to the primary.
We would need to refactor some of PostReceive, ProcessRefChangesService, BranchPushService, and BranchHooksService.
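One possible shape for this refactor is to collect writes as the services run and flush them in one go at the end of the job. The sketch below is an assumption about how that could look, not existing GitLab code: ToySession, DeferredWrites, and process_push are hypothetical names, and the session is the same kind of toy sticky model as described above.

```ruby
# Hypothetical sketch: none of these classes exist in GitLab.
class ToySession
  attr_reader :replica_reads, :primary_queries

  def initialize
    @stuck_to_primary = false
    @replica_reads = 0
    @primary_queries = 0
  end

  def read
    if @stuck_to_primary
      @primary_queries += 1
    else
      @replica_reads += 1
    end
  end

  def write
    @stuck_to_primary = true
    @primary_queries += 1
  end
end

# Collects write operations so they can run after all reads are done.
class DeferredWrites
  def initialize
    @pending = []
  end

  def defer(&block)
    @pending << block
  end

  def flush
    @pending.each(&:call)
    @pending.clear
  end
end

def process_push(session, writes)
  session.read                    # e.g. load the project
  writes.defer { session.write }  # event creation, queued instead of run inline
  session.read                    # e.g. load the branch
  writes.defer { session.write }  # pipeline creation, also queued
  session.read                    # e.g. load merge requests
  writes.flush                    # all writes happen here, at the end
end

session = ToySession.new
process_push(session, DeferredWrites.new)
puts "replica reads: #{session.replica_reads}, primary queries: #{session.primary_queries}"
```

With the writes flushed last, all three reads in this toy job are served by replicas. A real refactor would also have to handle reads that depend on the result of a deferred write, which this sketch ignores.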

Create events and pipelines async

If we can defer the most common writes to another worker, the remaining queries can use replicas more often. This adds latency before events and pipelines are created, which is a UX tradeoff. Most other services triggered by PostReceive are already async as well, for example: UpdateMergeRequestsWorker, DetectRepositoryLanguagesWorker, RefDeleteUnlockArtifactsWorker, CreateCommitSignatureWorker.
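As a rough illustration of the async option, the sketch below uses a plain Array as a stand-in for a job queue. Everything here is hypothetical (ToySession, post_receive, drain, and the :create_events_and_pipelines job name); it only shows that the write moves into a separate job with its own session, so the original job never sticks to the primary.

```ruby
# Hypothetical sketch: an Array stands in for a real job queue, and
# ToySession is a toy sticky-routing model, not GitLab code.
class ToySession
  attr_reader :replica_reads, :primary_queries

  def initialize
    @stuck_to_primary = false
    @replica_reads = 0
    @primary_queries = 0
  end

  def read
    if @stuck_to_primary
      @primary_queries += 1
    else
      @replica_reads += 1
    end
  end

  def write
    @stuck_to_primary = true
    @primary_queries += 1
  end
end

# The main job only reads; the write is handed off to the queue.
def post_receive(session, queue)
  session.read
  queue << :create_events_and_pipelines
  session.read
  session.read
end

# Each queued job runs later with its own session and performs the write.
def drain(queue)
  queue.map do |job|
    worker_session = ToySession.new
    worker_session.write
    [job, worker_session]
  end
end

queue = []
main_session = ToySession.new
post_receive(main_session, queue)
results = drain(queue)

puts "main job replica reads: #{main_session.replica_reads}"
puts "deferred jobs run: #{results.map(&:first).inspect}"
```

The cost is visible in the ordering: the event and pipeline writes only happen once the queue is drained, which is where the extra latency comes from.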

Wrap certain reads in Gitlab::Database::LoadBalancing::ScopedSessions#use_replicas_for_read_queries?

Gitlab::Database::LoadBalancing::ScopedSessions provides other means of controlling whether queries go to the primary or replicas. We can identify blocks that perform expensive or frequent reads, and wrap those blocks in use_replicas_for_read_queries?.
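The idea of a scoped override can be sketched with a toy session. This is not the ScopedSessions API: ToySession and with_replica_reads are invented names, and the restore-on-exit behaviour is an assumption of this sketch rather than a description of GitLab's implementation.

```ruby
# Hypothetical sketch. ToySession and with_replica_reads are invented
# names; they only imitate the idea of a scoped replica override.
class ToySession
  attr_reader :log

  def initialize
    @stuck_to_primary = false
    @force_replicas = false
    @log = []
  end

  def write(name)
    @stuck_to_primary = true
    @log << [name, :primary]
  end

  def read(name)
    use_replica = @force_replicas || !@stuck_to_primary
    @log << [name, use_replica ? :replica : :primary]
  end

  # Route reads inside the block to replicas, then restore the old setting.
  def with_replica_reads
    previous = @force_replicas
    @force_replicas = true
    yield
  ensure
    @force_replicas = previous
  end
end

session = ToySession.new
session.write('create push event')     # sticks the session to the primary
session.read('load branch')            # primary, because of the earlier write
session.with_replica_reads do
  session.read('load merge requests')  # replica, despite the earlier write
end
session.read('load pipelines')         # back on the primary

session.log.each { |name, host| puts "#{name} -> #{host}" }
```

Reads wrapped this way may not see rows the job has just written if a replica is lagging, so the blocks chosen should be ones that do not depend on the job's own earlier writes.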

Edited Nov 18, 2024 by Fred Reinink