Improve PostReceive performance by routing more queries to replicas
In issue #402254 (closed), we enabled `data_consistency: sticky` for `PostReceive`, which directs all read queries to replicas until a write is made, at which point queries switch to the primary.
This gave a noticeable performance boost; however, most queries are still going to the primary.
In spike #498229 (closed), we confirmed that this is because a write is made relatively early in `PostReceive`, so we aren't taking advantage of replicas as much as we might be able to.
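
To make the effect of an early write concrete, here is a toy Ruby model of the sticky rule described above. It is only a sketch: `StickySession`, its methods, and the query labels are hypothetical names for illustration, not GitLab's load balancing code.

```ruby
# Toy model of sticky routing (hypothetical, not GitLab's implementation).
class StickySession
  attr_reader :log

  def initialize
    @wrote = false
    @log = []
  end

  # Reads go to a replica until the first write, then to the primary.
  def read(label)
    @log << [label, @wrote ? :primary : :replica]
  end

  # Any write goes to the primary and sticks the session to it.
  def write(label)
    @wrote = true
    @log << [label, :primary]
  end

  def replica_reads
    @log.count { |_, target| target == :replica }
  end
end

session = StickySession.new
session.read("load project")         # replica
session.write("create event")        # primary; the session now sticks
session.read("load branch")          # primary
session.read("load commits")         # primary
session.read("load merge requests")  # primary

# Only the single read before the write was served by a replica.
puts session.replica_reads
```

In this model, three of the four reads land on the primary purely because of where the write sits in the job.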
Here are a few options we can look into implementing to further improve PostReceive's performance.
- Move events and pipelines writes to the end of PostReceive
  - Writes towards the end of the job mean more reads can go to the replica before sticking to the primary
  - We would need to refactor some of `PostReceive`, `ProcessRefChangesService`, `BranchPushService`, and `BranchHooksService`.
- Create events and pipelines async
  - If we can defer the most common writes to another worker, the remaining queries can use replicas more often. This adds extra latency before events and pipelines are created, which is a UX tradeoff. The majority of other services triggered by PostReceive are already async as well, for example `UpdateMergeRequestsWorker`, `DetectRepositoryLanguagesWorker`, `RefDeleteUnlockArtifactsWorker`, and `CreateCommitSignatureWorker`.
- Wrap certain reads in `Gitlab::Database::LoadBalancing::ScopedSessions::#use_replicas_for_read_queries?`
  - `Gitlab::Database::LoadBalancing::ScopedSessions` provides other means of controlling whether queries go to the primary or replicas. We can identify blocks that perform expensive or frequent reads, and wrap those blocks in `use_replicas_for_read_queries?`.
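
The three options above can be compared with a rough Ruby sketch. Everything in it is an assumption made for illustration: `ToySession` is a stand-in for sticky routing, the queue is a plain array rather than a real background job system, and the block method is written without the trailing `?` because it takes a block. None of this is GitLab's actual code or API.

```ruby
# Toy sticky session (hypothetical stand-in, not GitLab's load balancer).
class ToySession
  attr_reader :routes

  def initialize
    @stuck = false
    @force_replicas = false
    @routes = []
  end

  def read(label)
    on_replica = @force_replicas || !@stuck
    @routes << [label, on_replica ? :replica : :primary]
  end

  def write(label)
    @stuck = true
    @routes << [label, :primary]
  end

  # Block-scoped override: reads inside the block use replicas even after a write.
  def use_replicas_for_read_queries
    previous = @force_replicas
    @force_replicas = true
    yield
  ensure
    @force_replicas = previous
  end

  def replica_reads
    @routes.count { |_, target| target == :replica }
  end
end

READS = ["load project", "load branch", "load commits", "load merge requests"].freeze

# Baseline: the write follows the first read, so later reads stick to the primary.
def baseline(session)
  session.read(READS.first)
  session.write("create event")
  READS.drop(1).each { |label| session.read(label) }
end

# Option 1: same work, with the write moved to the end of the job.
def write_last(session)
  READS.each { |label| session.read(label) }
  session.write("create event")
end

# Option 2: the write is handed to a queue for another worker to perform.
def write_async(session, queue)
  session.read(READS.first)
  queue << "create event"
  READS.drop(1).each { |label| session.read(label) }
end

# Option 3: the write stays early, but the later reads are wrapped in the override.
def scoped_reads(session)
  session.read(READS.first)
  session.write("create event")
  session.use_replicas_for_read_queries do
    READS.drop(1).each { |label| session.read(label) }
  end
end

results = {
  baseline: ToySession.new.tap { |s| baseline(s) }.replica_reads,
  write_last: ToySession.new.tap { |s| write_last(s) }.replica_reads,
  write_async: ToySession.new.tap { |s| write_async(s, []) }.replica_reads,
  scoped_reads: ToySession.new.tap { |s| scoped_reads(s) }.replica_reads
}
puts results # replica read counts per scenario
```

In this toy model the baseline serves one read from a replica, while each option serves all four. The model does not simulate replication lag, so it says nothing about whether reads wrapped as in option 3 would see the job's own earlier write; that would need to be checked for each block we wrap.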
Edited by Fred Reinink