diff --git a/doc/administration/monitoring/prometheus/index.md b/doc/administration/monitoring/prometheus/index.md
index 1bddbbc25c2812510bf02a271ba3c7f796c67d78..0af13624b6ec0519f88391ef23a95b1a33c6377e 100644
--- a/doc/administration/monitoring/prometheus/index.md
+++ b/doc/administration/monitoring/prometheus/index.md
@@ -371,12 +371,20 @@ to work with the collected data where you can visualize the output.
For a more fully featured dashboard, Grafana can be used and has
[official support for Prometheus](https://prometheus.io/docs/visualization/grafana/).
-Sample Prometheus queries:
+## Sample Prometheus queries
+
+The following are some sample Prometheus queries that you can use.
+
+NOTE:
+These are only examples and may not work on all setups. Further adjustments may be required.
-- **% Memory available:** `((node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) or ((node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes) / node_memory_MemTotal_bytes)) * 100`
- **% CPU utilization:** `1 - avg without (mode,cpu) (rate(node_cpu_seconds_total{mode="idle"}[5m]))`
+- **% Memory available:** `((node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) or ((node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes) / node_memory_MemTotal_bytes)) * 100`
- **Data transmitted:** `rate(node_network_transmit_bytes_total{device!="lo"}[5m])`
- **Data received:** `rate(node_network_receive_bytes_total{device!="lo"}[5m])`
+- **Disk read IOPS:** `sum by (instance) (rate(node_disk_reads_completed_total[1m]))`
+- **Disk write IOPS:** `sum by (instance) (rate(node_disk_writes_completed_total[1m]))`
+- **RPS via GitLab transaction count:** `sum(irate(gitlab_transaction_duration_seconds_count{controller!~'HealthController|MetricsController|'}[1m])) by (controller, action)`
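+
+If you rely on any of these queries regularly, you can also save them as Prometheus [recording rules](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) so that dashboards query a precomputed series. The following is only a minimal sketch, assuming a rules file loaded through `rule_files` in your Prometheus configuration; the group and rule names are illustrative:
+
+```yaml
+groups:
+  - name: gitlab-node-examples
+    rules:
+      # Precompute the CPU utilization query above as a single series per instance.
+      - record: instance:node_cpu_utilization:ratio
+        expr: 1 - avg without (mode,cpu) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
+```
+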
## Prometheus as a Grafana data source
diff --git a/doc/administration/reference_architectures/10k_users.md b/doc/administration/reference_architectures/10k_users.md
index e6acf7840bb08487dacd5b20f882162bc951d3e2..4d604cce9e9c22cd6e89417f2bf2f477e68d8737 100644
--- a/doc/administration/reference_architectures/10k_users.md
+++ b/doc/administration/reference_architectures/10k_users.md
@@ -10,20 +10,20 @@ DETAILS:
**Tier:** Premium, Ultimate
**Offering:** Self-managed
-This page describes the GitLab reference architecture designed to target a peak load of 200 requests per second (RPS), the typical peak load of up to 10,000 users, both manual and automated, based on real data with headroom added.
+This page describes the GitLab reference architecture designed to target a peak load of 200 requests per second (RPS), the typical peak load of up to 10,000 users, both manual and automated, based on real data.
For a full list of reference architectures, see
[Available reference architectures](index.md#available-reference-architectures).
NOTE:
Before deploying this architecture it's recommended to read through the [main documentation](index.md) first,
-specifically the [Before you start](index.md#before-you-start) and [Deciding which architecture to use](index.md#deciding-which-architecture-to-use) sections.
+specifically the [Before you start](index.md#before-you-start) and [Deciding which architecture to use](index.md#deciding-which-architecture-to-start-with) sections.
> - **Target load:** API: 200 RPS, Web: 20 RPS, Git (Pull): 20 RPS, Git (Push): 4 RPS
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
-> - **Unsure which Reference Architecture to use?** [Go to this guide for more info](index.md#deciding-which-architecture-to-use)
+> - **Unsure which Reference Architecture to use?** [Go to this guide for more info](index.md#deciding-which-architecture-to-start-with)
| Service | Nodes | Configuration | GCP | AWS | Azure |
|------------------------------------------|-------|-------------------------|------------------|----------------|-----------|
@@ -56,12 +56,13 @@ specifically the [Before you start](index.md#before-you-start) and [Deciding whi
Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
6. Gitaly specifications are based on high percentiles of both usage patterns and repository sizes in good health.
However, if you have [large monorepos](index.md#large-monorepos) (larger than several gigabytes) or [additional workloads](index.md#additional-workloads) these can *significantly* impact Git and Gitaly performance and further adjustments will likely be required.
-7. Can be placed in Auto Scaling Groups (ASGs) as the component doesn't store any [stateful data](index.md#autoscaling-of-stateful-nodes).
- However, for GitLab Rails certain processes like [migrations](#gitlab-rails-post-configuration) and [Mailroom](../incoming_email.md) should be run on only one node.
+6. Can be placed in Auto Scaling Groups (ASGs) as the component doesn't store any [stateful data](index.md#autoscaling-of-stateful-nodes).
+   However, [Cloud Native Hybrid setups](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative) are generally preferred as certain components
+   such as [migrations](#gitlab-rails-post-configuration) and [Mailroom](../incoming_email.md) can only be run on one node, which is handled better in Kubernetes.
NOTE:
-For all PaaS solutions that involve configuring instances, it is strongly recommended to implement a minimum of three nodes in three different availability zones to align with resilient cloud architecture practices.
+For all PaaS solutions that involve configuring instances, it's recommended to implement a minimum of three nodes in three different availability zones to align with resilient cloud architecture practices.
```plantuml
@startuml 10k
@@ -165,7 +166,7 @@ against the following endpoint throughput targets:
- Git (Push): 4 RPS
The above targets were selected based on real customer data of total environmental loads corresponding to the user count,
-including CI and other workloads along with additional substantial headroom added.
+including CI and other workloads.
If you have metrics to suggest that you have regularly higher throughput against the above endpoint targets, [large monorepos](index.md#large-monorepos)
or notable [additional workloads](index.md#additional-workloads) these can notably impact the performance environment and [further adjustments may be required](index.md#scaling-an-environment).
@@ -2268,16 +2269,19 @@ as the typical environment above.
First are the components that run in Kubernetes. These run across several node groups, although you can change
the overall makeup as desired as long as the minimum CPU and Memory requirements are observed.
-| Service Node Group | Nodes | Configuration | GCP | AWS | Min Allocatable CPUs and Memory |
-|---------------------|-------|-------------------------|-----------------|--------------|---------------------------------|
-| Webservice | 4 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | `c5.9xlarge` | 127.5 vCPU, 118 GB memory |
-| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 15.5 vCPU, 50 GB memory |
-| Supporting services | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 7.75 vCPU, 25 GB memory |
+| Component Node Group | Target Node Pool Totals | GCP Example | AWS Example |
+|----------------------|-------------------------|-----------------|--------------|
+| Webservice | 80 vCPU<br/>100 GB memory (request)<br/>140 GB memory (limit) | 3 x `n1-standard-32` | 3 x `c5.9xlarge` |
+| Sidekiq | 12.6 vCPU<br/>28 GB memory (request)<br/>56 GB memory (limit) | 4 x `n1-standard-4` | 4 x `m5.xlarge` |
+| Supporting services | 4 vCPU<br/>15 GB memory | 2 x `n1-standard-4` | 2 x `m5.xlarge` |
- For this setup, we **recommend** and regularly [test](index.md#validation-and-test-results)
- [Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
-- Nodes configuration is shown as it is forced to ensure pod vCPU / memory ratios and avoid scaling during **performance testing**.
- - In production deployments, there is no need to assign pods to specific nodes. A minimum of three nodes per node group in three different availability zones is strongly recommended to align with resilient cloud architecture practices.
+  [Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
+- GCP and AWS examples of how to reach the Target Node Pool Total are given for convenience. These sizes are used in performance testing, but following the example is not required. Different node pool designs can be used as desired as long as the targets are met and all pods can deploy (one way to express the AWS example is sketched after this list).
+- The [Webservice](#webservice) and [Sidekiq](#sidekiq) target node pool totals are given for GitLab components only. Additional resources are required for the chosen Kubernetes provider's system processes. The given examples take this into account.
+- The [Supporting](#supporting) target node pool total is given generally to accommodate several resources for supporting the GitLab deployment as well as any additional deployments you may wish to make depending on your requirements. Similar to the other node pools, the chosen Kubernetes provider's system processes also require resources. The given examples take this into account.
+- In production deployments, it's not required to assign pods to specific nodes. However, it is recommended to have several nodes in each pool spread across different availability zones to align with resilient cloud architecture practices.
+- Enabling autoscaling, such as Cluster Autoscaler, for efficiency reasons is encouraged, but it's generally recommended to set a floor of 75% for Webservice and Sidekiq pods to ensure ongoing performance.
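+
+For reference, one way to express the AWS example above is with an [`eksctl`](https://eksctl.io/) cluster config. The following is only a sketch: the cluster name, region, and node labels are illustrative, and any tooling that produces equivalent node pools works just as well.
+
+```yaml
+apiVersion: eksctl.io/v1alpha5
+kind: ClusterConfig
+
+metadata:
+  name: gitlab-ref-10k      # illustrative cluster name
+  region: us-east-1         # adjust to your region
+
+managedNodeGroups:
+  - name: webservice
+    instanceType: c5.9xlarge
+    desiredCapacity: 3
+    labels: { workload: webservice }   # illustrative label, reused in later sketches
+  - name: sidekiq
+    instanceType: m5.xlarge
+    desiredCapacity: 4
+    labels: { workload: sidekiq }
+  - name: supporting
+    instanceType: m5.xlarge
+    desiredCapacity: 2
+    labels: { workload: support }
+```
+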
Next are the backend components that run on static compute VMs using the Linux package (or External PaaS
services where applicable):
@@ -2312,7 +2316,7 @@ services where applicable):
NOTE:
-For all PaaS solutions that involve configuring instances, it is strongly recommended to implement a minimum of three nodes in three different availability zones to align with resilient cloud architecture practices.
+For all PaaS solutions that involve configuring instances, it's recommended to implement a minimum of three nodes in three different availability zones to align with resilient cloud architecture practices.
```plantuml
@startuml 10k
@@ -2322,11 +2326,11 @@ card "Kubernetes via Helm Charts" as kubernetes {
card "**External Load Balancer**" as elb #6a9be7
together {
- collections "**Webservice** x4" as gitlab #32CD32
- collections "**Sidekiq** x4" as sidekiq #ff8dd1
+ collections "**Webservice**" as gitlab #32CD32
+ collections "**Sidekiq**" as sidekiq #ff8dd1
}
- card "**Supporting Services** x2" as support
+ card "**Supporting Services**" as support
}
card "**Internal Load Balancer**" as ilb #9370DB
@@ -2384,55 +2388,60 @@ consul .[#e76a9b]--> redis
@enduml
```
-### Resource usage settings
+### Kubernetes component targets
-The following formulas help when calculating how many pods may be deployed within resource constraints.
-The [10k reference architecture example values file](https://gitlab.com/gitlab-org/charts/gitlab/-/blob/master/examples/ref/10k.yaml)
-documents how to apply the calculated configuration to the Helm Chart.
+The following sections detail the targets used for the GitLab components deployed in Kubernetes.
#### Webservice
-Webservice pods typically need about 1 CPU and 1.25 GB of memory _per worker_.
-Each Webservice pod consumes roughly 4 CPUs and 5 GB of memory using
-the [recommended topology](#cluster-topology) because four worker processes
-are created by default and each pod has other small processes running.
+It's recommended to run each Webservice pod (Puma and Workhorse) with the following configuration:
-For 200 RPS or 10,000 users we recommend a total Puma worker count of around 80.
-With the [provided recommendations](#cluster-topology) this allows the deployment of up to 20
-Webservice pods with 4 workers per pod and 5 pods per node. Expand available resources using
-the ratio of 1 CPU to 1.25 GB of memory _per each worker process_ for each additional
-Webservice pod.
+- 4 Puma Workers
+- 4 vCPU
+- 5 GB memory (request)
+- 7 GB memory (limit)
-For further information on resource usage, see the [Webservice resources](https://docs.gitlab.com/charts/charts/gitlab/webservice/#resources).
+For 200 RPS or 10,000 users, we recommend a total Puma worker count of around 80, which in turn means running at
+least 20 Webservice pods.
+
+For further information on Webservice resource usage, see the Charts documentation on [Webservice resources](https://docs.gitlab.com/charts/charts/gitlab/webservice/#resources).
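+
+A rough sketch of how these targets might be expressed in the GitLab Helm Chart values (the replica ceiling is an illustrative assumption; the tested values are in the [example config file](#example-config-file) linked later on this page):
+
+```yaml
+gitlab:
+  webservice:
+    workerProcesses: 4     # 4 Puma workers per pod
+    minReplicas: 20        # 20 pods x 4 workers = around 80 Puma workers in total
+    maxReplicas: 28        # illustrative ceiling; size to your node pool
+    resources:
+      requests:
+        cpu: 4
+        memory: 5G
+      limits:
+        memory: 7G
+```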
+
+##### NGINX
+
+It's also recommended to deploy the NGINX controller pods across the Webservice nodes as a DaemonSet. This allows the controllers to scale dynamically with the Webservice pods they serve, and to take advantage of the higher network bandwidth that larger machine types typically offer.
+
+Note that this isn't a strict requirement. The NGINX controller pods can be deployed as desired as long as they have enough resources to handle the web traffic.
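+
+A minimal sketch of that recommendation in Helm Chart values, assuming the Webservice node pool carries an illustrative `workload: webservice` label:
+
+```yaml
+nginx-ingress:
+  controller:
+    kind: DaemonSet            # run one controller pod per matching node
+    nodeSelector:
+      workload: webservice     # hypothetical label on the Webservice node pool
+```
+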
#### Sidekiq
-Sidekiq pods should generally have 0.9 CPU and 2 GB of memory.
+It's recommended to run each Sidekiq pod with the following configuration:
+
+- 1 Sidekiq worker
+- 900m vCPU
+- 2 GB memory (request)
+- 4 GB memory (limit)
-[The provided starting point](#cluster-topology) allows the deployment of up to
-14 Sidekiq pods. Expand available resources using the 0.9 CPU to 2 GB memory
-ratio for each additional pod.
+Similar to the standard deployment above, an initial target of 14 Sidekiq workers has been used here.
+Additional workers may be required depending on your specific workflow.
-For further information on resource usage, see the [Sidekiq resources](https://docs.gitlab.com/charts/charts/gitlab/sidekiq/#resources).
+For further information on Sidekiq resource usage, see the Charts documentation on [Sidekiq resources](https://docs.gitlab.com/charts/charts/gitlab/sidekiq/#resources).
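+
+A corresponding sketch of the Sidekiq targets in Helm Chart values, with the replica ceiling again being an illustrative assumption:
+
+```yaml
+gitlab:
+  sidekiq:
+    minReplicas: 14        # initial target of 14 Sidekiq workers
+    maxReplicas: 18        # illustrative ceiling; adjust to your workload
+    resources:
+      requests:
+        cpu: 900m
+        memory: 2G
+      limits:
+        memory: 4G
+```
+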
-#### Supporting
+#### Supporting
The Supporting Node Pool is designed to house all supporting deployments that don't need to be
on the Webservice and Sidekiq pools.
This includes various deployments related to the Cloud Provider's implementation and supporting
-GitLab deployments such as NGINX or [GitLab Shell](https://docs.gitlab.com/charts/charts/gitlab/gitlab-shell/).
+GitLab deployments such as [GitLab Shell](https://docs.gitlab.com/charts/charts/gitlab/gitlab-shell/).
-If you wish to make any additional deployments, such as for Monitoring, it's recommended
+If you wish to make any additional deployments such as Container Registry, Pages, or Monitoring, it's recommended
to deploy these in this pool where possible and not in the Webservice or Sidekiq pools, as the Supporting pool has been designed
specifically to accommodate several additional deployments. However, if your deployments don't fit into the
-pool as given, you can increase the node pool accordingly.
-
-## Secrets
+pool as given, you can increase the node pool accordingly. Conversely, if the pool in your use case is over-provisioned, you can reduce it accordingly.
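+
+For example, additional Chart deployments can be pinned to this pool with node selectors. The following is only a sketch, assuming the Supporting node pool carries an illustrative `workload: support` label:
+
+```yaml
+registry:
+  nodeSelector:
+    workload: support        # hypothetical label on the Supporting node pool
+gitlab:
+  gitlab-shell:
+    nodeSelector:
+      workload: support
+```
+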
-When setting up a Cloud Native Hybrid environment, it's worth noting that several secrets should be synced from backend VMs from the `/etc/gitlab/gitlab-secrets.json` file into Kubernetes.
+#### Example config file
-For this setup specifically, the [GitLab Rails](https://docs.gitlab.com/charts/installation/secrets.html#gitlab-rails-secret) and [GitLab Shell](https://docs.gitlab.com/charts/installation/secrets.html#gitlab-rails-secret) secrets should be synced.
+An example for the GitLab Helm Charts targeting the above 200 RPS or 10,000 user reference architecture configuration [can be found in the Charts project](https://gitlab.com/gitlab-org/charts/gitlab/-/blob/master/examples/ref/10k.yaml).