20240712:08 - PostgreSQL REINDEX for glibc update
Re-index database to recover sort order
Some of our customers are running into the GNU C Library (glibc) string sort order (collation) change that arrived in version 2.28, released 2018-08-01. Most Linux distributions picked it up from late 2018 onward, but that was after the release of a popular distribution, Ubuntu 18.04 LTS, which GitLab stopped supporting after GitLab 16.11.
The glibc shipped in most Linux distributions changed its default sort order for text so that it covers more non-English languages. As a side effect, PostgreSQL sorts text in a different order, so indexes built under the old ordering need to be rebuilt before they can be relied on. This is discussed on the PostgreSQL wiki.
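One way to see whether two hosts disagree on sort order is to sort the same small input on each and compare the results. The sketch below is only an illustration: the function name `sort_probe` and the two probe strings are assumptions, and whether this particular pair actually reorders on a given host is something to verify rather than assume.

```shell
# Sort a fixed pair of strings under the given locale and print the result
# on one line, so the output from two hosts can be compared by eye.
# The probe strings are illustrative; any pair mixing punctuation and
# digits is a reasonable candidate.
sort_probe() {
    locale_name="$1"
    printf '11\n1-1\n' | LC_ALL="$locale_name" sort | paste -sd ' ' -
}

# Usage (run on both the old and the new host, then compare):
#   sort_probe en_US.UTF-8
#   sort_probe C    # plain byte order, independent of the glibc locale data
```

If the output for the database's locale differs between the old and the new OS, indexes on text columns built on the old host are ordered differently from what PostgreSQL on the new host expects.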
For GitLab, the last version that was supported on Ubuntu 18.04 was GitLab 16.11.
A few of our customers want to upgrade to GitLab 17 and are still on Ubuntu 18.04. After all, it is an LTS (Long-Term Support) release, and is actually not EOL (End-Of-Life) until 2026, despite what our policy says. These customers (or anyone on an unsupported OS with a glibc older than version 2.28) will need to upgrade the OS first. But the glibc change is a challenge.
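To find out which side of the 2.28 boundary a host is on, a small check along these lines may help. It is a minimal sketch: the function names are made up for illustration, and it assumes a glibc-based system where `getconf GNU_LIBC_VERSION` prints something like `glibc 2.27`.

```shell
# Print the running glibc version, for example "2.27".
# Assumes a glibc-based system; other C libraries do not define this variable.
current_glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# Return 0 (true) when the given MAJOR.MINOR version is older than 2.28.
glibc_older_than_2_28() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # drop any patch component
    [ "$major" -lt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -lt 28 ]; }
}

# Usage:
#   if glibc_older_than_2_28 "$(current_glibc_version)"; then
#       echo "old glibc: a re-index is needed after the OS upgrade"
#   fi
```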
Here's a workflow to upgrade a GitLab installation from Ubuntu 18.04 LTS (old glibc) to 20.04 LTS (new glibc) without breaking the database.
⚠️ This assumes a GitLab 100 RPS/5k reference architecture without Gitaly Cluster and without GitLab Geo. If you have those, the procedure is more complicated because PostgreSQL streaming replication needs to be considered, and I haven't worked it out yet.
-
Stop database client nodes (rails, sidekiq, gitaly)
sudo gitlab-ctl stop
-
Stop PostgreSQL on pg nodes
sudo gitlab-ctl stop
-
Disable automatic start of GitLab on the pg nodes
sudo systemctl disable gitlab-runsvdir
-
Upgrade Ubuntu 18.04 OS on all pg nodes to Ubuntu 20.04
-
Re-enable GitLab on the pg nodes and confirm it started
sudo systemctl enable gitlab-runsvdir
sudo gitlab-ctl status
Do not start the clients
-
Confirm patroni leader (see Check replication status)
sudo gitlab-ctl patroni members
-
Using a session manager like tmux or screen, connect to the leader, and re-index the database:
sudo gitlab-psql
set statement_timeout to 0;
show statement_timeout;
REINDEX DATABASE gitlabhq_production;
-
You can monitor the session to completion (it will report "REINDEX" once complete), or watch the status in a second gitlab-psql console with:
SELECT * FROM pg_stat_activity WHERE query LIKE 'REINDEX%';
-
After the re-index is complete, confirm that the patroni cluster is in sync
sudo gitlab-ctl patroni members
-
Start db clients (rails, sidekiq, gitaly)
sudo gitlab-ctl start
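The two "patroni members" checks in the steps above are read by eye. If you would rather script the leader lookup, something like the following could work. It is a sketch under assumptions: the function name `patroni_leader` is made up, and it expects the usual Patroni table layout with pipe-separated columns in the order Member, Host, Role, State. Check it against the output on your own nodes first.

```shell
# Read a Patroni members table on stdin and print the name of each member
# whose Role column is "Leader".
# Assumed row layout: | Member | Host | Role | State | ...
patroni_leader() {
    awk -F'|' '
        # Data rows start with "|"; borders start with "+" and are skipped.
        /^\|/ {
            member = $2
            role = $4
            gsub(/^[ \t]+|[ \t]+$/, "", member)   # trim padding
            gsub(/^[ \t]+|[ \t]+$/, "", role)
            if (role == "Leader") print member
        }
    '
}

# Usage:
#   sudo gitlab-ctl patroni members | patroni_leader
```

Run the REINDEX only on the node this prints, and only if it prints exactly one name.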