Support no-downtime upgrades via alternative documentRoot for static files

Problem

As reported in omnibus-gitlab#3895, admins upgrading GitLab via our no-downtime upgrade procedure have found that CSS and JavaScript often fail to load while the upgrade is in progress (omnibus-gitlab#3895 (comment 419060384)). In a mixed-version deployment behind a load balancer, either of the following can happen:

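  1. The user loads a page from a node running version N+1, and the browser's follow-up CSS/JS request is routed to a node running version N.
  2. The user loads a page from a node running version N, and the browser's follow-up CSS/JS request is routed to a node running version N+1.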

In both scenarios, the user gets a 404 because only one version of the assets exists on any given node.
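To make the failure mode concrete, here is a minimal sketch of the two scenarios. The node names and asset file names are made up for illustration, and it assumes each version ships assets under version-specific (fingerprinted) file names:

```python
# Hypothetical asset sets: each node only holds the files built for its own version.
NODE_ASSETS = {
    "N": {"/assets/application-aaa111.css"},
    "N+1": {"/assets/application-bbb222.css"},
}


def serve(node: str, path: str) -> int:
    """Return the HTTP status a node would give for an asset path."""
    return 200 if path in NODE_ASSETS[node] else 404


def simulate(page_node: str, asset_node: str) -> int:
    """The page is rendered by one node; the asset request lands on another."""
    # The rendered page references the asset names of the node that built it.
    asset_path = next(iter(NODE_ASSETS[page_node]))
    return serve(asset_node, asset_path)


if __name__ == "__main__":
    for page_node, asset_node in [("N+1", "N"), ("N", "N+1"), ("N", "N")]:
        status = simulate(page_node, asset_node)
        print(f"page from {page_node}, asset from {asset_node}: {status}")
```

Only the case where the page and the asset request hit nodes on the same version succeeds; both mixed cases return a 404.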

Solutions

  1. Use a CDN. On GitLab.com, we have a canary deployment, and all asset URLs are prefixed with https://assets.gitlab-static.net. Fastly has some mechanism for retrieving each requested file from the server it was requested from (or maybe it just caches the file once it receives a 200?).
  2. Install the assets for both the current and the target version on all nodes, then do the upgrade. This is what $2019978 does. It's a bit kludgy because it pollutes /opt/gitlab/embedded/service/gitlab-rails/public/assets.

It's not clear to me how easy it is to support an out-of-the-box CDN for GitLab. NGINX can be configured to act as one (http://linuxplayer.org/2013/06/nginx-try-files-on-multiple-named-location-or-server). It should be possible to designate a deploy node (the first node to be upgraded) and have NGINX request assets from the deploy node if it encounters a 404. However, this requires some knowledge of the customer's topology, as well as network/DNS changes.

Perhaps a simpler option is to install all the requisite assets (say, for versions 12.0 to 13.0) in a single directory. Currently, the NGINX config for GitLab routes all /assets requests to Workhorse, which serves files from the directory set by the documentRoot parameter. We might want to add another parameter (e.g. altDocumentRoot) that Workhorse searches if it can't find an asset in the documentRoot.
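The proposed lookup order could look roughly like the sketch below. This is not Workhorse code; it is a Python illustration of the idea only. The function name `resolve_asset` and its signature are hypothetical, and `alt_document_root` stands in for the proposed altDocumentRoot parameter.

```python
from pathlib import Path
from typing import Optional


def resolve_asset(
    request_path: str,
    document_root: str,
    alt_document_root: Optional[str] = None,
) -> Optional[Path]:
    """Return the first existing file for request_path, or None (i.e. a 404)."""
    relative = request_path.lstrip("/")
    # Search the primary root first, then the (hypothetical) alternative root.
    for root in (document_root, alt_document_root):
        if root is None:
            continue
        base = Path(root).resolve()
        candidate = (base / relative).resolve()
        # Refuse paths that escape the root, e.g. "/assets/../../etc/passwd".
        if base not in candidate.parents:
            continue
        if candidate.is_file():
            return candidate
    return None
```

With this ordering the Omnibus-packaged assets always win, and the alternative directory is only consulted for files the current version does not ship.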

That way, we can:

  1. Create a separate package/tarball that installs all the assets in some alternative directory that does not conflict with the Omnibus-packaged files.
  2. Ensure that every node has a copy of every asset needed for no-downtime upgrades.
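As a rough sketch of how step 2 might be verified on a node, the check below reports which assets are absent from the alternative directory. Everything here is an assumption for illustration: the per-version manifest of file names, the `<alt_root>/assets/<name>` layout, and the function name `missing_assets`.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def missing_assets(required: Dict[str, Iterable[str]], alt_root: str) -> List[str]:
    """List "version: file" entries that are absent from the alternative directory."""
    root = Path(alt_root)
    missing = []
    # Sorted (as plain strings) only to keep the report order stable between runs.
    for version, names in sorted(required.items()):
        for name in sorted(names):
            if not (root / "assets" / name).is_file():
                missing.append(f"{version}: {name}")
    return missing
```

An empty result would mean the node holds every asset in the manifest and can safely take traffic during the upgrade.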

Thoughts?
