Websocket requests (impacting Classic Duo Chat) can hang when using Geo URL

Summary

When Classic Duo Chat is used with Geo and a unified URL is configured, Duo Chat does not work and hangs.

This happens because the Geo secondary proxy returns 404 errors for ActionCable WebSocket connections to /-/cable when it proxies connections for Classic Duo Chat. As a result, the WebSocket connection breaks.

Steps to reproduce

  1. Use the GitLab Environment Toolkit (GET) to set up a Geo environment (1 node in each site is sufficient) and ensure a unified URL is configured
  2. Enable Classic Duo Chat (you can link to the staging AIGW)
  3. On your localhost, update /etc/hosts and point your unified URL directly to the primary site
  4. Try using Classic Duo Chat - it works fine
  5. On your localhost, update /etc/hosts and point your unified URL directly to the secondary site
  6. Try using Classic Duo Chat - it breaks

In a real environment, I suspect the connection flip-flops between Geo sites. Classic Duo Chat may therefore work initially and then appear to hang, and refreshing the page fixes it temporarily.

Edit: ⚠️ this bug is not restricted to Duo Chat; it affects all features that rely on WebSockets. My setup does not use a unified URL, so a unified URL is not necessarily a factor.

To replicate: On the secondary site, open an existing issue in a tab. In another tab, add some labels to the same issue. The /-/cable request fails in the first tab.
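The same comparison can be made outside the browser. The following is a minimal Ruby sketch, not part of GitLab: the helper names (websocket_handshake, status_code, probe) and the connect_to parameter are hypothetical, and the Sec-WebSocket-Protocol value is what I understand the ActionCable JavaScript client to send. It sends a bare WebSocket opening handshake to /-/cable while connecting to a chosen site address, which mirrors the /etc/hosts trick in the steps above. A working site would normally answer 101; the logs in this issue show 404 on the failing path. An unauthenticated probe may behave differently from a logged-in browser session.

```ruby
require "socket"
require "openssl"
require "securerandom"
require "base64"
require "uri"

# Build a minimal RFC 6455 opening handshake for `url`, sent with the given Origin.
def websocket_handshake(url, origin)
  uri = URI.parse(url)
  key = Base64.strict_encode64(SecureRandom.random_bytes(16))
  [
    "GET #{uri.path} HTTP/1.1",
    "Host: #{uri.host}",
    "Upgrade: websocket",
    "Connection: Upgrade",
    "Origin: #{origin}",
    "Sec-WebSocket-Key: #{key}",
    "Sec-WebSocket-Version: 13",
    # Assumption: subprotocols offered by the ActionCable JS client.
    "Sec-WebSocket-Protocol: actioncable-v1-json, actioncable-unsupported",
    "", ""
  ].join("\r\n")
end

# Extract the numeric status from an HTTP status line, or nil if it is malformed.
def status_code(status_line)
  status_line[%r{\AHTTP/\d\.\d (\d{3})}, 1]&.to_i
end

# Connect to `connect_to` (an IP or hostname of one specific Geo site) but keep
# the Host header and TLS server name taken from `url`.
def probe(url, origin, connect_to: nil)
  uri = URI.parse(url)
  sock = TCPSocket.new(connect_to || uri.host, uri.port)
  if uri.scheme == "https"
    ctx = OpenSSL::SSL::SSLContext.new
    ctx.verify_mode = OpenSSL::SSL::VERIFY_NONE # diagnostic only: skips certificate verification
    sock = OpenSSL::SSL::SSLSocket.new(sock, ctx)
    sock.hostname = uri.host
    sock.sync_close = true
    sock.connect
  end
  sock.write(websocket_handshake(url, origin))
  status_code(sock.gets.to_s)
ensure
  sock&.close
end
```

For example, probe("https://unified.example.com/-/cable", "https://unified.example.com", connect_to: "<secondary site address>") returns the status the secondary site gives for the upgrade request; repeating it with the primary site address gives the comparison.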

Example Project

N/A

What is the current bug behavior?

When Classic Duo Chat is used with Geo and a unified URL is configured, Duo Chat doesn't work.

Edit: WebSocket requests fail on Geo secondaries

What is the expected correct behavior?

When Classic Duo Chat is used with Geo and a unified URL is configured, Duo Chat should work.

Edit: Websocket requests work on Geo secondaries

Relevant logs and/or screenshots

# Secondary site
{"backend_id":"rails","content_type":"text/html; charset=utf-8","correlation_id":"01K8W8ZZHXSHD7GW32EQ9KJ6FJ","duration_ms":44,"host":"primary.example.com","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","read_bytes":1224,"referrer":"","remote_addr":"127.0.0.1:0","remote_ip":"127.0.0.1","route":"^/-/cable\\z","route_id":"geo_action_cable","status":404,"system":"http","time":"2025-10-31T04:39:12Z","ttfb_ms":44,"uri":"/-/cable","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36","written_bytes":838}
{"backend_id":"rails","content_type":"text/html; charset=utf-8","correlation_id":"01K8W90A9KGPN57J5AJTNZEHGH","duration_ms":69,"host":"primary.example.com","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","read_bytes":1224,"referrer":"","remote_addr":"127.0.0.1:0","remote_ip":"127.0.0.1","route":"^/-/cable\\z","route_id":"geo_action_cable","status":404,"system":"http","time":"2025-10-31T04:39:23Z","ttfb_ms":69,"uri":"/-/cable","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36","written_bytes":838}

# Primary site
{"backend_id":"rails","body_limit":104857600,"content_type":"text/html; charset=utf-8","correlation_id":"01K8W90A9WCGN94YY9GNP493JW","duration_ms":52,"host":"primary.example.com","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","read_bytes":1881,"referrer":"","remote_addr":"34.129.180.129:0","remote_ip":"34.129.180.129","route":"^/-/","route_id":"dash","status":404,"system":"http","time":"2025-10-31T04:39:23Z","ttfb_ms":51,"uri":"/-/cable","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36","written_bytes":1591}
{"backend_id":"rails","body_limit":104857600,"content_type":"text/html; charset=utf-8","correlation_id":"01K8W90QZT76CT0NTXX3SVN8H3","duration_ms":27,"host":"primary.example.com","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","read_bytes":1881,"referrer":"","remote_addr":"34.129.180.129:0","remote_ip":"34.129.180.129","route":"^/-/","route_id":"dash","status":404,"system":"http","time":"2025-10-31T04:39:37Z","ttfb_ms":26,"uri":"/-/cable","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36","written_bytes":1591}
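To spot this pattern in a larger log, the access-log lines can be grouped by route and status. This is a small Ruby sketch that relies only on the fields visible in the lines above (uri, route_id, status); the helper names are hypothetical.

```ruby
require "json"

# Parse one log line, returning nil for anything that is not valid JSON
# (for example the "# Secondary site" comment lines).
def parse_json_line(line)
  JSON.parse(line)
rescue JSON::ParserError
  nil
end

# Count access-log entries for `uri`, keyed by [route_id, status].
def summarize_cable_requests(lines, uri: "/-/cable")
  counts = Hash.new(0)
  lines.each do |line|
    entry = parse_json_line(line)
    next unless entry.is_a?(Hash) && entry["uri"] == uri

    counts[[entry["route_id"], entry["status"]]] += 1
  end
  counts
end
```

Applied to the four lines above, this gives {["geo_action_cable", 404] => 2, ["dash", 404] => 2}: the secondary matches the geo_action_cable route, the primary matches the generic dash route, and both record a 404.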

Possible fixes

I suspect Workhorse is not adding the correct headers when it proxies WebSocket connections back to the primary site.

Fix definition

After discussion in #579074 (comment 2885550838) and !212819 (comment 2893371557), the best fix is to add a setting to Geo installations that lets admins set ActionCable allowed_request_origins.

For WebSockets to work between secondary and primary sites, ActionCable's allowed_request_origins must include the internal and external URLs of all sites. This will need to be documented 📓

The new setting needs to be added to Rails, Omnibus, Charts and the GDK.

geo:
  action_cable_allowed_origins:
    - https://primary.example.com
    - https://secondary.example.com  
    - https://unified.example.com
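To illustrate why listing the origins helps, here is a simplified Ruby model of an ActionCable-style origin check. It is written from my understanding of how ActionCable compares the Origin header with the Host header and with allowed_request_origins; it is not the ActionCable or GitLab code, and origin_allowed? is a hypothetical name. The Origin values are also an assumption, since the logs in this issue do not record them.

```ruby
# Simplified model: a request is accepted when its Origin equals
# "<scheme>://<Host header>" or matches one of the configured allowed origins.
def origin_allowed?(origin:, host:, scheme: "https", allowed_origins: [], allow_same_origin_as_host: true)
  return true if allow_same_origin_as_host && origin == "#{scheme}://#{host}"

  # `===` lets an entry be either an exact string or a pattern.
  Array(allowed_origins).any? { |allowed| allowed === origin }
end

geo_origins = [
  "https://primary.example.com",
  "https://secondary.example.com",
  "https://unified.example.com"
]

# Browser is on the secondary, but the proxied request carries the primary's Host.
origin_allowed?(origin: "https://secondary.example.com", host: "primary.example.com")
# => false

origin_allowed?(origin: "https://secondary.example.com", host: "primary.example.com",
                allowed_origins: geo_origins)
# => true
```

Under this model, a request whose Origin is the secondary (or unified) URL but whose Host is primary.example.com fails the same-origin comparison and is accepted only once that origin is listed explicitly.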

Implementation plan

  1. Rails implementation
  • Filter input to only accept valid HTTP/HTTPS URLs (no regexes to prevent security issues)
  • Strip trailing slashes for consistency
  • Handle invalid URLs gracefully (skip them)
  • Only apply in production (dev/test keep existing behavior)
  2. Omnibus implementation
  • Update gitlab.rb template to expose gitlab_rails['geo_action_cable_allowed_origins']
  • Map to Rails geo.action_cable_allowed_origins setting
  3. Charts implementation
  • Update values.yaml with geo.actionCableAllowedOrigins
  • Template into gitlab.yml configuration
  4. GDK implementation
  • Update GDK's gitlab.yml.erb template to include the new setting for local development/testing
  5. Documentation
  • Explain admins must add both internal and external URLs of all Geo sites
  • Provide examples for unified URL and separate URL setups
  • Note: Required for WebSocket features (Duo Chat, live issue updates, etc.)
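The input filtering described in the Rails step could look roughly like this. It is a sketch under assumptions, not the merged implementation: normalize_allowed_origins is a hypothetical helper name, and where it gets called from is left open.

```ruby
require "uri"

# Keep only well-formed http(s) URLs, with trailing slashes removed.
# Non-strings, other schemes, regex-like strings and unparseable values are skipped.
def normalize_allowed_origins(raw_origins)
  Array(raw_origins).filter_map do |raw|
    next unless raw.is_a?(String)

    candidate = raw.strip.sub(%r{/+\z}, "")
    uri = begin
      URI.parse(candidate)
    rescue URI::InvalidURIError
      nil
    end

    # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
    next unless uri.is_a?(URI::HTTP) && uri.host && !uri.host.empty?

    candidate
  end.uniq
end

normalize_allowed_origins([
  "https://primary.example.com/",
  "http://secondary.example.com",
  "ftp://files.example.com",
  "not a url",
  nil
])
# => ["https://primary.example.com", "http://secondary.example.com"]
```

Returning plain strings rather than patterns keeps the setting in line with the "no regexes" requirement. The result would then be assigned to ActionCable's allowed_request_origins in production only, as the plan states.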

Workaround

  • On your localhost, update /etc/hosts and point your unified URL directly to the primary site

Patch release information for backports

If the bug fix needs to be backported in a patch release to a version under the maintenance policy, please follow the steps on the patch release runbook for GitLab engineers.

Refer to the internal "Release Information" dashboard for information about the next patch release, including the targeted versions, expected release date, and current status.

High-severity bug remediation

To remediate high-severity issues requiring an internal release for single-tenant SaaS instances, refer to the internal release process for engineers.

Edited by Chloe Fons