Merge Requests API returns 408 instead of 504 under specific conditions

Summary

Under specific conditions, the Merge Requests API returns a 408 instead of a 504 error. I was not able to isolate a specific failure case that reproduces the issue. The 408 errors seem to occur most often when querying for automation users that frequently cycle merge requests, which may reduce the system's ability to return results from cache.

Steps to reproduce

  1. Have a service user that creates and closes Merge Requests in a large, active namespace
  2. Run a script like the following with a valid API token, configured with the service user's username and the namespace ID:
exercise-the-api.py
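import datetime
import gitlab  # python-gitlab

# Client configuration. Supply a valid API token in private_token.
gl = gitlab.Gitlab('https://gitlab.com',
                   private_token="", timeout=60,
                   pagination="keyset", order_by="id", per_page=100,
                   retry_transient_errors=True)
gl.enable_debug()  # log the HTTP traffic so response status codes are visible

username = ''   # username of the service (automation) user
groupid = None  # placeholder: set to the numeric ID of the namespace (group)
state = 'opened'
num_days = 30

group = gl.groups.get(groupid)
# Only consider merge requests created more than num_days ago.
oldestDateTime = datetime.datetime.now() - datetime.timedelta(days=num_days)

# iterator=True pages through the results lazily, so the list requests
# are issued while the loop below runs.
mrs = group.mergerequests.list(state=state,
                               author_username=username,
                               created_before=oldestDateTime.isoformat(),
                               order_by='created_at',
                               sort='desc',
                               iterator=True)

for groupmr in mrs:
    try:
        # The API calls are inside the try block so that a failed request
        # is printed instead of ending the scan.
        project = gl.projects.get(groupmr.project_id)
        mr = project.mergerequests.get(groupmr.iid)
        # print("MR:", mr.title, project.path_with_namespace)
        branch = mr.source_branch
        print("MR branch:", branch)
    except Exception as e:
        print(e)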

This script is not required; it's simply supplied as an example.

What is the current bug behavior?

Under the conditions that produce this failure, the API returns a 408. The logs show ~15 seconds of db_duration_s, which indicates that the API received the complete query and was attempting to service it. Returning a 408 in these cases causes problems for clients that expect a 504.
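
Until the status code is corrected, one possible client-side mitigation is to treat a 408 from this endpoint the same way as a 504. The sketch below is only an illustration under stated assumptions: `call_with_timeout_retry`, `fetch`, and the backoff values are hypothetical names and parameters, not part of the GitLab API or python-gitlab, and the request is assumed to be safe to repeat.

```python
import time
from typing import Callable, Iterable, Tuple

# Assumption: both codes are treated as retryable server-side timeouts.
RETRYABLE_TIMEOUT_CODES = frozenset({408, 504})


def call_with_timeout_retry(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    retryable: Iterable[int] = RETRYABLE_TIMEOUT_CODES,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call fetch() until it returns a non-retryable status or attempts run out.

    fetch is a hypothetical callable that performs one request and
    returns (status_code, body).
    """
    retryable = frozenset(retryable)
    status, body = fetch()
    for attempt in range(1, max_attempts):
        if status not in retryable:
            break
        # Exponential backoff: base_delay_s, then 2 * base_delay_s, and so on.
        sleep(base_delay_s * 2 ** (attempt - 1))
        status, body = fetch()
    return status, body
```

The `sleep` parameter is injectable only so the behavior can be exercised without real delays; a real client would leave it at the default.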

The dashboard linked under "Relevant logs and/or screenshots" is filtered to show Kibana logs from a customer that reported this issue. I believe other 408 errors may be similar, but I was not able to validate this.

What is the expected correct behavior?

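When this API call reaches 15 seconds of db_duration_s, it should return a 504. A 408 (Request Timeout) indicates that the server timed out waiting for the client to finish sending the request, which is not the case when the time was spent in the database.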
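
The expected mapping can be sketched as a small decision rule. This is not GitLab's implementation; `timeout_status` and its `request_fully_received` flag are hypothetical names used only to illustrate the distinction this issue relies on.

```python
from http import HTTPStatus


def timeout_status(request_fully_received: bool) -> HTTPStatus:
    """Pick the status code for a timed-out request (illustrative only)."""
    if request_fully_received:
        # The server had the whole request and ran out of time servicing it
        # (for example while waiting on the database): expected 504.
        return HTTPStatus.GATEWAY_TIMEOUT
    # The client did not finish sending the request in time: 408.
    return HTTPStatus.REQUEST_TIMEOUT
```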

Relevant logs and/or screenshots

Scoped Kibana dashboard showing logs of the issue: https://log.gprd.gitlab.net/app/r/s/Pwufd

This issue was also discussed in confidential RFH 291.

Possible fixes