Clarify the Failing webhooks documentation to explain which backoff strategy is applied

Problem to solve

The documentation and the original MR do not agree about when and how a failing webhook is automatically disabled.

  • For 5xx errors, the documentation says to disable for 1 minute, extended on each retry up to 24 hours. The MR says 10 minutes with a backoff factor of 2.0, never to exceed 24 hours.
  • For 4xx errors, the documentation says that these are understood to be misconfigured hooks and are permanently disabled, requiring manual intervention. The MR says there is a "3 strikes policy" and they will only be disabled on the 4th failure. Additionally, 429 errors (too many requests) may be handled differently.
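
To make the discrepancy concrete, here is a minimal Python sketch of the behaviour as the MR describes it. It is not the actual implementation: the constants come only from the MR summary above, and every name (`INITIAL_BACKOFF`, `next_backoff`, `backoff_schedule`, `is_permanently_disabled`) is hypothetical. Handling of 429 responses is left out because the issue does not say how it differs.

```python
from datetime import timedelta
from typing import List, Optional

# Assumed values, taken from the MR description quoted in this issue.
# They have not been checked against the real code.
INITIAL_BACKOFF = timedelta(minutes=10)
BACKOFF_FACTOR = 2.0
MAX_BACKOFF = timedelta(hours=24)
FAILURE_THRESHOLD = 3  # "3 strikes": disabled only on the 4th failure


def next_backoff(current: Optional[timedelta]) -> timedelta:
    """Disable period after one more 5xx failure (hypothetical helper)."""
    if current is None:
        # First failure: start from the initial period.
        return INITIAL_BACKOFF
    # Later failures: multiply by the factor, never exceeding the cap.
    return min(current * BACKOFF_FACTOR, MAX_BACKOFF)


def backoff_schedule(failures: int) -> List[timedelta]:
    """Disable periods for a run of consecutive 5xx failures."""
    schedule: List[timedelta] = []
    current: Optional[timedelta] = None
    for _ in range(failures):
        current = next_backoff(current)
        schedule.append(current)
    return schedule


def is_permanently_disabled(recent_4xx_failures: int) -> bool:
    """True once the 4xx failure count exceeds the three allowed strikes."""
    return recent_4xx_failures > FAILURE_THRESHOLD


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_schedule(10), start=1):
        print(attempt, delay)
```

Under these assumptions the disable period doubles from 10 minutes and reaches the 24-hour cap on the ninth consecutive 5xx failure. The documented behaviour differs: it starts from 1 minute and does not state a growth factor. A hook returning 4xx is disabled on its fourth failure, not its first.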

Further details

Proposal

Review the current code logic and/or discuss with devopsecosystem to clarify what happens, and update the documentation to reflect this.

Who can address the issue

Someone familiar with how webhooks are automatically disabled.

Other links/references

Link the doc and describe what is wrong with it.