Add regex support for auth success URL

Problem

A customer wanted to use DAST_AUTH_SUCCESS_IF_AT_URL to verify that the DAST login was successful. However, the post-login landing URL was dynamic (different on each login), so they couldn't take this approach.

What Needs to Be Done

We need to support both exact URLs and regex-based URL matching in the DAST_AUTH_SUCCESS_IF_AT_URL field, while maintaining backward compatibility with existing configurations. The change should let users specify a regex pattern explicitly, without misclassifying valid URLs as regex.

Proposal

If possible, support both a URL and a regex in the existing DAST_AUTH_SUCCESS_IF_AT_URL field; otherwise, add a new variable for the regex.

This shouldn't be a breaking change: customers who use the full URL should still be able to verify login success using that approach.

Proposed Approaches

Approach 1: Explicit Regex with r:/ Prefix

  • Description: Users prefix their value with r:/ to indicate that it's a regex, e.g. r:/https://example.com/user/[0-9]+.
  • Pros:
    • Very explicit and clear to interpret in code.
    • Easy to validate and test regex intent.
    • Clean separation between exact and regex-based URLs.
  • Cons:
    • Introduces a new convention not used elsewhere in the product.
    • May confuse users unfamiliar with regex.
    • Risk of inconsistent UX with other pattern features (e.g. scope variables or wildcards).

This approach avoids ambiguity with real URLs and keeps the implementation simple and explicit. The validateURIs function will also be updated to parse and validate regex patterns only when the r:/ prefix is detected.

Regex examples:

  • r:/https://example.com/dashboard/session/[0-9]+/

Exact URL examples:

  • https://example.com/dashboard/user/12345
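
As an illustration of Approach 1, the prefix check could look roughly like the sketch below. It is not browserker's actual code: matchSuccessURL and regexPrefix are hypothetical names, and the choice to leave the pattern unanchored is an assumption, since the proposal does not say whether the regex should match the whole URL.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// regexPrefix is the marker proposed in Approach 1.
const regexPrefix = "r:/"

// matchSuccessURL is a hypothetical helper (not part of browserker).
// It reports whether currentURL satisfies the configured value:
// values with the r:/ prefix are compiled as a regex, everything
// else keeps the existing exact-match behaviour.
func matchSuccessURL(configured, currentURL string) (bool, error) {
	if !strings.HasPrefix(configured, regexPrefix) {
		// No prefix: plain string comparison, as before.
		return configured == currentURL, nil
	}

	pattern := strings.TrimPrefix(configured, regexPrefix)
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, fmt.Errorf("invalid regex %q: %w", pattern, err)
	}
	// Assumption: unanchored match; users add ^ and $ themselves if needed.
	return re.MatchString(currentURL), nil
}

func main() {
	ok, err := matchSuccessURL(
		"r:/https://example.com/dashboard/session/[0-9]+/",
		"https://example.com/dashboard/session/42/",
	)
	fmt.Println(ok, err)
}
```

Because the regex is compiled only when the prefix is present, a plain URL containing characters such as ? or . is never interpreted as a pattern, which is the backward-compatibility property the proposal asks for.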

Approach 2: Wildcard/URL Pattern Normalization

  • Description: Leverage the internal conventions already used elsewhere in browserker, where URLs and patterns are normalized and converted to regex under the hood. If the value starts with http, it is parsed as a URL and the RequestURI is extracted; the query delimiter (?) is escaped automatically, and a regex anchor (^) is prepended to enforce matching from the start. Regex behavior is layered on internally.

  • How it Works:

    • If the input starts with http, it's assumed to be a URL and the RequestURI (path + query) is extracted.
    • The query delimiter ? is escaped automatically to prevent regex syntax issues.
    • The string is prefixed with ^ to anchor the regex at the start.
    • If the input is not a URL, it's still treated as a pattern, and minimally transformed into a regex-friendly form.
    • Regex metacharacters like *, ., [, etc. remain functional - only ? is escaped.
    • No special prefix (r:/) is required — this keeps it seamless and consistent with other fields.
  • Pros:

    • Familiar syntax for users: Users can reuse the pattern syntax they already know from existing features.
    • Reduces some regex footguns: Helps prevent common mistakes like forgetting to escape ?.
    • Aligns with product consistency: Reuses the pattern-based matching already applied in other parts of the code (AddExcludedURIs), so all configuration fields behave the same way.
    • Simplifies validation: Centralized logic can safely compile and validate user input.
  • Cons:

    • Limits advanced regex: Regex users may feel constrained by the simplified wrapper.
  • Design Intent: This design favors usability and consistency over full regex power. It's ideal for users who want basic dynamic matching (e.g., a URL with optional query params) and don't want to learn regex. For expert users, more advanced regex features can still work within the limits of the normalized input, but the system avoids exposing raw regex complexity unless absolutely necessary.

  • Example Inputs & Behavior:

| Input | Internally Compiled As | Notes |
|---|---|---|
| http://localhost:8080/page.html | ^/page.html | Extracts RequestURI and anchors start |
| http://localhost/dashboard?tab=1 | ^/dashboard\?tab=1 | Escaped ? |
| http://localhost:8090/dashboard*.html | ^localhost/dashboard* | Treated as a basic pattern; * acts as a wildcard, and the standard regex .* also works |
| [0-9]+ | ^[0-9]+ | Partial raw regex works, but must follow constraints |
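
The normalization steps listed under "How it Works" can be sketched as follows. This is a minimal illustration under stated assumptions, not the existing browserker logic: normalizeToPattern is a hypothetical name, and the sketch only covers the steps spelled out above (extract the RequestURI, escape ?, prepend ^). It does not attempt any special handling of a bare * wildcard, so it should not be read as reproducing the wildcard row of the example table.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// normalizeToPattern is a hypothetical helper (not part of browserker)
// that turns a configured value into an anchored regex.
func normalizeToPattern(input string) (*regexp.Regexp, error) {
	pattern := input

	// Values starting with "http" are parsed as URLs and reduced to
	// their RequestURI (path plus query), dropping scheme and host.
	if strings.HasPrefix(input, "http") {
		u, err := url.Parse(input)
		if err != nil {
			return nil, fmt.Errorf("invalid URL %q: %w", input, err)
		}
		pattern = u.RequestURI()
	}

	// Only the query delimiter is escaped; other metacharacters
	// such as ., [ and * keep their regex meaning.
	pattern = strings.ReplaceAll(pattern, "?", `\?`)

	// Anchor at the start so the match begins at the first character.
	re, err := regexp.Compile("^" + pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid pattern %q: %w", input, err)
	}
	return re, nil
}

func main() {
	for _, in := range []string{
		"http://localhost:8080/page.html",
		"http://localhost/dashboard?tab=1",
		"[0-9]+",
	} {
		re, err := normalizeToPattern(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(in, "=>", re.String())
	}
}
```

One consequence of escaping every ? is that a user cannot write an optional quantifier such as s?, which is an instance of the "limits advanced regex" trade-off listed under Cons.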

This change will preserve all existing behaviour for users relying on exact URLs and enable regex-based matching in a way that mirrors how it is done elsewhere in the codebase.

Edited by Hannah Baker