Audit and optimize default queue usage

Problem

The default queue (weight 3) is a catch-all for jobs without explicit queue assignments. This creates several issues:

  1. Unclear intent: Jobs in the default queue have no documented priority rationale
  2. Mixed priorities: The queue may contain both important and trivial operations
  3. Discourages explicit assignment: Developers may rely on default instead of choosing an appropriate queue
  4. Difficult to optimize: Can't tune performance without knowing what jobs are in the queue

Current State

The default queue has weight 3 (medium-high priority), which may be too high for a catch-all queue.

Unknown:

  • What jobs currently use the default queue?
  • Are they all appropriate for weight 3?
  • Should some be moved to more specific queues?

Proposal

Audit all jobs using the default queue and migrate them to appropriate specialized queues.

Phase 1: Discovery

  1. Identify all jobs in default queue:

    # Search codebase for jobs without explicit queue_as
    # Check Sidekiq metrics for jobs in default queue
    # Review recent job executions
  2. Categorize jobs:

    • Customer-facing operations → Move to appropriate high-priority queue
    • Internal operations → Move to appropriate medium-priority queue
    • Maintenance tasks → Move to low-priority queue
    • Truly generic → Keep in default but document why
  3. Document findings:

    • List all jobs currently using default
    • Proposed queue for each job
    • Rationale for the assignment
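The codebase search in step 1 could be sketched as a small Ruby script. This is a rough, regex-based sketch, not a definitive audit tool. It assumes job classes inherit directly from ApplicationJob and declare their queue with a literal queue_as call in the class body. The helper name jobs_without_queue_as and the app/jobs path are hypothetical. It will not detect queues inherited from a parent class or set at enqueue time, so its output still needs to be cross-checked against Sidekiq metrics.

```ruby
# Hypothetical helper: returns the names of job classes that never call queue_as.
# Input is a hash of { path => source } so the check can also run on in-memory strings.
CLASS_PATTERN = /^\s*class\s+([A-Z]\w*(?:::[A-Z]\w*)*)\s*<\s*ApplicationJob\b/
QUEUE_PATTERN = /^\s*queue_as\b/

def jobs_without_queue_as(sources)
  sources.each_with_object([]) do |(_path, source), missing|
    class_name = source[CLASS_PATTERN, 1]
    next if class_name.nil?               # file does not define a job class
    next if source.match?(QUEUE_PATTERN)  # explicit queue assignment found
    missing << class_name
  end.sort
end

# Assumed layout: job classes live under app/jobs.
if $PROGRAM_NAME == __FILE__
  sources = Dir.glob("app/jobs/**/*.rb").to_h { |path| [path, File.read(path)] }
  puts jobs_without_queue_as(sources)
end
```

Run from the application root, this prints one class name per line. That list can seed the findings document (job, proposed queue, rationale).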

Phase 2: Migration

  1. Create missing specialized queues (if needed):

    • Consider queues identified in other issues (#14269, #14271, etc.)
  2. Update job classes:

    # Before
    class SomeJob < ApplicationJob
      # No queue_as specified, uses :default
    end
    
    # After
    class SomeJob < ApplicationJob
      queue_as :appropriate_queue
      feature_category :relevant_category
    end
  3. Update tests:

    • Verify jobs are enqueued to correct queue
    • Update any tests that check queue names

Phase 3: Reduce Default Queue Priority

After migration, lower the default queue weight to discourage its use:

# config/sidekiq.yml
:queues:
  # ... other queues ...
  - [default, 2]  # Reduced from 3

This encourages developers to explicitly choose appropriate queues for new jobs.
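To get a feel for what the change means in practice, the sketch below assumes Sidekiq's weighted mode picks the first queue to poll with probability proportional to its weight. The other queues and their weights are hypothetical, since this issue does not list the full configuration. Only the default weights 3 and 2 come from the text above.

```ruby
# Share of first polls a queue receives: its weight divided by the total weight.
# Assumes polling probability is proportional to weight.
def poll_share(weights, queue)
  Rational(weights.fetch(queue), weights.values.sum)
end

# Hypothetical queue set; only default's 3 -> 2 change is taken from this issue.
before = { "critical" => 5, "default" => 3, "low" => 1 }
after  = before.merge("default" => 2)

puts format("default before: %.1f%%", (poll_share(before, "default") * 100).to_f)
puts format("default after:  %.1f%%", (poll_share(after, "default") * 100).to_f)
```

With these made-up weights, default's share of first polls drops from one third to one quarter. The real effect depends on the actual weights in config/sidekiq.yml.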

Phase 4: Establish Guidelines

Document when it's acceptable to use the default queue:

Acceptable uses:

  • Truly one-off administrative tasks
  • Jobs that genuinely don't fit any specialized queue
  • Temporary jobs during development (must be moved before production)

Not acceptable:

  • Customer-facing operations
  • Revenue-impacting jobs
  • Jobs that fit an existing specialized queue
  • Long-running or resource-intensive jobs

Implementation Steps

  1. Run audit (1-2 days):

    • Query Sidekiq for jobs in default queue
    • Search codebase for jobs without queue_as
    • Review with team to categorize jobs
  2. Create migration plan (1 day):

    • List all jobs to migrate
    • Identify target queues
    • Note any new queues needed
  3. Execute migration (3-5 days):

    • Update job classes
    • Update tests
    • Deploy to staging
    • Verify correct queue usage
  4. Lower default weight (after migration):

    • Update config/sidekiq.yml
    • Monitor for any issues
    • Document the change
  5. Add linting (optional):

    • Consider adding a linter to flag jobs without explicit queue_as
    • Add to CI pipeline

Success Criteria

  • All jobs declare their queue explicitly with queue_as, including any that intentionally stay on default
  • The default queue contains only truly generic jobs
  • The default queue weight reflects its catch-all nature (weight 2)
  • Documentation exists for when to use the default queue
  • No customer-facing or critical jobs remain in the default queue

Related

  • Parent epic: &19587
  • Blocks: #14268 (weight granularity)
  • Related: All other queue optimization issues