
[Refactor] TrialDurationService to use Rails.cache

What does this MR do and why?

Replace ReactiveCaching with Rails.cache in TrialDurationService as an interim solution while we investigate the root cause of cache expiration issues.

Problem Context

In production, the value cached through ReactiveCaching did not expire after 1 hour as expected. When trial durations were updated in CustomersDot, GitLab.com kept showing stale data for 3-4 hours instead of the expected 1 hour.

Root Cause Investigation

Analysis suggests the issue may be related to ExclusiveLease failures in locking_reactive_cache, but the exact root cause is unclear:

  1. Suspected issue: First ReactiveWorker failed to obtain ExclusiveLease
  2. Cache extension: Frequent access (~1 request/minute) via keep_alive_reactive_cache! prevented natural expiration
  3. Result: Cache never refreshed, showing stale trial duration data
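
The suspected sequence above can be illustrated with a toy model. This is not the ReactiveCaching implementation: the class name, the `lease_obtained:` flag, and the duration strings are hypothetical, and the model only assumes that every read pushes the expiry forward while a refresh that fails to obtain its lease changes nothing.

```ruby
# Toy model only: NOT the ReactiveCaching implementation.
# Assumption: each read extends the lifetime (keep-alive), and a refresh
# is silently skipped when the lease cannot be obtained.
class KeepAliveCacheModel
  LIFETIME = 3600 # seconds (1 hour)

  def initialize(value, now)
    @value = value
    @alive_until = now + LIFETIME
  end

  # Returns the cached value and pushes the expiry forward,
  # or returns nil once the entry has expired.
  def read(now)
    return nil if now >= @alive_until

    @alive_until = now + LIFETIME
    @value
  end

  # Replaces the value only when the (hypothetical) lease was obtained.
  def refresh(new_value, lease_obtained:)
    @value = new_value if lease_obtained
  end
end

cache = KeepAliveCacheModel.new('30 days', 0)

# The upstream value changed, but the refresh failed to obtain its lease.
cache.refresh('60 days', lease_obtained: false)

# One read per minute for four hours: each read keeps the entry alive.
results = (1..240).map { |minute| cache.read(minute * 60) }
stale_reads = results.count('30 days')
puts stale_reads
```

In this model every read during the four hours returns the old value, because the entry is never allowed to expire and the only refresh attempt was skipped.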

However, since we cannot definitively reproduce or identify the exact failure point, we're implementing an interim solution.

Interim Solution: Rails.cache

Given that the CustomersDot API is performant (~0.05 seconds in production), ReactiveCaching's complexity is unnecessary:

  • Performance: API responds in ~50ms, which is acceptable when roughly one request per hour hits a cache miss
  • Reliability: Rails.cache expires entries on a fixed 1-hour TTL that reads do not extend
  • Simplicity: Eliminates background job complexity and potential race conditions

Changes made:

  • Replace ReactiveCaching module with Rails.cache.fetch
  • Add race_condition_ttl (30 seconds) to prevent cache stampede
  • Remove ReactiveCaching configuration and helper methods
  • Preserve all DEFAULT_DURATIONS constants and fallback behavior
  • Update tests to verify new caching behavior
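
As a rough sketch of the resulting shape (not the actual diff), the service reduces to a single `fetch` call with a TTL and a fallback. `TinyCache` is a hypothetical stand-in so the example runs outside Rails; in the application the store would be `Rails.cache`, whose `fetch` accepts the same `expires_in:` and `race_condition_ttl:` options. The class name `TrialDurationSketch`, the injected `client`, the placeholder `DEFAULT_DURATIONS` values, and the choice to rescue outside the `fetch` block are all assumptions.

```ruby
# Hypothetical stand-in for Rails.cache so the sketch runs outside Rails.
# It only mimics the call shape fetch(key, expires_in:, race_condition_ttl:).
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Time.now.to_f })
    @clock = clock
    @entries = {}
  end

  def fetch(key, expires_in:, race_condition_ttl: 0)
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > @clock.call

    # race_condition_ttl is accepted but ignored: this toy store is
    # single-threaded, so there is no stampede to guard against.
    value = yield
    @entries[key] = Entry.new(value, @clock.call + expires_in)
    value
  end
end

class TrialDurationSketch
  CACHE_KEY = 'gitlab_subscriptions_trial_duration_service'
  CACHE_TTL = 3600         # 1 hour, in seconds
  RACE_CONDITION_TTL = 30  # seconds
  # Placeholder values for illustration, not the real constants.
  DEFAULT_DURATIONS = { default: 30 }.freeze

  # cache:  any object responding to fetch as above (Rails.cache in the app)
  # client: a callable that fetches durations (hypothetical API wrapper)
  def initialize(cache:, client:)
    @cache = cache
    @client = client
  end

  def execute
    @cache.fetch(CACHE_KEY, expires_in: CACHE_TTL,
                            race_condition_ttl: RACE_CONDITION_TTL) do
      @client.call
    end
  rescue StandardError
    # Assumption: fall back to defaults and do not cache the failure.
    DEFAULT_DURATIONS
  end
end

service = TrialDurationSketch.new(
  cache: TinyCache.new,
  client: -> { { default: 60 } } # stands in for the CustomersDot call
)
puts service.execute.inspect
```

Because the expiry is set once when the entry is written, frequent reads cannot keep a stale entry alive, which is the property the interim solution relies on.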

Next Steps

A dedicated issue will be created to investigate the root cause of the ReactiveCaching failure for future reference and potential fixes to the ReactiveCaching mechanism itself.

References

Screenshots or screen recordings

N/A - This is a backend refactor with no UI changes.

How to set up and validate locally

  1. Make sure SaaS simulation is enabled.
  2. Make sure CustomersDot is running.
    1. If it is not, you can mock the API response.
  3. Open a Rails console.

In the Rails console:

 service = GitlabSubscriptions::TrialDurationService.new
 
 # First call populates cache
 service.execute
 Rails.cache.exist?('gitlab_subscriptions_trial_duration_service') # Should return true
 
 # Manually expire cache to simulate 1-hour passage
 Rails.cache.delete('gitlab_subscriptions_trial_duration_service')
 Rails.cache.exist?('gitlab_subscriptions_trial_duration_service') # Should return false
 
 # Next call repopulates cache
 service.execute
 Rails.cache.exist?('gitlab_subscriptions_trial_duration_service') # Should return true

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #561213
