[Refactor] TrialDurationService to use Rails.cache
What does this MR do and why?
Replace ReactiveCaching with Rails.cache in TrialDurationService as an interim solution while we investigate the root cause of cache expiration issues.
Problem Context
In production, the ReactiveCaching implementation was not expiring after 1 hour as expected. When trial durations were updated in CustomersDot, GitLab.com showed stale data for 3-4 hours instead of the expected 1 hour.
Root Cause Investigation
Analysis suggests the issue may be related to ExclusiveLease failures in locking_reactive_cache, but the exact root cause is unclear:
- Suspected issue: First ReactiveWorker failed to obtain ExclusiveLease
- Cache extension: Frequent access (~1 request/minute) via keep_alive_reactive_cache! prevented natural expiration
- Result: Cache never refreshed, showing stale trial duration data
However, since we cannot definitively reproduce or identify the exact failure point, we're implementing an interim solution.
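To make the hypothesis above concrete, here is a toy model of a cache entry whose expiry is pushed forward on every read. It is only a sketch of the suspected mechanism: the SlidingEntry class and its methods are hypothetical and do not reproduce ReactiveCaching's internals.

```ruby
# Toy model only: SlidingEntry is a hypothetical name, not GitLab code.
class SlidingEntry
  attr_reader :value

  def initialize(value, lifetime:, now:)
    @value = value
    @lifetime = lifetime
    @expires_at = now + lifetime
  end

  # Mimics a keep-alive: every read pushes the expiry forward.
  def touch(now)
    @expires_at = now + @lifetime
  end

  def expired?(now)
    now >= @expires_at
  end
end

lifetime = 3600 # 1 hour, in seconds
entry = SlidingEntry.new('stale duration', lifetime: lifetime, now: 0)

# One read per minute for four hours, with no successful refresh in between.
expired_during_run = false
(60..4 * 3600).step(60) do |now|
  expired_during_run ||= entry.expired?(now)
  entry.touch(now)
end

puts expired_during_run # false: the entry never reached its 1-hour expiry
```

In this model the entry only expires once reads stop for a full lifetime, which would match the stale data we saw if the background refresh also never ran.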
Interim Solution: Rails.cache
Given that the CustomersDot API is performant (~0.05 seconds in production), ReactiveCaching's complexity is unnecessary:
- Performance: The API responds in ~50ms, which is acceptable when roughly one request per hour hits a cache miss
- Reliability: Rails.cache provides guaranteed 1-hour expiration
- Simplicity: Eliminates background job complexity and potential race conditions
Changes made:
- Replace ReactiveCaching module with Rails.cache.fetch
- Add race_condition_ttl (30 seconds) to reduce the risk of a cache stampede
- Remove ReactiveCaching configuration and helper methods
- Preserve all DEFAULT_DURATIONS constants and fallback behavior
- Update tests to verify new caching behavior
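As an illustration of the changes listed above, here is a minimal sketch of the Rails.cache.fetch pattern with a 1-hour expiry and a 30-second race_condition_ttl. The names TrialDurationSketch and TinyCache, and the injected cache and fetcher arguments, are hypothetical and exist only so the snippet runs outside Rails. In the application the store would be Rails.cache, and the actual service may be structured differently. TinyCache accepts race_condition_ttl but ignores it.

```ruby
# Hypothetical stand-in for Rails.cache so the sketch runs without Rails.
# It mimics only fetch/exist?/delete with expires_in.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Time.now })
    @clock = clock
    @data = {}
  end

  # Returns the cached value if it is still fresh, otherwise runs the block
  # and stores its result. race_condition_ttl is accepted but ignored here.
  def fetch(key, expires_in:, race_condition_ttl: nil)
    entry = @data[key]
    return entry.value if entry && entry.expires_at > @clock.call

    value = yield
    @data[key] = Entry.new(value, @clock.call + expires_in)
    value
  end

  def exist?(key)
    entry = @data[key]
    !entry.nil? && entry.expires_at > @clock.call
  end

  def delete(key)
    !@data.delete(key).nil?
  end
end

# Hypothetical sketch of the service; not the code in this MR.
class TrialDurationSketch
  CACHE_KEY = 'gitlab_subscriptions_trial_duration_service'
  EXPIRES_IN = 3600        # 1 hour, in seconds
  RACE_CONDITION_TTL = 30  # seconds

  # cache:   anything responding to fetch like Rails.cache does
  # fetcher: callable standing in for the CustomersDot API request
  def initialize(cache:, fetcher:)
    @cache = cache
    @fetcher = fetcher
  end

  def execute
    @cache.fetch(CACHE_KEY, expires_in: EXPIRES_IN, race_condition_ttl: RACE_CONDITION_TTL) do
      @fetcher.call
    end
  end
end
```

Because the entry is written with a fixed expires_in and reads do not extend it, the API is called again on the first request after the hour has passed, regardless of how often the cache is read in between.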
Next Steps
A dedicated issue will be created to investigate the root cause of the ReactiveCaching failure for future reference and potential fixes to the ReactiveCaching mechanism itself.
References
Screenshots or screen recordings
N/A - This is a backend refactor with no UI changes.
How to set up and validate locally
- Make sure SaaS simulation is enabled.
- Make sure CustomersDot is running.
- If not, you can mock the API response.
- Open a Rails console.
In the Rails console:
service = GitlabSubscriptions::TrialDurationService.new
# First call populates cache
service.execute
Rails.cache.exist?('gitlab_subscriptions_trial_duration_service') # Should return true
# Manually expire cache to simulate 1-hour passage
Rails.cache.delete('gitlab_subscriptions_trial_duration_service')
Rails.cache.exist?('gitlab_subscriptions_trial_duration_service') # Should return false
# Next call repopulates cache
service.execute
Rails.cache.exist?('gitlab_subscriptions_trial_duration_service') # Should return true
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #561213