Run the RSpec test suite with large sequence ranges (bigint)
Problem
A recent seeding failure in GDK was caused by an attempt to store a sequence ID (bigint) in an integer column: gitlab-development-kit#2931 (comment 2718526062).
The same problem could affect the legacy cell, but only on large tables whose IDs exceed roughly 2 billion (2^31 - 1). In Cells it will surface immediately, because the sequence ranges for a new cell start far beyond that limit (the cell-2 range simulated in this issue begins at 100_000_000_001).
There is no easy way to find all such occurrences, but running our entire test suite as a cell would surface them.
Essentially, we want RSpec to run in another variant (similar to pg16 single-db, pg16 single-db-ci-connection, etc.) that has its sequence ranges altered and creates objects using them.
Root Cause Analysis
The issue stems from integer column overflow when sequence values exceed the 32-bit integer limit (2^31 - 1 = 2,147,483,647). This manifests in two scenarios:
- Legacy environments: Large tables with IDs approaching 2 billion records
- Cells environments: New cells start with sequence ranges around 100 billion (the simulated cell-2 range begins at 100_000_000_001), immediately exceeding integer column limits
The problem occurs when:
- Database sequences generate bigint values (64-bit)
- Application code assumes integer columns (32-bit)
- Inserting the value overflows the integer column and PostgreSQL rejects it with an "integer out of range" error, causing seeding failures
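The arithmetic behind the overflow can be checked with a minimal Ruby sketch. The constant and method names are illustrative and not taken from the GitLab codebase; the range start is the value passed to the setup_db command later in this issue.

```ruby
# Largest value a signed 32-bit PostgreSQL integer column can hold.
INT4_MAX = 2**31 - 1 # 2_147_483_647

# Start of the simulated cell-2 sequence range used in this issue.
CELL_SEQUENCE_START = 100_000_000_001

# Returns true when the value can be stored in a 32-bit integer column.
def fits_in_integer_column?(value)
  value.between?(-(2**31), INT4_MAX)
end

puts fits_in_integer_column?(INT4_MAX)            # true
puts fits_in_integer_column?(INT4_MAX + 1)        # false
puts fits_in_integer_column?(CELL_SEQUENCE_START) # false
```

Any ID drawn from the cell range fails this check on the very first insert, which is why the problem shows up immediately in Cells rather than only on very large tables.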
Proposed Solutions
Solution 1: CI Sequence Range Simulation (Draft MR !204731)
Approach: Modify CI pipelines to simulate cell-like sequence ranges during database setup.
Implementation:
- Extract reusable gitlab:db:alter_sequences_range rake task from cell-specific version
- Configure db:migrate:reset to simulate cell-2 sequence ranges (100B-200B)
- Apply to all CI pipelines to surface integer overflow issues early
Changes:
# In scripts/utils.sh setup_db function
run_timed_command_with_metric "bundle exec rake db:drop db:create db:schema:load db:migrate gitlab:db:lock_writes 'gitlab:db:alter_sequences_range[100_000_000_001,200_000_000_000]'" "setup_db"
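To make the effect of the extra rake argument concrete, the following Ruby sketch shows roughly what altering a sequence range involves in PostgreSQL. It is not the implementation of gitlab:db:alter_sequences_range: the method names and the example sequence name are hypothetical, and only the ALTER SEQUENCE syntax is standard PostgreSQL.

```ruby
# Parses rake-style bounds such as "100_000_000_001" into integers.
# Integer() accepts underscore separators, so the literals from the
# setup_db command can be passed through unchanged.
def parse_range(min_arg, max_arg)
  min = Integer(min_arg)
  max = Integer(max_arg)
  raise ArgumentError, "min must be smaller than max" unless min < max

  [min, max]
end

# Builds one ALTER SEQUENCE statement per sequence so that the next
# generated ID falls inside the given range.
def alter_sequence_statements(sequence_names, min, max)
  sequence_names.map do |name|
    %(ALTER SEQUENCE "#{name}" MINVALUE #{min} MAXVALUE #{max} RESTART WITH #{min};)
  end
end

min, max = parse_range("100_000_000_001", "200_000_000_000")
# "projects_id_seq" is only an example sequence name.
puts alter_sequence_statements(["projects_id_seq"], min, max)
```

With the range applied before the specs run, every record created by a factory or seed receives an ID above the 32-bit limit, so any integer column that stores such an ID fails on insert.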
Pros:
- ✅ Catches bigint overflow issues in CI before production
- ✅ No additional CI jobs required
- ✅ Immediate feedback on MRs
- ✅ Tests realistic cell conditions
Cons:
- ⚠️ Potential for "master broken" due to configuration change
- ⚠️ All tests run with non-standard sequence ranges
Solution 2: Dedicated CI Variant (Alternative)
Approach: Create a separate CI job variant (for example, pg16 single-db-bigint) that runs with large sequence ranges.
Pros:
- ✅ Isolated testing environment
- ✅ Preserves existing test behavior
Cons:
- ❌ Additional CI jobs increase pipeline complexity
- ❌ Delayed feedback (separate job)
- ❌ Higher CI resource usage
Impact Assessment
Immediate Impact
- Development: Seeding failures in GDK when using cell configurations
- Testing: Potential test failures when sequence values exceed integer limits
- CI/CD: Pipeline failures due to database constraint violations
Long-term Impact
- Production Risk: Silent data corruption or application errors in high-volume environments
- Scalability: Blocks adoption of Cells architecture for large GitLab instances
- Data Integrity: Risk of ID collisions or truncated values
Affected Components
Based on the linked GDK issues, potential problem areas include:
- Database models with integer ID columns that should be bigint
- Seed data generation scripts
- Migration files with incorrect column types
- Application code making integer size assumptions
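One way to narrow down the first category is to inspect column metadata for integer columns whose default draws from a sequence. The sketch below filters rows shaped like PostgreSQL's information_schema.columns view (table_name, column_name, data_type, column_default). The sample rows and the method name are made up for illustration, and this check would not catch integer foreign keys that reference bigint IDs.

```ruby
# Query that returns the metadata this sketch expects. It is shown for
# context only and is not executed here.
COLUMNS_QUERY = <<~SQL
  SELECT table_name, column_name, data_type, column_default
  FROM information_schema.columns
  WHERE table_schema = 'public'
SQL

# Selects 32-bit integer columns whose default value comes from a
# sequence (nextval), since those overflow once the sequence is large.
def overflow_candidates(columns)
  columns.select do |col|
    col[:data_type] == "integer" && col[:column_default].to_s.include?("nextval(")
  end
end

# Hypothetical sample rows for illustration.
sample = [
  { table_name: "widgets", column_name: "id", data_type: "integer",
    column_default: "nextval('widgets_id_seq'::regclass)" },
  { table_name: "gadgets", column_name: "id", data_type: "bigint",
    column_default: "nextval('gadgets_id_seq'::regclass)" },
  { table_name: "widgets", column_name: "position", data_type: "integer",
    column_default: nil }
]

overflow_candidates(sample).each do |col|
  puts "#{col[:table_name]}.#{col[:column_name]}"
end
# prints "widgets.id"
```

A static check like this complements the CI run: the altered sequence ranges reveal failures that specs actually exercise, while the metadata scan lists candidates that no spec happens to touch.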
Recommended Next Steps
- Immediate: Complete and merge Draft MR !204731 to enable CI detection
- Short-term: Run comprehensive test suite to identify all affected models/columns
- Medium-term: Create migration plan to convert integer columns to bigint where needed
- Long-term: Establish guidelines to prevent future integer/bigint mismatches
Documentation Needs
As suggested by @praba.m7n, we should document:
- New CI behavior with large sequence ranges
- Guidelines for choosing integer vs bigint column types
- Migration patterns for existing integer columns
- Testing procedures for cell-compatible code
This issue affects not just Cells but any high-volume GitLab instance approaching the 2 billion record limit, making it a critical infrastructure concern.