Add functionality to identify large repositories using existing project statistics data

Problem Statement

Currently, identifying large repositories (monorepos) in a GitLab environment requires either running Rails console commands or iterating over every project through the API. Both approaches are inefficient for administrators who need to check repository sizes, especially when migrating from existing infrastructure to a new GitLab architecture.

Current State

GitLab already collects repository size data for individual projects and displays it in several places (for example, the Usage Quotas page and the Project API) using project_statistics.repository_size. The data is pulled from Gitaly via app/models/project_statistics.rb and lib/gitlab/git/repository.rb.

Proposal

Reuse the existing project_statistics.repository_size data to give administrators easy access to large-repository information through an Admin UI dashboard or page. The page would list projects with a repository size greater than 1 GB, sorted by repository size, and provide:

  • Project name and path
  • Repository size (from repository_size field)
  • Total repository size (repository + LFS objects)
  • Link to project
  • Filtering and sorting capabilities
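
The selection logic behind such a list can be sketched in plain Ruby. This is only an illustration under stated assumptions, not GitLab's implementation: ProjectSize, large_repositories, format_gib, and the sample data are hypothetical names invented here, the sizes are assumed to be byte counts, and the threshold is assumed to be 1 GiB because the proposal does not say how "1GB" should be measured. In GitLab itself the values would come from project_statistics rather than an in-memory array.

```ruby
# Hypothetical in-memory stand-in for one row of project statistics.
# All sizes are assumed to be byte counts.
ProjectSize = Struct.new(:name, :path, :repository_size, :lfs_objects_size,
                         keyword_init: true) do
  # Total repository size as described in the list: repository + LFS objects.
  def total_size
    repository_size + lfs_objects_size
  end
end

# Assumed threshold: the proposal says ">1GB" without defining the unit.
ONE_GIB = 1024**3

# Returns projects whose repository_size exceeds the threshold, largest first.
def large_repositories(projects, threshold: ONE_GIB)
  projects
    .select { |project| project.repository_size > threshold }
    .sort_by { |project| -project.repository_size }
end

# Formats a byte count as GiB with two decimal places.
def format_gib(bytes)
  format('%.2f GiB', bytes.to_f / ONE_GIB)
end

# Made-up sample data for illustration only.
projects = [
  ProjectSize.new(name: 'monorepo', path: 'group/monorepo',
                  repository_size: 5 * ONE_GIB, lfs_objects_size: 2 * ONE_GIB),
  ProjectSize.new(name: 'docs', path: 'group/docs',
                  repository_size: 50 * 1024**2, lfs_objects_size: 0),
  ProjectSize.new(name: 'assets', path: 'group/assets',
                  repository_size: 2 * ONE_GIB, lfs_objects_size: 10 * ONE_GIB)
]

# Print path, repository size, and total size as tab-separated columns.
large_repositories(projects).each do |project|
  puts [project.path,
        format_gib(project.repository_size),
        format_gib(project.total_size)].join("\t")
end
```

With this sample data, group/monorepo is listed before group/assets and group/docs is excluded. Sorting uses repository_size rather than the total, following the wording of the proposal; sorting by total size would be an equally simple variation.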

Alternatively, this can be done as a Rake task, but a UI would be more user-friendly.

Update: see the note on #578110 (comment 2843950599). An MVC with Rake output can be a good first iteration to unblock folks.

Related Documentation

Edited by Nailia Iskhakova