Move enumeration of new blobs out of file size checks into a utility class

Overview

To run secret detection scans on pushed code, we have to fetch and enumerate Git blobs. The enumeration process is described in full in the parent epic.

As of now, the same enumeration logic exists in the following file-size checks:

Rather than reinventing the wheel, we want to reuse that logic. To do so, we have to move it into a utility class that both the file-size checks above and the new push check created in #427039 (closed) can use.

Note: With the enumeration moved into a utility class, we can cache the blobs and avoid multiple calls to list_all_blobs/list_blobs, which can reduce latency and means our approach makes essentially zero extra requests to Gitaly.

Note: The statement above no longer holds. list_all_blobs/list_blobs currently do not load blob data in the file-size checks, and to load the data we have to request it explicitly, so moving the enumeration into a utility/helper class will likely not help much in terms of latency.

See this discussion for more information.
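The point of the second note can be illustrated with a small sketch. It is not GitLab code: FakeGitaly, its list_blobs stand-in, and load_data are hypothetical names that only mimic the behaviour described above, where enumeration returns blob metadata and the contents need a separate, explicit request.

```ruby
# Hypothetical in-memory stand-in for Gitaly that counts requests.
# None of these names are taken from the GitLab codebase.
class FakeGitaly
  attr_reader :requests

  def initialize(contents)
    @contents = contents # { blob id => blob data }
    @requests = Hash.new(0)
  end

  # Mimics enumeration that does not load blob data: ids and sizes only.
  def list_blobs
    @requests[:list_blobs] += 1
    @contents.map { |id, data| { id: id, size: data.bytesize } }
  end

  # Hypothetical explicit request for blob contents.
  def load_data(ids)
    @requests[:load_data] += 1
    @contents.slice(*ids)
  end
end

gitaly = FakeGitaly.new("a1" => "token=abc", "b2" => "x" * 2048)

# A file-size check only needs the metadata from enumeration.
metadata = gitaly.list_blobs
oversized = metadata.select { |blob| blob[:size] > 1024 }

# A secret detection scan needs the contents, which costs another request
# even when the metadata is shared or cached.
payloads = gitaly.load_data(metadata.map { |blob| blob[:id] })

puts gitaly.requests.inspect
```

Under these assumptions, sharing the enumeration saves at most the repeated metadata call. The data request remains, which is why the expected latency gain is small.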

Assistance Needed

Assistance from group::gitaly might be needed.

Implementation Plan

Before you start:

  • Consider whether the move will be helpful, especially since file-size checks do not load blob data.
    • If the move is not worth the effort, please close this issue.

In case a move might reduce latency:

  • Move enumeration from the file-size checks into a utility class.
  • Update the file-size checks to use the new utility class.
  • Update the new push check to use the new utility class.
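
The three steps above could look roughly like the following sketch. All class and method names here (BlobInfo, NewBlobsEnumerator, FileSizeCheck, SecretsPushCheck) are hypothetical and chosen for illustration. The real checks and the Gitaly-backed enumeration are assumed, not reproduced, and the lister is injected so the example runs on its own.

```ruby
# Hypothetical metadata record for one blob (no contents loaded).
BlobInfo = Struct.new(:id, :byte_size)

# Hypothetical utility class that owns the enumeration of new blobs.
class NewBlobsEnumerator
  # lister: any callable that takes a list of revisions and returns BlobInfo
  # records. In real code this would wrap the Gitaly-backed enumeration.
  def initialize(lister)
    @lister = lister
    @cache = {}
  end

  # Enumerates once per distinct revision list and memoizes the result.
  def new_blobs(revisions)
    key = revisions.dup.freeze
    @cache[key] ||= @lister.call(revisions)
  end
end

# Stand-in for a file-size check that reuses the shared enumeration.
class FileSizeCheck
  def initialize(enumerator, limit_bytes)
    @enumerator = enumerator
    @limit_bytes = limit_bytes
  end

  def oversized_blobs(revisions)
    @enumerator.new_blobs(revisions).select { |blob| blob.byte_size > @limit_bytes }
  end
end

# Stand-in for the new push check that reuses the same enumeration.
class SecretsPushCheck
  def initialize(enumerator)
    @enumerator = enumerator
  end

  def blob_ids_to_scan(revisions)
    @enumerator.new_blobs(revisions).map(&:id)
  end
end

# Usage: both checks share one enumerator, so the lister runs only once.
lister = ->(_revisions) { [BlobInfo.new("a1", 10), BlobInfo.new("b2", 5_000)] }
enumerator = NewBlobsEnumerator.new(lister)
revisions = ["new_sha"]

puts FileSizeCheck.new(enumerator, 1_000).oversized_blobs(revisions).map(&:id).inspect
puts SecretsPushCheck.new(enumerator).blob_ids_to_scan(revisions).inspect
```

Injecting the lister keeps the utility class independent of how blobs are fetched. That also makes it easy to measure whether sharing the enumeration actually saves any requests before committing to the move.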
Edited by Ahmed Hemdan