GithubImporter: Refactor the Representation layer

Problem to solve

If we compare the GitHub importer with an ETL architecture, like GitLab Migration (BulkImports), the representation layer would be the T: the layer responsible for transforming data from the source so it can be used by the Loaders (the layer that saves the data). Currently, however, the transformation is leaking into the Loaders layer (example). That happens because the representation objects receive too little context about the import. For instance, they don't have access to the project being imported, which is required to build some data.

Representations

  • Gitlab::GithubImport::Representation::DiffNote
  • Gitlab::GithubImport::Representation::Issue
  • Gitlab::GithubImport::Representation::LfsObject
  • Gitlab::GithubImport::Representation::Note
  • Gitlab::GithubImport::Representation::PullRequest
  • Gitlab::GithubImport::Representation::PullRequestReview
  • Gitlab::GithubImport::Representation::User

Proposal

  • (optional) Rename the Representation classes/namespace to Transformers to better express their intent and to use a vocabulary similar to GitLab Migration (BulkImports);
  • Pass more context to the Transformers, like the project being imported and the client being used. Similar to the BulkImports, a Context class could be created to hold this information; (!72429 (closed))
  • Remove some of the duplication among the Transformers by adding either a superclass or a mixin with the shared behavior;
    • Define a clear public API for all Transformers, something like #transform (again following what's being used in BulkImports).
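
A minimal sketch of how the proposal could fit together, assuming a context object that carries the project and client and a shared base class that exposes #transform. Every name here (ImportSketch, Context, Project, Transformers::Base, Transformers::Issue, and the hash keys) is hypothetical and chosen only for illustration; none of it is taken from the existing importer or from BulkImports.

```ruby
# Hypothetical sketch: these classes do not exist in the GitLab codebase.
module ImportSketch
  # Holds what every transformer needs to know about the current import.
  Context = Struct.new(:project, :client, keyword_init: true)

  # Stand-in for the project being imported (assumption for the example).
  Project = Struct.new(:id, :full_path, keyword_init: true)

  module Transformers
    # Shared behavior: every transformer receives a context
    # and exposes a single public method, #transform.
    class Base
      attr_reader :context

      def initialize(context)
        @context = context
      end

      # Takes a raw source hash and returns a hash ready for a loader.
      def transform(_raw)
        raise NotImplementedError, "#{self.class.name} must implement #transform"
      end
    end

    class Issue < Base
      def transform(raw)
        {
          # The project comes from the context, so the loader
          # no longer has to fill it in.
          project_id: context.project.id,
          iid: raw.fetch(:number),
          title: raw.fetch(:title),
          description: raw[:body].to_s,
          state: raw[:state] == 'closed' ? :closed : :opened
        }
      end
    end
  end
end

context = ImportSketch::Context.new(
  project: ImportSketch::Project.new(id: 42, full_path: 'group/project'),
  client: nil # a real import would pass the API client here
)

issue = ImportSketch::Transformers::Issue.new(context).transform(
  { number: 7, title: 'Fix importer', body: nil, state: 'closed' }
)
```

With this shape, a loader would only persist the returned hash. Anything that depends on the project or the client is resolved inside the transformer through the context.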

Expected results

  • Better maintainability due to
    • Simplified Representation/Transformation layer
    • Simplified Loader/Saving layer