GithubImporter: Refactor the Representation layer
Problem to solve
If we compare the GitHub importer with an ETL architecture, like GitLab Migration (BulkImports), the representation layer would be the T: the layer responsible for transforming data from the source so that it can be used by the Loaders (the layer that saves the data). Currently, however, the transformation is leaking into the Loaders layer (example). That happens because the representation objects receive too little context from the importer. For instance, they don't have access to the project being imported, which is required to build some of the data.
Representations
- Gitlab::GithubImport::Representation::DiffNote
- Gitlab::GithubImport::Representation::Issue
- Gitlab::GithubImport::Representation::LfsObject
- Gitlab::GithubImport::Representation::Note
- Gitlab::GithubImport::Representation::PullRequest
- Gitlab::GithubImport::Representation::PullRequestReview
- Gitlab::GithubImport::Representation::User
Proposal
- (Optional) Rename the Representation classes/namespace to Transformers to better express their intent and to use vocabulary similar to GitLab Migration (BulkImports);
- Pass more context to the Transformers, such as the project being imported and the client being used. Similar to BulkImports, a Context class could be created to hold this information (!72429 (closed));
- Remove some of the duplication among the Transformers by adding either a superclass or a mixin with the shared behavior;
- Define a clear public API for all Transformers, something like #transform (again following what's being used in BulkImports).
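A minimal sketch of how the proposal could fit together, for discussion only. Every name here (GithubImportSketch, Context, BaseTransformer, IssueTransformer, the field names in the returned hash) is hypothetical and not taken from the GitLab codebase or from BulkImports. The project is a plain hash stand-in and the client is left as nil so the example runs on its own.

```ruby
# Hypothetical sketch: these classes do not exist in GitLab as written.
module GithubImportSketch
  # Holds what every transformer needs. The fields are assumptions based on
  # the proposal: the project being imported and the client being used.
  Context = Struct.new(:project, :client, keyword_init: true)

  # Shared behavior lives in a superclass here; a mixin would work as well.
  class BaseTransformer
    attr_reader :context

    def initialize(context)
      @context = context
    end

    # Single public entry point: raw source hash in, loader-ready hash out.
    def transform(_raw)
      raise NotImplementedError, "#{self.class} must implement #transform"
    end

    private

    # Example of data a transformer can only build when it knows the project.
    def project_id
      context.project.fetch(:id)
    end
  end

  class IssueTransformer < BaseTransformer
    def transform(raw)
      {
        project_id: project_id,
        iid: raw.fetch(:number),
        title: raw.fetch(:title),
        description: raw[:body].to_s,
        state: raw[:state] == 'closed' ? :closed : :opened
      }
    end
  end
end

# Usage: the loader receives a fully built hash and only has to save it.
context = GithubImportSketch::Context.new(project: { id: 42 }, client: nil)
attributes = GithubImportSketch::IssueTransformer.new(context).transform(
  { number: 7, title: 'Broken build', body: nil, state: 'open' }
)
```

With the context available at transformation time, values such as project_id no longer need to be filled in by the Loaders layer.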
Expected results
- Better maintainability due to:
  - Simplified Representation/Transformation layer
  - Simplified Loader/Saving layer