GitHub import: pull project statistics from GitHub API

There was recently a Slack thread about the statistics for our GitHub importer test:

@m_gill : the stats here say all issues and MRs were migrated. We manually tested this a few weeks ago and I think I recall the results were the same (all issues and MRs imported). Do you think the extra 10-20 missing MRs and Issues you're seeing might be new, as in created after the import kicked off?

@acunskis : I don't fully trust these numbers. From a quick look in the UI, some objects seem to be missing compared to GitHub, but large imports that take several days are almost impossible to validate properly after the fact. That is why I wrote a test that independently fetches all the objects. But as we discussed in one of the issues, we first wanted to perform only "lightweight" validation, because actually comparing everything in the tests takes quite a long time.

In other words: our import tests only compare the number of fetched objects to the number of saved objects. But if a resource is never fetched from GitHub, then the numbers will match even though GitLab has failed to import some data.
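To make the gap concrete, here is a minimal sketch of the two kinds of check. The function names and the numbers are hypothetical and purely illustrative; they are not taken from the importer code or from a real import run.

```python
def lightweight_check(fetched: int, imported: int) -> bool:
    """Current style of validation: fetched objects vs. saved objects."""
    return fetched == imported


def source_check(source_total: int, imported: int) -> bool:
    """Proposed style of validation: objects that exist on GitHub vs. saved objects."""
    return source_total == imported


# Hypothetical numbers: GitHub has 1000 issues, but 15 were never fetched.
source_total = 1000
fetched = 985
imported = 985

# The lightweight check passes because everything that was fetched was saved.
lightweight_ok = lightweight_check(fetched, imported)

# The source-based check fails because 15 issues never reached GitLab.
source_ok = source_check(source_total, imported)
```

With these numbers the lightweight check reports success while the source-based check reports a mismatch, which is the situation described in the thread.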

For the end-to-end tests, and for what we show customers in the GitLab UI, could we instead pull statistics directly from the GitHub API at the time an import starts? That way, we would compare what was imported against the actual number of resources in GitHub, not against the number we managed to fetch.

It looks like the GitHub GraphQL API can return the issue and PR counts. Maybe we could save this data as part of the Project Import State record?
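As a rough sketch of what pulling those counts could look like, the snippet below queries `totalCount` on the `issues` and `pullRequests` connections of a repository through the GitHub GraphQL endpoint. The helper names (`build_payload`, `parse_counts`, `fetch_counts`) are hypothetical, and how the result would be stored on the import state is left open. It also assumes that the unfiltered connections count items in every state; a real implementation would need to confirm that and decide which states to include.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Ask only for the totals, not for the objects themselves.
COUNTS_QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    issues { totalCount }
    pullRequests { totalCount }
  }
}
"""


def build_payload(owner: str, name: str) -> dict:
    """Build the JSON body for the GraphQL request (hypothetical helper)."""
    return {"query": COUNTS_QUERY, "variables": {"owner": owner, "name": name}}


def parse_counts(response: dict) -> dict:
    """Extract the issue and pull request totals from a GraphQL response."""
    if response.get("errors"):
        raise ValueError(f"GraphQL errors: {response['errors']}")
    repo = (response.get("data") or {}).get("repository")
    if repo is None:
        raise ValueError("repository not found or not accessible")
    return {
        "issues": repo["issues"]["totalCount"],
        "pull_requests": repo["pullRequests"]["totalCount"],
    }


def fetch_counts(owner: str, name: str, token: str) -> dict:
    """Send the query to GitHub and return the parsed totals."""
    request = urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=json.dumps(build_payload(owner, name)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return parse_counts(json.load(resp))
```

Calling `fetch_counts` once when the import kicks off would give a snapshot to compare against later, which also helps separate objects that were missed from objects created after the import started.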
