Exclude git submodules from language analysis
Problem to solve
Each project now has a language bar on the project overview page showing the languages used and their distribution within the project. Some projects include dependencies as submodules, and these submodules are also counted in the language statistics. The following bar is an example where this behavior does not seem correct:
The project referenced is one of my own (currently private) projects, which is written in Kotlin (the orange part of the bar). Because it uses pdf.js as a submodule, Kotlin no longer appears as the primary language of the project; JavaScript (yellow) dominates the bar instead. As a software developer this feels wrong to me, because pdf.js is included in the resource tree (src/main/resources) as a submodule and is only needed in one part of the project (a JavaFX WebView).
Further details
The primary goal of this feature proposal is to exclude submodules from the language analysis, because they are separate projects (in their own repositories). As far as I understand it, the language bar should show the languages used in the current project/repo, and this makes sense to me.
Imagine a volunteer searching for a project in a specific language to start contributing. This person might skip a repository because the main language displayed does not match their criteria, even though the project actually does.
Proposal
Exclude git submodules (specified in the .gitmodules file) from the language analysis.
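To make the proposal more concrete, here is a minimal Python sketch of how submodule paths could be read from a .gitmodules file and used to decide whether a file should be skipped. It is not how GitLab's language detection is actually implemented; the function names and the path src/main/resources/pdfjs are hypothetical and only serve as an illustration.

```python
import re
from pathlib import PurePosixPath

# Matches section headers such as: [submodule "pdfjs"]
_SECTION = re.compile(r'^\s*\[\s*(\w+)(?:\s+"(.*)")?\s*\]\s*$')
# Matches the "path = ..." key inside a section
_PATH = re.compile(r'^\s*path\s*=\s*(.*?)\s*$')


def submodule_paths(gitmodules_text):
    """Return the 'path' values of all [submodule ...] sections."""
    paths = []
    in_submodule = False
    for line in gitmodules_text.splitlines():
        section = _SECTION.match(line)
        if section:
            in_submodule = section.group(1).lower() == "submodule"
            continue
        if in_submodule:
            match = _PATH.match(line)
            if match and match.group(1):
                paths.append(match.group(1))
    return paths


def is_inside_submodule(file_path, paths):
    """True if file_path lies at or below one of the submodule paths."""
    file_parts = PurePosixPath(file_path).parts
    for path in paths:
        sub_parts = PurePosixPath(path).parts
        if file_parts[:len(sub_parts)] == sub_parts:
            return True
    return False


# Hypothetical .gitmodules content, for illustration only.
EXAMPLE_GITMODULES = (
    '[submodule "pdfjs"]\n'
    '\tpath = src/main/resources/pdfjs\n'
    '\turl = https://example.com/pdfjs.git\n'
)
```

Comparing whole path components (rather than raw string prefixes) avoids wrongly excluding a sibling file such as src/main/resources/pdfjs-notes.txt.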
What does success look like, and how can we measure that?
In this case success would be that the language analysis does not consider submodules when calculating the distribution. So in the above picture the yellow part would vanish completely. A simple test case would be to create a project with one main script file (bash/python/…) and include a project like pdf.js as a submodule. The bar should still show the script's language across its whole width.
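The success criterion can be written down as a small check. The following Python sketch assumes the analysis sees a flat list of (path, language, size in bytes) entries and is given the list of submodule paths; the function name, file names, and sizes are all made up for illustration and do not reflect GitLab's actual implementation.

```python
def language_distribution(files, submodule_paths):
    """Return {language: percentage}, ignoring files below submodule paths.

    files: iterable of (path, language, size_in_bytes) tuples (assumed input shape).
    submodule_paths: list of repository-relative submodule directories.
    """
    totals = {}
    for path, language, size in files:
        # Skip anything at or below a submodule directory.
        if any(path == sub or path.startswith(sub + "/") for sub in submodule_paths):
            continue
        totals[language] = totals.get(language, 0) + size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {lang: 100 * size / grand_total for lang, size in totals.items()}


# Hypothetical project: one script plus a large JavaScript submodule.
files = [
    ("main.sh", "Shell", 1200),
    ("vendor/pdfjs/src/core.js", "JavaScript", 500000),
    ("vendor/pdfjs/src/display.js", "JavaScript", 300000),
]

print(language_distribution(files, ["vendor/pdfjs"]))  # {'Shell': 100.0}
```

Without the exclusion (an empty submodule list), JavaScript would dominate the same project, which is the behavior described in the problem statement.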