Suggestion for Enhancing Pydriller: Support for Fetching Single Commits
CoffeeGeeker opened this issue · 2 comments
Hi all,
Firstly, thank you for creating such a useful tool! It has been incredibly helpful for analyzing Git repositories.
I have a suggestion that could make Pydriller even better. When I need to analyze a single specific commit (for example, torvalds/linux@c301f09), cloning the entire repository is unnecessary and inefficient, especially for large repositories like the Linux kernel. This process often leads to slow performance and frequent errors, such as network interruptions and data transfer failures.
It would be great if Pydriller could support fetching and analyzing just a single commit without needing to clone the whole repository. This feature would significantly improve efficiency and reliability when dealing with large repositories.
Thank you for considering this suggestion!
Best regards
Hi @whpks, Pydriller is designed to work as a wrapper around a Git repository, which means that, as far as I know, it needs a full clone of the repo for its analysis. Implementing your idea would require using the APIs of hosting platforms like GitHub or GitLab, which is beyond Pydriller's current scope, since open-source libraries such as PyGithub and python-gitlab already cover that.
Best regards.
Hi! Yes, agree with @stefanodallapalma here, this is something that GitHub offers, but not Git itself.
If someone is interested in building PyHub, feel free to do it 😄