GrimoireLab includes a set of interesting tools, but sometimes I need to run a specific analysis or try proof-of-concept ideas not yet covered by the current platform. Usually, I need easier ways to play with project and community data without setting up the whole GrimoireLab infrastructure.
This repository is my personal playground to test some of these ideas, mostly as Jupyter notebooks.
Feel free to play with them!
For most of the ideas you need:
- Jupyter notebooks
- GrimoireLab/Perceval
- Elasticsearch, elasticsearch-py and elasticsearch-dsl-py
- To play with generated data, you might need Kibana
- utils.py file has some extra dependencies
There is a settings example file where you can define some variables to be used.
Index generators:
- Light Git index generator.ipynb: given a list of Git URLs in the settings file, it generates an Elasticsearch index called 'git' with items showing info about commits at file level.
- Light Meetup index generator.ipynb: given a list of Meetup group names in the settings file, it generates an Elasticsearch index called 'meetup' with items showing info about Meetup group RSVPs.
- Light Github index generator.ipynb: given a list of GitHub repository URLs in the settings file, it generates two Elasticsearch indexes called 'github-git' and 'github-issues' with items showing info about commits at file level, GitHub issues and GitHub pull requests.
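To give a rough idea of what "commits at file level" means, here is a minimal sketch (not the notebook's actual code) that flattens one Perceval-style Git commit into one item per touched file. The input keys (`commit`, `Author`, `AuthorDate`, `files`, `added`, `removed`) are assumed to follow Perceval's Git output, the output field names and the `to_bulk_actions` helper are hypothetical, and the sample commit is made up for illustration.

```python
def to_int(value):
    """Convert a line count to int; assume non-numeric values (e.g. "-") mean 0."""
    text = str(value)
    return int(text) if text.isdigit() else 0


def commit_to_file_items(commit):
    """Flatten one Perceval-style Git commit into one dict per touched file."""
    data = commit["data"]
    items = []
    for file_info in data.get("files", []):
        items.append({
            # Output field names below are hypothetical, chosen for this sketch.
            "hash": data["commit"],
            "author": data["Author"],
            "author_date": data["AuthorDate"],
            "file": file_info["file"],
            "lines_added": to_int(file_info.get("added", 0)),
            "lines_removed": to_int(file_info.get("removed", 0)),
        })
    return items


def to_bulk_actions(items, index_name="git"):
    """Wrap items as actions in the shape elasticsearch.helpers.bulk accepts."""
    for item in items:
        yield {"_index": index_name, "_source": item}


# Made-up sample commit, shaped like the assumed Perceval output.
sample_commit = {
    "data": {
        "commit": "abc123",
        "Author": "Jane Doe <jane@example.com>",
        "AuthorDate": "Mon Jan 1 10:00:00 2018 +0100",
        "files": [
            {"file": "README.md", "added": "3", "removed": "1"},
            {"file": "logo.png", "added": "-", "removed": "-"},
        ],
    }
}

items = commit_to_file_items(sample_commit)
actions = list(to_bulk_actions(items))
```

With a running Elasticsearch, the resulting actions could be passed to `elasticsearch.helpers.bulk(client, actions)`; the sketch stops before that step so it runs without any server.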
Other ideas:
- Genderize Index.ipynb: given an Elasticsearch index, a names field in the index, and an optional names.csv file (containing name, gender, probability, count), it updates each item in the index with gender information for the indicated names field.
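The Genderize idea can be sketched roughly as follows. This is not the notebook's actual code: it assumes names.csv has a header row with the four columns listed above, that the first word of the names field is the given name, and that search hits look like Elasticsearch hits (`_id`, `_source`). The `gender` and `gender_probability` output fields, the `author_name` field, and the sample rows are made up for illustration.

```python
import csv
import io


def load_gender_lookup(csv_file):
    """Read name,gender,probability,count rows into a dict keyed by lowercase name."""
    lookup = {}
    for row in csv.DictReader(csv_file):
        lookup[row["name"].strip().lower()] = {
            "gender": row["gender"],
            "probability": float(row["probability"]),
            "count": int(row["count"]),
        }
    return lookup


def gender_update_actions(hits, names_field, lookup, index_name):
    """Yield partial-update actions in the shape elasticsearch.helpers.bulk accepts."""
    for hit in hits:
        full_name = hit["_source"].get(names_field) or ""
        words = full_name.split()
        if not words:
            continue
        # Assumption: the first word of the field is the given name.
        info = lookup.get(words[0].lower())
        if info is None:
            continue  # name not in the lookup, leave the item untouched
        yield {
            "_op_type": "update",
            "_index": index_name,
            "_id": hit["_id"],
            "doc": {
                "gender": info["gender"],
                "gender_probability": info["probability"],
            },
        }


# Made-up sample data standing in for names.csv and for index search hits.
names_csv = io.StringIO(
    "name,gender,probability,count\n"
    "ana,female,0.98,1200\n"
    "peter,male,0.99,3400\n"
)
lookup = load_gender_lookup(names_csv)
hits = [
    {"_id": "1", "_source": {"author_name": "Ana Lopez"}},
    {"_id": "2", "_source": {"author_name": "Unknown Person"}},
    {"_id": "3", "_source": {"author_name": ""}},
]
actions = list(gender_update_actions(hits, "author_name", lookup, "git"))
```

Only items whose given name appears in the lookup produce an update action, so unknown or empty names are simply skipped.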
Of course, there will be issues! I am not a computer scientist, and I am self-learning Python, Elasticsearch, etc. during this journey.
If you find any issue, feel free to report it.
Pull requests are also welcome, but I wouldn't recommend losing your time with this poor code. If you wanna help, go for the real thing!
100% free, open source software... of course! MIT License