KPI metric preprocessing notebooks
Closed issue · 1 comment
hemajv commented
Now that we have explored what the raw GitHub data looks like (#16), we should implement notebooks to analyze, preprocess and store metrics data.
Acceptance Criteria
- Pull the most relevant and up-to-date (through the current day) issue/PR data for the org/repo
- Preprocess data for metrics (those defined in #3)
- Store the data in Ceph and create Trino tables
- 1 simple visualization per metric (Superset)
hemajv commented
We have been investigating the MI tool and ran into some issues. We have reported our feedback to help improve the usability of the tool:
- Unable to load the MI module within the notebook (This issue is now resolved 🎉)
- Bugs in the `Metric` class of the MI module. This class is helpful for analyzing and processing the metrics within the notebook
- Enable the MI tool to fetch all default GitHub fields. The MI tool currently fetches a limited number of fields from the GitHub data; a few fields that we were interested in were missing from the data.