No Data Available For GitHub Comments Dashboard
maltif opened this issue · 17 comments
I've been following this documentation to add separate GitHub Comments dashboards for PRs and Issues to our existing, working GrimoireLab setup.
I've made the required changes in `setup.cfg` and `projects.json` as per the grimoirelab-sirmordred github2 doc.
- This is the `setup.cfg` configuration used by GrimoireLab:
[general]
short_name = TivoInc
update = true
min_update_delay = 60
debug = false
logs_dir = /home/bitergia/logs
aliases_file = /home/bitergia/conf/aliases.json
[projects]
projects_file = /home/bitergia/conf/projects.json
[es_collection]
url = http://elasticsearch:9200
[es_enrichment]
url = http://elasticsearch:9200
autorefresh = true
[sortinghat]
host = mariadb
user = root
password =
database = demo_sh
load_orgs = true
orgs_file = /home/bitergia/conf/organizations.json
autoprofile = [github, pipermail, git]
matching = [email]
sleep_for = 100
unaffiliated_group = Unknown
affiliate = true
strict_mapping = false
reset_on_load = false
identities_file = [/home/bitergia/conf/identities.yml]
identities_format = grimoirelab
[panels]
kibiter_time_from = now-1y
kibiter_default_index = git
kibiter_url = http://kibiter:5601
kibiter_version = 6.1.4-1
#gitlab-issues = true
#gitlab-merges = true
github-comments = true
[phases]
collection = true
identities = true
enrichment = true
panels = true
[git]
raw_index = git_raw
enriched_index = git_enriched
latest-items = true
studies = [enrich_demography:git, enrich_git_branches:git, enrich_areas_of_code:git, enrich_onion:git]
[github]
raw_index = github_raw
enriched_index = github_enriched
api-token = ghp_XXX
category = issue
sleep-for-rate = true
no-archive = true
studies = [enrich_onion:github, enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:github, enrich_backlog_analysis, enrich_demography:github]
[github:pull]
raw_index = github_pull_raw
enriched_index = github_pull_enriched
api-token = ghp_XXX
category = pull_request
sleep-for-rate = true
no-archive = true
studies = [enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:github, enrich_demography:github]
[github2:issue]
api-token = ghp_XXX
raw_index = github2-issues_raw
enriched_index = github2-issues_enriched
sleep-for-rate = true
category = issue
no-archive = true
studies = [enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:github2, enrich_feelings]
[github2:pull]
api-token = ghp_XXX
raw_index = github2-pull_raw
enriched_index = github2-pull_enriched
sleep-for-rate = true
category = pull_request
no-archive = true
studies = [enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:git, enrich_feelings]
## studies based on enriched indexes
[enrich_demography:git]
[enrich_areas_of_code:git]
in_index = git_raw
out_index = git_aoc_enriched
[enrich_onion:git]
in_index = git_raw
out_index = git_onion_enriched
contribs_field = hash
[enrich_git_branches:git]
run_month_days = [1, 23]
[enrich_extra_data:git]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt
[enrich_forecast_activity]
out_index = git_study_forecast
[enrich_onion:github]
in_index_iss = github_issues_onion_src
in_index_prs = github_prs_onion_src
out_index_iss = github_issues_onion_enriched
out_index_prs = github_prs_onion_enriched
[enrich_geolocation:user]
location_field = user_location
geolocation_field = user_geolocation
[enrich_geolocation:assignee]
location_field = assignee_location
geolocation_field = assignee_geolocation
[enrich_extra_data:github]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt
#Added as part of github2
[enrich_extra_data:github2]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt
[enrich_feelings]
attributes = [title, body]
nlp_rest_url = http://localhost:2901
#End Here
[enrich_backlog_analysis]
out_index = github_enrich_backlog
interval_days = 7
reduced_labels = [bug,enhancement]
map_label = [others, bugs, enhancements]
[enrich_demography:github]
[enrich_duration_analysis:kanban]
start_event_type = MovedColumnsInProjectEvent
fltr_attr = board_name
target_attr = board_column
fltr_event_types = [MovedColumnsInProjectEvent, AddedToProjectEvent]
[enrich_duration_analysis:label]
start_event_type = UnlabeledEvent
target_attr = label
fltr_attr = label
fltr_event_types = [LabeledEvent]
[enrich_reference_analysis]
- After making the required changes, I restarted GrimoireLab and could see that the indexes were available in the Kibana dashboard.
- Downloaded the following JSON files:
wget https://raw.githubusercontent.com/chaoss/grimoirelab-sigils/master/panels/json/github2_pull_requests-index-pattern.json
wget https://raw.githubusercontent.com/chaoss/grimoirelab-sigils/master/panels/json/github2_pull_requests_comments_and_collaboration.json
- Then imported these JSON files using the `kidash` tool:
kidash -g -e http://localhost:9200/ --import github2_pull_requests-index-pattern.json
kidash -g -e http://localhost:9200/ --import github2_pull_requests_comments_and_collaboration.json
- Finally, I restarted GrimoireLab. However, no data is available in the Kibana dashboard, as shown in the attached image.
- Attaching image for current Data Status
I'm not sure what I missed here. Could you please help to debug this?
Hi @maltif
In your indexes screenshot I only see raw indexes (`github2-issues_raw` and `github2-pull_raw`); let Mordred create the enriched ones. Once the enriched indexes are created, check that the aliases are correct (`GET _cat/aliases`).
You also need to import `github2_issues-index-pattern.json` and `github2_issues_comments_and_collaboration.json`.
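The alias check suggested in this reply could be scripted roughly as follows. This is a minimal sketch, assuming the plain-text layout of `GET _cat/aliases` in which the first two whitespace-separated columns are the alias and the index. The sample response and the helper names are hypothetical; the expected alias-to-index pairs are taken from the `setup.cfg` and `aliases.json` in this thread.

```python
from collections import defaultdict


def parse_cat_aliases(text):
    """Map each alias to the set of indexes it points at.

    Assumes the first two whitespace-separated columns of each line are
    the alias and the index, as in the `_cat/aliases` text output.
    """
    aliases = defaultdict(set)
    for line in text.splitlines():
        columns = line.split()
        # Skip blank lines and the header row printed when `?v` is used.
        if len(columns) < 2 or columns[0] == "alias":
            continue
        aliases[columns[0]].add(columns[1])
    return dict(aliases)


def missing_aliases(aliases, expected):
    """Return the (alias, index) pairs from `expected` that are absent."""
    return [(a, i) for a, i in expected if i not in aliases.get(a, set())]


# Hypothetical response body, for illustration only.
SAMPLE = """\
github2_issues        github2-issues_enriched - - -
github2_pull_requests github2-issues_enriched * - -
affiliations          github2-issues_enriched - - -
"""

# Pairs this thread's configuration should eventually produce.
EXPECTED = [
    ("github2_issues", "github2-issues_enriched"),
    ("github2_pull_requests", "github2-pull_enriched"),
]

if __name__ == "__main__":
    print(missing_aliases(parse_cat_aliases(SAMPLE), EXPECTED))
```

With the sample above, the only pair reported as missing is the `github2_pull_requests` alias on `github2-pull_enriched`, which is what you would expect while the pull request enriched index has not been created yet.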
Thank you @zhquan for the reply. Sure, I'll wait for it.
Meanwhile, could you please verify my `aliases.json` file?
This is the `aliases.json` that is configured in `docker-compose.yml`:
{
"askbot": {
"raw": [
{
"alias": "askbot-raw"
}
],
"enrich": [
{
"alias": "askbot"
},
{
"alias": "affiliations"
},
{
"alias": "all_enriched"
}
]
},
"gerrit": {
"raw": [
{
"alias": "gerrit-raw"
}
],
"enrich": [
{
"alias": "affiliations"
},
{
"alias": "all_enriched"
}
]
},
"git": {
"raw": [
{
"alias": "git-raw"
}
],
"enrich": [
{
"alias": "git"
},
{
"alias": "git_author"
},
{
"alias": "git_enrich"
},
{
"alias": "affiliations"
},
{
"alias": "all_enriched"
}
]
},
"github:repo": {
"raw": [
{
"alias": "github_repositories-raw"
}
],
"enrich": [
{
"alias": "github_repositories"
}
]
},
"github2:issue": {
"raw": [
{
"alias": "github2_issues-raw"
}
],
"enrich": [
{
"alias": "github2_issues"
},
{
"alias": "github2_pull_requests",
"filter": {
"terms": {
"issue_pull_request" : [
"true"
]
}
}
},
{
"alias": "affiliations"
},
{
"alias": "all_enriched"
}
]
},
"github2:pull": {
"raw": [
{
"alias": "github2_pull_requests-raw"
}
],
"enrich": [
{
"alias": "github2_pull_requests"
},
{
"alias": "affiliations"
},
{
"alias": "all_enriched"
}
]
},
"github": {
"raw": [
{
"alias": "github-raw"
}
],
"enrich": [
{
"alias": "github_issues"
},
{
"alias": "github_issues_enrich"
},
{
"alias": "issues_closed"
},
{
"alias": "issues_created"
},
{
"alias": "issues_updated"
},
{
"alias": "affiliations"
},
{
"alias": "all_enriched"
},
{
"alias": "all_enriched_tickets",
"filter" : {
"terms" : {
"pull_request" : ["false"]
}
}
},
{
"alias": "github_issues_onion-src",
"filter" : {
"terms" : {
"pull_request" : [
"false"
]
}
}
},
{
"alias": "github_prs_onion-src",
"filter" : {
"terms" : {
"pull_request" : [
"true"
]
}
}
}
]
},
"pipermail": {
"raw": [
{
"alias": "pipermail-raw"
}
],
"enrich": [
{
"alias": "mbox"
},
{
"alias": "kafka"
},
{
"alias": "affiliations"
},
{
"alias": "all_enriched"
}
]
}
}
Use this aliases.json instead. There is no `studies_aliases` section in your `aliases.json`.
I've configured the `aliases.json` you shared and also imported `github2_issues-index-pattern.json` and `github2_issues_comments_and_collaboration.json`.
I'll wait for the enrichment process to complete and then confirm.
Thanks again for your kind support 🙂
I'm not sure whether there is a problem, but it's been over 8 hours and the GitHub PR and issue comments data is still not appearing in the Kibana dashboard. I have checked Mordred's `all.log`, but I couldn't find any errors.
- I also haven't seen the enrichment task start. Please see the screenshot for reference.
- This is the current `aliases` status:
- This is the current `indices` status:
- Though, I noticed one thing in the setup.cfg for the Github2 section. As you can see, I am using a hyphen(-) instead of an underscore(_). I'm unsure if this makes a difference, so I thought I would reach out to you for clarification. Please refer to the screenshot for reference.
Let's try running only the enrichment phase, because I don't see any log entries related to the enrichment or study phases:
[phases]
collection = false
identities = false
enrichment = true
panels = false
Keep in mind that if you use the same GitHub token in the `github` and `github2` sections, the collection phase will take a very long time because of the token's rate limit. You can add more tokens (from different accounts):
[github]
api-token = [ghp_XXX, ghp_YYY, ghp_ZZZ]
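To get a feel for why extra tokens from different accounts help, here is a rough back-of-the-envelope sketch. It relies on GitHub's documented REST API limit of 5,000 requests per hour for an authenticated user on github.com; the per-repository request count is purely an assumed figure for illustration, and the function name is hypothetical.

```python
# Documented GitHub REST API limit for an authenticated user (requests/hour).
GITHUB_HOURLY_LIMIT = 5000


def estimated_hours(total_requests, accounts, hourly_limit=GITHUB_HOURLY_LIMIT):
    """Lower bound on wall-clock hours needed to issue `total_requests` calls."""
    if accounts < 1:
        raise ValueError("at least one account is required")
    return total_requests / (accounts * hourly_limit)


# Hypothetical workload: 1679 repositories at an assumed 200 requests each.
requests_needed = 1679 * 200

for accounts in (1, 3):
    hours = estimated_hours(requests_needed, accounts)
    print(f"{accounts} account(s): about {hours:.1f} hours")
```

The limit applies per account, so several tokens created under the same account share one budget. The estimate also ignores that all backend sections using the same token draw from that same budget.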
> Though, I noticed one thing in the setup.cfg for the Github2 section. As you can see, I am using a hyphen(-) instead of an underscore(_). I'm unsure if this makes a difference, so I thought I would reach out to you for clarification. Please refer to the screenshot for reference.

You can use the hyphen without any problem.
@zhquan thank you so much for responding.
I can now view data in the GitHub Comments Issues Dashboard after enabling enrichment in the `phases` section as you advised. The GitHub Issues Comments and Collaboration dashboard appears to be working well.
However, the Top 10 Repository visualization in the GitHub Pull Requests Comments and Collaboration Dashboard displays "Pull Requests Count 0" even though the repos have multiple PRs. Is this because the collection process (`github2:pull`) has not finished for all the repos (1679 in total)?
[root@grimoire tmp]# grep '\[github2:issue\]' all.log | grep collection | awk '{print $NF}' | sort | uniq | grep http | wc -l
1372
[root@grimoire tmp]# grep '\[github2:pull\]' all.log | grep collection | awk '{print $NF}' | sort | uniq | grep http | wc -l
235
[root@grimoire tmp]# grep '\[github\]' all.log | grep collection | awk '{print $NF}' | sort | uniq | grep http | wc -l
1679
[root@grimoire tmp]# grep '\[github:pull\]' all.log | grep collection | awk '{print $NF}' | sort | uniq | grep http | wc -l
1679
[root@grimoire tmp]# grep '\[git\]' all.log | grep collection | awk '{print $NF}' | sort | uniq | grep http | wc -l
1679
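The same per-backend count can be sketched in Python, mirroring the `grep | awk | sort | uniq | wc -l` pipeline: keep lines that mention the backend tag and the word `collection`, take the last whitespace-separated field, and count the distinct values containing `http`. The sample log lines are hypothetical and only imitate the shape the pipeline relies on; the real `all.log` layout may differ.

```python
def count_collected_repos(log_lines, backend):
    """Count distinct URLs in collection log lines for one backend section."""
    tag = f"[{backend}]"
    urls = set()
    for line in log_lines:
        fields = line.split()
        if not fields or tag not in line or "collection" not in line:
            continue
        last = fields[-1]  # same field as awk's $NF
        if "http" in last:
            urls.add(last)
    return len(urls)


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    "INFO [github2:pull] collection starts for https://github.com/example/repo-a",
    "INFO [github2:pull] collection finished for https://github.com/example/repo-a",
    "INFO [github2:pull] collection starts for https://github.com/example/repo-b",
    "INFO [github2:issue] collection starts for https://github.com/example/repo-a",
    "INFO [github2:pull] enrichment starts for https://github.com/example/repo-a",
]

if __name__ == "__main__":
    for backend in ("github2:issue", "github2:pull"):
        print(backend, count_collected_repos(SAMPLE_LOG, backend))
```

To run it against a real log, pass an open file object instead of the sample list, for example `count_collected_repos(open("all.log"), "github2:pull")`.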
> However, the Top 10 Repository visualization in the GitHub Pull Requests Comments and Collaboration Dashboard displays "Pull Requests Count 0" even though the repos have multiple PRs. Is this because the collection process (`github2:pull`) has not finished for all the repos (1679 in total)?

Probably.
Some checks:
- The visualization is using the correct field (`is_github_pull_request`)
- Go to Discover and check the `github2_pull_requests` index pattern
The GitHub Pull Requests Comments and Collaboration Dashboard is still showing a "Pull Requests Count" of 0. It appears that the collection process for `github2:pull` has not yet finished for all repositories. Can setup.cfg be configured so that only `github2:pull` runs? I'd like to see the `github2:pull` data.
Almost all items (99.6%) come from the `github2-issues_enriched` index, so it is normal for the `Pull Requests Count` to be 0.
If you want to run only `github2:pull`, remove the rest of the backends and also remove the studies:
[github2:pull]
api-token = ghp_XXX
raw_index = github2-pull_raw
enriched_index = github2-pull_enriched
sleep-for-rate = true
category = pull_request
no-archive = true
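One way to produce such a trimmed configuration without hand-editing is sketched below with Python's standard `configparser`. It keeps the non-backend sections plus a single backend section and drops that backend's `studies` option. The `CORE_SECTIONS` set is an assumption based on the section names in this thread's `setup.cfg`, and the function name is hypothetical; comments in the original file are not preserved.

```python
import configparser
import io

# Assumption: the non-backend sections Mordred still needs, as named in the
# setup.cfg posted in this thread.
CORE_SECTIONS = {
    "general", "projects", "es_collection", "es_enrichment",
    "sortinghat", "panels", "phases",
}


def keep_only_backend(cfg_text, backend):
    """Return config text with core sections plus one backend, minus studies."""
    source = configparser.ConfigParser(interpolation=None)
    source.read_string(cfg_text)

    reduced = configparser.ConfigParser(interpolation=None)
    for section in source.sections():
        if section not in CORE_SECTIONS and section != backend:
            continue  # drops other backends and all study sections
        reduced[section] = {
            key: value
            for key, value in source[section].items()
            if not (section == backend and key == "studies")
        }

    out = io.StringIO()
    reduced.write(out)
    return out.getvalue()


# Shortened, hypothetical excerpt of a setup.cfg, for illustration only.
SAMPLE = """\
[phases]
collection = true
[github]
raw_index = github_raw
[github2:pull]
raw_index = github2-pull_raw
studies = [enrich_feelings]
[enrich_feelings]
attributes = [title, body]
"""

if __name__ == "__main__":
    print(keep_only_backend(SAMPLE, "github2:pull"))
```

Writing the result to a separate file and pointing Mordred at it keeps the full configuration available for later.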
Of course, I'll give it a go tomorrow.
I noticed that Grimoire is set to retrieve GitHub data for the past 10 years by default. Can I change the duration to only retrieve data for the past 1 year?
I'd like to gather GitHub metrics data for the past 1 year.
Thank you so much for your assistance and support, I truly appreciate your time.
> I noticed that Grimoire is set to retrieve GitHub data for the past 10 years by default. Can I change the duration to only retrieve data for the past 1 year?

Of course, you can set `from-date = 2022-01-01`:
[github2:pull]
api-token = ghp_XXX
raw_index = github2-pull_raw
enriched_index = github2-pull_enriched
sleep-for-rate = true
category = pull_request
no-archive = true
from-date = 2022-01-01
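If you would rather derive the value than hard-code it, a small sketch like the following computes a `from-date` line one year before a given day, in the same `YYYY-MM-DD` form used above. The helper name is hypothetical.

```python
from datetime import date


def from_date_one_year_back(today):
    """Build a `from-date` line dated one calendar year before `today`."""
    try:
        start = today.replace(year=today.year - 1)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use the 28th.
        start = today.replace(year=today.year - 1, day=28)
    return f"from-date = {start.isoformat()}"


if __name__ == "__main__":
    print(from_date_one_year_back(date.today()))
```

For example, calling it with `date(2023, 1, 1)` yields the same line as in the configuration above.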
> Thank you so much for your assistance and support, I truly appreciate your time.

It's a pleasure :)
@zhquan Firstly, thank you for sharing the `from-date` parameter. I've configured it under both the `github2:pull` and `github2:issue` backends.
Secondly, I apologize for the repeated inquiries regarding the issue I'm experiencing. It has been a couple of days, and the Top Repository visualization in the **GitHub Pull Requests Comments and Collaboration Dashboard** still mostly displays "Pull Requests Count 0." However, I can see `Reviews` and `Comments` data.
I'm attaching my current setup.cfg file configuration again, just to double-check in case I missed anything.
[general]
short_name = TivoInc
update = true
min_update_delay = 60
debug = false
logs_dir = /home/bitergia/logs
aliases_file = /home/bitergia/conf/aliases.json
[projects]
projects_file = /home/bitergia/conf/projects.json
[es_collection]
url = http://elasticsearch:9200
[es_enrichment]
url = http://elasticsearch:9200
autorefresh = true
[sortinghat]
host = mariadb
user = root
password =
database = demo_sh
load_orgs = true
orgs_file = /home/bitergia/conf/organizations.json
autoprofile = [github, pipermail, git]
matching = [email]
sleep_for = 100
unaffiliated_group = Unknown
affiliate = true
strict_mapping = false
reset_on_load = false
identities_file = [/home/bitergia/conf/identities.yml]
identities_format = grimoirelab
[panels]
kibiter_time_from = now-1y
kibiter_default_index = git
kibiter_url = http://kibiter:5601
kibiter_version = 6.1.4-1
#gitlab-issues = true
#gitlab-merges = true
github-comments = true
[phases]
collection = true
identities = true
enrichment = true
panels = true
[git]
raw_index = git_raw
enriched_index = git_enriched
latest-items = true
studies = [enrich_demography:git, enrich_git_branches:git, enrich_areas_of_code:git, enrich_onion:git]
[github]
raw_index = github_raw
enriched_index = github_enriched
api-token = ghp_XXX
category = issue
sleep-for-rate = true
no-archive = true
studies = [enrich_onion:github, enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:github, enrich_backlog_analysis, enrich_demography:github]
[github:pull]
raw_index = github_pull_raw
enriched_index = github_pull_enriched
api-token = ghp_XXX
category = pull_request
sleep-for-rate = true
no-archive = true
studies = [enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:github, enrich_demography:github]
[github2:issue]
api-token = [ghp_XXX, ghp_YYY, ghp_ZZZ]
raw_index = github2-issues_raw
enriched_index = github2-issues_enriched
sleep-for-rate = true
category = issue
no-archive = true
studies = [enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:github2, enrich_feelings]
from-date = 2022-01-01
[github2:pull]
api-token = [ghp_XXX, ghp_YYY, ghp_ZZZ]
raw_index = github2-pull_raw
enriched_index = github2-pull_enriched
sleep-for-rate = true
category = pull_request
no-archive = true
studies = [enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:github2, enrich_feelings]
from-date = 2022-01-01
## studies based on enriched indexes
[enrich_demography:git]
[enrich_areas_of_code:git]
in_index = git_raw
out_index = git_aoc_enriched
[enrich_onion:git]
in_index = git_raw
out_index = git_onion_enriched
contribs_field = hash
[enrich_git_branches:git]
run_month_days = [1, 23]
[enrich_extra_data:git]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt
[enrich_forecast_activity]
out_index = git_study_forecast
[enrich_onion:github]
in_index_iss = github_issues_onion_src
in_index_prs = github_prs_onion_src
out_index_iss = github_issues_onion_enriched
out_index_prs = github_prs_onion_enriched
[enrich_geolocation:user]
location_field = user_location
geolocation_field = user_geolocation
[enrich_geolocation:assignee]
location_field = assignee_location
geolocation_field = assignee_geolocation
[enrich_extra_data:github]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt
#Added as part of github2
[enrich_extra_data:github2]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt
[enrich_feelings]
attributes = [title, body]
nlp_rest_url = http://localhost:2901
#End Here
[enrich_backlog_analysis]
out_index = github_enrich_backlog
interval_days = 7
reduced_labels = [bug,enhancement]
map_label = [others, bugs, enhancements]
[enrich_demography:github]
[enrich_duration_analysis:kanban]
start_event_type = MovedColumnsInProjectEvent
fltr_attr = board_name
target_attr = board_column
fltr_event_types = [MovedColumnsInProjectEvent, AddedToProjectEvent]
[enrich_duration_analysis:label]
start_event_type = UnlabeledEvent
target_attr = label
fltr_attr = label
fltr_event_types = [LabeledEvent]
[enrich_reference_analysis]
I'm attaching a few screenshots for your reference. I can see that after 8 February there is no data available in the `github2_pull_requests` and `github2_issues` indexes. I don't see any errors in the `all.log` file.
I have also observed that the `mordred` container is terminated automatically after a few hours. I have included the container logs below for reference.
Currently, as a workaround, I have a shell script in place that starts any container that is not running.
[root@grimoire docker-compose]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1f5710dcbe7d bitergia/kibiter:community-v6.8.6-3 "/docker_entrypoint.…" 6 months ago Up 33 hours 0.0.0.0:5601->5601/tcp, :::5601->5601/tcp docker-compose_kibiter_1
4b2abb019db6 bitergia/mordred:latest "/bin/sh -c ${DEPLOY…" 6 months ago Up 2 hours (healthy) docker-compose_mordred_1
14a6926bfd06 grimoirelab/hatstall:latest "/bin/sh -c ${DEPLOY…" 6 months ago Up 33 hours 0.0.0.0:8000->80/tcp, :::8000->80/tcp docker-compose_hatstall_1
a295a748c38d docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.6 "/usr/local/bin/dock…" 6 months ago Up 33 hours 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp, 9300/tcp docker-compose_elasticsearch_1
37abe7721333 mariadb:10.0 "docker-entrypoint.s…" 6 months ago Up 33 hours 3306/tcp docker-compose_mariadb_1
Could you please review the provided configurations, screenshots, and logs and guide me in the right direction to troubleshoot this issue?
I understand that you may be occupied with other pressing tasks, and I genuinely appreciate your assistance in resolving this matter. With your support, I have made significant progress, and only a few issues remain.
Thanks,
Altif
- setup.cfg: LGTM. Since you already have data in the github2 indexes, you can remove `from-date`, because Mordred will fetch new items incrementally.
- dashboards: Try widening the time picker, e.g. `Last 1 year` instead of `Last 7 days`.
- indexes: LGTM
- Mordred container: It seems that your Mordred container cannot connect to github.com. Start the Mordred Docker container, enter it (`docker exec -it <mordred> bash`), and try running `perceval git https://github.com/....../manager-lambda.git`
Thank you @zhquan for replying.
> - setup.cfg: LGTM. Since you already have data in the github2 indexes, you can remove `from-date`, because Mordred will fetch new items incrementally.

I have removed `from-date`.
> - dashboards: Try widening the time picker, e.g. `Last 1 year` instead of `Last 7 days`.

I set the time range to the past 30 days, but found that no data is available in the `github2_issues` and `github2_pull_requests` index patterns after 8 February. I am including reference screenshots.
It's worth mentioning that we have repositories that are frequently accessed and modified, with pull requests being created and merged. This leads me to wonder why there is no data in either index pattern after 8 February.
> - Mordred container: It seems that your Mordred container cannot connect to github.com. Start the Mordred Docker container, enter it (`docker exec -it <mordred> bash`), and try running `perceval git https://github.com/....../manager-lambda.git`

I have observed that Mordred sometimes encounters timeout errors while communicating with GitHub, which are likely caused by temporary issues on GitHub's end. However, I'd like to understand whether a timeout error could cause the `mordred` container to terminate.
@zhquan Thanks a lot for your time and support; I truly appreciate it.
It took a few days to collect the pull request data from GitHub for over 500 repositories. I am happy to report that the data has been successfully retrieved and is now accessible through the GitHub Comments PRs dashboard.
Therefore, I am closing this issue as it has been resolved. Thank you once again for your assistance.