chaoss/grimoirelab

[cocom][colic] empty enriched indexes

Closed this issue · 31 comments

Hello all,

I can't get the cocom and colic enriched indexes filled. I have no idea where to dig; for now the result is:

yellow open gitlab_merges_raw           dW5A7OkRQsSTOf2ERm-P7Q 5 1    103     5   2.9mb   2.9mb
yellow open .grimoirelab-sigils         aDhWnYcqTSCc7gMYO2I0fA 5 1     39     2    88kb    88kb
yellow open colic_chaoss                PvRbrZ3LRD-Ue6WL0Xmsvg 5 1   1189    87   1.4mb   1.4mb
yellow open git-aoc_enriched            9fUv9DYBQhm_FLcnLVuauA 5 1 272789 77217 323.9mb 323.9mb
yellow open git_raw                     9KX4FBhnTo2Ifhif79Nqow 5 1  20411     0  41.3mb  41.3mb
yellow open git_study_forecast_activity XLLiClXHQI2-g9t4NPL0KQ 5 1     65     7 225.5kb 225.5kb
yellow open cocom_chaoss                pm7uQm3sQwCNOjoC_NImrA 5 1   1189    75   2.7mb   2.7mb
yellow open colic_chaoss_enrich         q2lyh-kwT0ue3GuX8_iG4g 5 1      0     0   1.2kb   1.2kb
yellow open gitlab_merges_enriched      K9VVPrOmSUG0WVOA1qfJJw 5 1    103    30 369.9kb 369.9kb
yellow open jira_raw                    5ORCz2yJTSysSW_JluF4iQ 5 1  88754     1   1.9gb   1.9gb
yellow open cocom_enrich_graal_repo     0AnnohkNSiSiCJDeM7N83g 5 1      0     0   1.2kb   1.2kb
green  open .kibana_1                   j4McvhMOT2yti7675qYaqA 1 0    222    29 406.4kb 406.4kb
yellow open cocom_chaoss_enrich         KGNO_pbhQqyDsFPS8zAKbw 5 1      0     0   1.2kb   1.2kb
yellow open colic_enrich_graal_repo     jdEdlP5CSGqSs58hFN1uUQ 5 1      0     0   1.2kb   1.2kb
yellow open jira_enriched               PJSGOncaT12q2xElSkX8lQ 5 1 404486 90160   2.7gb   2.7gb
yellow open git_enriched                HVqoZeGnSza0kSl0a9EOOQ 5 1  20411  5618 107.8mb 107.8mb
yellow open git-onion_enriched          IvFrcGSVSjSmoQhs9kn42Q 5 1   1092     0 633.4kb 633.4kb

2023-05-29 21:04:56,585 - grimoire_elk.elk - INFO - [cocom] Starting study: enrich_cocom_analysis, params {}
2023-05-29 21:04:56,585 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis start
2023-05-29 21:04:56,590 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis 0 repositories to process
2023-05-29 21:04:56,669 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis End

And the same for colic.

I have already tried deleting the indexes so that SirMordred re-creates them, with the same result.

cloc and FOSSology are built inside the mordred container, and there are no related errors in the log.

All component versions come from the latest GrimoireLab release, 0.10.0.

Help, please :).

zhquan commented

Hi @FixItNowPlease

Do you have any logs from the enrichment process (not the studies one)? Before [cocom] Starting study you should see:

[cocom] enrichment phase start
...
...
[cocom] enrichment phase finished ...

Are there any cocom-related errors between these log lines?

Yes, there are a lot of collection and enrichment steps with no errors:

2023-05-29 21:04:50,848 - sirmordred.task_enrich - INFO - [cocom:master] enrichment starts for https://giturl
2023-05-29 21:04:50,898 - graal.graal - INFO - Fetch process completed: 4 commits inspected

For a few repos (3 of the 57 in my test set) I get these errors:

2023-05-29 21:02:31,442 - grimoire_elk.elk - ERROR - Error feeding raw from cocom (https://gituser:gitpass@giturl/repo): Impossible to checkout the worktree /home/grimoire/tmp/cocom/repo-git at 32642e25be9492f0b4af6853ef78c545a04757e9
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/perceval/backends/core/git.py", line 1339, in _exec
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
  File "/usr/local/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/home/grimoire/tmp/cocom/repo-git'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/graal/graal.py", line 337, in checkout
    self._exec(cmd_checkout, cwd=self.worktreepath, env=self.gitenv)
  File "/usr/local/lib/python3.8/site-packages/perceval/backends/core/git.py", line 1344, in _exec
    raise RepositoryError(cause=str(e))
perceval.errors.RepositoryError: [Errno 2] No such file or directory: '/home/grimoire/tmp/cocom/repo-git'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/grimoire_elk/elk.py", line 204, in feed_backend
    ocean_backend.feed(**params)
  File "/usr/local/lib/python3.8/site-packages/grimoire_elk/raw/elastic.py", line 234, in feed
    self.feed_items(items)
  File "/usr/local/lib/python3.8/site-packages/grimoire_elk/raw/elastic.py", line 250, in feed_items
    for item in items:
  File "/usr/local/lib/python3.8/site-packages/perceval/backend.py", line 316, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/usr/local/lib/python3.8/site-packages/graal/graal.py", line 197, in fetch_items
    raise e
  File "/usr/local/lib/python3.8/site-packages/graal/graal.py", line 189, in fetch_items
    self.graalRepo.checkout(commit['commit'])
  File "/usr/local/lib/python3.8/site-packages/graal/graal.py", line 341, in checkout
    raise RepositoryError(cause=cause)
perceval.errors.RepositoryError: Impossible to checkout the worktree /home/grimoire/tmp/cocom/repo-git at 32642e25be9492f0b4af6853ef78c545a04757e9

For the other repos there are no errors at all.
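The traceback above ends in a FileNotFoundError for the worktree directory. As a minimal sketch for checking such a path from inside the mordred container, the snippet below reports whether a directory exists and is writable by the current user. The function name `worktree_problems` is hypothetical and not part of Graal or Perceval; the path in the example is the parent directory from the traceback.

```python
import os


def worktree_problems(path):
    """Return a list of human-readable problems with a worktree directory."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path} does not exist or is not a directory")
        return problems
    # Creating a worktree needs write and traverse permission on the directory.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path} is not writable by the current user")
    return problems


if __name__ == "__main__":
    # Parent directory taken from the traceback above; adjust it to your setup.
    for problem in worktree_problems("/home/grimoire/tmp/cocom"):
        print(problem)
```

An empty result only says the directory is usable; it does not prove that the checkout itself will succeed.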

Is there any way to get them filled? :) I really have no idea what's going wrong :(.

This is the only thing keeping me from rolling out GrimoireLab across all my 1000+ repos :). If anybody could help, it would be much appreciated!

I hope I can have a look at it next week.

I would appreciate any idea about what could be causing the empty indexes.

Any ideas? :)

zhquan commented

Hi @GrayStranger

Sorry for the late reply. I can create the cocom enriched index without any issue:

  • setup.cfg: you first have to create /home/grimoire/worktrees/ in the mordred container, or mount a volume with the correct permissions (for testing you can run chmod 777 <worktrees_path>)
[cocom]
raw_index = cocom_raw
enriched_index = cocom_enrich
category = code_complexity_lizard_file
studies = [enrich_cocom_analysis]
branches = master
worktree-path = /home/grimoire/worktrees/

[enrich_cocom_analysis]
out_index = cocom_chaoss_study
interval_months = [7]
  • projects.json
{
  "GrimoireLab": {
    "cocom": [
      "https://github.com/chaoss/grimoirelab-toolkit"
    ]
  }
}
  • all.logs
2023-07-19 09:42:10,980 - sirmordred.task_enrich - INFO - [cocom] enrichment phase starts
2023-07-19 09:42:11,062 - grimoire_elk.elastic - INFO - Created index https://localhost:9200/cocom_enrich
2023-07-19 09:42:11,068 - sirmordred.task_enrich - INFO - [cocom] enrichment starts for https://github.com/chaoss/grimoirelab-toolkit
2023-07-19 09:42:11,283 - grimoire_elk.elastic - INFO - Alias cocom created on https://localhost:9200/cocom_enrich.
2023-07-19 09:42:11,647 - grimoire_elk.enriched.cocom - INFO - [cocom] 287 items inserted
2023-07-19 09:42:11,648 - grimoire_elk.elk - INFO - [cocom] Done enrichment for https://github.com/chaoss/grimoirelab-toolkit
2023-07-19 09:42:11,648 - sirmordred.task_enrich - INFO - [cocom] enrichment finished for https://github.com/chaoss/grimoirelab-toolkit
2023-07-19 09:42:11,648 - sirmordred.task_enrich - INFO - [cocom] enrichment phase finished in 0:00:00
2023-07-19 09:42:11,648 - sirmordred.task_enrich - INFO - [cocom] data retention start
2023-07-19 09:42:11,665 - sirmordred.task_enrich - INFO - [cocom] data retention end
2023-07-19 09:42:11,665 - sirmordred.task_enrich - INFO - [cocom] identities retention end
2023-07-19 09:42:11,665 - sirmordred.task_enrich - INFO - [cocom] autorefresh start
2023-07-19 09:42:11,710 - sirmordred.task_enrich - INFO - [cocom] Refreshing identities
2023-07-19 09:42:11,757 - sirmordred.task_enrich - INFO - [cocom] autorefresh end
2023-07-19 09:42:11,758 - sirmordred.task_enrich - INFO - [cocom] studies phase start
2023-07-19 09:42:13,824 - sirmordred.task_enrich - INFO - [cocom] Executing studies ['enrich_cocom_analysis']
2023-07-19 09:42:13,825 - grimoire_elk.elk - INFO - [cocom] Starting study: enrich_cocom_analysis, params {'out_index': 'cocom_chaoss_study', 'interval_months': ['7']}
2023-07-19 09:42:13,826 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis start
2023-07-19 09:42:13,836 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis 1 repositories to process
2023-07-19 09:42:13,909 - grimoire_elk.elastic - INFO - Created index https://localhost:9200/cocom_chaoss_study
2023-07-19 09:42:13,966 - grimoire_elk.elastic - INFO - Alias cocom_study created on https://localhost:9200/cocom_chaoss_study.
2023-07-19 09:42:13,966 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis start analysis for https://github.com/chaoss/grimoirelab-toolkit
2023-07-19 09:42:14,121 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis 38 items inserted for Graal CoCom Analysis Study
2023-07-19 09:42:14,121 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis End analysis for https://github.com/chaoss/grimoirelab-toolkit with month interval
2023-07-19 09:42:14,121 - grimoire_elk.enriched.cocom - INFO - [cocom] study enrich-cocom-analysis End
2023-07-19 09:42:14,138 - sirmordred.task_enrich - INFO - [cocom] studies phase end
2023-07-19 09:42:14,138 - sirmordred.task_enrich - INFO - [cocom] autorefresh for studies start
2023-07-19 09:42:14,139 - sirmordred.task_enrich - INFO - [cocom] autorefresh for studies end
2023-07-19 09:42:14,139 - sirmordred.task_manager - INFO - [cocom] sleeping for 60 seconds
  • indexes
GET _cat/indices?s=i

cocom_chaoss_study           L8dDQkVcQAe-T_0QRB6SAA 1 1      38      0  18.5kb  18.5kb
cocom_enrich                 xld4NZQDQU2nOcJIAmyDyw 1 1     287      0 274.1kb 274.1kb
cocom_raw                    jSJ99bUKQJiFH9bNx0XW4w 1 1     129      0 161.8kb 161.8kb

I'm using this version: grimoirelab/grimoirelab:0.10.0
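To spot empty indexes quickly in listings like the ones in this thread, here is a small sketch that parses the plain-text output of `_cat/indices` and returns the indexes whose document count is 0. It assumes the default column order seen above (index, uuid, pri, rep, docs.count, ...), optionally preceded by the health and status columns. The function name `empty_indices` is made up for this example.

```python
def empty_indices(cat_output):
    """Return names of indices whose docs.count is 0 in `_cat/indices` text output."""
    empty = []
    for line in cat_output.splitlines():
        fields = line.split()
        # Drop the leading health and status columns when they are present.
        if fields and fields[0] in ("green", "yellow", "red"):
            fields = fields[2:]
        # Remaining columns: index, uuid, pri, rep, docs.count, docs.deleted, ...
        if len(fields) < 5 or not fields[4].isdigit():
            continue
        if int(fields[4]) == 0:
            empty.append(fields[0])
    return empty


sample = """\
yellow open colic_chaoss_enrich q2lyh-kwT0ue3GuX8_iG4g 5 1 0 0 1.2kb 1.2kb
cocom_enrich xld4NZQDQU2nOcJIAmyDyw 1 1 287 0 274.1kb 274.1kb
"""
print(empty_indices(sample))  # ['colic_chaoss_enrich']
```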

  • setup.cfg: you first have to create /home/grimoire/worktrees/ in the mordred container, or mount a volume with the correct permissions (for testing you can run chmod 777 <worktrees_path>)

Is this the only thing I need to add to use the cocom and colic functionality? My config for the mordred container looks like this:

    mordred:
      restart: on-failure:5
      build: 
        context: .
        dockerfile: mordred.upgraded.new.yml
#      image: grimoirelab/grimoirelab:latest
      volumes:
        - ../grimoirelab-settings/setup.cfg:/home/grimoire/conf/setup.cfg
        - ../grimoirelab-settings/projects.json:/home/grimoire/conf/projects.json
#        - ../grimoirelab-settings/organizations.json:/home/grimoire/conf/organizations.json
#        - ../grimoirelab-settings/identities.yml:/home/grimoire/conf/identities.yml
        - ../logs:/home/grimoire/logs
      depends_on:
        nginx:
          condition: service_healthy
      mem_limit: 24g

FROM grimoirelab/grimoirelab:latest

RUN sudo apt-get update && \
    sudo apt-get install -y cloc && \
    sudo apt-get install -y wget && \
    wget https://github.com/fossology/fossology/releases/download/3.11.0/FOSSology-3.11.0-ubuntu-focal.tar.gz && \
    tar -xzf FOSSology-3.11.0-ubuntu-focal.tar.gz && \
    sudo apt-get -y install ./packages/fossology-common_3.11.0-1_amd64.deb && \
    wget http://security.ubuntu.com/ubuntu/pool/main/j/json-c/libjson-c4_0.13.1+dfsg-7ubuntu0.3_amd64.deb && \
    sudo apt-get -y install ./libjson-c4_0.13.1+dfsg-7ubuntu0.3_amd64.deb && \
    sudo apt-get -y install ./packages/fossology-nomos_3.11.0-1_amd64.deb && \
    mkdir /home/grimoire/tmp && \
    mkdir /home/grimoire/tmp/grimoirelab-toolkit-git && \
    mkdir /home/grimoire/tmp/colic && \
    mkdir /home/grimoire/tmp/cocom && \
    mkdir /home/grimoire/tmp/cocom/grimoirelab-toolkit-git && \
    mkdir /home/grimoire/tmp/colic/grimoirelab-toolkit-git && \
    chmod -R 777 /home/grimoire/tmp

I think what we need is the full list described here: https://github.com/chaoss/grimoirelab-graal/#how-to-install-and-create-the-executables

So... with my setup.cfg I get all the results for the GitHub test repo, but when I replace it with my test intranet GitLab repo there is no enrichment for cocom and colic. The only difference is the one repo in projects.json. There are no errors, only [cocom] study enrich-cocom-analysis 0 repositories to process.

yellow open colic_enrich        3S0sMDmEQrO_DcxzWQCnCw 5 1   0  0   1.2kb   1.2kb
yellow open cocom_enrich        jJoBFTJzQ3Sv3W6iIUz26A 5 1   0  0   1.2kb   1.2kb
yellow open cocom_chaoss_study  kxgKeLiYTGS2oBcmwIYh6w 5 1   0  0   1.2kb   1.2kb
yellow open colic_chaoss_study  Op5Nt_UKT1ilodyxfoC0Pw 5 1   0  0   1.2kb   1.2kb
green  open .kibana_1           P-tJzBoTQlmDbv95E8tVkQ 1 0 222 27 396.3kb 396.3kb
yellow open .grimoirelab-sigils qCAnxfLHT8a6qfoC5dlQpA 5 1  39  2   100kb   100kb
yellow open colic_raw           1fk3bh5NSRWax-dUQ5Rg4A 5 1  22  0  73.9kb  73.9kb
yellow open cocom_raw           xDozJ4UOT3iTRO7JpnEItw 5 1  22  0  84.8kb  84.8kb

I'm using the latest version, but this issue also appears on 0.10.0.

If there is any documentation about GitLab setup for GrimoireLab, or about any GitLab-specific features, I would very much appreciate a pointer to it.

Hi @GrayStranger,

Is that GitLab repository public within your intranet? Does it require a user and password to access the repo? Do the git indices contain any data?

PS: Sorry for our late response. Most of us are on vacation, so it's hard to be active here.

Thanks for the answer!

  1. No, it isn't. All access is governed by roles within the repos; the GrimoireLab user has no direct access to any project but has the "auditor" role.
  2. Yes, it requires a password.
  3. Yes, the git indices contain data, and so do the cocom/colic raw indices.

Below is the data for one repository, with the same backend format and credentials in the git/cocom/colic sections of projects.json:

yellow open gitlab_merges_enriched      Ghv8uR1DRj-GtRyArxGEiw 5 1   195    1 403.5kb 403.5kb
yellow open git_enriched                Nusr-UyeRmyIdHXQByWDYw 5 1   580    1   2.8mb   2.8mb
yellow open git_raw                     G37hU6o2Sji6XWqYv4NUOg 5 1   580    0   2.8mb   2.8mb
yellow open colic_raw                   uo2rHh0JTRKg4OxvaDLR3Q 5 1   524    0   1.3mb   1.3mb
yellow open git-onion_enriched          Q0B25oiGQZi5kzo7x90epw 5 1    96    0 123.2kb 123.2kb
yellow open gitlab_merges_raw           ZymSKivyStq_NcHM6jECXQ 5 1   195    1 901.7kb 901.7kb
yellow open cocom_raw                   KsRuAf3wSyKSpt-DQN-S7Q 5 1   524   53  14.6mb  14.6mb
yellow open .grimoirelab-sigils         Oan7d-ifTfSJu2GJ0NKjrg 5 1    39    1 104.2kb 104.2kb
yellow open gitlab-mrs-onion_enriched   W2nmtMTNQzexmrr4Ek1k7w 5 1    56    0 112.4kb 112.4kb
yellow open colic_enrich                j5TrcXdFS-KFgLw5G7r5zA 5 1     0    0   1.2kb   1.2kb
yellow open git_study_forecast_activity PG2dLb2UT8yimXB3fkXk5A 5 1     5    0  55.2kb  55.2kb
green  open .kibana_1                   j5pgVAatSHeXs-se8hOa8g 1 0   222   27 413.2kb 413.2kb
yellow open colic_chaoss_study          SeIErzbkSRyn01br8UPpeA 5 1     0    0   1.2kb   1.2kb
yellow open git-aoc_enriched            zwK5QFjjQYSKHdPf3oyFIg 5 1 25639 6822  20.3mb  20.3mb
yellow open cocom_chaoss_study          4D1D5ygDQM6ZLqZd-9lgwg 5 1     0    0   1.2kb   1.2kb
yellow open cocom_enrich                n5fKz9aGSSq48kvrSTN0iA 5 1     0    0   1.2kb   1.2kb
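The debug log later in this thread shows enrichment selecting raw items with a term filter on the origin field. A possible way to see which origin values the raw index actually holds is a terms aggregation, sketched below with only the Python standard library. It assumes origin is indexed as a keyword-like field (which that term filter suggests), and the helper names are made up for this example. The Elasticsearch URL and index name in the usage note come from the logs in this thread.

```python
import json
import urllib.request


def origins_query(max_origins=100):
    """Build a search body that counts raw items per distinct `origin` value."""
    return {
        "size": 0,
        "aggs": {"origins": {"terms": {"field": "origin", "size": max_origins}}},
    }


def parse_origins(response):
    """Map each origin found in the aggregation response to its item count."""
    buckets = response["aggregations"]["origins"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


def fetch_origins(es_url, index):
    """POST the aggregation to <es_url>/<index>/_search and return {origin: count}."""
    request = urllib.request.Request(
        f"{es_url}/{index}/_search",
        data=json.dumps(origins_query()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return parse_origins(json.load(reply))
```

Calling `fetch_origins("http://elasticsearch:9200", "cocom_raw")` from inside the mordred container and comparing the keys with the repository URL in the "enrichment starts for" log line would show whether the raw items carry the origin string that enrichment looks for.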

That's weird. The difficult part is getting the raw indices. There must be some kind of error or exception we aren't seeing. Would you mind setting the platform to debug mode and pasting the logs you get here?

To do so, set the debug parameter to true in the general section.

[general]
debug = true
...

Also, to reduce the noise and make the process faster, you should activate only cocom or colic. To do so, set everything under the phases section to false except enrichment, and comment out the git and gitlab sections, the studies, and either cocom or colic, so that we get debug messages from only one of the processes. It should look like this:

[phases]
collection = false
identities = false
enrichment = true
panels = false

# [git]
# raw_index = git_raw
# ...
# [gitlab]
# ...

[cocom]
raw_index = cocom_raw
enriched_index = cocom_enrich
category = code_complexity_lizard_file
# studies = [enrich_cocom_analysis]
branches = master
worktree-path = /home/grimoire/worktrees/
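Before restarting, the edited file can be sanity-checked with a short sketch like the one below, which uses Python's standard configparser to list the phases that are still enabled. That SirMordred reads the file the same way is an assumption, and `enabled_phases` is a hypothetical helper name. Elision lines such as "..." must be removed first, since configparser rejects them.

```python
import configparser


def enabled_phases(cfg_text):
    """Return the names of the phases set to true in a setup.cfg-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section("phases"):
        return []
    # getboolean() accepts true/false, yes/no, on/off and 1/0.
    return [name for name in parser["phases"] if parser["phases"].getboolean(name)]


sample_cfg = """\
[phases]
collection = false
identities = false
enrichment = true
panels = false

# [git]
# raw_index = git_raw

[cocom]
raw_index = cocom_raw
enriched_index = cocom_enrich
worktree-path = /home/grimoire/worktrees/
"""
print(enabled_phases(sample_cfg))  # ['enrichment']
```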

Here it is:

2023-08-03 17:02:11,787 - sirmordred.sirmordred - INFO - 
2023-08-03 17:02:11,787 - sirmordred.sirmordred - INFO - ----------------------------
2023-08-03 17:02:11,787 - sirmordred.sirmordred - INFO - Starting SirMordred engine ...
2023-08-03 17:02:11,787 - sirmordred.sirmordred - INFO - ----------------------------
2023-08-03 17:02:11,790 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:11,793 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:02:11,795 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:02:11,796 - sirmordred.sirmordred - INFO - Loading projects
2023-08-03 17:02:11,796 - sirmordred.task_manager - DEBUG - [Global tasks] Task starts
2023-08-03 17:02:11,797 - sirmordred.task_manager - DEBUG - [<class 'sirmordred.task_projects.TaskProjects'>]
2023-08-03 17:02:11,798 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): nginx:8000
2023-08-03 17:02:11,801 - urllib3.connectionpool - DEBUG - http://nginx:8000 "GET /identities/api/ HTTP/1.1" 400 79
2023-08-03 17:02:11,802 - sgqlc.endpoint.requests - DEBUG - Query:
mutation {
tokenAuth(username: "root", password: "root") {
token
}
}
2023-08-03 17:02:11,919 - urllib3.connectionpool - DEBUG - http://nginx:8000 "POST /identities/api/ HTTP/1.1" 200 192
2023-08-03 17:02:11,920 - sirmordred.task_manager - DEBUG - [Global tasks] Tasks will be executed in this order: [<sirmordred.task_projects.TaskProjects object at 0x7f6070d9d850>]
2023-08-03 17:02:11,920 - sirmordred.task_manager - DEBUG - [Global tasks] Tasks started: <sirmordred.task_projects.TaskProjects object at 0x7f6070d9d850>
2023-08-03 17:02:11,921 - sirmordred.task_projects - INFO - Reading projects data from  /home/grimoire/conf/projects.json 
2023-08-03 17:02:11,921 - sirmordred.task_projects - DEBUG - Projects file has changed
2023-08-03 17:02:11,921 - sirmordred.task_manager - DEBUG - [Global tasks] Tasks finished: <sirmordred.task_projects.TaskProjects object at 0x7f6070d9d850>
2023-08-03 17:02:11,921 - sirmordred.task_manager - DEBUG - [Global tasks] Task is exiting
2023-08-03 17:02:11,921 - sirmordred.sirmordred - INFO - Projects loaded
2023-08-03 17:02:11,922 - sirmordred.sirmordred - DEBUG - backend_tasks = [<class 'sirmordred.task_enrich.TaskEnrich'>]
2023-08-03 17:02:11,922 - sirmordred.sirmordred - DEBUG - global_tasks = [<class 'sirmordred.task_projects.TaskProjects'>]
2023-08-03 17:02:11,923 - sirmordred.task_manager - DEBUG - [cocom:master] Task starts
2023-08-03 17:02:11,923 - sirmordred.task_manager - DEBUG - [Global tasks] Task starts
2023-08-03 17:02:11,923 - sirmordred.sirmordred - INFO - TaskProjects will be executed on Thu, 03 Aug 2023 17:03:51 
2023-08-03 17:02:11,924 - sirmordred.task_manager - DEBUG - [<class 'sirmordred.task_enrich.TaskEnrich'>]
2023-08-03 17:02:11,924 - sirmordred.task_manager - DEBUG - [<class 'sirmordred.task_projects.TaskProjects'>]
2023-08-03 17:02:11,925 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): nginx:8000
2023-08-03 17:02:11,926 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): nginx:8000
2023-08-03 17:02:11,930 - urllib3.connectionpool - DEBUG - http://nginx:8000 "GET /identities/api/ HTTP/1.1" 400 79
2023-08-03 17:02:11,930 - urllib3.connectionpool - DEBUG - http://nginx:8000 "GET /identities/api/ HTTP/1.1" 400 79
2023-08-03 17:02:11,931 - sgqlc.endpoint.requests - DEBUG - Query:
mutation {
tokenAuth(username: "root", password: "root") {
token
}
}
2023-08-03 17:02:11,932 - sgqlc.endpoint.requests - DEBUG - Query:
mutation {
tokenAuth(username: "root", password: "root") {
token
}
}
2023-08-03 17:02:12,039 - urllib3.connectionpool - DEBUG - http://nginx:8000 "POST /identities/api/ HTTP/1.1" 200 192
2023-08-03 17:02:12,040 - sirmordred.task_manager - DEBUG - [Global tasks] Tasks will be executed in this order: [<sirmordred.task_projects.TaskProjects object at 0x7f6070d9dd90>]
2023-08-03 17:02:12,040 - sirmordred.task_manager - DEBUG - [Global tasks] Tasks started: <sirmordred.task_projects.TaskProjects object at 0x7f6070d9dd90>
2023-08-03 17:02:12,041 - sirmordred.task_projects - INFO - Reading projects data from  /home/grimoire/conf/projects.json 
2023-08-03 17:02:12,041 - sirmordred.task_manager - DEBUG - [Global tasks] Tasks finished: <sirmordred.task_projects.TaskProjects object at 0x7f6070d9dd90>
2023-08-03 17:02:12,041 - sirmordred.task_manager - INFO - [Global tasks] sleeping for 100 seconds 
2023-08-03 17:02:12,042 - urllib3.connectionpool - DEBUG - http://nginx:8000 "POST /identities/api/ HTTP/1.1" 200 192
2023-08-03 17:02:12,043 - sirmordred.task_manager - DEBUG - [cocom:master] Tasks will be executed in this order: [<sirmordred.task_enrich.TaskEnrich object at 0x7f6070d9db80>]
2023-08-03 17:02:12,043 - sirmordred.task_manager - DEBUG - [cocom:master] Tasks started: <sirmordred.task_enrich.TaskEnrich object at 0x7f6070d9db80>
2023-08-03 17:02:22,054 - sirmordred.task_enrich - DEBUG - Number of enrichment tasks active: 1
2023-08-03 17:02:22,056 - sirmordred.task_enrich - INFO - [cocom:master] enrichment phase starts
2023-08-03 17:02:22,057 - sirmordred.task_projects - DEBUG - List of repos for cocom:master: ['https://grimoire-bot:password@git.domen.com/repo'] (raw=False)
2023-08-03 17:02:22,058 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,059 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:02:22,060 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:02:22,061 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,062 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:02:22,062 - grimoire_elk.elastic - DEBUG - http://elasticsearch:9200/cocom_enrich/_search 
        { "size": 0, "query": {"bool": {"filter": []}},  
            "aggs": {
                "1": {
                  "max": {
                    "field": "metadata__timestamp"
                  }
                }
            }
        
        } 
2023-08-03 17:02:22,064 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "POST /cocom_enrich/_search HTTP/1.1" 200 148
2023-08-03 17:02:22,066 - sirmordred.task - WARNING - Empty section latest-items
2023-08-03 17:02:22,066 - sirmordred.task_enrich - INFO - [cocom:master] enrichment starts for https://git.domen.com/repo
2023-08-03 17:02:22,069 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): nginx:8000
2023-08-03 17:02:22,072 - urllib3.connectionpool - DEBUG - http://nginx:8000 "GET /identities/api/ HTTP/1.1" 400 79
2023-08-03 17:02:22,073 - sgqlc.endpoint.requests - DEBUG - Query:
mutation {
tokenAuth(username: "root", password: "root") {
token
}
}
2023-08-03 17:02:22,182 - urllib3.connectionpool - DEBUG - http://nginx:8000 "POST /identities/api/ HTTP/1.1" 200 192
2023-08-03 17:02:22,186 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,187 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:02:22,187 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:02:22,188 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,190 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:02:22,193 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_enrich/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:02:22,195 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /_aliases HTTP/1.1" 200 297
2023-08-03 17:02:22,196 - grimoire_elk.elastic - DEBUG - Alias cocom won't be set on http://elasticsearch:9200/cocom_enrich, it already exists on http://elasticsearch:9200
2023-08-03 17:02:22,196 - grimoire_elk.elk - DEBUG - Last enrichment: None
2023-08-03 17:02:22,197 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,199 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:02:22,199 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:02:22,200 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,202 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_raw HTTP/1.1" 200 530
2023-08-03 17:02:22,206 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_raw/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:02:22,210 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_raw/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:02:22,210 - grimoire_elk.elk - DEBUG - Adding enrichment data to http://elasticsearch:9200/cocom_enrich
2023-08-03 17:02:22,210 - grimoire_elk.elastic_items - DEBUG - Creating a elastic items generator.
2023-08-03 17:02:22,211 - grimoire_elk.elastic_items - DEBUG - Raw query to http://elasticsearch:9200/cocom_raw/_search?scroll=10m&size=100
{
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "origin": "https://git.domen.com/repo"
                    }
                }
            ]
        }
    },
    "sort": {
        "metadata__timestamp": {
            "order": "asc"
        }
    }
}
2023-08-03 17:02:22,212 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,214 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "POST /cocom_raw/_search?scroll=10m&size=100 HTTP/1.1" 200 272
2023-08-03 17:02:22,215 - grimoire_elk.elastic_items - DEBUG - No results found from http://elasticsearch:9200/cocom_raw and filter None
2023-08-03 17:02:22,215 - grimoire_elk.elastic_items - DEBUG - Releasing scroll_id=DnF1ZXJ5VGhlbkZldGNoBQAAAAAAGB4IFklGa094M2NSUVBtYWxjQXU0VXU2QXcAAAAAABgeCRZJRmtPeDNjUlFQbWFsY0F1NFV1NkF3AAAAAAAYHgUWSUZrT3gzY1JRUG1hbGNBdTRVdTZBdwAAAAAAGB4HFklGa094M2NSUVBtYWxjQXU0VXU2QXcAAAAAABgeBhZJRmtPeDNjUlFQbWFsY0F1NFV1NkF3
2023-08-03 17:02:22,217 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "DELETE /_search/scroll HTTP/1.1" 200 57
2023-08-03 17:02:22,217 - grimoire_elk.enriched.cocom - INFO - [cocom] 0 items inserted
2023-08-03 17:02:22,217 - grimoire_elk.elk - DEBUG - Total items enriched 0 
2023-08-03 17:02:22,217 - grimoire_elk.elk - INFO - [cocom] Done enrichment for https://git.domen.com/repo
2023-08-03 17:02:22,218 - sirmordred.task_enrich - INFO - [cocom:master] enrichment finished for https://git.domen.com/repo
2023-08-03 17:02:22,218 - sirmordred.task_enrich - INFO - [cocom:master] enrichment phase finished in 0:00:00
2023-08-03 17:02:22,218 - sirmordred.task_enrich - INFO - [cocom:master] data retention start
2023-08-03 17:02:22,220 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,221 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:02:22,222 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:02:22,223 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,225 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:02:22,225 - grimoire_elk.elastic - DEBUG - [items retention] Retention policy disabled, no items will be deleted.
2023-08-03 17:02:22,226 - sirmordred.task_enrich - INFO - [cocom:master] data retention end
2023-08-03 17:02:22,226 - sirmordred.task_enrich - DEBUG - [identities retention] Retention policy disabled, no identities will be deleted.
2023-08-03 17:02:22,226 - sirmordred.task_enrich - INFO - [cocom:master] identities retention end
2023-08-03 17:02:22,227 - sirmordred.task_enrich - INFO - [cocom:master] autorefresh start
2023-08-03 17:02:22,229 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,230 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:02:22,231 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:02:22,232 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,234 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:02:22,237 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_enrich/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:02:22,240 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,241 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:02:22,242 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:02:22,243 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:02:22,246 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:02:22,250 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_enrich/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:02:22,251 - sirmordred.task_enrich - INFO - [cocom:master] Refreshing identities from 2023-08-01 17:02:12.043862
2023-08-03 17:02:22,251 - sirmordred.task_enrich - DEBUG - Getting last modified identities from SH since 2023-08-01 17:02:12.043862 for cocom:master
2023-08-03 17:02:22,251 - sirmordred.task_enrich - DEBUG - Refreshing identity ids for cocom:master
2023-08-03 17:02:22,253 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): nginx:8000
2023-08-03 17:02:22,265 - urllib3.connectionpool - DEBUG - http://nginx:8000 "POST /identities/api/ HTTP/1.1" 200 69
2023-08-03 17:02:22,266 - grimoire_elk.elastic - DEBUG - Adding items to http://elasticsearch:9200/cocom_enrich/items/_bulk (in 1000 packs)
2023-08-03 17:02:22,266 - grimoire_elk.elk - DEBUG - Refreshing identities fields from http://elasticsearch:9200/cocom_enrich
2023-08-03 17:02:22,266 - grimoire_elk.elk - DEBUG - Refreshing identities
2023-08-03 17:02:22,266 - grimoire_elk.elk - DEBUG - Total eitems refreshed for identities fields 0
2023-08-03 17:02:22,266 - sirmordred.task_enrich - DEBUG - [cocom:master] Individuals refreshed: 0
2023-08-03 17:02:22,266 - sirmordred.task_enrich - DEBUG - No ids to be refreshed found
2023-08-03 17:02:22,266 - sirmordred.task_enrich - INFO - [cocom:master] Refreshed 0 identities in 0:00:00
2023-08-03 17:02:22,266 - sirmordred.task_enrich - INFO - [cocom:master] autorefresh end
2023-08-03 17:02:22,267 - sirmordred.task_enrich - INFO - [cocom:master] no studies phase
2023-08-03 17:02:22,267 - sirmordred.task_enrich - INFO - [cocom:master] autorefresh for studies start
2023-08-03 17:02:22,267 - sirmordred.task_enrich - DEBUG - Not doing autorefresh for studies, Areas of Code study is not active.
2023-08-03 17:02:22,267 - sirmordred.task_enrich - INFO - [cocom:master] autorefresh for studies end
2023-08-03 17:02:22,267 - sirmordred.task_manager - DEBUG - [cocom:master] Tasks finished: <sirmordred.task_enrich.TaskEnrich object at 0x7f6070d9db80>
2023-08-03 17:02:22,267 - sirmordred.task_manager - INFO - [cocom:master] sleeping for 60 seconds 
2023-08-03 17:03:22,294 - sirmordred.task_manager - DEBUG - [cocom:master] Tasks started: <sirmordred.task_enrich.TaskEnrich object at 0x7f6070d9db80>
2023-08-03 17:03:32,305 - sirmordred.task_enrich - DEBUG - Number of enrichment tasks active: 1
2023-08-03 17:03:32,307 - sirmordred.task_enrich - INFO - [cocom:master] enrichment phase starts
2023-08-03 17:03:32,308 - sirmordred.task_projects - DEBUG - List of repos for cocom:master: ['https://grimoire-bot:password@git.domen.com/repo'] (raw=False)
2023-08-03 17:03:32,310 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,312 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:03:32,313 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:03:32,314 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,317 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:03:32,317 - grimoire_elk.elastic - DEBUG - http://elasticsearch:9200/cocom_enrich/_search 
        { "size": 0, "query": {"bool": {"filter": []}},  
            "aggs": {
                "1": {
                  "max": {
                    "field": "metadata__timestamp"
                  }
                }
            }
        
        } 
2023-08-03 17:03:32,320 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "POST /cocom_enrich/_search HTTP/1.1" 200 148
2023-08-03 17:03:32,322 - sirmordred.task - WARNING - Empty section latest-items
2023-08-03 17:03:32,322 - sirmordred.task_enrich - INFO - [cocom:master] enrichment starts for https://git.domen.com/repo
2023-08-03 17:03:32,328 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,329 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:03:32,330 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:03:32,332 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,333 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:03:32,337 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_enrich/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:03:32,339 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /_aliases HTTP/1.1" 200 297
2023-08-03 17:03:32,339 - grimoire_elk.elastic - DEBUG - Alias cocom won't be set on http://elasticsearch:9200/cocom_enrich, it already exists on http://elasticsearch:9200
2023-08-03 17:03:32,340 - grimoire_elk.elk - DEBUG - Last enrichment: None
2023-08-03 17:03:32,341 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,342 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:03:32,343 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:03:32,344 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,346 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_raw HTTP/1.1" 200 530
2023-08-03 17:03:32,350 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_raw/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:03:32,355 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_raw/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:03:32,355 - grimoire_elk.elk - DEBUG - Adding enrichment data to http://elasticsearch:9200/cocom_enrich
2023-08-03 17:03:32,355 - grimoire_elk.elastic_items - DEBUG - Creating a elastic items generator.
2023-08-03 17:03:32,356 - grimoire_elk.elastic_items - DEBUG - Raw query to http://elasticsearch:9200/cocom_raw/_search?scroll=10m&size=100
{
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "origin": "https://git.domen.com/repo"
                    }
                }
            ]
        }
    },
    "sort": {
        "metadata__timestamp": {
            "order": "asc"
        }
    }
}
2023-08-03 17:03:32,357 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,359 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "POST /cocom_raw/_search?scroll=10m&size=100 HTTP/1.1" 200 270
2023-08-03 17:03:32,360 - grimoire_elk.elastic_items - DEBUG - No results found from http://elasticsearch:9200/cocom_raw and filter None
2023-08-03 17:03:32,360 - grimoire_elk.elastic_items - DEBUG - Releasing scroll_id=DnF1ZXJ5VGhlbkZldGNoBQAAAAAAGB4QFklGa094M2NSUVBtYWxjQXU0VXU2QXcAAAAAABgeERZJRmtPeDNjUlFQbWFsY0F1NFV1NkF3AAAAAAAYHhMWSUZrT3gzY1JRUG1hbGNBdTRVdTZBdwAAAAAAGB4PFklGa094M2NSUVBtYWxjQXU0VXU2QXcAAAAAABgeEhZJRmtPeDNjUlFQbWFsY0F1NFV1NkF3
2023-08-03 17:03:32,362 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "DELETE /_search/scroll HTTP/1.1" 200 57
2023-08-03 17:03:32,362 - grimoire_elk.enriched.cocom - INFO - [cocom] 0 items inserted
2023-08-03 17:03:32,362 - grimoire_elk.elk - DEBUG - Total items enriched 0 
2023-08-03 17:03:32,362 - grimoire_elk.elk - INFO - [cocom] Done enrichment for https://git.domen.com/repo
2023-08-03 17:03:32,363 - sirmordred.task_enrich - INFO - [cocom:master] enrichment finished for https://git.domen.com/repo
2023-08-03 17:03:32,363 - sirmordred.task_enrich - INFO - [cocom:master] enrichment phase finished in 0:00:00
2023-08-03 17:03:32,363 - sirmordred.task_enrich - INFO - [cocom:master] data retention start
2023-08-03 17:03:32,365 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,366 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:03:32,367 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:03:32,368 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,370 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:03:32,371 - grimoire_elk.elastic - DEBUG - [items retention] Retention policy disabled, no items will be deleted.
2023-08-03 17:03:32,371 - sirmordred.task_enrich - INFO - [cocom:master] data retention end
2023-08-03 17:03:32,372 - sirmordred.task_enrich - DEBUG - [identities retention] Retention policy disabled, no identities will be deleted.
2023-08-03 17:03:32,372 - sirmordred.task_enrich - INFO - [cocom:master] identities retention end
2023-08-03 17:03:32,372 - sirmordred.task_enrich - INFO - [cocom:master] autorefresh start
2023-08-03 17:03:32,374 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,376 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:03:32,376 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:03:32,378 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,380 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:03:32,384 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_enrich/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:03:32,388 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,390 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET / HTTP/1.1" 200 298
2023-08-03 17:03:32,390 - grimoire_elk.elastic - DEBUG - Found version of elasticsearch instance at http://elasticsearch:9200: 6.
2023-08-03 17:03:32,392 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): elasticsearch:9200
2023-08-03 17:03:32,394 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "GET /cocom_enrich HTTP/1.1" 200 310
2023-08-03 17:03:32,399 - urllib3.connectionpool - DEBUG - http://elasticsearch:9200 "PUT /cocom_enrich/items/_mapping HTTP/1.1" 200 47
2023-08-03 17:03:32,399 - sirmordred.task_enrich - INFO - [cocom:master] Refreshing identities from 2023-08-03 17:02:22.251733
2023-08-03 17:03:32,400 - sirmordred.task_enrich - DEBUG - Getting last modified identities from SH since 2023-08-03 17:02:22.251733 for cocom:master
2023-08-03 17:03:32,400 - sirmordred.task_enrich - DEBUG - Refreshing identity ids for cocom:master
2023-08-03 17:03:32,401 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): nginx:8000
2023-08-03 17:03:32,415 - urllib3.connectionpool - DEBUG - http://nginx:8000 "POST /identities/api/ HTTP/1.1" 200 69
2023-08-03 17:03:32,416 - grimoire_elk.elastic - DEBUG - Adding items to http://elasticsearch:9200/cocom_enrich/items/_bulk (in 1000 packs)
2023-08-03 17:03:32,416 - grimoire_elk.elk - DEBUG - Refreshing identities fields from http://elasticsearch:9200/cocom_enrich
2023-08-03 17:03:32,416 - grimoire_elk.elk - DEBUG - Refreshing identities
2023-08-03 17:03:32,416 - grimoire_elk.elk - DEBUG - Total eitems refreshed for identities fields 0
2023-08-03 17:03:32,416 - sirmordred.task_enrich - DEBUG - [cocom:master] Individuals refreshed: 0
2023-08-03 17:03:32,416 - sirmordred.task_enrich - DEBUG - No ids to be refreshed found
2023-08-03 17:03:32,416 - sirmordred.task_enrich - INFO - [cocom:master] Refreshed 0 identities in 0:00:00
2023-08-03 17:03:32,416 - sirmordred.task_enrich - INFO - [cocom:master] autorefresh end
2023-08-03 17:03:32,417 - sirmordred.task_enrich - INFO - [cocom:master] no studies phase
2023-08-03 17:03:32,417 - sirmordred.task_enrich - INFO - [cocom:master] autorefresh for studies start
2023-08-03 17:03:32,417 - sirmordred.task_enrich - DEBUG - Not doing autorefresh for studies, Areas of Code study is not active.
2023-08-03 17:03:32,417 - sirmordred.task_enrich - INFO - [cocom:master] autorefresh for studies end
2023-08-03 17:03:32,417 - sirmordred.task_manager - DEBUG - [cocom:master] Tasks finished: <sirmordred.task_enrich.TaskEnrich object at 0x7f6070d9db80>
2023-08-03 17:03:32,418 - sirmordred.task_manager - INFO - [cocom:master] sleeping for 60 seconds 
2023-08-03 17:03:52,115 - sirmordred.task_manager - DEBUG - [Global tasks] Tasks started: <sirmordred.task_projects.TaskProjects object at 0x7f6070d9dd90>
2023-08-03 17:03:52,116 - sirmordred.task_projects - INFO - Reading projects data from  /home/grimoire/conf/projects.json 
2023-08-03 17:03:52,116 - sirmordred.task_manager - DEBUG - [Global tasks] Tasks finished: <sirmordred.task_projects.TaskProjects object at 0x7f6070d9dd90>
2023-08-03 17:03:52,116 - sirmordred.task_manager - INFO - [Global tasks] sleeping for 100 seconds 
zhquan commented

Hi @GrayStranger

Could you run this query in dev tools and paste the result? It seems that this query is returning 0 items

GET cocom_raw/_search
{
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "origin": "https://git.domen.com/repo"
                    }
                }
            ]
        }
    },
    "sort": {
        "metadata__timestamp": {
            "order": "asc"
        }
    }
}

With only the git URL (no creds), this request returns:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

With the git URL that includes the creds, the same request returns plenty of data. There is too much sensitive info in it, so sorry, I can't paste it here :). I think this is the key to the problem:

"origin" : "https://grimoire-bot:password@git.domen.com/repo"

The origin is stored in the index in its full form, with the creds inside, while the default request to the raw index uses the repo URL without creds and so returns nothing. Am I right?

Bug or feature? :)

zhquan commented

The origin is stored in the index in its full form, with the creds inside, while the default request to the raw index uses the repo URL without creds and so returns nothing. Am I right?

Yes, that means cocom and colic can only enrich repositories without creds in projects.json

But if I remove the creds, it is unable to collect raw data for cocom and colic, and I get this in the log:

fatal: could not read Username for 'https://git.domen.com': No such device or address

So for now it only works for public repos? In the case of cocom and colic, the corresponding projects.json sections would need different formats for raw data collection and for enrichment.

zhquan commented

If you remove the creds in projects.json, you have to remove them from all documents in the raw index (origin field) and also rename the clones.

For now, it only works for public repos
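
For anyone attempting the manual cleanup of the origin field mentioned above, Elasticsearch's `_update_by_query` API is one possible route. The following is only a sketch that builds the request body and prints it; it does not contact Elasticsearch. The index name `cocom_raw` and both URLs are taken from the logs in this thread, while `build_origin_rewrite` is a hypothetical helper name. It rewrites only `origin`, so any other field that embeds the URL would need similar handling, and it does not rename the clones. Trying it on a copy of the index first seems prudent.

```python
import json


def build_origin_rewrite(old_origin, new_origin):
    """Build an _update_by_query body that rewrites the origin field.

    Hypothetical helper: selects documents whose origin equals old_origin
    exactly (term query) and sets origin to new_origin via a Painless script.
    """
    return {
        "script": {
            "lang": "painless",
            "source": "ctx._source.origin = params.new_origin",
            "params": {"new_origin": new_origin},
        },
        "query": {"term": {"origin": old_origin}},
    }


body = build_origin_rewrite(
    "https://grimoire-bot:password@git.domen.com/repo",
    "https://git.domen.com/repo",
)

# Request body for: POST cocom_raw/_update_by_query
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into dev tools under `POST cocom_raw/_update_by_query`.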

Is the way to fix it to modify one request in the enrichment logic (to use the backend format of repo entities)? Or is it a more complex issue?

zhquan commented

The way to fix this is to anonymize the fields with credentials in the raw index so no credentials will be stored in the indexes, like https://github.com/chaoss/grimoirelab-elk/blob/master/grimoire_elk/raw/git.py#L75

Cocom and colic use https://github.com/chaoss/grimoirelab-elk/blob/master/grimoire_elk/raw/graal.py
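
To illustrate the anonymization idea, here is a minimal sketch using only the Python standard library. It is not the code from grimoire_elk; `strip_credentials` is a hypothetical helper, and the URLs are the ones from the logs above. The point is that once the `user:password@` part is dropped before the URL is stored as origin, the stored value is the same string the enrichment query filters on.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url):
    """Return url without any user:password@ prefix in the host part.

    Hypothetical helper for illustration; URLs without credentials
    are returned unchanged.
    """
    parts = urlsplit(url)
    # Everything after the last "@" in the network location is host[:port].
    host = parts.netloc.rpartition("@")[2]
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


with_creds = "https://grimoire-bot:password@git.domen.com/repo"
queried_origin = "https://git.domen.com/repo"

# The term filter needs an exact match, so the raw string fails but the
# anonymized one matches the origin used by the enrichment query.
print(with_creds == queried_origin)
print(strip_credentials(with_creds) == queried_origin)
```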

I'm using docker-compose to run grimoirelab and could fix it in the mordred container, but would this fix be included in the next release?

zhquan commented

If your code works, could you please create a PR with the fix?

Yes, it works. I'll create a PR tomorrow.

Thanks for the help!

Update: this solution only works if latest-items=false for cocom/colic. I can't figure out why; most likely the two issues are not connected.

@sduenas could you please review the connected PR so the fix can get in?

Thanks!