Opensearch service not opening up port 9200
Closed this issue · 9 comments
Hi Team,
I'm running Epiphany for a hitachienergy project, but I'm getting an error in one specific task.
TASK [opensearch : Wait for opensearch service to start up] ********************
09:26:31 INFO cli.src.ansible.AnsibleCommand - "connect_timeout": 5,
09:26:31 INFO cli.src.ansible.AnsibleCommand - "delay": 0,
09:26:31 INFO cli.src.ansible.AnsibleCommand - "exclude_hosts": null,
09:26:31 INFO cli.src.ansible.AnsibleCommand - "host": "10.1.30.96",
09:26:31 INFO cli.src.ansible.AnsibleCommand - "msg": null,
09:26:31 INFO cli.src.ansible.AnsibleCommand - "path": null,
09:26:31 INFO cli.src.ansible.AnsibleCommand - "port": 9200,
09:26:31 INFO cli.src.ansible.AnsibleCommand - "search_regex": null,
09:26:31 INFO cli.src.ansible.AnsibleCommand - "sleep": 1,
09:26:31 INFO cli.src.ansible.AnsibleCommand - "state": "started",
09:26:31 INFO cli.src.ansible.AnsibleCommand - "timeout": 300
09:26:31 INFO cli.src.ansible.AnsibleCommand - }
09:26:31 INFO cli.src.ansible.AnsibleCommand - },
09:26:31 INFO cli.src.ansible.AnsibleCommand - "msg": "Timeout when waiting for 10.1.30.96:9200"
When I check the machine, I can see that the opensearch service is running, but port 9200 is not open; I verified this on the machine as well.
The config file says the service is configured to run on port 9200 only, but the service is not reachable on that port, so Epiphany fails at the opensearch startup task. Please help me proceed.
Hey @rahulrajanrepo, can you please share your config file? My first thought is that you might need to enable the proper security rules for the vnet that opensearch is deployed in. This can be for either the monitoring or the opensearch component.
It depends on how you set up your network layout.
Hi Seriva,
Thanks for the update. It looks like, with my current system configuration, opensearch takes around 10 minutes to come up fully and become active, so the validation step "Wait for opensearch service to start up" fails before that. Also, I'm running this on an on-prem machine, not in the cloud. I have attached the config file below.
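To measure how long the port actually takes to open, a small probe along these lines could help. This is only a sketch, not Epiphany code: the function name `wait_for_port` is made up for illustration, and its defaults simply mirror the `timeout`, `sleep` and `connect_timeout` values shown in the task log above.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, sleep=1.0, connect_timeout=5.0):
    """Poll host:port until a TCP connection succeeds or `timeout` seconds pass.

    Hypothetical helper: returns the elapsed seconds on success, None on timeout.
    """
    start = time.monotonic()
    deadline = start + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return time.monotonic() - start
        except OSError:
            # Connection refused or timed out: wait a bit and try again.
            time.sleep(sleep)
    return None


# Example call (host and port taken from the log above, timeout raised to 15 minutes):
# elapsed = wait_for_port("10.1.30.96", 9200, timeout=900)
# print("port open after", elapsed, "seconds" if elapsed is not None else "(timed out)")
```

If the probe reports a time well above 300 seconds, that would be consistent with the task timing out before opensearch finishes starting, rather than with a blocked port.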
Looking at the config, you have logging, monitoring and repo running on the same box. My guess would be that it's starved for resources and that's why it takes so long for opensearch to spin up. If that is the case, you would have a hard time using the services to begin with.
What are the specs of the box?
Yes, that is a valid point. We were given only a small cluster with limited resources, so we had to put them together on one machine.
VMs (each 4vCPUs, 16GB RAM) | Hosted software | Storage requirements (disk) |
---|---|---|
1 | Kubernetes master, haproxy, kafka | 64 GB |
2 & 3 | Kubernetes worker node | 32 GB |
4 | DB (db data, db log, backup data, backup service log) | 64 GB |
5 | Repo, logging, monitoring | 100 GB |
As @seriva already mentioned, it looks like you're running out of resources on the VM. To get through the installation process, as a workaround, you can modify the failing task by adding a `timeout` parameter:
https://github.com/hitachienergy/epiphany/blob/v2.0/ansible/playbooks/roles/opensearch/tasks/configure-opensearch.yml#L123 (`/epicli/ansible/playbooks/roles/opensearch/tasks/configure-opensearch.yml` in the container).
However, I would expect further problems with the proper operation of all services.
Yes, agreed with what @przemyslavic said. Hosting repo/logging/monitoring on a single box that size might be too much. By default, logging and monitoring run on separate VMs of the size you are using, so expect more problems when the services are under load.
My advice is to scale up that machine or separate services over multiple machines.
@rahulrajanrepo Any more input from your side or something more needed from our side? Or can this issue be closed?
Hi Seriva,
I just want to know the minimum hardware requirements for each server to run smoothly. Thanks.
That is not something we can answer as this is project specific. For our testing of clusters in the Azure cloud without an application on top we use the following VM sizes per component separately:
Logging: Standard_DS2_v2
Monitoring: Standard_DS2_v2
Repository: Standard_DS1_v2
Again, these are empty clusters, so you need to check with the project what load you are expecting and adjust accordingly. Closing the issue as this is not Epiphany related but project related.