Issue deploying logging in 3.11
Closed this issue · 9 comments
Deploying logging in a 3.11 cluster fails with the following error.
The full traceback is:
WARNING: The below traceback may *not* be related to the actual failure.
File "/tmp/ansible_command_payload_svseXs/ansible_command_payload.zip/ansible/module_utils/basic.py", line 2561, in run_command
cmd = subprocess.Popen(args, **kwargs)
File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
fatal: [master01.domain -> localhost]: FAILED! => {
"changed": false,
"cmd": "patch --force --quiet -u /tmp/openshift-logging-ansible-yb1r0N/configmap_new_file /tmp/openshift-logging-ansible-yb1r0N/patch.patch",
"invocation": {
"module_args": {
"_raw_params": "patch --force --quiet -u /tmp/openshift-logging-ansible-yb1r0N/configmap_new_file /tmp/openshift-logging-ansible-yb1r0N/patch.patch",
"_uses_shell": false,
"argv": null,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"stdin_add_newline": true,
"strip_empty_ends": true,
"warn": true
}
},
"msg": "[Errno 2] No such file or directory",
"rc": 2
}
Using the latest openshift-ansible playbooks: openshift-ansible-3.11.343-1
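The "[Errno 2] No such file or directory" with rc=2 comes from subprocess.Popen failing to spawn the patch executable itself, which usually means the patch utility is not installed on the host running the task (localhost here, i.e. the Ansible control host). A minimal check, assuming a yum-based control host where the package is simply named patch:

# Confirm the patch binary exists on the Ansible control host
command -v patch || echo "patch is not installed"
# Install it if missing (assumes a yum-based host)
sudo yum install -y patch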
Inventory
openshift_logging_image_version=v3.11.0
openshift_logging_use_ops=true
openshift_logging_install_logging=true
openshift_logging_master_url=https://cluster11.domain:8443
openshift_logging_install_eventrouter=true
openshift_logging_eventrouter_nodeselector={"node-role.kubernetes.io/infra":"true"}
openshift_logging_curator_default_days=15
openshift_logging_curator_run_hour=23
openshift_logging_curator_run_minute=00
openshift_logging_curator_run_timezone=America/NewYork
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra":"true"}
openshift_logging_es_memory_limit=32Gi
openshift_logging_es_ops_memory_limit=16Gi
openshift_logging_kibana_hostname=logging.prod11.domain
openshift_logging_fluentd_audit_container_engine=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_storage_class_name=glusterfs
openshift_logging_es_pvc_size=250Gi
openshift_logging_elasticsearch_storage_type=pvc
openshift_logging_es_pvc_prefix=logging-es
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra":"true"}
openshift_logging_es_ops_pvc_dynamic=true
openshift_logging_es_ops_pvc_storage_class_name=glusterfs
openshift_logging_es_ops_pvc_size=250Gi
openshift_logging_es_ops_pvc_prefix=logging-ops-es
openshift_logging_es_ops_nodeselector={"node-role.kubernetes.io/infra":"true"}
openshift_logging_es_number_of_replicas=1
openshift_logging_fluentd_image=quay.io/openshift/origin-logging-fluentd:v3.11.0
openshift_logging_kibana_image=quay.io/openshift/origin-loging-kibana5:v3.11.0
openshift_logging_curator_image=quay.io/openshift/origin-logging-curator:v3.11.0
openshift_logging_eventrouter_image=quay.io/openshift/origin-logging-eventrouter:v3.11.0
openshift_logging_elasticsearch_image=quay.io/openshift/origin-logging-elasticsearch5:v3.11.0
openshift_logging_es_cluster_size=3
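As an aside, the kibana image line above contains a typo (origin-loging-kibana5); the intended value is presumably the standard 3.11 image name:

openshift_logging_kibana_image=quay.io/openshift/origin-logging-kibana5:v3.11.0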
Can you please attach logs with -vvv enabled?
Jeff, exactly which log(s) are you referring to? Ansible?
Rerun the playbook to enable more verbose logging with -vvv and attach the outcome.
Jeff, the log was uploaded. I also noticed a typo in the inventory for the logging-kibana image, and I reverted to a previous ansible playbook, which deployed OK. With a little adjustment I got logging-es-data-master and logging-kibana running, but es-ops-data and ops-kibana crash. Fluentd is up and running on all nodes with no issues. While I can see the storage being used, the Kibana dashboard returns an empty result. It appears to be "temporarily failed to flush the buffer", as I see in the fluentd logs on the nodes.
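A quick way to narrow down the "temporarily failed to flush the buffer" messages is to confirm that fluentd can actually reach the (ops) Elasticsearch cluster. A hedged sketch, assuming the default openshift-logging namespace and that the es_util helper is present in the logging-elasticsearch5 image:

# List the logging pods and confirm the es-ops and ops-kibana pods are Running
oc -n openshift-logging get pods -o wide
# Inspect why a crashing es-ops pod is failing (replace <pod> with the actual name)
oc -n openshift-logging logs <pod> -c elasticsearch
# Check cluster health from inside a running Elasticsearch pod
oc -n openshift-logging exec <pod> -c elasticsearch -- es_util --query=_cat/health?v

If the ops cluster never reaches green or yellow, fluentd will keep buffering and Kibana will stay empty.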
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.