EMQX cluster not working in IPv6-only network
axkng opened this issue · 28 comments
Describe the bug
After following the getting-started page to set up the emqx-operator, I provisioned an EMQX cluster.
The pods start and are running, but the status commands return errors:
kubectl exec -n emqx -it emqx-0 -c emqx -- emqx_ctl status
Node 'emqx@emqx-0.emqx-headless.emqx.svc.cluster.local' not responding to pings.
/opt/emqx/bin/emqx: line 46: die: command not found
command terminated with exit code 127
To Reproduce
Steps to reproduce the behavior:
- Deploy the operator to an EKS cluster with Kubernetes 1.22.9
- Deploy a simple broker (can be without persistence, I tested that.)
- Check the output of the status commands and get errors.
Expected behavior
The status commands should not return errors after provisioning a simple broker with no config.
Anything else we need to know?:
Environment details:
- Kubernetes version: 1.22.9
- Cloud-provider/provisioner: AWS EKS
- emqx-operator version: 1.2.4
- Install method: helm, emqx deployed as crd
emqx-manifest:
---
apiVersion: apps.emqx.io/v1beta3
kind: EmqxBroker
metadata:
  name: emqx
  labels:
    app: emqx
    environment: dev
spec:
  persistent:
    accessModes:
      - ReadWriteOnce
    storageClassName: ebs-gp3
    resources:
      requests:
        storage: 1Gi
  emqxTemplate:
    image: emqx/emqx:4.4.6
Did I do something wrong here?
Hi, @Furragen
Could you please share the emqx-operator logs and the EMQX custom resource status? Run the following commands:
kubectl get EmqxBroker emqx -o json | jq '.status'
kubectl logs -f -l "control-plane=controller-manager" -n emqx-operator-system -c manager --tail=100
And the EMQX pod logs:
kubectl logs emqx-0 -c emqx
Hi @Rory-Z ,
thanks for your quick response.
kubectl get -n emqx EmqxBroker emqx -o json | jq '.status'
{
"conditions": [
{
"lastTransitionTime": "2022-08-10T07:04:09Z",
"lastUpdateTime": "2022-08-10T07:26:23Z",
"message": "Some nodes are not ready",
"reason": "ClusterNotReady",
"status": "False",
"type": "Running"
},
{
"lastTransitionTime": "2022-08-10T07:03:26Z",
"lastUpdateTime": "2022-08-10T07:03:26Z",
"message": "All default plugins initialized",
"reason": "PluginInitializeSuccessfully",
"status": "True",
"type": "PluginInitialized"
}
],
"emqxNodes": [
{
"node": "emqx@emqx-0.emqx-headless.emqx.svc.cluster.local",
"node_status": "Running",
"otp_release": "24.1.5/12.1.5",
"version": "4.4.6"
}
],
"readyReplicas": 1,
"replicas": 3
}
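For anyone reading the status above, the gap between desired and ready replicas can be pulled out mechanically. This is only a minimal Python sketch: the function name `summarize_status` is hypothetical, the sample is a trimmed copy of the status shown above, and it assumes the `.status` object keeps the field names seen here.

```python
import json

# Trimmed copy of the status printed above (timestamps and messages omitted).
SAMPLE = """
{
  "conditions": [
    {"reason": "ClusterNotReady", "status": "False", "type": "Running"},
    {"reason": "PluginInitializeSuccessfully", "status": "True", "type": "PluginInitialized"}
  ],
  "emqxNodes": [
    {"node": "emqx@emqx-0.emqx-headless.emqx.svc.cluster.local", "node_status": "Running"}
  ],
  "readyReplicas": 1,
  "replicas": 3
}
"""


def summarize_status(status):
    """Summarize an EmqxBroker .status dict (field names assumed from the output above)."""
    return {
        # How many replicas report ready versus how many were requested.
        "ready": status.get("readyReplicas", 0),
        "wanted": status.get("replicas", 0),
        # Nodes the operator currently lists as running.
        "running_nodes": [
            n["node"] for n in status.get("emqxNodes", [])
            if n.get("node_status") == "Running"
        ],
        # Condition types whose status is anything other than "True".
        "failing_conditions": [
            c["type"] for c in status.get("conditions", [])
            if c.get("status") != "True"
        ],
    }


summary = summarize_status(json.loads(SAMPLE))
print(summary)
```

With the status above, only one of the three requested replicas is ready, and the `Running` condition is the one that fails.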
Logs of the operator (kubectl logs -f -l "control-plane=controller-manager" -n emqx -c manager --tail=100):
E0810 07:03:59.570352 1 portforward.go:234] lost connection to pod E0810 07:03:59.838997 1 portforward.go:406] an error occurred forwarding 38417 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:00.358323 1 portforward.go:406] an error occurred forwarding 43423 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:00.358653 1 portforward.go:234] lost connection to pod 1.660115040377564e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "d0295269-0001-4c19-ae4d-2be9e74a7321", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:00.653062 1 portforward.go:406] an error occurred forwarding 46363 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:00.656235 1 portforward.go:234] lost connection to pod 1.6601150413087435e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "aed5651b-c774-4496-8e04-41ec215aeb76", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:01.933473 1 portforward.go:406] an error occurred forwarding 36141 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused 
IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:01.933924 1 portforward.go:234] lost connection to pod 1.6601150419643033e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "7207ec58-c48f-4f47-bb51-d156051a2e78", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:02.226753 1 portforward.go:406] an error occurred forwarding 36151 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:02.227081 1 portforward.go:234] lost connection to pod E0810 07:04:02.639584 1 portforward.go:406] an error occurred forwarding 39885 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: 
connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:02.639792 1 portforward.go:234] lost connection to pod 1.6601150426838543e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "4ef8f472-14f7-4da7-849b-5220115b9dbc", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:02.948658 1 portforward.go:406] an error occurred forwarding 40737 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:02.948869 1 portforward.go:234] lost connection to pod E0810 07:04:03.425248 1 portforward.go:406] an error occurred forwarding 34645 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", 
IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:03.425551 1 portforward.go:234] lost connection to pod 1.660115043447316e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "7d86ef9a-e0f3-465b-b1e0-32123f7d2377", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:03.772788 1 portforward.go:406] an error occurred forwarding 37549 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:03.773081 1 portforward.go:234] lost connection to pod 1.6601150441139083e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "300bc970-0ebd-4d1a-a101-6b073ec449e0", "error": "failed to update StatefulSet emqx: Operation cannot be fulfilled on statefulsets.apps \"emqx\": the 
object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:04.424648 1 portforward.go:406] an error occurred forwarding 32813 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:04.425127 1 portforward.go:234] lost connection to pod E0810 07:04:04.909241 1 portforward.go:406] an error occurred forwarding 41449 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:04.909421 1 portforward.go:234] lost connection to pod 1.660115044925919e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "e6941001-14b7-4462-8b90-80e6ab8feac4", "error": "Operation cannot be fulfilled on 
emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:05.216967 1 portforward.go:406] an error occurred forwarding 34665 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:05.725675 1 portforward.go:406] an error occurred forwarding 41193 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:05.726157 1 portforward.go:234] lost connection to pod 1.6601150457562108e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "aaa3c815-7409-4e44-b752-09cff2b0531e", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the 
object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:06.034595 1 portforward.go:406] an error occurred forwarding 35435 -> 8081: error forwarding port 8081 to pod 862a1a59ff6fdc75b1c8a7520a2ed57d2720c341f7015556443b4063771ccdd4, uid : failed to execute portforward in network namespace "/var/run/netns/cni-4d298c69-db9b-1c7e-ebac-314710d61826": failed to connect to localhost:8081 inside namespace "862a1a59ff6fdc75b1c8a7520a2ed57d2720c341f7015556443b4063771ccdd4", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:06.034845 1 portforward.go:234] lost connection to pod E0810 07:04:06.440777 1 portforward.go:406] an error occurred forwarding 33389 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:06.441162 1 portforward.go:234] lost connection to pod E0810 07:04:06.735334 1 portforward.go:406] an error occurred forwarding 35219 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to 
localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:06.735726 1 portforward.go:234] lost connection to pod E0810 07:04:07.043941 1 portforward.go:406] an error occurred forwarding 34727 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:07.044309 1 portforward.go:234] lost connection to pod E0810 07:04:07.450824 1 portforward.go:406] an error occurred forwarding 34031 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:07.451169 1 portforward.go:234] lost connection to pod E0810 07:04:07.790990 1 portforward.go:406] an error occurred forwarding 43441 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 
[::1]:8081: connect: connection refused E0810 07:04:07.791210 1 portforward.go:234] lost connection to pod E0810 07:04:08.188870 1 portforward.go:406] an error occurred forwarding 32839 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:08.189418 1 portforward.go:234] lost connection to pod 1.6601150482063682e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "bbf92a96-d65e-4792-a130-bc0d0f594557", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 E0810 07:04:08.443142 1 portforward.go:406] an error occurred forwarding 35395 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: 
connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:08.907719 1 portforward.go:406] an error occurred forwarding 33055 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:08.907842 1 portforward.go:234] lost connection to pod E0810 07:04:09.192174 1 portforward.go:406] an error occurred forwarding 34689 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused E0810 07:04:09.192605 1 portforward.go:234] lost connection to pod E0810 07:04:09.616450 1 portforward.go:406] an error occurred forwarding 42301 -> 8081: error forwarding port 8081 to pod 0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435, uid : failed to execute portforward in network namespace "/var/run/netns/cni-29540245-853e-246f-ed7f-0443a91d3642": failed to connect to localhost:8081 inside namespace "0ea370e97c58078f78a2ac82dd6cac94e03da6368ac639ad25df3f72e04af435", IPv4: dial tcp4 127.0.0.1:8081: connect: connection refused IPv6 dial tcp6 [::1]:8081: connect: connection refused 1.6601150500930722e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": 
{"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "371cf80b-1e05-4126-ab31-639b04c5d478", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234 1.6601150507833595e+09 ERROR Reconciler error {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "emqxBroker": {"name":"emqx","namespace":"emqx"}, "namespace": "emqx", "name": "emqx", "reconcileID": "2c813143-c2f5-4398-b6b7-bf3e92bd4350", "error": "Operation cannot be fulfilled on emqxbrokers.apps.emqx.io \"emqx\": the object has been modified; please apply your changes to the latest version and try again"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234
Logs of the first node:
kubectl -n emqx logs emqx-0 -c emqx
hostname: emqx-0: Host not found
Starting emqx on node emqx@emqx-0.emqx-headless.emqx.svc.cluster.local
Start mqtt:tcp:internal listener on 127.0.0.1:11883 successfully.
Start mqtt:tcp:external listener on 0.0.0.0:1883 successfully.
Start mqtt:ws:external listener on 0.0.0.0:8083 successfully.
Start mqtt:ssl:external listener on 0.0.0.0:8883 successfully.
Start mqtt:wss:external listener on 0.0.0.0:8084 successfully.
Start http:management listener on 8081 successfully.
2022-08-10T07:04:09.807451+00:00 [warning] [Dashboard] Using default password for dashboard 'admin' user. Please use './bin/emqx_ctl admins' command to change it. NOTE: the default password in config file is only used to initialise the database record, changing the config file after database is initialised has no effect.
Start http:dashboard listener on 18083 successfully.
EMQ X Broker 4.4.6 is running now!
2022-08-10T07:04:11.816562+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-1.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:11.816736+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:17.569069+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-1.emqx-headless.emqx.svc.cluster.local','emqx@emqx-2.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:17.569265+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:25.334693+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-1.emqx-headless.emqx.svc.cluster.local','emqx@emqx-2.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:25.334864+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:32.698077+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-1.emqx-headless.emqx.svc.cluster.local','emqx@emqx-2.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:32.698264+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:38.495689+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-1.emqx-headless.emqx.svc.cluster.local','emqx@emqx-2.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:38.495870+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
The logs just stay the same after that.
Is this the first deployment? Have you deployed emqx before and deleted it?
This is the first deployment of that broker.
But yes, I tried to deploy other ones before.
Could you please show the logs for emqx-1 and emqx-2?
Sure thing.
kubectl -n exo-emqx logs emqx-1 -c emqx
hostname: emqx-1: Host not found
Starting emqx on node emqx@emqx-1.emqx-headless.emqx.svc.cluster.local
Start mqtt:tcp:internal listener on 127.0.0.1:11883 successfully.
Start mqtt:tcp:external listener on 0.0.0.0:1883 successfully.
Start mqtt:ws:external listener on 0.0.0.0:8083 successfully.
Start mqtt:ssl:external listener on 0.0.0.0:8883 successfully.
Start mqtt:wss:external listener on 0.0.0.0:8084 successfully.
Start http:management listener on 8081 successfully.
2022-08-10T07:04:11.429818+00:00 [warning] [Dashboard] Using default password for dashboard 'admin' user. Please use './bin/emqx_ctl admins' command to change it. NOTE: the default password in config file is only used to initialise the database record, changing the config file after database is initialised has no effect.
Start http:dashboard listener on 18083 successfully.
EMQ X Broker 4.4.6 is running now!
2022-08-10T07:04:12.467104+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:12.467272+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:17.639297+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local','emqx@emqx-2.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:17.639472+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:24.940386+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local','emqx@emqx-2.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:24.940561+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:30.877727+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local','emqx@emqx-2.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:30.877912+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:38.386440+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local','emqx@emqx-2.emqx-headless.emqx.svc.cluster.local']
kubectl -n emqx logs emqx-2 -c emqx
hostname: emqx-2: Host not found
Starting emqx on node emqx@emqx-2.emqx-headless.emqx.svc.cluster.local
Start mqtt:tcp:internal listener on 127.0.0.1:11883 successfully.
Start mqtt:tcp:external listener on 0.0.0.0:1883 successfully.
Start mqtt:ws:external listener on 0.0.0.0:8083 successfully.
Start mqtt:ssl:external listener on 0.0.0.0:8883 successfully.
Start mqtt:wss:external listener on 0.0.0.0:8084 successfully.
Start http:management listener on 8081 successfully.
2022-08-10T07:04:21.079133+00:00 [warning] [Dashboard] Using default password for dashboard 'admin' user. Please use './bin/emqx_ctl admins' command to change it. NOTE: the default password in config file is only used to initialise the database record, changing the config file after database is initialised has no effect.
Start http:dashboard listener on 18083 successfully.
EMQ X Broker 4.4.6 is running now!
2022-08-10T07:04:24.909316+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local','emqx@emqx-1.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:24.909510+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:32.263225+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local','emqx@emqx-1.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:32.263384+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:37.785043+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local','emqx@emqx-1.emqx-headless.emqx.svc.cluster.local']
2022-08-10T07:04:37.785206+00:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-10T07:04:43.694825+00:00 [warning] Ekka(AutoCluster): discovered nodes outside cluster: ['emqx@emqx-0.emqx-headless.emqx.svc.cluster.local','emqx@emqx-1.emqx-headless.emqx.svc.cluster.local']
Again, the logs just stay the same.
Hi @Furragen, EMQX Operator 1.2.5 has been released. Please try again and let me know whether it works.
@Furragen Sounds frustrating. Is the EMQX pod log still the same?
Hi @Furragen, could you please check the pod network? Run the following command in an EMQX pod:
nslookup -type=srv $(headless service name).$(namespace).svc.cluster.local
You should get output like this:
emqx-headless.default.svc.cluster.local service = 0 33 8081 emqx-0.emqx-headless.default.svc.cluster.local
emqx-headless.default.svc.cluster.local service = 0 33 8081 emqx-1.emqx-headless.default.svc.cluster.local
emqx-headless.default.svc.cluster.local service = 0 33 8081 emqx-2.emqx-headless.default.svc.cluster.local
Then check network connectivity:
nc -zv emqx-2.emqx-headless.default.svc.cluster.local 8081
A successful check produces output like this:
emqx-2.emqx-headless.default.svc.cluster.local (172.17.0.8:8081) open
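The connectivity check above can be repeated for every peer in one pass. The following is a minimal sketch, not part of EMQX or the operator: `peer_hosts` and `check_peers` are hypothetical helper names, the `cluster.local` domain and port 8081 are taken from the example output, and `nc` is assumed to support `-z` and `-w` in the container image.

```shell
#!/bin/sh
# Print the stable DNS name of each pod in a StatefulSet.
# Arguments: StatefulSet name, headless service name, namespace, replica count.
peer_hosts() {
  sts=$1
  svc=$2
  ns=$3
  replicas=$4
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%s.%s.%s.svc.cluster.local\n' "$sts" "$i" "$svc" "$ns"
    i=$((i + 1))
  done
}

# Try a TCP connect to each host on the given port and report the result.
# Arguments: port, then one or more host names.
check_peers() {
  port=$1
  shift
  for host in "$@"; do
    if nc -z -w 3 "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port unreachable"
    fi
  done
}

# Example (run inside an EMQX pod):
#   check_peers 8081 $(peer_hosts emqx emqx-headless default 3)
```

Any host reported as unreachable points at a network or listener problem between the pods, not at EMQX's discovery logic.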
So the lookup worked fine. My cluster uses IPv6 btw. Could that be a problem?
Network ping did not work.
I think that is the reason.
Could you please check whether pinging another EMQX pod by its IP from inside an EMQX pod works?
In a StatefulSet, each pod should have a stable network ID: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-network-id. EMQX nodes use this network ID to discover each other, so if the network does not work, the EMQX cluster will fail to form.
Because this is a Kubernetes feature, you may need to check AWS EKS.
The direct way via the IP of the pod also did not work.
And I think I know why:
EMQX only listens on IPv4.
netstat -tulpen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:11883 0.0.0.0:* LISTEN 1/emqx
tcp 0 0 0.0.0.0:8081 0.0.0.0:* LISTEN 1/emqx
tcp 0 0 0.0.0.0:4370 0.0.0.0:* LISTEN 1/emqx
tcp 0 0 0.0.0.0:8883 0.0.0.0:* LISTEN 1/emqx
tcp 0 0 0.0.0.0:8083 0.0.0.0:* LISTEN 1/emqx
tcp 0 0 0.0.0.0:8084 0.0.0.0:* LISTEN 1/emqx
tcp 0 0 0.0.0.0:5369 0.0.0.0:* LISTEN 1/emqx
tcp 0 0 0.0.0.0:1883 0.0.0.0:* LISTEN 1/emqx
tcp 0 0 0.0.0.0:18083 0.0.0.0:* LISTEN 1/emqx
This was from inside the emqx-0 pod.
Like I said, the cluster uses IPv6, so this cannot work.
Is there any way to make EMQX listen to IPv6?
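To spot IPv4-only listeners at a glance, the netstat output can be filtered by address family. This is a minimal sketch that assumes the column layout shown above (protocol in the first column, local address in the fourth, state in the sixth); `classify_listeners` is a hypothetical helper name.

```shell
#!/bin/sh
# Read netstat-style output on stdin and print "PORT FAMILY ADDRESS"
# for every listening TCP socket.
classify_listeners() {
  awk '
    $1 ~ /^tcp/ && $6 == "LISTEN" {
      addr = $4
      port = addr
      sub(/.*:/, "", port)                  # keep what follows the last colon
      host = substr(addr, 1, length(addr) - length(port) - 1)
      # An address that still contains a colon is IPv6 (for example "::").
      if (host ~ /:/ || $1 == "tcp6") family = "ipv6"
      else family = "ipv4"
      print port, family, host
    }'
}

# Example (run inside the pod):
#   netstat -tln | classify_listeners
```

For the output pasted above, every line would be classified as `ipv4`. A listener bound to the IPv6 wildcard shows a local address such as `:::1883` and would be classified as `ipv6`.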
@Furragen You can deploy EMQX like this:
apiVersion: apps.emqx.io/v1beta3
kind: EmqxBroker
metadata:
  name: emqx
spec:
  emqxTemplate:
    image: emqx/emqx:4.4.6
    config:
      listener.tcp.external: :::1883
      management.listener.http: :::8081
      dashboard.listener.http: :::18083
Sorry, I don't have an IPv6 cluster, so I need you to try this.
Absolutely no problem.
I redeployed the broker and we got a little further.
The logs and the error stay the same:
emqx_ctl cluster_status
Node 'emqx@emqx-0.emqx-headless.exo-emqx.svc.cluster.local' not responding to pings.
/opt/emqx/bin/emqx: line 46: die: command not found
But doing the ping by hand with nc now succeeds.
So the connection works, but something is still broken.
Could there be more listeners that I need to switch to v6?
Cool, you can change all the listeners you care about to the IPv6 format, see https://www.emqx.io/docs/en/v4.4/configuration/configuration.html#listener-tcp-external
Could you please run the following command in an EMQX pod:
emqx eval "net_adm:ping('emqx@emqx-0.emqx-headless.default.svc.cluster.local')."
Here emqx@emqx-0.emqx-headless.default.svc.cluster.local is the name of another EMQX node.
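To run the same check against several nodes, the command can be wrapped in a loop. This is a minimal sketch: `ping_expr` and `ping_nodes` are hypothetical helper names, and the node names in the usage comment are examples. `net_adm:ping/1` returns `pong` when the target node answers and `pang` when it does not.

```shell
#!/bin/sh
# Build the Erlang expression that pings one node.
ping_expr() {
  printf "net_adm:ping('%s')." "$1"
}

# Evaluate the ping expression for each node name through the local EMQX node.
ping_nodes() {
  for node in "$@"; do
    printf '%s: ' "$node"
    emqx eval "$(ping_expr "$node")"
  done
}

# Example (run inside an EMQX pod):
#   ping_nodes \
#     'emqx@emqx-0.emqx-headless.default.svc.cluster.local' \
#     'emqx@emqx-1.emqx-headless.default.svc.cluster.local'
```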
So, I tried this and the command you mentioned did not succeed.
The error is:
Node 'emqx@emqx-0.emqx-headless.emqx.svc.cluster.local' not responding to pings.
/usr/local/bin/emqx: line 46: die: command not found
This error always appears when running the emqx command.
Also, I have tested around with setting listeners to IPv6:
apiVersion: apps.emqx.io/v1beta3
kind: EmqxBroker
metadata:
  name: emqx
  labels:
    app: emqx
    environment: dev
spec:
  persistent:
    accessModes:
      - ReadWriteOnce
    storageClassName: ebs-gp3
    resources:
      requests:
        storage: 1Gi
  emqxTemplate:
    image: emqx/emqx:4.4.6
    config:
      listener.tcp.external: :::1883
      listener.ssl.external: :::8883
      management.listener.http: :::8081
      dashboard.listener.http: :::18083
      listener.tcp.internal: :::11883
      listener.ws.external: :::8083
      listener.wss.external: :::8084
The pods start, but the dashboard-plugin seems to be unhappy:
2022-08-11T09:45:39.371399+00:00 [alert] [Plugins] Plugin emqx_dashboard load failed with {function_clause,[{emqx_plugins,apply_configs,[{error,transform_datatypes,{errorlist,[{error,{transform_type,"dashboard.listener.http"}},{error,{conversion,{":::18083",integer}}}]}}],[{file,"emqx_plugins.erl"},{line,302}]},{emqx_plugins,load_plugin,2,[{file,"emqx_plugins.erl"},{line,325}]},{lists,foreach,2,[{file,"lists.erl"},{line,1342}]},{emqx_app,start,2,[{file,"emqx_app.erl"},{line,50}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}
Looks like it cannot convert the v6-notation.
On top of that I found three other settings that would need tuning I think.
The first one is cluster.proto_dist.
The docs mention that I could set it to inet6_tcp to use IPv6. But when I do that, the pods do not start anymore.
And then there are cluster.mcast.iface and rpc.tcp_server_ip. These two settings do not seem to support IPv6 according to the docs. Is that correct?
The listeners I just mentioned and the ones in my manifest seem to be the ones EMQX starts by default, so I did not look further.
Do you know of anyone using EMQX with IPv6?
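The `{conversion,{":::18083",integer}}` term in the alert quoted above suggests that `dashboard.listener.http` is parsed as a bare integer port, so an address-prefixed value is rejected. Below is a minimal sketch of a check that could be run on a value before deploying. `is_integer_port` is a hypothetical helper, and the assumption that this key accepts only a port number is inferred from that error message, not from the EMQX schema.

```shell
#!/bin/sh
# Succeed only if the argument is a plain decimal TCP port (1-65535).
is_integer_port() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit such as ':'
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Example:
#   is_integer_port ':::18083' || echo 'not a bare port'
```

A value such as `18083` passes the check, while `:::18083` fails it, which matches the conversion error in the plugin log.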
Node 'emqx@emqx-0.emqx-headless.emqx.svc.cluster.local' not responding to pings.
/usr/local/bin/emqx: line 46: die: command not found
means the peer node that we are pinging is unreachable.
I ran the command from the emqx-0 pod, trying to query emqx-1.
Does that not mean emqx-0 has a problem?
It's likely that EMQX's distribution and RPC libraries do not support IPv6 that well.
We'll investigate it.
Good to know, thank you.