kubernetes pvc fails with hostname empty
I am following this guide to deploy Portworx. I was able to run the Portworx pod just fine, and the status looks like:
root@kubeguest04:/# /opt/pwx/bin/pxctl status
Status: PX is operational
Node ID: a6247594-f908-4789-8998-9657682b027d
IP: 10.129.37.35
Local Storage Pool: 1 pool
POOL IO_PRIORITY RAID_LEVEL USABLE USED STATUS ZONE REGION
0 HIGH raid0 441 GiB 2.0 GiB Online default default
Local Storage Devices: 1 device
Device Path Media Type Size Last-Scan
0:1 /dev/mapper/cl-home STORAGE_MEDIUM_MAGNETIC 441 GiB 20 Jun 17 01:04 UTC
total - 441 GiB
Cluster Summary
Cluster ID: portworx-storage-0
IP ID Used Capacity Status
10.129.4.35 a6247594-f908-4789-8998-9657682b027d 0 B 441 GiB Online (This node)
Global Storage Pool
Total Used : 0 B
Total Capacity : 441 GiB
[root@kubeguest04 ~]# kubectl exec -it portworx-storage-x31h9 -- /opt/pwx/bin/pxctl cluster alerts
AlertID ClusterID Timestamp Severity AlertType Description
0 portworx-storage-0 Jun 20 01:04:23 UTC 2017 NOTIFY Node start success Node a6247594-f908-4789-8998-9657682b027d with Index (0) is Up
1 portworx-storage-0 Jun 20 01:04:38 UTC 2017 NOTIFY Node start success Node a6247594-f908-4789-8998-9657682b027d joining the cluster with index (0)
2 portworx-storage-0 Jun 20 01:04:43 UTC 2017 NOTIFY Node start success PX is ready on Node: a6247594-f908-4789-8998-9657682b027d. CLI accessible at /opt/pwx/bin/pxctl.
[root@kubeguest04 ~]# kubectl create -f pvc.yml
persistentvolumeclaim "minio-persistent-storage" created
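For reference, pvc.yml was along these lines. The claim name, storage class annotation, and provisioner match the describe output below; the requested size, access mode, and repl parameter are placeholders and may differ from the actual file:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: portworx
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "2"          # replication factor (placeholder)
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: minio-persistent-storage
  annotations:
    volume.beta.kubernetes.io/storage-class: portworx
spec:
  accessModes:
    - ReadWriteOnce   # placeholder
  resources:
    requests:
      storage: 10Gi   # placeholder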
However, the claim stays Pending and provisioning fails:
[root@kubeguest04 ~]# kubectl describe pvc minio-persistent-storage
Name: minio-persistent-storage
Namespace: default
StorageClass: portworx
Status: Pending
Volume:
Labels: <none>
Annotations: volume.beta.kubernetes.io/storage-class=portworx
volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/portworx-volume
Capacity:
Access Modes:
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
22s 6s 3 persistentvolume-controller Warning ProvisioningFailed Failed to provision volume with StorageClass "portworx": Post http://:9001/v1/osd-volumes: dial tcp :9001: getsockopt: connection refused
Why is it connecting to http://:9001 and not including the hostname?
However, when I try the host's IP directly, I can connect just fine:
[root@kubeguest04 ~]# curl http://10.129.4.35:9001/v1/osd-volumes
[]
Hello @prat0318
While the logs say http://:9001, the underlying library is invoking calls to http://localhost:9001. The empty hostname can be misleading.
Now, coming to the actual error message: Portworx needs to be running on the Kubernetes master node. The Kubernetes control plane on the master invokes the Portworx API server at http://localhost:9001 on that node. (This will change with the k8s 1.6.5 release.)
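You can confirm this by running the same curl on the master node itself; if PX is not running there you will see the same connection refused error that the provisioner reports:

# On the Kubernetes master node
curl http://localhost:9001/v1/osd-volumes   # "[]" (or a JSON list) if PX is up; "connection refused" otherwise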
Can I learn more about your setup?
- How did you deploy Kubernetes?
- Can I get the output of the following commands?
  - kubectl get nodes
  - kubectl get pods -l name=portworx -n kube-system
  - kubectl logs <pod-name> -n kube-system

where <pod-name> is the name of the Portworx pod running on the master node.
Portworx can talk to etcd over https as well. You will need to provide the URL in the following format while starting the Portworx container:
etcd:https://<ip>:<port>
You can find more info here
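As a rough sketch, the kvdb URL goes into the Portworx container args in the DaemonSet. The -c, -k, and -s flags are the usual cluster-ID/kvdb/storage-device options; the -ca, -cert, and -key flags and the paths shown are assumptions for a TLS-protected etcd, so verify them against the docs for your PX version:

containers:
  - name: portworx
    image: portworx/px-enterprise:latest   # use the tag from your existing spec
    args:
      - "-c"
      - "portworx-storage-0"            # cluster ID (from the status output above)
      - "-k"
      - "etcd:https://<ip>:<port>"      # secure etcd endpoint
      - "-ca"
      - "/etc/pwx/etcd-ca.crt"          # assumed path to the etcd CA cert
      - "-cert"
      - "/etc/pwx/etcd.crt"             # assumed path to the etcd client cert
      - "-key"
      - "/etc/pwx/etcd.key"             # assumed path to the etcd client key
      - "-s"
      - "/dev/mapper/cl-home"           # storage device (from pxctl status above)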
Note that if you have an already running PX cluster, I would recommend starting afresh with a new clusterID when switching to the secure etcd.
Regarding k8s master nodes: yes, you will need to run Portworx on all master replicas. We have removed this dependency in the latest Kubernetes release, 1.6.6. With 1.6.6 you will need to run PX only on worker nodes.
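If, on 1.6.6, you want the DaemonSet to skip the masters entirely, one way (assuming your masters carry the standard node-role.kubernetes.io/master label) is node affinity in the DaemonSet's pod template, roughly like:

spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: DoesNotExist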
Let us know if you need more help.
These two things worked for me:
- Running portworx on the master nodes
- Using https with ca cert files for k8s etcd
Also, I realized that if I want to change the cluster ID or some other config of a Portworx running on a node, I should rm -rf /opt/pdx on that node, else the pod will pick up the old config again, ignoring the one set in the DaemonSet.
One thing still confuses me. The PVC for one of the pods was attached to a volume whose replicas were on 2 nodes (2 because the repl count is set to 2) other than the node running the pod. In the volume list I get an asterisk saying the same thing. But I couldn't find a reason why such scheduling could happen. Shouldn't the local volume almost always be given preference?
@prat0318 glad you were able to make progress!
> Also, I realized that if I want to change the cluster ID or some other config of a Portworx running on a node, I should rm -rf /opt/pdx on that node, else the pod will pick up the old config again, ignoring the one set in the DaemonSet.
Once a cluster is initialized, you cannot change its name, since the name is used as a key for cluster membership. I assume you were referring to removing /etc/pwx. You must not remove this directory, since it contains the file /etc/pwx/.private.json, which is important for the node's cluster membership. To change settings on a node, you have to do so in maintenance mode. Please read: https://docs.portworx.com/maintain/maintenance-mode.html for more details.
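For reference, entering and leaving maintenance mode on a node looks roughly like this (see the maintenance-mode doc above for the full procedure):

# Run on the node whose configuration you want to change
/opt/pwx/bin/pxctl service maintenance --enter   # take the node into maintenance mode
# ...adjust drives/pools/settings here...
/opt/pwx/bin/pxctl service maintenance --exit    # bring the node back online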
> One thing still confuses me. The PVC for one of the pods was attached to a volume whose replicas were on 2 nodes (2 because the repl count is set to 2) other than the node running the pod. In the volume list I get an asterisk saying the same thing. But I couldn't find a reason why such scheduling could happen. Shouldn't the local volume almost always be given preference?
By default, Kubernetes has no knowledge of the locality of volume data. To take locality into account, you have to use label-based node affinity in the pod spec. For example:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: "pvc-high-01"
          operator: In
          values:
          - "true"
When a Portworx volume is created in a Kubernetes cluster, we add node labels in the format pvc_name=true to the nodes where the data for the volume resides. Adding node affinity like the above ensures your pods run on the nodes where these labels exist.
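You can check which nodes carry the label for a given volume with something like the following (using the label key from the example above):

# Show the pvc-high-01 label value as an extra column for each node
kubectl get nodes -L pvc-high-01

# Or list all node labels and grep for the volume name
kubectl get nodes --show-labels | grep pvc-high-01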
Please read: https://docs.portworx.com/scheduler/kubernetes/scheduler-convergence.html for more details.
Feel free to ask any further questions.