portworx/px-dev

kubernetes pvc fails with hostname empty

Closed this issue · 5 comments

I am following this guide to deploy portworx. I was able to run the portworx pod just fine, and the status looks like:

root@kubeguest04:/# /opt/pwx/bin/pxctl status
Status: PX is operational
Node ID: a6247594-f908-4789-8998-9657682b027d
	IP: 10.129.37.35
 	Local Storage Pool: 1 pool
	POOL	IO_PRIORITY	RAID_LEVEL	USABLE	USED	STATUS	ZONE	REGION
	0	HIGH		raid0		441 GiB	2.0 GiB	Online	default	default
	Local Storage Devices: 1 device
	Device	Path			Media Type		Size		Last-Scan
	0:1	/dev/mapper/cl-home	STORAGE_MEDIUM_MAGNETIC	441 GiB		20 Jun 17 01:04 UTC
	total				-			441 GiB
Cluster Summary
	Cluster ID: portworx-storage-0
	IP		ID					Used	Capacity	Status
	10.129.4.35	a6247594-f908-4789-8998-9657682b027d	0 B	441 GiB		Online (This node)
Global Storage Pool
	Total Used    	:  0 B
	Total Capacity	:  441 GiB
[root@kubeguest04 ~]# kubectl exec -it portworx-storage-x31h9  -- /opt/pwx/bin/pxctl cluster alerts
AlertID	ClusterID		Timestamp			Severity	AlertType		Description
0	portworx-storage-0	Jun 20 01:04:23 UTC 2017	NOTIFY		Node start success	Node a6247594-f908-4789-8998-9657682b027d with Index (0) is Up
1	portworx-storage-0	Jun 20 01:04:38 UTC 2017	NOTIFY		Node start success	Node a6247594-f908-4789-8998-9657682b027d joining the cluster with index (0)
2	portworx-storage-0	Jun 20 01:04:43 UTC 2017	NOTIFY		Node start success	PX is ready on Node: a6247594-f908-4789-8998-9657682b027d. CLI accessible at /opt/pwx/bin/pxctl.
[root@kubeguest04 ~]# kubectl create -f pvc.yml
persistentvolumeclaim "minio-persistent-storage" created

Now, when I try to create a PVC, it fails with:

[root@kubeguest04 ~]# kubectl describe pvc minio-persistent-storage
Name:		minio-persistent-storage
Namespace:	default
StorageClass:	portworx
Status:		Pending
Volume:
Labels:		<none>
Annotations:	volume.beta.kubernetes.io/storage-class=portworx
		volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/portworx-volume
Capacity:
Access Modes:
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  22s		6s		3	persistentvolume-controller			Warning		ProvisioningFailed	Failed to provision volume with StorageClass "portworx": Post http://:9001/v1/osd-volumes: dial tcp :9001: getsockopt: connection refused

Why is it connecting to http://:9001 and not including the hostname in there?

However, when I hit the host IP directly, I am able to connect just fine:

[root@kubeguest04 ~]# curl http://10.129.4.35:9001/v1/osd-volumes
[]
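
For context, a StorageClass/PVC pair along these lines reproduces the setup; the storage class name, annotation, and provisioner match the describe output above, while the requested size and the repl parameter are only illustrative.

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: portworx
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "2"              # number of replicas per volume (illustrative)
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: minio-persistent-storage
  annotations:
    volume.beta.kubernetes.io/storage-class: portworx
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi      # illustrative size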

Hello @prat0318

While the logs say http://:9001, the underlying library is invoking calls to http://localhost:9001. The empty hostname can be misleading.

Now, coming to the actual error message: Portworx needs to be running on the Kubernetes master node. The Kubernetes control plane invokes the Portworx API server at http://localhost:9001 on the master node. (This will change with the k8s 1.6.5 release.)

Can you tell me more about your setup?

  1. How did you deploy Kubernetes?
  2. Can I get the output of the following commands?
    -- kubectl get nodes
    -- kubectl get pods -l name=portworx -n kube-system
    -- kubectl logs <pod-name> -n kube-system, where <pod-name> is the name of the Portworx pod running on the master node

Portworx can talk to etcd over https as well. You will need to provide the URL in the following format when starting the Portworx container:

etcd:https://<ip>:<port>

You can find more info here
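
As a rough sketch, the relevant portion of the Portworx DaemonSet spec would then look like the fragment below. The -k (kvdb) and -c (cluster ID) flags come from the standard install spec; the cert-related flags and file paths are assumptions that you should verify against the docs for the PX version you are running.

containers:
- name: portworx
  image: portworx/px-dev
  args:
    - "-k"
    - "etcd:https://<ip>:<port>"   # secure etcd endpoint, format as above
    - "-c"
    - "portworx-storage-0"         # your cluster ID
    - "-ca"
    - "/etc/pwx/etcd-ca.crt"       # assumed flag/path: CA cert for etcd
    - "-cert"
    - "/etc/pwx/etcd.crt"          # assumed flag/path: client cert
    - "-key"
    - "/etc/pwx/etcd.key"          # assumed flag/path: client key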

Note that if you have an already running PX cluster, I would recommend starting afresh with a new clusterID when switching to the secure etcd.

Regarding k8s master nodes: yes, you will need to run Portworx on all master replicas. We have removed this dependency in the latest Kubernetes release, 1.6.6.
With 1.6.6 you will need to run PX only on the worker nodes.

Let us know if you need more help.

These two things worked for me:

  1. Running portworx on the master nodes
  2. Using https with ca cert files for k8s etcd

Also, I realized that if I want to change the cluster ID or some other config of a Portworx instance running on a node, I should rm -rf /opt/pdx on that node, else the pod will pick up the old config again, ignoring the one set in the DaemonSet.

One thing still confuses me. The PVC for one of the pods was attached to a volume whose replicas were on two nodes (two because the repl count is set to 2) other than the node running the pod. The volume list shows an asterisk saying the same thing. But I couldn't find a reason why such scheduling could happen. Shouldn't the local volume almost always be given preference?

@prat0318 glad you were able to make progress!

Also, I realized that if I want to change the cluster ID or some other config of a Portworx instance running on a node, I should rm -rf /opt/pdx on that node, else the pod will pick up the old config again, ignoring the one set in the DaemonSet.

Once a cluster is initialized, you cannot change its name, since the name is used as a key for cluster membership. I assume you were referring to removing /etc/pwx. You must not remove this directory, since it contains the file /etc/pwx/.private.json, which is important for the node's cluster membership. To change settings on a node, you have to do so in Maintenance mode. Please read: https://docs.portworx.com/maintain/maintenance-mode.html for more details.

One thing still confuses me. The PVC for one of the pods was attached to a volume whose replicas were on two nodes (two because the repl count is set to 2) other than the node running the pod. The volume list shows an asterisk saying the same thing. But I couldn't find a reason why such scheduling could happen. Shouldn't the local volume almost always be given preference?

By default, Kubernetes has no knowledge of the locality of volume data. To give it that knowledge, you have to use node labels with node affinity in the pod spec, for example:

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: "pvc-high-01"
            operator: In
            values:
              - "true"

When a Portworx volume is created in a Kubernetes cluster, we add node labels in the format pvc_name=true to the nodes where the data for the volume resides. Adding node affinity like the above ensures your pods run on the nodes where these labels exist.
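
For completeness, here is a sketch of where that block sits in a full pod spec; the pod, container, and image names are placeholders, and the label key follows the pvc_name=true convention described above.

apiVersion: v1
kind: Pod
metadata:
  name: mypod                      # placeholder pod name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: "pvc-high-01"     # the node label Portworx added for this volume
            operator: In
            values:
            - "true"
  containers:
  - name: app                      # placeholder container
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-high-01       # placeholder claim name matching the label above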

Please read: https://docs.portworx.com/scheduler/kubernetes/scheduler-convergence.html for more details.

Feel free to ask any further questions.