portworx/px-dev

kubernetes pvc fails with hostname empty

Closed this issue · 5 comments

I am following this guide to deploy portworx. I was able to run the portworx pod just fine, and the status looks like:

root@kubeguest04:/# /opt/pwx/bin/pxctl status
Status: PX is operational
Node ID: a6247594-f908-4789-8998-9657682b027d
	IP: 10.129.37.35
 	Local Storage Pool: 1 pool
	POOL	IO_PRIORITY	RAID_LEVEL	USABLE	USED	STATUS	ZONE	REGION
	0	HIGH		raid0		441 GiB	2.0 GiB	Online	default	default
	Local Storage Devices: 1 device
	Device	Path			Media Type		Size		Last-Scan
	0:1	/dev/mapper/cl-home	STORAGE_MEDIUM_MAGNETIC	441 GiB		20 Jun 17 01:04 UTC
	total				-			441 GiB
Cluster Summary
	Cluster ID: portworx-storage-0
	IP		ID					Used	Capacity	Status
	10.129.4.35	a6247594-f908-4789-8998-9657682b027d	0 B	441 GiB		Online (This node)
Global Storage Pool
	Total Used    	:  0 B
	Total Capacity	:  441 GiB
[root@kubeguest04 ~]# kubectl exec -it portworx-storage-x31h9  -- /opt/pwx/bin/pxctl cluster alerts
AlertID	ClusterID		Timestamp			Severity	AlertType		Description
0	portworx-storage-0	Jun 20 01:04:23 UTC 2017	NOTIFY		Node start success	Node a6247594-f908-4789-8998-9657682b027d with Index (0) is Up
1	portworx-storage-0	Jun 20 01:04:38 UTC 2017	NOTIFY		Node start success	Node a6247594-f908-4789-8998-9657682b027d joining the cluster with index (0)
2	portworx-storage-0	Jun 20 01:04:43 UTC 2017	NOTIFY		Node start success	PX is ready on Node: a6247594-f908-4789-8998-9657682b027d. CLI accessible at /opt/pwx/bin/pxctl.
[root@kubeguest04 ~]# kubectl create -f pvc.yml
persistentvolumeclaim "minio-persistent-storage" created

Now, when I try to create a PVC, it fails with:

[root@kubeguest04 ~]# kubectl describe pvc minio-persistent-storage
Name:		minio-persistent-storage
Namespace:	default
StorageClass:	portworx
Status:		Pending
Volume:
Labels:		<none>
Annotations:	volume.beta.kubernetes.io/storage-class=portworx
		volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/portworx-volume
Capacity:
Access Modes:
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  22s		6s		3	persistentvolume-controller			Warning		ProvisioningFailed	Failed to provision volume with StorageClass "portworx": Post http://:9001/v1/osd-volumes: dial tcp :9001: getsockopt: connection refused

Why is it connecting to http://:9001 and not including the hostname in there?

However, when I hit the host IP directly, I am able to connect just fine:

[root@kubeguest04 ~]# curl http://10.129.4.35:9001/v1/osd-volumes
[]
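
For context, a StorageClass/PVC pair along these lines reproduces the setup; the storage class name, annotation, and provisioner match the describe output above, while the requested size and the repl parameter are only illustrative.

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: portworx
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "2"              # number of replicas per volume (illustrative)
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: minio-persistent-storage
  annotations:
    volume.beta.kubernetes.io/storage-class: portworx
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi      # illustrative size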

Hello @prat0318

While the logs say http://:9001, the underlying library is invoking calls to http://localhost:9001. The empty hostname can be misleading.

Now, coming to the actual error message: Portworx needs to be running on the Kubernetes master node. The Kubernetes control plane invokes the Portworx API server at http://localhost:9001 on the master node. (This will change with the k8s 1.6.5 release.)

Can you tell me more about your setup?

  1. How did you deploy Kubernetes?
  2. Can I get the output of the following commands?
    -- kubectl get nodes
    -- kubectl get pods -l name=portworx -n kube-system
    -- kubectl logs <pod-name> -n kube-system, where <pod-name> is the name of the Portworx pod running on the master node

Portworx can talk to etcd over https as well. You will need to provide the URL in the following format when starting the Portworx container:

etcd:https://<ip>:<port>

You can find more info here
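
As a rough sketch, the relevant portion of the Portworx DaemonSet spec would then look like the fragment below. The -k (kvdb) and -c (cluster ID) flags come from the standard install spec; the cert-related flags and file paths are assumptions that you should verify against the docs for the PX version you are running.

containers:
- name: portworx
  image: portworx/px-dev
  args:
    - "-k"
    - "etcd:https://<ip>:<port>"   # secure etcd endpoint, format as above
    - "-c"
    - "portworx-storage-0"         # your cluster ID
    - "-ca"
    - "/etc/pwx/etcd-ca.crt"       # assumed flag/path: CA cert for etcd
    - "-cert"
    - "/etc/pwx/etcd.crt"          # assumed flag/path: client cert
    - "-key"
    - "/etc/pwx/etcd.key"          # assumed flag/path: client key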

Note that if you have an already running PX cluster, I would recommend starting afresh with a new clusterID when switching to the secure etcd.

Regarding k8s master nodes: yes, you will need to run Portworx on all master replicas. We have removed this dependency in the latest Kubernetes release, 1.6.6.
With 1.6.6 you will need to run PX only on the worker nodes.

Let us know if you need more help.

These two things worked for me:

  1. Running portworx on the master nodes
  2. Using https with ca cert files for k8s etcd

Also, I realized that if I want to change the cluster ID or some other config of a Portworx instance running on a node, I should rm -rf /opt/pdx on that node, else the pod will pick up the old config again, ignoring the one set in the DaemonSet.

One thing still confuses me. The PVC for one of the pods was attached to a volume whose replicas were on two nodes (two because the repl count is set to 2) other than the node running the pod. The volume list shows an asterisk saying the same thing. But I couldn't find a reason why such scheduling could happen. Shouldn't the local volume almost always be given preference?

@prat0318 glad you were able to make progress!

Also, I realized that if I want to change the cluster ID or some other config of a Portworx instance running on a node, I should rm -rf /opt/pdx on that node, else the pod will pick up the old config again, ignoring the one set in the DaemonSet.

Once a cluster is initialized, you cannot change its name, since the name is used as a key for cluster membership. I assume you were referring to removing /etc/pwx. You must not remove this directory, since it contains the file /etc/pwx/.private.json, which is important for the node's cluster membership. To change settings on a node, you have to do so in Maintenance mode. Please read: https://docs.portworx.com/maintain/maintenance-mode.html for more details.

One thing still confuses me. The PVC for one of the pods was attached to a volume whose replicas were on two nodes (two because the repl count is set to 2) other than the node running the pod. The volume list shows an asterisk saying the same thing. But I couldn't find a reason why such scheduling could happen. Shouldn't the local volume almost always be given preference?

By default, Kubernetes has no knowledge of the locality of volume data. To give it that knowledge, you have to use node labels with node affinity in the pod spec, for example:

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: "pvc-high-01"
            operator: In
            values:
              - "true"

When a Portworx volume is created in a Kubernetes cluster, we add node labels in the format pvc_name=true to the nodes where the data for the volume resides. Adding node affinity like the above ensures your pods run on the nodes where these labels exist.
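
For completeness, here is a sketch of where that block sits in a full pod spec; the pod, container, and image names are placeholders, and the label key follows the pvc_name=true convention described above.

apiVersion: v1
kind: Pod
metadata:
  name: mypod                      # placeholder pod name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: "pvc-high-01"     # the node label Portworx added for this volume
            operator: In
            values:
            - "true"
  containers:
  - name: app                      # placeholder container
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-high-01       # placeholder claim name matching the label above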

Please read: https://docs.portworx.com/scheduler/kubernetes/scheduler-convergence.html for more details.

Feel free to ask any further questions.