threefoldtech/home

Add more sizes for K8S workload

Closed this issue · 11 comments

At the moment k8s workload only supports 2 sizes:

| Size | vCPU | RAM (GiB) | SSD Storage (GiB) |
|------|------|-----------|-------------------|
| 2    | 1    | 2         | 50                |
| 1    | 2    | 4         | 100               |

We could propose more variants of this workload.

| vCPU | RAM (GiB) | SSD Storage (GiB) |
|------|-----------|-------------------|
| 2    | 8         | 25                |
| 2    | 8         | 50                |
| 2    | 8         | 200               |
| 4    | 16        | 50                |
| 4    | 16        | 100               |
| 4    | 16        | 400               |
| 8    | 32        | 100               |
| 8    | 32        | 200               |
| 8    | 32        | 800               |
| 16   | 64        | 200               |
| 16   | 64        | 400               |
| 16   | 64        | 800               |
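To make the proposal concrete, the mapping from a size ID to its resources could be sketched as below. Only sizes 1 and 2 exist today; the IDs assigned to the new entries (3 and up) are purely hypothetical, as is the helper name:

```python
# Hypothetical k8s size table: size ID -> (vCPU, RAM GiB, SSD GiB).
# Sizes 1 and 2 match the existing workload; IDs 3-14 are invented
# here to illustrate the proposed variants.
K8S_SIZES = {
    1: (2, 4, 100),   # existing
    2: (1, 2, 50),    # existing
    3: (2, 8, 25),
    4: (2, 8, 50),
    5: (2, 8, 200),
    6: (4, 16, 50),
    7: (4, 16, 100),
    8: (4, 16, 400),
    9: (8, 32, 100),
    10: (8, 32, 200),
    11: (8, 32, 800),
    12: (16, 64, 200),
    13: (16, 64, 400),
    14: (16, 64, 800),
}

def resources_for(size: int) -> tuple[int, int, int]:
    """Return (vcpu, ram_gib, ssd_gib) for a given size ID."""
    return K8S_SIZES[size]
```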

Tasks:

@despiegk @andhartl @weynandkuijpers do these new sizes make sense? Am I missing any?

I think the proposed sizes make sense from a vCPU and RAM viewpoint. I'm not sure a "large" k8s workload also needs a lot more SSD storage. Do more powerful k8s clusters also require more SSD storage?

K8S by itself does not require a lot of storage, but this is the usable storage for the applications you want to run in your cluster.
At the moment, the only way to extend the usable storage of your cluster is to add new nodes to it. But since you can only have one VM per node per network, there is a hard cap on how much storage you can have.

So my idea was to have these "heavy" storage sizes for people that need them. If your use case is mostly about storage, you do not also have to pay for a lot of CPU and memory that is not required.

Then may I suggest also adding heavy-storage node sizes with (a lot) less vCPU and RAM. Then you can select to have "storage nodes" in your cluster. Am I understanding this correctly?

Can the storage of a node be separated from the vCPU/memory capacity so they can be selected separately? I understand this might require changes to the reservation object, but why was it not considered from the start?

> Then may I suggest also adding heavy-storage node sizes with (a lot) less vCPU and RAM. Then you can select to have "storage nodes" in your cluster. Am I understanding this correctly?

It's not that simple. At the moment I didn't manage to install a network storage system that would allow creating virtual disks usable by any node in the cluster. So right now only the local disk can be used, which means the running containers can only use the storage available on the local node.

I think there is some research to be done to see how we could make a system like https://github.com/longhorn/longhorn work.
And for this, I think the VM needs to have raw disks available. This means @muhamadazmy we should probably allow configuring the number and size of extra disks to attach to the VM so they can be used by longhorn.

I'm not sure, but I think integrating virtio-fs would actually be a solution here, since we could then mount a filesystem on the host created by a volume reservation somehow. That would allow users to customize how much storage they want, and we can keep a small SSD size in the reservation for the main k8s data. So this would be similar to how containers work then (cc @delandtj).

@zaibon that's not exactly what I mean. My idea is that the k8s reservation should have 2 numbers, one for the vCPU and another one for the disk size, unlike what we have now, which is a single number that defines the "size" of the VM.
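The difference between the two models could be sketched roughly as follows (field names are invented for illustration, not the actual reservation schema):

```python
from dataclasses import dataclass

# Today: one opaque size ID that fixes vCPU, RAM and disk together.
@dataclass
class K8sReservationToday:
    size: int  # e.g. 1 or 2

# Proposed: compute and disk chosen independently
# (hypothetical field names, not the real reservation object).
@dataclass
class K8sReservationProposed:
    compute_size: int  # selects a vCPU/RAM pair
    disk_gib: int      # usable SSD storage, chosen "a la carte"
```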

@LeeSmet I would love that, if it's possible using virtio-fs.

@LeeSmet I'm not sure this is a valid solution. I think systems like longhorn require a raw disk and not just a mountpoint with a filesystem on it, so mounting a btrfs subvolume won't work.

> that's not exactly what I mean, my idea is that the k8s reservation should have 2 numbers, one for the vCPU and another one for the disk size, unlike what we have now, which is a single number that defines the "size" of the VM

Yeah, I got that. This was chosen because the industry works like this for VMs: they propose fixed sizes, not "a la carte".

The question is what we want to achieve related to storage, I guess. Do we need something like longhorn, or do we just need to allow users to configure more storage space? I'm not really experienced with k8s, so I don't know what would be good enough.

If we do need raw disks, would it be possible to allow adding volume reservation IDs to k3s reservations, such that the provision engine has storaged create a raw disk file for the full size of the volume (later possibly a qcow file)? Provision knows when a volume is mounted, so it can make sure the volume is only used by the VM, not by a container (or another VM). That would allow adding raw disks inside the VM without messing with the existing reservation structures.
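The raw-disk-file part of that proposal could be sketched like this (the helper name and path are invented here, and this is not the actual storaged API): create a sparse file matching the volume size, which the hypervisor can later attach to the VM as a raw disk:

```python
import os

def create_raw_disk(path: str, size_gib: int) -> str:
    """Create a sparse raw disk file of size_gib GiB.

    Hypothetical helper for illustration, not the real storaged API.
    Being sparse, the file occupies no disk space until written to,
    but reports the full size to the hypervisor.
    """
    with open(path, "wb") as f:
        f.truncate(size_gib * 1024**3)
    return path
```

The hypervisor could then expose such a file to the guest as a raw block device (e.g. via virtio-blk), which is the kind of disk a system like longhorn expects.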

That is also how I was seeing things: let the user decide which extra disks to attach to their VM. That would probably make things more modular. But these things require some research to know exactly how the whole system should work.

Regarding this specific issue, I think we can go with the sizes I proposed for now and open a new issue to follow up on this discussion.
@weynandkuijpers agreed?