libvirt: cannot set master node vm memory
ValentinoUberti opened this issue · 12 comments
Version
$ openshift-install version
./openshift-install unreleased-master-1452-g6e2977c740853e842247b55c7cd08d9b350f3e93-dirty
built from commit 6e2977c740853e842247b55c7cd08d9b350f3e93
release image registry.svc.ci.openshift.org/origin/release:4.2
Platform:
libvirt
What happened?
The domainMemory value in {INSTALL_DIR}/openshift/99_openshift-cluster-api_master-machines-0.yaml is not used when creating the master VM. The master VM always has 6 GiB of RAM.
What you expected to happen?
Changes to the domainMemory value should be reflected when the master VM is created.
How to reproduce it ?
- Create the manifests:
./openshift-install create manifests --dir=./ocp4
- Edit the master machine manifest:
vi ocp4/openshift/99_openshift-cluster-api_master-machines-0.yaml
- Change the value of domainMemory
- Build the cluster:
./openshift-install create cluster --dir=./ocp4
- After a while, check the master VM info with libvirt:
sudo virsh dommemstat <master-domain-name>
The value of "actual" should be equal to the domainMemory value, but it is not.
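The check in the last step can be scripted. The following is a minimal sketch, assuming that `virsh dommemstat` prints one `name value` pair per line with `actual` expressed in KiB, and that domainMemory is expressed in MiB (consistent with the 6144 value discussed later in this thread). The function names and the sample output are illustrative and were not captured from a real cluster.

```python
def actual_memory_mib(dommemstat_output: str) -> int:
    """Return the 'actual' value from `virsh dommemstat` output, converted to MiB."""
    for line in dommemstat_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "actual":
            return int(fields[1]) // 1024  # assumption: value is reported in KiB
    raise ValueError("no 'actual' entry found in dommemstat output")


def memory_matches(dommemstat_output: str, domain_memory_mib: int) -> bool:
    """Compare the VM's actual memory with the domainMemory value from the manifest."""
    return actual_memory_mib(dommemstat_output) == domain_memory_mib


# Illustrative output only (hypothetical numbers, not from a real run).
sample = """actual 6291456
swap_in 0
rss 5120000
"""

print(actual_memory_mib(sample))     # 6291456 KiB // 1024 = 6144 MiB
print(memory_matches(sample, 8192))  # False: the VM has 6144 MiB, not the requested 8192
```

With a real cluster, the output of `sudo virsh dommemstat <master-domain-name>` would be passed in place of `sample`.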
/label platform/libvirt
Not sure yet who uses this config in the end (is it used at all?), but the master is created through Terraform, and the Terraform config respects the libvirt_master_memory variable (which translates to the TF_VAR_libvirt_master_memory environment variable). So there is a way to specify the memory at least, and that is the official way AFAIK.
Having said that, we need to check what's up with the openshift/99_openshift-cluster-api_master-machines-0.yaml file.
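As a rough sketch of the workaround described in this comment, the environment for the installer process could be prepared as follows. The helper name `installer_env` is hypothetical, the value is assumed to be in MiB, and whether the installer forwards this variable to Terraform is taken from the comment rather than verified here.

```python
import os


def installer_env(master_memory_mib: int, base_env=None) -> dict:
    """Return a copy of the environment with the master memory override set."""
    env = dict(os.environ if base_env is None else base_env)
    # Terraform reads variables from TF_VAR_<name>; values are passed as strings.
    env["TF_VAR_libvirt_master_memory"] = str(master_memory_mib)
    return env


env = installer_env(8192, base_env={"PATH": "/usr/bin"})
print(env["TF_VAR_libvirt_master_memory"])

# To launch the installer with this environment (not executed in this sketch):
# import subprocess
# subprocess.run(["./openshift-install", "create", "cluster", "--dir=./ocp4"],
#                env=env, check=True)
```

Copying the environment instead of mutating `os.environ` keeps the override scoped to the installer process.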
/label platform/libvirt
So it turns out this is the config for machines created by the libvirt cluster API provider, but the master is created by the Installer through Terraform. So this config ends up being used for the worker instead of the master. It's a bit confusing because it's not obvious at all.
Small correction: for worker nodes, we have the openshift/99_openshift-cluster-api_worker-machines-0.yaml file, and that is respected. If you create a master node through the libvirt provider (after the Installer is done), it will respect the values provided in the openshift/99_openshift-cluster-api_master-machines-0.yaml file.
So the problem is that the first master node is created by the Installer, which only uses the Terraform configuration. I think it should respect the manifest config somehow.
Thank you. I get some OOM errors on the master node, which is why I tried to change the master node memory. The installer stops at 98% on Fedora 30.
Hi @zeenix. The openshift/99_openshift-cluster-api_worker-machines-0.yaml file contains the workers' MachineSet, and if you change the replicas you do get the expected number of nodes, but memory setting changes don't work there either.
I think I have found the missing connection to the worker behavior.
The following provider function is used to define defaults for the Machines:
https://github.com/openshift/installer/blob/master/pkg/asset/machines/libvirt/machines.go#L60-L80
Here we can see the default values for CPU and the 6144 memory value. The provider function returns a *libvirtprovider.LibvirtMachineProviderConfig struct that is then used in the MachineSets() function in the same package: https://github.com/openshift/installer/blob/master/pkg/asset/machines/libvirt/machinesets.go#L16-L78.
I can confirm that after changing the DomainMemory field to 8192 in the provider function, the generated manifests pick up that value.
https://github.com/openshift/installer/blob/master/pkg/asset/machines/libvirt/machines.go#L66
The inspected manifests are:
99_openshift-cluster-api_master-machines-0.yaml
99_openshift-cluster-api_worker_machineset-0.yaml
However, on cluster creation the master and worker virtual machines are created with different values:
- master: 6144
- worker: 8192
This is probably because the master is managed through the Terraform variables, as @zeenix pointed out, and the worker by a MachineSet with machine configs passed through the above-mentioned *libvirtprovider.LibvirtMachineProviderConfig struct.
After this first attempt I rebuilt the installer, updating the memory value in the data/data/libvirt/variables-libvirt.tf file, which provides the default Terraform variables.
https://github.com/openshift/installer/blob/master/data/data/libvirt/variables-libvirt.tf#L35
This time I can confirm that both master and worker are created with 8192 MiB of RAM.
I talked to @abhinavdahiya about this, and it seems that on other platforms the master machine objects provide the vCPU and memory values to Terraform. So we need to do the same for libvirt. I'll look into this next week.
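The idea of feeding the master machine object's values into Terraform can be sketched as below. It is written in Python for brevity, although the installer itself is written in Go, and it is not the actual change. The `MachineProviderConfig` class and `master_tfvars` function are hypothetical, as is the `libvirt_master_vcpu` variable name; only `libvirt_master_memory` and the domainMemory field are named in this thread. The vCPU count of 4 is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class MachineProviderConfig:
    """Hypothetical stand-in for the libvirt machine provider config."""
    domain_memory: int  # MiB, mirrors the domainMemory manifest field
    domain_vcpu: int    # assumed field; mirrors a vCPU count in the manifest


def master_tfvars(config: MachineProviderConfig) -> dict:
    """Derive the Terraform variables for the master from its machine config,
    so the manifest and the Terraform-created VM share one source of truth."""
    return {
        "libvirt_master_memory": config.domain_memory,
        "libvirt_master_vcpu": config.domain_vcpu,  # hypothetical variable name
    }


# A master config edited to request 8192 MiB; the vCPU count is a placeholder.
edited = MachineProviderConfig(domain_memory=8192, domain_vcpu=4)
print(master_tfvars(edited))
```

With a mapping like this, editing domainMemory for the master would change the VM that Terraform creates, instead of being overridden by the Terraform default.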
should be fixed by #2399
/close
@abhinavdahiya: Closing this issue.