Orange-OpenSource/casskop

Adjusting -Xmx and -Xms for the cassandra pods, increasing utilization of Node resources

pratimsc opened this issue · 4 comments

Type of question

Trying to understand a design decision

Question

A specific question about how to adjust -Xmx and -Xms in the pod.

What did you do?
I followed the example in the documentation and referred to samples/cassandra-configmap-v2.yaml. Providing jvm.options in the ConfigMap does not help.
The JVM was still taking 3GB, which is one quarter of the pod resource limit.

What did you expect to see?
I was expecting to see 10GB for both -Xmx and -Xms, as I provided 10GB as the value for both in jvm.options.
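For reference, the entries I provided in jvm.options were along these lines (standard jvm.options syntax, simplified; the full file follows the sample ConfigMap):

-Xms10G
-Xmx10G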

What did you see instead? Under which circumstances?
However, I saw that the value assigned to those variables was 500MB, which is 25% of the pod resource limit.
I looked at issue #219 and came across this code snippet:

if resources.Limits.Memory().IsZero() == false {
    m := float64(resources.Limits.Memory().Value()) * float64(0.25) // Maxheapsize should be 1/4 of container Memory Limit
    mi := int(m / float64(1048576))                                 // bytes -> MiB
    mhs = strings.Join([]string{strconv.Itoa(mi), "M"}, "")         // e.g. "3072M" for a 12Gi limit
} else {
    mhs = defaultJvmMaxHeap
}

It looks like the code limits the max heap size to 25% of the resource limit and ignores anything provided in jvm.options.
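To illustrate the arithmetic, here is a minimal, self-contained sketch (not the operator's actual code path; it just applies the same formula to a hypothetical 12Gi limit) showing why the heap comes out at roughly 3GB:

package main

import (
    "fmt"
    "strconv"
    "strings"

    "k8s.io/apimachinery/pkg/api/resource"
)

func main() {
    // Hypothetical 12Gi memory limit, matching the pod spec described below.
    limit := resource.MustParse("12Gi")

    // Same arithmetic as the snippet above: heap = 1/4 of the memory limit, in MiB.
    m := float64(limit.Value()) * 0.25
    mi := int(m / float64(1048576))
    mhs := strings.Join([]string{strconv.Itoa(mi), "M"}, "")

    fmt.Println(mhs) // prints "3072M", i.e. the ~3GB heap observed
}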

Presently I am using machine type e2-standard-4 (4 vCPUs, 16 GB memory) for each node hosting a Cassandra pod. This GCP instance provides 13.97 GB of allocatable memory.
To maximize memory utilization, I set the resource limit to 12GB, which leaves about 2 GB of RAM for the other system pods. That is fine.
However, it leaves the Cassandra JVM with only 3GB of heap, and the remaining 9GB goes unused.

Please share the rationale behind the 25% allocation. Presently, this logic leaves the remaining 75% unused when a node hosts only one Cassandra pod.

Environment

  • casskop version: Latest/0.5.2

  • Kubernetes version information:

Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.9-gke.2", GitCommit:"4751ff766b3f6dbf6c6a63394a909e8108e89744", GitTreeState:"clean", BuildDate:"2020-05-08T16:44:50Z", GoVersion:"go1.13.9b4", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:
    GKE

  • Cassandra version:
    3.11.6

@pratimsc, first of all, there is a current issue with setting the heap that #230 solves. Second, we don't store memtables in the heap (memtable_allocation_type: offheap_objects). Cassandra uses, by default, at most 2GB of heap to store memtable data before flushing it to disk (see memtable_heap_space_in_mb). So unless you override those parameters, it shouldn't be a problem for you.
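For reference, the two cassandra.yaml parameters mentioned above look like this; the heap space value here is purely illustrative of the ~2GB figure, not necessarily what your generated config contains:

memtable_allocation_type: offheap_objects
memtable_heap_space_in_mb: 2048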

@pratimsc I propose that we close your issue, as the question was answered.

Please share the rationale behind the 25% allocation. Presently, this logic leaves the remaining 75% unused when a node hosts only one Cassandra pod.

@cscetbon can you please elaborate on that?

@rogaha Cassandra uses the remaining memory for off-heap objects. You usually want a small heap to avoid long stop-the-world (STW) pauses caused by the garbage collector; that's why using 1/4 of the memory was decided. It was also based on the recommendations made by DataStax, see https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/operations/opsConHeapSize.html.