Orange-OpenSource/casskop

Adjusting -Xmx and -Xms for the cassandra pods, increasing utilization of Node resources

pratimsc opened this issue · 4 comments

Type of question

Trying to understand a design decision

Question

A specific question about how to adjust -Xmx and -Xms in the pod.

What did you do?
I followed the example in the documentation and referred to samples/cassandra-configmap-v2.yaml. Providing jvm.options in the ConfigMap does not help.
The JVM was still taking 3GB, which is one quarter of the pod resource limit.

What did you expect to see?
I was expecting to see 10GB for both -Xmx and -Xms, as I provided 10GB as the value for both in jvm.options.
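For reference, the entries I provided in jvm.options were along these lines (standard jvm.options syntax, simplified; the full file follows the sample ConfigMap):

-Xms10G
-Xmx10G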

What did you see instead? Under which circumstances?
However, I saw that the value assigned to those variables was 500MB, which is 25% of the pod resource limit.
I looked at issue #219 and came across this code snippet:

if resources.Limits.Memory().IsZero() == false {
    m := float64(resources.Limits.Memory().Value()) * float64(0.25) // Maxheapsize should be 1/4 of container Memory Limit
    mi := int(m / float64(1048576))                                 // bytes -> MiB
    mhs = strings.Join([]string{strconv.Itoa(mi), "M"}, "")         // e.g. "3072M" for a 12Gi limit
} else {
    mhs = defaultJvmMaxHeap
}

It looks like the code limits the max heap size to 25% of the resource limit and ignores anything provided in jvm.options.
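To illustrate the arithmetic, here is a minimal, self-contained sketch (not the operator's actual code path; it just applies the same formula to a hypothetical 12Gi limit) showing why the heap comes out at roughly 3GB:

package main

import (
    "fmt"
    "strconv"
    "strings"

    "k8s.io/apimachinery/pkg/api/resource"
)

func main() {
    // Hypothetical 12Gi memory limit, matching the pod spec described below.
    limit := resource.MustParse("12Gi")

    // Same arithmetic as the snippet above: heap = 1/4 of the memory limit, in MiB.
    m := float64(limit.Value()) * 0.25
    mi := int(m / float64(1048576))
    mhs := strings.Join([]string{strconv.Itoa(mi), "M"}, "")

    fmt.Println(mhs) // prints "3072M", i.e. the ~3GB heap observed
}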

Presently I am using machine type e2-standard-4 (4 vCPUs, 16 GB memory) for each node hosting a Cassandra pod. This GCP instance provides 13.97 GB of allocatable memory.
To maximize memory utilization, I set the resource limit to 12GB, which leaves about 2 GB of RAM for the other system pods. That is fine.
However, it leaves the Cassandra JVM with only 3GB of heap, and the remaining 9GB goes unused.

Please share the rationale behind the 25% allocation. Presently, this logic leaves the remaining 75% unused when a node hosts only one Cassandra pod.

Environment

  • casskop version: Latest/0.5.2

  • Kubernetes version information:

Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.9-gke.2", GitCommit:"4751ff766b3f6dbf6c6a63394a909e8108e89744", GitTreeState:"clean", BuildDate:"2020-05-08T16:44:50Z", GoVersion:"go1.13.9b4", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:
    GKE

  • Cassandra version:
    3.11.6

@pratimsc, first of all, there is a current issue with setting the heap that #230 solves. Second, we don't store memtables in the heap (memtable_allocation_type: offheap_objects). Cassandra uses, by default, at most 2GB of heap to store memtable data before flushing it to disk (see memtable_heap_space_in_mb). So unless you override those parameters, it shouldn't be a problem for you.
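For reference, the two cassandra.yaml parameters mentioned above look like this; the heap space value here is purely illustrative of the ~2GB figure, not necessarily what your generated config contains:

memtable_allocation_type: offheap_objects
memtable_heap_space_in_mb: 2048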

@pratimsc I propose that we close your issue, as the question was answered.

Please share the rationale behind the 25% allocation. Presently, this logic leaves the remaining 75% unused when a node hosts only one Cassandra pod.

@cscetbon can you please elaborate on that?

@rogaha Cassandra uses the remaining memory for off-heap objects. You usually want a small heap to avoid long stop-the-world (STW) pauses caused by the garbage collector; that's why using 1/4 of the memory was decided. It was also based on the recommendations made by DataStax, see https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/operations/opsConHeapSize.html.