airflow-helm/charts

Airflow core parallelism set to 512, but tasks still run at a maximum of 32 at a time.

asif2017 opened this issue · 2 comments

Checks

Chart Version

8.7.1

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.5", GitCommit:"804d6167111f6858541cef440ccc53887fbbc96a", GitTreeState:"clean", BuildDate:"2022-12-08T10:15:02Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.5", GitCommit:"4009cfb7c076b982a1e38b079ae363ece1eb1a19", GitTreeState:"clean", BuildDate:"2023-06-12T18:46:22Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}

Helm Version

version.BuildInfo{Version:"v3.12.3", GitCommit:"3a31588ad33fe3b89af5a2a54ee1d25bfe6eaa5e", GitTreeState:"clean", GoVersion:"go1.20.7"}

Description

I have set Airflow core parallelism to 512, but tasks run at a maximum of 32 at a time. My default pool has 128 slots, so I thought the limit might come from the pool slots and that the parallelism setting was simply not taking effect. To test this, I set parallelism to 56, and in that case only 16 tasks ran at a time at most.
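For reference, concurrent task execution is capped by several settings at once, and the effective limit is the smallest of them (a sketch with illustrative values, not my exact file):

config:
  AIRFLOW__CORE__PARALLELISM: 512               # global cap across the whole deployment
  AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG: 128  # cap per DAG
  AIRFLOW__CELERY__WORKER_CONCURRENCY: 16       # cap per Celery worker (default 16)
# default_pool slots (128 here) add a further cap, adjustable in the UI or with `airflow pools set`.
# With 2 workers at the default concurrency of 16, running tasks top out at 2 x 16 = 32.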

The second issue I encountered is high memory usage. I have around 350 DAGs, and even though only 32 tasks can run in parallel, memory consumption reaches almost 20-25 GB. Is this expected, or could it be a memory leak?
I'm using the CeleryExecutor with Postgres as the metadata database. Below are my memory consumption details.
Airflow version 2.5.3

NAME                             CPU(cores)   MEMORY(bytes)
airflow-cluster-db-migrations    2m           227Mi
airflow-cluster-flower           8m           293Mi
airflow-cluster-pgbouncer        344m         23Mi
airflow-cluster-redis-master-0   7m           14Mi
airflow-cluster-scheduler        698m         3324Mi
airflow-cluster-triggerer        176m         841Mi
airflow-cluster-web              118m         1512Mi
airflow-cluster-worker-0         212m         5429Mi
airflow-cluster-worker-1         234m         10059Mi

Relevant Logs

No response

Custom Helm Values

config:
  AIRFLOW__CORE__PARALLELISM: 512
  AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG: 128
  AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG: 1

I am using the default values for all the pods; the values above are the only ones I added to config for parallelism.

Okay, I found the issue. I had set AIRFLOW__CORE__PARALLELISM, but I was missing one more setting, AIRFLOW__CELERY__WORKER_CONCURRENCY, which defaults to 16 per worker. This also needs to be set for the parallelism to take effect.
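For anyone hitting the same cap, a minimal sketch of the combination that resolves it (the concurrency value is illustrative; size it to your workers):

config:
  AIRFLOW__CORE__PARALLELISM: 512
  AIRFLOW__CELERY__WORKER_CONCURRENCY: 64   # default is 16 per worker
# Total running tasks is roughly (number of workers) x worker_concurrency,
# still bounded by core parallelism and default_pool slots.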

But the memory consumption is still too high. Both workers share the tasks fairly evenly, yet worker 1 consistently uses about double the memory of worker 0.
I also notice that if a pod is not restarted, its memory usage keeps growing day by day until it does restart. I deployed a new cluster 2 days ago with 40 DAGs; the triggerer pod had not restarted since the first deployment, and today it was consuming 975Mi. After I restarted the triggerer pod, its memory usage dropped to about 200Mi.
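As a possible stopgap, memory limits on the affected pods would make Kubernetes restart them before they grow unbounded. A sketch, assuming the chart exposes per-component resources blocks as in its values.yaml (the limits shown are illustrative, not a recommendation):

workers:
  resources:
    limits:
      memory: 8Gi   # worker pod is restarted if it exceeds this
triggerer:
  resources:
    limits:
      memory: 1Gi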

This issue has been automatically marked as stale because it has not had activity in 60 days.
It will be closed in 7 days if no further activity occurs.

Thank you for your contributions.


Issues never become stale if any of the following is true:

  1. they are added to a Project
  2. they are added to a Milestone
  3. they have the lifecycle/frozen label