helm/charts

Airflow Gunicorn service not running || web UI inaccessible

Vivekdjango opened this issue · 2 comments

Airflow version: 1.8.0
Gunicorn version: 19.3.0
Python version: 2.7

When: After we restarted the server on which the Airflow service was running

What happened: Everything else is working fine, but the Airflow Gunicorn service is not running, so the webserver UI is not accessible.

Web server log: Whenever we start the webserver, it gets stuck:

airflow webserver
[2021-12-22 18:26:15,403] {__init__.py:57} INFO - Using executor LocalExecutor
[2021-12-22 18:26:15,477] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2021-12-22 18:26:15,509] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt


  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/

/usr/local/lib/python2.7/dist-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
[2021-12-22 18:26:15,906] [9263] {models.py:167} INFO - Filling up the DagBag from /usr/local/lib/airflow/dags
/usr/local/lib/python2.7/dist-packages/airflow/utils/helpers.py:406: DeprecationWarning: Importing SSHHook directly from <module 'airflow.contrib.hooks' from '/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/__init__.pyc'> has been deprecated. Please import from '<module 'airflow.contrib.hooks' from '/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/__init__.pyc'>.[operator_module]' instead. Support for direct imports will be dropped entirely in Airflow 2.0.
DeprecationWarning)
[2021-12-22 18:26:17,439] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(BashOperator): sync_dag>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name trust_ride_ano_in_1 --executor-memory 4g --queue trust --num-executors 5 --master yarn --driver-memory 10G --py-files /home/suvratjain/RideAnomalyDetection.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true /home/suvratjain/RideAnomalyDetection/raw_data.py --country IN --s3_path s3a://s3-emr-test-stg/fraud/ride_level_anomaly/raw_feature/IN/{{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }} --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }}
/usr/local/lib/python2.7/dist-packages/airflow/models.py:1927: PendingDeprecationWarning: Invalid arguments were passed to BashOperator. Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were:
*args: ()
**kwargs: {'provide_context': True}
category=PendingDeprecationWarning
spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name trust_ride_ano_gb_1 --executor-memory 4g --queue trust --num-executors 5 --master yarn --driver-memory 10G --py-files /home/suvratjain/RideAnomalyDetection.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true /home/suvratjain/RideAnomalyDetection/raw_data.py --country GB --s3_path s3a://s3-emr-test-stg/fraud/ride_level_anomaly/raw_feature/GB/{{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }} --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }}

spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name partner_ano_in_1 --executor-memory 10g --queue trust --num-executors 20 --master yarn --driver-memory 20G --py-files /home/suvratjain/PartnerAnomalyDetection.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true /home/suvratjain/PartnerAnomalyDetection/main.py --country IN --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }}
spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name customer_ride_ano_in_1 --executor-memory 5G --queue trust --num-executors 20 --master yarn --driver-memory 5G --py-files s3a://s3-emr-test-stg/fraud/datascience/anomaly-detection/CustomerAnomalyDetection/CustomerAnomalyDetection.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true s3a://s3-emr-test-stg/fraud/datascience/anomaly-detection/CustomerAnomalyDetection/CustomerRideFeatureCollection.py --country IN --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }}
spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name customer_ride_ano_gb_1 --executor-memory 5G --queue trust --num-executors 20 --master yarn --driver-memory 5G --py-files s3a://s3-emr-test-stg/fraud/datascience/anomaly-detection/CustomerAnomalyDetection/CustomerAnomalyDetection.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true s3a://s3-emr-test-stg/fraud/datascience/anomaly-detection/CustomerAnomalyDetection/CustomerRideFeatureCollection.py --country GB --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }}
spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name outstation_partner_ds_in_1 --executor-memory 20g --queue trust --num-executors 10 --master yarn --driver-memory 20G --py-files /home/souritmanna/Trust/gitlab_repo/anomaly-detection/outstation_anomaly_detection/partner_anomaly/OutstationPartnerAnomaly.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true /home/souritmanna/Trust/gitlab_repo/anomaly-detection/outstation_anomaly_detection/partner_anomaly/data_fetching.py --country IN --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }} --output_s3_path s3a://s3-emr-test-stg/fraud/outstation_partner_level_anomaly/
spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name auto_ride_ds_in --executor-memory 10g --queue trust --num-executors 10 --master yarn --driver-memory 20G --py-files /home/suvratjain/AutoRideAnomaly.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true /home/suvratjain/AutoRideAnomaly/raw_data.py --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }} --country IN --s3_path s3a://s3-emr-test-stg/fraud/auto_ride_level_anomaly/raw_feature/IN/{{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }}/
spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name customer_non_ride_ano_au_1 --executor-memory 5G --queue trust --num-executors 20 --master yarn --driver-memory 5G --py-files /home/suvratjain/CustomerNonRideAnomaly.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true /home/suvratjain/CustomerNonRideAnomaly/CustomerNonRideDataFetch.py --country AU --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }} --s3_path s3a://s3-emr-test-stg/fraud/customer_non_ride_anomaly/raw_feature/AU/{{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }}
echoq spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name trust_ride_ano_gb_1 --executor-memory 4g --queue trust --num-executors 5 --master yarn --driver-memory 10G --py-files s3a://s3-emr-test-stg/fraud/datascience/anomaly-detection/RideAnomalyDetection/RideFeatureCode.zip --conf spark.ui.showConsoleProgress=true --conf spark.driver.maxResultSize=6g --conf spark.files.overwrite=true s3a://s3-emr-test-stg/fraud/datascience/anomaly-detection/RideAnomalyDetection/raw_data.py --country GB --s3_path s3a://s3-emr-test-stg/fraud/ride_level_anomaly/raw_feature/GB/{{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }} --day_t {{ task_instance.xcom_pull(key="for_date", task_ids='set_for_date') }}
spark-submit --files /etc/spark2/conf/hive-site.xml --deploy-mode cluster --name partner_ano_au_1 --executor-memory 10g --
[2021-12-22 18:26:32,333] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(BashOperator): Bind_To_Favourites>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,334] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(BashOperator): Generate_RocksDB>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,344] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(SparkRemoteOperator):
[2021-12-22 18:26:32,426] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(MarathonOperator): data_processor_bangalore>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,427] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(S3KeySensor): data_processor_city_BANGALORE_output_sensor>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,429] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(MarathonOperator): data_processor_bangalore_kill>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,430] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(MarathonOperator): data_processor_delhi>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,431] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(S3KeySensor): data_processor_city_DELHI_output_sensor>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,433] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(MarathonOperator): data_processor_delhi_kill>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,434] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(MarathonOperator): data_processor_mumbai>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:32,435] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(S3KeySensor): data_processor_city_MUMBAI_output_sensor>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead

[2021-12-22 18:26:34,949] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(BashOperator): ahemadabad_UnscheduledStopFinalModel>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:34,950] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(DummyOperator): lucknow_dag_start>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:34,951] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(BashOperator): lucknow_AbandonedZoneScoreFinalModel>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:34,952] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(BashOperator): lucknow_RouteDeivationFinalModel>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
[2021-12-22 18:26:34,953] [9263] {models.py:1958} WARNING - schedule_interval is used for <Task(BashOperator): lucknow_UnscheduledStopFinalModel>, though it has been deprecated as a task parameter, you need to specify it as a DAG parameter instead
/usr/local/lib/python2.7/dist-packages/airflow/models.py:1927: PendingDeprecationWarning: Invalid arguments were passed to SparkSubmitOperator. Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were:
*args: ()
**kwargs: {'driver_memory': '2g', 'application_args': ['staging', '2021-12-22'], 'java_class': 'com.ola.fs.preauth.exps.Exps1'}
category=PendingDeprecationWarning

No response after this line.
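
The repeated `schedule_interval` warnings in the log above typically appear when that key reaches the tasks, for example through `default_args`, instead of being given to the DAG itself. They are deprecation notices and are unlikely on their own to explain the hang. A minimal sketch of separating the two, assuming a `default_args` dict like the hypothetical one below; the helper name `split_dag_kwargs` and the example values are illustrative and not part of Airflow:

```python
# Keys that belong on the DAG object rather than on individual tasks.
# Only schedule_interval is named in the warnings; the set is an assumption.
DAG_LEVEL_KEYS = {"schedule_interval"}


def split_dag_kwargs(default_args):
    """Return (dag_kwargs, task_defaults) without mutating the input dict."""
    dag_kwargs = {k: v for k, v in default_args.items() if k in DAG_LEVEL_KEYS}
    task_defaults = {k: v for k, v in default_args.items() if k not in DAG_LEVEL_KEYS}
    return dag_kwargs, task_defaults


# Hypothetical default_args that would trigger the warning for every task.
default_args = {"owner": "airflow", "retries": 1, "schedule_interval": "@daily"}
dag_kwargs, task_defaults = split_dag_kwargs(default_args)

# With Airflow installed, the DAG would then be declared roughly as:
#   dag = DAG("example_dag", default_args=task_defaults, **dag_kwargs)
print(dag_kwargs)
print(task_defaults)
```

Passing `schedule_interval` only through `dag_kwargs` keeps it out of every operator's constructor, which is what the warning text asks for.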

We then tried to start gunicorn directly and got an application-not-found error:

root@airflow1:/usr/local/lib/python2.7/dist-packages/airflow# gunicorn -b 0.0.0.0:8080 airflow
[2021-12-22 21:11:03 +0000] [21851] [INFO] Starting gunicorn 19.3.0
[2021-12-22 21:11:03 +0000] [21851] [INFO] Listening at: http://0.0.0.0:8080 (21851)
[2021-12-22 21:11:03 +0000] [21851] [INFO] Using worker: sync
[2021-12-22 21:11:03 +0000] [21856] [INFO] Booting worker with pid: 21856
[2021-12-22 21:11:03,753] {__init__.py:57} INFO - Using executor LocalExecutor
[2021-12-22 21:11:03,840] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2021-12-22 21:11:03,860] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
Failed to find application: 'airflow'
[2021-12-22 21:11:03 +0000] [21856] [INFO] Worker exiting (pid: 21856)
[2021-12-22 21:11:03 +0000] [21851] [INFO] Shutting down: Master
[2021-12-22 21:11:03 +0000] [21851] [INFO] Reason: App failed to load.
root@airflow1:/usr/local/lib/python2.7/dist-packages/airflow#
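
The `Failed to find application: 'airflow'` message is consistent with how gunicorn resolves its target: it expects `MODULE:VARIABLE`, and when the `:VARIABLE` part is missing it looks for an attribute named `application` in the module. The `airflow` package has no such attribute, so the bare name cannot load. The sketch below imitates only that splitting step; `split_app_target` is a hypothetical helper, not gunicorn's API. The target `airflow.www.app:cached_app()` is believed to be what the `airflow webserver` command passes to gunicorn in Airflow 1.x, so treat it as an assumption to verify against the installed version.

```python
def split_app_target(target):
    """Split a gunicorn-style target into (module, attribute expression)."""
    module, sep, attr = target.partition(":")
    # With no ':' part, gunicorn falls back to the name "application".
    return module, (attr if sep else "application")


# The command in the log above: no ':' part, so "application" is assumed.
print(split_app_target("airflow"))
# Assumed Airflow 1.x target: an explicit module and callable.
print(split_app_target("airflow.www.app:cached_app()"))
```

Under that assumption, the manual equivalent of the command above would be `gunicorn -b 0.0.0.0:8080 'airflow.www.app:cached_app()'`. If it stalls in the same place as `airflow webserver`, the problem probably lies in loading the DAGs rather than in gunicorn itself.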

Please help me to fix this issue ASAP.

Hi, @Vivekdjango!

I suggest looking in https://docs.gunicorn.org/en/stable/ or https://airflow.apache.org/docs/apache-airflow/stable/, and checking with those communities.

This repo is the former home of helm charts which are now found at Artifact Hub. Project support is not available in this repo. Thanks!

Oh, and one more thing - if this question is indeed related to a Helm chart, then you may find the info you're looking for in https://github.com/airflow-helm/charts/. Best!