polyledger/connect

Celerybeat periodic task does not run in staging env

Closed this issue · 5 comments

Description

Celery configuration in polyledger/polyledger/settings/base.py:

# Celery application definition
# http://docs.celeryproject.org/en/v4.0.2/userguide/configuration.html
from celery.schedules import crontab

CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
CELERY_TIMEZONE = 'US/Pacific'
CELERY_BEAT_SCHEDULE = {
    'get-new-day-prices': {
        'task': 'api.tasks.get_current_prices',
        'schedule': crontab(hour=0, minute=0)
    }
}

The 'get-new-day-prices' entry should make Celery run the task every day at midnight, but it is not firing in staging. Will have to debug this issue.
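These CELERY_-prefixed settings only take effect if the Celery app is wired to read them from Django settings under the CELERY namespace. A minimal sketch of that wiring, assuming the standard Django/Celery integration layout (the module path polyledger/celery.py is assumed, not confirmed by this issue):

```python
# polyledger/celery.py -- standard Django/Celery integration (sketch)
import os

from celery import Celery

# Settings module assumed for illustration; staging uses settings.staging
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'polyledger.settings.base')

app = Celery('polyledger')

# Read every setting prefixed with CELERY_ from Django settings. Without
# this namespace hookup, CELERY_BEAT_SCHEDULE in base.py is silently ignored.
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
```

If this file is missing or the namespace is wrong, beat starts but has an empty schedule, which would match the observed symptom.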


Celery is managed by supervisor.

Contents of /etc/supervisor/conf.d/celery.conf:

[program:celery]
command=/var/www/staging.polyledger.com/polyledger/server/venv/bin/celery -E -A polyledger worker --loglevel=info -B
environment=
	SECRET_KEY='fj(k(x6yk*h)t_j^%%u*b3q)ml!t&w-sb6!-phvj@a(o',
	DJANGO_SETTINGS_MODULE='polyledger.settings.staging',
	POSTGRESQL_NAME=polyledger_staging,
	POSTGRESQL_PASSWORD=polyledger,
	POSTGRESQL_USER=admin
directory=/var/www/staging.polyledger.com/polyledger/server
user=www-data
numprocs=1
stdout_logfile=/var/www/staging.polyledger.com/logs/celery-worker.log
stderr_logfile=/var/www/staging.polyledger.com/logs/celery-worker.log
autostart=true
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; If redis is supervised, set its priority higher
; so it starts first
priority = 998

Fixed by creating two separate configuration files:

  • celery_worker.conf
  • celery_beat.conf

The Celery beat command is /path/to/celery -A polyledger beat -l info
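The split supervisor configuration might look like the following, a sketch derived from the original celery.conf (the environment= lines are elided here for brevity, and the celery-beat.log filename is an assumption consistent with the log excerpt below):

```ini
; /etc/supervisor/conf.d/celery_worker.conf -- worker only, no -B flag
[program:celery_worker]
command=/var/www/staging.polyledger.com/polyledger/server/venv/bin/celery -E -A polyledger worker --loglevel=info
directory=/var/www/staging.polyledger.com/polyledger/server
user=www-data
autostart=true
autorestart=true
stdout_logfile=/var/www/staging.polyledger.com/logs/celery-worker.log
stderr_logfile=/var/www/staging.polyledger.com/logs/celery-worker.log

; /etc/supervisor/conf.d/celery_beat.conf -- beat as its own process
[program:celery_beat]
command=/var/www/staging.polyledger.com/polyledger/server/venv/bin/celery -A polyledger beat --loglevel=info
directory=/var/www/staging.polyledger.com/polyledger/server
user=www-data
autostart=true
autorestart=true
stdout_logfile=/var/www/staging.polyledger.com/logs/celery-beat.log
stderr_logfile=/var/www/staging.polyledger.com/logs/celery-beat.log
```

Running beat as a separate program also avoids the embedded -B scheduler, which the Celery docs discourage for production since it ties the scheduler's lifetime to a single worker.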

Celery beat creates a .pid file in /tmp and reads from /var/www/staging.polyledger.com/polyledger/server/celerybeat-schedule.db. Therefore, I modified permissions: user www-data was added to the polyledger group, which owns /var/www/staging.polyledger.com, and the group was given write access.
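The permission change described above can be sketched as follows (run as root; the group name polyledger is taken from the description, and the recursive chmod is broader than strictly necessary):

```shell
# Add www-data to the polyledger group so beat can write its schedule file
usermod -aG polyledger www-data

# Give the group write access to the project tree; consider scoping this
# down to just celerybeat-schedule.db and the .pid location
chmod -R g+w /var/www/staging.polyledger.com
```

Note that supervisor must restart the program before the new group membership of www-data takes effect.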

Re-opening because the beat schedule is firing 8 hours ahead of time, leading me to believe that Celery is using UTC instead of US/Pacific (UTC-8), even though both the Django and Celery timezone settings are US/Pacific.

Setting CELERY_ENABLE_UTC = False may be a potential fix if it defaults to True; there is a related PR (celery/celery#4173, merged but not yet released) addressing these timezone issues.

Update: the periodic task should now run properly at UTC midnight (4:00 PM PST, UTC-8).
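The 8-hour offset can be checked directly with the standard library (zoneinfo requires Python 3.9+; US/Pacific resolves via the system tz database):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Midnight UTC on the date of the failed run
utc_midnight = datetime(2017, 12, 23, 0, 0, tzinfo=timezone.utc)

# Convert to Pacific time: PST is UTC-8 in December
pacific = utc_midnight.astimezone(ZoneInfo("US/Pacific"))
print(pacific)  # 2017-12-22 16:00:00-08:00, i.e. 4:00 PM the previous day
```

So a crontab(hour=0, minute=0) schedule interpreted in UTC fires at 4:00 PM Pacific, exactly the 8-hour skew observed.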

The most recent attempt failed, possibly because redis-server was not running properly. The relevant entry from celery-beat.log:

[2017-12-23 00:06:40,845: ERROR/MainProcess] beat: Connection error: MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.. Trying again in 32.0 seconds...

From /var/log/redis/redis-server.log:

10156:M 23 Dec 00:12:10.070 * 1 changes in 900 seconds. Saving...
10156:M 23 Dec 00:12:10.070 # Can't save in background: fork: Cannot allocate memory

https://gist.github.com/fernandoaleman/07f40de23864c50071929f5e3ab5840d
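Per the gist above, one commonly cited workaround for the MISCONF error is to stop Redis from refusing writes when a background save fails (note this only masks the symptom; the overcommit change below addresses the underlying fork failure):

```conf
# /etc/redis/redis.conf
# Don't reject writes when the last RDB background save failed
stop-writes-on-bgsave-error no
```

The fork failure itself comes from Redis trying to fork a copy-on-write child for the RDB snapshot without enough committable memory, which is what vm.overcommit_memory=1 permits.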

Modified /etc/sysctl.conf with:

vm.overcommit_memory=1

Then reloaded the kernel settings with:

$ sudo sysctl -p /etc/sysctl.conf

The result was that the periodic task ran properly, but a backlog of queued tasks executed sporadically. This appears to be Redis-related, since /etc/init.d/redis-server stop halted the tasks.

Link to relevant thread: celery/celery#943

Closing this issue since the staging environment host no longer exists.