NYUCCL/psiTurk

Feature: Adding support for stream logging (only used when ON_CLOUD=1)

jmuchovej opened this issue ยท 14 comments

I was recently trying to debug different local application behavior vs Heroku-hosted behavior. (One of my custom routes was breaking on Heroku, but not locally.)

However, since ON_CLOUD will reset psiTurk's logging, this made it a nightmare to debug.

if 'ON_CLOUD' in os.environ:
LOG_FILE_PATH = None
else:
LOG_FILE_PATH = os.path.join(os.getcwd(), CONFIG.get("Server Parameters",
"logfile"))

My changes are detailed here. Essentially, I replaced the enforced logging to a file to logging.<type>Handlers so that a StreamHandler could be added, allowing for Heroku pull psiTurk logs, since they aggregate all stdout and stderr output.

Wait I'm confused, but I admit to being a noob with logging, and I know there's some head scratching stuff in psiturk rn with logging. At one point that logging was disabled so that the SECRET_KEY didn't get logged to the heroku logs, but on second thought there's no need to log that anyway. psiturk_server.py sets the logger to be stderr, and I thought there was code that bound the flask logger to gunicorn's when gunicorn is detected. How do your changes fit in with that? Do you see the part I'm talking about? Thanks for helping with this btw, again the current state is a mess.

Hmm... Let me roll back the experiment to a state where I should have the breakdown. Give me a few minutes. ๐Ÿ˜…

I don't recall seeing anything from gunicorn nor flask, but will check back.

Rollback

Okay, so, rolling back to a few commits before I implemented the change โ€“ I see only logs from the following:

  • heroku[router]
  • heroku[web.1]
  • app[web.1]

Trying the erroneous path, I don't see Heroku's logs raising the error through their dashboard (https://dashboard.heroku.com/apps/<app>/logs).

psiturk_server.py?

I don't see a psiturk_server.py file anywhere, so I'll assume you meant experiment_server.py? If so, I don't see what you're mentioning. Presumably you're referring to this?

self.user_options = {
'bind': config.get('Server Parameters', "host") + ":" + config.get('Server Parameters', "port"),
'workers': workers,
'worker_class': 'gevent',
'loglevels': self.loglevels,
'loglevel': self.loglevels[config.getint("Server Parameters", "loglevel")],
'accesslog': config.get('Server Parameters', "accesslog"),
'errorlog': config.get('Server Parameters', "errorlog"),
'proc_name': 'psiturk_experiment_server_' + project_hash,
'limit_request_line': '0',
}
I'm not sure that actually sets anything? Doing a quick search over gunicorn and flask, I don't see any meaningful mention of a user_options (or options) variable.

Looking over gunicorn and flask, I see everything you mentioned; so now I'm just as puzzled. ๐Ÿ˜‚ Maybe this has something to do with initialization order?

What I do know:

  • I wasn't receiving any routing errors (for custom.py) until I set psiTurk's logging to the sys.stderr stream (which gunicorn and flask also do).
  • I've been receiving local gunicorn errors through stderr since last Sept (so v2.3, I think?); so I'm not sure what's going on there. ๐Ÿ˜…

What I'd guess:

  • The few times I've used stdlib's logging, anytime logging.basicConfig is used, I vaguely recall seeing logging behavior change.
  • flask and gunicorn bind their loggers to variables that can be accessed later. Searching for explicit logging attempts in psiTurk, I see there aren't any calls to logging aside from the basicConfig call. This leads me to believe that calling logging.basicConfig may be changing the overall behavior of both flask and gunicorn's logging. Give me a few minutes to verify this.

Okay, I think I know what's going on. When I modified psiTurk to only setup logging when $ON_CLOUD is unset, whatever's normally logging (gunicorn, presumably?) dumps the expected errors to Heroku's logs. So, it seems that doing logging.basicConfig(filename=None) is what's causing the breakdown.

I also have a nagging feeling that it might be good to specify log files however gunicorn/flask expects, but I don't know how to do that. (Nor if it's a super high priority for you/psiTurk. ๐Ÿ˜…)

This is the gunicorn logging binding thing I was remembering: https://github.com/NYUCCL/psiTurk/blob/master/psiturk/experiment.py#L58

I'll help think over this too as early as next week. Thanks again for your help.

Are you using app.logger to write to logs? https://flask.palletsprojects.com/en/1.1.x/logging/

I'm thinking we maybe just need to set flask's logging level to be the same as gunicorn, when gunicorn is being used: https://trstringer.com/logging-flask-gunicorn-the-manageable-way/

The code I linked two comments up is an attempt to detect whether the app is being run directly as a flask application (not supported by psiturk, but I do this sometimes when I want live template reloading) or if it's being run via gunicorn.

No, I wasn't explicitly writing to logs. I was trying see what the exceptions being thrown were.

I should point out that if you have not set any value for log file in your config.txt, then stout and stderr will be used by default if ON_CLOUD=1. This differs from previous psiturk versions which would ignore any log file setting in config.txt and just use stout/stderr if ON_CLOUD=1. Config files generated by new psiturk versions comment out all defaults in the config.txt file generated by the psiturk-setup-example command, so a migrated config file might lead to unexpected logging results. Is that your case?

I believe this is my case, but to be sure โ€“ I'm using psiTurk 3.0.4 and have set ON_CLOUD in Heroku. Here's my config, if it's helpful:

config.txt
[HIT Configuration]
title = <REDACTED>
description = <REDACTED>
keywords = Perception, Psychology
lifetime = 24
us_only = true
approve_requirement = 95
number_hits_approved = 100
require_master_workers = false
browser_exclude_rule = MSIE, mobile, tablet
allow_repeats = false
ad_url_domain = <REDACTED>

[Database Parameters]
table_name = <REDACTED>

[Server Parameters]
loglevel = 2
debug = false
threads = 1

[Task Parameters]
experiment_code_version = <REDACTED>
num_conds = 5
allow_repeats = false
num_counters = 1
contact_email_on_error = <REDACTED>
cutoff_time = 40

[Shell Parameters]
launch_in_sandbox_mode = false
persistent_history_file = .psiturk_history

I removed it before deploying with psiTurk 3 and using Heroku. (I can go back to a prior deployment that should be on psiTurk 2 and check for logged errors, if that would be helpful.)

Looks like this can be closed in #483 was merged?