How to specify the naming convention of log files?
aaronm137 opened this issue · 1 comment
Hello,
in my Scrapy spider, I specify the name of the log file as follows:
import time
from datetime import datetime

custom_settings = {
    'LOG_FILE': f'{datetime.fromtimestamp(time.time()).strftime("%Y-%m-%d_%H%M%S")}_{project_name}.log',
    'LOG_LEVEL': 'INFO'
}
so the name of the log file looks like this: 2024-05-23_103249_my-cool-spider.log. This works perfectly on localhost.
When I deploy it to production, where Scrapyd takes care of running spider jobs, the log file naming convention specified above is ignored; instead, Scrapyd's own naming convention is used, which looks like this: task_169_2024-06-14T13_55_48.log.
Is there any way to change the naming convention, so Scrapyd would respect the format specified in the Scrapy spider?
In https://github.com/scrapy/scrapyd/blob/master/scrapyd/environ.py, if logs_dir is set in Scrapyd's configuration file, then Scrapy's LOG_FILE setting is overridden. The pattern is {logs_dir value}/{Scrapyd project name}/{Scrapy spider name}/{job ID}.log
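To make the pattern concrete, here is a minimal sketch (not Scrapyd's actual code; the directory and names below are placeholders) of how the pieces combine into a log path:

```python
def scrapyd_log_path(logs_dir: str, project: str, spider: str, job_id: str) -> str:
    # Assemble the path using the pattern described above:
    #   {logs_dir value}/{Scrapyd project name}/{Scrapy spider name}/{job ID}.log
    return f"{logs_dir}/{project}/{spider}/{job_id}.log"

print(scrapyd_log_path("logs", "myproject", "my-cool-spider", "task_169_2024-06-14T13_55_48"))
# → logs/myproject/my-cool-spider/task_169_2024-06-14T13_55_48.log
```

Note that the spider's own LOG_FILE value appears nowhere in this path; only the job ID is under your control.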
In https://github.com/scrapy/scrapyd/blob/master/scrapyd/webservice.py, the job ID defaults to uuid.uuid1().hex, but you can provide your own jobid when scheduling the job. See https://scrapyd.readthedocs.io/en/latest/api.html#schedule-json
So, if you only want to configure the filename, set the job ID when scheduling. task_169_2024-06-14T13_55_48.log is already not a UUID, so you (or some software you're using) must already be setting the jobid (or, perhaps you haven't set logs_dir, and Scrapyd is using the non-overridden LOG_FILE setting).
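For example, a sketch of building a jobid that reproduces the spider's local naming convention (the project and spider names below are placeholders, and the curl command assumes a default Scrapyd listening on localhost:6800):

```python
from datetime import datetime

def make_jobid(project_name: str, when: datetime) -> str:
    # Reproduce the spider's local log-file naming convention,
    # e.g. 2024-05-23_103249_my-cool-spider
    return f"{when:%Y-%m-%d_%H%M%S}_{project_name}"

payload = {
    "project": "myproject",       # placeholder Scrapyd project name
    "spider": "my-cool-spider",   # placeholder spider name
    "jobid": make_jobid("my-cool-spider", datetime.now()),
}
print(payload)
# POST this to schedule.json, e.g.:
#   curl http://localhost:6800/schedule.json \
#     -d project=myproject -d spider=my-cool-spider -d jobid=<jobid>
```

With logs_dir set, the resulting log file would then carry your timestamped name as its job ID component.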