scrapy/scrapyd

Cannot access environment variables in the spider running under ScrapyD

iamumairayub opened this issue · 2 comments

I am using Ubuntu 18.04 and have tried setting some environment variables in /etc/environment and also in my .bashrc file.

After running source myFile, echo $my_var shows the variable correctly.

When I run my spider with scrapy crawl mySpider, the environment variables show up just fine.

But when the same spider is run under ScrapyD, the environment variable is empty.

I tried printing the user with getpass.getuser(), and it shows the same user whether I run the scraper from the terminal or from ScrapyD.

I saw this issue, but it only says to restart ScrapyD. I have tried restarting ScrapyD, and logging out of the terminal and back in, but to no avail.

How can I access an environment variable in a spider running under ScrapyD?

It depends on how you are running Scrapyd.

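For example, I use systemd to run Scrapyd as a service. systemd does not start services through a login shell or a PAM login session, so it will not read a user's .bashrc file, etc. .bashrc is read by interactive shells, and /etc/environment is normally applied by PAM at login; neither is applied to a systemd service by default.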
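To illustrate why this matters, here is a minimal sketch of environment inheritance: a child process only sees the variables its parent passes along, so a spider launched by a service that never read .bashrc will not see what .bashrc exports. The helper name child_sees is hypothetical, and my_var is the variable name from the question.

```python
import os
import subprocess
import sys


def child_sees(name, env):
    """Start a child Python process with the given environment and
    return the value it sees for `name` (empty string if unset)."""
    code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Roughly what a terminal has after sourcing a file that exports my_var.
shell_env = dict(os.environ, my_var="hello")

# Roughly what a service has if nothing ever set my_var for it.
service_env = {k: v for k, v in os.environ.items() if k != "my_var"}

print(repr(child_sees("my_var", shell_env)))    # 'hello'
print(repr(child_sees("my_var", service_env)))  # ''
```

The same user can therefore get different results from the terminal and from the service: what matters is the environment of the process that starts the spider, not the user name.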

Instead, I need to set Environment="myvar=myvalue" in the service file.

Whatever you're using to run Scrapyd probably has its own way to configure environment variables.
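On the spider side, it can help to fail loudly when a variable is missing instead of silently working with an empty value. The following is only a sketch: require_env is a hypothetical helper, not part of Scrapy or Scrapyd, and myvar is the example name used in this thread.

```python
import os


def require_env(name, environ=None):
    """Return the value of `name`, or raise a descriptive error if the
    variable is unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set for this process. "
            "Check how the process that launches the spider is configured."
        )
    return value
```

Calling something like require_env("myvar") early in the spider, for example in its __init__, should make a misconfigured service show up as an explicit error in the job log.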

Here's my /etc/systemd/system/scrapyd.service file:

[Unit]
Description=Scrapyd
After=network.target

[Service]
User=scrapyd
Group=scrapyd
Environment="myvar=myvalue"
# More Environment lines...
WorkingDirectory=/home/scrapyd/scrapyd
ExecStart=/home/scrapyd/scrapyd/.ve/bin/scrapyd --nodaemon --logfile=/var/log/scrapyd/scrapyd.log

[Install]
WantedBy=multi-user.target

I finally fixed my issue by adding this to the service file:

[Service]
EnvironmentFile=/etc/environment
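One way to confirm that the service really picked up the variables is to inspect the environment of the running Scrapyd process. On Linux, /proc/<pid>/environ holds the environment the process was started with, as NUL-separated NAME=value entries. The sketch below assumes Linux and that you are allowed to read that file (same user or root); both function names are hypothetical.

```python
from pathlib import Path


def parse_environ_blob(blob):
    """Parse NUL-separated NAME=value entries, as found in
    /proc/<pid>/environ, into a dict of strings."""
    env = {}
    for entry in blob.split(b"\0"):
        if not entry:
            continue  # skip the empty piece after the final NUL
        key, sep, value = entry.partition(b"=")
        if sep:  # ignore malformed entries without '='
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def process_environ(pid):
    """Return the start-up environment of the process with the given PID.
    Linux only; requires permission to read /proc/<pid>/environ."""
    return parse_environ_blob(Path(f"/proc/{pid}/environ").read_bytes())
```

For example, after restarting the service, process_environ(pid).get("my_var") should return the expected value, where pid is the Scrapyd process ID (on a systemd setup, systemctl show --property MainPID scrapyd should report it, assuming the unit is named scrapyd).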