Miserlou/Zappa

Downloading NLTK dependencies before deploy


Context

I'm trying to deploy a Django application that uses NLTK. I want to include NLTK data such as wordnet and stopwords, but I'm not sure how to go about this.
Since my app is already quite large I have slim_handler turned on, in case that matters.

Expected Behavior

Is there any way to extend Zappa's build step so that it runs something like this?

import nltk

nltk.download('wordnet')
nltk.download('stopwords')

I'm not sure where this code would go, and I guess I'd need to explicitly specify the directory the data is downloaded to so that it can be found at runtime?
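To make the question concrete, here is a rough sketch of what such a pre-deploy step might look like. It relies on the `download_dir` argument of `nltk.download()`; the helper name `download_nltk_data`, the `PACKAGES` list, and the `nltk_data` folder name are illustrative assumptions, not anything Zappa provides.

```python
from pathlib import Path

# Packages named in this issue; adjust as needed.
PACKAGES = ["wordnet", "stopwords"]


def download_nltk_data(target_dir, packages=PACKAGES, downloader=None):
    """Download each NLTK package into target_dir and return that directory.

    `downloader` defaults to nltk.download; it is a parameter only so the
    helper can be exercised without network access (hypothetical design).
    """
    if downloader is None:
        import nltk  # imported lazily, only when a real download is wanted

        downloader = nltk.download

    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    for name in packages:
        # download_dir tells NLTK where to put the data instead of the
        # default per-user location.
        downloader(name, download_dir=str(target))

    return str(target)
```

Calling `download_nltk_data("nltk_data")` from the project root before running `zappa deploy` would, under these assumptions, leave the data in a folder inside the project.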

Actual Behavior

I'm currently migrating this app from Heroku, where you can list all NLTK data dependencies in a file called nltk.txt and they are downloaded upon deployment.

Possible Fix

Some mechanism for adding dependencies that can't be provided through pip/requirements.txt.

Your Environment

  • Zappa version used: 0.51.0
  • Operating System and Python version: Ubuntu, 3.7

Where does nltk.download() put the files? If you run it before packaging and the files end up somewhere inside the local project directory, they should be packaged automatically.
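By default NLTK usually downloads into a per-user directory such as ~/nltk_data, which is outside the project, so the data would need to be placed in the project explicitly. Once it is packaged, NLTK also has to be told where to look at runtime. A minimal sketch, assuming the data lives in a project folder named `nltk_data` (a hypothetical name) and using the `nltk.data.path` search list; the helper `register_nltk_data_dir` is illustrative only:

```python
from pathlib import Path


def register_nltk_data_dir(data_dir, search_path):
    """Put data_dir at the front of search_path unless it is already listed."""
    data_dir = str(Path(data_dir))
    if data_dir not in search_path:
        search_path.insert(0, data_dir)
    return search_path


# Possible usage in Django's settings.py (assumes nltk_data sits next to it):
#
#   import nltk
#   register_nltk_data_dir(Path(__file__).resolve().parent / "nltk_data",
#                          nltk.data.path)
```

Setting the `NLTK_DATA` environment variable to the same directory should have a similar effect without any code. Whether the relative location survives unpacking with slim_handler has not been verified here.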