steveash/NETransliteration-COLING2018

Dependency on AWS


vsoch commented

Hey @steveash! The tutorial looks really interesting, but you've placed a huge dependency on using AWS. Many (if not most) academic researchers do not have accounts there provided by an institution. Could you please provide a similar environment (or a way to reproduce the AWS one via a container or similar) so that the code can be run? Thank you!

Thanks for the interest! There shouldn't be anything in our code that requires the AWS deep learning AMI or any other part of AWS for that matter. It's just a convenient packaging of the dependencies. Any Linux box with Python 3.6 and TensorFlow 1.4.1 installed should work. If you have a GPU, we used CUDA 9, but that's just a matter of configuring TensorFlow.

I'll update the README to make that more clear.
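Until then, a quick sanity check along these lines might help confirm that a box matches the versions mentioned above. This is only a rough sketch: the function names `find_mismatches` and `installed_tensorflow_version` are made up for illustration and are not part of the repo, and it only looks at Python and TensorFlow (not CUDA or tensor2tensor).

```python
import sys

# The two versions quoted in this thread; everything else is an illustrative assumption.
EXPECTED_PYTHON = (3, 6)
EXPECTED_TENSORFLOW = "1.4.1"


def find_mismatches(python_version, tensorflow_version):
    """Return a list of human-readable mismatches against the expected versions.

    python_version: a (major, minor) tuple, e.g. sys.version_info[:2]
    tensorflow_version: a version string, or None if TensorFlow is not installed
    """
    problems = []
    if tuple(python_version) != EXPECTED_PYTHON:
        problems.append(
            "Python is %d.%d, expected %d.%d"
            % (tuple(python_version) + EXPECTED_PYTHON)
        )
    if tensorflow_version is None:
        problems.append("TensorFlow is not installed")
    elif tensorflow_version != EXPECTED_TENSORFLOW:
        problems.append(
            "TensorFlow is %s, expected %s"
            % (tensorflow_version, EXPECTED_TENSORFLOW)
        )
    return problems


def installed_tensorflow_version():
    """Return tensorflow.__version__, or None if the import fails."""
    try:
        import tensorflow
    except ImportError:
        return None
    return tensorflow.__version__


if __name__ == "__main__":
    issues = find_mismatches(sys.version_info[:2], installed_tensorflow_version())
    for issue in issues:
        print("MISMATCH:", issue)
    if not issues:
        print("Environment matches the versions mentioned above.")
```

Running it inside the target machine or container prints one `MISMATCH:` line per difference, which should make it easier to spot a version drift before digging into stack traces.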

vsoch commented

Yep, so I've been working for a few days on reproducing that exact setup (a Linux container with these libraries), and it's taken me down a path of constant errors. I haven't even gotten to testing the GPU part because there is so much conflict, and it could be some subtle difference between versions of the libraries. Since you have this working environment, I am hoping you might provide enough to make the analysis reproducible?

Oh wow. Yeah, it's likely subtle version mismatches. TensorFlow and T2T have pretty drastically changed their APIs over the last few versions. I certainly didn't intend to create a dependency on AWS. I'll work on creating instructions from scratch with only Ubuntu as a dependency. I also want to update the scripts to work with current versions of TF and T2T, so I'll open an issue for that as well.

If you have a specific error that you're running into that you want to post here, I might be able to help with that as well.

vsoch commented

Great! I would love to help with this, even a general sense of the libraries needed (and a base container?) to start would be immensely useful. Here is what I've documented so far (and some of the library issues, e.g., tensor2tensor) and errors that you might have insight about. I'm using some of the base tensorflow containers (for cpu and then there is also a gpu version) since these are actively maintained, popular, etc. --> https://github.com/vsoch/NETransliteration-COLING2018/tree/add/containers/docker#debugging

Note that the usage docs (and scripts I'll add to interact with the container) will be developed when I can interactively (via the shell) get everything working without error messages. This should be fun, thanks for the help!

vsoch commented

And yes, I totally understand! The speed at which these libraries (and dependencies) change is like a Taco Bell dog intersected with a Red Bull and the Broad Institute Cromwell logo. I go to the kitchen to grab a drink, come back to my computer, and we are several versions into the future! 🐖 🕶️