gslab-econ/template

Repo organization with a cloud-computing step in the middle of the SCons build


Hi @gentzkow,

We have been thinking about the following SCons scenario and would like your suggestion on how to handle it.

DATA BUILD      ---->           ANALYSIS                 ---->   PLOTTING

SConscript                      Cloud computing                  SConscript
source=/Dropbox/data.txt        source=build/data.txt            source=build/analysis.txt    
target=build/data.txt           target=build/analysis.txt        target=release/plot.png

In this scenario, the data build and plotting steps can run locally, but the analysis step must run in the cloud (e.g., AWS or Sherlock). We currently see two possible solutions:

  1. Do everything on the cloud. Then the analysis step can be co-opted into the SCons framework.

  2. Break the data build and the plotting into two separate SConstruct files and manage the analysis files manually (copy build/data.txt to the cloud server, run computations, then download build/analysis.txt to the local build folder).
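
A minimal sketch of the manual handoff in option 2, assuming files are moved with `scp`. The host name, remote directory, and function names are hypothetical and do not come from any existing setup; the cloud computation itself is started separately between the two calls.

```python
import os
import subprocess

# Hypothetical values for illustration; no host or remote path is specified above.
REMOTE_HOST = "user@cloud.example.org"
REMOTE_BUILD_DIR = "project/build"


def remote_path(local_path):
    """Map a local build file to its location on the (assumed) remote host."""
    return f"{REMOTE_HOST}:{REMOTE_BUILD_DIR}/{os.path.basename(local_path)}"


def push_data(local_path="build/data.txt", run=subprocess.run):
    """Upload the output of the local data build before the cloud analysis."""
    run(["scp", local_path, remote_path(local_path)], check=True)


def pull_analysis(local_path="build/analysis.txt", run=subprocess.run):
    """Download the analysis output so the local plotting build can use it."""
    run(["scp", remote_path(local_path), local_path], check=True)
```

One would call `push_data()` after the data-build SConstruct finishes and `pull_analysis()` before running the plotting SConstruct. The `run` parameter exists only so the commands can be inspected without contacting a server.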

This is going to be a problem in ad-price-drivers and is currently a small problem in divergence (though there the "cloud" step comes at the beginning, so managing it manually is not a big deal). Thanks!

@stanfordquan: My initial instinct is close to (2). I'd think we'd want separate top-level directories w/ their own SConstruct files, just as we currently do for /paper_slides/. So the top level would be:

/data/
/analysis/
/plotting/
/paper_slides/
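
A rough sketch of how such a layout could be driven, assuming each directory holds its own SConstruct file and `scons` is on the PATH. The helper name `run_stages` is hypothetical, and how outputs move between directories is not settled here.

```python
import subprocess


def run_stages(stages, run=subprocess.run):
    """Run `scons` inside each stage directory, in order.

    With check=True, a failing build raises and later stages are skipped.
    """
    for stage in stages:
        # Each directory is assumed to hold its own SConstruct file.
        run(["scons"], cwd=stage, check=True)
```

Locally this might be `run_stages(["data"])` before the cloud step and `run_stages(["plotting", "paper_slides"])` once the analysis output is back, with `run_stages(["analysis"])` run on the cloud machine.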