Repo organization with a cloud-computing step in the middle of the SCons build
Hi @gentzkow ,
We have been thinking about the following SCons scenario and wanted to ask whether you have a suggestion for how to handle it.
DATA BUILD                ---->  ANALYSIS                   ---->  PLOTTING
SConscript                       Cloud computing                   SConscript
source=/Dropbox/data.txt         source=build/data.txt             source=build/analysis.txt
target=build/data.txt            target=build/analysis.txt         output=release/plot.png
In this scenario, the data build and the plotting steps can be done locally, but the analysis step must be done on the cloud (e.g. AWS or Sherlock). Our two current suggested solutions are:
1. Do everything on the cloud. Then the analysis step can be co-opted into the SCons framework.
2. Break the data build and the plotting into two separate SConstruct files and manage the analysis files manually (copy build/data.txt to the cloud server, run computations, then download build/analysis.txt to the local build folder).
This is going to be a problem in ad-price-drivers and is currently a small problem in divergence (except that the "cloud" step is at the beginning, so it's not such a big deal to manage it manually). Thanks!
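For concreteness, the manual hand-off in the second option could be sketched roughly as below. This is only an illustration: the remote address `user@cloud.example.org:project/build` is a made-up placeholder, the use of `scp` is an assumption about how files would be moved, and the helper names are hypothetical. Only the two local paths come from the scenario above.

```python
import subprocess

# Hypothetical remote location (an assumption, not a real server).
REMOTE = "user@cloud.example.org:project/build"


def upload_cmd(local="build/data.txt", remote=REMOTE):
    """Command that copies the locally built data to the cloud server."""
    return ["scp", local, f"{remote}/data.txt"]


def download_cmd(local="build/analysis.txt", remote=REMOTE):
    """Command that brings the analysis output back into the local build folder."""
    return ["scp", f"{remote}/analysis.txt", local]


def run(cmd, dry_run=True):
    """Print the command when dry_run is True; otherwise execute it."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    # Dry run: show the two transfers that bracket the cloud computation.
    run(upload_cmd())
    run(download_cmd())
```

The computation itself still has to be started on the server by hand between the two transfers; the sketch only covers moving the files.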
@stanfordquan: My initial instinct is close to (2). I'd think we'd want separate top-level directories w/ their own SConstruct files just as we currently do for /paper_slides/. So the top-level would be
/data/
/analysis/
/plotting/
/paper_slides/
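One way to picture how such a layout would be used locally is a small driver that decides which directories can be built at a given moment. The sketch below is an assumption layered on top of the layout, not something the layout prescribes: the function names, the order of the steps, and the idea of checking for a downloaded analysis file are all hypothetical. It relies only on SCons's `-C` option, which changes to the given directory before reading its SConstruct.

```python
import os


def scons_cmd(directory):
    """Command that runs the SConstruct found in one top-level directory."""
    return ["scons", "-C", directory]


def local_steps(analysis_output):
    """List the builds that can run locally right now.

    analysis_output is a hypothetical path to the file brought back from
    the cloud; the downstream steps are skipped until it exists.
    """
    steps = [scons_cmd("data")]
    if os.path.exists(analysis_output):
        steps.append(scons_cmd("plotting"))
        steps.append(scons_cmd("paper_slides"))
    return steps


if __name__ == "__main__":
    # Placeholder path for the downloaded analysis output (an assumption).
    for cmd in local_steps("analysis/build/analysis.txt"):
        print(" ".join(cmd))
```

The /analysis/ directory is left out of the local steps on purpose, since its SConstruct would be run on the cloud server.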
Got it, thanks @gentzkow. FYI @arosenbe @Shun-Yang @yuchuan2016